Mohammad Hossein Samavatian - Department of Computer
Mohammad Hossein Samavatian
Computer Engineering Department, Sharif University of Technology, Azadi Ave, Tehran, Iran
(+98) 9183193525, (+98) 9385631523, (+98) 8138224485
[email protected], [email protected]
Single, Iranian

OBJECTIVE
Seeking PhD position.

EDUCATION
MSc of Computer Engineering - Computer Architecture, 2011 to 2013
Sharif University of Technology, GPA: 3.79
Computer Engineering - Hardware, 2007 to 2011
Amirkabir University of Technology (Tehran Polytechnic), GPA: 3.33 (last two years: 3.73, last three years: 3.73)
High school diploma in Mathematics and Physics at school of Exceptional Talents (NODET), 2003 to 2007
AllameHelli of Hamedan

INTERESTS
GPGPU, Multi-cores and Many-cores Architecture and Programming.
High Performance Systems Architecture.
Interconnection Networks.
Embedded System Design.
Quantum Computers and Reversible Logic.

PUBLICATION
Mohammad Hossein Samavatian, Mohammad Arjomand, Ramin Bashizade and Hamid Sarbazi-Azad, “Architecting the Last-Level Cache for GPUs Using STT-RAM Technology,” ACM Transactions on Design Automation of Electronic Systems (TODAES), In press, 2015.
Mohammad Hossein Samavatian, Hamed Abbasitabar, Mohammad Arjomand and Hamid Sarbazi-Azad, “An Efficient STT-RAM Last Level Cache Architecture For GPUs,” DAC 2014, San Francisco, CA, USA. (ACM)
Mahboobeh Houshmand, Morteza Saheb Zamani, Mehdi Sedighi, Mohammad Hossein Samavatian, “Automatic Translation of Quantum Circuits to One-Way Quantum Computation Patterns,” 2014, QINP.
Mahboobeh Houshmand, Mohammad Hossein Samavatian, Morteza Saheb Zamani, Mehdi Sedighi, “Extracting One-way Quantum Computation Patterns from Quantum Circuits,” International Symposium on Computer Architecture and Digital Systems (CADS), Iran, 2012. (IEEE)

PROJECTS
A Novel STT-RAM Architecture for Last Level Shared Caches in GPUs (M.Sc. Thesis), 2012-2013.
o Supervisor: Prof. Hamid Sarbazi-Azad.
GPGPUs have high processing capacity and require a large, high-speed memory shared between thread processor clusters. Replacing SRAM with Spin-Transfer Torque (STT) RAM can therefore significantly reduce power consumption and linearly increase memory capacity in GPGPUs. In a GPGPU, a many-core architecture that executes threads in parallel, the advantages of STT-RAM technology, such as low read latency and high density, can be very effective. However, STT-RAM only guarantees shorter application run time and higher thread throughput when write operations are managed and scheduled so that they impose the least overhead on read operations. The purpose of this thesis is to propose and evaluate an STT-RAM architecture for the last level cache (LLC) in GPGPUs that uses circuit-level and architecture-level techniques to manage accesses to the LLC. First, a hybrid architecture is introduced by reducing the retention time of the STT-RAM cells. Then, based on a characterization of GPGPU workloads, cache parameters such as the cache micro-architecture, data retention time, latency and energy consumption are calculated. Finally, the target architecture is simulated across different design explorations to measure the latency and power consumption of the cache. The proposed architecture yields a performance gain of 16% on average and up to 100%, with a 20% power saving. With the techniques used for data search in the cache, power consumption is reduced by 40%, while the average performance gain drops from 16% to 15%.

ADVANCED VLSI course project, spring 2012:
o A complete design, synthesis and simulation flow was carried out in this project. The RTL was simulated with Mentor ModelSim. The processor was synthesized with Synopsys Design Compiler. Post-synthesis simulation: the synthesized processor was simulated again in ModelSim using the netlist file, with and without the SDF file. The equivalence of the synthesized processor with the RTL description was checked with the “FORMALITY” tool.
Placement and routing were done with Cadence SoC Encounter. Post-layout simulation was done with Synopsys HSIM.

RECONFIGURABLE COMPUTING course project, spring 2012:
Performance-Aware Clustering Algorithm with Simulated Annealing (PASACA): This project introduces a novel method, the Performance-Aware Simulated Annealing Clustering Algorithm (PASACA), which clusters logic elements (LEs) for FPGA circuits using simulated annealing with a proposed cost function in order to improve circuit performance. PASACA reads blif files containing the LEs, primary inputs and primary outputs, runs its clustering algorithm, and generates a net file to pass to the VPR program. The clustering results of PASACA on test benches are compared with TVpack.

RC Car Automatic Parallel Park design and implementation, spring 2011.

COMPUTER ARCHITECTURE course project, spring 2009.
o Implementation of the basic computer and micro-instruction computer architectures in Verilog, simulated with ModelSim.

Logic circuit simulator in Java, spring 2008.

Simulation of a Universal Asynchronous Receiver/Transmitter (UART) in Verilog, spring 2008.

Simulation of the Windows Command Prompt, based on the file system, in C#, fall 2008.

Derivation of Optimal One Way Quantum Computing (1WQC) Model Pattern from Quantum Circuit Model (B.Sc. Thesis), summer 2011.
o Supervisor: Prof. Morteza Saheb-Zamani.
Quantum computing is a new method of information processing based on quantum mechanics. One quantum computing model is measurement-based quantum computing (MBQC). MBQC is divided into two categories: TQC (teleportation quantum computing) and 1WQC (one-way quantum computing). The MBQC model has no equivalent in the classical world and is based on two features of the quantum world, measurement and entanglement. This project focuses on the 1WQC model, and its goal is to implement a program that converts a quantum circuit to the 1WQC model. Computation in the 1WQC model includes four main commands.
These commands are preparation, entanglement, measurement and correction. In this project, the 1WQC model is derived from a quantum circuit built from CNOT, CZ, J(α), C2NOT and arbitrary one-qubit unitary gates. The input of the program is a quantum circuit in QASM format. The 1WQC model is implemented as a graph. J(α), CZ and CNOT gates are added to the 1WQC graph directly. C2NOT and one-qubit unitary gates are implemented and added to the graph as combinations of the three gates above. X and Z gates, which appear in the 1WQC model as correction commands, are added to the graph using methods that avoid creating new auxiliary qubits. Other gates with one or more input qubits are constructed from the default and defined gates. The output of the program is the 1WQC model in CME standard form, with optimizations such as Pauli simplification and signal shifting applied to the output model. Finally, the correctness of the program is evaluated with several test benches. The time consumed and the depth of the quantum circuit before and after creation of the 1WQC model are analyzed. The program is written in C++ with Microsoft Visual Studio.

WORK AND TEACHING EXPERIENCES
Research Assistant, Prof. Hamid Sarbazi-Azad, Institute for Research in Fundamental Sciences (IPM), Winter 2014 to present
Teaching assistant, Microprocessor by Dr. Ghasem Miremadi, Sharif University of Technology, Winter 2013
Microcontroller lab instructor, Amirkabir University of Technology, Winter 2012, Winter and Fall 2014
Quantum lab member, Amirkabir University of Technology, Research Assistant, Summer 2011, http://ceit.aut.ac.ir/QDA/members.htm
Novin Rayaneh Hamedan, Internship, Summer 2010
Omid Technologies, Microcontroller Developer, Fall and Winter 2012, http://www.omid.ca/

TECHNICAL EXPERTISE
Programming Languages: C, C++ (expert), C#, Java (familiar)
Hardware Description Languages: VHDL (expert), Verilog (familiar)
Operating Systems: Fedora, Ubuntu, CentOS (expert), SUSE (familiar)
PCB CAD tools: Altium Designer DXP (expert)
EDA tools: Cadence SoC Encounter, Cadence Virtuoso, Synopsys Design Compiler, Synopsys HSIM (familiar)
FPGA and CPLD: VPR, TVpack, ABC and FPGA programming (familiar)
Microcontrollers: AVR programming (ATmega 8, 16, 32, 64, 128, 2560/1), CodeVisionAVR, IAR (AVR & ARM) (expert)
Simulation Tools: GPGPU-Sim, NVSim, CACTI, HSPICE, PSpice, ModelSim (expert), Gem5, Proteus (familiar)

HONORS AND AWARDS
Rank 15th in PhD entrance exam, 2014.
Rank 33rd in MSc entrance exam, 2012.
Rank 546th in BSc entrance exam, 2008.
Accepted in the first round of the chemistry Olympiad in high school, 2006.

LANGUAGE PROFICIENCY
English: Fluent, Persian: Native

HOBBIES
Playing Volleyball and Ping-Pong, Swimming, Mountain climbing, Skiing
Cinema and Filmmaking
Music and Photography

REFERENCES
(Available upon request)