On-Chip Optical Interconnects
6th International Conference of Soft Computing and Pattern Recognition, August 11-14, 2014, Tunis, Tunisia

On-Chip Optical Interconnects: Prospects and Challenges
Abderazek Ben Abdallah
The University of Aizu, School of Computer Science and Engineering, Division of Computer Engineering, Adaptive Systems Laboratory, Aizu-Wakamatsu, Japan
E-mail: [email protected]
August 13, 2014

Agenda
• Motivation
• Optical Interconnect Prospects
• PHENIC Si-Photonics Network-on-Chip
• Technology Challenges
• Conclusion

High-performance computing today
Tianhe-2, #1 in Nov. 2013
(Photo: the switch backplane)
Features
• 16,000 nodes, each with 2 Intel Xeon Ivy Bridge CPUs and 3 Xeon Phi coprocessors, for a combined total of 3,120,000 computing cores
• 33.9 Pflops (about 3.4% of the exascale target for 2020)
• 17.8 MW (89% of the 20 MW power limit)

A closer look at a computing system
[Figure: CPU and GPU connected over PCI Express ( 48/MB/s). Labels: on-chip bottleneck, off-chip, 2 nJ/Inst, 200 pJ/Inst]
If we consider exascale within 20 MW:
• We need 20 pJ/instruction!
• That target is far out of reach with today's machines.

Communications cost
Energy cost of data movement relative to the cost of a flop for current and 2018 systems (Shalf et al., VECPAR 2010).
Challenges
• Preparing the operands costs more than computing on them!
• There is no Moore's law for communications.

Gate vs. interconnect delays
Source: IDEAL Research

Optical Interconnect Prospects

The idea
Replace wires with waveguides and electrons with photons!
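The 20 pJ/instruction figure on the "A closer look at a computing system" slide follows from dividing the power cap by the operation rate. The sketch below reproduces that arithmetic and applies the same formula to the Tianhe-2 numbers quoted earlier. It assumes one instruction per flop, which is a simplification made here for illustration, and the function names are hypothetical.

```python
def energy_per_op_pj(power_watts: float, ops_per_second: float) -> float:
    """Energy available per operation, in picojoules, for a given power budget."""
    return power_watts / ops_per_second * 1e12


def percent_of(value: float, target: float) -> float:
    """Express value as a percentage of target."""
    return 100.0 * value / target


EXAFLOP = 1e18       # operations per second at exascale
POWER_CAP_W = 20e6   # 20 MW power limit quoted on the slide

# Exascale within 20 MW leaves roughly 20 pJ per operation.
budget_pj = energy_per_op_pj(POWER_CAP_W, EXAFLOP)

# Tianhe-2 figures quoted on the slide: 33.9 Pflops at 17.8 MW.
tianhe2_pj = energy_per_op_pj(17.8e6, 33.9e15)
perf_share = percent_of(33.9e15, EXAFLOP)
power_share = percent_of(17.8e6, POWER_CAP_W)

print(f"exascale budget: {budget_pj:.1f} pJ/op")
print(f"Tianhe-2: {tianhe2_pj:.0f} pJ/op, "
      f"{perf_share:.1f}% of target performance, "
      f"{power_share:.0f}% of the power cap")
```

Under these assumptions the machine already uses most of the power budget while delivering only a few percent of the target performance, which is the gap the slide points at.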
(Photo: Spectrum 2005, Paniccia)

Milestones
[Figure: interconnect distance scale. Distance ticks: 1 mm, 1 cm, 10 cm, 1 m, 100 m, 1 km, 10 km, 1000 km. Categories: on-chip, chip to chip, board to board, rack to rack, LAN, long haul. Media: optical wire/waveguide (on-chip, the Si-photonics target area) and optical cable/fiber.]

A typical architecture today
[Figure: multicore processor (CMP) connected to DDR3 DRAM by a copper link: 8.5 GB/s, 30 mW/Gbps]
• Big cores for single-thread performance
• Small cores for multithread performance
• Accelerating multi- and many-core
• The copper link consumes a lot of power, so an alternative approach is needed.

Photonics in computing system today
[Figure: multicore processor (CMP) and DRAM, each with a receiver/transmitter, connected by an optical link; transmission over fiber]
• Uses monolithic integration, which reduces energy consumption
• Utilizes the standard bulk CMOS flow
• Cladding promotes total internal reflection, which reduces signal loss

Photonics in computing system today
[Figure: transmission over fiber with a WDM channel carrying λ1, λ2, λ3, ..., λn between the CMP and DRAM receiver/transmitter: >1 TBps, <1 mW/Gbps. Labels: WDM, DWDM]
• Supports WDM, which improves bandwidth density
• DWDM can transport tens to hundreds of wavelengths per fiber.
• Work on an integrated Tb/s optical link on a single chip is ongoing.

(Si) Photonics benefits over electronics
• Low operating costs
• Low heating of components
• Low power consumption
• Possibility to integrate more optical functionalities in a single component
• Low manufacturing cost
• High integration
• High reliability
• Higher density of interconnects

Data rate
[Figure: data rate (Gb/s) over time]
Doubling the data rate every 2 years

Intel 50 Gb/s WDM link (A. Alduino et al., IPR 2010)
12.5 Gb/s x 4 channels = 50 Gb/s (Intel Lab.)
Source: SemiconductorTODAY Compounds&AdvancedSilicon, Vol. 5, Issue 6, July/August 2010

Si-Photonics in computing system today
Si-Photonics interposer
• Optical I/Os for chip-to-chip and chip-to-board links (IBM, Intel, Fujitsu)
• E-O-E transceivers for the opto-silicon interposer

Channel technology
• Silicon waveguide
  - Used on-chip
  - Moderate loss, crossover issues
• Free space
  - Uses air
  - A set of micro-mirrors and micro-lenses guides the light around
  - On-chip use
• Hollow metal waveguide
  - Used for slightly longer distances, at the board level
  - Low loss, ease of fabrication
• Fiber-optic cable
  - Off-chip interconnect

Si-Photonics building blocks
[Figure: laser source (input), modulator, resonator, photodetectors. Labels: N+, P+, Vm]
Main components
• Laser source: injects the required laser light into the waveguide
• Modulators: modulate the laser light into '0' and '1' states
• Photodetectors: detect the laser light and convert it to an electrical signal
• Turn resonators: control the routing direction of the laser light

Si-Photonics building blocks
Raman silicon laser: Stimulated Raman Scattering (SRS), with a reverse-biased p-i-n diode to eliminate the TPA-induced FCA (free-carrier absorption induced by two-photon absorption)
On-chip: Vertical-Cavity Surface-Emitting Laser (VCSEL)
• One of the largest-volume (and hence cheapest) lasers currently in use
• Is often integrated on-chip
• Enables "direct modulation" (the laser is turned ON/OFF directly in accordance with the data being transmitted)
• Not fully CMOS compatible
• Does not support DWDM

Si-Photonics building blocks
Si wire/waveguide
[Figure: 5 cm SOI nanowire, 1.28 Tb/s (32 λ x 40 Gb/s), IBM/Columbia]
Platforms: Germanium on SOI, Silicon on Insulator (to 3.6 μm), Silicon on Sapphire (to 5.6 μm), Silicon on Nitride (to 6.7 μm)
• Silicon is transparent above 1100 nm
• Nearly all optical data links operate in the near-infrared wavelength range between 800 nm and 1600 nm
• We operate at 1310 nm (industry standard)
• SOI wafers cost about 10 times as much as conventional wafers

Transmission over Si wire/waveguide
Snell's law of refraction:
\[ \frac{\sin\theta_1}{\sin\theta_2} = \frac{n_2}{n_1} = \frac{v_1}{v_2} \]
[Figure: incident, reflected and refracted rays at the interface between media n1 and n2, drawn for the two relative orderings of n1 and n2]

Total internal reflection in Si wire/waveguide
Let \(\theta_2 = \pi/2\). Then
\[ \sin\theta_c = \frac{n_2}{n_1}, \qquad \theta_c = \sin^{-1}\frac{n_2}{n_1} \]
For \(\theta_1 > \theta_c\), the light ray is completely reflected: total internal reflection.
[Figure: incident, reflected and refracted rays at the n1/n2 interface]

Total internal reflection in Si wire/waveguide
[Figure: core (n_core) between two cladding layers (n_cladding), with incident, reflected and refracted rays; image from Wikipedia]
Total internal reflection keeps all optical energy within the core, even if the fiber bends.

Si-Photonics building blocks
Modulator
[Figure: Mach-Zehnder Interferometer (MZI). Source: Intel Lab.]
• Enables high-speed conversion from electrical (E) to optical (O) signals.
• Encodes data on a single wavelength channel that is combined with other signals through WDM
• Microrings (MRs) are used for modulation because of their high modulation speed (10~20 Gbps), low power (47 fJ/bit) and small footprint (µm²)

Si-Photonics building blocks
Photodetectors
• The same microring used for modulation can be used as a wavelength-selective filter (photodetector) to extract light from the waveguide, if the microring is doped with a photo-detecting material such as CMOS-compatible germanium.
• The resonant light is absorbed by the germanium and converted into an electrical signal.

What is needed for on-chip Si-Photonics interconnects?
There is still a problem of scaling!
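The total-internal-reflection condition from the waveguide slides can be checked numerically. The sketch below applies Snell's law and the critical-angle relation sin θc = n2/n1. The refractive indices (about 3.48 for a silicon core and 1.44 for silica cladding near 1310 nm) are typical textbook values assumed here, not figures taken from the slides, and the function names are hypothetical.

```python
import math
from typing import Optional


def critical_angle_deg(n_core: float, n_clad: float) -> float:
    """Critical angle, in degrees from the normal, for light going from core to cladding."""
    if n_core <= n_clad:
        raise ValueError("total internal reflection requires n_core > n_clad")
    return math.degrees(math.asin(n_clad / n_core))


def refraction_angle_deg(theta1_deg: float, n1: float, n2: float) -> Optional[float]:
    """Refracted angle from Snell's law, or None when the ray is totally reflected."""
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if s > 1.0:
        return None  # no refracted ray exists: total internal reflection
    return math.degrees(math.asin(s))


N_SI = 3.48    # assumed index of silicon near 1310 nm
N_SIO2 = 1.44  # assumed index of silica cladding near 1310 nm

theta_c = critical_angle_deg(N_SI, N_SIO2)
print(f"critical angle: {theta_c:.1f} degrees")
print("incidence 10 degrees ->", refraction_angle_deg(10.0, N_SI, N_SIO2))
print("incidence 45 degrees ->", refraction_angle_deg(45.0, N_SI, N_SIO2))
```

With a high index contrast the critical angle is small, so rays over a wide range of incidence angles stay confined to the core.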
Processor is scaling to many-core
• Processors are trending toward many-core architectures with a growing number of cores, which require an increasingly efficient and low-power communications infrastructure to achieve the desired level of bandwidth and connectivity.
• Si-photonic NoCs provide an effective solution to the power and bandwidth limitations of the existing electronic NoCs (E-NoCs) used within CMPs.

Bandwidth, pin count and power scaling
[Figure: scaling curves, assuming 1 Byte/Flop, 8 Flops/core @ 5 GHz]

What is needed for on-chip Si-Photonics interconnects?
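The scaling assumptions on the "Bandwidth, pin count and power scaling" slide (1 Byte/Flop, 8 Flops/core @ 5 GHz) translate directly into a bandwidth demand per core. The sketch below works that out, reading "8 Flops/core" as 8 flops per cycle per core, which is an interpretation made here. The core counts and the 12.5 Gb/s per-wavelength rate (borrowed from the Intel WDM link slide) are illustrative choices, and the function names are hypothetical.

```python
import math


def bandwidth_demand_gbytes(cores: int, flops_per_cycle: int = 8,
                            clock_ghz: float = 5.0,
                            bytes_per_flop: float = 1.0) -> float:
    """Aggregate bandwidth demand in GB/s under the slide's scaling assumptions."""
    gflops = cores * flops_per_cycle * clock_ghz
    return gflops * bytes_per_flop


def wavelengths_needed(bandwidth_gbytes: float, gbps_per_wavelength: float) -> int:
    """Number of WDM wavelengths needed to carry the given bandwidth."""
    gbits = bandwidth_gbytes * 8.0
    return math.ceil(gbits / gbps_per_wavelength)


for cores in (1, 64, 256):  # illustrative core counts, not from the slides
    demand = bandwidth_demand_gbytes(cores)
    lanes = wavelengths_needed(demand, 12.5)
    print(f"{cores:4d} cores: {demand:8.0f} GB/s, {lanes} wavelengths at 12.5 Gb/s")
```

Because demand grows linearly with core count, the number of parallel channels grows with it, which is where wavelength multiplexing on a shared waveguide becomes attractive compared with adding pins.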
Critical specs
• Size
• Bandwidth
• Power consumption
• Switching speed
• Insertion loss
• Differential loss
• Crosstalk

Si Photonics on-chip communication
[Figure: cores (C) grouped around shared caches (Shared $), connected through a switch (X) with a switch controller]
Merit #1: High bandwidth
• Can scale easily via WDM/DWDM (electronics scale only via bus width)
Merit #2: Low power consumption
Merit #3: High switching speed
• The goal is not to communicate as fast as possible, but as fast as needed, depending on the application (normal or burst types). Speed of light: 299,792 km/s.

Landscape of SiP on-chip networks (PNoC)
[Figure: topologies shown: mesh, crossbar, Clos. Citations on the slide: [Shacham'07], [Petracca'08], [Joshi'09a], [Pan'09], [1-21]]

The basic PNoC building block
2x2 switch
• BAR state: data passes straight through (in1 to out1, in2 to out2)
• CROSS state: data passes to the opposite port (in1 to out2, in2 to out1)
• Typical wavelength range: 1260 ~ 1360 or 1510 ~ 1610 nm (mechanical switch)
Problems:
• Lack of bit-level processing in the optical domain
• Lack of efficient buffering in the optical domain

The basic PNoC building block
• Simply cascading 2x2 switches is not efficient and increases losses.
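The BAR and CROSS states of the 2x2 switch, and the cost of cascading several switches, can be sketched as a small model. This is an illustrative abstraction, not the PHENIC switch design: ports are numbered 1 and 2 as on the slide, the names are hypothetical, and the 0.5 dB per-switch insertion loss is a placeholder value assumed here. The only physical fact relied on is that losses expressed in dB add along a path.

```python
from enum import Enum
from typing import List


class SwitchState(Enum):
    BAR = "bar"      # in1 -> out1, in2 -> out2
    CROSS = "cross"  # in1 -> out2, in2 -> out1


def route_2x2(state: SwitchState, in_port: int) -> int:
    """Return the output port (1 or 2) reached from in_port (1 or 2)."""
    if in_port not in (1, 2):
        raise ValueError("in_port must be 1 or 2")
    return in_port if state is SwitchState.BAR else 3 - in_port


def route_cascade(states: List[SwitchState], in_port: int) -> int:
    """Follow a signal through a chain of 2x2 switches, one stage after another."""
    port = in_port
    for state in states:
        port = route_2x2(state, port)
    return port


def cascade_loss_db(stages: int, loss_per_switch_db: float) -> float:
    """Total insertion loss of a chain: per-stage losses in dB add up."""
    return stages * loss_per_switch_db


path = [SwitchState.CROSS, SwitchState.BAR, SwitchState.CROSS, SwitchState.CROSS]
print("exit port:", route_cascade(path, in_port=2))
print("path loss (dB):", cascade_loss_db(len(path), 0.5))  # 0.5 dB is a placeholder
```

Each extra stage adds its full insertion loss to the path, which is why a plain cascade of 2x2 elements scales poorly.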
PHENIC Si-Photonics Network-on-Chip

PHENIC: Hybrid Si-Photonic NoC
[Figure label: via size < ~2 μm]
Benefits
• Higher integration
• Shorter interconnect (important for short message mode)
• Heterogeneous integration
• Reliability
• Short message mode and large/burst mode

Routing in Hybrid Si-Photonic NoC
[Figure: path between a source (S) and a destination (D)]
1. Reserve the path
2. ACK
3. Transmit data on the photonic layer
4. Release (tear-down)

Electrical router and control
OASIS-RV1 chip layout (45 nm CMOS process, 222.387 uW, 557 pins).
Major tasks
• Photonic route computation (path setting)
• Route computation for short messages on the electronic layer (network)
• Other control tasks for the photonic switch on the photonic layer (network)

Photonic wavelength switch
Major tasks
• Photonic data transmission
• Optical data cannot be stored (no optical buffers!)
• No computation is performed

Bandwidth, power and latency

Technology Challenges

Electronics integration
• Intel 4004 (1971): 2300 transistors, thousands of multiplications per second, 10 μm PMOS
• Intel Core i7 (2011): a billion transistors, billions of multiplications per second, 32 nm CMOS

Photonics integration
Intel's 50 Gb/s (4x12.5 Gb/s) transceiver (2012).
CMOS sensor array
1st semiconductor laser (~1962)
Single-channel transmitter
Luxtera's photograph of CMOS 4x10 Gb/s WDM die (2007)
Challenges
• Wafer-scale fabrication is difficult
• Si does not support some functions
• Improvements in cost, space, power and reliability are needed

E-O-E transceivers (Tx/Rx): multilayer option
Multi-chips option
Challenges
• Single photonics platform (wafer-scale fabrication)
• Efficient E/O and O/E conversion
• CMOS-driven components

E-O-E transceivers (Tx/Rx): Si-Photonics option
[Figure labels: modulators, lasers, MUX, detectors, PLC, optical I/Os]
Features
• Small photonics component footprint
• CMOS-compatible fabrication processes
• 3D connectivity to CMOS wafers for improved O-E performance

Compact on-chip optical wires/waveguides
• Requirements
  - Performance: loss ~1 dB/cm
  - High density: bending radius ~1 μm
• Challenges
  - Achieve low loss despite Si sidewall imperfections
  - Realize efficient I/O (fiber) coupling despite the large mode mismatch

Reliability challenges and vision
[Figure labels: architecture techniques, macro solutions, micro solutions, redundant active/passive components (cores, routers, etc.), PBC, ECC, circuit techniques, cell creation, comp. param. reconfiguration, process techniques, state-of-the-art processes. Annotation: Moore's law, increasing the bit count exponentially, 2x every 2 years.]
Transient, intermittent, and permanent errors/faults are reliability challenges.

Concluding remarks
• Computer system interconnects are very complex micro-communication components
• Most important metrics
  - Bandwidth density
  - Energy efficiency
• A Si-Photonics design approach can improve system throughput by 15-20x
• Many issues should be handled carefully
  - Optimize network design (electrical switching, optical transport)
  - Optimize physical mapping (layout) for low optical insertion loss

References
1. Achraf Ben Ahmed, A. Ben Abdallah, PHENIC: Towards Photonic 3D-Network-on-Chip Architecture for High-throughput Many-core Systems-on-Chip, IEEE Proceedings of the 14th International Conference on Sciences and Techniques of Automatic Control and Computer Engineering (STA'2013), Dec. 2013.
2. A. Ben Abdallah, PHENIC: Silicon Photonic 3D-Network-on-Chip Architecture for High-performance Heterogeneous Many-core System-on-Chip, Technical Report, Ref. PTR0901A0715-2013, September 1, 2013.
3. OASIS 3D-Router Hardware Physical Design, Technical Report, Adaptive Systems Laboratory, Division of Computer Engineering, University of Aizu, July 8, 2014.
4. Akram Ben Ahmed, A. Ben Abdallah, Graceful Deadlock-Free Fault-Tolerant Routing Algorithm for 3D Network-on-Chip Architectures, Journal of Parallel and Distributed Computing, 2014.
5. Akram Ben Ahmed, Achraf Ben Ahmed, A. Ben Abdallah, Deadlock-Recovery Support for Fault-tolerant Routing Algorithms in 3D-NoC Architectures, IEEE Proceedings of the 7th International Symposium on Embedded Multicore/Many-core SoCs (MCSoC-13), pp., 2013.
6. Akram Ben Ahmed, A. Ben Abdallah, Architecture and Design of High-throughput, Low-latency and Fault Tolerant Routing Algorithm for 3D-Network-on-Chip, The Journal of Supercomputing, Volume 66, Issue 3, pp. 1507-1532, December 2013.
7. Akram Ben Ahmed, T. Ouchi, S. Miura, A. Ben Abdallah, Run-Time Monitoring Mechanism for Efficient Design of Application-specific NoC Architectures in Multi/Manycore Era, IEEE Proc. of the 6th International Workshop on Engineering Parallel and Multicore Systems (ePaMuS2013), July 2013.
8. Akram Ben Ahmed, T. Ouchi, S. Miura, A. Ben Abdallah, Run-Time Monitoring Mechanism for Efficient Design of Application-specific NoC Architectures in Multi/Manycore Era, Proc. IEEE 6th International Workshop on Engineering Parallel and Multicore Systems (ePaMuS2013), July 2013.
9. Akram Ben Ahmed, A. Ben Abdallah, Low-overhead Routing Algorithm for 3D Network-on-Chip, IEEE Proc. of the Third International Conference on Networking and Computing (ICNC'12), pp. 23-32, 2012.
10. Akram Ben Ahmed, A. Ben Abdallah, LA-XYZ: Low Latency, High Throughput Look-Ahead Routing Algorithm for 3D Network-on-Chip (3D-NoC) Architecture, IEEE Proceedings of the 6th International Symposium on Embedded Multicore SoCs (MCSoC-12), pp. 167-174, 2012.
11. Akram Ben Ahmed, A. Ben Abdallah, ONoC-SPL Customized Network-on-Chip (NoC) Architecture and Prototyping for Data-intensive Computation Applications, IEEE Proceedings of the 4th International Conference on Awareness Science and Technology, pp. 257-262, 2012.
12. Kenichi Mori, A. Ben Abdallah, OASIS Network-on-Chip Prototyping on FPGA, Master's Thesis, The University of Aizu, Feb. 2012.
13. Ben Ahmed Akram, A. Ben Abdallah, On the Design of a 3D Network-on-Chip for Many-core SoC, Master's Thesis, The University of Aizu, Feb. 2012.
14. Shohei Miura, A. Ben Abdallah, Design of Parametrizable Network-on-Chip, Master's Thesis, The University of Aizu, Feb. 2012.
15. Ryuya Okada, A. Ben Abdallah, Architecture and Design of Core Network Interface for Distributed Routing in OASIS NoC, Graduation Thesis, The University of Aizu, Feb. 2012.
16. A. Ben Ahmed, A. Ben Abdallah, K. Kuroda, Architecture and Design of Efficient 3D Network-on-Chip (3D NoC) for Custom Multicore SoC, IEEE Proc. of the 5th International Conference on Broadband, Wireless Computing, Communication and Applications (BWCCA-2010), pp. 67-73, Nov. 2010. (Best paper award)
17. Kenichi Mori, A. Ben Abdallah, OASIS Network-on-Chip Prototyping on FPGA, Master's Thesis, Graduate School of Computer Science and Engineering, The University of Aizu, Feb. 2012.
18. K. Mori, A. Esch, A. Ben Abdallah, K. Kuroda, Advanced Design Issues for OASIS Network-on-Chip Architecture, IEEE Proc. of the 5th International Conference on Broadband, Wireless Computing, Communication and Applications (BWCCA-2010), pp. 74-79, Nov. 2010.
19. T. Uesaka, OASIS NoC Topology Optimization with Short-Path Link, Technical Report, Systems Architecture Group, March 2011.
20. K. Mori, A. Ben Abdallah, OASIS NoC Architecture Design in Verilog HDL, Technical Report, TR-062010-OASIS, Adaptive Systems Laboratory, The University of Aizu, June 2010.
21. Shohei Miura, Abderazek Ben Abdallah, Kenichi Kuroda, PNoC: Design and Preliminary Evaluation of a Parameterizable NoC for MCSoC Generation and Design Space Exploration, The 19th Intelligent System Symposium (FAN 2009), pp. 314-317, Sep. 2009.
22. Kenichi Mori, Abderazek Ben Abdallah, Kenichi Kuroda, Design and Evaluation of a Complexity Effective Network-on-Chip Architecture on FPGA, The 19th Intelligent System Symposium (FAN 2009), pp. 318-321, Sep. 2009.
23. A. Ben Abdallah, T. Yoshinaga and M. Sowa, Mathematical Model for Multiobjective Synthesis of NoC Architectures, IEEE Proc. of the 36th International Conference on Parallel Processing, Sept. 4-8, 2007.

References
Multicore Systems-on-chip: Practical Hardware/Software Design Issues, hardcover, August 6, 2010.

University of Aizu

Thank you.