Reiner - The Xputer Lab Page
[email protected], 28 December 2012
Reiner Hartenstein, TU Kaiserslautern, Germany
IEEE fellow, FPL fellow, SDPS fellow
http://hartenstein.de
Keynote, VIPSI-2012 Conference, Dec. 31, 2012 – Jan. 1, 2013, Becici, Montenegro

VIPSI-2012 MONTENEGRO
Hotel Splendid in Becici, Dec 31, 2012 to Jan 1, 2013
The Tunnel Vision Syndrome: Challenging Computer Science Education
Reiner Hartenstein, TU Kaiserslautern; IEEE fellow, FPL fellow, SDPS fellow

Preface
ICT infrastructures as energy-efficient as urgently required: impossible without reinventing ECS practices and education
The main problem: the Tunnel Vision Syndrome

Outline (1)
• The Survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent Computing to fully cover the Taxonomy
• Conclusions

Important ICT infrastructures
Lufthansa Reservation anno 1960 [Courtesy Ernst Denert]
http://wiki.answers.com/Q/Why_are_computers_important_in_the_world

PATMOS 2013: 23rd International Workshop on Power And Timing Modeling, Optimization and Simulation
co-located with VARI 2013: 4th European Workshop on CMOS Variability

Beyond Oil: Predictions
Now with extended scope: energy-efficient ICT infrastructures are a survival issue for our economy
http://xputer.de/PATMOS/

Beyond Oil: Literature
US: ~3 $
… hundreds of books
… post petroleum …
at Dallas (© New York Times)
G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8–11 Sep 2008
Power consumption by the internet: ×30 by 2030 if trends continue

Google's Electricity Bill
• Patent for water-based data centers
• Google going to sell electricity
• Cost of a data center determined by the monthly power bill
“The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing.” [L. A. Barroso, Google]
http://hartenstein.de/ComputerStromverbrauch.pdf

Outline (2)
• The survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent Computing to fully cover the Taxonomy
• Conclusions

Systolic Arrays (1)
Historic example of the Tunnel Vision Syndrome
IEEE 7th ISCA, La Baule, France, May 6-8, 1980
M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...

What Synthesis Method? (2)
• of course algebraic! (linear projection): supports only applications with strictly regular data dependencies
• Why not a general-purpose methodology?
• 1995: Rainer Kress replaced it by simulated annealing*: also supports any irregular and wild forms of pipe networks
*) KressArray [ASP-DAC-1995]
http://kressarray.de/

Who generates* the data streams?
“It's not our job”
without a sequencer: failed to define the machine paradigm
*) or receives
http://xputer.de/
http://data-streams.org/

Supersystolic Array
any irregular pipe network structure supported
[Figure: pipe network surrounded by asM units]

The Data stream machine (anti machine): an example
• uses data counters, no program counter
• pipeline network example
• asM: auto-sequencing memory, with a reconfigurable address generator (GAG) inside
• programmed by Flowware
GAG & enabling technology: published 1989; survey: [M. Herz et al.: IEEE ICECS 2003, Dubrovnik]

Systolic Arrays (2)
IEEE 7th ISCA, La Baule, France, May 6-8, 1980
M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...
from La Baule to the airport, Oct. 23, 2012

“mini”computers: VAX-11/750
Mario Barbacci: “VAX? That's why it is so slow”
Too many terminals

sorrowful experiences with the VAX-11/750
quasi-standard around 1980 (personally:)
• NATO ASI on VLSI at SOGESTA, Urbino, Italy, 1981
• UC Berkeley CS department
• at Kaiserslautern: my Xputer lab, E.I.S. project

Outline (3)
• The survival of our important ICT infrastructures
• The Tunnel vision syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent computing to fully cover the taxonomy
• Conclusions

the von Neumann Syndrome
von Neumann: by far the most inefficient machine paradigm
the 1st electrical computer, prototyped ready for mass production? which year, which company?

The History of Computing
Prototype 1884: Herman Hollerith, the first reconfigurable computer
Not yet invented in 1884:
• magnetic tape (1898*),
• the vacuum tube (1904),
• magnetic drum (1932),
• the transistor (1934),
• ferrite core memory (1949),
• hard disc (1956).
*) wire only
1890 US census use
datastream-based !

Punched Card Data Memory
• non-volatile !!
• The LUT (lookup table) size: 2 refrigerators
• first Xilinx FPGA 100 years later
• … state of the art …

6 decades later:
• paradigm shift from data streams to instruction streams
• other paradigms fully invisible
• even hardware design went von Neumann

Tunnel Vision: EDSAC 2
EDSAC 2, 1958: first microprogrammable computer, proposed 1951
• about 3 hours MTBF
• 30 tons, 178 kW
• almost 1000 square feet of floor space
Microprogramming: nested von Neumann machines: instruction streams + microinstruction streams
Trailblazing Reconfigurable Computing?
No: nested von Neumann bottlenecks: multiple multiplexing overhead
[Günter Koch et al.: “The universal Bus considered harmful”; 1st EUROMICRO Symp., June 1975, Nice, France]
Brief History of Microprogramming: http://cs.clemson.edu/~mark/uprog.html

Outline (4)
• The Survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent Computing to fully cover the Taxonomy
• Conclusions

Speed-up Factors by Software to FPGA migration
(avoiding the von Neumann syndrome)
Speed-up factors are not new
• PISA project: >15000
• DPLA replacing 256 FPGAs, 1984 (E.I.S. project)
• Power save ~10%
[Chart: speed-up factor on a logarithmic axis from 10^3 to 10^6, plotted for applications grouped as image processing, pattern matching and multimedia; DSP and wireless; bioinformatics; crypto; astrophysics. Applications named: Reed-Solomon decoding, video-rate stereo vision, MAC, pattern recognition, SPIHT wavelet-based image compression, BLAST, FFT, protein identification, DNA sequencing, CT imaging, Viterbi decoding, Smith-Waterman pattern matching, molecular dynamics simulation, DES breaking, real-time face detection, GRAPE.]
*) 1116: DES breaking, equipment size

RC*: the intensive Impact
Tarek El-Ghazawi
[Tarek El-Ghazawi et al.: IEEE COMPUTER, Febr. 2008]
SGI Altix 4700 with RC 100 RASC compared to a Beowulf cluster
Application:
DES breaking: speed-up factor 28514; savings factors: power 3439, cost 96, size 1116
• massively saving energy
• much less equipment needed
*) RC = Reconfigurable Computing

Stream Data-Flow Execution Models for Extreme Scale Computing (DFM 2012)
Minneapolis, USA, Sep 19-23, 2012, in conjunction with PACT 2012
http://www.cs.ucy.ac.cy/dfmworkshop/

A Clean Terminology, please
program source | compilation result
Software | instruction streams
Flowware | data streams
Configware | datapath structures configured

RC: why it's so efficient
• its efficiency is data-stream-based: it avoids the extremely memory-cycle-hungry von Neumann syndrome
• the anti-machine paradigm: no instruction streams going through the FPGA fabrics at run time

FPGA's Semiconductor Market Share
courtesy [Nick Tredennick]
• Why stalled? still < 2%: the RC paradox
• FPGAs' Achilles' heel: long development time; VHDL/Verilog still dominant
• Design software unusable except by experts
• FPGA companies' wrong top-level management*:
  – first: circuit designers, now logic designers
  – should be: programmers
• Nick Tredennick: “a generation behind in required expertise.”
Evolution of FPGAs [Peter Thorwartl]

Outline (5)
• The survival of our important ICT infrastructures
• The Tunnel vision syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent computing to fully cover the taxonomy
• Conclusions

going beyond the tunnel
Creating energy-efficient ICT infrastructures means dramatically more than just a circuit design issue
Energy-efficient programming: not with curricula from the mainframe age !

A huge design space
Solving the programmability crisis is impossible without mastering the entire design space

Mike Flynn's taxonomy
• Instruction vs. Data
• Single vs. Multiple
• The tunnel view of the pre-manycore age

extending Flynn's taxonomy by going heterogeneous:
• Reiner's Taxonomy
• reconfigurable or not
• Diana's Taxonomy (Diana Göhringer's Ph.D. thesis)
• datastream-based (anti-machine): noI versus SI or MI

Education Revolution: the M-&-C Design Revolution
VLSI Design Education Spreading Rapidly, 1980 - 1983, world-wide
The E.I.S. project: http://xputer.de/EIS/
[Figure, 1980: fragmentation. Each of the levels (Application, RT, Logic, Switching, Circuit, Layout) hands down by “submit” and hands back by “reject”; in-house technology; width of specialization.]

The Mead-&-Conway strategy: Removal of the education dilemma
• coherence
• clearing out & intuitive models
• division of specialization: tall thin man
• incubator of workstation and EDA industry etc.
• Silicon Foundry (external technology)
• reduced width of specialization
Carver Mead, Lynn Conway
The most effective project in the history of modern computer science

Outline (6)
• The survival of our important ICT infrastructures
• The Tunnel vision syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent computing to fully cover the taxonomy
• Conclusions

We need “une levée en masse”

thank you

backup for discussion

What form of Parallelism?
[Hartenstein's watering can model]
• instruction-stream-based approach: many von Neumann bottlenecks
• data-stream-based approach: no von Neumann bottleneck
I used this picture in several earlier talks since it is popular.
Other speakers use it too: see next slide.
parallelism solution:
• the instruction-stream-based approach: von Neumann bottlenecks
• the data-stream-based approach has no von Neumann bottleneck
Copyright © 2005 J. D. Cho

Dual paradigm mind set: an old hat - but ignored
time to space mapping: procedural to structural
[Figure labels: token bit, evoke, FF]
• 1967: W. A. Clark: Macromodular Computer Systems; 1967 SJCC, AFIPS Conf. Proc.
• 1971: C. G. Bell et al.: The Description and Use of Register-Transfer Modules (RTM's); IEEE Trans. C-21/5, May 1972

Duality of procedural Languages
Software Languages (program counter) | Flowware Languages (data counter(s))
read next instruction | read next data item
goto (instruction address) | goto (data address)
jump to (instruction address) | jump to (data address)
instruction loop | data loop
instruction loop nesting | data loop nesting
instruction loop escape | data loop escape
instruction stream branching | data stream branching
internally parallel loops: no | internally parallel loops: yes
But there is an Asymmetry: Flowware is simpler, with no ALU tasks

All but ALU is overhead: x20 efficiency
[R. Hameed et al.: Understanding Sources of Inefficiency in General-Purpose Chips; 37th ISCA, June 19-23, 2010, St. Malo, France]
Just one of several overhead layers (data cache)

Program Engineering (2)
The Generalization of Software Engineering
vN versus Anti-Machine (data stream machine)
• PE: Program Engineering
• SE: Software Engineering (CPU)
• FE: Flowware Engineering (asM: auto-sequencing memory)
• pipe network model etc.
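The program-counter versus data-counter duality on these slides can be illustrated with a small Python sketch. It is only a software analogy under stated assumptions: `data_counter` and `run_data_stream` are hypothetical names that do not come from the talk, the nested loops stand in for a reconfigurable address generator, and the lambda stands in for a configured datapath. Python itself still runs on an instruction-stream machine, so the sketch shows only the division of roles, not the efficiency claim.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Address = Tuple[int, int]


def data_counter(rows: int, cols: int,
                 row_step: int = 1, col_step: int = 1) -> Iterator[Address]:
    """Hypothetical address generator: a nested data loop over a 2-D memory.

    It plays the role the slides give to the data counter: it decides which
    data item is read next, independently of what is computed on that item.
    """
    for r in range(0, rows, row_step):
        for c in range(0, cols, col_step):
            yield (r, c)


def run_data_stream(memory: Sequence[Sequence[int]],
                    addresses: Iterator[Address],
                    datapath: Callable[[int], int]) -> List[int]:
    """Push every addressed item through one fixed datapath function.

    The datapath is chosen once, before the stream starts (the analogue of
    configuration); only data items move while the stream runs.
    """
    return [datapath(memory[r][c]) for (r, c) in addresses]


memory = [[1, 2, 3],
          [4, 5, 6]]

# Scan the whole 2x3 memory and double each item.
doubled = run_data_stream(memory, data_counter(2, 3), lambda x: 2 * x)
print(doubled)  # [2, 4, 6, 8, 10, 12]
```

Changing only the arguments of `data_counter` (for example `col_step=2`) changes the scan pattern without touching the datapath, which is the separation of Flowware from Configware that the terminology slide describes.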
Program Engineering structures
• CE: Configware Engineering
• DPU: Data-Path Unit
• DPA: Data-Path Array
*) do not confuse with “dataflow”!

the Bubble Sort algorithm
[Figure: network of conditional swap units]

Parallelized Bubble Sort (Shuffle Sort)
[Figure: array of conditional swap units]
• direct time to space mapping

Shuffle Sort* (animation)
• modified by shuffle function
• accessing conflicts
• partly back to time mapping
*) http://xputers.informatik.uni-kl.de/papers/publications/diploma-theses.html#Duhl

Other voices
The Hardware Architecture Challenge: more parallelism needed by orders of magnitude.
Entirely New Software Stack needed:
• New scalable and robust OS needed.
• New software architectures required for OS, RS, APIs and compilers
• How to provide a non-disruptive path for existing application code?
Fundamental Programming Issues:
• The High Cost of Data Movement
• Hardware mechanisms for efficient communication
• A programming model expressing all available parallelism and locality
• Compilers and run time systems exploiting parallelism and locality

IBM Roadrunner: 2,483 kW
ASCI Red: 850 kW
BTW: July 2005: an early trailblazer
Exascale is no longer some vague over-the-horizon notion but rather an aggressively sought-after goal.
The future of computing has never seemed so uncertain

Language designer's tunnel vision
• Absurdly incomprehensible abstractions in “standard” languages
• Removing paradigm domains and abstraction layers hides critical sources of efficiency limits: memory mapping issues, overhead and bottlenecks
• Teaching students the tunnel vision of language designers?
• We must change how programmers think, also by …

Data-Flow Stream Execution Models for Extreme Scale Computing (DFM 2012)
• the ubiquitous Memory Wall
• The High Cost of Movement of Data (and Instructions)
• DF systems could be simpler and more power efficient in handling concurrency and latencies
• It's time to revisit data-driven computation and bring it to multi-core and extreme scale computing: an overall system concept including hardware and software
• Novel DF-inspired models, paradigms, architectures, compilers and tools for multi-core and supercomputing. No evolutionary extension of current models.

Operating Systems for Exascale Computing and Beyond
Thomas Sterling
Radical “disruptive research” is required in programmability

Supercomputer High end Programmer Productivity
The Law of More: programmer productivity declines disproportionately with increasing parallelism
In particular HPC application domains, massive parallelism requires 10 – 30 professionals in multi-disciplinary multi-institutional teams for 5 - 10 years [Douglass Post, DoD HPCMP, panelist at SC07]

Will we never reach Zettaflops?
Name | FLOPS
yottaFLOPS | 10^24
zettaFLOPS | 10^21
exaFLOPS | 10^18
petaFLOPS | 10^15
teraFLOPS | 10^12
gigaFLOPS | 10^9
megaFLOPS | 10^6
kiloFLOPS | 10^3

Trailblazing (the xputer)
• 1975, Nice: Univ Bus considered harmful
• ICCAD 1984, Santa Clara: PISA
• 1988: Workshop on HW accelerators, Oxford: MoM
• Killarney 1989: super systolic MoM
• COMPEURO Hamburg 1989
• ICPP-90: Xputer
• HICSS Koloa, 1991
• auto-sequencing 2-dim memory space

Links to Reinvent Computing
• The Grand Challenge to Reinvent Computing: http://xputer.de/pucminas/
• Reinvent Computing? This idea is not new. See the keynote by Burton Smith (former Cray CTO): http://xputer.de/reinvent/
• Invasic Computing: aggressive 30-person project: http://xputer.de/invasic/
• Invasive Computing: An Overview: http://xputer.de/invasive/
• KAHRISMA: KArlsruhe's Hypermorphic Reconfigurable-Instruction-Set Multigrained-Array Processor: http://www.kahrisma.de/ http://xputer.de/kahr/
• ARAMiS (Automotive, Railway and Avionics Multicore Systems), a large German/European project: http://xputer.de/aramis/
• 1st Workshop on Data-Flow Execution Models for Extreme Scale Computing (DFM-2011): http://xputer.de/DFM1/
• DFM 2012: http://xputer.de/DFM2/

Bio
Dr.-Ing. Reiner Hartenstein is a full professor at TU Kaiserslautern and an independent expert and consultant on EDA in Reconfigurable Computing.
A student of Karl Steinbuch, he holds all his academic degrees from EE at KIT (Karlsruhe Institute of Technology), where he later was associate professor, working in image processing, computer architecture and hardware description languages. He appreciates a decade of fruitful cooperation with colleagues of the University of Brasilia.
Prof. Hartenstein is FPL fellow, SDPS fellow, IEEE fellow and recipient of other awards. He has given more than 200 invited talks and 40 international keynote addresses. He has published more than 400 papers and authored, edited or co-edited 16 books.