tilera
Architectures for Multimedia Systems
TILERA – TILE64™ PROCESSOR
Mondello Filippo, 722955

Index
• Tile Processor Architecture
• Tile64 implementation
• Tile Processor Architecture innovations:
  ◦ Large number of tiles on a chip
  ◦ iMesh
  ◦ Multicore coherent cache
  ◦ Multicore Hardwall technology
  ◦ Multicore Development Environment Tools Suite

Tile Processor Architecture
• MIMD machine
• 2D grid of 64 homogeneous, general-purpose compute elements: tiles
• Tilera's iMesh on-chip network
• 4 DDR2 controllers + I/O controllers
• Tiles:
  ◦ Processor
  ◦ L1 & L2 cache
  ◦ Non-blocking switch

Tile Processor Architecture: Tile64 implementation
• Cores: 32-bit, RISC, VLIW, 90nm technology
• Peak throughput (per second): 192 billion 32-bit ops; 256 billion 16-bit ops; half a teraops for 8-bit operations
• Memory:
  ◦ L1 cache: 8KB I, 8KB D, 1 cycle latency
  ◦ L2 cache: 64KB unified, 7 cycle latency
  ◦ Off-chip main memory, ~70 cycle latency
  ◦ 32-bit virtual address space per process
  ◦ 64-bit physical address space
  ◦ Instruction and data TLBs
  ◦ Cache integrated 2D DMA engine

iMesh network
• Using multiple processors requires a system that allows communication among them.
  ◦ Old solution: bus interconnection. Problem: as more cores are added to a chip, the bus becomes congested with data, limiting how performance scales with the number of cores.
  ◦ Tilera's solution: iMesh
• iMesh consists of five networks:
  ◦ user dynamic network (UDN)
  ◦ I/O dynamic network (IDN)
  ◦ static network (STN)
  ◦ memory dynamic network (MDN)
  ◦ tile dynamic network (TDN)

iMesh network
• Each tile uses a fully connected crossbar → all-to-all five-way communication.
• Dynamic networks:
  ◦ Packetized, fire-and-forget interface, dimension-ordered wormhole-routed.
  ◦ Packet = header word + up to 128 words per packet.
  ◦ Hop latency: one cycle if the packet is going straight; one extra cycle for route calculation when the packet must make a turn at a switch.
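The hop-latency rule for the dynamic networks can be illustrated with a small sketch. It is not Tilera code: the function names `xy_route` and `header_latency` are hypothetical, and it assumes that the X dimension is routed before the Y dimension (the slides only say "dimension-ordered") and that tiles are addressed by (x, y) coordinates on the grid. It counts only the cycles the header word spends crossing switches, ignoring contention, payload length, and injection/ejection overhead.

```python
def xy_route(src, dst):
    """List the tiles a packet visits from src to dst.

    Assumption: X is routed first, then Y (the order is not stated in the slides).
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:              # travel along the X dimension first
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:              # then travel along the Y dimension
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


def header_latency(src, dst):
    """Cycles for the header word: one per hop, plus one per turn."""
    hops = len(xy_route(src, dst)) - 1
    # With dimension-ordered routing a packet turns at most once,
    # and only when it has to move in both dimensions.
    turns = 1 if (src[0] != dst[0] and src[1] != dst[1]) else 0
    return hops + turns


if __name__ == "__main__":
    print(header_latency((0, 0), (3, 0)))  # straight line: 3 hops, no turn
    print(header_latency((0, 0), (3, 2)))  # 5 hops and one turn
```

Because a dimension-ordered route turns at most once, the turn penalty in this model never exceeds one cycle, whatever the distance between the two tiles.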
• Static network:
  ◦ Static configuration of the routing decisions at each switch point.
  ◦ Auxiliary processor for reconfiguring the network in a programmatic manner.

iMesh network
• UDN → userland processes or threads.
• IDN → direct communication with I/O devices.
• MDN → communication with off-chip DRAM.
• TDN → direct tile-to-tile cache transfers. Works in concert with the MDN.
• STN → low-latency, high-bandwidth channelized network, well suited to streaming data.

Multicore coherent cache
• Cache subsystem → high-performance, two-level, non-blocking cache hierarchy.
• Each tile's cache can be shared with other tiles:
  → each tile can access the aggregate multi-megabyte cache;
  → each tile can view the collection of on-chip caches of all tiles, serving as an L3 cache.
• Neighborhood caching to provide an on-chip distributed shared cache.

Multicore Hardwall technology
• Enables the user to define one or many cores as a processing island, eliminating communication between it and other cores unless specified. If a packet attempts to cross the established boundary, an interrupt is signaled and control is passed to the hypervisor.
• The Tile Processor architecture is therefore well suited to hosting multiple operating systems running independent applications, or multiple instances of the same application, on a single-chip platform.

Multicore Development Environment Tools Suite
• The Tilera MDE includes a powerful Eclipse-based integrated development environment (IDE), an ANSI-standard C compiler, a full-system simulation model, and a set of flexible command-line interfaces.
• To achieve optimum performance on the chip, the MDE includes an optimized user communication library (iLib) offering standard mechanisms such as process management, socket-like streaming channels, message passing, and shared-memory communication.
• Tilera defined the Gentle Slope Programming model, which enables the user to begin with familiar programming tools and move to advanced, large-scale multicore programming easily.

References
• http://www.tilera.com/products/processors.php: ProductBrief_Tile64_Web_v3.pdf
• http://www.tilera.com/technology/technology.php: ArchBrief_Arch_V1_Web.pdf
• http://en.wikipedia.org/wiki/TILE64
• http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4378780
• http://techreport.com/discussions.x/13069
• http://www.hwupgrade.it/news/portatili/64-core-per-il-processore-tile64_22252.html
• http://www.theregister.co.uk/2007/08/20/tilera_tile64_chip/
• http://arstechnica.com/articles/paedia/cpu/MIT-startup-raises-multicore-bar-with-new-64-core-CPU.ars
• http://www.tgdaily.com/content/view/33451/135/
• http://www.itjungle.com/tlb/tlb082107-story02.html
• http://www.pcmag.com/article2/0,1895,2173203,00.asp