tilera

Architectures for Multimedia Systems
TILERA – TILE64™ PROCESSOR
Mondello Filippo
722955
Index
• Tile Processor Architecture
• Tile64 implementation
• Tile Processor Architecture innovations:
  ◦ Large number of tiles on a chip
  ◦ iMesh
  ◦ Multicore coherent cache
  ◦ Multicore Hardwall technology
  ◦ Multicore Development Environment Tools Suite
Tile Processor Architecture
• MIMD machine
• 2D grid of 64 homogeneous, general-purpose compute elements: tiles
• Tilera's iMesh on-chip network
• 4 DDR2 controllers + I/O controllers
• TILES:
  ◦ Processor
  ◦ L1 & L2 cache
  ◦ Non-blocking switch
Tile Processor Architecture: Tile64 implementation
• Cores: 32-bit, RISC, VLIW, 90nm technology
  ◦ 192 billion 32-bit ops
  ◦ 256 billion 16-bit ops
  ◦ half a teraops for 8-bit operations
• Memory
  ◦ L1 cache: 8KB I, 8KB D, 1 cycle latency
  ◦ L2 cache: 64KB unified, 7 cycle latency
  ◦ Off-chip main memory, ~70 cycle latency
  ◦ 32-bit virtual address space per process
  ◦ 64-bit physical address space
  ◦ Instruction and data TLBs
  ◦ Cache integrated 2D DMA engine
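To get a feel for what the three latencies above mean together, the sketch below computes an average memory access time. It is only a rough model: the miss rates are hypothetical inputs (the slides give none), and it assumes a miss at one level simply adds the latency of the next level.

```python
# Latencies in cycles, as quoted on the slide.
L1_LATENCY = 1
L2_LATENCY = 7
DRAM_LATENCY = 70  # the slide says "~70", so this value is approximate


def average_access_cycles(l1_miss_rate: float, l2_miss_rate: float) -> float:
    """Average memory access time for L1 -> L2 -> off-chip memory.

    Simplified model: each miss adds the latency of the next level.
    Both miss rates are hypothetical inputs, not measured TILE64 figures.
    """
    return L1_LATENCY + l1_miss_rate * (L2_LATENCY + l2_miss_rate * DRAM_LATENCY)


if __name__ == "__main__":
    # Illustrative (made-up) miss-rate pairs.
    for l1_miss, l2_miss in [(0.02, 0.10), (0.05, 0.20), (0.10, 0.50)]:
        cycles = average_access_cycles(l1_miss, l2_miss)
        print(f"L1 miss {l1_miss:.0%}, L2 miss {l2_miss:.0%}: {cycles:.2f} cycles")
```

The gap between the 7-cycle L2 and the ~70-cycle main memory is why keeping data on chip matters so much in this design.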
iMesh network
• Using multiple processors requires a system that allows communication among them.
  ◦ Old solution: bus interconnection.
    Problem: as more cores are added to a chip, the bus creates data congestion, limiting performance scalability as the number of cores increases.
  ◦ Tilera's solution: iMesh
• iMesh consists of five networks:
  ◦ user dynamic network (UDN)
  ◦ I/O dynamic network (IDN)
  ◦ static network (STN)
  ◦ memory dynamic network (MDN)
  ◦ tile dynamic network (TDN)
iMesh network
• Each tile uses a fully connected crossbar → all-to-all five-way communication.
• Dynamic networks:
  ◦ Packetized, fire-and-forget interface, dimension-ordered wormhole-routed.
  ◦ Packet = header word + up to 128 words per packet
  ◦ Hop latency: one cycle if the packet is going straight; one extra cycle for route calculation when the packet must make a turn at a switch.
• Static network:
  ◦ Static configuration of the routing decisions at each switch point.
  ◦ Auxiliary processor for reconfiguring the network in a programmatic manner.
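The dimension-ordered routing and hop-latency rule for the dynamic networks can be sketched as follows. This is a simplified model, not Tilera code: it assumes the 64 tiles form an 8 x 8 grid, that packets travel along X first and then Y (the slides do not say which dimension comes first), and it counts only per-hop cycles plus the one-cycle turn penalty. The names `xy_route` and `hop_latency_cycles` are made up for this illustration.

```python
from typing import List, Tuple

Tile = Tuple[int, int]  # (x, y) position of a tile in the grid

GRID_SIZE = 8  # assumption: 64 tiles arranged as 8 x 8


def xy_route(src: Tile, dst: Tile) -> List[Tile]:
    """Return the tiles visited from src to dst, X dimension first, then Y."""
    for tx, ty in (src, dst):
        if not (0 <= tx < GRID_SIZE and 0 <= ty < GRID_SIZE):
            raise ValueError(f"tile ({tx}, {ty}) is outside the grid")
    x, y = src
    path = [src]
    step_x = 1 if dst[0] > x else -1
    while x != dst[0]:  # travel along X until the column matches
        x += step_x
        path.append((x, y))
    step_y = 1 if dst[1] > y else -1
    while y != dst[1]:  # then travel along Y until the row matches
        y += step_y
        path.append((x, y))
    return path


def hop_latency_cycles(src: Tile, dst: Tile) -> int:
    """One cycle per hop, plus one extra cycle if the route makes a turn."""
    hops = len(xy_route(src, dst)) - 1
    # Dimension-ordered routing turns at most once: when both X and Y change.
    turns = 1 if (src[0] != dst[0] and src[1] != dst[1]) else 0
    return hops + turns


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    print(hop_latency_cycles((0, 0), (7, 0)), "cycles along one row")
    print(hop_latency_cycles((0, 0), (7, 7)), "cycles corner to corner")
```

Because the route is fixed by source and destination alone, each switch needs no global state, and a packet pays the route-calculation cycle at most once.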
iMesh network
• UDN → communication among userland processes or threads.
• IDN → direct communication with I/O devices.
• MDN → communication with off-chip DRAM.
• TDN → direct tile-to-tile cache transfers. Works in concert with the MDN.
• STN → low-latency, high-bandwidth channelized network, well suited to streaming data.
Multicore coherent cache
• Cache subsystem → high-performance, two-level, non-blocking cache hierarchy.
• Each tile's cache can be shared with other tiles
  → each tile can access the aggregate multi-megabyte cache.
  → each tile can view the collection of on-chip caches of all tiles, serving as an L3 cache.
• Neighborhood caching to provide an on-chip distributed shared cache.
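The "multi-megabyte" aggregate cache follows directly from the per-tile sizes given for the Tile64 implementation (8KB I + 8KB D L1, 64KB L2, 64 tiles). A quick check of the arithmetic, with variable names chosen only for this illustration:

```python
TILES = 64
L1_I_KB = 8   # L1 instruction cache per tile
L1_D_KB = 8   # L1 data cache per tile
L2_KB = 64    # unified L2 cache per tile

aggregate_l1_kb = TILES * (L1_I_KB + L1_D_KB)
aggregate_l2_kb = TILES * L2_KB
total_kb = aggregate_l1_kb + aggregate_l2_kb

print(f"Aggregate L1: {aggregate_l1_kb} KB ({aggregate_l1_kb // 1024} MB)")
print(f"Aggregate L2: {aggregate_l2_kb} KB ({aggregate_l2_kb // 1024} MB)")
print(f"Total on-chip cache: {total_kb} KB ({total_kb // 1024} MB)")
```

So the L2 caches alone add up to 4 MB, and the L1 caches contribute another 1 MB across the chip.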
Multicore Hardwall technology
• Enables the user to define one or many cores as a processing island, eliminating communication between it and other cores unless specified.
• If a packet attempts to cross the established boundary, an interrupt is signaled and control is passed on to the hypervisor.
• The Tile Processor architecture is therefore well suited to hosting multiple operating systems running independent applications, or multiple instances of the same application, on a single-chip platform.
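The boundary check described above can be pictured with a toy software model. This is not the Tilera mechanism or API: in hardware the check happens in the switches, and here a Python exception merely stands in for the interrupt that passes control to the hypervisor. The island names, `check_route`, and `HardwallViolation` are all hypothetical, and the "unless specified" exceptions are left out for brevity.

```python
from typing import Dict, List, Tuple

Tile = Tuple[int, int]  # (x, y) position of a tile


class HardwallViolation(Exception):
    """Stands in for the interrupt that hands control to the hypervisor."""


def check_route(route: List[Tile], island_of: Dict[Tile, str]) -> None:
    """Raise HardwallViolation if any hop moves from one island to another."""
    for here, there in zip(route, route[1:]):
        if island_of[here] != island_of[there]:
            raise HardwallViolation(f"packet blocked between {here} and {there}")


# Hypothetical partition of a 4 x 2 strip of tiles:
# the left half hosts "os_a", the right half hosts "os_b".
island_of = {
    (x, y): ("os_a" if x < 2 else "os_b")
    for x in range(4)
    for y in range(2)
}

if __name__ == "__main__":
    check_route([(0, 0), (1, 0), (1, 1)], island_of)  # stays inside os_a
    try:
        check_route([(1, 0), (2, 0)], island_of)  # tries to enter os_b
    except HardwallViolation as exc:
        print("hypervisor notified:", exc)
```

Each island behaves like a separate machine, which is what makes it possible to run independent operating systems side by side on one chip.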
Multicore Development Environment Tools Suite
• The Tilera MDE includes a powerful Eclipse-based integrated development environment (IDE), an ANSI-standard 'C' compiler, a full-system simulation model and a set of flexible command-line interfaces.
• To achieve optimum performance on the chip, the MDE includes an optimized user communication library (iLib) offering standard mechanisms such as process management, socket-like streaming channels, message passing, and shared-memory communication.
• Tilera defined the Gentle Slope Programming model, which enables the user to begin with familiar programming tools and move easily to advanced, large-scale multicore programming.