Tilera’s Many-core Processor

Transcription

Tilera’s Many-core Processor
A scalable architecture on a single chip.
J. Whitesell & S. Ladavich
Tuesday, May 14th, 2013
History of Tilera
Pros and Cons of Building a Manycore Architecture
The Tilera Approach
• Tilera’s …
  ▪ Tile Architecture
  ▪ iMesh Network Topology
• Applications …
  ▪ Server
  ▪ Media
  ▪ Cloud
Performance Analysis and Benchmarking
MIT’s Dr. Anant Agarwal leads the way for Tiled Manycore
1990 Multi-processor made of single chips
1994 32-node mesh-based cache-coherent processor
2002 MIT’s RAW architecture. DARPA pays the bill! Gives tens of millions supporting RAW
2004 Tilera’s stealth launch
2007 Tilera’s corporate launch
2011 Latest line: Gx series is released
“… does not exist!”
“Tilera has solved the multi-processor scalability problem!”
Traditional Architectures aren’t Scalable
• Most Multi-Core Chips Stop Around 8 Cores
• Bus Interconnect
  ▪ Creates a Bottleneck for Main Memory Access
  ▪ Consumes Chip Area & Power
• On-Chip Memory Limits
• Software Support
  ▪ Efficient API Development is Challenging
  ▪ Parallel Languages and Programmers are Needed

On-Chip Communication is Fast!
• Reduced Overheads
• Finer Grain Size
On-Chip Network Footprint is Small!
• Natural Tiled Connections
• 2-D Mesh Suits 2-D Substrate
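The scaling argument for a 2-D mesh can be made concrete with a small sketch. It assumes a square n x n mesh in which a message travels along one dimension and then the other, so the hop count is the Manhattan distance between tiles. The routing scheme and the function names are assumptions for illustration; the slides do not state them.

```python
from itertools import product


def mesh_links(n: int) -> int:
    """Number of bidirectional neighbor links in an n x n 2-D mesh."""
    # Each of the n rows has n - 1 horizontal links, and likewise per column.
    return 2 * n * (n - 1)


def hops(src: tuple[int, int], dst: tuple[int, int]) -> int:
    """Hop count between two tiles, assumed to be the Manhattan distance."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def average_hops(n: int) -> float:
    """Mean hop count over all ordered pairs of distinct tiles."""
    tiles = list(product(range(n), repeat=2))
    total = sum(hops(a, b) for a in tiles for b in tiles if a != b)
    return total / (len(tiles) * (len(tiles) - 1))
```

Under these assumptions the link count grows roughly in proportion to the number of tiles, while a single shared bus stays one shared resource no matter how many cores are attached.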
Create a Basic Modular Unit
• Homogeneous Across Chip
• Known as a Tile
  ▪ Full-Featured Processor Core
  ▪ Processor Engine
  ▪ Cache Engine
  ▪ Switch Engine
  ▪ Capable of Running an OS
Figure: Basic Look Inside a Tile

Processor Engine
• 64-bit VLIW Architecture
  ▪ 3 Execution Pipelines
    - ALU, Flow Control, LD/ST
Cache Engine
• Dynamic Distributed Cache
  ▪ Shared L2 Caches (Act as an L3)
Switch Engine
• Direct Neighbor Connections
• I/O Connections on Periphery
Figure: Detailed Look Inside a Tile
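As a rough illustration of the tile as a modular unit, the sketch below models a grid of identical tiles, each with the three engines named on the slide, and derives the direct neighbor connections and the periphery tiles where I/O attaches. The `Tile` class and both helper functions are hypothetical names chosen for this example, not part of any Tilera software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    row: int
    col: int
    # The three engines named on the slide; the strings are placeholders.
    engines: tuple[str, ...] = ("processor", "cache", "switch")


def neighbors(tile: Tile, rows: int, cols: int) -> list[Tile]:
    """Tiles directly connected to `tile` (up, down, left, right)."""
    candidates = [
        (tile.row - 1, tile.col),
        (tile.row + 1, tile.col),
        (tile.row, tile.col - 1),
        (tile.row, tile.col + 1),
    ]
    # Keep only the coordinates that fall inside the grid.
    return [Tile(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def on_periphery(tile: Tile, rows: int, cols: int) -> bool:
    """True when the tile sits on the edge of the grid, where I/O connects."""
    return tile.row in (0, rows - 1) or tile.col in (0, cols - 1)
```

Because every tile is the same, growing the chip in this model only means changing `rows` and `cols`; a corner tile has two neighbors and an interior tile has four.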
Networks are easy!
Communication is cheap!
Leverage Multiple Independent Networks
1) How many networks are needed?
2) What functionalities do the networks have?
How are the message types and communications defined?
Message Types:
1) Implicit
2) Message Passing
3) Streaming Data
   a) Small stream
   b) Large stream
   c) Large/Continuous
4) System Level & IO
5 Independent Hardware Networks (Dedicated Networks):
1) MDN: Memory Dynamic Network
2) TDN: Tile Dynamic Network
3) UDN: User Dynamic Network
4) STN: Static Network
5) IDN: I/O Dynamic Network
Which minimize overheads for all desired forms of communication
Diagram:
• Implicit Message Passing (1) through…
  ▪ Tile-to-tile shared address space: non-uniform / distributed cache access (NUCA)
  ▪ Shared address space in off-chip / main memory: uniform memory access (UMA)
• Explicit Message Passing
  ▪ Messages (2)
  ▪ Streaming Data (3a, 3b)
  ▪ Large Buffers, Small Buffers
• Special Case: High Performance Streaming (3c)
• Special Case: IO Messages, System Traffic (4)
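The relationship between message types and the five networks can be sketched as a lookup table. The pairing below follows the order in which the slides introduce each network: implicit traffic with MDN and TDN, message passing with UDN, the high-performance streaming special case with STN, and system and I/O traffic with IDN. Placing small and large streams on UDN is an assumption read from that build order, and the dictionary keys and function name are hypothetical.

```python
from enum import Enum


class Network(Enum):
    MDN = "Memory Dynamic Network"
    TDN = "Tile Dynamic Network"
    UDN = "User Dynamic Network"
    STN = "Static Network"
    IDN = "I/O Dynamic Network"


# Message type -> dedicated network(s), as read from the slide build order.
MESSAGE_TYPE_TO_NETWORKS = {
    "implicit": (Network.MDN, Network.TDN),
    "message_passing": (Network.UDN,),
    "small_stream": (Network.UDN,),   # assumption: 3a shares the UDN
    "large_stream": (Network.UDN,),   # assumption: 3b shares the UDN
    "continuous_stream": (Network.STN,),
    "system_io": (Network.IDN,),
}


def networks_for(message_type: str) -> tuple[Network, ...]:
    """Return the network(s) assumed to carry the given message type."""
    try:
        return MESSAGE_TYPE_TO_NETWORKS[message_type]
    except KeyError:
        raise ValueError(f"unknown message type: {message_type!r}") from None
```

Keeping each class of traffic on its own network is what lets every form of communication avoid paying for the overheads of the others.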
Parallel Processing in Embedded Domain
• Network
  ▪ Lossless Packet Capture
  ▪ Intrusion Detection & Prevention
• Multimedia
  ▪ Video Conferencing
  ▪ IP Surveillance
• Cloud
  ▪ In-Memory Caching
  ▪ Server Load Balancing
Numerous Evaluations
• Single-Core Performance
  ▪ CoreMark Score
• Parallelized Performance
  ▪ Information Fusion
  ▪ Gaussian Elimination
  ▪ MemCached
Comparisons of SMPs & Many-Core
CoreMark Score
Evaluates Single-Core Performance
• 4 Algorithms
• 1 Final Score
Tilera’s Processors Feature:
• VLIW Architecture
• 3 Pipelines
• 64-bit Instruction Words
• All-or-None Execution
Figure: Single-Core Single Thread CoreMark Comparison
Information Fusion
Embedded Wireless Sensor Networks
• Cluster Heads Receive from 10 Sensors
• Head Node Performs Reduction
  ▪ Moving Average Filter
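A minimal sketch of the reduction step, assuming the cluster head applies a fixed-window moving average to each sensor's stream independently. The window length, the function names, and the per-sensor treatment are assumptions; the slides name only the filter type and the count of 10 sensors.

```python
from collections import deque


def moving_average(samples, window: int):
    """Yield the mean of the most recent `window` samples seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_sum = 0.0
    for x in samples:
        if len(recent) == window:
            # The oldest sample is about to be evicted by the append below.
            running_sum -= recent[0]
        recent.append(x)
        running_sum += x
        yield running_sum / len(recent)


def cluster_head(sensor_streams, window: int = 3):
    """Filter each sensor's stream; returns one smoothed list per sensor."""
    return [list(moving_average(stream, window)) for stream in sensor_streams]
```

For example, `list(moving_average([1, 2, 3, 4], 2))` gives `[1.0, 1.5, 2.5, 3.5]`. Each sensor stream is independent, which is what makes this workload easy to spread across tiles.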
Results Vary Based on Application
• Integer-Based Arithmetic
• Floating-Point Intensive
Figure: Information Fusion Application
Figure: Gaussian Elimination Application
Why?
• Tiles Lack a Dedicated Floating-point Unit!
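To show why Gaussian elimination is floating-point intensive, here is a small sequential sketch that solves a linear system. Partial pivoting is added here for numerical stability and is not claimed to match the benchmarked implementation, which was parallelized; the function name is hypothetical.

```python
def gaussian_elimination(a, b):
    """Solve a x = b for a square system using partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination: every inner step is a floating-point multiply-subtract.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution, from the last row upward.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x
```

The triple loop performs on the order of n cubed floating-point operations, so a core without a dedicated floating-point unit has to emulate each of them with integer instructions.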
MemCached
Distributed Memory Caching System
• Creates a Virtual Memory Pool
• Used for Key-Value Stores
• Designed to Alleviate Database Load
Currently Implemented by…
• Social Media Giants
  ▪ Facebook, Twitter, and Zynga
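The idea of a pooled key-value cache in front of a database can be sketched as follows. This is not the memcached API or protocol: `CachePool`, `read_through`, and the hash-modulo placement of keys are hypothetical and chosen only to illustrate how several nodes can look like one memory pool.

```python
import hashlib


class CachePool:
    """Hypothetical pool of cache nodes presented as one key-value store."""

    def __init__(self, node_count: int):
        if node_count < 1:
            raise ValueError("node_count must be at least 1")
        # Each node is modeled as a plain in-memory dictionary.
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key: str) -> dict:
        # Hash the key so that every client picks the same node for it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def set(self, key: str, value) -> None:
        self._node_for(key)[key] = value

    def get(self, key: str, default=None):
        return self._node_for(key).get(key, default)


def read_through(pool: CachePool, key: str, load_from_database):
    """Return the cached value, querying the database only on a miss."""
    value = pool.get(key)
    if value is None:
        value = load_from_database(key)
        pool.set(key, value)
    return value
```

Repeated reads of the same key reach the database once and are served from memory afterwards, which is how such a cache alleviates database load.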
For a Fixed Memory Footprint
• Tilera Achieves 3.35x Throughput at Less Power
• Better Performance per Watt
The Tile Architecture Exhibits…
• Superior Scalability
  ▪ Modular Design
  ▪ Low Cost of On-Chip Communication
  ▪ Exploiting a Variety of Task Grain Sizes
  ▪ Instruction-Level (ILP) and Thread-Level (TLP) Parallelism
• High Performance per Watt
  ▪ Relatively Low Clock Speeds
  ▪ Idle Mode for Unused Tiles
  ▪ Reducing Costs of Web Datacenters
