Server Forum 2014
Transcription
Copyright © 2014 [Desi Rhoden & Montage Technology]

If Servers are the Cloud Then Memory Is the Atmosphere of the Cloud

Consider the Details
DRAM is still the Best cost/performance Main Memory
Servers always want MORE memory: KB -> MB -> GB -> TB
Modules Continue to drive Server Main Memory Exponential Growth

[Figure: memory hierarchy ordered by Access Speed & Cost/bit. Processor: Fastest, $$$$. Main Memory: Fast, $. Storage: Slow, ¢. Label: Continuous Migration.]

• Experiments help find a better solution
• Some main memory is migrating to the processor like another level of cache, but probably DRAM instead of SRAM
• Some storage memory is migrating into main memory channels - still storage, but closer to main memory and nonvolatile (aka NVDIMM)
• As always there are trade-offs with migrations, and some changes will succeed while others may not
• Reliability is a must, but always driven by price, price, price, then power
• MORE @ lower price & lower power
• TB+ systems take a long time to boot
• Non-volatile would be great, but speed and re-write ability are still a problem
• Continuous improvement by evolution
• Big changes take a little time to grow

The DRAM DIMM Drives Main Memory For Servers

The DIMM Evolution for DDR4

EASY! - Put DDR4 on a PCB. Connect to a Standard Edge connector defined by JEDEC & you're done. Right? Oh, and add an SPD ROM so the system can use it.

[Figure: UDIMM block diagram. Labels: T, Vref, DRAM Data, Address & Command, SPD Intf.]

• Simple UDIMM - Volume driver for a long time
• Just not good enough for DDR4 servers
• DDR4 Speeds demand a better solution for SI etc.
For DDR4 Servers RDIMMs replace UDIMMs As the base Module
Which Also Implies Something Better

[Figure: RDIMM block diagram. Labels: SDRAM, Vref, T, RCD, SPD. Data: direct from DRAM balls, 1 to 4 loads. Address/Command: all registered, 1 or 2 loads. Address, Data & clock.]

• Solves Address & Command loading & SI problem of the UDIMM
• All Address & Command signals registered in DDR RCD
• DIMM Cmd/Add load is always 1 (or 2) with DDR3 RDIMM
• All Data (DQ) & Data Strobe (DQS) signals route to DIMM connector
• Data loads/layout/configuration/SI limit speed

Vertical DRAM Placement DDR4
New more complex RCD for DDR4 (Same RCD as DDR4 LRDIMM)

[Figure: DDR4 RDIMM block diagram, vertical DRAM placement. Labels: T, Vref, SPD. Data: direct from DRAM balls, 1 to 2 loads. Address/Command: all registered, 1 load. Address, Data & clock.]

• Same basic architecture as DDR3, but more limited configurations
• Improvement by added features in RCD & DRAM & optimized design
• Fewer configurations reduces future upgrade possibilities
• Same old architecture - Data loads/SI limit future speed & density

Flower DRAM Placement DDR4
New more complex RCD for DDR4 (Same RCD as DDR4 LRDIMM)

[Figure: DDR4 RDIMM block diagram, flower DRAM placement. Labels: T, Vref, SPD. Data: direct from DRAM balls, 1 to 2 loads. Address/Command: all registered, 1 load. Address, Data & clock.]

• Same basic architecture as DDR3, but more limited configurations
• Improvement by added features in RCD & DRAM & optimized design
• Fewer configurations reduces future upgrade possibilities
• Same old architecture - Data loads/SI limit future speed & density

Extending beyond RDIMM Requires a New Module Architecture
The DDR4 LRDIMM Is that New Architecture

[Figure: DDR3 LRDIMM block diagram. Labels: DDR3 SDRAM, Vref, T, Memory Buffer (MB), SPD access. Data: registered in MB, 1 load. Address/Command: registered, 1 load.]

• Single (MB) in central location on DIMM was a routing challenge
• All Address & Command signals registered in MB like RDIMM
• All Data & Data Strobe signals ALSO registered in MB
• Could not be extended to DDR4 - long DQ/DQS & SI limits speed
• OK 1st pass at DDR3 speeds, but DDR4 LRDIMM is much better

[Figure: DDR4 LRDIMM block diagram. Labels: DDR4 RCD, DDR4 DB, T, Vref, SPD. Data: DB to edge, 1 load. Address/Command: all registered, 1 load. Address, Data & clock.]

• Separate, distributed DB per byte - Fast RCD<->DB bus for control
• Short DIMM Data Stubs & isolated DRAM DQ for better SI & low power
• Extensible architecture for the future - memory devices completely isolated

Short stub to DDR4 DB enables higher channel speed, higher density and more DIMMs Per Channel (DPC)
DDR4 LRDIMM support up to 3DPC at high DDR4 frequencies
4, 8, 16 rank LRDIMMs supported

[Figure: Motherboard Cutout of one Memory Channel. Labels: CPU Memory Cont., DDR4, Strobe, I/O (x32), I/O (x40).]

● 4 or more Memory Channels per controller - common in DDR4
● Multi-core processors & multi-processor systems - common
● Lower power is an important key to the whole system today

Short stub to DDR4 DB enables higher channel speed, higher density and more DIMMs Per Channel (DPC)
DDR4 LRDIMM support up to 2DPC at higher DDR4 frequencies
4, 8, 16 rank LRDIMMs supported

[Figure: Motherboard Cutout of one Memory Channel. Labels: CPU Memory Cont., DDR4, Strobe, I/O (x32), I/O (x40).]

● Even more Memory Channels per controller in the future
● Multi-core processors & multi-processor systems - common
● Lower power is an important key to the whole system today

Typical DDP (NOT 3DS for DDR4)
• 2 I/O loads per ball
• Both Die connected to package Balls
• No DDP RDIMM

2 Styles of 3DS for DDR4, aka Through Silicon Via (TSV)

[Figure: the two 3DS stack styles. Labels: DRAM Die, I/O Logic in bottom DRAM, I/O Logic in separate Die.]

• 1 I/O load per ball regardless of # DRAM Die
• Only Bottom Die connected to package Balls
• All inter-Die connections are Internal Only

• 3DS will exist in 2, 4 & 8 Die packages
• I/O has same load regardless of # of Die/package
• Enables Common R/C w/3DS DRAM options
• Both styles of 3DS are part of the JEDEC standard

Solutions for the Density Demand

[Figure: PCB for DIMMs, cross-sections showing how ranks are populated. Panels: DDR4 Mono (Rank 0, Rank 1); DDR4 Mono (Single Sided): maybe 1 rank RDIMM; DDR4 3DS (Single Sided): possible, but costly; DDR4 3DS (4 rank); DDR4 3DS (8 rank); DDR4 3DS (16 rank).]

[Figure: front and back DRAM on Rank 0 and Rank 1 with address balls A0, A1, A5, A6 shown in swapped positions. Caption: DRAM Ballout does not change, but the connection to some signals is swapped to improve routing on DIMM.]

DDR4 DRAM allows "Address Mirroring" much like DDR3. Some signals can be swapped on the DIMM to allow front (even) side DRAM to connect directly to back (odd) side DRAM without affecting normal functionality. The SPD identifies if Mirroring was used on a module. All controllers must "de-mirror" Odd rank accesses to internal DRAM registers. The table identifies which bits are swapped for mirroring.
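The de-mirroring step described above can be sketched in a few lines. This is only an illustration, not controller firmware: the swap pairs are the ones listed in the talk's mirroring table (A3/A4, A5/A6, A7/A8, A11/A13, BA0/BA1, BG0/BG1), while the function names, the dictionary representation of the signals, and the `module_is_mirrored` flag (standing in for the SPD indication) are assumptions made for the example.

```python
# Swap pairs as listed in the talk's address mirroring table.
MIRROR_PAIRS = [
    ("A3", "A4"), ("A5", "A6"), ("A7", "A8"),
    ("A11", "A13"), ("BA0", "BA1"), ("BG0", "BG1"),
]

# Build a symmetric lookup so each signal maps to its partner.
SWAP = {}
for first, second in MIRROR_PAIRS:
    SWAP[first] = second
    SWAP[second] = first


def mirror_signals(signals):
    """Model the DIMM wiring of a mirrored (odd) rank.

    `signals` maps connector signal names to 0/1 levels. The result
    maps DRAM ball names to the level each ball actually receives.
    Signals outside the swap table pass straight through.
    """
    return {SWAP.get(name, name): level for name, level in signals.items()}


def demirror_for_odd_rank(intended, rank, module_is_mirrored):
    """Hypothetical controller-side step.

    Pre-swap the bits for odd ranks on a mirrored module so that the
    DRAM sees the intended values after the DIMM swaps them again.
    Because the swap is its own inverse, the same mapping is reused.
    """
    if module_is_mirrored and rank % 2 == 1:
        return mirror_signals(intended)
    return dict(intended)


# Example: a register write whose address bits must arrive unchanged.
intended = {"A0": 1, "A3": 1, "A4": 0, "A5": 0, "A6": 1, "BA0": 1, "BA1": 0}
driven = demirror_for_odd_rank(intended, rank=1, module_is_mirrored=True)
seen_by_dram = mirror_signals(driven)
```

Because each pair is simply exchanged, applying the swap twice restores the original pattern, which is why a single table can serve both the DIMM wiring model and the controller correction.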
Connector        DRAM Ball Label
Signal Name      Even Rank      Odd Rank
A0               A0             A0
A1               A1             A1
A2               A2             A2
A3               A3             A4
A4               A4             A3
A5               A5             A6
A6               A6             A5
A7               A7             A8
A8               A8             A7
A9               A9             A9
A10/AP           A10/AP         A10/AP
A11              A11            A13
A12/BC_n         A12/BC_n       A12/BC_n
A13              A13            A11
A14/WE_n         A14/WE_n       A14/WE_n
A15/CAS_n        A15/CAS_n      A15/CAS_n
A16/RAS_n        A16/RAS_n      A16/RAS_n
A17              A17            A17
BA0              BA0            BA1
BA1              BA1            BA0
BG0              BG0            BG1
BG1              BG1            BG0
Comment: Not Valid for x4 <16G

[Table: Feature comparison of RDIMM 1R, RDIMM 2R and LRDIMM. Rows: Data Load, DIMM Intf Max Speed, 4GB DIMM, 8GB DIMM, 16GB DIMM, 32GB DIMM, 64GB DIMM, 128GB DIMM. Cost annotation: Increasing Cost Options, from $ to $$$$$$.]

DIMM Configuration    2014    2014 Stretch    2015          2016 Gen 2
1DPC RDIMM            2133    2133            2400*         2400*
2DPC RDIMM            1866    1866            1866/2133?    2133?*
3DPC RDIMM            1600    1600/1866?      1600          n/a
1DPC LRDIMM           2133    2133            2400          2667 (2)
2DPC LRDIMM           2133    2133            2400          2667 (2)
3DPC LRDIMM           1600    1866            2133          n/a
* RDIMM speed is limited by DRAM + SI from DIMM
(2) DDR4 LRDIMM can/will extend beyond 2667
n/a - By 2016 it is unlikely 3DPC will be supported

• DDR4 LRDIMM represents a substantial architectural improvement for memory modules.
• Extending beyond will require more careful system, DIMM, Logic & DRAM design.
• LRDIMMs can be optimized to a single DIMM interface/layout - Memory type & density are independent of DIMM interface
• Logic can/will be further optimized in future devices to surpass future system demands.

DIMM Organization: x64, x72 ECC
DIMM Dimensions (nominal): 133.35 mm x 31.25 mm; 133.35 mm x 18.75 mm
    Notes: Refer to MO-309 (1.25 mm higher than DDR3)
Pin Count and Pitch: 284
    Notes: Refer to MO-309 (1.25 mm higher than DDR3). PCB thickness changed from 1.27 mm -> 1.4 mm
DDR4 SDRAMs Supported: 4Gb, 8Gb, 16Gb
    Notes: 78/106-ball FBGA package for x4 and x8 devices. Refer to MO-207: variations DT-z, DW-z
Capacity: 16GB, 32GB, 64GB, 128GB
DDR4 SDRAM width: X4, X8
Serial PD, Thermal Sensor (SPD/TS): 512 byte
    Notes: See EE1004-v and TSE2004av specifications
Voltage Options:
    PC4 - 1.2 Volt ±5% for VDD
    PC4L - TBD
    2.5 Volt +10%, -5% for VPP
    2.5 Volt or 3.3 Volt ±10% for VDDSPD
    Notes: All DDR4 modules use a common VDD-VDDQ power plane. They may be tied together on the DIMM, but by standard definition are supported on the pinout to accommodate future enhancements. The VPP supply has VSS as its return path. On DIMM it is a separate supply from VDDSPD. SPD supply is operable with 2.5V or 3.3V.
Interface: 1.2 V signaling

Card#     LP/VLP   I/O   Rank       DRAM I/O   DRAM       DRAM pcs   PCB H (mm)   DRAM Placement
R/C-A*    LP       x72   2/4/8/16   x4         Mono/3DS   36         31.25        Vertical 2 row
R/C-B*    LP       x72   2/4/8/16   x4         Mono/3DS   36         31.25        Flower 2 row
R/C-C     LP       x72   1          x4         Mono       18         31.25        Vertical 1 row
R/C-D     LP       x72   1          x8         Mono       9          31.25        Vertical 1 row, single sided
R/C-E     LP       x72   2          x8         Mono       18         31.25        Vertical 1 row
R/C-F*    VLP      x72   2/4/8      x4         Mono/3DS   18         18.75        Vertical 1 row
R/C-G     VLP      x72   1          x8         Mono       9          18.75        Vertical 1 row
R/C-H     VLP      x72   2          x8         Mono       18         18.75        Vertical 1 row
* Raw Card supports all listed Mono & 3DS options with same board design.

DIMM Organization: x64, x72 ECC
DIMM Dimensions (nominal): 133.35 mm x 31.25 mm; 133.35 mm x 18.75 mm
    Notes: Refer to MO-309 (1.25 mm higher than DDR3)
Pin Count and Pitch: 284
    Notes: Refer to MO-309 (1.25 mm higher than DDR3). PCB thickness changed from 1.27 mm -> 1.4 mm
DDR4 SDRAMs Supported: 4Gb, 8Gb, 16Gb
    Notes: 78/106-ball FBGA package for x4 and x8 devices. Refer to MO-207: variations DT-z, DW-z
Capacity: 16GB, 32GB, 64GB, 128GB
DDR4 SDRAM width: X4 (X8 requires new DB)
Serial PD, Thermal Sensor (SPD/TS): 512 byte
    Notes: See EE1004-v and TSE2004av specifications
Voltage Options:
    PC4 - 1.2 Volt for VDD
    PC4L - TBD
    2.5 Volt for VPP
    2.5 Volt or 3.3 Volt for VDDSPD
    Notes: All DDR4 modules use a common VDD-VDDQ power plane. They may be tied together on the DIMM, but by standard definition are supported on the pinout to accommodate future enhancements. The VPP supply has VSS as its return path. On DIMM it is treated as a separate supply from VDDSPD. SPD supply is operable with 2.5V or 3.3V.
Interface: 1.2 V signaling
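The capacity points listed above (16GB to 128GB) can be checked with simple arithmetic. The sketch below assumes a 64-bit data path per rank (the extra 8 bits of an x72 module are ECC and are not counted as capacity), and that each die in a 3DS stack is counted as one rank, which is how the 2/4/8/16 rank options of the 36-package raw cards are read here. The function name and parameters are illustrative, not taken from any specification.

```python
def dimm_capacity_gb(die_density_gbit, device_width, ranks, data_width=64):
    """Estimate usable DIMM capacity in GB, excluding ECC devices.

    die_density_gbit: density of one DRAM die in Gbit (4, 8 or 16 in the talk)
    device_width:     DRAM data width in bits (4 for x4, 8 for x8)
    ranks:            total ranks on the module (assumed to include 3DS ranks)
    data_width:       data bits per rank without ECC (assumed 64)
    """
    devices_per_rank = data_width // device_width
    total_gbit = devices_per_rank * die_density_gbit * ranks
    return total_gbit / 8  # 8 Gbit per GB


# 4Gb x4 devices: 2 ranks give the smallest listed module,
# 16 ranks give the largest.
small = dimm_capacity_gb(4, 4, 2)    # 16.0
large = dimm_capacity_gb(4, 4, 16)   # 128.0
```

Under these assumptions the same raw card spans the whole listed capacity range just by changing the number of die per package, which matches the note that one board design supports all listed Mono and 3DS options.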
Card#     LP/VLP   I/O   Rank       DRAM I/O   DRAM            # Pkg   PCB Height (mm)   DRAM Placement
R/C-A*    LP       x72   2/4/8/16   x4         Mono/3DS        36      31.25             Vertical 2 row
R/C-B*    LP       x72   2/4/8/16   x4         Mono/3DS        36      31.25             Flower 2 row
R/C-D     LP       x72   4          x4         DDP (no 3DS)    36      31.25             Vertical 2 row
R/C-E     LP       x72   4          x4         DDP (no 3DS)    36      31.25             Flower 2 row
R/C-C*    VLP      x72   2/4/8      x4         Mono/3DS        18      31.25             Horizontal 1 row
* Raw Card supports all listed Mono & 3DS options with same PCB design.
Low Priority for now

The DDR4 LRDIMM Architecture Will Also Drive Next Generation LRDIMM and NVDIMM

• DDR4 RCD02 & DB02 next revision specification
  - DDR4 DRAM definition is the same, but speed will increase
• Fully backward compatible to DDR4RCD01 and DDR4DB01
• Adding more features to support speed >2400
  - Electrical parameters for 2667MT/s and above
  - Define additional control word space
  - Additional Drive strength sets
  - DB Slew rate control for DB
  - Per bit de-skew to optimize SI
  - CTRL Gear down mode (as per DRAM spec)
  - Command to Address Latency (CAL) mode (as per DRAM spec)
  - RCD fractional tCK and QCA output delay control
• Optional NVDIMM controller interface, commands & protocol.
  - NVDIMM functions, features etc. standardization continues in JEDEC and related industry groups.

Basic Concept of NVDIMM for Gen 2 DDR4

[Figure: Gen 2 DDR4 NVDIMM block diagram. Labels: DDR4 RCD, DDR4 DB02n, DDR4 DB, T, Vref, SPD. Data: DB to edge, 1 load. Address/Command: all registered, 1 load. Address, Data & clock.]

• Gen 2 logic for DDR4 with support for NVDIMM (different DB)
• Combination of DRAM and Flash on the same module
• Interface to the host still looks like LRDIMM
• Details are still under development in JEDEC