RAIDR: Retention-Aware Intelligent DRAM Refresh

Jamie Liu
Ben Jaiyen
Richard Veras
Onur Mutlu
Executive Summary
- DRAM requires periodic refresh to avoid data loss due to capacitor charge leakage
- Refresh operations interfere with memory accesses and waste energy
- Refresh overhead limits DRAM scaling
- Observation: High refresh rate caused by few weak DRAM cells
- Problem: All cells refreshed at the same high rate
- Idea: RAIDR decreases refresh rate for most DRAM cells while refreshing weak cells at a higher rate
  - Group parts of DRAM into different bins depending on their required refresh rate
  - Use Bloom filters for scalable and efficient binning
  - Refresh each bin at the minimum rate needed
- RAIDR reduces refreshes significantly with low overhead in the memory controller
Outline
- Executive Summary
- Background & Motivation
- Key Observation & Our Mechanism: RAIDR
- Evaluation
- Conclusion
DRAM Refresh
[Figure: animation of a DRAM array drawn as rows of bits above a row buffer. Each frame shows one step of refreshing a row: Activate brings the row's bits (for example 1 1 0 0 1 0 0 1) into the row buffer, then Precharge closes the row. The label "Refresh" marks an Activate followed by a Precharge.]
Refresh Overhead: Performance
[Figure: % time spent refreshing (0 to 100) versus device capacity (2 Gb, 4 Gb, 8 Gb, 16 Gb, 32 Gb, 64 Gb), with the capacities split into "Present" and "Future" regions. Values annotated on the plot: 8% and 46%.]
Refresh Overhead: Energy
[Figure: % DRAM energy spent refreshing (0 to 100) versus device capacity (2 Gb, 4 Gb, 8 Gb, 16 Gb, 32 Gb, 64 Gb), with the capacities split into "Present" and "Future" regions. Values annotated on the plot: 15% and 47%.]
Key Observation and Idea
[Figure: number of retention failures (log scale, 10^0 to 10^12) versus refresh interval (log scale, 100 ms to 1000 s) for a 32 GB DRAM. Annotations: cutoff at 64 ms; < 50 retention failures @ 128 ms; < 1000 retention failures @ 256 ms.]
- Key observation: Most cells can be refreshed infrequently without losing data [Kim+, EDL '09]
- Problem: All cells are refreshed at the same worst-case rate
- Key idea: refresh rows containing weak cells more frequently; refresh other rows less frequently
Retention-Aware Intelligent DRAM Refresh
1. Profiling
   - Determine each row's retention time (how frequently each row needs to be refreshed to avoid losing data)
2. Binning
   - Group rows into different retention time bins based on their retention time
3. Refreshing
   - Refresh rows in different bins at different rates
1. Retention Time Profiling
To profile a row:
1. Write data to the row
2. Prevent it from being refreshed
3. Measure time before data corruption

               Row 1         Row 2          Row 3
Initially      11111111...   11111111...    11111111...
After 64 ms    11111111...   11111111...    11111111...
After 128 ms   11011111...   11111111...    11111111...
               (64-128ms)
After 256 ms                 11111011...    11111111...
                             (128-256ms)    (>256ms)
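The three profiling steps can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' implementation: the `true_retention_ms` values and the `profile_row` helper are hypothetical, and a real profiler would write to and read back actual DRAM rows with refresh disabled.

```python
# Hypothetical retention times (ms) for three simulated rows, chosen to
# reproduce the Row 1 / Row 2 / Row 3 example from the slide.
true_retention_ms = {"Row 1": 100, "Row 2": 200, "Row 3": 5000}

# Times (ms since the write) at which the unrefreshed row is read back.
CHECKPOINTS_MS = [64, 128, 256]


def profile_row(retention_ms, checkpoints=CHECKPOINTS_MS):
    """Return (lower, upper) bounds in ms on when the row first lost data.

    Simulates: write the row, leave it unrefreshed, read it back at each
    checkpoint. `upper` is None if no corruption was seen by the last
    checkpoint.
    """
    lower = 0
    for t in checkpoints:
        corrupted = t > retention_ms  # stand-in for comparing read-back data
        if corrupted:
            return (lower, t)
        lower = t
    return (lower, None)


for name, retention in true_retention_ms.items():
    print(name, profile_row(retention))
```

With these made-up retention times, Row 1 is bracketed between 64 and 128 ms, Row 2 between 128 and 256 ms, and Row 3 shows no corruption by 256 ms.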
2. Grouping Rows Into Retention Time Bins
- Rows are grouped into different bins based on their profiled retention time
[Figure: DRAM rows assigned to three bins: Row 1 in the 64-128ms bin, Row 2 in the 128-256ms bin, Row 3 in the >256ms bin.]
- Store bins using Bloom filters [Bloom, CACM '70]
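A minimal sketch of the binning step, assuming each row's profiled retention time is available as a number of milliseconds. The `assign_bin` helper, the handling of the exact boundary values, and the example retention times are assumptions made for illustration; the slides only name the three bins.

```python
def assign_bin(retention_ms):
    """Map a profiled retention time (ms) to one of the three bin labels."""
    if retention_ms < 64:
        # Assumption: rows below the 64 ms baseline are outside this scheme.
        raise ValueError("retention time below the 64 ms baseline")
    if retention_ms < 128:
        return "64-128ms"
    if retention_ms < 256:
        return "128-256ms"
    return ">256ms"


# Hypothetical profiling results for the three rows in the example.
profiled = {"Row 1": 100, "Row 2": 200, "Row 3": 5000}

bins = {}
for row, retention in profiled.items():
    bins.setdefault(assign_bin(retention), []).append(row)

print(bins)
```

Only the two short-retention bins need to be stored explicitly; any row found in neither can be treated as belonging to the >256ms bin.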
Storing Retention Time Bins Using Bloom Filters
Example with 64-128ms bin:
[Figure: animation of a Bloom filter, drawn as a bit array that starts as all 0s, with three hash functions indexing into it.
 - Insert Row 1: each hash function selects a bit, and the selected bits are set to 1.
 - Row 1 present? The selected bits are ANDed: result = 1, so Yes.
 - Row 2 present? At least one selected bit is 0: result = 0, so No.
 - Insert Row 4: more bits are set to 1.
 - Row 5 present? All of its selected bits are already 1: result = 1, so Yes (false positive).]
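The insert and lookup steps in the example can be sketched in software as follows. This is an illustrative sketch, not the hardware design: the `BloomFilter` class is hypothetical, and salted SHA-256 is used only as a convenient stand-in for the hash functions, which the slides do not specify.

```python
import hashlib


class BloomFilter:
    """Bit array plus several hash functions, as in the slide example."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, row):
        # One bit position per hash function; the hash index is the salt.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{row}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, row):
        for pos in self._positions(row):
            self.bits[pos] = 1

    def __contains__(self, row):
        # AND of the selected bits: 1 only if every one of them is set.
        return all(self.bits[pos] for pos in self._positions(row))


bf = BloomFilter(num_bits=16, num_hashes=3)
bf.insert(1)
bf.insert(4)
print(1 in bf, 4 in bf)  # inserted rows are always reported present
print(5 in bf)           # may be True even though row 5 was never inserted
```

Whether a lookup of a row that was never inserted returns True depends on the hash values and the filter size, which is exactly the false-positive case in the example.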
Bloom Filters: Key Characteristics
- False positives: a row may be declared present in the Bloom filter even if it was never inserted
  - Not a problem: Rows may be refreshed more frequently than needed
- No false negatives: a row will always be declared present in the Bloom filter if it was inserted
  - No correctness problems: Rows are never refreshed less frequently than needed
- No overflow: any number of rows may be inserted into a Bloom filter
  - Scalable: contrast with straightforward table implementation
- Bloom filters allow implementation of retention time bins with low hardware overhead
  - 1.25 KB storage overhead (2 Bloom filters) for 32 GB DRAM system
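The trade-off between filter size and false positives can be made concrete with the standard Bloom filter approximation, false-positive rate ≈ (1 - e^(-kn/m))^k for m bits, k hash functions, and n inserted items. The sketch below applies it; the filter sizes of 256 B and 1 KB are the ones this talk gives for the two filters, while the number of hash functions and the number of inserted rows are hypothetical values chosen only to exercise the formula.

```python
import math


def bloom_false_positive_rate(num_bits, num_hashes, num_inserted):
    """Standard approximation of a Bloom filter's false-positive rate."""
    return (1.0 - math.exp(-num_hashes * num_inserted / num_bits)) ** num_hashes


# Filter sizes stated in the talk: 256 B and 1 KB.
filter_bytes = [256, 1024]
total_kb = sum(filter_bytes) / 1024
print(f"total storage: {total_kb} KB")

# Hypothetical parameters (not from the slides): 3 hash functions, and
# 50 versus 500 rows inserted into the 256 B filter.
few = bloom_false_positive_rate(256 * 8, num_hashes=3, num_inserted=50)
many = bloom_false_positive_rate(256 * 8, num_hashes=3, num_inserted=500)
print(f"false-positive rate with 50 rows:  {few:.6f}")
print(f"false-positive rate with 500 rows: {many:.6f}")
```

Inserting more rows never overflows the filter; it only raises the false-positive rate, which costs extra refreshes rather than correctness.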
3. Refreshing Rows at Different Rates
[Flowchart:]
Memory controller chooses each row as a refresh candidate every 64ms
- Row in 64-128ms bin? (First Bloom filter: 256B)
  - Yes: refresh the row
  - No: Row in 128-256ms bin? (Second Bloom filter: 1KB)
    - Yes: every other 64ms window, refresh the row
    - No: every 4th 64ms window, refresh the row
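The flowchart's decision logic can be sketched as a single function. This is a simplified illustration: plain Python sets stand in for the two Bloom filters (any container supporting `in` would do), and the choice of which window within each 2-window or 4-window period triggers the refresh is an assumption. The `should_refresh` name and `window_index` parameter are hypothetical.

```python
def should_refresh(row, window_index, bin_64_128, bin_128_256):
    """Decide whether `row` is refreshed in 64 ms window number `window_index`."""
    if row in bin_64_128:
        return True                    # refresh in every 64 ms window
    if row in bin_128_256:
        return window_index % 2 == 0   # refresh in every other window
    return window_index % 4 == 0       # refresh in every 4th window


# Sets used as stand-ins for the first and second Bloom filters.
bin_64_128 = {1}
bin_128_256 = {2}

for row in (1, 2, 3):
    count = sum(should_refresh(row, w, bin_64_128, bin_128_256) for w in range(8))
    print(f"row {row}: refreshed in {count} of 8 windows")
```

A false positive in either filter only moves a row to an earlier branch, so the row is refreshed more often than it needs to be, never less.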
Tolerating Temperature Variation: Refresh Rate Scaling
- Change in temperature causes retention time of all cells to change by a uniform and predictable factor
- Refresh rate scaling: increase the refresh rate for all rows uniformly, depending on the temperature
- Implementation: counter with programmable period
  - Lower temperature ⇒ longer period ⇒ less frequent refreshes
  - Higher temperature ⇒ shorter period ⇒ more frequent refreshes
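A counter with a programmable period might look like the sketch below. The `RefreshCounter` class and its tick-based interface are hypothetical, and the mapping from temperature to period is deliberately left out because the slides do not give it; the example only shows that shortening the period makes refresh windows arrive more often.

```python
class RefreshCounter:
    """Signals the start of a new refresh window every `period` ticks."""

    def __init__(self, period):
        self.period = period
        self.count = 0

    def set_period(self, period):
        # Reprogrammed when the temperature changes (policy not shown here).
        self.period = period

    def tick(self):
        """Advance one tick; return True when a new refresh window starts."""
        self.count += 1
        if self.count >= self.period:
            self.count = 0
            return True
        return False


counter = RefreshCounter(period=4)
cool_windows = sum(counter.tick() for _ in range(8))

counter.set_period(2)  # higher temperature: shorter period
hot_windows = sum(counter.tick() for _ in range(8))

print(cool_windows, hot_windows)
```

Because every row's refresh schedule is derived from the same window counter, changing the period scales the refresh rate of all bins uniformly.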
Methodology
- 8-core, 4 GHz, 512 KB 16-way private cache per core
- 32 GB DDR3 DRAM system (2 channels, 4 ranks/channel)
- 1.25 KB storage overhead for 2 Bloom filters
- Extended temperature range (85-95 °C) characteristic of server environments
- SPEC CPU2006, TPC-C, TPC-H benchmarks in 8-core multiprogrammed workloads
  - Benchmarks categorized by memory intensity (LLC misses per 1000 instructions)
  - Workloads categorized by fraction of memory-intensive benchmarks
  - 32 workloads per category, 5 workload categories
Comparison Points
- Auto-refresh [DDR3, LPDDR2, ...]:
  - Memory controller periodically sends auto-refresh commands
  - DRAM devices refresh many rows on each command
  - Baseline typical in modern systems
  - All rows refreshed at same rate
- Distributed refresh:
  - Memory controller refreshes each row individually by sending activate and precharge commands to DRAM
  - All rows refreshed at same rate
- Smart Refresh [Ghosh+, MICRO '07]:
  - Memory controller refreshes each row individually
  - Refreshes to recently activated rows are skipped
  - Requires programs to activate many rows to be effective
  - Very high storage overhead (1.5 MB for 32 GB DRAM)
- No refresh (ideal)
Refresh Operations Performed (32 GB DRAM)
[Figure: bar chart of % of refreshes performed (0 to 100) for Distributed, Smart, RAIDR, and No-refresh. Value annotated on the plot: 25.4%.]
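A rough way to see where a figure like 25.4% can come from: if rows in the 64-128ms bin are refreshed in every 64 ms window, rows in the 128-256ms bin in every other window, and all remaining rows in every 4th window, the fraction of baseline refreshes still performed is a weighted average that cannot fall below 25%. The sketch below computes it; the `refresh_fraction` helper and the row counts are hypothetical and are not taken from the evaluation.

```python
def refresh_fraction(rows_64_128, rows_128_256, rows_default):
    """Fraction of baseline (every-64-ms) refreshes that are still performed.

    Row counts should include Bloom filter false positives, since those
    rows are refreshed at the faster rate too.
    """
    total = rows_64_128 + rows_128_256 + rows_default
    performed = rows_64_128 * 1.0 + rows_128_256 * 0.5 + rows_default * 0.25
    return performed / total


# Hypothetical population: a handful of weak rows among a million.
fraction = refresh_fraction(rows_64_128=30, rows_128_256=1000, rows_default=1_000_000)
print(f"{fraction:.4f}")
```

When nearly all rows fall in the slowest bin, the result sits just above the 25% floor.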
Performance
[Figure: weighted speedup (axis 4.0 to 8.5) versus fraction of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg) for Auto, Distributed, Smart, RAIDR, and No Refresh. Values annotated on the plot: 6.1%, 8.4%, 9.3%, 8.6%, 9.6%, 9.8%.]
DRAM Energy Efficiency
[Figure: energy per access in nJ (axis 0 to 100) versus fraction of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg) for Auto, Distributed, Smart, RAIDR, and No Refresh. Values annotated on the plot: 18.9%, 17.3%, 15.4%, 16.1%, 13.7%, 12.6%.]
DRAM Device Capacity Scaling: Performance
[Figure: weighted speedup (axis 0 to 8) versus device capacity (4 Gb, 8 Gb, 16 Gb, 32 Gb, 64 Gb) for Auto and RAIDR. Value annotated on the plot: 108%.]
DRAM Device Capacity Scaling: Energy
[Figure: energy per access in nJ (axis 0 to 160) versus device capacity (4 Gb, 8 Gb, 16 Gb, 32 Gb, 64 Gb) for Auto and RAIDR. Value annotated on the plot: 50%.]
Conclusion
- Refresh is an energy and performance problem that is becoming increasingly significant in DRAM systems
- Refresh limits DRAM scaling
- High refresh rate is caused by a small number of problematic cells
- RAIDR groups rows into bins and refreshes rows in different bins at different rates
- Uses Bloom filters for scalable and efficient binning
- 74.6% refresh reduction, 8.6% performance improvement, 16.1% DRAM energy reduction at 1.25 KB overhead
- RAIDR's benefits improve with increasing DRAM density
- Enables better DRAM scaling
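The binning idea in the conclusion can be sketched in a few lines of Python. This is only an illustration, not the hardware design: the class name `BloomFilter`, the function `refresh_interval_ms`, and the use of salted BLAKE2b as the hash family are all assumptions made for the example. Rows found in the 64-128 ms filter are refreshed every 64 ms, rows found in the 128-256 ms filter every 128 ms, and all other rows every 256 ms. A Bloom filter never gives a false negative, so a weak row is never refreshed too slowly; a false positive only means a row is refreshed more often than it needs.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over integer row addresses (illustrative)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, row: int):
        # Assumption: salted BLAKE2b stands in for the hardware hash functions.
        data = row.to_bytes(8, "little")
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(2, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def insert(self, row: int) -> None:
        for pos in self._positions(row):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, row: int) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(row)
        )


def refresh_interval_ms(row: int, bin_64_128: BloomFilter, bin_128_256: BloomFilter) -> int:
    """Pick the refresh interval for a row from the bin it (apparently) belongs to."""
    if row in bin_64_128:
        return 64   # weakest rows: refresh at the default rate
    if row in bin_128_256:
        return 128
    return 256      # most rows: refresh at a quarter of the default rate


# Filter sizes and hash counts follow the default configuration (2048 and 8192 bits).
bin_64_128 = BloomFilter(num_bits=2048, num_hashes=10)
bin_128_256 = BloomFilter(num_bits=8192, num_hashes=6)
bin_64_128.insert(7)     # hypothetical weak row found during profiling
bin_128_256.insert(42)   # hypothetical row with 128-256 ms retention
```

Because membership checks only read a few bits, the storage cost is fixed by the filter sizes rather than by the number of rows in the system.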
DRAM Hierarchy
[Figure: a processor (core and memory controller) connected through channels to ranks, each rank containing banks]
DRAM Array Organization
[Figure: a DRAM array of cells arranged in rows, with word lines, bit lines, and a sense amp on each bit line; the sense amps together form the row buffer]
DRAM Activation
[Figure: a DRAM cell with its word line, bit line, and sense amp, shown in four steps]
1. Word line at 0V, bit line at VDD/2, cell at VDD
2. Word line raised to VPP
3. Bit line moves to VDD/2+δ; cell voltage shown as ???
4. Sense amp drives the bit line to VDD; cell back at VDD
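The size of the perturbation δ in the activation sequence can be estimated with a standard charge-sharing calculation. The sketch below is an illustration only: the function name `bitline_delta` and the voltage and capacitance values are assumptions chosen for the example, not figures from the slides.

```python
def bitline_delta(v_cell: float, vdd: float, c_cell: float, c_bitline: float) -> float:
    """Bit-line voltage shift after the cell shares charge with a bit line at VDD/2.

    Charge conservation gives
        V_final = (c_cell * v_cell + c_bitline * vdd / 2) / (c_cell + c_bitline)
    and delta = V_final - vdd / 2.
    """
    return (v_cell - vdd / 2.0) * c_cell / (c_cell + c_bitline)


# Illustrative values (assumptions): VDD = 1.5 V, 24 fF cell, 144 fF bit line.
VDD = 1.5
delta_one = bitline_delta(v_cell=VDD, vdd=VDD, c_cell=24e-15, c_bitline=144e-15)
delta_zero = bitline_delta(v_cell=0.0, vdd=VDD, c_cell=24e-15, c_bitline=144e-15)
```

A fully charged cell pushes the bit line slightly above VDD/2 and an empty cell pulls it slightly below. As a cell leaks charge, the magnitude of δ shrinks, which is why a cell must be refreshed before the sense amp can no longer resolve it.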
DRAM Precharge
[Figure: the same cell, word line, bit line, and sense amp, shown in three steps]
1. Word line at VPP, bit line at VDD, cell at VDD
2. Word line lowered to 0V
3. Bit line returned to VDD/2
RAIDR Components
[Figure: block diagram of the RAIDR components: period counter, row counter, refresh rate scaler period, refresh rate scaler counter, 64-128 ms Bloom filter, 128-256 ms Bloom filter]
Idle Power Consumption
[Figure: idle DRAM power consumption (W, 0 to 5) at normal and extended temperature for Auto, Self Refresh, RAIDR, and No Refresh; the plot carries annotations of 19.6% and 12.2%]
Performance: 85°C
[Figure: weighted speedup (4.0 to 8.5) vs. fraction of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg) for Auto, Distributed, Smart, RAIDR, and No Refresh; annotated improvements range from 2.9% to 4.8%]
Energy: 85°C
[Figure: energy per access (nJ, 0 to 100) vs. fraction of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg) for Auto, Distributed, Smart, RAIDR, and No Refresh; annotated reductions range from 6.4% to 10.1%]
RAIDR Default Configuration
- 64-128 ms bin: 256 B Bloom filter, 10 hash functions; 28 rows in bin, false positive probability 1.16 × 10^-9
- 128-256 ms bin: 1 KB Bloom filter, 6 hash functions; 978 rows in bin, false positive probability 0.0179
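The false positive probabilities quoted for the default configuration are consistent with the standard Bloom filter approximation p ≈ (1 - e^(-kn/m))^k, where m is the filter size in bits, k the number of hash functions, and n the number of inserted rows. The following check is a sketch; the function name `bloom_false_positive` is an assumption, and the slides do not say which formula was used to produce their figures.

```python
import math


def bloom_false_positive(m_bits: int, k_hashes: int, n_items: int) -> float:
    """Approximate false positive probability of a Bloom filter."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


# 64-128 ms bin: 256 B filter, 10 hash functions, 28 rows.
p_64_128 = bloom_false_positive(m_bits=256 * 8, k_hashes=10, n_items=28)
# 128-256 ms bin: 1 KB filter, 6 hash functions, 978 rows.
p_128_256 = bloom_false_positive(m_bits=1024 * 8, k_hashes=6, n_items=978)

print(f"{p_64_128:.3g}")   # 1.16e-09
print(f"{p_128_256:.3g}")  # 0.0179
```

The 64-128 ms bin is given a far lower false positive rate because a false positive there costs four times the refreshes of a 256 ms row, while one in the 128-256 ms bin costs only twice as many.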
Refresh Reduction vs. RAIDR Configuration
[Figure: number of refreshes performed (×10^7, 0.0 to 4.0) for Auto, RAIDR, 1 bin (configurations 1-2), 2 bins (configurations 1-4), and 3 bins (configurations 1-5)]
RAIDR Configurations
Key | Description | Storage Overhead
Auto | Auto-refresh | N/A
RAIDR | Default RAIDR: 2 bins (64-128 ms, m = 2048; 128-256 ms, m = 8192) | 1.25 KB
1 bin (1) | 1 bin (64-128 ms, m = 512) | 64 B
1 bin (2) | 1 bin (64-128 ms, m = 1024) | 128 B
2 bins (1) | 2 bins (64-128 ms, m = 2048; 128-256 ms, m = 2048) | 512 B
2 bins (2) | 2 bins (64-128 ms, m = 2048; 128-256 ms, m = 4096) | 768 B
2 bins (3) | 2 bins (64-128 ms, m = 2048; 128-256 ms, m = 16384) | 2.25 KB
2 bins (4) | 2 bins (64-128 ms, m = 2048; 128-256 ms, m = 32768) | 4.25 KB
3 bins (1) | 3 bins (64-128 ms, m = 2048; 128-256 ms, m = 8192; 256-512 ms, m = 32768) | 5.25 KB
3 bins (2) | 3 bins (64-128 ms, m = 2048; 128-256 ms, m = 8192; 256-512 ms, m = 65536) | 9.25 KB
3 bins (3) | 3 bins (64-128 ms, m = 2048; 128-256 ms, m = 8192; 256-512 ms, m = 131072) | 17.25 KB
3 bins (4) | 3 bins (64-128 ms, m = 2048; 128-256 ms, m = 8192; 256-512 ms, m = 262144) | 33.25 KB
3 bins (5) | 3 bins (64-128 ms, m = 2048; 128-256 ms, m = 8192; 256-512 ms, m = 524288) | 65.25 KB
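The storage overhead column follows directly from the Bloom filter sizes if m is read as a size in bits (consistent with the default configuration, where a 256 B filter corresponds to m = 2048): the overhead is the sum of the m values divided by 8. A small sketch that reproduces a few table rows; the helper names `storage_overhead_bytes` and `format_overhead` are assumptions made for the example.

```python
def storage_overhead_bytes(filter_bits: list[int]) -> int:
    """Total Bloom filter storage in bytes, given each filter's size m in bits."""
    return sum(filter_bits) // 8


def format_overhead(num_bytes: int) -> str:
    """Render a byte count the way the table does (B below 1 KB, otherwise KB)."""
    if num_bytes < 1024:
        return f"{num_bytes} B"
    return f"{num_bytes / 1024:g} KB"


configurations = {
    "RAIDR": [2048, 8192],
    "1 bin (1)": [512],
    "2 bins (2)": [2048, 4096],
    "3 bins (5)": [2048, 8192, 524288],
}

for key, bits in configurations.items():
    print(key, format_overhead(storage_overhead_bytes(bits)))
```

Each added bin or doubled filter size only adds its own bits, so the overhead stays in the kilobyte range even for the largest configuration listed.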