Figure 7. PCM endurance. (a) PCM memory module lifetimes. (b) Fraction of buffer modified (d).
[Plots for the 512Bx4 buffer, comparing PartialLine (64B) and Partial Word (4B) writes across cg, is, mg, fft, rad, oce, art, equ, swi, and avg. Panel (a) y-axis: Years, 0 to 35. Panel (b) y-axis: Fraction of buffer width (512B), 0 to 1.]
measuring the number of memory operations Nw + Nr and
calculating the processor cycles spent on these operations,
(B/2) · Mf. The processor is Mf times faster than fm. The time spent
on memory operations is divided by total execution time T.
buffer’s bits is written to the array. As shown in Figure 7, only
59.3% and 7.6% of the buffer must be written to the array for
64B and 4B partial writes.
(1)
4.3. Density versus endurance
PCM cells are presently larger than DRAM cells. Measuring
cell size in square feature sizes, which makes the discussion
independent of process technology, PCM cells are 1.5–2.0×
larger than DRAM cells.
In particular, 8F² DRAM cells provide a sufficiently wide
pitch to enable a folded bitline architecture, which is resilient
against bitline noise during voltage sensing. However,
manufacturers often choose the density of 6F² DRAM cells.
The narrow pitch in 6F² designs precludes folded bitlines,
increasing vulnerability to noise and requiring unconventional
array designs. For example, Samsung's 6F² DRAM
implements array blocks with 320 wordlines, which is not a power
of two, to improve reliability.⁵
In contrast, PCM cells occupy between 6F² and 20F².¹⁰ Part
of this spread is due to differences in design and fabrication
expertise for the new technology. However, we also observe a
correlation between cell size and access device (e.g., the 6F² cell
uses the relatively small diode). We favor larger BJTs for their low
access times. Cells with BJTs occupy between 9F² and 12F².
Given 9–12F² PCM cells and 6F² DRAM cells, two-bit
multilevel PCM cells (MLCs) are necessary to be competitive with
respect to density. Two-bit MLCs provide an effective density
of 4.5–6.0F² per bit. However, MLCs suffer from lower
endurance. Process and manufacturing set the read window,
which quantifies the difference between the lowest
and highest programmed resistances in single-level cells. By
programming the cell to intermediate resistances within the
same read window, MLCs inherently require a larger number
of logical states that each occupy a narrower region of the
read window. Thus, wear more quickly impacts the ability to
differentiate these resistances.
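The density arithmetic and the narrowing of the read window can be sketched as follows. This is a minimal illustration, not a device model: the function names are hypothetical, and the even split of the read window across levels is a simplifying assumption that ignores guard bands and resistance drift.

```python
def effective_area_per_bit(cell_area_f2: float, bits_per_cell: int) -> float:
    """Cell area (in F^2) divided by the number of bits stored per cell."""
    return cell_area_f2 / bits_per_cell


def resistance_levels(bits_per_cell: int) -> int:
    """Number of distinct resistance states needed to store the given bits."""
    return 2 ** bits_per_cell


def window_share_per_level(bits_per_cell: int) -> float:
    """Fraction of the read window available to each state.

    Assumption: the window is divided evenly among states.
    """
    return 1.0 / resistance_levels(bits_per_cell)


if __name__ == "__main__":
    for area in (9.0, 12.0):  # BJT-based PCM cell sizes quoted in the text
        print(area, "F^2 cell, two-bit MLC:",
              effective_area_per_bit(area, 2), "F^2 per bit")
    print("share of read window per state, SLC:", window_share_per_level(1))
    print("share of read window per state, two-bit MLC:", window_share_per_level(2))
```

Under the even-split assumption, a two-bit cell leaves each state half the margin available in a single-level cell, which is one way to see why wear degrades MLC sensing sooner.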
Since only a fraction of memory bus activity reaches the
PCM to induce wear, we scale occupancy by write intensity to
estimate the number of write operations arriving at the row
buffers. In the worst case, the entire buffer must be written
to the array. However, not all buffer writes cause array writes
due to coalescing. Nwa/Nwb measures the coalescing effectiveness of the buffer, which filters writes to the array. Lastly,
partial writes mean only the dirty fraction d of a buffer’s 8WP
bits are written to the array. Assuming ideal wear-leveling,
writes will be spread across the C bits in the module. Given
writes per second Ŵ and characterized endurance E, a bit
will fail in L̂ = E/Ŵ seconds.
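The chain of scaling factors described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the rate of writes arriving at the row buffers is taken as an input rather than derived from bus occupancy, and the numbers in the usage example are placeholders, not measurements.

```python
def per_bit_writes_per_second(buffer_writes_per_s: float,
                              coalescing_ratio: float,
                              dirty_fraction: float,
                              buffer_bits: int,
                              capacity_bits: int) -> float:
    """Average write rate seen by one bit, assuming ideal wear-leveling.

    buffer_writes_per_s: writes arriving at the row buffers per second
    coalescing_ratio:    Nwa/Nwb, share of buffer writes that reach the array
    dirty_fraction:      d, share of the buffer's bits written per array write
    buffer_bits:         the buffer's 8WP bits
    capacity_bits:       C, bits in the module
    """
    array_writes_per_s = buffer_writes_per_s * coalescing_ratio
    bits_written_per_s = array_writes_per_s * dirty_fraction * buffer_bits
    return bits_written_per_s / capacity_bits


def lifetime_seconds(endurance_writes: float, writes_per_bit_per_s: float) -> float:
    """L-hat = E / W-hat: seconds until a bit reaches its endurance limit."""
    return endurance_writes / writes_per_bit_per_s


if __name__ == "__main__":
    # Placeholder inputs, chosen only to exercise the functions.
    rate = per_bit_writes_per_second(
        buffer_writes_per_s=1e6,
        coalescing_ratio=1.0,
        dirty_fraction=0.5,
        buffer_bits=4096,
        capacity_bits=2 ** 33,
    )
    print("lifetime in seconds:", lifetime_seconds(1e8, rate))
```

Because the dirty fraction and the coalescing ratio both multiply the write rate, halving either one doubles the estimated lifetime in this sketch.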
In a baseline architecture with a single 2048B-wide buffer, average module lifetime is approximately 1050 hours as
calculated by Equation 1. For our memory-intensive workloads, we observe 32.8% memory bus utilization. Scaling by
application-specific write intensity, we find 6.9% of memory
bus cycles are utilized by writes. At the memory banks, the
single 2048B buffer provides limited opportunities for write
coalescing, eliminating only 2.3% of writes emerging from
the memory bus. Frequent row replacements in the single
buffer limit opportunities for coalescing.
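The baseline figures above imply a few derived quantities, worked out in the short sketch below. The variable names are ours, and dividing one reported average by another only approximates the average write intensity. The result is roughly 21% of bus activity attributable to writes, Nwa/Nwb of 0.977, and a baseline lifetime of about 0.12 years.

```python
HOURS_PER_YEAR = 24 * 365

# Figures quoted in the text for the single 2048B-wide buffer baseline.
bus_utilization = 0.328       # share of memory bus cycles in use
write_bus_fraction = 0.069    # share of memory bus cycles used by writes
writes_eliminated = 0.023     # share of writes removed by coalescing
baseline_lifetime_hours = 1050

# Implied write intensity: the share of bus activity that is writes.
write_intensity = write_bus_fraction / bus_utilization

# Coalescing effectiveness Nwa/Nwb: the share of buffer writes reaching the array.
coalescing_ratio = 1.0 - writes_eliminated

# Baseline lifetime expressed in years for comparison with Figure 7.
baseline_lifetime_years = baseline_lifetime_hours / HOURS_PER_YEAR

if __name__ == "__main__":
    print(f"write intensity: {write_intensity:.3f}")
    print(f"Nwa/Nwb: {coalescing_ratio:.3f}")
    print(f"baseline lifetime: {baseline_lifetime_years:.2f} years")
```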
Figure 7 illustrates significant endurance gains from
reorganized buffers and partial writes. 64B and 4B partial
writes improve endurance to 1.4 and 11.2 years, respectively. Buffers use partial writes so that only a fraction of the
4.4. Assumptions and qualifications