Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask

Name: Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask
Item: Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask
Author: Rajinder Gill

by Rajinder Gill on August 15, 2010 10:59 PM EST

Posted in
Memory
Guides
Intel

46 Comments | Add A Comment

46 Comments

It’s coming up on a year since we published our last memory review; possibly the longest hiatus this section of the site has ever seen. To be honest, the reason we’ve refrained from posting much of anything is because things haven’t changed all that much over the last year – barring a necessary shift towards low-voltage oriented ICs (~1.30V to ~1.50V) from the likes of Elpida and PSC. Parts of these types will eventually become the norm as memory controllers based on smaller and smaller process technology, like Intel’s 32nm Gulftown, gain traction in the market.

While voltage requirements have changed for the better, factors relating to important memory timings like CL and tRCD haven’t seen an improvement; we’re almost at the same point we were a year ago. Back then Elpida provided a glimpse of promise with their Hyper-series of ICs. The Hyper part was capable of high-speed, low-latency operation in tandem. Unfortunately, due to problems with long-term reliability, Hyper is now defunct. Corsair and perhaps Mushkin still have enough stock to sell for a while, but once it's gone, that’s it.

Corsair Dominator GTs based on Elpida Hyper - they're being phased out for something slower...

The superseding Elpida BBSE variant ICs and a spread of chips from PSC now dominate the memory scene, ranging from mainstream DDR3-1333 speeds all the way to insanely-rated premium DDR3-2500 kits. Some of these parts are capable of keeping up with Hyper when it comes to CL, but do so by adding a few nanoseconds of random access latency due to a looser tRCD. Given that read and write access operations make up a significant portion of memory power consumption, this step backwards in performance may be a requisite factor for reliability – perhaps something was found by Elpida during the production lifetime of Hyper ICs that prompted a re-examination, leading to a more conservative recipe for data transfer/retrieval.

A few of the newer modules to grace our doorstep

Today’s memory section comeback was fuelled by the arrival of a number of mainstream memory kits at our test labs – many of the kits we were using for motherboard reviews are no longer for sale so we needed to update our inventory of modules anyway. Corsair, Crucial and GSkill kindly sent memory from their mainstream line-ups. The original intent was to look at a few of those kits.

However, during the course of testing these kits, our focus shifted from writing a memory review (showing the same old boring graphs) to compiling something far more meaningful: a guide to memory optimization and addressing, including a detailed look at important memory timings, and an accounting of some of Intel’s lesser-known memory controller features. As such, this article should make a very compelling read for those of you interested in learning more about some of the design and engineering that goes into making memory work, and how a little understanding can go a long way when looking for creative ways to improve memory performance…

The Ins and Outs of Memory Addressing

PRINT THIS ARTICLE

Post Your Comment
Please log in or sign up to comment.

Comments Locked

46 Comments

View All Comments

bowhe - Tuesday, October 26, 2010 - link
Thanks for these great articles!

What I didn't understand yet:
You state "Installing more than one DIMM per channel does not double the Memory Bus bandwidth, as modules co-located in the same channel must compete for access to a shared 64-bit sub-bus; however, adding more modules does have the added benefit of doubling the number of pages that may be open concurrently (twice the ranks for twice the fun!)". This sounds very positive, but:

Some system manufacturers state that with 3 dimms the memory frequency can be for example 1333MHz, but with 6 dimms it needs to drop to 800MHz. Why does the frequency need to drop when using 6 versus 3 dimms? Does this apply to high end boards like the Gigabyte-X58A-UD9?

Some manufacturer states in a small side note of a 24GB kit (6x4GB) that the stated frequency/timing is only guaranteed when using 3 dimm slots. This leads me to think that any 3 dimms of the set can do the stated timing, but when all are used something inherent in the design or interaction of the i7 processor, motherboard and dimm prevents the use of stated frequency/timings? What is it?

Can one overcome these limitations by adjusting voltages in a high end board like the Gigabyte-X58A-UD9? (without use of extreme cooling <32F/0C)

Thanks a lot!
kakfjak - Thursday, May 5, 2011 - link

www.stylishdudes.com

All kinds of shoes + tide bag

Free transport
cochleoid - Tuesday, March 12, 2013 - link
"When associated in groups of two (DDR), four (DDR2) or eight (DDR3), these banks form the next higher logical unit, known as a rank. "

This mislead me. DDR2 may have coincidentally introduced 3 bit banks - allowing for 8 bank chips - but a typical old SDRAM (no DDR) chip had 4 banks.

"We can now see why the DDR3 core has a 8n-prefetch (where n refers to the number of banks per rank) as every read access to the memory requires a minimum of 64 bits (8 bytes) of data to be transferred. This is because each bank, of which there are eight for DDR3, fetches no less than 8 bits (1 byte) of data per read request - the equivalent of one column's worth of data. Whether or not the system actually makes use of all 8 bytes of transferred data is irrelevant. Any delivered data not actually requested can be safely disregarded as it's just a copy of what is still retained in memory."

This threw me off even more. What's happening is that the data at 8 consecutive (or otherwise close, depending on the burst mode) column addresses is being bursted on each read. "n" refers to the width of the memory chip, or the size of the "word" at a particular column address. "n" does not have any relation to the number of banks in a rank.

8 8bit-wide DDR3 chips would make a total module width of 64 bits or 8 bytes at each column address. 8 column addresses would be 64 bytes (not 8 bytes, as the article seems to suggest), which actually corresponds to the cacheline size on most PCs.

SDRAM could burst in sizes of 1,2,4,8
DDR could burst in sizes of only 2,4,8
DDR2 could burst in sizes of only 4,8
DDR3 can burst only in 8.
(All of these could burst in 8, filling the 64 byte cachline in one read operation. The difference with the generations of DDR has been a larger minimum wait in interface clock cycles as the interface got faster and the row accesses remained sluggish.)
The internal clock of SDRAM has been limited by the speed of row accesses. What the 2n,4n,8n prefetches are doing is transferring more of this data available in an open row out at higher interface speeds with the rest of the system. It has nothing to do with the banks.

SDRAM chips were segmented into independently operating banks so that parallel operations on interleaved banks could be synchronized or pipelined. 2n, 4n, and 8n prefetch buffering can be applied without independently operating banks.
ricardo_sa - Saturday, March 26, 2016 - link
Thanks for the detailed explanation. You really saved my day. Ive read this article some time ago to help me understand how a DDR3 worked (theres few detailed explanations on google) and it turned out to be the worst mistake possible. I got the concepts wrong because of the incompetence of the publisher and lost a lot of time dealing with that 8 Bank misconception about the 64 bits.

So it turns out one can only write a burst at 1 bank at a time, am i right? Otherwise you could access all the 8 banks in one single write/read....
Huendli - Friday, March 13, 2015 - link
Thanks for this interesting read with much attention to detail!

"a top priority [...] should be to focus development on reducing absolute minimum latency requirements for timings such as CAS and tRCD, rather than chasing raw synthetic bandwidth figures or setting outright frequency records at the expense of unduly high random access times."

The latter's exactly what happened. DDR3-1600 modules with CL7 timings were widely available at the time this article had been written. Nowadays, you only get ridiculously-named bars with equally-ridiculously monstrous heatspreaders, but more bandwidth and worse timings than ever.
Anuradha - Tuesday, March 9, 2021 - link
Each rank consists of 8 banks, OR, each rank consists of 8 ICs and each IC consists of 8 banks??

Everything You Always Wanted to Know About SDRAM (Memory): But Were Afraid to Ask

Post Your Comment

46 Comments

View All Comments

bowhe - Tuesday, October 26, 2010 - link

kakfjak - Thursday, May 5, 2011 - link

cochleoid - Tuesday, March 12, 2013 - link

ricardo_sa - Saturday, March 26, 2016 - link

Huendli - Friday, March 13, 2015 - link

Anuradha - Tuesday, March 9, 2021 - link

Log in

Don't have an account? Sign up now