Measuring Latency

To measure latency, we use the open source "TinyMemBench" benchmark. The source was compiled for x86 with gcc 4.8.1 at optimization level "-O3". We used the latency values measured with the largest buffer size: 64MB. TinyMemBench's documentation describes the measurement well:

Average time is measured for random memory accesses in the buffers of different sizes. The larger the buffer, the more significant the relative contributions of TLB, L1/L2 cache misses and DRAM accesses become. All the numbers represent extra time, which needs to be added to L1 cache latency (4 cycles).

First we tested with 1 DPC. "RDIMM-1866" refers to the Micron 1866 RDIMMs; even though they run at only 1600 MT/s with the Xeon E5-2690, we label them "1866". The RDIMM-1600 modules are Samsung DIMMs designed to run at 1600 MT/s.

TinyMemoryBench Latency—Sandy vs Ivy

The iMB buffer increases latency by about 5 to 10%. But LRDIMMs are hardly worthwhile when you insert only one DIMM per channel; they are made and bought to run at 3 DPC, so that is what we tested next. Both LRDIMMs and RDIMMs have to clock down to 1333 MT/s at 3 DPC.

TinyMemoryBench Latency—3 DPC

The small latency advantage that the RDIMMs had is gone. In fact, the LRDIMMs seem to have a very small latency advantage over the RDIMMs in this case. Again, the memory performance of the Ivy Bridge Xeon is a bit better, but its small clock speed advantage (2.8 GHz vs 2.7 GHz) is probably the simplest and best explanation.

In summary, the differences in latency and bandwidth between similar LRDIMMs and RDIMMs are pretty small. As a result, it will be nearly impossible to measure any tangible effect on real-world applications.


27 Comments


  • slideruler - Thursday, December 19, 2013 - link

    Am I the only one who's concerned about DDR4 in our future?

    Given that it's one DIMM per channel, we'll lose the ability to stuff our motherboards with cheap sticks to get to a "reasonable" (>=128gig) amount of RAM... :(
  • just4U - Thursday, December 19, 2013 - link

    You really shouldn't need more than 640kb.... :D
  • just4U - Thursday, December 19, 2013 - link

    Seriously though... DDR3 prices have been going up. As near as I can tell, they're approximately 2.3x the cost of what they once were. Memory makers are doing the semi-happy dance these days and likely looking forward to the 5x pricing schemes of yesteryear.
  • MrSpadge - Friday, December 20, 2013 - link

    They have to come up with something better than "1 DIMM per channel using the same amount of memory controllers" for servers.
  • theUsualBlah - Thursday, December 19, 2013 - link

    The -Ofast flag for Open64 relaxes ANSI and IEEE rules for calculations, whereas the GCC flags won't do that.

    Maybe that's the reason Open64 is faster.
  • JohanAnandtech - Friday, December 20, 2013 - link

    Interesting comment. I ran with gcc and Opencc at -O2, -O3 and -Ofast. If the gcc binary is 100%, I get 110% with Opencc (-O2), 130% (-O3), and the same with -Ofast.
  • theUsualBlah - Friday, December 20, 2013 - link

    Hmm, that's very interesting.

    I'm guessing Open64 might be producing better code (at least) when it comes to memory operations. I gave up on Open64 a while back; maybe I should try it out again.

    thanks!
  • GarethMojo - Friday, December 20, 2013 - link

    The article is interesting, but on its own it doesn't justify the expense of high-capacity LRDIMMs in a server. As server professionals, our goal is usually to maximise performance per cost for a specific role. In this example, I suspect better performance at a dramatically lower cost would be obtained by upgrading the storage pool instead. I'd love to see a comparison of increasing memory sizes vs adding more SSD caching, or combinations thereof.
  • JlHADJOE - Friday, December 20, 2013 - link

    Depends on the size of your data set as well, I'd guess, and whether or not you can fit the entire thing in memory.

    If you can, then considering RAM is still orders of magnitude faster than SSDs, I imagine memory wins out in terms of overall performance. If the data set is too large to fit in a reasonable amount of RAM, then yes, SSD caching would probably be more cost effective.
  • MrSpadge - Friday, December 20, 2013 - link

    One could argue that the storage optimization would be done for both memory configurations.
