The ARM vs x86 Wars Have Begun: In-Depth Power Analysis of Atom, Krait & Cortex A15
by Anand Lal Shimpi on January 4, 2013 7:32 AM EST - Posted in
- Tablets
- Intel
- Samsung
- Arm
- Cortex A15
- Smartphones
- Mobile
- SoCs
Modifying a Krait Platform: More Complicated
Modifying the Dell XPS 10 is a little more difficult than modifying Acer's W510 or the Surface RT. In both of those products there was only a single inductor in the path from the battery to the CPU block of the SoC. The XPS 10, however, uses a dual-core Qualcomm solution. Ever since Qualcomm started doing multi-core designs it has opted to use independent frequency and voltage planes for each core. While all of the A9s in Tegra 3 and both of the Atom cores used in the Z2760 run at the same frequency/voltage, each Krait core in the APQ8060A can run at its own voltage and frequency. As a result, two separate power delivery circuits are needed to feed the CPU cores. I've highlighted the two inductors Intel lifted in orange:
Each inductor was lifted and wired with a 20 mΩ resistor in series. The voltage drop across the 20 mΩ resistor was measured and used to calculate CPU core power consumption in real time. Unless otherwise stated, the graphs here represent the total power drawn by both CPU cores.
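To make the arithmetic behind that concrete, here's a minimal Python sketch of how power falls out of a shunt measurement like this one. The 20 mΩ value matches the resistor described above, but the rail voltage and sample values are purely hypothetical, not numbers from our test setup:

```python
# Minimal sketch: deriving load power from a series shunt measurement.
# The 20 mOhm shunt matches the setup described above; the rail voltage
# and sampled voltage drops are illustrative placeholders.

R_SHUNT = 0.020  # ohms (20 mOhm series resistor)

def rail_power(v_drop, v_rail):
    """Power drawn on one rail: I = V_drop / R, then P = V_rail * I."""
    current = v_drop / R_SHUNT
    return v_rail * current

# Hypothetical shunt drops (volts) sampled on one core rail at ~1.05 V
samples_core0 = [0.0020, 0.0024, 0.0018]
power_core0 = [rail_power(v, 1.05) for v in samples_core0]
avg_w = sum(power_core0) / len(power_core0)
print(f"average core 0 power: {avg_w * 1000:.1f} mW")
```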
Unfortunately, that's not all that's necessary to accurately measure Qualcomm CPU power. If you remember back to our original Krait architecture article you'll know that Qualcomm puts its L2 cache on a separate voltage and frequency plane. While the CPU cores in this case can run at up to 1.5GHz, the L2 cache tops out at 1.3GHz. I remembered this little fact late in the testing process, and we haven't yet found the power delivery circuit responsible for Krait's L2 cache. As a result, the CPU specific numbers for Qualcomm exclude any power consumed by the L2 cache. The total platform power numbers do include it however as they are measured at the battery.
The larger inductor in yellow feeds the GPU and it's instrumented using another 20 mΩ resistor.
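Since only the two core rails and the GPU rail are instrumented directly while platform power is taken at the battery, everything else (display, memory, and the L2 rail we couldn't isolate) falls out by subtraction. A quick sketch with placeholder numbers, just to show how the breakdown works:

```python
# Power breakdown implied by this setup. All numbers are placeholders,
# not measurements; only the structure mirrors the article: core and GPU
# rails are measured directly, platform power comes from the battery, and
# the remainder covers everything else (including the un-instrumented L2).

platform_w  = 3.10  # hypothetical total, measured at the battery
cpu_cores_w = 0.85  # hypothetical sum of the two Krait core rails
gpu_w       = 0.40  # hypothetical GPU rail

rest_w = platform_w - cpu_cores_w - gpu_w
print(f"rest of platform (display, memory, L2, ...): {rest_w:.2f} W")
```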
Visualizing Krait's Multiple Power/Frequency Domains
Qualcomm remains adamant about its asynchronous clocking with multiple voltage planes. The graph below shows power draw broken down by each core while running SunSpider:
SunSpider is a great benchmark to showcase exactly why Qualcomm has each core running on its own power/frequency plane. For a mixed workload like this, the second core isn't totally idle/power gated but it isn't exactly super active either. If both cores were tied to the same voltage/frequency, the second core would have higher leakage current than in this case. The counter argument would be that if you ran the second core at its max frequency as well it would be able to complete its task quicker and go to sleep, drawing little to no power. The second approach would require a very fast microcontroller to switch between v/f modes and it's unclear which of the two would offer better power savings. It's just nice to be able to visualize exactly why Qualcomm does what it does here.
On the other end of the spectrum however is a benchmark like Kraken, where both cores are fairly active and the workload is balanced across both cores:
Here there's no real benefit to having two independent voltage/frequency planes; both cores would be served fine by running at the same voltage and frequency. Qualcomm would argue that the Kraken case is rare (single threaded performance still dominates most user experience), and the power savings in situations like SunSpider are what make asynchronous clocking worth it. This is a much bigger philosophical debate that would require far more than a couple of graphs to support, and it's not one that I want to get into here. I suspect that, given its current power management architecture, Qualcomm picked the best solution possible for delivering the best possible power consumption. It's more effort to manage multiple power/frequency domains, effort that I doubt Qualcomm would put in without seeing some benefit over the alternative. That being said, what works best for a Qualcomm SoC isn't necessarily what's best for a different architecture.
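To put some rough shape on that trade-off, here's a toy Python model (every figure is invented; none of it is Qualcomm data) comparing the two clocking philosophies for a lightly threaded workload like SunSpider. It only captures the leakage side of the argument; the race-to-idle counterargument, where the second core finishes faster and power gates, isn't modeled here:

```python
# Toy comparison of per-core DVFS vs. a shared voltage/frequency plane.
# All coefficients and operating points are invented for illustration.
# Rough model: dynamic power ~ f * V^2 * activity, leakage ~ V.

def core_power(freq_ghz, voltage, activity, k_dyn=1.0, k_leak=0.25):
    dynamic = k_dyn * freq_ghz * voltage**2 * activity
    leakage = k_leak * voltage
    return dynamic + leakage  # arbitrary units

# Primary core fully busy at its maximum operating point.
busy = core_power(freq_ghz=1.5, voltage=1.05, activity=1.0)

# Second core ~20% active: per-core DVFS lets it sit at a lower point...
second_async = core_power(freq_ghz=0.6, voltage=0.85, activity=0.2)

# ...while a shared plane drags it to the primary core's voltage/frequency.
second_sync = core_power(freq_ghz=1.5, voltage=1.05, activity=0.2)

print(f"async total: {busy + second_async:.2f}  sync total: {busy + second_sync:.2f}")
```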
140 Comments
powerarmour - Friday, January 4, 2013 - link
So yes, finally confirming what anyone with half a brain knows, competitive ARM SoC's use less power.
apinkel - Friday, January 4, 2013 - link
I'm assuming you are kidding. Atom is roughly equivalent to (dual core) Krait in power draw but has better performance.
The A15 is faster than either Krait or the Atom but its power draw is too much to make it usable in a smartphone (which I'm assuming is why Qualcomm had to redesign the A15 architecture for Krait to make it fit into the smartphone power envelope).
The battle I still want to see is quad core krait and atom.
ImSpartacus - Friday, January 4, 2013 - link
Let me make sure I have this straight. Did Qualcomm redesign A15 to create Krait?
djgandy - Friday, January 4, 2013 - link
No. Qualcomm create their own designs from scratch. They have an Instruction Set licence for ARM, but they are ARM "clones".
apinkel - Friday, January 4, 2013 - link
Sorry, yeah, I could have worded that better. But in any case the comment now has me wondering if I'm off base in my understanding of how Qualcomm does what it does...
I've been under the impression that Qualcomm took the ARM design and tweaked it for their needs (instead of just licensing the instruction set and the full chip design top to bottom). Yeah/Nay?
fabarati - Friday, January 4, 2013 - link
Nay. They do what AMD does: they license the instruction set and create their own CPUs that are compatible with the ARM ISAs (in Krait's case, ARMv7). That's also what Apple did with their Swift cores.
Nvidia tweaked the Cortex A9 in the Tegra 2, but it was still a Cortex A9. Ditto for Samsung, Hummingbird and the Cortex A8.
designerfx - Friday, January 4, 2013 - link
Do I need to remind you that the Tegra 3 has disabled cores on the RT? Using an actual Android device with Tegra 3 would show better results.
madmilk - Friday, January 4, 2013 - link
The disabled 5th core doesn't matter in loaded situations. During idle, screen power dominates, so it still doesn't really matter. About all you'll get is more standby time, and Atom seems to be doing fine there.
designerfx - Friday, January 4, 2013 - link
The core allows a lot of different significant things - so in other words, it's extremely significant, including in high load situations as well. That has nothing to do with the Atom. You get more than standby time.
designerfx - Friday, January 4, 2013 - link
Also, during idle the screen is off, usually after whatever amount of time the settings are set for, which is easily indicated in the idle measurements. What the heck are you even talking about?