LG Optimus 2X & NVIDIA Tegra 2 Review: The First Dual-Core Smartphone
by Brian Klug & Anand Lal Shimpi on February 7, 2011 3:53 AM EST
Posted in: Smartphones, Tegra 2, LG, Optimus 2X, Mobile, NVIDIA
In many ways, the smartphone platform has evolved along the same path we saw in the early days of the PC: lots of different software and hardware platforms, rapidly changing lead players, a faster and faster platform update cadence, and the slow emergence of obvious majority leaders. Anand and I have talked extensively about just how striking the similarities are between the PC evolution and the current mobile one, but one of the notable differences is just how much faster that evolution is happening in the mobile space. The reason is simple: nearly all the hard lessons have already been learned in the previous PC evolution; it's just a matter of porting that knowledge to mobile under a different set of constraints.
2011 is going to be a year dominated by multi-core smartphone launches, but there always has to be a first. So here we have our first example of that category: the LG Optimus 2X, with NVIDIA's dual-core 1 GHz Tegra 2 AP20H at its heart. The Optimus 2X (simply the 2X henceforth) hasn't changed much since we saw it at CES—the hardware is aesthetically the same, and the software at first glance is the same as well. We weren't able to publish benchmarks at that time purely because LG hadn't finalized the software build on that test hardware, but we definitely can do so now.
First off are the hardware specs. There's a version of the 2X already shipping on South Korea Telecom which is similar but not identical to our review sample—what we're reviewing is the LG P990 rather than the SU660. You can look at the specs of that Korean version and compare for yourself, but the differences boil down to a few things. The South Korean version ships with 16 GB of internal storage compared to our unit's 8 GB, a xenon rather than LED flash, likely a different build of Android (more on that later), and a physically different set of Android buttons. The Korean version also has T-DMB for mobile TV. LG hasn't officially announced what carrier the 2X will launch with stateside, nor has it been specific about which UMTS or GSM bands the final version will support; I'd expect that announcement to happen at MWC. Needless to say, I was surprised that the 2X immediately hopped on HSPA when I inserted my personal AT&T SIM. Regardless, just know that what we're reviewing here is something between the international model and what will be launched in the US. The 2X will launch running Android 2.2.1 and is already slated to move to Android 2.3 at some point in the future.
Physical Comparison

| | Apple iPhone 4 | Motorola Droid 2 | Samsung Galaxy S Fascinate | Google Nexus S | LG Optimus 2X |
|---|---|---|---|---|---|
| Height | 115.2 mm (4.5") | 116.3 mm (4.6") | 106.17 mm (4.18") | 123.9 mm (4.88") | 123.9 mm (4.88") |
| Width | 58.6 mm (2.31") | 60.5 mm (2.4") | 63.5 mm (2.5") | 63.0 mm (2.48") | 63.2 mm (2.48") |
| Depth | 9.3 mm (0.37") | 13.7 mm (0.54") | 9.91 mm (0.39") | 10.88 mm (0.43") | 10.9 mm (0.43") |
| Weight | 137 g (4.8 oz) | 169 g (5.9 oz) | 127 g (4.5 oz) | 129 g (4.6 oz) | 139 g (4.9 oz) |
| CPU | Apple A4 @ ~800 MHz | Texas Instruments OMAP 3630 @ 1 GHz | 1 GHz Samsung Hummingbird | 1 GHz Samsung Hummingbird | NVIDIA Tegra 2 dual-core Cortex-A9 (AP20H) @ 1 GHz |
| GPU | PowerVR SGX 535 | PowerVR SGX 530 | PowerVR SGX 540 | PowerVR SGX 540 | ULP GeForce @ 100-300 MHz |
| RAM | 512 MB LPDDR1 (?) | 512 MB LPDDR1 | 512 MB LPDDR1 | 512 MB LPDDR1 | 512 MB LPDDR2 @ 600 MHz data rate |
| NAND | 16 GB or 32 GB integrated | 8 GB integrated, preinstalled 8 GB microSD | 2 GB, 16 GB microSD (Class 2) | 16 GB integrated | 8 GB integrated (5.51 GB internal SD, 1.12 GB phone storage), up to 32 GB microSD |
| Camera | 5 MP with LED flash + front facing camera | 5 MP with dual LED flash and autofocus | 5 MP with autofocus and LED flash | 5 MP with autofocus, LED flash, VGA front facing, 720p video | 8 MP with autofocus, LED flash, 1080p24 video recording, 1.3 MP front facing |
| Screen | 3.5" 640 x 960 LED backlit LCD | 3.7" 854 x 480 | 4" Super AMOLED 800 x 480 | 4" Super AMOLED 800 x 480 | 4" IPS LCD 800 x 480 |
On paper, the 2X is impressive. Highlights are obviously the AP20H Tegra 2 SoC, the 4-inch IPS display, the 8 MP rear camera and 1.3 MP front facing camera, and 1080p24 H.264 (Baseline) video capture. We're going to go over everything in detail, starting with our hardware impressions.
75 Comments
GoodRevrnd - Tuesday, February 8, 2011 - link
TV link would be awesome, but why would you need the phone to bridge the TV and network??

aegisofrime - Monday, February 7, 2011 - link
May I suggest x264 encoding as a test of the CPU power? There's a version of x264 available for ARM chips, along with NEON optimizations. Should be interesting!

Shadowmaster625 - Monday, February 7, 2011 - link
What is the point in having a high performance video processor when you cannot do the two things that actually make use of it? Those two things are: 1. Watch any movie in your collection without transcoding (FAIL). 2. Play games. No actual buttons = FAIL. If you think otherwise then you don't actually play games. Just stick with facebook flash trash.

TareX - Wednesday, February 9, 2011 - link
The only reason I'd pay for a dual-core phone is smooth flash-enabled web browsing, not gaming.

zorxd - Monday, February 7, 2011 - link
Stock Android has it too. There is also E for EDGE and G for GPRS.

Exophase - Monday, February 7, 2011 - link
Hey Anand/Brian,

There are some issues I've found with some information in this article:
1) You mention that Cortex-A8 is available in a multicore configuration. I'm pretty sure there's no such thing; you might be thinking of ARM11MPCore.
2) The floating point latencies table is just way off for NEON. You can find latencies here:
http://infocenter.arm.com/help/index.jsp?topic=/co...
It's the same in Cortex-A9. The table is a little hard to read; you have to look at the result and writeback stages to determine the latency (it's easier to read the A9 version). Here's the breakdown:
FADD/FSUB/FMUL: 5 cycles
FMAC: 9 cycles (note that this is because the result of the FMUL pipeline is then threaded through the FADD pipeline)
The table also implies Cortex-A9 adds divide and sqrt instructions to NEON. In actuality, both support reciprocal approximation instructions in SIMD and full versions in scalar. The approximation instructions include both an initial approximation with ~9 bits of precision and Newton-Raphson step instructions. The step instructions function like FMACs and have similar latencies. This raises the question of where the A9 NEON DIV and SQRT numbers came from.
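For readers unfamiliar with those step instructions: the refinement they implement is ordinary Newton-Raphson iteration for a reciprocal, where each step roughly doubles the bits of precision. A minimal illustrative sketch in Python (my own example, not NEON code or anything from the comment above):

```python
def newton_recip(d, x0, steps=2):
    """Refine an approximation x0 of 1/d by Newton-Raphson iteration.

    Each step computes x = x * (2 - d * x), which is the operation a
    SIMD reciprocal-step instruction performs; starting from a ~9-bit
    approximation, two steps reach roughly single precision.
    """
    x = x0
    for _ in range(steps):
        x = x * (2.0 - d * x)
    return x

# Start from a crude approximation of 1/3 and refine it twice.
approx = newton_recip(3.0, 0.33)
```

After two steps the crude 0.33 guess agrees with 1/3 to well beyond single-precision accuracy, which is why hardware can get away with shipping only the cheap approximation plus step instructions.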
The other issue I have with these numbers is that they only mention latency and not throughput. The main issue is that the non-pipelined Cortex-A8 FPU has throughput almost as bad as its latency, while all of the other implementations have single cycle throughput for 2x 64-bit operations. Maybe throughput is what you mean by "minimum latency"; however, this would imply that Cortex-A9 VFP can't issue every cycle, which isn't the case.
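The latency-versus-throughput distinction drawn here can be made concrete with a back-of-envelope cycle model (an illustration of the general principle, not figures from ARM's documentation):

```python
def fpu_cycles(n_ops, latency, pipelined):
    """Rough cycle count for n independent FP operations.

    A pipelined unit accepts one new op per cycle, so n independent ops
    finish in n + latency - 1 cycles; a non-pipelined unit (the
    Cortex-A8 VFP situation described above) serializes them, costing
    the full latency for every op.
    """
    if pipelined:
        return n_ops + latency - 1
    return n_ops * latency

# 100 independent 5-cycle adds: 104 cycles pipelined vs 500 serialized.
pipelined = fpu_cycles(100, 5, True)
serialized = fpu_cycles(100, 5, False)
```

The gap grows with the operation count, which is why a latency-only table understates the difference between a pipelined and a non-pipelined FPU.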
3) It's obvious from the GLBenchmark 2.0 Pro screenshot that there are some serious color limitations from Tegra 2 (look at the woman's face). This is probably due to using 16-bit. IMG has a major advantage in this area since it renders at full 32-bit (or better) precision internally and can dither the result to 16-bit to the framebuffer, which looks surprisingly similar in quality to non-dithered 32-bit. This makes a 16-bit vs 16-bit framebuffer comparison between the two very unbalanced - it's far more fair to just do both at 32-bit, but it doesn't look like the benchmark has any option for it. Furthermore, Tegra 2 is limited to 16-bit (optionally non-linear) depth buffers, while IMG utilizes 32-bit floating point depth internally. This is always going to be a disadvantage for Tegra 2 and is definitely worth mentioning in any comparison.
Finally I feel like ranting a little bit about your use of the Android Linpack test. Anyone with a little common sense can tell that a native implementation of Linpack on these devices will yield several dozen times more than 40 MFLOPS (it should be closer to 1-4 FLOPs/CPU cycle). What you see here is a blatant example of Dalvik's extreme inability to perform with floating point code, which extends well beyond an inability to perform SIMD vectorization.
metafor - Monday, February 7, 2011 - link
According to the developer of Linpack on Android:

http://www.greenecomputing.com/category/android/
It is mostly FP64 calculations done on Dalvik. While this may not be the fastest way to go about doing linear algebra, it is a fairly good representation of relative FP64 performance (which only exists in VFP).
And let's face it, few app developers are going to dig into Android's NDK and write NEON optimized code.
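For context on the MFLOPS figures being debated here: Linpack derives its score from the standard operation count for LU factorization, (2/3)n³ + 2n² flops, divided by elapsed time. A rough, illustrative Python sketch of that measurement (my own example, not the Android app's actual code):

```python
import time

def linpack_style_mflops(n=100):
    """Time a naive double-precision LU factorization (no pivoting)
    and convert to MFLOPS using Linpack's standard operation count.

    The matrix is made diagonally dominant so elimination without
    pivoting never hits a zero pivot.
    """
    a = [[1.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = float(n)

    start = time.perf_counter()
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
    elapsed = time.perf_counter() - start

    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / elapsed / 1e6
```

Because the flop count is fixed by n, the score measures only how fast the runtime executes the FP64 arithmetic, which is exactly why a poor JIT drags the number down so far.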
Exophase - Monday, February 7, 2011 - link
Then let's ask this instead: who really cares about FP64 performance on a smartphone? I'd also argue that it is not even a good representation of relative FP64 performance since that's being obscured so much by the quality of the JITed code. Hence why you see Scorpion and A9 perform a little over twice as fast as A8 (per-clock) instead of several times faster. VFP is still in-order on Cortex-A9; competent scheduling matters.

Maybe a lot of developers won't write NEON code on Android, but where it's written it could very well matter. For one thing, in Android itself. And theoretically one day Dalvik could actually be generating NEON competently... so some synthetic tests of NEON could be a good look at what could be.
metafor - Monday, February 7, 2011 - link
Well, few people really :)

Linpack as it currently exists on Android probably doesn't tell very much at all. But if you're just going to slap together an FP-heavy app (pocket scientific computing, anyone?) and aren't a professional programmer, this likely represents the result you'll see.
I wouldn't mind seeing SpecFP ported natively to Android and running NEON. But alas, we'd need someone to roll up their sleeves and do that.
I did do a native compile of Linpack using gcc to test on my Evo, though. It's still not SIMD code, of course, but native results using VFP were around the 70-80 MFLOPS mark. Of course, it's scheduling for the A8's FPU and not Scorpion's.
Anand Lal Shimpi - Monday, February 7, 2011 - link
Thanks for your comment :)

1) You're very right, I was thinking about the ARM11 - fixed :)
2) Make that 2 for 2. You're right on the NEON values, I mistakenly grabbed the values from the cycles column and not the result column. The DIV/SQRT columns were also incorrect, I removed them from the article.
I mentioned the lack of pipelining in the A8 FPU earlier in the article but I reiterated it underneath the table to hammer the point home. I agree that the lack of pipelining is the major reason for the A8's poor FP performance.
3) Those screenshots were actually taken on IMG hardware. IMG has some pretty serious rendering issues running GLBenchmark 2.0.
4) I'm not happy with the current state of Android benchmarks - Linpack included. Right now we're simply including everything we can get our hands on, but over the next 24 months I think you'll see us narrow the list and introduce more benchmarks that are representative of real world performance as well as contribute to meaningful architecture analysis.
Take care,
Anand