Intel Lunar Lake: Everything You Need to Know

In Short
  • Intel has completely redesigned the Lunar Lake architecture for efficiency.
  • The Compute tile is manufactured on TSMC's N3B node and now includes the CPU, GPU, and NPU.
  • Intel has also removed HyperThreading and integrated memory into the processor to improve efficiency.
  • The new Lunar Lake NPU 4 can perform up to 48 TOPS for AI workloads. Total AI processing capability is 120 TOPS.

At Computex Taiwan in May this year, Intel announced its next-generation Core Ultra 200V processors, codenamed Lunar Lake. The new architecture aims to provide competitive performance at ultra-low power for thin and compact laptops. After all, Qualcomm’s ARM-based Snapdragon X Elite has entered the Windows PC ecosystem, and it’s already grabbing headlines for its efficiency. So, let’s learn more about Intel’s Lunar Lake architecture and how it has been redesigned for efficiency.

Intel Lunar Lake Architecture

With Meteor Lake last year, Intel moved from its age-old monolithic design to a tile-based chip design, and Lunar Lake takes it even further. Unlike Meteor Lake, where the Compute tile only housed the CPU and cache, the Compute tile on Lunar Lake processors houses the CPU, cache, GPU, and NPU.

It means that the Compute tile is the largest tile on the die, and the best part this year is that it’s fabricated on TSMC’s N3B process node. Sure, TSMC’s N3B has a lower yield than the latest and much-improved N3E node, but Intel is finally moving away from Intel foundry to TSMC’s advanced 3nm process node, which is great.

As for the platform controller tile that provides I/O and connectivity, it’s built on TSMC’s 6nm (N6) node, like last year’s Meteor Lake. This marks the first time Intel has designed a processor but left the manufacturing of its tiles to TSMC. Finally, Intel is packaging the whole chip using its own Foveros 3D tech.

Image: Intel Lunar Lake processor (Courtesy: Intel)

Not just that, Intel is also moving the memory to the processor. It means that unified memory, similar to Apple M-series chips, will be available on Lunar Lake chips. The on-package LPDDR5X-8533 RAM is available in 16GB or 32GB capacity.
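To put the LPDDR5X-8533 figure in perspective, here is a rough Python sketch that converts a transfer rate into theoretical peak bandwidth. The article does not state the memory bus width, so the 128-bit value below is an assumption, and the function name is made up for illustration.

```python
def peak_bandwidth_gb_per_s(transfer_rate_mts: int, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given transfer rate (MT/s) and bus width."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * bytes_per_transfer / 1000  # MB/s -> GB/s


LPDDR5X_RATE_MTS = 8533        # from the LPDDR5X-8533 rating in the article
ASSUMED_BUS_WIDTH_BITS = 128   # assumption: not stated in the article

bandwidth = peak_bandwidth_gb_per_s(LPDDR5X_RATE_MTS, ASSUMED_BUS_WIDTH_BITS)
print(f"Theoretical peak bandwidth: {bandwidth:.1f} GB/s")
```

Under that assumption, the theoretical peak works out to roughly 136.5 GB/s. Real-world bandwidth will be lower.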

Overall, the Lunar Lake architecture has undergone significant changes. The CPU, GPU, NPU, and cache are now part of the Compute tile and it’s manufactured on TSMC’s 3nm (N3B) process node, which should lead to much better efficiency. And finally, memory is available directly on the SoC to reduce power consumption and space and improve bandwidth.

During the Computex event, Michelle Holthaus, executive VP and GM at Intel, said, “We’re going to bust the myth that [x86] can’t be as efficient.” Intel claims that x86-based Lunar Lake processors will cut power consumption by up to a whopping 40% compared to Meteor Lake.

It appears that Intel is making all the right moves to improve efficiency with Lunar Lake processors. Now, let’s learn about the new Lunar Lake CPU cores.

Intel Lunar Lake CPU

Lunar Lake will have 8 CPU cores: 4 performance (P) cores named Lion Cove, and 4 efficiency (E) cores named Skymont. As I mentioned above, the CPU is part of the Compute tile. Intel claims the Lion Cove P-core on Lunar Lake brings a 14% IPC gain over Meteor Lake’s Redwood Cove P-core.

Image: Intel Lunar Lake CPU architecture (Courtesy: Intel via YouTube)

Intel has done something very different this time. After more than two decades, the chipmaker has completely removed SMT (Simultaneous Multi-threading) from its processor. SMT, popularly known as HyperThreading, allows a single core to run two threads in parallel. Intel says removing SMT improves performance-per-watt by 5%.

To compensate for the lack of HyperThreading, Intel argues that Lunar Lake processors can execute more instructions per cycle instead of relying on parallel execution. This allows the processor to perform better in single-threaded tasks.

Image: Lunar Lake Skymont core (Courtesy: Intel via YouTube)

Now coming to the E-core, I think Skymont is the headline feature of Lunar Lake processors. Intel says Skymont offers a massive 68% IPC improvement over Meteor Lake’s Crestmont E-core. The 4-core Skymont cluster sits in a ‘Low Power Island’, separate from the P-core cluster, with access to its own L2 cache.

As a result, Skymont needs only one-third of the power to match Crestmont’s peak performance. Overall, Intel says Skymont offers up to 2x the performance of the Crestmont core in single-threaded tasks.

Image: Lunar Lake Skymont vs Lion Cove (Courtesy: Intel via YouTube)

On top of that, Intel has brought granularity to clock speed boosts with Lunar Lake. Instead of ramping up the clock speed by 100MHz, which consumes more power, Lunar Lake architecture can increment the clock speed by 16.67MHz to manage the power budget of any task.
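To see why finer steps help, here is a minimal Python sketch that rounds a target frequency up to the next available step at 100MHz and at 16.67MHz granularity. The 3,230MHz target and the helper name are made up for illustration, and the sketch assumes the 16.67MHz step is exactly 100/6 MHz. It is not a model of how Intel’s power management actually works.

```python
import math

COARSE_STEP_MHZ = 100.0
FINE_STEP_MHZ = 100.0 / 6  # ~16.67MHz; assumes the step is exactly 100/6 MHz


def next_step_at_or_above(target_mhz: float, step_mhz: float) -> float:
    """Round a target frequency up to the nearest multiple of step_mhz."""
    # The small epsilon guards against float error when the target is an exact multiple.
    steps = math.ceil(target_mhz / step_mhz - 1e-9)
    return steps * step_mhz


target = 3230.0  # hypothetical frequency a workload needs, in MHz
coarse = next_step_at_or_above(target, COARSE_STEP_MHZ)
fine = next_step_at_or_above(target, FINE_STEP_MHZ)

print(f"100MHz steps:   {coarse:.2f}MHz (overshoot {coarse - target:.2f}MHz)")
print(f"16.67MHz steps: {fine:.2f}MHz (overshoot {fine - target:.2f}MHz)")
```

With the coarse step, the core has to run 70MHz above what the task needs. With the fine step, the overshoot shrinks to about 3.33MHz.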

The reduced frequency interval will lead to less power consumption. Overall, Intel says the Lunar Lake CPU can match the single-threaded performance of Meteor Lake at just half the power, which is quite impressive.

Lunar Lake Geekbench Score (Leaked)

Lunar Lake is scheduled to launch on 3rd September, but some Geekbench scores have already leaked. On one of the lowest-end SKUs (Core Ultra 5 228V), the 8-core CPU scored 2,530 in the single-core test and 9,875 in the multi-core test. The SKU clocks up to 4.5GHz with a TDP of 17W (30W Maximum Turbo Power).

Image: Lunar Lake Geekbench score

And the highest-end SKU (Core Ultra 9 288V) of Lunar Lake manages to score 2,790 in the single-core test and 11,048 in the multi-core test. In some other runs, it even managed to cross the 2,900 mark in single-threaded tasks. This particular SKU goes up to 5.1GHz and has a TDP of 30W.
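As a quick sanity check on these leaked numbers, the short Python sketch below computes how much the multi-core score scales over the single-core score for each SKU. The scores are the ones quoted above; the function name is just for illustration, and leaked results can change by launch.

```python
# Leaked Geekbench scores quoted in the article: (single-core, multi-core)
leaked_scores = {
    "Core Ultra 5 228V": (2530, 9875),
    "Core Ultra 9 288V": (2790, 11048),
}


def multicore_scaling(single_core: int, multi_core: int) -> float:
    """Ratio of the multi-core score to the single-core score."""
    return multi_core / single_core


for sku, (single, multi) in leaked_scores.items():
    print(f"{sku}: {multicore_scaling(single, multi):.2f}x multi-core scaling")
```

Both SKUs land at a little under 4x, which is not surprising for an 8-core chip with four P-cores, four slower E-cores, and no HyperThreading.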

Intel Lunar Lake: New Xe2 GPU

The integrated GPU on Lunar Lake is built on the Battlemage graphics architecture and features 8 second-gen Intel Xe cores. It also features 8 ray tracing units for improved gaming performance and real-time ray tracing. Not only that, for AI tasks, the new Lunar Lake GPU alone can perform up to 67 trillion operations per second (TOPS). That’s pretty impressive, right?

Image: Intel Lunar Lake Xe2 GPU (Courtesy: Intel via YouTube)

Compared to the Meteor Lake GPU, Intel says the Lunar Lake GPU is up to 1.5x faster, and it supports XeSS AI-based upscaling as well. Its display engine can drive three 4K HDR screens at 60Hz or a single 8K HDR screen at 60Hz. Finally, Lunar Lake processors support AV1 encoding and decoding as well.

Intel Lunar Lake NPU

Much was said about Meteor Lake’s weak NPU that could only execute up to 10 TOPS, but with Lunar Lake, Intel will be powering a range of Copilot+ PCs for local AI workloads. The new Lunar Lake NPU 4 can perform up to 48 TOPS alone, higher than Microsoft’s 40 TOPS minimum requirement for Copilot+ PCs.

Image: Lunar Lake NPU (Courtesy: Intel via YouTube)

Combining all the compute units, the processor can perform up to a massive 120 TOPS: the GPU contributes up to 67 TOPS, the NPU up to 48 TOPS, and the CPU up to 5 TOPS. This is even higher than Qualcomm’s total 75 TOPS processing capability on the Snapdragon X Elite. Keep in mind, the TOPS figures are based on the INT8 data type.
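The arithmetic behind the 120 TOPS figure is easy to check. The Python sketch below adds up the per-unit peak figures quoted in this article and compares the NPU against the 40 TOPS Copilot+ figure mentioned earlier. The variable names are illustrative only.

```python
COPILOT_PLUS_NPU_TOPS = 40  # Microsoft's Copilot+ NPU figure cited in the article

# Peak INT8 TOPS per compute unit, as quoted in the article
peak_tops = {"GPU": 67, "NPU": 48, "CPU": 5}

total_tops = sum(peak_tops.values())
print(f"Platform total: {total_tops} TOPS")

for unit, tops in peak_tops.items():
    print(f"  {unit}: {tops} TOPS ({tops / total_tops:.1%} of total)")

npu_qualifies = peak_tops["NPU"] >= COPILOT_PLUS_NPU_TOPS
print(f"NPU meets the Copilot+ figure: {npu_qualifies}")
```

Note that the GPU, not the NPU, accounts for more than half of the combined total.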

Intel Lunar Lake: Leaked SKUs

Below, you can check out all the leaked SKUs of the Core Ultra processors based on the Lunar Lake architecture. There are nine different SKUs, all featuring eight CPU cores. The distinguishing factors are memory capacity, CPU/GPU clock speeds, and NPU performance.

Lunar Lake SKUs    Cores/Threads  Memory  Max CPU Frequency  Max GPU Frequency  NPU (TOPS)  TDP Range
Core Ultra 9 288V  8C/8T          32 GB   5.1 GHz            2.05 GHz           48          30W – 30W
Core Ultra 7 268V  8C/8T          32 GB   5.0 GHz            2.00 GHz           48          17W – 30W
Core Ultra 7 266V  8C/8T          16 GB   5.0 GHz            2.00 GHz           48          17W – 30W
Core Ultra 7 258V  8C/8T          32 GB   4.8 GHz            1.95 GHz           47          17W – 30W
Core Ultra 7 256V  8C/8T          16 GB   4.8 GHz            1.95 GHz           47          17W – 30W
Core Ultra 5 238V  8C/8T          32 GB   4.7 GHz            1.85 GHz           40          17W – 30W
Core Ultra 5 236V  8C/8T          16 GB   4.7 GHz            1.85 GHz           40          17W – 30W
Core Ultra 5 228V  8C/8T          32 GB   4.5 GHz            1.85 GHz           40          17W – 30W
Core Ultra 5 226V  8C/8T          16 GB   4.5 GHz            1.85 GHz           40          17W – 30W
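If you want to compare the leaked SKUs programmatically, the table can be captured as a small data set. The Python sketch below is one way to do that; the class and function names are made up for illustration, and the values are copied from the leaked table, so they may change at launch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LunarLakeSku:
    name: str
    memory_gb: int
    max_cpu_ghz: float
    max_gpu_ghz: float
    npu_tops: int


# Values copied from the leaked SKU table (every SKU is 8C/8T).
SKUS = [
    LunarLakeSku("Core Ultra 9 288V", 32, 5.1, 2.05, 48),
    LunarLakeSku("Core Ultra 7 268V", 32, 5.0, 2.00, 48),
    LunarLakeSku("Core Ultra 7 266V", 16, 5.0, 2.00, 48),
    LunarLakeSku("Core Ultra 7 258V", 32, 4.8, 1.95, 47),
    LunarLakeSku("Core Ultra 7 256V", 16, 4.8, 1.95, 47),
    LunarLakeSku("Core Ultra 5 238V", 32, 4.7, 1.85, 40),
    LunarLakeSku("Core Ultra 5 236V", 16, 4.7, 1.85, 40),
    LunarLakeSku("Core Ultra 5 228V", 32, 4.5, 1.85, 40),
    LunarLakeSku("Core Ultra 5 226V", 16, 4.5, 1.85, 40),
]


def find_skus(min_memory_gb: int = 0, min_npu_tops: int = 0) -> list[LunarLakeSku]:
    """Return SKUs that meet the given memory and NPU minimums."""
    return [
        sku for sku in SKUS
        if sku.memory_gb >= min_memory_gb and sku.npu_tops >= min_npu_tops
    ]


for sku in find_skus(min_memory_gb=32, min_npu_tops=47):
    print(f"{sku.name}: {sku.memory_gb} GB, {sku.max_cpu_ghz} GHz, {sku.npu_tops} TOPS")
```

One pattern that stands out in the leaked data: model numbers ending in 8V come with 32GB of memory, while those ending in 6V come with 16GB. That is an observation from this table, not a naming rule confirmed by Intel.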

Intel Lunar Lake: Additional Improvements

As mentioned above, the RAM is now part of the SoC. This means the CPU, GPU, or NPU can access the memory quickly. Intel says moving the memory onto the package also frees up space on the motherboard. Since the memory is physically closer to the Compute tile, bandwidth improves, latency drops, and Intel says it leads to about a 40% reduction in power consumption.

Of course, with the on-package memory, users won’t be able to upgrade or replace memory, which some may not like. Apart from that, Intel says Thread Director has been improved to allocate tasks to suitable cores. Intel is also using machine learning to give the system scheduler better guidance on where to place tasks.

Finally, the TDP range of Lunar Lake processors is between 17W and 30W. Overall, I am very excited about the Lunar Lake processors, which are scheduled to arrive on September 3, 2024. It is going to be a thrilling time for consumers as Intel takes on Qualcomm and AMD in the AI PC race. We may finally see improved battery life on x86-powered Windows laptops.
