Today's The Fast and the Curious post covers the launch of Skia's new rasterization backend, Graphite, in Chrome on Apple Silicon Macs. Graphite is instrumental in helping Chrome achieve exceptional scores on Motionmark 1.3 and is key to unlocking a ton of future improvements in Chrome Graphics.
In Chrome, Skia is used to render paint commands from Blink and the browser UI into pixels on your screen, a process called rasterization. Skia has powered Chrome Graphics since the very beginning. Skia eventually ran into performance issues as the web evolved and became more complex, which led Chrome and Skia to invest in a GPU accelerated rasterization backend called Ganesh.
Over the years, Ganesh matured into a solid highly performant rasterization backend and GPU rasterization launched on all platforms in Chrome on top of GL (via ANGLE on Windows D3D9/11). However, Ganesh always had a GL-centric design with too many specialized code paths and the team was hitting a wall when trying to implement optimizations that took advantage of modern graphics APIs in a principled manner.
This set the stage for the team to rethink GPU rasterization from the ground up in the form of a new rasterization backend, Graphite. Graphite was developed from the start to be principled by having fewer and more comprehensible code paths. This forward looking design helps take advantage of modern graphics APIs like Metal, Vulkan and D3D12 and paradigms like compute based path rasterization, and is multithreaded by default.
With Graphite in Chrome, we increased our Motionmark 1.3 scores by almost 15% on a Macbook Pro M3. At the same time, we improved real world metrics like INP (interaction to next paint time), LCP (time to largest contentful paint), graphics smoothness (percent dropped frames), GPU process malloc memory usage, and others. This all means substantially smoother interactions, less stutter when scrolling, and less time waiting for sites to show.
Ganesh was originally implemented on OpenGL ES, which had minimal support for multi-threading or GPU capabilities like compute shaders. Since then, modern graphics APIs like Vulkan, Metal and D3D12 have evolved to take advantage of multithreading and expose new GPU capabilities. They allow applications to have much more control over when and how expensive work such as allocating GPU resources is performed and scheduled, while utilizing both the CPU and the GPU effectively.
While we were able to adapt Ganesh to support modern graphics APIs, it had accumulated enough technical debt that it became hard to fully take advantage of the multi-threading and GPU compute capabilities of modern graphics APIs.
For Graphite in Chrome, we chose to use Chrome's WebGPU implementation, Dawn, as the abstraction layer for platform native graphics APIs like Metal, Vulkan and D3D. Dawn provides a baseline for capabilities common in modern graphics APIs and helps us reduce the long term maintenance burden by leveraging Dawn's mature well tested native backends instead of implementing them from scratch for Graphite.
A core part of the GPU rendering pipeline is depth testing, which can reduce or eliminate overdraw by drawing opaque objects in front to back order, followed by translucent objects back to front. In graphics, "overdraw" refers to the unnecessary rendering of the same pixels multiple times, which can negatively impact performance and battery life, especially on mobile devices.
Ganesh never utilized the depth testing capabilities of graphics cards, which was admittedly intended for rendering 3D content and not accelerating 2D graphics. Ganesh suffers from overdraw due to its reliance on adhering to strict painters order when drawing both opaque and translucent objects.
Graphite extends Skia’s GPU rendering to take advantage of the depth test by assigning each “draw” a z value defining its painter’s ordering index. While transparent effects and images must still be drawn from back to front, opaque objects in the foreground can now automatically eliminate overdraw. This means opaque draws can be re-ordered to minimize expensive GPU state changes while relying on the depth buffer to produce correct output.
Depth testing is also used to implement clipping in Graphite by treating clip shapes as depth only draws as opposed to maintaining a clip stack like in Ganesh. Besides reducing algorithmic complexity, a significant benefit to this approach is that the shader program required to render a “draw” does not also depend on the state of the clip stack.
Left: Frame from Motionmark Suits Right: Depth buffer for the same frame.
Chromium is a complex multi-process application, with render processes issuing commands to a shared GPU process that is responsible for actually displaying everything in a webpage, tab, and even the browser UI. The GPU process main thread is the primary driver of all rendering work and is where all GPU commands are issued.
Due to the single threaded nature of Ganesh and OpenGL, only a limited set of work could be moved to other threads, making it easy to overload the main thread causing increased jank and latency ultimately hurting user experience.
In contrast, Graphite's API is designed to take advantage of multithreading capabilities of modern graphics APIs. Graphite’s new core API is centered around independent Recorders that can produce Recordings on multiple threads, with minimal need to synchronize between them. Even though the Recordings are submitted to the GPU on the main thread, more expensive work is moved to other threads when producing Recordings, keeping the GPU main thread free.
When Ganesh was initially implemented, the programmable capabilities of graphics cards were quite limited, and branching in particular was expensive. To work around this, Ganesh had many specialized shader pipelines to handle common cases. These specializations are hard to predict and depend on a large number of factors related to each individual draw, leading to an explosion of different pipelines for essentially the same page content. Since these pipelines must each be compiled, it doesn't work well for modern web content which might have effects and animations trigger new pipelines at any moment, causing noticeable jank.
Graphite’s design philosophy is instead to consolidate the number of rendering pipelines as much as possible while still preserving performance. This reduces the number of pipelines that have to be compiled, and makes it possible for Chrome to ensure they are compiled at startup so they do not interrupt active browsing. Ganesh’s specialization approach also led to surprising performance cliffs. For example, while it could handle simple cases, real page content was often a complex mix. By consolidating pipelines, complex content can be rendered as effectively as simple content.
Currently, Graphite is integrated into Chromium using two Recorders: one handles web content tiles and Canvas2D on the main thread, while the other is for compositing. In the future, this model will open up a number of exciting possibilities to further improve Chrome’s performance. Instead of saturating the main GPU thread with the tasks from each renderer process, rasterization can be forked across multiple threads.
Current:
Future:
Graphite recordings can also be re-issued to the GPU with certain dynamic changes such as translation. This can be used to accelerate scrolling while eliminating the unnecessary work to re-issue rendering commands. This lets us automatically reduce the amount of GPU memory required to cache web content as tiles. If the content is simple enough, the performance difference between drawing a cached image and drawing its content can be worth skipping allocating a tile for it and just re-rendering it each frame.
In the landscape of 2D graphics rendering, GPU compute-based path rasterization is very much en vogue with recent implementations like Pathfinder and vello. We would like to implement these ideas in Skia, possibly using a hybrid approach. Currently, Graphite relies on MSAA where it can, but in many cases we can't due to poor performance on older integrated GPUs or high memory overhead on non-tiling GPUs, and we have to fallback to CPU path rasterization using an atlas for caching. GPU compute based path rasterization would allow us to improve over both the visual quality of MSAA which is often limited to 4 samples per pixel and over the performance of CPU rasterization.
These are future directions the Chrome Graphics team plans to pursue, and we are excited to see how far we can push the needle.
Update (6/10/2025): This blog was updated to reflect that testing was done using the Speedometer 3.1 benchmark, and resulted in a 22% performance improvement. The previous version incorrectly noted that the performance improvement was 10% and that the benchmark was Speedometer 3.
Performance has always been one of the core pillars of Chrome and it’s something we’ve never stopped investing in. Publicly available and open benchmarks, which we create in open collaboration with other browsers, are useful tools for tracking our overall progress, understanding new areas of improvement, and validating potential optimizations. In today’s The Fast and the Curious post, we’d like to go through Chrome’s recent work that enabled it to achieve the highest score ever on the Speedometer benchmark.
For Speedometer, these optimizations have resulted in a 22% improvement since August 2024. That 22% improvement leads to better browser experiences, higher conversions for businesses, and deeper enjoyment of what the web has to offer. If each Chrome user used Chrome for just 10 minutes a day, these improvements collectively save 116 million hours or roughly 166 lifetimes worth of waiting around for websites to load and do things.
Speedometer 3.1 score measured on Apple Macbook Pro M4 with MacOS 15
Speedometer is a benchmark created in open collaboration with other browsers and measures web application responsiveness through workloads that cover a large variety of different areas of the Blink rendering engine used in Chrome:
In essence, Speedometer tests critical components of the entire rendering pipeline. For a deeper dive into these individual parts, we recommend the presentation Life of a Script at Chrome University.
Achieving exceptional web performance requires a multifaceted approach, and optimizing for Speedometer is a testament to overall product excellence. Over the past year, our team has focused on refining fundamental rendering paths across the entire stack. Here are some notable optimization examples.
The team heavily optimized memory layouts of many internal data structures across DOM, CSS, layout, and painting components. Blink now avoids a lot of useless churn on system memory by keeping state where it belongs with respect to access patterns, maximizing utilization of CPU caches. Where internal memory was already relying on garbage collection in Oilpan, e.g. DOM, the usage was expanded by converting types from using malloc to Oilpan. This generally speeds up the affected areas as it packs memory nicely in Oilpan’s backend.
Strings in the renderer improved quite a bit over the last year by avoiding costly representations where possible and switching hashing to rapidhash. More generally, lots of data structures were equipped with better hashes, filters, and probing algorithms.
Where rendering becomes inherently expensive, e.g., for computing CSS styles across various elements, caches are now used much more effectively with better hit rates. At the same time we cache fewer things that are not relevant. Another area where rendering becomes expensive is font shaping; the team significantly improved Apple Advanced Typography font shaping performance which is relevant everywhere text is rendered.
Posted by Thomas Nattestad
Notifications in Chrome are a useful feature to keep up with updates from your favorite sites. However, we know that some notifications may be spammy or even deceptive. We’ve received reports of notifications diverting you to download suspicious software, tricking you into sharing personal information or asking you to make purchases on potentially fraudulent online store fronts.
To defend against these threats, Chrome is launching warnings of unwanted notifications on Android. This new feature uses on-device machine learning to detect and warn you about potentially deceptive or spammy notifications, giving you an extra level of control over the information displayed on your device.
When a notification is flagged by Chrome, you’ll see the name of the site sending the notification, a message warning that the contents of the notification are potentially deceptive or spammy, and the option to either unsubscribe from the site or see the flagged content.
An example of a notification flagged as possibly spam.
If you choose to see the notification you will still see the option to unsubscribe or you can choose to always allow notifications from that site and not see warnings in the future.
What you see when viewing a flagged notification.
How It Works
Chrome uses a local, on-device machine learning model to analyze notification content. This model identifies notifications that are likely to be unwanted. The model is trained on the textual contents of the notification, like the title, body, and action button texts.
Notifications are end to end encrypted. The analysis of each message is done on-device and notification contents are not sent to Google, to protect user privacy. Due to the sensitive nature of notifications content, the model was trained using synthetic data generated by the Gemini large language model (LLM). The training data was evaluated against real notifications Chrome security team collected by subscribing to a variety of websites that were then classified by human experts. To start, this feature is only available on Android as the majority of notifications are sent to mobile devices, however we will evaluate expanding to other platforms in the future.
This feature is just one of many ways Chrome works to reduce the number of potentially harmful notifications you receive. Other ways Chrome protects against potentially harmful notifications include:
In Safety Check you can review any notification permission revocations
Notification warnings are an important step in Chrome's ongoing commitment to user safety. The Chrome Security team in partnership with Google Safe Browsing continually monitors threats to our users in order to evolve our defenses against abusive activity across the web. Keep an eye on our blog for updates on how we are helping you stay one step ahead of online threats.
Since Google announced the Chromium project in 2008, we have been excited to build on the great foundations of open-source web browsers and contribute to the continued development of a rich web platform. Today, Chromium is used by hundreds of different projects globally, including big browsers like Chrome, home electronics from LG, application frameworks like Electron and even custom applications like Bloomberg terminals and SpaceX capsule control software.
In 2024, Google made over 100,000 commits to Chromium, accounting for ~94 percent of contributions. While we have no intention of reducing this investment, we continue to welcome others stepping up to invest more.
Google also continues to invest heavily in the shared infrastructure of the Open Source project to "keep the lights on", including having thousands of servers endlessly running millions of tests, responding to hundreds of incoming bugs per day, ensuring the important ones get fixed, and constantly investing in code health to keep the whole project maintainable. This work represents hundreds of millions of US dollars in annual investment just for maintenance costs before any new feature, innovation or other business priorities can be addressed.
Sustainable funding of critical open source infrastructure remains a hot industry-wide topic of discussion and over the years we’ve heard from many companies and developers about how critical the Chromium project is to their work. They’ve also shared how they would like to support the continued health of the project, beyond direct engineering support.
Today Google is pleased to announce our partnership with The Linux Foundation and the launch of the Supporters of Chromium-based Browsers. The goal of this initiative is to foster a sustainable environment of open-source contributions towards the health of the Chromium ecosystem and financially support a community of developers who want to contribute to the project, encouraging widespread support and continued technological progress for Chromium embedders.
The Supporters of Chromium-based Browsers fund will be managed by the Linux Foundation, following their long established practices for open governance, prioritizing transparency, inclusivity, and community-driven development. We’re thrilled to have Meta, Microsoft, and Opera on-board as the initial members to pledge their support.
We welcome this additional investment into Chromium’s commons and we’re looking forward to working with the other members of the Supporters of Chromium-based Browsers to ensure that it meets the needs of the wider Chromium community. At the same time, we remain committed to being the responsible steward of the Chromium project and to the massive investment necessary to keep Chromium working well for the entire web industry.