Published: May 14, 2024
When we build features with AI models on the web, we often rely on server-side solutions for larger models. This is especially true for generative AI, where even the smallest models are about a thousand times bigger than the median web page size. It's also true for other AI use cases, where models can range from tens to hundreds of megabytes.
Since these models aren't shared across websites, each site has to download them on page load. This is an impractical solution for developers and users.
While server-side AI is a great option for large models, on-device and hybrid approaches have their own compelling upsides. To make these approaches viable, we need to address model size and model delivery.
That's why we're developing web platform APIs and browser features designed to integrate AI models, including large language models (LLMs), directly into the browser. This includes Gemini Nano, the most efficient version of the Gemini family of LLMs, designed to run locally on most modern desktop and laptop computers. With built-in AI, your website or web application can perform AI-powered tasks without needing to deploy or manage its own AI models.
Discover the benefits of built-in AI, our implementation plan, and how you can take advantage of this technology.
Get an early preview
We need your input to shape the APIs, ensure they fulfill your use cases, and inform our discussions with other browser vendors for standardization.
Join our early preview program to provide feedback on early-stage built-in AI ideas, and discover opportunities to test in-progress APIs through local prototyping.
Join the Chrome AI developer public announcements group to be notified when new APIs become available.
Benefits of built-in AI for web developers
With built-in AI, your browser provides and manages foundation and expert models.
Compared to do-it-yourself on-device AI, built-in AI offers the following benefits:
- Ease of deployment: As the browser distributes the models, it takes into account the capability of the device and manages updates to the model. This means you aren't responsible for downloading or updating large models over a network. You don't have to solve for storage eviction, runtime memory budget, serving costs, and other challenges.
- Access to hardware acceleration: The browser's AI runtime is optimized to make the most out of the available hardware, be it a GPU, an NPU, or falling back to the CPU. Consequently, your app can get the best performance on each device.
Benefits of running on-device
With a built-in AI approach, it becomes trivial to perform AI tasks on-device, which in turn offers the following upsides:
- Local processing of sensitive data: On-device AI can improve your privacy story. For example, if you work with sensitive data, you can offer AI features to users with end-to-end encryption.
- Snappy user experience: In some cases, ditching the round trip to the server means you can offer near-instant results. On-device AI can be the difference between a viable feature and a sub-optimal user experience.
- Greater access to AI: Your users' devices can shoulder some of the processing load in exchange for more access to features. For example, if you offer premium AI features, you could preview these features with on-device AI so that potential customers can see the benefits of your product, without additional cost to you. This hybrid approach can also help you manage inference costs, especially on frequently used user flows.
- Offline AI usage: Your users can access AI features even when there is no internet connection. This means your sites and web apps can work as expected offline or with variable connectivity.
Hybrid AI: On-device and server-side
While on-device AI can handle a large array of use cases, some use cases require server-side support.
For example, you may need to use larger models or support a wider range of platforms and devices.
You may consider hybrid approaches, depending on:
- Complexity: Specific, approachable use cases are easier to support with on-device AI. For complex use cases, consider server-side implementation.
- Resiliency: Use server-side by default, and use on-device when the device is offline or on a spotty connection.
- Graceful fallback: Adoption of browsers with built-in AI will take time, some models may be unavailable, and older or less powerful devices may not meet the hardware requirements for running all models optimally. Offer server-side AI for those users.
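As one way to picture how these considerations could combine, here is a minimal sketch of a routing decision. Every name in it (`InferenceTarget`, `DeviceContext`, `chooseTarget`) is hypothetical and not part of any browser API. The policy itself is an assumption: your own product may weigh complexity, resiliency, and fallback differently.

```typescript
// Hypothetical types and function for illustration only; not a browser API.
type InferenceTarget = "on-device" | "server";

interface DeviceContext {
  // Whether the browser exposes a usable built-in model for this task.
  builtInModelAvailable: boolean;
  // Whether the device currently has a usable network connection.
  online: boolean;
  // Whether the task is specific and simple enough for a small local model.
  taskIsSimple: boolean;
}

// Returns where to run inference, or null when neither option is possible.
function chooseTarget(ctx: DeviceContext): InferenceTarget | null {
  // Resiliency: without a connection, on-device is the only option,
  // even for complex tasks (best effort).
  if (!ctx.online) {
    return ctx.builtInModelAvailable ? "on-device" : null;
  }
  // Graceful fallback: no built-in model on this browser or device.
  if (!ctx.builtInModelAvailable) {
    return "server";
  }
  // Complexity: keep simple tasks local, send complex ones to the server.
  return ctx.taskIsSimple ? "on-device" : "server";
}
```

Keeping the decision in a pure function like this makes it easy to test without a browser. In a real page you would populate `DeviceContext` from feature detection and connectivity checks.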
For Gemini models, you can use backend integration (with Python, Go, Node.js, or REST) or implement in your web application with the new Google AI client SDK for Web.
Browser architecture and APIs
To support built-in AI in Chrome, we created infrastructure to access foundation and expert models for on-device execution. This infrastructure is already powering innovative browser features, such as Help me write, and will soon power APIs for on-device AI.
You'll access built-in AI capabilities primarily with task APIs, such as a translation API or a summarization API. Task APIs are designed to run inference against the best model for the assignment.
In Chrome, these APIs are built to run inference against Gemini Nano with fine-tuning or an expert model. Designed to run locally on most modern devices, Gemini Nano is best for language-related use cases, such as summarization, rephrasing, or categorization.
Also, we intend to provide exploratory APIs, so that you can experiment locally and share additional use cases.
For example, we may provide:
- Prompt API: Send an arbitrary task, expressed in natural language, to the built-in Large Language Model (Gemini Nano in Chrome).
- Fine-tuning (LoRA) API: Improve the built-in LLM's performance on a task by adjusting the model's weights with Low-Rank Adaptation fine-tuning.
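To illustrate what "an arbitrary task, expressed in natural language" might look like in practice, the sketch below builds a categorization task on top of a generic prompt function. The `PromptFn` type is a hypothetical stand-in, since this article does not define the shape of the Prompt API. The helper names and the prompt wording are also assumptions.

```typescript
// Hypothetical stand-in for a prompt-style API: takes a natural-language
// prompt and resolves to the model's text reply. Not an actual browser API.
type PromptFn = (prompt: string) => Promise<string>;

// Builds a natural-language instruction asking for exactly one known label.
function buildCategorizationPrompt(text: string, labels: string[]): string {
  return [
    `Classify the following text as one of: ${labels.join(", ")}.`,
    "Reply with the label only.",
    "",
    `Text: ${text}`,
  ].join("\n");
}

// Maps a free-form reply back to a known label, or null if none matches.
function parseLabel(reply: string, labels: string[]): string | null {
  const normalized = reply.trim().toLowerCase();
  const match = labels.find((label) => label.toLowerCase() === normalized);
  return match === undefined ? null : match;
}

// Sends the prompt through whatever prompt function the caller supplies.
async function categorize(
  prompt: PromptFn,
  text: string,
  labels: string[],
): Promise<string | null> {
  const reply = await prompt(buildCategorizationPrompt(text, labels));
  return parseLabel(reply, labels);
}
```

Because the model's reply is free-form text, the sketch validates it against the known labels rather than trusting it directly. A dedicated task API would not need this step, which is one reason task APIs are the primary path.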
When to use built-in AI
Here are a few ways we expect built-in AI can benefit you and your users:
- AI-enhanced content consumption: Including summarization, translation, answering questions about some content, categorization, and characterizing.
- AI-supported content creation: Such as writing assistance, proofreading, grammar correction, and rephrasing.
What's next
Join our early preview program to experiment with early-stage built-in AI APIs.
In the meantime, you can learn how to use Gemini Pro on Google's servers with your websites and web apps in our quickstart for the Google AI JavaScript SDK.