Introduction
I am done with managing API keys. Between billing limits, the fear of hitting a quota, and costs creeping up, using the cloud for everything is a headache.
It actually happened to me recently: I hit my limit while playing with an API and testing some interesting features. I thought, “How can I avoid paying per token so I can stop worrying about limits?”
That’s when I found Gemini Nano. It’s a small AI model built into Google Chrome that runs on your own machine. You don’t need to sign up for OpenAI or any other provider; you just need a few lines of code. That’s all.
Step 1: Enable the Secret Flags
- Open Chrome and type chrome://flags in the address bar.
- Search for and enable these two:
  #prompt-api-for-gemini-nano
  #optimization-guide-on-device-model
- Relaunch Chrome.
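After relaunching, it can help to confirm that the flags actually took effect before writing any real code. The snippet below is a minimal sketch of that check. The helper name hasLocalAI is hypothetical, and it assumes the window.ai.languageModel shape used in this post; other Chrome versions may expose the API under a different name.

```javascript
// Hypothetical helper: reports whether the Prompt API shape used in this
// post is exposed on the given global object (window in a browser).
function hasLocalAI(scope = globalThis) {
  return typeof scope.ai?.languageModel?.capabilities === "function";
}

// Paste into the DevTools console after relaunching Chrome.
console.log(hasLocalAI() ? "Prompt API found" : "Prompt API not exposed");
```

If this reports that the API is not exposed, double-check both flags and relaunch again.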
Step 2: The JavaScript Code
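Next, you just need a few lines of JavaScript; the browser takes care of running the model on your hardware. The Prompt API is experimental and its shape has changed between Chrome releases, so treat the names below as a snapshot: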
async function askLocalAI(question) {
  // 0. Guard: window.ai only exists when the flags from Step 1 are enabled
  if (!window.ai?.languageModel) {
    return "Local AI is not enabled in this browser.";
  }

  // 1. Check if the browser's AI is ready
  const status = await window.ai.languageModel.capabilities();

  // "no" means the model cannot be used; any other value means it is
  // ready or can be downloaded when the session is created
  if (status.available === "no") {
    return "Local AI is not available on this device.";
  }

  // 2. Create a session (the "brain" instance)
  const session = await window.ai.languageModel.create({
    systemPrompt: "You are a helpful assistant for developers."
  });

  // 3. Prompt the model directly
  return session.prompt(question);
}
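The function above creates a fresh session for every question. If you plan to ask several questions, one option is to create the session once and reuse it. Here is a minimal sketch of that idea. The helper name createAsker is hypothetical, and it only assumes an object with the same create() and prompt() methods used above, which is passed in as a parameter.

```javascript
// Hypothetical helper: wraps a languageModel-like object (anything with an
// async create() that returns a session with an async prompt()) and reuses
// a single session across questions.
function createAsker(languageModel, systemPrompt) {
  let sessionPromise = null; // created lazily on the first question

  return async function ask(question) {
    if (!sessionPromise) {
      sessionPromise = languageModel.create({ systemPrompt }).catch((err) => {
        sessionPromise = null; // allow a retry if session creation failed
        throw err;
      });
    }
    const session = await sessionPromise;
    return session.prompt(question);
  };
}

// Usage in a browser with the flags from Step 1 enabled:
//   const ask = createAsker(window.ai.languageModel, "You are a helpful assistant for developers.");
//   console.log(await ask("What is a closure?"));
```

One trade-off to keep in mind: a reused session may carry earlier questions as context, so create a new one when you want a clean slate.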
Why is this better?
- No Limits: There is no token quota or billing cap to run out of.
- Cost: There are no per-request charges; the model runs on your own hardware.
- Privacy: Prompts are processed on the user’s machine, so their data never has to leave it, which is awesome.
Conclusion
We are moving from “AI in the cloud” to “AI everywhere,” and for me the local route is cheaper, more reliable, and more fun. I am thinking of building an AI tool with this. What do you think I should build?