If you’ve ever glanced at your Android phone’s storage breakdown and done a double-take at how much space AICore is consuming, you’re not alone. It’s one of those things that’s easy to notice and hard to explain, and for a while, Google wasn’t offering much clarity on it. That’s changed now, and the explanation turns out to be more sensible than the mystery surrounding it suggested.
AICore is the on-device AI backbone that powers a growing list of features on Android 14 and above — smart replies in WhatsApp, scam detection in messages, real-time transcription, grammar correction, audio summarization, and more. It runs Gemini Nano locally on supported hardware, which means your data stays on your device, the features work without an internet connection, and there’s no latency from bouncing a request off a remote server. The trade-off, as anyone who’s installed a multi-gigabyte model knows, is storage.
The storage spike has a simple explanation
Google has now published a support article addressing the one thing that confused people most: why AICore’s storage footprint sometimes balloons unexpectedly. The answer is that when a new version of Gemini Nano becomes available, AICore holds both the old and the new versions simultaneously for up to 3 days before clearing the original version.

It’s a precautionary measure. If the new model version encounters problems after installation, your phone can instantly revert to the previous version rather than re-download gigabytes of model data from scratch. It’s the kind of sensible engineering decision that’s obvious in hindsight, but Google probably should have communicated it sooner, given how much confusion it’s caused.
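To make the mechanism concrete, here's a minimal sketch of that keep-both-then-clean update strategy. Everything here is illustrative — the class, method names, and retention logic are assumptions for explanation, not AICore's actual implementation; the only details taken from Google's explanation are that both model versions coexist on disk and the old one is cleared after up to 3 days.

```python
import time

# Superseded model is kept for up to 3 days (per Google's support article)
RETENTION_SECONDS = 3 * 24 * 3600


class ModelStore:
    """Hypothetical sketch of a dual-version model update strategy.

    Not AICore's real code — just the shape of the idea: keep the old
    model alongside the new one so a broken update can be reverted
    instantly, then reclaim the storage after a retention window.
    """

    def __init__(self, active_version: str):
        self.active = active_version
        self.previous = None          # superseded version kept for rollback
        self.previous_expiry = None   # timestamp after which it may be deleted

    def install(self, new_version: str, now: float) -> None:
        # Keep the old model on disk instead of deleting it immediately —
        # this is the window in which users see roughly double the storage.
        self.previous = self.active
        self.previous_expiry = now + RETENTION_SECONDS
        self.active = new_version

    def rollback(self) -> bool:
        # Instant revert: the old copy is still local, so no multi-gigabyte
        # re-download is needed.
        if self.previous is None:
            return False
        self.active = self.previous
        self.previous = None
        self.previous_expiry = None
        return True

    def cleanup(self, now: float) -> None:
        # Once the retention window has passed, reclaim the space.
        if self.previous is not None and now >= self.previous_expiry:
            self.previous = None
            self.previous_expiry = None

    def storage_copies(self) -> int:
        # 2 during the retention window, 1 otherwise.
        return 1 + (1 if self.previous is not None else 0)
```

During the retention window `storage_copies()` reports 2 — which, scaled to a multi-gigabyte model, is exactly the spike users were noticing.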
On-device AI is worth the storage cost, but Google needs to be upfront
The broader case for on-device AI is genuinely compelling. Sensitive data never leaving your device is a meaningful privacy win in an era when everything seems to be vacuumed into the cloud somewhere. Features that work in airplane mode are more useful than they sound when you’re somewhere with patchy connectivity. And local processing simply feels snappier than waiting on a server response.
But the goodwill only stretches so far when users are left staring at an unexplained storage spike with no context. Documenting it now is the right call — it just shouldn’t have taken this long to get there.