[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-gemma-4-lands-on-google-cloud-en":3,"tags-gemma-4-lands-on-google-cloud-en":30,"related-lang-gemma-4-lands-on-google-cloud-en":41,"related-posts-gemma-4-lands-on-google-cloud-en":45,"series-model-release-94f75563-cdbc-47f2-83c1-0589da2710e1":82},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":10,"topic_cluster_id":10,"embedding":10,"is_canonical_seed":20},"94f75563-cdbc-47f2-83c1-0589da2710e1","Gemma 4 lands on Google Cloud","\u003Cp>Google Cloud just added \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fvertex-ai\" target=\"_blank\" rel=\"noopener\">Gemma 4\u003C\u002Fa>, and the headline number is hard to ignore: context windows up to 256K tokens. The family also brings native vision and audio support, more than 140 languages, and an Apache 2.0 license that makes commercial use far less awkward than it is with many closed models.\u003C\u002Fp>\u003Cp>That matters because Google is not just dropping a model file and waving goodbye. 
It is wiring \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fvertex-ai\u002Fdocs\u002Fgenerative-ai\u002Fmodel-garden\" target=\"_blank\" rel=\"noopener\">Model Garden\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Frun\" target=\"_blank\" rel=\"noopener\">Cloud Run\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fkubernetes-engine\" target=\"_blank\" rel=\"noopener\">Google Kubernetes Engine\u003C\u002Fa>, and its TPU stack around Gemma 4 so teams can choose between managed, serverless, and fully controlled deployment paths.\u003C\u002Fp>\u003Ch2>What Google actually shipped\u003C\u002Fh2>\u003Cp>Google says Gemma 4 is its most capable open model family yet, and the release is broader than a single model checkpoint. The lineup includes smaller variants for lighter workloads, a 31B dense model for heavier reasoning, and a 26B mixture-of-experts model that Google says will arrive as fully managed and serverless on Model Garden in the coming days.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775239441417-bla2.png\" alt=\"Gemma 4 lands on Google Cloud\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The technical pitch is straightforward: long context, multimodal input, and agent-friendly output. 
In practice, that means a model that can read long codebases, inspect images, process audio, and keep track of multi-step tasks without losing the thread halfway through.\u003C\u002Fp>\u003Cul>\u003Cli>Up to 256K token context windows\u003C\u002Fli>\u003Cli>Native vision and audio processing\u003C\u002Fli>\u003Cli>More than 140 languages\u003C\u002Fli>\u003Cli>Apache 2.0 commercial license\u003C\u002Fli>\u003Cli>Variants from 2B to 31B, plus a 26B MoE model\u003C\u002Fli>\u003C\u002Ful>\u003Cp>For developers, the licensing detail may matter as much as the benchmark talk. Apache 2.0 removes a lot of friction for product teams that want to ship features without legal gymnastics around usage terms.\u003C\u002Fp>\u003Cp>Google also ties the release to its own research line, saying Gemma 4 is built from the same research as Gemini 3. That does not make the models interchangeable, but it does suggest Google is trying to push its open model family closer to the capabilities people expect from its flagship proprietary stack.\u003C\u002Fp>\u003Ch2>Why enterprise teams should care\u003C\u002Fh2>\u003Cp>If you work in enterprise AI, the most interesting part of this announcement is not the model size. It is the deployment story. 
Google Cloud is positioning Gemma 4 as a model you can keep inside your own cloud boundary, your own compliance setup, and in some cases your own sovereign infrastructure.\u003C\u002Fp>\u003Cp>That includes \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fsolutions\u002Fsovereign-cloud\" target=\"_blank\" rel=\"noopener\">Sovereign Cloud\u003C\u002Fa> offerings, public cloud with data boundary controls, Google Cloud Dedicated options such as \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fblog\u002Fproducts\u002Finfrastructure\u002Fintroducing-s3ns\" target=\"_blank\" rel=\"noopener\">S3NS in France\u003C\u002Fa>, and \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fdistributed-cloud\" target=\"_blank\" rel=\"noopener\">Google Distributed Cloud\u003C\u002Fa> for air-gapped and on-premises deployments. That spread matters because a lot of AI projects die in procurement, not in the notebook.\u003C\u002Fp>\u003Cblockquote>“I think the biggest thing is we’re seeing companies realize that AI is not a science project anymore.” — Thomas Kurian, Google Cloud Next 2024 keynote\u003C\u002Fblockquote>\u003Cp>Kurian’s point lines up with what Google is doing here. The company is packaging Gemma 4 as infrastructure, not a demo. That means model choice, serving choice, and compliance choice all get treated as first-class decisions.\u003C\u002Fp>\u003Cp>Google also says the 26B MoE model will land as fully managed and serverless on Model Garden soon. That is a smart move. Many teams want open weights, but they do not want to own every layer of serving, scaling, and patching just to get a model into production.\u003C\u002Fp>\u003Ch2>The deployment options are the real story\u003C\u002Fh2>\u003Cp>Gemma 4 is available across several Google Cloud paths, and each one targets a different kind of team. 
If you want minimal ops overhead, \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fvertex-ai\u002Fdocs\u002Fgenerative-ai\u002Fmodel-garden\u002Fuse-gemma\" target=\"_blank\" rel=\"noopener\">Vertex AI\u003C\u002Fa> and Model Garden are the cleanest route. If you want tight control, GKE and TPU-based serving give you more knobs.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775239438828-8h2y.png\" alt=\"Gemma 4 lands on Google Cloud\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Google is also pushing agentic use cases hard. The company says Gemma 4 can handle reasoning, function calling, code generation, and structured output, and it pairs that with its open-source \u003Ca href=\"https:\u002F\u002Fgoogle.github.io\u002Fadk-docs\u002F\" target=\"_blank\" rel=\"noopener\">Agent Development Kit\u003C\u002Fa> for building AI agents.\u003C\u002Fp>\u003Cul>\u003Cli>\u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Frun\" target=\"_blank\" rel=\"noopener\">Cloud Run\u003C\u002Fa> supports Gemma 4 inference on NVIDIA RTX PRO 6000 Blackwell GPUs with 96GB vGPU memory\u003C\u002Fli>\u003Cli>Cloud Run workloads scale to zero when idle\u003C\u002Fli>\u003Cli>\u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fkubernetes-engine\u002Fdocs\u002Fconcepts\u002Fabout-gke\" target=\"_blank\" rel=\"noopener\">GKE\u003C\u002Fa> can serve Gemma 4 with vLLM\u003C\u002Fli>\u003Cli>\u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Ftpu\" target=\"_blank\" rel=\"noopener\">Google Cloud TPUs\u003C\u002Fa> support serving, pretraining, and post-training\u003C\u002Fli>\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm\" target=\"_blank\" rel=\"noopener\">vLLM\u003C\u002Fa> and \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FNeMo\" target=\"_blank\" 
rel=\"noopener\">NVIDIA NeMo\u003C\u002Fa> are part of the recommended toolchain\u003C\u002Fli>\u003C\u002Ful>\u003Cp>That mix is important because it shows Google is not betting on one serving pattern. Some teams want serverless inference with usage-based pricing. Others want Kubernetes control, custom autoscaling, and GPU or TPU selection. Gemma 4 now fits both camps.\u003C\u002Fp>\u003Cp>For agent builders, the GKE angle may be the most interesting. Google says its new GKE Agent Sandbox can execute LLM-generated code and tool calls inside isolated Kubernetes-native environments, with sub-second cold starts and up to 300 sandboxes per second. If those numbers hold up in real deployments, that is a serious piece of infrastructure for multi-step AI workflows.\u003C\u002Fp>\u003Ch2>How it compares with other open model stacks\u003C\u002Fh2>\u003Cp>Google did not publish the kind of full benchmark table that would let us compare Gemma 4 against every rival model line by line in this post, but the deployment details still tell us a lot. The open model market is increasingly splitting into two camps: teams that want raw weights and teams that want a production path attached to those weights.\u003C\u002Fp>\u003Cp>Gemma 4 looks aimed at the second group. The model family is open, but the surrounding cloud options are what make it attractive for serious work. 
That is where Google is clearly trying to differentiate itself from the bare-metal download-and-figure-it-out approach.\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>Gemma 4\u003C\u002Fstrong>: up to 256K context, vision, audio, 140+ languages, Apache 2.0\u003C\u002Fli>\u003Cli>\u003Cstrong>Llama\u003C\u002Fstrong> family: strong open ecosystem, but deployment and compliance are usually left to the buyer\u003C\u002Fli>\u003Cli>\u003Cstrong>Mistral\u003C\u002Fstrong> models: efficient and popular for self-hosting, with a smaller cloud-native bundle around them\u003C\u002Fli>\u003Cli>\u003Cstrong>Gemma 4 on Google Cloud\u003C\u002Fstrong>: managed, serverless, TPU, GKE, Cloud Run, and sovereign deployment paths in one place\u003C\u002Fli>\u003C\u002Ful>\u003Cp>There is also a practical cost angle. Cloud Run’s scale-to-zero behavior, GKE’s autoscaling controls, and TPU support can all reduce waste if your traffic is spiky or your workloads vary by time of day. That is the kind of detail that matters more than a flashy demo when a finance team starts asking questions.\u003C\u002Fp>\u003Cp>For teams already inside Google Cloud, the integration story may be enough reason to test Gemma 4 first. For teams outside that ecosystem, the release is still worth watching because it shows where open-model deployment is heading: fewer one-off scripts, more managed paths, and more attention to sovereignty and compliance from day one.\u003C\u002Fp>\u003Ch2>What to do next\u003C\u002Fh2>\u003Cp>If you are building an internal assistant, document parser, code tool, or multimodal agent, Gemma 4 is worth a real pilot, especially if your data cannot leave your cloud boundary. 
Start with the smallest model that meets your latency target, then move up only if the task actually needs more reasoning headroom.\u003C\u002Fp>\u003Cp>My guess: the 26B MoE release on \u003Ca href=\"https:\u002F\u002Fcloud.google.com\u002Fvertex-ai\u002Fdocs\u002Fgenerative-ai\u002Fmodel-garden\" target=\"_blank\" rel=\"noopener\">Model Garden\u003C\u002Fa> will get the most attention from teams that want managed open models without running a full MLOps team. If Google keeps the serving story simple and the pricing sane, Gemma 4 could become the default open model choice for many Google Cloud customers this year.\u003C\u002Fp>\u003Cp>The question now is not whether Gemma 4 is capable. It is whether Google can make the deployment path simple enough that teams choose it over stitching together their own stack.\u003C\u002Fp>","Google Cloud brings Gemma 4 to Vertex AI, Cloud Run, GKE, and TPUs, with 256K context, vision, audio, and Apache 2.0 licensing.","cloud.google.com","https:\u002F\u002Fcloud.google.com\u002Fblog\u002Fproducts\u002Fai-machine-learning\u002Fgemma-4-available-on-google-cloud",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775239441417-bla2.png",[13,14,15,16,17],"Gemma 4","Google Cloud","Vertex AI","Cloud Run","GKE","en",1,false,"2026-04-03T18:03:41.196901+00:00","2026-04-03T18:03:41.071+00:00","done","080c8e6a-93f6-417f-b041-51dafb538749","gemma-4-lands-on-google-cloud-en","model-release","b8f87962-35c1-4507-a957-2904710abe69","published","2026-04-07T07:41:09.024+00:00",[31,33,35,37,39],{"name":13,"slug":32},"gemma-4",{"name":16,"slug":34},"cloud-run",{"name":14,"slug":36},"google-cloud",{"name":17,"slug":38},"gke",{"name":15,"slug":40},"vertex-ai",{"id":27,"slug":42,"title":43,"language":44},"gemma-4-lands-on-google-cloud-zh","Gemma 4 登上 Google 
Cloud","zh",[46,52,58,64,70,76],{"id":47,"slug":48,"title":49,"cover_image":50,"image_url":50,"created_at":51,"category":26},"ebd0ef7f-f14d-4e25-a54e-073b49f9d4b9","why-googles-hidden-gemini-live-models-matter-en","Why Google’s Hidden Gemini Live Models Matter More Than the Demo","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778869237748-4rqx.png","2026-05-15T18:20:23.999239+00:00",{"id":53,"slug":54,"title":55,"cover_image":56,"image_url":56,"created_at":57,"category":26},"6c57f6bf-1023-4a22-a6c0-013bd88ac3d1","minimax-m1-open-hybrid-attention-reasoning-model-en","MiniMax-M1 brings 1M-token open reasoning model","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778797872005-z8uk.png","2026-05-14T22:30:39.599473+00:00",{"id":59,"slug":60,"title":61,"cover_image":62,"image_url":62,"created_at":63,"category":26},"68a2ba2e-f07a-4f28-a69c-24bf66652d2e","gemini-omni-video-review-text-rendering-en","Gemini Omni Video Review: Text Rendering Beats Rivals","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778779286834-fy35.png","2026-05-14T17:20:44.524502+00:00",{"id":65,"slug":66,"title":67,"cover_image":68,"image_url":68,"created_at":69,"category":26},"1d5fc6b1-a87f-48ae-89ee-e5f0da86eb2d","why-xiaomi-mimo-v25-pro-changes-coding-agents-en","Why Xiaomi’s MiMo-V2.5-Pro Changes Coding Agents More Than Chatbots","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778689848027-ocpw.png","2026-05-13T16:30:29.661993+00:00",{"id":71,"slug":72,"title":73,"cover_image":74,"image_url":74,"created_at":75,"category":26},"cb3eac19-4b8d-4ee0-8f7e-d3c2f0b50af5","openai-realtime-audio-models-live-voice-en","OpenAI’s Realtime Audio Models Target Live 
Voice","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778451653257-dsnq.png","2026-05-10T22:20:33.31082+00:00",{"id":77,"slug":78,"title":79,"cover_image":80,"image_url":80,"created_at":81,"category":26},"84c630af-a060-4b6b-9af2-1b16de0c8f06","anthropic-10-finance-ai-agents-en","Anthropic Releases 10 Finance AI Agents","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778389841959-ktkf.png","2026-05-10T05:10:23.345141+00:00",[83,88,93,98,103,108,113,118,123,128],{"id":84,"slug":85,"title":86,"created_at":87},"d4cffde7-9b50-4cc7-bb68-8bc9e3b15477","nvidia-rubin-ai-supercomputer-en","NVIDIA Unveils Rubin: A Leap in AI Supercomputing","2026-03-25T16:24:35.155565+00:00",{"id":89,"slug":90,"title":91,"created_at":92},"eab919b9-fbac-4048-89fc-afad6749ccef","google-gemini-ai-innovations-2026-en","Google's AI Leap with Gemini Innovations in 2026","2026-03-25T16:27:18.841838+00:00",{"id":94,"slug":95,"title":96,"created_at":97},"5f5cfc67-3384-4816-a8f6-19e44d90113d","gap-google-gemini-ai-checkout-en","Gap Teams Up with Google Gemini for AI-Driven Checkout","2026-03-25T16:27:46.483272+00:00",{"id":99,"slug":100,"title":101,"created_at":102},"f6d04567-47f6-49ec-804c-52e61ab91225","ai-model-release-wave-march-2026-en","Navigating the AI Model Release Wave of March 2026","2026-03-25T16:28:45.409716+00:00",{"id":104,"slug":105,"title":106,"created_at":107},"895c150c-569e-4fdf-939d-dade785c990e","small-language-models-transform-ai-en","Small Language Models: Llama 3.2 and Phi-3 Transform AI","2026-03-25T16:30:26.688313+00:00",{"id":109,"slug":110,"title":111,"created_at":112},"38eb1d26-d961-4fd3-ae12-9c4089680f5f","midjourney-v8-alpha-features-pricing-en","Midjourney V8 Alpha: A Deep Dive into Its Features and 
Pricing","2026-03-26T01:25:36.387587+00:00",{"id":114,"slug":115,"title":116,"created_at":117},"bf36bb9e-3444-4fb8-ab19-0df6bc9d8271","rag-2026-indispensable-ai-bridge-en","RAG in 2026: The Indispensable AI Bridge","2026-03-26T01:28:34.472046+00:00",{"id":119,"slug":120,"title":121,"created_at":122},"60881d6d-2310-44ef-b1fb-7f98e9dd2f0e","xiaomi-mimo-trio-agents-robots-voice-en","Xiaomi’s MiMo trio targets agents, robots, and voice","2026-03-28T03:05:08.899895+00:00",{"id":124,"slug":125,"title":126,"created_at":127},"f063d8d1-41d1-4de4-8ebc-6c40511b9369","xiaomi-mimo-v2-pro-1t-moe-agents-en","Xiaomi MiMo-V2-Pro: 1T MoE Model for Agents","2026-03-28T03:06:19.238032+00:00",{"id":129,"slug":130,"title":131,"created_at":132},"a1379e9a-6785-4ff5-9b0a-8cff55f8264f","cursor-composer-2-started-from-kimi-en","Cursor’s Composer 2 started from Kimi","2026-03-28T03:11:59.132398+00:00"]