We’ve journeyed across the entire landscape of Google Cloud compute. We’ve laid our own foundation with Compute Engine (IaaS). We’ve mastered the complex but powerful city-planning of GKE (CaaS). And we’ve moved into the effortless luxury apartment of App Engine (PaaS). But a new desire has emerged. Your developers love the scale-to-zero efficiency and zero-server-management of App Engine Standard. However, they also love the freedom of Docker containers that they get with GKE or App Engine Flexible, allowing them to use any language, any library, any binary. They come to you with a simple request: “Can we have both? Can we have the ‘just run my code’ simplicity of serverless, but with the ‘run any container’ flexibility of Docker?”
For years, the answer was “pick one.” Today, the answer is a resounding “yes.” And the service that makes this possible is Cloud Run.
What is Cloud Run?
Cloud Run is a fully managed, serverless platform for running stateless containers. Let’s break that down. It takes the best of serverless and the best of containers and merges them into a single, powerful service.
- From Serverless: You get automatic scaling (including scaling down to zero), no infrastructure to manage, and a pay-per-use billing model.
- From Containers: You get the freedom to package your application in a standard Docker container, giving you total control over your runtime environment.
The analogy: Cloud Run is like a magical, self-replicating shipping container. When a shipment (an HTTP request) arrives at the port, the container instantly appears, processes the shipment, and then vanishes. If a thousand shipments arrive at once, a thousand containers appear instantly to handle the load. You just provide the blueprint for the container; Google handles the magic.
How It Works: Revisions, Concurrency, and CPU
The workflow is beautifully simple. You package your web application into a container image, push it to Artifact Registry, and then tell Cloud Run to deploy it.
```shell
gcloud run deploy my-cool-service \
  --image us-central1-docker.pkg.dev/my-project/my-repo/my-app:v1 \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated
```
Cloud Run takes your container, gives you a secure HTTPS endpoint, and handles everything else. When you deploy a new version of your container or change its configuration, Cloud Run creates a new, immutable Revision. Just like with App Engine, you can split traffic between different revisions for safe, gradual rollouts.
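Inside the container, Cloud Run's contract is simple: listen for HTTP on the port passed in the `PORT` environment variable (8080 by default). A minimal stateless service using only Python's standard library (the response body is illustrative):

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Minimal stateless handler: nothing is stored between requests."""

    def do_GET(self):
        body = b"Hello from Cloud Run!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example's container logs quiet

if __name__ == "__main__":
    # Cloud Run injects the port to listen on via the PORT env var.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Package this with any base image you like; as long as it listens on `$PORT`, Cloud Run can run it.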
The Secret Sauce: Container Concurrency
Here’s a key feature that makes Cloud Run so efficient. A 1st-gen Cloud Functions instance handles only one request at a time. A single Cloud Run container instance, however, can process multiple requests simultaneously. By default, an instance can handle up to 80 concurrent requests.
This is a game-changer for cost and performance. If you have 80 simultaneous users hitting your API, you might only need one Cloud Run instance, whereas you’d need 80 Cloud Function instances. This makes Cloud Run exceptionally cost-effective for services with steady traffic.
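As a rough back-of-the-envelope sketch (the real autoscaler also weighs CPU utilization, startup time, and traffic spikes), the steady-state instance count is the concurrent load divided by per-instance concurrency, rounded up:

```python
import math

def instances_needed(concurrent_requests: int, concurrency_per_instance: int) -> int:
    """Rough lower bound on instances needed to serve a steady concurrent load."""
    if concurrent_requests <= 0:
        return 0  # no traffic: scale to zero
    return math.ceil(concurrent_requests / concurrency_per_instance)

print(instances_needed(80, 80))  # default Cloud Run concurrency -> 1 instance
print(instances_needed(80, 1))   # one-request-per-instance model -> 80 instances
print(instances_needed(0, 80))   # idle -> 0 instances
```

The same 80-user load that needs 80 single-request instances fits on one Cloud Run instance at the default concurrency setting.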
CPU Allocation: A Tale of Two Models
- CPU is allocated only during request processing (Default): This is the classic serverless model. If your container instance is idle (not actively handling a request), its CPU is severely throttled. You only pay for the CPU you use while processing requests. This is perfect for standard web services.
- CPU is always allocated: You can configure your service to have its CPU available at all times, even between requests. This is for applications that need to perform background work between requests. Note that this changes billing: you pay for an instance’s entire lifetime, not just request time. On its own it does not keep instances alive; to guarantee an instance is always running, also set a minimum instance count.
Controlling Access: Ingress and IAM
Who can access the HTTPS endpoint for your Cloud Run service? You have granular control.
- Ingress Control: You can set the ingress to:
- All: The service is public and accessible from anywhere on the internet.
- Internal: The service is private and can only be reached from within your VPC network.
- Internal and Cloud Load Balancing: The service is internal but can also be used as a backend for an Internal or External Load Balancer.
- Authentication: Ingress controls where traffic may come from; IAM controls who may invoke the service. By default, a service is private: to call it, the user or service account must hold the Cloud Run Invoker (`roles/run.invoker`) IAM role. The `--allow-unauthenticated` flag during deployment grants this role to the special `allUsers` principal, making the service truly public.
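For service-to-service calls to a private Cloud Run service, the caller fetches an identity token from the metadata server and presents it as a bearer token. A sketch using only the standard library (`call_private_service` and the service URL are illustrative; actually running it requires a Google Cloud environment where the metadata server exists):

```python
import urllib.request

# Metadata-server endpoint that mints an ID token for a given audience.
METADATA_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                "instance/service-accounts/default/identity")

def id_token_request(audience: str) -> urllib.request.Request:
    """Build the metadata-server request for an ID token scoped to `audience`."""
    return urllib.request.Request(
        f"{METADATA_URL}?audience={audience}",
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )

def call_private_service(url: str) -> bytes:
    """Call a private Cloud Run service using the runtime service account."""
    token = urllib.request.urlopen(id_token_request(url)).read().decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    return urllib.request.urlopen(req).read()
```

The caller’s service account still needs `roles/run.invoker` on the target service; the token only proves identity.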
Connecting to Your Private World: VPC Access
Just like Cloud Functions and App Engine, your Cloud Run container runs in a secure, Google-managed environment. If it needs to connect to a Cloud SQL database or a Memorystore instance inside your private VPC, it needs a bridge.
This is accomplished using a Serverless VPC Access Connector. You create the connector in your VPC, and then configure your Cloud Run service to use it. This allows your container to securely communicate with other resources using their private IP addresses.
The Final Showdown: When to Use What?
This is one of the most important topics for the ACE exam. With GCE, GKE, App Engine, Cloud Functions, and Cloud Run, which do you choose?
| Service | Best For… | You Manage… | Scales to Zero? |
|---|---|---|---|
| Compute Engine (GCE) | Full control, legacy apps, custom OS | Everything (VMs, OS, patching, scaling) | No |
| GKE | Complex, orchestrated microservices, stateful apps | Containers, cluster configuration, node pools | No (Standard) <br> Yes (Autopilot pods) |
| App Engine Standard | Web apps in specific runtimes, rapid scaling | Just your code | Yes |
| Cloud Functions | Single-purpose, event-driven code snippets | Just your code/functions | Yes |
| Cloud Run | Stateless, request-driven web services in containers | Just your container image | Yes |
The simple rule of thumb:
- Need a full VM? -> GCE
- Need Kubernetes? -> GKE
- Have a simple web app in a supported language? -> Start with App Engine Standard.
- Have a small piece of code that reacts to an event (like a file upload)? -> Cloud Functions.
- Have a web application you want to run as a container and want serverless scaling? -> Cloud Run.
Common Pitfalls & Best Practices
- Pitfall: Trying to run a stateful application (like a traditional database) in Cloud Run. The container’s local file system is ephemeral.
- Best Practice: Design your containers to be stateless. Externalize all state to a managed service like Cloud SQL, Firestore, or Memorystore.
- Pitfall: Setting concurrency too high for a CPU-intensive application. A single instance might get overwhelmed trying to handle too many requests at once.
- Best Practice: Tune your concurrency settings. For CPU-heavy work, a lower concurrency (even 1) might be more appropriate. For I/O-bound work, a higher concurrency is fine.
- Pitfall: Assuming “CPU always allocated” only changes performance. It also changes billing: you pay for an instance’s entire running time, and if you pair it with a minimum instance count, those instances incur costs 24/7.
- Best Practice: Use the default “CPU during requests” model unless you have a clear need for background processing.
- Pitfall: Not building your container images efficiently, leading to large images and slow cold starts.
- Best Practice: Use multi-stage builds and minimal base images (like `alpine` or `distroless`) to keep your container images small and lean.
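One way to apply that best practice is a multi-stage Dockerfile: build with the full toolchain, then copy only the finished artifact into a minimal base. An illustrative sketch for a Go binary (the image tags and paths are assumptions, not a prescription):

```dockerfile
# Stage 1: build with the full Go toolchain
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: copy only the static binary into a minimal base image
FROM gcr.io/distroless/static-debian12
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

The final image contains just the binary and a stripped-down runtime layer, which keeps pulls fast and cold starts short.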
Quick Reference Command Center
Here’s a cheatsheet of `gcloud` commands for managing Cloud Run.
| Action | Command |
|---|---|
| Deploy a Service | `gcloud run deploy [SERVICE_NAME] --image [IMAGE_URL] --region [REGION]` |
| List Deployed Services | `gcloud run services list` |
| Describe a Service | `gcloud run services describe [SERVICE_NAME] --region [REGION]` |
| Set IAM Policy (Make Private) | `gcloud run services remove-iam-policy-binding [SERVICE_NAME] --member=allUsers --role=roles/run.invoker` |
| Set IAM Policy (Make Public) | `gcloud run services add-iam-policy-binding [SERVICE_NAME] --member=allUsers --role=roles/run.invoker` |
| Update Traffic Split | `gcloud run services update-traffic [SERVICE_NAME] --to-revisions=REV1=50,REV2=50` |
| View Service Logs | `gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=[SERVICE_NAME]"` |