Your application is a runaway success. You started with a single, powerful Compute Engine VM serving your website, but now it’s struggling to keep up with the flood of user traffic. Worse, if that one VM fails, your entire application goes down. You have a single point of failure and a performance bottleneck.
The solution is clear: you need to run multiple copies—or “replicas”—of your application on a fleet of VMs. This is where you create a Managed Instance Group (MIG), a collection of identical VMs that can even autoscale based on demand.
But this creates a new, crucial problem. You now have a dozen VMs, each with its own IP address. Which IP do you give to your users? How do you distribute the traffic evenly among them? And how do you stop sending traffic to a VM that has crashed?
You need a traffic cop. You need a grand central station for all your application’s traffic. You need a Load Balancer.
What is a Load Balancer?
A load balancer is a device—or in our case, a managed service—that sits in front of your servers and acts as a single entry point for all incoming traffic. Its job is to:
- Provide a single, stable IP address for your users.
- Distribute incoming requests across your pool of backend servers.
- Continuously monitor the health of those servers and only send traffic to the healthy ones.
In Google Cloud, “Load Balancer” isn’t a single product. It’s a comprehensive suite of different products, each tailored for a specific type of traffic and use case.
The Great Decision Tree
Choosing the right load balancer comes down to answering three fundamental questions.
- Global vs. Regional? Do you need to serve traffic to users all over the world from a single IP, or are your users and servers located in a single geographic region?
- External vs. Internal? Is the traffic coming from the public internet, or is it flowing between services inside your own VPC network?
- HTTP(S) vs. Everything Else (TCP/UDP)? Are you balancing web traffic (Layer 7), where you can make routing decisions based on the URL path or hostname? Or are you balancing other protocols (Layer 4), like gaming data over UDP or a database connection over TCP?
Let’s walk through this decision tree by meeting the stars of the show.
The Global Superstar: External HTTP(S) Load Balancer
The Scenario: You’re running a global e-commerce website. You have users in New York, London, and Tokyo. You want to give every user a single IP address (www.ethernetdude.com), but you want their traffic to be served by the VMs closest to them to keep latency low.
In this scenario you would typically use the Global External HTTP(S) Load Balancer.
This is a Layer 7 load balancer, meaning it understands HTTP and HTTPS. Its key features are incredible:
- Single Anycast IP Address: You get one public IP address that is announced from all of Google’s network edge locations around the world. When a user in London hits that IP, their traffic automatically enters Google’s network in Europe.
- Global Backend Support: It can balance traffic between instance groups in multiple regions (`us-east1`, `europe-west2`, `asia-northeast1`). It will intelligently route users to the closest healthy backend with available capacity.
- SSL Offloading: Instead of making each of your VMs handle the encryption/decryption of SSL traffic, the load balancer does it for you at the edge. This frees up CPU cycles on your backend servers. You can even use free, Google-managed SSL certificates that auto-renew (see the sketch after this list).
- Cloud CDN Integration: With a single checkbox, you can enable Google’s Content Delivery Network to cache your static content (like images and videos) at the edge, making your site lightning-fast for users.
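For example, attaching a Google-managed certificate and switching on Cloud CDN each take a single command. Here is a minimal sketch, assuming a URL map (web-map) and a global backend service (web-bes) already exist; all resource names are illustrative:

```
# Create a free, Google-managed SSL certificate that auto-renews
gcloud compute ssl-certificates create www-cert \
    --domains=www.ethernetdude.com --global

# Terminate HTTPS at the edge by attaching the certificate to an HTTPS proxy
gcloud compute target-https-proxies create https-proxy \
    --url-map=web-map --ssl-certificates=www-cert

# Enable Cloud CDN on the existing global backend service
gcloud compute backend-services update web-bes \
    --enable-cdn --global
```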
Remember: If the question involves global reach, HTTP/HTTPS, or a single IP for multiple regions, this is almost always the answer.
The Regional Workhorse: External Network Load Balancer (TCP/UDP)
The Scenario: You’re not balancing a website. You’re running a backend for a popular online game that uses a custom protocol over UDP port 7777. All your game servers are located in the `us-central1` region.
HTTP(S) is of no use to you here. You need a Layer 4 load balancer that can handle raw TCP or UDP traffic. For this, you’ll use the Regional External Network Load Balancer.
- Regional Scope: As the name implies, it’s a regional service. It distributes traffic to backends within a single region.
- Layer 4 Traffic: It understands TCP and UDP but has no knowledge of what’s inside the packets (like HTTP headers or URLs). It just forwards the traffic.
- High Performance: It’s a pass-through load balancer, meaning it does not proxy connections. This allows for extremely high throughput and low latency, perfect for real-time applications.
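Here is a minimal sketch of wiring one up for the game-server scenario. The resource names (game-hc, game-bes, game-mig, game-fr) are illustrative, and the health check probes a TCP port because there is no UDP health check type:

```
# Regional health check (probe a TCP port your game servers expose)
gcloud compute health-checks create tcp game-hc \
    --region=us-central1 --port=7777

# Regional external backend service carrying UDP traffic
gcloud compute backend-services create game-bes \
    --protocol=UDP --load-balancing-scheme=EXTERNAL \
    --region=us-central1 \
    --health-checks=game-hc --health-checks-region=us-central1

# Attach the game-server MIG as the backend
gcloud compute backend-services add-backend game-bes \
    --instance-group=game-mig --instance-group-zone=us-central1-a \
    --region=us-central1

# Forwarding rule listening on UDP 7777
gcloud compute forwarding-rules create game-fr \
    --region=us-central1 --load-balancing-scheme=EXTERNAL \
    --ip-protocol=UDP --ports=7777 \
    --backend-service=game-bes
```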
Remember: If the traffic is external, regional, and NOT HTTP(S) (e.g., SSH, RDP, SMTP, custom TCP/UDP), this is your tool.
The Internal Director: Internal TCP/UDP Load Balancer
The Scenario: You have a classic three-tier application running entirely within your VPC. Your web servers (tier 1) need to talk to a cluster of application servers (tier 2). You want to load balance the traffic from the web tier to the app tier, but you absolutely do not want to expose your application servers to the public internet.
You need an internal load balancer. The Internal TCP/UDP Load Balancer is the workhorse for this.
- Internal IP Address: The load balancer gets a private IP address from your own VPC subnet. It is only reachable by other resources within that VPC (or peered VPCs).
- Layer 4 Traffic: Like its external cousin, it balances TCP/UDP traffic and is perfect for internal service-to-service communication.
- High Availability: It’s a managed service that provides a highly available front end for your internal services.
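A minimal sketch for the web-tier-to-app-tier scenario, assuming a regional health check (app-hc) and an app-tier MIG (app-mig) already exist; the names, network, and subnet are illustrative:

```
# Internal backend service for the app tier
gcloud compute backend-services create app-bes \
    --load-balancing-scheme=INTERNAL --protocol=TCP \
    --region=us-central1 \
    --health-checks=app-hc --health-checks-region=us-central1

gcloud compute backend-services add-backend app-bes \
    --instance-group=app-mig --instance-group-zone=us-central1-a \
    --region=us-central1

# The forwarding rule draws a private IP from your own subnet
gcloud compute forwarding-rules create app-ilb \
    --load-balancing-scheme=INTERNAL \
    --network=my-vpc --subnet=app-subnet \
    --ip-protocol=TCP --ports=8080 \
    --region=us-central1 --backend-service=app-bes
```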
Remember: If the traffic is service-to-service inside your VPC and is TCP/UDP-based, this is the answer.
The Internal Specialist: Internal HTTP(S) Load Balancer
The Scenario: Your internal architecture is based on microservices that communicate with each other over REST APIs (which use HTTP). You want to use advanced routing rules, like sending requests for `/users` to the user-service and requests for `/inventory` to the inventory-service, all through a single internal entry point.
The Internal TCP/UDP Load Balancer can’t do this because it doesn’t understand URLs. You need the Internal HTTP(S) Load Balancer.
- Internal Layer 7: It’s the internal version of the global superstar. It provides advanced, Layer 7 routing (path-based, host-based) for traffic inside your VPC.
- Proxy-Based: It’s a managed proxy that can handle complex routing and policy enforcement between your internal microservices.
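The path-based routing itself lives in a regional URL map. A minimal sketch, assuming regional backend services user-bes and inventory-bes already exist (this load balancer also requires a proxy-only subnet in the region); all names are illustrative:

```
# Regional URL map whose default backend is the user service
gcloud compute url-maps create internal-map \
    --default-service=user-bes --region=us-central1

# Send /users/* and /inventory/* to their respective services
gcloud compute url-maps add-path-matcher internal-map \
    --path-matcher-name=api-paths \
    --default-service=user-bes \
    --path-rules="/users/*=user-bes,/inventory/*=inventory-bes" \
    --region=us-central1
```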
Remember: If the scenario specifies internal microservices that need HTTP/HTTPS-based routing, this is your go-to.
The Anatomy of a Load Balancer
Behind the scenes, a load balancer is not a single box but a collection of configured resources working together.
- Forwarding Rule: The front door. It defines the IP address, protocol, and port that the load balancer listens on.
- Target Proxy / Target Pool: The forwarding rule sends traffic here. For proxy-based load balancers, the target proxy is the component that terminates user connections; pass-through load balancers instead use a target pool, which forwards traffic without terminating it.
- URL Map (for Layer 7 only): The receptionist’s directory. It holds the rules that map a request’s path or host to a specific backend service.
- Backend Service / Backend Pool: This defines how the load balancer distributes traffic to its backends (e.g., balancing mode) and points to the health check.
- Backend (Instance Group or NEG): The actual group of VMs (Managed Instance Groups) or containers (Network Endpoint Groups – NEGs) that will receive the traffic.
- Health Check: A crucial component. The load balancer constantly probes the backends using a health check (e.g., requesting a specific URL and looking for a `200 OK` response). If a backend fails its health check, the load balancer stops sending traffic to it until it becomes healthy again (see the sketch after this list).
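A minimal sketch of a tuned HTTP health check; the name and the interval/threshold values are illustrative:

```
# Probe /healthz every 10s; mark unhealthy after 3 consecutive failures
gcloud compute health-checks create http web-hc \
    --port=80 --request-path=/healthz \
    --check-interval=10s --timeout=5s \
    --healthy-threshold=2 --unhealthy-threshold=3
```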
Common Pitfalls & Best Practices
- Pitfall: Choosing a Regional load balancer when your application has a global user base. This will result in high latency for users far from your chosen region.
- Best Practice: Use the Global External HTTP(S) Load Balancer for any web application with a geographically dispersed audience.
- Pitfall: Exposing an internal backend service to the internet with an external load balancer when it should only be accessible within your VPC.
- Best Practice: Use internal load balancers for all service-to-service communication within your VPC to adhere to the principle of least privilege.
- Pitfall: Configuring a health check that is too aggressive or doesn’t accurately reflect the health of the application. This can cause healthy backends to be removed from rotation.
- Best Practice: Design a dedicated health check endpoint in your application (e.g., `/healthz`) that returns a `200 OK` only if the application is fully functional.
- Pitfall: Forgetting to configure firewall rules to allow traffic from the load balancer to your backends.
- Best Practice: Always create a firewall rule that allows ingress traffic from Google’s specific health check IP ranges (`35.191.0.0/16` and `130.211.0.0/22`) to your backend instances, as sketched below.
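A minimal sketch of that rule; the network name and target tag are illustrative:

```
# Allow Google health check probes to reach the backend VMs
gcloud compute firewall-rules create allow-lb-health-checks \
    --network=my-vpc --direction=INGRESS --action=ALLOW \
    --source-ranges=35.191.0.0/16,130.211.0.0/22 \
    --target-tags=web-backend \
    --rules=tcp:80
```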
Quick Reference Command Center
Here are some key `gcloud` commands for creating a typical Global External HTTP(S) Load Balancer.
| Component | Action | Command |
|---|---|---|
| Instance Group | Create a Managed Instance Group | `gcloud compute instance-groups managed create [MIG_NAME] --template=[TEMPLATE_NAME] --size=3 --zone=[ZONE]` |
| Health Check | Create an HTTP Health Check | `gcloud compute health-checks create http [HC_NAME] --port=80` |
| Backend Service | Create a Backend Service | `gcloud compute backend-services create [BS_NAME] --protocol=HTTP --health-checks=[HC_NAME] --global` |
| Backend Service | Add the MIG to the Backend Service | `gcloud compute backend-services add-backend [BS_NAME] --instance-group=[MIG_NAME] --instance-group-zone=[ZONE] --global` |
| URL Map | Create a URL Map | `gcloud compute url-maps create [MAP_NAME] --default-service=[BS_NAME]` |
| Target Proxy | Create a Target Proxy | `gcloud compute target-http-proxies create [PROXY_NAME] --url-map=[MAP_NAME]` |
| Forwarding Rule | Create a Global Forwarding Rule | `gcloud compute forwarding-rules create [RULE_NAME] --address=[IP_ADDRESS_NAME] --global --target-http-proxy=[PROXY_NAME] --ports=80` |
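The forwarding rule above refers to a reserved IP by name. Reserving that global static address first is a single command (using the same placeholder):

```
gcloud compute addresses create [IP_ADDRESS_NAME] --global
```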