Part-41: 🚀Google Cloud Run Services – Create Service, Traffic Management, Autoscaling, Revisions and Versions

Google Cloud Run is a fully managed serverless platform that allows you to deploy and run containerized applications without worrying about infrastructure. Whether you’re running a small API or a large-scale production app, Cloud Run scales seamlessly based on incoming requests.

🔑 Key Features of Cloud Run Services

🌍 Unique HTTPS Endpoint

Every service deployed on Cloud Run gets a unique HTTPS endpoint. This means your service is accessible securely out of the box, with no need to configure certificates manually.

🔒 Private and Public Services

  • You can expose your services publicly to the internet.
  • Or keep them private within your organization for internal workloads.
  • This flexibility makes it perfect for both customer-facing apps and backend microservices.

💰 Pay-per-Use Pricing

Forget about paying for idle resources. With Cloud Run's default request-based billing, you are charged only for the compute time and resources used while handling requests.

👉 This makes it cost-efficient compared to traditional VM-based deployments.

🔄 Built-in Traffic Management

Cloud Run comes with native traffic splitting and version management.

  • Deploy a new version and gradually shift traffic (e.g., 10% → 50% → 100%).
  • Roll back instantly if issues occur.

This is incredibly useful for A/B testing or canary deployments.

⚡ Fast, Request-Based Auto Scaling

Cloud Run automatically scales your containers based on incoming requests.

  • Zero instances when idle → Scale down to 0.
  • Thousands of instances during traffic peaks → Scale horizontally in seconds.
  • This ensures you only pay for what you need, with the ability to handle sudden traffic spikes effortlessly.

✅ Why Choose Cloud Run?

  • Developer friendly – Deploy with just one command.
  • Secure by default – HTTPS, IAM integration, private networking.
  • Scalable and cost-effective – Scales from zero to thousands of instances within seconds.

Creating a Google Cloud Run Service

Step-01: Introduction

  1. Create a Cloud Run Service
  2. Update Applications
  3. Revision URLs
  4. Traffic Splitting
  5. Autoscaling
  6. Implement all the above features using gcloud run

Step-02: Create Service and Access it

  • Go to Cloud Run -> Create Service
  • Deploy one revision from an existing container image: stacksimplify/google-cloud-run:v1
  • Service Name: myservice1

  • Authentication: Allow unauthenticated invocations
  • Container port: 80
  • Leave everything else at the defaults

  • Click on CREATE

# Docker Image used
stacksimplify/google-cloud-run:v1

# Access Application on Browser
https://myservice1-506997606680.us-central1.run.app/

Google Cloud Run Services – HTTPS Endpoint

Unique HTTPS endpoint for every service

  • Each service gets a unique subdomain under *.run.app
  • Supports custom domains
  • Manages TLS for you
  • Supports WebSockets, HTTP/2, and gRPC (HTTP/2 and gRPC end-to-end)

Google Cloud Run Services – Pricing Model

  • Pay-per-use pricing for services
  • Request-based

If an instance is not processing requests, its CPU is not allocated and you are not charged for it. In addition, you pay a per-request fee.

  • Instance-based

You are charged for the entire lifetime of an instance, and its CPU is always allocated. There is no per-request fee.
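
As a rough illustration of how the two models differ, the sketch below compares billable CPU time for a single instance. It is a simplification under stated assumptions: `billable_seconds` is a hypothetical helper, the inputs are made-up numbers, and it ignores actual prices, the per-request fee, billing granularity, and time spent starting up or shutting down.

```shell
# billable_seconds MODEL LIFETIME_SECONDS BUSY_SECONDS
# MODEL is "request" (billed only while handling requests) or
# "instance" (billed for the whole instance lifetime).
# Hypothetical helper: it models billable CPU time only, not cost.
billable_seconds() {
  local model="$1" lifetime="$2" busy="$3"
  case "$model" in
    request)  echo "$busy" ;;
    instance) echo "$lifetime" ;;
    *) echo "unknown billing model: $model" >&2; return 1 ;;
  esac
}
```

For an instance that lives for an hour but is busy for ten minutes, `billable_seconds request 3600 600` prints 600, while `billable_seconds instance 3600 600` prints 3600.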

Step-03: Update Application – v2

  • Go to Cloud Run -> myservice1 -> EDIT & DEPLOY NEW REVISION
  • Deploy one revision from an existing container image:

stacksimplify/google-cloud-run:v2

  • Serve this revision immediately: CHECKED
  • Click on DEPLOY

# Docker Image used
stacksimplify/google-cloud-run:v2

# Access Application on Browser
https://myservice1-506997606680.us-central1.run.app/

Google Cloud Run Services – Traffic Management

Built-in traffic management lets you route traffic to:

  • The latest revision
  • A previous revision (rollback)
  • Multiple revisions at the same time (gradual rollout)

Step-04: Cloud Run Revisions and Traffic Splitting

  • Split Traffic between version 1 and version 2
  • version-1: 50%
  • version-2: 50%

# Access Application on Browser
https://myservice1-506997606680.us-central1.run.app/
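
To observe the split from a terminal instead of refreshing the browser, one option is to request the service URL repeatedly and count how often each distinct response appears. The sketch below is a minimal helper for that. `tally_versions` is a hypothetical name, and the commented usage assumes each version returns a fixed, distinct response body, so a checksum of the body identifies the version.

```shell
# Count how many times each distinct non-empty input line appears.
# Prints "<line> <count>" pairs, sorted by line.
tally_versions() {
  awk 'NF { count[$0]++ } END { for (v in count) print v, count[v] }' | sort
}

# Example (needs network access, so it is left commented out):
#   URL=https://myservice1-506997606680.us-central1.run.app/
#   for i in $(seq 1 20); do curl -s "$URL" | cksum; done | tally_versions
```

With a 50/50 split, the two counts should be roughly equal over enough requests. Small samples can look uneven.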

Step-05: Add Revision URLs

  • Add Revision URLs
  • version-1: myappv1
  • version-2: myappv2

# myappv1 Revision URL
https://myappv1---myservice1-czbx2i66ca-uc.a.run.app/

# myappv2 Revision URL
https://myappv2---myservice1-czbx2i66ca-uc.a.run.app/
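
The revision URLs above follow a visible pattern: the tag, then three hyphens, then the host of the service URL. The sketch below derives a revision URL from that pattern. `tagged_url` is a hypothetical helper, and the pattern is only taken from the URLs shown in this step.

```shell
# tagged_url TAG SERVICE_URL
# Prefix the service host with "TAG---", as in the revision URLs shown above.
tagged_url() {
  local tag="$1" service_url="$2"
  case "$service_url" in
    https://*) printf 'https://%s---%s\n' "$tag" "${service_url#https://}" ;;
    *) echo "expected an https:// service URL" >&2; return 1 ;;
  esac
}
```

For example, `tagged_url myappv1 https://myservice1-czbx2i66ca-uc.a.run.app/` prints the myappv1 revision URL listed above.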

Step-06: Deploy V3 Application with Serve this revision immediately UNCHECKED

  • Go to Cloud Run -> myservice1 -> EDIT & DEPLOY NEW REVISION
  • Deploy one revision from an existing container image:

stacksimplify/google-cloud-run:v3

  • Serve this revision immediately: UNCHECKED
  • Click on DEPLOY

# Docker Image used
stacksimplify/google-cloud-run:v3

# Access Application on Browser
https://myservice1-czbx2i66ca-uc.a.run.app

Observation: The existing revisions keep serving all traffic; V3 receives none yet.
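
A rough command-line equivalent of this console step is sketched below. It wraps `gcloud run deploy` with `--no-traffic` (deploy the revision without shifting traffic to it) and `--tag` (attach a revision URL in the same call). The function name `deploy_no_traffic` and the `GCLOUD` override variable are hypothetical conveniences for illustration, not part of gcloud.

```shell
# deploy_no_traffic SERVICE IMAGE TAG
# Deploy a new revision that receives no traffic and tag it for a revision URL.
# Set GCLOUD=echo to print the arguments instead of calling gcloud.
deploy_no_traffic() {
  local service="$1" image="$2" tag="$3"
  "${GCLOUD:-gcloud}" run deploy "$service" \
    --image="$image" \
    --no-traffic \
    --tag="$tag"
}

# Example (requires an authenticated gcloud, so it is left commented out):
#   deploy_no_traffic myservice1 stacksimplify/google-cloud-run:v3 myappv3
```

Tagging at deploy time is a design choice. The console flow in this tutorial adds the revision URL as a separate step, which works just as well.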

Add Revision URL for V3

  • version-3: myappv3

# Access Application on Browser using Revision URL
https://myappv3---myservice1-czbx2i66ca-uc.a.run.app/

Observation:
1. The service URL still serves only the earlier revisions
2. V3 is reachable through its revision URL once the tag is added

Traffic Split to V3: 10%

# Access Application on Browser using Service URL
https://myservice1-czbx2i66ca-uc.a.run.app/

Observation:
1. V1 serves 40% of traffic
2. V2 serves 50% of traffic
3. V3 serves 10% of traffic - Gradual Rollout

Step-07: Traffic Splitting

  • version-1: 33%
  • version-2: 33%
  • version-3: 34%

# Access Application on Browser using Service URL
https://myservice1-czbx2i66ca-uc.a.run.app/

Observation: Traffic splits between the 3 versions
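
When a split is spelled out for every revision, as in this step, it is easy to mistype a number so that the percentages no longer total 100. The sketch below builds a `--to-tags` argument for `gcloud run services update-traffic` from TAG=PERCENT pairs and refuses totals other than 100. `build_to_tags` is a hypothetical helper, and the tags are the revision URL tags added in the earlier steps.

```shell
# build_to_tags TAG=PERCENT [TAG=PERCENT ...]
# Print a --to-tags argument, or fail if the percentages do not total 100.
build_to_tags() {
  local total=0
  local joined=""
  local pair pct
  for pair in "$@"; do
    pct="${pair#*=}"
    case "$pct" in
      # Reject empty values, non-digits, and leading zeros such as "08".
      ''|*[!0-9]*|0?*) echo "invalid percentage in '$pair'" >&2; return 1 ;;
    esac
    total=$((total + pct))
    joined="${joined:+$joined,}$pair"
  done
  if [ "$total" -ne 100 ]; then
    echo "percentages add up to $total, expected 100" >&2
    return 1
  fi
  printf '%s\n' "--to-tags=$joined"
}

# Example (requires an authenticated gcloud, so it is left commented out):
#   gcloud run services update-traffic myservice1 \
#     "$(build_to_tags myappv1=33 myappv2=33 myappv3=34)"
```

The total-of-100 check is a guard chosen for this sketch. It is stricter than what gcloud itself may accept.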

Step-08: Verify Additional Tabs

  • Verify the Logs tab
  • Verify the Metrics tab
  • Verify the Security tab

Google Cloud Run Services – Autoscaling

Fast request-based auto scaling

  • Minimum instances: defaults to zero; set it to 1 to reduce cold starts
  • Maximum instances: scales out to 1000 instances, or more with a quota increase request

Scale to zero and minimum instances

  • When minimum instances is set to zero and there are no requests, the number of active instances drops to zero
  • A new instance is created as soon as a request comes in
  • This cold start negatively impacts the response time of the first request

Step-09: Cloud Run Autoscaling

  • Minimum number of instances
  • Maximum number of instances
  • Cold Starts

You can control autoscaling behavior with min/max instance limits, while cold starts occur when new instances spin up from zero.
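
From the command line, these limits can be set with the `--min-instances` and `--max-instances` flags of `gcloud run services update`. The sketch below only builds the two flags and checks that the minimum does not exceed the maximum. `scaling_flags` is a hypothetical helper, and the values 1 and 10 in the example are arbitrary.

```shell
# scaling_flags MIN MAX
# Print the autoscaling flags, or fail if the values are not sensible.
scaling_flags() {
  local min="$1" max="$2"
  local n
  for n in "$min" "$max"; do
    case "$n" in
      ''|*[!0-9]*) echo "instance counts must be non-negative integers" >&2; return 1 ;;
    esac
  done
  if [ "$min" -gt "$max" ]; then
    echo "min instances ($min) exceeds max instances ($max)" >&2
    return 1
  fi
  printf '%s\n' "--min-instances=$min --max-instances=$max"
}

# Example (requires an authenticated gcloud, so it is left commented out).
# The substitution is deliberately unquoted so it expands to two flags:
#   gcloud run services update myservice1 $(scaling_flags 1 10)
```

Keeping one instance warm trades a small ongoing cost for faster first responses. A minimum of zero keeps idle cost at zero but accepts cold starts.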

Google Cloud Run Services – Access Modes

Ingress Control

  • Public Service: allows direct access from the internet
  • Private Service: allows traffic from the VPC and from external Application Load Balancers
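
With gcloud, ingress is controlled by the `--ingress` flag, whose values are `all`, `internal`, and `internal-and-cloud-load-balancing`. The sketch below maps the access modes described above onto those values. The mode names `public`, `vpc`, and `vpc-and-lb` and the function `ingress_flag` are hypothetical labels chosen for this example, and the mapping is an approximation of the descriptions above.

```shell
# ingress_flag MODE
# Translate an access mode into the matching --ingress flag.
ingress_flag() {
  case "$1" in
    public)     echo "--ingress=all" ;;
    vpc)        echo "--ingress=internal" ;;
    vpc-and-lb) echo "--ingress=internal-and-cloud-load-balancing" ;;
    *) echo "unknown access mode: $1" >&2; return 1 ;;
  esac
}

# Example (requires an authenticated gcloud, so it is left commented out):
#   gcloud run services update myservice1 "$(ingress_flag vpc-and-lb)"
```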

Authentication

  • Unauthenticated access (public API or website)
  • Authenticated access using Cloud Identity-Aware Proxy (secure access via web or mobile clients)

Step-10: gcloud: Create Google Cloud Run Service

# gcloud Project Settings
gcloud config list
PROJECT_ID=gcpdemos   # replace with [YOUR-PROJECT-ID]
REGION=us-central1
gcloud config set core/project $PROJECT_ID
gcloud config set run/region $REGION
gcloud config list

# Help
gcloud run services --help
gcloud run deploy --help

# List Cloud Run Services
gcloud run services list

# Create Google Cloud Run Service
gcloud run deploy myservice102 \
  --image=stacksimplify/google-cloud-run:v1 \
  --allow-unauthenticated \
  --port=80

# List Cloud Run Services
gcloud run services list

# Describe Cloud Run Service
gcloud run services describe myservice102 

Step-11: gcloud: List and Describe Revisions

# Help 
gcloud run revisions --help

# List Revisions
gcloud run revisions list

# Describe Revision (pass a revision name from the list output)
gcloud run revisions describe myservice102-00001-2rk

Step-12: gcloud: Update Application

# Help
gcloud run services update --help

# Update Application 
gcloud run services update myservice102 --image=stacksimplify/google-cloud-run:v2

# List Revisions
gcloud run revisions list 

# Describe Revision (the update created a second revision)
gcloud run revisions describe myservice102-00002-xgl

Step-13: gcloud: Update Traffic

# Help
gcloud run services update-traffic --help

# List Revisions
gcloud run revisions list 

# Set Tags (Add Revision URLs)
gcloud run services update-traffic myservice102 \
  --set-tags=myappv1=myservice102-00001-2rk,myappv2=myservice102-00002-xgl

# Update Traffic - V1-50%, V2-50%
gcloud run services update-traffic myservice102 \
  --to-revisions=myservice102-00001-2rk=50,myservice102-00002-xgl=50

# You can also refer to the current or future latest revision in --to-revisions by the string "LATEST".
# To make 10% of traffic always float to the latest revision, first send all traffic to V1,
# then assign 10% to LATEST:
gcloud run services update-traffic myservice102 \
  --to-revisions=myservice102-00001-2rk=100,myservice102-00002-xgl=0
gcloud run services update-traffic myservice102 --to-revisions=LATEST=10

# To assign 100% of traffic to the current or future LATEST revision, run:
gcloud run services update-traffic myservice102 --to-latest

Step-14: gcloud: Delete Cloud Run Service

# List Cloud Run Services
gcloud run services list 

# Delete Cloud Run Service
gcloud run services delete myservice102 
gcloud run services delete myservice1