Deploy on GCP Cloud Run

This guide covers deploying the Hiya Voice Verification engine on Google Cloud Run. Cloud Run provides a fully managed serverless platform for running containers.

Prerequisites

  • Container image pulled and authenticated — see Getting the Container Image
  • A valid API_KEY from your Hiya account team
  • gcloud CLI installed and configured
  • A GCP project with Cloud Run API enabled

Important Considerations

Before choosing Cloud Run, be aware of these constraints:

Consideration | Details
--- | ---
gRPC support | Cloud Run supports gRPC, but only unary RPCs are fully supported on the default endpoint. Streaming RPCs require HTTP/2 end-to-end.
Cold starts | The engine takes up to 30 seconds to load models. Set minimum instances to avoid cold starts in production.
Memory | The engine requires at least 8 GB of memory (4 GB for model loading + engine overhead).
In-memory volumes | Cloud Run supports in-memory volumes, which serve the same purpose as tmpfs for model storage.

Step 1 — Copy the Image to Artifact Registry (Optional)

To host a copy in your own Artifact Registry, whether for lower pull latency, for compliance, or so that Cloud Run can pull the image (see Registry Authentication for Cloud Run):

# Authenticate with both registries
cat key.json | docker login -u _json_key --password-stdin europe-docker.pkg.dev
gcloud auth configure-docker <your-region>-docker.pkg.dev

# Copy the image
docker pull europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version>
docker tag europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version> \
  <your-region>-docker.pkg.dev/<your-project>/hiya/engine-api-standalone:<version>
docker push <your-region>-docker.pkg.dev/<your-project>/hiya/engine-api-standalone:<version>
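
The push above assumes the target repository already exists in your project. A minimal sketch of creating it first, assuming the repository name hiya used in the image paths above; the function name create_hiya_repo is illustrative and not part of any Hiya tooling:

```shell
# Create a Docker-format Artifact Registry repository to hold the copied image.
# Usage: create_hiya_repo <region> <project>
# The repository name "hiya" matches the image paths used in Step 1.
create_hiya_repo() {
  gcloud artifacts repositories create hiya \
    --repository-format=docker \
    --location="${1:?region required}" \
    --project="${2:?project required}"
}
```

Call it once before the first push, for example create_hiya_repo <your-region> <your-project>.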

Step 2 — Deploy to Cloud Run

# Deploy the copy pushed in Step 1. Cloud Run cannot pull the Hiya-hosted
# image directly (see Registry Authentication for Cloud Run).
gcloud run deploy hiya-voice-verification \
  --image=<your-region>-docker.pkg.dev/<your-project>/hiya/engine-api-standalone:<version> \
  --port=8080 \
  --memory=8Gi \
  --cpu=4 \
  --min-instances=1 \
  --max-instances=10 \
  --no-allow-unauthenticated \
  --set-env-vars="API_KEY=<your-api-key>" \
  --execution-environment=gen2 \
  --use-http2 \
  --region=<your-region>

Key flags:

  • --use-http2 enables HTTP/2 end-to-end, required for streaming gRPC (see Important Considerations)
  • --min-instances=1 avoids cold starts
  • --execution-environment=gen2 provides full Linux compatibility and in-memory filesystem support
  • --no-allow-unauthenticated restricts access to authenticated clients
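
Once the deploy command returns, the service URL is needed by clients and for later checks. A small sketch of reading it back with gcloud, assuming the service name from the command above; the helper name hiya_service_url is illustrative:

```shell
# Print the HTTPS URL that Cloud Run assigned to the service.
# Usage: hiya_service_url <region>
hiya_service_url() {
  gcloud run services describe hiya-voice-verification \
    --region="${1:?region required}" \
    --format='value(status.url)'
}
```

gRPC clients connect to the host part of this URL on port 443.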

Step 3 — Add an In-Memory Volume

To mount an in-memory volume (equivalent to tmpfs) for model storage, use a Cloud Run YAML service definition:

# hiya-cloud-run-service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hiya-voice-verification
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/execution-environment: gen2
    spec:
      containerConcurrency: 1
      containers:
        # Image copied to your own Artifact Registry in Step 1
        - image: <your-region>-docker.pkg.dev/<your-project>/hiya/engine-api-standalone:<version>
          ports:
            # The port name h2c enables HTTP/2 end-to-end
            - containerPort: 8080
              name: h2c
          env:
            - name: API_KEY
              value: "<your-api-key>"
          resources:
            limits:
              memory: 8Gi
              cpu: "4"
          volumeMounts:
            - name: models-tmpfs
              mountPath: /opt/loccus/models
      volumes:
        # In-memory volume (tmpfs equivalent) for model storage
        - name: models-tmpfs
          emptyDir:
            medium: Memory
            sizeLimit: 4Gi

Deploy with:

gcloud run services replace hiya-cloud-run-service.yaml --region=<your-region>
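
After the replace completes, it can be worth confirming that the deployed configuration actually carries the in-memory volume. A minimal sketch, assuming the service name used above and that the exported YAML contains the same medium: Memory line as the service definition; the helper name hiya_has_memory_volume is illustrative:

```shell
# Exit with status 0 if the deployed service declares an in-memory volume.
# Usage: hiya_has_memory_volume <region>
hiya_has_memory_volume() {
  # --format=export prints the service configuration as YAML.
  gcloud run services describe hiya-voice-verification \
    --region="${1:?region required}" \
    --format=export | grep -q 'medium: Memory'
}
```

For example: hiya_has_memory_volume <your-region> && echo "in-memory volume configured".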

Step 4 — Configure Authentication

Cloud Run uses IAM for access control. Grant the roles/run.invoker role to the service accounts or users that need to call the engine:

gcloud run services add-iam-policy-binding hiya-voice-verification \
  --region=<your-region> \
  --member="serviceAccount:<client-sa>@<project>.iam.gserviceaccount.com" \
  --role="roles/run.invoker"

Clients must send an identity token with every gRPC request, as an authorization: Bearer <token> metadata entry. See Authenticating service-to-service for details.
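
For a quick manual check from a workstation, the token can be fetched with gcloud and passed as request metadata. The sketch below uses grpcurl, a third-party gRPC command-line client that is not part of the Hiya distribution. It assumes the active gcloud account holds roles/run.invoker and that the engine exposes gRPC server reflection, which this guide does not confirm; the helper name hiya_grpc_list is illustrative:

```shell
# List the gRPC services exposed by the deployed engine, authenticating with
# an identity token. Usage: hiya_grpc_list <region>
hiya_grpc_list() {
  # Resolve the service URL and strip the scheme to get the gRPC host.
  url="$(gcloud run services describe hiya-voice-verification \
    --region="${1:?region required}" \
    --format='value(status.url)')"
  host="${url#https://}"
  # Identity token for the active gcloud account.
  token="$(gcloud auth print-identity-token)"
  # "list" relies on server reflection (an assumption about the engine).
  grpcurl -H "authorization: Bearer ${token}" "${host}:443" list
}
```

If reflection is not available, the same -H header can be used with a concrete method call and the engine's .proto files.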

Network Configuration

For the engine to reach the Hiya billing platform, ensure outbound access from Cloud Run:

Destination | Port | Protocol | Purpose
--- | --- | --- | ---
api.hiya.com | 443 | HTTPS | License verification and billing

Cloud Run has outbound internet access by default. If you route egress through a VPC connector, make sure that network still has a route to api.hiya.com on port 443.
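
If the service is attached to a VPC connector, one option is to send only private-range traffic through the connector, so that public destinations such as api.hiya.com keep using Cloud Run's default internet egress. A minimal sketch, assuming this is compatible with your network policy; the helper name hiya_public_egress is illustrative:

```shell
# Route only private IP ranges through the VPC connector; traffic to public
# destinations leaves through Cloud Run's default internet egress.
# Usage: hiya_public_egress <region>
hiya_public_egress() {
  gcloud run services update hiya-voice-verification \
    --region="${1:?region required}" \
    --vpc-egress=private-ranges-only
}
```

If policy requires all traffic to pass through the VPC instead, the VPC itself must provide the path to the public internet.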

Registry Authentication for Cloud Run

Cloud Run can natively pull from any Artifact Registry repository that the deploying service account has access to. However, because the Hiya image is hosted in a separate GCP project and is accessed with a JSON key, Cloud Run cannot pull it directly.

The recommended approach is to copy the image to your own Artifact Registry as described in Step 1. This also improves pull latency by keeping the image in your own project and region.