Deploy on GCP Cloud Run
This guide covers deploying the Hiya Voice Verification engine on Google Cloud Run. Cloud Run provides a fully managed serverless platform for running containers.
Prerequisites
- Container image pulled and authenticated — see Getting the Container Image
- A valid API_KEY from your Hiya account team
- gcloud CLI installed and configured
- A GCP project with the Cloud Run API enabled
Important Considerations
Before choosing Cloud Run, be aware of these constraints:
| Consideration | Details |
|---|---|
| gRPC support | Cloud Run supports gRPC, but only unary RPCs are fully supported on the default endpoint. Streaming RPCs require HTTP/2 end-to-end. |
| Cold starts | The engine takes up to 30 seconds to load models. Set minimum instances to avoid cold starts in production. |
| Memory | The engine requires at least 8 GB of memory (4 GB for model loading + engine overhead). |
| In-memory volumes | Cloud Run supports in-memory volumes, which serve the same purpose as tmpfs for model storage. |
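The memory and in-memory volume rows interact. As a rough sketch, assuming that data written to an in-memory volume counts against the container's memory limit (as with tmpfs generally), the headroom left for the engine process can be estimated from the two values used later in this guide. The variable names are illustrative only.

```shell
# Values taken from the service definition in Step 3 of this guide
memory_limit_gib=8   # resources.limits.memory
volume_size_gib=4    # emptyDir.sizeLimit for the model volume

# Assumption: the in-memory volume is charged against the container memory limit
headroom_gib=$((memory_limit_gib - volume_size_gib))
echo "Estimated headroom for the engine process: ${headroom_gib} GiB"
```

If you raise the volume size limit, raise the memory limit by the same amount to keep this headroom.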
Step 1 — Copy the Image to Artifact Registry (Optional)
If you want to host a copy in your own Artifact Registry for lower latency or compliance:
# Authenticate with both registries
cat key.json | docker login -u _json_key --password-stdin europe-docker.pkg.dev
gcloud auth configure-docker <your-region>-docker.pkg.dev
# Copy the image
docker pull europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version>
docker tag europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version> \
<your-region>-docker.pkg.dev/<your-project>/hiya/engine-api-standalone:<version>
docker push <your-region>-docker.pkg.dev/<your-project>/hiya/engine-api-standalone:<version>
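Artifact Registry does not create repositories on push, so the destination repository has to exist before the docker push above will succeed. The sketch below composes the destination image reference and prints (without running) a command that would create the repository. The helper names, the sample region, project, and version, and the repository name `hiya` (inferred from the destination path above) are assumptions for illustration.

```shell
# Hypothetical helper: compose the destination image reference from
# region ($1), project ($2), and version ($3).
target_image() {
  printf '%s-docker.pkg.dev/%s/hiya/engine-api-standalone:%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: print, but do not execute, the repository creation
# command for region ($1) and project ($2).
print_create_repo_cmd() {
  printf 'gcloud artifacts repositories create hiya --repository-format=docker --location=%s --project=%s\n' "$1" "$2"
}

# Sample values only; substitute your own region, project, and version.
target_image europe-west1 my-project 1.0.0
print_create_repo_cmd europe-west1 my-project
```

Review the printed command and run it once per project and region before pushing.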
Step 2 — Deploy to Cloud Run
gcloud run deploy hiya-voice-verification \
--image=europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version> \
--port=8080 \
--memory=8Gi \
--cpu=4 \
--min-instances=1 \
--max-instances=10 \
--no-allow-unauthenticated \
--set-env-vars="API_KEY=<your-api-key>" \
--execution-environment=gen2 \
--use-http2 \
--region=<your-region>
Key flags:
- --use-http2 enables HTTP/2 end-to-end, required for gRPC
- --min-instances=1 avoids cold starts
- --execution-environment=gen2 provides full Linux compatibility and in-memory filesystem support
- --no-allow-unauthenticated restricts access to authenticated clients
Step 3 — Add an In-Memory Volume
To mount an in-memory volume (equivalent to tmpfs) for model storage, use a Cloud Run YAML service definition:
# hiya-cloud-run-service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hiya-voice-verification
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/execution-environment: gen2
    spec:
      containerConcurrency: 1
      containers:
        - image: europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version>
          ports:
            - containerPort: 8080
              name: h2c
          env:
            - name: API_KEY
              value: "<your-api-key>"
          resources:
            limits:
              memory: 8Gi
              cpu: "4"
          volumeMounts:
            - name: models-tmpfs
              mountPath: /opt/loccus/models
      volumes:
        - name: models-tmpfs
          emptyDir:
            medium: Memory
            sizeLimit: 4Gi
Deploy with:
gcloud run services replace hiya-cloud-run-service.yaml --region=<your-region>
Step 4 — Configure Authentication
Cloud Run uses IAM for access control. Grant the roles/run.invoker role to the service accounts or users that need to call the engine:
gcloud run services add-iam-policy-binding hiya-voice-verification \
--region=<your-region> \
--member="serviceAccount:<client-sa>@<project>.iam.gserviceaccount.com" \
--role="roles/run.invoker"
Clients must include an identity token in gRPC requests. See Authenticating service-to-service for details.
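As a minimal sketch of what "include an identity token" means on the wire, the token is sent as a bearer credential in the `authorization` metadata header of each call. The helper name and the placeholder token below are hypothetical. In practice a token for your own user account can be obtained with `gcloud auth print-identity-token`.

```shell
# Hypothetical helper: format the gRPC metadata header from an identity token ($1).
auth_header() {
  printf 'authorization: Bearer %s\n' "$1"
}

# Placeholder so the snippet runs without credentials.
# A real token would come from: gcloud auth print-identity-token
token="<identity-token>"
auth_header "$token"
```

A command-line client such as grpcurl accepts this header through its -H flag. The service host and the methods the engine exposes are not covered in this guide, so treat any concrete invocation as something to confirm against the engine's API documentation.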
Network Configuration
For the engine to reach the Hiya billing platform, ensure outbound access from Cloud Run:
| Destination | Port | Protocol | Purpose |
|---|---|---|---|
| api.hiya.com | 443 | HTTPS | License verification and billing |
Cloud Run has outbound internet access by default. If using a VPC connector with private routing, ensure the route to api.hiya.com is available.
Registry Authentication for Cloud Run
Cloud Run can natively pull from any Artifact Registry repository that the deploying service account can access. Because the Hiya image is hosted in a separate GCP project, Cloud Run cannot pull it directly.
The recommended approach is to copy the image to your own Artifact Registry as described in Step 1, then reference that copy in the --image flag in Step 2 and in the image field in Step 3. This also improves pull latency by keeping the image in your own project and region.