Deploy on GCP Cloud Run

This guide covers deploying the Hiya Voice Verification engine on Google Cloud Run. Cloud Run provides a fully managed serverless platform for running containers.

Prerequisites

Registry access authenticated and the container image pulled (see Getting the Container Image)
Runtime configuration values for API_KEY, ORG_HANDLE, PLATFORM_REGION, and MIN_ALLOCATION
gcloud CLI installed and configured
A GCP project with the Cloud Run API enabled
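
If the Cloud Run API is not yet enabled, it can be turned on from the CLI. The following is a minimal sketch, not an official Hiya procedure: the project ID `my-project` and the function name `enable_apis` are placeholders chosen for illustration, and the Artifact Registry API is included only because Step 1 optionally uses it.

```shell
# Assumed project ID; replace with your own.
PROJECT="my-project"

# Enable the Cloud Run API, plus Artifact Registry for the optional image copy.
enable_apis() {
  gcloud services enable run.googleapis.com artifactregistry.googleapis.com \
    --project="$PROJECT"
}
```

Run `enable_apis` once per project; enabling an API that is already enabled is harmless.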

Important Considerations

Before choosing Cloud Run, be aware of these constraints:

Consideration	Details
WebSocket support	Cloud Run supports WebSocket connections. The engine also uses gRPC on port 8080 for health checks, which requires HTTP/2 end-to-end.
Cold starts	The engine takes up to 30 seconds to load models. Set minimum instances to avoid cold starts in production.
Memory	The engine keeps its ML models loaded in memory, so the instance needs enough memory for both; contact Hiya for sizing guidance.
In-memory volumes	Cloud Run supports in-memory volumes, which serve the same purpose as tmpfs for model storage.

Step 1 — Copy the Image to Artifact Registry (Optional)

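Copy the image to your own Artifact Registry if you want lower pull latency, need it for compliance, or find that Cloud Run cannot pull from the Hiya registry (see Registry Authentication for Cloud Run):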

# Authenticate with both registries
cat key.json | docker login -u _json_key --password-stdin europe-docker.pkg.dev
gcloud auth configure-docker <your-region>-docker.pkg.dev

# Copy the image
docker pull europe-docker.pkg.dev/loccus-platform/onpremise-images/hiya-voice-verification:<version>
docker tag europe-docker.pkg.dev/loccus-platform/onpremise-images/hiya-voice-verification:<version> \
  <your-region>-docker.pkg.dev/<your-project>/hiya/hiya-voice-verification:<version>
docker push <your-region>-docker.pkg.dev/<your-project>/hiya/hiya-voice-verification:<version>
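
The push above assumes that a Docker-format repository named hiya already exists in your project. A minimal sketch for creating it follows; the region `europe-west1`, the project ID `my-project`, and the helper names `create_hiya_repo` and `target_image` are assumptions for illustration, not values from Hiya.

```shell
# Assumed values; replace with your own region and project ID.
REGION="europe-west1"
PROJECT="my-project"

# Create the Docker-format repository "hiya" used in the tag and push commands.
create_hiya_repo() {
  gcloud artifacts repositories create hiya \
    --repository-format=docker \
    --location="$REGION" \
    --project="$PROJECT"
}

# Print the fully qualified reference of the copied image for a version tag.
target_image() {
  printf '%s-docker.pkg.dev/%s/hiya/hiya-voice-verification:%s\n' \
    "$REGION" "$PROJECT" "$1"
}
```

Call `create_hiya_repo` once before `docker push`. `target_image <version>` prints the reference to use as the image in the later steps if you deploy from your own copy.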

Step 2 — Deploy to Cloud Run

gcloud run deploy hiya-voice-verification \
  --image=europe-docker.pkg.dev/loccus-platform/onpremise-images/hiya-voice-verification:<version> \
  --port=8080 \
  --memory=8Gi \
  --cpu=4 \
  --min-instances=1 \
  --max-instances=10 \
  --no-allow-unauthenticated \
  --set-env-vars="API_KEY=<your-api-key>,ORG_HANDLE=<your-org-handle>,PLATFORM_REGION=<eu-or-us>,MIN_ALLOCATION=1m" \
  --execution-environment=gen2 \
  --use-http2 \
  --region=<your-region>

Key flags:

--use-http2 enables HTTP/2 end-to-end, required for gRPC health checks
--min-instances=1 avoids cold starts
--execution-environment=gen2 provides full Linux compatibility and in-memory filesystem support
--no-allow-unauthenticated restricts access to authenticated clients
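
After the deploy finishes, you may want the URL that Cloud Run assigned to the service. The sketch below reads it from the service description; the region value and the function name `service_url` are assumptions for illustration.

```shell
# Assumed region; use the same value you passed to --region when deploying.
REGION="europe-west1"

# Print the HTTPS URL that Cloud Run assigned to the service.
service_url() {
  gcloud run services describe hiya-voice-verification \
    --region="$REGION" \
    --format='value(status.url)'
}
```

Because the service was deployed with --no-allow-unauthenticated, requests to this URL are rejected until the caller is granted access as described in Step 4.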

Step 3 — Add an In-Memory Volume

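An in-memory volume (equivalent to tmpfs) can be mounted for model storage. Its contents count against the container's memory limit, so size that limit to cover both the engine and the volume. Define the volume in a Cloud Run YAML service definition: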

# hiya-cloud-run-service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hiya-voice-verification
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/execution-environment: gen2
    spec:
      # Each instance handles one request (or connection) at a time
      containerConcurrency: 1
      containers:
        - image: europe-docker.pkg.dev/loccus-platform/onpremise-images/hiya-voice-verification:<version>
          ports:
            - containerPort: 8080
              name: h2c
          env:
            - name: API_KEY
              value: "<your-api-key>"
            - name: ORG_HANDLE
              value: "<your-org-handle>"
            - name: PLATFORM_REGION
              value: "<eu-or-us>"
            - name: MIN_ALLOCATION
              value: "1m"
          resources:
            limits:
              memory: 8Gi
              cpu: "4"
          volumeMounts:
            - name: models-tmpfs
              mountPath: /opt/loccus/models
      volumes:
        - name: models-tmpfs
          emptyDir:
            medium: Memory
            sizeLimit: 8Gi

Deploy with:

gcloud run services replace hiya-cloud-run-service.yaml --region=<your-region>
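
To confirm that the volume was applied, one option is to export the live service definition and look for the in-memory emptyDir entry. This is a sketch under assumptions: the region value and the function name `has_memory_volume` are hypothetical, and the check relies on the exported YAML containing the same `medium: Memory` line as the file above.

```shell
# Assumed region; use the same value you passed to --region when deploying.
REGION="europe-west1"

# Succeed if the deployed service definition contains an in-memory volume.
has_memory_volume() {
  gcloud run services describe hiya-voice-verification \
    --region="$REGION" \
    --format=export | grep -q 'medium: Memory'
}
```

For example, `has_memory_volume && echo "in-memory volume configured"` prints the message only when the entry is found.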

Step 4 — Configure Authentication

Cloud Run uses IAM for access control. Grant the roles/run.invoker role to the service accounts or users that need to call the engine:

gcloud run services add-iam-policy-binding hiya-voice-verification \
  --region=<your-region> \
  --member="serviceAccount:<client-sa>@<project>.iam.gserviceaccount.com" \
  --role="roles/run.invoker"

Clients must include an identity token in requests. See Authenticating service-to-service for details.
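
As a quick check of the IAM binding, a caller with gcloud credentials can mint an identity token and send it as a bearer token. The sketch below is illustrative only: `SERVICE_URL` is a made-up placeholder, the function names are hypothetical, and the request targets the service root because the engine's own request paths are not covered in this guide.

```shell
# Assumed service URL; replace with the URL Cloud Run assigned to your service.
SERVICE_URL="https://hiya-voice-verification-example.a.run.app"

# Build the Authorization header from the active gcloud identity.
auth_header() {
  printf 'Authorization: Bearer %s' "$(gcloud auth print-identity-token)"
}

# Print only the HTTP status code of an authenticated request to the service root.
check_invoker_access() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    --header "$(auth_header)" "$SERVICE_URL"
}
```

A 401 or 403 status from `check_invoker_access` generally means Cloud Run rejected the caller before the request reached the engine, which points to a missing roles/run.invoker binding or an invalid token.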

Network Configuration

For the engine to reach the Hiya billing platform, ensure outbound access from Cloud Run:

Destination	Port	Protocol	Purpose
`api.hiya.com`	443	HTTPS	License verification and billing

Cloud Run has outbound internet access by default. If you route egress through a VPC connector with private routing, make sure the VPC provides a route to api.hiya.com on port 443, for example through Cloud NAT.

Registry Authentication for Cloud Run

Cloud Run can pull natively from any Artifact Registry repository that the deploying service account can read. Because the Hiya image is hosted in a separate GCP project, your Cloud Run deployment cannot pull it directly.

The recommended approach is to copy the image to your own Artifact Registry as described in Step 1, then use the path of that copy as the image in Step 2 and Step 3. This also improves pull latency by keeping the image in your own project and region.
