Deploy on GCP GKE

This guide covers deploying the Hiya Voice Verification engine on Google Kubernetes Engine (GKE). It builds on the generic Kubernetes guide with GKE-specific configuration.

Prerequisites

A GKE cluster (Standard or Autopilot) with kubectl configured
Authenticated access to the container image (see Getting the Container Image)
Runtime configuration values for API_KEY, ORG_HANDLE, PLATFORM_REGION, and MIN_ALLOCATION
gcloud CLI installed

Recommended Machine Types

We recommend machines based on Intel Emerald Rapids processors for optimal performance:

Machine Series	Category	Notes
N4	General-purpose	DDR5 memory
C4	Compute-optimized	Emerald Rapids CPUs available
M4	Memory-optimized	Large RAM capacity

Ensure your node pool instances have sufficient RAM per pod to hold the engine and ML models in memory. Contact Hiya for sizing guidance.

GKE Standard vs Autopilot

	Standard	Autopilot
Node management	You manage node pools	Google manages nodes
Instance type control	Full control	Indirect, via pod resource requests
tmpfs (emptyDir Memory)	Supported	Supported (counts against pod memory)
Best for	Maximum control	Simpler operations

Both modes work with the Hiya engine. If using Autopilot, ensure your pod resource requests account for the emptyDir memory volume used for model storage. Contact Hiya for sizing guidance.
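
As a rough illustration of that accounting, the sketch below adds the size of the in-memory model volume to the engine's own working memory to arrive at a pod memory request. The function name and the numbers are hypothetical placeholders, not Hiya sizing guidance.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pod memory request (MiB) =
#   engine working memory + size of the emptyDir (medium: Memory) volume.
# Both inputs are placeholders; obtain real figures from Hiya.
pod_memory_request_mib() {
  local engine_mib="$1"   # memory the engine process itself needs
  local tmpfs_mib="$2"    # size of the in-memory model volume
  echo $(( engine_mib + tmpfs_mib ))
}

# Example with made-up numbers: 2048 MiB engine + 1024 MiB model volume
echo "memory request: $(pod_memory_request_mib 2048 1024)Mi"
```

Because files written to a memory-backed emptyDir count toward the pod's memory usage, a request that covers only the engine process may lead to the pod being OOM-killed once the models are loaded.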

Step 1 — Authenticate with the Registry

Because the cluster already runs on GCP, Workload Identity is a possible alternative to image pull secrets. However, the image is hosted in a Hiya-managed project, so the pull secret method is still the simplest approach.

Option A — Image Pull Secret (recommended)

kubectl create secret docker-registry hiya-registry \
  --docker-server=europe-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)"
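
To confirm the secret was created with the expected type, a small check like the following may help. This is a minimal sketch: the function name `check_pull_secret` is hypothetical, and it assumes `kubectl` points at the namespace where the secret was created.

```shell
#!/usr/bin/env bash
# Returns 0 if the named secret exists and has the type produced by
# `kubectl create secret docker-registry`; returns non-zero otherwise.
check_pull_secret() {
  local name="$1" secret_type
  secret_type="$(kubectl get secret "$name" -o jsonpath='{.type}')" || return 1
  [ "$secret_type" = "kubernetes.io/dockerconfigjson" ]
}

# Usage (requires cluster access):
#   check_pull_secret hiya-registry && echo "pull secret looks valid"
```

The pod spec must also reference the secret under `imagePullSecrets` for it to take effect; check whether the manifest in the Kubernetes guide already does so.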

Option B — gcloud Credential Helper

If you prefer to configure registry access at the node level:

gcloud auth activate-service-account --key-file=key.json
gcloud auth configure-docker europe-docker.pkg.dev

This works for Standard clusters where you control the node configuration. For Autopilot, use Option A.

Step 2 — Create Secrets and Deploy

Store the runtime configuration in a Secret, then apply the deployment:

kubectl create secret generic hiya-engine-config \
  --from-literal=api-key=<your-api-key> \
  --from-literal=org-handle=<your-org-handle> \
  --from-literal=platform-region=<eu-or-us> \
  --from-literal=min-allocation=1m
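
Before creating the secret, it can be worth checking the values for obvious mistakes. The sketch below is a hypothetical pre-flight helper; it assumes, based on the `<eu-or-us>` placeholder above, that the platform region is literally `eu` or `us`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the values passed to `kubectl create secret`.
# Prints a message to stderr and returns non-zero on the first problem found.
validate_engine_config() {
  local api_key="$1" org_handle="$2" region="$3" min_allocation="$4"
  [ -n "$api_key" ]        || { echo "api-key is empty" >&2; return 1; }
  [ -n "$org_handle" ]     || { echo "org-handle is empty" >&2; return 1; }
  [ -n "$min_allocation" ] || { echo "min-allocation is empty" >&2; return 1; }
  case "$region" in
    eu|us) ;;  # assumed valid values
    *) echo "platform-region must be 'eu' or 'us', got '$region'" >&2; return 1 ;;
  esac
}

# Usage:
#   validate_engine_config "$API_KEY" "$ORG_HANDLE" "$PLATFORM_REGION" "$MIN_ALLOCATION" \
#     && echo "values look plausible"
```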

Apply the Deployment and Service manifests from the Kubernetes guide. No GKE-specific changes are needed.

Step 3 — Expose via Internal Load Balancer (Optional)

For clients outside the cluster but inside your VPC network, expose the engine through a GKE internal load balancer. The Service below creates a Layer 4 (TCP) load balancer, which passes both WebSocket and gRPC health check traffic through unchanged:

# hiya-ilb-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: hiya-voice-verification-ilb
  annotations:
    networking.gke.io/load-balancer-type: Internal
spec:
  type: LoadBalancer
  selector:
    app: hiya-voice-verification
  ports:
    - name: health
      protocol: TCP
      port: 8080
      targetPort: 8080
    - name: ws
      protocol: TCP
      port: 8081
      targetPort: 8081
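
After applying the manifest, GKE can take a short while to assign the internal IP. The following sketch polls the Service until an address appears; the function name, attempt count, and delay are illustrative choices, not part of the Hiya tooling.

```shell
#!/usr/bin/env bash
# Polls the Service until GKE assigns an internal IP, then prints it.
# Arguments: service name, max attempts (default 30), delay in seconds (default 10).
wait_for_ilb_ip() {
  local svc="$1" attempts="${2:-30}" delay="${3:-10}" ip i
  for (( i = 0; i < attempts; i++ )); do
    ip="$(kubectl get service "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)"
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    sleep "$delay"
  done
  echo "no IP assigned to $svc after $attempts attempts" >&2
  return 1
}

# Usage (requires cluster access):
#   kubectl apply -f hiya-ilb-service.yaml
#   wait_for_ilb_ip hiya-voice-verification-ilb
```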

Network Configuration

GKE nodes need outbound access to:

Destination	Port	Protocol	Purpose
`europe-docker.pkg.dev`	443	HTTPS	Image pulls (typically already allowed on GCP)
`api.hiya.com`	443	HTTPS	License verification and billing

For private clusters with no external IP on nodes, ensure Cloud NAT is configured for outbound internet access.
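
If Cloud NAT is not yet in place, it can be set up roughly as follows. This is a sketch under stated assumptions: the resource names `hiya-nat-router` and `hiya-nat` are placeholders, and the network and region arguments must match the VPC and region of your cluster.

```shell
#!/usr/bin/env bash
# Creates a Cloud Router and a Cloud NAT gateway that covers all subnets
# of the given network in the given region.
# "hiya-nat-router" and "hiya-nat" are placeholder resource names.
create_cloud_nat() {
  local network="$1" region="$2"
  gcloud compute routers create hiya-nat-router \
    --network="$network" \
    --region="$region" || return 1
  gcloud compute routers nats create hiya-nat \
    --router=hiya-nat-router \
    --region="$region" \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges
}

# Usage (requires gcloud credentials with network admin rights):
#   create_cloud_nat my-vpc europe-west1
```

If your organization already operates a NAT gateway or restricts NAT to specific subnets, adapt the flags rather than creating a second gateway.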

Scaling

Use the Horizontal Pod Autoscaler to scale pods and GKE's Cluster Autoscaler to scale nodes. The engine is stateless, so scaling out only requires increasing the replica count.

kubectl autoscale deployment hiya-voice-verification \
  --cpu-percent=50 \
  --min=1 \
  --max=10
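
To get a feel for how the autoscaler reacts to these settings, the sketch below applies the replica formula from the Kubernetes documentation, desired = ceil(current replicas × current CPU % / target CPU %), clamped to the configured minimum and maximum. It ignores the autoscaler's tolerance band and stabilization window, so treat it as an approximation; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# desired = ceil(current_replicas * current_cpu_percent / target_cpu_percent),
# clamped to [min, max]. Integer arithmetic only.
desired_replicas() {
  local current="$1" cpu="$2" target="$3" min="$4" max="$5" desired
  desired=$(( (current * cpu + target - 1) / target ))   # ceiling division
  (( desired < min )) && desired="$min"
  (( desired > max )) && desired="$max"
  echo "$desired"
}

# 4 pods averaging 80% CPU against the 50% target, bounds 1..10 -> prints 7
desired_replicas 4 80 50 1 10
```

CPU utilization is measured relative to each container's CPU request, so the Deployment must set CPU requests for the `--cpu-percent` target to work.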

For Standard clusters, the Cluster Autoscaler provisions new nodes when pods are pending, provided autoscaling is enabled on the node pool. Autopilot clusters provision nodes automatically.
