
Deploy on Kubernetes

This guide covers deploying the Hiya Voice Verification engine on any conformant Kubernetes cluster — on-prem, self-managed, or any cloud distribution. For cloud-specific optimizations, see the dedicated guides for AWS EKS, GCP GKE, or Azure AKS.

Prerequisites

  • A running Kubernetes cluster (v1.24+)
  • kubectl configured to access the cluster
  • Container image pulled and authenticated — see Getting the Container Image
  • A valid API_KEY from your Hiya account team

Step 1 — Create an Image Pull Secret

Create a Kubernetes secret from the JSON key file provided by Hiya. This allows the cluster to pull the image from Google Artifact Registry.

kubectl create secret docker-registry hiya-registry \
--docker-server=europe-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat key.json)"

The pull secret must exist in the same namespace as the Deployment. To use it across multiple namespaces, recreate the secret in each one or use a tool like Sealed Secrets or External Secrets Operator.
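One way to recreate the secret in a second namespace is to rerun the create command with an explicit --namespace flag (the "staging" namespace here is illustrative):

```shell
# Recreate the pull secret in another namespace ("staging" is an example)
kubectl create secret docker-registry hiya-registry \
  --namespace=staging \
  --docker-server=europe-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)"
```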

Step 2 — Store the API Key

Store your runtime API key in a separate secret:

kubectl create secret generic hiya-engine-config \
--from-literal=api-key=<your-api-key>
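To confirm the key was stored correctly, you can read it back out of the secret (decoding the base64 value Kubernetes stores):

```shell
# Print the stored API key from the secret created above
kubectl get secret hiya-engine-config -o jsonpath='{.data.api-key}' | base64 --decode
```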

Step 3 — Create the Deployment

# hiya-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hiya-voice-verification
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hiya-voice-verification
  template:
    metadata:
      labels:
        app: hiya-voice-verification
    spec:
      imagePullSecrets:
        - name: hiya-registry
      containers:
        - name: engine
          image: europe-docker.pkg.dev/loccus-platform/onpremise-images/engine-api-standalone:<version>
          ports:
            - containerPort: 8080
              name: grpc
            - containerPort: 8081
              name: ws
          env:
            - name: API_KEY
              valueFrom:
                secretKeyRef:
                  name: hiya-engine-config
                  key: api-key
          volumeMounts:
            - name: models-tmpfs
              mountPath: /opt/loccus/models
          startupProbe:
            grpc:
              port: 8080
            periodSeconds: 1
            failureThreshold: 30
          livenessProbe:
            grpc:
              port: 8080
            periodSeconds: 10
          resources:
            requests:
              memory: "6Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
      volumes:
        - name: models-tmpfs
          emptyDir:
            medium: Memory
            sizeLimit: 4Gi

Apply the manifest:

kubectl apply -f hiya-deployment.yaml

The emptyDir with medium: Memory is the Kubernetes equivalent of Docker's --tmpfs flag — models are loaded into RAM and never written to persistent disk.
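One way to confirm the in-memory mount once the pod is running (the mount path matches the Deployment above):

```shell
# The tmpfs should appear among the container's mounts
kubectl exec deploy/hiya-voice-verification -- mount | grep /opt/loccus/models
```

Note that memory consumed by a Memory-backed emptyDir counts against the container's memory limit, which is why the 8Gi limit leaves headroom above the 4Gi sizeLimit.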

Step 4 — Create the Service

Expose the engine within the cluster so that client applications can reach it:

# hiya-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: hiya-voice-verification
spec:
  selector:
    app: hiya-voice-verification
  ports:
    - name: grpc
      protocol: TCP
      port: 8080
      targetPort: 8080
    - name: ws
      protocol: TCP
      port: 8081
      targetPort: 8081

kubectl apply -f hiya-service.yaml

The service will be available at hiya-voice-verification.<namespace>.svc.cluster.local:8080.
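A quick way to check in-cluster DNS resolution is a throwaway pod (the busybox image and the "default" namespace here are illustrative):

```shell
# One-off pod to resolve the service name from inside the cluster
kubectl run dns-check --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup hiya-voice-verification.default.svc.cluster.local
```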

Step 5 — Verify

Check that the pod is running and healthy:

kubectl get pods -l app=hiya-voice-verification
kubectl logs -l app=hiya-voice-verification --tail=20

The startup probe gives the engine up to 30 seconds to load its models. Once the pod shows Running and 1/1 ready, the engine is accepting requests.
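To exercise the gRPC endpoint from a workstation, one option is port-forwarding and calling the standard gRPC health service — which the engine must implement, since the Deployment's probes use it (assumes grpcurl is installed locally):

```shell
# Forward the service port, then query the health endpoint
kubectl port-forward svc/hiya-voice-verification 8080:8080 &
grpcurl -plaintext localhost:8080 grpc.health.v1.Health/Check
# The status should report SERVING once the models have loaded
```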

Scaling

The engine is stateless. To scale, increase the replica count:

kubectl scale deployment hiya-voice-verification --replicas=3

Each pod runs the full engine independently. Note that a standard ClusterIP Service balances at the connection level: because gRPC multiplexes requests over long-lived HTTP/2 connections, a client holding a single connection will stay pinned to one pod. For even distribution across replicas, open multiple connections, use client-side gRPC load balancing against a headless Service, or place an L7 proxy in front of the Service.

We recommend keeping pods at around 50% average CPU utilization, which balances response latency against resource cost. See Scalability & Recovery for more details.
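The 50% target maps directly onto a HorizontalPodAutoscaler; the replica bounds below are illustrative, not a Hiya recommendation:

```yaml
# hiya-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hiya-voice-verification
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hiya-voice-verification
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```

HPA utilization is measured against the container's CPU request, so keep the request in the Deployment (cpu: "2") accurate for the target to be meaningful.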

Updating Registry Credentials

When Hiya rotates your registry credentials, update the pull secret in place. Running pods are unaffected, since the secret is consulted only when the kubelet pulls the image:

kubectl create secret docker-registry hiya-registry \
--docker-server=europe-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat key.json)" \
--dry-run=client -o yaml | kubectl apply -f -
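Since the new credentials are only used at pull time, one way to confirm they work is to trigger a rolling restart (nodes with the image already cached may not re-pull, depending on imagePullPolicy):

```shell
# Force new pods, then wait for the rollout to complete
kubectl rollout restart deployment/hiya-voice-verification
kubectl rollout status deployment/hiya-voice-verification
```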

Supported Container Runtimes

The engine image is OCI-compliant and works with any container runtime supported by your cluster:

  • Docker
  • containerd
  • CRI-O
  • gVisor

We strongly recommend 5th generation Intel Xeon Scalable Processors (Emerald Rapids) for superior performance and efficiency.

Supported architectures:

  • Intel Emerald Rapids — Recommended
  • Intel Sapphire Rapids
  • Intel Ice Lake
  • Intel Cascade Lake
  • Intel Skylake
  • AMD EPYC Genoa
  • AMD EPYC Milan