Deploy on AWS EKS
This guide covers deploying the Hiya Voice Verification engine on Amazon Elastic Kubernetes Service (EKS). It builds on the generic Kubernetes guide with EKS-specific configuration.
Prerequisites
- An EKS cluster (v1.24+) with kubectl configured
- Container image pulled and authenticated (see Getting the Container Image)
- A valid API_KEY from your Hiya account team
- AWS CLI and eksctl installed (optional, for cluster management)
Recommended Instance Types
We recommend instances based on 5th Gen Intel Xeon (Emerald Rapids) processors for optimal performance. The following EKS-compatible instance families use Emerald Rapids:
| Instance Family | Category | Notes |
|---|---|---|
| I7i | Storage-optimized | Up to 192 vCPUs, DDR5, NVMe |
| I7ie | Storage-optimized | Highest local NVMe density in EC2 |
The common general-purpose (M7i), compute-optimized (C7i), and memory-optimized (R7i) families use 4th Gen Intel Xeon (Sapphire Rapids) processors, not Emerald Rapids. They are still supported but may yield slightly lower performance.
Ensure your node group instances have at least 8 GB of available RAM per pod (4 GB for the in-memory model mount plus engine overhead).
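As a rough illustration of the sizing rule above, the sketch below estimates how many engine pods fit on a node by memory alone. The node sizes and the 2 GB reserved for the system and kubelet are placeholder assumptions, not Hiya guidance; check the allocatable memory your nodes actually report.

```shell
# Estimate how many engine pods fit on one node by memory alone.
# $1 = node memory in GB
# $2 = GB reserved for system and kubelet (assumed value, adjust for your nodes)
# $3 = GB per pod (defaults to 8, per the guidance above)
pods_per_node() {
  node_gb=$1
  reserved_gb=$2
  per_pod_gb=${3:-8}
  # Integer division rounds down, so a partial pod is never counted.
  echo $(( (node_gb - reserved_gb) / per_pod_gb ))
}

pods_per_node 32 2    # prints 3
pods_per_node 64 2    # prints 7
```

CPU requests can lower this number further, so treat the result as an upper bound.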
Step 1 — Create Secrets
Create the image pull secret and API key secret:
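# Image pull secret for the container registry
kubectl create secret docker-registry hiya-registry \
  --docker-server=europe-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)"

# API key secret (replace the placeholder; the quotes stop the shell from
# treating < and > as redirections)
kubectl create secret generic hiya-engine-config \
  --from-literal=api-key='<your-api-key>'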
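Before deploying, it can help to confirm that both secrets exist with the types kubectl assigns to them. The following is a minimal sketch; the helper names secret_type and verify_hiya_secrets are hypothetical, and it assumes the secrets were created in the namespace of the current kubectl context.

```shell
# Print the type of a secret in the current namespace.
secret_type() {
  kubectl get secret "$1" -o jsonpath='{.type}'
}

# Succeed only if both secrets from this step exist with the expected types:
# "docker-registry" secrets have type kubernetes.io/dockerconfigjson,
# "generic" secrets have type Opaque.
verify_hiya_secrets() {
  [ "$(secret_type hiya-registry)" = "kubernetes.io/dockerconfigjson" ] || return 1
  [ "$(secret_type hiya-engine-config)" = "Opaque" ] || return 1
  echo "secrets ok"
}
```

Calling verify_hiya_secrets prints "secrets ok" on success and returns a non-zero status if either secret is missing or has an unexpected type.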
Step 2 — Deploy
Apply the same Deployment and Service manifests from the Kubernetes guide. No EKS-specific changes are needed in the manifests.
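The deploy step might look like the sketch below, which applies both manifests and then waits for the rollout to finish. The function name deploy_engine is hypothetical, and the manifest file names and Deployment name are whatever you used when following the Kubernetes guide.

```shell
# Apply the Deployment and Service manifests, then wait for the rollout.
# $1 = Deployment manifest path, $2 = Service manifest path,
# $3 = Deployment name (all three depend on your setup)
deploy_engine() {
  kubectl apply -f "$1" -f "$2" &&
    kubectl rollout status "deployment/$3" --timeout=300s
}
```

For example, assuming the files are named deployment.yaml and service.yaml and the Deployment is named hiya-voice-verification: deploy_engine deployment.yaml service.yaml hiya-voice-verification. The call fails if the pods do not become ready within five minutes, which usually points to an image pull or resource problem.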
Step 3 — Expose via Load Balancer (Optional)
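If client applications outside the cluster need to reach the engine, create a Network Load Balancer (NLB). An NLB is recommended over an ALB because it forwards TCP connections at layer 4, so the engine's gRPC (HTTP/2) traffic passes through unmodified. The annotations in the manifest below are handled by the AWS Load Balancer Controller, which must be installed in the cluster.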
# hiya-nlb-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: hiya-voice-verification-nlb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
spec:
  type: LoadBalancer
  selector:
    app: hiya-voice-verification
  ports:
    - name: grpc
      protocol: TCP
      port: 8080
      targetPort: 8080
    - name: ws
      protocol: TCP
      port: 8081
      targetPort: 8081
The aws-load-balancer-scheme: internal annotation keeps the load balancer reachable only from within your VPC. Change it to internet-facing only if external access is explicitly required.
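After the Service is applied, the load balancer takes a short while to provision. A minimal sketch for retrieving its DNS name is shown below; the helper names and the polling interval are assumptions for illustration.

```shell
# Print the DNS name of the NLB once it has been assigned (empty until then).
nlb_hostname() {
  kubectl get service hiya-voice-verification-nlb \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Poll until the hostname appears.
# $1 = maximum number of attempts (defaults to 30), 10 seconds apart.
wait_for_nlb() {
  attempts=${1:-30}
  while [ "$attempts" -gt 0 ]; do
    host=$(nlb_hostname)
    if [ -n "$host" ]; then
      echo "$host"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 10
  done
  return 1
}
```

Clients would then connect to the printed hostname on port 8080 or 8081, matching the ports in the manifest.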
Network Configuration
Ensure the following network access from your EKS nodes:
| Destination | Port | Protocol | Purpose |
|---|---|---|---|
| europe-docker.pkg.dev | 443 | HTTPS | Image pulls |
| api.hiya.com | 443 | HTTPS | License verification and billing |
If your VPC uses private subnets with a NAT Gateway, no additional configuration is typically needed. For VPCs with strict egress rules, allow outbound TCP 443 to these destinations in your security group outbound rules or egress firewall.
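One way to confirm that egress is open is to attempt an HTTPS connection to each destination from a node, or from a pod scheduled on one. The sketch below assumes curl is available in that environment; the function name check_egress is hypothetical. It checks only that a connection can be established, not that credentials are valid.

```shell
# Try an HTTPS connection to each required destination and report the result.
# curl exits 0 once the request completes, whatever the HTTP status code.
check_egress() {
  for host in europe-docker.pkg.dev api.hiya.com; do
    if curl -sS -o /dev/null --connect-timeout 5 "https://$host/"; then
      echo "reachable: $host"
    else
      echo "blocked: $host"
    fi
  done
}
```

A "blocked" line usually means a security group, network ACL, or missing NAT route is dropping the traffic.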
Scaling with Cluster Autoscaler or Karpenter
The engine is stateless and scales by increasing the replica count. If you use Cluster Autoscaler or Karpenter, new nodes will be provisioned automatically when pods are pending due to insufficient resources.
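Scaling by replica count could be done as in the sketch below, which also lists pods that are still waiting for a node. The function names are hypothetical, and the Deployment name is an assumption; only the app: hiya-voice-verification label is taken from the Service selector in this guide.

```shell
# Set the replica count of the engine Deployment.
# $1 = Deployment name (depends on your manifests), $2 = desired replicas
scale_engine() {
  kubectl scale "deployment/$1" --replicas="$2"
}

# List engine pods that are still Pending, for example while the
# autoscaler provisions a new node.
pending_engine_pods() {
  kubectl get pods -l app=hiya-voice-verification \
    --field-selector=status.phase=Pending -o name
}
```

For example, scale_engine hiya-voice-verification 4 followed by pending_engine_pods shows whether the new replicas are waiting on capacity. If pods stay Pending for long, check that the autoscaler is allowed to launch instances with enough memory.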
For Karpenter, consider adding a NodePool that restricts provisioning to instance types from the recommended and supported families:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: hiya-voice-verification
spec:
  template:
    spec:
      # Assumption: an EC2NodeClass named "default" exists; substitute your own.
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["i7i.2xlarge", "i7i.4xlarge", "m7i.2xlarge", "m7i.4xlarge"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]