Deploy on AWS EKS

This guide covers deploying the Hiya Voice Verification engine on Amazon Elastic Kubernetes Service (EKS). It builds on the generic Kubernetes guide with EKS-specific configuration.

Prerequisites

  • An EKS cluster (v1.24+) with kubectl configured
  • Access to the container image, with registry credentials in place (see Getting the Container Image)
  • A valid API_KEY from your Hiya account team
  • AWS CLI and eksctl installed (optional, for cluster management)
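
The version requirement above can be checked before deploying. The following is a minimal sketch, not part of any Hiya or AWS tooling: eks_version_ok is a hypothetical helper that compares a version string against the 1.24 minimum, and the cluster name in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a Kubernetes version string such as
# "1.29" or "v1.27.4-eks-2d98532" is at least 1.24.
eks_version_ok() {
  local v="${1#v}"               # drop an optional leading "v"
  local major="${v%%.*}"         # text before the first dot
  local rest="${v#*.}"           # text after the first dot
  local minor="${rest%%[!0-9]*}" # leading digits of the minor version
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 24 ]; }
}

# Usage sketch (my-cluster is a placeholder):
#   version="$(aws eks describe-cluster --name my-cluster \
#     --query 'cluster.version' --output text)"
#   eks_version_ok "$version" && echo "cluster version OK"
```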

We recommend instances based on Intel Emerald Rapids processors for optimal performance. The following EKS-compatible instance families provide Emerald Rapids:

Instance Family   Category            Notes
I7i               Storage-optimized   Up to 192 vCPUs, DDR5, NVMe
I7ie              Storage-optimized   Highest local NVMe density in EC2

The common general-purpose (M7i), compute-optimized (C7i), and memory-optimized (R7i) families use 4th Gen Intel Xeon Scalable (Sapphire Rapids) processors, not Emerald Rapids. They are still supported but may yield slightly lower performance.

Ensure your node group instances have at least 8 GB of available RAM per pod (4 GB for the in-memory model mount plus engine overhead).
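
As a rough sizing aid, the 8 GB figure can be turned into a pods-per-node estimate. This sketch assumes memory is the only constraint and treats 8 GB as 8 GiB; pods_that_fit is a hypothetical helper, and the value in the usage comment is an illustrative number rather than a measurement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a node's allocatable memory as Kubernetes
# reports it in Ki (for example "63000000Ki"), print how many 8 GiB pods fit.
# Assumption: memory is the only constraint; CPU and other pods are ignored.
pods_that_fit() {
  local ki="${1%Ki}"                     # strip the "Ki" unit suffix
  local per_pod_ki=$((8 * 1024 * 1024))  # 8 GiB expressed in Ki
  echo $(( ki / per_pod_ki ))
}

# Usage sketch: list allocatable memory per node, then size one of them.
#   kubectl get nodes \
#     -o custom-columns=NAME:.metadata.name,MEMORY:.status.allocatable.memory
#   pods_that_fit 63000000Ki
```

The result is an upper bound: system daemons and other workloads on the node reduce what is actually available.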

Step 1 — Create Secrets

Create the image pull secret and API key secret:

kubectl create secret docker-registry hiya-registry \
  --docker-server=europe-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)"

kubectl create secret generic hiya-engine-config \
  --from-literal=api-key='<your-api-key>'
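
To keep the key out of shell history, the API key secret can be created from an environment variable instead. This is a sketch under assumptions: create_api_key_secret is a hypothetical wrapper, and API_KEY is assumed to be exported in the current shell. Piping the client-side dry run into kubectl apply makes the command safe to re-run.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: create or update the hiya-engine-config secret from
# the API_KEY environment variable, refusing to write an empty value.
create_api_key_secret() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set" >&2
    return 1
  fi
  # Render the secret locally, then apply it so re-running updates in place.
  kubectl create secret generic hiya-engine-config \
    --from-literal=api-key="$API_KEY" \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Usage sketch:
#   export API_KEY   # set beforehand, for example from a secrets manager
#   create_api_key_secret
#   kubectl get secret hiya-registry hiya-engine-config
```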

Step 2 — Deploy

Apply the same Deployment and Service manifests from the Kubernetes guide. No EKS-specific changes are needed in the manifests.
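
Applying the manifests and waiting for the rollout can be sketched as follows. The file names and the Deployment name hiya-voice-verification are assumptions (the name is inferred from the app label used elsewhere in this guide), and deploy_engine is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the given manifests, then wait for the rollout.
# Arguments: <deployment-name> <manifest-file>...
deploy_engine() {
  local deployment_name="$1"; shift
  local file
  # Fail early if any manifest is missing, before touching the cluster.
  for file in "$@"; do
    [ -f "$file" ] || { echo "missing manifest: $file" >&2; return 1; }
  done
  for file in "$@"; do
    kubectl apply -f "$file" || return 1
  done
  kubectl rollout status "deployment/$deployment_name" --timeout=300s
}

# Usage sketch (names are assumptions, adjust to your manifests):
#   deploy_engine hiya-voice-verification deployment.yaml service.yaml
```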

Step 3 — Expose via Load Balancer (Optional)

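If client applications outside the cluster need to reach the engine, create a Network Load Balancer (NLB). An NLB is recommended over an ALB because the engine uses gRPC (HTTP/2), which a layer 4 load balancer forwards unmodified. The annotations below are handled by the AWS Load Balancer Controller, which must be installed in the cluster.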

# hiya-nlb-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: hiya-voice-verification-nlb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
spec:
  type: LoadBalancer
  selector:
    app: hiya-voice-verification
  ports:
    - name: grpc
      protocol: TCP
      port: 8080
      targetPort: 8080
    - name: ws
      protocol: TCP
      port: 8081
      targetPort: 8081

The aws-load-balancer-scheme: internal annotation keeps the load balancer reachable only from within your VPC. Change it to internet-facing only if external access is explicitly required.
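
After applying the Service, the load balancer takes some time to receive a DNS name. The sketch below polls for it; wait_for_nlb is a hypothetical helper, and the retry count and interval are arbitrary defaults.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until the Service reports a load balancer
# hostname, then print it. Arguments: <service> [tries] [interval-seconds]
wait_for_nlb() {
  local svc="$1" tries="${2:-30}" interval="${3:-10}" host=""
  while [ "$tries" -gt 0 ]; do
    host="$(kubectl get service "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null || true)"
    if [ -n "$host" ]; then
      printf '%s\n' "$host"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$interval"
  done
  echo "no load balancer hostname for $svc" >&2
  return 1
}

# Usage sketch:
#   kubectl apply -f hiya-nlb-service.yaml
#   wait_for_nlb hiya-voice-verification-nlb
```

Clients would then connect to the printed hostname on port 8080 (gRPC) or 8081.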

Network Configuration

Ensure the following network access from your EKS nodes:

Destination             Port   Protocol   Purpose
europe-docker.pkg.dev   443    HTTPS      Image pulls
api.hiya.com            443    HTTPS      License verification and billing

If your VPC uses private subnets with a NAT Gateway, no additional configuration is typically needed. For VPCs with strict egress rules, add these destinations to your security group outbound rules.
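
Egress to both destinations can be spot-checked with curl. This is a sketch, assuming it runs somewhere that shares the nodes' network path (for example a shell on a node, or a debug pod with curl installed); check_egress is a hypothetical helper. It only tests that an HTTPS connection succeeds, so any HTTP status code counts as reachable.

```shell
#!/usr/bin/env bash
# Destinations taken from the table above; both are reached on port 443.
DESTINATIONS="europe-docker.pkg.dev api.hiya.com"

# Hypothetical helper: report per-destination reachability and return
# non-zero if any destination could not be reached.
check_egress() {
  local host failed=0
  for host in $DESTINATIONS; do
    # Without --fail, curl exits 0 whenever an HTTP response is received.
    if curl -sS -o /dev/null --connect-timeout 5 "https://$host"; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}

# Usage sketch:
#   check_egress || echo "review security group and NAT configuration"
```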

Scaling with Cluster Autoscaler or Karpenter

The engine is stateless and scales by increasing the replica count. If you use Cluster Autoscaler or Karpenter, new nodes will be provisioned automatically when pods are pending due to insufficient resources.
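
To anticipate how many nodes an autoscaler would need to provide, the replica count can be divided by the number of pods that fit on one node, rounding up. The sketch below assumes identical nodes and a known pods-per-node figure; nodes_needed is a hypothetical helper and the Deployment name in the usage comment is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ceil(replicas / pods_per_node) using integer math.
nodes_needed() {
  local replicas="$1" per_node="$2"
  echo $(( (replicas + per_node - 1) / per_node ))
}

# Usage sketch (deployment name is an assumption):
#   kubectl scale deployment/hiya-voice-verification --replicas=10
#   kubectl get pods --field-selector=status.phase=Pending
#   nodes_needed 10 7
```

Pods listed as Pending after scaling are the ones waiting for the autoscaler to add capacity.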

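For Karpenter, consider adding a NodePool that targets the recommended I7i instance types, with M7i as a supported fallback. The fragment below shows only the requirements; a complete NodePool also needs a nodeClassRef that points at your EC2NodeClass: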

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: hiya-voice-verification
spec:
  template:
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["i7i.2xlarge", "i7i.4xlarge", "m7i.2xlarge", "m7i.4xlarge"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
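
Once the NodePool is in place, it can be useful to confirm that provisioned nodes actually use one of the listed instance types. The following is a sketch: is_allowed_type is a hypothetical helper, and ALLOWED_TYPES simply mirrors the values in the NodePool above.

```shell
#!/usr/bin/env bash
# Instance types copied from the NodePool requirements above.
ALLOWED_TYPES="i7i.2xlarge i7i.4xlarge m7i.2xlarge m7i.4xlarge"

# Hypothetical helper: succeeds when the given instance type is in the list.
is_allowed_type() {
  case " $ALLOWED_TYPES " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage sketch: show the pool, then the nodes it created with their types.
#   kubectl get nodepools
#   kubectl get nodes -l karpenter.sh/nodepool=hiya-voice-verification \
#     -L node.kubernetes.io/instance-type
#   is_allowed_type i7i.2xlarge && echo "expected instance type"
```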