Scalability, High Availability & Disaster Recovery

Scalability and high availability

The solution offers native support for horizontal scaling, allowing computing capacity to be expanded by launching additional container instances.

Requests are distributed evenly among active containers, so newly launched instances begin handling traffic immediately and add to overall platform capacity.
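As an illustration of even distribution, the sketch below assigns requests to instances in round-robin order. Round-robin is an assumption chosen for simplicity; the balancing algorithm actually used by the platform is not specified here, and the instance names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, instances):
    """Assign each request to an instance in round-robin order."""
    rotation = cycle(instances)
    return [(request, next(rotation)) for request in requests]


# Hypothetical instance names: twelve requests across three containers.
three = distribute(range(12), ["container-a", "container-b", "container-c"])
load_three = Counter(instance for _, instance in three)

# After launching a fourth container, the same traffic is spread four ways.
four = distribute(
    range(12), ["container-a", "container-b", "container-c", "container-d"]
)
load_four = Counter(instance for _, instance in four)

print(load_three)
print(load_four)
```

With twelve requests, each of three instances receives four; adding a fourth instance lowers the share to three each, which is the capacity gain described above.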

We recommend keeping average CPU utilization per instance at around 50% to balance response times against resource efficiency. Treat this threshold as a starting point: adjust it for anticipated demand and validate it through stress testing.
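To make the 50% guideline concrete, the following minimal sketch estimates how many instances are needed to bring average CPU utilization back to the target. It uses a simple proportional rule (the same form as the Kubernetes Horizontal Pod Autoscaler formula) and assumes load spreads evenly across instances. The function name and parameters are illustrative, not part of the product.

```python
import math


def desired_instances(current_instances, current_cpu_percent, target_cpu_percent=50.0):
    """Estimate the instance count that brings average CPU back to the target.

    Proportional rule: desired = ceil(current * current_cpu / target_cpu).
    Assumes load is spread evenly across instances.
    """
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    estimate = math.ceil(current_instances * current_cpu_percent / target_cpu_percent)
    # Never scale below one running instance.
    return max(1, estimate)


print(desired_instances(4, 80))  # 4 instances at 80% CPU -> 7 instances
print(desired_instances(4, 20))  # 4 instances at 20% CPU -> 2 instances
```

The result is only an estimate; stress testing remains the way to confirm that the chosen threshold holds under real traffic.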

For networks requiring high availability, consider distributing services across multiple locations to ensure redundancy and insulate against localized disruptions.
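The benefit of multiple locations can be estimated with a short calculation. The sketch below assumes that locations fail independently and that the service stays available while at least one location is up. Both are simplifying assumptions, and the 99% figure is only an example value.

```python
def combined_availability(per_location_availability, locations):
    """Probability that at least one of several independent locations is up."""
    if not 0.0 <= per_location_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if locations < 1:
        raise ValueError("locations must be at least 1")
    # All locations must be down at once for the service to be unavailable.
    return 1.0 - (1.0 - per_location_availability) ** locations


# Example value: each location is available 99% of the time.
for count in (1, 2, 3):
    print(count, f"{combined_availability(0.99, count):.6f}")
```

In practice, failures are rarely fully independent (shared networks, shared providers), so the real figure will be lower than this estimate.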

Backup and disaster recovery

To establish a comprehensive disaster recovery strategy:

  1. Regularly back up the container image along with the relevant deployment configuration, including:

    • YAML (.yaml) definition files
    • Service secrets
    • Network settings

  2. Perform a backup before any system upgrade or configuration change.

  3. Define a disaster recovery plan aligned with your organization's:

    • Recovery Point Objective (RPO)
    • Recovery Time Objective (RTO)
    • Corresponding Service Level Agreements (SLAs)
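The configuration backup and the RPO check above can be sketched as follows. This is a minimal example under stated assumptions: the directory layout, the archive naming, and the function names are hypothetical, and the container image itself is not covered here because it would normally be preserved through your image registry. Because the archive may contain service secrets, it should be encrypted or stored in a location with restricted access.

```python
import tarfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def create_backup(config_dir, backup_dir):
    """Archive everything under config_dir into a timestamped .tar.gz file."""
    config_dir = Path(config_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Hypothetical naming scheme for the archive.
    archive_path = backup_dir / f"deployment-config-{stamp}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        # Adds the directory recursively, rooted at its own name.
        archive.add(config_dir, arcname=config_dir.name)
    return archive_path


def meets_rpo(last_backup_time, rpo, now=None):
    """Return True if the most recent backup is no older than the RPO."""
    now = now or datetime.now(timezone.utc)
    return now - last_backup_time <= rpo


# Example: with a 4-hour RPO, a backup taken 1 hour ago is acceptable,
# while one taken 5 hours ago is not.
reference = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(meets_rpo(reference - timedelta(hours=1), timedelta(hours=4), now=reference))
print(meets_rpo(reference - timedelta(hours=5), timedelta(hours=4), now=reference))
```

A call such as `create_backup("deploy", "backups")` would be run on a schedule and before each upgrade or configuration change, while `meets_rpo` could feed a monitoring alert when backups fall behind the agreed objective.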