Before starting deployment, review these requirements with your infrastructure and security teams. This page covers:
Understanding these requirements upfront prevents delays caused by missing approvals or insufficient quotas.
Confident AI uses the following technologies. Your organization may require approval before deploying new technologies:
Why this tech stack? PostgreSQL is the application’s source of truth. Redis provides fast caching and manages background job queues. Kubernetes enables reliable, scalable container orchestration. External Secrets keeps credentials in AWS Secrets Manager (often the location security teams prefer) while making them available to pods.
Technology approval processes: Many enterprises have technology review boards or approved software lists. If PostgreSQL, Kubernetes, or Terraform aren’t already approved in your environment, initiate that process early—it can take weeks.
The default resource configurations for staging and production environments are starting points. Adjust them based on your expected workload.
EKS nodes run your application containers. More nodes mean more capacity for concurrent users and evaluations. The autoscaler adds nodes during high load and removes them when idle.
RDS stores all application data. The instance class affects query performance; storage grows automatically as you accumulate data.
Which service is most resource-intensive? The evaluations service (confident-evals) consumes the most CPU during evaluation runs because it processes LLM outputs and computes metrics. If evaluations are slow, scale this service before adding nodes.
EC2 service quotas can block deployment. AWS accounts have default limits on vCPUs. A typical deployment needs ~32 vCPUs (4 nodes × 8 vCPUs each). Production with autoscaling up to 8 nodes needs 64 vCPUs.
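The vCPU arithmetic above can be sketched as a small helper. This is a minimal illustration that assumes every node has the same vCPU count; the function names are hypothetical and are not part of Confident AI's tooling.

```python
def required_vcpus(nodes: int, vcpus_per_node: int) -> int:
    """Total vCPUs consumed when `nodes` identical instances are running."""
    return nodes * vcpus_per_node


def quota_shortfall(quota_limit: int, nodes: int, vcpus_per_node: int) -> int:
    """vCPUs missing from the quota; 0 means the quota is sufficient."""
    return max(0, required_vcpus(nodes, vcpus_per_node) - quota_limit)


if __name__ == "__main__":
    print(required_vcpus(4, 8))       # typical deployment: 32
    print(required_vcpus(8, 8))       # production autoscaled to 8 nodes: 64
    print(quota_shortfall(32, 8, 8))  # 32-vCPU quota vs. 8 nodes: 32 short
```

Size the quota request against the autoscaler's maximum node count rather than the baseline, since the quota is enforced at the moment the extra nodes launch.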
Check your quotas before starting:
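If your limit is too low for the node count you plan to run (for example, 32 vCPUs in a new account, which covers the 4-node baseline but not autoscaling to 8 nodes), request an increase before deploying. Approval can take hours to days.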
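One way to read the current limit programmatically is through the Service Quotas API. The sketch below uses boto3 and assumes the nodes are On-Demand instances from the standard families, whose vCPU limit is tracked under quota code `L-1216C47A`; confirm the code that applies to your instance types in the Service Quotas console. The helper names are hypothetical.

```python
# Quota code for "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances".
# Assumption: your node group uses one of these families on On-Demand capacity.
ON_DEMAND_STANDARD_QUOTA_CODE = "L-1216C47A"


def parse_quota_value(response: dict) -> float:
    """Extract the numeric limit from a GetServiceQuota response."""
    return float(response["Quota"]["Value"])


def fetch_vcpu_quota(region: str) -> float:
    """Look up the applied vCPU quota for one region (requires AWS credentials)."""
    import boto3  # imported lazily so the parsing helper works without boto3

    client = boto3.client("service-quotas", region_name=region)
    response = client.get_service_quota(
        ServiceCode="ec2",
        QuotaCode=ON_DEMAND_STANDARD_QUOTA_CODE,
    )
    return parse_quota_value(response)


if __name__ == "__main__":
    # "us-east-1" is only a placeholder region.
    print(fetch_vcpu_quota("us-east-1"))
```

Quotas are per region, so run the lookup against the region you will deploy into. If the call reports that no applied quota exists, `get_aws_default_service_quota` takes the same arguments and returns the AWS default instead.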
The deployment provisions the following AWS services:
Some organizations restrict which AWS services can be used. Service Control Policies (SCPs) or internal policies may prohibit certain services. Verify the services above are allowed in your AWS organization before proceeding.
Common restrictions that cause issues:
Confident AI needs to reach external services. Ensure your network allows outbound HTTPS (port 443) to:
Corporate proxies and firewalls: If your organization routes traffic through a proxy or inspects HTTPS, you may need to:
Network restrictions are a common cause of deployment failures that appear as timeouts or SSL errors.
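To tell a timeout apart from an SSL error before deploying, a short probe can be run from a machine inside the target network. This is a minimal sketch using only the Python standard library; the host list is a placeholder (substitute the endpoints your deployment must reach), and the function names are hypothetical.

```python
import socket
import ssl


def classify_failure(exc: Exception) -> str:
    """Map a connection exception to a coarse failure category."""
    if isinstance(exc, ssl.SSLError):
        # Typical when a proxy re-signs TLS with a CA the client does not trust.
        return "ssl"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        # Typical when a firewall silently drops outbound packets.
        return "timeout"
    # DNS failures, refused connections, unreachable networks, and so on.
    return "unreachable"


def probe_https(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TCP connection and complete a verified TLS handshake."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return "ok"
    except OSError as exc:  # ssl.SSLError and socket.timeout are both OSError subclasses
        return classify_failure(exc)


if __name__ == "__main__":
    for host in ["example.com"]:  # placeholder, not a real dependency
        print(host, probe_https(host))
```

The probe connects directly and does not honor proxy environment variables, so a result of `timeout` or `unreachable` on a network that requires a proxy is expected and points to proxy configuration rather than a blocked destination.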
The IAM user or role running Terraform needs permissions to create and manage:
Why so many permissions? Terraform creates a complete, self-contained infrastructure. It needs permission to create all the pieces. After deployment, ongoing operations need far fewer permissions.
IAM permissions are the #1 cause of deployment failures. Most organizations don’t grant broad permissions by default.
Options:
IAM permission errors look like: Error: creating IAM Role: AccessDenied
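Permission gaps can often be found before running Terraform by asking the IAM policy simulator whether a principal is allowed to perform specific actions. The sketch below uses boto3's `simulate_principal_policy`; the three actions are illustrative samples only, not the full set the deployment needs, and the ARN and helper names are hypothetical.

```python
# Illustrative sample only; the real deployment needs many more actions.
SAMPLE_ACTIONS = ["iam:CreateRole", "eks:CreateCluster", "rds:CreateDBInstance"]


def denied_actions(evaluation_results: list) -> list:
    """Return action names whose simulated decision is anything but 'allowed'."""
    return [
        result["EvalActionName"]
        for result in evaluation_results
        if result["EvalDecision"] != "allowed"  # explicitDeny or implicitDeny
    ]


def simulate(principal_arn: str, actions: list) -> list:
    """Simulate `actions` for an IAM user or role (requires AWS credentials)."""
    import boto3  # imported lazily so denied_actions works without boto3

    iam = boto3.client("iam")
    response = iam.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=actions,
    )
    # A short action list fits in one page; longer lists need pagination.
    return denied_actions(response["EvaluationResults"])


if __name__ == "__main__":
    # Placeholder ARN; replace with the principal that will run Terraform.
    arn = "arn:aws:iam::123456789012:role/terraform-deployer"
    print(simulate(arn, SAMPLE_ACTIONS))
```

The caller needs `iam:SimulatePrincipalPolicy` itself. Treat the result as an approximation: conditions evaluated at request time can still produce `AccessDenied` during the real deployment.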
AWS costs vary by region and usage. Approximate monthly costs for always-on infrastructure:
These are estimates. Actual costs depend on:
Use AWS Cost Explorer after deployment to track actual spending.
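Cost Explorer data is also available through the API, which helps when comparing actual spending against the estimates on a schedule. The sketch below uses boto3's `get_cost_and_usage` grouped by service; the helper names and dates are hypothetical, and it assumes Cost Explorer is already enabled on the account.

```python
def cost_by_service(response: dict) -> dict:
    """Sum UnblendedCost per service across all returned time periods."""
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return totals


def fetch_costs(start: str, end: str) -> dict:
    """Fetch monthly costs between two YYYY-MM-DD dates (end is exclusive)."""
    import boto3  # imported lazily so cost_by_service works without boto3

    ce = boto3.client("ce")
    response = ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    # Only the first page is read; follow NextPageToken for large result sets.
    return cost_by_service(response)


if __name__ == "__main__":
    # Placeholder dates; use the billing period you want to inspect.
    for service, amount in sorted(fetch_costs("2024-01-01", "2024-02-01").items()):
        print(f"{service}: {amount:.2f}")
```

Cost Explorer API requests are billed per call, so this suits periodic reports better than frequent polling.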
Before proceeding to Prerequisites, verify:
Once requirements are understood and approved, proceed to Prerequisites to set up your deployment environment.