Before starting deployment, review these requirements with your infrastructure and security teams. This page covers:
Understanding these requirements upfront prevents delays caused by missing approvals or insufficient quotas.
Confident AI uses the following technologies. Your organization may require approval before deploying new technologies:
Why this tech stack? PostgreSQL is the application’s source of truth. Redis provides fast caching and manages background job queues. Kubernetes enables reliable, scalable container orchestration. External Secrets keeps credentials in Secret Manager (your security team’s preferred location) while making them available to pods.
Technology approval processes: Many enterprises have technology review boards or approved software lists. If PostgreSQL, Kubernetes, or Terraform aren’t already approved in your environment, initiate that process early—it can take weeks.
The default resource configurations for staging and production environments are starting points; adjust them based on your expected workload.
GKE worker nodes run your application containers. More nodes provide more capacity for concurrent users and evaluations. The autoscaler adds nodes during high load and removes them when idle.
GKE system pool runs Kubernetes system components (kube-dns, kube-proxy, etc.) on a fixed set of 2 nodes.
Cloud SQL for PostgreSQL stores all application data. The machine type affects query performance; storage grows as you accumulate data.
Which service is most resource-intensive? The evaluations service (confident-evals) consumes the most CPU during evaluation runs because it processes LLM outputs and computes metrics. If evaluations are slow, scale this service before adding nodes.
GCP CPU quotas can block deployment. GCP projects have default limits on CPUs per region and per VM family. A typical deployment needs about 40 vCPUs against the N2_CPUS quota (2 system nodes × 4 vCPUs + 4 worker nodes × 8 vCPUs).
Check your quotas before starting:
If your limit is low, request an increase—this can take hours to days.
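As a rough illustration of the quota arithmetic above, the following Python sketch adds up the vCPUs for the two node pools and compares the total with a regional quota entry. It assumes the quota data has the shape returned by `gcloud compute regions describe REGION --format=json` (a `quotas` list with `metric`, `limit`, and `usage` fields). The `NodePool` class, the helper names, and the sample limit of 24 are hypothetical and are not part of the Confident AI tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """Hypothetical description of one GKE node pool."""
    name: str
    nodes: int
    vcpus_per_node: int


def required_vcpus(pools):
    """Total vCPUs needed when every listed node is running."""
    return sum(pool.nodes * pool.vcpus_per_node for pool in pools)


def quota_headroom(region_info, metric, needed):
    """Return limit - usage - needed for `metric`, or None if the metric is absent.

    `region_info` is assumed to be the parsed JSON of a region description.
    A negative result means the quota must be raised before deploying.
    """
    for quota in region_info.get("quotas", []):
        if quota.get("metric") == metric:
            return quota["limit"] - quota["usage"] - needed
    return None


# Pool sizes taken from the estimate above: 2 x 4 vCPUs (system) + 4 x 8 vCPUs (worker).
POOLS = [NodePool("system", 2, 4), NodePool("worker", 4, 8)]

if __name__ == "__main__":
    # Made-up sample data; replace with the real output for your region.
    sample = {"quotas": [{"metric": "N2_CPUS", "limit": 24.0, "usage": 0.0}]}
    needed = required_vcpus(POOLS)
    print(f"needed={needed} headroom={quota_headroom(sample, 'N2_CPUS', needed)}")
```

With the sample limit of 24, the headroom is negative, which is the situation where a quota increase should be requested before running the deployment.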
The deployment provisions the following GCP services:
Some organizations restrict which GCP services can be used. Organization policies or folder-level constraints may prohibit certain services. Verify the services above are allowed in your project before proceeding.
Common restrictions that cause issues:
Confident AI needs to reach external services. Ensure your network allows outbound HTTPS (port 443) to:
Corporate proxies and firewalls: If your organization routes traffic through a proxy or inspects HTTPS, you may need to:
Network restrictions are a common cause of deployment failures that appear as timeouts or SSL errors.
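The outbound HTTPS requirement above can be spot-checked before deployment with a small script. This is a minimal sketch, not part of the product: `check_outbound` is a hypothetical helper, and the host list must come from your own copy of the requirements. It only verifies that a TCP connection to port 443 can be opened, so it will not detect a proxy that intercepts or re-signs TLS traffic.

```python
import socket


def check_outbound(hosts, port=443, timeout=5.0, connect=socket.create_connection):
    """Try a TCP connection to each host on `port`.

    Returns a dict mapping each host to None on success, or to a short
    error description on failure. `connect` is injectable so the function
    can be exercised without real network access.
    """
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results[host] = None
        except OSError as exc:
            # Covers DNS failures, refused connections, and timeouts.
            results[host] = f"{type(exc).__name__}: {exc}"
    return results
```

Run it from a machine on the same network as the planned cluster, for example `check_outbound(["example.com"])` with the placeholder replaced by the required hosts. Any entry that is not `None` points to a firewall, proxy, or DNS rule worth raising with your network team.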
The identity running Terraform needs the following GCP IAM roles or equivalent permissions:
Terraform creates and manages:
Permissions are a common cause of deployment failures. Most organizations don’t grant broad permissions by default.
Options:
GCP costs vary by region and usage. Approximate monthly costs for always-on infrastructure:
These are estimates. Actual costs depend on:
Use GCP Billing and Cost Management after deployment to track actual spending.
Before proceeding to Prerequisites, verify:
Once requirements are understood and approved, proceed to Prerequisites to set up your deployment environment.