GCP Deployment: Step-by-step guide

Prerequisites

Overview

Before deploying Confident AI, you need to prepare your local environment and gather required information. This page covers:

  • Installing required CLI tools (Terraform, gcloud, kubectl, Helm)
  • Configuring GCP credentials with appropriate permissions
  • Obtaining repository access from Confident AI
  • Gathering domain, OAuth, and ECR credentials
  • Planning your VPC configuration

Complete all items on this page before proceeding to Configuration.

Tools

Install the following tools on your local machine (or wherever you’ll run the deployment from):

| Tool | Version | Purpose | Installation |
| Terraform | >= 1.5.0 | Provisions all GCP infrastructure | terraform.io/downloads |
| gcloud CLI | Latest | Authenticates with GCP and manages resources | cloud.google.com/sdk/docs/install |
| kubectl | >= 1.28 | Manages Kubernetes workloads on GKE | kubernetes.io/docs/tasks/tools |
| Helm | >= 3.12 | Installs Kubernetes packages (charts) | helm.sh/docs/intro/install |
| Git | Latest | Clones the deployment repositories | git-scm.com |

Verify installations:

$ terraform version
$ gcloud version
$ kubectl version --client
$ helm version
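The minimum versions in the table can also be checked mechanically. The following is a minimal sketch, not part of the Confident AI tooling: `version_ge` and `missing_tools` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent macOS versions do).

```shell
#!/bin/sh
# Sketch: report missing tools and compare version strings.
# Assumption: `sort -V` (version sort) is available on this machine.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the name of every required tool that is not on PATH.
missing_tools() {
  for tool in terraform gcloud kubectl helm git; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
fi

# Example comparison: substitute the version printed by `terraform version`.
if version_ge "1.6.2" "1.5.0"; then
  echo "1.6.2 satisfies >= 1.5.0"
fi
```

Read each tool's version from the commands above and pass it to `version_ge` together with the minimum from the table.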

Corporate laptop restrictions: Many organizations restrict software installation on managed devices. If you can’t install these tools locally, consider:

  • Using a cloud-based VM (Compute Engine VM) as your deployment workstation
  • Requesting exceptions from your IT security team
  • Using pre-approved container images that include these tools

GCP credentials

Terraform needs GCP credentials to create resources on your behalf. The identity you use must have Editor and Project IAM Admin roles on the target project.

Option A: Interactive login (recommended for individuals)

$ gcloud auth login
$ gcloud auth application-default login
$ gcloud config set project "<your-project-id>"

Option B: Service account (recommended for CI/CD and teams)

$ gcloud auth activate-service-account \
    --key-file="<path-to-key.json>"

$ export GOOGLE_APPLICATION_CREDENTIALS="<path-to-key.json>"

Option C: Attached service account (from a Compute Engine VM)

If you run Terraform from a Compute Engine VM with an attached service account, no explicit login is needed: Terraform and gcloud pick up the VM's credentials automatically. Ensure the service account has the required IAM bindings.

Verify access works:

$ gcloud auth list
$ gcloud config list

This should return your active account and project.

Permission errors are the #1 cause of failed deployments. Before starting, verify your identity has permissions to create:

  • Projects (if creating new), VPCs, subnets, firewall rules, Cloud NAT
  • GKE clusters and node pools
  • Cloud SQL instances
  • GCS buckets
  • Secret Manager secrets
  • Service accounts and IAM bindings

If your organization requires pre-approved service accounts, work with your cloud security team to get the necessary permissions before proceeding.
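One way to check the two roles named earlier (Editor and Project IAM Admin) before running Terraform is to list the roles bound to your identity and compare them against the required set. This is a sketch under assumptions: `list_project_roles` and `missing_roles` are hypothetical helper names, the role IDs `roles/editor` and `roles/resourcemanager.projectIamAdmin` are the standard identifiers for those two roles, and the query only sees bindings made directly on the project (roles granted through a group, folder, or organization will not show up).

```shell
#!/bin/sh
# Sketch: find required roles that are not bound directly to a member.
REQUIRED_ROLES="roles/editor roles/resourcemanager.projectIamAdmin"

# List roles bound directly to a member on a project.
# usage: list_project_roles <project-id> <member-email>
list_project_roles() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$2" \
    --format="value(bindings.role)"
}

# Read granted roles on stdin (one per line) and print each required
# role that is absent. Prints nothing when all required roles are present.
missing_roles() {
  granted="$(cat)"
  for role in $REQUIRED_ROLES; do
    printf '%s\n' "$granted" | grep -qxF "$role" || printf '%s\n' "$role"
  done
}

# Example (requires an authenticated gcloud):
#   list_project_roles "<your-project-id>" "<your-account-email>" | missing_roles
```

An empty result means both roles are bound directly. A non-empty result is not necessarily a failure, since the roles may be inherited, but it is worth confirming with your cloud security team.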

Using a service account? Many organizations prohibit using personal credentials for infrastructure provisioning. If you need to use a service account, ensure it has the permissions listed above and that you can authenticate with it from your deployment workstation.

Repository access

Confident AI provides a private GitHub repository containing the deployment code:

| Repository | What it contains |
| confident-terraform | Terraform code that sets up the complete GCP infrastructure (including VPC, GKE, Cloud SQL, Cloud Storage, etc.) and performs the initial configuration of the GKE cluster |

Your Confident AI representative will grant your GitHub account access to this repository. Once granted, clone it:

$ git clone git@github.com:confident-ai/confident-terraform.git

SSH key issues: If the clone fails with “Permission denied (publickey)”, you need to add your SSH key to GitHub. See GitHub’s SSH key documentation.

Corporate proxy/firewall: If git commands hang or time out, your network may block SSH (port 22). Try using HTTPS instead:

$ git clone https://github.com/confident-ai/confident-terraform.git

Information to gather

Before running Terraform, you need several pieces of information. Gather these now to avoid interruptions during configuration.

GCP project ID

You need the project ID where resources will be deployed:

$ gcloud config get-value project

Ensure the correct project is selected. If your organization has multiple projects, verify you’re targeting the right one. Deploying to the wrong project can be difficult to undo.
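A small guard can catch a wrong or mistyped project before Terraform runs. This is a sketch, not part of the deployment repository: `valid_project_id` and `check_project` are hypothetical helper names. The format check follows the standard GCP rule for project IDs (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen) and does not cover legacy domain-scoped IDs.

```shell
#!/bin/sh
# Sketch: confirm the active gcloud project is the one you intend to deploy to.

# Succeeds when $1 looks like a standard GCP project ID.
valid_project_id() {
  printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# Succeeds only when the expected ID is well formed and equals the active one.
# usage: check_project <expected-id> <active-id>
check_project() {
  expected="$1"
  active="$2"
  valid_project_id "$expected" || {
    echo "malformed project ID: $expected" >&2
    return 1
  }
  [ "$active" = "$expected" ] || {
    echo "active project is '$active', expected '$expected'" >&2
    return 1
  }
}

# Example (requires gcloud):
#   check_project "<your-project-id>" "$(gcloud config get-value project 2>/dev/null)"
```

Running the check at the start of a deployment script makes the script stop early instead of provisioning into the wrong project.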

Domain and URLs

Confident AI needs three values configured: two URLs and a cookie domain. These determine where your users and applications access the platform:

| Variable | What it’s for | Example |
| Frontend URL | Where users open their browser to access the Confident AI dashboard | https://app.yourdomain.com |
| Backend URL | API endpoint that the SDK and integrations call | https://api.yourdomain.com |
| Subdomain | Used for authentication cookies, typically your root domain | yourdomain.com |

Why separate frontend and backend URLs? The frontend serves the web dashboard, while the backend handles API requests. Separating them allows independent scaling and clearer security boundaries. Both URLs will point to the same load balancer but route to different services.

DNS control required: You must be able to create DNS records (CNAME or A records) for these domains. If your DNS is managed by a different team, loop them in early—DNS changes often require change tickets and approvals.

OAuth credentials (Google SSO)

If using Google for user authentication, you need OAuth credentials. Skip this if using a different identity provider (Okta, Azure AD, etc.—these are configured separately).

  1. Go to APIs & Services > Credentials in the Google Cloud Console
  2. Create a new OAuth 2.0 Client ID (Web application type)
  3. Add authorized redirect URI: https://<your-backend-url>/api/auth/callback/google
  4. Save the Client ID and Client Secret

OAuth redirect URI must be exact. The redirect URI must match exactly what you configure in Confident AI. A common mistake is forgetting the /api/auth/callback/google path or using HTTP instead of HTTPS.
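Because the two common mistakes are a missing path and an `http://` scheme, it can help to derive the redirect URI from the backend URL rather than typing it by hand. A minimal sketch, where `google_redirect_uri` is a hypothetical helper name and the path is the one given in step 3:

```shell
#!/bin/sh
# Sketch: build the Google OAuth redirect URI from the backend URL.

# Prints <backend-url>/api/auth/callback/google, or fails for non-HTTPS input.
google_redirect_uri() {
  backend_url="${1%/}"   # drop one trailing slash, if present
  case "$backend_url" in
    https://?*)
      printf '%s/api/auth/callback/google\n' "$backend_url"
      ;;
    *)
      echo "backend URL must start with https://" >&2
      return 1
      ;;
  esac
}

# Example using the placeholder domain from the table above.
google_redirect_uri "https://api.yourdomain.com/"
```

Paste the printed value into the OAuth client's authorized redirect URIs so that it matches the backend URL you configure later.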

ECR access credentials

Confident AI container images are stored in a private AWS Elastic Container Registry (ECR). Your Confident AI representative will provide credentials that allow your GKE cluster to pull these images:

| Credential | What it’s for |
| ecr_aws_access_key_id | AWS access key that can authenticate to Confident AI’s ECR |
| ecr_aws_secret_access_key | Corresponding secret key |
| ecr_aws_account_id | The AWS account ID where images are stored |

Why AWS ECR on GCP? Confident AI hosts container images in AWS ECR. Terraform configures a CronJob in your GKE cluster that periodically refreshes ECR pull credentials. Your cluster authenticates to ECR, pulls the images, and runs them in your environment—the images never leave your infrastructure after the initial pull.
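To illustrate why a periodic refresh is needed: ECR login tokens expire after 12 hours, so a Kubernetes image pull secret built from one has to be regenerated regularly. The sketch below shows roughly what such a refresh could look like. It is not the CronJob that Terraform installs, whose details may differ. The function names, the region, the namespace, and the secret name are all assumptions for illustration, and the AWS CLI is assumed to read the access key and secret key from `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

```shell
#!/bin/sh
# Sketch: refresh a Kubernetes image pull secret from a short-lived ECR token.

# Print the ECR registry hostname for an account ID and region.
ecr_registry_host() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{12}$' || {
    echo "account ID must be 12 digits" >&2
    return 1
  }
  [ -n "$2" ] || {
    echo "region is required" >&2
    return 1
  }
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Create or update the pull secret (hypothetical namespace and secret name).
# usage: refresh_ecr_pull_secret <account-id> <region> <namespace> <secret-name>
refresh_ecr_pull_secret() {
  host="$(ecr_registry_host "$1" "$2")" || return 1
  token="$(aws ecr get-login-password --region "$2")" || return 1
  kubectl create secret docker-registry "$4" \
    --namespace "$3" \
    --docker-server="$host" \
    --docker-username=AWS \
    --docker-password="$token" \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Example with placeholder values:
ecr_registry_host "123456789012" "us-east-1"
```

Piping the `--dry-run=client` output into `kubectl apply` lets the same command both create the secret the first time and update it on later runs.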

Secrets to generate

You need to generate several secure random values. These are used for signing authentication tokens, as database passwords, and for admin access:

# Auth secret - used to sign authentication tokens (32+ characters)
$ openssl rand -base64 32

# PostgreSQL password - primary database password
$ openssl rand -base64 24

# ClickHouse password - analytics database password
$ openssl rand -base64 24

# ArgoCD admin password - GitOps dashboard access
$ openssl rand -base64 16

Save these values securely. You’ll need them during configuration. Use a password manager or secure notes—don’t save them in plain text files or commit them to git. If you lose these values, you may need to redeploy or reset credentials.

OpenAI API key

Confident AI uses OpenAI (or compatible LLM providers) to run evaluations. You need an API key with access to models like GPT-4.

Outbound network access: Your GKE cluster needs outbound HTTPS access to api.openai.com for evaluations to work. If your organization restricts outbound traffic, ensure this is allowlisted. Alternatively, if you use Vertex AI or a self-hosted model, provide those credentials instead.

Network planning

Terraform can either create a new VPC or deploy into an existing one. Understanding this decision is important because it affects network isolation, connectivity, and security.

What is a VPC?

A Virtual Private Cloud (VPC) is your isolated network in GCP. Think of it as your own private data center in the cloud. Resources inside the VPC can talk to each other, while traffic to and from the outside is controlled by firewall rules and, for outbound connections from private nodes, Cloud NAT.

Option A: Create new VPC (recommended for most deployments)

Terraform creates a dedicated VPC for Confident AI with:

  • GKE subnet: Where GKE worker and system nodes run (with secondary ranges for pods and services)
  • Database subnet: Allocated range for Cloud SQL Private Service Access (VPC peering)
  • Public subnet: For any future public-facing resources
  • Private service connect subnet: For private GCS access via PSC endpoints
  • Cloud NAT: Allows GKE nodes to make outbound requests (for pulling images, calling APIs)

Default IP ranges:

| Subnet Type | CIDR Block | What it means |
| VPC | 10.0.0.0/16 | 65,536 available IP addresses |
| GKE Nodes | 10.0.1.0/24 | 256 IPs for GKE nodes |
| GKE Pods | 10.4.0.0/14 | Secondary range for pods (VPC-native) |
| GKE Services | 10.0.32.0/20 | Secondary range for cluster services |
| Database (PSA) | 10.0.6.0/24 | Reserved range for Cloud SQL peering |
| Public | 10.0.101.0/24 | 256 IPs for public resources |
| Private Endpoints | 10.0.7.0/24 | 256 IPs for PSC endpoints |

CIDR conflicts: If these IP ranges overlap with your corporate network (e.g., if you already use 10.0.x.x internally), you’ll have routing problems when setting up VPN connectivity. Check with your network team and choose non-overlapping ranges.
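Overlap between two CIDR blocks can be checked with integer arithmetic: convert each block to its first and last address, then test whether the two intervals intersect. The sketch below does this in plain shell for IPv4. The function names are hypothetical, and the corporate range `10.0.0.0/20` is a made-up example, so replace it with the ranges your network team gives you.

```shell
#!/bin/sh
# Sketch: detect overlap between the default ranges and a corporate range.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs="$IFS"
  IFS=.
  set -- $1          # split the address into its four octets
  IFS="$old_ifs"
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" as integers for a CIDR block such as 10.0.1.0/24.
cidr_range() {
  base="$(ip_to_int "${1%/*}")"
  prefix="${1#*/}"
  size=$(( 1 << (32 - prefix) ))
  first=$(( base / size * size ))   # align to the start of the block
  echo "$first $(( first + size - 1 ))"
}

# Succeeds when the two CIDR blocks share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

corp_range="10.0.0.0/20"   # hypothetical on-premises range
for block in 10.0.0.0/16 10.0.1.0/24 10.4.0.0/14 10.0.32.0/20 \
             10.0.6.0/24 10.0.101.0/24 10.0.7.0/24; do
  if cidrs_overlap "$block" "$corp_range"; then
    echo "conflict: $block overlaps $corp_range"
  fi
done
```

Any block reported as a conflict needs a different range before you set up VPN connectivity to that network.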

Option B: Use existing VPC

If your organization requires deploying into an existing VPC (common for compliance or network policy reasons), gather:

  • VPC ID (e.g., projects/<project>/global/networks/<name>)
  • VPC address space (e.g., 10.0.0.0/16)
  • GKE subnet ID — must have sufficient IPs for nodes and secondary ranges for pods/services
  • Database PSA range — must have an allocated compute.global-address for Private Service Access
  • Public subnet ID — for load balancer resources

Cloud SQL requires Private Service Access (PSA). PSA is configured by allocating a global address range and creating a VPC peering with servicenetworking.googleapis.com. Without PSA, private Cloud SQL connectivity will fail.

If using an existing VPC, verify a PSA allocation exists or add one before proceeding.

Corporate VPC restrictions: Many organizations have strict policies on what can be created in shared VPCs:

  • Firewall rules may require approval
  • Cloud NAT may need to use existing shared infrastructure
  • Subnet CIDR ranges may be pre-allocated
  • VPC Flow Logs may be required

Work with your network/cloud team to understand these constraints before choosing this option.

Checklist before proceeding

Before moving to Configuration, verify:

  • All CLI tools installed and working (terraform, gcloud, kubectl, helm, git)
  • GCP credentials configured and gcloud auth list succeeds
  • Correct project selected
  • The confident-terraform repository cloned successfully
  • Domain URLs decided and DNS access confirmed
  • OAuth credentials created (if using Google SSO)
  • ECR credentials received from Confident AI
  • Secrets generated and stored securely
  • VPC decision made (new vs. existing) and relevant IDs gathered
  • Network team consulted if using existing VPC or if CIDR conflicts possible

Next steps

Once you have all prerequisites in place, proceed to Configuration to set up your Terraform variables.