You’ve provisioned infrastructure and deployed the application. This final step verifies everything works correctly. You will:
After this step, your deployment is verified and ready for users.
Before testing the application, verify all infrastructure components are healthy.
All nodes should show Ready:
Nodes stuck in NotReady? Check for issues:
Look at the “Conditions” section for clues. Common causes: network plugin issues, insufficient resources, or failed health checks.
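If the cluster has many nodes, the readiness check can be scripted. The following is a minimal sketch, assuming you pass it the text printed by `kubectl get nodes -o json`; the function name `not_ready_nodes` is hypothetical and not part of any Confident AI tooling.

```python
import json


def not_ready_nodes(nodes_json: str) -> dict[str, list[str]]:
    """Map each node that is not Ready to a summary of its conditions.

    `nodes_json` is the text printed by `kubectl get nodes -o json`.
    """
    problems = {}
    for node in json.loads(nodes_json).get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            # Keep every condition so the likely cause (resource pressure,
            # network plugin not ready, ...) is visible next to the node name.
            problems[node["metadata"]["name"]] = [
                f'{c.get("type")}={c.get("status")}: {c.get("message", "")}'
                for c in conditions
            ]
    return problems
```

An empty result means every node reported a Ready condition with status True.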
Verify the database is accessible from the cluster by checking backend logs:
You should see successful connection messages, not connection refused errors.
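For long logs, a small filter can surface only the suspicious lines. This is a rough sketch: the failure markers below are assumptions about what a backend might log, not the actual Confident AI log format, so adjust them to match your output.

```python
import re

# Hypothetical failure markers; adjust to whatever your backend actually logs.
FAILURE_PATTERNS = re.compile(
    r"connection refused|ECONNREFUSED|timed out|timeout", re.IGNORECASE
)


def connection_errors(log_text: str) -> list[str]:
    """Return the log lines that look like failed connection attempts."""
    return [
        line for line in log_text.splitlines() if FAILURE_PATTERNS.search(line)
    ]
```

Feed it the captured output of `kubectl logs` for the backend pod; an empty list suggests no connection failures were logged.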
Check RDS status in the AWS Console by finding your database instance (e.g., confidentai-stage-rds).
Verify the secret exists and External Secrets can read it:
Verify the buckets exist:
You should see three buckets (e.g., confidentai-stage-testcases-bucket, confidentai-stage-payloads-bucket, and confidentai-stage-chbackups-bucket).
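The bucket check can also be automated. The sketch below assumes you pass it the JSON printed by `aws s3api list-buckets --output json`; the expected names are the example names from the text above and should be replaced with your own.

```python
import json

# Example names taken from the text above; replace with your own bucket names.
EXPECTED_BUCKETS = {
    "confidentai-stage-testcases-bucket",
    "confidentai-stage-payloads-bucket",
    "confidentai-stage-chbackups-bucket",
}


def missing_buckets(list_buckets_json: str, expected=EXPECTED_BUCKETS) -> set[str]:
    """Return expected bucket names absent from `aws s3api list-buckets` output."""
    buckets = json.loads(list_buckets_json).get("Buckets", [])
    present = {bucket["Name"] for bucket in buckets}
    return set(expected) - present
```

An empty set means all three buckets are visible to the credentials that ran the command.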
S3 is used for file uploads. If S3 connectivity fails, users won’t be able to upload datasets or export reports. The backend uses EKS Pod Identity to access S3, so verify that the confident-s3-sa service account exists.
The ALB (Application Load Balancer) has been created, but DNS records must point to it.
Example output:
Why a long hostname? AWS ALBs have automatically generated hostnames. You create DNS records (CNAME or ALIAS) that point your friendly domains to this ALB hostname.
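If you want to extract the ALB hostname programmatically, the sketch below reads it from the ingress status. It assumes the input is the text of `kubectl get ingress -n confident-ai -o json`; the helper name `alb_hostnames` is hypothetical.

```python
import json


def alb_hostnames(ingress_json: str) -> dict[str, list[str]]:
    """Map each ingress name to the load balancer hostnames in its status.

    Input is the text of `kubectl get ingress -n confident-ai -o json`.
    """
    result = {}
    for ing in json.loads(ingress_json).get("items", []):
        entries = ing.get("status", {}).get("loadBalancer", {}).get("ingress", [])
        result[ing["metadata"]["name"]] = [
            entry["hostname"] for entry in entries if "hostname" in entry
        ]
    return result
```

An ingress that maps to an empty list has not been assigned a load balancer hostname yet.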
Add DNS records for each hostname you configured in the ingress:
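CNAME vs. ALIAS:
A CNAME record works with any DNS provider but only for subdomains. An ALIAS record also works for the root domain (yourdomain.com without a subdomain). Recommended if using Route 53.
If your DNS provider only supports CNAME, you must use subdomains (e.g., app.yourdomain.com), not the root domain.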
Corporate DNS changes may require approval. If your DNS is managed by an internal team, submit change requests for all four records. Factor in approval time—this can delay verification by hours or days.
After adding records, verify they resolve correctly:
You should see the ALB hostname in the response. If you see “NXDOMAIN” or your old values, wait for DNS propagation (typically 5-30 minutes, up to 48 hours for some providers).
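A scripted check is sketched below. Because an ALIAS record returns only IP addresses rather than the ALB hostname, the sketch compares resolved addresses instead of record contents. Treat it as a heuristic: ALB addresses can change over time, and the function names are hypothetical.

```python
import socket


def resolve_ips(hostname: str) -> set[str]:
    """Resolve a hostname to the set of IP addresses it currently returns."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def points_at_alb(domain: str, alb_hostname: str, resolver=resolve_ips) -> bool:
    """True when the domain and the ALB hostname share at least one IP."""
    try:
        return bool(resolver(domain) & resolver(alb_hostname))
    except socket.gaierror:
        # Covers NXDOMAIN and other lookup failures.
        return False
```

For example, `points_at_alb("app.yourdomain.com", alb_hostname)` with the hostname from the ingress should return True once the record has propagated to your resolver.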
Use a global DNS checker to look up each hostname (e.g., app.yourdomain.com).
Open your frontend URL in a browser:
What you should see:
Certificate errors?
The backend exposes a health endpoint:
Expected response:
Connection refused or timeout?
Check that the pods are running: kubectl get pods -n confident-ai
Expected: {"status":"healthy"}
Expected: {"status":"ok"}
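When scripting these health checks, it helps to validate the response body rather than only the HTTP status. The sketch below assumes you have already fetched the body as text; the helper name is hypothetical, and the expected value ("healthy" or "ok") depends on which service you are checking.

```python
import json


def is_healthy(body: str, expected_status: str) -> bool:
    """True when a JSON health response carries the expected "status" value."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page or an empty body is not a healthy response.
        return False
    return isinstance(payload, dict) and payload.get("status") == expected_status
```

A load balancer error page or a truncated response fails the check instead of raising an exception, which keeps a verification script running through the remaining services.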
https://app.yourdomain.com
OAuth errors?
Check that the redirect URI registered with your OAuth provider is https://api.yourdomain.com/api/auth/callback/google.
confident_subdomain may be misconfigured. It must be the root domain, not the full subdomain.
This verifies database connectivity and basic write operations.
From your local machine (or any machine that can reach the backend):
Expected: True or success message
SDK can’t connect?
If your machine can’t reach the backend directly (internal ALB), run this test from within the same network (VPN) or from a pod inside the cluster.
Expected: Evaluation runs and results appear in the dashboard.
Evaluation fails with API errors?
openai_api_key is not configured or is invalid
Outbound access to api.openai.com is blocked
Your cluster needs outbound HTTPS access to OpenAI (or your configured LLM provider).
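To tell a network problem apart from a key problem, you can test whether a TCP connection to the provider opens at all. The following is a minimal sketch with a hypothetical helper name. It only confirms that port 443 is reachable, not that TLS or the API key works, and it is only meaningful when run from inside the cluster (for example, from a pod).

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0,
              connect=socket.create_connection) -> bool:
    """True when a TCP connection to host:port opens within the timeout."""
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Includes DNS failures, refused connections, and timeouts.
        return False
```

If `can_reach("api.openai.com")` returns False from a pod, look at egress rules first; if it returns True, the API key is the more likely cause.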
Run comprehensive health checks on all services:
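One way to summarize pod health in a single pass is sketched below. It assumes the input is the text of `kubectl get pods -n confident-ai -o json`; the function name is hypothetical, and pods in the Succeeded phase (such as completed jobs) are treated as healthy.

```python
import json


def unhealthy_pods(pods_json: str) -> dict[str, str]:
    """Map pods that are not fully ready to a short reason string.

    Input is the text of `kubectl get pods -n confident-ai -o json`.
    """
    problems = {}
    for pod in json.loads(pods_json).get("items", []):
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        if phase == "Succeeded":
            continue  # completed jobs are not failures
        containers = status.get("containerStatuses", [])
        not_ready = [c["name"] for c in containers if not c.get("ready", False)]
        if phase != "Running" or not_ready:
            problems[pod["metadata"]["name"]] = (
                f"phase={phase}, containers not ready: {not_ready}"
            )
    return problems
```

Any pod that appears in the result is a candidate for `kubectl describe pod` and `kubectl logs`.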
Log levels: By default, logs show INFO level and above. If you need more detail for debugging, you can increase verbosity through environment variables (contact Confident AI support for guidance).
Before announcing the deployment is ready for users, verify:
terraform.tfvars not committed to version control
Don’t panic. Most issues have straightforward fixes:
Check the logs: kubectl logs for pod issues, CloudWatch for AWS services
Check resource state with kubectl get commands
You’ve completed the Confident AI deployment on AWS:
Your deployment is now ready for users. Welcome to Confident AI!
Need help? Contact your Confident AI representative or email support@confident-ai.com with details about your deployment and any issues encountered.