Troubleshooting

Common issues and solutions for the ops0 platform.

Authentication & Access

"Access Denied" when connecting cloud provider

Symptoms: Integration test fails with access denied or insufficient permissions.

Solutions:

AWS: Verify IAM Role

Verify the IAM role trust policy includes ops0's AWS account ID.

AWS: Check External ID

Check the external ID matches what ops0 shows.
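Both AWS checks can be done from a terminal by printing the role's trust policy and searching it for the expected values. This is a minimal sketch, assuming the AWS CLI is installed and your credentials allow iam:GetRole. The role name ops0-integration in the usage comment is a hypothetical placeholder; take the account ID and external ID from the ops0 integration screen.

```shell
# Print the trust policy of an IAM role (requires the AWS CLI and iam:GetRole).
show_trust_policy() {
  aws iam get-role \
    --role-name "$1" \
    --query 'Role.AssumeRolePolicyDocument' \
    --output json
}

# Report whether a value (account ID or external ID) appears in the trust policy.
trust_policy_contains() {
  role_name="$1"
  expected="$2"
  if show_trust_policy "$role_name" | grep -qF -- "$expected"; then
    echo "found: $expected"
  else
    echo "missing: $expected"
    return 1
  fi
}

# Usage (hypothetical role name, placeholder values):
#   trust_policy_contains ops0-integration "<ops0 AWS account ID>"
#   trust_policy_contains ops0-integration "<external ID shown in ops0>"
```

In a typical cross-account trust policy, the account ID appears in the Principal of the statement and the external ID under a StringEquals condition on sts:ExternalId.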

GCP: Workload Identity

Ensure Workload Identity Federation is configured correctly.

Azure: Service Principal

Verify the service principal has the Contributor role.
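One way to confirm the role assignment is to list the roles held by the service principal with the Azure CLI. A minimal sketch, assuming the Azure CLI is installed and you are logged in to the right subscription; the function names are illustrative.

```shell
# List the role names assigned to a service principal (requires the Azure CLI).
# $1 is the service principal's application (client) ID or object ID.
list_sp_roles() {
  az role assignment list \
    --assignee "$1" \
    --all \
    --query '[].roleDefinitionName' \
    --output tsv
}

# Succeed only if one of the assigned roles is exactly "Contributor".
has_contributor_role() {
  list_sp_roles "$1" | grep -qx 'Contributor'
}

# Usage (placeholder ID):
#   has_contributor_role "<application-id>" && echo "Contributor assigned"
```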

"SSO login failed"

Symptoms: Unable to log in via SSO, redirect loop, or error page.

Solutions:

Check	Solution
SSO provider configuration	Verify ACS URL and Entity ID match ops0 settings
Certificate expiration	Update SAML certificate if expired
User provisioning	Ensure user exists in both SSO provider and ops0
Browser cookies	Clear cookies and try again
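For the certificate check, openssl can report the expiry date if you have the SAML signing certificate as a PEM file. A minimal sketch; the file name in the usage comment is a hypothetical placeholder, and how you export the certificate depends on your SSO provider.

```shell
# Print the expiry date of a PEM-encoded certificate.
cert_end_date() {
  openssl x509 -in "$1" -noout -enddate
}

# Succeed if the certificate is still valid N days from now (default: 0, i.e. right now).
cert_valid_for_days() {
  days="${2:-0}"
  openssl x509 -in "$1" -noout -checkend "$((days * 86400))" >/dev/null
}

# Usage (hypothetical file name):
#   cert_end_date saml-signing.pem
#   cert_valid_for_days saml-signing.pem 30 || echo "expires within 30 days"
```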

Terraform & IaC

"Terraform init failed"

Symptoms: Project fails to initialize, missing providers.

Solutions:

Check provider version constraints

Ensure your required_providers block is valid:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
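The constraint ~> 5.0 accepts any 5.x release of the provider but not 6.0. If you have the project's code checked out, re-running init locally can surface the provider error directly. A minimal sketch, assuming Terraform 0.14 or later is installed (for -chdir); the directory path in the usage comment is a hypothetical placeholder.

```shell
# Re-run init in a project directory without configuring the backend,
# then validate the configuration.
check_terraform_init() {
  dir="${1:-.}"
  terraform -chdir="$dir" init -backend=false -input=false -upgrade &&
    terraform -chdir="$dir" validate
}

# Usage (hypothetical path):
#   check_terraform_init ./infra
```

Using -backend=false keeps the check away from remote state, so it only exercises provider and module resolution.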

"State lock failed"

Symptoms: Can't run plan or apply, state is locked.

Solutions:

Wait for other operations to complete - Another deployment may be in progress
Check for failed workflows - A crashed workflow may have left a lock
Contact support - If lock is stuck, support can safely release it

"Terraform plan shows unexpected changes"

Symptoms: Plan shows changes even though you didn't modify anything.

Common causes:

Cause	Explanation
Drift	Someone changed resources manually in the console
Provider upgrade	New provider version changed default values
State refresh	Cloud API returned different values

Run Drift Detection to see what changed and decide whether to update code or revert the change.

Kubernetes

"Cluster not connecting"

Symptoms: Cluster shows disconnected, no pod data.

Solutions:

Check ops0 agent is running

kubectl get pods -n ops0

Check agent logs

kubectl logs -n ops0 -l app=ops0-agent

Verify outbound connectivity

Ensure the agent can open outbound connections on port 443 (HTTPS) to the ops0 control plane endpoint used by brew.ops0.ai.
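Outbound connectivity can be tested from inside the cluster with a throwaway curl pod in the agent's namespace. A minimal sketch: the default host brew.ops0.ai is an assumption (substitute the control plane host shown for your organization), and the pod name ops0-egress-test and the curlimages/curl image are illustrative choices.

```shell
# Run a temporary curl pod and print the HTTP status returned by the endpoint.
# Any HTTP status means the TCP/TLS path works; a timeout or connection error
# suggests egress is blocked.
check_egress() {
  host="${1:-brew.ops0.ai}"   # assumption: replace with your control plane host
  namespace="${2:-ops0}"
  kubectl run ops0-egress-test \
    --namespace "$namespace" \
    --image=curlimages/curl \
    --restart=Never --rm -i --quiet \
    --command -- \
    curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 "https://${host}"
}

# Usage:
#   check_egress
#   check_egress control-plane.example.com ops0
```

The test pod does not carry the agent's labels, so a network policy that selects only the agent pods will not apply to it. Treat a successful result as necessary but not sufficient.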

Check network policies

Check for network policies blocking egress.
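To see which policies exist in the agent's namespace and whether they cover egress, you can list each policy with its policy types. A minimal sketch using standard kubectl output options; the function name is illustrative.

```shell
# Print each NetworkPolicy in the namespace with its policyTypes.
list_network_policies() {
  namespace="${1:-ops0}"
  kubectl get networkpolicy --namespace "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.policyTypes}{"\n"}{end}'
}

# Usage:
#   list_network_policies
#   kubectl describe networkpolicy -n ops0   # full rules for each policy
```

A policy that lists Egress in its policy types but defines no egress rules denies all outbound traffic from the pods it selects.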

"ops0 agent CrashLoopBackOff"

Symptoms: ops0 agent pod keeps restarting.

Solutions:

Check logs for specific error: kubectl logs -n ops0 -l app=ops0-agent --previous
Verify RBAC permissions are applied: kubectl auth can-i list pods --as=system:serviceaccount:ops0:ops0-agent
Ensure cluster token is valid (regenerate in ops0 if needed)
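The RBAC check above can be repeated for several permissions at once. A minimal sketch: the verb and resource pairs below are illustrative assumptions, not the agent's documented requirements, so adjust them to match the RBAC manifest you applied.

```shell
# Check a set of verb/resource pairs for the agent's service account.
# Written for bash/sh; the pairs are examples only.
check_agent_rbac() {
  sa="system:serviceaccount:${1:-ops0}:${2:-ops0-agent}"
  for check in "list pods" "get pods" "list deployments.apps" "list namespaces"; do
    # Intentionally unquoted so "list pods" splits into verb and resource.
    result=$(kubectl auth can-i $check --as="$sa" 2>/dev/null)
    printf '%-24s %s\n' "$check" "${result:-no}"
  done
}

# Usage:
#   check_agent_rbac              # namespace ops0, service account ops0-agent
#   check_agent_rbac ops0 ops0-agent
```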

Workflows

"Workflow stuck in pending"

Symptoms: Workflow won't start, shows pending forever.

Solutions:

Check	Solution
Trigger condition	Verify trigger event actually occurred
Approval step	Check if workflow is waiting for approval
Resource limits	ops0 may be rate-limiting concurrent workflows

"Approval not sending notifications"

Symptoms: Approval step reached but no Slack/email notification.

Solutions:

Verify Slack integration is connected in Settings > Integrations
Check the notification channel exists and ops0 app is invited
Confirm approvers are configured in the workflow step

Policies

"Policy always fails"

Symptoms: Policy fails on every deployment, even compliant ones.

Solutions:

Test your Rego policy

Use the policy editor's "Test" feature with sample input to debug.

Check for typos in resource types

Example: Use aws_s3_bucket, not aws_s3.

Verify input structure

Click "View Input" in a failed policy check to see the actual JSON being evaluated.
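If you save the JSON from "View Input" to a file, you can inspect it and evaluate the policy locally. A minimal sketch, assuming the policy is standard Rego that the OPA CLI can evaluate and that resource types appear under "type" keys in the input; the file names and the query data.main.deny are hypothetical placeholders.

```shell
# List the distinct values of "type" keys in a saved input document,
# to compare against the resource types your policy matches on.
list_resource_types() {
  grep -o '"type"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" |
    sed 's/.*"\([^"]*\)"$/\1/' |
    sort -u
}

# Evaluate a Rego query against the saved input (requires the opa CLI).
# $1 = policy file, $2 = input JSON, $3 = query
eval_policy() {
  opa eval --format pretty --data "$1" --input "$2" "$3"
}

# Usage (hypothetical file names and query):
#   list_resource_types input.json
#   eval_policy policy.rego input.json 'data.main.deny'
```

If the policy references aws_s3 but the listing shows only aws_s3_bucket, the rule never matches the resources it was meant to check.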

GitHub Integration

"PR comments not appearing"

Symptoms: ops0 not posting plan results to pull requests.

Solutions:

Verify GitHub app is installed on the repository
Check GitHub app permissions include "Write" on pull requests
Confirm the project is connected to the correct repository
Check webhook delivery status in GitHub repo settings

"Sync conflicts"

Symptoms: Push/pull fails with merge conflicts.

Solutions:

Pull latest changes from GitHub first
Resolve conflicts in the ops0 editor
Push the merged result

GitLab Integration

"GitLab sync not working"

Symptoms: Files not syncing, merge request creation fails.

Solutions:

Check	Solution
Token type	Use a Personal, Group, or Project Access Token with `api` scope
Token expiration	Regenerate if expired
Self-hosted URL	Verify the GitLab instance URL is correct in integration settings
Repository access	Ensure the token has access to the target repository
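The token, URL, and repository access checks can be combined into one request against the GitLab REST API. A minimal sketch; the instance URL and project path in the usage comment are hypothetical placeholders.

```shell
# Print the HTTP status GitLab returns when the token reads a project.
gitlab_project_status() {
  base_url="${1%/}"      # for example https://gitlab.example.com
  project_path="$2"      # for example my-group/my-repo
  token="$3"
  # The API expects the project path URL-encoded ("/" becomes "%2F").
  encoded=$(printf '%s' "$project_path" | sed 's|/|%2F|g')
  curl -sS -o /dev/null -w '%{http_code}\n' \
    --header "PRIVATE-TOKEN: ${token}" \
    "${base_url}/api/v4/projects/${encoded}"
}

# Usage (hypothetical values):
#   gitlab_project_status https://gitlab.example.com my-group/my-repo "$GITLAB_TOKEN"
```

A 200 means the token is valid and can read the project. A 401 typically points to an invalid or expired token, a 403 to a missing scope, and a 404 to a wrong path or a token without access to the repository.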

Oxid / Query Console

"Oxid init failed"

Symptoms: Enabling Oxid on a project fails during initialization.

Solutions:

Verify the PostgreSQL connection URL format (postgres:// or postgresql://)
Ensure the database is accessible from the ops0 platform
Check that the database user has create table permissions
Review the streaming log output for specific error messages
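The URL format and create-table checks above can be sketched from a terminal. This assumes psql is installed; note that it tests access from your machine, not from the ops0 platform, so a success here does not rule out a network restriction on the ops0 side. The connection URL in the usage comment is a hypothetical placeholder.

```shell
# Succeed if the URL starts with one of the accepted schemes.
is_postgres_url() {
  case "$1" in
    postgres://?*|postgresql://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Ask the server whether the connecting user may create objects in a schema.
# Prints "t" (true) or "f" (false). $2 is the schema name (default: public).
can_create_tables() {
  psql "$1" --tuples-only --no-align \
    --command "SELECT has_schema_privilege(current_user, '${2:-public}', 'CREATE');"
}

# Usage (hypothetical URL):
#   url='postgresql://oxid_user:secret@db.example.com:5432/oxid'
#   is_postgres_url "$url" && can_create_tables "$url"
```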

"Query Console returns no results"

Symptoms: SQL queries return empty results despite having deployed resources.

Solutions:

Ensure Oxid is enabled and configured on the project
Run a deployment or trigger an Oxid sync to populate the state database
Check the schema browser to verify tables exist
Confirm you are running a SELECT query; only SELECT queries are allowed and write operations are blocked

Common Error Messages

Error: Invalid credentials

Your cloud integration credentials are invalid or expired. Go to Settings > Integrations and re-authenticate.

Error: State lock held

Another deployment is in progress for this project. Wait for it to complete or cancel it if stuck.

Error: Policy violation

Your changes violate a blocking policy. Review the violation details and fix your code before deploying.

Error: Cluster unreachable

ops0 can't connect to your Kubernetes cluster. Check that the agent is running and can reach the internet.

Error: Rate limit exceeded

You've hit API rate limits with your cloud provider. Wait a few minutes and retry, or request limit increases.
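If your own scripts call the cloud provider's API around deployments, wrapping those calls in a retry with exponential backoff can reduce rate limit failures. A minimal sketch; the function name and the command in the usage comment are illustrative, and this does not change how ops0 itself retries.

```shell
# Retry a command with exponential backoff.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while true; do
    # Run the command; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))   # double the wait before the next attempt
  done
}

# Usage (illustrative command):
#   retry_with_backoff 5 30 aws ec2 describe-instances
```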

Getting Help

If you can't resolve an issue:

Company Site

Use the company site for contact and product context.

Open the Product

Go directly to the ops0 workspace.

When contacting support, include:

Organization ID (found in Settings)
Project/cluster name
Error messages and screenshots
Steps to reproduce
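For cluster issues, the agent status and logs referenced earlier can be collected into one directory to attach to a support request. A minimal sketch reusing the kubectl commands from the Kubernetes section; the output directory name is a hypothetical default. Review the files for secrets before sharing them.

```shell
# Save agent pod status and recent logs to a directory and print its path.
collect_agent_diagnostics() {
  out_dir="${1:-ops0-diagnostics}"
  mkdir -p "$out_dir"
  kubectl get pods -n ops0 -o wide > "$out_dir/pods.txt" 2>&1
  kubectl logs -n ops0 -l app=ops0-agent --tail=500 > "$out_dir/agent.log" 2>&1
  # Logs from the previous container instance, useful after a crash.
  kubectl logs -n ops0 -l app=ops0-agent --previous --tail=500 \
    > "$out_dir/agent-previous.log" 2>&1
  echo "$out_dir"
}

# Usage:
#   collect_agent_diagnostics ./ops0-diagnostics
```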

Next Steps

Getting Started

New to ops0? Start here

FAQ

Common questions answered

Glossary

ops0 terminology explained
