Troubleshooting
Common issues and solutions for the ops0 platform.
Authentication & Access
"Access Denied" when connecting cloud provider
Symptoms: Integration test fails with access denied or insufficient permissions.
Solutions:
AWS: Verify IAM Role
Verify the IAM role trust policy includes ops0's AWS account ID.
AWS: Check External ID
Check the external ID matches what ops0 shows.
GCP: Workload Identity
Ensure Workload Identity Federation is configured correctly.
Azure: Service Principal
Verify the service principal has the Contributor role.
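For the two AWS checks above, a small script can inspect the role's trust policy offline. This is a minimal sketch, not an ops0 tool: `check_trust_policy` is a hypothetical helper, and the account ID and external ID are placeholders that you replace with the values ops0 shows. It assumes the usual trust policy layout, with the trusted account under `Principal.AWS` and the external ID under `Condition.StringEquals["sts:ExternalId"]`.

```python
import json


def check_trust_policy(policy_json: str, account_id: str, external_id: str) -> list[str]:
    """Return a list of problems found in an IAM role trust policy document."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    trusts_account = False
    external_id_matches = False
    for stmt in statements:
        actions = stmt.get("Action", [])
        actions = [actions] if isinstance(actions, str) else actions
        if stmt.get("Effect") != "Allow" or "sts:AssumeRole" not in actions:
            continue
        principal = stmt.get("Principal", {})
        principals = principal.get("AWS", []) if isinstance(principal, dict) else []
        principals = [principals] if isinstance(principals, str) else principals
        # A principal is either a bare account ID or an ARN like arn:aws:iam::<id>:root.
        if not any(p == account_id or f"::{account_id}:" in p for p in principals):
            continue
        trusts_account = True
        condition = stmt.get("Condition", {}).get("StringEquals", {})
        if condition.get("sts:ExternalId") == external_id:
            external_id_matches = True

    problems = []
    if not trusts_account:
        problems.append(f"no sts:AssumeRole statement trusts account {account_id}")
    elif not external_id_matches:
        problems.append("sts:ExternalId condition is missing or does not match")
    return problems
```

You can fetch the policy document with `aws iam get-role --role-name <role> --query Role.AssumeRolePolicyDocument` and pass the JSON to the function. An empty list means both checks passed.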
"SSO login failed"
Symptoms: Unable to log in via SSO, redirect loop, or error page.
Solutions:
| Check | Solution |
|---|---|
| SSO provider configuration | Verify ACS URL and Entity ID match ops0 settings |
| Certificate expiration | Update SAML certificate if expired |
| User provisioning | Ensure user exists in both SSO provider and ops0 |
| Browser cookies | Clear cookies and try again |
Terraform & IaC
"Terraform init failed"
Symptoms: Project fails to initialize, missing providers.
Solutions:
Check provider version constraints
Ensure your required_providers block is valid:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
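If init fails because no provider release matches, it helps to know what the constraint allows. `~> 5.0` is Terraform's pessimistic operator: only the rightmost listed component may increase, so it accepts any 5.x release and rejects 6.0. The sketch below illustrates that rule. `satisfies_pessimistic` is a hypothetical helper, not Terraform's own implementation, and it ignores pre-release suffixes.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '5.31.0' into (5, 31, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies_pessimistic(version: str, constraint: str) -> bool:
    """Check a version against a '~> X.Y' or '~> X.Y.Z' constraint."""
    lower = parse_version(constraint.strip().removeprefix("~>"))
    if len(lower) < 2:
        raise ValueError("this sketch expects at least two version components")
    candidate = parse_version(version)
    # Upper bound: bump the second-to-last component and drop the last one,
    # so '~> 5.0' means >= 5.0 and < 6, and '~> 5.0.1' means >= 5.0.1 and < 5.1.
    upper = lower[:-2] + (lower[-2] + 1,)
    return lower <= candidate and candidate[: len(upper)] < upper
```

For example, `satisfies_pessimistic("5.31.0", "~> 5.0")` is true, while `"6.0.0"` and `"4.67.0"` both fail the same constraint.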
"State lock failed"
Symptoms: Can't run plan or apply, state is locked.
Solutions:
- Wait for other operations to complete - Another deployment may be in progress
- Check for failed workflows - A crashed workflow may have left a lock
- Contact support - If lock is stuck, support can safely release it
"Terraform plan shows unexpected changes"
Symptoms: Plan shows changes even though you didn't modify anything.
Common causes:
| Cause | Description |
|---|---|
| Drift | Someone changed resources manually in the console |
| Provider upgrade | New provider version changed default values |
| State refresh | Cloud API returned different values |
Run Drift Detection to see what changed and decide whether to update code or revert the change.
Kubernetes
"Cluster not connecting"
Symptoms: Cluster shows disconnected, no pod data.
Solutions:
Check Hive agent is running
kubectl get pods -n ops0
Check agent logs
kubectl logs -n ops0 -l app=hive-agent
Verify outbound connectivity
Ensure connectivity from the agent to the ops0 control plane endpoint used by brew.ops0.ai over port 443.
Check network policies
Check for network policies blocking egress.
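To test the outbound path, you can attempt a plain TCP connection on port 443 from inside the cluster network, for example from a debug pod. The sketch below uses only the Python standard library. `can_connect` is a hypothetical helper, and the host you pass is the control plane endpoint shown in your ops0 settings, which this page does not spell out. It confirms TCP reachability only. It does not validate TLS or proxy configuration.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A `False` result from inside the cluster, combined with a `True` result from your workstation, points to an egress restriction such as a network policy or firewall rule.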
"Hive agent CrashLoopBackOff"
Symptoms: Hive agent pod keeps restarting.
Solutions:
- Check logs for specific error:
  kubectl logs -n ops0 -l app=hive-agent --previous
- Verify RBAC permissions are applied:
  kubectl auth can-i list pods --as=system:serviceaccount:ops0:hive-agent
- Ensure cluster token is valid (regenerate in ops0 if needed)
Workflows
"Workflow stuck in pending"
Symptoms: Workflow won't start, shows pending forever.
Solutions:
| Check | Solution |
|---|---|
| Trigger condition | Verify trigger event actually occurred |
| Approval step | Check if workflow is waiting for approval |
| Resource limits | ops0 may be rate-limiting concurrent workflows |
"Approval not sending notifications"
Symptoms: Approval step reached but no Slack/email notification.
Solutions:
- Verify Slack integration is connected in Settings → Integrations
- Check the notification channel exists and ops0 app is invited
- Confirm approvers are configured in the workflow step
Policies
"Policy always fails"
Symptoms: Policy fails on every deployment, even compliant ones.
Solutions:
Test your Rego policy
Use the policy editor's "Test" feature with sample input to debug.
Check for typos in resource types
Example: Use aws_s3_bucket, not aws_s3.
Verify input structure
Click "View Input" in a failed policy check to see the actual JSON being evaluated.
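Once you have the input JSON, listing the resource types it contains makes a mistyped type in the policy easy to spot. The sketch below assumes the input follows the Terraform plan JSON layout, with a top-level `resource_changes` array whose entries carry a `type` field. That is an assumption about what ops0 passes to the policy, so compare it against what "View Input" shows. `resource_types` is a hypothetical helper.

```python
import json
from collections import Counter


def resource_types(input_json: str) -> Counter:
    """Count resource types in a Terraform-plan-style JSON document."""
    doc = json.loads(input_json)
    # Assumed layout: {"resource_changes": [{"address": ..., "type": ...}, ...]}
    return Counter(
        change["type"] for change in doc.get("resource_changes", []) if "type" in change
    )
```

If the policy matches on `aws_s3` but the counter contains only `aws_s3_bucket`, the rule never sees the resources it is meant to evaluate.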
GitHub Integration
"PR comments not appearing"
Symptoms: ops0 not posting plan results to pull requests.
Solutions:
- Verify GitHub app is installed on the repository
- Check GitHub app permissions include "Write" on pull requests
- Confirm the project is connected to the correct repository
- Check webhook delivery status in GitHub repo settings
"Sync conflicts"
Symptoms: Push/pull fails with merge conflicts.
Solutions:
- Pull latest changes from GitHub first
- Resolve conflicts in the ops0 editor
- Push the merged result
GitLab Integration
"GitLab sync not working"
Symptoms: Files not syncing, merge request creation fails.
Solutions:
| Check | Solution |
|---|---|
| Token type | Use a Personal, Group, or Project Access Token with api scope |
| Token expiration | Regenerate if expired |
| Self-hosted URL | Verify the GitLab instance URL is correct in integration settings |
| Repository access | Ensure the token has access to the target repository |
Oxid / Query Console
"Oxid init failed"
Symptoms: Enabling Oxid on a project fails during initialization.
Solutions:
- Verify the PostgreSQL connection URL format (postgres:// or postgresql://)
- Ensure the database is accessible from the ops0 platform
- Check that the database user has create table permissions
- Review the streaming log output for specific error messages
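The URL format check in the first item can be done before you paste the value into ops0. This is a minimal sketch using the standard library's URL parser. `check_postgres_url` is a hypothetical helper, and the host check is an added assumption: ops0 connects over the network, so a URL without a host is unlikely to work.

```python
from urllib.parse import urlsplit


def check_postgres_url(url: str) -> list[str]:
    """Return a list of problems found in a PostgreSQL connection URL."""
    parts = urlsplit(url)
    problems = []
    if parts.scheme not in ("postgres", "postgresql"):
        problems.append(f"scheme is {parts.scheme!r}, expected postgres or postgresql")
    # Assumption: a remote platform needs an explicit host to connect to.
    if not parts.hostname:
        problems.append("URL has no host")
    return problems
```

An empty list means the URL has an accepted scheme and a host. It says nothing about whether the database is reachable or the credentials are valid.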
"Query Console returns no results"
Symptoms: SQL queries return empty results despite having deployed resources.
Solutions:
- Ensure Oxid is enabled and configured on the project
- Run a deployment or trigger an Oxid sync to populate the state database
- Check the schema browser to verify tables exist
- Only SELECT queries are allowed; write operations are blocked
Common Error Messages
Error: Invalid credentials
Your cloud integration credentials are invalid or expired. Go to Settings → Integrations and re-authenticate.
Error: State lock held
Another deployment is in progress for this project. Wait for it to complete or cancel it if stuck.
Error: Policy violation
Your changes violate a blocking policy. Review the violation details and fix your code before deploying.
Error: Cluster unreachable
ops0 can't connect to your Kubernetes cluster. Check that the Hive agent is running and can reach the ops0 control plane on port 443.
Error: Rate limit exceeded
You've hit API rate limits with your cloud provider. Wait a few minutes and retry, or request limit increases.
Getting Help
If you can't resolve an issue:
- Company Site - Use the company site for contact and product context
- Open the Product - Go directly to the ops0 workspace
When contacting support, include:
- Organization ID (found in Settings)
- Project/cluster name
- Error messages and screenshots
- Steps to reproduce