
Add Kubernetes Cluster

Connect a Kubernetes cluster to ops0 for monitoring, incident detection, and pod management.

Connection Methods

  • Hive Agent (Recommended) - Outbound-only connection. Install agent in cluster for full features.
  • Direct Connection - Connect via kubeconfig or cloud provider credentials.

Add Cluster Wizard

1. Name - Enter display name and optional description
2. Provider - Select EKS, GKE, AKS, or Self-managed
3. Credentials - Provide provider-specific authentication
4. Connect - Test and establish connection

AWS EKS

EKS Required Fields

Field            Description
AWS Integration  Select from configured integrations
Region           AWS region (e.g., us-east-1)
Cluster Name     EKS cluster name from AWS

IAM Permissions

The AWS integration IAM role requires:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "eks:DescribeCluster",
        "eks:ListClusters"
      ],
      "Resource": "*"
    }
  ]
}

Additionally, the IAM role must be mapped in the cluster's aws-auth ConfigMap for Kubernetes API access.
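
As a rough illustration of the mapping step, the sketch below prints a `mapRoles` entry for an IAM role. The role ARN, the username `ops0`, and the group `ops0-readers` are placeholders chosen for this example, not values defined by ops0, and the group would still need to be bound to a ClusterRole inside the cluster. The function only prints text; it does not modify the cluster.

```shell
#!/usr/bin/env bash
# Print an aws-auth mapRoles entry for an IAM role.
# Assumption: the default username and group are illustrative, not ops0 defaults.
render_map_role() {
  local role_arn="$1"
  local username="${2:-ops0}"
  local group="${3:-ops0-readers}"
  cat <<EOF
- rolearn: ${role_arn}
  username: ${username}
  groups:
    - ${group}
EOF
}

# Example: print the entry for a placeholder role ARN.
render_map_role "arn:aws:iam::123456789012:role/ops0-integration"
```

The printed entry can be added under `mapRoles` with `kubectl edit configmap aws-auth -n kube-system`. Clusters that rely on EKS access entries rather than the ConfigMap would need a different procedure.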


Google GKE

GKE Required Fields

Field                 Description
Project ID            GCP project containing the cluster
Cluster Name          GKE cluster name
Location              Region or zone (e.g., us-central1)
Service Account JSON  GCP service account key

Service Account Roles

Create a service account with these roles:

Role                           Purpose
roles/container.clusterViewer  View cluster and workloads
roles/container.developer      Pod exec and log access
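
One possible way to create such a service account with the gcloud CLI is sketched below. The account name `ops0-connector`, the key file name, and the project ID `my-gcp-project` are hypothetical placeholders. The script prints the commands by default so they can be reviewed first; setting `DRY_RUN=0` executes them.

```shell
#!/usr/bin/env bash
# Create a GCP service account, grant the two roles listed above, and export a key.
# Commands are printed by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_gke_service_account() {
  local project="$1"
  local name="${2:-ops0-connector}"   # hypothetical account name
  local email="${name}@${project}.iam.gserviceaccount.com"
  local role

  run gcloud iam service-accounts create "$name" --project "$project"
  for role in roles/container.clusterViewer roles/container.developer; do
    run gcloud projects add-iam-policy-binding "$project" \
      --member "serviceAccount:${email}" --role "$role"
  done
  # The JSON key written here is what the Service Account JSON field expects.
  run gcloud iam service-accounts keys create "${name}-key.json" \
    --iam-account "$email"
}

# Example with a placeholder project ID (dry run unless DRY_RUN=0 is set).
setup_gke_service_account "my-gcp-project"
```

Service account keys are long-lived credentials, so the generated key file should be stored securely and deleted locally once it has been pasted into the wizard.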

Azure AKS

AKS Required Fields

Field            Description
Subscription ID  Azure subscription ID
Resource Group   Resource group containing cluster
Cluster Name     AKS cluster name
Tenant ID        Azure AD tenant ID
Client ID        Service principal application ID
Client Secret    Service principal secret

Service Principal

The service principal needs the Azure Kubernetes Service Cluster User Role, or equivalent RBAC permissions, on the cluster.
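
A minimal sketch of creating such a service principal with the Azure CLI follows. The subscription ID, resource group, cluster name, and the principal name `ops0-connector` are placeholders. The script builds the cluster's resource ID and prints the `az` command rather than running it.

```shell
#!/usr/bin/env bash
# Build the Azure resource ID of an AKS cluster, used as the role assignment scope.
aks_cluster_scope() {
  local subscription="$1" resource_group="$2" cluster="$3"
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.ContainerService/managedClusters/%s\n' \
    "$subscription" "$resource_group" "$cluster"
}

# Placeholder values; replace with your own subscription, resource group, and cluster.
scope="$(aks_cluster_scope "00000000-0000-0000-0000-000000000000" "prod-rg" "prod-aks")"

# Print (not run) the command that creates a service principal scoped to the cluster.
# "ops0-connector" is a hypothetical name.
printf 'az ad sp create-for-rbac --name ops0-connector \\\n  --role "Azure Kubernetes Service Cluster User Role" \\\n  --scopes %s\n' "$scope"
```

When run, `az ad sp create-for-rbac` returns JSON containing `appId`, `password`, and `tenant`, which correspond to the Client ID, Client Secret, and Tenant ID fields above.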


Self-Managed Clusters

Use this option for any Kubernetes cluster (on-prem, k3s, kind, etc.) that is accessible via kubeconfig.

Self-Managed Cluster Required Fields

Field       Description
Kubeconfig  Full kubeconfig YAML content

Getting Your Kubeconfig

# Full config (all contexts)
cat ~/.kube/config

# Single cluster, flattened
kubectl config view --minify --flatten

Requirements

  • API server must be reachable from ops0
  • Valid credentials (token, certificate, or exec auth)
  • Appropriate RBAC permissions
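
Before pasting a kubeconfig into the wizard, a quick local sanity check can catch common problems. The sketch below is a text-level heuristic written for this page, not a YAML parser and not ops0's own validation. It looks for a current context, a server URL, and certificate references that point at local files, which do not travel with pasted content.

```shell
#!/usr/bin/env bash
# Rough pre-flight check for a kubeconfig file.
# Returns 0 if no problems are found, 1 otherwise, printing one line per problem.
check_kubeconfig() {
  local file="$1" status=0
  grep -q '^current-context:' "$file" \
    || { echo "missing current-context"; status=1; }
  grep -Eq '^[[:space:]]+server:[[:space:]]+https?://' "$file" \
    || { echo "missing cluster server URL"; status=1; }
  # --flatten inlines certificates as *-data keys; plain keys reference local files.
  if grep -Eq '^[[:space:]]+(certificate-authority|client-certificate|client-key):' "$file"; then
    echo "references local certificate files; re-export with --flatten"
    status=1
  fi
  return "$status"
}

# Typical use (requires kubectl and a working local context):
#   kubectl config view --minify --flatten > /tmp/ops0-kubeconfig.yaml
#   check_kubeconfig /tmp/ops0-kubeconfig.yaml && echo "looks complete"
```

Passing this check does not confirm that the API server is reachable from ops0 or that the credentials carry the required RBAC permissions.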

Install Hive Agent

The Hive agent enables real-time features without opening inbound connections to your cluster.

Why Hive Agent?

  • Outbound Only - No ingress rules or public endpoints needed
  • Real-time Logs - Stream container logs live
  • Pod Exec - Terminal access to containers
  • Metrics - CPU and memory utilization

Installation via Helm

After connecting your cluster, install the agent:

helm repo add ops0 https://charts.ops0.io
helm repo update

helm install hive-agent ops0/hive-agent \
  --namespace ops0-system \
  --create-namespace \
  --set token=YOUR_CLUSTER_TOKEN \
  --set endpoint=https://api.ops0.io

The token is provided in the ops0 UI after adding the cluster. Click Install Hive Agent on the cluster detail page to copy the complete command.

Verify Installation

kubectl get pods -n ops0-system

Once the agent pod is running, the cluster status in ops0 changes to Connected.
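
For scripted installs, the check above can be automated. The helper below is a sketch that reads `kubectl get pods --no-headers` output on standard input and succeeds only when at least one pod is listed and every pod is Running with all containers ready. The function name `pods_ready` is made up for this example.

```shell
#!/usr/bin/env bash
# Succeeds only if stdin lists at least one pod and every pod is
# Running with READY showing all containers ready (for example 1/1).
pods_ready() {
  awk '
    {
      seen = 1
      split($2, ready, "/")            # READY column, e.g. "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}

# Typical use (requires kubectl access to the cluster):
#   kubectl get pods -n ops0-system --no-headers | pods_ready && echo "agent ready"
```

kubectl also has a built-in alternative: `kubectl wait --for=condition=Ready pods --all -n ops0-system --timeout=120s`.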


Required RBAC

For full ops0 functionality, apply this ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ops0-reader
rules:
  # Core resources
  - apiGroups: [""]
    resources:
      - pods
      - pods/log
      - services
      - namespaces
      - nodes
      - events
    verbs: ["get", "list", "watch"]

  # Workload resources
  - apiGroups: ["apps"]
    resources:
      - deployments
      - replicasets
      - statefulsets
      - daemonsets
    verbs: ["get", "list", "watch"]

  # Terminal access (optional)
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]

Bind this role to your service account or user.
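
The binding step could look like the sketch below, which prints the `kubectl create clusterrolebinding` command for either a service account or a user. The binding name `ops0-reader-binding` and the example subject `ops0-system:ops0` are assumptions made for illustration; use the identity that your connection method actually authenticates as.

```shell
#!/usr/bin/env bash
# Print the kubectl command that binds the ops0-reader ClusterRole to a subject.
# kind: "serviceaccount" (subject as namespace:name) or "user" (subject as user name).
binding_command() {
  local kind="$1" subject="$2"
  local base="kubectl create clusterrolebinding ops0-reader-binding --clusterrole=ops0-reader"
  case "$kind" in
    serviceaccount) echo "${base} --serviceaccount=${subject}" ;;
    user)           echo "${base} --user=${subject}" ;;
    *)              echo "unknown subject kind: ${kind}" >&2; return 1 ;;
  esac
}

# Example: hypothetical service account "ops0" in the ops0-system namespace.
binding_command serviceaccount ops0-system:ops0
```

After creating the binding, `kubectl auth can-i list pods --as=system:serviceaccount:ops0-system:ops0` reports whether the subject has the expected access.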


Troubleshooting

  • Unable to Connect - Check firewall rules and security groups, and verify that the API server endpoint is reachable. For private clusters, ensure a network path exists from ops0 to the API server.
  • Access Denied - Verify RBAC permissions. For EKS, check the aws-auth ConfigMap. For GKE, verify the service account roles. Regenerate credentials if they have expired.
  • Certificate Error - Ensure the CA certificate is included in the kubeconfig. For self-signed certificates, the CA must be trusted or explicitly provided.
  • Timeout - Usually caused by network latency or cluster overload. Retry, or check cluster health. Large clusters may take longer to enumerate resources.
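
For connection failures, it can help to confirm which host and port the kubeconfig points at before checking firewalls. The helper below is a small sketch that splits an API server URL into host and port, defaulting to 443 when no port is given. It does not handle bracketed IPv6 addresses, and the function name is made up for this example.

```shell
#!/usr/bin/env bash
# Split an API server URL into "host port" (port defaults to 443).
api_host_port() {
  local url="${1#*://}"     # drop the scheme
  url="${url%%/*}"          # drop any path
  local host="${url%%:*}" port="443"
  case "$url" in
    *:*) port="${url##*:}" ;;
  esac
  echo "$host $port"
}

# Typical use (requires kubectl and nc):
#   server="$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"
#   read -r host port <<<"$(api_host_port "$server")"
#   nc -vz -w 5 "$host" "$port"
```

A successful probe from a workstation shows only that the endpoint is listening. It does not prove that the endpoint is reachable from ops0.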

After Connection

Once connected, you can:

  • View cluster overview - Node count, pod status, resource usage
  • Monitor incidents - Auto-detected CrashLoops, OOM, failures
  • Browse pods - Filter by namespace, view details
  • Access logs - Stream container output in real-time
  • Open terminals - Exec into running containers

Example: Adding an EKS Cluster

Wizard Configuration

Step 1: Name

Display Name:   production-eks
Description:    Production EKS cluster in us-east-1

Step 2: Provider

Provider:       AWS EKS

Step 3: Credentials

AWS Integration:  AWS Production (123456789012)
Region:           us-east-1
Cluster Name:     production-cluster

Step 4: Connect

Testing connection to production-cluster...
✓ AWS credentials valid
✓ EKS cluster found
✓ Kubernetes API accessible
✓ RBAC permissions verified

Connection successful!

Install Hive Agent After Connection

After connection, the UI shows the Helm command:

# Add ops0 Helm repository
helm repo add ops0 https://charts.ops0.io
helm repo update

# Install Hive agent
helm install hive-agent ops0/hive-agent \
  --namespace ops0-system \
  --create-namespace \
  --set token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9... \
  --set endpoint=https://api.ops0.io \
  --set clusterName=production-eks

Verify Agent Status

$ kubectl get pods -n ops0-system

NAME                          READY   STATUS    RESTARTS   AGE
hive-agent-7b8f9c6d5-xk2lp    1/1     Running   0          30s

$ kubectl logs -n ops0-system hive-agent-7b8f9c6d5-xk2lp

2024-01-15T10:30:00Z INFO  Starting Hive Agent v2.1.0
2024-01-15T10:30:01Z INFO  Connected to ops0 API
2024-01-15T10:30:02Z INFO  Cluster registered: production-eks
2024-01-15T10:30:02Z INFO  Starting metrics collection
2024-01-15T10:30:02Z INFO  Starting event watcher
2024-01-15T10:30:03Z INFO  Agent ready, awaiting commands

Result in ops0 UI

Cluster: production-eks
Status:  Connected ✓

Connection Details:
─────────────────────────────────────
Method:         Hive Agent
Agent Version:  v2.1.0
Last Heartbeat: 2 seconds ago
Features:       Logs, Exec, Metrics, Events