ops0

Pods

View pod details, stream container logs, and open terminal sessions for debugging. Access pods from the cluster detail page or directly from incidents.

Pod List

The Pods tab shows all pods in the cluster with filtering options:

Column     Description
──────     ───────────
Name       Pod name
Namespace  Kubernetes namespace
Status     Running, Pending, Failed, etc.
Ready      Ready containers / total containers
Restarts   Total restart count across all containers
Age        Time since pod creation
Node       Node the pod is scheduled on

Pod Status

Status     Color   Meaning
──────     ─────   ───────
Running    Green   All containers running and healthy
Pending    Yellow  Waiting for scheduling or image pull
Succeeded  Gray    All containers completed successfully (Jobs)
Failed     Red     One or more containers terminated with an error
Unknown    Gray    Pod state cannot be determined

Filtering Pods

Filter     Description
──────     ───────────
Namespace  Select a specific namespace or "All Namespaces"
Status     Running, Pending, Failed, Succeeded, or All
Search     Filter by pod name (real-time)
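The same list and filters can be reproduced with kubectl if you prefer the CLI (the namespace and pod name here are illustrative, and the commands assume access to the cluster):

```shell
# All pods in a namespace, including the Node column (-o wide)
kubectl get pods -n production -o wide

# Filter by status (pod phase)
kubectl get pods -n production --field-selector status.phase=Running

# Filter by pod name, client-side, like the Search box
kubectl get pods -n production --no-headers | grep 'api-gateway'
```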

Pod Details

Click any pod to open the detail panel:

Pod Information

Field        Description
─────        ───────────
Name         Full pod name
Namespace    Kubernetes namespace
Status       Current pod phase
Pod IP       Cluster IP assigned to the pod
Node         Node name and IP
Created      Creation timestamp
Labels       Kubernetes labels (key: value)
Annotations  Kubernetes annotations

Containers

For each container in the pod:

Field     Description
─────     ───────────
Name      Container name
Image     Container image and tag
Status    Running, Waiting, or Terminated
Ready     Readiness probe status
Restarts  Container restart count
Started   Container start time

Pod Actions

Action    Description
──────    ───────────
Logs      Stream container output with search
Terminal  Interactive shell in a container
Describe  Full kubectl describe output
Events    Kubernetes events for the pod
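Each action has a kubectl equivalent, which can be handy when working outside the UI (the pod name and namespace below are illustrative, and the commands assume cluster access):

```shell
# Logs: stream container output
kubectl logs -f api-gateway-7d9f8c6b4d-2xkjp -n production

# Terminal: interactive shell in the container
kubectl exec -it api-gateway-7d9f8c6b4d-2xkjp -n production -- sh

# Describe: full pod description
kubectl describe pod api-gateway-7d9f8c6b4d-2xkjp -n production

# Events: events scoped to this pod
kubectl get events -n production \
  --field-selector involvedObject.name=api-gateway-7d9f8c6b4d-2xkjp
```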

Viewing Logs

Click Logs to open the log viewer for a container.

Log Viewer Features

Feature           Description
───────           ───────────
Live Streaming    Real-time log updates via WebSocket connection
Search            Filter logs by keyword or regex
Time Range        Last 15 min, 30 min, 1 hr, 24 hrs, or All
Container Select  Switch between containers (multi-container pods)
AI Analysis       AI-powered log analysis panel for root cause detection
Download          Export logs as a text file
Line Wrap         Toggle word wrapping
Pause/Resume      Pause live streaming to read
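The Search feature's keyword and regex filtering is analogous to running grep over the streamed lines. If you need the same filter outside the UI, a self-contained sketch (the file path and log lines are illustrative):

```shell
# Sample log lines, standing in for the streamed container output
printf '%s\n' \
  '[10:42:15] INFO  Request received: GET /api/users/123' \
  '[10:43:01] ERROR Connection timeout to external-service.com:443' \
  '[10:43:01] ERROR Retry 1/3 for external API call' \
  > /tmp/sample.log

# Keyword filter: lines containing "error", case-insensitive
grep -i 'error' /tmp/sample.log

# Regex filter: ERROR or WARN level entries
grep -E 'ERROR|WARN' /tmp/sample.log
```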

AI Log Analysis

The log viewer includes an AI analysis panel that examines container logs and provides:

  • Root cause identification: detects error patterns, exceptions, and failure chains
  • Contextual insights: correlates log entries with pod events and cluster state
  • Remediation suggestions: actionable steps to resolve detected issues
  • Pattern detection: identifies recurring errors and escalating failure rates

Click Analyze Logs in the log viewer toolbar to trigger AI analysis on the current log output.

Time Range Options

Option         Description
──────         ───────────
Last 15 min    Most recent 15 minutes
Last 30 min    Most recent 30 minutes
Last 1 hour    Most recent hour
Last 24 hours  Most recent 24 hours
All            All available logs (may be large)
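The time range options map to kubectl's --since flag, which accepts durations like 15m or 1h (pod name and namespace are illustrative; the commands assume cluster access):

```shell
# Last 15 minutes of logs
kubectl logs api-gateway-7d9f8c6b4d-2xkjp -n production --since=15m

# Last 24 hours
kubectl logs api-gateway-7d9f8c6b4d-2xkjp -n production --since=24h

# Omit --since for all available logs (may be large)
kubectl logs api-gateway-7d9f8c6b4d-2xkjp -n production
```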

Multi-Container Pods

For pods with multiple containers (sidecars, init containers):

  1. Click the Container dropdown
  2. Select the container to view
  3. Logs refresh for selected container

Init container logs are available after they complete.
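The same container selection is available on the command line with the -c flag (the pod and container names below come from the multi-container example later in this page, and the commands assume cluster access):

```shell
# List the pod's containers, including init containers
kubectl get pod web-app-7d9f8c6b4d-2xkjp -n production \
  -o jsonpath='{.spec.containers[*].name} {.spec.initContainers[*].name}'

# Logs for a specific container
kubectl logs web-app-7d9f8c6b4d-2xkjp -n production -c nginx-sidecar
```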


Terminal Access

Click Terminal to open an interactive shell inside the container.

Terminal Features

Feature            Description
───────            ───────────
Interactive Shell  Full shell access (sh, bash, etc.)
Tab Completion     Command and path completion
Copy/Paste         Ctrl+C/Ctrl+V or right-click
Resize             Drag edges to resize the terminal
Multiple Tabs      Open terminals in multiple pods

Using the Terminal

/ # ls -la
total 64
drwxr-xr-x    1 root     root          4096 Jan 15 10:30 .
drwxr-xr-x    1 root     root          4096 Jan 15 10:30 ..
drwxr-xr-x    2 root     root          4096 Jan 15 10:30 app
-rw-r--r--    1 root     root           123 Jan 15 10:30 config.yaml

/ # cat config.yaml
database:
  host: db.example.com
  port: 5432

/ # env | grep DB
DB_HOST=db.example.com
DB_PORT=5432

Security

Security Feature   Description
────────────────   ───────────
RBAC Required      User needs the pods/exec create permission
Approval Workflow  Optional approval before exec (if configured)
Audit Logging      All terminal sessions are logged
Auto-Timeout       Sessions close after an inactivity period
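The pods/exec create permission can be granted with a standard Kubernetes RBAC Role. A minimal sketch (the Role name and namespace are illustrative; adjust to your policy):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-exec          # illustrative name
  namespace: production   # illustrative namespace
rules:
  # Terminal access requires creating exec sessions
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
  # Viewing pods and streaming logs
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list"]
```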

Pod Events

Click Events to view Kubernetes events for the pod:

Event Types

Type     Description
────     ───────────
Normal   Informational events (Scheduled, Pulled, Started)
Warning  Issues requiring attention (BackOff, Failed)

Common Events

Type    Reason     Age   Message
────    ──────     ───   ───────
Normal  Scheduled  10m   Successfully assigned default/api-7d9f8c to node-1
Normal  Pulling    10m   Pulling image "api:v1.2.3"
Normal  Pulled     9m    Successfully pulled image in 45s
Normal  Created    9m    Created container api
Normal  Started    9m    Started container api

Warning Events to Watch

Event             Meaning
─────             ───────
BackOff           Container crashing and backing off restarts
FailedScheduling  No node has enough resources
FailedMount       Volume mount failed (secrets, ConfigMaps)
Unhealthy         Readiness or liveness probe failing
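To surface only warning events from the CLI, kubectl supports filtering by event type (the namespace is illustrative; the command assumes cluster access):

```shell
# Only Warning events in the namespace, most recent last
kubectl get events -n production \
  --field-selector type=Warning \
  --sort-by=.lastTimestamp
```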

Resource Metrics

If metrics-server is installed, pods show resource usage:

CPU:    ████████░░ 80%   (800m / 1000m)
Memory: ██████░░░░ 60%   (600Mi / 1Gi)

Metric  Description
──────  ───────────
CPU     Current CPU usage vs. limit (millicores)
Memory  Current memory usage vs. limit

Pods without limits show usage without percentage.


Deleting Pods

Click Delete to remove a pod from the cluster.

Delete Behavior
  • Managed pods (Deployment, ReplicaSet, StatefulSet): Kubernetes automatically recreates a replacement pod
  • Standalone pods: permanently deleted, no replacement
  • Jobs: pod deleted; the Job may create a replacement depending on its completions
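The equivalent CLI operation (the pod name and namespace are illustrative; the command assumes cluster access):

```shell
# Delete a pod; for managed pods the controller creates a replacement
kubectl delete pod worker-5f8d9c7b2-kp3mn -n production
```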

Troubleshooting

Pending
The pod cannot be scheduled. Check the Events tab for scheduling failures; common causes are insufficient CPU/memory, node selectors, taints/tolerations, or a pending PVC.
CrashLoopBackOff
The container is repeatedly crashing. Check Logs for application errors, and look for exit codes, missing configs, or dependency failures.
ImagePullBackOff
The image cannot be pulled. Check Events for pull errors; verify the image name/tag exists, registry authentication (imagePullSecrets), and network access to the registry.
OOMKilled
The container exceeded its memory limit. Increase the memory limit in the pod spec or optimize application memory usage; check Describe for the exact limit.
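For the OOMKilled case, the memory limit lives in the container's resources block. A sketch of raising it (the values are illustrative; size them to your workload):

```yaml
resources:
  requests:
    memory: "256Mi"   # scheduler reservation
    cpu: "250m"
  limits:
    memory: "512Mi"   # container is OOMKilled above this
    cpu: "500m"
```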

Debugging Workflow

  1. Check the pod Status and Ready state
  2. View Events for scheduling or runtime errors
  3. Check Logs for application-level errors
  4. Open Terminal to inspect files, environment variables, and processes
  5. Check Resource metrics for CPU or memory pressure
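The same workflow can be run from the CLI (pod name and namespace are illustrative; the commands assume cluster access, and step 5 requires metrics-server):

```shell
POD=worker-5f8d9c7b2-kp3mn; NS=production   # illustrative names

kubectl get pod "$POD" -n "$NS"                                          # 1. status and ready state
kubectl get events -n "$NS" --field-selector involvedObject.name="$POD"  # 2. events
kubectl logs "$POD" -n "$NS" --previous                                  # 3. logs from the last crashed run
kubectl exec -it "$POD" -n "$NS" -- sh                                   # 4. interactive terminal
kubectl top pod "$POD" -n "$NS"                                          # 5. resource metrics
```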

Example: Debugging a Failing Pod

Pod List View

Name                          Namespace   Status            Ready  Restarts  Age  Node
────                          ─────────   ──────            ─────  ────────  ───  ────
api-gateway-7d9f8c6b4d-2xkjp  production  Running           1/1    0         2d   node-1
api-gateway-7d9f8c6b4d-9vwrt  production  Running           1/1    0         2d   node-2
worker-5f8d9c7b2-kp3mn        production  CrashLoopBackOff  0/1    15        1h   node-3
redis-0                       production  Running           1/1    0         5d   node-1

Pod Detail: worker-5f8d9c7b2-kp3mn

Pod Information:

Name:           worker-5f8d9c7b2-kp3mn
Namespace:      production
Status:         CrashLoopBackOff
Pod IP:         10.0.3.45
Node:           ip-10-0-3-78.ec2.internal (10.0.3.78)
Created:        2024-01-15 09:30:00 UTC

Labels:
  app: worker
  version: v1.2.3
  environment: production

Container: worker

Image:          worker:v1.2.3
Status:         Waiting (CrashLoopBackOff)
Ready:          False
Restart Count:  15
Last State:     Terminated (Error, exit code 1)
Started:        10:45:00 UTC
Finished:       10:45:05 UTC

Step 1: View Events

Type    Reason     Age   Message
────    ──────     ───   ───────
Normal  Scheduled  1h    Successfully assigned production/worker-... to node-3
Normal  Pulled     1h    Container image "worker:v1.2.3" already present
Normal  Created    15m   Created container worker
Normal  Started    15m   Started container worker
Warning BackOff    5m    Back-off restarting failed container

Step 2: Check Logs

[10:45:01] INFO  Worker v1.2.3 starting
[10:45:02] INFO  Connecting to Redis at redis.production:6379
[10:45:02] INFO  Redis connection established
[10:45:03] INFO  Loading job queue configuration
[10:45:03] ERROR Missing required environment variable: AWS_REGION
[10:45:03] ERROR Configuration validation failed
[10:45:04] FATAL Exiting due to configuration error

Step 3: Open Terminal

/ # env | grep AWS
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
# AWS_REGION is missing!

/ # cat /etc/config/worker.yaml
redis:
  host: redis.production
  port: 6379
queue:
  name: jobs
  region: ${AWS_REGION}  # This requires AWS_REGION env var

/ # exit

Step 4: Resolution

Found the issue: AWS_REGION environment variable is missing from the deployment.

Fix applied to deployment:

env:
  - name: AWS_REGION
    value: "us-east-1"

After applying the fix:

worker-5f8d9c7b2-new-pod   1/1   Running   0   30s
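The same fix can also be applied imperatively, assuming the pod is owned by a Deployment named worker (this triggers a rolling restart):

```shell
kubectl set env deployment/worker AWS_REGION=us-east-1 -n production
```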

Example: Streaming Logs

Log Viewer

Container: api-gateway
Time Range: Last 15 min
Search: "error"

[10:42:15] INFO  Request received: GET /api/users/123
[10:42:15] INFO  Database query executed in 45ms
[10:42:15] INFO  Response sent: 200 OK (52ms total)
[10:42:30] WARN  Slow query detected: 850ms for /api/reports
[10:43:01] ERROR Connection timeout to external-service.com:443
[10:43:01] ERROR Retry 1/3 for external API call
[10:43:03] INFO  Retry successful, response received
[10:44:15] INFO  Health check passed
[10:45:00] INFO  Metrics exported to Prometheus

Multi-Container Pod Logs

Pod: web-app-7d9f8c6b4d-2xkjp
Containers: [app] [nginx-sidecar] [log-shipper]

Selected: app
─────────────────────────────────────
[10:45:00] INFO  Application started on port 3000
[10:45:01] INFO  Connected to database
[10:45:02] INFO  Ready to accept connections

Selected: nginx-sidecar
─────────────────────────────────────
10.0.1.45 - - [15/Jan/2024:10:45:15 +0000] "GET /health HTTP/1.1" 200 2
10.0.1.45 - - [15/Jan/2024:10:45:30 +0000] "GET /api/users HTTP/1.1" 200 1523
10.0.2.78 - - [15/Jan/2024:10:45:45 +0000] "POST /api/orders HTTP/1.1" 201 89