ops0

Nodes

Monitor and manage Kubernetes cluster nodes, including health status, capacity, and resource allocation.

Node List

View all nodes in a cluster with status and resource usage:

Column  Description
──────  ───────────
Name    Node name or hostname
Status  Ready, NotReady, Unknown
Roles   master, worker, or custom labels
CPU     Current usage / Total capacity
Memory  Current usage / Total capacity
Pods    Running pods / Max pods
Age     Time since node joined cluster

Node Status

Status              Color   Meaning
──────              ─────   ───────
Ready               Green   Node is healthy and accepting pods
NotReady            Red     Node has issues (network, kubelet, resources)
Unknown             Gray    Node status cannot be determined
SchedulingDisabled  Yellow  Node is cordoned (no new pods)
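
The display status above can be derived from the node's Ready condition plus its unschedulable (cordon) flag. A minimal sketch of that mapping, using plain dicts as stand-ins for real Node objects (field names mirror the Kubernetes API, but the values here are hypothetical):

```python
def display_status(node):
    """Map a node's Ready condition and cordon flag to a display status."""
    ready = next(
        (c["status"] for c in node["conditions"] if c["type"] == "Ready"),
        "Unknown",  # no Ready condition reported at all
    )
    if node.get("unschedulable") and ready == "True":
        return "SchedulingDisabled"  # cordoned: healthy but not accepting pods
    return {"True": "Ready", "False": "NotReady"}.get(ready, "Unknown")

healthy = {"conditions": [{"type": "Ready", "status": "True"}]}
cordoned = {"conditions": [{"type": "Ready", "status": "True"}], "unschedulable": True}
print(display_status(healthy))   # Ready
print(display_status(cordoned))  # SchedulingDisabled
```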

Node Details

Click a node to view comprehensive information:

System Information

Field              Description
─────              ───────────
Name               Node hostname
Provider ID        Cloud provider instance ID
Instance Type      EC2 instance type, GCE machine type, etc.
OS Image           Operating system and version
Kernel Version     Linux kernel version
Container Runtime  containerd, CRI-O, or Docker
Kubelet Version    Kubernetes version running on node

Capacity and Allocatable

Resource           Capacity                Allocatable         Usage
────────           ────────                ───────────         ─────
CPU                Total cores             Available for pods  Current usage
Memory             Total RAM               Available for pods  Current usage
Pods               Max pods (usually 110)  Pod limit           Running pods
Ephemeral Storage  Total disk              Available disk      Current usage

Capacity vs Allocatable:

  • Capacity: Total resources on node
  • Allocatable: Resources available for user pods (after system pods and kubelet overhead)
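
The kubelet computes allocatable by subtracting its reservations (the `--system-reserved` and `--kube-reserved` flags) and the hard eviction threshold from capacity. A sketch of the arithmetic, with hypothetical values for a 32 GiB node:

```python
def allocatable(capacity, system_reserved, kube_reserved, eviction_threshold):
    """Allocatable = capacity minus reservations and the eviction threshold."""
    return capacity - system_reserved - kube_reserved - eviction_threshold

# Hypothetical 32 GiB node (all values in GiB):
# 1 GiB for the OS, 2 GiB for kubelet/runtime, 1 GiB eviction headroom
print(allocatable(32.0, 1.0, 2.0, 1.0))  # 28.0
```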

Resource Utilization

Visual breakdown of resource usage:

CPU Allocation (12 cores total, 10 allocatable)
─────────────────────────────────────────────────
System Reserved:  ██░░░░░░░░  2 cores (16.7%)
Allocated Pods:   ████████░░  8 cores (80% of allocatable)
Free:             ██░░░░░░░░  2 cores (20% of allocatable)

Memory Allocation (32 GB total, 28 GB allocatable)
─────────────────────────────────────────────────
System Reserved:  ██░░░░░░░░  4 GB (12.5%)
Allocated Pods:   ██████████  20 GB (71% of allocatable)
Free:             ████░░░░░░  8 GB (29% of allocatable)
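
The percentages in the bars above are simply each figure over the allocatable total, rounded to the nearest whole percent. A quick sketch that reproduces the chart's numbers:

```python
def pct(value, allocatable_total):
    """Percent of allocatable, rounded to the nearest whole percent."""
    return round(100 * value / allocatable_total)

# CPU chart: 10 allocatable cores
print(pct(8, 10))   # 80 (allocated)
print(pct(2, 10))   # 20 (free)

# Memory chart: 28 GB allocatable
print(pct(20, 28))  # 71 (allocated)
print(pct(8, 28))   # 29 (free)
```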

Node Conditions

Condition           Status      Description
─────────           ──────      ───────────
Ready               True/False  Node is healthy and ready for pods
MemoryPressure      True/False  Node running low on memory
DiskPressure        True/False  Node running low on disk
PIDPressure         True/False  Too many processes running
NetworkUnavailable  True/False  Network not properly configured
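
Note the polarity: Ready is healthy when True, while the other four conditions are healthy when False. A sketch of a health check over a node's condition list (the dicts are simplified stand-ins for real NodeCondition objects):

```python
PROBLEM_WHEN_TRUE = {"MemoryPressure", "DiskPressure", "PIDPressure", "NetworkUnavailable"}

def active_problems(conditions):
    """Return the condition types that currently indicate a problem."""
    problems = [c["type"] for c in conditions
                if c["type"] in PROBLEM_WHEN_TRUE and c["status"] == "True"]
    # Ready has the opposite polarity: its absence or False is the problem.
    if not any(c["type"] == "Ready" and c["status"] == "True" for c in conditions):
        problems.append("NotReady")
    return problems

conds = [{"type": "Ready", "status": "True"},
         {"type": "DiskPressure", "status": "True"},
         {"type": "MemoryPressure", "status": "False"}]
print(active_problems(conds))  # ['DiskPressure']
```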

Taints and Tolerations

View node taints that prevent pods from scheduling:

Taint                             Effect      Description
─────                             ──────      ───────────
node-role.kubernetes.io/master    NoSchedule  Master nodes don't run user pods
node.kubernetes.io/disk-pressure  NoSchedule  Node has disk pressure
example.com/special=true          NoExecute   Custom taint for specialized workloads

Effects:

  • NoSchedule: Pods without toleration won't be scheduled
  • PreferNoSchedule: Avoid scheduling if possible
  • NoExecute: Evict existing pods without toleration
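
A toleration matches a taint when key, value (unless the operator is Exists), and effect (an empty effect matches any) all line up. A simplified sketch of that matching rule, omitting edge cases like tolerationSeconds:

```python
def tolerates(toleration, taint):
    """Simplified check: does this toleration match this taint?"""
    # An empty effect on the toleration matches any taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # With Exists, only the key must match; an empty key matches all taints.
        return toleration.get("key") in (None, taint["key"])
    return (toleration.get("key") == taint["key"]
            and toleration.get("value") == taint.get("value"))

taint = {"key": "example.com/special", "value": "true", "effect": "NoExecute"}
print(tolerates({"key": "example.com/special", "operator": "Exists"}, taint))  # True
print(tolerates({"key": "other", "value": "x", "effect": "NoExecute"}, taint))  # False
```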

Node Labels

View node labels used for pod scheduling:

Label                             Value               Usage
─────                             ─────               ─────
kubernetes.io/hostname            node-1.example.com  Node hostname
node.kubernetes.io/instance-type  m5.2xlarge          Instance type
topology.kubernetes.io/zone       us-east-1a          Availability zone
custom/workload                   high-memory         Custom label for targeting
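
Pods target these labels through a nodeSelector: a node is eligible only when its labels contain every key/value pair the pod asks for. A minimal sketch (label values are illustrative):

```python
def matches_node_selector(node_labels, node_selector):
    """True when every nodeSelector key/value pair is present in the node's labels."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())

labels = {"topology.kubernetes.io/zone": "us-east-1a",
          "custom/workload": "high-memory"}
print(matches_node_selector(labels, {"custom/workload": "high-memory"}))  # True
print(matches_node_selector(labels, {"custom/workload": "gpu"}))          # False
```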

Pods Running on Node

List all pods scheduled on this node:

Pod                 Namespace    CPU Request  Memory Request  Status
───                 ─────────    ───────────  ──────────────  ──────
api-gateway-abc123  production   500m         1Gi             Running
nginx-xyz789        production   100m         256Mi           Running
fluentd-daemon-set  kube-system  100m         200Mi           Running

Node Events

View recent Kubernetes events for the node:

Type    Reason           Age   Message
────    ──────           ───   ───────
Normal  Starting         45d   Starting kubelet.
Normal  NodeHasSufficientMemory  45d   Node has sufficient memory
Normal  NodeReady        45d   Node is ready
Warning DiskPressure     2h    Node has disk pressure
Normal  DiskPressureCleared  1h    Disk pressure cleared

Troubleshooting

Node NotReady
Check node conditions for MemoryPressure, DiskPressure, or NetworkUnavailable. Verify kubelet is running. Check cloud provider for instance issues.
High Resource Usage
View pods on node to identify resource-heavy workloads. Consider scaling horizontally or moving pods to larger nodes.
Pods Not Scheduling
Check allocatable resources vs pod requests. Verify taints don't block scheduling. Review pod events for FailedScheduling reasons.
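
The "requests vs allocatable" check above is the core of what the scheduler does: a pod fits only if its requests, added to what is already requested on the node, stay within allocatable for every resource. A simplified sketch with hypothetical figures (millicores and MiB):

```python
def fits(node_allocatable, already_requested, pod_request):
    """True when the pod's requests fit within the node's remaining allocatable."""
    return all(
        already_requested.get(r, 0) + pod_request.get(r, 0) <= node_allocatable[r]
        for r in node_allocatable
    )

alloc = {"cpu_m": 7500, "memory_mi": 30720}   # 7.5 cores, 30 GiB allocatable
used = {"cpu_m": 6200, "memory_mi": 22528}    # requests already scheduled

print(fits(alloc, used, {"cpu_m": 1000, "memory_mi": 2048}))  # True:  fits
print(fits(alloc, used, {"cpu_m": 2000, "memory_mi": 2048}))  # False: CPU over
```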

Example: Node Resource Analysis

Node Overview

Node: ip-10-0-1-45.ec2.internal
Status: Ready
Instance Type: m5.2xlarge (8 vCPU, 32 GB RAM)
Zone: us-east-1a
Kubelet: v1.28.2

Resource Breakdown

CPU Capacity: 8 cores
  System Reserved: 0.5 cores
  Allocatable: 7.5 cores
  Requested: 6.2 cores (83%)
  Used: 5.1 cores (68%)
  Free: 1.3 cores (17%)

Memory Capacity: 32 GB
  System Reserved: 2 GB
  Allocatable: 30 GB
  Requested: 22 GB (73%)
  Used: 18 GB (60%)
  Free: 8 GB (27%)

Pods: 18 / 110 (16%)
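
The percentages in this breakdown are all taken against allocatable, not raw capacity. A small sketch that recomputes them from the figures above:

```python
def breakdown(capacity, reserved, requested, used):
    """Recompute the percentages shown above (all relative to allocatable)."""
    alloc = capacity - reserved
    free = alloc - requested
    return {
        "allocatable": alloc,
        "requested_pct": round(100 * requested / alloc),
        "used_pct": round(100 * used / alloc),
        "free_pct": round(100 * free / alloc),
    }

print(breakdown(8, 0.5, 6.2, 5.1))  # CPU: requested 83%, used 68%, free 17%
print(breakdown(32, 2, 22, 18))     # Memory: requested 73%, used 60%, free 27%
```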

Top Resource Consumers

Pod                           CPU Usage  Memory Usage
───                           ─────────  ────────────
api-gateway-7d9f8c6b4d-2xkjp  1.2 cores  4 GB
worker-5f8d9c7b2-kp3mn        0.9 cores  3.5 GB
redis-0                       0.5 cores  2 GB

Recommendations

  • Node has 17% free CPU capacity - can accommodate more pods
  • Memory usage is healthy at 60%
  • No resource pressure conditions
  • 92 pod slots available

Example: Troubleshooting NotReady Node

Node Status

Node: ip-10-0-2-34.ec2.internal
Status: NotReady
Last Heartbeat: 15 minutes ago

Node Conditions

Condition       Status  Last Transition  Message
─────────       ──────  ───────────────  ───────
Ready           False   15m              Kubelet stopped posting node status
MemoryPressure  False   2d               kubelet has sufficient memory
DiskPressure    True    20m              kubelet has disk pressure
PIDPressure     False   2d               kubelet has sufficient PID

Recent Events

Type    Reason           Age   Message
────    ──────           ───   ───────
Normal  NodeReady        2d    Node is ready
Warning DiskPressure     20m   Node has disk pressure
Warning Rebooted         15m   Node rebooted, reason: unknown
Warning NodeNotReady     15m   Kubelet stopped posting status

Investigation Steps

  1. Check Disk Usage: DiskPressure condition is True
  2. Check Instance: Node rebooted 15 minutes ago
  3. Check Kubelet: Kubelet not posting status (may be stopped)
  4. Review Pods: 12 pods on node, likely evicted or pending

Resolution

Action: SSH to node and check kubelet status
Result: Kubelet service crashed after reboot

Fix:
$ sudo systemctl start kubelet
$ sudo systemctl status kubelet

Node recovered after 2 minutes:
Status: Ready
All pods rescheduled and running

Best Practices

  • Monitor conditions - Watch for MemoryPressure, DiskPressure, and PIDPressure before they cause failures
  • Track capacity - Keep at least 20% of allocatable resources free to absorb traffic spikes
  • Label nodes - Use labels for targeted workload placement (GPU, high-memory, etc.)
  • Review taints - Ensure critical workloads have the tolerations they need
  • Update gradually - Roll node updates out slowly to minimize disruption