ops0

AI Chat

Use Hive's AI Chat interface to investigate and manage servers through plain-English conversation.

Starting a Conversation

  1. Select Server - Click on a server in the Hive dashboard
  2. Open AI Chat - Click the AI Chat tab
  3. Ask Questions - Type naturally in plain English

Example Conversations

Investigating High CPU

You: "The website is slow and CPU is at 95%"
Hive: I'll investigate the high CPU usage.
✓ Running: ps aux --sort=-%cpu | head -20
Found the culprit:

PID     USER    CPU%   COMMAND
12847   www     89%    php-fpm: pool www

✓ Running: strace -p 12847 -c (sampling for 5 seconds)

The process is making excessive database calls. The /api/search endpoint is running an unoptimized query scanning 2 million rows.

Recommendations:

  1. Add database index on products.name column
  2. Implement pagination on search results
  3. Consider caching frequent searches
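
The first step of this investigation, picking the busiest process out of `ps aux --sort=-%cpu`, can be sketched in Python. This is a minimal illustration, not Hive's implementation: it assumes the standard procps `ps aux` column layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND), and the names `Proc`, `parse_ps_aux`, `top_cpu_process`, and the sample text are hypothetical.

```python
from typing import NamedTuple


class Proc(NamedTuple):
    user: str
    pid: int
    cpu_percent: float
    command: str


def parse_ps_aux(output: str) -> list[Proc]:
    """Parse `ps aux` text into Proc records (assumes the 11-column procps layout)."""
    procs = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        # COMMAND may contain spaces, so split at most 10 times.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # ignore malformed or truncated lines
        procs.append(Proc(fields[0], int(fields[1]), float(fields[2]), fields[10]))
    return procs


def top_cpu_process(output: str) -> Proc:
    """Return the process with the highest %CPU value."""
    return max(parse_ps_aux(output), key=lambda p: p.cpu_percent)


# Hypothetical sample resembling the process shown in the conversation above.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
www      12847 89.0  3.1 412000 64000 ?        R    10:02  41:10 php-fpm: pool www
mysql     2210  4.2 12.5 1800000 256000 ?      Ssl  Jan01  90:33 /usr/sbin/mysqld
root         1  0.0  0.1 168000  9000 ?        Ss   Jan01   0:12 /sbin/init
"""

culprit = top_cpu_process(SAMPLE)
print(f"PID {culprit.pid} ({culprit.command}) is using {culprit.cpu_percent}% CPU")
```

Taking the maximum rather than the first row keeps the sketch correct even when the input was not pre-sorted by `--sort=-%cpu`.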

Disk Space Emergency

You: "Server is out of disk space"
Hive: I'll find what's consuming the disk space.
✓ Checking disk usage... /dev/sda1 is 98% full
✓ Running: du -sh /* | sort -rh | head -10
Largest directories:

/var/log     → 45 GB (67% of used space)
/var/lib     → 12 GB
/home        → 8 GB

✓ Running: du -sh /var/log/* | sort -rh | head -10

Found: /var/log/nginx/access.log → 38 GB (single file!)

This nginx access log has not been rotated and has grown to 38 GB.

I can help you:

  1. Safely truncate this log file
  2. Set up log rotation to prevent this

You: "Yes, truncate it"
⚠️ This will permanently delete the log contents.
Command: truncate -s 0 /var/log/nginx/access.log
[Approve] [Reject]
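
The ranking step used earlier in this conversation (`du -sh ... | sort -rh | head -10`) can be reproduced in Python when the `du` output is already captured as text. This is a sketch under stated assumptions: sizes use the binary suffixes that `du -h` prints (K, M, G, T), and the function names and sample data are hypothetical.

```python
# Binary multipliers for the suffixes printed by `du -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a `du -h` size such as '45G' or '1.5M' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1].upper()])
    return int(float(text))  # no suffix: treat the value as a byte count


def largest_entries(du_output: str, limit: int = 10) -> list[tuple[str, int]]:
    """Return (path, bytes) pairs sorted largest first, like `sort -rh | head`."""
    entries = []
    for line in du_output.strip().splitlines():
        size, path = line.split(None, 1)  # du separates size and path with a tab
        entries.append((path, parse_size(size)))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)[:limit]


# Hypothetical `du -sh` output for a few directories, deliberately unsorted.
SAMPLE = "8.0G\t/home\n45G\t/var/log\n12G\t/var/lib\n512M\t/opt\n"

for path, size in largest_entries(SAMPLE, limit=3):
    print(f"{path}: {size / 1024**3:.1f} GiB")
```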

Service Won't Start

You: "MySQL won't start"
Hive: I'll diagnose the MySQL startup issue.
✓ Checking service status...

mysql.service - MySQL Server
Status: failed (Result: exit-code)

✓ Checking recent logs: journalctl -u mysql -n 50
Error found:

[ERROR] InnoDB: Unable to lock ./ibdata1 error: 11

Another MySQL process is holding the lock file.

✓ Checking for running MySQL processes...

Found: Stale MySQL process (PID 8834) still holding the lock

To fix this, I need to:

  1. Kill the stale process
  2. Remove stale socket file
  3. Start MySQL service

Kubernetes Pod Troubleshooting

You: "The payment service pods keep crashing"
Hive: I'll investigate the payment service pod crashes.
✓ Running: kubectl get pods -l app=payment-service

NAME                               READY   STATUS              RESTARTS
payment-service-7d4b8c9f6-x2k4m   0/1     CrashLoopBackOff   5

✓ Running: kubectl logs payment-service-7d4b8c9f6-x2k4m --previous
Error found:

FATAL: Unable to connect to database at db.internal:5432
Connection refused

✓ Running: kubectl get svc db -o wide

Found the issue: The service selector doesn't match the pod labels.
Service has selector "app: database" but pod has label "app: postgres"

Fix: Update the service selector to match the pod label.

You: "Yes, fix it"
⚠️ This will modify Kubernetes resources. Approval required.
Command: kubectl patch svc db -p '{"spec":{"selector":{"app":"postgres"}}}'
[Approve] [Reject]
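
The mismatch found in this conversation follows from how equality-based Service selectors work: a pod is selected only when every key/value pair in the selector appears among its labels. The sketch below illustrates that rule and builds the same JSON body passed to `kubectl patch`. The function names are hypothetical and this is not Hive's own code.

```python
import json


def selector_matches(selector: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """True when every selector key/value pair is present in the pod's labels.

    An empty selector is treated as matching nothing, since Kubernetes does not
    manage endpoints automatically for a Service without a selector.
    """
    return bool(selector) and all(
        pod_labels.get(key) == value for key, value in selector.items()
    )


def selector_patch(new_selector: dict[str, str]) -> str:
    """Build the JSON body for `kubectl patch svc <name> -p '<body>'`."""
    return json.dumps({"spec": {"selector": new_selector}}, separators=(",", ":"))


pod_labels = {"app": "postgres"}        # label on the database pod
service_selector = {"app": "database"}  # selector on the `db` Service

print(selector_matches(service_selector, pod_labels))     # selector before the fix
print(selector_matches({"app": "postgres"}, pod_labels))  # selector after the fix
print(selector_patch({"app": "postgres"}))
```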

Docker Container Investigation

You: "The redis container keeps dying"
Hive: I'll investigate the Redis container issue.
✓ Running: docker ps -a | grep redis

CONTAINER ID   IMAGE      STATUS                       NAMES
a3b4c5d6e7f8   redis:7    Exited (137) 2 minutes ago   redis-cache

Exit code 137 means the process was killed with SIGKILL (128 + 9), which usually indicates an OOM kill. Checking logs...
✓ Running: docker logs redis-cache --tail 100

Out of memory trying to allocate 16384 bytes

✓ Running: docker inspect redis-cache --format='{{.HostConfig.Memory}}'

Root cause: Redis has 128MB memory limit but dataset exceeded this.

Recommendations:

  1. Increase container memory limit to 512MB
  2. Configure Redis maxmemory policy
  3. Restart the container

You: "Yes, set memory to 512MB and restart"
⚠️ Approval required for container operations.
Commands: docker stop redis-cache && docker rm redis-cache && docker run -d --name redis-cache -m 512m redis:7
[Approve] [Reject]
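
Two details from this conversation lend themselves to a short sketch. Exit codes above 128 follow the common convention of 128 plus the signal number, and `docker inspect --format='{{.HostConfig.Memory}}'` prints the limit in bytes. The helper names below are hypothetical, and the signal lookup assumes a POSIX system.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal number convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"  # number not known on this platform
        return f"killed by {name}"
    return f"exited with status {code}"


def bytes_to_mib(limit_bytes: int) -> float:
    """Convert a byte count, as printed by `docker inspect`, to MiB."""
    return limit_bytes / (1024 * 1024)


print(describe_exit_code(137))   # 137 - 128 = 9
print(bytes_to_mib(134217728))   # the byte value of a 128 MiB limit
```

SIGKILL on its own does not prove an out-of-memory kill, which is why the conversation also checks the container logs and the configured memory limit.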

Nginx 502 Bad Gateway

You: "Website is showing 502 bad gateway"
Hive: I'll diagnose the 502 Bad Gateway error.
✓ Checking nginx status... nginx is running (PID 1234)
✓ Running: tail -50 /var/log/nginx/error.log

[error] connect() failed (111: Connection refused) while connecting to upstream, upstream: "http://127.0.0.1:3000"

Nginx is trying to proxy to port 3000 but nothing is listening.
✓ Running: systemctl status node-app
Found: node-app service is stopped (failed)
✓ Running: journalctl -u node-app -n 50

Error: Cannot find module '/var/www/app/server.js'

Root cause: The Node.js app file /var/www/app/server.js is missing.
The /var/www/app directory was last modified 2 hours ago (possible bad deployment?)

Recommendations:

  1. Check your deployment - server.js may not have been deployed
  2. Roll back to previous version if available
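
The step from the nginx error line to "nothing is listening on port 3000" can be sketched as follows: extract the upstream address from the log line, then attempt a TCP connection to it. This is an illustrative sketch with hypothetical function names. It assumes the `upstream: "<url>"` field format shown in the log line above and an upstream URL that carries an explicit port.

```python
import re
import socket
from typing import Optional, Tuple
from urllib.parse import urlsplit

UPSTREAM_RE = re.compile(r'upstream: "([^"]+)"')


def upstream_address(error_line: str) -> Optional[Tuple[str, int]]:
    """Extract (host, port) of the upstream named in an nginx error log line."""
    match = UPSTREAM_RE.search(error_line)
    if match is None:
        return None
    url = urlsplit(match.group(1))
    if url.hostname is None or url.port is None:
        return None  # named upstreams without an explicit port are not handled
    return url.hostname, url.port


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


LINE = ('[error] connect() failed (111: Connection refused) while connecting '
        'to upstream, upstream: "http://127.0.0.1:3000"')

address = upstream_address(LINE)
print(address)
```

Calling `is_listening(*address)` on the affected server would then confirm whether the application behind nginx is accepting connections.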

Full-Stack Investigation

You: "Users are complaining the checkout is broken"
Hive: I'll investigate the checkout issue across the full stack.
✓ Step 1: Checking web server (nginx) → nginx is running, no errors
✓ Step 2: Checking application (Node.js)

Found: "PaymentService: Connection timeout to payment-api:8080"

✓ Step 3: Checking payment service → Connection refused!
✓ Step 4: Checking payment container → exited 23 minutes ago
✓ Step 5: Checking container logs

Error: Redis connection failed - max retries exceeded

✓ Step 6: Checking Redis → Over memory limit (2.1GB / 2GB)
ROOT CAUSE FOUND:

Redis hit memory limit → Payment service couldn't write →
Payment service crashed → Checkout API returns 500

Fix sequence:

  1. Increase Redis memory limit to 4GB
  2. Restart Redis container
  3. Restart payment-api container

You: "Yes, fix it all"
✓ Redis memory increased to 4GB
✓ Redis restarted
✓ Payment service is healthy
✓ POST /api/checkout returned 200 OK
Checkout is now working!

Security Investigation

You: "Check if anyone unauthorized accessed this server"
Hive: I'll perform a security audit for unauthorized access.
✓ Checking authentication logs...
⚠️ Found 847 failed SSH login attempts in the last 24 hours
Top attacking IPs:

192.168.1.105 → 312 attempts (internal IP - concerning!)
45.33.32.156   → 245 attempts (external)
167.99.123.45  → 180 attempts (external)

✓ Checking successful logins...
Successful logins in last 7 days:
  • admin from 10.0.0.5 (3 times) ← Your office IP
  • deploy from 10.0.0.10 (12 times) ← CI/CD server
✓ No suspicious successful logins detected
✓ Checking for new user accounts...
✓ No new users created in last 30 days
Summary:

No unauthorized access detected
⚠️ Brute force attempts ongoing (847 in 24h)
⚠️ Internal IP 192.168.1.105 needs investigation

Recommendations:

  1. Install fail2ban to block brute force attempts
  2. Investigate the internal IP
  3. Consider disabling password auth (use keys only)
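
The per-IP tally in this audit can be approximated by counting OpenSSH "Failed password" lines and checking whether each source address falls in a private range. The sketch below is illustrative only: the log lines, the hostname `web-01`, and the function names are hypothetical, and real logs may contain other failure messages that this pattern does not cover.

```python
import ipaddress
import re
from collections import Counter

FAILED_RE = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+) port \d+")


def failed_attempts_by_ip(log_lines) -> Counter:
    """Count failed SSH password attempts per source IP."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def is_internal(ip: str) -> bool:
    """True for private-range addresses such as 192.168.0.0/16 or 10.0.0.0/8."""
    return ipaddress.ip_address(ip).is_private


# Hypothetical sshd lines in the OpenSSH "Failed password" format.
SAMPLE = [
    "Jan 10 03:12:01 web-01 sshd[4121]: Failed password for root from 45.33.32.156 port 51234 ssh2",
    "Jan 10 03:12:04 web-01 sshd[4121]: Failed password for invalid user admin from 45.33.32.156 port 51240 ssh2",
    "Jan 10 03:13:10 web-01 sshd[4177]: Failed password for deploy from 192.168.1.105 port 40022 ssh2",
    "Jan 10 03:14:00 web-01 sshd[4190]: Accepted publickey for admin from 10.0.0.5 port 50111 ssh2",
]

for ip, count in failed_attempts_by_ip(SAMPLE).most_common():
    scope = "internal" if is_internal(ip) else "external"
    print(f"{ip}: {count} failed attempts ({scope})")
```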

Capabilities

Hive helps with a wide range of DevOps, SRE, and system administration tasks. Use it to investigate issues, inspect systems, and carry out guided operational work.

Server Health & Monitoring

Task                  What Hive Does
Health check          Checks CPU, memory, disk, services, network
Performance analysis  Identifies bottlenecks and resource hogs
Service status        Checks if critical services are running
Uptime report         Shows system uptime and recent reboots

Troubleshooting

Task                   What Hive Does
Error investigation    Finds errors in logs, traces root causes
Service failures       Diagnoses why services won't start
Performance issues     Finds slow queries, memory leaks, CPU hogs
Connectivity problems  Tests network, DNS, ports, firewalls

Log Analysis

Task              What Hive Does
Find errors       Searches logs for errors and exceptions
Track patterns    Identifies recurring issues
Filter by time    Shows logs from specific time periods
Correlate events  Connects related events across log files

System Administration

Task                What Hive Does
Service management  Start, stop, restart services (with approval)
Process control     Find and manage running processes
File operations     Find, view, and manage files
Configuration       Check and update config files

Kubernetes & Containers

Task                 What Hive Does
Pod management       Check pod status, logs, describe pods, restart pods
Deployments          View deployments, rollout status, scale replicas
Services & Ingress   Check service endpoints, ingress configurations
Docker containers    List containers, view logs, inspect, exec into containers
Container debugging  Investigate why containers crash, check exit codes

Web Servers & Proxies

Task            What Hive Does
Nginx/Apache    Check config syntax, view access/error logs, reload
SSL/TLS         Check certificate expiry, debug SSL issues
Load balancers  Verify upstream health, check configurations
Reverse proxy   Debug routing issues, check headers

Databases

Task                 What Hive Does
MySQL/PostgreSQL     Check status, show slow queries, connection issues
Redis/MongoDB        Check memory usage, connection count, replication
Query analysis       Find expensive queries, check locks, deadlocks
Backup verification  Check backup status, verify dump files

Networking & Cloud

Task           What Hive Does
DNS debugging  Check resolution, dig, nslookup
Connectivity   Test ports, trace routes, check firewalls
Cloud CLIs     Run aws, gcloud, az commands for cloud resources
VPN/Tunnels    Check tunnel status, debug connectivity

Security

Task                    What Hive Does
Login audit             Shows who logged in and when
Failed attempts         Identifies brute force attacks
Open ports              Lists network services and ports
File permissions        Checks for insecure permissions
Vulnerability scanning  Check for outdated packages, exposed services

Multi-Step Investigation

Hive automatically continues investigating until it finds the root cause:

  1. Initial Check - Checks application logs → Finds "Connection refused" error
  2. Follow the Trail - Connection to what? → Checks error details → It's the database
  3. Check Database - Is MySQL running? → No! Service is stopped
  4. Why Did It Stop? - Checks MySQL logs → Out of memory kill (OOM)

Root Cause Found - Memory leak in background job processor consumed all RAM
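
One way to picture this kind of chained investigation is a loop in which each check returns a finding plus the name of the next check to run, stopping when no follow-up remains. The sketch below is only an analogy for the flow described above, not a description of how Hive is built. The check functions return canned findings and all names are hypothetical.

```python
from typing import Callable, NamedTuple, Optional


class Finding(NamedTuple):
    summary: str
    next_check: Optional[str]  # name of the follow-up check, or None when done


# Hypothetical checks returning canned findings; real checks would run commands.
def check_app_logs() -> Finding:
    return Finding('Application log shows "Connection refused"', "check_database")


def check_database() -> Finding:
    return Finding("MySQL service is stopped", "check_mysql_logs")


def check_mysql_logs() -> Finding:
    return Finding("MySQL was killed by the OOM killer", None)


CHECKS: dict[str, Callable[[], Finding]] = {
    "check_app_logs": check_app_logs,
    "check_database": check_database,
    "check_mysql_logs": check_mysql_logs,
}


def investigate(start: str, max_steps: int = 10) -> list[str]:
    """Follow findings from one check to the next until no follow-up remains."""
    trail = []
    current: Optional[str] = start
    while current is not None and len(trail) < max_steps:
        finding = CHECKS[current]()
        trail.append(finding.summary)
        current = finding.next_check
    return trail


for step, summary in enumerate(investigate("check_app_logs"), start=1):
    print(f"{step}. {summary}")
```

The `max_steps` bound is a design choice for the sketch: it guarantees the loop ends even if two checks were to point at each other.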

Context Awareness

Hive remembers your conversation and understands context:

You: "Check the web server"

✓ Nginx is running, listening on ports 80 and 443

You: "What about the config?"

Hive understands you mean "nginx config"
Server blocks: 3, SSL: Enabled, Gzip: Enabled, Config test: ✓ OK

You: "Compare it to the database server"

Hive understands you want to compare nginx between servers
db-01 doesn't have nginx installed. Did you mean MySQL?


Approval Workflow

When Hive needs to make changes, you're always in control:

Auto-Run: ls, ps, cat, grep, df, free, top, netstat
Needs Approval: restart, kill, install, config changes
Blocked: rm -rf /, shutdown, format
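
The three tiers above can be illustrated with a small classifier. This is a hedged sketch, not Hive's actual policy engine: the tier lists are copied from this page and are probably not exhaustive, "format" is left out because the page does not name a concrete command for it, and treating every unlisted command as needing approval is an assumption. All function and constant names are hypothetical.

```python
import shlex

AUTO_RUN = {"ls", "ps", "cat", "grep", "df", "free", "top", "netstat"}
BLOCKED_PROGRAMS = {"shutdown"}
BLOCKED_COMMANDS = {("rm", "-rf", "/")}
SEPARATORS = {"|", "||", "&&", ";"}
METACHARS = set("`$<>&;|")  # redirection, substitution, or chaining inside a token
SEVERITY = {"auto-run": 0, "needs-approval": 1, "blocked": 2}


def classify_simple(tokens: list[str]) -> str:
    """Classify one command that contains no pipes or chaining."""
    if not tokens:
        return "needs-approval"
    if tokens[0] in BLOCKED_PROGRAMS or tuple(tokens) in BLOCKED_COMMANDS:
        return "blocked"
    if any(METACHARS & set(token) for token in tokens):
        return "needs-approval"  # be conservative about shell features
    if tokens[0] in AUTO_RUN:
        return "auto-run"
    return "needs-approval"  # assumption: unlisted commands count as changes


def classify(command: str) -> str:
    """Return the strictest tier among all commands in a pipeline or chain."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "needs-approval"  # unbalanced quotes: leave it to a human
    segments, current = [], []
    for token in tokens:
        if token in SEPARATORS:
            segments.append(current)
            current = []
        else:
            current.append(token)
    segments.append(current)
    return max((classify_simple(s) for s in segments), key=SEVERITY.__getitem__)


for cmd in ["df -h | grep sda1", "systemctl restart nginx", "ls && rm -rf /"]:
    print(f"{cmd!r}: {classify(cmd)}")
```

Taking the strictest tier across a chain means a read-only command cannot be used to smuggle in a change, which matches the intent of the approval tiers described above.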

Learn more about Safety & Security.