Updated At Dec 26, 2025, 09:15 AM
License NOASSERTION
---
name: k8s-troubleshoot
description: Kubernetes troubleshooting toolkit - search pods by labels and execute diagnostic commands inside containers. Use when user reports service errors, exceptions, crashes, timeouts, or needs to check logs, processes, network, or resource usage in K8s pods.
---
# Kubernetes Troubleshooting Skill
A toolkit for diagnosing applications running on Kubernetes. Find pods by label selector, then run commands inside their containers to inspect logs, processes, network state, and resource usage.
## When to Use
- A user reports service errors, exceptions, failures, or timeouts
- You need to check application logs or process status
- You need to diagnose network, memory, or disk issues
- Keywords: error, exception, failed, timeout, crash, not working, logs, troubleshoot, diagnose, pod, container
## Workflow
1. **Search pods** - Find target pods by label selector
2. **Execute diagnostics** - Run commands inside containers
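The two steps above can be chained programmatically. The following is a minimal sketch, not part of the skill itself: it assumes each script prints only JSON to stdout and that every entry in `pods` carries `name` and `namespace` keys. The helper names `run_script` and `diagnose` are hypothetical.

```python
import json
import subprocess
from typing import Callable, Sequence

# Script directory as used in the examples in this document.
SCRIPTS = ".claude/skills/k8s-troubleshoot/scripts"


def run_script(argv: Sequence[str]) -> dict:
    """Run one skill script and parse its stdout as JSON (assumed JSON-only)."""
    proc = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    return json.loads(proc.stdout)


def diagnose(
    selector: str,
    namespace: str,
    command: str,
    runner: Callable[[Sequence[str]], dict] = run_script,
) -> list[dict]:
    """Step 1: search pods by label. Step 2: run `command` in each pod found."""
    base = ["uv", "run", "python"]
    found = runner(base + [f"{SCRIPTS}/search_pods.py", "-l", selector, "-n", namespace])
    if not found.get("success"):
        return []
    results = []
    for pod in found.get("pods", []):
        # Prefer the pod's own namespace, which matters when searching with `all`.
        pod_ns = pod.get("namespace", namespace)
        results.append(
            runner(base + [f"{SCRIPTS}/exec_pod.py", "-p", pod["name"], "-n", pod_ns, "-cmd", command])
        )
    return results
```

Passing a stub as `runner` lets the flow be exercised without a cluster; the default runner shells out to the real scripts.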
## Scripts
### 1. Search Pods
Find pods by label selector:
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/search_pods.py -l "app=nginx" -n default
```
| Parameter | Required | Description |
|-----------|----------|-------------|
| `-l, --label-selector` | Yes | Label selector, e.g., `app=nginx` or `project-id=123,pipeline-id=456` |
| `-n, --namespace` | No | Namespace (default: `default`). Use `all` for all namespaces |
**Output**: JSON with `success`, `podCount`, `pods` (name, namespace, phase, containers)
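A short sketch of consuming that output, for example to keep only pods that are actually running before executing commands in them. The field names come from the output description above; the exact shape of `containers` (a list of names here) and the sample values are assumptions made for illustration.

```python
import json


def running_pods(search_output: str) -> list[tuple[str, str]]:
    """Return (namespace, name) for every pod whose phase is Running."""
    data = json.loads(search_output)
    if not data.get("success"):
        return []
    return [
        (pod["namespace"], pod["name"])
        for pod in data.get("pods", [])
        if pod.get("phase") == "Running"
    ]


# Hypothetical sample output; real values depend on the cluster.
sample = json.dumps({
    "success": True,
    "podCount": 2,
    "pods": [
        {"name": "nginx-a", "namespace": "default", "phase": "Running", "containers": ["nginx"]},
        {"name": "nginx-b", "namespace": "default", "phase": "Pending", "containers": ["nginx"]},
    ],
})

print(running_pods(sample))
```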
### 2. Execute Command in Pod
Run diagnostic commands inside a container:
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/exec_pod.py -p "pod-name" -n default -cmd "tail -n 100 /root/logs/app.log"
```
| Parameter | Required | Description |
|-----------|----------|-------------|
| `-p, --pod` | Yes | Pod name |
| `-n, --namespace` | No | Namespace (default: `default`) |
| `-c, --container` | No | Container name (for multi-container pods) |
| `-cmd, --command` | Yes | Command to execute |
**Output**: JSON with `success`, `pod`, `namespace`, `command`, `output`
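When commands are generated rather than typed, building the argument list in code avoids quoting mistakes in the `-cmd` value. The sketch below uses only the flags from the table above; the function name `exec_argv` is hypothetical.

```python
import shlex
from typing import Optional

EXEC_SCRIPT = ".claude/skills/k8s-troubleshoot/scripts/exec_pod.py"


def exec_argv(
    pod: str,
    command: str,
    namespace: str = "default",
    container: Optional[str] = None,
) -> list[str]:
    """Build the argv for exec_pod.py; `-c` is added only when a container is named."""
    argv = ["uv", "run", "python", EXEC_SCRIPT, "-p", pod, "-n", namespace]
    if container:
        argv += ["-c", container]
    argv += ["-cmd", command]
    return argv


argv = exec_argv("my-pod", "tail -n 100 /root/logs/app.log", container="app")
# shlex.join renders the list as a single, safely quoted shell line.
print(shlex.join(argv))
```

The list form can be passed directly to `subprocess.run`, so the diagnostic command reaches the script as one argument regardless of the spaces or pipes it contains.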
## Common Diagnostic Patterns
### View application logs
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/exec_pod.py -p my-pod -n default -cmd "tail -n 100 /root/logs/app.log"
```
### Check Nacos config (dubbo3 issues)
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/exec_pod.py -p my-pod -n default -cmd "cat /root/logs/nacos/config.log | grep nacos"
```
### Check processes
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/exec_pod.py -p my-pod -n default -cmd "ps aux | head -20"
```
### Check network
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/exec_pod.py -p my-pod -n default -cmd "netstat -tlnp"
```
### Check disk and memory
```bash
uv run python .claude/skills/k8s-troubleshoot/scripts/exec_pod.py -p my-pod -n default -cmd "df -h && free -m"
```
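The patterns above can also be collected into a small catalogue and run as a batch against one pod. This is a sketch only: the check names and the `run_checks` helper are hypothetical, and `exec_fn` stands in for whatever actually invokes `exec_pod.py`.

```python
from typing import Callable, Iterable, Optional

# Commands copied from the patterns in this section; the keys are illustrative names.
CHECKS = {
    "logs": "tail -n 100 /root/logs/app.log",
    "nacos": "cat /root/logs/nacos/config.log | grep nacos",
    "processes": "ps aux | head -20",
    "network": "netstat -tlnp",
    "resources": "df -h && free -m",
}


def run_checks(
    pod: str,
    exec_fn: Callable[[str, str], str],
    names: Optional[Iterable[str]] = None,
) -> dict[str, str]:
    """Run the selected checks (all of them by default) and map check name to output."""
    selected = list(names) if names is not None else list(CHECKS)
    return {name: exec_fn(pod, CHECKS[name]) for name in selected}


# Stub for illustration: echoes what would be executed instead of calling exec_pod.py.
def fake_exec(pod: str, command: str) -> str:
    return f"[{pod}] {command}"


report = run_checks("my-pod", fake_exec, names=["processes", "network"])
for name, output in report.items():
    print(name, "->", output)
```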
## Troubleshooting Tips
| Issue | What to Check |
|-------|---------------|
| dubbo3 no provider | Check `/root/logs/nacos/config.log` for the Nacos address |
| Service not responding | Check process status with `ps aux`, then review the application logs |
| Connection issues | Check listening ports with `netstat -tlnp` |
| OOM errors | Check memory with `free -m` |
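The table can be turned into a simple lookup that proposes a first command from a problem report. This is a minimal sketch: the keyword lists are assumptions chosen for illustration, and only the commands themselves come from this document.

```python
# (keywords, command) pairs; the keywords are hypothetical triggers, not part of the skill.
TIPS = [
    (("no provider", "dubbo"), "cat /root/logs/nacos/config.log | grep nacos"),
    (("not responding",), "ps aux | head -20"),
    (("connection",), "netstat -tlnp"),
    (("oom", "out of memory"), "free -m"),
]


def suggest(report: str) -> list[str]:
    """Return the commands whose keywords appear in the report (case-insensitive)."""
    text = report.lower()
    return [command for keywords, command in TIPS if any(k in text for k in keywords)]


print(suggest("Service got OOM killed after deploy"))
```

Substring matching is deliberately crude; it only narrows down which check to run first through `exec_pod.py`.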