Add observability stack documentation

giteaadmin 2026-04-06 17:55:11 +00:00
parent 67c4501913
commit 3cef0960ff
1 changed files with 60 additions and 0 deletions

observability.md Normal file

@@ -0,0 +1,60 @@
# Homelab Observability Stack
## Overview
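Full observability setup for the wtfsolutions.cc homelab: host metrics (Telegraf into InfluxDB, visualized in Grafana), service checks (Uptime Kuma), a landing page (Homepage), and push alerts via ntfy.sh.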
## Services
### InfluxDB (LXC 118)
- **URL:** http://influxdb.wtfsolutions.cc
- **Credentials:** admin / influxadmin1
- **Org:** homelab | **Bucket:** metrics
- **Token:** V2Kvcei0Pky6xw2h4gEN7HSKiwBeQkvIkw05_LUOsFKDZYZ-aIBCy_5D_njBqJJISVyEr4JosCq_WyMVRXivqw==
- **Retention:** 90 days
- **DBRP mapping:** telegraf → metrics bucket (InfluxQL compat)
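
A minimal sketch of querying the `metrics` bucket with Flux over the InfluxDB v2 HTTP API (`POST /api/v2/query`). It assumes the token is passed in by the caller (for example from an `INFLUX_TOKEN` environment variable) rather than hard-coded. The helper names `build_flux_request` and `run_query`, the five-minute range, and the `mem` / `used_percent` filter are illustrative assumptions, not part of the setup described above.

```python
import urllib.parse
import urllib.request

INFLUX_URL = "http://influxdb.wtfsolutions.cc"  # URL from this section
ORG = "homelab"
BUCKET = "metrics"


def build_flux_request(flux: str, token: str,
                       base_url: str = INFLUX_URL,
                       org: str = ORG) -> urllib.request.Request:
    """Build (but do not send) a Flux query request for the InfluxDB v2 API."""
    query = urllib.parse.urlencode({"org": org})
    return urllib.request.Request(
        f"{base_url}/api/v2/query?{query}",
        data=flux.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
    )


def run_query(flux: str, token: str) -> str:
    """Send the query and return the annotated CSV response as text."""
    with urllib.request.urlopen(build_flux_request(flux, token), timeout=10) as resp:
        return resp.read().decode("utf-8")


# Hypothetical example query: memory usage over the last five minutes.
EXAMPLE_FLUX = f'''
from(bucket: "{BUCKET}")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
'''
```

Calling `run_query(EXAMPLE_FLUX, os.environ["INFLUX_TOKEN"])` from a host that can reach the InfluxDB LXC should return CSV rows if Telegraf is writing the `mem` measurement.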
### Grafana (LXC 119)
- **URL:** http://grafana.wtfsolutions.cc
- **Credentials:** admin / pcideas
- **Datasource:** InfluxDB (Flux, UID: cfi6urg8tk4cgd)
  - Auth via a custom HTTP header (`Authorization: Token ...`); do NOT use the token field in the UI
- **Dashboards:**
  - Homelab Fleet Overview: RAM/CPU/Disk across all hosts
  - Telegraf Metrics dashboard for InfluxDB 2.0: per-host drill-down
- **Alerts:** RAM >90%, Disk >90%, CPU >85% → ntfy.sh
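
Grafana evaluates the alert rules itself. The sketch below only illustrates the comparison those thresholds imply, which can be handy when sanity-checking a reading by hand. The function name `breached` and the metric keys `ram`, `disk`, and `cpu` are hypothetical.

```python
# Thresholds in percent, copied from the Alerts bullet above.
THRESHOLDS = {"ram": 90.0, "disk": 90.0, "cpu": 85.0}


def breached(usage: dict[str, float],
             thresholds: dict[str, float] = THRESHOLDS) -> list[str]:
    """Return the metric names whose usage is strictly above its threshold."""
    return [name for name, limit in thresholds.items()
            if usage.get(name, 0.0) > limit]


# A host at 93.2% RAM, 71% disk, and exactly 85% CPU trips only the RAM rule,
# because the comparison is strict (">" rather than ">=").
print(breached({"ram": 93.2, "disk": 71.0, "cpu": 85.0}))
```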
### Telegraf
- **Installed on:** All 21 LXCs + proxmox04 host + homeassistant VM
- **Interval:** 30s
- **Inputs:** cpu, mem, disk, diskio, net, system, processes
- **Processor:** converter (uptime uint→float for InfluxDB compat)
- **Note:** LXC uptime reflects Proxmox host uptime (shared kernel) — only VMs show real uptime
- **Config path:** /etc/telegraf/telegraf.conf on each host
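
The settings above can be sketched as a small generator that renders a Telegraf config as text. This is not the deployed file: the use of the `influxdb_v2` output plugin, the `${INFLUX_TOKEN}` environment variable, and the function name `render_telegraf_conf` are assumptions, and the real configs may scope the converter more narrowly.

```python
# Input plugins listed in this section.
INPUTS = ["cpu", "mem", "disk", "diskio", "net", "system", "processes"]


def render_telegraf_conf(influx_url: str, org: str, bucket: str,
                         interval: str = "30s") -> str:
    """Render a minimal telegraf.conf as text (a sketch, not the deployed file)."""
    lines = ["[agent]", f'  interval = "{interval}"', ""]
    for name in INPUTS:
        lines += [f"[[inputs.{name}]]", ""]
    lines += [
        "# Convert the unsigned uptime field to float before writing.",
        "[[processors.converter]]",
        "  [processors.converter.fields]",
        '    float = ["uptime"]',
        "",
        "# Assumed output plugin; the token is read from the environment.",
        "[[outputs.influxdb_v2]]",
        f'  urls = ["{influx_url}"]',
        '  token = "${INFLUX_TOKEN}"',
        f'  organization = "{org}"',
        f'  bucket = "{bucket}"',
    ]
    return "\n".join(lines) + "\n"


print(render_telegraf_conf("http://influxdb.wtfsolutions.cc", "homelab", "metrics"))
```

Rendering the file from one template keeps the 30s interval and the converter setting identical across hosts, which matters because a field written as both integer and float types is a common cause of InfluxDB write errors.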
### Uptime Kuma (LXC 125)
- **URL:** http://kuma.wtfsolutions.cc
- **Credentials:** admin / kuma123
- **Monitors:** 21 services across 4 groups
- **Status page:** /status/homelab (slug: homelab)
- **Notifications:** ntfy.sh topic wtfsolutions-6fa579e38a5f
### Homepage (LXC 126)
- **URL:** http://homepage.wtfsolutions.cc
- **Config:** /opt/homepage/config/
- **Service:** systemd homepage.service
- **Widgets:** Sonarr, Radarr, Tautulli, Overseerr, qBittorrent, Pi-hole, Uptime Kuma, Immich, Plex
- **Notes:**
  - Immich widget requires `version: 2` (Immich v1.100+)
  - Pi-hole widget: no key needed (unauthenticated API)
  - Plex token from Tautulli: oD_GxEfKD4PyZ6LJmopc
## Alerts (ntfy.sh)
- **Topic:** wtfsolutions-6fa579e38a5f
- **Server:** https://ntfy.sh
- **Sources:** Uptime Kuma (service down) + Grafana (resource thresholds)
- **App:** Install ntfy on phone, subscribe to topic above
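
To confirm that the phone subscription works, a test message can be published to the topic with a plain HTTP POST, which is how ntfy accepts messages. A minimal sketch, assuming the topic is supplied by the caller; the helper names `build_ntfy_request` and `send`, and the `example-topic` value in the usage note, are hypothetical.

```python
import urllib.request

NTFY_SERVER = "https://ntfy.sh"  # server from this section


def build_ntfy_request(topic: str, message: str, title: str = "",
                       priority: str = "default") -> urllib.request.Request:
    """Build (but do not send) a publish request for an ntfy topic."""
    headers = {"Priority": priority}
    if title:
        headers["Title"] = title  # keep ASCII; HTTP headers are latin-1 encoded
    return urllib.request.Request(
        f"{NTFY_SERVER}/{topic}",
        data=message.encode("utf-8"),
        method="POST",
        headers=headers,
    )


def send(req: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

For example, `send(build_ntfy_request("example-topic", "Test alert from homelab", title="ntfy check", priority="high"))` should produce a notification on any device subscribed to that topic.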
## Removed Services
- Homarr (LXC 112) — replaced by Homepage
- OpenCloud (LXC 122) — decommissioned
- Authentik (LXC 124) — decommissioned