From 3cef0960ff0bec8e7b0dcf603aad6b09e59d34ba Mon Sep 17 00:00:00 2001
From: giteaadmin
Date: Mon, 6 Apr 2026 17:55:11 +0000
Subject: [PATCH] Add observability stack documentation

---
 observability.md | 60 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)
 create mode 100644 observability.md

diff --git a/observability.md b/observability.md
new file mode 100644
index 0000000..2defb52
--- /dev/null
+++ b/observability.md
@@ -0,0 +1,60 @@
+# Homelab Observability Stack
+
+## Overview
+Full observability setup for wtfsolutions.cc homelab.
+
+## Services
+
+### InfluxDB (LXC 118)
+- **URL:** http://influxdb.wtfsolutions.cc
+- **Credentials:** admin / influxadmin1
+- **Org:** homelab | **Bucket:** metrics
+- **Token:** V2Kvcei0Pky6xw2h4gEN7HSKiwBeQkvIkw05_LUOsFKDZYZ-aIBCy_5D_njBqJJISVyEr4JosCq_WyMVRXivqw==
+- **Retention:** 90 days
+- **DBRP mapping:** telegraf → metrics bucket (InfluxQL compat)
+
+### Grafana (LXC 119)
+- **URL:** http://grafana.wtfsolutions.cc
+- **Credentials:** admin / pcideas
+- **Datasource:** InfluxDB (Flux, UID: cfi6urg8tk4cgd)
+  - Auth via custom HTTP header (Authorization: Token ...) — do NOT use token field in UI
+- **Dashboards:**
+  - Homelab Fleet Overview — RAM/CPU/Disk across all hosts
+  - Telegraf Metrics dashboard for InfluxDB 2.0 — per-host drill-down
+- **Alerts:** RAM >90%, Disk >90%, CPU >85% → ntfy.sh
+
+### Telegraf
+- **Installed on:** All 21 LXCs + proxmox04 host + homeassistant VM
+- **Interval:** 30s
+- **Inputs:** cpu, mem, disk, diskio, net, system, processes
+- **Processor:** converter (uptime uint→float for InfluxDB compat)
+- **Note:** LXC uptime reflects Proxmox host uptime (shared kernel) — only VMs show real uptime
+- **Config path:** /etc/telegraf/telegraf.conf on each host
+
+### Uptime Kuma (LXC 125)
+- **URL:** http://kuma.wtfsolutions.cc
+- **Credentials:** admin / kuma123
+- **Monitors:** 21 services across 4 groups
+- **Status page:** /status/homelab (slug: homelab)
+- **Notifications:** ntfy.sh topic wtfsolutions-6fa579e38a5f
+
+### Homepage (LXC 126)
+- **URL:** http://homepage.wtfsolutions.cc
+- **Config:** /opt/homepage/config/
+- **Service:** systemd homepage.service
+- **Widgets:** Sonarr, Radarr, Tautulli, Overseerr, qBittorrent, Pi-hole, Uptime Kuma, Immich, Plex
+- **Notes:**
+  - Immich widget requires version: 2 (Immich v1.100+)
+  - Pi-hole widget: no key needed (unauthenticated API)
+  - Plex token from Tautulli: oD_GxEfKD4PyZ6LJmopc
+
+## Alerts (ntfy.sh)
+- **Topic:** wtfsolutions-6fa579e38a5f
+- **Server:** https://ntfy.sh
+- **Sources:** Uptime Kuma (service down) + Grafana (resource thresholds)
+- **App:** Install ntfy on phone, subscribe to topic above
+
+## Removed Services
+- Homarr (LXC 112) — replaced by Homepage
+- OpenCloud (LXC 122) — decommissioned
+- Authentik (LXC 124) — decommissioned
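
Reviewer sketch (not part of the patch): the Grafana datasource note describes auth through an `Authorization: Token ...` header, and the Alerts section describes publishing to an ntfy.sh topic. The Python below is a minimal illustration of what those two HTTP requests could look like, assuming the standard InfluxDB v2 query endpoint (`/api/v2/query`) and ntfy's plain POST-to-topic publishing. The function names, the default title, and the example Flux query are hypothetical and do not come from the patch. Token and topic are passed in as arguments so that no secret is hard-coded.

```python
import urllib.parse
import urllib.request

# Values taken from the documentation above.
INFLUX_URL = "http://influxdb.wtfsolutions.cc"
NTFY_SERVER = "https://ntfy.sh"

# Hypothetical example query: latest memory usage per host from the
# "metrics" bucket, using the field written by Telegraf's mem input.
EXAMPLE_FLUX = """
from(bucket: "metrics")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
  |> last()
"""


def influx_query_request(flux, token, org="homelab", base_url=INFLUX_URL):
    """Build (but do not send) a Flux query request for the InfluxDB v2 API.

    The token travels in the same 'Authorization: Token <token>' header
    that the Grafana datasource is configured to send.
    """
    query_string = urllib.parse.urlencode({"org": org})
    return urllib.request.Request(
        url=f"{base_url}/api/v2/query?{query_string}",
        data=flux.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
    )


def ntfy_request(topic, message, title="Homelab alert", server=NTFY_SERVER):
    """Build (but do not send) an ntfy publish request.

    ntfy treats the POST body as the notification text and reads the
    optional 'Title' header as the notification title.
    """
    return urllib.request.Request(
        url=f"{server}/{topic}",
        data=message.encode("utf-8"),
        method="POST",
        headers={"Title": title},
    )
```

Either request can then be sent with `urllib.request.urlopen(req, timeout=10)`. Building the request separately from sending it keeps the header logic easy to inspect without touching the network, and reading the token and topic from the environment or a secrets store at call time keeps them out of the script.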