forked from metin-server/m2dev-server
ops: add incident collection pipeline
@@ -38,6 +38,8 @@ The Debian deployment installs:
- listing declared ports
- restarting the whole stack or specific channels/instances
- viewing logs
- listing core files in the runtime tree
- collecting incident bundles
- running the root-only headless healthcheck

## Examples
@@ -84,6 +86,24 @@ Run the end-to-end healthcheck:
metinctl healthcheck
```

List core files currently present in the runtime tree:

```bash
metinctl cores
```

Collect an incident bundle with logs, unit status, port state and repository revisions:

```bash
metinctl incident-collect --tag auth-timeout --since "-20 minutes"
```

List the most recent incident bundles:

```bash
metinctl incidents
```

## systemd installer behavior

`deploy/systemd/install_systemd.py` now uses the same inventory and installs `metinctl`.
@@ -95,3 +115,26 @@ It also reconciles enabled game instance units against the selected channels:
- if `--restart` is passed, stale game units are disabled with `--now`

This makes channel enablement declarative instead of depending on whatever happened to be enabled previously.
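
As a rough illustration of the reconciliation described above, the stale set is the enabled game instance units minus the selected ones. The sketch below is shell, not the installer's actual Python code; the `stale_units` helper and the `metin-game@...` unit names in the comment are hypothetical, and how the two lists are obtained is left out:

```bash
# stale_units ENABLED SELECTED
# Both arguments are newline-separated lists of unit names.
# Prints every unit that appears in ENABLED but not in SELECTED.
stale_units() {
    enabled=$1
    selected=$2
    printf '%s\n' "$enabled" | while IFS= read -r unit; do
        # Skip empty lines so an empty ENABLED list prints nothing
        [ -n "$unit" ] || continue
        # -x matches whole lines only, -F treats the unit name as a fixed string
        if ! printf '%s\n' "$selected" | grep -qxF -- "$unit"; then
            printf '%s\n' "$unit"
        fi
    done
}

# Hypothetical usage, mirroring the --restart behavior described above:
#   stale_units "$enabled_units" "$selected_units" | while IFS= read -r unit; do
#       systemctl disable --now "$unit"
#   done
```

Computing the difference from the declared selection, rather than toggling units one by one, is what keeps the result independent of the previous state.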

## Crash / Incident Pipeline

The Debian deployment now also installs:

- `/usr/local/sbin/metin-collect-incident`

The collector creates a timestamped bundle under:

- `/var/lib/metin/incidents`

Each bundle contains:

- repo revisions for `m2dev-server` and `m2dev-server-src`
- `systemctl status` for the whole stack
- recent `journalctl` output per unit
- listener state from `ss -ltnp`
- tailed runtime `syslog.log` and `syserr.log` files
- metadata for any `core*` files found under `runtime/server/channels`

If you call it with `--include-cores`, matching core files are copied into the bundle as well.
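
To give a feel for the core-file metadata step, here is a minimal sketch of how such a listing could be produced. It is not the collector's actual code: the `list_core_metadata` helper and its output format are assumptions, and the directory is passed in because the absolute runtime path is not stated here.

```bash
# list_core_metadata ROOT
# Prints "<size in bytes><TAB><path>" for every regular file named core* under ROOT.
# Paths containing newlines are not handled by this sketch.
list_core_metadata() {
    find "$1" -type f -name 'core*' | while IFS= read -r path; do
        size=$(wc -c < "$path")
        # $((size)) strips the padding some wc implementations add
        printf '%s\t%s\n' "$((size))" "$path"
    done
}

# Hypothetical usage (the path prefix is an assumption):
#   list_core_metadata /path/to/runtime/server/channels
```

Recording only size and path keeps the default bundle small, and copying the files themselves stays an explicit opt-in. Note that a `core*` name match also picks up any non-dump file whose name happens to start with `core`.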

The runtime units now also declare `LimitCORE=infinity`, so after the next service restart the processes are allowed to emit core dumps when the host kernel/core policy permits it.
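
Whether a dump actually lands in the runtime tree depends on the host's `kernel.core_pattern`. The sketch below classifies a pattern string so it is easier to see where dumps would go. The `describe_core_pattern` helper is hypothetical and not part of `metinctl`, and `<unit>` in the comments is a placeholder:

```bash
# describe_core_pattern PATTERN
# Classifies a kernel.core_pattern value:
#   leading "|"  -> the dump is piped to a handler program
#   leading "/"  -> the dump is written to an absolute path template
#   anything else is resolved relative to the crashing process's working directory
describe_core_pattern() {
    case $1 in
        '|'*) printf 'piped to handler: %s\n' "${1#|}" ;;
        /*)   printf 'absolute path pattern: %s\n' "$1" ;;
        '')   printf 'empty pattern\n' ;;
        *)    printf 'relative to the crashing process working directory: %s\n' "$1" ;;
    esac
}

# Hypothetical usage on the host:
#   describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
#   systemctl show -p LimitCORE <unit>   # confirm the limit after a restart
```

If the pattern is piped to a handler, core files may never appear under `runtime/server/channels` even with the limit raised, so `metinctl cores` can legitimately report nothing.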