# Server Management
This document describes the current Debian-side control plane for the production Metin runtime.
## Inventory
The channel topology now lives in one versioned file:
- `deploy/channel-inventory.json`
It defines:
- auth and DB listener ports
- channel ids
- per-core public ports and P2P ports
- whether a channel is public/client-visible
- whether a special channel should always be included by management tooling
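This document does not reproduce the inventory schema. Purely as an illustration of the kind of data listed above, a file in this spirit might look like the following; every field name and value here is an assumption, not the real schema:
```json
{
  "auth_port": 11000,
  "db_port": 15000,
  "channels": [
    {
      "id": 1,
      "public": true,
      "always_managed": false,
      "cores": [
        { "name": "channel1_core1", "public_port": 13001, "p2p_port": 14001 },
        { "name": "channel1_core2", "public_port": 13002, "p2p_port": 14002 }
      ]
    }
  ]
}
```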
This inventory is now the single source of truth consumed by:
- `channel_inventory.py`
- `channels.py` compatibility exports
- `install.py`
- `deploy/systemd/install_systemd.py`
- `metinctl`
## metinctl
The Debian deployment installs:
- `/usr/local/bin/metinctl`
`metinctl` is a lightweight operational CLI for:
- showing an operational summary
- showing recent auth success/failure activity
- showing auth activity grouped by source IP
- viewing inventory
- listing managed units
- checking service status
- listing declared ports
- listing recent auth failures
- listing recent login sessions
- listing stale open sessions without logout
- restarting the whole stack or specific channels/instances
- viewing logs
- listing core files in the runtime tree
- collecting incident bundles
- running the root-only headless healthcheck
- waiting for login-ready state after restart
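Several of these operations chain together into a routine maintenance flow. A minimal sketch using only subcommands documented in the examples below; treating a non-zero exit code as failure is an assumption about `metinctl`, not something this document states:
```bash
# Restart one channel's cores, block until the stack is login-ready
# again, then confirm with the light readiness probe. Assumes each
# metinctl subcommand exits non-zero on failure so the chain stops early.
metinctl restart channel:1 &&
  metinctl wait-ready &&
  metinctl healthcheck --mode ready
```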
## Examples
Show inventory:
```bash
metinctl inventory
```
Show current unit state:
```bash
metinctl status
```
Show a quick operational summary:
```bash
metinctl summary
```
Show declared ports and whether they are currently listening:
```bash
metinctl ports --live
```
Show recent real auth failures and skip smoke-test logins:
```bash
metinctl auth-failures
```
Show recent auth success/failure flow:
```bash
metinctl auth-activity
```
Restrict recent auth activity to failures and include smoke-test logins:
```bash
metinctl auth-activity --status failure --include-smoke
```
Show auth activity grouped by IP:
```bash
metinctl auth-ips
```
Include smoke-test logins in the `auth-failures` listing as well:
```bash
metinctl auth-failures --include-smoke
```
Show recent login sessions from `log.loginlog2`:
```bash
metinctl sessions
```
Show only sessions that still have no recorded logout:
```bash
metinctl sessions --active-only
```
Show stale open sessions older than 30 minutes:
```bash
metinctl session-audit
```
Use a different stale threshold:
```bash
metinctl session-audit --stale-minutes 10
```
Restart only channel 1 cores:
```bash
metinctl restart channel:1
```
Restart one specific game instance:
```bash
metinctl restart instance:channel1_core2
```
Tail auth logs:
```bash
metinctl logs auth -n 200 -f
```
Run the deeper end-to-end healthcheck:
```bash
metinctl healthcheck --mode full
```
Run the lighter readiness probe:
```bash
metinctl healthcheck --mode ready
```
Wait until a restarted stack is login-ready:
```bash
metinctl wait-ready
```
List core files currently present in the runtime tree:
```bash
metinctl cores
```
Collect an incident bundle with logs, unit status, port state and repository revisions:
```bash
metinctl incident-collect --tag auth-timeout --since "-20 minutes"
```
List the most recent incident bundles:
```bash
metinctl incidents
```
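Nothing here describes automatic cleanup of old bundles, so retention is left to the operator. A minimal sketch in shell; the path comes from the collector section below, while the function name and the 30-day default are illustrative assumptions:
```bash
# Print incident bundles older than the given number of days.
# Deliberately print-only: deletion is left to the caller.
prune_candidates() {
  local dir=${1:-/var/lib/metin/incidents}
  local days=${2:-30}
  find "$dir" -mindepth 1 -maxdepth 1 -mtime "+$days" -print
}
```
Once the listing looks right, piping it through `xargs -r rm -rf` turns it into an actual sweep.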
## systemd installer behavior
`deploy/systemd/install_systemd.py` now uses the same inventory and installs `metinctl`.
It also reconciles enabled game instance units against the selected channels:
- selected game units are enabled
- stale game units are disabled
- if `--restart` is passed, stale game units are disabled with `--now`
This makes channel enablement declarative instead of depending on whatever happened to be enabled previously.
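After editing `deploy/channel-inventory.json`, re-running the installer reconciles the units against the new selection. Invoking it with `sudo python3` is an assumption about how the script is meant to be run; `--restart` is the only flag described here:
```bash
# Re-install units and metinctl from the current inventory; with
# --restart, stale game units are disabled with --now.
sudo python3 deploy/systemd/install_systemd.py --restart
```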
## Crash / Incident Pipeline
The Debian deployment now also installs:
- `/usr/local/sbin/metin-collect-incident`
The collector creates a timestamped bundle under:
- `/var/lib/metin/incidents`
Each bundle contains:
- repo revisions for `m2dev-server` and `m2dev-server-src`
- `systemctl status` for the whole stack
- recent `journalctl` output per unit
- listener state from `ss -ltnp`
- tailed runtime `syslog.log` and `syserr.log` files
- metadata for any `core*` files found under `runtime/server/channels`
If you call it with `--include-cores`, matching core files are copied into the bundle as well.
The runtime units now also declare `LimitCORE=infinity`. After the next service restart, the game processes are therefore permitted to emit core dumps, provided the host's kernel core-dump policy allows it.
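Whether a dump is actually written still depends on that host policy, which these standard Linux checks make visible:
```bash
# Where the kernel writes core dumps, or the pipe handler
# (e.g. systemd-coredump) if the pattern starts with "|":
cat /proc/sys/kernel/core_pattern
# Soft core-size limit in the current shell -- unrelated to the
# unit's LimitCORE, but a common source of confusion:
ulimit -c
```
For the unit-level limit itself, `systemctl show -p LimitCORE <unit>` reports the effective value once a real game unit name is substituted.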