# Server runtime audit

Engineer-to-engineer writeup of what the VPS `mt2.jakubkadlec.dev` is actually running as of 2026-04-14. Existing docs under `docs/` describe the intended layout (`debian-runtime.md`, `database-bootstrap.md`, `config-and-secrets.md`); this document is a ground-truth snapshot from a live recon session, with PIDs, paths, versions and surprises. Companion: `docs/server-topology.md` for the ASCII diagram and port table.

## TL;DR

- Only one metin binary is alive right now: the **`db`** helper on port `9000` (PID `1788997` at audit time, cwd `/home/mt2.jakubkadlec.dev/metin/runtime/server/channels/db`).
- **`game_auth` and all `channel*_core*` processes are NOT running.** The familiar port listing (auth `:11000/12000`, channel1 cores `:11011/12011` etc.) reflects *intended* state from the systemd units, not the current live process table. `ss -tlnp` only shows `0.0.0.0:9000` for m2.
- The game/auth binaries are **not present on disk either**. Only `share/bin/db` exists; there is no `share/bin/game_auth` and no `share/bin/channel*_core*`. Those channels cannot start even if requested.
- The `db` unit is currently **flapping / crash-looping**. `systemctl` reports `deactivating (stop-sigterm)`; syserr.log shows repeated `Connection reset by peer` from client peers (auth/game trying to reconnect is the usual culprit, but here nobody is connecting — cause needs verification). Two fresh `core.<pid>` files (97 MB each) sit in the db channel dir from 13:24 and 13:25 today.
- Orchestration is **pure systemd**, not the upstream `start.py` / tmux setup. The README still documents `start.py`, so the README is stale for the Debian VPS; `deploy/systemd/` + `docs/debian-runtime.md` are authoritative.
- MariaDB 11.8.6 is the backing store on `127.0.0.1:3306`. The DB user the stack is configured to use is `bootstrap` (from `share/conf/db.txt` / `game.txt`). The actual password is injected via `/etc/metin/metin.env`, which is `root:root 600` and intentionally unreadable by the unprivileged runtime/inspector account.

## Host

- Hostname: `vmi3229987` (Contabo), public name `mt2.jakubkadlec.dev`.
- OS: Debian 13 (trixie).
- MariaDB: `mariadbd` 11.8.6, PID `103624`, listening on `127.0.0.1:3306`.
- All metin services run as the unprivileged user `mt2.jakubkadlec.dev:mt2.jakubkadlec.dev`.
- Runtime root: `/home/mt2.jakubkadlec.dev/metin/runtime/server` (755 MB across `channels/`, 123 MB across `share/`, total metin workspace on the box ~1.7 GB).

## Processes currently alive

From `ps auxf` + `ss -tlnp` at audit time:

```
mysql    103624   /usr/sbin/mariadbd          — 127.0.0.1:3306
mt2.j+   1788997  /home/.../channels/db/db    — 0.0.0.0:9000
```

No other m2 binaries show up. `ps` has **zero** matches for `game_auth`, `channel1_core1`, `channel1_core2`, `channel1_core3`, `channel99_core1`.

Per-process inspection:

| PID | cwd | exe (resolved) | fds of interest |
| ------- | ------------------------------- | --------------------------------------- | --------------- |
| 1788997 | `.../runtime/server/channels/db` | `.../share/bin/db` (via `./db` symlink) | fd 3 → syslog.log, fd 4 → syserr.log, fd 11 TCP `*:9000`, fd 17 `[eventpoll]` (epoll fdwatch) |

The `db` symlink inside the channel dir resolves to `../../share/bin/db`, which is an `ELF 64-bit LSB pie executable, x86-64, dynamically linked, BuildID fc049d0f..., not stripped`. Build identifier from `channels/db/VERSION.txt`: **`db revision: b2b037f-dirty`** — the dirty tag is a red flag; the build wasn't from a clean checkout of `m2dev-server-src`.

The `usage.txt` in the same directory shows hourly heartbeat rows with `| 0 | 0 |` since 2026-04-13 21:00 (the "sessions / active" columns are stuck at zero — consistent with no game channels being connected).
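The per-process rows above came from a plain `/proc` walk; a minimal sketch of that inspection (the function name is mine; on the VPS you'd point it at PID `1788997`, and you need a user allowed to read the target's `/proc` entries):

```bash
#!/usr/bin/env bash
# Sketch: reproduce the cwd/exe/fd columns of the per-process table
# by reading /proc/<pid>/{cwd,exe,fd}. Demoed on the current shell.
inspect_pid() {
    local pid="$1"
    echo "cwd: $(readlink "/proc/${pid}/cwd")"
    echo "exe: $(readlink "/proc/${pid}/exe")"
    # Each fd entry is a symlink to a file, socket, or anon inode
    # (this is where the syslog.log / syserr.log / TCP fds show up).
    local fd
    for fd in "/proc/${pid}/fd"/*; do
        echo "fd ${fd##*/}: $(readlink "$fd")"
    done
}

inspect_pid "$$"   # demo on this shell; use 1788997 on the VPS
```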
## Binaries actually present on disk

```
/home/mt2.jakubkadlec.dev/metin/runtime/server/share/bin/
├── db    ← present, used
└── game  ← present (shared game binary, but not launched under any
            instance name that the systemd generator expects)
```

What is NOT present:

- `share/bin/game_auth`
- `share/bin/channel1_core1`, `channel1_core2`, `channel1_core3`
- `share/bin/channel99_core1`

The `metin-game-instance-start` helper (`/usr/local/libexec/...`) is a bash wrapper that `cd`s into `channels/<channel>/<core>/` and execs `./<instance>`, e.g. `./channel1_core1`. Those per-instance binaries don't exist yet.

The channel dirs themselves (`channel1/core1/`, etc.) already contain the scaffolding (`CONFIG`, `conf`, `data`, `log`, `mark`, `package`, `p2p_packet_info.txt`, `packet_info.txt`, `syserr.log`, `syslog.log`, `version.txt`), but `version.txt` says `game revision: unknown` and the per-instance executable file is missing. The log directory has a single stale `syslog_2026-04-13.log`.

Interpretation: the deploy pipeline that builds `m2dev-server-src` and drops instance binaries into `share/bin/` has not yet been run (or has not been re-run since the tree was laid out on 2026-04-13). Once Jakub's `debian-foundation` build produces per-instance symlinked/hardlinked binaries, the `metin-game@*` units should come up automatically on the next `systemctl restart metin-server`.

## How things are started

All orchestration goes through systemd units under `/etc/systemd/system/`, installed from `deploy/systemd/` via `deploy/systemd/install_systemd.py`. Unit list and roles:

| Unit | Type | Role |
| ----------------------------------------- | -------- | -------------------------------------------- |
| `metin-server.service` | oneshot | top-level grouping, `Requires=mariadb.service`. `ExecStart=/bin/true`, `RemainAfterExit=yes`. All sub-units are `PartOf=metin-server.service`, so restarting `metin-server` cycles everything. |
| `metin-db.service` | simple | launches `.../channels/db/db` as the runtime user, `Restart=on-failure`, `LimitCORE=infinity`, env file `/etc/metin/metin.env`. |
| `metin-db-ready.service` | oneshot | runs `/usr/local/libexec/metin-wait-port 127.0.0.1 9000 30` — gate that blocks auth+game until the DB socket is listening. |
| `metin-auth.service` | simple | launches `.../channels/auth/game_auth`. Requires db-ready. |
| `metin-game@channel1_core1..3.service` | template | each runs `/usr/local/libexec/metin-game-instance-start <instance>`, which execs `./<instance>` in that channel dir. |
| `metin-game@channel99_core1.service` | template | same, for channel 99. |

Dependency chain:

```
mariadb.service
    │
    ▼
metin-db.service ──► metin-db-ready.service ──► metin-auth.service
                                            └──► metin-game@*.service
                                                      │
                                                      ▼
                                    metin-server.service (oneshot umbrella)
```

All units have `PartOf=metin-server.service`, `Restart=on-failure`, `LimitNOFILE=65535`, `LimitCORE=infinity`. None run in Docker. None use tmux, screen or the upstream `start.py`. **The upstream `start.py` / `stop.py` in the repo are NOT wired up on this host** and should be treated as FreeBSD-era legacy.

The per-instance launcher `/usr/local/libexec/metin-game-instance-start` (installed by `install_systemd.py`) is:

```bash
#!/usr/bin/env bash
set -euo pipefail

instance="${1:?missing instance name}"
root_dir="/home/mt2.jakubkadlec.dev/metin/runtime/server/channels"

channel_dir="${instance%_*}"   # e.g. channel1 from channel1_core2
core_dir="${instance##*_}"     # e.g. core2

workdir="${root_dir}/${channel_dir}/${core_dir}"
cd "$workdir"
exec "./${instance}"
```

Notes:

- the `%_*` / `##*_` parse is brittle — an instance name with more than one underscore would misbehave. For current naming (`channelN_coreM`) it works.
- the helper does not redirect stdout/stderr; both go to the journal via systemd.
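If the naming scheme ever grows a third underscore, a validate-then-split parse is safer than bare parameter expansion. A sketch (this is *not* the installed helper, just an illustration of the stricter variant):

```bash
#!/usr/bin/env bash
# Sketch: reject anything that is not channel<digits>_core<digits>
# before splitting, instead of trusting ${instance%_*} / ${instance##*_}.
split_instance() {
    local instance="$1"
    if [[ ! "$instance" =~ ^(channel[0-9]+)_(core[0-9]+)$ ]]; then
        echo "unexpected instance name: ${instance}" >&2
        return 1
    fi
    # Print the workdir suffix, e.g. channel1/core2.
    echo "${BASH_REMATCH[1]}/${BASH_REMATCH[2]}"
}

split_instance channel1_core2          # → channel1/core2
split_instance channel99_core1         # → channel99/core1
split_instance weird_extra_core1 || true   # rejected with an error
```

The same regex could be dropped into the existing launcher as a guard before the `cd`, without changing its happy path.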
## Config files the binaries actually read

All m2 config files referenced by the running/installed stack, resolved to their real path on disk:

| Config file | Read by | Purpose |
| ------------------------------------------------------------------------ | ------------- | --------------------------------------------------- |
| `share/conf/db.txt` | `db` | SQL hosts, BIND_PORT=9000, item id range, hotbackup |
| `share/conf/game.txt` | game cores | DB_ADDR=127.0.0.1, DB_PORT=9000, SQL creds, flags |
| `share/conf/CMD` | game cores | in-game command ACL (notice, warp, item, …) |
| `share/conf/item_proto.txt`, `mob_proto.txt`, `item_names*.txt`, `mob_names*.txt` | both db and game | static content tables |
| `channels/db/conf` (symlink → `share/conf`) | `db` | every db channel looks into this flat conf tree |
| `channels/db/data` (symlink → `share/data`) | `db`/`game` | mob/pc/dungeon/spawn data |
| `channels/db/locale` (symlink → `share/locale`) | all | locale assets |
| `channels/auth/CONFIG` | `game_auth` | `HOSTNAME: auth`, `CHANNEL: 1`, `PORT: 11000`, `P2P_PORT: 12000`, `AUTH_SERVER: master` |
| `channels/channel1/core1/CONFIG` | core1 | `HOSTNAME: channel1_1`, `CHANNEL: 1`, `PORT: 11011`, `P2P_PORT: 12011`, `MAP_ALLOW: 1 4 5 6 3 23 43 112 107 67 68 72 208 302 304` |
| `channels/channel1/core2/CONFIG` | core2 | `PORT: 11012`, `P2P_PORT: 12012` |
| `channels/channel1/core3/CONFIG` | core3 | `PORT: 11013`, `P2P_PORT: 12013` |
| `channels/channel99/core1/CONFIG` | ch99 core1 | `HOSTNAME: channel99_1`, `CHANNEL: 99`, `PORT: 11991`, `P2P_PORT: 12991`, `MAP_ALLOW: 113 81 100 101 103 105 110 111 114 118 119 120 121 122 123 124 125 126 127 128 181 182 183 200` |
| `/etc/metin/metin.env` | all systemd units via `EnvironmentFile=-` | host-local secrets/overrides, root:root mode 600. Contents not readable during this audit. |

Flat `share/conf/db.txt` (verbatim, with the bootstrap placeholder credentials):

```
WELCOME_MSG = "Database connector is running..."
SQL_ACCOUNT = "127.0.0.1 account bootstrap change-me 0"
SQL_PLAYER = "127.0.0.1 player bootstrap change-me 0"
SQL_COMMON = "127.0.0.1 common bootstrap change-me 0"
SQL_HOTBACKUP= "127.0.0.1 hotbackup bootstrap change-me 0"
TABLE_POSTFIX = ""
BIND_PORT = 9000
CLIENT_HEART_FPS = 60
HASH_PLAYER_LIFE_SEC = 600
BACKUP_LIMIT_SEC = 3600
PLAYER_ID_START = 100
PLAYER_DELETE_LEVEL_LIMIT = 70
PLAYER_DELETE_CHECK_SIMPLE = 1
ITEM_ID_RANGE = 2000000000 2100000000
MIN_LENGTH_OF_SOCIAL_ID = 6
SIMPLE_SOCIALID = 1
```

The `bootstrap` / `change-me` values are git-tracked placeholders. `config-and-secrets.md` explicitly says these are templates, and real values are expected to come from `/etc/metin/metin.env`. This only works if the server source re-reads credentials from the environment when injected; verify by grepping `m2dev-server-src` for the SQL env var names used by `db`/`game`. (**Open question**: confirm which env var names override the in-file creds; the audit session couldn't read `metin.env` directly.)

## Database

- Engine: **MariaDB 11.8.6** (`mariadb --version`).
- PID: 103624, listening on `127.0.0.1:3306` only. No external TCP exposure; the unix socket was not checked (likely `/run/mysqld/mysqld.sock`).
- Expected databases from `docs/database-bootstrap.md`: `account`, `player`, `common`, `log`, `hotbackup`.
- Stack-side DB user: `bootstrap` (placeholder in git, real password in `/etc/metin/metin.env`).
- Could not enumerate actual tables during the audit — both `mysql -uroot` and `sudo -u mt2.jakubkadlec.dev mariadb` failed (Access denied), since root uses unix-socket auth for `root@localhost` and the runtime user has no CLI credentials outside the systemd environment.
- **To inspect the DB read-only:** either run as root with `sudo mariadb` (unix-socket auth — needs confirmation it's enabled), or open `/etc/metin/metin.env` as root, grab the `bootstrap` password, then `mariadb -ubootstrap -p account` etc. Do not attempt writes.
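Back on the config side: until the override contract is confirmed, a cheap pre-flight check is to flag any `SQL_*` line still carrying the `change-me` placeholder. A sketch (the function name is mine; it assumes the `"host db user password flag"` layout shown in `db.txt` above):

```bash
#!/usr/bin/env bash
# Sketch: scan a db.txt-style file for SQL_* lines whose 4th field
# inside the quoted value is still the git-tracked "change-me" literal.
check_placeholders() {
    local conf="$1"
    awk -F'"' '/^SQL_/ {
        key = $1; sub(/[ =]+$/, "", key)   # SQL_ACCOUNT from "SQL_ACCOUNT = "
        split($2, f, " ")                  # host db user password flag
        if (f[4] == "change-me")
            printf "placeholder password in %s (user %s)\n", key, f[3]
    }' "$conf"
}
```

Usage: `check_placeholders share/conf/db.txt` — any output means the env-file override had better be in place before `metin-db.service` starts, or the `change-me` literal is what reaches MariaDB.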
## Logging

Every m2 process writes two files in its channel dir, via fd 3 / fd 4:

- `syslog.log` — verbose info stream (rotated by date in some dirs: `channel1/core1/log/syslog_2026-04-13.log`).
- `syserr.log` — error stream. Look here first on crash.

The `db` channel's `syslog.log` grows fast (36 MB today; rotation appears to be manual — there is a `log/` dir with a daily file, but the current `syslog.log` sits at the top level), and the channel dir collects `core.<pid>` ELF cores on SIGSEGV/SIGABRT because `LimitCORE=infinity` is set.

The systemd journal captures stdout/stderr as well, so `journalctl -u metin-db --since '1 hour ago'` is the fastest way to see startup banners and `systemd`-observed restarts. Example from this audit:

```
Apr 14 13:26:40 vmi3229987 db[1788997]: Real Server
Apr 14 13:26:40 vmi3229987 db[1788997]: Success ACCOUNT
Apr 14 13:26:40 vmi3229987 db[1788997]: Success COMMON
Apr 14 13:26:40 vmi3229987 db[1788997]: Success HOTBACKUP
Apr 14 13:26:40 vmi3229987 db[1788997]: mysql_real_connect: Lost connection to server at 'sending authentication information', system error: 104
```

On every start, `db` opens *more than a dozen* AsyncSQL pools ("AsyncSQL: connected to 127.0.0.1 (reconnect 1)" repeated ~12 times), suggesting a large per-instance pool size. Worth checking whether that needs tuning.

The current `syserr.log` in `channels/db/` is dominated by:

```
[error] [int CPeerBase::Recv()()] socket_read failed Connection reset by peer
[error] [int CClientManager::Process()()] Recv failed
```

which is the peer-disconnect path. Since no auth/game peers should be connecting right now, this is either a leftover from an earlier start, or something else (maybe a healthcheck probe) is touching 9000 and aborting. See open questions.

## Ports

Live `ss -tlnp` on the VPS (m2-relevant lines only):

| Addr:port | Who | Exposure |
| ---------------- | ------------ | -------------- |
| `0.0.0.0:9000` | `db` | **INADDR_ANY** — listens on all interfaces. Look at this. |
| `127.0.0.1:3306` | `mariadbd` | localhost only |

Not currently listening (would be if auth/game were up):

- `11000` / `12000` — auth client + p2p
- `11011..11013` / `12011..12013` — channel1 cores + p2p
- `11991` / `12991` — channel99 core1 + p2p

Other listeners on the host (not m2): `:22`, `:2222` (gitea ssh), `:25` (postfix loopback), `:80/:443` (Caddy), `:3000` (Gitea), `:2019` (Caddy admin), `:33891` (unknown loopback), `:5355` / `:53` (resolver).

**Firewalling note:** `db` binding to `0.0.0.0:9000` is a concern. In the normal m2 architecture, `db` only talks to auth/game cores on the same host and should bind to `127.0.0.1` only. The current binding comes from the `BIND_PORT = 9000` line in `share/conf/db.txt`, which in this server fork apparently defaults to `INADDR_ANY`. If the Contabo firewall or iptables/nft rules don't block 9000 from the outside, this is exposed. **Open question: verify iptables/nftables on the host, or bind `db` to `127.0.0.1` explicitly in source / config.**

## Data directory layout

All under `/home/mt2.jakubkadlec.dev/metin/runtime/server/share/`:

```
share/
├── bin/     ← compiled binaries (only db + game present today)
├── conf/    ← db.txt, game.txt, CMD, item_proto.txt, mob_proto.txt,
│              item_names_*.txt, mob_names_*.txt (17 locales each)
├── data/    ← DTA/, dungeon/, easterevent/, mob_spawn/, monster/,
│              pc/, pc2/ (27 MB total)
├── locale/  ← 86 MB, per-locale strings + binary quest outputs
├── mark/
└── package/
```

Per-channel scaffolding under `channels/` symlinks `conf`, `data`, `locale` back into `share/`, so each channel reads from a single canonical content tree.

## Disk usage footprint

```
/home/mt2.jakubkadlec.dev/metin/        1.7 G   (total metin workspace)
runtime/server/share/                   123 M
runtime/server/share/data/               27 M
runtime/server/share/locale/             86 M
runtime/server/channels/                755 M
channels/db/core.178508{2,8}           ~194 M   (two 97 MB coredumps)
channels/db/syslog.log                   36 M   (grows fast)
```

Core dumps dominate the channel dir footprint right now.
Cleaning up old `core.*` files is safe when the db is not actively crashing (and only after Jakub has looked at them).

## How to restart channel1_core2 cleanly

Pre-flight checklist:

1. Confirm `share/bin/channel1_core2` actually exists on disk — right now it does **not**, so the instance cannot start. Skip straight to the "rebuild / redeploy" section in Jakub's `docs/deploy-workflow.md` before trying.
2. Confirm `metin-db.service` and `metin-auth.service` are `active (running)` (`systemctl is-active metin-db metin-auth`). If not, fix upstream first — a clean restart of core2 requires a healthy auth + db.
3. Check that no player is currently online on that core. With `usage.txt` at 0/0 this is trivially true today, but in prod do `cat channels/channel1/core2/usage.txt` first.
4. Look at recent logs so you have a baseline: `journalctl -u metin-game@channel1_core2 -n 50 --no-pager`

Clean restart:

```bash
# on the VPS as root or with sudo
systemctl restart metin-game@channel1_core2.service
systemctl status metin-game@channel1_core2.service --no-pager
journalctl -u metin-game@channel1_core2.service -n 100 --no-pager -f
```

Because the unit is `Type=simple` with `Restart=on-failure`, `systemctl restart` sends SIGTERM, waits up to `TimeoutStopSec=60`, then brings the process back up. The binary's own `hupsig()` handler logs the SIGTERM into `syserr.log` and shuts down gracefully.

Post-restart verification:

```bash
ss -tlnp | grep -E ':(11012|12012)\b'   # expect both ports listening
tail -n 30 /home/mt2.jakubkadlec.dev/metin/runtime/server/channels/channel1/core2/syserr.log
```

If the process refuses to stay up (`Restart=on-failure` loops it), **do not** just bump `RestartSec`; grab the last 200 journal lines and the last 200 syserr lines and open an issue in `metin-server/m2dev-server-src` against Jakub. Do not edit the unit file ad-hoc on the host.

## Open questions

These are things the audit could not determine without making changes or getting more access. They need a human operator to resolve.

1. **Who produces the per-instance binaries** (`channel1_core1`, `channel1_core2`, `channel1_core3`, `channel99_core1`, `game_auth`)? The deploy flow expects them in `share/bin/` and channel dirs, but they are missing. Is this still hand-built, or is there a make target that hardlinks `share/bin/game` into each `channel*/core*/` name?
2. **Why is `db` currently flapping** (`deactivating (stop-sigterm)` in systemctl, plus two fresh core dumps on 2026-04-14 13:24/13:25 and dozens of `CPeerBase::Recv()` errors)? Nothing should be connecting to port 9000 right now.
3. **What the real `metin.env` contains** — specifically, the actual `bootstrap` DB password, and whether there is a separate admin-page password override. The audit did not touch `/etc/metin/metin.env`.
4. **Exact override-variable contract** between `share/conf/db.txt` placeholders and the env file. We need to verify which env var names the `db`/`game` source actually reads, so we know whether the `change-me` literal is ever used at runtime.
5. **Is `db` intended to bind `0.0.0.0:9000`?** From a defense-in-depth standpoint it should be `127.0.0.1`. Needs either a source fix or a host firewall rule. Check current nftables state.
6. **`VERSION.txt` says `db revision: b2b037f-dirty`.** Which tree was this built from, and why "dirty"? Point back at the `m2dev-server-src` commit and confirm the build artefact is reproducible.
7. **Log rotation**: `channels/db/syslog.log` is already 36 MB today with nothing connected. The `channels/channel1/core1/log/` dated-file convention suggests daily rotation, but `db`'s own syslog is not rotating. Confirm whether `logrotate` or an in-process rotator is expected to own this.
8. **Hourly heartbeat in `usage.txt`** comes from where? Every ~1 h a row is appended — this is probably the `db` backup tick, but confirm it's not some cron job.
9. **`mariadbd`'s live databases**: could not enumerate table names without credentials. `docs/database-bootstrap.md` lists the expected set; someone with `metin.env` access should confirm `account`, `player`, `common`, `log`, `hotbackup` are all present and populated.
10. **Stale README**: the top-level `README.md` still documents FreeBSD + `start.py`. Not urgent, but worth a `docs:` sweep to point readers at `docs/debian-runtime.md` as the canonical layout.
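For question 5, the exposure half can be answered with a minimal TCP probe — run it from a host *outside* Contabo against the public IP, since a local check only proves the bind, not the firewall. A sketch using bash's `/dev/tcp` (the function name is mine):

```bash
#!/usr/bin/env bash
# Sketch: is host:port accepting TCP connections?
# From an external host, "open" on 9000 means the firewall is NOT
# blocking the db listener; "closed/filtered" means it is (or db is down).
probe_port() {
    local host="$1" port="$2"
    if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "open ${host}:${port}"
    else
        echo "closed/filtered ${host}:${port}"
    fi
}

probe_port 127.0.0.1 9000   # on the VPS itself: open while db is up
```

This complements, rather than replaces, reading the actual `nft list ruleset` / Contabo panel rules.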