m2dev-server/docs/server-runtime.md
2026-04-14 13:36:54 +02:00

Server runtime audit

Engineer-to-engineer writeup of what the VPS mt2.jakubkadlec.dev is actually running as of 2026-04-14. Existing docs under docs/ describe the intended layout (debian-runtime.md, database-bootstrap.md, config-and-secrets.md); this document is a ground-truth snapshot from a live recon session, with PIDs, paths, versions and surprises.

Companion: docs/server-topology.md for the ASCII diagram and port table.

TL;DR

  • Only one metin binary is alive right now: the db helper on port 9000 (PID 1788997 at audit time, cwd /home/mt2.jakubkadlec.dev/metin/runtime/server/channels/db).
  • game_auth and all channel*_core* processes are NOT running. The port listing in the existing docs (auth :11000/12000, channel1 cores :11011/12011 etc.) reflects intended state from the systemd units, not the current live process table. ss -tlnp only shows 0.0.0.0:9000 for m2.
  • The per-instance binaries are not present on disk either. share/bin/ holds only db and a shared game binary; there is no share/bin/game_auth and no share/bin/channel*_core*. Auth and the game channels cannot start even if requested.
  • The db unit is currently flapping / crash-looping. systemctl reports deactivating (stop-sigterm); syserr.log shows repeated Connection reset by peer from client peers (auth/game trying to reconnect is the usual culprit, but here nobody is connecting — cause needs verification). Two fresh core.<pid> files (97 MB each) sit in the db channel dir from 13:24 and 13:25 today.
  • Orchestration is pure systemd, not the upstream start.py / tmux setup. The README still documents start.py, so the README is stale for the Debian VPS; deploy/systemd/ + docs/debian-runtime.md are authoritative.
  • MariaDB 11.8.6 is the backing store on 127.0.0.1:3306. The DB user the stack is configured to use is bootstrap (from share/conf/db.txt / game.txt). The actual password is injected via /etc/metin/metin.env, which is root:root 600 and intentionally unreadable by the runtime user inspector account.

Host

  • Hostname: vmi3229987 (Contabo), public name mt2.jakubkadlec.dev.
  • OS: Debian 13 (trixie).
  • MariaDB: mariadbd 11.8.6, PID 103624, listening on 127.0.0.1:3306.
  • All metin services run as the unprivileged user mt2.jakubkadlec.dev:mt2.jakubkadlec.dev.
  • Runtime root: /home/mt2.jakubkadlec.dev/metin/runtime/server (755 MB across channels/, 123 MB across share/, total metin workspace on the box ~1.7 GB).

Processes currently alive

From ps auxf + ss -tlnp at audit time:

mysql    103624  /usr/sbin/mariadbd                     — 127.0.0.1:3306
mt2.j+  1788997  /home/.../channels/db/db               — 0.0.0.0:9000

No other m2 binaries show up. ps has zero matches for game_auth, channel1_core1, channel1_core2, channel1_core3, channel99_core1.

Per-process inspection:

| PID | cwd | exe (resolved) | fds of interest |
|---|---|---|---|
| 1788997 | .../runtime/server/channels/db | .../share/bin/db (via ./db symlink) | fd 3 → syslog.log, fd 4 → syserr.log, fd 11 → TCP *:9000, fd 17 → [eventpoll] (epoll fdwatch) |

The db symlink inside the channel dir resolves to ../../share/bin/db, which is an ELF 64-bit LSB pie executable, x86-64, dynamically linked, BuildID fc049d0f..., not stripped. Build identifier from channels/db/VERSION.txt: db revision: b2b037f-dirty — the dirty tag is a red flag: the build wasn't made from a clean checkout of m2dev-server-src.

The usage.txt in the same directory shows hourly heartbeat rows with | 0 | 0 | since 2026-04-13 21:00 (the "sessions / active" columns are stuck at zero — consistent with no game channels being connected).

Binaries actually present on disk

/home/mt2.jakubkadlec.dev/metin/runtime/server/share/bin/
├── db        ← present, used
└── game      ← present (shared game binary, but not launched under any
                instance name that the systemd generator expects)

What is NOT present:

  • share/bin/game_auth
  • share/bin/channel1_core1, channel1_core2, channel1_core3
  • share/bin/channel99_core1

The metin-game-instance-start helper (/usr/local/libexec/...) is a bash wrapper that cds into channels/<channel>/<core>/ and execs ./<instance>, e.g. ./channel1_core1. Those per-instance binaries don't exist yet. The channel dirs themselves (channel1/core1/, etc.) already contain the scaffolding (CONFIG, conf, data, log, mark, package, p2p_packet_info.txt, packet_info.txt, syserr.log, syslog.log, version.txt), but version.txt says game revision: unknown and the per-instance executable file is missing. The log directory has a single stale syslog_2026-04-13.log.

Interpretation: the deploy pipeline that builds m2dev-server-src and drops instance binaries into share/bin/ has not yet been run (or has not been re-run since the tree was laid out on 2026-04-13). Once Jakub's debian-foundation build produces per-instance symlinked/hardlinked binaries, the metin-game@* units should come up automatically on the next systemctl restart metin-server.
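A quick existence check for the expected set. The instance list below is read off the systemd units described later in this doc; `BIN_DIR` is parameterised so the sketch can be dry-run off-host:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Report which expected m2 binaries exist (and are executable) in a bin dir.
# Parameterised so the check can be exercised anywhere, not just the VPS.
check_bins() {
  local bin_dir="$1"; shift
  local name
  for name in "$@"; do
    if [[ -x "${bin_dir}/${name}" ]]; then
      printf 'present: %s\n' "$name"
    else
      printf 'MISSING: %s\n' "$name"
    fi
  done
}

# Expected set, inferred from the systemd units on this host.
check_bins "${BIN_DIR:-/home/mt2.jakubkadlec.dev/metin/runtime/server/share/bin}" \
  db game game_auth channel1_core1 channel1_core2 channel1_core3 channel99_core1
```

Run on the VPS today, this should report `present` only for `db` and `game`.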

How things are started

All orchestration goes through systemd units under /etc/systemd/system/, installed from deploy/systemd/ via deploy/systemd/install_systemd.py.

Unit list and roles:

| Unit | Type | Role |
|---|---|---|
| metin-server.service | oneshot | top-level grouping, Requires=mariadb.service. ExecStart=/bin/true, RemainAfterExit=yes. All sub-units are PartOf=metin-server.service so restarting metin-server cycles everything. |
| metin-db.service | simple | launches .../channels/db/db as the runtime user, Restart=on-failure, LimitCORE=infinity, env file /etc/metin/metin.env. |
| metin-db-ready.service | oneshot | runs /usr/local/libexec/metin-wait-port 127.0.0.1 9000 30 — gate that blocks auth+game until the DB socket is listening. |
| metin-auth.service | simple | launches .../channels/auth/game_auth. Requires db-ready. |
| metin-game@channel1_core1..3.service | template | each runs /usr/local/libexec/metin-game-instance-start <instance>, which execs ./<instance> in that channel dir. |
| metin-game@channel99_core1.service | template | same, for channel 99. |

Dependency chain:

mariadb.service
      │
      ▼
metin-db.service ──► metin-db-ready.service ──► metin-auth.service
                                             └► metin-game@*.service
                                                    │
                                                    ▼
                                             metin-server.service  (oneshot umbrella)

All units have PartOf=metin-server.service, Restart=on-failure, LimitNOFILE=65535, LimitCORE=infinity. None run in Docker. None use tmux, screen or the upstream start.py. The upstream start.py / stop.py in the repo are NOT wired up on this host and should be treated as FreeBSD-era legacy.
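metin-wait-port itself was not read during this audit; a plausible minimal equivalent of what metin-db-ready.service needs, sketched here with bash's /dev/tcp pseudo-device (an assumption — the real helper may use nc or python instead):

```shell
#!/usr/bin/env bash
# Sketch of a metin-wait-port HOST PORT TIMEOUT helper. Not the installed
# script: it polls once per second until a TCP connect succeeds, then exits 0.
set -u

wait_port() {
  local host="$1" port="$2" timeout="${3:-30}" i
  for ((i = 0; i < timeout; i++)); do
    # ':' opens the pseudo-device read-only, i.e. attempts one TCP connect.
    if (: <"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Only act as a CLI when given arguments (keeps sourcing side-effect free).
if (( $# )); then
  wait_port "$@"
fi
```

The nonzero exit on timeout is what lets the oneshot unit fail and hold back the auth/game dependents.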

The per-instance launcher /usr/local/libexec/metin-game-instance-start (installed by install_systemd.py) is:

#!/usr/bin/env bash
set -euo pipefail
instance="${1:?missing instance name}"
root_dir="/home/mt2.jakubkadlec.dev/metin/runtime/server/channels"
channel_dir="${instance%_*}"           # e.g. channel1 from channel1_core2
core_dir="${instance##*_}"             # e.g. core2
workdir="${root_dir}/${channel_dir}/${core_dir}"
cd "$workdir"
exec "./${instance}"

Notes:

  • the %_* / ##*_ parse is brittle — an instance name with more than one underscore would misbehave. For current naming (channelN_coreM) it works.
  • the helper does not redirect stdout/stderr; both go to the journal via systemd.
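A stricter variant of that split — validating the name before trusting the expansions — could look like this (sketch only, not what is installed):

```shell
#!/usr/bin/env bash
# Validate-then-split for instance names: anything that is not channelN_coreM
# fails loudly instead of producing a bogus workdir. Not the installed helper.
set -euo pipefail

split_instance() {
  local instance="$1"
  if [[ "$instance" =~ ^(channel[0-9]+)_(core[0-9]+)$ ]]; then
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    printf 'unrecognised instance name: %s\n' "$instance" >&2
    return 1
  fi
}
```

With this, a hypothetical future name like `channel1_test_core2` is rejected instead of silently resolving to `channels/channel1_test/core2`.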

Config files the binaries actually read

All m2 config files referenced by the running/installed stack, resolved to their real path on disk:

| Config file | Read by | Purpose |
|---|---|---|
| share/conf/db.txt | db | SQL hosts, BIND_PORT=9000, item id range, hotbackup |
| share/conf/game.txt | game cores | DB_ADDR=127.0.0.1, DB_PORT=9000, SQL creds, flags |
| share/conf/CMD | game cores | in-game command ACL (notice, warp, item, …) |
| share/conf/item_proto.txt, mob_proto.txt, item_names*.txt, mob_names*.txt | both db and game | static content tables |
| channels/db/conf (symlink → share/conf) | db | every db channel looks into this flat conf tree |
| channels/db/data (symlink → share/data) | db/game | mob/pc/dungeon/spawn data |
| channels/db/locale (symlink → share/locale) | all | locale assets |
| channels/auth/CONFIG | game_auth | HOSTNAME: auth, CHANNEL: 1, PORT: 11000, P2P_PORT: 12000, AUTH_SERVER: master |
| channels/channel1/core1/CONFIG | core1 | HOSTNAME: channel1_1, CHANNEL: 1, PORT: 11011, P2P_PORT: 12011, MAP_ALLOW: 1 4 5 6 3 23 43 112 107 67 68 72 208 302 304 |
| channels/channel1/core2/CONFIG | core2 | PORT: 11012, P2P_PORT: 12012 |
| channels/channel1/core3/CONFIG | core3 | PORT: 11013, P2P_PORT: 12013 |
| channels/channel99/core1/CONFIG | ch99 core1 | HOSTNAME: channel99_1, CHANNEL: 99, PORT: 11991, P2P_PORT: 12991, MAP_ALLOW: 113 81 100 101 103 105 110 111 114 118 119 120 121 122 123 124 125 126 127 128 181 182 183 200 |
| /etc/metin/metin.env | all systemd units via EnvironmentFile=- | host-local secrets/overrides, root:root mode 600. Contents not readable during this audit. |
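The core CONFIG rows above follow a visible numbering convention (auth at 11000/12000 is the one exception). A helper that derives the expected ports from an instance name — the pattern is inferred from these four rows only, and the CONFIG files stay authoritative:

```shell
#!/usr/bin/env bash
# Derive expected PORT / P2P_PORT for a core instance from its name.
# Pattern inferred from the audited CONFIG files: PORT = 11000 + 10*channel
# + core, P2P_PORT = PORT + 1000. Auth (11000/12000) does not follow it.
set -euo pipefail

instance_ports() {
  local instance="$1" channel core port
  [[ "$instance" =~ ^channel([0-9]+)_core([0-9]+)$ ]] || return 1
  channel="${BASH_REMATCH[1]}"; core="${BASH_REMATCH[2]}"
  port=$((11000 + 10 * channel + core))
  printf 'PORT=%d P2P_PORT=%d\n' "$port" "$((port + 1000))"
}
```

Useful for post-restart port checks without opening the CONFIG file each time.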

Flat share/conf/db.txt (verbatim, with bootstrap secrets):

WELCOME_MSG  = "Database connector is running..."
SQL_ACCOUNT  = "127.0.0.1 account bootstrap change-me 0"
SQL_PLAYER   = "127.0.0.1 player bootstrap change-me 0"
SQL_COMMON   = "127.0.0.1 common bootstrap change-me 0"
SQL_HOTBACKUP= "127.0.0.1 hotbackup bootstrap change-me 0"
TABLE_POSTFIX = ""
BIND_PORT               = 9000
CLIENT_HEART_FPS        = 60
HASH_PLAYER_LIFE_SEC    = 600
BACKUP_LIMIT_SEC        = 3600
PLAYER_ID_START         = 100
PLAYER_DELETE_LEVEL_LIMIT = 70
PLAYER_DELETE_CHECK_SIMPLE = 1
ITEM_ID_RANGE           = 2000000000 2100000000
MIN_LENGTH_OF_SOCIAL_ID = 6
SIMPLE_SOCIALID         = 1

The bootstrap / change-me values are git-tracked placeholders. config-and-secrets.md explicitly says these are templates, and real values are expected to come from /etc/metin/metin.env. That only works if the server source overrides the in-file credentials from the environment; verify by grepping m2dev-server-src for the SQL env var names used by db/game. (Open question: confirm which env var names override the in-file creds; the audit session couldn't read metin.env directly.)
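For tooling around these files, the quoted quintuple splits into fields as below. The field order is read directly off the file above; the trailing numeric field is assumed (unconfirmed against source) to be a port, with 0 meaning the library default:

```shell
#!/usr/bin/env bash
# Split a db.txt SQL_* line into named fields. Order read off the audited
# file: host, schema, user, password, then a trailing number assumed
# (unconfirmed) to be a port, 0 meaning the default.
set -euo pipefail

parse_sql_line() {
  local line="$1" value host schema user password port
  value="${line#*\"}"     # drop everything up to the opening quote
  value="${value%\"*}"    # drop the closing quote
  read -r host schema user password port <<<"$value"
  printf 'host=%s schema=%s user=%s password=%s port=%s\n' \
    "$host" "$schema" "$user" "$password" "$port"
}
```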

Database

  • Engine: MariaDB 11.8.6 (mariadb --version).
  • PID: 103624, listening on 127.0.0.1:3306 only. No external TCP exposure; the unix socket was not checked (likely /run/mysqld/mysqld.sock).
  • Expected databases from docs/database-bootstrap.md: account, player, common, log, hotbackup.
  • Stack-side DB user: bootstrap (placeholder in git, real password in /etc/metin/metin.env).
  • Could not enumerate actual tables during the audit — both mysql -uroot and sudo -u mt2.jakubkadlec.dev mariadb failed (Access denied), since root uses unix-socket auth for root@localhost and the runtime user has no CLI credentials outside the systemd environment.
  • To inspect the DB read-only: either run as root with sudo mariadb (unix socket auth — needs confirmation it's enabled), or open /etc/metin/metin.env as root, grab the bootstrap password, then mariadb -ubootstrap -p account etc. Do not attempt writes.

Logging

Every m2 process writes two files in its channel dir, via fd 3 / fd 4:

  • syslog.log — verbose info stream (rotated by date in some dirs: channel1/core1/log/syslog_2026-04-13.log).
  • syserr.log — error stream. Look here first on crash.

The db channel's syslog.log is already 36 MB today, and rotation appears to be manual — there is a log/ dir with a daily file, but the current syslog.log sits at the top level. The db also drops core.<pid> ELF cores into the channel dir on SIGSEGV/SIGABRT, because LimitCORE=infinity is set.

systemd journal captures stdout/stderr as well, so journalctl -u metin-db --since '1 hour ago' is the fastest way to see startup banners and systemd-observed restarts. Example from this audit:

Apr 14 13:26:40 vmi3229987 db[1788997]: Real Server
Apr 14 13:26:40 vmi3229987 db[1788997]: Success ACCOUNT
Apr 14 13:26:40 vmi3229987 db[1788997]: Success COMMON
Apr 14 13:26:40 vmi3229987 db[1788997]: Success HOTBACKUP
Apr 14 13:26:40 vmi3229987 db[1788997]: mysql_real_connect: Lost connection
     to server at 'sending authentication information', system error: 104

On every db start it opens around a dozen AsyncSQL pools ("AsyncSQL: connected to 127.0.0.1 (reconnect 1)" repeated ~12 times), suggesting a large per-instance pool size. Worth checking whether that needs tuning.
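To pin down the actual pool count rather than eyeballing it, counting the banner lines in the journal is enough; a trivial wrapper for piped input:

```shell
#!/usr/bin/env bash
# Count AsyncSQL connection banners in piped journal output.
# On the host: journalctl -u metin-db -b | count_async_pools
set -euo pipefail

count_async_pools() {
  grep -c 'AsyncSQL: connected to' || true   # grep exits 1 on zero matches
}
```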

The current syserr.log in channels/db/ is dominated by:

[error] [int CPeerBase::Recv()()] socket_read failed Connection reset by peer
[error] [int CClientManager::Process()()] Recv failed

which is the peer disconnect path. Since no auth/game peers should be connecting right now, this is either a leftover from an earlier start or something else (maybe a healthcheck probe) is touching 9000 and aborting. See open questions.

Ports

Live ss -tlnp on the VPS (m2-relevant lines only):

| Local address:port | Who | Exposure |
|---|---|---|
| 0.0.0.0:9000 | db | INADDR_ANY — listens on all interfaces. Look at this. |
| 127.0.0.1:3306 | mariadbd | localhost only |

Not currently listening (would be if auth/game were up):

  • 11000 / 12000 — auth client + p2p
  • 11011..11013 / 12011..12013 — channel1 cores + p2p
  • 11991 / 12991 — channel99 core1 + p2p

Other listeners on the host (not m2): :22, :2222 (gitea ssh), :25 (postfix loopback), :80/:443 (Caddy), :3000 (Gitea), :2019 (Caddy admin), :33891 (unknown loopback), :5355 / :53 (resolver).

Firewalling note: db binding to 0.0.0.0:9000 is a concern. In the normal m2 architecture, db only talks to auth/game cores on the same host and should bind to 127.0.0.1 only. Current binding is set by the BIND_PORT = 9000 line in share/conf/db.txt, which in this server fork apparently defaults to INADDR_ANY. If the Contabo firewall or iptables/nft rules don't block 9000 from the outside, this is exposed. Open question: verify iptables/nftables on the host, or move db to 127.0.0.1 explicitly in source / config.
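A small triage helper for the exposure question — feed it `ss -tln` output and it prints only wildcard binds (column layout assumed to be current iproute2 ss, with the local address in field 4):

```shell
#!/usr/bin/env bash
# Print sockets bound to all interfaces from `ss -tln` output.
# On the host: ss -tln | flag_wildcard_listeners
# Assumes the current iproute2 ss column layout (local addr:port = field 4).
set -euo pipefail

flag_wildcard_listeners() {
  awk 'NR > 1 && ($4 ~ /^0\.0\.0\.0:/ || $4 ~ /^\*:/ || $4 ~ /^\[::\]:/) { print $4 }'
}
```

On this host it should print exactly 0.0.0.0:9000; pair it with `nft list ruleset` (as root) to see whether anything actually filters that port from the outside.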

Data directory layout

All under /home/mt2.jakubkadlec.dev/metin/runtime/server/share/:

share/
├── bin/          ← compiled binaries (only db + game present today)
├── conf/         ← db.txt, game.txt, CMD, item_proto.txt, mob_proto.txt,
│                   item_names_*.txt, mob_names_*.txt (17 locales each)
├── data/         ← DTA/, dungeon/, easterevent/, mob_spawn/, monster/,
│                   pc/, pc2/ (27 MB total)
├── locale/       ← 86 MB, per-locale strings + binary quest outputs
├── mark/
└── package/

Per-channel scaffolding under channels/ symlinks conf, data, locale back into share/, so each channel reads from a single canonical content tree.

Disk usage footprint

/home/mt2.jakubkadlec.dev/metin/             1.7 G   (total metin workspace)
    runtime/server/share/                    123 M
        runtime/server/share/data/            27 M
        runtime/server/share/locale/          86 M
    runtime/server/channels/                  755 M
        channels/db/core.178508{2,8}        ~194 M   (two 97 MB coredumps)
        channels/db/syslog.log                36 M   (grows fast)

Core dumps dominate the channel dir footprint right now. Cleaning up old core.* files is safe when the db is not actively crashing (and only after Jakub has looked at them).
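When cleanup does get the green light, listing candidates first keeps it deliberate; a sketch that only prints core files older than a cutoff (deletion stays a manual step after review):

```shell
#!/usr/bin/env bash
# List (never delete) core files older than a cutoff, so cleanup remains a
# deliberate, reviewed action. DIR is a parameter for off-host dry-runs.
set -euo pipefail

list_old_cores() {
  local dir="$1" days="${2:-7}"
  find "$dir" -maxdepth 1 -name 'core.*' -type f -mtime "+${days}" -print
}
```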

How to restart channel1_core2 cleanly

Pre-flight checklist:

  1. Confirm share/bin/channel1_core2 actually exists on disk — right now it does not, so the instance cannot start. Skip straight to the "rebuild / redeploy" section in Jakub's docs/deploy-workflow.md before trying.
  2. Confirm metin-db.service and metin-auth.service are active (running) (systemctl is-active metin-db metin-auth). If not, fix upstream first — a clean restart of core2 requires a healthy auth + db.
  3. Check that no player is currently online on that core. With usage.txt at 0/0 this is trivially true today, but in prod do cat channels/channel1/core2/usage.txt first.
  4. Look at recent logs so you have a baseline: journalctl -u metin-game@channel1_core2 -n 50 --no-pager

Clean restart:

# on the VPS as root or with sudo
systemctl restart metin-game@channel1_core2.service
systemctl status  metin-game@channel1_core2.service --no-pager
journalctl -u metin-game@channel1_core2.service -n 100 --no-pager -f

Because the unit is Type=simple with Restart=on-failure, systemctl restart sends SIGTERM, waits up to TimeoutStopSec=60, then brings the process back up. The binary's own hupsig() handler logs the SIGTERM into syserr.log and shuts down gracefully.

Post-restart verification:

ss -tlnp | grep -E ':(11012|12012)\b'       # expect both ports listening
tail -n 30 /home/mt2.jakubkadlec.dev/metin/runtime/server/channels/channel1/core2/syserr.log

If the process refuses to stay up (Restart=on-failure loops it), do not just bump RestartSec; grab the last 200 journal lines and the last 200 syserr lines and open an issue in metin-server/m2dev-server-src against Jakub. Do not edit the unit file ad-hoc on the host.

Open questions

These are things the audit could not determine without making changes or getting more access. They need a human operator to resolve.

  1. Who produces the per-instance binaries (channel1_core1, channel1_core2, channel1_core3, channel99_core1, game_auth)? The deploy flow expects them in share/bin/ and channel dirs but they are missing. Is this still hand-built, or is there a make target that hardlinks share/bin/game into each channel*/core*/<instance> name?
  2. Why is db currently flapping (deactivating (stop-sigterm) in systemctl, plus two fresh core dumps on 2026-04-14 13:24/13:25 and dozens of CPeerBase::Recv() errors)? Nothing should be connecting to port 9000 right now.
  3. What the real metin.env contains — specifically, the actual bootstrap DB password, and whether there is a separate admin-page password override. Audit did not touch /etc/metin/metin.env.
  4. Exact override-variable contract between share/conf/db.txt placeholders and the env file. We need to verify which env var names the db/game source actually reads so we know whether the change-me literal is ever used at runtime.
  5. Is db intended to bind 0.0.0.0:9000? From a defense-in-depth standpoint it should be 127.0.0.1. Needs either a source fix or a host firewall rule. Check current nftables state.
  6. VERSION.txt says db revision: b2b037f-dirty. Which tree was this built from and why "dirty"? Point back at the m2dev-server-src commit and confirm the build artefact is reproducible.
  7. Log rotation: channels/db/syslog.log is already 36 MB today with nothing connected. There is a channels/channel1/core1/log/ dated subdir convention that suggests daily rotation, but db's own syslog is not rotating. Confirm whether logrotate or an in-process rotator is expected to own this.
  8. Where does the hourly heartbeat in usage.txt come from? A row is appended roughly every hour — probably the db backup tick, but confirm it's not some cron job.
  9. mysqld's live databases: could not enumerate table names without credentials. docs/database-bootstrap.md lists the expected set; someone with metin.env access should confirm account, player, common, log, hotbackup are all present and populated.
  10. Stale README: top-level README.md still documents FreeBSD + start.py. Not urgent, but worth a docs: sweep to point readers at docs/debian-runtime.md as the canonical layout.