docs: expand operational runbooks

server
2026-04-14 09:03:08 +02:00
parent 6c744ee323
commit 7f4233402a
5 changed files with 394 additions and 4 deletions

97
docs/rollback.md Normal file

@@ -0,0 +1,97 @@
# Rollback
This document describes the practical rollback workflow for the current Debian VPS.
## Important Principle
There are two rollback surfaces:
- source rollback (`m2dev-server-src`)
- runtime-file rollback (`m2dev-server`)
They are related, but they are not the same repository and do not always need to move together.
## Fast Safety Rule
Before a risky deploy, record both current commits:
```bash
ssh mt2
sudo -iu mt2.jakubkadlec.dev git -C ~/metin/repos/m2dev-server-src rev-parse --short HEAD
sudo -iu mt2.jakubkadlec.dev git -C ~/metin/repos/m2dev-server rev-parse --short HEAD
```
If the deploy fails, roll back to those exact commits.
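Recording the commits in a file makes them harder to lose than terminal scrollback. A minimal sketch, assuming it runs as a user who can read both checkouts; the helper name `record_heads` and the output file in the usage comment are hypothetical, and only the repository paths come from the commands above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print "<repo-name> <short-commit>" for each repository path given.
record_heads() {
    local repo head
    for repo in "$@"; do
        # Same query as the manual commands above.
        head=$(git -C "$repo" rev-parse --short HEAD)
        printf '%s %s\n' "$(basename "$repo")" "$head"
    done
}

# Example (the output location is an assumption):
# record_heads ~/metin/repos/m2dev-server-src ~/metin/repos/m2dev-server > ~/pre-deploy-heads.txt
```

Each output line pairs a repository name with its commit, so the file can be read back directly when choosing the `<good_commit>` for a rollback.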
## Source Rollback
Use this when the regression is in:
- server binaries
- network code
- DB logic
- C++ runtime behavior
Example:
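```bash
ssh mt2
# Check out the last known-good source commit
sudo -iu mt2.jakubkadlec.dev git -C ~/metin/repos/m2dev-server-src checkout <good_commit>
# Rebuild the server binaries from that commit
sudo -iu mt2.jakubkadlec.dev cmake --build ~/metin/build/server-src --parallel "$(nproc)"
# Run the runtime sync script, restart the service, then run the login healthcheck
sudo -iu mt2.jakubkadlec.dev ~/metin/deploy/sync-runtime.sh
systemctl restart metin-server.service
/usr/local/sbin/metin-login-healthcheck
```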
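Before checking out `<good_commit>`, it can help to confirm that the hash actually exists in the checkout, since a mistyped hash would otherwise only surface midway through the rollback. A minimal sketch; the helper name `commit_exists` is hypothetical and not part of the deploy scripts.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only if <commit> resolves to a commit object in <repo>.
commit_exists() {
    local repo=$1 commit=$2
    # `cat-file -e` exits non-zero when the name does not resolve to a valid object;
    # the ^{commit} suffix additionally requires that it peels to a commit.
    git -C "$repo" cat-file -e "${commit}^{commit}" 2>/dev/null
}

# Example:
# commit_exists ~/metin/repos/m2dev-server-src "$good_commit" || echo "unknown commit" >&2
```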
## Runtime Rollback
Use this when the regression is in:
- configs
- quests
- locale files
- runtime-only scripts
Example:
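```bash
ssh mt2
# Check out the last known-good runtime commit
sudo -iu mt2.jakubkadlec.dev git -C ~/metin/repos/m2dev-server checkout <good_commit>
# Run the runtime sync script, then recompile the quests
sudo -iu mt2.jakubkadlec.dev ~/metin/deploy/sync-runtime.sh
sudo -iu mt2.jakubkadlec.dev ~/metin/deploy/compile-quests.sh
# Restart the service, then run the login healthcheck
systemctl restart metin-server.service
/usr/local/sbin/metin-login-healthcheck
```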
## Combined Rollback
Use this when the regression could be caused by an interaction between:
- new binaries
- new runtime files
Roll back both repositories together, then rebuild and redeploy.
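The combined sequence can be sketched by merging the source and runtime examples. The step order below is an assumption (check out both repositories, build, sync, compile quests, restart, healthcheck), and the helper name `combined_rollback_plan` is hypothetical. It only prints the commands for review and does not execute anything.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the combined rollback steps in order, for review before running.
# The ordering is an assumption that merges the source and runtime examples.
combined_rollback_plan() {
    local src_commit=$1 runtime_commit=$2
    local as_user='sudo -iu mt2.jakubkadlec.dev'
    # Tildes are not expanded inside a here-document, so the paths print as written.
    cat <<EOF
$as_user git -C ~/metin/repos/m2dev-server-src checkout $src_commit
$as_user git -C ~/metin/repos/m2dev-server checkout $runtime_commit
$as_user cmake --build ~/metin/build/server-src --parallel "\$(nproc)"
$as_user ~/metin/deploy/sync-runtime.sh
$as_user ~/metin/deploy/compile-quests.sh
systemctl restart metin-server.service
/usr/local/sbin/metin-login-healthcheck
EOF
}

# Example:
# combined_rollback_plan <good_src_commit> <good_runtime_commit>
```

Printing the plan instead of running it keeps each step under the operator's control, which matters when a failed build should stop the sequence before the restart.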
## systemd Rollback
If the regression is in `deploy/systemd/`, roll back the runtime repository to a known-good commit and re-run:
```bash
python3 deploy/systemd/install_systemd.py ... --restart
```
## Verification After Rollback
Minimum verification:
```bash
systemctl status metin-server.service --no-pager
ss -ltnp | rg ':(9000|11000|11011|11012|11013|11991) '
/usr/local/sbin/metin-login-healthcheck
```
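The `ss` filter above shows which ports are listening, but a missing port is easy to overlook in that output. A minimal sketch that inverts the check and prints only the expected ports with no listener; the port list is copied from the filter above, and the helper name `missing_ports` is hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Ports taken from the `ss` filter in the verification commands.
EXPECTED_PORTS=(9000 11000 11011 11012 11013 11991)

# Hypothetical helper: read `ss -ltn` output on stdin and print each expected
# port that has no listening socket.
missing_ports() {
    local listing port
    listing=$(cat)
    for port in "${EXPECTED_PORTS[@]}"; do
        # A listening socket appears as "<address>:<port>" followed by whitespace.
        if ! grep -Eq ":${port}[[:space:]]" <<<"$listing"; then
            echo "$port"
        fi
    done
}

# Example:
# ss -ltn | missing_ports
```

Empty output means every expected port has a listener. This does not replace the login healthcheck.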
## What Not To Do
- do not use `git reset --hard` blindly on a dirty VPS checkout
- do not roll back only one repository if the deploy changed both and the fault domain is unclear
- do not close the current SSH session before the healthcheck passes
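The first rule can be backed by a quick check for a dirty checkout before any `checkout` or `reset`. A minimal sketch; the helper name `checkout_is_clean` is hypothetical, and it treats untracked files as dirty, which may be stricter than needed.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only if the checkout has no uncommitted or untracked changes.
checkout_is_clean() {
    local repo=$1 changes
    # --porcelain prints one line per changed or untracked path and nothing when clean.
    changes=$(git -C "$repo" status --porcelain)
    [ -z "$changes" ]
}

# Example:
# checkout_is_clean ~/metin/repos/m2dev-server || echo "dirty checkout, inspect before rolling back" >&2
```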