docs: design the update manager + manifest generator #3
221
docs/update-manager.md
Normal file
@@ -0,0 +1,221 @@
# Update manager: design

This is the design for how the Metin2 client gets updated after the player's first install. It covers the launcher, the server-side manifest, the publishing flow, and the security model. The implementation plan is at the bottom.

## Goals and constraints

- The **base install is large** (~4.3 GB of packs + binaries). Shipping it through the update channel is a non-goal; base install is a separate bundled download.
- Releases can happen **as often as daily**. A small script change in a Python pack should not force players to re-download the full client.
- The update must be **atomic from the player's point of view**: they end up either on the old version or on the new one, never on a half-patched client.
- **Integrity matters**: a malicious or buggy mirror must not be able to ship tampered files.
- **Offline fallback**: if the update server is unreachable, the launcher lets the player into the game with whatever they have.
- The launcher is the **single entry point** the player runs. It owns update detection, download, integrity checks, self-update, and game launch.
- Publishing is **manual for v1** (`make-release.sh` + rsync), automated via Gitea Actions once the flow is proven.

## High-level architecture

```
┌──────────────────────────────┐         ┌──────────────────────────────┐
│  Player machine              │  HTTPS  │  VPS (Caddy)                 │
│                              ├────────►│                              │
│  Launcher.exe                │         │  updates.jakubkadlec.dev/    │
│   ├─ fetch manifest          │         │   manifest.json              │
│   ├─ verify Ed25519 signature│         │   manifest.json.sig          │
│   ├─ diff with local files   │         │   files/<hash[0:2]>/<hash>   │
│   ├─ download missing files  │         │                              │
│   ├─ verify each sha256      │         └──────────────────────────────┘
│   ├─ atomic move into place  │
│   ├─ self-update if needed   │
│   └─ launch Metin2.exe       │
│                              │
│  client/                     │
│   Metin2.exe                 │
│   Metin2Launcher.exe         │
│   pack/*.pck assets/* ...    │
└──────────────────────────────┘
```

### Server-side layout

Served statically by Caddy from `/var/www/updates.jakubkadlec.dev/`:

```
updates.jakubkadlec.dev/
├── manifest.json              ← current release manifest
├── manifest.json.sig          ← Ed25519 signature over manifest.json
├── manifests/
│   ├── 2026.04.14-1.json      ← archived historical manifests
│   ├── 2026.04.14-1.json.sig
│   └── ...
└── files/
    └── ab/
        └── abc123...def       ← content-addressed blob, named after sha256
```

**Content-addressed storage** means a file is named after its sha256. Two consequences:

- **Automatic deduplication** across releases: if `item.pck` is unchanged, the new manifest points at the same blob. Nothing is uploaded or stored twice.
- **Atomic publishing**: upload new blobs first, then replace `manifest.json` last. A partially-uploaded release never causes an inconsistent client state, because the client never sees the new manifest until it's complete.

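
The blob naming and deduplication described above can be sketched in a few lines of Python. This is an illustration only: `sha256_of`, `blob_relpath`, and `store_blob` are hypothetical helper names, and the `files/<first two hex chars>/<hash>` layout is taken from the tree above.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large packs never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def blob_relpath(sha256_hex: str) -> str:
    """Map a hash to files/<first two hex chars>/<full hash>."""
    return f"files/{sha256_hex[:2]}/{sha256_hex}"


def store_blob(src: Path, store_root: Path) -> Tuple[str, bool]:
    """Copy src into the store; return (hash, True if a new blob was written)."""
    digest = sha256_of(src)
    dest = store_root / blob_relpath(digest)
    if dest.exists():
        return digest, False  # identical content is already stored: dedup
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dest)
    return digest, True
```

Storing two files with identical bytes yields one blob: the second call returns `False` and writes nothing, which is why an unchanged pack costs no upload in the next release.
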
### Manifest

See [update-manifest.md](./update-manifest.md) for the formal schema. Summary:

```json
{
  "version": "2026.04.14-1",
  "created_at": "2026-04-14T12:00:00Z",
  "previous": "2026.04.13-3",
  "launcher": {
    "path": "Metin2Launcher.exe",
    "sha256": "..."
  },
  "files": [
    {
      "path": "Metin2.exe",
      "sha256": "...",
      "size": 27982848,
      "platform": "windows",
      "required": true
    },
    {
      "path": "pack/item.pck",
      "sha256": "...",
      "size": 128000000
    }
  ]
}
```

- `version` is date-based (`YYYY.MM.DD-N` where `N` is the daily counter). Human-readable, sortable, forgiving of multiple releases per day.
- `previous` lets the launcher show a changelog chain and enables smarter diff strategies later.
- `launcher` is called out separately because it needs special handling (self-update).
- `platform` is `windows` by default; a future native Linux build can use `linux`, and the launcher filters by its own platform.
- `required: true` files block game launch if missing; optional files (language packs, optional assets) are opportunistic.

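
A small sketch of how the `version` and `platform` fields might be consumed. The helper names are hypothetical, and treating a missing `platform` as `windows` is one reading of the bullet above; the formal schema in `update-manifest.md` takes precedence. One detail worth noting: plain string comparison misorders `-10` and `-2`, so the sketch compares the numeric parts.

```python
import re
from typing import Any, Dict, List, Tuple

VERSION_RE = re.compile(r"^(\d{4})\.(\d{2})\.(\d{2})-(\d+)$")


def version_key(version: str) -> Tuple[int, ...]:
    """Turn 'YYYY.MM.DD-N' into a tuple that sorts numerically."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a YYYY.MM.DD-N version: {version!r}")
    return tuple(int(part) for part in match.groups())


def files_for_platform(manifest: Dict[str, Any], platform: str) -> List[Dict[str, Any]]:
    """Keep entries for this platform; a missing field is assumed to mean 'windows'."""
    return [
        entry
        for entry in manifest["files"]
        if entry.get("platform", "windows") == platform
    ]
```

With these helpers, `version_key("2026.04.14-10") > version_key("2026.04.14-2")` holds even though the raw strings compare the other way round.
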
### Security model

- A single **Ed25519 keypair** signs each manifest. The private key lives on the release machine only (never in any repo). The public key is compiled into the launcher binary.
- The launcher **refuses to apply** a manifest whose signature doesn't verify against the baked-in public key. No fallback, no "accept this once" dialog.
- **sha256 per file** catches storage or transport corruption. A file whose downloaded bytes don't match the manifest hash is discarded and retried. Because the hashes live inside the signed manifest, the same check also rejects a blob that a mirror has tampered with.
- **Key rotation** flow: ship a new launcher that knows both the old and new public keys, keep both for a transition period of a week, then ship one that only knows the new key. Because the launcher itself is delivered through the same update channel, this is clean.
- **Transport** is HTTPS via Caddy (Let's Encrypt certificate already in place). Ed25519 signing is defense-in-depth against a compromised CDN or MITM, not the primary trust mechanism.

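
The "refuse unless a baked-in key verifies" rule, including the two-key transition period, might look like the sketch below. It is written in Python for illustration although the launcher itself is planned in C#. `manifest_is_trusted` and `ed25519_verify` are hypothetical names, and `ed25519_verify` assumes the third-party `cryptography` package is installed; it is imported lazily so the rest of the sketch runs without it.

```python
from typing import Callable, Iterable

# (public_key, signature, message) -> True if the signature is valid
VerifyFn = Callable[[bytes, bytes, bytes], bool]


def ed25519_verify(public_key: bytes, signature: bytes, message: bytes) -> bool:
    """Real Ed25519 check; assumes the `cryptography` package is available."""
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    try:
        Ed25519PublicKey.from_public_bytes(public_key).verify(signature, message)
        return True
    except InvalidSignature:
        return False


def manifest_is_trusted(
    manifest_bytes: bytes,
    signature: bytes,
    trusted_keys: Iterable[bytes],
    verify: VerifyFn = ed25519_verify,
) -> bool:
    """Accept only if one of the baked-in keys verifies. There is no override path."""
    return any(verify(key, signature, manifest_bytes) for key in trusted_keys)
```

During rotation `trusted_keys` would hold both the old and the new public key; afterwards only the new one. The signature should be checked over the exact bytes that were downloaded, before any JSON parsing, so that re-serialization differences cannot affect the result.
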
### Client behavior

Launcher does, in order:

1. **Fetch** `manifest.json` and `manifest.json.sig` (HTTPS GET, timeout 10 s). On timeout or a non-2xx response: log, go to step 8 (offline fallback).
2. **Verify** signature. On failure: abort, show an error, and do **not** launch the game (see [Offline fallback](#offline-fallback)).
3. **Parse** manifest, filter `files[]` by matching `platform`.
4. For each file:
   - **Hash** the local copy (if present). If sha256 matches, skip.
   - Otherwise **download** the blob from `files/<hash[0:2]>/<hash>` into `staging/<path>` using HTTP Range requests (to resume partial downloads from a prior interrupted run).
   - **Verify** downloaded bytes against manifest hash. Mismatch = delete staging file, mark file as failed.
5. If any **required** file failed after N retries: abort update, log, go to step 8 (offline fallback). Optional files that failed are silently skipped.
6. **Self-update check**: if `launcher.sha256` differs from the hash of our own running binary, rename the running `Metin2Launcher.exe` to `Metin2Launcher.exe.old` and move the new launcher into the original path (rename-before-replace). The new launcher takes effect on the next start. Only if the rename fails do we fall back to a small **trampoline** exe that waits for our PID to exit, replaces `Metin2Launcher.exe`, then exits. See [Self-update details](#self-update-details).
7. **Atomic apply**: for each non-launcher file, `MoveFileEx(staging, final, MOVEFILE_REPLACE_EXISTING)`. Keep a small journal of moved paths so we can roll back on failure.
8. **Launch**: `CreateProcess("Metin2.exe", ...)` with the current working directory at the client root. Exit the launcher once the game process has established itself.

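
Step 4 is the core of the update, so here is a minimal Python sketch of the diff and the resume logic. All function names are hypothetical. The sketch only decides what to fetch and which `Range` header to send; the HTTP call itself is left out.

```python
import hashlib
from pathlib import Path
from typing import Any, Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large packs never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_downloads(files: List[Dict[str, Any]], client_root: Path) -> List[Dict[str, Any]]:
    """Return manifest entries whose local copy is missing or has a different hash."""
    todo = []
    for entry in files:
        local = client_root / entry["path"]
        if local.is_file() and sha256_of(local) == entry["sha256"]:
            continue  # already up to date, skip
        todo.append(entry)
    return todo


def blob_url(base_url: str, sha256_hex: str) -> str:
    """Blob location on the server: files/<hash[0:2]>/<hash>."""
    return f"{base_url.rstrip('/')}/files/{sha256_hex[:2]}/{sha256_hex}"


def resume_headers(staging_file: Path) -> Dict[str, str]:
    """Ask only for the bytes a previous interrupted run did not get."""
    have = staging_file.stat().st_size if staging_file.exists() else 0
    # A server that ignores Range answers 200 with the full body; the caller
    # must then truncate the staging file instead of appending to it.
    return {"Range": f"bytes={have}-"} if have else {}
```

Whatever ends up in staging is hashed again before it is moved into place, so a resumed download that was stitched together wrongly is caught by the verify sub-step.
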
### Self-update details

Windows does not allow overwriting or deleting an executable while it is running. Two common patterns:

- **Rename-before-replace**: on Windows you can rename `Metin2Launcher.exe` while it's running, then write the new file at the original path. The running process keeps executing from the renamed file. Next start picks up the new launcher. This works without a trampoline and is what we use for the launcher's own self-update.
- **Trampoline** (only if rename-before-replace fails): a ~50 KB `launcher-update.exe` that waits for our PID to exit, replaces the main launcher, then exits. Kept as fallback.

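
A minimal sketch of rename-before-replace, written in Python for illustration (the real launcher is planned in C#). `swap_launcher` is a hypothetical name. `os.replace` overwrites an existing destination, so a stale `.old` from an earlier update does not block the swap.

```python
import os
from pathlib import Path


def swap_launcher(current: Path, downloaded: Path) -> Path:
    """Move `current` aside to <name>.old, then put `downloaded` at its path."""
    backup = current.with_name(current.name + ".old")
    os.replace(current, backup)          # rename the (possibly running) exe aside
    try:
        os.replace(downloaded, current)  # new launcher takes the original path
    except OSError:
        os.replace(backup, current)      # roll back so a launcher always exists
        raise
    return backup
```

The returned path is the `.old` copy that the directory layout keeps for rollback. If the second move fails, the original launcher is restored before the error propagates, which is the point where a trampoline fallback would take over.
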
### Offline fallback

- If step 1 times out or returns non-2xx, the launcher logs the failure and goes straight to step 8. The player gets into the game with whatever local version they already have.
- If signature verification (step 2) fails, the launcher does **not** fall back silently. It shows an error and refuses to launch, because "the server is lying to me" is more dangerous than "the server is down". This is the one case where we stop the player.
- If the game server is down but the update server is up, that's the server runtime team's problem; the launcher has still done its job.

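
The launch decision reduces to a very small rule, sketched here in Python. The `Outcome` names and `may_launch` are hypothetical labels for the cases described in the steps and bullets above.

```python
from enum import Enum


class Outcome(Enum):
    UPDATED = "updated"                            # steps 1-7 all succeeded
    SERVER_UNREACHABLE = "server_unreachable"      # step 1 timed out or non-2xx
    BAD_SIGNATURE = "bad_signature"                # step 2 failed
    REQUIRED_FILE_FAILED = "required_file_failed"  # step 5 gave up after retries


def may_launch(outcome: Outcome) -> bool:
    """Only a failed signature check stops the player from starting the game."""
    return outcome is not Outcome.BAD_SIGNATURE
```

Keeping the rule in one place makes it harder for a later code path to add a second, accidental way of refusing or allowing a launch.
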
### Directory layout on the player's machine

```
client/
├── Metin2Launcher.exe          ← self-updating launcher, the player's entry point
├── Metin2.exe                  ← managed by the launcher
├── Metin2Launcher.exe.old      ← previous launcher, kept for rollback (deleted after 1 successful run)
├── Metin2.exe.old              ← same for Metin2.exe
├── pack/
├── assets/
├── config/
├── log/
└── .updates/
    ├── current-manifest.json   ← the manifest we're currently on
    ├── staging/                ← download staging area, cleared after successful apply
    └── launcher.log            ← launcher's own log
```

Files under `.updates/` are created by the launcher. The user shouldn't touch them and we ship a `.gitignore` so they don't end up in any accidental archive.

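
The move from `staging/` into the client root, with rollback, can be sketched as follows. This is an illustration in Python with a hypothetical `apply_staged` function. It assumes every replaced file is first moved aside to `<name>.old`, which generalizes the `.old` copies shown in the tree; the real launcher may keep backups for fewer files.

```python
import os
from pathlib import Path
from typing import List, Optional, Tuple


def apply_staged(staging: Path, client_root: Path, rel_paths: List[str]) -> None:
    """Move staged files into place; undo every completed move if one fails."""
    journal: List[Tuple[Path, Optional[Path]]] = []  # (final path, backup or None)
    try:
        for rel in rel_paths:
            final = client_root / rel
            final.parent.mkdir(parents=True, exist_ok=True)
            backup = None
            if final.exists():
                backup = final.with_name(final.name + ".old")
                os.replace(final, backup)      # keep the previous version
            journal.append((final, backup))
            os.replace(staging / rel, final)   # atomic per file on one volume
    except OSError:
        for final, backup in reversed(journal):
            if backup is not None:
                os.replace(backup, final)      # restore the previous version
            elif final.exists():
                final.unlink()                 # file was new in this release
        raise
```

Each individual move is atomic only when staging and the client root sit on the same volume, which holds here because `staging/` lives under `client/.updates/`. The journal is what turns a series of per-file moves into an all-or-nothing update.
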
## Publishing flow (v1, manual)

1. On a trusted machine (not a random laptop), with the private signing key present:

   ```bash
   ./scripts/make-release.sh --version 2026.04.14-1 --source /path/to/fresh/client
   ```

2. The script walks the client directory, computes sha256 for each file, writes a `manifest.json`, signs it, and produces a release directory `release/2026.04.14-1/` containing the manifest, its signature, and only the new blobs (ones not already present on the server).
3. Human review: diff the new manifest against the previous one, sanity-check size and file count.
4. `rsync` the release directory to the VPS:

   ```bash
   rsync -av release/2026.04.14-1/ mt2.jakubkadlec.dev@mt2.jakubkadlec.dev:/var/www/updates.jakubkadlec.dev/
   ```

   The atomic-publishing guarantee depends on the blobs landing before `manifest.json` and `manifest.json.sig`. If a single `rsync` run does not guarantee that order, sync `files/` first and the manifest and signature last.
5. Verify from a second machine: `curl` the manifest, check signature, check a random blob.
6. Tag the release in git.

Manual because v1 should let us feel the flow before we automate. After ~2 weeks of successful manual releases, wire it into Gitea Actions.

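
Step 2 is mostly a directory walk. The sketch below shows the unsigned part in Python, with hypothetical function names and a manifest trimmed to `version`, `created_at`, and `files[]`; the `launcher`, `previous`, `platform`, and `required` fields and the signing step are left out.

```python
import hashlib
from pathlib import Path
from typing import Any, Dict, List, Set


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large packs never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(source: Path, version: str, created_at: str) -> Dict[str, Any]:
    """Walk `source` and list every file with its relative path, hash, and size."""
    files = []
    for path in sorted(p for p in source.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(source).as_posix(),  # forward slashes on any OS
            "sha256": sha256_of(path),
            "size": path.stat().st_size,
        })
    return {"version": version, "created_at": created_at, "files": files}


def new_blobs(manifest: Dict[str, Any], already_on_server: Set[str]) -> List[str]:
    """Hashes that must be uploaded because the server does not have them yet."""
    wanted = {entry["sha256"] for entry in manifest["files"]}
    return sorted(wanted - already_on_server)
```

Sorting the walk keeps the manifest stable between runs, which makes the human diff in step 3 show only real changes.
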
## Publishing flow (v2, Gitea Actions)

Not implemented in MVP. Sketch:

- `m2dev-client-src` build artifact (Metin2.exe) and `m2dev-client` runtime content are combined by a release workflow.
- The workflow runs `make-release.sh` using a signing key stored as a Gitea secret.
- The workflow rsyncs the result to the VPS via a deploy SSH key.
- The workflow opens a PR that updates `CHANGELOG.md` with the new version.

Trade-off: automation speed vs. the attack surface of a CI-held signing key. When we get there, we'll probably **sign offline** and let CI only publish pre-signed bundles.

## Failure modes and what we do about them

| Failure | Client behavior | Operator behavior |
|---|---|---|
| Update server 5xx | Launch game with current version | Investigate VPS / Caddy |
| Update server returns invalid signature | Refuse to launch, show error | Rotate signing key, investigate source |
| Partial download (network drop) | Resume on next run via Range | None, user retries |
| Individual file hash mismatch after retries | Skip file if optional, abort if required | Investigate blob corruption |
| Launcher self-update fails mid-replace | Roll back from `.old` copy, launch old launcher | Investigate, ship fixed launcher |
| Player filesystem is full | Error out with actionable message ("free X MB, retry") | None |
| Player has antivirus quarantining files | Error message naming the file that disappeared | Document, whitelist in launcher installer |
| Someone ships a manifest with missing blobs | Launcher reports which files it can't fetch | Broken release, re-run publish |

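
The "filesystem is full" row can be checked before any download starts, since the manifest carries a `size` per file. A minimal Python sketch follows; the function names and the exact message wording are hypothetical.

```python
import math
import shutil
from typing import Any, Dict, List, Optional


def space_error(needed_bytes: int, free_bytes: int) -> Optional[str]:
    """Return an actionable message if there is not enough room, else None."""
    if free_bytes >= needed_bytes:
        return None
    missing_mb = math.ceil((needed_bytes - free_bytes) / (1024 * 1024))
    return f"Not enough disk space: free {missing_mb} MB, then retry."


def check_staging_space(staging_dir: str, entries: List[Dict[str, Any]]) -> Optional[str]:
    """Compare the total size of files to download with free space on that volume."""
    needed = sum(entry["size"] for entry in entries)
    return space_error(needed, shutil.disk_usage(staging_dir).free)
```

Passing only the entries that still need downloading keeps the estimate honest for small daily updates.
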
## Implementation plan

Effort is real-days of Claude + review time from the team.

| # | Task | Effort | Output |
|---|---|---|---|
| 1 | This design doc, reviewed | 0.5 d | `docs/update-manager.md` |
| 2 | Manifest schema spec | 0.5 d | `docs/update-manifest.md` |
| 3 | `scripts/make-manifest.py`: walk dir, produce unsigned manifest | 1 d | Python script + docs |
| 4 | Sign/verify script (Ed25519) | 0.5 d | Python + keygen docs |
| 5 | Caddy config for `updates.jakubkadlec.dev` | 0.5 d | Caddyfile fragment + DNS note |
| 6 | Launcher (C# .NET 8 self-contained, single-file): skeleton + HTTP fetch + manifest parse + verify | 2 d | `launcher/` project |
| 7 | Launcher: file diff + download + hash verify + atomic apply | 2 d | |
| 8 | Launcher: self-update with rename-before-replace + `.old` rollback | 1 d | |
| 9 | End-to-end test (publish → client updates → launch) | 1 d | |
| 10 | `scripts/make-release.sh` wiring it all together | 1 d | |
| 11 | Docs: publisher runbook, player troubleshooting, threat model | 1 d | |

**MVP is items 1–10**, roughly **10 working days** of implementation. Review + integration + real-world hardening on top.

## Open questions left for the team

- **Launcher UI**: bare minimum (single window with a progress bar and "Play" button) vs. something nicer (changelog panel, news feed, image banner)? MVP is bare minimum; richer UI is a v2 concern.
- **Localization**: manifest fields are English, but the launcher UI needs Czech (at least). Load strings from the client's existing `locale.pck`, or ship a separate small locale for the launcher? Lean toward the latter because the launcher runs before the game and shouldn't depend on game assets.
- **News feed**: optional. If yes, add a `news_url` field to the manifest and let the launcher fetch a small JSON blob. Nice-to-have.
- **Analytics**: do we want to know how many players are on which version? Simple: launcher sends an HTTP POST with `{version, platform}` after successful update. Requires GDPR thought. Off by default, opt-in.

None of these block the MVP; they can be decided once the skeleton works.