m2dev-client-src/docs/pack-profile-analysis.md
Add GM smoke compare workflow for pack profiling
2026-04-15 17:35:02 +02:00

Pack Profile Analysis

The client can now write a runtime pack profiler report to:

log/pack_profile.txt

Enable it with either:

M2PACK_PROFILE=1 ./scripts/run-wine-headless.sh ./build-mingw64-lld/bin

or:

./scripts/run-wine-headless.sh ./build-mingw64-lld/bin -- --m2pack-profile
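
When a profiled run is launched from a Python driver rather than typed by hand, the two switches above can be wrapped in one helper. This is a minimal sketch, not part of the repository: `profiled_launch` is a hypothetical name, and it only reproduces the two command forms shown above.

```python
import os

RUNNER = "./scripts/run-wine-headless.sh"
BIN_DIR = "./build-mingw64-lld/bin"


def profiled_launch(use_env=True, base_env=None):
    """Return (argv, env) for a profiled run (hypothetical helper).

    use_env=True  -> sets M2PACK_PROFILE=1 in the environment
    use_env=False -> appends "-- --m2pack-profile" to the command line
    """
    env = dict(os.environ if base_env is None else base_env)
    argv = [RUNNER, BIN_DIR]
    if use_env:
        env["M2PACK_PROFILE"] = "1"
    else:
        argv += ["--", "--m2pack-profile"]
    return argv, env
```

The result can be passed to `subprocess.run(argv, env=env, check=True)` from the repository root.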

Typical workflow

Collect two runs with the same scenario:

  1. legacy .pck runtime
  2. m2p runtime

After each run, copy or rename the profiler output so the next run does not overwrite it:

# after the legacy .pck run
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.pck.txt
# after the m2p run
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.m2p.txt
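
The copy step is easy to get wrong when runs are repeated, for example by archiving a stale report or overwriting an earlier one. The following sketch guards against both. `archive_profile` is a hypothetical helper, not a script in the repository. Its default paths follow the commands above.

```python
import shutil
from pathlib import Path


def archive_profile(label,
                    report="build-mingw64-lld/bin/log/pack_profile.txt",
                    dest_dir="logs"):
    """Copy the profiler report to <dest_dir>/pack_profile.<label>.txt.

    Hypothetical helper: fails if the report is missing and refuses to
    overwrite an existing archive with the same label.
    """
    src = Path(report)
    if not src.is_file():
        raise FileNotFoundError(f"no profiler report at {src}")
    dest = Path(dest_dir) / f"pack_profile.{label}.txt"
    if dest.exists():
        raise FileExistsError(f"{dest} already exists; pick another label")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)  # copy2 also preserves the file's timestamps
    return dest
```

Calling `archive_profile("pck")` after the first run and `archive_profile("m2p")` after the second yields the two files used in the comparison step.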

Then compare both runs:

python3 scripts/pack-profile-report.py \
  pck=logs/pack_profile.pck.txt \
  m2p=logs/pack_profile.m2p.txt

For repeated testing, use the capture wrapper script:

./scripts/capture-pack-profile.sh \
  --runtime-root ../m2dev-client \
  --label pck

This stages the runtime into build-mingw64-lld/bin, runs the client with M2PACK_PROFILE=1, then archives:

  • raw report: build-mingw64-lld/bin/log/pack-profile-runs/<label>.pack_profile.txt
  • parsed summary: build-mingw64-lld/bin/log/pack-profile-runs/<label>.summary.txt

To run a full pck vs m2p comparison in one go:

./scripts/compare-pack-profile-runs.sh \
  --left-label pck \
  --left-runtime-root /path/to/runtime-pck \
  --right-label m2p \
  --right-runtime-root /path/to/runtime-m2p

The script captures both runs back-to-back and writes a combined comparison report into the same output directory.

Real game-flow smoke compare

Startup-only runs are useful for bootstrap regressions, but they do not show the real hot path once the client reaches login, loading, and game.

For that case, use the CH99 GM smoke compare wrapper:

python3 scripts/compare-pack-profile-gm-smoke.py \
  --left-label pck-only \
  --left-runtime-root /path/to/runtime-pck \
  --right-label secure-mixed \
  --right-runtime-root /path/to/runtime-m2p \
  --master-key /path/to/master.key \
  --sign-pubkey /path/to/signing.pub \
  --account-login admin

What it does:

  • copies the built client into a temporary workspace outside the repository
  • stages each runtime into that workspace
  • temporarily updates the selected GM account password and map position
  • logs in automatically through the special GM smoke channel (11991 by default)
  • enters the game, performs one deterministic GM warp, and archives pack_profile.txt
  • restores the account password and the character map/position afterward
  • deletes the temporary workspace unless --keep-workspaces is used
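
The "temporarily update, then restore" steps above follow a save/override/restore pattern. The sketch below illustrates that pattern only. It is not taken from compare-pack-profile-gm-smoke.py, the in-memory dict stands in for the real account and character records, and `temporary_override` and the field names are hypothetical.

```python
from contextlib import contextmanager

_MISSING = object()  # marks keys that did not exist before the override


@contextmanager
def temporary_override(state, **overrides):
    """Apply overrides to a dict and restore the old values on exit.

    The restore runs in a finally block, so it also happens when the
    body raises (for example, when the smoke run fails midway).
    """
    saved = {key: state.get(key, _MISSING) for key in overrides}
    state.update(overrides)
    try:
        yield state
    finally:
        for key, old in saved.items():
            if old is _MISSING:
                state.pop(key, None)
            else:
                state[key] = old


# Illustrative values only; the real records live in the server database.
account = {"password": "original", "map_index": 1}
with temporary_override(account, password="smoke-only", map_index=99):
    pass  # the smoke run would happen here
```

Putting the restore in `finally` is the point of the pattern: a crashed client or a failed login must not leave the GM account with the temporary password.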

Archived outputs per run:

  • raw report: <out-dir>/<label>.pack_profile.txt
  • parsed summary: <out-dir>/<label>.summary.txt
  • headless trace: <out-dir>/<label>.headless_gm_teleport_trace.txt
  • startup trace when present: <out-dir>/<label>.startup_trace.txt

This flow is the current best approximation of a real client loading path on the Linux-hosted Wine setup because it records phase markers beyond pure startup.

You can also summarize a single run:

python3 scripts/pack-profile-report.py logs/pack_profile.m2p.txt

What to read first

Packed Load Totals

  • Best top-level comparison for pack I/O cost in the measured run.
  • Focus on delta_ms and delta_pct.
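
As a worked example of reading the two delta columns, the sketch below computes them from two total load times. It assumes delta_ms is the right run minus the left run and delta_pct is that difference relative to the left run. The report script may define them differently, so check its output header before relying on the sign. The function name and the numbers are made up for illustration.

```python
def load_delta(left_ms, right_ms):
    """Return (delta_ms, delta_pct) under the assumption stated above.

    A negative delta means the right-hand run spent less time in pack
    loads than the left-hand run.
    """
    delta_ms = right_ms - left_ms
    # Percentage is undefined when the baseline is zero.
    delta_pct = (delta_ms / left_ms * 100.0) if left_ms else None
    return delta_ms, delta_pct


# Made-up totals: left (pck) = 1200 ms, right (m2p) = 900 ms.
print(load_delta(1200.0, 900.0))  # (-300.0, -25.0)
```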

Phase Markers

  • Shows where time actually moved between phases.
  • Useful for deciding whether the gain happened before login, during loading, or mostly in game.

Load Time By Phase

  • Confirms which phase is paying for asset work.
  • Usually the most important line is loading.

Loader Stages

  • Shows whether the cost is mostly in decryption or in zstd decompression.
  • For m2p, expect small manifest overhead and the main costs in aead_decrypt and zstd_decompress.
  • For legacy .pck, expect xchacha20_decrypt and zstd_decompress.

Top Phase Pack Loads

  • Useful when the total difference is small, but one or two packs dominate the budget.
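
One way to make "one or two packs dominate the budget" concrete is to rank packs by load time and keep the smallest set that covers a chosen share of the total. This is a sketch under assumptions: the per-pack timings are passed in as a plain dict, the 80% threshold is arbitrary, and `dominant_packs` and the pack names are hypothetical.

```python
def dominant_packs(pack_ms, threshold=0.8):
    """Return [(name, ms, share)] for the top packs that together
    account for at least `threshold` of the total load time."""
    total = sum(pack_ms.values())
    if total <= 0:
        return []
    ranked = sorted(pack_ms.items(), key=lambda item: item[1], reverse=True)
    picked, running = [], 0.0
    for name, ms in ranked:
        picked.append((name, ms, ms / total))
        running += ms
        if running / total >= threshold:
            break
    return picked


# Made-up timings in milliseconds.
for name, ms, share in dominant_packs({"pack_a": 60.0, "pack_b": 30.0, "pack_c": 10.0}):
    print(f"{name}: {ms:.0f} ms ({share:.0%})")
```

If the returned list has only one or two entries, those packs are the first candidates for a closer look.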

Decision hints

If Packed Load Totals improve but Phase Markers do not, the bottleneck is probably outside pack loading.

If zstd_decompress dominates both formats, the next lever is compression strategy and pack layout.

If decrypt dominates, the next lever is reducing decrypt work on the hot path or changing how often the same data is touched.
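
The three hints can be written down as a small decision function. This is a sketch, not part of the reporting script. "Dominates" is taken here to mean more than half of the summed loader stage time, which is an assumption. The stage names are the ones listed under Loader Stages, and `next_lever` is a hypothetical name.

```python
def next_lever(stage_ms, totals_improved=False, phases_improved=False, share=0.5):
    """Map one run's loader stage timings to the hints above."""
    if totals_improved and not phases_improved:
        return "bottleneck is probably outside pack loading"
    total = sum(stage_ms.values())
    if total <= 0:
        return "no loader stage data"
    # Covers both aead_decrypt (m2p) and xchacha20_decrypt (legacy .pck).
    decrypt_ms = sum(ms for name, ms in stage_ms.items()
                     if name.endswith("_decrypt"))
    zstd_ms = stage_ms.get("zstd_decompress", 0.0)
    if zstd_ms / total > share:
        return "compression strategy and pack layout"
    if decrypt_ms / total > share:
        return "reduce decrypt work on the hot path"
    return "no single stage dominates"


# Made-up stage timings in milliseconds for an m2p run.
print(next_lever({"manifest": 5.0, "aead_decrypt": 20.0, "zstd_decompress": 75.0}))
```

Since the zstd hint applies when decompression dominates both formats, run the function once per format and compare the two answers.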