m2dev-client-src/docs/pack-profile-analysis.md

# Pack Profile Analysis

The client can now emit a runtime pack profiler report to the following path, relative to the directory it runs from:

```text
log/pack_profile.txt
```

Enable it with either the `M2PACK_PROFILE=1` environment variable:

```bash
M2PACK_PROFILE=1 ./scripts/run-wine-headless.sh ./build-mingw64-lld/bin
```

or the `--m2pack-profile` command-line flag:

```bash
./scripts/run-wine-headless.sh ./build-mingw64-lld/bin -- --m2pack-profile
```

## Typical workflow

Collect two runs with the same scenario:

1. legacy `.pck` runtime
2. `m2p` runtime

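After each run, copy or rename the profiler output before starting the next run, because every run overwrites `pack_profile.txt`. Run only the `cp` line that matches the run you just finished:

```bash
# after the legacy .pck run
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.pck.txt
# after the m2p run
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.m2p.txt
```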
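If you capture many runs by hand, a small helper can make the "do not overwrite" rule harder to forget. The following is a minimal sketch, not part of the repository: the function name `archive_profile` and the `pack_profile.<label>.txt` naming are assumptions chosen to match the file names used above.

```python
import shutil
from pathlib import Path


def archive_profile(report: Path, dest_dir: Path, label: str) -> Path:
    """Copy a profiler report to dest_dir/pack_profile.<label>.txt.

    Hypothetical helper: it mirrors the manual `cp` step and refuses to
    replace an archive that already exists under the same label.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"pack_profile.{label}.txt"
    if target.exists():
        # Refuse to clobber an earlier capture with the same label.
        raise FileExistsError(f"{target} already exists; pick another label")
    shutil.copy2(report, target)
    return target
```

For example, `archive_profile(Path("build-mingw64-lld/bin/log/pack_profile.txt"), Path("logs"), "pck")` would produce `logs/pack_profile.pck.txt`, creating `logs/` if it is missing.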

Then compare both runs:

```bash
python3 scripts/pack-profile-report.py \
  pck=logs/pack_profile.pck.txt \
  m2p=logs/pack_profile.m2p.txt
```

For repeated testing, use the capture wrapper script:

```bash
./scripts/capture-pack-profile.sh \
  --runtime-root ../m2dev-client \
  --label pck
```

This stages the runtime into `build-mingw64-lld/bin`, runs the client with
`M2PACK_PROFILE=1`, then archives:

- raw report: `build-mingw64-lld/bin/log/pack-profile-runs/<label>.pack_profile.txt`
- parsed summary: `build-mingw64-lld/bin/log/pack-profile-runs/<label>.summary.txt`

To run a full `pck` vs `m2p` comparison in one go:

```bash
./scripts/compare-pack-profile-runs.sh \
  --left-label pck \
  --left-runtime-root /path/to/runtime-pck \
  --right-label m2p \
  --right-runtime-root /path/to/runtime-m2p
```

The script captures both runs back-to-back and writes a combined comparison
report into the same output directory.

## Real game-flow smoke compare

Startup-only runs are useful for bootstrap regressions, but they do not show the
real hot path once the client reaches `login`, `loading`, and `game`.

For that case, use the CH99 GM smoke compare wrapper:

```bash
python3 scripts/compare-pack-profile-gm-smoke.py \
  --left-label pck-only \
  --left-runtime-root /path/to/runtime-pck \
  --right-label secure-mixed \
  --right-runtime-root /path/to/runtime-m2p \
  --master-key /path/to/master.key \
  --sign-pubkey /path/to/signing.pub \
  --account-login admin
```

What it does:

- copies the built client into a temporary workspace outside the repository
- stages each runtime into that workspace
- temporarily updates the selected GM account password and map position
- logs in automatically through the special GM smoke channel (`11991` by default)
- enters the game, performs one deterministic GM warp, and archives `pack_profile.txt`
- restores the account password and the character map/position afterward
- deletes the temporary workspace unless `--keep-workspaces` is used

Archived outputs per run:

- raw report: `<out-dir>/<label>.pack_profile.txt`
- parsed summary: `<out-dir>/<label>.summary.txt`
- headless trace: `<out-dir>/<label>.headless_gm_teleport_trace.txt`
- startup trace when present: `<out-dir>/<label>.startup_trace.txt`

This flow is the current best approximation of a real client loading path on the
Linux-hosted Wine setup because it records phase markers beyond pure startup.

To summarize a single run instead of comparing two, pass one report to `pack-profile-report.py`:

```bash
python3 scripts/pack-profile-report.py logs/pack_profile.m2p.txt
```

## What to read first

`Packed Load Totals`

- Best top-level comparison for pack I/O cost in the measured run.
- Focus on `delta_ms` and `delta_pct`.
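As a reading aid, the two delta columns can be reproduced by hand from a pair of totals. This is a sketch under an assumed convention (right run minus left run, percentage relative to the left run); the report script's actual sign convention is not documented here, so check it against a real report. The input numbers are hypothetical.

```python
def load_delta(left_ms: float, right_ms: float) -> tuple[float, float]:
    """Return (delta_ms, delta_pct) of the right run relative to the left run.

    Assumed convention: negative values mean the right run was faster.
    """
    delta_ms = right_ms - left_ms
    # Guard against an empty baseline, where a percentage is meaningless.
    delta_pct = (delta_ms / left_ms * 100.0) if left_ms else float("nan")
    return delta_ms, delta_pct


# Hypothetical totals: 800 ms for the left (pck) run, 600 ms for the right (m2p) run.
delta_ms, delta_pct = load_delta(800.0, 600.0)
print(f"{delta_ms:+.1f} ms ({delta_pct:+.1f}%)")  # prints "-200.0 ms (-25.0%)"
```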

`Phase Markers`

- Shows where startup time actually moved.
- Useful for deciding whether the gain happened before login, during loading, or mostly in game.

`Load Time By Phase`

- Confirms which phase is paying for asset work.
- Usually the most important line is `loading`.

`Loader Stages`

- Shows whether the cost is mostly in decryption or in zstd decompression.
- For `m2p`, expect small manifest overhead and the main costs in `aead_decrypt` and `zstd_decompress`.
- For legacy `.pck`, expect `xchacha20_decrypt` and `zstd_decompress`.

`Top Phase Pack Loads`

- Useful when the total difference is small, but one or two packs dominate the budget.
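One way to check whether "one or two packs dominate the budget" is to ask how few packs account for most of the load time in a phase. The sketch below assumes you have already extracted per-pack load times into a dictionary; the function name, the 80% threshold, and the pack names are all hypothetical.

```python
def dominant_packs(load_ms: dict[str, float], share: float = 0.8) -> list[tuple[str, float]]:
    """Return the heaviest packs, in descending order, that together reach
    at least `share` of the total load time."""
    total = sum(load_ms.values())
    picked: list[tuple[str, float]] = []
    running = 0.0
    for name, ms in sorted(load_ms.items(), key=lambda kv: kv[1], reverse=True):
        if total <= 0 or running >= share * total:
            break  # enough of the budget is already explained
        picked.append((name, ms))
        running += ms
    return picked


# Hypothetical per-pack load times (ms) for one phase.
sample = {"pack_a": 600.0, "pack_b": 300.0, "pack_c": 60.0, "pack_d": 40.0}
print(dominant_packs(sample))  # prints "[('pack_a', 600.0), ('pack_b', 300.0)]"
```

If the returned list is short relative to the number of packs, those packs are the ones worth inspecting first.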

## Decision hints

If `Packed Load Totals` improve, but `Phase Markers` do not, the bottleneck is probably outside pack loading.

If `zstd_decompress` dominates both formats, the next lever is compression strategy and pack layout.

If decrypt dominates, the next lever is reducing decrypt work on the hot path or changing how often the same data is touched.
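The hints above can be condensed into a small decision helper. This is only an illustrative sketch: the function name and inputs are hypothetical, and "dominates" is simplified to "larger than the other stage group" for a single run. The stage names are the ones listed under `Loader Stages`. To apply the "both formats" condition, call it once per run and compare the answers.

```python
def next_lever(totals_improved: bool, phases_improved: bool,
               stage_ms: dict[str, float]) -> str:
    """Map one run's observations to the hint that applies.

    stage_ms holds per-stage times, e.g. keys such as "aead_decrypt",
    "xchacha20_decrypt" and "zstd_decompress".
    """
    if totals_improved and not phases_improved:
        return "bottleneck is probably outside pack loading"
    zstd = stage_ms.get("zstd_decompress", 0.0)
    # Sum every decrypt stage so the helper works for both pack formats.
    decrypt = sum(ms for name, ms in stage_ms.items() if name.endswith("_decrypt"))
    if zstd > decrypt:
        return "compression strategy and pack layout"
    if decrypt > zstd:
        return "reduce decrypt work on the hot path"
    return "no single stage dominates"


# Hypothetical stage times (ms) for an m2p run.
print(next_lever(True, True, {"aead_decrypt": 120.0, "zstd_decompress": 340.0}))
# prints "compression strategy and pack layout"
```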