
# Pack Profile Analysis

The client can write a runtime pack profiler report to:

```text
log/pack_profile.txt
```

Enable it with the `M2PACK_PROFILE` environment variable:

```bash
M2PACK_PROFILE=1 ./scripts/run-wine-headless.sh ./build-mingw64-lld/bin
```

or by passing the `--m2pack-profile` flag:

```bash
./scripts/run-wine-headless.sh ./build-mingw64-lld/bin -- --m2pack-profile
```

## Typical workflow

Collect two runs of the same scenario:

1. legacy `.pck` runtime
2. `m2p` runtime

After each run, copy or rename the profiler output so the next run does not overwrite it:

```bash
# after the legacy .pck run
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.pck.txt
# after the m2p run
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.m2p.txt
```
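
If the copy step is scripted, a small helper can refuse to clobber an earlier capture. This is a minimal sketch, not part of the repository: `archive_profile` is a hypothetical name, and the `pack_profile.<label>.txt` naming simply mirrors the commands above.

```python
import shutil
from pathlib import Path


def archive_profile(report: Path, dest_dir: Path, label: str) -> Path:
    """Copy the profiler report to dest_dir/pack_profile.<label>.txt.

    Hypothetical helper; the naming scheme is an assumption that
    follows the manual cp commands.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"pack_profile.{label}.txt"
    if target.exists():
        # Refuse to overwrite an earlier capture with the same label.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(report, target)
    return target
```

For example, `archive_profile(Path("build-mingw64-lld/bin/log/pack_profile.txt"), Path("logs"), "pck")` after the legacy run.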

Then compare both runs:

```bash
python3 scripts/pack-profile-report.py \
  pck=logs/pack_profile.pck.txt \
  m2p=logs/pack_profile.m2p.txt
```

For repeated testing, use the wrapper scripts:

```bash
./scripts/capture-pack-profile.sh \
  --runtime-root ../m2dev-client \
  --label pck
```

This stages the runtime into `build-mingw64-lld/bin`, runs the client with
`M2PACK_PROFILE=1`, then archives:

- raw report: `build-mingw64-lld/bin/log/pack-profile-runs/<label>.pack_profile.txt`
- parsed summary: `build-mingw64-lld/bin/log/pack-profile-runs/<label>.summary.txt`

To run a full `pck` vs `m2p` comparison in one go:

```bash
./scripts/compare-pack-profile-runs.sh \
  --left-label pck \
  --left-runtime-root /path/to/runtime-pck \
  --right-label m2p \
  --right-runtime-root /path/to/runtime-m2p
```

The script captures both runs back-to-back and writes a combined compare report
into the same output directory.

You can also summarize a single run:

```bash
python3 scripts/pack-profile-report.py logs/pack_profile.m2p.txt
```

## What to read first

`Packed Load Totals`

- Best top-level comparison for pack I/O cost in the measured run.
- Focus on `delta_ms` and `delta_pct`.
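
To sanity-check the two delta columns by hand, the arithmetic can be sketched as below. The sign convention (second run minus first, percentage relative to the first) is an assumption; confirm it against the output of `pack-profile-report.py`. `compare_totals` is a hypothetical name.

```python
def compare_totals(left_ms: float, right_ms: float) -> tuple[float, float]:
    """Return (delta_ms, delta_pct) of the right run relative to the left.

    Assumed convention: negative values mean the right run was faster.
    """
    delta_ms = right_ms - left_ms
    # A zero baseline has no meaningful percentage.
    delta_pct = (delta_ms / left_ms * 100.0) if left_ms else float("nan")
    return delta_ms, delta_pct
```

With illustrative (not measured) totals of 200 ms for `pck` and 150 ms for `m2p`, this gives `(-50.0, -25.0)`.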

`Phase Markers`

- Shows where startup time actually moved.
- Useful for deciding whether the gain happened before login, during loading, or mostly in game.

`Load Time By Phase`

- Confirms which phase is paying for asset work.
- Usually the most important line is `loading`.

`Loader Stages`

- Shows whether the cost is mostly in decryption or in zstd decompression.
- For `m2p`, expect small manifest overhead and the main costs in `aead_decrypt` and `zstd_decompress`.
- For legacy `.pck`, expect `xchacha20_decrypt` and `zstd_decompress`.
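
When the stage timings are close, it can help to turn them into shares of the total. A minimal sketch, assuming you have already pulled the per-stage milliseconds out of the report by hand; `stage_shares` is a hypothetical name, and the `manifest` key in the usage note is an illustrative label, not a confirmed stage name.

```python
def stage_shares(stage_ms: dict[str, float]) -> dict[str, float]:
    """Return each stage's fraction of the summed stage time."""
    total = sum(stage_ms.values())
    if total <= 0:
        # Nothing measured: report zero share for every stage.
        return {name: 0.0 for name in stage_ms}
    return {name: ms / total for name, ms in stage_ms.items()}
```

For example, `stage_shares({"aead_decrypt": 30.0, "zstd_decompress": 60.0, "manifest": 10.0})` puts `zstd_decompress` at a 0.6 share (made-up numbers).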

`Top Phase Pack Loads`

- Useful when the total difference is small, but one or two packs dominate the budget.
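
One way to make "dominate the budget" concrete is to list the packs, largest first, until they cover a chosen fraction of the total load time. This is a sketch under assumptions: the 80% threshold, the function name `dominant_packs`, and the pack names in the usage note are all hypothetical.

```python
def dominant_packs(pack_ms: dict[str, float], coverage: float = 0.8) -> list[str]:
    """Return the largest packs whose combined time reaches `coverage` of the total."""
    total = sum(pack_ms.values())
    picked: list[str] = []
    running = 0.0
    # Walk packs from most to least expensive.
    for name, ms in sorted(pack_ms.items(), key=lambda kv: kv[1], reverse=True):
        if total <= 0 or running >= coverage * total:
            break
        picked.append(name)
        running += ms
    return picked
```

For example, `dominant_packs({"a": 70, "b": 20, "c": 5, "d": 5})` returns `["a", "b"]`.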

## Decision hints

If `Packed Load Totals` improve, but `Phase Markers` do not, the bottleneck is probably outside pack loading.

If `zstd_decompress` dominates both formats, the next lever is compression strategy and pack layout.

If decrypt dominates, the next lever is reducing decrypt work on the hot path or changing how often the same data is touched.
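
The hints above can be sketched as a small triage function. It is illustrative only: `next_lever` is a hypothetical name, "dominates" is read as "largest stage by time", and the two booleans are judgments you make from the report rather than values the tooling emits.

```python
def next_lever(totals_improved: bool, markers_improved: bool,
               stage_ms: dict[str, float]) -> str:
    """Map the decision hints onto a single suggestion."""
    if totals_improved and not markers_improved:
        return "bottleneck is probably outside pack loading"
    if not stage_ms:
        return "no stage data"
    # Assumption: the dominant stage is simply the one with the most time.
    top = max(stage_ms, key=stage_ms.get)
    if top == "zstd_decompress":
        return "compression strategy and pack layout"
    if top.endswith("_decrypt"):
        return "reduce decrypt work on the hot path"
    return "no hint"
```

For example, `next_lever(True, True, {"aead_decrypt": 80.0, "zstd_decompress": 20.0})` points at decrypt work.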