8 Commits

Author  SHA1        Message                                                Date
server  49e8eac809  Revert stream M2Pack archive support                   2026-04-15 19:06:59 +02:00
server  0b852faf0e  Support experimental stream M2Pack archives            2026-04-15 18:40:40 +02:00
server  b353339bd8  Add GM smoke compare workflow for pack profiling       2026-04-15 17:35:02 +02:00
server  db7ae1f841  Persist pack profile snapshots during timed captures   2026-04-15 17:06:11 +02:00
server  2d9beb4793  Add pack profile capture workflow                      2026-04-15 16:39:16 +02:00
server  6ff59498d2  Add pack profile report parser                         2026-04-15 16:34:26 +02:00
server  ba6af8115b  Add pack runtime profiling hooks                       2026-04-15 16:22:10 +02:00
server  ef7cdf2809  Reduce m2pack client hot-path overhead                 2026-04-15 15:43:26 +02:00
18 changed files with 2863 additions and 17 deletions


@@ -0,0 +1,147 @@
# Pack Profile Analysis
The client can now emit a runtime pack profiler report into:
```text
log/pack_profile.txt
```
Enable it with either:
```bash
M2PACK_PROFILE=1 ./scripts/run-wine-headless.sh ./build-mingw64-lld/bin
```
or:
```bash
./scripts/run-wine-headless.sh ./build-mingw64-lld/bin -- --m2pack-profile
```
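Both switches presumably end up toggling the same internal flag. As a generic illustration only (the names come from the commands above; the precedence on conflict is an assumption of this sketch), the check amounts to:

```python
def profiling_enabled(argv: list[str], environ: dict[str, str]) -> bool:
    # The profiler turns on when either M2PACK_PROFILE=1 is set in the
    # environment or --m2pack-profile appears on the command line.
    # Which source wins on conflict is an assumption in this sketch.
    if environ.get("M2PACK_PROFILE") == "1":
        return True
    return "--m2pack-profile" in argv

print(profiling_enabled([], {"M2PACK_PROFILE": "1"}))  # → True
print(profiling_enabled(["--m2pack-profile"], {}))     # → True
print(profiling_enabled([], {}))                       # → False
```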
## Typical workflow
Collect two runs with the same scenario:
1. legacy `.pck` runtime
2. `m2p` runtime
After each run, copy or rename the profiler output so it is not overwritten:
```bash
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.pck.txt
cp build-mingw64-lld/bin/log/pack_profile.txt logs/pack_profile.m2p.txt
```
Then compare both runs:
```bash
python3 scripts/pack-profile-report.py \
pck=logs/pack_profile.pck.txt \
m2p=logs/pack_profile.m2p.txt
```
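The `label=path` arguments follow a simple rule: an explicit label before `=`, otherwise the file stem becomes the label. This mirrors `parse_report_arg` in `scripts/pack-profile-report.py` further down in this diff:

```python
from pathlib import Path

def parse_report_arg(value: str) -> tuple[str, Path]:
    # "label=/path" -> (label, path); a bare path falls back to its stem,
    # matching parse_report_arg in scripts/pack-profile-report.py.
    if "=" in value:
        label, path_value = value.split("=", 1)
        if label and path_value:
            return label, Path(path_value)
    path = Path(value)
    return path.stem, path

print(parse_report_arg("pck=logs/pack_profile.pck.txt")[0])  # → pck
print(parse_report_arg("logs/pack_profile.m2p.txt")[0])      # → pack_profile.m2p
```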
For repeated testing, use the wrapper scripts:
```bash
./scripts/capture-pack-profile.sh \
--runtime-root ../m2dev-client \
--label pck
```
This stages the runtime into `build-mingw64-lld/bin`, runs the client with
`M2PACK_PROFILE=1`, then archives:
- raw report: `build-mingw64-lld/bin/log/pack-profile-runs/<label>.pack_profile.txt`
- parsed summary: `build-mingw64-lld/bin/log/pack-profile-runs/<label>.summary.txt`
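The `<label>` component is sanitized before it becomes a filename. The rule, as implemented by the `sanitize_label` helpers in these scripts (shown here in the Python form used by the GM smoke wrapper):

```python
def sanitize_label(value: str) -> str:
    # Lowercase, keep [a-z0-9._-], collapse any other run of characters
    # into a single '-', trim leading/trailing dashes, and fall back to
    # "capture" when nothing survives.
    safe: list[str] = []
    prev_dash = False
    for ch in value.lower():
        if ch.isalnum() or ch in "._-":
            if ch == "-" and prev_dash:
                continue
            safe.append(ch)
            prev_dash = ch == "-"
        else:
            if not prev_dash:
                safe.append("-")
            prev_dash = True
    return "".join(safe).strip("-") or "capture"

print(sanitize_label("PCK only!!"))  # → pck-only
```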
To run a full `pck` vs `m2p` comparison in one go:
```bash
./scripts/compare-pack-profile-runs.sh \
--left-label pck \
--left-runtime-root /path/to/runtime-pck \
--right-label m2p \
--right-runtime-root /path/to/runtime-m2p
```
The script captures both runs back-to-back and writes a combined compare report
into the same output directory.
## Real game-flow smoke compare
Startup-only runs are useful for bootstrap regressions, but they do not show the
real hot path once the client reaches `login`, `loading`, and `game`.
For that case, use the CH99 GM smoke compare wrapper:
```bash
python3 scripts/compare-pack-profile-gm-smoke.py \
--left-label pck-only \
--left-runtime-root /path/to/runtime-pck \
--right-label secure-mixed \
--right-runtime-root /path/to/runtime-m2p \
--master-key /path/to/master.key \
--sign-pubkey /path/to/signing.pub \
--account-login admin
```
What it does:
- copies the built client into a temporary workspace outside the repository
- stages each runtime into that workspace
- temporarily updates the selected GM account password and map position
- logs in automatically through the special GM smoke channel (`11991` by default)
- enters game, performs one deterministic GM warp, archives `pack_profile.txt`
- restores the account password and the character map/position afterward
- deletes the temporary workspace unless `--keep-workspaces` is used
Archived outputs per run:
- raw report: `<out-dir>/<label>.pack_profile.txt`
- parsed summary: `<out-dir>/<label>.summary.txt`
- headless trace: `<out-dir>/<label>.headless_gm_teleport_trace.txt`
- startup trace when present: `<out-dir>/<label>.startup_trace.txt`
This flow is the current best approximation of a real client loading path on the
Linux-hosted Wine setup because it records phase markers beyond pure startup.
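Those phase markers are plain tab-separated `phase<TAB>value ms` lines inside a `[phase_markers]` section of the report. A minimal reader in the spirit of the parser in `scripts/pack-profile-report.py` (the sample layout below is illustrative):

```python
def read_phase_markers(text: str) -> list[tuple[str, float]]:
    # Collect (phase, milliseconds) pairs from the [phase_markers]
    # section; values may carry an " ms" suffix, which is stripped.
    markers: list[tuple[str, float]] = []
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            in_section = line == "[phase_markers]"
            continue
        if in_section and "\t" in line:
            phase, value = line.split("\t", 1)
            value = value.strip()
            if value.endswith(" ms"):
                value = value[:-3]
            markers.append((phase.strip(), float(value)))
    return markers

sample = "[phase_markers]\nlogin\t1234.5 ms\nloading\t8901.0 ms\n"
print(read_phase_markers(sample))
```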
You can also summarize a single run:
```bash
python3 scripts/pack-profile-report.py logs/pack_profile.m2p.txt
```
## What to read first
`Packed Load Totals`
- Best top-level comparison for pack I/O cost in the measured run.
- Focus on `delta_ms` and `delta_pct`.
`Phase Markers`
- Shows where startup time actually moved.
- Useful for deciding whether the gain happened before login, during loading, or mostly in game.
`Load Time By Phase`
- Confirms which phase is paying for asset work.
- Usually the most important line is `loading`.
`Loader Stages`
- Shows whether the cost is mostly in decrypt or zstd.
- For `m2p`, expect small manifest overhead and the main costs in `aead_decrypt` and `zstd_decompress`.
- For legacy `.pck`, expect `xchacha20_decrypt` and `zstd_decompress`.
`Top Phase Pack Loads`
- Useful when the total difference is small, but one or two packs dominate the budget.
## Decision hints
If `Packed Load Totals` improve, but `Phase Markers` do not, the bottleneck is probably outside pack loading.
If `zstd_decompress` dominates both formats, the next lever is compression strategy and pack layout.
If decrypt dominates, the next lever is reducing decrypt work on the hot path or changing how often the same data is touched.
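When sanity-checking a compare report by hand, the two delta columns reduce to simple arithmetic. Assuming `delta_pct` is the change relative to the left (baseline) run, which is how such columns are commonly defined:

```python
def deltas(left_ms: float, right_ms: float) -> tuple[float, float]:
    # delta_ms: absolute difference; delta_pct: change relative to the
    # left run. A zero baseline is reported as 0% to avoid dividing by zero.
    delta_ms = right_ms - left_ms
    delta_pct = (delta_ms / left_ms * 100.0) if left_ms else 0.0
    return delta_ms, delta_pct

print(deltas(2400.0, 1800.0))  # → (-600.0, -25.0)
```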

scripts/capture-pack-profile.sh Executable file

@@ -0,0 +1,193 @@
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<'EOF'
Usage:
capture-pack-profile.sh [options] [build-bin-dir] [-- <client args...>]
Examples:
capture-pack-profile.sh --label pck
capture-pack-profile.sh --runtime-root ../m2dev-client --label baseline
capture-pack-profile.sh ./build-mingw64-lld/bin -- --m2pack-strict-hash 1
Options:
--runtime-root DIR Stage runtime content from DIR before launching
--label NAME Output label for the captured report
--out-dir DIR Directory for archived raw + summary reports
--timeout SEC Overrides M2_TIMEOUT for the headless run
--copy-runtime Materialize runtime instead of symlinking it
-h, --help Show this help
Behavior:
- Enables M2PACK_PROFILE=1 for the client run
- Archives build-bin-dir/log/pack_profile.txt under --out-dir
- Writes a parsed summary next to the raw report
- Treats timeout exit codes (124/137) as acceptable if the report exists
Environment overrides:
M2_STAGE_RUNTIME_SCRIPT path to stage-linux-runtime.sh
M2_HEADLESS_RUN_SCRIPT path to run-wine-headless.sh
M2_PACK_PROFILE_PARSER path to pack-profile-report.py
EOF
}
sanitize_label() {
printf '%s' "$1" \
| tr '[:upper:]' '[:lower:]' \
| sed -E 's/[^a-z0-9._-]+/-/g; s/^-+//; s/-+$//; s/-{2,}/-/g'
}
script_dir="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
repo_root="$(realpath "$script_dir/..")"
default_build_bin_dir="$repo_root/build-mingw64-lld/bin"
default_runtime_root="$repo_root/../m2dev-client"
stage_script="${M2_STAGE_RUNTIME_SCRIPT:-$script_dir/stage-linux-runtime.sh}"
run_script="${M2_HEADLESS_RUN_SCRIPT:-$script_dir/run-wine-headless.sh}"
parser_script="${M2_PACK_PROFILE_PARSER:-$script_dir/pack-profile-report.py}"
runtime_root=""
label=""
out_dir=""
timeout_seconds="${M2_TIMEOUT:-20}"
copy_runtime=0
while [[ $# -gt 0 ]]; do
case "$1" in
--runtime-root)
runtime_root="$2"
shift 2
;;
--label)
label="$2"
shift 2
;;
--out-dir)
out_dir="$2"
shift 2
;;
--timeout)
timeout_seconds="$2"
shift 2
;;
--copy-runtime)
copy_runtime=1
shift
;;
-h|--help)
usage
exit 0
;;
--)
shift
break
;;
-*)
echo "unknown option: $1" >&2
usage >&2
exit 1
;;
*)
break
;;
esac
done
build_bin_dir="${1:-$default_build_bin_dir}"
if [[ $# -gt 0 ]]; then
shift
fi
build_bin_dir="$(realpath -m "$build_bin_dir")"
if [[ -z "$label" ]]; then
label="$(date +%Y%m%d-%H%M%S)"
fi
safe_label="$(sanitize_label "$label")"
if [[ -z "$safe_label" ]]; then
safe_label="capture"
fi
if [[ -z "$out_dir" ]]; then
out_dir="$build_bin_dir/log/pack-profile-runs"
fi
out_dir="$(realpath -m "$out_dir")"
if [[ -z "$runtime_root" ]]; then
if [[ ! -e "$build_bin_dir/config/locale.cfg" || ! -e "$build_bin_dir/pack" ]]; then
if [[ -d "$default_runtime_root" ]]; then
runtime_root="$default_runtime_root"
fi
fi
fi
if [[ -n "$runtime_root" ]]; then
runtime_root="$(realpath "$runtime_root")"
fi
if [[ ! -x "$run_script" ]]; then
echo "headless runner not executable: $run_script" >&2
exit 1
fi
if [[ ! -f "$parser_script" ]]; then
echo "pack profile parser not found: $parser_script" >&2
exit 1
fi
if ! command -v python3 >/dev/null 2>&1; then
echo "python3 not found in PATH" >&2
exit 1
fi
mkdir -p "$build_bin_dir/log" "$out_dir"
if [[ -n "$runtime_root" ]]; then
if [[ ! -x "$stage_script" ]]; then
echo "runtime staging script not executable: $stage_script" >&2
exit 1
fi
stage_args=()
if [[ "$copy_runtime" -eq 1 ]]; then
stage_args+=(--copy)
fi
stage_args+=("$runtime_root" "$build_bin_dir")
"$stage_script" "${stage_args[@]}"
fi
report_path="$build_bin_dir/log/pack_profile.txt"
raw_out="$out_dir/$safe_label.pack_profile.txt"
summary_out="$out_dir/$safe_label.summary.txt"
rm -f "$report_path"
set +e
M2PACK_PROFILE=1 M2_TIMEOUT="$timeout_seconds" "$run_script" "$build_bin_dir" -- "$@"
run_exit=$?
set -e
if [[ ! -f "$report_path" ]]; then
echo "pack profile report was not generated: $report_path" >&2
# A run can exit 0 without writing the report; never propagate success here.
exit "$(( run_exit == 0 ? 1 : run_exit ))"
fi
cp "$report_path" "$raw_out"
python3 "$parser_script" "$raw_out" > "$summary_out"
case "$run_exit" in
0|124|137)
;;
*)
echo "client run exited with $run_exit; archived report anyway:" >&2
echo " raw: $raw_out" >&2
echo " summary: $summary_out" >&2
exit "$run_exit"
;;
esac
echo "captured pack profile:"
echo " raw: $raw_out"
echo " summary: $summary_out"


@@ -0,0 +1,728 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
import os
import secrets
import shutil
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
DEFAULT_SERVER_HOST = "173.249.9.66"
DEFAULT_AUTH_PORT = 11000
DEFAULT_CHANNEL_PORT = 11991
DEFAULT_MAP_NAME = "gm_guild_build"
DEFAULT_MAP_INDEX = 200
DEFAULT_GLOBAL_X = 96000
DEFAULT_GLOBAL_Y = 12800
ACCEPTABLE_RUN_CODES = {0, 124, 137}
@dataclass
class PlayerState:
player_id: int
name: str
map_index: int
x: int
y: int
@dataclass
class AccountState:
account_id: int
password_hash: str
player: PlayerState
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description=(
"Run a real login -> game -> GM warp pack profile comparison under "
"Wine using two runtime roots."
),
)
parser.add_argument(
"--left-runtime-root",
type=Path,
required=True,
help="Runtime root for the first capture.",
)
parser.add_argument(
"--right-runtime-root",
type=Path,
required=True,
help="Runtime root for the second capture.",
)
parser.add_argument(
"--left-label",
default="left",
help="Label for the first capture.",
)
parser.add_argument(
"--right-label",
default="right",
help="Label for the second capture.",
)
parser.add_argument(
"--master-key",
type=Path,
required=True,
help="Path to the runtime master key hex file.",
)
parser.add_argument(
"--sign-pubkey",
type=Path,
required=True,
help="Path to the signing public key hex file.",
)
parser.add_argument(
"--key-id",
default="1",
help="Runtime key_id for the secure pack loader.",
)
parser.add_argument(
"--account-login",
default="admin",
help="GM account used for the smoke flow.",
)
parser.add_argument(
"--slot",
type=int,
default=0,
help="Character slot used for auto-select.",
)
parser.add_argument(
"--timeout",
type=int,
default=90,
help="Headless client timeout in seconds per capture.",
)
parser.add_argument(
"--server-host",
default=DEFAULT_SERVER_HOST,
help="Server host used in loginInfo.py.",
)
parser.add_argument(
"--auth-port",
type=int,
default=DEFAULT_AUTH_PORT,
help="Auth port used in loginInfo.py.",
)
parser.add_argument(
"--channel-port",
type=int,
default=DEFAULT_CHANNEL_PORT,
help="Channel port used in loginInfo.py.",
)
parser.add_argument(
"--map-name",
default=DEFAULT_MAP_NAME,
help="Headless GM target map name.",
)
parser.add_argument(
"--map-index",
type=int,
default=DEFAULT_MAP_INDEX,
help="Server-side map index used to place the GM character before login.",
)
parser.add_argument(
"--global-x",
type=int,
default=DEFAULT_GLOBAL_X,
help="Global X coordinate used for the initial load and GM warp target.",
)
parser.add_argument(
"--global-y",
type=int,
default=DEFAULT_GLOBAL_Y,
help="Global Y coordinate used for the initial load and GM warp target.",
)
parser.add_argument(
"--command-delay",
type=float,
default=3.0,
help="Delay before the headless GM sends the first /warp command.",
)
parser.add_argument(
"--settle-delay",
type=float,
default=1.0,
help="Delay between warp arrival and the next GM command.",
)
parser.add_argument(
"--warp-timeout",
type=float,
default=15.0,
help="Timeout per GM warp step in seconds.",
)
parser.add_argument(
"--out-dir",
type=Path,
default=None,
help="Directory for archived reports and compare output.",
)
parser.add_argument(
"--work-root",
type=Path,
default=None,
help="Directory where temporary workspaces are created.",
)
parser.add_argument(
"--keep-workspaces",
action="store_true",
help="Keep temporary workspaces for debugging instead of deleting them.",
)
parser.add_argument(
"source_build_bin_dir",
nargs="?",
default=None,
help="Source build/bin directory to copy out of the repository.",
)
return parser.parse_args()
def sanitize_label(value: str) -> str:
safe = []
prev_dash = False
for ch in value.lower():
keep = ch.isalnum() or ch in "._-"
if keep:
if ch == "-" and prev_dash:
continue
safe.append(ch)
prev_dash = ch == "-"
else:
if not prev_dash:
safe.append("-")
prev_dash = True
result = "".join(safe).strip("-")
return result or "capture"
def require_tool(name: str) -> None:
if shutil.which(name) is None:
raise SystemExit(f"{name} not found in PATH")
def run_checked(args: list[str], *, cwd: Path | None = None, env: dict[str, str] | None = None) -> subprocess.CompletedProcess[str]:
completed = subprocess.run(
args,
cwd=str(cwd) if cwd else None,
env=env,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
if completed.returncode != 0:
detail = completed.stderr.strip() or completed.stdout.strip()
raise RuntimeError(f"command failed ({completed.returncode}): {' '.join(args)}\n{detail}")
return completed
def run_mysql(database: str, sql: str) -> str:
completed = subprocess.run(
["mysql", "-B", "-N", database, "-e", sql],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
if completed.returncode != 0:
detail = completed.stderr.strip() or completed.stdout.strip()
raise RuntimeError(f"mysql failed: {detail}")
return completed.stdout.strip()
def sql_quote(value: str) -> str:
return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
def load_account_state(account_login: str, slot: int) -> AccountState:
if slot < 0 or slot > 3:
raise SystemExit(f"unsupported slot index: {slot}")
account_id_text = run_mysql(
"account",
f"SELECT id FROM account WHERE login={sql_quote(account_login)} LIMIT 1;",
)
if not account_id_text:
raise SystemExit(f"account not found: {account_login}")
account_id = int(account_id_text)
password_hash = run_mysql(
"account",
f"SELECT password FROM account WHERE login={sql_quote(account_login)} LIMIT 1;",
)
if not password_hash:
raise SystemExit(f"account password hash not found: {account_login}")
pid_text = run_mysql(
"player",
f"SELECT pid{slot + 1} FROM player_index WHERE id={account_id} LIMIT 1;",
)
if not pid_text or pid_text == "0":
raise SystemExit(f"slot {slot} has no character for account: {account_login}")
player_id = int(pid_text)
player_row = run_mysql(
"player",
f"SELECT id, name, map_index, x, y FROM player WHERE id={player_id} LIMIT 1;",
)
if not player_row:
raise SystemExit(f"player row not found for id={player_id}")
parts = player_row.split("\t")
if len(parts) != 5:
raise SystemExit(f"unexpected player row format for id={player_id}: {player_row}")
return AccountState(
account_id=account_id,
password_hash=password_hash,
player=PlayerState(
player_id=int(parts[0]),
name=parts[1],
map_index=int(parts[2]),
x=int(parts[3]),
y=int(parts[4]),
),
)
def restore_account_state(account_login: str, state: AccountState) -> None:
run_mysql(
"account",
(
"UPDATE account SET password=%s WHERE login=%s;"
% (sql_quote(state.password_hash), sql_quote(account_login))
),
)
run_mysql(
"player",
(
"UPDATE player SET map_index=%d, x=%d, y=%d WHERE id=%d;"
% (
state.player.map_index,
state.player.x,
state.player.y,
state.player.player_id,
)
),
)
def verify_account_state(account_login: str, slot: int, expected: AccountState) -> None:
current = load_account_state(account_login, slot)
if current.password_hash != expected.password_hash:
raise RuntimeError(
f"account password hash did not restore for {account_login}"
)
if (
current.player.map_index != expected.player.map_index
or current.player.x != expected.player.x
or current.player.y != expected.player.y
or current.player.player_id != expected.player.player_id
):
raise RuntimeError(
"player state did not restore for "
f"{account_login}: expected map={expected.player.map_index} "
f"x={expected.player.x} y={expected.player.y}, got "
f"map={current.player.map_index} x={current.player.x} y={current.player.y}"
)
def set_temp_account_state(
account_login: str,
player_id: int,
*,
temp_password: str,
map_index: int,
global_x: int,
global_y: int,
) -> None:
run_mysql(
"account",
(
"UPDATE account SET password=PASSWORD(%s) WHERE login=%s;"
% (sql_quote(temp_password), sql_quote(account_login))
),
)
run_mysql(
"player",
(
"UPDATE player SET map_index=%d, x=%d, y=%d WHERE id=%d;"
% (map_index, global_x, global_y, player_id)
),
)
def remove_previous_logs(build_bin_dir: Path) -> None:
log_dir = build_bin_dir / "log"
log_dir.mkdir(parents=True, exist_ok=True)
for path in log_dir.glob("*.txt"):
path.unlink(missing_ok=True)
for path in [build_bin_dir / "syserr.txt", build_bin_dir / "ErrorLog.txt", build_bin_dir / "loginInfo.py"]:
path.unlink(missing_ok=True)
def write_login_info(
build_bin_dir: Path,
*,
account_login: str,
temp_password: str,
slot: int,
server_host: str,
auth_port: int,
channel_port: int,
) -> None:
login_info = "\n".join(
[
f"addr={server_host!r}",
f"port={channel_port}",
f"account_addr={server_host!r}",
f"account_port={auth_port}",
f"id={account_login!r}",
f"pwd={temp_password!r}",
f"slot={slot}",
"autoLogin=1",
"autoSelect=1",
"",
]
)
(build_bin_dir / "loginInfo.py").write_text(login_info, encoding="utf-8")
def copy_source_bin(source_build_bin_dir: Path, workspace_bin_dir: Path) -> None:
run_checked(
[
"rsync",
"-a",
"--delete",
"--exclude",
"config",
"--exclude",
"pack",
"--exclude",
"mark",
"--exclude",
"log",
"--exclude",
"BGM",
"--exclude",
"bgm",
"--exclude",
"locale",
"--exclude",
"sound",
f"{source_build_bin_dir}/",
str(workspace_bin_dir),
]
)
def copy_if_exists(src: Path, dst: Path) -> None:
if src.exists():
shutil.copy2(src, dst)
def prepare_runtime_wrapper(runtime_root: Path, wrapper_root: Path) -> None:
if not (runtime_root / "config").exists():
raise SystemExit(f"runtime config missing: {runtime_root / 'config'}")
if not (runtime_root / "pack").exists():
raise SystemExit(f"runtime pack missing: {runtime_root / 'pack'}")
wrapper_root.mkdir(parents=True, exist_ok=True)
def link_entry(name: str, target: Path) -> None:
link_path = wrapper_root / name
if link_path.exists() or link_path.is_symlink():
link_path.unlink()
os.symlink(target, link_path)
link_entry("config", runtime_root / "config")
link_entry("pack", runtime_root / "pack")
optional_entries = {
"mark": runtime_root / "mark",
"locale": runtime_root / "locale",
"sound": runtime_root / "sound",
}
bgm_target = runtime_root / "BGM"
if not bgm_target.exists():
bgm_target = runtime_root / "bgm"
if bgm_target.exists():
optional_entries["BGM"] = bgm_target
for name, target in optional_entries.items():
if target.exists():
link_entry(name, target)
def capture_run(
*,
label: str,
runtime_root: Path,
source_build_bin_dir: Path,
out_dir: Path,
work_root: Path | None,
keep_workspace: bool,
master_key_hex: str,
sign_pubkey_hex: str,
key_id: str,
account_login: str,
slot: int,
timeout_seconds: int,
server_host: str,
auth_port: int,
channel_port: int,
map_name: str,
map_index: int,
global_x: int,
global_y: int,
command_delay: float,
settle_delay: float,
warp_timeout: float,
stage_script: Path,
run_script: Path,
parser_script: Path,
) -> tuple[Path, Path | None]:
workspace_path = Path(
tempfile.mkdtemp(
prefix=f"pack-profile-gm-smoke-{sanitize_label(label)}.",
dir=str(work_root) if work_root else None,
)
)
workspace_bin_dir = workspace_path / "bin"
workspace_runtime_dir = workspace_path / "runtime"
workspace_bin_dir.mkdir(parents=True, exist_ok=True)
account_state = load_account_state(account_login, slot)
temp_password = ("Tmp" + secrets.token_hex(6))[:16]
safe_label = sanitize_label(label)
raw_out = out_dir / f"{safe_label}.pack_profile.txt"
summary_out = out_dir / f"{safe_label}.summary.txt"
trace_out = out_dir / f"{safe_label}.headless_gm_teleport_trace.txt"
startup_out = out_dir / f"{safe_label}.startup_trace.txt"
syserr_out = out_dir / f"{safe_label}.syserr.txt"
try:
copy_source_bin(source_build_bin_dir, workspace_bin_dir)
remove_previous_logs(workspace_bin_dir)
prepare_runtime_wrapper(runtime_root, workspace_runtime_dir)
run_checked(["bash", str(stage_script), str(workspace_runtime_dir), str(workspace_bin_dir)])
write_login_info(
workspace_bin_dir,
account_login=account_login,
temp_password=temp_password,
slot=slot,
server_host=server_host,
auth_port=auth_port,
channel_port=channel_port,
)
set_temp_account_state(
account_login,
account_state.player.player_id,
temp_password=temp_password,
map_index=map_index,
global_x=global_x,
global_y=global_y,
)
env = os.environ.copy()
env.update(
{
"M2PACK_PROFILE": "1",
"M2PACK_MASTER_KEY_HEX": master_key_hex,
"M2PACK_SIGN_PUBKEY_HEX": sign_pubkey_hex,
"M2PACK_KEY_ID": key_id,
"M2_TIMEOUT": str(timeout_seconds),
"M2_HEADLESS_SCENARIO": "gm_teleport",
"M2_HEADLESS_WARP_STEPS": f"{map_name},{global_x},{global_y}",
"M2_HEADLESS_COMMAND_DELAY": str(command_delay),
"M2_HEADLESS_SETTLE_DELAY": str(settle_delay),
"M2_HEADLESS_WARP_TIMEOUT": str(warp_timeout),
}
)
env.setdefault("WINEDEBUG", "-all")
completed = subprocess.run(
["bash", str(run_script), str(workspace_bin_dir)],
cwd=str(workspace_bin_dir),
env=env,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
report_path = workspace_bin_dir / "log" / "pack_profile.txt"
if not report_path.is_file():
detail = completed.stderr.strip() or completed.stdout.strip()
raise RuntimeError(f"pack profile report not generated for {label}: {detail}")
trace_path = workspace_bin_dir / "log" / "headless_gm_teleport_trace.txt"
startup_path = workspace_bin_dir / "log" / "startup_trace.txt"
syserr_path = workspace_bin_dir / "syserr.txt"
shutil.copy2(report_path, raw_out)
copy_if_exists(trace_path, trace_out)
copy_if_exists(startup_path, startup_out)
copy_if_exists(syserr_path, syserr_out)
summary = run_checked([sys.executable, str(parser_script), str(raw_out)])
summary_out.write_text(summary.stdout, encoding="utf-8")
trace_text = trace_path.read_text(encoding="utf-8", errors="ignore") if trace_path.exists() else ""
scenario_ok = "Scenario success current_map=" in trace_text
if completed.returncode not in ACCEPTABLE_RUN_CODES:
detail = completed.stderr.strip() or completed.stdout.strip()
raise RuntimeError(
f"client run for {label} exited with {completed.returncode}: {detail}"
)
if not scenario_ok:
raise RuntimeError(
f"GM smoke scenario for {label} did not finish successfully; "
f"see {trace_out if trace_out.exists() else trace_path}"
)
return raw_out, (workspace_path if keep_workspace else None)
finally:
try:
restore_account_state(account_login, account_state)
verify_account_state(account_login, slot, account_state)
finally:
if not keep_workspace:
shutil.rmtree(workspace_path, ignore_errors=True)
def main() -> int:
args = parse_args()
require_tool("mysql")
require_tool("rsync")
require_tool("python3")
script_dir = Path(__file__).resolve().parent
repo_root = script_dir.parent
source_build_bin_dir = (
Path(args.source_build_bin_dir).resolve()
if args.source_build_bin_dir
else (repo_root / "build-mingw64-lld" / "bin").resolve()
)
if not (source_build_bin_dir / "Metin2_RelWithDebInfo.exe").is_file():
raise SystemExit(f"client exe not found: {source_build_bin_dir / 'Metin2_RelWithDebInfo.exe'}")
stage_script = script_dir / "stage-linux-runtime.sh"
run_script = script_dir / "run-wine-headless.sh"
parser_script = script_dir / "pack-profile-report.py"
for path, kind in [
(stage_script, "runtime stage script"),
(run_script, "headless run script"),
(parser_script, "pack profile parser"),
]:
if not path.exists():
raise SystemExit(f"{kind} not found: {path}")
out_dir = (
args.out_dir.resolve()
if args.out_dir
else (source_build_bin_dir / "log" / "pack-profile-gm-smoke" / secrets.token_hex(4)).resolve()
)
out_dir.mkdir(parents=True, exist_ok=True)
work_root = args.work_root.resolve() if args.work_root else None
if work_root:
work_root.mkdir(parents=True, exist_ok=True)
master_key_hex = args.master_key.read_text(encoding="utf-8").strip()
sign_pubkey_hex = args.sign_pubkey.read_text(encoding="utf-8").strip()
if not master_key_hex:
raise SystemExit(f"master key file is empty: {args.master_key}")
if not sign_pubkey_hex:
raise SystemExit(f"sign pubkey file is empty: {args.sign_pubkey}")
left_report, left_workspace = capture_run(
label=args.left_label,
runtime_root=args.left_runtime_root.resolve(),
source_build_bin_dir=source_build_bin_dir,
out_dir=out_dir,
work_root=work_root,
keep_workspace=args.keep_workspaces,
master_key_hex=master_key_hex,
sign_pubkey_hex=sign_pubkey_hex,
key_id=args.key_id,
account_login=args.account_login,
slot=args.slot,
timeout_seconds=args.timeout,
server_host=args.server_host,
auth_port=args.auth_port,
channel_port=args.channel_port,
map_name=args.map_name,
map_index=args.map_index,
global_x=args.global_x,
global_y=args.global_y,
command_delay=args.command_delay,
settle_delay=args.settle_delay,
warp_timeout=args.warp_timeout,
stage_script=stage_script,
run_script=run_script,
parser_script=parser_script,
)
right_report, right_workspace = capture_run(
label=args.right_label,
runtime_root=args.right_runtime_root.resolve(),
source_build_bin_dir=source_build_bin_dir,
out_dir=out_dir,
work_root=work_root,
keep_workspace=args.keep_workspaces,
master_key_hex=master_key_hex,
sign_pubkey_hex=sign_pubkey_hex,
key_id=args.key_id,
account_login=args.account_login,
slot=args.slot,
timeout_seconds=args.timeout,
server_host=args.server_host,
auth_port=args.auth_port,
channel_port=args.channel_port,
map_name=args.map_name,
map_index=args.map_index,
global_x=args.global_x,
global_y=args.global_y,
command_delay=args.command_delay,
settle_delay=args.settle_delay,
warp_timeout=args.warp_timeout,
stage_script=stage_script,
run_script=run_script,
parser_script=parser_script,
)
compare = run_checked(
[
sys.executable,
str(parser_script),
f"{args.left_label}={left_report}",
f"{args.right_label}={right_report}",
]
)
compare_path = out_dir / (
f"compare-{sanitize_label(args.left_label)}-vs-{sanitize_label(args.right_label)}.txt"
)
compare_path.write_text(compare.stdout, encoding="utf-8")
print(compare.stdout, end="")
print()
print(f"saved compare report: {compare_path}")
if left_workspace:
print(f"kept workspace: {left_workspace}")
if right_workspace:
print(f"kept workspace: {right_workspace}")
return 0
if __name__ == "__main__":
sys.exit(main())


@@ -0,0 +1,176 @@
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<'EOF'
Usage:
compare-pack-profile-runs.sh [options] [build-bin-dir] [-- <client args...>]
Examples:
compare-pack-profile-runs.sh \
--left-label pck \
--left-runtime-root /srv/client-runtime-pck \
--right-label m2p \
--right-runtime-root /srv/client-runtime-m2p
compare-pack-profile-runs.sh \
--left-label legacy \
--left-runtime-root ../runtime-pck \
--right-label secure \
--right-runtime-root ../runtime-m2p \
./build-mingw64-lld/bin -- --m2pack-strict-hash 0
Options:
--left-runtime-root DIR Runtime root for the first run
--right-runtime-root DIR Runtime root for the second run
--left-label NAME Label for the first run (default: left)
--right-label NAME Label for the second run (default: right)
--out-dir DIR Directory for archived reports and compare output
--timeout SEC Overrides M2_TIMEOUT for both runs
--copy-runtime Materialize runtime instead of symlinking it
-h, --help Show this help
Behavior:
- Runs two captures back-to-back into the same build output
- Archives both raw reports and summaries
- Writes a combined compare report and prints it to stdout
EOF
}
sanitize_label() {
printf '%s' "$1" \
| tr '[:upper:]' '[:lower:]' \
| sed -E 's/[^a-z0-9._-]+/-/g; s/^-+//; s/-+$//; s/-{2,}/-/g'
}
script_dir="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
repo_root="$(realpath "$script_dir/..")"
default_build_bin_dir="$repo_root/build-mingw64-lld/bin"
capture_script="${M2_CAPTURE_PACK_PROFILE_SCRIPT:-$script_dir/capture-pack-profile.sh}"
parser_script="${M2_PACK_PROFILE_PARSER:-$script_dir/pack-profile-report.py}"
left_runtime_root=""
right_runtime_root=""
left_label="left"
right_label="right"
out_dir=""
timeout_seconds="${M2_TIMEOUT:-20}"
copy_runtime=0
while [[ $# -gt 0 ]]; do
case "$1" in
--left-runtime-root)
left_runtime_root="$2"
shift 2
;;
--right-runtime-root)
right_runtime_root="$2"
shift 2
;;
--left-label)
left_label="$2"
shift 2
;;
--right-label)
right_label="$2"
shift 2
;;
--out-dir)
out_dir="$2"
shift 2
;;
--timeout)
timeout_seconds="$2"
shift 2
;;
--copy-runtime)
copy_runtime=1
shift
;;
-h|--help)
usage
exit 0
;;
--)
shift
break
;;
-*)
echo "unknown option: $1" >&2
usage >&2
exit 1
;;
*)
break
;;
esac
done
if [[ -z "$left_runtime_root" || -z "$right_runtime_root" ]]; then
echo "both --left-runtime-root and --right-runtime-root are required" >&2
usage >&2
exit 1
fi
build_bin_dir="${1:-$default_build_bin_dir}"
if [[ $# -gt 0 ]]; then
shift
fi
build_bin_dir="$(realpath -m "$build_bin_dir")"
if [[ -z "$out_dir" ]]; then
out_dir="$build_bin_dir/log/pack-profile-runs/$(date +%Y%m%d-%H%M%S)"
fi
out_dir="$(realpath -m "$out_dir")"
mkdir -p "$out_dir"
if [[ ! -x "$capture_script" ]]; then
echo "capture script not executable: $capture_script" >&2
exit 1
fi
if [[ ! -f "$parser_script" ]]; then
echo "pack profile parser not found: $parser_script" >&2
exit 1
fi
if ! command -v python3 >/dev/null 2>&1; then
echo "python3 not found in PATH" >&2
exit 1
fi
capture_common_args=(
--out-dir "$out_dir"
--timeout "$timeout_seconds"
)
if [[ "$copy_runtime" -eq 1 ]]; then
capture_common_args+=(--copy-runtime)
fi
"$capture_script" \
"${capture_common_args[@]}" \
--runtime-root "$left_runtime_root" \
--label "$left_label" \
"$build_bin_dir" \
-- "$@"
"$capture_script" \
"${capture_common_args[@]}" \
--runtime-root "$right_runtime_root" \
--label "$right_label" \
"$build_bin_dir" \
-- "$@"
left_safe_label="$(sanitize_label "$left_label")"
right_safe_label="$(sanitize_label "$right_label")"
left_report="$out_dir/$left_safe_label.pack_profile.txt"
right_report="$out_dir/$right_safe_label.pack_profile.txt"
compare_path="$out_dir/compare-$left_safe_label-vs-$right_safe_label.txt"
python3 "$parser_script" \
"$left_label=$left_report" \
"$right_label=$right_report" | tee "$compare_path"
echo
echo "saved compare report: $compare_path"

scripts/pack-profile-report.py Executable file

@@ -0,0 +1,755 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
import math
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import Iterable
TABLE_SECTIONS = {
"mounts_by_format",
"mounts_by_pack",
"loads_by_format",
"loads_by_phase_format",
"top_phase_pack_loads",
"top_extension_loads",
"loader_stages",
}
@dataclass
class Report:
label: str
path: Path
metadata: dict[str, str] = field(default_factory=dict)
phase_markers: list[tuple[str, float]] = field(default_factory=list)
tables: dict[str, dict[str, dict[str, float]]] = field(default_factory=dict)
def uptime_ms(self) -> float:
return parse_float(self.metadata.get("uptime_ms", "0"))
def current_phase(self) -> str:
return self.metadata.get("current_phase", "-")
def reason(self) -> str:
return self.metadata.get("reason", "-")
def section(self, name: str) -> dict[str, dict[str, float]]:
return self.tables.get(name, {})
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description=(
"Summarize or compare pack_profile.txt reports generated by the "
"client-side M2 pack profiler."
),
)
parser.add_argument(
"reports",
nargs="+",
metavar="REPORT",
help=(
"Either /path/to/pack_profile.txt or label=/path/to/pack_profile.txt. "
"Pass one report for a summary, or two or more reports for a comparison."
),
)
parser.add_argument(
"--top",
type=int,
default=6,
help="How many hotspot rows to show in top-pack, top-extension and stage sections.",
)
return parser.parse_args()
def parse_float(value: str) -> float:
try:
return float(value.strip())
except ValueError:
return 0.0
def parse_metric_value(value: str) -> float:
value = value.strip()
if value.endswith(" ms"):
value = value[:-3]
return parse_float(value)
def parse_report_arg(value: str) -> tuple[str, Path]:
if "=" in value:
label, path_value = value.split("=", 1)
if label and path_value:
return label, Path(path_value)
path = Path(value)
return path.stem, path
def load_report(arg_value: str) -> Report:
label, path = parse_report_arg(arg_value)
if not path.is_file():
raise FileNotFoundError(f"report not found: {path}")
report = Report(label=label, path=path)
current_section: str | None = None
for raw_line in path.read_text(encoding="utf-8", errors="replace").splitlines():
line = raw_line.strip()
if not line or line == "PACK PROFILE REPORT":
continue
if line.startswith("[") and line.endswith("]"):
current_section = line[1:-1]
if current_section in TABLE_SECTIONS:
report.tables.setdefault(current_section, {})
continue
if current_section is None:
if "=" in line:
key, value = line.split("=", 1)
report.metadata[key.strip()] = value.strip()
continue
if current_section == "phase_markers":
if "\t" not in line:
continue
phase, value = line.split("\t", 1)
report.phase_markers.append((phase.strip(), parse_metric_value(value)))
continue
if current_section not in TABLE_SECTIONS:
continue
fields = [part.strip() for part in line.split("\t") if part.strip()]
if not fields:
continue
key = fields[0]
metrics: dict[str, float] = {}
for part in fields[1:]:
if "=" not in part:
continue
metric_key, metric_value = part.split("=", 1)
metrics[metric_key.strip()] = parse_metric_value(metric_value)
report.tables[current_section][key] = metrics
return report
def aggregate_phase_loads(report: Report) -> dict[str, dict[str, float]]:
aggregated: dict[str, dict[str, float]] = {}
for key, metrics in report.section("loads_by_phase_format").items():
phase, _, _format = key.partition("|")
phase_key = phase.strip()
bucket = aggregated.setdefault(
phase_key,
{"count": 0.0, "fail": 0.0, "bytes": 0.0, "time_ms": 0.0},
)
for metric_name in ("count", "fail", "bytes", "time_ms"):
bucket[metric_name] += metrics.get(metric_name, 0.0)
return aggregated
def sum_section_metrics(section: dict[str, dict[str, float]]) -> dict[str, float]:
totals: dict[str, float] = {}
for metrics in section.values():
for key, value in metrics.items():
totals[key] = totals.get(key, 0.0) + value
return totals
def sum_selected_section_metrics(
section: dict[str, dict[str, float]],
include_keys: Iterable[str] | None = None,
exclude_keys: Iterable[str] | None = None,
) -> dict[str, float]:
include_set = set(include_keys or [])
exclude_set = set(exclude_keys or [])
filtered: dict[str, dict[str, float]] = {}
for key, metrics in section.items():
if include_set and key not in include_set:
continue
if key in exclude_set:
continue
filtered[key] = metrics
return sum_section_metrics(filtered)
def sort_rows_by_metric(
section: dict[str, dict[str, float]],
metric: str,
limit: int,
) -> list[tuple[str, dict[str, float]]]:
rows = list(section.items())
rows.sort(key=lambda item: (-item[1].get(metric, 0.0), item[0]))
return rows[:limit]
def format_ms(value: float) -> str:
return f"{value:.3f}"
def format_delta(value: float) -> str:
return f"{value:+.3f}"
def format_percent(value: float | None) -> str:
if value is None or math.isinf(value) or math.isnan(value):
return "-"
return f"{value:+.1f}%"
def format_bytes(value: float) -> str:
units = ("B", "KiB", "MiB", "GiB")
size = float(value)
unit_index = 0
while abs(size) >= 1024.0 and unit_index < len(units) - 1:
size /= 1024.0
unit_index += 1
if unit_index == 0:
return f"{int(round(size))} {units[unit_index]}"
return f"{size:.2f} {units[unit_index]}"
def format_count(value: float) -> str:
return str(int(round(value)))
def format_metric(metric: str, value: float) -> str:
if metric in {"bytes", "in", "out"}:
return format_bytes(value)
if metric in {"time_ms"}:
return format_ms(value)
if metric in {"count", "fail", "entries"}:
return format_count(value)
if metric.endswith("_ms"):
return format_ms(value)
if metric.endswith("_us"):
return f"{value:.1f}"
if abs(value - round(value)) < 0.000001:
return format_count(value)
return f"{value:.3f}"
def render_table(headers: list[str], rows: list[list[str]], numeric_columns: set[int] | None = None) -> str:
numeric_columns = numeric_columns or set()
widths = [len(header) for header in headers]
for row in rows:
for index, cell in enumerate(row):
widths[index] = max(widths[index], len(cell))
lines = []
header_cells = []
for index, header in enumerate(headers):
align = str.rjust if index in numeric_columns else str.ljust
header_cells.append(align(header, widths[index]))
lines.append(" ".join(header_cells))
lines.append(" ".join("-" * width for width in widths))
for row in rows:
cells = []
for index, cell in enumerate(row):
align = str.rjust if index in numeric_columns else str.ljust
cells.append(align(cell, widths[index]))
lines.append(" ".join(cells))
return "\n".join(lines)
def relative_delta_percent(base: float, candidate: float) -> float | None:
if abs(base) < 0.000001:
return None
return ((candidate - base) / base) * 100.0
def summarize_report(report: Report, top: int) -> str:
lines = [
f"== {report.label} ==",
f"path: {report.path}",
(
f"reason={report.reason()} phase={report.current_phase()} "
f"uptime_ms={format_ms(report.uptime_ms())}"
),
"",
]
load_totals = sum_section_metrics(report.section("loads_by_format"))
mount_totals = sum_section_metrics(report.section("mounts_by_format"))
overview_rows = [[
format_count(load_totals.get("count", 0.0)),
format_ms(load_totals.get("time_ms", 0.0)),
format_bytes(load_totals.get("bytes", 0.0)),
format_count(load_totals.get("fail", 0.0)),
format_count(mount_totals.get("count", 0.0)),
format_ms(mount_totals.get("time_ms", 0.0)),
]]
lines.extend([
"Overview",
render_table(
["loads", "load_ms", "load_bytes", "load_fail", "mounts", "mount_ms"],
overview_rows,
numeric_columns={0, 1, 2, 3, 4, 5},
),
"",
])
if report.phase_markers:
phase_rows = [[phase, format_ms(value)] for phase, value in report.phase_markers]
lines.extend([
"Phase Markers",
render_table(["phase", "ms"], phase_rows, numeric_columns={1}),
"",
])
loads_by_format_rows = []
for key, metrics in sort_rows_by_metric(report.section("loads_by_format"), "time_ms", top):
loads_by_format_rows.append([
key,
format_count(metrics.get("count", 0.0)),
format_ms(metrics.get("time_ms", 0.0)),
format_bytes(metrics.get("bytes", 0.0)),
format_count(metrics.get("fail", 0.0)),
])
if loads_by_format_rows:
lines.extend([
"Loads By Format",
render_table(
["format", "count", "time_ms", "bytes", "fail"],
loads_by_format_rows,
numeric_columns={1, 2, 3, 4},
),
"",
])
mounts_by_format_rows = []
for key, metrics in sort_rows_by_metric(report.section("mounts_by_format"), "time_ms", top):
mounts_by_format_rows.append([
key,
format_count(metrics.get("count", 0.0)),
format_ms(metrics.get("time_ms", 0.0)),
format_count(metrics.get("entries", 0.0)),
format_count(metrics.get("fail", 0.0)),
])
if mounts_by_format_rows:
lines.extend([
"Mounts By Format",
render_table(
["format", "count", "time_ms", "entries", "fail"],
mounts_by_format_rows,
numeric_columns={1, 2, 3, 4},
),
"",
])
phase_load_rows = []
for key, metrics in sort_rows_by_metric(aggregate_phase_loads(report), "time_ms", top):
phase_load_rows.append([
key,
format_count(metrics.get("count", 0.0)),
format_ms(metrics.get("time_ms", 0.0)),
format_bytes(metrics.get("bytes", 0.0)),
format_count(metrics.get("fail", 0.0)),
])
if phase_load_rows:
lines.extend([
"Load Time By Phase",
render_table(
["phase", "count", "time_ms", "bytes", "fail"],
phase_load_rows,
numeric_columns={1, 2, 3, 4},
),
"",
])
stage_rows = []
for key, metrics in sort_rows_by_metric(report.section("loader_stages"), "time_ms", top):
count = metrics.get("count", 0.0)
avg_us = (metrics.get("time_ms", 0.0) * 1000.0 / count) if count else 0.0
stage_rows.append([
key,
format_count(count),
format_ms(metrics.get("time_ms", 0.0)),
f"{avg_us:.1f}",
format_bytes(metrics.get("in", 0.0)),
format_bytes(metrics.get("out", 0.0)),
])
if stage_rows:
lines.extend([
"Top Loader Stages",
render_table(
["stage", "count", "time_ms", "avg_us", "in", "out"],
stage_rows,
numeric_columns={1, 2, 3, 4, 5},
),
"",
])
hot_pack_rows = []
for key, metrics in sort_rows_by_metric(report.section("top_phase_pack_loads"), "time_ms", top):
hot_pack_rows.append([
key,
format_count(metrics.get("count", 0.0)),
format_ms(metrics.get("time_ms", 0.0)),
format_bytes(metrics.get("bytes", 0.0)),
])
if hot_pack_rows:
lines.extend([
"Top Phase Pack Loads",
render_table(
["phase | pack", "count", "time_ms", "bytes"],
hot_pack_rows,
numeric_columns={1, 2, 3},
),
"",
])
extension_rows = []
for key, metrics in sort_rows_by_metric(report.section("top_extension_loads"), "time_ms", top):
extension_rows.append([
key,
format_count(metrics.get("count", 0.0)),
format_ms(metrics.get("time_ms", 0.0)),
format_bytes(metrics.get("bytes", 0.0)),
])
if extension_rows:
lines.extend([
"Top Extensions",
render_table(
["extension", "count", "time_ms", "bytes"],
extension_rows,
numeric_columns={1, 2, 3},
),
"",
])
return "\n".join(lines).rstrip()
def compare_two_reports(left: Report, right: Report, top: int) -> str:
lines = [f"== {left.label} vs {right.label} ==", ""]
overview_rows = []
for report in (left, right):
load_totals = sum_section_metrics(report.section("loads_by_format"))
mount_totals = sum_section_metrics(report.section("mounts_by_format"))
overview_rows.append([
report.label,
format_ms(report.uptime_ms()),
format_count(load_totals.get("count", 0.0)),
format_ms(load_totals.get("time_ms", 0.0)),
format_bytes(load_totals.get("bytes", 0.0)),
format_count(mount_totals.get("count", 0.0)),
format_ms(mount_totals.get("time_ms", 0.0)),
])
lines.extend([
"Overview",
render_table(
["report", "uptime_ms", "loads", "load_ms", "load_bytes", "mounts", "mount_ms"],
overview_rows,
numeric_columns={1, 2, 3, 4, 5, 6},
),
"",
])
left_packed_loads = sum_selected_section_metrics(left.section("loads_by_format"), exclude_keys={"disk"})
right_packed_loads = sum_selected_section_metrics(right.section("loads_by_format"), exclude_keys={"disk"})
packed_load_rows = [[
format_count(left_packed_loads.get("count", 0.0)),
format_ms(left_packed_loads.get("time_ms", 0.0)),
format_bytes(left_packed_loads.get("bytes", 0.0)),
format_count(right_packed_loads.get("count", 0.0)),
format_ms(right_packed_loads.get("time_ms", 0.0)),
format_bytes(right_packed_loads.get("bytes", 0.0)),
format_delta(right_packed_loads.get("time_ms", 0.0) - left_packed_loads.get("time_ms", 0.0)),
format_percent(relative_delta_percent(left_packed_loads.get("time_ms", 0.0), right_packed_loads.get("time_ms", 0.0))),
]]
lines.extend([
"Packed Load Totals",
render_table(
[
f"{left.label}_count",
f"{left.label}_ms",
f"{left.label}_bytes",
f"{right.label}_count",
f"{right.label}_ms",
f"{right.label}_bytes",
"delta_ms",
"delta_pct",
],
packed_load_rows,
numeric_columns={0, 1, 2, 3, 4, 5, 6, 7},
),
"",
])
left_packed_mounts = sum_selected_section_metrics(left.section("mounts_by_format"))
right_packed_mounts = sum_selected_section_metrics(right.section("mounts_by_format"))
packed_mount_rows = [[
format_count(left_packed_mounts.get("count", 0.0)),
format_ms(left_packed_mounts.get("time_ms", 0.0)),
format_count(left_packed_mounts.get("entries", 0.0)),
format_count(right_packed_mounts.get("count", 0.0)),
format_ms(right_packed_mounts.get("time_ms", 0.0)),
format_count(right_packed_mounts.get("entries", 0.0)),
format_delta(right_packed_mounts.get("time_ms", 0.0) - left_packed_mounts.get("time_ms", 0.0)),
format_percent(relative_delta_percent(left_packed_mounts.get("time_ms", 0.0), right_packed_mounts.get("time_ms", 0.0))),
]]
if packed_mount_rows:
lines.extend([
"Packed Mount Totals",
render_table(
[
f"{left.label}_count",
f"{left.label}_ms",
f"{left.label}_entries",
f"{right.label}_count",
f"{right.label}_ms",
f"{right.label}_entries",
"delta_ms",
"delta_pct",
],
packed_mount_rows,
numeric_columns={0, 1, 2, 3, 4, 5, 6, 7},
),
"",
])
left_phases = dict(left.phase_markers)
right_phases = dict(right.phase_markers)
ordered_phases = [phase for phase, _ in left.phase_markers]
ordered_phases.extend(phase for phase, _ in right.phase_markers if phase not in left_phases)
phase_rows = []
for phase in ordered_phases:
left_value = left_phases.get(phase)
right_value = right_phases.get(phase)
if left_value is None and right_value is None:
continue
delta = (right_value or 0.0) - (left_value or 0.0)
delta_pct = relative_delta_percent(left_value or 0.0, right_value or 0.0)
phase_rows.append([
phase,
"-" if left_value is None else format_ms(left_value),
"-" if right_value is None else format_ms(right_value),
format_delta(delta),
format_percent(delta_pct),
])
if phase_rows:
lines.extend([
"Phase Markers",
render_table(
["phase", left.label, right.label, "delta_ms", "delta_pct"],
phase_rows,
numeric_columns={1, 2, 3, 4},
),
"",
])
left_phase_loads = aggregate_phase_loads(left)
right_phase_loads = aggregate_phase_loads(right)
phase_names = sorted(set(left_phase_loads) | set(right_phase_loads))
phase_load_rows = []
for phase in phase_names:
left_metrics = left_phase_loads.get(phase, {})
right_metrics = right_phase_loads.get(phase, {})
left_time = left_metrics.get("time_ms", 0.0)
right_time = right_metrics.get("time_ms", 0.0)
phase_load_rows.append([
phase,
format_ms(left_time),
format_ms(right_time),
format_delta(right_time - left_time),
format_percent(relative_delta_percent(left_time, right_time)),
format_count(left_metrics.get("count", 0.0)),
format_count(right_metrics.get("count", 0.0)),
])
if phase_load_rows:
phase_load_rows.sort(key=lambda row: (-max(parse_float(row[1]), parse_float(row[2])), row[0]))
lines.extend([
"Load Time By Phase",
render_table(
["phase", f"{left.label}_ms", f"{right.label}_ms", "delta_ms", "delta_pct", f"{left.label}_count", f"{right.label}_count"],
phase_load_rows,
numeric_columns={1, 2, 3, 4, 5, 6},
),
"",
])
format_names = sorted(set(left.section("loads_by_format")) | set(right.section("loads_by_format")))
format_rows = []
for format_name in format_names:
left_metrics = left.section("loads_by_format").get(format_name, {})
right_metrics = right.section("loads_by_format").get(format_name, {})
left_time = left_metrics.get("time_ms", 0.0)
right_time = right_metrics.get("time_ms", 0.0)
format_rows.append([
format_name,
format_count(left_metrics.get("count", 0.0)),
format_ms(left_time),
format_bytes(left_metrics.get("bytes", 0.0)),
format_count(right_metrics.get("count", 0.0)),
format_ms(right_time),
format_bytes(right_metrics.get("bytes", 0.0)),
format_delta(right_time - left_time),
format_percent(relative_delta_percent(left_time, right_time)),
])
if format_rows:
format_rows.sort(key=lambda row: (-max(parse_float(row[2]), parse_float(row[5])), row[0]))
lines.extend([
"Loads By Format",
render_table(
[
"format",
f"{left.label}_count",
f"{left.label}_ms",
f"{left.label}_bytes",
f"{right.label}_count",
f"{right.label}_ms",
f"{right.label}_bytes",
"delta_ms",
"delta_pct",
],
format_rows,
numeric_columns={1, 2, 3, 4, 5, 6, 7, 8},
),
"",
])
mount_names = sorted(set(left.section("mounts_by_format")) | set(right.section("mounts_by_format")))
mount_rows = []
for format_name in mount_names:
left_metrics = left.section("mounts_by_format").get(format_name, {})
right_metrics = right.section("mounts_by_format").get(format_name, {})
left_time = left_metrics.get("time_ms", 0.0)
right_time = right_metrics.get("time_ms", 0.0)
mount_rows.append([
format_name,
format_count(left_metrics.get("count", 0.0)),
format_ms(left_time),
format_count(left_metrics.get("entries", 0.0)),
format_count(right_metrics.get("count", 0.0)),
format_ms(right_time),
format_count(right_metrics.get("entries", 0.0)),
format_delta(right_time - left_time),
format_percent(relative_delta_percent(left_time, right_time)),
])
if mount_rows:
mount_rows.sort(key=lambda row: (-max(parse_float(row[2]), parse_float(row[5])), row[0]))
lines.extend([
"Mounts By Format",
render_table(
[
"format",
f"{left.label}_count",
f"{left.label}_ms",
f"{left.label}_entries",
f"{right.label}_count",
f"{right.label}_ms",
f"{right.label}_entries",
"delta_ms",
"delta_pct",
],
mount_rows,
numeric_columns={1, 2, 3, 4, 5, 6, 7, 8},
),
"",
])
stage_names = sorted(set(left.section("loader_stages")) | set(right.section("loader_stages")))
stage_rows = []
for stage_name in stage_names:
left_metrics = left.section("loader_stages").get(stage_name, {})
right_metrics = right.section("loader_stages").get(stage_name, {})
left_time = left_metrics.get("time_ms", 0.0)
right_time = right_metrics.get("time_ms", 0.0)
left_count = left_metrics.get("count", 0.0)
right_count = right_metrics.get("count", 0.0)
left_avg_us = (left_time * 1000.0 / left_count) if left_count else 0.0
right_avg_us = (right_time * 1000.0 / right_count) if right_count else 0.0
stage_rows.append([
stage_name,
format_ms(left_time),
f"{left_avg_us:.1f}",
format_ms(right_time),
f"{right_avg_us:.1f}",
format_delta(right_time - left_time),
format_percent(relative_delta_percent(left_time, right_time)),
])
if stage_rows:
stage_rows.sort(key=lambda row: (-max(parse_float(row[1]), parse_float(row[3])), row[0]))
lines.extend([
"Loader Stages",
render_table(
[
"stage",
f"{left.label}_ms",
f"{left.label}_avg_us",
f"{right.label}_ms",
f"{right.label}_avg_us",
"delta_ms",
"delta_pct",
],
stage_rows[:top],
numeric_columns={1, 2, 3, 4, 5, 6},
),
"",
])
for report in (left, right):
hot_rows = []
for key, metrics in sort_rows_by_metric(report.section("top_phase_pack_loads"), "time_ms", top):
hot_rows.append([
key,
format_count(metrics.get("count", 0.0)),
format_ms(metrics.get("time_ms", 0.0)),
format_bytes(metrics.get("bytes", 0.0)),
])
if hot_rows:
lines.extend([
f"Top Phase Pack Loads: {report.label}",
render_table(
["phase | pack", "count", "time_ms", "bytes"],
hot_rows,
numeric_columns={1, 2, 3},
),
"",
])
return "\n".join(lines).rstrip()
def summarize_many_reports(reports: list[Report], top: int) -> str:
if len(reports) == 2:
return compare_two_reports(reports[0], reports[1], top)
parts = []
for report in reports:
parts.append(summarize_report(report, top))
return "\n\n".join(parts)
def main() -> int:
args = parse_args()
try:
reports = [load_report(value) for value in args.reports]
except FileNotFoundError as exc:
print(str(exc), file=sys.stderr)
return 1
print(summarize_many_reports(reports, max(args.top, 1)))
return 0
if __name__ == "__main__":
raise SystemExit(main())
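The report-argument and metric-value conventions the parser relies on can be checked in isolation. The helpers below are restatements of `parse_float`, `parse_metric_value`, and `parse_report_arg` from the script above, run against synthetic inputs (the file paths are made up for illustration):

```python
# Restatements of the small parsing helpers defined in
# scripts/pack-profile-report.py above, exercised on synthetic inputs.
from pathlib import Path


def parse_float(value: str) -> float:
    try:
        return float(value.strip())
    except ValueError:
        return 0.0


def parse_metric_value(value: str) -> float:
    # Metric cells may carry a trailing " ms" unit suffix.
    value = value.strip()
    if value.endswith(" ms"):
        value = value[:-3]
    return parse_float(value)


def parse_report_arg(value: str) -> tuple[str, Path]:
    # "label=/path" picks an explicit label; otherwise the file stem is used.
    if "=" in value:
        label, path_value = value.split("=", 1)
        if label and path_value:
            return label, Path(path_value)
    path = Path(value)
    return path.stem, path


print(parse_metric_value("12.500 ms"))               # 12.5
print(parse_report_arg("pck=logs/a.txt")[0])         # pck
print(parse_report_arg("logs/pack_profile.txt")[0])  # pack_profile
```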

View File

@@ -1,4 +1,4 @@
-file(GLOB_RECURSE FILE_SOURCES "*.h" "*.c" "*.cpp")
file(GLOB_RECURSE FILE_SOURCES CONFIGURE_DEPENDS "*.h" "*.c" "*.cpp")
add_library(PackLib STATIC ${FILE_SOURCES})

View File

@@ -1,6 +1,8 @@
#include "M2Pack.h"
#include "PackProfile.h"
#include "M2PackRuntimeKeyProvider.h"
#include <chrono>
#include <cstring>
#include <zstd.h>
@@ -40,6 +42,7 @@ bool ReadPod(const uint8_t* bytes, std::size_t size, std::size_t& offset, T& out
bool CM2Pack::Load(const std::string& path)
{
m_source_path = path;
std::error_code ec;
m_file.map(path, ec);
@@ -100,6 +103,7 @@ bool CM2Pack::Load(const std::string& path)
bool CM2Pack::ValidateManifest()
{
std::array<uint8_t, M2PACK_HASH_SIZE> manifest_hash {};
const auto hashStart = std::chrono::steady_clock::now();
crypto_generichash(
manifest_hash.data(),
manifest_hash.size(),
@@ -107,6 +111,13 @@ bool CM2Pack::ValidateManifest()
m_manifest_bytes.size(),
nullptr,
0);
RecordPackProfileStage(
"m2p",
"manifest_hash",
m_manifest_bytes.size(),
manifest_hash.size(),
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - hashStart).count()));
if (memcmp(manifest_hash.data(), m_header.manifest_hash, manifest_hash.size()) != 0)
{
@@ -121,17 +132,33 @@ bool CM2Pack::ValidateManifest()
return false;
}
const auto verifyStart = std::chrono::steady_clock::now();
if (crypto_sign_verify_detached(
m_header.manifest_signature,
m_manifest_bytes.data(),
m_manifest_bytes.size(),
publicKey->data()) != 0)
{
RecordPackProfileStage(
"m2p",
"manifest_signature",
m_manifest_bytes.size(),
M2PACK_SIGNATURE_SIZE,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - verifyStart).count()));
TraceError("CM2Pack::ValidateManifest: manifest signature mismatch");
return false;
}
RecordPackProfileStage(
"m2p",
"manifest_signature",
m_manifest_bytes.size(),
M2PACK_SIGNATURE_SIZE,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - verifyStart).count()));
std::size_t offset = 0;
const auto parseStart = std::chrono::steady_clock::now();
TM2PackManifestHeader manifest_header {};
if (!ReadPod(m_manifest_bytes.data(), m_manifest_bytes.size(), offset, manifest_header))
{
@@ -184,30 +211,33 @@ bool CM2Pack::ValidateManifest()
m_index.push_back(std::move(entry));
}
RecordPackProfileStage(
"m2p",
"manifest_parse",
m_manifest_bytes.size(),
m_index.size(),
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - parseStart).count()));
return true;
}
bool CM2Pack::DecryptEntryPayload(const TM2PackEntry& entry, std::vector<uint8_t>& decrypted, CBufferPool* pPool)
{
const uint64_t begin = sizeof(TM2PackHeader) + entry.data_offset;
-const auto* ciphertext = reinterpret_cast<const uint8_t*>(m_file.data() + begin);
-std::vector<uint8_t> ciphertext_copy;
const auto* ciphertext = reinterpret_cast<const unsigned char*>(m_file.data() + begin);
if (pPool)
{
-ciphertext_copy = pPool->Acquire(entry.stored_size);
decrypted = pPool->Acquire(entry.stored_size);
}
-ciphertext_copy.resize(entry.stored_size);
-memcpy(ciphertext_copy.data(), ciphertext, entry.stored_size);
decrypted.resize(entry.stored_size);
const auto decryptStart = std::chrono::steady_clock::now();
unsigned long long written = 0;
if (crypto_aead_xchacha20poly1305_ietf_decrypt(
decrypted.data(),
&written,
nullptr,
-ciphertext_copy.data(),
-ciphertext_copy.size(),
ciphertext,
entry.stored_size,
reinterpret_cast<const unsigned char*>(entry.path.data()),
entry.path.size(),
entry.nonce.data(),
@@ -215,16 +245,19 @@ bool CM2Pack::DecryptEntryPayload(const TM2PackEntry& entry, std::vector<uint8_t
{
if (pPool)
{
-pPool->Release(std::move(ciphertext_copy));
pPool->Release(std::move(decrypted));
}
return false;
}
RecordPackProfileStage(
"m2p",
"aead_decrypt",
entry.stored_size,
written,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - decryptStart).count()));
decrypted.resize(static_cast<std::size_t>(written));
-if (pPool)
-{
-pPool->Release(std::move(ciphertext_copy));
-}
return true;
}
@@ -252,7 +285,19 @@ bool CM2Pack::GetFileWithPool(const TM2PackEntry& entry, std::vector<uint8_t>& r
{
result.resize(entry.original_size);
ZSTD_DCtx* dctx = GetThreadLocalZstdContext();
const auto decompressStart = std::chrono::steady_clock::now();
size_t written = ZSTD_decompressDCtx(dctx, result.data(), result.size(), compressed.data(), compressed.size());
RecordPackProfileStage(
"m2p",
"zstd_decompress",
compressed.size(),
ZSTD_isError(written) ? 0 : written,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - decompressStart).count()));
if (pPool)
{
pPool->Release(std::move(compressed));
}
if (ZSTD_isError(written) || written != entry.original_size)
{
TraceError("CM2Pack::GetFileWithPool: zstd failed for '%s'", entry.path.c_str());
@@ -262,11 +307,21 @@ bool CM2Pack::GetFileWithPool(const TM2PackEntry& entry, std::vector<uint8_t>& r
}
default:
if (pPool)
{
pPool->Release(std::move(compressed));
}
TraceError("CM2Pack::GetFileWithPool: unsupported compression %u for '%s'", entry.compression, entry.path.c_str());
return false;
}
if (!ShouldVerifyM2PackPlaintextHash())
{
return true;
}
std::array<uint8_t, M2PACK_HASH_SIZE> plain_hash {};
const auto hashStart = std::chrono::steady_clock::now();
crypto_generichash(
plain_hash.data(),
plain_hash.size(),
@@ -274,6 +329,13 @@ bool CM2Pack::GetFileWithPool(const TM2PackEntry& entry, std::vector<uint8_t>& r
result.size(),
nullptr,
0);
RecordPackProfileStage(
"m2p",
"plaintext_hash",
result.size(),
plain_hash.size(),
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - hashStart).count()));
if (memcmp(plain_hash.data(), entry.plaintext_hash.data(), plain_hash.size()) != 0)
{

View File

@@ -72,6 +72,7 @@ public:
bool Load(const std::string& path);
const std::vector<TM2PackEntry>& GetIndex() const { return m_index; }
const std::string& GetSourcePath() const { return m_source_path; }
bool GetFile(const TM2PackEntry& entry, std::vector<uint8_t>& result);
bool GetFileWithPool(const TM2PackEntry& entry, std::vector<uint8_t>& result, CBufferPool* pPool);
@@ -82,6 +83,7 @@ private:
private:
TM2PackHeader m_header {};
std::string m_source_path;
std::vector<uint8_t> m_manifest_bytes;
std::vector<TM2PackEntry> m_index;
mio::mmap_source m_file;

View File

@@ -1,6 +1,8 @@
#include "M2PackRuntimeKeyProvider.h"
#include "PackProfile.h"
#include <algorithm>
#include <cctype>
#include <cstring>
#include <string>
@@ -18,6 +20,7 @@ constexpr const char* M2PACK_ENV_MASTER_KEY = "M2PACK_MASTER_KEY_HEX";
constexpr const char* M2PACK_ENV_PUBLIC_KEY = "M2PACK_SIGN_PUBKEY_HEX";
constexpr const char* M2PACK_ENV_MAP_NAME = "M2PACK_KEY_MAP";
constexpr const char* M2PACK_ENV_KEY_ID = "M2PACK_KEY_ID";
constexpr const char* M2PACK_ENV_STRICT_HASH = "M2PACK_STRICT_HASH";
#pragma pack(push, 1)
struct M2PackSharedKeys
@@ -38,6 +41,7 @@ struct RuntimeKeyState
std::array<uint8_t, M2PACK_PUBLIC_KEY_SIZE> public_key {};
bool runtime_master_key = false;
bool runtime_public_key = false;
bool verify_plaintext_hash = false;
bool initialized = false;
};
@@ -100,6 +104,31 @@ uint32_t ParseUInt32(const std::string& value)
}
}
bool ParseBoolFlag(std::string value, bool& out)
{
value.erase(std::remove_if(value.begin(), value.end(), [](unsigned char ch) {
return std::isspace(ch) != 0;
}), value.end());
std::transform(value.begin(), value.end(), value.begin(), [](unsigned char ch) {
return static_cast<char>(std::tolower(ch));
});
if (value == "1" || value == "true" || value == "yes" || value == "on")
{
out = true;
return true;
}
if (value == "0" || value == "false" || value == "no" || value == "off")
{
out = false;
return true;
}
return false;
}
const std::array<uint8_t, M2PACK_PUBLIC_KEY_SIZE>* FindCompiledPublicKey(uint32_t keyId)
{
for (std::size_t i = 0; i < M2PACK_SIGN_KEY_IDS.size(); ++i)
@@ -191,16 +220,34 @@ void ApplyCommandLineOption(const std::string& key, const std::string& value)
LoadFromSharedMapping(value);
return;
}
if (key == "--m2pack-strict-hash")
{
bool enabled = false;
if (ParseBoolFlag(value, enabled))
{
g_state.verify_plaintext_hash = enabled;
}
else
{
TraceError("Invalid value for --m2pack-strict-hash");
}
}
}
} // namespace
bool InitializeM2PackRuntimeKeyProvider(const char* commandLine)
{
InitializePackProfile(commandLine);
if (g_state.initialized)
return true;
g_state = RuntimeKeyState {};
#ifdef _DEBUG
g_state.verify_plaintext_hash = true;
#endif
const std::string mapName = GetEnvString(M2PACK_ENV_MAP_NAME);
if (!mapName.empty())
@@ -240,6 +287,16 @@ bool InitializeM2PackRuntimeKeyProvider(const char* commandLine)
TraceError("Invalid M2PACK_SIGN_PUBKEY_HEX value");
}
const std::string envStrictHash = GetEnvString(M2PACK_ENV_STRICT_HASH);
if (!envStrictHash.empty())
{
bool enabled = false;
if (ParseBoolFlag(envStrictHash, enabled))
g_state.verify_plaintext_hash = enabled;
else
TraceError("Invalid M2PACK_STRICT_HASH value");
}
int argc = 0;
PCHAR* argv = CommandLineToArgv(const_cast<PCHAR>(commandLine ? commandLine : ""), &argc);
if (argv)
@@ -247,6 +304,31 @@ bool InitializeM2PackRuntimeKeyProvider(const char* commandLine)
for (int i = 0; i < argc; ++i)
{
const std::string key = argv[i];
if (key == "--m2pack-strict-hash")
{
bool enabled = true;
if (i + 1 < argc)
{
const std::string next = argv[i + 1];
if (!next.starts_with("--"))
{
if (!ParseBoolFlag(next, enabled))
{
TraceError("Invalid value for --m2pack-strict-hash");
}
else
{
g_state.verify_plaintext_hash = enabled;
}
++i;
continue;
}
}
g_state.verify_plaintext_hash = enabled;
continue;
}
if ((key == "--m2pack-key-hex" || key == "--m2pack-pubkey-hex" || key == "--m2pack-key-map" || key == "--m2pack-key-id") && i + 1 < argc)
{
ApplyCommandLineOption(key, argv[i + 1]);
@@ -268,6 +350,11 @@ bool InitializeM2PackRuntimeKeyProvider(const char* commandLine)
Tracef("M2Pack runtime key provider: no runtime master key available; .m2p loading will be denied\n");
}
if (g_state.verify_plaintext_hash)
{
Tracef("M2Pack runtime key provider: plaintext hash verification enabled\n");
}
g_state.initialized = true;
return true;
}
@@ -311,3 +398,8 @@ bool IsM2PackUsingRuntimePublicKey()
{
return g_state.runtime_public_key;
}
bool ShouldVerifyM2PackPlaintextHash()
{
return g_state.verify_plaintext_hash;
}
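The accepted spellings for `M2PACK_STRICT_HASH` and `--m2pack-strict-hash` come from `ParseBoolFlag` above. A Python restatement of its normalisation and matching, returning `None` where the C++ version reports a parse failure:

```python
def parse_bool_flag(value: str):
    # Mirrors the C++ ParseBoolFlag above: drop all whitespace, lowercase,
    # then match the accepted truthy/falsy spellings. Returns None on
    # failure (the C++ version returns false and leaves `out` untouched).
    normalized = "".join(ch for ch in value if not ch.isspace()).lower()
    if normalized in {"1", "true", "yes", "on"}:
        return True
    if normalized in {"0", "false", "no", "off"}:
        return False
    return None


print(parse_bool_flag(" On "))   # True
print(parse_bool_flag("off"))    # False
print(parse_bool_flag("maybe"))  # None
```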

View File

@@ -12,3 +12,4 @@ bool HasM2PackRuntimeMasterKey();
bool HasM2PackRuntimeKeysForArchiveLoad(uint32_t keyId);
bool IsM2PackUsingRuntimeMasterKey();
bool IsM2PackUsingRuntimePublicKey();
bool ShouldVerifyM2PackPlaintextHash();

View File

@@ -1,5 +1,8 @@
#include "Pack.h"
#include "PackProfile.h"
#include "EterLib/BufferPool.h"
#include <chrono>
#include <zstd.h>
static thread_local ZSTD_DCtx* g_zstdDCtx = nullptr;
@@ -20,6 +23,7 @@ void CPack::DecryptData(uint8_t* data, size_t len, const uint8_t* nonce)
bool CPack::Load(const std::string& path)
{
m_source_path = path;
std::error_code ec;
m_file.map(path, ec);
@@ -68,7 +72,15 @@ bool CPack::GetFileWithPool(const TPackFileEntry& entry, TPackFile& result, CBuf
switch (entry.encryption)
{
case 0: {
const auto decompressStart = std::chrono::steady_clock::now();
size_t decompressed_size = ZSTD_decompressDCtx(dctx, result.data(), result.size(), m_file.data() + offset, entry.compressed_size);
RecordPackProfileStage(
"pck",
"zstd_decompress",
entry.compressed_size,
ZSTD_isError(decompressed_size) ? 0 : decompressed_size,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - decompressStart).count()));
if (decompressed_size != entry.file_size) {
return false;
}
@@ -83,9 +95,25 @@ bool CPack::GetFileWithPool(const TPackFileEntry& entry, TPackFile& result, CBuf
memcpy(compressed_data.data(), m_file.data() + offset, entry.compressed_size);
const auto decryptStart = std::chrono::steady_clock::now();
DecryptData(compressed_data.data(), entry.compressed_size, entry.nonce);
RecordPackProfileStage(
"pck",
"xchacha20_decrypt",
entry.compressed_size,
entry.compressed_size,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - decryptStart).count()));
const auto decompressStart = std::chrono::steady_clock::now();
size_t decompressed_size = ZSTD_decompressDCtx(dctx, result.data(), result.size(), compressed_data.data(), compressed_data.size());
RecordPackProfileStage(
"pck",
"zstd_decompress",
compressed_data.size(),
ZSTD_isError(decompressed_size) ? 0 : decompressed_size,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - decompressStart).count()));
if (pPool) {
pPool->Release(std::move(compressed_data));


@@ -14,6 +14,7 @@ public:
bool Load(const std::string& path);
const std::vector<TPackFileEntry>& GetIndex() const { return m_index; }
const std::string& GetSourcePath() const { return m_source_path; }
bool GetFile(const TPackFileEntry& entry, TPackFile& result);
bool GetFileWithPool(const TPackFileEntry& entry, TPackFile& result, CBufferPool* pPool);
@@ -22,6 +23,7 @@ private:
void DecryptData(uint8_t* data, size_t len, const uint8_t* nonce);
TPackFileHeader m_header;
std::string m_source_path;
std::vector<TPackFileEntry> m_index;
mio::mmap_source m_file;
};


@@ -1,5 +1,8 @@
#include "PackManager.h"
#include "PackProfile.h"
#include "EterLib/BufferPool.h"
#include <chrono>
#include <filesystem>
#include <fstream>
#include <memory>
@@ -27,8 +30,16 @@ bool CPackManager::AddPack(const std::string& path)
if (packPath.extension() == ".m2p")
{
std::shared_ptr<CM2Pack> pack = std::make_shared<CM2Pack>();
const auto loadStart = std::chrono::steady_clock::now();
if (!pack->Load(path))
{
RecordPackProfileMount(
"m2p",
path,
false,
0,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return false;
}
@@ -38,14 +49,29 @@ bool CPackManager::AddPack(const std::string& path)
{
m_m2_entries[entry.path] = std::make_pair(pack, entry);
}
RecordPackProfileMount(
"m2p",
path,
true,
index.size(),
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return true;
}
std::shared_ptr<CPack> pack = std::make_shared<CPack>();
const auto loadStart = std::chrono::steady_clock::now();
if (!pack->Load(path))
{
RecordPackProfileMount(
"pck",
path,
false,
0,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return false;
}
@@ -55,6 +81,13 @@ bool CPackManager::AddPack(const std::string& path)
{
m_entries[entry.file_name] = std::make_pair(pack, entry);
}
RecordPackProfileMount(
"pck",
path,
true,
index.size(),
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return true;
}
@@ -73,16 +106,37 @@ bool CPackManager::GetFileWithPool(std::string_view path, TPackFile& result, CBu
if (m_load_from_pack) {
auto m2it = m_m2_entries.find(buf);
if (m2it != m_m2_entries.end()) {
return m2it->second.first->GetFileWithPool(m2it->second.second, result, pPool);
const auto loadStart = std::chrono::steady_clock::now();
const bool ok = m2it->second.first->GetFileWithPool(m2it->second.second, result, pPool);
RecordPackProfileLoad(
"m2p",
m2it->second.first->GetSourcePath(),
buf,
ok,
ok ? result.size() : 0,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return ok;
}
auto it = m_entries.find(buf);
if (it != m_entries.end()) {
return it->second.first->GetFileWithPool(it->second.second, result, pPool);
const auto loadStart = std::chrono::steady_clock::now();
const bool ok = it->second.first->GetFileWithPool(it->second.second, result, pPool);
RecordPackProfileLoad(
"pck",
it->second.first->GetSourcePath(),
buf,
ok,
ok ? result.size() : 0,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return ok;
}
}
// Fallback to disk (for files not in packs, like bgm folder)
const auto loadStart = std::chrono::steady_clock::now();
std::ifstream ifs(buf, std::ios::binary);
if (ifs.is_open()) {
ifs.seekg(0, std::ios::end);
@@ -97,10 +151,27 @@ bool CPackManager::GetFileWithPool(std::string_view path, TPackFile& result, CBu
}
if (ifs.read((char*)result.data(), size)) {
RecordPackProfileLoad(
"disk",
"<disk>",
buf,
true,
result.size(),
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return true;
}
}
RecordPackProfileLoad(
"disk",
"<disk>",
buf,
false,
0,
static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::steady_clock::now() - loadStart).count()));
return false;
}

src/PackLib/PackProfile.cpp Normal file

@@ -0,0 +1,554 @@
#include "PackProfile.h"
#include <algorithm>
#include <array>
#include <cctype>
#include <chrono>
#include <condition_variable>
#include <filesystem>
#include <fstream>
#include <map>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>
#include <windows.h>
#include "EterBase/Utils.h"
namespace
{
using Clock = std::chrono::steady_clock;
constexpr const char* kPackProfileEnv = "M2PACK_PROFILE";
constexpr const char* kPackProfileOutput = "log/pack_profile.txt";
constexpr std::size_t kTopListLimit = 12;
constexpr auto kSnapshotFlushInterval = std::chrono::seconds(1);
struct LoadStats
{
std::uint64_t count = 0;
std::uint64_t failures = 0;
std::uint64_t output_bytes = 0;
std::uint64_t elapsed_us = 0;
};
struct MountStats
{
std::uint64_t count = 0;
std::uint64_t failures = 0;
std::uint64_t entry_count = 0;
std::uint64_t elapsed_us = 0;
};
struct StageStats
{
std::uint64_t count = 0;
std::uint64_t input_bytes = 0;
std::uint64_t output_bytes = 0;
std::uint64_t elapsed_us = 0;
};
struct PhaseMarker
{
std::string phase;
std::uint64_t elapsed_us = 0;
};
struct PackProfileState
{
bool initialized = false;
bool enabled = false;
std::string current_phase = "startup";
Clock::time_point start_time {};
std::vector<PhaseMarker> markers;
std::map<std::string, LoadStats> load_by_format;
std::map<std::string, LoadStats> load_by_phase_format;
std::map<std::string, LoadStats> load_by_phase_pack;
std::map<std::string, LoadStats> load_by_extension;
std::map<std::string, MountStats> mount_by_format;
std::map<std::string, MountStats> mount_by_pack;
std::map<std::string, StageStats> stage_by_key;
bool snapshot_dirty = false;
bool stop_snapshot_thread = false;
std::condition_variable snapshot_cv;
std::thread snapshot_thread;
std::mutex mutex;
~PackProfileState()
{
StopSnapshotThread();
if (enabled)
{
std::lock_guard<std::mutex> lock(mutex);
WriteReportLocked("shutdown");
}
}
void StartSnapshotThread()
{
snapshot_thread = std::thread([this]() {
SnapshotLoop();
});
}
void StopSnapshotThread()
{
{
std::lock_guard<std::mutex> lock(mutex);
stop_snapshot_thread = true;
}
snapshot_cv.notify_all();
if (snapshot_thread.joinable())
{
snapshot_thread.join();
}
}
void SnapshotLoop()
{
std::unique_lock<std::mutex> lock(mutex);
for (;;)
{
const bool shouldStop = snapshot_cv.wait_for(
lock,
kSnapshotFlushInterval,
[this]() { return stop_snapshot_thread; });
if (shouldStop)
{
return;
}
if (!enabled || !snapshot_dirty)
{
continue;
}
WriteReportLocked("periodic");
snapshot_dirty = false;
}
}
void WriteReportLocked(std::string_view reason) const
{
std::error_code ec;
std::filesystem::create_directories("log", ec);
std::ofstream output(kPackProfileOutput, std::ios::binary | std::ios::trunc);
if (!output)
{
return;
}
output << "PACK PROFILE REPORT\n";
output << "reason=" << reason << "\n";
output << "current_phase=" << current_phase << "\n";
const auto uptime_us = start_time == Clock::time_point {}
? 0
: static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
Clock::now() - start_time).count());
output << "uptime_ms=" << FormatMs(uptime_us) << "\n\n";
output << "[phase_markers]\n";
for (const auto& marker : markers)
{
output << marker.phase << "\t" << FormatMs(marker.elapsed_us) << " ms\n";
}
output << "\n";
WriteMountSection(output, "[mounts_by_format]", mount_by_format);
WriteMountSection(output, "[mounts_by_pack]", mount_by_pack, kTopListLimit);
WriteLoadSection(output, "[loads_by_format]", load_by_format);
WriteLoadSection(output, "[loads_by_phase_format]", load_by_phase_format);
WriteLoadSection(output, "[top_phase_pack_loads]", load_by_phase_pack, kTopListLimit);
WriteLoadSection(output, "[top_extension_loads]", load_by_extension, kTopListLimit);
WriteStageSection(output, "[loader_stages]", stage_by_key);
}
static std::string FormatMs(std::uint64_t elapsed_us)
{
std::ostringstream out;
out.setf(std::ios::fixed, std::ios::floatfield);
out.precision(3);
out << (static_cast<double>(elapsed_us) / 1000.0);
return out.str();
}
static std::string ToLowerAscii(std::string value)
{
std::transform(value.begin(), value.end(), value.begin(), [](unsigned char ch) {
return static_cast<char>(std::tolower(ch));
});
return value;
}
static std::string ParsePackLabel(std::string_view pack_path)
{
const std::filesystem::path path { std::string(pack_path) };
const auto filename = path.filename().string();
if (!filename.empty())
{
return ToLowerAscii(filename);
}
return ToLowerAscii(std::string(pack_path));
}
static std::string ParseExtension(std::string_view request_path)
{
const std::filesystem::path path { std::string(request_path) };
auto extension = path.extension().string();
if (extension.empty())
{
return "<none>";
}
return ToLowerAscii(std::move(extension));
}
static std::string MakePhaseKey(std::string_view phase, std::string_view suffix)
{
std::string key;
key.reserve(phase.size() + suffix.size() + 3);
key.append(phase);
key.append(" | ");
key.append(suffix);
return key;
}
template <typename MapType>
static std::vector<std::pair<std::string, typename MapType::mapped_type>> SortByTimeDesc(const MapType& values)
{
using StatsType = typename MapType::mapped_type;
std::vector<std::pair<std::string, StatsType>> sorted;
sorted.reserve(values.size());
for (const auto& [key, stats] : values)
{
sorted.emplace_back(key, stats);
}
std::sort(sorted.begin(), sorted.end(), [](const auto& left, const auto& right) {
if (left.second.elapsed_us != right.second.elapsed_us)
{
return left.second.elapsed_us > right.second.elapsed_us;
}
return left.first < right.first;
});
return sorted;
}
static void WriteLoadSection(std::ofstream& output, std::string_view header, const std::map<std::string, LoadStats>& values, std::size_t limit = 0)
{
output << header << "\n";
const auto sorted = SortByTimeDesc(values);
const std::size_t count = limit == 0 ? sorted.size() : std::min(limit, sorted.size());
for (std::size_t i = 0; i < count; ++i)
{
const auto& [key, stats] = sorted[i];
output
<< key
<< "\tcount=" << stats.count
<< "\tfail=" << stats.failures
<< "\tbytes=" << stats.output_bytes
<< "\ttime_ms=" << FormatMs(stats.elapsed_us)
<< "\n";
}
output << "\n";
}
static void WriteMountSection(std::ofstream& output, std::string_view header, const std::map<std::string, MountStats>& values, std::size_t limit = 0)
{
output << header << "\n";
std::vector<std::pair<std::string, MountStats>> sorted(values.begin(), values.end());
std::sort(sorted.begin(), sorted.end(), [](const auto& left, const auto& right) {
if (left.second.elapsed_us != right.second.elapsed_us)
{
return left.second.elapsed_us > right.second.elapsed_us;
}
return left.first < right.first;
});
const std::size_t count = limit == 0 ? sorted.size() : std::min(limit, sorted.size());
for (std::size_t i = 0; i < count; ++i)
{
const auto& [key, stats] = sorted[i];
output
<< key
<< "\tcount=" << stats.count
<< "\tfail=" << stats.failures
<< "\tentries=" << stats.entry_count
<< "\ttime_ms=" << FormatMs(stats.elapsed_us)
<< "\n";
}
output << "\n";
}
static void WriteStageSection(std::ofstream& output, std::string_view header, const std::map<std::string, StageStats>& values)
{
output << header << "\n";
std::vector<std::pair<std::string, StageStats>> sorted(values.begin(), values.end());
std::sort(sorted.begin(), sorted.end(), [](const auto& left, const auto& right) {
if (left.second.elapsed_us != right.second.elapsed_us)
{
return left.second.elapsed_us > right.second.elapsed_us;
}
return left.first < right.first;
});
for (const auto& [key, stats] : sorted)
{
output
<< key
<< "\tcount=" << stats.count
<< "\tin=" << stats.input_bytes
<< "\tout=" << stats.output_bytes
<< "\ttime_ms=" << FormatMs(stats.elapsed_us)
<< "\n";
}
output << "\n";
}
};
PackProfileState g_pack_profile;
bool ParseBoolFlag(std::string value, bool& out)
{
value.erase(std::remove_if(value.begin(), value.end(), [](unsigned char ch) {
return std::isspace(ch) != 0;
}), value.end());
std::transform(value.begin(), value.end(), value.begin(), [](unsigned char ch) {
return static_cast<char>(std::tolower(ch));
});
if (value == "1" || value == "true" || value == "yes" || value == "on")
{
out = true;
return true;
}
if (value == "0" || value == "false" || value == "no" || value == "off")
{
out = false;
return true;
}
return false;
}
std::string GetEnvString(const char* name)
{
char buffer[256];
const DWORD len = GetEnvironmentVariableA(name, buffer, sizeof(buffer));
if (len == 0 || len >= sizeof(buffer))
{
return {};
}
return std::string(buffer, buffer + len);
}
std::uint64_t CurrentElapsedUs()
{
if (!g_pack_profile.enabled)
{
return 0;
}
return static_cast<std::uint64_t>(std::chrono::duration_cast<std::chrono::microseconds>(
Clock::now() - g_pack_profile.start_time).count());
}
} // namespace
void InitializePackProfile(const char* commandLine)
{
if (g_pack_profile.initialized)
{
return;
}
g_pack_profile.initialized = true;
g_pack_profile.enabled = false;
const std::string envProfile = GetEnvString(kPackProfileEnv);
if (!envProfile.empty())
{
bool enabled = false;
if (ParseBoolFlag(envProfile, enabled))
{
g_pack_profile.enabled = enabled;
}
}
int argc = 0;
PCHAR* argv = CommandLineToArgv(const_cast<PCHAR>(commandLine ? commandLine : ""), &argc);
if (argv)
{
for (int i = 0; i < argc; ++i)
{
const std::string arg = argv[i];
if (arg != "--m2pack-profile")
{
continue;
}
bool enabled = true;
if (i + 1 < argc)
{
const std::string next = argv[i + 1];
if (!next.starts_with("--") && ParseBoolFlag(next, enabled))
{
++i;
}
}
g_pack_profile.enabled = enabled;
}
SAFE_FREE_GLOBAL(argv);
}
if (!g_pack_profile.enabled)
{
return;
}
std::lock_guard<std::mutex> lock(g_pack_profile.mutex);
g_pack_profile.current_phase = "startup";
g_pack_profile.start_time = Clock::now();
g_pack_profile.markers.push_back(PhaseMarker {
.phase = g_pack_profile.current_phase,
.elapsed_us = 0,
});
g_pack_profile.WriteReportLocked("initialized");
g_pack_profile.snapshot_dirty = false;
g_pack_profile.stop_snapshot_thread = false;
g_pack_profile.StartSnapshotThread();
}
bool IsPackProfileEnabled()
{
return g_pack_profile.enabled;
}
void MarkPackProfilePhase(std::string_view phase)
{
if (!g_pack_profile.enabled)
{
return;
}
std::lock_guard<std::mutex> lock(g_pack_profile.mutex);
if (g_pack_profile.current_phase == phase)
{
return;
}
g_pack_profile.current_phase = std::string(phase);
g_pack_profile.markers.push_back(PhaseMarker {
.phase = g_pack_profile.current_phase,
.elapsed_us = CurrentElapsedUs(),
});
g_pack_profile.WriteReportLocked(PackProfileState::MakePhaseKey("phase", phase));
g_pack_profile.snapshot_dirty = false;
}
void RecordPackProfileMount(
std::string_view format,
std::string_view packPath,
bool ok,
std::size_t entryCount,
std::uint64_t elapsedUs)
{
if (!g_pack_profile.enabled)
{
return;
}
std::lock_guard<std::mutex> lock(g_pack_profile.mutex);
auto& formatStats = g_pack_profile.mount_by_format[std::string(format)];
formatStats.count += 1;
formatStats.failures += ok ? 0 : 1;
formatStats.entry_count += entryCount;
formatStats.elapsed_us += elapsedUs;
auto& packStats = g_pack_profile.mount_by_pack[PackProfileState::ParsePackLabel(packPath)];
packStats.count += 1;
packStats.failures += ok ? 0 : 1;
packStats.entry_count += entryCount;
packStats.elapsed_us += elapsedUs;
g_pack_profile.snapshot_dirty = true;
}
void RecordPackProfileLoad(
std::string_view format,
std::string_view packPath,
std::string_view requestPath,
bool ok,
std::uint64_t outputBytes,
std::uint64_t elapsedUs)
{
if (!g_pack_profile.enabled)
{
return;
}
std::lock_guard<std::mutex> lock(g_pack_profile.mutex);
auto update = [ok, outputBytes, elapsedUs](LoadStats& stats) {
stats.count += 1;
stats.failures += ok ? 0 : 1;
stats.output_bytes += outputBytes;
stats.elapsed_us += elapsedUs;
};
update(g_pack_profile.load_by_format[std::string(format)]);
update(g_pack_profile.load_by_phase_format[
PackProfileState::MakePhaseKey(g_pack_profile.current_phase, format)]);
update(g_pack_profile.load_by_phase_pack[
PackProfileState::MakePhaseKey(g_pack_profile.current_phase, PackProfileState::ParsePackLabel(packPath))]);
update(g_pack_profile.load_by_extension[PackProfileState::ParseExtension(requestPath)]);
g_pack_profile.snapshot_dirty = true;
}
void RecordPackProfileStage(
std::string_view format,
std::string_view stage,
std::uint64_t inputBytes,
std::uint64_t outputBytes,
std::uint64_t elapsedUs)
{
if (!g_pack_profile.enabled)
{
return;
}
std::lock_guard<std::mutex> lock(g_pack_profile.mutex);
auto& stats = g_pack_profile.stage_by_key[
PackProfileState::MakePhaseKey(format, stage)];
stats.count += 1;
stats.input_bytes += inputBytes;
stats.output_bytes += outputBytes;
stats.elapsed_us += elapsedUs;
g_pack_profile.snapshot_dirty = true;
}
void FlushPackProfileSnapshot(std::string_view reason)
{
if (!g_pack_profile.enabled)
{
return;
}
std::lock_guard<std::mutex> lock(g_pack_profile.mutex);
g_pack_profile.WriteReportLocked(reason);
g_pack_profile.snapshot_dirty = false;
}

src/PackLib/PackProfile.h Normal file

@@ -0,0 +1,29 @@
#pragma once
#include <cstddef>
#include <cstdint>
#include <string_view>
void InitializePackProfile(const char* commandLine);
bool IsPackProfileEnabled();
void MarkPackProfilePhase(std::string_view phase);
void RecordPackProfileMount(
std::string_view format,
std::string_view packPath,
bool ok,
std::size_t entryCount,
std::uint64_t elapsedUs);
void RecordPackProfileLoad(
std::string_view format,
std::string_view packPath,
std::string_view requestPath,
bool ok,
std::uint64_t outputBytes,
std::uint64_t elapsedUs);
void RecordPackProfileStage(
std::string_view format,
std::string_view stage,
std::uint64_t inputBytes,
std::uint64_t outputBytes,
std::uint64_t elapsedUs);
void FlushPackProfileSnapshot(std::string_view reason);


@@ -20,6 +20,7 @@
#include "PythonApplication.h"
#include "GameLib/ItemManager.h"
#include "PackLib/PackProfile.h"
#include "AbstractApplication.h"
#include "AbstractCharacterManager.h"
@@ -414,6 +415,7 @@ void CPythonNetworkStream::SetGamePhase()
if ("Game"!=m_strPhase)
m_phaseLeaveFunc.Run();
MarkPackProfilePhase("game");
m_strPhase = "Game";
m_dwChangingPhaseTime = ELTimer_GetMSec();


@@ -5,6 +5,7 @@
#include "NetworkActorManager.h"
#include "AbstractPlayer.h"
#include "PackLib/PackManager.h"
#include "PackLib/PackProfile.h"
void CPythonNetworkStream::EnableChatInsultFilter(bool isEnable)
{
@@ -90,6 +91,7 @@ void CPythonNetworkStream::SetLoadingPhase()
Tracen("");
Tracen("## Network - Loading Phase ##");
Tracen("");
MarkPackProfilePhase("loading");
m_strPhase = "Loading";


@@ -2,6 +2,7 @@
#include "PythonNetworkStream.h"
#include "Packet.h"
#include "AccountConnector.h"
#include "PackLib/PackProfile.h"
// Login ---------------------------------------------------------------------------
void CPythonNetworkStream::LoginPhase()
@@ -18,6 +19,7 @@ void CPythonNetworkStream::SetLoginPhase()
Tracen("");
Tracen("## Network - Login Phase ##");
Tracen("");
MarkPackProfilePhase("login");
m_strPhase = "Login";