Profiling Foundation

This document describes the standalone profiling foundation for Anemone.

Purpose

The profiling package can:

  • run a named scenario

  • measure total wall-clock time

  • capture basic execution metadata

  • persist a stable JSON artifact for each run

The profiling foundation is intentionally separate from the core search engine.

Constraints

This first version does not modify core search semantics and does not add timing or profiler hooks inside the search loop.

In particular, PR1 does not touch:

  • TreeExploration.step()

  • AlgorithmNodeTreeManager

What Exists Today

  • anemone.profiling package under src/anemone/profiling/

  • stable RunResult JSON schema with schema_version = "1"

  • run directory helpers

  • a standalone scenario runner

  • a minimal CLI

  • a tiny real smoke scenario using public Anemone APIs

  • lazy scenario loading so package import stays lightweight

  • optional external profiler modes: none, cprofile, pyinstrument

  • wrapper-based component timing for evaluator and dynamics

  • component_summary.json artifact output when enabled

  • cprofile.pstats and cprofile_top.txt artifacts for cprofile

  • pyinstrument.txt artifact when pyinstrument is requested and installed

  • optional GUI integrations for SnakeViz and gprof2dot call-graph generation

  • deterministic synthetic profiling scenarios:

    • cheap_eval

    • expensive_eval

    • wide_tree

    • deep_tree

    • reuse_heavy

  • repeatable profiling suites:

    • baseline

    • quick

  • suite.json artifact output for repeated suite runs

What Is Still Out Of Scope

  • process-level profilers such as py-spy

  • viztracer integration

  • CI performance regression gates

  • flamegraph and speedscope-style timeline views

Running It

Preferred CLI:

python -m anemone.profiling.cli list-scenarios
python -m anemone.profiling.cli list-suites
python -m anemone.profiling.cli run --scenario smoke --output-dir profiling_runs
python -m anemone.profiling.cli run --scenario smoke --output-dir profiling_runs --profiler cprofile --component-summary
python -m anemone.profiling.cli run-suite --suite baseline --output-dir profiling_runs --repetitions 5 --component-summary

GUI launcher:

pip install -e .[gui]
python -m anemone.profiling.gui

Optional profiler-visualization extras:

pip install -e .[gui,profiling-viz]

Graph rendering for gprof2dot also requires the Graphviz dot executable to be installed on the system.

Convenience runner entrypoint:

python -m anemone.profiling.runner --scenario smoke --output-dir profiling_runs

Output Layout

Each run creates a timestamped folder under the selected base directory:

profiling_runs/
  2026-03-24T14-32-10_smoke/
    run.json
    component_summary.json
    cprofile.pstats
    cprofile_top.txt

If a run id collides, the storage helper appends a deterministic suffix such as _2, _3, and so on.

The run.json artifact includes scenario metadata, execution metadata, top-level wall time, run status, and placeholder artifact references for future tooling. The recorded wall time tracks scenario execution itself and intentionally excludes profiler artifact writing or profiler post-processing.

Suite runs create a separate suite-level directory with nested scenario runs:

profiling_runs/
  2026-03-24T15-10-00_baseline_suite/
    suite.json
    scenario_runs/
      2026-03-24T15-10-00_cheap_eval_rep1/
        run.json
        component_summary.json
      2026-03-24T15-10-01_cheap_eval_rep2/
        run.json
      2026-03-24T15-10-05_wide_tree_rep1/
        run.json

suite.json records:

  • suite metadata

  • requested repetition count

  • profiler and component-summary settings

  • per-scenario aggregate wall-time statistics across successful repetitions

  • per-repetition run.json paths and statuses

This keeps the suite artifact comparison-ready without requiring later tooling to re-scan directories.

GUI

PR4 adds a local Streamlit dashboard under anemone.profiling.gui.

The dashboard can:

  • launch scenarios and suites

  • browse existing runs and suites

  • display component timing breakdowns

  • show readable profiler text artifacts

  • expose SnakeViz launch commands for cprofile.pstats artifacts

  • generate and display gprof2dot call graphs on demand

  • compare two runs or two suites

The GUI reads the existing profiling artifacts only. It does not add new profiling hooks or modify core search behavior.

When component summaries are enabled, the summary approximates framework overhead as:

residual_framework_wall_time_seconds
    = total_run_wall_time_seconds - wrapped_component_wall_times

This residual is useful, but intentionally approximate.

Interactive cProfile visualization

For runs that include cprofile.pstats, the GUI exposes two optional tools:

  • SnakeViz for interactive browser-based inspection of the call stack

  • gprof2dot for caller/callee call-graph generation

Suggested setup:

pip install -e .[gui,profiling-viz]
python -m anemone.profiling.gui

When a run includes cprofile.pstats, the run page can:

  • show the exact snakeviz /absolute/path/to/cprofile.pstats command

  • generate DOT, SVG, or PNG call-graph artifacts under profiler_visualizations/

  • render generated SVG and PNG call graphs directly in the dashboard

Use the flat top-functions table when you want to answer:

  • which function is hot?

Use SnakeViz when you want to answer:

  • why is this function being reached?

  • which call path dominates cumulative time?

Use gprof2dot when you want to answer:

  • who calls the bottleneck?

  • which caller/callee relationships are shaping the hotspot?

gprof2dot thresholds control graph pruning, which makes it easier to simplify dense profiles before rendering a call graph image.

Scenario Roles

  • smoke validates profiling plumbing and public API integration.

  • cheap_eval keeps evaluator work near zero so framework overhead is easier to see.

  • expensive_eval makes evaluator CPU cost dominant while keeping the tree shape similar.

  • wide_tree stresses broader opening pressure and legal-action generation.

  • deep_tree stresses deeper narrow traversal.

  • reuse_heavy stresses repeated shared-state patterns.