SharedLLM research is organised as a sequence of falsifiable bets. Each bet declares its hypothesis, its strict / lenient / catastrophic criteria, and the experimental procedure before the experiment runs. The result either supports or falsifies the claim, and the framing in our public writeups must match what the data shows — not what we hoped it would show.
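To make that concrete, a bet declaration can be pictured as a small record that is fixed before any code runs. The sketch below is illustrative only: the class name, field names, and example values are assumptions, not the harness's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BetSpec:
    """Illustrative bet declaration (hypothetical names, not the real schema)."""
    hypothesis: str    # the falsifiable claim, stated up front
    strict: str        # criterion for a clean pass
    lenient: str       # weaker criterion that still counts as support
    catastrophic: str  # outcome that would badly falsify the claim
    procedure: str     # experimental procedure, fixed before the run

# Purely illustrative values:
spec = BetSpec(
    hypothesis="Specialists transfer to held-out texts",
    strict="held-out loss within 5% of the baseline",
    lenient="held-out loss within 15% of the baseline",
    catastrophic="held-out loss worse than a random-init model",
    procedure="evaluate each specialist on the shared held-out set",
)
```

Freezing the record is the point: once declared, the criteria cannot drift to fit the result.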
A standard ML paper claims a finding and accumulates evidence in its favour. A bet declares the conditions under which the claim would be wrong, and reports the outcome regardless. Three things follow from that:

- The pass/fail criteria are fixed before the experiment runs, so they cannot be adjusted to fit the result.
- A falsified bet is reported with the same prominence as a supported one.
- Public writeups must frame each claim the way the data came out, not the way we hoped it would.
Every bet file in `experiments/bets/` follows the same structure:

- Imports from `experiments/bets/_common.py` for registry setup, specialist loading, and result writing. Consistency over cleverness.
- A `main()` that runs the experiment, computes the pass/fail flags, and writes a JSON payload to `results/NN_*.json`.

`00_rollup.py` regenerates `SUMMARY.md` and `SUMMARY.json` from the result files. The summary is the entry point for anyone reviewing the harness.
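A skeleton of what a bet module might look like under that structure. This is a hedged sketch: `run_experiment` is a stand-in, and the flag names and thresholds are placeholder assumptions. Only the shape, a `main()` that computes pass/fail flags and writes JSON into `results/`, follows the description above.

```python
"""Illustrative bet-module skeleton; names and thresholds are assumptions."""
import json
from pathlib import Path

RESULTS_DIR = Path("experiments/bets/results")


def run_experiment() -> float:
    """Stand-in for the real experiment; returns a made-up score."""
    return 0.75


def main():
    metric = run_experiment()

    # Pass/fail flags against criteria declared before the run
    # (these thresholds are placeholders, not real criteria).
    payload = {
        "metric": metric,
        "strict_pass": metric >= 0.90,
        "lenient_pass": metric >= 0.70,
        "catastrophic": metric < 0.30,
    }

    # Write the JSON payload the roll-up will consume.
    RESULTS_DIR.mkdir(parents=True, exist_ok=True)
    (RESULTS_DIR / "NN_example.json").write_text(json.dumps(payload, indent=2))


if __name__ == "__main__":
    main()
```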
The bets harness operates at the 30M-parameter scale on short held-out texts. We do not claim:

- that K=100 DiLoCo or any other federated training recipe replaces synchronous SGD at frontier scale.

These are open questions, not solved problems. They are listed in the Open Questions chapter.
Every bet is a runnable Python module. From the project root:
```bash
PYTHONPATH=src python -m experiments.bets.NN_<name>
```
The result lands in `experiments/bets/results/NN_*.json`. The roll-up then regenerates the summary table:
```bash
PYTHONPATH=src python -m experiments.bets.00_rollup
```
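In spirit, the roll-up folds the result files into a single table. A minimal sketch of that loop, reusing the hypothetical flag names from the skeleton above; the real `00_rollup.py` additionally renders `SUMMARY.md` and `SUMMARY.json`.

```python
import json
from pathlib import Path

# Read every result payload and print one verdict per bet.
for path in sorted(Path("experiments/bets/results").glob("*.json")):
    payload = json.loads(path.read_text())
    verdict = ("strict" if payload.get("strict_pass")
               else "lenient" if payload.get("lenient_pass")
               else "falsified")
    print(f"{path.stem}: {verdict}")
```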