Policy foresight in renewables

The corpus

Every horizon-scanning system stands or falls on its corpus. Ours holds forty-one million documents and is refreshed daily. The contents: every consultation paper from BEIS through DESNZ, every statutory instrument laid before Parliament, every NESO publication and grid-code working group transcript, every parliamentary committee submission, every judicial-review filing in the climate and energy space, every regulator letter exchanged between Ofgem and licensees, and the equivalents in Germany, Ireland, France, Spain, Italy, the Netherlands, Belgium, the United States by FERC region, Canada, Australia and Japan. Forty-seven jurisdictions in all.

Crucially, the corpus is versioned. When a consultation paper is reissued with amendments, we keep both. When a draft instrument is published with a redline against the previous draft, we ingest both versions and the redline. This is the single largest source of insight in the system: most policy changes are visible weeks or months before they become formal, but only if you are reading the working documents alongside the published ones.
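As a rough illustration of what keeping both versions and the redline can look like, the sketch below stores two drafts and derives a line-level redline with Python's standard difflib module. The DraftVersion type, its field names and the sample text are hypothetical, chosen only for this example; they are not the ingestion schema.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class DraftVersion:
    """One stored version of a document (hypothetical schema)."""
    doc_id: str
    version: int
    text: str


def redline(old: DraftVersion, new: DraftVersion) -> list[str]:
    """Return the removed ("- ") and added ("+ ") lines between two versions."""
    diff = difflib.ndiff(old.text.splitlines(), new.text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Illustrative drafts only, not real consultation text.
v1 = DraftVersion(
    "queue-reform", 1,
    "Projects are admitted in order of application.\nFees are unchanged.",
)
v2 = DraftVersion(
    "queue-reform", 2,
    "Projects are admitted on milestone evidence.\nFees are unchanged.",
)

for change in redline(v1, v2):
    print(change)
```

Both versions stay in storage; the redline is derived and kept alongside them, so a later reader can see what changed without losing either draft.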

From corpus to forecast

The corpus is necessary but not sufficient. A long-context language model can read a consultation paper and produce a clear summary; that is solved territory. What is hard is the chaining: linking a December 2025 consultation paper to the May 2024 working-group meeting that produced it, to the January 2024 ministerial speech that signalled the policy direction, to the August 2023 European directive that constrained the options.

We do this through an event graph. Every document in the corpus is a node. Edges connect documents that reference each other directly (a consultation paper that responds to an earlier one), documents that share authors or named officials, documents that are produced by the same committee under the same mandate, and documents that the model identifies as substantively responsive even when no direct citation exists. The graph is sparse enough to be tractable and dense enough that no important chain is missed.
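A minimal sketch of such a graph, assuming edges are stored pointing from a later document back to an earlier one and that each edge carries one of the four link types described above. The class names, edge labels, document identifiers and first-of-month dates are all hypothetical; the real edge weights belong to the published methodology and are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

# Labels for the four link types described above (names are illustrative).
EDGE_KINDS = {"direct_reference", "shared_official", "same_committee", "model_inferred"}


@dataclass(frozen=True)
class Doc:
    doc_id: str
    published: date


class EventGraph:
    def __init__(self) -> None:
        self.docs: dict[str, Doc] = {}
        # later document id -> list of (earlier document id, edge kind)
        self.back_edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_doc(self, doc: Doc) -> None:
        self.docs[doc.doc_id] = doc

    def link(self, later_id: str, earlier_id: str, kind: str) -> None:
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if self.docs[earlier_id].published >= self.docs[later_id].published:
            raise ValueError("edge must point from a later document to an earlier one")
        self.back_edges[later_id].append((earlier_id, kind))

    def chain_back(self, doc_id: str) -> list[str]:
        """Ids of every document reachable backwards from doc_id, oldest first."""
        seen: set[str] = set()
        stack = [doc_id]
        while stack:
            current = stack.pop()
            for earlier_id, _kind in self.back_edges.get(current, []):
                if earlier_id not in seen:
                    seen.add(earlier_id)
                    stack.append(earlier_id)
        return sorted(seen, key=lambda d: self.docs[d].published)


# The four-step chain from the example above, with placeholder ids and dates.
graph = EventGraph()
for doc_id, day in [
    ("directive-2023-08", date(2023, 8, 1)),
    ("speech-2024-01", date(2024, 1, 1)),
    ("working-group-2024-05", date(2024, 5, 1)),
    ("consultation-2025-12", date(2025, 12, 1)),
]:
    graph.add_doc(Doc(doc_id, day))

graph.link("consultation-2025-12", "working-group-2024-05", "same_committee")
graph.link("working-group-2024-05", "speech-2024-01", "model_inferred")
graph.link("speech-2024-01", "directive-2023-08", "direct_reference")

print(graph.chain_back("consultation-2025-12"))
```

Requiring every edge to run strictly backwards in time keeps the structure acyclic, which makes chain extraction a plain traversal. That is a simplification made for the sketch, not a claim about the production graph.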

The forecasting layer queries the graph to produce, for each upcoming instrument category in each jurisdiction, a probability distribution over (a) whether the instrument will be published within a defined window, and (b) the most likely shape of the instrument given the chain of preceding documents. The second of those is what makes the output useful: a 70% probability that the UK will publish a connection-queue reform in 2026 is mildly informative; a 70% probability that the reform will gate cohorts on milestone evidence rather than discretionary lottery is the kind of statement that changes capital allocation.
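One way to picture the two-part output is as a record holding the publication probability (a) alongside a distribution over instrument shapes (b). The sketch below assumes the shape distribution is conditional on publication, which the text does not specify; the class, field names and probabilities are invented for illustration and are not system outputs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrumentForecast:
    """Hypothetical container for one forecast; not the production schema."""
    jurisdiction: str
    category: str
    window: str
    p_publish: float                       # (a) P(published within the window)
    shape_given_publish: dict[str, float]  # (b) P(shape | published), assumed conditional

    def __post_init__(self) -> None:
        if not 0.0 <= self.p_publish <= 1.0:
            raise ValueError("p_publish must lie in [0, 1]")
        total = sum(self.shape_given_publish.values())
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError("shape probabilities must sum to 1")

    def joint(self, shape: str) -> float:
        """P(published within the window AND taking this shape)."""
        return self.p_publish * self.shape_given_publish[shape]


# Made-up numbers, for illustration only.
forecast = InstrumentForecast(
    jurisdiction="UK",
    category="connection-queue reform",
    window="2026",
    p_publish=0.8,
    shape_given_publish={"milestone_gated": 0.7, "discretionary_lottery": 0.3},
)
print(round(forecast.joint("milestone_gated"), 2))
```

Keeping (a) and (b) separate lets a reader see whether a low joint probability comes from doubt about timing or doubt about design.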

A consultation paper in 2024 is rarely a surprise. It is the third predicted move on a chain we can usually trace back four years. The job of the model is to extract that chain, score the next move, and tell the reader how much confidence to place in the score.

Calibration discipline

This is the part that decides whether a horizon-scanning system is useful or theatre. We score every closed forecast (that is, every forecast for which the predicted instrument has now either been published or definitively missed its window) and publish the calibration plots quarterly. The current 24-month Brier score on grid-code revisions across the UK, Germany, Ireland and Texas is 0.087, against a base rate of approximately 0.34. The calibration plots show very mild over-confidence in the 60-80% bucket; elsewhere the system is essentially well calibrated.
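For readers unfamiliar with the metric, the Brier score is the mean squared difference between the forecast probability and the 0/1 outcome, so lower is better. The sketch below shows the standard calculation and one way to read it against a base rate. It assumes, which the text does not state, that the 0.34 figure is the observed frequency of the event; under that reading a forecaster who always said 34% would score about 0.22. The function names are illustrative and this is not the published scoring protocol.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def base_rate_brier(base_rate: float) -> float:
    """Brier score of always forecasting the base rate, when events occur at that rate."""
    return base_rate * (1.0 - base_rate)


def brier_skill(score: float, reference: float) -> float:
    """Brier skill score: 1 is perfect, 0 is no better than the reference forecast."""
    return 1.0 - score / reference


# Assumption: 0.34 is an event frequency, so the constant forecast scores about 0.224.
reference = base_rate_brier(0.34)
print(round(reference, 4))
print(round(brier_skill(0.087, reference), 2))
```

On that assumption the quoted 0.087 would correspond to a skill score of roughly 0.61 relative to the constant base-rate forecast.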

This number moves. In Q4 2024 our Brier score on UK planning rule changes deteriorated noticeably after a sequence of decisions that broke the precedent chain our model had identified. We published the diagnostic, re-fit the planning sub-model with a wider prior on political-discontinuity events, and re-scored the forward forecasts. The 2025 calibration is back inside its target band. We publish that recovery curve too. None of this is optional. A horizon-scanning system whose calibration is not publicly scored, criticised and corrected is, in the long run, an instrument of confusion rather than clarity.

Where it matters most

Five categories of policy change cause eighty per cent of investment write-downs in renewables. Connection queue reform. Contracts-for-difference design. Planning rule changes. REPower-style fast-track regimes. Consenting timelines. Horizon Policy is built around those five and a small number of adjacent categories: capacity-market design, ancillary-services procurement reform, BESS-specific regulation, hydrogen and CCS subsidy regimes, and grid-loss methodology changes.

The first four are forecastable from the documentary record to a useful resolution — useful here meaning calibrated intervals tight enough to drive go/no-go decisions on individual projects and across whole portfolios. The fifth (consenting timelines) is harder, because the chain involves caseload pressure, political calendars and individual planning-officer behaviour, none of which sit cleanly in the documentary record. We model it explicitly with wider intervals, and flag the wider intervals to the user.

What buyers actually do with it

The temptation, when describing a forecasting system, is to make the outputs sound oracular. Real users do something less dramatic and considerably more useful. They lock optionality clauses in EPC contracts whose value depends on a probabilistic policy outcome. They stage capex on policy probability triggers. They walk away from sites whose viability depends on a rule that the model assigns under 35% probability inside the relevant window. They re-discount portfolios whose anchor projects have queue dates implicitly conditioned on a rule that has not yet been written.

None of this requires the model to be right about specific instruments. It requires the model to be calibrated — that is, the 35% probabilities really do turn out to materialise about thirty-five per cent of the time, the 70% probabilities materialise about seventy per cent of the time, and the wider intervals on thinly evidenced classes are honest about the system's actual uncertainty. Calibrated probabilistic outputs change capital decisions; uncalibrated point forecasts do not.
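The calibration property described here can be checked with a simple reliability table: group closed forecasts into probability buckets and compare the mean forecast in each bucket with the share of forecasts that actually materialised. The sketch below is one minimal way to do that, using five equal-width buckets as an assumption; the function name and the sample data are hypothetical.

```python
def reliability_table(
    forecasts: list[float], outcomes: list[int], n_buckets: int = 5
) -> list[dict]:
    """Per-bucket mean forecast versus observed frequency for closed forecasts."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have equal length")
    buckets: list[list[tuple[float, int]]] = [[] for _ in range(n_buckets)]
    for p, o in zip(forecasts, outcomes):
        # A forecast of exactly 1.0 is placed in the top bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, o))

    rows = []
    for i, items in enumerate(buckets):
        if not items:
            continue  # skip buckets with no closed forecasts
        n = len(items)
        rows.append({
            "bucket": (i / n_buckets, (i + 1) / n_buckets),
            "n": n,
            "mean_forecast": sum(p for p, _ in items) / n,
            "observed": sum(o for _, o in items) / n,
        })
    return rows


# Synthetic example: twenty forecasts at 35%, of which seven materialised.
sample_forecasts = [0.35] * 20
sample_outcomes = [1] * 7 + [0] * 13
for row in reliability_table(sample_forecasts, sample_outcomes):
    print(row)
```

A well-calibrated bucket has mean_forecast close to observed; a bucket where observed sits persistently below mean_forecast is over-confident.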

What this isn't

Horizon Policy does not, and will not, do three things. It does not produce single-number forecasts: there is no output that says "the connection queue reform will happen in Q3 2026." It does not produce off-the-record outputs that lack provenance trails: every forecast is supported by the documents in the chain. And it does not produce confidence-only scores without calibration evidence: every score is shipped with its calibration class and recent calibration history. If a system gives you any of those three, it is selling certainty, and certainty about future policy is for sale only at very poor prices.

Methodology · v3.4

The full mathematical specification (corpus construction rules, event-graph edge weights, the language-model fine-tune used for chain extraction, and the calibration-scoring protocol) is published in the Library under Methodology v3.4. Outcome ledgers are available to clients on request and are inspected as part of standard onboarding.