This document is not a research template. It is a machine for producing accurate research. The distinction matters. A template produces outputs. A machine produces outputs, measures their accuracy, and improves itself over time. Every research project run through this framework serves two purposes simultaneously: it produces a thesis on the topic under study, and it produces data that improves M31's future research capability. The machine feeds itself. The standard against which all research is judged is simple: did it accurately predict what happened? Process rigor is not valuable in itself – it is valuable because rigorous processes produce accurate predictions. Any step that does not serve predictive accuracy should be removed.
The Architecture: Four Loops
The framework operates as four nested feedback loops, each running at a different timescale. Most research processes only run Loop 1. M31 runs all four.
- Loop 1 – The Research Loop — Produces the initial thesis. Timescale: weeks to months.
- Loop 2 – The Monitoring Loop — Updates thesis probability in real time. Timescale: ongoing.
- Loop 3 – The Error Loop — Identifies what went wrong and corrects it. Timescale: triggered by divergence.
- Loop 4 – The Machine Loop — Improves signals, frameworks, and process based on accumulated results. Timescale: annual.
Phase 0: Epistemic Hygiene
Before any research begins. The most common failure mode in institutional research is not insufficient data – it is insufficient self-awareness about the biases that will distort how data gets interpreted. Phase 0 exists to inoculate against this before it can infect the research.
0.1 – Question Decomposition
Before reading a single source, decompose the research question into its constituent parts and classify each by epistemological type.
- Type I – Empirically Resolvable: facts with definite answers in existing data, to be answered with primary sources and independent corroboration. No judgment required.
- Type II – Projectable with Models: questions about the direction and rate of observable change, requiring quantitative modeling, historical base rates, and explicit uncertainty ranges.
- Type III – Judgment-Dependent (requires M31 Framework application): questions about how actors will behave, how systems will respond, and what second-order effects will follow, requiring incentive analysis, historical pattern matching, and explicit scenario construction.
- Type IV – Fundamentally Unknowable: questions that cannot be answered with current information, which must be acknowledged explicitly and handled through probability-weighted scenarios. Converting Type IV questions into Type I or II is a research objective in itself.
The single most common error in research is treating Type III and IV questions as Type I – stating incentive-dependent outcomes as facts. This produces confident wrong answers.
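The four question types in 0.1 lend themselves to explicit tagging, so that every sub-question carries its type and the answering method that type permits. A minimal Python sketch; the names and method strings are illustrative assumptions, not framework specification:

```python
# Illustrative tagging scheme for the four question types in 0.1.
# Names and method strings are assumptions, not framework specification.
from enum import Enum


class QType(Enum):
    EMPIRICAL = 1    # Type I: definite answer exists in existing data
    PROJECTABLE = 2  # Type II: model direction/rate with uncertainty ranges
    JUDGMENT = 3     # Type III: requires M31 Framework application
    UNKNOWABLE = 4   # Type IV: cannot be answered with current information


# The type determines the only admissible answering method.
METHOD = {
    QType.EMPIRICAL: "primary sources + independent corroboration",
    QType.PROJECTABLE: "quantitative model + base rates + explicit ranges",
    QType.JUDGMENT: "incentive analysis + pattern matching + scenarios",
    QType.UNKNOWABLE: "explicit acknowledgment + probability-weighted scenarios",
}


def answering_method(q: QType) -> str:
    """Look up the only method allowed for a question of this type."""
    return METHOD[q]
```

Tagging each sub-question this way makes the common failure (answering a Type III or IV question with Type I confidence) mechanically visible in the research record.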
0.2 – Prior Declaration
Every team member who will contribute to the research must write down, before beginning: what they currently believe the answer to be; why they believe it and what evidence or reasoning underlies it; a probability estimate for each major possible outcome; and what evidence would cause them to change their mind. These declarations are sealed and dated. They are not shared among the team until Phase 8 (Integration). This prevents social anchoring – the tendency for team members to converge prematurely on the first articulated view. The prior declaration document becomes part of the permanent research record. It is reviewed at every major update cycle. The update log documenting how beliefs changed and why is itself analytical output.
0.3 – Bias Mapping
For every major source category and every major actor in the thesis landscape, explicitly answer: what does this actor want the truth to be, what are they incentivized to say publicly, what does their institutional position reward them for believing, and who has the opposite incentive and what do they say. Sources with strong directional incentives require independent corroboration from sources with opposing incentives before their claims are treated as evidence. A single source, however credible, is never sufficient for a load-bearing claim.
0.4 – Scope Definition
Define explicitly what the research will and will not cover. The boundary is as important as the content. State the precise question being answered, the time horizon of the thesis, the criteria for a successful prediction (what observable outcomes would confirm or disconfirm the thesis), and the review schedule (when probabilities will be formally updated).
Phase 1: Domain Knowledge Acquisition
Build genuine expertise before forming opinions. The most dangerous research posture is premature synthesis – forming conclusions before understanding the subject well enough to know what you don't know. Phase 1 is deliberately non-synthetic. The goal is expertise acquisition, not thesis formation.
1.1 – Domain Mapping
Every research topic spans multiple independent domains. Map them before beginning. For each domain, identify the canonical primary sources (original papers, not summaries), the 3-5 people most likely to be right about this domain, the most common misconceptions in public discourse, and the key open questions that genuine experts disagree on. Resist the temptation to skip to the synthesis. The domains that seem least relevant often contain the most important constraints on the thesis.
1.2 – Literature Protocol
Primary sources only as foundational evidence. Secondary sources (journalism, analysis, commentary) are useful for mapping the discourse and identifying which experts to interview, but are never treated as primary evidence for factual claims. For each major claim in the research, require at least two independent primary sources that are not citing each other, an explicit notation of the claim's confidence level (high/medium/low) based on source quality and replication, and a flag for any load-bearing claim that rests on a single source.
1.3 – Expert Interview Protocol
Conduct structured expert interviews. Minimum 15 interviews for a standard thesis; 25 or more for a major thesis. Source diversity is required: interview people who publicly disagree with each other. Force the thesis to survive adversarial examination before publication.
- Question 1 — What is your probability estimate for [key outcome]? Give a specific number.
- Question 2 — What would cause you to revise that estimate significantly in either direction?
- Question 3 — What do most people in your field get wrong about this?
- Question 4 — Who do you most disagree with on this topic, and what is the best version of their argument?
- Question 5 — If you had to bet your net worth on your stated position, would you?
Question 5 is not rhetorical. It separates stated beliefs from actual beliefs. An expert who hedges dramatically at question 5 is telling you their real confidence level. Record the hedge. After all interviews, identify the axes of genuine expert disagreement. These are not noise – they are signal about where genuine uncertainty lies. The thesis must either resolve these disagreements with additional evidence or acknowledge them as open uncertainties.
Phase 2: Live Player Mapping and Incentive Analysis
The future is determined by actors, not by abstract forces. Map every actor whose behavior could materially affect the thesis outcome.
2.1 – Live Player Classification
Classify each actor as a Live Player – one who exercises genuine strategic agency, reads the game board and adapts, can change the rules, and whose behavior is not fully predictable from their institutional role – or a Dead Player, who executes according to established rules, incentives, or ideology, whose behavior is largely predictable, and who does not adapt to new information in ways that change the game. The Live/Dead distinction is not permanent. Dead Players can become Live Players under sufficient pressure. Part of the analysis is identifying what pressure would activate this transition for each actor.
2.2 – Incentive Matrix
For each significant player, construct the three-column incentive analysis. The gap between columns 2 and 3 is where the predictive value lives. This gap is where conventional research fails – most analysis stops at column 2 and mistakes public positioning for revealed preference.
| Column 1 | Column 2 | Column 3 |
| --- | --- | --- |
| What they want the truth to be | What they are incentivized to say publicly | What they will actually do when the thesis moment arrives |
2.3 – Coordination Game Analysis
When multiple actors must coordinate for an outcome to occur (or must coordinate to prevent it), model the coordination dynamics explicitly: what is the coordination threshold in terms of how many actors, at what scale, and on what timing; what are the defection incentives and why would individual actors break ranks; what is the tipping point mechanism that triggers mass coordination; who are the pivotal players whose defection or cooperation changes the outcome; and what is the history of coordination in this actor network on prior issues. The most important insight from coordination analysis is usually not that coordination will succeed or fail, but identifying the specific chokepoint where it becomes determinative.
2.4 – War Game Execution
Run structured war games for each major scenario. Assign team members to play specific actors. Force them to reason from that actor's actual incentives, not from what the actor should do rationally. The war game output is not a prediction of behavior – it is a mapping of the full strategy space that each actor is likely to explore. Real actors often surprise; the goal is to have thought through the space before they do.
Phase 3: Scenario Construction
Build the scenario space before you have favorites.
3.1 – Master Variable Identification
Every thesis has 2-4 variables that dominate the outcome space. Identify them explicitly before building scenarios. These are the variables where uncertainty is high, the impact on outcomes is large, and they are in principle observable – you can know, eventually, which way they went. Resist the temptation to include many variables. More variables produce more scenarios but not more insight. The goal is parsimony: the minimum variable set that spans the meaningful outcome space.
3.2 – Scenario Construction Rules
Build scenarios by varying the master variables, not by imagining outcomes you find interesting. Every scenario must follow causally from specific variable configurations, include first, second, and third-order effects, be specific enough to be falsifiable, and have a named set of observable signals that would indicate it is becoming more or less likely. Avoid surprise scenarios and black swan scenarios in the primary scenario set. These belong in an explicit tail risk section. The primary scenarios should cover the majority of the probability space under a structured framework.
3.3 – The Red Team Requirement
For every scenario, assign a red team whose explicit mandate is to find the strongest possible case that the scenario is wrong. The red team is not playing devil's advocate – they are trying to genuinely defeat the scenario. Red team output goes into the permanent research record. If a scenario survives red team challenge, note what arguments the red team made and why they were ultimately unpersuasive. This is some of the most valuable content in the research.
3.4 – Probability Weighting Protocol
Assign initial probability weights to each scenario. Weights must sum to 100%. Every team member must state their individual weights before group discussion. The group must explicitly discuss and record the source of any disagreement exceeding 15 percentage points. Weights are labeled as Initial and dated – they are not presented as forecasts. The initial weights are not the point. The point is to force explicit commitment to a probability distribution that can be updated as evidence arrives. A researcher who refuses to assign probabilities is refusing to be held accountable.
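The mechanical checks in 3.4 – weights summing to 100% and the 15-percentage-point disagreement threshold – can be enforced in code. An illustrative Python sketch; function names and data shapes are assumptions, not part of the framework:

```python
# Illustrative sketch of the mechanical checks in 3.4. Function names and
# data shapes are assumptions, not framework specification.
from itertools import combinations


def validate_weights(weights: dict, tol: float = 1e-6) -> None:
    """Scenario weights must sum to 100%."""
    total = sum(weights.values())
    if abs(total - 100.0) > tol:
        raise ValueError(f"Scenario weights sum to {total}, not 100")


def disagreements(member_weights: dict, threshold: float = 15.0) -> list:
    """Return (member_a, member_b, scenario, gap) for every pairwise
    disagreement exceeding the threshold, in percentage points."""
    flagged = []
    for (a, wa), (b, wb) in combinations(member_weights.items(), 2):
        for scenario in wa:
            gap = abs(wa[scenario] - wb[scenario])
            if gap > threshold:
                flagged.append((a, b, scenario, gap))
    return flagged
```

For example, if one member weights a two-scenario set 60/40 and another weights it 35/65, both scenarios are flagged (a 25-point gap each) and the source of the disagreement must be discussed and recorded before group weights are set.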
Phase 4: The Signal Dashboard
Define in advance what evidence would change your mind. This is the most important anti-bias mechanism in the framework.
4.1 – Signal Categories
For each major scenario, define three categories of signals: Confirmation Signals (observable events or data that would increase the probability of this scenario), Disconfirmation Signals (observable events or data that would decrease it), and Canary Metrics (high-priority leading indicators whose movement triggers an immediate review cycle regardless of the regular schedule). Canary metrics are defined by two properties: they are observable before the thesis moment arrives, and their movement provides strong information about which scenario is unfolding. They are the early warning system.
4.2 – Signal Specification Requirements
Every signal in the dashboard must meet the following standards. Observable: there must be a defined source or method for observing this signal – market sentiment is not a signal, but Bitcoin's 30-day realized volatility as measured by Glassnode data is a signal. Interpretable: the direction of interpretation must be specified in advance, so that if X increases, this raises the probability of Scenario A by approximately Y percentage points. Independent: signals should not all be derived from the same underlying data source, because correlated signals produce false confidence. Timely: the signal must be observable on a timescale relevant to the thesis.
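One way to enforce the 4.2 standards is to make them mandatory fields of the dashboard entry itself, so an under-specified signal cannot be recorded at all. A hedged Python sketch with illustrative field names:

```python
# Minimal sketch of the 4.2 signal specification. Field names are
# illustrative assumptions; the point is that a signal cannot enter the
# dashboard without a source, a scenario, and a pre-committed direction.
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str               # e.g. "BTC 30-day realized volatility"
    source: str             # defined observation method, e.g. "Glassnode"
    scenario: str           # which scenario this signal bears on
    direction: str          # "up_confirms" or "up_disconfirms"
    approx_delta_pp: float  # pre-committed probability move, in pp
    canary: bool = False    # canary metrics trigger immediate review

    def __post_init__(self):
        if self.direction not in ("up_confirms", "up_disconfirms"):
            raise ValueError("interpretation direction must be pre-specified")
```

Independence and timeliness are harder to encode as single fields; in practice they are reviewed when the dashboard is assembled, by checking that signals do not share an underlying data source and are observable on the thesis timescale.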
4.3 – Dashboard Maintenance Protocol
- Weekly — Scan canary metrics. If any trigger, initiate emergency review.
- Monthly — Review all signals. Note any movements. No formal probability update required unless movements are significant.
- Quarterly — Full dashboard review. Formal probability update with documented rationale for each change.
- Annually — Comprehensive review. Full re-run of expert interview protocol with updated questions. Major structural update to scenario weights if warranted.
Every update is logged with date, the signal observed, the probability change made, and the reasoning. This log is the intellectual record of the research thesis across time.
Phase 5: Impact Quantification
Be specific: ranges with explicit confidence levels. Vague language is prohibited.
5.1 – Quantification Standards
Every impact claim in the research must include a point estimate or range, an explicit confidence level (high/medium/low, with definition), the key assumptions that drive the estimate, and a sensitivity analysis showing how much the estimate changes if the key assumptions are wrong. “Significant impact” is not an impact estimate. “A 40-60% probability of X outcome, with the range driven primarily by uncertainty about Y variable” is an impact estimate.
5.2 – First, Second, and Third Order Effects
For each major scenario, map effects at three levels. First Order covers the direct, immediate consequences of the scenario – these are usually visible and relatively easy to analyze. Second Order covers the responses to the first order effects by the major actors: what do the Live Players do next, how do institutions adapt, and where does capital flow. Third Order covers the responses to the responses – the equilibrium that emerges after the initial disruption has been absorbed. This is where the most important and least-discussed consequences typically live, and where M31's civilizational-scale framing adds the most analytical value. Most research stops at first order.
5.3 – The Adjacency Map
Every major thesis has implications that extend beyond the primary domain. Map the adjacent domains that would be affected and the approximate magnitude and direction of that effect. These adjacencies often reveal asymmetric investment opportunities beyond the obvious primary thesis plays.
Phase 6: M31 Framework Application
Apply proprietary lenses after the evidence has been gathered, not before. Applying the framework before evidence gathering creates confirmation bias – the research becomes a hunt for evidence supporting a pre-formed framework score.
6.1 – Five Signal Assessment
Apply the Five Signal Framework to score the thesis after evidence gathering is complete.
- Signal 1 – Scientific Unlock (25%): Is there a genuine new capability enabling this? Has something become possible that was not possible before, or is this a repackaging of existing capabilities? Score 1-10. Key question: what is the specific unlock, and is it reproducible?
- Signal 2 – Convergence Index (5%): Why is this possible now, and what independent vectors are converging to make this the right moment? Single-vector theses score low; multi-vector convergence across technology, regulation, economics, and culture simultaneously scores high. Score 1-10. Key question: how many independent vectors, and how strong is each?
- Signal 3 – Pattern Alignment (10%): Does this fit the patterns that endure? Apply the Grammar of History to assess whether this aligns with natural human drives, sustainable incentive structures, and the historical arc of how paradigm shifts unfold. Score 1-10. Key question: which historical precedents map most closely, and what did those precedents produce?
- Signal 4 – Suppression Index (40%): Are powerful interests fighting this, and what is the nature of the opposition? Distinguish genuine suppression (powerful actors threatened by the paradigm) from inverted suppression (establishment actors championing it – a warning signal). Score 1-10. Key question: who specifically is fighting this, why, and are they winning?
- Signal 5 – Antifragility Index (20%): Can they kill it even if they want to? How resilient is this paradigm to adversarial pressure – would suppression accelerate it (true antifragility), or could sustained institutional opposition stop it? Score 1-10. Key question: what is the minimum viable condition under which this paradigm survives?
- Formula — Score = (S1 x 0.25) + (S2 x 0.05) + (S3 x 0.10) + (S4 x 0.40) + (S5 x 0.20)
- 8.0 – 10.0 — High-conviction paradigm shift.
- 6.0 – 7.9 — Probable paradigm shift, monitor closely.
- 4.0 – 5.9 — Ambiguous – may be paradigm shift or advanced fad.
- 2.0 – 3.9 — Likely fad or failed paradigm.
- 0.0 – 1.9 — Not a paradigm shift. Override conditions apply – see the Five Signals framework document.
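The formula and verdict bands above can be expressed directly. A minimal Python sketch using the weights and thresholds stated in 6.1; the function names are illustrative, and override conditions (see the Five Signals framework document) are not modeled:

```python
# The Five Signal composite score and verdict bands from 6.1, expressed
# directly. Weights and thresholds are taken from the text; names are
# illustrative and override conditions are not modeled.
WEIGHTS = {"S1": 0.25, "S2": 0.05, "S3": 0.10, "S4": 0.40, "S5": 0.20}

BANDS = [  # (lower bound of band, verdict)
    (8.0, "High-conviction paradigm shift"),
    (6.0, "Probable paradigm shift, monitor closely"),
    (4.0, "Ambiguous"),
    (2.0, "Likely fad or failed paradigm"),
    (0.0, "Not a paradigm shift"),
]


def composite_score(scores: dict) -> float:
    """Weighted sum of the five signal scores, each on a 1-10 scale."""
    for name, s in scores.items():
        if not 1 <= s <= 10:
            raise ValueError(f"{name} must be scored 1-10, got {s}")
    return sum(WEIGHTS[name] * s for name, s in scores.items())


def verdict(score: float) -> str:
    """Map a composite score to its verdict band."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    return BANDS[-1][1]
```

For instance, scores of S1=7, S2=5, S3=6, S4=9, S5=8 compose to 7.8, which falls in the 6.0-7.9 band.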
6.2 – Grammar of History Application
For each thesis, identify the closest historical precedents and extract the specific lessons that apply. This is not loose analogy – it is systematic pattern matching. For each historical parallel, establish what the structural similarities to the current situation were, what happened and on what timeline, what the key variables were that determined the outcome, how well those key variables map to the current situation, and where the analogy breaks down (which is as important as where it holds). The Grammar of History is not used to predict outcomes by analogy. It is used to identify the mechanisms that tend to determine outcomes in structurally similar situations, and to check those mechanisms against current evidence.
6.3 – Phase Assessment
Apply the paradigm shift phase model to determine where the thesis sits in its adoption arc. Phase assessment determines the urgency of the thesis and the likely trajectory of the suppression signal going forward.
- Anomaly Accumulation – Incumbent paradigm straining under evidence it cannot explain.
- Outsider Insight – Thesis articulated by those outside the mainstream.
- Unlock/Proof – Capability demonstrated at small scale.
- Ridicule – Mainstream dismissal is loudest. M31 optimal entry point.
- Grudging Acknowledgment – Evidence too strong to ignore.
- Rapid Adoption – Consensus formation, premium capture begins.
- Orthodoxy – New paradigm becomes the establishment.
Phase 7: Red Team and Integration
7.1 – Final Red Team Challenge
Before integration, the red team produces a formal challenge document that argues, as forcefully as possible, that the primary thesis is wrong. The challenge must present the strongest possible case against the thesis using the same evidence the research team used, identify the 3-5 assumptions that if wrong would most significantly undermine the thesis, propose alternative interpretations of the key evidence, and estimate the probability that the primary thesis is wrong with reasoning. The integration lead must respond to every point in the red team challenge. Points that cannot be adequately refuted are incorporated as explicit uncertainties in the final thesis.
7.2 – Integration and Verdict
The integration lead synthesizes all research phases into the final verdict. The verdict must include: a clear, specific, one-sentence primary thesis statement; the probability-weighted scenario distribution, showing which scenario is most likely and why; the key assumptions – what must be true for the primary thesis to hold; the key risks – which single facts, if discovered, would most undermine the thesis; the investment implication – a specific, actionable conclusion, not merely “watch closely”; and the monitoring protocol – what signals to track and on what schedule. The verdict is signed and dated. It enters the permanent research record.
Phase 8: The Monitoring System
The thesis is not finished on publication – publication is the beginning of the testing phase.
8.1 – The Update Log
The thesis must be updated as evidence arrives, and every update must be logged with its reasoning. The update log serves two purposes: it maintains the intellectual integrity of the thesis (preventing the post-hoc rationalization of “we always believed that”), and it produces data for the Machine Loop.
8.2 – Trigger-Based Review
Beyond the scheduled review cycles, certain events trigger an immediate, unscheduled review. A Canary Alert occurs when any pre-defined canary metric fires, requiring the team to convene within 48 hours. A Surprise Event occurs when something happens that was not in any scenario but is clearly relevant to the thesis, requiring rapid assessment of how it affects the scenario probability distribution. An Expert Consensus Shift occurs when multiple independent expert sources dramatically revise their stated views within a short window – this often signals that classified or non-public information is beginning to leak into the discourse. A Scenario Confirmation Event occurs when a clear event strongly confirms one scenario and disconfirms others, requiring a formal probability update and documentation.
8.3 – Probability Update Protocol
Every formal probability update must document the signal or event that triggered the update, the old probability distribution in exact numbers, the new probability distribution in exact numbers, the reasoning for the change, and what signal would cause the next significant update. This documentation discipline prevents the cognitive distortion of “we always believed this.” The intellectual record shows exactly what was believed, when, and what caused it to change.
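The mandatory fields of a formal update can be enforced structurally, so that an entry missing any protocol field cannot be logged. An illustrative Python sketch; field names are assumptions, and the sum-to-100% check mirrors the weighting rule in 3.4:

```python
# Illustrative sketch of the update-log entry required by 8.3. Field names
# are assumptions; every protocol field is mandatory by construction.
from dataclasses import dataclass
from datetime import date


@dataclass
class ProbabilityUpdate:
    on: date
    trigger: str       # signal or event that fired
    old_dist: dict     # scenario -> probability, in %
    new_dist: dict     # scenario -> probability, in %
    reasoning: str     # why the distribution moved
    next_trigger: str  # what would cause the next significant update

    def __post_init__(self):
        for label, dist in (("old", self.old_dist), ("new", self.new_dist)):
            if abs(sum(dist.values()) - 100.0) > 1e-6:
                raise ValueError(f"{label} distribution must sum to 100%")

    def deltas(self) -> dict:
        """Per-scenario probability change, in percentage points."""
        return {s: self.new_dist[s] - self.old_dist[s] for s in self.new_dist}
```

Appending these entries to a dated log yields exactly the intellectual record the protocol describes: what was believed, when, and what caused it to change.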
Phase 9: Error Analysis and Machine Loop
The most valuable research M31 produces is the research that turns out to be wrong. The error analysis is not a post-mortem. It is a structured extraction of information that improves future research. It applies to theses where the predicted scenario clearly occurred, theses where a different scenario clearly occurred, theses where the timeline was significantly off even if the directional thesis was correct, and theses where the impact was significantly different from the estimate.
9.1 – The Error Analysis Protocol
- Category 1 – Signal Errors: Were there signals in the data we had access to but did not weight correctly? Why did we underweight them? Was this a data problem (we did not have the signal in our dashboard) or a weighting problem (we had it but dismissed it)?
- Category 2 – Model Errors: Were there structural flaws in the scenario framework? Did we miss a scenario that occurred? Did we misconstruct an incentive analysis?
- Category 3 – Bias Errors: Can we identify a specific cognitive bias that distorted the research, such as confirmation bias, anchoring, or narrative bias (choosing a good story over an accurate one)?
- Category 4 – Unknowable Factors: Were there genuinely unknowable factors that determined the outcome? If so, is there any leading indicator for this class of factor we should add to future dashboards?
9.2 – The Machine Loop
The Machine Loop is the process by which individual research errors improve M31's systematic research capability. It runs annually, informed by all error analyses from the preceding period, and performs six updates:
- Five Signal weights — Are the current weights producing accurate composite scores? Re-examine them if multiple high-scoring theses have failed or multiple low-scoring theses have succeeded.
- Signal dashboard library — Build a growing collection of signals proven predictive across multiple theses in each domain; these become standard inclusions in future dashboards for related topics.
- Grammar of History pattern library — Record which historical patterns proved most predictive for each class of thesis.
- Interview protocol — Identify which expert interview questions produced the most predictive information and which were noise.
- Bias catalog — Document the specific biases that have distorted M31 research in the past, including the conditions under which each bias tends to activate.
- Phase model calibration — Measure how accurately phase assessments predicted the timeline trajectory of paradigm shifts.
9.3 – Machine Improvement Documentation
Every Machine Loop produces a formal improvement document that specifies which elements of the framework were updated, the evidence that drove the update, the expected impact on future research accuracy, and the control for how we will know if the update improved things. The machine is not improved by intuition. It is improved by evidence about what the machine gets wrong.
Team Structure
Every major research project requires three distinct team functions. These should not be collapsed into the same people.
- The Research Team gathers evidence, builds the factual foundation, and conducts interviews. They are responsible for domain expertise and source quality.
- The Red Team is explicitly tasked with defeating the thesis. They must be given full access to research team materials and must be rewarded for finding genuine weaknesses, not for being constructively critical in ways that are ultimately dismissed.
- The Integration Lead synthesizes research team output, responds to red team challenges, and produces the final verdict. They must be the senior analytical voice on the project and cannot be captured by either the research team's confirmation bias or the red team's adversarial framing.
For smaller research projects, one person may fill multiple roles – but not during the same phase. The research and red team functions must be temporally separated.
The Honesty Protocols
These are not suggestions. They are requirements.
- No Anonymous Verdicts: Every research conclusion is signed. Every probability update is attributed. Every error analysis names the error and the researcher responsible for the flawed judgment. Accountability is the single most powerful bias-reduction mechanism available.
- No Verdict Laundering: It is not permitted to revise the original thesis document after publication to make it appear more prescient. The original document is permanently archived. Updates are addenda, not replacements.
- No Complexity Hiding: The most common form of intellectual dishonesty in research is the deliberate use of complexity to obscure weak reasoning. If a claim cannot be stated simply and specifically, it probably does not deserve to be in the thesis.
- The Bet Test: Before publishing any thesis, the integration lead must be willing to answer: if you had to bet your net worth on this verdict, would you? If the answer is a substantial hedge, the thesis requires more work. The research is only as honest as the researcher’s actual confidence in it.
The Standard Output Structure
Every research project produces the following documents in sequence:
- Prior Declaration Document: Team’s prior beliefs, sealed before research begins. (Phase 0)
- Domain Knowledge Summary: What we learned, key sources, key expert insights. (Phase 1)
- Live Player Map: All significant actors with full incentive analysis. (Phase 2)
- Scenario Set with Initial Probabilities: 3-5 scenarios, probability-weighted. (Phase 3)
- Signal Dashboard: All signals, canary metrics, monitoring protocol. (Phase 4)
- Impact Analysis: Quantified effects, first/second/third order. (Phase 5)
- M31 Framework Scores: Five Signal assessment, Grammar of History application. (Phase 6)
- Red Team Challenge and Response: Full adversarial challenge with integration lead response. (Phase 7)
- Final Thesis and Verdict: The publishable brief. (Phase 7)
- Update Log: Timestamped record of all updates and rationale. (Phase 8, ongoing)
- Error Analysis: What the research got right and wrong, and why. (Phase 9, post-resolution)
- Machine Loop Contribution: Specific improvements to M31 framework derived from this research. (Phase 9)
A Note on What This Framework Cannot Do
This framework cannot eliminate uncertainty, and it cannot guarantee that any individual thesis is correct. The goal is not to be right. The goal is to have a process that is more likely to produce right answers than alternative processes, and that honestly acknowledges when it has failed. The compound effect of that process, over years and across many theses, is the M31 analytical edge.
M31 Capital – Standard Research Process Framework v1.0