Why do impact investments underdeliver on their impact thesis?

Impact due diligence (DD) evaluates the wrong layer. It measures outputs (carbon reduced, people served, SDGs aligned) without assessing whether the organization has the structural capacity to sustain and scale those outputs. A company can score perfectly on IRIS+ today and be structurally incapable of maintaining those scores at twice the size. The mechanism that produces impact is the organization, and the organization is what breaks.

What should impact due diligence evaluate beyond outputs?

The organizational layer: can this company's decision architecture, leadership structure, and operational design actually scale the impact the investment thesis requires? Can technical authority be distributed as the team grows? Can the operating model serve multiple customer segments simultaneously? Impact frameworks measure what the company produces. Organizational assessment measures whether the machine can keep producing it.

Impact Due Diligence Evaluates the Wrong Layer

Impact due diligence has a measurement problem, and it isn’t the one the industry talks about. The conversation for the past decade has been about standardizing impact metrics — IRIS+, IMP, SDG alignment, Theory of Change frameworks. The frameworks have gotten more sophisticated. The metrics have gotten more specific. And impact investments continue to underdeliver on their impact theses for reasons the frameworks can’t explain. The problem isn’t measurement precision. The problem is that impact DD evaluates the wrong layer. It measures the outputs a company produces — carbon reduced, people served, hectares restored — without assessing whether the organization producing those outputs has the structural capacity to sustain and scale them. A company can score perfectly on IRIS+ today and be structurally incapable of maintaining those scores at twice the size.

What impact frameworks measure

IRIS+ and its peers measure impact outputs: how much carbon was avoided, how many people gained access to clean water, which SDGs are addressed. More sophisticated applications extend to outcomes: did the carbon avoidance produce a measurable change in the temperature pathway? Did access to clean water reduce disease incidence? These are legitimate and important measurements. They tell you whether the company is producing impact now. They tell you nothing about whether the company will produce impact at the scale your investment thesis requires, because they don’t examine the mechanism that produces the impact. The mechanism is the organization. And the organization is what breaks.

What they don’t measure

Impact frameworks don’t assess organizational capacity to deliver. They don’t examine decision architecture — whether the company can make decisions fast enough to capture the market windows that create impact. They don’t evaluate founder-org fit — whether the organizational design allows leadership to operate effectively at the current stage. They don’t measure scaling readiness — whether the operating model that produces impact at current scale can produce impact at the scale the investment requires. They don’t assess the gap between stated strategy and actual execution — whether the work being done on Monday morning resembles the impact thesis being presented on Tuesday afternoon. These aren’t soft assessments. They’re structural evaluations with diagnostic frameworks and observable signals. The fact that impact DD doesn’t include them isn’t because they’re hard to measure. It’s because the impact measurement industry evolved from development economics and philanthropy, where the unit of analysis was the program, not the organization delivering it. When impact measurement migrated to investing, it brought the program-level lens with it — and left the organizational assessment layer behind.

A company that scores perfectly and can’t scale

Consider a climate data company that provides early warning information to vulnerable communities. Its IRIS+ metrics are impeccable: lives protected, economic damage avoided, vulnerable populations served. Its SDG alignment is strong. Its Theory of Change is rigorous and evidence-based. It is also a 25-person organization where every customer relationship runs through the founder, every product decision requires the CTO’s personal involvement, and the commercial team has no authority to close deals above a modest threshold. The impact is real at current scale. The organizational structure makes it mechanically impossible to double that impact without a structural redesign that nobody has planned for and that the investment timeline doesn’t account for. The IRIS+ scorecard won’t tell you this. A structural diagnostic will, and it will tell you specifically what needs to change and whether the organization is capable of making those changes.

What impact DD should actually evaluate

Impact DD needs a second layer: organizational capacity assessment. This layer is not bolted on alongside impact measurement; it is integrated with it. For every impact metric, the structural question is: can this organization sustain and scale this output? The assessment examines the machine that produces the impact, not just the impact itself.

Decision throughput: can the organization make enough good decisions, fast enough, to maintain impact quality while growing?

Talent architecture: does the organization attract, retain, and deploy the people needed to deliver impact at scale, or is it dependent on a small number of individuals whose departure would collapse delivery?

Operational infrastructure: does the organization have the systems, processes, and coordination mechanisms to manage the complexity that scale creates?

Founder-organization dynamics: is the founder’s involvement a force multiplier or a bottleneck, and what happens to impact delivery when the founder can no longer touch every decision?

These are the variables that determine whether your impact investment produces the returns, financial and social, that your thesis describes.

What I see

My background is in atmospheric physics. I’ve spent years working with climate data, models, and the organizations that turn them into products. I’ve watched companies with genuinely world-class science fail to scale their impact, not because the science was wrong but because the organization delivering it couldn’t grow without breaking. The IRIS+ scorecard looked perfect at 25 people. At 60, with the same science and the same impact thesis, the company was structurally incapable of maintaining the quality that produced those scores. The organizational capacity assessment I bring to impact DD is informed by having been inside the machine: knowing exactly where the translation from science to impact breaks down and what the structural prerequisites for scaling that translation actually are.

The gap between impact thesis and organizational reality

Every impact investment has a thesis: this capital will produce this impact at this scale over this timeline. The thesis is evaluated against market data, technology readiness, and impact metrics. It is almost never evaluated against organizational reality. The question nobody asks is whether the organizational structure can support the thesis. A company projecting 10x impact growth over five years is projecting an organizational transformation, from a small team doing excellent work to a larger, more complex organization doing excellent work at scale. That transformation is the hardest thing in business, and it fails more often than it succeeds. Impact DD that doesn’t assess organizational capacity to execute that transformation isn’t merely incomplete; it’s evaluating the thesis against the wrong variables. The technology can work. The market can exist. The impact can be measurable. And the organization can fail to deliver anyway, because the structural capacity to scale was never assessed and never built.

The impact thesis lives or dies in the structure. The IRIS+ scorecard won’t tell you that. Reach out.