
How I Build Assumptions When Data Is Incomplete

FINANCIAL MODELLING

1/2/2026 · 3 min read

In theory, investment analysis is driven by clean datasets, consensus forecasts, and well-documented assumptions.

In practice, most projects begin with incomplete, inconsistent, or outdated data.

This is particularly visible in:

  • Healthcare diagnostics

  • Emerging technologies

  • Private markets

  • Niche or under-researched segments

Public health data often lags real-world behaviour. Private company disclosures are selective by design.

In these situations, the quality of analysis depends less on sourcing “the right number” and more on how assumptions are constructed, tested, and communicated.

MY CORE PHILOSOPHY ON ASSUMPTIONS

I work with three principles.

I. EVERY ASSUMPTION MUST HAVE A CLEAR LOGIC, NOT JUST A CITATION

Citations are inputs. Assumptions require judgment. I always make the logic explicit so clients can see why a number belongs in the model and how sensitive the outcome is if that number is wrong. A report might state “average breast cancer screening adherence is ~60%”.

This number alone is not an assumption. An assumption begins only when that number is contextualized.

For example:

Why does 60% apply to urban women?
Urban populations typically benefit from higher access to screening facilities, better physician density, and shorter travel times. Applying a national average to an urban-only cohort implicitly assumes that access constraints are lower and adherence is at least in line with, if not above, the broader population.

Why does it apply to this age cohort?
Screening behaviour differs materially by age. Women aged 50–74 often exhibit higher adherence due to stronger clinical recommendations and insurance coverage, while younger cohorts may screen less consistently. Using a blended rate assumes that these differences are either explicitly weighted or directionally offset.

Why does it hold over the forecast period?
Adherence is not static. It is influenced by policy updates, public health campaigns, physician behaviour, and patient awareness. Assuming stability over time implies that no major structural changes are expected or that increases and decreases net out.

Only after these questions are answered does “60%” become a model assumption rather than a borrowed statistic.
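
To make that discipline concrete, here is a minimal Python sketch of one way the logic can travel with the number inside a model. The class, field values, and rationale strings are illustrative, not lifted from any actual model or citation.

```python
from dataclasses import dataclass, field

@dataclass
class Assumption:
    """A model input carried together with the logic that justifies it."""
    name: str
    value: float
    source: str        # where the raw number came from
    applies_to: str    # the cohort and period the number is applied to
    rationale: list[str] = field(default_factory=list)

# The ~60% adherence figure, contextualized rather than pasted in.
screening_adherence = Assumption(
    name="screening_adherence",
    value=0.60,
    source="published national average (illustrative citation)",
    applies_to="urban women aged 50-74, full forecast period",
    rationale=[
        "Urban access (facility density, travel time) supports at-or-above-average adherence",
        "The 50-74 cohort screens more consistently due to guidelines and insurance coverage",
        "No major policy or awareness shift is assumed over the forecast horizon",
    ],
)
```

The point is not the data structure; it is that anyone opening the model can see the value, its source, and the judgment connecting the two in one place.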

II. MATERIAL ASSUMPTIONS MUST BE TESTED, NOT DEFENDED EMOTIONALLY

A material assumption is one that meaningfully changes:

  • Market size

  • Revenue trajectory

  • Valuation outcome

  • Investment decision

If such an assumption is fragile, it must be exposed, not protected.

Consider a base-case model that assumes U.S. breast cancer screening adherence of 60%.

A testing mindset asks:

  • What happens at 55%?

  • What happens at 65%?

  • Which conclusions change, and which don’t?

If a ±5-point change:

  • Moves TAM materially, it’s a core risk driver

  • Barely moves valuation, it’s not worth debating further

The discussion shifts from who is right to what matters.
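
As a sketch of what that testing loop looks like in code, the snippet below flexes adherence through the 55–65% range on a deliberately simple volume-times-price build. Every input is a placeholder chosen for illustration, not a figure from the underlying model.

```python
# One-way sensitivity sweep on screening adherence.
# All inputs are illustrative placeholders, not figures from the model.
ELIGIBLE_POPULATION = 50_000_000  # hypothetical eligible screening population
EXAMS_PER_ADHERENT = 1.0          # assumed annual screens per adherent woman
PRICE_PER_EXAM = 150.0            # hypothetical price per exam, USD

def market_size(adherence: float) -> float:
    """Simple volume-times-price build: population x adherence x exams x price."""
    return ELIGIBLE_POPULATION * adherence * EXAMS_PER_ADHERENT * PRICE_PER_EXAM

base = market_size(0.60)
for adherence in (0.55, 0.60, 0.65):
    tam = market_size(adherence)
    print(f"adherence {adherence:.0%}: TAM ${tam / 1e9:,.2f}bn ({tam / base - 1:+.1%} vs base)")
```

In a purely multiplicative build like this, the response is linear by construction; the sweep becomes genuinely informative once call-back rates, capacity constraints, or tiered pricing introduce non-linearities.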

In the U.S. breast cancer screening model, sensitivity testing showed:

  • Adherence and call-back rates drive market size

  • Population growth has limited influence

  • Pricing matters, but within bounded ranges

This reframes the discussion from “Is this assumption correct?” to “How sensitive is the outcome if reality differs?”

III. IF AN ASSUMPTION MOVES VALUATION, IT MUST BE VISIBLE

In many models, the assumptions that matter most are:

  • Buried in deep tabs

  • Blended into averages

  • Masked by complex formulas

  • Hard to isolate or challenge

That is dangerous.

Visibility means:

  • The assumption can be seen

  • Its impact can be isolated

  • Its movement can be understood quickly

If a key assumption:

  • Requires 20 minutes to find

  • Cannot be flexed easily

  • Is hard to explain verbally

it will not be trusted, even if it is technically correct.

In breast cancer screening market sizing, if screening adherence moves by ±5 points and TAM changes meaningfully, then that assumption must be:

  • Shown in a table

  • Included in a sensitivity matrix

  • Highlighted in the key insights

It should never be buried inside population math.
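
One way to honour that rule is to compute the sensitivity matrix directly and surface it as a table at the front of the workbook or report. A sketch, assuming pandas is available and reusing the placeholder inputs from the earlier snippet:

```python
import pandas as pd

# Two-way sensitivity matrix: adherence vs. price per exam, TAM in $bn.
# Same illustrative placeholders as above; none are the model's actual figures.
ELIGIBLE_POPULATION = 50_000_000
EXAMS_PER_ADHERENT = 1.0

adherence_range = [0.55, 0.60, 0.65]
price_range = [130.0, 150.0, 170.0]

matrix = pd.DataFrame(
    [
        [ELIGIBLE_POPULATION * a * EXAMS_PER_ADHERENT * p / 1e9 for p in price_range]
        for a in adherence_range
    ],
    index=[f"{a:.0%} adherence" for a in adherence_range],
    columns=[f"${p:.0f}/exam" for p in price_range],
).round(2)

print(matrix)  # a table a reviewer can read in seconds, not reverse-engineer from formulas
```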

Strong analysis is not about eliminating uncertainty. It is about making uncertainty explicit, structured, and decision-useful.

When data is incomplete, as it often is, the quality of work depends on whether assumptions are logically grounded, stress-tested, and clearly visible to decision-makers. That is the standard I use when building market sizing and valuation models, and the standard clients should expect when conclusions are meant to hold up beyond a spreadsheet.