How RedGraphs Estimates Hidden Money Flows Between Companies
Let’s start with a question we often hear from our clients:
“If only five percent of company relationships have actual reported dollar values, what’s the value of estimating the other ninety-five?”
It’s a great question, and the answer is what makes RedGraphs unique.
Every relationship in our dataset is real. Each one comes from public disclosures verified by S&P Global, representing genuine customer and supplier links where money changes hands. The only missing piece for most relationships is how much money changes hands.
Ignoring those 95% of links would be like trying to model the economy by looking only at companies that disclose their top customers. You would be missing almost all of the structure. Our job is to fill in the blanks, not by guessing, but by applying a rigorous mathematical method called Iterative Proportional Fitting (IPF). This patented approach lets us estimate the most likely money flows across the entire network in a way that is consistent, auditable, and grounded in financial reality.
We also ensure there is no forward-looking bias. Every estimate is calculated strictly based on the information available at that moment in time, just as it would have been seen by the market on that date. It is a true point-in-time network.
Imagine the economy as a giant network of companies, each one a node, and every commercial relationship between them a link. Formally, let G = (V, E) be a directed graph where each edge (i, j) in E represents sales from supplier i to customer j, so the money flows from j to i. We want to estimate the dollar value x_ij ≥ 0 of every link.
We know some of these values directly, which we call x̂_ij (x-hat), but most are missing. We also know each company’s total revenue and cost of goods sold. Revenue gives the row sum (everything a supplier receives across its customers) and cost of goods sold gives the column sum (everything a customer pays across its suppliers) that the flows must satisfy. The goal is to fill in the unknowns in a way that is mathematically consistent with those totals.
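Formally, we estimate X = {x_ij} such that it matches every reported value x̂_ij, reproduces each company’s revenue and cost-of-goods-sold totals, and keeps every flow nonnegative.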
For quant funds and macro analysts, the benefits are immediate:
Under the hood, IPF solves an optimization problem that looks like this:
\[ \min_{x_{ij}\ge 0} \sum_{(i,j)\in E} \left( x_{ij}\log\frac{x_{ij}}{q_{ij}} - x_{ij} + q_{ij} \right) \]
subject to the following linear constraints:
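Going by the description above, where revenue and cost of goods sold act as row and column totals, these constraints can be sketched as follows. The symbols R_i (total revenue of supplier i) and C_j (cost of goods sold of customer j) are notation assumed here for illustration, and pinning reported links to their disclosed values is one reading of how known amounts enter the problem:

\[ \sum_{j:(i,j)\in E} x_{ij} = R_i \;\; \forall i \in V, \qquad \sum_{i:(i,j)\in E} x_{ij} = C_j \;\; \forall j \in V, \qquad x_{ij} = \hat{x}_{ij} \;\; \text{for every reported link } (i,j) \]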
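This objective minimizes the information divergence (a generalized Kullback–Leibler divergence) between the final flows x_ij and their priors q_ij, while forcing the totals to align with the companies’ actual financial statements. In statistics, this is called a Kullback–Leibler projection: among all sets of flows that satisfy the constraints, it picks the one closest to the prior. Under a log-linear model of the flows, it also coincides with the maximum likelihood estimate of the full matrix X.
The result has a simple multiplicative form: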
\[ x_{ij}^{\star} = a_i\, b_j\, q_{ij} \]
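where a_i and b_j are scaling multipliers chosen so that every row and column adds up to the right totals. They have no explicit formula. Instead, IPF alternates between rescaling rows and rescaling columns until every total matches to within a chosen tolerance.
It is simple, elegant, and proven to converge to a unique solution whenever the revenue and cost totals are mutually consistent and some set of nonnegative flows on the observed links can satisfy them, whatever the size of the network. In practice, we can scale this to millions of relationships in just a few passes.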
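The alternating row and column rescaling can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function names `ipf` and `_sums`, the dictionary-based edge list, and the toy companies and totals are all assumptions made for the example. It also assumes every prior is strictly positive and that the row totals and column totals add up to the same grand total.

```python
def _sums(x, axis):
    """Sum edge values by supplier (axis=0) or by customer (axis=1)."""
    out = {}
    for edge, value in x.items():
        out[edge[axis]] = out.get(edge[axis], 0.0) + value
    return out


def ipf(prior, row_totals, col_totals, tol=1e-9, max_iter=1000):
    """Rescale priors q_ij until row and column sums hit their targets.

    prior:      dict mapping (supplier, customer) -> q_ij > 0
    row_totals: dict mapping supplier -> target row sum
    col_totals: dict mapping customer -> target column sum
    """
    x = dict(prior)  # start from the priors
    for _ in range(max_iter):
        # Row step: scale each supplier's edges to its target total.
        rows = _sums(x, 0)
        for (i, j) in x:
            x[(i, j)] *= row_totals[i] / rows[i]
        # Column step: scale each customer's edges to its target total.
        cols = _sums(x, 1)
        for (i, j) in x:
            x[(i, j)] *= col_totals[j] / cols[j]
        # Columns now match; stop once the rows match as well.
        rows = _sums(x, 0)
        if max(abs(rows[i] - row_totals[i]) for i in rows) < tol:
            break
    return x


# Toy network (hypothetical numbers): two suppliers, two customers, flat priors.
prior = {("A", "B"): 1.0, ("A", "C"): 1.0, ("D", "B"): 1.0, ("D", "C"): 1.0}
fitted = ipf(prior,
             row_totals={"A": 100.0, "D": 50.0},
             col_totals={"B": 60.0, "C": 90.0})
```

With flat priors the fitted flows come out proportional to the product of row and column totals, roughly 40 and 60 for supplier A and 20 and 30 for supplier D. Every fitted value is its prior multiplied by one row factor and one column factor, which is the a_i b_j q_ij form given above.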
Suppose supplier A sells to two buyers, B and C. We know A’s total revenue is 100, and that one reported relationship is x_AB = 20. We do not know x_AC, but we have a prior belief that B accounts for 40% and C for 60% of A’s business.
The reported value takes precedence over the prior, so x_AB stays at 20 and A’s revenue total leaves exactly 80 for the remaining link: IPF converges to x_AC = 80. If we add more constraints, such as C’s total cost or sector averages, the algorithm refines the estimate to match all of them simultaneously. Multiply that logic across millions of companies and you get a coherent, data-driven picture of the global flow of money.
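The arithmetic of this example can be checked with a short script. It is a sketch under one assumption that the text does not spell out: reported links are held fixed, and only the revenue left over is spread across the unreported links in proportion to their prior shares. The variable names are illustrative.

```python
revenue_A = 100.0
reported = {("A", "B"): 20.0}                         # disclosed dollar value
prior_share = {("A", "B"): 0.40, ("A", "C"): 0.60}    # prior beliefs

# Revenue not yet explained by reported links.
residual = revenue_A - sum(reported.values())

# Spread the residual over unreported links, proportionally to their priors.
unknown = {e: s for e, s in prior_share.items() if e not in reported}
share_total = sum(unknown.values())
estimated = {e: residual * s / share_total for e, s in unknown.items()}

flows = {**reported, **estimated}
```

With a single unreported link, the prior share has no effect on the answer: x_AC is simply whatever is left of A’s revenue, which is 80. The priors start to matter once a row has several unreported links competing for the same residual.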
We estimate the remaining links because the economy does not stop where disclosure does. Real corporate networks are dense, interconnected systems. If we only used links with reported amounts, we would discard 95% of the relationships that describe how those systems function. IPF fills in the missing values in a way that respects all accounting and structural realities, creating a complete and self-consistent map of global dependencies.
That completeness is what allows RedGraphs to unlock signals such as supplier contagion, customer momentum, concentration risk, and sector fragility. These features consistently show alpha in our backtests on Russell 2000 and 3000 universes.
All estimates are built using a strict two-axis point-in-time framework. One axis tracks when a document was published and first processed by our system, and the other records the periods covered in that document. This means you can build a network as of May 2025 using only documents known by that date, even if they reference earlier years. It is the same principle used in financial backtesting, where no future information is ever used.
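The two-axis idea can be illustrated with a small filter. This is a hypothetical sketch: the `Disclosure` record, its field names, and the sample documents are invented for the example and do not describe the actual RedGraphs schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Disclosure:
    supplier: str
    customer: str
    period_end: date          # axis 2: the period the document covers
    known_on: date            # axis 1: when the document was published and processed
    amount: Optional[float]   # reported dollar value, if any


def known_as_of(disclosures, as_of):
    """Return only the documents that were already known on `as_of`."""
    return [d for d in disclosures if d.known_on <= as_of]


docs = [
    # Covers fiscal 2023, known since March 2024.
    Disclosure("A", "B", date(2023, 12, 31), date(2024, 3, 1), 20.0),
    # Covers fiscal 2024, but not known until August 2025.
    Disclosure("A", "C", date(2024, 12, 31), date(2025, 8, 15), None),
]

snapshot = known_as_of(docs, as_of=date(2025, 5, 31))
```

For a network as of the end of May 2025, only the first document survives the filter. The second covers an earlier period than the snapshot date, yet it is excluded because it was not known until later, which is what keeps future information out of a backtest.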
Estimating unknown values is not about guessing. It is about completing the picture. IPF gives us a mathematically sound, economically faithful way to infer hidden money flows between companies, producing a network that is both realistic and analytically powerful.
Once the network is complete, it becomes a living map of the global economy that lets you trace shocks, measure dependencies, and find alpha in the most unlikely places.
That is the beauty of RedGraphs: turning sparse public data into a coherent, measurable, and investable model of how money actually moves through the world.