Research asks questions. Simulacra finds answers. Every study contains a sample of its response structure: the way choices, prices, claims, segments, attitudes, baskets, transactions, and outcomes move together; how every measured variable shifts when the others move. Most synthetic-data tools approximate this structure as a product of marginals, or as text completions over field names, losing the conditional dynamics that make the original sample useful. Simulacra's Generative Causal AI learns the study's response structure directly, then lets you run new scenarios against the world your data actually measured.
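In symbols, as a sketch rather than Simulacra's own formulation: write one completed row as x = (x_1, ..., x_d). This notation is an assumption introduced here for illustration. A product-of-marginals generator samples from q, while the chain rule on the right shows the conditional terms that q drops.

```latex
q(x_1,\dots,x_d) = \prod_{i=1}^{d} p(x_i)
\qquad\text{versus}\qquad
p(x_1,\dots,x_d) = \prod_{i=1}^{d} p(x_i \mid x_1,\dots,x_{i-1})
```

The two agree only when the variables are mutually independent. Otherwise q can place probability on rows to which p assigns none.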
A generator that samples each variable independently from its own marginal can hit every univariate target while still returning an impossible record, one that no respondent or process could have produced. Older generators often yield incoherent rows of this kind because they ignore the causal structure and dependencies within the study.
Simulacra fits those constraints directly. It learns the response structure of the data: which variables can move freely, which variables have to move together, and which combinations cannot be defended by the measured process. That's why Simulacra can generate coherent rows, condition on a segment, run an intervention, and refuse a query the fitted model cannot answer. The causal behavior starts with row-level fidelity.
Generator that samples marginals: each variable is drawn from its own marginal. Resulting row: every column sits inside its real range, but the respondent cannot exist.
Every value predicted in the context of every other: rows are a linked system, with dependencies preserved.
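The failure mode of independent marginal sampling can be reproduced in a few lines. This is a minimal Python sketch on an invented toy table, not Simulacra's code or data: the column values, row counts, and function names are all hypothetical. Whole-row resampling stands in for "a model that keeps dependencies" purely for contrast, and it is much simpler than the fitted model this page describes.

```python
import random
from collections import Counter

# Hypothetical toy "study" of (age_band, employment) pairs. Invented data.
REAL_ROWS = (
    [("under_18", "student")] * 30
    + [("18_to_64", "employed")] * 50
    + [("65_plus", "retired")] * 20
)


def sample_independent(rows, n, rng):
    """Draw each column from its own marginal, ignoring the other column."""
    ages = [row[0] for row in rows]
    jobs = [row[1] for row in rows]
    return [(rng.choice(ages), rng.choice(jobs)) for _ in range(n)]


def sample_joint(rows, n, rng):
    """Draw whole rows, so both columns always move together."""
    return [rng.choice(rows) for _ in range(n)]


def impossible_share(synthetic, rows):
    """Fraction of synthetic rows whose combination never occurs in the data."""
    seen = set(rows)
    return sum(row not in seen for row in synthetic) / len(synthetic)


rng = random.Random(0)
independent = sample_independent(REAL_ROWS, 10_000, rng)
joint = sample_joint(REAL_ROWS, 10_000, rng)

# The age marginal of the independent sample still tracks the 30/50/20 split.
print(Counter(age for age, _ in independent))
print("impossible share, independent:", impossible_share(independent, REAL_ROWS))
print("impossible share, joint:", impossible_share(joint, REAL_ROWS))
```

In this toy table the independent sampler produces an unseen combination, such as a retired respondent under 18, about 62% of the time in expectation (1 minus 0.3² + 0.5² + 0.2²), even though each column on its own looks correct. The whole-row sampler never does.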
The AI is tabular-native and diffusion-inspired, built for mixed data rather than text personas. It learns the joint response structure of the study: distributions, higher-order dependencies, segment dynamics, and the constraints that make one completed row plausible and another incoherent. A condition does not filter to nearby records; it reweights the whole model, with confidence that changes as the query moves toward thinner evidence.
Move the controls. Simulacra treats the values you set as a condition, then regenerates the remaining variables from the fitted response structure. In dense regions the answer is high-confidence. Near the edge, the engine labels the uncertainty. When the request goes beyond what the fitted model can defend, it refuses instead of inventing a row.
This visual is intentionally simple: real projects have dozens to hundreds of variables. Simulacra is not looking up nearby rows. It learns a joint response structure and reweights that structure under a condition. Confidence is highest where the fitted structure has strong evidence and lower near the edge, and the engine says so explicitly when the data cannot answer.
The schema defines what Simulacra can be asked. The fitted response structure defines what kind of answer it can generate.
Simulacra does not pretend every condition is equally answerable. It returns the strongest answer the fitted model can defend.
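The answer, label, or refuse behavior described above can be illustrated with a deliberately small Python sketch. Everything in it is an assumption made for illustration: the segment and choice names, the counts, the two thresholds, and the function name are invented, and a count table is far simpler than the fitted model this page describes. In a count table, conditioning reduces to renormalizing the matching cells, which is exactly the lookup behavior Simulacra is said to go beyond. The sketch shows only the three-way outcome.

```python
from collections import Counter

# Hypothetical joint counts over (segment, choice). Invented numbers.
JOINT_COUNTS = Counter({
    ("segment_a", "basic"): 400,
    ("segment_a", "plus"): 150,
    ("segment_a", "premium"): 20,
    ("segment_b", "basic"): 5,
    ("segment_b", "plus"): 12,
    ("segment_b", "premium"): 8,
    ("segment_c", "basic"): 2,
    ("segment_c", "plus"): 1,
})

# Hypothetical evidence thresholds; a real system would calibrate these.
HIGH_CONFIDENCE_MIN = 200
REFUSE_BELOW = 10


def condition_on_segment(segment):
    """Return (label, distribution over choice) for one segment condition."""
    cells = {
        choice: count
        for (seg, choice), count in JOINT_COUNTS.items()
        if seg == segment
    }
    support = sum(cells.values())
    if support < REFUSE_BELOW:
        # Too little evidence to defend any answer: refuse rather than invent.
        return "refused", None
    distribution = {choice: count / support for choice, count in cells.items()}
    label = "high" if support >= HIGH_CONFIDENCE_MIN else "uncertain"
    return label, distribution


for name in ("segment_a", "segment_b", "segment_c", "segment_z"):
    print(name, condition_on_segment(name))
```

With these invented counts, segment_a is answered with high confidence, segment_b is answered but labeled uncertain, and segment_c and the unseen segment_z are refused.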
A Bayesian network gives a conditional probability table. A structural causal model gives an estimate of a single causal estimand under a specified graph and adjustment set. Simulacra generates a complete dataset for the conditioned population, in which each row remains internally coherent.
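One way to write the three kinds of output side by side, offered as a sketch under assumed notation rather than a formal definition of any of the methods: Y is an outcome, X is the variable being set, Z collects all remaining variables (including Y), and N is the number of generated rows. The average-effect contrast on the second line is one common example of an estimand, not the only one, and treating the third line as a conditional draw is an assumption made here.

```latex
\begin{aligned}
\text{Bayesian network:}\quad & P(Y = y \mid X = x) \\
\text{Structural causal model:}\quad & \tau = \mathbb{E}[Y \mid \mathrm{do}(X = x)] - \mathbb{E}[Y \mid \mathrm{do}(X = x')] \\
\text{Generated population:}\quad & \{(x, z^{(k)})\}_{k=1}^{N}, \qquad z^{(k)} \sim p(Z \mid X = x)
\end{aligned}
```

The first two outputs summarize the scenario as a table or a number. The third is a set of complete rows from which such summaries can be computed afterwards.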
The comparison below uses a blinded food-delivery study to show the native output of each method: distribution lookup, effect estimate, and generated scenario population.
Set income to its high tier (₹25,000 and above, in Indian rupees). What does each method tell you about behavior under that scenario?
Twin-2K-500 tests survey synthesis on a public benchmark. Pricing & Promo tests an applied sales holdout. Data Reduction tests sample-size economics. Each page includes methodology, holdout design, and documented gaps. We'll run the same validation on your data.
Bring a study you already fielded. We hold out a portion, fit Simulacra on the remainder, run the same methodology you just read, and send back the scorecard. Standard NDA, no contract required.