Sept 2024 Customer validation N. American beverage co., 18-product block
Sensory product testing: 2,782 respondents

Reconstruct a 2,782‑person study from 500 observations.

A North American beverage company ran an 18-product complete-block sensory study with 2,782 respondents. We trained Simulacra on progressively smaller stratified subsamples and measured how well it reconstructed the overall liking score per product. The result: an 80% reduction in sample requirements with no measurable loss of reconstruction fidelity. The finding was validated by both heuristic and Bayesian changepoint analyses and replicated under stratified and random sampling regimes.

Sample reduction: 82% stratified, 78% random. Replicated across both stratified and random sampling regimes.
Per-product: 68 obs / product. Per-product MAE changepoint analysis under random sampling gives the same answer at the per-product level.
Reconstruction: heuristic + changepoint. Sample size at which RMSE / MAE stabilizes against the full 2,782-respondent dataset.
Replications: 5×. Five-replication design at every sample level, under stratified and random sampling. RMSE and MAE align across all tests.
Protocol: sample-reduction design

Start with the completed study → Train on subsamples → Score against the full result.

01: Establish benchmark

2,782 respondents, 18 products

The customer had already fielded a complete-block sensory study. The full dataset became the empirical benchmark: overall liking score per product.

02: Reduce sample

Progressive subsamples, five replications.

At each sample size, Simulacra trained on five smaller draws. Stratified draws preserved equal product exposure; random draws removed that guarantee.
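The difference between the two draw types can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual sampling code: the long-format `(respondent, product, score)` rows, the 9-point scale, the toy dataset size, and the function names `stratified_draw` and `random_draw` are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

def stratified_draw(rows, n_total, rng):
    """Draw n_total rows with the same number of rows for every product."""
    by_product = defaultdict(list)
    for row in rows:
        by_product[row[1]].append(row)
    per_product = n_total // len(by_product)
    sample = []
    for product in sorted(by_product):
        sample.extend(rng.sample(by_product[product], per_product))
    return sample

def random_draw(rows, n_total, rng):
    """Draw n_total rows uniformly; product exposure is left to chance."""
    return rng.sample(rows, n_total)

# Hypothetical toy data: 200 respondents x 18 products, 9-point liking scale.
rng = random.Random(0)
rows = [(r, p, rng.randint(1, 9)) for r in range(200) for p in range(18)]

strat = stratified_draw(rows, 360, rng)
rand = random_draw(rows, 360, rng)
strat_counts = Counter(p for _, p, _ in strat)
rand_counts = Counter(p for _, p, _ in rand)

print(sorted(strat_counts.values()))  # every product appears exactly 20 times
print(sorted(rand_counts.values()))   # counts vary from product to product
```

Repeating each draw with five different seeds at every sample size would give the five replications described above.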

03: Reconstruct and score

Score against the empirical benchmark.

For every draw, Simulacra generated synthetic study data. We scored RMSE and MAE against the full empirical benchmark, then identified the stable cutoff with heuristic and Bayesian changepoint analyses.
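For readers less familiar with the two error metrics, here is a small sketch of how per-product scores could be compared against the benchmark. The product labels and liking values are made up for illustration, and the dictionary layout is an assumption rather than Simulacra's actual output format.

```python
import math

def mae(reconstructed, benchmark):
    """Mean absolute error across products."""
    return sum(abs(reconstructed[p] - benchmark[p]) for p in benchmark) / len(benchmark)

def rmse(reconstructed, benchmark):
    """Root mean squared error across products; penalizes large misses more."""
    squared = sum((reconstructed[p] - benchmark[p]) ** 2 for p in benchmark)
    return math.sqrt(squared / len(benchmark))

# Hypothetical overall liking scores per product.
benchmark = {"A": 6.0, "B": 7.0, "C": 5.0}
reconstructed = {"A": 6.5, "B": 6.5, "C": 5.0}

mae_value = mae(reconstructed, benchmark)
rmse_value = rmse(reconstructed, benchmark)
print(round(mae_value, 3), round(rmse_value, 3))  # 0.333 0.408
```

Reporting both is useful because RMSE rises faster than MAE when a few products are reconstructed badly, so agreement between the two suggests the error is spread evenly across products.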

Validation 1: stratified sampling regime

The reconstruction fidelity stabilizes at ~478 rows.

As observations increase, the reconstruction error falls smoothly until it flattens, the textbook shape of a learning curve. For this study, heuristic analysis puts the cutoff between 400 and 500 rows; Bayesian changepoint analysis on the MAE-vs-sample-size slope corroborates this with a changepoint at 478 rows of sampled data, roughly an 80% reduction in data requirements for new and ongoing research.
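To make the idea of a changepoint on a learning curve concrete, the sketch below locates the point where a curve stops falling. It is a deliberately simple stand-in: a least-squares grid search over a "hinge" model (linear decline, then flat), not the Bayesian analysis used in the study. The curve values are synthetic, and the function names are hypothetical.

```python
def hinge_sse(xs, ys, c):
    """Fit y = a + b * min(x, c) by ordinary least squares; return the SSE."""
    zs = [min(x, c) for x in xs]
    n = len(xs)
    z_bar = sum(zs) / n
    y_bar = sum(ys) / n
    var_z = sum((z - z_bar) ** 2 for z in zs)
    b = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys)) / var_z
    a = y_bar - b * z_bar
    return sum((y - (a + b * z)) ** 2 for z, y in zip(zs, ys))

def find_changepoint(xs, ys):
    """Return the interior sample size whose hinge fit has the lowest SSE."""
    candidates = xs[1:-1]
    return min(candidates, key=lambda c: hinge_sse(xs, ys, c))

# Synthetic curve: error falls linearly until N = 500, then stays flat.
sizes = list(range(100, 1001, 100))
mae_curve = [max(0.2, 1.0 - 0.002 * (n - 100)) for n in sizes]

changepoint = find_changepoint(sizes, mae_curve)
print(changepoint)  # prints 500
```

A Bayesian treatment would replace the single best-fit value with a posterior distribution over the changepoint location, which also gives an uncertainty interval around the cutoff.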

Stratified validation, sample size vs MAE

5-replication design, Bayesian changepoint = 478

Validation 1.b: per-product replication

Same pattern. Eighteen products in.

The aggregate result holds at the per-product level. Every one of the 18 products in the study follows the same MAE-vs-sample-size shape, converging at the same changepoint.

Per-product MAE vs training rows — stratified

18 products, changepoint = 78 obs/product

Per-product MAE vs training rows — random sampling

18 products, changepoint = 68 obs/product

Per-product changepoint (stratified): 78 obs. Average training rows per product at which the per-product MAE-vs-N slope flattens, across all 18 products. Stratified regime, 5 replications.
Per-product changepoint (random): 68 obs. Same analysis under incomplete-block random sampling. The earlier convergence reflects Simulacra learning shared response structure across consumers and products. Random regime, 5 replications.
Pattern consistency: 18 / 18. Every product in the study exhibits the same MAE-vs-sample-size convergence shape, across both sampling regimes.
Validation 2: random sampling regime

Random subsampling removes equal product exposure and reaches the same conclusion on an even harder test.

The stratified regime guaranteed that every product had equal exposure during training. The random regime removes that guarantee: Simulacra trains on incomplete-block subsamples where some products may be barely seen. Same outcome: 78% reduction in data requirements (changepoint at N = 352, per-product cutoff at 68 obs/product). The agreement between regimes is evidence that Simulacra generalizes the population's response structure (across consumers, products, and attributes) rather than memorizing product-level patterns.

Stratified

82% reduction

Heuristic threshold + changepoint at N=478. Per-product changepoint averages 78 observations.

Random

78% reduction

Heuristic threshold + changepoint at N=352. Per-product changepoint averages 68 observations.

80% reduction in data requirements validated by alignment between sampling regimes.

Fieldwork economics

Spend less on redundant sample; spend more on the decisions.

A complete-block 18-product sensory study at the original scale costs roughly $80–100K and takes 8–12 weeks. At 20% of the data, the same study-level reconstruction fidelity arrives in ~2 weeks of fieldwork at <$25K. The remaining sample budget reallocates to scenario modeling, deeper cross-tabs, or a second wave that wouldn't otherwise have fit. This validation shows Simulacra learning how the real population responds to each product and generalizing out of sample.

Validate on your data

How low can you go? Find the data requirements for your ongoing research and see how much you can save.

We subsample your data at progressive sizes, fit Simulacra at each size, score against the full empirical dataset, and report the changepoint where more sample stops improving fidelity.
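The loop described above can be outlined as follows. This is a sketch under stated assumptions, not the Simulacra pipeline: `subsample_means` is a placeholder that stands in for the model, `learning_curve` is a hypothetical helper name, the draws are simple random draws, and the toy data is synthetic.

```python
import random
from statistics import mean

def subsample_means(sample, products):
    """Placeholder model: per-product mean, with the grand mean as fallback."""
    grand = mean(s for _, _, s in sample)
    out = {}
    for p in products:
        scores = [s for _, q, s in sample if q == p]
        out[p] = mean(scores) if scores else grand
    return out

def learning_curve(rows, sizes, fit_reconstruct, n_reps=5, seed=0):
    """Average MAE against the full-data benchmark at each sample size."""
    rng = random.Random(seed)
    products = sorted({p for _, p, _ in rows})
    benchmark = {p: mean(s for _, q, s in rows if q == p) for p in products}
    curve = {}
    for n in sizes:
        errors = []
        for _ in range(n_reps):
            sample = rng.sample(rows, n)  # one random draw of n rows
            recon = fit_reconstruct(sample, products)
            errors.append(mean(abs(recon[p] - benchmark[p]) for p in products))
        curve[n] = mean(errors)
    return curve

# Hypothetical toy data: 300 respondents x 6 products, 9-point liking scale.
data_rng = random.Random(1)
base = {p: 4 + p % 3 for p in range(6)}
rows = [(r, p, min(9, max(1, base[p] + data_rng.randint(-2, 2))))
        for r in range(300) for p in range(6)]

curve = learning_curve(rows, [36, 144, 576, 1800], subsample_means)
for n, err in curve.items():
    print(n, round(err, 3))
```

The resulting size-to-error mapping is the input to the changepoint step; swapping `subsample_means` for a real model is the only change the harness needs.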

Acknowledgements:

This research was conducted with the support of an anonymous North American beverage company.

Bayesian changepoint analysis was run in R using the segmented package.