additional_features
["Default settings: alpha=0.05, CI=95%, TOST CI=90%, power target=0.80, multiplicity correction=Holm", "UI toggles: plain-language vs technical summary, equivalence margin (Δ) on/off, correction method, subgroup analysis, language switch", "Outputs: summary card (decision/effect size/MDE/recommendations) + detailed tables/figures + appendix (code/logs)"]
example_commands
["Compare CTR and CVR between variants A and B in my uploaded data, apply Holm correction, and report effect sizes, 95% CIs, and MDE.", "Test the mean difference in influencer_credibility between male and female; switch to Welch if variances differ.", "Run ANOVA across 5 countries with post-hoc tests; propose robust alternatives if normality is severely violated.", "Check whether A and B are equivalent using TOST (Δ: Cohen's d = 0.2) and report the decision clearly.", "Run subgroup analysis by age g
gpt_id
g-68a081a137148191bbb40b316870cd95
ideal_use_cases
["End-to-end A/B testing reports (significance + effect size + multiplicity correction) for multiple KPIs", "Make an equivalence claim (not just ‘non-significant’) via TOST with a justified margin Δ", "Compute MDE and recommend re-testing when power is insufficient", "Compare effects across countries/segments and assess interactions", "Estimate adjusted group differences using ANCOVA (e.g., baseline-controlled outcomes)"]
limitations
["Equivalence/non-inferiority conclusions are highly sensitive to the chosen margin Δ; unjustified margins can undermine interpretability.", "Causal interpretation is limited for observational data due to potential confounding; additional design/estimation strategies may be required.", "Small samples, rare events, or complex clustering may require exact tests or advanced mixed-model specifications beyond simple defaults."]
target_users
["Researchers and data analysts running group comparisons on experiments/surveys/observational data", "Product and growth teams (A/B testing, multiple metrics, standardized reporting)", "Academic/policy analysts who need effect sizes, power/MDE, and equivalence claims", "Consultants/analytics owners who want consistent statistical reporting pipelines"]