Munseongpyeong (文成評)

A strict evidence-only GPT that evaluates policy, survey, research, and marketing reports using a 6-axis quality framework.

Overview
Version
v1.0.0
Created
2025-12-14
Updated
2025-12-14
policy, survey-research, research-methodology, statistics, AAPOR, sampling, weighting, variance-estimation, DEFF, measurement-equivalence, scale-validity, reproducibility, story-flow, report-quality-evaluation
munseongpyeong, msp-evaluator, report-evaluator-6axis, policy-research-reviewer
Key functions
  • Summarize overall report structure and analyze problem framing (policy/research framing)
  • Conduct 6-axis evaluation: completeness, validity (data and logic consistency), feasibility, story coherence, clarity/fitness of problem setting, and methodology quality
  • Perform in-depth methodology audits: AAPOR RR/CR/COOP/REF, sampling design, weighting (design → nonresponse → post-stratification), variance estimation (Taylor/BRR/JK), design effects (DEFF), replicate weights, missingness/imputation flags
  • Assess measurement quality: measurement equivalence, time-series and subgroup comparability, scale reliability, factor analysis, and measurement invariance checks
  • Review reproducibility: availability of data, codebooks, questionnaires, and analytical procedures
  • Audit visualizations for uncertainty communication (confidence intervals, error bars, DEFF disclosure)
  • Provide concrete improvements in a source → revision → rationale format
  • Generate Story Mode outputs: Narrative_Spine, Coherence_Score (P1–P5), Breakpoints, Bridge_Texts, Story_Map
  • Assign axis-level grades and an overall weighted grade with explicit evidence
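As context for the DEFF checks listed above: one common approximation the audit can verify is Kish's design effect due to unequal weighting, deff = n · Σwᵢ² / (Σwᵢ)², with effective sample size n_eff = n / deff. A minimal sketch (function names are illustrative, not part of this GPT's interface):

```python
def kish_deff(weights):
    """Kish's approximate design effect from unequal weights:
    deff = n * sum(w_i^2) / (sum(w_i))^2.
    Equals 1.0 when all weights are equal; grows with weight variability."""
    n = len(weights)
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return n * s2 / (s1 * s1)

def effective_sample_size(weights):
    """Weighting-adjusted effective n: n / deff."""
    return len(weights) / kish_deff(weights)

# Equal weights imply no design effect from weighting:
print(kish_deff([1.0, 1.0, 1.0, 1.0]))   # -> 1.0
# Unequal weights inflate variance and shrink effective n:
print(kish_deff([1.0, 1.0, 1.0, 3.0]))   # -> 1.333...
```

Note this captures only the weighting component; clustering and stratification also contribute to the overall design effect and require the survey's full variance-estimation setup (Taylor linearization, BRR, or jackknife).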
Technical details
_id
g-689b6ef185d881918d1f4efcc3b12930
gpt_id
g-689b6ef185d881918d1f4efcc3b12930
viz1
public
viz2
show_url
language
en
Other fields
additional_features
["Strict Factual Mode (evidence-only; strict numerical and terminology fidelity)", "StoryFlow evaluation module (narrative spine, coherence scoring, breakpoints, bridge text generation)", "Checklist-driven methodology audit engine (AAPOR, weighting, variance estimation, measurement equivalence, reproducibility, uncertainty visualization)"]
example_commands
["Evaluate the attached report using the 6-axis framework. Include a full methodology audit (AAPOR, weighting, variance estimation, DEFF, missingness/imputation, scale validity, reproducibility).", "Run a representative-chapter review and include Story Mode outputs (Narrative_Spine, Breakpoints, Bridge_Texts).", "Identify where the narrative breaks between tables and interpretations, list Breakpoints, and draft Bridge_Texts in the original tone.", "Check whether time-series indicators are compar
ideal_use_cases
["Audit survey reports for response rates (AAPOR), sampling, weighting, variance estimation, and DEFF compliance", "Identify logical gaps between results, interpretation, and recommendations and repair narrative flow", "Detect risks in time-series or subgroup comparisons due to definition changes or measurement non-equivalence", "Ensure uncertainty is properly communicated in tables and figures (CIs, error bars, DEFF)", "Generate a final pre-submission quality evaluation and revision guide"]
limitations
["All judgments are strictly based on the provided source text and attachments (Strict Factual Mode)", "No speculation or inference beyond the evidence (zero-inference rule)", "Does not answer general user questions; only produces evaluation outputs", "Will not invent, adjust, or retrofit numbers, definitions, or methods absent from the source", "Marks items as 'insufficient evidence' when documentation is lacking"]
target_users
["Policy and research report quality assurance owners", "Survey research practitioners (design, fielding, analysis)", "Authors and reviewers of public, private, or academic research reports", "Editors, peer reviewers, and internal audit teams"]