DataPrepCodeGen – SPSS/R Data Preparation Code Generator

A GPT specialized in generating SPSS and R (tidyverse) code for official statistics and survey data preparation.

Overview

URL

https://chatgpt.com/g/g-69396281e0188191be94814202314ee2-dataprepcodegen

Version

v1.0.0

Created

2025-12-11

Updated

2025-12-11

policy, statistics, research-assistant

dataprepcodegen, dataprep-gpt

Key functions

Generate SPSS data preparation syntax based on survey/statistical metadata and codebooks
Translate SPSS preparation syntax into equivalent R (tidyverse) data preparation code (excluding analytical code)
Create variable labels, value labels, missing value definitions, and display formats for official statistics datasets
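
As a rough illustration of the first and third functions above, the sketch below turns a small codebook into SPSS VARIABLE LABELS, VALUE LABELS, and MISSING VALUES commands. It is not the GPT's actual implementation. The codebook layout (a dictionary with "label", "values", and "missing" keys), the CODEBOOK sample, and every function name are hypothetical and chosen only for this example.

```python
# Minimal sketch: build SPSS labeling syntax from a hypothetical codebook dict.
# The codebook structure and all names here are assumptions for illustration.

CODEBOOK = {
    "sex": {
        "label": "Sex of respondent",
        "values": {1: "Male", 2: "Female", 9: "No answer"},
        "missing": [9],
    },
    "income": {
        "label": "Monthly income",
        "missing": [998, 999],
    },
}


def spss_quote(text):
    """Wrap text in single quotes; SPSS escapes an embedded quote by doubling it."""
    return "'" + text.replace("'", "''") + "'"


def variable_labels(codebook):
    """Return one VARIABLE LABELS command covering every variable."""
    lines = [f"  {name} {spss_quote(meta['label'])}" for name, meta in codebook.items()]
    return "VARIABLE LABELS\n" + "\n".join(lines) + "."


def value_labels(codebook):
    """Return one VALUE LABELS command; variables are separated by a slash."""
    blocks = []
    for name, meta in codebook.items():
        values = meta.get("values")
        if values:
            pairs = " ".join(f"{code} {spss_quote(label)}" for code, label in values.items())
            blocks.append(f"{name} {pairs}")
    if not blocks:
        return ""
    return "VALUE LABELS\n  " + "\n  /".join(blocks) + "."


def missing_values(codebook):
    """Return one MISSING VALUES command listing discrete user-missing codes.

    SPSS accepts at most three discrete missing values per variable;
    this sketch does not enforce that limit.
    """
    parts = [
        f"{name} ({', '.join(str(code) for code in meta['missing'])})"
        for name, meta in codebook.items()
        if meta.get("missing")
    ]
    if not parts:
        return ""
    return "MISSING VALUES " + " / ".join(parts) + "."


if __name__ == "__main__":
    for command in (variable_labels, value_labels, missing_values):
        print(command(CODEBOOK))
        print()
```

Keeping one function per SPSS command makes it easy to emit only the commands a given codebook needs, for example skipping VALUE LABELS when no variable has coded values.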

Technical details

_id

dataprepcodegen

gpt_id

dataprepcodegen

viz1

public

viz2

show_url

language

Other fields

additional_features

["Interprets metadata schemas (variable roles, semantic types, groups) to draft type conversions, grouped sums/means, and index variables", "Adds concise Korean comments to SPSS and R code to document each preparation step for official statistics workflows"]

example_commands

["Use this survey metadata to generate SPSS VARIABLE LABELS, VALUE LABELS, and MISSING VALUES syntax.", "Convert this SPSS data preparation script into R tidyverse code, excluding any analytical procedures.", "Given these variable roles (id, weight, psu, strata), create a reusable SPSS preparation template.", "From this codebook, draft SPSS FORMATS and any necessary type conversion code for income, age, and date variables.", "Generate SPSS VARSTOCASES syntax to reshape these repeated-measures va

ideal_use_cases

["Automating SPSS variable/value labeling and missing value declarations from questionnaires or metadata files", "Refactoring legacy SPSS preparation scripts into cleaner, documented syntax", "Producing parallel SPSS and R (tidyverse) preparation scripts for the same dataset to support migration"]

limitations

["Does not generate or interpret statistical analysis code (e.g., regression, hypothesis tests); it focuses strictly on data preparation.", "Does not perform file system operations, database connections, or OS-level commands."]

target_users

["Statisticians and data managers working with official statistics and survey microdata", "Practitioners responsible for SPSS-based data cleaning and preparation", "Analysts transitioning from SPSS to R who need parallel preparation scripts"]