Epidemiology Research Platform

From NHANES Data to Lancet-Quality Publication

Automated survey-weighted analysis, publication-ready figures, and manuscript generation. Transform months of epidemiological research into a reproducible, one-click pipeline.

$ python run.py --topic "BMI and hypertension" --cycles 2017-2018
 
Step 1/8: Parsing research proposal...
🔎 Step 2/8: Mapping variables → BMXBMI, BPQ020, RIAGENDR...
📥 Step 3/8: Downloading NHANES 2017-2018 from CDC...
⚙️ Step 4/8: Processing & merging datasets (n=4,520)
📊 Step 5/8: Survey-weighted logistic regression (OR=1.87, 95%CI 1.52-2.29)
📈 Step 6/8: Generating Table 1, forest plots, Kaplan-Meier curves
📚 Step 7/8: PubMed search → 23 related articles retrieved
✍️ Step 8/8: Writing Lancet-format manuscript...
 
Done! Results: output/bmi_hypertension/results.zip
150+ NHANES Variables Indexed
10 Phenotype Presets
10 Survey Cycles (1999-2018)
8 Pipeline Steps Automated

8-Step Automated Pipeline

From research question to submission-ready manuscript in one reproducible workflow

1. Parse Proposal

Upload a Word document or enter your research question. The PICO/PECO elements are extracted automatically using NLP.

2. Map Variables

Intelligent mapping from your concepts to 150+ NHANES variables across demographics, labs, questionnaires, and mortality data.
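As a rough illustration of concept-to-variable mapping, the sketch below uses a small hand-written dictionary. The codes BMXBMI, BPQ020, and RIAGENDR appear in the demo output above; the `CONCEPT_MAP` structure, the `map_concepts` helper, and the RIDAGEYR entry are assumptions made for illustration, not the platform's actual knowledge base.

```python
# Hypothetical concept-to-variable lookup; not the platform's real knowledge base.
CONCEPT_MAP = {
    "bmi": "BMXBMI",            # body mass index (kg/m^2), examination data
    "hypertension": "BPQ020",   # ever told you had high blood pressure
    "sex": "RIAGENDR",          # gender, demographics file
    "age": "RIDAGEYR",          # age in years at screening
}

def map_concepts(concepts):
    """Return (mapped, unmapped) for a list of free-text concept names."""
    mapped, unmapped = {}, []
    for concept in concepts:
        key = concept.strip().lower()
        if key in CONCEPT_MAP:
            mapped[concept] = CONCEPT_MAP[key]
        else:
            unmapped.append(concept)
    return mapped, unmapped

mapped, unmapped = map_concepts(["BMI", "Hypertension", "income"])
print(mapped)    # concepts with a known NHANES variable
print(unmapped)  # concepts that would need manual review
```

Returning the unmapped concepts separately keeps unknown terms visible instead of silently dropping them.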

3. Download Data

Automated XPT file download from CDC NHANES FTP, with local parquet caching. Supports all 10 cycles from 1999-2018.
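NHANES file names carry a letter suffix per two-year cycle (none for 1999-2000, `_B` for 2001-2002, through `_J` for 2017-2018). The sketch below shows one way a download step with a parquet cache might be organized. `BASE_URL` is a placeholder, and the URL layout, the `cache` directory, and all function names are assumptions rather than the platform's implementation. `pandas.read_sas` with `format="xport"` reads XPT files; parquet I/O additionally needs `pyarrow` or `fastparquet`.

```python
from pathlib import Path

# Letter suffixes NHANES appends to file names for each two-year cycle.
CYCLE_SUFFIX = {
    "1999-2000": "", "2001-2002": "_B", "2003-2004": "_C", "2005-2006": "_D",
    "2007-2008": "_E", "2009-2010": "_F", "2011-2012": "_G", "2013-2014": "_H",
    "2015-2016": "_I", "2017-2018": "_J",
}

# Assumption: placeholder base URL; substitute the current CDC location.
BASE_URL = "https://example.invalid/nhanes"

def xpt_name(component, cycle):
    """File name for a component (e.g. 'DEMO', 'BMX') in a given cycle."""
    return f"{component}{CYCLE_SUFFIX[cycle]}.XPT"

def cache_path(component, cycle, cache_dir="cache"):
    """Local parquet path used to avoid re-downloading a file."""
    return Path(cache_dir) / cycle / f"{component}.parquet"

def load_component(component, cycle, cache_dir="cache"):
    """Read from the parquet cache if present, else download and cache."""
    import pandas as pd  # imported lazily so the helpers above stay dependency-free
    path = cache_path(component, cycle, cache_dir)
    if path.exists():
        return pd.read_parquet(path)
    url = f"{BASE_URL}/{cycle}/{xpt_name(component, cycle)}"  # assumed layout
    df = pd.read_sas(url, format="xport")
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_parquet(path)
    return df

print(xpt_name("DEMO", "2017-2018"))
```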

4. Process & Merge

Clean missing-value codes, recode variables, and merge the demographic (DEMO), laboratory (LAB), and questionnaire (Q) datasets. Survey weights are adjusted automatically for multi-cycle analysis.
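The weight adjustment for pooled cycles can be sketched as follows. It follows the commonly cited CDC guidance: divide the two-year weight by the number of pooled cycles, except that when both 1999-2000 and 2001-2002 are included, those two cycles use the four-year weight (for example WTMEC4YR) scaled by 2/k. The function names are hypothetical, and the set of refused or "don't know" codes varies by item, so it is passed in explicitly rather than assumed.

```python
def clean_missing(value, missing_codes):
    """Return None when value is one of the item's refused/don't-know codes."""
    return None if value in missing_codes else value

def pooled_weight(cycle, w2yr, w4yr, cycles):
    """Rescale a participant's weight for an analysis pooling `cycles`.

    cycle  : the participant's own cycle, e.g. "2017-2018"
    w2yr   : two-year weight (e.g. WTMEC2YR)
    w4yr   : four-year weight, only present for 1999-2002 (else None)
    cycles : all cycles being pooled
    """
    k = len(cycles)
    early = {"1999-2000", "2001-2002"}
    if early <= set(cycles) and cycle in early:
        return w4yr * 2 / k
    return w2yr / k

# Illustrative values only.
print(pooled_weight("2017-2018", 1000.0, None, ["2015-2016", "2017-2018"]))
print(clean_missing(9, {7, 9}))
```

Datasets within a cycle are typically joined on the respondent sequence number, SEQN, before weights are rescaled.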

5. Statistical Analysis

Survey-weighted descriptive statistics, Rao-Scott chi-square tests, logistic and linear regression, and subgroup analysis with design-based standard errors.
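An odds ratio with its confidence interval, as reported for a logistic regression, is obtained by exponentiating the log-odds coefficient and its interval bounds. The sketch below assumes a normal critical value of 1.96; survey software often uses a t critical value based on the design degrees of freedom instead. The function name and the inputs are illustrative only.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its SE to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Illustrative inputs only (not results from any dataset).
or_, lo, hi = odds_ratio_ci(beta=0.5, se=0.1)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```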

6. Tables & Figures

Lancet-standard Table 1 (baseline), regression tables, forest plots, correlation heatmaps, Kaplan-Meier curves. 300 DPI output.

7. Literature Search

Real-time PubMed search via NCBI E-utilities. Retrieves related studies automatically and formats Vancouver-style citations for your manuscript.
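A PubMed query through E-utilities might look like the sketch below, which uses the ESearch endpoint with JSON output. The helper names and the example query are assumptions. `search_pmids` performs a network request, so only the URL builder is exercised here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term, retmax=20):
    """Build an ESearch URL for PubMed that returns JSON."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"

def search_pmids(term, retmax=20):
    """Return a list of PMIDs for the query (performs a network request)."""
    with urlopen(esearch_url(term, retmax), timeout=30) as resp:
        payload = json.load(resp)
    return payload["esearchresult"]["idlist"]

print(esearch_url("body mass index AND hypertension AND NHANES"))
```

The returned PMIDs can then be passed to ESummary or EFetch to obtain the metadata needed for citations.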

8. Write Manuscript

Generate a Lancet-format paper: structured abstract, methods with STROBE compliance, results with embedded tables, and discussion.

Why NHANES to Lancet?

Purpose-built for epidemiological research, not a generic data tool

🎯

Survey-Weighted Statistics

Proper handling of the NHANES complex survey design, with MEC/interview weights, PSUs, and strata. This goes beyond adding weights to a regression: it covers Taylor-linearization standard errors, Rao-Scott chi-square tests, and weight adjustment for multi-cycle pooling.
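To make the difference from a plain weighted regression concrete, here is a minimal sketch of a Taylor-linearization standard error for a weighted mean under a stratified, clustered design (with-replacement PSU approximation). In NHANES the design variables are usually SDMVSTRA (strata) and SDMVPSU (PSU); the function itself is an illustration, not the platform's code.

```python
import math
from collections import defaultdict

def weighted_mean_se(y, w, strata, psu):
    """Weighted mean and its Taylor-linearized SE for a stratified cluster design."""
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Sum the linearized contributions w_i * (y_i - mean) / sum(w) within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, strata, psu):
        psu_totals[h][j] += wi * (yi - mean) / total_w

    # Between-PSU variance within each stratum, summed over strata.
    var = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has a single PSU")
        zbar = sum(totals.values()) / n_h
        var += n_h / (n_h - 1) * sum((z - zbar) ** 2 for z in totals.values())
    return mean, math.sqrt(var)

# Toy data: one stratum, two PSUs, equal weights.
mean, se = weighted_mean_se([1.0, 3.0], [1.0, 1.0], [1, 1], [1, 2])
print(mean, se)
```

The variance comes from variation between PSU totals within strata, which is why ignoring the design and treating weights as frequency weights tends to give misleading standard errors.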

📈

Publication-Quality Output

300 DPI figures in Lancet color palette. Table 1 with means/SD or n/% by group. Forest plots with HR/OR and 95% CI. Correlation heatmaps. STROBE checklist compliance verification.

📚

150+ Variable Knowledge Base

Curated database of NHANES variables across 27 categories: demographics, body measures, blood pressure, labs, questionnaires, diet, physical activity, sleep, depression (PHQ-9), and mortality follow-up.

🤖

AI-Powered Paper Writing

DeepSeek LLM integration for Lancet-format manuscript generation. Structured abstract, methods section with exact statistical parameters, results narrative, and discussion with strengths/limitations.

🚀

10 Phenotype Presets

Pre-configured exposure/outcome/covariate sets for: obesity, diabetes, hypertension, dyslipidemia, CVD, smoking, depression, CKD, diet quality, and sleep disorders. One-click study setup.
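A preset of this kind could be represented as a small immutable record. The `PhenotypePreset` class, its field names, and the specific variable choices below are hypothetical; only the variable codes BMXBMI, BPQ020, and RIAGENDR come from the demo output earlier on this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PhenotypePreset:
    """Hypothetical container for one pre-configured study design."""
    name: str
    outcome: str
    exposures: tuple
    covariates: tuple = ()

    def variables(self):
        """All NHANES variables the preset needs, outcome first."""
        return [self.outcome, *self.exposures, *self.covariates]

# Illustrative preset; the covariate list is an assumption.
HYPERTENSION = PhenotypePreset(
    name="hypertension",
    outcome="BPQ020",
    exposures=("BMXBMI",),
    covariates=("RIDAGEYR", "RIAGENDR"),
)

print(HYPERTENSION.variables())
```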

📦

Reproducible Results

Every analysis produces a complete ZIP package: cleaned dataset, analysis scripts (Python + R), all tables/figures, generated manuscript, and a manifest with version info and parameters.
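One way to assemble such a package is sketched below using only the standard library. The manifest layout, the `build_package` name, and the example file are assumptions; the point is that the parameters and a checksum for every file travel with the results.

```python
import hashlib
import json
import platform
import tempfile
import zipfile
from pathlib import Path

def build_package(files, params, out_path):
    """Zip the given files plus a manifest.json with parameters and checksums."""
    manifest = {
        "python_version": platform.python_version(),
        "parameters": params,
        "files": {},
    }
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in map(Path, files):
            data = f.read_bytes()
            manifest["files"][f.name] = hashlib.sha256(data).hexdigest()
            zf.writestr(f.name, data)
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest

# Usage with a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    table = Path(tmp) / "table1.csv"
    table.write_text("variable,value\n")
    out = Path(tmp) / "results.zip"
    manifest = build_package(
        [table], {"topic": "BMI and hypertension", "cycles": ["2017-2018"]}, out
    )
    with zipfile.ZipFile(out) as zf:
        names = sorted(zf.namelist())

print(names)
```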

How We Compare

NHANES analysis requires specialized knowledge. We encode that knowledge into software.

Capability | NHANES to Lancet | Generic Stats Tools | Manual Analysis
Automatic CDC data download
Survey-weighted analysis |  | Partial | 
NHANES variable knowledge base
Lancet-format manuscript generation
STROBE compliance check |  |  | Manual
PubMed literature integration
Reproducible pipeline |  | Partial | 
Time to results | ~10 minutes | Hours | Weeks

Built For

Real epidemiological research scenarios, not toy examples

🏥

PhD Students & Postdocs

Publish a first NHANES paper without spending months learning survey design, SAS/Stata code, and Lancet formatting conventions.

🏧

Research Labs

Rapid hypothesis testing across multiple NHANES cycles. Batch-process several research questions with consistent methodology.

🌐

Public Health Institutions

Standardized, auditable analysis pipeline for population health surveillance. Reproducible reports for policy stakeholders.

Pricing

Choose the plan that fits your research workflow

Starter

$0/mo

For individual researchers exploring NHANES

  • 1 NHANES cycle per analysis
  • Basic descriptive statistics
  • Table 1 generation
  • 3 analyses per month
  • Community support
Get Started Free

Enterprise

$199/mo

For institutions and CROs

  • All Professional features
  • Custom variable mapping
  • Multi-study batch processing
  • Dedicated account manager
  • SLA guarantee
  • On-premise deployment option
Contact Sales

Ready to accelerate your research?

Join researchers using NHANES to Lancet for faster, reproducible epidemiological analysis.

Try Live Demo | View on GitHub