Evaluation Framework

APHI workflows should not advance merely because the output looks polished. They advance only when reviewers can inspect the evidence, risks, limitations, and results.

01

Accuracy

Verify factual claims, calculations, definitions, citations, and source alignment.

02

Safety

Check refusal rules, escalation points, how uncertainty is expressed, and safeguards against unsupported guidance.

03

Usefulness

Assess whether the output helps a real practitioner take the next step under workflow constraints.

04

Equity

Review missingness, subgroup effects, stigmatizing language, and potential resource allocation harms.

05

Source Quality

Label authoritative guidance, peer-reviewed sources, preprints, vendor claims, and secondary summaries.

06

Governance

Document privacy review, reviewer role, approval path, monitoring, and operational boundaries.

Readiness levels

APHI separates idea quality from deployment readiness. A workflow can be promising and still require review before any operational use.

Concept

Question and possible AI support are defined. Not tested.

Draft

Inputs, outputs, review checklist, and example cases exist.

Pilot

Tested with synthetic, historical, or partner-approved data under supervision.

Operational

Governance, monitoring, evaluation results, and escalation rules are documented.
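
The four levels above form an ordered progression, which can be sketched as an ordered enumeration. This is a minimal illustration in Python, not an APHI implementation: the names `Readiness` and `cleared_for_operational_use` are hypothetical, and the rule that only the Operational level clears a workflow for operational use is an assumption drawn from the descriptions above.

```python
from enum import IntEnum


class Readiness(IntEnum):
    """Hypothetical encoding of the readiness levels, in the order listed above."""

    CONCEPT = 1      # question and possible AI support defined; not tested
    DRAFT = 2        # inputs, outputs, review checklist, and example cases exist
    PILOT = 3        # tested with synthetic, historical, or partner-approved data
    OPERATIONAL = 4  # governance, monitoring, evaluation results, escalation rules documented


def cleared_for_operational_use(level: Readiness) -> bool:
    """Assumption: only a workflow at the Operational level is cleared."""
    return level == Readiness.OPERATIONAL


# A promising workflow at the Pilot level still requires review.
print(cleared_for_operational_use(Readiness.PILOT))        # False
print(cleared_for_operational_use(Readiness.OPERATIONAL))  # True
```

Using an ordered type keeps comparisons such as `Readiness.PILOT > Readiness.DRAFT` meaningful, which separates "how far along" from "how promising".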

Minimum evaluation record

Each workflow needs enough documentation for a reviewer, health department leader, or funder to understand what has been tested and what remains uncertain.

Record field | Purpose | Status
Intended use | Defines the task and user | Required
Excluded use | Prevents misuse and overreach | Required
Test cases | Exercises known failure modes | Required
Reviewer role | Assigns accountability | Required
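
One way to make the four required fields checkable is a small record type with a completeness check. The sketch below is illustrative only: `EvaluationRecord` and `missing_required_fields` are hypothetical names, the field types are assumptions, and the example values are placeholders rather than real workflow data.

```python
from dataclasses import dataclass, field, fields


@dataclass
class EvaluationRecord:
    """Hypothetical record mirroring the four required fields in the table above."""

    intended_use: str = ""                               # defines the task and user
    excluded_use: str = ""                               # prevents misuse and overreach
    test_cases: list[str] = field(default_factory=list)  # exercises known failure modes
    reviewer_role: str = ""                              # assigns accountability


def missing_required_fields(record: EvaluationRecord) -> list[str]:
    """Return the names of required fields that are still empty."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        # Treat empty or whitespace-only strings and empty lists as missing.
        if isinstance(value, str):
            value = value.strip()
        if not value:
            missing.append(f.name)
    return missing


# Placeholder values for illustration only.
record = EvaluationRecord(
    intended_use="Placeholder: task and intended user",
    reviewer_role="Placeholder: accountable reviewer",
)
print(missing_required_fields(record))  # ['excluded_use', 'test_cases']
```

A check like this only confirms that each field has been filled in. Whether the content is adequate remains a reviewer's judgment.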