Methodology
PI-Backed Claim Defensibility
How we verify AI-generated claims against prescribing information, categorizing findings as Supported, Ambiguous, or Not-in-PI for compliance review.
Last updated: January 2026
The Problem
AI systems generate confident answers about pharmaceutical products. But confidence doesn't equal accuracy. Large language models can hallucinate - generating claims that sound plausible but aren't supported by evidence.
For pharmaceutical brands, this creates serious risks:
- Compliance risk: AI might make efficacy claims beyond approved indications, suggest off-label uses, or omit required safety information[3, 4].
- Patient safety risk: Inaccurate dosing, interaction, or contraindication information could lead to adverse outcomes.
- Brand reputation risk: If AI spreads misinformation about your product, correcting it is harder than preventing it.
The FDA's Office of Prescription Drug Promotion (OPDP) monitors advertising for misleading claims - and AI-generated content represents a new frontier of concern. In 2023, OPDP issued enforcement letters for overstated efficacy and missing risk information[3].
PI-backed claim defensibility addresses this by verifying every AI-generated claim against your prescribing information - the FDA-approved source of truth for what can be said about your product.
What We Measure
Claim defensibility assessment categorizes every AI statement about your product:
- Supported: The claim directly matches content in your PI. The indication, efficacy data, dosing, or safety statement is accurate and can be defended with approved labeling.
- Ambiguous: The claim is partially supported or requires interpretation. The core fact may be accurate, but the phrasing, context, or emphasis differs from the PI in ways that could be questioned.
- Not-in-PI: The claim is not found in approved labeling. This includes off-label indications, invented efficacy data, fabricated studies, or safety omissions. Not-in-PI claims are compliance red flags.
Beyond claim classification, we also assess:
- Citation quality: When AI cites sources for claims, are those sources authoritative (government, academic, peer-reviewed) or low-quality (blogs, forums, commercial sites)?
- Safety balance: When AI mentions benefits, does it also include appropriate risk information? Fair balance is a regulatory requirement[4].
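As a simple illustration of how a safety balance screen might work, here is a minimal sketch in Python. The cue lists and the `lacks_fair_balance` function are illustrative assumptions, not the production logic; in practice this check rides on the semantic pipeline described below.

```python
# Illustrative fair-balance screen: flag responses that state benefits
# without any accompanying risk language. Cue lists are assumptions.
BENEFIT_CUES = {"effective", "improves", "reduces", "superior"}
RISK_CUES = {"warning", "adverse", "contraindicated", "risk", "side effect"}

def mentions(text: str, cues: set) -> bool:
    lowered = text.lower()
    return any(cue in lowered for cue in cues)

def lacks_fair_balance(response: str) -> bool:
    # Benefits asserted, no risk language anywhere in the response.
    return mentions(response, BENEFIT_CUES) and not mentions(response, RISK_CUES)

print(lacks_fair_balance("It effectively reduces symptoms."))  # True - flag for review
```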
How We Measure It
Claim verification is a multi-step process combining automated analysis with evidence-based classification.
Semantic Matching
AI systems don't quote your PI word-for-word - they paraphrase, summarize, and synthesize. Semantic matching compares meaning, not just keywords.
Our approach:
- Claim extraction: Parse AI responses to identify discrete claims about your product (indications, efficacy statements, dosing, safety, mechanisms, comparisons).
- PI section mapping: Match each claim to relevant PI sections (Indications and Usage, Dosage and Administration, Warnings and Precautions, Adverse Reactions, Clinical Studies, etc.).
- Semantic comparison: Use embedding models to compare claim meaning against PI text. High similarity suggests support; low similarity suggests potential mismatch.
- Confidence scoring: Assign confidence levels to each match, accounting for paraphrase distance, context differences, and potential ambiguity.
Semantic matching catches equivalent statements even when wording differs: "reduces tumor size by 30%" matches "demonstrated 30% tumor reduction" despite the different phrasing. The sketch below illustrates the comparison step.
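A minimal sketch of the semantic comparison step, assuming the open-source sentence-transformers library. The model name, sample passages, and variable names are illustrative choices, not the production configuration.

```python
# Compare a claim against candidate PI passages by embedding similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

claim = "Reduces tumor size by 30%"
pi_passages = [
    "Clinical studies demonstrated 30% tumor reduction versus placebo.",
    "The recommended dose is 10 mg once daily.",
]

claim_vec = model.encode(claim, convert_to_tensor=True)
pi_vecs = model.encode(pi_passages, convert_to_tensor=True)

# Cosine similarity between the claim and each PI passage; the best
# match and its score feed the downstream classification step.
scores = util.cos_sim(claim_vec, pi_vecs)[0]
best_idx = int(scores.argmax())
print(pi_passages[best_idx], float(scores[best_idx]))
```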
Claim Classification
Based on semantic matching results, each claim is classified:
Supported
High semantic similarity to PI content. The claim can be verified against approved labeling. Examples: correct indication, accurate efficacy data from clinical studies, proper safety information.
Ambiguous
Moderate similarity with caveats. The claim may be accurate but phrasing introduces uncertainty. Examples: comparative claims without proper context, benefit statements without balancing risks, rounded or approximated data.
Not-in-PI
Low similarity; claim not found in approved labeling. Compliance concern requiring immediate attention. Examples: off-label indications, fabricated efficacy data, invented studies, missing safety information.
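A minimal sketch of how similarity scores might map to these three categories. The threshold values are illustrative assumptions; real cutoffs would be calibrated against human-reviewed examples, and Ambiguous findings still require human judgment for final determination.

```python
# Illustrative mapping from semantic similarity to claim category.
# Thresholds are assumptions, not calibrated production values.
def classify_claim(similarity: float) -> str:
    if similarity >= 0.80:
        return "Supported"
    if similarity >= 0.55:
        return "Ambiguous"   # routed to human review
    return "Not-in-PI"       # compliance red flag

print(classify_claim(0.91))  # Supported
print(classify_claim(0.62))  # Ambiguous
print(classify_claim(0.20))  # Not-in-PI
```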
Classification includes evidence links to the specific PI sections (or lack thereof) that inform the determination. This supports Medical/MLR review and regulatory documentation.
Citation Quality Heuristics
When AI cites sources for claims, citation quality matters. Our heuristics categorize source authority:
- Government (.gov): FDA, CDC, NIH, ClinicalTrials.gov - highest authority
- Academic (.edu): University research, medical school publications - high authority
- Peer-reviewed journals: NEJM, JAMA, Lancet, specialty journals - high authority
- Professional organizations: ASCO, AHA, specialty society guidelines - high authority
- Company sources: Your own website, press releases, PI - legitimate but promotional
- Competitor sources: Competitor marketing, sponsored content - potentially biased
- Low-authority: Blogs, forums, Wikipedia, news articles without medical review - caution
- Unknown: Source cannot be verified or accessed - flag for review
Citation quality informs prioritization. A Not-in-PI claim citing a government source requires different handling than one citing a random blog.
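A minimal sketch of a domain-based authority heuristic covering a few of the tiers above. The domain sets and the `source_authority` function are illustrative assumptions; a production version would cover the full hierarchy, including professional organizations and company or competitor sources.

```python
# Illustrative source-authority tiers keyed off the citation's domain.
from urllib.parse import urlparse

JOURNALS = {"nejm.org", "jamanetwork.com", "thelancet.com"}
LOW_AUTHORITY = {"reddit.com", "quora.com", "wikipedia.org"}

def source_authority(url: str) -> str:
    host = urlparse(url).netloc.lower().removeprefix("www.")
    if not host:
        return "unknown"        # cannot be verified - flag for review
    if host.endswith(".gov"):
        return "government"
    if host.endswith(".edu"):
        return "academic"
    if host in JOURNALS:
        return "peer-reviewed"
    if host in LOW_AUTHORITY:
        return "low-authority"
    return "unknown"

print(source_authority("https://www.fda.gov/drugs"))  # government
print(source_authority("https://www.reddit.com/r/askdocs"))  # low-authority
```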
Measurement Outputs
PI-backed claim verification produces:
- Truth Alignment Score: A normalized 0-100 score reflecting the proportion of claims classified as Supported (computed as sketched below). Higher scores indicate better alignment with approved messaging.
- Claim-by-claim breakdown: Each identified claim with its classification (Supported/Ambiguous/Not-in-PI), evidence links, and confidence level.
- Not-in-PI report: Prioritized list of claims requiring immediate attention, with exact AI response text, provider, and citation context.
- Citation quality map: Visualization of source authority for claims about your brand, highlighting where AI relies on low-quality sources.
- Safety balance assessment: Analysis of whether AI responses include appropriate risk information alongside benefit claims.
All outputs include evidence for verification. No black boxes - you can see exactly what AI said, how it was classified, and why.
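As a worked illustration, here is a minimal sketch of how a Truth Alignment Score could be derived from claim classifications, assuming a simple Supported-share formula scaled to 0-100. The aggregation is illustrative, not the exact production formula.

```python
# Illustrative Truth Alignment Score: Supported share of all claims.
from collections import Counter

classifications = ["Supported", "Supported", "Ambiguous", "Not-in-PI", "Supported"]

counts = Counter(classifications)
score = 100 * counts["Supported"] / len(classifications)
print(f"Truth Alignment Score: {score:.0f}")  # 60
```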
Workflow: Flag → Evidence → MLR Review → Publish → Retest
PI-backed verification integrates with pharma compliance workflows:
- Flag: AI Pulse identifies claims classified as Ambiguous or Not-in-PI. High-priority findings (Not-in-PI with high visibility) are flagged for immediate attention.
- Evidence: Each flagged claim includes the exact AI response, provider source, relevant PI sections, semantic matching analysis, and citation context. This evidence package supports efficient review.
- MLR Review: Findings route to Medical Affairs and MLR teams through the governance queue. Clear ownership, due dates, and escalation paths ensure accountability.
- Publish Fixes: Based on review, teams may publish corrective content, update medical education materials, or engage with source publishers. Actions are logged in the audit trail.
- Retest: After fixes are published, AI Pulse reruns relevant queries to verify improvement. Delta metrics show whether AI responses have improved and Not-in-PI claims have been addressed.
This creates a closed loop: identify → document → review → fix → verify. The audit trail documents the entire process for regulatory inquiry.
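A minimal sketch of what a flagged-claim record moving through this loop might look like. The `Finding` class, stage names, and audit-trail format are illustrative assumptions, not the actual data model.

```python
# Illustrative finding record: every stage transition is timestamped
# so the audit trail documents the full review loop.
from dataclasses import dataclass, field
from datetime import datetime, timezone

STAGES = ["flagged", "evidence_attached", "mlr_review", "fix_published", "retested"]

@dataclass
class Finding:
    claim_text: str
    classification: str              # Supported / Ambiguous / Not-in-PI
    stage: str = "flagged"
    audit_trail: list = field(default_factory=list)

    def advance(self, stage: str, note: str) -> None:
        assert stage in STAGES, f"unknown stage: {stage}"
        self.stage = stage
        self.audit_trail.append(
            (datetime.now(timezone.utc).isoformat(), stage, note)
        )

finding = Finding("Approved for pediatric use", "Not-in-PI")
finding.advance("mlr_review", "Routed to Medical Affairs for determination")
print(finding.audit_trail)
```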
How Teams Use This
PI-backed verification supports specific team functions:
- Medical Affairs: Primary owners for Not-in-PI claims. Review evidence, determine if claims require correction, and coordinate with MLR. Use findings to identify gaps in accessible medical education content.
- Regulatory/MLR: Review flagged claims for compliance implications. Document determinations in audit trail. Prioritize based on risk level and visibility (high-volume queries get more scrutiny).
- Brand Marketing: Use Truth Alignment Score as a brand health metric alongside share-of-answer. Ensure marketing content strategy addresses accuracy gaps, not just visibility gaps.
- Communications: Prepare for questions about AI accuracy. If media or stakeholders ask "What is ChatGPT saying about your drug?", have documented evidence of monitoring and response.
Common Pitfalls
Claim verification requires careful implementation. Common pitfalls:
- Over-literal matching: Keyword matching misses paraphrased claims. "30% improvement" and "improved by about a third" are semantically equivalent but share no keywords (see the sketch after this list). Semantic matching is essential.
- Ignoring context: A claim might be accurate for one indication but not another. Context-aware matching considers the full query and response, not just isolated statements.
- Binary thinking: Not all claims are clearly Supported or Not-in-PI. The Ambiguous category captures nuance and requires human judgment for final determination.
- Missing safety balance: A claim might be technically accurate but violate fair balance by emphasizing benefits without risks. Assessment must consider the complete response, not just individual claims.
- Static verification: AI answers change; PI updates with label changes. Verification must be continuous, with retesting after each PI update and ongoing monitoring of AI responses.
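To make the over-literal matching pitfall concrete, here is a minimal sketch showing that a token-overlap baseline finds nothing to match between two equivalent statements; the example strings are taken from the pitfall above.

```python
# Token overlap between two semantically equivalent claims is empty,
# which is why keyword matching alone misses paraphrased claims.
a = "30% improvement"
b = "improved by about a third"

overlap = set(a.lower().split()) & set(b.lower().split())
print(overlap)  # set() - no shared tokens despite equivalent meaning
```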
Why This Is Different from SEO/Social Listening
Traditional monitoring approaches don't address claim accuracy in AI:
| Dimension | SEO | Social Listening | PI-Backed Verification |
|---|---|---|---|
| Accuracy checking | None | None | Semantic matching to PI |
| Compliance focus | Ranking only | Sentiment only | Supported/Ambiguous/Not-in-PI |
| Evidence for MLR | Not applicable | Not applicable | Full audit trail |
| Source quality | Domain authority | Not assessed | Medical authority hierarchy |
| Regulatory utility | Low | Low | High (OPDP, MLR support) |
SEO tells you where pages rank. Social listening tells you what people are saying. Neither tells you whether AI claims about your product are accurate, compliant, or defensible under regulatory scrutiny[3, 4].
PI-backed verification fills this gap - essential for any pharmaceutical brand where accuracy isn't optional, it's a regulatory requirement.
Citations
- [1] OpenAI - Introducing ChatGPT Health (Jan 7, 2026). https://openai.com/index/introducing-chatgpt-health/
- [2] Healthcare Dive - More than 40 million people ask ChatGPT healthcare questions every day (Jan 6, 2026). https://www.healthcaredive.com/news/40-million-use-chatgpt-health-questions-openai/808861/
- [3] Covington - 2023 End-of-Year Summary of FDA Advertising and Promotion Enforcement Activity (Jul 22, 2024). https://www.cov.com/en/news-and-insights/insights/2024/07/2023-end-of-year-summary-of-fda-advertising-and-promotion-enforcement-activity
- [4] FDA OPDP - The Brief Summary (Jan 2025, PDF). https://www.fda.gov/media/185040/download
- [5] Fierce Healthcare - 40M people use ChatGPT to get answers to healthcare questions (Jan 5, 2026). https://www.fiercehealthcare.com/ai-and-machine-learning/40m-people-use-chatgpt-answer-healthcare-questions-openai-says
- [6] IQVIA - Case study: 27% increase in therapy starts through AI-powered precise patient identification (Apr 20, 2023). https://www.iqvia.com/library/case-studies/increasing-therapy-starts-though-ai-powered-precise-patient-identification-field-alerts
- [7] ZS - Unified engagement / omnichannel context (Apr 15, 2025). https://www.zs.com/insights/unified-engagement-goal-pharma-marketing