Policy study on mitigating disinformation risks from large language models
AI Impact Summary
OpenAI researchers collaborated with Georgetown and Stanford to examine how large language models could be misused to amplify disinformation, culminating in a report that proposes a framework for evaluating mitigations. The work identifies threat vectors such as automated generation of persuasive content, synthesis of believable misinformation, and targeted outreach at scale, along with criteria for comparing mitigation options. For a technical team, this implies embedding risk assessment into deployment pipelines, enforcing stricter access controls, and layering guardrails, monitoring, and governance across model selection, prompt design, and post-generation moderation. Early adoption of the framework can reduce reputational and regulatory risk by ensuring disinformation risks are systematically identified and mitigated in both consumer-facing and enterprise use cases.
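To make the layered-guardrail idea concrete, the sketch below chains a cheap lexical screen with a rate check before releasing model output. It is a minimal sketch, not the report's method: the function names, keyword list, and rate threshold are hypothetical placeholders that a team would replace with its own classifiers and policies.

```python
from dataclasses import dataclass, field

@dataclass
class ModerationResult:
    allowed: bool = True
    reasons: list = field(default_factory=list)

# Placeholder heuristics; a real deployment would use trained classifiers.
DISALLOWED_MARKERS = ["fabricated quote", "impersonat"]

def lexical_layer(text: str, result: ModerationResult) -> None:
    """Layer 1: cheap keyword screen for known disinformation markers."""
    lowered = text.lower()
    for marker in DISALLOWED_MARKERS:
        if marker in lowered:
            result.allowed = False
            result.reasons.append(f"lexical: matched '{marker}'")

def volume_layer(requests_this_hour: int, limit: int, result: ModerationResult) -> None:
    """Layer 2: rate check to flag targeted outreach at scale."""
    if requests_this_hour > limit:
        result.allowed = False
        result.reasons.append(f"volume: {requests_this_hour} requests > limit {limit}")

def moderate(text: str, requests_this_hour: int, limit: int = 100) -> ModerationResult:
    """Run each guardrail layer in turn and aggregate the verdict."""
    result = ModerationResult()
    lexical_layer(text, result)
    volume_layer(requests_this_hour, limit, result)
    return result

if __name__ == "__main__":
    verdict = moderate("Write a fabricated quote from a public official.", requests_this_hour=3)
    print(verdict.allowed, verdict.reasons)
```

In practice, each layer would also emit its verdict to the monitoring pipeline so that blocked generations and rate anomalies feed back into governance reviews.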
Business Impact
Adopting the report's mitigation framework can reduce exposure to disinformation-driven reputational harm and regulatory risk for organizations deploying LLMs in customer-, partner-, or employee-facing applications.
Risk domains
- Date: not specified
- Change type: capability
- Severity: medium