Ethical Framework for Using LLMs in Non-Fiction

How can we use AI ethically in work readers will treat as factual?

Front matter: what this is, how it was made, and limits

What this document is: A practical ethics + workflow guide for using LLMs to draft and edit non-fiction.

How this document was made (LLM disclosure): This guide was drafted with an LLM (ChatGPT) and then tightened against a small set of public, primary references listed at the end, overseen by Jonathan Frost.

Important limits (read this):

  • Not legal advice. Copyright, privacy, and disclosure duties vary by jurisdiction and contract (publisher/client/platform terms). If your work is high-stakes or commercial, get qualified review.
  • “Publicly available” ≠ “free to reuse.” Copyright and website terms can still apply.
  • LLMs can fabricate facts and citations. Build your process assuming errors will occur. [2]

1) Purpose and scope

Scope: Any output readers might treat as factual: reports, books, articles, policy briefs, educational content, explainers, biographies, case studies.

Goal: Use LLMs to improve productivity without weakening:

  • factual accuracy,
  • source integrity,
  • rights compliance,
  • reader trust.

Baseline rule: An LLM is a drafting/synthesis tool, not an authority.

2) Core principles (non-negotiables)

  1. Truth over fluency
    • If you can’t verify a claim, label it as uncertain or remove it.
    • Don’t publish “sounds right” facts.
  2. Traceability (“no source, no claim”)
    • Every material factual claim should be traceable to a source you actually accessed.
    • Keep a source log and a claim-to-source map (template below).
  3. Transparency
    • Disclose meaningful LLM involvement where it affects trust (research, synthesis, summaries, translation, or substantive rewriting).
  4. Rights respect
    • Don’t use an LLM to copy or lightly paraphrase copyrighted text, evade paywalls, or “launder” proprietary material.
  5. Human accountability
    • A human author/editor owns responsibility for accuracy, attribution, and harm reduction—consistent with risk management expectations for AI use. [3]

3) Sourcing and referencing rules

A. Evidence tiers (useful for enforcement)

  • Tier 1 (strong): primary documents, official datasets, peer-reviewed research, transcripts/recordings.
  • Tier 2 (medium): reputable journalism, well-sourced books, institutional reports.
  • Tier 3 (weak): unsourced blogs, anonymous posts, single unverified claims.

Policy: Tier 3 cannot support major claims; Tier 2 needs corroboration for high-stakes assertions.
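
For teams that track claims in a structured way, the policy above can be made mechanically checkable. The following is a minimal sketch under assumed conventions (a "major" stakes label, integer tiers, a two-source corroboration threshold); these are illustration choices, not part of the policy itself.

  # Illustrative check for the tier policy above. The stakes label, the
  # integer tiers, and the two-source corroboration threshold are
  # assumptions made for this sketch, not a standard.
  def tier_policy_ok(stakes: str, source_tiers: list[int]) -> bool:
      """Return True if the cited sources satisfy the tier policy for one claim."""
      if not source_tiers:
          return False  # "no source, no claim"
      if stakes == "major":
          if all(tier == 3 for tier in source_tiers):
              return False  # Tier 3 alone cannot carry a major claim
          if 1 not in source_tiers:
              # Only Tier 2 support: require corroboration (two independent Tier 2 sources).
              return sum(1 for tier in source_tiers if tier == 2) >= 2
      return True

  # A major claim backed by a single Tier 2 source fails the check; add a
  # second independent Tier 2 source or a Tier 1 document.
  assert tier_policy_ok("major", [2]) is False
  assert tier_policy_ok("major", [2, 2]) is True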

B. Referencing rules you can actually follow

  • Direct quotes: only from sources you personally retrieved and checked. (Never quote text “as given by the model” unless you verified it against the original.)
  • Paraphrases: require a source and must preserve meaning (no “citation laundering” where you find a vaguely related source after the fact).
  • Attribution: for interpretation, dispute, or uncertainty, attribute clearly (“According to…”, “X argues…”).

4) Publicly available information: ethical use constraints

Rule: Online access does not equal reuse rights. Even if content is “public,” it may still be copyrighted, contract-restricted (terms of service), or privacy-sensitive.

Privacy & harm minimization:

  • Avoid publishing sensitive personal info (contact details, medical info, etc.) unless there’s a strong public-interest justification and you minimize harm (redact/aggregate where possible).
  • For private individuals, use a higher bar than “it’s online.”

5) Hallucinations and error risk: process design

LLMs can generate plausible falsehoods and fabricated citations; verification is mandatory for factual work. [1] [2]

A. Verification-first workflow (recommended)

  1. Outline claims first: separate what must be factual from narrative/interpretation.
  2. Collect sources first: gather the documents you’ll rely on.
  3. Use the model to draft from those sources: structure, summarize, propose wording, identify gaps.
  4. Verify and annotate: check each material claim against sources; mark confidence.
  5. Final editorial pass: look for overreach, missing caveats, misleading framing.

B. High-risk claim categories (extra checks)

Require double-checking (and ideally a second reviewer); a rough automated flagging sketch appears after this list:

  • numbers/statistics, dates, names/titles,
  • direct quotes,
  • medical/legal/financial statements,
  • allegations about people or organizations.
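
A rough automated pass can help surface sentences in these categories before review; it is an aid, not a substitute for the checks above. The patterns below are illustrative heuristics only and will both miss cases and over-flag.

  import re

  # Rough, illustrative patterns for sentences that deserve extra verification.
  HIGH_RISK_PATTERNS = {
      "number/statistic": re.compile(r'\b\d[\d,.]*\s*(%|percent|million|billion)?'),
      "year/date": re.compile(r'\b(19|20)\d{2}\b'),
      "direct quote": re.compile(r'["“].+?["”]'),
      "medical/legal/financial": re.compile(r'\b(diagnos|lawsuit|liabil|interest rate|dosage)', re.I),
  }

  def flag_high_risk(sentences):
      """Return (category, sentence) pairs that warrant double-checking."""
      flagged = []
      for sentence in sentences:
          for category, pattern in HIGH_RISK_PATTERNS.items():
              if pattern.search(sentence):
                  flagged.append((category, sentence))
      return flagged

  draft = ["Revenue grew 40% in 2021.", "The method is widely used."]
  for category, sentence in flag_high_risk(draft):
      print(f"[check: {category}] {sentence}")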

C. “Never do this” list

  • Don’t publish citations the model generated unless you verified they exist and support the claim.
  • Don’t “reconstruct” quotes you can’t locate.
  • Don’t let the model be the only “researcher.”

6) IP and rights: rules + jurisdiction flags

A. Baseline IP ethic

  • Don’t copy protected text (or close-paraphrase) without permission or a defensible exception.
  • Avoid output that could substitute for the original work (especially paid content).

B. Jurisdiction-dependent frameworks (you must label this in your policy)

  • United States (fair use): evaluated case-by-case under statutory factors; no fixed “safe” word count rule. [4]
  • United Kingdom (fair dealing): purpose-specific exceptions; “fair dealing” is judged by context. [5]
  • International baseline (quotation right concept): quotations are generally permitted only if compatible with “fair practice” and limited to what the purpose justifies (implementation varies by country). [6]

C. Text and data mining (TDM) and “public web” content (jurisdiction-dependent, evolving)

  • In the EU, the DSM Directive includes TDM exceptions with a rights-reservation (opt-out) concept for certain uses; for content made available online, the reservation may need to be expressed in a machine-readable way (details depend on implementation and interpretation). [7]
  • In the UK, government consultation materials discuss approaches including opt-out-style mechanisms and transparency measures (the policy is still evolving). [8]

Practical policy: Treat training/crawling/large-scale ingestion as a separate legal/contractual question from “quoting and citing.” Don’t assume you can ingest or reuse just because it’s online.
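
One operational habit that fits this policy: before fetching a page for ingestion or summarization, at least honor the site's robots.txt. This sketch uses Python's standard-library parser; the "MyResearchBot" user agent is a placeholder, and a passing check is only a courtesy signal, not an answer to copyright, terms-of-service, or rights-reservation questions.

  from urllib.parse import urlsplit
  from urllib.robotparser import RobotFileParser

  def may_fetch(page_url: str, user_agent: str = "MyResearchBot") -> bool:
      """Return True if the site's robots.txt allows this user agent to fetch the URL.

      A True result is a courtesy check only; it says nothing about copyright,
      terms of service, or machine-readable rights reservations under EU/UK rules.
      """
      parts = urlsplit(page_url)
      parser = RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
      parser.read()  # fetches the site's robots.txt over the network
      return parser.can_fetch(user_agent, page_url)

  # Example (placeholder user agent; requires network access):
  # print(may_fetch("https://example.com/some-article"))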

D. Style imitation and misrepresentation

  • Don’t publish work that implies a human expert, journalist, or witness did reporting they didn’t do.
  • Avoid publishing work written “in the voice of” a living author; it risks deception and brand/rights issues.

7) Disclosure to readers: when and how

Default: Disclose LLM use when it meaningfully affects trust: research assistance, source summarization, translation, substantive drafting, or rewriting.

Example disclosure text (short):

“This work used language-model assistance for drafting/editing. All factual claims and quotations were verified against the cited sources by the author.”

Stricter disclosure triggers:

  • the piece resembles investigative reporting,
  • the topic is high-stakes,
  • the model generated any material factual claims that required verification.

8) Operational controls (so it’s enforceable)

A. Required artifacts (lightweight but effective)

  1. Source log (what you read and used)
  2. Claim table (what you assert and why; a minimal sketch of both follows)
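
As a concrete starting point, both artifacts can live in flat CSV files. The column names below are assumptions chosen for this guide, not a required schema; keep whatever columns your verification workflow actually needs.

  import csv

  # Illustrative column layouts for the two artifacts. Names are assumptions
  # made for this sketch, not a standard.
  SOURCE_LOG_FIELDS = ["source_id", "title", "author", "url_or_location",
                       "date_accessed", "evidence_tier", "notes"]
  CLAIM_TABLE_FIELDS = ["claim_id", "claim_text", "source_ids", "stakes",
                        "verified_by", "confidence", "caveats"]

  def start_log(path, fields):
      """Create an empty CSV containing only the given header row."""
      with open(path, "w", newline="", encoding="utf-8") as f:
          csv.writer(f).writerow(fields)

  start_log("source_log.csv", SOURCE_LOG_FIELDS)
  start_log("claim_table.csv", CLAIM_TABLE_FIELDS)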

B. Two-pass review

  • Pass 1: factual integrity (claims, numbers, quotes, attributions)
  • Pass 2: interpretive fairness (framing, omissions, loaded language)

C. Corrections policy

Publish a visible mechanism for error reporting and correction; keep a change log for factual fixes.

Micro claim-to-source table (self-compliance)

Claim used in this guide | Why it matters | Ref
LLMs can produce plausible falsehoods (“hallucinations”). | Justifies verification-first workflow. | [1]
LLMs may fabricate citations/references. | Supports “don’t trust model citations” rule. | [2]
Responsible AI use calls for governance/accountability. | Supports “human accountability” and controls. | [3]
US fair use is case-by-case; no fixed word-count rule. | Prevents fake certainty in IP rules. | [4]
UK fair dealing is context-specific. | Prevents overgeneralizing UK exceptions. | [5]
International quotation norm includes “fair practice” and purpose-limited extent. | Grounds quote/extent guidance. | [6]
EU DSM Directive includes TDM exceptions and rights reservation. | Grounds “public web isn’t free to mine” warning. | [7]
UK copyright-and-AI policy is under consultation/active development. | Justifies “evolving” caveat. | [8]

References

  1. OpenAI — Why language models hallucinate (explains hallucinations as a model behavior and why it happens).
  2. OpenAI Help Center — Does ChatGPT tell the truth? (notes errors and hallucinations; cautions about reliability).
  3. NIST — AI Risk Management Framework (AI RMF 1.0) (risk management, governance, accountability concepts for trustworthy AI).
  4. U.S. Copyright Office — Fair Use (fair use overview; emphasizes case-by-case analysis, no fixed “safe” amount).
  5. UK Government — Exceptions to copyright (fair dealing and other exceptions guidance).
  6. Berne Convention (via Cornell LII) — Article 10 on quotations and “fair practice” (international baseline concept; implementation varies).
  7. EUR-Lex — Directive (EU) 2019/790 (DSM Directive) (includes TDM provisions and rights reservation concepts).
  8. UK Government — Copyright and artificial intelligence consultation materials (shows policy is active/evolving).

Meta Check on ChatGPT

ChatGPT can’t reliably list the “original material it’s based on” because it isn’t a system that stores documents and retrieves them by title. During training, the model’s internal parameters (“weights”) are adjusted so it becomes better at predicting likely text. What it retains is a distributed statistical representation of patterns across a huge amount of text—not a library of identifiable passages with a usable index back to specific books, articles, or webpages. OpenAI describes this as models not storing or retaining copies of the training data, and instead learning via parameter updates. [1]

Even when the model produces something that resembles a known phrase, there’s usually no clean, inspectable trail like “this sentence came from source X, page Y.” A response is generated from many overlapping influences in those weights plus the immediate prompt, not from a deterministic lookup. This is also part of why the model can sound confident while being wrong: the system is optimized to produce plausible continuations, and standard training/evaluation incentives can reward “best guess” answers over explicitly stating uncertainty—leading to hallucinations (confident, false outputs) that are not grounded in any real source at all. [2]

Finally, the training pipeline is typically a mixture of data types and stages (e.g., broad pretraining plus later fine-tuning and safety tuning), and providers often describe it at a high level rather than enumerating every item, in part because exhaustive lists would be enormous and can involve licensed or otherwise restricted material. OpenAI’s system cards, for example, focus on capability and safety characteristics and describe the development process at a high level, not as a per-document bibliography. [3]

Meta Check References

  1. OpenAI Help Center — How ChatGPT and our language models are developed (explains that models do not store/retain copies of training data; learn via parameter updates).
  2. OpenAI — Why language models hallucinate (explains hallucinations and how training/evaluation can reward guessing over uncertainty).
  3. OpenAI — GPT-4 System Card (example of high-level model development/safety documentation rather than an itemized source list).