Category: AI

  • Ethical Framework For Using LLMs In Non-Fiction

    How can we use AI for good, ethically?

    Front matter: what this is, how it was made, and limits

    What this document is: A practical ethics + workflow guide for using LLMs to draft and edit non-fiction.

    How this document was made (LLM disclosure): This guide was drafted with an LLM (ChatGPT) and then tightened against a small set of public, primary references listed at the end, overseen by Jonathan Frost.

    Important limits (read this):

    • Not legal advice. Copyright, privacy, and disclosure duties vary by jurisdiction and contract (publisher/client/platform terms). If your work is high-stakes or commercial, get qualified review.
    • “Publicly available” ≠ “free to reuse.” Copyright and website terms can still apply.
    • LLMs can fabricate facts and citations. Build your process assuming errors will occur. [2]

    1) Purpose and scope

    Scope: Any output readers might treat as factual: reports, books, articles, policy briefs, educational content, explainers, biographies, case studies.

    Goal: Use LLMs to improve productivity without weakening:

    • factual accuracy,
    • source integrity,
    • rights compliance,
    • reader trust.

    Baseline rule: An LLM is a drafting/synthesis tool, not an authority.

    2) Core principles (non-negotiables)

    1. Truth over fluency
      • If you can’t verify a claim, label it as uncertain or remove it.
      • Don’t publish “sounds right” facts.
    2. Traceability (“no source, no claim”)
      • Every material factual claim should be traceable to a source you actually accessed.
      • Keep a source log and a claim-to-source map (template below).
    3. Transparency
      • Disclose meaningful LLM involvement where it affects trust (research, synthesis, summaries, translation, or substantive rewriting).
    4. Rights respect
      • Don’t use an LLM to copy or lightly paraphrase copyrighted text, evade paywalls, or “launder” proprietary material.
    5. Human accountability
      • A human author/editor owns responsibility for accuracy, attribution, and harm reduction—consistent with risk management expectations for AI use. [3]

    3) Sourcing and referencing rules

    A. Evidence tiers (useful for enforcement)

    • Tier 1 (strong): primary documents, official datasets, peer-reviewed research, transcripts/recordings.
    • Tier 2 (medium): reputable journalism, well-sourced books, institutional reports.
    • Tier 3 (weak): unsourced blogs, anonymous posts, single unverified claims.

    Policy: Tier 3 cannot support major claims; Tier 2 needs corroboration for high-stakes assertions.

    B. Referencing rules you can actually follow

    • Direct quotes: only from sources you personally retrieved and checked. (Never quote text “as given by the model” unless you verified it against the original.)
    • Paraphrases: require a source and must preserve meaning (no “citation laundering” where you find a vaguely related source after the fact).
    • Attribution: for interpretation, dispute, or uncertainty, attribute clearly (“According to…”, “X argues…”).

    4) Publicly available information: ethical use constraints

    Rule: Online access does not equal reuse rights. Even if content is “public,” it may still be copyrighted, contract-restricted (terms of service), or privacy-sensitive.

    Privacy & harm minimization:

    • Avoid publishing sensitive personal info (contact details, medical info, etc.) unless there’s a strong public-interest justification and you minimize harm (redact/aggregate where possible).
    • For private individuals, use a higher bar than “it’s online.”

    5) Hallucinations and error risk: process design

    LLMs can generate plausible falsehoods and fabricated citations; verification is mandatory for factual work. [1] [2]

    A. Verification-first workflow (recommended)

    1. Outline claims first: separate what must be factual from narrative/interpretation.
    2. Collect sources first: gather the documents you’ll rely on.
    3. Use the model to draft from those sources: structure, summarize, propose wording, identify gaps.
    4. Verify and annotate: check each material claim against sources; mark confidence.
    5. Final editorial pass: look for overreach, missing caveats, misleading framing.
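The workflow above can be captured in a lightweight claim-tracking structure. A minimal Python sketch (the class names, tiers, and statuses are illustrative, not a prescribed schema):

```python
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    """Evidence tiers from section 3A."""
    STRONG = 1   # primary documents, official datasets, peer review
    MEDIUM = 2   # reputable journalism, institutional reports
    WEAK = 3     # unsourced blogs, anonymous posts

class Status(Enum):
    VERIFIED = "verified"      # checked against a source you accessed
    UNCERTAIN = "uncertain"    # label as uncertain or remove
    UNVERIFIED = "unverified"  # must not ship as fact

@dataclass
class Claim:
    text: str
    sources: list = field(default_factory=list)  # (citation, Tier) pairs
    status: Status = Status.UNVERIFIED

def publishable(claim: Claim) -> bool:
    """'No source, no claim': a material claim needs at least one
    non-weak source and a human verification pass."""
    has_support = any(t != Tier.WEAK for _, t in claim.sources)
    return has_support and claim.status is Status.VERIFIED

# A claim backed only by a Tier-3 blog post fails the gate.
c = Claim("X rose 40% in 2020", sources=[("some blog", Tier.WEAK)])
assert not publishable(c)

# Adding a strong source and a verification pass clears it.
c.sources.append(("official dataset", Tier.STRONG))
c.status = Status.VERIFIED
assert publishable(c)
```

The point of the sketch is the gate function: drafting can proceed in any order, but nothing leaves the pipeline without a source and a human check.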

    B. High-risk claim categories (extra checks)

    Require double-checking (and ideally a second reviewer):

    • numbers/statistics, dates, names/titles,
    • direct quotes,
    • medical/legal/financial statements,
    • allegations about people or organizations.

    C. “Never do this” list

    • Don’t publish citations the model generated unless you verified they exist and support the claim.
    • Don’t “reconstruct” quotes you can’t locate.
    • Don’t let the model be the only “researcher.”

    6) IP and rights: rules + jurisdiction flags

    A. Baseline IP ethic

    • Don’t copy protected text (or close-paraphrase) without permission or a defensible exception.
    • Avoid output that could substitute for the original work (especially paid content).

    B. Jurisdiction-dependent frameworks (you must label this in your policy)

    • United States (fair use): evaluated case-by-case under statutory factors; no fixed “safe” word count rule. [4]
    • United Kingdom (fair dealing): purpose-specific exceptions; “fair dealing” is judged by context. [5]
    • International baseline (quotation right concept): quotations are generally permitted only if compatible with “fair practice” and limited to what the purpose justifies (implementation varies by country). [6]

    C. Text and data mining (TDM) and “public web” content (jurisdiction-dependent, evolving)

    • In the EU, the DSM Directive includes TDM exceptions with a rights reservation/opt-out concept for certain uses; online rights reservations may be required to be machine-readable (details depend on implementation and interpretation). [7]
    • In the UK, government consultation materials discuss approaches including opt-out style mechanisms and transparency measures (policy not static). [8]

    Practical policy: Treat training/crawling/large-scale ingestion as a separate legal/contractual question from “quoting and citing.” Don’t assume you can ingest or reuse just because it’s online.

    D. Style imitation and misrepresentation

    • Don’t publish work that implies a human expert, journalist, or witness did reporting they didn’t do.
    • Avoid “in the voice of a living author” for publication; it risks deception and brand/rights issues.

    7) Disclosure to readers: when and how

    Default: Disclose LLM use when it meaningfully affects trust: research assistance, source summarization, translation, substantive drafting, or rewriting.

    Example disclosure text (short):

    “This work used language-model assistance for drafting/editing. All factual claims and quotations were verified against the cited sources by the author.”

    Stricter disclosure triggers:

    • the piece resembles investigative reporting,
    • the topic is high-stakes,
    • the model generated any material factual claims that required verification.

    8) Operational controls (so it’s enforceable)

    A. Required artifacts (lightweight but effective)

    1. Source log (what you read and used)
    2. Claim table (what you assert and why)
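Both artifacts work fine as plain CSV files. A sketch of possible column layouts (the column names are suggestions, not a standard):

```python
import csv
import io

# Illustrative schemas; adapt to your project.
SOURCE_LOG_COLS = ["id", "title", "url_or_location", "date_accessed", "tier", "notes"]
CLAIM_TABLE_COLS = ["claim", "where_used", "source_ids", "verified_by", "status"]

# Write one example claim-table row to an in-memory buffer.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(CLAIM_TABLE_COLS)
writer.writerow(["LLMs can fabricate citations", "section 5", "S1;S2",
                 "editor", "verified"])
print(buf.getvalue())
```

Keeping the claim table's `source_ids` column pointing at rows in the source log is what makes every claim traceable after the fact.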

    B. Two-pass review

    • Pass 1: factual integrity (claims, numbers, quotes, attributions)
    • Pass 2: interpretive fairness (framing, omissions, loaded language)

    C. Corrections policy

    Publish a visible mechanism for error reporting and correction; keep a change log for factual fixes.

    Micro claim-to-source table (self-compliance)

| Claim used in this guide | Why it matters | Ref |
| --- | --- | --- |
| LLMs can produce plausible falsehoods (“hallucinations”). | Justifies verification-first workflow. | [1] |
| LLMs may fabricate citations/references. | Supports “don’t trust model citations” rule. | [2] |
| Responsible AI use calls for governance/accountability. | Supports “human accountability” and controls. | [3] |
| US fair use is case-by-case; no fixed word-count rule. | Prevents fake certainty in IP rules. | [4] |
| UK fair dealing is context-specific. | Prevents overgeneralizing UK exceptions. | [5] |
| International quotation norm includes “fair practice” and purpose-limited extent. | Grounds quote/extent guidance. | [6] |
| EU DSM Directive includes TDM exceptions and rights reservation. | Grounds “public web isn’t free to mine” warning. | [7] |
| UK copyright-and-AI policy is under consultation/active development. | Justifies “evolving” caveat. | [8] |

    References

    1. OpenAI — Why language models hallucinate (explains hallucinations as a model behavior and why it happens).
    2. OpenAI Help Center — Does ChatGPT tell the truth? (notes errors and hallucinations; cautions about reliability).
    3. NIST — AI Risk Management Framework (AI RMF 1.0) (risk management, governance, accountability concepts for trustworthy AI).
    4. U.S. Copyright Office — Fair Use (fair use overview; emphasizes case-by-case analysis, no fixed “safe” amount).
    5. UK Government — Exceptions to copyright (fair dealing and other exceptions guidance).
    6. Berne Convention (via Cornell LII) — Article 10 on quotations and “fair practice” (international baseline concept; implementation varies).
    7. EUR-Lex — Directive (EU) 2019/790 (DSM Directive) (includes TDM provisions and rights reservation concepts).
    8. UK Government — Copyright and artificial intelligence consultation materials (shows policy is active/evolving).

    Meta Check on ChatGPT

    ChatGPT can’t reliably list the “original material it’s based on” because it isn’t a system that stores documents and retrieves them by title. During training, the model’s internal parameters (“weights”) are adjusted so it becomes better at predicting likely text. What it retains is a distributed statistical representation of patterns across a huge amount of text—not a library of identifiable passages with a usable index back to specific books, articles, or webpages. OpenAI describes this as models not storing or retaining copies of the training data, and instead learning via parameter updates. [1]
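As a deliberately tiny analogy (a toy bigram counter, not the actual mechanism inside an LLM), even the simplest statistical language model ends up holding aggregated counts rather than retrievable documents:

```python
from collections import Counter, defaultdict

# Two "training documents".
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

# "Training" here is just counting which word follows which.
bigrams = defaultdict(Counter)
for doc in corpus:
    words = doc.split()
    for a, b in zip(words, words[1:]):
        bigrams[a][b] += 1

# The resulting "model" is only these counts; the original sentences
# are gone. "sat" was followed by "on" twice, but nothing records
# which document contributed which count.
print(bigrams["sat"])  # Counter({'on': 2})
```

Real LLMs are vastly more complex, but the analogy holds at this level: what survives training is a statistical summary of patterns, with no per-document index to consult afterwards.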

    Even when the model produces something that resembles a known phrase, there’s usually no clean, inspectable trail like “this sentence came from source X, page Y.” A response is generated from many overlapping influences in those weights plus the immediate prompt, not from a deterministic lookup. This is also part of why the model can sound confident while being wrong: the system is optimized to produce plausible continuations, and standard training/evaluation incentives can reward “best guess” answers over explicitly stating uncertainty—leading to hallucinations (confident, false outputs) that are not grounded in any real source at all. [2]

    Finally, the training pipeline is typically a mixture of data types and stages (e.g., broad pretraining plus later fine-tuning and safety tuning), and providers often describe it at a high level rather than enumerating every item, in part because exhaustive lists would be enormous and can involve licensed or otherwise restricted material. OpenAI’s system cards, for example, focus on capability and safety characteristics and describe the development process at a high level, not as a per-document bibliography. [3]

    Meta Check References

    1. OpenAI Help Center — How ChatGPT and our language models are developed (explains that models do not store/retain copies of training data; learn via parameter updates).
    2. OpenAI — Why language models hallucinate (explains hallucinations and how training/evaluation can reward guessing over uncertainty).
    3. OpenAI — GPT-4 System Card (example of high-level model development/safety documentation rather than an itemized source list).
  • David Shapiro – Do this over the next 5 years and you’re set

    He asks “How do I prepare for AI and what’s coming to jobs and the economy?”
    He frames the answer as four big areas you can act on: (1) where you live, (2) investments, (3) jobs, (4) lifestyle / higher purpose.

    1) Where you live: “location arbitrage” is a real lever

    • Remote work (accelerated by the pandemic) lets some people choose cheaper or more desirable places to live while keeping higher-paying work.
    • He argues many return-to-office mandates are a pretext for layoffs (though he acknowledges some teams genuinely benefit from in-person work).
    • As people leave expensive hubs (he mentions places like San Francisco), housing availability/prices may shift, creating opportunities for those who still want city life.
    • His personal stance: moving to a smaller town improved quality of life (community feel, less stress, more “village vibe”).

    Connection to AI: if AI disrupts jobs broadly, where you live and what it costs to live there matters more.

    2) Investments: the future shifts from “wage economy” to “capital economy”

    • He says we’re moving toward a world where labour earns less overall, and capital ownership/participation becomes the main way wealth gets distributed.
    • His personal strategy (as an example, not advice): dividend-producing ETFs so he doesn’t have to stress about trading—income comes via dividends.
    • He highlights typical household capital channels: stocks, bonds, real estate.
    • He points to “employee ownership” models as a bridge:
      • ESOPs (employee stock ownership plans) in the US
      • UK-style employee-owned trusts and similar European approaches
    • On crypto:
      • He’s sceptical of most crypto/DAOs, calling many of them scams or rug-pull risks.
      • He views Bitcoin more as a wealth-preservation asset than an income generator, and mentions The Bitcoin Standard as an argument for that view.
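The arithmetic behind the dividend-income approach is simple. A quick illustration with hypothetical numbers (not figures from the talk, and not financial advice):

```python
# Hypothetical portfolio, for arithmetic only.
portfolio_value = 100_000   # invested in a dividend-producing ETF
dividend_yield = 0.035      # 3.5% annual yield (illustrative)

# Income arrives as dividends, with no trading required.
annual_income = portfolio_value * dividend_yield
monthly_income = annual_income / 12
print(f"annual: {annual_income:.2f}, monthly: {monthly_income:.2f}")
# annual: 3500.00, monthly: 291.67
```

This is the sense in which capital participation replaces wages in his framing: the income scales with what you own, not with hours worked.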

    Big claim: solving “how regular people gain capital if they have none” is not an individual problem—it requires policy change.

    3) Jobs: AI + robots squeeze both knowledge work and low-skill labour

    His core thesis: AI threatens high-paid knowledge work, and robots threaten many manual/service jobs, so the old “get skills → get stable job” model breaks down.

    What he thinks survives longer

    He proposes four job “buckets” that remain valuable because people still pay for humans:

    1. Attention jobs
      • Monetizing attention (YouTube, social media, etc.).
      • But he warns it’s winner-take-most and heavily luck-driven.
    2. Experience jobs
      • Work that facilitates lived experiences: tour guides, massage, event roles, “trip sitters,” hospitality/entertainment, etc.
      • People will keep wanting human-centred experiences, even if robots exist.
    3. Authenticity jobs
      • Roles where the customer/client specifically wants a real human presence (he mentions examples like therapists, politicians, etc.).
    4. Meaning jobs
      • Philosophers, spiritual leaders, mentors—people who help others make sense of life and change.
      • He positions himself partly here.

    The “use AI” middle path

    He describes a practical adaptation: become an AI power user (like his wife shifting from copywriting to broader marketing/strategy, using AI for research, planning, and producing deliverables).
    The value becomes judgment + agency + client trust, not typing words.

    Trust and reputation matter more

    He gives an example of a fencing contractor:

    • Even if robots do the physical labour later, customers still hire the trusted name/brand.
    • Trust/reputation are “non-fungible” (can’t easily swap one human for another).

    Timeline / urgency

    He predicts a major societal labour crisis within 10–20 years, and even suggests it could hit before 2030 given the pace of innovation (in his view).

    4) Lifestyle and higher purpose: build agency and structure for a post-work world

    Assuming a future with some mix of UBI (cash) and universal basic capital / dividends, he asks: “What do you do with your time?”

    • He argues people will need purpose, not just income.
    • Key personal skill: agency (self-directed life).
      • Not just reacting to market opportunities, but creating your own path based on what you genuinely care about.
    • He emphasizes the need for structure when external structure (a job) fades.

    How to find a mission (his suggested starting point)

    • “Admit what you’re afraid to want.”
    • Once you acknowledge what you truly want (even if it risks judgment/failure), you can align choices and opportunities toward it.

    He also emphasizes that meaning doesn’t have to be career-shaped:

    • For some, purpose is family and being a good parent, building community, doing “village life” well.

    The talk’s bottom line in one paragraph

    Shapiro’s message is: AI and robotics will undermine both white-collar knowledge work and many service/manual jobs, pushing society toward a capital-based economy and forcing big policy changes. On a personal level, he suggests you prepare by optimizing where you live, building some form of capital participation if possible, steering toward work that depends on human attention/experience/authenticity/meaning, and developing agency, structure, and purpose so life still works even if traditional employment doesn’t.

    Source: https://youtu.be/cY--hKUWKX4

  • AI in Education – Climbing the Wrong Mountain?

    Based on the research, there are several key arguments for why a curriculum emphasizing skills like communication and critical thinking may be more important than one focused primarily on acquiring knowledge and information:

    1. Changing workplace demands: The shift to a knowledge-based economy means employers are increasingly seeking workers with transferable skills like critical thinking, communication, collaboration, and problem-solving rather than just subject-specific knowledge [1][5]. These “21st century skills” are seen as essential for success in the modern workforce.

    2. Rapid pace of change: With information and technology evolving so quickly, specific knowledge can become outdated. Teaching students how to think critically, analyse information, and adapt to new situations may better prepare them for an uncertain future [1][6].

    3. Ubiquitous access to information: The internet and AI tools provide easy access to vast amounts of information. The ability to evaluate, synthesize, and apply information is becoming more valuable than simply memorizing facts [7][11].

    4. AI competition: As artificial intelligence becomes more advanced at tasks involving information processing and recall, uniquely human skills like creativity, emotional intelligence, and complex problem-solving become more important differentiators [6][7].

    5. Deeper learning: Focusing on skills like critical thinking and communication can lead to deeper understanding and retention of knowledge, as students actively engage with information rather than passively absorbing it [8][9].

    6. Preparation for lifelong learning: Teaching students how to learn and think critically equips them to continue acquiring new knowledge and skills throughout their lives [5][11].

    7. Holistic development: A skills-based approach can foster important personal qualities like confidence, motivation, and resilience, supporting students’ overall development beyond just academic achievement [10].

    8. Real-world application: Skills-based learning often involves more hands-on, project-based work that allows students to apply knowledge in practical contexts, better preparing them for real-world challenges [5][8].

    However, it’s important to note that most sources emphasize the need for balance – skills cannot be developed in a vacuum without content knowledge [9][11]. The most effective approach likely involves teaching core knowledge alongside critical 21st century skills, rather than focusing exclusively on one or the other.

    Let’s discuss what should be in the curriculum before we use AI to improve delivery.

    Citations:

    [1] https://substack.nomoremarking.com/p/skills-vs-knowledge-13-years-on

    [2] https://www.learninga-z.com/site/resources/breakroom-blog/knowledge-based-and-skill-based-learning

    [3] https://www.digitaltheatreplus.com/blog/5-reasons-why-critical-thinking-is-the-most-important-skill-for-students

    [4] https://blog.pearsoninternationalschools.com/knowledge-vs-skills-what-do-students-really-need-to-learn/

    [5] https://www.icevonline.com/blog/four-cs-21st-century-skills

    [6] https://halfbaked.education/knowledge-based-curriculum/

    [7] https://assets.publishing.service.gov.uk/media/5d71187ce5274a097c07b985/21st_century.pdf

    [8] https://en.wikipedia.org/wiki/21st_century_skills

    [9] https://my.chartered.college/impact_article/skills-versus-knowledge-a-curriculum-debate-that-matters-and-one-which-we-need-to-reject/

    [10] https://www.highspeedtraining.co.uk/hub/communication-skills-for-teachers/

    [12] https://dimensionscurriculum.co.uk/the-importance-of-children-developing-good-communication-skills/