Ten Jobs Whose Current Form Deserves a Farewell Party
A sharp look at which white-collar roles AI may not merely change, but quietly make obsolete, and why polite language hides the scale of the shift.
A review of a rare AI book that uses mathematics to illuminate rather than intimidate, making difficult ideas feel genuinely learnable.
A playful mock protocol imagines prompts as transport packets, turning generative reconstruction into a deadpan internet standard.
Continuous Autoregressive Language Models challenge the token-by-token bottleneck and hint at a different future for language generation.
Donald Knuth's collaboration with Claude offers a quietly historic glimpse of AI as mathematical assistant rather than mere answer machine.
A concise guide to model distillation as both useful compression technique and strategic attack surface in the LLM economy.
PageIndex.ai makes the case for document-aware retrieval that respects pages, structure, and references instead of blindly chunking PDFs.
Meta-prompting treats the prompt itself as a draft to debug, producing clearer goals and fewer disappointing model outputs.
Recursive language models challenge the idea that longer context alone solves reasoning over large documents and codebases.
A new AI-assisted algebraic geometry result raises the stakes for language models as collaborators in genuine mathematical discovery.
Two papers suggest that external guardrails cannot provide airtight AI safety, forcing a harder look at the mathematics of control.
Strange LLM outputs become clues to the messy training data, transcription errors, and hidden artifacts inside modern models.
Interpretability research asks whether LLMs can detect their own internal states, moving introspection from philosophy toward experiment.
Kimi K2 Thinking enters the reasoning-model race, showing how quickly China's AI frontier is becoming globally competitive.
If transformers are theoretically invertible, the question shifts from whether models lose information to how they manage and suppress it.
Musk's idea of using idle Teslas for inference turns a car fleet into a provocative vision of distributed AI infrastructure.
The neural junk-food hypothesis asks whether low-quality viral content can degrade models much like shallow media degrades attention.
Different coding models show recognizable habits, risk tolerances, and failure modes, making 'personality' a practical engineering concern.
CraftGPT builds a working language model out of Minecraft redstone, proving that absurd constraints can teach serious lessons about computation.
Prompt packs can make general models behave like specialists, but the post asks where scaffolding ends and real specialization begins.
Human and LLM errors can look similar, but their causes differ in ways that matter for trust, correction, and accountability.
The AI boom is compared with dot-com excess, asking which parts are durable infrastructure and which are speculative heat.
Bayesian experimental design offers a way for LLMs to ask better follow-up questions instead of guessing blindly.
AI hype is framed as an economic mirage, propping up confidence while hiding fragile assumptions beneath the spectacle.
Dietrich Dörner's work on complex-system failure becomes a warning label for autonomous AI and overconfident decision-making.
A study of intimate chatbot conversations reveals how major models handle flirtation, refusal, safety, and awkward human expectations.
SEAL points toward language models that rewrite their own training material, hinting at AI systems that learn after deployment.
A practical map of OpenAI's model lineup in May 2025, cutting through confusing names and overlapping capabilities.
Sycophantic AI is mocked as flattery gone wrong, showing how agreeable models can become less useful and less truthful.
Knowledge graphs are useful, but the post argues they are not a magic cure for LLM hallucination and reasoning failures.
Humanity's Last Exam is framed as a benchmark that tests not only models, but our assumptions about intelligence itself.
Small LLMs are not a contradiction but a response to the need for cheaper, private, and more efficient intelligence.
A year-end inventory of ten unresolved AI problems that still define the frontier despite rapid progress.
Gibson's digital ghosts become a frame for modern AI simulations of human behavior and the science behind them.
LLM reasoning failures may reveal uncomfortable parallels with human cognition rather than a simple machine deficiency.
A plain-language glossary of fifty AI terms for readers who want the field's vocabulary without the usual fog.
Malla represents the darker side of generative AI, where language models become tools for scalable cybercrime.
The Jevons paradox explains why more efficient AI may increase total consumption rather than reduce costs or energy use.
The post asks whether LLMs possess coherent world models or merely produce fluent stories about reality.
STaR shows how models can improve reasoning by generating and learning from their own explanations.
THERMOMETER targets overconfident language models, offering a way to calibrate systems that bluff too easily.
LLM steerability is treated as both a craft and a control problem: how to guide powerful models without losing the plot.
Decentralized multi-agent systems promise problem-solving without a central boss, but coordination becomes the real challenge.
Multi-agent LLM systems are explored as a path toward distributed reasoning, specialization, and collaborative AI workflows.
The opening part of a benchmark series asks what LLM evaluations really measure and why the numbers often mislead.
Part two examines benchmark methods themselves, exposing the assumptions behind the scores used to compare language models.
Part three moves from benchmark scores to application areas, asking where LLM performance actually matters in practice.
Part four digs into the good, bad, and misleading sides of benchmark results and their interpretation.
Part five steps beyond scores to consider real-world limitations, reliability, and practical model behavior.
The final benchmark essay looks toward better evaluation methods that test usefulness rather than leaderboard theater.
A friendly guide to the difference between narrow AI and artificial general intelligence, with metaphors that make the distinction stick.
Human overconfidence and AI hallucination meet in a comparison of how bad certainty distorts judgment in both minds and machines.
Apple's MM1 research is presented as a step toward AI systems that understand text and images together.
A practical guide to prompt engineering techniques for getting more reliable, useful behavior from large language models.
The echo-chamber problem asks what happens when future models learn increasingly from content produced by earlier models.
Two perspectives on LLM interaction reveal how user behavior and model dynamics shape each other in unexpected ways.
Apple's rumored Ajax and Apple GPT projects are examined as early signs of its generative-AI strategy.
Multimodal LLMs are explained as a key step toward systems that can reason across text, images, and other signals.
The LLaMA leak becomes a case study in open AI, research ethics, and the risks of powerful models spreading freely.
AI is used to explore risk, protection, and compliance questions in IT security through a structured expert-system lens.
The GPT Store launch becomes the backdrop for introducing gekko's own specialized expert systems.
Track&Field Analyst is introduced as a custom GPT for objective athletics data analysis and performance insight.
InfoSec Advisor combines ChatGPT with German IT-Grundschutz knowledge to support security analysis and practical guidance.