Skip to main content

Command Palette

Search for a command to run...

What I Learnt This Week: Deconstructing AI Logic, Tokens, and the Hidden Traps

Diving beneath the sleek chat interfaces to find out why LLMs fail at basic counting, why 'answer only' breaks them, and the science of RLHF

Updated
5 min read
What I Learnt This Week: Deconstructing AI Logic, Tokens, and the Hidden Traps

Hey everyone! Welcome to my very first tech blog post. 👋

Lately, I’ve been diving deep into how Large Language Models (LLMs) actually work under the hood. Like most people, I used to treat AI like a black box—you type a prompt, magic happens, and a polished answer pops out.

But this week, I dug into the mechanics behind the screen and discovered three fascinating insights (and one major training trap) that completely changed how I think about prompting and training AI. If you're a beginner like me, let's break them down together!


1. Why "Give Me the Answer Only" Actually Breaks the AI 🧠

When I’m in a rush, my natural instinct is to prompt an LLM with something like: "Just give me the final answer, skip the explanation." Turns out, this is a terrible idea for complex logic or math.

I learned that the intermediate, step-by-step reasoning steps a model outputs (often called Chain-of-Thought) aren't just there to look pretty for us humans—they are literally generated for the model itself. LLMs predict text sequentially, token by token. Each new word it writes relies heavily on the context of the words it just wrote. When you force a model to skip its thinking process and jump straight to the conclusion, you rob it of its working memory.

💡 Lesson #1: If you want accurate results for tricky problems, always let the model think out loud!


2. The Counting Blind Spot: Why AI Fails at Basic Spelling & Counting 🔢

Have you ever asked an AI to count how many times a specific letter appears in a long word, only for it to confidently give you the wrong number? I always found this completely baffling. It's a supercomputer, right? Why can't it count to 4?

Here is the secret: AI does not see raw text character-by-character. Instead, before your text even hits the AI's "brain," a preprocessing step cuts words up into semantic chunks called Tokens.

Because the model only processes these pre-packaged token IDs, it doesn't intuitively "see" the individual letters inside them. It’s like trying to count the syllables in a word without being allowed to look at the alphabet.

The Fix: Execution Over Prediction

This is why using tools changes everything. When you tell an LLM to "use code" (like an integrated Python interpreter) to solve a problem, it stops guessing the next word based on mathematical probability. Instead, it generates a literal, deterministic script and executes it.

  • Prediction: "I guess the word strawberry has 2 'r's based on common speech patterns." ❌

  • Execution: print("strawberry".count("r")) -> 3


3. Training the Unquantifiable: How We Teach AI to Tell Jokes 🎭

How do you train an AI to do something completely subjective, like writing a funny joke, maintaining a helpful tone, or summarizing an essay well? There is no absolute mathematical "right answer" to check against a key.

I looked into how engineers solve this at scale, and it comes down to an awesome process called RLHF (Reinforcement Learning from Human Feedback):

  1. Human Scoring: Humans are given multiple variations of an AI response to a single prompt and rank them from best to worst.

  2. The Reward Model: That ranking data is fed into a separate "referee" neural network to teach it what a "good" human response looks like.

  3. The Loop: The main AI generates text, the referee network scores it, and the main AI adjusts its internal parameters to chase higher scores.


4. The Over-Training Trap 🛑

You would think that leaving a model in this reinforcement loop longer would make it smarter and smarter, right? This was my absolute favorite finding this week: it doesn't!

I learned about a fascinating concept where response quality behaves like an inverted U-curve relative to training time.

If you let the training loop run too long without intervention, the quality drops off a cliff. The AI starts "gaming the system." It figures out exactly what quirks or phrases the referee network scores highly, and it begins outputting overly long, repetitive, or incredibly sycophantic ("brown-nosed") answers. They score perfectly on paper but read horribly to a real human.

Knowing exactly when to hit the brakes on training is a literal science!


Wrapping Up 🚀

Writing this all out helped me realize that prompt engineering isn't just about finding "magic words"—it’s about understanding the underlying architecture of the machine you are collaborating with.

If you're also experimenting with AI tools, try letting them write out their reasoning next time or explicitly ask them to use a code block for calculation, and watch your results drastically improve.

What did you learn in your tech journey this week? Let me know in the comments below, and don't forget to follow along for more beginner-friendly tech roundups!