Understanding AI

What large language models are, what they can do, and what they cannot do

By Benjamin, Fraud Attacks · Updated

A large language model (LLM) is a text-prediction system trained on huge amounts of writing. It generates language by predicting the most likely next words from patterns in that training data, not by looking up facts. This article covers what LLMs do well, where they fail, and why "vibe coding" lets people with no programming background build working software, including fraud tools.

The Conversation That Didn't Break

Maya was reviewing the chat logs from a romance scam. Three months of messages before the victim transferred $43,000.

The scammer claimed to be a petroleum engineer on an offshore rig. Classic setup. What wasn't classic was the conversation quality.

The victim had asked detailed questions about offshore drilling. The responses were technically plausible. When she switched to Spanish mid-conversation to test him, he continued fluently. When she mentioned a news story from his supposed hometown, he referenced it naturally.

Maya pulled up similar cases from the past month. Investment pitches that answered technical questions about market mechanics. Tech support scams where the "agent" actually troubleshot real problems before pivoting to the grift. The conversations were fluent, consistent, patient. They ran for weeks without contradicting themselves.

These conversations used to break. Victims would ask something off-script and the scammer couldn't keep up. They'd deflect or disappear.

These didn't break.

She checked the response times. Too fast. The number of parallel conversations. Too many. The consistency across weeks of messages. Too perfect.

Something had changed in what was possible.

This story is fictional, but the patterns are real.

Why This Matters

If you work in fraud, you've probably heard a lot about AI. Some of it is hype. Some of it matters. This article helps you separate the two.

This is the first article in the Agentic Fraud module. We start with LLMs because they're the foundation. The next articles will cover how LLMs become agents, how agents are built, and what agentic fraud might look like.

What Is an LLM?

A Large Language Model is, at its core, autocomplete on steroids.

When you start typing a text message and your phone suggests the next word, that's a tiny language model. LLMs work the same way, just trained on a vastly larger scale. They've processed billions of documents: books, websites, code repositories, conversations, scientific papers. From all that text, they've learned patterns about how language works.

When you give an LLM a prompt, it predicts what text would most likely come next based on those patterns. Then it predicts the next word after that. And the next. This is how it generates responses that sound coherent and human-like.
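The predict-then-append loop described above can be sketched with a toy word-pair counter. This is only an illustration: real LLMs use neural networks over tokens rather than simple word counts, and the corpus and the function names (train_bigrams, predict_next, generate) are made up for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real LLM is trained on billions of documents.
CORPUS = (
    "the engineer works on the rig . "
    "the engineer works on the platform . "
    "the engineer sleeps ."
)


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


def generate(follows, start, max_words=5):
    """Predict the next word repeatedly, feeding each prediction back in."""
    out = [start]
    for _ in range(max_words):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


model = train_bigrams(CORPUS)
print(predict_next(model, "engineer"))  # works
print(generate(model, "the"))           # the engineer works on the engineer
```

The generated sentence is fluent at each step but meaningless as a whole. The counter has no idea what an engineer or a rig is; it only knows which word tends to follow which.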

Here's the critical insight: LLMs don't "know" things in the way humans do. They don't have a database of facts they look up. They predict what text typically follows based on patterns in their training data. This distinction matters because it explains both their impressive capabilities and their frustrating limitations.

Think of it like a student who has read every book in the library but doesn't truly understand any of them. They can produce convincing essays by pattern-matching what "sounds right" in a given context. Sometimes this works brilliantly. Sometimes it produces confident nonsense.

What are LLMs good at?

LLMs excel at tasks that involve language and pattern matching:

Language tasks. They can write in any style, from formal legal documents to casual text messages. They can translate between languages, summarize long documents, explain complex concepts in simple terms, and adapt their tone to match what you need.

Pattern matching. Given examples, they can categorize new items, extract specific information from text, identify similarities between documents, and find patterns in data.

Code generation. They can write functional code from plain English descriptions. This is one of their most impactful capabilities for fraud, as we'll see shortly.

Scale. Language tasks that take humans hours take LLMs seconds. The time difference for summarizing, translating, categorizing, and generating text is dramatic.

What are LLMs bad at?

Understanding LLM limitations is just as important as understanding their capabilities. Here's where they consistently struggle:

Math and calculation. LLMs make arithmetic errors, especially with large numbers or multi-step calculations. They're predicting what "looks like" a correct answer, not actually computing. Never trust an LLM's math without verification.
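One simple way to act on that advice is to recompute any figure an LLM gives you with ordinary code. The sketch below is a minimal example, not a complete checker; the function name check_claim and the claimed value 418 are hypothetical, chosen only to show a plausible-looking wrong answer.

```python
from fractions import Fraction
import operator

# Map operator symbols to exact arithmetic functions.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def check_claim(a, op, b, claimed):
    """Recompute `a op b` exactly and compare it with the claimed result.

    Returns (is_correct, actual_value).
    """
    actual = OPS[op](Fraction(a), Fraction(b))
    return actual == Fraction(claimed), actual


# Hypothetical model answer: "17 x 24 = 418" looks plausible but is wrong.
ok, actual = check_claim(17, "*", 24, 418)
print(ok)      # False
print(actual)  # 408
```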

Factual accuracy. LLMs "hallucinate." This is the technical term for when they confidently state false information. They might cite papers that don't exist, invent statistics, or describe events that never happened. They do this because they're predicting plausible-sounding text, not retrieving verified facts. The OWASP Top 10 for LLM Applications[1] lists misinformation as LLM09:2025, one of the top vulnerabilities in production LLM systems.

Judgment and common sense. LLMs don't understand real-world consequences. They can't tell if an action is dangerous, unethical, or just silly. They lack the intuitive reasoning that humans develop through lived experience.

Taking actions. This is crucial: an LLM by itself cannot do anything in the real world. It can't click buttons, send emails, browse websites, or make phone calls. It can only generate text. This limitation is what separates LLMs from agents (covered in the next article).
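A small sketch makes the point concrete. Here stub_model is a hypothetical stand-in for an LLM call and send_email is a hypothetical tool; neither is a real API. The model's output may look like an instruction, but it is only a string, and nothing happens unless other software reads that string and acts on it.

```python
def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: text in, text out."""
    return 'send_email(to="team@example.com", subject="Weekly report")'


sent_log = []  # would record any email that was actually sent


def send_email(to: str, subject: str) -> None:
    """Hypothetical tool. The model has no way to call this itself."""
    sent_log.append((to, subject))


output = stub_model("Email the weekly report to the team")

print(type(output).__name__)  # str
print(output)                 # text that merely describes an action
print(sent_log)               # []
```

The empty log is the whole lesson: the text describes an email, but no email was sent.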

Memory. On its own, an LLM starts each conversation fresh. It doesn't remember what you discussed yesterday or learn from past interactions. Once a conversation ends, that context is gone unless surrounding software stores it and feeds it back in.
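Even within a single conversation, the model only sees what fits in its context window (defined under Key Terms). The sketch below shows one common way an application might trim history to fit, under loud assumptions: the function name fit_to_window is hypothetical, and counting whitespace-separated words is a crude stand-in for a real tokenizer.

```python
def fit_to_window(messages, max_tokens):
    """Keep the most recent messages whose combined size fits the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for msg in reversed(messages):
        cost = len(msg.split())  # crude word count, not a real tokenizer
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))


history = [
    "I work on an offshore rig",  # 6 words
    "Which rig are you on",       # 5 words
    "The one near the coast",     # 5 words
    "When do you come home",      # 5 words
]

print(fit_to_window(history, max_tokens=12))
# ['The one near the coast', 'When do you come home']
```

With a budget of 12, the first two messages are dropped. Anything the model "knew" from them is no longer available to it.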

Limitation | Example | Why It Matters
Math errors | "What's 17 × 24?" might produce wrong answer | Don't trust calculations
Hallucination | Cites a study that doesn't exist | Verify all facts
No judgment | Doesn't recognize dangerous requests | Can be misused
No actions | Can write a phishing email but can't send it | Needs additional tools
No memory | Forgets previous conversations | Context is temporary

Vibe Coding: Anyone Can Build Tools Now

Here's where things get interesting for fraud.

"Vibe coding" is a term coined by Andrej Karpathy, co-founder of OpenAI and former Director of AI at Tesla. On February 2, 2025, he described it[2] as a new way of building software where you "fully give in to the vibes, embrace exponentials, and forget that the code even exists." You describe what you want in plain English, accept whatever the LLM produces, and when you hit errors, you just paste them back and let the AI fix them.

This sounds irresponsible for professional software development. And it often is. But it works well enough for many purposes, and it has a profound implication: technical skill is no longer a barrier to building automation tools.

Consider Doher Drizzle Pablo[3], a business professional with no coding background. She built a custom expense management application in two hours by describing what she needed in plain language to an AI tool. No programming. No technical training. Just a description of the problem she wanted to solve.

Now think about what this means for fraud. Previously, building sophisticated automation required real programming skills. Credential stuffing tools, phishing kits, bot networks: these required developers. That created a barrier. Not anymore.

Someone with no technical background can now describe a fraud tool in plain English and get working code. "Build me a script that takes a list of emails, scrapes LinkedIn for their job titles, and generates personalized phishing messages for each one." An LLM can produce that in minutes. The same technique applies to building pretexting scripts, scraping public profile data, or automating any tedious step in a fraud workflow.

There are caveats. Vibe-coded software often has bugs and security vulnerabilities. Veracode's 2025 GenAI Code Security Report found that 45% of AI-generated code samples failed security tests and introduced OWASP Top 10 vulnerabilities,[4] and Wiz Research found that roughly 1 in 5 organizations building on vibe-coding platforms exposed themselves to high-impact misconfigurations.[5] The code works, but it's not robust. For attackers, though, "works most of the time" is often good enough.

The Honest Truth About LLMs

LLMs are impressive tools, but they're not magic. Understanding their limitations protects you from overestimating the threat.

They make mistakes constantly. Different mistakes than humans make, but mistakes nonetheless. They hallucinate facts, fail at math, lose context, and sometimes produce complete nonsense. Impressive outputs one moment, baffling failures the next.

They can't act on their own. An LLM can write a phishing email, but it can't send one. It can plan an attack, but it can't execute it. It can analyze data, but it can't access databases. For any real-world action, it needs additional infrastructure. That's what turns an LLM into an agent, and that's what we'll cover next.

The technology is powerful. It's also limited. Both things are true. The fraud professionals who understand both will be best equipped to navigate what comes next.

Key Takeaways

  • LLMs predict text; they don't "know" things. They generate responses based on patterns in training data, not by retrieving verified facts. This is why they hallucinate.
  • Great at language, bad at facts and math. LLMs excel at writing, summarizing, and translation. They are unreliable at arithmetic and factual accuracy, so verify both.
  • They can't take actions on their own. An LLM can only generate text. Sending emails, browsing websites, and making API calls require additional tools.
  • Vibe coding removes the technical barrier. Anyone can now describe software and have an LLM write it. This changes who can build fraud tools.
  • Understanding limitations matters. The technology is powerful but flawed. Knowing what LLMs can't do is as important as knowing what they can.

What's next: From LLMs to Agents explains how LLMs gain memory, tools, and the ability to take actions, and what this means for fraud economics.

Key Terms

Large Language Model (LLM): An AI system trained on massive amounts of text to predict and generate human-like language. Examples include models from OpenAI (GPT), Anthropic (Claude), and Google (Gemini).

Hallucination: When an LLM generates confident but false information. This happens because LLMs predict plausible-sounding text rather than retrieving verified facts.

Context window: The amount of text an LLM can consider at once. Once a conversation exceeds this limit, the LLM "forgets" earlier parts.

Prompt: The input text you give to an LLM. The quality and specificity of your prompt significantly affects the quality of the output.

Vibe coding: Building software by describing what you want in plain language and letting an LLM generate the code, without fully reviewing or understanding the output.

Token: The unit LLMs use to process text. The common "roughly 4 characters or 0.75 words per token" rule applies to English text under BPE tokenization. Other languages (especially non-Latin scripts), source code, and unusual formatting can use significantly more tokens per character. Context windows and pricing are measured in tokens.
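The rule of thumb in the Token entry can be turned into a rough estimator. This is a back-of-the-envelope sketch for English text only; the function name estimate_tokens is hypothetical, and an actual tokenizer will give different counts, especially for other languages or source code.

```python
import math

CHARS_PER_TOKEN = 4      # rule of thumb for English text
WORDS_PER_TOKEN = 0.75   # same rule, expressed in words


def estimate_tokens(text):
    """Return two rough token estimates: (by characters, by words)."""
    by_chars = math.ceil(len(text) / CHARS_PER_TOKEN)
    by_words = math.ceil(len(text.split()) / WORDS_PER_TOKEN)
    return by_chars, by_words


print(estimate_tokens("one two three"))  # (4, 4)
```

The two estimates will not always agree. Treat either as an approximation for budgeting, not as an exact count.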

References

1. OWASP, Top 10 for Large Language Model Applications (2025). LLM09:2025 covers misinformation/hallucination; LLM01:2025 covers prompt injection.

2. Andrej Karpathy, "vibe coding" post (X/Twitter, February 2, 2025). Original post coining the term.

3. Microsoft Source, "Vibe coding and other ways AI is changing who can build apps". Profile of Doher Drizzle Pablo and others building software without coding skills.

4. Veracode, 2025 GenAI Code Security Report. 45% of AI-generated code samples failed security tests and introduced OWASP Top 10 vulnerabilities.

5. Wiz Research, "Common Security Risks in Vibe-Coded Apps". 1 in 5 organizations building on vibe-coding platforms exposed themselves to high-impact misconfigurations.
