What an LLM Is (in Practical Terms)
An LLM (Large Language Model) is a software system that takes text (and sometimes other inputs like images or audio, depending on the model) and produces new text as output. “Large” refers to the scale of the model—many internal parameters and a broad training dataset—while “language model” refers to the core job: predicting likely sequences of tokens (small pieces of text) given a context.
A useful way to think about an LLM is as a highly capable text generator that has learned patterns of language: grammar, style, common facts, typical reasoning steps, and many domain-specific conventions (like how recipes are written, how bug reports look, or how legal clauses are structured). It does not “look up” answers in a database by default. Instead, it generates output by estimating what text is most likely to come next, repeatedly, until it finishes.
This matters because it explains both the strengths and the limitations of what it produces. The output can be fluent, structured, and helpful, but it is still generated text—an informed guess, not a guaranteed fact.
Inputs and Outputs: The Basic Interface
At its simplest, an LLM works like a function:
output_text = LLM(prompt_text)
In practice, the prompt can include multiple parts: instructions (“Write a summary”), context (“Here is the document”), examples (“Input → Output”), and constraints (“Use bullet points, max 100 words”). The model then produces a completion: the next tokens that best fit the prompt.
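The function-style interface above can be sketched in code. The `llm()` function below is a stand-in for a real model call (an API client or a local model), not an actual implementation; the point is only how a prompt is assembled from parts before being passed in.

```python
# Minimal sketch of the text-in, text-out interface. llm() is a placeholder
# for a real model call -- it only echoes some information about the prompt.

def llm(prompt: str) -> str:
    """Stand-in: a real system would send `prompt` to a model here."""
    return f"[completion for {len(prompt)} characters of prompt]"

# A prompt is often assembled from several parts:
instructions = "Write a summary."
context = "Here is the document: ..."
constraints = "Use bullet points, max 100 words."

prompt_text = "\n\n".join([instructions, context, constraints])
output_text = llm(prompt_text)
```

The joining of parts with blank lines is a common convention, not a requirement; what matters is that everything the model should condition on ends up in `prompt_text`.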
Many modern systems wrap this basic interface with additional features (tools, retrieval, memory, safety filters), but the core behavior remains: given a context window of tokens, generate the next token, then the next, and so on.
Tokens: What the Model Actually Sees and Produces
LLMs do not operate directly on “words” as humans think of them. They operate on tokens—chunks of text that might be a whole word (“banana”), part of a word (“ban” + “ana”), punctuation, or even whitespace markers. Tokenization allows the model to handle many languages and writing styles efficiently.
Why tokens matter for outputs:
Length limits: Models have a maximum context window measured in tokens. Your prompt plus the model’s output must fit within that window.
Cost and speed: Most APIs bill per token, and generation time grows roughly in proportion to the number of tokens processed and produced.
Precision of constraints: “Write 200 words” is only approximate guidance; hard limits (such as an API’s maximum-token setting) are enforced in tokens, not words.
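The points above can be made concrete with a toy tokenizer. The greedy longest-match scheme and the tiny vocabulary below are invented purely to illustrate subword splitting; real tokenizers (e.g., BPE-based ones) learn much larger vocabularies from data.

```python
# Toy greedy longest-match tokenizer over a made-up vocabulary, only to
# illustrate how words can be split into subword tokens. The vocabulary
# here is an assumption for the example, not a real model's vocabulary.

VOCAB = {"banana", "ban", "ana", "b", "a", "n", "s", " "}

def tokenize(text: str) -> list[str]:
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                tokens.append(piece)
                i += length
                break
        else:
            # Unknown character: fall back to a single-character token.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("banana"))   # whole word is in the vocabulary
print(tokenize("bananas"))  # split into subword pieces
```

Note that “bananas” costs two tokens while “banana” costs one: token counts, not word counts, are what length limits and billing are measured in.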
What the Model Produces: A Probability-Driven Completion
An LLM produces text by repeatedly choosing the next token based on probabilities. For each step, it assigns a probability distribution over possible next tokens. Then it selects one token (using a decoding strategy), appends it to the output, and repeats.
This has a practical implication: the model’s output is not a single fixed answer. The same prompt can yield different completions depending on randomness settings and decoding choices.
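The generation loop can be sketched as follows. The probability distribution here is hard-coded as a stand-in; a real model computes a fresh distribution from the full context at every step.

```python
import random

# Sketch of the decoding loop: at each step, get a distribution over next
# tokens, sample one, append it, repeat. The distribution below is invented
# for illustration -- a real model derives it from the context.

def next_token_distribution(context: list[str]) -> dict[str, float]:
    # Stand-in for a model forward pass.
    return {"cats": 0.5, "dogs": 0.3, ".": 0.2}

def generate(prompt: list[str], max_tokens: int = 3) -> list[str]:
    output = list(prompt)
    for _ in range(max_tokens):
        dist = next_token_distribution(output)
        tokens, probs = zip(*dist.items())
        choice = random.choices(tokens, weights=probs, k=1)[0]
        output.append(choice)
        if choice == ".":  # a stop token ends generation early
            break
    return output

print(generate(["I", "like"]))
```

Because `random.choices` samples from the distribution, running this twice can produce different outputs from the same prompt, which is exactly the behavior described above.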
Deterministic vs. Creative Outputs
Most LLM interfaces expose settings that influence how predictable or varied the output is. Names differ by provider, but the concepts are similar:
Temperature: Higher temperature increases randomness (more varied outputs). Lower temperature makes outputs more consistent and conservative.
Top-p (nucleus sampling): The model samples from the smallest set of tokens whose cumulative probability exceeds p (e.g., 0.9). Lower top-p narrows choices.
Top-k: Sample only from the k most likely tokens. Lower k narrows choices.
When you want stable, “businesslike” outputs (summaries, extraction, formatting), you typically lower randomness. When you want brainstorming, you raise it.
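These decoding settings can be shown directly on a toy distribution. The raw scores (logits) below are invented for illustration; the reshaping logic is what matters.

```python
import math

# Sketch of how temperature, top-k, and top-p reshape a next-token
# distribution. The logits are made up for the example.

logits = {"the": 2.0, "a": 1.5, "banana": 0.5, "qux": -1.0}

def softmax_with_temperature(logits, temperature=1.0):
    scaled = {t: v / temperature for t, v in logits.items()}
    m = max(scaled.values())
    exps = {t: math.exp(v - m) for t, v in scaled.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def top_k(probs, k):
    # Keep only the k most likely tokens, then renormalize.
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}

def top_p(probs, p):
    # Keep the smallest top set whose cumulative probability reaches p.
    kept, cumulative = {}, 0.0
    for t, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[t] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {t: q / total for t, q in kept.items()}

sharp = softmax_with_temperature(logits, temperature=0.5)  # more peaked
flat = softmax_with_temperature(logits, temperature=2.0)   # more uniform
```

Low temperature concentrates probability on the top token (more consistent output); high temperature flattens the distribution (more varied output). Top-k and top-p then prune the tail before sampling.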
Types of Outputs an LLM Can Produce
Because it generates text, an LLM can produce many kinds of artifacts. The key is that these artifacts are still text sequences, even when they represent other things (tables, code, JSON, checklists).
1) Natural Language Responses
The most obvious output is conversational text: explanations, answers, suggestions, and step-by-step guidance. The model can adapt tone and complexity if instructed.
Example prompt:
Explain the difference between a TCP and UDP connection in 5 bullet points for a beginner.
Typical output: A short bullet list with simplified definitions and trade-offs.
2) Summaries and Transformations
LLMs are strong at rewriting: summarizing, paraphrasing, translating, changing tone, and restructuring content.
Example prompt:
Rewrite this email to be more polite and concise. Keep all factual details. Text: ...
The output is a transformed version of the input, often preserving meaning while changing style.
3) Structured Text (Lists, Tables, JSON)
LLMs can output structured formats when you specify them clearly. This is extremely useful for automation, because structured output can be parsed by software.
Example prompt:
Extract the tasks from the text and return JSON with fields: task, owner, due_date (ISO 8601 or null). Text: ...
Even though JSON is “data,” the model is still producing it as text. That means you must validate it (for example, check that it is valid JSON and matches your schema).
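A validation step for this kind of output can be sketched as follows. The field names follow the example prompt above; the `model_output` string is invented for the example.

```python
import json

# Sketch: validate model output that is supposed to be a JSON array of
# tasks with fields task, owner, due_date. The sample output is invented.

model_output = '[{"task": "Ship release", "owner": "Dana", "due_date": null}]'

def parse_tasks(text):
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None  # signal failure; re-prompt instead of trusting output
    if not isinstance(data, list):
        return None
    required = {"task", "owner", "due_date"}
    for item in data:
        if not isinstance(item, dict) or not required <= item.keys():
            return None
    return data

tasks = parse_tasks(model_output)
```

Returning `None` on any failure keeps the decision explicit: downstream code either gets validated data or knows it must re-prompt.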
4) Code and Pseudocode
LLMs can generate code in many languages, explain code, refactor it, and write tests. Code generation is still next-token prediction, which means it can be impressively correct but also subtly wrong.
Example prompt:
Write a Python function that validates an email address using a simple regex. Include 5 unit tests.
The output is code text. You still need to run it, test it, and review edge cases.
5) “Plans” and Step-by-Step Procedures
LLMs can produce plans: checklists, project outlines, troubleshooting flows, and study schedules. These are especially useful when you want a repeatable process.
Example prompt:
Create a step-by-step checklist to debug a web app that suddenly started returning 500 errors.
The output is a procedure, often organized into phases (reproduce, isolate, inspect logs, rollback, etc.).
Practical Step-by-Step: Getting the Output You Actually Want
Because an LLM produces text based on the prompt, the most practical skill is learning how to specify the output. The following steps help you reliably shape what the model produces.
Step 1: Define the Output Format First
Start by stating the format you want (bullets, table, JSON, numbered steps). This reduces ambiguity.
Example:
Return your answer as a table with columns: Symptom, Likely Cause, First Check, Fix.
If you need machine-readable output, say so explicitly:
Return ONLY valid JSON. Do not include commentary or markdown.
Step 2: Specify the Role and Audience
LLMs adapt style based on cues. “Explain like I’m new to this” yields different output than “Write for senior engineers.”
You are a technical writer. Audience: non-technical managers. Keep it under 150 words.
This is not about the model “becoming” a person; it is a prompt technique that steers tone and content.
Step 3: Provide the Necessary Context (and Only That)
LLMs can only condition on what is in the prompt (plus what they already learned during training). If your task depends on specific facts—your company policy, your dataset, your product requirements—include them.
Practical approach:
Paste the relevant excerpt rather than an entire document.
Include definitions for ambiguous terms.
State constraints (deadlines, allowed tools, style guide).
Example:
Context: Our refund policy allows refunds within 30 days for unused items. Used items are eligible only if defective. Task: Draft a customer reply.
Step 4: Add Examples (Few-Shot Prompting) When Precision Matters
If you want consistent formatting or classification, show 1–3 examples of input and the exact output you want. The model often imitates the pattern.
Example 1 input: "Order arrived late" → output: {"category":"shipping","urgency":"medium"}
Example 2 input: "Charged twice" → output: {"category":"billing","urgency":"high"}
Now classify: "Package missing items"
Step 5: Constrain the Model’s Freedom
LLMs will “fill in gaps” if you leave them. If you want extraction rather than invention, say so.
Useful constraints:
“If the answer is not in the provided text, respond with ‘Not found’.”
“Do not guess. Ask a clarifying question if needed.”
“Cite the exact sentence from the provided context that supports each claim.”
Example prompt for extraction:
Using ONLY the text below, list the deadlines mentioned. If none, output an empty list. Text: ...
Step 6: Iterate with Targeted Feedback
When the output is close but not correct, give specific corrections rather than restating the whole task.
Examples of targeted feedback:
“Keep the same structure, but shorten each bullet to one sentence.”
“The JSON must include a field called ‘confidence’ as a number 0–1.”
“Remove any recommendations; only describe observed symptoms.”
Understanding “Model Output” vs. “System Output”
In many products, what you see is not raw model output. There may be additional layers:
System instructions: Hidden or fixed rules that set behavior (tone, safety, formatting).
Content filters: Post-processing that blocks or rewrites unsafe content.
Tool use: The system may call external tools (search, calculator, database) and then ask the model to write a final response.
Templates: The system may wrap the model’s text in UI elements or add citations.
This distinction helps when debugging. If the output seems inconsistent, it may be due to system-level settings rather than the prompt alone.
Common Output Behaviors You Should Expect
Fluency Without Guarantees
LLMs are optimized to produce plausible text. Plausible does not always mean correct. The model can generate confident-sounding statements that are incomplete or wrong, especially when the prompt lacks context or asks for niche details.
Practical response: treat outputs as drafts. Verify facts, run code, and validate structured data.
Over-Completion (Answering More Than You Asked)
Because the model is trained to be helpful, it may add extra advice, assumptions, or background. This is useful for brainstorming but problematic for strict tasks like extraction or compliance writing.
Practical response: constrain the output (“Only return X,” “No additional commentary,” “If unknown, say unknown”).
Format Drift
Even with instructions, the model may occasionally break format—especially in long outputs. For example, it might add a sentence before JSON, or include trailing commas.
Practical response: keep outputs shorter, use examples, and validate with a parser. If invalid, re-prompt with the invalid output and ask for a corrected version:
Here is invalid JSON you produced: ... Fix it and return ONLY valid JSON.
Inconsistent Terminology
The model may alternate between synonyms (“client” vs. “customer”) unless you standardize terms.
Practical response: provide a glossary in the prompt:
Use these terms exactly: customer (not client), order_id (snake_case), refund (not reimbursement).
What “Producing Text” Enables: Composing Tools Out of Prompts
Once you see the LLM as a text-to-text system, you can design repeatable “mini-tools” by combining prompt structure with consistent output formats. For example:
Classifier: Input a message, output a category label.
Extractor: Input a contract clause, output key fields as JSON.
Formatter: Input messy notes, output a clean meeting summary with action items.
Generator: Input requirements, output a draft policy or test plan.
Each of these is still “just text,” but the reliability improves dramatically when you (1) define a schema, (2) provide examples, and (3) validate outputs.
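One such mini-tool, a classifier, can be sketched end to end: a prompt template, a model call, and validation of the result. The `llm()` stub, the category set, and the stubbed response are all assumptions for the example, not a real API.

```python
import json

# Sketch of a "mini-tool" classifier: prompt template + model call +
# output validation. llm() is a stand-in that returns a fixed response.

ALLOWED = {"shipping", "billing", "other"}

PROMPT_TEMPLATE = (
    "Classify the message into one of: shipping, billing, other.\n"
    'Return ONLY JSON like {{"category": "..."}}.\n'
    "Message: {message}"
)

def llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return '{"category": "shipping"}'

def classify(message: str) -> str:
    raw = llm(PROMPT_TEMPLATE.format(message=message))
    try:
        category = json.loads(raw).get("category")
    except json.JSONDecodeError:
        return "other"  # fall back instead of trusting malformed output
    return category if category in ALLOWED else "other"

print(classify("Order arrived late"))
```

The closed category set plus the fallback branch is what turns free-form model text into an output the rest of the system can rely on.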
Hands-On Examples: Prompts and Expected Output Shapes
Example A: Turning Notes into Action Items (Structured Output)
Goal: Convert unstructured meeting notes into a consistent list of tasks.
Prompt (template):
Task: Extract action items from the notes. Return a JSON array. Each item must have: description, owner, due_date (ISO 8601 or null), dependencies (array). If an owner is not stated, use null. Notes: [PASTE NOTES]
What the LLM produces: A JSON array of objects. You can then parse it and load it into a task tracker. If the model guesses owners or dates, tighten constraints: “Use ONLY explicitly stated owners and dates.”
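The parse-and-validate step for Example A can be sketched as follows. The `sample` string is an invented model output matching the schema in the prompt above.

```python
import json
from datetime import date

# Sketch: validate the action-item JSON from Example A. Each item must
# have description, owner, due_date (ISO 8601 or null), and dependencies.
# The sample output string is invented for the example.

sample = '''[
  {"description": "Send recap", "owner": "Ana",
   "due_date": "2024-06-01", "dependencies": []}
]'''

def validate_items(text):
    items = json.loads(text)  # raises an error on invalid JSON
    for item in items:
        for field in ("description", "owner", "due_date", "dependencies"):
            if field not in item:
                raise ValueError(f"missing field: {field}")
        if item["due_date"] is not None:
            date.fromisoformat(item["due_date"])  # raises if not ISO 8601
        if not isinstance(item["dependencies"], list):
            raise ValueError("dependencies must be a list")
    return items

items = validate_items(sample)
```

Only items that pass every check reach the task tracker; anything else triggers a re-prompt, which is the practical meaning of “you must validate it.”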
Example B: Creating a Troubleshooting Runbook (Step-by-Step)
Goal: Produce a runbook that an on-call engineer can follow.
Prompt (template):
Create a troubleshooting runbook for: "API latency increased from 200ms to 2s". Output as numbered steps. Each step must include: what to check, how to check it (command or UI path), and what to do if it fails.
What the LLM produces: A sequence of steps with checks and actions. If you want it to match your environment, provide context such as your monitoring tools, service names, and deployment process.
Example C: Generating a Draft and Then Refining It (Two-Pass Output)
Goal: Get a strong first draft, then improve it with targeted constraints.
Pass 1 prompt:
Draft a one-page internal policy for handling customer data deletion requests. Use clear headings and bullet points.
Pass 2 prompt:
Revise the draft to: (1) remove any legal claims, (2) add a checklist for support agents, (3) keep it under 400 words, (4) use the term "deletion request" consistently.
What the LLM produces: A refined document. This illustrates a key workflow: treat the model’s output as an editable artifact and iterate with constraints.
How to Evaluate the Output You Get
Since the model produces generated text, evaluation is about fitness for purpose. A practical checklist:
Correctness: Are factual claims verifiable? If not, can the output be rewritten to avoid unverifiable claims?
Completeness: Does it include all required fields/sections?
Format validity: If it is JSON/code, does it parse/compile?
Consistency: Are terms and units consistent throughout?
Constraint adherence: Does it follow length, tone, and “only use provided text” rules?
For structured outputs, automated checks are especially effective: schema validation, unit tests, linters, and parsers. For natural language outputs, spot checks and reference comparisons help.
Key Takeaway: An LLM Produces Text That You Can Shape
An LLM is best understood as a system that generates the next most likely tokens given a prompt. What it produces can look like conversation, documentation, code, or data, but it is always generated text. The practical skill is specifying the output shape and constraints so the generated text becomes reliably usable—either by humans (as drafts and explanations) or by software (as structured, validated output).