Introduction to Large Language Models (LLMs): How They Work and What They Can (and Can’t) Do


When LLMs Are the Right Tool

Chapter 8

Estimated reading time: 14 minutes


Choosing LLMs as a Tool: The Core Idea

An LLM is most useful when your problem is primarily about working with language: understanding, generating, transforming, or organizing text (and sometimes code) in a way that benefits from flexible reasoning over messy inputs. “Right tool” does not mean “best at everything.” It means the model’s strengths match the shape of the task, the cost of occasional mistakes is manageable, and you can put guardrails around the output (through constraints, checks, and workflow design).

A practical way to decide is to ask: is the hard part of the task about interpreting human language, producing human-readable language, or bridging between formats (notes → email, policy → checklist, requirements → test cases)? If yes, an LLM is often a strong candidate. If the hard part is exact arithmetic, strict compliance with a schema, or retrieving a single authoritative fact with zero tolerance for error, the LLM should usually be a helper inside a system that enforces correctness, not the sole engine.

Signals that an LLM is the right tool

  • Ambiguous inputs: You receive incomplete, inconsistent, or informal text (customer messages, meeting notes, support tickets) and need a best-effort interpretation.
  • Many acceptable outputs: There isn’t one “correct” phrasing; you want a good draft, a set of options, or a structured summary.
  • High variation: The task changes frequently (new product features, new policies, new tone requirements), making rigid rules expensive to maintain.
  • Language transformation: You need rewriting, summarizing, translating, simplifying, or adapting tone for different audiences.
  • Reasoning over text: You want to extract action items, detect themes, classify intent, or propose next steps based on narrative content.
  • Human-in-the-loop is acceptable: A person can review, approve, or correct outputs before they matter.
  • Speed matters more than perfection: You want to reduce time-to-first-draft, not eliminate all manual work.

Signals that an LLM is not the right tool (by itself)

  • Deterministic correctness is required: Financial calculations, compliance-critical decisions, safety-critical instructions, or anything that must be exact every time.
  • Strict formatting with zero tolerance: Outputs must always match a schema and be machine-consumable without validation or repair.
  • Single-source truth is mandatory: You must cite and reproduce an authoritative record exactly (legal text, medical dosage, contractual clauses) without deviation.
  • Latency/cost constraints are extreme: You need millisecond responses at massive scale and cannot afford model inference.
  • Data cannot be shared: You cannot send the content to an external service and do not have an approved on-prem or private deployment.

Common “Right Tool” Use Cases (with Practical Examples)

1) Drafting and rewriting: accelerating communication

LLMs shine when you already know what you want to say, but you want help saying it clearly, politely, or in a specific style. The model can produce multiple drafts quickly, and you choose or edit the best one.

Examples

  • Turn bullet points into a customer update email with a calm, transparent tone.
  • Rewrite a technical explanation for a non-technical audience.
  • Create three alternative subject lines and preview text for a newsletter.

Practical step-by-step workflow


  • Step 1: Provide the raw material. Paste your bullets, constraints, and any “must include/must avoid” items.
  • Step 2: Specify audience and intent. “For existing customers affected by downtime; goal is to reduce support tickets and set expectations.”
  • Step 3: Request multiple options. Ask for 2–5 variants with different tones (formal, friendly, concise).
  • Step 4: Add constraints. Word limit, reading level, required terms, and banned phrases.
  • Step 5: Review for accuracy and commitments. Check dates, promises, and any implied guarantees.
  • Step 6: Finalize with your voice. Edit the opening and closing to match your brand and relationship.
Prompt template: Rewrite / draft
Input notes:
- [paste bullets]
Audience: [who]
Goal: [what outcome]
Tone: [tone]
Constraints:
- Must include: [items]
- Must not claim: [items]
- Length: [limit]
Output: Provide 3 versions and a short list of key differences.
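If you produce drafts like this regularly, it can help to fill the template in code so every request carries the same structure and constraints. The sketch below is a minimal illustration; `call_llm` is a hypothetical placeholder for whatever model API or client library you actually use, and the example values are invented.

Code sketch (Python): fill the draft template
DRAFT_TEMPLATE = """Input notes:
{notes}
Audience: {audience}
Goal: {goal}
Tone: {tone}
Constraints:
- Must include: {must_include}
- Must not claim: {must_not_claim}
- Length: {length}
Output: Provide 3 versions and a short list of key differences."""

def build_draft_prompt(notes, audience, goal, tone, must_include, must_not_claim, length):
    # Fill the template so every request carries the same constraints.
    return DRAFT_TEMPLATE.format(
        notes=notes, audience=audience, goal=goal, tone=tone,
        must_include=must_include, must_not_claim=must_not_claim, length=length,
    )

prompt = build_draft_prompt(
    notes="- downtime on May 3\n- fix deployed\n- credits applied automatically",
    audience="existing customers affected by downtime",
    goal="reduce support tickets and set expectations",
    tone="calm and transparent",
    must_include="apology, timeline, credit details",
    must_not_claim="guarantees about future uptime",
    length="under 150 words",
)
print(prompt)               # review the filled prompt before sending it
# draft = call_llm(prompt)  # hypothetical model call; swap in your own client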

2) Summarization and synthesis: turning messy text into usable artifacts

When you have long-form text (calls, interviews, incident reports, research notes) and need a compact representation, an LLM can produce summaries, outlines, and structured takeaways. This is especially valuable when the “right” summary is subjective: you care about themes, decisions, and next steps rather than a perfect extract.

Examples

  • Summarize a meeting transcript into decisions, open questions, and action items.
  • Synthesize 20 customer reviews into top pain points and feature requests.
  • Convert a policy document into a checklist for frontline staff.

Practical step-by-step workflow

  • Step 1: Define the summary format. For example: “5 bullets: decisions; 5 bullets: risks; table: action items.”
  • Step 2: Define what matters. “Prioritize customer impact and operational steps; ignore small talk.”
  • Step 3: Ask for traceability. Request quotes or short excerpts supporting each key point (useful for verification).
  • Step 4: Run a second pass for gaps. Ask: “What questions remain unanswered?” or “What assumptions are being made?”
  • Step 5: Human review. Confirm that decisions and assignments match reality.
Prompt template: Meeting synthesis
Task: Create a structured summary.
Output sections:
1) Decisions (bullet list)
2) Action items (owner, due date if mentioned)
3) Risks / blockers
4) Open questions
5) Supporting excerpts (1–2 short quotes per decision)
Text:
[paste transcript or notes]

3) Classification and routing: triaging text at scale

If you receive a high volume of short texts (tickets, emails, chat messages), an LLM can label them with intent, urgency, topic, sentiment, or required team. This is often a good fit because the inputs are natural language and the categories can evolve.

Examples

  • Route support tickets to billing vs. technical vs. account management.
  • Detect whether a message is a bug report, feature request, or how-to question.
  • Tag internal requests by department and priority.

Practical step-by-step workflow

  • Step 1: Define a small, stable label set. Avoid dozens of overlapping categories at first.
  • Step 2: Provide definitions and examples. Include borderline cases and how to handle them.
  • Step 3: Require a confidence score and rationale. Confidence helps decide when to escalate to a human.
  • Step 4: Add a fallback label. “Other/Needs review” prevents forced misclassification.
  • Step 5: Sample and audit regularly. Review a random subset weekly and refine definitions.
Prompt template: Ticket triage
You are classifying customer messages.
Labels:
- Billing: ...
- Technical issue: ...
- Account access: ...
- Feature request: ...
- Other/Needs review: ...
Return JSON with fields: label, confidence (0-1), rationale (1 sentence).
Message:
[paste ticket]
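Because the template asks for JSON, the application code should not trust that JSON blindly. The sketch below is one way to implement Steps 3 and 4 above, assuming the model's raw reply arrives as a string: anything unparseable, outside the label set, or with an out-of-range confidence falls back to "Other/Needs review".

Code sketch (Python): parse and validate the triage output
import json

ALLOWED_LABELS = {
    "Billing", "Technical issue", "Account access",
    "Feature request", "Other/Needs review",
}
FALLBACK = {"label": "Other/Needs review", "confidence": 0.0,
            "rationale": "Output could not be parsed or validated."}

def parse_triage(raw_output):
    # Parse the model's JSON; anything unexpected goes to the fallback label.
    try:
        result = json.loads(raw_output)
        confidence = float(result["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return dict(FALLBACK)
    if result.get("label") not in ALLOWED_LABELS or not 0.0 <= confidence <= 1.0:
        return dict(FALLBACK)
    return {"label": result["label"], "confidence": confidence,
            "rationale": str(result.get("rationale", ""))}

print(parse_triage('{"label": "Billing", "confidence": 0.92, "rationale": "Mentions an invoice."}'))
print(parse_triage("I am not sure how to classify this."))   # falls back to review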

4) Information extraction: pulling structured fields from unstructured text

LLMs can extract entities and fields from text when the input format varies (different writing styles, missing punctuation, mixed languages). This is useful for turning emails into CRM entries, parsing incident narratives into structured reports, or extracting requirements from notes.

Examples

  • From an inbound email, extract: customer name, product, issue type, requested deadline.
  • From a bug report, extract: steps to reproduce, expected vs. actual behavior, environment.
  • From a contract summary request, extract: parties, term dates, renewal clause presence (for review, not final legal interpretation).

Practical step-by-step workflow

  • Step 1: Define the target schema. List fields, types, and allowed values.
  • Step 2: Instruct the model to use nulls. “If not present, set the field to null; do not guess.”
  • Step 3: Ask for evidence spans. Include the exact phrase from the text that supports each extracted field.
  • Step 4: Validate programmatically. Check types, enums, required fields, and length limits (a validation sketch follows the template below).
  • Step 5: Human review for high-impact fields. Especially dates, amounts, and commitments.
Prompt template: Structured extraction
Extract the following fields as JSON.
Rules:
- If a field is not explicitly stated, use null.
- Provide evidence: the exact substring supporting each non-null field.
Schema:
{ customer_name: string|null, product: string|null, deadline: string|null, issue_type: one of [...], evidence: { ... } }
Text:
[paste email]
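The validation in Steps 2 through 4 can be a short function rather than a framework. The sketch below assumes the model returned a dict shaped like the schema above; the field names, the allowed issue types, and the sample email are illustrative only. It flags missing fields, unexpected enum values, and evidence that is not an exact substring of the source text.

Code sketch (Python): validate extracted fields against the source text
ALLOWED_ISSUE_TYPES = {"bug", "billing", "how-to", "other"}   # example enum values

def validate_extraction(result, source_text):
    # Return a list of problems; an empty list means the record can proceed.
    problems = []
    for field in ("customer_name", "product", "deadline", "issue_type"):
        if field not in result:
            problems.append("missing field: " + field)
    issue_type = result.get("issue_type")
    if issue_type is not None and issue_type not in ALLOWED_ISSUE_TYPES:
        problems.append("issue_type not in allowed values: " + repr(issue_type))
    evidence = result.get("evidence") or {}
    for field, value in result.items():
        if field == "evidence" or value is None:
            continue
        span = evidence.get(field)
        if not span:
            problems.append("no evidence provided for " + field)
        elif span not in source_text:
            problems.append("evidence for " + field + " is not an exact substring")
    return problems

email = "Hi, this is Dana Reyes. Our Acme Sync export fails; we need it fixed by June 2."
extracted = {
    "customer_name": "Dana Reyes", "product": "Acme Sync",
    "deadline": "June 2", "issue_type": "bug",
    "evidence": {"customer_name": "Dana Reyes", "product": "Acme Sync",
                 "deadline": "by June 2", "issue_type": "export fails"},
}
print(validate_extraction(extracted, email))   # [] means the record passes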

5) Brainstorming and option generation: expanding the solution space

LLMs are good at generating diverse possibilities quickly: names, outlines, hypotheses, test ideas, edge cases, or alternative approaches. This is valuable when you want breadth first and will then narrow down with expertise and constraints.

Examples

  • Generate 15 onboarding checklist items for a new role, then pick the best 8.
  • Propose A/B test ideas for a landing page given a product description.
  • List potential failure modes for a new feature to inform QA planning.

Practical step-by-step workflow

  • Step 1: Provide context and constraints. Target audience, platform, brand voice, technical limits.
  • Step 2: Ask for diversity. “Include conservative, moderate, and bold options.”
  • Step 3: Request grouping. Cluster ideas into themes to reduce overwhelm.
  • Step 4: Add evaluation criteria. “Score each idea by effort, impact, risk.”
  • Step 5: Select and refine. Use the model again to elaborate only the chosen items.
Prompt template: Brainstorm
Context: [product / situation]
Goal: [what you want]
Constraints: [budget, time, tone, platform]
Generate: 20 ideas.
Then: group into 4 themes and score each idea (1-5) for impact and effort.

6) Coding assistance: accelerating routine development work

LLMs can be effective for code-adjacent language tasks: explaining code, drafting boilerplate, generating tests, refactoring for readability, or translating between languages. They are especially helpful when you can run the code, execute tests, and rely on tooling to catch mistakes.

Examples

  • Generate unit tests from a function signature and behavior description.
  • Refactor repetitive code into a reusable helper.
  • Explain a confusing error message and propose debugging steps.

Practical step-by-step workflow

  • Step 1: Provide minimal reproducible context. Function signature, expected behavior, failing test, error logs.
  • Step 2: Ask for incremental changes. “Propose a small patch; do not rewrite the whole module.”
  • Step 3: Require tests. “Include tests that fail before and pass after.” (A small example follows the template below.)
  • Step 4: Run locally. Execute tests, linters, and type checks.
  • Step 5: Review for security and correctness. Pay attention to input validation, auth, and data handling.
Prompt template: Code patch
Language: Python
Goal: Fix the bug described below.
Constraints:
- Minimal diff
- Add/adjust unit tests
- Explain the root cause in 3-5 bullets
Bug report:
[paste]
Code:
[paste relevant snippets]
Tests:
[paste failing test output]
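Step 3 is easiest to enforce when the test is concrete. The example below is purely illustrative (the function and the bug are invented for this page): a discount helper that used to return negative totals for discounts above 100%. The patched version clamps the percentage, so the first test fails against the old behavior and passes after the fix.

Code sketch (Python): a test that fails before and passes after the patch
def apply_discount(total, percent):
    # Return the total after a percentage discount, never below zero.
    percent = min(max(percent, 0.0), 100.0)   # the patch: clamp to 0-100
    return round(total * (1 - percent / 100), 2)

def test_discount_never_goes_negative():
    assert apply_discount(50.0, 120.0) == 0.0   # failed before the clamp was added

def test_regular_discount_still_works():
    assert apply_discount(50.0, 10.0) == 45.0

if __name__ == "__main__":
    test_discount_never_goes_negative()
    test_regular_discount_still_works()
    print("all tests pass")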

Decision Framework: A Practical Checklist

Use the following checklist to decide whether to use an LLM, and if so, how to integrate it safely.

Step 1: Define the task type

  • Generate: drafts, options, outlines, code scaffolding.
  • Transform: rewrite, translate, simplify, format conversion.
  • Extract: fields, entities, structured summaries.
  • Judge: classify, rank, detect policy violations, prioritize.

If your task fits one of these and the input is mostly text, an LLM is a candidate.

Step 2: Identify what “correct” means

  • Soft correctness: helpful, clear, plausible, on-brand. LLMs fit well.
  • Hard correctness: exact facts, exact numbers, exact compliance. Use LLMs only with verification and constraints.

Write down which parts must be exact (dates, amounts, names, policy clauses) and which parts can be approximate (tone, phrasing, ordering of ideas).

Step 3: Decide the role of the LLM in the workflow

  • Assistant: produces drafts for humans to approve (common for comms and planning).
  • Copilot with tools: proposes actions but external systems validate and execute (common for extraction, routing, and automation).
  • Batch processor: processes many items with sampling-based QA (common for tagging and summarization).

Step 4: Add guardrails appropriate to the risk

  • Constrain outputs: require JSON, tables, or fixed sections.
  • Force “unknown” behavior: instruct “use null / Needs review; do not guess.”
  • Use confidence thresholds: low confidence routes to humans.
  • Validate automatically: schema validation, regex checks, allowed-value checks.
  • Require evidence: quotes/excerpts for extracted claims.

Step 5: Plan for iteration

LLM-based solutions improve through prompt refinement, better examples, and better post-processing. Treat the first version as a prototype. Track failure cases and update instructions and validation rules.

Design Patterns That Make LLMs the Right Tool More Often

Pattern A: “Draft, then verify”

Use the LLM to create a draft, then verify critical elements with a separate step. This is effective when the model’s value is speed and language quality, but correctness still matters.

  • Where it fits: customer emails, incident updates, internal announcements.
  • Verification step examples: check dates against a calendar, confirm metrics against dashboards, confirm names against CRM.
Workflow sketch:
1) LLM drafts message from notes.
2) System highlights factual claims (dates, numbers, names).
3) Human verifies highlighted claims.
4) Final message sent.

Pattern B: “Extract with evidence + validation”

Ask the model to extract fields and provide evidence spans, then validate the output before saving it. This reduces silent errors and makes review faster.

  • Where it fits: CRM intake, incident reports, compliance checklists (as a preparatory step).
Workflow sketch:
1) LLM outputs JSON fields + evidence.
2) Validator checks schema, enums, required fields.
3) If validation fails or confidence low: send to review queue.
4) Otherwise: write to database.
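A minimal sketch of the decision step in this pattern, assuming a validator like the one in the extraction example earlier. `save_record` and `queue_for_review` are placeholders for your own storage and review systems, and the 0.8 threshold is an assumption to tune against audited samples.

Code sketch (Python): route validated extractions to storage or review
CONFIDENCE_THRESHOLD = 0.8      # assumed cut-off; tune it on audited samples

def save_record(record):                     # placeholder: write to your database
    print("saved:", record)

def queue_for_review(record, reasons):       # placeholder: send to a human queue
    print("needs review:", reasons)

def handle_extraction(record, problems):
    # Route a validated extraction either to storage or to human review.
    confidence = record.get("confidence", 0.0)
    if problems or confidence < CONFIDENCE_THRESHOLD:
        queue_for_review(record, problems or ["low confidence"])
    else:
        save_record(record)

handle_extraction({"customer_name": "Dana Reyes", "confidence": 0.95}, [])
handle_extraction({"customer_name": None, "confidence": 0.40}, ["no evidence for deadline"])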

Pattern C: “Triage and escalate”

Let the LLM handle the easy majority and escalate uncertain cases. This is often the highest ROI pattern because it reduces workload without demanding perfection.

  • Where it fits: support, moderation, internal request routing.
Workflow sketch:
1) LLM assigns label + confidence.
2) If confidence > 0.8: auto-route.
3) Else: human triage.
4) Collect corrected labels to refine instructions.
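The same threshold idea drives this pattern; the extra piece is keeping human corrections so you can refine label definitions over time. In this sketch, `assign_to_team` and `send_to_human_queue` stand in for whatever ticketing system you use, and the example messages are invented.

Code sketch (Python): auto-route confident labels, escalate and learn from the rest
AUTO_ROUTE_THRESHOLD = 0.8
corrections = []      # (message, model_label, human_label), reviewed weekly

def assign_to_team(message, label):                     # placeholder routing call
    print("auto-routed to", label)

def send_to_human_queue(message, label, confidence):    # placeholder escalation
    print("escalated (model guessed %s at %.2f)" % (label, confidence))

def route(message, label, confidence):
    # Auto-route confident labels; escalate the rest to a person.
    if confidence > AUTO_ROUTE_THRESHOLD:
        assign_to_team(message, label)
    else:
        send_to_human_queue(message, label, confidence)

def record_correction(message, model_label, human_label):
    # Keep corrected labels to refine definitions and examples later.
    if human_label != model_label:
        corrections.append((message, model_label, human_label))

route("My invoice shows the wrong amount", "Billing", 0.93)
route("The app keeps crashing after the update", "Billing", 0.55)
record_correction("The app keeps crashing after the update", "Billing", "Technical issue")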

Pattern D: “Generate options, keep humans in charge”

Use the LLM to expand possibilities, but keep decision-making with people. This is ideal for creative and strategic work where diversity of ideas matters.

  • Where it fits: marketing concepts, product naming, experiment ideas, training materials.

Practical Scenarios: Choosing LLM vs. Alternatives

Scenario 1: You need to answer repetitive customer questions

LLM is right when: questions are phrased many different ways, you want friendly tone, and a human can review or you can constrain answers to approved content.

LLM is not right when: answers must be exact and always up-to-date, and you cannot tolerate outdated or invented details.

Practical approach: use the LLM to draft responses and require it to quote from your approved help text; route uncertain cases to support staff.

Scenario 2: You need to populate a database from inbound emails

LLM is right when: emails are messy and vary by sender, and you can validate fields and allow nulls.

LLM is not right when: every field must be present and correct with no review, and the cost of a wrong entry is high.

Practical approach: extraction with evidence spans, schema validation, and a review queue for low-confidence items.

Scenario 3: You need to generate a weekly report narrative

LLM is right when: you already have the numbers and want a readable narrative, highlights, and risks written consistently.

LLM is not right when: the model is expected to invent or compute the numbers.

Practical approach: provide the metrics as input and instruct the model to only describe what is present; have a reviewer check that the narrative matches the metrics.

Scenario 4: You need strict, machine-readable output

LLM is right when: you can validate and repair outputs, and the schema is stable.

LLM is not right when: downstream systems will break on a single formatting deviation and you cannot add validation.

Practical approach: require JSON, validate, and retry with a corrective prompt when validation fails.

Corrective prompt pattern:
Your previous output did not match the schema because: [validator error].
Return ONLY valid JSON matching this schema: [schema].
Do not include extra keys.
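In code, the corrective prompt becomes a short retry loop. The sketch below assumes a hypothetical `call_llm` callable and a very simple validator (the reply must be valid JSON with exactly the expected keys); a real system might use a full JSON Schema check instead.

Code sketch (Python): validate-and-retry loop with a corrective prompt
import json

CORRECTIVE_PROMPT = (
    "Your previous output did not match the schema because: {error}\n"
    "Return ONLY valid JSON matching this schema: {schema}\n"
    "Do not include extra keys."
)

def validate(raw, required_keys):
    # Return an error message, or None if the output is acceptable.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return "not valid JSON (" + exc.msg + ")"
    if set(data) != set(required_keys):
        return "keys were %s, expected %s" % (sorted(data), sorted(required_keys))
    return None

def get_structured_output(call_llm, prompt, schema, required_keys, max_retries=2):
    # Ask once, then retry with the corrective prompt while validation fails.
    raw = call_llm(prompt)
    error = validate(raw, required_keys)
    for _ in range(max_retries):
        if error is None:
            break
        raw = call_llm(CORRECTIVE_PROMPT.format(error=error, schema=schema))
        error = validate(raw, required_keys)
    if error is not None:
        raise ValueError("model output still invalid after retries: " + error)
    return json.loads(raw)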

How to Write Requests That Fit “Right Tool” Tasks

Be explicit about the job-to-be-done

Instead of “Summarize this,” specify the purpose: “Summarize for an executive who needs decisions and risks in under 200 words.” The more your request resembles a real work product, the more likely the output is useful.

Provide constraints and acceptance criteria

For drafting tasks, constraints might include tone, length, reading level, and required phrases. For extraction tasks, constraints include schema, allowed values, and “null if missing.” For classification, constraints include label definitions and a fallback category.

Ask for intermediate artifacts when helpful

When the task is complex, request an outline first, then expand. Or ask for a list of assumptions before the final draft. This keeps the model aligned with your intent and makes review faster.

Two-step prompt pattern:
1) Produce an outline with section headings and bullet points.
2) Wait for my approval.
3) Expand into full text using the approved outline.

Operational Considerations: Making LLM Use Practical

Human review strategy

LLMs are most effective when you decide in advance what humans must check. For example: verify all numbers and dates; verify any claim about policy; verify any promise to a customer. This turns review into a checklist rather than an open-ended reading task.

Cost and latency trade-offs

LLMs can be cost-effective when they replace significant human time, but expensive if used for trivial tasks or called repeatedly without caching. Batch processing (summarize 100 items overnight) often provides better economics than real-time calls for every small interaction.
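One common way to control cost is to avoid paying twice for the same prompt. The sketch below is a minimal on-disk cache keyed by a hash of the prompt; `call_llm` is again a hypothetical placeholder, and a production system would add expiry and prompt versioning.

Code sketch (Python): cache responses for repeated prompts
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("llm_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_call(call_llm, prompt):
    # Return a stored reply when the exact same prompt was already answered.
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    cache_file = CACHE_DIR / (key + ".json")
    if cache_file.exists():
        return json.loads(cache_file.read_text())["response"]
    response = call_llm(prompt)          # only pay for genuinely new prompts
    cache_file.write_text(json.dumps({"prompt": prompt, "response": response}))
    return response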

Privacy and data handling

Before using an LLM with real data, define what data is allowed, how it is redacted, and where it can be processed. In many organizations, the “right tool” decision depends as much on governance as on capability.
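Redaction is often the first concrete step. The sketch below masks only obvious email addresses and phone-like numbers before text is sent anywhere; a real policy covers far more (names, addresses, internal IDs) and should be defined with whoever owns data governance.

Code sketch (Python): minimal pre-send redaction
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text):
    # Replace each matched span with a labeled placeholder.
    for label, pattern in PATTERNS.items():
        text = pattern.sub("[" + label + "]", text)
    return text

print(redact("Reach me at ana.silva@example.com or +1 (555) 010-2030."))
# -> Reach me at [EMAIL] or [PHONE].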

Monitoring and feedback loops

For classification and extraction, track error types: wrong label, missing field, overconfident output, or inconsistent formatting. Use these examples to refine instructions, tighten validation, and adjust escalation thresholds.

Now answer the exercise about the content:

In which situation is an LLM most appropriate as the main tool rather than only a helper inside a stricter system?


LLMs fit best when the hard part is working with language (drafting, rewriting, summarizing) over ambiguous inputs, and when guardrails or human review can catch occasional mistakes.

Next chapter

When to Avoid or Constrain LLM Use
