AI Fundamentals for Absolute Beginners: Concepts, Use Cases, and Key Terms

Essential AI Glossary: Key Terms You Will Hear Often

Chapter 11

Estimated reading time: 15 minutes

How to Use This Glossary (So It Actually Sticks)

This chapter is a practical glossary of AI terms you will hear repeatedly in articles, product pages, meetings, and tutorials. Instead of memorizing definitions, use each term in three steps:

  • Step 1: Identify the “thing” (Is it data? a setting? a component? a measurement? a role?)
  • Step 2: Ask what it changes (Does it change quality, speed, cost, safety, or user experience?)
  • Step 3: Connect it to an example (A chatbot, a photo tool, a search feature, or a fraud detector.)

When a term appears below, you will see a clear meaning and a practical “where you’ll see it” example. Some entries also include simple step-by-step checks you can do as a non-coder.

Core Building Blocks You’ll Hear in Most AI Products

Dataset

A dataset is a collection of examples used to build or test an AI system. In everyday products, datasets can be customer support chats, product photos, sensor readings, or documents.

Where you’ll see it: “We trained on a dataset of 2 million images.” As a user, you may not see the dataset, but you’ll feel its effects in what the system handles well (and what it fails on).

Feature

A feature is a measurable input signal used by a model. In a spam filter, features might include “contains certain words,” “sender reputation,” or “number of links.” In a credit risk tool, features might include “income,” “payment history,” and “debt ratio.”

Practical check: If someone says “we added new features,” ask: “What new signals did you add, and how do they improve decisions?”

Parameter

A parameter is an internal numeric value that the model adjusts during learning. You don’t set parameters directly in most tools; the system learns them.

Where you’ll see it: “This model has 7 billion parameters.” Bigger is not automatically better; it can mean more capability, but also more cost and sometimes more unpredictability.

Hyperparameter

A hyperparameter is a setting chosen by humans that affects how a model learns or behaves. Examples include learning rate, number of training steps, or (in text generation) temperature.

Where you’ll see it: In AI apps, hyperparameters show up as sliders like “creativity,” “style strength,” or “detail.”

Architecture

Architecture is the blueprint of a model: how it is structured and how information flows through it. Different architectures suit different tasks (text, images, audio, time series).

Where you’ll see it: “Transformer-based model,” “diffusion model,” or “CNN.” You don’t need to know the math; treat architecture as “the kind of engine under the hood.”

Pipeline

A pipeline is the end-to-end process that turns raw input into an output. It often includes steps like cleaning input, running the model, and post-processing results.

Where you’ll see it: “Our pipeline extracts text from PDFs, summarizes it, then formats a report.”

Common Model Types and System Labels

LLM (Large Language Model)

An LLM is a model designed to work with language: it can generate, rewrite, summarize, classify, and answer questions based on text patterns.

Where you’ll see it: Chatbots, writing assistants, customer support automation, and “ask your documents” tools.

Multimodal

Multimodal systems can handle more than one type of input or output, such as text + images, or text + audio.

Where you’ll see it: “Upload a screenshot and ask questions,” “Describe an image,” or “Generate an image from a prompt.”

Embedding

An embedding is a way to represent text (or images) as a list of numbers so that “similar meaning” ends up “close together” in that numeric space. Embeddings are widely used for search and recommendations.

Where you’ll see it: “Semantic search,” “find similar documents,” “recommend related articles.”
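
If you’re curious what “close together in numeric space” means, here is a minimal sketch with made-up three-number embeddings; real embeddings come from a model and have hundreds or thousands of numbers.

```python
import math

def cosine_similarity(a, b):
    """Higher values mean more similar meaning (1.0 = pointing the same way)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings, invented for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
invoice = [0.0, 0.2, 0.95]

print(cosine_similarity(cat, kitten))   # close to 1.0 -> similar meaning
print(cosine_similarity(cat, invoice))  # much lower -> unrelated meaning
```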

Vector Database

A vector database stores embeddings and lets you quickly retrieve items that are “closest” to a query embedding.

Where you’ll see it: Tools that let you chat with your company documents often use a vector database behind the scenes.
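
Conceptually, a vector database is doing something like the brute-force search below, just far faster and at far larger scale. The document embeddings here are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def most_similar(query_vec, stored):
    """Return stored items ranked by similarity to the query embedding."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Pretend these are embeddings of documents already stored in the database.
documents = {
    "refund policy": [0.1, 0.9, 0.2],
    "shipping times": [0.8, 0.2, 0.1],
    "holiday hours": [0.2, 0.3, 0.9],
}

query = [0.15, 0.85, 0.25]  # embedding of "How do I get my money back?" (invented)
print(most_similar(query, documents)[0])  # -> ("refund policy", ...)
```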

RAG (Retrieval-Augmented Generation)

RAG is a setup where a language model first retrieves relevant information (often from your documents) and then generates an answer grounded in that retrieved content.

Why it matters: RAG can reduce made-up answers and keep responses aligned with your sources.

Step-by-step: how to spot if a tool is using RAG

  • Step 1: Ask a question that should be answered from a specific document you uploaded.
  • Step 2: Check whether the tool shows citations, quotes, or links to passages.
  • Step 3: Ask “Where did you get that?” A RAG-based tool often points to specific sections.
  • Step 4: Change the document and ask again; a RAG system should change its answer accordingly.
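
Here is a minimal sketch of the RAG idea: retrieve first, then generate. The retrieval step is reduced to simple word overlap and the generation step is a placeholder function; a real system would use embeddings for retrieval and a language model for generation.

```python
def retrieve(question, documents, top_k=1):
    """Toy retrieval: rank documents by how many question words they share."""
    words = set(question.lower().split())
    ranked = sorted(documents, key=lambda doc: len(words & set(doc.lower().split())), reverse=True)
    return ranked[:top_k]

def generate_answer(question, passages):
    """Placeholder for the language model call; a real system would send the
    question plus the retrieved passages to an LLM and return its answer."""
    return f"Based on {passages[0]!r}, the answer to {question!r} goes here."

docs = [
    "Refunds are issued within 14 days of a return request.",
    "Standard shipping takes 3 to 5 business days.",
]

passages = retrieve("How long do refunds take?", docs)
print(generate_answer("How long do refunds take?", passages))
```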

Agent

An agent is an AI system that can plan and take actions toward a goal, often by calling tools (search, calendars, spreadsheets, code, databases) rather than only generating text.

Where you’ll see it: “AI that can book meetings,” “AI that updates your CRM,” or “AI that runs a workflow.”

Practical caution: Agents can cause real-world changes. Look for approval steps, logs, and undo options.

Tool Use / Function Calling

Tool use (often called function calling) is when a model outputs a structured request to use an external tool, like “search the web,” “fetch customer record,” or “create invoice.”

Where you’ll see it: Assistants that can “do” things, not just “say” things.
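
In practice, tool use often looks like the sketch below: the model emits a structured request, and the application (not the model) runs the tool and passes the result back. The tool name and fields here are invented for illustration.

```python
import json

# What the model emits: a request to call a tool, not the final answer.
tool_request = {
    "tool": "fetch_customer_record",            # hypothetical tool name
    "arguments": {"customer_id": "C-1042"},     # hypothetical argument
}

# What the application does with it: look up the tool and run it.
def fetch_customer_record(customer_id):
    return {"customer_id": customer_id, "plan": "Pro", "open_tickets": 2}

available_tools = {"fetch_customer_record": fetch_customer_record}

result = available_tools[tool_request["tool"]](**tool_request["arguments"])
print(json.dumps(result))  # the result is handed back to the model so it can write a reply
```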

Prompting and Output Control Terms

Prompt

A prompt is the input you give an AI system (instructions, context, examples, constraints). In many products, your prompt is the main “control surface.”

Where you’ll see it: Chat input boxes, image generation text fields, or “instructions” panels.

System Prompt / Developer Prompt / User Prompt

Many chat systems use layered instructions:

  • System prompt: high-priority rules (tone, safety boundaries, role).
  • Developer prompt: app-specific instructions (how to format answers, what tools to use).
  • User prompt: what you type.

Why it matters: If an assistant “refuses” or behaves oddly, it may be following higher-priority instructions you can’t see.
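
Many chat APIs represent these layers as a list of messages with roles. Exact role names and formats vary by provider, so treat this as an illustrative shape rather than any specific vendor’s format.

```python
messages = [
    # Highest-priority rules, set by the platform or product team.
    {"role": "system", "content": "You are a support assistant. Never reveal account numbers."},
    # App-specific instructions from the developer.
    {"role": "developer", "content": "Answer in three bullet points and cite the help-center article."},
    # What the end user actually typed.
    {"role": "user", "content": "How do I reset my password?"},
]
```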

Context Window

The context window is how much text (and sometimes images) the model can consider at once: your message, earlier conversation, and any documents inserted.

Where you’ll see it: “Supports 128k tokens” or “long context.” Larger context helps with long documents but can cost more and still doesn’t guarantee perfect recall.

Token

A token is a chunk of text used internally by language models. Tokens are not exactly words; a word can be one token or several.

Where you’ll see it: Pricing (“$ per 1M tokens”), limits (“max tokens”), and speed discussions.
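
If you want to see tokenization for yourself, the small sketch below counts words versus tokens. It assumes the third-party tiktoken package is installed, and the encoding name shown is just one common choice, not a universal standard.

```python
import tiktoken  # assumption: installed via `pip install tiktoken`

enc = tiktoken.get_encoding("cl100k_base")  # one common tokenizer, not universal
text = "Tokenization splits text into chunks."
tokens = enc.encode(text)

print(len(text.split()), "words")
print(len(tokens), "tokens")  # usually more tokens than words
```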

Temperature

Temperature is a generation setting that affects randomness. Lower temperature tends to produce more consistent, conservative outputs; higher temperature tends to produce more varied outputs.

Step-by-step: choosing temperature in practice

  • Step 1: If you need accuracy and consistency (policies, summaries, extraction), start low.
  • Step 2: If you need variety (brainstorming, slogans), increase gradually.
  • Step 3: Run the same prompt 3 times; if outputs vary too much, lower temperature.
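
Under the hood, temperature rescales the model’s next-token probabilities before sampling. The sketch below shows that effect on an invented three-token distribution.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a probability distribution: low temperature sharpens it,
    high temperature flattens it."""
    logits = [math.log(p) for p in probs]
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

next_token_probs = [0.70, 0.20, 0.10]  # invented probabilities for three candidate tokens
print(apply_temperature(next_token_probs, 0.2))  # almost all weight on one token -> predictable
print(apply_temperature(next_token_probs, 1.5))  # much flatter -> more varied outputs
```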

Top-p (Nucleus Sampling)

Top-p is another setting that controls randomness by limiting the model to a set of likely next tokens whose total probability is p. It often works alongside temperature.

Where you’ll see it: Advanced settings in text generation tools.
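
A minimal sketch of the nucleus idea: keep only the most likely candidates whose probabilities add up to p, then sample from what remains. The candidate words and probabilities are invented.

```python
def top_p_filter(candidates, p=0.9):
    """candidates: list of (token, probability) pairs. Keep the smallest set of
    most-likely tokens whose total probability reaches p, then renormalize."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break
    return [(token, prob / total) for token, prob in kept]

candidates = [("blue", 0.5), ("green", 0.3), ("purple", 0.15), ("plaid", 0.05)]
print(top_p_filter(candidates, p=0.9))  # "plaid" is dropped; sampling happens among the rest
```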

Max Tokens / Output Length

Max tokens limits how long the response can be. If an answer cuts off, the limit may be too low.

Practical fix: Increase max tokens or ask for the output in parts (e.g., “Give sections 1–3 now, then wait”).

Stop Sequence

A stop sequence is a piece of text that tells the model when to stop generating (for example, stop when it reaches “###”).

Where you’ll see it: Automations that need clean outputs without extra commentary.

Training-Related Terms You’ll Hear in Product Descriptions

Pretraining

Pretraining is the initial large-scale learning phase that gives a model broad capabilities. You’ll often hear “pretrained model” to mean a model that already knows general patterns and can be adapted.

Where you’ll see it: “Built on a pretrained foundation model.”

Fine-Tuning

Fine-tuning is adapting a pretrained model to a specific domain or style using additional examples.

Where you’ll see it: “Fine-tuned for legal writing,” “fine-tuned on our support tickets.”

Step-by-step: deciding if fine-tuning is needed (non-coder checklist)

  • Step 1: Try strong prompting with examples and clear formatting rules.
  • Step 2: If the model still misses domain-specific terminology or tone, consider fine-tuning.
  • Step 3: If the main problem is “it doesn’t know our latest internal facts,” consider retrieval (RAG) instead of fine-tuning.

Instruction Tuning

Instruction tuning is training a model to follow instructions better (e.g., “write a table,” “answer in JSON,” “be concise”).

Where you’ll see it: “Instruction-following model,” “chat-optimized model.”

RLHF (Reinforcement Learning from Human Feedback)

RLHF is a method where humans rate or compare outputs, and the model is adjusted to better match preferred responses (helpfulness, harmlessness, style).

Where you’ll see it: “Aligned with human feedback,” “safer, more helpful responses.”

Alignment

Alignment refers to how well a model’s behavior matches human intentions and constraints (helpful, safe, policy-compliant, on-task).

Where you’ll see it: Safety discussions, enterprise AI requirements, and “guardrails” features.

Overfitting

Overfitting happens when a model learns patterns that are too specific to its training examples and performs worse on new, unseen cases.

Where you’ll see it: “Great in the lab, disappointing in real use.” In practice, it can show up as a model that handles familiar templates well but fails on slightly different inputs.

Generalization

Generalization is the ability to perform well on new inputs that weren’t seen during training.

Practical check: Test with varied examples (different phrasing, formats, edge cases). If performance collapses, generalization is weak.

Quality, Reliability, and Evaluation Terms

Benchmark

A benchmark is a standardized test used to compare models. Benchmarks can be useful, but they may not match your real task.

Practical check: Ask: “Which benchmark, and how similar is it to our use case?”

Ground Truth

Ground truth is the best available reference answer used to evaluate outputs. For example, verified labels, audited records, or expert judgments.

Where you’ll see it: “Compared against ground truth annotations.”

Precision and Recall

Precision means “when the system says yes, how often is it correct?” Recall means “of all the true cases, how many did it catch?”

Everyday example: In a fraud detector, high precision means fewer legitimate transactions are flagged; high recall means more fraud is caught. Often you trade one for the other.
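
Both measures are simple ratios. Here is a worked example with invented counts from a fraud detector.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Invented counts: 80 frauds caught, 20 legitimate transactions wrongly flagged,
# 40 frauds missed.
p, r = precision_recall(true_positives=80, false_positives=20, false_negatives=40)
print(f"precision = {p:.2f}")  # 0.80 -> when it says "fraud", it is right 80% of the time
print(f"recall    = {r:.2f}")  # 0.67 -> it catches about two thirds of all fraud
```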

False Positive / False Negative

A false positive is an incorrect “yes” (flagging a legitimate email as spam). A false negative is an incorrect “no” (missing a spam email).

Practical step: When evaluating a tool, ask which error is more costly in your context and tune thresholds accordingly.

Threshold

A threshold is a cutoff value that turns a score into a decision (approve/deny, spam/not spam). Changing the threshold changes the balance between false positives and false negatives.

Where you’ll see it: Risk scoring systems, moderation tools, and anomaly detection dashboards.
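
The sketch below shows how the same scores turn into different decisions depending on the cutoff. The scores are invented.

```python
scores = [0.15, 0.48, 0.55, 0.72, 0.91]  # invented risk scores for five transactions

def decide(scores, threshold):
    return ["flag" if s >= threshold else "allow" for s in scores]

print(decide(scores, threshold=0.5))  # flags 3 of 5 -> more false positives, fewer misses
print(decide(scores, threshold=0.8))  # flags 1 of 5 -> fewer false positives, more misses
```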

Calibration

Calibration describes whether a model’s confidence scores match reality. If a system says “90% confident” many times, it should be correct about 90% of those times to be well-calibrated.

Why it matters: Poor calibration makes confidence scores misleading, which can cause bad decisions in high-stakes workflows.
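
One simple way to check calibration: group predictions by their stated confidence and see how often each group was actually correct. The records below are invented.

```python
from collections import defaultdict

# Invented records: (stated confidence, whether the prediction was actually correct)
predictions = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True),
    (0.6, True), (0.6, False), (0.6, False), (0.6, True),
]

buckets = defaultdict(list)
for confidence, correct in predictions:
    buckets[confidence].append(correct)

for confidence, outcomes in sorted(buckets.items()):
    accuracy = sum(outcomes) / len(outcomes)
    print(f"stated {confidence:.0%} confidence -> actually correct {accuracy:.0%} of the time")
```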

Drift (Data Drift / Concept Drift)

Drift means the world changes and the system’s performance degrades. Data drift is when input patterns change (new slang, new product categories). Concept drift is when the meaning of the target changes (what counts as “fraud” evolves).

Step-by-step: a simple drift watch routine

  • Step 1: Track a small set of key metrics weekly (error rate, complaint rate, manual review rate).
  • Step 2: Compare recent inputs to older ones (formats, lengths, categories).
  • Step 3: If metrics worsen, collect examples of failures and decide whether to update prompts, retrieval sources, thresholds, or the model.

Latency

Latency is the time it takes to get a response. Lower latency feels more “instant.”

Where you’ll see it: “Real-time AI,” “streaming responses,” or complaints that a chatbot is slow.

Throughput

Throughput is how many requests a system can handle per unit time (e.g., per second). It matters for high-traffic apps.

Cost per Request

This is the money spent each time the AI is used (often tied to tokens, compute, and tool calls). It affects whether a feature is sustainable at scale.

Practical check: Ask for a cost estimate for typical and worst-case usage (long documents, many tool calls, retries).
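
A back-of-the-envelope sketch of that estimate. The per-token prices are placeholders, not any vendor’s real pricing; plug in your provider’s actual rates.

```python
def cost_per_request(input_tokens, output_tokens,
                     input_price_per_million, output_price_per_million):
    return (input_tokens / 1_000_000) * input_price_per_million + \
           (output_tokens / 1_000_000) * output_price_per_million

# Placeholder prices (dollars per million tokens); check your provider's pricing page.
typical = cost_per_request(2_000, 500, input_price_per_million=1.0, output_price_per_million=3.0)
worst = cost_per_request(50_000, 4_000, input_price_per_million=1.0, output_price_per_million=3.0)

print(f"typical request: ${typical:.4f}")
print(f"worst-case request: ${worst:.4f}")
print(f"monthly cost at 100k typical requests: ${typical * 100_000:,.0f}")
```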

Text Generation Failure Modes and Safety Terms (Practical Vocabulary)

Hallucination

A hallucination is when a model produces information that sounds plausible but is incorrect or unsupported.

Where you’ll see it: Confidently invented citations, wrong dates, fake product features, or made-up policies.

Step-by-step: quick hallucination triage

  • Step 1: Identify which statements are checkable facts (numbers, names, quotes, policies).
  • Step 2: Ask for sources or direct quotes from provided documents.
  • Step 3: Verify the critical facts externally (official docs, internal records).
  • Step 4: If the tool can’t cite sources, treat factual claims as untrusted.

Guardrails

Guardrails are controls that limit or shape AI behavior: content filters, allowed tools, required citations, formatting constraints, and approval steps.

Where you’ll see it: “Enterprise guardrails,” “policy controls,” “safe completion.”

Content Moderation

Content moderation is filtering or classifying content to enforce rules (e.g., blocking harmful requests, removing unsafe outputs).

Where you’ll see it: Social platforms, workplace chat tools, and image generation apps.

Red Teaming

Red teaming is structured testing where people try to break the system: provoke unsafe outputs, bypass rules, or trigger failures.

Practical example: Testing whether a support chatbot reveals private account details when asked in tricky ways.

Prompt Injection

Prompt injection is when an attacker hides instructions in content (like a web page or document) to manipulate the model into ignoring its rules or leaking data.

Where you’ll see it: “Chat with websites,” “summarize emails,” or any tool that reads untrusted text.

Step-by-step: basic defenses you can ask for

  • Step 1: Ensure the system treats external content as data, not instructions.
  • Step 2: Restrict tool permissions (least privilege).
  • Step 3: Require user confirmation before sensitive actions (sending emails, payments).
  • Step 4: Log tool calls and show the user what the agent is about to do.

Data Leakage

Data leakage is when sensitive information is exposed to the wrong place or person. In AI systems, leakage can happen through logs, prompts, tool outputs, or overly broad access to internal documents.

Where you’ll see it: Enterprise AI rollouts and compliance reviews.

Deployment and Operations Terms (What Happens After a Demo)

Production

Production is the real environment where customers or employees rely on the system. Many AI tools look great in demos but behave differently in production because inputs are messier and stakes are higher.

Monitoring

Monitoring means continuously tracking performance, errors, latency, cost, and safety issues after deployment.

Practical check: Ask what is monitored, how often alerts trigger, and who responds.

Logging

Logging is recording inputs, outputs, and system events for debugging and auditing. Logs are useful but can be risky if they store sensitive data.

Where you’ll see it: “Conversation logs,” “audit trail,” “trace view.”

A/B Testing

A/B testing compares two versions (Model A vs. Model B, or Prompt A vs. Prompt B) on real traffic to see which performs better.

Practical example: Testing whether a shorter prompt reduces cost without lowering answer quality.

Fallback

A fallback is what the system does when the AI fails or is uncertain: route to a human, use a simpler rule-based method, or ask clarifying questions.

Practical check: Ask: “What happens when the model is wrong, slow, or unavailable?”

Working Vocabulary for Talking to Vendors and Teams

On-Prem vs. Cloud

On-prem means running systems in your own infrastructure; cloud means running on a provider’s infrastructure. This affects control, cost, and compliance.

Where you’ll see it: “Private deployment,” “VPC,” “self-hosted,” “managed service.”

SLA (Service Level Agreement)

An SLA is a commitment about uptime, response time, and support. For AI, you may also care about rate limits and incident response.

Rate Limit

A rate limit caps how many requests you can make in a given time period. It protects the provider from overload, but it can break your workflows if you underestimate how many requests you will need.

Practical check: Ask for limits per minute and per day, and what happens when you exceed them.

API

An API is a way for software to talk to software. Many AI features are accessed via APIs so they can be built into apps, websites, and internal tools.

Where you’ll see it: “API key,” “endpoint,” “request/response.”
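
For a sense of what “accessed via an API” looks like from code, here is a hedged sketch. The URL, payload fields, and key are placeholders, not a real provider’s endpoint; every vendor’s format differs.

```python
import requests  # assumption: the requests package is installed

API_KEY = "YOUR_API_KEY"  # placeholder
URL = "https://api.example.com/v1/summarize"  # hypothetical endpoint

response = requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"text": "Long document goes here...", "max_tokens": 200},
    timeout=30,
)
print(response.status_code)  # 200 means the request succeeded
print(response.json())       # the AI feature's output, as structured data
```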

SDK

An SDK is a set of tools and libraries that makes it easier to use an API in a specific programming language.

Webhook

A webhook is an automated message sent from one system to another when an event happens (e.g., “analysis complete”).

Where you’ll see it: Automations that trigger actions after an AI job finishes.

Mini Reference: Quick “Translate the Jargon” Examples

Example 1: “We support long context and RAG with citations.”

Translation: The model can read a lot at once, and it can pull relevant passages from your documents and show where the answer came from.

Example 2: “We fine-tuned an instruction-following LLM and added guardrails.”

Translation: They adapted a language model to follow tasks and formatting better, and they added controls to reduce unsafe or off-policy outputs.

Example 3: “We improved latency and throughput, but token costs increased.”

Translation: It responds faster and handles more users, but each request may cost more because of model choice, longer prompts, or extra tool calls.

Practical Glossary Drill (5 Minutes)

Use this short exercise to make the terms usable in real conversations:

  • Step 1: Pick one AI tool you use (chatbot, image generator, search feature).
  • Step 2: Identify 5 terms from this chapter that apply (e.g., prompt, context window, tokens, RAG, guardrails).
  • Step 3: For each term, write one sentence: “In this tool, [term] shows up as ____.”
  • Step 4: Write one question you would ask a vendor or teammate using that term (e.g., “What’s our fallback when the model is uncertain?”).

Example fill-in: In our support chatbot, RAG shows up as citations to our help-center articles. Question: Can we restrict retrieval to the latest policy pages only?

Now answer the exercise about the content:

Which situation best indicates that an AI tool is using RAG to answer questions about your uploaded documents?

Answer: RAG retrieves relevant content first, then generates an answer grounded in that retrieved text. A practical sign is seeing citations or quotes, and the answer updating when the document changes.
