
Marketing Analytics for Beginners: Measure What Matters and Make Better Decisions

Common Data Pitfalls and Quality Checks in Marketing Analytics

Chapter 11

Estimated reading time: 9 minutes


Marketing analytics is only as useful as the data feeding it. Most “bad decisions” in marketing reporting come from a small set of repeatable data problems: missing tracking, double-counting, non-human traffic, privacy-related gaps, identity mismatches, time lags, and post-purchase reversals. This chapter focuses on how to spot these issues early and how to run quick checks before you trust a number.

Tracking gaps: missing UTMs, broken pixels, and lost parameters

What it is

A tracking gap happens when a user action occurs but your systems fail to record it correctly (or at all). The result is undercounted conversions, misattributed revenue, and inflated “direct” traffic.

Common causes

  • Missing UTMs on paid or partner links, causing traffic to fall into direct or referral.
  • Broken pixels/tags after site changes (new templates, tag manager updates, checkout redesign).
  • Redirects stripping parameters (e.g., ?utm_source=... lost during a 301/302 redirect).
  • In-app browsers (social apps) that handle cookies differently and can drop identifiers.
  • Cross-domain issues (landing domain → checkout domain) where sessions break and attribution resets.

Practical checks (step-by-step)

  1. Pick one live campaign link (ad, email, partner) and click it in an incognito window.
  2. Confirm UTMs persist: the landing page URL should still include UTMs after any redirects.
  3. Open your analytics real-time view (or debug mode) and verify the session source/medium matches the UTMs.
  4. Trigger a test conversion (newsletter signup, add-to-cart, purchase in a test environment) and confirm the event fires once.
  5. Check tag firing rules: ensure the pixel/tag triggers on the right pages and not on refresh or duplicate pageviews.

Fast symptom to watch: a sudden rise in direct traffic or (not set) values after a site release often indicates UTMs or referrers are being lost.
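
If you want to automate steps 1 and 2, a minimal sketch like the one below (Python with the requests library; the campaign URL and UTM values are placeholders, not real links) follows redirects and reports any UTM parameters dropped along the way.

    # Minimal sketch: confirm UTM parameters survive a redirect chain.
    # The campaign URL and UTM values are placeholders; use a real campaign link.
    from urllib.parse import parse_qs, urlparse

    import requests

    campaign_url = (
        "https://example.com/landing"
        "?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale"
    )

    # Follow 301/302 redirects the way a browser would.
    response = requests.get(campaign_url, allow_redirects=True, timeout=10)

    final_params = parse_qs(urlparse(response.url).query)
    expected = {"utm_source", "utm_medium", "utm_campaign"}
    missing = expected - final_params.keys()

    if missing:
        print(f"UTMs lost after redirect: {sorted(missing)} (final URL: {response.url})")
    else:
        print("All UTM parameters survived the redirect chain.")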

Duplication and double-counting: when “one conversion” becomes two (or more)

What it is

Double-counting occurs when the same real-world action is recorded multiple times. It can happen within one tool (duplicate events) or across tools (platform + analytics + CRM all counted as separate “conversions”).

Where it shows up

  • Pixel fires twice due to multiple tag containers, duplicated scripts, or SPA (single-page app) route changes.
  • Thank-you page reloads or users bookmarking the confirmation page.
  • Multiple conversion definitions (e.g., “Purchase” and “Order Completed” both counted as primary conversions).
  • Server-side + browser-side tracking both sending the same event without deduplication keys.

Practical checks (step-by-step)

  1. Choose a single order ID (or lead ID) and search for it across systems (analytics export, ad platform, backend/CRM).
  2. Count occurrences: it should appear once per system per definition.
  3. Inspect event logs (tag manager preview, network requests) and confirm only one request is sent on conversion.
  4. Verify deduplication: if you track both browser and server events, ensure a shared event_id or order_id is used to dedupe.
  5. Check “conversion windows” and “counting method” in ad platforms (e.g., “every” vs “one” for leads).

Symptom | Likely cause | Quick fix direction
Conversions jump exactly 2× overnight | Duplicate tag deployed | Audit tag manager versions and page source
More purchases than orders in backend | Thank-you page reload / SPA firing twice | Fire on transaction ID once; block repeats
Platform shows more conversions than analytics | Different counting rules / modeled conversions | Align definitions; compare using same date basis
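
One way to run the order-ID check from steps 1 and 2 is a small script that counts how often a single ID appears in each system's export. This is only a sketch: the CSV file names and column names below are assumptions you would replace with your own exports.

    # Sketch: count how often one order ID appears in each system's export.
    # File names and ID column names are assumptions; adjust to your exports.
    import csv
    from collections import Counter

    ORDER_ID = "ORD-12345"  # the single order you are auditing

    exports = {
        "analytics": ("analytics_events.csv", "transaction_id"),
        "ad_platform": ("ad_platform_conversions.csv", "order_id"),
        "backend_crm": ("backend_orders.csv", "order_id"),
    }

    for system, (path, id_column) in exports.items():
        with open(path, newline="") as f:
            counts = Counter(row[id_column] for row in csv.DictReader(f))
        occurrences = counts[ORDER_ID]
        # Expect exactly one occurrence per system per conversion definition.
        flag = "" if occurrences == 1 else "  <-- possible duplicate or tracking gap"
        print(f"{system}: {occurrences} occurrence(s){flag}")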

Bot traffic and spam: fake visits that distort performance

What it is

Bot traffic includes automated crawlers, click fraud, and spam referrals that inflate sessions, skew conversion rates, and can trigger fake leads.


Common signs

  • Sudden spikes in sessions with no corresponding spend change.
  • Very low engagement (0 seconds, 100% bounce) or unrealistically high engagement (thousands of pageviews per session).
  • Odd geographies or data center ISPs dominating traffic.
  • High volume to a single landing page with nonsensical paths.

Practical checks (step-by-step)

  1. Segment by source/medium and geography for the spike period.
  2. Check engagement metrics (time on site, pages/session, event rate) for outliers.
  3. Inspect landing pages: are bots hitting obscure URLs or parameterized pages?
  4. Compare against server logs (if available) to validate user agents and request patterns.
  5. Apply filters/blocks: exclude known bot traffic, block suspicious referrers, add bot protection to forms.

Decision risk: bot-driven “cheap traffic” can make a channel look efficient while actually harming lead quality and wasting budget.
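
If you can export sessions to a file, a quick pass with pandas can surface suspicious sources during the spike period. The file name, column names, dates, and thresholds below are assumptions for illustration, not standards; tune them to your own data.

    # Sketch: flag suspicious traffic sources during a spike period.
    # File name, column names, dates, and thresholds are assumptions.
    import pandas as pd

    sessions = pd.read_csv("sessions_export.csv", parse_dates=["date"])
    spike = sessions[(sessions["date"] >= "2024-03-01") & (sessions["date"] <= "2024-03-07")]

    by_source = spike.groupby("source_medium").agg(
        sessions=("session_id", "count"),
        avg_engagement_seconds=("engagement_seconds", "mean"),
        conversion_rate=("converted", "mean"),  # assumes a 0/1 converted flag
    )

    # Heuristic: lots of sessions, near-zero engagement, no conversions.
    suspicious = by_source[
        (by_source["sessions"] > 500)
        & (by_source["avg_engagement_seconds"] < 2)
        & (by_source["conversion_rate"] == 0)
    ]
    print(suspicious.sort_values("sessions", ascending=False))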

Cookie consent impact: why numbers drop (or shift) after privacy changes

What it is

Consent banners and privacy settings can prevent analytics and ad tags from firing until a user opts in. This creates partial visibility: you may still get spend and clicks, but fewer sessions and conversions recorded in analytics.

How it typically affects reporting

  • Underreported users and conversions in analytics tools that require consent to set cookies.
  • Channel mix shifts: more traffic appears as direct or (not set) when identifiers are missing.
  • Platform vs analytics divergence increases because platforms may use modeled or aggregated reporting.

Practical checks (step-by-step)

  1. Identify the consent change date (banner launch, CMP update, region rollout).
  2. Compare opt-in rates by device and region; low opt-in often correlates with bigger data gaps.
  3. Trend “direct” and “(not set)” before/after the change.
  4. Compare platform conversions vs analytics conversions across the same period to quantify the gap.
  5. Validate tag behavior: confirm tags fire only after consent and that consent states are passed correctly.

Tip: When consent reduces measurement, focus on stable comparisons (e.g., same region, same device mix) and use backend/CRM outcomes as an anchor where possible.
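
To put a number on step 3, a small sketch like this compares the share of "direct" and "(not set)" sessions before and after the consent change date. The file name, source labels, and the change date are assumptions.

    # Sketch: share of "direct" / "(not set)" sessions before vs after a consent change.
    # File name, source labels, and the change date are assumptions.
    import pandas as pd

    sessions = pd.read_csv("sessions_export.csv", parse_dates=["date"])
    consent_change = pd.Timestamp("2024-05-15")

    summary = (
        sessions.assign(
            period=sessions["date"].ge(consent_change).map({True: "after", False: "before"}),
            unattributed=sessions["source_medium"].isin(["(direct) / (none)", "(not set)"]),
        )
        .groupby("period")["unattributed"]
        .mean()  # share of sessions landing in these buckets
    )
    print(summary)  # a jump after the change points to consent-driven gaps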

Cross-device and identity issues: one person, multiple sessions, multiple “users”

What it is

People often discover you on one device and convert on another (mobile → desktop), or switch browsers. If your measurement can’t connect these interactions, you’ll see fragmented journeys and misattribution.

Where it hurts decisions

  • Upper-funnel channels look weak because conversions happen later on another device.
  • Retargeting appears more effective than it is if it “catches” the final device/session.
  • Frequency and reach are overstated because the same person is counted multiple times.

Practical checks (step-by-step)

  1. Compare device splits for traffic vs conversions (e.g., 80% of sessions on mobile but 70% of purchases on desktop can be normal; watch for sudden shifts).
  2. Check login rate (if applicable): logged-in flows typically have better cross-device continuity.
  3. Review assisted conversion patterns (if available) to see if certain channels frequently start journeys.
  4. Validate cross-domain/session continuity (landing → checkout) to avoid creating artificial “new sessions” mid-journey.

Practical interpretation: treat user-level metrics (users, frequency) as directional unless you have strong identity resolution (logins or consistent identifiers).

Delayed conversions: time lags that make recent performance look worse

What it is

Many conversions happen days or weeks after the first click/visit. If you judge campaigns too quickly, you may pause winners and scale losers.

Typical sources of delay

  • Consideration cycles (B2B, high-ticket items).
  • Offline steps (sales calls, demos, approvals).
  • Attribution processing delays (platform reporting latency, modeled conversions posted later).

Practical checks (step-by-step)

  1. Use a “cooling-off” window: avoid final judgments on the last 1–3 days (or longer for long cycles).
  2. Compare by conversion date vs click date: understand whether your report is “when it happened” or “what caused it.”
  3. Track lag distribution: what % of conversions arrive within 1 day, 7 days, 28 days?
  4. Annotate reporting: mark periods where conversions are expected to backfill.
Example lag snapshot (illustrative): 40% same-day, 35% within 7 days, 20% within 28 days, 5% after 28 days
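
A lag snapshot like the one above can be computed from any conversions export that records both the click (or first visit) date and the conversion date. The file and column names in this sketch are assumptions.

    # Sketch: conversion lag distribution (click date -> conversion date).
    # File and column names are assumptions; adjust to your export.
    import pandas as pd

    conversions = pd.read_csv(
        "conversions_export.csv", parse_dates=["click_date", "conversion_date"]
    )
    lag_days = (conversions["conversion_date"] - conversions["click_date"]).dt.days

    buckets = pd.cut(
        lag_days,
        bins=[-1, 0, 7, 28, float("inf")],
        labels=["same day", "within 7 days", "within 28 days", "after 28 days"],
    )
    print(buckets.value_counts(normalize=True).sort_index().mul(100).round(1))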

Refunds and chargebacks: revenue that disappears after you count it

What it is

Analytics and ad platforms often record gross revenue at purchase time, but finance cares about net revenue after refunds, cancellations, and chargebacks. If you optimize on gross-only, you may scale campaigns that attract high-refund customers.

Practical checks (step-by-step)

  1. Define the adjustment source of truth: payment processor, ecommerce platform, or finance system.
  2. Calculate refund rate by channel/campaign/cohort (e.g., refunds within 30 days).
  3. Adjust ROAS views: compare gross ROAS vs net ROAS for major channels.
  4. Flag anomalies: campaigns with normal conversion volume but unusually high refunds.

Metric | Gross view | Net view | Why it matters
Revenue | At purchase time | After refunds/chargebacks | Prevents overestimating profitability
ROAS | Revenue / Spend | (Revenue − Refunds) / Spend | Stops scaling low-quality acquisition
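
The net view is simple arithmetic, but it changes decisions. This sketch uses hypothetical spend, revenue, and refund figures to show how gross and net ROAS can diverge by channel.

    # Sketch: gross vs net ROAS per channel, with hypothetical figures.
    channels = [
        # (channel, spend, gross_revenue, refunds)
        ("paid_search", 10_000, 42_000, 2_500),
        ("paid_social", 8_000, 30_000, 9_000),
    ]

    for name, spend, revenue, refunds in channels:
        gross_roas = revenue / spend
        net_roas = (revenue - refunds) / spend
        print(f"{name}: gross ROAS {gross_roas:.2f}, net ROAS {net_roas:.2f}")
    # A wide gap between gross and net ROAS flags campaigns attracting high-refund customers.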

Sanity check routine: a repeatable pre-decision checklist

Use this routine before you change budgets, declare a campaign “winning,” or report performance to stakeholders. The goal is not perfection—it’s catching the most common errors fast.

1) Reconcile spend: finance/ad platform totals vs your report

  1. Pull spend totals from each ad platform for the same date range and timezone.
  2. Compare to your dashboard totals (by platform and overall).
  3. Investigate gaps: missing accounts, currency mismatch, timezone mismatch, or excluded campaigns.

Rule of thumb: if spend doesn’t match within a small tolerance, don’t trust efficiency metrics (CPA/ROAS) yet.
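
A sketch of that tolerance check, with placeholder platform totals and a 2% tolerance (use whatever tolerance your team agrees on):

    # Sketch: reconcile ad platform spend vs dashboard spend within a tolerance.
    # Platform totals and the 2% tolerance are placeholders.
    platform_spend = {"google_ads": 12_480.50, "meta_ads": 9_310.00}
    dashboard_spend = {"google_ads": 12_480.50, "meta_ads": 8_890.00}
    TOLERANCE = 0.02  # 2% relative difference

    for platform, expected in platform_spend.items():
        reported = dashboard_spend.get(platform, 0.0)
        delta = abs(reported - expected) / expected if expected else 0.0
        status = "OK" if delta <= TOLERANCE else "INVESTIGATE (currency, timezone, missing campaigns?)"
        print(f"{platform}: platform {expected:,.2f} vs dashboard {reported:,.2f} ({delta:.1%}) -> {status}")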

2) Compare conversions across systems (platform, analytics, backend/CRM)

  1. Pick one primary conversion (purchase, qualified lead) and compare counts across systems.
  2. Align definitions: same event, same date basis (conversion date vs click date), same filters (test orders excluded?).
  3. Quantify the delta as a % and track it over time.

Interpretation: differences are normal, but sudden changes usually indicate a tracking break, consent shift, or deduplication issue.
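
Quantifying the delta from step 3 is a one-line percentage. A tiny sketch with hypothetical counts, anchored on the backend/CRM number:

    # Sketch: conversion delta between systems (hypothetical counts).
    counts = {"ad_platform": 312, "analytics": 268, "backend_crm": 255}
    baseline = counts["backend_crm"]  # treat backend/CRM as the anchor

    for system, n in counts.items():
        delta_pct = (n - baseline) / baseline * 100
        print(f"{system}: {n} conversions ({delta_pct:+.1f}% vs backend)")
    # Track these deltas over time; a sudden change usually means a tracking break,
    # consent shift, or deduplication issue rather than a real performance change.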

3) Inspect sudden spikes or drops

  1. Identify the exact start time of the change (hour/day).
  2. Check for releases: site deploys, tag manager publishes, checkout changes, consent banner updates.
  3. Break down by source/medium, campaign, device, geo, landing page.
  4. Look for “single-driver” patterns: one campaign or one page causing most of the change.

4) Check top landing pages for tracking and UX issues

  1. List top landing pages by sessions for the period.
  2. Verify tracking: page has the correct tag container, no console errors, no blocked scripts.
  3. Spot anomalies: unexpected pages in the top list (staging URLs, internal search pages, parameter spam).
  4. Confirm conversion paths: key CTAs and forms work; no broken redirects that drop UTMs.

5) Review “(not set)” and “direct” anomalies

These buckets are often where tracking problems hide.

  • Rising “direct” can indicate missing UTMs, referrer loss, or app-to-web transitions.
  • High “(not set)” can indicate missing parameters, consent restrictions, or misconfigured channel grouping.

Practical checks (step-by-step)

  1. Trend these buckets week over week.
  2. Drill into landing pages and device/geo for these sessions.
  3. Check recent link sources: new email templates, partner links, QR codes, influencer bios.

Lightweight data quality log template (track issues, owners, fixes)

Use a simple log to prevent repeated incidents and to make ownership clear. Keep it in a shared document or ticketing system and review it weekly.

Date detected | Issue | Impact | Where observed | Suspected cause | Owner | Status | Fix / action | Date fixed | Validation check
YYYY-MM-DD | UTMs dropped after redirect | High (paid traffic misattributed to direct) | Landing page report; direct traffic spike | New redirect rule stripping query params | Web/Dev | In progress | Update redirect to preserve query string | YYYY-MM-DD | Test click keeps UTMs; source/medium correct in real-time
YYYY-MM-DD | Purchase event firing twice | High (ROAS inflated) | Analytics events; backend orders lower | Duplicate tag in container | Analytics/MarOps | Open | Remove duplicate tag; add transaction-id guard | | Single order ID appears once in event export
YYYY-MM-DD | Bot spike from referral spam | Medium (sessions inflated) | Source/medium spike; low engagement | Spam referrer campaign | Analytics | Fixed | Add filter/block; tighten bot rules | YYYY-MM-DD | Sessions normalize; engagement returns to baseline

Now answer the exercise about the content:

After a site release, you notice a sudden rise in “direct” traffic and more “(not set)” values. Which issue is the most likely cause to investigate first?

Answer: A spike in “direct” or “(not set)” right after changes often signals UTMs/referrers are being dropped (e.g., redirects stripping parameters or broken tags), which misattributes traffic and undercounts conversions.
