
Marketing Analytics for Beginners: Measure What Matters and Make Better Decisions

Common Data Pitfalls and Quality Checks in Marketing Analytics

Chapter 11

Estimated reading time: 9 minutes


Marketing analytics is only as useful as the data feeding it. Most “bad decisions” in marketing reporting come from a small set of repeatable data problems: missing tracking, double-counting, non-human traffic, privacy-related gaps, identity mismatches, time lags, and post-purchase reversals. This chapter focuses on how to spot these issues early and how to run quick checks before you trust a number.

Tracking gaps: missing UTMs, broken pixels, and lost parameters

What it is

A tracking gap happens when a user action occurs but your systems fail to record it correctly (or at all). The result is undercounted conversions, misattributed revenue, and inflated “direct” traffic.

Common causes

  • Missing UTMs on paid or partner links, causing traffic to fall into direct or referral.
  • Broken pixels/tags after site changes (new templates, tag manager updates, checkout redesign).
  • Redirects stripping parameters (e.g., ?utm_source=... lost during a 301/302 redirect).
  • In-app browsers (social apps) that handle cookies differently and can drop identifiers.
  • Cross-domain issues (landing domain → checkout domain) where sessions break and attribution resets.

Practical checks (step-by-step)

  1. Pick one live campaign link (ad, email, partner) and click it in an incognito window.
  2. Confirm UTMs persist: the landing page URL should still include UTMs after any redirects.
  3. Open your analytics real-time view (or debug mode) and verify the session source/medium matches the UTMs.
  4. Trigger a test conversion (newsletter signup, add-to-cart, purchase in a test environment) and confirm the event fires once.
  5. Check tag firing rules: ensure the pixel/tag triggers on the right pages and not on refresh or duplicate pageviews.

Fast symptom to watch: a sudden rise in direct traffic or (not set) values after a site release often indicates UTMs or referrers are being lost.
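
If you want to automate steps 1 and 2, a minimal sketch like the one below (Python with the requests library; the campaign URL and UTM values are placeholders, not real links) follows redirects and reports any UTM parameters dropped along the way.

    # Minimal sketch: confirm UTM parameters survive a redirect chain.
    # The campaign URL and UTM values are placeholders; use a real campaign link.
    from urllib.parse import parse_qs, urlparse

    import requests

    campaign_url = (
        "https://example.com/landing"
        "?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale"
    )

    # Follow 301/302 redirects the way a browser would.
    response = requests.get(campaign_url, allow_redirects=True, timeout=10)

    final_params = parse_qs(urlparse(response.url).query)
    expected = {"utm_source", "utm_medium", "utm_campaign"}
    missing = expected - final_params.keys()

    if missing:
        print(f"UTMs lost after redirect: {sorted(missing)} (final URL: {response.url})")
    else:
        print("All UTM parameters survived the redirect chain.")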

Duplication and double-counting: when “one conversion” becomes two (or more)

What it is

Double-counting occurs when the same real-world action is recorded multiple times. It can happen within one tool (duplicate events) or across tools (platform + analytics + CRM all counted as separate “conversions”).

Where it shows up

  • Pixel fires twice due to multiple tag containers, duplicated scripts, or SPA (single-page app) route changes.
  • Thank-you page reloads or users bookmarking the confirmation page.
  • Multiple conversion definitions (e.g., “Purchase” and “Order Completed” both counted as primary conversions).
  • Server-side + browser-side tracking both sending the same event without deduplication keys.

Practical checks (step-by-step)

  1. Choose a single order ID (or lead ID) and search for it across systems (analytics export, ad platform, backend/CRM).
  2. Count occurrences: it should appear once per system per definition.
  3. Inspect event logs (tag manager preview, network requests) and confirm only one request is sent on conversion.
  4. Verify deduplication: if you track both browser and server events, ensure a shared event_id or order_id is used to dedupe.
  5. Check “conversion windows” and “counting method” in ad platforms (e.g., “every” vs “one” for leads).

Symptom | Likely cause | Quick fix direction
Conversions jump exactly 2× overnight | Duplicate tag deployed | Audit tag manager versions and page source
More purchases than orders in backend | Thank-you page reload / SPA firing twice | Fire on transaction ID once; block repeats
Platform shows more conversions than analytics | Different counting rules / modeled conversions | Align definitions; compare using same date basis
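
One way to run the order-ID check from steps 1 and 2 is a small script that counts how often a single ID appears in each system's export. This is only a sketch: the CSV file names and column names below are assumptions you would replace with your own exports.

    # Sketch: count how often one order ID appears in each system's export.
    # File names and ID column names are assumptions; adjust to your exports.
    import csv
    from collections import Counter

    ORDER_ID = "ORD-12345"  # the single order you are auditing

    exports = {
        "analytics": ("analytics_events.csv", "transaction_id"),
        "ad_platform": ("ad_platform_conversions.csv", "order_id"),
        "backend_crm": ("backend_orders.csv", "order_id"),
    }

    for system, (path, id_column) in exports.items():
        with open(path, newline="") as f:
            counts = Counter(row[id_column] for row in csv.DictReader(f))
        occurrences = counts[ORDER_ID]
        # Expect exactly one occurrence per system per conversion definition.
        flag = "" if occurrences == 1 else "  <-- possible duplicate or tracking gap"
        print(f"{system}: {occurrences} occurrence(s){flag}")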

Bot traffic and spam: fake visits that distort performance

What it is

Bot traffic includes automated crawlers, click fraud, and spam referrals that inflate sessions, skew conversion rates, and can trigger fake leads.


Common signs

  • Sudden spikes in sessions with no corresponding spend change.
  • Very low engagement (0 seconds, 100% bounce) or unrealistically high engagement (thousands of pageviews per session).
  • Odd geographies or data center ISPs dominating traffic.
  • High volume to a single landing page with nonsensical paths.

Practical checks (step-by-step)

  1. Segment by source/medium and geography for the spike period.
  2. Check engagement metrics (time on site, pages/session, event rate) for outliers.
  3. Inspect landing pages: are bots hitting obscure URLs or parameterized pages?
  4. Compare against server logs (if available) to validate user agents and request patterns.
  5. Apply filters/blocks: exclude known bot traffic, block suspicious referrers, add bot protection to forms.

Decision risk: bot-driven “cheap traffic” can make a channel look efficient while actually harming lead quality and wasting budget.
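
If you can export sessions to a file, a quick pass with pandas can surface suspicious sources during the spike period. The file name, column names, dates, and thresholds below are assumptions for illustration, not standards; tune them to your own data.

    # Sketch: flag suspicious traffic sources during a spike period.
    # File name, column names, dates, and thresholds are assumptions.
    import pandas as pd

    sessions = pd.read_csv("sessions_export.csv", parse_dates=["date"])
    spike = sessions[(sessions["date"] >= "2024-03-01") & (sessions["date"] <= "2024-03-07")]

    by_source = spike.groupby("source_medium").agg(
        sessions=("session_id", "count"),
        avg_engagement_seconds=("engagement_seconds", "mean"),
        conversion_rate=("converted", "mean"),  # assumes a 0/1 converted flag
    )

    # Heuristic: lots of sessions, near-zero engagement, no conversions.
    suspicious = by_source[
        (by_source["sessions"] > 500)
        & (by_source["avg_engagement_seconds"] < 2)
        & (by_source["conversion_rate"] == 0)
    ]
    print(suspicious.sort_values("sessions", ascending=False))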

Cookie consent impact: why numbers drop (or shift) after privacy changes

What it is

Consent banners and privacy settings can prevent analytics and ad tags from firing until a user opts in. This creates partial visibility: you may still get spend and clicks, but fewer sessions and conversions recorded in analytics.

How it typically affects reporting

  • Underreported users and conversions in analytics tools that require consent to set cookies.
  • Channel mix shifts: more traffic appears as direct or (not set) when identifiers are missing.
  • Platform vs analytics divergence increases because platforms may use modeled or aggregated reporting.

Practical checks (step-by-step)

  1. Identify the consent change date (banner launch, CMP update, region rollout).
  2. Compare opt-in rates by device and region; low opt-in often correlates with bigger data gaps.
  3. Trend “direct” and “(not set)” before/after the change.
  4. Compare platform conversions vs analytics conversions across the same period to quantify the gap.
  5. Validate tag behavior: confirm tags fire only after consent and that consent states are passed correctly.

Tip: When consent reduces measurement, focus on stable comparisons (e.g., same region, same device mix) and use backend/CRM outcomes as an anchor where possible.
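
To put a number on step 3, a small sketch like this compares the share of "direct" and "(not set)" sessions before and after the consent change date. The file name, source labels, and the change date are assumptions.

    # Sketch: share of "direct" / "(not set)" sessions before vs after a consent change.
    # File name, source labels, and the change date are assumptions.
    import pandas as pd

    sessions = pd.read_csv("sessions_export.csv", parse_dates=["date"])
    consent_change = pd.Timestamp("2024-05-15")

    summary = (
        sessions.assign(
            period=sessions["date"].ge(consent_change).map({True: "after", False: "before"}),
            unattributed=sessions["source_medium"].isin(["(direct) / (none)", "(not set)"]),
        )
        .groupby("period")["unattributed"]
        .mean()  # share of sessions landing in these buckets
    )
    print(summary)  # a jump after the change points to consent-driven gaps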

Cross-device and identity issues: one person, multiple sessions, multiple “users”

What it is

People often discover you on one device and convert on another (mobile → desktop), or switch browsers. If your measurement can’t connect these interactions, you’ll see fragmented journeys and misattribution.

Where it hurts decisions

  • Upper-funnel channels look weak because conversions happen later on another device.
  • Retargeting appears more effective than it is if it “catches” the final device/session.
  • Frequency and reach are overstated because the same person is counted multiple times.

Practical checks (step-by-step)

  1. Compare device splits for traffic vs conversions (e.g., 80% of sessions on mobile but 70% of purchases on desktop can be normal; watch for sudden shifts).
  2. Check login rate (if applicable): logged-in flows typically have better cross-device continuity.
  3. Review assisted conversion patterns (if available) to see if certain channels frequently start journeys.
  4. Validate cross-domain/session continuity (landing → checkout) to avoid creating artificial “new sessions” mid-journey.

Practical interpretation: treat user-level metrics (users, frequency) as directional unless you have strong identity resolution (logins or consistent identifiers).

Delayed conversions: time lags that make recent performance look worse

What it is

Many conversions happen days or weeks after the first click/visit. If you judge campaigns too quickly, you may pause winners and scale losers.

Typical sources of delay

  • Consideration cycles (B2B, high-ticket items).
  • Offline steps (sales calls, demos, approvals).
  • Attribution processing delays (platform reporting latency, modeled conversions posted later).

Practical checks (step-by-step)

  1. Use a “cooling-off” window: avoid final judgments on the last 1–3 days (or longer for long cycles).
  2. Compare by conversion date vs click date: understand whether your report is “when it happened” or “what caused it.”
  3. Track lag distribution: what % of conversions arrive within 1 day, 7 days, 28 days?
  4. Annotate reporting: mark periods where conversions are expected to backfill.
Example lag snapshot (illustrative): 40% same-day, 35% within 7 days, 20% within 28 days, 5% after 28 days
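
A lag snapshot like the one above can be computed from any conversions export that records both the click (or first visit) date and the conversion date. The file and column names in this sketch are assumptions.

    # Sketch: conversion lag distribution (click date -> conversion date).
    # File and column names are assumptions; adjust to your export.
    import pandas as pd

    conversions = pd.read_csv(
        "conversions_export.csv", parse_dates=["click_date", "conversion_date"]
    )
    lag_days = (conversions["conversion_date"] - conversions["click_date"]).dt.days

    buckets = pd.cut(
        lag_days,
        bins=[-1, 0, 7, 28, float("inf")],
        labels=["same day", "within 7 days", "within 28 days", "after 28 days"],
    )
    print(buckets.value_counts(normalize=True).sort_index().mul(100).round(1))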

Refunds and chargebacks: revenue that disappears after you count it

What it is

Analytics and ad platforms often record gross revenue at purchase time, but finance cares about net revenue after refunds, cancellations, and chargebacks. If you optimize on gross-only, you may scale campaigns that attract high-refund customers.

Practical checks (step-by-step)

  1. Define the adjustment source of truth: payment processor, ecommerce platform, or finance system.
  2. Calculate refund rate by channel/campaign/cohort (e.g., refunds within 30 days).
  3. Adjust ROAS views: compare gross ROAS vs net ROAS for major channels.
  4. Flag anomalies: campaigns with normal conversion volume but unusually high refunds.

Metric | Gross view | Net view | Why it matters
Revenue | At purchase time | After refunds/chargebacks | Prevents overestimating profitability
ROAS | Revenue / Spend | (Revenue − Refunds) / Spend | Stops scaling low-quality acquisition
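
The net view is simple arithmetic, but it changes decisions. This sketch uses hypothetical spend, revenue, and refund figures to show how gross and net ROAS can diverge by channel.

    # Sketch: gross vs net ROAS per channel, with hypothetical figures.
    channels = [
        # (channel, spend, gross_revenue, refunds)
        ("paid_search", 10_000, 42_000, 2_500),
        ("paid_social", 8_000, 30_000, 9_000),
    ]

    for name, spend, revenue, refunds in channels:
        gross_roas = revenue / spend
        net_roas = (revenue - refunds) / spend
        print(f"{name}: gross ROAS {gross_roas:.2f}, net ROAS {net_roas:.2f}")
    # A wide gap between gross and net ROAS flags campaigns attracting high-refund customers.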

Sanity check routine: a repeatable pre-decision checklist

Use this routine before you change budgets, declare a campaign “winning,” or report performance to stakeholders. The goal is not perfection—it’s catching the most common errors fast.

1) Reconcile spend: finance/ad platform totals vs your report

  1. Pull spend totals from each ad platform for the same date range and timezone.
  2. Compare to your dashboard totals (by platform and overall).
  3. Investigate gaps: missing accounts, currency mismatch, timezone mismatch, or excluded campaigns.

Rule of thumb: if spend doesn’t match within a small tolerance, don’t trust efficiency metrics (CPA/ROAS) yet.
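
A sketch of that tolerance check, with placeholder platform totals and a 2% tolerance (use whatever tolerance your team agrees on):

    # Sketch: reconcile ad platform spend vs dashboard spend within a tolerance.
    # Platform totals and the 2% tolerance are placeholders.
    platform_spend = {"google_ads": 12_480.50, "meta_ads": 9_310.00}
    dashboard_spend = {"google_ads": 12_480.50, "meta_ads": 8_890.00}
    TOLERANCE = 0.02  # 2% relative difference

    for platform, expected in platform_spend.items():
        reported = dashboard_spend.get(platform, 0.0)
        delta = abs(reported - expected) / expected if expected else 0.0
        status = "OK" if delta <= TOLERANCE else "INVESTIGATE (currency, timezone, missing campaigns?)"
        print(f"{platform}: platform {expected:,.2f} vs dashboard {reported:,.2f} ({delta:.1%}) -> {status}")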

2) Compare conversions across systems (platform, analytics, backend/CRM)

  1. Pick one primary conversion (purchase, qualified lead) and compare counts across systems.
  2. Align definitions: same event, same date basis (conversion date vs click date), same filters (test orders excluded?).
  3. Quantify the delta as a % and track it over time.

Interpretation: differences are normal, but sudden changes usually indicate a tracking break, consent shift, or deduplication issue.
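
Quantifying the delta from step 3 is a one-line percentage. A tiny sketch with hypothetical counts, anchored on the backend/CRM number:

    # Sketch: conversion delta between systems (hypothetical counts).
    counts = {"ad_platform": 312, "analytics": 268, "backend_crm": 255}
    baseline = counts["backend_crm"]  # treat backend/CRM as the anchor

    for system, n in counts.items():
        delta_pct = (n - baseline) / baseline * 100
        print(f"{system}: {n} conversions ({delta_pct:+.1f}% vs backend)")
    # Track these deltas over time; a sudden change usually means a tracking break,
    # consent shift, or deduplication issue rather than a real performance change.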

3) Inspect sudden spikes or drops

  1. Identify the exact start time of the change (hour/day).
  2. Check for releases: site deploys, tag manager publishes, checkout changes, consent banner updates.
  3. Break down by source/medium, campaign, device, geo, landing page.
  4. Look for “single-driver” patterns: one campaign or one page causing most of the change.

4) Check top landing pages for tracking and UX issues

  1. List top landing pages by sessions for the period.
  2. Verify tracking: page has the correct tag container, no console errors, no blocked scripts.
  3. Spot anomalies: unexpected pages in the top list (staging URLs, internal search pages, parameter spam).
  4. Confirm conversion paths: key CTAs and forms work; no broken redirects that drop UTMs.

5) Review “(not set)” and “direct” anomalies

These buckets are often where tracking problems hide.

  • Rising “direct” can indicate missing UTMs, referrer loss, or app-to-web transitions.
  • High “(not set)” can indicate missing parameters, consent restrictions, or misconfigured channel grouping.

Practical checks (step-by-step)

  1. Trend these buckets week over week.
  2. Drill into landing pages and device/geo for these sessions.
  3. Check recent link sources: new email templates, partner links, QR codes, influencer bios.

Lightweight data quality log template (track issues, owners, fixes)

Use a simple log to prevent repeated incidents and to make ownership clear. Keep it in a shared document or ticketing system and review it weekly.

Date detected | Issue | Impact | Where observed | Suspected cause | Owner | Status | Fix / action | Date fixed | Validation check
YYYY-MM-DD | UTMs dropped after redirect | High (paid traffic misattributed to direct) | Landing page report; direct traffic spike | New redirect rule stripping query params | Web/Dev | In progress | Update redirect to preserve query string | YYYY-MM-DD | Test click keeps UTMs; source/medium correct in real-time
YYYY-MM-DD | Purchase event firing twice | High (ROAS inflated) | Analytics events; backend orders lower | Duplicate tag in container | Analytics/MarOps | Open | Remove duplicate tag; add transaction-id guard | | Single order ID appears once in event export
YYYY-MM-DD | Bot spike from referral spam | Medium (sessions inflated) | Source/medium spike; low engagement | Spam referrer campaign | Analytics | Fixed | Add filter/block; tighten bot rules | YYYY-MM-DD | Sessions normalize; engagement returns to baseline

Now answer the exercise about the content:

After a site release, you notice a sudden rise in “direct” traffic and more “(not set)” values. Which issue is the most likely cause to investigate first?

Answer: A spike in “direct” or “(not set)” right after changes often signals UTMs/referrers are being dropped (e.g., redirects stripping parameters or broken tags), which misattributes traffic and undercounts conversions.
