Why role clarity matters in a sustainable BI program
A BI program scales when responsibilities are explicit, deliverables are standardized, and handoffs are lightweight but unambiguous. When roles blur, teams tend to re-implement logic in multiple places, ship dashboards that disagree, and spend most of their time reconciling numbers instead of answering questions. This chapter clarifies three common roles—data engineer, analytics engineer, and BI/data analyst—by focusing on (1) responsibilities, (2) typical deliverables, and (3) collaboration points and handoff artifacts.
Role 1: Data Engineer (DE)
Primary responsibilities
- Ingestion and connectivity: bring data from source systems into the analytics environment (batch, streaming, files, APIs).
- Pipeline reliability: scheduling, retries, idempotency, backfills, monitoring, alerting, and incident response.
- Data quality at the boundary: schema drift handling, deduplication strategies, late-arriving data policies, and basic validity checks.
- Security and access patterns: service accounts, secrets management, network controls, and least-privilege access to raw/landing zones.
- Operational documentation: runbooks, on-call notes, and SLAs/SLOs for critical datasets.
Typical deliverables
- Ingestion pipelines: jobs/workflows that land source data into a raw/landing area.
- Source connectors and configs: API credentials, CDC configuration, file ingestion rules.
- Reliability assets: monitors, alerts, dashboards for pipeline health, and runbooks.
- Raw dataset contracts: expected schemas, partitioning strategy, and freshness expectations (a minimal contract sketch follows this list).
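Captured as code, a raw dataset contract might look like the following minimal sketch. This is not a standard format; the field names, table name, and values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RawDatasetContract:
    """Illustrative raw dataset contract; all fields are assumptions."""
    table: str                 # fully qualified landing table
    expected_schema: dict      # column name -> declared type
    partition_key: str         # physical partitioning of the landing table
    freshness_sla_hours: int   # max acceptable staleness before alerting
    load_method: str           # "append-only" or "upsert"

# Hypothetical contract for an orders feed
orders_contract = RawDatasetContract(
    table="raw.orders",
    expected_schema={
        "order_id": "string",
        "order_total": "decimal(18,2)",
        "posted_at": "timestamp",
    },
    partition_key="date(_ingested_at)",
    freshness_sla_hours=6,
    load_method="append-only",
)
```

Keeping the contract in version control lets DE and AE review schema or freshness changes the same way they review code.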
Collaboration points
- With analytics engineers: agree on raw table naming, partitioning, incremental load strategy, and how to represent deletes/updates (especially for CDC).
- With analysts: clarify source system meaning (what a field represents operationally), event timing (created vs posted vs updated), and known quirks.
Practical step-by-step: standing up a reliable ingestion pipeline
- Define the ingestion contract: source tables/endpoints, expected schema, update method (append-only vs upserts), and freshness target.
- Implement extraction: choose CDC or API pulls; ensure idempotency (re-running does not duplicate).
- Land raw data: store with minimal transformation; keep metadata such as _ingested_at, source file name, and batch id.
- Run boundary checks: row counts, null checks on key fields, and schema drift detection (see the sketch after this list).
- Operationalize: scheduling, retries, alerting, and a runbook describing how to backfill.
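A minimal sketch of the landing and boundary-check steps, assuming a DB-API-style connection (such as sqlite3) and illustrative table and column names; a production pipeline would add retries, alerting, and schema drift detection on top.

```python
from datetime import datetime, timezone

def land_batch(conn, rows, batch_id):
    """Land one extracted batch into an illustrative raw_orders_landing table.

    batch_id must be deterministic per extraction window (e.g. "2024-01-15")
    so that re-running the same window replaces rows instead of duplicating.
    """
    ingested_at = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: delete + insert succeed or fail together
        conn.execute(
            "DELETE FROM raw_orders_landing WHERE _batch_id = ?", (batch_id,)
        )
        conn.executemany(
            "INSERT INTO raw_orders_landing "
            "(order_id, payload, _ingested_at, _source_file, _batch_id) "
            "VALUES (?, ?, ?, ?, ?)",
            [
                (r["order_id"], r["payload"], ingested_at,
                 r.get("source_file"), batch_id)
                for r in rows
            ],
        )

def boundary_checks(conn, batch_id, min_rows=1):
    """Basic validity checks at the boundary: row count and null keys."""
    total, null_keys = conn.execute(
        "SELECT COUNT(*), SUM(CASE WHEN order_id IS NULL THEN 1 ELSE 0 END) "
        "FROM raw_orders_landing WHERE _batch_id = ?", (batch_id,)
    ).fetchone()
    if total < min_rows or (null_keys or 0) > 0:
        raise ValueError(f"Boundary check failed: rows={total}, "
                         f"null keys={null_keys or 0}")
```

The delete-then-insert pattern is one common way to make re-runs idempotent; merge/upsert strategies work too, depending on the warehouse.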
Role 2: Analytics Engineer (AE)
Primary responsibilities
- Transformations and modeling: convert raw data into analysis-ready tables with consistent keys, standardized timestamps, and business-friendly structures.
- Testing and quality gates: automated tests for uniqueness, referential integrity, accepted values, freshness, and reconciliation checks.
- Semantic definitions: define canonical metrics and dimensions (e.g., “Net Revenue”, “Active Customer”) and ensure they are computed consistently.
- Performance and maintainability: incremental models, partitioning/clustering choices, and avoiding repeated heavy computations (a minimal refresh sketch follows this list).
- Documentation for reuse: model descriptions, lineage notes, and usage guidance for downstream consumers.
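To make the incremental pattern concrete, here is a minimal sketch of a delete-and-reinsert refresh over a trailing window, assuming a DB-API-style warehouse connection; the table and column names are illustrative, not a prescribed schema.

```python
def refresh_orders_daily(conn, window_start):
    """Illustrative incremental refresh: rebuild only dates >= window_start.

    Rebuilding a trailing window keeps re-runs idempotent and avoids
    recomputing the full history on every run.
    """
    with conn:  # single transaction so readers never see a half-refreshed window
        conn.execute(
            "DELETE FROM mart_fct_orders_daily WHERE order_posted_date >= ?",
            (window_start,),
        )
        conn.execute(
            """
            INSERT INTO mart_fct_orders_daily
                (order_id, order_posted_date, net_before_refunds)
            SELECT order_id,
                   DATE(order_posted_ts),
                   SUM(gross_amount - discount_amount)
            FROM stg_orders
            WHERE DATE(order_posted_ts) >= ?
            GROUP BY order_id, DATE(order_posted_ts)
            """,
            (window_start,),
        )
```

Tools like dbt package this pattern as incremental materializations; the underlying idea is the same.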
Typical deliverables
- Transformation code: version-controlled SQL/models and orchestration definitions.
- Curated datasets: standardized intermediate and final tables used by BI tools.
- Test suite: automated checks and CI gates that prevent breaking changes.
- Metric definition documents: canonical KPI formulas, filters, inclusion/exclusion rules, and edge cases.
- Change logs: what changed, why, and expected downstream impact.
Collaboration points
- With data engineers: coordinate on schema changes, late-arriving data handling, and backfill procedures.
- With analysts: translate stakeholder questions into stable definitions; validate that modeled outputs match business expectations; agree on grain and slicing behavior for dashboards.
Example: transformation spec (handoff artifact)
A transformation spec is a lightweight agreement between AE and analyst (and often DE) that defines exactly how a dataset will be produced.
| Section | Example content |
|---|---|
| Input sources | raw.orders, raw.order_items, raw.refunds |
| Output model | mart.fct_order_revenue_daily |
| Grain | One row per order_id per day of order_posted_date |
| Business rules | Revenue recognized on posted date; exclude test orders; refunds reduce revenue on refund posted date |
| Keys | order_id unique per day; customer_id required |
| Edge cases | Partial refunds; multi-currency conversion using daily FX rate |
| Tests | Uniqueness on (order_id, order_posted_date); non-null customer_id; reconciliation to finance totals within tolerance |
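The "Tests" row of a spec like this can be automated directly. Below is a minimal sketch against a DB-API-style connection; the finance table name and the tolerance value are assumptions, since the spec above says only "within tolerance".

```python
def test_fct_order_revenue_daily(conn, tolerance=0.005):
    """Illustrative automated checks for the daily order revenue model."""
    # Uniqueness on (order_id, order_posted_date)
    dupes = conn.execute(
        "SELECT COUNT(*) FROM ("
        "  SELECT order_id, order_posted_date FROM mart_fct_order_revenue_daily"
        "  GROUP BY order_id, order_posted_date HAVING COUNT(*) > 1"
        ") AS d"
    ).fetchone()[0]
    assert dupes == 0, f"{dupes} duplicate (order_id, order_posted_date) pairs"

    # Non-null customer_id
    null_keys = conn.execute(
        "SELECT COUNT(*) FROM mart_fct_order_revenue_daily "
        "WHERE customer_id IS NULL"
    ).fetchone()[0]
    assert null_keys == 0, f"{null_keys} rows missing customer_id"

    # Reconciliation to finance totals (finance table name is hypothetical)
    modeled = conn.execute(
        "SELECT SUM(net_revenue) FROM mart_fct_order_revenue_daily"
    ).fetchone()[0]
    finance = conn.execute(
        "SELECT SUM(net_revenue) FROM finance_reported_totals"
    ).fetchone()[0]
    assert abs(modeled - finance) <= tolerance * abs(finance), (
        f"Modeled {modeled} vs finance {finance} differs by more than "
        f"{tolerance:.1%}"
    )
```

Wiring checks like these into CI is what turns the spec from a document into a quality gate.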
Role 3: BI/Data Analyst (Analyst)
Primary responsibilities
- Stakeholder requirements: clarify the business question, decision context, audience, and success criteria.
- Analysis and interpretation: explore data, identify drivers, quantify impact, and communicate insights.
- Dashboarding and reporting: design visuals, define filters, set default views, and ensure usability for the target audience.
- Validation with the business: confirm metrics match expectations; run UAT sessions; document assumptions and limitations.
- Adoption and iteration: training, feedback loops, and prioritizing enhancements.
Typical deliverables
- Requirements brief: business question, intended actions, definitions, and acceptance criteria.
- Mockups/wireframes: proposed dashboard layout and interactions.
- Exploratory queries/notebooks: analysis used to validate definitions and identify needed fields.
- BI assets: dashboards, reports, alert rules, and documentation for end users.
- Data validation notes: reconciliations, known caveats, and sign-off records.
Collaboration points
- With analytics engineers: request new fields/metrics, confirm definitions, and avoid implementing business logic only in the BI tool.
- With data engineers: surface data availability gaps, freshness issues, and source system changes impacting reporting.
Practical step-by-step: turning a stakeholder question into a dashboard requirement
- Clarify the decision: “What will you do differently based on this dashboard?”
- Define the audience and cadence: exec weekly review vs daily operations.
- List required KPIs and dimensions: include exact definitions and needed slices (region, channel, product).
- Set acceptance criteria: expected totals for a known period, refresh frequency, and performance expectations (see the sketch after this list).
- Identify data gaps: missing fields, unclear mappings, or inconsistent source values.
- Draft a wireframe: include filters, drill paths, and default date ranges.
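One way to make "expected totals for a known period" testable is a small validation query agreed with the stakeholder. In this sketch, the signed-off figure, period, tolerance, and table name are all hypothetical placeholders that would come from the requirements brief.

```python
# Hypothetical signed-off figure for a closed period, recorded in the
# requirements brief; the tolerance is an illustrative choice.
EXPECTED_JAN_NET_REVENUE = 1_254_300.00
TOLERANCE = 0.005  # 0.5%

def acceptance_check(conn):
    """Compare the modeled KPI for a known period to the agreed expectation."""
    actual = conn.execute(
        "SELECT SUM(net_revenue) FROM mart_fct_order_revenue_daily "
        "WHERE order_posted_date BETWEEN '2024-01-01' AND '2024-01-31'"
    ).fetchone()[0]
    within = (
        abs(actual - EXPECTED_JAN_NET_REVENUE)
        <= TOLERANCE * EXPECTED_JAN_NET_REVENUE
    )
    return within, actual
```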
Handoff artifacts that prevent rework
1) Source-to-target mapping (DE ↔ AE)
This artifact makes ingestion and transformation explicit: what comes from where, how it is typed, and where it lands.
| Source field | Source table | Target field | Target model | Transform | Notes |
|---|---|---|---|---|---|
| order_total | erp.orders | gross_amount | stg_orders | Cast to decimal(18,2) | Stored in source currency |
| status | erp.orders | order_status | stg_orders | Uppercase; map legacy codes | See status mapping table |
| posted_at | erp.orders | order_posted_ts | stg_orders | Convert to UTC | Use source timezone metadata |
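As code, the mapping above might compile into a staging query like the following sketch. The legacy status codes, the source-timezone column, and the Postgres-style AT TIME ZONE syntax are assumptions; the real rules live in the mapping document.

```python
# Illustrative staging query implementing the source-to-target mapping above.
STG_ORDERS_SQL = """
SELECT
    CAST(order_total AS DECIMAL(18, 2)) AS gross_amount,  -- kept in source currency
    CASE UPPER(status)                   -- legacy codes below are hypothetical
        WHEN 'C1' THEN 'CONFIRMED'
        WHEN 'X9' THEN 'CANCELED'
        ELSE UPPER(status)
    END AS order_status,
    -- interpret in the source timezone, then convert to UTC (Postgres syntax)
    posted_at AT TIME ZONE source_tz AT TIME ZONE 'UTC' AS order_posted_ts
FROM erp.orders
"""
```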
2) Transformation specification (AE ↔ Analyst)
This defines business logic once, upstream, so dashboards don’t become the place where definitions diverge.
Model: mart.fct_orders
Purpose: Canonical order-level facts for revenue and fulfillment reporting
Grain: 1 row per order_id
Inclusions: posted orders only; exclude test customers
Revenue: gross_amount - discounts - refunds
Time logic: revenue_date = order_posted_date; refunds applied on refund_posted_date
Dimensions: customer_id, channel, region, product_category (from dim_product)
Quality checks: unique(order_id), not_null(customer_id), reconcile to finance within 0.5%
3) Metric definition document (AE ↔ Analyst ↔ Stakeholders)
A metric definition document is a shared contract. It should be short enough to maintain, but precise enough to prevent “same name, different math.”
| Metric | Definition | Formula (pseudo) | Filters/Exclusions | Grain & caveats |
|---|---|---|---|---|
| Net Revenue | Revenue after discounts and refunds, recognized on posted dates | sum(gross_amount - discount_amount) - sum(refund_amount) | Exclude test orders; exclude canceled before posting | Daily by posted date; refunds can shift revenue across days |
| Active Customers | Customers with ≥1 posted order in the last 30 days | count_distinct(customer_id where order_posted_date >= today-30) | Exclude internal accounts | Rolling window; sensitive to late-arriving orders |
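Kept upstream, rows like these can translate into canonical queries that every dashboard reuses. The sketch below assumes hypothetical mart.fct_orders and mart.fct_refunds models with flags for test orders and internal accounts; the interval syntax varies by warehouse dialect.

```python
# Net Revenue recognized on posted dates: order revenue lands on the order's
# posted date, refunds subtract on the refund's posted date (so revenue can
# shift across days, per the caveat above).
NET_REVENUE_DAILY_SQL = """
SELECT revenue_date, SUM(amount) AS net_revenue
FROM (
    SELECT order_posted_date AS revenue_date,
           gross_amount - discount_amount AS amount
    FROM mart.fct_orders
    WHERE NOT is_test_order
    UNION ALL
    SELECT refund_posted_date, -refund_amount
    FROM mart.fct_refunds
    WHERE NOT is_test_order
) AS revenue_events
GROUP BY revenue_date
"""

# Active Customers: distinct customers with >= 1 posted order in a rolling
# 30-day window; sensitive to late-arriving orders.
ACTIVE_CUSTOMERS_SQL = """
SELECT COUNT(DISTINCT customer_id) AS active_customers
FROM mart.fct_orders
WHERE order_posted_date >= CURRENT_DATE - INTERVAL '30' DAY
  AND NOT is_internal_account
"""
```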
Common failure modes (and how to prevent them)
Failure mode: Unclear ownership
Symptoms: pipeline breaks linger; stakeholders don’t know who to contact; analysts patch data issues in dashboards.
Prevention: define dataset owners and on-call expectations; attach an owner to each critical table/model; maintain a simple escalation path (DE for ingestion failures, AE for model logic, Analyst for dashboard behavior and requirements).
Failure mode: Duplicated logic in dashboards
Symptoms: multiple dashboards compute “Net Revenue” differently; changes require editing many reports; performance degrades due to heavy BI-tool calculations.
Prevention: push business logic upstream into curated models; restrict BI-tool calculations to presentation concerns (formatting, or simple ratios over already well-defined measures); create a canonical metric layer and require reuse.
Failure mode: Conflicting KPI definitions
Symptoms: finance and sales disagree; weekly business reviews focus on reconciling numbers; teams lose trust in BI.
Prevention: maintain metric definition documents with explicit inclusions/exclusions; require stakeholder sign-off for KPI changes; version definitions and communicate changes with examples (before/after for a known period).
Failure mode: Handoffs without acceptance criteria
Symptoms: AE delivers a model but analyst finds missing fields; repeated rework; “done” is subjective.
Prevention: add acceptance criteria to specs: required columns, expected totals for a validation period, freshness, and performance targets.
Responsibility map: a typical dashboard request from intake to deployment
| Stage | Primary role | Supporting roles | Key outputs | Quality gate |
|---|---|---|---|---|
| 1. Intake & scoping | Analyst | AE | Requirements brief; KPI list; wireframe; acceptance criteria | Stakeholder confirms definitions and intended decisions |
| 2. Data availability check | AE | DE, Analyst | Data gap list; proposed sources; initial mapping needs | Feasibility confirmed; owners assigned |
| 3. Ingestion/landing (if needed) | DE | AE | Ingestion pipeline; raw tables; runbook; freshness SLA | Pipeline monitored; boundary checks passing |
| 4. Modeling & transformations | AE | Analyst | Transformation spec; curated models; tests; documentation | Automated tests pass; reconciliation within tolerance |
| 5. Metric standardization | AE | Analyst, Stakeholders | Metric definition document; versioned KPI logic | Business sign-off on KPI definitions |
| 6. Dashboard build | Analyst | AE | Dashboard/report; filter behaviors; drill paths; user guide | Uses canonical models/metrics; minimal duplicated logic |
| 7. UAT & validation | Analyst | AE, DE | UAT checklist; validation queries; issue log | Acceptance criteria met; discrepancies resolved |
| 8. Deployment & monitoring | Analyst | DE, AE | Published dashboard; access controls; refresh schedule; alerting | Refresh succeeds; usage and data quality monitored |