How trust is maintained as BI usage scales
As more teams rely on dashboards and self-serve exploration, the biggest risk is not “wrong numbers once,” but “different numbers for the same question,” unclear accountability, and uncontrolled access to sensitive data. Trust at scale comes from three pillars working together: governance processes (who owns what and how changes happen), access control (who can see what), and metric consistency (one definition of key metrics, implemented once and reused).
Pillar 1: Governance processes
1) Data ownership: clear accountability for domains
Governance starts by assigning ownership to data domains (e.g., Sales, Finance, Product). Ownership is not about doing all the work; it is about being accountable for definitions, approvals, and prioritization.
- Data Owner (Business): accountable for meaning and acceptable use (e.g., “What counts as a customer?”).
- Data Steward (Business/Operations): maintains documentation, validates changes, coordinates communication.
- Data Custodian (Technical): implements controls and pipelines, ensures security and reliability.
Practical step-by-step: establish ownership
- List your top 10–20 most-used datasets and KPIs.
- For each, assign a business owner and a technical custodian.
- Publish an ownership directory (dataset → owner → contact → escalation path); see the sketch after this list.
- Require an owner for any dataset to be “certified” (see metric consistency).
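One lightweight way to publish the directory is as a governed table in the warehouse itself, so it stays queryable from the BI tool. A minimal sketch in generic SQL; the schema, table, and column names are illustrative, not a prescribed standard:
-- Conceptual ownership directory (names are illustrative):
CREATE TABLE governance.ownership_directory (
    dataset_name            VARCHAR NOT NULL,      -- e.g., 'finance.certified_revenue'
    business_owner          VARCHAR NOT NULL,      -- accountable for meaning and acceptable use
    technical_custodian     VARCHAR NOT NULL,      -- accountable for pipelines and controls
    contact_channel         VARCHAR,               -- e.g., '#finance-data'
    escalation_path         VARCHAR,               -- who to contact if the owner is unavailable
    security_classification VARCHAR,               -- public/internal/confidential/restricted
    certified               BOOLEAN DEFAULT FALSE, -- certification requires an owner
    PRIMARY KEY (dataset_name)
);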
2) Change management: predictable, reviewable changes
As BI scales, “silent changes” break trust. A lightweight change process reduces surprises while keeping delivery fast.
Example policy: who can publish datasets
- Sandbox datasets: any analyst can publish to a personal or team workspace; not visible company-wide.
- Team datasets: published by approved contributors in that domain; visible to the domain.
- Certified datasets: only data platform team (or designated analytics engineers) can publish; requires owner sign-off and documentation.
Practical step-by-step: change workflow for a certified dataset
- Request: submit a ticket/PR describing the change, impacted fields, and reason.
- Impact analysis: list downstream dashboards/metrics and expected deltas.
- Review: technical review (logic, performance, tests) + business review (definition and acceptability).
- Staging validation: run backfill or sample comparisons; validate key aggregates.
- Release: deploy with version tag and release notes.
- Communicate: notify affected channels and update documentation.
3) Documentation: make meaning and usage discoverable
Documentation is a control mechanism: it prevents re-interpretation and reduces repeated questions. At minimum, document what the dataset is, how it is built, and how it should (and should not) be used.
Minimum documentation checklist (dataset)
- Purpose: what decisions it supports.
- Owner: business + technical contacts.
- Refresh cadence: how often the data updates, plus known latency.
- Grain statement: one sentence describing what a row represents (avoid re-explaining modeling theory; just state it).
- Field definitions: especially IDs, timestamps, status fields, and money fields.
- Known limitations: missing sources, partial history, edge cases.
- Security classification: public/internal/confidential/restricted.
Pillar 2: Access control
Access control ensures the right people can access the right data at the right level of detail. It typically combines authentication (who you are) with authorization (what you can do/see), plus fine-grained controls for sensitive fields and records.
1) Authentication: establish identity reliably
Authentication should be centralized (e.g., SSO) so access can be revoked quickly when roles change. The BI tool, semantic layer, and data platform should align on identity (email/employee ID) to avoid “ghost users” and orphaned permissions.
Practical step-by-step: baseline authentication setup
- Enable SSO and enforce MFA for BI access.
- Disable shared accounts; require named users.
- Integrate with an identity provider group structure (e.g., Finance, Sales Ops, HR).
- Set session timeouts appropriate for sensitivity (shorter for restricted data).
2) Role-based permissions (RBAC): control actions and broad access
RBAC defines what users can do (view, edit, publish) and what spaces they can access (workspaces, projects, schemas). Keep roles simple and map them to job functions.
Example RBAC policy (BI tool)
| Role | Can view | Can edit | Can publish | Scope |
|---|---|---|---|---|
| Viewer | Yes | No | No | Assigned workspaces |
| Explorer | Yes | Limited (personal) | No | Assigned datasets |
| Creator | Yes | Yes | Yes (team workspace) | Domain |
| Publisher | Yes | Yes | Yes (certified workspace) | Org-wide |
| Admin | Yes | Yes | Yes | Platform |
Practical step-by-step: implement RBAC without permission sprawl
- Define 4–6 standard roles and what each can do.
- Assign access via groups, not individuals (see the sketch after this list).
- Review group membership monthly for sensitive domains (Finance/HR).
- Remove direct grants unless there is a documented exception.
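At the warehouse layer, group-based assignment usually means mapping identity-provider groups to database roles and granting access at the schema level, never per user. A minimal sketch in generic SQL role syntax; role, schema, and user names are illustrative, and exact syntax varies by platform:
-- Conceptual group-based grants (syntax varies by platform; names illustrative):
CREATE ROLE finance_viewer;                      -- mapped to the IdP group 'Finance'
GRANT USAGE ON SCHEMA finance TO finance_viewer;
GRANT SELECT ON ALL TABLES IN SCHEMA finance TO finance_viewer;

CREATE ROLE finance_creator;                     -- approved domain contributors
GRANT finance_viewer TO finance_creator;         -- creators inherit viewer access
GRANT CREATE ON SCHEMA finance_team TO finance_creator;

-- Users receive access only through roles; direct grants are the documented exception:
GRANT finance_viewer TO "jane.doe@example.com";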
3) Row-level security (RLS): restrict which records a user can see
RLS is used when different users should see different subsets of the same dataset (e.g., regional managers only see their region). RLS rules should be driven by a maintained mapping table (user → allowed entity) rather than hard-coded logic in dashboards.
Example: regional sales access
- user_region_access table: user_email, region_id
- RLS rule: filter sales.region_id to those in user_region_access for the current user
-- Conceptual RLS predicate (implementation varies by platform/tool):
WHERE sales.region_id IN (
SELECT region_id
FROM user_region_access
WHERE user_email = CURRENT_USER_EMAIL()
)
Operational tip: treat the access mapping table as governed data with an owner (often IT/security) and an audit trail for changes.
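A minimal sketch of such a governed mapping table, with audit columns so every grant is traceable (schema and column names are illustrative):
-- Conceptual access mapping table with an audit trail (names illustrative):
CREATE TABLE security.user_region_access (
    user_email VARCHAR   NOT NULL,
    region_id  VARCHAR   NOT NULL,
    granted_by VARCHAR   NOT NULL,  -- who approved the grant
    granted_at TIMESTAMP NOT NULL,
    expires_at TIMESTAMP,           -- optional time-bound access
    PRIMARY KEY (user_email, region_id)
);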
4) Column-level security and masking: protect sensitive fields
Some users may need access to a dataset but not to specific columns (e.g., salary, personal identifiers). Column-level security can hide columns entirely; masking can show a redacted or partially obfuscated value.
Example policy: how sensitive fields are masked
- Restricted identifiers (e.g., national ID): never exposed in BI; only available in tightly controlled operational systems.
- PII fields (email, phone): masked for most users; unmasked only for approved roles (e.g., Support Ops).
- Financial compensation: visible only to HR and Finance leadership groups; aggregated views available to others.
-- Conceptual masking logic:
CASE
WHEN user_has_role('SupportOps') THEN customer_email
ELSE REGEXP_REPLACE(customer_email, '(^.).*(@.*$)', '\1***\2')
END AS customer_email_masked
Practical step-by-step: implement masking safely
- Classify columns (public/internal/confidential/restricted).
- Decide per class: hide vs mask vs aggregate-only.
- Implement controls as close to the data layer as possible (so every dashboard inherits them).
- Test with representative users (a viewer, a creator, an admin) to confirm expected visibility.
Pillar 3: Metric consistency
Metric consistency means that “Revenue,” “Active User,” or “Churn” has one definition, implemented once, reused everywhere, and changed in a controlled way. This is the difference between a BI program that scales and one that fragments into competing dashboards.
1) Single definitions: a shared metric catalog
Create a metric catalog (sometimes called a KPI dictionary) that includes the business definition, calculation rules, filters, and exclusions. The catalog should be easy to search and should link to the certified dataset or semantic object that implements the metric.
Metric definition template
- Name: Net Revenue
- Business question: “How much revenue did we recognize after refunds and discounts?”
- Formula: sum(invoice_amount) - sum(refunds) - sum(discounts)
- Inclusions/exclusions: exclude test accounts; include only posted invoices
- Time logic: based on invoice posted date
- Owner: Finance
- Implementation link: certified dataset / semantic metric ID
- Version: v1.3
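Implemented once, a definition like this typically lives as a certified view (or semantic-layer metric) that every dashboard reuses instead of re-deriving the formula. A minimal SQL sketch of v1.3, assuming illustrative invoice, refund, discount, and test-account tables; refunds and discounts are pre-aggregated per invoice to avoid double counting:
-- Conceptual implementation of Net Revenue v1.3 (table/field names illustrative):
CREATE VIEW certified.net_revenue_v1_3 AS
WITH refund_totals AS (
    SELECT invoice_id, SUM(refund_amount) AS refunds
    FROM refunds GROUP BY invoice_id
),
discount_totals AS (
    SELECT invoice_id, SUM(discount_amount) AS discounts
    FROM discounts GROUP BY invoice_id
)
SELECT
    DATE_TRUNC('month', i.posted_date) AS revenue_month,  -- time logic: posted date
    SUM(i.invoice_amount
        - COALESCE(r.refunds, 0)
        - COALESCE(d.discounts, 0)) AS net_revenue
FROM invoices i
LEFT JOIN refund_totals r   ON r.invoice_id = i.invoice_id
LEFT JOIN discount_totals d ON d.invoice_id = i.invoice_id
WHERE i.status = 'posted'                                 -- only posted invoices
  AND i.account_id NOT IN (SELECT account_id FROM test_accounts)  -- exclude test accounts
GROUP BY 1;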
2) Certified datasets: trusted sources for reuse
Certification is a governance label that signals: “this dataset is reviewed, documented, tested, and appropriate for broad use.” Certification reduces the need for every team to rebuild the same logic.
Example certification criteria
- Owner assigned and documentation complete.
- Security classification applied; RLS/masking validated.
- Data quality checks in place (see operational practices).
- Backward compatibility plan for breaking changes.
- Usage guidance: recommended joins/filters and common pitfalls.
Practical step-by-step: certify a dataset
- Start from a widely used team dataset with stable logic.
- Add documentation and field-level definitions.
- Add tests/quality checks for key fields and aggregates.
- Validate access controls with sample users.
- Publish to a certified workspace/schema and mark it “certified.”
- Deprecate older duplicates by redirecting users and setting an end-of-life date.
3) Versioned logic: change metrics without breaking trust
Metrics evolve (new product lines, accounting rules, attribution changes). Versioning allows you to update logic while keeping historical comparability and avoiding silent shifts.
Approaches to versioning
- Semantic versioning for metric definitions (v1.0, v1.1, v2.0) where major versions indicate breaking changes.
- Effective dating: store which logic applies from a given date onward (sketched after this list).
- Parallel metrics: keep both “Old Churn” and “New Churn” for a transition period, with clear labels.
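Effective dating can be expressed directly in the metric logic, so the right definition applies automatically by date with no silent shift in history. A minimal sketch, using the Active Users v2.0 change from the release-note example below; the table names and exclusion list are illustrative:
-- Conceptual effective-dated metric logic (names illustrative):
SELECT
    a.activity_date,
    COUNT(DISTINCT a.user_id) AS active_users
FROM user_activity a
WHERE NOT (
    -- The v2.0 exclusion applies only from its effective date onward;
    -- history before 2026-02-01 keeps the v1.4 definition.
    a.activity_date >= DATE '2026-02-01'
    AND a.user_id IN (SELECT user_id FROM internal_or_test_users)
)
GROUP BY a.activity_date;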
Example policy: how KPI changes are communicated
- Any change to a Tier-1 KPI requires: owner approval, release notes, and a notification to a defined channel (e.g., #metrics-changes) at least 3 business days before rollout.
- Release notes must include: what changed, why, expected impact (directional), effective date, and dashboards known to be affected.
- For major changes, provide a comparison period where both versions are available and a short FAQ.
Release note example (Tier-1 KPI):
- KPI: Active Users (v2.0)
- Change: excludes internal employees and test accounts using updated identity list
- Effective: 2026-02-01
- Expected impact: -1.5% to -3% vs v1.4 depending on region
- Owner: Product Analytics
- Affected assets: Executive Overview, Weekly Growth Review
Operational practices that keep governance working day-to-day
Data quality checks: detect issues before users do
Quality checks should focus on what breaks decisions: missing data, unexpected spikes/drops, duplicates, and referential mismatches in key identifiers.
Practical step-by-step: implement a basic quality suite
- Freshness: alert if a dataset is not updated within its expected window.
- Volume: alert on row count changes outside a threshold (e.g., ±30% day-over-day).
- Null checks: ensure key fields (IDs, dates, amounts) are not unexpectedly null.
- Uniqueness: enforce uniqueness where required (e.g., invoice_id).
- Business rule checks: e.g., revenue should not be negative beyond refunds; status values must be from an allowed list.
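Most of these checks reduce to small SQL assertions that run on a schedule and alert when they return rows. A minimal sketch of a few of them; table names, thresholds, and interval syntax are illustrative and vary by platform:
-- Conceptual quality checks; each query should return zero rows (names illustrative):

-- Freshness: dataset not updated within its expected window.
SELECT MAX(updated_at) AS last_update
FROM sales
HAVING MAX(updated_at) < CURRENT_TIMESTAMP - INTERVAL '6 hours';

-- Uniqueness: duplicate invoice IDs.
SELECT invoice_id, COUNT(*) AS n
FROM invoices
GROUP BY invoice_id
HAVING COUNT(*) > 1;

-- Null checks: key fields unexpectedly null.
SELECT COUNT(*) AS null_keys
FROM invoices
WHERE invoice_id IS NULL OR posted_date IS NULL OR invoice_amount IS NULL
HAVING COUNT(*) > 0;

-- Business rule: status values must come from the allowed list.
SELECT DISTINCT status
FROM invoices
WHERE status NOT IN ('draft', 'posted', 'void');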
Audit logs: know who accessed and changed what
Auditability supports security, compliance, and incident response. At minimum, log dataset publishes, permission changes, and access to restricted data.
- Access logs: user, timestamp, dataset, query/dashboard, success/failure.
- Change logs: who changed permissions, RLS mappings, or certified dataset logic.
- Retention: keep logs long enough for investigations (often 90–365 days depending on policy).
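When access events land in a queryable table, investigations become simple queries. A minimal sketch of reviewing recent access to restricted data, assuming a hypothetical audit.access_log table and the classification column from the ownership directory sketched earlier:
-- Conceptual audit query: access to restricted datasets in the last 30 days
-- (audit.access_log and its columns are illustrative):
SELECT l.user_email, l.dataset_name, l.accessed_at, l.success
FROM audit.access_log l
JOIN governance.ownership_directory d
  ON d.dataset_name = l.dataset_name
WHERE d.security_classification = 'restricted'
  AND l.accessed_at >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY l.accessed_at DESC;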
Usage monitoring: prioritize what matters and retire what doesn’t
Monitor usage to identify critical dashboards, unused assets, and duplication. This helps governance focus on high-impact areas.
What to track
- Top dashboards/datasets by weekly active users.
- Most queried metrics and slowest queries.
- Dashboards with zero views in 60–90 days (candidates for deprecation).
- “Forked” dashboards that diverge from certified sources.
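Deprecation candidates can be pulled straight from usage data. A minimal sketch, assuming hypothetical dashboards and dashboard_views tables:
-- Conceptual usage query: dashboards with zero views in the last 90 days
-- (table names are illustrative):
SELECT d.dashboard_name, d.owner_email, MAX(v.viewed_at) AS last_viewed
FROM dashboards d
LEFT JOIN dashboard_views v ON v.dashboard_id = d.dashboard_id
GROUP BY d.dashboard_id, d.dashboard_name, d.owner_email
HAVING MAX(v.viewed_at) < CURRENT_DATE - INTERVAL '90 days'
    OR MAX(v.viewed_at) IS NULL  -- never viewed at all
ORDER BY last_viewed NULLS FIRST;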
Lightweight approval workflow for new metrics or dashboards
Approval should be proportional: fast for low-risk assets, stricter for org-wide KPIs and sensitive data.
Example workflow (two lanes)
| Lane | Applies to | Approval | Required artifacts | SLA target |
|---|---|---|---|---|
| Standard | Team dashboards, non-sensitive metrics | Domain lead review | Short description + dataset link | 1–2 days |
| Controlled | Tier-1 KPIs, certified datasets, sensitive fields | Business owner + data platform | Definition, impact analysis, tests, comms plan | 3–5 days |
Practical step-by-step: request a new metric
- Submit a request with: name, business question, proposed formula, and intended users.
- Check for duplicates in the metric catalog; if similar exists, propose an extension instead.
- Owner reviews definition and confirms it aligns with business policy.
- Technical review ensures the metric is implemented once (in a certified dataset/semantic object) and inherits security controls.
- Publish with a version and add it to the catalog.
- Announce in the agreed channel with effective date and links to documentation.