Sensor Fusion in Robotics: Conceptual Models for Combining Measurements

Chapter 13

Estimated reading time: 10 minutes


Why fuse sensors?

Robots rarely get a complete, reliable picture of the world from a single sensor. Each sensor is partial (measures only some aspects of motion or environment), noisy, and prone to specific failure modes. Sensor fusion combines multiple imperfect measurements with a motion model to produce a more reliable estimate of the robot’s state than any single source can provide.

Fusion is not “averaging everything.” It is a structured way to answer: Given what I believed a moment ago, what I predict now, and what I just measured, what should I believe next?

Robot state: what are we estimating?

A state is a compact set of variables that describes the robot at a given time. The exact state depends on the task and robot type, but common components include:

  • Pose: position and orientation (e.g., x, y, yaw for a ground robot; x, y, z, roll, pitch, yaw for a drone).
  • Velocity: linear and angular velocities (e.g., v, ω).
  • Optional extras: sensor biases, wheel slip indicators, scale factors, or landmark positions (in SLAM).

Fusion works best when you explicitly choose a state that matches what you need to control or plan. Estimating unnecessary variables adds complexity and can reduce robustness.
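
As a concrete illustration, a minimal ground-robot state can be a small data structure. The Python sketch below combines the pose and velocity components listed above; the field names and defaults are illustrative, not a standard:

from dataclasses import dataclass

# Minimal ground-robot state; the field choice is illustrative.
@dataclass
class RobotState:
    x: float = 0.0      # position along the world x-axis, meters
    y: float = 0.0      # position along the world y-axis, meters
    yaw: float = 0.0    # heading, radians
    v: float = 0.0      # forward speed, m/s
    omega: float = 0.0  # yaw rate, rad/s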

Two roles in every fusion system: prediction vs correction

Prediction (model-driven)

Prediction propagates the previous state forward using a motion model and control inputs. Examples:

  • Differential drive: use wheel commands/encoder-derived motion to predict x, y, yaw.
  • Legged robot: use IMU-integrated motion plus kinematic constraints to predict body orientation and velocity.

Prediction answers: Where should I be now if everything behaved as expected?

Correction (measurement-driven)

Correction adjusts the predicted state using sensor measurements. It answers: Given what I measured, how should I update my belief?

Key idea: correction should be weighted by how trustworthy the measurement is right now, not just in general.

Conceptual loop

repeat each time step k:
    predict state using model
    predict uncertainty (how wrong we might be)
    for each sensor measurement:
        check if measurement is consistent (gating)
        if consistent: correct state using weighted update
        else: ignore or down-weight
    output fused state to controller/planner
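
A concrete, minimal version of this loop for a one-dimensional state (position only) might look like the following Python sketch. The motion model, noise values, and gating threshold are illustrative assumptions, not prescriptions:

import random

# State: 1-D position x with variance var. Constant-velocity motion model,
# one noisy position sensor. All numbers are illustrative assumptions.
MODEL_VAR = 0.05    # uncertainty added by each prediction step
SENSOR_VAR = 0.5    # assumed measurement noise variance
GATE_SIGMA = 3.0    # accept measurements within 3 standard deviations

def fusion_step(x, var, v_cmd, z, dt):
    # Predict: propagate the state with the motion model, grow uncertainty.
    x = x + v_cmd * dt
    var = var + MODEL_VAR

    # Correct: gate on the innovation, then apply a weighted update.
    innovation = z - x
    if abs(innovation) < GATE_SIGMA * (var + SENSOR_VAR) ** 0.5:
        gain = var / (var + SENSOR_VAR)  # trust the less uncertain source more
        x = x + gain * innovation
        var = (1.0 - gain) * var
    # else: inconsistent measurement is ignored for this step

    return x, var  # the fused state goes to the controller/planner

x, var = 0.0, 1.0
for k in range(50):
    true_pos = 0.1 * (k + 1)                             # robot moves 0.1 m per step
    z = true_pos + random.gauss(0.0, SENSOR_VAR ** 0.5)  # simulated noisy sensor
    x, var = fusion_step(x, var, v_cmd=1.0, z=z, dt=0.1)
print(f"fused position ~ {x:.2f} m, variance ~ {var:.3f}")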

Conceptual fusion example 1: Encoders + IMU for stable orientation and velocity

This fusion targets a common problem: encoders provide good short-term velocity and odometry when traction is good, while the IMU captures rotational dynamics well but can drift or be disturbed. Together they can stabilize both orientation and velocity estimates.

What each sensor contributes (conceptually)

  • Encoders: strong for forward speed and yaw rate inferred from wheel differences when wheels track the ground; weak under slip, bumps, or wheel lift.
  • IMU: strong for rapid rotational changes and short-term angular rate; weak for long-term drift and linear acceleration ambiguity.

Fusion model (high level)

Choose a state such as [x, y, yaw, v, ω] (or add IMU bias terms if needed). Use wheel motion as the primary predictor for x, y and v, and use IMU angular rate to improve yaw and ω stability.

Step-by-step practical workflow

  1. Predict the next state using encoder-based motion (dead-reckoning) over the last time step.
  2. Predict uncertainty: increase uncertainty more when conditions suggest slip (e.g., high commanded acceleration, low normal force, aggressive turns).
  3. Correct yaw/ω with IMU: compare predicted yaw change to IMU-indicated yaw change; adjust yaw and ω toward the IMU when the IMU is behaving consistently.
  4. Detect slip inconsistency: if encoder-implied yaw rate and IMU yaw rate disagree beyond a threshold, treat encoders as less reliable for that interval (down-weight their influence on yaw/ω).
  5. Output fused yaw and v to the controller for smoother heading hold and speed regulation.
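
A minimal sketch of steps 1 through 4 for the yaw component only, assuming encoder-derived and IMU yaw rates are already available; the slip threshold and blending weights are illustrative assumptions to be tuned per platform:

SLIP_THRESHOLD = 0.15   # rad/s of encoder/IMU disagreement suggesting slip

def fuse_yaw(yaw, yaw_rate_enc, yaw_rate_imu, dt):
    # Step 1: predict yaw from encoder-derived yaw rate (dead-reckoning).
    yaw_enc = yaw + yaw_rate_enc * dt

    # Step 4: slip check via encoder-vs-IMU yaw-rate disagreement.
    slipping = abs(yaw_rate_enc - yaw_rate_imu) > SLIP_THRESHOLD

    # Step 3: correct toward the IMU; trust it more when slip is suspected.
    w_imu = 0.9 if slipping else 0.3
    yaw_imu = yaw + yaw_rate_imu * dt
    yaw_fused = (1.0 - w_imu) * yaw_enc + w_imu * yaw_imu
    rate_fused = (1.0 - w_imu) * yaw_rate_enc + w_imu * yaw_rate_imu

    return yaw_fused, rate_fused, slipping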

Practical takeaway: the fusion is not “encoder + IMU = better.” It is “use encoders for what they are good at, but let the IMU veto or soften encoder updates when slip is likely.”

Conceptual fusion example 2: ToF/Ultrasonic + encoders for obstacle-aware motion

Encoders help you estimate where you are and how fast you are moving. Range sensors (ultrasonic or optical ToF) help you estimate how close obstacles are. Fusion here is often less about a single unified state and more about combining motion estimation with environment constraints to make safer decisions.

Two common fusion patterns

  • State + constraint correction: use encoders to predict forward motion; use range readings to correct the predicted position when you expect to be near a wall or known boundary.
  • State + safety layer: keep odometry as the main state, but use range sensors to gate commands (slow/stop) when obstacles are detected, effectively fusing at the decision level.

Example: hallway following with a side range sensor

Goal: maintain a target distance to a wall while moving forward.

  1. Predict forward progress from encoders to estimate where you are along the hallway.
  2. Measure lateral distance to the wall from the range sensor.
  3. Correct lateral offset estimate (or directly control steering) using the range measurement, but only when the reading is consistent (e.g., not out of plausible bounds, not rapidly jumping).
  4. Use encoder velocity to anticipate braking distance: if range indicates an obstacle ahead within stopping distance, reduce speed even if odometry says you are on the planned path.
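
A decision-level sketch of this workflow, assuming one side and one front range sensor; the thresholds, steering gain, and braking deceleration are illustrative assumptions:

TARGET_WALL = 0.5    # desired lateral distance to the wall, meters
MAX_RANGE = 4.0      # plausible upper bound for the range sensor, meters
MAX_JUMP = 0.3       # max plausible change between readings, meters
DECEL = 1.5          # assumed braking deceleration, m/s^2

def hallway_step(v, side_range, prev_side_range, front_range):
    # Step 3: gate the side reading (in bounds, not jumping) before use.
    steer = 0.0
    plausible = 0.0 < side_range < MAX_RANGE
    stable = abs(side_range - prev_side_range) < MAX_JUMP
    if plausible and stable:
        # Proportional steering toward the target wall distance
        # (sign convention: positive steers toward the wall).
        steer = 0.8 * (side_range - TARGET_WALL)

    # Step 4: braking-distance check using encoder-derived speed v.
    stop_dist = v * v / (2.0 * DECEL)
    v_cmd = v
    if 0.0 < front_range < 1.5 * stop_dist:   # 50% safety margin
        v_cmd = min(v, 0.2)                   # slow down near the obstacle

    return steer, v_cmd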

What “fusion” buys you here

  • Encoders provide continuity (you still know you are moving even if the range sensor temporarily fails).
  • Range sensors provide absolute proximity cues that odometry cannot infer.
  • The combination supports predictive safety: speed decisions based on both estimated motion and measured distance.

Conceptual fusion example 3: Camera + IMU for motion tracking

Cameras can provide motion cues from visual features; IMUs provide high-rate motion dynamics. Together they can track motion more robustly than either alone, especially during fast turns or brief visual degradation.

Conceptual division of labor

  • IMU: high-rate prediction of orientation changes and short-term motion; helps bridge gaps between camera updates.
  • Camera: provides drift-correcting information from observed scene changes; helps correct accumulated IMU drift over time.

Step-by-step practical workflow (conceptual visual-inertial loop)

  1. IMU-driven prediction: propagate orientation (and optionally velocity/position) at IMU rate.
  2. Camera update when a frame arrives: compute a motion observation from tracked features (or a pose change estimate).
  3. Innovation check: compare predicted motion to camera-observed motion; if consistent, apply a correction.
  4. Handle visual degradation: if too few features are tracked or motion blur is high, reduce the camera’s weight or skip correction; rely on IMU prediction temporarily.
  5. Recover: when visual tracking quality returns, allow camera updates to pull the estimate back from drift.
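
A minimal sketch of this loop for yaw only; the feature-count threshold, gate, and correction gain are illustrative assumptions, and real systems fuse full 3-D orientation rather than a single angle:

MIN_FEATURES = 30    # below this, treat the camera as degraded
GATE_RAD = 0.2       # reject camera yaw corrections larger than this

def vio_step(yaw, gyro_rate, dt, cam_yaw=None, n_features=0):
    # Step 1: IMU-driven prediction at gyro rate.
    yaw = yaw + gyro_rate * dt

    # Steps 2-4: apply a camera update only when a frame arrived
    # and visual tracking is healthy.
    if cam_yaw is not None and n_features >= MIN_FEATURES:
        innovation = cam_yaw - yaw          # step 3: innovation check
        if abs(innovation) < GATE_RAD:
            yaw = yaw + 0.2 * innovation    # gentle drift correction
    # else: rely on IMU prediction until tracking recovers (step 5)

    return yaw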

This is the essence of many practical visual-inertial systems: the IMU keeps the estimate responsive; the camera keeps it honest over longer horizons.

Kalman filter intuition (without heavy math)

The Kalman filter is a widely used conceptual model for fusion when you can describe (1) how the state evolves and (2) how sensors relate to the state. Even if you never implement a textbook Kalman filter, its ideas guide good fusion design.

1) Uncertainty as a first-class signal

Instead of treating a sensor reading as “truth,” you treat it as “truth with uncertainty.” The filter maintains:

  • a best estimate of the state, and
  • a belief about how uncertain that estimate is.

When uncertainty is high, you should be more willing to accept corrections. When uncertainty is low, you should be more skeptical of surprising measurements.

2) Weighting: who do you trust right now?

In Kalman-style fusion, the correction is effectively a weighted blend between:

  • prediction (model + previous state), and
  • measurement (sensor reading).

The weights come from uncertainties: trust the source with lower uncertainty more. Importantly, uncertainties can be made context-dependent (e.g., increase encoder uncertainty during aggressive acceleration; increase camera uncertainty during motion blur).
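
For a single scalar state, this weighted blend fits in a few lines. The factor-of-ten variance inflation for motion blur below is an illustrative assumption of how context-dependent trust can be encoded:

def weighted_correction(x_pred, var_pred, z, var_z, motion_blur=False):
    # Context-dependent trust: a blurred frame is worth far less.
    if motion_blur:
        var_z = var_z * 10.0
    gain = var_pred / (var_pred + var_z)   # trust the less uncertain source
    x = x_pred + gain * (z - x_pred)       # blend prediction and measurement
    var = (1.0 - gain) * var_pred          # the fused estimate is more certain
    return x, var

# A confident prediction barely moves toward a noisy measurement...
print(weighted_correction(1.0, 0.01, 2.0, 1.0))   # stays near 1.0
# ...while an uncertain prediction jumps toward a precise one.
print(weighted_correction(1.0, 1.0, 2.0, 0.01))   # lands near 2.0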

3) Innovation: the “surprise” that drives correction

Innovation is the difference between what you measured and what you expected to measure based on the prediction.

  • Small innovation: measurement agrees with prediction; apply a gentle correction.
  • Large innovation: either something changed (e.g., slip, collision, new obstacle) or a sensor is wrong; handle carefully (gating, down-weighting, or fallback).

4) Why model quality matters

A fusion system is only as good as its assumptions. If the motion model is unrealistic (e.g., ignores slip, ignores actuator saturation, ignores delays), then predictions will be systematically wrong and the filter will either:

  • over-correct constantly (noisy estimate), or
  • reject good measurements as “inconsistent,” or
  • become overconfident and diverge.

Practical rule: a simple model with honest uncertainty often beats a complex model with overconfident uncertainty.

Failure handling in fusion: inconsistency detection, gating, and fallback

Fusion increases robustness only if you explicitly handle the fact that sensors can be wrong in different ways. A good fusion design includes mechanisms to detect and respond to inconsistency.

Detecting inconsistent sensors

Common signals that a sensor is currently unreliable:

  • Residual/innovation too large: measurement deviates far beyond expected uncertainty.
  • Quality indicators: e.g., too few visual features, saturated IMU readings, range sensor returns that are out of plausible bounds.
  • Cross-check disagreement: two sensors that should correlate (encoder yaw rate vs IMU yaw rate) disagree persistently.
  • Temporal behavior: sudden jumps, stuck-at values, or implausible rate of change.

Gating measurements (accept/reject or down-weight)

Gating decides whether to use a measurement update. Conceptually:

  1. Compute predicted measurement from the current predicted state.
  2. Compute innovation (difference between actual and predicted measurement).
  3. Compare innovation to an acceptance threshold based on expected uncertainty.
  4. If within threshold: apply correction. If outside: reject or reduce weight.

Gating prevents a single bad measurement from pulling the state estimate into an unsafe or unstable region.
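
The four steps reduce to a small function for a scalar measurement; the 3-sigma threshold is a common default but still an assumption to tune:

def gate_measurement(z, z_pred, innovation_var, n_sigma=3.0):
    # Steps 1-2: z_pred is the predicted measurement; form the innovation.
    innovation = z - z_pred
    # Step 3: acceptance threshold scaled by expected uncertainty.
    limit = n_sigma * innovation_var ** 0.5
    # Step 4: accept at full weight, or reject (weight 0). A softer variant
    # could return a weight between 0 and 1 instead.
    weight = 1.0 if abs(innovation) < limit else 0.0
    return innovation, weight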

Fallback strategies when a sensor becomes unreliable

Plan for degraded modes explicitly:

  • Drop to prediction-only temporarily: e.g., if camera tracking fails, rely on IMU prediction for short intervals.
  • Switch primary source: e.g., if encoders indicate slip, reduce their influence on yaw and rely more on IMU for heading.
  • Freeze certain state components: if a measurement is essential but missing (e.g., absolute position), keep estimating velocity/orientation but avoid aggressive maneuvers that require accurate position.
  • Safety behavior: slow down, increase obstacle clearance, or stop if the remaining sensors cannot guarantee safe motion.

Sensor issue | Symptom in fusion | Typical response
Wheel slip | Encoder-based motion disagrees with IMU rotation | Increase encoder uncertainty; rely more on IMU for yaw; limit acceleration
Range sensor dropout | Missing/invalid distance readings | Hold last valid reading briefly; reduce speed; rely on odometry for short horizon
Visual degradation | Few features / inconsistent camera motion estimate | Down-weight camera; rely on IMU; wait for recovery
IMU saturation | Clipped angular rates/accelerations | Reject IMU updates; rely on slower sensors; limit motion
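
At the decision level, a fallback can be as simple as mapping detected issues to parameter overrides, mirroring the table above. The mode names and limits here are illustrative assumptions:

def select_mode(slip, range_dropout, visual_degraded, imu_saturated):
    # Most severe issue wins; the overrides feed the fusion weights
    # and the motion limits used by the controller.
    if imu_saturated:
        return {"imu_weight": 0.0, "max_speed": 0.1}
    if slip:
        return {"encoder_yaw_weight": 0.1, "max_accel": 0.5}
    if range_dropout:
        return {"max_speed": 0.3, "hold_last_range_s": 0.2}
    if visual_degraded:
        return {"camera_weight": 0.1}
    return {"max_speed": 1.0}   # nominal mode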

Decision guide: when fusion is worth the complexity

Fusion adds engineering cost: more computation, more tuning, more failure cases. Use it when it clearly improves task success or safety.

Questions to decide

  • Is a single sensor sufficient? If one sensor already meets accuracy and robustness needs across operating conditions, fusion may be unnecessary.
  • Do sensors fail differently? Fusion is most valuable when sensors have complementary weaknesses (e.g., one drifts, another provides drift correction; one is high-rate, another is absolute but slow).
  • Do you need continuity? If control requires smooth, high-rate state estimates, fusion with a prediction model is often essential.
  • Can you detect failures? If you cannot gate or validate measurements, adding sensors may add failure modes rather than reduce them.

Minimal signals for typical robot tasks

Task | Minimal state needed | Minimal signals that usually work | Fusion value
Stable heading + speed control (ground robot) | yaw, v, ω | Encoders + IMU | High (stability under disturbances)
Obstacle-aware forward motion | v plus obstacle distance | Encoders + front range sensor | High (safety and braking decisions)
Short-term dead-reckoning navigation | x, y, yaw | Encoders + IMU (with gating for slip) | Medium to high (reduces drift in heading)
Motion tracking with fast turns (handheld/robot body) | orientation (and sometimes position) | Camera + IMU | High (IMU continuity + camera drift correction)
Slow indoor waypoint driving with frequent stops | x, y, yaw (coarse) | Encoders + occasional external correction (e.g., camera-based) | Medium (depends on environment)

Practical starting point (low complexity fusion)

  • Define the smallest state that supports your controller.
  • Implement prediction at a fixed rate.
  • Add one correction source at a time with explicit uncertainty and gating.
  • Log innovations and sensor quality metrics; tune weights based on observed conditions.
  • Implement at least one degraded mode (what the robot does when a sensor is rejected).

Now answer the exercise about the content:

In a prediction–correction sensor fusion loop, what is the main purpose of gating a sensor measurement before applying a correction?

Answer: Gating compares the measurement to what the prediction expects (the innovation) and uses uncertainty-based thresholds to accept it, reject it, or reduce its weight, preventing bad data from destabilizing the estimate.

Next chapter

Practical Sensor Selection for Robotics Projects: Matching Sensors to Tasks and Constraints
