Robotics Vision Task: Fiducial Markers for Localization and Docking

Chapter 11

Estimated reading time: 11 minutes


What fiducial markers provide to a robot

Fiducial markers (e.g., square binary tags) are engineered visual patterns designed to be detected quickly and unambiguously. They are popular in robotics because a single detection can provide both identity (which marker it is) and geometry (where it is and how it is oriented relative to the camera). This makes them useful for two common tasks: (1) localization against a known map of marker locations, and (2) docking/alignment to a station that carries one or more markers.

Detection outputs you should expect

A typical marker detector returns three categories of outputs. Treat these as the contract between perception and the rest of the robot stack.

  • Marker ID: an integer identifying which tag in the family was detected (e.g., 0–249). This links the detection to a known physical marker in your environment or on your dock.
  • Corner pixel coordinates: the 2D image locations of the marker’s corners, usually in a consistent order (e.g., clockwise starting at top-left). Represented as (u, v) pixel pairs: [(u0,v0),(u1,v1),(u2,v2),(u3,v3)]. These corners are the key input for pose estimation.
  • Pose estimate (optional but common): the marker’s 3D position and orientation relative to the camera, computed from the corners plus the marker’s known physical size and the camera intrinsics. Often returned as a rotation (R or rvec) and translation (t) such that a 3D point on the marker in marker coordinates transforms into the camera frame.

Many libraries also provide a detection quality score (e.g., Hamming distance, decision margin, reprojection error). Use it for gating and debugging.
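To make this contract concrete, here is a minimal pure-Python sketch of a detection record plus a quality gate. The field names (`marker_id`, `corners_px`, `quality`) and the threshold value are illustrative, not any particular library's API:

```python
from dataclasses import dataclass

@dataclass
class MarkerDetection:
    """Minimal detection record matching the contract above (names illustrative)."""
    marker_id: int
    corners_px: list      # four (u, v) pixel pairs, in the detector's order
    quality: float        # e.g., decision margin; higher is better

def accept(det, known_ids, min_quality=30.0):
    """Gate a detection before it reaches pose estimation and control."""
    return (det.marker_id in known_ids
            and len(det.corners_px) == 4
            and det.quality >= min_quality)
```

Gating at this boundary keeps implausible detections (unknown IDs, malformed corners, weak decodes) out of the pose pipeline entirely.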

Building a reliable marker-based system

Marker-based systems fail more often due to engineering details (size, placement, lighting, motion) than due to the detector itself. The goal is to ensure the marker is consistently detectable and yields stable pose.

Step 1: Choose marker family and print quality

  • Family choice: prefer widely used families with good error correction. Use a family large enough to avoid ID collisions in your environment.
  • Border and quiet zone: ensure the printed marker includes the required black border and margin. Cutting too close to the pattern increases false detections and corner jitter.
  • Material: matte finish reduces specular highlights. Avoid glossy laminates unless you control lighting well.
  • Mounting flatness: a warped marker breaks the planar assumption used by pose estimation and increases reprojection error.

Step 2: Decide marker size and placement using pixel coverage

Pose accuracy depends strongly on how many pixels the marker occupies. Too small and the corners become quantized and noisy; too large and you may lose it due to partial visibility at close range. A practical design approach is to target a minimum marker width in pixels at the farthest expected detection distance.


Rule-of-thumb pixel targets

  • Detection reliability: aim for marker width ≥ 60–80 px in the image at the farthest range.
  • Stable pose for docking: aim for marker width ≥ 120–200 px near the docking approach region.
  • Corner stability: if corners jump by multiple pixels frame-to-frame, pose will jitter; increase pixel coverage or improve shutter/lighting.

If you know camera intrinsics, you can estimate expected pixel width. For a pinhole model, the approximate pixel width is:

marker_width_px ≈ fx * (marker_width_m / distance_m)

where fx is focal length in pixels. Use this to pick a marker size that meets your pixel target at the maximum distance you care about.
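This sizing rule is easy to encode. A small sketch (function names are our own) that computes the expected pixel width, and its inverse for choosing the smallest marker that meets a pixel target at the farthest range:

```python
def expected_marker_width_px(fx_px, marker_width_m, distance_m):
    """Approximate on-image marker width under the pinhole model."""
    return fx_px * marker_width_m / distance_m

def required_marker_width_m(fx_px, target_px, max_distance_m):
    """Smallest marker side length that meets the pixel target at max range."""
    return target_px * max_distance_m / fx_px
```

For example, with fx = 600 px, a 15 cm marker at 1.5 m covers about 60 px, right at the lower edge of the detection-reliability target above.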

Placement guidelines

  • Keep markers near the camera’s typical gaze direction: avoid placing them only at the extreme edges of the field of view where distortion and blur are worse.
  • Prefer vertical mounting for ground robots: a marker on a vertical plane (wall/dock face) is easier to view head-on during approach than one on the floor.
  • Provide approach visibility: ensure the marker is visible early enough to allow the robot to correct its trajectory before reaching the dock.
  • Use multiple markers for robustness: a cluster around the dock face reduces occlusion sensitivity and improves pose stability.

Step 3: Lighting considerations specific to markers

Markers are high-contrast patterns, but they can still fail under real lighting. Common issues include blown highlights, deep shadows, and flicker. The detector needs crisp edges and an undistorted binary pattern.

  • Avoid specular glare: position lights so reflections do not wash out the black/white cells. Matte prints help.
  • Ensure sufficient exposure at motion: docking often happens while moving; if exposure time is long, motion blur destroys corner localization. Increase illumination or use shorter exposure with higher gain if needed.
  • Handle flicker: some indoor lighting flickers; if you see periodic detection dropouts, adjust exposure time or use flicker-aware settings.

Step 4: Calibration dependencies (what matters for pose)

Pose estimation from a planar marker depends on accurate camera intrinsics (focal lengths and principal point) and distortion parameters. Small calibration errors can cause systematic pose bias (e.g., wrong distance, yaw drift) even if the marker is detected perfectly.

  • Intrinsics accuracy: directly affects scale and angle accuracy from PnP.
  • Distortion correction: important when markers appear near image edges; unmodeled distortion bends edges and shifts corners.
  • Marker size accuracy: the physical side length you provide to the solver must match the printed marker; a 2% size error becomes a ~2% distance error.

Pose estimation basics: from corners to 6-DoF with PnP

Given the 2D pixel coordinates of the marker corners and the known 3D coordinates of those corners in the marker’s own coordinate frame, you can estimate the marker pose relative to the camera using a Perspective-n-Point (PnP) solver.

Define the 3D corner coordinates

Assume a square marker of side length s meters. A common convention sets the marker plane at Z=0 and centers it at the origin:

// Marker frame (meters), corners in consistent order with detector output (clockwise for example):
P0 = (-s/2,  s/2, 0)  // top-left (example convention; match your detector's ordering!)
P1 = ( s/2,  s/2, 0)  // top-right
P2 = ( s/2, -s/2, 0)  // bottom-right
P3 = (-s/2, -s/2, 0)  // bottom-left

The 2D observations are the detected pixel corners p0..p3. PnP finds R and t that best project the 3D points onto the observed pixels given the camera intrinsics.
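In practice you would call a library PnP solver; for intuition, here is a numpy-only sketch of the planar special case, which estimates the marker-to-image homography from the four correspondences (direct linear transform) and decomposes it using the intrinsics. This assumes clean, noise-free corner data and the conventions defined above; it is illustrative, not production code:

```python
import numpy as np

def pose_from_marker_corners(obj_xy, img_uv, K):
    """Planar pose from 4 coplanar points: DLT homography + decomposition.

    obj_xy: (4, 2) corner coordinates in the marker frame (Z = 0 plane, meters)
    img_uv: (4, 2) detected pixel corners, in the SAME order
    K:      (3, 3) camera intrinsics
    Returns (R, t) mapping marker-frame points into the camera frame.
    """
    # Direct linear transform: two equations per 2D-3D correspondence.
    A = []
    for (X, Y), (u, v) in zip(obj_xy, img_uv):
        A.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        A.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)          # homography, up to scale and sign

    M = np.linalg.inv(K) @ H          # proportional to [r1 r2 t]
    M /= np.linalg.norm(M[:, 0])      # columns r1, r2 must be unit length
    if M[2, 2] < 0:                   # the marker must sit in front of the camera
        M = -M
    r1, r2, t = M[:, 0], M[:, 1], M[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    U, _, Vt2 = np.linalg.svd(R)      # re-orthonormalize against numeric noise
    return U @ Vt2, t
```

Library solvers add robustness (iterative refinement, handling of the planar pose ambiguity), but the structure is the same: known 3D corners, observed 2D corners, intrinsics in, (R, t) out.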

What you get back

  • Translation t: the marker origin position in the camera frame (meters). The forward distance is typically the camera frame’s Z component (depending on your convention).
  • Rotation R: orientation of the marker frame relative to the camera. Often converted to roll/pitch/yaw for control, but keep it as a rotation matrix or quaternion internally to avoid singularities.

Quality check: reprojection error

After solving PnP, reproject the 3D corners back into the image and measure pixel error. Large reprojection error indicates a bad detection, wrong corner ordering, wrong marker size, or calibration issues.

reproj_error_px = mean_i || project(R, t, Pi) - pi ||

Use a threshold (e.g., < 2–5 px depending on resolution and distance) to accept/reject pose estimates.
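This check is a few lines of numpy, assuming the same pinhole/intrinsics conventions as above:

```python
import numpy as np

def reprojection_error_px(R, t, K, obj_pts, img_uv):
    """Mean pixel distance between projected 3D corners and detected corners."""
    cam = (R @ obj_pts.T).T + t          # marker frame -> camera frame
    proj = (K @ cam.T).T
    proj = proj[:, :2] / proj[:, 2:3]    # perspective divide
    return float(np.mean(np.linalg.norm(proj - img_uv, axis=1)))
```

Log this value alongside every accepted pose; a slow upward drift often reveals a loosening mount or degrading calibration before detections fail outright.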

Smoothing pose over time (for control stability)

Raw pose estimates can jitter due to pixel noise, corner quantization, and intermittent partial occlusions. Docking controllers are sensitive to this jitter, especially in yaw and lateral offset. Smoothing should reduce noise while preserving responsiveness.

Practical smoothing strategies

  • Low-pass filter translation: apply an exponential moving average (EMA) to t components. Choose a time constant based on robot speed and camera FPS.
  • Smooth rotation with quaternions: use spherical linear interpolation (slerp) between previous and current orientation to avoid artifacts from Euler angles.
  • Outlier rejection before smoothing: reject frames with high reprojection error or implausible jumps; smoothing alone cannot fix large outliers.

Example: EMA with gating

// Inputs each frame: detection (id, corners), pose (R, t), quality metrics (reproj_error, decision_margin)
if not detected: hold last pose for a short timeout, then declare lost
if reproj_error > err_thresh: reject measurement
if ||t - t_prev|| > jump_thresh: reject measurement (or down-weight)
// Smooth translation (EMA)
t_filt = alpha * t_meas + (1 - alpha) * t_filt_prev
// Smooth rotation (slerp between quaternions q_prev and q_meas)
q_filt = slerp(q_filt_prev, q_meas, alpha_rot)

Choose alpha higher (less smoothing) when the robot is close to docking and needs responsiveness, and lower (more smoothing) when far away and noise dominates.
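The pseudocode above can be fleshed out into a runnable filter. This sketch implements quaternion slerp by hand (quaternions in (w, x, y, z) order is our assumption; match your own convention) and uses fixed, illustrative gating thresholds:

```python
import numpy as np

def quat_slerp(q0, q1, alpha):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    dot = np.dot(q0, q1)
    if dot < 0.0:                 # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:              # nearly parallel: lerp is numerically safer
        q = q0 + alpha * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    return (np.sin((1 - alpha) * theta) * q0
            + np.sin(alpha * theta) * q1) / np.sin(theta)

class PoseFilter:
    """EMA on translation + slerp on orientation, with simple measurement gating."""
    def __init__(self, alpha=0.3, alpha_rot=0.3, err_thresh=3.0, jump_thresh=0.2):
        self.alpha, self.alpha_rot = alpha, alpha_rot
        self.err_thresh, self.jump_thresh = err_thresh, jump_thresh
        self.t, self.q = None, None

    def update(self, t_meas, q_meas, reproj_error):
        t_meas = np.asarray(t_meas, float)
        # Gate: reject poor fits and implausible position jumps.
        if reproj_error > self.err_thresh:
            return self.t, self.q
        if self.t is not None and np.linalg.norm(t_meas - self.t) > self.jump_thresh:
            return self.t, self.q
        if self.t is None:          # first accepted measurement initializes state
            self.t, self.q = t_meas, np.asarray(q_meas, float)
        else:
            self.t = self.alpha * t_meas + (1 - self.alpha) * self.t
            self.q = quat_slerp(self.q, np.asarray(q_meas, float), self.alpha_rot)
        return self.t, self.q
```

A natural extension is to make `alpha` a function of filtered distance, implementing the close-range responsiveness advice above.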

Using markers for docking alignment

Docking typically needs the robot to align laterally and in yaw with the dock face, then drive forward while maintaining alignment. A marker on the dock provides a direct measurement of relative pose.

Control-relevant quantities from pose

  • Forward distance: use the marker’s Z in camera frame (or along the robot’s forward axis after transforming into robot base frame).
  • Lateral offset: marker X in camera/robot frame indicates left-right error.
  • Yaw misalignment: relative rotation around the vertical axis (in the robot frame) indicates heading error to correct.

Step-by-step docking loop (marker-based)

  1. Acquire: rotate or move until the dock marker is detected with sufficient quality and pixel size.
  2. Approach phase: drive forward while correcting yaw and lateral offset using filtered pose. Keep speed low enough to avoid motion blur.
  3. Final alignment: when within a distance threshold, reduce speed and tighten gating thresholds (reject jittery measurements). Optionally require multi-marker agreement.
  4. Contact/engage: switch to short-range sensors (bump, IR, charging contacts) if available, but keep marker pose as a sanity check.

In practice, it helps to define a docking coordinate frame on the dock (e.g., marker frame) and compute the robot’s target pose relative to it (e.g., a fixed offset in front of the marker).
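As a sketch of that idea, the following converts a marker pose into the three control errors listed earlier, measured toward a standoff point in front of the marker. It assumes the marker's +Z axis points out of the dock face toward the approaching robot, and the usual camera convention of +Z forward, +X right; both are assumptions you must check against your setup:

```python
import numpy as np

def docking_errors(R, t, standoff_m=0.5):
    """Control errors toward a point standoff_m in front of the dock marker.

    R, t: marker pose in the camera frame (from PnP).
    Returns (forward_m, lateral_m, yaw_rad).
    """
    # Target point: standoff_m along the marker normal, expressed in camera frame.
    target_cam = R @ np.array([0.0, 0.0, standoff_m]) + t
    forward = float(target_cam[2])    # distance still to travel
    lateral = float(target_cam[0])    # left/right error
    # Heading error about the vertical axis: zero when facing the dock head-on.
    normal_cam = R @ np.array([0.0, 0.0, 1.0])
    yaw = float(np.arctan2(-normal_cam[0], -normal_cam[2]))
    return forward, lateral, yaw
```

For a full controller these quantities should first be transformed into the robot base frame and fed through the pose filter described above.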

Using markers for map localization

If marker positions are known in a global/map frame, each detection provides a constraint on the robot pose. The marker ID tells you which known landmark you observed; the pose estimate gives the transform between camera and marker. Combining these with the robot’s known camera-to-base transform yields a robot pose estimate in the map.

Typical transform chain

Let T_A_B denote the transform that maps coordinates from frame B into frame A. If you know:

  • T_map_marker from a marker map (per marker ID)
  • T_cam_marker from PnP (measurement)
  • T_base_cam from robot geometry

Then you can compute robot base in map:

T_map_base = T_map_marker * inverse(T_cam_marker) * inverse(T_base_cam)

In a multi-marker environment, you can fuse multiple marker-based pose estimates over time (and with odometry) using a state estimator. Even without a full estimator, you can use marker-based pose as periodic corrections to reduce drift.
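Once poses are stored as 4x4 homogeneous matrices, the transform chain is one line of matrix algebra. A minimal numpy sketch:

```python
import numpy as np

def make_T(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def robot_pose_in_map(T_map_marker, T_cam_marker, T_base_cam):
    """T_map_base = T_map_marker * inv(T_cam_marker) * inv(T_base_cam)"""
    return T_map_marker @ np.linalg.inv(T_cam_marker) @ np.linalg.inv(T_base_cam)
```

Getting the inverses right is the usual failure mode here; writing frames in the T_A_B order above and checking that adjacent subscripts cancel (map_marker, marker_cam, cam_base) makes the chain auditable.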

Pitfalls and mitigation strategies

Motion blur

Symptom: corners become smeared, detections drop, pose jumps. Mitigations:

  • Reduce exposure time; increase illumination to compensate.
  • Slow down during acquisition and final docking.
  • Prefer global shutter cameras for fast motion if available.

Occlusion and partial visibility

Symptom: marker disappears when robot gets close (bumper, gripper, cable blocks view) or when viewed from the side. Mitigations:

  • Use multi-marker layouts around the dock so at least one is visible.
  • Mount markers higher/lower to avoid robot self-occlusion.
  • Use a larger marker or multiple sizes (large for far acquisition, smaller for close-range if the large one gets cropped).

Steep viewing angles (foreshortening)

Symptom: marker becomes a thin trapezoid; corner localization degrades; pose becomes unstable. Mitigations:

  • Place markers where the robot can approach more fronto-parallel.
  • Use multiple markers with different orientations (e.g., slight outward cant) to increase effective viewing angle coverage.
  • Gate on minimum observed area or aspect ratio; reject detections that are too foreshortened.

False detections and ID confusion

Symptom: detector reports a marker where none exists, or wrong ID; pose is nonsensical. Mitigations:

  • Detection gating: require low Hamming distance / high decision margin, and low reprojection error.
  • Temporal consistency checks: require the same ID to persist for N frames before acting; reject sudden ID changes unless the previous marker was lost.
  • Geometric plausibility: enforce bounds on distance, height, and orientation based on your environment (e.g., dock marker should be roughly vertical and within a known region).
  • Multi-marker agreement: if multiple markers are on the dock, require consistent relative poses between them.
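The temporal-consistency check from this list can be as small as a counter. A sketch (the persistence length `n` is an illustrative choice):

```python
class IDGate:
    """Report a marker ID only after it persists for n consecutive frames."""
    def __init__(self, n=5):
        self.n = n
        self.current = None   # ID seen most recently
        self.count = 0        # consecutive frames with that ID

    def update(self, detected_id):
        """Feed one frame's detection (or None); returns a confirmed ID or None."""
        if detected_id is None:
            self.current, self.count = None, 0
            return None
        if detected_id == self.current:
            self.count += 1
        else:
            self.current, self.count = detected_id, 1
        return detected_id if self.count >= self.n else None
```

A single spurious frame of a wrong ID then resets the counter instead of steering the robot.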

Corner ordering mistakes

Symptom: pose flips or rotates unexpectedly even though the marker is detected. Mitigations:

  • Verify the detector’s corner ordering convention and match it to your 3D point ordering.
  • Use reprojection error as a diagnostic; wrong ordering often yields large error or mirrored pose.

Scale errors from wrong marker size

Symptom: distance estimates consistently too large/small. Mitigations:

  • Measure the printed marker side length precisely (including any scaling from printers).
  • Standardize printing settings; avoid “fit to page”.

Practical checklist for a robust deployment

  • Marker size: meets the pixel-coverage target at the farthest range. Test: place the robot at max range; confirm marker width in pixels and stable detection.
  • Placement: visible during the approach; not self-occluded. Test: drive the full docking trajectory; check when detections start and stop.
  • Lighting: no glare; short exposures are feasible. Test: record video while moving; inspect blur and detection dropouts.
  • Calibration dependency: pose bias is acceptable. Test: compare measured distance to ground truth at several ranges.
  • Gating: outliers and false IDs are rejected. Test: introduce distractor patterns; ensure the system does not latch onto them.
  • Temporal consistency: pose is stable enough for control. Test: hold the robot still; verify pose jitter stays within control tolerance.

Now answer the exercise about the content:

Which combination of detector outputs is sufficient to estimate a fiducial marker’s 6-DoF pose relative to the camera using PnP (assuming camera intrinsics and marker size are known)?


PnP estimates R and t by matching observed 2D corner pixels to the marker’s known 3D corner points (from its physical size) using camera intrinsics.

Next chapter

Robotics Vision Task: Obstacle Detection and Free-Space Estimation
