Why CRDTs matter in multi-writer offline apps
In an offline-first mobile app, multiple devices (and sometimes multiple users) can edit the same logical data while disconnected. When connectivity returns, replicas exchange changes and converge. A common approach is to detect conflicts and resolve them with domain rules. A different approach is to design the data type itself so that merges are mathematically safe and convergence is guaranteed without custom conflict resolution logic for every field. That family of data types is called CRDTs: Conflict-free Replicated Data Types.
A CRDT is a data structure with a merge strategy (or an operation application strategy) that ensures replicas converge to the same state when they have received the same set of updates, regardless of message ordering, duplication, or delays. This is a strong fit for offline-first UX because it reduces “sync surprises”: users can keep editing, and the system can merge without asking them to resolve conflicts for many common patterns (counters, sets, lists, maps, text).
CRDTs do not eliminate all product decisions. They encode a specific notion of “correctness” (for example, add-wins vs remove-wins). Your job is to pick CRDT semantics that match the user’s mental model for each piece of data.
Core properties: convergence under unreliable delivery
CRDT designs target the realities of distributed sync: messages can arrive out of order, be duplicated, or be temporarily missing. Two key families exist, each with a different way to guarantee convergence.
State-based CRDTs (CvRDTs): merge states
In a state-based CRDT, each replica periodically sends its full (or compacted) state to others. The receiver merges the incoming state into its local state using a merge function that is:
- Listen to the audio with the screen off.
- Earn a certificate upon completion.
- Over 5000 courses for you to explore!
Download the app
- Commutative: merge(a, b) = merge(b, a)
- Associative: merge(merge(a, b), c) = merge(a, merge(b, c))
- Idempotent: merge(a, a) = a
These properties mean you can merge in any order, multiple times, and still converge. The state must form a join-semilattice: there is a partial order (≤) and merge computes the least upper bound (join). In practice, you often implement merge as “take max per component” or “union of sets.”
Operation-based CRDTs (CmRDTs): deliver operations
In an operation-based CRDT, replicas send operations (like “increment by 1” or “insert character X at position P”) rather than full state. Convergence is guaranteed if operations are delivered exactly-once in causal order, or if the CRDT is designed to tolerate weaker guarantees by making operations commutative or by attaching enough metadata to reorder them safely.
For mobile sync, operation-based CRDTs can be bandwidth-efficient, but they require careful handling of causal context and deduplication. Many production systems use a hybrid: operation logs for near-real-time sync plus periodic state compaction.
CRDT vocabulary you will use in implementation
Replica identity and unique operation IDs
Most CRDTs attach metadata that must be unique per replica and per operation. A typical pattern is a pair (replicaId, counter) where counter is a monotonically increasing local integer. This yields a globally unique identifier without coordination.
Causality and vector clocks (conceptual)
Some CRDTs need to know whether one update happened before another (causal order). A common representation is a version vector: a map replicaId → max counter seen from that replica. You do not need to expose this to users, but you will store it to support merges, garbage collection, and compaction.
Join, tombstones, and compaction
Many CRDTs represent removals as tombstones (metadata that remembers something was removed) so that a late-arriving “add” does not resurrect deleted content incorrectly. Tombstones can grow without bound, so practical systems include compaction strategies that are safe once all replicas have observed certain versions.
Choosing CRDT semantics: add-wins vs remove-wins
Consider a shared set of tags on a note. If one device removes tag “urgent” while another offline device adds it, what should happen after sync? CRDTs force you to choose a deterministic rule:
- Add-wins: concurrent add and remove results in the element being present.
- Remove-wins: concurrent add and remove results in the element being absent.
Neither is universally correct. Add-wins often matches “I explicitly added it, don’t lose my work.” Remove-wins often matches “I explicitly deleted it, don’t bring it back.” The important part is that the rule is consistent and encoded in the data type.
Common CRDTs you can apply in mobile apps
G-Counter and PN-Counter (multi-writer counters)
A grow-only counter (G-Counter) supports only increments. Each replica maintains its own component count. The value is the sum across replicas. Merge takes the per-replica maximum, which is commutative/associative/idempotent.
// State-based G-Counter (conceptual) state: Map<replicaId, int> countsfunction inc(state, replicaId, delta=1): state.counts[replicaId] = state.counts.get(replicaId, 0) + deltafunction value(state): return sum(state.counts.values())function merge(a, b): merged.counts = {} for each id in union(keys(a.counts), keys(b.counts)): merged.counts[id] = max(a.counts.get(id,0), b.counts.get(id,0)) return mergedA PN-Counter supports increments and decrements by combining two G-Counters: one for increments (P) and one for decrements (N). Value = sum(P) − sum(N). Merge each component with max per replica.
Practical uses: likes/upvotes, inventory adjustments, “times opened,” or any metric where you want offline increments without conflicts.
OR-Set (Observed-Remove Set) for multi-writer sets
A naive set with add/remove will conflict under concurrency. OR-Set solves this by tagging each add with a unique ID. Removing an element removes the specific observed tags for that element. If an add happens concurrently with a remove, the remove cannot remove a tag it never observed, so the concurrent add remains (add-wins behavior).
State typically includes:
- Adds: map element → set of unique tags
- Removals: set of removed tags (or per-element removed tags)
Membership: element is present if it has at least one add-tag not in the removed set.
// OR-Set (add-wins) state-based sketchstate.adds: Map<E, Set<Tag>>state.removed: Set<Tag>function add(state, e, tag): state.adds[e].add(tag)function remove(state, e): // remove only tags we have observed for e for tag in state.adds.get(e, emptySet): state.removed.add(tag)function contains(state, e): for tag in state.adds.get(e, emptySet): if tag not in state.removed: return true return falsefunction merge(a, b): merged.adds = union per element of tag sets merged.removed = a.removed ∪ b.removed return mergedPractical uses: tags, membership lists, “saved items,” feature flags per user, or any set-like attribute edited on multiple devices.
LWW-Register (Last-Writer-Wins) and why it is not always enough
An LWW-Register stores a value plus a timestamp (or logical clock). Merge picks the value with the greatest timestamp; ties are broken deterministically (for example by replicaId). This is simple and often used for fields like “display name” where overwriting is acceptable.
However, LWW can lose updates silently when two edits are concurrent and one timestamp wins. In offline-first UX, that may be surprising for user-generated content. Prefer richer CRDTs (maps, sets, text) for collaborative fields, and reserve LWW for fields where overwriting is expected.
CRDT Maps: composing CRDTs per field
Real app objects are structured: a note has title, body, tags, checklist items, etc. A common pattern is a CRDT map where each key maps to a CRDT value. For example:
- title: LWW-Register (or a text CRDT if collaboratively edited)
- tags: OR-Set
- viewCount: PN-Counter
- checklist: list CRDT
Composition is powerful: if each field is a CRDT and merges are independent per field, the whole object becomes a CRDT. This lets you avoid monolithic “object-level conflict resolution” and instead encode semantics where they belong: at the field level.
List and text CRDTs: the hard part of collaboration
Ordered sequences (lists, rich text) are challenging because concurrent inserts at the same position must be deterministically ordered, and deletes must not corrupt the sequence. Two major approaches exist:
- Identifier-based (sequence CRDTs): each element gets a position identifier that can be ordered. Inserts create new identifiers between neighbors. Examples include Logoot-style and LSEQ-style schemes.
- Operation transformation-like but CRDT: represent edits as operations with causal metadata; examples include RGA (Replicated Growable Array) and similar linked-list approaches.
For mobile apps, you often do not need Google-Docs-level real-time co-editing, but you may still need multi-device edits to a checklist or a simple ordered list. A list CRDT can provide stable merges without forcing users to resolve “which item came first.”
RGA intuition: stable IDs and tombstones
RGA assigns each inserted element a unique ID and links it after a predecessor ID. Concurrent inserts after the same predecessor are ordered by their IDs deterministically. Deletes mark an element as removed (tombstone) rather than physically removing it immediately, so late operations can still reference it.
Key tradeoff: tombstones accumulate. You need compaction once you know all replicas have seen the delete.
Step-by-step: modeling a multi-writer “Note” with CRDT fields
This walkthrough focuses on how you would represent and merge a note edited on multiple devices. It assumes you already have a sync engine that can exchange payloads; the focus here is the CRDT state and merge logic.
Step 1: identify which fields need multi-writer safety
Make a table of fields and expected concurrent behavior:
- title: if two devices edit concurrently, is it acceptable that one wins? If yes, LWW-Register. If no, use a text CRDT (more complex).
- tags: should concurrent add/remove be add-wins or remove-wins? Choose OR-Set (add-wins) or a remove-wins variant.
- pinned: boolean toggle. LWW-Register is usually fine.
- readCount: PN-Counter or G-Counter depending on whether decrements exist.
- checklist: list CRDT if ordering matters and multiple devices can edit items.
Step 2: define replica IDs and per-replica counters
Each device needs a stable replicaId. Store it locally. Maintain a local monotonic counter for generating unique tags/operation IDs.
// local metadatareplicaId = loadOrCreateReplicaId()localSeq = loadLocalSequenceCounter()function nextTag(): localSeq += 1 persist(localSeq) return (replicaId, localSeq)Step 3: define CRDT state shapes
Example state for a note:
NoteCRDT { title: LWW<String> tags: ORSet<String> pinned: LWW<Boolean> readCount: GCounter checklist: RGA<ChecklistItem> // each item has stable id + text }LWW<T> { value: T timestamp: (replicaId, counter) } // use a logical timestamp to avoid wall-clock issuesORSet<E> { adds: Map<E, Set<Tag>> removed: Set<Tag> }GCounter { counts: Map<replicaId, int> }Note the LWW timestamp uses a logical pair (replicaId, counter) rather than wall-clock time. This avoids problems when device clocks are skewed. Ordering can be lexicographic: compare counter first, then replicaId as tie-breaker, or use a hybrid logical clock if you need wall-clock-ish ordering.
Step 4: implement local edits as CRDT mutations
When the user edits offline, you update the CRDT state immediately.
- Set title: create a new LWW timestamp using nextTag().
- Add tag: OR-Set add with a fresh tag ID.
- Remove tag: OR-Set remove by marking observed tags removed.
- Increment readCount: increment local component in G-Counter.
function setTitle(note, newTitle): note.title.value = newTitle note.title.timestamp = nextTag()function addTag(note, tagText): note.tags.add(tagText, nextTag())function removeTag(note, tagText): note.tags.remove(tagText)function incrementReadCount(note): note.readCount.inc(replicaId, 1)Step 5: merge remote state deterministically
When remote changes arrive, merge field-by-field using each CRDT’s merge.
function mergeNote(local, remote): local.title = mergeLWW(local.title, remote.title) local.pinned = mergeLWW(local.pinned, remote.pinned) local.tags = mergeORSet(local.tags, remote.tags) local.readCount = mergeGCounter(local.readCount, remote.readCount) local.checklist = mergeRGA(local.checklist, remote.checklist) return localBecause each merge is commutative/associative/idempotent, you can apply merges repeatedly and in any order. This is particularly useful when sync delivers partial updates, retries, or when you reconcile multiple peers.
Step 6: derive user-visible materialized views
CRDT state often contains metadata (tags, tombstones). Your UI should render a clean view:
- OR-Set: show elements with at least one live add-tag.
- RGA: traverse in order, skipping tombstoned elements.
- Counters: sum components.
Keep the materialization logic separate from storage so you can change compaction strategies without breaking UI code.
Practical step-by-step: OR-Set tag merging example
Consider two devices A and B editing the same note’s tags while offline.
Step 1: initial state
Both replicas start with tags = {“work”} represented as adds[“work”] = {t1}. removed = ∅.
Step 2: concurrent edits
- Device A removes “work”. It observes tag t1, so it adds t1 to removed.
- Device B adds “work” again (maybe it was missing locally or user re-added). It creates a new tag t2 and adds it to adds[“work”].
Step 3: merge
Merging unions adds and removed:
- adds[“work”] = {t1, t2}
- removed = {t1}
Membership check finds t2 is not removed, so “work” is present. This is add-wins behavior for concurrent add/remove.
Step 4: what the user experiences
After sync, the tag remains. If your product expectation is that deletions should dominate, you would choose a remove-wins set variant instead of OR-Set, or model “remove” as a stronger operation with its own timestamped intent.
Metadata growth and compaction: planning for mobile constraints
CRDTs trade coordination for metadata. On mobile, storage and bandwidth are limited, so you must plan for compaction.
Tombstone accumulation in sets and lists
OR-Sets and list CRDTs often retain removed identifiers. Without compaction, a frequently edited object can grow indefinitely.
Safe compaction requires knowledge of what others have seen
To drop tombstones safely, you need evidence that all replicas that might later send an old add/insert have already observed the corresponding remove/delete. This is typically done using version vectors:
- Each replica tracks a version vector of operations it has seen.
- When replicas exchange their version vectors (or a summarized “ack” state), you can compute a lower bound that is known to be seen by all relevant replicas.
Once a removed tag’s unique ID is known to be seen by all replicas, you can physically delete it from both adds and removed sets. Similarly, you can purge tombstoned list elements.
Practical compaction strategy for mobile apps
- Periodic compaction: run when the app is idle, charging, or on Wi-Fi.
- Size thresholds: compact when metadata exceeds a threshold per object.
- Server-assisted compaction: if you have a central sync service, it can compute safe compaction points and distribute compacted state. This reduces device CPU and simplifies “all replicas have seen X” reasoning.
CRDTs and domain invariants: what CRDTs do not guarantee
CRDTs guarantee convergence, not business correctness. Some invariants cannot be ensured without coordination. Examples:
- Uniqueness constraints: “only one item can be primary.” Two devices can concurrently set different items as primary; a CRDT will converge but may violate the invariant unless you encode a deterministic rule (for example, LWW on primaryId) and accept overwrite semantics.
- Bounded resources: “stock cannot go below zero.” A PN-Counter can represent decrements offline, but it cannot prevent overselling without coordination.
- Cross-object constraints: “sum of allocations across projects must equal 100%.” CRDTs operate per replicated object; global constraints require additional design.
In practice, you often combine CRDTs for user-facing merge behavior with validation layers that detect invariant violations and guide remediation (for example, pick a winner deterministically and notify, or queue a server-side reconciliation task).
Interoperability and storage representation
Encoding CRDT state for persistence
CRDT state is just data. Store it in your local database as structured blobs or normalized tables. Key considerations:
- Stable ordering for deterministic serialization (important for hashing and caching).
- Compact binary formats when state is large (lists/text).
- Schema evolution: include a version field so you can migrate CRDT representations.
Encoding for sync payloads
For state-based CRDTs, you can send the whole state or a compressed delta of the state (for example, only changed per-replica counter components, only new add-tags). For operation-based CRDTs, you send operations with unique IDs and causal context. The CRDT choice influences payload shape, but the key is that the receiver can apply merges repeatedly without harm.
Testing CRDT behavior: how to gain confidence
CRDT bugs are often subtle because they appear only under reordering and concurrency. Testing should simulate the network, not just single-threaded updates.
Property-based tests for convergence
Generate random sequences of edits on multiple replicas, deliver them in random orders with duplicates, and assert convergence.
// Pseudocode test idea:for trial in 1..N: replicas = [newReplica(), newReplica(), newReplica()] ops = generateRandomOps(replicas) deliveries = shuffleWithDuplicates(ops) for (replica, op) in deliveries: apply(replica, op) // or merge states periodically assert all materializedViewsEqual(replicas)Golden scenario tests for semantics
Write explicit tests for your chosen semantics:
- Concurrent add/remove in tags yields add-wins (or remove-wins if that’s your choice).
- Concurrent increments sum correctly.
- Checklist concurrent inserts are deterministically ordered and no item disappears.
Migration tests
If you change CRDT representation (for example, switch from LWW timestamps based on wall-clock to logical IDs), test that old persisted states can be loaded and merged with new ones without divergence.