Why offline testing is different from “normal” testing
Testing offline-first apps is not only about verifying that screens render without a network. The hardest bugs appear in transitions: going offline mid-request, switching between Wi‑Fi and cellular, captive portals, partial connectivity where DNS works but HTTPS fails, or a device that reports “connected” while packets are dropped. Your app’s correctness depends on how it behaves under these unstable conditions, and your test strategy must intentionally create them.
Offline scenario testing focuses on three layers at once: (1) the network layer (latency, loss, timeouts, TLS failures), (2) the persistence/sync boundary (what is committed locally vs. pending), and (3) the user-visible behavior (what the user can do, what is blocked, and what is deferred). This chapter concentrates on how to systematically simulate those conditions and validate behavior with automated and manual tests, without rehashing earlier design topics.
Define an offline test matrix (what to test)
Before tools, define a matrix of scenarios that represent real failure modes. A useful matrix combines: network condition, app lifecycle state, and operation type. Keep it small enough to run frequently, but broad enough to catch regressions.
Network conditions to include
- Airplane mode: no radio, immediate failures.
- Hard offline: no route to host (e.g., Wi‑Fi connected but no internet).
- Captive portal: HTTP redirects, TLS failures, “connected” indicator lies.
- High latency: e.g., 400–2000ms RTT.
- Packet loss: e.g., 5–30% loss causing retries/timeouts.
- Bandwidth constrained: e.g., 50–200 kbps, to expose progress UI and timeout tuning.
- Jitter: variable latency causing race conditions and flaky ordering.
- Server unreachable: DNS failure, connection refused, 503s.
- Mid-flight drop: disconnect during upload/download or during a multi-step API sequence.
Lifecycle states to include
- Foreground active: user actively interacting.
- Backgrounded: app suspended; network changes happen while not running.
- Cold start offline: app launched with no connectivity.
- Resume with changed network: app returns after minutes/hours; tokens may be stale; DNS may differ.
- OS kill + restart: pending work must survive process death.
Operation types to include
- Read-only: list/detail fetch, search, filters.
- Write: create/update/delete, multi-step flows (wizard forms).
- Bulk: batch edits, import/export.
- Binary transfer: images, attachments, large payloads.
- Real-time-ish: polling, subscriptions, “live” indicators (even if you degrade offline).
Turn the matrix into a checklist for manual testing and a smaller “smoke subset” for CI. For example: “Cold start offline + create item + kill app + reconnect + verify server state” is a high-value scenario that catches persistence and replay issues.
Testing approach: unit, integration, end-to-end, and chaos
Unit tests: deterministic failure injection
At the unit level, you want deterministic control over network outcomes. The key is to avoid real sockets and instead inject a transport that can return: timeouts, partial responses, malformed payloads, and specific HTTP error codes. If your code uses an HTTP client, wrap it behind an interface so tests can provide a fake implementation.
Unit tests should validate: (1) correct classification of errors (offline vs. server vs. auth), (2) correct persistence of pending work, and (3) correct state transitions (e.g., “pending → in-flight → committed” or “in-flight → pending” after a drop). Keep these tests fast and exhaustive.
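As a sketch of the third point, the test below drives a hypothetical SyncQueue (enqueue, flush, stateOf, and FakeTransport are assumed names, not a library API) through a simulated drop and asserts that the operation falls back to pending:

import java.net.SocketTimeoutException
import org.junit.Assert.assertEquals
import org.junit.Test

class SyncStateTransitionTest {
    // FakeTransport, SyncQueue, Operation, and OpState are hypothetical stand-ins
    // for your own sync layer; only the shape of the test matters here.
    @Test
    fun droppedRequestReturnsOperationToPending() {
        val transport = FakeTransport()
        val queue = SyncQueue(transport)

        val op = queue.enqueue(Operation.Create(title = "written offline"))
        transport.failNextWith(SocketTimeoutException("simulated drop"))

        queue.flush() // attempts the request, which times out

        assertEquals(OpState.PENDING, queue.stateOf(op)) // back to pending, not lost or committed
        assertEquals(1, queue.pendingCount())            // and not duplicated
    }
}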
Integration tests: real storage + simulated network
Integration tests should include your real local database and serialization logic, but still avoid real network variability. Use a local mock server (or in-process HTTP server) that can be scripted to delay responses, close connections, or return sequences of errors. This catches issues like transaction boundaries, schema mismatches, and “works in unit tests but fails with real persistence.”
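For example, if your stack happens to use OkHttp on the JVM, its MockWebServer can be scripted along these lines (the library choice is an assumption; any scriptable local server works):

import java.util.concurrent.TimeUnit
import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.MockWebServer
import okhttp3.mockwebserver.SocketPolicy

val server = MockWebServer()

// 1) 503 after a 2-second header delay, to exercise retry/backoff and timeouts.
server.enqueue(MockResponse().setResponseCode(503).setHeadersDelay(2, TimeUnit.SECONDS))

// 2) Connection dropped while the body is being written, to exercise partial reads.
server.enqueue(
    MockResponse()
        .setBody("""{"items":[{"id":1}]}""")
        .setSocketPolicy(SocketPolicy.DISCONNECT_DURING_RESPONSE_BODY)
)

// 3) A clean success so the client can converge by the end of the test.
server.enqueue(MockResponse().setResponseCode(200).setBody("""{"items":[]}"""))

server.start()
// Point the app's base URL at server.url("/") for the duration of the test.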
End-to-end (E2E) tests: device + OS network toggles
E2E tests validate the full stack: UI, persistence, background behavior, and OS-level connectivity changes. They are slower and more brittle, so focus on a small number of critical flows. E2E is where you verify that the app remains usable offline, that the UI reflects pending work, and that reconnect triggers the expected sync behavior.
Chaos testing: randomized network turbulence
Chaos testing introduces random latency, loss, and disconnects during longer scripted user sessions. The goal is not to assert every intermediate UI state, but to assert invariants: no crashes, no data corruption, no duplicate server writes, and eventual convergence after connectivity returns.
Network condition simulation: practical options
You can simulate network conditions at multiple levels. Choose the lowest level that is practical for the test you are running.
1) In-app transport simulation (fastest, most deterministic)
Implement a “fault-injecting” HTTP transport used only in tests (and optionally in debug builds). It can apply rules like: delay all requests by 800ms, fail every 3rd request with a timeout, or drop the connection after N bytes.
Example pseudo-implementation of a rule-based transport:
import java.util.concurrent.TimeoutException

// Request and Response are placeholders for whatever your HTTP abstraction uses.
interface HttpTransport {
    fun execute(req: Request): Response
}

// Wraps the real transport and applies fault-injection rules around each call.
class FaultInjectingTransport(private val real: HttpTransport) : HttpTransport {
    var rules: List<Rule> = emptyList()

    override fun execute(req: Request): Response {
        rules.forEach { it.before(req) }
        val res = try {
            real.execute(req)
        } catch (e: Exception) {
            rules.forEach { it.onError(req, e) }
            throw e
        }
        rules.forEach { it.after(req, res) }
        return res
    }
}

sealed class Rule {
    open fun before(req: Request) {}
    open fun after(req: Request, res: Response) {}
    open fun onError(req: Request, e: Exception) {}
}

// Adds a fixed delay before every request.
class DelayRule(private val ms: Long) : Rule() {
    override fun before(req: Request) { Thread.sleep(ms) }
}

// Fails every nth request with a simulated timeout.
class FailNthRule(private val n: Int) : Rule() {
    private var count = 0
    override fun before(req: Request) {
        count++
        if (count % n == 0) throw TimeoutException("simulated")
    }
}

This approach is ideal for unit/integration tests and for reproducing bugs quickly. It will not catch OS-level quirks (like captive portals), but it is excellent for deterministic coverage.
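In a test or debug build, the transport is then wired in with a specific rule set; realTransport and ApiClient below are placeholders for your own client wiring:

// Placeholder wiring: delay every request by 800 ms and time out every 3rd request.
val transport = FaultInjectingTransport(realTransport).apply {
    rules = listOf(DelayRule(800L), FailNthRule(3))
}
val api = ApiClient(transport) // the system under test receives the faulty transport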
2) Mock server with scripted behavior (integration-friendly)
Run a local HTTP server in tests and script it to behave badly: delay responses, return 503 for a while, send malformed JSON, or close the socket mid-response. This tests your parsing, error handling, and retry boundaries with real HTTP semantics.
Scripted sequence example (conceptual):
Scenario: POST /items
1) accept connection, read request, then close socket (simulate drop)
2) next call returns 201 with created item
3) subsequent GET returns list including the item

Use this to validate that a mid-flight drop does not create duplicates and that the client can recover on the next attempt.
3) OS and device network shaping (most realistic)
For E2E and manual verification, use OS tools to toggle connectivity and shape traffic. The exact tooling differs by platform and environment, but the principles are the same: control bandwidth, latency, loss, and connectivity transitions.
- Airplane mode toggling: validates immediate offline behavior and UI transitions.
- Wi‑Fi disable/enable: validates network change events and reconnect logic.
- Link conditioner / traffic shaping: validates slow networks, timeouts, progress UI, and cancellation.
- Proxy-based simulation: route traffic through a proxy that can throttle, drop, or rewrite responses.
When possible, prefer shaping at the router or host machine level for consistent results across devices. For mobile simulators/emulators, built-in network conditioning can be sufficient for many cases.
Step-by-step: build a repeatable offline test harness
Step 1: Add test hooks for connectivity and time
Offline bugs are often timing bugs. Add two injectable dependencies: a connectivity provider and a clock. In production they use OS APIs; in tests they are controllable.
- Connectivity provider: can be set to “offline”, “online”, or “unknown/limited”.
- Clock: allows advancing time to trigger timeouts, expirations, and scheduled work deterministically.
This prevents flaky tests that depend on real timers and real network state.
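A minimal sketch of these two hooks and their test doubles (the names are illustrative):

// Illustrative interfaces; production implementations would back these with OS APIs.
interface ConnectivityProvider { fun status(): ConnectivityStatus }
enum class ConnectivityStatus { ONLINE, OFFLINE, LIMITED }

interface Clock { fun now(): Long }

// Test doubles that the harness can drive explicitly.
class FakeConnectivity(var current: ConnectivityStatus = ConnectivityStatus.ONLINE) : ConnectivityProvider {
    override fun status() = current
}

class FakeClock(private var millis: Long = 0L) : Clock {
    override fun now() = millis
    fun advanceBy(deltaMillis: Long) { millis += deltaMillis } // triggers timeouts/expirations deterministically
}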
Step 2: Create a catalog of “network scripts”
Define reusable scripts that describe network behavior over time. For example:
- Script A: online for 2 seconds → offline for 10 seconds → online.
- Script B: 1500ms latency + 10% loss for all requests.
- Script C: first POST drops mid-flight, second POST succeeds.
Implement scripts either in your fault-injecting transport or in your mock server. The key is repeatability: the same script should reproduce the same behavior every run.
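One way to keep scripts repeatable is to model them as data that the fault-injecting transport (or mock server) interprets; the types below are illustrative:

// A declarative script format; the interpreter lives in the transport or mock server.
sealed class NetworkPhase {
    data class Online(val durationMs: Long) : NetworkPhase()
    data class Offline(val durationMs: Long) : NetworkPhase()
    data class Degraded(val durationMs: Long, val latencyMs: Long, val lossPercent: Int) : NetworkPhase()
}

data class NetworkScript(val name: String, val phases: List<NetworkPhase>)

// Script A from the list above: online 2 s -> offline 10 s -> online.
val scriptA = NetworkScript(
    name = "brief-outage",
    phases = listOf(
        NetworkPhase.Online(durationMs = 2_000L),
        NetworkPhase.Offline(durationMs = 10_000L),
        NetworkPhase.Online(durationMs = Long.MAX_VALUE),
    )
)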
Step 3: Add invariants and assertions (what must always be true)
Offline-first correctness is best validated with invariants rather than brittle UI snapshots. Examples of invariants:
- No data loss: locally created entities remain present after app restart.
- No duplication: a single user action results in at most one server-side creation.
- Monotonic state: once an operation is marked committed, it never returns to pending.
- Eventual convergence: after reconnect and enough time, local and server views match.
- Crash-free: network errors never crash the app; errors are surfaced as states.
Write assertions against your local database state, operation queue state, and server mock state. UI assertions should focus on a few critical indicators (e.g., “pending badge visible”) rather than pixel-perfect layouts.
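A sketch of invariant checks written against state rather than UI; LocalDb, OpQueue, and ServerFixture are hypothetical test-fixture accessors:

import org.junit.Assert.assertEquals
import org.junit.Assert.assertFalse
import org.junit.Assert.assertTrue

fun assertOfflineInvariants(localDb: LocalDb, opQueue: OpQueue, server: ServerFixture) {
    // No duplication: each client-generated ID appears at most once on the server.
    val serverIds = server.createdEntityIds()
    assertEquals(serverIds.size, serverIds.toSet().size)

    // No data loss: every local entity is either committed or has a pending operation.
    localDb.entities().forEach { entity ->
        assertTrue(entity.isCommitted || opQueue.hasPendingOpFor(entity.id))
    }

    // Monotonic state: a committed operation never reappears as pending.
    opQueue.committedIds().forEach { id -> assertFalse(opQueue.isPending(id)) }
}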
Step 4: Build golden-path E2E flows for offline
Choose 3–6 E2E flows that represent core value. For each flow, define: initial state, steps, network script, and expected invariants. Example flows:
- Cold start offline → browse cached list → open detail → edit → kill app → reconnect → verify server updated
- Start upload → drop network at 30% → resume on reconnect → verify checksum matches
- Create multiple items offline → reorder them → reconnect on slow network → verify order and count
Keep E2E flows stable by controlling data: use test accounts, reset server state, and seed predictable fixtures.
Step-by-step: simulate mid-flight failures reliably
Mid-flight failures are where many subtle bugs hide: you may have sent the request but not received the response, so the client cannot know whether the server applied it. To test this, you need the ability to cut the connection after the server receives the request (or after N bytes).
Approach A: server closes connection after reading request
- Configure the mock server endpoint to read the full request body.
- Persist on the server side that the operation was applied (or not), depending on what you want to test.
- Close the socket without sending a response.
- On the next attempt, return a normal success response.
Assertions to add:
- The client does not create duplicates on retry.
- The client eventually marks the operation committed.
- The UI does not show a permanent error state once connectivity is restored.
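With OkHttp's MockWebServer (the same toolchain assumption as earlier), Approach A takes only a few lines; the Idempotency-Key header is an assumption about how your client deduplicates retries:

import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.MockWebServer
import okhttp3.mockwebserver.SocketPolicy
import org.junit.Assert.assertEquals

val server = MockWebServer()
// First POST: the request is read fully, then the socket closes with no response.
server.enqueue(MockResponse().setSocketPolicy(SocketPolicy.DISCONNECT_AFTER_REQUEST))
// Retry: normal success.
server.enqueue(MockResponse().setResponseCode(201).setBody("""{"id":"item-1"}"""))
server.start()

// ... drive the client here: create an item, let the first attempt fail, retry ...

// Both attempts should carry the same deduplication token, which is what lets
// the server treat the retry as the same logical create.
val firstAttempt = server.takeRequest()
val secondAttempt = server.takeRequest()
assertEquals(firstAttempt.getHeader("Idempotency-Key"), secondAttempt.getHeader("Idempotency-Key"))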
Approach B: client-side transport throws after upload begins
- In the fault-injecting transport, stream the request body and throw an exception after N bytes.
- Verify that partial uploads do not corrupt local state and that resume logic (if present) restarts correctly.
This approach is easier to run in unit/integration tests because it does not require socket-level control.
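A minimal sketch of Approach B as a stream wrapper; how it plugs in depends on your HTTP client, so treat the class below as illustrative:

import java.io.IOException
import java.io.InputStream

// The upload body fails after `failAfterBytes` bytes have been read,
// simulating a connection cut mid-upload.
class CutAfterNBytesStream(private val delegate: InputStream, private val failAfterBytes: Long) : InputStream() {
    private var sent = 0L
    override fun read(): Int {
        if (sent >= failAfterBytes) throw IOException("simulated mid-upload drop")
        sent++
        return delegate.read()
    }
}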
Testing “connected but unusable” networks
Many apps incorrectly treat “network available” as “internet reachable.” You should test cases where the device reports connectivity but requests fail.
Captive portal simulation
In a captive portal, HTTP requests may be redirected to a login page, while HTTPS may fail due to certificate mismatch or blocked traffic. To simulate:
- Route traffic through a proxy that returns a 302 redirect to an HTML page for HTTP endpoints.
- For HTTPS endpoints, simulate TLS handshake failure or connection reset.
Assertions:
- Your app classifies the condition as “limited connectivity” (or similar) rather than “server down.”
- It avoids destructive actions that require confirmed server acknowledgement.
- It provides a recoverable path (e.g., retry) once normal internet returns.
DNS failure vs. TCP failure vs. HTTP 503
These failures should be tested separately because they often map to different user messaging and retry behavior.
- DNS failure: simulate by using an invalid host in tests or a resolver that returns NXDOMAIN.
- TCP failure: simulate connection refused or no route to host.
- HTTP 503: server reachable but overloaded; responses are valid HTTP.
In tests, assert that your logging/telemetry tags these correctly so production diagnostics are actionable.
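A small, self-contained example of mapping these three failures to distinct telemetry tags (the tag names are illustrative):

import java.net.ConnectException
import java.net.NoRouteToHostException
import java.net.UnknownHostException
import org.junit.Assert.assertEquals
import org.junit.Test

fun telemetryTag(statusCode: Int?, error: Throwable?): String = when {
    error is UnknownHostException -> "net.dns_failure"                                // name did not resolve
    error is ConnectException || error is NoRouteToHostException -> "net.tcp_failure" // refused / no route
    statusCode == 503 -> "net.http_503"                                               // reachable but overloaded
    else -> "net.other"
}

class TelemetryTagTest {
    @Test fun failuresMapToDistinctTags() {
        assertEquals("net.dns_failure", telemetryTag(null, UnknownHostException("api.example.com")))
        assertEquals("net.tcp_failure", telemetryTag(null, ConnectException("Connection refused")))
        assertEquals("net.http_503", telemetryTag(503, null))
    }
}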
Testing background and lifecycle transitions under network changes
Some of the most painful offline bugs occur when the app is backgrounded: the OS may suspend network activity, kill the process, or delay background tasks. You can still test your app’s behavior by focusing on what must persist and what must resume safely.
Step-by-step: “background during sync” scenario
- Start with connectivity online and a known pending operation set.
- Trigger a sync attempt (or whatever mechanism initiates network work).
- Immediately background the app (or simulate suspension in tests).
- Toggle connectivity offline while backgrounded.
- Bring the app to foreground and restore connectivity.
- Assert invariants: no lost operations, no duplicated operations, and the app resumes without manual intervention.
For automation, prefer platform test frameworks that can background/foreground the app and control network toggles. If full automation is difficult, keep a manual script that QA and developers can run consistently.
Observability for tests: logs, traces, and “sync timeline” snapshots
Offline bugs are hard to debug without visibility. Add test-friendly observability that can be enabled in debug builds and captured by automated tests.
Structured logs with correlation IDs
Ensure each operation and each network request has a correlation ID. In tests, capture logs and assert on key events, such as:
- operation_enqueued
- request_started
- request_failed (with reason)
- operation_committed
- operation_reverted (if applicable)
Instead of asserting exact sequences (which can be brittle), assert that required events occur and forbidden events do not occur (e.g., “committed twice”).
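A sketch of this style of assertion against an in-memory log sink; the sink and event names mirror the list above and are not tied to a specific logging library:

import org.junit.Assert.assertEquals

data class LogEvent(val name: String, val correlationId: String)

class CapturingLogSink {
    val events = mutableListOf<LogEvent>()
    fun log(name: String, correlationId: String) { events.add(LogEvent(name, correlationId)) }
}

fun assertCommittedExactlyOnce(sink: CapturingLogSink, correlationId: String) {
    val commits = sink.events.count { it.name == "operation_committed" && it.correlationId == correlationId }
    // The required event occurred, and the forbidden "committed twice" did not.
    assertEquals(1, commits)
}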
Sync timeline snapshot
Expose a debug endpoint or internal API that returns a snapshot of sync state: counts of pending/in-flight/failed operations, last successful sync time, and any backoff timers. In E2E tests, poll this snapshot to wait for stable states without arbitrary sleeps.
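A polling helper built on such a snapshot might look like this; SyncSnapshot and the fetchSnapshot lambda are assumed test-side accessors:

// Poll the debug sync snapshot until it reports a stable state, instead of sleeping.
data class SyncSnapshot(val pending: Int, val inFlight: Int, val failed: Int)

fun awaitStableSync(fetchSnapshot: () -> SyncSnapshot, timeoutMs: Long = 30_000, pollMs: Long = 250): SyncSnapshot {
    val deadline = System.currentTimeMillis() + timeoutMs
    while (System.currentTimeMillis() < deadline) {
        val snapshot = fetchSnapshot()
        if (snapshot.pending == 0 && snapshot.inFlight == 0) return snapshot
        Thread.sleep(pollMs)
    }
    error("Sync did not reach a stable state within ${timeoutMs}ms")
}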
Preventing flaky offline tests
Offline tests often become flaky due to real timers, real network dependencies, and race conditions. Reduce flakiness with these practices:
- Use virtual time in unit/integration tests: advance the clock instead of sleeping (see the sketch after this list).
- Avoid real external services: use mock servers and deterministic scripts.
- Wait on conditions (database state, sync snapshot) rather than fixed delays.
- Isolate test data: unique IDs per test run; reset server fixtures.
- Control concurrency in tests: limit background threads or use deterministic schedulers where possible.
- Record and replay tricky sequences: once you find a production bug, encode it as a network script and add it to regression tests.
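For the virtual-time practice above, kotlinx-coroutines-test (an assumption about your concurrency stack) lets a test advance the clock explicitly instead of sleeping:

import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.advanceTimeBy
import kotlinx.coroutines.test.runTest
import org.junit.Assert.assertTrue
import org.junit.Test

class BackoffTest {
    @Test
    fun retryFiresAfterBackoffWithoutRealSleep() = runTest {
        var retried = false
        // Stand-in for retry logic that waits 30 seconds of backoff before retrying.
        launch {
            delay(30_000)
            retried = true
        }
        advanceTimeBy(30_001) // virtual time: the test completes in milliseconds
        assertTrue(retried)
    }
}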
Practical offline test cases (ready-to-implement)
Case 1: Cold start offline with cached data
- Pre-seed local database with a list and details.
- Disable network.
- Launch app and navigate list → detail.
- Assert: no spinners that never end; cached content renders; no crash; appropriate offline indicator state.
Case 2: Write while offline, then restart
- Disable network.
- Create or edit an entity.
- Force-kill the app.
- Relaunch offline.
- Assert: the change is still present locally and marked pending (or equivalent).
Case 3: Mid-flight drop on create
- Network online.
- Mock server: on first create request, apply server-side change then close connection without response.
- Client retries after reconnect or next attempt.
- Assert: server has exactly one created entity; client shows one entity; operation marked committed once.
Case 4: Slow network + user cancellation
- Shape network to low bandwidth and high latency.
- Start a large download or upload.
- User cancels.
- Assert: transfer stops; partial files are cleaned up or safely resumable; UI returns to a stable state; no background retry continues unexpectedly.
Case 5: Captive portal behavior
- Simulate “connected” network where HTTPS fails or HTTP redirects.
- Attempt a sync-triggering action.
- Assert: app does not mislabel as “server error”; it does not clear pending work; it remains usable locally; retry works once normal internet returns.
How to integrate offline simulation into CI
CI environments are constrained: emulators may not support full network shaping, and tests must be reliable. A pragmatic approach is layered:
- CI unit tests: extensive fault-injecting transport coverage (timeouts, drops, malformed responses).
- CI integration tests: mock server scripts for mid-flight drops and error sequences.
- Nightly device farm: a small set of E2E flows with OS-level toggles and basic conditioning.
When a production incident occurs, add a regression test at the lowest level that reproduces it (often integration with a scripted mock server). Reserve device-level tests for issues that truly depend on OS behavior.