Why file transfer flows are different from “normal” sync
Uploading and downloading files in an offline-first mobile app is not just “syncing a big record.” Files are large, long-lived, and expensive to re-transfer. They also have integrity requirements: the user expects the downloaded PDF to open, the uploaded video to be playable, and the server to store exactly what the device produced. A resilient file flow therefore needs: (1) resumability (continue after app restarts, network drops, or OS kills), (2) integrity checks (detect corruption or mismatched content), (3) clear user experience states (queued, uploading, paused, failed, completed), and (4) a storage strategy that avoids duplicating large blobs unnecessarily.
In practice, a robust file flow is built around chunking, stable identifiers, and explicit transfer sessions. The app should treat each file transfer as a state machine persisted locally, with enough metadata to resume without re-reading the entire file or re-uploading already confirmed bytes.
Core concepts: chunks, sessions, and integrity
Chunking and byte ranges
Chunking splits a file into fixed-size pieces (for example 1–8 MB). Each chunk can be uploaded or downloaded independently, which enables resume: if the device already transferred chunks 0–41, it can continue at chunk 42. Chunk size is a tradeoff: smaller chunks reduce rework after a failure but increase overhead (more requests, more metadata). Many mobile apps start with 4 MB and adjust based on network and server limits.
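The chunk arithmetic itself is simple; a minimal Kotlin sketch (the constant and function names are illustrative):

```kotlin
// Derive the chunk layout from file size and chunk size.
// 4 MB is a common starting point; adjust per network and server limits.
const val DEFAULT_CHUNK_SIZE = 4L * 1024 * 1024

fun totalChunks(fileSize: Long, chunkSize: Long): Long =
    (fileSize + chunkSize - 1) / chunkSize // ceiling division

// Inclusive byte range [start, end] covered by a given chunk index.
fun chunkRange(index: Long, fileSize: Long, chunkSize: Long): LongRange {
    val start = index * chunkSize
    val end = minOf(start + chunkSize - 1, fileSize - 1)
    return start..end
}
```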
Transfer session
A transfer session is a server-side object that tracks progress for a specific file. The client creates a session, then uploads chunks referencing that session. The server records which chunks are received (and optionally their hashes). The client can query session status to resume after a crash or device switch.
Integrity checks: what to verify and when
Integrity checks ensure the bytes received match the bytes sent. Common layers:
- Per-chunk hash: compute a hash (e.g., SHA-256) for each chunk and send it with the chunk. The server verifies it before accepting. This detects corruption early and allows retrying only the bad chunk.
- Whole-file hash: compute a hash for the entire file and store it in metadata. After upload completion, the server computes/compares the full hash. For downloads, the client verifies the final file hash before marking it usable.
- Length checks: confirm total byte length matches expected size. This is a quick sanity check but not sufficient alone.
- Content-type and container validation: optional validation like “is this a valid JPEG/MP4/PDF” can catch truncated files that still match length in some edge cases (less common if hashes are correct).
Hash choice: SHA-256 is widely available and strong. For very large files, computing a whole-file hash can be expensive; you can compute it incrementally while reading chunks so you do not read the file twice.
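For example, a streaming whole-file hash with java.security.MessageDigest keeps memory use constant regardless of file size; a minimal sketch:

```kotlin
import java.io.File
import java.security.MessageDigest

// Compute the SHA-256 of a file incrementally, reading in buffered blocks
// so the whole file is never held in memory.
fun sha256File(file: File): String {
    val digest = MessageDigest.getInstance("SHA-256")
    file.inputStream().use { input ->
        val buffer = ByteArray(64 * 1024)
        while (true) {
            val read = input.read(buffer)
            if (read == -1) break
            digest.update(buffer, 0, read)
        }
    }
    return digest.digest().joinToString("") { "%02x".format(it) }
}
```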
Data model for resumable transfers (local)
To resume reliably, persist transfer state in a local database (or equivalent durable store). Keep the file bytes in the filesystem (or platform file storage) and store metadata separately.
Suggested fields for an UploadTransfer record
- transferId: local UUID for the transfer.
- localFileUri/path: where the file is stored locally.
- fileSize: bytes.
- mimeType: optional but useful.
- chunkSize: bytes.
- totalChunks: derived.
- uploadedChunksBitset or list of uploaded chunk indexes.
- sessionId: server session identifier (nullable until created).
- remoteFileId: server file identifier (nullable until finalized).
- fileHashSha256: whole-file hash (optional until computed).
- state: queued, uploading, paused, failed, completed, canceled.
- lastErrorCode/lastErrorMessage: for UI and diagnostics.
- createdAt/updatedAt: timestamps.
Suggested fields for a DownloadTransfer record
- transferId
- remoteFileId or URL
- expectedSize and expectedHash (if available)
- destinationPath (final location)
- tempPath (partial download location)
- chunkSize
- downloadedRanges or chunk bitset
- etag/version or server revision identifier
- state, lastError
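As a concrete sketch, here are both records as Kotlin data classes. Field names follow the lists above; the state enum and the exact types are illustrative, not prescriptive:

```kotlin
enum class TransferState { QUEUED, UPLOADING, DOWNLOADING, PAUSED, FAILED, COMPLETED, CANCELED }

data class UploadTransfer(
    val transferId: String,        // local UUID
    val localPath: String,
    val fileSize: Long,
    val mimeType: String?,
    val chunkSize: Long,
    val totalChunks: Long,         // derived from fileSize and chunkSize
    val uploadedChunks: Set<Long>, // or a bitset for very large files
    val sessionId: String?,        // null until the server session exists
    val remoteFileId: String?,     // null until finalized
    val fileHashSha256: String?,   // null until computed
    val state: TransferState,
    val lastErrorCode: String?,
    val lastErrorMessage: String?,
    val createdAt: Long,
    val updatedAt: Long,
)

data class DownloadTransfer(
    val transferId: String,
    val remoteFileId: String,
    val expectedSize: Long?,
    val expectedHash: String?,
    val destinationPath: String,   // final location
    val tempPath: String,          // partial download location
    val chunkSize: Long,
    val downloadedChunks: Set<Long>,
    val etag: String?,             // server revision identifier
    val state: TransferState,
    val lastErrorMessage: String?,
)
```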
Important: store partial downloads in a temporary file and only “promote” (rename/move) to the final path after integrity checks pass. This prevents other parts of the app from opening a corrupted file.
Server API patterns for resumable upload
A typical resumable upload API has three phases: create session, upload chunks, finalize.
1) Create upload session
Client sends metadata: file name, size, mime type, optional whole-file hash (if already known), and chunk size. Server returns a sessionId and any constraints (max chunk size, required headers).
```
POST /uploads/sessions
{ fileName, fileSize, mimeType, chunkSize, fileHashSha256? }
-> { sessionId, expiresAt }
```
2) Upload chunk
Client uploads a chunk with an index and byte range. Include per-chunk hash and length. Server verifies and records it.
```
PUT /uploads/sessions/{sessionId}/chunks/{index}
Headers:
  Content-Range: bytes start-end/total
  X-Chunk-Hash: sha256=...
Body: [bytes]
```
Alternative: a single endpoint with Content-Range and no explicit index, where the server infers which range is being sent. Index-based addressing is simpler for clients and for “which chunks are missing” queries.
3) Query session status (for resume)
Client asks which chunks are already stored. This is critical after app restarts or if the client lost local progress state.
```
GET /uploads/sessions/{sessionId}
-> { receivedChunks: [0, 1, 2, 5, ...], fileSize, chunkSize }
```
4) Finalize
When all chunks are uploaded, client requests finalize. Server assembles the file, verifies whole-file hash if provided (or computes and returns it), stores the file, and returns a remoteFileId.
```
POST /uploads/sessions/{sessionId}/finalize
{ fileHashSha256? }
-> { remoteFileId, fileHashSha256 }
```
If the server detects a mismatch (hash or size), it should reject finalize with a specific error code so the client can decide whether to re-upload missing chunks or restart from scratch.
Step-by-step: implementing resumable upload on mobile
Step 1: Prepare the file and create a local transfer record
When the user selects a file (camera capture, document picker, etc.), copy it into an app-controlled location if possible. This avoids losing access to a content URI later and ensures the file remains available across restarts.
- Determine file size and mime type.
- Choose chunk size.
- Create UploadTransfer in local DB with state=queued.
Step 2: Compute hashes incrementally while reading chunks
To avoid reading the file twice, compute the whole-file hash as you stream through chunks. Also compute per-chunk hashes for each chunk. If the upload is paused mid-file, you can store the partial hash state if your crypto library supports it; otherwise, compute whole-file hash upfront (one pass) before uploading. For many apps, computing upfront is acceptable for files up to a few hundred MB, but for multi-GB videos you may prefer streaming hash computation during upload.
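One way to get both hashes in a single pass is to keep two digests: one reset per chunk, one running across the whole file. A Kotlin sketch under those assumptions (the helper names are illustrative):

```kotlin
import java.io.File
import java.io.InputStream
import java.security.MessageDigest

private fun hex(bytes: ByteArray) = bytes.joinToString("") { "%02x".format(it) }

// Fill `buffer` as far as possible; returns bytes actually read (0 at EOF).
private fun readFully(input: InputStream, buffer: ByteArray): Int {
    var off = 0
    while (off < buffer.size) {
        val n = input.read(buffer, off, buffer.size - off)
        if (n == -1) break
        off += n
    }
    return off
}

// Streams the file once: invokes `onChunk` with each chunk's bytes and hash,
// and returns the whole-file SHA-256 computed along the way.
fun hashWhileChunking(
    file: File,
    chunkSize: Int,
    onChunk: (index: Long, bytes: ByteArray, chunkSha256: String) -> Unit,
): String {
    val fileDigest = MessageDigest.getInstance("SHA-256")
    val buffer = ByteArray(chunkSize)
    file.inputStream().use { input ->
        var index = 0L
        while (true) {
            val read = readFully(input, buffer)
            if (read == 0) break
            val chunk = if (read == chunkSize) buffer.copyOf() else buffer.copyOf(read)
            fileDigest.update(chunk)
            onChunk(index, chunk, hex(MessageDigest.getInstance("SHA-256").digest(chunk)))
            index++
        }
    }
    return hex(fileDigest.digest())
}
```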
Step 3: Create or recover the server session
If sessionId is null, call create-session and store sessionId. If sessionId exists (resume case), call GET session status and reconcile:
- Mark chunks as uploaded if the server already has them.
- If the server session expired, create a new session and reset uploadedChunks.
Step 4: Upload missing chunks with verification
Loop through chunk indexes not yet uploaded. For each chunk:
- Read bytes from file at offset index*chunkSize.
- Compute chunk hash.
- Send PUT chunk with Content-Range and hash header.
- On success, mark chunk as uploaded in local DB immediately (durable progress).
Persisting after each chunk is what makes resume reliable after a crash. If you only persist at the end, you will re-upload more than necessary.
Step 5: Finalize and verify server response
After all chunks are uploaded, call finalize. If the server returns the computed whole-file hash, compare it to the client’s hash. If mismatch, treat as integrity failure: do not mark completed; either restart the session or re-check which chunks the server has and re-upload.
Step 6: Attach the uploaded file to domain objects
Often the file belongs to a message, task, or profile. Keep the file upload separate from “attach to record” operations. A common pattern is:
- Upload file to get remoteFileId.
- Then create/update the domain record referencing remoteFileId.
This separation prevents partially uploaded files from creating broken references. It also allows multiple records to reference the same file if needed.
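A sketch of that ordering, with hypothetical interfaces standing in for your upload pipeline and domain API:

```kotlin
// Hypothetical interfaces; the point is the ordering, not the signatures.
interface FileUploader { suspend fun uploadAndFinalize(transferId: String): String /* remoteFileId */ }
interface DomainApi { suspend fun attachToTask(taskId: String, remoteFileId: String) }

// Upload first, attach second: the domain record never references
// a file id that does not yet exist on the server.
suspend fun attachFileToTask(
    uploader: FileUploader,
    api: DomainApi,
    taskId: String,
    transferId: String,
) {
    val remoteFileId = uploader.uploadAndFinalize(transferId) // throws on failure
    api.attachToTask(taskId, remoteFileId)
}
```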
Minimal pseudocode for upload
```
function runUpload(transferId):
  t = db.getUploadTransfer(transferId)
  assert fileExists(t.localPath)

  if t.sessionId == null:
    s = api.createSession(metaFrom(t))
    db.update(t, sessionId = s.sessionId)
  else:
    status = api.getSession(t.sessionId)
    db.mergeUploadedChunks(t, status.receivedChunks)

  for idx in 0 .. t.totalChunks - 1:
    if !t.isChunkUploaded(idx):
      bytes = readChunk(t.localPath, idx, t.chunkSize)
      chunkHash = sha256(bytes)
      api.putChunk(t.sessionId, idx, bytes, chunkHash)
      db.markChunkUploaded(t, idx)

  result = api.finalize(t.sessionId, t.fileHashSha256)
  if t.fileHashSha256 != null and result.fileHashSha256 != t.fileHashSha256:
    db.fail(t, "INTEGRITY_MISMATCH")
    return
  db.complete(t, remoteFileId = result.remoteFileId)
```
Resumable download with integrity checks
Downloads need the same rigor: resume, verify, and avoid exposing partial files. The main difference is that the server is the source of truth for size and hash, and the client should verify what it received.
HTTP Range requests
Most resumable downloads use HTTP Range requests:
```
GET /files/{remoteFileId}
Headers:
  Range: bytes=start-end
```
The server responds with status 206 Partial Content and a Content-Range header. The client writes the bytes into a temp file at the correct offset. If the app restarts, it can check the temp file size and resume from the last confirmed range, or maintain a chunk bitset similar to uploads.
Step-by-step: implementing resumable download
Step 1: Get expected metadata
Before downloading, obtain expected size and expected hash (if your backend provides it). This can come from a file metadata endpoint.
```
GET /files/{id}/meta
-> { size, sha256, mimeType, etag }
```
If you cannot get a hash, you can still do length checks and rely on TLS for transit integrity, but you lose end-to-end corruption detection (for example, disk write issues or server-side corruption).
Step 2: Create a DownloadTransfer and temp file
Create a record with tempPath and destinationPath. Create an empty temp file (or keep an existing partial temp file if resuming). Store the expected hash and size.
Step 3: Download in chunks and persist progress
For each missing range/chunk:
- Request Range bytes for that chunk.
- Verify response headers (Content-Range matches request; length matches).
- Write to temp file at offset.
- Mark chunk complete in DB.
If the server provides per-chunk hashes (less common for downloads), verify them. Otherwise, verify whole-file hash at the end.
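A sketch of one chunk's download using OkHttp and RandomAccessFile; the URL shape and helper name are assumptions, not a prescribed API:

```kotlin
import java.io.File
import java.io.RandomAccessFile
import okhttp3.OkHttpClient
import okhttp3.Request

// Downloads bytes [start, end] of `url` and writes them into `tempFile`
// at the same offset. Throws on any header or length mismatch so the
// caller retries the chunk instead of persisting bad progress.
fun downloadChunk(client: OkHttpClient, url: String, tempFile: File,
                  start: Long, end: Long, totalSize: Long) {
    val request = Request.Builder()
        .url(url)
        .header("Range", "bytes=$start-$end")
        .build()
    client.newCall(request).execute().use { resp ->
        check(resp.code == 206) { "Expected 206 Partial Content, got ${resp.code}" }
        val expectedRange = "bytes $start-$end/$totalSize"
        check(resp.header("Content-Range") == expectedRange) {
            "Content-Range mismatch: ${resp.header("Content-Range")}"
        }
        val bytes = resp.body!!.bytes()
        check(bytes.size.toLong() == end - start + 1) { "Short body: ${bytes.size}" }
        RandomAccessFile(tempFile, "rw").use { raf ->
            raf.seek(start)  // write at the chunk's own offset
            raf.write(bytes)
        }
    }
}
```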
Step 4: Final integrity verification and promotion
After all chunks are present:
- Compute SHA-256 of the temp file and compare to expectedHash.
- If mismatch, delete temp file and mark failed with integrity error (or retry from scratch).
- If match, move/rename temp file to destinationPath atomically and mark completed.
On many platforms, a rename within the same filesystem is atomic, which is ideal for ensuring other code never sees a half-written “final” file.
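On the JVM and modern Android, java.nio.file.Files.move with ATOMIC_MOVE makes that intent explicit; a minimal sketch:

```kotlin
import java.io.IOException
import java.nio.file.AtomicMoveNotSupportedException
import java.nio.file.Files
import java.nio.file.Paths
import java.nio.file.StandardCopyOption

// Promote a fully verified temp file to its final path. ATOMIC_MOVE fails
// rather than silently degrading when the move cannot be atomic (for
// example, across filesystems), so callers know the guarantee held.
fun promote(tempPath: String, destinationPath: String) {
    try {
        Files.move(
            Paths.get(tempPath),
            Paths.get(destinationPath),
            StandardCopyOption.ATOMIC_MOVE,
        )
    } catch (e: AtomicMoveNotSupportedException) {
        // Keeping temp files on the same volume as the destination
        // (see "Storage considerations") avoids this case.
        throw IOException("Temp and destination must be on the same volume", e)
    }
}
```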
Minimal pseudocode for download
```
function runDownload(transferId):
  t = db.getDownloadTransfer(transferId)
  meta = api.getFileMeta(t.remoteFileId)
  db.update(t, expectedSize = meta.size, expectedHash = meta.sha256, etag = meta.etag)
  ensureTempFile(t.tempPath, meta.size)

  for idx in 0 .. totalChunks(meta.size, t.chunkSize) - 1:
    if !t.isChunkDownloaded(idx):
      start = idx * t.chunkSize
      end = min(start + t.chunkSize - 1, meta.size - 1)
      resp = api.getRange(t.remoteFileId, start, end, meta.etag)
      verifyContentRange(resp, start, end, meta.size)
      writeAt(t.tempPath, start, resp.bytes)
      db.markChunkDownloaded(t, idx)

  if meta.sha256 != null:
    h = sha256File(t.tempPath)
    if h != meta.sha256:
      db.fail(t, "INTEGRITY_MISMATCH")
      deleteFile(t.tempPath)
      return

  atomicMove(t.tempPath, t.destinationPath)
  db.complete(t)
```
Handling edits, replacements, and versioning
Offline-first apps often allow users to replace an attachment (edit a photo, re-export a PDF). This introduces a subtle integrity and UX problem: you may have an in-flight upload for an older version while the user has already produced a new version.
Practical rules:
- Immutable uploads: treat each upload as immutable content. If the user changes the file, create a new UploadTransfer with a new transferId and new session. Do not reuse the old session.
- Version pointer: the domain record should reference the “current” remoteFileId. If a newer upload completes later, update the pointer.
- Cancel old transfers: if the old version is no longer needed, cancel it to save bandwidth and storage.
On download, use an ETag or version identifier. If the file changes on the server, the client should detect it (ETag mismatch) and restart the download rather than resuming into a file that now represents different bytes.
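HTTP has a built-in mechanism for exactly this: send the stored ETag in an If-Range header alongside Range. If the resource is unchanged, the server returns 206 with the requested bytes; if it changed, it returns 200 with the full new body, which the client should treat as a restart. A sketch with OkHttp:

```kotlin
import okhttp3.OkHttpClient
import okhttp3.Request

// Decide between resuming and restarting via If-Range semantics.
// 206: the ETag still matches; the body is the requested byte range.
// 200: the file changed on the server; the body is the full new content.
fun conditionalRange(client: OkHttpClient, url: String, etag: String, start: Long) {
    val request = Request.Builder()
        .url(url)
        .header("Range", "bytes=$start-")
        .header("If-Range", etag)
        .build()
    client.newCall(request).execute().use { resp ->
        when (resp.code) {
            206 -> { /* write resp.body bytes at offset `start` in the temp file */ }
            200 -> { /* version changed: reset progress, restart from byte 0 */ }
            else -> error("Unexpected status ${resp.code}")
        }
    }
}
```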
UX states and user controls that matter
File transfers are visible and can be slow. A resilient UX needs explicit states and controls:
- Queued: waiting for network or user action. Show “Waiting to upload” rather than “Failed.”
- Uploading/Downloading: show progress (bytes transferred / total). Prefer chunk-based progress so it remains stable across resumes.
- Paused: user-initiated pause, or system constraint (e.g., “Wi‑Fi only”). Make the reason visible.
- Failed: show an actionable message: “Tap to retry” plus a short cause (storage full, file missing, integrity check failed).
- Completed: for downloads, only after integrity verification and promotion to final path.
Provide user actions: pause/resume, cancel, retry. For uploads, consider “Retry from scratch” when integrity mismatches persist (it can happen if the local file changed while uploading or if the session got corrupted).
Storage considerations for large files
Avoid duplicate copies
Copying a 500 MB video multiple times can quickly exhaust device storage. Prefer:
- Store a single canonical local file for upload, referenced by transfers.
- For downloads, write directly to a temp file in the final filesystem location (same volume) so promotion is a rename, not a copy.
Detect “file missing” early
Users can delete files via system cleaners, or content URIs can expire. Before starting/resuming an upload, check that the file exists and size matches what you recorded. If not, fail with a specific error and prompt the user to re-select the file.
Disk-full behavior
Downloads can fail mid-way due to disk full. Treat this as a distinct error state and keep partial progress only if it is likely to be useful after freeing space. If the device is critically low, you may delete the temp file to recover space and restart later.
Integrity failure scenarios and how to respond
Local file changed during upload
If the file is modified while uploading (for example, the user edits it in another app), chunk hashes or final hash will mismatch. Mitigation: copy the file into app storage at selection time, or lock it if the platform supports it. If you must read from a shared location, record last-modified timestamp and size; if either changes, restart the upload with a new session.
Server assembled file differs from client
This can happen due to server bugs, incorrect chunk ordering, or incorrect range handling. The whole-file hash comparison at finalize is your safety net. If mismatch:
- Mark transfer failed with integrity error.
- Optionally query session status and re-upload missing chunks.
- If mismatch persists, restart session from scratch and log diagnostics (chunk indexes, hashes, server responses).
Download corruption or partial writes
Even with TLS, corruption can occur due to disk issues or interrupted writes. Whole-file hash verification catches this. If mismatch, delete the temp file and retry. If repeated, consider switching networks or reducing chunk size to reduce memory pressure.
Practical API and implementation tips
Use stable identifiers and idempotent chunk uploads
Make chunk upload idempotent: uploading the same chunk index again should either overwrite safely or return “already received” if the hash matches. If the hash differs for the same index, the server should reject it clearly (this indicates the client is sending different bytes for the same chunk index, usually due to file changes).
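A sketch of that server-side decision logic, with a hypothetical storage interface (per-chunk hash verification is assumed to happen before this point):

```kotlin
// Hypothetical storage interface for received chunks.
interface ChunkStore {
    fun hashOf(sessionId: String, index: Int): String? // null if not stored
    fun save(sessionId: String, index: Int, bytes: ByteArray, hash: String)
}

sealed class ChunkResult {
    object Stored : ChunkResult()
    object AlreadyReceived : ChunkResult()
    data class Conflict(val storedHash: String, val sentHash: String) : ChunkResult()
}

// Idempotent accept: re-sending identical bytes is a no-op; different
// bytes for the same index is a hard error the client must see.
fun acceptChunk(store: ChunkStore, sessionId: String, index: Int,
                bytes: ByteArray, sentHash: String): ChunkResult =
    when (val existing = store.hashOf(sessionId, index)) {
        null -> { store.save(sessionId, index, bytes, sentHash); ChunkResult.Stored }
        sentHash -> ChunkResult.AlreadyReceived
        else -> ChunkResult.Conflict(existing, sentHash)
    }
```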
Prefer streaming I/O
On mobile, avoid loading large chunks into memory if possible. Use streaming request bodies and file channels. If your HTTP client requires a byte array, keep chunk size modest and reuse buffers.
Parallelism with care
Uploading/downloading multiple chunks in parallel can improve throughput on good networks but increases memory and battery usage. A practical approach is a small concurrency level (2–4) with backpressure. Ensure your progress persistence can handle out-of-order chunk completion.
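With Kotlin coroutines, a Semaphore gives exactly this bounded concurrency; a sketch where the two callbacks are hypothetical suspend functions from your own pipeline:

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Upload missing chunks with at most `concurrency` in flight.
// `markUploaded` persists progress per chunk, so out-of-order
// completion is safe: resume only re-sends truly missing chunks.
suspend fun uploadMissing(
    missing: List<Long>,
    concurrency: Int = 3,
    uploadChunk: suspend (Long) -> Unit,
    markUploaded: suspend (Long) -> Unit,
) = coroutineScope {
    val permits = Semaphore(concurrency)
    missing.map { idx ->
        async {
            permits.withPermit {
                uploadChunk(idx)
                markUploaded(idx) // durable, per-chunk, order-independent
            }
        }
    }.awaitAll()
}
```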
Time-limited sessions
Server sessions often expire. Store expiresAt and refresh or recreate sessions when needed. If the session expires, you may need to re-upload; if the server persists received chunks beyond expiry, you can still resume by creating a new session that references the same temporary storage (implementation-specific).
Security: validate what you store
Integrity checks confirm bytes, but you should also consider safety when opening files. For example, if you download a file and then render it in a web view, treat it as untrusted content. Keep downloaded files in app-private storage when possible and only share via controlled mechanisms.
Testing strategies for resume and integrity
File transfer bugs often appear only under interruptions. Build tests and debug tools that simulate:
- Network drop mid-chunk and between chunks.
- App kill and restart during upload/download.
- Server returning 206 with incorrect Content-Range (should be detected).
- Corrupted chunk bytes (flip a bit) to ensure hash mismatch triggers retry.
- Local file modification during upload to ensure mismatch is detected and handled.
Also add observability fields in your transfer records (attempt count per chunk, last successful byte offset, last server status) so you can diagnose issues from user reports without needing full logs.
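For example, a corruption test can flip a single bit in the temp file and assert that verification fails; a sketch (sha256File is the streaming helper sketched earlier):

```kotlin
import java.io.File
import java.io.RandomAccessFile

// Flip one bit in the middle of the file to simulate disk corruption,
// then assert the integrity check catches it.
fun corruptOneBit(file: File) {
    RandomAccessFile(file, "rw").use { raf ->
        val offset = raf.length() / 2
        raf.seek(offset)
        val original = raf.read()
        raf.seek(offset)
        raf.write(original xor 0x01)
    }
}
```

After calling corruptOneBit on a completed temp file, the computed hash must no longer match expectedHash, and the transfer should end in the failed-with-integrity-error state rather than completed.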