Why Data Disappears When Containers Are Removed
Containers are designed to be disposable. When you start a container, it gets a writable “container layer” on top of the image. Any files you create or modify inside the container (for example, a database file under /var/lib or an uploaded image under /app/uploads) live in that writable layer.
If you delete the container, that writable layer is deleted with it. This is great for repeatable application code, but it is a problem for stateful data: databases, user uploads, caches you want to keep, and configuration you want to edit without rebuilding images.
To persist data beyond the lifetime of a container, Docker provides two main mounting mechanisms: volumes and bind mounts. Both map a directory (or file) into the container, but they differ in who manages the storage location and how portable the setup is.
Volumes vs Bind Mounts (What to Use and When)
Docker Volumes (Docker-managed storage)
A Docker volume is storage managed by Docker. Docker chooses where it lives on the host (typically under Docker’s data directory), and you refer to it by name. Volumes are the default choice for persistent application data because they are portable, easy to back up, and work consistently across environments.
- Best for: databases, persistent app data, shared data between containers, production deployments.
- Pros: Docker manages location and permissions more predictably; easy to move/backup; can be used by multiple containers; works well with Docker Desktop on macOS/Windows.
- Cons: Not as convenient for editing source code directly from your editor (though still possible with other workflows).
Bind Mounts (host path mapped into container)
A bind mount maps a specific path on your host machine into the container. You control the exact host directory. This is extremely useful during development when you want changes on your host (like editing code) to immediately appear inside the container.
- Best for: local development, live code editing, mounting a single config file, inspecting container output on the host.
- Pros: Direct access to host files; easy to edit with normal tools; no extra Docker-managed storage to track.
- Cons: Less portable (depends on host paths); host filesystem permissions can cause issues; performance can vary on macOS/Windows; can accidentally overwrite container files if you mount over them.
Quick decision guide
- If the data is “application state” you must not lose (database files, uploads): use a volume.
- If the data is “developer workflow” (source code, local config you want to tweak): use a bind mount.
- If you are unsure: start with a volume for persistence and add bind mounts only for development convenience.
How Mounting Works (Mental Model)
When you mount something into a container, the mount point inside the container becomes a window to the external storage (volume or host path). If the container image already has files at that path, the mount will hide them while the container is running.
Example: if an image contains /app with code, and you bind mount your host folder onto /app, the container will see your host folder instead of the image’s /app. This is powerful, but it can also be confusing if you accidentally mount over important directories.
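You can observe this shadowing directly. The commands below use the alpine image's /etc/apk directory as an arbitrary example path; the first ls shows files shipped in the image, the second shows the empty host folder mounted over them:
mkdir -p empty
docker run --rm alpine ls /etc/apk
docker run --rm -v "$(pwd)/empty":/etc/apk alpine ls /etc/apk
The second command prints nothing: the image's files still exist in the image, but they are hidden behind the mount for the life of the container.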
Practical: Persisting a Database with a Named Volume
This example uses PostgreSQL because it stores its database files on disk and demonstrates persistence clearly. The same idea applies to MySQL, MongoDB, Redis (when configured for persistence), and many other services.
Step 1: Create a named volume
docker volume create pgdata
You can list volumes to confirm it exists:
docker volume ls
Step 2: Run PostgreSQL using the volume
PostgreSQL stores data under /var/lib/postgresql/data in the official image. Mount the volume there.
docker run -d --name pg1 \
-e POSTGRES_PASSWORD=secret \
-v pgdata:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:16
At this point, PostgreSQL initializes its data directory inside the volume. That initialization persists even if the container is removed.
Step 3: Create some data
Connect using psql inside the container and create a table with a row:
docker exec -it pg1 psql -U postgres
CREATE TABLE notes(id serial PRIMARY KEY, body text NOT NULL);
INSERT INTO notes(body) VALUES ('hello from a volume');
SELECT * FROM notes;
Step 4: Remove the container (data should remain)
docker rm -f pg1
Now start a new container that uses the same volume:
docker run -d --name pg2 \
-e POSTGRES_PASSWORD=secret \
-v pgdata:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:16
Check the data again:
docker exec -it pg2 psql -U postgres -c "SELECT * FROM notes;"
You should see the row you inserted earlier. The container changed, but the data stayed because it lives in the volume.
Inspecting where the volume lives
You usually do not need to know the host path, but it can be useful for debugging. Inspect the volume:
docker volume inspect pgdata
This shows metadata including the mountpoint on the host (on Linux). On Docker Desktop, the path may be inside a VM, so direct host access can differ.
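The output looks roughly like this (the timestamp and mountpoint will differ on your system):
[
    {
        "CreatedAt": "2024-01-01T00:00:00Z",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/pgdata/_data",
        "Name": "pgdata",
        "Options": null,
        "Scope": "local"
    }
]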
Practical: Using a Bind Mount for Live Development
Bind mounts shine when you want to edit files on your host and immediately see changes in the container. A common pattern is to run a web server in a container while your code lives on the host.
Example: Serve a local static site with Nginx
Step 1: Create a local folder and HTML file
On your host:
mkdir -p site
printf '<h1>Hello from bind mount</h1>' > site/index.html
Step 2: Run Nginx with a bind mount
Nginx serves files from /usr/share/nginx/html. Bind mount your local site folder there.
docker run -d --name web1 \
-p 8080:80 \
-v "$(pwd)/site":/usr/share/nginx/html:ro \
nginx:alpine
Open http://localhost:8080 in your browser. Now edit site/index.html on your host and refresh the page; the changes appear immediately because the container reads from your host directory.
Why the :ro matters
:ro makes the mount read-only inside the container. This is a good habit when the container should not modify your source files. It reduces risk (for example, a misbehaving process overwriting files).
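You can confirm the protection by trying to write through the mount from inside the container; the attempt should fail with a “Read-only file system” error:
docker exec web1 sh -c "echo oops > /usr/share/nginx/html/index.html"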
Bind Mounting a Single File (Config Override)
Sometimes you want to override one configuration file without rebuilding an image. Bind mounting a single file is ideal for this.
Example: provide a custom Nginx config from your host:
docker run -d --name web2 \
-p 8081:80 \
-v "$(pwd)/nginx.conf":/etc/nginx/nginx.conf:ro \
nginx:alpine
Be careful: if the container expects additional included config files, overriding the main config can break startup. When overriding configs, start small and validate by checking container logs if it fails.
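If web2 fails to start, the logs usually name the directive or include that broke. You can also test a config without keeping a container around by running Nginx's built-in syntax check (nginx -t) against your mounted file:
docker logs web2
docker run --rm -v "$(pwd)/nginx.conf":/etc/nginx/nginx.conf:ro nginx:alpine nginx -t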
Anonymous Volumes vs Named Volumes
When you mount a volume without giving it a name, Docker creates an anonymous volume (a random name). Anonymous volumes still persist, but they are harder to manage because you do not have a meaningful name to reference later.
Named volumes are usually better for anything you intend to keep.
Example of an anonymous volume mount:
docker run -d --name pgtemp \
-e POSTGRES_PASSWORD=secret \
-v /var/lib/postgresql/data \
postgres:16
This creates an anonymous volume for /var/lib/postgresql/data. It persists after container removal, but you must find it via docker volume ls and docker volume inspect to identify it. Prefer explicit naming:
docker run -d --name pgtemp \
-e POSTGRES_PASSWORD=secret \
-v pgdata:/var/lib/postgresql/data \
postgres:16
Sharing Data Between Containers with a Volume
Volumes can be mounted into multiple containers at the same time. This is useful for patterns like “one container writes files, another serves them.”
Step-by-step: Producer and consumer containers
Create a volume:
docker volume create shareddata
Run a container that writes a file into the volume:
docker run -d --name writer \
-v shareddata:/data \
alpine sh -c "while true; do date >> /data/timestamps.txt; sleep 2; done"Run another container that reads the same file:
docker run -it --rm --name reader \
-v shareddata:/data \
alpine sh -c "tail -f /data/timestamps.txt"You should see timestamps streaming. This demonstrates that the volume is the shared storage, independent of any single container.
Note: concurrent writes require application-level care. Volumes do not magically solve file locking or consistency issues; they just provide shared access.
Pre-Populating a Volume (Seeding Data)
A common need is to start with default data (for example, initial SQL scripts, default uploads, or template files). There are two practical approaches:
- Copy on first run: start a container (or entrypoint script) that copies seed files into the volume if the volume is empty (see the sketch after this list).
- One-time init container: run a short-lived container that writes seed data into the volume, then stop it.
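A minimal sketch of the copy-on-first-run approach, assuming the seed files are baked into the image at /seed and the volume is mounted at /data (both paths are illustrative); this would typically run at the start of an entrypoint script:
# Seed only when the volume is empty, so existing data is never overwritten.
if [ -z "$(ls -A /data)" ]; then
  cp -a /seed/. /data/
fi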
One-time init container example
Suppose you have a local folder seed with files you want in a volume:
mkdir -p seed
printf 'seed file\n' > seed/hello.txt
Create a volume and copy the seed data into it:
docker volume create seeded
docker run --rm \
-v seeded:/data \
-v "$(pwd)/seed":/seed:ro \
alpine sh -c "cp -a /seed/. /data/"Now any container that mounts seeded at /data will see hello.txt.
Backups and Restore with Volumes (Practical Pattern)
Because volumes are Docker-managed, a common backup method is to run a temporary container that mounts the volume and a host directory, then creates an archive.
Backup a volume to a tar file
Create a backup folder on the host:
mkdir -p backups
Archive the volume contents:
docker run --rm \
-v pgdata:/data:ro \
-v "$(pwd)/backups":/backup \
alpine sh -c "cd /data && tar -czf /backup/pgdata.tar.gz ."Restore into a new volume
Create a new volume and restore the archive:
docker volume create pgdata_restored
docker run --rm \
-v pgdata_restored:/data \
-v "$(pwd)/backups":/backup:ro \
alpine sh -c "cd /data && tar -xzf /backup/pgdata.tar.gz"This pattern works for many types of data. For databases, prefer database-native backup tools for consistency (for example, pg_dump for PostgreSQL), but volume-level backups are still useful for quick snapshots in development.
Common Pitfalls and How to Avoid Them
Mounting over the wrong directory
If you mount a volume or bind mount onto a path that already contains important files in the image, those files become hidden. Symptoms include “my app can’t find its dependencies” or “the default config disappeared.”
Practical tip: mount only the directories that truly need to be externalized (for example, /var/lib/postgresql/data for Postgres data, not /var/lib/postgresql broadly).
Permissions issues (especially with bind mounts)
Bind mounts use the host filesystem permissions. If the container runs as a non-root user, it may not be able to write to the mounted directory. If it runs as root, it may create files owned by root on your host, which can be annoying.
Practical approaches:
- Use read-only mounts for code: add :ro when the container should not write.
- Ensure the host directory is writable by the user the container runs as, sometimes by adjusting ownership/permissions on the host (see the sketch after this list).
- Prefer volumes for services that write lots of data (databases), because Docker manages them more predictably.
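On Linux, one way to avoid root-owned files from bind mounts is to run the container process as your own user and group IDs (a sketch; the output directory is illustrative):
mkdir -p output
docker run --rm -u "$(id -u):$(id -g)" \
  -v "$(pwd)/output":/output \
  alpine sh -c "echo 'created as your user' > /output/file.txt"
ls -l output/file.txt
Creating the directory on the host first matters: if Docker creates the missing path for you, it will be owned by root.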
SELinux and security contexts (Linux-specific)
On some Linux systems with SELinux enabled, containers may be blocked from accessing bind-mounted paths. Docker supports options that relabel content for container access. If you see permission denied errors despite correct Unix permissions, SELinux may be the cause.
In such environments, you may need to add an SELinux label option to the mount. The exact choice depends on your system policy, so treat it as an environment-specific adjustment rather than a default step.
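On SELinux-enabled hosts, Docker accepts the z (shared among containers) and Z (private to one container) volume options, which relabel the mounted content; for example, combined with a read-only mount (container name and port are illustrative):
docker run -d --name web-selinux \
  -p 8082:80 \
  -v "$(pwd)/site":/usr/share/nginx/html:ro,z \
  nginx:alpine
Relabeling changes the files' security context on the host, so apply it only to paths dedicated to containers, never to shared system directories.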
Performance considerations on macOS/Windows
On Docker Desktop, bind mounts can be slower because file access crosses a virtualization boundary. If your development setup involves many small file reads (for example, large JavaScript dependency trees), you may notice slowness.
Practical mitigations:
- Use volumes for heavy-write directories (like dependency caches) while bind mounting only your source code (see the sketch after this list).
- Keep bind mounts narrow (mount the project folder, not your entire home directory).
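For example, a common Node.js setup bind mounts the project while keeping the dependency tree in a named volume (a sketch; the image tag and dev script are assumptions about your project):
docker run -d --name devserver \
  -w /app \
  -v "$(pwd)":/app \
  -v node_modules:/app/node_modules \
  node:20 npm run dev
The named node_modules volume mounts over the bind-mounted path, so dependency reads and writes stay inside Docker-managed storage instead of crossing the virtualization boundary.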
Choosing Mount Syntax: -v vs --mount
Docker provides two ways to specify mounts. Both work, but --mount is more explicit and less error-prone for complex cases. One concrete difference: if the host path for a bind mount does not exist, -v silently creates it as an empty directory, while --mount type=bind fails with an error.
Volume mount with -v
docker run -v pgdata:/var/lib/postgresql/data postgres:16
Volume mount with --mount
docker run --mount type=volume,source=pgdata,target=/var/lib/postgresql/data postgres:16
Bind mount with -v
docker run -v "$(pwd)/site":/usr/share/nginx/html:ro nginx:alpineBind mount with --mount
docker run --mount type=bind,source="$(pwd)/site",target=/usr/share/nginx/html,readonly nginx:alpine
Practical tip: if you are teaching yourself or working in a team, using --mount can make commands easier to read because it forces you to specify type, source, and target explicitly.
Mini Project: Persistent Notes API Data Directory
This mini project focuses on the persistence mechanism rather than building images. You will run a simple container that writes notes to a file, and you will persist that file using both a volume and a bind mount to see the difference.
Part A: Persist with a named volume
Step 1: Create a volume:
docker volume create notesdata
Step 2: Run a container that appends notes to a file in /data:
docker run -d --name notes-writer \
-v notesdata:/data \
alpine sh -c "echo 'first note' >> /data/notes.txt; sleep 3600"Step 3: Verify the file exists by reading it from a separate container:
docker run --rm \
-v notesdata:/data \
alpine sh -c "cat /data/notes.txt"Step 4: Remove the writer container and confirm the data remains:
docker rm -f notes-writer
docker run --rm \
-v notesdata:/data \
alpine sh -c "cat /data/notes.txt"You should still see first note.
Part B: Persist with a bind mount
Step 1: Create a host folder:
mkdir -p notes-host
Step 2: Run a container that writes into the bind-mounted directory:
docker run --rm \
-v "$(pwd)/notes-host":/data \
alpine sh -c "echo 'note stored on host path' >> /data/notes.txt"Step 3: Confirm the file exists on your host:
cat notes-host/notes.txt
This demonstrates the key difference: with bind mounts, the data is plainly visible and editable on the host at a known path; with volumes, Docker manages the storage location and you typically interact with it through Docker commands or helper containers.
Managing and Cleaning Up Persistent Storage Safely
Listing and inspecting volumes
docker volume ls
docker volume inspect pgdata
Removing a volume
Only remove a volume when you are sure you no longer need the data.
docker volume rm pgdata
Pruning unused volumes
Over time, you may accumulate unused volumes (especially anonymous ones). Docker can remove volumes not referenced by any container:
docker volume prune
Be careful: “unused” means “not currently attached to a container,” not “unimportant.” If you removed a container but intended to keep its volume for later, pruning could delete it.
Practical Patterns You Will Reuse
Pattern 1: Database uses a named volume, app code uses a bind mount (development)
In local development, a common approach is:
- Use a named volume for the database data directory so it persists and avoids host permission issues.
- Use a bind mount for your application source code so edits are instant.
This gives you fast iteration on code while keeping stateful data stable.
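A sketch of this pattern with docker run (your-app-image and its /app/src path are placeholders for your own application):
docker volume create devdb-data
docker run -d --name devdb \
  -e POSTGRES_PASSWORD=secret \
  -v devdb-data:/var/lib/postgresql/data \
  postgres:16
docker run -d --name devapp \
  -v "$(pwd)/src":/app/src \
  your-app-image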
Pattern 2: Read-only bind mounts for configuration
Mount configuration files or directories as read-only to reduce accidental changes from inside the container:
docker run -v "$(pwd)/config":/app/config:ro yourimagePattern 3: Use helper containers for volume operations
Because volumes are not always directly accessible (especially on Docker Desktop), using a small utility container (like Alpine) to inspect, copy, or archive volume contents is a practical and repeatable technique:
docker run --rm -it -v pgdata:/data alpine sh