
Python for Absolute Beginners: Variables, Loops, and Small Useful Scripts


Reading and Writing Files for Practical Automation

Chapter 8

Estimated reading time: 12 minutes


Why files matter in automation

Many “useful scripts” become truly practical when they can read existing information from a file and write results back out. Files let your program work with data that already exists (logs, exports, notes, configuration files) and produce outputs you can share (reports, cleaned data, renamed lists, summaries). In this chapter you will learn how to open files safely, read text in different ways, write new files, append to existing ones, and handle common issues like missing files and character encoding.

File paths: telling Python where the file is

Before reading or writing, you must identify the file using a path. A path can be relative (based on your current working folder) or absolute (the full location on your computer).

Relative paths

A relative path points to a file “near” your script. If your script is in a folder and the file is in the same folder, you can usually use just the filename.

filename = "notes.txt"

If the file is inside a subfolder, include the folder name:

filename = "data/notes.txt"

Absolute paths

An absolute path includes the full location. On Windows it might look like C:\Users\Name\Documents\file.txt. On macOS/Linux it might look like /Users/name/Documents/file.txt.


When writing Windows paths in Python strings, remember that backslashes can act like escape characters. Two common solutions are to use a raw string or forward slashes.

path1 = r"C:\Users\Name\Documents\notes.txt"  # raw string keeps backslashes literal
path2 = "C:/Users/Name/Documents/notes.txt"      # forward slashes also work in most cases

Using pathlib for cleaner paths

Python’s pathlib module helps you build paths in a way that works across operating systems. This is especially useful for automation scripts you might run on different machines.

from pathlib import Path

base = Path("data")
file_path = base / "notes.txt"  # joins paths safely
print(file_path)                 # data/notes.txt (or data\notes.txt on Windows)

Opening files with a "with" block: the safest pattern

The most important habit for file work is to open files using a context manager (with). It automatically closes the file even if an error happens, which prevents corrupted output and “file is in use” issues.

with open("notes.txt", "r", encoding="utf-8") as f:
    text = f.read()

The open() function takes:

  • file: path or filename
  • mode: how you want to use the file (read, write, append, etc.)
  • encoding: how text is decoded/encoded (use utf-8 in most modern cases)

Common modes

  • "r": read (file must exist)
  • "w": write (creates file; overwrites if it exists)
  • "a": append (creates file; adds to end if it exists)
  • "x": create (fails if file already exists)
  • "rb", "wb": binary read/write (for images, PDFs, etc.)
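To see the difference between "w" and "a" in practice, here is a small sketch; the file name demo.txt is hypothetical and used only for this illustration.

```python
# "w" starts the file fresh; "a" keeps what is already there.
with open("demo.txt", "w", encoding="utf-8") as f:
    f.write("first\n")   # overwrites any previous content

with open("demo.txt", "a", encoding="utf-8") as f:
    f.write("second\n")  # appended to the end

with open("demo.txt", "r", encoding="utf-8") as f:
    print(f.read())      # both lines survive
```

Run the script twice and the file still contains only "first" and "second": the "w" open wipes the previous run's output before "a" adds to it.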

Reading text files: three practical approaches

Text files are common in automation: exported CSV-like text, logs, lists of names, configuration snippets, and notes. Python gives you multiple ways to read them depending on your goal.

1) Read the entire file at once

Use read() when the file is small enough to fit comfortably in memory and you want the whole content as one string.

with open("notes.txt", "r", encoding="utf-8") as f:
    content = f.read()

print(content)

Automation example: quickly search for a keyword in a small report.

keyword = "ERROR"
with open("app.log", "r", encoding="utf-8") as f:
    content = f.read()

if keyword in content:
    print("Found errors in the log")

2) Read line by line (recommended for large files)

When files can be large (logs, exports), read them line by line. This is memory-friendly and fits many automation tasks such as filtering, counting, or extracting matching lines.

with open("app.log", "r", encoding="utf-8") as f:
    for line in f:
        if "ERROR" in line:
            print(line.strip())

strip() removes leading and trailing whitespace, including the trailing newline, so printed lines look clean.

3) Read all lines into a list

readlines() returns a list of lines. This can be convenient if you need random access or want to process lines multiple times, but it loads everything into memory.

with open("names.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()

print("Total lines:", len(lines))

Writing text files: overwrite vs append

Writing is where automation becomes a time-saver: you can generate cleaned versions of files, create summaries, or produce new files for other tools.

Writing a new file (or overwriting an existing one)

Mode "w" creates the file if it does not exist, and overwrites it if it does. Use this when you want a fresh output each run.

report = "Daily summary\n- Items processed: 120\n- Errors: 0\n"

with open("summary.txt", "w", encoding="utf-8") as f:
    f.write(report)

write() returns the number of characters written, but you usually do not need it.
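That return value can be checked with a tiny sketch (the file name demo_summary.txt is illustrative):

```python
# write() returns the number of characters written (not bytes),
# which can serve as a quick sanity check.
with open("demo_summary.txt", "w", encoding="utf-8") as f:
    chars = f.write("Daily summary\n")

print(chars)  # 14 characters, including the newline
```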

Appending to a file

Mode "a" adds new content to the end. This is useful for logs your script maintains, or for building a running history.

from datetime import datetime

line = f"{datetime.now().isoformat()} - Script ran successfully\n"

with open("automation.log", "a", encoding="utf-8") as f:
    f.write(line)

Writing multiple lines

If you have several lines to write, you can write them one by one or use writelines(). Remember that writelines() does not add newline characters automatically; include "\n" yourself.

lines = ["apple\n", "banana\n", "cherry\n"]

with open("fruits.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)

Step-by-step automation: clean a messy text list

A common beginner-friendly automation task is cleaning a text file that contains one item per line, but with extra spaces, blank lines, or duplicates. The goal: read an input file, clean it, and write a new file.

Example input file

Imagine raw_emails.txt contains:

 alice@example.com

BOB@example.com 
 bob@example.com
carol@example.com

Goal

  • Remove blank lines
  • Trim spaces
  • Normalize case (lowercase)
  • Remove duplicates
  • Write cleaned results to clean_emails.txt

Step 1: read and clean line by line

input_path = "raw_emails.txt"
output_path = "clean_emails.txt"

seen = set()
cleaned = []

with open(input_path, "r", encoding="utf-8") as f:
    for line in f:
        email = line.strip().lower()
        if not email:
            continue
        if email in seen:
            continue
        seen.add(email)
        cleaned.append(email)

Step 2: write the cleaned list

with open(output_path, "w", encoding="utf-8") as f:
    for email in cleaned:
        f.write(email + "\n")

Step 3: add a small status message

It is helpful for automation scripts to report what they did.

print(f"Wrote {len(cleaned)} unique emails to {output_path}")

This pattern (read → transform → write) is the backbone of many practical file-based automations.

Step-by-step automation: filter a log file into a smaller report

Another common task is extracting only the lines you care about from a large log file. For example, you may want to keep only warnings and errors and write them to a separate file for quick review.

Goal

  • Read app.log
  • Keep lines containing WARNING or ERROR
  • Write them to issues.log

Implementation

input_path = "app.log"
output_path = "issues.log"

keywords = ("WARNING", "ERROR")
count = 0

with open(input_path, "r", encoding="utf-8") as fin, open(output_path, "w", encoding="utf-8") as fout:
    for line in fin:
        if any(k in line for k in keywords):
            fout.write(line)
            count += 1

print(f"Extracted {count} issue lines to {output_path}")

Notice that you can open two files in one with statement. This is a clean way to build “pipeline” scripts that transform one file into another.

Encoding and Unicode: avoiding weird characters

Text files are stored as bytes, and an encoding tells Python how to interpret those bytes as characters. Most modern text files are UTF-8, but you may encounter files saved in other encodings, especially on Windows or from older systems.

Best practice: specify encoding explicitly

When reading and writing text, include encoding="utf-8" unless you have a reason not to. This makes your script more predictable across machines.

with open("data.txt", "r", encoding="utf-8") as f:
    content = f.read()

What if you get a UnicodeDecodeError?

If reading fails with a decoding error, the file may not be UTF-8. You can try a different encoding (common ones include "utf-8-sig" for UTF-8 with a BOM, or "cp1252" for some Windows text). Another approach is to use error handling to replace problematic characters, which can be acceptable for logs and rough text processing.

with open("legacy.txt", "r", encoding="utf-8", errors="replace") as f:
    content = f.read()

Using errors="replace" prevents crashes, but it may change characters. For critical data, you should identify the correct encoding instead of replacing.
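One way to find the correct encoding is to try a short list of likely candidates until one decodes without error. This is a minimal sketch: the file name legacy.txt is hypothetical, and a sample cp1252 file is written first so the example is self-contained.

```python
# Simulate a legacy export saved in cp1252 (Windows) encoding.
with open("legacy.txt", "w", encoding="cp1252") as f:
    f.write("café crème\n")

# Try likely encodings in order; accented bytes like 0xE9 make
# the UTF-8 attempts fail, so cp1252 is the one that succeeds.
candidates = ["utf-8", "utf-8-sig", "cp1252"]
content = None
used = None
for enc in candidates:
    try:
        with open("legacy.txt", "r", encoding=enc) as f:
            content = f.read()
        used = enc
        break
    except UnicodeDecodeError:
        continue

print("Decoded with:", used)
```

This only confirms that decoding did not raise an error; for critical data you should still verify that the decoded text actually looks right.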

Handling missing files and other common errors

Automation scripts should fail gracefully. Two common issues are missing input files and permission problems (trying to write somewhere you do not have access, or trying to overwrite a file that is open in another program).

Check existence with pathlib

from pathlib import Path

path = Path("data/input.txt")
if not path.exists():
    print("Input file not found:", path)
else:
    with path.open("r", encoding="utf-8") as f:
        print(f.read())

Use try/except around file operations

When your script is meant to run unattended, catching errors and printing a helpful message is better than a confusing traceback.

try:
    with open("input.txt", "r", encoding="utf-8") as f:
        data = f.read()
except FileNotFoundError:
    print("Could not find input.txt. Put it in the same folder as the script.")
except PermissionError:
    print("Permission error: close the file if it is open, or choose a different folder.")

Creating folders for outputs

When writing output files, you often want them in a dedicated folder like output/. If the folder does not exist, writing will fail. Create it first.

from pathlib import Path

out_dir = Path("output")
out_dir.mkdir(parents=True, exist_ok=True)

out_file = out_dir / "report.txt"
with out_file.open("w", encoding="utf-8") as f:
    f.write("Report generated\n")

exist_ok=True means “do not error if the folder already exists.”

Working with CSV-like text (without special libraries)

Many automation tasks involve simple “comma-separated” files. While Python has a dedicated csv module (and it is recommended for real CSV), you can still do basic tasks by splitting lines. This is acceptable only when your data is simple and does not contain commas inside quoted fields.

Example: sum values from a simple file

Suppose sales.txt contains:

2026-01-01,12.50
2026-01-02,9.99
2026-01-03,20.00

You can parse it like this:

total = 0.0

with open("sales.txt", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        date_str, amount_str = line.split(",")
        total += float(amount_str)

with open("sales_total.txt", "w", encoding="utf-8") as f:
    f.write(f"Total sales: {total:.2f}\n")

If you later encounter more complex CSV files, switch to the csv module to handle quoting and edge cases correctly.
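As a sketch of that switch, here is the same total computed with the standard-library csv module, which handles quoted fields and embedded commas correctly. The sample data is written first so the example is self-contained; the file name sales.csv is an assumption.

```python
import csv

# Write hypothetical sample data (same rows as the sales.txt example).
with open("sales.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["2026-01-01", "12.50"])
    writer.writerow(["2026-01-02", "9.99"])
    writer.writerow(["2026-01-03", "20.00"])

# csv.reader splits each row into fields, respecting quoting rules.
total = 0.0
with open("sales.csv", "r", encoding="utf-8", newline="") as f:
    for row in csv.reader(f):
        if not row:
            continue
        date_str, amount_str = row
        total += float(amount_str)

print(f"Total sales: {total:.2f}")
```

Note newline="" in both open() calls: the csv module documentation recommends it so the module controls line endings itself.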

Reading and writing binary files (when text is not enough)

Not all files are text. Images, PDFs, and many document formats are binary. If you open them in text mode, Python will try to decode bytes into characters and fail or corrupt the data. Use binary modes "rb" and "wb".

Example: copy a file

This is a simple but useful automation building block: copying a file without importing extra modules.

source = "photo.jpg"
dest = "backup_photo.jpg"

with open(source, "rb") as fin, open(dest, "wb") as fout:
    chunk_size = 1024 * 1024  # 1 MB
    while True:
        chunk = fin.read(chunk_size)
        if not chunk:
            break
        fout.write(chunk)

Reading in chunks is important for large files so you do not load the entire file into memory.

Practical patterns for file-based automation

Pattern 1: transform one file into another

This is the most common pattern: read input, process each line, write output. Keep it streaming (line by line) when possible.

with open("input.txt", "r", encoding="utf-8") as fin, open("output.txt", "w", encoding="utf-8") as fout:
    for line in fin:
        fout.write(line.upper())

Pattern 2: build a timestamped output filename

Automation often produces outputs you want to keep. A timestamp avoids overwriting previous results.

from datetime import datetime

ts = datetime.now().strftime("%Y%m%d_%H%M%S")
output_name = f"report_{ts}.txt"

with open(output_name, "w", encoding="utf-8") as f:
    f.write("Generated report\n")

Pattern 3: safe creation with x mode

If overwriting would be dangerous, use "x" to ensure you do not accidentally replace an existing file.

try:
    with open("important_output.txt", "x", encoding="utf-8") as f:
        f.write("First run output\n")
except FileExistsError:
    print("important_output.txt already exists; not overwriting.")

Pattern 4: process many files in a folder

Real automation often means “do the same thing to every file in a folder.” pathlib makes this straightforward.

from pathlib import Path

input_dir = Path("incoming")
output_dir = Path("processed")
output_dir.mkdir(exist_ok=True)

for path in input_dir.glob("*.txt"):
    out_path = output_dir / path.name
    with path.open("r", encoding="utf-8") as fin, out_path.open("w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(line.strip() + "\n")

This example trims whitespace on every line for every .txt file in incoming/ and writes cleaned versions to processed/.

Checklist: avoiding common beginner mistakes

  • Forgetting with: can leave files open and cause locked-file issues.
  • Using "w" when you meant "a": "w" overwrites the file.
  • Not adding newlines: write() does not automatically add "\n".
  • Not specifying encoding: may work on your machine but fail on another.
  • Reading huge files with read(): prefer line-by-line iteration for large inputs.
  • Using naive splitting for complex CSV: fine for simple data, but switch to the csv module when fields can contain commas or quotes.
  • Hardcoding fragile paths: prefer pathlib and keep inputs/outputs in predictable folders.

Now answer the exercise about the content:

You need to extract only lines containing WARNING or ERROR from a very large log file and save them into a new file. Which approach best matches a memory-friendly and safe solution?


Iterating line by line keeps memory usage low for large files, and using with ensures files are closed safely while writing only the lines that match the keywords.

Next chapter

Handling Errors with Try/Except and Defensive Checks
