Why a Clean Project Structure Matters
Notebook prototypes are excellent for exploring ideas, but they often mix concerns: data loading, preprocessing, model code, training loops, evaluation, and ad-hoc plotting all in one place. This makes it hard to reproduce results, rerun experiments consistently, or reuse the same code for training and serving.
A maintainable TensorFlow codebase separates responsibilities into small modules, keeps configuration in one place, and provides a repeatable entry point (a command) that runs the same code paths in development, training, and serving.
A Repeatable Folder Layout
The goal is to make it obvious where things live and to keep notebooks as thin “drivers” rather than the source of truth.
Recommended structure
```
my_tf_project/
├─ data/                  # local datasets, cached files (often gitignored)
│  ├─ raw/
│  └─ processed/
├─ notebooks/             # exploration and prototypes
│  └─ 01_prototype.ipynb
├─ src/                   # importable Python package code
│  ├─ __init__.py
│  ├─ config.py           # config schema + loader
│  ├─ reproducibility.py  # seeding + determinism helpers
│  ├─ data_pipeline.py    # reading + preprocessing + dataset builders
│  ├─ model_def.py        # model factory
│  ├─ train.py            # training routine (calls model + data)
│  ├─ inference.py        # inference helpers (calls model + preprocessing)
│  └─ serving_app.py      # minimal service wrapper (optional)
├─ configs/               # experiment configs (YAML/JSON)
│  ├─ base.yaml
│  └─ exp_small.yaml
├─ models/                # saved models, checkpoints (often gitignored)
│  └─ run_2026_01_16/
├─ scripts/               # command-line entry scripts
│  ├─ train_model.py
│  └─ serve_model.py
├─ tests/                 # unit/integration tests (optional but recommended)
├─ pyproject.toml or setup.cfg
└─ README.md
```
- data/ is for local artifacts; keep it out of version control unless small and necessary.
- notebooks/ is for exploration; notebooks should import from src/ rather than define core logic.
- src/ contains the reusable code: the only place where “real” logic should live.
- configs/ holds experiment settings so you can rerun the same run later.
- models/ stores outputs (saved models, logs, checkpoints) organized by run.
- scripts/ are thin CLI wrappers that call into src/.
Refactoring a Notebook into Modules
Take a typical prototype notebook and split it into four modules: (1) data pipeline, (2) model definition, (3) training routine, (4) inference. The notebook becomes a short file that selects a config and calls the training function.
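Once those modules exist, the notebook shrinks to a thin driver along these lines (a sketch: the synthetic arrays are placeholders, and the notebook kernel is assumed to run from the project root with src/ importable):

```python
# notebooks/01_prototype.ipynb — illustrative driver cell
import numpy as np

from src.config import load_config
from src.train import train_and_save

cfg = load_config("configs/base.yaml")

# Placeholder data; swap in the project's real loader.
x_train = (np.random.rand(200, 64, 64, 3) * 255).astype("uint8")
y_train = np.random.randint(0, cfg.num_classes, size=(200,), dtype="int32")

export_path = train_and_save(cfg, x_train, y_train)
print("Exported to:", export_path)
```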
Step 1: Create a single configuration object
Centralize all “knobs” (paths, hyperparameters, batch sizes, run names) into one config object. This prevents hidden state from being scattered across notebook cells.
```python
# src/config.py
from dataclasses import dataclass
from pathlib import Path
import json

try:
    import yaml
except Exception:
    yaml = None


@dataclass
class Config:
    project_root: str = "."
    data_dir: str = "data"
    model_dir: str = "models"
    run_name: str = "run_default"
    seed: int = 42
    epochs: int = 3
    batch_size: int = 32
    learning_rate: float = 1e-3
    image_size: int = 224
    num_classes: int = 10
    train_split: float = 0.9

    @property
    def run_dir(self) -> Path:
        return Path(self.project_root) / self.model_dir / self.run_name

    @property
    def processed_dir(self) -> Path:
        return Path(self.project_root) / self.data_dir / "processed"


def load_config(path: str) -> Config:
    p = Path(path)
    if p.suffix in [".yaml", ".yml"]:
        if yaml is None:
            raise RuntimeError("PyYAML not installed")
        with open(p, "r", encoding="utf-8") as f:
            d = yaml.safe_load(f)
    elif p.suffix == ".json":
        with open(p, "r", encoding="utf-8") as f:
            d = json.load(f)
    else:
        raise ValueError("Config must be .yaml/.yml or .json")
    return Config(**d)
```
Example config file:
```yaml
# configs/base.yaml
project_root: "."
data_dir: "data"
model_dir: "models"
run_name: "run_001"
seed: 123
epochs: 5
batch_size: 64
learning_rate: 0.001
image_size: 224
num_classes: 10
train_split: 0.9
```
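Loading and inspecting the config is then a one-liner; the snippet below is illustrative and assumes it runs from the project root:

```python
from src.config import load_config

cfg = load_config("configs/base.yaml")
print(cfg.run_name, cfg.batch_size)  # run_001 64
print(cfg.run_dir)                   # models/run_001
```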
Step 2: Add reproducibility helpers
Make reproducibility a first-class feature by setting seeds and (optionally) enabling deterministic ops in one place. Call this at the start of training and any integration run.
```python
# src/reproducibility.py
import os
import random

import numpy as np
import tensorflow as tf


def set_reproducible(seed: int, deterministic: bool = True) -> None:
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)
    if deterministic:
        try:
            tf.config.experimental.enable_op_determinism()
        except Exception:
            pass
```
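A quick way to convince yourself the seeding works (illustrative, not part of the module): reseeding with the same value should reproduce the same random draws.

```python
import tensorflow as tf

from src.reproducibility import set_reproducible

set_reproducible(42)
a = tf.random.uniform([3])
set_reproducible(42)
b = tf.random.uniform([3])
assert bool(tf.reduce_all(tf.equal(a, b)))  # identical draws after reseeding
```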
Step 3: Move data logic into a data pipeline module
The module should expose a small API: “build train/val datasets” and “preprocess a single example for inference.” Keep preprocessing consistent between training and serving by reusing the same functions.
```python
# src/data_pipeline.py
import tensorflow as tf


def preprocess_image(x, image_size: int):
    x = tf.image.resize(x, [image_size, image_size])
    x = tf.cast(x, tf.float32) / 255.0
    return x


def build_datasets(cfg, x_train, y_train):
    ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    # Shuffle once with a fixed order so the take/skip split below is stable;
    # reshuffling this dataset each iteration would leak examples between splits.
    ds = ds.shuffle(buffer_size=len(x_train), seed=cfg.seed, reshuffle_each_iteration=False)
    n_train = int(len(x_train) * cfg.train_split)
    ds_train = ds.take(n_train)
    ds_val = ds.skip(n_train)
    # Re-shuffle only the training split between epochs.
    ds_train = ds_train.shuffle(buffer_size=n_train, seed=cfg.seed, reshuffle_each_iteration=True)

    def _map(x, y):
        return preprocess_image(x, cfg.image_size), tf.cast(y, tf.int32)

    ds_train = ds_train.map(_map, num_parallel_calls=tf.data.AUTOTUNE)
    ds_val = ds_val.map(_map, num_parallel_calls=tf.data.AUTOTUNE)
    ds_train = ds_train.batch(cfg.batch_size).prefetch(tf.data.AUTOTUNE)
    ds_val = ds_val.batch(cfg.batch_size).prefetch(tf.data.AUTOTUNE)
    return ds_train, ds_val
```
Note: the example uses in-memory arrays (x_train, y_train) to keep the refactor focused on structure. In a real project, this module would also handle reading files, parsing records, and caching.
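This is also a natural place to add tests. A minimal sketch for the optional tests/ folder is shown below; the file name and the use of pytest are assumptions, not part of the original project:

```python
# tests/test_data_pipeline.py (hypothetical)
import numpy as np
import tensorflow as tf

from src.config import Config
from src.data_pipeline import build_datasets, preprocess_image


def test_preprocess_image_shape_and_range():
    raw = tf.constant(np.random.randint(0, 256, size=(300, 200, 3), dtype=np.uint8))
    out = preprocess_image(raw, image_size=224)
    assert out.shape == (224, 224, 3)
    assert out.dtype == tf.float32
    assert float(tf.reduce_max(out)) <= 1.0


def test_build_datasets_batch_shapes():
    cfg = Config(batch_size=8, image_size=64, train_split=0.8)
    x = (np.random.rand(50, 96, 96, 3) * 255).astype("uint8")
    y = np.random.randint(0, cfg.num_classes, size=(50,), dtype="int32")
    ds_train, ds_val = build_datasets(cfg, x, y)
    xb, yb = next(iter(ds_train))
    assert xb.shape[1:] == (64, 64, 3)
    assert yb.dtype == tf.int32
```
Run with pytest from the project root.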
Step 4: Move model creation into a model factory
Expose a single function that builds and compiles the model from config. This makes it easy to swap architectures without touching training code.
```python
# src/model_def.py
import tensorflow as tf


def build_model(cfg):
    inputs = tf.keras.Input(shape=(cfg.image_size, cfg.image_size, 3))
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPool2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(cfg.num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=cfg.learning_rate),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=["accuracy"],
    )
    return model
```
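A quick, illustrative sanity check of the factory (not part of the module): build a model from a config and confirm the output layer matches num_classes.

```python
from src.config import Config
from src.model_def import build_model

model = build_model(Config(image_size=64, num_classes=10))
assert model.output_shape == (None, 10)  # batch dimension is None
model.summary()
```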
Step 5: Create a training routine that orchestrates everything
This module should set reproducibility, build the datasets, build the model, train, and save outputs into a run directory. Keep it callable from both scripts and notebooks.
```python
# src/train.py
import json

import tensorflow as tf

from .reproducibility import set_reproducible
from .data_pipeline import build_datasets
from .model_def import build_model


def train_and_save(cfg, x_train, y_train):
    set_reproducible(cfg.seed, deterministic=True)
    run_dir = cfg.run_dir
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "artifacts").mkdir(exist_ok=True)
    (run_dir / "checkpoints").mkdir(exist_ok=True)

    ds_train, ds_val = build_datasets(cfg, x_train, y_train)
    model = build_model(cfg)

    callbacks = [
        tf.keras.callbacks.ModelCheckpoint(
            filepath=str(run_dir / "checkpoints" / "ckpt"),
            save_weights_only=True,
            save_best_only=True,
            monitor="val_accuracy",
            mode="max",
        )
    ]

    history = model.fit(ds_train, validation_data=ds_val, epochs=cfg.epochs, callbacks=callbacks)

    export_path = run_dir / "saved_model"
    model.save(str(export_path))

    with open(run_dir / "config_used.json", "w", encoding="utf-8") as f:
        json.dump(cfg.__dict__, f, indent=2)
    with open(run_dir / "history.json", "w", encoding="utf-8") as f:
        json.dump(history.history, f, indent=2)
    return str(export_path)
```
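One compatibility note: saving to a bare directory with model.save() and using an extension-less ModelCheckpoint path rely on Keras 2 behavior (TensorFlow 2.15 and earlier). If you are on Keras 3 (TensorFlow 2.16+), a sketch of the equivalent calls would be:

```python
# Keras 3 variant (sketch) — use only if the calls above raise file-format errors.
export_path = run_dir / "model.keras"          # whole model in a single file
model.save(export_path)

ckpt_cb = tf.keras.callbacks.ModelCheckpoint(  # weights-only checkpoints need .weights.h5
    filepath=str(run_dir / "checkpoints" / "best.weights.h5"),
    save_weights_only=True,
    save_best_only=True,
    monitor="val_accuracy",
    mode="max",
)
```
If you adopt this, point load_export() in src/inference.py at the .keras file instead of the saved_model directory.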
Step 6: Create an inference module that reuses preprocessing
Inference should load the exported model and apply the same preprocessing function used during training. This avoids “training-serving skew” caused by different preprocessing code paths.
```python
# src/inference.py
import tensorflow as tf

from .data_pipeline import preprocess_image


def load_export(export_path: str):
    return tf.keras.models.load_model(export_path)


def predict_one(model, image_tensor, image_size: int):
    x = preprocess_image(image_tensor, image_size)
    x = tf.expand_dims(x, axis=0)
    probs = model(x, training=False)[0]
    pred = tf.argmax(probs).numpy().item()
    return pred, probs.numpy()
```
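Illustrative usage (the export path is an assumption matching the integration run below, and image_size should match the training config):

```python
import numpy as np
import tensorflow as tf

from src.inference import load_export, predict_one

model = load_export("models/run_001/saved_model")
img = tf.convert_to_tensor((np.random.rand(300, 300, 3) * 255).astype("uint8"))
pred, probs = predict_one(model, img, image_size=224)
print(pred, probs.shape)  # a class index and a (num_classes,) probability vector
```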
Configuration-Driven Runs and a Simple CLI Entry Pattern
Keep scripts thin: parse arguments, load config, call into src/. This makes the code testable and avoids “script logic” that can’t be imported.
Training script
```python
# scripts/train_model.py
import argparse

import numpy as np

from src.config import load_config
from src.train import train_and_save


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True)
    args = parser.parse_args()

    cfg = load_config(args.config)

    x_train = (np.random.rand(1000, 256, 256, 3) * 255).astype("uint8")
    y_train = np.random.randint(0, cfg.num_classes, size=(1000,), dtype="int32")

    export_path = train_and_save(cfg, x_train, y_train)
    print(export_path)


if __name__ == "__main__":
    main()
```
This example uses synthetic data so the structure is runnable immediately. Replace the synthetic block with real dataset loading (ideally implemented inside src/data_pipeline.py or a dedicated loader module).
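One possible real-data replacement for the synthetic block, sketched with the built-in CIFAR-10 dataset (the load_training_data helper is hypothetical and would ideally live in src/data_pipeline.py or a dedicated loader module):

```python
import tensorflow as tf


def load_training_data():
    # CIFAR-10: 50,000 uint8 images of shape (32, 32, 3) with integer labels.
    (x_train, y_train), _ = tf.keras.datasets.cifar10.load_data()
    return x_train, y_train.reshape(-1).astype("int32")

# In main(): x_train, y_train = load_training_data()
```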
Serving script (minimal local HTTP)
This script demonstrates a small “serve” entry point that uses the same inference code path. It is intentionally minimal: it loads the exported model and exposes a single endpoint that accepts an image as raw bytes.
```python
# scripts/serve_model.py
import argparse
import io

import numpy as np
from PIL import Image
from flask import Flask, request, jsonify
import tensorflow as tf

from src.config import load_config
from src.inference import load_export, predict_one

app = Flask(__name__)
MODEL = None
CFG = None


@app.post("/predict")
def predict():
    if "file" not in request.files:
        return jsonify({"error": "missing file"}), 400
    f = request.files["file"]
    img = Image.open(io.BytesIO(f.read())).convert("RGB")
    arr = np.array(img).astype("uint8")
    pred, probs = predict_one(MODEL, tf.convert_to_tensor(arr), CFG.image_size)
    return jsonify({"pred_class": int(pred), "probs": probs.tolist()})


def main():
    global MODEL, CFG
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True)
    parser.add_argument("--export_path", required=True)
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8000)
    args = parser.parse_args()

    CFG = load_config(args.config)
    MODEL = load_export(args.export_path)
    app.run(host=args.host, port=args.port)


if __name__ == "__main__":
    main()
```
Small Integration Run: Train, Save, and Serve Using the Same Code Paths
This integration run verifies the project structure end-to-end: the training script produces an exported model, and the serving script loads that export and runs inference using the shared preprocessing and inference utilities.
Step-by-step integration
- Create a config file at configs/base.yaml (or reuse the example above).
- Run training from the project root:

  ```bash
  python scripts/train_model.py --config configs/base.yaml
  ```
- Confirm that a run folder exists, for example: models/run_001/saved_model/
- Start the server using the exported model path:

  ```bash
  python scripts/serve_model.py --config configs/base.yaml --export_path models/run_001/saved_model
  ```
- Send a request with an image file (from another terminal):

  ```bash
  curl -X POST -F "file=@path/to/image.jpg" http://127.0.0.1:8000/predict
  ```

Because both training and serving call into src/data_pipeline.py and src/inference.py, you can change preprocessing or model settings in one place (the config and modules) and keep behavior consistent across experimentation, training runs, and serving.
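If no image file is handy, a throwaway test image can be generated first (illustrative) and passed to the curl command as file=@test_image.jpg:

```python
import numpy as np
from PIL import Image

Image.fromarray((np.random.rand(224, 224, 3) * 255).astype("uint8")).save("test_image.jpg")
```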