TensorFlow for Beginners: Building and Serving Your First Models

Saving and Loading TensorFlow Models as Reusable Artifacts

Chapter 7

What It Means to “Save a Model”

When you save a TensorFlow/Keras model, you are turning the result of training into a reusable artifact that can be moved to another machine, checked into version control (or stored in object storage), and loaded later for inference or continued training. Depending on the format and what you choose to save, the artifact may include:

  • Architecture: the network structure (layers and connections).
  • Weights: learned parameters.
  • Optimizer state: momentum/Adam moments and other internal variables needed to resume training seamlessly.
  • Training configuration: loss, metrics, and optimizer configuration (when using Keras compile).
  • Extra assets: vocabularies, lookup tables, or other files referenced by the model.

SavedModel vs. H5: Which Should You Use?

SavedModel (recommended default)

  • Directory-based format (a folder containing protobuf + variables + assets).
  • Best compatibility with TensorFlow Serving, TFLite conversion workflows, and production tooling.
  • Supports more features (signatures, assets, trackable objects).
  • Typical use: production deployment and long-term storage.
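
To make the "directory plus signatures" idea concrete, here is a minimal sketch that exports a tiny module with the low-level tf.saved_model API and calls its serving signature after reloading. The Doubler class, its serve method, and the temporary export path are hypothetical stand-ins, not part of the course model.

```python
import tempfile
from pathlib import Path

import tensorflow as tf


class Doubler(tf.Module):
    """Hypothetical stand-in for a trained model: multiplies its input by 2."""

    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32, name="x")])
    def serve(self, x):
        return 2.0 * x


module = Doubler()
export_dir = Path(tempfile.mkdtemp()) / "doubler" / "v1"

# Passing a single function registers it under the "serving_default" key.
tf.saved_model.save(module, str(export_dir), signatures=module.serve)

# The export is a directory, not a single file.
top_level = sorted(p.name for p in export_dir.iterdir())

# Reload without the Python class and call the exported signature by keyword.
restored = tf.saved_model.load(str(export_dir))
serving_fn = restored.signatures["serving_default"]
outputs = serving_fn(x=tf.ones([1, 4]))  # dict of named output tensors
first_output = list(outputs.values())[0].numpy()
```

Because the signature travels with the artifact, a serving system can call the model without importing the code that defined it.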

H5 / HDF5 (.h5)

  • Single-file format, convenient for quick sharing.
  • Works well for many standard Keras models.
  • Can be more fragile with custom layers/losses unless you implement serialization correctly and provide custom objects on load.
  • Typical use: experiments, simple portability, legacy workflows.

In modern TensorFlow/Keras, prefer SavedModel for production and interoperability. Use .h5 when you specifically want a single file and your model is simple and fully serializable.

Saving the Full Model (Architecture + Weights + Optimizer State)

Saving the “full model” is the most convenient option when you want to reload and immediately run inference, and optionally continue training with the same optimizer state.

Step-by-step: Save as SavedModel

import tensorflow as tf
from pathlib import Path

# Assume `model` is already trained and compiled.
export_dir = Path("artifacts") / "my_model" / "savedmodel" / "v1"
export_dir.parent.mkdir(parents=True, exist_ok=True)

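# This saves architecture, weights, and (when compiled) the training config.
# Optimizer state is saved when the model has been compiled and trained.
# Note: this call writes a SavedModel directory on TF 2.x with Keras 2.
# With Keras 3 (the default from TF 2.16), model.save() expects a .keras or
# .h5 path; use model.export(export_dir) there for an inference-only SavedModel.
model.save(export_dir)  # SavedModel directory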

This creates a directory like:

  • saved_model.pb (graph and metadata)
  • variables/ (weights and optimizer variables)
  • assets/ (optional extra files)
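
A quick way to sanity-check an export before shipping it is to confirm those entries exist. The sketch below uses only the standard library; the helper names describe_savedmodel_dir and looks_like_savedmodel are hypothetical, and the demo runs against a fake directory rather than a real export.

```python
import tempfile
from pathlib import Path


def describe_savedmodel_dir(export_dir):
    """Report which of the expected SavedModel entries exist in export_dir."""
    export_dir = Path(export_dir)
    return {
        "saved_model.pb": (export_dir / "saved_model.pb").is_file(),
        "variables/": (export_dir / "variables").is_dir(),
        # assets/ is optional, so its absence is not an error.
        "assets/": (export_dir / "assets").is_dir(),
    }


def looks_like_savedmodel(export_dir):
    """True when the two required entries are present."""
    report = describe_savedmodel_dir(export_dir)
    return report["saved_model.pb"] and report["variables/"]


# Demo on a fake export so the snippet runs without a trained model.
demo_dir = Path(tempfile.mkdtemp()) / "v1"
(demo_dir / "variables").mkdir(parents=True)
(demo_dir / "saved_model.pb").write_bytes(b"")

demo_report = describe_savedmodel_dir(demo_dir)
demo_ok = looks_like_savedmodel(demo_dir)
```

This only checks the layout; it does not prove the files are loadable.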

Step-by-step: Save as H5

h5_path = Path("artifacts") / "my_model" / "h5" / "v1" / "model.h5"
h5_path.parent.mkdir(parents=True, exist_ok=True)

model.save(h5_path)  # HDF5 single file

H5 can store the model configuration and weights, and often the optimizer state as well, but custom components require extra care (covered below).

Saving Weights Only (When You Don’t Need the Full Artifact)

Saving weights only is useful when:

  • You want to keep the code-defined architecture as the source of truth.
  • You are iterating on architecture code but want to checkpoint weights.
  • You want smaller artifacts and you control the model-building code at load time.

Step-by-step: Save weights

weights_path = Path("artifacts") / "my_model" / "weights" / "v1" / "ckpt"
weights_path.parent.mkdir(parents=True, exist_ok=True)

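# On TF 2.x with Keras 2, a suffix-free prefix such as "ckpt" writes
# TensorFlow checkpoint files (ckpt.index, ckpt.data-...).
# Keras 3 requires a path ending in ".weights.h5" instead.
model.save_weights(weights_path)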

To load weights, you must recreate the model with the exact same architecture (and typically build it by calling it once or using build), then load:

# Recreate the model in code (same layers and shapes)
model2 = make_model()  # you define this

# Ensure variables exist (one way is to run a dummy forward pass)
_ = model2(tf.zeros([1, 32]))

model2.load_weights(weights_path)

If you want to resume training with the same optimizer state, weights-only is not enough; you need full model saving or a separate optimizer checkpoint strategy.
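
For reference, a self-contained sketch of the weights-only round trip follows. The make_model function here is a hypothetical two-layer architecture, and the file name uses the ".weights.h5" suffix as an assumption, chosen because it is accepted by both older and newer Keras versions (unlike the "ckpt" prefix above).

```python
import tempfile
from pathlib import Path

import numpy as np
import tensorflow as tf


def make_model():
    """Hypothetical architecture; the real one comes from your own code."""
    return tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])


x = tf.zeros([1, 32])

original = make_model()
_ = original(x)  # dummy forward pass creates the variables

weights_path = Path(tempfile.mkdtemp()) / "model.weights.h5"
original.save_weights(str(weights_path))

restored = make_model()  # fresh random weights
_ = restored(x)          # variables must exist before loading
restored.load_weights(str(weights_path))

# After loading, both models should produce the same outputs.
probe = tf.ones([2, 32])
weights_match = np.allclose(original(probe).numpy(), restored(probe).numpy())
```

Only the layer weights are restored here; any optimizer attached to the new model starts from scratch.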

Custom Objects: Making Custom Layers/Losses Serializable

Custom layers, losses, and metrics can be saved and loaded reliably if they are properly serializable. The goal is that Keras can reconstruct them from a config dictionary.

Custom layer: implement get_config (and optionally from_config)

import tensorflow as tf

@tf.keras.utils.register_keras_serializable(package="Course")
class ScaledDense(tf.keras.layers.Layer):
    def __init__(self, units, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.scale = scale
        self.dense = tf.keras.layers.Dense(units)

    def call(self, inputs):
        return self.scale * self.dense(inputs)

    def get_config(self):
        config = super().get_config()
        config.update({"units": self.units, "scale": self.scale})
        return config

Key points:

  • get_config must return JSON-serializable values (numbers, strings, lists, dicts).
  • Use @register_keras_serializable so you can load without manually passing custom_objects in many cases.
  • If your layer contains sublayers (like Dense), Keras will track them automatically when assigned as attributes.
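
One way to check these points before saving a whole model is to round-trip the layer config through JSON and rebuild the layer from it. The sketch below repeats the layer definition so it runs on its own; the package name "CourseDemo" is a hypothetical choice to avoid clashing with a class already registered under "Course".

```python
import json

import tensorflow as tf


@tf.keras.utils.register_keras_serializable(package="CourseDemo")
class ScaledDense(tf.keras.layers.Layer):
    def __init__(self, units, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.scale = scale
        self.dense = tf.keras.layers.Dense(units)

    def call(self, inputs):
        return self.scale * self.dense(inputs)

    def get_config(self):
        config = super().get_config()
        config.update({"units": self.units, "scale": self.scale})
        return config


layer = ScaledDense(units=8, scale=0.5, name="scaled")

# json.dumps raises TypeError if the config holds non-serializable values.
json_text = json.dumps(layer.get_config())

# The default from_config calls the constructor with the config as kwargs.
rebuilt = ScaledDense.from_config(json.loads(json_text))
```

The rebuilt layer has the same constructor arguments but fresh, untrained weights; the config describes structure, not learned parameters.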

Custom loss: prefer a serializable class

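@tf.keras.utils.register_keras_serializable(package="Course")
class WeightedMSE(tf.keras.losses.Loss):
    def __init__(self, weight=1.0, name="weighted_mse", **kwargs):
        # **kwargs forwards base-class options such as `reduction`.
        super().__init__(name=name, **kwargs)
        self.weight = weight

    def call(self, y_true, y_pred):
        # Return one value per sample; Keras applies the batch reduction.
        return self.weight * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

    def get_config(self):
        # Start from the base config so `name` and `reduction` round-trip too.
        config = super().get_config()
        config.update({"weight": self.weight})
        return config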

If you used a plain Python function as a loss, you can still load by providing custom_objects, but class-based losses with get_config are more robust for long-term reuse.
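
The robustness of a class-based loss comes from being able to rebuild it from its config. The following sketch checks that on a small hand-computable example. It defines its own copy of the loss under the hypothetical package name "CourseDemo", accepts extra keyword arguments so the base-class reduction setting can pass through, and returns per-sample values from call.

```python
import tensorflow as tf


@tf.keras.utils.register_keras_serializable(package="CourseDemo")
class WeightedMSE(tf.keras.losses.Loss):
    def __init__(self, weight=1.0, name="weighted_mse", **kwargs):
        super().__init__(name=name, **kwargs)
        self.weight = weight

    def call(self, y_true, y_pred):
        # One loss value per sample; Keras applies the batch reduction.
        return self.weight * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

    def get_config(self):
        config = super().get_config()
        config.update({"weight": self.weight})
        return config


loss = WeightedMSE(weight=2.0)
rebuilt = WeightedMSE.from_config(loss.get_config())

# Squared errors are [1, 9], their mean is 5, and the weight doubles it.
y_true = tf.constant([[0.0, 0.0]])
y_pred = tf.constant([[1.0, 3.0]])
original_value = float(loss(y_true, y_pred))
rebuilt_value = float(rebuilt(y_true, y_pred))
```

If the two values differ, some constructor argument is missing from get_config.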

Loading with custom_objects (fallback approach)

If you did not register the objects (or you are loading in a different process where registration is not executed), pass them explicitly:

loaded = tf.keras.models.load_model(
    "artifacts/my_model/savedmodel/v1",
    custom_objects={"ScaledDense": ScaledDense, "WeightedMSE": WeightedMSE}
)

Validation Routine: Prove the Loaded Model Matches

A practical way to validate a saved artifact is to run a fixed test input through the model before saving and after loading, then compare outputs. Use deterministic settings and a fixed input tensor.

Step-by-step: Save, load, compare outputs

import numpy as np
import tensorflow as tf
from pathlib import Path

# Fixed test input
x_test = tf.constant(np.linspace(-1.0, 1.0, 32).reshape(1, 32), dtype=tf.float32)

# 1) Run original model
y_before = model(x_test, training=False)

# 2) Save
export_dir = Path("artifacts") / "my_model" / "savedmodel" / "v1"
model.save(export_dir)

# 3) Load
reloaded = tf.keras.models.load_model(export_dir)

# 4) Run reloaded model
y_after = reloaded(x_test, training=False)

# 5) Compare
max_abs_diff = tf.reduce_max(tf.abs(y_before - y_after)).numpy()
print("max_abs_diff:", max_abs_diff)

# Tolerance depends on dtype/device; float32 should be extremely close.
assert max_abs_diff < 1e-6

Notes:

  • Use training=False to avoid randomness from dropout and to ensure batch norm uses moving statistics.
  • On different hardware (CPU vs GPU) you may see tiny numeric differences; adjust tolerance if needed.
  • If you validate a weights-only save, compare outputs after rebuilding the architecture and loading weights.
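
If you run this check for several formats or versions, it can help to wrap the comparison in a small function. The sketch below uses only NumPy; compare_outputs and its default tolerance are hypothetical choices, not a Keras API.

```python
import numpy as np


def compare_outputs(before, after, atol=1e-6):
    """Return the largest absolute difference and whether it is within atol."""
    before = np.asarray(before, dtype=np.float64)
    after = np.asarray(after, dtype=np.float64)
    if before.shape != after.shape:
        raise ValueError(f"shape mismatch: {before.shape} vs {after.shape}")
    max_abs_diff = float(np.max(np.abs(before - after)))
    return {"max_abs_diff": max_abs_diff, "within_tolerance": max_abs_diff <= atol}


# Identical outputs pass; a visible drift fails.
same = compare_outputs([[0.25, -0.5]], [[0.25, -0.5]])
drifted = compare_outputs([[0.25, -0.5]], [[0.25, -0.4]])
```

With the tensors from the routine above, you would call it as compare_outputs(y_before.numpy(), y_after.numpy()).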

Versioning Model Artifacts and Recording Metadata

Model files are not enough by themselves. You also want to know what data, code, and hyperparameters produced the artifact. A simple, effective approach is:

  • Use a versioned directory structure for each export.
  • Write a small JSON metadata file alongside the model.
  • Include enough information to reproduce or audit the training run.

Recommended directory layout

artifacts/
  my_model/
    savedmodel/
      v1/
        saved_model.pb
        variables/
        assets/
        metadata.json
      v2/
        ...
    weights/
      v1/
        ckpt.index
        ckpt.data-00000-of-00001
        metadata.json
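
With this layout, picking the directory for a new export can be automated. The sketch below is one possible helper, assuming versions are named v1, v2, and so on; next_version_dir is a hypothetical name and the demo runs in a temporary directory.

```python
import re
import tempfile
from pathlib import Path

VERSION_PATTERN = re.compile(r"^v(\d+)$")


def next_version_dir(format_root):
    """Return the path for the next unused vN directory under format_root."""
    format_root = Path(format_root)
    existing = []
    if format_root.is_dir():
        for child in format_root.iterdir():
            match = VERSION_PATTERN.match(child.name)
            if child.is_dir() and match:
                existing.append(int(match.group(1)))
    return format_root / f"v{max(existing, default=0) + 1}"


# Demo in a temporary directory that mimics artifacts/my_model/savedmodel/.
root = Path(tempfile.mkdtemp()) / "my_model" / "savedmodel"
first = next_version_dir(root)      # nothing exported yet
(root / "v1").mkdir(parents=True)
(root / "v2").mkdir()
(root / "v10").mkdir()
following = next_version_dir(root)  # versions compare numerically, not as text
```

The helper only computes a path; it never creates or overwrites a version directory on its own.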

Step-by-step: Write a metadata file

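import json
from datetime import datetime, timezone
from pathlib import Path
import tensorflow as tf

export_dir = Path("artifacts") / "my_model" / "savedmodel" / "v1"
export_dir.mkdir(parents=True, exist_ok=True)  # no-op if the export already created it
metadata = {
    "model_name": "my_model",
    "format": "SavedModel",
    "version": "v1",
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
    "created_at": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),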
    "tensorflow_version": tf.__version__,
    "keras_version": tf.keras.__version__,
    "input_signature": {"shape": [None, 32], "dtype": "float32"},
    "training": {
        "optimizer": "Adam",
        "loss": "WeightedMSE",
        "metrics": ["mae"],
        "batch_size": 64,
        "epochs": 10,
        "seed": 123
    },
    "data": {
        "dataset_id": "internal-dataset-2026-01-15",
        "split": "train/val",
        "preprocessing": "standardize_v3"
    },
    "notes": "First export after tuning scale parameter"
}

with open(export_dir / "metadata.json", "w", encoding="utf-8") as f:
    json.dump(metadata, f, indent=2, sort_keys=True)
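
On the loading side, the same file can be read back and checked before the model is used. This is a minimal standard-library sketch; load_metadata and the REQUIRED_KEYS set are hypothetical and should be adapted to whatever fields you actually record.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical minimum set of keys; adjust to whatever your team records.
REQUIRED_KEYS = {"model_name", "format", "version", "created_at", "input_signature"}


def load_metadata(export_dir):
    """Read metadata.json from export_dir and fail early if keys are missing."""
    path = Path(export_dir) / "metadata.json"
    with open(path, encoding="utf-8") as f:
        metadata = json.load(f)
    missing = REQUIRED_KEYS - set(metadata)
    if missing:
        raise KeyError(f"metadata.json is missing keys: {sorted(missing)}")
    return metadata


# Demo with a small metadata file written to a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
demo_metadata = {
    "model_name": "my_model",
    "format": "SavedModel",
    "version": "v1",
    "created_at": "2026-01-15T00:00:00Z",
    "input_signature": {"shape": [None, 32], "dtype": "float32"},
}
(demo_dir / "metadata.json").write_text(json.dumps(demo_metadata), encoding="utf-8")

loaded = load_metadata(demo_dir)
expected_shape = loaded["input_signature"]["shape"]  # None marks the batch dimension
```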

Practical versioning guidance

  • Use monotonically increasing versions (v1, v2, …) or semantic versions (1.0.0).
  • Never overwrite a released version; create a new directory for each export.
  • Record the input contract (expected shape/dtype and any preprocessing assumptions).
  • Record training configuration (optimizer/loss/metrics and key hyperparameters) so you can interpret results and reproduce runs.
  • Store a validation record if desired: for example, save max_abs_diff from the validation routine, or store a small set of reference inputs/outputs in the metadata.
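
As one possible shape for that last point, the sketch below builds a JSON-friendly record with a SHA-256 digest per artifact file plus reference inputs and outputs. The function names and the record layout are hypothetical; the demo uses a fake one-file export.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of_file(path):
    """Hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_validation_record(export_dir, reference_inputs, reference_outputs, max_abs_diff):
    """Collect file checksums plus reference I/O into a JSON-friendly dict."""
    export_dir = Path(export_dir)
    checksums = {
        p.relative_to(export_dir).as_posix(): sha256_of_file(p)
        for p in sorted(export_dir.rglob("*"))
        # Skip metadata.json so the record can be stored inside it.
        if p.is_file() and p.name != "metadata.json"
    }
    return {
        "file_sha256": checksums,
        "reference_inputs": reference_inputs,
        "reference_outputs": reference_outputs,
        "max_abs_diff": max_abs_diff,
    }


# Demo on a fake export directory.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "saved_model.pb").write_bytes(b"demo")
record = build_validation_record(demo_dir, [[0.0, 1.0]], [[0.5]], 0.0)
record_json = json.dumps(record, indent=2, sort_keys=True)
```

A consumer can recompute the digests after copying the artifact to detect truncated or modified files, then replay the reference inputs to confirm the outputs still match.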

Now answer the exercise about the content:

You want to resume training later with the same optimizer state (e.g., Adam moments) after moving the model to another machine. Which saving approach best fits this goal?

Weights-only saves do not include optimizer state, so they can’t seamlessly resume training. Saving the full model captures the necessary pieces (including optimizer variables when compiled and trained) to reload and continue training consistently.
