What it Means to “Save a Model”
When you save a TensorFlow/Keras model, you are turning the result of training into a reusable artifact that can be moved to another machine, checked into version control (or stored in object storage), and loaded later for inference or continued training. Depending on the format and what you choose to save, the artifact may include:
- Architecture: the network structure (layers and connections).
- Weights: learned parameters.
- Optimizer state: momentum/Adam moments and other internal variables needed to resume training seamlessly.
- Training configuration: loss, metrics, and optimizer configuration (when using Keras compile).
- Extra assets: vocabularies, lookup tables, or other files referenced by the model.
SavedModel vs. H5: Which Should You Use?
SavedModel (recommended default)
- Directory-based format (a folder containing protobuf + variables + assets).
- Best compatibility with TensorFlow Serving, TFLite conversion workflows, and production tooling.
- Supports more features (signatures, assets, trackable objects).
- Typical use: production deployment and long-term storage.
H5 / HDF5 (.h5)
- Single-file format, convenient for quick sharing.
- Works well for many standard Keras models.
- Can be more fragile with custom layers/losses unless you implement serialization correctly and provide custom objects on load.
- Typical use: experiments, simple portability, legacy workflows.
In modern TensorFlow/Keras, prefer SavedModel for production and interoperability. Use .h5 when you specifically want a single file and your model is simple and fully serializable.
Saving the Full Model (Architecture + Weights + Optimizer State)
Saving the “full model” is the most convenient option when you want to reload and immediately run inference, and optionally continue training with the same optimizer state.
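The snippets in this section assume `model` is a trained, compiled Keras model. If you want something concrete to run, a minimal stand-in might look like the following; the 32-feature input, architecture, and one-epoch synthetic fit are illustrative assumptions, and the make_model() helper is reused later in the weights-only example.
import numpy as np
import tensorflow as tf

def make_model():
    # Illustrative stand-in; any compiled Keras model can be saved the same way.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

model = make_model()

# A tiny synthetic training run so weights and optimizer state exist.
x = np.random.rand(256, 32).astype("float32")
y = np.random.rand(256, 1).astype("float32")
model.fit(x, y, epochs=1, batch_size=64, verbose=0)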
Step-by-step: Save as SavedModel
import tensorflow as tf
from pathlib import Path
# Assume `model` is already trained and compiled.
export_dir = Path("artifacts") / "my_model" / "savedmodel" / "v1"
export_dir.parent.mkdir(parents=True, exist_ok=True)
# This saves architecture, weights, and (when compiled) training config.
# Optimizer state is saved when the model has been compiled and trained.
model.save(export_dir)  # SavedModel directory
This creates a directory like:
- saved_model.pb (graph and metadata)
- variables/ (weights and optimizer variables)
- assets/ (optional extra files)
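To sanity-check the export, you can list the directory from Python; a small sketch reusing export_dir from above:
# Print the artifact's file tree to confirm it matches the layout described above.
for path in sorted(export_dir.rglob("*")):
    print(path.relative_to(export_dir))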
Step-by-step: Save as H5
h5_path = Path("artifacts") / "my_model" / "h5" / "v1" / "model.h5"
h5_path.parent.mkdir(parents=True, exist_ok=True)
model.save(h5_path)  # HDF5 single file
H5 can store the model configuration and weights, and often the optimizer state as well, but custom components require extra care (covered below).
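Loading the single-file artifact back is symmetric; a minimal sketch reusing h5_path from above (models with custom components also need the custom_objects handling covered later):
# Reload the HDF5 artifact; architecture, weights, and (when saved) optimizer state come back.
restored = tf.keras.models.load_model(h5_path)
restored.summary()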
Saving Weights Only (When You Don’t Need the Full Artifact)
Saving weights only is useful when:
- You want to keep the code-defined architecture as the source of truth.
- You are iterating on architecture code but want to checkpoint weights.
- You want smaller artifacts and you control the model-building code at load time.
Step-by-step: Save weights
weights_path = Path("artifacts") / "my_model" / "weights" / "v1" / "ckpt"
weights_path.parent.mkdir(parents=True, exist_ok=True)
model.save_weights(weights_path)
To load weights, you must recreate the model with the exact same architecture (and typically build it by calling it once or using build), then load:
# Recreate the model in code (same layers and shapes)
model2 = make_model() # you define this
# Ensure variables exist (one way is to run a dummy forward pass)
_ = model2(tf.zeros([1, 32]))
model2.load_weights(weights_path)
If you want to resume training with the same optimizer state, weights-only is not enough; you need full model saving or a separate optimizer checkpoint strategy.
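One option for that separate strategy is tf.train.Checkpoint, which can track the model and its optimizer together; a minimal sketch, assuming `model` was compiled (so model.optimizer exists) and make_model() rebuilds the same architecture:
# Hedged sketch: checkpoint weights plus optimizer variables (e.g. Adam moments).
ckpt = tf.train.Checkpoint(model=model, optimizer=model.optimizer)
manager = tf.train.CheckpointManager(ckpt, "artifacts/my_model/train_ckpt", max_to_keep=3)
save_path = manager.save()
print("checkpoint written to:", save_path)

# Later: rebuild the same objects, then restore before resuming training.
# Optimizer slot variables are restored lazily, so expect_partial() silences
# warnings about values that are only matched once training resumes.
new_model = make_model()
ckpt2 = tf.train.Checkpoint(model=new_model, optimizer=new_model.optimizer)
ckpt2.restore(manager.latest_checkpoint).expect_partial()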
Custom Objects: Making Custom Layers/Losses Serializable
Custom layers, losses, and metrics can be saved and loaded reliably if they are properly serializable. The goal is that Keras can reconstruct them from a config dictionary.
Custom layer: implement get_config (and optionally from_config)
import tensorflow as tf
@tf.keras.utils.register_keras_serializable(package="Course")
class ScaledDense(tf.keras.layers.Layer):
    def __init__(self, units, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.scale = scale
        self.dense = tf.keras.layers.Dense(units)

    def call(self, inputs):
        return self.scale * self.dense(inputs)

    def get_config(self):
        config = super().get_config()
        config.update({"units": self.units, "scale": self.scale})
        return config
Key points:
- get_config must return JSON-serializable values (numbers, strings, lists, dicts); a quick round-trip check is sketched after this list.
- Use @register_keras_serializable so you can load without manually passing custom_objects in many cases.
- If your layer contains sublayers (like Dense), Keras will track them automatically when assigned as attributes.
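A quick way to confirm a custom layer is serializable is to round-trip it through its own config; a small sketch using the ScaledDense class above:
# Build the layer, extract its config, and rebuild it via the inherited from_config.
layer = ScaledDense(units=16, scale=0.5)
config = layer.get_config()
rebuilt = ScaledDense.from_config(config)
print(rebuilt.units, rebuilt.scale)  # expected: 16 0.5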
Custom loss: prefer a serializable class
@tf.keras.utils.register_keras_serializable(package="Course")
class WeightedMSE(tf.keras.losses.Loss):
    def __init__(self, weight=1.0, name="weighted_mse"):
        super().__init__(name=name)
        self.weight = weight

    def call(self, y_true, y_pred):
        return self.weight * tf.reduce_mean(tf.square(y_true - y_pred))

    def get_config(self):
        return {"weight": self.weight, "name": self.name}
If you used a plain Python function as a loss, you can still load by providing custom_objects, but class-based losses with get_config are more robust for long-term reuse.
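For a quick usage illustration, both custom objects plug into compile like built-ins; the architecture and optimizer below are assumptions for the sketch:
# Hedged sketch: a small model using the custom layer and loss defined above.
custom_model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    ScaledDense(16, scale=0.5),
    tf.keras.layers.Dense(1),
])
custom_model.compile(optimizer="adam", loss=WeightedMSE(weight=2.0), metrics=["mae"])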
Loading with custom_objects (fallback approach)
If you did not register the objects (or you are loading in a different process where registration is not executed), pass them explicitly:
loaded = tf.keras.models.load_model(
"artifacts/my_model/savedmodel/v1",
custom_objects={"ScaledDense": ScaledDense, "WeightedMSE": WeightedMSE}
)Validation Routine: Prove the Loaded Model Matches
A practical way to validate a saved artifact is to run a fixed test input through the model before saving and after loading, then compare outputs. Use deterministic settings and a fixed input tensor.
Step-by-step: Save, load, compare outputs
import numpy as np
import tensorflow as tf
from pathlib import Path
# Fixed test input
x_test = tf.constant(np.linspace(-1.0, 1.0, 32).reshape(1, 32), dtype=tf.float32)
# 1) Run original model
y_before = model(x_test, training=False)
# 2) Save
export_dir = Path("artifacts") / "my_model" / "savedmodel" / "v1"
model.save(export_dir)
# 3) Load
reloaded = tf.keras.models.load_model(export_dir)
# 4) Run reloaded model
y_after = reloaded(x_test, training=False)
# 5) Compare
max_abs_diff = tf.reduce_max(tf.abs(y_before - y_after)).numpy()
print("max_abs_diff:", max_abs_diff)
# Tolerance depends on dtype/device; float32 should be extremely close.
assert max_abs_diff < 1e-6
Notes:
- Use training=False to avoid randomness from dropout and to ensure batch norm uses moving statistics.
- On different hardware (CPU vs GPU) you may see tiny numeric differences; adjust the tolerance if needed.
- If you validate a weights-only save, compare outputs after rebuilding the architecture and loading weights, as in the sketch after this list.
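For a weights-only artifact, the equivalent check rebuilds the architecture first; a sketch assuming the make_model() helper, weights_path, x_test, and y_before from earlier:
# Rebuild the architecture, create variables, load the checkpointed weights, compare.
model_w = make_model()
_ = model_w(tf.zeros([1, 32]))  # dummy forward pass so variables exist
model_w.load_weights(weights_path)
y_weights = model_w(x_test, training=False)
print("weights-only max_abs_diff:", tf.reduce_max(tf.abs(y_before - y_weights)).numpy())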
Versioning Model Artifacts and Recording Metadata
Model files are not enough by themselves. You also want to know what data, code, and hyperparameters produced the artifact. A simple, effective approach is:
- Use a versioned directory structure for each export.
- Write a small JSON metadata file alongside the model.
- Include enough information to reproduce or audit the training run.
Recommended directory layout
artifacts/
  my_model/
    savedmodel/
      v1/
        saved_model.pb
        variables/
        assets/
        metadata.json
      v2/
        ...
    weights/
      v1/
        ckpt.index
        ckpt.data-00000-of-00001
        metadata.json
Step-by-step: Write a metadata file
import json
from datetime import datetime
from pathlib import Path
import tensorflow as tf
export_dir = Path("artifacts") / "my_model" / "savedmodel" / "v1"
metadata = {
    "model_name": "my_model",
    "format": "SavedModel",
    "version": "v1",
    "created_at": datetime.utcnow().isoformat() + "Z",
    "tensorflow_version": tf.__version__,
    "keras_version": tf.keras.__version__,
    "input_signature": {"shape": [None, 32], "dtype": "float32"},
    "training": {
        "optimizer": "Adam",
        "loss": "WeightedMSE",
        "metrics": ["mae"],
        "batch_size": 64,
        "epochs": 10,
        "seed": 123
    },
    "data": {
        "dataset_id": "internal-dataset-2026-01-15",
        "split": "train/val",
        "preprocessing": "standardize_v3"
    },
    "notes": "First export after tuning scale parameter"
}
with open(export_dir / "metadata.json", "w", encoding="utf-8") as f:
    json.dump(metadata, f, indent=2, sort_keys=True)
Practical versioning guidance
- Use monotonically increasing versions (v1, v2, …) or semantic versions (1.0.0).
- Never overwrite a released version; create a new directory for each export.
- Record the input contract (expected shape/dtype and any preprocessing assumptions).
- Record training configuration (optimizer/loss/metrics and key hyperparameters) so you can interpret results and reproduce runs.
- Store a validation checksum if desired: for example, save max_abs_diff from the validation routine or store a small set of reference inputs/outputs in the metadata, as sketched below.
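One lightweight way to implement the last point: hash the exported files and record a few reference inputs/outputs next to the other metadata fields; a sketch, assuming the metadata dict, export_dir, x_test, y_after, and max_abs_diff from earlier (the helper name is arbitrary):
import hashlib

def artifact_sha256(directory):
    # Deterministically hash every file under the export directory.
    digest = hashlib.sha256()
    for path in sorted(directory.rglob("*")):
        if path.is_file():
            digest.update(path.read_bytes())
    return digest.hexdigest()

metadata["artifact_sha256"] = artifact_sha256(export_dir)
metadata["reference_io"] = {
    "input": x_test.numpy().tolist(),
    "output": y_after.numpy().tolist(),
    "max_abs_diff": float(max_abs_diff),
}
Re-run the json.dump step afterwards so the metadata file on disk includes these fields.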