
End-to-End Capstone: Train, Export, and Serve Your First TensorFlow Model

Chapter 10

Estimated reading time: 10 minutes

Capstone Goal and Acceptance Checks

In this capstone, you will connect the full workflow into one repeatable run: define a small supervised learning problem, build a consistent preprocessing pipeline, train with callbacks, hit a metric target, export a SavedModel, and run an inference/serving demo that returns a documented prediction payload. The key concept is end-to-end consistency: the exact same input schema and preprocessing must be used during training, evaluation, and serving. Your acceptance checks at the end verify that you can reproduce the run and that the exported artifact behaves correctly when loaded elsewhere.

Problem Definition

Task: Predict whether a passenger will tip (binary classification) based on trip features.
Metric target: validation AUC ≥ 0.80 on a held-out split (adjust the threshold to your dataset quality).
Serving contract: a request is a JSON object that maps each feature name to a list of values (numeric for four fields, string for payment_type); the response returns a probability and a class label for each row.

Dataset and Input Schema

Use a simple CSV dataset with a header row. You can create a small synthetic dataset for the capstone (so the workflow is runnable anywhere) or point to your own CSV with the same schema. The model expects these input fields:

trip_distance (float)
fare_amount (float)
passenger_count (float)
payment_type (string; e.g., "card", "cash")
hour_of_day (float; 0–23)
label (int; 0/1) — used only for training/evaluation

Define the schema explicitly in code so training and serving agree on dtypes and feature names.

import os, json, random, numpy as np, tensorflow as tf

SEED = 42
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

FEATURES_NUMERIC = ["trip_distance", "fare_amount", "passenger_count", "hour_of_day"]
FEATURES_CATEGORICAL = ["payment_type"]
LABEL = "label"

INPUT_SCHEMA = {
  "trip_distance": tf.float32,
  "fare_amount": tf.float32,
  "passenger_count": tf.float32,
  "hour_of_day": tf.float32,
  "payment_type": tf.string,
}

CSV_COLUMNS = FEATURES_NUMERIC + FEATURES_CATEGORICAL + [LABEL]
CSV_DEFAULTS = [0.0, 0.0, 0.0, 0.0, "", 0]  # matches CSV_COLUMNS order
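
As a hedged illustration of how a serving layer might enforce this schema before any tensors are built, the following standard-library sketch checks a decoded JSON payload against the field names and kinds listed above. REQUEST_SCHEMA and validate_request are hypothetical names introduced for this example; they are not part of TensorFlow, and the rule that every field is a list of equal length is an assumption about the request format.

```python
# Hypothetical request validator (illustrative only; not a TensorFlow API).
# Mirrors the documented input schema: four numeric fields and one string field,
# each supplied as a list so one request can carry a batch.
REQUEST_SCHEMA = {
    "trip_distance": "number",
    "fare_amount": "number",
    "passenger_count": "number",
    "hour_of_day": "number",
    "payment_type": "string",
}


def validate_request(payload):
    """Return a list of problems; an empty list means the payload matches the schema."""
    problems = []
    missing = sorted(set(REQUEST_SCHEMA) - set(payload))
    extra = sorted(set(payload) - set(REQUEST_SCHEMA))
    if missing:
        problems.append(f"missing fields: {missing}")
    if extra:
        problems.append(f"unexpected fields: {extra}")

    lengths = set()
    for name, kind in REQUEST_SCHEMA.items():
        values = payload.get(name)
        if values is None:
            continue
        if not isinstance(values, list):
            problems.append(f"{name}: expected a list of values")
            continue
        lengths.add(len(values))
        for v in values:
            if kind == "number":
                # bool is a subclass of int, so exclude it explicitly
                ok = isinstance(v, (int, float)) and not isinstance(v, bool)
            else:
                ok = isinstance(v, str)
            if not ok:
                problems.append(f"{name}: {v!r} is not a {kind}")
    if len(lengths) > 1:
        problems.append("all fields must have the same batch length")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every schema violation in a single error response.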

Optional: Generate a Small Synthetic CSV

This creates a dataset with a learnable signal (tips more likely for longer trips, card payments, and evening hours). Replace this with your real CSV if you have one.

def make_synthetic_csv(path, n=5000, seed=SEED):
    # A distinct seed per file keeps the validation split independent of the training split
    rng = np.random.default_rng(seed)
    payment_types = np.array(["card", "cash"])

    trip_distance = rng.gamma(shape=2.0, scale=2.0, size=n).astype(np.float32)
    fare_amount = (2.5 + trip_distance * 3.2 + rng.normal(0, 2.0, size=n)).astype(np.float32)
    passenger_count = rng.integers(1, 5, size=n).astype(np.float32)
    hour_of_day = rng.integers(0, 24, size=n).astype(np.float32)
    payment_type = rng.choice(payment_types, size=n, p=[0.7, 0.3])

    # Build a tip logit from a simple rule plus noise, then threshold it into a 0/1 label
    logit = (
        0.25 * trip_distance +
        0.08 * fare_amount +
        0.15 * (payment_type == "card").astype(np.float32) +
        0.10 * ((hour_of_day >= 18) & (hour_of_day <= 23)).astype(np.float32) -
        0.30 * (payment_type == "cash").astype(np.float32) +
        rng.normal(0, 0.8, size=n)
    )
    prob = 1 / (1 + np.exp(-logit))
    label = (prob > 0.5).astype(np.int32)

    # Key each column by name so rows are written in exactly the CSV_COLUMNS order
    # (the header and record defaults put hour_of_day before payment_type)
    columns = {
        "trip_distance": [f"{v:.3f}" for v in trip_distance],
        "fare_amount": [f"{v:.2f}" for v in fare_amount],
        "passenger_count": [f"{v:.0f}" for v in passenger_count],
        "hour_of_day": [f"{v:.0f}" for v in hour_of_day],
        "payment_type": [str(v) for v in payment_type],
        LABEL: [str(int(v)) for v in label],
    }
    with open(path, "w", encoding="utf-8") as f:
        f.write(",".join(CSV_COLUMNS) + "\n")
        for i in range(n):
            f.write(",".join(columns[name][i] for name in CSV_COLUMNS) + "\n")

DATA_DIR = "./capstone_data"
os.makedirs(DATA_DIR, exist_ok=True)
train_csv = os.path.join(DATA_DIR, "train.csv")
val_csv = os.path.join(DATA_DIR, "val.csv")

if not os.path.exists(train_csv) or not os.path.exists(val_csv):
    make_synthetic_csv(train_csv, n=8000, seed=SEED)
    make_synthetic_csv(val_csv, n=2000, seed=SEED + 1)  # different seed: no rows shared with train

Build the tf.data Pipeline (Training and Validation)

The pipeline must: (1) parse CSV rows into typed tensors, (2) separate features from label, (3) batch and prefetch. Keep preprocessing logic centralized so it can be reused by the exported model.

AUTOTUNE = tf.data.AUTOTUNE

def parse_csv_line(line):
    fields = tf.io.decode_csv(line, record_defaults=CSV_DEFAULTS)
    features = dict(zip(CSV_COLUMNS, fields))
    label = tf.cast(features.pop(LABEL), tf.float32)
    return features, label

def make_dataset(csv_path, batch_size=128, training=True):
    ds = tf.data.TextLineDataset(csv_path).skip(1)  # skip header
    if training:
        ds = ds.shuffle(10_000, seed=SEED, reshuffle_each_iteration=True)
    ds = ds.map(parse_csv_line, num_parallel_calls=AUTOTUNE)
    ds = ds.batch(batch_size)
    ds = ds.prefetch(AUTOTUNE)
    return ds

train_ds = make_dataset(train_csv, batch_size=256, training=True)
val_ds = make_dataset(val_csv, batch_size=256, training=False)

Quick Data Sanity Check

Before modeling, verify that the dataset yields the expected keys and dtypes. This catches schema drift early.

for batch_features, batch_labels in train_ds.take(1):
    print("Feature keys:", sorted(batch_features.keys()))
    for k in sorted(batch_features.keys()):
        print(k, batch_features[k].dtype, batch_features[k].shape)
    print("Labels:", batch_labels.dtype, batch_labels.shape)
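
A complementary check can be run on the raw file before tf.data is involved. The sketch below uses only the standard library to confirm that the header matches the expected column order and that the first rows parse with the expected types. EXPECTED_COLUMNS, NUMERIC_COLUMNS, and check_csv are hypothetical names for this example, and the column order is assumed to match CSV_COLUMNS.

```python
import csv
import io

# Expected column order; assumed to match CSV_COLUMNS from the schema section.
EXPECTED_COLUMNS = [
    "trip_distance", "fare_amount", "passenger_count",
    "hour_of_day", "payment_type", "label",
]
NUMERIC_COLUMNS = {"trip_distance", "fare_amount", "passenger_count", "hour_of_day"}


def check_csv(text_stream, max_rows=100):
    """Check the header order and that the first rows parse with the expected types."""
    reader = csv.reader(text_stream)
    header = next(reader)
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"header mismatch: {header}")
    for row_number, row in enumerate(reader, start=2):
        if row_number > max_rows + 1:
            break
        if len(row) != len(EXPECTED_COLUMNS):
            raise ValueError(f"line {row_number}: expected {len(EXPECTED_COLUMNS)} fields")
        record = dict(zip(header, row))
        for name in NUMERIC_COLUMNS:
            float(record[name])  # raises ValueError if a string landed in a numeric column
        if record["label"] not in ("0", "1"):
            raise ValueError(f"line {row_number}: label must be 0 or 1")
    return True


sample = io.StringIO(
    "trip_distance,fare_amount,passenger_count,hour_of_day,payment_type,label\n"
    "2.000,12.50,1,20,card,1\n"
)
print(check_csv(sample))
```

For a file on disk, the same function can be called with open(train_csv, newline="", encoding="utf-8"). A row whose values are written in a different order than the header fails on the float conversion, which is the kind of drift this check is meant to surface.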

Implement a Keras Model with Built-In Preprocessing

To avoid training/serving skew, put preprocessing inside the model using Keras preprocessing layers. The model will accept raw feature tensors (including strings) and perform normalization and categorical encoding internally. This makes the exported SavedModel self-contained: serving only needs to provide the raw schema.

Preprocessing Layers and Adaptation

Normalization layers should be adapted on training data only. For categorical strings, use a lookup layer and one-hot encoding.

def build_preprocessing_layers():
    # Numeric normalizers
    normalizers = {}
    for name in FEATURES_NUMERIC:
        normalizers[name] = tf.keras.layers.Normalization(axis=None, name=f"norm_{name}")

    # Categorical lookup
    payment_lookup = tf.keras.layers.StringLookup(output_mode="one_hot", name="lookup_payment")

    return normalizers, payment_lookup

normalizers, payment_lookup = build_preprocessing_layers()

# Adapt numeric normalizers
for name in FEATURES_NUMERIC:
    ds_num = train_ds.map(lambda x, y: x[name])
    normalizers[name].adapt(ds_num)

# Adapt categorical lookup
ds_pay = train_ds.map(lambda x, y: x["payment_type"])
payment_lookup.adapt(ds_pay)
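
To make the adaptation step less opaque, here is a plain-Python sketch of what the two kinds of layers conceptually learn and apply: a mean and variance for z-score normalization, and a frequency-ordered vocabulary with index 0 reserved for unknown tokens. This is an illustration under simplifying assumptions, not the Keras implementation; all function names are hypothetical, and breaking frequency ties alphabetically is a choice made here only for determinism.

```python
from collections import Counter
from statistics import fmean, pvariance


def fit_normalizer(values):
    """Return (mean, variance) estimated from training values only."""
    return fmean(values), pvariance(values)


def normalize(x, mean, variance):
    # Standard z-score; a real layer also guards against zero variance.
    return (x - mean) / (variance ** 0.5)


def fit_vocabulary(tokens):
    """Order tokens by descending frequency; index 0 is reserved for unknown values."""
    counts = Counter(tokens)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {token: i + 1 for i, token in enumerate(ordered)}


def one_hot(token, vocab):
    vector = [0.0] * (len(vocab) + 1)  # +1 slot for out-of-vocabulary tokens
    vector[vocab.get(token, 0)] = 1.0
    return vector


mean, var = fit_normalizer([1.0, 2.0, 3.0])
vocab = fit_vocabulary(["card", "card", "cash"])
print(normalize(3.0, mean, var))
print(one_hot("card", vocab), one_hot("voucher", vocab))
```

Because the statistics and vocabulary come from training data only, a payment type never seen in training (such as "voucher" here) lands in the reserved slot at serving time instead of causing an error.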

Model Definition

The model concatenates normalized numeric features with one-hot categorical features, then uses a small MLP for binary classification. The output is a probability in [0, 1].

def build_model():
    inputs = {}
    for name, dtype in INPUT_SCHEMA.items():
        inputs[name] = tf.keras.Input(shape=(), name=name, dtype=dtype)

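    # Numeric stack
    # Note: tf.expand_dims and tf.concat are applied directly to symbolic Keras inputs here.
    # This works with the Keras 2 API bundled with TensorFlow 2.15 and earlier; Keras 3 rejects
    # raw tf.* calls on symbolic tensors and expects Keras layers or keras.ops instead.
    numeric_tensors = []
    for name in FEATURES_NUMERIC:
        x = normalizers[name](inputs[name])
        numeric_tensors.append(tf.expand_dims(x, -1))
    numeric_stack = tf.concat(numeric_tensors, axis=-1)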

    # Categorical one-hot
    payment_oh = payment_lookup(inputs["payment_type"])  # already one-hot

    # Combine
    x = tf.concat([numeric_stack, payment_oh], axis=-1)
    x = tf.keras.layers.Dense(32, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.Dense(16, activation="relu")(x)
    prob = tf.keras.layers.Dense(1, activation="sigmoid", name="tip_probability")(x)

    model = tf.keras.Model(inputs=inputs, outputs=prob)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=[
            tf.keras.metrics.AUC(name="auc"),
            tf.keras.metrics.BinaryAccuracy(name="accuracy")
        ],
    )
    return model

model = build_model()
model.summary()
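
As a rough cross-check on the summary output, the trainable parameter count of the dense layers can be worked out by hand. The sketch below assumes the lookup layer learned exactly two payment types ("card" and "cash") plus one out-of-vocabulary slot, which holds for the synthetic data but may differ for your own CSV. dense_params is a hypothetical helper, and statistics held by the preprocessing layers are not counted.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


n_numeric = 4      # trip_distance, fare_amount, passenger_count, hour_of_day
n_payment = 2 + 1  # assumed vocabulary {"card", "cash"} plus one out-of-vocabulary slot
width = n_numeric + n_payment

layer_sizes = [32, 16, 1]  # the three Dense layers in build_model
total, n_in = 0, width
for units in layer_sizes:
    total += dense_params(n_in, units)
    n_in = units

print(width, total)  # 7 801
```

If the trainable total in the summary differs, the most likely cause is a vocabulary of a different size, which changes the width of the concatenated feature vector.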

Train with Callbacks and Reproducible Settings

Use callbacks to (1) stop when validation stops improving, and (2) keep the best weights. This makes your run more stable and helps you meet the metric target without manual trial-and-error.

callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_auc",
        mode="max",
        patience=3,
        restore_best_weights=True,
    ),
    tf.keras.callbacks.ModelCheckpoint(
        filepath=os.path.join(DATA_DIR, "checkpoint.keras"),
        monitor="val_auc",
        mode="max",
        save_best_only=True,
    ),
]

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20,
    callbacks=callbacks,
    verbose=2,
)
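
The patience rule can be easier to reason about with a small simulation. The following sketch is a simplified model of early stopping on a metric that should increase, written in plain Python; it is not the Keras callback itself, and simulate_early_stopping is a hypothetical name. It treats only a strictly higher score as an improvement.

```python
def simulate_early_stopping(val_scores, patience=3):
    """Return (last_epoch_run, best_epoch), both 0-based, for a 'max' mode monitor."""
    best_score = float("-inf")
    best_epoch = -1
    wait = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            # Improvement: remember it and reset the patience counter
            best_score, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_scores) - 1, best_epoch


scores = [0.70, 0.78, 0.81, 0.80, 0.81, 0.79, 0.77]
print(simulate_early_stopping(scores, patience=3))  # (5, 2)
```

In this made-up sequence, training stops after epoch index 5 because three epochs in a row failed to beat the score from epoch index 2. Restoring the best weights corresponds to rolling back to that epoch.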

Evaluate Against a Clear Metric Target

Evaluation is an acceptance gate: if the model does not meet the target, you should not export it as a “release” artifact. Keep the target explicit in code so it can run in CI later.

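# return_dict=True yields {"loss": ..., "auc": ..., "accuracy": ...} keyed by metric name,
# which avoids relying on the order or contents of model.metrics_names
metrics = model.evaluate(val_ds, verbose=0, return_dict=True)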
print("Validation metrics:", metrics)

TARGET_AUC = 0.80
if metrics["auc"] < TARGET_AUC:
    raise ValueError(f"AUC target not met: {metrics['auc']:.3f} < {TARGET_AUC}")
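
For intuition about what the gate measures, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes that quantity exactly by comparing all pairs. It is a plain-Python illustration with made-up labels and scores, and roc_auc is a hypothetical helper; the Keras metric approximates the curve with a fixed set of thresholds, so its value can differ slightly.

```python
def roc_auc(labels, scores):
    """Exact AUC: chance that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

Three of the four positive/negative pairs are ranked correctly here, giving 0.75. The pairwise loop is quadratic, so it suits small sanity checks only.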

Export as a SavedModel with a Serving Signature

Export the model in a form that can be loaded for inference without rebuilding code. Because preprocessing is inside the model, the SavedModel expects raw inputs matching the schema. Define a concrete serving function so the input and output fields are explicit and stable.

EXPORT_DIR = os.path.join(DATA_DIR, "export", "tip_model")

@tf.function(input_signature=[{
    "trip_distance": tf.TensorSpec([None], tf.float32),
    "fare_amount": tf.TensorSpec([None], tf.float32),
    "passenger_count": tf.TensorSpec([None], tf.float32),
    "hour_of_day": tf.TensorSpec([None], tf.float32),
    "payment_type": tf.TensorSpec([None], tf.string),
}])
def serve_fn(inputs):
    prob = model(inputs, training=False)
    prob = tf.squeeze(prob, axis=-1)
    pred_class = tf.cast(prob >= 0.5, tf.int32)
    return {
        "tip_probability": prob,
        "tip_predicted_class": pred_class,
    }

tf.saved_model.save(model, EXPORT_DIR, signatures={"serving_default": serve_fn})
print("Exported to:", EXPORT_DIR)

Inference Demo: Load the SavedModel and Predict

This demo simulates what a separate service process would do: load the exported artifact and run predictions on raw inputs. If this works, you have verified that the export is self-contained and that the serving signature is correct.

loaded = tf.saved_model.load(EXPORT_DIR)
serving = loaded.signatures["serving_default"]

example_request = {
    "trip_distance": tf.constant([1.2, 7.5], dtype=tf.float32),
    "fare_amount": tf.constant([8.5, 32.0], dtype=tf.float32),
    "passenger_count": tf.constant([1.0, 2.0], dtype=tf.float32),
    "hour_of_day": tf.constant([10.0, 21.0], dtype=tf.float32),
    "payment_type": tf.constant(["cash", "card"], dtype=tf.string),
}

response = serving(**example_request)
print({k: v.numpy() for k, v in response.items()})

Document the Output Fields

tip_probability: float32 vector of shape [batch], values in [0, 1].
tip_predicted_class: int32 vector of shape [batch], 1 if probability ≥ 0.5 else 0.
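
A consumer of the model could verify a decoded response against these two field descriptions. The following is a minimal standard-library sketch; check_response is a hypothetical helper, and it assumes the response has already been converted to plain Python lists (for example with .numpy().tolist()).

```python
import math


def check_response(response, threshold=0.5):
    """Validate a decoded response against the documented output fields."""
    probs = response["tip_probability"]
    classes = response["tip_predicted_class"]
    if len(probs) != len(classes):
        raise ValueError("both fields must have one entry per input row")
    for p, c in zip(probs, classes):
        if not (isinstance(p, float) and math.isfinite(p) and 0.0 <= p <= 1.0):
            raise ValueError(f"probability out of range: {p!r}")
        # The class must agree with the documented rule: 1 if probability >= threshold
        if c != int(p >= threshold):
            raise ValueError(f"class {c!r} is inconsistent with probability {p!r}")
    return True


print(check_response({"tip_probability": [0.83, 0.2], "tip_predicted_class": [1, 0]}))
```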

Serving Demo: Minimal HTTP Endpoint (Local)

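This example uses Flask, a lightweight Python web framework (version 2.0 or later is needed for the @app.post decorator), to demonstrate request/response behavior. It loads the SavedModel once at startup and serves predictions. This is not production-hardened (for example, a malformed payload results in a server error rather than a clear 400 response), but it validates the end-to-end contract.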

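import os

import tensorflow as tf
from flask import Flask, request, jsonify

# Same export location as in the export step, repeated so this file runs on its own
EXPORT_DIR = os.path.join("./capstone_data", "export", "tip_model")

app = Flask(__name__)
# Load the SavedModel once at startup rather than on every request
loaded = tf.saved_model.load(EXPORT_DIR)
serving = loaded.signatures["serving_default"]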

@app.post("/predict")
def predict():
    payload = request.get_json(force=True)
    # Expect payload to be a dict of feature_name -> list values
    inputs = {
        "trip_distance": tf.constant(payload["trip_distance"], dtype=tf.float32),
        "fare_amount": tf.constant(payload["fare_amount"], dtype=tf.float32),
        "passenger_count": tf.constant(payload["passenger_count"], dtype=tf.float32),
        "hour_of_day": tf.constant(payload["hour_of_day"], dtype=tf.float32),
        "payment_type": tf.constant(payload["payment_type"], dtype=tf.string),
    }
    outputs = serving(**inputs)
    result = {
        "tip_probability": outputs["tip_probability"].numpy().tolist(),
        "tip_predicted_class": outputs["tip_predicted_class"].numpy().tolist(),
    }
    return jsonify(result)

# Run: FLASK_APP=this_file.py flask run --port 8080

Example Request

curl -X POST http://127.0.0.1:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "trip_distance": [2.0],
    "fare_amount": [12.5],
    "passenger_count": [1.0],
    "hour_of_day": [20.0],
    "payment_type": ["card"]
  }'

Example Response

{
  "tip_probability": [0.83],
  "tip_predicted_class": [1]
}
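
The same request can be issued from Python without extra dependencies. This sketch builds the POST request with urllib from the standard library; build_predict_request is a hypothetical helper, and the URL assumes the local server from the example above is listening on port 8080. The network call is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_predict_request(payload, url="http://127.0.0.1:8080/predict"):
    """Build (but do not send) a POST request equivalent to the curl command."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "trip_distance": [2.0],
    "fare_amount": [12.5],
    "passenger_count": [1.0],
    "hour_of_day": [20.0],
    "payment_type": ["card"],
}
req = build_predict_request(payload)
print(req.get_method(), req.full_url)

# To send it against a running server:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(json.loads(resp.read()))
```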

Final Verification Checklist (Acceptance Gate)

Correct input schema: Serving signature expects exactly the documented fields and dtypes (float32 for numeric, string for payment_type), with batch dimension [None].
Consistent preprocessing: Preprocessing layers (normalization and lookup) are inside the model and were adapted on training data only; no external preprocessing is required at inference time.
Reproducible run: Random seeds are set (Python, NumPy, TensorFlow) and the data pipeline uses a fixed shuffle seed.
Model loads cleanly: A separate process can run tf.saved_model.load(EXPORT_DIR) without rebuilding the model code.
Valid prediction response: An example request returns both tip_probability and tip_predicted_class with correct shapes and types, and output fields are documented for consumers.

Now answer the exercise about the content:

Which approach best prevents training/serving skew when exporting and serving the tip prediction model?

Putting preprocessing layers inside the model and exporting a SavedModel with a fixed serving signature ensures the same schema and transformations are used during training, evaluation, and serving.