1) Creating Tensors from Python and NumPy
In TensorFlow, a tensor is a typed, n-dimensional array. Every tensor has two properties you must track to avoid bugs: shape (how many elements along each axis) and dtype (the element type). Most model errors that feel “mysterious” come from a mismatch in one of these two.
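As a quick illustration of those two properties, the sketch below builds two tensors that hold the same numbers but differ in both shape and dtype. The variable names `row` and `col` are illustrative only.

```python
import tensorflow as tf

# Same numbers, different shape and dtype.
row = tf.constant([1, 2, 3])               # rank-1, integer input
col = tf.constant([[1.0], [2.0], [3.0]])   # rank-2, float input

# Inspect both properties side by side.
print(row.shape, row.dtype)
print(col.shape, col.dtype)
```

Printing both properties whenever a tensor is created is a cheap habit that makes later mismatches easier to trace.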
From Python scalars and lists
Use tf.constant for fixed values and tf.convert_to_tensor when the input might be a Python list, a NumPy array, or an existing tensor and you want a tensor either way. Both infer the dtype from the data unless you specify one.
import tensorflow as tf
# Scalar (rank-0)
a = tf.constant(3)
# Vector (rank-1)
b = tf.constant([1, 2, 3])
# Matrix (rank-2)
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
# Convert (often used when you might pass NumPy arrays or Python lists)
d = tf.convert_to_tensor([10, 20, 30])
TensorFlow infers dtype from your input. If you mix ints and floats in a Python list, it will typically promote to float. You can also force a dtype.
x = tf.constant([1, 2, 3], dtype=tf.float32)
From NumPy arrays
NumPy arrays convert cleanly to tensors. The dtype is preserved unless you override it.
import numpy as np
np_x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
tf_x = tf.convert_to_tensor(np_x)
tf_y = tf.constant(np_x, dtype=tf.float32)  # override dtype
When you need random initialization or placeholders for computation, use TensorFlow creators:
zeros = tf.zeros([2, 3], dtype=tf.float32)
ones = tf.ones([2, 3])
randn = tf.random.normal([2, 3], mean=0.0, stddev=1.0)
randu = tf.random.uniform([2, 3], minval=0.0, maxval=1.0)
Exercise: rank and batch intuition
- Create a tensor representing a batch of 4 examples, each with 3 features. What shape should it have?
- Create a tensor representing a single example with 3 features. What shape should it have? How is it different from a batch of size 1?
# Try:
batch = tf.zeros([4, 3])
single = tf.zeros([3])
batch1 = tf.zeros([1, 3])
2) Inspecting Shape and Dtype
Always inspect tensors early, especially before feeding them into layers or combining them with other tensors.
Shape basics
tensor.shape returns a TensorShape object (often partially known in graph contexts). tf.shape(tensor) returns a tensor with the dynamic shape (useful inside @tf.function).
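The difference between the two only becomes visible when a dimension is unknown at trace time. The following is a minimal sketch, assuming a hypothetical function `batch_size_of` whose `input_signature` deliberately leaves the batch dimension as `None`.

```python
import tensorflow as tf

# Hypothetical helper: the batch dimension is left unknown (None) on purpose.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 10], dtype=tf.float32)])
def batch_size_of(x):
    # Static shape: a Python value fixed at trace time, so the batch entry is None.
    print("static batch at trace time:", x.shape[0])
    # Dynamic shape: a tensor whose value is filled in on each call.
    return tf.shape(x)[0]

n = batch_size_of(tf.zeros([5, 10]))
print(n)
```

The Python `print` runs only while the function is traced, whereas the returned value reflects the batch size of the input actually passed in.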
x = tf.random.normal([5, 10])
print(x.shape) # TensorShape([5, 10])
print(tf.shape(x)) # tf.Tensor([ 5 10], shape=(2,), dtype=int32)
print(x.ndim) # 2
print(tf.rank(x))  # tf.Tensor(2, shape=(), dtype=int32)
Dtype basics
Dtype affects numerical stability, memory usage, and which ops are allowed. Many ops require matching dtypes (e.g., you can’t add int32 and float32 without casting).
a = tf.constant([1, 2, 3], dtype=tf.int32)
b = tf.constant([0.1, 0.2, 0.3], dtype=tf.float32)
print(a.dtype) # <dtype: 'int32'>
print(b.dtype) # <dtype: 'float32'>
# Cast to align dtypes
c = tf.cast(a, tf.float32) + b
Exercise: dtype reasoning
- Create an int32 tensor and a float32 tensor of the same shape. Try adding them. Then fix it using tf.cast.
- Why might you prefer float32 for model inputs even if your raw data is integer?
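One possible way to work through the first bullet is sketched below. It assumes eager execution with TensorFlow's default (non-NumPy) type promotion; the names `ints`, `floats`, and `rejected` are illustrative.

```python
import tensorflow as tf

ints = tf.constant([1, 2, 3], dtype=tf.int32)
floats = tf.constant([0.5, 1.5, 2.5], dtype=tf.float32)

# Adding mismatched dtypes is expected to be rejected rather than promoted.
rejected = False
try:
    ints + floats
except (tf.errors.InvalidArgumentError, TypeError) as err:
    rejected = True
    print("mixed dtypes rejected:", type(err).__name__)

# Fix: cast explicitly so both operands are float32.
fixed = tf.cast(ints, tf.float32) + floats
print(fixed.dtype)
```

Casting the integer side up to float32 keeps the fractional values intact; casting the float side down to int32 would truncate them.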
3) Reshaping, Broadcasting, Slicing, and Concatenation
Reshaping safely
tf.reshape changes the view of the data without changing the number of elements. A common safe pattern is to keep the batch dimension and reshape the rest.
x = tf.random.normal([32, 28, 28, 1]) # batch of images
# Flatten each image into a vector while keeping batch
x_flat = tf.reshape(x, [tf.shape(x)[0], -1])
print(x_flat.shape)  # (32, 784) if static shape known
Use -1 for “infer this dimension,” but only when the total element count is compatible.
v = tf.range(12)
m = tf.reshape(v, [3, 4])
# tf.reshape(v, [5, -1]) would fail because 12 is not divisible by 5
Broadcasting (the silent shape changer)
Broadcasting lets TensorFlow combine tensors of different shapes by virtually expanding dimensions of size 1. This is powerful and also a common source of bugs if you accidentally broadcast along the wrong axis.
# Batch of 4 examples, 3 features each
x = tf.ones([4, 3])
# Per-feature bias (broadcast across batch)
bias = tf.constant([0.1, 0.2, 0.3]) # shape (3,)
y = x + bias # result shape (4, 3)
# Per-example scale (broadcast across features)
scale = tf.constant([[1.0], [2.0], [3.0], [4.0]]) # shape (4, 1)
z = x * scale  # result shape (4, 3)
When broadcasting surprises you, explicitly reshape with tf.reshape or tf.expand_dims to make intent clear.
bias2 = tf.reshape(bias, [1, 3]) # explicit batch axis
y2 = x + bias2
Slicing and indexing
Slicing works similarly to NumPy. Use it to separate batch and feature dimensions deliberately.
x = tf.random.normal([8, 5]) # 8 examples, 5 features
first_three_examples = x[:3, :] # shape (3, 5)
last_feature = x[:, -1] # shape (8,)
last_feature_keepdim = x[:, -1:]  # shape (8, 1)
Notice how x[:, -1] drops a dimension. This difference often matters when you later concatenate or do matrix multiplication.
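If the dimension has already been dropped, it can be restored before the next op. The sketch below shows two equivalent ways to do that and then uses the result in a matrix multiplication; the weight matrix `W` and the other variable names are made up for illustration.

```python
import tensorflow as tf

x = tf.random.normal([8, 5])   # 8 examples, 5 features
last = x[:, -1]                # integer index drops the axis: rank-1

# Two equivalent ways to put the feature axis back.
restored = tf.expand_dims(last, axis=-1)   # rank-2 column
also_restored = last[:, tf.newaxis]        # same result via indexing

# A rank-2 column can be multiplied by a (1, 2) matrix; the rank-1 vector cannot.
W = tf.random.normal([1, 2])
out = restored @ W
print(restored.shape, also_restored.shape, out.shape)
```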
Concatenation vs stacking
tf.concat joins tensors along an existing axis. tf.stack creates a new axis.
a = tf.zeros([2, 3])
b = tf.ones([2, 3])
# Concatenate along features: (2, 6)
cat = tf.concat([a, b], axis=1)
# Stack to create a new axis: (2, 2, 3)
stk = tf.stack([a, b], axis=1)
Exercise: batch vs feature axis
- You have two feature blocks: x1 shape (batch, 4) and x2 shape (batch, 6). What axis should you use to combine them into (batch, 10)?
- You have predictions from 3 different models, each shape (batch, 1). If you want a tensor of shape (batch, 3), should you use concat or stack? Along which axis?
batch = 5
x1 = tf.random.normal([batch, 4])
x2 = tf.random.normal([batch, 6])
combined = tf.concat([x1, x2], axis=1) # (5, 10)
p1 = tf.random.uniform([batch, 1])
p2 = tf.random.uniform([batch, 1])
p3 = tf.random.uniform([batch, 1])
ensemble = tf.concat([p1, p2, p3], axis=1)  # (5, 3)
4) Common Math Ops and Reductions
Elementwise math
Most basic operators are elementwise and support broadcasting: +, -, *, /, as well as functions like tf.square, tf.exp, tf.maximum.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(x + 1.0)
print(tf.square(x))
print(tf.maximum(x, 2.5))
Matrix multiplication (shape-sensitive)
tf.matmul (or the @ operator) performs matrix multiplication and is strict about inner dimensions. This is where many model shape errors originate.
# Batch of 32, 10 features
x = tf.random.normal([32, 10])
# Weight matrix to produce 4 outputs
W = tf.random.normal([10, 4])
# Result: (32, 4)
y = x @ W
If you accidentally transpose W to shape (4, 10), the error will mention incompatible shapes because 10 (from x) must match the second-to-last dimension of W.
W_bad = tf.random.normal([4, 10])
# y_bad = x @ W_bad  # InvalidArgumentError: Matrix size-incompatible
Reductions: sum, mean, max
Reductions collapse one or more axes. Decide up front whether to keep the reduced dimensions: keepdims=True leaves them in place with size 1, which preserves the rank for later broadcasting.
x = tf.random.normal([8, 5]) # (batch, features)
# Per-example mean over features: shape (8,)
per_example_mean = tf.reduce_mean(x, axis=1)
# Same, but keep dimension: shape (8, 1)
per_example_mean_kd = tf.reduce_mean(x, axis=1, keepdims=True)
# Global mean: scalar
global_mean = tf.reduce_mean(x)
Exercise: reductions and shape propagation
- Given x shape (batch, features), compute a per-feature mean across the batch. What is the resulting shape?
- Compute a per-example L2 norm (one value per example). Hint: square, reduce_sum over features, then sqrt. What shape do you get with and without keepdims=True?
x = tf.random.normal([7, 3])
per_feature_mean = tf.reduce_mean(x, axis=0) # (3,)
per_example_l2 = tf.sqrt(tf.reduce_sum(tf.square(x), axis=1)) # (7,)
per_example_l2_kd = tf.sqrt(tf.reduce_sum(tf.square(x), axis=1, keepdims=True))  # (7, 1)
5) Debugging Shape Mismatches with Assert and Print Utilities
When shapes are wrong, the error might appear far from where the mistake happened (for example, a bad slice produces shape (batch,) instead of (batch, 1), and a later concatenation fails). The goal is to catch issues as close to the source as possible.
Use tf.debugging.assert_* early
Assertions fail fast with clear messages. They are especially useful inside functions where shapes may be dynamic.
def ensure_2d(x, name="x"):
    tf.debugging.assert_rank(x, 2, message=f"{name} must be rank-2 (batch, features)")
    return x
def ensure_last_dim(x, d, name="x"):
    tf.debugging.assert_equal(tf.shape(x)[-1], d, message=f"{name} last dim must be {d}")
    return x
x = tf.random.normal([16, 10])
ensure_2d(x, "inputs")
ensure_last_dim(x, 10, "inputs")
You can also assert compatible shapes directly:
a = tf.random.normal([8, 5])
b = tf.random.normal([8, 5])
tf.debugging.assert_shapes([
    (a, ("batch", "features")),
    (b, ("batch", "features")),
])
c = a + b
Print shapes during execution with tf.print
Inside @tf.function, Python print runs only while the function is being traced, not on every call. Use tf.print to print tensor values or shapes at runtime.
@tf.function
def forward(x):
    tf.print("x shape:", tf.shape(x), "dtype:", x.dtype)
    return tf.reduce_mean(x, axis=1, keepdims=True)
out = forward(tf.random.normal([4, 6]))
How incorrect shapes propagate into model errors (common pattern)
A frequent mistake is losing the feature dimension by slicing, then trying to concatenate or multiply later.
# Suppose you have (batch, features)
x = tf.random.normal([10, 4])
# You want the last feature as a column vector (batch, 1)
last_feat_wrong = x[:, -1] # (10,) rank-1
last_feat_right = x[:, -1:] # (10, 1) rank-2
# Later you try to concatenate it back with other features
# try_concat_wrong = tf.concat([x[:, :3], last_feat_wrong], axis=1)  # would raise: ranks differ
try_concat_right = tf.concat([x[:, :3], last_feat_right], axis=1)  # (10, 4)
The error from tf.concat will complain about ranks not matching. The real bug happened earlier: x[:, -1] dropped a dimension.
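One way to surface the problem where it starts is to assert the rank immediately after the slice, before the value travels any further. The sketch below assumes eager execution; the names `caught` and `rebuilt` are illustrative only.

```python
import tensorflow as tf

x = tf.random.normal([10, 4])
last_feat = x[:, -1]  # rank-1: this is where the bug is introduced

# Check the rank right after slicing so the failure points at this line.
caught = False
try:
    tf.debugging.assert_rank(last_feat, 2, message="last_feat must be (batch, 1)")
except (ValueError, tf.errors.InvalidArgumentError) as err:
    caught = True
    print("caught at the source:", type(err).__name__)

# With the range slice the assertion passes and the concat succeeds.
last_feat = x[:, -1:]
tf.debugging.assert_rank(last_feat, 2, message="last_feat must be (batch, 1)")
rebuilt = tf.concat([x[:, :3], last_feat], axis=1)
print(rebuilt.shape)
```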
Another propagation example: batch dimension confusion
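Mixing up a per-example vector of shape (batch,) with a (batch, features) tensor either fails outright or, if batch happens to equal features, broadcasts along the wrong axis and silently produces wrong results.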
# Intended: 5 examples, 3 features
x = tf.random.normal([5, 3])
# Mistake: per-example bias should be (5, 1), but you created (5,)
bias_wrong = tf.range(5, dtype=tf.float32) # (5,)
# This will fail because (5,3) and (5,) are not broadcast-compatible
# y = x + bias_wrong
# Fix: make it (5, 1) so it broadcasts across features
bias_right = tf.reshape(bias_wrong, [5, 1])
y = x + bias_right
Mini-lab: catch a shape bug before it hits a layer
Imagine a simple linear computation that expects inputs shaped (batch, 8). You accidentally flatten incorrectly and produce (batch*8,). Use assertions and prints to locate the issue.
@tf.function
def linear_forward(x, W):
    tf.print("input:", tf.shape(x))
    tf.debugging.assert_rank(x, 2, message="x must be (batch, features)")
    tf.debugging.assert_equal(tf.shape(x)[1], tf.shape(W)[0], message="feature dim must match W")
    return x @ W
batch = 4
x_ok = tf.random.normal([batch, 8])
W = tf.random.normal([8, 2])
# Buggy reshape: collapses batch and features into one dimension
x_bug = tf.reshape(x_ok, [-1]) # (32,)
# Uncomment to see assertion failure close to the source
# y_bug = linear_forward(x_bug, W)
y_ok = linear_forward(x_ok, W)
Exercises: diagnose and fix
- Exercise A: You have features shape (32, 12) and weights shape (12, 1). What is the output shape of features @ weights? Now change weights to shape (1, 12). Predict the error and explain which dimensions are incompatible.
- Exercise B: You compute tf.reduce_mean(x, axis=1) on x shape (batch, features), then try to concatenate the result back to x along axis=1. Why does it fail? Fix it using keepdims=True or tf.expand_dims.
- Exercise C: You have labels shaped (batch,) but your model outputs (batch, 1). Show two ways to make shapes consistent: reshape labels to (batch, 1) or squeeze predictions to (batch,). Which choice is safer if you later concatenate labels as a feature?
# Exercise B quick check
x = tf.random.normal([6, 4])
m = tf.reduce_mean(x, axis=1) # (6,)
# tf.concat([x, m], axis=1) # rank mismatch
m2 = tf.reduce_mean(x, axis=1, keepdims=True) # (6, 1)
fixed = tf.concat([x, m2], axis=1) # (6, 5)
# Exercise C quick check
labels = tf.random.uniform([6], maxval=2, dtype=tf.int32) # (6,)
preds = tf.random.uniform([6, 1]) # (6, 1)
labels_col = tf.reshape(tf.cast(labels, tf.float32), [-1, 1]) # (6, 1)
preds_vec = tf.squeeze(preds, axis=1) # (6,)
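The quick checks above cover Exercises B and C. Exercise A could be checked along the same lines; the sketch below assumes eager execution, and the names `weights_bad`, `failed`, and `out_fixed` are illustrative.

```python
import tensorflow as tf

# Exercise A quick check
features = tf.random.normal([32, 12])
weights = tf.random.normal([12, 1])
out = features @ weights  # inner dims 12 and 12 match

# Transposed weights: inner dims are now 12 (features) and 1 (weights_bad).
weights_bad = tf.random.normal([1, 12])
failed = False
try:
    features @ weights_bad
except (tf.errors.InvalidArgumentError, ValueError) as err:
    failed = True
    print("matmul rejected:", type(err).__name__)

# One fix: ask matmul to transpose the second operand.
out_fixed = tf.matmul(features, weights_bad, transpose_b=True)
print(out.shape, out_fixed.shape)
```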