TensorFlow Tensors (What are Tensors: Understanding the Basics, Creating, and Working with Tensors)
First off: If you are familiar with NumPy arrays, understanding TensorFlow Tensors will be as easy as first importing TensorFlow as below:
import tensorflow as tf
print(tf.__version__) # check version
# 2.14.0
But what is TensorFlow?
TensorFlow is an open-source machine-learning platform designed to facilitate the development and deployment of machine-learning models, especially in deep learning. Its name is derived from one of its core building blocks: Tensors.
In TensorFlow, all the computations involve Tensors. That makes working with Tensors must-have knowledge before you venture into building deep machine-learning models. This article is dedicated to exploring Tensors, and by the end of it, we aim to make you ready to work with them.
What is a Tensor?
Tensors are multidimensional arrays with a uniform type (see the supported types/dtypes). They form the fundamental building block for data representation and manipulation in TensorFlow. Tensors can be scalars, vectors, matrices, or higher-dimensional arrays that hold the numerical data needed to represent input data, model parameters, and output predictions in TensorFlow-based machine learning models.
Tensors resemble NumPy arrays. However, Tensors are immutable, which means that once we have created a Tensor, we cannot modify it. This feature ensures consistency and avoids unintended side effects during the construction and execution of machine-learning models.
How to create tensors (functions to create various Tensor objects)
TensorFlow offers us several functions and methods for creating tensors. Most of the tensors we will create are also called dense tensors since they have fixed shapes along all dimensions. We will also look at some special types of tensors.
We will look at:
- tf.constant, tf.Variable, tf.zeros, and tf.ones
- Creating tensors from NumPy arrays (tf.convert_to_tensor)
- Functions to create random tensors
- Ragged (tf.ragged.constant) and Sparse (tf.SparseTensor) tensors (special tensors)
How to create tensors with custom values with tf.constant()
tf.constant() is TensorFlow's most basic method for creating tensors. This function is vital as it allows us to create tensors with constant values. Tensors created with this function are immutable.
Before exploring other functions, we will use this function to explain most tensor concepts like ranks, shape, and dtypes.
tf.constant(value, dtype=None, shape=None, name='Const')
'''
value: A constant value or list of n dimensions to define the tensor
dtype: The type of the elements in the output tensor: Optional: Inferred
from the value if not specified
shape: The intended dimensions of the resulting tensor: Optional: If
specified, the value is reshaped to match
name: Name of the tensor: Optional
'''
Example 1:
rank_0_tensor = tf.constant(4)
print(rank_0_tensor)
Output:
Notice that we named the example tensor above "rank_0_tensor". That means the resulting tensor is a scalar with a single value and zero dimensions. We can check the number of dimensions with the Tensor.ndim attribute.
print(f"rank_0_tensor has {rank_0_tensor.ndim} dimensions")
Output:
We can create tensors of n-dimensions. A vector, for instance, will have 1 dimension (rank 1 tensor), a matrix will have 2 dimensions(rank 2 tensor), while a higher dimensional tensor will have n-dimensions(rank n tensor).
Example 2: Creating a rank 1 tensor(vector) - A list of values:
rank_1_tensor = tf.constant([20, 100])
print(rank_1_tensor)
print(f"\nTensor rank: {tf.rank(rank_1_tensor)}")
print(f"rank_1_tensor has {rank_1_tensor.ndim} dimension")
Output:
Example 3: Creating a rank 2 tensor(matrix) - A list of lists:
rank_2_tensor = tf.constant([[20, 10],
[15, 30],
[45, 35]])
print(rank_2_tensor)
print(f"\nTensor rank: {tf.rank(rank_2_tensor)}")
print(f"rank_2_tensor has {rank_2_tensor.ndim} dimensions")
Output:
Example 4: Creating a rank 3 tensor(n-dimensional):
rank_3_tensor = tf.constant([
[[0, 1, 2],
[3, 4, 5]],
[[6, 7, 8],
[9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],])
print(rank_3_tensor)
print(f"\nTensor rank: {tf.rank(rank_3_tensor)}")
print(f"rank_3_tensor has {rank_3_tensor.ndim} dimensions")
Output:
Aside from the rank and the dimensions, tensors have two other very important attributes: shape and dtype.
- shape: returns the size of the tensor along each of its axes or dimensions.
- dtype: returns the type of all the elements in the tensor.
Tensor shape
As you may have noticed, when we print a tensor, the result returns its value, shape, and dtype. For instance, looking at the rank_3_tensor, the result is represented as below:
Sometimes, we may want to know or retrieve a tensor's shape after computation for operations like:
- Reshaping
- Slicing and indexing
- Model debugging
We can retrieve the shape using the tf.shape function:
# retrieve the shape of a tensor
print(f"Rank 0 tensor shape: {tf.shape(rank_0_tensor)}")
print(f"Rank 1 tensor shape: {tf.shape(rank_1_tensor)}")
print(f"Rank 2 tensor shape: {tf.shape(rank_2_tensor)}")
print(f"Rank 3 tensor shape: {tf.shape(rank_3_tensor)}")
Output:
We can also check the shape with Tensor.shape. However, this does not return a tensor.
print(rank_2_tensor.shape)
# (3, 2)
Returning a tensor shape as a tensor may have the following benefits:
- When performing operations based on the shape of a tensor, having the shape as a tensor allows us to use it in computations.
- We can dynamically reshape a tensor using its runtime shape.
- We can query and validate the shape of the output when creating functions or layers that operate on tensors.
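For the first two points, here is a minimal sketch (reusing the rank_2_tensor defined above) that feeds the shape returned by tf.shape back into later computations:
# Capture the dynamic shape as a tensor so it can feed other ops
original_shape = tf.shape(rank_2_tensor)        # [3 2]
num_elements = tf.reduce_prod(original_shape)   # 3 * 2 = 6
# Flatten, then restore the original shape using the shape tensor itself
flattened = tf.reshape(rank_2_tensor, [-1])
restored = tf.reshape(flattened, original_shape)
print("Total elements:", num_elements.numpy())  # 6
print("Restored shape:", restored.shape)        # (3, 2)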
Tensor dtype
A tensor can have any dtype (data type) listed in tf.dtypes. However, remember that a tensor must have a specific data type, and all elements within that tensor must conform to that data type. Therefore, a single tensor cannot have two different data types. For instance, we cannot create a tensor with integer and floating-point elements. If you need to work with different data types, you would typically create separate tensors for each type.
We can specify the tensor dtype while creating it:
# Float tensor
float32_tensor = tf.constant([20.5, 30.0, 4.3], dtype='float32')
# Integer tensor
int64_tensor = tf.constant([[1, 2, 3], [4, 5, 6]], dtype='int64')
# String tensor
string_tensor = tf.constant(["TensorFlow", "tensors", "dtypes"])
print(float32_tensor)
print(int64_tensor)
print(string_tensor)
Output:
Aside: The b'...' notation on string tensors indicates that they are byte strings. As you will also learn later, especially when dealing with Natural Language Processing (NLP) tasks, string tensors can have elements of variable lengths, unlike numeric tensors.
While we cannot have tensors with elements of different types, TensorFlow provides tf.cast so that we can convert tensors between various data types:
float32_tensor_as_int32_tensor = tf.cast(float32_tensor, dtype='int32')
int64_tensor_as_float16_tensor = tf.cast(int64_tensor, dtype='float16')
print(float32_tensor_as_int32_tensor)
print(int64_tensor_as_float16_tensor)
Output:
How to initialize tensors with specific values (tf.zeros and tf.ones)
tf.zeros and tf.ones are commonly used in deep learning (especially when building neural network models) to initialize certain tensors to specific values. A tensor initialized with tf.zeros will contain only zeros, while one initialized with tf.ones will contain only ones.
Both functions share the signature below:
tf.zeros/tf.ones(
shape,
dtype=tf.dtypes.float32,
name=None,
layout=None
)
'''
shape: a list or tuple of integers or a 1D tensor
dtype: dtype of the elements
'''
Initializing a tensor with ones
# Initialize tensors with ones
ones_tensor1 = tf.ones(shape=(2), dtype='float32') # rank 1 tensor of ones
ones_tensor2 = tf.ones(shape=[2, 3], dtype='int32') # rank 2 tensor of ones
tensor_1d = tf.constant(value = [1, 2])
ones_tensor3 = tf.ones(shape=tensor_1d) # takes shape as the value of the 1d tensor
print(ones_tensor1)
print(ones_tensor2)
print(ones_tensor3) # 1 row and two columns
Output:
Initializing a tensor with zeros
# Initialize tensors with zeros
zeros_tensor1 = tf.zeros(shape=(3), dtype='float32') # rank 1 tensor of zeros
zeros_tensor2 = tf.zeros(shape=[2, 3, 3], dtype='int32') # rank 3 tensor of zeros
tensor_1d = tf.constant(value = [3, 2])
zeros_tensor3 = tf.zeros(shape=tensor_1d) # takes shape as the value of the 1d tensor
print(zeros_tensor1)
print(zeros_tensor2)
print(zeros_tensor3) # 3 rows and 2 columns
Output:
When building a model with TensorFlow, tf.zeros and tf.ones can come in handy, for instance, when initializing a model's weights, biases, variables (with tf.Variable), or other parameters. For example, we can build a sample model where we initialize the weights and biases with tf.zeros to give you a taste of building a model:
Do not worry if you do not understand some of the things below. This is meant to break the monotony so far and get you excited for what's ahead.
Example use case of using tf.zeros to initialize weights and biases
import numpy as np

# Dummy data for training
np.random.seed(42)
x_train = np.random.random((100, 10))  # 100 samples with 10 features each
y_train = np.random.randint(2, size=(100,))  # Integer labels

# Define the model
class SimpleModel(tf.keras.Model):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleModel, self).__init__()
        # Initialize weights and biases with tf.zeros
        self.weights_hidden = tf.Variable(
            tf.zeros(shape=(input_size, hidden_size)))
        self.biases_hidden = tf.Variable(
            tf.zeros(shape=(hidden_size,)))
        self.weights_output = tf.Variable(
            tf.zeros(shape=(hidden_size, output_size)))
        self.biases_output = tf.Variable(
            tf.zeros(shape=(output_size,)))

    def call(self, inputs):
        # Forward pass
        hidden_layer = tf.matmul(inputs, self.weights_hidden) + self.biases_hidden
        output_layer = tf.matmul(hidden_layer, self.weights_output) + self.biases_output
        return output_layer

# Instantiate the model
input_size = 10
hidden_size = 5
output_size = 2
model_zeros = SimpleModel(input_size, hidden_size, output_size)

# Compile the model
model_zeros.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

# Train the model
model_zeros.fit(x_train, y_train, epochs=10, batch_size=32)

# Evaluate the model
loss, accuracy = model_zeros.evaluate(x_train, y_train)
print(f"Final Training Loss: {loss:.4f}, Accuracy: {accuracy:.4f}")
Output:
As much as the model may work when the weights are initialized with tf.zeros or tf.ones, it is not the best practice for most deep-learning models. If all the weights are initialized to the same value, each neuron in a layer will learn the same features, and the model won't effectively capture the complexity of the data. Biases, however, may be initialized that way since they do not suffer from the symmetry issue that weights do.
Instead, we can randomly initialize the weights or use higher-level APIs, such as tf.keras.layers.Dense, that come with built-in weight initializers like Glorot initialization.
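As a rough sketch of that alternative (reusing the x_train and y_train arrays from the example above), we can let tf.keras.layers.Dense create and initialize its own weights, which defaults to Glorot uniform for the kernel:
# Equivalent-sized model where the Dense layers own their initialization
model_random = tf.keras.Sequential([
    tf.keras.layers.Dense(5, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(2)  # output logits
])
# We could also pass an initializer explicitly
custom_layer = tf.keras.layers.Dense(
    5, kernel_initializer=tf.keras.initializers.GlorotUniform(seed=42),
    bias_initializer='zeros')
model_random.compile(optimizer='adam',
                     loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                     metrics=['accuracy'])
model_random.fit(x_train, y_train, epochs=10, batch_size=32, verbose=0)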
I hope you liked the small exercise. Just take that lightly for now, at least!
Creating TensorFlow variables (mutable tensors) with tf.Variable()
So far, we have explored how to create tensors with the tf.constant function. However, we mentioned that tensors created that way are immutable - that is, we cannot modify their elements. To prove this point, let's see some examples below:
Suppose we had the tensor:
# tensor with tf.constant()
immutable_tensor = tf.constant([20, 30, 40]) # vector
print(immutable_tensor)
Output:
Let's try to modify the first element of the vector through indexing and assignment like we would do on NumPy arrays:
# Try to change the first element
immutable_tensor[0] = 100 # index first element(20) and assign new value
print(immutable_tensor)
Output:
TensorFlow variables - mutable tensors - are recommended to represent shared, persistent state your program manipulates (as defined here). They are created using the tf.Variable class. The class has some of the following use cases when building machine learning models:
- Creating trainable parameters like weights and biases. As evident in this example above:
self.weights_hidden = tf.Variable(
    tf.zeros(shape=(input_size, hidden_size)))
self.biases_hidden = tf.Variable(tf.zeros(shape=(hidden_size,)))
- Allowing us to specify initial values, like random or zero initialization. This is also evident in the example above.
- Useful in the automatic differentiation system of TensorFlow.
- Variables can be shared between different parts of a model or even between other models, enabling the reuse of learned representations.
Let's create our first variable: The Variable() constructor requires an initial value, which can be a tensor of any type and shape. This initial value defines the type and shape of the variable. After construction, the type and shape of the variable are fixed.
example_variable = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
print(example_variable)
print(f"Shape: {tf.shape(example_variable)}")
print(f"Rank: {tf.rank(example_variable)}")
Output:
Variables can hold any type, just like tensors:
bool_variable = tf.Variable([False, True, True, False, False])
int32_variable = tf.Variable([20, 40, 15], dtype='int32')
print(bool_variable)
print(int32_variable)
Output:
How to modify TensorFlow variables using assign methods
We can mutate a variable tensor using the assign method.
Using assign() to reassign a tensor to a variable tensor
Since variables are backed by tensors, we can modify or re-assign a tensor to an existing variable using tf.Variable.assign. We call the assign method on the variable and pass it a new tensor.
int32_variable = tf.Variable([20, 40, 15], dtype='int32')
print(f"Old variable: {int32_variable.numpy()}")
# modify the variable elements
int32_variable = int32_variable.assign([12, 15, 18])
print(f"Mod variable: {int32_variable.numpy()}")
Output:
We cannot assign a tensor with a different shape to the existing variable!
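A quick sketch of that restriction, using the same try/except pattern as elsewhere in this article:
shape_check_variable = tf.Variable([20, 40, 15], dtype='int32')
try:
    shape_check_variable.assign([1, 2, 3, 4])  # 4 elements vs. the variable's 3
except Exception as e:
    print(f"{type(e).__name__}: {e}")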
Updating a variable's value with the assign_add() method (counter/incrementing):
We can add a specific value to the current value of an existing variable (increment) with tf.Variable.assign_add. The method is particularly useful for implementing counters or variables that need to be incremented during execution, for instance, counting training steps (as we shall see later).
example_variable1 = tf.Variable(10)
example_variable2 = tf.Variable([10,30])
print(f"Current variable1 value: {example_variable1.numpy()}")
print(f"Current variable2 values: {example_variable2.numpy()}")
# increment values in the variables
example_variable1 = example_variable1.assign_add(5) # add 5 to current value
example_variable2 = example_variable2.assign_add([200, 100]) # add 200 & 100 to current values
print(f"Updated variable1 value: {example_variable1.numpy()}")
print(f"Updated variable2 values: {example_variable2.numpy()}")
Output:
The shapes of the variable tensors must match for that to work!
Notice that we mentioned that one of the major use cases of variables is to store trainable parameters. They can also hold gradients computed during backpropagation on a neural network model. However, not all model variables (for instance, counters or constant values) need to be trainable or have gradients. They can be part of the model but don't need to be updated through optimization.
We can specify whether a variable should be updated during the training process, or have its gradients computed during backpropagation, by setting trainable=False on the variable. For instance:
# set a non-trainable variable to count training steps
train_step = tf.Variable(initial_value=0, trainable=False)
From the code above, we could use the non-trainable train step counter in a loop as below. Don't be intimidated by what you see. This will all make sense as you advance:
# SAMPLE MODEL HERE
...
# set a non-trainable variable to count training steps
train_step = tf.Variable(initial_value=0, trainable=False)
...
# Example training loop
for epoch in range(5):  # Run for 5 epochs
    for step in range(len(x_train)):
        # Forward pass
        with tf.GradientTape() as tape:
            pass
        # Perform backward pass and optimization here
        ...
        # Update the train step variable here
        train_step.assign_add(1)
    print(f"Epoch {epoch + 1}, train step: {train_step.numpy()}")

# Show the final training step value
print("Final Global Step:", train_step.numpy())
Output:
You see, the train_step variable is incremented at each training step. Since we have set trainable=False on it, it will not affect any of the model's weights or gradients.
Creating tensors from NumPy Arrays
We can easily create tensors from NumPy arrays using tf.convert_to_tensor or by calling tf.constant on the particular NumPy array. Let's see how:
np_array = np.arange(0, 12).reshape(3, 4)
print(np_array, type(np_array))
# convert to tensor
np_array_to_tensor = tf.convert_to_tensor(np_array)
print()
print(np_array_to_tensor)
# Alternatively, we can call tf.constant on the array
np_array_to_tensor = tf.constant(np_array, dtype='float32')
print()
print(np_array_to_tensor)
Output:
TensorFlow also has a graph execution mode, which constructs a computational graph for optimization, resulting in speedier execution for certain tensor operations.
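As a brief, hedged illustration of graph mode (reusing np_array_to_tensor from the snippet above), we can wrap a computation in tf.function so TensorFlow traces it into a graph before running it:
@tf.function  # traces the Python function into a TensorFlow graph
def scaled_sum(x):
    return tf.reduce_sum(x * 2.0)

print(scaled_sum(np_array_to_tensor))  # runs as a traced graph
# tf.Tensor(132.0, shape=(), dtype=float32)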
Creating random tensors
There are various ways of generating random tensors. We will explore the following:
- tf.random.normal
- tf.random.uniform
- tf.random.shuffle
- tf.random.set_seed
How to create a random tensor with tf.random.normal
tf.random.normal creates tensors with random values drawn from a normal (Gaussian) distribution. A normal distribution is described by two parameters: a mean and a standard deviation. That means we can set the two parameters for the random tensor.
The syntax:
tf.random.normal(
shape,
mean=0.0,
stddev=1.0,
dtype=tf.dtypes.float32,
seed=None,
name=None
)
'''
mean: optional and default is 0
stddev: Standard deviation - optional and default is 1
'''
Example:
# generate a 3x2 tensor containing random values
# sampled from a normal distribution with a mean of 0.0
# and a standard deviation of 1.0
random_normal_tensor = tf.random.normal(shape=(3, 2),
mean=0, stddev=1.0,
dtype='float32')
print(random_normal_tensor)
Output:
Note that your result will differ every time you run the above code. In the next section, we will learn how to produce the same random tensor each time(random seed).
How to create a random tensor with tf.random.uniform
The tf.random.uniform function creates tensors with random values from a uniform distribution. A uniform distribution is a probability distribution where all values in the range have an equal probability of being sampled. That means that every value in the specified interval has the same likelihood of being chosen. So, we can generate a random tensor with a set range (minval, maxval).
The syntax:
tf.random.uniform(
shape,
minval=0,
maxval=None,
dtype=tf.dtypes.float32,
seed=None,
name=None
)
'''
minval: optional, and default is 0. The minimum value of the distribution.
maxval: Optional, and default is 1 - The maximum value of the distribution.
'''
Example:
# generate a 2x3x2 tensor containing random values
# sampled from a uniform distribution with a min_value of 0.0
# and a max_value of 1.5
random_uniform_tensor = tf.random.uniform(shape=[2, 3, 2],
                                          minval=0,
                                          maxval=1.5)
print(random_uniform_tensor)
Output:
Note that your result will differ every time you run the above code. In the next section, we will learn how to produce the same random tensor each time(random seed).
How to shuffle a tensor with tf.random.shuffle
It is a common practice in machine learning to shuffle data, especially while training a model, to ensure randomness in the data. The data could be presented as tensors, and thus, TensorFlow has a function to help with the shuffling.
The tf.random.shuffle function is used to randomly shuffle the elements along the first dimension of a tensor. For a 2D or rank 2 tensor, this means shuffling the rows.
Example:
initial_tensor_data = tf.constant([[10, 15], [20, 30], [40, 50]])
print("Original:")
print(initial_tensor_data.numpy())
# shuffle the elements
shuffled_initial_tensor_data = tf.random.shuffle(initial_tensor_data)
print("\nShuffled:")
print(shuffled_initial_tensor_data.numpy())
Output:
Note that your shuffle result will differ every time you run the above code. In the next section, we will learn how to produce the same random tensor each time(random seed).
As you advance, you may encounter shuffling tensors in some use cases like:
- Data augmentation in image classification tasks to increase the diversity/randomness of the training dataset.
- In cross-validation, before splitting the data into folds.
- In Natural Language Processing.
How to set random seed while generating random tensors for tensor reproducibility
In the examples above, we have noticed that the code we have written generates new random tensor elements each time we run it. While building models, we often need consistent results, where the same sequence of random numbers is generated each time we run the model.
Tensor reproducibility can come in handy in cases like when initializing weights, shuffling datasets, or in cases of data augmentation and other tasks that require randomized tensors.
TensorFlow has two ways we can set the random seed for random tensor generation:
- Global-level random seed setting
- Operation-level random seed setting
Global-level random seeds
We set the global seed with tf.random.set_seed(seed=integer_value). A global seed is shared across all TensorFlow operations in the script, which means that any random operation will be affected by it. Typically, we set the global seed at the beginning of the script or notebook.
Example:
# set the global random seed
tf.random.set_seed(42)
random_tensor = tf.random.uniform(shape=[1, 3, 2])
print(random_tensor)
Output:
Notice that the same tensor elements are reproduced each time you run the code. If you try to generate the same random tensor with the same seed value (42) in another notebook cell, the result is similar.
However, the results will differ if you change the seed value to another integer value(say 123). For example:
# set a different global random seed value
tf.random.set_seed(123)
random_tensor = tf.random.uniform(shape=[1, 3, 2])
print(random_tensor)
Output:
Operation-level random seeds (more flexible)
Operation-level random seeds are currently the most recommended way to ensure the reproducibility of random tensors. We set the seed with the tf.random.Generator class.
Each random tensor generated with this class has its own random seed, thus unique randomness. This can be helpful when we want different parts of the code to have independent random sequences.
Example:
# Create two instances of tf.random.Generator
# with different seeds
random_generator1 = tf.random.Generator.from_seed(42)
random_generator2 = tf.random.Generator.from_seed(123)
# use the generators for random tensor generation
random_tensor1 = random_generator1.normal(shape=[4, 2])
random_tensor2 = random_generator2.uniform(shape=[1, 3, 3])
print(random_tensor1)
print(random_tensor2)
Output:
You can try shuffling a tensor with tf.random.shuffle while the random seed is set and observe the behavior!
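If you want a starting point for that exercise, here is a rough sketch; how the shuffles repeat depends on how the global and operation-level seeds interact:
tf.random.set_seed(42)  # global seed
data = tf.constant([[10, 15], [20, 30], [40, 50]])
print(tf.random.shuffle(data).numpy())          # shuffle 1
print(tf.random.shuffle(data).numpy())          # shuffle 2: compare with shuffle 1
print(tf.random.shuffle(data, seed=7).numpy())  # with an operation-level seed as well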
Special types of tensors
These tensors differ from dense tensors: they have special characteristics and are suited to specific use cases in deep learning. We will look at two of them:
- Ragged tensors
- Sparse tensors
Ragged tensors
A ragged tensor is mainly used to represent sequences of variable lengths. While dense tensors have dimensions of fixed sizes(or uniform), dimensions in ragged tensors vary in size(not uniform). They can be helpful in tasks like NLP, where sentences are sequences with different numbers of words.
A ragged tensor would be represented like:
We create a ragged tensor using tf.ragged.constant:
# sample array of varying dimension sizes
ragged_array = [
[1.5, 3.0, 2.3],
[4.5, 0.5],
[0.8]]
# we can not convert it to a dense tensor
try:
tensor = tf.constant(ragged_array)
except Exception as e:
print(f"{type(e).__name__}: {e}")
# instead we can convert it to a ragged tensor
ragged_tensor = tf.ragged.constant(ragged_array)
print("\nConverted to ragged tensor:")
print(ragged_tensor)
Output:
Notice that the ragged tensor is not printed like normal tensors. Instead, it is encoded such that its variable-length rows are concatenated into a flattened list. The flattened list has row partitions that indicate the row divisions:
That encoding gives us more ways in which we can construct ragged tensors. We can pair flat value tensors with row-partitioning tensors, indicating how those values should be divided into rows. We can use the following methods:
- value_rowids partitioning tensor: tf.RaggedTensor.from_value_rowids
- row_lengths partitioning tensor: tf.RaggedTensor.from_row_lengths
- row_splits partitioning tensor: tf.RaggedTensor.from_row_splits
Constructing a ragged tensor using the value_rowids partitioning tensor
You can create a tensor if you know which row each value belongs to. Creating a tensor this way is handy when you have a set of values and a corresponding row assignment for each value and you want to create a ragged tensor where each row represents a group of values assigned to the same row.
It is also an efficient way of storing ragged tensors with many empty rows since the size of the tensor depends only on the total number of values.
Example:
# tf.RaggedTensor.from_value_rowids
values = tf.constant([20, 30, 40, 50, 60, 70]) # must be a vector
row_ids = tf.constant([0, 0, 0, 1, 1, 2]) # integer vector specifying the
# row index for each value
ragged_tensor = tf.RaggedTensor.from_value_rowids(values = values,
value_rowids = row_ids)
print("Values:" ,values.numpy())
print("Row ids:" ,row_ids.numpy())
print("Ragged tensor from value_row_ids:\n" , ragged_tensor)
Output:
Constructing a ragged tensor using the row_lengths partitioning tensor
You can create a tensor if you know how long each row is. Creating a tensor this way is handy when concatenating ragged tensors since row lengths do not change when two tensors are concatenated together.
Example:
# tf.RaggedTensor.from_row_lengths
values = tf.constant([20, 30, 40, 50, 60, 70]) # must be a vector
row_lengths = tf.constant([3, 2, 1]) # integer vector specifying the length of each row
# ragged tensor
ragged_tensor2 = tf.RaggedTensor.from_row_lengths(values = values,
row_lengths = row_lengths)
print("Values:" ,values.numpy())
print("Row lengths:" ,row_lengths.numpy())
print("Ragged tensor from row lengths:\n" , ragged_tensor2)
Output:
Constructing a ragged tensor using the row_splits partitioning tensor
You can create a tensor if you know the index where each row starts and ends. The row_splits enable quick indexing and slicing into ragged tensors since TensorFlow can quickly determine each row's starting and ending indices.
Example:
# tf.RaggedTensor.from_row_splits
values = tf.constant([20, 30, 40, 50, 60, 70]) # must be a vector
row_splits = tf.constant([0, 3, 5, 6]) # integer vector specifying
# the split points between rows
ragged_tensor3 = tf.RaggedTensor.from_row_splits(values = values,
row_splits = row_splits)
print("Values:" ,values.numpy())
print("Row splits:" ,row_lengths.numpy())
print("Ragged tensor from row_splits:\n" , ragged_tensor3)
Output:
The dimensions and shape of a ragged tensor
The outermost dimension of a ragged tensor is always uniform (it has the same length) since it consists of a single slice (ragged_tensor3.shape[0]). The remaining dimensions can be ragged or uniform.
We can view the shape of a tensor with the shape attribute:
# shape of the ragged_tensor3 above
print(ragged_tensor3.shape)
# the outer dimension is ragged_tensor3.shape[0]
# Returns
# (3, None)
The above code gives the static shape of the ragged tensor. The outer dimension is indicated by 3, representing the total number of rows. The ragged dimension is always represented by None, which indicates the rows have varying lengths.
Viewing the dynamic shape with tf.shape gives more details about the lengths of the ragged tensor dimensions:
# dynamic shape of the ragged_tensor3 above
print(tf.shape(ragged_tensor3))
# Returns
# <DynamicRaggedShape lengths=[3, (3, 2, 1)] num_row_partitions=1>
The results show that the ragged tensor has 3 rows with lengths 3, 2, and 1.
How do we describe the shape of a ragged tensor? For instance, for a ragged tensor that will store the word embeddings for each word in a batch of sentences?
When describing a ragged tensor, we enclose the ragged dimensions in parentheses. For example, we can write [num_sentences, (num_words), embedding_size]. That conveys that the size of those dimensions can vary across different rows.
Sparse tensors
Sparse tensors are tensors that contain a lot of zero values. When you have tensors with many zero values, storing them in a sparse tensor improves space and time on computations. These tensors are common in areas like NLP for data preprocessing and computer vision.
We construct a sparse tensor by supplying the following components to tf.sparse.SparseTensor:
- indices: A 2-D int64 tensor of shape [N, rank] (number of values, number of dimensions) that specifies the indices of the elements in the sparse tensor that contain nonzero values.
- values: A 1-D tensor of any type and shape [N] with all the nonzero values of the tensor.
- dense_shape: A 1-D int64 tensor of shape [rank], specifying the dense shape of the sparse tensor.
Example:
# indices of non-zero values
indices = tf.constant([[0, 1], [1, 2], [2, 0]], dtype=tf.int64)
# the nonzero values in the tensor
values = tf.constant([15, 25, 35], dtype=tf.float32)
# define the tensor's shape
shape = tf.constant([4,3], dtype=tf.int64)
sparse_tensor = tf.sparse.SparseTensor(indices = indices,
values = values,
dense_shape= shape)
'''Results
Printing the result will return the components: Visualize below.
'''
The sparse tensor has 4 rows and 3 columns (shape [4, 3]). We can visualize the tensor it represents by converting the sparse tensor to a dense tensor with tf.sparse.to_dense:
# convert sparse tensor to dense tensor
sparse_to_dense_tensor = tf.sparse.to_dense(sparse_tensor)
print(sparse_to_dense_tensor)
''' Results
tf.Tensor(
[[ 0. 15. 0.]
[ 0. 0. 25.]
[35. 0. 0.]
[ 0. 0. 0.]], shape=(4, 3), dtype=float32)
'''
To convert the dense tensor back to sparse, use tf.sparse.from_dense.
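For completeness, a minimal round trip back to sparse form, reusing the dense tensor from above:
# convert the dense tensor back to a sparse tensor
back_to_sparse = tf.sparse.from_dense(sparse_to_dense_tensor)
print(back_to_sparse.indices.numpy())  # positions of the nonzero values
print(back_to_sparse.values.numpy())   # [15. 25. 35.]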
Indexing and slicing tensors
Indexing tensors follows the basic Python and NumPy indexing rules, which include:
- Indexes start at 0
- Negative indices count backward from the end
- Colons, :, are used for slices: start:stop:step
Example single index indexing and slicing:
tensor = tf.constant([20, 30, 40, 50, 15, 45, 100, 120])
print(f"Tensor: {tensor.numpy()}")
# return everything
print(f"Return everything: {tensor[:]}")
print(f"First 3 elements: {tensor[:3]}")
print(f"All elements after first 3 elements: {tensor[3:]}")
print(f"Every other item: {tensor[::2]}")
print(f"Elements between fourth and before 7th element: {tensor[3:6]}")
print(f"Reversing: {tensor[::-1]}")
Output:
We index higher dimensional tensors by passing multiple indices.
rank_2tensor = tf.constant([[20, 30, 40], [50, 15, 45], [100, 120, 150]])
print(f"Tensor:\n{rank_2tensor.numpy()}")
print(f"Second row:, {rank_2tensor[1, :].numpy()}")
print(f"Second column:, {rank_2tensor[:, 1].numpy()}")
print(f"Last row:, {rank_2tensor[-1, :].numpy()}")
print(f"First item in last column:, {rank_2tensor[0, -1].numpy()}")
print(f"Second row onwards:\n {rank_2tensor[1:, :].numpy()}")
Output:
Slicing tensors with tf.slice:
tf.slice takes begin and size parameters. begin specifies the start index for the slicing, while size specifies the number of elements to slice.
Example slicing rank 1 tensor:
tensor = tf.constant([20, 30, 40, 50, 15, 25, 60])
print(f"Tensor:\n{rank_2tensor.numpy()}")
# slice with tf.slice
begin = [2] # begin at index 2
size = [3] # number of elements to slice starting from begin index
t_slice = tf.slice(tensor, begin = begin, size=size)
print(f"\nSlice: {t_slice.numpy()}") # similar to tensor[2:5]
Output:
Example 2 slicing rank 3 tensor:
r3_tensor = tf.constant([
[[20, 30, 40, 15],
[50, 15, 25, 60]],
[[5, 16, 21, 17],
[9, 11, 35, 13]]])
r3_slice = tf.slice(r3_tensor, begin=[1, 1, 1], size=[1, 1, 2])
print(f"Tensor: {r3_tensor.numpy()}")
print(f"\nr3_slice: {r3_slice.numpy()}")
'''
Tensor:
[[[20 30 40 15]
[50 15 25 60]]
[[ 5 16 21 17]
[ 9 11 35 13]]]
r3_slice: [[[11 35]]]
'''
Slicing tensors with tf.gather:
tf.gather extracts specific indices from a single axis/dimension of a tensor. The indices must be an integer tensor of any dimension, but they are primarily 1D.
Example 1. Gather elements from rank 1 tensor:
print(f"r1_tTensor: {tensor.numpy()}")
print(f"Gathered: {tf.gather(tensor, indices=[1, 5])}") # take index 1 and 5
'''
r1_tTensor: [20 30 40 50 15 25 60]
Gathered: [30 25]
'''
Example 2. Using batch_dims:
tensor_2d = tf.constant([
[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15]])
indices = tf.constant([
[2, 4],
[0, 4],
[1, 3]])
print(tensor_2d.numpy())
print(tf.gather(tensor_2d, indices=indices, batch_dims=1, axis=1).numpy())
Output:
batch_dims helps gather different items from each batch element along a specified axis by looping over the first axis of the tensor and the indices.
Note that ragged tensors can also be indexed. However, we cannot use a single index on a ragged dimension, since a value may exist in some rows but not in others.
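A quick sketch of that behavior with a small ragged tensor:
rt = tf.ragged.constant([[1, 2, 3], [4, 5]])
print(rt[1].numpy())  # indexing the uniform outer dimension works: [4 5]
print(rt[:, :2])      # slicing a ragged dimension is allowed
try:
    print(rt[:, 1])   # a single index into the ragged dimension is not
except Exception as e:
    print(f"{type(e).__name__}: {e}")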
Operations for shaping and manipulating tensors
Tensors need to meet specific requirements for various machine-learning models. Knowing how to manipulate them into a particular structure or shape lets us handle diverse data formats and ensures their compatibility throughout the model development process. In this section, we will look at the following operations for reshaping and manipulating tensors:
- Reshaping: tf.reshape
- Swapping dimensions: tf.transpose
- Reducing dimensions: tf.squeeze
- Expanding dimensions: tf.expand_dims
- Joining tensors: tf.concat
How to reshape tensors with tf.reshape
Reshaping tensors is a critical concept employed in the preprocessing of data or in situations where the shape of a tensor needs to be adjusted to meet the requirements of a particular operation or model (for instance, when preparing the input data for a neural network model).
tf.reshape enables us to reshape a tensor without altering its data. It does not change the order or total number of elements in the tensor.
Example 1:
tensor = tf.constant([[1, 2, 3, 4],
[5, 6, 7, 8]])
print(f"Tensor:\n{tensor.numpy()} \nOld shape: {tensor.shape}")
# reshape
tensor = tf.reshape(tensor, [2, 2, 2])
print(f"Reshaped:\n{tensor.numpy()} \nNew shape: {tensor.shape}")
Output:
Example 2:
var_tensor = tf.Variable([[[1, 2, 3, 4],
[5, 6, 7, 8]],
[[9, 10, 11, 12],
[13, 14, 15, 16]]])
print(f"Tensor:\n{var_tensor.numpy()} \nOld shape: {var_tensor.shape}")
# reshape
var_tensor = tf.reshape(var_tensor, [4, 4])
print(f"Reshaped:\n{var_tensor.numpy()} \nNew shape: {var_tensor.shape}")
Output:
We can flatten a tensor (to rank 1/1D) by specifying -1 as the shape.
# flatten the variable tensor
print(f"Tensor:\n{var_tensor.numpy()}")
var_tensor = tf.reshape(var_tensor, [-1])
print(f"Flattened:\n{var_tensor.numpy()} \nNew shape: {var_tensor.shape}")
Output:
How to reshape tensors with tf.transpose
Transposing a tensor means permuting its dimensions. For a rank 2 tensor, this means swapping its rows and columns. We achieve this using tf.transpose.
Example 1:
var_tensor = tf.reshape(var_tensor, [2, 2, 4])
print(f"Tensor:\n{var_tensor.numpy()} shape before: {tf.shape(var_tensor)}")
# transpose
var_tensor = tf.transpose(var_tensor) # row => columns, columns => rows
print(f"Transposed:\n{var_tensor.numpy()}\
shape after: {tf.shape(var_tensor)}")
Output:
Notice that the rows become the columns and vice versa.
How to squeeze tensors (removing dimensions of size 1) with tf.squeeze
In certain cases where we have tensors with singleton dimensions, for instance, batches of size one, we may want to remove them to have a more concise representation of the data. Squeezing does not change the data in the tensor but only modifies its shape. We can achieve that with tf.squeeze.
Example 1:
tensor = tf.constant([[[10], [15], [30]]])
print(f"Tensor:\n{tensor} => shape: {tf.shape(tensor)}")
# squeeze
tensor = tf.squeeze(tensor)
print(f"\nSqueezed: {tensor.numpy()} => New shape: {tf.shape(tensor)}")
Output:
Example 2: Specifying the axis if you do not want to remove all size 1 dimensions
tensor3 = tf.constant([[[10]],
[[11]],
[[9]]])
print(f"Tensor:\n{tensor3} => shape: {tf.shape(tensor3)}")
# specify the axis(squeeze axis 2)
tensor3 = tf.squeeze(tensor3, axis=[2])
print("\nSqueezed:")
print(f"{tensor3.numpy()} => New shape: {tf.shape(tensor3)}")
Output:
Be aware that you must specify the axis when squeezing a ragged tensor!
How to add dimensions to tensors (adding dimensions of size 1) with tf.expand_dims
Expanding dimensions involves adding size 1 dimensions to a tensor. That increases the rank of the tensor by one. It is the opposite of squeezing a tensor. Using tf.expand_dims, we can specify the axis at which to add the dimension.
Expanding dimensions is a common practice, for instance when:
- Adding an outer "batch" dimension to a tensor, for instance, a tensor of shape (height, width, channels) storing image data.
- Broadcasting for arithmetic operations with tensors of different shapes.
For example, we can add an outer batch to a tensor:
tensor = tf.constant([10, 20, 30])
print(f"Tensor:\n{tensor} => shape: {tf.shape(tensor)}")
# expand dimensions
tensor_expanded = tf.expand_dims(tensor, axis=0)
print("Tensor expanded:")
print(f"{tensor_expanded} => shape: {tf.shape(tensor_expanded)}")
Output:
Specifying a negative axis will add an innermost dimension:
tensor_expanded = tf.expand_dims(tensor, axis=-1) # innermost ndim
print("Tensor expanded(axis=-1):")
print(f"{tensor_expanded} => shape: {tf.shape(tensor_expanded)}")
Output:
How to concatenate tensors with tf.concat
We can join two tensors along a particular dimension. For this to work, the tensors must have the same number of dimensions, and their shapes must match in all dimensions except the one along which they are joined. tf.concat helps us achieve that.
Example:
tensor_1 = tf.constant([[1, 2, 3],
[4, 5, 6]])
tensor_2 = tf.constant([[7, 8, 9],
[10, 11, 12]])
print(f"Tensor 1:\n{tensor_1}")
print(f"Tensor 2:\n{tensor_2}")
# concatenate along axis=0
tensor1_tensor2 = tf.concat([tensor_1, tensor_2], axis=0)
print("tensor_1 and tensor_2 joined(axis=0):")
print(f"{tensor1_tensor2} => shape: {tf.shape(tensor1_tensor2)}")
Output:
Concatenating along axis 1:
# concatenate along axis=1
tensor1_tensor2 = tf.concat([tensor_1, tensor_2], axis=1)
print("tensor_1 and tensor_2 joined(axis=1):")
print(f"{tensor1_tensor2} => shape: {tf.shape(tensor1_tensor2)}")
Output:
Consider exploring tf.stack and tf.tile as an exercise for this section!
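To get you started on that exercise, here is a rough sketch of both operations, reusing tensor_1 and tensor_2 from above:
# tf.stack joins tensors along a NEW axis, so the rank increases by one
stacked = tf.stack([tensor_1, tensor_2], axis=0)
print(stacked.shape)  # (2, 2, 3)
# tf.tile repeats a tensor along its existing axes
tiled = tf.tile(tensor_1, multiples=[2, 1])  # repeat the rows twice, the columns once
print(tiled.numpy())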
Broadcasting tensors
Broadcasting tensors is a concept very similar to NumPy's broadcasting. It allows operations to be performed on tensors of different shapes. The smaller tensor is stretched to match the shape of the larger tensor, enabling seamless element-wise operations.
For instance, if we multiply a tensor by a scalar, the scalar is stretched to match the shape of the tensor:
tensor = tf.constant([5, 10, 15])
#multiply by scalar 5
print(tensor * 5)
''' Results
tf.Tensor([25 50 75], shape=(3,), dtype=int32)
'''
To understand broadcasting in tensors, we can review NumPy's broadcasting rules but with tensors in mind:
- Rule 1: If the two tensors vary in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its left side.
- Rule 2: If the shape of the two tensors does not match in any dimension, the tensor with a shape of 1 in that dimension is stretched to match the other shape.
- Rule 3: If, in any dimension, the sizes differ and neither is 1, an error is raised.
Let's understand the rules with a few examples:
Example 1: Adding a rank two tensor to a rank one tensor:
rank1_t = tf.constant([1, 2 , 3])
rank2_t = tf.constant([[5, 10, 15],
[20, 25, 30]])
# shapes
print(f"rank1_t shape: {rank1_t.shape}")
print(f"rank2_t shape: {rank2_t.shape}")
'''
rank1_t shape: (3,)
rank2_t shape: (2, 3)
'''
- The above tensors have different shapes. By rule 1, rank1_t has fewer dimensions, so it is padded with ones on the left. The new shapes are now: rank1_t shape: (1, 3) and rank2_t shape: (2, 3).
- Next, we see that the shapes in their first dimension differ. By rule 2, we stretch rank1_t - since its first dimension is of size 1 - to match the shape of rank2_t. The new shapes are now: rank1_t shape: (2, 3) and rank2_t shape: (2, 3).
- Since the shapes now match, we can add the two tensors. The shape of the resulting tensor will be (2, 3).
# add
r2_plus_r1 = rank2_t + rank1_t
print(f"r2_plus_r1:\n {r2_plus_r1} => shape{r2_plus_r1.shape}")
Visualize:
Example 2: Broadcasting two tensors
t1 = tf.constant([1, 2 , 3, 4])
t2 = tf.constant([[10],
[20],
[30],
[40]])
# shapes
print(f"t1 shape: {t1.shape}")
print(f"t2 shape: {t2.shape}")
'''Result
t1 shape: (4,)
t2 shape: (4, 1)
'''
- The above tensors have different shapes. By rule 1, t1 has fewer dimensions, so it is padded with ones on the left. The new shapes are now: t1 shape: (1, 4) and t2 shape: (4, 1).
- Next, we see that their shapes differ. By rule 2, we stretch both to match the shape of the other - since they each have a dimension of size 1. The new shapes are now: t1 shape: (4, 4) and t2 shape: (4, 4).
- Since the shapes now match, we can multiply the two tensors. The shape of the resulting tensor will be (4, 4).
# multiply
t2_x_t1 = t2 * t1
print("t2_x_t1:")
print(f"{t2_x_t1} => shape{t2_x_t1.shape}")
Output:
Visualize:
There are instances where broadcasting fails. For example, try adding tensors of shape (4, 3) and shape (4,) and analyze why they are incompatible while referring to the rules!
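A minimal sketch of that failing case:
a = tf.ones(shape=(4, 3))
b = tf.ones(shape=(4,))
try:
    # trailing dimensions are 3 and 4: they differ and neither is 1 (rule 3)
    print(a + b)
except Exception as e:
    print(f"{type(e).__name__}: {e}")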
Basic mathematical operations with tensors
We can perform various basic mathematical operations on tensors.
Example tensors:
# Example tensors
tensor_a = tf.constant([[1, 2], [3, 4]])
tensor_b = tf.constant([[5, 6], [7, 8]])
print("Tensor A:")
print(tensor_a)
print("Tensor B:")
print(tensor_b)
Adding tensors
Element-wise tensor addition operation.
# Addition
result_addition = tensor_a + tensor_b
print("Addition Result:")
print(result_addition.numpy())
Subtracting tensors
Element-wise tensor subtraction operation.
# Subtraction
result_subtraction = tensor_a - tensor_b
print("Subtraction Result:")
print(result_subtraction.numpy())
Multiplying tensors
We can perform element-wise multiplication with tf.multiply or matrix multiplication with tf.matmul.
Element-wise multiplication involves multiplying the corresponding elements of two tensors or matrices. That means that the element in the first row and first column of the resultant tensor is the product of the elements in the first row and first column of the input tensors. The same holds for the other rows and columns of the result. It is the a * b operation in Python but done with tf.multiply.
Matrix multiplication involves finding the dot product of each row of matrix A with each column of matrix B. Each element of the resulting matrix is the sum of the products of the corresponding elements in the selected row of A and column of B. The only requirement for matrix multiplication is that the number of columns in the first matrix must equal the number of rows in the second matrix.
Visualize matrix multiplication here!
Example:
# Element-wise multiplication
result_elementwise_multiplication = tf.multiply(tensor_a, tensor_b)
# Matrix multiplication
result_matrix_multiplication = tf.matmul(tensor_a, tensor_b)
print("Element-wise Multiplication Result:")
print(result_elementwise_multiplication.numpy())
print("Matrix Multiplication Result:")
print(result_matrix_multiplication.numpy())
Dividing tensors
Element-wise tensor division.
# Element-wise division
result_elementwise_division = tf.divide(tensor_a, tensor_b)
print("Element-wise Division Result:")
print(result_elementwise_division.numpy())
Tensor aggregation functions
Aggregation means deriving a reduced summary of a tensor's information, like its mean, sum, maximum, or minimum value, or other statistical measures. In this section, we will look at the following aggregation functions:
- tf.reduce_sum
- tf.reduce_mean
- tf.reduce_min and tf.reduce_max
- tf.argmax and tf.argmin
example_t = tf.constant([[5, 10, 15],
[20, 25, 30]])
print(example_t.numpy())
Getting the sum of elements across tensor dimensions with tf.reduce_sum
We can compute the sum of elements across a tensor's axes with tf.reduce_sum:
sum_of_all_elems = tf.reduce_sum(example_t)
sum_axis_0 = tf.reduce_sum(example_t, axis = 0)
sum_axis_1 = tf.reduce_sum(example_t, axis = 1)
print(f"Tensor: \n{example_t.numpy()}")
print("Sum of all elements:", sum_of_all_elems.numpy())
print("Sum on axis 0:", sum_axis_0.numpy())
print("Sum on axis 1:", sum_axis_1.numpy())
Getting the mean of elements across tensor dimensions with tf.reduce_mean
We can compute the mean of elements across a tensor's axes with tf.reduce_mean:
tensor_mean = tf.reduce_mean(example_t)
mean_axis_0 = tf.reduce_mean(example_t, axis = 0)
mean_axis_1 = tf.reduce_mean(example_t, axis = 1)
print(f"Tensor: \n{example_t.numpy()}")
print("Mean of all elements:", tensor_mean.numpy())
print("Mean on axis 0:", mean_axis_0.numpy())
print("Mean on axis 1:", mean_axis_1.numpy())
Getting the minimum and maximum of elements across tensor dimensions with tf.reduce_min and tf.reduce_max
We can compute the minimum and maximum elements across a tensor's axes with tf.reduce_min and tf.reduce_max.
# Along axis 0
max_axis_0 = tf.reduce_max(example_t, axis=0)
min_axis_0 = tf.reduce_min(example_t, axis=0)
# Along axis 1
max_axis_1 = tf.reduce_max(example_t, axis=1)
min_axis_1 = tf.reduce_min(example_t, axis=1)
print(f"Tensor: \n{example_t.numpy()}")
print("Max axis 0:", max_axis_0.numpy())
print("Min axis 0:", min_axis_0.numpy())
print("Max axis 1:", max_axis_1.numpy())
print("Min axis 1:", min_axis_1.numpy())
Getting the indices of the smallest and largest element across tensor dimensions with tf.argmin and tf.argmax
We can find the indices of the smallest and largest values across a tensor's dimensions with tf.argmin and tf.argmax.
# Along axis 0
argmax_axis_0 = tf.argmax(example_t, axis=0)
argmin_axis_0 = tf.argmin(example_t, axis=0)
# Along axis 1
argmax_axis_1 = tf.argmax(example_t, axis=1)
argmin_axis_1 = tf.argmin(example_t, axis=1)
print(f"Tensor: \n{example_t.numpy()}")
print("Index of max value axis 0:", argmax_axis_0.numpy())
print("Index of min value axis 0:", argmin_axis_0.numpy())
print("Index of max value axis 1:", argmax_axis_1.numpy())
print("Index of min value axis 1:", argmin_axis_1.numpy())
Final thoughts
TensorFlow tensors form the foundation of numerical computation and data representation within the TensorFlow framework. They are versatile data structures that enable efficient handling of multidimensional data, making them essential for machine learning and deep learning tasks.
This article has given you a solid understanding of the basics of creating, manipulating, and aggregating tensors, which is crucial for building robust and effective machine-learning models.
As you delve into the world of TensorFlow, a solid grasp of tensors and their operations will undoubtedly enhance your ability to design and implement sophisticated machine learning algorithms. Keep exploring the extensive capabilities of TensorFlow tensors to unlock the full potential of your data-driven applications.
TensorFlow Resources
- Implementing Transformer decoder for text generation in Keras and TensorFlow
- Object detection with TensorFlow 2 Object detection API
- How to train deep learning models on Apple Silicon GPU
- How to build CNN in TensorFlow(examples, code, and notebooks)
- How to build artificial neural networks with Keras and TensorFlow
- Custom training loops in Keras and TensorFlow
- Flax vs. TensorFlow
- How to build TensorFlow models with the Keras Functional API
Follow us on LinkedIn, Twitter, GitHub, and subscribe to our blog, so you don't miss a new issue.