Understanding HDF5 and SavedModel in TensorFlow
HDF5 and SavedModel Formats
- HDF5 (.h5 format): Hierarchical Data Format version 5 (HDF5) is a file format and a set of tools for managing complex data. In TensorFlow/Keras, it is used to save model architectures, weights, training configurations, and optimizer states.
- SavedModel: This is TensorFlow's standard format for serializing models. It saves the entire TensorFlow program, including weights, computation graph, and configurations, as a directory rather than a single file. Because the saved graph can be loaded without the Python code that built the model, a SavedModel can be reused in different environments.
Differences Between HDF5 and SavedModel
- Scope of Storage: HDF5 focuses mainly on Keras models (architecture, weights, optimizer states), whereas SavedModel can handle the full TensorFlow program including assets, variables, and metadata.
- Compatibility: SavedModel is designed to be portable across TensorFlow deployment targets such as TensorFlow Serving, while an HDF5 file is read back through the Keras API, and any custom layers or objects must be supplied again in Python at load time.
- Use Cases: Use HDF5 when you want a single, straightforward file holding a Keras model's architecture, weights, and training state. SavedModel is preferred for TensorFlow Serving or deploying models across multiple platforms.
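The storage difference is visible on disk: a Keras HDF5 model is one file, while a SavedModel is a directory holding saved_model.pb next to variables/ and assets/. The following is a minimal sketch, using only the standard library, of telling the two apart. The function name detect_model_format is hypothetical, and the check assumes the HDF5 signature sits at byte 0, which holds for files written without a user block.

```python
import os

# First 8 bytes of an HDF5 file that has no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_model_format(path):
    """Return 'savedmodel', 'hdf5', or 'unknown' for a path on disk.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    if os.path.isdir(path):
        # A SavedModel directory stores the serialized program in
        # saved_model.pb (or saved_model.pbtxt).
        for name in ("saved_model.pb", "saved_model.pbtxt"):
            if os.path.isfile(os.path.join(path, name)):
                return "savedmodel"
        return "unknown"
    if os.path.isfile(path):
        # Compare the leading bytes with the HDF5 format signature.
        with open(path, "rb") as f:
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "hdf5"
    return "unknown"
```

Calling detect_model_format('model.h5') or detect_model_format('model_saved') on the artifacts produced in the usage example would be one way to confirm which format a given path holds before choosing how to load it.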
Usage Example
Saving Using HDF5
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# Define a simple model
model = Sequential([
    Dense(10, activation='relu', input_shape=(10,)),
    Dense(1, activation='sigmoid'),
])
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# Save model to HDF5 format
model.save('model.h5')
Loading Using HDF5
# Load an HDF5-formatted model
loaded_model = tf.keras.models.load_model('model.h5')
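A quick way to check that an HDF5 round trip preserved the model is to compare predictions before and after reloading. The sketch below assumes TensorFlow, NumPy, and h5py are installed; the random input batch and the temporary file path are made up for illustration, and the input shape is declared with tf.keras.Input rather than input_shape.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Small model mirroring the example; tf.keras.Input declares the input shape.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Hypothetical random batch, used only to compare outputs.
x = np.random.rand(4, 10).astype("float32")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.h5")
    model.save(path)                             # write the HDF5 file
    restored = tf.keras.models.load_model(path)  # read it back

# The restored model should reproduce the original predictions.
before = model.predict(x, verbose=0)
after = restored.predict(x, verbose=0)
roundtrip_ok = np.allclose(before, after, atol=1e-6)
print("Predictions match after reload:", roundtrip_ok)
```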
Saving Using SavedModel
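# Save model to SavedModel format (a directory) with TF 2.x / Keras 2.
# With Keras 3 (TensorFlow 2.16+), model.save() needs a .keras or .h5
# extension; use model.export('model_saved') to write a SavedModel.
model.save('model_saved')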
Loading Using SavedModel
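# Load a SavedModel (TF 2.x / Keras 2). Keras 3's load_model() does not
# read SavedModel directories; use tf.saved_model.load('model_saved') there.
loaded_model = tf.keras.models.load_model('model_saved')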
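How a SavedModel is written from a Keras model depends on the installed Keras version, so a version-tolerant variant may be useful. The sketch below is one possible approach, not the only one. It assumes that model.export() is available on newer Keras releases and falls back to model.save() on older TF 2.x releases; the temporary export directory is made up for illustration.

```python
import os
import tempfile

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

export_dir = os.path.join(tempfile.mkdtemp(), "model_saved")

# Assumption: model.export() exists on newer Keras releases; older
# TF 2.x releases write a SavedModel through model.save() on a plain path.
if hasattr(model, "export"):
    model.export(export_dir)
else:
    model.save(export_dir)

# tf.saved_model.load returns a low-level object, not a Keras model.
reloaded = tf.saved_model.load(export_dir)
has_graph_file = os.path.isfile(os.path.join(export_dir, "saved_model.pb"))
print("saved_model.pb written:", has_graph_file)
print("signatures:", list(reloaded.signatures.keys()))
```

The object returned by tf.saved_model.load exposes the saved serving signatures rather than Keras methods such as fit or predict, which is usually sufficient for inference and serving.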
Conclusion
- Choosing Between the Two: Consider using the HDF5 format when compatibility with older tools is necessary, and use SavedModel when the model must be deployed outside the Python training environment, for example with TensorFlow Serving.
- Best Practices: With TensorFlow 2.x, use SavedModel for models that will be served with TensorFlow Serving or used in other deployment scenarios. For simpler use cases, such as saving and reloading a model within Keras, HDF5 remains a viable option.