quantization
Example Script: quantization.py
This script implements model optimization using the TensorFlow Model Optimization Toolkit's quantization capabilities on neural network architectures. It demonstrates how to apply and evaluate both quantization-aware training and post-training quantization.
This example includes:
- Implementation of QuantizationAwareTraining class
- Building and training baseline CNN models
- Applying quantization-aware training with TensorFlow
- Training and evaluation workflows for both models
- Converting models to TFLite format with optimization
- Implementing post-training quantization for model compression
Version Info
- 06/01/2024: Initial version
QuantizationAwareTraining
A reusable class for performing quantization-aware training on any dataset.
Attributes:

| Name | Type | Description |
|---|---|---|
| model | Model | The base model architecture. |
| q_aware_model | Model | The quantization-aware trained model. |
Source code in scirex/core/model_compression/quantization.py
__init__(input_shape, num_classes, filters=12, kernel_size=(3, 3), pool_size=(2, 2))
Initializes the model architecture.
:param input_shape: Shape of the input data.
:param num_classes: Number of output classes.
:param filters: Number of filters for the Conv2D layer.
:param kernel_size: Kernel size for the Conv2D layer.
:param pool_size: Pool size for the MaxPooling2D layer.
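As a rough illustration of what a constructor with these parameters might build, here is a minimal Keras CNN sketch. The helper name `build_baseline_model` and the exact layer stack are assumptions, not the actual scirex implementation:

```python
import tensorflow as tf

def build_baseline_model(input_shape=(28, 28, 1), num_classes=10,
                         filters=12, kernel_size=(3, 3), pool_size=(2, 2)):
    # A small CNN matching the documented defaults; the real scirex
    # implementation may differ in layer details.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(filters, kernel_size, activation="relu"),
        tf.keras.layers.MaxPooling2D(pool_size=pool_size),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(num_classes),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model

model = build_baseline_model()
print(model.output_shape)  # (None, 10)
```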
apply_quantization_aware_training()
Applies quantization-aware training to the base model.
convert_to_tflite()
Converts the quantization-aware model to TensorFlow Lite format.
:return: Quantized TFLite model.
evaluate(test_data, test_labels)
Evaluates both the base model and the quantized model.
:param test_data: Test dataset.
:param test_labels: Test labels.
:return: Accuracy of the base model and the quantized model.
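Evaluating the quantized half presumably means running the TFLite flatbuffer through an interpreter, since a `.tflite` blob is not a Keras model. A minimal sketch (the helper name `evaluate_tflite` is an assumption):

```python
import numpy as np
import tensorflow as tf

def evaluate_tflite(tflite_model: bytes, test_data, test_labels) -> float:
    # Feed samples one at a time through the TFLite interpreter and
    # compare argmax predictions against the labels.
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    correct = 0
    for sample, label in zip(test_data, test_labels):
        interpreter.set_tensor(input_index,
                               np.expand_dims(sample, 0).astype(np.float32))
        interpreter.invoke()
        pred = np.argmax(interpreter.get_tensor(output_index)[0])
        correct += int(pred == label)
    return correct / len(test_labels)
```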
measure_model_size(filepath)
staticmethod
Measures the size of a model file.
:param filepath: Path to the model file.
:return: Size of the model in megabytes.
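A standalone version of this static method is a one-liner over `os.path.getsize`. The 1 MB = 10^6 bytes convention here is an assumption; the real implementation may divide by 2**20 instead:

```python
import os

def measure_model_size(filepath: str) -> float:
    # File size in megabytes (1 MB = 1e6 bytes in this sketch).
    return os.path.getsize(filepath) / 1e6
```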
post_quantization()
Applies post-training quantization to the base model.
:return: Post-quantized TFLite model.
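Post-training quantization needs no retraining: the converter rewrites the trained weights directly. A sketch using TensorFlow's standard dynamic-range path (the helper name `post_quantize` is an assumption; `tf.lite.Optimize.DEFAULT` is the documented flag):

```python
import tensorflow as tf

def post_quantize(model: tf.keras.Model) -> bytes:
    # Dynamic-range post-training quantization: weights are stored as
    # int8 in the flatbuffer without any additional training.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()
```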
save_model(model_content, filename)
staticmethod
Saves the TFLite model to a file.
:param model_content: The TFLite model content.
:param filename: File name to save the model.
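Since TFLite converters return the model as raw bytes, a minimal standalone version of this static method is a plain binary write:

```python
def save_model(model_content: bytes, filename: str) -> None:
    # The TFLite flatbuffer is raw bytes, so saving is a binary write.
    with open(filename, "wb") as f:
        f.write(model_content)
```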
train(train_data, train_labels, epochs=10, validation_split=0.1)
Trains the base model.
:param train_data: Training dataset.
:param train_labels: Training labels.
:param epochs: Number of training epochs.
:param validation_split: Fraction of training data for validation.
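A training call with this signature maps directly onto Keras `model.fit`. The sketch below uses random stand-in data and a tiny hypothetical model just to show the `epochs`/`validation_split` usage:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model and random data; replace with a real
# dataset and the baseline CNN in practice.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

train_data = np.random.rand(100, 4).astype(np.float32)
train_labels = np.random.randint(0, 3, size=100)

# validation_split=0.1 holds out the last 10% of samples for validation.
history = model.fit(train_data, train_labels,
                    epochs=2, validation_split=0.1, verbose=0)
```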
train_q_aware_model(train_data, train_labels, batch_size=500, epochs=10, validation_split=0.1)
Trains the quantization-aware model.
:param train_data: Training dataset.
:param train_labels: Training labels.
:param batch_size: Batch size for training.
:param epochs: Number of training epochs.
:param validation_split: Fraction of training data for validation.