The use of this site and the content contained therein is governed by the Terms of Use. When you use this site you acknowledge that you have read the Terms of Use and that you accept and will be bound by the terms hereof and such terms as may be modified from time to time.
All text, graphics, audio, design and other works on the site are the copyrighted works of nasscom unless otherwise indicated. All rights reserved.
Content on the site is for personal use only and may be downloaded provided the material is kept intact and there is no violation of the copyrights, trademarks, and other proprietary rights. Any alteration of the material or use of the material contained in the site for any other purpose is a violation of the copyright of nasscom and / or its affiliates or associates or of its third-party information providers. This material cannot be copied, reproduced, republished, uploaded, posted, transmitted or distributed in any way for non-personal use without obtaining the prior permission from nasscom.
The nasscom Members login is for the reference of only registered nasscom Member Companies.
nasscom reserves the right to modify the terms of use of any service without any liability. nasscom reserves the right to take all measures necessary to prevent access to any service or termination of service if the terms of use are not complied with or are contravened or there is any violation of copyright, trademark or other proprietary right.
From time to time nasscom may supplement these terms of use with additional terms pertaining to specific content (additional terms). Such additional terms are hereby incorporated by reference into these Terms of Use.
Disclaimer
The Company information provided on the nasscom web site is as per data collected by companies. nasscom is not liable on the authenticity of such data.
nasscom has exercised due diligence in checking the correctness and authenticity of the information contained in the site, but nasscom or any of its affiliates or associates or employees shall not be in any way responsible for any loss or damage that may arise to any person from any inadvertent error in the information contained in this site. The information from or through this site is provided "as is" and all warranties express or implied of any kind, regarding any matter pertaining to any service or channel, including without limitation the implied warranties of merchantability, fitness for a particular purpose, and non-infringement are disclaimed. nasscom and its affiliates and associates shall not be liable, at any time, for any failure of performance, error, omission, interruption, deletion, defect, delay in operation or transmission, computer virus, communications line failure, theft or destruction or unauthorised access to, alteration of, or use of information contained on the site. No representations, warranties or guarantees whatsoever are made as to the accuracy, adequacy, reliability, completeness, suitability or applicability of the information to a particular situation.
nasscom or its affiliates or associates or its employees do not provide any judgments or warranty in respect of the authenticity or correctness of the content of other services or sites to which links are provided. A link to another service or site is not an endorsement of any products or services on such site or the site.
The content provided is for information purposes alone and does not substitute for specific advice whether investment, legal, taxation or otherwise. nasscom disclaims all liability for damages caused by use of content on the site.
All responsibility and liability for any damages caused by downloading of any data is disclaimed.
nasscom reserves the right to modify, suspend / cancel, or discontinue any or all sections, or service at any time without notice.
For any grievances under the Information Technology Act 2000, please get in touch with Grievance Officer, Mr. Anirban Mandal at data-query@nasscom.in.
The benefits of Batch Normalization in training are well known for the reduction of internal covariate shift and hence optimizing the training to converge faster. This article tries to bring in a different perspective, where the quantization loss is recovered with the help of Batch Normalization layer, thus retaining the accuracy of the model. The article also gives a simplified implementation of Batch Normalization to reduce the load on edge devices which generally will have constraints on computation of neural network models.
Batch Normalization Theory
During the training of neural network, we have to ensure that the network learns faster. One of the ways to make it faster is by normalizing the inputs to network, along with normalization of intermittent layers of the network. This intermediate layer normalization is what is called Batch Normalization. The Advantage of Batch norm is also that it helps in minimizing internal covariate shift, as described in this paper.
The frameworks like TensorFlow, Keras and Caffe have got the same representation with different symbols attached to it. In general, the Batch Normalization can be described by following math:
Batch Normalization equation
Here the equation (1.1) is a representation of Keras/TensorFlow. whereas equation (1.2) is the representation used by Caffe framework. In this article, the equation (1.1) style is adopted for the continuation of the context.
Now let’s modify the equation (1.1) as below:
Now by observing the equation of (1.4), there remains an option for optimization in reducing number of multiplications and additions. The bias comb (read it as combined bias) factor can be offline calculated for each channel. Also the ratio of “gamma/sqrt(variance)” can be calculated offline and can be used while implementing the Batch norm equation. This equation can be used in Quantized inference model, to reduce the complexity.
Quantized Inference Model
The inference model to be deployed in edge devices, would generally integer arithmetic friendly CPUs, such as ARM Cortex-M/A series processors or FPGA devices. Now to make inference model friendly to the architecture of the edge devices, will create a simulation in Python. And then convert the inference model’s chain of inputs, weights, and outputs into fixed point format. In the fixed point format, Q for 8 bits is chosen to represent with integer.fractional format. This simulation model will help you to develop the inference model faster on the device and also will help you to evaluate the accuracy of the model.
e.g: Q2.6 represents 6 bits of fractional and 2 bits of an integer.
Now the way to represent the Q format for each layer is as follows:
Take the Maximum and Minimum of inputs, outputs, and each layer/weights.
Get the fractional bits required to represent the Dynamic range (by using Maximum/Minimum) is as below using Python function:
def get_fract_bits(tensor_float): # Assumption is that out of 8 bits, one bit is used as sign fract_dout = 7 - np.ceil(np.log2(abs(tensor_float).max()))
fract_dout = fract_dout.astype('int8')
return fract_dout
Now the integer bits are 7-fractional_bits, as one bit is reserved for sign representation.
4. To start with perform this on input and then followed by Layer 1, 2 …, so on.
5. Do the quantization step for weights and then for the output assuming one example of input. The assumption is made that input is normalized so that we can generalize the Q format, otherwise, this may lead to some loss in data when non-normalized different input gets fed.
6. This will set Q format for input, weights, and outputs.
Example:
Let’s consider Resnet-50 as a model to be quantized. Let’s use Keras inbuilt Resnet-50 trained with Imagenet.
#Creating the model
def model_create():
model = tf.compat.v1.keras.applications.resnet50.ResNet50(
include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000)
return model
Let’s prepare input for resnet-50. The below image is taken from ImageNet dataset.
def prepare_input():
img = image.load_img(
"D:\\Elephant_water.jpg",
target_size=(224,224)
)
x_test = image.img_to_array(img)
x_test = np.expand_dims(x_test,axis=0)
x = preprocess_input(x_test) # from tensorflow.compat.v1.keras.applications.resnet50 import preprocess_input, decode_predictions
return x
Now Lets call the above two functions and find out the Q format for input.
model = model_create()
x = prepare_input()
If you observe the input ‘x’, its dynamic range is between -123.68 to 131.32. This makes it hard for fitting in 8 bits, as we only have 7 bits to represent these numbers, considering one sign bit. Hence the Q Format for this input would become, Q8.0, where 7 bits are input numbers and 1 sign bit. Hence it clips the data between -128 to +127 (-2? to 2? -1). so we would be loosing some data in this input quantization conversion (most obvious being 131.32 is clipped to 127), whose loss can be seen by Signal to Quantize Noise Ratio , which will be described soon below.
If you follow the same method for each weight and outputs of the layers, we will have some Q format which we can fix to simulate the quantization.
# Lets get first layer properties
(padding, _) = model.layers[1].padding
# lets get second layer properties
wts = model.layers[2].get_weights()
strides = model.layers[2].strides
W=wts[0]
b=wts[1]
hparameters =dict(
pad=padding[0],
stride=strides[0]
)
# Lets Quantize the weights .
quant_bits = 8 # This will be our data path.
wts_qn,wts_bits_fract = Quantize(W,quant_bits) # Both weights and biases will be quantized with wts_bits_fract.
Now if you observe the above snippet of code, the convolution operation will take input, weights, and output with its fractional bits defined.
i.e: fractional_bits=[0,7,-3]
where 1st element represents 0 bits for fractional representation of input (Q8.0)
2nd element represents 7 bits for fractional representation of weights (Q1.7).
3rd element represents -3 bits for the fractional representation of outputs (Q8.0, but need additional 3 bits for integer representation as the range is beyond 8-bit representation).
This will have to repeat for each layer to get the Q format.
Now the quantization familiarity is established, we can move to the impact of this quantization on SQNR and hence accuracy.
Signal to Quantization Noise Ratio
As we have reduced the dynamic range from floating point representation to fixed point representation by using Q format, we have discretized the values to nearest possible integer representation. This introduces the quantization noise, which can be quantified mathematically by Signal to Quantization noise ratio.(refer: https://en.wikipedia.org/wiki/Signal-to-quantization-noise_ratio)
As shown in the above equation, we will measure the ratio of signal power to noise power. This representation applied on log scale converts to dB (10log10SQNR). Here signal is floating point input which we are quantizing to nearest integer and noise is Quantization noise. example: The elephant example of input has maximum value of 131.32, but we are representing this to nearest integer possible, which is 127. Hence it makes Quantization noise = 131.32–127 = 4.32.
So SQNR = 131.32² /4.32² = 924.04, which is 29.66 db, indicating that we have only attained close to 30dB as compared to 48dB (6*no_of_bits) possibility.
This reflection of SQNR on accuracy can be established for each individual network depending on structure. But indirectly we can say better the SQNR the higher is the accuracy.
Convolution in Quantized environments:
The convolution operation in CNN is well known, where we multiply the kernel with input and accumulate to get the results. In this process we have to remember that we are operating with 8 bits as inputs , hence the result of multiplication need at least 16 bits and then accumulating it in 32 bits accumulator, which would help to maintain the precision of the result. Then result is rounded or truncated to 8 bits to carry 8 bit width of data.
Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation
of the previous layer.
Arguments:
a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)
Returns:
Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
"""
# Element-wise product between a_slice and W. Do not add the bias yet.
s = np.multiply(a_slice_prev.astype('int16'),W) # Let result be held in 16 bit
# Sum over all entries of the volume s.
Z = np.sum(s.astype('int32')) # Final result be stored in int32.
# The Result of 32 bit is to be trucated to 8 bit to restore the data path.
# Add bias b to Z. Cast b to a float() so that Z results in a scalar value.
# Bring bias to 32 bits to add to Z.
Z = Z + (b << ip_fract).astype('int32')
# Lets find out how many integer bits are taken during addition.
# You can do this by taking leading no of bits in C/Assembly/FPGA programming
# Here lets simulate
Z = Z >> (ip_fract+wt_fract - fract_dout)
if(Z > 127):
Z = 127
elif(Z < -128):
Z = -128
else:
Z = Z.astype('int8')
return Z
The above code is inspired from AndrewNg’s deep learning specialization course, where convolution from scratch is taught. Then modified the same to fit for Quantization.
Batch Norm in Quantized environment
As shown in Equation 1.4, we have modified representation to reduce complexity and perform the Batch normalization.The code below shows the same implementation.
def calculate_bn(x,bn_param,Bn_fract_dout):
x_ip = x[0] x_fract_bits = x[1]
bn_param_gamma_s = bn_param[0][0]
bn_param_fract_bits = bn_param[0][1]
op = x_ip*bn_param_gamma_s.astype(np.int16) # x*gamma_s
# This output will have x_fract_bits + bn_param_fract_bits
fract_bits =x_fract_bits + bn_param_fract_bits
bn_param_bias = bn_param[1][0]
bn_param_fract_bits = bn_param[1][1]
bias = bn_param_bias.astype(np.int16)
# lets adjust bias to fract bits
bias = bias << (fract_bits - bn_param_fract_bits)
op = op + bias # + bias
# Convert this op back to 8 bits, with Bn_fract_dout as fractional bits
op = op >> (fract_bits - Bn_fract_dout)
BN_op = op.astype(np.int8)
return BN_op
Now with these pieces in place for the Quantization inference model, we can see now the Batch norm impact on quantization.
Results
The Resnet-50 trained with ImageNet is used for python simulation to quantize the inference model. From the above sections, we bind the pieces together to only analyze the first convolution followed by Batch Norm layer.
The convolution operation is the heaviest of the network in terms of complexity and also in maintaining accuracy of the model. So let’s look at the Convolution data after we quantized it to 8 bits. The below figure on the left-hand side represents the convolution output of 64 channels (or filters applied) output whose mean value is taken for comparison. The Blue color is float reference and the green color is Quantized implementation. The difference plot (Left-Hand side) gives an indication of how much variation exists between float and quantized one. The line drawn in that Difference figure is mean, whose value is around 4. which means we are getting on an average difference between float and Quantized values close to a value of 4.
Convolution and Batch Norm Outputs
Now let’s look at Right-Hand side figure, which is Batch Normalization section. As you can see the Green and blue curves are so close by and their differences range is shrunk to less than 0.5 range. The Mean line is around 0.135, which used to be around 4 in the case of convolution. This indicates we are reducing our differences between float and quantized implementation from mean of 4 to 0.135 (almost close to 0).
Now let’s look at the SQNR plot to appreciate the Batch Norm impact.
Signal to Quantization Noise Ratio for sequence of layers
Just in case values are not visible, we have following SQNR numbers
Input SQNR : 25.58 dB (The Input going in to Model)
Convolution SQNR : -4.4dB (The output of 1st convolution )
Batch-Norm SQNR : 20.98 dB (The Batch Normalization output)
As you can see the input SQNR is about 25.58dB , which gets reduced to -4.4 dB indicating huge loss here, because of limitation in representation beyond 8 bits. But the Hope is not lost, as Batch normalization helps to recover back your SQNR to 20.98 dB bringing it close to input SQNR.
Conclusion
Batch Normalization helps to correct the Mean, thus regularizing the quantization variation across the channels.
Batch Normalization recovers the SQNR. As seen from above demonstration, we see a recovery of SQNR as compared to convolution layer.
If the quantized inference model on edge is desirable, then consider including Batch Normalization as it acts as recovery of quantization loss and also helps in maintaining the accuracy, along with training benefits of faster convergence.
Batch Normalization complexity can be reduced by using (1.4) so that many parameters can be computed offline to reduce load on the edge device.
That the contents of third-party articles/blogs published here on the website, and the interpretation of all information in the article/blogs such as data, maps, numbers, opinions etc. displayed in the article/blogs and views or the opinions expressed within the content are solely of the author's; and do not reflect the opinions and beliefs of NASSCOM or its affiliates in any manner. NASSCOM does not take any liability w.r.t. content in any manner and will not be liable in any manner whatsoever for any kind of liability arising out of any act, error or omission. The contents of third-party article/blogs published, are provided solely as convenience; and the presence of these articles/blogs should not, under any circumstances, be considered as an endorsement of the contents by NASSCOM in any manner; and if you chose to access these articles/blogs , you do so at your own risk.
Generative Artificial Intelligence (Gen AI) is revolutionizing customer services and support by enhancing efficiency, improving customer experiences, and reducing operational costs. Gen AI is becoming an integral part of customer-centric…
Fundraising is the lifeblood of any startup, and in the rapidly evolving Web3 industry, it plays a pivotal role in driving innovation and ecosystem expansion. Unlike traditional industries, Web3 projects leverage blockchain-based fundraising…
Initial Coin Offerings (ICOs) have revolutionized the way blockchain projects secure funding. It is becoming an irreplaceable element of the crypto. By enabling developers to bypass traditional financial institutions, ICOs democratize fundraising,…
Gone are the days when the answer to "What is a lock screen?" was simply a security feature on mobile devices. As we step into 2025, the lock screen has undergone a remarkable transformation, becoming a dynamic interface that offers much more than…
SUMMARY
India's healthcare system faces significant challenges due to its large population, a shortage of healthcare professionals, increasing costs, and systemic inequalities. However, it stands to gain immensely from advancements in AI and…
In the ever-evolving IT landscape, efficiently managing operations has become increasingly challenging. IT teams are often overwhelmed, striving to minimize incidents, ensure system uptime, and enhance operational efficiency. Traditional IT…