A simple yet effective association method that tracks objects by associating almost every detection box instead of only the high-scoring ones.
The goal of this blog is to cover ByteTrack and techniques for Multi-Object Tracking (MOT). We will also cover running YOLOv8 object detection with ByteTrack tracking on a sample video.
Multi-Object Tracking (MOT)
You might have heard of object detection: algorithms such as Faster R-CNN, SSD, and the various versions of YOLO can detect objects with good accuracy. A newer problem is multi-object tracking. Given a video stream, you need to detect the objects in each frame and assign each one an “Object ID”; if the same object is detected in the next frame, it must be given the same Object ID. There are many algorithms for MOT, such as SORT (Simple Online and Realtime Tracking), DeepSort, and StrongSort.
Various methods are used for object tracking:
Feature-based Tracking: This method tracks an object based on its features, such as color, shape, and texture.
Template Matching: As the name suggests, this method searches each video frame for a pre-defined template of the object.
Correlation-based Tracking: This method computes the similarity between the target object and candidate regions in subsequent frames.
Deep learning-based Tracking: This method uses neural networks trained on large datasets to detect and track objects in real time.
By now you should have a basic idea of MOT. Let’s move on to ByteTrack and understand why it tracks objects better than DeepSort and similar methods.
ByteTrack
We will first look at the problems with previous MOT algorithms and then at the logic of ByteTrack.
Problems with other MOT Algorithms
Low-confidence detection boxes: The first problem is that other MOT algorithms discard low-confidence detection boxes, whereas ByteTrack takes them into account as well. Why?
“What is reasonable is real; that which is real is reasonable.” — Hegel
That is, low-confidence detection boxes sometimes indicate the existence of objects, e.g. occluded objects. Filtering these boxes out causes irreversible errors in MOT: detections are missed and trajectories become fragmented.
Let’s understand it by example:
As you can see, in frame t1 we initialize three different tracklets because their scores are all higher than 0.5. But in frames t2 and t3 the score of one tracklet drops from 0.8 to 0.4 and then from 0.4 to 0.1.
These detection boxes are eliminated by the thresholding mechanism, and the red tracklet disappears accordingly, as shown in Figure (b). But if we take all the detection boxes into consideration, more false positives are introduced, e.g. the rightmost box in Figure (a). This brings us to the second problem.
False-positive bounding boxes: The key observation is that similarity with existing tracklets provides a strong cue for distinguishing objects from background among low-score detection boxes.
For example, as we can see in Figure (c), two low-score detection boxes are matched to the tracklets through the motion-predicted boxes (dotted lines), and so the objects are correctly recovered. The background box is removed because it has no matching tracklet.
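The effect of plain score thresholding in the figure example can be replayed in a few lines. This is only an illustrative sketch: the scores and the 0.5 cut-off come from the example above, while the names `scores_per_frame` and `frames_kept` are made up for this snippet.

```python
# Detection scores of one occluded object over three frames, taken from the
# example above: 0.8 in t1, 0.4 in t2, 0.1 in t3.
scores_per_frame = {"t1": 0.8, "t2": 0.4, "t3": 0.1}

SCORE_THRESHOLD = 0.5  # cut-off used in the example above


def frames_kept(scores, threshold):
    """Return the frames in which the detection survives score thresholding."""
    return [frame for frame, score in scores.items() if score > threshold]


kept = frames_kept(scores_per_frame, SCORE_THRESHOLD)
print(kept)  # ['t1']
```

The object is still in the scene in t2 and t3, but a tracker that only sees thresholded detections loses it after t1.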
To make full use of detection boxes from high scores to low ones in the matching process, ByteTrack introduces a simple and effective association method called BYTE, so named because each detection box is a basic unit of a tracklet, like a byte in a computer program. A Kalman filter predicts the location of each tracklet in the new frame, and the similarity between a predicted box and a detection box is computed using IoU or Re-ID feature distance. In the first matching step, the high-score detection boxes are matched to the tracklets based on this motion or appearance similarity. In the second matching step, the low-score detections are matched to the tracklets left unmatched, i.e. the tracklet in the red box, using the same motion similarity.
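To give a feel for the Kalman prediction step, here is a minimal sketch of a constant-velocity Kalman filter for a single coordinate (say, the x position of a box centre). It is a simplification: real trackers keep a larger state describing the whole box, and the class name `Kalman1D` and the noise values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate of a box."""

    pos: float
    vel: float = 0.0
    # Covariance of the state [pos, vel], stored as four scalars.
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0
    process_noise: float = 0.01     # assumed value
    measurement_noise: float = 0.1  # assumed value

    def predict(self, dt: float = 1.0) -> float:
        """Project the state one frame ahead and return the predicted position."""
        self.pos += self.vel * dt
        # P = F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = q * I
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        self.p00 = p00 + self.process_noise
        self.p01, self.p10 = p01, p10
        self.p11 += self.process_noise
        return self.pos

    def update(self, measured_pos: float) -> None:
        """Correct the state with the position of the matched detection."""
        residual = measured_pos - self.pos
        s = self.p00 + self.measurement_noise
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P = (I - K H) P, with H = [1, 0]
        p00, p01 = self.p00, self.p01
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01


track = Kalman1D(pos=100.0, vel=5.0)
predicted = track.predict()  # 105.0: where the tracker expects the object
track.update(107.0)          # the matched detection pulls the estimate forward
```

The predicted position is what gets compared against the detections of the new frame; the update step is only run for tracklets that found a match.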
Let’s try to understand Data Association which is the core of the MOT algorithm.
Data Association
Data association is the core of multi-object tracking: it first computes the similarities between tracklets and detection boxes and then applies a matching strategy based on those similarities.
Similarity metrics: Location, motion, and appearance are the three important cues for association. SORT uses location and motion cues in a very simple way: it adopts a Kalman filter to predict the location of the tracklets in the next frame and then computes the IoU between the detection boxes and the predicted boxes as the similarity. Location and motion cues work well for short-range matching, but for long-range matching appearance similarity is more helpful. For example, an object that was occluded for a long time can be re-identified using appearance similarity, which is calculated as the cosine similarity of Re-ID features. DeepSort uses a standalone deep learning model to extract these appearance features.
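Both similarity measures are short to write down. The sketch below assumes boxes in (x1, y1, x2, y2) format and Re-ID features as plain lists of floats; the function names are illustrative and not taken from any tracker's code.

```python
import math


def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (e.g. Re-ID embeddings)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


# A predicted box and a detection shifted by half a box width.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))     # 50 / 150
same_look = cosine_similarity([1.0, 0.0], [1.0, 0.0])  # 1.0
```

IoU only needs box geometry, which is why it is cheap but fails once the predicted box has drifted; the cosine similarity needs an extra feature extractor but does not depend on where the object reappears.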
Matching strategy: After the similarities are computed, the matching strategy assigns an ID to each object. This can be done with the Hungarian algorithm or with greedy assignment. SORT matches the detection boxes to the tracklets in a single pass, while DeepSort uses a cascaded matching strategy that first matches the detection boxes to the most recent tracklets and then to the lost ones.
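The difference between the two strategies shows up on a small similarity matrix. In the sketch below, rows are tracklets and columns are detections. `greedy_assignment` is a plain greedy matcher; `optimal_assignment` is a brute-force stand-in that returns the same result the Hungarian algorithm would on a small square matrix (real code would typically call a library routine such as SciPy's `linear_sum_assignment` instead). Both function names and the matrix values are made up for illustration.

```python
from itertools import permutations


def greedy_assignment(sim, min_sim=0.0):
    """Repeatedly take the highest remaining similarity above min_sim."""
    pairs = sorted(
        ((sim[t][d], t, d) for t in range(len(sim)) for d in range(len(sim[t]))),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, t, d in pairs:
        if score > min_sim and t not in used_tracks and d not in used_dets:
            matches.append((t, d))
            used_tracks.add(t)
            used_dets.add(d)
    return sorted(matches)


def optimal_assignment(sim):
    """Maximise total similarity by brute force (small square matrices only)."""
    n = len(sim)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(sim[t][perm[t]] for t in range(n)),
    )
    return [(t, best[t]) for t in range(n)]


similarity = [
    [0.90, 0.80],  # tracklet 0 vs detections 0 and 1
    [0.85, 0.10],  # tracklet 1 vs detections 0 and 1
]
print(greedy_assignment(similarity))   # [(0, 0), (1, 1)]
print(optimal_assignment(similarity))  # [(0, 1), (1, 0)]
```

Greedy grabs the single best pair (0.90) and leaves tracklet 1 with a poor match, for a total of 1.00; the optimal assignment gives up that pair and reaches a total of 1.65.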
BYTE Algorithm
The input to the BYTE algorithm is a video sequence, an object detector Det, and a detection score threshold. The output is the set of tracks T of the video, where each track contains the bounding box and ID of an object in every frame.
For each frame in the video, we first predict the detection boxes and their scores using the detector Det. We then split the detection boxes into Det(high) and Det(low) according to the detection score threshold.
After separating the detection boxes, a Kalman filter is applied to predict the new location of each track in T in the current frame. The first association is performed between the high-score detection boxes and all the tracks; the second association is then performed between the low-score detection boxes and the tracks left unmatched.
The main highlight of BYTE is that it is very flexible and compatible with different association methods.
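The per-frame logic described above can be sketched as follows. This is a simplified illustration rather than the reference implementation: it assumes the track boxes passed in have already been moved to the current frame by the Kalman filter, uses IoU as the only similarity, and uses greedy matching to stay short. The names `match` and `byte_associate` and the threshold values are assumptions.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match(tracks, detections, min_iou):
    """Greedy IoU matching of {track_id: box} against [(box, score), ...]."""
    candidates = sorted(
        ((iou(box, det[0]), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for overlap, tid, j in candidates:
        if overlap < min_iou:
            break  # everything after this overlaps even less
        if tid in used_tracks or j in used_dets:
            continue
        matches.append((tid, j))
        used_tracks.add(tid)
        used_dets.add(j)
    unmatched_tracks = [tid for tid in tracks if tid not in used_tracks]
    unmatched_dets = [j for j in range(len(detections)) if j not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


def byte_associate(tracks, detections, high_thresh=0.5, min_iou=0.3):
    """Two-stage association: high-score boxes first, then low-score boxes."""
    det_high = [d for d in detections if d[1] >= high_thresh]
    det_low = [d for d in detections if d[1] < high_thresh]

    # First association: all tracks vs. high-score detections.
    first, remaining_ids, new_idx = match(tracks, det_high, min_iou)

    # Second association: leftover tracks vs. low-score detections.
    remaining = {tid: tracks[tid] for tid in remaining_ids}
    second, lost_ids, _ = match(remaining, det_low, min_iou)

    assigned = {tid: det_high[j] for tid, j in first}
    assigned.update({tid: det_low[j] for tid, j in second})
    # Unmatched high-score boxes are candidates for new tracks; unmatched
    # low-score boxes are treated as background and dropped.
    new_tracks = [det_high[j] for j in new_idx]
    return assigned, lost_ids, new_tracks


tracks = {1: (0, 0, 10, 10), 2: (20, 0, 30, 10)}
detections = [
    ((1, 0, 11, 10), 0.9),    # clearly visible object
    ((21, 0, 31, 10), 0.3),   # occluded object with a low score
    ((50, 50, 60, 60), 0.2),  # background box with a low score
]
assigned, lost, new = byte_associate(tracks, detections)
```

Track 2 is recovered through its low-score detection, while the low-score background box finds no track and is discarded, which mirrors Figure (c).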
Performance
ByteTrack outperforms the SORT and DeepSORT algorithms: it reaches 76.6 MOTA (Multi-Object Tracking Accuracy), while SORT and DeepSort reach 74.6 and 75.4 MOTA respectively.
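For reference, MOTA combines three kinds of error (missed objects, false positives, and identity switches) relative to the number of ground-truth objects. A minimal sketch of the standard formula follows; the counts in the usage line are invented for illustration and do not come from any benchmark.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth


# Illustrative counts only: 160 errors over 1000 ground-truth boxes.
score = mota(false_negatives=100, false_positives=50, id_switches=10,
             num_ground_truth=1000)
print(f"{score:.1%}")  # 84.0%
```

Because the error terms are simply added, recovering occluded objects (fewer false negatives) raises MOTA as long as it does not introduce as many false positives.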
By now you should understand the main concept of ByteTrack; it is quite simple. Let’s apply it in a real-world project.
ByteTrack with YOLOv8 Detector
In this section we will see how to use the YOLOv8 detector with ByteTrack to track vehicles on the road and count incoming and outgoing vehicles.
As you can see, every new vehicle is assigned an ID along with a class name and detection probability. The in and out counters show the number of incoming and outgoing vehicles.
Let’s see the code for this implementation:
Here I have used the Ultralytics YOLOv8 library to load a YOLO model trained on the COCO dataset. The Supervision library is used for ByteTrack and for other vision tasks such as labelling and vehicle counting.
You can run the script by passing a video as input:
python sv_bytetracker_yolo.py --source_weights_path yolov8m.pt --source_video_path test_video.mp4 --target_video_path test_pred.mp4 --confidence_threshold 0.1
You can remove the class filter from the code if you want to track other classes.
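A script along these lines could look roughly like the sketch below. It is not the author's sv_bytetracker_yolo.py: only the four command-line flags come from the command above, and everything else is an assumption, including the position of the counting line (a horizontal line across the middle of the frame) and the choice of COCO vehicle classes. Drawing boxes and labels is left out because the annotator classes differ between Supervision releases.

```python
import argparse

# COCO class ids treated as vehicles (assumed): car, motorcycle, bus, truck.
VEHICLE_CLASS_IDS = [2, 3, 5, 7]


def parse_args(argv=None):
    """Parse the flags used in the command line above."""
    parser = argparse.ArgumentParser(description="YOLOv8 + ByteTrack vehicle counting")
    parser.add_argument("--source_weights_path", required=True)
    parser.add_argument("--source_video_path", required=True)
    parser.add_argument("--target_video_path", required=True)
    parser.add_argument("--confidence_threshold", type=float, default=0.1)
    return parser.parse_args(argv)


def run(args):
    """Detect, track and count vehicles; returns (in_count, out_count)."""
    # Imported here so that argument parsing works without the heavy packages.
    import numpy as np
    import supervision as sv
    from ultralytics import YOLO

    model = YOLO(args.source_weights_path)
    tracker = sv.ByteTrack()
    info = sv.VideoInfo.from_video_path(args.source_video_path)

    # Assumed counting line: horizontal, across the middle of the frame.
    line = sv.LineZone(
        start=sv.Point(0, info.height // 2),
        end=sv.Point(info.width, info.height // 2),
    )

    with sv.VideoSink(args.target_video_path, info) as sink:
        for frame in sv.get_video_frames_generator(args.source_video_path):
            result = model(frame, conf=args.confidence_threshold, verbose=False)[0]
            detections = sv.Detections.from_ultralytics(result)
            # Class filter: remove this line to track every COCO class.
            detections = detections[np.isin(detections.class_id, VEHICLE_CLASS_IDS)]
            # ByteTrack assigns a tracker_id to each detection it keeps.
            detections = tracker.update_with_detections(detections)
            line.trigger(detections)
            sink.write_frame(frame)  # frame is written without annotations here

    return line.in_count, line.out_count


# Example call (requires the model weights and a video file):
# in_count, out_count = run(parse_args())
```

Note the low confidence threshold of 0.1: the detector has to hand the low-score boxes to the tracker, otherwise ByteTrack has nothing to use in its second association step.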
Applications
We have now covered ByteTrack end to end. It can be used in many applications and industries, such as:
Automobile industry: Tracking vehicles on the road for traffic analysis, e.g. detecting a vehicle going in the wrong direction or analyzing traffic movement at a four-way intersection.
Production industry: Counting and tracking items on a production line.
Customer interaction in shopping: Tracking customer movement, which products or categories customers are most interested in, how long they hold a product, and whether they finally buy it or return it to the shelf.
Enhanced customer experience: Recognizing when a customer appears confused or has been searching too long for a product.
Summary
Let’s summarize the points we have learned:
There are various MOT models like SORT, DeepSort, ByteTrack, etc.
There are various techniques for object tracking: feature-based tracking, template matching, correlation-based tracking, and deep learning-based tracking.
The ByteTrack algorithm takes low-score detections into consideration for object tracking, in addition to high-score detections.
Data association is applied to each detection.
In data association, similarities are computed between tracklets and detection boxes, and a matching strategy is then applied according to those similarities.
Similarity can be computed by IoU or Re-ID features against the tracklet locations predicted by the Kalman filter.
For long-range matching, appearance similarity is useful.
For the matching strategy, the Hungarian algorithm or greedy assignment is used.
BYTE first applies association to the high-score detection boxes and then to the low-score detection boxes.
NeST Digital, the software arm of the NeST Group, has been transforming businesses, providing customized and innovative software solutions and services for customers across the globe. A leader in providing end-to-end solutions under one roof, covering contract manufacturing and product engineering services, NeST has 25 years of proven experience in delivering industry-specific engineering and technology solutions for customers, ranging from SMBs to Fortune 500 enterprises, focusing on Transportation, Aerospace, Defense, Healthcare, Power, Industrial, GIS, and BFSI domains.