Federated Learning: A Privacy-First Approach to ML


Intuit

@ankita_intuit

July 28, 2021

What is Federated Learning?

Federated Learning is a form of collaborative learning introduced by Google in 2016 to address privacy concerns around sharing data. In a traditional machine learning scenario, training data that originates across multiple sources or devices must be gathered at a centralized location for model training. Moving the data to one central location raises concerns about data corruption, privacy, trust, and so on. Imagine being able to train an ML model where the data never leaves its source of origin. Instead, multiple models are trained locally at the data sources or devices, and the learnings (model parameters) are centralized and aggregated, resulting in a collaboratively trained global model that is generally better than the individual models. This is exactly what Federated Learning does.

Federated Learning stands in stark contrast to traditional ML approaches in the way the model is trained. It relies on aggregating model updates received from multiple local models, each trained on the dataset local to its own device. The training involves a cohort of edge devices (Federated Learning clients), such as a user’s desktop or phone, that participate in the model training, and a central server (Federated Learning server) responsible for aggregation. The steps are as follows:

1. The data science team on the Federated Learning server side chooses a model to be trained for a given task.
2. The server sends a copy of the model to the Federated Learning clients for training.
3. Each client trains its copy of the model locally on its own dataset until it converges locally.
4. Once training is complete, the clients send their model updates to the server, which aggregates the individual updates into a global model. Steps 2 to 4 are repeated until meaningful convergence is achieved or for a preset number of iterations.

Source - Wikimedia
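The training loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation. It assumes a toy linear model y = w*x + b, synthetic client datasets, and aggregation by a dataset-size-weighted average of parameters (the scheme commonly called federated averaging). The names `local_train`, `aggregate`, `federated_training` and `make_client` are hypothetical and chosen only for this example.

```python
import random


def local_train(weights, data, lr=0.1, epochs=5):
    """Train a copy of the model y = w*x + b on one client's private data (SGD)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def aggregate(client_weights, client_sizes):
    """Average client parameters, weighting each client by its dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def federated_training(clients, rounds=100):
    """Run the server loop; only parameters, never raw data, reach the server."""
    global_weights = (0.0, 0.0)  # the server picks and initializes a model
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        # the server sends the current model out; each client trains locally
        updates = [local_train(global_weights, data) for data in clients]
        # the clients send parameters back; the server aggregates them
        global_weights = aggregate(updates, sizes)
    return global_weights


def make_client(rng, n, low, high):
    """Synthetic private dataset drawn from y = 2x + 1 (an assumption for the demo)."""
    xs = [rng.uniform(low, high) for _ in range(n)]
    return [(x, 2 * x + 1) for x in xs]


rng = random.Random(0)
# Each client only sees a different slice of the input range.
clients = [
    make_client(rng, 20, 0.0, 0.5),
    make_client(rng, 30, 0.5, 1.0),
    make_client(rng, 10, 0.0, 1.0),
]
w, b = federated_training(clients)
print(f"global model: w={w:.3f}, b={b:.3f}")  # should approach w=2, b=1
```

In this sketch each client's data stays inside `local_train`, and the server only ever touches the `(w, b)` tuples. Real systems add client sampling, communication handling, and privacy protections on the updates themselves.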

OK. But why do we need to aggregate the learnings? Aren’t these local datasets enough for effective training on their own? The answer is “it depends”. Ideally, each local dataset would be large. In most scenarios, however, the data at an individual source is insufficient and may be quite sparse, leading to poorly converged models. Suppose you are training a language model to predict the next word as you type. If the model were trained only on your own data, its predictions would be limited to your vocabulary, with no scope for suggesting better alternatives. To address these shortcomings, it makes sense to inject some essence from other local models, which are trained on a variety of writing styles and diverse vocabularies, while still retaining your own personalized style of writing and choice of words. That’s where aggregating models learned on different datasets comes into the picture: it provides generalization, resulting in a model that is better than the sum of its parts. Further, the global model’s outputs can be personalized by re-applying the updates from the local model.
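The article does not say how the local updates are re-applied for personalization. One simple reading, offered here only as an assumption, is to blend the global parameters with the client's locally trained parameters. The function name `personalize` and the mixing factor `alpha` are hypothetical.

```python
def personalize(global_params, local_params, alpha=0.5):
    """Move the global parameters part of the way toward a client's local ones.

    alpha = 0 keeps the global model unchanged; alpha = 1 recovers the
    purely local model. Values in between trade generalization for
    personalization.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if len(global_params) != len(local_params):
        raise ValueError("parameter lists must have the same length")
    return [g + alpha * (l - g) for g, l in zip(global_params, local_params)]


global_params = [1.0, 2.0]   # illustrative values, not from a real model
local_params = [3.0, 2.0]
personalized = personalize(global_params, local_params, alpha=0.5)
print(personalized)
```

Other approaches, such as fine-tuning the global model for a few extra steps on the local data, serve the same purpose. The choice depends on how much local data each client has.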

In a nutshell, Federated Learning enables training ML models without data leaving the source of origin, thus eliminating the need for centralizing data and enforcing data privacy by design.

Why is Federated Learning Necessary?

Data Inaccessibility - In many cases it is difficult to centralize the data being generated, because doing so is technically infeasible or not economically viable. As a result, data may be discarded after short intervals without any value being derived from it.
Data Privacy - Access to data may also be limited by regulations such as GDPR and CCPA and by other legal compliance requirements, which restrict both the kind and the amount of data that can be stored. Data breaches in the recent past also point clearly to the inherent danger of centralizing large volumes of data in a single location.

Federated Learning addresses these challenges by eliminating the need to store data at a central location. This enables users to collaborate indirectly with their peers to train models while safeguarding privacy. Organizations can also collaborate with their peers and vendors without sharing data, building more robust AI models that could potentially solve fundamental yet challenging problems in the Healthcare and BFSI sectors.

Several other privacy-preserving techniques, such as Differential Privacy, Homomorphic Encryption, and Secure Multi-Party Computation, can be used in addition to Federated Learning to provide stricter privacy guarantees. Together they make it possible to comply with the ever-changing landscape of data sharing and privacy guidelines, and they build trust in the robustness of privacy-first approaches.
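As one illustration of how such a technique can be layered on top of Federated Learning, the sketch below clips a client's model update to a maximum L2 norm and adds Gaussian noise before it is shared, which is the basic mechanism behind differentially private training. It is a simplified sketch under stated assumptions: the function names and the `max_norm` and `noise_multiplier` parameters are hypothetical, and choosing a noise level that yields a formal privacy guarantee requires a privacy accounting step that is not shown here.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def privatize_update(update, max_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip the update, then add Gaussian noise scaled to the clipping bound."""
    if rng is None:
        rng = random.Random()
    clipped = clip_update(update, max_norm)
    sigma = noise_multiplier * max_norm
    return [u + rng.gauss(0.0, sigma) for u in clipped]


raw_update = [3.0, 4.0]  # illustrative update with L2 norm 5
noisy_update = privatize_update(raw_update, max_norm=1.0,
                                noise_multiplier=0.5, rng=random.Random(42))
print(noisy_update)
```

Clipping bounds how much any single client can influence the aggregate, and the noise masks what remains of an individual contribution. In practice the noise is often added once to the aggregated update on the server rather than on each client.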

Federated Learning in the Tech Industry

Google uses Federated Learning in its mobile keyboard product, Gboard, to train language models that predict the next word you are going to type and improve query suggestions based on what you type. Google is also experimenting with Federated Learning to eliminate the use of third-party cookies in its Chrome browser, making it much harder for advertisers to track user activity on the web in order to serve targeted ads.
Siri, the virtual assistant for iOS, wakes up when you say “Hey Siri,” but not when the same phrase comes from your friends or family. Apple employs Federated Learning to enable this personalization based on your voice patterns.
NVIDIA Clara makes use of Federated Learning in the healthcare sector, where healthcare organizations can collaborate to build better diagnostic models without sharing patient data. This matters given the critical nature of the task and the need to preserve patient privacy in conformance with HIPAA regulations.
In the BFSI sector, WeBank uses FATE, a home-grown Federated Learning framework, to collaborate with other banks and financial institutions on training better models for credit risk management and anti-money laundering. This gives richer learning without sharing any data that might give away a participant's competitive advantage.

Authors:

Ankita Sinha, Software Engineer, Intuit
Arkadeep Banerjee, Data Scientist, Intuit
Goutham Kallepalli, Software Engineer, Intuit
