
In the past, when data had to be updated, operators manually entered it into a data table. This led to data-entry errors and delays. Because the work was largely done in batches, typically as a daily job, there was substantial lead time between the moment an event occurred and the moment it was reported. Decision makers had to live with this lag and often made decisions on stale data.

Fast forward to the present, and real-time updates and insights are now commonplace requirements. Data pipelines were essentially built with the intent to move data from one layer (transactional or event sources) to data warehouses or lakes, where insights were derived.

The question is: with these demands for real-time insights and other quality requirements, are we still efficient using traditional architectures and the popular ETL approaches? Let’s find out!

Current state of Data Pipeline Architectures and Challenges

Data pipelines are important to any product digitization program. In the latter half of this decade we have witnessed immense focus on digital architecture and the technologies being adopted; the strong growth trajectory in the adoption of microservices and containerization establishes this fact. We also see these tech advancements being applied, but largely limited to the traditional “OLTP” side, i.e. core services and business logic.

However, the story is a bit different when one inspects the patterns used in data pipelines, the “OLAP” side of things. Here we observe limited adoption of the tech evolution seen in the core services space. Most data pipelines are built using either the traditional ETL or ELTL architecture, the de-facto industry approaches. Though these do solve the larger problem at hand, i.e. deriving actionable insights, they also come with certain limitations. Let’s explore some of these challenges:

Siloed Teams: The ETL process requires expertise in data extraction and migration. This often means the technical team is layered or structured around the technical nuances of the process. E.g.: an ETL engineer is often oblivious to the insights being derived and how they are consumed by end users.

Limited Manifestation: The implementation team tries to fit every desired use case into the set structure or pattern. Though this is not always a problem, there are times when it is inefficient. E.g.: how does one extract from an unstructured source and model the intermediate persistence schema?

Latency: The time taken to extract, transform and load the data often introduces lag. This lag can be attributed to data being processed in batches, or to the intermediate load steps needed to persist interim results. In some business scenarios this is not acceptable. E.g.: data streams emanating from an IoT service are stored and batch processed at a later scheduled time, introducing a lag between data generation and updated insights on dashboards.

Future state of Data Pipeline Architecture and Key considerations

As we see advancements in general software architecture, such as microservices and service mesh, there is a need for similar modernization on the data side. One key emerging approach is to distribute the data pipeline across domains instead of building one centralized pipeline; each domain contributes its own data products, and together they form a Data Mesh. Data Mesh aims to address these challenges by adopting a different approach:

  • Teams or pods aligned on functional feature delivery
  • Treating data as a product (discoverable, self-contained and secure)
  • Polyglot storage, with communication facilitated via the Mesh

An initial read on Data Mesh can be found here.

Data Mesh can be implemented in various ways. One effective pattern is to use an event-driven approach and event storming to form Data Products. A domain can comprise one or more Data Products. This also means that data can be redundant and persisted in more than one store; this is referred to as polyglot storage. Finally, these Data Products are consumed via Mesh APIs designed along the lines of each domain’s requirements.
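
To make this concrete, here is a minimal sketch in Python (standard library only) of one way such an event-driven Data Product could be shaped. The domain event OrderPlaced, the in-memory SQLite store and the revenue_by_region query are illustrative assumptions for this article, not elements of any specific Data Mesh toolkit; a real Data Product would sit behind a proper Mesh API and likely use more than one store.

    # Illustrative sketch only: the event, class and store names are hypothetical.
    import json
    import queue
    import sqlite3
    from dataclasses import dataclass

    @dataclass
    class OrderPlaced:              # a domain event owned by the "orders" domain
        order_id: str
        amount: float
        region: str

    class OrdersDataProduct:
        """One Data Product of the orders domain: consumes domain events,
        keeps its own store (one of possibly several polyglot stores), and
        exposes a small query method that stands in for a Mesh API endpoint."""

        def __init__(self) -> None:
            self.events: "queue.Queue[OrderPlaced]" = queue.Queue()
            self.store = sqlite3.connect(":memory:")
            self.store.execute(
                "CREATE TABLE orders (order_id TEXT, amount REAL, region TEXT)")

        def publish(self, event: OrderPlaced) -> None:
            self.events.put(event)

        def process_pending(self) -> None:
            # Event-driven ingestion: fold each pending event into the store
            while not self.events.empty():
                e = self.events.get()
                self.store.execute(
                    "INSERT INTO orders VALUES (?, ?, ?)",
                    (e.order_id, e.amount, e.region))
            self.store.commit()

        def revenue_by_region(self) -> dict:
            # "Mesh API": consumers query the product, not the raw pipeline
            rows = self.store.execute(
                "SELECT region, SUM(amount) FROM orders GROUP BY region")
            return {region: total for region, total in rows}

    if __name__ == "__main__":
        product = OrdersDataProduct()
        product.publish(OrderPlaced("o-1", 120.0, "EU"))
        product.publish(OrderPlaced("o-2", 80.0, "APAC"))
        product.process_pending()
        print(json.dumps(product.revenue_by_region()))  # revenue grouped by region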

Other architectural styles include Data Lake, Data Hub and Data Virtualization. A brief comparison of these can be found here.

Some other considerations that one should evaluate:

  • Facilitate easy data access at any time through standard interfaces such as SQL. Technologies like Snowflake, dbt and Materialize enable such real-time joins, which not only enables BI but also helps with the low-level plumbing of the pipeline (see the first sketch after this list)
  • Design data pipelines to be robust and fault tolerant, e.g. checkpoint intermediate results where required for further analysis (see the checkpointing sketch after this list)
  • Leverage distributed, loosely coupled processing units that can scale and use polyglot technologies, e.g. Spark jobs or Python models
  • Use Data Virtualization to mitigate bottlenecks, e.g. to shorten the lead time for data availability
  • Use DataOps effectively to track and evaluate your data pipeline’s performance
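
As a minimal illustration of the first point, the sketch below uses Python’s built-in sqlite3 module as a stand-in for warehouse or streaming-SQL tooling such as Snowflake, dbt or Materialize. The orders and shipments tables and the order_status view are hypothetical; the point is that a plain SQL view gives BI tools and downstream stages one standard interface, and every query of it reflects the latest ingested rows.

    # Illustrative sketch: sqlite3 stands in for warehouse/streaming-SQL tooling;
    # table, column and view names are assumptions made for the example.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE orders    (order_id TEXT, customer TEXT);
        CREATE TABLE shipments (order_id TEXT, status   TEXT);

        -- A view gives consumers a standard SQL interface; each query of it
        -- reflects whatever rows the pipeline has ingested so far.
        CREATE VIEW order_status AS
            SELECT o.order_id, o.customer, s.status
            FROM orders o JOIN shipments s ON o.order_id = s.order_id;
    """)

    # Simulate the pipeline ingesting new records
    db.execute("INSERT INTO orders    VALUES ('o-1', 'acme')")
    db.execute("INSERT INTO shipments VALUES ('o-1', 'in transit')")

    # BI tools (or other pipeline stages) simply query the view
    print(db.execute("SELECT * FROM order_status").fetchall())
    # [('o-1', 'acme', 'in transit')]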
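
And as a sketch of the fault-tolerance point, the snippet below checkpoints intermediate results to a local JSON file so that a re-run resumes from where the previous run stopped. The file name, the offset-based resume and the toy aggregation are assumptions made for the example, not a prescribed implementation.

    # Illustrative sketch: checkpoint intermediate results so a failed run can resume.
    import json
    from pathlib import Path

    CHECKPOINT = Path("pipeline_checkpoint.json")

    def load_checkpoint() -> dict:
        # Resume from the last persisted state, or start fresh
        if CHECKPOINT.exists():
            return json.loads(CHECKPOINT.read_text())
        return {"last_offset": 0, "partial_sums": {}}

    def save_checkpoint(state: dict) -> None:
        CHECKPOINT.write_text(json.dumps(state))

    def run_pipeline(records: list) -> dict:
        state = load_checkpoint()
        for offset in range(state["last_offset"], len(records)):
            rec = records[offset]
            sums = state["partial_sums"]
            sums[rec["region"]] = sums.get(rec["region"], 0) + rec["amount"]
            state["last_offset"] = offset + 1
            if offset % 100 == 0:          # persist interim results periodically
                save_checkpoint(state)
        save_checkpoint(state)             # final checkpoint doubles as the result
        return state["partial_sums"]

    if __name__ == "__main__":
        batch = [{"region": "EU", "amount": 120.0}, {"region": "APAC", "amount": 80.0}]
        print(run_pipeline(batch))         # {'EU': 120.0, 'APAC': 80.0}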

Conclusion

Finally, I would like to conclude with a disclaimer. This article does not aim to discard the current architectures associated with ETL. In fact, for certain use cases, such as batch jobs, ETL is still a very good option. The intent here is rather to recognize that requirements vary and to explore further architectures that could suit those needs well. In this article, we looked at a few such architectures, such as Data Mesh, and the associated areas one needs to consider.




L&T Technology Services
