Why Companies Stumble at Delivering Access to Big Data (Part 2)

QuboleTechnologies

September 10, 2019

Big Data Analytics

This blog is the second installment in a two-part series about self-service access to data. Read the first post here.

We all know that delivering self-service access to data is a problem with no end in sight: businesses will remain perpetually behind the speed at which data volume, tools, and use cases multiply. Yet I see many companies fall into the same patterns in their attempts to address these challenges. As the demand for data skyrockets, enterprises continue to rapidly expand their investment in personnel and technology. While this approach to big data complexity seems sufficient at first glance, it lacks the agility and flexibility companies need to successfully run their businesses and prepare for future data demands.

An Insatiable Demand For Personnel

In an effort to get ahead of big data challenges, businesses frequently resort to hiring more people. Data teams are bringing on new personnel (or hiring third parties) to conduct penetration testing, manage the infrastructure, operate specialized technologies, or meet data demands from the broader organization. Unfortunately, this approach has two critical flaws: it has become incredibly difficult to find the right talent, and hiring additional people is a band-aid rather than a long-term solution.

Although the talent gap is affecting many areas of the business, it has become particularly problematic for IT and data professionals. By 2022, the demand for data science skills will lead to 2.7 million open positions, while the shortage of security experts is expected to reach nearly 2 million. Even if IT leaders wish to build out their data, security, open source software, and infrastructure teams, they will have trouble finding the right candidates to fill those roles.

The bigger issue with using hiring to solve your data challenges is whether the cycle can be sustained in the long run. Adding headcount addresses a symptom (resource limitations) rather than the full scope of the problem (i.e., the complexity inherent to a big data infrastructure). What’s more, the talent gaps that many data teams are experiencing are making it impossible to match the speed at which big data is evolving. And the problem is only becoming more prevalent: 65 percent of CIOs say a lack of IT talent prevents their organization from keeping up with the pace of change.

A Deluge Of Technology

Companies are eager to reap the rewards that big data tools offer. Yet with so many technologies built to address a specific purpose and so few designed to tackle an entire system of processes, one size no longer fits all. In the realm of big data, each open source engine offers distinct advantages for specific types of workloads. Your data analysts may rely on Presto, while your data scientists use Apache Spark and notebooks and your data engineers use Airflow. Unfortunately, running multiple engines can create further chaos for those managing the infrastructure. With three-quarters of businesses actively using multiple big data engines for their workloads, data teams need both the software expertise and the manpower to maintain and operate big data tools within the broader technology stack.

Often, enterprises will rely on technology to fill their various talent and resource gaps — and end up with a messier, more confusing infrastructure to manage. For instance, your team may elect to deploy data governance or security tools if you lack the headcount or are unable to find the right talent. Or, perhaps you’ve invested in a business intelligence (BI) tool to provide easier data access for non-technical users. Having a range of tools theoretically enables you to optimize processes and explore new project areas, though these technologies come with a caveat — you must have a plan in place to derive business value from these additions.

In reality, many enterprises lack the ability to determine where and how a new technology investment will fit in before it has been deployed. This situation results in a clunky, ad hoc infrastructure that requires a sizeable team to manage the technology stack. Only 40 percent of big data administrators are able to support more than 25 users — which means the majority of businesses are devoting extensive manpower to provisioning licenses, securing sensitive data, and regulating access controls. Your infrastructure can quickly become a black hole for resources that impacts project success and delays business-critical data initiatives.

Out-Of-Control Costs

Businesses around the world are jump-starting or increasing their investment in big data technology: IDC forecasts worldwide revenue for big data and analytics solutions will reach $260 billion by 2022. Coupled with an expanding roster of data professionals, the focus on big data resources is inevitably leading to rising costs. Not only will you spend significant funds on your team’s big data resources, but you’ll also accumulate unexpected costs — particularly around cloud computing.

Unlike traditional hardware and software, modern tools and technologies use a more complex pricing model — and, as a result, costs can be exceedingly difficult to contain. Every cluster you spin up and every model you run uses compute power, but that’s just the beginning. Your costs can easily grow out of control when you factor in some of the likely scenarios you’ll encounter with a cloud infrastructure:

Costs can creep up on those who are new to the cloud, especially if you don’t track how many jobs are running or how many clusters are active.
If you exceed your provisioned capacity, you must pay an additional fee.
Without autoscaling capabilities, you may end up paying for clusters that are running with no active jobs.
The large, bursty workloads of big data can cause drastic, unpredictable swings in compute usage, which push costs higher and higher.

The larger and more complex your infrastructure, the more difficult tracking every line item becomes. Compute costs can balloon rapidly — such as from clusters that a team spun up but forgot to downscale after completing the required jobs — and can catch even seasoned companies by surprise.
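
To make the idle-cluster scenario concrete, here is a minimal Python sketch of how the cost of forgotten clusters could be estimated. Everything in it is an assumption for illustration: the `Cluster` fields, the cluster names, the flat per-node hourly rate, and the 50 percent idle threshold are hypothetical and do not reflect any particular cloud provider's pricing or any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical usage record for one cluster over a billing window."""
    name: str
    nodes: int
    hourly_rate_per_node: float  # assumed flat price in USD, for illustration only
    hours_running: float         # total hours the cluster was up
    hours_busy: float            # hours with at least one active job


def idle_cost(cluster: Cluster) -> float:
    """Cost of the hours the cluster was up with no active jobs."""
    idle_hours = max(cluster.hours_running - cluster.hours_busy, 0.0)
    return idle_hours * cluster.nodes * cluster.hourly_rate_per_node


def idle_report(clusters, threshold=0.5):
    """Return (name, idle cost) pairs for clusters whose idle share of
    uptime exceeds the threshold, most expensive first."""
    flagged = []
    for c in clusters:
        if c.hours_running <= 0:
            continue  # cluster never ran, nothing to bill
        idle_share = 1 - c.hours_busy / c.hours_running
        if idle_share > threshold:
            flagged.append((c.name, round(idle_cost(c), 2)))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up example data: one cluster left running overnight, one well used.
clusters = [
    Cluster("etl-nightly", nodes=10, hourly_rate_per_node=0.50,
            hours_running=24, hours_busy=6),
    Cluster("adhoc-presto", nodes=4, hourly_rate_per_node=0.50,
            hours_running=8, hours_busy=7),
]

report = idle_report(clusters)
print(report)  # [('etl-nightly', 90.0)]
```

In this made-up example, the ten-node cluster that sat idle for 18 of its 24 hours accounts for $90 of spend with nothing to show for it, while the well-used cluster is not flagged. Real bills involve more line items than a flat hourly rate, so treat this as a way to reason about the problem rather than a costing tool.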

The Solution: Self-Service Access

I’ve seen firsthand just how ineffective it is to continuously throw tools and people at the dilemma of how to provide access to data. This trend hijacks your budget and resources, leaving your business far short of its desired data-driven state. Such a solution creates the perfect storm of an over-complicated infrastructure and manual, time-consuming data operations processes.

Nearly half of all companies (44 percent) now store 100 terabytes of data or more in their data lake, further increasing the complexity of connecting that data to the users who need it. To succeed in this data-driven world, businesses must provide data users with self-service access to stored data. In my next and final post of this series, I’ll discuss how an enterprise can make self-service a reality without sending costs skyrocketing or placing strain on the infrastructure.

Find out how Qubole can help you deliver self-service access to data users in this introduction to Qubole.
