4 Steps to Solve the Unstructured Data Problem


Brief

In 1998, Merrill Lynch stated that most data stored in an enterprise is unstructured, estimating the share to be as high as 80%. The figure was somewhat anecdotal at the time, and only a few parties accepted it unequivocally. Although it remained unverified, some sources suggested that the actual share may indeed be close to 80%.

Fast forward to 2020. IDC and Dell EMC predicted that by this year there would be an increase of 40 zettabytes of data. Furthermore, IDC and Seagate reported that by 2025 the global datasphere would grow to 163 zettabytes, and that most of this data would be unstructured.

What do we observe from these figures? Before drawing a conclusion, we need to clarify what ‘unstructured data’ means in the context of an enterprise. Unstructured data has no predefined structure and is usually written and presented in a free-flowing manner. It can include documents such as employee information, insurance policies, travel papers, legal contracts, agreements, and invoices.

Making sense of this stack of information to bring out themes and trends requires time and considerable effort on the part of the organization. Most of this data arrives as text, the language is ambiguous, and key messages buried in it are not easy to discern or process. Moreover, because the real value lies in combining text data with structured data in decision-making contexts, the analysis of unstructured data remains a challenge.
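To make the idea of combining text data with structured data concrete, here is a minimal sketch in Python. It is an illustration only, not the approach described in the attached paper: the invoice sentence, the customer table, the field names, and the helper functions `extract_invoice` and `enrich` are all hypothetical, and real documents are far less regular than a single regular expression can handle.

```python
import re

# Hypothetical unstructured input: one free-text sentence from an invoice.
INVOICE_TEXT = (
    "Invoice INV-1042 issued to customer C-17 for a total of "
    "USD 1,250.00, due 2020-03-31."
)

# Hypothetical structured data: a customer master table keyed by customer ID.
CUSTOMERS = {"C-17": {"name": "Example Corp", "region": "EMEA"}}

# Assumed sentence layout; named groups become the extracted fields.
PATTERN = re.compile(
    r"Invoice (?P<invoice_id>INV-\d+) issued to customer (?P<customer_id>C-\d+) "
    r"for a total of (?P<currency>[A-Z]{3}) (?P<amount>[\d,]+\.\d{2}), "
    r"due (?P<due>\d{4}-\d{2}-\d{2})"
)


def extract_invoice(text):
    """Pull structured fields out of free text; return None if nothing matches."""
    match = PATTERN.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert "1,250.00" into a number so it can be aggregated later.
    fields["amount"] = float(fields["amount"].replace(",", ""))
    return fields


def enrich(fields, customers):
    """Join the extracted fields with the structured customer record."""
    customer = customers.get(fields["customer_id"], {})
    return {**fields, **customer}


record = enrich(extract_invoice(INVOICE_TEXT), CUSTOMERS)
print(record)
```

Once the text has been reduced to fields such as `amount` and `customer_id`, it can be analyzed alongside the structured data the enterprise already holds. The difficulty the article describes is that most documents do not follow a single predictable pattern like the one assumed here.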

By,

Vic Gupta

Senior Vice President – Digital & AI

Coforge Limited

www.coforgetech.com


The contents of third-party articles/blogs published here on the website, the interpretation of all information in the articles/blogs (such as data, maps, numbers, opinions, etc.), and the views or opinions expressed within the content are solely the author's and do not reflect the opinions and beliefs of NASSCOM or its affiliates in any manner. NASSCOM does not take any liability w.r.t. the content in any manner and will not be liable in any manner whatsoever for any kind of liability arising out of any act, error or omission. The contents of third-party articles/blogs published are provided solely as a convenience, and the presence of these articles/blogs should not, under any circumstances, be considered an endorsement of the contents by NASSCOM in any manner. If you choose to access these articles/blogs, you do so at your own risk.



© Copyright nasscom. All Rights Reserved.