ES Index migration simplified!

Data migration sounds simple in concept, but it rarely is. This is especially true of cloud migration, where complexities multiply. A lack of integration and synchronization between system components can lead to the loss or corruption of critical data. It can also cause data quality issues and security challenges, which carry heavy financial and reputational risks. Recovering lost data is tedious, time-consuming, and costly, and even after recovery the data must be checked and validated, which extends downtime.

Considering the burnout, security challenges, and data quality issues involved, most organizations treat data migration as a distinct project, often drawing on Data Engineering expertise to navigate these complexities. This blog describes how we addressed this exact problem for one of our clients with the help of ES Index Migration. Before getting to the actual solution, here is a brief introduction to the ES Index and ES Index Migration.

What is an ES Index?

An Index is a collection of one or more Documents, and each Document contains one or more Fields. In relational database terms, a Document corresponds to a table row and a Field to a table column. An Index is therefore comparable to a database in a relational system, and it has a mapping that defines its fields and their data types. It is a logical namespace that maps to one or more primary shards and can have zero or more replica shards. Elasticsearch returns search results quickly because it searches an index instead of scanning the text directly. This is like finding the pages in a book related to a keyword by checking the index at the back of the book, instead of reading every word on every page.
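The book-index analogy can be made concrete with a toy inverted index. This is only an illustrative sketch: the documents and function names are hypothetical, and the real index that Elasticsearch builds (through Lucene) also handles text analysis, scoring, and sharding, none of which is modeled here.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, fields in documents.items():
        for value in fields.values():
            for term in str(value).lower().split():
                index[term].add(doc_id)
    return index


def search(index, term):
    """Look the term up in the index instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical documents: each has an ID and one or more fields.
documents = {
    1: {"title": "Index migration", "body": "Restore from a snapshot"},
    2: {"title": "Cluster upgrade", "body": "Reindex from a remote cluster"},
    3: {"title": "Snapshot basics", "body": "Store a snapshot in S3"},
}

index = build_inverted_index(documents)
print(search(index, "snapshot"))  # [1, 3]
print(search(index, "cluster"))   # [2]
```

The lookup cost depends on the number of distinct terms rather than on the total amount of text, which is the point of the analogy.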

What is ES Index Migration?

Imagine that you want to migrate production Elasticsearch data to staging for QA purposes. Many other scenarios call for the same kind of move, such as:

  • Upgrading underlying cloud infrastructure/instance types for ES deployment
  • Migrating from one cloud to another cloud
  • Migrating from on-premises to cloud or vice versa
  • Creating a pre-prod/staging environment from production data clusters

In each of these cases, you may want to migrate indices and their data from one Elasticsearch cluster to another.

You can migrate ES Indices in the following ways:

  • Index your data from the original source
  • Reindex from a remote cluster
  • Restore from a snapshot
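For comparison with the snapshot approach used in the rest of this post, the second option (reindex from a remote cluster) comes down to a single `_reindex` request sent to the destination cluster. The sketch below only builds that request and does not send it. The host, index names, and helper name are hypothetical, and the destination cluster must also list the source host under `reindex.remote.whitelist` in its configuration.

```python
import json


def build_reindex_request(remote_host, source_index, dest_index,
                          username=None, password=None):
    """Build the method, path, and JSON body for a reindex-from-remote call.

    The request is meant to be sent to the DESTINATION cluster, which then
    pulls documents from the remote (source) cluster.
    """
    remote = {"host": remote_host}
    if username is not None:
        # Credentials are optional; include them only when provided.
        remote["username"] = username
        remote["password"] = password
    body = {
        "source": {"remote": remote, "index": source_index},
        "dest": {"index": dest_index},
    }
    return "POST", "/_reindex", body


# Hypothetical values for illustration only.
method, path, body = build_reindex_request(
    "https://source-cluster.example.com:9200", "orders", "orders"
)
print(method, path)
print(json.dumps(body, indent=2))
```

Reindexing copies documents only, so index settings and mappings should be created on the destination beforehand. That is one reason a snapshot-based restore can be simpler for whole-index migrations.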

To ease this operation, we created a simple Python-based automation tool that can migrate data from one Elasticsearch database to another using an S3-based snapshot.

Prerequisite
To migrate using an S3-based snapshot, the given plugin must be installed on both clusters:


 

Here is how it works:

  • Clone the following repo on your machine:

  • Update migration.yml with the required config values:



  • Run the migration utility:



How does it migrate?
[Figure: ES Index migration]
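A snapshot-based migration through S3 generally comes down to four calls to Elasticsearch's snapshot REST API: register the repository on the source, take a snapshot, register the same repository on the destination, and restore. The sketch below lists those calls without sending them. It is not the tool's actual code, which is not shown here; the function name, repository name, bucket, and index names are all hypothetical.

```python
def snapshot_migration_steps(repo, bucket, snapshot, indices):
    """Return the ordered REST calls for an S3 snapshot-based migration.

    Each step is (target_cluster, http_method, path, json_body).
    `indices` is a comma-separated string of index names.
    """
    source_repo = {"type": "s3", "settings": {"bucket": bucket}}
    # Assumption: the destination only reads from the bucket, so its
    # repository is registered read-only to avoid concurrent writers.
    dest_repo = {"type": "s3", "settings": {"bucket": bucket, "readonly": True}}
    selection = {"indices": indices, "include_global_state": False}
    snapshot_path = f"/_snapshot/{repo}/{snapshot}"
    return [
        # 1. Point the source cluster at the S3 bucket.
        ("source", "PUT", f"/_snapshot/{repo}", source_repo),
        # 2. Snapshot the selected indices into the bucket.
        ("source", "PUT", f"{snapshot_path}?wait_for_completion=true", selection),
        # 3. Point the destination cluster at the same bucket.
        ("destination", "PUT", f"/_snapshot/{repo}", dest_repo),
        # 4. Restore the selected indices on the destination.
        ("destination", "POST", f"{snapshot_path}/_restore", selection),
    ]


# Hypothetical names for illustration only.
steps = snapshot_migration_steps(
    "migration_repo", "my-es-snapshots", "snap-1", "orders,customers"
)
for cluster, method, path, body in steps:
    print(f"{cluster:12} {method:5} {path}")
```

The S3 bucket is the only thing the two clusters share, so they never need network access to each other. A restore fails if an open index with the same name already exists on the destination, so such indices need to be closed, deleted, or renamed on restore first.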
That was a brief overview of how the tool works and how it handles data migration projects. As the explanation above shows, the tool interacts with three components in the environment: the source database, the destination database, and the storage element that acts as a medium between the two. The whole purpose is to achieve data migration in the simplest and most efficient way. I hope this simplifies the Index migration process for your Elasticsearch clusters.




© Copyright nasscom. All Rights Reserved.