Data Migration Challenges

Data migration is a perennial exercise in enterprise computing. When a new information system (ERP/CRM) is introduced into the business, the data in the legacy system must be moved into the new, improved target system. Data migration projects are short-lived but inevitable because a new information system can be introduced into the business for a variety of reasons. Often, the existing legacy system has deteriorated over time, which leads to heavy maintenance and eventually to minimal or no vendor support. Mergers, acquisitions, demand for more functionality, and improved usability are other common business scenarios requiring data migration.



Data is of significant value to the organization that owns it. Therefore, considerable care must be taken to migrate the legacy data into the new system accurately. Several challenges occur in data migration projects. The following is my attempt to concisely summarize the key challenges from the business perspective.

  • Legacy data is an unfamiliar terrain
  • Limited development time
  • Delays in decision making from the business
  • Restricted execution time
  • Poor data quality
  • Lack of rigorous testing
  • Effect on data warehouse and BI

Legacy data is an unfamiliar terrain

When migrating from a legacy system to a new one, the original key developers are unlikely to be on the team. In the extreme case, there is no vendor support at all. The data dictionary is your friend, and the legacy system may have been documented well in the beginning. However, when a system has been operational for years, modifications to its original data model are to be expected; they are documented if you are lucky and undocumented otherwise. Therefore, you cannot rely entirely on the available documentation.

Limited development time

The data migration project cannot commence until the data model of the target system is well-defined, articulated, and documented. However, the migration must still be completed before the go-live. In an agile development environment, the target data model may remain subject to change until the final release. On the other hand, in an incremental go-live execution, a drastic change in the data model may occur due to a major bug fix in the live system. An extreme but likely situation is limited testing time between the production-ready release date and the go-live (it can be as short as one or two weeks).

Delays in decision making from the business

Data migration is always part of a larger enterprise-level change, which may alter business processes, reporting methods, and habitual data entry practices. It would be over-optimistic to expect quick decisions from business experts in the midst of such change. Subject matter experts, super users of the legacy application, and even the database developers may be struggling to complete their daily work while concurrently working with the new application vendor on requirement gathering and testing.

Restricted execution time

The business often asks for both the development time and the execution time of the data migration to be kept as short as possible. Unwanted delay in development can shorten the time available for data migration testing and may even push back the production release deadline. A large data extract from the legacy system can take longer to transform into the data model of the target system, resulting in a longer execution time. It is not feasible to keep business processes on hold for hours while the data migration is executing.

Poor data quality

The target system may have stricter data quality constraints than the legacy system. For example, in the legacy system, the country_name field of the customer address table is a free-text field, prone to typos, whereas the analogous country name field in the target system must hold a valid, UN-recognized country name represented by a unique code. A data quality issue that the legacy system is not even aware of can cause a severe problem in the target system. In the above example, the data migration developer may have to map one country name written in 10 different ways (due to typos) to a single matching country code in the new system. When the legacy data is of poor quality, data migration scripts can corrupt the data during transformation, and the corruption may become visible only once end users see it.
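The country name example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the COUNTRY_CODES lookup is a tiny hypothetical reference table (two-letter codes in the ISO 3166-1 alpha-2 style), the function names are made up for this post, and the 0.8 similarity cutoff is an arbitrary starting point that would need tuning against real legacy data.

```python
import difflib

# Hypothetical reference table for the target system (assumption: the target
# stores two-letter country codes). A real table would cover every country.
COUNTRY_CODES = {
    "new zealand": "NZ",
    "australia": "AU",
    "india": "IN",
    "united kingdom": "GB",
}


def normalize(raw):
    """Lower-case, drop dots, and collapse whitespace in a free-text value."""
    return " ".join(raw.strip().lower().replace(".", "").split())


def map_country(raw, cutoff=0.8):
    """Return the country code for a free-text name, or None if unsure."""
    name = normalize(raw)
    if name in COUNTRY_CODES:
        return COUNTRY_CODES[name]
    # Fall back to fuzzy matching to absorb typos such as "New Zeland".
    close = difflib.get_close_matches(name, list(COUNTRY_CODES), n=1, cutoff=cutoff)
    return COUNTRY_CODES[close[0]] if close else None


def migrate_countries(values):
    """Split legacy values into mapped ones and ones needing a business decision."""
    mapped, rejected = {}, []
    for value in values:
        code = map_country(value)
        if code is None:
            rejected.append(value)
        else:
            mapped[value] = code
    return mapped, rejected
```

Calling migrate_countries(["New Zeland", "  INDIA ", "Atlantis"]) maps the first two values and returns "Atlantis" in the rejected list. Values that cannot be matched are reported rather than guessed, so the business can decide how to treat them instead of the script silently corrupting the field.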

Lack of rigorous testing

Data migration testing is usually conducted in a tight time frame, after the legacy data has been migrated and is available in the target system. The volume of data within the data migration scope is usually large, so it is not practical to manually compare each data element in the destination with its origin. One popular workaround is to limit data migration testing to a representative sample of records (when the QA lead appreciates statistics). However, for massive datasets, this sample may still be large and require a lot of human resources, and yet still not count as reliable.
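Part of the sample comparison can be automated. The sketch below is a minimal illustration, not a full testing framework: it assumes both the legacy and the migrated records have already been read into dictionaries keyed by a shared record id, and that field_map (a hypothetical name, like the function names) lists which legacy field corresponds to which target field. It only checks fields that are copied unchanged; transformed fields would need their own comparison rule.

```python
import random


def sample_keys(keys, sample_size, seed=42):
    """Pick a reproducible random sample of record ids."""
    keys = sorted(keys)
    if sample_size >= len(keys):
        return keys
    return random.Random(seed).sample(keys, sample_size)


def compare_sample(source, target, field_map, sample_size, seed=42):
    """Compare sampled legacy records with their migrated counterparts.

    source, target: dicts mapping record id -> dict of field values.
    field_map: dict mapping legacy field name -> target field name.
    Returns a list of (record id, legacy field, legacy value, target value).
    """
    mismatches = []
    for key in sample_keys(source.keys(), sample_size, seed):
        src = source[key]
        tgt = target.get(key)
        if tgt is None:
            # The record never arrived in the target system.
            mismatches.append((key, "<missing record>", None, None))
            continue
        for src_field, tgt_field in field_map.items():
            if src.get(src_field) != tgt.get(tgt_field):
                mismatches.append(
                    (key, src_field, src.get(src_field), tgt.get(tgt_field))
                )
    return mismatches
```

Fixing the seed makes the sample reproducible, so the same records can be re-checked after a migration script is corrected. An empty result says only that the sampled records match; it does not prove the rest of the data is correct.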

Effect on data warehouse and BI

Large organizations with existing data warehouse and reporting infrastructure need to be more careful about data migration, as the new target system will feed the data warehouse ETL process and populate operational reports from the go-live moment. Immediately after the business goes live with the new system, the data warehouse and reporting must be able to work with the new underlying data model. Business analysts receiving corrupted reports, or no reports at all, after the go-live can be a nightmare for the data warehouse and BI team.

Coherent data migration solution

Whatever the reason for the change, data migration is inevitable. The new information system cannot be operational with an empty database. The legacy data, generally large in volume, must first be loaded into the new system before its go-live moment. More often than not, the data models of the old and new systems are fundamentally different.

The primary objectives of a data migration project are to extract the data from the legacy system, transform it into the desired target data model, and finally load it into the new information system. However, it needs an agile and proactive solution model to overcome the unique challenges of data migration. The next blog post in this series describes the coherent data migration solution, used by Venus Informatics, which mitigates the risks stemming from the above-stated challenges.




Krupesh Desai
Consulting Director
Venus Informatics Limited