In the previous tutorial, we looked at Data Acquisition. Now that we have acquired our data, we will identify it; that is, we will mark the data in some way so that we can track each record through our entire data migration. Not only is this beneficial to the mechanics of our migration, it will also become important once we start to look at migration reconciliation and auditing. We will always need to be able to explain where our data came from, and why we did what we did.
Assuming that we have data that is structured by rows and columns, we will add the following columns to each of our rows. If we have data that is structured differently, we will achieve the same result, but may have to mark the data using a different mechanism. We will discuss this further in a later article.
Column          Type      Length
dmSourceName    String
dmRowNumber     Integer
dmUUID          String    36
dmSourceName is a free-text field that identifies where our data came from. This may be a database table and column name, a disk file name, a URI, or any other source that our data may come from.
The prefix “dm” simply stands for “Data Migration”. It has been added to differentiate the columns that we add during processing from real data columns, and to avoid name collisions.
dmRowNumber is a generated sequence that identifies the row number of the data within each identified source. If, for example, during Data Acquisition, we have read data from a database table and then stored the data in a locally-hosted disk file, this sequence will identify the line number within this local file.
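As an illustration of a row number that restarts for each source, here is a minimal sketch in Java. The class name RowNumberSequence, its next method, and the choice to start counting at 1 are assumptions made for this example; they are not part of the tutorial's routines.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not from the tutorial): hands out row numbers
// that are counted separately for each source name.
class RowNumberSequence {
    // One running counter per source name.
    private final Map<String, Integer> counters = new HashMap<>();

    // Returns the next row number for the given source.
    // Assumption: numbering starts at 1 for every new source.
    int next(String sourceName) {
        return counters.merge(sourceName, 1, Integer::sum);
    }

    public static void main(String[] args) {
        RowNumberSequence sequence = new RowNumberSequence();
        // Two rows from one source, then one row from another source.
        System.out.println("customers.csv row " + sequence.next("customers.csv"));
        System.out.println("customers.csv row " + sequence.next("customers.csv"));
        System.out.println("orders.csv row " + sequence.next("orders.csv"));
    }
}
```

Because the counter is keyed by source name, rows from different sources can share a row number; it is the pair of source name and row number that locates a record in its source.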
dmUUID is a generated UUID that uniquely identifies each record throughout our data migration, and any subsequent migrations that we may undertake.
We will generate this UUID using the method routines.FieldHelper.getUUID.
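The body of routines.FieldHelper.getUUID is not shown here, so the following is only a sketch of what such a routine might look like, assuming that it wraps the standard java.util.UUID class. The class name FieldHelperSketch is hypothetical and is used to avoid implying that this is the actual routine.

```java
import java.util.UUID;

// Hypothetical stand-in for routines.FieldHelper; the real routine may differ.
class FieldHelperSketch {
    // Returns a random (version 4) UUID in its canonical text form,
    // which is 36 characters long: 32 hex digits and 4 hyphens.
    static String getUUID() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String id = getUUID();
        System.out.println(id + " (" + id.length() + " characters)");
    }
}
```

The 36-character text form matches the length given for the dmUUID column above.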