Extracting, Data Transport and Loading
For copying full tables, XDM has two task types, Compatibility Table Copy Task and Native Table Copy Task. Both task types have their own mechanism to extract, transport and load data.
Compatibility Table Copy Task
In compatibility table copy tasks and in pairs of compatibility table icebox tasks, XDM uses SQL Data Transport to transport the data is called. In these tasks there is no alternative transport mechanism.
SQL Data Transport
When using SQL data transport, the data is read from the source tables using SQL SELECT statements and will be applied to existing tables using the SQL statements INSERT, UPDATE, DELETE and/or MERGE statements.
The extracted data is stored internally in a file on the XDM server. XDM uses this file to apply the data with SQL statements.
Due to the internal storage, it is not necessary to store files on both database servers and XDM does not need to move the data from the source to the target database server in a separate step.
| SQL data transport also applies automatically for row level processor tasks and for pairs of row level icebox tasks. |
Native table copy task
The native table copy task is designed to copy data in the fastest possible way, so it works in a different way using faster mechanisms than SQL Data transport.
Extract and Load mechanisms
Depending on the conditions described below, XDM uses one of three different extract and load mechanisms.
These mechanisms are:
-
Native extract and load,
-
Generic extract (via SQL
SELECT) and load, or -
SQL data transport (only in special cases).
XDM automatically selects the mode to be used for each table. It is quite possible that XDM uses different mechanisms during a task execution.
Native extract and load
XDM uses the appropriate native extract and load tools of the DBMS, if it is possible. To use these tools, these three requirements must be fulfilled:
-
Source and target DBMS is the same.
-
There are no activated reduction rules in the selection set.
-
There is no modification applied to the tables in the selection set.
For example, when copying whole tables from an MSSQL source database to an MSSQL target database, if all of the above conditions are met, then XDM would use the MSSQL bcp utility to perform the extract and load.
Generic extract and load
If a native table copy task has active reduction rules or modification sets, then XDM uses a generic extract. The data is extracted to CSV files that are stored on the database server and not in XDM internally. These files are loaded to the target database with an appropriate utility from the target DBMS.
The generic extract is also used when the source DBMS differs from the target DBMS. For example, when copying whole tables from a Db2 LUW database to an Oracle database, XDM extracts the data to CSV files and the Oracle SQL*loader utility to load these to the target database.
If generic extract is used, the data is processed in XDM and then written to a file on the target server. It is not necessary to use the data transfer method.
Special cases using SQL data transport.
There are a few cases where the extract and load mechanism switches automatically to SQL Data transport due to technical reasons:
-
If source or target DBMS is Snowflake, DB2 I, IMS, a Generic connection or a Filebridge connection.
-
If source DBMS is PostgreSQL and reduction rules or modification applied to a table in the selection set.
| It is recommended to use a compatibility table copy task in this situation. There is no performance advantage with a simultaneous loss of flexibility. |
In these cases, a use of native table copy task is not possible and will be refused by XDM:
-
If either source or target DBMS is PostgreSQL and the other is a different DBMS.
-
If source or target DBMS differs in Native From Icebox Tasks.
| It is necessary to use a compatibility table copy task (or compatibility table icebox tasks) in this situation. |
Data transfer method
With native table copy tasks, the data is usually written to a file on the source server during the extract and read from this file during the load on the target server. If not both servers are able to access this file directly it is moved from the source to the target server. Where the files are stored and how they are accessed is defined in a storage location.
The Data transfer method defines the mechanism that XDM uses to move data from the source database server to the target database server.
Possible methods are:
-
Local Use (internal),
-
Network transport,
-
Shared Drive, or
-
SQL data transport (internal, only in special cases).
In general the following rules apply:
-
The data transfer method switch is only used in native table copy tasks, not in native table copy icebox tasks. When using the icebox, the extracted data is stored in a different way that can not be changed.
-
SQL data transport is an implicit method to copy data without using a storage location in special cases. If SQL data Transport is chosen internally, the data transfer method is, therefore, irrelevant in this case.
-
If Generic extract is used (i.e. data is modified or reduced), it is processed in XDM and then written from XDM directly to the target server. Data transfer method switch is, therefore, irrelevant in this case.
-
When the source and target databases are located on the same server, it is not necessary to move extracted data from source to target server over a network connection. The data transfer method set in the task is irrelevant in this case. Here, the data transfer method is automatically set to Local Use (
LOCAL), and it is not possible to change this. -
When the source and target databases are located on different servers, and data is extracted via native extract from the source tables (i.e. using database utilities such as
EXPORTorUNLOAD), the extracted data is typically stored in a location that is local to the source database server. In this case XDM may have to move the data to the target machine in a separate step and uses the data transfer method that was selected for the task.
There are two different data transfer methods available, which one is preferable depends on the usage scenario. In a native table copy task the data transfer method must be specified explicitly. Possible choices are Shared drive and Network .
Local Use
The data transfer method Local Use uses one local directory for source and target server.
XDM uses Local Use if the source database and the target database are located on the same DBMS server. In this situation, the same storage location is used for the source and target. Local use is an internal data transfer method to which XDM switches if the above requirements are fulfilled. The Data Transfer Method set in the interface is then ignored, as no data transport is necessary.
With Native Extract, XDM extracts the data into the working directory of the storage location using UNLOAD.
XDM then loads the data from the working directory into the target database using LOAD.
With Generic Extract, XDM selects the data using SQL SELECT via a JDBC connection and processes it internally.
XDM then saves the selected and processed data as a file in the working directory. From there, it is loaded into the target database using LOAD.
Network transport
The data transfer method Network uses a local directory on both source and target server respectively. The extracted data is stored in the local source directory. After extracting all required data from the source tables, a copy program is invoked that moves the extracted data from the source directory over a network connection to the target directory. The target tables are then populated with the data from the local target directory.
XDM uses Network if two storage locations on different DBMS servers are used and the Data Transfer Method Network is set in the task template. The selected data is copied from the source database server to the target database server via a network connection.
With Native Extract XDM extracts the data with unload mechanisms into the working directory of the source storage location. From there, it is copied via FTP to the working directory of the target storage location and loaded from this directory into the target database using load mechanisms. This is the only situation in which an FTP user is used for the data transfer.
With Generic Extract, XDM selects the data using SQL SELECT via a JDBC connection and processes it internally.
XDM then saves the selected and processed data as a file in the working directory of the target connection. From there, it is loaded into the target database using load mechanisms.
It is, of course, possible to use this method to move data between two different servers when a shared drive exists. However, since there is an extra step to explicitly move the data using FTP or SCP, this transport method is slower than the shared drive method. Nevertheless, a user may want to use this data transfer method if the shared drive does not have enough free space to store all the extracted data, or if the user does not have the appropriate access permissions for the shared drive on both servers.
If no transport method set for a native table copy task with different source and target database server, this option will be used as default. No further settings are required for this transfer method.
For more details refer to the data transfer method description in the reference.
Shared Drive
The data transfer method Shared Drive uses a directory accessible in the file system of the source database server to store extracted source data. The mount point for a shared drive must exist on the source system, and the path to the mount point on the source database server is set in the Source mount point property in the task template. The physical location of extracted data stored on the shared drive is transparent in so far as the mounted drive can be accessed like a local directory by any program running on the source server.
Similarly, the shared drive must also be accessible in the target server’s file system. On the target server, the extracted data is visible without having to invoke an explicit program that copies data over a network connection. The target tables are populated with the data from the mounted directory, which can be accessed like a local directory by any program running on the target server. For this, the mount point for a shared drive must exist on the target system, and the path to the mount point on the target database server is set in the Target mount point property in the task template.
XDM uses Shared Drive if two storage locations on different DBMS servers are used and the Data Transfer Method Shared Drive is set in the task template. The selected data is copied from the source database server to the target database server using the shared directory.
With Native Extract, XDM extracts the data into the source mount point of the shared drive using unload mechanisms. XDM then loads the data from the target mount point of the shared drive into the target database using load mechanisms.
With Generic Extract, XDM selects the data using SQL SELECT via a JDBC connection and processes it internally.
XDM then saves the selected and processed data as a file in the target working directory.
From there, it is loaded into the target database using load mechanisms.
When choosing Shared drive in the task definition, the absolute paths of the storage location must be set in the fields Source mount point and Target mount point in the Transfer Method section in the Other options tab of the task template. The user accounts that are used to execute the XDM backend programs on both the source and the target servers must have appropriate authorization to access the respective mount directories.
| If you often need to copy bulk data, it is preferable to configure a shared drive for the source and target environment, if possible, to avoid excessive network traffic. Furthermore, the data will be written and read only once during task execution. With network transport, the data is written twice during task execution. |
For more details refer to the data transfer method description in the reference.
Overview of used paths
The following table provides an overview of which paths are used in which configuration:
Data Transfer Method |
Extract Method |
Extract Path |
Load Path |
Local Use |
Native Extract |
Working Directory |
Working Directory |
Local Use |
Generic Extract |
Working Directory |
Working Directory |
Network Transport |
Native Extract |
Source Working Directory |
Target Working Directory |
Network Transport |
Generic Extract |
Target Working Directory |
Target Working Directory |
Shared Drive |
Native Extract |
Source Mount Point |
Target Mount Point |
Shared Drive |
Generic Extract |
Target Working Directory |
Target Working Directory |
Native Table Icebox Tasks
When using native table icebox tasks, the data will be stored as part of the icebox generation.
In the native table to icebox tasks, the icebox generation is stored in the icebox generation directory, which is mounted to /xdm/backups/ in the XDM installation.
In the native table from icebox task, the data is loaded from the icebox generation directory into the database.
In contrast to the native table copy task, this is done in two steps.
With Native Extract, XDM extracts the data into the working directory of the source storage location using UNLOAD.
From there, the data is transferred to the icebox generation directory on the XDM server.
With Generic Extract, XDM selects the data using SQL SELECT via a JDBC connection and processes it internally.
XDM then saves the selected and processed data as a file in the icebox generation directory.
In both cases, the data is transported from this directory in the From Icebox Task to the working directory of the target storage location.
XDM then loads the data from the target mount point of the shared drive into the target database using LOAD.