Data Reduction

Reduction Rules

Reduction rules specify which objects should be reduced in an XDM task execution.

They are used to control the amount of data when fetching rows from a table. To this end the SELECT statement to fetch the data can be customized. The statement can be either a conditional expression (e.g. WHERE clause), or a template that defines the entire SELECT statement.

Once the set of rows to be copied has been determined, the reduction rules are applied to this set. When a row is removed from the set, because it does not satisfy the condition of a reduction rule, XDM automatically removes all dependent rows from the set as well. Therefore, reduction rules will be set on the source tables.

All selected rows which fulfill the reduction rule condition are copied, all other rows are discarded.
If a reduction rule is set for a table that is not selected in the task execution, the rule will be ignored.
Every reduction rule in the task execution is used. If there are two or more rules on a table, only the last matching rule is used. All previous rules are ignored for this table.

The reduction rule expression specifies a conditional expression that determines whether a row is to be copied or not.

There are two ways to define an expression:

  • By defining a WHERE condition for a SELECT statement generated at runtime

  • By defining a template that contains the full SELECT statement.

XDM evaluates the expression as a SELECT statement when processing the keys during subset extract. If the full SELECT statement template is not defined, the expression is interpreted as a WHERE clause for a SELECT statement. This must be entered in the form WHERE <condition>. The first part of the statement is generated by XDM.

It is possible to use runtime parameters as variables in the query. This is both for the WHERE condition and for the template possible. When using a SELECT template, the variables ${columns}, ${tableName}, and ${keys} will be set automatically in the task execution.

Using the WHERE clause

Usually the expression is set by defining a WHERE clause. A SELECT query that uses the specified WHERE clause is generated. Only rows which satisfy the resulting SELECT query are copied.

It is not possible to use conditions containing columns from other tables. If it is necessary to use such condition, use full templates instead.

Using the SELECT template

With a template, it is possible to customize the SELECT and FROM clauses of the statement. A SELECT template starts with <#--template--> and has to contain a valid SQL SELECT statement, as well as the variables ${columns} for the list of columns, ${tableName} for the name of the table and ${keys} for the list of keys selected by XDM when evaluating the start condition. These values will be set automatically by XDM during task execution (see default template below).

The variables ${columns}, ${tableName} and ${keys} must be part of the template, otherwise the template cannot be processed correctly.
A template should be used for complex reduction rules that contain more than a straight forward WHERE clause.

The default template is defined as:

<#--template-->
SELECT ${columns}
   FROM ${tableName} "T"
   WHERE ${keys}

Examples

Thin out rows

A typical use for a reduction rule is to thin out the rows that are copied from the child table of a relationship. For example, if a table contains history records, and a data relation rule specifies that the rows from the child table are also to be copied, a reduction rule can be added to the child table in order to specify that not all dependent rows, but only those that fulfill a certain conditional expression, are supposed to be copied. An example for this conditional expression is:

WHERE ID < 1000

Copy data depending on condition

Another usage of reduction rules is to filter rows that have a special value in one column. Assume, you want to copy customers with their contracts. The contracts have different types named A, B and C, which are stored in the CONTRACT_TYPE column of the contract table. For your purpose, only contracts of type A are necessary. To avoid to copying contracts of type B and C, you would define a reduction rule for the table PROD.CONTRACT and the column CONTRACT_TYPE:

WHERE CONTRACT_TYPE = 'A'

Copy data depending on a condition in another table

If the condition is set in another table, it is not possible to use the WHERE condition. We modify the example Depending Condition: Instead of a CONTRACT_TYPE column with the types as data, the type is stored in its own table called PROD.CONTRACTTYPE. The ID of this table is stored as foreign key in the column CONTRACT_TYPE. Again, only contracts of type A are necessary. To satisfy this requirement, a reduction rule template with a join between PROD.CONTRACT and PROD.CONTRACTTYPE is used. Note that PROD.CONTRACT is the subject of the reduction rule and therefore specified by the variable ${tableName}.

<#--template-->
SELECT ${columns}
   FROM ${tableName} "T" JOIN "PROD"."CONTRACTTYPE" "C" ON "T"."CONTRACT_TYPE" = "C"."ID"
   WHERE ${keys} AND "C"."TYPE" = 'A'

For more details refer to the reduction rules description in the reference.