Incremental Load of Information Sources

An Information Source supports two data access types - Online and Data Mart access. The Data Mart version is based on snapshot tables that contains calculated and derived data. Originally it was only possible to implement snapshot tables by using Materialized Views in the Oracle database. The Data Mart framework has been extended to also support incremental load of Information Sources. Incremental loading can be performed for dimensions as well as for facts but the facts are the main targets since the fact definitions are often rather complex and they also represent transactional data. Introducing incremental load possibilities gives a very big advantage compared to the solution based on Materialized Views; except for the first/initial load it will be possible to only recalculate data related to a small sub set of all available transactions and thus enhancing the refresh time considerably. The solution based on Materialized Views only handles complete refresh.

Contents

Framework Comparison

The Data Mart framework based on Materialized Views can be visualized as follows:

Figure 1: Data Mart access based on Materialized View

 

The extended Data Mart framework supporting incremental load can be visualized as follows:

Figure 2: Data Mart access based on incremental load

 

 

Functional Overview

The framework for incremental load supports the following functionality:

Data Mart Based on Incremental Load

 Some words about the Data Mart based on incremental load.

Refresh Options

The following refresh options are available:

  1. Full refresh  (might consider conditions added to the MVS view)
  2. Incremental refresh (using the MVS view as source)
  3. Sub Set load using the MVB view and a set of conditions that defines the sub set.

Considerations

Some considerations when creating objects related to incremental load:

How it Works

The general principle used by the framework will be explained in this section.

Assume that we have a Fact where the associated access views are retrieving information mainly from the tables CUSTOMER_ORDER_TAB and CUSTOMER_ORDER_LINE_TAB. Analyzing the Fact views, e.g. the function calls, leads to that there are two more tables that are involved when retrieving the Fact information; CUST_ORDER_LINE_TAX_LINES_TAB and CUST_ORDER_LINE_DISCOUNT_TAB.

 A structure describing the table dependencies must be defined. For the example case it is possible to define two structures. The first alternative is to use the following structure:

Figure 3: Table dependency structure - alternative 1

In this alternative the table CUSTOMER_ORDER_TAB has been defined as the top table. CUSTMER_ORDER_LINE_TAB is a child table to CUSTOMER_ORDER_TAB but at the same time acts as a parent for the other two tables. The framework will keep track of changes in all four tables and propagate the changes up in the dependency hierarchy. The end result will be a collection of keys in the top most table, i.e. order numbers in table CUSTOMER_ORDER_TAB, that defines the orders affected by changes on all levels.

Figure 4: Table change handling - alternative 1

Changes are in the above model handled as follows:

 

The second alternative is to create a dependency structure as below:

Figure 5: Table dependency structure - alternative 2

In this alternative the tables CUSTOMER_ORDER_TAB and CUSTMER_ORDER_LINE_TAB have been defined as top level tables. The framework will keep track of changes in all four tables and propagate the changes up in the dependency hierarchy. The end result will be a collection of keys related to CUSTOMER_ORDER_TAB and CUSTMER_ORDER_LINE_TAB.

Figure 6: Table change handling - alternative 2

Changes are in the above model handled as follows:

 

In general the second alternative is to prefer since it will reduce number of affected records compared to the first alternative.

 

Development

How to develop support for incremental load is described in section Incremental Load Development as part of the IFS Applications development guide.