Wednesday, October 1, 2014

Migrating OWB to ODI: The PRE-MAPPING dilemma

Most of the Oracle Warehouse Builder projects I have worked on, have used the PRE- and POST-Mapping steps to do some sort of housekeeping or retrieve values from the database (like the SCN or a date) as a primitive version of Change Data Capture. When these mappings are migrated to ODI 12c the migration utility will drop the objects from the mapping and move the code to the Begin Mapping Command and End Mapping Command in the Physical Layer.
The migration will work unless we need to use the output values in the rest of the mapping. One of the solutions would be to split the mapping into a pre- and post-mapping ODI procedure, some ODI variables and combining them in an ODI package.
Let’s look at a simple example. The following mapping uses the classic EMP table as input to populate a staging table (EMP_STG). GET_MAPPING_NAME fetches the name of the mapping, but needs an input variable so constant DUMMY is defined but has no significant value. The name is passed to the PRE-MAPPING which calls a procedure START_RUN_STEP which in turn will return a range of SCN’s to use in the filter on the source table. ROWSCN_EXP adds the ROWSCN pseudo-column to the EMP table. The DEDUPLICATOR is needed to prevent an API8003: Connection target attribute group is already connected to an incompatible data source exception.
2014-10-01 10_36_28-Oracle Warehouse Builder _ _DEMO_DB11G_DEMO_PRE_MAP_DEMO
START_RUN_STEP looks as follows:
FUNCTION start_run_step(p_mapping_name      VARCHAR2
,p_run_nr OUT dwh_runs.run_nr%TYPE
,p_previous_scn OUT dwh_runs.current_scn%TYPE
,p_current_scn OUT dwh_runs.previous_scn%TYPE)
RETURN NUMBER
IS
PRAGMA AUTONOMOUS_TRANSACTION;
l_step_id dwh_run_steps.id%TYPE;
BEGIN
BEGIN
SELECT run_nr
,previous_scn
,current_scn
INTO p_run_nr
,p_previous_scn
,p_current_scn
FROM dwh_runs
WHERE run_status = 'R';
EXCEPTION
WHEN NO_DATA_FOUND
THEN
raise_application_error(-20000
,'No active run');
END;

INSERT INTO dwh_run_steps(drn_run_nr
,mapping_name
,startdate)
VALUES (p_run_nr
,p_mapping_name
,SYSDATE)
RETURNING id
INTO l_step_id;

COMMIT;
RETURN l_step_id;
END;

In a previous step procedure START_RUN has inserted some information into table DWH_RUNS. When we migrate this simple mapping to ODI it will look like this:

2014-10-01 13_28_25-PRE_MAP_DEMO - Editor

GET_MAPPING_NAME is now loose and the PRE-MAPPING is gone. The code has been moved to the physical layer:

2014-10-01 13_33_55-Edit Begin Mapping Command for _Physical_

If we leave it like this the mapping will fail. First because GET_MAPPING_NAME.MAPPING_NAME is not defined. We could try to replace this with <%=odiRef.getPop("POP_NAME")%> but that won’t work either. Secondly, even if it did work, we would still not be able to retrieve the output parameters for use in the mapping filter.




Solution 1


My first solution was to pull out the PRE-MAPPING code and move it to an ODI procedure and create ODI variables to store the input and output parameters. We can refresh the variables via an SQL statement, but wait we can’t retrieve output parameters of a procedure in a SQL statement. That can only be done with the return value of a function. To circumvent this issue I created a number of database packaged variables filled in the original PRE-MAPPING procedure and getter functions in the package. Now I can call the getter function in the refresh part of the ODI variable. Then I can use the ODI variables in the filter part of the mapping, after which I only have to create an ODI package to orchestrate it all.


Solution 2


After I stumbled onto the David Allen’s blog post about migrating OWB PLSQL Procedures I wondered if I could use the same technique for the PRE-MAPPING migration. The answer is: Yes. Although ODI cannot call a procedure in a mapping it can call a Table Function. The trick is to wrap the procedure in a pipelined function. the wrapped function will look like this:
CREATE OR REPLACE FUNCTION pl_start_run_step(p_mapping_name VARCHAR2)
RETURN tb_run_step_obj
PIPELINED
AS
out_pipe run_step_obj
:= run_step_obj(NULL
,NULL
,NULL
,NULL
,NULL);
BEGIN
out_pipe.p_mapping_name := p_mapping_name;

out_pipe.p_step_id :=
dwh_runs_mgr.start_run_step(p_mapping_name
,out_pipe.p_run_nr
,out_pipe.p_previous_scn
,out_pipe.p_current_scn);
PIPE ROW (out_pipe);
RETURN;
END;

To make it compile correctly we also need to create an object and a table of objects:
CREATE OR REPLACE TYPE run_step_obj AS OBJECT
(
p_mapping_name VARCHAR2(1024)
,p_run_nr NUMBER
,p_previous_scn NUMBER
,p_current_scn NUMBER
,p_step_id NUMBER
);
/

CREATE OR REPLACE TYPE tb_run_step_obj AS TABLE OF run_step_obj;
/

Now we can put it all together in a mapping, but since we are going to use this table function in many mappings it is best to put it in a reusable mapping:

2014-10-01 16_17_32-Oracle Data Integrator Studio 12c _ PRE-MAPPING

Assign '<%=odiRef.getPop("POP_NAME")%>'to mapping_name and pass it to the table function.

2014-10-01 16_20_51-EXPRESSION - Properties - Editor

Add the function to the Function Name property:

2014-10-01 16_19_55-pl_start_run_step - Properties - Editor

Now use the reusable mapping in the migrated mapping:

2014-10-01 16_29_56-PRE_MAP_DEMO - Editor

Don’t forget to clear the Begin Mapping Command. We also need to add a Cross Join between the PRE_MAPPING Table Function and the DEDUPLICATOR, this is only possible because we know for sure that the function will only return one row. Then adjust the filter to include the PRE_MAPPING parameters:

2014-10-01 16_33_16-FILTER - Properties - Editor

Now we have a working mapping which is a lot easier to understand for OWB developers than the ODI solution with variables, procedures and a package. Also everything is captured in one compact, reusable mapping.


Solution 3


A combination of solution 1 and 2.


But wait…


There are some potential drawbacks to this way of working.

The restart issue
As Tom Kyte pointed out it is possible for a statement to restart. This can happen when an update statement runs into a locked record. When this happens the updates so far will be rolled back and Oracle will try and lock the records before the statement is run again. In my case the table function I used contains an AUTONOMOUS TRANSACTION which will be executed twice when a restart occurs. Also anything that is non-transactional, like modifying a package global variable, will mess up your process

The DETERMINISTIC issue
You must make sure your function is DETERMINISTIC or you must guarantee the function is only called once. Because you can connect the table function to more than one object in your mapping, you must be aware of how ODI generates the underlying SQL code. If the function is called more than once in the generated code you must make sure the result is always the same for this session, in other words: your function must be DETERMINISTIC.
Update 15-10-2014

After doing some more tests I have come to the conclusion that, even though the solution works and performs well, it is not the preferred method of migrating an OWB  pre-mapping.

No comments:

Post a Comment