Wednesday, October 29, 2014

The Power of Groovy: Finding all mappings in an ODI 12c project

For a groovy script I needed a list of all mappings in a project. So far I only had to retrieve lists of objects from just one folder, but for this script I needed them all. My first thought was to look for a findByProject method for the IMappingFinder class, but for some reason all findByProject methods need an additional parameter like the folder name. Funnily there is a findByProject which only takes a project code in the IOdiPackageFinder class.

My first thought was very procedural: lets do it do old fashioned way and create a class which will traverse the folder tree of a project and build a list of all the mappings. The class I built looks like this:
import oracle.odi.domain.mapping.Mapping
import oracle.odi.domain.mapping.finder.IMappingFinder
import oracle.odi.domain.project.OdiProject


/**
 *  A class to find all Mapping objects within a project
 *  All folders are traversed recursively and the mappings are returned in a collection.
 */
class FindMappings {
    def private allMappings = null

    def processProject(odiInstance, projectCode) {
        def odiProjectsList = (odiInstance.getTransactionalEntityManager().getFinder(OdiProject.class).findByCode(projectCode))

        odiProjectsList.each { p ->
            def odiFoldersList = p.getFolders()
            odiFoldersList.each { f ->
                /* process interfaces of the current folder */
                this.listMappings(odiInstance, p.getCode(), f.getName())
                /* Process sub folders recursively */
                this.processSubFolder(odiInstance, f, p.getCode())
            }
        }

        return (allMappings)
    }

    def private listMappings(odiInstance, projectCode, folderName) {
        def mappingList = ((IMappingFinder) odiInstance.getTransactionalEntityManager().getFinder(Mapping.class)).findByProject(projectCode, folderName)

        if (allMappings == null) {
            allMappings = mappingList
        } else {
            allMappings = allMappings + mappingList
        }
    }

    /* given an odiInstance, folder and project code, we will parse all subfolders (recursively) and print the name of all interfaces found at all levels*/

    def private processSubFolder(odiInstance, Folder, projectCode) {
        def subFolderList = Folder.getSubFolders()

        if (subFolderList.size() != 0) {
            subFolderList.each { s ->
                /* process interfaces of the current folder */
                this.listMappings(odiInstance, projectCode, s.getName())
                /* Process sub folders recursively */
                this.processSubFolder(odiInstance, s, projectCode)
            }
        }
    }
}
The entry point is the processProject method which we pass a project code. The result is a collection of mappings.

I felt a bit unsatisfied by the result, although it works well I kept thinking there must be a better way to do this. Being fairly new to groovy I have not yet had the opportunity the get to know all the ins and outs of the language, but when I was looking at the documentation for collections I got thinking if I could rewrite the class to a few lines of code, and indeed it is possible.

As I mentioned before there is no findByProject method in the IMappingFinder class, however it does inherit a findAll method from its parent. So we can find all mappings in a repository as follows:
def mappingList = ((IMappingFinder) odiInstance.getTransactionalEntityManager().getFinder(Mapping.class)).findAll()
Now we’ve got a list of all mappings in the repository, but I only want the mappings in a specific project. We can select these by using the collection and object methods in combination with closures. The resulting find will look like this:
def mappingList = ((IMappingFinder) odiInstance.getTransactionalEntityManager().getFinder(Mapping.class)).findAll().findAll{w -> w.getProject().getCode() == '<projectCode>'}

This one line of code does the same as the class does I built first.

Slightly rewritten to make it a bit more readable, it could look like this:
def tme = odiInstance.getTransactionalEntityManager()            // Shortcut to transaction manager
def fm = ((IMappingFinder) tme.getFinder(Mapping.class))         // shorcut to Find Mapping

def mappingList = fm.findAll().findAll{w -> w.getProject().getCode() == '<projectCode>'}

We can keep on chaining these commands together like:
// Find all mappings in project DEMO which start with 'DEM'
def mappingList = fm.findAll().findAll { w -> w.getProject().getCode() == 'DEMO' }.findAll {w -> w.getName()[0..2] == 'DEM'}

// Find all mappings in project DEMO which name contains STG
def mappingList = fm.findAll().findAll { w -> w.getProject().getCode() == 'DEMO' }.findAll {w -> w.getName().contains('STG')}


Amazing power.

Wednesday, October 22, 2014

Using IntelliJ IDEA as development tool for ODI Groovy scripts

I find the Groovy editor in ODI quite limiting when editing my scripts. I know there must be a multitude of IDE’s which are suitable for editing Groovy but I found InteliJ IDEA on Google and liked the look and feel of the tool, so I wondered if I could use it to make editing ODi Groovy scripts easier.

Download the IntelliJ IDEA Community Edition which is free from the following link and install.

Start InitelliJ IDEA:

2014-10-21 07_31_55-IntelliJ IDEA

Select Project structure  so we can setup the defaults for our ODI Groovy projects. The first thing we have to define is the SDK. Select Project on the left and create the New button to create a new SDK and select JDK. In the select window that follows select the home folder of the SDK you use to run ODI. I use jdk1.7.0_45 because I now this works for ODI 12.1.3, I have had some issues with other patch levels.

2014-10-21 07_33_15-Project Structure

Next we need to setup the ODI Libraries. Click Libraries on the left and then click the green + sign at the top of the empty box. Again a select window will open where you can add the library paths by clicking the green +. At the top you can enter a name for the Library, I chose odilib. Add the following paths to the library (the version numbers might vary):

  • %ODI_HOME%\odi\sdk\lib
  • %ODI_HOME%\oracle_common\modules\oracle.jdbc_12.1.0
  • %ODI_HOME%\oracle_common\modules\oracle.jsf_2.1

2014-10-22 00_01_08-Project Structure

Now setup is complete and we can start creating projects. Make sure to select Groovy for the language, the Groove library can remain empty, then finish creating the project.

2014-10-22 01_30_13-New Project

I have chosen to create a module for each of my scripting sub-projects, you can choose to organize your projects any way you want, but this works for me. Create a new Module by right clicking on the project name and then selecting New->Module. The New Module dialog appears and again choose Groovy as your language and select the Groovy Library we defined before (in my case odilib). Finish creating the Module.

2014-10-22 00_25_38-New Module

Within a Module we can create several types of source files, but for scripting I use a Groovy Script type. To create it right-click the module name and select New->Groovy Script. This will open an editing window for your new script.

2014-10-22 00_26_40-Program Manager

These are the features of IntelliJ IDEA I particularly like:

  • The out of the box support for Groovy
  • Code->Reformat Code
  • Code->Optimize Imports
  • The automatic import of classes
  • The very helpful information the editor gives when writing code
  • Version control
  • and many, many more…

Of course there are many IDE’s which have similar functionality. You will just have to use the one that works best for you.

Wednesday, October 8, 2014

Migrating OWB to ODI: ORA-02292 integrity constraint (DEV_ODI_REPO.FK_MAP_EXPR_REF_3) violated - child record found

I have run into a persistent error when trying to migrate my OWB mappings to ODI. I use this simple mapping for test purposes:
2014-10-08 10_53_15-Oracle Warehouse Builder _ _DEMO_DB11G_DEMO_MAPPING_2
When I migrate this to ODI I get the following result:
2014-10-08 11_10_10-Oracle Data Integrator Studio 12c _ MAPPING_2
So far so good and exactly as expected. Now Expression_0 is no longer needed and has to be removed, so select the object, press delete and save. The object disappears, but the 2014-10-08 11_13_38-Oracle Data Integrator Studio 12c _ MAPPING_2 icon stays highlighted. Pressing the icon again seems to save the changes. Close and reopen the mapping, and lo and behold the expression_0 object is still present. Now delete it again and press the save all button instead and…..
Unable to save MAPPING_3 (ODI-10143: Error while accessing the ODI repository.   
ORA-02292 integrity constraint (DEV_ODI_REPO.FK_MAP_EXPR_REF_3) violated - child record found
I never used to get this error before, but I think I might just have pressed the save button and not noticed the object did not disappear. This looks like a bug in the migration utility to me.
There is a workaround, but that requires me to modify the original OWB mapping, I have not yet found a way to solve the problem in ODI.
The workaround is as follows:
  • either remove the expression in OWB before the migration
  • or replace the expression by a constant (you can still use expressions in a constant and since it the input for a pre-mapping it will only be evaluated once)

Thursday, October 2, 2014

Migrating OWB to ODI: What about the POST-MAPPING?

In my previous post I looked at solutions for migrating an OWB pre-mapping component to ODI. It turns out a table function was the best solution in my case. But what about the post-mapping, can it be migrated in a similar fashion. After some testing I think that theoretically it should  be possible, it would not be a practical solution though. So for the post-mapping I will go down a different road.
PROCEDURE end_run_step(p_step_id   NUMBER
                      ,p_message   VARCHAR2 DEFAULT NULL)
IS
   PRAGMA AUTONOMOUS_TRANSACTION;
   l_run_number  dwh_runs.run_nr%TYPE;
BEGIN
   UPDATE dwh_run_steps drp
      SET drp.enddate   = SYSDATE
         ,drp.MESSAGE   = p_message
    WHERE id = p_step_id;

   COMMIT;
END;
Again, I would like to prevent using an ODI package, and I want to make sure my post-mapping is the last step in the mapping. Using the table function approach could work, but it will be very hard to ensure it is the last action in the mapping, as well as guaranteeing that the procedure will only run once.
Leaving the code in the Physical Layer End Mapping Command will ensure it is the last step and it will only run one. What about the 2 parameters though? The easiest way to transfer these values from the pre-mapping (where the step_id is generated) is via a public database package variable. We need to change the the DWH_RUNS package, or create a new one, to achieve this. I will assume that modifying the code is no problem and change the package like this:
CREATE OR REPLACE PACKAGE demo.dwh_runs_mgr
AS
   g_step_id  NUMBER;

   FUNCTION start_run
      RETURN NUMBER;

   PROCEDURE end_run(p_run_status VARCHAR2);

   FUNCTION start_run_step(p_mapping_name      VARCHAR2
                          ,p_run_nr        OUT dwh_runs.run_nr%TYPE
                          ,p_previous_scn  OUT dwh_runs.current_scn%TYPE
                          ,p_current_scn   OUT dwh_runs.previous_scn%TYPE)
      RETURN NUMBER;

   PROCEDURE end_run_step(p_step_id   NUMBER
                         ,p_message   VARCHAR2 DEFAULT NULL);
END;

We also need assign a value to the global variable:
FUNCTION start_run_step(p_mapping_name      VARCHAR2
                       ,p_run_nr        OUT dwh_runs.run_nr%TYPE
                       ,p_previous_scn  OUT dwh_runs.current_scn%TYPE
                       ,p_current_scn   OUT dwh_runs.previous_scn%TYPE)
   RETURN NUMBER
IS
   PRAGMA AUTONOMOUS_TRANSACTION;
   l_step_id  dwh_run_steps.id%TYPE;
BEGIN
   BEGIN
      SELECT run_nr
            ,previous_scn
            ,current_scn
        INTO p_run_nr
            ,p_previous_scn
            ,p_current_scn
        FROM dwh_runs
       WHERE run_status = 'R';
   EXCEPTION
      WHEN NO_DATA_FOUND
      THEN
         raise_application_error(-20000
                                ,'No active run');
   END;

   INSERT INTO dwh_run_steps(drn_run_nr
                            ,mapping_name
                            ,startdate)
        VALUES (p_run_nr
               ,p_mapping_name
               ,SYSDATE)
     RETURNING id
          INTO l_step_id;

   COMMIT;

   g_step_id   := l_step_id;
   RETURN l_step_id;
END;


Now all we have to do is add the code to the End Mapping Command and we will have a working mapping:
DECLARE
   p_step_id  NUMERIC := dwh_runs.g_step_id;
   p_message  VARCHAR2(1024);
BEGIN
   "DWH_RUNS_MGR"."END_RUN_STEP"(p_step_id
                                ,p_message);
END;


Don’t forget to set the Technology for the End Mapping Command.

2014-10-02 15_18_49-Physical - Properties - Editor


Wednesday, October 1, 2014

Migrating OWB to ODI: The PRE-MAPPING dilemma

Most of the Oracle Warehouse Builder projects I have worked on, have used the PRE- and POST-Mapping steps to do some sort of housekeeping or retrieve values from the database (like the SCN or a date) as a primitive version of Change Data Capture. When these mappings are migrated to ODI 12c the migration utility will drop the objects from the mapping and move the code to the Begin Mapping Command and End Mapping Command in the Physical Layer.
The migration will work unless we need to use the output values in the rest of the mapping. One of the solutions would be to split the mapping into a pre- and post-mapping ODI procedure, some ODI variables and combining them in an ODI package.
Let’s look at a simple example. The following mapping uses the classic EMP table as input to populate a staging table (EMP_STG). GET_MAPPING_NAME fetches the name of the mapping, but needs an input variable so constant DUMMY is defined but has no significant value. The name is passed to the PRE-MAPPING which calls a procedure START_RUN_STEP which in turn will return a range of SCN’s to use in the filter on the source table. ROWSCN_EXP adds the ROWSCN pseudo-column to the EMP table. The DEDUPLICATOR is needed to prevent an API8003: Connection target attribute group is already connected to an incompatible data source exception.
2014-10-01 10_36_28-Oracle Warehouse Builder _ _DEMO_DB11G_DEMO_PRE_MAP_DEMO
START_RUN_STEP looks as follows:
FUNCTION start_run_step(p_mapping_name      VARCHAR2
,p_run_nr OUT dwh_runs.run_nr%TYPE
,p_previous_scn OUT dwh_runs.current_scn%TYPE
,p_current_scn OUT dwh_runs.previous_scn%TYPE)
RETURN NUMBER
IS
PRAGMA AUTONOMOUS_TRANSACTION;
l_step_id dwh_run_steps.id%TYPE;
BEGIN
BEGIN
SELECT run_nr
,previous_scn
,current_scn
INTO p_run_nr
,p_previous_scn
,p_current_scn
FROM dwh_runs
WHERE run_status = 'R';
EXCEPTION
WHEN NO_DATA_FOUND
THEN
raise_application_error(-20000
,'No active run');
END;

INSERT INTO dwh_run_steps(drn_run_nr
,mapping_name
,startdate)
VALUES (p_run_nr
,p_mapping_name
,SYSDATE)
RETURNING id
INTO l_step_id;

COMMIT;
RETURN l_step_id;
END;

In a previous step procedure START_RUN has inserted some information into table DWH_RUNS. When we migrate this simple mapping to ODI it will look like this:

2014-10-01 13_28_25-PRE_MAP_DEMO - Editor

GET_MAPPING_NAME is now loose and the PRE-MAPPING is gone. The code has been moved to the physical layer:

2014-10-01 13_33_55-Edit Begin Mapping Command for _Physical_

If we leave it like this the mapping will fail. First because GET_MAPPING_NAME.MAPPING_NAME is not defined. We could try to replace this with <%=odiRef.getPop("POP_NAME")%> but that won’t work either. Secondly, even if it did work, we would still not be able to retrieve the output parameters for use in the mapping filter.




Solution 1


My first solution was to pull out the PRE-MAPPING code and move it to an ODI procedure and create ODI variables to store the input and output parameters. We can refresh the variables via an SQL statement, but wait we can’t retrieve output parameters of a procedure in a SQL statement. That can only be done with the return value of a function. To circumvent this issue I created a number of database packaged variables filled in the original PRE-MAPPING procedure and getter functions in the package. Now I can call the getter function in the refresh part of the ODI variable. Then I can use the ODI variables in the filter part of the mapping, after which I only have to create an ODI package to orchestrate it all.


Solution 2


After I stumbled onto the David Allen’s blog post about migrating OWB PLSQL Procedures I wondered if I could use the same technique for the PRE-MAPPING migration. The answer is: Yes. Although ODI cannot call a procedure in a mapping it can call a Table Function. The trick is to wrap the procedure in a pipelined function. the wrapped function will look like this:
CREATE OR REPLACE FUNCTION pl_start_run_step(p_mapping_name VARCHAR2)
RETURN tb_run_step_obj
PIPELINED
AS
out_pipe run_step_obj
:= run_step_obj(NULL
,NULL
,NULL
,NULL
,NULL);
BEGIN
out_pipe.p_mapping_name := p_mapping_name;

out_pipe.p_step_id :=
dwh_runs_mgr.start_run_step(p_mapping_name
,out_pipe.p_run_nr
,out_pipe.p_previous_scn
,out_pipe.p_current_scn);
PIPE ROW (out_pipe);
RETURN;
END;

To make it compile correctly we also need to create an object and a table of objects:
CREATE OR REPLACE TYPE run_step_obj AS OBJECT
(
p_mapping_name VARCHAR2(1024)
,p_run_nr NUMBER
,p_previous_scn NUMBER
,p_current_scn NUMBER
,p_step_id NUMBER
);
/

CREATE OR REPLACE TYPE tb_run_step_obj AS TABLE OF run_step_obj;
/

Now we can put it all together in a mapping, but since we are going to use this table function in many mappings it is best to put it in a reusable mapping:

2014-10-01 16_17_32-Oracle Data Integrator Studio 12c _ PRE-MAPPING

Assign '<%=odiRef.getPop("POP_NAME")%>'to mapping_name and pass it to the table function.

2014-10-01 16_20_51-EXPRESSION - Properties - Editor

Add the function to the Function Name property:

2014-10-01 16_19_55-pl_start_run_step - Properties - Editor

Now use the reusable mapping in the migrated mapping:

2014-10-01 16_29_56-PRE_MAP_DEMO - Editor

Don’t forget to clear the Begin Mapping Command. We also need to add a Cross Join between the PRE_MAPPING Table Function and the DEDUPLICATOR, this is only possible because we know for sure that the function will only return one row. Then adjust the filter to include the PRE_MAPPING parameters:

2014-10-01 16_33_16-FILTER - Properties - Editor

Now we have a working mapping which is a lot easier to understand for OWB developers than the ODI solution with variables, procedures and a package. Also everything is captured in one compact, reusable mapping.


Solution 3


A combination of solution 1 and 2.


But wait…


There are some potential drawbacks to this way of working.

The restart issue
As Tom Kyte pointed out it is possible for a statement to restart. This can happen when an update statement runs into a locked record. When this happens the updates so far will be rolled back and Oracle will try and lock the records before the statement is run again. In my case the table function I used contains an AUTONOMOUS TRANSACTION which will be executed twice when a restart occurs. Also anything that is non-transactional, like modifying a package global variable, will mess up your process

The DETERMINISTIC issue
You must make sure your function is DETERMINISTIC or you must guarantee the function is only called once. Because you can connect the table function to more than one object in your mapping, you must be aware of how ODI generates the underlying SQL code. If the function is called more than once in the generated code you must make sure the result is always the same for this session, in other words: your function must be DETERMINISTIC.
Update 15-10-2014

After doing some more tests I have come to the conclusion that, even though the solution works and performs well, it is not the preferred method of migrating an OWB  pre-mapping.