Tuesday, June 25, 2013

Incremental Loading in Informatica using Mapping Variable

Lets understand what is a Mapping Variable:

A mapping variable represents a value that can change through the session. The Integration Service saves the value of a mapping variable to the repository at the end of each successful session run and uses that value the next time when we run the session. Variable functions like SetMaxVariable, SetMinVariable, SetVariable, SetCountVariable are used in the mapping to change the value of the variable. At the beginning of a session, the Integration Service evaluates references to a variable to determine the start value. At the end of a successful session, the Integration Service saves the final value of the variable to the repository. The next time we run the session, the Integration Service evaluates references to the variable to the saved value. To override the saved value, define the start value of the variable in the parameter file.

Lets see the steps to create a Mapping Variable which can be used in incremental loading.

Step 1:

How to create mapping variable in informatica?


Step 2:

To define a mapping variable to be used for incremental loading (for this illustration we are in a assumption that your source system is a relational source (RDBMS))


Step 3:

In the Source Qualifier we need to Generate SQL and add a filter, WHERE condition to filter records that are greater than the $$LAST_RUN_DATE


Step 4: Part 1: 

To define the logic where the Informatica integration services saves the timestamp when the session is executed, for the next run, it will use that saved timestamp to read the records from the relational source incrementally. We need to create a expression transformation right after the Source Qualifier to set the variable.


Step 4: Part 2:

In the expression transformation create a OUT port and use the code as show below



Step 5:

After the session has executed we can see the saved timestamp in the workflow manager as below:-




Wednesday, June 19, 2013

Information OPB Metadata tables to track session performance

Informatica metadata repository stores and maintains information about all the objects in Informatica. They contain details of connection information, users, folders, mappings, sources, targets etc. These information can serve many purposes while accessed through external SQL query.

Scenerio 1:

Want to know the average time a workflow has taken to complete.

Code syantax :

SELECT workflow_name, server_name,TO_CHAR (start_time, 'MM/DD/YY HH24:MI:SS'),TO_CHAR (end_time, 'MM/DD/YY HH24:MI:SS'),(end_time - start_time) * 24 AS diff_time_hrs
FROM pm_repo_r.opb_wflow_run
WHERE workflow_name = 'wkfl_cm_job_dly_new' -- enter the name of the workflow ORDER BY start_time DESC;














Scenerio 2:

Want to know in a worflow the tasks that run and the time taken by each task to complete, if it is a session we can know the start time and the end time of the session, source rows read, target rows written, any failed rows, transformation errors and the time taken for each of them. This gives us a good insight on session performance. We have build a report on top of this data to give us a graphical represenatation of our production jobs.

We first need to find the folder we are intrested for which we would like to pull performance data.

select * from pm_repo_r.opb_subject folder; -- get the folder name from SUBJ_NAME col and SUBJ_ID which is the folder id which will be paramater for the next query








SELECT DISTINCT folder.subj_name "FLDR_NAME",workflow.workflow_name "WRFL_NM",timing.instance_name "SESS/WKLT_NAME",timing.start_time "START_TIME_S",timing.end_time "END_TIME_S", LOG.src_success_rows "SRC_ROWS",LOG.targ_success_rows "TGT_ROWS",LOG.targ_failed_rows "FAIL_TGT_ROWS",total_trans_errs "T/R_ERRORS",FLOOR ((timing.end_time - timing.start_time) * 60 * 60 * 24
) "DUR_SEC",FLOOR ((timing.end_time - timing.start_time) * 60 * 24
) "DUR_MIN"
FROM
pm_repo_r.opb_task_inst_run timing,
pm_repo_r.opb_sess_task_log LOG,
pm_repo_r.opb_wflow_run workflow,
pm_repo_r.opb_subject folder
WHERE folder.subj_id IN (132) -- --'folder ID'
AND timing.start_time >=TO_DATE ('06/15/13 16:30:00', 'MM/DD/YY HH24:MI:SS')
AND timing.end_time <=TO_DATE ('06/20/13 07:21:00', 'MM/DD/YY HH24:MI:SS')
AND UPPER (timing.instance_name) NOT LIKE '%START%'
AND timing.workflow_run_id = workflow.workflow_run_id(+)
AND timing.workflow_run_id = LOG.workflow_run_id(+)
AND timing.instance_id = LOG.instance_id(+)
AND workflow.subject_id = folder.subj_id(+)
AND workflow.workflow_name = '<WORKFLOWNAME>'--'workflowname
ORDER BY 4 DESC;

Tuesday, June 18, 2013

Generate sequence numbers in Informatica without using sequence generator transformation.




Mapping:

Create sequence numbers without using sequence generator transformation.

Business Requirement:

This is Informatica PowerCenter mapping example to generate sequence number without using sequence generator tranform, the logic is implemented using unconnected look-up tranformation and expression transformation.


Implementation Guidelines:








In the expression transformation create a variable port for impelemting the logic:-


iif
(isnull(v_EMP_KEY) or v_EMP_KEY = 0,iif
(isnull(:LKP.lkp_TGT_EMP_DIM_MAX_SK(-1)) ,0, :LKP.lkp_TGT_EMP_DIM_MAX_SK
(-1)),v_EMP_KEY) + 1



Link foward the EMP_KEY to the target instance as shown in figure 1.


Informatica Expression Code: Chr(10): Line Feed character  Chr(13): Carriage Return character.  Chr(12): Form Feed Expression Code to...