February 22, 2013

Pentaho Data Integration - Execute for Each Line (#011)



Create a new "each_line" folder in the repository.


Create two transformations:
- tr_el_list
- tr_el_query

Create one job:
- jb_each_line





Open "jb_each_line" and add four steps:
- START
- Two Transformations
- Success


Adjust the "jb_each_line" job in following way:




Open the "tr_el_list" transformation. Add two steps:
- Data Grid
- Copy rows to result



Open "Data Grid" and add fields to the "Meta" tab:


Add data to the "Data" tab:


Open the "tr_el_query" transformation. Add following steps:
- Table input
- Table output


Open "Table input", add SELECT statement


SQL:
SELECT
  EMPLOYEE_ID
, FIRST_NAME
, LAST_NAME
, EMAIL
, PHONE_NUMBER
, HIRE_DATE
, JOB_ID
, SALARY
, COMMISSION_PCT
, MANAGER_ID
, DEPARTMENT_ID
, 'a_table_name' AS tgt_table_name
FROM EMPLOYEES

Open and adjust the "Table output" step. Check the fields in the "Main options" tab:
- "Specify database fields"
- "Is the name of the table defined in a field?"
- "Field that contains name of table": "TGT_TABLE_NAME"


Click "Get fields" on the "Database fields" tab and remove last line from the "Data Grid".



Reopen "Table input" and adjust SELECT statement in the following way:


SQL:
SELECT
  EMPLOYEE_ID
, FIRST_NAME
, LAST_NAME
, EMAIL
, PHONE_NUMBER
, HIRE_DATE
, JOB_ID
, SALARY
, COMMISSION_PCT
, MANAGER_ID
, DEPARTMENT_ID
, '${tgt_table_name_dg}' AS tgt_table_name
FROM ${src_table_name_dg}
WHERE HIRE_DATE >= TO_DATE('${start_date_dg}', 'YYYY-MM-DD')
AND HIRE_DATE < TO_DATE('${end_date_dg}', 'YYYY-MM-DD')

Go to the "jb_each_line" and double-click the "tr_el_query" transformation. Add four parameters from the "tr_el_list" transformation.


Open the "tr_el_query" transformation. Right click anywhere in the workspace/canvas and select "Transformation settings"


Go to the "Parameters" tab and add parameters' names like in the previous step.


Go to "Advanced" tab and check "Execute for every input row?" option.


Create employees tables in the "dwh" schema:




SQL:
CREATE TABLE dwh.employees_2001 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2002 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2003 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2004 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2005 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2006 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2007 AS SELECT * FROM hr.employees WHERE 1 = 2
/
CREATE TABLE dwh.employees_2008 AS SELECT * FROM hr.employees WHERE 1 = 2
/

Run the "jb_each_line" job and check results.



SQL:
SELECT 'SELECT ''' || TABLE_NAME || ''' AS a_table, COUNT(1) AS cnt FROM ' || TABLE_NAME || ' UNION ALL'
FROM USER_TABLES
WHERE TABLE_NAME LIKE 'EMPLOYEES_200_'
ORDER BY TABLE_NAME;

No comments:

Post a Comment