January 23, 2013

Pentaho Data Integration - How To Use a Job (#003)


In the post "Pentaho Data Integration - How To Use a Transformation" a transformation was made to create and load the "countries" table in a data warehouse schema (dwh).
Repeat the formula for each table in the "hr" schema.


Add one option. In each target (tgt_table_name) definiton select "Truncate table". Before each load a table will be truncated.


SQL:
SELECT
table_name,
CASE WHEN table_name <> 'COUNTRIES' THEN 'to do' ELSE NULL END AS create_tr
FROM user_tables
ORDER BY table_name;


After creating six additional transformations. The repository should contain the following objects:


Now let's create a new job which will include all previously created transformations.


Save it as "jb_hr_initial_load".

Jobs' naming convention:
jb_[project_name]_[task]



From the "General" Folder on the "Design" tab select, drag and drop the following steps:
- "START"
- "Transformation" times seven
- "Success"
Save the progress.


Double click the "Transformation" step.


Select "Specify by name and directory".
Click a transformation sign and select the first transformation "tr_hr_countries_inital_load"
Save the progress.




Now the job has the following structure:


Repeat the previous step for each transformation in the "hr" folder.


Run the job and review the results.




SQL:
SET LIN 150

SELECT 'SELECT ''' || table_name || ''' AS table_name, COUNT(1) AS r_count FROM ' || table_name || ' UNION ALL' AS q_in_q
FROM user_tables
ORDER BY table_name;

No comments:

Post a Comment