January 27, 2013
Pentaho Data Integration – Variables (#006)
Pentaho Data Integration – How to Use a “Set Variables” step.
Pentaho Data Integration – How to Replace Variables in a SQL statement
Explore the repository and create a new folder "employees_incremental_load"
Inside "employees_incremental_load" create a "eil_transformations" subfolder.
Create three new objects:
- job: "jb_employees_incremental_load"
- transformation "tr_eil_dates"
- transformation "tr_eil_new_employees_load"
The main job naming convention:
jb_[folder_name], i.e.: "jb_employees_incremental_load"
The job will be saved in the main folder: "employees_incremental_load"
The transformations will be saved in the subfolder "eil_transformations"
The ETL process will work as follows:
The main job will include two transformations. First transformation will pass a date value (by "Set Variables" step) to the second transformation. The second transformation will receive the data value and pass it as a parameter to the SELECT statement.
The "tr_eil_dates" transformation
Add two steps to the workspace area:
- From the "Input" folder "Table input"
- From the "Job" folder "Set Variables"
Adjust the "Input table" step:
SQL:
SELECT '2007-01-01' AS date_value FROM DUAL
Edit the "Set Variables" step" and click "Get Fields"
The transformation structure:
The "tr_eil_new_employees_load" transformation:
Add two steps to the workspace area:
- From the "Input" folder "Table input"
- From the "Output" folder "Table output"
Adjust the "Table input" step:
Remember to check the "Replace variables in script" option.
SQL:
SELECT
EMPLOYEE_ID
, FIRST_NAME
, LAST_NAME
, PHONE_NUMBER
, HIRE_DATE
, JOB_ID
, SALARY
, COMMISSION_PCT
, MANAGER_ID
, DEPARTMENT_ID
FROM EMPLOYEES
WHERE HIRE_DATE < TO_DATE('${date_value_in}', 'YYYY-MM-DD')
Adjust the "Table output" step:
Click the "SQL" button to create the "new_employees" table in the "dwh" schema
Variables in the script are causing troubles.
Comment it out in order to create a table. After the table is created return to the previous form.
Right-click anywhere in the workspace/canvas and select "Transformation settings"
In the "Transformation properties" window select the "Parameters" tab and add a "date_value_in" parameter without a default value.
The transformation structure:
The "jb_employees_incremental_load" job:
From the "General" folder add three steps to the workspace area:
- "START" and two "Transformations"
Adjust the "Transformation" in the following way:
Adjust the "Transformation 2" in the following way:
The job has the following structure:
Run and review the job’s results.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment