May 13, 2013

Pentaho Data Integration - Connection - Oracle 11g R2 RAC (#017)

To run Pentaho, you need to install Java first. The required Java version can be found in the startup batch or shell script. Both files, Spoon.bat and spoon.sh, are located in the main Pentaho folder.
The next step is to select a JDBC driver that matches both the Oracle version and the Java version.
 
 
 
JDBC drivers can be found on Oracle's website:
 
 
 
 
Download the driver, e.g. ojdbc6.jar, and place it in the following directory:
\data-integration\libext\JDBC
 
Naming convention:
o - Oracle
jdbc - Java database connectivity
6 - Java 6 (1.6)
 
 
 
Information about the connection string can be found on the Pentaho wiki:
 
 
 
Use one of the examples or copy a section from the tnsnames.ora file:
 
 
 
Place the connection string in the "Database Name" field. The "Port Number" field can be left empty or set to "-1".
 
 

Press the "Test" button.
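As an illustration of what such a connection string looks like, here is a minimal Java sketch that assembles an Oracle thin-driver URL from a TNS-style descriptor. The class name, host names, port and service name are all hypothetical placeholders, not values from this post; the descriptor layout follows the usual tnsnames.ora form, and your RAC setup may need different options.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds an Oracle thin-driver JDBC URL for several RAC nodes.
class OracleRacUrlSketch {

    static final String THIN_PREFIX = "jdbc:oracle:thin:@";

    // Hosts, port and service name are placeholders supplied by the caller.
    static String buildUrl(List<String> hosts, int port, String serviceName) {
        StringBuilder addresses = new StringBuilder();
        for (String host : hosts) {
            // One ADDRESS entry per RAC node (or per SCAN address).
            addresses.append("(ADDRESS=(PROTOCOL=TCP)(HOST=")
                     .append(host)
                     .append(")(PORT=")
                     .append(port)
                     .append("))");
        }
        return THIN_PREFIX + "(DESCRIPTION="
                + "(ADDRESS_LIST=(LOAD_BALANCE=ON)(FAILOVER=ON)" + addresses + ")"
                + "(CONNECT_DATA=(SERVICE_NAME=" + serviceName + ")))";
    }

    public static void main(String[] args) {
        String url = buildUrl(
                Arrays.asList("rac-node1.example.com", "rac-node2.example.com"),
                1521,
                "ORCL");
        // Full URL, as a plain Java program would pass it to DriverManager.getConnection.
        System.out.println(url);
        // The descriptor alone, i.e. the part copied from tnsnames.ora.
        System.out.println(url.substring(THIN_PREFIX.length()));
    }
}
```

The second line printed is the descriptor on its own, which is the kind of text that goes into the "Database Name" field.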
 

Pentaho Data Integration - Connection - SQL Server 2012 Cluster (#016)

To run Pentaho, you need to install Java first. The required Java version can be found in the startup batch or shell script. Both files, Spoon.bat and spoon.sh, are located in the main Pentaho folder.
The next step is to select a JDBC driver that matches both the SQL Server version and the Java version.
 

 
Download the native JDBC driver for SQL Server 2012 from Microsoft's website:




Download the installation file, e.g. sqljdbc_4.0.2206.100_enu.exe, unpack it, and place the driver in the following directory:
\data-integration\libext\JDBC
 
Naming convention:
sql - SQL Server
jdbc - Java database connectivity
4.0 - driver version 4.0 (a Type 4 JDBC driver; the package contains sqljdbc.jar for Java 5 and sqljdbc4.jar for Java 6 and later)
 
Information about the connection string can be found on MSDN:



The "Database Connection" window should look like this:


 
The "Feature List" window will look like this:

 
 
Run the test.
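For reference, the Microsoft driver uses URLs of the form jdbc:sqlserver://host:port;property=value. The Java sketch below assembles one; the class name, host name and database name are hypothetical placeholders, and for a cluster the host would typically be the cluster's virtual network name. It only builds the string and does not reproduce the exact settings from the windows above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a SQL Server JDBC URL from a host, a port and driver properties.
class SqlServerUrlSketch {

    static String buildUrl(String host, int port, Map<String, String> properties) {
        StringBuilder url = new StringBuilder("jdbc:sqlserver://")
                .append(host)
                .append(":")
                .append(port);
        // Connection properties are appended as ;name=value pairs.
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            url.append(";").append(entry.getKey()).append("=").append(entry.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        // Placeholder database name; "databaseName" is a standard driver property.
        properties.put("databaseName", "SalesDW");
        String url = buildUrl("sqlcluster.example.com", 1433, properties);
        System.out.println(url);
    }
}
```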

 

May 7, 2013

Pentaho Data Integration - Align / Snap to Grid (#015)

There are two ways to organize a transformation’s steps on the canvas:
- align them with keyboard shortcuts,
- use the “grid” and “snap to grid” functionality.
 
Shortcut keys
 
 
Section: Keyboard Shortcuts
 
Snap to grid
 
 
Section: Snap to grid

Pentaho Data Integration - Improving Performance - Table Input and Oracle (#014)

When the Pentaho Data Integration Table Input step connects to Oracle via JDBC, one connection setting can dramatically improve data retrieval performance: the defaultRowPrefetch property. Oracle JDBC drivers allow you to set the number of rows to prefetch from the server while the result set is being populated during a query. Prefetching row data into the client reduces the number of round trips to the server. The default value for this property is 10.
 
In the Table Input step, edit your connection, click the Options tab, and enter your defaultRowPrefetch value:
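To get a feel for the effect, here is a small Java sketch of the same idea outside Spoon. It builds the connection properties a plain JDBC program would pass to the Oracle driver and gives a rough estimate of fetch round trips for a given prefetch size. The class name, credentials, row count and the value 500 are illustrative assumptions, and the estimate ignores any extra round trips the driver makes.

```java
import java.util.Properties;

// Hypothetical sketch: defaultRowPrefetch as a JDBC connection property,
// plus a back-of-the-envelope estimate of fetch round trips.
class RowPrefetchSketch {

    // Builds the Properties object a program could pass to
    // DriverManager.getConnection(url, props) with the Oracle driver.
    static Properties connectionProperties(String user, String password, int prefetchRows) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("defaultRowPrefetch", Integer.toString(prefetchRows));
        return props;
    }

    // Rough estimate: one round trip per batch of prefetched rows (rounded up).
    static long roundTrips(long rows, int prefetchRows) {
        return (rows + prefetchRows - 1) / prefetchRows;
    }

    public static void main(String[] args) {
        Properties props = connectionProperties("etl_user", "secret", 500);
        System.out.println(props.getProperty("defaultRowPrefetch")); // prints 500

        long rows = 1_000_000L; // assumed result set size
        System.out.println(roundTrips(rows, 10));  // prints 100000 (driver default of 10)
        System.out.println(roundTrips(rows, 500)); // prints 2000
    }
}
```

Larger prefetch values use more client memory per fetch, so the best value depends on row width and available memory.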
 
 
 
Posted by Wayne Johnson (Senior Sales Engineer at Pentaho)