August 31, 2013

AutoMate 9 vs. Windows Task Scheduler 2.0 (#021)


Source:
http://www.networkautomation.com/documents/507363d10f838452330925.pdf

Free to try (30-day trial); $1,495.00 to buy

Price Source:
http://download.cnet.com/AutoMate/3000-2084_4-10000220.html


Getting Started With AutoMate 9

Pentaho Data Integration - Fundamental Tutorial (Video) (#020)


Fundamental Tutorial 


About This Video

Comprehensive Pentaho Data Integration Tutorial
Creating a Job and Transformation
Creating a simple Multi-Dimensional model
Logging and Performance Metrics
Scheduling and Running Remotely with Carte
Using with the BI Platform and an Action Sequence
Using the PDI Console
Notification

June 2, 2013

Pentaho Data Integration - Improving Performance - Increase the Java Memory (#019)

Pentaho runs inside a Java Virtual Machine, and hence is bound by the properties of that VM.
These optimisations can apply to just about any Java application, including the Pentaho BI Server and GUI tools.

Method to increase memory allocation:

Open the file spoon.sh or Spoon.bat in a text editor. Look for a section that looks like this:

# ******************************************************************
# ** Set java runtime options                                     **
# ** Change 256m to higher values in case you run out of memory.  **
# ******************************************************************


OPT="-Xmx256m -cp $CLASSPATH -Djava.library.path=$LIBPATH -DKETTLE_HOME=$KETTLE_HOME -DKETTLE_REPOSITORY=$KETTLE_REPOSITORY -DKETTLE_USER=$KETTLE_USER -DKETTLE_PASSWORD=$KETTLE_PASSWORD -DKETTLE_PLUGIN_PACKAGES=$KETTLE_PLUGIN_PACKAGES"

Change the -Xmx parameter to alter the maximum heap size, e.g. -Xmx1024m for a 1 GB heap.
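For example, after raising the maximum heap to 1 GB the line in spoon.sh would read:

```
OPT="-Xmx1024m -cp $CLASSPATH -Djava.library.path=$LIBPATH -DKETTLE_HOME=$KETTLE_HOME -DKETTLE_REPOSITORY=$KETTLE_REPOSITORY -DKETTLE_USER=$KETTLE_USER -DKETTLE_PASSWORD=$KETTLE_PASSWORD -DKETTLE_PLUGIN_PACKAGES=$KETTLE_PLUGIN_PACKAGES"
```

Make sure the machine actually has that much free physical memory, otherwise the JVM may fail to start.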

Source: http://djugal.blogspot.ca/2011/07/increase-java-memory-for-pentaho-data.html
Posted by Jugal Dhrangadharia

Pentaho Data Integration - Run from the Windows cmd (#018)

Run a transformation from the Windows command line (cmd) or a Unix shell
Run a job from the Windows command line (cmd) or a Unix shell

Running a transformation

To run a transformation from the command line you need to use the "Pan" batch file. The script is located in the main Pentaho folder.


Create the "execute_from_cmd" folder in the "Repository explorer."


Create a dummy transformation.






Create a dummy job.





Run the job to test it.



The information needed to run the transformation can be found in the "Repository Connection" window.


To get the details edit the repository.


Information on how to run a Pentaho process can be found on the Pentaho Wiki pages:
http://wiki.pentaho.com/display/EAI/Pan+User+Documentation


Run the dummy transformation:

Pan.bat /rep:penrep_id /user:admin /pass:admin /dir:/execute_from_cmd /trans:tr_dummy /level:Detailed

/rep:   - the repository name
/user:  - the repository user name
/pass:  - the repository password
/dir:   - the repository directory
/trans: - the transformation to run
/level: - the logging level



To write the results to a log file use the "/log" option:

Pan.bat /rep:penrep_id /user:admin /pass:admin /dir:/execute_from_cmd /trans:tr_dummy /level:Detailed /log:C:\a_example_log_file.log

/rep:   - the repository name
/user:  - the repository user name
/pass:  - the repository password
/dir:   - the repository directory
/trans: - the transformation to run
/level: - the logging level
/log:   - the log file
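When the same transformation has to be run regularly, it helps to keep the repository details in shell variables. A minimal sketch for the Unix side, assuming the repository values from the example above (pan.sh is the counterpart of Pan.bat; on Linux the options are usually written with a leading dash and an equals sign, e.g. -rep=):

```shell
#!/bin/sh
# Keep the repository details in one place; the values below match
# the example above and should be adjusted for your own repository.
REPO=penrep_id
REPO_USER=admin
REPO_PASS=admin
DIR=/execute_from_cmd
TRANS=tr_dummy
LEVEL=Detailed
LOG=/tmp/${TRANS}.log

# Assemble the Pan command line; run it with: eval "$CMD"
CMD="./pan.sh -rep=$REPO -user=$REPO_USER -pass=$REPO_PASS -dir=$DIR -trans=$TRANS -level=$LEVEL -log=$LOG"
echo "$CMD"
```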




Running a job

To run a job from the command line you need to use the "Kitchen" batch file. The script is located in the main Pentaho folder.



Information on how to run a Pentaho job can be found on the Pentaho Wiki pages or on the Infocenter pages:
http://infocenter.pentaho.com/help/index.jsp?topic=%2Fpdi_user_guide%2Freference_kitchen.html


Run the dummy job:

Kitchen.bat /rep:penrep_id /user:admin /pass:admin /dir:/execute_from_cmd /job:jb_dummy /level:Basic



This time, to write the results to a log file, redirect the output:

Kitchen.bat /rep:penrep_id /user:admin /pass:admin /dir:/execute_from_cmd /job:jb_dummy /level:Basic > C:\a_example_log_file.log
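When the job is scheduled, Kitchen's exit code can be used to detect failures: Kitchen returns 0 on success and a nonzero code on errors. A sketch for the Unix side, with one assumption made so it runs standalone: KITCHEN defaults to the no-op `true` command here and should point at your real kitchen.sh in practice.

```shell
#!/bin/sh
# Run a job via Kitchen, capture its output, and branch on the exit code.
# KITCHEN defaults to "true" only so this sketch is self-contained.
KITCHEN=${KITCHEN:-true}
LOG=/tmp/jb_dummy.log

$KITCHEN -rep=penrep_id -user=admin -pass=admin \
         -dir=/execute_from_cmd -job=jb_dummy -level=Basic > "$LOG" 2>&1
STATUS=$?

if [ "$STATUS" -eq 0 ]; then
  echo "Job finished OK, log written to $LOG"
else
  echo "Job failed with exit code $STATUS, see $LOG" >&2
fi
```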




May 13, 2013

Pentaho Data Integration - Connection - Oracle 11g R2 RAC (#017)

To run Pentaho you need to install Java first. The required Java version can be found in the launch batch or shell script; both files, Spoon.bat and spoon.sh, are located in the main Pentaho folder.
The next step is to select a JDBC driver that matches your Oracle and Java versions.
 
 
 
JDBC drivers can be found on Oracle's website:
 
 
 
 
Download the driver, e.g. ojdbc6.jar, and place it in the following directory:
\data-integration\libext\JDBC
 
Naming convention:
o - Oracle
jdbc - Java database connectivity
6 - Java 6 (1.6)
 
 
 
Information about the connection string can be found on the Pentaho wiki website:
 
 
 
Use one of the examples or copy a section from the tnsnames.ora file:
 
 
 
Place the connection string in the "Database Name" field; the "Port Number" can be omitted or set to "-1".
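For reference, a RAC entry copied from tnsnames.ora typically has this shape (the host names and service name below are placeholders, not values from this post):

```
(DESCRIPTION=
  (ADDRESS_LIST=
    (LOAD_BALANCE=ON)
    (FAILOVER=ON)
    (ADDRESS=(PROTOCOL=TCP)(HOST=rac-node1)(PORT=1521))
    (ADDRESS=(PROTOCOL=TCP)(HOST=rac-node2)(PORT=1521)))
  (CONNECT_DATA=(SERVICE_NAME=orcl)))
```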
 
 

Press the "Test" button.
 

Pentaho Data Integration - Connection - SQL Server 2012 Cluster (#016)

To run Pentaho you need to install Java first. The required Java version can be found in the launch batch or shell script; both files, Spoon.bat and spoon.sh, are located in the main Pentaho folder.
The next step is to select a JDBC driver that matches your SQL Server and Java versions.
 

 
Download the native JDBC driver for SQL Server 2012 from Microsoft's website:




Download the installation file, e.g. sqljdbc_4.0.2206.100_enu.exe, unpack it, and place the driver in the following directory:
\data-integration\libext\JDBC
 
Naming convention:
sql - SQL Server
jdbc - Java database connectivity
4.0 - driver version 4.0 (sqljdbc.jar targets Java 5 / 1.5, sqljdbc4.jar targets Java 6 / 1.6)
 
Information about the connection string can be found on the MSDN pages:
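As a reference, the general form of the Microsoft JDBC connection URL looks like this (server name, port and database below are placeholders):

```
jdbc:sqlserver://myclustername:1433;databaseName=mydb
```

For a clustered instance, use the SQL Server virtual network name of the cluster as the server name.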



The "Database Connection" window should look like this:


 
The "Feature List" window will look like this:

 
 
Run the test.

 

May 7, 2013

Pentaho Data Integration - Align / Snap to Grid (#015)

There are two ways to organize a transformation’s steps on the canvas:
- align them with the keyboard shortcut keys,
- use the “grid” and “snap to grid” functionality.
 
Shortcut keys
 
 
Section: Keyboard Shortcuts
 
Snap to grid
 
 
Section: Snap to grid

Pentaho Data Integration - Improving Performance - Table Input and Oracle (#014)

When using the Pentaho Data Integration Table Input step to connect to Oracle via a JDBC connection, there is a connection setting that can dramatically improve performance when retrieving data: defaultRowPrefetch. Oracle JDBC drivers allow you to set the number of rows to prefetch from the server while the result set is being populated during a query. Prefetching rows into the client reduces the number of round trips to the server. The default value for this property is 10.
 
In the Table Input step, edit your connection, click on the "Options" tab and then enter your defaultRowPrefetch setting:
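On the "Options" tab the setting is entered as a plain parameter/value pair; the 200 below is only an illustrative value and should be tuned for your row width and available memory:

```
Parameter            Value
defaultRowPrefetch   200
```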
 
 
 
Posted by Wayne Johnson (Senior Sales Engineer at Pentaho)

March 1, 2013

Pentaho Data Integration - Create Tables' Structures and Copy Data (#013)


Select the "File", "New", "Transformation" option. When the transformation is open, click "Tools", "Wizard" and "Copy tables".


When the wizard is open select a source and target database connection to use.


Choose tables that need to be created and click "Next".


Choose a proper job name, point the directory to the "create_tables_structures" folder and click "Finish".


The job is created.



After the job is run, the tables will be created and the data copied to the new schema.