What is Pentaho Data Integration (Kettle)?

by Alooma Team
Updated Feb 10, 2018

Pentaho Data Integration (PDI), also known as Kettle, is part of the Pentaho open source business intelligence suite. The suite includes software for all aspects of supporting business decision making: data warehouse management utilities, data integration and analysis tools, software for managers, and data mining tools.

Pentaho Data Integration is well known for its ease of use and short learning curve. PDI takes a metadata-driven approach, which means that development is based on specifying what to do, not how to do it.
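
To make the "what, not how" idea concrete, here is a minimal Python sketch of a metadata-driven pipeline. It is an illustration of the concept only and is not PDI's file format or API: the step types, the field names, and the run function are all hypothetical. The transformation is plain data (the "what"), and a small generic engine supplies the "how".

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# The "how": generic step implementations, written once by the engine.
# (Hypothetical step types for illustration; these are not PDI steps.)
def filter_rows(rows: List[Row], field: str, equals: object) -> List[Row]:
    """Keep only the rows whose `field` equals the given value."""
    return [row for row in rows if row.get(field) == equals]

def rename_field(rows: List[Row], old: str, new: str) -> List[Row]:
    """Rename the key `old` to `new` in every row."""
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in rows]

STEP_TYPES: Dict[str, Callable[..., List[Row]]] = {
    "filter": filter_rows,
    "rename": rename_field,
}

# The "what": the transformation is described as metadata, not as code.
TRANSFORMATION = [
    {"type": "filter", "field": "country", "equals": "DE"},
    {"type": "rename", "old": "cust_name", "new": "customer"},
]

def run(metadata: List[dict], rows: List[Row]) -> List[Row]:
    """Apply each described step, in order, to the rows."""
    for step in metadata:
        options = {key: value for key, value in step.items() if key != "type"}
        rows = STEP_TYPES[step["type"]](rows, **options)
    return rows

if __name__ == "__main__":
    sample = [
        {"cust_name": "Ann", "country": "DE"},
        {"cust_name": "Bob", "country": "US"},
    ]
    print(run(TRANSFORMATION, sample))
```

Changing the pipeline means editing the TRANSFORMATION list rather than the engine. That is the same division of labor a metadata-driven tool aims for.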

Pentaho lets administrators and ETL developers build their own data manipulation jobs in a user-friendly graphical designer, without writing a single line of code.

PDI uses a common, shared repository which enables remote ETL execution, facilitates teamwork, and simplifies the development process.

PDI components

PDI provides several tools for developing and running ETL processes:

  • Spoon - a data modeling and development tool for ETL developers. Use it to create transformations (elementary data flows) and jobs (execution sequences of transformations and other jobs)
  • Pan - executes transformations modeled in Spoon
  • Kitchen - executes jobs designed in Spoon
  • Carte - a simple web server used for running and monitoring data integration tasks
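
As a rough sketch of how the command-line runners fit together, the Python helper below picks Pan or Kitchen from the file extension (.ktr for transformations, .kjb for jobs) and builds the argument list. It assumes that pan.sh and kitchen.sh are on the PATH. The helper name build_pdi_command and the example file paths are hypothetical, and the sketch only constructs the command without running it.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional

# Spoon saves transformations as .ktr files and jobs as .kjb files.
# Pan runs transformations; Kitchen runs jobs.
RUNNERS = {".ktr": "pan.sh", ".kjb": "kitchen.sh"}

def build_pdi_command(path: str,
                      params: Optional[Dict[str, str]] = None,
                      level: str = "Basic") -> List[str]:
    """Build the argument list for running a PDI file from the command line.

    Hypothetical helper: assumes pan.sh and kitchen.sh are on the PATH.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in RUNNERS:
        raise ValueError(f"expected a .ktr or .kjb file, got: {path}")
    command = [RUNNERS[suffix], f"-file={path}", f"-level={level}"]
    # Named parameters are passed as -param:NAME=value.
    for name, value in (params or {}).items():
        command.append(f"-param:{name}={value}")
    return command

if __name__ == "__main__":
    # Example paths are placeholders, not files shipped with PDI.
    print(build_pdi_command("/etl/load_customers.ktr"))
    print(build_pdi_command("/etl/nightly.kjb", {"RUN_DATE": "2018-02-10"}))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where PDI is installed. Keeping the command as a list rather than a single string avoids shell quoting problems with paths and parameter values.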

Pentaho enterprise edition

The enterprise edition of PDI adds some extra components that extend the capabilities of the Pentaho platform, including support and performance monitoring.
