Workshop - Read DB Table
Database tables serve as the foundational data source for most enterprise data integration workflows—from transactional systems and operational databases to data warehouses and analytical platforms. Organizations rely on efficient data extraction from relational databases to power their reporting, analytics, and downstream data processing pipelines. Understanding how to read data from database tables using SQL queries is fundamental for building data integration solutions that transform raw database records into actionable business insights.
In this hands-on workshop, you'll learn to use PDI's "Table Input" step to extract data from relational database tables using SQL queries. The warehouse manager at Steel Wheels requires a report highlighting the status of shipped orders, and you'll build a complete transformation that retrieves order data, performs calculations, categorizes results into ranges, and sorts the output for analysis. You'll discover how to write and modify SQL statements, leverage automatic SQL generation, preview query results, and chain multiple transformation steps together to create meaningful business reports from raw database records.
What You'll Accomplish:
Configure the Table Input step to connect to database tables
Write SQL queries to filter and retrieve specific data subsets
Use the "Get SQL select statement" feature to automatically generate SQL
Preview query results before executing the full transformation
Modify SQL statements with WHERE clauses to filter data by status
Integrate Calculator steps to perform computations on database fields
Apply Number Range steps to categorize numeric values into descriptive ranges
Implement Sort Rows steps to organize data for reporting
Build a multi-step transformation that processes database records end-to-end
Understand SQL query optimization and parameterization techniques
By the end of this workshop, you'll have practical experience extracting data from database tables and processing it through a complete transformation pipeline. You'll understand how the Table Input step serves as the entry point for database-driven transformations and how to combine it with other steps to create sophisticated data processing workflows. Rather than exporting data manually or writing standalone SQL scripts, you'll build native PDI solutions that seamlessly integrate database queries with transformation logic, enabling repeatable, automated data processing that delivers consistent business value.
Prerequisites: Understanding of basic transformation concepts and database connection configuration; familiarity with SQL SELECT statements and WHERE clauses; Pentaho Data Integration installed and configured, with the required database connections established
Estimated Time: 20 minutes

Create a new Transformation
Either of these actions opens a new Transformation tab for you to begin designing your transformation.
By clicking File > New > Transformation
By using the Ctrl+N hotkey
Table input
This step is used to read information from a database, using a connection and SQL. Basic SQL statements can be generated automatically by clicking Get SQL select statement.
In this transformation, it connects to the ORDERS table and extracts the rows where the order status is ‘Shipped’.
Start Pentaho Data Integration.
Windows - PowerShell
Linux
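If Spoon is not already running, it can be launched from the data-integration folder of the PDI installation; the exact path is an assumption here and depends on where PDI was unpacked:

    # Windows - PowerShell, from the data-integration folder
    .\Spoon.bat

    # Linux, from the data-integration folder
    ./spoon.sh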
Drag the Table Input step onto the canvas.
Open the Table Input properties dialog box. Ensure the following details are configured, as outlined below:
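A query consistent with this step is sketched below, assuming the Steel Wheels sample database and its standard ORDERS columns:

    -- Columns assume the Steel Wheels sample ORDERS table
    SELECT
        ORDERNUMBER,
        ORDERDATE,
        REQUIREDDATE,
        SHIPPEDDATE,
        STATUS
    FROM ORDERS
    WHERE STATUS = 'Shipped'

Clicking Get SQL select statement generates the basic SELECT for the chosen table; the WHERE clause is then added by hand. If the Replace variables in script? option is enabled, the literal 'Shipped' could instead come from a variable such as ${ORDER_STATUS} (a hypothetical variable name), making the query reusable across environments.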

Preview the results and click OK.
Calculator
The Calculator step provides predefined functions that can be executed on input field values.
The Calculator executes far faster than equivalent logic written in custom scripts (such as the JavaScript step).
Besides the arguments (Field A, Field B and Field C) you must also specify the return type of the function. You can also choose to remove the field from the result (output) after all values are calculated; this is useful for removing temporary values.
Drag the Calculator step onto the canvas.
Open the Calculator properties dialog box.
Ensure the following details are configured, as outlined below:
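A configuration consistent with the calculation described below is sketched here; the field order and value type are assumptions:

    New field   : order_time
    Calculation : Date A - Date B (in days)
    Field A     : SHIPPEDDATE
    Field B     : REQUIREDDATE
    Value type  : Integer

With this setup, a positive order_time means the order shipped after it was required (late), while a negative value means it shipped early.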

Click OK.
Calculates order_time, the number of days between when the order was required (REQUIREDDATE) and when it was shipped (SHIPPEDDATE).
Number range
The Number Range step in Pentaho Data Integration (PDI) maps numeric values to descriptive categories or ranges. This step takes numbers from an input field and assigns text descriptions based on which predefined range they fall into.
To use this step, you add it to your transformation, specify the input numeric field and output text field, then define your ranges with lower and upper bounds along with the corresponding description for each range. For example, you might categorize ages into groups like "Under 18," "18-24," "25-34," and so on.
This functionality is particularly valuable for data preparation, reporting, and visualization as it converts continuous numerical data into discrete, meaningful categories that are easier to analyze and understand. Common applications include creating age brackets, income ranges, temperature classifications, or performance tiers.
Drag the Number range step onto the canvas.
Open the Number ranges properties dialog box.
Ensure the following details are configured, as outlined below:

Sets the value of the output field ‘order_status’ based on ‘order_time’:
On Time: if the order shipped 2 or more days before the REQUIREDDATE.
10%: if the order shipped more than 3 days after the REQUIREDDATE.
20%: if the order shipped more than 4 days after the REQUIREDDATE.
Action: if the order shipped more than 5 days after the REQUIREDDATE.
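Taken together, these rules suggest a Number ranges grid along the following lines. This is a sketch; the exact bounds, and how an order_time between -2 and 3 days is treated, are assumptions:

    Input field  : order_time
    Output field : order_status

    Lower bound   Upper bound   Value
                  -2            On Time
    3             4             10%
    4             5             20%
    5                           Action

Rows whose order_time falls outside every range receive the step's default value.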
Sort rows
The Sort rows step sorts rows based on the fields you specify and on whether they should be sorted in ascending or descending order.
Kettle must sort rows using temporary files when the number of rows exceeds the specified sort size (one million rows by default). If you get an out of memory exception (OOME), lower the sort size or increase the available memory.
When you use multiple copies of the step in parallel (on the local JVM with "Change number of copies to start", or in a clustered environment using Carte), the sorted blocks need to be merged together to preserve the proper sort sequence. This can be done by adding a Sorted Merge step afterwards (on the local JVM without multiple copies to start, or on the cluster's master).
Drag the Sort Rows step onto the canvas.
Open the Sort Rows properties dialog box.
Ensure the following details are configured, as outlined below:
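The sort fields here are an assumption; one plausible choice for this report is to sort by the computed delay so that the latest-shipping orders group together:

    Fieldname : order_time    Ascending : N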

Click OK.
Before you perform any stream operations (such as merges or lookups), it's best practice to sort the rows.
Select values
The Select Values step is useful for selecting, removing, renaming, changing data types and configuring the length and precision of the fields on the stream. These operations are organized into different categories:
Select and Alter
Specify the exact order and name in which the fields should be placed in the output rows.
Remove
Specify the fields that should be removed from the output rows.
Meta-data
Change the name, type, length and precision (the metadata) of one or more fields.
Drag the Select values step onto the canvas.
Open the Select values properties dialog box. Ensure the following details are configured, as outlined below:
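On the Meta-data tab, a configuration consistent with the note below would look like this; the date mask is an assumption:

    Fieldname : REQUIREDDATE    Type : Date    Format : yyyy-MM-dd
    Fieldname : SHIPPEDDATE     Type : Date    Format : yyyy-MM-dd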

Click OK.
Formats the REQUIREDDATE and SHIPPEDDATE fields.
Run
This transformation introduces several new steps that can help manipulate the data with predefined functions.
Click the Run button in the Canvas Toolbar.
Click the Preview data tab for the Select values step.
