Text File Output
Output text files.
Workshop - Text File Output
While importing data is essential, data integration professionals also need to export processed data in formats suitable for downstream systems, reporting tools, or business stakeholders. Text files remain one of the most universal and widely used output formats, whether you are generating EDI transactions, creating CSV files for spreadsheet analysis, or producing custom-formatted reports for business users.
In this hands-on workshop, you'll build a transformation that generates personalized customer surveys for Steel Wheels. This practical scenario demonstrates how to dynamically create text-based output by combining multiple data sources, performing string manipulation with Java expressions, and formatting content for human readability. You'll work with several data streams, merge them using append operations, and write a professionally formatted text file - all parameterized to accept customer names as runtime arguments.
What You'll Accomplish:
• Use Get System Info to capture command-line arguments and system information
• Apply User Defined Java Expressions for dynamic string concatenation and formatting
• Create static reference data using the Data Grid step
• Combine multiple data streams with the Append Streams step
• Read template content from text files using internal Kettle variables
• Configure the Text File Output step with proper delimiters and formatting options
• Generate dynamic, parameterized output files with customer-specific content
• Pass runtime arguments to transformations for flexible execution
By the end of this workshop, you'll understand how to orchestrate multiple data sources into cohesive output files. You'll have practical experience with parameterization, stream merging, and Java expressions - techniques that enable you to create sophisticated, production-ready file exports. Rather than manually creating individual files or relying on external scripts, you'll build automated solutions that generate customized output at scale.
Prerequisites: Understanding of basic transformation concepts, experience with Text File Input; Pentaho Data Integration installed and configured
Estimated Time: 35 minutes

Get System Info
The Get System Info step retrieves information from the Kettle environment. With no incoming hop, it generates a single row whose fields contain the requested information. It also accepts input rows, in which case the selected values are added to each row in the input stream(s).
We'll use this step to read the customer name from a command-line argument.
Start Pentaho Data Integration.
Drag the ‘Get System Info’ step onto the canvas.
Double-click on the step, and configure the following properties:

Click OK.
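As a rough analogy in plain Java, reading "command line argument 1" resembles reading the first element of the `args` array. This is only a sketch of the idea, not how the step is implemented; the class name `ArgumentSketch`, the method `firstArgument`, and the fallback value are assumptions made for illustration.

```java
// Hypothetical analogy for reading the first command-line argument.
final class ArgumentSketch {
    // Returns the first argument, or the fallback when no argument was supplied.
    static String firstArgument(String[] args, String fallback) {
        return (args != null && args.length > 0) ? args[0] : fallback;
    }

    public static void main(String[] args) {
        // 'name' mirrors the stream field that will carry the customer name.
        String name = firstArgument(args, "");
        System.out.println("name = " + name);
    }
}
```

The fallback makes the missing-argument case explicit; in the transformation you would decide separately how an empty customer name should be handled.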
User Defined Java Expression
This step lets you enter Java expressions that calculate new field values. In this example, a User Defined Java Expression is used to update the ‘text’ stream field with the Customer Name.
Drag the ‘User Defined Java Expression’ step onto the canvas.
Create a hop from the ‘name’ step.
Double-click the step, and configure the following properties:

Close the Step.
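The expression entered in the step is ordinary Java that refers to stream fields by name. The exact expression used in this workshop is not reproduced in the text, so the sketch below assumes a simple concatenation; the prefix `"Customer: "`, the class name `GreetingSketch`, and the method `withName` are hypothetical.

```java
// Hypothetical sketch of a string-concatenation expression over a stream field.
final class GreetingSketch {
    // Comparable to entering the expression: "Customer: " + name
    // where 'name' is the field produced by the previous step.
    static String withName(String name) {
        return "Customer: " + name;
    }

    public static void main(String[] args) {
        // The result would be stored in the 'text' field.
        String text = withName("Alice");
        System.out.println(text);
    }
}
```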
Data Grid
The Data Grid step allows you to enter a static list of rows in a grid. This is usually done for testing, reference or demo purposes.
• Meta tab: on this tab, you can specify the field metadata (output specification) of the data
• Data tab: This grid contains the data. Everything is entered in String format, so make sure you use the correct format masks in the Meta tab.
We’re going to use this step to define the top section (the head) of the survey.
Drag a ‘Data Grid’ step onto the canvas.
Double-click the step, and configure the following properties:


Close the Step.
Append Streams
This step type allows you to order the rows of two input hops. First, all the rows of the "Head hop" are read and output; after that, all the rows of the "Tail hop" are written to the output.
If more than 2 hops need to be used, you can use multiple append steps in sequence. As always, the row layout for the input data coming from both steps must be identical: the same row lengths, the same data types, the same fields at the same field indexes in the row.
In our example, the Head hop ‘text + name’ is appended to the Tail hop, ‘questions’.
Drag the ‘Append Streams’ step onto the canvas.
Create hops from the ‘text + name’ and ‘questions’ steps.
Double-click on the step, and configure the following properties:

Close the Step.
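The head-then-tail ordering can be pictured with a small Java sketch. It models each row as a single `String` for brevity, which is an assumption; the class name `AppendSketch` and the sample rows are also hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of appending two streams that share the same row layout.
final class AppendSketch {
    // Emits every head row first, then every tail row, keeping each input's order.
    static List<String> append(List<String> head, List<String> tail) {
        List<String> out = new ArrayList<>(head.size() + tail.size());
        out.addAll(head);
        out.addAll(tail);
        return out;
    }

    public static void main(String[] args) {
        List<String> head = Arrays.asList("Survey", "Customer: Alice");
        List<String> tail = Arrays.asList("1. First question", "2. Second question");
        for (String row : append(head, tail)) {
            System.out.println(row);
        }
    }
}
```

Because both lists hold the same element type, the sketch cannot mix layouts; in the transformation, the equivalent guarantee comes from giving both input hops identical fields.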
Text File Input
The Text File Input step lets you specify a list of files to read, or a list of directories with wildcards in the form of regular expressions. In addition, you can accept filenames from a previous step, making filename handling even more generic.
Part II: the main objective is to append the contents of questions.txt to the ‘head’ stream.
Drag the ‘Text File Input’ step onto the canvas.
Double-click on the step, and configure with the following properties:

File: ${Internal.Entry.Current.Directory}/questions.txt

💡 Use TAB as the delimiter.
💡 Use the row number as the question number.
Rename the stream field to ‘text’.

Close the Step.
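Using the row number as the question number amounts to pairing each line of the file with a running counter. The sketch below shows that idea on an in-memory list, assuming numbering starts at 1; the class name `RowNumberSketch` and the sample lines are hypothetical, and the field names in the comments mirror ‘question_num’ and ‘text’.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: attach a 1-based row number to each input line.
final class RowNumberSketch {
    // Returns rows as {question_num, text} pairs in input order.
    static List<String[]> numberRows(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        long rowNumber = 1; // assumption: numbering starts at 1
        for (String line : lines) {
            rows.add(new String[] { Long.toString(rowNumber), line });
            rowNumber++;
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("First question", "Second question");
        for (String[] row : numberRows(lines)) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```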
User Defined Java Expression
This step lets you enter Java expressions that calculate new field values.
In this example, a User Defined Java Expression is used to update the ‘text’ stream field with the ‘question_num’.
Drag the ‘User Defined Java Expression’ step onto the canvas.
Create a hop from the ‘survey questions’ step.
Double-click the step, and configure the following properties:

Close the Step.
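Prefixing each question with its number is again a plain Java concatenation, this time mixing a numeric field with a string field. The expression used in the workshop is not reproduced in the text, so the separator `". "`, the class name `QuestionLabelSketch`, and the method `label` are assumptions.

```java
// Hypothetical sketch of combining a numeric field with a string field.
final class QuestionLabelSketch {
    // Comparable to entering the expression: question_num + ". " + text
    // Java converts the number to a string as part of the concatenation.
    static String label(long questionNum, String text) {
        return questionNum + ". " + text;
    }

    public static void main(String[] args) {
        System.out.println(label(1, "First question"));
    }
}
```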
Select Values
The Select Values step is useful for selecting, removing, renaming, changing data types and configuring the length and precision of the fields on the stream.
These operations are organized into different categories:
• Select and Alter: Specify the exact order and name in which the fields should be placed in the output rows
• Remove: Specify the fields that should be removed from the output rows
• Meta-data: Change the name, type, length and precision (the metadata) of one or more fields
Drag the ‘Select Values’ step onto the canvas.
Create a hop from the ‘question seq’ step.
Double-click on the step, and configure the following properties:

Close the Step.
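The "Select and Alter" behaviour can be sketched as keeping only the listed fields, in the listed order. The example models a row as a map from field name to value, which is an assumption; the class name `SelectValuesSketch` and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of selecting a subset of fields from a row.
final class SelectValuesSketch {
    // Keeps only the named fields, in the order given; every other field is dropped.
    static Map<String, Object> select(Map<String, Object> row, List<String> fields) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : fields) {
            if (!row.containsKey(field)) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            out.put(field, row.get(field));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("question_num", 1L);
        row.put("text", "1. First question");
        // Keep only 'text' so the layout matches a single-field head stream.
        System.out.println(select(row, Arrays.asList("text")));
    }
}
```

Reducing the questions stream to the same fields as the head stream is what allows the two to be appended afterwards.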
Append Streams
This step type allows you to order the rows of two input hops. First, all the rows of the "Head hop" are read and output; after that, all the rows of the "Tail hop" are written to the output.
If more than 2 hops need to be used, you can use multiple append steps in sequence.
In our example, the Head hop ‘Append head’ is appended to the Tail hop, ‘select questions’.
Drag the ‘Append Streams’ step onto the canvas.
Create hops from the ‘Append head’ and ‘select questions’ steps.
Double-click on the step, and configure the following properties:

Close the Step.
Text File Output
The Text File Output step is used to export data to text file format. It is commonly used to generate comma-separated values (CSV) files that can be read by spreadsheet applications.
It is also possible to generate fixed width files by setting lengths on the fields in the fields tab.
Multiple copies of this step cannot write to the same file in parallel. If you run the step in parallel, set the option "Include stepnr in filename" so that each copy writes its own file, and merge the files later.
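To illustrate what a fixed-width field means in general, the sketch below pads or truncates a value to an exact length. It shows the concept only and is not a description of how the step itself treats over-long values; the class name `FixedWidthSketch` and the method `fit` are hypothetical, and the width is assumed to be at least 1.

```java
// Hypothetical sketch of fitting a value into a fixed-width column.
final class FixedWidthSketch {
    // Pads with spaces on the right, or truncates, so the result has exactly 'width' characters.
    static String fit(String value, int width) {
        String padded = String.format("%-" + width + "s", value);
        return padded.substring(0, width);
    }

    public static void main(String[] args) {
        // Brackets make the padding visible.
        System.out.println("[" + fit("ab", 4) + "]");
        System.out.println("[" + fit("abcdef", 4) + "]");
    }
}
```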
Drag the ‘Text File Output’ step onto the canvas.
Create a hop from the ‘Append body’ step.
Double-click on the step, and configure the following properties:

File: ${Internal.Entry.Current.Directory}/survey

Click on the Fields tab, and click on the ‘Get Fields’ button.

Close the Step.
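For comparison, the end result of the transformation (one row per line in a text file) can be sketched in plain Java. This is a stand-alone illustration rather than what the step does internally; the class name `SurveyWriterSketch`, the UTF-8 encoding, the temporary file, and the sample rows are assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: write the merged survey rows to a text file, one row per line.
final class SurveyWriterSketch {
    // Writes the rows as UTF-8 lines and returns the path that was written.
    static Path write(Path target, List<String> rows) throws IOException {
        return Files.write(target, rows, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file keeps the example from touching the workshop directory.
        Path target = Files.createTempFile("survey", ".txt");
        write(target, Arrays.asList("Customer: Alice", "1. First question"));
        System.out.println("Wrote " + target);
    }
}
```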


