# Text File Output

{% hint style="warning" %}

#### Workshop - Text File Output

Reading files is only half the job. You also need to generate files for users and systems.

In this workshop, you build a transformation that writes a customer survey for Steel Wheels. You assemble the survey from multiple streams and personalize it with a customer name supplied at run time.

**What you'll do**

* Read a customer name from a transformation argument
* Build header and body sections with static rows and file-driven rows
* Format text using User Defined Java Expression
* Merge streams in a predictable order with Append streams
* Write the final output with Text file output

**Prerequisites:** Understanding of basic transformation concepts (steps, hops, preview). Complete [Text File Input](https://academy.pentaho.com/pentaho-data-integration/data-integration/data-sources/flat-files/text/text-file-input) first.

**Estimated time:** 35 minutes
{% endhint %}

***

{% hint style="info" %}
**Workshop files**

Download the following files. Keep the filenames unchanged and save them in your workshop folder.
{% endhint %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-5d0ba50d3a2abc8be0b2285ce0b4e7972bcace17%2Ftr_write_output.ktr?alt=media>" %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-6d5160f06de3b0fa7f96a5a18986ef59b64ab5b2%2Fquestions.txt?alt=media>" %}

***

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-92d5f2896b6fcf7765585bfd02d49d40baa1c4d1%2Fimage.png?alt=media" alt=""><figcaption><p>Survey - Text File Output</p></figcaption></figure>

{% hint style="info" %}
**Create a new transformation**

Use any of these options to open a new transformation tab:

* Select **File** > **New** > **Transformation**
* Use `Ctrl+N` (Windows/Linux) or `Cmd+N` (macOS)
  {% endhint %}

***

{% tabs %}
{% tab title="1. Get System Info" %}
{% hint style="info" %}

#### Get System Info

Use **Get System Info** to read an argument that is passed to the transformation at run time. In this workshop, the argument holds the customer name.
{% endhint %}
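
A run-time argument behaves much like a positional argument passed to a program. The Java sketch below is only an analogy, not PDI code. It assumes the customer name is the first argument, and the class name `CustomerArgument`, the method `customerName`, and the fallback value `Unknown customer` are hypothetical names chosen for illustration.

```java
// Analogy only: inside PDI, Get System Info resolves the argument for you.
final class CustomerArgument {

    // Returns the first positional argument, trimmed. Falls back to a
    // hypothetical default when no argument is supplied.
    static String customerName(String[] args) {
        if (args.length > 0 && !args[0].trim().isEmpty()) {
            return args[0].trim();
        }
        return "Unknown customer"; // assumed default, not PDI behavior
    }

    public static void main(String[] args) {
        System.out.println(customerName(args));
    }
}
```
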

{% embed url="<https://www.loom.com/share/de06920fdcd84bc2b1fe63454afc8df8?hideEmbedTopBar=true&hide_owner=true&hide_share=true&hide_title=true>" %}
Get System Info
{% endembed %}

1. Start Pentaho Data Integration (Spoon).

{% hint style="info" %}
{% tabs %}
{% tab title="Windows (PowerShell)" %}

```powershell
Set-Location C:\Pentaho\design-tools\data-integration
.\spoon.bat
```

{% endtab %}

{% tab title="macOS / Linux" %}

```bash
cd ~/Pentaho/design-tools/data-integration
./spoon.sh
```

{% endtab %}
{% endtabs %}
{% endhint %}

2. Drag **Get System Info** onto the canvas.
3. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-6693c3539757a2e9f83bb64730d5b3ee832bdcea%2Fget%20sys%20info.png?alt=media" alt="" width="375"><figcaption><p>Get system info</p></figcaption></figure>

4. Select **OK**.
   {% endtab %}

{% tab title="2. User Defined Java Expression" %}
{% hint style="info" %}

#### User Defined Java Expression

Use **User Defined Java Expression** to format the header line. It will combine a label with the customer name.
{% endhint %}

1. Drag **User Defined Java Expression** onto the canvas.
2. Create a hop from **Get System Info**.
3. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-b605f7136531ab6498959d979518672a9058be40%2Ftext%20%2B%20name.png?alt=media" alt=""><figcaption><p>UDJE</p></figcaption></figure>

4. Select **OK**.

{% hint style="info" %}
This replaces the original argument value with formatted text. The output field is still named `text`.
{% endhint %}
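
The expression in this step is ordinary Java string concatenation. As a minimal standalone sketch of the same idea, the class below builds a header line from a label and the `text` value. The label `Survey for: ` and the names `HeaderLine` and `format` are placeholders; use the label shown in the step configuration above.

```java
final class HeaderLine {

    // Mirrors an expression of the form: "<label>" + text
    // The label is a placeholder, not the workshop's actual wording.
    static String format(String text) {
        return "Survey for: " + text;
    }

    public static void main(String[] args) {
        System.out.println(format("Acme Ltd")); // prints "Survey for: Acme Ltd"
    }
}
```

In plain Java, concatenating a `null` value produces the literal text `null`, so a missing argument would show up as `Survey for: null` in this sketch.
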

{% embed url="<https://docs.oracle.com/javase/tutorial/java/nutsandbolts/opsummary.html>" %}
Summary of Java operators (Oracle Java Tutorials)
{% endembed %}
{% endtab %}

{% tab title="3. Data Grid" %}
{% hint style="info" %}

#### Data Grid

Use **Data Grid** to add static survey header lines. This keeps the top-of-file content inside the transformation.

Configure the field metadata on **Meta**. Enter the rows on **Data**.
{% endhint %}

1. Drag **Data Grid** onto the canvas.
2. Double-click the step. Configure it like this:

<div><figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-f5ca7d96f6b75f7fcc2fa397d16c5c9ae34a61c7%2Fdg%20-%20instructions.png?alt=media" alt=""><figcaption><p>Data grid - text</p></figcaption></figure> <figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-a4a71cd03d4e4ecb890d64267110d00fd6498dea%2Fdg%20-%20data.png?alt=media" alt=""><figcaption><p>Data grid - data</p></figcaption></figure></div>

3. Select **OK**.
   {% endtab %}

{% tab title="4. Append streams (head)" %}
{% hint style="info" %}

#### Append streams

Use **Append streams** when order matters. It outputs all rows from the **Head hop** first, then all rows from the **Tail hop**.

Both input streams must have the same field names and types.
{% endhint %}

1. Drag **Append streams** onto the canvas.
2. Create hops from **User Defined Java Expression** and **Data Grid**.
3. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-4ca10f383970c2035ccb6d9447a4e45cf99fc5ff%2Fappend%20(2).png?alt=media" alt="" width="375"><figcaption><p>Append</p></figcaption></figure>

4. Select **OK**.

{% hint style="info" %}
To append streams, keep the layout consistent. In this workshop, every stream uses a single `text` field.
{% endhint %}

{% hint style="info" %}
If order does not matter, you can connect both streams directly to the same target step (for example, **Dummy (do nothing)**). The rows are then merged in no guaranteed order.
{% endhint %}

{% hint style="warning" %}
Make sure the header stream is set as the **Head hop**. Append streams outputs the rows from that hop first.
{% endhint %}
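
The ordering guarantee can be sketched in plain Java as a list concatenation: every head row is emitted before any tail row. This is an illustration of the behavior described above, not the step's implementation. The class name `AppendStreamsSketch` and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AppendStreamsSketch {

    // Emits every head row first, then every tail row.
    static List<String> append(List<String> head, List<String> tail) {
        List<String> out = new ArrayList<>(head);
        out.addAll(tail);
        return out;
    }

    public static void main(String[] args) {
        // Sample rows are placeholders for the customer line and static lines.
        List<String> head = Arrays.asList("Survey for: Acme Ltd");
        List<String> tail = Arrays.asList("Please answer each question.", "");
        for (String row : append(head, tail)) {
            System.out.println(row);
        }
    }
}
```
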
{% endtab %}

{% tab title="5. Text file input (questions)" %}
{% hint style="info" %}

#### Text file input

Use **Text file input** to read the question list from a file. Each question becomes one row.
{% endhint %}

1. Drag **Text file input** onto the canvas.
2. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-6320e6241c6d084f5bfcfefe75f63a7ae3113483%2Fsurvey%20(2).png?alt=media" alt=""><figcaption><p>Text file input</p></figcaption></figure>

File: `${Internal.Transformation.Filename.Directory}/questions.txt`

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-ced38fa0454555e8d7eb8a28c36e1984fa8f3c9d%2Fsurvey%20content.png?alt=media" alt=""><figcaption><p>Text file input - Content</p></figcaption></figure>

{% hint style="info" %}
Use a **Tab** delimiter. Enable **Rownum in output?** to generate question numbers.
{% endhint %}

3. On **Fields**, rename the output field to `text`:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-c3928e94e4c0add20437664b92f7b74353651d3a%2Fsurvey%20fields.png?alt=media" alt=""><figcaption><p>Text file input - Fields</p></figcaption></figure>

4. Select **OK**.

{% hint style="info" %}
Each row now contains a question in `text`. The row number field (for example `question_num`) identifies the question number.
{% endhint %}
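
To make the row layout concrete, the sketch below pairs each line with a 1-based row number, which is roughly what the stream looks like after this step. It is a plain Java illustration, not PDI code. The names `QuestionRows`, `Row`, and `fromLines` are hypothetical, and the second sample question is invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class QuestionRows {

    // One row on the stream: the row number plus the question text.
    static final class Row {
        final long questionNum;
        final String text;

        Row(long questionNum, String text) {
            this.questionNum = questionNum;
            this.text = text;
        }
    }

    // Turns each input line into a row, numbering from 1.
    static List<Row> fromLines(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        long next = 1;
        for (String line : lines) {
            rows.add(new Row(next++, line));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("How did we do?", "Would you recommend us?");
        for (Row row : fromLines(lines)) {
            System.out.println(row.questionNum + "\t" + row.text);
        }
    }
}
```
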
{% endtab %}

{% tab title="6. User Defined Java Expression (number questions)" %}
{% hint style="info" %}

#### User Defined Java Expression

Use a second **User Defined Java Expression** to prefix each question line with its question number.
{% endhint %}

1. Drag **User Defined Java Expression** onto the canvas.
2. Create a hop from **Text file input (questions)**.
3. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-a39615d6811bb2d92286fe5cdeb1920a6f5ce1bb%2FUDJE%20-%20question.png?alt=media" alt=""><figcaption><p>UDJE - concat question numbers</p></figcaption></figure>

4. Select **OK**.

{% hint style="info" %}
This overwrites `text` with a numbered question like `1. How did we do?`.
{% endhint %}
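
A standalone Java sketch of this kind of expression is shown below. It assumes the row number field is called `question_num` and that the separator is a period followed by a space, as in the example above. The class and method names are hypothetical.

```java
final class NumberedQuestion {

    // Mirrors an expression of the form: question_num + ". " + text
    static String format(long questionNum, String text) {
        return questionNum + ". " + text;
    }

    public static void main(String[] args) {
        System.out.println(format(1, "How did we do?")); // prints "1. How did we do?"
    }
}
```
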
{% endtab %}

{% tab title="7. Select values" %}
{% hint style="info" %}

#### Select values

Use **Select values** to select, remove, or rename fields, change their data types, and set their length and precision.

The step groups these operations into three tabs:

* **Select & Alter**: set the order and names of the fields in the output rows
* **Remove**: list the fields to drop from the output rows
* **Meta-data**: change the name, type, length, and precision (the metadata) of one or more fields
  {% endhint %}

1. Drag **Select values** onto the canvas.
2. Create a hop from **User Defined Java Expression (number questions)**.
3. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-88535910b2d0c8e6e34a88ab1d8a7ee82ab75406%2FSV%20(1)%20(1).png?alt=media" alt="" width="375"><figcaption><p>Select values - remove question_num</p></figcaption></figure>

4. Select **OK**.

{% hint style="info" %}
Remove the question number field so both streams have the same layout. You need a single `text` field before you append.
{% endhint %}
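
The effect of removing a field can be sketched in plain Java by treating a row as an ordered map of field names to values. This is an analogy for the **Remove** operation, not how PDI stores rows. The names `RemoveFieldSketch` and `removeField` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RemoveFieldSketch {

    // Returns a copy of the row without the named field.
    // The input row is left unchanged.
    static Map<String, Object> removeField(Map<String, Object> row, String field) {
        Map<String, Object> copy = new LinkedHashMap<>(row);
        copy.remove(field);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("question_num", 1L);
        row.put("text", "1. How did we do?");

        System.out.println(removeField(row, "question_num").keySet()); // prints "[text]"
    }
}
```
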
{% endtab %}

{% tab title="8. Append streams (body)" %}
{% hint style="info" %}

#### Append streams

Append the numbered question stream after the survey header stream.
{% endhint %}

1. Drag **Append streams** onto the canvas.
2. Create hops from **Append streams (head)** and **Select values**.
3. Double-click the step. Configure it like this:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-1e4c1390e286a6bc7efd5575507f2b1345e56f6d%2Fappend%20-q.png?alt=media" alt="" width="375"><figcaption><p>Append</p></figcaption></figure>

4. Select **OK**.

{% hint style="info" %}
You now have one stream. It contains one field named `text`.
{% endhint %}
{% endtab %}

{% tab title="9. Text file output" %}
{% hint style="info" %}

#### Text file output

Use **Text file output** to write the survey file to disk.
{% endhint %}

{% hint style="warning" %}
Do not run multiple copies of this step against the same output file. Use **Include stepnr in filename** if you need parallel output.
{% endhint %}
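
As a rough picture of what writing the survey involves, the Java sketch below writes one line per `text` row to a file whose name is built from a base name and an extension. It is an illustration only. UTF-8 encoding, the temporary directory, and the names `SurveyWriter`, `fileName`, and `write` are assumptions of this sketch, not settings taken from the step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

final class SurveyWriter {

    // Joins base name and extension: "survey" + "txt" -> "survey.txt".
    static String fileName(String baseName, String extension) {
        return baseName + "." + extension;
    }

    // Writes one line per row, replacing any existing file.
    // UTF-8 is an assumption of this sketch.
    static Path write(Path directory, String baseName, String extension, List<String> rows) {
        Path target = directory.resolve(fileName(baseName, extension));
        try {
            return Files.write(target, rows, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The temporary directory stands in for the transformation folder.
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"));
        List<String> rows = Arrays.asList("Survey for: Acme Ltd", "1. How did we do?");
        System.out.println("Wrote " + write(dir, "survey", "txt", rows));
    }
}
```

Two writers that target the same path would overwrite each other's output in this sketch, which is the same hazard the warning above describes.
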

1. Drag **Text file output** onto the canvas.
2. Create a hop from **Append streams (body)**.
3. Double-click the step. Set the file name:

Filename: `${Internal.Transformation.Filename.Directory}/survey`

{% hint style="info" %}
Set **Extension** to `txt` if your output should be `survey.txt`.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-809d3fffe2308f047ffc7c53a0fd22006287f146%2FTFO%20-%20Content%20(1).png?alt=media" alt="" width="563"><figcaption><p>Text file output - Content</p></figcaption></figure>

4. On **Fields**, select **Get Fields**.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-f05039f38558a046ffe566cf6b1ee26f682aa669%2FTFO%20-%20survey.png?alt=media" alt="" width="563"><figcaption><p>Text file output - Fields</p></figcaption></figure>

5. Select **OK**.
   {% endtab %}

{% tab title="10. Run" %}
{% hint style="info" %}

#### Run the transformation

Run the transformation locally. Pass a customer name as an argument.
{% endhint %}

1. Select **Run** in the canvas toolbar.
2. Open **Arguments (legacy)**. Enter a customer name.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-8019e8bf89601faebee1dee42b7b25c3bbf65636%2FTFO%20-%20run.png?alt=media" alt=""><figcaption><p>Enter argument</p></figcaption></figure>

3. Select **Run**.
4. Open the **Preview data** tab.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-cd34c47b60410982c2e11ffd5542572fd57e41c2%2FTFO%20-%20preview.png?alt=media" alt=""><figcaption><p>Preview data</p></figcaption></figure>

5. Open the generated survey file in your transformation folder.

{% hint style="info" %}
This workshop reinforces the rule for merging streams:

* Keep the same field layout (names and order).
* Keep matching data types.
  {% endhint %}
  {% endtab %}
  {% endtabs %}
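
The merging rule from this workshop can be expressed as a small check: two streams can be appended only if their field lists match in name, type, and order. The Java sketch below illustrates that comparison. The `name:Type` notation and the names `LayoutCheck` and `sameLayout` are hypothetical, and the type labels are placeholders rather than values read from PDI.

```java
import java.util.Arrays;
import java.util.List;

final class LayoutCheck {

    // Each entry describes one field as "name:Type", in stream order.
    static boolean sameLayout(List<String> first, List<String> second) {
        // List.equals compares size, order, and each element.
        return first.equals(second);
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("text:String");
        List<String> questions = Arrays.asList("question_num:Integer", "text:String");
        List<String> cleaned = Arrays.asList("text:String");

        System.out.println(sameLayout(header, questions)); // false
        System.out.println(sameLayout(header, cleaned));   // true
    }
}
```

The first comparison fails because the question stream still carries `question_num`. The second succeeds once that field has been removed, which is why the workshop adds **Select values** before the final append.
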
