# Excel Writer

{% hint style="warning" %}

#### Workshop - Excel Writer

Excel reports often need templates, charts, and fixed layouts.

Here you populate a pre-formatted Sales and Expenses report. You will write multiple sections into one workbook. You will control execution order so writers do not conflict.

**What you'll do**

* Use a template workbook and write to fixed cell positions
* Write a report header with **Generate rows**
* Read sales and expense rows from text files
* Block parallel flows before writing to the same file
* Write multiple sections with **Microsoft Excel Writer**

By the end, you will know how to write into an Excel template safely. You will also know when to block parallel flows.

**Prerequisites:** Understanding of basic transformation concepts (steps, hops, preview). Complete [Text File Input](https://academy.pentaho.com/pentaho-data-integration/data-integration/data-sources/flat-files/text/text-file-input) first.

**Estimated time:** 35 minutes
{% endhint %}

{% embed url="<https://www.loom.com/share/770dd4bae75049eeab247bec2bf6fbcc?hideEmbedTopBar=true&hide_owner=true&hide_share=true&hide_title=true>" %}
Advanced Excel Writer scenario
{% endembed %}

***

{% hint style="info" %}
**Workshop files**

Download the following files.

Keep the filenames unchanged.

Save them in your workshop folder.
{% endhint %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-9f8d9dcda077be950aedaffa0d5fc3213a56e15e%2Fsales.txt?alt=media>" %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-7708888c6906cfa34eae075d8098c29dd78edbbb%2Fexpenses.txt?alt=media>" %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-fa294945043898ccd03800a5d05529b14bfcc9ad%2Ftemplate.xlsx?alt=media>" %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-07499f8b2eb12bc356a278894e8fde3121594b71%2Ftr_write_excel.ktr?alt=media>" %}

***

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-de9e48adaafdbe895552b18a82192bff60a9efc6%2Fexcel.png?alt=media" alt="" width="375"><figcaption><p>Sales &#x26; Expenses</p></figcaption></figure>

{% hint style="info" %}
**Create a new transformation**

Use any of these options to open a new transformation tab:

* Select **File** > **New** > **Transformation**
* Use `Ctrl+N` (Windows/Linux) or `Cmd+N` (macOS)
  {% endhint %}

***

{% tabs %}
{% tab title="1. Excel Template" %}
{% hint style="info" %}

#### Excel template

The various stages of the transformation write data to a template.xlsx. The template has 2 worksheets:

* Sales Chart - this worksheet creates a 3D stacked graph
* SourceData - worksheet
  {% endhint %}

1. Open `template.xlsx` in Excel:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-2a5905039e5a72e0ef38b5abd50a11ba64342c87%2Fimage.png?alt=media" alt=""><figcaption><p>Blank template</p></figcaption></figure>

{% hint style="info" %}
SourceData - the datasheet. Transformations write to the required cells that are used to create the graph.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-bdaf2c1e7d5421ed18aff42eb19efccf82f98fd3%2Fimage.png?alt=media" alt=""><figcaption><p>Blank SourceData</p></figcaption></figure>
{% endtab %}

{% tab title="2. Write Year" %}
{% hint style="info" %}

#### Write year

The first workflow is to write the current Year to the SourceData worksheet in the template.xlsx

You can change the year value.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-a6707295a9c995feab7bf153794969927ad69824%2Fexcel%20year.png?alt=media" alt="" width="375"><figcaption><p>Year</p></figcaption></figure>

{% tabs %}
{% tab title="1. Generate Rows - Year" %}
{% hint style="info" %}

#### Generate rows - year

Generate rows outputs a fixed number of rows. Here you output a single row that contains the report year.
{% endhint %}

1. Start Pentaho Data Integration (Spoon).

{% hint style="info" %}
{% tabs %}
{% tab title="Windows (PowerShell)" %}

```powershell
Set-Location C:\Pentaho\design-tools\data-integration
.\spoon.bat
```

{% endtab %}

{% tab title="macOS / Linux" %}

```bash
cd ~/Pentaho/design-tools/data-integration
./spoon.sh
```

{% endtab %}
{% endtabs %}
{% endhint %}

2. Drag the ‘Generate Rows’ step onto the canvas.
3. Double-click on the step, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-e639cb50c5afbeb98638687680d7814e4045d83e%2FGR%20-%20year.png?alt=media" alt=""><figcaption><p>Generate rows - Year</p></figcaption></figure>

4. Close Step.

***

{% hint style="info" %}
**Summary**

* Generates a record that holds the Year value – 2023 – in the year stream field.
* The Excel template will also need to be formatted yyyy to interpret the Date.
  {% endhint %}
  {% endtab %}

{% tab title="2. Excel Writer - Year" %}
{% hint style="info" %}

#### Excel Writer - year

Microsoft Excel Writer writes incoming rows into an Excel workbook. Use `xlsx` when you work with templates and charts.
{% endhint %}

1. Drag the ‘Excel writer’ step onto the canvas.
2. Create a hop from the ‘Year’ step.
3. Double-click on the step, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-19d7eb53166ee0ee1decf20751276b2614a98039%2Fexcel%20writer%20-%20year.png?alt=media" alt=""><figcaption><p>Excel writer - Year</p></figcaption></figure>

{% hint style="info" %}
Use these paths:

* Output: `${Internal.Transformation.Filename.Directory}/Sales_and_Expenses_2023.xlsx`
* Template: `${Internal.Transformation.Filename.Directory}/template.xlsx`

Select **Replace with new output file** while you develop. It resets the workbook on every run.
{% endhint %}

4\. Click on the Content tab, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-5a15b8933cc1792135534546cf6a929d8cd54781%2Fexcel%20writer%20-%20content.png?alt=media" alt=""><figcaption><p>Excel writer - cell</p></figcaption></figure>

5. Click on ‘Get Fields’ button.
6. Click OK.
   {% endtab %}
   {% endtabs %}
   {% endtab %}

{% tab title="3. Write Sales" %}
{% hint style="info" %}

#### Write sales

Write sales rows into the same workbook. Use a blocking step so the year write completes first.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-6259ab6de7f8fde19b1798d6ca337cc49d23b567%2Fwrite%20sales.png?alt=media" alt="" width="375"><figcaption><p>Write Sales</p></figcaption></figure>

{% tabs %}
{% tab title="1. Text File Input - Read Sales" %}
{% hint style="info" %}

#### Text file input - read sales

Read the sales dataset from the workshop file. Keep the header row enabled so field names match the template.
{% endhint %}

1. Drag the ‘Text file input’ step onto the canvas.
2. Double-click on the step, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-b512928eb22d1112122216f89715b6a4610c7126%2Ftfi%20-%20sales.png?alt=media" alt=""><figcaption><p>Text file input - sales</p></figcaption></figure>

3. Click on the Content tab, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-7ea368c91f3c1a8076ca846a0d1521db3ea8145e%2Ftfi%20-content.png?alt=media" alt=""><figcaption><p>Text file input - Content</p></figcaption></figure>

{% hint style="info" %}

* Ensure the Header is selected.
* No empty rows
* Mixed Format
  {% endhint %}

4. Click on the Fields tab, and click on ‘Get Fields’ button:

{% hint style="info" %}
Returns the Header values as stream fields.
{% endhint %}

5. Click OK.
   {% endtab %}

{% tab title="2. Block until Step Finish - Wait Year" %}
{% hint style="info" %}

#### Block until steps finish - wait year

This step waits for specific steps to finish. Use it to prevent parallel writers.
{% endhint %}

1. Drag the ‘Block this step until steps finish’ step onto the canvas.
2. Create a hop from the ‘Read Sales’ step.
3. Double-click on the step, and configure the following properties:
   * Watch step: the step that writes the year (copy `0`)
   * If you used **Get steps**, remove everything except the year writer step

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-e118ed4f9621494cd4a1cd8585419c90b53b5730%2Fblock%20year.png?alt=media" alt=""><figcaption><p>Block step</p></figcaption></figure>

{% hint style="info" %}
This will result in the workflow being blocked until the Write Year step has been completed.
{% endhint %}
{% endtab %}

{% tab title="3. Excel Writer - Write Sales" %}
{% hint style="info" %}

#### Excel Writer - write sales

Write sales rows into the existing workbook. Use **Use existing file for writing**.
{% endhint %}

1. Drag the ‘Excel writer’ step onto the canvas.
2. Create a hop from the ‘Wait Year’ step.
3. Double-click on the step, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-0798210c5c09c27b5dfa7eae090525f15b6d06cb%2Fexcel%20writer%20-%20sales.png?alt=media" alt=""><figcaption><p>Excel writer - Sales</p></figcaption></figure>

{% hint style="info" %}
Use the same output path you used in the year writer:

* Output: `${Internal.Transformation.Filename.Directory}/Sales_and_Expenses_2023.xlsx`

Select **Use existing file for writing**.
{% endhint %}

4. Click on the Content tab, and configure the following properties:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-56dc27d91490795042379de2cf4cd8ad8af5a2de%2Fexcel%20writer%20-%20content%20(1).png?alt=media" alt=""><figcaption><p>Excel writer - Content</p></figcaption></figure>

5. Click on the ‘Get Fields’ button.
6. Delete the productline field, as its not required. The template already has the fieldname and you are just writing the data, starting at cell B5.
7. Click OK.
   {% endtab %}
   {% endtabs %}
   {% endtab %}

{% tab title="4. Write Expenses" %}
{% hint style="info" %}

#### Write expenses

Write expense rows into the same workbook. Block until the sales write completes.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-6387c32a104e96cbef469d4575ed695d5cd7c267%2Fwrite%20expenses.png?alt=media" alt="" width="375"><figcaption><p>Write Expenses</p></figcaption></figure>

{% tabs %}
{% tab title="1.  Text File Input - Read Expenses" %}
{% hint style="info" %}

#### Text file input - read expenses

Start with loading the Expenses data into the data stream.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-814663c405611ca3e3d7ec1a809584b55f828e51%2Ftfi%20-expenses.png?alt=media" alt=""><figcaption><p>Text File input - Expenses</p></figcaption></figure>

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-1607b26aa3f3d194be69443f33cd3bcd43720e7b%2Ftfi%20-content%20expenses.png?alt=media" alt=""><figcaption><p>Text File input - Content</p></figcaption></figure>

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-86291266e1f25e3209a9e55b43d3379bdffe1a2c%2Ftfi%20-field%20expenses.png?alt=media" alt=""><figcaption><p>Text File input - Fields</p></figcaption></figure>
{% endtab %}

{% tab title="2. Block until Step finish - Sales" %}
{% hint style="info" %}

#### Block until steps finish - wait sales

Wait until sales data has finished writing to the workbook.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-f386cf0039a5442712a7167f306ab5be312688e0%2Fblock%20expenses.png?alt=media" alt=""><figcaption><p>Block 'Write Sales'</p></figcaption></figure>
{% endtab %}

{% tab title="3. Excel Writer - Expenses" %}
{% hint style="info" %}

#### Excel Writer - Expenses

Write expenses to `SourceData`.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-7e88d440d1bb37bda963119925c252eb49dc76f6%2Fexcel%20writer%20-%20expenses.png?alt=media" alt=""><figcaption><p>Excel writer - Expenses</p></figcaption></figure>

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-41ed7c70832699824dacd81af06c440bdbbe60e9%2Fexcel%20writer%20-%20expenses%20fields%20.png?alt=media" alt=""><figcaption><p>Excel writer - Content</p></figcaption></figure>
{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="5. RUN" %}
{% hint style="info" %}

#### Run the transformation

Steps initialize in parallel. Use blocking steps to prevent concurrent writes to the workbook.
{% endhint %}

1. Click the Run button in the Canvas Toolbar.
2. Open the Sales\_and\_Expenses\_2023.xlsx file.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-76e5699c21d628580661adf8008e1279fec9e299%2Fexcel%20template.jpg?alt=media" alt=""><figcaption><p>Excel Book</p></figcaption></figure>
{% endtab %}
{% endtabs %}
