# Create DB table

{% hint style="warning" %}

#### Workshop - Create DB table

Create a table from a stream definition.\
Load rows into that table in the same run.

**What you’ll do**

* Read a delimited file into a stream
* Map stream fields to table columns
* Generate and run `CREATE TABLE` SQL from **Table Output**
* Insert rows with commit and batch settings

**Prerequisites**

* A working database connection. See [Database Connections](https://academy.pentaho.com/pentaho-data-integration/data-integration/data-sources/databases/cruid/database-connections).
* Basic understanding of tables and SQL data types

**Estimated time:** 30 minutes
{% endhint %}

{% embed url="<https://www.loom.com/share/ebcc69cd2a9347f8bec1620259952df7?hideEmbedTopBar=true&hide_owner=true&hide_share=true&hide_title=true>" %}

***

{% hint style="info" %}
**Workshop files**

Download the following files.

Keep the filenames unchanged.

Save them in your workshop folder.
{% endhint %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2F98ivTtuRZcjCXOEnrftl%2Fsales_data.csv?alt=media&token=6a57700d-afd7-467e-b5c8-36378eef2191>" %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2FfUdWs0DDLZCUzp1AAqZ4%2Forders.txt?alt=media&token=646f0e63-5208-42e4-816b-1cc1e09ee2fb>" %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2FH5nF2jwMNZ52ltofxhab%2Ftr_write_database.ktr?alt=media&token=a8aa1279-6bf5-47e2-b7ea-3af72224e8c9>" %}

***

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-2aeea88980870a6e5d7a1e1c10e611e113c4ea4c%2Fcreate%20DB.png?alt=media" alt="" width="375"><figcaption><p>Create databases</p></figcaption></figure>

{% hint style="info" %}

### Create a new transformation

Use any of these options to open a new transformation tab:

* Select **File** > **New** > **Transformation**
* Use `Ctrl+N` (Windows/Linux) or `Cmd+N` (macOS)
  {% endhint %}

***

{% tabs %}
{% tab title="Workflow 1 - Sales Data" %}
{% hint style="info" %}

#### **Load Sales Data**

Load sales data from a CSV file into `STG_SALES_DATA`.

Adjust field lengths to avoid truncation.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-939504761ba0c05eff5cfc4b950f92123b1b1276%2Fimage.png?alt=media" alt="" width="277"><figcaption><p>Load sales data</p></figcaption></figure>

Follow the steps outlined below:

{% tabs %}
{% tab title="1. CSV File Input" %}
{% hint style="info" %}
**CSV File Input**

Read a delimited file into a stream.\
Use **Text File Input** if you need more format options.
{% endhint %}

1. Start Pentaho Data Integration.

{% hint style="info" %}
**Start Spoon**

{% tabs %}
{% tab title="Windows (PowerShell)" %}

```powershell
Set-Location C:\Pentaho\design-tools\data-integration
.\spoon.bat
```

{% endtab %}

{% tab title="macOS / Linux" %}

```bash
cd ~/Pentaho/design-tools/data-integration
./spoon.sh
```

{% endtab %}
{% endtabs %}
{% endhint %}

2. Drag the CSV file input step onto the canvas.
3. Open the CSV file input properties dialog box.

Configure these key fields:

* **Step name:** `csvi-sales_data`
* **File name:** `${Internal.Transformation.Filename.Directory}/sales_data.csv`
* **Delimiter:** `,` (comma)
* **Lazy conversion:** clear
* **Header row present:** select

{% hint style="info" %}
If `${Internal.Transformation.Filename.Directory}` is empty, save the transformation first.
{% endhint %}

{% hint style="danger" %}
CSV File Input infers field lengths from a sample.\
Increase string lengths before you generate table DDL.
{% endhint %}

4. Ensure the following details are configured, as outlined below:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-18ab60e37c7258a8e7f56837e242624b38fa6688%2Fcsv_sales.png?alt=media" alt=""><figcaption><p>CSV file input</p></figcaption></figure>

5. Click on the Get Fields button.
6. Select **OK**.
   {% endtab %}

{% tab title="2. Table Output" %}
{% hint style="info" %}
**Table Output**

Load rows into a database table.\
This step uses SQL `INSERT`.
{% endhint %}

1. Drag the Table Output step onto the canvas.
2. Open the Table Output properties dialog box. Ensure the following details are configured, as outlined below:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-78793d034a4f963db9e5f8b42c344ba9b25b71db%2Ftable%20output.png?alt=media" alt=""><figcaption><p>Table output - options</p></figcaption></figure>

3. Click on the Database fields.
4. Click on the ‘Get Fields’ button.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-cf0a1518735c115eb62562d66bbe672d336b6910%2Ftable%20output%20-%20fields.png?alt=media" alt=""><figcaption><p>Table output - fields</p></figcaption></figure>

{% hint style="warning" %}
Confirm the mappings between **Table fields** and **Stream fields**.
{% endhint %}

5. Click on the SQL button.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-d04735ed9f4f51775c6fed1ba0602735fd31c6e1%2Fsales%20sql.png?alt=media" alt="" width="375"><figcaption><p>SQL editor</p></figcaption></figure>

6. Select **Execute**.
7. Select **OK** to close all dialogs.

{% hint style="success" %}
Checkpoint: `STG_SALES_DATA` exists in the database.
{% endhint %}
{% endtab %}

{% tab title="3. RUN" %}
{% hint style="info" %}
**RUN and validate**

Use a database tool to verify results (DBeaver, Workbench, or your IDE).\
In production, you usually manage DDL with migrations or scripts.
{% endhint %}

1. Click the Run button in the Canvas Toolbar.
2. Confirm the table exists in your `sampledata` database.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-e3be5fbb6060046f8e74408aa2ec53487761fdcc%2Fsales%20data.png?alt=media" alt=""><figcaption><p>STG_SALES_DATA</p></figcaption></figure>
{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="Workflow 2 - Orders" %}
{% hint style="info" %}

#### **Load Orders**

Load orders data from a delimited file into `STG_ORDERS_MERGED`.

Adjust field lengths to avoid truncation.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-b9df43a1d2408359027be9461f759b257c4c9cdd%2Fimage.png?alt=media" alt="" width="288"><figcaption><p>Load orders data</p></figcaption></figure>

Follow the steps outlined below:

{% tabs %}
{% tab title="1. CSV File Input" %}
{% hint style="info" %}
**CSV File Input**

Same setup as Workflow 1, but point to your orders file.
{% endhint %}

1. Drag the CSV file input step onto the canvas.
2. Open the CSV file input properties dialog box.

Ensure the following details are configured, as outlined below:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-a13f4e532c7c30315451b3f4a0cb01648f7696c8%2Fimage.png?alt=media" alt=""><figcaption><p>Add path to orders.txt</p></figcaption></figure>

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-02367b05a054d0fbc487b0c8cab31d1643e04f70%2Fimage.png?alt=media" alt=""><figcaption><p>Set Content</p></figcaption></figure>

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-33eec2a6df4ffd2fd6b12e39eb0939c171d80af7%2Fimage.png?alt=media" alt=""><figcaption><p>Get Fields</p></figcaption></figure>
{% endtab %}

{% tab title="2. Table Output" %}
{% hint style="info" %}
**Table Output**

Load rows into a database table.\
This step uses SQL `INSERT`.
{% endhint %}

1. Drag the Table Output step onto the canvas.
2. Open the Table Output properties dialog box. Ensure the following details are configured, as outlined below:

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-093791d2d2218d2c606de5b04daf17dc780d66ac%2Fstg_orders_merged.png?alt=media" alt=""><figcaption><p>Table output - options</p></figcaption></figure>

3. Click on the Database fields.
4. Click on the ‘Get Fields’ button.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-9a3aff1d5eaaa96e3dc9da02a90539b2a497ae04%2Fstg_orders_merged%20fields.png?alt=media" alt=""><figcaption><p>Table output -fields</p></figcaption></figure>

5. Click on the SQL button.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-173762bc17935b78624668068aa852a4b4469803%2Forders%20sql.png?alt=media" alt=""><figcaption><p>SQL editor</p></figcaption></figure>

6. Click Execute.
7. Select **OK** to close all dialogs.

{% hint style="success" %}
Checkpoint: `STG_ORDERS_MERGED` exists in the database.
{% endhint %}
{% endtab %}

{% tab title="3. RUN" %}
{% hint style="info" %}
**RUN and validate**

Use a database tool to verify results.
{% endhint %}

1. Click the Run button in the Canvas Toolbar.
2. Confirm the table exists in your `sampledata` database.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-c4ff35a67a59d81154cc6d46c4bbf687507a429a%2Fimage.png?alt=media" alt=""><figcaption><p>STG_ORDERS_MERGED</p></figcaption></figure>
{% endtab %}
{% endtabs %}
{% endtab %}
{% endtabs %}

<details>

<summary>Troubleshooting</summary>

**“Table already exists”**\
Disable table creation in **Table Output**, or drop the table first.

**String truncation / data too long**\
Increase field lengths in **CSV File Input** before you generate SQL.

**SQL window is empty**\
Select **Get fields** on the **Database fields** tab first.

</details>
