# Create DB table

{% hint style="warning" %}

#### Workshop - Create DB table

Create a table from a stream definition.\
Load rows into that table in the same run.

**What you’ll do**

* Read a delimited file into a stream
* Map stream fields to table columns
* Generate and run `CREATE TABLE` SQL from **Table Output**
* Insert rows with commit and batch settings

**Prerequisites**

* A working database connection. See [Database Connections](/pentaho-data-integration/data-integration/data-sources/databases/cruid/database-connections.md).
* Basic understanding of tables and SQL data types

**Estimated time:** 30 minutes
{% endhint %}

{% embed url="https://www.loom.com/share/ebcc69cd2a9347f8bec1620259952df7?hideEmbedTopBar=true&hide_owner=true&hide_share=true&hide_title=true" %}

***

{% hint style="info" %}
**Workshop files**

Download the following files, keep the filenames unchanged, and save them in your workshop folder.
{% endhint %}

{% file src="/files/GABq2FDI1hWBtd8TM1rK" %}

{% file src="/files/laqnH0bKB6FdtKFs3wi3" %}

{% file src="/files/VMjJDhgZTsjGuFmPHSiN" %}

***

<figure><img src="/files/1IJE894zsoO3fgK4yV0P" alt="" width="375"><figcaption><p>Create databases</p></figcaption></figure>

{% hint style="info" %}

### Create a new transformation

Use either of these options to open a new transformation tab:

* Select **File** > **New** > **Transformation**
* Use `Ctrl+N` (Windows/Linux) or `Cmd+N` (macOS)
{% endhint %}

***

{% tabs %}
{% tab title="Workflow 1 - Sales Data" %}
{% hint style="info" %}

#### **Load Sales Data**

Load sales data from a CSV file into `STG_SALES_DATA`.

Adjust field lengths to avoid truncation.
{% endhint %}

<figure><img src="/files/Erbl7dFoZFgdZThpPMCo" alt="" width="277"><figcaption><p>Load sales data</p></figcaption></figure>

Follow the steps outlined below:

{% tabs %}
{% tab title="1. CSV File Input" %}
{% hint style="info" %}
**CSV File Input**

Read a delimited file into a stream.\
Use **Text File Input** if you need more format options.
{% endhint %}

1. Start Pentaho Data Integration.

{% hint style="info" %}
**Start Spoon**

{% tabs %}
{% tab title="Windows (PowerShell)" %}

```powershell
Set-Location C:\Pentaho\design-tools\data-integration
.\spoon.bat
```

{% endtab %}

{% tab title="macOS / Linux" %}

```bash
cd ~/Pentaho/design-tools/data-integration
./spoon.sh
```

{% endtab %}
{% endtabs %}
{% endhint %}

2. Drag the CSV file input step onto the canvas.
3. Open the CSV file input properties dialog box.

Configure these key fields:

* **Step name:** `csvi-sales_data`
* **File name:** `${Internal.Transformation.Filename.Directory}/sales_data.csv`
* **Delimiter:** `,` (comma)
* **Lazy conversion:** cleared
* **Header row present:** selected

{% hint style="info" %}
If `${Internal.Transformation.Filename.Directory}` is empty, save the transformation first.
{% endhint %}

{% hint style="danger" %}
CSV File Input infers field lengths from a sample.\
Increase string lengths before you generate table DDL.
{% endhint %}
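For example, if the sample leads **CSV File Input** to guess a short string length, the generated DDL inherits it. A hypothetical sketch of the difference (column names are illustrative, not the actual `sales_data.csv` fields):

```sql
-- DDL as Table Output might generate it from sampled metadata
-- (column names here are illustrative, not the real file fields)
CREATE TABLE STG_SALES_DATA (
  ORDERNUMBER INT
, PRODUCTLINE VARCHAR(12)   -- inferred from the sample; longer values would truncate
, SALES       DECIMAL(10,2)
);

-- After widening the length in CSV File Input, regenerate the SQL:
CREATE TABLE STG_SALES_DATA (
  ORDERNUMBER INT
, PRODUCTLINE VARCHAR(64)   -- widened to avoid truncation
, SALES       DECIMAL(10,2)
);
```

Widen the lengths in the step's field grid before you use the **SQL** button, so the generated `CREATE TABLE` is already correct.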

4. Ensure the details are configured as shown below:

<figure><img src="/files/zOGpA8O7STKx8xze795J" alt=""><figcaption><p>CSV file input</p></figcaption></figure>

5. Select **Get Fields**.
6. Select **OK**.
{% endtab %}

{% tab title="2. Table Output" %}
{% hint style="info" %}
**Table Output**

Load rows into a database table.\
This step uses SQL `INSERT`.
{% endhint %}
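Under the hood, **Table Output** issues parameterized inserts, sent in batches and committed every *commit size* rows. A sketch of the equivalent SQL (column names are hypothetical):

```sql
-- Equivalent of what Table Output executes per row; the ? placeholders
-- are bound from the stream fields. With batching enabled, many rows
-- are sent per round trip. Column names here are illustrative.
INSERT INTO STG_SALES_DATA (ORDERNUMBER, PRODUCTLINE, SALES)
VALUES (?, ?, ?);
```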

1. Drag the Table Output step onto the canvas.
2. Open the Table Output properties dialog box and ensure the details are configured as shown below:

<figure><img src="/files/i37VQPS6Abk4BNCSvSsX" alt=""><figcaption><p>Table output - options</p></figcaption></figure>

3. Select the **Database fields** tab.
4. Select **Get Fields**.

<figure><img src="/files/4hAWwn0z1XNExKfCa6z9" alt=""><figcaption><p>Table output - fields</p></figcaption></figure>

{% hint style="warning" %}
Confirm the mappings between **Table fields** and **Stream fields**.
{% endhint %}

5. Select **SQL**.

<figure><img src="/files/qgYtFQ35xSmpVJix96pl" alt="" width="375"><figcaption><p>SQL editor</p></figcaption></figure>

6. Select **Execute**.
7. Select **OK** to close all dialogs.

{% hint style="success" %}
Checkpoint: `STG_SALES_DATA` exists in the database.
{% endhint %}
{% endtab %}

{% tab title="3. RUN" %}
{% hint style="info" %}
**RUN and validate**

Use a database tool to verify results (DBeaver, Workbench, or your IDE).\
In production, you usually manage DDL with migrations or scripts.
{% endhint %}

1. Select **Run** in the canvas toolbar.
2. Confirm that `STG_SALES_DATA` exists in your `sampledata` database.

<figure><img src="/files/w2IODoGL4Jxn1V4OAUyM" alt=""><figcaption><p>STG_SALES_DATA</p></figcaption></figure>
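From your database tool, a quick check along these lines confirms the load (exact syntax varies by database):

```sql
-- Confirm the table was created and rows were inserted
SELECT COUNT(*) AS row_count FROM STG_SALES_DATA;

-- Spot-check a few rows for truncated strings or shifted columns
SELECT * FROM STG_SALES_DATA LIMIT 10;  -- use TOP 10 / FETCH FIRST on other databases
```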
{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="Workflow 2 - Orders" %}
{% hint style="info" %}

#### **Load Orders**

Load orders data from a delimited file into `STG_ORDERS_MERGED`.

Adjust field lengths to avoid truncation.
{% endhint %}

<figure><img src="/files/fTixivdeazDfoVP0lire" alt="" width="288"><figcaption><p>Load orders data</p></figcaption></figure>

Follow the steps outlined below:

{% tabs %}
{% tab title="1. CSV File Input" %}
{% hint style="info" %}
**CSV File Input**

Same setup as Workflow 1, but point to your orders file.
{% endhint %}

1. Drag the CSV file input step onto the canvas.
2. Open the CSV file input properties dialog box.

Ensure the details are configured as shown below:

<figure><img src="/files/9STa2CDOhvSF9khcd5OC" alt=""><figcaption><p>Add path to orders.txt</p></figcaption></figure>

<figure><img src="/files/j692ELltSo63GQWNFPxC" alt=""><figcaption><p>Set Content</p></figcaption></figure>

<figure><img src="/files/tXyHrT4vy7E9Xui34ype" alt=""><figcaption><p>Get Fields</p></figcaption></figure>
{% endtab %}

{% tab title="2. Table Output" %}
{% hint style="info" %}
**Table Output**

Load rows into a database table.\
This step uses SQL `INSERT`.
{% endhint %}

1. Drag the Table Output step onto the canvas.
2. Open the Table Output properties dialog box and ensure the details are configured as shown below:

<figure><img src="/files/z0juo75eYE6QoIZQAthN" alt=""><figcaption><p>Table output - options</p></figcaption></figure>

3. Select the **Database fields** tab.
4. Select **Get Fields**.

<figure><img src="/files/jVc7QAh1QsVUOpJDwFgY" alt=""><figcaption><p>Table output - fields</p></figcaption></figure>

5. Select **SQL**.

<figure><img src="/files/KuNskbRAnAupIALHLfFK" alt=""><figcaption><p>SQL editor</p></figcaption></figure>

6. Select **Execute**.
7. Select **OK** to close all dialogs.

{% hint style="success" %}
Checkpoint: `STG_ORDERS_MERGED` exists in the database.
{% endhint %}
{% endtab %}

{% tab title="3. RUN" %}
{% hint style="info" %}
**RUN and validate**

Use a database tool to verify results.
{% endhint %}

1. Select **Run** in the canvas toolbar.
2. Confirm that `STG_ORDERS_MERGED` exists in your `sampledata` database.

<figure><img src="/files/sp2ghOBFEUzhGHIw3lsa" alt=""><figcaption><p>STG_ORDERS_MERGED</p></figcaption></figure>
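As with Workflow 1, verify the load from your database tool; comparing the row count against the source file's line count (minus the header row) is a quick sanity check:

```sql
-- Row count should match the data rows in orders.txt
SELECT COUNT(*) AS row_count FROM STG_ORDERS_MERGED;
```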
{% endtab %}
{% endtabs %}
{% endtab %}
{% endtabs %}

<details>

<summary>Troubleshooting</summary>

**“Table already exists”**\
Disable table creation in **Table Output**, or drop the table first.

**String truncation / data too long**\
Increase field lengths in **CSV File Input** before you generate SQL.

**SQL window is empty**\
Select **Get fields** on the **Database fields** tab first.

</details>
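For the "Table already exists" case, dropping the staging table before re-running is the simplest fix during the workshop (only do this in a sandbox database):

```sql
-- Drop the staging tables, then let Table Output's SQL button regenerate the DDL.
-- DROP TABLE IF EXISTS is supported by MySQL, PostgreSQL, and H2, among others;
-- on databases without IF EXISTS, use a plain DROP TABLE.
DROP TABLE IF EXISTS STG_SALES_DATA;
DROP TABLE IF EXISTS STG_ORDERS_MERGED;
```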


***

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://academy.pentaho.com/pentaho-data-integration/data-integration/data-sources/databases/cruid/create-db-table.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
