# Database Connections

{% hint style="warning" %}

#### Workshop - Database connections

Create a reusable MySQL connection to the Steel Wheels `sampledata` database.

**What you’ll do**

* Validate the database is reachable (optional, using DBeaver)
* Install a JDBC driver (only if PDI does not include it)
* Create, test, share, and explore a PDI database connection

**Prerequisites**

* PDI (Spoon) installed and working
* A running `sampledata` database (Docker setup recommended)
* Basic understanding of schemas, tables, and authentication

**Estimated time:** 15 minutes
{% endhint %}

{% embed url="<https://www.loom.com/share/3ec2d5123814460d92085851c18daaee?hideEmbedTopBar=true&hide_owner=true&hide_share=true&hide_title=true>" %}
Database Connection
{% endembed %}

***

{% hint style="info" %}
**Workshop files**

Download the following file.

Keep the filename unchanged.

Save it in your workshop folder.
{% endhint %}

{% file src="<https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2FZx6jfrv7B9TFesgrSXud%2Ftr_connect_database.ktr?alt=media&token=08e82862-bc8e-42f4-af1e-23f1855d5bc0>" %}

***

{% hint style="info" %}

### Create a new transformation

Use any of these options to open a new transformation tab:

* Select **File** > **New** > **Transformation**
* Use `Ctrl+N` (Windows/Linux) or `Cmd+N` (macOS)
  {% endhint %}

***

{% tabs %}
{% tab title="1. DBeaver" %}
{% hint style="info" %}

#### **DBeaver**

DBeaver is optional.\
Use it to confirm the database is reachable before you touch PDI.

DBeaver CE ships with most drivers.\
That makes it a fast connectivity check.
{% endhint %}

{% hint style="warning" %}
Pentaho no longer ships a writable sample database user by default.\
Use the Docker `sampledata` setup for hands-on database workshops.
{% endhint %}

{% tabs %}
{% tab title="MySQL" %}
{% hint style="info" %}

#### **MySQL Database**

If you completed the [Setup](https://academy.pentaho.com/pentaho-data-integration/data-integration/data-sources/databases/cruid/broken-reference), you should have a MySQL Docker container.\
It should be exposed on port `3306` and include the `sampledata` database.
{% endhint %}

1. Launch DBeaver and select **MySQL**.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-42a361694e5ba192787221c14512cce1bee558d0%2Fimage.png?alt=media" alt=""><figcaption><p>MySQL</p></figcaption></figure>

2. Configure the connection:

* **Username:** `root`
* **Password:** `password`

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-d7cd5cc2208013de8fbbd5cafaf814dcc79b8f1c%2Fimage.png?alt=media" alt=""><figcaption><p>Configure &#x26; Test MySQL connection - sampledata</p></figcaption></figure>

{% hint style="warning" %}
You might need to download the supported driver version.

If the test fails, enable `allowPublicKeyRetrieval`.
{% endhint %}

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-5c8f6785f686bc5a001ce1ea4146538dbd43839c%2Fimage%20(56).png?alt=media" alt=""><figcaption><p>Enable: allowPublicKeyRetrieval</p></figcaption></figure>

3. Test the connection.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-b67c91b68bd64e8d8f41b48f331633d8b05c1acb%2Fimage.png?alt=media" alt=""><figcaption><p>Test connection</p></figcaption></figure>

4. Expand **Databases** > **sampledata** > **Tables**.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-ba7c0cd0e9ece85004b117cf365c9e99c761be87%2Fimage.png?alt=media" alt=""><figcaption><p>Customer Data</p></figcaption></figure>

5. Open a SQL window and run a test query:

```sql
select * from CUSTOMERS
where COUNTRY = 'USA' and CITY = 'NYC';
```

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-a97f7b088cb293b4f1417b4ab7bd14a647cd60b5%2Fimage.png?alt=media" alt=""><figcaption><p>Sql query - NYC Customers</p></figcaption></figure>

{% hint style="success" %}
Checkpoint: you can browse tables and run a query against `CUSTOMERS`.
{% endhint %}
{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="2. JDBC Driver" %}
{% hint style="info" %}

#### **Download JDBC Driver**

PDI does not ship all JDBC drivers.\
If your database type is missing, add the driver JAR.
{% endhint %}

{% embed url="<https://docs.pentaho.com/install/jdbc-drivers-reference>" %}

1. Download the JDBC driver for your database.

{% embed url="<https://dbschema.com/databases.html>" %}

2. Copy the driver JAR into your PDI install:

{% tabs %}
{% tab title="Windows" %}
`C:\Pentaho\design-tools\data-integration\lib\`
{% endtab %}

{% tab title="macOS / Linux" %}
`~/Pentaho/design-tools/data-integration/lib/`
{% endtab %}
{% endtabs %}

3. Restart Spoon.

{% hint style="info" %}
If your install uses `lib/jdbc/`, place the JAR there instead.
{% endhint %}
{% endtab %}

{% tab title="3. Data Integration" %}
{% hint style="info" %}

#### **Pentaho Data Integration Connection**

Create the connection once.\
Reuse it in steps like **Table input**, **Table output**, and **Database lookup**.

In this lab, you connect to the Steel Wheels `sampledata` database (MySQL).
{% endhint %}

**Define a database connection (MySQL)**

1. Create a transformation.
2. In Spoon, select **File** > **New** > **Database connection**.

The **Database connection** dialog opens.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-a2b6dfa99dd14bf066d367c6662f9e2e4c3a2080%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>

3. Enter the following details:

{% hint style="danger" %}
If you use a MariaDB driver newer than `2.7.x`, you might see a **fetch size** error.\
If that happens, use the **MySQL** driver instead.
{% endhint %}

* **Connection name:** `MySQL: sampledata`
* **Connection type:** **MySQL**
* **Access:** **Native (JDBC)**
* **Host name:** `localhost` (or your Docker host IP)
* **Database name:** `sampledata`
* **Username:** `pentaho_admin`
* **Password:** `password`

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-3a8173423bdbe4c0f9aebe67a7c1fb7fc7c69fee%2Fimage.png?alt=media" alt=""><figcaption><p>MySQL - sampledata</p></figcaption></figure>

4. Select **Test**.

{% hint style="info" %}
Checkpoint: Spoon shows a success message.
{% endhint %}

{% tabs %}
{% tab title="1. Share Connection" %}
{% hint style="info" %}
**Share Database Connection**

Share the connection so other transformations can reuse it.
{% endhint %}

1. Click OK to save your entries and exit the Database Connection dialog box.
2. From within the View tab, right-click on the connection and select Share from the list that appears.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-b08bf0f402207eaf52189c184b1705303552ed6b%2Fimage.png?alt=media" alt=""><figcaption><p>Share database connection</p></figcaption></figure>

{% hint style="info" %}
Shared connections show up for other users and projects.\
Use **Explore** to confirm schemas and tables.
{% endhint %}
{% endtab %}

{% tab title="2. Explore Database" %}
{% hint style="info" %}
**Explore Database**

Use **Database Explorer** to browse schemas, preview rows, and run SQL.
{% endhint %}

1. Click on the View tab, expand Database Connections.
2. Right-click MySQL:sampledata and choose Explore from the menu options:

|                                   | Action                                                                |
| --------------------------------- | --------------------------------------------------------------------- |
| Preview the first 100 rows of ..  | Return the first 100 rows of the selected table.                      |
| Preview first .. rows of ..       | Enter the number of rows to preview                                   |
| Number of rows ..                 | Displays number of rows                                               |
| Generate DDL                      | Displays DDL statement that creates table.                            |
| Generate DDL for other connection | Select connection to display DDL. Syntax is based on database engine. |
| Open SQL for ..                   | Edit SELECT statement                                                 |
| Truncate table                    | Deletees all the rows from selected table                             |

3. In the Database Explorer window, expand Sampledata > Tables

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-48ad8d4f6c8378428a7a9687f6fb3fe7d486f6e9%2Fdb%20explorer.png?alt=media" alt="" width="375"><figcaption><p>Database Explorer - sampledata</p></figcaption></figure>

4. Right-click the `CUSTOMERS` table and choose **Preview first 100**.
5. Examine the customer data.
6. Select **View SQL**.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-6a37fd33b07cae5013febf4159c25ab593d580e8%2FSQL.png?alt=media" alt="" width="375"><figcaption><p>SQL</p></figcaption></figure>

7. Click Execute.

<figure><img src="https://3680356391-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZpCSy6Skj215f4oWypdc%2Fuploads%2Fgit-blob-0855e613cc3e4c79a4b9d4ff51d58a0c305ea892%2Fimage.png?alt=media" alt=""><figcaption><p>Execute SQL statement</p></figcaption></figure>

{% hint style="success" %}
Checkpoint: you can preview `CUSTOMERS` and execute SQL in Database Explorer.
{% endhint %}
{% endtab %}
{% endtabs %}
{% endtab %}
{% endtabs %}
