# Connect AW Database

{% hint style="success" %}

#### Adventure Works&#x20;

In this hands-on workshop, you'll learn how to establish a connection between Pentaho Data Catalog and the Adventure Works 2022 database. We'll walk through configuring a Microsoft SQL Server data source connection, testing the connection, and ingesting metadata schemas to make Adventure Works data available for governance, cataloging, and analysis within PDC.

By the end of this workshop, you will be able to:

* Configure Microsoft SQL Server data source connections in PDC
* Set up proper JDBC connection strings with authentication parameters
* Test database connectivity before committing configuration changes
* Ingest specific database schemas for targeted metadata management
* Configure optional data source properties for storage optimization and cost tracking
* Understand the relationship between data source connections and metadata ingestion
* Establish the foundation for Adventure Works data governance workflows

**Technical Requirements:**

* MSSQL JDBC Driver (mssql-jdbc-12.10.1.jre11.jar) must be uploaded to PDC
* Database server: pdc.pentaho.lab:1433
* Authentication: SQL Server authentication with sa account
  {% endhint %}

<figure><img src="/files/hVaLawhSiL7zFTcntKop" alt=""><figcaption><p>Adventure Works</p></figcaption></figure>

{% embed url="<https://www.dbdiagrams.com/online-diagrams/adventureworks/index.html?page=Diagrams>" %}
Link to Adventure Works ER
{% endembed %}

***

1. Log into Data Catalog:

{% embed url="<https://pdc.pentaho.lab>" %}

Username: <hugo.reilly@adventureworks.com>

&#x20;                    <james.lock@adventureworks.com>

Password: Welcome123!

2. Click: Management in the left navigation menu.

<figure><img src="/files/LYshW9r1LHdmdWt6a2fA" alt=""><figcaption><p>Management - Resources</p></figcaption></figure>

3. In the Resources tile, click: Add Data Source.

<figure><img src="/files/y32p0Jb6E0JR3abKmZ3b" alt=""><figcaption><p>Add a Data Source</p></figcaption></figure>

4. Specify the following information for the connection to your data source.

{% hint style="warning" %}
If you are nearing or have exceeded the limit of data sources allowed by your license agreement, a message appears when you try to add a new data source.

Data Catalog encrypts your data source connection details, such as user name and password, before storing them.
{% endhint %}

{% tabs %}
{% tab title="1. Ingest Metadata" %}
{% hint style="info" %}

#### Test Connection and Ingest Metadata Schema ..

After you have specified the detailed information according to your data source type, test the connection to the data source and add the data source.
{% endhint %}

1. Enter the following details to connect to: Adventure Works database.

<table><thead><tr><th width="226">Field</th><th>Setting</th></tr></thead><tbody><tr><td>Data Source Name</td><td>mssql:adventureworks2022</td></tr><tr><td>Data Source ID</td><td>Leave Blank to autogenerate ID</td></tr><tr><td>Description</td><td>AW DW: Person, HR, Purchasing, Sales, Production</td></tr><tr><td>Data Source Type</td><td>Microsoft SQL Server</td></tr><tr><td>Affinity</td><td>Default</td></tr><tr><td>Configuration Method</td><td><strong>URI</strong></td></tr><tr><td>   Username</td><td>sa</td></tr><tr><td>   Password</td><td>StrongPassword123</td></tr><tr><td>   URI</td><td>jdbc:sqlserver://pdc.pentaho.lab:1433;databaseName=AdventureWorks2022;user=sa;password=StrongPassword123;encrypt=false</td></tr><tr><td>Driver</td><td>mssql-jdbc-13.2.0.jre11.jar*</td></tr><tr><td>Database Name</td><td>AdventureWorks2022</td></tr></tbody></table>

{% hint style="info" %}

#### Manage Drivers

Select an existing driver or upload a new driver to ensure that the communication between the application and the database is efficient, secure, and follows the required standards.
{% endhint %}

<figure><img src="/files/aVxZIKNqPa049NAbQ2qM" alt=""><figcaption><p>Enter connection details</p></figcaption></figure>

{% hint style="danger" %}
Ensure you have uploaded the supported MSSQL Driver - mssql-jdbc-13.2.0.jre11.jar to PDC
{% endhint %}

{% tabs %}
{% tab title="Download JDBC Driver" %}

1. Download the MSSQL JDBC driver.

{% embed url="<https://learn.microsoft.com/en-us/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server?view=sql-server-ver17>" %}

2. Untar the file.

```bash
cd ~/Downloads
unzip ~/Downloads/sqljdbc_13.2.0.0_enu.zip -d ~/Download
```

{% hint style="info" %}
The drivers are located in: ./sqljdbc\_13.2/enu/jar.
{% endhint %}
{% endtab %}

{% tab title="Upload Driver" %}

1. Click: Manage Drivers.
2. Click: Add New

<figure><img src="/files/CDI4fns6GXv1DasvPb9U" alt=""><figcaption><p>Add New</p></figcaption></figure>

3. Drag & Drop the driver.

<figure><img src="/files/2g5b0iOo6NqcLHjQ0iez" alt=""><figcaption><p>Add Driver</p></figcaption></figure>

4. Click: Apply.

<figure><img src="/files/KQEYESKx45GJdBNWyfFX" alt=""><figcaption><p>Manage Drivers</p></figcaption></figure>

4. Click: Close
   {% endtab %}
   {% endtabs %}

5. Click: Manage Drivers.

<figure><img src="/files/sdqSe7UhJtg8AgnRLueu" alt=""><figcaption><p>Select JDBC Driver</p></figcaption></figure>

3. Enter the URI:

```
jdbc:sqlserver://pdc.pentaho.lab:1433;databaseName=AdventureWorks2022;user=sa;password=StrongPassword123;encrypt=false
```

4. Click Test Connection to test your connection to the specified data source.

<figure><img src="/files/Rh81EFsoOTARIlMKw7lj" alt=""><figcaption><p>Test Connection</p></figcaption></figure>

5. Click Ingest Schema, select the following 5 schemas, and then click Ingest Schemas.

<figure><img src="/files/nYA68oiRI2ZKbGygCxS1" alt=""><figcaption><p>Select Schemas</p></figcaption></figure>

{% hint style="warning" %}
While you have the option to select all schemas, it is advisable to exclude schemas that are not relevant to your requirements.

We're going to cover this in more detail in Data Optimizer.
{% endhint %}

6. Click: Save Data Source.

***

{% hint style="info" %}

#### Recommendation

For Pentaho Data Catalog, the best practice is to:

1. Use the standard connection URI without schema restrictions
2. Control schema scope through the **selective "Ingest Schema"** process during data source setup
3. Use virtual folders and role-based access control for further data governance

This approach aligns with Data Catalog's design philosophy of comprehensive discovery followed by organized access control rather than connection-level restrictions.
{% endhint %}

6. (Optional) In the **Physical Location** field, specify the physical location details of the data source.
7. (Optional) Configure the following storage optimization options for the data source.

<table><thead><tr><th width="195">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong>Available for Migration</strong></td><td>Enables or disables the data source for storage optimization. When enabled, it includes the data source for data optimizer activities.</td></tr><tr><td><strong>Available for Writing</strong></td><td>Enables or disables writing capabilities for the data source and enables migration when turned on.</td></tr><tr><td><strong>Available for Data Mastering</strong></td><td>Enables or disables the data source for data mastering purposes.</td></tr></tbody></table>

{% hint style="info" %}
To use storage optimization options, you need a Pentaho Data Optimizer license.
{% endhint %}

8. (Optional) In the **Cost per Terabyte** field, specify the data source pricing details like currency, price per terabyte, and billing frequency.
9. (Optional) In the **Total Capacity** field, specify the total capacity of the data source in terabytes.
10. (Optional) Enter a **Note** for any additional information to share with others who might access this data source.
11. Click **Create Data Source** to establish your data source connection.

{% hint style="info" %}
So we've now ingested the AdventureWorks database schemas.

Before we start getting into the weeds, let's revisit our users and apply their Permissions & Scope.
{% endhint %}
{% endtab %}

{% tab title="2. Azure Data Studio" %}
{% hint style="info" %}

#### Connect to AdventureWorks2022 database

{% endhint %}

1. Start Azure Data Studio.

```bash
azuredatastudio
```

2. Select: Connections (first icon in left menu).

<figure><img src="/files/WD9M6gLxtkHHlwmoiS5F" alt=""><figcaption></figcaption></figure>

3. Select SQL Login
4. Enter the following details:

<table><thead><tr><th width="247">Field</th><th>Setting</th></tr></thead><tbody><tr><td>Connection type</td><td>Microsoft SQL Server</td></tr><tr><td>Input type</td><td>Parameters</td></tr><tr><td>*Server</td><td>localhost,1433</td></tr><tr><td>Authentication type</td><td>SQL Login</td></tr><tr><td>User name</td><td>sa</td></tr><tr><td>Password</td><td>StrongPassword123</td></tr><tr><td>Database</td><td>AdventureWorks2022</td></tr><tr><td>Encrypt</td><td>Mandatory (True)</td></tr><tr><td>Trust server certificate</td><td>True</td></tr><tr><td>Server group</td><td>&#x3C;Default></td></tr><tr><td>Name (optional)</td><td>AdventureWorks2022</td></tr></tbody></table>

{% hint style="warning" %}
\*Enter server IP address or FQDN.

\*\*PDC does not ship with any database drivers.

To upload JDBC drivers follow the instructions in tab: [Upload JDBC drivers](#download-jdbc-driver)
{% endhint %}

5. Click: Connect.

<figure><img src="/files/AYuNu679OcEfn6kJTAd1" alt=""><figcaption><p>Person Schema - Person Address</p></figcaption></figure>
{% endtab %}
{% endtabs %}

***


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://academy.pentaho.com/pentaho-data-catalog-en/data-catalog/data-discovery/connect-aw-database.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
