Connect AW Database
Connect to AW DW
Adventure Works
Adventure Works 2022 contains approximately 70 tables organized into multiple schemas representing different functional areas of the business, with around 20,000 customers, over 70,000 orders, and 500 products.
The database contains 486 columns that require classification, making it ideal for demonstrating data governance and classification processes:
Personal Data Identification and Classification: The database contains various types of sensitive data including personal information in tables like Person.Person and HumanResources.Employee, with data such as names, addresses, contact information, dates of birth, and even employee resumes that could contain multiple types of personal data.
Data Sensitivity Categorization: Using Pentaho Data Catalog (PDC), the Adventure Works database demonstrates how to perform automated data classification, which categorizes columns into sensitivity levels such as Confidential and Highly Confidential and assigns appropriate information types based on content.
Regulatory Reporting and Audit Trails: The comprehensive business structure of Adventure Works, spanning sales, human resources, and production data, provides an excellent framework for demonstrating how data catalogs support regulatory reporting requirements.
Risk Assessment and Data Governance: The database allows data governance teams to quantify data risk and develop processes for data masking in non-production environments, which is a critical compliance requirement for protecting sensitive data in development and testing scenarios.
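
To make the idea of sensitivity categorization concrete, here is a minimal sketch of name-based column classification. The patterns and the `classify` helper are hypothetical and purely illustrative; they are not the rules or the API that Data Catalog uses, and real classification also inspects column content rather than names alone.

```python
import re

# Hypothetical name patterns mapped to sensitivity labels.
# These rules are illustrative assumptions, not Data Catalog's own rules.
RULES = [
    (re.compile(r"nationalid|password|resume", re.IGNORECASE), "Highly Confidential"),
    (re.compile(r"birthdate|email|phone|address|name", re.IGNORECASE), "Confidential"),
]


def classify(qualified_column: str) -> str:
    """Return the label of the first rule matching the column name.

    Only the last dotted component (the column itself) is inspected,
    so schema and table names do not influence the result.
    """
    column = qualified_column.split(".")[-1]
    for pattern, label in RULES:
        if pattern.search(column):
            return label
    return "Unclassified"


# A few Adventure Works columns, written as schema.table.column.
columns = [
    "Person.Person.LastName",
    "HumanResources.Employee.BirthDate",
    "HumanResources.Employee.NationalIDNumber",
    "HumanResources.JobCandidate.Resume",
    "Production.Product.ListPrice",
]

for col in columns:
    print(f"{col}: {classify(col)}")
```

A name-only heuristic like this produces false positives (for example, a product `Name` column would be flagged), which is one reason content-based profiling is normally used alongside it.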

Log in to Data Catalog:
Username: [email protected]
Password: Welcome123!
Click Management in the left navigation menu.

In the Resources card, click Add Data Source.

Specify the following information for the connection to your data source.
If you are nearing or have exceeded the limit of data sources allowed by your license agreement, a message appears when you try to add a new data source.
Data Catalog encrypts your data source connection details, such as user name and password, before storing them.
Test Connection and Ingest Metadata Schema
After you have specified the detailed information according to your data source type, test the connection to the data source and add the data source.
Enter the following details to connect to the Adventure Works database.
Data Source Name: mssql:adventureworks2022
Data Source ID: Leave Blank to autogenerate ID
Description: AW DW: Person, HR, Purchasing, Sales, Production
Data Source Type: Microsoft SQL Server
Affinity: Default
Configuration Method: URI
Username: sa
Password: StrongPassword123
URI: jdbc:sqlserver://pdc.pentaho.lab:1433;databaseName=AdventureWorks2022;user=sa;password=StrongPassword123;encrypt=false
Driver: mssql-jdbc-12.10.1.jre11.jar*
Database Name: AdventureWorks2022
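
The URI value is the host, port, and a semicolon-separated list of key=value properties. The sketch below shows how the fields above combine into that string. The `build_sqlserver_uri` helper is hypothetical and written only for illustration; it does not escape special characters in values, so it assumes simple values like the ones in this lab.

```python
def build_sqlserver_uri(host: str, port: int, properties: dict) -> str:
    """Assemble a SQL Server JDBC URI from a host, a port, and properties.

    Properties are emitted in insertion order as key=value pairs
    separated by semicolons. No escaping is applied (assumption:
    values contain no semicolons or braces).
    """
    props = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"jdbc:sqlserver://{host}:{port};{props}"


# Values taken from the connection details in this walkthrough.
uri = build_sqlserver_uri(
    "pdc.pentaho.lab",
    1433,
    {
        "databaseName": "AdventureWorks2022",
        "user": "sa",
        "password": "StrongPassword123",
        "encrypt": "false",
    },
)
print(uri)
```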

* Ensure that you have uploaded the supported MSSQL driver, mssql-jdbc-12.10.1.jre11.jar, to PDC.
Select the JDBC driver:

Enter the URI:
jdbc:sqlserver://pdc.pentaho.lab:1433;databaseName=AdventureWorks2022;user=sa;password=StrongPassword123;encrypt=false
Click Test Connection to test your connection to the specified data source.
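
If Test Connection fails, one quick check outside Data Catalog is whether the SQL Server host and port are reachable at all from your machine. The following is a minimal sketch using only the Python standard library; the `can_reach` helper is a hypothetical name, and a successful TCP connection only shows that something is listening on the port, not that the credentials or database name are correct.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Any OSError (DNS failure, refused connection, timeout) is treated
    as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call for the lab host used in this walkthrough:
#   can_reach("pdc.pentaho.lab", 1433)
```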

Click Ingest Schema, select the following 5 schemas (Person, HumanResources, Purchasing, Sales, and Production), and then click Ingest Schemas.

(Optional) In the Physical Location field, specify the physical location details of the data source.
(Optional) Configure the following storage optimization options for the data source.
Available for Migration: Enables or disables the data source for storage optimization. When enabled, it includes the data source for data optimizer activities.
Available for Writing: Enables or disables writing capabilities for the data source and enables migration when turned on.
Available for Data Mastering: Enables or disables the data source for data mastering purposes.
(Optional) In the Cost per Terabyte field, specify the data source pricing details like currency, price per terabyte, and billing frequency.
(Optional) In the Total Capacity field, specify the total capacity of the data source in terabytes.
(Optional) Enter a Note for any additional information to share with others who might access this data source.
Click Create Data Source to establish your data source connection.