# Windows Pentaho Lab

{% hint style="info" %}

#### **Pentaho Lab**

Pentaho Data Integration is a client-based tool commonly installed and configured to run on Windows 11.

There are several licensing options, for these workshops we will be installing a Enterprise Edition. This will give you the opportunity to try out building a complete solution - automated data pipelines + analytics ..
{% endhint %}

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FaO9M8NuR6rzHnWiAvgEi%2Fimage.png?alt=media&#x26;token=135c03ed-25dd-465c-9b25-6c2abe6ed093" alt=""><figcaption><p>Pentaho Tiers</p></figcaption></figure>

{% hint style="danger" %}
The following steps are intended for setting up a Pentaho Lab environment and need to be completed in order to complete the Workshops.

Ensure you have downloaded the Workshop--Installation&#x20;

```
cd \
git clone https://github.com/jporeilly/Workshop--Installation
```

{% endhint %}

{% tabs %}
{% tab title="Docker Desktop" %}
{% hint style="info" %}

#### Docker Desktop

Docker Desktop is an application for Windows, macOS, and Linux that provides an easy-to-use interface for developing and running containerized applications. It bundles the Docker Engine, Docker CLI, Docker Compose, Kubernetes, and other essential tools into a single package with a graphical user interface.&#x20;

Docker Desktop simplifies container management by handling the underlying virtualization automatically, allowing developers to build, test, and deploy applications in isolated, portable containers without worrying about environment configuration differences. It's particularly popular among developers who want to ensure their applications run consistently across different environments, from local development machines to production servers.
{% endhint %}

{% embed url="<https://www.docker.com/products/docker-desktop/>" %}

1. Download the Docker Desktop installer.

{% embed url="<https://desktop.docker.com/win/main/amd64/Docker%20Desktop%20Installer.exe?utm_campaign=dd-smartbutton&utm_location=module&utm_medium=webreferral&utm_source=docker>" %}
Link to download Docker Desktop
{% endembed %}

2. Navigate to: Downloads
3. Double-click: `Docker Desktop Installer.exe` to run the installer.

By default, Docker Desktop is installed at `C:\Program Files\Docker\Docker`.

{% hint style="danger" %}
When prompted, ensure the **Use WSL 2 instead of Hyper-V** option on the Configuration page is selected.

On systems that support only one backend, Docker Desktop automatically selects the available option.
{% endhint %}

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FKBG27SUuTlAYI8OCLtDk%2Fimage.png?alt=media&#x26;token=552b06bd-5966-4a9b-95ee-4b826a89781b" alt=""><figcaption><p>Use WSL 2</p></figcaption></figure>

3. Close to complete the installation process.

***

{% hint style="info" %}

#### **Docker User**

If your administrator account is different to your user account, you must add the user to the docker-users group:
{% endhint %}

1. Run Computer Management as an administrator.
2. Navigate to **Local Users and Groups** > **Groups** > **docker-users**.
3. Right-click to add the user to the group.

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FGdGKjQIayhbbRzAmRBP8%2Fimage.png?alt=media&#x26;token=0e3f6123-f8d4-41be-b432-3621b0a19ca7" alt=""><figcaption><p>Add User to docker group</p></figcaption></figure>

4. Sign out and sign back in for the changes to take effect.
   {% endtab %}

{% tab title="MySQL Container" %}
{% hint style="info" %}

#### **Docker Compose - MySQL**

The pentaho\_admin user only has READ permission for the Steel Wheels - sampledata database. The administrator account has been removed.

As you'll be running through CRUID database operations we need to deploy a sampledata database - Docker container, granting all privileges to an admin user.
{% endhint %}

{% file src="<https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2F6hPSoZkwDCVQoPvmTu20%2Fcopy-mysql.ps1?alt=media&token=992633e7-24c1-4b23-84f0-c14770c1164a>" %}

{% file src="<https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FXnkS8JtDisS2P2nqJLvC%2Frun-docker-mysql.ps1?alt=media&token=f12b269e-427e-48bf-8b36-c55253cc81f7>" %}

{% file src="<https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FlbsqipaAdqdhpvHj7FTv%2Fdocker-compose.yml?alt=media&token=fde4bc13-9715-4257-8feb-b5f70639dae4>" %}

1. Run the following script to create a MySQL folder and copy the required files.

```ps1
cd \
cd C:\Workshop--Installation\MySQL
.\copy-mysql.ps1
```

2. Check the Directory has been created and the files copied over.
3. Execute the docker-compose script to create the container.

```powershell
cd \
cd C:\MySQL
.\run-docker-mysql.ps1
```

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FHnql132ameJUQL5OGA74%2Fimage.png?alt=media&#x26;token=a242c86f-d75e-47c4-a592-bf9767243bc6" alt=""><figcaption><p>Deploy MySQL</p></figcaption></figure>

4. Check the container is up and running in Docker Desktop.

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FbzmptT3CVaRADQKBkjMl%2Fimage.png?alt=media&#x26;token=31182fee-8ad9-40c2-9e3e-fa1c5d9324d9" alt=""><figcaption><p>mysql docker containers</p></figcaption></figure>
{% endtab %}

{% tab title="sampledata" %}
{% hint style="info" %}

#### **Sampledata**

Next on the list is to create the sampledata database.
{% endhint %}

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FfHrm1HbmPxxV8bPSDUK9%2Fimage%20(62).png?alt=media&#x26;token=4d4eb534-9ee4-49eb-80d1-17c363ada1f1" alt=""><figcaption><p>Relationship Diagram</p></figcaption></figure>

{% tabs %}
{% tab title="1. sampledata\_schema.sql" %}
{% hint style="info" %}

#### **sampledata\_schema.sql**

This script creates a comprehensive relational database structure for a sample business application. It's designed to model a sales and order management system for a company that sells various products.
{% endhint %}

{% hint style="info" %}

#### **Database Setup**

* Creates a database named

  ```
  sampledata
  ```

  with UTF-8 character set
* Sets up users with appropriate permissions
* Configures SQL mode for better data integrity
  {% endhint %}

{% hint style="info" %}

#### **Tables**

**OFFICES**: Stores company office locations with address details

**EMPLOYEES**: Contains employee information with relationships to offices and reporting structure

**CUSTOMERS**: Stores customer information including contact details and credit limits

**PRODUCTS**: Contains product catalog with inventory and pricing information

**ORDERS**: Tracks customer orders with status and dates

**ORDERDETAILS**: Contains line items for each order with quantity and price

**PAYMENTS**: Records customer payments with amounts and dates

**ORDERFACT**: A fact table for order analytics

**CUSTOMER\_W\_TER**: Extended customer information with territory

**DIM\_TIME**: Time dimension table for reporting

**DEPARTMENT\_MANAGERS**: Stores department manager information

**QUADRANT\_ACTUALS**: Contains budget vs. actual financial data with a generated VARIANCE column

**TRIAL\_BALANCE**: Financial accounting data
{% endhint %}

{% hint style="info" %}

#### **Views**

**customer\_order\_summary**: Summarizes orders and spending by customer

**product\_performance**: Analyzes product sales metrics including revenue and profit

**employee\_sales\_performance**: Tracks sales performance by employee

**monthly\_sales\_trend**: Shows sales trends over time by month

**product\_inventory\_status**: Categorizes products by inventory levels

**customer\_payment\_history**: Summarizes customer payment activity and balances
{% endhint %}

{% hint style="info" %}

#### **Stored Procedures**

**GetCustomerOrders**: Retrieves orders for a specific customer

**UpdateProductStock**: Updates product inventory levels

**GetProductSalesByQuarter**: Analyzes quarterly product sales

**GetTopCustomersByRegion**: Identifies top customers by region

**GetInventoryValueByProductLine**: Calculates inventory metrics by product line

**Triggers**

**before\_order\_insert**: Validates date constraints on orders

**before\_payment\_insert**: Ensures payment amounts are positive
{% endhint %}

{% hint style="info" %}

#### **Events**

* **daily\_maintenance**: Scheduled task for database maintenance
  {% endhint %}

1. Execute the following command to create the schema.

```powershell
cd \
cd MySQL
Get-Content sampledata_schema.sql | docker exec -i mysql-database-1 mysql -u root -p"password" sampledata
```

{% hint style="info" %}
This command is importing SQL schema data into a MySQL database running in a Docker container. Here's a breakdown:

This command reads the SQL file:

```powershell
Get-Content sampledata_schema.sql
```

Pipes (forwards) the file contents to the next command:

```
|
```

This executes a command in a running Docker container:

```
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata
```

{% endhint %}

2. You can check the sampledata database & tables with the following commands.

Show databases:

```powershell
docker exec -i mysql-database-1 mysql -u root -p"password" -e "SHOW DATABASES;"
```

Show tables:

```powershell
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata -e "SHOW TABLES;"
```

Show table columns:

```powershell
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata -e "DESCRIBE table_name;"
```

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FRdQGb6Vr3qtw2ot3KsM5%2Fimage.png?alt=media&#x26;token=2909bf5d-c81f-4b4f-a447-93fc903dd48b" alt=""><figcaption><p>Check database &#x26; tables</p></figcaption></figure>
{% endtab %}

{% tab title="2. sampledata\_data.sql" %}
{% hint style="info" %}

#### **sampledata\_data.sql**

This script populates the database with sample data to demonstrate the functionality of the schema.
{% endhint %}

{% hint style="info" %}

#### **Reference Data**

* Office locations across different regions
* Employee hierarchy with job titles
* Product catalog organized by product lines
  {% endhint %}

{% hint style="info" %}

#### **Transactional Data**

* Customer records with contact information
* Order history with dates and status
* Order details with quantities and prices
* Payment records
  {% endhint %}

{% hint style="info" %}

#### **Financial Data**

* Budget vs. actual figures in QUADRANT\_ACTUALS
* Trial balance accounting data
  {% endhint %}

{% hint style="info" %}

#### **Data Characteristics**

* Realistic business scenarios with varied order statuses
* Comprehensive product catalog with descriptions and pricing
* Hierarchical employee structure with reporting relationships
* Time-based data spanning multiple years for trend analysis
* Financial data suitable for budgeting and variance analysis
  {% endhint %}

{% hint style="info" %}

#### **Notable Features**

* Data follows referential integrity constraints
* Proper handling of NULL values where appropriate
* Realistic pricing and quantity values
* Generated columns (like VARIANCE) are excluded from direct inserts
* Orders are sequenced to satisfy foreign key constraints
  {% endhint %}

1. Execute the following command to load the data into the sampledata tables.

```powershell
cd \
cd MySQL
Get-Content sampledata_data.sql | docker exec -i mysql-database-1 mysql -u root -p"password" sampledata
```

2. You can use the following commands to check that the data has loaded.

To count the number of rows in a specific table:

```docker
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata -e "SELECT COUNT(*) FROM table_name;"
```

To view the first few rows from a table:

```docker
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata -e "SELECT * FROM table_name LIMIT 10;"
```

To check counts for all tables:

```docker
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata -e "SELECT TABLE_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sampledata';"
```

To get a summary of tables and their statuses:

```docker
docker exec -i mysql-database-1 mysql -u root -p"password" sampledata -e "SHOW TABLE STATUS;"
```

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2Fe0qSyplq4BSWdMja0KmU%2Fimage.png?alt=media&#x26;token=6e535223-9a4d-4916-b8f5-a3b41128907b" alt=""><figcaption><p>sampledata table info</p></figcaption></figure>
{% endtab %}
{% endtabs %}
{% endtab %}

{% tab title="DBeaver" %}
{% hint style="info" %}

#### DBeaver

Your going to need a database management tool. DBeaver Community is a free, open-source database management tool for personal projects.
{% endhint %}

1. Go to the official [DBeaver download page](https://dbeaver.io/download/)

{% embed url="<https://dbeaver.io/files/dbeaver-ce-latest-x86_64-setup.exe>" %}

2. Navigate to Downloads & double-click on: `dbeaver-ce-25.2.5-x86_64-setup.exe`
3. Follow the installation instructions.
4. Follow the on-screen instructions, clicking "Next" and agreeing to the license agreement to proceed.
5. Choose your desired installation options (e.g., for all users or the current user).

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FyqemtZhN9dE6mAXXMHOT%2Fimage.png?alt=media&#x26;token=3d12607c-5d6c-46b8-8f8e-7f5642756b20" alt=""><figcaption></figcaption></figure>

6. Complete the installation process.&#x20;

***

{% hint style="info" %}

#### **MySQL Database**

If you have completed the previous 3 requirements, then you should have a MySQL Docker container, exposed on port:3306 with sampledata databse.
{% endhint %}

1. Launch DBeaver and Select: MySQL.

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FdO4Wo4hKJzPuJW6OCI7C%2Fimage.png?alt=media&#x26;token=72d92d39-22dd-4603-a480-94495aa6a808" alt=""><figcaption><p>MySQL</p></figcaption></figure>

2. Configure the connection with the following properties:

Username: root or pentaho\_user

Password: password

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2Fx4hj27wSVxKqareREN8p%2Fimage.png?alt=media&#x26;token=1268171f-8274-4cdb-87af-f0f44c3a5392" alt=""><figcaption><p>Configure &#x26; Test MySQL connection - sampledata</p></figcaption></figure>

{% hint style="warning" %}
You may need to download the supported version of the database driver.

Also enable: allowPublicKeyRetrieval
{% endhint %}

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FMcdTFaqU0LsctCgZfC7S%2Fimage%20(56).png?alt=media&#x26;token=2a981c76-44d2-416d-a6f1-f08152c14740" alt=""><figcaption><p>Enable: allowPublicKeyRetrieval</p></figcaption></figure>

3. Test the connection.

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FMMdsCwDdInxb5tkHhfjS%2Fimage.png?alt=media&#x26;token=420b31e7-86a7-46f6-807f-255c607e3526" alt=""><figcaption><p>Test connection</p></figcaption></figure>

4. Expand: databases > sampledata > Tables

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FE7sRaOXCFgJLUlfRmbyv%2Fimage.png?alt=media&#x26;token=ecf46ab4-f4f5-4f8c-bc48-15b85389da2a" alt=""><figcaption><p>Customer Data</p></figcaption></figure>

5. Open a SQL window and run a test query.

```sql
select * from CUSTOMERS
where COUNTRY = 'USA' and CITY = 'NYC';
```

<figure><img src="https://4179059538-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5WjeDOi3MsT9tY3oU6mQ%2Fuploads%2FEiNfAWHg3ZxaJX6INeIC%2Fimage.png?alt=media&#x26;token=bec6eaec-6e08-488e-ab04-85001312c550" alt=""><figcaption><p>SQL query - NYC Customers</p></figcaption></figure>
{% endtab %}
{% endtabs %}
