Ubuntu Pentaho Lab
Set up Pentaho Server + Plugins on Ubuntu
Pentaho Lab
Pentaho Data Integration is a client-based tool commonly installed and configured to run on Windows 11.
There are several licensing options. For these workshops we will install the Enterprise Edition, which gives you the opportunity to build a complete solution: automated data pipelines plus analytics.

The following steps set up a Pentaho Lab environment and must be completed before you start the Workshops.
Ensure you have downloaded the Workshop--Installation:
To install git:
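A minimal sketch of the git installation, assuming the stock Ubuntu repositories are enabled:

```shell
# Refresh the package index, then install git from the Ubuntu repositories
sudo apt update
sudo apt install -y git

# Confirm that git is available on the PATH
git --version
```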
Prerequisites
Ubuntu 24.04 LTS system (physical or virtual machine)
User account with sudo privileges
Internet connection
Basic familiarity with Linux command line
Docker
Docker is a platform that enables developers to package applications and their dependencies into lightweight, portable containers. Containers ensure that applications run consistently across different computing environments, from development laptops to production servers. This workshop will guide you through the complete process of installing Docker Engine on Ubuntu 24.04 LTS (Noble Numbat).
Before installing Docker, update your existing package list.
Install packages that allow apt to use repositories over HTTPS.
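The two preparation steps above can be sketched as follows. The package names are the ones Docker's Ubuntu instructions commonly list and are an assumption here:

```shell
# Refresh the package index so apt sees the latest package versions
sudo apt-get update

# ca-certificates and curl are needed to fetch packages and keys over HTTPS
sudo apt-get install -y ca-certificates curl
```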
Create a directory for keyrings and add Docker's GPG key.
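One way to do this, assuming the key is stored as /etc/apt/keyrings/docker.asc (the file name is a convention, not a requirement):

```shell
# Create the keyring directory with standard permissions
sudo install -m 0755 -d /etc/apt/keyrings

# Download Docker's GPG key and make it readable by apt
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
```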
Add the Docker repository to your apt sources.
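A sketch of the repository entry, assuming the GPG key was saved as /etc/apt/keyrings/docker.asc. The variable names are illustrative only:

```shell
# Build the repository entry from this machine's architecture and Ubuntu codename
ARCH="$(dpkg --print-architecture)"
CODENAME="$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}")"
REPO_LINE="deb [arch=${ARCH} signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu ${CODENAME} stable"

# Write the entry to a dedicated apt sources file
echo "$REPO_LINE" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
```

On Ubuntu 24.04 the codename resolves to noble, so apt will look for packages under the noble stable channel.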
Now that the Docker repository is added, update the package index.
Install Docker Engine, containerd, and Docker Compose.
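The last two steps might look like this. The package list follows Docker's standard apt packages; the buildx plugin is an optional extra that the text above does not mention:

```shell
# Re-read the package index so apt picks up the Docker repository
sudo apt-get update

# Install the engine, its CLI, the container runtime and the Compose plugin
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```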
Check that Docker is installed correctly by checking the version.
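A quick check, assuming the Compose plugin was installed alongside the engine:

```shell
# Print the Docker client version
docker --version

# Print the Compose plugin version
docker compose version
```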
You should see output similar to (Nov 2025):
Verify that Docker Engine is running.
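On a systemd-based Ubuntu install, the check can be sketched as:

```shell
# Show the full service status (opens a pager in an interactive terminal)
sudo systemctl status docker

# One-word alternative that is easier to use in scripts
systemctl is-active docker
```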
The service should show as "active (running)".
Press q to quit the status view.
Test your Docker installation by running the hello-world container.
This command downloads a test image and runs it in a container. If successful, you'll see a message confirming that Docker is working correctly.
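The test run described above is typically a single command. sudo is used here because the user has not yet been added to the docker group:

```shell
# Pull the hello-world image (if not cached) and run it once
sudo docker run hello-world
```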
Add your user to the docker group.
Apply the new group membership (or log out and back in).
Verify you can run Docker without sudo.
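The three steps above can be sketched as follows, assuming the docker group was created by the Docker packages:

```shell
# Add the current user to the docker group
sudo usermod -aG docker "$USER"

# Group changes only apply to new sessions. Either log out and back in,
# or open a subshell with the new group by running: newgrp docker

# In the new session this should succeed without sudo
docker run hello-world
```

newgrp is left as a comment because it starts a new interactive shell, which would interrupt a pasted script.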
Ensure Docker starts automatically when the system boots.
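On systemd this usually means enabling both services. Including containerd here is an assumption based on the standard Docker packages:

```shell
# Start Docker and containerd automatically at boot
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
```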
Check Docker version:
View Docker system information:
List running containers:
List all containers (including stopped ones):
List downloaded images:
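The five checks above map onto these standard Docker CLI commands:

```shell
docker --version   # Docker version
docker info        # system-wide information about the daemon
docker ps          # running containers only
docker ps -a       # all containers, including stopped ones
docker images      # images downloaded to this machine
```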
Common Commands
Here are essential Docker commands you'll use regularly:
docker pull <image> - Download an image from Docker Hub
docker images - List all local images
docker run <image> - Create and start a container from an image
docker ps - List running containers
docker ps -a - List all containers
docker stop <container> - Stop a running container
docker rm <container> - Remove a stopped container
docker rmi <image> - Remove an image
docker logs <container> - View container logs
docker exec -it <container> bash - Access a running container's shell
Docker Compose - MySQL
The pentaho_admin user has only READ permission on the Steel Wheels sampledata database, and the administrator account has been removed.
Because you will be running CRUD database operations, we need to deploy a sampledata database in a Docker container and grant all privileges to an admin user.
Run the following script to create a MySQL folder and copy the required files.
Check that the directory has been created and the files have been copied over.
Execute the docker-compose script to create the container.

Check the container is up and running in Docker.
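A sketch of both steps, assuming the Compose plugin (docker compose) installed earlier and that the commands are run from the folder holding the workshop's compose file:

```shell
# Create and start the services defined in the compose file, in the background
docker compose up -d

# Show the state of the compose services, then all running containers
docker compose ps
docker ps
```

If the workshop uses the older standalone binary, the equivalent command is docker-compose up -d.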


sampledata_schema.sql
This script creates a comprehensive relational database structure for a sample business application. It's designed to model a sales and order management system for a company that sells various products.
Database Setup
Creates a database named sampledata with a UTF-8 character set
Sets up users with appropriate permissions
Configures SQL mode for better data integrity
Tables
OFFICES: Stores company office locations with address details
EMPLOYEES: Contains employee information with relationships to offices and reporting structure
CUSTOMERS: Stores customer information including contact details and credit limits
PRODUCTS: Contains product catalog with inventory and pricing information
ORDERS: Tracks customer orders with status and dates
ORDERDETAILS: Contains line items for each order with quantity and price
PAYMENTS: Records customer payments with amounts and dates
ORDERFACT: A fact table for order analytics
CUSTOMER_W_TER: Extended customer information with territory
DIM_TIME: Time dimension table for reporting
DEPARTMENT_MANAGERS: Stores department manager information
QUADRANT_ACTUALS: Contains budget vs. actual financial data with a generated VARIANCE column
TRIAL_BALANCE: Financial accounting data
Views
customer_order_summary: Summarizes orders and spending by customer
product_performance: Analyzes product sales metrics including revenue and profit
employee_sales_performance: Tracks sales performance by employee
monthly_sales_trend: Shows sales trends over time by month
product_inventory_status: Categorizes products by inventory levels
customer_payment_history: Summarizes customer payment activity and balances
Stored Procedures
GetCustomerOrders: Retrieves orders for a specific customer
UpdateProductStock: Updates product inventory levels
GetProductSalesByQuarter: Analyzes quarterly product sales
GetTopCustomersByRegion: Identifies top customers by region
GetInventoryValueByProductLine: Calculates inventory metrics by product line
Triggers
before_order_insert: Validates date constraints on orders
before_payment_insert: Ensures payment amounts are positive
Execute the following command to create the schema.
This command imports the SQL schema into a MySQL database running in a Docker container. Here's a breakdown:
This command reads the SQL file:
Pipes (forwards) the file contents to the next command:
This executes a command in a running Docker container:
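Put together, the three parts described above form a pipeline like the one below. The container name is hypothetical, and the root password is assumed to be the one used later for the DBeaver connection:

```shell
# Hypothetical container name: replace it with the name shown by `docker ps`
MYSQL_CONTAINER="mysql"

# cat reads the SQL file, the pipe forwards its contents, and
# docker exec -i passes them on standard input to the mysql client
# running inside the container
cat sampledata_schema.sql | docker exec -i "$MYSQL_CONTAINER" mysql -uroot -ppassword
```

The -i flag keeps standard input open, which is what allows the piped file to reach the mysql client.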
You can check the sampledata database & tables with the following commands.
Show databases:
Show tables:
Show table columns:
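These checks can be run from the host through docker exec. The container name is hypothetical, and CUSTOMERS is used only as an example table from the list above:

```shell
# Hypothetical container name: replace it with the name shown by `docker ps`
MYSQL_CONTAINER="mysql"

# Show databases
docker exec -i "$MYSQL_CONTAINER" mysql -uroot -ppassword -e "SHOW DATABASES;"

# Show tables in the sampledata database
docker exec -i "$MYSQL_CONTAINER" mysql -uroot -ppassword sampledata -e "SHOW TABLES;"

# Show the columns of one table
docker exec -i "$MYSQL_CONTAINER" mysql -uroot -ppassword sampledata -e "SHOW COLUMNS FROM CUSTOMERS;"
```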
sampledata_data.sql
This script populates the database with sample data to demonstrate the functionality of the schema.
Reference Data
Office locations across different regions
Employee hierarchy with job titles
Product catalog organized by product lines
Transactional Data
Customer records with contact information
Order history with dates and status
Order details with quantities and prices
Payment records
Data Characteristics
Realistic business scenarios with varied order statuses
Comprehensive product catalog with descriptions and pricing
Hierarchical employee structure with reporting relationships
Time-based data spanning multiple years for trend analysis
Financial data suitable for budgeting and variance analysis
Notable Features
Data follows referential integrity constraints
Proper handling of NULL values where appropriate
Realistic pricing and quantity values
Generated columns (like VARIANCE) are excluded from direct inserts
Orders are sequenced to satisfy foreign key constraints
Execute the following command to load the data into the sampledata tables.
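The load likely follows the same pattern as the schema import. In this sketch the container name is hypothetical, and the sampledata database is selected explicitly in case the script does not do so itself:

```shell
# Hypothetical container name: replace it with the name shown by `docker ps`
MYSQL_CONTAINER="mysql"

# Stream the data script into the mysql client inside the container
cat sampledata_data.sql | docker exec -i "$MYSQL_CONTAINER" mysql -uroot -ppassword sampledata
```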
You can use the following commands to check that the data has loaded.
To count the number of rows in a specific table:
To view the first few rows from a table:
To check counts for all tables:
To get a summary of tables and their statuses:
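A sketch of the four checks, using a small hypothetical helper (run_sql) and a hypothetical container name. CUSTOMERS stands in for any table:

```shell
# Hypothetical container name: replace it with the name shown by `docker ps`
MYSQL_CONTAINER="mysql"

# Helper (illustrative only): run one SQL statement against sampledata as root
run_sql() {
  docker exec -i "$MYSQL_CONTAINER" mysql -uroot -ppassword sampledata -e "$1"
}

# Count the rows in a specific table
run_sql "SELECT COUNT(*) FROM CUSTOMERS;"

# View the first few rows of a table
run_sql "SELECT * FROM CUSTOMERS LIMIT 5;"

# Row counts for all tables (estimates for InnoDB tables)
run_sql "SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = 'sampledata';"

# Summary of tables and their status
run_sql "SHOW TABLE STATUS;"
```

The information_schema counts are approximate for InnoDB, so use SELECT COUNT(*) when an exact figure matters.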

DBeaver
You're going to need a database management tool. DBeaver Community is a free, open-source database management tool for personal projects.
The simplest option is to install it from the Snap Store.
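From a terminal, the Snap Store install is a single command, assuming the Community edition's snap name dbeaver-ce:

```shell
# Install DBeaver Community from the Snap Store
sudo snap install dbeaver-ce
```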
Or
Go to the official DBeaver download page
Or
To install the downloaded DEB file:
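A sketch of the DEB install. The file name and download folder are assumptions, so substitute the name of the file you actually downloaded:

```shell
# Assumed file name and location: adjust to match your download
DEB_FILE="$HOME/Downloads/dbeaver-ce_latest_amd64.deb"

# Installing through apt (rather than dpkg -i) also resolves dependencies
sudo apt install -y "$DEB_FILE"
```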
Pin DBeaver to Dash - bottom toolbar.
MySQL Database
If you have completed the previous three requirements, you should have a MySQL Docker container exposed on port 3306 with the sampledata database.
Launch DBeaver and Select: MySQL.

Configure the connection with the following properties:
Username: root or pentaho_user
Password: password

You may need to download the supported version of the database driver.
Also enable: allowPublicKeyRetrieval

Test the connection.

Expand: databases > sampledata > Tables

Open a SQL window and run a test query.

General Troubleshooting
Issue: "permission denied" errors
Solution: Ensure your user is in the docker group and that you have logged out and back in, or run: newgrp docker
Issue: Docker service won't start
Solution: Check the logs with: sudo journalctl -u docker.service
Issue: Cannot connect to Docker daemon
Solution: Ensure the Docker service is running with: sudo systemctl start docker
