Sensitivity Levels & Trust Scores
This hands-on workshop teaches you how to implement an automated solution for bulk updating Trust Scores and Sensitivity levels across your entire data catalog. You'll learn to extract entity data, calculate metrics, join data using Pentaho, and perform bulk updates efficiently.
By the end of this workshop, you will be able to:
Extract entity data from your data catalog with hierarchical names
Join calculated Trust Score and Sensitivity values using Pentaho Data Integration
Perform bulk updates across all schemas, tables, and columns
Validate and monitor the update process
Troubleshoot common issues

Entity Extraction
The extraction process retrieves all entities from your data catalog with their hierarchical relationships intact.
What Gets Extracted:
Entity unique identifiers (UUIDs)
Entity types (SCHEMA/TABLE/COLUMN)
Hierarchical names for joining
Current Trust Score and Sensitivity values
Fully qualified domain names (FQDNs)
Learning Objectives:
Understand the entity extraction process
Extract all entities with hierarchical names from your data catalog
Analyze the extracted data structure
Prepare data for joining with calculated metrics
Run the extraction script:
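The exact command depends on how your environment packages the script. As a sketch of what the extraction step does, the Python below pages through a catalog REST endpoint and writes entity_extraction.csv; the /api/entities path, parameter names, and response fields are illustrative assumptions, not the catalog's documented API.

```python
# Sketch of the extraction step; endpoint path and field names are assumptions.
import csv
import os

import requests

CATALOG_URL = os.environ["CATALOG_URL"]      # e.g. https://catalog.example.com
API_TOKEN = os.environ["CATALOG_API_TOKEN"]

COLUMNS = [
    "entity_id", "entity_type", "entity_name", "schema_name", "table_name",
    "column_name", "fqdn", "fqdn_display", "current_trust_score",
    "current_sensitivity", "new_trust_score", "new_sensitivity",
]

def fetch_entities():
    """Page through all SCHEMA/TABLE/COLUMN entities (endpoint is assumed)."""
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    page = 0
    while True:
        resp = requests.get(
            f"{CATALOG_URL}/api/entities",  # hypothetical path
            params={"types": "SCHEMA,TABLE,COLUMN", "page": page, "size": 500},
            headers=headers,
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("items", [])
        if not batch:
            return
        yield from batch
        page += 1

with open("entity_extraction.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    for entity in fetch_entities():
        # new_trust_score / new_sensitivity stay empty for later calculation.
        writer.writerow({col: entity.get(col, "") for col in COLUMNS})
```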
Expected output: a file named entity_extraction.csv with one row per entity.
Take a look at the columns in entity_extraction.csv:
| Column | Description | Example |
| --- | --- | --- |
| `entity_id` | Unique identifier (UUID) | `ef60e629-4261-4ce6-8635-961ca4b1b420` |
| `entity_type` | Type of entity | `SCHEMA`, `TABLE`, `COLUMN` |
| `entity_name` | Entity's actual name | `Employee` |
| `schema_name` | Schema name for joining | `HumanResources` |
| `table_name` | Table name (empty for schemas) | `Employee` |
| `column_name` | Column name (empty for schemas/tables) | `FirstName` |
| `fqdn` | Internal fully qualified name | `688cc7b9c5759eae5fdcba07/...` |
| `fqdn_display` | Human-readable path | `mssql:adventureworks2022/...` |
| `current_trust_score` | Existing trust score | `48` |
| `current_sensitivity` | Existing sensitivity | `HIGH` |
| `new_trust_score` | For your calculated values | (empty) |
| `new_sensitivity` | For your calculated values | (empty) |
Run a Data Quality Analysis
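A minimal data-quality pass over the extracted file might look like the following sketch; the specific checks are illustrative, not the workshop's official script.

```python
# Illustrative data-quality checks on the extraction output.
import pandas as pd

df = pd.read_csv("entity_extraction.csv", dtype=str)

# Every row needs a usable identifier.
assert df["entity_id"].notna().all(), "rows with missing entity_id"
assert not df["entity_id"].duplicated().any(), "duplicate entity_id values"

# Entity types should be exactly the three levels we extract.
unexpected = set(df["entity_type"].dropna().unique()) - {"SCHEMA", "TABLE", "COLUMN"}
assert not unexpected, f"unexpected entity types: {unexpected}"

# Quick profile to eyeball before moving on.
print(df["entity_type"].value_counts())
print("rows missing current_trust_score:", df["current_trust_score"].isna().sum())
```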
Run Complete Extraction
Data Integration
Learning Objectives
Set up Pentaho Data Integration transformation
Join extracted entities with calculated metrics (see the sketch after this list)
Handle different entity types (schema, table, column)
Output properly formatted CSV for bulk updates
Validate join results
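Pentaho performs the join itself inside its transformation steps; the pandas sketch below is only a way to sanity-check the join logic outside Pentaho. The metrics file name calculated_metrics.csv, its trust_score/sensitivity columns, and the bulk_update_input.csv output name are assumptions.

```python
# Sanity-check of the join logic outside Pentaho; file and column names on
# the metrics side are assumptions.
import pandas as pd

entities = pd.read_csv("entity_extraction.csv", dtype=str)
metrics = pd.read_csv("calculated_metrics.csv", dtype=str)  # hypothetical input

# The join key depends on the entity type: schemas match on schema_name,
# tables on schema+table, columns on all three name parts.
def join_key(row):
    parts = [row.get("schema_name", ""), row.get("table_name", ""),
             row.get("column_name", "")]
    return "/".join(p for p in parts if pd.notna(p) and p)

entities["join_key"] = entities.apply(join_key, axis=1)
metrics["join_key"] = metrics.apply(join_key, axis=1)

joined = entities.merge(
    metrics[["join_key", "trust_score", "sensitivity"]],  # assumed column names
    on="join_key", how="left",
)
joined["new_trust_score"] = joined["trust_score"]
joined["new_sensitivity"] = joined["sensitivity"]

# Unmatched rows indicate a naming mismatch worth fixing before any update.
print("entities without metrics:", joined["trust_score"].isna().sum())
joined.drop(columns=["join_key", "trust_score", "sensitivity"]).to_csv(
    "bulk_update_input.csv", index=False)
```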
Step 1: Prepare Input Files
Verify Your Input Files
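A quick pre-flight check along these lines can catch missing files or columns before you open Pentaho; calculated_metrics.csv is an assumed name for the file holding your computed scores.

```python
# Pre-flight check on the two input files; names and required columns are
# assumptions matching the earlier sketches.
from pathlib import Path
import csv

REQUIRED = {
    "entity_extraction.csv": {"entity_id", "entity_type", "schema_name",
                              "table_name", "column_name"},
    "calculated_metrics.csv": {"schema_name", "table_name", "column_name",
                               "trust_score", "sensitivity"},
}

for name, required_cols in REQUIRED.items():
    path = Path(name)
    if not path.exists():
        raise SystemExit(f"missing input file: {name}")
    with path.open(newline="") as f:
        header = set(next(csv.reader(f)))
    missing = required_cols - header
    if missing:
        raise SystemExit(f"{name} is missing columns: {sorted(missing)}")
    print(f"{name}: OK ({len(header)} columns)")
```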
Bulk Updates (API and OpenSearch)
Learning Objectives
Understand the two update methods (API vs OpenSearch)
Configure authentication and connection settings
Perform bulk updates with validation
Monitor update progress and handle errors
Verify updates were applied successfully
⚖️ Step 1: Choose Update Method
API Method (Recommended)
Pros:
Uses official data catalog API
Respects business rules and validation
Maintains audit trails
Safer for production use
Cons:
Slower for large datasets (rate limited)
Requires valid API authentication
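For orientation, here is a hedged sketch of the API route. The PATCH verb, /api/entities/{id} path, and payload keys are assumptions to adapt to your catalog, and bulk_update_input.csv is the joined file produced in the Data Integration section.

```python
# Sketch of the API method: one PATCH per entity, paced for rate limits.
import csv
import os
import time

import requests

CATALOG_URL = os.environ["CATALOG_URL"]
session = requests.Session()
session.headers["Authorization"] = f"Bearer {os.environ['CATALOG_API_TOKEN']}"

with open("bulk_update_input.csv", newline="") as f:
    for row in csv.DictReader(f):
        payload = {}
        if row["new_trust_score"]:
            payload["trustScore"] = int(row["new_trust_score"])  # assumed key
        if row["new_sensitivity"]:
            payload["sensitivity"] = row["new_sensitivity"]      # assumed key
        if not payload:
            continue  # nothing calculated for this entity
        resp = session.patch(
            f"{CATALOG_URL}/api/entities/{row['entity_id']}",  # hypothetical path
            json=payload,
            timeout=30,
        )
        resp.raise_for_status()
        time.sleep(0.1)  # crude pacing to respect API rate limits
```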
OpenSearch Direct Method
Pros:
Faster bulk operations
No API rate limits
Direct database updates
Cons:
Bypasses business logic
Requires OpenSearch access
Less audit trail
Higher risk if misconfigured
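And a sketch of the direct route using the opensearch-py bulk helper. The index name and document field names are assumptions you must verify against your cluster before running, precisely because this path bypasses the catalog's validation.

```python
# Sketch of the OpenSearch direct method via the bulk helper.
import csv
import os

from opensearchpy import OpenSearch, helpers

client = OpenSearch(
    hosts=[os.environ["OPENSEARCH_URL"]],
    http_auth=(os.environ["OS_USER"], os.environ["OS_PASSWORD"]),
    verify_certs=True,
)

def actions():
    with open("bulk_update_input.csv", newline="") as f:
        for row in csv.DictReader(f):
            if not row["new_trust_score"] and not row["new_sensitivity"]:
                continue
            yield {
                "_op_type": "update",
                "_index": "catalog-entities",  # assumed index name
                "_id": row["entity_id"],
                "doc": {
                    "trust_score": row["new_trust_score"],    # assumed field
                    "sensitivity": row["new_sensitivity"],    # assumed field
                },
            }

ok, errors = helpers.bulk(client, actions(), raise_on_error=False)
print(f"updated {ok} documents; {len(errors)} errors")
```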
Step 2: Configure Authentication
API Authentication Setup
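A minimal setup sketch, assuming token-based authentication supplied via environment variables; the variable names and the health-check path are illustrative.

```python
# Illustrative authentication wiring: keep the token in the environment,
# then fail fast before starting a long bulk run.
import os

import requests

CATALOG_URL = os.environ["CATALOG_URL"]        # e.g. https://catalog.example.com
API_TOKEN = os.environ["CATALOG_API_TOKEN"]    # issued by your catalog admin

session = requests.Session()
session.headers.update({
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json",
})

resp = session.get(f"{CATALOG_URL}/api/health", timeout=10)  # hypothetical path
resp.raise_for_status()
print("authenticated OK")
```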
Troubleshooting (Common Issues and Solutions)
Learning Objectives
Identify and resolve common issues during implementation
Understand error messages and their solutions
Implement monitoring and alerting (see the verification sketch after this list)
Create maintenance procedures
Establish best practices for ongoing operations
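One way to close the loop, sketched under the same file-name assumptions as the earlier examples: re-run the extraction after the update and diff what the catalog now reports against what you intended to set.

```python
# Post-update verification: compare intended values against a fresh extract.
import pandas as pd

intended = pd.read_csv("bulk_update_input.csv", dtype=str)
observed = pd.read_csv("entity_extraction_after.csv", dtype=str)  # fresh extract

merged = intended.merge(
    observed[["entity_id", "current_trust_score", "current_sensitivity"]],
    on="entity_id", suffixes=("", "_after"),
)
mismatch = merged[
    merged["new_trust_score"].notna()
    & (merged["new_trust_score"] != merged["current_trust_score_after"])
]
print(f"{len(mismatch)} entities did not pick up the new trust score")
if not mismatch.empty:
    mismatch.to_csv("update_mismatches.csv", index=False)  # feed into alerting
```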
Common Issues & Solutions
Environment Setup Issues
Issue: "Command not found: extract-entities"
