Sensitivity Level & Trust Scores
This hands-on workshop teaches you how to implement an automated solution for bulk updating Trust Scores and Sensitivity levels across your entire data catalog. You'll learn to extract entity data, calculate metrics, join data using Pentaho, and perform bulk updates efficiently.
By the end of this workshop, you will be able to:
Extract entity data from your data catalog with hierarchical names
Join calculated Trust Score and Sensitivity values using Pentaho Data Integration
Perform bulk updates across all schemas, tables, and columns
Validate and monitor the update process
Troubleshoot common issues

The solution consists of three main components:
Entity Extraction Tool - Extracts all entities with hierarchical names from OpenSearch
Pentaho Data Integration - Joins your calculated values with entity data
Bulk Update Tool - Updates Trust Score and Sensitivity via API or OpenSearch
Expected Outcomes
Automated bulk updates of Trust Score (0-100) and Sensitivity (HIGH/MEDIUM/LOW)
Support for schema, table, and column level updates
Validation and error reporting
Scalable solution for thousands of entities
Entity Extraction
The extraction process retrieves all entities from your data catalog with their hierarchical relationships intact.
What Gets Extracted:
Entity unique identifiers (UUIDs)
Entity types (SCHEMA/TABLE/COLUMN)
Hierarchical names for joining
Current Trust Score and Sensitivity values
Fully qualified domain names (FQDNs)
Learning Objectives:
Understand the entity extraction process
Extract all entities with hierarchical names from your data catalog
Analyze the extracted data structure
Prepare data for joining with calculated metrics
Run the extraction script:
Expected output:
Take a look at entity_extraction.csv:

| Column | Description | Example |
| --- | --- | --- |
| entity_id | Unique identifier | ef60e629-4261-4ce6-8635-961ca4b1b420 |
| entity_type | Type of entity | SCHEMA, TABLE, COLUMN |
| entity_name | Entity's actual name | Employee |
| schema_name | Schema name for joining | HumanResources |
| table_name | Table name (empty for schemas) | Employee |
| column_name | Column name (empty for schemas/tables) | FirstName |
| fqdn | Internal fully qualified name | 688cc7b9c5759eae5fdcba07/... |
| fqdn_display | Human-readable path | mssql:adventureworks2022/... |
| current_trust_score | Existing trust score | 48 |
| current_sensitivity | Existing sensitivity | HIGH |
| new_trust_score | For your calculated values | (empty) |
| new_sensitivity | For your calculated values | (empty) |
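To sanity-check the extraction, you can summarize the CSV by entity type before moving on. A minimal sketch using Python's csv module (the two sample rows below are made up for illustration; in practice you would open entity_extraction.csv directly):

```python
import csv
import io
from collections import Counter

# Two made-up rows standing in for a real entity_extraction.csv
sample = """entity_id,entity_type,entity_name,schema_name,table_name,column_name
ef60e629-4261-4ce6-8635-961ca4b1b420,SCHEMA,HumanResources,HumanResources,,
11111111-2222-3333-4444-555555555555,COLUMN,FirstName,HumanResources,Employee,FirstName
"""

# In practice: rows = list(csv.DictReader(open("entity_extraction.csv")))
rows = list(csv.DictReader(io.StringIO(sample)))

# Count entities per type (SCHEMA / TABLE / COLUMN)
counts = Counter(row["entity_type"] for row in rows)
print(dict(counts))  # {'SCHEMA': 1, 'COLUMN': 1}
```

If the schema/table/column counts differ wildly from what your catalog UI shows, re-run the extraction before proceeding.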
Run a Data Quality Analysis
Run Complete Extraction
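The data-quality pass can be as simple as counting entities that are missing current values. A hedged sketch (column names follow entity_extraction.csv; the sample rows are made up):

```python
def dq_report(rows):
    """Count entities lacking a current trust score or sensitivity."""
    missing_score = sum(1 for r in rows if not r.get("current_trust_score"))
    missing_sens = sum(1 for r in rows if not r.get("current_sensitivity"))
    return {
        "total": len(rows),
        "missing_trust_score": missing_score,
        "missing_sensitivity": missing_sens,
    }

# Illustrative rows only — load the real extraction CSV in practice
sample_rows = [
    {"entity_id": "a", "current_trust_score": "48", "current_sensitivity": "HIGH"},
    {"entity_id": "b", "current_trust_score": "", "current_sensitivity": ""},
]
print(dq_report(sample_rows))
# {'total': 2, 'missing_trust_score': 1, 'missing_sensitivity': 1}
```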
Data Integration
Learning Objectives
Set up Pentaho Data Integration transformation
Join extracted entities with calculated metrics
Handle different entity types (schema, table, column)
Output properly formatted CSV for bulk updates
Validate join results
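Conceptually, the transformation joins on a different key per entity type: schemas match on schema name alone, tables on schema plus table, and columns on all three names. An illustrative Python equivalent of that key logic (this is not Pentaho itself, just the idea it implements):

```python
def join_key(row):
    """Join key by entity type: schemas match on schema name alone,
    tables on (schema, table), columns on (schema, table, column)."""
    etype = row["entity_type"]
    if etype == "SCHEMA":
        return (row["schema_name"],)
    if etype == "TABLE":
        return (row["schema_name"], row["table_name"])
    return (row["schema_name"], row["table_name"], row["column_name"])

col = {"entity_type": "COLUMN", "schema_name": "HumanResources",
       "table_name": "Employee", "column_name": "FirstName"}
print(join_key(col))  # ('HumanResources', 'Employee', 'FirstName')
```

Building the same key on both the extracted entities and your calculated-metrics file is what lets a single join handle all three entity levels.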
Step 1: Prepare Input Files
Verify Your Input Files
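Before opening the transformation, confirm that both input CSVs expose the columns the join needs. A small sketch (the metrics-file layout shown here is an assumption; adjust to your own file):

```python
REQUIRED_ENTITY_COLS = {"entity_type", "schema_name", "table_name", "column_name"}
REQUIRED_METRIC_COLS = {"schema_name", "table_name", "column_name",
                        "new_trust_score", "new_sensitivity"}  # assumed layout

def missing_columns(headers, required):
    """Return the required columns absent from a CSV header row."""
    return required - set(headers)

# Example: a metrics file that forgot its sensitivity column
hdr = ["schema_name", "table_name", "column_name", "new_trust_score"]
print(missing_columns(hdr, REQUIRED_METRIC_COLS))  # {'new_sensitivity'}
```

An empty result for both files means the join keys and output columns are all present.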
Bulk Updates (API and OpenSearch)
Learning Objectives
Understand the two update methods (API vs OpenSearch)
Configure authentication and connection settings
Perform bulk updates with validation
Monitor update progress and handle errors
Verify updates were applied successfully
⚖️ Step 1: Choose Update Method
API Method (Recommended)
Pros:
Uses official data catalog API
Respects business rules and validation
Maintains audit trails
Safer for production use
Cons:
Slower for large datasets (rate limited)
Requires valid API authentication
OpenSearch Direct Method
Pros:
Faster bulk operations
No API rate limits
Direct database updates
Cons:
Bypasses business logic
Requires OpenSearch access
Less audit trail
Higher risk if misconfigured
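Whichever method you choose, each joined CSV row has to be turned into an update payload. A hedged sketch of that step (the payload field names `trustScore` and `sensitivity` are assumptions — match them to your catalog's API or OpenSearch mapping):

```python
def build_update(row):
    """Assemble (entity_id, payload) for one joined CSV row.
    Field names are assumptions — align them with your catalog's schema."""
    payload = {}
    if row.get("new_trust_score"):
        payload["trustScore"] = int(row["new_trust_score"])
    if row.get("new_sensitivity"):
        payload["sensitivity"] = row["new_sensitivity"]
    return row["entity_id"], payload

entity_id, body = build_update({
    "entity_id": "ef60e629-4261-4ce6-8635-961ca4b1b420",
    "new_trust_score": "85",
    "new_sensitivity": "MEDIUM",
})
print(entity_id, body)
```

Skipping empty `new_*` values, as above, keeps the bulk run from overwriting existing scores with blanks.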
Step 2: Configure Authentication
API Authentication Setup
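A common pattern is to keep the API token out of the scripts and read it from the environment at run time. A minimal sketch (the variable name `CATALOG_API_TOKEN` is an assumption — use whatever your deployment provides):

```python
import os

# Token variable name is an assumption; never hard-code credentials.
token = os.environ.get("CATALOG_API_TOKEN", "<your-token>")
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}
print(sorted(headers))  # ['Authorization', 'Content-Type']
```

These headers can then be attached to every update request the bulk tool sends.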
Learning Objectives
Perform comprehensive validation of bulk updates
Test data catalog UI to confirm changes are visible
Validate data integrity and consistency
Create automated validation scripts
Document test results and generate reports
Section 6: Testing & Validation
Step 1: Pre-Update Baseline Capture
Create Baseline Report (if not done before updates)
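With a baseline snapshot in hand, validation reduces to diffing the before and after states per entity. A minimal sketch, assuming each snapshot is a `{entity_id: (trust_score, sensitivity)}` dict built from the extraction CSV:

```python
def diff_baseline(before, after):
    """Entities whose (trust_score, sensitivity) changed between snapshots.
    Each snapshot maps entity_id -> (trust_score, sensitivity)."""
    return {eid: (before.get(eid), after[eid])
            for eid in after
            if before.get(eid) != after[eid]}

# Illustrative snapshots: entity 'a' was updated, 'b' was not
before = {"a": (48, "HIGH"), "b": (70, "LOW")}
after = {"a": (85, "MEDIUM"), "b": (70, "LOW")}
print(diff_baseline(before, after))  # {'a': ((48, 'HIGH'), (85, 'MEDIUM'))}
```

Entities that appear in the diff but were not in your update file indicate an unintended change and are worth investigating.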
Section 7: Troubleshooting
Learning Objectives
Identify and resolve common issues during implementation
Understand error messages and their solutions
Implement monitoring and alerting
Create maintenance procedures
Establish best practices for ongoing operations
Common Issues & Solutions
Environment Setup Issues
Issue: "Command not found: extract-entities"