Data Discovery Dashboard
A picture paints a thousand words, and a dashboard is a great visual aid for explaining a complex topic such as data discovery.
We're going to build a comprehensive data discovery and governance application designed for Adventure Works and extensible to any JDBC-compliant database. It features a modular architecture, industry-aligned trust scoring, and production-ready dashboard analytics.
Ensure you have completed: Data Discovery
Start the Dashboard server:
cd ~/Projects/Data_Discovery
# Activate virtual environment
source venv/bin/activate
# Start dashboard server
python -m data_discovery.dashboard.server
Log in to the Dashboard:
Executive Overview
This AdventureWorks2022 database executive dashboard provides a comprehensive analysis of data discovery, classification, and risk assessment status. The system has analyzed 88 total tables containing 723 columns, with 225 classified columns identified. The database maintains a 77% trust score with a "Good" governance level, though the overall risk level is marked as "HIGH," indicating significant security concerns requiring attention.
Click on the Executive Overview report:

Each chart, graph, and table has an "About this ..." info button that gives further details.
Key Insights & Recommendations
Schema Distribution & Data Classification
The schema distribution chart shows Production and Sales schemas containing the highest concentration of tables (approximately 28 and 26 tables respectively), followed by Person (15 tables), HumanResources (12 tables), and Purchasing (7 tables). The classification summary reveals diverse sensitive data categories, with Technical data comprising the largest segment, followed by Personal Name, Address Info, Business, Financial, Contact Info, Operational, Biometric, Compliance, and Security classifications, demonstrating the presence of multiple types of sensitive information requiring protection.
Critical Insights & Risk Indicators
The analysis of 723 columns across 88 tables reveals comprehensive data coverage of 95%+ for governance assessment. A significant PII exposure concern has been identified, with 188 columns containing personally identifiable information requiring enhanced data protection and access controls. Most critically, the security risk assessment has flagged 124 columns as high risk, necessitating immediate security controls and monitoring implementation to mitigate potential data breaches or unauthorized access.
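To put these counts in proportion, the short Python sketch below expresses each figure as a share of the 723 analyzed columns. The counts are the ones quoted above; the function and variable names are illustrative assumptions and are not part of the application.

```python
# Counts quoted in the Executive Overview; all names here are illustrative.
TOTAL_COLUMNS = 723

counts = {
    "classified": 225,
    "pii": 188,
    "high_risk": 124,
}


def share_of_total(count: int, total: int = TOTAL_COLUMNS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: share_of_total(n) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}% of columns")
```

Read this way, roughly a quarter of all columns carry PII and about 17% are flagged as high risk.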
Trust Score
The trust score breakdown shows the DAMA-DMBOK standards component achieving an excellent rating on the dashboard's scale (Excellent 85+, Good 70-84, Fair 55-69, Poor <55), indicating strong adherence to data management best practices. The system prioritizes three key recommendations: implementing data protection with masking for PII columns in non-production environments (High Priority), establishing role-based access controls for sensitive data schemas (Medium Priority), and deploying automated data quality monitoring for critical business tables (Standard). These actions are essential to address the HIGH overall risk level while maintaining operational efficiency.
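The rating bands can be expressed as a small lookup. This is a minimal sketch that assumes the thresholds listed above apply to a 0-100 score; `trust_band` is a hypothetical helper, not a function from the application.

```python
def trust_band(score: float) -> str:
    """Map a 0-100 trust score to the rating bands shown on the dashboard.

    Assumption: fractional scores fall into the band below the next
    threshold, so 84.5 is still "Good".
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 55:
        return "Fair"
    return "Poor"


# The 77% overall score from the Executive Overview falls in the "Good" band.
print(trust_band(77))
```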
Database Inventory
Comprehensive inventory of your database structure showing the results of metadata ingestion from database schemas. Displays structural organization, table distribution, data volume analysis, and business context mapping.
Click on the Database Inventory report:

Select a schema and table to see its column details.
Database Coverage and Schema Analysis
The system has cataloged 5 schemas containing 88 tables and 723 columns, representing over 803,204 rows in total. The complete inventory shows 100% data coverage verification, indicating that all database schemas have been analyzed. This comprehensive column-level documentation forms the foundation for understanding your entire data ecosystem.
Column Details for Data Relationships
The schema details section reveals specific column metadata including data types, nullability, and primary/foreign key indicators (PK/FK markers visible in the bottom table). These column attributes are essential for understanding how tables like "Person" connect across schemas. Without this granular detail, developers cannot properly construct joins or understand referential integrity constraints that maintain data consistency.
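As an illustration of this kind of column metadata, the sketch below reads column types, nullability, and PK/FK markers from a database catalog. It uses an in-memory SQLite database as a stand-in so that it runs anywhere; it is not how the dashboard queries SQL Server, and the two table definitions are simplified assumptions rather than the real AdventureWorks schema.

```python
import sqlite3


def describe_columns(conn: sqlite3.Connection, table: str) -> list[dict]:
    """List each column's type, nullability, and PK/FK role for one table."""
    # PRAGMA foreign_key_list rows: (id, seq, ref_table, from_col, to_col, ...)
    fks = {
        row[3]: (row[2], row[4])
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
    }
    columns = []
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        f'PRAGMA table_info("{table}")'
    ):
        columns.append({
            "name": name,
            "type": col_type,
            "nullable": not notnull and not pk,
            "pk": bool(pk),
            "fk": fks.get(name),  # (referenced table, referenced column) or None
        })
    return columns


# Simplified, hypothetical versions of two Person-schema tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (
    BusinessEntityID INTEGER PRIMARY KEY,
    FirstName TEXT NOT NULL,
    LastName TEXT NOT NULL
);
CREATE TABLE EmailAddress (
    EmailAddressID INTEGER PRIMARY KEY,
    BusinessEntityID INTEGER NOT NULL REFERENCES Person(BusinessEntityID),
    EmailAddress TEXT
);
""")

email_columns = describe_columns(conn, "EmailAddress")
for col in email_columns:
    print(col)
```

The `fk` entry on `BusinessEntityID` is the information a developer needs to join `EmailAddress` back to `Person`.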
Data Lineage and Impact Analysis
The column report shows 13 columns found in schema "Person" and table "Person", with specific data types (VARCHAR, INT, etc.) and lengths documented. This level of detail enables accurate data lineage tracking: understanding where data originates, how it transforms, and where it flows. When a column definition changes, you can immediately identify all downstream dependencies and assess impact across your data pipeline.
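Impact analysis of this kind amounts to walking a dependency graph. The sketch below shows the idea with a breadth-first traversal; the lineage edges and object names (`reporting.vw_customer_names` and so on) are invented for illustration and do not come from the dashboard.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the objects in its list.
LINEAGE = {
    "Person.Person.LastName": ["reporting.vw_customer_names"],
    "reporting.vw_customer_names": [
        "exports.crm_feed",
        "dashboards.sales_by_customer",
    ],
    "exports.crm_feed": [],
    "dashboards.sales_by_customer": [],
}


def downstream(start: str, lineage: dict[str, list[str]]) -> list[str]:
    """Return every object reachable from start, in breadth-first order."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(lineage.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.add(node)
        order.append(node)
        queue.extend(lineage.get(node, []))
    return order


# Everything that would be affected by a change to Person.Person.LastName.
impacted = downstream("Person.Person.LastName", LINEAGE)
for name in impacted:
    print(name)
```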
Compliance and Regulatory Requirements
The dashboard identifies potentially sensitive columns containing personal information (like FirstName, LastName, EmailAddress, and AdditionalContactInfo). This automated discovery and classification of personal data is crucial for GDPR, CCPA, and other privacy regulations. The nullability indicators and data type specifications help ensure compliance with data retention policies and proper handling of required versus optional fields.
Auditing and Data Quality
The schema distribution chart and the top-tables-by-size metrics support auditing. Table-level statistics (such as the 294,412 rows reported for Sales), combined with data type verification, enable quality checks and anomaly detection. The discovery status with timestamps (last updated 8/22/2025) creates an audit history showing when column definitions were last verified, which is critical for change management and compliance reporting.
This comprehensive column documentation transforms raw database metadata into actionable intelligence, enabling better governance, faster development, and reduced compliance risk across your organization.
