# Data Discovery

{% hint style="info" %}

#### Data Discovery

Data Discovery establishes the foundation for implementing Pentaho Data Catalog (PDC) with the Adventure Works database. The workshop systematically turns raw data discovery into actionable governance, supporting regulatory compliance (GDPR, SOX, CCPA) while enabling secure, role-based data access.

Through structured sessions, you will:

* create a complete data asset inventory
* classify sensitive information
* map organizational access requirements
* design automated compliance controls that reduce manual governance overhead by an estimated 75%

The results deliver immediate business value through proactive risk mitigation and audit readiness. By identifying 47 sensitive data elements across the 71 Adventure Works tables and mapping them to specific regulatory requirements, organizations can reduce their exposure to GDPR fines of up to 4% of global annual turnover and to SOX compliance violations that can carry criminal liability for executives.

The structured approach ensures that all 19,972 person records and 31,465 financial transactions are properly classified and protected according to their risk profile and business usage patterns.

This foundation enables automated segregation of duties for SOX compliance, purpose limitation for GDPR requirements, and least-privilege access controls, all while maintaining operational efficiency and user productivity.
{% endhint %}

{% embed url="<https://docs.pentaho.com/pdc-admin/ldc-manage-data-sources-cp/adding-a-data-source-ldc-manage-data-sources-ag>" %}
Link to Data Catalog data sources
{% endembed %}

***

