# Uniqueness Rules

{% hint style="success" %}

#### Uniqueness Rules

In this series of detailed workshops we're going to:

* Understand the business value and ROI of implementing business rules
* Master the 7 Data Quality Dimensions in PDC
* Create and configure business rules with detailed step-by-step guidance
* Implement rule actions (Status, Tags, Webhooks) and understand their purpose
* Organize rules using rule groups for operational efficiency
* Monitor and interpret rule execution results for continuous improvement

{% endhint %}

x

***

{% hint style="info" %}

#### Accessing your Catalog

To get started using the Data Catalog, log in using the address and credentials provided by your Data Catalog service user or administrator.
{% endhint %}

To access your catalog, please follow these steps:

1. Open **Google Chrome** web browser.
2. Navigate to:

{% embed url="<https://pdc.pentaho.lab>" %}

3. Enter following email and password, then click **Sign In**.

Username: <david.park@adventureworks.com> (mapped to Business Steward role)

Password: Welcome123!

***

x

x

{% tabs %}
{% tab title="Duplicate Customer Detection" %}
{% hint style="info" %}

#### Duplicate Customer Detection

**Scenario:** Identify potential duplicate customer records based on name and contact info.
{% endhint %}

**Business Rule Configuration:**

```
Rule Name: Customer_Duplicate_Detection
Description: Identifies potential duplicate customer records
Data Quality Dimension: Uniqueness
Schedule: Weekly
Target: Person.Person
```

**SQL Query:**

```sql
WITH DuplicateCandidates AS (
    SELECT 
        FirstName, 
        LastName, 
        EmailAddress,
        COUNT(*) AS DupCount
    FROM Person.Person p
    JOIN Person.EmailAddress e ON p.BusinessEntityID = e.BusinessEntityID
    GROUP BY FirstName, LastName, EmailAddress
    HAVING COUNT(*) > 1
)
SELECT 
    (SELECT COUNT(DISTINCT BusinessEntityID) FROM Person.Person) AS total_count,
    (SELECT COUNT(DISTINCT BusinessEntityID) FROM Person.Person) - 
        (SELECT SUM(DupCount - 1) FROM DuplicateCandidates) AS scopeCount,
    (SELECT SUM(DupCount - 1) FROM DuplicateCandidates) AS nonCompliant
```

***

x
{% endtab %}

{% tab title="Unique Product Numbers" %}
{% hint style="info" %}

#### Incomplete Product Information

**Scenario:** Ensure ProductNumber is truly unique across all products.
{% endhint %}

**Business Rule Configuration:**

```
Rule Name: Product_Number_Uniqueness
Description: Validates ProductNumber uniqueness constraint
Data Quality Dimension: Uniqueness
Schedule: Daily
Target: Production.Product
```

**SQL Query:**

```sql
WITH DuplicateProducts AS (
    SELECT ProductNumber, COUNT(*) AS CountDuplicates
    FROM Production.Product
    GROUP BY ProductNumber
    HAVING COUNT(*) > 1
)
SELECT 
    COUNT(*) AS total_count,
    COUNT(*) - COALESCE((SELECT SUM(CountDuplicates - 1) FROM DuplicateProducts), 0) AS scopeCount,
    COALESCE((SELECT SUM(CountDuplicates - 1) FROM DuplicateProducts), 0) AS nonCompliant
FROM Production.Product
```

***

x

{% endtab %}
{% endtabs %}

x

x

{% tabs %}
{% tab title="First Tab" %}
x
{% endtab %}

{% tab title="Second Tab" %}

{% endtab %}
{% endtabs %}

x


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://academy.pentaho.com/pentaho-data-catalog-en/data-catalog/business-rules/uniqueness-rules.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
