# APIs

{% tabs %}
{% tab title="Sensitivity & Trust Score" %}
1. Run the following commands to verify system requirements.

```bash
# Check Python version (must be 3.8+)
python3 --version

# Check pip availability
pip --version

# Verify project directory access
ls -la /home/pdc/Projects/APIs/Key_Metrics/
```

2. Ensure the required packages are installed.

```bash
# Navigate to project directory
cd ~/Projects/APIs/Key_Metrics

# Install package in development mode (recommended)
pip install -e .

# Alternative: Using uv package manager
# uv sync

# Verify installation
extract-entities --help
bulk-update-api --help
bulk-update-opensearch --help
```

3. Generate a JWT bearer token.

{% hint style="info" %}
Ensure the credentials you use have the required permissions. The example user, james.lock, has the admin / system\_administrator role.

The JWT bearer token authenticates the request, while PDC authorizes the OpenSearch API calls.

The bearer token is time-limited, so you may need to generate a new one when it expires.
{% endhint %}

```bash
curl -k -L -X POST 'https://pdc.pentaho.lab/keycloak/realms/pdc/protocol/openid-connect/token' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-urlencode 'client_id=pdc-client' \
  --data-urlencode 'grant_type=password' \
  --data-urlencode 'username=james.lock@adventureworks.com' \
  --data-urlencode 'password=Welcome123!' | jq -r '.access_token'
```

{% hint style="info" %}
The command returns a JSON response, and `jq` prints the JWT from its `.access_token` field:
{% endhint %}

```
eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJoTTRKdGZzc0tnWUdXOUJPMEVFeGNISWdDZ0FsWUFnOENQS1JvcWYzbUVvIn0.eyJle
```
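Because the token is time-limited, it can help to check how long it remains valid before starting a long run. The following is a minimal sketch, assuming the token carries the standard `exp` claim (seconds since the Unix epoch). The helper names `jwt_claims` and `seconds_until_expiry` are illustrative and not part of the project. The code decodes the payload without verifying the signature, so use it only to inspect expiry, never to establish trust. The sample token is built locally for the demonstration, since the token shown above is truncated.

```python
import base64
import json
import time


def jwt_claims(token):
    """Decode the payload (middle) segment of a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token, now=None):
    """Return the seconds left before the token's `exp` claim is reached."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] - now


def _b64url(data):
    """Encode a dict as an unpadded base64url JSON segment (demo only)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


# Hypothetical unsigned token, used only to exercise the helpers.
sample = ".".join([_b64url({"alg": "none"}), _b64url({"exp": 1700000300}), "sig"])
print(seconds_until_expiry(sample, now=1700000000))  # prints 300
```

A negative result means the token has already expired and the `curl` command above should be run again.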

4. Create a `config.py` file:

```python
# Data Catalog API Configuration
API_CONFIG = {
    "base_url": "https://pdc.pentaho.lab",
    "auth_token": "your-bearer-token-here",
    "timeout": 30,
    "max_retries": 3
}

# OpenSearch Configuration
OPENSEARCH_CONFIG = {
    "url": "http://localhost:9200",
    "username": "admin",  # Add if authentication required
    "password": "Es3vweMuABJr",  # Located in .env.default
    "verify_ssl": False
}

# File Paths
FILES = {
    "entity_extraction": "data/output/entity_extraction.csv",
    "calculated_input": "data/input/calculated_metrics.csv",
    "joined_output": "data/output/bulk_update_ready.csv",
}

# Processing Options
PROCESSING = {
    "batch_size": 50,
    "delay_between_batches": 1,  # seconds
    "dry_run": False  # Set True for testing
}
```
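To illustrate what the `PROCESSING` options are for, here is a minimal sketch of a batching loop that honours `batch_size`, `delay_between_batches`, and `dry_run`. It is an assumption about how such settings are typically consumed, not the project's actual implementation. `iter_batches`, `process_in_batches`, and the `send_batch` callable are hypothetical names introduced for this example.

```python
import time

# Mirrors the PROCESSING block from config.py above.
PROCESSING = {"batch_size": 50, "delay_between_batches": 1, "dry_run": False}


def iter_batches(rows, batch_size):
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def process_in_batches(rows, send_batch, options=PROCESSING, sleep=time.sleep):
    """Hypothetical driver: send each batch, pausing between batches.

    send_batch is a placeholder callable; in dry-run mode it is never called.
    Returns the number of rows that were (or would have been) sent.
    """
    batches = list(iter_batches(rows, options["batch_size"]))
    sent = 0
    for index, batch in enumerate(batches):
        if not options["dry_run"]:
            send_batch(batch)
        sent += len(batch)
        # Pause between batches, but not after the final one.
        if index < len(batches) - 1:
            sleep(options["delay_between_batches"])
    return sent


# Dry run over 120 placeholder rows with no delay: nothing is sent.
dry_options = dict(PROCESSING, dry_run=True, delay_between_batches=0)
print(process_in_batches(list(range(120)), send_batch=print, options=dry_options))
```

Setting `dry_run` to `True` first lets you confirm the row count and batch split before any update is sent.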

5. Run a diagnostic check.

```bash
# Run diagnostic check
python3 << 'EOF'
import sys
import importlib

required_modules = ['requests', 'pandas', 'urllib3', 'csv', 'json']
missing = []

for module in required_modules:
    try:
        importlib.import_module(module)
        print(f"✅ {module} - OK")
    except ImportError:
        print(f"❌ {module} - MISSING")
        missing.append(module)

if missing:
    print(f"\n⚠️  Install missing modules: pip install {' '.join(missing)}")
else:
    print("\n✅ All required modules installed!")
EOF
```
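The diagnostic above checks modules only. If you also want to confirm the Python 3.8+ requirement from step 1 in the same way, a minimal sketch could look like this. The `python_ok` helper and the `MIN_PYTHON` constant are illustrative names, not part of the project.

```python
import sys

MIN_PYTHON = (3, 8)  # Minimum version stated in step 1


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the major.minor version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


status = "OK" if python_ok() else "too old"
print(f"Python {sys.version_info.major}.{sys.version_info.minor} - {status}")
```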

{% hint style="info" %}
We're now ready to kick off the project!
{% endhint %}
{% endtab %}

{% endtabs %}

