# APIs

{% tabs %}
{% tab title="Sensitivity & Trust Score" %}
1. Run the following commands to verify system requirements.

   ```bash
   # Check Python version (must be 3.8+)
   python3 --version

   # Check pip availability
   pip --version

   # Verify project directory access
   ls -la /home/pdc/Projects/APIs/Key_Metrics/
   ```

2. Ensure the required packages are installed.

   ```bash
   # Navigate to project directory
   cd ~/Projects/APIs/Key_Metrics

   # Install package in development mode (recommended)
   pip install -e .

   # Alternative: Using uv package manager
   # uv sync

   # Verify installation
   extract-entities --help
   bulk-update-api --help
   bulk-update-opensearch --help
   ```

3. Generate a JWT bearer token.

   {% hint style="info" %}
   Ensure the credentials you use have the required permissions; james.lock has the admin / system\_administrator role.

   The JWT bearer token authenticates the request, while PDC authorizes the OpenSearch API calls.

   The bearer token is time limited, so you may need to generate a new one when it expires.
   {% endhint %}

   ```bash
   curl -k -L -X POST 'https://pdc.pentaho.lab/keycloak/realms/pdc/protocol/openid-connect/token' \
     -H 'Content-Type: application/x-www-form-urlencoded' \
     --data-urlencode 'client_id=pdc-client' \
     --data-urlencode 'grant_type=password' \
     --data-urlencode 'username=james.lock@adventureworks.com' \
     --data-urlencode 'password=Welcome123!' | jq -r '.access_token'
   ```

   {% hint style="info" %}
   The command extracts the `.access_token` field from the response and prints the JWT:
   {% endhint %}

   ```
   eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJoTTRKdGZzc0tnWUdXOUJPMEVFeGNISWdDZ0FsWUFnOENQS1JvcWYzbUVvIn0.eyJle
   ```

4. Create a `config.py` file:

   ```python
   # Data Catalog API Configuration
   API_CONFIG = {
       "base_url": "https://pdc.pentaho.lab",
       "auth_token": "your-bearer-token-here",  # Paste the token generated in step 3
       "timeout": 30,
       "max_retries": 3
   }

   # OpenSearch Configuration
   OPENSEARCH_CONFIG = {
       "url": "http://localhost:9200",
       "username": "admin",         # Add if authentication required
       "password": "Es3vweMuABJr",  # Located in .env.default
       "verify_ssl": False
   }

   # File Paths
   FILES = {
       "entity_extraction": "data/output/entity_extraction.csv",
       "calculated_input": "data/input/calculated_metrics.csv",
       "joined_output": "data/output/bulk_update_ready.csv",
   }

   # Processing Options
   PROCESSING = {
       "batch_size": 50,
       "delay_between_batches": 1,  # seconds
       "dry_run": False             # Set True for testing
   }
   ```

5. Run a diagnostic check.

   ```bash
   # Run diagnostic check
   python3 << 'EOF'
   import importlib

   required_modules = ['requests', 'pandas', 'urllib3', 'csv', 'json']
   missing = []

   for module in required_modules:
       try:
           importlib.import_module(module)
           print(f"✅ {module} - OK")
       except ImportError:
           print(f"❌ {module} - MISSING")
           missing.append(module)

   if missing:
       print(f"\n⚠️ Install missing modules: pip install {' '.join(missing)}")
   else:
       print("\n✅ All required modules installed!")
   EOF
   ```

{% hint style="info" %}
We're now ready to kick off the project!
{% endhint %}
{% endtab %}

{% tab title="Second Tab" %}

{% endtab %}
{% endtabs %}
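The bearer token generated in step 3 is time limited, so a script that reads `auth_token` from `config.py` may start failing once the token expires. As a rough sketch, the expiry time can be read from the token's `exp` claim before any API calls are made. The helper names `jwt_expiry` and `is_token_expired` are hypothetical and are not part of the Key_Metrics package. The sketch assumes the access token carries the standard JWT `exp` claim, and it only decodes the payload; it does not verify the signature.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) of a JWT. The signature is NOT verified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload_b64 = parts[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    # Assumption: the token includes the standard `exp` claim.
    return int(payload["exp"])


def is_token_expired(token: str, leeway_seconds: int = 30,
                     now: Optional[float] = None) -> bool:
    """Return True if the token expires within `leeway_seconds` of `now`."""
    current = time.time() if now is None else now
    return jwt_expiry(token) <= current + leeway_seconds
```

One possible use is to call `is_token_expired(API_CONFIG["auth_token"])` before a bulk update and rerun the `curl` command from step 3 when it returns `True`.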