# Read JSON

{% hint style="warning" %}

#### Workshop - Read JSON

Parse a JSON file into rows. Use **JSON Input**.

**What you’ll do**

* Read a JSON file from disk.
* Set a loop path for an array.
* Extract fields with JSONPath.
* Use **Get Fields** to infer metadata.
* Preview rows and validate types.

**Prerequisites:** Basic transformations. Basic JSON (objects, arrays). PDI installed.

**Estimated time:** 15 minutes
{% endhint %}

{% embed url="<https://www.loom.com/share/992afd2bbf75465ab70ae76d895ea4f3?hideEmbedTopBar=true&hide_owner=true&hide_share=true&hide_title=true>" %}
JSON Input
{% endembed %}

***

{% hint style="info" %}
**Workshop files**

Download the following files.

Keep the filenames unchanged.

Save them in your workshop folder.
{% endhint %}

{% file src="/files/KuMmOmVib28mfo1DWknO" %}

***

<figure><img src="/files/x6wBpjxiEX4lzpV0aNaa" alt="" width="375"><figcaption><p>JSON input</p></figcaption></figure>

{% hint style="info" %}
**Create a new transformation**

Use any of these options to open a new transformation tab:

* Select **File** > **New** > **Transformation**
* Use `Ctrl+N` (Windows/Linux) or `Cmd+N` (macOS)
  {% endhint %}

***

1. Open `jsonfile.js` in an editor. You will extract fields from this array:

```json
{ "document": {
    "order": [ 
      { "productline": "Classic Cars",
        "customer": "Christine Loomis",
        "status": "Delivered",
        "date": "January 2004",
        "value": 21.99
      },
      { "productline": "Classic Cars",
        "customer": "Mary L. Peachin",
        "status": "Delivered",
        "date": "November 2008",
        "value": 24.99
      },
      { "productline": "Trains",
        "customer": "Bob Italia",
        "status": "Delivered",
        "date": "July 1994",
        "value": 14.99
      }
    ]
  }
}
```

{% hint style="info" %}
`status` can be `Delivered` or `Returned`.
{% endhint %}

***

{% tabs %}
{% tab title="1. JSON Input" %}
{% hint style="info" %}

#### JSON Input

JSON Input reads JSON and outputs rows.

You set one loop path. It outputs one row per loop element.
{% endhint %}

1. Start Pentaho Data Integration.

{% hint style="info" %}
{% tabs %}
{% tab title="Windows (PowerShell)" %}

```powershell
Set-Location C:\Pentaho\design-tools\data-integration
.\spoon.bat
```

{% endtab %}

{% tab title="macOS / Linux" %}

```bash
cd ~/Pentaho/design-tools/data-integration
./spoon.sh
```

{% endtab %}
{% endtabs %}
{% endhint %}

2. Drag **JSON Input** onto the canvas.
3. Open the step.
4. On the **File** tab, select your `jsonfile.js`.

{% hint style="info" %}
Save your transformation near `jsonfile.js`. Then use a portable path.
{% endhint %}

Example:

```
${Internal.Transformation.Filename.Directory}/jsonfile.js
```

5. Set **Loop path** to:

```
$.document.order[*]
```

6. Open the **Fields** tab.
7. Select **Get Fields**.
8. Verify these field paths and types:

* `productline` (String)
* `customer` (String)
* `status` (String)
* `date` (String)
* `value` (Number)

<figure><img src="/files/XogGI4my2DluGDO2bTpI" alt=""><figcaption><p>JSON input - file</p></figcaption></figure>

<figure><img src="/files/ykbo8iLoe9sIScKRllp2" alt=""><figcaption><p>JSON input - fields</p></figcaption></figure>

{% hint style="info" %}
Need help writing JSONPath? Use a tester like: <https://jsonpath.com/>
{% endhint %}
{% endtab %}

{% tab title="2. Dummy" %}
{% hint style="info" %}

#### Dummy

Dummy does not do anything. Use it as a preview target.
{% endhint %}

1. Expand **Flow** in the Design palette.
2. Drag **Dummy** onto the canvas.
3. Create a hop from **JSON Input** to **Dummy**.
   {% endtab %}

{% tab title="3. Run" %}
{% hint style="info" %}

#### Run the transformation

Run locally and preview the output rows.
{% endhint %}

1. Select **Run** in the canvas toolbar.
2. After it finishes, right-click **Dummy**.
3. Select **Preview**.

{% hint style="success" %}
You should see one row per order in the JSON array.
{% endhint %}

<figure><img src="/files/hwTgcdfRdFLe1MJqeXmT" alt=""><figcaption><p>Preview data</p></figcaption></figure>
{% endtab %}
{% endtabs %}

***

### Troubleshooting

<details>

<summary>No rows returned</summary>

Check the **Loop path** first. For this file, it must point to the array:

```
$.document.order[*]
```

If the JSON structure changes, update the loop path.

</details>

<details>

<summary>Fields are null</summary>

Confirm field paths match the JSON keys. If you loop over `order[*]`, use `productline`, not `$.document.order.productline`.

</details>

<details>

<summary>Wrong data types</summary>

Use **Get Fields** as a starting point. Then set types explicitly.

Example: keep `status` as **String**.

</details>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://academy.pentaho.com/pentaho-data-integration/data-integration/data-sources/flat-files/json/read-json.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
