# Parsing natural language to structured output

## **Goal**

We will **modify our workflow** so that:

1. The user inputs a query like **“latest news in Japan today”**.
2. **ChatGPT extracts** structured details from the input (e.g., country, category, keyword).
3. The extracted details are **passed to the Mediastack Node**, dynamically filtering news.

***

## **Step 1: Add a New ChatGPT Node for Information Extraction**

We need a new ChatGPT node to **extract structured data** from the user’s input before passing it to the Mediastack Node.

<figure><img src="https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2Fj5xlsxufkw1aTJDgfrTE%2Fimage.png?alt=media&#x26;token=449c2e10-8cb9-497e-bba9-08cf59fcd5b9" alt=""><figcaption><p>New workflow setup with only ChatOpenAI node</p></figcaption></figure>

{% stepper %}
{% step %}
Remove the connection between the Input Node and the Output Node; the Input Node will connect to a new **ChatOpenAI Node** instead.

{% endstep %}

{% step %}
Connect the **Input Node’s** `message` output to the new **ChatGPT Node’s** `message` input.

{% endstep %}

{% step %}
**Test it** by sending the message:

* *“latest news in Malaysia today”*
* Right now, ChatGPT doesn’t understand what to do, because we haven’t given it proper instructions.
  {% endstep %}
  {% endstepper %}

## **Step 2: Instruct ChatGPT to Extract Information**

To make ChatGPT extract information, we need to **provide a system instruction**. However, the **ChatGPT Node** has only **one input**, and we need to send **both a system instruction and the user query**. To work around this, we use a JavaScript Node that **combines them into a single string**, so ChatGPT receives both parts in one message.

{% stepper %}
{% step %}

### Add a JavaScript Node

* Double-click an empty space in the editor and search for **JavaScript** to quickly add it.
* Name it **"System Prompt"** for clarity.
* Define **two inputs**:
  * **User Input** (`user` - string) → This will hold the user’s message.
  * **System Prompt** (`system` - string) → This will contain the instruction:\
    `"Extract the country, category, and keyword from the following user request"`
* Define **one output** (`output` - string).
  {% endstep %}

{% step %}
**Modify the JavaScript Code**

<figure><img src="https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2FcDcewIOcmjfeiF5zbjmV%2Fimage.png?alt=media&#x26;token=265d36a6-e378-49e0-b1db-2fbbef6998a6" alt=""><figcaption><p>Configuring Javascript Node</p></figcaption></figure>

* Inside the **Function Code** of the JavaScript Node, enter the following:

  ```javascript
  return `system: ${inputs[1]}\nuser: ${inputs[0]}`;
  ```
* This ensures that ChatGPT receives the prompt in a structured format:

  ```
  system: Extract the country, category, and keyword from the following user request
  user: latest news in Malaysia today
  ```
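
To check the combined string outside the editor, the same logic can be wrapped in a plain function. This is a minimal sketch: `buildPrompt` is a hypothetical name, and it assumes the editor passes `inputs` as an array in the order the inputs were defined (index 0 for `user`, index 1 for `system`), as the node code above implies.

```javascript
// Stand-alone version of the "System Prompt" node logic, for local testing.
// Assumption: inputs[0] is the user message and inputs[1] is the system instruction.
function buildPrompt(inputs) {
  const [user, system] = inputs;
  // Put the system instruction first, then the user message on its own line.
  return `system: ${system}\nuser: ${user}`;
}

const combined = buildPrompt([
  "latest news in Malaysia today",
  "Extract the country, category, and keyword from the following user request",
]);
console.log(combined);
```

Keeping the instruction and the query on separate, labelled lines makes it easier to see in the node's output which part ChatGPT received as the instruction.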

{% endstep %}

{% step %}
**Link the Inputs & Outputs**

* Connect:
  * The **user’s message** from the **Input Node** to the **User Input** of the JavaScript Node.
  * A **Text Area Node** containing the system instruction to the **System Input** of the JavaScript Node.
  * The **JavaScript Node's output** to the **message input** of the ChatGPT Node.
    {% endstep %}

{% step %}
**Test the Setup**

<figure><img src="https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2FE4RrlEjlGcGbuDGbvhDM%2Fimage.png?alt=media&#x26;token=6e71f1fe-6eec-4f89-bd48-a3006e33420a" alt=""><figcaption><p>Workflow integrated with Javascript Node</p></figcaption></figure>

* Type: **"latest news in Malaysia today"**
* ChatGPT should now be able to extract the information.
  {% endstep %}
  {% endstepper %}

## **Step 3: Using a JSON Schema for ChatGPT Extraction**

In the previous step, we saw that the extracted data was incorrect:

```
country: Malaysia
category: news
keyword: latest news today
```

This happens because **ChatGPT doesn’t know exactly how to format its response** based on the Mediastack API requirements.

To **fix this**, we will provide **a JSON Schema** to **standardize the extracted data format**.

{% stepper %}
{% step %}

#### **Prepare the JSON Schema**

We generated the JSON Schema for the [**Mediastack Live News API**](https://mediastack.com/documentation) using ChatGPT and the official JSON Schema standards ([json-schema.org](https://json-schema.org/draft/2020-12/schema)).

Here is an example JSON Schema that **enforces the correct format**:

{% code overflow="wrap" %}

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "sources": {
      "type": "string",
      "description": "Use this parameter to include or exclude one or multiple comma-separated news sources. Example: 'cnn,-bbc' to include CNN and exclude BBC.",
      "pattern": "^[a-zA-Z0-9,-]+$"
    },
    "categories": {
      "type": "string",
      "description": "Use this parameter to include or exclude one or multiple comma-separated news categories. Example: 'business,-sports' to include business and exclude sports.",
      "pattern": "^[a-zA-Z,-]+$"
    },
    "countries": {
      "type": "string",
      "description": "Use this parameter to include or exclude one or multiple comma-separated countries. Example: 'au,-us' to include Australia and exclude the US.",
      "pattern": "^[a-zA-Z,-]+$"
    },
    "languages": {
      "type": "string",
      "description": "Use this parameter to include or exclude one or multiple comma-separated languages. Example: 'en,-de' to include English and exclude German.",
      "pattern": "^[a-zA-Z,-]+$"
    },
    "keywords": {
      "type": "string",
      "description": "Use this parameter to search for sentences or exclude words. Example: 'new movies 2021 -matrix' to search for 'New movies 2021' but exclude 'Matrix'."
    },
    "date": {
      "type": "string",
      "description": "Use this parameter to specify a date or date range. Examples: '2020-01-01' for a specific date or '2020-12-24,2020-12-31' for a date range.",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}(,\\d{4}-\\d{2}-\\d{2})?$"
    },
    "sort": {
      "type": "string",
      "description": "Use this parameter to specify a sorting order. Available values: 'published_desc' (default), 'published_asc', 'popularity'.",
      "enum": [
        "published_desc",
        "published_asc",
        "popularity"
      ],
      "default": "published_desc"
    },
    "limit": {
      "type": "integer",
      "description": "Use this parameter to specify a pagination limit (number of results per page). Default is 25, maximum allowed is 100.",
      "minimum": 1,
      "maximum": 100,
      "default": 25
    },
    "offset": {
      "type": "integer",
      "description": "Use this parameter to specify a pagination offset value. Default is 0, starting with the first available result.",
      "minimum": 0,
      "default": 0
    }
  },
  "additionalProperties": false
}

```

{% endcode %}
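
To get a feel for what the schema accepts and rejects, a few of its constraints can be checked by hand. The sketch below is not a full JSON Schema validator, and `checkExtraction` is a hypothetical helper; it only mirrors the `additionalProperties`, `pattern`, `enum`, and `limit` rules from the schema above.

```javascript
// Hand-written checks for a subset of the schema's constraints (illustration only).
const LIST_PATTERN = /^[a-zA-Z,-]+$/; // categories, countries, languages
const DATE_PATTERN = /^\d{4}-\d{2}-\d{2}(,\d{4}-\d{2}-\d{2})?$/; // date or date range
const ALLOWED_KEYS = ["sources", "categories", "countries", "languages",
  "keywords", "date", "sort", "limit", "offset"];
const SORT_VALUES = ["published_desc", "published_asc", "popularity"];

function checkExtraction(data) {
  const errors = [];
  // additionalProperties: false, so unknown keys are rejected.
  for (const key of Object.keys(data)) {
    if (!ALLOWED_KEYS.includes(key)) errors.push(`unexpected property: ${key}`);
  }
  for (const key of ["categories", "countries", "languages"]) {
    if (key in data && !(typeof data[key] === "string" && LIST_PATTERN.test(data[key]))) {
      errors.push(`${key} must be a comma-separated list of letters`);
    }
  }
  if ("date" in data && !(typeof data.date === "string" && DATE_PATTERN.test(data.date))) {
    errors.push("date must be YYYY-MM-DD or a comma-separated range");
  }
  if ("sort" in data && !SORT_VALUES.includes(data.sort)) {
    errors.push("sort must be one of the allowed values");
  }
  if ("limit" in data && !(Number.isInteger(data.limit) && data.limit >= 1 && data.limit <= 100)) {
    errors.push("limit must be an integer between 1 and 100");
  }
  return errors;
}

// Output shaped like the schema: passes.
const good = checkExtraction({ categories: "sports", countries: "jp", date: "2023-12-07" });
// Free-form output like the one from Step 2: every key is rejected.
const bad = checkExtraction({ country: "Malaysia", category: "news", keyword: "latest news today" });
console.log(good);
console.log(bad);
```

The second call shows why the Step 2 output cannot be passed along as-is: none of its property names match the parameters the schema describes.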

{% endstep %}

{% step %}

#### **Integrate the JSON Schema into ChatGPT**

1. **Add a "Const Data" Node** to store the JSON Schema.
   * Double-click an empty space in the editor, search **Data**, and add it.
   * Paste the JSON Schema into the **Const Data** Node.
2. **Modify the System Prompt**
   * Change the prompt in the **Text Area Node** to:

     ```
     Extract the mediastack data from the following user request
     ```
3. **Connect the Schema to the ChatGPT Node**
   * Link the **Const Data (JSON Schema Node)** to the **Schema input** of the ChatGPT Node.
   * Now, ChatGPT will **refer to the schema** before generating the extracted response.

{% endstep %}

{% step %}

#### **Test the Result**

<figure><img src="https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2FJESnHwRTe19bVTnZGJxs%2Fimage.png?alt=media&#x26;token=bc509cbe-ca02-4b5e-ac89-34ce6f75e604" alt=""><figcaption><p>Agent setup with correct schema</p></figcaption></figure>

1. Send a message like:

   ```
   latest sports news in Japan today
   ```
2. The **correct output should now be**:

   <pre class="language-json"><code class="lang-json">{
   <strong>    "categories": "sports",
   </strong>    "countries": "jp",
       "date": "2023-12-07"
   }
   </code></pre>

{% endstep %}
{% endstepper %}

## **Step 4: Connect everything together**

Next, we’ll **connect the structured data to the Mediastack API** and dynamically fetch news based on user input!

**ChatGPT outputs JSON as a string** rather than an actual JSON object, so we need to **parse the stringified JSON** before passing it to Mediastack.

{% stepper %}
{% step %}
**Create a JavaScript Node to Parse JSON**

<figure><img src="https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2FfXVpZdjy05gxEGXmCf94%2Fimage.png?alt=media&#x26;token=594fb1df-85cd-4df4-ae52-17c516172016" alt=""><figcaption><p>Javascript Node to parse JSON</p></figcaption></figure>

1. **Double-click an empty space** in the editor and search for **JavaScript**.
2. **Rename it to "Parse JSON"** for clarity.
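3. Set the **Code**, **Inputs**, and **Outputs** as shown in the screenshot above: one input (`input`) for the JSON string and one output (`output`) for the parsed object.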
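
The node's code appears only in the screenshot, so the following is a sketch of the parsing logic rather than the exact code used there. `parseModelJson` is a hypothetical name, and the empty-object fallback is an assumption about how you may want to handle replies that are not valid JSON. Inside the node itself, the equivalent would probably be a single `return` over `inputs[0]`, assuming the same convention as the System Prompt node.

```javascript
// Sketch of the "Parse JSON" logic as a plain function that can be run anywhere.
// Assumption: the model's reply arrives as one string.
function parseModelJson(text) {
  try {
    const value = JSON.parse(text);
    // Only accept plain objects; arrays, numbers, and null are not usable as filters.
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      return value;
    }
    return {};
  } catch (err) {
    // Invalid JSON: return an empty object so downstream nodes get no filters
    // instead of failing outright.
    return {};
  }
}

const parsed = parseModelJson('{"categories":"sports","countries":"jp","date":"2023-12-07"}');
console.log(parsed.countries);

const fallback = parseModelJson("Sorry, I could not find a country.");
console.log(fallback);
```

Returning an empty object on failure is one possible design choice; you might prefer to surface the error instead so that a malformed reply is easier to spot while testing.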
   {% endstep %}

{% step %}

#### **Link the Nodes**

1. **Connect ChatGPT's `response` output** → **Parse JSON Node’s `input`**.
2. **Parse JSON Node’s `output`** → **Mediastack News Node’s `countries`, `category`, and `keywords` inputs**.
3. **Use an Object Property Node** (if needed) to extract specific fields from the parsed JSON.
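
To make the data flow concrete, the sketch below picks individual fields from the parsed object and shows how they would look as Mediastack query parameters. It is an illustration only: inside the editor, the Mediastack News Node makes the request for you. `toQuery` is a hypothetical helper, `YOUR_ACCESS_KEY` is a placeholder, and the endpoint and `access_key` parameter are taken from the public Mediastack documentation, so check them against your plan.

```javascript
// Sketch: extract the fields Mediastack needs and build the matching query string.
// Assumptions: endpoint and parameter names follow the public Mediastack docs.
function toQuery(parsed, accessKey) {
  const params = new URLSearchParams({ access_key: accessKey });
  for (const field of ["countries", "categories", "keywords", "date"]) {
    const value = parsed[field];
    // Skip fields the model did not extract, so they are not sent as empty filters.
    if (typeof value === "string" && value !== "") {
      params.set(field, value);
    }
  }
  return `http://api.mediastack.com/v1/news?${params.toString()}`;
}

const url = toQuery({ categories: "sports", countries: "my" }, "YOUR_ACCESS_KEY");
console.log(url);
```

Fields that are missing from the parsed object are simply left out, which matches the schema: none of its properties are required.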
   {% endstep %}

{% step %}

#### **Test the Workflow**

1. Send: **"latest sports news in Malaysia today"**
2. Mediastack should now **correctly receive** the extracted country, category, and keyword, and reply with relevant news.
   {% endstep %}
   {% endstepper %}

## Summary

We have now successfully created an agent that replies with news based on the user's prompt.

<figure><img src="https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2FoTzDYlKl1XeFmsZma0YA%2Fimage.png?alt=media&#x26;token=53729bb4-b546-4bc5-b7a1-79710b26f91c" alt=""><figcaption><p>New Reporter Agent Setup</p></figcaption></figure>

Here is the **exported JSON file** containing the full workflow setup. You can import it into the editor to instantly recreate the agent.

{% file src="<https://4116416135-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUjtYjZk0Li2m5r1hpmEd%2Fuploads%2FEFhPRknhCIdGvfx6igF0%2Fgraph-1738083218569.json?alt=media&token=65663874-a876-4ff1-a747-204712b23857>" %}
Exported Agent's JSON File
{% endfile %}
