How to transform CSV to JSON using transformers

Table of Contents

  • Introduction
  • Guide

Introduction

Alumio offers functionality to decode/parse CSV within subscribers (incoming configuration). It is available when you use “HTTP Subscriber” or “Filesystem: file subscriber”. It allows you to automatically convert the CSV format from the file/response returned from the subscribers into JSON format.

Now, what about doing so in Transformers? Currently, there is no dedicated transformer/mapper to directly transform or deserialize CSV to JSON. However, it is still possible to achieve it by utilizing mappers and JMESPath.

Guide

For example, you receive a CSV format, such as below.

id,sku,name,qty
1,ABC123,Product X,23
2,ABC456,Product Y,9
3,ABC789,Product Z,121

Let’s say in Alumio you retrieve the above CSV and it’s available as a value of a property, for instance, data.

{
  "data": "id,sku,name,qty\r\n1,ABC123,Product X,23\r\n2,ABC456,Product Y,9\r\n3,ABC789,Product Z,121"
}

Step 1: Split each line by using the “new line” (\r\n) as the delimiter. You can use a Value Mapper with “String: Explode to an array using a PCRE expression”.

Result:

{
  "data": [
    "id,sku,name,qty",
    "1,ABC123,Product X,23",
    "2,ABC456,Product Y,9",
    "3,ABC789,Product Z,121"
  ]
}

Step 2: Split each data from each line by using the delimiter character. In this case, we use comma (,) as the delimiter.

Result:

{
  "data": [
    [
      "id",
      "sku",
      "name",
      "qty"
    ],
    [
      "1",
      "ABC123",
      "Product X",
      "23"
    ],
    [
      "2",
      "ABC456",
      "Product Y",
      "9"
    ],
    [
      "3",
      "ABC789",
      "Product Z",
      "121"
    ]
  ]
}

Step 3: Use a “Value setter” and JMESPath to map the property name to each line.

&{data[*].{"id": [0], "sku": [1], "name": [2], "qty": [3]\}}

Adjust your JMESPath syntax to your needs. The syntax may be longer if you have more fields/columns in the CSV.

Result:

{
  "data": [
    {
      "id": "id",
      "sku": "sku",
      "name": "name",
      "qty": "qty"
    },
    {
      "id": "1",
      "sku": "ABC123",
      "name": "Product X",
      "qty": "23"
    },
    {
      "id": "2",
      "sku": "ABC456",
      "name": "Product Y",
      "qty": "9"
    },
    {
      "id": "3",
      "sku": "ABC789",
      "name": "Product Z",
      "qty": "121"
    }
  ]
}

Step 4: Remove the first array of the data as it only contains the CSV headers.

Result:

{
  "data": {
    "1": {
      "id": "1",
      "sku": "ABC123",
      "name": "Product X",
      "qty": "23"
    },
    "2": {
      "id": "2",
      "sku": "ABC456",
      "name": "Product Y",
      "qty": "9"
    },
    "3": {
      "id": "3",
      "sku": "ABC789",
      "name": "Product Z",
      "qty": "121"
    }
  }
}

Step 5: In order to fix the array index after removing the header, use a Value mapper with the “List: Get values” mapper.

Result:

{
  "data": [
    {
      "id": "1",
      "sku": "ABC123",
      "name": "Product X",
      "qty": "23"
    },
    {
      "id": "2",
      "sku": "ABC456",
      "name": "Product Y",
      "qty": "9"
    },
    {
      "id": "3",
      "sku": "ABC789",
      "name": "Product Z",
      "qty": "121"
    }
  ]
}