Table of contents
- Introduction
- Steps to be implemented
- Retrieving the queue item
- Retrieving a ZIP with XML files
- Reading the ZIP and parsing the XML files
Introduction
In some data routes, it may be necessary to fetch a large number of items over HTTP in a transformer. For example, when working with Microsoft Dynamics, a queue can be used. The first step is to retrieve a queue item that contains a URL for the actual work items. In Alumio, such a scenario can be implemented by configuring an incoming configuration to get the queue item and a transformer to fetch the work items.
In this guide, a Microsoft Dynamics work queue is used to demonstrate a possible implementation, but the same approach also applies to similar scenarios.
Steps to be implemented
Retrieving a queue item
A queue item is fetched from Microsoft Dynamics by calling the /api/connector/dequeue/ API endpoint. The queue item contains a download location such as:
{
"DownloadLocation":"https://<client>.dynamics.com:443/api/connector/download/12345678-1234-1234-1234567890ab"
}
Retrieving a ZIP with XML files
The URL in the download location is fetched. The response is a large ZIP file containing XML files. Each XML file contains a large number of items.
For example, an XML file can look as follows:
<Document>
<CUSTCUSTOMERV3ENTITY>
...
</CUSTCUSTOMERV3ENTITY>
<CUSTCUSTOMERV3ENTITY>
...
</CUSTCUSTOMERV3ENTITY>
</Document>
Each CUSTCUSTOMERV3ENTITY tag contains customer data.
Reading the ZIP and parsing the XML files
The ZIP file is read and the XML files in it are parsed. Each item from the XML files is processed separately to limit memory usage.
Retrieving the queue item
An HTTP subscriber can be used to fetch the queue item.
The response is parsed as JSON. Because the response is just a small JSON file, “Whole file” can be chosen as the read method.
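To illustrate what this step yields, the following minimal Python sketch parses a queue item and extracts the download location. It is not Alumio configuration: the function name extract_download_location and the sample URL are assumptions made for illustration, and only the DownloadLocation key comes from the example above.

```python
import json


def extract_download_location(response_body: str) -> str:
    """Parse a queue item (small JSON document) and return its download URL."""
    # The whole body is read at once, which is fine because it is small.
    queue_item = json.loads(response_body)
    location = queue_item.get("DownloadLocation")
    if not location:
        raise ValueError("Queue item does not contain a DownloadLocation")
    return location


# Hypothetical sample body; the host and ID are placeholders, not real values.
sample_body = '{"DownloadLocation": "https://example.invalid/api/connector/download/sample-id"}'
print(extract_download_location(sample_body))
```

Reading the whole body is reasonable here because the queue item is only a pointer to the data, not the data itself.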
Retrieving a ZIP with XML files
The ZIP in the download location is retrieved in a transformer step. The transformer step will receive the contents of the queue item that was retrieved in the subscriber step. A placeholder can be used to pass the download location as the URL to the HTTP transformer.
To process the items from the response separately, select “Split up: Get items from HTTP request”. This transformer differs from “HTTP transformer” in that it can return parsed items one by one.
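As a rough picture of what fetching a large response without holding it in memory looks like outside Alumio, the sketch below streams a download into a temporary file in fixed-size chunks. The function name download_to_tempfile and the chunk size are assumptions, and authentication, which a real Microsoft Dynamics endpoint would presumably require, is left out.

```python
import shutil
import tempfile
import urllib.request
from typing import BinaryIO


def download_to_tempfile(url: str, chunk_size: int = 64 * 1024) -> BinaryIO:
    """Stream the response body for `url` into a temporary file.

    Only `chunk_size` bytes are held in memory at a time. The returned file
    is rewound to the start so it can be read immediately.
    """
    tmp = tempfile.TemporaryFile()
    with urllib.request.urlopen(url) as response:
        # copyfileobj reads and writes in chunks instead of loading everything.
        shutil.copyfileobj(response, tmp, chunk_size)
    tmp.seek(0)
    return tmp
```

A seekable file is used here because the index of a ZIP archive is stored at the end of the file, so a reader needs random access to the complete download before it can list the entries.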
Reading the ZIP and parsing the XML files
To read the ZIP, the “Archive (zip, tar, gz, bz2)” decoder is configured. This decoder opens the ZIP file and gets a list of the files in it. Multiple file processors can be configured to specify which files should be processed and which parser should be used for them.
For the patterns, *.xml is used so that all XML files in the ZIP are processed. XML is chosen as the parser, with “Incremental (for large files)” as the read method. With this read method, the XML file is read incrementally and the parser looks for opening and closing tags in the XML to return items one by one. The path is configured as Document (the name of the root node in the XML) followed by CUSTCUSTOMERV3ENTITY (the tag for each item in the XML).
The result of this transformer is a list of customer items. Only one customer item is loaded in memory at a time.
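The incremental read method can be pictured with the following Python sketch, which opens a ZIP, selects entries matching *.xml, and yields one CUSTCUSTOMERV3ENTITY item at a time. It only illustrates the technique and is not how Alumio implements it. The function name iter_items and the NAME field in the sample data are assumptions, since the source does not show the contents of an item.

```python
import fnmatch
import io
import xml.etree.ElementTree as ET
import zipfile
from typing import BinaryIO, Dict, Iterator, Optional


def iter_items(
    archive: BinaryIO,
    pattern: str = "*.xml",
    item_tag: str = "CUSTCUSTOMERV3ENTITY",
) -> Iterator[Dict[str, Optional[str]]]:
    """Yield items one by one from every matching XML file in a ZIP archive."""
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not fnmatch.fnmatch(name, pattern):
                continue  # skip entries that do not match the pattern
            with zf.open(name) as xml_file:
                events = ET.iterparse(xml_file, events=("start", "end"))
                _, root = next(events)  # first event is the start of <Document>
                for event, elem in events:
                    if event == "end" and elem.tag == item_tag:
                        # Flatten the direct children into a simple dict.
                        yield {child.tag: child.text for child in elem}
                        # Drop already processed items so memory stays flat.
                        root.clear()


# Build a small in-memory ZIP with hypothetical customer data for the demo.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo_zip:
    demo_zip.writestr(
        "customers.xml",
        "<Document>"
        "<CUSTCUSTOMERV3ENTITY><NAME>Alice</NAME></CUSTCUSTOMERV3ENTITY>"
        "<CUSTCUSTOMERV3ENTITY><NAME>Bob</NAME></CUSTCUSTOMERV3ENTITY>"
        "</Document>",
    )
    demo_zip.writestr("readme.txt", "not an XML file, so it is skipped")

buffer.seek(0)
for item in iter_items(buffer):
    print(item)
```

Because iter_items is a generator, the caller receives each item as soon as its closing tag has been read, and the parsed tree is cleared after every item instead of growing with the size of the file.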