Combining multiple items to create batches

Table of contents

  • Introduction
  • Example configuration
  • Setting an entity schema

Introduction

When processing a set of items in Alumio, it is convenient to process the items one by one, because transformers then only need to be configured to handle a single item. For efficiency, however, it may be necessary to combine multiple items into one task.

Combining items can be done with the “Create batches by combining items” transformer.

A route that uses this transformer can, for example, have the following components:

  • A subscriber that retrieves a list of items and returns them one by one. For example, a filesystem subscriber that reads items from a file or an HTTP subscriber.
  • A number of transformers to transform the items
  • The transformer to create batches

Example configuration

In the following example, the “Create batches by combining items” transformer has been configured to combine 5 items into one. The combined items are placed in a property called combined, and the resulting item gets “Default entity” as its entity type.

Example input

Each line is an input item of the transformer.

{"id": 1}
{"id": 2}
{"id": 3}
{"id": 4}
{"id": 5}

Example output

A single item.

{
    "combined": [
        {"id": 1},
        {"id": 2},
        {"id": 3},
        {"id": 4},
        {"id": 5}
    ]
}

Note that when fewer input items remain than the configured batch size, a batch is still created with the items that were received. For example, if the batch size is 5 and there are 12 input items, the resulting batches will contain 5, 5 and 2 items.
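The grouping behaviour described above can be sketched in Python. This is only an illustration of the batching logic under the stated example (batch size 5, property name combined), not Alumio's implementation; the function name create_batches and its parameters are hypothetical.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def create_batches(
    items: Iterable[Dict[str, Any]],
    batch_size: int,
    property_name: str = "combined",
) -> Iterator[Dict[str, List[Dict[str, Any]]]]:
    """Group incoming items into batches of at most `batch_size` items.

    Hypothetical helper that mimics the behaviour described in this article.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # Take up to batch_size items; the last batch may be smaller.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        yield {property_name: batch}


items = [{"id": n} for n in range(1, 13)]  # 12 input items
batches = list(create_batches(items, batch_size=5))
sizes = [len(batch["combined"]) for batch in batches]
print(sizes)  # [5, 5, 2]
```

The final batch holds only the two remaining items, matching the 5, 5 and 2 split in the example.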

Setting an entity schema

An entity schema can be set on the combined items. For example, an entity schema could be configured to create a combined identifier for the items.

This schema uses join(',', map(&to_string(@), combined.id)) as the identifier path, which is a JMESPath expression to create identifiers such as 1,2,3,4,5.
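To make the intended result of the identifier path concrete, the following plain-Python sketch converts each id in the combined property to a string and joins the strings with commas. It does not evaluate JMESPath and is not part of Alumio; combined_identifier is a hypothetical name used only for illustration.

```python
from typing import Any, Dict, List


def combined_identifier(item: Dict[str, List[Dict[str, Any]]]) -> str:
    """Build a comma-separated identifier from the ids of the combined items."""
    # Convert each id to a string, then join the strings with commas.
    return ",".join(str(entry["id"]) for entry in item["combined"])


batch = {"combined": [{"id": n} for n in range(1, 6)]}
print(combined_identifier(batch))  # 1,2,3,4,5
```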

In the task list the items will be shown as follows: