Memory issues / "Garbage collection" processes

floris · January 30, 2025, 9:57am

Hi Alumio,

Over the past 12 months I’ve worked on quite a few memory issues. It’s hard to see precisely what causes them, because you can’t really see the memory impact of each individual transformer.

I’ve now found some strange behaviour: two “functionally identical” setups use very different amounts of memory.

Attached are two files:

A configuration that “doesn’t work”; this is the “single-level” one.
A configuration that does work and uses only a tiny amount of memory; this is the multi-level one.

The configuration saves a large entity (500 key/value pairs) into a storage. It then generates an array of 500 or more objects. Then it loops through those objects and, for each one, retrieves the (large) storage entity, moves just one key from it into the object, and removes the entity again.
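To make the flow concrete, here is a rough sketch in Python. This is not Alumio configuration and says nothing about how Alumio implements it; `build_storage`, `process`, and the key names are made up purely for illustration.

```python
import copy


def build_storage(num_keys=500):
    # Hypothetical stand-in for the large storage entity (500 key/value pairs).
    return {f"key_{i}": f"value_{i}" * 10 for i in range(num_keys)}


def process(num_items, storage):
    results = []
    for i in range(num_items):
        item = {"id": i}
        # Retrieve the (large) storage entity into the object.
        item["storage"] = copy.deepcopy(storage)
        # Move just one key from it into the object...
        item["picked"] = item["storage"][f"key_{i % len(storage)}"]
        # ...and remove the large entity again.
        del item["storage"]
        results.append(item)
    return results


results = process(1000, build_storage())
```

In a sketch like this, only one copy of the large entity is alive at any time, so memory use should stay roughly flat no matter how many objects are processed.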

Theoretically, Alumio should be able to handle thousands of these. After all, you ‘remove’ the large storage entity as soon as you’ve moved just one key from it.

However, it turns out that the first configuration (the “single-level” one) maxes out at about 500 objects (500 works, 1,000 no longer does). I’ve set the other configuration to 100,000, and that works fine.

The two configurations are functionally identical. They differ only in the “nesting” of certain processes (using “Chain” and “Execute Entity Transformers”).

My guess is that there’s a garbage collection (memory cleanup) process that gets triggered by some combinations of factors but not by others.
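As an illustration of the kind of effect I mean (purely hypothetical, and not a claim about what Alumio does internally), here is a Python sketch in which a “flat” loop keeps a reference to every retrieved entity until the end, while a “nested” variant does the work in a helper whose local data becomes unreachable as soon as it returns. The names `make_entity`, `flat`, `nested`, and `peak_bytes` are made up; `tracemalloc` is Python’s standard memory tracer.

```python
import tracemalloc


def make_entity(num_keys=500):
    # Hypothetical large storage entity; every call allocates fresh strings.
    return {f"key_{i}": "x" * 100 + str(i) for i in range(num_keys)}


def flat(num_items):
    retained = []  # simulates intermediate data that is never released
    out = []
    for i in range(num_items):
        entity = make_entity()
        out.append(entity[f"key_{i % 500}"])
        retained.append(entity)  # reference stays alive until the function ends
    return out


def nested(num_items):
    def step(i):
        entity = make_entity()
        # Only one value is returned; the entity itself becomes unreachable.
        return entity[f"key_{i % 500}"]

    return [step(i) for i in range(num_items)]


def peak_bytes(fn, num_items):
    # Measure the peak traced allocation while fn runs.
    tracemalloc.start()
    fn(num_items)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


flat_peak = peak_bytes(flat, 50)
nested_peak = peak_bytes(nested, 50)
```

Both functions return the same values, but the flat variant’s peak grows with the number of objects while the nested one stays close to the size of a single entity. Whether something comparable is happening inside Alumio is exactly what I can’t see from the outside.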

I think there’s some optimization to be done here. Otherwise, I’d like to get some clarity on how we should structure stuff like this to optimize it ourselves, because right now I’m kinda in the dark. I know the ‘nesting’ works, but I don’t know what exact thing is solving my issue.

single-level.ndjson (11.7 KB)
multi-level-config.ndjson (13.3 KB)

Thanks!

m.arief · February 5, 2025, 8:30am

Hi @floris

Thank you so much for your valuable input and your patience. We’ve checked and tested the transformers you provided and were able to reproduce the difference in performance between the single-level and multi-level cases. The two are processed differently and, as you mentioned, it is possible that memory is not released properly in the single-level case. We will therefore forward this information to the relevant team for further analysis.