I’m wondering about something. Generally I like to use this pattern for getting entities (doesn’t really matter what) from a REST API endpoint:
- Value Setter to set datetime in accepted format for the REST API endpoint, setting it back for example 4 hours
- Get From Storage where I get the datetime value from the storage
On default, so JSON/Whole File, not inputting a pattern
- Value Setter, date is now or for example 1 minute ago or a few seconds ago
- Save to Storage with the new datetime
- Get Branches to the entities
This pattern works great for most use cases. Because the ‘save to storage’ step for the new datetime sits on the entity transformer side, it only runs after the API call has succeeded. So when the other API is unavailable or unresponsive for a few hours, the next run will still execute using the older date, and you’re not missing any updates.
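The steps above can be sketched in plain Python. This is only an illustration of the ordering, not the platform's own API: the `storage` dict, the `fetch_entities` callable, and the `last_sync` key are hypothetical stand-ins for the storage, the HTTP call, and the stored value.

```python
from datetime import datetime, timedelta, timezone

STORAGE_KEY = "last_sync"  # hypothetical storage key


def run_sync(storage, fetch_entities, now=None):
    """One run of the whole-file pattern; the checkpoint only moves after a successful fetch."""
    now = now or datetime.now(timezone.utc)
    # Default checkpoint: 4 hours back, used when nothing is stored yet.
    since = now - timedelta(hours=4)
    # Prefer the stored checkpoint when one exists.
    if STORAGE_KEY in storage:
        since = datetime.fromisoformat(storage[STORAGE_KEY])
    # New checkpoint: slightly in the past (1 minute here, as in the list above).
    new_checkpoint = now - timedelta(minutes=1)
    # Call the API. If this raises, the save below is never reached.
    entities = fetch_entities(since)
    # Save the new checkpoint only after the call succeeded.
    storage[STORAGE_KEY] = new_checkpoint.isoformat()
    return entities
```

If `fetch_entities` raises because the API is down, `storage` keeps the older datetime, so the next run asks for the same window again.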
However, I’ve recently started using the Response Decoder: JSON / Incremental, because it is significantly less memory-hungry. The pattern above doesn’t work anymore in that case, since it treats each entity as its own separate page. So although it still runs, if you get 1,000 entities, you’ll save the datetime to storage 1,000 times. And if processing takes a long while, you’ll save a datetime that is significantly ‘wrong’.
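A small simulation of that effect, under stated assumptions: `process_incrementally`, the `storage` dict, and the injectable `clock` are hypothetical names, and the per-entity save is modeled as a plain dict write rather than a real storage call.

```python
from datetime import datetime, timedelta, timezone


def process_incrementally(entities, storage, clock):
    """Runs the save step once per entity, as when each entity is its own page."""
    writes = 0
    for _ in entities:
        # "Now minus 1 minute" is recomputed for every entity, so it drifts
        # further away from the moment the request was made.
        storage["last_sync"] = (clock() - timedelta(minutes=1)).isoformat()
        writes += 1
    return writes
```

With 1,000 entities and a clock that advances one second per entity, this performs 1,000 writes, and the last stored value is roughly 15 minutes later than the moment the request was made.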
So, my question is: what is your suggested best practice for this?
Hey @Gugi ,
Could you check out this topic? I’m wondering what the best-practice would be for this very common pattern.
We apologize for missing this topic.
I think your approach is what we would suggest as a best practice. As you already know, it will write the datetime to the storage entity as many times as there are entities subscribed.
However, if each entity has a last modified datetime property, we would suggest saving that last modified datetime instead of a relative datetime (1 minute ago or a few seconds ago). Whether this works depends on whether the REST API endpoint lets us filter the entities by last modified datetime (greater than the saved datetime). This approach avoids an invalid datetime filter on the next run, as it no longer depends on processing time.
Feel free to let us know if I missed anything.
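A minimal sketch of that suggestion, assuming each entity exposes an ISO 8601 `last_modified` field and that all values use the same timezone format. The field name, the `storage` dict, and the key are hypothetical.

```python
from datetime import datetime


def save_last_modified(entity, storage, key="last_modified_checkpoint"):
    """Advance the checkpoint to this entity's last-modified time if it is newer."""
    modified = datetime.fromisoformat(entity["last_modified"])  # hypothetical field name
    current = storage.get(key)
    # Only write when the checkpoint actually moves forward.
    if current is None or modified > datetime.fromisoformat(current):
        storage[key] = modified.isoformat()
```

Called once per entity, this leaves the newest last-modified value in storage regardless of how long processing takes, and the next run can filter on values greater than it.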
Saving the relative date seems a good option, yes!
I’m wondering, is saving to storage, possibly thousands of times, an issue?
It could be time-consuming. I tried saving a value for 8,000 separate entities, and it took 9 seconds to finish. Of course, doing it thousands of times will take much longer. It also depends on the specifications of the server. Would you please let us know whether you have any feedback for us about it?
Hi @Gugi ,
I think the best solution would be - but I’m not sure this is architecturally possible - a sort of ‘wrap around’ transformer which doesn’t interact with the body data anymore (since that would take a lot of memory etc.).
So the current process is:
- Call system (once)
- Paginate on pages (multiple times)
- Branch to entities (thousands of times)
It would be great if we had an extra “setting” for the HTTP transformer that allows us to write a datetime stamp to a storage (note: it should be the datetime stamp of the first call to the system), but only write it once, after all processes are done (so just before the process ‘finishes’, to make sure you didn’t get any errors).
I can’t be sure, but I think on a process-level this should be possible. Especially if you don’t read from the data.
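To make the idea concrete, here is a sketch of the requested behaviour in plain Python. It is not an existing platform feature; `run_with_single_checkpoint`, `stream_entities`, `handle`, and the `last_sync` key are all hypothetical names.

```python
from datetime import datetime, timezone


def run_with_single_checkpoint(storage, stream_entities, handle, key="last_sync", clock=None):
    """Stream entities one by one, then write the start timestamp exactly once."""
    clock = clock or (lambda: datetime.now(timezone.utc))
    started_at = clock()  # timestamp of the first call to the system
    since = storage.get(key)
    count = 0
    for entity in stream_entities(since):
        handle(entity)  # an exception here aborts the run before the write below
        count += 1
    # Reached only when every entity was processed without errors.
    storage[key] = started_at.isoformat()
    return count
```

The wrapper never inspects the entity bodies. It only records the start time and writes it after the loop, so the number of storage writes stays at one regardless of how many entities are branched.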
We appreciate your valuable feedback.
Could you please confirm whether I understood it correctly? You need to subscribe to all the entities and branch to them (automatically, due to the Incremental read method), but only write the timestamp when it’s the last entity or just before the whole process finishes.
Please let me know if I missed anything.
Thank you for confirming. As you may already know, that is not something we support at the moment. Therefore, we would like to pass this on to the team to see whether it can be implemented in the future.
We will get back to you once we have an update.