Hi @yoeri,
Please follow the steps below to achieve your objective.
For instance, suppose you have the following entity:
{
  "shopifyProduct": {
    "images": {
      "nodes": [
        {
          "url": "<url>/files/40785087-1.jpg?v=1729606668"
        },
        {
          "url": "<url>/files/40785087-2.jpg?v=1729606668"
        }
      ]
    }
  },
  "media": [
    {
      "originalSource": "<url>/16411820149/40785087-1.jpg"
    },
    {
      "originalSource": "<url>/16411820150/40785087-3.jpg"
    }
  ]
}
1. First, copy the URLs of the existing media to a separate array, i.e., existingMedia, using “Copy using a pattern”.
2. Extract only the filename from each URL using the “String: File basename” mapper, and remove the version parameter using “String: PCRE replace” and “String: Replace”. For your information, since the “String: PCRE replace” mapper requires a replacement text, you can enter any rarely used character there and then use “String: Replace” to replace that character with an empty string.
3. Copy the array of formatted URLs to each object in the media array using “Recursively copy values to children”.
4. Loop over the media array using “Node, transform nodes”. If you can’t find this entity transformer, it means that you are inside a “Data, transform nodes using mappers and conditions”. In that case, use “Execute entity transformers” first and then select “Node, transform nodes”, as shown below.
5. Within each object, create a new variable that holds the filename from the URL (originalSource) using the “String: File basename” mapper (point 1 in the picture below).
6. Since each object now has the list of media filenames that exist in Shopify (as a result of step 3), you can check whether the current media filename (file) exists in that list (existingMedia) using the JMESPath function contains (point 2 in the picture below). The result is a boolean: true means the media exists in the Shopify product and needs to be removed from the media array, while false means it shouldn’t be removed.
7. All objects in the media array now have a property exists: true/false. Leave the loop and use a “Value Setter” with a JMESPath query to filter out the ones with exists: true.
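As a rough illustration of the two-pass replacement used to remove the version parameter, here is a minimal Python sketch. It is not Alumio configuration: the pattern `\?v=\d+$` and the placeholder character `~` are assumptions chosen for this example, and Python's `re` module only stands in for the PCRE mapper.

```python
import re

# Hypothetical placeholder: any character that never occurs in your file names.
PLACEHOLDER = "~"


def strip_version(file_name: str) -> str:
    """Remove a trailing ?v=<digits> parameter in two passes."""
    # Pass 1 (like "String: PCRE replace"): swap the version parameter
    # for the placeholder, because a replacement text must be filled in.
    marked = re.sub(r"\?v=\d+$", PLACEHOLDER, file_name)
    # Pass 2 (like "String: Replace"): replace the placeholder with an empty string.
    return marked.replace(PLACEHOLDER, "")


names = [
    strip_version(name)
    for name in ["40785087-1.jpg?v=1729606668", "40785087-2.jpg?v=1729606668"]
]
print(names)
```

In plain Python the first pass could use an empty replacement directly; the two passes only mirror the constraint described for the mapper.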
The result of the above transformers is the entity below:
{
  "shopifyProduct": {
    "images": {
      "nodes": [
        {
          "url": "<url>/files/40785087-1.jpg?v=1729606668"
        },
        {
          "url": "<url>/files/40785087-2.jpg?v=1729606668"
        }
      ]
    }
  },
  "media": [
    {
      "originalSource": "<url>/16411820150/40785087-3.jpg"
    }
  ],
  "existingMedia": [
    "40785087-1.jpg",
    "40785087-2.jpg"
  ]
}
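If it helps to check the logic outside the platform, the same flow can be sketched in plain Python. This is only an approximation of what the transformers do: the function names are made up for this example, `https://example.com` is a hypothetical stand-in for the `<url>` placeholder, and the version parameter is dropped by parsing the URL rather than by a regex replace.

```python
from posixpath import basename
from urllib.parse import urlsplit


def file_name(url: str) -> str:
    """Return the last path segment of a URL, ignoring any query string."""
    return basename(urlsplit(url).path)


def remove_existing_media(entity: dict) -> dict:
    # Collect the file names of the images already on the Shopify product.
    existing = [
        file_name(node["url"])
        for node in entity["shopifyProduct"]["images"]["nodes"]
    ]

    # Mark every media object with its own file name and an "exists" flag.
    marked = []
    for media in entity["media"]:
        name = file_name(media["originalSource"])
        marked.append({**media, "file": name, "exists": name in existing})

    # Keep only the media that is not on the product yet, without the helper keys.
    remaining = [
        {key: value for key, value in media.items() if key not in ("file", "exists")}
        for media in marked
        if not media["exists"]
    ]
    return {**entity, "media": remaining, "existingMedia": existing}


# Hypothetical host; the file names match the example entity.
entity = {
    "shopifyProduct": {"images": {"nodes": [
        {"url": "https://example.com/files/40785087-1.jpg?v=1729606668"},
        {"url": "https://example.com/files/40785087-2.jpg?v=1729606668"},
    ]}},
    "media": [
        {"originalSource": "https://example.com/16411820149/40785087-1.jpg"},
        {"originalSource": "https://example.com/16411820150/40785087-3.jpg"},
    ],
}

result = remove_existing_media(entity)
print(result["media"])
print(result["existingMedia"])
```

Only the `-3.jpg` media remains, because the other file name already appears in the product's image list.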
We hope this explanation is clear. We have also attached a sample entity transformer that implements the steps above.
export_20241025161824.ndjson (2.2 KB)
Feel free to let us know if you have any questions.