We have a number of running regex jobs that pull from a growing number of CSV files. We try to avoid duplicate data as much as we can, but we often end up with repeated rows. Regex should handle this via its union process, but because the regex jobs import each row's index within its source CSV, these rows do not get eliminated in the union. Is there a way within the regex IO job to tell it not to generate the index column? I couldn’t find an answer in the documentation, so I came here.
Hey @pmcgavick, there is a pretty simple way to ignore the index column. In the job configuration, under the input step, you can add an ignores parameter alongside use: mitto.iov2.transform#ExtraColumnsTransform
That should look something like this:
{
    ignores: [
        $.__index__
    ]
    use: mitto.iov2.transform#ExtraColumnsTransform
}
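To see why the index column gets in the way of the union, here is a rough Python sketch. It is not how the job is implemented internally; the rows, the union helper, and its ignores argument are all made up for illustration, and the only thing borrowed from the config above is the __index__ column name.

```python
# Hypothetical rows as they might look after import. The "__index__" field
# stands in for the per-file row index that the job adds.
rows_file_a = [
    {"__index__": 0, "id": "A1", "amount": 10},
    {"__index__": 1, "id": "B2", "amount": 20},
]
rows_file_b = [
    {"__index__": 0, "id": "B2", "amount": 20},  # same data as file A, row 1
    {"__index__": 1, "id": "C3", "amount": 30},
]


def union(*sources, ignores=()):
    """Set-style union: keep the first occurrence of each distinct row.

    Columns named in `ignores` are dropped before rows are compared.
    This helper is an illustration only, not part of any real library.
    """
    seen = set()
    result = []
    for source in sources:
        for row in source:
            kept = {k: v for k, v in row.items() if k not in ignores}
            key = tuple(sorted(kept.items()))
            if key not in seen:
                seen.add(key)
                result.append(kept)
    return result


# With the index column present, the two "B2" rows differ and both survive.
with_index = union(rows_file_a, rows_file_b)

# With the index column ignored, the two "B2" rows are identical and collapse.
without_index = union(rows_file_a, rows_file_b, ignores=("__index__",))

print(len(with_index))     # 4
print(len(without_index))  # 3
```

The two B2 rows carry identical data but different index values, so a row-level comparison treats them as distinct until that column is left out.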
Let me know how this works, or if you need further assistance!