Mitto v2.9 Sneak Peek - IncludeTransform and ExcludeTransform

Mitto v2.9 introduces two new transforms for IO jobs:

  • IncludeTransform - Keeps only the columns you list from the data piped from the input; every other column is dropped.
  • ExcludeTransform - Drops the columns you list from the data piped from the input; every other column is kept.

Potential use cases:

  • Narrow down a wide dataset when you know you don’t need all the fields
  • Remove fields with personally identifiable information

Because these transforms run as part of an IO job's steps, they work with any input (e.g., APIs, databases, and files).
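
To make the semantics concrete, here is a minimal Python sketch of what the two transforms do to each record. It is not Mitto's implementation: the function names include_keys and exclude_keys are hypothetical, and the email field is invented to stand in for personally identifiable information.

```python
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def include_keys(records: Iterable[Record], keys: List[str]) -> Iterator[Record]:
    """Yield records containing only the listed keys (hypothetical helper)."""
    wanted = set(keys)
    for record in records:
        yield {k: v for k, v in record.items() if k in wanted}


def exclude_keys(records: Iterable[Record], keys: List[str]) -> Iterator[Record]:
    """Yield records with the listed keys removed (hypothetical helper)."""
    unwanted = set(keys)
    for record in records:
        yield {k: v for k, v in record.items() if k not in unwanted}


# Illustrative records; "email" is an assumed field, not part of the CSV example.
records = [
    {"id": 1, "name": "Justin", "email": "justin@example.com"},
    {"id": 2, "name": "Matt", "email": "matt@example.com"},
]

narrowed = list(include_keys(records, ["id"]))     # narrow a wide dataset
scrubbed = list(exclude_keys(records, ["email"]))  # remove a sensitive field
print(narrowed)
print(scrubbed)
```

Both helpers work on any iterable of dictionaries, which mirrors the point that the transforms do not depend on where the records came from.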

IncludeTransform Example

Let’s take a simple CSV as an example:

id,name
1,Justin
2,Matt

A standard IO job using this CSV as an input and a database table as an output would result in this database table:

__index__  id  name
1          1   Justin
2          2   Matt

By adding the IncludeTransform and specifying only id in the keys, the output table becomes this:

id
1
2

Only the id column is included. Because __index__ is not listed in the keys, it is dropped along with name.

Here’s the job’s modified “input” step, with IncludeTransform added as the first transform:

"steps": [
    {
      "use": "mitto.iov2.steps#Input",
      "transforms": [
        {
          "use": "mitto.iov2.transform#IncludeTransform",
          "keys": [
            "id"
          ]
        },
        {
          "use": "mitto.iov2.transform#ExtraColumnsTransform"
        },
        {
          "use": "mitto.iov2.transform#ColumnsTransform"
        }
      ]
    },
    ...
]

ExcludeTransform Example

Using the same CSV, adding the ExcludeTransform and specifying only name in the keys produces this output table:

__index__  id
1          1
2          2

The name column is excluded. Every other column, including __index__, is kept.
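
The difference between the two output tables can be reproduced with a short Python sketch over the sample CSV. This is only an illustration, not how Mitto processes data; in particular, it assumes that each row already carries an __index__ value by the time the transform runs, which is inferred from the tables shown in this post.

```python
import csv
import io

CSV_TEXT = "id,name\n1,Justin\n2,Matt\n"

# Assumption: rows already have __index__ attached before the transforms run.
rows = [
    {"__index__": i, **row}
    for i, row in enumerate(csv.DictReader(io.StringIO(CSV_TEXT)), start=1)
]

# Include-style filter: keep only the listed keys, so __index__ is dropped too.
included = [{k: v for k, v in row.items() if k in {"id"}} for row in rows]

# Exclude-style filter: drop only the listed keys, so __index__ survives.
excluded = [{k: v for k, v in row.items() if k not in {"name"}} for row in rows]

print(included)
print(excluded)
```

Under that assumption, the include filter leaves only id, while the exclude filter leaves both __index__ and id, matching the two tables.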

Here’s the job’s modified “input” step, with ExcludeTransform added as the first transform:

"steps": [
    {
      "use": "mitto.iov2.steps#Input",
      "transforms": [
        {
          "use": "mitto.iov2.transform#ExcludeTransform",
          "keys": [
            "name"
          ]
        },
        {
          "use": "mitto.iov2.transform#ExtraColumnsTransform"
        },
        {
          "use": "mitto.iov2.transform#ColumnsTransform"
        }
      ]
    },
    ...
]
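
Since the two “input” steps differ only in the transform name and its keys, the step can also be generated rather than edited by hand. The following Python sketch builds the same structure shown above; the input_step helper is hypothetical and not part of Mitto, and the remaining steps of the job (elided above) are left out.

```python
import json
from typing import Any, Dict, List


def input_step(transform: str, keys: List[str]) -> Dict[str, Any]:
    """Build an "input" step with the key-filtering transform listed first.

    Hypothetical helper; the "use" strings are copied from the examples above.
    """
    if transform not in ("IncludeTransform", "ExcludeTransform"):
        raise ValueError(f"unsupported transform: {transform}")
    return {
        "use": "mitto.iov2.steps#Input",
        "transforms": [
            {"use": f"mitto.iov2.transform#{transform}", "keys": keys},
            {"use": "mitto.iov2.transform#ExtraColumnsTransform"},
            {"use": "mitto.iov2.transform#ColumnsTransform"},
        ],
    }


step = input_step("ExcludeTransform", ["name"])
print(json.dumps(step, indent=2))
```

The printed JSON can be pasted in place of the first entry of the job's "steps" list.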