Before Mitto v2.9, IO jobs could output data from APIs, databases, and files to relational databases (the typical behavior) and to delimited flat files (e.g. CSV, TSV).
Mitto v2.9 introduces two new file outputs for IO jobs: JSON and JSON lines.
These outputs enable several new use cases for IO jobs:
- Output raw API data directly to JSON or JSON lines. This is especially useful when exploring data from a new API and trying to understand the structure of potentially nested data.
- Output database data directly to JSON or JSON lines.
- Convert files (e.g. CSV, TSV, JSON, JSON lines) to JSON or JSON lines.
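To make the difference between the two output formats concrete, here is a minimal Python sketch, independent of Mitto. A JSON file holds one document (here, an array of records), while a JSON lines file holds one JSON object per line. The sample records and the function names are hypothetical and are not taken from Mitto or from the example files used below.

```python
import json

# Hypothetical sample records with one nested field.
records = [
    {"name": "Rex", "owner": {"first": "Ada"}},
    {"name": "Tom", "owner": {"first": "Bo"}},
]


def to_json(rows):
    """Serialize all rows as a single JSON array (one document)."""
    return json.dumps(rows)


def to_jsonl(rows):
    """Serialize each row as its own JSON object, one per line."""
    return "\n".join(json.dumps(row) for row in rows) + "\n"


def from_jsonl(text):
    """Parse JSON lines text back into a list of rows, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


print(to_json(records))
print(to_jsonl(records), end="")
```

Because every line of a JSON lines file is a complete object, such a file can be read or appended to one record at a time, whereas a JSON array has to be parsed as a whole.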
Example Use Case
This example uses a Mitto curl job to download a .json file and a .jsonl file from a public GitHub repository, then uses an IO job to pipe that data through Mitto and output it as a JSON or JSON lines file.
curl job
Here are the two files we will be downloading with Mitto:
- zuar_pets.json (zuarbase/data repository on GitHub, master branch)
- zuar_pets.jsonl (zuarbase/data repository on GitHub, master branch)
Here are the two curl job configs:
{
    url: https://raw.githubusercontent.com/zuarbase/data/master/zuar_pets.json
    args: [
        -s
        -b
        /tmp/cookies
        -L
        -O
        -f
    ]
}
{
    url: https://raw.githubusercontent.com/zuarbase/data/master/zuar_pets.jsonl
    args: [
        -s
        -b
        /tmp/cookies
        -L
        -O
        -f
    ]
}
End result: Two new files in Mitto’s file manager.
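For readers more familiar with the curl command line, the sketch below shows one plausible way a config like the ones above maps to a command. It assumes the args are passed to curl in order, ahead of the URL; that ordering is an assumption about how Mitto assembles the command, not documented behavior, and curl_argv is a hypothetical helper. The flags themselves are standard curl options: -s silences progress output, -b reads cookies from the given file, -L follows redirects, -O saves the file under its remote name, and -f fails on HTTP errors.

```python
import shlex


def curl_argv(config):
    """Build a curl argument list from a dict shaped like the job configs above.

    Assumption: args come first, in order, followed by the URL.
    """
    return ["curl", *config["args"], config["url"]]


config = {
    "url": "https://raw.githubusercontent.com/zuarbase/data/master/zuar_pets.json",
    "args": ["-s", "-b", "/tmp/cookies", "-L", "-O", "-f"],
}

argv = curl_argv(config)
# Print the equivalent shell command without running it.
print(shlex.join(argv))
```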
IO job - JSON input with JSON output
Here’s the IO job that takes the zuar_pets.json file and pipes it through Mitto and outputs it as zuar_pets_tojson.json:
{
    input: {
        use: flatfile.iov2#JsonInput
        source: /var/mitto/data/zuar_pets.json
    }
    output: {
        path: /var/mitto/data/zuar_pets_tojson.json
        use: call:mitto.iov2#tojson
    }
    steps: [
        {
            transforms: [
                {
                    use: mitto.iov2.transform#ExtraColumnsTransform
                    rename_columns: false
                    include_empty_columns: true
                    include_nested_json: true
                }
            ]
            use: mitto.iov2.steps#Input
        }
        {
            transforms: [
                {
                    use: mitto.iov2.transform#FlattenTransform
                }
            ]
            use: mitto.iov2.steps#Output
        }
    ]
}
Two critical job config pieces here:
- The output's use references the new tojson code, and the path references the output file Mitto will create.
- The ExtraColumnsTransform transform step includes a new parameter, include_nested_json: true.
End result: A new file that is the exact same JSON file we started with.
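One way to confirm the round trip outside Mitto is to compare the parsed contents of the two files. The sketch below is a hypothetical check, not part of Mitto. It compares data rather than bytes, so differences in whitespace or key order are ignored. The paths in the usage comment are the ones from the job config above.

```python
import json


def same_json(path_a, path_b):
    """Return True when both files parse to equal JSON values."""
    with open(path_a, encoding="utf-8") as a, open(path_b, encoding="utf-8") as b:
        return json.load(a) == json.load(b)


# Usage, assuming both files exist on the Mitto host:
# same_json("/var/mitto/data/zuar_pets.json",
#           "/var/mitto/data/zuar_pets_tojson.json")
```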
IO job - JSON lines input with JSON lines output
Here’s the IO job that takes the zuar_pets.jsonl file and pipes it through Mitto and outputs it as zuar_pets_tojsonl.jsonl:
{
    input: {
        use: flatfile.iov2#JsonlInput
        source: /var/mitto/data/zuar_pets.jsonl
    }
    output: {
        path: /var/mitto/data/zuar_pets_tojsonl.jsonl
        use: call:mitto.iov2#tojsonl
    }
    steps: [
        {
            transforms: [
                {
                    use: mitto.iov2.transform#ExtraColumnsTransform
                    rename_columns: false
                    include_empty_columns: true
                    include_nested_json: true
                }
            ]
            use: mitto.iov2.steps#Input
        }
        {
            transforms: [
                {
                    use: mitto.iov2.transform#FlattenTransform
                }
            ]
            use: mitto.iov2.steps#Output
        }
    ]
}
Differences here:
- The input's use is JsonlInput instead of JsonInput.
- The output's use is tojsonl instead of tojson.
- The output's path ends in jsonl instead of json.
End result: A new file that is the exact same JSON lines file we started with.
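Because a JSON lines file is a sequence of independent records, a round-trip check can work record by record and report where the files first diverge. The following sketch is a hypothetical helper, not part of Mitto. It ignores blank lines and compares parsed records, so whitespace and key order within a line do not matter.

```python
import json
from itertools import zip_longest

_MISSING = object()  # sentinel marking that one file ran out of records


def _records(handle):
    """Yield one parsed JSON value per non-blank line."""
    for line in handle:
        if line.strip():
            yield json.loads(line)


def first_jsonl_mismatch(path_a, path_b):
    """Return the 1-based index of the first differing record, or None."""
    with open(path_a, encoding="utf-8") as a, open(path_b, encoding="utf-8") as b:
        pairs = zip_longest(_records(a), _records(b), fillvalue=_MISSING)
        for number, (rec_a, rec_b) in enumerate(pairs, start=1):
            if rec_a is _MISSING or rec_b is _MISSING or rec_a != rec_b:
                return number
    return None


# Usage, assuming both files exist on the Mitto host:
# first_jsonl_mismatch("/var/mitto/data/zuar_pets.jsonl",
#                      "/var/mitto/data/zuar_pets_tojsonl.jsonl")
```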

