I understand that the ExtraColumnsTransform transformation reads in the data and does not add columns with all NULL values. In my use case I cannot use a Reflect Transformation, so is there another transformation that adds all columns from the source table even if they have no data?
The ExtraColumnsTransform
step has a parameter include_empty_columns
which defaults to false
but can be set to true
.
For example:
"steps": [
{
"transforms": [
{
"use": "mitto.iov2.transform#ExtraColumnsTransform",
"include_empty_columns": true
},
{
"use": "mitto.iov2.transform#ColumnsTransform"
}
],
"use": "mitto.iov2.steps#Input"
},
...
The use case here has been to include empty columns when using Mitto to output to a file. Flat files (csv, json, etc) for the most part don’t care about data type, but relational databases do.
I believe the ColumnsTransform
which sets the database column data types drops any input columns that are all NULL.
After talking with dev, you can use SDL to force a NULL column to be created.
Here’s an example query IO job:
{
"input": {
"dbo": "postgresql://localhost/analytics",
"query": [
"SELECT 1 as id, NULL as no_data",
"UNION",
"SELECT 2 as id, NULL as no_data"
],
"use": "query.io#QueryInput"
},
"output": {
"dbo": "postgresql://localhost/analytics",
"schema": "test",
"tablename": "test_null_column",
"use": "call:mitto.iov2.db#todb"
},
"sdl": {
"columns": [
{
"name": "no_data",
"type": "Boolean"
}
]
},
"steps": [
{
"transforms": [
{
"use": "mitto.iov2.transform#ExtraColumnsTransform"
},
{
"use": "mitto.iov2.transform#ColumnsTransform"
}
],
"use": "mitto.iov2.steps#Input"
},
{
"use": "mitto.iov2.steps#CreateTable"
},
{
"transforms": [
{
"use": "mitto.iov2.transform#FlattenTransform"
}
],
"use": "mitto.iov2.steps#Output"
},
{
"use": "mitto.iov2.steps#CountTable"
}
]
}
The SDL is what forces the NULL column to be created:
"sdl": {
"columns": [
{
"name": "no_data",
"type": "Boolean"
}
]
},