Mitto SQL Jobs - Best Practice

gregpowell · February 9, 2021, 12:57am

A client has a mitto job with 3-4 queries that drop/create multiple tables in the DB schema. They have had issues with this job not completing/not creating the tables that they need.

I would suggest that they split out the job into it’s individual parts, mostly for organization, but is there a more technical reason why they might want to split up the multiple DROP/CREATES of tables?

anon68878319 · February 9, 2021, 7:03pm

Here’s the Mitto SQL documentation for reference: SQL

Short answer:

Each job in Mitto is a unit of work (and a separate database transaction in the case of the SQL job).

Best practice is to separate your SQL statements into separate jobs. So if you are dropping and recreating 4 separate tables, make 4 separate SQL jobs (one for each table). Add all of these jobs into a sequence and set Continue on Error to FALSE. This gives better organization, simplifies troubleshooting (you know exactly which job failed and the specific SQL statements related to it), and gives more specific monitoring capabilities (webhooks, etc).

Longer answer:

By default, when a Mitto SQL job is created via the job wizard and the output table is using Mitto’s internal PostgreSQL database server, the resulting job config looks like this:

{
    "dbo": "postgresql://localhost/analytics",
    "kwargs": {},
    "parameters": {},
    "split": false,
    "sql": [
        "SELECT 1;"
    ],
    "transaction": true
}

Note "transaction": true and "split": false

Here’s a slightly more complex job config with multiple SQL statements:

{
    "dbo": "postgresql://localhost/analytics",
    "kwargs": {},
    "parameters": {},
    "split": false,
    "sql": [
        "DROP TABLE IF EXISTS test.sql_transaction",
        ";",
        "",
        "CREATE TABLE IF NOT EXISTS test.sql_transaction (",
        "    id         SERIAL,",
        "    updated_at TIMESTAMPTZ DEFAULT NOW(),",
        "    comment    VARCHAR",
        ")",
        ";",
        "",
        "",
        "INSERT INTO",
        "    test.sql_transaction (comment)",
        "VALUES",
        "    ('hello world!')",
        ";",
        "",
        "INSERT INTO",
        "    test.sql_transaction (updated_at, comment)",
        "VALUES",
        "    ('this is not valid', 'this is not valid')",
        ";",
        ""
    ],
    "transaction": true
}

These 4 SQL statements drop and create a table and attempt to insert two rows into the table. The second insert is invalid and fails.

By default, every SQL query is sent in one transaction ("transaction": true). So in the above example, all 4 SQL statements (DROP, CREATE, INSERT, and INSERT) would have to succeed for the table to be created with 2 records and for the Mitto job to succeed.

In the example above, the last INSERT statement is invalid (it is attempting to add a varchar into a timestamptz column), so the end result of running this job is that none of the SQL statements are committed and no table is created. The Mitto job also fails.

SQL job parameters can be adjusted based on the use case.

For example (although not recommended in this case):

{
    "dbo": "postgresql://localhost/analytics",
    "kwargs": {},
    "parameters": {},
    "split": true,
    "sql": [
        "DROP TABLE IF EXISTS test.sql_transaction",
        ";",
        "",
        "CREATE TABLE IF NOT EXISTS test.sql_transaction (",
        "    id         SERIAL,",
        "    updated_at TIMESTAMPTZ DEFAULT NOW(),",
        "    comment    VARCHAR",
        ")",
        ";",
        "",
        "",
        "INSERT INTO",
        "    test.sql_transaction (comment)",
        "VALUES",
        "    ('hello world!')",
        ";",
        "",
        "INSERT INTO",
        "    test.sql_transaction (updated_at, comment)",
        "VALUES",
        "    ('this is not valid', 'this is not valid')",
        ";",
        ""
    ],
    "transaction": false
}

The only adjustments here are "split": true and "transaction": false.

This Mitto SQL job sends each SQL statement one at a time. The end result is a table with one row (the first INSERT) and a failed Mitto job (due to the last INSERT failing).

Mitto SQL Jobs - Best Practice

Short answer:

Longer answer:

Further reading: