Run ETL when data source is updated

Question: Is there a way for Mitto to kick off a sequence only when the data source is updated?

Background:

  • I have ~20+ XLS spreadsheets on SharePoint
  • I currently have a Mitto sequence that runs every day at 2a to drop all historical data and re-grab all XLS spreadsheets.
  • The Mitto sequence runs in its entirety, regardless of whether there are any data source changes.
  • Because the sequence is kicked off at 2a, and only at 2a, an update made later in the day (say 3p) won’t get picked up until 2a the next day.

Goal:
– I would love for Mitto to kick off the sequence only when, and shortly after (maybe even in real time?), the data source (e.g. an Excel file) has been changed.

Are the 20 XLS files all updated daily, or is only a subset of the files updated in SharePoint?

If all the files are updated daily in SharePoint, then Mitto will rerun all the files (assuming you are using a RegEx job).

If a subset of files is updated daily in SharePoint, then Mitto should only rerun those files that have been updated.

Hi Justin -
It is usually a subset of the files that are updated - but it could be all of them.

However, the answer doesn’t address the core of the question, which is: how do we get Mitto to monitor the files in essentially “real time” and run the sequence when a file is updated?

It is an interesting use case to run a job/sequence triggered from a file being updated.

We haven’t had this particular use case come up, but similar to the monitor job use case, a webhook could be used to start a sequence.

Here’s the thread on that topic: Notify me when SQL query returns 0 rows

You can use a command line job to check the last-modified time of a file and cause the job to fail when some criterion is met (for example, the file was modified recently). The webhook could then fire on job failure.
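
As a rough illustration of that idea, here is a minimal Python sketch of a script a command line job could run. It is not a Mitto feature or API. It assumes the spreadsheets are reachable as files on a local path (for example a synced copy of the SharePoint folder), that a non-zero exit code is what marks the command line job as failed, and that the job is scheduled more often than once a day. The folder argument, the 15 minute look-back window, and the function names are all hypothetical.

```python
import sys
import time
from pathlib import Path


def changed_files(folder, since_epoch, pattern="*.xls*"):
    """Return files in `folder` matching `pattern` modified after `since_epoch`."""
    return sorted(
        p for p in Path(folder).glob(pattern)
        if p.is_file() and p.stat().st_mtime > since_epoch
    )


def main(argv):
    # Hypothetical arguments: folder to check, look-back window in minutes.
    folder = argv[1]
    window_minutes = float(argv[2]) if len(argv) > 2 else 15.0
    cutoff = time.time() - window_minutes * 60

    changed = changed_files(folder, cutoff)
    for path in changed:
        print(f"changed: {path}")

    # Assumption: a non-zero exit code makes the job count as failed,
    # which is the event the webhook would react to.
    return 1 if changed else 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Run on a short schedule with a look-back window that matches the schedule interval, the job would "fail" only when at least one spreadsheet changed since the previous check, and that failure is what would start the sequence through the webhook.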

I don’t have a great answer for this one right now unfortunately.
