Matvey Arye 5d8c7cc6f6 Add a scheduler for background jobs
TimescaleDB will want to run multiple background jobs. This PR
adds a simple scheduler so that jobs inserted into a jobs table
could be run on a schedule. This first implementation has two limitations:

1) The list of jobs to be run is read from the database when the scheduler
is first started. We do not update this list if the jobs table changes.

2) There is no prioritization for when to run jobs.

There design of the scheduler is as follows:
The scheduler itself is a background job that continuously runs and waits
for a time when  jobs need to be scheduled. It then launches jobs as new
background workers that it controls through the background worker handle.

Aggregate statistics about a job are kept in the job_stat catalog table.
These statistics include the start and finish times of the last run of the job
as well as whether or not the job succeeded. The next_start is used to
figure out when next to run a job after a scheduler is restarted.

The statistics table also tracks consecutive failures and crashes for the job
which is used for calculating the exponential backoff after a crash or failure
(which is used to set the next_start after the crash/failure). Note also that
there is a minimum time after the db scheduler starts up and a crashed job
is restarted. This is to allow the operator enough time to disable the job
if needed.

Note that the number of crashes is an overestimate of the actual number of crashes
for a job. This is so that we are conservative and never miss a crash and fail to
use the appropriate backoff logic.  Note that there is some complexity
in ensuring that all crashes are counted since a crash in Postgres causes /all/
processes to SIGQUIT: we must commit changes to the stats
table /before/ a job starts so that we can then deduce after a job has crashed
and the scheduler comes back up that a job was started, and not finished before
the crash (meaning that it could have been the crashing process).
2018-09-10 13:29:59 -04:00
..