This PR adds a new mode for continuous aggregates that we call
real-time aggregates. Unlike the original mode, the new mode
combines materialized data with new data received after the last
refresh. The new mode is the default behaviour for newly created
continuous aggregates.
To upgrade an existing continuous aggregate to the new behaviour,
run the following command for each continuous aggregate:
ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=false);
To disable this behaviour for a newly created continuous aggregate
and get the old behaviour, run the following command:
ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=true);
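The difference between the two modes can be illustrated with a minimal
sketch. It is not the actual implementation: the function name, the
integer timestamps, the sum aggregate, and the "watermark" (the point up
to which data has been materialized) are all assumptions made for
illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def query_aggregate(
    materialized: Dict[int, float],
    raw_rows: Iterable[Tuple[int, float]],
    watermark: int,
    bucket_width: int,
    materialized_only: bool = False,
) -> Dict[int, float]:
    """Return bucket -> sum, optionally combined with not-yet-materialized rows.

    Hypothetical model: buckets below the watermark come from the
    materialization; rows at or after the watermark are aggregated on the fly.
    """
    result = {b: v for b, v in materialized.items() if b < watermark}
    if materialized_only:
        # Old behaviour: only data materialized by the last refresh is visible.
        return result
    fresh: Dict[int, float] = defaultdict(float)
    for ts, value in raw_rows:
        if ts >= watermark:
            # Bucket start for this row, assuming non-negative integer time.
            fresh[ts - ts % bucket_width] += value
    result.update(fresh)
    return result
```

With materialized_only=True the sketch returns only what the last refresh
produced; with the default it also includes buckets computed from rows
newer than the watermark.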
We added a timescaledb.ignore_invalidation_older_than parameter for
continuous aggregates. This parameter accepts a time interval (e.g. 1
month). If set, it limits the amount of time for which to process
invalidations. Thus, if
timescaledb.ignore_invalidation_older_than = '1 month'
then any modifications for data older than 1 month from the current
timestamp at insert time will not cause updates to the continuous
aggregate. This limits the amount of work that a backfill can trigger.
This parameter must be >= 0. A value of 0 means that invalidations are
never processed.
When recording invalidations for the hypertable at insert time, we use
the maximum ignore_invalidation_older_than of any continuous agg attached
to the hypertable as a cutoff for whether to record the invalidation
at all. When materializing a particular continuous agg, we use that
agg's ignore_invalidation_older_than cutoff. However, we have to apply
that cutoff relative to the insert time, not the materialization
time, to make the behaviour easier for users to reason about. Therefore,
we record the insert time as part of the invalidation entry.
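The two-level cutoff described above can be sketched as follows. This is
an illustration only: the function names are hypothetical, every attached
continuous agg is assumed to have the parameter set, and the handling of
a row that falls exactly on the cutoff boundary is an assumption.

```python
from datetime import datetime, timedelta
from typing import Iterable


def within_cutoff(modified_time: datetime, insert_time: datetime,
                  cutoff: timedelta) -> bool:
    """True if the modified data is not older than `cutoff` at insert time."""
    if cutoff == timedelta(0):
        # A value of 0 means that invalidations are never processed.
        return False
    return modified_time >= insert_time - cutoff


def should_record(modified_time: datetime, insert_time: datetime,
                  agg_cutoffs: Iterable[timedelta]) -> bool:
    """Hypertable level: use the maximum cutoff of any attached agg."""
    return within_cutoff(modified_time, insert_time, max(agg_cutoffs))


def should_process(modified_time: datetime, insert_time: datetime,
                   agg_cutoff: timedelta) -> bool:
    """Materialization level: use this agg's own cutoff, still relative
    to the recorded insert time rather than the materialization time."""
    return within_cutoff(modified_time, insert_time, agg_cutoff)
```

Because the insert time is stored with the invalidation entry, the
per-agg decision stays the same no matter when the materialization runs.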
The Microsoft compiler can't figure out that elog(ERROR) doesn't
return and warns about functions not returning a value in all code
paths. This patch adds pg_unreachable calls to those functions.
In various places, most notably drop_chunks and show_chunks, we
dispatch based on the type of the "time" column of the hypertable, for
things such as determining which interval type to use. With a custom
partition function, this logic is incorrect, as we should instead be
determining this based on the return type of the partitioning function.
This commit changes all relevant accesses of dimension.column_type to a
new function, ts_dimension_get_partition_type, which has the correct
behavior: it returns the partitioning function's return type, if one
exists, and only otherwise uses the column type. After this commit, all
references to column_type directly should have a comment explaining why
this is appropriate.
Fixes GitHub issue #1250
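The fallback rule that ts_dimension_get_partition_type is described as
implementing can be modelled in a few lines. This is a simplified sketch,
not the C code: the Dimension class and its field names are hypothetical,
and types are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dimension:
    """Hypothetical stand-in for a hypertable dimension."""
    column_type: str
    # Return type of the custom partitioning function, if one exists.
    partition_func_return_type: Optional[str] = None


def get_partition_type(dim: Dimension) -> str:
    """Prefer the partitioning function's return type; otherwise fall
    back to the column type."""
    if dim.partition_func_return_type is not None:
        return dim.partition_func_return_type
    return dim.column_type
```

For example, a text column partitioned by a function returning a
timestamp would be treated as a timestamp dimension, so interval
arguments to drop_chunks and show_chunks would be interpreted accordingly.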
Add indexes for the materialization table created by continuous aggregates.
This behavior can be turned on or off using the timescaledb.create_group_indexes
parameter of the WITH clause when the continuous agg is created.
We lower the retry_period of cont agg jobs from a constant 1 day to
the schedule_interval because 1 day was too long.
The retry time formula is:
retry_period * 2^(consecutive_failures - 1)
So the first retry happens after one retry_period, and each further
consecutive failure doubles the wait, which seems reasonable.
Also change the update logic to set retry_period to refresh_interval
when the WITH clause is altered.
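To make the effect of the retry formula concrete, here is a minimal
sketch of the calculation. The function name is hypothetical, and the
sketch covers only the formula as stated above; any cap or jitter the
scheduler may apply is not modelled.

```python
from datetime import timedelta


def retry_delay(retry_period: timedelta, consecutive_failures: int) -> timedelta:
    """Delay before the next retry: retry_period * 2^(consecutive_failures - 1)."""
    if consecutive_failures < 1:
        raise ValueError("consecutive_failures must be at least 1")
    return retry_period * 2 ** (consecutive_failures - 1)
```

With a retry_period of 1 day, the fourth consecutive failure would delay
the next attempt by 8 days; with a retry_period of 1 hour, the same
failure count gives an 8 hour delay.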
Add a setting max_materialized_per_run which can be set to prevent a
continuous aggregate from materializing too much of the table in a
single run. This prevents a single run from locking the hypertable
for too long when running on a large data set.
1) Change WITH clause name to 'timescaledb.continuous'
It used to be timescaledb.continuous_agg as a text field; it is now a bool.
2) Add more WITH options for continuous aggs
- Refresh lag controls the amount by which the materialization will lag
behind the maximum current time value.
- Refresh interval controls how often the background materializer is run.
3) Handle ALTER VIEW on continuous aggs
Handle setting WITH options using continuous views.
Block all other ALTER VIEW commands on user and partial views.