When rebuilding the bgw_job table, the update script would not preserve
the state of the sequence and instead reset it to the default, leading
to failed job inserts until the sequence caught up.
When a hypertable was referenced in a subquery and was not already
in our hypertable cache, we would fail to detect it as a hypertable,
leading to transparent decompression not working for that hypertable.
When the extension is updated to 2.0, we need to migrate
existing `ignore_invalidation_older_than` settings to the new
continuous aggregate policy framework.
The `ignore_invalidation_older_than` setting is mapped to the
`start_interval` of the refresh policy. If the default value is used,
it is mapped to a NULL `start_interval`; otherwise, it is converted to
an interval value.
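As a rough sketch of the mapping (not the actual update script), assuming the old setting is stored as a bigint number of microseconds and that the default means "no limit":
```
-- Hedged sketch only; the stored representation and default value are assumptions.
WITH old_settings(ignore_invalidation_older_than) AS (
    VALUES (9223372036854775807::bigint),  -- assumed default: no limit
           (2592000000000::bigint)         -- 30 days in microseconds
)
SELECT CASE
         WHEN ignore_invalidation_older_than = 9223372036854775807
           THEN NULL                                               -- default -> NULL start_interval
         ELSE ignore_invalidation_older_than * INTERVAL '1 microsecond'
       END AS start_interval
FROM old_settings;
```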
When a constraint is backed by an index, as with a unique constraint
or a primary key constraint, the constraint can be renamed by either
`ALTER TABLE RENAME CONSTRAINT` or `ALTER INDEX RENAME`. Depending on
the command used, different internal metadata tables would be
adjusted, leading to corrupt metadata. This patch makes
`ALTER TABLE RENAME CONSTRAINT` and `ALTER INDEX RENAME` adjust the
same metadata tables.
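For reference, both statements below rename the same kind of constraint/index pair and must now keep the metadata in sync; the table and constraint names are illustrative, and only one of the two forms would normally be run:
```
-- Two ways to rename a unique constraint backed by an index
-- (table and constraint names are illustrative; run one or the other):
ALTER TABLE conditions RENAME CONSTRAINT conditions_device_time_key TO conditions_device_time_uk;
ALTER INDEX conditions_device_time_key RENAME TO conditions_device_time_uk;
```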
Tests are updated to no longer use continuous aggregate options that
will be removed, such as `refresh_lag`, `max_interval_per_job` and
`ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also
been replaced with `CALL refresh_continuous_aggregate()` using ranges
that try to replicate the previous refresh behavior.
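As an illustration of the replacement pattern used in the tests (the view name and refresh window are illustrative):
```
-- Old form used in the tests:
REFRESH MATERIALIZED VIEW conditions_summary;
-- New form, with a refresh window chosen to mimic the old behavior
-- (view name and window bounds are illustrative):
CALL refresh_continuous_aggregate('conditions_summary', NULL, '2020-01-01'::timestamptz);
```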
The materializer test (`continuous_aggregate_materialize`) has been
removed, since this tested the "old" materializer code, which is no
longer used without `REFRESH MATERIALIZED VIEW`. The new API using
`refresh_continuous_aggregate` already allows manual materialization
and there are two previously added tests (`continuous_aggs_refresh`
and `continuous_aggs_invalidate`) that cover the new refresh path in
similar ways.
When updated to use the new refresh API, some of the concurrency
tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have
slightly different concurrency behavior. This is explained by
different and sometimes more conservative locking. For instance, the
first transaction of a refresh serializes around an exclusive lock on
the invalidation threshold table, even if no new threshold is
written. The previous code only took the heavier lock once, and only
if a new threshold was written. This new, stricter locking means that
insert processes that read the invalidation threshold will block for a
short time when there are concurrent refreshes. However, since this
blocking only occurs during the first transaction of the refresh
(which is quite short), it probably doesn't matter too much in
practice. The relaxing of locks to improve concurrency and performance
can be implemented in the future.
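As a rough sketch (not the actual C implementation), the serialization point in the first refresh transaction behaves roughly as follows; the catalog table name and lock mode are assumptions:
```
-- Simplified sketch only; lock mode and catalog table name are assumptions.
BEGIN;
LOCK TABLE _timescaledb_catalog.continuous_aggs_invalidation_threshold
    IN EXCLUSIVE MODE;  -- taken even if the threshold is not moved
-- ... compute, and possibly update, the invalidation threshold ...
COMMIT;                 -- the transaction is short, so inserts block only briefly
```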
This moves the SQL definitions for policy and job APIs to their
separate files to improve code structure. Previously, all of these
user-visible API functions were located in the `bgw_scheduler.sql`
file, mixing internal and public functions and APIs.
To improve the structure, all API-related functions are now located
in their own distinct SQL files that have the `_api.sql` file
ending. Internal policy functions have been moved to
`policy_internal.sql`.
This change simplifies the name of the functions for adding and
removing a continuous aggregate policy. The functions are renamed
from:
- `add_refresh_continuous_aggregate_policy`
- `remove_refresh_continuous_aggregate_policy`
to
- `add_continuous_aggregate_policy`
- `remove_continuous_aggregate_policy`
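A brief example of the renamed functions in use (the aggregate name and intervals are illustrative; the arguments are shown positionally as the refresh window offsets followed by the schedule interval):
```
-- Aggregate name and intervals are illustrative.
SELECT add_continuous_aggregate_policy('conditions_summary',
    INTERVAL '1 month', INTERVAL '1 hour', INTERVAL '1 hour');
SELECT remove_continuous_aggregate_policy('conditions_summary');
```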
Fixes #2320
This commit adds support for `WITH NO DATA` when creating a continuous
aggregate and refreshes the continuous aggregate on creation unless
`WITH NO DATA` is provided.
All test cases are also updated to use `WITH NO DATA`, and an additional
test case is added to verify that both `WITH DATA` and `WITH NO DATA`
work as expected.
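For example (hypertable, columns, and bucket size are illustrative):
```
-- Create the continuous aggregate without materializing any data
-- (hypertable and columns are illustrative):
CREATE MATERIALIZED VIEW conditions_hourly
WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 hour', time), device, avg(temperature)
  FROM conditions
  GROUP BY 1, 2
WITH NO DATA;
```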
Closes #2341
Enforce index scans for queries that would produce different output
between 32-bit and 64-bit platforms, to make the explain output of the
constraint_exclusion_prepared, ordered_append and ordered_append_join
tests consistent across platforms.
If a table access method is provided when creating a continuous
aggregate using `CREATE MATERIALIZED VIEW`, it will be used to set the
table access method of the materialized hypertable.
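For example (view name and query are illustrative; `heap` is PostgreSQL's default access method):
```
-- The USING clause sets the table access method of the materialized hypertable
-- (view name and query are illustrative):
CREATE MATERIALIZED VIEW conditions_daily
USING heap
WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 day', time), device, max(temperature)
  FROM conditions
  GROUP BY 1, 2
WITH NO DATA;
```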
Closes #2123
The invalidation log initialization did not adjust
the timestamp to the internal format. Use `ts_get_now_internal`
in all places that write entries to the invalidation log.
The update tests use the scripts for building the
timescaledb image. Build errors should be reported
and return a failure code. This will force the failure
of the rest of the pipeline.
When dropping chunks that have dependent objects, like a continuous
aggregate view, a dependent object error is raised. The hint message
is overridden to produce a more useful hint for the drop chunks use
case. However, the hint is overridden without checking the error code,
which means the hint is replaced for any error raised. This can
produce confusing log output.
The issue is fixed by first examining the error code before overriding
the hint.
This change makes the behavior of dropping chunks on a hypertable that
has associated continuous aggregates consistent with other
mutations. In other words, any way of deleting data, irrespective of
whether this is done through a `DELETE`, `DROP TABLE <chunk>` or
`drop_chunks` command, will invalidate the region of deleted data so
that a subsequent refresh of a continuous aggregate will know that the
region is out-of-date and needs to be materialized.
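For example, each of the following statements now invalidates the affected region of any continuous aggregate on the hypertable (the chunk name, time range, and argument style are illustrative):
```
-- Each of these invalidates the deleted region (names and ranges are illustrative):
DELETE FROM conditions WHERE time < '2020-01-01';
DROP TABLE _timescaledb_internal._hyper_1_1_chunk;
SELECT drop_chunks('conditions', older_than => '2020-01-01'::timestamptz);
```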
Previously, only a `DELETE` would invalidate continuous aggregates,
while `DROP TABLE <chunk>` and `drop_chunks` did not. In fact, each
way to delete data had different behavior:
1. A `DELETE` would generate invalidations and the materializer would
update any aggregates to reflect the changes.
2. A `DROP TABLE <chunk>` would not generate invalidations and the
changes would therefore not be reflected in aggregates.
3. A `drop_chunks` command would not work unless
`ignore_invalidation_older_than` was set. When enabled, the
`drop_chunks` would first materialize the data to be dropped and
then never materialize that region again, unless
`ignore_invalidation_older_than` was reset. But then the continuous
aggregates would be in an undefined state since invalidations had
been ignored.
Due to the different behavior of these mutations, a continuous
aggregate could get "out-of-sync" with the underlying hypertable. This
has now been fixed.
For the time being, the previous behavior of "refresh-on-drop" (i.e.,
materializing the data on continuous aggregates before dropping it) is
retained for `drop_chunks`. However, such "refresh-on-drop" behavior
should probably be revisited in the future since it happens silently
by default without an opt-out. There are situations when such silent
refreshing might be undesirable; for instance, let's say the dropped
data had seen erroneous backfill that a user wants to ignore. Another
issue with "refresh-on-drop" is that it only happens for `drop_chunks`
and not other ways of deleting data.
Fixes #2242
With the new continuous aggregate API, some of
the parameters used to create a continuous agg are
now obsolete. Remove refresh_lag, max_interval_per_job
and ignore_invalidation_older_than information from
timescaledb_information.continuous_aggregates.
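The view can still be queried as before; the removed settings simply no longer appear among its columns:
```
SELECT * FROM timescaledb_information.continuous_aggregates;
```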
When refreshing with an "infinite" refresh window going forward in
time, the invalidation threshold is also moved forward to the end of
the valid time range. This effectively renders the invalidation
threshold useless, leading to unnecessary write amplification.
To handle infinite refreshes better, this change caps the refresh
window at the end of the last bucket of data in the underlying
hypertable, so as not to move the invalidation threshold further than
necessary. For instance, if the max time value in the hypertable is
11, a refresh command such as:
```
CALL refresh_continuous_aggregate(NULL, NULL);
```
would be turned into
```
CALL refresh_continuous_aggregate(NULL, 20);
```
assuming that a bucket starts at 10 and ends at 20 (exclusive). Thus
the invalidation threshold would at most move to 20, allowing the
threshold to still do its work once time again moves forward and
beyond it.
Note that one must never process invalidations beyond the invalidation
threshold without also moving it, as that would clear that area from
invalidations and thus prohibit refreshing that region once the
invalidation threshold is moved forward. Therefore, if we do not move
the threshold further than a certain point, we cannot refresh beyond
it either. An alternative, and perhaps safer, approach would be to
always invalidate the region over which the invalidation threshold is
moved (i.e., new_threshold - old_threshold). However, that is left for
a future change.
It would be possible to also cap non-infinite refreshes, e.g.,
refreshes that end at a higher time value than the max time value in
the hypertable. However, when an explicit end is specified, it might
be on purpose so optimizing this case is also left for the future.
Closes #2333
When a tablespace is attached to a hypertable, the tablespace of the
hypertable is not set, but when the tablespace of the hypertable is
set, it is also attached. A similar asymmetry occurs when tablespaces
are detached. This means that if a hypertable is created with a
tablespace and then all tablespaces are detached, the chunks will
still be put in the tablespace of the hypertable.
With this commit, attaching a tablespace to a hypertable will set the
tablespace of the hypertable if it does not already have one. Detaching
a tablespace from a hypertable will set the tablespace to the default
tablespace if the tablespace being detached is the tablespace for the
hypertable.
If `detach_tablespace` is called with only a tablespace name, it will
be detached from all tables it is attached to. This commit ensures that
the tablespace for the hypertable is set to the default tablespace if
it was set to the tablespace being detached.
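A brief example of the resulting behavior (tablespace and hypertable names are illustrative):
```
-- Tablespace and hypertable names are illustrative.
SELECT attach_tablespace('tablespace1', 'conditions'); -- also sets the hypertable's tablespace if it has none
SELECT detach_tablespace('tablespace1', 'conditions'); -- hypertable falls back to the default tablespace
SELECT detach_tablespace('tablespace1');               -- detach from all tables; affected hypertables are reset
```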
Fixes #2299
This maintenance release contains bugfixes since the 1.7.3 release. We deem it
high priority for upgrading if TimescaleDB is deployed with replicas (synchronous
or asynchronous).
In particular the fixes contained in this maintenance release address an issue with
running queries on compressed hypertables on standby nodes.
**Bugfixes**
* #2340 Remove tuple lock on select path
When executing a SELECT on a hypertable, it is not possible to acquire
tuple locks on hot standbys since they require a transaction id and
transaction ids cannot be created on a standby running in recovery
mode.
This commit removes the tuple lock from the SELECT code path if running
in recovery mode.
This patch changes the scheduler to ignore telemetry jobs when
telemetry is disabled. With this change telemetry jobs will no
longer use background worker resources when telemetry is disabled.
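Telemetry is disabled with the existing setting, for example:
```
-- With telemetry disabled, the scheduler no longer starts telemetry jobs:
ALTER SYSTEM SET timescaledb.telemetry_level = 'off';
SELECT pg_reload_conf();
```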
Patterns `#*#` and `.#*` are for auto-generated files from Emacs and
can end up in source directories.
Pattern `.clangd` is the working directory for `clangd` and handles
source code indexing.
If a hypertable has a time type that is a date, any interval of less
than a day will be truncated to the same day. This created a
time-triggered test failure in `continuous_aggs_policy` and this commit
changes it to use full days for the start and end interval.
The test `deadlock_dropchunks_select` gets a timeout on macOS, but not
on other platforms, so this commit extends the lock timeout to make the
test pass.
Due to a version conflict between the 12.2 and the 12.4 postgres
packages in alpine the 12.2 package was not installable on recent
alpine images. This patch bumps the involved packages to the latest
postgres version and also uses the most recent postgres image for
running appveyor tests.
The function `cagg_watermark` returns the time threshold at which
materialized data ends and raw query data begins in a real-time
aggregation query (union view).
The watermark is simply the completed threshold of the continuous
aggregate materializer. However, since the completed threshold will no
longer exist with the new continuous aggregates, the watermark
function has been changed to return the end of the last bucket in the
materialized hypertable.
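The function takes the id of the continuous aggregate's materialized hypertable, for example (the id is illustrative):
```
-- Returns the end of the last materialized bucket (hypertable id is illustrative):
SELECT _timescaledb_internal.cagg_watermark(1);
```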
In most cases, the completed threshold is the same as the end of the
last materialized bucket. However, there are situations when it is
not; for example, when there is a filter in the view query some
buckets might not be materialized because no data matched the
filter. The completed threshold would move ahead regardless. For
instance, if there is only data from "device_2" in the raw hypertable
and the aggregate has a filter `device=1`, there will be no buckets
materialized although the completed threshold moves forward. Therefore
the new watermark function might sometimes return a lower watermark
than the old function. A similar situation explains the different
output in one of the union view tests.
Time types, like date and timestamps, have limits that aren't the same
as the underlying storage type. For instance, while a timestamp is
stored as an `int64` internally, its max supported time value is not
`INT64_MAX`. Instead, `INT64_MAX` represents `+Infinity` and the
actual largest possible timestamp is close to `INT64_MAX` (but not
`INT64_MAX-1` either). The same applies to min values.
Unfortunately, the time handling code does not check for these
boundaries; in most cases, overflow checks when, e.g., bucketing are
made against the max integer values instead of the type-specific
boundaries. In other cases, overflows simply throw errors instead of
clamping to the boundary values, which would make more sense in many
situations.
Using integer time suffers from similar issues. To take one example,
simply inserting a valid `smallint` value close to the max into a
table with a `smallint` time column fails:
```
INSERT INTO smallint_table VALUES ('32765', 1, 2.0);
ERROR: value "32770" is out of range for type smallint
```
This happens because the code that adds dimensional constraints always
checks for overflow against `INT64_MAX` instead of the type-specific
max value. Therefore, it tries to create a chunk constraint that ends
at `32770`, which is outside the allowed range of `smallint`.
To resolve these issues, several time-related utility functions have
been implemented that, e.g., return type-specific range boundaries
and perform saturating addition and subtraction while clamping to the
supported boundaries.
Fixes #2292
There are a number of issues when `time_bucket_gapfill` is run on a
distributed hypertable. Thus, a not-supported error is returned in this
case until the issues are fixed.
Adding support for tablespaces when creating a continuous aggregate
using `CREATE MATERIALIZED VIEW` and when altering a continuous
aggregate using `ALTER MATERIALIZED VIEW`.
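For example (tablespace and view names, and the query, are illustrative):
```
-- Tablespace and view names are illustrative.
CREATE MATERIALIZED VIEW conditions_daily_agg
WITH (timescaledb.continuous)
TABLESPACE tablespace1 AS
  SELECT time_bucket('1 day', time), device, min(temperature)
  FROM conditions
  GROUP BY 1, 2
WITH NO DATA;

ALTER MATERIALIZED VIEW conditions_daily_agg SET TABLESPACE tablespace2;
```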
Fixes #2122
Tablespaces are created cluster-wide, which means that tests that
create tablespaces cannot run together with other tests that create the
same tablespaces. This commit makes those tests into solo tests to avoid
collisions with other tablespace-creating tests and also fixes a test.
This change renames the function to `approximate_row_count()` and adds
support for regular tables. It returns a row count estimate for a
table instead of a table list.
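For example (table names are illustrative):
```
-- Works for hypertables and, now, regular tables (table names are illustrative):
SELECT approximate_row_count('conditions');
SELECT approximate_row_count('plain_table');
```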
Some tests fail on appveyor due to background
worker timing issues or differences in timestamp
outputs on different platforms.
Fix the affected tests, `bgw_reorder_drop_chunks`
and `continuous_aggs_bgw`.
Support add and remove continuous aggregate policy functions.
Integrate policy execution with the refresh API for continuous
aggregates.
The old API for continuous aggregates adds a job automatically
for a continuous aggregate. This is an explicit step with the
new API, so remove this functionality.
Refactor some of the utility functions so that the code can be shared
by multiple policies.
The ddl_single test was almost exactly the same as the ddl test, except
for 5 statements that were not part of the ddl_single test, so the
ddl_single test can safely be removed.