A lot of planner code was imported to support hypertable expansion in
PG12. This code has now been moved to an `import` directory to avoid
mixing it with regular TimescaleDB code.
It is not necessary to create a new range table entry for each chunk
during inserts. Instead, we can point to the range table entry of the
hypertable's root table.
The INSERT and COPY paths have been refactored to better handle
differences between PostgreSQL versions. In particular, PostgreSQL 12
introduced the new table access method (AM) API, which ties tuple
table slots to specific table AM implementations, requiring more
careful management of those data structures.
The code tries to adopt the new (PG12) API to the extent possible,
providing compatibility layers and functions for older PostgreSQL
versions where needed.
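For illustration, a minimal sketch of what such a compatibility shim
can look like, using the `PG12_GE` version macro mentioned later in
this section; the `TS_MAKE_TABLE_SLOT` name is hypothetical:

```c
/*
 * Hypothetical compatibility macro: PG12 ties slot creation to the
 * relation's table access method via table_slot_create(), while
 * earlier versions build a plain slot from the tuple descriptor.
 */
#if PG12_GE
#include "access/tableam.h"
#define TS_MAKE_TABLE_SLOT(rel) table_slot_create(rel, NULL)
#else
#include "executor/tuptable.h"
#define TS_MAKE_TABLE_SLOT(rel) \
	MakeSingleTupleTableSlot(RelationGetDescr(rel))
#endif
```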
Replace relation_close with appropriate specific functions. Remove
todo comment. Remove unnecessary headers. Capitalize new macro. Remove
code duplication in compatibility macros. Improve comment.
Correcting conditions in #ifdefs, adding missing includes, removing
and rearranging existing includes, and replacing PG12 with PG12_GE for
forward compatibility. Also fixed a number of places where
relation_close should have been changed to table_close but were missed
earlier.
relation_open is a general function that is wrapped by more specific
functions for each relation type. This commit replaces calls to the
general function with the type-specific ones, which also verify that
the relation is of the expected type.
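As a minimal sketch of the pattern (the function and its body are
illustrative, not taken from the codebase):

```c
#include "postgres.h"
#include "access/table.h"

/*
 * table_open()/table_close() behave like relation_open()/
 * relation_close() but error out if the relation is not table-like,
 * catching type mismatches the generic functions would accept.
 */
static void
example_table_access(Oid tableoid)
{
	Relation rel = table_open(tableoid, AccessShareLock);

	/* ... operate on the table ... */

	table_close(rel, AccessShareLock);
}
```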
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as follows:
1. construct `RelOptInfo`s for base rels and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.
However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition pruning. Now planning is organized as follows:
1. construct `RelOptInfo`s for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with `make_one_rel`
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` does entail a
number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
under the assumption it has no inheritance children, so if it's a
hypertable we have to replan its paths
Unfortunately, the PostgreSQL functions for doing these tasks are
static, so we need to copy them into our own codebase instead of just
calling PostgreSQL's.
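A minimal sketch of the hook installation described above;
`is_hypertable_rte` and `ts_expand_and_replan` are hypothetical
stand-ins for the actual TimescaleDB functions:

```c
#include "postgres.h"
#include "optimizer/paths.h"

static set_rel_pathlist_hook_type prev_set_rel_pathlist_hook = NULL;

/* Hypothetical predicate and worker; the real code recalculates
 * hypertable sizes and replans paths at this point. */
extern bool is_hypertable_rte(RangeTblEntry *rte);
extern void ts_expand_and_replan(PlannerInfo *root, RelOptInfo *rel,
								 Index rti, RangeTblEntry *rte);

static void
ts_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, Index rti,
					RangeTblEntry *rte)
{
	if (prev_set_rel_pathlist_hook)
		prev_set_rel_pathlist_hook(root, rel, rti, rte);

	if (is_hypertable_rte(rte))
	{
		/* Reveal the inheritance children only now, after the planner
		 * has finished its single-relation pass over the root. */
		rte->inh = true;
		ts_expand_and_replan(root, rel, rti, rte);
	}
}

void
ts_planner_hooks_init(void)
{
	prev_set_rel_pathlist_hook = set_rel_pathlist_hook;
	set_rel_pathlist_hook = ts_set_rel_pathlist;
}
```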
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it were a `SELECT` and all
leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table
discovered in stage 1 directly, rather than as part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not yet initialized at the point where
we wish to do so. This has consequences for how we identify operations
on chunks (sometimes for blocking and sometimes for enabling
functionality).
We have to let the leader participate in parallel plan execution
if for some reason no parallel workers were started.
This commit also changes the parallel EXPLAINs to not run with
ANALYZE because the output is not stable as it depends on
worker assignment.
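For context, a minimal sketch of the leader-participation fallback;
`run_plan_in_leader` is a hypothetical helper, while
`LaunchParallelWorkers` and `nworkers_launched` are the standard
PostgreSQL API:

```c
#include "postgres.h"
#include "access/parallel.h"

extern void run_plan_in_leader(void); /* hypothetical */

static void
execute_with_fallback(ParallelContext *pcxt)
{
	LaunchParallelWorkers(pcxt);

	/*
	 * If no workers could be started (e.g., all worker slots were
	 * taken), the leader must participate, otherwise no process would
	 * execute the parallel portion of the plan.
	 */
	if (pcxt->nworkers_launched == 0)
		run_plan_in_leader();

	WaitForParallelWorkersToFinish(pcxt);
}
```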
Docker images are built with the latest versions of PostgreSQL, so the
regression tests are updated to run on the latest PG versions. This
also fixes authentication for the PG docker images, which is required
after a recent change in docker-library/postgres@42ce743.
Running isolation tests on new versions of PG produces additional
output:
- Notifications are no longer missed in the output
(postgres/postgres@ebd4992)
- Notifications are prefixed with the session name
(postgres/postgres@a28e10e)
- Timeout cancellation is printed in isolation tests
(postgres/postgres@b578404)
The expected outputs are modified to succeed on latest PG versions,
while the affected isolation tests are disabled for earlier versions
of PG.
The view definition for the union view was missing a cast from int8
to the actual int type used for hypertables with an int partitioning
column. This was not a problem on 64-bit systems, but on 32-bit systems
int4 and int8 are represented differently.
The function to get the materialization end point for a continuous
aggregate could use uninitialized min and max time values in its
calculations when a hypertable has no data.
This change ensures that the min and max times are initialized to
`INT64_MIN` and `INT64_MAX`, respectively, if no min and max values are
found.
This will mute warnings on some compilers.
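A minimal sketch of the guard, with hypothetical variable and function
names:

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Illustrative only: fall back to defined extremes when the scan over
 * the hypertable found no rows, so later calculations never read
 * uninitialized values (which also silences compiler warnings).
 */
static void
init_time_range(bool found_data, int64_t *min_time, int64_t *max_time)
{
	if (!found_data)
	{
		*min_time = INT64_MIN;
		*max_time = INT64_MAX;
	}
}
```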
This PR adds a new mode for continuous aggregates that we name
real-time aggregates. Unlike the original mode, this new mode combines
materialized data with new data received after the last refresh has
happened. The new mode will be the default behaviour for newly created
continuous aggregates.
To upgrade existing continuous aggregates to the new behaviour, the
following command needs to be run for each continuous aggregate:
ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=false);
To disable this behaviour for newly created continuous aggregates and
get the old behaviour, the following command can be run:
ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=true);
The last_run_success value is reset when a job is started, so mask the
value if the status of a job is running; otherwise it will show an
incorrect state.
Fixes #1781
This commit fixes `chunk_scan_find` to not return chunks that are
marked as dropped, since the callers don't expect such chunks to be
returned as found.
Set the threshold for continuous aggregates as the
max value in the raw hypertable when the max value
is less than the computed now time. This helps avoid
unnecessary materialization checks for data ranges
that do not exist. As a result, we also prevent
unnecessary writes to the thresholds and invalidation
log tables.
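A minimal sketch of the clamping rule, with hypothetical names:

```c
#include <stdint.h>

/*
 * Illustrative only: never move the materialization threshold past
 * the newest data actually present in the raw hypertable.
 */
static int64_t
compute_threshold(int64_t computed_now, int64_t max_time_in_hypertable)
{
	if (max_time_in_hypertable < computed_now)
		return max_time_in_hypertable;
	return computed_now;
}
```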
The queries to produce test data for space partitioned
hypertables in the append test did not have an explicit
ORDER BY clause, leading to a different ordering for the
chunks created on PG12.
The definition change for timescaledb_information.continuous_aggregates
requires a drop view in the update script because the update might
be from a version that has an incompatible view definition.
PG12 needs version specific output for some
isolation tests. Modify the makefile to make it
consistent with use of TEST_TEMPLATES in other
regression suites and add version specific output
files.
If `ignore_invalidation_older_than` is undefined, it is set to the
maximum for the `BIGINT` type. This is not handled in the
`continuous_aggregates` information schema, so the column shows up as
a very strange value.
This commit fixes this by checking if `ignore_invalidation_older_than`
is set to the maximum and using `NULL` in the view in that case, which
will show up as empty.
The function `get_chunks_to_compress` returns chunks that are not
compressed but are dropped, meaning a lookup using
`ts_chunk_get_by_id` will fail to find the corresponding `table_id`,
which later leads to a null pointer when looking for the chunk. This
leads to a segmentation fault.
This commit fixes this by ignoring chunks that are marked as dropped
in the chunk table when scanning for chunks to compress.
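A minimal sketch of the filter, assuming an illustrative chunk struct
with a `dropped` flag (the real catalog form differs):

```c
#include <stdbool.h>

typedef struct ChunkForm
{
	int id;
	bool compressed;
	bool dropped;
} ChunkForm; /* illustrative stand-in for the catalog tuple */

/* A chunk qualifies for compression only if it is not already
 * compressed and has not been dropped. */
static bool
chunk_qualifies_for_compression(const ChunkForm *chunk)
{
	return !chunk->compressed && !chunk->dropped;
}
```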
This maintenance release contains bugfixes since the 1.6.0 release. We deem it medium
priority for upgrading.
In particular, the fixes contained in this maintenance release address bugs in continuous
aggregates, time_bucket_gapfill, partial index handling and drop_chunks.
**For this release only**, you will need to restart the database after upgrade before
restoring a backup.
**Minor Features**
* #1666 Support drop_chunks API for continuous aggregates
* #1711 Change log level for continuous aggregate materialization messages
**Bugfixes**
* #1630 Print notice for COPY TO on hypertable
* #1648 Drop chunks from materialized hypertable
* #1668 Cannot add dimension if hypertable has empty chunks
* #1673 Fix crash when interrupting create_hypertable
* #1674 Fix time_bucket_gapfill's interaction with GROUP BY
* #1686 Fix order by queries on compressed hypertables that have char segment by column
* #1687 Fix issue with disabling compression when foreign keys are present
* #1688 Handle many BGW jobs better
* #1698 Add logic to ignore dropped chunks in hypertable_relation_size
* #1704 Fix bad plan for continuous aggregate materialization
* #1709 Prevent starting background workers with NOLOGIN
* #1713 Fix miscellaneous background worker issues
* #1715 Fix issue with overly aggressive chunk exclusion in outer joins
* #1719 Fix restoring/scheduler entrypoint to avoid BGW death
* #1720 Add scheduler cache invalidations
* #1727 Fix compressing INTERVAL columns
* #1728 Handle Sort nodes in ConstraintAwareAppend
* #1730 Fix partial index handling on hypertables
* #1739 Use release OpenSSL DLLs for debug builds on Windows
* #1740 Fix invalidation entries from multiple caggs on same hypertable
* #1743 Fix continuous aggregate materialization timezone handling
* #1748 Fix remove_drop_chunks_policy for continuous aggregates
**Thanks**
* @RJPhillips01 for reporting an issue with drop chunks
* @b4eEx for reporting an issue with disabling compression
* @darko408 for reporting an issue with order by on compressed hypertables
* @mrechte for reporting an issue with compressing INTERVAL columns
* @tstaehli for reporting an issue with ConstraintAwareAppend
* @chadshowalter for reporting an issue with partial index on hypertables
* @geoffreybennett for reporting an issue with create_hypertable when interrupting operations
* @alxndrdude for reporting an issue with background workers during restore
* @zcavaliero for reporting and fixing an issue with dropped columns in hypertable_relation_size
* @ismailakpolat for reporting an issue with cagg materialization on hypertables with TIMESTAMP column
If `clang-format` of the required version does not exist on the
machine, the `clang_format_all.sh` script is executed in a docker
image. Since a `--user` is not provided, the script runs as the wrong
user, which leaves the formatted files owned by the wrong user.
This commit fixes this by providing the user id and group id of the
current user when using the docker version of `clang-format`.
Fixes ts_chunk_get_by_relid to return an error if fail_if_not_found is
true. Also removes the TSDLLEXPORT macro from definitions in a few C
files, which helps VS Code work properly and find function definitions.
Invalidate cache state when jobs are detected as
potentially missing. This will update the jobs list
used by the scheduler.
Restrict job start times to a finite interval when
there are consecutive job failures.
Additional logging.
When building TimescaleDB with the EnterpriseDB package installed, it
does not include the debug versions of the DLLs, so a debug build of
TimescaleDB fails.
This commit fixes the issue by replacing the debug DLLs in
`OPENSSL_LIBRARIES` with release DLLs even for a debug build when
compiling using MSVC.
Function `remove_drop_chunks_policy` did not work if a continuous
aggregate was provided as input.
This commit fixes that by looking for a continuous aggregate if a
hypertable is not found.
Fixes timescale/timescaledb-private#670
For hypertables with a TIMESTAMP column, materialization would not
honor the local timezone when determining the range for
materialization and instead treated times as UTC.
Function hypertable_relation_size includes chunks that were dropped
which causes a failure when looking up the size of dropped chunks.
This patch adds a constraint to ignore dropped chunks when determining
the size of the hypertable.
The test `continuous_aggs_ddl` failed on PostgreSQL 9.6 because it had
a line that tested compression on a hypertable when this feature is
not supported in 9.6. This prevented a large portion of the test from
running on 9.6.
This change moves the testing of compression on a continuous aggregate
to the `compression` test instead, which only runs on supported
PostgreSQL versions. A permission check on a view is also removed,
since similar tests are already in the `continuous_aggs_permissions`
tests.
The permission check was the only thing that caused different output
across PostgreSQL versions, so the test no longer requires
version-specific output files and has been simplified to use the same
output file irrespective of PostgreSQL version.
This change fixes a number of typos and issues with inconsistent
formatting for compression-related code. A couple of other fixes for
variable names, etc. have also been applied.
The gapfill test assumed 1 day is always <= 24 hours, which is not
true during a DST switch, leading to a failing test when run at that
time. This PR fixes the test to have reproducible output even when run
during a DST switch.
There was a race condition between the post_restore function
restarting the background worker and the setting of the
restoring flag to "off". If the worker started before the
change to the restoring flag had been committed, it would not
see the change and then die because the worker should exit
when the db is in a restoring state. This modifies the
post_restore function to use a restart instead of a start
so that it waits on the commit to start up. It also adds
logic to the entrypoint to reload config changes caused
by an `ALTER DATABASE SET` command. These changes are
normally only seen at connection startup but we have to
wait until after our lock on the modifying transaction is
released to know whether we should adopt them.
When a MergeAppendPath has children that do not produce sorted
output, a Sort node will be injected during plan creation. Those plans
would trigger an error about invalid child nodes in
ConstraintAwareAppend. This PR makes ConstraintAwareAppend handle
those plans correctly.
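A minimal sketch of the look-through, using the standard PostgreSQL
plan-tree accessors (the surrounding child-validation code is elided):

```c
#include "postgres.h"
#include "nodes/plannodes.h"

/*
 * When examining a MergeAppend child, look through a Sort node that
 * plan creation may have injected and inspect the underlying plan
 * instead of rejecting it as an invalid child.
 */
static Plan *
skip_injected_sort(Plan *plan)
{
	if (IsA(plan, Sort))
		return outerPlan(plan);
	return plan;
}
```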
When we copy the invalidation logs for individual continuous
aggregates, the lowest value was overwritten globally. Fix this so
that the change is specific to each continuous aggregate.
This bug could result in missing invalidations.
This PR improves the scheduling of jobs when the number of jobs
exceeds the number of available background workers. Previously, this
was not a case the scheduler handled well.
The basic strategy we employ to handle this case better is to
use a job's next_start field to create a priority for jobs.
More concretely, jobs are scheduled in increasing order of
next_start. If the scheduler runs out of BGWs, it waits until BGWs
become available and then retries, also in increasing next_start
order.
The first change this PR implements is to start jobs in order
of increasing next_start. We also make sure that if we run
out of BGWs, the scheduler will try again in START_RETRY_MS
(1 second by default).
This PR also needed to change the logic of what happens when
a job fails to start because BGWs have run out. Previously,
such jobs were marked as failed and their next_start was reset
using the regular post-failure backoff logic. But this means
that a job loses its priority every time we run out of BGWs.
Thus, we changed this logic so that next_start does not change
when we encounter this situation.
There are actually 2 ways to run out of BGWs:
1) We run out of the TimescaleDB limit on BGWs - in this case
the job is simply put back into the scheduled state, and it
will be retried in START_RETRY_MS. The job is not marked
started or failed. This is the common case.
2) We run out of PostgreSQL workers. We won't know if this
failed until we try to start the worker, by which time the
job must be in the started state. Thus, if we run into this error, we
must mark the job as failed, but we don't change next_start. To do
this, we create a new job result type
called JOB_FAILURE_TO_START.
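A minimal sketch of the two cases, with hypothetical type and helper
names; only `JOB_FAILURE_TO_START` and `START_RETRY_MS` come from the
text above:

```c
#define START_RETRY_MS 1000 /* 1 second default, per the text */

typedef enum StartResult
{
	START_OK,
	START_NO_TS_WORKERS, /* hit the TimescaleDB worker limit */
	START_NO_PG_WORKERS	 /* PostgreSQL itself had no free workers */
} StartResult;

typedef struct Job Job; /* opaque for this sketch */

/* Hypothetical helpers standing in for the real scheduler code. */
extern void put_back_in_scheduled_state(Job *job);
extern void schedule_retry_in_ms(int ms);
extern void mark_job_failed_keep_next_start(Job *job);

static void
handle_start_result(Job *job, StartResult res)
{
	switch (res)
	{
		case START_OK:
			break;
		case START_NO_TS_WORKERS:
			/* Common case: the job is neither started nor failed; it
			 * keeps its next_start (and thus its priority). */
			put_back_in_scheduled_state(job);
			schedule_retry_in_ms(START_RETRY_MS);
			break;
		case START_NO_PG_WORKERS:
			/* Detected only after the job was marked started, so it is
			 * recorded as JOB_FAILURE_TO_START, but next_start is left
			 * unchanged. */
			mark_job_failed_keep_next_start(job);
			break;
	}
}
```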
When the index for a chunk was created, the attnos for the index
predicate were not adjusted, leading to insert errors on hypertables
with dropped columns that had indexes with predicates.
This PR adjusts the index predicate attnos when creating the chunk
index to match the chunk attnos.
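A minimal sketch of the adjustment, with a hypothetical remapping
helper; `ii_Predicate` is the real field on PostgreSQL's `IndexInfo`:

```c
#include "postgres.h"
#include "nodes/execnodes.h"
#include "utils/relcache.h"

/* Hypothetical helper that rewrites Var attnos from hypertable
 * numbering to chunk numbering, accounting for dropped columns. */
extern Node *ts_remap_attnos(Node *expr, Relation ht_rel, Relation chunk_rel);

static void
adjust_chunk_index_predicate(IndexInfo *indexinfo, Relation ht_rel,
							 Relation chunk_rel)
{
	if (indexinfo->ii_Predicate != NIL)
		indexinfo->ii_Predicate = (List *)
			ts_remap_attnos((Node *) indexinfo->ii_Predicate,
							ht_rel, chunk_rel);
}
```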