This refactors the `hypertable_distributed` test to make better use of
the `remote_exec` utility function. The refactoring also makes sure we
actually use space partitioning when testing distributed hypertables.
Chunks are placed across data nodes based on the ordinal of the slice
in the first space dimension, if such a dimension exists. For
instance, if a chunk belongs to the second slice in the space
dimension, this ordinal number will be used modulo the number of
data nodes to find the data node to place the chunk on.
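As a rough sketch of the placement rule (the table, column, and node
count below are illustrative, and the exact API may differ):

```sql
-- Illustration only: with three data nodes, a chunk whose "device" slice has
-- ordinal 1 (the second slice) is placed on data node 1 % 3 = 1.
SELECT create_distributed_hypertable('disttable', 'time', 'device',
                                     number_partitions => 3);
```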
However, the ordinal is calculated based on the existing slices in the
dimension, and, because slices are created lazily, the ordinal of a
specific slice might vary until all slices are created in the space
dimension. This has the result that chunks aren't consistently placed
on data nodes based on their space partition, prohibiting some
push-down optimizations that rely on consistent partitioning.
This change ensures the ordinal of a space slice is calculated as if
all slices in the dimension are pre-existing. This might still lead to
inconsistencies during times of repartitioning, but fixes issues that
occur initially when no slices exist.
Deparse a table into a set of SQL commands that can be used to
reconstruct it. In addition to the column definitions, it deparses
constraints, indexes, triggers and rules. Some table types are not
supported: temporary, partitioned, foreign and inherited tables, as
well as tables that use options. Row security is also not supported.
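For illustration, the deparsed command set for a simple table might
look roughly like the following (the table, index and trigger function
names are hypothetical, and the actual output format may differ):

```sql
-- Column definitions
CREATE TABLE metrics (time timestamptz NOT NULL, device int, value float);
-- Constraints
ALTER TABLE metrics ADD CONSTRAINT value_positive CHECK (value > 0);
-- Indexes
CREATE INDEX metrics_device_idx ON metrics (device);
-- Triggers (audit_trigger_func is assumed to exist)
CREATE TRIGGER metrics_audit AFTER INSERT ON metrics
    FOR EACH ROW EXECUTE FUNCTION audit_trigger_func();
```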
This patch changes chunk index creation to use the same functions
for creating an index in a single transaction and in multiple
transactions. The single-transaction index creation used to adjust the
original statement for the chunk, which led to problems with table
references not being adjusted properly for the chunk.
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.
test/sql/
- triggers.sql: Made required change
tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete
tsl/src/
- compression/compression.c:
- row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
- compressor_finish: Removed (obsolete)
src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
- process_altertable_end_subcmd: Removed (PG will handle case)
This patch adds a continuous aggregate with real-time aggregation
enabled to the update test suite, since we rebuild the view definition
for real-time aggregation during extension updates.
The continuous aggregate is in its own schema because all view
definitions in the public schema are dumped and those view
definitions will change between versions.
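A sketch of the kind of continuous aggregate added (the schema, view
name, and source hypertable are illustrative; real-time aggregation is
the default in 1.7):

```sql
CREATE SCHEMA update_test_cagg;
-- Continuous aggregate in its own schema so it is not affected by the
-- dump of public view definitions.
CREATE VIEW update_test_cagg.daily_avg
    WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, avg(value)
FROM conditions
GROUP BY time_bucket('1 day', time);
```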
Test code in C should use test-specific assertions that throw errors
instead of exiting the program with a signal (crash). Not only does
this provide more useful and easily accessible information about the
failing condition, but it also allows running the test suite without
assertions (`USE_ASSERT_CHECKING`) enabled. Having assertions disabled
during tests also provides more accurate test coverage numbers. Note
that these test-specific assertions are not intended to replace
regular assertions (`Assert`), which are used in non-test code.
The way to enable (or disable) assertions in CMake has also been
simplified and cleaned up. The option `-DASSERTIONS=[ON|OFF]` can be
used to enable assertions for a build, unless already enabled in the
PostgreSQL installation one is building against (in which case that setting takes
precedence). Note that the `ASSERTIONS` option defaults to `OFF` since
it is no longer necessary to have assertions enabled for tests.
Modify table state is not created with empty tables, which led
to NULL pointer evaluation.
Starting from PG12 the planner injects a gating plan node above
any node that has pseudo-constant quals.
To fix this, we need to check for such a gating node and handle the case.
We could optionally prune the extra node, since there's already
such a node below ChunkDispatch.
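For example, a qual that references no columns and cannot be folded at
plan time, such as a comparison involving `now()`, is pseudo-constant
and gets a gating node in PG12 (a sketch; table names are
illustrative):

```sql
-- The qual references no table columns, so PG12 evaluates it once in a
-- gating Result node above the insert's plan.
EXPLAIN (COSTS OFF)
INSERT INTO hyper
SELECT * FROM source_data
WHERE now() >= '2020-01-01';
```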
Fixes #1883.
The `plan_hypertable_cache` test used `current_date` to generate data,
which is inherently flaky since it can create a different number of
chunks depending on which date you start at. When the number of chunks
differs, the test output changes too.
When `classify_relation` is called for relations of subqueries, it
would not be able to correctly classify the relation unless it
was already in the cache. This patch changes `classify_relation` to
call `get_hypertable` without `CACHE_FLAG_NOCREATE` when the RangeTblEntry
has the inheritance flag set.
When copying from standard input, the range table was not set up to
handle the constraints for the target table and was instead initialized
to null. In addition, the range table index was set to zero, causing an
underflow when executing the constraint check. This commit fixes this
by initializing the range table and setting the index correctly.
The code worked correctly on PG12, so it is also refactored to
ensure that the range table and index are set the same way in all
versions.
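A minimal sketch of the scenario that used to fail (names are
illustrative):

```sql
CREATE TABLE hyper (time timestamptz NOT NULL, value float CHECK (value >= 0));
SELECT create_hypertable('hyper', 'time');
-- Copying from standard input exercises the constraint check that previously
-- used an uninitialized range table and a zero range table index.
COPY hyper FROM STDIN WITH (FORMAT csv);
2020-01-01 00:00:00+00,1.0
\.
```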
Fixes #1840
When doing a UNION ALL query between a hypertable and a regular
table, the hypertable would not get expanded, leading to
its data being missing from the result set.
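A sketch of the kind of query that was affected (table names are
illustrative):

```sql
-- Before this fix, rows from the hypertable were missing from the result
-- because its chunks were never expanded.
SELECT time, value FROM hyper
UNION ALL
SELECT time, value FROM regular_table;
```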
Due to the changes to the default view behaviour of continuous
aggregates, we need a new test suite for the update tests for 1.7.0.
This patch also changes the update tests for PG 9.6 and 10 to run on
cron, and those for PG 11 and 12 to run on pull requests.
When calling show_chunks or drop_chunks without specifying
a particular hypertable, TimescaleDB iterates through all
existing hypertables and builds a list. While doing this
it adds the internal '_compressed_hypertable_*' tables,
which leads to incorrect behaviour of the
ts_chunk_get_chunks_in_time_range function. This fix
filters out the internal compressed tables when scanning
in the ts_hypertable_get_all function.
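For example (a sketch; exact signatures vary by version):

```sql
-- Without naming a hypertable, these operate on all hypertables; the internal
-- '_compressed_hypertable_*' tables are now filtered out of that list.
SELECT show_chunks();
SELECT drop_chunks(older_than => interval '1 month');
```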
This adds a test for INSERTs with cached plans. This test caused
a segfault before 1.7, but the issue was fixed independently by the
refactoring of the INSERT path when adding PG12 support.
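A minimal sketch of the test scenario (names are illustrative):

```sql
PREPARE ins (timestamptz, float) AS INSERT INTO hyper VALUES ($1, $2);
-- After five executions, PostgreSQL may switch to a cached generic plan,
-- which is the case that used to segfault.
EXECUTE ins('2020-01-01', 1.0);
EXECUTE ins('2020-01-02', 2.0);
EXECUTE ins('2020-01-03', 3.0);
EXECUTE ins('2020-01-04', 4.0);
EXECUTE ins('2020-01-05', 5.0);
EXECUTE ins('2020-01-06', 6.0);
```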
PG12 by default plans CTEs from the WITH clause together with the rest
of the query. MATERIALIZED is used in the rowsecurity tests to produce
the same query plans as in PG11. This commit adds query plan tests with
the default behavior, which is equivalent to NOT MATERIALIZED.
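For example (a sketch using illustrative table and column names):

```sql
-- PG12 inlines the CTE by default (equivalent to NOT MATERIALIZED), changing
-- the plan; adding MATERIALIZED keeps the PG11-style plan with a CTE scan.
WITH docs AS MATERIALIZED (
    SELECT * FROM document WHERE dlevel <= 2
)
SELECT * FROM docs WHERE did > 0;
```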
Runs a modification of the rowsecurity test on PG12. The differences are:
- WITH OIDS was removed in PG12.
- The OID column is replaced with CTID in queries, which is expected to be
stable enough.
- MATERIALIZED is used on WITH to get the same plan.
- The detail of an error message is slightly different in PG12.
Improved ORDER BY in a number of queries in the PG11 and PG12 tests to
avoid result permutations.
If the telemetry response is malformed, strange errors will be
generated in the log because `DirectFunctionCall2` expects
the result of the function call to not be NULL and will throw an error
if it is.
By printing the response in the log we can debug what went wrong.
The telemetry response processing is handled in the function
`process_response`, which was an internal function and could not be
tested using unit tests.
This commit renames the function to follow conventions for extern
functions and adds test functions and tests to check that it can handle
well-formed responses.
No tests for malformed responses are added since the function cannot
currently handle that.
PG12 introduced support for custom table access methods. While one
could previously set a custom table access method on a hypertable, it
wasn't propagated to chunks. This is now fixed, and a test is added to
show that chunks inherit the table access method of the parent
hypertable.
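A sketch of the added test (assuming a custom access method `my_am`
has been created):

```sql
CREATE TABLE metrics (time timestamptz NOT NULL, value float) USING my_am;
SELECT create_hypertable('metrics', 'time');
-- New chunks now get the same table access method as the hypertable.
INSERT INTO metrics VALUES ('2020-01-01', 1.0);
```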
This change modifies the query test to use the test template
mechanism so that we can capture the plan differences introduced by
Postgres 12 pruning append nodes.
The PG12 output file also contains a plan change which seems to
result in a GroupAggregate being replaced by a less efficient Sort
plus HashAggregate. This new behavior is still being investigated,
but it is still a correct plan (just potentially suboptimal).
This change modifies the multi_transaction_index test to sort the
chunk_index selection commands by index_name. This fixes an issue
where the order would vary depending on the version of PostgreSQL.
This also made the multi_transaction_index test a test template so
that we can capture the explain difference for a plan that has an
append node pruned in PostgreSQL 12.
PostgreSQL 12 introduced space optimizations for indexes, which caused
the adaptive chunking test to fail since its measure of chunk size
includes indexes that now report different sizes.
To fix this, the adaptive chunking test now has version-specific
output files.
This change moves the custom_type, ddl, ddl_single, insert, and
partition tests under test/sql, as well as the move and reorder
tests under /tsl/test/sql to our test template framework. This
allows them to have different golden files for different versions of
Postgres.
With the exception of the reorder test, all of these tests produced
new output in PG12 which only differed by the exclusion of append
nodes from plan explanations (this exclusion is a new feature of
PG12). The reorder test had some of these modified plans, but also
included a test using a table with OIDs, which is not valid in PG12.
This test was modified to allow the table creation to fail, and we
captured the expected output in the new golden file for PG12.
PG12 allows users to add a WHERE clause when copying from a
file into a table. This change adds support for such clauses on
hypertables. Also fixes an issue that would have arisen in cases
where a table being copied into had a trigger that caused a row to
be skipped.
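For example (a sketch; the file path and columns are illustrative):

```sql
-- Only rows matching the WHERE clause are inserted into the hypertable.
COPY hyper (time, device, value) FROM '/tmp/data.csv'
    WITH (FORMAT csv) WHERE value > 0;
```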
The `pg_dump` command has slightly different informational output
across PostgreSQL versions, which causes problems for tests. This
change makes sure that all tests that use `pg_dump` use the
appropriate wrapper script where we can better control the output to
make it the same across PostgreSQL versions.
Note that the main `pg_dump` test still fails for other reasons that
will be fixed separately.
PG12 introduced a new feature that allows defining auto-generated
columns. This PR adds a check that prevents using such columns for
hypertable partitioning.
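A sketch of what the check rejects (names are illustrative):

```sql
CREATE TABLE metrics (
    time timestamptz NOT NULL,
    value float,
    value_twice float GENERATED ALWAYS AS (value * 2) STORED
);
-- Using the generated column as a partitioning column now raises an error.
SELECT create_hypertable('metrics', 'time', 'value_twice', 2);
```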
PostgreSQL 12 changed the log level in client tools, such as
`pg_dump`, which makes some of our tests fail due to different log
level labels.
This change filters and modifies the log level output of `pg_dump` in
earlier PostgreSQL versions to adopt the new PostgreSQL 12 format.
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as
follows:
1. construct and add `RelOptInfo`s for base rels and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.
However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like this:
1. construct and add RelOptInfo for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with make_one_rel
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` entails
a number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
under the assumption it has no inheritance children, so if it's a
hypertable we have to replan its paths
Unfortunately, the code for doing these things is static, so we need to
copy it into our own codebase, instead of just using PostgreSQL's.
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it were a `SELECT` and all
leaf tables are discovered.
- In stage 2, the original query is planned directly against each leaf
table discovered in stage 1, not as part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point we wish
to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
We have to let the leader participate in parallel plan execution
if for some reason no parallel workers were started.
This commit also changes the parallel EXPLAINs to not run with
ANALYZE because the output is not stable as it depends on
worker assignment.
The queries to produce test data for space partitioned
hypertables in the append test did not have an explicit
ORDER BY clause, leading to a different ordering for the
chunks created on PG12.
There was a race condition between the post_restore function
restarting the background worker and the setting of the
restoring flag to "off". If the worker started before the
change to the restoring flag had been committed, it would not
see the change and then die because the worker should exit
when the db is in a restoring state. This modifies the
post_restore function to use a restart instead of a start
so that it waits on the commit to start up. It also adds
logic to the entrypoint to reload config changes caused
by an `ALTER DATABASE SET` command. These changes are
normally only seen at connection startup but we have to
wait until after our lock on the modifying transaction is
released to know whether we should adopt them.
This PR improves the scheduling of jobs when the number of
jobs exceeds the amount of background workers. Previously,
this was not a case the scheduler handled well.
The basic strategy we employ to handle this case better is to
use a job's next_start field to create a priority for jobs.
More concretely, jobs are scheduled in increasing order of
next_start. If the scheduler runs out of BGWs, it waits
until BGWs become available and then retries,
also in increasing next_start order.
The first change this PR implements is to start jobs in order
of increasing next_start. We also make sure that if we run
out of BGWs, the scheduler will try again in START_RETRY_MS
(1 second by default).
This PR also needed to change the logic of what happens when
a job fails to start because BGWs have run out. Previously,
such jobs were marked as failed and their next_start was reset
using the regular post-failure backoff logic. But this meant
that a job would lose its priority every time we ran out of BGWs.
Thus, we changed this logic so that next_start does not change
when we encounter this situation.
There are actually 2 ways to run out of BGWs:
1) We hit the TimescaleDB limit on BGWs - in this case
the job is simply put back into the scheduled state, and it
will be retried in START_RETRY_MS. The job is not marked
started or failed. This is the common error.
2) We run out of PostgreSQL workers. We won't know if this
failed until we try to start the worker, by which time the
job must be in the started state. Thus if we run into this
error we must mark the job as failed. But we don't change
next_start. To do this we create a new job result type
called JOB_FAILURE_TO_START.
When the index for a chunk was created, the attnos for the index
predicate were not adjusted, leading to insert errors on hypertables
with dropped columns that had indexes with predicates.
This PR adjusts the index predicate attnos when creating the chunk
index to match the chunk's attnos.
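A sketch of the failing scenario (names are illustrative):

```sql
CREATE TABLE metrics (time timestamptz NOT NULL, junk int, device int, value float);
SELECT create_hypertable('metrics', 'time');
-- Dropping a column makes the attnos of later columns differ between the
-- hypertable and newly created chunks.
ALTER TABLE metrics DROP COLUMN junk;
-- A partial index whose predicate attnos must be remapped for each chunk.
CREATE INDEX ON metrics (time) WHERE device > 0;
-- Inserting into a new chunk used to fail because the chunk index predicate
-- still referenced the hypertable's attnos.
INSERT INTO metrics VALUES ('2020-01-01', 1, 1.0);
```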
To make tests more stable and to remove some repeated code in the
tests this PR changes the test runner to stop background workers.
Individual tests that need background workers can still start them.
This PR only stops background workers for the initial database of the
test; behaviour for additional databases created during the tests will
not change.
This change fixes the `plan_expand_hypertable` test, which was broken
and never ran the output comparison due to prematurely ending in an
uncaught error. The test appeared to succeed, however, since the
broken test's "expected" files had also been committed to the repo.
Fixing the test revealed that the query output with our optimizations
enabled is incorrect for outer joins (i.e., the output from the query
differed from regular PostgreSQL). Restriction clauses were too
aggressively pushed down to outer relations, leading to chunk
exclusion when it shouldn't happen. This incorrect behavior has also
been fixed.
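A sketch of the kind of outer-join query that was affected (names are
illustrative):

```sql
-- The qual on m.time sits in the join's ON clause, so it only restricts which
-- rows match the join; it must not be used to exclude chunks of the outer
-- hypertable, or rows that should appear with NULLs on the right are lost.
SELECT m.time, m.device, o.value
FROM metrics m
LEFT JOIN other_metrics o
    ON m.device = o.device AND m.time > '2020-01-01';
```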
When using `COPY TO` on a hypertable (which copies from the hypertable
to some other destination), nothing will be printed and nothing will be
copied. Since this can be potentially confusing for users, this commit
prints a notice when an attempt is made to copy from a hypertable
directly (not using a query) to some other destination.
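For example (a sketch):

```sql
-- Copies nothing, since the root hypertable itself holds no rows; a notice
-- now points users to the query form instead.
COPY hyper TO STDOUT;
-- The query form copies the hypertable's data as expected.
COPY (SELECT * FROM hyper) TO STDOUT;
```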
Descending into subplans during constification of params
seems unsafe and has led to bugs. Turning this off seems
to be safe and does not regress any tested optimizations.
In the future we may want to find a way to optimize this
case as well.
Fixes #1598.
We added a timescaledb.ignore_invalidation_older_than parameter for
continuous aggregates. This parameter accepts a time interval (e.g. 1
month). If set, it limits the amount of time for which to process
invalidations. Thus, if
timescaledb.ignore_invalidation_older_than = '1 month'
then any modifications for data older than 1 month from the current
timestamp at insert time will not cause updates to the continuous
aggregate. This limits the amount of work that a backfill can trigger.
This parameter must be >= 0. A value of 0 means that invalidations are
never processed.
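A sketch of setting the parameter on an existing continuous aggregate
(the view name is illustrative; the option can also be given when the
continuous aggregate is created):

```sql
-- Modifications to data older than one month (relative to insert time) will
-- no longer trigger updates to this continuous aggregate.
ALTER VIEW daily_avg
    SET (timescaledb.ignore_invalidation_older_than = '1 month');
```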
When recording invalidations for the hypertable at insert time, we use
the maximum ignore_invalidation_older_than of any continuous agg attached
to the hypertable as a cutoff for whether to record the invalidation
at all. When materializing a particular continuous agg, we use that
agg's ignore_invalidation_older_than cutoff. However, we have to apply
that cutoff relative to the insert time, not the materialization
time, to make it easier for users to reason about. Therefore,
we record the insert time as part of the invalidation entry.
PG11 added an optimization where columns that were added by
an ALTER TABLE that had a DEFAULT value did not cause a table
re-write. Instead, those columns are filled with the default
value on read.
But, this mechanism does not apply to catalog tables and does
not work with our catalog scanning code. This test makes
sure we never have such alters in our updates.
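A sketch of the pattern the test guards against in our update scripts
(the catalog table and column names are hypothetical):

```sql
-- On PG11+, this does not rewrite the table; the default is applied on read,
-- which our catalog scanning code does not handle, so update scripts must not
-- add defaulted columns to catalog tables this way.
ALTER TABLE _timescaledb_catalog.example_table
    ADD COLUMN new_col integer DEFAULT 0;
```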