63 Commits

Author SHA1 Message Date
Dmitry Simonenko
ea5038f263 Add connection cache invalidation ignore logic
Calling the `ts_dist_cmd_invoke_on_data_nodes_using_search_path()` function
without an active transaction allows a connection invalidation event to
happen between applying `search_path` and the actual command
execution, which leads to an error.

This change introduces a way to ignore connection cache invalidations
using `remote_connection_cache_invalidation_ignore()` function.

This work is based on @nikkhils' original fix and problem research.

Fix #4022
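The deferral mechanism described above can be sketched in Python; all names here are hypothetical illustrations, not the extension's C API:

```python
import threading
from contextlib import contextmanager

class ConnectionCache:
    """Toy connection cache; the guard mirrors the idea behind
    remote_connection_cache_invalidation_ignore()."""
    def __init__(self):
        self._lock = threading.Lock()
        self._ignore = False    # while True, invalidations are deferred
        self._pending = []      # invalidations received while ignoring
        self.invalidated = []

    @contextmanager
    def ignore_invalidations(self):
        with self._lock:
            self._ignore = True
        try:
            yield
        finally:
            with self._lock:
                self._ignore = False
                # apply deferred invalidations once the multi-step
                # command sequence has finished
                self.invalidated.extend(self._pending)
                self._pending.clear()

    def invalidate(self, conn_id):
        with self._lock:
            if self._ignore:
                self._pending.append(conn_id)  # deferred, not dropped
            else:
                self.invalidated.append(conn_id)

cache = ConnectionCache()
with cache.ignore_invalidations():
    # event arrives between applying search_path and running the command
    cache.invalidate("dn1")
    assert cache.invalidated == []   # connection stays usable mid-sequence
assert cache.invalidated == ["dn1"]  # invalidation applied afterwards
```

The key property is that invalidations are deferred, not lost, so the cache still converges to the correct state after the command sequence.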
2022-10-04 10:50:45 +03:00
Jan Nidzwetzki
de30d190e4 Fix a deadlock in chunk decompression and SELECTs
This patch fixes a deadlock between chunk decompression and SELECT
queries executed in parallel. The change in
a608d7db614c930213dee8d6a5e9d26a0259da61 requests an AccessExclusiveLock
for the decompressed chunk instead of the compressed chunk, resulting in
deadlocks.

In addition, an isolation test has been added to test that SELECT
queries on a chunk that is currently decompressed can be executed.

Fixes #4605
2022-09-22 14:37:14 +02:00
Sven Klemm
424f6f7648 Remove database port from test output
Don't include the database ports used in the test output, as this
will lead to failing tests when running against a local instance
or against a preconfigured cloud instance.
2022-09-15 07:56:12 +02:00
Alexander Kuzmenkov
706a3c0e50 Enable statement logging in the tests
Remove 'client_min_messages = LOG' where not needed, and add the 'LOG:
statement' output otherwise.
2022-08-25 15:29:28 +03:00
Jan Nidzwetzki
a608d7db61 Fix race conditions during chunk (de)compression
This patch introduces a further check to compress_chunk_impl and
decompress_chunk_impl. After all locks are acquired, a check is made
to see if the chunk is still (un-)compressed. If the chunk was
(de-)compressed while waiting for the locks, the (de-)compression
operation is stopped.

In addition, the chunk locks in decompress_chunk_impl
are upgraded to AccessExclusiveLock to ensure the chunk is not deleted
while other transactions are using it.

Fixes: #4480
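The re-check described above can be sketched in Python; the lock and chunk state are stand-ins for the actual relation locks and catalog state:

```python
import threading

chunk = {"compressed": False}
chunk_lock = threading.Lock()

def compress_chunk(chunk):
    """Sketch of the check added after all locks are acquired: verify
    the chunk is still uncompressed before doing any work."""
    with chunk_lock:              # may wait behind a concurrent session
        if chunk["compressed"]:
            return False          # compressed while we waited: stop
        chunk["compressed"] = True # toy stand-in for the real compression
        return True

assert compress_chunk(chunk) is True   # first caller compresses
assert compress_chunk(chunk) is False  # second caller stops early
```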
2022-07-05 15:13:10 +02:00
Erik Nordström
7b9d867358 Fix crash and other issues in telemetry reporter
Make the following changes to the telemetry reporter background worker:

- Add a read lock to the current relation that the reporter collects
  stats for. This lock protects against concurrent deletion of the
  relation, which could lead to errors that would prevent the reporter
  from completing its report.
- Set an active snapshot in the telemetry background process for use
  when scanning a relation for stats collection.

- Reopen the scan iterator when collecting chunk compression stats for
  a relation instead of keeping it open and restarting the scan. The
  previous approach seems to cause crashes due to memory corruption of
  the scan state. Unfortunately, the exact cause has not been
  identified, but the change has been verified to work on a live
  running instance (thanks to @abrownsword for the help with
  reproducing the crash and testing fixes).

Fixes #4266
2022-05-20 16:52:54 +02:00
Fabrízio de Royes Mello
f266f5cf56 Continuous Aggregates finals form
Following work started by #4294 to improve performance of Continuous
Aggregates by removing the re-aggregation in the user view.

This PR gets rid of the `partialize_agg` and `finalize_agg` aggregate
functions and stores the finalized aggregated (plain) data in the
materialization hypertable.

Because we no longer store partials and have removed the
re-aggregation, it is now possible to create indexes on aggregated
columns in the materialization hypertable to improve performance
even more.

Also removed restrictions on types of aggregates users can perform
with Continuous Aggregates:
* aggregates with DISTINCT
* aggregates with FILTER
* aggregates with FILTER in HAVING clause
* aggregates without combine function
* ordered-set aggregates
* hypothetical-set aggregates

By default, new Continuous Aggregates will be created using this new
format, but the previous version (with partials) will still be
supported.

Users can create the previous style by setting the storage parameter
`timescaledb.finalized` to `false` during the creation of the
Continuous Aggregate.

Fixes #4233
2022-05-18 11:38:58 -03:00
gayyappan
9f4dcea301 Add _timescaledb_internal.freeze_chunk API
This is an internal function to freeze a chunk
for PG14 and later.

This function sets a chunk status to frozen.
Operations that modify the chunk data
(like insert, update, delete) are not
supported. Frozen chunks can be dropped.

Additionally, chunk status is cached as part of
classify_relation.
2022-05-10 14:00:32 -04:00
Sven Klemm
e2d578cfac Fix cagg_multi_dist_ht isolation test
Adjust the cagg_multi_dist_ht isolation test to no longer include
chunk names. Isolation tests that expose chunk names cannot be
run by themselves or in a custom test list because chunk numbering
depends on the spec file list.
2022-01-14 10:28:57 +01:00
Sven Klemm
6a8c2b666e Shorten isolation test spec file names
Isolation test identifiers have a length limit, and when the
isolationtester encounters names that are too long they get
truncated. More recent versions will produce a warning when this
truncation is done, leading to flaky tests.
Only continuous_aggs_concurrent_refresh_dist_ht.spec exceeded
the truncation limit, but since all the continuous_aggs isolation
tests have quite long names, this patch shortens them from
continuous_aggs_* to cagg_* to prevent this problem from affecting
other isolation tests as well.
2022-01-14 10:28:57 +01:00
Sven Klemm
22fd4d4426 Fix compression_ddl isolation test
This patch gets rid of all hardcoded chunk names from the
compression_ddl isolation test and also gets rid of chunk names
from the output files. Chunk names in isolation test files are
problematic as they prevent changing the order of execution of
isolation test runs, since the database is shared between the
individual tests. Output will also differ when only a subset of the
tests is run, leading to flaky tests.
2022-01-13 20:31:21 +01:00
Mats Kindahl
aae19319c0 Rewrite recompress_chunk as procedure
When executing `recompress_chunk` and a query at the same time, a
deadlock can be generated because the chunk relation, the chunk
index, and the compressed and uncompressed chunks are locked in
different orders. In particular, when `recompress_chunk` is executing,
it will first decompress the chunk and, as part of that, lock the
uncompressed chunk index in AccessExclusive mode; when trying to
compress the chunk again, it will try to lock the uncompressed chunk
in AccessExclusive mode as part of truncating it.

Note that `decompress_chunk` and `compress_chunk` lock the relations in
the same order, and the issue arises because the procedures are combined
into a single transaction.

To avoid the deadlock, this commit rewrites the `recompress_chunk` to
be a procedure and adds a commit between the decompression and
compression. Committing the transaction after the decompress will allow
reads and inserts to proceed by working on the uncompressed chunk, and
the compression part of the procedure will take the necessary locks in
strict order, thereby avoiding a deadlock.

In addition, the isolation test is rewritten so that instead of adding
a waitpoint in the PL/SQL function, we implement the isolation test by
taking a lock on the compressed table after the decompression.

Fixes #3846
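The shape of the fix can be sketched in Python; the locks below are hypothetical stand-ins for the chunk-index and chunk-relation locks, and the lock releases play the role of the intermediate commit:

```python
import threading

# Stand-ins for the uncompressed chunk index and the uncompressed
# chunk relation; names are illustrative, not the actual lock tags.
index_lock = threading.Lock()
chunk_lock = threading.Lock()

def decompress_phase():
    with index_lock:   # AccessExclusive on the uncompressed chunk index
        pass           # ...decompress...
    # lock released here: models the COMMIT inside the procedure

def compress_phase():
    with chunk_lock:   # AccessExclusive on the uncompressed chunk
        pass           # ...truncate and compress...

def recompress_chunk():
    """Procedure-style recompress: each phase is its own transaction,
    so no session ever holds both locks at once and no lock-order
    cycle with a concurrent query can form."""
    decompress_phase()
    compress_phase()
    return "done"

assert recompress_chunk() == "done"
```

Because the first phase's locks are released before the second phase begins, reads and inserts can interleave between the phases, which is exactly what the intermediate commit buys.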
2021-12-09 19:42:12 +01:00
Mats Kindahl
1ff6dfe6ab Fix race condition in deadlock_recompress_chunk
After the synchronizing lock is released and the transaction is
committed, both sessions are free to execute independently. This means
that the query can actually start running before the recompress step
has completed, which means that the order for completion is
non-deterministic.

We fix this by adding a marker so that the query is not reported as
completed until the recompress has finished execution. Since markers in
isolation tests are a recent addition, we only run the test on
PostgreSQL versions that support them.

Part-Of: #3846
2021-12-03 11:24:35 +01:00
Mats Kindahl
112107546f Eliminate deadlock in recompress chunk policy
When executing the recompress chunk policy concurrently with queries, a
deadlock can be generated because the chunk relation and the chunk
index, or the uncompressed or the compressed chunk, are locked in
different orders. In particular, when the recompress chunk policy is
executing, it will first decompress the chunk and as part of that lock
the compressed chunk in `AccessExclusive` mode when dropping it and when
trying to compress the chunk again it will try to lock the uncompressed
chunk in `AccessExclusive` mode as part of truncating it.

To avoid the deadlock, this commit updates the recompress policy to do
the compression and the decompression steps in separate transactions,
which will avoid the deadlock since each phase (decompress and compress
chunk) locks indexes and compressed/uncompressed chunks in the same
order.

Note that this fixes the policy only, and not the `recompress_chunk`
function, which still is prone to deadlocks.

Partial-Bug: #3846
2021-11-30 18:04:30 +01:00
Fabrízio de Royes Mello
2ccba5ecc9 Refactor isolation tests to prevent SQL injection
During the development of Continuous Aggregates for Distributed
Hypertables we left some work to be done later, and refactoring the
isolation tests to prevent SQL injection was one of them.

Per discussion:
https://github.com/timescale/timescaledb/pull/3693#discussion_r735098888

Epic issue #3721
2021-11-17 12:59:07 -03:00
Markos Fountoulakis
221437e8ef Continuous aggregates for distributed hypertables
Add support for continuous aggregates for distributed hypertables by
allowing a continuous aggregate to read from a distributed hypertable
so that the continuous aggregate is on the access node while the
hypertable data is on the data nodes.

For distributed hypertables, both the hypertable and continuous
aggregate invalidation log are kept on the data nodes and the refresh
window is computed at refresh time on each data node. Since the
continuous aggregate materialization hypertable is not present on the
data nodes, the invalidation log was extended to allow using a
non-local hypertable id on the data nodes. This means that you cannot
create continuous aggregates on the data nodes since those could clash
with continuous aggregates on the access node.

Some utility statements added entries to the invalidation logs
directly (truncating chunks and hypertables, as well as dropping
individual chunks), so to handle this case, internal functions were
added to allow logging invalidation on the data nodes from the access
node.

The commit also includes some fixes to memory context usage that
caused crashes for invalidation triggers, and also disables per data
node queries during refresh since those would otherwise generate an
exception.

Fixes #3435

Co-authored-by: Mats Kindahl <mats@timescale.com>
2021-10-25 18:20:11 +03:00
Sven Klemm
acc6abee92 Support transparent decompression on individual chunks
This patch adds support for transparent decompression in queries
on individual chunks.
This is required for distributed hypertables with compression
when enable_per_data_node_queries is set to false. Without
this functionality queries on distributed hypertables with
compression would not return data for compressed chunks as
the generated FDW queries would target individual chunks.

Fixes #3714
2021-10-20 20:42:21 +02:00
gayyappan
0277ed7461 Verify compressed chunk validity for insert path
When an insert into a compressed chunk is blocked by a
concurrent recompress_chunk, the latter process could move
the storage for the compressed chunk. Verify validity of
the compressed chunk before proceeding to acquire locks.

Fixes #3400
2021-07-22 15:18:42 -04:00
Dmitry Simonenko
40d2bf17b6 Add support for error injections
Rework the debug waitpoint functionality to produce an error in
case the waitpoint is enabled.

This update introduces a controlled way to simulate error
scenarios during testing.
2021-07-02 16:43:36 +03:00
Mats Kindahl
71e8f13871 Add workflow and CMake support for formatting
Add a workflow to check that CMake files are correctly formatted as
well as a custom target to format CMake files in the repository. This
commit also runs the formatting on all CMake files in the repository.
2021-06-17 22:52:29 +02:00
gayyappan
426918c59f Fix locking issue when updating chunk status
Two insert transactions could potentially try
to update the chunk status to unordered. This results in
one of the transactions failing with a "tuple concurrently
updated" error.
Before updating the status, lock the tuple for update, thus
forcing the other transaction to wait for the tuple lock, then
check the status column value and update it if needed.
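The lock-then-recheck pattern can be sketched in Python; the mutex stands in for the row-level `FOR UPDATE` tuple lock:

```python
import threading

status = {"value": "ordered"}
tuple_lock = threading.Lock()  # stand-in for locking the tuple FOR UPDATE

def mark_unordered():
    """Acquire the tuple lock first, then re-read the status and update
    only if it still needs changing."""
    with tuple_lock:                  # the second transaction waits here...
        if status["value"] != "unordered":
            status["value"] = "unordered"
            return True
        return False                  # ...then sees the work is already done

assert mark_unordered() is True
assert mark_unordered() is False  # no "tuple concurrently updated" error
```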
2021-05-24 18:03:47 -04:00
Dmitry Simonenko
db27b23e15 Fix dist_restore_point test on PG11.0
The test fails on PG11.0 since there is no notice during data node deletion.
Reduce client notice level to warning, to make it compatible with other versions.

Fixes #3096
2021-04-09 13:30:29 +03:00
Dmitry Simonenko
6a1c81b63e Add distributed restore point functionality
This change adds the create_distributed_restore_point() function,
which allows creating a recovery restore point across data
nodes.

Fix #2846
2021-02-25 15:39:50 +03:00
Sven Klemm
23cd4098ef Remove unreferenced steps from remote_create_chunk test
This patch removes unreferenced steps from the remote_create_chunk
isolation test because PG13 will print warnings about unreferenced
steps leading to test failures.
2021-01-15 15:45:53 +01:00
Erik Nordström
2a22b7e9e1 Optimize locking for create chunk API
The create-chunk API, which is used by the access node to create
chunks on data nodes, always grabs a lock on the root hypertable. This
change optimizes this locking so that the lock is only taken when the
chunk doesn't already exist.

A new isolation test is also added to test locking behavior with
concurrent transactions creating a chunk.
2020-12-16 15:00:54 +01:00
Erik Nordström
c311b44a09 Fix crash for concurrent drop and compress chunk
This change fixes a segfault that occurred when `drop_chunks` is
concurrently executed with `compress_chunk` and the same chunk that
gets dropped is also being compressed.

The crash happened because the tuple lock status function for a
dimension slice passed in a pointer to a dimension slice that was
always NULL.

An isolation test is also added to cover concurrent compression and
drop of the same chunk. To make the test pass with identical errors
for PG11 and PG12, additional changes are made to the scanner API to
pass on the lock failure data so that it is possible to distinguish
between an update and delete on PG11.
2020-11-30 17:49:15 +01:00
Sven Klemm
b1c28c9c7c Remove unreferenced steps from isolation tests
Some isolation tests had steps that were not referenced in
any of the permutations so this patch removes those.
2020-10-15 04:03:48 +02:00
gayyappan
ef7f21df6d Modify job_stats and continuous_aggregates view
Use hypertable_schema and hypertable_name instead
of regclass hypertable in job_stats and
continuous_aggregates views.
2020-10-01 11:39:10 -04:00
Ruslan Fomkin
66c63476e5 Change cagg refresh to cover buckets
Refresh of a continuous aggregate was expanding the refresh window to
include buckets which contain the start and end of the window. This
was leading to refreshing dropped data into the first bucket in the
corner case, when drop_before of a retention policy is the same as
start_offset of a continuous aggregate policy and the last dropped
chunk happens to intersect with the first bucket. See #2198 for
detailed discussion.

The behavior of the refresh that is called when chunks are dropped is
not changed, i.e., buckets which fully cover chunks to be dropped
are refreshed if needed (i.e., there were changes in the chunks which
were not refreshed yet). Changing it can be done separately if needed.

Fixes #2198
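The bucket-alignment issue can be illustrated with a small Python sketch; the inward-snapping rule and integer time scale are illustrative simplifications, not the exact fix:

```python
def time_bucket(width, ts):
    """Start of the bucket containing ts (integer time, origin 0),
    analogous to time_bucket() with integer arguments."""
    return ts - (ts % width)

def align_window(width, start, end):
    """Snap the refresh window inward to complete buckets so the
    refresh never touches a bucket extending outside [start, end)."""
    if start % width == 0:
        aligned_start = start
    else:
        aligned_start = time_bucket(width, start) + width  # round up
    aligned_end = time_bucket(width, end)                  # round down
    return aligned_start, aligned_end

# A window starting mid-bucket no longer drags in the earlier
# (possibly dropped) part of that bucket:
assert align_window(10, 15, 40) == (20, 40)
assert align_window(10, 20, 45) == (20, 40)
```

Rounding the start outward instead (to `time_bucket(width, start)`) is what pulled the dropped region back into the first bucket in the corner case described above.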
2020-09-29 22:03:07 +02:00
Erik Nordström
5179447613 Remove completed threshold
The completed threshold in the TimescaleDB catalog is no longer used
by the refactored continuous aggregates, so it is removed.

Fixes #2178
2020-09-15 17:18:59 +02:00
Erik Nordström
202692f1ef Make tests use the new continuous aggregate API
Tests are updated to no longer use continuous aggregate options that
will be removed, such as `refresh_lag`, `max_interval_per_job` and
`ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also
been replaced with `CALL refresh_continuous_aggregate()` using ranges
that try to replicate the previous refresh behavior.

The materializer test (`continuous_aggregate_materialize`) has been
removed, since this tested the "old" materializer code, which is no
longer used without `REFRESH MATERIALIZED VIEW`. The new API using
`refresh_continuous_aggregate` already allows manual materialization
and there are two previously added tests (`continuous_aggs_refresh`
and `continuous_aggs_invalidate`) that cover the new refresh path in
similar ways.

When updated to use the new refresh API, some of the concurrency
tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have
slightly different concurrency behavior. This is explained by
different and sometimes more conservative locking. For instance, the
first transaction of a refresh serializes around an exclusive lock on
the invalidation threshold table, even if no new threshold is
written. The previous code took the heavier lock only once, and only
if a new threshold was written. This new, stricter locking means that
insert processes that read the invalidation threshold will block for a
short time when there are concurrent refreshes. However, since this
blocking only occurs during the first transaction of the refresh
(which is quite short), it probably doesn't matter too much in
practice. The relaxing of locks to improve concurrency and performance
can be implemented in the future.
2020-09-11 16:07:21 +02:00
Mats Kindahl
9565cbd0f7 Continuous aggregates support WITH NO DATA
This commit will add support for `WITH NO DATA` when creating a
continuous aggregate and will refresh the continuous aggregate when
creating it unless `WITH NO DATA` is provided.

All test cases are also updated to use `WITH NO DATA`, and an
additional test case is added to verify that both `WITH DATA` and
`WITH NO DATA` work as expected.

Closes #2341
2020-09-11 14:02:41 +02:00
Mats Kindahl
c054b381c6 Change syntax for continuous aggregates
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view, while `CREATE MATERIALIZED VIEW` creates a table.  Raise an
error if `CREATE VIEW` is used to create a continuous aggregate and
redirect to `CREATE MATERIALIZED VIEW`.

In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.

Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.

Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.

Fixes #2233

Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
2020-08-27 17:16:10 +02:00
Erik Nordström
a48a4646b0 Improve the concurrent refresh test
This change moves the invalidation threshold in the setup phase of the
concurrent refresh test for continuous aggregates in order to generate
invalidations. Without any invalidations, the invalidation logs are
never really processed and thus not subjected to concurrency.
2020-08-19 09:53:39 +02:00
Erik Nordström
c01faa72f0 Set invalidation threshold during refresh
The invalidation threshold governs the window of data from the head of
a hypertable that shouldn't be subject to invalidations in order to
reduce write amplification during inserts on the hypertable.

When a continuous aggregate is refreshed, the invalidation threshold
must be moved forward (or initialized if it doesn't previously exist)
whenever the refresh window stretches beyond the current threshold.

Tests for setting the invalidation threshold are also added, including
new isolation tests for concurrency.
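The forward-only update rule described above can be sketched in Python (an illustrative helper, not the C implementation):

```python
def update_invalidation_threshold(current, refresh_window_end):
    """Move the invalidation threshold forward only; initialize it if
    it does not previously exist, and never move it backward."""
    if current is None:
        return refresh_window_end        # initialize on first refresh
    return max(current, refresh_window_end)

assert update_invalidation_threshold(None, 100) == 100  # initialized
assert update_invalidation_threshold(100, 150) == 150   # moved forward
assert update_invalidation_threshold(150, 120) == 150   # never backward
```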
2020-08-12 11:16:23 +02:00
Sven Klemm
d547d61516 Refactor continuous aggregate policy
This patch modifies the continuous aggregate policy to store its
configuration in the jobs table.
2020-08-11 22:57:02 +02:00
gayyappan
9f13fb9906 Add functions for compression stats
Add chunk_compression_stats and hypertable_compression_stats
functions to get before/after compression sizes
2020-08-03 10:19:55 -04:00
Erik Nordström
7c4247c3fb Add test for concurrent continuous aggregate refresh
This change adds a new isolation test for concurrent refreshing on a
continuous aggregate. Although a bucket (group) in a continuous
aggregate should be unique on the GROUP BY columns there is no unique
constraint on the materialized hypertable to protect against duplicate
buckets. Therefore, concurrent refreshes can result in duplicate rows
in the materialized hypertable although such duplicates should not be
possible by the underlying query's definition.
2020-07-30 01:04:32 +02:00
Ruslan Fomkin
13aa729f68 Add license note to TSL isolation test
Adding Timescale License header to specs of isolation TSL tests.
2020-07-18 09:19:24 +02:00
gayyappan
43edbf8639 Fix concurrent tuple deletes for continuous aggregates
When we have multiple continuous aggregates defined on
the same hypertable, they could try to delete the hypertable
invalidation logs concurrently. Resolve this by serializing
invalidation log deletes by raw hypertable id.

Fixes #1940
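The per-hypertable serialization can be sketched in Python; the lock table and function names are hypothetical stand-ins for the actual catalog-level locking:

```python
import threading
from collections import defaultdict

# One lock per raw hypertable id: deletes for the same hypertable
# serialize, while different hypertables stay independent.
log_locks = defaultdict(threading.Lock)
deleted = []

def delete_invalidation_entries(raw_hypertable_id, entries):
    """Serialize invalidation-log deletes by raw hypertable id so two
    continuous aggregates on the same hypertable cannot race."""
    with log_locks[raw_hypertable_id]:
        deleted.extend((raw_hypertable_id, e) for e in entries)

t1 = threading.Thread(target=delete_invalidation_entries, args=(1, ["a"]))
t2 = threading.Thread(target=delete_invalidation_entries, args=(1, ["b"]))
t1.start(); t2.start(); t1.join(); t2.join()
assert sorted(e for (_, e) in deleted) == ["a", "b"]  # both applied, no race
```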
2020-07-03 23:46:01 -04:00
Sven Klemm
5aaa07b9ee Fix flaky reorder_vs_insert isolation test
This patch makes the lock_timeout values for the 2 sessions distinct
so they will always fire in the same order, leading to reproducible
results.
2020-06-19 19:40:02 +02:00
Sven Klemm
5b8de4710e Fix compression_ddl isolation test
The compression_ddl test had a permutation that depended on
PGISOLATIONTIMEOUT to cancel the test, leading to an unreasonably
long-running and flaky test. This patch changes the test to set
lock_timeout instead to cancel the blocking much earlier.
2020-06-19 00:05:37 +02:00
Sven Klemm
4a6bdb7f1b Increase lock_timeout for isolation tests
In slower environments the isolation tests are extremely flaky
because lock_timeout is only 50ms. This patch changes lock_timeout
to 500ms for the isolation tests, leading to much more reliable tests
in those environments.
2020-06-19 00:05:37 +02:00
Sven Klemm
c90397fd6a Remove support for PG9.6 and PG10
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros, the following changes are made:

remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
remove PG_VERSION_SUPPORTS_MULTINODE
2020-06-02 23:48:35 +02:00
Stephen Polcyn
b57d2ac388 Cleanup TODOs and FIXMEs
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.

test/sql/
- triggers.sql: Made required change

tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete

tsl/src/
- compression/compression.c:
  - row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
  - compressor_finish: Removed (obsolete)

src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
  - process_altertable_end_subcmd: Removed (PG will handle case)
2020-05-18 20:16:03 -04:00
gayyappan
ed64af76a5 Fix real time aggregate support for multiple aggregates
We should compute the watermark using the materialization
hypertable id and not the raw hypertable id.
New test cases added to continuous_aggs_multi.sql. Existing test
cases in continuous_aggs_multi.sql were not correctly updated
for this feature.

Fixes #1865
2020-05-15 10:15:53 -04:00
Ruslan Fomkin
403782a589 Run regression test on latest PG versions
Docker images are built with the latest versions of PostgreSQL, so
the regression tests are updated to run on the latest PG versions.
Authentication for the PG docker images is also fixed, since it is
required after a recent change in docker-library/postgres@42ce743.

Running isolation tests on new versions of PG produces additional
output:
- Notifications are no longer missed in the output
  (postgres/postgres@ebd4992)
- Notifications are prefixed with the session name
  (postgres/postgres@a28e10e)
- Timeout cancellation is printed in isolation tests
  (postgres/postgres@b578404)

The expected outputs are modified to succeed on latest PG versions,
while the affected isolation tests are disabled for earlier versions
of PG.
2020-04-02 08:57:28 +02:00
Sven Klemm
2ae4592930 Add real-time support to continuous aggregates
This PR adds a new mode for continuous aggregates that we name
real-time aggregates. Unlike the original mode, this new mode will
combine materialized data with new data received after the last
refresh has happened. This new mode will be the default behaviour
for newly created continuous aggregates.

To upgrade existing continuous aggregates to the new behaviour,
the following command needs to be run for all continuous aggregates:

ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=false);

To disable this behaviour for newly created continuous aggregates
and get the old behaviour, the following command can be run:

ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=true);
2020-03-31 22:09:42 +02:00
gayyappan
70e23d3d28 Modify isolation test makefile rules
PG12 needs version-specific output for some
isolation tests. Modify the makefile to make it
consistent with the use of TEST_TEMPLATES in other
regression suites, and add version-specific output
files.
2020-03-20 17:41:46 -04:00
Sven Klemm
08c3d9015f Change log level for cagg materialization messages
The log level used for continuous aggregate materialization messages
was INFO, which is meant for requested information. Since there is no
way to control the behaviour externally, INFO is a suboptimal choice
because INFO messages cannot be easily suppressed, leading to
irreproducible test output. Even though time can be mocked to make
output consistent, this is only available in debug builds.

This patch changes the log level of those messages to LOG, so
clients can easily control the output by setting client_min_messages.
2020-03-06 01:09:08 +01:00