4822 Commits

Author SHA1 Message Date
Sven Klemm
e4eb666ca3 Release 2.16.0
This release contains significant performance improvements when working with compressed data, extended join
support in continuous aggregates, and the ability to define foreign keys from regular tables to hypertables.
We recommend that you upgrade at the next available opportunity.

In TimescaleDB v2.16.0 we:

* Introduce multiple performance-focused optimizations for data manipulation operations (DML) over compressed chunks.

  Improved upsert performance by more than 100x in some cases and more than 1000x in some update/delete scenarios.

* Add the ability to define chunk skipping indexes on non-partitioning columns of compressed hypertables.

  TimescaleDB v2.16.0 extends chunk exclusion to use those skipping (sparse) indexes when queries filter on the relevant columns,
  and prune chunks that do not include any relevant data for calculating the query response.

* Offer new options for use cases that require foreign keys to be defined.

  You can now add foreign keys from regular tables to hypertables. We have also removed
  some overly restrictive locks in the reverse direction that blocked access to referenced tables
  while compression was running.

* Extend Continuous Aggregates to support more types of analytical queries.

  More types of joins are supported, as are additional equality operators on join clauses and
  joins between multiple regular tables.

**Highlighted features in this release**

* Improved query performance through chunk exclusion on compressed hypertables.

  You can now define chunk skipping indexes on compressed chunks for any column with one of the following
  data types: `smallint`, `int`, `bigint`, `serial`, `bigserial`, `date`, `timestamp`, `timestamptz`.

  After you call `enable_chunk_skipping` on a column, TimescaleDB tracks the min and max values for
  that column. TimescaleDB uses that information to exclude chunks for queries that filter on that
  column, and would not find any data in those chunks.
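
  A minimal sketch of how this is used; the table and column names below are illustrative, not from the release notes:

  ```sql
  -- Hypothetical hypertable; names are illustrative.
  CREATE TABLE readings (
      ts        timestamptz NOT NULL,
      device_id int         NOT NULL,
      value     float
  );
  SELECT create_hypertable('readings', 'ts');

  -- Track min/max values of device_id per chunk.
  SELECT enable_chunk_skipping('readings', 'device_id');

  -- Queries filtering on device_id can now exclude chunks whose
  -- tracked range cannot contain matching rows.
  SELECT * FROM readings WHERE device_id = 42;
  ```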

* Improved upsert performance on compressed hypertables.

  By using index scans to verify constraints during inserts on compressed chunks, TimescaleDB speeds
  up some `ON CONFLICT` clauses by more than 100x.
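
  For example, an upsert of this shape benefits; the table and the unique constraint on `(ts, device_id)` are hypothetical:

  ```sql
  -- Assumes a hypertable "readings" with a unique index on (ts, device_id).
  INSERT INTO readings (ts, device_id, value)
  VALUES ('2024-07-31 00:00:00+00', 42, 1.5)
  ON CONFLICT (ts, device_id) DO UPDATE
      SET value = EXCLUDED.value;
  ```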

* Improved performance of updates, deletes, and inserts on compressed hypertables.

  By filtering data while accessing the compressed data and before decompressing, TimescaleDB has
  improved performance for updates and deletes on all types of compressed chunks, as well as inserts
  into compressed chunks with unique constraints.

  By signaling constraint violations without decompressing, or decompressing only when matching
  records are found in the case of updates, deletes and upserts, TimescaleDB v2.16.0 speeds
  up those operations more than 1000x in some update/delete scenarios, and 10x for upserts.

* You can add foreign keys from regular tables to hypertables, with support for all types of cascading options.
  This is useful for hypertables that partition using sequential IDs, and need to reference those IDs from other tables.
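
  A sketch of what this enables, with hypothetical table names:

  ```sql
  -- Hypertable partitioned by a sequential ID (illustrative schema).
  CREATE TABLE events (
      id      bigint GENERATED ALWAYS AS IDENTITY,
      payload jsonb,
      PRIMARY KEY (id)
  );
  SELECT create_hypertable('events', 'id', chunk_time_interval => 1000000);

  -- A regular table can now reference the hypertable, cascades included.
  CREATE TABLE annotations (
      event_id bigint REFERENCES events (id) ON DELETE CASCADE,
      note     text
  );
  ```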

* Lower locking requirements during compression for hypertables with foreign keys.

  Advanced foreign key handling removes the need for locking referenced tables when new chunks are compressed.
  DML is no longer blocked on referenced tables while compression runs on a hypertable.

* Improved support for queries on Continuous Aggregates.

  `INNER/LEFT` and `LATERAL` joins are now supported. Plus, you can now join with multiple regular tables,
  and you can have more than one equality operator on join clauses.
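
  For instance, a continuous aggregate of this shape is now accepted; the tables `readings` and `devices` are hypothetical:

  ```sql
  -- Joins a hypertable (readings) with a regular table (devices).
  CREATE MATERIALIZED VIEW daily_device_avg
  WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 day', r.ts) AS day,
         d.name,
         avg(r.value)               AS avg_value
  FROM readings r
  JOIN devices d ON d.id = r.device_id
  GROUP BY day, d.name;
  ```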

**PostgreSQL 13 support removal announcement**

Following the deprecation announcement for PostgreSQL 13 in TimescaleDB v2.13,
PostgreSQL 13 is no longer supported in TimescaleDB v2.16.

The currently supported PostgreSQL major versions are 14, 15, and 16.
2024-07-31 18:43:01 +02:00
gayyappan
01b5de5d4a Add more foreign key tests for OSM chunk 2024-07-31 12:21:50 -04:00
Sven Klemm
35726351dd Don't propagate FK lookup queries to osm chunks
Skip the OSM chunk when doing hypertable expansion for FK lookup
queries. OSM chunks are considered archived data and we don't want
to incur the performance hit of querying OSM data on modifications
to the FK reference table.
2024-07-31 14:00:33 +02:00
Sven Klemm
e7850a8cc5 Add GUC for foreign key propagation
Allow controlling FK query behaviour with GUC
enable_foreign_key_propagation. Do not propagate
FK check queries to tiered chunks.
2024-07-30 15:54:29 +02:00
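Assuming this GUC behaves like other `timescaledb.*` settings, usage would look like the following sketch:

```sql
-- Disable propagation of FK lookup queries to chunks
-- (assumed to default to on).
SET timescaledb.enable_foreign_key_propagation = false;
```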
Sven Klemm
5a81be85cf Don't apply sort optimization when interval length is not fixed
We applied our sort transformation for interval calculation too
aggressively, even in situations where it is not safe to do so, leading
to potentially incorrectly sorted output or `mergejoin input data is out
of order` error messages.

Fixes #7097
2024-07-30 15:05:30 +02:00
Mats Kindahl
b443f46482 Fix REASSIGN OWNED BY for background jobs
Using `REASSIGN OWNED BY` for background jobs does not work because it
does not change the owner of the job. This commit fixes this by
capturing the utility command and making the necessary changes to the
`bgw_job` table.

It also factors out background jobs DDL tests into a separate file.
2024-07-29 13:04:38 +02:00
Fabrízio de Royes Mello
a4a023e89a Rename {enable|disable}_column_stats API
For clarity, we've decided to rename the public API from
`{enable|disable}_column_stats` to `{enable|disable}_chunk_skipping`.
2024-07-26 18:28:17 -03:00
Fabrízio de Royes Mello
554957cc19 Move chunk_column_stats tests to template 2024-07-26 18:28:17 -03:00
Fabrízio de Royes Mello
b39aa6c885 Install testgres using pip on CI
We can now install `testgres` using `pip`, since release `1.10.1`,
which includes PR #125, has been published.

https://github.com/postgrespro/testgres/releases/tag/1.10.1
2024-07-26 15:56:40 -03:00
Fabrízio de Royes Mello
bdfac2969a Fix partial view definition on CAggs
Creating a CAgg with a column in the projection that is not part
of the `GROUP BY` clause but is functionally dependent on the primary
key of the referenced table led to a problem in dump/restore,
because the wrong dependencies were created, changing the order and
the way the dump is generated.

Fixed it by copying the `Query` data structure of the `direct view` and
changing the necessary properties instead of creating it from scratch.
2024-07-25 09:48:53 -03:00
Sven Klemm
d2b0213326 Fix snapshot tests
Migrate the extension_state tap test to normal regression check.
This test was causing tests against latest pg snapshot to fail.
2024-07-25 14:01:53 +02:00
Ante Kresic
377cc15c70 Optimize index path generation for ordered append
When creating index paths for compressed chunks, root->eq_classes
can contain a lot of entries from other relations which slow down
plan time. By filtering them to include only ECs which are from
that relation, we can improve plan times significantly when
lots of chunks are involved in the query.
2024-07-24 13:03:09 +02:00
Fabrízio de Royes Mello
11af76179d Refactor Hierarchical CAggs tests
Refactored the Hierarchical Continuous Aggregate regression tests,
including more columns in JOIN tests, and also added an `ORDER BY`
clause to the definition to avoid flaky tests when querying and showing
the result rows.
2024-07-23 15:30:28 -03:00
Fabrízio de Royes Mello
95acaa406b Fix segfault on CAggs with multiple JOINs
Creating a Continuous Aggregate with multiple joins, or changing it
to realtime, was leading to a segfault.

Fixed it by dealing properly with the `varno` when creating the `Quals`
for the union view in realtime mode.

Also got rid of some leftovers from when we relaxed the CAggs join
restrictions in #7111.
2024-07-23 15:30:28 -03:00
Fabrízio de Royes Mello
d18361e412 Remove unused variable 2024-07-22 10:30:23 -03:00
Sven Klemm
af6b4a3911 Change hypertable foreign key handling
Don't copy foreign key constraints to the individual chunks and
instead modify the lookup query to propagate to individual chunks
to mimic how postgres does this for partitioned tables.
This patch also removes the requirement for foreign key columns
to be segmentby columns.
2024-07-22 14:33:00 +02:00
Fabrízio de Royes Mello
fe533e2dec Use DELETE after compression
The last step of compressing a chunk is cleaning up the uncompressed
chunk. Currently this is done with a `TRUNCATE`, which requires an
`AccessExclusiveLock`, preventing concurrent sessions from even
`SELECT`ing data from the hypertable.

With this PR it is possible to execute a `DELETE` instead of a
`TRUNCATE` on the uncompressed chunk, relaxing the lock to
`RowExclusiveLock`. This new behavior is controlled by a new GUC
`timescaledb.enable_delete_after_compression` that is `false` by
default.

The side effect of enabling this behavior is more WAL generation,
because we delete each row from the uncompressed chunk, and also bloat
due to the large number of dead tuples created.
2024-07-21 11:07:25 -03:00
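A sketch of opting in to the new behavior; the hypertable name is illustrative:

```sql
-- Opt in to DELETE-based cleanup; the default (false) keeps TRUNCATE.
SET timescaledb.enable_delete_after_compression = true;
SELECT compress_chunk(c) FROM show_chunks('readings') c;
```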
Sven Klemm
b8d958cb9e Block c function definitions in latest-dev.sql
Having C function references in the versioned part of the SQL
scripts introduces linking requirements to the update script,
potentially preventing version updates. To prevent this we can
have a dummy function in latest-dev.sql, since it will get
overwritten as the final step of the extension update.
2024-07-21 10:49:57 +02:00
Sven Klemm
1a8318633b Fix integer overflow in window size calculation
Found by coverity
2024-07-19 13:51:22 +02:00
Fabrízio de Royes Mello
88a832fb0b Fix CAgg watermark constify using CTE
Some more complex queries using CAggs on CTEs were not properly applying
the `cagg_watermark` constify optimization because we restricted it to
simpler queries.

Simplified the code and now only restrict the optimization to
SELECT queries.
2024-07-19 06:40:29 -03:00
Fabrízio de Royes Mello
6f5646c4c1 Fix enable_job_execution_logging GUC context
In #6767 we allowed users to track job execution history by turning on
the new GUC `timescaledb.enable_job_execution_logging`.

Unfortunately we defined this GUC with PGC_USERSET context, but the
right context is PGC_SIGHUP, since it won't work when set at the session
and/or database level; it should be restricted to `ALTER SYSTEM SET` or
changes to the config files.
2024-07-19 05:24:10 -03:00
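With the corrected PGC_SIGHUP context, the GUC has to be set cluster-wide, for example:

```sql
-- Session- or database-level SET does not take effect with PGC_SIGHUP.
ALTER SYSTEM SET timescaledb.enable_job_execution_logging = on;
SELECT pg_reload_conf();
```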
Sven Klemm
6d83e0584d Add more tests for foreign key constraints on hypertables 2024-07-18 11:57:57 +02:00
Sven Klemm
640cb377a2 Fix coverity warning about NULL pointer dereference
In the current code `skip_current_tuple` will never be a NULL pointer
in the ON CONFLICT DO NOTHING case, but add an additional check
nonetheless to make it safe against future refactoring.
2024-07-17 14:50:11 +02:00
Fabrízio de Royes Mello
a0638f5408 Relax CAggs JOIN restrictions
Remove some Continuous Aggregates JOIN restrictions by allowing:
* INNER/LEFT join;
* LATERAL join;
* JOIN between 1 hypertable and N tables, foreign tables, views or
  materialized views;
* Remove restriction of only ONE equality operator on JOIN clause.
2024-07-16 10:19:58 -03:00
Sven Klemm
ccc3e113c8 Add compression tuple filtering information to EXPLAIN
Show information about filtered batches in EXPLAIN ANALYZE output.
2024-07-15 17:07:37 +02:00
Pallavi Sontakke
399f6c639a
Improve release-notes further (#7121)
For docs-compliance for the upcoming release.
2024-07-15 12:43:19 +05:30
Sven Klemm
47efc2f3f9 Reduce decompression for INSERT with unique constraints
On INSERT into compressed chunks with unique constraints we can
check for conflicts without decompressing when no ON CONFLICT clause
is present and we only have one unique constraint. With an ON CONFLICT
DO NOTHING clause we can just skip the INSERT if we detect a conflict
and return early. Only for ON CONFLICT DO UPDATE/UPSERT do we need
to decompress when there is a constraint conflict.
Doing the optimization in the presence of multiple constraints is
also possible but not part of this patch.
2024-07-13 14:55:07 +02:00
Mats Kindahl
2c651f8118 Fix flaky transparent_decompression
The `transparent_decompression` test is flaky because incremental sort
is chosen most of the time, but normal sort is picked as well when under
load.

Making the test less flaky by turning off incremental sort. The test
exists to test that DecompressChunk works as intended.
2024-07-12 14:04:35 +02:00
Nikhil Sontakke
50bca31130 Add support for chunk column statistics tracking
Allow users to specify that ranges (min/max values) be tracked
for a specific column using the enable_column_stats() API. We
will store such min/max ranges in a new timescaledb catalog table
_timescaledb_catalog.chunk_column_stats. As of now we support tracking
min/max ranges for smallint, int, bigint, serial, bigserial, date,
timestamp, timestamptz data types. Support for other stats for bloom
filters etc. will be added in the future.

We add an entry of the form (ht_id, invalid_chunk_id, col, -INF, +INF)
into this catalog to indicate that min/max values need to be calculated
for this column in a given hypertable for chunks. We also iterate
through existing chunks and add -INF, +INF entries for them in the
catalog. This allows for selection of these chunks by default since no
min/max values have been calculated for them.

This actual min-max start/end range is calculated later. One of the
entry points is during compression for now. The range is stored in
start (inclusive) and end (exclusive) form. If DML happens on a
compressed chunk then, as part of marking it as partial, we also mark
the corresponding catalog entries as "invalid", so partial chunks are
no longer excluded. When recompression happens we get the new
min/max ranges from the uncompressed portion and try to reconcile the
ranges in the catalog based on these new values. This is safe to do in
case of INSERTs and UPDATEs. In case of DELETEs, since we are deleting
rows, it's possible that the min/max ranges change, but as of now we
err on the side of caution and retain the earlier values which can be
larger than the actual range.

We can thus store the min/max values for such columns in this catalog
table at the per-chunk level. Note that these min/max range values do
not participate in partitioning of the data. Such data ranges will be
used for chunk pruning if the WHERE clause of an SQL query specifies
ranges on such a column.

Note that Executor startup time chunk exclusion logic is also able to
use this metadata effectively.

A "DROP COLUMN" on a column with a statistics tracking enabled on it
ends up removing all relevant entries from the catalog tables.

A "decompress_chunk" on a compressed chunk removes its entries from the
"chunk_column_stats" catalog table since now it's available for DML.

Also a new "disable_column_stats" API has been introduced to allow
removal of min/max entries from the catalog for a specific column.
2024-07-12 14:43:16 +05:30
Sven Klemm
4861ca61a5 Don't link against openssl directly
This patch changes our build process to no longer link against
openssl directly but instead rely on postgres linking it.
Linking to openssl directly is causing problems when the openssl
version we link against does not match the version postgres links
against. While this is easy to prevent where we fully control the
build process it is repeatedly causing problems e.g. in ABI tests.
This patch only changes the behaviour for non-Windows, as we are
running into linker problems on Windows with this change. Until we
can find a workaround for those problems, Windows binaries still
link OpenSSL directly.
2024-07-11 05:44:15 +02:00
Sven Klemm
5e65ec6eff Add PG17 snapshot run to CI 2024-07-10 21:50:54 +02:00
Aleksander Alekseev
81deb9dd03 Add PG17 test files
Co-authored-by: Sven Klemm <sven@timescale.com>
2024-07-10 21:50:54 +02:00
Sven Klemm
23e73abd37 Allow building against PG17
With these changes TimescaleDB can be built against PG17. Doing
so still requires -DEXPERIMENTAL=ON.

Co-authored-by: Aleksander Alekseev <aleksander@timescale.com>
2024-07-10 10:49:13 +02:00
Sven Klemm
3c58c3fb78 Add missing utils/guc.h include to enable building with pg17 2024-07-07 07:48:21 +02:00
Sven Klemm
595d7dbf83 Handle NULL as attstattarget default value in PG17
PG17 changed attstattarget to be NULLABLE and changed the default
to NULL. This patch changes the pg_attribute query to produce the same
result against PG17 and previous versions.

4f622503d6
2024-07-07 07:48:21 +02:00
Sven Klemm
1e04331615 Reduce decompressions for compressed UPDATE/DELETE
Only decompress batches for compressed UPDATE/DELETE when the batch
actually has tuples that match the query constraints. This will
work even for columns we have no metadata on.
2024-07-05 16:01:38 +02:00
Sven Klemm
c10fae76dd Refactor compression file organization
Move compression DML code into a separate file, move code dealing
with ScanKey into a separate file, and move compression algorithms code
into a dedicated subdirectory.
2024-07-04 13:47:31 +02:00
Sven Klemm
731c80093b Don't decompress on compressed INSERT unless necessary
Previously, for INSERTs into compressed chunks with unique constraints,
we would decompress the batch that could contain a tuple matching
according to the constraints. This patch skips the decompression
if the batch does not contain an actual matching tuple. This patch
adds the optimization for INSERT with unique constraints.
Similar optimizations for UPDATE and DELETE will be added in followup
patches.
2024-07-03 15:18:35 +02:00
Sven Klemm
473f0c9441 Remove centos7 from package test
CentOS 7 has been EOL since June 30th, so we no longer build packages
for it or test against it.
2024-07-03 13:04:49 +02:00
Sven Klemm
26e9eb521a TimescaleDB 2.15.3 post-release adjustments
Adjust CHANGELOG and downgrade scripts for 2.15.3
2024-07-03 13:04:38 +02:00
Fabrízio de Royes Mello
038b5757ac Use processed group clause in PG16 (take 2)
In #6377 we fixed an `ORDER BY/GROUP BY expression not found in
targetlist` error by using the `root->processed_groupClause` instead of
`parse->groupClause` due to an optimization introduced in PG16 that
removes redundant grouping and distinct columns.

But it looks like we didn't change all the necessary places, especially
our HashAggregate optimization.
2024-07-02 15:29:09 +02:00
Pallavi Sontakke
e41b183ee5
Release 2.15.3 (#7089)
This release contains bug fixes since the 2.15.2 release.
Best practice is to upgrade at the next available opportunity.

**Migrating from self-hosted TimescaleDB v2.14.x and earlier**

After you run `ALTER EXTENSION`, you must run [this SQL
script](https://github.com/timescale/timescaledb-extras/blob/master/utils/2.15.X-fix_hypertable_foreign_keys.sql).
For more details, see the following pull request
[#6797](https://github.com/timescale/timescaledb/pull/6797).

If you are migrating from TimescaleDB v2.15.0, v2.15.1 or v2.15.2, no
changes are required.

**Bugfixes**
* #7061: Fix the handling of multiple unique indexes in a compressed
INSERT.
* #7080: Fix the `corresponding equivalence member not found` error.
* #7088: Fix the leaks in the DML functions.
* #7035: Fix the error when acquiring a tuple lock on the OSM chunks on
the replica.

**Thanks**
* @Kazmirchuk for reporting the issue about leaks with the functions in
DML.
2024-07-02 17:48:18 +05:30
Nikhil Sontakke
ebbca2dd77 Fix leaks with functions in DML
If plpgsql functions were used in DML queries, we were leaking 8KB
for every invocation of such a function. This can quickly add up.

The issue was that the "CurTransactionContext" was not getting cleaned
up after every invocation. The reason was that we were inadvertently
allocating a temporary list in that context. Postgres then thought that
this CurTransactionContext needs to be re-used further and kept it
around. We now use a proper memory context to avoid this.

Fixes #7053
2024-07-02 13:32:06 +05:30
Sven Klemm
1b7f109311 Fix corresponding equivalence member not found
When querying compressed data we determine whether the requested ORDER
can be applied to the underlying query on the compressed data itself.
This happens twice. The first time we decide whether we can push down
the sort and then we do a recheck when we setup the sort metadata.
Unfortunately those two checks did not agree: the initial check concluded
it is possible but the recheck disagreed. This was due to a bug: when
checking the query properties we mixed up the attnos and used attnos
from the uncompressed chunk and compressed chunk in the same bitmapset.
If a segmentby column with equality constraint was present in the WHERE
clause whose attno was identical to a compressed attno of a different
column that was part of the ORDER BY the recheck would fail.

This patch removes the recheck and relies on the initial assessment
when building sort metadata.
2024-07-01 12:21:00 +02:00
Sven Klemm
0eabedc1c0 Remove some dead code related to distributed hypertables 2024-07-01 11:13:00 +02:00
Nikhil Sontakke
6e01b8c581 Add logs in the recompression code path
We added a few diagnostic log messages in the compression/decompression
code paths some time ago and they have been useful in identifying
hotspots in the actual activities. Adding a few more for recompression
now. The row_compressor_append_sorted_rows function, which is also used
in recompression, is already logged, so we need just a few log messages
here.
2024-06-28 14:30:59 +05:30
Nikhil Sontakke
60c9f4d246 Fix bug in default segmentby calc. in compression
There was a typo in the query used for the calculation of default
segmentbys in the case of compression.
2024-06-27 17:50:38 +05:30
Sven Klemm
65ea2fa6e0 Don't use index column names
We must never use index column names to try to match relation column
names between different relations as index column names are independent
of relation column names and can get out of sync due to column renames.
2024-06-26 16:21:25 +02:00
Fabrízio de Royes Mello
cdfa1560e5 Refactor code for getting time bucket function Oid
This is a small refactoring for getting time bucket function Oid from
a view definition. It will be necessary for following PRs that
completely remove the unnecessary catalog metadata table
`continuous_aggs_bucket_function`.

Also added a new SQL function `cagg_get_bucket_function_info` to return
all `time_bucket` information based on a user view definition.
2024-06-26 10:33:23 -03:00
Alexander Kuzmenkov
82ab09d8fb
Stabilize more tests (#7066)
* cagg_watermark_concurrent_update is very dependent on the chunk
numbers, and should be run first.
* telemetry_stats should do VACUUM and REINDEX before getting the
statistics, to avoid dependency on how the index was built.
* cagg_migrate_function is missing some orderbys.
2024-06-26 10:28:09 +00:00