This release contains bug fixes since the 2.14.1 release.
We recommend that you upgrade at the next available opportunity.
**Bugfixes**
* #6655 Fix segfault in cagg_validate_query
* #6660 Fix refresh on empty CAgg with variable bucket
* #6670 Don't try to compress osm chunks
**Thanks**
* @kav23alex for reporting a segfault in cagg_validate_query
This simplifies passing the columnar data out of DecompressChunk to the
Vectorized Aggregation node which we plan to implement. This should also
improve memory locality and bring us closer to the architecture used in
TAM for ArrowTupleSlot.
PR #6624 introduced a catalog lookup to get the Oid of a relation.
However, C strings were passed to the catalog cache, which expects Name
values since namehashfast() is called internally. This PR fixes the
problem. Found by the sanitizer.
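As a minimal sketch of the pattern (assuming a lookup by relation name and namespace; the variable names are illustrative, not taken from the actual patch):

```c
/* Sketch only: build a NameData key for the syscache lookup instead of
 * passing the raw C string, since the name-key fast-path functions
 * (namehashfast()/nameeqfast()) expect the Datum to point at a NameData. */
NameData relname;
namestrcpy(&relname, relation_name_cstr); /* relation_name_cstr is assumed */

HeapTuple tuple = SearchSysCache2(RELNAMENSP,
								  NameGetDatum(&relname),
								  ObjectIdGetDatum(namespace_oid));
if (HeapTupleIsValid(tuple))
{
	Oid relid = ((Form_pg_class) GETSTRUCT(tuple))->oid;

	ReleaseSysCache(tuple);
	/* ... use relid ... */
}
```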
So far, we have not handled CAggs with variable buckets correctly. A
CAgg refresh on a hypertable without any data led to the error message
"timestamp out of range". This patch fixes the problem by declaring
empty CAggs as up-to-date.
So far, bucket_origin was defined as a Timestamp but used as a
TimestampTz in many places. This commit changes this and unifies the
usage of the variable.
The catalog table continuous_aggs_bucket_function is currently only used
for variable bucket sizes. Information about the fixed-size buckets is
stored in the table continuous_agg only. This causes some problems
(e.g., we have redundant fields for the bucket_size, fixed-size buckets
with offsets are not supported, ...).
This commit is the first in a series of commits that refactor the catalog
for the CAgg time_bucket function. The goals are:
* Remove the redundant CAgg attributes in the catalog
* Create an entry in continuous_aggs_bucket_function for all CAggs
that use time_bucket
This first commit refactors the continuous_aggs_bucket_function table
and prepares it for more generic use. Not all attributes are used yet,
but this will change in follow-up PRs.
Historically we preserve chunk metadata because the old Continuous
Aggregate format has the `chunk_id` column in the materialization
hypertable, so to avoid leaving chunk ids behind there we just mark the
chunk as dropped when dropping chunks.
In #4269 we introduced a new Continuous Aggregate format that doesn't
store the `chunk_id` in the materialization hypertable anymore, so it's
safe to also remove the metadata when dropping a chunk if all associated
Continuous Aggregates are in the new format.
Also added a post-update SQL script to clean up unnecessary dropped chunk
metadata in our catalog.
Closes #6570
Postgres has the `AttrNumberGetAttrOffset()` macro to properly access
Datum array members, so this adds a new Coccinelle static analysis check
for missing macro usage.
Example:
`datum[attrno - 1]` should be `datum[AttrNumberGetAttrOffset(attrno)]`
Reference:
* https://github.com/postgres/postgres/blob/master/src/include/access/attnum.h
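For illustration, the intended pattern when filling a Datum array from 1-based attribute numbers (the `Anum_`/`Natts_` identifiers below are placeholders, not real TimescaleDB names):

```c
/* AttrNumberGetAttrOffset() is defined in access/attnum.h as (attNum - 1);
 * using it makes the conversion from a 1-based AttrNumber to a 0-based
 * array offset explicit instead of an open-coded "attno - 1". */
Datum values[Natts_example_catalog];
bool  nulls[Natts_example_catalog];

memset(nulls, 0, sizeof(nulls));
values[AttrNumberGetAttrOffset(Anum_example_catalog_id)] = Int32GetDatum(id);
nulls[AttrNumberGetAttrOffset(Anum_example_catalog_id)] = false;
```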
When the input to pg_parse_query does not contain anything to parse
it will return NIL. This patch adds a check for NIL to prevent the
segfault that would otherwise happen later in the code.
Fixes: #6625
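A minimal sketch of the added guard (the error message and code are illustrative, not necessarily the exact ones used in the fix):

```c
/* pg_parse_query() returns NIL when the string contains nothing to parse,
 * e.g. an empty string or only a comment; report that instead of touching
 * the empty list further down. */
List *parsetree_list = pg_parse_query(query_string);

if (parsetree_list == NIL)
	ereport(ERROR,
			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
			 errmsg("query string contains nothing to parse")));
```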
This commit updates the following workflows that used Node.js 16 to a
newer version that uses Node.js 20:
* actions/cache
* codecov/codecov-action
* Vampire/setup-wsl
This release contains bug fixes since the 2.14.0 release.
We recommend that you upgrade at the next available opportunity.
**Features**
* #6630 Add views for per chunk compression settings
**Bugfixes**
* #6636 Fixes extension update of compressed hypertables with dropped columns
* #6637 Reset sequence numbers on non-rollup compression
* #6639 Disable default indexscan for compression
* #6651 Fix DecompressChunk path generation with per chunk settings
**Thanks**
* @anajavi for reporting an issue with extension update of compressed hypertables
Memory operations can add up to tens of percent of the total
compression CPU load. To reduce the need for them, reserve space for the
expected array sizes when initializing the compressor.
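In rough C terms (the names and batch size below are assumptions used only to illustrate the idea, not the actual compressor code):

```c
/* Sketch: allocate the per-column buffers once for the expected number of
 * rows in a compressed batch instead of starting small and repeatedly
 * repalloc'ing while rows are appended. EXPECTED_BATCH_ROWS is a placeholder. */
#define EXPECTED_BATCH_ROWS 1000

Datum *values = palloc(sizeof(Datum) * EXPECTED_BATCH_ROWS);
bool  *nulls = palloc(sizeof(bool) * EXPECTED_BATCH_ROWS);
```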
While the downgrade script doesn't combine multiple versions into a single
script, since we only create the script for the previous version, fixing
this will make backporting in the release branch easier.
Adjust DecompressChunk path generation to use the per chunk settings
and not the hypertable settings when building compression info.
This patch also fixes the missing chunk configuration generation
in the update script which was masked by this bug.
In a previous commit, a performance regression was introduced
which needlessly scanned indexes to get sequence numbers when
it was not necessary. This change resets sequence numbers when
we know that we are not rolling up chunks during compression.
The function ts_continuous_agg_find_by_mat_hypertable_id is used to read
the data about a CAgg from the catalog. If the CAgg for a given
mat_hypertable_id is not found, the function returns NULL. Therefore,
most code paths performed a NULL check and did some error handling
afterward. This PR moves the duplicated error handling into the
function.
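Sketch of the resulting calling convention (the extra parameter name and behavior shown here are assumptions, not taken from the PR):

```c
/* Callers that cannot proceed without the CAgg let the lookup raise the
 * error itself instead of repeating the NULL check at every call site. */
ContinuousAgg *cagg =
	ts_continuous_agg_find_by_mat_hypertable_id(mat_hypertable_id,
												/* missing_ok */ false);

/* cagg is guaranteed non-NULL here; a caller that can tolerate a missing
 * entry would pass true and keep its own NULL handling. */
```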
The current code would always prefer indexscan over tuplesort while
doing scans of the rows from the chunk that was being compressed.
The thinking was that we'd avoid doing a sort via the indexscan.
The theory looked good on paper, but from various cloud customer
reports we have seen that the random access of the heap pages via the
indexscan was typically more expensive than doing the tuplesort. So we
disable the default indexscan until we get better use cases warranting
enabling it again for all scenarios. Specific use cases can enable
timescaledb.enable_compression_indexscan manually if desired.
SQLSmith finds many internal program errors (`elog`, code `XX000`).
Normally these errors shouldn't be triggered by user actions and
indicate a bug in the program (like `variable not found in subplan
targetlist`). We don't have the capacity to fix all of them currently,
especially since some of them seem to be upstream ones. This commit
adds logging for these errors so that we at least can study the current
situation.
Variables cannot be used to specify the action version. Looks like
we have to wait for an upstream fix to make a node image available
that is compatible with centos7.
Version 2.14.0 removes the multi-node code. However, there were a few
leftovers for the handling of distributed CAggs. This commit cleans up
the CAgg code and removes the no longer needed functions:
invalidation_cagg_log_add_entry(integer,bigint,bigint);
invalidation_hyper_log_add_entry(integer,bigint,bigint);
materialization_invalidation_log_delete(integer);
invalidation_process_cagg_log(integer,integer,regtype,bigint,bigint,integer[],bigint[],bigint[]);
invalidation_process_cagg_log(integer,integer,regtype,bigint,bigint,integer[],bigint[],bigint[],text[]);
invalidation_process_hypertable_log(integer,integer,regtype,integer[],bigint[],bigint[]);
invalidation_process_hypertable_log(integer,integer,regtype,integer[],bigint[],bigint[],text[]);
hypertable_invalidation_log_delete(integer);
This release contains performance improvements and bug fixes since
the 2.13.1 release. We recommend that you upgrade at the next
available opportunity.
In addition, it includes these noteworthy features:
* Ability to change compression settings on existing compressed hypertables at any time.
New compression settings take effect on any new chunks that are compressed after the change.
* Reduced locking requirements during chunk recompression
* Limiting tuple decompression during DML operations to avoid decompressing a lot of tuples and causing storage issues (100k limit, configurable)
* Helper functions for determining compression settings
**For this release only**, you will need to restart the database before running `ALTER EXTENSION`
**Multi-node support removal announcement**
Following the deprecation announcement for Multi-node in TimescaleDB 2.13,
Multi-node is no longer supported starting with TimescaleDB 2.14.
TimescaleDB 2.13 is the last version that includes multi-node support. Learn more about it [here](docs/MultiNodeDeprecation.md).
If you want to migrate from multi-node TimescaleDB to single-node TimescaleDB, read the
[migration documentation](https://docs.timescale.com/migrate/latest/multi-node-to-timescale-service/).
**Deprecation notice: recompress_chunk procedure**
TimescaleDB 2.14 is the last version that will include the recompress_chunk procedure. Its
functionality will be replaced by the compress_chunk function, which, starting with TimescaleDB 2.14,
works on both uncompressed and partially compressed chunks.
The compress_chunk function should be used going forward to fully compress all types of chunks or even recompress
old fully compressed chunks using new compression settings (through the newly introduced optional recompress parameter).
**Features**
* #6325 Add plan-time chunk exclusion for real-time CAggs
* #6360 Remove support for creating Continuous Aggregates with old format
* #6386 Add functions for determining compression defaults
* #6410 Remove multinode public API
* #6440 Allow SQLValueFunction pushdown into compressed scan
* #6463 Support approximate hypertable size
* #6513 Make compression settings per chunk
* #6529 Remove reindex_relation from recompression
* #6531 Fix if_not_exists behavior for CAgg policy with NULL offsets
* #6545 Remove restrictions for changing compression settings
* #6566 Limit tuple decompression during DML operations
* #6579 Change compress_chunk and decompress_chunk to idempotent version by default
* #6608 Add LWLock for OSM usage in loader
* #6609 Deprecate recompress_chunk
* #6609 Add optional recompress argument to compress_chunk
**Bugfixes**
* #6541 Inefficient join plans on compressed hypertables.
* #6491 Enable now() plantime constification with BETWEEN
* #6494 Fix create_hypertable referenced by fk succeeds
* #6498 Suboptimal query plans when using time_bucket with query parameters
* #6507 time_bucket_gapfill with timezones doesn't handle daylight savings
* #6509 Make extension state available through function
* #6512 Log extension state changes
* #6522 Disallow triggers on CAggs
* #6523 Reduce locking level on compressed chunk index during segmentwise recompression
* #6531 Fix if_not_exists behavior for CAgg policy with NULL offsets
* #6571 Fix pathtarget adjustment for MergeAppend paths in aggregation pushdown code
* #6575 Fix compressed chunk not found during upserts
* #6592 Fix recompression policy ignoring partially compressed chunks
* #6610 Ensure qsort comparison function is transitive
**Thanks**
* @coney21 and @GStechschulte for reporting the problem with inefficient join plans on compressed hypertables.
* @HollowMan6 for reporting triggers not working on materialized views of
CAggs
* @jbx1 for reporting suboptimal query plans when using time_bucket with query parameters
* @JerkoNikolic for reporting the issue with gapfill and DST
* @pdipesh02 for working on removing the old Continuous Aggregate format
* @raymalt and @martinhale for reporting very slow query plans on realtime CAggs queries
This patch deprecates the recompress_chunk procedure as all that
functionality is covered by compress_chunk now. This patch also adds a
new optional boolean argument to compress_chunk to force applying
changed compression settings to existing compressed chunks.
Add a CMake option to add the PostgreSQL source directory as a system
include for the build system. This will ensure that the PG source
include directory will end up in a generated `compile_commands.json`,
which can be used by language servers (e.g., clangd) to navigate from
the TimescaleDB source directly to the PostgreSQL source.
If the comparison function for qsort is non-transitive, there is a risk
of out-of-bounds access. Subtraction of integers can lead to overflows,
so instead use a real comparison function.
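For illustration, the unsafe and safe comparator patterns (a generic example, not the exact TimescaleDB code):

```c
#include <stdlib.h>

/* Unsafe: "a - b" overflows for large-magnitude values, which can make the
 * comparator non-transitive and lead qsort() to read out of bounds. */
static int
compare_int_unsafe(const void *a, const void *b)
{
	return *(const int *) a - *(const int *) b; /* may overflow */
}

/* Safe: explicit comparisons cannot overflow and stay transitive. */
static int
compare_int_safe(const void *a, const void *b)
{
	int lhs = *(const int *) a;
	int rhs = *(const int *) b;

	if (lhs < rhs)
		return -1;
	if (lhs > rhs)
		return 1;
	return 0;
}
```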
Log progress of the compression/decompression APIs into the postgresql
log file. In case of issues in the field, we really do not have much
idea about which stage the compression function is stuck at. Hopefully
we will have a better idea now with these log messages in place. We
want to keep things simple and enable logging of the progress by
default for these APIs. We can revisit this if users complain about the
chattiness later.