Hooks in PostgreSQL are supposed to be chained: when installing your own
hook, save the previous hook and call it from your implementation. Not
calling the previous hook prevents other extensions' hooks from working.
Vectorized aggregation assumes that it runs on top of a
DecompressChunk child node, which makes it difficult to support other
child plans, including those that fetch data via Hypercore TAM.
Most of the DecompressChunk-specific code for planning VectorAgg
relates to identifying vectorizable columns. This code is moved to a
separate source file so that the main planning code is mostly
child-node independent.
Currently, while a job is running, we set `pid = SchedulerPid`,
`succeed = false`, and `execution_finish = NOW()`. This leads to
confusion when querying the `timescaledb_information.job_errors` or
`timescaledb_information.job_history` views, which show
`err_message = job crash detected, see server logs` for a job that is
still running. This information is wrong and creates confusion.
Fixed it by setting `succeed = NULL` and `pid = NULL` when the
scheduler launches the job. When the job worker starts to work, it sets
`pid = MyProcPid` (the worker PID), meaning that the job has started
but not finished yet. At the end of the execution, we set
`succeed = TRUE or FALSE` and `execution_finish = NOW()` to mark the
end of the job execution. Also adjusted the views to expose the
information properly.
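As an illustration, a minimal sketch of inspecting job runs after this change, assuming the view exposes columns matching the fields described above:

```sql
-- Illustrative only: a job that is still running now shows the worker
-- PID together with succeeded = NULL, instead of a spurious
-- "job crash detected, see server logs" error message.
SELECT job_id, pid, succeeded, execution_finish, err_message
FROM timescaledb_information.job_history;
```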
It should not be possible to merge a frozen chunk, since freezing is
used when tiering chunks. If the chunk is merged, it might no longer
exist when the tiering happens.
When an INSERT with ON CONFLICT DO NOTHING hit its first conflict, it
would abort additional INSERTs following the one that triggered the
DO NOTHING clause, leading to missed INSERTs.
Fixes #7672
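A minimal sketch of the fixed behavior, using a hypothetical plain table for illustration:

```sql
CREATE TABLE readings (t timestamptz, device int, val float,
                       UNIQUE (t, device));
INSERT INTO readings VALUES ('2025-01-01', 1, 1.0);
INSERT INTO readings VALUES
  ('2025-01-01', 1, 2.0),  -- conflicts with the existing row: skipped
  ('2025-01-01', 2, 3.0)   -- was previously lost; now inserted
ON CONFLICT DO NOTHING;
```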
Since we split the fixes and thanks messages into separate sections in
the changelog, the context between a fix and its thanks note is lost,
so the thanks note should repeat any required context.
Since we depend on the OpenSSL version of the PostgreSQL installer
matching the OpenSSL version we built against, and we can't ensure the
stability of that version in the installer, we only test the Windows
package against the latest available PostgreSQL version.
When filtering arrow slots in ColumnarScan, quals on segmentby columns
should be executed separately from those on other columns because they
don't require decompression and might filter the whole arrow slot in
one go. Furthermore, the quals only need to be applied once per arrow
slot since the segmentby value is the same for all compressed rows in
the slot.
This will speed up scans when filters on segmentby columns cannot be
pushed down to Hypercore TAM as scankeys. For example, "<column> IN
(1, 2, 3)" won't be pushed down as a scankey because only index scans
support scankeys with such scalar array expressions.
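A hedged illustration, assuming a hypothetical table `metrics` using Hypercore TAM with `device` as a segmentby column:

```sql
-- The IN-list qual cannot become a scankey (scalar array expressions
-- are only supported by index scans), but it is now evaluated once per
-- arrow slot instead of once per row.
SELECT * FROM metrics WHERE device IN (1, 2, 3);
```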
Before compressed chunks were mutable, adding a compression policy to
a continuous aggregate was blocked if the compression window could
include portions of the refresh window. When a continuous aggregate is
refreshed, the underlying chunks need to allow DELETEs and INSERTs, so
they could not be compressed. Now, compressed chunks allow both
operations, and there is no longer a need to prevent the refresh window
and compression window from overlapping.
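A sketch of a setup that is no longer blocked, assuming a hypothetical continuous aggregate `metrics_hourly`:

```sql
-- Compress the materialized data and add a compression policy whose
-- window may now overlap the refresh window of the refresh policy.
ALTER MATERIALIZED VIEW metrics_hourly SET (timescaledb.compress);
SELECT add_compression_policy('metrics_hourly',
                              compress_after => INTERVAL '7 days');
SELECT add_continuous_aggregate_policy('metrics_hourly',
    start_offset      => INTERVAL '30 days', -- may reach compressed chunks
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
```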
Quals on orderby columns can be pushed down to Hypercore TAM and be
transformed to the corresponding min/max scankeys on the compressed
relation. Previously, only quals on non-compressed segmentby columns
were pushed down as scankeys.
Pushing down orderby scankeys seems to give a good performance boost
for columnar scans when no index exists.
The scankey pushdown can be disabled with a new GUC:
`timescaledb.enable_hypercore_scankey_pushdown=false`
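For example, to compare query plans with and without the optimization in a session:

```sql
-- Disable orderby scankey pushdown for the current session:
SET timescaledb.enable_hypercore_scankey_pushdown = false;
```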
Depending on the branch protection settings, we might have to update
the PRs before they can be merged, so do this automatically to minimize
the required manual work.
The EquivalenceMember lookup is the most costly part, so share it
between different uses.
Switch batch sorted merge to use the generic pathkey matching code.
Also cache some intermediate data in the CompressionInfo struct.
When pushing down expressions into the compressed scan, we assumed all
valid expressions use btree operators and dropped any that didn't.
This patch changes the behavior to keep those expressions and use
them as heap filters on the compressed scan for UPDATE and DELETE
on compressed chunks.
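A hedged example, assuming a hypothetical compressed table `metrics` with a text column `note`:

```sql
-- LIKE (operator ~~) has no btree representation; the qual is now kept
-- and applied as a heap filter on the compressed scan rather than dropped.
DELETE FROM metrics WHERE note LIKE 'error%';
```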
The VectorAgg exec loop reads tuples directly from a compressed
relation, thus bypassing the DecompressChunk child node. This won't
work with arrow slots, which are read via a table access method.
To make the VectorAgg exec code similar to the standard pattern of
reading slots from child nodes, code specific to decompressing batches
is moved out of the main VectorAgg exec loop so that the loop only
deals with the final compressed batch slot instead of the raw
compressed slot. The code is instead put in a "get_next_slot"
function, which is called from the loop.
Also move the code to initialize vectorized filters to its own
"init_vector_qual" function, since it is specific to compressed
batches.
With these two function interfaces, it is possible to provide
implementations of the functions for handling arrow slots.
The downgrade script for 2.17.2 that was merged into the main branch
differed from the one in 2.18.0 due to a merge error. This patch syncs
the downgrade script with the version used in the 2.18.0 release.
When deleting from a compressed chunk, the direct delete optimization
would ignore constraints that were not using btree operators, so those
constraints of the DELETE were not applied to the direct delete on the
compressed chunk, potentially leading to data corruption. This patch
disables the direct delete optimization when any of the constraints
cannot be applied.
Fixes #7644
These release steps are automated with GitHub workflow files. We run
them for a new minor version (feature freeze):
- create the bump-version PR on `main`,
- create the new minor-version branch, e.g. `2.18.x`,
- create the release PR on the minor-version branch.
We no longer use a fork, but a branch directly.
To support VectorAgg on top of Hypercore TAM, change the vector agg
processing functions to pass around tuple table slots instead of
compressed batches. This makes it possible to pass any "compatible"
"vector slot" to the vector agg functions, which is required since TAM
uses a slightly different slot implementation for arrow/vector data.
In addition, add some functions to handle reading vector data from
compatible vector slot implementations. This commit only adds the code
to read from compressed batches. Arrow slots will be supported as part
of a later change.
If an index is dropped, it is necessary to lock the heap table (of
the index) before the index since all normal operations do it in this
order. When dropping an index, we did not take all the necessary locks
in the right order before calling `performMultipleDeletions`, which can
cause deadlocks when dropping an index on a hypertable at the same time
as running a utility statement that takes heavy locks, e.g., VACUUM or
ANALYZE.
Also add an isolation test that generates a deadlock if the index and
table locks are not taken in the correct order.
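Roughly, the scenario the test exercises, with hypothetical names:

```sql
-- Session 1: drop an index on a hypertable
DROP INDEX metrics_device_idx;
-- Session 2, concurrently: a utility statement that takes heavy locks
VACUUM ANALYZE metrics;
```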
Since we only support generating a downgrade script for the previous
version, anything targeting versions before 2.18 will never be
executed in the current context, so we can safely remove the code that
deals with versions before 2.3.
To detect the problematic patterns that were part of the 2.18 release,
we can check the SQL scripts against a list of allowed statements.
Any non-idempotent operation should be in the pre_install scripts
and not in the scripts that get appended to build the update scripts.
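A sketch of the distinction, with hypothetical object names:

```sql
-- Idempotent: safe in the update scripts, which may run more than once
-- across upgrade paths.
CREATE OR REPLACE FUNCTION ts_example() RETURNS int
LANGUAGE sql AS $$ SELECT 1 $$;

-- Non-idempotent: fails on a second execution, so it belongs in the
-- pre_install scripts that run only on initial installation.
CREATE TYPE ts_example_type AS (lo int, hi int);
```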
This patch adjusts the downgrade script generation to not include
incompatible files from the 2.18.0 release that would break script
generation, and replaces them with a working version. This adjustment
can be removed after the release of 2.18.1.
This patch also re-enables the downgrade test.
The 2.18.0 SQL files for building downgrade scripts have some
incompatible changes in them that prevent downgrade script generation.
This patch disables the downgrade test until the necessary adjustments
in downgrade script generation are made.
This release introduces the ability to add secondary indexes to the columnstore, improves group by and filtering performance through columnstore vectorization, and adds support for transition tables, one of the most upvoted community feature requests. We recommend that you upgrade at the next available opportunity.
**Highlighted features in TimescaleDB v2.18.0**
* The ability to add secondary indexes to the columnstore through the new hypercore table access method.
* Significant performance improvements through vectorization (`SIMD`) for aggregations using a group by with one column and/or using a filter clause when querying the columnstore.
* Hypertables support triggers for transition tables, which is one of the most upvoted community feature requests.
* Updated methods to manage Timescale's hybrid row-columnar store (hypercore) that highlight the usage of the columnstore which includes both an optimized columnar format as well as compression.
**Dropping support for Bitnami images**
After the recent change in Bitnami’s [LTS support policy](https://github.com/bitnami/containers/issues/75671), we are no longer building Bitnami images for TimescaleDB. We recommend using the [official TimescaleDB Docker image](https://hub.docker.com/r/timescale/timescaledb-ha) instead.
**Deprecation Notice**
We are deprecating the following parameters, functions, procedures and views. They will be removed with the next major release of TimescaleDB. Please find the replacements in the table below:
| Deprecated | Replacement | Type |
| --- | --- | --- |
| decompress_chunk | convert_to_rowstore | Procedure |
| compress_chunk | convert_to_columnstore | Procedure |
| add_compression_policy | add_columnstore_policy | Function |
| remove_compression_policy | remove_columnstore_policy | Function |
| hypertable_compression_stats | hypertable_columnstore_stats | Function |
| chunk_compression_stats | chunk_columnstore_stats | Function |
| hypertable_compression_settings | hypertable_columnstore_settings | View |
| chunk_compression_settings | chunk_columnstore_settings | View |
| compression_settings | columnstore_settings | View |
| timescaledb.compress | timescaledb.enable_columnstore | Parameter |
| timescaledb.compress_segmentby | timescaledb.segmentby | Parameter |
| timescaledb.compress_orderby | timescaledb.orderby | Parameter |
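For example, converting a chunk with the deprecated and the new API (the chunk name is hypothetical, and invocation styles are assumed per object type):

```sql
-- Deprecated:
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
-- Replacement (a procedure, invoked with CALL):
CALL convert_to_columnstore('_timescaledb_internal._hyper_1_1_chunk');
```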
**Features**
* #7341: Vectorized aggregation with grouping by one fixed-size by-value compressed column (such as arithmetic types).
* #7104: Hypercore table access method.
* #6901: Add hypertable support for transition tables.
* #7482: Optimize recompression of partially compressed chunks.
* #7458: Support vectorized aggregation with aggregate `filter` clauses that are also vectorizable.
* #7433: Add support for merging chunks.
* #7271: Push down `order by` in real-time continuous aggregate queries.
* #7455: Support `drop not null` on compressed hypertables.
* #7295: Support `alter table set access method` on hypertable.
* #7411: Change parameter name to enable hypercore table access method.
* #7436: Add index creation on `order by` columns.
* #7443: Add hypercore function and view aliases.
* #7521: Add optional `force` argument to `refresh_continuous_aggregate`.
* #7528: Transform sorting on `time_bucket` to sorting on time for compressed chunks in some cases.
* #7565: Add hint when hypertable creation fails.
* #7390: Disable custom `hashagg` planner code.
* #7587: Add `include_tiered_data` parameter to `add_continuous_aggregate_policy` API.
* #7486: Prevent building against PostgreSQL versions with broken ABI.
* #7412: Add [GUC](https://www.postgresql.org/docs/current/acronyms.html#:~:text=GUC) for the `hypercore_use_access_method` default.
* #7413: Add GUC for segmentwise recompression.
**Bugfixes**
* #7378: Remove obsolete job referencing `policy_job_error_retention`.
* #7409: Update `bgw_job` table when altering procedure.
* #7410: Fix the `aggregated compressed column not found` error on aggregation query.
* #7426: Fix `datetime` parsing error in chunk constraint creation.
* #7432: Verify that the heap tuple is valid before using.
* #7434: Fix the segfault when internally setting the replica identity for a given chunk.
* #7488: Emit error for transition table trigger on chunks.
* #7514: Fix the error: `invalid child of chunk append`.
* #7517: Fix the performance regression on the `cagg_migrate` procedure.
* #7527: Restart scheduler on error.
* #7557: Fix null handling for in-memory tuple filtering.
* #7566: Improve transaction check in CAGG refresh.
* #7584: Fix NaN-handling for vectorized aggregation.
* #7598: Match the Postgres NaN comparison behavior in WHERE clause over compressed tables.
**Thanks**
* @bharrisau for reporting the segfault when creating chunks.
* @jakehedlund for reporting the incompatible NaN behavior in WHERE clause over compressed tables.
* @k-rus for suggesting that we add a hint when hypertable creation fails.
* @staticlibs for sending the pull request that improves the transaction check in CAGG refresh.
* @uasiddiqi for reporting the `aggregated compressed column not found` error.
The TAM SQL code was not written with update and downgrade scripts in
mind, which prevents further releases past 2.18.0: the parts that need
to be part of every update script were not split from those that can
only run once during the initial installation.
Move the code for vector qual execution to its own module. The vector
qual execution will produce a result in the form of a bitmap filter
for the arrow array. Add functions to the arrow slot to carry the
result bitmap in the arrow tuple table slot. This allows passing the
filter result to nodes above the node that computed the vector qual
result. This is necessary to, e.g., run vectorized aggregation above a
columnar scan.