This patch allows creating foreign keys on hypertables with
compressed chunks. Previously this required decompressing any
compressed chunks before adding the foreign key constraints.
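A minimal sketch of the now-supported operation; the `metrics` hypertable and `devices` table are hypothetical:

```sql
-- "metrics" is a hypothetical hypertable with compressed chunks and
-- "devices" a plain table; with this patch the constraint can be added
-- without decompressing any chunks first.
ALTER TABLE metrics
    ADD CONSTRAINT metrics_device_fk
    FOREIGN KEY (device_id) REFERENCES devices (id);
```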
Freezing a partially compressed chunk prevents us from compressing
it back to a fully compressed state. If a chunk is frozen, we can
simply skip it and continue with the rest of the chunks.
With the reduction of locking during segmentwise recompression,
we no longer take ExclusiveLocks on relations at the start of the
operation. This GUC can be used to restore the legacy locking
behavior if necessary.
This patch allows setting chunk_time_interval when creating
a continuous agg and allows changing it with ALTER MATERIALIZED
VIEW. Previously you had to create the cagg with `WITH NO DATA`
and then call `set_chunk_time_interval` followed by manually
refreshing.
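A hedged sketch of the new workflow; the view definition is hypothetical, and the storage option name `timescaledb.chunk_time_interval` is an assumption taken from this description, so the exact spelling may differ:

```sql
-- Hypothetical continuous aggregate; the chunk interval is set at creation
-- time instead of via set_chunk_time_interval() after a WITH NO DATA create.
-- The option name below is assumed from the description and may differ.
CREATE MATERIALIZED VIEW daily_summary
WITH (timescaledb.continuous, timescaledb.chunk_time_interval = '7 days') AS
SELECT time_bucket('1 day', time) AS bucket, device_id, avg(value)
FROM metrics
GROUP BY bucket, device_id;

-- ...and it can be changed later without recreating the cagg:
ALTER MATERIALIZED VIEW daily_summary
    SET (timescaledb.chunk_time_interval = '30 days');
```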
A function in the planning code for vectorized aggregates cast a pointer to a
boolean as its return value. However, the return value is no longer used,
so the function now returns void instead. This also fixes a Coverity warning.
The GITHUB_TOKEN variable is getting overwritten with the GitHub Actions
token.
Moreover, the token-based remote's credentials are somehow overwritten
on push with the credentials used by the checkout action. Use the PAT for it
as well.
This should finally let the backport script update the outstanding
backport PRs to the current target branch, and let workflows start for
them.
Disable-check: force-changelog-file
Disable-check: approval-count
In the past, rolling up chunks while using the primary dimension
in DESC order would create unordered chunks, which meant we
had to do a full recompression of the rolled-up compressed chunk. With
the removal of sequence numbers from the compression schema, this ordering
is no longer tracked, so we can remove this limitation.
The usage of the timer mock was wrong because we were setting it on the
current session, but the job is executed by a background worker that
doesn't have the mock time configured.
Fixed it by creating a user with a specific timezone and time mock
configured, so any background worker created by the user will have the
proper configuration.
Failed attempts to fix: #7813 and #7819.
Previously adding a unique or primary key constraint to a hypertable
with compressed chunks required decompressing all chunks and
then adding the constraint. With this patch the decompression is
no longer required, as violations of the constraint are checked against
both compressed and uncompressed data.
This patch does not add the actual uniqueness checks for DML on
compressed chunks; these are already in place. This is just a quality-of-life
improvement for adding uniqueness constraints to already
compressed chunks.
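A hedged example of the now-supported operation; the `conditions` hypertable is hypothetical:

```sql
-- "conditions" is a hypothetical hypertable that already has compressed
-- chunks; the constraint can now be added without decompressing them.
-- The partitioning column ("time") must be part of the unique constraint.
ALTER TABLE conditions
    ADD CONSTRAINT conditions_uniq UNIQUE ("time", device_id);
```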
When querying a partially compressed chunk directly, the ordering was
applied separately to the compressed and uncompressed data rather than
globally to the complete result set, leading to incorrect ordering.
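For reference, a sketch of the affected query shape; the chunk name is illustrative:

```sql
-- Selecting directly from a partially compressed chunk. Before the fix the
-- ORDER BY was applied separately to the compressed and uncompressed parts
-- instead of to the combined result set.
SELECT * FROM _timescaledb_internal._hyper_1_1_chunk ORDER BY "time" DESC;
```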
Recompression used to take a heavy Exclusive lock on the chunk while it
recompressed data. This blocked many concurrent operations like
inserts, deletes, and updates. By reducing the locking requirement
to ShareUpdateExclusive, we enable more concurrent operations on
the chunk and hypertable while relying on tuple locking. Recompression
can still end up taking an Exclusive lock at the end in order to
change the chunk status, but it does this conditionally, meaning
it won't take the lock unless it can acquire it immediately.
This fixes bug #7714 where adding a column with a default
value (jargon: missing value) and a compressed batch with
all NULLs created an ambiguity. In the all-NULL case the
compressed block was stored as a NULL value.
With this change, I introduce a new special compression
type, the 'NULL' compression, which is a single-byte
placeholder for an 'all-null' compressed block. This allows us
to distinguish between the missing value and the all-null
values.
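A hedged reproduction sketch of the ambiguity; the table and column names are hypothetical:

```sql
-- "metrics" is a hypothetical compressed hypertable. A column added with a
-- default produces a "missing value" for existing compressed rows, while a
-- batch whose values are all NULL was previously also stored as a plain NULL
-- block, so the two cases could not be told apart on read.
ALTER TABLE metrics ADD COLUMN note text DEFAULT 'n/a';
INSERT INTO metrics ("time", device_id, note) VALUES (now(), 1, NULL);
SELECT compress_chunk(c, if_not_compressed => true) FROM show_chunks('metrics') c;
SELECT note FROM metrics WHERE device_id = 1;  -- should return NULL, not the default
```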
Please note that the wrong results affected existing tests,
so I updated the expected results and added reference queries
before compression to prove that the previous expected values
were wrong.
A new debug only GUC was added for testing a future upgrade
script, which will arrive as a separate PR.
Fixes #7714
Nowadays a Continuous Aggregate refresh policy processes the entire refresh
window in a single pass, no matter how large it is. For example, if you
have a hypertable with a huge number of rows it can take a long time
and requires a lot of resources in terms of CPU, memory, and I/O to
refresh a CAgg, and the aggregated data becomes visible to users only
when the refresh policy completes its execution.
This PR adds the capability for a CAgg refresh policy to be executed
incrementally in "batches". Each "batch" is an individual transaction
that processes a small fraction of the entire refresh window, and
once a "batch" finishes, the refreshed data is already visible to users
even before the policy execution ends.
To tweak and control the incremental refresh, some new options were added
to the `add_continuous_aggregate_policy` API (see the example after this list):
* `buckets_per_batch`: number of buckets to be refreshed by a "batch".
  In short, this value is multiplied by the CAgg bucket width to
  determine the size of each batch's refresh range. The default value is `0`
  (zero), which keeps the current behavior of a single-batch
  execution. Values less than `0` (zero) are not allowed.
* `max_batches_per_execution`: maximum number of batches to be executed
  by a single policy execution. This option limits the number of
  batches processed per run, so if some batches remain, they will be
  processed the next time the policy runs. The default value is `10` (ten),
  which means each job execution processes at most ten batches. To make it
  unlimited, set the value to `0` (zero). Values less than `0` (zero) are not
  allowed.
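A hedged example of a policy using the new options; the view name, offsets, and schedule are illustrative:

```sql
-- Refresh the hypothetical "daily_summary" cagg in batches of 12 buckets,
-- processing at most 10 batches per policy execution; remaining batches are
-- picked up on the next run.
SELECT add_continuous_aggregate_policy('daily_summary',
    start_offset              => INTERVAL '30 days',
    end_offset                => INTERVAL '1 hour',
    schedule_interval         => INTERVAL '1 hour',
    buckets_per_batch         => 12,
    max_batches_per_execution => 10);
```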
Compressed chunks can be merged by applying the same copy+heap swap
technique to the internal compressed heaps as done for the
non-compressed heaps. However, special consideration is necessary for
cases where some chunks are compressed and some are not.
The way this is handled is to pick the first compressed chunk found
among the input chunks, and then use that as the "result" chunk. In
the next step all the chunks' non-compressed heaps are merged followed
by all the "internal" compressed heaps. In the last step, the result
chunk has its non-compressed and compressed heaps swapped with the
merged ones, respectively.
In all other regards, the merging works the same as before when merging
non-compressed chunks.
When a table was dropped with the "cascade" option, compression
settings weren't properly cleaned up as a result of refactoring in
commit adf7c39. It seems dropping with the "cascade" option invokes a
different path for dropping the hypertable and children chunks, and
there was no test for it. A test has now been added.
Make sure the presence of min/max metadata is detected for non-orderby
columns when planning a ColumnarScan so that scankeys are pushed down
to the TAM. This improves performance by avoiding decompression.
If `ALTER TABLE ... SET ACCESS METHOD DEFAULT` is used, the name of the
table access method is NULL in the `AlterTableCmd`, so add checks that
the name is not null before using it.
Function `hypercore_alter_access_method_finish` expects a chunk, but
could be called with a non-chunk, so check that the relid refers to a
chunk.
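For reference, the command shape that triggered the issue; the chunk name is illustrative:

```sql
-- Reset the relation's access method to the server default
-- (default_table_access_method). With the fix, the NULL access-method name
-- in AlterTableCmd is handled, and non-chunk relations are checked for.
ALTER TABLE _timescaledb_internal._hyper_1_1_chunk SET ACCESS METHOD DEFAULT;
```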
When building indexes over a table using Hypercore TAM, some memory
for decompressing data wasn't released until the index build
completed. This change optimizes memory usage during index builds
by releasing memory after every compressed batch has been processed.
Add code to call mem_guard to enable memory tracking for background
workers. This is enabled for all background workers started by
timescaledb, but not for other workers started by PostgreSQL.
When compression is configured with the default order by set to "", we cannot
do meaningful segmentwise recompression. Therefore we now return an error
when segmentwise recompression is requested directly, and fall back to full
recompression when general compression is requested.
Fixes: #7748
When compress_chunk_time_interval is configured but compress_orderby
does not have the primary dimension as first column, chunk merging will
be less efficient as chunks have to be decompressed to be merged.
This patch adds a warning when we encounter this configuration.
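A hedged sketch of a configuration that would now produce the warning; the table and columns are hypothetical:

```sql
-- "time" is the primary dimension but is not the first compress_orderby
-- column, so rolled-up chunks cannot be merged without decompression and a
-- warning is emitted for this configuration.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby           = 'device_id',
    timescaledb.compress_orderby             = 'value, time',
    timescaledb.compress_chunk_time_interval = '7 days'
);
```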
When computing cost for a merge join path, an index scan will be done
on the relation to find the actual variable range using
`get_actual_variable_range()`, which will include an index scan of the
hypertable parent.
In addition to being unnecessary, it can also cause problems in
situations where the hypertable parent contains data as a result of a
bug.
There is a check that an index scan is not done for a partitioned
table, so we do the same here by setting the indexlist to NIL after
hypertable expansion. The index list is needed while expanding the
hypertables to construct correct index scans for chunks of a
hypertable.
Fix a bug which caused compression settings to remain after having
converted a Hypercore TAM chunk back to another TAM (e.g., heap).
Remove any orphaned settings in the update script.
Minor fix, but raising this PR because I noticed that the
`docs/BuildSource.md` file mistakenly says that timescaledb cannot be
built against postgres 17. Looking at the git blame for `CMakeLists.txt`,
it seems that postgres 17 build support was officially added
as of commit `abd637beaa`.
The temporary function `cagg_get_bucket_function` was created to be
used in the update script for 2.14.2 to 2.15.0 and for some regression
tests, but in 2.16.0 (#7042) we added a new persistent function
`cagg_get_bucket_function_info` as a replacement, so use it instead.
Leftover from the #7042 refactoring PR.
Refactor the compression settings metadata table to include a mapping
from a chunk's relid to its compressed chunk's relid.
This simplifies the code that looks up compression metadata as it no
longer requires first looking up the corresponding compressed chunk in
the "chunk" metadata table. Instead, given the non-compressed chunk's
relid, the corresponding compressed chunk's relid can easily be looked
up via the compression settings.
The refactoring is a step towards removing a chunk's compression table
from the chunk metadata. In other words, the "compressed chunk" will
no longer be a chunk, just a relation associated with the regular
chunk via compression settings. However, this requires further
refactoring that is left for follow-up changes.
The merge chunks functionality can use a "lock upgrade" approach that
allows reads during the merge. However, this is not the default
approach since it can lead to deadlocks. Some isolation test
permutations were added to illustrate deadlocks for the "lock upgrade"
approach, but since it can lead to either process being killed the
test output is not deterministic.
Also add two new permutations for the non-lock-upgrade approach that
show it does not deadlock for concurrent reads and writes.
The GUC timescaledb.enable_transparent_decompression can be set to
'hypercore' when using Hypercore TAM in order to get a DecompressChunk
plan. This makes a scan read only non-compressed data from the
TAM, and is used for debugging. However, the setting can be dangerous
if a chunk is rewritten by decompression or VACUUM FULL, leading to
data loss. Therefore, block rewrite operations on Hypercore TAM when
the GUC is set to 'hypercore'. Also remove the possibility to use this
GUC value entirely in release builds.
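For reference, the debug setting in question:

```sql
-- Debug-only: route scans through a DecompressChunk plan so only
-- non-compressed data is read from the TAM. Rewrites (decompression,
-- VACUUM FULL) are blocked while this is set, and the value is not
-- available in release builds.
SET timescaledb.enable_transparent_decompression = 'hypercore';
```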
In #7630 we fixed wrong crash error messages in the job history. It was
fixed by removing the `NOT NULL` constraint from the `succeeded` column:
the launcher inserts the job history row, which is then updated by the worker
when the job actually starts to run.
Fixed the TAP test to properly deal with NULLs in the `succeeded` column
when checking for failed executions.
Flaky execution:
https://github.com/timescale/timescaledb/actions/runs/13391680375/job/37400633002?pr=7719#step:16:701