4553 Commits

Author SHA1 Message Date
Mats Kindahl
248891131e Use checkout v3 for centos7
Also fix a version tag
2024-02-01 15:54:06 +01:00
Fabrízio de Royes Mello
b093749b45 Properly set job failure on compression policies
In #4770 we changed the behavior of compression policies to continue
compressing chunks even if a failure happens in one of them.

The problem is that if a failure happens in one or all of the chunks, the job
is still marked as successful, which is wrong. Now, if an exception occurs
while compressing chunks, we raise an exception at the end of the procedure
so that the job is marked as failed.
2024-02-01 10:59:47 -03:00
Alexander Kuzmenkov
d7239f30bd Fix UBSan failure in bulk text decompression
Fix missing alignment handling as well as buffer size problems. We were
not properly reporting failures in the compression_algo tests, so we
didn't notice these problems. Add part of the tests from the "vector text
predicates" PR to improve coverage.
2024-02-01 14:25:08 +01:00
Jan Nidzwetzki
7474eee3b4 Remove double initialization of time_bucket origin
The variable origin in ts_timestamptz_bucket was initialized two times.
This commit removes one of the initializations.
2024-02-01 14:18:58 +01:00
Sven Klemm
4a00434e9d Bump pgspot to 0.7.0
Use pgspot 0.7.0 in our CI which updates the parser used by pgspot to
the PG16 parser.
2024-02-01 11:24:26 +01:00
Nikhil Sontakke
2b8f98c616 Support approximate hypertable size
If a lot of chunks are involved, the current PL/pgSQL function that
computes the size of each chunk via a nested loop is pretty slow.
Additionally, it makes a system call to get the on-disk file size of each
chunk every time it is called, which slows things down further. We now
have an approximate function, implemented in C, that avoids these issues.
This function also uses per-backend caching via the smgr layer to compute
the approximate size cheaply. PG cache invalidation clears the cached size
for a chunk when DML happens on it, so the size cache picks up the latest
size within minutes. Also, thanks to the backend caching, a long-running
session only fetches fresh data for new or modified chunks and can
effectively use the cached data (which is calculated afresh the first time
around) for older chunks.
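
For illustration, a minimal sketch of the intended usage, assuming the new C
implementation is exposed as hypertable_approximate_size (the function name and
the hypertable `metrics` are assumptions, not taken from this commit):

```
-- Exact size: walks every chunk and stats each file on disk.
SELECT hypertable_size('metrics');
-- Approximate size: served from the per-backend, smgr-level size cache.
SELECT hypertable_approximate_size('metrics');
```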
2024-02-01 13:25:41 +05:30
Nikhil Sontakke
d86346fd48 Fix openssl issues on MacOSX on CI
Issues due to OpenSSL 3.2 seem to be resolved now that components
install 3.1.x, so remove our temporary changes.

We also disable uploading of test stats till the next minor release
of PG on macOS.
2024-02-01 13:02:14 +05:30
Sven Klemm
1502bad832 Change boolean default value for compress_chunk and decompress_chunk
This patch changes those functions to no longer error by default when
the chunk is not in the expected state; instead, a warning is raised.
This is in preparation for changing compress_chunk to forward to
the appropriate operation depending on the chunk state.
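
As a rough sketch of the new default behavior (the chunk name is hypothetical,
and `if_not_compressed` is assumed to be the boolean argument in question):

```
-- Compressing an already compressed chunk now warns instead of erroring:
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
-- Passing the flag explicitly restores the old erroring behavior:
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk',
                      if_not_compressed => false);
```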
2024-01-31 20:55:55 +01:00
Erik Nordström
101acb149a Fix compressed chunk not found during upserts
When inserting into a compressed chunk with an ON CONFLICT clause, and
there is a concurrent recompression of the chunk, an error can occur
due to a failed lookup of the compressed chunk.

The error happens because the chunk information is re-read from an
outdated cache holding a compressed chunk ID that no longer exists
after the chunk was recompressed (recompression effectively removes
the compressed chunk and creates it again).

Fix this issue by ensuring that the insert path only uses up-to-date
chunk metadata read _after_ the lock on the chunk was taken.
2024-01-31 17:07:26 +01:00
Erik Nordström
09be6df8a9 Reproduce "chunk not found" error during upserts
Add an isolation test that reproduces a "chunk not found" error that
can happen during upserts on compressed chunks.
2024-01-31 17:07:26 +01:00
Mats Kindahl
ac12c43c64 Switch to node20 for github actions
See https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/
2024-01-31 16:09:33 +01:00
Alexander Kuzmenkov
bf20e5f970 Bulk decompression of text columns
Implement bulk decompression for text columns. This will allow us to use
them in the vectorized computation pipeline.
2024-01-31 14:03:01 +01:00
Alejandro Do Nascimento Mora
85b27b4f34 Fix create_hypertable referenced by fk succeeds
Foreign keys pointing to hypertables are not supported. Creating a
hypertable from a table referenced by a foreign key succeeds, but it
leaves the referencing (child) table in a broken state, failing on every
insert with a `violates foreign key constraint` error.

To prevent this scenario, if a foreign key reference to the table exists
before converting it to a hypertable, the following error will be
raised:

```
cannot have FOREIGN KEY contraints to hypertable "<table_name>"
HINT: Remove all FOREIGN KEY constraints to table "<table_name>"
  before making it a hypertable.
```
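
A minimal sketch of the scenario that now errors out (table and column names
are hypothetical):

```
CREATE TABLE metrics (time timestamptz NOT NULL, metric_id int,
                      PRIMARY KEY (metric_id, time));
CREATE TABLE readings (metric_id int, time timestamptz,
                       FOREIGN KEY (metric_id, time)
                           REFERENCES metrics (metric_id, time));
-- Previously this succeeded and left "readings" broken;
-- now it raises the error above:
SELECT create_hypertable('metrics', 'time');
```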

Fixes #6452
2024-01-30 15:01:12 -03:00
Sven Klemm
8875c7a897 Refactor tsl_get_compressed_chunk_index_for_recompression
Refactor the compression code to move the column map code into a
separate function, and change tsl_get_compressed_chunk_index_for_recompression
to no longer initialize RowCompressor.
2024-01-30 15:26:31 +01:00
Alexander Kuzmenkov
47e28a8ff9 Vectorize boolean operators
Implement vectorized computation of the AND, OR, and NOT operators in the
WHERE clause.
2024-01-29 19:41:31 +01:00
Jan Nidzwetzki
b63fbad33d Add plan-time chunk exclusion for real-time CAggs
The watermark function for CAggs is declared as STABLE since the value
of the function changes after every CAgg refresh. The function
volatility prevents the planner from replacing the function invocation
with a constant value and executing plan time chunk exclusion. This
leads to high planning times on hypertables with many chunks.

This PR replaces the function invocation with a constant value to allow
plan-time exclusion of chunks. We perform the replacement at plan time
instead of changing the function volatility to IMMUTABLE because we want
to control the constification. Only queries that access the underlying
hypertable (i.e., not queries like SELECT cagg_watermark(...) without any
FROM clause) are rewritten. This is done to make sure that the query is
properly invalidated when the underlying table changes (e.g., the
watermark is updated) and is replanned on the subsequent execution.
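
For example, a query of the kind that benefits, assuming a real-time CAgg named
`conditions_hourly` (hypothetical) defined on top of a hypertable:

```
-- The cagg_watermark(...) call inside the CAgg's view definition is replaced
-- by a constant at plan time, so chunks of the underlying hypertable can be
-- excluded while planning instead of at execution time.
EXPLAIN SELECT * FROM conditions_hourly WHERE bucket > now() - interval '1 day';
```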

Fixes: #6105, #6321
Co-authored-by: Fabrizio de Royes Mello <fabriziomello@gmail.com>
2024-01-29 15:13:34 +01:00
Ante Kresic
285fd7c7e8 Limit tuple decompression during DML operations
This change introduces a GUC to limit the number of tuples
that can be decompressed during an INSERT, UPDATE, or
DELETE on a compressed hypertable. It is used to prevent
running out of disk space if you try to decompress
a lot of data by accident.
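
A sketch of how such a limit could be set per session (the GUC name below is an
assumption for illustration, not taken from this commit):

```
-- Abort a DML statement that would decompress more than ~100k tuples.
SET timescaledb.max_tuples_decompressed_per_dml_transaction = 100000;
UPDATE metrics SET value = value * 2 WHERE device_id = 42;
```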
2024-01-29 14:40:33 +01:00
Jan Nidzwetzki
d878b9ee29 Fix pathtarget adjustment for MergeAppend paths
The chunk-wise aggregation pushdown code copies the existing paths to
create new ones with pushed-down aggregates. However, the
copy_merge_append_path function behaves differently than other copy
functions (e.g., copy_append_path): it resets the pathtarget on the copy.
This leads to a wrong pathlist and crashes. This PR fixes the wrong
pathtarget by setting it after the path is copied.
2024-01-29 12:57:54 +01:00
Sven Klemm
50c757c6f1 Adjust .gitignore for transparent_decompression test files 2024-01-29 12:57:43 +01:00
Sven Klemm
83ff9662dc Verify compression configuration on compression rollup
Add a check to verify that the compression settings of chunks to be
rolled up are identical.
2024-01-25 16:43:47 +01:00
Sven Klemm
4934941f75 Remove restrictions for changing compression settings
This patch removes the restrictions preventing changes to compression
settings when compressed chunks exist. Compression settings can now
be changed at any time and will not affect existing compressed chunks,
but any subsequent compress_chunk will apply the changed settings.
A decompress_chunk/compress_chunk cycle will also compress the chunk
with the new settings.
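
A minimal sketch, assuming an existing compressed hypertable `metrics` with a
`device_id` column (both hypothetical):

```
-- Changing settings no longer requires decompressing existing chunks:
ALTER TABLE metrics SET (timescaledb.compress_segmentby = 'device_id');
-- Only chunks compressed from now on pick up the new segmentby setting:
SELECT compress_chunk(c) FROM show_chunks('metrics') c;
```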
2024-01-25 16:13:15 +01:00
Nikhil Sontakke
c715d96aa4 Don't dump unnecessary extension tables
Logging- and caching-related tables from the timescaledb extension
should not be dumped using pg_dump. Our scripts specify a few such
unwanted tables. Apart from being unnecessary, the "job_errors" table had
some restricted permissions, causing additional problems in pg_dump.

We now don't include such tables for dumping.

Fixes #5449
2024-01-25 12:01:11 +05:30
Konstantina Skovola
929ad41fa7 Fix now() plantime constification with BETWEEN
Previously, when BETWEEN ... AND was used together with additional
constraints in a WHERE clause, the BETWEEN was not handled correctly
because it was wrapped in a BoolExpr node, which prevented plan-time
exclusion. The flattening of such expressions happens in
`eval_const_expressions`, which gets called after our constify_now code.
This commit fixes the handling of this case to allow chunk exclusion to
take place at planning time.
Also, make sure we use our mock timestamp in all places in tests.
Previously we were using a mix of current_timestamp_mock and now(),
which returned unexpected/incorrect results.
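
For illustration, the kind of query this fixes, assuming a hypertable `metrics`
with `time` and `device_id` columns (hypothetical):

```
-- The BETWEEN bounds on now() can now be constified so that
-- chunks are excluded at planning time, even with extra constraints:
SELECT * FROM metrics
WHERE time BETWEEN now() - interval '1 week' AND now()
  AND device_id = 1;
```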
2024-01-24 18:25:39 +02:00
Alexander Kuzmenkov
92e9009960 Vectorize more arithmetic functions
We didn't account for some permutations of types like comparing int4 to
int2.
2024-01-24 14:22:37 +01:00
Konstantina Skovola
942f1fb399 Fix OSM build failure
Commit #6513 removed some restricted chunk operations, which made it
possible to add constraints to OSM chunks directly. This operation must
remain blocked on OSM chunks, so the present commit ensures that adding a
constraint directly on an OSM chunk is blocked.
2024-01-23 17:07:47 +02:00
Ante Kresic
80f1b23738 Reduce index lock in segmentwise recompression
Segmentwise recompression grabbed an AccessExclusiveLock on
the compressed chunk index, which would block all read operations
on the chunk that involved said index. Reducing the lock to
ExclusiveLock allows reads, unblocking other ongoing operations.
2024-01-23 15:08:57 +01:00
Sven Klemm
0b23bab466 Include _timescaledb_catalog.metadata in dumps
This patch changes the dump configuration for
_timescaledb_catalog.metadata to include all entries. To allow loading
logical dumps with this configuration, an insert trigger is added that
turns uniqueness conflicts into updates so the restore is not blocked.
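
A sketch of the idea behind such a trigger (not the actual catalog code; the
function name and column names are assumptions for illustration):

```
-- On restore, an INSERT that conflicts on the metadata key is turned
-- into an UPDATE instead of failing the whole restore.
CREATE OR REPLACE FUNCTION metadata_insert_as_update() RETURNS trigger AS $$
BEGIN
  UPDATE _timescaledb_catalog.metadata SET value = NEW.value WHERE key = NEW.key;
  IF FOUND THEN
    RETURN NULL;  -- row already exists, skip the conflicting insert
  END IF;
  RETURN NEW;
END
$$ LANGUAGE plpgsql;
```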
2024-01-23 12:53:48 +01:00
Jan Nidzwetzki
4cb7bacd17 Change GitHub token for changelog check
Since #6505, the changelog script tries to access the
secrets.ORG_AUTOMATION_TOKEN. However, accessing secrets is not possible
for PRs. This PR changes the token to the default access token, which
is available in PRs and provides read access to the issue API.
2024-01-23 11:15:52 +01:00
Alexander Kuzmenkov
d7018953c1 Fix joins with compressed chunks
We couldn't use a parameterized index scan on the segmentby column of a
compressed chunk, effectively making joins on segmentby unusable. We
missed this when bulk-updating test references for PG16. The underlying
reason is the incorrect creation of equivalence members for segmentby
columns of compressed chunk tables. This commit fixes it.
2024-01-23 10:18:05 +01:00
Sven Klemm
acd69d4cd2 Support all time_bucket variants in plan time chunk exclusion
Since the optional time_bucket arguments like offset, origin, and
timezone shift the output by at most one bucket width, we can optimize
these similarly to how we optimize the other time_bucket constraints.
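
For example (hypothetical hypertable `metrics`), constraints like the following
can now be used for plan-time chunk exclusion as well:

```
SELECT * FROM metrics
WHERE time_bucket('1 day', time, origin => '2024-01-01') > '2024-01-15';
```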

Fixes #4825
2024-01-22 16:23:23 +01:00
Fabrízio de Royes Mello
a44a19b095 Remove some multinode leftovers from CAggs 2024-01-22 12:14:17 -03:00
Matvey Arye
e89bc24af2 Add functions for determining compression defaults
Add functions to help determine defaults for segment_by and order_by.
2024-01-22 08:10:23 -05:00
Konstantina Skovola
2ab2a4fbb5 List failed tests on PG16 CI 2024-01-22 14:28:38 +02:00
Konstantina Skovola
77981ba30b Fix OSX build failure
Use Python 3.12 instead of 3.11 to avoid brew link error
with dependencies.
2024-01-22 13:35:20 +02:00
Mats Kindahl
b7d0c14e32 Send message to security channel on segfault label
If an issue is labeled with either `segfault` or `security`, send a
message to the security channel that the issue needs attention.
2024-01-22 11:34:29 +01:00
Sven Klemm
754f77e083 Remove chunks_in function
This function was used to propagate chunk exclusion decisions from
an access node to data nodes and is no longer needed with the removal
of multinode.
2024-01-22 09:18:26 +01:00
Sven Klemm
9722eabb38 Refactor compression settings processing 2024-01-20 17:24:16 +01:00
Alexander Kuzmenkov
2174a7188d Allow mentioning fixed issues in the changelog
Also keep allowing the PR to be mentioned, as is done now. The issue
numbers must match the issues referenced by the PR.
2024-01-19 19:39:13 +01:00
Alexander Kuzmenkov
de93e3916a Use parameterized time_bucket for chunk exclusion
When time_bucket is compared to a constant in WHERE, we also add a
condition on the underlying time variable (ts_transform_time_bucket_comparison).
Unfortunately, we only did this for plan-time constants, which prevented
chunk exclusion when the interval is given by a query parameter and a
generic plan is used. This commit also tries to apply this optimization
after applying the execution-time constants.

This PR also enables startup exclusion based on parameterized filters.
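
A sketch of the pattern that now benefits, assuming a hypertable `metrics`
(hypothetical):

```
-- With a generic plan, $1 is only known at execution time; the time_bucket
-- comparison can now still be used to exclude chunks.
PREPARE recent(timestamptz) AS
  SELECT * FROM metrics WHERE time_bucket('1 hour', time) > $1;
EXECUTE recent(now() - interval '1 day');
```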
2024-01-19 19:39:13 +01:00
Alexander Kuzmenkov
02846e326a Improve memory locality in batch sorted merge
In the binary heap, address smaller dedicated structures with data
necessary for comparison, instead of the entire CompressedBatch
structures. This is important when we have a large number of batches.
2024-01-19 17:50:48 +01:00
Sven Klemm
20804275ee Remove outdated checks from update scripts
This patch removes some version checks that are now superfluous.
The oldest version our update process needs to be able to handle is
2.1.0, as previous versions will not work with currently supported
Postgres versions.
2024-01-19 17:47:26 +01:00
Sven Klemm
0a9285075e Remove reference to PG12 in cmake file 2024-01-19 16:50:48 +01:00
Sven Klemm
d5fa21de0a Add isolation test for parallel compression
The changes for per-chunk compression settings got rid of some
locking that previously prevented compressing different chunks
of the same hypertable concurrently. This patch just adds an isolation
test for that functionality.
2024-01-17 20:40:20 +01:00
Sven Klemm
f57d584dd2 Make compression settings per chunk
This patch implements changes to the compressed hypertable to allow
per-chunk configuration. To enable this, the compressed hypertable can no
longer be in an inheritance tree, as the schema of the compressed chunk
is determined by the compression settings. While this patch implements
all the underlying infrastructure changes, the restrictions for changing
compression settings remain intact and will be lifted in a follow-up patch.
2024-01-17 12:53:07 +01:00
Konstantina Skovola
55ae29cf16 Fix if_not_exists behavior for CAgg policy with NULL offsets
Fixes #5688
2024-01-17 11:38:08 +02:00
Nikhil Sontakke
15c14bd339 Enable debug info about compression scan paths
We enabled scans using indexes for a chunk that is to be compressed.
The theory was that avoiding the tuplesort will be a win if there is an
index matching the compression settings. However, a few customers
have reported very slow compression timings with lots of disk usage. It's
important to know which scan is being used for the compression in such
cases to help debug the issue. There's an existing GUC parameter which
was available only in debug builds until now. Make it available in release
builds as well.
2024-01-17 12:57:32 +05:30
Ante Kresic
45bc8a01c5 Remove reindex_relation from recompression
We used to reindex the relation when compressing chunks. Recently
we moved to inserting into indexes on compressed chunks in
order to reduce the locks necessary for the operation. Since
recompression uses RowCompressor, it also started inserting
tuples into indexes, but we never removed the relation reindexing.
This change removes the unnecessary reindex call.
2024-01-16 13:27:15 +01:00
Nikhil Sontakke
7ae7cc5713 Disallow triggers on CAggs
We don't support triggers on CAggs yet, so disallow them explicitly
until we support them later.

Fixes #6500
2024-01-15 19:01:55 +05:30
Alexander Kuzmenkov
095504938d Re-enable clang-tidy in CI
Also fix the accumulated problems.
2024-01-15 13:57:36 +01:00
Sven Klemm
d16bc17fe6 Remove grep based update script check
Remove the grep-based update script check and add some additional
checks to the AST-based one.
2024-01-15 13:33:05 +01:00