In #4770 we changed the behavior of compression policies to continue
compressing chunks even if a failure happens in one of them.
The problem is that if a failure happens in one or all of the chunks, the job
is still marked as successful, which is wrong. So, if an exception occurs while
compressing chunks, we now raise an exception at the end of the procedure in
order to mark the job as failed.
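A minimal sketch of the intended pattern in PL/pgSQL; the procedure name and the exact error handling are illustrative, not the actual policy code:
```
-- Illustrative sketch only: compress every chunk, remember failures,
-- and raise at the end so the job is marked as failed.
CREATE OR REPLACE PROCEDURE compress_all_chunks(ht REGCLASS)
LANGUAGE plpgsql AS $$
DECLARE
    ch REGCLASS;
    failures INT := 0;
BEGIN
    FOR ch IN SELECT show_chunks(ht) LOOP
        BEGIN
            PERFORM compress_chunk(ch);
        EXCEPTION WHEN OTHERS THEN
            RAISE WARNING 'compressing chunk % failed: %', ch, SQLERRM;
            failures := failures + 1;
        END;
    END LOOP;
    IF failures > 0 THEN
        RAISE EXCEPTION '% chunk(s) failed to compress', failures;
    END IF;
END;
$$;
```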
Fix missing alignment handling as well as buffer size problems. We were
not properly reporting the failures in the compression_algo tests, so we
didn't notice these problems. Add part of the tests from the "vector text
predicates" PR to improve coverage.
If a lot of chunks are involved, the current PL/pgSQL function that
computes the size of each chunk via a nested loop is pretty slow.
Additionally, the current implementation makes a system call to get the
on-disk file size for each chunk every time the function is called,
which slows things down further. We now provide an approximate function,
implemented in C, to avoid the issues of the PL/pgSQL function.
This C function also uses per-backend caching via the smgr layer to
compute the approximate size cheaply. PostgreSQL's cache invalidation
clears the cached size for a chunk when DML happens on it, so the size
cache picks up the latest size within minutes. Thanks to the backend
caching, a long-running session only fetches fresh data for new or
modified chunks and can effectively use the cached data (calculated
afresh the first time around) for older chunks.
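Assuming the C implementation is exposed as hypertable_approximate_size()/hypertable_approximate_detailed_size() (function and table names here are assumptions for illustration), usage looks like:
```
-- Approximate sizes computed cheaply in C with per-backend caching.
SELECT hypertable_approximate_size('metrics');
SELECT * FROM hypertable_approximate_detailed_size('metrics');
```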
The issues caused by OpenSSL 3.2 seem to be resolved now that components
install 3.1.x, so remove our temporary changes.
We also disable uploading of test stats until the next minor release
of PG on macOS.
This patch changes those functions to no longer error by default when
the chunk is not in the expected state; instead, a warning is raised.
This is in preparation for changing compress_chunk to forward to
the appropriate operation depending on the chunk state.
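For example, calling compress_chunk on an already compressed chunk would then emit a warning rather than an error (the chunk name and exact message text are illustrative):
```
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
-- WARNING:  chunk "_hyper_1_1_chunk" is already compressed
```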
When inserting into a compressed chunk with an ON CONFLICT clause, and
there is a concurrent recompression of the chunk, an error can occur
due to a failed lookup of the compressed chunk.
The error happens because the chunk information is re-read from an
outdated cache holding a compressed chunk ID that no longer exists
after the chunk was recompressed (recompression effectively removes
the compressed chunk and creates it again).
Fix this issue by ensuring that the insert path only uses up-to-date
chunk metadata read _after_ the lock on the chunk was taken.
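The affected statement shape, sketched against a hypothetical metrics hypertable with a unique constraint on (time, device_id):
```
INSERT INTO metrics (time, device_id, value)
VALUES (now(), 42, 1.0)
ON CONFLICT (time, device_id) DO UPDATE SET value = EXCLUDED.value;
-- Previously this could fail if another session recompressed the target
-- chunk concurrently.
```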
Foreign keys pointing to hypertables are not supported. Creating a
hypertable from a table referenced by foreign key succeeds, but it
leaves the referencing (child) table in a broken state, failing on every
insert with a `violates foreign key constraint` error.
To prevent this scenario, if a foreign key reference to the table exists
before converting it to a hypertable, the following error will be
raised:
```
cannot have FOREIGN KEY contraints to hypertable "<table_name>"
HINT: Remove all FOREIGN KEY constraints to table "<table_name>"
before making it a hypertable.
```
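A minimal way to hit the new error (table and column names are illustrative):
```
CREATE TABLE metrics (time timestamptz NOT NULL, device_id int, PRIMARY KEY (time, device_id));
CREATE TABLE annotations (
    note text,
    metric_time timestamptz,
    metric_device int,
    FOREIGN KEY (metric_time, metric_device) REFERENCES metrics (time, device_id)
);
-- Now errors instead of leaving "annotations" in a broken state:
SELECT create_hypertable('metrics', 'time');
```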
Fixes #6452
Refactor the compression code to move the column map code into a
separate function, and change tsl_get_compressed_chunk_index_for_recompression
to no longer initialize RowCompressor.
The watermark function for CAggs is declared as STABLE since the value
of the function changes after every CAgg refresh. The function
volatility prevents the planner from replacing the function invocation
with a constant value and executing plan time chunk exclusion. This
leads to high planning times on hypertables with many chunks.
This PR replaces the function invocation with a constant value to allow
plan time exclusion of chunks. We perform the replacement at plan time
instead of changing the function volatility to IMMUTABLE, because we
want to control the constification. Only queries that access the
underlying hypertable (i.e., not queries like SELECT
cagg_watermark(...) without any FROM clause) are rewritten. This is
done to make sure that the query is properly invalidated when the
underlying table changes (e.g., the watermark is updated) and is
replanned on the subsequent execution.
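A sketch of the query shape that benefits, roughly what a real-time CAgg view generates internally (the schema, helper functions, and the literal hypertable id are illustrative and may differ by version):
```
-- The cagg_watermark(...) call is replaced by a constant at plan time,
-- so chunks of "metrics" can be excluded during planning.
SELECT time_bucket('1 hour', time) AS bucket, avg(value)
FROM metrics
WHERE time >= _timescaledb_functions.to_timestamp(_timescaledb_functions.cagg_watermark(2))
GROUP BY bucket;
```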
Fixes: #6105, #6321
Co-authored-by: Fabrizio de Royes Mello <fabriziomello@gmail.com>
This change introduces a GUC to limit the number of tuples
that can be decompressed during an INSERT, UPDATE, or
DELETE on a compressed hypertable. It is used to prevent
accidentally running out of disk space by decompressing
a lot of data.
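A sketch of how it would be used; the GUC name below is an assumption for illustration:
```
-- Assumed GUC name; a value of 0 would mean no limit.
SET timescaledb.max_tuples_decompressed_per_dml_transaction = 100000;
UPDATE metrics SET value = 0 WHERE device_id = 42;
-- Errors out instead of silently decompressing huge amounts of data.
```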
The chunk-wise aggregation pushdown code copies the existing paths to
create new ones with pushed-down aggregates. However, the
copy_merge_append_path function behaves differently than other copy
functions (e.g., copy_append_path): it resets the pathtarget on the
copy. This leads to a wrong pathlist and crashes. This PR fixes the
wrong pathtarget by setting it after the path is copied.
This patch removes the restrictions preventing changing compression
settings when compressed chunks exist. Compression settings can now
be changed at any time and will not affect existing compressed chunks,
but any subsequent compress_chunk will apply the changed settings.
A decompress_chunk/compress_chunk cycle will also compress the chunk
with the new settings.
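For example (hypothetical table and columns):
```
ALTER TABLE metrics SET (timescaledb.compress, timescaledb.compress_segmentby = 'device_id');
SELECT compress_chunk(c) FROM show_chunks('metrics') c;

-- Now allowed even though compressed chunks exist:
ALTER TABLE metrics SET (timescaledb.compress_segmentby = 'tenant_id');
-- Existing compressed chunks keep the old layout; chunks compressed from
-- now on use tenant_id as the segmentby column.
```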
Logging- and caching-related tables from the timescaledb extension
should not be dumped using pg_dump. Our scripts specify a few such
unwanted tables. Apart from being unnecessary, the "job_errors" table
had restricted permissions that caused additional problems in pg_dump.
We now exclude such tables from dumping.
Fixes #5449
Previously, when BETWEEN was used together with additional constraints
in a WHERE clause, the BETWEEN was not handled correctly because it was
wrapped in a BoolExpr node, which prevented plan-time exclusion.
The flattening of such expressions happens in `eval_const_expressions`,
which gets called after our constify_now code.
This commit fixes the handling of this case to allow chunk exclusion to
take place at planning time.
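An example of a query shape that now gets plan-time chunk exclusion (hypothetical table):
```
SELECT * FROM metrics
WHERE time BETWEEN now() - interval '1 day' AND now()
  AND device_id = 42;
-- Previously the now() calls inside the BETWEEN were not constified
-- because the expression sat inside a BoolExpr, so all chunks were planned.
```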
Also, make sure we use our mock timestamp in all places in tests.
Previously we were using a mix of current_timestamp_mock and now(),
which was returning unexpected/incorrect results.
Commit #6513 removed some restrictions on chunk operations, which made
it possible to add constraints to OSM chunks directly. This operation
is not supported on OSM chunks, so the present commit ensures that
adding a constraint directly on an OSM chunk is blocked again.
Segmentwise recompression grabbed an AccessExclusiveLock on
the compressed chunk index. This would block all read operations
on the chunk that involved said index. Reducing the lock to
ExclusiveLock allows reads to proceed, so other ongoing operations are
no longer blocked.
This patch changes the dump configuration for
_timescaledb_catalog.metadata to include all entries. To allow loading
logical dumps with this configuration, an insert trigger is added that
turns uniqueness conflicts into updates so they do not block the restore.
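A minimal sketch of such a trigger, assuming the metadata table's key/value columns; this is illustrative, not the actual trigger shipped with the extension:
```
CREATE FUNCTION metadata_insert_as_update() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- If the key already exists, turn the INSERT into an UPDATE and skip the row.
    UPDATE _timescaledb_catalog.metadata SET value = NEW.value WHERE key = NEW.key;
    IF FOUND THEN
        RETURN NULL;
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER metadata_restore
BEFORE INSERT ON _timescaledb_catalog.metadata
FOR EACH ROW EXECUTE FUNCTION metadata_insert_as_update();
```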
Since #6505, the changelog script tries to access the
secrets.ORG_AUTOMATION_TOKEN. However, accessing secrets is not possible
for PRs. This PR changes the token to the default access token, which
is available in PRs and provides read access to the issue API.
We couldn't use a parameterized index scan on the segmentby column of a
compressed chunk, effectively making joins on segmentby columns
unusable. We missed this when bulk-updating test references for PG16.
The underlying reason is the incorrect creation of equivalence members
for segmentby columns of compressed chunk tables. This commit fixes it.
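A join of the affected shape, assuming a hypothetical metrics hypertable compressed with segmentby = 'device_id':
```
SELECT d.device_id, m.time, m.value
FROM devices d
JOIN metrics m ON m.device_id = d.device_id
WHERE d.location = 'lab';
-- The inner side can now use a parameterized index scan on the compressed
-- chunks' segmentby index instead of scanning each chunk in full.
```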
Since the optional time_bucket arguments like offset, origin, and
timezone shift the output by at most one bucket width, we can optimize
these similarly to how we optimize the other time_bucket constraints.
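For example, on a hypothetical metrics hypertable:
```
SELECT time_bucket('1 day', time, origin => '2000-01-03') AS bucket, count(*)
FROM metrics
WHERE time_bucket('1 day', time, origin => '2000-01-03') > '2024-01-01'
GROUP BY bucket;
-- The origin shifts buckets by less than one bucket width, so a constraint
-- on the underlying "time" column can still be derived and chunks excluded.
```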
Fixes #4825
When time_bucket is compared to a constant in the WHERE clause, we also
add a condition on the underlying time column
(ts_transform_time_bucket_comparison). Unfortunately, we only did this
for plan-time constants, which prevents chunk exclusion when the
interval is given by a query parameter and a generic plan is used. This
commit also tries to apply this optimization after applying the
execution-time constants.
This PR also enables startup exclusion based on parameterized filters.
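A sketch of the parameterized case that now benefits, forcing a generic plan for illustration (hypothetical table):
```
SET plan_cache_mode = force_generic_plan;
PREPARE recent(interval) AS
    SELECT count(*) FROM metrics
    WHERE time_bucket($1, time) > now() - interval '7 days';
EXECUTE recent('1 hour');
-- The bucket width is only known at execution time; chunks can now still be
-- excluded at executor startup based on the parameterized filter.
```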
In the binary heap, address smaller dedicated structures with data
necessary for comparison, instead of the entire CompressedBatch
structures. This is important when we have a large number of batches.
This patch removes some version checks that are now superfluous.
The oldest version our update process needs to be able to handle is
2.1.0, as previous versions will not work with currently supported
PostgreSQL versions.
The changes for per-chunk compression settings got rid of some
locking that previously prevented concurrently compressing different
chunks of the same hypertable. This patch just adds an isolation test
for that functionality.
This patch implements changes to the compressed hypertable to allow
per-chunk configuration. To enable this, the compressed hypertable can
no longer be part of an inheritance tree, since the schema of the
compressed chunk is determined by the compression settings. While this
patch implements all the underlying infrastructure changes, the
restrictions for changing compression settings remain intact and will
be lifted in a follow-up patch.
We enabled index scans for chunks that are about to be compressed.
The theory was that avoiding the tuplesort would be a win if there is
an index matching the compression settings. However, a few customers
have reported very slow compression timings with lots of disk usage.
In such cases it's important to know which scan is being used for the
compression to help debug the issue. There is an existing GUC parameter
for this which was "DEBUG" only until now. Make it available in release
builds as well.
We used to reindex the relation when compressing chunks. Recently
we moved to inserting into the indexes on compressed chunks in
order to reduce the locks necessary for the operation. Since
recompression uses RowCompressor, it also started inserting
tuples into indexes, but we never removed the relation reindexing.
This change removes the unnecessary reindex call.