Change the compression policy to use segmentwise
recompression when possible to improve performance.
Segmentwise recompression decompresses rows into memory,
reducing IO load during recompression and making it
much faster for bigger chunks.
The retention and compression policies can now use the drop_created_before
and compress_created_before arguments, respectively, to select chunks
based on their creation time.
Creation times are not yet supported for CAggs.
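A minimal sketch of the new arguments, assuming a hypothetical hypertable named `metrics` (the interval values are illustrative):

```sql
-- Select chunks by creation time rather than by the time values they contain.
-- Drop chunks that were created more than six months ago:
SELECT add_retention_policy('metrics', drop_created_before => INTERVAL '6 months');

-- Compress chunks that were created more than seven days ago:
SELECT add_compression_policy('metrics', compress_created_before => INTERVAL '7 days');
```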
In two instances of the exception handling, `_message` and `_detail` were
not properly set in `compression_policy_execute`, leading to an empty
message and detail being reported.
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema, our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- policy_compression_check(jsonb)
- policy_compression_execute(integer,integer,anyelement,integer,boolean,boolean)
- policy_compression(integer,jsonb)
- policy_job_error_retention_check(jsonb)
- policy_job_error_retention(integer,jsonb)
- policy_recompression(integer,jsonb)
- policy_refresh_continuous_aggregate_check(jsonb)
- policy_refresh_continuous_aggregate(integer,jsonb)
- policy_reorder_check(jsonb)
- policy_reorder(integer,jsonb)
- policy_retention_check(jsonb)
- policy_retention(integer,jsonb)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema, our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions (a sketch of the adjustment follows the list):
- to_unix_microseconds(timestamptz)
- to_timestamp(bigint)
- to_timestamp_without_timezone(bigint)
- to_date(bigint)
- to_interval(bigint)
- interval_to_usec(interval)
- time_to_internal(anyelement)
- subtract_integer_from_now(regclass, bigint)
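A hedged sketch of the kind of adjustment described in these two entries; the dedicated schema name used here is an assumption for illustration:

```sql
-- Move an internal helper out of the chunk schema into a dedicated functions schema
-- (the target schema name `_timescaledb_functions` is assumed for this example):
ALTER FUNCTION _timescaledb_internal.to_unix_microseconds(timestamptz)
  SET SCHEMA _timescaledb_functions;
```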
This patch changes INSERTs into compressed chunks to no longer
be immediately compressed; instead, the rows are stored in the
uncompressed chunk and later merged into the compressed chunk by a
separate job.
This greatly simplifies the INSERT code path, as we no longer have
to rewrite the target of INSERTs and compress on the fly, leading
to a roughly 2x improvement in INSERT rate into compressed chunks.
Additionally this improves TRIGGER support for INSERTs into
compressed chunks.
This is a necessary refactoring to allow UPSERT/UPDATE/DELETE on
compressed chunks in follow-up patches.
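A minimal sketch of the resulting flow, assuming a hypothetical compressed hypertable `metrics` (the column list and chunk name are illustrative):

```sql
-- New rows land in the uncompressed part of the chunk instead of being
-- compressed on the fly:
INSERT INTO metrics (time, device_id, value) VALUES (now(), 1, 0.5);

-- A later policy run (or a manual call) merges them into the compressed chunk:
CALL recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
```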
When a compression policy is executed by a background worker, the policy
should continue to execute even if compressing or decompressing one of
the chunks fails.
Fixes: #4610
Previously users had no way to update the check function
registered with add_job. This commit adds a check_config parameter
to alter_job to allow updating the check function field.
Also, previously the signature expected from a check function was of
the form (job_id, config) and there was no validation
that the given check function had the correct signature.
This commit removes the job_id argument, as it is not required, and
also verifies that the check function has the correct signature
when it is registered with add_job, preventing an error from being
thrown at job runtime.
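A minimal sketch of the resulting usage; the job procedure, check function, config keys, and job id are illustrative assumptions:

```sql
-- Check functions now take only the config (no job_id) and should raise on invalid input:
CREATE FUNCTION my_job_check(config jsonb) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
  IF config IS NULL OR NOT config ? 'retention_window' THEN
    RAISE EXCEPTION 'config must contain retention_window';
  END IF;
END
$$;

-- The signature is validated when the check function is registered:
SELECT add_job('my_job_proc', '1 day',
               config => '{"retention_window": "1 month"}',
               check_config => 'my_job_check');

-- And the registered check can now be updated later:
SELECT alter_job(1000, check_config => 'my_job_check');
```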
The old patch was using the old validation functions, but there are already
validation functions that both read and validate the policy, so we use
those instead. Also remove the old `job_config_check` function, since it is
no longer used, and instead add a `job_config_check` that calls the
check function with the configuration.
Postgres will prepend pg_temp to the effective search_path if it
is not present in the search_path. While pg_temp will never be
used to look up functions or operators unless explicitly requested,
it will be used to look up relations. Putting pg_temp last in the
search_path makes sure objects in pg_temp are considered last
and pg_temp cannot be used to mask existing objects.
SET LOCAL is only active until the end of the transaction, so we set
search_path again after COMMIT in functions that do transaction control.
While we could use SET at the start of the function, we do not want to
leak the search_path to the caller.
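A minimal sketch of the pattern, with an illustrative procedure body (not the actual policy code):

```sql
CREATE PROCEDURE policy_sketch() LANGUAGE plpgsql AS $$
BEGIN
  -- pg_temp is listed explicitly and last, so it cannot shadow catalog objects
  SET LOCAL search_path TO pg_catalog, pg_temp;
  -- ... do some work ...
  COMMIT;
  -- SET LOCAL only lasts until end of transaction, so re-establish it after COMMIT
  SET LOCAL search_path TO pg_catalog, pg_temp;
  -- ... continue in the new transaction ...
END
$$;
```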
This patch locks down search_path in extension install and update
scripts to only contain pg_catalog; this requires that any reference
in those scripts is fully qualified. Additionally, we add explicit
create commands to all update scripts for objects added to the
public schema. This change will make update scripts fail if a
function with an identical signature already exists when installing
or upgrading, instead of reusing the existing object.
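A sketch of the update-script pattern this describes; the function name and body are illustrative assumptions:

```sql
-- Only pg_catalog is on the search_path, so every other reference must be
-- schema-qualified:
SET LOCAL search_path TO pg_catalog;

-- Plain CREATE (not CREATE OR REPLACE): the script fails if an object with the
-- same signature already exists in public, instead of silently reusing it.
CREATE FUNCTION public.example_api(arg integer) RETURNS integer
LANGUAGE sql AS $$ SELECT arg $$;
```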
When executing the recompress chunk policy concurrently with queries, a
deadlock can be generated because the chunk relation and the chunk
index, or the uncompressed and the compressed chunk, are locked in
different orders. In particular, when the recompress chunk policy is
executing, it will first decompress the chunk and, as part of that, lock
the compressed chunk in `AccessExclusive` mode when dropping it; when
trying to compress the chunk again it will try to lock the uncompressed
chunk in `AccessExclusive` mode as part of truncating it.
To avoid the deadlock, this commit updates the recompress policy to do
the compression and the decompression steps in separate transactions,
which avoids the deadlock since each phase (decompress and compress
chunk) locks indexes and compressed/uncompressed chunks in the same
order.
Note that this fixes only the policy, not the `recompress_chunk`
function, which is still prone to deadlocks.
Partial-Bug: #3846
Commit fffd6c2350f5b3237486f3d49d7167105e72a55b fixes a problem related
to PortalContext by using a PL/pgSQL procedure to execute the policy.
Unfortunately this new implementation introduced a problem when we use
INTEGER and not BIGINT for the time dimension.
Fixed it by dealing correctly with the integer types: SMALLINT, INTEGER
and BIGINT.
Also refactored the policy compression procedure, replacing the two
procedures `policy_compression_{interval|integer}` with a single
`policy_compression_execute` that casts the dimension type dynamically.
Fixes #3773
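An illustrative sketch of casting the dimension boundary dynamically; the variable names and values are assumptions, not the actual implementation:

```sql
DO $$
DECLARE
  dimension_type text   := 'integer';  -- could equally be 'smallint' or 'bigint'
  lag_value      bigint := 1000;
  boundary       bigint;
BEGIN
  -- Cast the policy's lag value to the hypertable's dimension type at runtime
  EXECUTE format('SELECT %L::%s', lag_value, dimension_type) INTO boundary;
  RAISE NOTICE 'boundary = %', boundary;
END
$$;
```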
This PR removes the C code that executes the compression
policy. Instead we use a PL/pgSQL procedure to execute
the policy.
PG13.4 and PG12.8 introduced some changes
that require PortalContexts while executing transactions.
The compression policy procedure compresses chunks in
multiple transactions. We have seen some issues with snapshots
and portal management in the policy code (due to the
PG13.4 code changes). The SPI API has transaction-portal management
code. However, the compression policy code does not use SPI
interfaces. It is fairly easy to convert this into
a PL/pgSQL procedure (which calls SPI) rather than replicating
portal management code in C to manage multiple transactions in the
compression policy.
This PR also disallows decompress_chunk, compress_chunk and
recompress_chunk in read-only transactions.
Fixes #3656
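A minimal sketch of the approach, assuming a hypothetical hypertable `metrics`; the procedure name and chunk-selection query are illustrative, not the actual policy:

```sql
CREATE PROCEDURE compress_chunks_sketch()
LANGUAGE plpgsql AS $$
DECLARE
  chunks regclass[];
  chunk  regclass;
BEGIN
  -- Collect the eligible chunks up front, then compress each one in its own
  -- transaction; PL/pgSQL (via SPI) takes care of the portal management.
  SELECT array_agg(c) INTO chunks
    FROM show_chunks('metrics', older_than => INTERVAL '7 days') AS c;
  FOREACH chunk IN ARRAY coalesce(chunks, '{}') LOOP
    PERFORM compress_chunk(chunk, if_not_compressed => true);
    COMMIT;
  END LOOP;
END
$$;
```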
This moves the SQL definitions for policy and job APIs to their own
separate files to improve code structure. Previously, all of these
user-visible API functions were located in the `bgw_scheduler.sql`
file, mixing internal and public functions and APIs.
To improve the structure, all API-related functions are now located
in their own distinct SQL files that have the `_api.sql` file
ending. Internal policy functions have been moved to
`policy_internal.sql`.