The `compress_chunk()` function can now be used to create hyperstores
by passing the option `compress_using => 'hyperstore'` to the
function.
Using the `compress_chunk()` function is an alternative to using
`ALTER TABLE my_hyperstore SET ACCESS METHOD` that is compatible with
the existing way of compressing hypertable chunks. It will also make
it easier to support hyperstore compression via compression policies.
Additionally, implement "fast migration" to hyperstore when a table is
already compressed. In that case, simply update the PG catalog to say
that the table is using hyperstore as TAM without rewriting the
table. This fast migration works with both `...SET ACCESS METHOD`
and `compress_chunk()`.
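For illustration, a minimal sketch of the two conversion paths described
above; the chunk name is a placeholder:

```sql
-- Convert an existing chunk using compress_chunk(); chunk name is a placeholder.
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk',
                      compress_using => 'hyperstore');

-- The equivalent conversion via the table access method interface.
ALTER TABLE _timescaledb_internal._hyper_1_1_chunk SET ACCESS METHOD hyperstore;
```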
Change the code to remove the assumption of a 1:1
mapping between chunks and chunk constraints. This includes
a check for whether a chunk constraint is shared with another chunk;
in that case, the dimension slice is not deleted.
The removal function would only remove chunk_constraints that are
part of dimension constraints. This patch changes it to remove all
constraints of a chunk.
Historically we preserve chunk metadata because the old format of the
Continuous Aggregate has the `chunk_id` column in the materialization
hypertable, so in order not to leave chunk ids behind there we just
mark the chunk as dropped when dropping chunks.
In #4269 we introduced a new Continuous Aggregate format that doesn't
store the `chunk_id` in the materialization hypertable anymore, so it's
safe to also remove the metadata when dropping a chunk if all associated
Continuous Aggregates are in the new format.
Also added a post-update SQL script to clean up unnecessary dropped chunk
metadata in our catalog.
Closes #6570
This patch deprecates the recompress_chunk procedure as all that
functionality is covered by compress_chunk now. This patch also adds a
new optional boolean argument to compress_chunk to force applying
changed compression settings to existing compressed chunks.
This patch changes those functions to no longer error by default when
the chunk is not in the expected state; instead, a warning is raised.
This is in preparation for changing compress_chunk to forward to
the appropriate operation depending on the chunk state.
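A sketch of using the new boolean argument mentioned above; the argument
name `recompress` is an assumption for illustration and may not match the
actual parameter name:

```sql
-- Hypothetical argument name: force applying changed compression settings
-- to a chunk that is already compressed.
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk',
                      recompress => true);
```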
If users have accidentally been removed from `pg_authid` as a result of
bugs where dropping a user did not revoke privileges from all tables
where they had privileges, it will not be possible to create new chunks
since these require the user to be found when copying the privileges
for the parent table (either a compressed hypertable or a normal
hypertable).
To fix the situation, we repair the `pg_class` table when updating the
extension by modifying the `relacl` for relations and removing any users
that do not have an entry in `pg_authid`.
A repair function `_timescaledb_functions.repair_relation_acls` is
added that will perform the job. A version of `makeaclitem` from PG16
that accepts a comma-separated list of privileges and is used as part of
the repair is also added as `_timescaledb_functions.makeaclitem`.
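As a rough sketch of the underlying problem (not the actual repair
implementation), relations whose ACL still references a role missing from
`pg_authid` can be listed using only standard catalog functions:

```sql
-- Illustrative only: find relations whose relacl references a role that
-- no longer exists in pg_authid (candidates for repair).
SELECT c.oid::regclass AS relation, a.grantee
  FROM pg_class c,
       LATERAL aclexplode(c.relacl) AS a
 WHERE c.relacl IS NOT NULL
   AND a.grantee <> 0  -- grantee 0 means PUBLIC
   AND NOT EXISTS (SELECT 1 FROM pg_authid WHERE oid = a.grantee);
```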
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- chunk_constraint_add_table_constraint(_timescaledb_catalog.chunk_constraint)
- chunk_drop_replica(regclass,name)
- chunk_index_clone(oid)
- chunk_index_replace(oid,oid)
- create_chunk_replica_table(regclass,name)
- drop_stale_chunks(name,integer[])
- health()
- hypertable_constraint_add_table_fk_constraint(name,name,name,integer)
- process_ddl_event()
- wait_subscription_sync(name,name,integer,numeric)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- calculate_chunk_interval(int, bigint, bigint)
- chunk_status(regclass)
- chunks_in(record, integer[])
- chunk_id_from_relid(oid)
- show_chunk(regclass)
- create_chunk(regclass, jsonb, name, name, regclass)
- set_chunk_default_data_node(regclass, name)
- get_chunk_relstats(regclass)
- get_chunk_colstats(regclass)
- create_chunk_table(regclass, jsonb, name, name)
- freeze_chunk(regclass)
- unfreeze_chunk(regclass)
- drop_chunk(regclass)
- attach_osm_table_chunk(regclass, regclass)
This patch introduces a C function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.
This improves performance for the following reasons:
- it needs to sort less data at a time and
- it avoids recreating the decompressed chunk and the heap
inserts associated with that by decompressing each segment
into a tuplesort instead.
If no segmentby is specified when enabling compression or if an
index does not exist on the compressed chunk then the operation is
performed as before, decompressing and subsequently
compressing the entire chunk.
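For the segment-wise path to be taken, a segmentby column must have been
specified when enabling compression; a minimal sketch with placeholder
table and column names:

```sql
-- Placeholder names; a segmentby column enables the index on the compressed
-- chunk that segment-wise recompression relies on.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id'
);
```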
A chunk is in this state when it is compressed but also has
uncompressed data in the uncompressed chunk. An individual tuple
can only ever exist in one of the two areas. This is a preparation patch
to add support for an uncompressed staging area for DML operations.
Postgres will prepend pg_temp to the effective search_path if it
is not present in the search_path. While pg_temp will never be
used to look up functions or operators unless explicitly requested,
pg_temp will be used to look up relations. Putting pg_temp explicitly
at the end of the search_path makes sure objects in pg_temp will be
considered last and pg_temp cannot be used to mask existing objects.
SET LOCAL is only active until the end of the transaction, so we set
search_path again after COMMIT in functions that do transaction control.
While we could use SET at the start of the function, we do not want to
leak the search_path setting to the caller.
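A sketch of the pattern, using a hypothetical procedure that does its own
transaction control:

```sql
-- Hypothetical procedure illustrating the pattern: re-pin search_path after
-- each COMMIT, since SET LOCAL only lasts until the end of the transaction.
CREATE PROCEDURE example_policy_proc()
LANGUAGE plpgsql AS $$
BEGIN
    SET LOCAL search_path TO pg_catalog, pg_temp;
    -- ... first batch of work ...
    COMMIT;
    -- The previous SET LOCAL no longer applies, so set it again.
    SET LOCAL search_path TO pg_catalog, pg_temp;
    -- ... second batch of work ...
END;
$$;
```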
This patch locks down search_path in extension install and update
scripts to only contain pg_catalog; this requires that any reference
in those scripts is fully qualified. Additionally, we add explicit
create commands to all update scripts for objects added to the
public schema. This change will make update scripts fail if a
function with an identical signature already exists when installing
or upgrading, instead of reusing the existing object.
When executing `recompress_chunk` and a query at the same time, a
deadlock can be generated because the chunk relation, the chunk
index, and the compressed and uncompressed chunks are locked in
different orders. In particular, when `recompress_chunk` is executing, it will
first decompress the chunk and as part of that lock the uncompressed
chunk index in AccessExclusive mode and when trying to compress the
chunk again it will try to lock the uncompressed chunk in
AccessExclusive as part of truncating it.
Note that `decompress_chunk` and `compress_chunk` lock the relations in
the same order; the issue arises because the two operations are combined
into a single transaction.
To avoid the deadlock, this commit rewrites the `recompress_chunk` to
be a procedure and adds a commit between the decompression and
compression. Committing the transaction after the decompress will allow
reads and inserts to proceed by working on the uncompressed chunk, and
the compression part of the procedure will take the necessary locks in
strict order, thereby avoiding a deadlock.
In addition, the isolation test is rewritten so that instead of adding
a waitpoint in the PL/SQL function, we implement the isolation test by
taking a lock on the compressed table after the decompression.
Fixes #3846
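With this change `recompress_chunk` is a procedure, so it is invoked with
`CALL` (chunk name is a placeholder):

```sql
-- recompress_chunk is now a procedure and must be invoked with CALL.
CALL recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
```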
Remove the copy_chunk_data() function and the code needed to support it,
such as the 'transactional' argument.
Rework the copy chunk logic to use separate stages.
Introduce a copy_chunk() API function as an internal wrapper around
move_chunk().
The building blocks required for implementing end-to-end copy/move
chunk functionality have now been wrapped in a procedure.
A procedure is required because multiple transactions are needed to
carry out the activity across the access node and the two involved data
nodes.
The following steps are encapsulated in this procedure:
1) Create an empty chunk table on the destination data node
2) Copy the data from the src data node chunk to this newly created
destination node chunk. This is done via built-in PostgreSQL logical
replication functionality
3) Attach this chunk to the hypertable on the dst data node
4) Remove this chunk from the src data node to complete the move if
requested
A new catalog table "chunk_copy_activity" has been added to track
the progress of the above stages. A unique id gets assigned to each
activity and it is updated with the completed stages as things
progress.
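A hedged sketch of invoking the procedure; the schema and parameter names
are assumptions for illustration and may not match the actual signature:

```sql
-- Illustrative only: copy or move a chunk between two data nodes.
CALL timescaledb_experimental.move_chunk(
    chunk            => '_timescaledb_internal._dist_hyper_1_1_chunk',
    source_node      => 'data_node_1',
    destination_node => 'data_node_2'
);
```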
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but not part of this PR.
This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).
The recompress_chunk function is automatically invoked by the compression
policy job when it sees that a chunk is in the unordered state.
This commit improves the API of compress_chunk and decompress_chunk:
- have them return the regclass of the processed chunk (or NULL in the
  idempotent case);
- mark them as STRICT;
- add if_not_compressed/if_compressed options for idempotency.
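A minimal sketch of the idempotent usage, assuming a hypertable named
`metrics`:

```sql
-- Compress all chunks; chunks that are already compressed return NULL
-- instead of raising an error.
SELECT compress_chunk(c, if_not_compressed => true)
  FROM show_chunks('metrics') AS c;

-- The corresponding idempotent decompression.
SELECT decompress_chunk(c, if_compressed => true)
  FROM show_chunks('metrics') AS c;
```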
Add support for the compress_chunks function.
This also adds support for the compress_orderby and compress_segmentby
parameters in ALTER TABLE. These parameters are used by the
compress_chunks function.
The parsing code will most likely be changed to use the PG raw_parser
function.
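A sketch of the intended usage at this stage; the exact `compress_chunks`
call shown here is an assumption, and the table, column, and chunk names
are placeholders:

```sql
-- Enable compression with the new ALTER TABLE parameters, then compress.
ALTER TABLE conditions SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'location',
    timescaledb.compress_orderby   = 'time DESC'
);
SELECT compress_chunks('_timescaledb_internal._hyper_2_3_chunk');
```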
Adds a move_chunk function which moves a chunk to a different tablespace.
This is implemented as an extension to the reorder command.
Given that the heap, toast tables, and indexes are being rewritten
during the reorder operation, adding the ability to modify the tablespace
is relatively simple and mostly requires adding parameters to the relevant
functions for the destination tablespace (and index tablespace). The tests
do not focus on further exercising the reorder infrastructure, but instead
ensure that tablespace movement and permissions checks properly occur.
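A hedged example; the parameter names here are assumptions for
illustration:

```sql
-- Illustrative only: move a chunk, including its indexes, to other tablespaces.
SELECT move_chunk(
    chunk                        => '_timescaledb_internal._hyper_1_4_chunk',
    destination_tablespace       => 'history_data',
    index_destination_tablespace => 'history_indexes',
    reorder_index                => '_timescaledb_internal._hyper_1_4_chunk_time_idx'
);
```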
New cluster-like command which writes to a new index, then swaps,
much like is done for the data table, and only acquires
exclusive locks for said swap. This trades off disk usage for
lower contention: we hold locks for a much lower period of time,
allowing reads to work concurrently, but we have both the old
and new versions of the table existing at once, approximately
doubling storage usage while reorder is running.
Currently only works on chunks.
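A hedged example of running this on a chunk; the function name and
arguments are assumed for illustration:

```sql
-- Illustrative only: rewrite a chunk in the order of one of its indexes.
SELECT reorder_chunk(
    '_timescaledb_internal._hyper_1_2_chunk',
    '_timescaledb_internal._hyper_1_2_chunk_time_idx'
);
```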