During chunk creation, the chunk's dimensional CHECK constraints are
created via an "upcall" to PL/pgSQL code. However, creating
dimensional constraints in PL/pgSQL code sometimes fails, especially
during high-concurrency inserts, because PL/pgSQL code scans metadata
using a snapshot that might not see the same metadata as the C
code. As a result, chunk creation sometimes fails during constraint
creation.
To fix this issue, implement dimensional CHECK-constraint creation in
C code. Other constraints (FK, PK, etc.) are still created via an
upcall, but should probably also be rewritten in C. However, since
these constraints don't depend on recently updated metadata, this is
left to a future change.
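For illustration, a dimensional CHECK constraint on a chunk typically looks
like the following sketch (chunk name, constraint name, and time range are
hypothetical):

    -- Hypothetical dimensional CHECK constraint, now created directly in C
    -- during chunk creation instead of via the PL/pgSQL upcall.
    ALTER TABLE _timescaledb_internal._hyper_1_1_chunk
      ADD CONSTRAINT constraint_1 CHECK (
        "time" >= '2023-01-01 00:00:00+00' AND
        "time" <  '2023-01-08 00:00:00+00'
      );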
Fixes #5456
This patch introduces a C-function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.
This improves performance for the following reasons:
- it needs to sort less data at a time, and
- it avoids recreating the decompressed chunk and the associated
heap inserts by decompressing each segment into a tuplesort instead.
If no segmentby is specified when enabling compression or if an
index does not exist on the compressed chunk then the operation is
performed as before, decompressing and subsequently
compressing the entire chunk.
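As a sketch of the precondition, the finer-grained path can be used when a
segmentby column is configured, for example (table and column names are
hypothetical):

    -- Hypothetical setup: with a segmentby column configured (and the
    -- corresponding index on the compressed chunk), recompression can work
    -- segment by segment instead of on the whole chunk.
    ALTER TABLE metrics SET (
      timescaledb.compress,
      timescaledb.compress_segmentby = 'device_id',
      timescaledb.compress_orderby   = 'time DESC'
    );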
When calling the `cagg_watermark` function to get the watermark of a
Continuous Aggregate we execute a `SELECT MAX(time_dimension)` query
in the underlying materialization hypertable.
The problem is that a `SELECT MAX(time_dimension)` query can be
expensive because it scans all hypertable chunks, increasing the
planning time for Realtime Continuous Aggregates.
Fix this by creating a new catalog table that serves as a cache for
the current Continuous Aggregate watermark, updated in the following
situations:
- Create CAgg: store the minimum value of hypertable time dimension
data type;
- Refresh CAgg: store the last value of the time dimension materialized
in the underlying materialization hypertable (or the minimum value of
materialization hypertable time dimension data type if there's no
data materialized);
- Drop CAgg chunks: the same as Refresh CAgg.
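For reference, the watermark is read through the internal function roughly as
below; the materialization hypertable id is hypothetical and the function's
schema may differ between versions:

    -- The realtime CAgg view combines materialized data below the watermark
    -- with raw data at or above it; the watermark itself now comes from the
    -- cache table instead of a MAX() scan.
    SELECT _timescaledb_internal.cagg_watermark(2);  -- 2 = mat. hypertable id (hypothetical)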
Closes #4699, #5307
When adding new status values, we must make sure to add special
handling for these values to the downgrade script, since previous
versions will not know how to deal with them.
This patch allows unique constraints on compressed chunks. When
trying to INSERT into compressed chunks with unique constraints
any potentially conflicting compressed batches will be decompressed
to let postgres do constraint checking on the INSERT.
With this patch only INSERT ON CONFLICT DO NOTHING will be supported.
For decompression, only segmentby information is considered when
determining conflicting batches. This will be enhanced in a follow-up
patch to also include orderby metadata so that fewer batches need to
be decompressed.
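A minimal sketch of the supported pattern (table, columns, and constraint are
hypothetical):

    -- Hypothetical hypertable with a unique constraint and compression enabled.
    CREATE TABLE readings (
      time      timestamptz NOT NULL,
      device_id int         NOT NULL,
      value     float,
      UNIQUE (time, device_id)
    );
    SELECT create_hypertable('readings', 'time');
    ALTER TABLE readings SET (
      timescaledb.compress,
      timescaledb.compress_segmentby = 'device_id'
    );
    -- Potentially conflicting compressed batches are decompressed so that
    -- PostgreSQL can perform the constraint check; only DO NOTHING is
    -- supported by this patch.
    INSERT INTO readings VALUES ('2023-01-01', 1, 42.0)
      ON CONFLICT DO NOTHING;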
This patch changes INSERTs into compressed chunks to no longer
be immediately compressed but stored in the uncompressed chunk
instead and later merged with the compressed chunk by a separate
job.
This greatly simplifies the INSERT code path, as we no longer have
to rewrite the target of INSERTs and compress on the fly, leading
to a roughly 2x improvement in INSERT rate into compressed chunks.
Additionally this improves TRIGGER-support for INSERTs into
compressed chunks.
This is a necessary refactoring to allow UPSERT/UPDATE/DELETE on
compressed chunks in follow-up patches.
This patch changes the code that blocks frozen chunk
modifications to no longer use triggers but a custom node
instead. Frozen chunks are a TimescaleDB-internal concept and
should therefore not be protected by triggers, which are
external and create several hazards. First, triggers created
to protect internal state contend with user-created triggers.
Second, the trigger created to protect frozen chunks does not
work well with our restoring GUC, which we use when restoring
logical dumps. Third, triggers do not fire for internal
operations; they only work in code paths that explicitly added
trigger support.
If a datanode goes down for whatever reason then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously will only work for chunks which have
"replication_factor" > 1. Note that chunks which do not undergo any
change will continue to carry the appropriate DN-related metadata on
the AN.
This means that such "stale" chunks will become under-replicated and
will need to be re-balanced using the copy_chunk functionality, for
example by a microservice or similar process.
Fixes #4846
This function drops chunks on a specified data node if those chunks are
not known by the access node.
Call drop_stale_chunks() automatically when a data node becomes
available again.
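A hedged sketch of a manual invocation, assuming the function is exposed as
`_timescaledb_internal.drop_stale_chunks(node_name)` (the actual schema and
signature may differ):

    -- Hypothetical: drop chunks on data node 'dn1' that the access node
    -- no longer knows about.
    SELECT _timescaledb_internal.drop_stale_chunks('dn1');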
Fix #4848
The commit 9f4dcea30135d1e36d1c452d631fc8b8743b3995 introduces frozen
chunks. Checking whether a chunk is frozen or not has been done so far
in the query planner. If it is not possible to determine which chunks
are affected by a query in the planner (e.g., due to a cast in the WHERE
condition), all chunks are checked. This leads (1) to an increased
planning time and (2) to the situation that a single frozen chunk could
reject queries, even if the frozen chunk is not addressed by the query.
A chunk is in this state when it is compressed but also has
uncompressed data in the uncompressed chunk. Individual tuples
can only ever exist in either area. This is a preparation patch
to add support for an uncompressed staging area for DML operations.
This change introduces a new option to the compression procedure which
decouples the uncompressed chunk interval from the compressed chunk
interval. It does this by allowing multiple uncompressed chunks into one
compressed chunk as part of the compression procedure. The main use-case
is to allow much smaller uncompressed chunks than compressed ones. This
has several advantages:
- Reduce the size of btrees on uncompressed data (thus allowing faster
inserts because those indexes are memory-resident).
- Decrease disk-space usage for uncompressed data.
- Reduce number of chunks over historical data.
From a UX point of view, we simply add a compression WITH clause option
`compress_chunk_time_interval`. The user should set it according to
their needs for constraint exclusion over historical data. Ideally, it
should be a multiple of the uncompressed chunk interval, and we throw
a warning if it is not.
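A sketch of the new option (table name and intervals are hypothetical):

    -- Hypothetical: 1-hour uncompressed chunks are rolled up into 24-hour
    -- compressed chunks when compression runs.
    SELECT create_hypertable('metrics', 'time',
                             chunk_time_interval => INTERVAL '1 hour');
    ALTER TABLE metrics SET (
      timescaledb.compress,
      timescaledb.compress_chunk_time_interval = '24 hours'
    );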
After data is tiered using OSM, we cannot insert data into the same
range, so we need a callback that can be invoked by TimescaleDB to
check for range overlaps before creating a new chunk.
We don't have to look up the dimension slices for dimensions for which
we don't have restrictions.
Also sort chunks by ids before looking up the metadata, because this
gives more favorable table access patterns (closer to sequential).
This fixes a planning time regression introduced in 2.7.
The OSM chunk registers a dummy primary dimension range
in the TimescaleDB catalog. Use the max interval of
the dimension instead of the min interval, i.e.,
use a range like [Dec 31 294246 PST, infinity).
Otherwise, policies can end up being applied to an OSM
chunk.
Add tests with policies for OSM chunks.
Do not allocate various temporary data in PortalContext, such as the
hyperspace point corresponding to the row, or the intermediate data
required for chunk lookup.
OSM chunks manage their own ranges, and the TimescaleDB
catalog has dummy ranges for these dimensions.
So the chunk exclusion logic cannot rely on the
TimescaleDB catalog metadata to exclude an OSM chunk.
If a default privilege is configured and applied to a given Continuous
Aggregate during its creation, only the user view has the ACL properly
configured, but the underlying materialization hypertable does not,
leading to permission errors.
Fix this by copying the privileges from the user view to the
materialization hypertable during Continuous Aggregate creation.
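A minimal sketch of the scenario (role, table, and view names are
hypothetical):

    -- Hypothetical: a default privilege that should end up on both the user
    -- view and the underlying materialization hypertable.
    ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO reader;
    CREATE MATERIALIZED VIEW metrics_hourly
      WITH (timescaledb.continuous) AS
      SELECT time_bucket('1 hour', time) AS bucket, avg(value)
      FROM metrics
      GROUP BY bucket;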
Fixes #4555
When a table is added to an inheritance hierarchy, PG checks
if all check constraints are present on this table. When an OSM chunk
is added as a child of a hypertable with constraints,
make sure that all check constraints are replicated on the child OSM
chunk as well.
Add a new metadata table `dimension_partition` which explicitly and
statefully details how a space dimension is split into partitions, and
(in the case of multi-node) which data nodes are responsible for
storing chunks in each partition. Previously, partition and data nodes
were assigned dynamically based on the current state when creating a
chunk.
This is the first in a series of changes that will add more advanced
functionality over time. For now, the metadata table simply writes out
what was previously computed dynamically in code. Future code changes
will alter the behavior to do smarter updates to the partitions when,
e.g., adding and removing data nodes.
The idea of the `dimension_partition` table is to minimize changes in
the partition to data node mappings across various events, such as
changes in the number of data nodes, number of partitions, or the
replication factor, which affect the mappings. For example, increasing
the number of partitions from 3 to 4 currently leads to redefining all
partition ranges and data node mappings to account for the new
partition. Complete repartitioning can be disruptive to multi-node
deployments. With stateful mappings, it is possible to split an
existing partition without affecting the other partitions (similar to
partitioning using consistent hashing).
Note that the dimension partition table expresses the current state of
space partitions; i.e., the space-dimension constraints and data nodes
to be assigned to new chunks. Existing chunks are not affected by
changes in the dimension partition table, although an external job
could rewrite, move, or copy chunks as desired to comply with the
current dimension partition state. As such, the dimension partition
table represents the "desired" space partitioning state.
Part of #4125
This change allows creating new dimensions even when
chunks exist.
It does not modify any existing data or perform a migration;
instead, it creates a full-range (-inf/inf) dimension slice for
existing chunks in order to be compatible with the newly created
dimension.
All chunks created after this will follow the logic of the new
dimension and its partitioning.
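A sketch of the now-permitted operation (names are hypothetical):

    -- Hypothetical: add a space dimension to a hypertable that already has
    -- chunks; existing chunks get a full-range slice for the new dimension.
    SELECT add_dimension('metrics', 'device_id', number_partitions => 4);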
Fix: #2818
When chunk creation is triggered on a hypertable with non-default
statistics targets by a user different from the hypertable owner,
the chunk creation will fail with a permission error. This patch
changes the chunk table creation to run the attribute modification
as the table owner.
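A sketch of the failing scenario this fixes (table, column, and role names
are hypothetical):

    -- Hypothetical: the owner sets a non-default statistics target, then a
    -- non-owner INSERT triggers chunk creation, which previously failed with
    -- a permission error.
    ALTER TABLE metrics ALTER COLUMN value SET STATISTICS 1000;  -- as owner
    SET ROLE other_user;
    INSERT INTO metrics VALUES (now(), 1, 1.0);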
Fixes #4474
This patch introduces a further check to compress_chunk_impl and
decompress_chunk_impl. After all locks are acquired, a check is made
to see if the chunk is still (un-)compressed. If the chunk was
(de-)compressed while waiting for the locks, the (de-)compression
operation is stopped.
In addition, the chunk locks in decompress_chunk_impl
are upgraded to AccessExclusiveLock to ensure the chunk is not deleted
while other transactions are using it.
Fixes: #4480
A chunk in frozen state cannot be dropped.
drop_chunks will skip over frozen chunks without erroring.
The internal API drop_chunk will error if you attempt to drop
a chunk without unfreezing it first.
This PR also adds a new internal API to unfreeze a chunk.
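A hedged sketch of the interaction, assuming the internal entry points
`_timescaledb_internal.freeze_chunk`, `unfreeze_chunk`, and `drop_chunk`
(chunk and hypertable names are hypothetical):

    SELECT _timescaledb_internal.freeze_chunk('_timescaledb_internal._hyper_1_1_chunk');
    -- drop_chunks skips the frozen chunk without erroring.
    SELECT drop_chunks('metrics', older_than => INTERVAL '30 days');
    -- drop_chunk errors on a frozen chunk, so unfreeze it first.
    SELECT _timescaledb_internal.unfreeze_chunk('_timescaledb_internal._hyper_1_1_chunk');
    SELECT _timescaledb_internal.drop_chunk('_timescaledb_internal._hyper_1_1_chunk');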
Add _timescaledb_internal.attach_osm_table_chunk.
This treats a pre-existing foreign table as a
hypertable chunk by adding dummy metadata to the
catalog tables.
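A hedged sketch, assuming the function takes the hypertable and the foreign
table as arguments (the exact signature may differ):

    -- Hypothetical: register a pre-existing foreign table as an OSM chunk.
    SELECT _timescaledb_internal.attach_osm_table_chunk('metrics', 'osm_tiered_data');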
Don't keep the chunk constraints while searching. The number of
candidate chunks can be very large, so keeping these constraints is a
lot of work and uses a lot of memory. For finding the matching chunk,
it is enough to track the number of dimensions that matched a given
chunk id. After finding the chunk id, we can look up only the matching
chunk data with the usual function.
This saves some work when doing INSERTs.
A number of TimescaleDB functions internally call `AlterTableInternal`
to modify tables or indexes. For instance, `compress_chunk` and
`attach_tablespace` act as DDL commands to modify
hypertables. However, crashes occur when these functions are called
via `SELECT * INTO FROM <function_name>` or the equivalent `CREATE
TABLE AS` statement. The crashes happen because these statements are
considered process utility commands and therefore set up an event
trigger context for collecting commands. However, the event trigger
context is not properly set up to record ALTER TABLE statements in
this code path, thus causing the crashes.
To prevent crashes, wrap `AlterTableInternal` with the event trigger
functions to properly initialize the event trigger context.
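For illustration, a statement of the kind that previously crashed (hypertable
name is hypothetical):

    -- Previously crashed: a utility statement (CREATE TABLE AS) wrapping a
    -- function that internally calls AlterTableInternal.
    CREATE TABLE compressed AS
      SELECT compress_chunk(c) FROM show_chunks('metrics') AS c;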
Add an internal API to drop a single chunk.
This function drops the storage and metadata
associated with the chunk.
Note that chunk dependencies are not affected.
e.g. Continuous aggs are not updated when this chunk
is dropped.
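A sketch of the call (chunk name is hypothetical):

    -- Drops the storage and metadata for a single chunk; dependent objects
    -- such as continuous aggregates are not updated.
    SELECT _timescaledb_internal.drop_chunk('_timescaledb_internal._hyper_1_1_chunk');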
This is an internal function to freeze a chunk
for PG14 and later.
It sets the chunk status to frozen.
Operations that modify the chunk data
(like insert, update, delete) are not
supported. Frozen chunks can be dropped.
Additionally, chunk status is cached as part of
classify_relation.
Improve the performance of metadata scanning during hypertable
expansion.
When a hypertable is expanded to include all children chunks, only the
chunks that match the query restrictions are included. To find the
matching chunks, the planner first scans for all matching dimension
slices. The chunks that reference those slices are the chunks to
include in the expansion.
This change optimizes the scanning for slices by avoiding repeated
open/close of the dimension slice metadata table and index.
At the same time, related dimension slice scanning functions have been
refactored along the same line.
An index on the chunk constraint metadata table is also changed to
allow scanning on dimension_slice_id. Previously, dimension_slice_id
was the second key in the index, which made scans on this key less
efficient.
Chunk scan performance during querying is improved by avoiding
repeated open and close of relations and indexes when joining chunk
information from different metadata tables.
When executing a query on a hypertable, it is expanded to include all
its children chunks. However, during the expansion, the chunks that
don't match the query constraints should also be excluded. The
following changes are made to make the scanning and exclusion more
efficient:
* Ensure metadata relations and indexes are only opened once even
though metadata for multiple chunks are scanned. This avoids doing
repeated open and close of tables and indexes for each chunk
scanned.
* Avoid interleaving scans of different relations, ensuring better
data locality, and having, e.g., indexes warm in cache.
* Avoid unnecessary scans that repeat work already done.
* Ensure chunks are locked in a consistent order (based on Oid).
To enable the above changes, some refactoring was necessary. The chunk
scans that happen during constraint exclusion are moved into separate
source files (`chunk_scan.c`) for better structure and readability.
Some test outputs are affected due to the new ordering of chunks in
append relations.
When a hypertable uses a non-default tablespace, based
on attach_tablespace settings, the compressed chunk's
index is still created in the default tablespace.
This PR fixes this behavior and creates the compressed
chunk and its indexes in the same tablespace.
When move_chunk is executed on a compressed chunk, the
indexes are moved to the specified destination tablespace.
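A sketch using the tablespace APIs (tablespace, hypertable, and chunk names
are hypothetical):

    -- Compressed chunks and their indexes now follow the attached tablespace.
    SELECT attach_tablespace('history_space', 'metrics');
    -- Moving a compressed chunk also moves its indexes to the destination.
    SELECT move_chunk(
      chunk                        => '_timescaledb_internal._hyper_1_4_chunk',
      destination_tablespace       => 'history_space',
      index_destination_tablespace => 'history_space'
    );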
Fixes #4000
Chunks that are dropped but whose catalog entries are preserved
have an incorrect status when they are marked as dropped.
This happens if the chunk was previously compressed and then
gets dropped - the status in the catalog tuple reflects the
compression status. This should be reset since the data is now
dropped.
This function is called often, at least four times per chunk, so the
lookups add up. Freeing these chunks allows us to save memory. Ideally,
we should fix the function not to look up the chunk anew each time.
Also make a couple of other tweaks that reduce memory usage for planning.
Commit 97c2578ffa6b08f733a75381defefc176c91826b overcomplicated the
`invalidate_add_entry` API by adding parameters related to the remote
function call for multi-node on materialization hypertables.
Refactor it by simplifying the function interface and adding a new
function to deal with materialization hypertables in a multi-node
environment.
Fixes #3833
This patch fixes a segfault when calling show_chunks on an internal
compressed hypertable, and a cache lookup failure when calling
drop_chunks on an internal compressed hypertable.
Enable ALTER MATERIALIZED VIEW (timescaledb.compress).
This enables compression on the underlying materialization
hypertable. The segmentby and orderby columns for
compression are based on the GROUP BY clause and the time_bucket
clause used while setting up the continuous aggregate.
Additional changes:
- Change the timescaledb_information.continuous_aggregate view
definition.
- Add support for compression policies on continuous aggregates.
- Move code from job.c to policy_utils.c.
- Add support functions to check compression policy validity for
continuous aggregates.
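A sketch of the user-facing commands (view name and interval are
hypothetical):

    -- Enable compression on the CAgg's materialization hypertable.
    ALTER MATERIALIZED VIEW metrics_hourly SET (timescaledb.compress);
    -- Add a compression policy for the continuous aggregate.
    SELECT add_compression_policy('metrics_hourly',
                                  compress_after => INTERVAL '30 days');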
When executing `recompress_chunk` and a query at the same time, a
deadlock can be generated because the chunk relation, the chunk
index, and the compressed and uncompressed chunks are locked in
different orders. In particular, when `recompress_chunk` is executing,
it will first decompress the chunk and, as part of that, lock the
uncompressed chunk index in AccessExclusive mode; when trying to
compress the chunk again, it will try to lock the uncompressed chunk
in AccessExclusive mode as part of truncating it.
Note that `decompress_chunk` and `compress_chunk` lock the relations in
the same order, and the issue arises because the procedures are combined
into a single transaction.
To avoid the deadlock, this commit rewrites `recompress_chunk` as
a procedure and adds a commit between the decompression and
compression. Committing the transaction after the decompress will allow
reads and inserts to proceed by working on the uncompressed chunk, and
the compression part of the procedure will take the necessary locks in
strict order, thereby avoiding a deadlock.
In addition, the isolation test is rewritten so that instead of adding
a waitpoint in the PL/pgSQL function, we implement the isolation test
by taking a lock on the compressed table after the decompression.
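A sketch of the invocation after the change (chunk name is hypothetical):

    -- CALL is required since recompress_chunk is now a procedure that commits
    -- between the decompression and compression steps.
    CALL recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');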
Fixes #3846