Commit 57fde383b3dddd0b52263218e65a0135981c2d34 changed the
messaging but did not format the error hint correctly.
This patch fixes the error hint.
Fixes #5490
This patch introduces a C function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.
This improves performance for the following reasons:
- it needs to sort less data at a time and
- it avoids recreating the decompressed chunk and the heap
inserts associated with that by decompressing each segment
into a tuplesort instead.
If no segmentby is specified when enabling compression or if an
index does not exist on the compressed chunk then the operation is
performed as before, decompressing and subsequently
compressing the entire chunk.
During compression, autovacuum used to be disabled for the uncompressed
chunk and re-enabled after decompression. This leads to PostgreSQL
maintenance issues. Let's not disable autovacuum for the uncompressed
chunk anymore and let PostgreSQL take care of the stats in its natural
way.
Fixes #309
This patch adds the functionality that is needed to perform distributed,
parallel joins on reference tables on access nodes. This code allows the
pushdown of a join if:
* (1) The setting "ts_guc_enable_per_data_node_queries" is enabled
* (2) The outer relation is a distributed hypertable
* (3) The inner relation is marked as a reference table
* (4) The join is a left join or an inner join
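For example, a query of the following shape can now be pushed down to
the data nodes (the `metrics` hypertable and `devices` reference table
are hypothetical names):

  -- (1) per-data node queries must be enabled (this is the default)
  SET timescaledb.enable_per_data_node_queries = true;

  -- (2)-(4): left join of a distributed hypertable with a reference table
  SELECT m.time, m.device_id, m.value, d.name
  FROM metrics m
  LEFT JOIN devices d ON m.device_id = d.id
  WHERE m.time > now() - interval '1 day';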
If a datanode goes down for whatever reason then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously will only work for chunks which have
"replication_factor" > 1. Note that for chunks which do not have
undergo any change will continue to carry the appropriate DN related
metadata on the AN.
This means that such "stale" chunks will become underreplicated and
need to be re-balanced by using the copy_chunk functionality by a micro
service or some such process.
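A re-replication step could then look like this (chunk and node names
are hypothetical):

  CALL timescaledb_experimental.copy_chunk(
      chunk => '_timescaledb_internal._dist_hyper_1_1_chunk',
      source_node => 'data_node_2',
      destination_node => 'data_node_3');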
Fixes #4846
This function drops chunks on a specified data node if those chunks are
not known by the access node.
Call drop_stale_chunks() automatically when a data node becomes
available again.
Fix #4848
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.
The new function introduces a new option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.
To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.
When a data node is available again, the function can be used to
switch back to using the data node for queries.
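For example (the node name is hypothetical):

  -- Mark the data node as unavailable and fail over chunk reads:
  SELECT alter_data_node('data_node_1', available => false);

  -- Once the node is back, make it available for queries again:
  SELECT alter_data_node('data_node_1', available => true);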
Closes #2104
A new health check function _timescaledb_internal.health() returns the
health and status of the database instance, including any configured
data nodes (in case the instance is an access node).
Since the function also returns the health of the data nodes, it tries
hard to avoid throwing errors. An error will fail the whole function
and therefore not return any node statuses, although some of the nodes
might be healthy.
The health check on the data nodes is a recursive (remote) call to the
same function on those nodes. Unfortunately, the check will fail with
an error if a connection cannot be established to a node (or an error
occurs on the connection), which means the whole function call will
fail. This will be addressed in a future change by returning the error
in the function result instead.
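Usage sketch:

  -- On an access node, the result includes a row per configured data node.
  SELECT * FROM _timescaledb_internal.health();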
This patch adds a new time_bucket_gapfill function that
allows bucketing in a specific timezone.
You can gapfill with explicit timezone like so:
`SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') ...`
Unfortunately this introduces an ambiguity with some previous
call variations when an untyped start/finish argument was passed
to the function. Some queries might need to be adjusted and either
explicitly name the positional argument or resolve the type ambiguity
by casting to the intended type.
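For example, a previously ambiguous call can be disambiguated either by
naming the arguments or by casting them (the `metrics` table and its
columns are hypothetical):

  -- name the start/finish arguments ...
  SELECT time_bucket_gapfill('1 day', time,
                             start => '2023-01-01',
                             finish => '2023-02-01') AS day,
         avg(value)
  FROM metrics
  GROUP BY 1;

  -- ... or cast them to the intended type
  SELECT time_bucket_gapfill('1 day', time,
                             '2023-01-01'::timestamptz,
                             '2023-02-01'::timestamptz) AS day,
         avg(value)
  FROM metrics
  GROUP BY 1;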
The old patch was using old validation functions, but there are already
validation functions that both read and validate the policy, so use
those instead. Also remove the old `job_config_check` function since it
is no longer used, and instead add a `job_config_check` that calls the
checking function with the configuration.
This patch ensures that the TSL library is loaded when the
database is upgraded and post_update_cagg_try_repair is
called. There are some situations when the library is not
loaded properly (see #4573 and Support-Dev-Collab#468),
resulting in the following error message:
"[..] is not supported under the current "timescale" license
HINT: Upgrade your license to 'timescale'"
A call to `compressed_data_out` from a replication worker would
produce a misleading error saying that your license is "timescale"
and you should upgrade to "timescale" license, even if you have
already upgraded.
As a workaround, we try to load the TSL module in this function.
It will still error out in the "apache" version as intended.
We already had the same fix for `compressed_data_in` function.
At the time of adding or updating policies, it is
checked if the policies are compatible with each
other and with those already on the CAgg.
These checks are:
- refresh and compression policies should not overlap
- refresh and retention policies should not overlap
- compression and retention policies should not overlap
Co-authored-by: Markos Fountoulakis <markos@timescale.com>
- Add infinity for refresh window range
  Now, to create an open-ended refresh policy, use +/- infinity for
  end_offset and start_offset of the refresh policy, respectively.
- Add remove_all_policies function
  This will remove all the policies on a given CAgg.
- Remove parameter refresh_schedule_interval
- Fix downgrade scripts
- Fix IF EXISTS case
Co-authored-by: Markos Fountoulakis <markos@timescale.com>
This simplifies the process of adding the policies
for the CAggs. Now, with one single SQL statement,
all the policies can be added for a given CAgg.
Similarly, all the policies can be removed or modified
via a single SQL statement.
This also adds a new function as well as a view to show all
the policies on a continuous aggregate.
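A usage sketch, assuming the new functions live in the
timescaledb_experimental schema and a CAgg named `conditions_daily`
(the schema, CAgg name, and exact parameter names are assumptions):

  SELECT timescaledb_experimental.add_policies('conditions_daily',
      refresh_start_offset => '10 days'::interval,
      refresh_end_offset   => '1 day'::interval,
      compress_after       => '20 days'::interval,
      drop_after           => '30 days'::interval);

  -- ... and to drop them all again:
  SELECT timescaledb_experimental.remove_all_policies('conditions_daily');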
This PR introduces a new SQL function to associate a
hypertable or continuous agg with a custom job. If
this dependency is set up, the job is automatically
deleted when the hypertable/cagg is dropped.
The non-superuser needs to have at least REPLICATION privileges. A
new function "subscription_cmd" has been added to allow running
subscription related commands on datanodes. This function implicitly
upgrades to the bootstrapped superuser and then performs subscription
creation/alteration/deletion commands. It only accepts
subscription-related commands and errors out otherwise.
First step to remove the re-aggregation for Continuous Aggregates
is to remove the `chunk_id` from the materialization hypertable.
Also added a new metadata column named `finalized` to the
`continuous_agg` catalog table in order to store information about the
new finalized version of Continuous Aggregates that will not need the
partials anymore. This flag is important to maintain backward
compatibility with the previous Continuous Aggregate implementation
that requires the `chunk_id` to refresh data properly.
Add the missing variables to the finalization view of Continuous
Aggregates and the corresponding columns to the materialization table.
Cover the case of targets that contain Aggref nodes and Var nodes
that are outside of the Aggref nodes at the same time.
Stop rebuilding the Continuous Aggregate view with ALTER MATERIALIZED
VIEW. Attempt to repair the view at post-update time instead, and fail
gracefully if it is not possible to do so without raw hypertable schema
or data modifications.
Stop rebuilding the Continuous Aggregate view when switching realtime
aggregation on and off. Instead, manipulate the User View by either:
1. removing the UNION ALL right-hand side and the WHERE clause when
disabling realtime aggregation
2. adding the Direct View to the right of a UNION ALL operator and
defining WHERE clauses with the relevant watermark checks when
enabling realtime aggregation
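For reference, the realtime toggle itself is still done through the
existing option (the view name is hypothetical):

  -- disable realtime aggregation (query materialized data only)
  ALTER MATERIALIZED VIEW conditions_daily
      SET (timescaledb.materialized_only = true);

  -- re-enable realtime aggregation
  ALTER MATERIALIZED VIEW conditions_daily
      SET (timescaledb.materialized_only = false);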
Fixes #3898
Add option `USE_TELEMETRY` that can be used to exclude telemetry from
the compile.
Telemetry-specific SQL is moved so that it is only included when the
extension is compiled with telemetry, and the notice is changed so that
the message about telemetry is not printed when telemetry is not
compiled in.
The following code is not compiled in when telemetry is not used:
- Cross-module functions for telemetry.
- Checks for telemetry job in job execution.
- GUC variables `telemetry_level` and `telemetry_cloud`.
Telemetry subsystem is not included when compiling without telemetry,
which requires some functions to be moved out of the telemetry
subsystem:
- Metadata handling is moved out of the telemetry module since it is
used not only with telemetry.
- UUID functions are moved into a separate module instead of being
part of the telemetry subsystem.
- Telemetry functions are either added or removed when updating from a
previous version.
Tests are updated to:
- Not use telemetry functions to get UUID or metadata and instead use
  the moved UUID and metadata functions.
- Not include telemetry information in tests that do not require it.
- Not set telemetry variables in configuration files when telemetry is
  not compiled in.
- Replace usage of telemetry functions in non-telemetry tests with
  other sources of the same information.
Fixes #3931
Commit 97c2578ffa6b08f733a75381defefc176c91826b overcomplicated the
`invalidate_add_entry` API by adding parameters related to the remote
function call for multi-node on materialization hypertables.
Refactored it, simplifying the function interface and adding a new
function to deal with materialization hypertables in a multi-node
environment.
Fixes #3833
Since we are re-implementing `recompress_chunk` as a PL/SQL function,
there is no need to keep the C language version around any more, so we
remove it from the code.
Surprisingly, we're not taking care of the `max_retries` option,
leading to failed jobs being retried forever.
Fixed it by properly handling the `max_retries` option in our scheduler.
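For example, the retry limit can be set per job with alter_job (the
job id is hypothetical):

  SELECT alter_job(1000, max_retries => 3);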
Fixes #3035
This patch refactors and reworks the logic behind the
dist_ddl_preprocess() function.
The idea behind it is to simplify the process by splitting the logic
for each DDL command into a separate function and to avoid relying on
the hypertable list count to make decisions.
This change makes it easier to process more complex commands
(such as GRANT), which would require a query rewrite or execution on
different data nodes. Additionally, this makes the code easier to
follow and more similar to the main code path inside
src/process_util.c.
Add support for continuous aggregates for distributed hypertables by
allowing a continuous aggregate to read from a distributed hypertable
so that the continuous aggregate is on the access node while the
hypertable data is on the data nodes.
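A sketch of the now-supported setup on the access node (table, column,
and view names are hypothetical):

  SELECT create_distributed_hypertable('conditions', 'time');

  CREATE MATERIALIZED VIEW conditions_daily
  WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 day', time) AS bucket,
         device_id,
         avg(temperature) AS avg_temp
  FROM conditions
  GROUP BY bucket, device_id;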
For distributed hypertables, both the hypertable and continuous
aggregate invalidation log are kept on the data nodes and the refresh
window is computed at refresh time on each data node. Since the
continuous aggregate materialization hypertable is not present on the
data nodes, the invalidation log was extended to allow using a
non-local hypertable id on the data nodes. This means that you cannot
create continuous aggregates on the data nodes since those could clash
with continuous aggregates on the access node.
Some utility statements added entries to the invalidation logs
directly (truncating chunks and hypertables, as well as dropping
individual chunks), so to handle this case, internal functions were
added to allow logging invalidation on the data nodes from the access
node.
The commit also includes some fixes to memory context usage that
caused crashes for invalidation triggers, and also disables per-data
node queries during refresh since they would otherwise generate an
exception.
Fixes #3435
Co-authored-by: Mats Kindahl <mats@timescale.com>
After row triggers do not work when we insert into a compressed chunk.
This causes a problem for caggs as invalidations are not recorded.
Explicitly call the function to record invalidations when we
insert into a compressed chunk (if the hypertable has caggs
defined on it).
Fixes #3410.
This PR removes the C code that executes the compression
policy. Instead we use a PL/pgSQL procedure to execute
the policy.
PG13.4 and PG12.8 introduced some changes
that require PortalContexts while executing transactions.
The compression policy procedure compresses chunks in
multiple transactions. We have seen some issues with snapshots
and portal management in the policy code (due to the
PG13.4 code changes). SPI API has transaction-portal management
code. However, the compression policy code does not use SPI
interfaces. But it is fairly easy to just convert this into
a PL/pgSQL procedure (which calls SPI) rather than replicating
portal management code in C to manage multiple transactions in the
compression policy.
This PR also disallows decompress_chunk, compress_chunk and
recompress_chunk in read-only transaction mode.
Fixes #3656
A chunk copy/move operation is carried out in stages and it can
fail in any of them. We track the last completed stage in the
"chunk_copy_operation" catalog table. In case of failure, a
"chunk_copy_cleanup" function can be invoked to bring the chunk back
to its original state on the source data node, and all transient
objects like the replication slot, publication, subscription, empty
chunk, metadata updates, etc. are cleaned up.
Includes test case changes for failures induced at each and every stage.
To avoid confusion between "chunk copy activity" and "chunk copy
operation", this patch also consistently uses "operation" everywhere
now instead of "activity".
Remove copy_chunk_data() function and code needed to support it,
such as the 'transactional' argument.
Rework copy chunk logic using separate stages.
Introduce the copy_chunk() API function as an internal wrapper for
move_chunk().
The building blocks required for implementing end-to-end copy/move
chunk functionality have now been wrapped in a procedure.
A procedure is required because multiple transactions are needed to
carry out the activity across the access node and the involved two data
nodes.
The following steps are encapsulated in this procedure:
1) Create an empty chunk table on the destination data node
2) Copy the data from the src data node chunk to this newly created
destination node chunk. This is done via inbuilt PostgreSQL logical
replication functionality
3) Attach this chunk to the hypertable on the dst data node
4) Remove this chunk from the src data node to complete the move if
requested
A new catalog table "chunk_copy_activity" has been added to track
the progress of the above stages. A unique id gets assigned to each
activity and it is updated with the completed stages as things
progress.
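Assuming the procedure is exposed as timescaledb_experimental.move_chunk
(the chunk and node names below are hypothetical), an end-to-end move
could look like:

  CALL timescaledb_experimental.move_chunk(
      chunk => '_timescaledb_internal._dist_hyper_1_1_chunk',
      source_node => 'data_node_1',
      destination_node => 'data_node_2');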
Add internal copy_chunk_data() function which implements a way
to copy chunk data between data nodes using logical
replication.
This patch was prepared together with @nikkhils.
This function drops a chunk on a specified data node. It then removes
the metadata about the data node/chunk association on the access node.
This function is meant for internal use as part of the "move chunk"
functionality.
If only one chunk replica remains then this function refuses to drop it
to avoid data loss.
Creates a table for a chunk replica on the given data node. The table
gets the same schema and name as the chunk. The created chunk replica
table is not added into metadata on the access node or data node.
The primary goal is to use it during copy/move chunk.
Adds an internal API function to create an empty chunk table according
to the given hypertable for the given chunk table name and dimension
slices. This function creates a chunk table inheriting from the
hypertable, so it guarantees the same schema. No TimescaleDB
metadata is updated.
To be able to create the chunk table in a tablespace attached to the
hypertable, this commit allows calculating the tablespace id without
the dimension slice existing in the catalog.
If there is already a chunk that collides on dimension slices, the
function fails to create the chunk table.
The function will be used internally in multi-node to be able to
replicate a chunk from one data node to another.
Harden core APIs by adding the `const` qualifier to pointer parameters
and return values passed by reference. Adding `const` to APIs has
several benefits and potentially reduces bugs.
* Allows core APIs to be called using `const` objects.
* Callers know that objects passed by reference are not modified as a
side-effect of a function call.
* Returning `const` pointers enforces "read-only" usage of pointers to
internal objects, forcing users to copy objects when mutating them
or using explicit APIs for mutations.
* Allows compiler to apply optimizations and helps static analysis.
Note that these changes are so far only applied to core API
functions. Further work can be done to improve other parts of the
code.
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but not part of this PR.
This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).
The recompress_chunk function is automatically invoked by the
compression policy job when it sees that a chunk is in the unordered
state.
A new custom plan/executor node is added that implements distributed
INSERT using COPY in the backend (between access node and data
nodes). COPY is significantly faster than the existing method that
sets up prepared INSERT statements on each data node. With COPY,
tuples are streamed to data nodes instead of batching them in order to
"fill" a configured prepared statement. A COPY also avoids the
overhead of having to plan the statement on each data node.
Using COPY doesn't work in all situations, however. Neither ON
CONFLICT nor RETURNING clauses work since COPY lacks support for
them. Still, RETURNING is possible if one knows that the tuples aren't
going to be modified by, e.g., a trigger. When tuples aren't modified,
one can return the original tuples on the access node.
In order to implement the new custom node, some refactoring has been
performed to the distributed COPY code. The basic COPY support
functions have been moved to the connection module so that switching
in and out of COPY_IN mode is part of the core connection
handling. This allows other parts of the code to manage the connection
mode, which is necessary when, e.g., creating a remote chunk. To
create a chunk, the connection needs to be switched out of COPY_IN
mode so that regular SQL statements can be executed again.
Partial fix for #3025.
ALTER TABLE <hypertable> RENAME <column_name> TO <new_column_name>
is now supported for hypertables that have compression enabled.
Note: Column renaming is not supported for distributed hypertables.
So this will not work on distributed hypertables that have
compression enabled.
This change improves memory usage in the `COPY` code used for
distributed hypertables. The following issues have been addressed:
* `PGresult` objects were not cleared, leading to memory leaks.
* The caching of chunk connections didn't work since the lookup
compared ephemeral chunk pointers instead of chunk IDs. The effect
was that cached chunk connection state was reallocated every time
instead of being reused. This likely also caused worse performance.
To address these issues, the following changes are made:
* All `PGresult` objects are now cleared with `PQclear`.
* Lookup for chunk connections now compares chunk IDs instead of chunk
pointers.
* The per-tuple memory context is moved to the outer processing
loop to ensure that everything in the loop is allocated on the
per-tuple memory context, which is also reset at every iteration of
the loop.
* The use of memory contexts is also simplified to have only one
memory context for state that should survive across resets of the
per-tuple memory context.
Fixes #2677
Function refresh_continuous_aggregate, which takes a continuous
aggregate and a chunk, is added. It refreshes the continuous aggregate
on the given chunk if there are invalidations. The function can be
used in a transaction, e.g., together with a following drop_chunks
call. This allows users to create a user-defined action to refresh and
drop chunks. Therefore, the refresh on drop is removed from drop_chunks.
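A rough sketch of such a user-defined action (the exact signature of
refresh_continuous_aggregate and all names used are assumptions for
illustration only):

  BEGIN;
  -- refresh the continuous aggregate for the chunks about to be dropped
  SELECT refresh_continuous_aggregate('conditions_daily', chunk)
  FROM show_chunks('conditions', older_than => interval '30 days') AS chunk;
  -- then drop the raw chunks in the same transaction
  SELECT drop_chunks('conditions', older_than => interval '30 days');
  COMMIT;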