156 Commits

Konstantina Skovola
3814a3f351 Properly format license error hint
Commit 57fde383b3dddd0b52263218e65a0135981c2d34 changed the
messaging but did not format the error hint correctly.
This patch fixes the error hint.

Fixes #5490
2023-04-10 14:06:39 +03:00
Konstantina Skovola
22841abdf0 Update community license related errors
Update the error message printed when attempting to use
a community license feature with the Apache license installed.

Fixes #5438
2023-03-27 16:25:28 +03:00
Konstantina Skovola
72c0f5b25e Rewrite recompress_chunk in C for segmentwise processing
This patch introduces a C function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.

This improves performance for the following reasons:
- it needs to sort less data at a time and
- it avoids recreating the decompressed chunk and the heap
inserts associated with that by decompressing each segment
into a tuplesort instead.

If no segmentby is specified when enabling compression, or if an
index does not exist on the compressed chunk, then the operation is
performed as before, decompressing and subsequently
compressing the entire chunk.
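
For illustration, a hedged sketch of the flow this optimizes (the
hypertable, column and chunk names are hypothetical, and the exact
recompress_chunk invocation may differ between versions):

ALTER TABLE metrics SET (timescaledb.compress,
      timescaledb.compress_segmentby = 'device_id');
-- compress existing chunks
SELECT compress_chunk(c) FROM show_chunks('metrics') c;
-- after new rows land in an already compressed chunk, recompress it
-- segment by segment instead of the whole chunk at once
CALL recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');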
2023-03-23 11:39:43 +02:00
shhnwz
699fcf48aa Stats improvement for Uncompressed Chunks
During compression, autovacuum used to be disabled for the uncompressed
chunk and re-enabled after decompression. This led to a Postgres
maintenance issue. Let's not disable autovacuum for the uncompressed
chunk anymore. Let Postgres take care of the stats in its natural way.

Fixes #309
2023-03-22 23:51:13 +05:30
Jan Nidzwetzki
e0be9eaa28 Allow pushdown of reference table joins
This patch adds the functionality that is needed to perform distributed,
parallel joins on reference tables on access nodes. This code allows the
pushdown of a join if:

 * (1) The setting "ts_guc_enable_per_data_node_queries" is enabled
 * (2) The outer relation is a distributed hypertable
 * (3) The inner relation is marked as a reference table
 * (4) The join is a left join or an inner join
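
A minimal sketch of a query shaped to satisfy these conditions (table
and column names are hypothetical; the GUC spelling follows the usual
timescaledb.* naming and is an assumption here):

SET timescaledb.enable_per_data_node_queries = on;     -- (1)
SELECT m.device_id, d.location, avg(m.value)
  FROM measurements m                                   -- (2) distributed hypertable (outer)
  JOIN devices d ON d.device_id = m.device_id           -- (3) reference table, (4) inner join
 GROUP BY m.device_id, d.location;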
2023-02-23 14:32:12 +01:00
Sven Klemm
dbe89644b5 Remove no longer used compression code
The recent refactoring of INSERT into compressed chunks made this
code obsolete, but that patch forgot to remove it.
2023-01-16 14:18:56 +01:00
Nikhil Sontakke
c92e29ba3a Fix DML HA in multi-node
If a datanode goes down for whatever reason then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously will only work for chunks which have
"replication_factor" > 1. Note that for chunks which do not have
undergo any change will continue to carry the appropriate DN related
metadata on the AN.

This means that such "stale" chunks will become underreplicated and
need to be re-balanced by using the copy_chunk functionality by a micro
service or some such process.

Fixes #4846
2022-11-25 17:42:26 +05:30
Dmitry Simonenko
5813173e07 Introduce drop_stale_chunks() function
This function drops chunks on a specified data node if those chunks are
not known by the access node.

Call drop_stale_chunks() automatically when a data node becomes
available again.
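
A hypothetical manual invocation might look like the following (the
schema qualification and argument list are assumptions, not taken from
this commit):

-- drop chunks on data node 'dn1' that the access node no longer knows about
SELECT _timescaledb_internal.drop_stale_chunks('dn1');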

Fix #4848
2022-11-23 19:21:05 +02:00
gayyappan
b9ca06d6e3 Move freeze/unfreeze chunk to tsl
Move code for freeze and unfreeze chunk to tsl directory.
2022-11-17 15:28:47 -05:00
Erik Nordström
f13214891c Add function to alter data nodes
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.

The new function introduces a new option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.

To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.

When a data node is available again, the function can be used to
switch back to using the data node for queries.
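
For example, marking a data node unavailable and later restoring it
could look like this (the node name is hypothetical):

-- stop routing reads to dn3 and fail over its chunks to replicas
SELECT alter_data_node('dn3', available => false);
-- once dn3 is back, start using it again
SELECT alter_data_node('dn3', available => true);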

Closes #2104
2022-11-11 13:59:42 +01:00
Sutou Kouhei
8d1755bd78 Fix a typo in process_compressed_data_out() 2022-11-02 13:49:47 +01:00
Erik Nordström
4b05402580 Add health check function
A new health check function _timescaledb_internal.health() returns the
health and status of the database instance, including any configured
data nodes (in case the instance is an access node).

Since the function also returns the health of the data nodes, it tries
hard to avoid throwing errors. An error will fail the whole function
and therefore not return any node statuses, although some of the nodes
might be healthy.

The health check on the data nodes is a recursive (remote) call to the
same function on those nodes. Unfortunately, the check will fail with
an error if a connection cannot be established to a node (or an error
occurs on the connection), which means the whole function call will
fail. This will be addressed in a future change by returning the error
in the function result instead.
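
A usage sketch (the result columns are not spelled out here):

SELECT * FROM _timescaledb_internal.health();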
2022-10-21 10:34:16 +02:00
Sven Klemm
b34b91f18b Add timezone support to time_bucket_gapfill
This patch adds a new time_bucket_gapfill function that
allows bucketing in a specific timezone.

You can gapfill with explicit timezone like so:
`SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') ...`

Unfortunately this introduces an ambiguity with some previous
call variations when an untyped start/finish argument was passed
to the function. Some queries might need to be adjusted and either
explicitly name the positional argument or resolve the type ambiguity
by casting to the intended type.
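
For illustration (table and column names are hypothetical; the
start/finish parameter names follow the existing gapfill signature and
are assumptions here):

-- bucket in an explicit timezone
SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') AS day, avg(value)
  FROM metrics
 WHERE time >= '2022-01-01' AND time < '2022-02-01'
 GROUP BY 1;

-- resolve the new ambiguity by naming the arguments or casting them
SELECT time_bucket_gapfill('1 day', time,
                           start => '2022-01-01'::timestamptz,
                           finish => '2022-02-01'::timestamptz) AS day, avg(value)
  FROM metrics
 WHERE time >= '2022-01-01' AND time < '2022-02-01'
 GROUP BY 1;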
2022-09-07 16:37:53 +02:00
Mats Kindahl
e0f3e17575 Use new validation functions
The old patch was using old validation functions, but there are already
validation functions that both read and validate the policy, so use
those instead. Also remove the old `job_config_check` function since it
is no longer used, and instead add a `job_config_check` that calls the
checking function with the configuration.
2022-08-25 10:38:03 +03:00
Jan Nidzwetzki
0786226e43 Ensure TSL library is loaded on database upgrades
This patch ensures that the TSL library is loaded when the
database is upgraded and post_update_cagg_try_repair is
called. There are some situations when the library is not
loaded properly (see #4573 and Support-Dev-Collab#468),
resulting in the following error message:

"[..] is not supported under the current "timescale" license
HINT:  Upgrade your license to 'timescale'"
2022-08-24 06:34:12 +02:00
Dmitry Simonenko
90cace417e Load TSL library on compressed_data_out call
A call to `compressed_data_out` from a replication worker would
produce a misleading error saying that your license is "timescale"
and you should upgrade to "timescale" license, even if you have
already upgraded.

As a workaround, we try to load the TSL module in this function.
It will still error out in the "apache" version as intended.

We already had the same fix for `compressed_data_in` function.
2022-08-22 17:47:36 +03:00
Rafia Sabih
16fdb6ca5e Checks for policy validation and compatibility
At the time of adding or updating policies, it is
checked whether the policies are compatible with each
other and with those already on the CAgg.
These checks are:
- refresh and compression policies should not overlap
- refresh and retention policies should not overlap
- compression and retention policies should not overlap

Co-authored-by: Markos Fountoulakis <markos@timescale.com>
2022-08-12 00:55:18 +03:00
Rafia Sabih
088f688780 Miscellaneous
- Add infinity for refresh window range
  Now, to create an open-ended refresh policy,
  use +/- infinity for end_offset and start_offset
  of the refresh policy, respectively.
- Add remove_all_policies function
  This will remove all the policies on a given
  CAgg.
- Remove parameter refresh_schedule_interval
- Fix downgrade scripts
- Fix IF EXISTS case

Co-authored-by: Markos Fountoulakis <markos@timescale.com>
2022-08-12 00:55:18 +03:00
Rafia Sabih
bca65f4697 1 step CAgg policy management
This simplifies the process of adding policies
for CAggs. Now, with one single SQL statement
all the policies can be added for a given CAgg.
Similarly, all the policies can be removed or modified
via a single SQL statement.

This also adds a new function as well as a view to show all
the policies on a continuous aggregate.
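
A hedged sketch of what this could look like (the experimental schema,
function and argument names are assumptions, and the offsets are
hypothetical):

-- add refresh, compression and retention policies in one statement
SELECT timescaledb_experimental.add_policies('daily_metrics',
       refresh_start_offset => '1 month'::interval,
       refresh_end_offset   => '1 hour'::interval,
       compress_after       => '2 months'::interval,
       drop_after           => '1 year'::interval);
-- show all policies defined on continuous aggregates
SELECT * FROM timescaledb_experimental.policies;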
2022-08-12 00:55:18 +03:00
gayyappan
79bf4f53b1 Add api to associate a hypertable with custom jobs
This PR introduces a new SQL function to associate a
hypertable or continuous agg with a custom job. If
this dependency is set up, the job is automatically
deleted when the hypertable/cagg is dropped.
2022-06-23 13:33:33 -04:00
Sven Klemm
308ce8c47b Fix various misspellings 2022-06-13 10:53:08 +02:00
Dmitry Simonenko
f1575bb4c3 Support moving compressed chunks between data nodes
This change allows copying or moving compressed chunks
between data nodes by including the compressed chunk in the
chunk copy command stages.
2022-05-18 22:14:50 +03:00
Nikhil Sontakke
ddd02922c9 Support non-superuser move chunk operations
The non-superuser needs to have at least REPLICATION privileges. A
new function "subscription_cmd" has been added to allow running
subscription-related commands on datanodes. This function implicitly
upgrades to the bootstrapped superuser and then performs subscription
creation/alteration/deletion commands. It only accepts
subscription-related commands and errors out otherwise.
2022-05-18 16:56:31 +05:30
Fabrízio de Royes Mello
1e8d37b54e Remove chunk_id from materialization hypertable
The first step to remove the re-aggregation for Continuous Aggregates
is to remove the `chunk_id` from the materialization hypertable.

Also added a new metadata column named `finalized` to the `continuous_cagg`
catalog table in order to store information about the new
finalized version of Continuous Aggregates that will not need the
partials anymore. This flag is important to maintain backward
compatibility with the previous Continuous Aggregate implementation that
requires the `chunk_id` to refresh data properly.
2022-05-06 14:30:00 -03:00
Markos Fountoulakis
fab16f3798 Fix segfault in Continuous Aggregates
Add the missing variables to the finalization view of Continuous
Aggregates and the corresponding columns to the materialization table.
Cover the case of targets that contain Aggref nodes and Var nodes
that are outside of the Aggref nodes at the same time.

Stop rebuilding the Continuous Aggregate view with ALTER MATERIALIZED
VIEW. Attempt to repair the view at post-update time instead, and fail
gracefully if it is not possible to do so without raw hypertable schema
or data modifications.

Stop rebuilding the Continuous Aggregate view when switching realtime
aggregation on and off. Instead, manipulate the User View by either:
  1. removing the UNION ALL right-hand side and the WHERE clause when
     disabling realtime aggregation
  2. adding the Direct View to the right of a UNION ALL operator and
     defining WHERE clauses with the relevant watermark checks when
     enabling realtime aggregation
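
A hedged sketch of the resulting User View shape with realtime
aggregation enabled (all object names, the watermark call, and the id
42 are illustrative assumptions):

CREATE VIEW daily_metrics AS
SELECT * FROM daily_metrics_materialized                      -- materialized data
 WHERE bucket <  _timescaledb_internal.to_timestamp(
                   _timescaledb_internal.cagg_watermark(42))  -- watermark check
UNION ALL
SELECT * FROM daily_metrics_direct                            -- Direct View over the raw hypertable
 WHERE bucket >= _timescaledb_internal.to_timestamp(
                   _timescaledb_internal.cagg_watermark(42));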

Fixes #3898
2022-04-18 12:54:20 +03:00
Mats Kindahl
15d33f0624 Add option to compile without telemetry
Add option `USE_TELEMETRY` that can be used to exclude telemetry from
the compile.

Telemetry-specific SQL is moved so that it is only included when the
extension is compiled with telemetry, and the notice is changed so that
the message about telemetry is not printed when telemetry is not
compiled in.

The following code is not compiled in when telemetry is not used:
- Cross-module functions for telemetry.
- Checks for telemetry job in job execution.
- GUC variables `telemetry_level` and `telemetry_cloud`.

Telemetry subsystem is not included when compiling without telemetry,
which requires some functions to be moved out of the telemetry
subsystem:
- Metadata handling is moved out of the telemetry module since it is
  used not only with telemetry.
- UUID functions are moved into a separate module instead of being
  part of the telemetry subsystem.
- Telemetry functions are either added or removed when updating from a
  previous version.

Tests are updated to:
- Not use telemetry functions to get UUID or Metadata and instead use
  the moved UUID and metadata functions.
- Not include telemetry information in tests that do not require it.
- Ensure configuration files do not set telemetry variables when
  telemetry is not compiled in.
- Replace usage of telemetry functions in non-telemetry tests with
  other sources of the same information.

Fixes #3931
2022-03-03 12:21:07 +01:00
Fabrízio de Royes Mello
342f848d90 Refactor invalidation log inclusion
Commit 97c2578ffa6b08f733a75381defefc176c91826b overcomplicated the
`invalidate_add_entry` API by adding parameters related to the remote
function call for multi-node on materialization hypertables.

Refactored it, simplifying the function interface and adding a new
function to deal with materialization hypertables in a multi-node
environment.

Fixes #3833
2022-01-17 11:45:12 -03:00
Mats Kindahl
b208f5276f Remove C language recompress_chunk
Since we are re-implementing `recompress_chunk` as a PL/SQL function,
there is no need to keep the C language version around any more, so we
remove it from the code.
2021-12-10 14:15:47 +01:00
Fabrízio de Royes Mello
da8ce2e140 Properly handle max_retries option
Surprisingly, we were not taking care of the `max_retries` option,
leading to failed jobs running forever.

Fixed it by properly handling the `max_retries` option in our scheduler.

Fixes #3035
2021-11-25 09:47:54 -03:00
Dmitry Simonenko
3d11927567 Rework distributed DDL processing logic
This patch refactors and reworks the logic behind the
dist_ddl_preprocess() function.

The idea behind it is to simplify the process by splitting the logic
for each DDL command into a separate function and avoiding reliance on
the hypertable list count to make decisions.

This change makes it easier to process more complex commands
(such as GRANT), which might require a query rewrite or need to be
executed on different data nodes. Additionally, this makes the code
easier to follow and more similar to the main code path inside
src/process_util.c.
2021-10-29 16:15:58 +03:00
Markos Fountoulakis
221437e8ef Continuous aggregates for distributed hypertables
Add support for continuous aggregates for distributed hypertables by
allowing a continuous aggregate to read from a distributed hypertable
so that the continuous aggregate is on the access node while the
hypertable data is on the data nodes.

For distributed hypertables, both the hypertable and continuous
aggregate invalidation log are kept on the data nodes and the refresh
window is computed at refresh time on each data node. Since the
continuous aggregate materialization hypertable is not present on the
data nodes, the invalidation log was extended to allow using a
non-local hypertable id on the data nodes. This means that you cannot
create continuous aggregates on the data nodes since those could clash
with continuous aggregates on the access node.

Some utility statements added entries to the invalidation logs
directly (truncating chunks and hypertables, as well as dropping
individual chunks), so to handle this case, internal functions were
added to allow logging invalidation on the data nodes from the access
node.

The commit also includes some fixes to memory context usage that
caused crashes for invalidation triggers, and also disables per data
node queries during refresh since they would otherwise generate an
exception.

Fixes #3435

Co-authored-by: Mats Kindahl <mats@timescale.com>
2021-10-25 18:20:11 +03:00
gayyappan
b0886c1b6d Support cagg invalidation trigger for inserts into compressed chunks
After-row triggers do not work when we insert into a compressed chunk.
This causes a problem for caggs as invalidations are not recorded.
Explicitly call the function to record invalidations when we
insert into a compressed chunk (if the hypertable has caggs
defined on it).

Fixes #3410.
2021-10-21 11:44:11 -04:00
gayyappan
fffd6c2350 Use plpgsql procedure for executing compression policy
This PR removes the C code that executes the compression
policy. Instead we use a PL/pgSQL procedure to execute
the policy.

PG13.4 and PG12.8 introduced some changes
that require PortalContexts while executing transactions.
The compression policy procedure compresses chunks in
multiple transactions. We have seen some issues with snapshots
and portal management in the policy code (due to the
PG13.4 code changes). SPI API has transaction-portal management
code. However, the compression policy code does not use SPI
interfaces. But it is fairly easy to just convert this into
a PL/pgSQL procedure (which calls SPI) rather than replicating
portal management code in C to manage multiple txns in the
compression policy.

This PR also disallows decompress_chunk, compress_chunk and
recompress_chunk in txn read only mode.

Fixes #3656
2021-10-13 09:11:59 -04:00
Nikhil
2ffa1bf436 Implement cleanup for chunk copy/move
A chunk copy/move operation is carried out in stages and it can
fail in any of them. We track the last completed stage in the
"chunk_copy_operation" catalog table. In case of failure, a
"chunk_copy_cleanup" function can be invoked to bring the chunk back
to its original state on the source datanode; all transient objects
like the replication slot, publication, subscription, empty chunk,
metadata updates, etc. are cleaned up.

Includes test case changes for each and every stage induced failure.

To avoid confusion between chunk copy activity and chunk copy operation,
this patch also consistently uses "operation" everywhere now instead of
"activity".
2021-07-29 16:53:12 +03:00
Dmitry Simonenko
38c1781748 Copy/move chunk refactoring
Remove copy_chunk_data() function and code needed to support it,
such as the 'transactional' argument.

Rework copy chunk logic using separate stages.

Introduce the copy_chunk() API function as an internal wrapper for
move_chunk().
2021-07-29 16:53:12 +03:00
Nikhil
f6b0250557 Implement wrapper API for copy/move chunk
The building blocks required for implementing end-to-end copy/move
chunk functionality have now been wrapped in a procedure.

A procedure is required because multiple transactions are needed to
carry out the activity across the access node and the involved two data
nodes.

The following steps are encapsulated in this procedure

1) Create an empty chunk table on the destination data node

2) Copy the data from the src data node chunk to this newly created
destination node chunk. This is done via inbuilt PostgreSQL logical
replication functionality

3) Attach this chunk to the hypertable on the dst data node

4) Remove this chunk from the src data node to complete the move if
requested

A new catalog table "chunk_copy_activity" has been added to track
the progress of the above stages. A unique id gets assigned to each
activity and it is updated with the completed stages as things
progress.
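
A hedged usage sketch of the resulting procedure (the experimental
schema, argument names, and the chunk/node names are assumptions):

CALL timescaledb_experimental.move_chunk(
     chunk => '_timescaledb_internal._dist_hyper_1_1_chunk',
     source_node => 'data_node_1',
     destination_node => 'data_node_2');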
2021-07-29 16:53:12 +03:00
Dmitry Simonenko
2c66c1fd64 Introduce function to copy chunk data between data nodes
Add internal copy_chunk_data() function which implements a way
to copy chunk data between data nodes using logical
replication.

This patch was prepared together with @nikkhils.
2021-07-29 16:53:12 +03:00
Nikhil
762053431e Implement drop_chunk_replica API
This function drops a chunk on a specified data node. It then removes
the metadata about the data node/chunk association on the access node.

This function is meant for internal use as part of the "move chunk"
functionality.

If only one chunk replica remains then this function refuses to drop it
to avoid data loss.
2021-07-29 16:53:12 +03:00
Ruslan Fomkin
404f1cdbad Create chunk table from access node
Creates a table for a chunk replica on the given data node. The table
gets the same schema and name as the chunk. The created chunk replica
table is not added to the metadata on the access node or data node.

The primary goal is to use it during copy/move chunk.
2021-07-29 16:53:12 +03:00
Ruslan Fomkin
28ccecbe7c Create an empty chunk table
Adds an internal API function to create an empty chunk table according
to the given hypertable for the given chunk table name and dimension
slices. This function creates a chunk table inheriting from the
hypertable, so it guarantees the same schema. No TimescaleDB
metadata is updated.

To be able to create the chunk table in a tablespace attached to the
hypertable, this commit allows calculating the tablespace id without
the dimension slice existing in the catalog.

If there is already a chunk, which collides on dimension slices, the
function fails to create the chunk table.

The function will be used internally in multi-node to be able to
replicate a chunk from one data node to another.
2021-07-29 16:53:12 +03:00
Erik Nordström
98110af75b Constify parameters and return values of core APIs
Harden core APIs by adding the `const` qualifier to pointer parameters
and return values passed by reference. Adding `const` to APIs has
several benefits and potentially reduces bugs.

* Allows core APIs to be called using `const` objects.
* Callers know that objects passed by reference are not modified as a
  side-effect of a function call.
* Returning `const` pointers enforces "read-only" usage of pointers to
  internal objects, forcing users to copy objects when mutating them
  or using explicit APIs for mutations.
* Allows compiler to apply optimizations and helps static analysis.

Note that these changes are so far only applied to core API
functions. Further work can be done to improve other parts of the
code.
2021-06-14 22:09:10 +02:00
Sven Klemm
fe872cb684 Add policy_recompression procedure
This patch adds a recompress procedure that may be used as a custom
job when compression and recompression should run as separate
background jobs.
2021-05-24 18:03:47 -04:00
gayyappan
4f865f7870 Add recompress_chunk function
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but not part of this PR.

This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).

The recompress_chunk function is automatically invoked by the compression
policy job when it sees that a chunk is in an unordered state.
2021-05-24 18:03:47 -04:00
gayyappan
93be235d33 Support for inserts into compressed hypertables
Add CompressRowSingleState.
This has functions to compress a single row.
2021-05-24 18:03:47 -04:00
Erik Nordström
f6967b349f Use COPY when executing distributed INSERTs
A new custom plan/executor node is added that implements distributed
INSERT using COPY in the backend (between access node and data
nodes). COPY is significantly faster than the existing method that
sets up prepared INSERT statements on each data node. With COPY,
tuples are streamed to data nodes instead of batching them in order to
"fill" a configured prepared statement. A COPY also avoids the
overhead of having to plan the statement on each data node.

Using COPY doesn't work in all situations, however. Neither ON
CONFLICT nor RETURNING clauses work since COPY lacks support for
them. Still, RETURNING is possible if one knows that the tuples aren't
going to be modified by, e.g., a trigger. When tuples aren't modified,
one can return the original tuples on the access node.

In order to implement the new custom node, some refactoring has been
performed to the distributed COPY code. The basic COPY support
functions have been moved to the connection module so that switching
in and out of COPY_IN mode is part of the core connection
handling. This allows other parts of the code to manage the connection
mode, which is necessary when, e.g., creating a remote chunk. To
create a chunk, the connection needs to be switched out of COPY_IN
mode so that regular SQL statements can be executed again.

Partial fix for #3025.
2021-05-12 16:14:28 +02:00
Dmitry Simonenko
6a1c81b63e Add distributed restore point functionality
This change adds the create_distributed_restore_point() function,
which allows creating a recovery restore point across data
nodes.
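
For example (the restore point name is arbitrary):

SELECT * FROM create_distributed_restore_point('my_restore_point');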

Fix #2846
2021-02-25 15:39:50 +03:00
gayyappan
5be6a3e4e9 Support column rename for hypertables with compression enabled
ALTER TABLE <hypertable> RENAME <column_name> TO <new_column_name>
is now supported for hypertables that have compression enabled.

Note: Column renaming is not supported for distributed hypertables.
So this will not work on distributed hypertables that have
compression enabled.
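
For example (hypertable and column names are hypothetical):

ALTER TABLE conditions RENAME COLUMN temp TO temperature;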
2021-02-19 10:21:50 -05:00
gayyappan
f649736f2f Support ADD COLUMN for compressed hypertables
Support ALTER TABLE .. ADD COLUMN <colname> <typname>
for hypertables with compressed chunks.
2021-01-14 09:32:50 -05:00
Erik Nordström
2ecb53e7bb Improve memory handling for remote COPY
This change improves memory usage in the `COPY` code used for
distributed hypertables. The following issues have been addressed:

* `PGresult` objects were not cleared, leading to memory leaks.
* The caching of chunk connections didn't work since the lookup
  compared ephemeral chunk pointers instead of chunk IDs. The effect
  was that cached chunk connection state was reallocated every time
  instead of being reused. This likely also caused worse performance.

To address these issues, the following changes are made:

* All `PGresult` objects are now cleared with `PQclear`.
* Lookup for chunk connections now compares chunk IDs instead of chunk
  pointers.
* The per-tuple memory context is moved to the outer processing
  loop to ensure that everything in the loop is allocated on the
  per-tuple memory context, which is also reset at every iteration of
  the loop.
* The use of memory contexts is also simplified to have only one
  memory context for state that should survive across resets of the
  per-tuple memory context.

Fixes #2677
2020-12-02 17:40:44 +01:00
Ruslan Fomkin
791b0a4db7 Add API to refresh continuous aggregate on chunk
Function refresh_continuous_aggregate, which takes a continuous
aggregate and a chunk, is added. It refreshes the continuous aggregate
on the given chunk if there are invalidations. The function can be
used in a transaction, e.g., together with a following drop_chunks. This
allows users to create a user-defined action to refresh and drop
chunks. Therefore, the refresh-on-drop behavior is removed from drop_chunks.
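
A hedged sketch of such a user-defined action (the schema-qualified
call and its signature are assumptions; the hypertable, cagg and
interval are hypothetical):

BEGIN;
-- refresh the continuous aggregate on each chunk that is about to be dropped
SELECT _timescaledb_internal.refresh_continuous_aggregate('daily_metrics', c)
  FROM show_chunks('metrics', older_than => INTERVAL '3 months') c;
-- then drop those chunks in the same transaction
SELECT drop_chunks('metrics', older_than => INTERVAL '3 months');
COMMIT;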
2020-11-12 08:33:35 +01:00