Commit 57fde383b3dddd0b52263218e65a0135981c2d34 changed the
messaging but did not format the error hint correctly.
This patch fixes the error hint.
Fixes #5490
This patch introduces a C function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.
This improves performance for the following reasons:
- it needs to sort less data at a time and
- it avoids recreating the decompressed chunk and the heap
inserts associated with that by decompressing each segment
into a tuplesort instead.
If no segmentby is specified when enabling compression or if an
index does not exist on the compressed chunk then the operation is
performed as before, decompressing and subsequently
compressing the entire chunk.
During compression, autovacuum used to be disabled for the uncompressed
chunk and re-enabled after decompression. This leads to PostgreSQL
maintenance issues. Let's not disable autovacuum for the uncompressed
chunk anymore and let PostgreSQL take care of the stats in its natural
way.
Fixes #309
This patch adds the functionality that is needed to perform distributed,
parallel joins on reference tables on access nodes. This code allows the
pushdown of a join if:
* (1) The setting "ts_guc_enable_per_data_node_queries" is enabled
* (2) The outer relation is a distributed hypertable
* (3) The inner relation is marked as a reference table
* (4) The join is a left join or an inner join
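For example, a query of the following shape can now be pushed down to
the data nodes (the `metrics` hypertable and `devices` reference table
are hypothetical names):

  -- (1) per-data node queries must be enabled (this is the default)
  SET timescaledb.enable_per_data_node_queries = true;

  -- (2)-(4): left join of a distributed hypertable with a reference table
  SELECT m.time, m.device_id, m.value, d.name
  FROM metrics m
  LEFT JOIN devices d ON m.device_id = d.id
  WHERE m.time > now() - interval '1 day';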
If a datanode goes down for whatever reason then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously will only work for chunks which have
"replication_factor" > 1. Note that for chunks which do not have
undergo any change will continue to carry the appropriate DN related
metadata on the AN.
This means that such "stale" chunks will become underreplicated and
need to be re-balanced by using the copy_chunk functionality by a micro
service or some such process.
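A re-replication step could then look like this (chunk and node names
are hypothetical):

  CALL timescaledb_experimental.copy_chunk(
      chunk => '_timescaledb_internal._dist_hyper_1_1_chunk',
      source_node => 'data_node_2',
      destination_node => 'data_node_3');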
Fixes #4846
This function drops chunks on a specified data node if those chunks are
not known by the access node.
Call drop_stale_chunks() automatically when a data node becomes
available again.
Fix #4848
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.
The new function introduces a new option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.
To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.
When a data node is available again, the function can be used to
switch back to using the data node for queries.
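For example (the node name is hypothetical):

  -- Mark the data node as unavailable and fail over chunk reads:
  SELECT alter_data_node('data_node_1', available => false);

  -- Once the node is back, make it available for queries again:
  SELECT alter_data_node('data_node_1', available => true);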
Closes #2104
A new health check function _timescaledb_internal.health() returns the
health and status of the database instance, including any configured
data nodes (in case the instance is an access node).
Since the function also returns the health of the data nodes, it tries
hard to avoid throwing errors. An error will fail the whole function
and therefore not return any node statuses, although some of the nodes
might be healthy.
The health check on the data nodes is a recursive (remote) call to the
same function on those nodes. Unfortunately, the check will fail with
an error if a connection cannot be established to a node (or an error
occurs on the connection), which means the whole function call will
fail. This will be addressed in a future change by returning the error
in the function result instead.
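Usage sketch:

  -- On an access node, the result includes a row per configured data node.
  SELECT * FROM _timescaledb_internal.health();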
This patch adds a new time_bucket_gapfill function that
allows bucketing in a specific timezone.
You can gapfill with explicit timezone like so:
`SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') ...`
Unfortunately this introduces an ambiguity with some previous
call variations when an untyped start/finish argument was passed
to the function. Some queries might need to be adjusted and either
explicitly name the positional argument or resolve the type ambiguity
by casting to the intended type.
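For example, a previously ambiguous call can be disambiguated either by
naming the arguments or by casting them (the `metrics` table and its
columns are hypothetical):

  -- name the start/finish arguments ...
  SELECT time_bucket_gapfill('1 day', time,
                             start => '2023-01-01',
                             finish => '2023-02-01') AS day,
         avg(value)
  FROM metrics
  GROUP BY 1;

  -- ... or cast them to the intended type
  SELECT time_bucket_gapfill('1 day', time,
                             '2023-01-01'::timestamptz,
                             '2023-02-01'::timestamptz) AS day,
         avg(value)
  FROM metrics
  GROUP BY 1;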
The old patch was using old validation functions, but there are already
validation functions that both read and validate the policy, so use
those instead. Also remove the old `job_config_check` function since it
is no longer used, and instead add a `job_config_check` that calls the
checking function with the configuration.
This patch ensures that the TSL library is loaded when the
database is upgraded and post_update_cagg_try_repair is
called. There are some situations when the library is not
loaded properly (see #4573 and Support-Dev-Collab#468),
resulting in the following error message:
"[..] is not supported under the current "timescale" license
HINT: Upgrade your license to 'timescale'"
A call to `compressed_data_out` from a replication worker would
produce a misleading error saying that your license is "timescale"
and you should upgrade to "timescale" license, even if you have
already upgraded.
As a workaround, we try to load the TSL module in this function.
It will still error out in the "apache" version as intended.
We already had the same fix for `compressed_data_in` function.
At the time of adding or updating policies, it is
checked if the policies are compatible with each
other and with those already on the CAgg.
These checks are:
- refresh and compression policies should not overlap
- refresh and retention policies should not overlap
- compression and retention policies should not overlap
Co-authored-by: Markos Fountoulakis <markos@timescale.com>
- Add infinity for refresh window range
  Now, to create an open-ended refresh policy, use +/- infinity for
  end_offset and start_offset of the refresh policy, respectively.
- Add remove_all_policies function
  This will remove all the policies on a given CAgg.
- Remove parameter refresh_schedule_interval
- Fix downgrade scripts
- Fix IF EXISTS case
Co-authored-by: Markos Fountoulakis <markos@timescale.com>
This simplifies the process of adding the policies
for the CAggs. Now, with one single SQL statement,
all the policies can be added for a given CAgg.
Similarly, all the policies can be removed or modified
via a single SQL statement.
This also adds a new function as well as a view to show all
the policies on a continuous aggregate.
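A usage sketch, assuming the new functions live in the
timescaledb_experimental schema and a CAgg named `conditions_daily`
(the schema, CAgg name, and exact parameter names are assumptions):

  SELECT timescaledb_experimental.add_policies('conditions_daily',
      refresh_start_offset => '10 days'::interval,
      refresh_end_offset   => '1 day'::interval,
      compress_after       => '20 days'::interval,
      drop_after           => '30 days'::interval);

  -- ... and to drop them all again:
  SELECT timescaledb_experimental.remove_all_policies('conditions_daily');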
This PR introduces a new SQL function to associate a
hypertable or continuous agg with a custom job. If
this dependency is set up, the job is automatically
deleted when the hypertable/cagg is dropped.
The non-superuser needs to have at least REPLICATION privileges. A
new function "subscription_cmd" has been added to allow running
subscription related commands on datanodes. This function implicitly
upgrades to the bootstrapped superuser and then performs subscription
creation/alteration/deletion commands. It only accepts
subscription-related commands and errors out otherwise.
First step to remove the re-aggregation for Continuous Aggregates
is to remove the `chunk_id` from the materialization hypertable.
Also added a new metadata column named `finalized` to the
`continuous_agg` catalog table in order to store information about the
new finalized version of Continuous Aggregates that will not need the
partials anymore. This flag is important to maintain backward
compatibility with the previous Continuous Aggregate implementation
that requires the `chunk_id` to refresh data properly.
Add the missing variables to the finalization view of Continuous
Aggregates and the corresponding columns to the materialization table.
Cover the case of targets that contain Aggref nodes and Var nodes
that are outside of the Aggref nodes at the same time.
Stop rebuilding the Continuous Aggregate view with ALTER MATERIALIZED
VIEW. Attempt to repair the view at post-update time instead, and fail
gracefully if it is not possible to do so without raw hypertable schema
or data modifications.
Stop rebuilding the Continuous Aggregate view when switching realtime
aggregation on and off. Instead, manipulate the User View by either:
1. removing the UNION ALL right-hand side and the WHERE clause when
disabling realtime aggregation
2. adding the Direct View to the right of a UNION ALL operator and
defining WHERE clauses with the relevant watermark checks when
enabling realtime aggregation
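For reference, the realtime toggle itself is still done through the
existing option (the view name is hypothetical):

  -- disable realtime aggregation (query materialized data only)
  ALTER MATERIALIZED VIEW conditions_daily
      SET (timescaledb.materialized_only = true);

  -- re-enable realtime aggregation
  ALTER MATERIALIZED VIEW conditions_daily
      SET (timescaledb.materialized_only = false);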
Fixes #3898
Add option `USE_TELEMETRY` that can be used to exclude telemetry from
the compile.
Telemetry-specific SQL is moved so that it is only included when the
extension is compiled with telemetry, and the notice is changed so that
the message about telemetry is not printed when telemetry is not
compiled in.
The following code is not compiled in when telemetry is not used:
- Cross-module functions for telemetry.
- Checks for telemetry job in job execution.
- GUC variables `telemetry_level` and `telemetry_cloud`.
Telemetry subsystem is not included when compiling without telemetry,
which requires some functions to be moved out of the telemetry
subsystem:
- Metadata handling is moved out of the telemetry module since it is
used not only with telemetry.
- UUID functions are moved into a separate module instead of being
part of the telemetry subsystem.
- Telemetry functions are either added or removed when updating from a
previous version.
Tests are updated to:
- Not use telemetry functions to get UUID or metadata and instead use
  the moved UUID and metadata functions.
- Not include telemetry information in tests that do not require it.
- Not set telemetry variables in configuration files when telemetry is
  not compiled in.
- Replace usage of telemetry functions in non-telemetry tests with
  other sources of the same information.
Fixes #3931
Commit 97c2578ffa6b08f733a75381defefc176c91826b overcomplicated the
`invalidate_add_entry` API by adding parameters related to the remote
function call for multi-node on materialization hypertables.
Refactored it, simplifying the function interface and adding a new
function to deal with materialization hypertables in a multi-node
environment.
Fixes #3833
Since we are re-implementing `recompress_chunk` as a PL/SQL function,
there is no need to keep the C language version around any more, so we
remove it from the code.
Surprisingly, we're not taking care of the `max_retries` option,
leading to failed jobs being retried forever.
Fixed it by properly handling the `max_retries` option in our scheduler.
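For example, the retry limit can be set per job with alter_job (the
job id is hypothetical):

  SELECT alter_job(1000, max_retries => 3);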
Fixes #3035
This patch refactors and reworks the logic behind the
dist_ddl_preprocess() function.
The idea behind it is to simplify the process by splitting the logic
for each DDL command into a separate function and to avoid relying on
the hypertable list count to make decisions.
This change makes it easier to process more complex commands
(such as GRANT), which would require a query rewrite or execution on
different data nodes. Additionally, this makes the code easier to
follow and more similar to the main code path inside
src/process_util.c.
Add support for continuous aggregates for distributed hypertables by
allowing a continuous aggregate to read from a distributed hypertable
so that the continuous aggregate is on the access node while the
hypertable data is on the data nodes.
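A sketch of the now-supported setup on the access node (table, column,
and view names are hypothetical):

  SELECT create_distributed_hypertable('conditions', 'time');

  CREATE MATERIALIZED VIEW conditions_daily
  WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 day', time) AS bucket,
         device_id,
         avg(temperature) AS avg_temp
  FROM conditions
  GROUP BY bucket, device_id;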
For distributed hypertables, both the hypertable and continuous
aggregate invalidation log are kept on the data nodes and the refresh
window is computed at refresh time on each data node. Since the
continuous aggregate materialization hypertable is not present on the
data nodes, the invalidation log was extended to allow using a
non-local hypertable id on the data nodes. This means that you cannot
create continuous aggregates on the data nodes since those could clash
with continuous aggregates on the access node.
Some utility statements added entries to the invalidation logs
directly (truncating chunks and hypertables, as well as dropping
individual chunks), so to handle this case, internal functions were
added to allow logging invalidation on the data nodes from the access
node.
The commit also includes some fixes to memory context usage that
caused crashes for invalidation triggers, and also disables per-data
node queries during refresh since they would otherwise generate an
exception.
Fixes #3435
Co-authored-by: Mats Kindahl <mats@timescale.com>
After row triggers do not work when we insert into a compressed chunk.
This causes a problem for caggs as invalidations are not recorded.
Explicitly call the function to record invalidations when we
insert into a compressed chunk (if the hypertable has caggs
defined on it).
Fixes #3410.
This PR removes the C code that executes the compression
policy. Instead we use a PL/pgSQL procedure to execute
the policy.
PG13.4 and PG12.8 introduced some changes
that require PortalContexts while executing transactions.
The compression policy procedure compresses chunks in
multiple transactions. We have seen some issues with snapshots
and portal management in the policy code (due to the
PG13.4 code changes). SPI API has transaction-portal management
code. However, the compression policy code does not use SPI
interfaces. But it is fairly easy to just convert this into
a PL/pgSQL procedure (which calls SPI) rather than replicating
portal management code in C to manage multiple transactions in the
compression policy.
This PR also disallows decompress_chunk, compress_chunk and
recompress_chunk in read-only transaction mode.
Fixes #3656
A chunk copy/move operation is carried out in stages and it can
fail in any of them. We track the last completed stage in the
"chunk_copy_operation" catalog table. In case of failure, a
"chunk_copy_cleanup" function can be invoked to bring the chunk back
to its original state on the source data node, and all transient
objects like the replication slot, publication, subscription, empty
chunk, metadata updates, etc. are cleaned up.
Includes test case changes for failures induced at each and every stage.
To avoid confusion between "chunk copy activity" and "chunk copy
operation", this patch also consistently uses "operation" everywhere
now instead of "activity".
Remove copy_chunk_data() function and code needed to support it,
such as the 'transactional' argument.
Rework copy chunk logic using separate stages.
Introduce the copy_chunk() API function as an internal wrapper for
move_chunk().
The building blocks required for implementing end-to-end copy/move
chunk functionality have now been wrapped in a procedure.
A procedure is required because multiple transactions are needed to
carry out the activity across the access node and the involved two data
nodes.
The following steps are encapsulated in this procedure:
1) Create an empty chunk table on the destination data node
2) Copy the data from the src data node chunk to this newly created
destination node chunk. This is done via inbuilt PostgreSQL logical
replication functionality
3) Attach this chunk to the hypertable on the dst data node
4) Remove this chunk from the src data node to complete the move if
requested
A new catalog table "chunk_copy_activity" has been added to track
the progress of the above stages. A unique id gets assigned to each
activity and it is updated with the completed stages as things
progress.
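Assuming the procedure is exposed as timescaledb_experimental.move_chunk
(the chunk and node names below are hypothetical), an end-to-end move
could look like:

  CALL timescaledb_experimental.move_chunk(
      chunk => '_timescaledb_internal._dist_hyper_1_1_chunk',
      source_node => 'data_node_1',
      destination_node => 'data_node_2');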
Add internal copy_chunk_data() function which implements a way
to copy chunk data between data nodes using logical
replication.
This patch was prepared together with @nikkhils.
This function drops a chunk on a specified data node. It then removes
the metadata about the data node/chunk association on the access node.
This function is meant for internal use as part of the "move chunk"
functionality.
If only one chunk replica remains then this function refuses to drop it
to avoid data loss.
Creates a table for a chunk replica on the given data node. The table
gets the same schema and name as the chunk. The created chunk replica
table is not added into metadata on the access node or data node.
The primary goal is to use it during copy/move chunk.
Adds an internal API function to create an empty chunk table according
to the given hypertable for the given chunk table name and dimension
slices. This function creates a chunk table inheriting from the
hypertable, so it guarantees the same schema. No TimescaleDB
metadata is updated.
To be able to create the chunk table in a tablespace attached to the
hypertable, this commit allows calculating the tablespace id without
the dimension slice existing in the catalog.
If there is already a chunk that collides on dimension slices, the
function fails to create the chunk table.
The function will be used internally in multi-node to be able to
replicate a chunk from one data node to another.
Harden core APIs by adding the `const` qualifier to pointer parameters
and return values passed by reference. Adding `const` to APIs has
several benefits and potentially reduces bugs.
* Allows core APIs to be called using `const` objects.
* Callers know that objects passed by reference are not modified as a
side-effect of a function call.
* Returning `const` pointers enforces "read-only" usage of pointers to
internal objects, forcing users to copy objects when mutating them
or using explicit APIs for mutations.
* Allows compiler to apply optimizations and helps static analysis.
Note that these changes are so far only applied to core API
functions. Further work can be done to improve other parts of the
code.
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but not part of this PR.
This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).
The recompress_chunk function is automatically invoked by the
compression policy job when it sees that a chunk is in the unordered
state.
A new custom plan/executor node is added that implements distributed
INSERT using COPY in the backend (between access node and data
nodes). COPY is significantly faster than the existing method that
sets up prepared INSERT statements on each data node. With COPY,
tuples are streamed to data nodes instead of batching them in order to
"fill" a configured prepared statement. A COPY also avoids the
overhead of having to plan the statement on each data node.
Using COPY doesn't work in all situations, however. Neither ON
CONFLICT nor RETURNING clauses work since COPY lacks support for
them. Still, RETURNING is possible if one knows that the tuples aren't
going to be modified by, e.g., a trigger. When tuples aren't modified,
one can return the original tuples on the access node.
In order to implement the new custom node, some refactoring has been
performed to the distributed COPY code. The basic COPY support
functions have been moved to the connection module so that switching
in and out of COPY_IN mode is part of the core connection
handling. This allows other parts of the code to manage the connection
mode, which is necessary when, e.g., creating a remote chunk. To
create a chunk, the connection needs to be switched out of COPY_IN
mode so that regular SQL statements can be executed again.
Partial fix for #3025.
ALTER TABLE <hypertable> RENAME <column_name> TO <new_column_name>
is now supported for hypertables that have compression enabled.
Note: Column renaming is not supported for distributed hypertables.
So this will not work on distributed hypertables that have
compression enabled.
This change improves memory usage in the `COPY` code used for
distributed hypertables. The following issues have been addressed:
* `PGresult` objects were not cleared, leading to memory leaks.
* The caching of chunk connections didn't work since the lookup
compared ephemeral chunk pointers instead of chunk IDs. The effect
was that cached chunk connection state was reallocated every time
instead of being reused. This likely also caused worse performance.
To address these issues, the following changes are made:
* All `PGresult` objects are now cleared with `PQclear`.
* Lookup for chunk connections now compares chunk IDs instead of chunk
pointers.
* The per-tuple memory context is moved to the outer processing
loop to ensure that everything in the loop is allocated on the
per-tuple memory context, which is also reset at every iteration of
the loop.
* The use of memory contexts is also simplified to have only one
memory context for state that should survive across resets of the
per-tuple memory context.
Fixes #2677
Function refresh_continuous_aggregate, which takes a continuous
aggregate and a chunk, is added. It refreshes the continuous aggregate
on the given chunk if there are invalidations. The function can be
used in a transaction, e.g., together with a following drop_chunks
call. This allows users to create a user-defined action to refresh and
drop chunks. Therefore, the refresh on drop is removed from drop_chunks.
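A rough sketch of such a user-defined action (the exact signature of
refresh_continuous_aggregate and all names used are assumptions for
illustration only):

  BEGIN;
  -- refresh the continuous aggregate for the chunks about to be dropped
  SELECT refresh_continuous_aggregate('conditions_daily', chunk)
  FROM show_chunks('conditions', older_than => interval '30 days') AS chunk;
  -- then drop the raw chunks in the same transaction
  SELECT drop_chunks('conditions', older_than => interval '30 days');
  COMMIT;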