Enable ALTER MATERIALIZED VIEW (timescaledb.compress)
This enables compression on the underlying materialized
hypertable. The segmentby and orderby columns for
compression are based on the GROUP BY clause and time_bucket
clause used while setting up the continuous aggregate.
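For example, compression can be enabled on a continuous aggregate like this (the view name here is illustrative):

  ALTER MATERIALIZED VIEW conditions_summary_daily
    SET (timescaledb.compress = true);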
timescaledb_information.continuous_aggregate view definition change
Add support for compression policy on continuous
aggregates
Move code from job.c to policy_utils.c
Add support functions to check compression
policy validity for continuous aggregates.
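A compression policy can then be added on the continuous aggregate itself, e.g. (view name and interval are illustrative):

  SELECT add_compression_policy('conditions_summary_daily',
         compress_after => INTERVAL '30 days');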
Commit fffd6c2350f5b3237486f3d49d7167105e72a55b fixed a problem related
to the PortalContext by using a PL/pgSQL procedure to execute the policy.
Unfortunately, this new implementation introduced a problem when the time
dimension uses INTEGER rather than BIGINT.
Fixed it by dealing correctly with the integer types: SMALLINT, INTEGER
and BIGINT.
Also refactored the policy compression procedure, replacing the two
procedures `policy_compression_{interval|integer}` with a single
`policy_compression_execute` that casts the dimension type dynamically.
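As an illustration of the integer-dimension case handled here (table, column, and function names below are hypothetical):

  CREATE TABLE events(time INTEGER NOT NULL, value DOUBLE PRECISION);
  SELECT create_hypertable('events', 'time', chunk_time_interval => 1000);

  -- an integer-based hypertable needs an integer_now function for policies
  CREATE OR REPLACE FUNCTION events_now() RETURNS INTEGER
  LANGUAGE SQL STABLE AS $$ SELECT coalesce(max(time), 0) FROM events $$;
  SELECT set_integer_now_func('events', 'events_now');

  -- compress_after is given in the dimension's own (INTEGER) units
  ALTER TABLE events SET (timescaledb.compress);
  SELECT add_compression_policy('events', compress_after => 500);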
Fixes #3773
Instead of picking 1 chunk for processing, we find the
list of chunks that have to be compressed by
the compression job, and proceed to process each one in its
own transaction. Without this, we could end up in a situation
where the first chunk is continually picked for recompression
(due to active inserts into the chunk) and we don't make any
progress.
We can limit the number of chunks processed by a single run
of the job by setting the new config parameter
maxchunks_to_compress for the compression job.
Valid values are > 0. The job processes at most
maxchunks_to_compress chunks and defers any
remaining chunks to the next scheduled run of the job.
The default is to process all pending chunks.
We have an additional job config parameter: verbose_log.
This enables additional logging of the chunks that
are processed by the job.
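A hedged sketch of tuning these settings through the job's config with alter_job (the job id, values, and exact config keys are assumptions based on the names above):

  SELECT alter_job(1002,  -- id of an existing compression policy job (illustrative)
         config => '{"hypertable_id": 1,
                     "compress_after": "7 days",
                     "maxchunks_to_compress": 4,
                     "verbose_log": true}'::jsonb);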
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but not part of this PR.
This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).
The recompress_chunk function is automatically invoked by the compression
policy job when it sees that a chunk is in the unordered state.
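For instance (hypertable and chunk names below are illustrative):

  -- list the chunks of a hypertable, then recompress one of them
  SELECT show_chunks('conditions');
  SELECT recompress_chunk('_timescaledb_internal._hyper_1_10_chunk');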
Tests are updated to no longer use continuous aggregate options that
will be removed, such as `refresh_lag`, `max_interval_per_job` and
`ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also
been replaced with `CALL refresh_continuous_aggregate()` using ranges
that try to replicate the previous refresh behavior.
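For reference, such a manual refresh over an explicit window looks like this (view name and window are illustrative):

  CALL refresh_continuous_aggregate('conditions_summary_daily',
       '2020-01-01'::timestamptz, '2020-02-01'::timestamptz);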
The materializer test (`continuous_aggregate_materialize`) has been
removed, since this tested the "old" materializer code, which is no
longer used without `REFRESH MATERIALIZED VIEW`. The new API using
`refresh_continuous_aggregate` already allows manual materialization
and there are two previously added tests (`continuous_aggs_refresh`
and `continuous_aggs_invalidate`) that cover the new refresh path in
similar ways.
When updated to use the new refresh API, some of the concurrency
tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have
slightly different concurrency behavior. This is explained by
different and sometimes more conservative locking. For instance, the
first transaction of a refresh serializes around an exclusive lock on
the invalidation threshold table, even if no new threshold is
written. The previous code took the heavier lock only once, and only if
a new threshold was written. This new, stricter locking means that
insert processes that read the invalidation threshold will block for a
short time when there are concurrent refreshes. However, since this
blocking only occurs during the first transaction of the refresh
(which is quite short), it probably doesn't matter too much in
practice. The relaxing of locks to improve concurrency and performance
can be implemented in the future.
This commit adds support for `WITH NO DATA` when creating a
continuous aggregate; the continuous aggregate is refreshed as part of
creation unless `WITH NO DATA` is provided.
All test cases are also updated to use `WITH NO DATA`, and an additional
test case is added to verify that both `WITH DATA` and `WITH NO DATA`
work as expected.
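A hedged sketch of the `WITH NO DATA` variant (hypertable, view, and column names are illustrative):

  -- created empty; refresh later, e.g. with refresh_continuous_aggregate
  CREATE MATERIALIZED VIEW conditions_summary_daily
  WITH (timescaledb.continuous) AS
  SELECT device, time_bucket('1 day', time) AS bucket, avg(temperature)
  FROM conditions
  GROUP BY device, time_bucket('1 day', time)
  WITH NO DATA;

Omitting the clause (or writing `WITH DATA` explicitly) refreshes the continuous aggregate as part of creation.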
Closes #2341
This change makes the behavior of dropping chunks on a hypertable that
has associated continuous aggregates consistent with other
mutations. In other words, any way of deleting data, irrespective of
whether this is done through a `DELETE`, `DROP TABLE <chunk>` or
`drop_chunks` command, will invalidate the region of deleted data so
that a subsequent refresh of a continuous aggregate will know that the
region is out-of-date and needs to be materialized.
Previously, only a `DELETE` would invalidate continuous aggregates,
while `DROP TABLE <chunk>` and `drop_chunks` did not. In fact, each
way to delete data had different behavior:
1. A `DELETE` would generate invalidations and the materializer would
update any aggregates to reflect the changes.
2. A `DROP TABLE <chunk>` would not generate invalidations and the
changes would therefore not be reflected in aggregates.
3. A `drop_chunks` command would not work unless
`ignore_invalidation_older_than` was set. When enabled, the
`drop_chunks` would first materialize the data to be dropped and
then never materialize that region again, unless
`ignore_invalidation_older_than` was reset. But then the continuous
aggregates would be in an undefined state since invalidations had
been ignored.
Due to the different behavior of these mutations, a continuous
aggregate could get "out-of-sync" with the underlying hypertable. This
has now been fixed.
For the time being, the previous behavior of "refresh-on-drop" (i.e.,
materializing the data on continuous aggregates before dropping it) is
retained for `drop_chunks`. However, such "refresh-on-drop" behavior
should probably be revisited in the future since it happens silently
by default without an opt-out. There are situations when such silent
refreshing might be undesirable; for instance, let's say the dropped
data had seen erroneous backfill that a user wants to ignore. Another
issue with "refresh-on-drop" is that it only happens for `drop_chunks`
and not other ways of deleting data.
Fixes #2242
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view under the hood, whereas a regular `CREATE MATERIALIZED VIEW`
creates a table. An error is raised if `CREATE VIEW` is used to create a
continuous aggregate, redirecting the user to `CREATE MATERIALIZED VIEW`.
In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.
Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.
Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
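For illustration, assuming `conditions_summary_daily` is a continuous aggregate:

  DROP VIEW conditions_summary_daily;               -- error: not allowed for continuous aggregates
  DROP MATERIALIZED VIEW conditions_summary_daily;  -- allowed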
Fixes #2233
Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
This patch adds functionality to schedule arbitrary functions
or procedures as background jobs.
New functions:
add_job(
    proc REGPROC,
    schedule_interval INTERVAL,
    config JSONB DEFAULT NULL,
    initial_start TIMESTAMPTZ DEFAULT NULL,
    scheduled BOOL DEFAULT true
)
Add a job that runs proc every schedule_interval. Proc can
be either a function or a procedure implemented in any language.
delete_job(job_id INTEGER)
Deletes the job.
run_job(job_id INTEGER)
Execute a job in the current session.
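A minimal sketch of using these functions (the procedure body, config, and job id are illustrative, and it is assumed the job procedure takes the job id and its config as arguments):

  CREATE OR REPLACE PROCEDURE user_defined_action(job_id INT, config JSONB)
  LANGUAGE PLPGSQL AS $$
  BEGIN
    RAISE NOTICE 'Executing job % with config %', job_id, config;
  END
  $$;

  -- run every hour; add_job returns the id of the new job
  SELECT add_job('user_defined_action', INTERVAL '1 hour',
         config => '{"param": 1}');

  -- execute immediately in the current session (1000 stands in for the returned id)
  CALL run_job(1000);

  -- remove the job
  SELECT delete_job(1000);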
The parameter `cascade_to_materialization` is removed from
`drop_chunks` and `add_drop_chunks_policy` as well as associated tables
and test functions.
Fixes #2137
The `drop_chunks` function is refactored to make table name mandatory
for the function. As a result, the function was also refactored to
accept the `regclass` type instead of table name plus schema name and
the parameters were reordered to match the order for `show_chunks`.
The commit also refactors the code to pass the hypertable structure
between internal functions rather than the hypertable relid, and moves
error checks to the PostgreSQL function. This allows the internal
functions to avoid some lookups and use the information in the
structure directly, and also gives errors earlier instead of first
dropping chunks and then erroring out and rolling back the transaction.
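With the refactored signature, a call now looks like this (hypertable name and interval are illustrative):

  SELECT drop_chunks('conditions', older_than => INTERVAL '3 months');
  -- parameter order now mirrors show_chunks:
  SELECT show_chunks('conditions', older_than => INTERVAL '3 months');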
The function `get_chunks_to_compress` returns chunks that are not
compressed but are dropped, meaning a lookup using
`ts_chunk_get_by_id` will fail to find the corresponding `table_id`,
which later leads to a null pointer when looking for the chunk. This
leads to a segmentation fault.
This commit fixes this by ignoring chunks that are marked as
dropped in the chunk table when scanning for chunks to compress.
This change will check SQL commands that start a background worker
on a hypertable to verify that the table owner has permission to
log into the database. This is necessary, as background workers for
these commands will run with the permissions of the table owner, and
thus immediately fail if unable to log in.
To make tests more stable and to remove some repeated code in the
tests, this PR changes the test runner to stop background workers.
Individual tests that need background workers can still start them.
This PR only stops background workers for the initial database used by
the test; the behaviour for additional databases created during the
tests will not change.
Previously we were creating multiple rows using generate_series
and now(). Depending on the time of day the test was run, this
could create one or two chunks, causing flakiness.
We changed the test to only create one row and thus one chunk.