timescaledb

mirror of https://github.com/timescale/timescaledb.git synced 2025-05-28 01:30:29 +08:00

Author	SHA1	Message	Date
Alexander Kuzmenkov	706a3c0e50	Enable statement logging in the tests Remove 'client_min_messages = LOG' where not needed, and add the 'LOG: statement' output otherwise.	2022-08-25 15:29:28 +03:00
Fabrízio de Royes Mello	28440b7900	Enable ORDER BY on Continuous Aggregates Users often execute TopN-like queries over Continuous Aggregates, and with release 2.7 such queries are even faster because we removed the re-aggregation and don't store partials anymore. Also, the previous PR #4430 gave us the ability to create indexes directly on the aggregated columns, leading to performance improvements. But there is a noticeable performance difference between `Materialized-Only` and `Real-Time` Continuous Aggregates for TopN queries. Enabling the ORDER BY clause in the Continuous Aggregates definition results in: 1) an improved user experience, since users can use this common clause in SELECT queries 2) performance improvements, because we give the planner a chance to use the MergeAppend node by producing ordered datasets. Closes #4456	2022-07-31 15:52:55 -03:00
Fabrízio de Royes Mello	f266f5cf56	Continuous Aggregates finals form Following work started by #4294 to improve performance of Continuous Aggregates by removing the re-aggregation in the user view. This PR gets rid of the `partialize_agg` and `finalize_agg` aggregate functions and stores the finalized aggregated (plain) data in the materialization hypertable. Because we're not storing partials anymore and removed the re-aggregation, it is now possible to create indexes on aggregated columns in the materialization hypertable in order to improve the performance even more. Also removed restrictions on types of aggregates users can perform with Continuous Aggregates: * aggregates with DISTINCT * aggregates with FILTER * aggregates with FILTER in HAVING clause * aggregates without combine function * ordered-set aggregates * hypothetical-set aggregates By default new Continuous Aggregates will be created using this new format, but the previous version (with partials) will be supported. Users can create the previous style by setting the storage parameter `timescaledb.finalized` to `false` during the creation of the Continuous Aggregate. Fixes #4233	2022-05-18 11:38:58 -03:00
Fabrízio de Royes Mello	1e8d37b54e	Remove `chunk_id` from materialization hypertable First step to remove the re-aggregation for Continuous Aggregates is to remove the `chunk_id` from the materialization hypertable. Also added new metadata column named `finalized` to `continuous_cagg` catalog table in order to store information about the new following finalized version of Continuous Aggregates that will not need the partials anymore. This flag is important to maintain backward compatibility with previous Continuous Aggregate implementation that requires the `chunk_id` to refresh data properly.	2022-05-06 14:30:00 -03:00
Konstantina Skovola	687e7c7233	Fix option "timescaledb.create_group_indexes" Previously this option was ignored when creating a continuous aggregate, even when explicitly set to true. Fixes #4249	2022-04-26 20:51:11 +03:00
Mats Kindahl	1b2926c076	Do not modify aggregation state in finalize The function `tsl_finalize_agg_ffunc` modified the aggregation state by setting `trans_value` to the final result when computing the final value. Since the state can be re-used several times, there could be several calls to the finalization function, and the finalization function would be confused when passed a final value instead of an aggregation state transition value. This commit fixes this by not modifying the `trans_value` when computing the final value and instead just returns it (or the original `trans_value` if there is no finalization function). Fixes #3248	2022-04-06 20:50:47 +02:00
gayyappan	d8d392914a	Support for compression on continuous aggregates Enable ALTER MATERIALIZED VIEW (timescaledb.compress) This enables compression on the underlying materialized hypertable. The segmentby and orderby columns for compression are based on the GROUP BY clause and time_bucket clause used while setting up the continuous aggregate. timescaledb_information.continuous_aggregate view defn change Add support for compression policy on continuous aggregates Move code from job.c to policy_utils.c Add support functions to check compression policy validity for continuous aggregates.	2021-12-17 10:51:33 -05:00
gayyappan	217ba461ac	Fix havingqual processing for caggs If the targetlist for the cagg query has both subexprs and exprs from the having clause, the havingqual for the partial view is generated incorrectly. Fix this issue by checking havingqual against all the entries in the targetlist instead of first match. Fixes #2655	2021-08-17 11:12:28 -04:00
Ruslan Fomkin	f98337cd3c	Avoid partitionwise planning of partialize_agg partialize_agg is an internal function, which serializes partial aggregate results. It is used to prepare partials for materialization in continuous aggregates and partial results on data nodes in distributed query execution. partialize_agg doesn't expect push down of aggregates, which happens when partitionwise aggregate is enabled, and produces a query plan, which either crashes on assert during execution or produces an incorrect result. This fix avoids adding partition info if the function is present in the query. This can be seen as a workaround and it would be good to fix planning of partialize_agg in the case of pushed down aggregates. This commit also contains a few minor fixes to the readability of comments and code around the changes. Fixes #2849 and fixes #2858	2021-01-28 09:00:08 +01:00
Mats Kindahl	d043ff1e04	Check configuration in alter_job and add_job If a bad value is given to `alter_job` or `add_job` for a configuration parameter, no error will be given but the job will fail to execute. This commit adds checks of the configuration parameters to the functions so that an error is given immediately when calling them. The commit factors out the extraction of parameters from the configuration from the execution functions into separate functions and calls them from `alter_job` and `add_job` as well as when executing the job. Only non-custom job checks are done. The commit also moves a few functions that were only used in TSL code from the `src/` directory to the `tsl/src/` directory and also removes a redundant permission check and does a minor refactoring of the `job_execute` function so that an active snapshot is always created regardless of whether a transaction is open or not. The corresponding code in the individual policy functions is removed since it is not needed. Closes #2607	2020-12-02 11:04:02 +01:00
Ruslan Fomkin	6a9a965409	Fix support for complex aggregate expression Fixes support for continuous aggregates when the view query contains an expression with several aggregates, e.g., `max(val) - min(val)`. Usage of continuous aggregates with such an expression was producing errors if the aggregate expression was not the last in the SELECT clause or not all GROUP BY expressions were present in the SELECT clause. An expression with several aggregates is materialized with partials per aggregate. For example, `max(val) - min(val)` will be materialized in two partial entry columns: one for `max` and one for `min`. Thus all columns in the materialized hypertable should account for the number of partials and cannot just use the position in the original query. This fix makes sure to account for such cases. Fixes #2616	2020-11-20 17:39:46 +01:00
Sven Klemm	295817f18e	Improve cagg datatype handling This patch improves datatype handling when the aggregate function argument type is a pseudotype.	2020-10-19 12:01:43 +02:00
Erik Nordström	4623db14ad	Use consistent column names in views Make all views that reference hypertables use `hypertable_schema` and `hypertable_name`.	2020-10-05 15:18:47 +02:00
Sven Klemm	dbb9988eee	Fix result ordering in tests This patch fixes the result sorting in tests that had no ORDER BY clause or where ORDER BY clause did not result in fixed ordering.	2020-09-28 12:15:42 +02:00
Erik Nordström	27e44f20ac	Cleanup functions to find continuous aggregates This change cleans up and removes duplicate code for internal lookups of continuous aggregates. A number of related error messages have also been cleaned up and made conformant with the error style guide.	2020-09-15 17:18:59 +02:00
Erik Nordström	4f74262991	Filter materialized hypertables in view This change filters materialized hypertables from the hypertables view, similar to how internal compression hypertables are filtered. Materialized hypertables are internal objects created as a side effect of creating a continuous aggregate, and these internal hypertables are still listed in the continuous_aggregates view. Fixes #2383	2020-09-14 13:04:59 +02:00
Erik Nordström	202692f1ef	Make tests use the new continuous aggregate API Tests are updated to no longer use continuous aggregate options that will be removed, such as `refresh_lag`, `max_interval_per_job` and `ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also been replaced with `CALL refresh_continuous_aggregate()` using ranges that try to replicate the previous refresh behavior. The materializer test (`continuous_aggregate_materialize`) has been removed, since this tested the "old" materializer code, which is no longer used without `REFRESH MATERIALIZED VIEW`. The new API using `refresh_continuous_aggregate` already allows manual materialization and there are two previously added tests (`continuous_aggs_refresh` and `continuous_aggs_invalidate`) that cover the new refresh path in similar ways. When updated to use the new refresh API, some of the concurrency tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have slightly different concurrency behavior. This is explained by different and sometimes more conservative locking. For instance, the first transaction of a refresh serializes around an exclusive lock on the invalidation threshold table, even if no new threshold is written. The previous code only took the heavier lock once, and only if a new threshold was written. This new, stricter locking means that insert processes that read the invalidation threshold will block for a short time when there are concurrent refreshes. However, since this blocking only occurs during the first transaction of the refresh (which is quite short), it probably doesn't matter too much in practice. The relaxing of locks to improve concurrency and performance can be implemented in the future.	2020-09-11 16:07:21 +02:00
Erik Nordström	07ebd5c9b2	Rename continuous aggregate policy API This change simplifies the name of the functions for adding and removing a continuous aggregate policy. The functions are renamed from: - `add_refresh_continuous_aggregate_policy` - `remove_refresh_continuous_aggregate_policy` to - `add_continuous_aggregate_policy` - `remove_continuous_aggregate_policy` Fixes #2320	2020-09-11 15:22:54 +02:00
Mats Kindahl	9565cbd0f7	Continuous aggregates support WITH NO DATA This commit will add support for `WITH NO DATA` when creating a continuous aggregate and will refresh the continuous aggregate when creating it unless `WITH NO DATA` is provided. All test cases are also updated to use `WITH NO DATA`, and an additional test case is added to verify that both `WITH DATA` and `WITH NO DATA` work as expected. Closes #2341	2020-09-11 14:02:41 +02:00
gayyappan	97b4d1cae2	Support refresh continuous aggregate policy Support add and remove continuous agg policy functions Integrate policy execution with refresh api for continuous aggregates The old api for continuous aggregates adds a job automatically for a continuous aggregate. This is an explicit step with the new API. So remove this functionality. Refactor some of the utility functions so that the code can be shared by multiple policies.	2020-09-01 21:41:00 -04:00
Mats Kindahl	c054b381c6	Change syntax for continuous aggregates We change the syntax for defining continuous aggregates to use `CREATE MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates a view, while `CREATE MATERIALIZED VIEW` creates a table. Raise an error if `CREATE VIEW` is used to create a continuous aggregate and redirect to `CREATE MATERIALIZED VIEW`. In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous aggregates and continuous aggregates cannot be dropped with `DROP VIEW`. Continuous aggregates are altered using `ALTER MATERIALIZED VIEW` rather than `ALTER VIEW`, so we ensure that it works for `ALTER MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to change a continuous aggregate. Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the partial view as well as with the direct view, so this is handled as a special case. Fixes #2233 Co-authored-by: Erik Nordström <erik@timescale.com> Co-authored-by: Mats Kindahl <mats@timescale.com>	2020-08-27 17:16:10 +02:00
Sven Klemm	a9c087eb1e	Allow scheduling custom functions as bgw jobs This patch adds functionality to schedule arbitrary functions or procedures as background jobs. New functions: add_job( proc REGPROC, schedule_interval INTERVAL, config JSONB DEFAULT NULL, initial_start TIMESTAMPTZ DEFAULT NULL, scheduled BOOL DEFAULT true ) Add a job that runs proc every schedule_interval. Proc can be either a function or a procedure implemented in any language. delete_job(job_id INTEGER) Deletes the job. run_job(job_id INTEGER) Execute a job in the current session.	2020-08-20 11:23:49 +02:00
Sven Klemm	d547d61516	Refactor continuous aggregate policy This patch modifies the continuous aggregate policy to store its configuration in the jobs table.	2020-08-11 22:57:02 +02:00
Mats Kindahl	9049a5d3cb	Remove requirement of CASCADE from DROP VIEW To drop a continuous aggregate it was necessary to use the `CASCADE` keyword, which would then cascade to the materialized hypertable. Since this can cascade the drop to other objects that are dependent on the continuous aggregate, this could accidentally drop more objects than intended. This commit fixes this by removing the check for `CASCADE` and adding the materialized hypertable to the list of objects to drop. Fixes timescale/timescaledb-private#659	2020-08-03 22:01:21 +02:00
Sven Klemm	2ae4592930	Add real-time support to continuous aggregates This PR adds a new mode for continuous aggregates that we name real-time aggregates. Unlike the original this new mode will combine materialized data with new data received after the last refresh has happened. This new mode will be the default behaviour for newly created continuous aggregates. To upgrade existing continuous aggregates to the new behaviour the following command needs to be run for all continuous aggregates ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=false); To disable this behaviour for newly created continuous aggregates and get the old behaviour the following command can be run ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=true);	2020-03-31 22:09:42 +02:00
gayyappan	ce624d61d3	Restrict watermark to max for continuous aggregates Set the threshold for continuous aggregates as the max value in the raw hypertable when the max value is less than the computed now time. This helps avoid unnecessary materialization checks for data ranges that do not exist. As a result, we also prevent unnecessary writes to the thresholds and invalidation log tables.	2020-03-25 12:20:11 -04:00
Sven Klemm	0cc22ad278	Stop background worker in tests To make tests more stable and to remove some repeated code in the tests this PR changes the test runner to stop background workers. Individual tests that need background workers can still start them and this PR will only stop background workers for the initial database for the test, behaviour for additional databases created during the tests will not change.	2020-03-06 15:27:53 +01:00
Sven Klemm	08c3d9015f	Change log level for cagg materialization messages The log level used for continuous aggregate materialization messages was INFO which is for requested information. Since there is no way to control the behaviour externally INFO is a suboptimal choice because INFO messages cannot be easily suppressed leading to irreproducible test output. Even though time can be mocked to make output consistent this is only available in debug builds. This patch changes the log level of those messages to LOG, so clients can easily control the output by setting client_min_messages.	2020-03-06 01:09:08 +01:00
Matvey Arye	08ad7b6612	Add ignore_invalidation_older_than to continuous aggs We added a timescaledb.ignore_invalidation_older_than parameter for continuous aggregates. This parameter accepts a time interval (e.g. 1 month). If set, it limits the amount of time for which to process invalidation. Thus, if timescaledb.ignore_invalidation_older_than = '1 month' then any modifications for data older than 1 month from the current timestamp at insert time will not cause updates to the continuous aggregate. This limits the amount of work that a backfill can trigger. This parameter must be >= 0. A value of 0 means that invalidations are never processed. When recording invalidations for the hypertable at insert time, we use the maximum ignore_invalidation_older_than of any continuous agg attached to the hypertable as a cutoff for whether to record the invalidation at all. When materializing a particular continuous agg, we use that agg's ignore_invalidation_older_than cutoff. However we have to apply that cutoff relative to the insert time not the materialization time to make it easier for users to reason about. Therefore, we record the insert time as part of the invalidation entry.	2019-12-04 15:47:03 -05:00
gayyappan	4ecc96509d	Fix partial select query for continuous aggregate For continuous aggregate views like select time_bucket(), sum(col) from ... group by time_bucket(), grpcol; when grpcol is missing from the select targetlist, the partialize query's select targetlist is incorrect and the view cannot be materialized. This PR fixes this issue.	2019-12-03 13:20:38 -05:00
Matvey Arye	2f7d69f93b	Make continuous agg relative to now() Previously, refresh_lag in continuous aggs was calculated relative to the maximum timestamp in the table. Change the semantics so that it is relative to now(). This is more intuitive. Requires an integer_now function applied to hypertables with integer-based time dimensions.	2019-11-21 14:17:37 -05:00
gayyappan	60cfe6cc90	Support for multiple continuous aggregates Allow multiple continuous aggregates to be defined on a hypertable.	2019-06-24 17:05:49 -04:00
Matvey Arye	d580abf04f	Change how permissions work with continuous aggs To create a continuous agg you now only need SELECT and TRIGGER permission on the raw table. To continue refreshing the continuous agg the owner of the continuous agg needs only SELECT permission. This commit adds tests to make sure that removing the SELECT permission removes ability to refresh using both REFRESH MATERIALIZED VIEW and also through a background worker. This work also uncovered divergence in permission logic for creating triggers by a CREATE TRIGGER on chunks and when new chunks are created. This has now been unified: there is a check to make sure you can create the trigger on the main table and then there is a check that the owner of the main table can create triggers on chunks. Alter view for continuous aggregates is allowed for the owner of the view.	2019-06-24 10:57:38 -04:00
Matvey Arye	77abec0d38	Improve permission checking for continuous aggs Checks: - Create View - Drop View - Alter View - Refresh Materialized View	2019-06-24 10:57:38 -04:00
Matvey Arye	e834c2aba8	Better permission checks in API calls This commit fixes and tests permissions in the following API calls: - reorder_chunk (test only) - alter_job_schedule - add_drop_chunks_policy - remove_drop_chunks_policy - add_reorder_policy - remove_reorder_policy - drop_chunks	2019-06-24 10:57:38 -04:00
gayyappan	0e842e2d90	Fix partial view targetlist for continuous aggregates The partial view should always project the time_bucket expression related column as this is a special column for the materialization table. The partial view failed to project it when the user query's SELECT targetlist did not contain the time_bucket expression. The materialization fails in this scenario.	2019-05-02 14:36:33 -04:00
Matvey Arye	2a76041dae	Make cont aggs group column names more intuitive This commit changes the name given to group columns in the materialized tables to make them more intuitive for the user. The goal was to make the column names the same as the column names in the view. The main change was to change time_partitioning_col to be the same as the view. "time_partition_col" is only used as the default when there is no alias. This commit also changes the assignment of the view aliases to the target entries to occur much earlier in the create process.	2019-05-01 14:47:53 -04:00
Joshua Lockerman	b41591bcdb	Test continuous aggregates with space partitions Just a sanity check to make sure they work correctly.	2019-05-01 11:09:43 -04:00
gayyappan	297b9ed66a	Add default index for continuous aggregates Add indexes for materialization table created by continuous aggregates. This behavior can be turned on/off by using timescaledb.create_group_indexes parameter of the WITH clause when the continuous agg is created.	2019-04-30 14:31:03 -04:00
Matvey Arye	eec90593fe	Rename continuous aggs files for consistency Rename continuous aggs files to be more consistent and follow our conventions.	2019-04-26 13:08:00 -04:00

40 Commits