This change ensures a refresh of a continuous aggregate only
re-materializes the part of the aggregate that has been
invalidated. This makes refreshing much more efficient, and sometimes
eliminates the need to materialize data entirely (i.e., in case there
are no invalidations in the refresh window).
The ranges to refresh are the remainders of invalidations after they
are cut by the refresh window (i.e., all invalidations, or parts of
invalidations, that fall within the refresh window). The invalidations
used for a refresh are collected in a tuple store (which spills to
disk) so as not to allocate too much memory when there are many
invalidations. Invalidations are, however, merged and deduplicated
before being added to the tuple store, similar to how invalidations are
processed in the invalidation logs.
Currently, refreshing simply materializes all invalidated ranges in the
order they appear in the tuple store; the ordering does not matter,
since all invalidated regions are refreshed in the same transaction.
In its initial state, a continuous aggregate should be completely
invalidated. Therefore, this change adds an infinite invalidation
`[-Infinity, +Infinity]` when a continuous aggregate is created.
An optimization for `time_bucket` transforms expressions of the form
`time_bucket(10, time) < 100` to `time < 100 + 10` in order to do
chunk exclusion and make better use of indexes on the time
column. However, since one bucket is added to the timestamp when doing
this transformation, the timestamp can overflow.
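To illustrate the transformation (hypothetical hypertable `conditions`
with an integer time column; the rewrite is internal and shown here
only as equivalent SQL):

    -- Predicate as written by the user
    SELECT * FROM conditions WHERE time_bucket(10, time) < 100;

    -- Equivalent predicate after the transformation, which enables chunk
    -- exclusion and use of an index on "time"; note the added bucket
    -- width, which is what can overflow near the end of the valid range
    SELECT * FROM conditions WHERE time < 100 + 10;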
While a check for such overflows already exists, it uses `+Infinity`
(INT64_MAX/DT_NOEND) as the upper bound instead of the actual end of
the valid timestamp range. A further complication arises because
TimescaleDB internally converts timestamps to UNIX epoch time, thus
losing a little bit of the valid timestamp range in the process. Dates
are further restricted by the fact that they are internally first
converted to timestamps (thus limited by the timestamp range) and then
converted to UNIX epoch.
This change fixes the overflow issue by only applying the
transformation if the resulting timestamps or dates stay within the
valid (TimescaleDB-specific) ranges.
A test has also been added to show the valid timestamp and date
ranges, both PostgreSQL and TimescaleDB-specific ones.
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view, even though `CREATE MATERIALIZED VIEW` normally creates a
table. An error is raised if `CREATE VIEW` is used to create a
continuous aggregate, directing the user to `CREATE MATERIALIZED VIEW`
instead.
In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates, which can no longer be dropped with `DROP VIEW`.
Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.
Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
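A sketch of the new syntax, using a hypothetical hypertable
`conditions`, continuous aggregate `conditions_hourly`, and schema
`archive`:

    -- Create (the command still creates a view under the hood)
    CREATE MATERIALIZED VIEW conditions_hourly
    WITH (timescaledb.continuous) AS
      SELECT time_bucket('1 hour', time) AS bucket, avg(temp) AS avg_temp
      FROM conditions
      GROUP BY bucket;

    -- Alter and drop also use the MATERIALIZED VIEW forms
    ALTER MATERIALIZED VIEW conditions_hourly SET SCHEMA archive;
    DROP MATERIALIZED VIEW archive.conditions_hourly;

    -- CREATE VIEW / ALTER VIEW / DROP VIEW on a continuous aggregate now
    -- raise an error pointing at the MATERIALIZED VIEW variants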
Fixes #2233
Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
This change removes, simplifies, and unifies code related to
`drop_chunks` and `show_chunks`. As a result of prior changes to
`drop_chunks`, e.g., making table relid mandatory and removing
cascading options, there's an opportunity to clean up and simplify the
rather complex code for dropping and showing chunks.
In particular, `show_chunks` is now consistent with `drop_chunks`; the
relid argument is mandatory, a continuous aggregate can be used in
place of a hypertable, and the input time ranges are checked and
handled in the same way.
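A sketch of the now-consistent calls (hypertable and continuous
aggregate names are hypothetical):

    -- The relation argument is mandatory for both functions;
    -- both calls operate on chunks older than one week
    SELECT show_chunks('conditions', INTERVAL '1 week');
    SELECT drop_chunks('conditions', INTERVAL '1 week');

    -- A continuous aggregate can be given in place of a hypertable
    SELECT show_chunks('conditions_hourly');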
Unused code is also removed; for instance, the code that cascaded drop
chunks to continuous aggregates remained in the code base even though
the option no longer exists.
This change will ensure that the pg_statistics on a chunk are
updated immediately prior to compression. It also ensures that
these stats are not overwritten as part of a global or hypertable
targeted ANALYZE.
This addresses the issue that a chunk no longer generates valid
statistics during an ANALYZE once its data has been moved to the
compressed table. Unfortunately, any compressed rows will not be
captured in the parent hypertable's pg_statistics, as there is no way
to change how PostgreSQL samples child tables in PG11.
This approach assumes that the compressed table remains static, which
is mostly correct in the current implementation (though it is
possible to remove compressed segments). Once we start allowing more
operations on compressed chunks this solution will need to be
revisited. Note that in PG12 an approach leveraging table access
methods will not have a problem analyzing compressed tables.
When trying to alter a job with a NULL config, `alter_job` did not
set the isnull field for config and would segfault when trying
to build the result tuple.
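A sketch of a call that previously crashed (the job id and the
`scheduled` parameter are illustrative assumptions):

    -- Altering a job whose stored config is NULL now returns the job row
    -- with a NULL config instead of segfaulting
    SELECT * FROM alter_job(1000, scheduled => false);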
Option `migrate_data` does not currently work for distributed
hypertables, so we block it for the time being and generate an error if
an attempt is made to migrate data when creating a distributed
hypertable.
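A sketch of the blocked case (table and column names are hypothetical):

    -- Raises an error instead of silently failing to migrate existing rows
    SELECT create_distributed_hypertable('conditions', 'time',
                                         migrate_data => true);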
Fixes #2230
This patch adds functionality to schedule arbitrary functions
or procedures as background jobs.
New functions:
add_job(
  proc REGPROC,
  schedule_interval INTERVAL,
  config JSONB DEFAULT NULL,
  initial_start TIMESTAMPTZ DEFAULT NULL,
  scheduled BOOL DEFAULT true
)

  Add a job that runs `proc` every `schedule_interval`. `proc` can
  be either a function or a procedure implemented in any language.

delete_job(job_id INTEGER)

  Deletes the job.

run_job(job_id INTEGER)

  Executes a job in the current session.
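A sketch of how the new API might be used (the procedure name, the
(job_id, config) parameter convention, and the job id are illustrative
assumptions):

    -- A custom procedure to be scheduled as a background job
    CREATE PROCEDURE custom_job(job_id INT, config JSONB)
    LANGUAGE plpgsql AS
    $$
    BEGIN
      RAISE NOTICE 'job % running with config %', job_id, config;
    END
    $$;

    -- Schedule it to run every hour with a JSONB config
    SELECT add_job('custom_job', INTERVAL '1 hour',
                   config => '{"retention_period": "1 month"}');

    -- Assuming run_job is a procedure and 1000 is the id returned by add_job
    CALL run_job(1000);
    SELECT delete_job(1000);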
If a retention policy is set up on a distributed hypertable, the drop
chunks call is not propagated to the data nodes, since it is made
through an internal call rather than a regular function call.
This commit fixes this by constructing the drop chunks call internally
and executing it as a function, which then propagates to the data nodes.
Fixes timescale/timescaledb-private#833
Fixes #2040
This change implements deduplication and merging of invalidation
entries for continuous aggregates in order to reduce the number of
redundant entries in the continuous aggregate invalidation
log. Merging is done both when copying over entries from the
hypertable to the continuous aggregate invalidation log and when
cutting already existing invalidations in the latter log. Doing this
merging in both steps also helps reduce the number of invalidations
for continuous aggregates that don't get refreshed by the active
refresh command.
Merging works by scanning invalidations in order of the lowest
modified value, and given this ordering it is possible to merge the
current and next entry into one large entry if they are
overlapping. This can continue until the current and next invalidation
are disjoint or there are no more invalidations to process.
Note, however, that only the continuous aggregate that gets refreshed
will be fully deduplicated. Some redundant entries might exist for
other aggregates since their entries in the continuous aggregate log
aren't cut against the refresh window.
Full deduplication for the refreshed continuous aggregate is only
possible if the continuous aggregate invalidation log is processed
last, since that log also includes "old" entries. Therefore, this
change also reorders how the logs are processed. This also
makes it possible to process the hypertable invalidation log in the
first transaction of the refresh.
The invalidation threshold governs the window of data from the head of
a hypertable that shouldn't be subject to invalidations in order to
reduce write amplification during inserts on the hypertable.
When a continuous aggregate is refreshed, the invalidation threshold
must be moved forward (or initialized if it doesn't previously exist)
whenever the refresh window stretches beyond the current threshold.
Tests for setting the invalidation threshold are also added, including
new isolation tests for concurrency.
When a continuous aggregate is refreshed, it also needs to move the
invalidation threshold in case the refresh window stretches beyond the
current threshold. The new invalidation threshold must be set in its
own transaction during the refresh, which can only be done if the
refresh command is a procedure.
The calculation of the max-size refresh window for integer-based
continuous aggregates used the range of 64-bit integers for all
integer types, while the max ranges for 16- and 32-bit integers are
lower. This change adds the missing range boundaries.
The with_clause_parser and continuous_aggs_drop_chunks tests were
not referenced in the CMakeLists file, leading to those tests never being
run. This patch adds them to the appropriate file and adjusts the
output.
This change adds initial support for invalidation processing when
refreshing a continuous aggregate. Note that, currently, invalidations
are only cleared during a refresh, but not yet used to optimize
refreshes. There are two steps to this processing:
1. Invalidations are moved from the hypertable invalidation log to the
cagg invalidation log
2. The cagg invalidation entries are then processed for the continuous
aggregate that gets refreshed.
The second step involves finding all invalidations that overlap with
the given refresh window and then either deleting them or cutting
them, depending on how they overlap.
Currently, the "invalidation threshold" is not moved up during a
refresh. This would only be required if the refresh window crosses
that threshold and will be addressed in a future change.
To drop a continuous aggregate it was necessary to use the `CASCADE`
keyword, which would then cascade to the materialized hypertable. Since
this can cascade the drop to other objects that are dependent on the
continuous aggregate, this could accidentally drop more objects than
intended.
This commit fixes this by removing the check for `CASCADE` and adding
the materialized hypertable to the list of objects to drop.
Fixes timescale/timescaledb-private#659
The parameter `cascade_to_materialization` is removed from
`drop_chunks` and `add_drop_chunks_policy` as well as associated tables
and test functions.
Fixes #2137
If the access node adds itself as a data node using `add_data_node`,
it will deadlock, since transactions will be opened on both the access
node and the data node, each trying to update the metadata.
This commit fixes this by updating `set_dist_id` to check if the UUID
being added as `dist_uuid` is the same as the `uuid` of the node. If
that is the case, it raises an error.
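A sketch of the now-rejected case (node name, host, and database are
hypothetical):

    -- Executed on the access node: if the host and database resolve to the
    -- access node itself, the dist_uuid would equal the node's own uuid,
    -- so an error is raised instead of deadlocking
    SELECT add_data_node('self_node', host => 'access-node.example.com',
                         database => 'postgres');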
Fixes #2133
This change replaces the add_drop_chunks_policy function with
add_retention_policy. This also renames the older_than parameter
of that function as retention_window. Likewise, the
remove_drop_chunks_policy is also being renamed
remove_retention_policy.
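A sketch of the rename (the hypertable name is hypothetical):

    -- Before: second argument is older_than
    SELECT add_drop_chunks_policy('conditions', INTERVAL '6 months');
    SELECT remove_drop_chunks_policy('conditions');

    -- After: second argument is retention_window
    SELECT add_retention_policy('conditions', INTERVAL '6 months');
    SELECT remove_retention_policy('conditions');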
Fixes #2119
Adds a test to call detach_tablespaces on a distributed hypertable.
Since no tablespaces can be attached to distributed hypertables, the
test detaches 0 tablespaces. A test to detach tablespaces on a data
node is also added.
This change adds a new refresh function called
`refresh_continuous_aggregate` that allows refreshing a continuous
aggregate over a given window of data, called the "refresh window".
This is the first step in a larger overhaul of the continuous
aggregate feature with the goal of cleaning up the API and separating
policy from the core functionality.
Currently, the refresh function does a brute-force refresh of a window
and it bypasses the whole invalidation framework. Future updates
intend to integrate with this framework (with modifications) to
optimize refreshes. An exclusive lock is taken on the continuous
aggregate's internal materialized hypertable in order to protect
against concurrent refreshing. However, as this serializes refreshes,
we might want to relax this locking in the future to allow, e.g.,
concurrent refreshes of non-overlapping windows.
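A sketch of a refresh over an explicit window (the continuous aggregate
name and window bounds are placeholders; shown as a procedure call,
matching the invalidation-threshold change described earlier):

    CALL refresh_continuous_aggregate('conditions_hourly',
                                      '2020-01-01', '2020-06-01');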
The new refresh functionality includes basic tests for bad input and
refreshing across different windows. Unfortunately, a bug in the
optimization code for `time_bucket` causes timestamps to overflow the
allowed MAX time. Therefore, refresh windows that are close to the MAX
allowed size are not yet supported or tested.
Whenever chunks are created, no privileges are added to the chunks.
For accesses that go through the hypertable, permission checks are
ignored, so reads and writes will succeed anyway. However, for direct
accesses to the chunks, permission checks are done, which creates
problems for, e.g., `pg_dump`.
This commit fixes this by propagating `GRANT` and `REVOKE` statements
to the chunks when executed on the hypertable, and whenever new chunks
are created, privileges are copied from the hypertable.
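A sketch (role and hypertable names are hypothetical):

    -- Grants on the hypertable now also apply to existing and future
    -- chunks, so direct chunk access (e.g., by pg_dump) works as expected
    GRANT SELECT ON conditions TO readonly_user;
    REVOKE SELECT ON conditions FROM readonly_user;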
This commit does not propagate privileges for distributed hypertables;
that is done in a separate commit.
The timescaledb_information.chunks view shows metadata related to
chunks.
The timescaledb_information.dimensions view shows metadata related to a
hypertable's dimensions.
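A sketch of querying the new views (the hypertable name and the exact
column set are assumptions):

    SELECT chunk_name, range_start, range_end
    FROM timescaledb_information.chunks
    WHERE hypertable_name = 'conditions';

    SELECT *
    FROM timescaledb_information.dimensions
    WHERE hypertable_name = 'conditions';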
Allow move_chunk() to work with an uncompressed chunk and
automatically move the associated compressed chunk to the specified
tablespace.
Block move_chunk() execution for compressed chunks.
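A sketch of the supported call (chunk and tablespace names are
hypothetical):

    -- Moving an uncompressed chunk also moves its associated compressed chunk
    SELECT move_chunk(
      chunk => '_timescaledb_internal._hyper_1_4_chunk',
      destination_tablespace => 'tablespace_2',
      index_destination_tablespace => 'tablespace_2'
    );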
Issue: #2067
This change fixes the stats collecting code to also return the slot
collation fields for PG12. This fixes a bug (#2093) where running an
ANALYZE in PG12 would break queries on distributed tables.
Testing that drop_chunks works correctly on a distributed hypertable.
Tests of different arguments are assumed to have been covered
previously on a regular hypertable.
The DML blocker to block INSERTs and UPDATEs on compressed hypertables
would trigger if the UPDATE or DELETE referenced any hypertable with
compressed chunks. This patch changes the logic to only block if the
target of the UPDATE or DELETE is a compressed chunk.
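A sketch of the distinction (table names are hypothetical; `conditions`
is a hypertable with compressed chunks):

    -- Now allowed: the compressed hypertable is only referenced, not the target
    UPDATE devices d
    SET last_temp = c.temp
    FROM conditions c
    WHERE c.device_id = d.id;

    -- Still blocked: the target rows live in a compressed chunk
    UPDATE conditions SET temp = 0 WHERE device_id = 1;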