This patch drops the following internal SQL functions which were
unused:
_timescaledb_internal.is_main_table(regclass);
_timescaledb_internal.is_main_table(text, text);
_timescaledb_internal.hypertable_from_main_table(regclass);
_timescaledb_internal.main_table_from_hypertable(integer);
_timescaledb_internal.time_literal_sql(bigint, regtype);
When calling the `cagg_watermark` function to get the watermark of a
Continuous Aggregate we execute a `SELECT MAX(time_dimension)` query
against the underlying materialization hypertable.
The problem is that a `SELECT MAX(time_dimension)` query can be
expensive because it scans all hypertable chunks, increasing the
planning time for Realtime Continuous Aggregates.
Improve this by creating a new catalog table to serve as a cache
for the current Continuous Aggregate watermark, updated in the
following situations:
- Create CAgg: store the minimum value of the hypertable time
dimension data type;
- Refresh CAgg: store the last value of the time dimension materialized
in the underlying materialization hypertable (or the minimum value of
the materialization hypertable time dimension data type if there's no
data materialized);
- Drop CAgg chunks: the same as refreshing the CAgg.
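Roughly, the watermark lookup becomes a single-row read from the new
catalog table instead of a MAX() scan. A sketch, treating the
materialization table name, id value, and cache table columns as
assumptions:
-- before: scans every chunk of the materialization hypertable
SELECT max(bucket) FROM _timescaledb_internal._materialized_hypertable_2;
-- after: single-row lookup in the watermark cache table
SELECT watermark FROM _timescaledb_catalog.continuous_aggs_watermark
  WHERE mat_hypertable_id = 2;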
Closes #4699, #5307
The `cagg_watermark` function performs only read-only operations, so
it is safe to mark it parallel safe and take advantage of Postgres
parallel query.
Since 2.7, when we introduced the new Continuous Aggregate format, we
no longer use partials, and the aggregate functions `partialize_agg`
and `finalize_agg`, which are not parallel safe, are no longer
involved, so it makes no sense not to take advantage of Postgres
parallel query for realtime Continuous Aggregates.
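A minimal sketch of the relabeling in SQL, assuming the integer
signature of cagg_watermark described elsewhere in these notes:
ALTER FUNCTION _timescaledb_internal.cagg_watermark(integer) PARALLEL SAFE;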
Postgres will prepend pg_temp to the effective search_path if it
is not present in the search_path. While pg_temp will never be
used to look up functions or operators unless explicitly requested,
it will be used to look up relations. Putting pg_temp in the
search_path makes sure objects in pg_temp will be considered last,
so pg_temp cannot be used to mask existing objects.
This patch locks down search_path in extension install and update
scripts to only contain pg_catalog, this requires that any reference
in those scripts is fully qualified. Additionally we add explicit
create commands to all update scripts for objects added to the
public schema. This change will make update scripts fail if a
function with an identical signature already exists when installing
or upgrading, instead of reusing the existing object.
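A minimal sketch of the pattern these scripts now follow (the helper
function name is hypothetical):
SET LOCAL search_path TO pg_catalog, pg_temp;
CREATE FUNCTION public.my_helper() RETURNS integer
  LANGUAGE sql AS $$ SELECT 1 $$;
Using plain CREATE FUNCTION rather than CREATE OR REPLACE is what
makes the script fail when an object with an identical signature
already exists.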
The function `cagg_watermark` returns the time threshold at which
materialized data ends and raw query data begins in a real-time
aggregation query (union view).
The watermark is simply the completed threshold of the continuous
aggregate materializer. However, since the completed threshold will no
longer exist with the new continuous aggregates, the watermark
function has been changed to return the end of the last bucket in the
materialized hypertable.
In most cases, the completed threshold is the same as the end of the
last materialized bucket. However, there are situations when it is
not; for example, when there is a filter in the view query some
buckets might not be materialized because no data matched the
filter. The completed threshold would move ahead regardless. For
instance, if there is only data from "device_2" in the raw hypertable
and the aggregate has a filter `device=1`, there will be no buckets
materialized although the completed threshold moves forward. Therefore
the new watermark function might sometimes return a lower watermark
than the old function. A similar situation explains the different
output in one of the union view tests.
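For illustration, a continuous aggregate of roughly this shape (using
the view-based syntax of this era; schema and names hypothetical)
leaves buckets unmaterialized whenever no rows match its filter:
CREATE VIEW device_summary WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket, count(*)
FROM conditions
WHERE device = 1
GROUP BY 1;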
This patch changes the signature from cagg_watermark(oid) to
cagg_watermark(int). Since this is an API breaking change it couldn't
be done in an earlier release.
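A sketch of the new call, passing the materialization hypertable id
(the id value here is hypothetical):
SELECT _timescaledb_internal.cagg_watermark(2);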
We should compute the watermark using the materialization
hypertable id and not by using the raw hypertable id.
New test cases added to continuous_aggs_multi.sql. Existing test
cases in continuous_aggs_multi.sql were not correctly updated
for this feature.
Fixes #1865
This PR adds a new mode for continuous aggregates that we name
real-time aggregates. Unlike the original mode, this new mode
combines materialized data with new data received after the last
refresh has happened. This new mode will be the default behaviour
for newly created continuous aggregates.
To upgrade existing continuous aggregates to the new behaviour,
the following command needs to be run for all continuous aggregates:
ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=false);
To disable this behaviour for newly created continuous aggregates
and get the old behaviour, the following command can be run:
ALTER VIEW continuous_view_name SET (timescaledb.materialized_only=true);
This commit switches the remaining JOIN in the continuous_aggs_stats
view to LEFT JOIN. This way we'll still see info from the other columns
even when the background worker has not run yet.
This commit also switches the time fields to output text in the correct
format for the underlying time type.
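A minimal sketch of the difference, with hypothetical table names:
SELECT j.id, s.last_run_success
FROM jobs j
LEFT JOIN job_stats s ON s.job_id = j.id;
-- LEFT JOIN keeps the job row (with NULL stat columns) even before
-- the background worker has written any stats.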
This commit adds the actual background worker job that runs the continuous
aggregate automatically. This job gets created when the continuous aggregate is
created and is deleted when the aggregate is DROPed. By default this job will
attempt to run every two bucket widths, and attempts to materialize up to two
bucket widths behind the end of the table.
This commit adds initial support for the continuous aggregate materialization
and INSERT invalidations.
INSERT path:
On INSERT, DELETE and UPDATE we log the [min, max] time range that may be
invalidated (that is, newly inserted, updated, or deleted) to
_timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log
will be used to re-materialize these ranges, to ensure that the aggregate
is up-to-date. Currently these invalidations are recorded by a trigger,
_timescaledb_internal.continuous_agg_invalidation_trigger, which should be
added to the hypertable when the continuous aggregate is created. This trigger
stores a cache of min/max values per hypertable and, on transaction commit,
writes them to the log, if needed. At the moment, we consider them to always
be needed, unless we're in ReadCommitted mode or weaker and the min
invalidated value is greater than the hypertable's invalidation threshold
(found in _timescaledb_catalog.continuous_aggs_invalidation_threshold).
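For illustration, the trigger is attached roughly like this (trigger
and hypertable names are hypothetical; the argument is the raw
hypertable's id):
CREATE TRIGGER cagg_invalidation_trigger
AFTER INSERT OR UPDATE OR DELETE ON conditions
FOR EACH ROW EXECUTE PROCEDURE
  _timescaledb_internal.continuous_agg_invalidation_trigger('1');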
Materialization path:
Materialization currently happens in multiple phases: in phase 1 we determine
the timestamp at which we will end the new set of materializations, then we
update the hypertable's invalidation threshold to that point, and finally we
read the current invalidations and materialize any invalidated rows plus the
new range between the continuous aggregate's completed threshold (found in
_timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's
invalidation threshold. After all of this is done we update the completed
threshold to the invalidation threshold. The portion of this protocol from
after the invalidations are read until the completed threshold is written
(that is, actually materializing, and writing the completion threshold) is
included with this commit, with the remainder to follow in subsequent ones.
One important caveat: since the thresholds are exclusive, we invalidate
all values _less_ than the invalidation threshold, and since we store time
values as int64 internally, we can never determine whether the row at
PG_INT64_MAX is invalidated. To avoid this problem, we never materialize
the time bucket containing PG_INT64_MAX.
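In outline, the threshold bookkeeping looks roughly like this, with
hypothetical ids, time values, and column names (the actual
materialization work happens in C between the two updates):
-- phase 1: advance the invalidation threshold to the chosen end point
UPDATE _timescaledb_catalog.continuous_aggs_invalidation_threshold
  SET watermark = 1000000 WHERE hypertable_id = 1;
-- phase 2: after reading invalidations and materializing, record
-- the completion point
UPDATE _timescaledb_catalog.continuous_aggs_completed_threshold
  SET watermark = 1000000 WHERE materialization_id = 2;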
Remove the following unused functions:
_timescaledb_internal.to_microseconds(TIMESTAMPTZ)
_timescaledb_internal.to_timestamp_pg(BIGINT)
_timescaledb_internal.time_to_internal(anyelement)
We've decided to adopt the ts_ prefix on all exported C functions in
order to avoid having symbol conflicts with future postgres functions.
We've already started using this prefix on new functions, and this commit
adds the prefix to the old functions.
Users can now (optionally) set a target chunk size and TimescaleDB
will try to adapt the interval length of the first open ("time")
dimension in order to reach that target chunk size. If a hypertable
has more than one open dimension, only the first one will have a
dynamically adapting interval.
Users can optionally specify their own function that calculates the
new dimension interval. They can also set a target size of 0 in order
to estimate a suitable target size for a chunk based on available
memory.
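For example, assuming the user-facing call takes a hypertable and a
target size (treat the exact function name and parameters as a
sketch):
SELECT set_adaptive_chunking('conditions', chunk_target_size => '1GB');
-- a target size of 0 asks TimescaleDB to estimate a suitable size
-- from available memory
SELECT set_adaptive_chunking('conditions', chunk_target_size => '0');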
The functions for adding and updating dimensions have been refactored
in C to:
- improve usage of proper error codes
- make messages better conform to the PostgreSQL standard
- improve security by avoiding running large amounts of code under
SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension(), and
the number of partitions can now be set using the new
set_number_partitions() function.
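For example (hypertable and column names hypothetical):
SELECT add_dimension('conditions', 'device', number_partitions => 4,
                     if_not_exists => true);
SELECT set_number_partitions('conditions', 8);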
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check for intervals > 0 and smallint intervals
accepted values up to UINT16_MAX instead of INT16_MAX.
This PR adds the ability to have multiple different versions of the timescaledb
extension be used by different databases in the same PostgreSQL
instance (server).
This is accomplished by splitting this extension into two .so files.
1) timescaledb.so -- stuff under loader/. Really not a lot of code.
This code MUST be backwards compatible in the future.
2) timescaledb-version.so (most of our code). Need
not be backwards compatible.
Timescaledb.so becomes a small stub which is preloaded and whose main
reason for existing is to dynamically load the right
timescaledb-version.so when the time comes.
This change allows either of the above .so to be loaded in
shared_preload_libraries. But timescaledb.so allows for multiple
versions used on different databases in the same instance along
with smoother upgrades. Using timescaledb-version.so allows for
finer-grained control and lock-in and is appropriate in only a few
production environments.
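For example, most deployments preload only the stub in
postgresql.conf:
shared_preload_libraries = 'timescaledb'
and the stub then dynamically loads the matching
timescaledb-<version>.so for each database's installed extension
version.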
This PR also adds version checking so that a clear failure message
will be displayed if the .so version does not match the SQL extension
version.
To support multi-version functionality we changed the way SQL update
scripts are generated. Previously, the system used a bunch of
intermediate upgrade scripts. So with 3 versions, you would have the
update scripts 1--2 and 2--3. But this PR changes things so that we
produce direct "shortcut" update files: 1--3 and 2--3.
This is done for 2 reasons:
1) Each of the update files should point to
$libdir/timescaledb-current_version, since you cannot guarantee that
the previous .so for each intermediate version has been installed.
2) You don't want intermediate version updates installed without the
.so. For example, if you have versions 1, 2, and 3 and you are
installing version 3, you want the upgrade files 1--3 and 2--3 but
not 1--2, because with 1--2 present a user could run ALTER EXTENSION
timescaledb UPDATE TO '2' even though the .so for version 2 may not
be installed.
In order to test this functionality, we add a mock extension version
.so so that we can test extension loading inside the regression
framework.
The user should be able to add time dimensions using INTERVAL when
the column type is TIMESTAMP/TIMESTAMPTZ/DATE, so this change adds
that support.
Additionally, it adds tests and checks for add_dimension, e.g., a
nice error when the table is not a hypertable.
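For example (table and column names hypothetical; the interval
parameter's name has varied across versions):
SELECT add_dimension('conditions', 'reported_at',
                     chunk_time_interval => INTERVAL '1 day');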
reindex allows you to reindex the indexes of only certain chunks,
filtering by time. This is a common use case because a user may
want to reindex chunks once they are no longer getting new data.
reindex also has a recreate option which will not use REINDEX
but will instead CREATE INDEX a new index, then
DROP INDEX the old one and RENAME the new index to the old name. This
approach has the advantage of blocking reads for a much shorter period
of time. However, it does more work and will use more disk space during
the operation.
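The recreate path is roughly equivalent to this sequence (index and
table names hypothetical):
CREATE INDEX conditions_time_idx_new ON conditions_chunk_1 ("time" DESC);
DROP INDEX conditions_time_idx;
ALTER INDEX conditions_time_idx_new RENAME TO conditions_time_idx;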
Previously, for timestamps without time zone, the range_end and
range_start were defined as UTC, but the constraints on the table
were written as the local time at the time of chunk creation. This
does not work well if timezones change over the life of the
hypertable.
This change removes the dependency on local time for all timestamp
partitioning. Namely, the range_start and range_end remain as UTC
but the constraints are now always written in UTC too. Since old
constraints correctly describe the data currently in the chunks, the
update script to handle this change changes range_start and range_end
instead of the constraints.
Fixes #300.
Functions marked IMMUTABLE should also be parallel safe, but
aren't by default. This change marks all immutable functions
as parallel safe and removes the IMMUTABLE definitions on
some functions that have been wrongly labeled as IMMUTABLE.
If functions that are IMMUTABLE do not have the PARALLEL SAFE
label, then some standard PostgreSQL regression tests will fail
(this is true for PostgreSQL >= 10).
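A sketch of the relabeling (function names hypothetical):
ALTER FUNCTION my_immutable_fn(timestamptz) PARALLEL SAFE;
-- and for a function wrongly labeled IMMUTABLE:
ALTER FUNCTION my_volatile_fn(timestamptz) STABLE;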
Clean up the table schema to get rid of legacy tables and functionality
that makes it more difficult to provide an upgrade path.
Notable changes:
* Get rid of legacy tables and code
* Simplify directory structure for SQL code
* Simplify table hierarchy: remove root table and make chunk tables
inherit directly from main table
* Change chunk table suffix from _data to _chunk
* Simplify schema usage: _timescaledb_internal for internal functions,
_timescaledb_catalog for metadata tables
* Remove postgres_fdw dependency
* Improve code comments in SQL code