timescaledb

mirror of https://github.com/timescale/timescaledb.git synced 2025-05-26 00:00:54 +08:00

Author	SHA1	Message	Date
Erik Nordström	e56b95daec	Add telemetry stats based on type of relation Refactor the telemetry function and format to include stats broken down on common relation types. The types include: - Tables - Partitioned tables - Hypertables - Distributed hypertables - Continuous aggregates - Materialized views - Views and for each of these types report (when applicable): - Total number of relations - Total number of children/chunks - Total data volume (broken into heap, toast, and indexes). - Compression stats - PG stats, like reltuples The telemetry function has also been refactored to return `jsonb` instead of `text`. This makes it easier to query and manipulate the resulting JSON format, and also gives cleaner output. Closes #3932	2022-02-08 09:44:55 +01:00
gayyappan	44be03b5c6	Fix premature cache release call The cache entry for a hypertable is created by calling ts_hypertable_from_tupleinfo. This sets up the ht->space structure, which in turn sets up the dimension information. These structures are allocated in the cache's memory context. The dimension information is accessed after the cache is released in ts_subtract_integer_from_now. This PR fixes this by releasing the cache before returning from the function. Fixes #4014	2022-01-25 18:01:29 -05:00
Aleksander Alekseev	9103d697fb	Don't allow using buckets like '1 month 15 days' + some refactorings This is in fact a backport from the "Buckets with timezones" feature branch. While working on the feature a bug was discovered. We allow creating buckets like '1 month 15 days', i.e. fixed-sized + variable-sized, which is supposed to be forbidden. This patch fixes the bug and also simplifies code a little. timezone_in / timezone_out procedures are used instead of snprintf/scanf. Also, the CAggTimebucketInfo structure was changed slightly. These changes are going to be needed for timezones support anyway.	2022-01-18 12:42:51 +03:00
Sven Klemm	39645d56da	Fix subtract_integer_from_now on 32-bit platforms This patch fixes subtract_integer_from_now on 32-bit platforms, improves error handling and adds some basic tests. subtract_integer_from_now would trigger an assert when called on a hypertable without integer time dimension (found by sqlsmith). Additionally subtract_integer_from_now would segfault when called on a hypertable without partitioning dimensions.	2021-12-20 10:02:57 +01:00
Aleksander Alekseev	958040699c	Monthly buckets support in CAGGs This patch allows using time_bucket_ng("N month", ...) in CAGGs. Users can also specify years, or months AND years. CAGGs on top of distributed hypertables are supported as well.	2021-12-13 22:21:17 +03:00
Sven Klemm	3c3290976c	Use postgres implementation of find_em_expr_for_rel find_em_expr_for_rel used to be a function in postgres_fdw which we imported but got moved to postgres main in PG13. This patch changes our code to use the postgres implementation when it is available and switches to our own implementation only for PG12.	2021-10-18 21:31:44 +02:00
gayyappan	fffd6c2350	Use plpgsql procedure for executing compression policy This PR removes the C code that executes the compression policy. Instead we use a PL/pgSQL procedure to execute the policy. PG13.4 and PG12.8 introduced some changes that require PortalContexts while executing transactions. The compression policy procedure compresses chunks in multiple transactions. We have seen some issues with snapshots and portal management in the policy code (due to the PG13.4 code changes). SPI API has transaction-portal management code. However, the compression policy code does not use SPI interfaces. But it is fairly easy to just convert this into a PL/pgSQL procedure (which calls SPI) rather than replicating portal management code in C to manage multiple txns in the compression policy. This PR also disallows decompress_chunk, compress_chunk and recompress_chunk in txn read only mode. Fixes #3656	2021-10-13 09:11:59 -04:00
gayyappan	c55cbb9350	Expose subtract_integer_from_now as SQL function Move subtract_integer_from_now to src directory and create a SQL function for it.	2021-10-13 09:11:59 -04:00
Sven Klemm	06092fbd09	Adjust FuncnameGetCandidates calls to PG14 changes PG14 adds an include_out_arguments parameter to FuncnameGetCandidates. https://github.com/postgres/postgres/commit/e56bce5d	2021-09-08 14:52:45 +02:00
Sven Klemm	d0426ff234	Move all compatibility related files into compat directory	2021-08-28 05:17:22 +02:00
Nikhil	2ffa1bf436	Implement cleanup for chunk copy/move A chunk copy/move operation is carried out in stages and it can fail in any of them. We track the last completed stage in the "chunk_copy_operation" catalog table. In case of failure, a "chunk_copy_cleanup" function can be invoked to bring the chunk back to its original state on the source datanode and all transient objects like replication slot, publication, subscription, empty chunk, metadata updates, etc are cleaned up. Includes test case changes for each and every stage induced failure. To avoid confusion between chunk copy activity and chunk copy operation this patch also consistently uses "operation" everywhere now instead of "activity"	2021-07-29 16:53:12 +03:00
Sven Klemm	fb863f12c7	Remove support for PG11 Remove support for compiling against PostgreSQL 11. This patch also removes PG11 specific compatibility macros.	2021-06-01 20:21:06 +02:00
Sven Klemm	d26c744115	Use %u to format Oid instead of %d Since Oid is unsigned int we have to use %u to print it otherwise oids >= 2^31 will not work correctly. This also switches the places that print type oid to use format helper functions to resolve the oids.	2021-04-14 21:11:20 +02:00
Erik Nordström	ce6387aa90	Allow only integer intervals for custom time types Fix a check for a compatible chunk time interval type when creating a hypertable with a custom time type. Previously, the check allowed `Interval` type intervals for any dimension type that is not an integer type, including custom time types. The check is now changed so that it only accepts an `Interval` for timestamp and date type dimensions. A number of related error messages are also cleaned up so that they are more consistent and conform to the error style guide.	2020-10-15 18:58:01 +02:00
Erik Nordström	c4a91e5ae8	Assume custom time type range is same as bigint The database must know the valid time range of a custom time type, similar to how it knows the time ranges of officially supported time types. However, the only way to "know" the valid time range of a custom time type is to assume it is the same as the one of a supported time type. A previous commit tried to make such assumptions by finding an appropriate cast from the custom time type to a supported time type. However, this fails in case there are multiple casts available that each could return a different type and range. This change restricts the choice of valid time ranges to only that of the bigint time type. Fixes #2523	2020-10-15 18:58:01 +02:00
Mats Kindahl	0e507affc1	Remove modification time from invalidation log The `modification_time` column is hard to maintain with any level of consistency over merges and splits of invalidation ranges so this commit removes it from the invalidation log entries for both hypertables and continuous aggregates. If the modification time is needed in the future, we need to re-introduce it in a manner that can maintain it over both merges and splits. The function `ts_get_now_internal` is also removed since it is not used any more. Part of #2521	2020-10-14 17:36:51 +02:00
Sven Klemm	ddd6ce21e4	Remove duplicate find_em_expr_for_rel function The functions find_em_expr_for_rel and ts_find_em_expr_for_rel are identical. This patch removes find_em_expr_for_rel and changes all call-sites to use ts_find_em_expr_for_rel.	2020-09-21 13:22:47 +02:00
Erik Nordström	417b66e974	Fix boundary handling in time types and constraints Time types, like date and timestamps, have limits that aren't the same as the underlying storage type. For instance, while a timestamp is stored as an `int64` internally, its max supported time value is not `INT64_MAX`. Instead, `INT64_MAX` represents `+Infinity` and the actual largest possible timestamp is close to `INT64_MAX` (but not `INT64_MAX-1` either). The same applies to min values. Unfortunately, time handling code does not check for these boundaries; in most cases, overflow handling when, e.g., bucketing, are checked against the max integer values instead of type-specific boundaries. In other cases, overflows simply throw errors instead of clamping to the boundary values, which makes more sense in many situations. Using integer time suffers from similar issues. To take one example, simply inserting a valid `smallint` value close to the max into a table with a `smallint` time column fails: ``` INSERT INTO smallint_table VALUES ('32765', 1, 2.0); ERROR: value "32770" is out of range for type smallint ``` This happens because the code that adds dimensional constraints always checks for overflow against `INT64_MAX` instead of the type-specific max value. Therefore, it tries to create a chunk constraint that ends at `32770`, which is outside the allowed range of `smallint`. To resolve these issues, several time-related utility functions have been implemented that, e.g., return type-specific range boundaries, and perform saturated addition and subtraction while clamping to supported boundaries. Fixes #2292	2020-09-04 23:27:22 +02:00
Dmitry Simonenko	cb2da81bf7	Fix ts_get_now_internal to use transaction time Issue: #2167	2020-08-31 14:47:10 +03:00
Erik Nordström	c5a202476e	Fix timestamp overflow in time_bucket optimization An optimization for `time_bucket` transforms expressions of the form `time_bucket(10, time) < 100` to `time < 100 + 10` in order to do chunk exclusion and make better use of indexes on the time column. However, since one bucket is added to the timestamp when doing this transformation, the timestamp can overflow. While a check for such overflows already exists, it uses `+Infinity` (INT64_MAX/DT_NOEND) as the upper bound instead of the actual end of the valid timestamp range. A further complication arises because TimescaleDB internally converts timestamps to UNIX epoch time, thus losing a little bit of the valid timestamp range in the process. Dates are further restricted by the fact that they are internally first converted to timestamps (thus limited by the timestamp range) and then converted to UNIX epoch. This change fixes the overflow issue by only applying the transformation if the resulting timestamps or dates stay within the valid (TimescaleDB-specific) ranges. A test has also been added to show the valid timestamp and date ranges, both PostgreSQL and TimescaleDB-specific ones.	2020-08-27 19:16:24 +02:00
Erik Nordström	e1c94484cf	Add support for infinite timestamps The internal conversion functions for timestamps didn't account for timestamps that are infinite (`-Infinity` or `+Infinity`), and they would therefore generate an error if such timestamps were encountered. This change adds extra checks to the conversion functions to allow infinite timestamps.	2020-08-14 01:52:28 +02:00
Sven Klemm	bb891cf4d2	Refactor retention policy This patch changes the retention policy to store its configuration in the bgw_job table and removes the bgw_policy_drop_chunks table.	2020-08-03 22:33:54 +02:00
Erik Nordström	a311f3735d	Adopt table scan methods for Scanner This change makes the Scanner code agnostic to the underlying storage implementation of the tables it scans. This also fixes a bug that made it impossible to use non-heap table access methods on a hypertable. The bug existed because a check is made for existing data before a table is made into a hypertable. And, since this check reads data from the table using the Scanner, it must be able to read the data irrespective of the underlying storage. As a result of the more generic scan interface, resource management is also improved by delivering tuples in reference-counted tuple table slots. A backwards-compatibility layer is used for PG11, which maps all table access functions to the heap equivalents.	2020-07-29 10:40:12 +02:00
Sven Klemm	c90397fd6a	Remove support for PG9.6 and PG10 This patch removes code support for PG9.6 and PG10. In addition to removing PG96 and PG10 macros the following changes are done: remove HAVE_INT64_TIMESTAMP since this is always true on PG10+ remove PG_VERSION_SUPPORTS_MULTINODE	2020-06-02 23:48:35 +02:00
Dmitry Simonenko	9b4aae813f	Support storage options for distributed hypertables This change allows deparsing the main table's storage options and including them in the CREATE TABLE command that is executed during the create_distributed_hypertable() call.	2020-05-27 17:31:09 +02:00
Erik Nordström	e2371558f7	Create chunks on remote servers This change ensures that chunk replicas are created on remote (datanode) servers whenever a chunk is created in a local distributed hypertable. Remote chunks are created using the `create_chunk()` function, which has been slightly refactored to allow specifying an explicit chunk table name. The one making the remote call also records the resulting remote chunk IDs in its `chunk_server` mappings table. Since remote command invocation without super-user permissions requires password authentication, the test configuration files have been updated to require password authentication for a cluster test user that is used in tests.	2020-05-27 17:31:09 +02:00
Ruslan Fomkin	1ddc62eb5f	Refactor header inclusion Correcting conditions in #ifdefs, adding missing includes, removing and rearranging existing includes, replacing PG12 with PG12_GE for forward compatibility. Fixed number of places with relation_close to table_close, which were missed earlier.	2020-04-14 23:12:15 +02:00
Joshua Lockerman	949b88ef2e	Initial support for PostgreSQL 12 This change includes a major refactoring to support PostgreSQL 12. Note that many tests aren't passing at this point. Changes include, but are not limited to: - Handle changes related to table access methods - New way to expand hypertables since expansion has changed in PostgreSQL 12 (more on this below). - Handle changes related to table expansion for UPDATE/DELETE - Fixes for various TimescaleDB optimizations that were affected by planner changes in PostgreSQL (gapfill, first/last, etc.) Before PostgreSQL 12, planning was organized something like as follows: 1. construct add `RelOptInfo` for base and appendrels 2. add restrict info, joins, etc. 3. perform the actual planning with `make_one_rel` For our optimizations we would expand hypertables in the middle of step 1; since nothing in the query planner before `make_one_rel` cared about the inheritance children, we didn’t have to be too precise about where we were doing it. However, with PG12, and the optimizations around declarative partitioning, PostgreSQL now does care about when the children are expanded, since it wants as much information as possible to perform partition-pruning. Now planning is organized like: 1. construct add RelOptInfo for base rels only 2. add restrict info, joins, etc. 3. expand appendrels, removing irrelevant declarative partitions 4. perform the actual planning with make_one_rel Step 3 always expands appendrels, so when we also expand them during step 1, the hypertable gets expanded twice, and things in the planner break. The changes to support PostgreSQL 12 attempt to solve this problem by keeping the hypertable root marked as a non-inheritance table until `make_one_rel` is called, and only then revealing to PostgreSQL that it does in fact have inheritance children. 
While this strategy entails the least code change on our end, the fact that the first hook we can use to re-enable inheritance is `set_rel_pathlist_hook` does entail a number of annoyances: 1. this hook is called after the sizes of tables are calculated, so we must recalculate the sizes of all hypertables, as they will not have taken the chunk sizes into account 2. the table upon which the hook is called will have its paths planned under the assumption it has no inheritance children, so if it's a hypertable we have to replan its paths Unfortunately, the code for doing these is static, so we need to copy them into our own codebase, instead of just using PostgreSQL's. In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also changed and are now planned in two stages: - In stage 1, the statement is planned as if it was a `SELECT` and all leaf tables are discovered. - In stage 2, the original query is planned against each leaf table, discovered in stage 1, directly, not part of an Append. Unfortunately, this means we cannot look in the appendrelinfo during UPDATE/DELETE planning, in particular to determine if a table is a chunk, as the appendrelinfo is not yet initialized at the point where we wish to do so. This has consequences for how we identify operations on chunks (sometimes for blocking and sometimes for enabling functionality).	2020-04-14 23:12:15 +02:00
Michael J. Freedman	416cf13385	Clarify supported intervals in error msg Error message used to specify that interval must be defined in terms of days or smaller, which was confusing because we really meant any fixed interval (e.g., weeks, days, hours, minutes, etc.), but not an interval that is not of fixed duration (e.g., months or years).	2020-03-05 13:13:04 -05:00
Matvey Arye	ef77c2ace8	Improve continuous agg user messages Switch from using internal timestamps to more user-friendly timestamps in our log messages and clean up some messages.	2020-01-02 15:49:04 -05:00
Matvey Arye	92aa77247a	Improve minor UIUX Some small improvements: - allow alter table with empty segment by if the original definition had an empty segment by. Improve error msgs. - block compression on tables with OIDs - block compression on tables with RLS	2019-10-29 19:02:58 -04:00
gayyappan	909b0ece78	Block updates/deletes on compressed chunks	2019-10-29 19:02:58 -04:00
Sven Klemm	e2c03e40aa	Add support for pathkey pushdown for transparent decompression This patch adds support for producing ordered output. All segmentby columns need to be a prefix of the pathkeys and the orderby specified for the compression needs to exactly match the rest of the pathkeys.	2019-10-29 19:02:58 -04:00
Joshua Lockerman	fa26992c4c	Improve deltadelta and gorilla compressors - Add fallback compressors for deltadelta/gorilla - Add bool compressor for deltadelta	2019-10-29 19:02:58 -04:00
Sven Klemm	d82ad2c8f6	Add ts_ prefix to all exported functions This patch adds the `ts_` prefix to exported functions that didn't have it and removes exports that are not needed.	2019-10-15 14:42:02 +02:00
Sven Klemm	a3a49703aa	Remove get_function_oid from utils.c The get_function_oid function was a reimplementation of PostgreSQL LookupFuncName. This patch removes the function and switches all callers to use LookupFuncName instead.	2019-09-24 21:13:06 +02:00
Sven Klemm	b86e47a8a1	Fix microsoft compiler warnings The microsoft compiler can't figure out that elog(ERROR) doesn't return and warns about functions not returning a value in all code paths. This patch adds pg_unreachable calls to those functions.	2019-09-16 10:13:21 +02:00
Sven Klemm	468c205a4f	Remove attno_find_by_attname and use get_attnum instead The patch removes the custom implementation to find the attribute number for a column and uses PostgreSQL get_attnum function instead.	2019-09-13 14:30:18 +02:00
Sven Klemm	7c434d4914	Fix ChunkAppend space partitioning support for ordered append When ordered append tried to push down targetlist to child paths it assumed children would be scans on rels which is not true for space partitioning where children might be MergeAppend nodes. This patch also no longer applies the ordered append optimization to partial paths because it's not safe to do so. This patch also adds more tests for space partitioned hypertables.	2019-08-21 23:08:15 +02:00
Narek Galstyan	62de29987b	Add a notion of now for integer time columns This commit implements functionality for users to give a custom definition of now() for integer open dimension typed hypertables. Such a now() function enables us to talk about intervals in the context of hypertables with integer time columns. In order to simplify future code, this commit defines a custom ts_interval type that unites the usual postgres intervals and integer time dimension intervals under a single composite type. The commit also enables adding drop chunks policy on hypertables with integer time dimensions if a custom now() function has been set.	2019-08-19 23:23:28 +04:00
gayyappan	5b7eea4cfe	Pass int64 using Int64GetDatum when a Datum is required int64 should be passed to functions that take a Datum parameter using Int64GetDatum. Depending on the platform, postgres either passes int64 by value or allocs a pointer to hold this value. Without this change, we get SEGV on raspberry pi.	2019-05-23 15:44:41 -04:00
Joshua Lockerman	ae3480c2cb	Fix continuous_aggs info This commit switches the remaining JOIN in the continuous_aggs_stats view to LEFT JOIN. This way we'll still see info from the other columns even when the background worker has not run yet. This commit also switches the time fields to output text in the correct format for the underlying time type.	2019-04-26 13:08:00 -04:00
Joshua Lockerman	0737b370a3	Add the actual bgw job for continuous aggregates This commit adds the actual background worker job that runs the continuous aggregate automatically. This job gets created when the continuous aggregate is created and is deleted when the aggregate is DROPed. By default this job will attempt to run every two bucket widths, and attempts to materialize up to two bucket widths behind the end of the table.	2019-04-26 13:08:00 -04:00
David Kohn	f17aeea374	Initial cont agg INSERT/materialization support This commit adds initial support for the continuous aggregate materialization and INSERT invalidations. INSERT path: On INSERT, DELETE and UPDATE we log the [max, min] time range that may be invalidated (that is, newly inserted, updated, or deleted) to _timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log will be used to re-materialize these ranges, to ensure that the aggregate is up-to-date. Currently these invalidations are recorded by a trigger _timescaledb_internal.continuous_agg_invalidation_trigger, which should be added to the hypertable when the continuous aggregate is created. This trigger stores a cache of min/max values per-hypertable, and on transaction commit writes them to the log, if needed. At the moment, we consider them to always be needed, unless we're in ReadCommitted mode or weaker, and the min invalidated value is greater than the hypertable's invalidation threshold (found in _timescaledb_catalog.continuous_aggs_invalidation_threshold) Materialization path: Materialization currently happens in multiple phases: in phase 1 we determine the timestamp at which we will end the new set of materializations, then we update the hypertable's invalidation threshold to that point, and finally we read the current invalidations, then materialize any invalidated rows, the new range between the continuous aggregate's completed threshold (found in _timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's invalidation threshold. After all of this is done we update the completed threshold to the invalidation threshold. The portion of this protocol from after the invalidations are read, until the completed threshold is written (that is, actually materializing, and writing the completion threshold) is included with this commit, with the remainder to follow in subsequent ones. 
One important caveat is that since the thresholds are exclusive, we invalidate all values _less_ than the invalidation threshold, and we store timevalue as an int64 internally, we cannot ever determine if the row at PG_INT64_MAX is invalidated. To avoid this problem, we never materialize the time bucket containing PG_INT64_MAX.	2019-04-26 13:08:00 -04:00
Joshua Lockerman	b0bd2775bd	Enable optimizing SELECTs within INSERTs Before this PR only SELECTs would be optimized to exclude unneeded chunks by our planner. This PR enables such optimizations on SELECTs found within an INSERT as well. This should speed up commands of the form INSERT INTO <hypertable> (SELECT ... FROM <hypertable> WHERE ...) We would like to enable this for all commands, but currently DELETE and UPDATE can not handle them, and cause errors when the optimizations are enabled. This commit also fixes an issue that would occur if we tried to exclude chunks based off of infinite time values.	2019-04-24 14:40:08 -04:00
Sven Klemm	ef9891b2e8	Fix a couple typos	2019-04-15 21:44:10 +02:00
Sven Klemm	1813848cb7	Add time_bucket support to chunk exclusion This patch adds support for chunk exclusion for time_bucket expressions in the WHERE clause. The following transformation is done when building RestrictInfo: Transform time_bucket calls of the following form in WHERE clause: time_bucket(width, column) OP value Since time_bucket always returns the lower bound of the bucket for lower bound comparisons the width is not relevant and the following transformation can be applied: time_bucket(width, column) > value column > value Example with values: time_bucket(10, column) > 109 column > 109 For upper bound comparisons width needs to be taken into account and we need to extend the upper bound by width to capture all possible values. time_bucket(width, column) < value column < value + width Example with values: time_bucket(10, column) < 100 column < 100 + 10 This allows chunk exclusions to work for views with aggregations.	2019-04-13 04:36:36 +02:00
Joshua Lockerman	e051842fee	Add interval to internal conversions, and tests for both this and time conversions We find ourselves needing to store intervals (specifically time_bucket widths) in upcoming PRs, so this commit adds that functionality, along with tests that we perform the conversion in a sensible, round-trippable manner. This commit fixes a longstanding bug in plan_hashagg where negative time values would prevent us from using a hashagg. The old logic for to_internal had a flag that caused the function to return -1 instead of throwing an error, if it could not perform the conversion. This logic was incorrect, as -1 is a valid time value. The new logic throws the error unconditionally, and forces the user to CATCH it if they wish to handle that case. Switching plan_hashagg to using the new logic fixed the bug. The commit adds a single SQL file, c_unit_tests.sql, to be the driver for all such pure-C unit tests. Since the tests run quickly, and there is very little work to be done at the SQL level, it does not seem like each group of such tests requires their own SQL file. This commit also updates the test/sql/.gitignore, as some generated files were missing.	2019-03-29 14:47:41 -04:00
Matvey Arye	34edba16a9	Run clang-format on code	2019-02-05 16:55:16 -05:00
niksa	c77f4ab1b3	Explicit chunk exclusion In some cases the user might already know which chunks need to be scanned to answer a particular query. Using the `chunks_in` function we can skip calculating the chunks involved in a particular query, which should result in better performance as well. A simple example: `SELECT * FROM hypertable WHERE chunks_in(hypertable, ARRAY[1,2])`	2019-01-19 00:02:01 +01:00
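
The `time_bucket` rewrite described in commits 1813848cb7 and c5a202476e can be illustrated with a small standalone sketch. This is not the TimescaleDB implementation: the names `CmpOp` and `rewrite_bucket_bound` are hypothetical, it only handles integer time, and it uses `INT64_MAX` as the upper limit, whereas the real code has to respect the narrower type-specific ranges mentioned in those commits. It shows the two cases from the commit message: a lower-bound comparison keeps the value unchanged, and an upper-bound comparison extends the value by one bucket width unless that addition would overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operator tags, used only in this sketch. */
typedef enum
{
	CMP_LT, /* time_bucket(width, col) < value */
	CMP_GT  /* time_bucket(width, col) > value */
} CmpOp;

/*
 * Rewrite "time_bucket(width, col) OP value" into "col OP *new_value".
 *
 * Returns true and sets *new_value when the rewrite is safe. Returns
 * false when the caller should keep the original expression, for
 * instance because value + width would overflow int64.
 */
bool
rewrite_bucket_bound(CmpOp op, int64_t width, int64_t value, int64_t *new_value)
{
	assert(new_value != NULL);

	/* A bucket width must be positive for the reasoning below to hold. */
	if (width <= 0)
		return false;

	switch (op)
	{
		case CMP_GT:
			/*
			 * time_bucket returns the lower bound of the bucket, so
			 * col >= bucket > value and the width is not relevant.
			 */
			*new_value = value;
			return true;

		case CMP_LT:
			/*
			 * col < bucket + width < value + width, so the bound is
			 * extended by one width. Skip the rewrite on overflow.
			 */
			if (value > INT64_MAX - width)
				return false;
			*new_value = value + width;
			return true;
	}

	return false;
}
```

With the values from the commit message, `time_bucket(10, column) < 100` yields the bound `110` and `time_bucket(10, column) > 109` keeps the bound `109`. A value within one width of `INT64_MAX` is left untransformed.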


97 Commits