timescaledb

mirror of https://github.com/timescale/timescaledb.git synced 2025-05-28 01:30:29 +08:00

Author	SHA1	Message	Date
Michael J. Freedman	416cf13385	Clarify supported intervals in error msg Error message used to specify that interval must be defined in terms of days or smaller, which was confusing because we really meant any fixed interval (e.g., weeks, days, hours, minutes, etc.), but not an interval that is not of fixed duration (e.g., months or years).	2020-03-05 13:13:04 -05:00
Matvey Arye	ef77c2ace8	Improve continuous agg user messages Switch from using internal timestamps to more user-friendly timestamps in our log messages and clean up some messages.	2020-01-02 15:49:04 -05:00
Matvey Arye	92aa77247a	Improve minor UIUX Some small improvements: - allow alter table with empty segment by if the original definition had an empty segment by. Improve error msgs. - block compression on tables with OIDs - block compression on tables with RLS	2019-10-29 19:02:58 -04:00
gayyappan	909b0ece78	Block updates/deletes on compressed chunks	2019-10-29 19:02:58 -04:00
Sven Klemm	e2c03e40aa	Add support for pathkey pushdown for transparent decompression This patch adds support for producing ordered output. All segmentby columns need to be a prefix of the pathkeys, and the orderby specified for the compression needs to exactly match the rest of the pathkeys.	2019-10-29 19:02:58 -04:00
Joshua Lockerman	fa26992c4c	Improve deltadelta and gorilla compressors - Add fallback compressors for deltadelta/gorilla - Add bool compressor for deltadelta	2019-10-29 19:02:58 -04:00
Sven Klemm	d82ad2c8f6	Add ts_ prefix to all exported functions This patch adds the `ts_` prefix to exported functions that didn't have it and removes exports that are not needed.	2019-10-15 14:42:02 +02:00
Sven Klemm	a3a49703aa	Remove get_function_oid from utils.c The get_function_oid function was a reimplementation of PostgreSQL LookupFuncName. This patch removes the function and switches all callers to use LookupFuncName instead.	2019-09-24 21:13:06 +02:00
Sven Klemm	b86e47a8a1	Fix microsoft compiler warnings The microsoft compiler can't figure out that elog(ERROR) doesn't return and warns about functions not returning a value in all code paths. This patch adds pg_unreachable calls to those functions.	2019-09-16 10:13:21 +02:00
Sven Klemm	468c205a4f	Remove attno_find_by_attname and use get_attnum instead The patch removes the custom implementation to find the attribute number for a column and uses PostgreSQL get_attnum function instead.	2019-09-13 14:30:18 +02:00
Sven Klemm	7c434d4914	Fix ChunkAppend space partitioning support for ordered append When ordered append tried to push down targetlist to child paths it assumed children would be scans on rels, which is not true for space partitioning where children might be MergeAppend nodes. This patch also no longer applies the ordered append optimization to partial paths because it's not safe to do so. This patch also adds more tests for space partitioned hypertables.	2019-08-21 23:08:15 +02:00
Narek Galstyan	62de29987b	Add a notion of now for integer time columns This commit implements functionality for users to give a custom definition of now() for integer open dimension typed hypertables. Such a now() function enables us to talk about intervals in the context of hypertables with integer time columns. In order to simplify future code. This commit defines a custom ts_interval type that unites the usual postgres intervals and integer time dimension intervals under a single composite type. The commit also enables adding drop chunks policy on hypertables with integer time dimensions if a custom now() function has been set.	2019-08-19 23:23:28 +04:00
gayyappan	5b7eea4cfe	Pass int64 using Int64GetDatum when a Datum is required int64 should be passed to functions that take a Datum parameter using Int64GetDatum. Depending on the platform, postgres either passes int64 by value or allocs a pointer to hold this value. Without this change, we get SEGV on raspberry pi.	2019-05-23 15:44:41 -04:00
Joshua Lockerman	ae3480c2cb	Fix continuous_aggs info This commit switches the remaining JOIN in the continuous_aggs_stats view to LEFT JOIN. This way we'll still see info from the other columns even when the background worker has not run yet. This commit also switches the time fields to output text in the correct format for the underlying time type.	2019-04-26 13:08:00 -04:00
Joshua Lockerman	0737b370a3	Add the actual bgw job for continuous aggregates This commit adds the actual background worker job that runs the continuous aggregate automatically. This job gets created when the continuous aggregate is created and is deleted when the aggregate is DROPed. By default this job will attempt to run every two bucket widths, and attempts to materialize up to two bucket widths behind the end of the table.	2019-04-26 13:08:00 -04:00
David Kohn	f17aeea374	Initial cont agg INSERT/materialization support This commit adds initial support for the continuous aggregate materialization and INSERT invalidations. INSERT path: On INSERT, DELETE and UPDATE we log the [max, min] time range that may be invalidated (that is, newly inserted, updated, or deleted) to _timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log will be used to re-materialize these ranges, to ensure that the aggregate is up-to-date. Currently these invalidations are recorded by a trigger _timescaledb_internal.continuous_agg_invalidation_trigger, which should be added to the hypertable when the continuous aggregate is created. This trigger stores a cache of min/max values per-hypertable, and on transaction commit writes them to the log, if needed. At the moment, we consider them to always be needed, unless we're in ReadCommitted mode or weaker, and the min invalidated value is greater than the hypertable's invalidation threshold (found in _timescaledb_catalog.continuous_aggs_invalidation_threshold). Materialization path: Materialization currently happens in multiple phases: in phase 1 we determine the timestamp at which we will end the new set of materializations, then we update the hypertable's invalidation threshold to that point, and finally we read the current invalidations, then materialize any invalidated rows, the new range between the continuous aggregate's completed threshold (found in _timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's invalidation threshold. After all of this is done we update the completed threshold to the invalidation threshold. The portion of this protocol from after the invalidations are read, until the completed threshold is written (that is, actually materializing, and writing the completion threshold) is included with this commit, with the remainder to follow in subsequent ones.
One important caveat is that since the thresholds are exclusive, we invalidate all values _less_ than the invalidation threshold, and we store timevalue as an int64 internally, we cannot ever determine if the row at PG_INT64_MAX is invalidated. To avoid this problem, we never materialize the time bucket containing PG_INT64_MAX.	2019-04-26 13:08:00 -04:00
Joshua Lockerman	b0bd2775bd	Enable optimizing SELECTs within INSERTs Before this PR only SELECTs would be optimized to exclude unneeded chunks by our planner. This PR enables such optimizations on SELECTs found within an INSERT as well. This should speed up commands of the form INSERT INTO <hypertable> (SELECT ... FROM <hypertable> WHERE ...) We would like to enable this for all commands, but currently DELETE and UPDATE can not handle them, and cause errors when the optimizations are enabled. This commit also fixes an issue that would occur if we tried to exclude chunks based off of infinite time values.	2019-04-24 14:40:08 -04:00
Sven Klemm	ef9891b2e8	Fix a couple typos	2019-04-15 21:44:10 +02:00
Sven Klemm	1813848cb7	Add time_bucket support to chunk exclusion This patch adds support for chunk exclusion for time_bucket expressions in the WHERE clause. The following transformation is done when building RestrictInfo: Transform time_bucket calls of the following form in WHERE clause: time_bucket(width, column) OP value Since time_bucket always returns the lower bound of the bucket for lower bound comparisons the width is not relevant and the following transformation can be applied: time_bucket(width, column) > value column > value Example with values: time_bucket(10, column) > 109 column > 109 For upper bound comparisons width needs to be taken into account and we need to extend the upper bound by width to capture all possible values. time_bucket(width, column) < value column < value + width Example with values: time_bucket(10, column) < 100 column < 100 + 10 This allows chunk exclusions to work for views with aggregations.	2019-04-13 04:36:36 +02:00
Joshua Lockerman	e051842fee	Add interval to internal conversions, and tests for both this and time conversions We find ourselves needing to store intervals (specifically time_bucket widths) in upcoming PRs, so this commit adds that functionality, along with tests that we perform the conversion in a sensible, round-trippable manner. This commit fixes a longstanding bug in plan_hashagg where negative time values would prevent us from using a hashagg. The old logic for to_internal had a flag that caused the function to return -1 instead of throwing an error, if it could not perform the conversion. This logic was incorrect, as -1 is a valid time value. The new logic throws the error unconditionally, and forces the user to CATCH it if they wish to handle that case. Switching plan_hashagg to using the new logic fixed the bug. The commit adds a single SQL file, c_unit_tests.sql, to be the driver for all such pure-C unit tests. Since the tests run quickly, and there is very little work to be done at the SQL level, it does not seem like each group of such tests requires their own SQL file. This commit also updates the test/sql/.gitignore, as some generated files were missing.	2019-03-29 14:47:41 -04:00
Matvey Arye	34edba16a9	Run clang-format on code	2019-02-05 16:55:16 -05:00
niksa	c77f4ab1b3	Explicit chunk exclusion In some cases the user might already know which chunks need to be scanned to answer a particular query. Using the `chunks_in` function we can skip calculating the chunks involved in a particular query, which should result in better performance as well. A simple example: `SELECT * FROM hypertable WHERE chunks_in(hypertable, ARRAY[1,2])`	2019-01-19 00:02:01 +01:00
Joshua Lockerman	acc41a7712	Update license header Only have the copyright in the NOTICE. Hopefully only having to update one place each year will keep it consistent.	2019-01-03 11:57:51 -05:00
Joshua Lockerman	888dea71b5	Stop using the extra field for now and other Windows bugs Something is causing a heap corruption upon setting the license key to default when we try to use the guc extra on windows. For now stop using it and just rerun the validation function, if we get to the assign hook we must have a valid key, so it will never fail. Also Fixes error message on windows; turns out windows does not like to print NULL strings. Don't do that. Fixes other minor windows bugs.	2019-01-02 15:43:48 -05:00
Sven Klemm	c59a30feed	Remove unused functions from utils.c Remove the following unused functions: _timescaledb_internal.to_microseconds(TIMESTAMPTZ) _timescaledb_internal.to_timestamp_pg(BIGINT) _timescaledb_internal.time_to_internal(anyelement)	2018-12-12 20:54:20 +01:00
David Kohn	5aa1edac15	Refactor compatibility functions and code to support PG11 Introduce PG11 support by introducing compatibility functions for any whose signatures have changed in PG11. Additionally, refactor the structure of the compatibility functions found in compat.h by breaking them out by function (or small set of similar functions) so that it is easier to see what changed between versions and maintain changes as more versions are supported. In general, the philosophy has been to try for forward compatibility wherever possible, so that we use the latest versions of function interfaces where we can or where reasonably convenient and mimic the behavior in older versions as much as possible.	2018-12-12 11:42:33 -05:00
Sven Klemm	ed5067c356	Fix interval_from_now_to_internal timestamptz handling fix interval_from_now_to_internal to handle timezone properly for timestamptz and simplify code	2018-12-10 23:24:12 +01:00
Joshua Lockerman	9b52909b17	Add the ability to ignore tests from the command line using IGNORES	2018-12-10 16:36:44 -05:00
niksa	019971c402	Optimize FIRST/LAST aggregate functions If possible replace aggregate functions FIRST/LAST with subqueries of the form (SELECT value FROM table WHERE sort IS NOT NULL AND existing-quals ORDER BY sort ASC/DESC LIMIT 1). Given a suitable index on sort column, this plan can be much faster than scanning all the rows and running an aggregate function. The optimization can't be performed if: - query uses GROUP BY or WINDOW function - query contains CTEs - query contains other aggregate functions (eg. Combining MIN/MAX with FIRST/LAST. We can't optimize across different aggregate functions) - query uses JOIN - FIRST/LAST used in ORDER BY Optimization also works with subqueries, or if FIRST/LAST is used in CTE subquery. In order to standardize existing FIRST/LAST aggregate function with PostgreSQL and FIRST/LAST optimization, we exclude NULL values in sort by column.	2018-12-10 09:50:55 +01:00
Joshua Lockerman	9de504f958	Add ts_ prefix to everything in headers Future proofing: if we ever want to make our functions available to others they’d need to be prefixed to prevent name collisions. In order to avoid having some functions with the ts_ prefix and others without, we’re adding the prefix to all non-static functions now.	2018-12-05 14:43:22 -05:00
Sven Klemm	b9b439fde4	Remove unused functions from utils.c Remove int_cmp, create_fmgr and makeRangeVarFromRelid from utils.c since they were not used and had no test coverage.	2018-11-30 20:12:26 +01:00
Narek Galstyan	9a3402809f	Implement show_chunks in C and have drop_chunks use it Timescale provides an efficient and easy to use api to drop individual chunks from timescale database through drop_chunks. This PR builds on that functionality and through a new show_chunks function gives the opportunity to see the chunks that would be dropped if drop_chunks was run. Additionally, it adds a newer_than option to drop_chunks (also supported by show_chunks) that allows to see/drop chunks in an interval or newer than a point in time. This commit includes: - Implementation of show_chunks in C - Additional helper functions to work with chunks - New version of drop_chunks in sql that uses show_chunks. This also adds a newer_than option to drop_chunks - More enhanced tests of drop_chunks and new tests for show_chunks Among other reasons, show_chunks was implemented in C in order to be able to have both older_than and newer_than arguments be null. This was not possible in SQL because the arguments had to have polymorphic types and whether they are used in function body or not, PL/pgSQL requires these arguments to typecheck.	2018-11-28 13:46:07 -05:00
Amy Tai	80e0b05348	Provide helper function creating struct from tuple Refactored the boilerplate that allocates and copies over data from a tuple to a struct. This is typically used in the scanner context in order to read rows from a SQL table in C.	2018-11-21 15:33:56 -05:00
Joshua Lockerman	d8e41ddaba	Add Apache License header to all C files	2018-10-29 13:28:19 -04:00
Erik Nordström	b2130f8039	Move all time_bucket functions to same source file This change moves all time_bucket-related functions to the same source file (time_bucket.c) for consistency. There are no changes to code logic.	2018-10-23 10:44:58 +02:00
Matvey Arye	19299cf349	Make all time_bucket functions STRICT All time bucket functions should return NULL on any NULL parameters.	2018-10-15 10:16:10 -04:00
Matvey Arye	debd91478a	Move to using macro for time_bucket_ts Macro is used for 2 reasons: 1) It's more correct in that it doesn't mix Timestamp and TimestampTz types. There is no implicit conversion of the two beneath the hood. 2) It is slightly faster as it avoids an extra function call. This is a very performance sensitive function for OLAP queries.	2018-10-15 10:16:10 -04:00
Matvey Arye	297d88551b	Add a version of time_bucket that takes an origin This allows people to explicitly specify the origin point.	2018-10-15 10:16:10 -04:00
Matvey Arye	e74be30925	Move time_bucket epoch to a Monday Since Monday is the ISO start of the week, it makes sense to move the time_bucket epoch to start on a Monday. Before the epoch was the same as the Postgres epoch (2000-01-01, a Saturday).	2018-10-15 10:16:10 -04:00
Joshua Lockerman	974788516a	Prefix public C functions with ts_ We've decided to adopt the ts_ prefix on all exported C functions in order to avoid having symbol conflicts with future postgres functions. We've already started using this prefix on new functions and this commit adds the prefix to the old functions.	2018-09-27 11:45:04 -04:00
Floris van Nee	1d9ade7145	add support for other types as timescale column	2018-08-08 11:45:23 -04:00
Erik Nordström	9c9cdca6d3	Add support for adaptive chunk sizing Users can now (optionally) set a target chunk size and TimescaleDB will try to adapt the interval length of the first open ("time") dimension in order to reach that target chunk size. If a hypertable has more than one open dimension, only the first one will have a dynamically adapting interval. Users can optionally specify their own function that calculates the new dimension interval. They can also set a target size of 0 in order to estimate a suitable target size for a chunk based on available memory.	2018-08-08 17:01:31 +02:00
Matvey Arye	e362e9cf18	Block mixing hypertables with postgres inheritance	2018-07-26 16:16:40 -04:00
Matvey Arye	2ec065b538	Fix formatting to comply with pgindent This PR fixes all the formatting to be inline with the latest version of pgindent. Since pgindent does not like variables named `type`, those have been appropriately renamed.	2018-07-11 09:40:29 -04:00
Mike Futerko	4f2f1a6eb7	Update the error messages to conform with the style guide; Fix tests An attempt to unify the error messages to conform with the PostgreSQL error messages style guide. See the link below: https://www.postgresql.org/docs/current/static/error-style-guide.html	2018-07-10 12:55:02 -04:00
Matvey Arye	2de6b02c16	Add optimization to use HashAggregate more often This optimization adds a HashAggregate plan to many group by queries. In plain postgres, many time-series queries will not use the hash aggregate because the planner will incorrectly assume that the number of rows is much larger than it actually is and will use the less efficient GroupAggregate instead of a HashAggregate to prevent running out of memory. The planner will assume a large number of rows because the statistics planner for grouping assumes that the number of distinct items produced by a function is the same as the number of distinct items going in. This is not true for functions like time_bucket and date_trunc. This optimization fixes the statistics and adds the HashAggregate plan if appropriate. The statistics now rely on evaluating the spread of a variable and dividing it by the interval in the time_bucket or date_trunc. This is still an overestimate of the total number of groups but is better than before. A further improvement on this will be to evaluate the quals (WHERE clauses) on the query to try to derive a tighter spread on the variable. This is left to a future optimization.	2018-06-21 14:01:02 -04:00
Erik Nordström	71962b86ec	Refactor dimension-related API functions The functions for adding and updating dimensions have been refactored in C to: - improve usage of proper error codes - make messages that better conform with the PostgreSQL standard. - improve security by avoiding that lots of code run under SECURITY DEFINER A new if_not_exists option has also been added to add_dimension() and the number of partitions can now be set using the new set_number_partitions() function. A bug in the validation of smallint time intervals has been fixed. The previous code didn't check for intervals > 0 and smallint intervals accepted values up to UINT16_MAX instead of INT16_MAX.	2018-01-25 19:02:34 +01:00
Erik Nordström	b6e2780460	Apply new indentation (pgindent) used in PostgreSQL 10 Source code indentation has been updated in PostgreSQL 10 to fix a number of issues. This update applies this new indentation to the entire code base. The new indentation requires a new version of pg_bsd_indent, which can be found here: https://git.postgresql.org/git/pg_bsd_indent.git	2018-01-18 15:19:23 +01:00
Rob Kiefer	66396fb81e	Add build support for Windows Windows 64-bit binaries should now be buildable using the cmake build system either from the command line or from Visual Studio. Previous issues regarding unresolved symbols have been resolved with compatibility header files to properly export symbols or getting GUCs via normal APIs.	2017-11-27 12:04:44 -05:00
Matvey Arye	13e1cb5343	Add reindex function reindex allows you to reindex the indexes of only certain chunks, filtering by time. This is a common use case because a user may want to reindex chunks after they are no longer getting new data once. reindex also has a recreate option which will not use REINDEX but will rather CREATE INDEX a new index and then DROP INDEX / RENAME new_index to old_name. This approach has advantages in terms of blocking reads for a much shorter period of time. However, it does more work and will use more disk space during the operation.	2017-11-21 14:08:57 -05:00
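
The time_bucket chunk-exclusion rewrite described in commit 1813848cb7 can be illustrated with a small model. This is only a sketch in Python, not the extension's C code: it assumes integer time values, and the helper names `time_bucket`, `relax_bound` and `never_drops_matching_rows` are hypothetical. It checks by brute force that the rewritten predicate on the raw column never rejects a row that the original `time_bucket(width, column) OP value` predicate accepts, which is what makes the rewrite safe for excluding chunks.

```python
import operator
from typing import Iterable, Tuple

# Comparison operators covered by the commit message's examples.
OPS = {">": operator.gt, "<": operator.lt}


def time_bucket(width: int, value: int) -> int:
    """Integer model of time_bucket: lower bound of the bucket holding value."""
    return (value // width) * width


def relax_bound(op: str, value: int, width: int) -> Tuple[str, int]:
    """Rewrite `time_bucket(width, col) OP value` as a bound on `col` itself.

    Lower bounds are unchanged because a bucket's start never exceeds the
    column value; upper bounds are widened by one bucket width.
    """
    if op == ">":
        return op, value
    if op == "<":
        return op, value + width
    raise ValueError(f"unsupported operator: {op}")


def never_drops_matching_rows(
    op: str, value: int, width: int, column_values: Iterable[int]
) -> bool:
    """True if every row passing the original predicate passes the rewrite."""
    new_op, new_value = relax_bound(op, value, width)
    return all(
        OPS[new_op](c, new_value)
        for c in column_values
        if OPS[op](time_bucket(width, c), value)
    )


if __name__ == "__main__":
    rows = range(-50, 300)
    # The two examples from the commit message.
    print(relax_bound(">", 109, 10), never_drops_matching_rows(">", 109, 10, rows))
    print(relax_bound("<", 100, 10), never_drops_matching_rows("<", 100, 10, rows))
```

The rewritten bound may admit extra rows (for example, `column < 110` also matches 105, whose bucket is 100 and so fails `< 100`). That is acceptable for chunk exclusion as long as the original predicate is still evaluated on the rows that remain.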

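The effect of commit e74be30925 (moving the time_bucket epoch to a Monday) can be seen in a short Python sketch. The old origin, 2000-01-01, comes from the commit message; the new origin of 2000-01-03 is an assumption made here for illustration (the first Monday after the old epoch), and `bucket_start` is a hypothetical helper, not a TimescaleDB function.

```python
from datetime import date, timedelta

POSTGRES_EPOCH = date(2000, 1, 1)  # a Saturday, per the commit message
MONDAY_ORIGIN = date(2000, 1, 3)   # assumed Monday origin, for illustration only


def bucket_start(day: date, width_days: int, origin: date) -> date:
    """Start of the width_days-wide bucket containing day, counted from origin."""
    offset = (day - origin).days
    return origin + timedelta(days=(offset // width_days) * width_days)


if __name__ == "__main__":
    day = date(2018, 10, 17)
    for origin in (POSTGRES_EPOCH, MONDAY_ORIGIN):
        start = bucket_start(day, 7, origin)
        # strftime("%A") prints the weekday name of each bucket start.
        print(origin, "->", start, start.strftime("%A"))
```

With a seven-day width, every bucket start falls on the same weekday as the origin, so a Monday origin yields ISO-style weeks.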

69 Commits