When creating a compressed hypertable, primary key and unique
constraints are limited to segment_by and order_by columns, and foreign
key constraints are limited to segment_by columns. There are no
restrictions on check constraints.
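As a hedged illustration (table, column, and option names are hypothetical, and the compression option syntax is assumed to match the compress_segmentby/compress_orderby parameters described further down in these notes), a unique constraint is accepted when its columns are covered by the segment_by and order_by columns:

```sql
-- Hypothetical sketch: the UNIQUE constraint on (device_id, time) is allowed
-- because device_id is a segment_by column and time is an order_by column.
CREATE TABLE metrics (
    time      timestamptz      NOT NULL,
    device_id integer          NOT NULL,
    value     double precision,
    UNIQUE (device_id, time)
);
SELECT create_hypertable('metrics', 'time');

-- Option names are assumptions based on the compress_chunks commit below.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'time'
);
```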
This simplifies the code and the access to the min/max
metadata. The min/max values now have the same type as the underlying
column and are stored as two columns, which also removes the custom
type that was used before.
This commit adds handling for dropping chunks and hypertables
in the presence of associated compressed objects. If the uncompressed
chunk/hypertable is dropped, then the associated compressed object is
dropped using DROP_RESTRICT unless cascading is explicitly enabled.
Also add a compressed_chunk_id index on compressed tables for
figuring out whether a chunk is compressed or not.
Change a bunch of APIs to use DropBehavior instead of a cascade bool
to be more explicit.
Also test the drop chunks policy.
This commit pushes down quals on order_by columns to make
use of the SegmentMetaMinMax objects. Namely, =, <, <=, >, and >= quals
can now be pushed down.
We also remove filters from the decompress node for quals that
have been pushed down and don't need a recheck.
This commit also changes tests to add more segment_by and
order_by columns.
Finally, we rename the segment meta accessor functions to give them
shorter names.
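For example, a hedged sketch (hypothetical table and column names, with time as a compress_orderby column) of a query that benefits from the pushdown:

```sql
-- The >= qual on the order_by column "time" is pushed down to the segment
-- min/max metadata: compressed segments whose stored max(time) is below the
-- constant are skipped, and no recheck filter is needed on the decompress node.
SELECT *
  FROM metrics
 WHERE time >= '2019-08-01';
```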
Add the type for min/max segment meta object. Segment metadata
objects keep metadata about data in segments (compressed rows).
The min/max variant keeps the min and max values inside the compressed
object. It will be used on compression order_by columns to allow
queries with quals on those columns to exclude entire segments
when no uncompressed row in the segment can match the qual.
We also add generalized infrastructure for datum serialization
/ deserialization for arbitrary types to and from memory as well
as binary strings.
Add support for the compress_chunks function.
This also adds support for the compress_orderby and compress_segmentby
parameters in ALTER TABLE. These parameters are used by the
compress_chunks function.
The parsing code will most likely be changed to use the PG raw_parser
function.
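A minimal usage sketch, assuming a hypertable configured with the compression options sketched earlier in these notes (the exact compress_chunks signature and the use of show_chunks here are assumptions):

```sql
-- Compress every chunk of the (hypothetical) metrics hypertable that is
-- older than one week; the signature of compress_chunks is an assumption.
SELECT compress_chunks(chunk)
  FROM show_chunks('metrics', older_than => interval '1 week') AS chunk;
```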
This commit introduces 4 compression algorithms
as well as 3 ADTs to support them. The compression
algorithms are time-series optimized. The following
algorithms are implemented:
- DeltaDelta compresses integer and timestamp values
- Gorilla compresses floats
- Dictionary compression handles any data type
and is optimized for low-cardinality datasets.
- Array stores any data type in an array-like
structure and does not actually compress it (though
TOAST-based compression can be applied on top).
These compression algorithms are fully described in
tsl/src/compression/README.md.
The Abstract Data Types that are implemented are
- Vector - A dynamic vector that can store any type.
- BitArray - A dynamic vector to store bits.
- SimpleHash - A hash table implementation from PG12.
More information can be found in
src/adts/README.md
Add the option to set the next start time on a job in the
alter job schedule function. This also adds the ability
to pause jobs by setting next_start to 'infinity'.
Also fix the enterprise license check so that it only activates for
enterprise jobs.
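A hedged sketch of pausing a job, assuming the function is exposed as alter_job_schedule with a named next_start parameter (the job id is hypothetical):

```sql
-- Setting next_start to 'infinity' pauses job 1000; the exact function
-- signature is an assumption.
SELECT alter_job_schedule(1000, next_start => 'infinity');
```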
This maintenance release contains bugfixes since the 1.4.1 release.
We deem it medium priority for upgrading.
In particular the fixes contained in this maintenance release address
2 potential segfaults and no other security vulnerabilities.
The bugfixes are related to background workers, OUTER JOINs, ordered
append on space partitioned hypertables and expression indexes.
Adds a move_chunk function that moves a chunk to a different
tablespace. This is implemented as an extension to the reorder command.
Given that the heap, toast tables, and indexes are being rewritten
during the reorder operation, adding the ability to modify the tablespace
is relatively simple and mostly requires adding parameters to the relevant
functions for the destination tablespace (and index tablespace). The tests
do not focus on further exercising the reorder infrastructure, but instead
ensure that tablespace movement and permissions checks properly occur.
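A hedged usage sketch (chunk, index, and tablespace names are hypothetical, and the parameter names are assumptions):

```sql
-- Rewrite one chunk, ordered by the given index, into another tablespace,
-- placing its indexes in a (possibly different) index tablespace.
SELECT move_chunk(
    chunk                        => '_timescaledb_internal._hyper_1_4_chunk',
    destination_tablespace       => 'history_space',
    index_destination_tablespace => 'history_space',
    reorder_index                => '_timescaledb_internal._hyper_1_4_chunk_metrics_time_idx'
);
```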
This commit implements functionality for users to give a custom
definition of now() for integer open dimension typed hypertables.
Such a now() function enables us to talk about intervals in the context
of hypertables with integer time columns. To simplify future code,
this commit defines a custom ts_interval type that unites the
usual postgres intervals and integer time dimension intervals under a
single composite type.
The commit also enables adding drop chunks policy on hypertables with
integer time dimensions if a custom now() function has been set.
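A hedged sketch, assuming the registration function is named set_integer_now_func (table, column, and function names are hypothetical):

```sql
-- Hypertable with an integer (bigint) time dimension.
CREATE TABLE events (
    time  bigint NOT NULL,
    value double precision
);
SELECT create_hypertable('events', 'time', chunk_time_interval => 86400000);

-- A custom now() for the integer dimension: milliseconds since the epoch.
CREATE FUNCTION current_epoch_ms() RETURNS bigint
    LANGUAGE sql STABLE
    AS $$ SELECT (extract(epoch FROM now()) * 1000)::bigint $$;

-- Registering it (function name is an assumption) makes interval-based
-- policies such as a drop chunks policy meaningful on this hypertable.
SELECT set_integer_now_func('events', 'current_epoch_ms');
```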
This maintenance release contains bugfixes since the 1.4.0 release. We deem it medium
priority for upgrading.
In particular the fixes contained in this maintenance release address 2 potential
segfaults and no other security vulnerabilities. The bugfixes are related to queries
with prepared statements, PL/pgSQL functions and interoperability with other extensions.
More details below.
**Bugfixes**
* #1362 Fix ConstraintAwareAppend subquery exclusion
* #1363 Mark drop_chunks as VOLATILE and not PARALLEL SAFE
* #1369 Fix ChunkAppend with prepared statements
* #1373 Only allow PARAM_EXTERN as time_bucket_gapfill arguments
* #1380 Handle Result nodes gracefully in ChunkAppend
**Thanks**
* @overhacked for reporting an issue with drop_chunks and parallel queries
* @fvannee for reporting an issue with ConstraintAwareAppend and subqueries
* @rrb3942 for reporting a segfault with ChunkAppend and prepared statements
* @mchesser for reporting a segfault with time_bucket_gapfill and subqueries
* @lolizeppelin for reporting and helping debug an issue with ChunkAppend and Result nodes
Previously, drop_chunks returned an empty table, giving the user
no indication of what (if anything) had happened.
Now, drop_chunks returns a list of chunk identifiers, in the same
style as show_chunks, with the chunk's schema and table name.
Notably, when show_chunks is called directly before drop_chunks, the
output should be the same.
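For example (hypothetical hypertable and chunk names; the output shape follows the description above):

```sql
-- Returns one schema-qualified name per dropped chunk, matching show_chunks:
SELECT drop_chunks(interval '4 weeks', 'conditions');
--               drop_chunks
-- ----------------------------------------
--  _timescaledb_internal._hyper_1_1_chunk
--  _timescaledb_internal._hyper_1_2_chunk
```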
This release contains major new functionality for continuous aggregates
and adds performance improvements for analytical queries.
In version 1.3.0 we added support for continuous aggregates which
was initially limited to one continuous aggregate per hypertable.
With this release, we remove this restriction and allow multiple
continuous aggregates per hypertable.
This release adds a new custom node ChunkAppend that can perform
execution time constraint exclusion and is also used for ordered
append. Ordered append no longer requires a LIMIT clause and now
supports space partitioning and ordering by time_bucket.
The primary key on continuous_aggs_materialization_invalidation_log
prevented storing multiple records with the same materialization id,
even though multiple such records are needed. Remove the primary key
to fix this problem.
Previously, the full telemetry report was returned even if telemetry
was disabled. Now, the user is reassured that telemetry is disabled and
given the option to view the report locally.
The following functions have had permission checks
added or adjusted:
ts_chunk_index_clone
ts_chunk_index_replace
ts_hypertable_insert_blocker_trigger_add
ts_current_license_key
ts_calculate_chunk_interval
ts_chunk_adaptive_set
The following functions have been removed from the regular SQL install.
They are only installed and used in tests:
dimension_calculate_default_range_open
dimension_calculate_default_range_closed
This change renames _timescaledb_catalog.telemetry_metadata to
_timescaledb_catalog.metadata. It also adds a new boolean column to this
table which is used to flag data that should be included in telemetry.
It also renames the src/telemetry/metadata.{h,c} files to
src/telemetry/telemetry_metadata.{h,c} and updates the API to reflect
this. Finally, it includes the logic to use the new boolean column
when populating the telemetry parse state.
This commit adds a cascade_to_materializations flag to the scheduled
version of drop_chunks that behaves much like the one from manual
drop_chunks: if a hypertable that has a continuous aggregate tries to
drop chunks, and this flag is not set, the chunks will not be dropped.
We replace chunk_for_tuple with chunk_id_from_relid for getting
chunk id fields when materializing continuous aggs. The old
function required passing in the entire row. This was very slow
because a lot of data was passed around at execution time.
The new function just uses the internal `tableoid` attribute to
convert the table relid to a chunk_id. This is much more efficient.
We also add memoization to the new function because it is most often
called consecutively for the same chunk.
This commit switches the remaining JOIN in the continuous_aggs_stats
view to LEFT JOIN. This way we'll still see info from the other columns
even when the background worker has not run yet.
This commit also switches the time fields to output text in the correct
format for the underlying time type.
Add a setting max_materialized_per_run which can be set to prevent a
continuous aggregate from materializing too much of the table in a
single run. This will prevent a single run from locking the hypertable
for too long, when running on a large data set.
Add the query definition to
timescaledb_information.continuous_aggregates.
The user query (specified in the CREATE VIEW stmt of a continuous
aggregate) is transformed in the process of creating a continuous
aggregate and this modified query is saved in the pg_rewrite catalog
tables. In order to display the original query, we create an internal
view which is a replica of the user query. This is used to display the
definition in timescaledb_information.continuous_aggregates.
As an alternative we could save the original user query in our internal
catalogs. But this approach involves replicating a lot of postgres code
and causes portability problems.
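A hedged sketch of inspecting the stored definition (the information view's column names are assumptions):

```sql
-- List each continuous aggregate together with its original query definition.
SELECT view_name, view_definition
  FROM timescaledb_information.continuous_aggregates;
```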
The data in caggs needs to survive dump/restore. This
test makes sure that caggs that are materialized both
before and after restore are correct.
Two code changes were necessary to make this work:
1) the valid_job_type constraint on bgw_job needed to be altered to add
'continuous_aggregate' as a valid job type
2) The user_view_query field needed to be changed to text because
dump/restore does not support pg_node_tree.
For hypertables that have continuous aggregates, calling drop_chunks now
drops all of the rows in the materialization table that were based on
the dropped chunks. Since we don't know what the correct default
behavior for drop_chunks is, we've added a new argument,
cascade_to_materializations, which must be set to true in order to call
drop_chunks on a hypertable which has a continuous aggregate.
drop_chunks is blocked on the materialization tables of continuous
aggregates
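A hedged sketch of the required call (hypertable name and time argument are hypothetical):

```sql
-- Without cascade_to_materializations => true this call errors out, because
-- the (hypothetical) conditions hypertable has a continuous aggregate.
SELECT drop_chunks(interval '4 weeks', 'conditions',
                   cascade_to_materializations => true);
```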
This PR deletes related rows from the following tables
* completed_threshold
* invalidation threshold
* hypertable invalidation log
The latter two tables are only affected if no other continuous aggs
exist on the raw hypertable.
This commit also adds locks to prevent concurrent raw table inserts
and any access to the materialization table when dropping caggs. It
also moves all locks to the beginning of the function so that the lock
order is easier to track and reason about.
Also added a few formatting fixes.
Add invalidation trigger for DML changes to the hypertable used in
the continuous aggregate query.
Also add user_view_query definition in continuous_agg catalog table.
This commit adds the actual background worker job that runs the continuous
aggregate automatically. This job gets created when the continuous aggregate is
created and is deleted when the aggregate is dropped. By default this job will
attempt to run every two bucket widths, and attempts to materialize up to two
bucket widths behind the end of the table.
This commit adds initial support for the continuous aggregate materialization
and INSERT invalidations.
INSERT path:
On INSERT, DELETE and UPDATE we log the [max, min] time range that may be
invalidated (that is, newly inserted, updated, or deleted) to
_timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log
will be used to re-materialize these ranges, to ensure that the aggregate
is up-to-date. Currently these invalidations are recorded by a trigger,
_timescaledb_internal.continuous_agg_invalidation_trigger, which should be
added to the hypertable when the continuous aggregate is created. This trigger
stores a cache of min/max values per-hypertable, and on transaction commit
writes them to the log, if needed. At the moment, we consider them to always
be needed, unless we're in ReadCommitted mode or weaker, and the min
invalidated value is greater than the hypertable's invalidation threshold
(found in _timescaledb_catalog.continuous_aggs_invalidation_threshold)
Materialization path:
Materialization currently happens in multiple phases: in phase 1 we determine
the timestamp at which the new set of materializations will end, then we
update the hypertable's invalidation threshold to that point, and finally we
read the current invalidations and materialize any invalidated rows along with
the new range between the continuous aggregate's completed threshold (found in
_timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's
invalidation threshold. After all of this is done we update the completed
threshold to the invalidation threshold. The portion of this protocol from
after the invalidations are read, until the completed threshold is written
(that is, actually materializing, and writing the completion threshold) is
included with this commit, with the remainder to follow in subsequent ones.
One important caveat: since the thresholds are exclusive, we only invalidate
values _less_ than the invalidation threshold, and since time values are
stored as an int64 internally, we can never determine whether the row at
PG_INT64_MAX is invalidated. To avoid this problem, we never materialize the
time bucket containing PG_INT64_MAX.
This PR adds a catalog table for storing metadata about
continuous aggregates. It also adds code for creating the
materialization hypertable and 2 views that are used by the
continuous aggregate system:
1) The user view - This is the actual view queried by the end user.
It is a query on top of the materialized hypertable and is
responsible for finalizing and combining partials in a manner
that returns to the user the data as defined by the original
user-defined view.
2) The partial view - which queries the raw table and returns
columns as defined in the materialized table. This will be used
by the materializer to calculate the data that will be inserted
into the materialization table. Note the data here is the partial
state of any aggregates.
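For context, a hedged sketch of the kind of user-facing definition these objects back (the WITH-option syntax, names, and columns are assumptions):

```sql
-- conditions_hourly is the user view the end user queries; behind it, a
-- materialization hypertable stores partial aggregate state, and a partial
-- view computes those partials from the raw conditions hypertable.
CREATE VIEW conditions_hourly
    WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       device_id,
       avg(temperature) AS avg_temp
  FROM conditions
 GROUP BY time_bucket('1 hour', time), device_id;
```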