Change from using the `compress_using` parameter, which takes a table
access method name, to the boolean parameter
`hypercore_use_access_method`, so that no name has to be provided when
using the table access method for compression.
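A minimal usage sketch based on the parameter names above; the chunk
name is hypothetical, and the assumption is that `compress_chunk()`
accepts the new boolean in place of the old parameter:

    -- Opt in to the table access method without naming it explicitly.
    SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk',
                          hypercore_use_access_method => true);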
The `compress_chunk()` function can now be used to create hyperstores
by passing the option `compress_using => 'hyperstore'` to the
function.
Using the `compress_chunk()` function is an alternative to using
`ALTER TABLE my_hyperstore SET ACCESS METHOD` that is compatible with
the existing way of compressing hypertable chunks. It will also make
it easier to support hyperstore compression via compression policies.
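A sketch of the two equivalent paths described above; the chunk name is
hypothetical:

    -- Via the table access method directly:
    ALTER TABLE _timescaledb_internal._hyper_1_1_chunk
        SET ACCESS METHOD hyperstore;

    -- Via the compression API, compatible with the existing way of
    -- compressing hypertable chunks:
    SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk',
                          compress_using => 'hyperstore');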
Additionally, implement "fast migration" to hyperstore when a table is
already compressed. In that case, simply update the PG catalog to say
that the table is using hyperstore as TAM without rewriting the
table. This fast migration works with both `...SET ACCESS METHOD`
and `compress_chunk()`.
Implement vacuum by internally calling vacuum on both the compressed
and non-compressed relations.
Since hyperstore indexes are defined on the non-compressed relation,
vacuuming the compressed relation won't clean up compressed tuples
from those indexes. To handle this, a proxy index is defined on each
compressed relation in order to direct index vacuum calls to the
corresponding indexes on the hyperstore relation. The proxy index also
translates the encoded TIDs stored in the index to proper TIDs for the
compressed relation.
Add a check that SELECT FOR UPDATE does not crash, as well as an isolation
test to make sure that it locks rows properly.
Also add a debug function to check whether a TID refers to a compressed tuple.
Implement the table-access method API around compression in order to
have, among other things, seamless index support on compressed data.
The current functionality is rudimentary but common operations work,
including sequence scans.
This release adds support for PostgreSQL 17, significantly improves the
performance of continuous aggregate refreshes,
and contains performance improvements for analytical queries and delete
operations over compressed hypertables.
We recommend that you upgrade at the next available opportunity.
**Highlighted features in TimescaleDB v2.17.0**
* Full PostgreSQL 17 support for all existing features. TimescaleDB
v2.17 is available for PostgreSQL 14, 15, 16, and 17.
* Significant performance improvements for continuous aggregate
policies: continuous aggregate refresh now uses
`merge` instead of deleting old materialized data and re-inserting.
This update can dramatically decrease the amount of data that must be
written to the continuous aggregate when only a small number of changes
is present, reduce the `i/o` cost of refreshing a continuous aggregate,
and generate less Write-Ahead Log (`WAL`).
Overall, continuous aggregate policies will be more lightweight, use
fewer system resources, and complete faster.
* Increased performance for real-time analytical queries over compressed
hypertables:
we are excited to introduce additional Single Instruction, Multiple Data
(`SIMD`) vectorization optimization to our
engine by supporting vectorized execution for queries that group by
the `segment_by` column(s) and aggregate using the basic aggregate
functions (`sum`, `count`, `avg`, `min`, `max`); see the example
queries after this list.
Stay tuned for more to come in follow-up releases! Support for grouping
on additional columns, filtered aggregation,
vectorized expressions, and `time_bucket` is coming soon.
* Improved performance of deletes on compressed hypertables when a large
amount of data is affected.
This improvement speeds up operations that delete whole segments by
skipping the decompression step.
It is enabled for all deletes that filter by the `segment_by` column(s).
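For illustration, a hypothetical compressed hypertable `metrics`
segmented by a `device_id` column would benefit from both improvements
above:

    -- Vectorized aggregation: grouping on the segment_by column with
    -- basic aggregate functions.
    SELECT device_id, sum(value), min(value), max(value)
    FROM metrics
    GROUP BY device_id;

    -- Segment-wise delete: filtering only on the segment_by column lets
    -- whole compressed segments be deleted without decompression.
    DELETE FROM metrics WHERE device_id = 42;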
**PostgreSQL 14 deprecation announcement**
We will continue supporting PostgreSQL 14 until April 2025. Closer to
that time, we will announce the specific
version of TimescaleDB in which PostgreSQL 14 support will not be
included going forward.
**Features**
* #6882: Allow delete of full segments on compressed chunks without
decompression.
* #7033: Use the `merge` statement on continuous aggregate refresh.
* #7126: Add functions to show the compression information.
* #7147: Vectorize partial aggregation for `sum(int4)` with grouping on
`segment by` columns.
* #7204: Track additional extensions in telemetry.
* #7207: Refactor the `decompress_batches_scan` functions for easier
maintenance.
* #7209: Add a function to drop the `osm` chunk.
* #7275: Add support for the `returning` clause for `merge`.
* #7200: Vectorize common aggregate functions like `min`, `max`, `sum`,
`avg`, `stddev`, `variance` for compressed columns
of arithmetic types, when there is grouping on `segment by` columns or
no grouping.
**Bug fixes**
* #7187: Fix the string literal length for the `compressed_data_info`
function.
* #7191: Fix creating default indexes on chunks when migrating the data.
* #7195: Fix the `segment by` and `order by` checks when dropping a
column from a compressed hypertable.
* #7201: Use the generic extension description when building `apt` and
`rpm` loader packages.
* #7227: Add an index to the `compression_chunk_size` catalog table.
* #7229: Fix the foreign key constraints where the index and the
constraint column order are different.
* #7230: Do not propagate the foreign key constraints to the `osm`
chunk.
* #7234: Release the cache after accessing the cache entry.
* #7258: Force English in the `pg_config` command executed by `cmake` to
avoid unexpected build errors.
* #7270: Fix the memory leak in compressed DML batch filtering.
* #7286: Fix the index column check while searching for the index.
* #7290: Add check for null offset for continuous aggregates built on
top of continuous aggregates.
* #7301: Make foreign key behavior for hypertables consistent.
* #7318: Fix chunk skipping range filtering.
* #7320: Set the license specific extension comment in the install
script.
**Thanks**
* @MiguelTubio for reporting and fixing the Windows build error.
* @posuch for reporting the misleading extension description in the generic loader packages.
* @snyrkill for discovering and reporting the issue with continuous
aggregates built on top of continuous aggregates.
---------
Signed-off-by: Pallavi Sontakke <pallavi@timescale.com>
Signed-off-by: Yannis Roussos <iroussos@gmail.com>
Signed-off-by: Sven Klemm <31455525+svenklemm@users.noreply.github.com>
Co-authored-by: Yannis Roussos <iroussos@gmail.com>
Co-authored-by: atovpeko <114177030+atovpeko@users.noreply.github.com>
Co-authored-by: Sven Klemm <31455525+svenklemm@users.noreply.github.com>
Sequence numbers were an optimization for ordering batches based on the
orderby configuration setting. They were used for ordered append and for
avoiding sorting compressed data when it matched the query ordering.
However, now that changes to compressed data are enabled, the
bookkeeping of sequence numbers has become more of a hassle. Removing
them and using the metadata columns for ordering reduces that burden
while keeping all the existing optimizations that relied on the
sequence numbers in place.
During upgrade, the function `remove_dropped_chunk_metadata` is used to
update the metadata tables and remove data for chunks marked as
dropped. The function iterates over the chunks of the provided hypertable
and internally does a sequence scan of the `compression_chunk_size` table
to locate the `compressed_chunk_id`, resulting in quadratic execution
time. This is usually not noticeable for a small number of chunks, but
for a large number of chunks it becomes a problem.
This commit fixes that by adding an index to the `compression_chunk_size`
catalog table, turning the sequence scan into an index scan.
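A minimal sketch of the kind of index this adds; the actual index name
and definition used in the catalog may differ:

    CREATE INDEX compression_chunk_size_idx
        ON _timescaledb_catalog.compression_chunk_size (compressed_chunk_id);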
Add a function that can be used on a compressed data value to show
some metadata information, such as the compression algorithm used and
the presence of any null values.
Don't copy foreign key constraints to the individual chunks; instead,
modify the constraint lookup query to propagate to individual chunks,
mimicking how PostgreSQL does this for partitioned tables.
This patch also removes the requirement for foreign key columns
to be segmentby columns.
Allow users to specify that ranges (min/max values) be tracked
for a specific column using the enable_column_stats() API. We
will store such min/max ranges in a new timescaledb catalog table,
_timescaledb_catalog.chunk_column_stats. As of now we support tracking
min/max ranges for the smallint, int, bigint, serial, bigserial, date,
timestamp, and timestamptz data types. Support for other statistics,
such as bloom filters, will be added in the future.
We add an entry of the form (ht_id, invalid_chunk_id, col, -INF, +INF)
into this catalog to indicate that min/max values need to be calculated
for this column of the given hypertable's chunks. We also iterate
through existing chunks and add -INF, +INF entries for them in the
catalog. This allows these chunks to be selected by default since no
min/max values have been calculated for them.
The actual min/max start/end range is calculated later. One of the
entry points is during compression for now. The range is stored in
start (inclusive) and end (exclusive) form. If DML happens on a
compressed chunk, then as part of marking it as partial we also mark
the corresponding catalog entries as "invalid", so partial chunks do
not get excluded further. When recompression happens, we get the new
min/max ranges from the uncompressed portion and try to reconcile the
ranges in the catalog based on these new values. This is safe to do in
the case of INSERTs and UPDATEs. In the case of DELETEs, since we are
deleting rows, it's possible that the min/max ranges change, but as of
now we err on the side of caution and retain the earlier values, which
can be larger than the actual range.
We can thus store the min/max values for such columns in this catalog
table at the per-chunk level. Note that these min/max range values do
not participate in partitioning of the data. Such data ranges will be
used for chunk pruning if the WHERE clause of an SQL query specifies
ranges on such a column.
Note that executor startup-time chunk exclusion logic is also able to
use this metadata effectively.
A "DROP COLUMN" on a column with statistics tracking enabled on it
removes all relevant entries from the catalog tables.
A "decompress_chunk" on a compressed chunk removes its entries from the
"chunk_column_stats" catalog table since now it's available for DML.
Also a new "disable_column_stats" API has been introduced to allow
removal of min/max entries from the catalog for a specific column.
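A usage sketch based on the API names above; the exact signatures are an
assumption, and the table and column names are hypothetical:

    -- Start tracking min/max ranges for a column; ranges are calculated
    -- at compression time and stored in
    -- _timescaledb_catalog.chunk_column_stats.
    SELECT enable_column_stats('conditions', 'updated_at');

    -- Queries that constrain the tracked column can now prune chunks
    -- using the stored ranges.
    SELECT * FROM conditions
    WHERE updated_at BETWEEN '2024-01-01' AND '2024-01-31';

    -- Stop tracking and remove the column's entries from the catalog.
    SELECT disable_column_stats('conditions', 'updated_at');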
This is a small refactoring for getting the time bucket function Oid from
a view definition. It will be necessary for follow-up PRs that will
completely remove the unnecessary catalog metadata table
`continuous_aggs_bucket_function`.
Also added a new SQL function `cagg_get_bucket_function_info` to return
all `time_bucket` information based on a user view definition.
In #6624 we refactored the time bucket catalog table to make it more
generic and save information for all Continuous Aggregates. Previously
it stored only variable bucket size information.
The problem is we used the `regprocedure` type to store the OID of the
given time bucket function but unfortunately it is not supported by
`pg_upgrade`.
Fixed it by changing the column to TEXT and resolving to/from OID using
the builtin `regprocedurein` and `format_procedure_qualified` functions.
Fixes #6935
We shouldn't reuse job ids, so that it stays easy to recognize the
log entries for a job. We also need to keep the old job around
so as not to break loading dumps from older versions.
The function time_bucket_ng is deprecated. This PR adds a migration path
for existing CAggs. Since time_bucket and time_bucket_ng use different
origin values, a custom origin is set if needed to let time_bucket
create the same buckets that time_bucket_ng has created so far.
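A sketch of the underlying idea; the concrete origin chosen by the
migration depends on the CAgg's bucket definition, and the names and
values below are only illustrative:

    -- time_bucket with an explicit origin can reproduce the buckets that
    -- time_bucket_ng produced with its own default origin.
    SELECT time_bucket('1 month', ts, origin => timestamptz '2000-01-01')
    FROM my_hypertable;  -- my_hypertable and ts are hypothetical names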
In #6767 we introduced the ability to track job execution history,
including succeeded and failed jobs.
The new metadata table `_timescaledb_internal.bgw_job_stat_history` has
two JSONB columns: `config` (stores config information) and `error_data`
(stores the ErrorData information). The problem is that this approach is
not flexible enough for future changes to history recording, so this PR
refactors the current implementation to use only one JSONB column named
`data` that stores more job information in the following form:
{
  "job": {
    "owner": "fabrizio",
    "proc_name": "error",
    "scheduled": true,
    "max_retries": -1,
    "max_runtime": "00:00:00",
    "proc_schema": "public",
    "retry_period": "00:05:00",
    "initial_start": "00:05:00",
    "fixed_schedule": true,
    "schedule_interval": "00:00:30"
  },
  "config": {
    "bar": 1
  },
  "error_data": {
    "domain": "postgres-16",
    "lineno": 841,
    "context": "SQL statement \"SELECT 1/0\"\nPL/pgSQL function error(integer,jsonb) line 3 at PERFORM",
    "message": "division by zero",
    "filename": "int.c",
    "funcname": "int4div",
    "proc_name": "error",
    "sqlerrcode": "22012",
    "proc_schema": "public",
    "context_domain": "plpgsql-16"
  }
}
In #4678 we added an interface for troubleshooting job failures by
logging them in the metadata table `_timescaledb_internal.job_errors`.
With this PR we extended the existing interface to also store succeeded
executions. A new GUC named `timescaledb.enable_job_execution_logging`
was added to control this new behavior; its default value is `false`.
We renamed the metadata table to `_timescaledb_internal.bgw_job_stat_history`
and added a new view `timescaledb_information.job_history` so that users
with enough permissions can check the job execution history.
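A minimal usage sketch based on the GUC and view names above; how the
GUC is best set (per session, `ALTER SYSTEM`, or the configuration file)
depends on its context:

    ALTER SYSTEM SET timescaledb.enable_job_execution_logging = 'on';
    SELECT pg_reload_conf();

    -- Later, inspect both failed and successful runs (requires
    -- sufficient permissions):
    SELECT * FROM timescaledb_information.job_history;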
This changes the behavior of the CAgg catalog tables. From now on, all
CAggs that use a time_bucket function create an entry in the catalog
table continuous_aggs_bucket_function. In addition, the duplicate
bucket_width attribute is removed from the catalog table continuous_agg.
Historically, we have used an empty string for undefined values in the
catalog table continuous_aggs_bucket_function. Since #6624, the optional
arguments can be NULL. This patch cleans up the empty strings and
changes the logic to work with NULL values.
So far, bucket_origin was defined as a Timestamp but used as a
TimestampTz in many places. This commit changes this and unifies the
usage of the variable.
The catalog table continuous_aggs_bucket_function is currently only used
for variable bucket sizes. Information about the fixed-size buckets is
stored in the table continuous_agg only. This causes some problems
(e.g., we have redundant fields for the bucket_size, fixed-size buckets
with offsets are not supported, ...).
This commit is the first in a series of commits that refactor the catalog
for the CAgg time_bucket function. The goals are:
* Remove the CAgg redundant attributes in the catalog
* Create an entry in continuous_aggs_bucket_function for all CAggs
that use time_bucket
This first commit refactors the continuous_aggs_bucket_function table
and prepares it for more generic use. Not all attributes are used yet,
but this will change in follow-up PRs.
Historically we have preserved chunk metadata because the old format of
the Continuous Aggregate has the `chunk_id` column in the materialization
hypertable, so in order not to leave chunk ids behind there we just
mark chunks as dropped when dropping them.
In #4269 we introduced a new Continuous Aggregate format that doesn't
store the `chunk_id` in the materialization hypertable anymore, so it's
safe to also remove the metadata when dropping a chunk if all associated
Continuous Aggregates are in the new format.
Also added a post-update SQL script to clean up unnecessary dropped-chunk
metadata in our catalog.
Closes #6570
This patch deprecates the recompress_chunk procedure as all that
functionality is covered by compress_chunk now. This patch also adds a
new optional boolean argument to compress_chunk to force applying
changed compression settings to existing compressed chunks.
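A usage sketch; the name of the new boolean argument is an assumption
here, since the text above does not give it, and the chunk name is
hypothetical:

    -- Re-apply changed compression settings to an already compressed chunk.
    SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk',
                          recompress => true);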
If a lot of chunks are involved, then the current pl/pgsql function
that computes the size of each chunk via a nested loop is pretty slow.
Additionally, the current functionality makes a system call to get the
file size on disk for each chunk every time this function is called.
That again slows things down. We now have an approximate function, which
is implemented in C to avoid the issues in the pl/pgsql function.
Additionally, this function also uses per-backend caching via the
smgr layer to compute the approximate size cheaply. PG cache
invalidation clears the cached size for a chunk when DML happens
on it, so the size cache is able to pick up the latest size within a
matter of minutes. Also, due to the backend caching, any long-running
session will only fetch the latest data for new or modified chunks and
can use the cached data (which is calculated afresh the first time
around) effectively for older chunks.
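A usage sketch; the function name below is an assumption, since the text
above does not name the new C function, and `conditions` is a
hypothetical hypertable:

    -- Approximate total hypertable size, served from the per-backend
    -- smgr-based cache where possible.
    SELECT hypertable_approximate_size('conditions');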
Logging and caching related tables from the timescaledb extension
should not be dumped using pg_dump. Our scripts specify a few such
unwanted tables. Apart from being unnecessary, the "job_errors" table had
some restricted permissions, causing additional problems in pg_dump.
We now don't include such tables for dumping.
Fixes #5449
This patch changes the dump configuration for
_timescaledb_catalog.metadata to include all entries. To allow loading
logical dumps with this configuration, an insert trigger is added that
turns uniqueness conflicts into updates so as not to block the restore.
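A minimal sketch of the conflict-to-update idea, assuming the metadata
table keeps a `key`/`value` pair per entry; the trigger and function
names are hypothetical, and the actual trigger shipped with the
extension may differ:

    CREATE FUNCTION metadata_insert_as_update() RETURNS trigger AS $$
    BEGIN
        -- If the key already exists, turn the insert into an update.
        UPDATE _timescaledb_catalog.metadata SET value = NEW.value
         WHERE key = NEW.key;
        IF FOUND THEN
            RETURN NULL;  -- swallow the conflicting insert
        END IF;
        RETURN NEW;       -- no conflict; proceed with the insert
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER metadata_insert_trigger
        BEFORE INSERT ON _timescaledb_catalog.metadata
        FOR EACH ROW EXECUTE FUNCTION metadata_insert_as_update();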
This patch implements changes to the compressed hypertable to allow per
chunk configuration. To enable this the compressed hypertable can no
longer be in an inheritance tree as the schema of the compressed chunk
is determined by the compression settings. While this patch implements
all the underlying infrastructure changes, the restrictions for changing
compression settings remain intact and will be lifted in a follow-up patch.
The extension state is not easily accessible in release builds, which
makes debugging issues with the loader very difficult. This commit
introduces a new schema `_timescaledb_debug` and makes the function
`ts_extension_get_state` available also in release builds as
`_timescaledb_debug.extension_state`.
See #1682
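A usage sketch based on the schema and function name above:

    -- Inspect the loader's view of the extension state in a release build.
    SELECT _timescaledb_debug.extension_state();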
Remove the code used by multinode to handle remote connections.
This patch completely removes tsl/src/remote and any remaining
distributed hypertable checks.