timescaledb

mirror of https://github.com/timescale/timescaledb.git synced 2025-05-14 17:43:34 +08:00

Author	SHA1	Message	Date
Konstantina Skovola	6e65172cd8	Fix tablespace for compressed hypertable and corresponding toast If a hypertable uses a non default tablespace, the compressed hypertable and its corresponding toast table and index is still created in the default tablespace. This PR fixes this unexpected behavior and creates the compressed hypertable and its toast table and index in the same tablespace as the hypertable. Fixes #5520	2023-05-02 15:28:00 +03:00
Jan Nidzwetzki	df32ad4b79	Optimize compressed chunk resorting This patch adds an optimization to the DecompressChunk node. If the query 'order by' and the compression 'order by' are compatible (query 'order by' is equal to or a prefix of compression 'order by'), the compressed batches of the segments are decompressed in parallel and merged using a binary heap. This preserves the ordering, so sorting the result can be avoided. Especially LIMIT queries benefit from this optimization because only the first tuples of some batches have to be decompressed. Previously, all segments were completely decompressed and sorted. Fixes: #4223 Co-authored-by: Sotiris Stamokostas <sotiris@timescale.com>	2023-05-02 10:46:15 +02:00
Nikhil Sontakke	ed8ca318c0	Quote username identifier appropriately Need to use quote_ident() on the user roles. Otherwise the extension scripts will fail. Co-authored-by: Mats Kindahl <mats@timescale.com>	2023-04-28 16:53:43 +05:30
Bharathy	2ce4bbc432	Enable continuous_aggs tests on all PG versions.	2023-04-28 07:40:59 +05:30
Zoltan Haindrich	1d092560f4	Fix on-insert decompression after schema changes On compressed hypertables, three schema levels are in use simultaneously: main (hypertable level), chunk (inheritance level), and compressed chunk. In the build_scankeys method all of them appear, as slots have their fields represented as for a row of the main hypertable. Accessing the slot by the attribute numbers of the chunks may lead to indexing mismatches if there are differences between the schemas. Fixes: #5577	2023-04-27 16:33:36 +02:00
Mats Kindahl	be28794384	Enable run_job() for telemetry job Since telemetry job has a special code path to be able to be used both from Apache code and from TSL code, trying to execute the telemetry job with run_job() will fail. This code will allow run_job() to be used with the telemetry job to trigger a send of telemetry data. You have to belong to the group that owns the telemetry job (or be the owner of the telemetry job) to be able to use it. Closes #5605	2023-04-27 16:00:03 +02:00
Fabrízio de Royes Mello	3c8d7cef77	Fix cagg_repair for the old CAgg format In commit 4a6650d1 we fixed the cagg_repair running for broken Continuous Aggregates with JOINs but we accidentally removed the code path for running against the old format (finalized=false) leading us to a dead code pointed out by CoverityScan: https://scan4.scan.coverity.com/reports.htm#v54116/p12995/fileInstanceId=131706317&defectInstanceId=14569420&mergedDefectId=384044 Fixed it by restoring the old code path for running the cagg_repair for Continuous Aggregates in the old format (finalized=false).	2023-04-26 14:29:58 -03:00
Mats Kindahl	d3730a4f6a	Add permission checks to run_job() There were no permission checks when calling run_job(), so it was possible to execute any job regardless of who owned it. This commit adds such checks.	2023-04-26 11:56:56 +02:00
Fabrízio de Royes Mello	4a6650d170	Fix broken CAgg with JOIN repair function The internal `cagg_rebuild_view_definition` function was trying to cast a pointer to `RangeTblRef` but it actually is a `RangeTblEntry`. Fixed it by using the already existing `direct_query` data struct to check if there are JOINs in the CAgg to be repaired.	2023-04-25 15:20:48 -03:00
Ante Kresic	910663d0be	Reduce decompression during UPDATE/DELETE When updating or deleting tuples from a compressed chunk, we first need to decompress the matching tuples then proceed with the operation. This optimization reduces the amount of data decompressed by using compressed metadata to decompress only the affected segments.	2023-04-25 15:49:59 +02:00
Jan Nidzwetzki	c54d8bd946	Add missing order by to compression_ddl tests Some queries in compression_ddl had no order by. Therefore the output order was not defined, which led to flaky tests.	2023-04-21 15:54:25 +02:00
Ante Kresic	23b3f8d7a6	Block unique idx creation on compressed hypertable This block was removed by accident, in order to support this we need to ensure the uniqueness in the compressed data which is something we should do in the future thus removing this block.	2023-04-20 16:22:03 +02:00
Zoltan Haindrich	a0df8c8e6d	Fix on-insert decompression for unique constraints Inserting multiple rows into a compressed chunk could have bypassed constraint check in case the table had segment_by columns. Decompression is narrowed to only consider candidates by the actual segment_by value. Because of caching - decompression was skipped for follow-up rows of the same Chunk. Fixes #5553	2023-04-20 13:47:47 +02:00
Ante Kresic	a49fdbcffb	Reduce decompression during constraint checking When inserting into a compressed chunk with constraints present, we need to decompress relevant tuples in order to do speculative inserting. Usually we used segment by column values to limit the amount of compressed segments to decompress. This change expands on that by also using segment metadata to further filter compressed rows that need to be decompressed.	2023-04-20 12:17:12 +02:00
Konstantina Skovola	5633960f8b	Enable indexscan on uncompressed part of partially compressed chunks This was previously disabled as no data resided on the uncompressed chunk once it was compressed, but this is not the case anymore with partially compressed chunks, so we enable indexscan for the uncompressed chunk again. Fixes #5432 Co-authored-by: Ante Kresic <ante.kresic@gmail.com>	2023-04-18 17:29:08 +03:00
Mats Kindahl	9a64385f34	Use regrole for job owner Instead of using a user name to register the owner of a job, we use regrole. This allows renames to work properly since the underlying OID does not change when the owner name changes. We add a check when calling `DROP ROLE` that there is no job with that owner and generate an error if there is.	2023-04-18 08:57:52 +02:00
Alexander Kuzmenkov	20db884bd7	Increase remote tuple and startup costs Our cost model should be self-consistent, and the relative values for the remote tuple and startup costs should reflect their real cost, relative to costs of other operations like CPU tuple cost. For example, now remote costs are set even lower than the parallel tuple and startup cost. Contrary to that, their real world cost is going to be an order of magnitude higher or more, because parallel tuples are sent through shared memory, and remote tuples are sent over the network. Increasing these costs leads to query plan improvements, e.g. we start to favor the GROUP BY pushdown in some cases.	2023-04-17 22:22:08 +04:00
Lakshmi Narayanan Sreethar	a383c8dd4f	Copy scheduled_jobs list before sorting it The start_scheduled_jobs function mistakenly sorts the scheduled_jobs list in-place. As a result, when the ts_update_scheduled_jobs_list function compares the updated list of scheduled jobs with the existing scheduled jobs list, it is comparing a list that is sorted by job_id to one that is sorted by next_start time. Fix that by properly copying the scheduled_jobs list into a new list and use that for sorting. Fixes #5537	2023-04-14 15:59:57 +05:30
Lakshmi Narayanan Sreethar	b10139ba48	Update bgw_custom testcase Added a few cleanup steps and updated the test logic to make the testcase run more reliably.	2023-04-14 15:59:57 +05:30
Ante Kresic	464d20fb41	Propagate vacuum/analyze to compressed chunks With recent changes, we enabled analyze on uncompressed chunk tables for compressed chunks. This change includes analyzing the compressed chunks table when analyzing the hypertable and its chunks, enabling us to remove the generating stats when compressing chunks.	2023-04-13 12:15:32 +02:00
Rafia Sabih	3f9cb3c27a	Pass join related structs to the cagg rte In case of joins in the continuous aggregates, pass the required structs to the new rte created. These values are required by the planner to finally query the materialized view. Fixes #5433	2023-04-13 04:57:33 +02:00
Fabrízio de Royes Mello	09565acae4	Fix timescaledb_experimental.policies duplicates Commit 16fdb6ca5e introduced `timescaledb_experimental.policies` view to expose the Continuous Aggregate policies but the current JOINS over our catalog are not accurate. Fixed it by properly JOIN the underlying catalog tables to expose the correct information without duplicates about the Continuous Aggregate policies. Fixes #5492	2023-04-12 15:28:29 -03:00
Fabrízio de Royes Mello	f6c8468ee6	Fix timestamp out of range refreshing CAgg When refreshing from the beginning (window_start=NULL) of a Continuous Aggregate with variable time bucket we were getting a `timestamp out of range` error. Fixed it by setting `-Infinity` when passing `window_start=NULL` when refreshing a Continuous Aggregate with variable time bucket. Fixes #5474, #5534	2023-04-12 14:50:04 -03:00
Mats Kindahl	3cc8a4ca34	Fix error message for continuous aggregates Several error messages for continuous aggregates are not following the error message style guidelines at https://www.postgresql.org/docs/current/error-style-guide.html In particular, they do not write the hints and detailed messages as full sentences.	2023-04-12 18:34:56 +02:00
Sven Klemm	f0623a8c38	Skip Ordered Append when only 1 child node is present This is mostly a cosmetic change. When only 1 child is present there is no need for ordered append. In this situation we might still benefit from a ChunkAppend node here due to runtime chunk exclusion when we have non-immutable constraints, so we still add the ChunkAppend node in that situation even with only 1 child.	2023-04-12 13:19:16 +02:00
Sven Klemm	0595ff0888	Move type support functions into _timescaledb_functions schema	2023-04-12 12:48:34 +02:00
Ante Kresic	dc5bf3b32e	Test compression DML with physical layout changes These tests try to verify that changing physical layout of chunks (either compressed or uncompressed) should yield consistent results. They also verify index mapping on compressed chunks is handled correctly.	2023-04-11 17:30:08 +02:00
Zoltan Haindrich	975e9ca166	Fix segfault after column drop on compressed table Decompression produces records which have all the decompressed data set, but it also retains the fields which are used internally during decompression. These didn't cause any problem - unless an operation is being done with the whole row - in which case all the fields which have ended up being non-null can be a potential segfault source. Fixes #5458 #5411	2023-04-06 08:49:54 +02:00
Bharathy	1fb058b199	Support UPDATE/DELETE on compressed hypertables. This patch does the following: 1. Executor changes to parse qual ExprState to check if SEGMENTBY column is specified in WHERE clause. 2. Based on step 1, we build scan keys. 3. Executor changes to do heapscan on compressed chunk based on scan keys and move only those rows which match the WHERE clause to staging area aka uncompressed chunk. 4. Mark affected chunk as partially compressed. 5. Perform regular UPDATE/DELETE operations on staging area. 6. Since there is no Custom Scan (HypertableModify) node for UPDATE/DELETE operations on PG versions < 14, we don't support this feature on PG12 and PG13.	2023-04-05 17:19:45 +05:30
Erik Nordström	2e6c6b5c58	Refactor and optimize distributed COPY Refactor the code path that handles remote distributed COPY. The main changes include: * Use a hash table to lookup data node connections instead of a list. * Refactor the per-data node buffer code that accumulates rows into bigger CopyData messages. * Reduce the default number of rows in a CopyData message to 100. This seems to improve throughput, probably striking a better balance between message overhead and latency. * The number of rows to send in each CopyData message can now be changed via a new foreign data wrapper option.	2023-04-04 15:35:54 +02:00
Rafia Sabih	ff5959f8f9	Handle when FROM clause is missing in continuous aggregate definition It now errors out for such a case. Fixes #5500	2023-03-29 22:29:16 +02:00
Konstantina Skovola	cb81c331ae	Allow named time_bucket arguments in Cagg definition Fixes #5450	2023-03-28 18:45:41 +03:00
Rafia Sabih	98218c1d07	Enable joins for hierarchical continuous aggregates The joins could be between a continuous aggregate and hypertable, continuous aggregate and a regular Postgres table, and continuous aggregate and a regular Postgres view.	2023-03-28 15:12:54 +02:00
Konstantina Skovola	72c0f5b25e	Rewrite recompress_chunk in C for segmentwise processing This patch introduces a C-function to perform the recompression at a finer granularity instead of decompressing and subsequently compressing the entire chunk. This improves performance for the following reasons: - it needs to sort less data at a time and - it avoids recreating the decompressed chunk and the heap inserts associated with that by decompressing each segment into a tuplesort instead. If no segmentby is specified when enabling compression or if an index does not exist on the compressed chunk then the operation is performed as before, decompressing and subsequently compressing the entire chunk.	2023-03-23 11:39:43 +02:00
Fabrízio de Royes Mello	38fcd1b76b	Improve Realtime Continuous Aggregate performance When calling the `cagg_watermark` function to get the watermark of a Continuous Aggregate we execute a `SELECT MAX(time_dimension)` query in the underlying materialization hypertable. The problem is that a `SELECT MAX(time_dimension)` query can be expensive because it will scan all hypertable chunks, increasing the planning time for Realtime Continuous Aggregates. Improved it by creating a new catalog table to serve as a cache table to store the current Continuous Aggregate watermark in the following situations: - Create CAgg: store the minimum value of hypertable time dimension data type; - Refresh CAgg: store the last value of the time dimension materialized in the underlying materialization hypertable (or the minimum value of materialization hypertable time dimension data type if there's no data materialized); - Drop CAgg Chunks: the same as refresh cagg. Closes #4699, #5307	2023-03-22 16:35:23 -03:00
shhnwz	699fcf48aa	Stats improvement for Uncompressed Chunks During compression, autovacuum used to be disabled for the uncompressed chunk and re-enabled after decompression. This leads to Postgres maintenance issues. Let's not disable autovacuum for the uncompressed chunk anymore. Let Postgres take care of the stats in its natural way. Fixes #309	2023-03-22 23:51:13 +05:30
Bharathy	cc51e20e87	Add support for ON CONFLICT DO UPDATE for compressed hypertables This patch fixes execution of INSERT with ON CONFLICT DO UPDATE by removing the error and allowing UPDATE to happen on the given compressed hypertable.	2023-03-20 22:55:27 +05:30
Zoltan Haindrich	790b322b24	Fix DEFAULT value handling in decompress_chunk The SQL function decompress_chunk did not fill in default values during its operation. Fixes #5412	2023-03-16 09:16:50 +01:00
Alexander Kuzmenkov	827684f3e2	Use prepared statements for parameterized data node scans This allows us to avoid replanning the inner query on each new loop, speeding up the joins.	2023-03-15 18:22:01 +04:00
Dmitry Simonenko	f8022eb332	Add additional tests for compression with HA Make sure inserts into compressed chunks work when a DN is down Fix #5039	2023-03-13 17:43:48 +02:00
Sven Klemm	65562f02e8	Support unique constraints on compressed chunks This patch allows unique constraints on compressed chunks. When trying to INSERT into compressed chunks with unique constraints any potentially conflicting compressed batches will be decompressed to let postgres do constraint checking on the INSERT. With this patch only INSERT ON CONFLICT DO NOTHING will be supported. For decompression only segment by information is considered to determine conflicting batches. This will be enhanced in a follow-up patch to also include orderby metadata to require decompressing less batches.	2023-03-13 12:04:38 +01:00
Jan Nidzwetzki	356a20777c	Handle user-defined FDW options properly This patch changes the way user-defined FDW options (e.g., startup costs, per-tuple costs) are handled. So far, these values were retrieved in apply_fdw_and_server_options() but reset to default values afterward.	2023-03-13 10:39:52 +01:00
Alexander Kuzmenkov	e92d5ba748	Add more tests for compression Unit tests for different data sequences, and SQL test for float4.	2023-03-10 20:34:17 +04:00
Jan Nidzwetzki	7b8177aa74	Fix file trailer handling in the COPY fetcher The copy fetcher fetches tuples in batches. When the last element in the batch is the file trailer, the trailer was not handled correctly. The existing logic did not perform a PQgetCopyData in that case. Therefore the state of the fetcher was not set to EOF and the copy operation was not correctly finished at this point. Fixes: #5323	2023-03-09 14:29:06 +01:00
Bharathy	f54dd7b05d	Fix SEGMENTBY columns predicates to be pushed down WHERE clause with SEGMENTBY column of type text/bytea non-equality operators are not pushed down to Seq Scan node of compressed chunk. This patch fixes this issue. Fixes #5286	2023-03-08 19:17:43 +05:30
Erik Nordström	c76a0cff68	Add parallel support for partialize_agg() Make `partialize_agg()` support parallel query execution. To make this work, the finalize node needs to combine the individual partials from each parallel worker, but the final step that turns the resulting partial into the finished aggregate should not happen. Thus, in the case of distributed hypertables, each data node can run a parallel query to compute a partial, and the access node can later combine and finalize these partials into the final aggregate. Essentially, there will be one combine step (minus final) on each data node, and then another one plus final on the access node. To implement this, the finalize aggregate plan is simply modified to elide the final step, and to reserialize the partial. It is only possible to do this at the plan stage; if done at the path stage, the PostgreSQL planner will hit assertions that assume that the node has certain values (e.g., it doesn't expect combine Paths to skip the final step).	2023-03-08 14:14:25 +01:00
Konstantina Skovola	5a3cacd06f	Fix sub-second intervals in hierarchical caggs Previously we used date_part("epoch", interval) and integer division internally to determine whether the top cagg's interval is a multiple of its parent's. This led to precision loss and wrong results in the case of intervals with sub-second components. Fixed by using the `ts_interval_value_to_internal` function to convert intervals to appropriate integer representation for division. Fixes #5277	2023-03-07 13:25:49 +02:00
Ildar Musin	4c0075010d	Add hooks for hypertable drops To properly clean up the OSM catalog we need a way to reliably track hypertable deletion (including internal hypertables for CAGGS).	2023-03-06 15:10:49 +01:00
Fabrízio de Royes Mello	32046832d3	Fix Hierarchical CAgg chunk_interval_size When a Continuous Aggregate is created the `chunk_interval_size` is defined by the `chunk_interval_size` of the original hypertable multiplied by a fixed factor of 10. The problem is that currently, when we create a Hierarchical Continuous Aggregate, the same factor is applied and it leads to an exponential `chunk_interval_size`. Fixed it by just copying the `chunk_interval_size` from the base Continuous Aggregate for a Hierarchical Continuous Aggregate. Fixes #5382	2023-03-03 12:31:24 -03:00
Sotiris Stamokostas	750e69ede1	Renamed size_utils.sql Renamed: tsl/test/sql/size_utils.sql tsl/test/expected/size_utils.out To: tsl/test/sql/size_utils_tsl.sql tsl/test/expected/size_utils_tsl.out because conflicting with test/sql/size_utils.sql	2023-03-02 13:20:08 +02:00

1065 Commits