This release contains performance improvements and bug fixes since
the 2.17.0 release. We recommend that you upgrade at the next
available opportunity.
**Features**
* #7360 Add chunk skipping GUC
**Bugfixes**
* #7335 Change log level used in compression
* #7342 Fix collation for in-memory tuple filtering
**Thanks**
* @gmilamjr for reporting an issue with the log level of compression messages
* @hackbnw for reporting an issue with collation during tuple filtering
The join test could sometimes pick a seqscan+sort instead of an
indexscan when doing a MergeAppend+MergeJoin. Disabling seqscan should
make it deterministic.
Having multiple indexes that include the same prefix of columns caused the
planner to sometimes pick a different index for one of the queries,
which led to different test output. Temporarily remove the alternative
index to make the test predictable.
When queried from within the action context, .authorAssociation is not
filled in as MEMBER but as CONTRIBUTOR instead, so adjust the query to
take that into account.
Previously, ordered queries on realtime caggs would always lead to a full
table scan, as the query plan would have a sort with the limit on top.
With this patch, the ORDER BY can be pushed down so the query can benefit
from the ordered append optimization and does not require a full table
scan.
Since the internal structure is different on PG 14 and 15, this
optimization will only be available on PG 16 and 17.
Fixes #4861
All PRs except trivial ones should require 2 approvals. Since this is a global
setting, we cannot allow trivial PRs to have only 1 approval from the GitHub
configuration alone. So we set the required approvals in GitHub to 1 and make this
check required, which enforces 2 approvals unless it is overridden or only CI files
are touched.
The Hypercore table access method (TAM) wraps TimescaleDB's columnar
compression engine in a table access method. The TAM API enables
several features that were previously not available on compressed
data, including (but not limited to):
- Ability to build indexes on compressed data (btree, hash)
- Proper statistics, including column stats via ANALYZE
- Better support for vacuum and vacuum full
- Skip-scans on top of compressed data
- Better support for DML (copy/insert/update) directly on compressed
  chunks
- Ability to dynamically create constraints (check, unique, etc.)
- Better lock handling including via CURSORs
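As a rough orientation, the sketch below shows the general pattern by which a
custom table access method is exposed to PostgreSQL; the handler name and the
delegation to heap's routine are illustrative only, not Hypercore's actual
implementation.

```c
#include "postgres.h"
#include "fmgr.h"
#include "access/tableam.h"

/* Illustrative routine; the real TAM fills in its own callbacks. */
static TableAmRoutine hypercore_methods;

PG_FUNCTION_INFO_V1(hypercore_handler);

Datum
hypercore_handler(PG_FUNCTION_ARGS)
{
	static bool initialized = false;

	if (!initialized)
	{
		/* Start from heap's callbacks and override selectively (sketch only) */
		hypercore_methods = *GetHeapamTableAmRoutine();
		/* ... override scan, DML, index-build, vacuum callbacks here ... */
		initialized = true;
	}

	PG_RETURN_POINTER(&hypercore_methods);
}
```

Once such a handler is registered via `CREATE ACCESS METHOD ... TYPE TABLE
HANDLER ...`, a table can use it through `CREATE TABLE ... USING ...` or
`ALTER TABLE ... SET ACCESS METHOD ...`.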
PG17 changed the TAM API to use the new `ReadStream` API instead of the
previous block-oriented API. This commit ports the existing
block-oriented solution to use the new `ReadStream` API by setting up
separate read streams for the two relations and using the provided read
stream as a block sampler, fetching the appropriate block from either
the non-compressed or compressed relation.
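The sketch below illustrates the general shape of the PG17 `ReadStream`
pattern (a callback supplying block numbers, one stream per relation); the
sampler state and function names are assumptions for illustration, not the
actual hyperstore code.

```c
#include "postgres.h"
#include "storage/bufmgr.h"
#include "storage/read_stream.h"

/* Hypothetical sampler state feeding block numbers to the read stream */
typedef struct SamplerState
{
	BlockNumber next;
	BlockNumber nblocks;
} SamplerState;

/* Callback: return the next block to read, or InvalidBlockNumber when done */
static BlockNumber
sampler_next_block_cb(ReadStream *stream, void *callback_private_data,
					  void *per_buffer_data)
{
	SamplerState *state = (SamplerState *) callback_private_data;

	if (state->next >= state->nblocks)
		return InvalidBlockNumber;
	return state->next++;
}

/* Set up a stream for one relation and consume the sampled blocks */
static void
scan_sampled_blocks(Relation rel, SamplerState *state)
{
	ReadStream *stream;
	Buffer		buf;

	stream = read_stream_begin_relation(READ_STREAM_DEFAULT,
										NULL,	/* no buffer access strategy */
										rel,
										MAIN_FORKNUM,
										sampler_next_block_cb,
										state,
										0);		/* no per-buffer data */

	while ((buf = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
	{
		/* ... inspect the sampled block's tuples here ... */
		ReleaseBuffer(buf);
	}
	read_stream_end(stream);
}
```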
The macro `TS_WITH_MEMORY_CONTEXT` was used to switch the memory context
for a block of code and restore it afterwards. This pattern is now checked
using Coccinelle rules instead, and the macro is removed.
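For reference, this is a minimal sketch of the plain PostgreSQL idiom that
the macro wrapped (the function and the target context are illustrative):

```c
#include "postgres.h"
#include "utils/memutils.h"

static void
example_with_context(MemoryContext target_ctx)
{
	/* Switch to the target context for the allocations below ... */
	MemoryContext oldctx = MemoryContextSwitchTo(target_ctx);

	/* ... allocations made here live in target_ctx ... */
	char	   *buf = palloc(128);

	(void) buf;

	/* ... and restore the previous context afterwards. */
	MemoryContextSwitchTo(oldctx);
}
```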
Truncate the compressed relation when truncating a hyperstore relation.
This can happen in two situations: either in a non-transactional context
or in a transactional context.
For the transactional context, `relation_set_new_filelocator`
will be called to replace the file locator. If this happens, we need to
replace the file locator for the compressed relation as well, if there
is one.
For the non-transactional case, `relation_nontransactional_truncate`
will be called, and we will just forward the call to the compressed
relation as well, if it exists.
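A minimal sketch of the non-transactional case, assuming a hypothetical
`get_compressed_relid()` helper; the actual callback in hyperstore may
differ.

```c
#include "postgres.h"
#include "access/table.h"
#include "access/tableam.h"
#include "storage/lockdefs.h"
#include "utils/rel.h"

/* Hypothetical: look up the compressed relation for a hyperstore chunk */
extern Oid	get_compressed_relid(Relation chunk_rel);

/*
 * Sketch: truncate the main (non-compressed) relation via heap's callback
 * and forward the truncate to the compressed relation, if one exists.
 */
static void
hyperstore_relation_nontransactional_truncate(Relation rel)
{
	Oid			compressed_relid = get_compressed_relid(rel);

	GetHeapamTableAmRoutine()->relation_nontransactional_truncate(rel);

	if (OidIsValid(compressed_relid))
	{
		Relation	crel = table_open(compressed_relid, AccessExclusiveLock);

		table_relation_nontransactional_truncate(crel);
		table_close(crel, NoLock);	/* keep the lock until end of transaction */
	}
}
```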
If an index scan is on segment-by columns only, the index is optimized
to only contain references to complete segments. However, deciding if a
scan is only on segment-by columns requires checking all columns used
in the index scan. Since this does not change during a scan, but would
otherwise need to be checked for each tuple, we cache this information
for the duration of the scan.
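A simplified sketch of this compute-once-per-scan caching, with hypothetical
state and field names:

```c
#include "postgres.h"

/* Hypothetical per-scan state; names are illustrative only */
typedef struct IndexScanInfo
{
	int			natts;			/* number of columns used by the scan */
	const bool *is_segmentby;	/* per-column segment-by flags */

	/* Cached result: -1 = not yet computed, 0 = false, 1 = true */
	int			segmentby_only;
} IndexScanInfo;

/* Decide once per scan whether only segment-by columns are referenced */
static bool
scan_is_segmentby_only(IndexScanInfo *info)
{
	if (info->segmentby_only < 0)
	{
		bool		only_segmentby = true;

		for (int i = 0; i < info->natts; i++)
		{
			if (!info->is_segmentby[i])
			{
				only_segmentby = false;
				break;
			}
		}
		info->segmentby_only = only_segmentby ? 1 : 0;
	}
	return info->segmentby_only == 1;
}
```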
The index build function didn't properly handle the case when all
rolled-up values in a compressed column were null, resulting in a null
segment. The code has been slightly refactored to handle this case.
A test is also added for this case.
Dropped columns need to be included in a tuple table slot's values
array after having called slot_getsomeattrs(). The arrow slot didn't
do this and instead skipped dropped columns, which led to assertion
errors in some cases.
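The sketch below illustrates why dropped columns must still occupy a
position in the values array; the materialization loop is illustrative, not
the actual arrow slot code.

```c
#include "postgres.h"
#include "access/tupdesc.h"
#include "executor/tuptable.h"

/*
 * Fill tts_values/tts_isnull up to natts. Dropped columns must still get
 * an entry (as NULL) so that attribute offsets stay aligned; skipping them
 * would shift later columns and trip assertions.
 */
static void
fill_slot_values(TupleTableSlot *slot, int natts)
{
	TupleDesc	tupdesc = slot->tts_tupleDescriptor;

	for (int attoff = 0; attoff < natts; attoff++)
	{
		Form_pg_attribute attr = TupleDescAttr(tupdesc, attoff);

		if (attr->attisdropped)
		{
			/* Keep the position, but mark the value as NULL */
			slot->tts_values[attoff] = (Datum) 0;
			slot->tts_isnull[attoff] = true;
			continue;
		}

		/* ... materialize the real value for this column here ... */
	}
	slot->tts_nvalid = natts;
}
```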
When recompressing a Hyperstore after changing compression settings,
the compressed chunk could be created twice, leading to a conflict
error when inserting two compression chunk size rows.
The reason this happened was that Hyperstore creates a compressed chunk
on demand if it doesn't exist when the relation is opened, and the
recompression code had not yet associated the compressed chunk with the
main chunk when compressing the data.
Fix this by associating the compressed chunk with the main chunk
before opening the main chunk relation to compress the data.
When populating an Arrow slot's tts_values array with values in the
getsomeattrs() function, the function set_attr_value() is called. This
function requires passing in an ArrowArray which is acquired via a
compression cache lookup. However, that lookup is not necessary for
segmentby columns (which aren't compressed) and, to avoid it, a
special fast path was created for segmentby columns outside
set_attr_value(). That, unfortunately, created some code duplication.
This change moves the cache lookup into set_attr_value() instead,
where it can be performed only for the columns that need it. This
leads to cleaner code and less code duplication.
When copying from a non-arrow slot to an arrow slot, we should always
copy the data into the non-compressed slot and never into the compressed
slot.
The previous check for a matching number of attributes fails when you
drop a column from the hyperstore.
We disable hash aggregation in favor of group aggregation to get a stable
test. The test was flaky because the planner could pick either group
aggregate or hash aggregate.
If you set the table access method for a hypertable, all new chunks will
use `ArrowTupleTableSlot`, but the copy code assumes that the parent
table has a virtual tuple table slot. This causes a crash when copying
a heap tuple, since the values are stored in the "main" slot and not in
either of the child tuple table slots.
Fix this issue by storing the values in the uncompressed slot when it
is empty.
If an attempt is made to use the hyperstore table access method with a
plain table during creation, throw an error instead of allowing the
table access method to be used.
The table access method currently only supports hypertables and expects
chunks to exist for the table.
When a columnar scan needs to return a subset of the columns in a scan
relation, it is possible to do a "simple" projection that just copies
the column values to the projection result slot. This avoids a more
costly projection done by PostgreSQL.
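A simplified sketch of such a "simple" projection, assuming a precomputed
attribute map from result column to scan column; the names and the map are
illustrative, not ColumnarScan's actual code.

```c
#include "postgres.h"
#include "executor/tuptable.h"

/*
 * "Simple" projection: the result is just a subset/permutation of the scan
 * slot's columns, so copy the values directly instead of evaluating a full
 * projection. attmap[i] gives the scan-slot attribute backing result
 * column i.
 */
static TupleTableSlot *
do_simple_projection(TupleTableSlot *scan_slot, TupleTableSlot *result_slot,
					 const AttrNumber *attmap)
{
	int			natts = result_slot->tts_tupleDescriptor->natts;

	/* Ensure the needed scan attributes are materialized (all, for brevity) */
	slot_getsomeattrs(scan_slot, scan_slot->tts_tupleDescriptor->natts);
	ExecClearTuple(result_slot);

	for (int i = 0; i < natts; i++)
	{
		int			scanoff = AttrNumberGetAttrOffset(attmap[i]);

		result_slot->tts_values[i] = scan_slot->tts_values[scanoff];
		result_slot->tts_isnull[i] = scan_slot->tts_isnull[scanoff];
	}

	return ExecStoreVirtualTuple(result_slot);
}
```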
The tableOid was not set in an Arrow slot when hyperstore was
delivering the next arrow value from the same compressed child slot,
assuming that the tableOid would remain the same since delivering the
previous value.
This is not always the case, however, as the same slot can be used in
a CREATE TABLE AS or similar statement that inserts the data into
another table. In that case, the insert function of that table will
change the slot's tableOid.
To fix this, hyperstore will always set the tableOid on the slot when
delivering new values.
This fixes an issue where scankeys were not applied in parallel scans
due to PG not passing on the scan keys to the underlying table access
method when using the function `table_beginscan_parallel()`.
To test the use of scankeys in parallel scans, a test is added that
uses a filter on a segmentby column (this is currently the only case
where scankeys are used instead of quals).
When consuming the values of an arrow array (via an arrow slot)
during a scan, it is best to try to increment the slot as quickly as
possible without doing other (unnecessary) work. Ensuring this "fast
path" exists gives a decent speed boost.
When calling slot_getsomeattrs() on an arrow slot, the slot's values
array is populated with data, which includes a potential lookup into
the arrow cache and decompression of the values.
However, if the column is a segmentby column, there is nothing to
decompress and it is not necessary to check the decompression
cache. Avoid this cache lookup by adding a fast path for segmentby
columns that directly copies the value from the underlying
(compressed) tuple.
Refactor the function to get the attribute offset map in
ArrowTupleTableSlot so that it has an inlined fast path and a slow
path that initializes the map during the first call. After
initialization, the fast path simply returns the map.
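A sketch of the fast-path/slow-path split, with hypothetical names:

```c
#include "postgres.h"

/* Hypothetical slot state holding a lazily built attribute offset map */
typedef struct ArrowSlotState
{
	int16	   *attrs_offset_map;	/* NULL until first use */
	/* ... */
} ArrowSlotState;

/* Slow path: build the map on first access (kept out of line) */
extern int16 *build_attrs_offset_map(ArrowSlotState *state);

/* Fast path: inlined, returns the cached map after the first call */
static inline int16 *
arrow_slot_get_attrs_offset_map(ArrowSlotState *state)
{
	if (likely(state->attrs_offset_map != NULL))
		return state->attrs_offset_map;
	return build_attrs_offset_map(state);
}
```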
Indexes for Hyperstore require special considerations so we want to
whitelist index access methods that are supported and create an option
to allow the whitelist to be set in the configuration file using the
`timescaledb.hyperstore_indexam_whitelist` option.
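For illustration, such a string option could be registered roughly as
follows; the backing variable, default value, and GUC context are
assumptions, only the option name comes from the text above.

```c
#include "postgres.h"
#include "utils/guc.h"

/* Backing variable for the whitelist option (default is illustrative) */
static char *hyperstore_indexam_whitelist = NULL;

void
register_hyperstore_indexam_whitelist_guc(void)
{
	DefineCustomStringVariable("timescaledb.hyperstore_indexam_whitelist",
							   "Index access methods supported by hyperstore",
							   NULL,	/* long description */
							   &hyperstore_indexam_whitelist,
							   "btree,hash",	/* illustrative default */
							   PGC_SIGHUP,	/* settable from the config file */
							   0,		/* flags */
							   NULL,	/* check_hook */
							   NULL,	/* assign_hook */
							   NULL);	/* show_hook */
}
```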
If an attempt was made to create an expression index, a debug build
would abort because it tried to use a system attribute number (zero or
negative).
This commit fixes this by adding a check that expression indexes or
system attribute numbers are not used when building the index, and
erroring out if that happens.
ColumnarScan supports projections, but didn't announce it did. Make
sure it sets CUSTOMPATH_SUPPORT_PROJECTION in the CustomPath flags so
that the planner doesn't add unnecessary Result nodes.
Make the decompression cache stats track more information, including
actual cache hits, misses, and evictions (in terms of hash-table
lookups).
One of the most interesting metrics is the number of
decompressions. However, this statistic was internally tracked as
cache hits, which was confusing since it doesn't have anything to
do with cache hits.
Conversely, every non-hit, or "avoided decompression", was tracked as
a cache miss, which is also a bit ambiguous because ideally one should
never try to decompress something that is already decompressed. This
is further complicated by the fact that some columns should not be
decompressed at all, but are still counted towards this metric. For
now, simply label this as "decompress calls" and hide it by default
unless EXPLAIN uses VERBOSE.
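A sketch of what the tracked counters and the verbose-only EXPLAIN output
could look like; the struct, field, and label names are illustrative.

```c
#include "postgres.h"
#include "commands/explain.h"

/* Hypothetical per-scan decompression cache statistics */
typedef struct DecompressCacheStats
{
	int64		hits;			/* entry found in the cache */
	int64		misses;			/* entry had to be created */
	int64		evictions;		/* entries evicted due to cache size */
	int64		decompress_calls;	/* actual decompressions performed */
} DecompressCacheStats;

/* Emit the stats in EXPLAIN output; hide decompress calls unless VERBOSE */
static void
explain_decompress_cache_stats(const DecompressCacheStats *stats,
							   ExplainState *es)
{
	ExplainPropertyInteger("Array Cache Hits", NULL, stats->hits, es);
	ExplainPropertyInteger("Array Cache Misses", NULL, stats->misses, es);
	ExplainPropertyInteger("Array Cache Evictions", NULL, stats->evictions, es);

	if (es->verbose)
		ExplainPropertyInteger("Array Decompressions", NULL,
							   stats->decompress_calls, es);
}
```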
Calling get_typbyval() when creating datums from arrow arrays has a
non-negligible performance impact due to the syscache lookup. Optimize
this for a noticeable performance gain by caching the typbyval
information in the arrow array's private field.
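A sketch of the caching idea, using a hypothetical private struct; the
actual field layout in the arrow array's private data may differ.

```c
#include "postgres.h"
#include "utils/lsyscache.h"

/* Hypothetical private state attached to an ArrowArray */
typedef struct ArrowPrivate
{
	Oid			typid;
	bool		typbyval;
	bool		typbyval_valid;	/* has typbyval been looked up yet? */
} ArrowPrivate;

/* Look up pass-by-value info once and reuse it for every value in the array */
static bool
arrow_private_get_typbyval(ArrowPrivate *priv)
{
	if (!priv->typbyval_valid)
	{
		/* One syscache lookup instead of one per datum */
		priv->typbyval = get_typbyval(priv->typid);
		priv->typbyval_valid = true;
	}
	return priv->typbyval;
}
```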
Reduce the amount of memory allocations when creating text datums from
arrow array values by creating a reusable memory area in the
ArrowArray's private data storage.
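A sketch of the reusable-buffer approach for building text datums; the
struct and function names are illustrative, and the returned datum is only
valid until the next call reuses the buffer.

```c
#include "postgres.h"
#include "varatt.h"				/* varlena macros (PG16+) */

/* Hypothetical reusable buffer kept in the ArrowArray's private data */
typedef struct TextBuf
{
	char	   *data;
	Size		capacity;
} TextBuf;

/* Build a text datum in the reusable buffer, growing it only when needed */
static Datum
make_text_datum(TextBuf *buf, const char *str, Size len)
{
	Size		needed = len + VARHDRSZ;
	text	   *result;

	if (buf->capacity < needed)
	{
		buf->capacity = Max(needed, buf->capacity * 2);
		buf->data = buf->data ? repalloc(buf->data, buf->capacity)
			: palloc(buf->capacity);
	}

	result = (text *) buf->data;
	SET_VARSIZE(result, needed);
	memcpy(VARDATA(result), str, len);

	return PointerGetDatum(result);
}
```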
The ArrowColumnCache entry is valid for an arrow slot until the next
compressed tuple is stored (or a non-compressed one). Therefore, it is
possible to avoid repeated hash-table lookups by saving the
ArrowColumnCache entry in the Arrow slot header after the first
lookup. This gives a noticeable speed up when iterating the arrow array
in the slot.
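A sketch of caching the looked-up entry in the slot, with hypothetical types
and fields; the cached pointer would be reset whenever a new tuple is stored.

```c
#include "postgres.h"
#include "utils/hsearch.h"

/* Hypothetical cache entry holding decompressed arrow arrays for one tuple */
typedef struct ArrowColumnCacheEntry ArrowColumnCacheEntry;

/* Hypothetical arrow slot fields relevant to the lookup */
typedef struct ArrowSlot
{
	HTAB	   *column_cache;	/* dynahash keyed on a compressed tuple id */
	uint64		key;			/* key for the currently stored tuple */
	ArrowColumnCacheEntry *cached_entry;	/* reset when a new tuple is stored */
} ArrowSlot;

/* Do the hash lookup once per stored compressed tuple, then reuse it */
static ArrowColumnCacheEntry *
arrow_slot_get_cache_entry(ArrowSlot *slot)
{
	bool		found;

	if (slot->cached_entry == NULL)
		slot->cached_entry = hash_search(slot->column_cache, &slot->key,
										 HASH_ENTER, &found);

	return slot->cached_entry;
}
```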
The getsomeattrs() function is on the "hot path" for table scans.
Simplify and optimize this function and the related subfunctions it
calls to make it more efficient. This makes the overall flow easier to
understand and ensures a quick exit if there's nothing to do.
The refactor also fixes an issue that caused unreferenced columns to be
set with getmissingattr(). In this case, getmissingattr() doesn't do
anything when the attribute is not marked with `atthasmissing`, but it
caused unnecessary function calls and checks.
Turning referenced_attrs from a bitmapset into a bool array has a
measurable performance impact, unlike segmentby_attrs. Furthermore,
making all attrs-tracking sets into similar bool arrays makes things
consistent.
The segmentby_attrs in the arrow slot is a bitmapset, similar to how
valid_attrs used to be a bitmapset, and bitmapsets can be
slow. Therefore, also make segmentby_attrs into a bool array.
Since segmentby_attrs isn't cleared and reallocated when iterating an
arrow slot (unlike valid_attrs), the performance impact isn't as big
(or even measurable) as for valid_attrs. Still, the extra overhead of
a bool array doesn't make a big difference.
The valid_attrs in the arrow tuple table slot is a bitmapset used to
track columns/attributes that are "materialized" in the slot. This
bitmapset is cleared by freeing the set and then reallocating it again
for the next row in an arrow array. The reallocation happens on a
performance-critical "hot path" and has a significant performance
impact.
The performance is improved by making valid_attrs a bool array
instead, and preallocating the array at slot initialization. Clearing
it is a simple memset(). While a bool array takes a bit more space
than a bitmapset, it has simpler semantics and is always the size of
the number of attributes.
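A sketch contrasting the two approaches, with illustrative names:

```c
#include "postgres.h"
#include "nodes/bitmapset.h"

/* Before: clearing means freeing and re-growing the set on the hot path */
static Bitmapset *
clear_valid_attrs_bms(Bitmapset *valid_attrs)
{
	bms_free(valid_attrs);
	return NULL;				/* rebuilt via bms_add_member() per column */
}

/* After: one preallocated array per slot, cleared with a single memset() */
typedef struct ValidAttrs
{
	bool	   *valid;			/* palloc0'd at slot init, length = natts */
	int			natts;
} ValidAttrs;

static void
valid_attrs_init(ValidAttrs *v, int natts)
{
	v->valid = palloc0(sizeof(bool) * natts);
	v->natts = natts;
}

static void
valid_attrs_clear(ValidAttrs *v)
{
	memset(v->valid, 0, sizeof(bool) * v->natts);
}

static void
valid_attrs_mark(ValidAttrs *v, int attoff)
{
	v->valid[attoff] = true;
}
```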
This commit reduces the number of tuples added to the hyperstore table
to reduce the runtime, and also fixes `hyperstore_scans`. For
`hyperstore_scans` it is necessary to reduce the number of locations
since we want to trigger dictionary compression and make sure that
scanning works for that as well.