When inserting into a compressed chunk with constraints present,
we need to decompress the relevant tuples in order to do speculative
insertion. Previously we used segmentby column values to limit the
number of compressed segments to decompress. This change expands
on that by also using segment metadata to further filter the
compressed rows that need to be decompressed.
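As a hedged sketch of the setup this filtering relies on, assuming a
hypothetical hypertable metrics(time, device_id, value): the segmentby
values and the per-batch min/max metadata of the orderby column are
what allow compressed batches to be skipped.
  ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby = 'time DESC'
  );
  -- Only batches whose device_id and time range could match the
  -- incoming row need to be decompressed for constraint checking.
  SELECT compress_chunk(c) FROM show_chunks('metrics') c;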
When compression operations run in parallel with inserts into
chunks (both operations can potentially change the chunk status),
we could end up with an inconsistent chunk status. This change
re-reads the chunk status after locking the chunk to make sure
data can be decompressed correctly when handling ON CONFLICT
inserts.
This patch does the following (a usage sketch follows the list):
1. Executor changes to parse the qual ExprState to check if a SEGMENTBY
column is specified in the WHERE clause.
2. Based on step 1, we build scan keys.
3. Executor changes to do a heap scan on the compressed chunk based on
the scan keys and move only those rows which match the WHERE clause
to the staging area, i.e. the uncompressed chunk.
4. Mark the affected chunk as partially compressed.
5. Perform regular UPDATE/DELETE operations on the staging area.
6. Since there is no Custom Scan (HypertableModify) node for
UPDATE/DELETE operations on PG versions < 14, we don't support this
feature on PG12 and PG13.
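A minimal usage sketch, assuming a hypothetical hypertable
metrics(time, device_id, value) compressed with
compress_segmentby = 'device_id':
  -- Only compressed batches with device_id = 7 are decompressed into
  -- the staging area before the modification runs on it.
  UPDATE metrics SET value = 0 WHERE device_id = 7;
  DELETE FROM metrics WHERE device_id = 7 AND time < '2020-01-01';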
This patch introduces a C-function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.
This improves performance for the following reasons:
- it needs to sort less data at a time and
- it avoids recreating the decompressed chunk and the heap
inserts associated with that by decompressing each segment
into a tuplesort instead.
If no segmentby is specified when enabling compression, or if no
index exists on the compressed chunk, the operation is performed as
before: decompressing and subsequently compressing the entire chunk.
During compression, autovacuum used to be disabled for the uncompressed
chunk and re-enabled after decompression. This leads to postgres
maintenance issues. Let's not disable autovacuum for the uncompressed
chunk anymore and let postgres take care of the stats in its natural way.
Fixes #309
This patch allows unique constraints on compressed chunks. When
trying to INSERT into compressed chunks with unique constraints,
any potentially conflicting compressed batches will be decompressed
to let postgres do constraint checking on the INSERT.
With this patch, only INSERT ON CONFLICT DO NOTHING is supported.
For decompression, only segmentby information is considered to
determine conflicting batches. This will be enhanced in a follow-up
patch to also include orderby metadata so that fewer batches need
to be decompressed.
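A minimal sketch, assuming a hypothetical hypertable
metrics(time, device_id, value) with a unique constraint on
(time, device_id) and compress_segmentby = 'device_id':
  -- Compressed batches for device_id = 3 are decompressed so postgres
  -- can run its usual constraint check; non-conflicting rows go in.
  INSERT INTO metrics VALUES ('2023-01-01 00:00:00+00', 3, 1.0)
  ON CONFLICT DO NOTHING;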
Reindexing a relation requires AccessExclusiveLock which prevents
queries on that chunk. This patch changes decompress_chunk to update
the index during decompression instead of reindexing. This patch
does not change the required locks as there are locking adjustments
needed in other places to make it safe to weaken that lock.
This change allows the tuplesort to be overridden with an index scan
if the compression setting keys match the index keys.
The feature has an enable/disable toggle.
To disable it from the client, use the following command:
SET timescaledb.enable_compression_indexscan = 'OFF'
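As a hedged illustration (table and index names are hypothetical): the
compression keys below line up with an index on (device_id, time DESC),
so compression may use an index scan over that index instead of a
tuplesort.
  CREATE INDEX metrics_device_time_idx ON metrics (device_id, time DESC);
  ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby = 'time DESC'
  );
  SELECT compress_chunk(c) FROM show_chunks('metrics') c;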
The Postgres source code defines the macro `OidIsValid()` to check
whether an Oid is valid (comparing against `InvalidOid`). See
`src/include/c.h` in the Postgres source tree.
Changed all direct comparisons against `InvalidOid` to use the
`OidIsValid()` call, and added a coccinelle check to make sure future
changes use it as well.
This change introduces a new option to the compression procedure which
decouples the uncompressed chunk interval from the compressed chunk
interval. It does this by allowing multiple uncompressed chunks into one
compressed chunk as part of the compression procedure. The main use-case
is to allow much smaller uncompressed chunks than compressed ones. This
has several advantages:
- Reduce the size of btrees on uncompressed data (thus allowing faster
inserts because those indexes are memory-resident).
- Decrease disk-space usage for uncompressed data.
- Reduce number of chunks over historical data.
From a UX point of view, we simply add a compression WITH-clause option
`compress_chunk_time_interval`. Users should set it according to
their needs for constraint exclusion over historical data. Ideally, it
should be a multiple of the uncompressed chunk interval, and we throw
a warning if it is not.
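A minimal usage sketch, assuming a hypothetical hypertable metrics
with a 1 hour chunk interval:
  ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    -- roll up to 24 one-hour uncompressed chunks per compressed chunk
    timescaledb.compress_chunk_time_interval = '24 hours'
  );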
INSERT into a compressed hypertable with a number of open chunks greater
than ts_guc_max_open_chunks_per_insert causes a segmentation fault.
A new row which needs to be inserted into a compressed chunk has to be
compressed. The memory required as part of compressing a row is allocated
from the RowCompressor::per_row_ctx memory context. Once the row is
compressed, ExecInsert() is called, where memory from the same context
is allocated and freed instead of using the "Executor State" context.
This causes memory corruption.
Fixes: #4778
Depending on the statistics target, running ANALYZE on a chunk before
compression can cause a lot of random IO operations for chunks that
are bigger than the number of pages ANALYZE needs to read. By moving
that operation to after the heap is loaded into memory for sorting,
we increase the chance of hitting the cache and reduce the disk
operations necessary to execute compression jobs.
When compressing larger chunks, the compression sort tends to spill to
temporary files since the memory limit (`work_mem`) is usually
too small to fit all the data in memory. Using `maintenance_work_mem`
instead makes more sense, since it is generally safer to use a larger
value there without impacting general resource usage.
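A hedged tuning sketch: since the compression sort now uses
maintenance_work_mem, raising it for the session can keep the sort in
memory; the table name and values are illustrative.
  SET maintenance_work_mem = '1GB';
  SELECT compress_chunk(c)
  FROM show_chunks('metrics', older_than => INTERVAL '7 days') c;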
This patch fixes a deadlock between chunk decompression and SELECT
queries executed in parallel. The change in
a608d7db614c930213dee8d6a5e9d26a0259da61 requests an AccessExclusiveLock
for the decompressed chunk instead of the compressed chunk, resulting in
deadlocks.
In addition, an isolation test has been added to check that SELECT
queries on a chunk that is currently being decompressed can be executed.
Fixes #4605
The sequence number of the compressed tuple is per segmentby grouping
and should be reset when the grouping changes to prevent overflows with
many segmentby columns.
Sanity-check the compression header for a valid algorithm
before using it as an index into an array. Previously
this would result in a segfault and could happen with
corrupted compressed data.
As part of inserting into a compressed table, the tuple is
materialized, which computes the data size for the tuple using
`heap_compute_data_size`. When computing the data size of the tuple,
columns that are null are not considered and are just ignored. Columns
that are dropped are, however, not explicitly checked; instead,
`heap_compute_data_size` relies on these columns being set to null.
When reading tuples from a compressed table for insert, the null vector
is cleared, meaning that it is non-null by default. Since columns that
are dropped are not explicitly processed, they are expected to have a
defined value, which they do not have, causing a crash when an attempt
is made to dereference them.
This commit fixes this by setting the null vector to all null; the code
after it then overwrites the columns with proper null bits, except for
the dropped columns, which remain null.
Fixes #4251
This patch adds missing heap_freetuple calls in 2 locations.
The missing call in compression.c was a leak making the allocation
live for much longer than needed. This was found by coccinelle.
The Windows compiler has problems with the macros in genbki.h,
complaining about redefinition of a variable with a different
storage class. Since those specific macros are processed by a
perl script and are not relevant for the build process, we turn
them into no-ops on Windows.
Remove TTSOpsVirtualP, TTSOpsHeapTupleP, TTSOpsMinimalTupleP and
TTSOpsBufferHeapTupleP macros since they were only needed on PG11
to allow us to define compatibility macros for TupleTableSlot
operations.
Add a test case for COPY on distributed hypertables with compressed
chunks. It verifies that recompress_chunk and the compression policy
work as expected.
Additional changes include:
- Clean up commented code
- Make use of BulkInsertState optional in the row compressor
- Add a test for insert into a compressed chunk by a role other than
the owner
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function, recompress_chunk, that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but are not part of this PR.
This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).
The recompress_chunk function is automatically invoked by the compression
policy job when it sees that a chunk is in the unordered state.
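A minimal end-to-end sketch, assuming a hypothetical hypertable metrics
whose chunks are already compressed; the chunk name is illustrative.
  -- An insert into a compressed chunk marks it as unordered.
  INSERT INTO metrics VALUES ('2021-06-01 00:00:00+00', 3, 2.5);
  -- Recompress it explicitly (the policy job would also pick it up).
  SELECT recompress_chunk('_timescaledb_internal._hyper_1_1_chunk');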
Compressed chunks with inserts after being compressed have batches
that are not ordered according to compress_orderby. For those
chunks we cannot set pathkeys on the DecompressChunk node, and we
need an extra sort step if we require ordered output from those
chunks.
Since Oid is an unsigned int, we have to use %u to print it; otherwise
oids >= 2^31 will not work correctly. This also switches the places
that print a type oid to use format helper functions to resolve the
oids.
ALTER TABLE <hypertable> RENAME <column_name> TO <new_column_name>
is now supported for hypertables that have compression enabled.
Note: Column renaming is not supported for distributed hypertables,
so this will not work on distributed hypertables that have
compression enabled.
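A short usage sketch against a hypothetical compressed hypertable
metrics; the column names are illustrative.
  ALTER TABLE metrics RENAME COLUMN value TO temperature;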
PG13 adds a destination-length fourth argument to the pg_b64_decode and
pg_b64_encode functions, so this patch adds a macro that translates
to the 3-argument or 4-argument call depending on the postgres version.
This patch also adds checking of the return values of those functions.
https://github.com/postgres/postgres/commit/cfc40d384a
This commit modifies ANALYZE behavior as follows:
1. When an internal compression table is analyzed, statistics from the
compressed chunk (such as page count and tuple count) are used to
update the statistics of the corresponding chunk parent if they are
missing.
2. Analyze compressed chunks instead of raw chunks: when the command
ANALYZE <hypertable> is executed, a) analyze uncompressed chunks and
b) skip the raw chunk but analyze the corresponding compressed chunk.
This patch fixes a segfault in decompress_chunk for chunks with dropped
columns. Since dropped columns don't exist in the compressed chunk,
the values for those columns were undefined in the decompressed tuple,
leading to a segfault when trying to build the heap tuple.
This change captures the reltuples and relpages (and relallvisible)
statistics from the pg_class table for chunks immediately before
truncating them during the compression code path. It then restores
the values after truncating, as there is no way to keep postgresql
from clearing these values during this operation. It also properly
uses these values during planning, working around some postgresql
code which substitutes arbitrary sizes for tables which don't seem
to hold data.
Fixes #2524
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros, the following changes are done:
- remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
- remove PG_VERSION_SUPPORTS_MULTINODE
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.
test/sql/
- triggers.sql: Made required change
tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete
tsl/src/
- compression/compression.c:
- row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
- compressor_finish: Removed (obsolete)
src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
- process_altertable_end_subcmd: Removed (PG will handle case)