INSERT into a compressed hypertable with a number of open chunks
greater than ts_guc_max_open_chunks_per_insert causes a segmentation
fault. A new row that needs to be inserted into a compressed chunk has
to be compressed first. The memory required for compressing a row is
allocated from the RowCompressor::per_row_ctx memory context. Once the
row is compressed, ExecInsert() is called, and memory from that same
context is allocated and freed there instead of memory from the
executor state. This corrupts memory.
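A minimal sketch of the pattern behind the fix, with hypothetical
helper names (compress_row and insert_compressed_tuple are
placeholders, not the actual TimescaleDB functions): the per-row
context is only current while compressing, and is switched out and
reset before the executor allocates anything.

    #include "postgres.h"
    #include "access/htup.h"
    #include "executor/tuptable.h"
    #include "utils/memutils.h"

    /* Placeholders for the real compression/insert steps. */
    extern HeapTuple compress_row(TupleTableSlot *slot);
    extern void insert_compressed_tuple(HeapTuple tuple);

    static void
    compress_and_insert(MemoryContext per_row_ctx, TupleTableSlot *slot)
    {
        /* Per-row scratch allocations go into per_row_ctx. */
        MemoryContext old_ctx = MemoryContextSwitchTo(per_row_ctx);
        HeapTuple compressed = compress_row(slot);

        /*
         * Switch back before handing the tuple to the executor, so that
         * ExecInsert() allocates and frees memory in the executor's own
         * context rather than in per_row_ctx.
         */
        MemoryContextSwitchTo(old_ctx);
        insert_compressed_tuple(compressed);

        /* The per-row scratch memory can now be discarded safely. */
        MemoryContextReset(per_row_ctx);
    }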
Fixes: #4778
Depending on the statistics target, running ANALYZE on a chunk before
compression can cause a lot of random IO operations for chunks that
are bigger than the number of pages ANALYZE needs to read. By moving
that operation to after the heap is loaded into memory for sorting,
we increase the chance of hitting the cache and reduce the disk
operations necessary to execute compression jobs.
When compressing larger chunks, the compression sort tends to spill to
temporary files since the memory limit (`work_mem`) is usually too
small to fit all the data in memory. Using `maintenance_work_mem`
instead makes more sense, since it is generally safe to use a larger
value for it without impacting general resource usage.
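For reference, a sketch of the change in terms of PostgreSQL's
tuplesort API (PG 13/14 signature; the setup of the sort keys is
elided and the variable names are illustrative):

    #include "postgres.h"
    #include "access/attnum.h"
    #include "access/tupdesc.h"
    #include "miscadmin.h"          /* work_mem, maintenance_work_mem GUCs */
    #include "utils/tuplesort.h"

    static Tuplesortstate *
    begin_compression_sort(TupleDesc tupdesc, int nkeys, AttrNumber *attnums,
                           Oid *sortops, Oid *collations, bool *nulls_first)
    {
        return tuplesort_begin_heap(tupdesc, nkeys, attnums, sortops,
                                    collations, nulls_first,
                                    maintenance_work_mem, /* was: work_mem */
                                    NULL,                 /* no parallel workers */
                                    false);               /* no random access */
    }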
This patch fixes a deadlock between chunk decompression and SELECT
queries executed in parallel. The change in
a608d7db614c930213dee8d6a5e9d26a0259da61 requests an AccessExclusiveLock
for the decompressed chunk instead of the compressed chunk, resulting in
deadlocks.
In addition, an isolation test has been added to verify that SELECT
queries on a chunk that is currently being decompressed can be
executed.
Fixes #4605
The sequence number of the compressed tuple is per segmentby grouping
and should be reset when the grouping changes, to prevent overflows
with many segmentby columns.
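A sketch of the intended behavior with illustrative names (the gap
constant and group-comparison logic are assumptions, not the exact
TimescaleDB identifiers):

    #include "postgres.h"

    #define SEQ_NUM_GAP 10   /* gap leaves room to fill in rows later */

    typedef struct SegmentSeqState
    {
        int32 next_seq;      /* sequence number for the next compressed batch */
    } SegmentSeqState;

    /*
     * Called once per compressed batch; segment_changed is true when the
     * current segmentby values differ from the previous batch's values.
     */
    static int32
    next_sequence_number(SegmentSeqState *state, bool segment_changed)
    {
        int32 seq;

        if (segment_changed)
            state->next_seq = SEQ_NUM_GAP;  /* reset at every group boundary */

        seq = state->next_seq;
        state->next_seq += SEQ_NUM_GAP;     /* real code also checks for overflow */
        return seq;
    }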
Sanity-check the compression header for a sane algorithm before using
it as an index into an array. Previously this would result in a
segfault, which could happen with corrupted compressed data.
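A sketch of the kind of check involved; the enum and handler table
below are illustrative stand-ins, not the exact TimescaleDB
definitions:

    #include "postgres.h"

    typedef enum CompressionAlgo
    {
        COMPRESSION_ALGO_INVALID = 0,
        COMPRESSION_ALGO_ARRAY,
        COMPRESSION_ALGO_DICTIONARY,
        COMPRESSION_ALGO_GORILLA,
        COMPRESSION_ALGO_DELTADELTA,
        _COMPRESSION_ALGO_MAX
    } CompressionAlgo;

    extern void *decompression_handlers[_COMPRESSION_ALGO_MAX];

    static void *
    lookup_decompression_handler(uint8 header_algo)
    {
        /* Reject corrupted headers before indexing into the array. */
        if (header_algo == COMPRESSION_ALGO_INVALID ||
            header_algo >= _COMPRESSION_ALGO_MAX)
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg("invalid compression algorithm %d", header_algo)));

        return decompression_handlers[header_algo];
    }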
As part of inserting into a compressed table, the tuple is
materialized, which computes the data size for the tuple using
`heap_compute_data_size`. When computing the data size of the tuple,
columns that are null are not considered and are simply ignored.
Columns that are dropped are, however, not explicitly checked; instead
`heap_compute_data_size` relies on these columns being set to null.
When reading tuples from a compressed table for insert, the null
vector is cleared, meaning that it is non-null by default. Since
dropped columns are not explicitly processed, they are expected to
have a defined value, which they do not have, causing a crash when an
attempt to dereference them is made.
This commit fixes this by setting the null vector to all null; the
code that follows will overwrite the columns with the proper null
bits, except for the dropped columns, which will remain null.
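A sketch of the approach with illustrative variable names; the key
point is initializing the null vector to all-true so that dropped
columns keep a well-defined (null) state when the tuple is formed:

    #include "postgres.h"
    #include "access/htup_details.h"

    static HeapTuple
    form_row_for_insert(TupleDesc tupdesc, Datum *values, bool *isnull)
    {
        /*
         * Start with every column marked null. Columns actually read from
         * the compressed data overwrite their entries below; dropped
         * columns are never touched and therefore stay null, which is
         * what heap_compute_data_size() expects.
         */
        memset(isnull, true, sizeof(bool) * tupdesc->natts);

        /* ... fill values[i] / isnull[i] for the live columns here ... */

        return heap_form_tuple(tupdesc, values, isnull);
    }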
Fixes #4251
This patch adds missing heap_freetuple calls in 2 locations.
The missing call in compression.c was a leak that kept the allocation
alive for much longer than needed. This was found by coccinelle.
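For reference, the general pattern is simply freeing the materialized
copy once it is no longer needed (illustrative snippet, not the exact
call site):

    #include "postgres.h"
    #include "access/htup_details.h"

    static void
    build_use_and_free(TupleDesc tupdesc, Datum *values, bool *isnull)
    {
        HeapTuple tuple = heap_form_tuple(tupdesc, values, isnull);

        /* ... use the tuple (e.g. insert it or copy data out of it) ... */

        /*
         * Free it explicitly instead of letting it live until the
         * surrounding memory context is reset much later.
         */
        heap_freetuple(tuple);
    }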
The Windows compiler has problems with the macros in genbki.h,
complaining about redefinition of a variable with a different storage
class. Since those specific macros are processed by a Perl script and
are not relevant for the build process, we turn them into no-ops for
Windows.
Remove TTSOpsVirtualP, TTSOpsHeapTupleP, TTSOpsMinimalTupleP and
TTSOpsBufferHeapTupleP macros since they were only needed on PG11
to allow us to define compatibility macros for TupleTableSlot
operations.
Add a test case for COPY on distributed hypertables with compressed
chunks. It verifies that recompress_chunk and the compression policy
work as expected.
Additional changes include:
- Clean up commented code
- Make use of BulkInsertState optional in the row compressor
- Add a test for inserts into a compressed chunk by a role other than the owner
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but are not part of this
PR.
This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).
The recompress_chunk function is automatically invoked by the
compression policy job when it sees that a chunk is in the unordered
state.
Compressed chunks that received inserts after being compressed have
batches that are not ordered according to compress_orderby. For those
chunks we cannot set pathkeys on the DecompressChunk node, and we
need an extra sort step if we require ordered output from those
chunks.
Since Oid is an unsigned int, we have to use %u to print it; otherwise
oids >= 2^31 will not be printed correctly. This also switches the
places that print type oids to use the format helper functions to
resolve the oids.
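Since Oid is typedef'd to unsigned int in PostgreSQL, the fix boils
down to the following pattern (the error message is illustrative):

    #include "postgres.h"
    #include "utils/format_type.h"  /* format_type_be() */

    static void
    report_unsupported_type(Oid typeoid)
    {
        /*
         * %u is required: with %d any oid >= 2^31 would be printed as a
         * negative number. format_type_be() resolves the oid to a
         * readable type name.
         */
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("unsupported type %s (oid %u)",
                        format_type_be(typeoid), typeoid)));
    }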
ALTER TABLE <hypertable> RENAME <column_name> TO <new_column_name>
is now supported for hypertables that have compression enabled.
Note: Column renaming is not supported for distributed hypertables.
So this will not work on distributed hypertables that have
compression enabled.
PG13 adds a destination length as a 4th argument to the pg_b64_decode
and pg_b64_encode functions, so this patch adds a macro that
translates to the 3-argument or 4-argument call depending on the
PostgreSQL version. This patch also adds checking of the return values
of those functions.
https://github.com/postgres/postgres/commit/cfc40d384a
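A sketch of such a compatibility shim; the TS_B64_* names and the
version test are illustrative, not necessarily what the actual patch
uses:

    #include "postgres.h"
    #include "common/base64.h"  /* pg_b64_encode/pg_b64_decode on PG12+ */

    /*
     * PG13 added a destination-length argument to pg_b64_encode and
     * pg_b64_decode. The shim always takes the destination length and
     * drops it on older versions.
     */
    #if PG_VERSION_NUM >= 130000
    #define TS_B64_ENCODE(src, srclen, dst, dstlen) \
        pg_b64_encode(src, srclen, dst, dstlen)
    #define TS_B64_DECODE(src, srclen, dst, dstlen) \
        pg_b64_decode(src, srclen, dst, dstlen)
    #else
    #define TS_B64_ENCODE(src, srclen, dst, dstlen) \
        pg_b64_encode(src, srclen, dst)
    #define TS_B64_DECODE(src, srclen, dst, dstlen) \
        pg_b64_decode(src, srclen, dst)
    #endif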
This commit modifies analyze behavior as follows:
1. When an internal compression table is analyzed, statistics from
the compressed chunk (such as page count and tuple count) are used to
update the statistics of the corresponding chunk parent, if they are
missing.
2. Analyze compressed chunks instead of raw chunks:
When the command ANALYZE <hypertable> is executed, a) analyze
uncompressed chunks and b) skip the raw chunk, but analyze the
compressed chunk.
This patch fixes a segfault in decompress_chunk for chunks with
dropped columns. Since dropped columns don't exist in the compressed
chunk, the values for those columns were undefined in the decompressed
tuple, leading to a segfault when trying to build the heap tuple.
This change captures the reltuples and relpages (and relallvisible)
statistics from the pg_class table for chunks immediately before
truncating them during the compression code path. It then restores
the values after truncating, as there is no way to keep PostgreSQL
from clearing these values during this operation. It also uses these
values properly during planning, working around some PostgreSQL code
which substitutes arbitrary sizes for tables that don't seem to hold
data.
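Capturing the statistics amounts to reading the chunk's pg_class row
through the syscache right before the truncate; a hedged sketch (the
step that writes the values back to pg_class after truncation is
omitted):

    #include "postgres.h"
    #include "access/htup_details.h"
    #include "catalog/pg_class.h"
    #include "utils/syscache.h"

    typedef struct RelStats
    {
        int32   relpages;
        float4  reltuples;
        int32   relallvisible;
    } RelStats;

    static RelStats
    capture_relation_stats(Oid chunk_relid)
    {
        RelStats  stats = {0};
        HeapTuple tp = SearchSysCache1(RELOID, ObjectIdGetDatum(chunk_relid));

        if (HeapTupleIsValid(tp))
        {
            Form_pg_class reltup = (Form_pg_class) GETSTRUCT(tp);

            stats.relpages = reltup->relpages;
            stats.reltuples = reltup->reltuples;
            stats.relallvisible = reltup->relallvisible;
            ReleaseSysCache(tp);
        }
        return stats;
    }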
Fixes #2524
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros the following changes are done:
- remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
- remove PG_VERSION_SUPPORTS_MULTINODE
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.
test/sql/
- triggers.sql: Made required change
tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete
tsl/src/
- compression/compression.c:
- row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
- compressor_finish: Removed (obsolete)
src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
- process_altertable_end_subcmd: Removed (PG will handle case)
Correcting conditions in #ifdefs, adding missing includes, removing
and rearranging existing includes, replacing PG12 with PG12_GE for
forward compatibility. Also fixed a number of places that were missed
earlier, changing relation_close to table_close.
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as follows:
1. construct and add `RelOptInfo`s for base rels and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn't have to be too precise
about where we were doing it.
However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like this:
1. construct and add RelOptInfos for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with make_one_rel
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` does entail
a number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
under the assumption that it has no inheritance children, so if it's a
hypertable we have to replan its paths
Unfortunately, the code for doing this is static, so we need to copy
it into our own codebase instead of just using PostgreSQL's.
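For context, this is the standard PostgreSQL planner hook; a minimal
sketch of wiring it up (the function name and the body are
placeholders for the hypertable-specific re-expansion logic described
above):

    #include "postgres.h"
    #include "fmgr.h"
    #include "optimizer/paths.h"

    PG_MODULE_MAGIC;

    static set_rel_pathlist_hook_type prev_set_rel_pathlist_hook = NULL;

    static void
    ht_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                        Index rti, RangeTblEntry *rte)
    {
        if (prev_set_rel_pathlist_hook)
            prev_set_rel_pathlist_hook(root, rel, rti, rte);

        /*
         * Placeholder: if rel is a hypertable root, re-enable
         * inheritance, recompute its size from the chunks, and replan
         * its paths here.
         */
    }

    void
    _PG_init(void)
    {
        prev_set_rel_pathlist_hook = set_rel_pathlist_hook;
        set_rel_pathlist_hook = ht_set_rel_pathlist;
    }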
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it were a `SELECT` and
all leaf tables are discovered.
- In stage 2, the original query is planned directly against each leaf
table discovered in stage 1, not as part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point where we
wish to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
This change fixes a number of typos and issues with inconsistent
formatting for compression-related code. A couple of other fixes for
variable names, etc. have also been applied.
This commit fixes issues reported by coverity. Of these, the only real
issue is an integer overflow in bitarray, which can never happen in its
current usages. This also adds a PG_USED_FOR_ASSERTS_ONLY for a
variable only used for Assert.
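PG_USED_FOR_ASSERTS_ONLY is PostgreSQL's standard way to silence
unused-variable warnings in builds without assertions; an illustrative
use:

    #include "postgres.h"
    #include "nodes/pg_list.h"

    static void
    consume_all(List *items)
    {
        int       nprocessed PG_USED_FOR_ASSERTS_ONLY = 0;
        ListCell *lc;

        foreach(lc, items)
        {
            /* ... process lfirst(lc) ... */
            nprocessed++;
        }

        /* Without --enable-cassert this variable would otherwise be unused. */
        Assert(nprocessed == list_length(items));
    }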
This simplifies the code and the access to the min/max metadata.
Before we used a custom type, but now the min/max values are just the
same type as the underlying column and are stored as two columns.
The custom type that was used before is removed.
This commit alters decompress_chunk to free memory as soon as
possible instead of waiting until the function ends. This should
decrease peak memory usage from roughly the size of the dataset to
roughly the size of a single compressed row.
Before this PR, some state (most notably deTOASTed values) would
persist across compressed rows during compress_chunk, despite the fact
that it was no longer needed. This increased the peak memory usage of
compress_chunk. This commit adds a MemoryContext that is reset after
each compressed row is inserted, ensuring that state needed for only
one row does not hang around longer than necessary.
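A minimal sketch of the pattern, with an illustrative placeholder for
the per-row work:

    #include "postgres.h"
    #include "executor/tuptable.h"
    #include "utils/memutils.h"

    /* Placeholder for the work that compresses and inserts one row. */
    extern void compress_one_row(TupleTableSlot *slot);

    static void
    compress_all_rows(TupleTableSlot **slots, int nslots)
    {
        /*
         * Short-lived context for state (e.g. detoasted values) that is
         * only needed while processing a single row.
         */
        MemoryContext per_row_ctx =
            AllocSetContextCreate(CurrentMemoryContext, "compress row",
                                  ALLOCSET_DEFAULT_SIZES);
        int i;

        for (i = 0; i < nslots; i++)
        {
            MemoryContext old_ctx = MemoryContextSwitchTo(per_row_ctx);

            compress_one_row(slots[i]);

            MemoryContextSwitchTo(old_ctx);
            /* Everything allocated for this row is released right away. */
            MemoryContextReset(per_row_ctx);
        }

        MemoryContextDelete(per_row_ctx);
    }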
- Fix declaration of functions wrt TSDLLEXPORT consistency
- Empty structs need to be created with '{ 0 }' syntax.
- Alignment sentinels have to use uint64 instead of a struct
with a 0-size member
- Add some more ORDER BY clauses in the tests to constrain
the order of results
- Add ANALYZE after running compression in
transparent-decompression test
Add a sequence id to the compressed table. This id increments
monotonically for each compressed row in a way that follows the order
by clause. We leave gaps to allow for the possibility of filling in
rows later, e.g. due to inserts.
The sequence id is global to the entire chunk and does not reset for
each segmentby group change, since this has the potential to allow
some micro-optimizations when ordering by segmentby columns as well.
The sequence number is an INT32, which allows roughly 200 billion
uncompressed rows per chunk to be supported (assuming 1000 rows per
compressed row and a gap of 10: 2^31 / 10 is about 214 million
compressed rows, times 1000 uncompressed rows each). Overflow is
checked in the code and will raise an error if this limit is breached.
This commit integrates the SegmentMetaMinMax into the compression
logic. It adds metadata columns to the compressed table and sets them
correctly upon compression.
We also fix several errors with datum detoasting in SegmentMetaMinMax.
This rebuilds indexes during compression and decompression. Previously,
indexes were not updated during these operations. We also fix
a small bug with orderby and segmentby handling of empty strings/
lists.
Finally, we add some more tests.
This is useful if some or all compressed columns are NULL.
The count reflects the number of uncompressed rows that are in the
compressed row. It is stored as a 32-bit integer.