This patch drops the catalog table _timescaledb_catalog.hypertable_compression
and stores those settings in _timescaledb_catalog.compression_settings instead.
The storage format changes: the new table has one entry per relation
instead of one entry per column and has no dependency on hypertables.
All other aspects of compression remain the same. This refactoring
prepares for per-chunk compression settings in a follow-up patch.
Refactor the compression code to only use the table scan API when
scanning relations instead of a mix of the table and heap scan
APIs. The table scan API is a higher-level API and is recommended since
it works for any type of relation and uses table slots directly, which
means that in some cases a full heap tuple need not be materialized.
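A minimal sketch of the pattern (not the actual TimescaleDB code),
counting the tuples of an already-opened relation via the table scan API:

    #include "postgres.h"
    #include "access/tableam.h"
    #include "executor/tuptable.h"
    #include "utils/snapmgr.h"

    static int64
    count_tuples(Relation rel)
    {
        TupleTableSlot *slot = table_slot_create(rel, NULL);
        TableScanDesc scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
        int64 ntuples = 0;

        while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
        {
            /* work happens directly on the slot; no heap tuple is formed */
            ntuples++;
        }

        table_endscan(scan);
        ExecDropSingleTupleTableSlot(slot);

        return ntuples;
    }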
If we reuse the compressor to recompress multiple sets
of tuples, internal state left over from the previous
run can contain invalid data. Resetting the compressor's
first-iteration field between runs fixes this.
This should improve the throughput somewhat.
This commit does several things:
* Simplify the loop condition for decompressing a compressed batch by using
the count metadata column.
* Split out a separate function that decompresses the entire compressed
batch and saves the decompressed tuple slots into RowDecompressor.
* Use the bulk table insert function for inserting the decompressed rows,
which reduces WAL activity. If there are indexes on the uncompressed chunk,
update them one index at a time for the entire batch to reduce the load on
the shared buffers cache; previously we updated all indexes for one row,
then for the next, and so on (see the sketch below).
* Add a test for memory leaks during (de)compression.
* Update the compression_update_delete test to use INFO messages + a
debug GUC instead of DEBUG messages, which are flaky.
This gives a 10%-30% speedup on tsbench for decompress_chunk and various
compressed DML queries. This is very far from the performance we had in
2.10, but still a nice improvement.
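A rough sketch of the bulk-insert idea, assuming the decompressed tuples
of one batch have already been collected into an array of slots (the
function and its arguments are illustrative, not the exact TimescaleDB
code):

    #include "postgres.h"
    #include "access/heapam.h"
    #include "access/tableam.h"
    #include "access/xact.h"

    static void
    insert_batch(Relation chunk_rel, TupleTableSlot **slots, int ntuples)
    {
        BulkInsertState bistate = GetBulkInsertState();

        /* one multi-insert for the whole batch instead of one insert per row */
        table_multi_insert(chunk_rel, slots, ntuples,
                           GetCurrentCommandId(true), 0, bistate);

        FreeBulkInsertState(bistate);

        /*
         * Index maintenance would follow here: for each index on the
         * uncompressed chunk, insert the entries for the whole batch before
         * moving on to the next index.
         */
    }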
When we compress a chunk, we create a new compressed chunk for storing
the compressed data. So far, the tuples were just inserted into the
compressed chunk and frozen by a later vacuum run.
However, freezing tuples causes WAL activity that can be avoided because
the compressed chunk is created in the same transaction as the tuples.
This patch reduces the WAL activity by writing these tuples directly as
frozen, so no later freeze operation is needed. This approach is
similar to PostgreSQL's COPY FREEZE.
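A minimal sketch of the idea (variable and function names are
illustrative, not the exact TimescaleDB code):

    #include "postgres.h"
    #include "access/tableam.h"
    #include "access/xact.h"

    static void
    insert_into_new_compressed_chunk(Relation compressed_chunk_rel,
                                     TupleTableSlot *slot)
    {
        /*
         * The compressed chunk was created in this transaction, so the tuple
         * can be written as already frozen (similar to COPY ... FREEZE) and
         * no later VACUUM has to freeze it.
         */
        int options = TABLE_INSERT_FROZEN | TABLE_INSERT_SKIP_FSM;

        table_tuple_insert(compressed_chunk_rel, slot,
                           GetCurrentCommandId(true), options, NULL);
    }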
Since version 2.11.0 we would get a segmentation fault during
decompression when there was an expression or partial index on the
uncompressed chunk.
This patch fixes this by calling ExecInsertIndexTuples to insert into
indexes during chunk decompression, instead of CatalogIndexInsert.
In addition, when enabling compression on a hypertable, we check the
unique indexes defined on it to provide performance improvement hints
in case the unique index columns are not specified as compression
parameters.
However, this check threw an error when expression columns were present
in the index, preventing the user from enabling compression.
This patch fixes this by simply ignoring the expression columns in the
index, since we cannot currently segment by an expression.
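A sketch of the skip, assuming `indexinfo` describes a unique index on
the hypertable (the surrounding hint logic is omitted and not the exact
TimescaleDB code):

    #include "postgres.h"
    #include "nodes/execnodes.h"

    static void
    check_unique_index_columns(IndexInfo *indexinfo)
    {
        for (int i = 0; i < indexinfo->ii_NumIndexKeyAttrs; i++)
        {
            AttrNumber attno = indexinfo->ii_IndexAttrNumbers[i];

            /*
             * Expression columns appear with attribute number 0 in the index
             * key; we cannot segment by an expression, so skip them instead
             * of raising an error.
             */
            if (attno == 0)
                continue;

            /* ... compare attno against the configured segmentby columns ... */
        }
    }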
Fixes #6205, #6186
`_timescaledb_internal.create_compressed_chunk` can be used to create a
compressed chunk with existing compressed data. It did not account for
the fact that the chunk can contain uncompressed data, in which case the
chunk status must be set to partial.
Fixes #5946
This PR adds several GUCs that allow enabling/disabling major
timescaledb features:
- enable_hypertable_create
- enable_hypertable_compression
- enable_cagg_create
- enable_policy_create
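For illustration, a boolean GUC like this is typically registered from
the extension's `_PG_init` via `DefineCustomBoolVariable`; the variable
name and defaults below are assumptions, not the exact TimescaleDB code:

    #include "postgres.h"
    #include "utils/guc.h"

    static bool enable_hypertable_create = true;

    static void
    register_feature_gucs(void)
    {
        DefineCustomBoolVariable("timescaledb.enable_hypertable_create",
                                 "Enable hypertable creation",
                                 NULL,
                                 &enable_hypertable_create,
                                 true,        /* enabled by default */
                                 PGC_USERSET,
                                 0,
                                 NULL, NULL, NULL);
    }

With a user-settable context as sketched here, a feature could then be
turned off for a session with a plain SET on the corresponding GUC.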
If there are any indexes on the compressed chunk, insert into them while
inserting the heap data rather than reindexing the relation at the
end. This reduces the amount of locking on the compressed chunk
indexes, which created issues when merging chunks, and should help
with future updates of compressed data.
During chunk creation, the chunk's dimensional CHECK constraints are
created via an "upcall" to PL/pgSQL code. However, creating
dimensional constraints in PL/pgSQL code sometimes fails, especially
during high-concurrency inserts, because PL/pgSQL code scans metadata
using a snapshot that might not see the same metadata as the C
code. As a result, chunk creation sometimes fails during constraint
creation.
To fix this issue, implement dimensional CHECK-constraint creation in
C code. Other constraints (FK, PK, etc.) are still created via an
upcall, but should probably also be rewritten in C. However, since
these constraints don't depend on recently updated metadata, this is
left to a future change.
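A rough sketch of what CHECK-constraint creation from C can look like;
the real code differs in detail, and the range expression is assumed to
be built elsewhere:

    #include "postgres.h"
    #include "catalog/heap.h"
    #include "nodes/makefuncs.h"
    #include "nodes/parsenodes.h"

    static void
    create_dimension_check_constraint(Relation chunk_rel, const char *conname,
                                      Node *range_expr)
    {
        Constraint *constr = makeNode(Constraint);

        constr->contype = CONSTR_CHECK;
        constr->conname = pstrdup(conname);
        constr->raw_expr = range_expr;   /* e.g. "time >= x AND time < y" */
        constr->initially_valid = true;

        /* add the CHECK constraint directly, without a PL/pgSQL upcall */
        AddRelationNewConstraints(chunk_rel, NIL, list_make1(constr),
                                  false,   /* allow_merge */
                                  true,    /* is_local */
                                  true,    /* is_internal */
                                  NULL);   /* queryString */
    }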
Fixes #5456
This patch introduces a C function that performs the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.
This improves performance for the following reasons:
- it needs to sort less data at a time, and
- it avoids recreating the decompressed chunk and the associated heap
inserts by decompressing each segment into a tuplesort instead (see
the sketch below).
If no segmentby is specified when enabling compression, or if an
index does not exist on the compressed chunk, the operation is
performed as before: decompressing and subsequently compressing the
entire chunk.
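A conceptual sketch of the segment-wise flow; the segment helpers are
hypothetical, and only the tuplesort calls are real PostgreSQL APIs:

    #include "postgres.h"
    #include "nodes/pg_list.h"
    #include "utils/tuplesort.h"

    /* hypothetical helpers standing in for the real (de)compression code */
    static void decompress_segment_into_tuplesort(void *segment,
                                                  Tuplesortstate *sort);
    static void compress_segment_from_tuplesort(void *segment,
                                                Tuplesortstate *sort);

    static void
    recompress_segments(List *segments, Tuplesortstate *sort)
    {
        ListCell *lc;

        foreach(lc, segments)
        {
            void *segment = lfirst(lc);

            /* sort only this segment's rows, not the whole chunk */
            decompress_segment_into_tuplesort(segment, sort);
            tuplesort_performsort(sort);

            /* write the sorted rows back in compressed form */
            compress_segment_from_tuplesort(segment, sort);

            /* reuse the sort state for the next segment */
            tuplesort_reset(sort);
        }
    }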
During compression, autovacuum used to be disabled for the uncompressed
chunk and re-enabled after decompression. This leads to PostgreSQL
maintenance issues. Do not disable autovacuum for the uncompressed chunk
anymore; let PostgreSQL take care of the statistics in its natural way.
Fixes #309
The Postgres source code defines the macro `OidIsValid()` to check whether
an Oid is valid (i.e., different from `InvalidOid`). See `src/include/c.h`
in the Postgres source tree.
Changed all direct comparisons against `InvalidOid` to use the
`OidIsValid()` macro and added a coccinelle check to make sure future
changes use it correctly.
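For example, with a hypothetical `relid`:

    /* before: direct comparison against InvalidOid */
    if (relid != InvalidOid)
        elog(DEBUG1, "relation %u exists", relid);

    /* after: use the macro from src/include/c.h */
    if (OidIsValid(relid))
        elog(DEBUG1, "relation %u exists", relid);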
This change introduces a new option to the compression procedure which
decouples the uncompressed chunk interval from the compressed chunk
interval. It does this by allowing multiple uncompressed chunks into one
compressed chunk as part of the compression procedure. The main use-case
is to allow much smaller uncompressed chunks than compressed ones. This
has several advantages:
- Reduce the size of btrees on uncompressed data (thus allowing faster
inserts because those indexes are memory-resident).
- Decrease disk-space usage for uncompressed data.
- Reduce number of chunks over historical data.
From a UX point of view, we simply add a compression WITH-clause option
`compress_chunk_time_interval`. The user should set it according to
their needs for constraint exclusion over historical data. Ideally, it
should be a multiple of the uncompressed chunk interval, and we throw
a warning if it is not.
Depending on the statistics target, running ANALYZE on a chunk before
compression can cause a lot of random IO operations for chunks that
have more pages than ANALYZE needs to read. By moving that operation
to after the heap is loaded into memory for sorting, we increase the
chance of hitting the cache and reduce the disk operations necessary
to execute compression jobs.
This patch fixes a deadlock between chunk decompression and SELECT
queries executed in parallel. The change in
a608d7db614c930213dee8d6a5e9d26a0259da61 requests an AccessExclusiveLock
for the decompressed chunk instead of the compressed chunk, resulting in
deadlocks.
In addition, an isolation test has been added to check that SELECT
queries on a chunk that is currently being decompressed can be executed.
Fixes #4605
This patch introduces a further check to compress_chunk_impl and
decompress_chunk_impl. After all locks are acquired, a check is made
to see if the chunk is still (un-)compressed. If the chunk was
(de-)compressed while waiting for the locks, the (de-)compression
operation is stopped.
In addition, the chunk locks in decompress_chunk_impl
are upgraded to AccessExclusiveLock to ensure the chunk is not deleted
while other transactions are using it.
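Conceptually, the added check looks like this (the status helper is
hypothetical, not the exact TimescaleDB code):

    #include "postgres.h"
    #include "storage/lmgr.h"

    /* hypothetical helper that re-reads the chunk's compression status */
    static bool chunk_is_already_decompressed(Oid chunk_relid);

    static void
    decompress_chunk_guarded(Oid chunk_relid)
    {
        /* take the stronger lock first ... */
        LockRelationOid(chunk_relid, AccessExclusiveLock);

        /*
         * ... then re-check the chunk status: another transaction may have
         * decompressed the chunk while we were waiting for the lock.
         */
        if (chunk_is_already_decompressed(chunk_relid))
        {
            ereport(NOTICE,
                    (errmsg("chunk is already decompressed, skipping")));
            return;
        }

        /* ... proceed with the actual decompression */
    }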
Fixes: #4480
A chunk in the frozen state cannot be dropped.
drop_chunks will skip over frozen chunks without erroring.
The internal API drop_chunk will error if you attempt to drop a
frozen chunk without unfreezing it first.
This PR also adds a new internal API to unfreeze a chunk.
The compression code had two files using the same header guard. This
patch renames the file with floating point helper functions to
float_utils.h and renames the other file to compression/api since
that more clearly reflects the purpose of the functions.