106 Commits

Author SHA1 Message Date
gayyappan
4f865f7870 Add recompress_chunk function
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function recompress_chunk that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but not part of this PR.

This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).

The recompress_chunk function is automatically invoked by the
compression policy job when it sees that a chunk is in an unordered
state.
2021-05-24 18:03:47 -04:00
Sven Klemm
fc5f10d454 Remove chunk_dml_blocker trigger
Remove the chunk_dml_blocker trigger which was used to prevent
INSERTs into compressed chunks.
2021-05-24 18:03:47 -04:00
Sven Klemm
5f6e492474 Adjust pathkeys generation for unordered compressed chunks
Compressed chunks with inserts after being compressed have batches
that are not ordered according to compress_orderby. For those
chunks we cannot set pathkeys on the DecompressChunk node, and we
need an extra sort step if we require ordered output from those
chunks.
2021-05-24 18:03:47 -04:00
gayyappan
d9839b9b61 Support defaults, sequences, check constraints for compressed chunks
Support defaults, sequences and check constraints with inserts
into compressed chunks
2021-05-24 18:03:47 -04:00
gayyappan
93be235d33 Support for inserts into compressed hypertables
Add CompressRowSingleState.
This has functions to compress a single row.
2021-05-24 18:03:47 -04:00
Sven Klemm
eef71fdfb1 Replace StrNCpy with strlcpy
PG14 removes StrNCpy and some Name helper functions.

https://github.com/postgres/postgres/commit/1784f278a6
2021-05-20 08:54:54 +02:00
Markos Fountoulakis
bc740a32fb Add distributed hypertable compression policies
Add support for compression policies on Access Nodes. Extend the
compress_chunk() function to maintain compression state per chunk
on the Access Node.
2021-05-07 16:50:12 +03:00
Sven Klemm
d26c744115 Use %u to format Oid instead of %d
Since Oid is unsigned int we have to use %u to print it otherwise
oids >= 2^31 will not work correctly. This also switches the places
that print type oid to use format helper functions to resolve the
oids.
2021-04-14 21:11:20 +02:00
gayyappan
5be6a3e4e9 Support column rename for hypertables with compression enabled
ALTER TABLE <hypertable> RENAME <column_name> TO <new_column_name>
is now supported for hypertables that have compression enabled.

Note: Column renaming is not supported for distributed hypertables.
So this will not work on distributed hypertables that have
compression enabled.
2021-02-19 10:21:50 -05:00
Sven Klemm
8bc2568f7d Adjust compression code to PG13 AlterTable changes
PG13 changes the signature of AlterTable so we switch to using
AlterTableInternal instead.
2021-01-15 13:07:52 +01:00
gayyappan
f649736f2f Support ADD COLUMN for compressed hypertables
Support ALTER TABLE .. ADD COLUMN <colname> <typname>
for hypertables with compressed chunks.
2021-01-14 09:32:50 -05:00
Sven Klemm
002510cb01 Add compatibility wrapper functions for base64 encoding/decoding
PG13 adds a destination-length fourth argument to the pg_b64_decode
and pg_b64_encode functions, so this patch adds a macro that
translates to the 3-argument or 4-argument call depending on the
PostgreSQL version. This patch also adds checking of the return
values of those functions.

https://github.com/postgres/postgres/commit/cfc40d384a
2020-12-10 18:40:37 +01:00
gayyappan
bca1e35a52 Support disabling compression on distributed hypertables
This change allows a user to execute
ALTER TABLE hyper SET (timescaledb.compress = false)
on distributed hypertables.

Fixes #2716
2020-12-08 12:08:54 -05:00
gayyappan
d41aa2aff5 Rename macro TS_HYPERTABLE_HAS_COMPRESSION
This PR cleans up some macro names.
Rename macro TS_HYPERTABLE_HAS_COMPRESSION
to TS_HYPERTABLE_HAS_COMPRESSION_TABLE.
2020-12-02 15:44:48 -05:00
gayyappan
7c76fd4d09 Save compression settings on access node for distributed hypertables
1. Add a compression_state column to the hypertable catalog
by renaming the catalog's compressed column.
compression_state is a tri-state column.
This column indicates if the hypertable has
compression enabled (value = 1) or if it is an internal
compression table (value = 2).

2. Save compression settings on the access node when compression
is turned on for a distributed hypertable.
For a distributed hypertable that has compression enabled,
compression_state is set. We don't create any internal tables
on the access node.

Fixes #2660
2020-12-02 10:42:57 -05:00
gayyappan
05319cd424 Support analyze of internal compression table
This commit modifies analyze behavior as follows:
1. When an internal compression table is analyzed,
statistics from the compressed chunk (such as page
count and tuple count) are used to update the
statistics of the corresponding chunk parent if
they are missing.

2. Analyze compressed chunks instead of raw chunks.
When the command ANALYZE <hypertable> is executed,
a) uncompressed chunks are analyzed, and b) for compressed
chunks, the raw chunk is skipped and the compressed chunk
is analyzed instead.
2020-11-11 15:05:14 -05:00
Sven Klemm
97254783d4 Fix segfault in decompress_chunk for chunks with dropped columns
This patch fixes a segfault in decompress_chunk for chunks with dropped
columns. Since dropped columns don't exist in the compressed chunk,
the values for those columns were undefined in the decompressed tuple,
leading to a segfault when trying to build the heap tuple.
2020-11-10 10:13:45 +01:00
Brian Rowe
525e821055 Add missing increment for PG11 decompression
There is a bug in some versions of PG11 where ANALYZE does not
call CommandCounterIncrement. This causes us to fail to
update pg_class statistics during compression on those versions.
To work around this, this change adds an explicit
CommandCounterIncrement call after ExecVacuum in PG11.

Fixes #2581
2020-10-20 11:54:35 -07:00
Erik Nordström
3cf9c857c4 Make errors and messages conform to style guide
Errors and messages are overhauled to conform to the official
PostgreSQL style guide. In particular, the following things from the
guide have been given special attention:

* Correct capitalization of the first letter: capitalize only for
  hint and detail messages.
* Correct handling of periods at the end of messages (they should be
  elided for the primary message, but not for detail and hint
  messages).
* The primary message should be short, factual, and avoid reference to
  implementation details such as specific function names.

Some messages have also been reworded for clarity and to better
conform with the last bullet above (short primary message). In other
cases, messages have been updated to fix references to, e.g., function
parameters that used the wrong parameter name.

Closes #2364
2020-10-20 16:49:32 +02:00
Brian Rowe
5acf3343b5 Ensure reltuples are preserved during compression
This change captures the reltuples and relpages (and relallvisible)
statistics from the pg_class table for chunks immediately before
truncating them during the compression code path. It then restores
the values after truncating, as there is no way to keep PostgreSQL
from clearing these values during this operation. It also properly
uses these values during planning, working around some PostgreSQL
code which substitutes arbitrary sizing for tables which don't seem
to hold data.

Fixes #2524
2020-10-19 07:21:38 -07:00
Dmitry Simonenko
a51aa6d04b Move enterprise features to community
This patch removes enterprise license support and moves the
move_chunk() function under the community license (TSL).

The license validation code has been reworked and simplified.
The previously used timescaledb.license_key GUC has been renamed to
timescaledb.license.

This change also makes the testing code stricter about the license
in use. The Apache test suite can now test only Apache-licensed
functions.

Fixes #2359
2020-09-30 15:14:17 +03:00
Mats Kindahl
02ad8b4e7e Turn debug messages into DEBUG1
We have some debug messages that are printed as notices but are more
suitable at the `DEBUG1` level. This commit takes a notice about
indexes being added and turns it into a `DEBUG1` message.
2020-09-29 11:04:07 +02:00
Brian Rowe
8e1e6036af Preserve pg_stats on chunks before compression
This change will ensure that the pg_statistics on a chunk are
updated immediately prior to compression. It also ensures that
these stats are not overwritten as part of a global or
hypertable-targeted ANALYZE.

This addresses the issue that a chunk will no longer generate valid
statistics during an ANALYZE once the data has been moved to the
compressed table. Unfortunately, any compressed rows will not be
captured in the parent hypertable's pg_statistics, as there is no way
to change how PostgreSQL samples child tables in PG11.

This approach assumes that the compressed table remains static, which
is mostly correct in the current implementation (though it is
possible to remove compressed segments). Once we start allowing more
operations on compressed chunks, this solution will need to be
revisited. Note that in PG12 an approach leveraging table access
methods will not have a problem analyzing compressed tables.
2020-08-21 10:48:15 -07:00
Sven Klemm
7d230290b9 Remove unnecessary exports in tsl library
Since almost all the functions in the tsl library are accessed via
cross-module functions, there is no need to export the individual
functions.
2020-08-17 18:58:18 +02:00
Sven Klemm
0d5f1ffc83 Refactor compress chunk policy
This patch changes the compression policy to store its configuration
in the bgw_job table and removes the bgw_policy_compress_chunks table.
2020-07-30 19:58:37 +02:00
Oleg Smirnov
0e9f1ee9f5 Enable compression for tables with compound foreign key
When enabling compression on a hypertable, the existing
constraints are cloned to the new compressed hypertable.
During validation of the existing constraints, a loop
through the conkey array is performed, and the constraint name
is erroneously added to the list multiple times. This fix
moves the addition to the list outside the conkey loop.

Fixes #2000
2020-07-02 12:22:30 +02:00
gayyappan
b93b30b0c2 Add counts to compression statistics
Store information related to compressed and uncompressed row
counts after compressing a chunk. This is saved in
compression_chunk_size table.
2020-06-19 15:58:04 -04:00
Sven Klemm
c90397fd6a Remove support for PG9.6 and PG10
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros, the following changes are made:

- remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
- remove PG_VERSION_SUPPORTS_MULTINODE
2020-06-02 23:48:35 +02:00
Stephen Polcyn
d1aacdccad Change compression locking order
This patch changes the order in which locks are taken during
compression to avoid taking strong locks for long periods on referenced
tables.

Previously, constraints from the uncompressed chunk were copied to the
compressed chunk before compressing the data. When the uncompressed
chunk had foreign key constraints, this resulted in a
ShareRowExclusiveLock being held on the referenced table for the
remainder of the transaction, which includes the (potentially long)
period while the data is compressed, and prevented any
INSERTs/UPDATEs/DELETEs on the referenced table for the entire time
the compression transaction took to complete.

Copying constraints after completing the actual data compression does
not pose safety issues (as any updates to referenced keys are caught by
the FK constraint on the uncompressed chunk), and it enables the
compression job to minimize the time during which strong locks are held
on referenced tables.

Fixes #1614.
2020-06-01 16:16:05 -04:00
Erik Nordström
ccc1018f44 Fix various linter-found issues
This fixes various issues found by clang-tidy. A number of
compression-related issues still remain, however.
2020-05-29 14:04:25 +02:00
Erik Nordström
1dd9314f4d Improve linting support with clang-tidy
This change replaces the existing `clang-tidy` linter target with
CMake's built-in support for it. The old way of invoking the linter
relied on the `run-clang-tidy` wrapper script, which is not installed
by default on some platforms. Discovery of the `clang-tidy` tool has
also been improved to work with more installation locations.

As a result, linting now happens at compile time and is enabled
automatically when `clang-tidy` is installed and found.

In enabling `clang-tidy`, several non-trivial issues were discovered
in compression-related code. These might be false positives, but,
until a proper solution can be found, "warnings-as-errors" have been
disabled for that code to allow compilation to succeed with the linter
enabled.
2020-05-29 14:04:25 +02:00
Erik Nordström
686860ea23 Support compression on distributed hypertables
Initial support for compression on distributed hypertables. This
_only_ includes the ability to run `compress_chunk` and
`decompress_chunk` on a distributed hypertable. There is no support
for automation, at least not beyond what one can do individually on
each data node.

Note that an access node keeps no local metadata about which
distributed hypertables have compressed chunks. This information needs
to be fetched directly from data nodes, although such functionality is
not yet implemented. For example, informational views on the access
nodes will not yet report the correct compression states for
distributed hypertables.
2020-05-27 17:31:09 +02:00
Erik Nordström
33f1601e6f Handle constraints, triggers, and indexes on distributed hypertables
In distributed hypertables, chunks are foreign tables and such tables
do not support (or should not support) indexes, certain constraints,
and triggers. Therefore, such objects should not recurse to foreign
table chunks nor add mappings in the `chunk_constraint` or
`chunk_index` tables.

This change ensures that we properly filter out the indexes, triggers,
and constraints that should not recurse to chunks on distributed
hypertables.
2020-05-27 17:31:09 +02:00
Erik Nordström
596be8cda1 Add mappings table for remote chunks
A frontend node will now maintain mappings from a local chunk to the
corresponding remote chunks in a `chunk_server` table.

The frontend creates local chunks as foreign tables and adds entries
to `chunk_server` for each chunk it creates on a remote data node.

Currently, the creation of remote chunks is not implemented, so a
dummy chunk_id for the remote chunk will be added instead for testing
purposes.
2020-05-27 17:31:09 +02:00
Stephen Polcyn
b57d2ac388 Cleanup TODOs and FIXMEs
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.

test/sql/
- triggers.sql: Made required change

tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete

tsl/src/
- compression/compression.c:
  - row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
  - compressor_finish: Removed (obsolete)

src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
  - process_altertable_end_subcmd: Removed (PG will handle case)
2020-05-18 20:16:03 -04:00
Erik Nordström
28e9a443b3 Improve handling of "dropped" chunks
The internal chunk API is updated to avoid returning `Chunk` objects
that are marked `dropped=true` along with some refactoring, hardening,
and cleanup of the internal chunk APIs. In particular, apart from
being returned in a dropped state, chunks could also be returned in a
partial state (without all fields set, partial constraints,
etc.). None of this is allowed as of this change. Further, lock
handling was unclear when joining chunk metadata from different
catalog tables. This is made clear by having chunks built within
nested scan loops so that proper locks are held when joining in
additional metadata (such as constraints).

This change also fixes issues with dropped chunks that caused chunk
metadata to be processed many times instead of just once, leading to
potential bugs or bad performance.

In particular, since the introduction of the “dropped” flag, chunk
metadata can exist in two states: 1. `dropped=false`
2. `dropped=true`. When dropping chunks (e.g., via `drop_chunks`,
`DROP TABLE <chunk>`, or `DROP TABLE <hypertable>`) there are also two
modes of dropping: 1. DELETE row and 2. UPDATE row and SET
dropped=true.

The deletion mode and the current state of chunk lead to a
cross-product resulting in 4 cases when dropping/deleting a chunk:

1. DELETE row when dropped=false
2. DELETE row when dropped=true
3. UPDATE row when dropped=false
4. UPDATE row when dropped=true

Unfortunately, the code didn't distinguish between these cases. In
particular, case (4) should not be able to happen, but since it did,
it led to a recursion loop where an UPDATE created a new tuple that
was then recursed to in the same loop, and so on.

To fix this recursing loop and make the code for dropping chunks less
error prone, a number of assertions have been added, including some
new light-weight scan functions to access chunk information without
building a full-blown chunk.

This change also removes the need to provide the number of constraints
when scanning for chunks. This was really just a hint anyway, and it
is no longer needed since all constraints are joined in regardless.
2020-04-28 13:49:14 +02:00
Erik Nordström
0e9461251b Silence various compiler warnings
This change fixes various compiler warnings that show up on different
compilers and platforms. In particular, MSVC is sensitive to functions
that do not return a value after throwing an error since it doesn't
realize that the code path is not reachable.
2020-04-27 15:02:18 +02:00
Ruslan Fomkin
16897d2238 Drop FK constraints on chunk compression
Drop foreign key constraints from uncompressed chunks during
compression. This allows cascading data deletion in FK-referenced
tables to compressed chunks. The foreign key constraints are restored
during decompression.
2020-04-14 23:12:15 +02:00
Ruslan Fomkin
ed32d093dc Use table_open/close and PG aggregated directive
Fix more places to use table_open and table_close, introduced in
PG12. Unify PG version directives to use the aggregated macro.
2020-04-14 23:12:15 +02:00
Erik Nordström
36af23ec94 Use flags for cache query options
Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy so this change turns
these booleans into one flag parameter.
2020-04-14 23:12:15 +02:00
Ruslan Fomkin
1ddc62eb5f Refactor header inclusion
Correct conditions in #ifdefs, add missing includes, remove
and rearrange existing includes, and replace PG12 with PG12_GE for
forward compatibility. Also fix a number of places to use table_close
instead of relation_close, which were missed earlier.
2020-04-14 23:12:15 +02:00
Ruslan Fomkin
e57ee45fcf Replace general relation_open with specific
relation_open is a general function, which is called from more
specific functions per relation type. This commit replaces such
calls with the specific functions, which check for the correct
relation type.
2020-04-14 23:12:15 +02:00
Joshua Lockerman
949b88ef2e Initial support for PostgreSQL 12
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:

- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
  PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
  planner changes in PostgreSQL (gapfill, first/last, etc.)

Before PostgreSQL 12, planning was organized something like as
follows:

 1. construct and add `RelOptInfo`s for base and appendrels
 2. add restrict info, joins, etc.
 3. perform the actual planning with `make_one_rel`

For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.

However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:

 1. construct and add RelOptInfo for base rels only
 2. add restrict info, joins, etc.
 3. expand appendrels, removing irrelevant declarative partitions
 4. perform the actual planning with make_one_rel

Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.

The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` does entail
a number of annoyances:

 1. this hook is called after the sizes of tables are calculated, so we
    must recalculate the sizes of all hypertables, as they will not
    have taken the chunk sizes into account
 2. the table upon which the hook is called will have its paths planned
    under the assumption it has no inheritance children, so if it's a
    hypertable we have to replan its paths

Unfortunately, the PostgreSQL functions for doing these steps are
static, so we need to copy them into our own codebase instead of just
calling PostgreSQL's.

In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:

- In stage 1, the statement is planned as if it was a `SELECT` and all
  leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table,
  discovered in stage 1, directly, not part of an Append.

Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point we wish
to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
2020-04-14 23:12:15 +02:00
Erik Nordström
a4fb0cec3f Cleanup compression-related errors
This change fixes a number of typos and issues with inconsistent
formatting for compression-related code. A couple of other fixes for
variable names, etc. have also been applied.
2020-03-11 13:27:16 +01:00
Sven Klemm
030443a8e2 Fix compressing interval columns
When trying to compress a chunk that had a column of datatype
interval, delta-delta compression would be selected for the column,
but our delta-delta compression does not support interval and
would throw an error when trying to compress the chunk.

This PR changes the compression selected for interval to dictionary
compression.
2020-03-06 21:44:31 +01:00
gayyappan
565cca795a Support disabling compression when foreign keys are present
Fix a failure with disabling compression when no
compressed chunks are present and the table has foreign
key constraints.
2020-02-14 08:54:58 -05:00
Ruslan Fomkin
4dc0693d1f Unify error message if hypertable not found
Refactors multiple implementations of finding hypertables in cache
and failing with different error messages if not found. The
implementations are replaced with calling functions, which encapsulate
a single error message. This provides a unified error message and
removes the need for copy-paste.
2020-01-29 08:10:27 +01:00
gayyappan
b1b840f00e Use timescaledb prefix for compression errors
Modify compression parameter related error messages
to make them consistent.
2020-01-28 09:08:58 -05:00
Matvey Arye
d52b48e0c3 Delete compression policy when drop hypertable
Previously we could have a dangling policy and job referring
to a now-dropped hypertable.

We also block changing the compression options if a policy exists.

Fixes #1570
2020-01-02 16:40:59 -05:00
Matvey Arye
6122e08fcb Fix error in compression constraint check
The constraint check previously assumed that the col_meta
offset for a column was equal to that column's attribute
offset. This is incorrect in the presence of dropped columns.

Fixed to match on column names.

Fixes #1590
2020-01-02 13:56:46 -05:00