This commit modifies analyze behavior as follows:
1. When an internal compression table is analyzed,
statistics from the compressed chunk (such as page
count and tuple count) are used to update the
statistics of the corresponding chunk parent, if
they are missing.
2. Analyze compressed chunks instead of raw chunks.
When the command ANALYZE <hypertable> is executed,
a) uncompressed chunks are analyzed as usual, and
b) for compressed chunks, the raw chunk is skipped and the
corresponding compressed chunk is analyzed instead (see the
example below).
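A minimal illustration of the new behavior (the hypertable name
`metrics` is an assumption for this sketch):

```sql
-- Assumed hypertable "metrics" with a mix of compressed and uncompressed
-- chunks. Uncompressed chunks are analyzed directly; for compressed chunks
-- the raw chunk is skipped and its compressed counterpart is analyzed.
ANALYZE metrics;
```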
This patch fixes a segfault in decompress_chunk for chunks with dropped
columns. Since dropped columns don't exist in the compressed chunk,
the values for those columns were undefined in the decompressed tuple,
leading to a segfault when trying to build the heap tuple.
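A sketch of the previously failing scenario (table and column names are
illustrative):

```sql
-- Drop a column, then compress and decompress a chunk; decompression
-- previously segfaulted because the dropped column had no value in the
-- compressed chunk.
CREATE TABLE metrics(time timestamptz NOT NULL, device int, temp float, junk int);
SELECT create_hypertable('metrics', 'time');
ALTER TABLE metrics DROP COLUMN junk;
ALTER TABLE metrics SET (timescaledb.compress, timescaledb.compress_segmentby = 'device');
INSERT INTO metrics VALUES ('2020-01-01', 1, 22.5);
SELECT compress_chunk(c) FROM show_chunks('metrics') c;
SELECT decompress_chunk(c) FROM show_chunks('metrics') c;
```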
There is a bug in some versions of PG11 where ANALYZE was not
calling CommandCounterIncrement. This is causing us to fail to
update pg_class statistics during compression for those versions.
To work around this, this change adds an explicit
CommandCounterIncrement call after ExecVacuum in PG11.
Fixes #2581
Errors and messages are overhauled to conform to the official
PostgreSQL style guide. In particular, the following things from the
guide have been given special attention:
* Correct capitalization of the first letter: capitalize only hint
and detail messages.
* Correct handling of periods at the end of messages (should be elided
for primary message, but not detail and hint messages).
* The primary message should be short, factual, and avoid reference to
implementation details such as specific function names.
Some messages have also been reworded for clarity and to better
conform with the last bullet above (short primary message). In other
cases, messages have been updated to fix references to, e.g., function
parameters that used the wrong parameter name.
Closes #2364
This change captures the reltuples and relpages (and relallvisible)
statistics from the pg_class table for chunks immediately before
truncating them during the compression code path. It then restores
the values after truncating, as there is no way to keep PostgreSQL
from clearing these values during this operation. It also uses these
values properly during planning, working around some PostgreSQL code
that substitutes arbitrary sizes for tables that do not appear to
hold any data.
Fixes #2524
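For reference, these are the pg_class columns preserved across the
truncate by the change above (the chunk name below is illustrative):

```sql
SELECT relname, relpages, reltuples, relallvisible
FROM pg_class
WHERE relname = '_hyper_1_1_chunk';  -- example chunk name
```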
This patch removes enterprise license support and moves the
move_chunk() function under the community license (TSL).
The license validation code has been reworked and simplified.
The previously used timescaledb.license_key GUC has been renamed to
timescaledb.license.
This change also makes the testing code stricter about the license
in use. The Apache test suite can now test only Apache-licensed
functions.
Fixes #2359
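A quick sketch of the renamed GUC; the accepted values shown in the
comments are assumptions based on this change:

```sql
-- Inspect the current license setting.
SHOW timescaledb.license;
-- In postgresql.conf, e.g.:
--   timescaledb.license = 'timescale'   -- TSL/community functionality enabled
--   timescaledb.license = 'apache'      -- Apache-licensed functionality only
```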
We have some debug messages that are printed as notices but are more
suitable for the `DEBUG1` level. This commit demotes a notice about
indexes being added to a `DEBUG1` message.
This change will ensure that the pg_statistics on a chunk are
updated immediately prior to compression. It also ensures that
these stats are not overwritten as part of a global or hypertable
targeted ANALYZE.
This addresses the issue that a chunk will no longer generate valid
statistics during an ANALYZE once its data has been moved to the
compressed table. Unfortunately, any compressed rows will not be
captured in the parent hypertable's pg_statistics as there is no way
to change how PostgreSQL samples child tables in PG11.
This approach assumes that the compressed table remains static, which
is mostly correct in the current implementation (though it is
possible to remove compressed segments). Once we start allowing more
operations on compressed chunks this solution will need to be
revisited. Note that in PG12 an approach leveraging table access
methods will not have a problem analyzing compressed tables.
When enabling compression on a hypertable, the existing
constraints are cloned to the new compressed hypertable.
During validation of the existing constraints, a loop
through the conkey array is performed, and the constraint name
was erroneously added to the list multiple times. This fix
moves the addition to the list outside the conkey loop.
Fixes #2000
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros, the following changes are made:
- remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
- remove PG_VERSION_SUPPORTS_MULTINODE
This patch changes the order in which locks are taken during
compression to avoid taking strong locks for long periods on referenced
tables.
Previously, constraints from the uncompressed chunk were copied to the
compressed chunk before compressing the data. When the uncompressed
chunk had foreign key constraints, this resulted in a
ShareRowExclusiveLock being held on the referenced table for the
remainder of the transaction, which includes the (potentially long)
period while the data is compressed, and prevented any
INSERTs/UPDATEs/DELETEs on the referenced table during the remainder of
the time it took the compression transaction to complete.
Copying constraints after completing the actual data compression does
not pose safety issues (as any updates to referenced keys are caught by
the FK constraint on the uncompressed chunk), and it enables the
compression job to minimize the time during which strong locks are held
on referenced tables.
Fixes #1614
This change replaces the existing `clang-tidy` linter target with
CMake's built-in support for it. The old way of invoking the linter
relied on the `run-clang-tidy` wrapper script, which is not installed
by default on some platforms. Discovery of the `clang-tidy` tool has
also been improved to work with more installation locations.
As a result, linting now happens at compile time and is enabled
automatically when `clang-tidy` is installed and found.
In enabling `clang-tidy`, several non-trivial issues were discovered
in compression-related code. These might be false positives, but,
until a proper solution can be found, "warnings-as-errors" has been
disabled for that code to allow compilation to succeed with the linter
enabled.
Initial support for compression on distributed hypertables. This
_only_ includes the ability to run `compress_chunk` and
`decompress_chunk` on a distributed hypertable. There is no support
for automation, at least not beyond what one can do individually on
each data node.
Note that an access node keeps no local metadata about which
distributed hypertables have compressed chunks. This information needs
to be fetched directly from data nodes, although such functionality is
not yet implemented. For example, informational views on the access
nodes will not yet report the correct compression states for
distributed hypertables.
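A hedged usage sketch (the distributed hypertable name `dist_metrics`
is an assumption); there is no policy support yet, so chunks are
compressed explicitly from the access node:

```sql
ALTER TABLE dist_metrics SET (timescaledb.compress,
                              timescaledb.compress_orderby = 'time DESC');
SELECT compress_chunk(c) FROM show_chunks('dist_metrics') c;
```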
In distributed hypertables, chunks are foreign tables and such tables
do not support (or should not support) indexes, certain constraints,
and triggers. Therefore, such objects should not recurse to foreign
table chunks nor add mappings in the `chunk_constraint` or
`chunk_index` tables.
This change ensures that we properly filter out the indexes, triggers,
and constraints that should not recurse to chunks on distributed
hypertables.
A frontend node will now maintain mappings from a local chunk to the
corresponding remote chunks in a `chunk_server` table.
The frontend creates local chunks as foreign tables and adds entries
to `chunk_server` for each chunk it creates on a remote data node.
Currently, the creation of remote chunks is not implemented, so a
dummy chunk_id for the remote chunk will be added instead for testing
purposes.
Unless otherwise listed, each TODO was converted to a comment or put
into an issue tracker.
test/sql/
- triggers.sql: Made required change
tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete
tsl/src/
- compression/compression.c:
- row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
- compressor_finish: Removed (obsolete)
src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
- process_altertable_end_subcmd: Removed (PG will handle case)
The internal chunk API is updated to avoid returning `Chunk` objects
that are marked `dropped=true`, along with some refactoring, hardening,
and cleanup of the internal chunk APIs. In particular, apart from
being returned in a dropped state, chunks could also be returned in a
partial state (without all fields set, partial constraints,
etc.). None of this is allowed as of this change. Further, lock
handling was unclear when joining chunk metadata from different
catalog tables. This is made clear by having chunks built within
nested scan loops so that proper locks are held when joining in
additional metadata (such as constraints).
This change also fixes issues with dropped chunks that caused chunk
metadata to be processed many times instead of just once, leading to
potential bugs or bad performance.
In particular, since the introduction of the `dropped` flag, chunk
metadata can exist in two states:
1. `dropped=false`
2. `dropped=true`
When dropping chunks (e.g., via `drop_chunks`, `DROP TABLE <chunk>`,
or `DROP TABLE <hypertable>`) there are also two modes of dropping:
1. DELETE the row
2. UPDATE the row and SET dropped=true
The deletion mode and the current state of the chunk lead to a
cross-product of four cases when dropping/deleting a chunk:
1. DELETE row when dropped=false
2. DELETE row when dropped=true
3. UPDATE row when dropped=false
4. UPDATE row when dropped=true
Unfortunately, the code didn't distinguish between these cases. In
particular, case (4) should not be able to happen, but since it did, it
led to a recursive loop in which an UPDATE created a new tuple that the
same loop then recursed into, and so on.
To fix this recursive loop and make the code for dropping chunks less
error-prone, a number of assertions have been added, along with some
new lightweight scan functions to access chunk information without
building a full-blown chunk.
This change also removes the need to provide the number of constraints
when scanning for chunks. This was really just a hint, and it is no
longer needed since all constraints are joined in regardless.
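For reference, the two chunk states discussed above are visible in the
internal catalog (the column list below is abbreviated):

```sql
SELECT id, hypertable_id, schema_name, table_name, dropped
FROM _timescaledb_catalog.chunk;
```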
This change fixes various compiler warnings that show up on different
compilers and platforms. In particular, MSVC is sensitive to functions
that do not return a value after throwing an error since it doesn't
realize that the code path is not reachable.
Drop foreign key constraints from uncompressed chunks during
compression. This allows data deletion in FK-referenced tables to
cascade to compressed chunks. The foreign key constraints are restored
during decompression.
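A hedged sketch of the resulting behavior (table and column names are
illustrative; on the compressed side, foreign keys are limited to
segment_by columns):

```sql
CREATE TABLE devices(id int PRIMARY KEY);
CREATE TABLE metrics(time timestamptz NOT NULL,
                     device_id int REFERENCES devices(id) ON DELETE CASCADE,
                     temp float);
SELECT create_hypertable('metrics', 'time');
ALTER TABLE metrics SET (timescaledb.compress,
                         timescaledb.compress_segmentby = 'device_id');
-- ... insert data and compress chunks ...
-- Per this change, the delete below also cascades to rows stored in
-- compressed chunks.
DELETE FROM devices WHERE id = 1;
```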
Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy so this change turns
these booleans into one flag parameter.
Correcting conditions in #ifdefs, adding missing includes, removing
and rearranging existing includes, replacing PG12 with PG12_GE for
forward compatibility. Fixed a number of places where relation_close
should have been table_close, which were missed earlier.
relation_open is a general function that is wrapped by more specific
functions per relation type. This commit replaces calls to it with the
specific functions, which check for the correct relation type.
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as follows:
1. construct and add `RelOptInfo`s for base rels and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.
However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:
1. construct and add RelOptInfo for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with make_one_rel
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` entails
a number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
under the assumption it has no inheritance children, so if it's a
hypertable we have to replan its paths
Unfortunately, the PostgreSQL functions for doing this are static, so
we need to copy them into our own codebase instead of just calling
PostgreSQL's.
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it was a `SELECT` and all
leaf tables are discovered.
- In stage 2, the original query is planned directly against each leaf
table discovered in stage 1, not as part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not yet initialized at the point we
wish to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
This change fixes a number of typos and issues with inconsistent
formatting for compression-related code. A couple of other fixes for
variable names, etc. have also been applied.
When trying to compress a chunk that had a column of datatype
interval, delta-delta compression would be selected for the column,
but our delta-delta compression does not support interval and
would throw an error when trying to compress the chunk.
This PR changes the compression selected for interval columns to
dictionary compression.
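A minimal sketch of the previously failing case (names are
illustrative):

```sql
CREATE TABLE events(time timestamptz NOT NULL, duration interval);
SELECT create_hypertable('events', 'time');
ALTER TABLE events SET (timescaledb.compress, timescaledb.compress_orderby = 'time');
INSERT INTO events VALUES ('2020-01-01', interval '5 minutes');
-- Previously errored because delta-delta was chosen for the interval
-- column; dictionary compression is now used instead.
SELECT compress_chunk(c) FROM show_chunks('events') c;
```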
Refactors multiple implementations of finding hypertables in the cache
and failing with different error messages if not found. The
implementations are replaced with calls to functions that encapsulate
a single error message. This provides a unified error message and
removes the need for copy-paste.
Previously we could have a dangling policy and job referring
to a now-dropped hypertable.
We also block changing the compression options if a policy exists.
Fixes #1570
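A hedged sketch of the second point; the policy function name and
arguments below reflect this era and are assumptions:

```sql
SELECT add_compress_chunks_policy('metrics', INTERVAL '7 days');
-- With the policy in place, changing compression options is now blocked:
ALTER TABLE metrics SET (timescaledb.compress,
                         timescaledb.compress_segmentby = 'device');  -- errors
```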
The constraint check previously assumed that the col_meta
offset for a column was equal to that column's attribute
offset. This is incorrect in the presence of dropped columns.
Fixed to match on column names.
Fixes #1590
If a chunk is dropped but has a continuous aggregate that is
not dropped, we want to preserve the chunk catalog row instead of
deleting it. This is to prevent dangling identifiers in the
materialization hypertable. It also preserves the dimension slice
and chunk constraint rows for the chunk, since those will be necessary
when enabling this with multinode and are necessary to recreate the
chunk. The postgres objects associated with the chunk are all
dropped (table, constraints, indexes).
If data is ever reinserted to the same data region, the chunk is
recreated with the same dimension definitions as before. The postgres
objects are simply recreated.
Previously, the Chunk struct was used to represent both a full
chunk and the stub used for joins. The stub used for joins
only contained valid values for some chunk fields and not others.
After the join determined that a Chunk was complete, it filled
in the rest of the chunk's fields. The fact that a chunk could have
only some fields filled out and not others at different times
made the code hard to follow and error prone.
So we separate out the stub state of the chunk into a separate
struct that doesn't contain the not-filled-out fields inside
of it. This leverages the type system to prevent errors that
try to access invalid fields during the join phase and makes
the code easier to follow.
We want compressed data to be stored out-of-line whenever possible so
that the headers are colocated and scans on the metadata and segmentbys
are cheap. This commit lowers toast_tuple_target to 128 bytes so that
this happens for more tables; with the default size, a non-trivial
portion of the data very often ends up in the main table, and only
very few rows are stored in a page.
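The effect corresponds to setting this storage parameter on the
internal compressed table; shown here on a standalone table purely for
illustration (the table name is an assumption):

```sql
-- toast_tuple_target is a standard PostgreSQL storage parameter (PG11+);
-- 128 bytes is its minimum allowed value.
ALTER TABLE example_compressed SET (toast_tuple_target = 128);
```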
This commit adds tests for DATE, TIMESTAMP, and FLOAT compression and
decompression, NULL compression and decompression in dictionaries and
fixes a bug where the database would refuse to decompress DATEs. This
commit also removes the fallback allowing any binary compatible 8-byte
types to be compressed by our integer compressors as I believe I found
a bug in said fallback last time I reviewed it, and cannot recall what
the bug was. These can be re-added later, with appropriate tests.
Queries with the first/last optimization on compressed chunks
would not properly decompress data but instead access the uncompressed
chunk. This patch fixes the behaviour and also unifies the check for
whether a hypertable has compression.
This commit fixes issues reported by coverity. Of these, the only real
issue is an integer overflow in bitarray, which can never happen in its
current usages. This also adds a PG_USED_FOR_ASSERTS_ONLY for a
variable only used for Assert.
Since enabling compression places restrictions on the hypertable
(e.g. the types of constraints allowed), even if there are no
compressed chunks, we add the ability to turn off compression.
This is only possible if there are no compressed chunks.
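A minimal sketch (the hypertable name is an assumption):

```sql
-- Allowed only while no compressed chunks exist.
ALTER TABLE metrics SET (timescaledb.compress = false);
```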
This commit improves the API of compress_chunk and decompress_chunk
(see the example below):
- have it return the regclass of the chunk processed (or NULL in the
idempotent case)
- mark it as STRICT
- add if_not_compressed/if_compressed options for idempotency
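A hedged usage sketch of the updated interface (the hypertable name is
an assumption):

```sql
-- Returns each chunk's regclass, or NULL for chunks that were already
-- compressed (instead of raising an error).
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('metrics') c;
```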
Some small improvements:
- allow ALTER TABLE with an empty segment_by if the original definition
had an empty segment_by; improve error messages
- block compression on tables with OIDs
- block compression on tables with RLS
For tablespaces with compressed chunks the semantics are the following:
- compressed chunks get put into the same tablespace as the
uncompressed chunk on compression.
- set tablespace on an uncompressed hypertable cascades to the compressed hypertable+chunks
- set tablespace on all chunks is blocked (same as w/o compression)
- move chunks on an uncompressed chunk errors
- move chunks on a compressed chunk works
In the future we will:
- add tablespace option to compress_chunk function and policy (this will override the setting
of the uncompressed chunk). This will allow changing tablespaces upon compression
- Note: The current plan is to never listen to the setting on compressed hypertable. In fact,
we will block setting tablespace on compressed hypertables
The statistics on segmentby and metadata columns are very important as
they describe data that expands roughly a thousand-fold when
decompressed. Statistics on the compressed columns are irrelevant, as
the regular postgres planner cannot understand the compressed
columns. This commit sets the statistics for compressed tables based
on this, weighting the uncompressed columns heavily and the compressed
columns not at all.
Primary and unique constraints are limited to segment_by and order_by
columns, and foreign key constraints are limited to segment_by columns,
when creating a compressed hypertable. There are no restrictions on
check constraints.
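A minimal sketch of a definition that satisfies these rules (names are
illustrative):

```sql
CREATE TABLE readings(time timestamptz NOT NULL, device int, val float,
                      UNIQUE (device, time));
SELECT create_hypertable('readings', 'time');
-- The unique constraint's columns are covered by segment_by + order_by.
ALTER TABLE readings SET (timescaledb.compress,
                          timescaledb.compress_segmentby = 'device',
                          timescaledb.compress_orderby = 'time');
```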
This simplifies the code and the access to the min/max
metadata. Before, we used a custom type; now the min/max values
are just the same type as the underlying column and are stored as two
columns.
This also removes the custom type that was used before.