This patch allows unique constraints on compressed chunks. When
trying to INSERT into a compressed chunk with unique constraints,
any potentially conflicting compressed batches will be decompressed
to let PostgreSQL do constraint checking on the INSERT.
With this patch only INSERT ON CONFLICT DO NOTHING will be supported.
For decompression, only segmentby information is considered to
determine conflicting batches. This will be enhanced in a follow-up
patch to also include orderby metadata so that fewer batches need
to be decompressed.
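A minimal usage sketch (table and column names are illustrative, not from the patch):

```sql
-- Hypertable with a unique constraint, compressed with a segmentby column.
CREATE TABLE metrics (time timestamptz, device_id int, value float,
    UNIQUE (device_id, time));
SELECT create_hypertable('metrics', 'time');
ALTER TABLE metrics SET (timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id');

-- With this patch, the INSERT below works even when the target chunk is
-- compressed: batches with a matching segmentby value are decompressed
-- first so PostgreSQL can perform the uniqueness check.
INSERT INTO metrics VALUES ('2023-01-01', 1, 3.14)
ON CONFLICT DO NOTHING;
```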
When a TidRangeScan is a child of a ChunkAppend or ConstraintAwareAppend
node, an error is reported as "invalid child of chunk append: Node (26)".
This patch fixes the issue by recognizing TidRangeScan as a valid child.
Fixes: #4872
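A query shape that can plan a TidRangeScan under a ChunkAppend node (table name illustrative):

```sql
SELECT * FROM metrics WHERE ctid < '(10,0)'::tid;
```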
The sequence number of the compressed tuple is per segmentby grouping
and should be reset when the grouping changes to prevent overflows with
many segmentby columns.
Chunks that are dropped but preserve the catalog entries
have an incorrect status when they are marked as dropped.
This happens if the chunk was previously compressed and then
gets dropped - the status in the catalog tuple reflects the
compression status. This should be reset since the data is now
dropped.
1. Add a compression_state column to the hypertable catalog
table by renaming the existing compressed column.
compression_state is a tri-state column that indicates
whether the hypertable has compression enabled (value = 1)
or is an internal compression table (value = 2).
2. Save compression settings on access node when compression
is turned on for a distributed hypertable
For a distributed hypertable that has compression enabled,
compression_state is set; we don't create any internal tables
on the access node.
Fixes #2660
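An illustrative catalog query using the new column (the 1 and 2 values follow the description above; 0, presumably, means compression is not enabled):

```sql
SELECT table_name, compression_state
FROM _timescaledb_catalog.hypertable;
-- compression_state: 1 = compression enabled,
--                    2 = internal compression table
```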
This commit modifies analyze behavior as follows:
1. When an internal compression table is analyzed,
statistics from the compressed chunk (such as page
count and tuple count) are used to update the
statistics of the corresponding chunk parent, if
they are missing.
2. When the command ANALYZE <hypertable> is executed,
a) uncompressed chunks are analyzed and b) for compressed
chunks, the raw chunk is skipped and the compressed chunk
is analyzed instead.
This change makes sure that ANALYZE and VACUUM commands run without
specifying any relations will not clear the stats on compressed chunks
that were saved at compression time. It will also skip any distributed
tables.
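For example (hypertable name illustrative):

```sql
ANALYZE metrics;
-- uncompressed chunks: analyzed directly
-- compressed chunks: the raw chunk is skipped and the internal
--                    compressed chunk is analyzed instead
```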
Fixes #2576
This change captures the reltuples and relpages (and relallvisible)
statistics from the pg_class table for chunks immediately before
truncating them during the compression code path. It then restores
the values after truncating, as there is no way to keep PostgreSQL
from clearing these values during this operation. It also uses
these values properly during planning, working around some
PostgreSQL code which substitutes in arbitrary sizing for tables
which don't seem to hold data.
Fixes #2524
Tests are updated to no longer use continuous aggregate options that
will be removed, such as `refresh_lag`, `max_interval_per_job` and
`ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also
been replaced with `CALL refresh_continuous_aggregate()` using ranges
that try to replicate the previous refresh behavior.
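For example, a refresh that was previously done with `REFRESH MATERIALIZED VIEW` now looks like this (view name and refresh window illustrative):

```sql
CALL refresh_continuous_aggregate('conditions_summary',
    '2020-01-01', '2020-02-01');
```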
The materializer test (`continuous_aggregate_materialize`) has been
removed, since this tested the "old" materializer code, which is no
longer used without `REFRESH MATERIALIZED VIEW`. The new API using
`refresh_continuous_aggregate` already allows manual materialization
and there are two previously added tests (`continuous_aggs_refresh`
and `continuous_aggs_invalidate`) that cover the new refresh path in
similar ways.
When updated to use the new refresh API, some of the concurrency
tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have
slightly different concurrency behavior. This is explained by
different and sometimes more conservative locking. For instance, the
first transaction of a refresh serializes around an exclusive lock on
the invalidation threshold table, even if no new threshold is
written. The previous code took the heavier lock only once, and only
if a new threshold was written. This new, stricter locking means that
insert processes that read the invalidation threshold will block for a
short time when there are concurrent refreshes. However, since this
blocking only occurs during the first transaction of the refresh
(which is quite short), it probably doesn't matter too much in
practice. The relaxing of locks to improve concurrency and performance
can be implemented in the future.
This commit adds support for `WITH NO DATA` when creating a
continuous aggregate; the continuous aggregate is refreshed on
creation unless `WITH NO DATA` is provided.
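A sketch of the syntax (table, view, and column names illustrative):

```sql
CREATE MATERIALIZED VIEW conditions_summary
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS day, avg(temperature)
FROM conditions
GROUP BY day
WITH NO DATA;  -- skip the initial refresh; omit to refresh on creation
```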
All test cases are also updated to use `WITH NO DATA`, and an additional
test case is added to verify that both `WITH DATA` and `WITH NO DATA`
work as expected.
Closes #2341
This change renames the function to approximate_row_count() and adds
support for regular tables. It returns a row count estimate for a
table instead of a table list.
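Usage sketch (table name illustrative):

```sql
SELECT approximate_row_count('conditions');
```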
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. For a continuous aggregate
the command still creates a view, whereas a regular `CREATE MATERIALIZED
VIEW` creates a table. We raise an error if `CREATE VIEW` is used to
create a continuous aggregate and redirect to `CREATE MATERIALIZED VIEW`.
In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.
Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.
Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
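In summary (view name illustrative):

```sql
-- Allowed for continuous aggregates:
ALTER MATERIALIZED VIEW conditions_summary
    SET (timescaledb.materialized_only = true);
DROP MATERIALIZED VIEW conditions_summary;
-- ALTER VIEW / DROP VIEW on a continuous aggregate raise an error,
-- except ALTER VIEW ... SET SCHEMA on the partial and direct views.
```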
Fixes #2233
Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
This change will ensure that the pg_statistics on a chunk are
updated immediately prior to compression. It also ensures that
these stats are not overwritten as part of a global or
hypertable-targeted ANALYZE.
This addresses the issue that a chunk will no longer generate valid
statistics during an ANALYZE once the data has been moved to the
compressed table. Unfortunately, any compressed rows will not be
captured in the parent hypertable's pg_statistics, as there is no way
to change how PostgreSQL samples child tables in PG11.
This approach assumes that the compressed table remains static, which
is mostly correct in the current implementation (though it is
possible to remove compressed segments). Once we start allowing more
operations on compressed chunks this solution will need to be
revisited. Note that in PG12 an approach leveraging table access
methods will not have a problem analyzing compressed tables.
The DML blocker to block INSERTs and UPDATEs on compressed hypertables
would trigger if the UPDATE or DELETE referenced any hypertable with
compressed chunks. This patch changes the logic to only block if the
target of the UPDATE or DELETE is a compressed chunk.
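For example (table names illustrative), a statement like the following merely references the compressed hypertable and is no longer blocked, since its target is not a compressed chunk:

```sql
UPDATE plain_table p
SET value = m.value
FROM metrics m   -- hypertable with compressed chunks
WHERE p.id = m.device_id;
```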
When enabling compression on a hypertable, the existing
constraints are cloned to the new compressed hypertable.
During validation of existing constraints, a loop
through the conkey array is performed, and the constraint name
was erroneously added to the list multiple times. This fix
moves the addition to the list outside the conkey loop.
Fixes #2000
The `drop_chunks` function is refactored to make table name mandatory
for the function. As a result, the function was also refactored to
accept the `regclass` type instead of table name plus schema name and
the parameters were reordered to match the order for `show_chunks`.
The commit also refactors the code to pass the hypertable structure
between internal functions rather than the hypertable relid, and moves
error checks to the PostgreSQL function. This allows the internal
functions to avoid some lookups and use the information in the
structure directly, and also gives errors earlier instead of first
dropping chunks and then erroring and rolling back the transaction.
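The refactored signature takes the relation first, as a regclass, matching `show_chunks` (names illustrative):

```sql
SELECT drop_chunks('conditions', older_than => INTERVAL '3 months');
```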
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.
test/sql/
- triggers.sql: Made required change
tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete
tsl/src/
- compression/compression.c:
- row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
- compressor_finish: Removed (obsolete)
src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
- process_altertable_end_subcmd: Removed (PG will handle case)
Drop foreign key constraints from uncompressed chunks during
compression. This allows data deletion in FK-referenced
tables to cascade to compressed chunks. The foreign key constraints
are restored during decompression.
This change refactors our main planner hooks in `planner.c` with the
intention of providing a consistent way to classify planned relations
across hooks. In our hooks, we'd like to know whether a planned
relation (`RelOptInfo`) is one of the following:
* Hypertable
* Hypertable child (a hypertable can appear as a child of itself)
* Chunk as a child of hypertable (from expansion)
* Chunk as standalone (operation directly on chunk)
* Any other relation
Previously, there was no way to consistently know which of these one
was dealing with. Instead, a mix of various functions was used without
"remembering" the classification for reuse in later sections of the
code.
When classifying relations according to the above categories, the only
source of truth about a relation is our catalog metadata. In case of
hypertables, this is cached in the hypertable cache. However, this
cache is read-through, so, in case of a cache miss, the metadata will
always be scanned to resolve a new entry. To avoid unnecessary
metadata scans, this change introduces a way to do cache-only
queries. This requires maintaining a single warmed cache throughout
planning and is enabled by using a planner-global cache object. The
pre-planning query processing warms the cache by populating it with
all hypertables in the to-be-planned query.
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as
follows:
1. construct `RelOptInfo` for base and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn't have to be too precise
about where we were doing it.
However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:
1. construct RelOptInfo for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with make_one_rel
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` entails
a number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
   under the assumption it has no inheritance children, so if it's a
   hypertable we have to replan its paths
Unfortunately, the code for doing these steps is static, so we need to
copy it into our own codebase, instead of just using PostgreSQL's.
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it was a `SELECT` and all
leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table,
discovered in stage 1, directly, not part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point we wish
to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
The test `continuous_aggs_ddl` failed on PostgreSQL 9.6 because it had
a line that tested compression on a hypertable when this feature is
not supported in 9.6. This prevented a large portion of the test
from running on 9.6.
This change moves the testing of compression on a continuous aggregate
to the `compression` test instead, which only runs on supported
PostgreSQL versions. A permission check on a view is also removed,
since similar tests are already in the `continuous_aggs_permissions`
tests.
The permission check was the only thing that caused different output
across PostgreSQL versions, so therefore the test no longer requires
version-specific output files and has been simplified to use the same
output file irrespective of PostgreSQL version.
When trying to compress a chunk that had a column of datatype
interval, delta-delta compression would be selected for the column,
but our delta-delta compression does not support interval and
would throw an error when trying to compress the chunk.
This PR changes the compression selected for interval to dictionary
compression.
This simplifies the code and the access to the min/max
metadata. Before we used a custom type, but now the min/max
are just the same type as the underlying column and stored as two
columns.
This also removes the custom type that was used before.
- Fix declaration of functions wrt TSDLLEXPORT consistency
- Empty structs need to be created with '{ 0 }' syntax.
- Alignment sentinels have to use uint64 instead of a struct
with a 0-size member
- Add some more ORDER BY clauses in the tests to constrain
the order of results
- Add ANALYZE after running compression in
transparent-decompression test
This commit pushes down quals on orderby columns to make
use of the SegmentMetaMinMax objects. Namely, =, <, <=, >, >= quals
can now be pushed down.
We also remove filters from the decompress node for quals that
have been pushed down and don't need a recheck.
This commit also changes tests to add more segmentby and
orderby columns.
Finally, we rename the segment meta accessor functions to be shorter.
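An illustrative query benefiting from the pushdown (names are not from the commit):

```sql
EXPLAIN SELECT * FROM metrics
WHERE device_id = 1 AND time >= '2020-01-01';
-- The time qual can be checked against the per-batch min/max
-- metadata, so non-matching compressed batches are filtered out
-- without decompression.
```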
Add a sequence id to the compressed table. This id increments
monotonically for each compressed row in a way that follows
the order by clause. We leave gaps to allow for the
possibility to fill in rows due to e.g. inserts down
the line.
The sequence id is global to the entire chunk and does not reset
for each segment-by-group change, since this has the potential
to allow some micro-optimizations when ordering by segmentby
columns as well.
The sequence number is an INT32, which allows up to about 200 billion
uncompressed rows per chunk to be supported (assuming 1000 rows
per compressed row and a gap of 10). Overflow is checked in the
code and will raise an error if breached.
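A rough capacity check of the stated limit (assuming ~1000 rows per compressed row and sequence gaps of 10):

```sql
SELECT (2147483647 / 10) * 1000 AS approx_max_uncompressed_rows;
-- about 214 billion, consistent with the ~200 billion figure above
```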
This commit integrates the SegmentMetaMinMax into the
compression logic. It adds metadata columns to the compressed table
and correctly sets it upon compression.
We also fix several errors with datum detoasting in SegmentMetaMinMax.
This patch adds a DecompressChunk custom scan node, which will be
used when querying hypertables with compressed chunks to transparently
decompress chunks.
This is useful if some or all compressed columns are NULL.
The count reflects the number of uncompressed rows that are
in the compressed row. Stored as a 32-bit integer.