If a chunk is dropped but it has a continuous aggregate that is
not dropped, we want to preserve the chunk catalog row instead of
deleting it. This prevents dangling identifiers in the
materialization hypertable. We also preserve the dimension slice
and chunk constraint rows for the chunk, since those will be necessary
when enabling this with multinode and are needed to recreate the
chunk. The postgres objects associated with the chunk (table,
constraints, indexes) are all dropped.
If data is ever reinserted into the same data region, the chunk is
recreated with the same dimension definitions as before, and the
postgres objects are simply recreated.
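As a rough illustration (assuming the chunk catalog keeps a boolean
dropped flag for such chunks), the preserved rows could be inspected
like this:

    -- sketch only: assumes _timescaledb_catalog.chunk has a "dropped" column
    -- marking chunks whose postgres objects were dropped but whose catalog row remains
    SELECT id, schema_name, table_name, dropped
    FROM _timescaledb_catalog.chunk
    WHERE dropped;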
Previously, the Chunk struct was used to represent both a full
chunk and the stub used for joins. The stub used for joins
contained valid values for only some chunk fields; after the join
determined that a chunk was complete, the rest of the fields were
filled in. The fact that a chunk could have only some fields filled
out at different times made the code hard to follow and error prone.
So we separate the stub state of the chunk into its own struct
that does not contain the not-yet-filled-out fields. This leverages
the type system to prevent code from accessing invalid fields during
the join phase and makes the code easier to follow.
We want compressed data to be stored out-of-line whenever possible so
that the headers are colocated and scans on the metadata and segmentbys
are cheap. This commit lowers toast_tuple_target to 128 bytes so that
this happens for more tables; with the default target, a non-trivial
portion of the data often ends up in the main table, and only very
few rows fit in a page.
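For reference, the storage parameter being lowered is the standard
PostgreSQL toast_tuple_target; on a single table (name hypothetical)
the setting looks like:

    -- plain-PostgreSQL illustration; the commit applies the lower target to the
    -- internal compressed tables rather than requiring users to set it
    ALTER TABLE compressed_chunk_example SET (toast_tuple_target = 128);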
This commit adds tests for DATE, TIMESTAMP, and FLOAT compression and
decompression, NULL compression and decompression in dictionaries and
fixes a bug where the database would refuse to decompress DATEs. This
commit also removes the fallback allowing any binary compatible 8-byte
types to be compressed by our integer compressors as I believe I found
a bug in said fallback last time I reviewed it, and cannot recall what
the bug was. These can be re-added later, with appropriate tests.
Queries with the first/last optimization on compressed chunks
would not properly decompress data but instead access the uncompressed
chunk. This patch fixes the behaviour and also unifies the check
whether a hypertable has compression.
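The affected queries are of the form below (table and column names
hypothetical); before the fix they could read from the uncompressed
chunk instead of decompressing:

    SELECT device_id,
           first(value, time) AS first_value,
           last(value, time)  AS last_value
    FROM   metrics
    GROUP  BY device_id;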
This commit fixes issues reported by coverity. Of these, the only real
issue is an integer overflow in bitarray, which can never happen in its
current usages. This also adds a PG_USED_FOR_ASSERTS_ONLY for a
variable only used for Assert.
Since enabling compression creates limits on the hypertable
(e.g. types of constraints allowed) even if there are no
compressed chunks, we add the ability to turn off compression.
This is only possible if there are no compressed chunks.
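A minimal sketch of turning compression back off, assuming the boolean
form of the compress option and a hypothetical hypertable:

    -- only succeeds while no chunks are compressed
    ALTER TABLE metrics SET (timescaledb.compress = false);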
This commit improves the API of compress_chunk and decompress_chunk:
- have it return the chunk regclass processed (or NULL in the
idempotent case);
- mark it as STRICT
- add if_not_compressed/if_compressed options for idempotency
Some small improvements:
- allow ALTER TABLE with an empty segment by if the original definition
  had an empty segment by. Improve error messages.
- block compression on tables with OIDs
- block compression on tables with RLS
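A usage sketch of the if_not_compressed/if_compressed options described
above (object names hypothetical):

    -- skip chunks that are already compressed instead of erroring
    SELECT compress_chunk(c, if_not_compressed => true)
    FROM   show_chunks('metrics') c;

    -- and the mirror image for decompression
    SELECT decompress_chunk(c, if_compressed => true)
    FROM   show_chunks('metrics') c;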
For tablespaces with compressed chunks the semantics are the following:
- compressed chunks get put into the same tablespace as the
uncompressed chunk on compression.
- set tablespace on uncompressed hypertable cascades to compressed hypertable+chunks
- set tablespace on all chunks is blocked (same as w/o compression)
- move chunks on an uncompressed chunk errors
- move chunks on a compressed chunk works
In the future we will:
- add tablespace option to compress_chunk function and policy (this will override the setting
of the uncompressed chunk). This will allow changing tablespaces upon compression
- Note: The current plan is to never listen to the setting on compressed hypertable. In fact,
we will block setting tablespace on compressed hypertables
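A sketch of the current semantics above (object and tablespace names
hypothetical; the exact move_chunk arguments may differ):

    -- cascades to the internal compressed hypertable and its chunks
    ALTER TABLE metrics SET TABLESPACE history_space;

    -- moving the compressed chunk directly works; moving the uncompressed one errors
    SELECT move_chunk(chunk => '_timescaledb_internal.compress_hyper_2_4_chunk',
                      destination_tablespace => 'history_space',
                      index_destination_tablespace => 'history_space');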
The statistics on segmentby and metadata columns are very important,
as each compressed row stands in for roughly a thousand uncompressed
rows. Statistics on the compressed columns are irrelevant, as the
regular postgres planner cannot understand the compressed columns.
This commit sets the statistics targets for compressed tables
accordingly, weighting the uncompressed columns heavily and the
compressed columns not at all.
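The effect is roughly equivalent to the following per-column settings
on a compressed chunk (names hypothetical):

    -- segmentby/metadata columns: statistics matter, so weight them heavily
    ALTER TABLE compress_hyper_2_4_chunk ALTER COLUMN device_id SET STATISTICS 1000;
    -- compressed payload columns: opaque to the planner, so collect no statistics
    ALTER TABLE compress_hyper_2_4_chunk ALTER COLUMN value SET STATISTICS 0;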
Primary and unique constraints are limited to segment_by and order_by
columns, and foreign key constraints are limited to segment_by columns
when creating a compressed hypertable. There are no restrictions on
check constraints.
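For example (names hypothetical), a unique constraint is accepted only
when its columns are covered by the segment by and order by lists:

    CREATE TABLE readings (
        time      timestamptz NOT NULL,
        device_id int         NOT NULL,
        value     float,
        UNIQUE (device_id, time)
    );
    SELECT create_hypertable('readings', 'time');

    -- allowed: the unique columns are exactly the segmentby + orderby columns
    ALTER TABLE readings SET (timescaledb.compress,
                              timescaledb.compress_segmentby = 'device_id',
                              timescaledb.compress_orderby   = 'time');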
This simplifies the code and the access to the min/max
metadata. Before, we used a custom type; now the min and max are
stored as two columns of the same type as the underlying column.
The custom type used previously is removed.
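Conceptually, the per-segment min/max now sit beside the compressed
data as plain columns; a sketch (the _ts_meta_* names and chunk name
are illustrative):

    SELECT device_id, _ts_meta_min_1 AS time_min, _ts_meta_max_1 AS time_max
    FROM   _timescaledb_internal.compress_hyper_2_4_chunk
    LIMIT  5;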
This is a refactor of the array and dictionary code to use binary
string functions in datum serialization to consolidate code. We also
make datum serialization more flexible: it no longer has to use
a byte to store the encoding type (binary or text) but can instead
take it as input. This makes the encoding use less data in the
array case.
This commit switches the array compressor code to using
DatumSerializer/DatumDeserializer to reduce code duplication
and to add in some more efficiency.
Previously, the detoasting in Array was incorrect, so the compressed
table stored pointers into the toast table of the uncompressed table.
This commit fixes the bug and also adds logic to the test to remove
the uncompressed table so that such a bug would cause test failures
in the future.
This commit alters decompress_chunk to free memory as soon as possible
instead of waiting until the function ends. This should decrease peak
memory usage from roughly the size of the dataset to roughly the size
of a single compressed row.
Before this PR some state (most notably deTOASTed values) would persist
across compressed rows during compress_chunk, despite the fact that
they were no longer needed. This increased peak memory usage of
compress_chunk. This commit adds a MemoryContext that is reset after
each compressed row is inserted, ensuring that state needed for only
one row does not hang around longer than needed.
- Fix declaration of functions wrt TSDLLEXPORT consistency
- Empty structs need to be created with '{ 0 }' syntax.
- Alignment sentinels have to use uint64 instead of a struct
with a 0-size member
- Add some more ORDER BY clauses in the tests to constrain
the order of results
- Add ANALYZE after running compression in
transparent-decompression test
This fixes deletion of related rows when we have compressed
hypertables. Namely we delete rows from:
- compression_chunk_size
- hypertable_compression
We also fix hypertable_compression to handle NULLs correctly.
We add a stub for tests with continuous aggs as well as compression,
but that's broken for now, so it's commented out; it will be fixed
in another PR.
This commit adds handling for dropping of chunks and hypertables
in the presence of associated compressed objects. If the uncompressed
chunk/hypertable is dropped, the associated compressed object is
dropped using DROP_RESTRICT unless cascading is explicitly enabled.
Also add a compressed_chunk_id index on compressed tables for
figuring out whether a chunk is compressed or not.
Change a bunch of APIs to use DropBehavior instead of a cascade bool
to be more explicit.
Also test the drop chunks policy.
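A usage sketch of the drop behavior (hypertable name hypothetical):

    -- dropping the uncompressed hypertable also drops its associated compressed
    -- hypertable and chunks; no explicit CASCADE onto the compressed objects is needed
    DROP TABLE readings;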
This commit pushes down quals on orderby columns to make
use of the SegmentMetaMinMax objects. Namely, =, <, <=, >, >= quals
can now be pushed down.
We also remove filters from the decompress node for quals that
have been pushed down and don't need a recheck.
This commit also changes tests to add more segment by and
order-by columns.
Finally, we rename the segment meta accessor functions to have shorter names.
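The kind of query that benefits (names hypothetical): the qual on the
orderby column can now exclude whole segments via the min/max metadata
instead of being rechecked after decompression:

    EXPLAIN (costs off)
    SELECT *
    FROM   readings
    WHERE  time >= now() - interval '1 day';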
Add a sequence id to the compressed table. This id increments
monotonically for each compressed row in a way that follows
the order by clause. We leave gaps to allow for the
possibility to fill in rows due to e.g. inserts down
the line.
The sequence id is global to the entire chunk and does not reset
for each segment-by-group change, since this has the potential
to allow some micro-optimizations when ordering by segment by
columns as well.
The sequence number is an INT32, which allows up to roughly 200 billion
uncompressed rows per chunk (assuming 1000 rows per compressed row
and a gap of 10: 2^31 / 10 * 1000 is about 214 billion). Overflow is
checked in the code and will error if this is breached.
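A sketch of how the sequence numbers might look on a compressed chunk
(the _ts_meta_sequence_num column name and chunk name are illustrative);
values increase in steps of 10 following the order by clause, leaving
room to renumber rows inserted later:

    SELECT device_id, _ts_meta_sequence_num
    FROM   _timescaledb_internal.compress_hyper_2_4_chunk
    ORDER  BY _ts_meta_sequence_num;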
Timestamptz is an integer-like type, and thus should use deltadelta
encoding by default. Making this change uncovered a bug where RLE was
truncating values on decompression, which has also been fixed.
This commit integrates the SegmentMetaMinMax into the
compression logic. It adds metadata columns to the compressed table
and sets them correctly upon compression.
We also fix several errors with datum detoasting in SegmentMetaMinMax.
This commit changes deltadelta compression to store the first element
in the simple8b array instead of out-of-line. Besides shrinking the
data in some cases, this also ensures that the simple8b array is never
empty, fixing the case where only a single element is stored.
The error hint for compress_chunk misspelled the option to use
for enabling compression. This patch changes the error hint and
also makes the hint a proper sentence.
Add the type for the min/max segment meta object. Segment metadata
objects keep metadata about the data in segments (compressed rows).
The min/max variant keeps the min and max values inside the compressed
object. It will be used on compression order by columns to allow
queries with quals on those columns to exclude entire segments when
no uncompressed row in the segment can match the qual.
We also add generalized infrastructure for datum serialization
/ deserialization for arbitrary types to and from memory as well
as binary strings.
This rebuilds indexes during compression and decompression. Previously,
indexes were not updated during these operations. We also fix
a small bug with orderby and segmentby handling of empty strings/
lists.
Finally, we add some more tests.
Add the column to the order by list if it's not already there.
This is never wrong and might improve performance. It also
guarantees that we have at least one ordering column
during compression and therefore can always use tuplesort
(otherwise we'd need a non-tuplesort method of getting tuples).
Replace custom parsing of order by and segment by lists
with the postgres parser. The segment by list is now
parsed in the same way as the GROUP BY clause and the
order by list in the same way as the ORDER BY clause.
Also fix default for nulls first/last to follow the PG
convention: LAST for ASC, FIRST for DESC.
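For example (names hypothetical), these two orderby specifications are
now parsed as equivalent:

    ALTER TABLE readings SET (timescaledb.compress,
                              timescaledb.compress_orderby = 'time DESC');
    -- same as the explicit form
    ALTER TABLE readings SET (timescaledb.compress,
                              timescaledb.compress_orderby = 'time DESC NULLS FIRST');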
This is useful if some or all compressed columns are NULL.
The count reflects the number of uncompressed rows in the
compressed row and is stored as a 32-bit integer.
Add support for compress_chunks function.
This also adds support for compress_orderby and compress_segmentby
parameters in ALTER TABLE. These parameters are used by the
compress_chunks function.
The parsing code will most likely be changed to use PG raw_parser
function.
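A minimal sketch of the new parameters (names hypothetical; the exact
compress_chunks call signature is not shown in this commit, so it is
omitted here):

    ALTER TABLE readings SET (timescaledb.compress,
                              timescaledb.compress_segmentby = 'device_id',
                              timescaledb.compress_orderby   = 'time DESC');
    -- the hypertable's chunks can then be compressed with the new compress_chunks function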