Unless otherwise listed, each TODO was converted to a comment or moved
to the issue tracker.
test/sql/
- triggers.sql: Made required change
tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete
tsl/src/
- compression/compression.c:
- row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
- compressor_finish: Removed (obsolete)
src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c:
- process_altertable_end_subcmd: Removed (PG will handle case)
This commit switches the array compressor code to using
DatumSerializer/DatumDeserializer to reduce code duplication
and improve efficiency.
Previously, the detoasting in the Array compressor was incorrect, so the
compressed table stored pointers into the TOAST table of the
uncompressed table. This commit fixes the bug and also adds logic to the
test to remove the uncompressed table, so that such a bug would cause
test failures in the future.
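A minimal sketch of the kind of fix involved (illustrative only, not the
actual compressor code): any out-of-line varlena value has to be
flattened before it is copied into the compressed row, so that no TOAST
pointer referencing the uncompressed table survives:

    #include <postgres.h>
    #include <fmgr.h>

    /*
     * Sketch only: flatten a varlena datum before storing it in the
     * compressed representation. PG_DETOAST_DATUM fetches and
     * decompresses out-of-line values, so the result no longer points
     * into the uncompressed table's TOAST relation.
     */
    static Datum
    flatten_varlena_datum(Datum value)
    {
        if (VARATT_IS_EXTENDED(DatumGetPointer(value)))
            value = PointerGetDatum(PG_DETOAST_DATUM(value));
        return value;
    }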
This commit integrates the SegmentMetaMinMax into the
compression logic. It adds metadata columns to the compressed table
and correctly sets them upon compression.
We also fix several errors with datum detoasting in SegmentMetaMinMax.
This commit changes deltadelta compression to store the first element
in the simple8b array instead of out-of-line. Besides shrinking the
data in some cases, this also ensures that the simple8b array is never
empty, fixing the case where only a single element is stored.
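A minimal sketch of the transform (the zig-zag step and the naming are
assumptions here; the real encoder is the deltadelta compressor):
because prev and prev_delta start at zero, the first element takes the
same path as every other value and becomes the first entry of the
simple8b stream:

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Sketch: compute the delta-of-delta stream that is handed to
     * simple8b-rle. Since prev and prev_delta start at 0, the first
     * input element simply becomes the first stream entry instead of
     * being stored out-of-line, so the stream is never empty.
     */
    static void
    delta_delta_transform(const int64_t *in, uint64_t *out, size_t n)
    {
        int64_t prev = 0;
        int64_t prev_delta = 0;

        for (size_t i = 0; i < n; i++)
        {
            int64_t delta = in[i] - prev;
            int64_t dd = delta - prev_delta;

            /* zig-zag encode so small negative values stay small
             * (assumed detail; relies on arithmetic right shift) */
            out[i] = ((uint64_t) dd << 1) ^ (uint64_t) (dd >> 63);

            prev = in[i];
            prev_delta = delta;
        }
    }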
Add the type for the min/max segment metadata object. Segment metadata
objects keep metadata about the data in segments (compressed rows).
The min/max variant keeps the minimum and maximum values inside the
compressed object. It will be used on compression orderby columns so
that queries with quals on those columns can exclude entire segments
when no uncompressed row in the segment can match the qual.
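For instance (a sketch with hypothetical names, not the actual
SegmentMetaMinMax API), a segment whose stored maximum cannot satisfy a
"greater than" qual can be skipped without decompressing it:

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch: per-segment min/max as it might look for an int64 column */
    typedef struct SegmentMinMax
    {
        int64_t min;
        int64_t max;
    } SegmentMinMax;

    /* qual: column > bound */
    static bool
    segment_may_match_gt(const SegmentMinMax *meta, int64_t bound)
    {
        /* if even the largest value in the segment is <= bound,
         * the whole segment can be excluded */
        return meta->max > bound;
    }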
We also add generalized infrastructure for serializing and deserializing
datums of arbitrary types, both to and from memory and to and from
binary strings.
This commit rebuilds indexes during compression and decompression. Previously,
indexes were not updated during these operations. We also fix
a small bug with orderby and segmentby handling of empty strings/
lists.
Finally, we add some more tests.
Dictionary compression can be a pessimization if there aren't many
repeated values. Since we want a single fallback compressor we can
recommend when none of the more specialized compressors is appropriate,
this commit adds a fallback: if it would be more efficient to store the
data as an array instead of dictionary-compressed, the dictionary
compressor automatically returns the value as an array.
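The decision roughly amounts to a size comparison; a minimal sketch with
hypothetical inputs rather than the real dictionary compressor's
bookkeeping:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Sketch of the fallback decision. total_value_bytes is the size of
     * all values stored verbatim (the array form); distinct_value_bytes
     * is the size of the distinct values alone (the dictionary).
     */
    static bool
    should_fall_back_to_array(size_t num_rows, size_t total_value_bytes,
                              size_t distinct_value_bytes)
    {
        /* dictionary pays for distinct values plus a per-row index */
        size_t dict_estimate =
            distinct_value_bytes + num_rows * sizeof(uint32_t);
        size_t array_estimate = total_value_bytes;

        return dict_estimate >= array_estimate;
    }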
We eventually want to be able to compress chunks in the background as
they become old enough. As an incremental step in this direction, this
commit adds the ability to compress any table, albeit with an
unintuitive and brittle interface. This will eventually be married to
our catalogs and background workers to provide a seamless experience.
This commit also fixes a bug in Gorilla in which the compressor could
not handle the case where the leading/trailing zeroes were always 0.
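A sketch of the Gorilla-style XOR step (illustrative only, using
GCC/Clang builtins rather than the actual implementation) shows the
corner case: when the XOR of consecutive values has neither leading nor
trailing zero bits, all 64 bits are meaningful:

    #include <stdint.h>

    /*
     * Sketch: XOR a value against the previous one and measure the
     * window of meaningful bits. __builtin_clzll/ctzll are GCC/Clang
     * builtins and are undefined for a zero argument, hence the
     * special case.
     */
    static int
    gorilla_meaningful_bits(uint64_t prev_bits, uint64_t cur_bits,
                            int *leading, int *trailing)
    {
        uint64_t xor_val = prev_bits ^ cur_bits;

        if (xor_val == 0)
        {
            *leading = 64;
            *trailing = 0;
            return 0;   /* identical value: nothing to store */
        }
        *leading = __builtin_clzll(xor_val);
        *trailing = __builtin_ctzll(xor_val);
        /* can be a full 64 bits when both counts are 0 (the fixed case) */
        return 64 - *leading - *trailing;
    }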
This commit introduces 4 compression algorithms
as well as 3 ADTs to support them. The compression
algorithms are time-series optimized. The following
algorithms are implemented:
- DeltaDelta compresses integer and timestamp values
- Gorilla compresses floats
- Dictionary compression handles any data type
and is optimized for low-cardinality datasets.
- Array stores any data type in an array-like
structure and does not actually compress it (though
TOAST-based compression can be applied on top).
These compression algorithms are fully described in
tsl/src/compression/README.md.
The Abstract Data Types that are implemented are
- Vector - A dynamic vector that can store any type.
- BitArray - A dynamic vector to store bits.
- SimpleHash - A hash table implementation from PG12.
More information can be found in
src/adts/README.md
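For flavor, a minimal bit-array append sketch (simplifying assumptions:
no error handling, and the caller masks the input to num_bits; the real
ADTs differ in detail and are documented in src/adts/README.md):

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct BitArraySketch
    {
        uint64_t *words;      /* packed 64-bit words */
        size_t num_words;     /* words allocated */
        size_t bits_used;     /* total bits written so far */
    } BitArraySketch;

    /* Append the low num_bits of `bits` (1..64) at the current position */
    static void
    bit_array_append(BitArraySketch *ba, uint64_t bits, int num_bits)
    {
        size_t word = ba->bits_used / 64;
        int offset = (int) (ba->bits_used % 64);

        /* grow so a spill into the next word always has room */
        if (word + 2 > ba->num_words)
        {
            size_t new_words = ba->num_words == 0 ? 4 : ba->num_words * 2;

            ba->words = realloc(ba->words, new_words * sizeof(uint64_t));
            memset(ba->words + ba->num_words, 0,
                   (new_words - ba->num_words) * sizeof(uint64_t));
            ba->num_words = new_words;
        }

        ba->words[word] |= bits << offset;
        if (offset + num_bits > 64)   /* spill into the next word */
            ba->words[word + 1] |= bits >> (64 - offset);
        ba->bits_used += (size_t) num_bits;
    }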