The Windows tests apply the non-TSL config options to the
TSL tests. Update the Linux postgresql.conf files to use the
same semantics.
This caused a change in the transparent_decompression golden
file because of the change to random_page_cost for the TSL
test.
This commit enables IndexScans on the segmentby columns of compressed
chunks, if they have an index. It makes three changes to enable this:
1. It creates a DecompressChunkPath for every path planned on the
compressed chunk, not only the cheapest one (see the sketch after
this list).
2. It sets up the reltargetlist on the compressed RelOptInfo to
accurately reflect the columns of the compressed chunk that are
read, instead of leaving it empty (needed to prevent IndexOnlyScans
from being planned).
3. It plans IndexPaths, not only SeqScanPaths.
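
A minimal sketch of point (1), assuming a TimescaleDB-internal
constructor named decompress_chunk_path_create(); the name and
signature are illustrative, not the actual code:

    #include "postgres.h"
    #include "nodes/pg_list.h"
    #include "optimizer/pathnode.h"

    /* Assumed internal constructor that wraps a path on the compressed
     * chunk in a DecompressChunkPath. */
    extern Path *decompress_chunk_path_create(PlannerInfo *root,
                                              RelOptInfo *chunk_rel,
                                              Path *compressed_path);

    static void
    add_decompress_chunk_paths(PlannerInfo *root, RelOptInfo *chunk_rel,
                               RelOptInfo *compressed_rel)
    {
        ListCell *lc;

        /* Wrap every path on the compressed rel (SeqScan, IndexScan, ...),
         * not only the cheapest one, so the planner can pick e.g. an
         * IndexScan on a segmentby column. */
        foreach(lc, compressed_rel->pathlist)
        {
            Path *compressed_path = lfirst(lc);

            add_path(chunk_rel,
                     decompress_chunk_path_create(root, chunk_rel, compressed_path));
        }
    }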
This patch adds support for producing ordered output. All
segmentby columns need to be a prefix of the pathkeys, and the
orderby specified for the compression needs to exactly match the
rest of the pathkeys.
- Fix declaration of functions wrt TSDLLEXPORT consistency
- Empty structs need to be created with '{ 0 }' syntax.
- Alignment sentinels have to use uint64 instead of a struct
with a 0-size member (see the sketch after this list)
- Add some more ORDER BY clauses in the tests to constrain
the order of results
- Add ANALYZE after running compression in
transparent-decompression test
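
A small illustrative C sketch of the two portability rules above; the
struct and field names are made up, not the actual TimescaleDB
definitions:

    #include <stdint.h>

    typedef struct SegmentInfo
    {
        int32_t count;
        void   *data;
    } SegmentInfo;

    /* MSVC rejects empty initializer lists, so zero-initialize with { 0 }. */
    static SegmentInfo info = { 0 };

    typedef struct CompressedDataHeader
    {
        uint8_t  compression_algorithm;
        /* A zero-size array member is a compiler extension; a uint64
         * member is used instead to force 8-byte alignment portably. */
        uint64_t alignment_sentinel;
    } CompressedDataHeader;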
A drop chunks policy can only be added to the public uncompressed
hypertable. This blocks the call when a policy is added to the
internal hypertable holding the compressed data.
This fixes deletion of related rows when we have compressed
hypertables. Namely we delete rows from:
- compression_chunk_size
- hypertable_compression
We also fix hypertable_compression to handle NULLs correctly.
We add a stub for tests with continuous aggs as well as compression,
but that's broken for now so it's commented out. It will be fixed
in another PR.
Since chunks now have NULL fields, some cleanup was necessary.
Namely we remove all direct GETSTRUCT usage and instead move
to a method that uses heap_form/deform_tuple. We also
clean up some naming.
The catalog_insert function for tuples was made public to ease
unifying interfaces for going from formdata->tuples.
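
A hedged sketch of the GETSTRUCT-to-heap_form/deform_tuple pattern
described above; the helper names are illustrative:

    #include "postgres.h"
    #include "access/htup_details.h"

    /* Deform the catalog tuple into Datum/isnull arrays so nullable
     * columns are handled correctly; GETSTRUCT only works when no
     * field is NULL. */
    static void
    chunk_formdata_fill(TupleDesc desc, HeapTuple tuple, Datum *values, bool *nulls)
    {
        heap_deform_tuple(tuple, desc, values, nulls);
    }

    /* Build a tuple back from the arrays; the caller hands it to the
     * (now public) catalog insert helper. */
    static HeapTuple
    chunk_formdata_make_tuple(TupleDesc desc, Datum *values, bool *nulls)
    {
        return heap_form_tuple(desc, values, nulls);
    }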
This commit adds handling for dropping of chunks and hypertables
in the presence of associated compressed objects. If the uncompressed
chunk/hypertable is dropped, then the associated compressed object is
dropped using DROP_RESTRICT unless cascading is explicitly enabled.
Also add a compressed_chunk_id index on compressed tables for
figuring out whether a chunk is compressed or not.
Change a bunch of APIs to use DropBehavior instead of a cascade bool
to be more explicit.
Also test the drop chunks policy.
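
A hedged sketch of the drop handling, assuming the compressed table's
OID has already been looked up; the function name is illustrative:

    #include "postgres.h"
    #include "catalog/dependency.h"
    #include "catalog/objectaddress.h"
    #include "catalog/pg_class.h"
    #include "nodes/parsenodes.h"

    /* Drop the compressed chunk's table when the uncompressed chunk is
     * dropped, passing the DropBehavior through instead of a cascade
     * bool. DROP_RESTRICT is used unless cascading was requested. */
    static void
    drop_compressed_chunk_table(Oid compressed_relid, DropBehavior behavior)
    {
        ObjectAddress addr = {
            .classId = RelationRelationId,
            .objectId = compressed_relid,
            .objectSubId = 0,
        };

        performDeletion(&addr, behavior, 0);
    }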
This commit pushes down quals on orderby columns to make
use of the SegmentMetaMinMax objects. Namely, =, <, <=, >, >= quals
can now be pushed down.
We also remove filters from decompress node for quals that
have been pushed down and don't need a recheck.
This commit also changes tests to add more segmentby and
orderby columns.
Finally, we rename the segment meta accessor functions to have
shorter names.
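
A minimal sketch of how the pushed-down quals map onto the per-segment
min/max metadata (int64 values for simplicity; the real code works on
Datums with the column's comparison operators):

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    typedef struct SegmentMinMax
    {
        int64_t min;
        int64_t max;
    } SegmentMinMax;

    /* Return true if the segment could contain a row matching
     * "col <op> val". A false result lets the whole compressed row be
     * skipped without decompression, and the corresponding filter
     * needs no recheck. */
    static bool
    segment_may_match(const SegmentMinMax *meta, const char *op, int64_t val)
    {
        if (strcmp(op, "=") == 0)
            return meta->min <= val && val <= meta->max;
        if (strcmp(op, "<") == 0)
            return meta->min < val;
        if (strcmp(op, "<=") == 0)
            return meta->min <= val;
        if (strcmp(op, ">") == 0)
            return meta->max > val;
        if (strcmp(op, ">=") == 0)
            return meta->max >= val;
        return true;            /* unknown operator: cannot exclude */
    }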
Add a sequence id to the compressed table. This id increments
monotonically for each compressed row in a way that follows
the order by clause. We leave gaps to allow for the
possibility to fill in rows due to e.g. inserts down
the line.
The sequence id is global to the entire chunk and does not reset
for each segmentby-group change, since this has the potential
to allow some micro-optimizations when ordering by segmentby
columns as well.
The sequence number is an INT32, which allows up to 200 billion
uncompressed rows per chunk to be supported (assuming 1000 rows
per compressed row and a gap of 10). Overflow is checked in the
code and will raise an error if this is breached.
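
A rough sketch of the numbering scheme under the stated gap of 10; the
constant and function names are illustrative:

    #include <limits.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define SEQUENCE_NUM_GAP 10

    /* Assign the next sequence id for a compressed row within a chunk,
     * leaving gaps so rows can later be inserted between existing
     * compressed rows without renumbering. Returns false on overflow,
     * which the caller turns into an error. */
    static bool
    next_sequence_num(int32_t current, int32_t *next)
    {
        if (current > INT32_MAX - SEQUENCE_NUM_GAP)
            return false;       /* would overflow the INT32 column */

        *next = current + SEQUENCE_NUM_GAP;
        return true;
    }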
Timestamptz is an integer-like type, and thus should use deltadelta
encoding by default. Making this change uncovered a bug where RLE was
truncating values on decompression, which has also been fixed.
This commit integrates the SegmentMetaMinMax into the
compression logic. It adds metadata columns to the compressed table
and correctly sets them upon compression.
We also fix several errors with datum detoasting in SegmentMetaMinMax.
When DecompressChunk is used in parallel plans, the scan on the
compressed hypertable chunk needs to be parallel-aware to prevent
duplicating work. This patch changes DecompressChunk to always
create a non-parallel-safe path and, if requested, a parallel-safe
partial path with a parallel-aware scan.
This commit changes deltadelta compression to store the first element
in the simple8b array instead of out-of-line. Besides shrinking the
data in some cases, this also ensures that the simple8b array is never
empty, fixing the case where only a single element is stored.
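
A simplified sketch of the encoding loop after this change: the first
value is not stored out-of-line but goes through the same
delta-of-delta path into the simple8b stream. The simple8b packing
itself is replaced by a stand-in append function:

    #include <stddef.h>
    #include <stdint.h>

    /* Stand-in for appending one zig-zag-encoded value to the simple8b
     * RLE stream; the real packing is omitted here. */
    extern void simple8b_append(uint64_t value);

    static uint64_t
    zigzag_encode(int64_t v)
    {
        return ((uint64_t) v << 1) ^ (uint64_t) (v >> 63);
    }

    static void
    deltadelta_compress(const int64_t *values, size_t n)
    {
        int64_t prev_value = 0;
        int64_t prev_delta = 0;

        for (size_t i = 0; i < n; i++)
        {
            int64_t delta = values[i] - prev_value; /* first delta == first value */
            int64_t delta_delta = delta - prev_delta;

            /* The first element is appended like any other, so the
             * simple8b array is never empty, even for a single-element
             * input. */
            simple8b_append(zigzag_encode(delta_delta));

            prev_value = values[i];
            prev_delta = delta;
        }
    }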
The error hint for compress_chunk misspelled the option to use
for enabling compression. This patch changes the error hint and
also makes the hint a proper sentence.
Add the type for min/max segment meta object. Segment metadata
objects keep metadata about data in segments (compressed rows).
The min/max variant keeps the min and max values inside the compressed
object. It will be used on compression order by columns to allow
queries that have quals on those columns to be able to exclude entire
segments if no uncompressed rows in the segment may match the qual.
We also add generalized infrastructure for datum serialization
/ deserialization for arbitrary types to and from memory as well
as binary strings.
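
An illustrative sketch of how such a min/max object could be built up
while compressing a segment (int64 shown; the actual builder works on
Datums of arbitrary types via their comparison functions):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct SegmentMetaMinMaxBuilder
    {
        bool    has_values;
        int64_t min;
        int64_t max;
    } SegmentMetaMinMaxBuilder;

    /* Fold one uncompressed value of the orderby column into the
     * builder; the finished min/max pair is stored alongside the
     * compressed row. */
    static void
    minmax_builder_update(SegmentMetaMinMaxBuilder *b, int64_t value)
    {
        if (!b->has_values)
        {
            b->has_values = true;
            b->min = value;
            b->max = value;
            return;
        }
        if (value < b->min)
            b->min = value;
        if (value > b->max)
            b->max = value;
    }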
Adds several test features:
- Add a test for compression of an altered hypertable
- Add tests for transparent decompression in the hypertable tests
- Add dump/restore in the hypertable test
Since we do not adjust selectedCols for the Chunk RangeTblEntry,
attribute numbers in selectedCols will be wrong if attribute numbers
on hypertable and chunk differ. This patch changes the target list
creation to use the hypertable selectedCols and look up names
on the hypertable to work around this.
This rebuilds indexes during compression and decompression. Previously,
indexes were not updated during these operations. We also fix
a small bug with orderby and segmentby handling of empty
strings/lists.
Finally, we add some more tests.
Add the column to the order by list if it's not already there.
This is never wrong and might improve performance. This
also guarantees that we have at least one ordering column
during compression and therefore can always use tuplesort
(otherwise we'd need a non-tuplesort method of getting tuples).
This patch adds a DecompressChunk custom scan node, which will be
used when querying hypertables with compressed chunks to transparently
decompress chunks.
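
A hedged sketch of how such a custom scan node is declared with
PostgreSQL's extensible-node API; the callback is a stand-in for the
real TimescaleDB implementation:

    #include "postgres.h"
    #include "nodes/extensible.h"

    /* Assumed state-creation callback; the real one builds the
     * execution-time state for decompressing a chunk. */
    extern Node *decompress_chunk_state_create(CustomScan *cscan);

    static CustomScanMethods decompress_chunk_plan_methods = {
        .CustomName = "DecompressChunk",
        .CreateCustomScanState = decompress_chunk_state_create,
    };

    void
    _decompress_chunk_init(void)
    {
        /* Register the node so it can be (de)serialized in plans. */
        RegisterCustomScanMethods(&decompress_chunk_plan_methods);
    }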
Replace custom parsing of order by and segment by lists
with the postgres parser. The segment by list is now
parsed in the same way as the GROUP BY clause and the
order by list in the same way as the ORDER BY clause.
Also fix default for nulls first/last to follow the PG
convention: LAST for ASC, FIRST for DESC.
This is useful if some or all compressed columns are NULL.
The count reflects the number of uncompressed rows that are
in the compressed row. It is stored as a 32-bit integer.
Add support for compress_chunks function.
This also adds support for compress_orderby and compress_segmentby
parameters in ALTER TABLE. These parameters are used by the
compress_chunks function.
The parsing code will most likely be changed to use the PG
raw_parser function.
Dictionary compression can be a pessimization if there aren't many
repeated values. Since we want to have a single fallback compressor we
can recommend when one of the more specialized compressors isn't
appropriate, this commit adds a fallback where, if it would be more
efficient to store data as an array instead of dictionary-compressed,
the dictionary compressor will automatically return the value as an
array.
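
A rough sketch of the fallback decision, with made-up size accounting;
the real compressor compares its actual serialized sizes:

    #include <stdbool.h>
    #include <stddef.h>

    /* A dictionary stores each distinct value once plus one small index
     * per row; an array stores every value verbatim. If the dictionary
     * form is not smaller, fall back to returning an array-compressed
     * value. */
    static bool
    dictionary_is_worthwhile(size_t num_rows, size_t num_distinct,
                             size_t avg_value_size, size_t index_entry_size)
    {
        size_t dict_size = num_distinct * avg_value_size + num_rows * index_entry_size;
        size_t array_size = num_rows * avg_value_size;

        return dict_size < array_size;
    }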