1809 Commits

Sven Klemm
4c3bb6d2d6 Add JOIN tests for transparent decompression 2019-10-29 19:02:58 -04:00
Matvey Arye
6465a4e85a Switch to using get_attnum function
This is a fix for a rebase on master since `attno_find_by_attname`
was removed.
2019-10-29 19:02:58 -04:00
Matvey Arye
4140c58f1b Update postgresql.conf used for testing
The Windows tests apply the non-TSL config options to the
TSL tests. Update the Linux postgresql.conf files to use the
same semantics.

This caused a change in the transparent_decompression golden
file because of the change to random_page_cost for the TSL
test.
2019-10-29 19:02:58 -04:00
Joshua Lockerman
965054658e Enable IndexScans on compressed chunks
This commit enables IndexScans on the segmentby columns of compressed
chunks, if they have an index. It makes three changes to enable this:

1. It creates a DecompressChunkPath for every path planned on the
   compressed chunk, not only the cheapest one.
2. It sets up the reltargetlist on the compressed RelOptInfo to accurately
   reflect the columns of the compressed chunk that are read, instead
   of leaving it empty (needed to prevent IndexOnlyScans from being
   planned).
3. It plans IndexPaths, not only SeqScanPaths.
2019-10-29 19:02:58 -04:00
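
For illustration, a sketch of a query that can now pick an IndexScan; the `metrics` hypertable and `device_id` segmentby column are hypothetical:

    -- Assumes compression was enabled with
    -- timescaledb.compress_segmentby = 'device_id' and that the
    -- compressed chunk has an index on device_id.
    EXPLAIN SELECT * FROM metrics WHERE device_id = 17;
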
Sven Klemm
e2c03e40aa Add support for pathkey pushdown for transparent decompression
This patch adds support for producing ordered output. All
segmentby columns need to be a prefix of the pathkeys, and the
orderby specified for the compression needs to exactly match the
rest of the pathkeys.
2019-10-29 19:02:58 -04:00
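
A sketch of a query shape this enables, assuming a hypothetical `metrics` hypertable compressed with segmentby 'device_id' and orderby 'time DESC':

    -- The segmentby column leads the pathkeys and the remainder
    -- matches the compression orderby, so DecompressChunk's ordered
    -- output can satisfy the ORDER BY without a Sort node.
    SELECT * FROM metrics
    ORDER BY device_id, time DESC
    LIMIT 100;
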
Matvey Arye
8250714a29 Add fixes for Windows
- Fix declaration of functions wrt TSDLLEXPORT consistency
- Empty structs need to be created with '{ 0 }' syntax.
- Alignment sentinels have to use uint64 instead of a struct
  with a 0-size member
- Add some more ORDER BY clauses in the tests to constrain
  the order of results
- Add ANALYZE after running compression in
  transparent-decompression test
2019-10-29 19:02:58 -04:00
Matvey Arye
be946c436d Block add_drop_policy on internal compressed table
An add/drop policy can only target the public uncompressed
hypertable. This blocks the call when a policy would be added to
the internal hypertable holding the compressed data.
2019-10-29 19:02:58 -04:00
Matvey Arye
df4c444551 Delete related rows for compression
This fixes deletion of related rows when we have compressed
hypertables. Namely, we delete rows from:

- compression_chunk_size
- hypertable_compression

We also fix hypertable_compression to handle NULLs correctly.

We add a stub for tests with continuous aggs as well as
compression, but that's broken for now, so it's commented out.
It will be fixed in another PR.
2019-10-29 19:02:58 -04:00
Matvey Arye
06557257f5 Fix the chunk model to handle NULLs correctly
Since chunks now have NULL fields, some cleanup was necessary.
Namely, we remove all direct GETSTRUCT usage and instead move
to a method that uses heap_form/deform_tuple. We also
clean up some naming.

The catalog_insert function for tuples was made public to ease
unifying interfaces for going from formdata->tuples.
2019-10-29 19:02:58 -04:00
Matvey Arye
0db50e7ffc Handle drops of compressed chunks/hypertables
This commit adds handling for dropping chunks and hypertables
in the presence of associated compressed objects. If the uncompressed
chunk/hypertable is dropped, the associated compressed object is
dropped using DROP_RESTRICT unless cascading is explicitly enabled.

Also add a compressed_chunk_id index on compressed tables for
figuring out whether a chunk is compressed or not.

Change a bunch of APIs to use DropBehavior instead of a cascade bool
to be more explicit.

Also test the drop chunks policy.
2019-10-29 19:02:58 -04:00
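
A minimal sketch of the behavior, using a hypothetical `metrics` hypertable (the drop_chunks call is left commented since its exact signature is an assumption):

    -- Dropping the uncompressed hypertable now also drops the
    -- associated internal compressed table.
    DROP TABLE metrics;
    -- Dropped chunks likewise take their compressed counterparts
    -- with them, e.g. via a drop chunks policy or:
    -- SELECT drop_chunks(interval '7 days', 'metrics');
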
Matvey Arye
a4773adb58 Make compression feature use the community license
Compression is a community, not enterprise feature.
2019-10-29 19:02:58 -04:00
Matvey Arye
2bf97e452d Push down quals to segment meta columns
This commit pushes down quals on order_by columns to make
use of the SegmentMetaMinMax objects. Namely, =, <, <=, >, and >=
quals can now be pushed down.

We also remove filters from decompress node for quals that
have been pushed down and don't need a recheck.

This commit also changes tests to add more segment by and
order-by columns.

Finally, we rename the segment meta accessor functions to be shorter.
2019-10-29 19:02:58 -04:00
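
For illustration (hypothetical schema with order_by column `time`), a qual like this can now skip entire segments using the stored min/max metadata:

    -- Segments whose [min, max] range for 'time' cannot overlap the
    -- predicate are excluded without being decompressed.
    SELECT avg(value)
    FROM metrics
    WHERE time >= '2019-10-01' AND time < '2019-10-02';
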
gayyappan
6e60d2614c Add compress chunks policy support
Add and drop compress chunks policy using bgw
infrastructure.
2019-10-29 19:02:58 -04:00
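
A hedged usage sketch; the policy function names follow this release's convention and the `metrics` hypertable is hypothetical:

    -- Schedule background compression of chunks older than 7 days.
    SELECT add_compress_chunks_policy('metrics', INTERVAL '7 days');
    -- Remove the policy again.
    SELECT remove_compress_chunks_policy('metrics');
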
Matvey Arye
5c891f732e Add sequence id metadata col to compressed table
Add a sequence id to the compressed table. This id increments
monotonically for each compressed row in a way that follows
the order by clause. We leave gaps to allow for the possibility
of filling in rows later (e.g. due to inserts).

The sequence id is global to the entire chunk and does not reset
for each segment-by group change, since this has the potential
to allow some micro-optimizations when ordering by segmentby
columns as well.

The sequence number is an INT32, which allows roughly 200 billion
uncompressed rows per chunk (assuming 1000 rows per compressed row
and a gap of 10: 2^31 / 10 * 1000 ≈ 214 billion). Overflow is
checked in the code and raises an error if this limit is breached.
2019-10-29 19:02:58 -04:00
Joshua Lockerman
6d0dfdfe1a Switch Timestamptz to use deltadelta and bugfixes
Timestamptz is an integer-like type, and thus should use deltadelta
encoding by default. Making this change uncovered a bug where RLE was
truncating values on decompression, which has also been fixed.
2019-10-29 19:02:58 -04:00
Joshua Lockerman
f2e4266aa0 Don't de-toast compressed values multiple times
This is a performance fix, as detoasting multiple times is expensive.
2019-10-29 19:02:58 -04:00
Matvey Arye
b4a7108492 Integrate segment meta into compression
This commit integrates the SegmentMetaMinMax into the
compression logic. It adds metadata columns to the compressed table
and correctly sets them upon compression.

We also fix several errors with datum detoasting in SegmentMetaMinMax.
2019-10-29 19:02:58 -04:00
Matvey Arye
be199bec70 Add type cache
Add a type cache to get the OID corresponding to a particular
defined SQL type.
2019-10-29 19:02:58 -04:00
Sven Klemm
42a2c8666e Fix DecompressChunk parallel execution
When DecompressChunk is used in parallel plans, the scan on the
compressed hypertable chunk needs to be parallel aware to prevent
duplicating work. This patch changes DecompressChunk to always
create a non-parallel-safe path and, if requested, a parallel-safe
partial path with a parallel aware scan.
2019-10-29 19:02:58 -04:00
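
A sketch of a plan that exercises this (hypothetical `metrics` hypertable with compressed chunks):

    -- With workers available, DecompressChunk's partial path uses a
    -- parallel aware scan on the compressed chunk, so rows are not
    -- duplicated across workers.
    SET max_parallel_workers_per_gather = 2;
    EXPLAIN SELECT count(*) FROM metrics;
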
Joshua Lockerman
abbe5c84fd Test all compressors with single-value tables
Single-value tables have previously triggered bugs in deltadelta
and are a good edge case to have in general.
2019-10-29 19:02:58 -04:00
Joshua Lockerman
2b1e950df3 Store first deltadelta element in simple8b
This commit changes deltadelta compression to store the first element
in the simple8b array instead of out-of-line. Besides shrinking the
data in some cases, this also ensures that the simple8b array is never
empty, fixing the case where only a single element is stored.
2019-10-29 19:02:58 -04:00
Sven Klemm
3d55595ad0 Fix error hint for compress_chunk
The error hint for compress_chunk misspelled the option to use
for enabling compression. This patch fixes the error hint and
also makes it a proper sentence.
2019-10-29 19:02:58 -04:00
Matvey Arye
b9674600ae Add segment meta min/max
Add the type for min/max segment meta object. Segment metadata
objects keep metadata about data in segments (compressed rows).
The min/max variant keeps the min and max values inside the compressed
object. It will be used on compression order by columns to allow
queries with quals on those columns to exclude entire segments
when no uncompressed row in the segment can match the qual.

We also add generalized infrastructure for datum
serialization/deserialization for arbitrary types to and from
memory as well as binary strings.
2019-10-29 19:02:58 -04:00
Sven Klemm
b1a5000b5c Improve qual pushdown for transparent decompression
This patch adds support for pushing down IS NULL, IS NOT NULL and
ScalarArrayOp expression to the scan on the compressed chunk.
2019-10-29 19:02:58 -04:00
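
Illustrative quals that can now be pushed down (hypothetical schema):

    SELECT * FROM metrics WHERE device_id IS NULL;      -- IS NULL
    SELECT * FROM metrics WHERE value IS NOT NULL;      -- IS NOT NULL
    SELECT * FROM metrics
    WHERE device_id IN (1, 2, 3);                       -- ScalarArrayOp
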
Joshua Lockerman
8b273a5187 Fix flush when num-rows overflow
We should only free the segment-bys when we're changing groups,
not when we've got too many rows to compress; in that case we'll
need them.
2019-10-29 19:02:58 -04:00
Matvey Arye
dcc1d902d1 Add more compression tests
Adds several test features:
- Add tests for compression of an altered hypertable
- Add tests for transparent decompression in the hypertable tests
- Add dump/restore in the hypertable test
2019-10-29 19:02:58 -04:00
Sven Klemm
45fac0ebe6 Add test for compress_chunk plan invalidation
This patch adds a testcase for prepared statement plan invalidation
when a chunk gets compressed.
2019-10-29 19:02:58 -04:00
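
The scenario under test, sketched (the chunk name is hypothetical):

    PREPARE q AS SELECT count(*) FROM metrics;
    EXECUTE q;  -- planned against the uncompressed chunk
    SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
    EXECUTE q;  -- the cached plan must be invalidated and re-planned
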
Matvey Arye
ea7d2c7e60 Enforce license checks for compression
Enforce the enterprise license check for compression. Note: these
checks are now outdated, as compression is now a community,
not enterprise, feature.
2019-10-29 19:02:58 -04:00
Sven Klemm
70b43482e9 Fixup for rebase against master 2019-10-29 19:02:58 -04:00
Sven Klemm
7c52e82aaf Use hypertable selectedCols for building scan on compressed chunk
Since we do not adjust selectedCols for the Chunk RangeTblEntry,
attribute numbers in selectedCols will be wrong if attribute numbers
on the hypertable and chunk differ. This patch changes the target
list creation to use the hypertable's selectedCols and look up names
on the hypertable to work around this.
2019-10-29 19:02:58 -04:00
gayyappan
6832ed2ca5 Modify storage type for toast columns
This PR modifies the TOAST storage type for compressed columns based
on the algorithm used for compression.
2019-10-29 19:02:58 -04:00
Matvey Arye
b1a3449693 Add missing test to CMakeLists 2019-10-29 19:02:58 -04:00
Matvey Arye
bce292a64f Fix locking when altering compression options
Take an exclusive lock when altering compression options, as it is
safer.
2019-10-29 19:02:58 -04:00
Matvey Arye
0059360522 Fix indexes during compression and decompression
This rebuilds indexes during compression and decompression. Previously,
indexes were not updated during these operations. We also fix
a small bug with orderby and segmentby handling of empty
strings/lists.

Finally, we add some more tests.
2019-10-29 19:02:58 -04:00
Matvey Arye
cdf6fcb69a Allow altering compression options
We now allow changing the compression options on a hypertable
as long as there are no existing compressed chunks.
2019-10-29 19:02:58 -04:00
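
A sketch, assuming a hypothetical `metrics` hypertable with no compressed chunks yet:

    -- Allowed while no chunks are compressed; errors once any
    -- chunk has been compressed.
    ALTER TABLE metrics
      SET (timescaledb.compress,
           timescaledb.compress_segmentby = 'location');
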
Matvey Arye
eba612ea2e Add time column to compressed order by list
Add the column to the order by list if it's not already there.
This is never wrong and might improve performance. This
also guarantees that we have at least one ordering column
during compression and therefore can always use tuplesort
(otherwise we'd need a non-tuplesort method of getting tuples).
2019-10-29 19:02:58 -04:00
Sven Klemm
4cc1a4159a Add DecompressChunk custom scan node
This patch adds a DecompressChunk custom scan node, which will be
used when querying hypertables with compressed chunks to transparently
decompress chunks.
2019-10-29 19:02:58 -04:00
Matvey Arye
6f22a7a68c Improve parsing of segment by and order by lists
Replace custom parsing of order by and segment by lists
with the postgres parser. The segment by list is now
parsed in the same way as the GROUP BY clause and the
order by list in the same way as the ORDER BY clause.

Also fix the default for nulls first/last to follow the PG
convention: LAST for ASC, FIRST for DESC.
2019-10-29 19:02:58 -04:00
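
A sketch of the resulting syntax (hypothetical schema); per the PG convention above, 'time DESC' implies NULLS FIRST and 'value' implies ASC NULLS LAST:

    ALTER TABLE metrics
      SET (timescaledb.compress,
           timescaledb.compress_segmentby = 'device_id, location',
           timescaledb.compress_orderby   = 'time DESC, value');
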
Matvey Arye
f6573f9247 Add a metadata count column to compressed table
This is useful if some or all compressed columns are NULL.
The count reflects the number of uncompressed rows contained
in the compressed row. It is stored as a 32-bit integer.
2019-10-29 19:02:58 -04:00
Matvey Arye
a078781c2e Add decompress_chunk function
This is the inverse of compress_chunk.
2019-10-29 19:02:58 -04:00
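
Usage sketch (the chunk name is hypothetical):

    -- Restores the chunk to its uncompressed form.
    SELECT decompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
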
Sven Klemm
bdc599793c Add helper function to get decompression iterator init function 2019-10-29 19:02:58 -04:00
Sven Klemm
a5a3dca517 Add GUC for transparent decompression
This GUC will control whether the planner automatically queries
compressed data when some chunks in a query are compressed.
2019-10-29 19:02:58 -04:00
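
A sketch of toggling it; the GUC name here is an assumption based on the commit:

    -- Disable transparent decompression for the current session.
    SET timescaledb.enable_transparent_decompression = false;
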
Sven Klemm
47c1d7e323 Add set_rel_pathlist hook for tsl code
Will be needed for compression.
2019-10-29 19:02:58 -04:00
Matvey Arye
9223f08d68 Truncate chunks after (de-)compression
This commit will truncate the original chunk after compression
or decompression.
2019-10-29 19:02:58 -04:00
Matvey Arye
5bdb29b8f7 Fix compression for PG96
Fixes some compilation and test errors.
2019-10-29 19:02:58 -04:00
gayyappan
7a728dc15f Add view for compression size
View for compressed_chunk_size and compressed_hypertable_size
2019-10-29 19:02:58 -04:00
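
Hypothetical usage; the view names below are assumptions based on the commit message:

    SELECT * FROM timescaledb_information.compressed_chunk_stats;
    SELECT * FROM timescaledb_information.compressed_hypertable_stats;
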
gayyappan
1f4689eca9 Record chunk sizes after compression
Compute chunk size before/after compressing a chunk and record in
catalog table.
2019-10-29 19:02:58 -04:00
gayyappan
44941f7bd2 Add UI for compress_chunks functionality
Add support for the compress_chunks function.

This also adds support for the compress_orderby and compress_segmentby
parameters in ALTER TABLE. These parameters are used by the
compress_chunks function.

The parsing code will most likely be changed to use the PG raw_parser
function.
2019-10-29 19:02:58 -04:00
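
A sketch of the UI added here, assuming a hypothetical `metrics(time, device_id, value)` hypertable; the chunk-compressing function is spelled compress_chunk elsewhere in this log, and the chunk name is hypothetical:

    -- Declare how chunks should be compressed...
    ALTER TABLE metrics
      SET (timescaledb.compress,
           timescaledb.compress_segmentby = 'device_id',
           timescaledb.compress_orderby   = 'time DESC');
    -- ...then compress an existing chunk.
    SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
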
Joshua Lockerman
bb89e62629 Add fallback from dictionary compressor to array
Dictionary compression can be a pessimization if there aren't many
repeated values. Since we want to have a single fallback compressor we
can recommend when one of the more specialized compressors isn't
appropriate, this commit adds a fallback where, if it would be more
efficient to store data as an array instead of dictionary-compressed,
the dictionary compressor will automatically return the value as an
array.
2019-10-29 19:02:58 -04:00
Joshua Lockerman
fa26992c4c Improve deltadelta and gorilla compressors
- Add fallback compressors for deltadelta/gorilla
- Add bool compressor for deltadelta
2019-10-29 19:02:58 -04:00