When an INSERT with ON CONFLICT DO NOTHING hit its first conflict,
it would abort the additional INSERTs following the one that
triggered the DO NOTHING clause, leading to missed INSERTs.
Fixes #7672
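A minimal sketch of the affected pattern (the table, columns, and
compression setup below are illustrative assumptions, not taken from the
fix itself):

```sql
-- Hypothetical hypertable with a unique constraint; the chunk is compressed.
CREATE TABLE readings (
    time   timestamptz NOT NULL,
    device int,
    value  float,
    UNIQUE (time, device)
);
SELECT create_hypertable('readings', 'time');
ALTER TABLE readings SET (timescaledb.compress, timescaledb.compress_segmentby = 'device');

INSERT INTO readings VALUES ('2025-01-01', 1, 1.0);
SELECT compress_chunk(c) FROM show_chunks('readings') c;

-- The first row conflicts and is skipped; the second and third rows must
-- still be inserted rather than being dropped along with it.
INSERT INTO readings VALUES
    ('2025-01-01', 1, 2.0),
    ('2025-01-01', 2, 2.0),
    ('2025-01-02', 1, 3.0)
ON CONFLICT DO NOTHING;
```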
Scan keys created for in-memory filtering used the
collation from the hypertable tuple descriptor. Since
the scan keys are meant to be run on chunk tuples,
switching to the collation from the chunk tuple
descriptor fixes the issue.
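A sketch of the kind of statement that exercises this path (names and the
collation choice are illustrative): a non-segmentby text column filtered
during DML on a compressed chunk, where the filter is applied in memory
to the decompressed tuples.

```sql
CREATE TABLE events (
    time   timestamptz NOT NULL,
    device int,
    label  text COLLATE "C",
    value  float
);
SELECT create_hypertable('events', 'time');
ALTER TABLE events SET (timescaledb.compress, timescaledb.compress_segmentby = 'device');

INSERT INTO events VALUES ('2025-01-01', 1, 'foobar', 0);
SELECT compress_chunk(c) FROM show_chunks('events') c;

-- The label comparison is evaluated against decompressed chunk tuples, so
-- the scan key has to carry the chunk column's collation, not the one from
-- the hypertable tuple descriptor.
DELETE FROM events WHERE device = 1 AND label > 'foo';
```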
Sequence numbers were an optimization for ordering batches based on the
orderby configuration setting. They were used for ordered append and for
avoiding a sort of compressed data when it matched the query ordering.
However, now that compressed data can be modified, the bookkeeping for
sequence numbers has become more of a burden. Removing them and using
the metadata columns for ordering reduces that burden while keeping all
the existing optimizations that relied on the sequence numbers in place.
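As an illustration of the optimization that stays in place (the hypertable
and settings below are hypothetical): when the query ordering matches the
orderby setting, batches can still be consumed in order, now driven by the
per-batch min/max metadata rather than a sequence number column.

```sql
CREATE TABLE metrics (time timestamptz NOT NULL, device int, value float);
SELECT create_hypertable('metrics', 'time');
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device',
    timescaledb.compress_orderby   = 'time DESC'
);
SELECT compress_chunk(c) FROM show_chunks('metrics') c;

-- Matches the compression ordering, so compressed batches can be returned
-- in order (ordered append) without sorting the decompressed data.
EXPLAIN SELECT * FROM metrics WHERE device = 1 ORDER BY time DESC LIMIT 10;
```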
When doing an index scan, we should filter out of the key columns any
columns already used as index scan keys, since the key columns are
later used to generate heap scan keys.
Attribute numbers for key columns are stored with a system column
offset, which needs to be added every time we calculate an attribute
number from an index. This caused us to miss appropriate indexes for
scanning during DML on compressed data.
In order to verify constraints, we have to decompress
batches that could contain duplicates of the tuples
we are inserting. To find such batches we previously
used heap scans, which can be very expensive if the
compressed chunk contains a lot of tuples. Doing an
index scan makes much more sense in this scenario
and gives significant performance benefits.
Additionally, we don't want to create the decompressor
until we know we actually need to decompress a batch,
so it is now initialized lazily, once a matching batch
is found.
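Conceptually (the internal relation and its name below are purely
illustrative and not meant to be queried directly), finding candidate
batches for an inserted tuple boils down to a lookup like this, which an
index on the segmentby column(s) can serve without visiting the whole
compressed chunk:

```sql
-- Batches that might contain duplicates of a tuple with device = 1:
SELECT *
FROM _timescaledb_internal.compress_hyper_2_4_chunk   -- illustrative name
WHERE device = 1;
-- With a suitable index this is an index scan; without one, every
-- compressed batch in the chunk has to be visited by a heap scan.
-- The decompressor is only set up for batches this lookup actually returns.
```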
Previously, we created functions to calculate default orderby and
segmentby settings. This PR applies those functions by default
when compression is enabled. We also added GUCs to disable those
functions or to use alternative functions for the default calculation.
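For example (a sketch; the exact GUC names are not reproduced here and can
be found in the release notes), enabling compression without explicit
settings now picks defaults:

```sql
CREATE TABLE metrics (time timestamptz NOT NULL, device int, value float);
SELECT create_hypertable('metrics', 'time');

-- No compress_segmentby / compress_orderby given: the default-calculation
-- functions pick them, e.g. from existing indexes and the time column.
ALTER TABLE metrics SET (timescaledb.compress);

-- The chosen settings can be inspected afterwards (view name may differ
-- between versions); GUCs can disable the calculation or point it at
-- alternative functions.
SELECT * FROM timescaledb_information.compression_settings;
```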
When inserting into a compressed chunk with constraints present,
we need to decompress the relevant tuples in order to do speculative
insertion. Previously, we used the segmentby column values to limit
the number of compressed segments to decompress. This change expands
on that by also using the segment metadata to further filter the
compressed rows that need to be decompressed.
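Conceptually (the internal relation and metadata column names are
illustrative), the candidate-batch filter is extended from the segmentby
value alone to segmentby plus the orderby min/max metadata:

```sql
-- Previously: every batch for the segment was decompressed.
--   ... WHERE device = 1
-- Now: batches whose metadata range cannot contain the incoming row's time
-- are skipped entirely.
SELECT *
FROM _timescaledb_internal.compress_hyper_2_4_chunk   -- illustrative name
WHERE device = 1
  AND _ts_meta_min_1 <= '2025-01-01 00:00:00+00'
  AND _ts_meta_max_1 >= '2025-01-01 00:00:00+00';
```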
This patch allows unique constraints on compressed chunks. When
trying to INSERT into compressed chunks with unique constraints,
any potentially conflicting compressed batches will be decompressed
to let PostgreSQL do constraint checking on the INSERT.
With this patch, only INSERT ON CONFLICT DO NOTHING is supported.
For decompression, only segmentby information is considered when
determining conflicting batches. This will be enhanced in a follow-up
patch to also include orderby metadata so that fewer batches need to
be decompressed.
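A sketch of the supported pattern (the hypertable and values are
hypothetical):

```sql
CREATE TABLE sensor_data (
    time   timestamptz NOT NULL,
    device int,
    value  float,
    UNIQUE (time, device)
);
SELECT create_hypertable('sensor_data', 'time');
ALTER TABLE sensor_data SET (timescaledb.compress, timescaledb.compress_segmentby = 'device');

INSERT INTO sensor_data VALUES ('2025-01-01', 1, 1.0);
SELECT compress_chunk(c) FROM show_chunks('sensor_data') c;

-- The batches for device = 1 are decompressed so PostgreSQL can check the
-- unique constraint; the conflicting row is then skipped.
INSERT INTO sensor_data VALUES ('2025-01-01', 1, 2.0)
ON CONFLICT DO NOTHING;

-- ON CONFLICT DO UPDATE is not supported by this patch and will error.
```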