54 Commits

Author SHA1 Message Date
Bharathy
cc51e20e87 Add support for ON CONFLICT DO UPDATE for compressed hypertables
This patch fixes execution of INSERT with ON CONFLICT DO UPDATE by
removing the error and allowing the UPDATE to happen on the given
compressed hypertable.
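
A minimal sketch of the statement shape this enables, assuming a
hypothetical compressed hypertable `metrics` with a unique constraint
on (time, device_id):

    INSERT INTO metrics (time, device_id, value)
    VALUES ('2023-03-01 00:00:00+00', 1, 23.4)
    ON CONFLICT (time, device_id)
    DO UPDATE SET value = EXCLUDED.value;  -- previously raised an error on compressed hypertables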
2023-03-20 22:55:27 +05:30
Sven Klemm
65562f02e8 Support unique constraints on compressed chunks
This patch allows unique constraints on compressed chunks. When
trying to INSERT into compressed chunks with unique constraints,
any potentially conflicting compressed batches will be decompressed
to let PostgreSQL do constraint checking on the INSERT.
With this patch only INSERT ON CONFLICT DO NOTHING will be supported.
For decompression only segmentby information is considered to
determine conflicting batches. This will be enhanced in a follow-up
patch to also include orderby metadata so that fewer batches need to
be decompressed.
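
A minimal sketch under those constraints, assuming a hypothetical
hypertable `readings` with a unique constraint on (time, sensor_id)
and compressed chunks:

    INSERT INTO readings (time, sensor_id, value)
    VALUES ('2023-03-01 00:00:00+00', 42, 0.5)
    ON CONFLICT DO NOTHING;  -- conflicting compressed batches (matched on segmentby values)
                             -- are decompressed so PostgreSQL can check the constraint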
2023-03-13 12:04:38 +01:00
Bharathy
9a2cbe30a1 Fix ChunkAppend, ConstraintAwareAppend child subplan
When a TidRangeScan is a child of a ChunkAppend or ConstraintAwareAppend
node, an error is reported as "invalid child of chunk append: Node (26)".
This patch fixes the issue by recognising TidRangeScan as a valid child.
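
For illustration, a query of the sort that produces a TidRangeScan
under a ChunkAppend node (the hypertable name `metrics` is hypothetical):

    SELECT * FROM metrics WHERE ctid < '(10,0)'::tid;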

Fixes: #4872
2023-01-18 18:06:30 +05:30
Fabrízio de Royes Mello
a4356f342f Remove trailing whitespaces from test code 2022-11-18 16:31:47 -03:00
Sven Klemm
131773a902 Reset compression sequence when group resets
The sequence number of the compressed tuple is per segmentby grouping
and should be reset when the grouping changes to prevent overflows with
many segmentby columns.
2022-08-15 13:34:00 +02:00
gayyappan
e5db6a9eec Fix status for dropped chunks that have catalog entries
Chunks that are dropped but whose catalog entries are preserved
have an incorrect status when they are marked as dropped.
This happens if the chunk was previously compressed and then
gets dropped - the status in the catalog tuple still reflects the
compression status. This should be reset since the data is now
dropped.
2022-01-31 17:39:39 -05:00
Sven Klemm
ff5d7e42bb Adjust code to PG14 reltuples changes
PG14 changes the initial value of pg_class.reltuples to -1 to allow
differentiating between an empty relation and a relation where
ANALYZE has not yet run.

https://github.com/postgres/postgres/commit/3d351d916b
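
The new initial value can be observed directly (illustrative snippet):

    CREATE TABLE t (x int);
    SELECT reltuples FROM pg_class WHERE relname = 't';
    -- PG14: -1 (never analyzed); earlier versions: 0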
2021-06-29 16:35:35 +02:00
gayyappan
7c76fd4d09 Save compression settings on access node for distributed hypertables
1. Add a compression_state column to the hypertable catalog
table by renaming the catalog table's compressed column.
compression_state is a tri-state column.
This column indicates whether the hypertable has
compression enabled (value = 1) or is an internal
compression table (value = 2).

2. Save compression settings on the access node when compression
is turned on for a distributed hypertable.
For a distributed hypertable that has compression enabled,
compression_state is set. We don't create any internal tables
on the access node.
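
A hedged sketch of inspecting the new column; the catalog table name
and layout are assumptions and may differ by version:

    -- 1 = compression enabled, 2 = internal compression table
    SELECT table_name, compression_state
    FROM _timescaledb_catalog.hypertable;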

Fixes #2660
2020-12-02 10:42:57 -05:00
gayyappan
05319cd424 Support analyze of internal compression table
This commit modifies analyze behavior as follows:
1. When an internal compression table is analyzed,
statistics from the compressed chunk (such as page
count and tuple count) are used to update the
statistics of the corresponding chunk parent, if
they are missing.

2. Analyze compressed chunks instead of raw chunks.
When the command ANALYZE <hypertable> is executed,
a) uncompressed chunks are analyzed, and b) the raw chunk of a
compressed chunk is skipped and its compressed chunk is analyzed instead.
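
In practice (the hypertable name `metrics` is illustrative):

    ANALYZE metrics;
    -- uncompressed chunks are analyzed directly; for compressed chunks the raw
    -- chunk is skipped and the internal compressed chunk is analyzed instead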
2020-11-11 15:05:14 -05:00
Brian Rowe
8a11b022bc Exclude compressed chunks from ANALYZE/VACUUM
This change makes sure that ANALYZE and VACUUM commands run without
specifying any relations will not clear the stats on compressed chunks
that were saved at compression time. It will also skip any distributed
tables.

Fixes #2576
2020-10-20 09:18:39 -07:00
Brian Rowe
5acf3343b5 Ensure reltuples are preserved during compression
This change captures the reltuples and relpages (and relallvisible)
statistics from the pg_class table for chunks immediately before
truncating them during the compression code path.  It then restores
the values after truncating, as there is no way to keep PostgreSQL
from clearing these values during this operation.  It also properly
uses these values during planning, working around some PostgreSQL
code which substitutes arbitrary sizing for tables that don't seem
to hold data.
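
A rough way to observe this, assuming an illustrative chunk name;
`compress_chunk` is the existing API:

    SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
    -- reltuples/relpages captured before the truncate are restored afterwards
    SELECT reltuples, relpages FROM pg_class WHERE relname = '_hyper_1_1_chunk';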

Fixes #2524
2020-10-19 07:21:38 -07:00
Erik Nordström
4623db14ad Use consistent column names in views
Make all views that reference hypertables use `hypertable_schema` and
`hypertable_name`.
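
For example (assuming the `timescaledb_information.chunks` view
follows this convention):

    SELECT hypertable_schema, hypertable_name, chunk_name
    FROM timescaledb_information.chunks;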
2020-10-05 15:18:47 +02:00
Erik Nordström
202692f1ef Make tests use the new continuous aggregate API
Tests are updated to no longer use continuous aggregate options that
will be removed, such as `refresh_lag`, `max_interval_per_job` and
`ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also
been replaced with `CALL refresh_continuous_aggregate()` using ranges
that try to replicate the previous refresh behavior.

The materializer test (`continuous_aggregate_materialize`) has been
removed, since this tested the "old" materializer code, which is no
longer used without `REFRESH MATERIALIZED VIEW`. The new API using
`refresh_continuous_aggregate` already allows manual materialization
and there are two previously added tests (`continuous_aggs_refresh`
and `continuous_aggs_invalidate`) that cover the new refresh path in
similar ways.

When updated to use the new refresh API, some of the concurrency
tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have
slightly different concurrency behavior. This is explained by
different and sometimes more conservative locking. For instance, the
first transaction of a refresh serializes around an exclusive lock on
the invalidation threshold table, even if no new threshold is
written. The previous code only took the heavier lock once, and only
if a new threshold was written. This new, stricter locking means that
insert processes that read the invalidation threshold will block for a
short time when there are concurrent refreshes. However, since this
blocking only occurs during the first transaction of the refresh
(which is quite short), it probably doesn't matter too much in
practice. The relaxing of locks to improve concurrency and performance
can be implemented in the future.
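
The refresh calls used by the updated tests take roughly this form
(the aggregate name and window bounds are illustrative):

    CALL refresh_continuous_aggregate('conditions_summary', '2020-01-01', '2020-02-01');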
2020-09-11 16:07:21 +02:00
Mats Kindahl
9565cbd0f7 Continuous aggregates support WITH NO DATA
This commit will add support for `WITH NO DATA` when creating a
continuous aggregate and will refresh the continuous aggregate when
creating it unless `WITH NO DATA` is provided.

All test cases are also updated to use `WITH NO DATA`, and an additional
test case is added to verify that both `WITH DATA` and `WITH NO DATA` work
as expected.
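
A sketch of the new form, using illustrative relation names:

    CREATE MATERIALIZED VIEW conditions_hourly
    WITH (timescaledb.continuous) AS
      SELECT time_bucket('1 hour', time) AS bucket, avg(temperature)
      FROM conditions
      GROUP BY bucket
    WITH NO DATA;  -- skip the initial refresh; omit it (or use WITH DATA) to refresh on creation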

Closes #2341
2020-09-11 14:02:41 +02:00
Dmitry Simonenko
e10b437712 Make hypertable_approximate_row_count return row count only
This change renames the function to approximate_row_count() and adds
support for regular tables. It returns a row count estimate for a
table instead of a table list.
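
Usage, with an illustrative table name:

    SELECT approximate_row_count('conditions');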
2020-09-02 12:18:34 +03:00
Mats Kindahl
c054b381c6 Change syntax for continuous aggregates
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view, even though `CREATE MATERIALIZED VIEW` normally creates a
table.  Raise an error if `CREATE VIEW` is used to create a continuous
aggregate and redirect to `CREATE MATERIALIZED VIEW`.

In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.

Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.

Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
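
The command forms that now apply to continuous aggregates look like
this (the view name and option shown are illustrative):

    ALTER MATERIALIZED VIEW conditions_hourly SET (timescaledb.materialized_only = true);
    DROP MATERIALIZED VIEW conditions_hourly;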

Fixes #2233

Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
2020-08-27 17:16:10 +02:00
Brian Rowe
8e1e6036af Preserve pg_stats on chunks before compression
This change will ensure that the pg_statistics on a chunk are
updated immediately prior to compression. It also ensures that
these stats are not overwritten as part of a global or
hypertable-targeted ANALYZE.

This addresses the issue that a chunk will no longer generate valid
statistics during an ANALYZE once the data has been moved to the
compressed table. Unfortunately any compressed rows will not be
captured in the parent hypertable's pg_statistics, as there is no way
to change how PostgreSQL samples child tables in PG11.

This approach assumes that the compressed table remains static, which
is mostly correct in the current implementation (though it is
possible to remove compressed segments). Once we start allowing more
operations on compressed chunks this solution will need to be
revisited. Note that in PG12 an approach leveraging table access
methods will not have a problem analyzing compressed tables.
2020-08-21 10:48:15 -07:00
gayyappan
9f13fb9906 Add functions for compression stats
Add chunk_compression_stats and hypertable_compression_stats
functions to get before/after compression sizes
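
Usage sketch (the hypertable name is illustrative):

    SELECT * FROM hypertable_compression_stats('conditions');
    SELECT * FROM chunk_compression_stats('conditions');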
2020-08-03 10:19:55 -04:00
gayyappan
7d3b4b5442 New size utils functions
Add hypertable_detailed_size, chunk_detailed_size, and
hypertable_size functions.
Remove hypertable_relation_size,
hypertable_relation_size_pretty, and indexes_relation_size_pretty.
Remove size information from the hypertables view.
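
Usage sketch of the new functions (relation names are illustrative):

    SELECT hypertable_size('conditions');
    SELECT * FROM hypertable_detailed_size('conditions');
    SELECT * FROM chunk_detailed_size('_timescaledb_internal._hyper_1_1_chunk');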
2020-07-29 15:30:39 -04:00
gayyappan
926a1c9850 Add compression settings view
Add informational view that lists the settings
used while enabling compression on a hypertable.
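
A hedged usage sketch; the view name here assumes the
`timescaledb_information.compression_settings` name used in current
releases:

    SELECT * FROM timescaledb_information.compression_settings
    WHERE hypertable_name = 'conditions';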
2020-07-23 12:40:12 -04:00
Sven Klemm
3d1a7ca3ac Fix delete on tables involving hypertables with compression
The DML blocker to block INSERTs and UPDATEs on compressed hypertables
would trigger if the UPDATE or DELETE referenced any hypertable with
compressed chunks. This patch changes the logic to only block if the
target of the UPDATE or DELETE is a compressed chunk.
2020-07-20 13:22:49 +02:00
Oleg Smirnov
0e9f1ee9f5 Enable compression for tables with compound foreign key
When enabling compression on a hypertable the existing
constraints are being cloned to the new compressed hypertable.
During validation of existing constraints a loop
through the conkey array is performed, and the constraint name
is erroneously added to the list multiple times. This fix
moves the addition to the list outside the conkey loop.
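
For reference, enabling compression uses this form (table and
segmentby column are illustrative); this is the statement that
previously failed constraint validation for tables with a compound
foreign key:

    ALTER TABLE events SET (
      timescaledb.compress,
      timescaledb.compress_segmentby = 'device_id'
    );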

Fixes #2000
2020-07-02 12:22:30 +02:00
gayyappan
b93b30b0c2 Add counts to compression statistics
Store information related to compressed and uncompressed row
counts after compressing a chunk. This is saved in
compression_chunk_size table.
2020-06-19 15:58:04 -04:00
Mats Kindahl
a089843ffd Make table mandatory for drop_chunks
The `drop_chunks` function is refactored to make the table name mandatory
for the function. As a result, the function was also refactored to
accept the `regclass` type instead of table name plus schema name, and
the parameters were reordered to match the order for `show_chunks`.

The commit also refactors the code to pass the hypertable structure
between internal functions rather than the hypertable relid, and moves
error checks to the PostgreSQL function.  This allows the internal
functions to avoid some lookups and use the information in the
structure directly, and also gives errors earlier instead of first
dropping chunks and then erroring out and rolling back the transaction.
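
The new call form, with illustrative arguments:

    SELECT drop_chunks('conditions', older_than => INTERVAL '3 months');
    -- parameter order now mirrors show_chunks:
    SELECT show_chunks('conditions', older_than => INTERVAL '3 months');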
2020-06-17 06:56:50 +02:00
Stephen Polcyn
b57d2ac388 Cleanup TODOs and FIXMEs
Unless otherwise listed, the TODO was converted to a comment or put
into an issue tracker.

test/sql/
- triggers.sql: Made required change

tsl/test/
- CMakeLists.txt: TODO complete
- bgw_policy.sql: TODO complete
- continuous_aggs_materialize.sql: TODO complete
- compression.sql: TODO complete
- compression_algos.sql: TODO complete

tsl/src/
- compression/compression.c:
  - row_compressor_decompress_row: Expected complete
- compression/dictionary.c: FIXME complete
- materialize.c: TODO complete
- reorder.c: TODO complete
- simple8b_rle.h:
  - compressor_finish: Removed (obsolete)

src/
- extension.c: Removed due to age
- adts/simplehash.h: TODOs are from copied Postgres code
- adts/vec.h: TODO is non-significant
- planner.c: Removed
- process_utility.c
  - process_altertable_end_subcmd: Removed (PG will handle case)
2020-05-18 20:16:03 -04:00
Derek Marsh
88773323f4 Ignore dropped chunks in compressed_chunk_stats 2020-04-16 16:34:46 +02:00
Ruslan Fomkin
16897d2238 Drop FK constraints on chunk compression
Drop foreign key constraints from uncompressed chunks during
compression. This allows data deletion in FK-referenced tables to
cascade to compressed chunks. The foreign key constraints are restored
during decompression.
2020-04-14 23:12:15 +02:00
Erik Nordström
afb4c7ba51 Refactor planner hooks
This change refactors our main planner hooks in `planner.c` with the
intention of providing a consistent way to classify planned relations
across hooks. In our hooks, we'd like to know whether a planned
relation (`RelOptInfo`) is one of the following:

* Hypertable
* Hypertable child (a hypertable can appear as a child of itself)
* Chunk as a child of hypertable (from expansion)
* Chunk as standalone (operation directly on chunk)
* Any other relation

Previously, there was no way to consistently know which of these one
was dealing with. Instead, a mix of various functions was used without
"remembering" the classification for reuse in later sections of the
code.

When classifying relations according to the above categories, the only
source of truth about a relation is our catalog metadata. In case of
hypertables, this is cached in the hypertable cache. However, this
cache is read-through, so, in case of a cache miss, the metadata will
always be scanned to resolve a new entry. To avoid unnecessary
metadata scans, this change introduces a way to do cache-only
queries. This requires maintaining a single warmed cache throughout
planning and is enabled by using a planner-global cache object. The
pre-planning query processing warms the cache by populating it with
all hypertables in the to-be-planned query.
2020-04-14 23:12:15 +02:00
Ruslan Fomkin
bddcf2a78a Enable collation test with compression
Re-enables the collation error test as part of the compression
test.
2020-04-14 23:12:15 +02:00
Joshua Lockerman
949b88ef2e Initial support for PostgreSQL 12
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:

- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
  PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
  planner changes in PostgreSQL (gapfill, first/last, etc.)

Before PostgreSQL 12, planning was organized something like as
follows:

 1. construct `RelOptInfo`s for base and appendrels
 2. add restrict info, joins, etc.
 3. perform the actual planning with `make_one_rel`

For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.

However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:

 1. construct RelOptInfos for base rels only
 2. add restrict info, joins, etc.
 3. expand appendrels, removing irrelevant declarative partitions
 4. perform the actual planning with make_one_rel

Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.

The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` does entail
a number of annoyances:

 1. this hook is called after the sizes of tables are calculated, so we
    must recalculate the sizes of all hypertables, as they will not
    have taken the chunk sizes into account
 2. the table upon which the hook is called will have its paths planned
    under the assumption it has no inheritance children, so if it's a
    hypertable we have to replan its paths

Unfortunately, the code for doing these steps is static, so we need to
copy it into our own codebase instead of just using PostgreSQL's.

In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:

- In stage 1, the statement is planned as if it was a `SELECT` and all
  leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table,
  discovered in stage 1, directly, not part of an Append.

Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not yet initialized at the point we
wish to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
2020-04-14 23:12:15 +02:00
Sven Klemm
039607dc1a Add rescan function to CompressChunkDml CustomScan node
The CompressChunkDml custom scan was missing a rescan function
leading to a segfault in plans that required a rescan.
2020-03-25 01:37:53 +01:00
Erik Nordström
474db5e448 Fix continuous aggs DDL test on PG9.6
The test `continuous_aggs_ddl` failed on PostgreSQL 9.6 because it had
a line that tested compression on a hypertable when this feature is
not supported in 9.6. This prevented a large portion of the test from
running on 9.6.

This change moves the testing of compression on a continuous aggregate
to the `compression` test instead, which only runs on supported
PostgreSQL versions. A permission check on a view is also removed,
since similar tests are already in the `continuous_aggs_permissions`
tests.

The permission check was the only thing that caused different output
across PostgreSQL versions, so therefore the test no longer requires
version-specific output files and has been simplified to use the same
output file irrespective of PostgreSQL version.
2020-03-12 14:02:16 +01:00
Sven Klemm
030443a8e2 Fix compressing interval columns
When trying to compress a chunk that had a column of datatype
interval, delta-delta compression would be selected for the column,
but our delta-delta compression does not support interval and
would throw an error when trying to compress the chunk.

This PR changes the compression selected for interval to dictionary
compression.
2020-03-06 21:44:31 +01:00
Matvey Arye
94f3cff709 Fix bug with parent table in decompression
Fix bug with transparent decompression getting the
hypertable parent table. This can happen with self-referencing
updates.

Fixes #1555
2020-01-07 17:33:01 -05:00
gayyappan
87786f1520 Add compressed table size to existing views
Some information views report hypertable sizes. Include
compressed table size in the calculation when applicable.
2019-10-29 19:02:58 -04:00
Matvey Arye
0f3e74215a Split segment meta min_max into two columns
This simplifies the code and the access to the min/max
metadata. Before we used a custom type, but now the min/max
are just the same type as the underlying column and stored as two
columns.

This also removes the custom type that was used before.
2019-10-29 19:02:58 -04:00
gayyappan
43aa49ddc0 Add more information in compression views
Rename compression views to compressed_hypertable_stats and
compressed_chunk_stats and summarize information about compression
status for chunks.
2019-10-29 19:02:58 -04:00
gayyappan
909b0ece78 Block updates/deletes on compressed chunks 2019-10-29 19:02:58 -04:00
gayyappan
edd3999553 Add trigger to block INSERT on compressed chunk
Prevent insert on compressed chunks by adding a trigger that blocks it.
Enable insert if the chunk gets decompressed.
2019-10-29 19:02:58 -04:00
Matvey Arye
8250714a29 Add fixes for Windows
- Fix declaration of functions wrt TSDLLEXPORT consistency
- Empty structs need to be created with '{ 0 }' syntax.
- Alignment sentinels have to use uint64 instead of a struct
  with a 0-size member
- Add some more ORDER BY clauses in the tests to constrain
  the order of results
- Add ANALYZE after running compression in
  transparent-decompression test
2019-10-29 19:02:58 -04:00
Matvey Arye
2bf97e452d Push down quals to segment meta columns
This commit pushes down quals on order_by columns to make
use of the SegmentMetaMinMax objects. Namely =, <, <=, >, >= quals
can now be pushed down.

We also remove filters from decompress node for quals that
have been pushed down and don't need a recheck.

This commit also changes tests to add more segment by and
order-by columns.

Finally, we rename the segment meta accessor functions to give them
shorter names.
2019-10-29 19:02:58 -04:00
Matvey Arye
5c891f732e Add sequence id metadata col to compressed table
Add a sequence id to the compressed table. This id increments
monotonically for each compressed row in a way that follows
the order by clause. We leave gaps to allow for the
possibility of filling in rows due to e.g. inserts down
the line.

The sequence id is global to the entire chunk and does not reset
for each segment-by-group-change since this has the potential
to allow some micro optimizations when ordering by segmentby
columns as well.

The sequence number is an INT32, which allows up to 200 billion
uncompressed rows per chunk to be supported (assuming 1000 rows
per compressed row and a gap of 10). Overflow is checked in the
code and will error if this is breached.
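
As a rough sanity check on that figure: an INT32 provides about
2^31 ≈ 2.1 billion sequence values; with a gap of 10 that is roughly
210 million compressed rows per chunk, and at ~1000 uncompressed rows
per compressed row that comes out to on the order of 200 billion
uncompressed rows.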
2019-10-29 19:02:58 -04:00
Matvey Arye
b4a7108492 Integrate segment meta into compression
This commit integrates the SegmentMetaMinMax into the
compression logic. It adds metadata columns to the compressed table
and correctly sets them upon compression.

We also fix several errors with datum detoasting in SegmentMetaMinMax
2019-10-29 19:02:58 -04:00
Joshua Lockerman
8b273a5187 Fix flush when num-rows overflow
We should only free the segment-bys when we're changing groups, not when
we've got too many rows to compress; in that case we'll need them.
2019-10-29 19:02:58 -04:00
Sven Klemm
45fac0ebe6 Add test for compress_chunk plan invalidation
This patch adds a testcase for prepared statement plan invalidation
when a chunk gets compressed.
2019-10-29 19:02:58 -04:00
gayyappan
6832ed2ca5 Modify storage type for toast columns
This PR modifies the TOAST storage type for compressed columns based on
the algorithm used for compression.
2019-10-29 19:02:58 -04:00
Sven Klemm
4cc1a4159a Add DecompressChunk custom scan node
This patch adds a DecompressChunk custom scan node, which will be
used when querying hypertables with compressed chunks to transparently
decompress chunks.
2019-10-29 19:02:58 -04:00
Matvey Arye
f6573f9247 Add a metadata count column to compressed table
This is useful if some or all compressed columns are NULL.
The count reflects the number of uncompressed rows that are
in the compressed row. Stored as a 32-bit integer.
2019-10-29 19:02:58 -04:00
Matvey Arye
a078781c2e Add decompress_chunk function
This is the inverse of compress_chunk.
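
Usage sketch, with an illustrative chunk name:

    SELECT decompress_chunk('_timescaledb_internal._hyper_1_1_chunk');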
2019-10-29 19:02:58 -04:00
Matvey Arye
9223f08d68 Truncate chunks after (de-)compression
This commit will truncate the original chunk after compression
or decompression.
2019-10-29 19:02:58 -04:00