24 Commits

Author SHA1 Message Date
gayyappan
d8d392914a Support for compression on continuous aggregates
Enable ALTER MATERIALIZED VIEW (timescaledb.compress)
This enables compression on the underlying materialized
hypertable. The segmentby and orderby columns for
compression are based on the GROUP BY clause and time_bucket
clause used while setting up the continuous aggregate.

Change the definition of the timescaledb_information.continuous_aggregate
view.

Add support for compression policy on continuous
aggregates

Move code from job.c to policy_utils.c
Add support functions to check compression
policy validity for continuous aggregates.
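
A minimal sketch of the user-facing commands, assuming a continuous
aggregate named conditions_summary:

ALTER MATERIALIZED VIEW conditions_summary SET (timescaledb.compress);
SELECT add_compression_policy('conditions_summary',
    compress_after => INTERVAL '30 days');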
2021-12-17 10:51:33 -05:00
Fabrízio de Royes Mello
7e3e771d9f Fix compression policy on tables using INTEGER
Commit fffd6c2350f5b3237486f3d49d7167105e72a55b fixed a problem related
to the PortalContext by using a PL/pgSQL procedure to execute the policy.
Unfortunately, the new implementation introduced a problem when the
time dimension uses INTEGER rather than BIGINT.

Fixed it by dealing correctly with the integer types: SMALLINT, INTEGER
and BIGINT.
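
For illustration, a hypothetical setup that exercises the integer path
(table, column, and function names are assumptions):

CREATE TABLE events(time INTEGER NOT NULL, value FLOAT);
SELECT create_hypertable('events', 'time', chunk_time_interval => 1000);
CREATE FUNCTION events_now() RETURNS INTEGER LANGUAGE SQL STABLE
    AS $$ SELECT coalesce(max(time), 0) FROM events $$;
SELECT set_integer_now_func('events', 'events_now');
ALTER TABLE events SET (timescaledb.compress);
SELECT add_compression_policy('events', compress_after => 100);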

Also refactored the policy compression procedure, replacing the two
procedures `policy_compression_{interval|integer}` with a single
`policy_compression_execute` that casts the dimension type dynamically.

Fixes #3773
2021-11-05 14:55:23 -03:00
gayyappan
77c969071c Modify compression job processing logic
Instead of picking 1 chunk for processing, we find the
list of chunks that have to be compressed by
the compression job, and proceed to process each one in its
own transaction. Without this, we could end up in a situation
where the first chunk is continually picked for recompression
(due to active inserts into the chunk) and we don't make any
progress.

We can limit the number of chunks processed by a single run
of the job by setting the new config parameter
maxchunks_to_compress for the compression job. Valid values
are > 0. The job processes at most maxchunks_to_compress
chunks and defers any remaining items to the next scheduled
run of the job. The default is to process all pending chunks.

We have an additional job config parameter: verbose_log.
When enabled, the job logs each chunk that it processes.
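
As a sketch, both parameters can be set by merging them into the
existing policy config with alter_job (the proc_name filter is an
assumption about how the compression jobs are looked up):

SELECT alter_job(job_id,
    config => config || '{"maxchunks_to_compress": 5, "verbose_log": true}'::jsonb)
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_compression';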
2021-09-09 11:49:37 -04:00
gayyappan
4f865f7870 Add recompress_chunk function
After inserts go into a compressed chunk, the chunk is marked as
unordered. This PR adds a new function, recompress_chunk, that
compresses the data and sets the status back to compressed. Further
optimizations for this function are planned but are not part of this PR.

This function can be invoked by calling
SELECT recompress_chunk(<chunk_name>).

The recompress_chunk function is invoked automatically by the
compression policy job when it sees that a chunk is in the
unordered state.
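
A small usage sketch (the chunk name below is a placeholder; chunk
names can be listed with show_chunks() on the hypertable):

SELECT recompress_chunk('_timescaledb_internal._hyper_1_2_chunk');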
2021-05-24 18:03:47 -04:00
Markos Fountoulakis
bc740a32fb Add distributed hypertable compression policies
Add support for compression policies on Access Nodes. Extend the
compress_chunk() function to maintain compression state per chunk
on the Access Node.
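
A sketch of the access-node usage, assuming a distributed hypertable
named dist_conditions with a device_id column:

ALTER TABLE dist_conditions SET (timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id');
SELECT add_compression_policy('dist_conditions',
    compress_after => INTERVAL '7 days');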
2021-05-07 16:50:12 +03:00
Erik Nordström
202692f1ef Make tests use the new continuous aggregate API
Tests are updated to no longer use continuous aggregate options that
will be removed, such as `refresh_lag`, `max_interval_per_job` and
`ignore_invalidation_older_than`. `REFRESH MATERIALIZED VIEW` has also
been replaced with `CALL refresh_continuous_aggregate()` using ranges
that try to replicate the previous refresh behavior.
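
For reference, the new call has the form (view name and window are
examples):

CALL refresh_continuous_aggregate('conditions_summary',
    '2020-01-01', '2020-02-01');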

The materializer test (`continuous_aggregate_materialize`) has been
removed, since this tested the "old" materializer code, which is no
longer used without `REFRESH MATERIALIZED VIEW`. The new API using
`refresh_continuous_aggregate` already allows manual materialization
and there are two previously added tests (`continuous_aggs_refresh`
and `continuous_aggs_invalidate`) that cover the new refresh path in
similar ways.

When updated to use the new refresh API, some of the concurrency
tests, like `continuous_aggs_insert` and `continuous_aggs_multi`, have
slightly different concurrency behavior. This is explained by
different and sometimes more conservative locking. For instance, the
first transaction of a refresh serializes around an exclusive lock on
the invalidation threshold table, even if no new threshold is
written. The previous code only took the heavier lock once, and only if a
new threshold was written. This new, stricter locking means that
insert processes that read the invalidation threshold will block for a
short time when there are concurrent refreshes. However, since this
blocking only occurs during the first transaction of the refresh
(which is quite short), it probably doesn't matter too much in
practice. The relaxing of locks to improve concurrency and performance
can be implemented in the future.
2020-09-11 16:07:21 +02:00
Mats Kindahl
9565cbd0f7 Continuous aggregates support WITH NO DATA
This commit will add support for `WITH NO DATA` when creating a
continuous aggregate and will refresh the continuous aggregate when
creating it unless `WITH NO DATA` is provided.
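
A minimal sketch, assuming a conditions hypertable with a temperature
column:

CREATE MATERIALIZED VIEW conditions_summary
WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 day', time) AS bucket, avg(temperature)
    FROM conditions
    GROUP BY bucket
WITH NO DATA;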

All test cases are also updated to use `WITH NO DATA`, and an additional
test case is added to verify that both `WITH DATA` and `WITH NO DATA` work
as expected.

Closes #2341
2020-09-11 14:02:41 +02:00
Erik Nordström
caf64357f4 Handle dropping chunks with continuous aggregates
This change makes the behavior of dropping chunks on a hypertable that
has associated continuous aggregates consistent with other
mutations. In other words, any way of deleting data, irrespective of
whether this is done through a `DELETE`, `DROP TABLE <chunk>` or
`drop_chunks` command, will invalidate the region of deleted data so
that a subsequent refresh of a continuous aggregate will know that the
region is out-of-date and needs to be materialized.
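
For example (hypertable and aggregate names are assumptions), dropping
chunks now leaves an invalidation behind that the next refresh picks up:

SELECT drop_chunks('conditions', older_than => '2020-06-01'::timestamptz);
CALL refresh_continuous_aggregate('conditions_summary',
    '2020-01-01', '2020-06-01');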

Previously, only a `DELETE` would invalidate continuous aggregates,
while `DROP TABLE <chunk>` and `drop_chunks` did not. In fact, each
way to delete data had different behavior:

1. A `DELETE` would generate invalidations and the materializer would
   update any aggregates to reflect the changes.
2. A `DROP TABLE <chunk>` would not generate invalidations and the
   changes would therefore not be reflected in aggregates.
3. A `drop_chunks` command would not work unless
   `ignore_invalidation_older_than` was set. When enabled, the
   `drop_chunks` would first materialize the data to be dropped and
   then never materialize that region again, unless
   `ignore_invalidation_older_than` was reset. But then the continuous
   aggregates would be in an undefined state since invalidations had
   been ignored.

Due to the different behavior of these mutations, a continuous
aggregate could get "out-of-sync" with the underlying hypertable. This
has now been fixed.

For the time being, the previous behavior of "refresh-on-drop" (i.e.,
materializing the data on continuous aggregates before dropping it) is
retained for `drop_chunks`. However, such "refresh-on-drop" behavior
should probably be revisited in the future, since it happens silently
by default without an opt-out. There are situations when such silent
refreshing might be undesirable; for instance, let's say the dropped
data had seen erroneous backfill that a user wants to ignore. Another
issue with "refresh-on-drop" is that it only happens for `drop_chunks`
and not other ways of deleting data.

Fixes #2242
2020-09-09 21:14:45 +02:00
Sven Klemm
4397e57497 Remove job_type from bgw_job table
Due to recent refactoring all policies now use the columns added
with the generic job support so the job_type column is no longer
needed.
2020-09-01 14:49:30 +02:00
Mats Kindahl
c054b381c6 Change syntax for continuous aggregates
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view for the continuous aggregate, whereas a regular `CREATE
MATERIALIZED VIEW` would create a table. Raise an error if `CREATE VIEW`
is used to create a continuous aggregate and redirect to `CREATE
MATERIALIZED VIEW`.

In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.

Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.

Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
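
A minimal sketch of the new forms (the view name is an assumption, and
timescaledb.materialized_only is used only as an example option):

ALTER MATERIALIZED VIEW conditions_summary
    SET (timescaledb.materialized_only = false);
DROP MATERIALIZED VIEW conditions_summary;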

Fixes #2233

Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
2020-08-27 17:16:10 +02:00
Sven Klemm
a9c087eb1e Allow scheduling custom functions as bgw jobs
This patch adds functionality to schedule arbitrary functions
or procedures as background jobs.

New functions:

add_job(
  proc REGPROC,
  schedule_interval INTERVAL,
  config JSONB DEFAULT NULL,
  initial_start TIMESTAMPTZ DEFAULT NULL,
  scheduled BOOL DEFAULT true
)

Add a job that runs proc every schedule_interval. Proc can
be either a function or a procedure implemented in any language.

delete_job(job_id INTEGER)

Deletes the job.

run_job(job_id INTEGER)

Execute a job in the current session.
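
A hedged end-to-end sketch (procedure name and body are assumptions):

CREATE PROCEDURE custom_maintenance(job_id INT, config JSONB)
LANGUAGE PLPGSQL AS $$
BEGIN
  RAISE NOTICE 'job % run with config %', job_id, config;
END
$$;

SELECT add_job('custom_maintenance', '1 hour', config => '{"note": "example"}');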
2020-08-20 11:23:49 +02:00
Sven Klemm
d547d61516 Refactor continuous aggregate policy
This patch modifies the continuous aggregate policy to store its
configuration in the jobs table.
2020-08-11 22:57:02 +02:00
Ruslan Fomkin
56b4c10a74 Fix error messages to compression policy
Error messages are improved and formulated in terms of compression
policy.
2020-08-06 19:17:44 +02:00
Ruslan Fomkin
393e5b9c1a Remove enabling enterprise from compression test
Compression is no longer an enterprise feature, so enabling
enterprise is not needed in the tests.
2020-08-05 14:25:27 +02:00
gayyappan
9f13fb9906 Add functions for compression stats
Add chunk_compression_stats and hypertable_compression_stats
functions to get before/after compression sizes
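
Usage sketch (the hypertable name is an assumption):

SELECT * FROM hypertable_compression_stats('conditions');
SELECT chunk_name, before_compression_total_bytes, after_compression_total_bytes
FROM chunk_compression_stats('conditions');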
2020-08-03 10:19:55 -04:00
Mats Kindahl
590446c6a7 Remove cascade_to_materialization parameter
The parameter `cascade_to_materialization` is removed from
`drop_chunks` and `add_drop_chunks_policy` as well as associated tables
and test functions.

Fixes #2137
2020-07-31 11:21:36 +02:00
Sven Klemm
0d5f1ffc83 Refactor compress chunk policy
This patch changes the compression policy to store its configuration
in the bgw_job table and removes the bgw_policy_compress_chunks table.
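
For inspection, something like the following (a sketch; the proc_name
value is an assumption):

SELECT id, proc_name, config
FROM _timescaledb_config.bgw_job
WHERE proc_name = 'policy_compression';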
2020-07-30 19:58:37 +02:00
Mats Kindahl
a089843ffd Make table mandatory for drop_chunks
The `drop_chunks` function is refactored to make table name mandatory
for the function. As a result, the function was also refactored to
accept the `regclass` type instead of table name plus schema name and
the parameters were reordered to match the order for `show_chunks`.

The commit also refactors the code to pass the hypertable structure
between internal functions rather than the hypertable relid, and moves
error checks to the PostgreSQL function. This allows the internal
functions to avoid some lookups and use the information in the
structure directly, and also raises errors earlier instead of first
dropping chunks and then erroring and rolling back the transaction.
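
The resulting call form, mirroring show_chunks (name and interval are
examples):

SELECT show_chunks('conditions', older_than => INTERVAL '3 months');
SELECT drop_chunks('conditions', older_than => INTERVAL '3 months');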
2020-06-17 06:56:50 +02:00
Mats Kindahl
d465c81e6a Do not compress chunks that are dropped
The function `get_chunks_to_compress` returns chunks that are not
compressed but that are dropped, meaning a lookup using
`ts_chunk_get_by_id` will fail to find the corresponding `table_id`,
which later leads to a null pointer when looking for the chunk. This
leads to a segmentation fault.

This commit fixes this by ignoring chunks that are marked as
dropped in the chunk table when scanning for chunks to compress.
2020-03-16 20:33:34 +01:00
Brian Rowe
25eb98c0ec Prevent starting background workers with NOLOGIN
This change will check SQL commands that start a background worker
on a hypertable to verify that the table owner has permission to
log into the database.  This is necessary, as background workers for
these commands will run with the permissions of the table owner, and
thus immediately fail if unable to log in.
2020-03-08 15:09:23 -07:00
Sven Klemm
0cc22ad278 Stop background worker in tests
To make tests more stable and to remove some repeated code in the
tests, this PR changes the test runner to stop background workers.
Individual tests that need background workers can still start them.
This PR only stops background workers for the initial database used
by the test; behaviour for additional databases created during the
tests will not change.
2020-03-06 15:27:53 +01:00
Matvey Arye
a2ea01831a Fix compression_bgw test flakiness
Previously we were creating multiple rows using generate_series
and now(). Depending on the time of day the test was run, this
could create one or two chunks, causing flakiness.

We changed the test to create only one row and thus one chunk.
2019-10-29 19:02:58 -04:00
gayyappan
43aa49ddc0 Add more information in compression views
Rename compression views to compressed_hypertable_stats and
compressed_chunk_stats and summarize information about compression
status for chunks.
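
A usage sketch, assuming the views live in the timescaledb_information
schema:

SELECT * FROM timescaledb_information.compressed_hypertable_stats;
SELECT * FROM timescaledb_information.compressed_chunk_stats;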
2019-10-29 19:02:58 -04:00
gayyappan
6e60d2614c Add compress chunks policy support
Add and drop compress chunks policy using bgw
infrastructure.
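
A sketch of the policy calls, assuming the pre-2.0 function names
add_compress_chunks_policy and remove_compress_chunks_policy and a
conditions hypertable:

SELECT add_compress_chunks_policy('conditions', INTERVAL '7 days');
SELECT remove_compress_chunks_policy('conditions');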
2019-10-29 19:02:58 -04:00