This change adds the ability to truncate a continuous aggregate and
its source hypertable. Like other mutations (DELETE, UPDATE) on a
continuous aggregate or the underlying hypertable, an invalidation
needs to be added to ensure the aggregate can be refreshed again.
When a hypertable is truncated, an invalidation is added to the
hypertable invalidation log so that all its continuous aggregates will
be invalidated. When a specific continuous aggregate is truncated, the
invalidation is instead added to the continuous aggregate invalidation
log, so that only the one aggregate is invalidated.
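For example, assuming a hypertable `conditions` with a continuous
aggregate `conditions_summary` (both names are made up for
illustration):

    -- Truncating the hypertable adds an invalidation to the hypertable
    -- invalidation log, so all continuous aggregates on it are
    -- invalidated.
    TRUNCATE conditions;

    -- Truncating a single continuous aggregate adds the invalidation to
    -- the continuous aggregate invalidation log, so only that aggregate
    -- needs to be re-materialized on the next refresh.
    TRUNCATE conditions_summary;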
This change fixes some corner-case issues that could lead to a refresh
not actually refreshing when it should.
The issues arise because invalidations are inclusive at both ends
while the refresh window is exclusive at the end. In some cases, this
wasn't correctly accounted for. For example, an invalidation ending at
time 10 requires a refresh window ending at 11 (exclusive) for the
value at time 10 to be re-materialized.
To fix the issues, the remainder after cutting invalidations has been
adjusted, and we always add 1 to the end of the refresh window when it
is set from an invalidation. In addition, an extra check is added for
the case where, when computing the bucketed refresh window, the end of
the refresh window falls on the start of a bucket (i.e., the window
size is 1).
The test suite has also been expanded to test for some of these corner
cases.
Postgres added a new ALTER TABLE subcommand in a bugfix for
CREATE TABLE LIKE with inheritance, which was backported to previous
versions. We check for the presence of the subcommand with CMake so
we can support building against snapshot versions, which also fixes
the ABI breakage test.
The upstream Postgres commit is 5028981923.
This change ensures a refresh of a continuous aggregate only
re-materializes the part of the aggregate that has been
invalidated. This makes refreshing much more efficient, and sometimes
eliminates the need to materialize data entirely (i.e., in case there
are no invalidations in the refresh window).
The ranges to refresh are the remainders of invalidations after they
are cut by the refresh window (i.e., all invalidations, or parts of
invalidations, that fall within the refresh window). The invalidations
used for a refresh are collected in a tuple store (which spills to
disk) so as not to allocate too much memory when there are many
invalidations. Invalidations are, however, merged and deduplicated
before being added to the tuple store, similar to how invalidations are
processed in the invalidation logs.
Currently, refreshing simply materializes all invalidated ranges in
the order they appear in the tuple store; the ordering does not matter
since all invalidated regions are refreshed in the same transaction.
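For illustration, assuming a continuous aggregate named
`conditions_summary` (the name and the window below are made up):

    -- Only invalidated ranges that overlap the given refresh window are
    -- re-materialized; if nothing in the window has been invalidated,
    -- no materialization work is done at all.
    CALL refresh_continuous_aggregate('conditions_summary',
                                      '2020-05-01'::timestamptz,
                                      '2020-06-01'::timestamptz);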
In its initial state, a continuous aggregate should be completely
invalidated. Therefore, this change adds an infinite invalidation
`[-Infinity, +Infinity]` when a continuous aggregate is created.
This change refactors and cleans up some of the test infrastructure
around distributed transactions. In particular, the node killer now
waits for the killed process to exit in an attempt to make tests more
predictable.
An optimization for `time_bucket` transforms expressions of the form
`time_bucket(10, time) < 100` to `time < 100 + 10` in order to do
chunk exclusion and make better use of indexes on the time
column. However, since one bucket is added to the timestamp when doing
this transformation, the timestamp can overflow.
While a check for such overflows already exists, it uses `+Infinity`
(INT64_MAX/DT_NOEND) as the upper bound instead of the actual end of
the valid timestamp range. A further complication arises because
TimescaleDB internally converts timestamps to UNIX epoch time, thus
losing a little bit of the valid timestamp range in the process. Dates
are further restricted by the fact that they are internally first
converted to timestamps (thus limited by the timestamp range) and then
converted to UNIX epoch.
This change fixes the overflow issue by only applying the
transformation if the resulting timestamps or dates stay within the
valid (TimescaleDB-specific) ranges.
A test has also been added to show the valid timestamp and date
ranges, both PostgreSQL and TimescaleDB-specific ones.
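To make this concrete, a sketch using a hypothetical hypertable
`conditions` with an integer time column:

    -- A predicate like
    --   time_bucket(10, time) < 100
    -- can be transformed by the planner into
    --   time < 100 + 10
    -- so that chunk exclusion and indexes on the time column can be
    -- used. The transformation is now skipped if adding one bucket
    -- would push the boundary outside the valid range of the time type.
    SELECT * FROM conditions WHERE time_bucket(10, time) < 100;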
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view under the hood, even though `CREATE MATERIALIZED VIEW` normally
creates a table. An error is raised if `CREATE VIEW` is used to create a
continuous aggregate, redirecting the user to `CREATE MATERIALIZED VIEW`.
In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.
Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.
Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
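A minimal sketch of the new syntax (the table, view, and column names
are made up; the `materialized_only` option in the ALTER example is
just one possible setting):

    CREATE MATERIALIZED VIEW conditions_summary
    WITH (timescaledb.continuous) AS
    SELECT time_bucket(INTERVAL '1 hour', time) AS bucket,
           avg(temperature) AS avg_temp
    FROM conditions
    GROUP BY time_bucket(INTERVAL '1 hour', time);

    ALTER MATERIALIZED VIEW conditions_summary
        SET (timescaledb.materialized_only = true);

    DROP MATERIALIZED VIEW conditions_summary;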
Fixes #2233
Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
This maintenance release contains bugfixes since the 1.7.2 release. We deem it high
priority for upgrading.
In particular, the fixes contained in this maintenance release address issues in
compression, drop_chunks and the background worker scheduler.
**Bugfixes**
* #2059 Improve inferring start and stop arguments from gapfill query
* #2067 Support moving compressed chunks
* #2068 Apply SET TABLESPACE for compressed chunks
* #2090 Fix index creation with IF NOT EXISTS for existing indexes
* #2092 Fix delete on tables involving hypertables with compression
* #2164 Fix telemetry installed_time format
* #2184 Fix background worker scheduler memory consumption
* #2222 Fix `negative bitmapset member not allowed` in decompression
* #2255 Propagate privileges from hypertables to chunks
* #2256 Fix segfault in chunk_append with space partitioning
* #2259 Fix recursion in cache processing
* #2261 Lock dimension slice tuple when scanning
**Thanks**
* @akamensky for reporting an issue with drop_chunks and ChunkAppend with space partitioning
* @dewetburger430 for reporting an issue with setting tablespace for compressed chunks
* @fvannee for reporting an issue with cache invalidation
* @nexces for reporting an issue with ChunkAppend on space-partitioned hypertables
* @PichetGoulu for reporting an issue with index creation and IF NOT EXISTS
* @prathamesh-sonpatki for contributing a typo fix
* @sezaru for reporting an issue with background worker scheduler memory consumption
Previously, cache invalidation could cause recursion in cache processing.
PR #1493 fixed this for the
cache_invalidate_callback() -> ts_extension_invalidate()
call path. But the call path
cache_invalidate_callback() -> ts_extension_is_loaded()
could still go into recursion.
So, this PR moves the recursion-prevention logic into
extension_update_state(), which is common to both call paths.
Fixes #2200.
In the function `ts_hypercube_from_constraints`, a hypercube is built
from constraints that reference dimension slices in `dimension_slice`.
As part of a run of `drop_chunks`, or when a chunk is explicitly dropped
as part of other operations, dimension slices can be removed from this
table, which makes the hypercube reference non-existent dimension slices
and subsequently causes a crash.
This commit fixes this by adding a tuple lock on the dimension slices
that are used to build the hypercube.
If two `drop_chunks` are running concurrently, there can be a race if
dimension slices are removed as a result of removing a chunk. We treat
this case in the same way as if the dimension slice was updated: report
an error that another session locked the tuple.
Fixes #1986
If a hypertable is accidentally broken because a dimension slice is
missing, a segmentation fault results when an attempt is made to
remove a chunk that references the dimension slice. This happens
because there is no check that the dimension slice was found; the
assumption is that it should be there (by design). Instead of crashing
the server, this commit adds code that prints a warning that the
dimension slice did not exist and proceeds with removing the chunk. This
is safe since the chunk should be removed anyway and the missing
dimension slice does not change this.
Add implementation for debug waitpoints. Debug waitpoints can be added
to code and can be enabled using the `ts_debug_waitpoint_enable`
function. Once execution reaches this point, the session will block
waiting for a call to `ts_debug_waitpoint_release`.
Waitpoints are added to code by using the macro `DEBUG_WAITPOINT` with
a string, for example:
DEBUG_WAITPOINT("references_fetched");
The string is hashed to compute a 32-bit number that is then used as a
shared advisory lock. The waitpoint can be enabled with the function
`ts_debug_waitpoint_enable`. This function takes a string, computes its
hash, and uses the hash to take an exclusive advisory lock. This
causes all sessions reaching the waitpoint to block until the lock
is released with `ts_debug_waitpoint_release`.
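As a usage sketch (the waitpoint name comes from the example above;
the exact SQL-level signatures of the enable/release functions are
assumptions here):

    -- Session 1: enable the waitpoint.
    SELECT ts_debug_waitpoint_enable('references_fetched');

    -- Session 2: executes the code containing
    -- DEBUG_WAITPOINT("references_fetched") and blocks at that point.

    -- Session 1: release the waitpoint so the blocked session continues.
    SELECT ts_debug_waitpoint_release('references_fetched');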
When Postgres prunes children before we create the ChunkAppend
path, there might be a mismatch between the children of the path
and the ordered list of children of a space-partitioned hypertable.
Fixes #1841
This change removes, simplifies, and unifies code related to
`drop_chunks` and `show_chunks`. As a result of prior changes to
`drop_chunks`, e.g., making table relid mandatory and removing
cascading options, there's an opportunity to clean up and simplify the
rather complex code for dropping and showing chunks.
In particular, `show_chunks` is now consistent with `drop_chunks`; the
relid argument is mandatory, a continuous aggregate can be used in
place of a hypertable, and the input time ranges are checked and
handled in the same way.
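For example (the hypertable, aggregate, and interval below are
hypothetical):

    -- Both functions take the relation as a mandatory first argument
    -- and accept the same kind of time range.
    SELECT show_chunks('conditions', older_than => INTERVAL '3 months');
    SELECT drop_chunks('conditions', older_than => INTERVAL '3 months');

    -- A continuous aggregate can be used in place of the hypertable.
    SELECT show_chunks('conditions_summary',
                       older_than => INTERVAL '3 months');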
Unused code is also removed; for instance, code that cascaded drop
chunks to continuous aggregates remained in the code base even though
the option no longer exists.
When a scan is started using `ts_scanner_start_scan`, it creates a
snapshot to use when scanning the data, but this snapshot is not
subsequently used in `ts_scanner_next`.
This commit fixes this by using the already created snapshot instead of
calling `GetLatestSnapshot()`.
Since GitHub updated the macOS image used for CI, LLVM was updated
to a version that is buggy on macOS. This patch therefore disables
building Postgres with LLVM support on macOS. It also fixes the cache
suffix code for macOS, because it changed with the latest runner
version, and now uses the image version as the cache suffix.
https://bugs.llvm.org/show_bug.cgi?id=47226
This change will ensure that the pg_statistics on a chunk are
updated immediately prior to compression. It also ensures that
these stats are not overwritten as part of a global or hypertable
targeted ANALYZE.
This addresses the issue that a chunk will no longer generate valid
statistics during an ANALYZE once the data has been moved to the
compressed table. Unfortunately, any compressed rows will not be
captured in the parent hypertable's pg_statistics, as there is no way
to change how PostgreSQL samples child tables in PG11.
This approach assumes that the compressed table remains static, which
is mostly correct in the current implementation (though it is
possible to remove compressed segments). Once we start allowing more
operations on compressed chunks this solution will need to be
revisited. Note that in PG12 an approach leveraging table access
methods will not have a problem analyzing compressed tables.
When trying to alter a job with a NULL config, alter_job did not
set the isnull field for the config and would segfault when trying
to build the result tuple.
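For reference, a sketch of the kind of call that previously crashed
(the job id is hypothetical):

    -- Altering a job whose config is NULL used to segfault when the
    -- result tuple was built; it now returns the row with a NULL config.
    SELECT alter_job(1000, scheduled => false);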
Option `migrate_data` does not currently work for distributed
hypertables, so we block it for the time being and generate an error if
an attempt is made to migrate data when creating a distributed
hypertable.
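Roughly, assuming a table `conditions` on a multi-node setup:

    -- migrate_data is not supported for distributed hypertables,
    -- so this now raises an error.
    SELECT create_distributed_hypertable('conditions', 'time',
                                         migrate_data => true);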
Fixes #2230
This patch adds functionality to schedule arbitrary functions
or procedures as background jobs.
New functions:
add_job(
proc REGPROC,
schedule_interval INTERVAL,
config JSONB DEFAULT NULL,
initial_start TIMESTAMPTZ DEFAULT NULL,
scheduled BOOL DEFAULT true
)
Add a job that runs proc every schedule_interval. Proc can
be either a function or a procedure implemented in any language.
delete_job(job_id INTEGER)
Deletes the job.
run_job(job_id INTEGER)
Execute a job in the current session.
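As a sketch of how this can be used (the procedure, its config key,
and the job id 1000 are made-up examples; the `(job_id, config)`
parameter convention and calling `run_job` with CALL are assumptions):

    CREATE PROCEDURE custom_maintenance(job_id INT, config JSONB)
    LANGUAGE plpgsql AS $$
    BEGIN
        RAISE NOTICE 'job % running with config %', job_id, config;
    END
    $$;

    -- Schedule the procedure to run every hour.
    SELECT add_job('custom_maintenance', INTERVAL '1 hour',
                   config => '{"drop_after": "1 month"}');

    -- Run the job immediately in this session (assuming it got id 1000),
    -- or remove it again.
    CALL run_job(1000);
    SELECT delete_job(1000);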
This change moves the invalidation threshold in the setup phase of the
concurrent refresh test for continuous aggregates in order to generate
invalidations. Without any invalidations, the invalidation logs are
never really processed and thus not subjected to concurrency.
This change makes sure we always use the bucketed refresh window
internally for processing invalidations, moving thresholds and doing
the actual materialization.
Invalidation processing during refreshing of a continuous aggregate is
now better protected against concurrent refreshes by taking the lock
on the materialized hypertable before invalidation processing.
Since invalidation processing is now split across two transactions,
the first one processing the hypertable invalidation log and the
second one processing the continuous aggregate invalidation log, they
are now separately protected by serializing around the invalidation
threshold lock and the materialized hypertable lock, respectively.
Due to a typo in the code, the relid of the relation instead of the
range-table index was used when building the pathkey for the
DecompressChunk node. Since relids are unsigned but bitmapsets
use signed int32, this led to 'negative bitmapset member not allowed'
being thrown as an error when the relid was greater than INT32_MAX.
This patch also adds an assert to prevent this from happening again.
This patch changes the signature from cagg_watermark(oid) to
cagg_watermark(int). Since this is an API-breaking change, it couldn't
be done in an earlier release.
If a retention policy is set up on a distributed hypertable, it will
not propagate the drop chunks call to the data nodes, since the drop
chunks call is made through an internal call.
This commit fixes this by creating a drop chunks call internally and
executing it as a function, which then propagates to the data nodes.
Fixes timescale/timescaledb-private#833
Fixes #2040
When clearing invalidations during a refresh of a continuous
aggregate, one should use the bucketed refresh window rather than the
"raw" user-defined window. This ensures that the window is capped at
the allowable timestamp range (e.g., when using an infinite window)
and that the window used to clear invalidations actually matches what
gets materialized.
Note, however, that since the end of the refresh window is not
inclusive, the last possible time value is not included in the
refresh. To include that value, there needs to exist a
time-type-agnostic definition of "infinity", and both the invalidation
code and the materialization code must be able to handle such windows.
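For instance (a sketch; passing NULL for a bound is assumed to mean an
open-ended, i.e., "infinite", window):

    -- An infinite refresh window is capped to the valid time range of
    -- the bucketed refresh window before invalidations are cleared.
    CALL refresh_continuous_aggregate('conditions_summary', NULL, NULL);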
The internal conversion functions for timestamps didn't account for
timestamps that are infinite (`-Infinity` or `+Infinity`), and they
would therefore generate an error if such timestamps were
encountered. This change adds extra checks to the conversion functions
to allow infinite timestamps.