This test file was created to cover repairing hypertables that had
broken related metadata in the dimension_slice catalog tables.
The test probably no longer makes sense today, given that we have more
robust referential integrity in our catalog tables, so we are removing it now.
So far, we allowed only CAggs without origin or offset parameters in the
time_bucket definition. This commit adds support for the remaining
time_bucket variants.
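For illustration, a minimal sketch of a definition that is now accepted, assuming a hypothetical hypertable `metrics(time timestamptz, value float8)`:
```SQL
-- time_bucket with an explicit origin (third argument) in a CAgg definition
CREATE MATERIALIZED VIEW metrics_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', "time", '2000-01-03 00:00:00+00'::timestamptz) AS bucket,
       avg(value) AS avg_value
FROM metrics
GROUP BY bucket;
```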
Fixes #2265, Fixes #5453, Fixes #5828
In #4678 we added an interface for troubleshooting job failures by
logging them in the metadata table `_timescaledb_internal.job_errors`.
With this PR we extended the existing interface to also store successful
executions. A new GUC named `timescaledb.enable_job_execution_logging`
was added to control this new behavior, and its default value is `false`.
We renamed the metadata table to `_timescaledb_internal.bgw_job_stat_history`
and added a new view, `timescaledb_information.job_history`, so that users
with sufficient permissions can check the job execution history.
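A minimal usage sketch; depending on the GUC's context, it may need to be set via `ALTER SYSTEM` plus a reload rather than per session:
```SQL
-- Enable logging of successful job executions (defaults to off)
ALTER SYSTEM SET timescaledb.enable_job_execution_logging = 'on';
SELECT pg_reload_conf();

-- Inspect the execution history (requires sufficient permissions)
SELECT * FROM timescaledb_information.job_history;
```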
The chunk append path creation logic was crashing when a space dimension
was involved, while checking for matches in the flattened-out child
chunk lists. This has been fixed.
For UPDATEs and DELETEs when a compressed chunk is involved, the code
decompresses the relevant data into the uncompressed portion of the
chunk. This happens during execution, so if the planner does not produce
a plan for the uncompressed chunk, we might miss scanning those
decompressed rows. We now account for the possibility of a compressed
chunk becoming partial during planning itself and add an Append plan
on top of the scans on the compressed and uncompressed parts.
Several scheduler regression tests create the plpgsql function
`wait_for_job_to_run` with the same purpose of waiting for a given job
to execute or fail, so refactor the regression tests by adding it to the
testsupport.sql library.
Recent changes switched recompression of unordered compressed
chunks to use segmentwise recompression which is designed to work
with partial chunks only. This change reverts that back to full
recompression.
Because the chunk status is a bitmap, updating any of its flags
requires locking the tuple first to make sure nobody can change
the status from under us and cause lost updates. This change unifies
all the places where we change the chunk status and makes the locking
more explicit so we don't run into concurrency issues due to concurrent updates.
We were not uploading the correct Postgres log to the CI artifacts, so
we fixed it by setting TEST_PG_LOG_DIRECTORY when building the
extension and using it properly when uploading the postmaster.log
artifact.
So far, the invalidation_state_init function used a loop to find the
proper CAgg bucket function. This PR refactors the parameter of the
function and passes the needed information directly.
Previously, we created functions to calculate default order by and
segment by values. This PR uses those functions by default when
compression is enabled. We also added GUCs to disable those
functions or to use alternative functions for the default calculation.
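As a sketch, with these defaults enabling compression no longer requires spelling out the settings (table and column names are hypothetical):
```SQL
-- segmentby / orderby are now derived automatically when not specified
ALTER TABLE metrics SET (timescaledb.compress);

-- Explicit settings still override the calculated defaults
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby = '"time" DESC'
);
```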
Compression size stats are formed during initial compression.
At that time, we know what the uncompressed stats were before
the operation and the compressed stats after. But during incremental
recompression, we do not decompress the whole chunk, so we cannot
update those statistics. So far, we only updated the compressed
stats and the tuple count, which ends up making the compression
ratio incorrect since not all the stats are updated. Removing
any stats updates during incremental recompression will at least keep
the initial stats consistent, which is better than partially updated
stats.
It is not possible to get anything sensible out of core dumps that do
not contain debug info, so make sure that release builds are built with
debug information by using `RelWithDebInfo`.
The cagg_invalidation test stops the background workers and performs
tests with invalidations. However, the test was running in a
parallel group, and background workers could be activated by other
tests, leading to flaky behavior. This PR moves the
cagg_invalidation test to the solo tests.
This changes the behavior of the CAgg catalog tables. From now on, all
CAggs that use a time_bucket function create an entry in the catalog
table continuous_aggs_bucket_function. In addition, the duplicate
bucket_width attribute is removed from the catalog table continuous_agg.
PR #6705 introduces the function is_sparse_index_type. However, it is
only used for assert checking. If asserts are not checked, the function is
unused, and some compilers error out. This PR includes the function only
if asserts are checked.
When creating a Continuous Aggregate we can only reference the primary
hypertable dimension column in the `time_bucket` function, so we reworded
the error message to make this clearer.
Currently, when a CAgg is refreshed using a background worker, only the
refresh invocation is logged. However, the details of the refresh, such
as the number of individual ranges that are refreshed, are not logged.
This PR changes the log level for these details in background workers to
INFO, to ensure the information is captured.
PR #2926 introduced a session-based configuration parameter for the CAgg
refresh behavior. If more individual refreshes have to be carried out
than specified by this setting, a refresh for a larger window is
performed.
It is mentioned in the original PR that this setting should be converted
into a GUC later. This PR performs the proposed change.
The decision to add a minmax sparse index is made every time the
compressed chunk is created (full decompression followed by compression),
based on the indexes currently present on the hypertable. No new chunk
compression settings are added.
No action is required on upgrade, but the feature is not enabled on
existing chunks. The minmax index will be added when the chunk is fully
decompressed and compressed.
No action is required on downgrade; we ignore the unknown metadata
columns. They will be removed when the chunk is fully decompressed and
compressed.
The potential drawback of this feature is increasing the storage
requirements for the compressed chunk table, but it is normally only a
few percent of the total compressed data size. It can be disabled with
the GUC `timescaledb.auto_sparse_indexes`.
Here's a small example of this feature in action:
https://gist.github.com/akuzm/84d4b3b609e3581768173bd21001dfbf
Note that the number of hit buffers is reduced almost 4x.
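A minimal sketch of turning the feature off (assuming the GUC can be set per session; otherwise use `ALTER SYSTEM`):
```SQL
-- Skip automatic creation of minmax sparse indexes on newly compressed chunks
SET timescaledb.auto_sparse_indexes = off;
```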
Removing constraints is always safe, so there is no reason to block it on
compressed hypertables. Adding constraints is still blocked on
compressed hypertables, as verifying constraints currently requires a
decompressed hypertable.
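For illustration, a sketch with a hypothetical hypertable `metrics` that has compressed chunks:
```SQL
-- Dropping a constraint is now allowed even with compressed chunks
ALTER TABLE metrics DROP CONSTRAINT metrics_value_check;

-- Adding a constraint is still blocked while chunks are compressed,
-- since verifying it would require a decompressed hypertable:
-- ALTER TABLE metrics ADD CONSTRAINT metrics_value_positive CHECK (value > 0);
```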
In old TSDB versions, we disabled autovacuum for compressed chunks to
keep the statistics. This restriction was removed in #5118, but
no migration was performed to reset the custom autovacuum setting for
existing chunks.
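The migration effectively resets the storage parameter that older versions had set on compressed chunks; a sketch, with a hypothetical chunk name:
```SQL
-- Old behavior: autovacuum was disabled on the compressed chunk, e.g.
-- ALTER TABLE _timescaledb_internal._hyper_1_1_chunk SET (autovacuum_enabled = false);

-- The migration resets the parameter so autovacuum runs again
ALTER TABLE _timescaledb_internal._hyper_1_1_chunk RESET (autovacuum_enabled);
```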
If "created_after/before" is used with a "time" type partitioning
column then show_chunks was not showing appropriate list due to a
mismatch in the comparison of the "creation_time" metadata (which is
stored as a timestamptz) with the internally converted epoch based
input argument value. This is now fixed by not doing the unnecessary
conversion into the internal format for cases where it's not needed.
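A usage sketch, assuming a hypothetical hypertable `metrics` partitioned on a timestamptz `time` column:
```SQL
-- Both filters now work correctly for "time"-type partitioning columns
SELECT show_chunks('metrics', created_before => now());
SELECT show_chunks('metrics', created_after => now() - INTERVAL '7 days');
```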
Fixes #6611
If the memory context is switched temporarily and never switched back,
it will cause strange errors.
This adds a Coccinelle rule that checks for the case where the memory
context is saved in a temporary variable but that variable is not used
to switch back to the original memory context.
Co-authored-by: Fabrízio de Royes Mello <fabrizio@timescale.com>
When updating catalog tables we rely on low-level functions instead of
SQL statements, and in order to read/write data from/to those tables we
frequently do something like:
```CPP
Datum values[natts] = { 0 };
bool nulls[natts] = { false };
char *char_value = "foo";

if (char_value != NULL)
	values[AttrNumberGetAttrOffset(text_value_offset)] =
		PointerGetDatum(cstring_to_text(char_value));
else
	nulls[AttrNumberGetAttrOffset(text_value_offset)] = true;
```
So instead of using a pair of Datum and bool arrays, we replace them with
an array of `NullableDatum`, which contains both members, and introduce
some accessor functions to encapsulate the logic of filling in the proper
values, like:
```CPP
ts_datum_set_text(int index, NullableDatum *datums, text *value);
ts_datum_set_bool(int index, NullableDatum *datums, bool value);
```
We also introduce a new `ts_heap_form_tuple` that essentially does the
same as Postgres's `heap_form_tuple` but uses an array of `NullableDatum`
instead of separate Datum and bool arrays.
In this first commit we added only the accessor functions necessary to
refactor the existing `create_cagg_validate_query_datum` as an example.
More accessor functions to deal with other C types should be introduced
in the future.
This PR renames some things in preparation for the cost changes from
https://github.com/timescale/timescaledb/pull/6550
It also simplifies path creation for partial chunks, improving some plans
(index scans are now used instead of sorts).
Encapsulate the logic for getting the MINIMUM value of a time dimension
for a Continuous Aggregate in a single place instead of spreading the
logic everywhere, which can lead to wrong usage, especially because there
is a difference between fixed- and variable-sized buckets.
It's convenient to squash the cleanup commits using the GitHub
interface, but currently the PR validation prevents the squash merge
from succeeding. Fix this.
The CAgg migration path contained two bugs. This PR fixes both. A typo
in the column type prevented 'timestamp with time zone' buckets from
being handled properly. In addition, a custom setting of the datestyle
could create errors during the parsing of the generated timestamp
values.
Fixes: #5359
The CAgg refresh job did not handle the NULL value of start_offset for a
time_bucket function with a variable width properly. This problem has
led to the creation of invalid invalidation records and 'timestamp out
of range' errors during the next refresh.
Fixes: #5474
When querying a realtime Continuous Aggregate using window functions, the
new planner optimization to constify the `cagg_watermark` function call
was not working because we were checking for window function usage in the
query tree. However, window functions are only a limitation when creating
a new Continuous Aggregate; users can still use them when querying it.
Fixed it by removing the check for `query->hasWindowFuncs` that prevented
the constification of the `cagg_watermark` function call.
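For illustration, a query like the following (view and column names hypothetical) now benefits from the `cagg_watermark` constification as well:
```SQL
-- Window functions are allowed when querying a realtime CAgg
SELECT bucket,
       avg_value,
       lag(avg_value) OVER (ORDER BY bucket) AS prev_avg
FROM metrics_daily
ORDER BY bucket;
```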
Fixes #6722
The only relevant update test versions are v7 and v8; all previous
versions are no longer used in any supported update path, so we can
safely remove those files.
Currently the update test is quite inconvenient to run locally and also
inconvenient to debug as the different update tests all run in their own
docker container. This patch refactors the update test to no longer
require docker and makes it easier to debug, as it will run in the
local environment as determined by pg_config.
This patch also consolidates the update/downgrade and repair tests, since
they do very similar things, adds support for coredump stacktraces to
the GitHub action, and removes some dead code from the update tests.
Additionally the versions to be used in the update test are now
determined from existing git tags so the post release patch no longer
needs to add newly released versions.