The logic for chunk-append path creation crashed when a space dimension
was involved, while checking for matches in the flattened-out children
chunk lists. This has now been fixed.
If "created_after/before" is used with a "time" type partitioning
column then show_chunks was not showing appropriate list due to a
mismatch in the comparison of the "creation_time" metadata (which is
stored as a timestamptz) with the internally converted epoch based
input argument value. This is now fixed by not doing the unnecessary
conversion into the internal format for cases where it's not needed.
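A hedged illustration (hypertable name is hypothetical) of filtering by
chunk creation time on a hypertable with a timestamptz "time" column:

    -- The boundary is compared directly against the timestamptz
    -- "creation_time" metadata, without the epoch conversion.
    SELECT show_chunks('conditions',
                       created_before => now() - interval '1 week');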
Fixes #6611
- Updated the show_chunks and drop_chunks APIs to get the affected
  chunks using chunk creation time metadata, based on the
  "date/time/interval"-like boundary specified for INTEGER
  columns (see the example below).
- We honor the "integer_now" function, if specified, to keep
  backwards compatibility with the existing behavior.
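A hedged sketch (table name is hypothetical) of the new behavior on an
INTEGER "time" column, using a timestamp boundary resolved against
chunk creation time:

    -- "sensor_data" has an integer time column; the boundary below is
    -- matched against each chunk's creation-time metadata.
    SELECT show_chunks('sensor_data',
                       created_before => now() - interval '1 month');
    SELECT drop_chunks('sensor_data',
                       created_before => now() - interval '1 month');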
Co-authored-by: Dipesh Pandit <dipesh@timescale.com>
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- get_partition_for_key(val anyelement)
- get_partition_hash(val anyelement)
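For example (the dedicated schema name below is an assumption; the
commit does not name it), callers now reference the functions via the
dedicated schema rather than _timescaledb_internal:

    SELECT _timescaledb_functions.get_partition_hash(now());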
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- calculate_chunk_interval(int, bigint, bigint)
- chunk_status(regclass)
- chunks_in(record, integer[])
- chunk_id_from_relid(oid)
- show_chunk(regclass)
- create_chunk(regclass, jsonb, name, name, regclass)
- set_chunk_default_data_node(regclass, name)
- get_chunk_relstats(regclass)
- get_chunk_colstats(regclass)
- create_chunk_table(regclass, jsonb, name, name)
- freeze_chunk(regclass)
- unfreeze_chunk(regclass)
- drop_chunk(regclass)
- attach_osm_table_chunk(regclass, regclass)
Make truncating an uncompressed chunk also drop the corresponding data
for the case where it resides in a compressed chunk.
Generate invalidations for Continuous Aggregates after TRUNCATE, so
as to have consistent refresh operations on the materialization
hypertable.
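An illustrative sequence (the chunk name is hypothetical):

    -- Truncating the uncompressed chunk now also removes its data in
    -- the corresponding compressed chunk and generates invalidations
    -- for dependent continuous aggregates.
    TRUNCATE _timescaledb_internal._hyper_1_1_chunk;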
Fixes #4362
Add an internal API to drop a single chunk.
This function drops the storage and metadata
associated with the chunk.
Note that chunk dependencies are not affected;
e.g., continuous aggregates are not updated when
this chunk is dropped.
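A hedged usage sketch (schema and chunk name are illustrative):

    -- Drops storage and metadata for one chunk; continuous aggregates
    -- that depend on it are not updated.
    SELECT _timescaledb_internal.drop_chunk(
        '_timescaledb_internal._hyper_1_2_chunk');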
In this specific case, when we create a hypertable, we add a
"not-null" constraint to the "time" column if it does not already
exist. That is done via an internal ALTER TABLE subcommand in the
dimension_add_not_null_on_column function. If the
currentEventTriggerState structure is enabled, then it's necessary to
set up the command tracking appropriately; otherwise a crash ensues.
This has now been fixed.
Includes test changes.
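A minimal repro sketch (the event trigger body is illustrative):

    CREATE FUNCTION log_ddl() RETURNS event_trigger
        LANGUAGE plpgsql AS $$ BEGIN RAISE NOTICE 'ddl: %', tg_tag; END $$;
    CREATE EVENT TRIGGER log_ddl_trigger ON ddl_command_end
        EXECUTE FUNCTION log_ddl();

    -- "time" lacks NOT NULL, so create_hypertable adds it via an
    -- internal ALTER TABLE subcommand; with the event trigger active,
    -- this previously crashed.
    CREATE TABLE metrics(time timestamptz, value float);
    SELECT create_hypertable('metrics', 'time');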
Renaming the parameter `hypertable_or_cagg` in the functions
`drop_chunks` and `show_chunks` to `relation`, and changing the
parameter name `main_table` to `hypertable` or `relation` depending on
context.
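Named-argument calls use the new name, e.g. (hypertable name
hypothetical):

    SELECT show_chunks(relation => 'conditions');
    SELECT drop_chunks(relation => 'conditions',
                       older_than => interval '3 months');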
This change removes, simplifies, and unifies code related to
`drop_chunks` and `show_chunks`. As a result of prior changes to
`drop_chunks`, e.g., making table relid mandatory and removing
cascading options, there's an opportunity to clean up and simplify the
rather complex code for dropping and showing chunks.
In particular, `show_chunks` is now consistent with `drop_chunks`; the
relid argument is mandatory, a continuous aggregate can be used in
place of a hypertable, and the input time ranges are checked and
handled in the same way.
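For example (the continuous aggregate name is hypothetical), both
functions now accept a continuous aggregate:

    SELECT show_chunks('conditions_summary_daily',
                       older_than => interval '3 months');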
Unused code is also removed; for instance, code that cascaded drop
chunks to continuous aggregates remained in the code base even though
the option no longer exists.
The `drop_chunks` function is refactored to make table name mandatory
for the function. As a result, the function was also refactored to
accept the `regclass` type instead of table name plus schema name and
the parameters were reordered to match the order for `show_chunks`.
The commit also refactors the code to pass the hypertable structure
between internal functions rather than the hypertable relid, and moves
error checks to the PostgreSQL function. This allows the internal
functions to avoid some lookups and use the information in the
structure directly, and also gives errors earlier instead of first
dropping chunks and then erroring out and rolling back the transaction.
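The new-style call (names illustrative) takes the relation first, as a
regclass, matching show_chunks:

    SELECT drop_chunks('conditions', older_than => interval '3 months');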
This commit removes the `cascade` option from the functions
`drop_chunks` and `add_drop_chunks_policy`, which will now never cascade
drops to dependent objects. The tests are fixed accordingly and
verbosity turned up to ensure that the dependent objects are printed in
the error details.
Running `drop_chunks` on a distributed hypertable should remove chunks
from all its data nodes. To make it work, we send the same SQL command
to all involved data nodes.
When calling show_chunks or drop_chunks without specifying
a particular hypertable, TimescaleDB iterates through all
existing hypertables and builds a list. While doing this,
it adds the internal '_compressed_hypertable_*' tables,
which leads to incorrect behaviour of the
ts_chunk_get_chunks_in_time_range function. This fix
filters out the internal compressed tables during the scan
in the ts_hypertable_get_all function.
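For example, a bare call (valid at the time, since no hypertable
argument was required) scans all hypertables and previously picked up
the internal compressed tables:

    SELECT show_chunks();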
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as follows:
1. construct and add `RelOptInfo`s for base rels and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn't have to be too precise
about where we did it.
However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:
1. construct and add `RelOptInfo`s for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with `make_one_rel`
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` does entail
a number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
under the assumption it has no inheritance children, so if it's a
   hypertable we have to replan its paths
Unfortunately, the functions for doing this are static, so we need to
copy them into our own codebase instead of just using PostgreSQL's.
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it were a `SELECT` and all
leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table
  discovered in stage 1 directly, rather than as part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point we wish
to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
Previously, drop_chunks returned an empty table, giving the user
no indication of what (if anything) had happened.
Now, drop_chunks returns a list of chunk identifiers in the
same style as show_chunks, with the chunk's schema and table name.
Notably, when show_chunks is called directly before drop_chunks, the
output should be the same.
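An illustrative call and output (hypertable and chunk names are
hypothetical):

    SELECT drop_chunks(interval '3 months', 'conditions');
                 drop_chunks
    ----------------------------------------
     _timescaledb_internal._hyper_1_1_chunk
     _timescaledb_internal._hyper_1_2_chunk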
This commit fixes and tests permissions in the following
API calls:
- reorder_chunk (test only)
- alter_job_schedule
- add_drop_chunks_policy
- remove_drop_chunks_policy
- add_reorder_policy
- remove_reorder_policy
- drop_chunks
In various places, most notably drop_chunks and show_chunks, we
dispatch based on the type of the "time" column of the hypertable, for
things such as determining which interval type to use. With a custom
partition function, this logic is incorrect, as we should instead be
determining this based on the return type of the partitioning function.
This commit changes all relevant access of dimension.column_type to a
new function, ts_dimension_get_partition_type, which has the correct
behavior: it returns the partitioning function's return type, if one
exists, and only otherwise uses the column type. After this commit, all
references to column_type directly should have a comment explaining why
this is appropriate.
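A hedged illustration (names, types, and the function body are
hypothetical): dispatch should follow the partitioning function's
return type (timestamptz below), not the column type (jsonb):

    CREATE FUNCTION event_time(val jsonb) RETURNS timestamptz
        AS $$ SELECT to_timestamp((val->>'ts')::float8) $$
        LANGUAGE sql IMMUTABLE;
    SELECT create_hypertable('events', 'payload',
                             time_partitioning_func => 'event_time');
    -- Interval-based boundaries now resolve against timestamptz:
    SELECT drop_chunks(interval '7 days', 'events');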
Fixes GitHub issue #1250
We replace chunk_for_tuple with chunk_id_from_relid for getting
chunk id fields when materializing continuous aggs. The old
function required passing in the entire row. This was very slow
because a lot of data was passed around at execution time.
The new function just uses the internal `tableoid` attribute to
convert the table relid to a chunk_id. This is much more efficient.
We also add memoization to the new function because it is most often
called consecutively for the same chunk.
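A hedged sketch of the conversion (table name is hypothetical):

    -- Map each row's tableoid to its chunk id; consecutive calls for
    -- the same chunk hit the memoized value.
    SELECT _timescaledb_internal.chunk_id_from_relid(tableoid), count(*)
    FROM conditions
    GROUP BY 1;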
The chunk_utils test sets `client_min_messages` to `FATAL` in order to
mute some error messages, which differ between PostgreSQL versions and
would otherwise cause test failures on some platforms. However,
according to the PostgreSQL documentation going back to at least 9.6,
this is not a valid log level for this configuration
parameter, although it has been allowed for legacy reasons. Moreover,
starting with PostgreSQL 11.2, `FATAL` is silently turned into `ERROR`,
which causes the test to output the error anyway and thus fail.
This change removes the muting altogether, because the error that is
output is actually a TimescaleDB error and not a PostgreSQL error. The
generated error output probably changed at some point and therefore
this muting is no longer necessary.
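The removed muting was of this form:

    -- No longer valid: PostgreSQL 11.2+ silently downgrades this
    -- setting to ERROR.
    SET client_min_messages TO FATAL;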
Remove the existing PLPGSQL function that implements drop_chunks,
replacing it with a direct call to the C function, which also
implements the old PLPGSQL checks in C. Refactor out much of the code
shared between the C implementations of show_chunks and drop_chunks.
Timescale provides an efficient and easy-to-use API to drop individual
chunks from a Timescale database through drop_chunks. This PR builds on
that functionality and, through a new show_chunks function, gives the
opportunity to see the chunks that would be dropped if drop_chunks were
run. Additionally, it adds a newer_than option to drop_chunks (also
supported by show_chunks) that allows seeing/dropping chunks in an
interval or newer than a point in time.
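For example (hypertable name and argument order are illustrative),
chunks in an interval can first be previewed and then dropped:

    SELECT show_chunks('conditions', older_than => interval '3 months',
                                     newer_than => interval '6 months');
    SELECT drop_chunks(interval '3 months', 'conditions',
                       newer_than => interval '6 months');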
This commit includes:
- Implementation of show_chunks in C
- Additional helper functions to work with chunks
- New version of drop_chunks in SQL that uses show_chunks. This
  also adds a newer_than option to drop_chunks
- More enhanced tests of drop_chunks and new tests for show_chunks
Among other reasons, show_chunks was implemented in C in order
to be able to have both the older_than and newer_than arguments be
NULL. This was not possible in SQL because the arguments had to have
polymorphic types, and whether they are used in the function body or
not, PL/pgSQL requires these arguments to typecheck.