The retention and compression policies can now use the `drop_created_before`
and `compress_created_before` arguments, respectively, to select chunks
by their creation time.
We don't support creation times for CAggs yet.
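A minimal sketch of how the new arguments might be used (the hypertable name is illustrative, and the exact call syntax may vary slightly between versions):

-- Drop chunks that were created more than three months ago
SELECT add_retention_policy('metrics', drop_created_before => INTERVAL '3 months');

-- Compress chunks that were created more than one week ago
SELECT add_compression_policy('metrics', compress_created_before => INTERVAL '1 week');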
In certain scenarios, when generating monthly buckets in combination
with non-default timezones, gapfill would create timestamps that don't
align with `time_bucket`, potentially generating multiple rows for an
individual month. Instead of relying on the previous timestamp to
generate the next one, we now always generate timestamps from the
start point, which keeps them aligned with the `time_bucket` buckets.
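For example, a monthly gapfill query along these lines (table and column names are illustrative) now produces bucket boundaries that match `time_bucket`:

SELECT time_bucket_gapfill('1 month', time, 'Europe/Berlin') AS bucket,
       avg(value)
FROM metrics
WHERE time >= '2023-01-01' AND time < '2024-01-01'
GROUP BY bucket
ORDER BY bucket;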
EXPLAIN ANALYZE for compressed DML would error out with a `bogus varno`
error because we modified the original expressions of the plan that
were still referenced in other nodes, instead of adjusting copies and
using those copies in our internal scans.
This patch stores the current catalog version in an internal
table so that we can verify that the catalog and code versions match.
When doing dump/restore, people occasionally report very unusual
errors, and during investigation it is discovered that they loaded
a dump from an older version and ran it with a later code version.
This allows us to detect mismatches between the installed code version
and the loaded dump version. The version number in the metadata table
is kept up to date by the upgrade and downgrade scripts.
This patch adds tracking of the number of batches and tuples that had
to be decompressed as part of DML operations on compressed hypertables.
These are visible in EXPLAIN ANALYZE output like so:
QUERY PLAN
 Custom Scan (HypertableModify) (actual rows=0 loops=1)
   Batches decompressed: 2
   Tuples decompressed: 25
   ->  Insert on decompress_tracking (actual rows=0 loops=1)
         ->  Custom Scan (ChunkDispatch) (actual rows=2 loops=1)
               ->  Values Scan on "*VALUES*" (actual rows=2 loops=1)
(6 rows)
The trigger `continuous_agg_invalidation_trigger` receives the hypertable
id as a parameter, as in the following example:
Triggers:
    ts_cagg_invalidation_trigger AFTER INSERT OR DELETE OR UPDATE ON
    _timescaledb_internal._hyper_3_59_chunk
    FOR EACH ROW EXECUTE FUNCTION
    _timescaledb_functions.continuous_agg_invalidation_trigger('3')
The problem is that in the compatibility layer, which uses PL/pgSQL code,
there's no way to pass down the parameter from the generated wrapper
trigger function to the underlying trigger function in another schema.
To solve this, we simply create a new function in the deprecated
`_timescaledb_internal` schema pointing to the actual trigger function,
and inside the C code we emit a WARNING message if the function is called
from the deprecated schema.
The multinode tests in regresscheck-shared were already disabled
by default, and removing them allows us to skip setting up the
multinode environment in regresscheck-shared. This database
is also used for sqlsmith, which will make sqlsmith runs more
targeted. Additionally, this is a step towards running
regresscheck-shared unmodified against our cloud.
Historically, creating a Continuous Aggregate made it realtime by default,
but this confuses users, especially when using the `WITH NO DATA` option.
It is also well known that realtime Continuous Aggregates can potentially
lead to issues with Hierarchical Continuous Aggregates and Data Tiering.
Improve the UX by making Continuous Aggregates non-realtime by default.
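Users who want the previous behavior can still opt back into realtime aggregation; a minimal sketch (the view name is illustrative):

ALTER MATERIALIZED VIEW conditions_daily SET (timescaledb.materialized_only = false);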
Fall back to the btree operator input type when it is binary compatible
with the column type and no operator for the column type could be found.
This should improve performance when using column types like char or
varchar instead of text.
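An illustrative setup this targets (table and column names are assumptions, and the actual benefit depends on the query and compression settings):

CREATE TABLE metrics (time timestamptz NOT NULL, device varchar(32), value float);
SELECT create_hypertable('metrics', 'time');
ALTER TABLE metrics SET (timescaledb.compress);
-- Queries filtering on the varchar column are the kind that can benefit
SELECT count(*) FROM metrics WHERE device = 'd1';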
This patch adds support for partial aggregations at the chunk level.
The aggregation is replanned in the create_upper_paths_hook of
PostgreSQL. The AggPath is split up into multiple
AGGSPLIT_INITIAL_SERIAL operations (one on top of each chunk), which
create partials, and one AGGSPLIT_FINAL_DESERIAL operation, which
finalizes the aggregation.
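An illustrative plan shape under this change (chunk names and the exact node mix are assumptions and will vary):

EXPLAIN (COSTS OFF) SELECT count(*) FROM metrics;
-- Finalize Aggregate
--   ->  Append
--         ->  Partial Aggregate
--               ->  Seq Scan on _hyper_1_1_chunk
--         ->  Partial Aggregate
--               ->  Seq Scan on _hyper_1_2_chunk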
With TimescaleDB 2.12, all the functions present in _timescaledb_internal
were moved into the _timescaledb_functions schema to improve schema
security. This patch adds a compatibility layer so external callers
of these internal functions will not break, allowing for more
flexibility when migrating.
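For example, a caller that still uses the old schema keeps working through the compatibility layer, while new code should call the new schema (shown here with `to_timestamp(bigint)`, one of the moved functions):

-- Deprecated location, kept working by the compatibility layer
SELECT _timescaledb_internal.to_timestamp(0);
-- New location
SELECT _timescaledb_functions.to_timestamp(0);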
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- get_partition_for_key(val anyelement)
- get_partition_hash(val anyelement)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- chunk_constraint_add_table_constraint(_timescaledb_catalog.chunk_constraint)
- chunk_drop_replica(regclass,name)
- chunk_index_clone(oid)
- chunk_index_replace(oid,oid)
- create_chunk_replica_table(regclass,name)
- drop_stale_chunks(name,integer[])
- health()
- hypertable_constraint_add_table_fk_constraint(name,name,name,integer)
- process_ddl_event()
- wait_subscription_sync(name,name,integer,numeric)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- cagg_watermark(integer)
- cagg_watermark_materialized(integer)
- hypertable_invalidation_log_delete(integer)
- invalidation_cagg_log_add_entry(integer,bigint,bigint)
- invalidation_hyper_log_add_entry(integer,bigint,bigint)
- invalidation_process_cagg_log(integer,integer,regtype,bigint,bigint,integer[],bigint[],bigint[])
- invalidation_process_cagg_log(integer,integer,regtype,bigint,bigint,integer[],bigint[],bigint[],text[])
- invalidation_process_hypertable_log(integer,integer,regtype,integer[],bigint[],bigint[])
- invalidation_process_hypertable_log(integer,integer,regtype,integer[],bigint[],bigint[],text[])
- materialization_invalidation_log_delete(integer)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- generate_uuid()
- get_git_commit()
- get_os_info()
- tsl_loaded()
The equality comparison function is called using
`DirectFunctionCall2Coll`, which does not set `fcinfo->flinfo` when
calling the PostgreSQL function. Since `array_eq` uses
`fcinfo->flinfo->fn_extra` for caching, and `flinfo` is NULL, this
causes a crash.
Fix this issue by using `FunctionCall2Coll` instead, which sets
`fcinfo->flinfo` before calling the PostgreSQL function.
Fixes #5981
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- set_dist_id(uuid)
- set_peer_dist_id(uuid)
- validate_as_data_node()
- show_connection_cache()
- ping_data_node(name, interval)
- remote_txn_heal_data_node(oid)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- to_unix_microseconds(timestamptz)
- to_timestamp(bigint)
- to_timestamp_without_timezone(bigint)
- to_date(bigint)
- to_interval(bigint)
- interval_to_usec(interval)
- time_to_internal(anyelement)
- subtract_integer_from_now(regclass, bigint)
This moves the definitions of `debug_waitpoint_enable`,
`debug_waitpoint_disable`, and `debug_waitpoint_id` so that they are
always defined for debug builds, and modifies existing tests accordingly.
This means that it is no longer necessary to generate isolation test
files from templates (in most cases), and it is straightforward to
use these functions in debug builds.
The debug utilities can be disabled by setting the option
`ENABLE_DEBUG_UTILS` to `OFF`.
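A sketch of how the waitpoint functions are typically used in a debug build (the waitpoint name is hypothetical):

-- Session 1: enable the waitpoint so a concurrent operation blocks there
SELECT debug_waitpoint_enable('my_waitpoint');
-- ... run the operation under test in another session; it waits at the point ...
-- Session 1: let the blocked operation continue
SELECT debug_waitpoint_disable('my_waitpoint');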
If there are any indexes on the compressed chunk, insert into them while
inserting the heap data rather than reindexing the relation at the
end. This reduces the amount of locking on the compressed chunk
indexes, which created issues when merging chunks, and should help
with future updates of compressed data.
So far, we have set the number of desired workers for decompression to
1. If a query touches only one chunk, we end up with one worker in a
parallel plan. Only if the query touches multiple chunks does PostgreSQL
spin up multiple workers. These workers could then be used to process
the data of one chunk.
This patch removes our custom worker calculation and relies on the
PostgreSQL logic to calculate the desired parallelism.
Co-authored-by: Jan Kristof Nidzwetzki <jan@timescale.com>
This patch does the following:
1. Planner changes to create a ChunkDispatch node when the MERGE command
   has an INSERT action (see the sketch after this list).
2. Changes to map partition attributes from a tuple returned from the
   child node of ChunkDispatch against the physical targetlist, so that
   the ChunkDispatch node can read the correct value from the partition column.
3. Fixed issues with MERGE on compressed hypertables.
4. Added more testcases.
5. MERGE on distributed hypertables is not supported.
6. Since there is no Custom Scan (HypertableModify) node for MERGE
   with UPDATE/DELETE on compressed hypertables, we don't support this.
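A hedged sketch of a MERGE with an INSERT action on a hypertable (table and column names are assumptions):

MERGE INTO metrics AS m
USING staging AS s
  ON m.time = s.time AND m.device_id = s.device_id
WHEN NOT MATCHED THEN
  INSERT (time, device_id, value) VALUES (s.time, s.device_id, s.value);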
Fixes #5139
This patch adds an optimization to the DecompressChunk node. If the
query ORDER BY and the compression ORDER BY are compatible (the query
ORDER BY is equal to, or a prefix of, the compression ORDER BY), the
compressed batches of the segments are decompressed in parallel and
merged using a binary heap. This preserves the ordering, so sorting
the result can be avoided. LIMIT queries in particular benefit from
this optimization because only the first tuples of some batches have to
be decompressed. Previously, all segments were completely decompressed
and sorted.
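An illustrative query that benefits, assuming the hypertable is compressed with an ORDER BY of `time DESC` (table and column names are assumptions):

SELECT * FROM metrics ORDER BY time DESC LIMIT 10;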
Fixes: #4223
Co-authored-by: Sotiris Stamokostas <sotiris@timescale.com>
All children of an append path are required to have the same parameterization,
so we have to reparameterize when the selected path does not have the right
parameterization.
The function to execute remote commands on data nodes used a blocking
libpq API that doesn't integrate with PostgreSQL interrupt handling,
making it impossible for a user or statement timeout to cancel a
remote command.
Refactor the remote command execution function to use a non-blocking
API and integrate with PostgreSQL signal handling via WaitEventSets.
Partial fix for #4958.
SELECT from a partially compressed chunk crashes due to a reference to a NULL
pointer. When generating paths for DecompressChunk, uncompressed_partial_path
is NULL and not checked, thus causing a crash. This patch checks for NULL
before calling create_append_path().
Fixes #5134
On caggs with realtime aggregation, changing the column name does
not update all the column aliases inside the view metadata.
This patch changes the code that creates the compression
configuration for caggs to get the column name from the materialization
hypertable instead of the view internals.
Fixes #5100
The cursor_fetcher_rewind method assumes that the data node cursor is
rewound either after EOF or when there is an associated request. But the
rewind can also occur once the server has generated the required number
of rows by joining the relation being scanned with another regular
relation. In this case, the fetch would not have reached EOF and there
will be no associated request, because the rows would have already been
loaded into the cursor, causing the assertion in cursor_fetcher_rewind
to fail. Fix that by removing the Assert and updating
cursor_fetcher_rewind to discard the response only if there is an
associated request.
Fixes #5053
Ensure the COPY fetcher implementation reads data until EOF with
`PQgetCopyData()`. Also ensure the malloc'ed copy data is freed with
`PQfreemem()` if an error is thrown in the processing loop.
Previously, the COPY fetcher didn't read until EOF, and instead
assumed EOF when the COPY file trailer was received. Since EOF wasn't
reached, it required terminating the COPY with an extra call to the
(deprecated) `PQendcopy()` function.
Still, there are cases when a COPY needs to be prematurely terminated,
for example, when querying with a LIMIT clause. Therefore, distinguish
between "normal" end (when receiving EOF) and forceful end (cancel the
ongoing query).
An INSERT .. SELECT query containing distributed hypertables generates a plan
with a DataNodeCopy node, which is not supported. The issue is in the function
tsl_create_distributed_insert_path(), where we decide whether to
generate a DataNodeCopy or a DataNodeDispatch node based on the kind of
query. In PG15, for an INSERT .. SELECT query the timescaledb planner
generates DataNodeCopy because rte->subquery is set to NULL. This is due
to a commit in PG15 where rte->subquery is set to NULL as part of a fix.
This patch checks whether the SELECT subquery contains distributed
hypertables by looking into root->parse->jointree, which represents the
subquery.
Fixes #4983
We don't want to support BitmapScans below DecompressChunk,
as this adds additional complexity to support and there
is little benefit in doing so.
This also fixes a bug that could happen when we have a parameterized
BitmapScan that is parameterized on a compressed column, which
would lead to an execution failure with an error about
incorrect attribute types in the expression.
On PG15, CustomScan is by default not projection capable, and PostgreSQL
thus wraps this node in a Result node. This change in PG15 causes test
result files which contain EXPLAIN output to fail. This patch fixes the
plan outputs.
Fixes #4833