OSM chunks manage their own ranges, and the timescaledb
catalog only has dummy ranges for these dimensions.
So the chunk exclusion logic cannot rely on the
timescaledb catalog metadata to exclude an OSM chunk.
When policies are added or updated, they are checked
for compatibility with each other and with those
already on the CAgg.
These checks are:
- refresh and compression policies should not overlap
- refresh and retention policies should not overlap
- compression and retention policies should not overlap
Co-authored-by: Markos Fountoulakis <markos@timescale.com>
-Add infinity for refresh window range
To create an open-ended refresh policy,
use +/- infinity for end_offset and start_offset,
respectively.
-Add remove_all_policies function
This removes all the policies on a given
CAgg.
-Remove parameter refresh_schedule_interval
-Fix downgrade scripts
-Fix IF EXISTS case
Co-authored-by: Markos Fountoulakis <markos@timescale.com>
This simplifies the process of adding policies
for CAggs. Now, with a single SQL statement,
all the policies can be added for a given CAgg.
Similarly, all the policies can be removed or modified
with a single SQL statement.
This also adds a new function as well as a view to show all
the policies on a continuous aggregate.
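As a sketch of the intended usage (the experimental schema and the exact
parameter names here are assumptions, not taken from this change):

```sql
-- Add refresh, compression and retention policies for a CAgg in one call.
-- Schema and parameter names are assumed for illustration.
SELECT timescaledb_experimental.add_policies(
    'conditions_summary_daily',
    refresh_start_offset => INTERVAL '30 days',
    refresh_end_offset   => INTERVAL '1 day',
    compress_after       => INTERVAL '45 days',
    drop_after           => INTERVAL '1 year'
);

-- Show the policies on the CAgg, then remove them all with one call.
SELECT timescaledb_experimental.show_policies('conditions_summary_daily');
SELECT timescaledb_experimental.remove_all_policies('conditions_summary_daily');
```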
Add a new metadata table `dimension_partition` which explicitly and
statefully details how a space dimension is split into partitions, and
(in the case of multi-node) which data nodes are responsible for
storing chunks in each partition. Previously, partition and data nodes
were assigned dynamically based on the current state when creating a
chunk.
This is the first in a series of changes that will add more advanced
functionality over time. For now, the metadata table simply writes out
what was previously computed dynamically in code. Future code changes
will alter the behavior to do smarter updates to the partitions when,
e.g., adding and removing data nodes.
The idea of the `dimension_partition` table is to minimize changes in
the partition-to-data-node mappings across events that affect them,
such as changes in the number of data nodes, the number of partitions,
or the replication factor. For example, increasing
the number of partitions from 3 to 4 currently leads to redefining all
partition ranges and data node mappings to account for the new
partition. Complete repartitioning can be disruptive to multi-node
deployments. With stateful mappings, it is possible to split an
existing partition without affecting the other partitions (similar to
partitioning using consistent hashing).
Note that the dimension partition table expresses the current state of
space partitions; i.e., the space-dimension constraints and data nodes
to be assigned to new chunks. Existing chunks are not affected by
changes in the dimension partition table, although an external job
could rewrite, move, or copy chunks as desired to comply with the
current dimension partition state. As such, the dimension partition
table represents the "desired" space partitioning state.
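For illustration, a hedged sketch of how the new state could be inspected,
assuming the table lives in `_timescaledb_catalog` with `dimension_id`,
`range_start` and `data_nodes` columns:

```sql
-- Show the desired partition-to-data-node mapping per space dimension.
-- Schema and column names are assumptions based on the description above.
SELECT dimension_id, range_start, data_nodes
FROM _timescaledb_catalog.dimension_partition
ORDER BY dimension_id, range_start;
```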
Part of #4125
In the session in which timescaledb_post_restore() was called, the value
of timescaledb.restoring might not be changed because the reset_val
for the GUC was still on. We have to use an explicit SET in this
session to adjust the GUC.
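For reference, a sketch of the restore flow this affects; with this change
timescaledb_post_restore() uses an explicit SET so timescaledb.restoring is
off again in the same session:

```sql
SELECT timescaledb_pre_restore();
-- run pg_restore against the database here
SELECT timescaledb_post_restore();
SHOW timescaledb.restoring;   -- expected: off
```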
This release is a patch release. We recommend that you upgrade at the
next available opportunity.
Among other things, this release fixes several memory leaks, the handling
of TOASTed values in gapfill, and parameter handling in prepared statements.
**Bugfixes**
* #4517 Fix prepared statement param handling in ChunkAppend
* #4522 Fix ANALYZE on dist hypertable for a set of nodes
* #4526 Fix gapfill group comparison for TOASTed values
* #4527 Handle stats properly for range types
* #4532 Fix memory leak in function telemetry
* #4534 Use explicit memory context with hash_create
* #4538 Fix chunk creation on hypertables with non-default statistics
**Thanks**
* @3a6u9ka, @bgemmill, @hongquan, @stl-leonid-kalmaev and @victor-sudakov for reporting a memory leak
* @hleung2021 and @laocaixw for reporting an issue with parameter handling in prepared statements
A chunk in frozen state cannot be dropped.
drop_chunks will skip over frozen chunks without erroring.
The internal API drop_chunk will error if you attempt to drop
a chunk without unfreezing it first.
This PR also adds a new internal API to unfreeze a chunk.
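A hedged sketch of the resulting behavior, assuming the internal functions
live in the `_timescaledb_internal` schema and a chunk named
`_hyper_1_1_chunk`:

```sql
-- drop_chunks skips frozen chunks without erroring.
SELECT drop_chunks('conditions', older_than => INTERVAL '3 months');

-- The internal drop_chunk errors on a frozen chunk unless it is unfrozen first.
SELECT _timescaledb_internal.unfreeze_chunk('_timescaledb_internal._hyper_1_1_chunk');
SELECT _timescaledb_internal.drop_chunk('_timescaledb_internal._hyper_1_1_chunk');
```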
This PR introduces a new SQL function to associate a
hypertable or continuous aggregate with a custom job. If
this dependency is set up, the job is automatically
deleted when the hypertable/CAgg is dropped.
Add _timescaledb_internal.attach_osm_table_chunk.
This treats a pre-existing foreign table as a
hypertable chunk by adding dummy metadata to the
catalog tables.
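A hedged example of the intended call; the argument order (hypertable, then
the pre-existing foreign table) is an assumption:

```sql
-- Attach an existing foreign table as an OSM chunk of the hypertable.
SELECT _timescaledb_internal.attach_osm_table_chunk('conditions',
                                                    'osm_conditions_2022_01');
```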
In `src/ts_catalog/catalog.c` we explicitly define some constraint and
index names in the `catalog_table_index_definitions` array, but in our
pre-install SQL script for the schema definition we don't, so let's be more
explicit there and prevent future surprises.
Add a parameter `drop_remote_data` to `detach_data_node()` which
allows dropping the hypertable on the data node when detaching
it. This is useful when detaching a data node and then immediately
attaching it again. If the data remains on the data node, the
re-attach will fail with an error complaining that the hypertable
already exists.
The new parameter is analogous to the `drop_database` parameter of
`delete_data_node`. The new parameter is `false` by default for
compatibility and ensures that a data node can be detached without
requiring communication with the data node (e.g., if the data node is
not responding due to a failure).
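Example of detaching and immediately re-attaching a data node using the new
parameter (hypertable and node names are illustrative):

```sql
-- Drop the hypertable's data on the data node while detaching it, so the
-- node can be attached again right away.
SELECT detach_data_node('dn3', 'conditions', drop_remote_data => true);
SELECT attach_data_node('dn3', 'conditions');
```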
Closes #4414
Add a parameter `schedule_interval` to retention and
compression policies to allow users to define the schedule
interval. Fall back to previous default if no value is
specified.
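Example usage with the new parameter (table name and intervals are
illustrative):

```sql
-- Run the retention policy twice a day instead of the default schedule.
SELECT add_retention_policy('conditions', INTERVAL '6 months',
                            schedule_interval => INTERVAL '12 hours');

-- Same for the compression policy.
SELECT add_compression_policy('conditions', INTERVAL '7 days',
                              schedule_interval => INTERVAL '1 hour');
```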
Fixes #3806
Postgres knows whether a given aggregate is parallel-safe, and creates
parallel aggregation plans based on that. The `partialize_agg` is a
wrapper we use to perform partial aggregation on data nodes. It is a
pure function that produces serialized aggregation state as a result.
Being pure, it doesn't influence parallel safety. This means we don't
need to mark it parallel-unsafe to artificially disable the parallel
plans for partial aggregation. They will be chosen as usual based on
the parallel-safety of the underlying aggregate function.
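For context, a minimal example of how the wrapper is used; the parallel plan
is now chosen based on the wrapped aggregate (avg in this sketch), not on the
wrapper itself:

```sql
-- Produce serialized partial aggregation state per device; the partials are
-- later combined and finalized on the access node.
SELECT device_id,
       _timescaledb_internal.partialize_agg(avg(temperature)) AS avg_state
FROM conditions
GROUP BY device_id;
```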
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.
This release includes these noteworthy features:
* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks
This release also includes several bug fixes.
If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby column, you should recompress
those hypertables.
**Features**
* #4045 Custom origin support in CAGGs
* #4120 Add logging for retention policy
* #4158 Allow ANALYZE command on a data node directly
* #4169 Add support for chunk exclusion on DELETE to PG14
* #4209 Add support for chunk exclusion on UPDATE to PG14
* #4269 Continuous Aggregates finalized form
* #4301 Add support for bulk inserts in COPY operator
* #4311 Support non-superuser move chunk operations
* #4330 Add GUC "bgw_launcher_poll_time"
* #4340 Enable now() usage in plan-time chunk exclusion
**Bugfixes**
* #3899 Fix segfault in Continuous Aggregates
* #4225 Fix TRUNCATE error as non-owner on hypertable
* #4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* #4249 Fix option "timescaledb.create_group_indexes"
* #4251 Fix INSERT into compressed chunks with dropped columns
* #4255 Fix option "timescaledb.create_group_indexes"
* #4259 Fix logic bug in extension update script
* #4269 Fix bad Continuous Aggregate view definition reported in #4233
* #4289 Support moving compressed chunks between data nodes
* #4300 Fix refresh window cap for cagg refresh policy
* #4315 Fix memory leak in scheduler
* #4323 Remove printouts from signal handlers
* #4342 Fix move chunk cleanup logic
* #4349 Fix crashes in functions using AlterTableInternal
* #4358 Fix crash and other issues in telemetry reporter
**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
PG14 changes the internal state format for numeric aggregates,
which we materialize in caggs. This will invalidate the
affected columns when upgrading from PG13 to PG14. This patch
adds a warning to the update script when we encounter this
configuration.
The non-superuser needs to have at least REPLICATION privileges. A
new function "subscription_cmd" has been added to allow running
subscription-related commands on data nodes. This function implicitly
upgrades to the bootstrapped superuser and then performs subscription
creation/alteration/deletion commands. It only accepts
subscription-related commands and errors out otherwise.
Allow users to specify an explicit "operation_id" while carrying out
a copy/move operation. If it's specified then that is used as the
identifier for the copy/move operation. Otherwise, an implicit id is
created and used, as before.
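A hedged sketch, assuming the experimental move_chunk procedure accepts the
new identifier as an `operation_id` parameter (the parameter name is an
assumption based on this description):

```sql
-- Move a chunk with an explicit, user-chosen operation id.
CALL timescaledb_experimental.move_chunk(
    chunk            => '_timescaledb_internal._hyper_1_1_chunk',
    source_node      => 'dn1',
    destination_node => 'dn2',
    operation_id     => 'my_move_op_1'
);
```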
Add an internal api to drop a single chunk.
This function drops the storage and metadata
associated with the chunk.
Note that chunk dependencies are not affected;
e.g., continuous aggregates are not updated when this chunk
is dropped.
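A minimal example, assuming the function is exposed as
`_timescaledb_internal.drop_chunk` and takes the chunk's regclass:

```sql
-- Drop a single chunk's storage and metadata; dependent objects such as
-- continuous aggregates are not updated.
SELECT _timescaledb_internal.drop_chunk('_timescaledb_internal._hyper_1_2_chunk');
```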
This is an internal function to freeze a chunk
for PG14 and later.
This function sets a chunk status to frozen.
Operations that modify the chunk data
(like insert, update, delete) are not
supported. Frozen chunks can be dropped.
Additionally, chunk status is cached as part of
classify_relation.
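A hedged sketch of the new function, assuming it is exposed as
`_timescaledb_internal.freeze_chunk` and takes the chunk's regclass:

```sql
-- Freeze a chunk; data-modifying operations on it are rejected afterwards.
SELECT _timescaledb_internal.freeze_chunk('_timescaledb_internal._hyper_1_1_chunk');

-- This UPDATE now errors if it touches the frozen chunk.
UPDATE conditions SET temperature = 0 WHERE time < '2022-01-01';
```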
The first step to remove re-aggregation for Continuous Aggregates
is to remove the `chunk_id` from the materialization hypertable.
Also added a new metadata column named `finalized` to the `continuous_agg`
catalog table in order to store information about the new
finalized version of Continuous Aggregates that will not need the
partials anymore. This flag is important to maintain backward
compatibility with the previous Continuous Aggregate implementation,
which requires the `chunk_id` to refresh data properly.
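For illustration, a hedged query to see which continuous aggregates already
use the finalized form (catalog and column names as described above):

```sql
SELECT user_view_schema, user_view_name, finalized
FROM _timescaledb_catalog.continuous_agg;
```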
Postgres will prepend pg_temp to the effective search_path if it
is not present in the search_path. While pg_temp will never be
used to look up functions or operators unless explicitly requested,
pg_temp will be used to look up relations. Putting pg_temp in
search_path makes sure objects in pg_temp will be considered last
and pg_temp cannot be used to mask existing objects.
This was not caught earlier and is not currently caught by CI
because the check for unqualified casts is currently only in the main
branch of pgspot and not yet in the tagged version we use as
part of PR checks.
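For illustration, the shape of the setting the scripts now use:

```sql
-- pg_temp is listed explicitly and last, so it is searched after pg_catalog
-- and cannot be used to mask existing objects.
SET LOCAL search_path TO pg_catalog, pg_temp;
```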
post_update_cagg_try_repair was created with `CREATE OR REPLACE`
instead of `CREATE`. Additionally, the procedure was created in the
public schema. This patch adjusts the procedure to be created
with `CREATE` and in a timescaledb internal schema.
Found by pgspot.
The query to get the list of saved privileges during extension
upgrade had a bug: the classoid restriction was only applied to
a subset of the entries when it should have been applied to
all returned rows. This led to a failure during extension update
when init privileges for other classoids existed on any of the
relevant objects.
Add the missing variables to the finalization view of Continuous
Aggregates and the corresponding columns to the materialization table.
Cover the case of targets that contain Aggref nodes and Var nodes
that are outside of the Aggref nodes at the same time.
Stop rebuilding the Continuous Aggregate view with ALTER MATERIALIZED
VIEW. Attempt to repair the view at post-update time instead, and fail
gracefully if it is not possible to do so without raw hypertable schema
or data modifications.
Stop rebuilding the Continuous Aggregate view when switching realtime
aggregation on and off. Instead, manipulate the User View by either:
1. removing the UNION ALL right-hand side and the WHERE clause when
disabling realtime aggregation
2. adding the Direct View to the right of a UNION ALL operator and
defining WHERE clauses with the relevant watermark checks when
enabling realtime aggregation
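Roughly, the user view of a realtime CAgg has the following shape (a
simplified sketch with hypothetical object names; the real definition
converts the watermark to the bucket type). Disabling realtime aggregation
removes the UNION ALL branch and the watermark WHERE clauses; enabling it
adds them back:

```sql
CREATE VIEW conditions_summary AS
SELECT bucket, avg_temp
FROM _timescaledb_internal._materialized_hypertable_2      -- materialized part
WHERE bucket < _timescaledb_internal.to_timestamp(
                   _timescaledb_internal.cagg_watermark(2))
UNION ALL
SELECT bucket, avg_temp
FROM _timescaledb_internal._direct_view_2                   -- realtime part
WHERE bucket >= _timescaledb_internal.to_timestamp(
                    _timescaledb_internal.cagg_watermark(2));
```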
Fixes #3898
This release is a patch release. We recommend that you upgrade at the next available opportunity.
**Bugfixes**
* #3974 Fix remote EXPLAIN with parameterized queries
* #4122 Fix segfault on INSERT into distributed hypertable
* #4142 Ignore invalid relid when deleting hypertable
* #4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* #4161 Fix memory handling during scans
* #4186 Fix owner change for distributed hypertable
* #4192 Abort sessions after extension reload
* #4193 Fix relcache callback handling causing crashes
**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
Improve the performance of metadata scanning during hypertable
expansion.
When a hypertable is expanded to include all children chunks, only the
chunks that match the query restrictions are included. To find the
matching chunks, the planner first scans for all matching dimension
slices. The chunks that reference those slices are the chunks to
include in the expansion.
This change optimizes the scanning for slices by avoiding repeated
open/close of the dimension slice metadata table and index.
At the same time, related dimension slice scanning functions have been
refactored along the same lines.
An index on the chunk constraint metadata table is also changed to
allow scanning on dimension_slice_id. Previously, dimension_slice_id
was the second key in the index, which made scans on this key less
efficient.
Reorganize the code and fix a minor bug where the size of the
FSM, VM and INIT forks of the parent hypertable was not computed.
Fixed the bug by exposing the `ts_relation_size` function at the SQL
level to encapsulate the logic to compute `heap`, `indexes` and `toast`
sizes.
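A hedged sketch, assuming the SQL-level wrapper is exposed as
`_timescaledb_internal.relation_size` and returns the individual size
components:

```sql
-- Heap, index and toast sizes for a single relation (function name and
-- result columns are assumptions based on this description).
SELECT *
FROM _timescaledb_internal.relation_size('_timescaledb_internal._hyper_1_1_chunk');
```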
Add option `USE_TELEMETRY` that can be used to exclude telemetry from
the build.
Telemetry-specific SQL is moved and only included when the extension
is compiled with telemetry, and the notice is changed so that the
message about telemetry is not printed when telemetry is not compiled
in.
The following code is not compiled in when telemetry is not used:
- Cross-module functions for telemetry.
- Checks for telemetry job in job execution.
- GUC variables `telemetry_level` and `telemetry_cloud`.
Telemetry subsystem is not included when compiling without telemetry,
which requires some functions to be moved out of the telemetry
subsystem:
- Metadata handling is moved out of the telemetry module since it is
used not only with telemetry.
- UUID functions are moved into a separate module instead of being
part of the telemetry subsystem.
- Telemetry functions are either added or removed when updating from a
previous version.
Tests are updated to:
- Not use telemetry functions to get the UUID or metadata and instead use
  the moved UUID and metadata functions.
- Not include telemetry information in tests that do not require it.
- Not set telemetry variables in configuration files when telemetry is
  not compiled in.
- Replace usage of telemetry functions in non-telemetry tests with
  other sources of the same information.
Fixes #3931
This release is medium priority for upgrade. We recommend that you
upgrade at the next available opportunity.
This release adds major new features since the 2.5.2 release,
including:
* Compression in continuous aggregates
* Experimental support for timezones in continuous aggregates
* Experimental support for monthly buckets in continuous aggregates
It also includes several bug fixes. Telemetry reports are switched to a
new format, and now include more detailed statistics on compression,
distributed hypertables and indexes.
**Features**
* #3768 Allow ALTER TABLE ADD COLUMN with DEFAULT on compressed
hypertable
* #3769 Allow ALTER TABLE DROP COLUMN on compressed hypertable
* #3943 Optimize first/last
* #3945 Add support for ALTER SCHEMA on multi-node
* #3949 Add support for DROP SCHEMA on multi-node
**Bugfixes**
* #3808 Properly handle max_retries option
* #3863 Fix remote transaction heal logic
* #3869 Fix ALTER SET/DROP NULL constraint on distributed hypertable
* #3944 Fix segfault in add_compression_policy
* #3961 Fix crash in EXPLAIN VERBOSE on distributed hypertable
* #4015 Eliminate float rounding instabilities in interpolate
* #4019 Update ts_extension_oid in transitioning state
* #4073 Fix buffer overflow in partition scheme
**Improvements**
Query planning performance is improved for hypertables with a large
number of chunks.
**Thanks**
* @fvannee for reporting a first/last memory leak
* @mmouterde for reporting an issue with floats and interpolate
SET LOCAL is only active until the end of the transaction, so we set
search_path again after COMMIT in functions that do transaction control.
While we could use SET at the start of the function, we do not want to
bleed out search_path to the caller.
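An illustrative PL/pgSQL sketch of the pattern (hypothetical procedure);
SET LOCAL has to be reissued after each COMMIT because the transaction that
carried it has ended:

```sql
CREATE PROCEDURE example_maintenance() LANGUAGE plpgsql AS $$
BEGIN
    SET LOCAL search_path TO pg_catalog, pg_temp;
    -- ... first batch of work ...
    COMMIT;                                        -- clears SET LOCAL
    SET LOCAL search_path TO pg_catalog, pg_temp;  -- so set it again
    -- ... second batch of work ...
END;
$$;
```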
Resetting search_path in reverse-dev was necessary before the
release of 2.5.2, as the previous timescaledb version scripts
didn't handle a locked-down search_path. We can remove setting
search_path too, as the downgrade script includes pre-update.sql,
which locks down search_path.
This release contains bug fixes since the 2.5.1 release.
This release is high priority for upgrade. We strongly recommend that you
upgrade as soon as possible.
**Bugfixes**
* #3900 Improve custom scan node registration
* #3911 Fix role type deparsing for GRANT command
* #3918 Fix DataNodeScan plans with one-time filter
* #3921 Fix segfault on insert into internal compressed table
* #3938 Fix subtract_integer_from_now on 32-bit platforms and improve error handling
* #3939 Fix projection handling in time_bucket_gapfill
* #3948 Avoid double PGclear() in data fetchers
* #3979 Fix deparsing of index predicates
* #4015 Eliminate float rounding instabilities in interpolate
* #4020 Fix ALTER TABLE EventTrigger initialization
* #4024 Fix premature cache release call
* #4037 Fix status for dropped chunks that have catalog entries
* #4069 Fix riinfo NULL handling in ANY construct
* #4071 Fix extension installation privilege escalation
* #4073 Fix buffer overflow in partition scheme
**Thanks**
* @carlocperez for reporting crash with NULL handling in ANY construct
* @erikhh for reporting an issue with time_bucket_gapfill
* @fvannee for reporting a first/last memory leak
* @kancsuki for reporting drop column and partial index creation not working
* @mmouterde for reporting an issue with floats and interpolate
* Pedro Gallegos for reporting a possible privilege escalation during extension installation
Security: CVE-2022-24128
This patch locks down search_path in extension install and update
scripts to only contain pg_catalog, this requires that any reference
in those scripts is fully qualified. Additionally we add explicit
create commands to all update scripts for objects added to the
public schema. This change will make update scripts fail if a
function with an identical signature already exists when installing
or upgrading, instead of reusing the existing object.
TimescaleDB was vulnerable to a privilege escalation attack in
the extension installation script. An attacker could precreate
objects normally owned by the extension and get those objects
used in the installation script since the script would only try
to create them if they did not already exist. Thanks to Pedro
Gallegos for reporting the problem.
This patch changes the schema, table and function creation to fail
and abort the installation when the object already exists instead
of using the existing object.
Security: CVE-2022-24128
Refactor the telemetry function and format to include stats broken
down by common relation types. The types include:
- Tables
- Partitioned tables
- Hypertables
- Distributed hypertables
- Continuous aggregates
- Materialized views
- Views
and for each of these types report (when applicable):
- Total number of relations
- Total number of children/chunks
- Total data volume (broken into heap, toast, and indexes).
- Compression stats
- PG stats, like reltuples
The telemetry function has also been refactored to return `jsonb`
instead of `text`. This makes it easier to query and manipulate the
resulting JSON format, and also gives cleaner output.
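For example, the report can now be inspected and filtered directly
(assuming the function keeps its `get_telemetry_report` name; the JSON keys
shown are illustrative):

```sql
SELECT jsonb_pretty(get_telemetry_report());
SELECT get_telemetry_report() -> 'relations' -> 'hypertables';
```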
Closes #3932