Instances upgraded to 2.8.0 will end up with an incorrect check constraint
in the catalog table `continuous_aggregate_migrate_plan_step`.
Fixed it by dropping and re-adding the constraint with the correct checks.
Fixes #4727
This patch fixes a deadlock between chunk decompression and SELECT
queries executed in parallel. The change in
a608d7db614c930213dee8d6a5e9d26a0259da61 requested an AccessExclusiveLock
on the decompressed chunk instead of the compressed chunk, resulting in
deadlocks.
In addition, an isolation test has been added to verify that SELECT
queries on a chunk that is currently being decompressed can be executed.
Fixes #4605
Consider a compressed hypertable with many columns (e.g., more than 600).
A call to compress_chunk() can then produce a compressed tuple whose size
exceeds 8K, causing an error such as "row is too big: size 10856, maximum
size 8160."
This patch estimates the tuple size of the compressed hypertable and
reports a warning when compression is enabled on the hypertable, so the
user becomes aware of this limitation before calling compress_chunk().
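For illustration, a sketch of the scenario (table and column names are
made up, and the exact warning text may vary):
-- a very wide table; imagine several hundred more value columns
CREATE TABLE wide_metrics(time timestamptz NOT NULL, device_id int, c1 float8, c2 float8);
SELECT create_hypertable('wide_metrics', 'time');
-- with this patch, enabling compression warns if the estimated
-- compressed tuple size may exceed the 8K page limit
ALTER TABLE wide_metrics SET (timescaledb.compress, timescaledb.compress_segmentby = 'device_id');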
Fixes #4398
Allow planner chunk exclusion in subqueries. When we decide
whether a query may benefit from constifying now() and encounter a
subquery, peek into the subquery and check if the constraint
references a hypertable partitioning column.
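An illustrative query shape that now benefits (hypertable and column
names are made up):
-- the now() constraint inside the subquery references the hypertable
-- partitioning column, so chunks can be excluded at plan time
SELECT * FROM (
  SELECT * FROM metrics WHERE time > now() - interval '5 minutes'
) AS recent;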
Fixes #4524
The schema of the base table on which a hypertable is created should define
columns with proper data types. As per the PostgreSQL best practices wiki
(https://wiki.postgresql.org/wiki/Don't_Do_This), one should not define
columns as CHAR, VARCHAR, or VARCHAR(N), but use the TEXT data type instead.
Similarly, timestamptz should be used instead of timestamp.
This patch reports a WARNING to the user when creating a hypertable
if the underlying parent table has columns of the above-mentioned data types.
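An illustrative example (the exact warning text may differ):
CREATE TABLE readings(time timestamp NOT NULL, note varchar(40), value float8);
-- with this patch the following reports WARNINGs suggesting timestamptz
-- instead of timestamp, and text instead of varchar(40)
SELECT create_hypertable('readings', 'time');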
Fixes #4335
Since we do not use our own hypertable expansion for SELECT FOR UPDATE
queries, we need to make sure to add the extra information necessary to
get hashed space partitions working with the native PostgreSQL
inheritance expansion.
The primary key for compression_chunk_size was defined as (chunk_id,
compressed_chunk_id), but other places assumed chunk_id is actually
unique and would error when it was not. Since it makes no sense to
have multiple entries per chunk (any extra entry would reference a
chunk that no longer exists), this patch changes the primary key to
chunk_id only.
This patch adds a new time_bucket_gapfill function that
allows bucketing in a specific timezone.
You can gapfill with explicit timezone like so:
`SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') ...`
Unfortunately this introduces an ambiguity with some previous
call variations when an untyped start/finish argument was passed
to the function. Some queries might need to be adjusted and either
explicitly name the positional argument or resolve the type ambiguity
by casting to the intended type.
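For example, a call that previously passed untyped start/finish literals
may now be ambiguous; casting resolves it (sketch, assuming a `metrics`
hypertable):
SELECT time_bucket_gapfill('1 day', time, '2022-01-01'::timestamptz, '2022-02-01'::timestamptz), avg(value)
FROM metrics
WHERE time >= '2022-01-01' AND time < '2022-02-01'
GROUP BY 1;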
Using a custom ENUM data type in the GROUP BY clause of a query on a
compressed hypertable raised an error.
Fixed it by checking, while generating scan paths for the query, whether
the SEGMENTBY column is a custom ENUM type, and reporting a valid error
message in that case.
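A sketch of the scenario (names are illustrative):
CREATE TYPE status AS ENUM ('ok', 'error');
CREATE TABLE events(time timestamptz NOT NULL, s status, value float8);
SELECT create_hypertable('events', 'time');
ALTER TABLE events SET (timescaledb.compress, timescaledb.compress_segmentby = 's');
-- previously raised an internal error when grouping by the ENUM
-- segmentby column once chunks were compressed
SELECT s, count(*) FROM events GROUP BY s;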
Fixes #3481
This release adds major new features since the 2.7.2 release.
We deem it moderate priority for upgrading.
This release includes these noteworthy features:
* time_bucket now supports bucketing by month, year and timezone
* Improve performance of bulk SELECT and COPY for distributed hypertables
* 1 step CAgg policy management
* Migrate Continuous Aggregates to the new format
**Features**
* #4188 Use COPY protocol in row-by-row fetcher
* #4307 Mark partialize_agg as parallel safe
* #4380 Enable chunk exclusion for space dimensions in UPDATE/DELETE
* #4384 Add schedule_interval to policies
* #4390 Faster lookup of chunks by point
* #4393 Support intervals with day component when constifying now()
* #4397 Support intervals with month component when constifying now()
* #4405 Support ON CONFLICT ON CONSTRAINT for hypertables
* #4412 Add telemetry about replication
* #4415 Drop remote data when detaching data node
* #4416 Handle TRUNCATE TABLE on chunks
* #4425 Add parameter check_config to alter_job
* #4430 Create index on Continuous Aggregates
* #4439 Allow ORDER BY on continuous aggregates
* #4443 Add stateful partition mappings
* #4484 Use non-blocking data node connections for COPY
* #4495 Support add_dimension() with existing data
* #4502 Add chunks to baserel cache on chunk exclusion
* #4545 Add hypertable distributed argument and defaults
* #4552 Migrate Continuous Aggregates to the new format
* #4556 Add runtime exclusion for hypertables
* #4561 Change get_git_commit to return full commit hash
* #4563 1 step CAgg policy management
* #4641 Allow bucketing by month, year, century in time_bucket and time_bucket_gapfill
* #4642 Add timezone support to time_bucket
**Bugfixes**
* #4359 Create composite index on segmentby columns
* #4374 Remove constified now() constraints from plan
* #4416 Handle TRUNCATE TABLE on chunks
* #4478 Synchronize chunk cache sizes
* #4486 Adding boolean column with default value doesn't work on compressed table
* #4512 Fix unaligned pointer access
* #4519 Throw better error message on incompatible row fetcher settings
* #4549 Fix dump_meta_data for windows
* #4553 Fix timescaledb_post_restore GUC handling
* #4573 Load TSL library on compressed_data_out call
* #4575 Fix use of `get_partition_hash` and `get_partition_for_key` inside an IMMUTABLE function
* #4577 Fix segfaults in compression code with corrupt data
* #4580 Handle default privileges on CAggs properly
* #4582 Fix assertion in GRANT .. ON ALL TABLES IN SCHEMA
* #4583 Fix partitioning functions
* #4589 Fix rename for distributed hypertable
* #4601 Reset compression sequence when group resets
* #4611 Fix a potential OOM when loading large data sets into a hypertable
* #4624 Fix heap buffer overflow
* #4627 Fix telemetry initialization
* #4631 Ensure TSL library is loaded on database upgrades
* #4646 Fix time_bucket_ng origin handling
* #4647 Fix the error "SubPlan found with no parent plan" that occurred when using joins in the RETURNING clause.
**Thanks**
* @AlmiS for reporting error on `get_partition_hash` executed inside an IMMUTABLE function
* @Creatation for reporting an issue with renaming hypertables
* @janko for reporting an issue when adding bool column with default value to compressed hypertable
* @jayadevanm for reporting error of TRUNCATE TABLE on compressed chunk
* @michaelkitson for reporting permission errors using default privileges on Continuous Aggregates
* @mwahlhuetter for reporting error in joins in RETURNING clause
* @ninjaltd and @mrksngl for reporting a potential OOM when loading large data sets into a hypertable
* @PBudmark for reporting an issue with dump_meta_data.sql on Windows
* @ssmoss for reporting an issue with time_bucket_ng origin handling
Do not allocate various temporary data in PortalContext, such as the
hyperspace point corresponding to the row, or the intermediate data
required for chunk lookup.
Make truncating an uncompressed chunk drop the data in the case where
it resides in a corresponding compressed chunk.
Generate invalidations for Continuous Aggregates after TRUNCATE, so
as to have consistent refresh operations on the materialization
hypertable.
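A sketch (the chunk name is illustrative):
-- truncating an uncompressed chunk now also removes the corresponding
-- rows in its compressed chunk and invalidates dependent Continuous
-- Aggregates
TRUNCATE TABLE _timescaledb_internal._hyper_1_1_chunk;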
Fixes #4362
If a default privilege is configured and applied to a given Continuous
Aggregate during its creation, only the user view gets the ACL properly
configured; the underlying materialization hypertable does not, leading
to permission errors.
Fixed it by copying the privileges from the user view to the
materialization hypertable during Continuous Aggregate creation.
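A sketch of the scenario (role, table, and view names are illustrative):
ALTER DEFAULT PRIVILEGES FOR ROLE cagg_owner GRANT SELECT ON TABLES TO reader;
-- creating the CAgg as cagg_owner now propagates the default ACL to the
-- materialization hypertable, not just to the user view
CREATE MATERIALIZED VIEW daily WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, avg(value) FROM metrics GROUP BY 1;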
Fixes #4555
When executing `get_partition_{hash|for_key}` inside an IMMUTABLE
function, we got the following error:
`ERROR: unsupported expression argument node type 112`
This error occurred because the underlying `resolve_function_argtype`
did not handle the `T_Param` node type.
Fixed it by properly handling `T_Param` nodes, returning the
`paramtype` as the argument type.
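A sketch of a possible reproduction (the wrapper function is illustrative):
CREATE FUNCTION device_hash(key text) RETURNS int
LANGUAGE sql IMMUTABLE AS
$$ SELECT _timescaledb_internal.get_partition_hash(key) $$;
-- previously failed with "unsupported expression argument node type 112"
SELECT device_hash('device-1');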
Fixes #4575
Enable adding a boolean column with a default value to a compressed table.
This limitation was due to the internal representation of default boolean
values like 'True' or 'False'; additional checks have been added to handle
these.
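For example (hypertable name is illustrative), the following now works on
a compressed hypertable:
`ALTER TABLE metrics ADD COLUMN is_valid boolean DEFAULT true;`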
Fixes #4486
This release is a patch release. We recommend that you upgrade at the
next available opportunity.
Among other things this release fixes several memory leaks, the handling
of TOASTed values in gapfill, and parameter handling in prepared statements.
**Bugfixes**
* #4517 Fix prepared statement param handling in ChunkAppend
* #4522 Fix ANALYZE on dist hypertable for a set of nodes
* #4526 Fix gapfill group comparison for TOASTed values
* #4527 Handle stats properly for range types
* #4532 Fix memory leak in function telemetry
* #4534 Use explicit memory context with hash_create
* #4538 Fix chunk creation on hypertables with non-default statistics
**Thanks**
* @3a6u9ka, @bgemmill, @hongquan, @stl-leonid-kalmaev and @victor-sudakov for reporting a memory leak
* @hleung2021 and @laocaixw for reporting an issue with parameter handling in prepared statements
The gapfill mechanism to detect an aggregation group change was
using datumIsEqual to compare the group values. datumIsEqual does
not detoast values, so when one value is toasted and the other is
not, it will not return the correct result. This patch changes
the gapfill code to use the correct equality operator for the type
of the group column instead of datumIsEqual.
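An illustrative query shape that was affected (names are made up):
grouping on a text column whose values are large enough to be TOASTed.
SELECT time_bucket_gapfill('1 hour', time) AS bucket, long_label, avg(value)
FROM metrics
WHERE time >= '2022-07-01' AND time < '2022-07-02'
GROUP BY 1, 2;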
Executing an IMMUTABLE function that has parameters and an exception
handling block multiple times in the same transaction caused a null
pointer segfault when trying to reset a non-initialized ts_baserel_info.
Fixed it by preventing the reset of a non-initialized `ts_baserel_info`.
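A sketch of a possible reproduction (the function is illustrative):
CREATE FUNCTION safe_inc(i int) RETURNS int LANGUAGE plpgsql IMMUTABLE AS
$$ BEGIN
  RETURN i + 1;
EXCEPTION WHEN OTHERS THEN
  RETURN NULL;
END; $$;
-- calling it more than once in the same transaction previously segfaulted
BEGIN;
SELECT safe_inc(1);
SELECT safe_inc(2);
COMMIT;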
Fixes #4489
This patch introduces a further check to compress_chunk_impl and
decompress_chunk_impl. After all locks are acquired, a check is made
to see if the chunk is still (un-)compressed. If the chunk was
(de-)compressed while waiting for the locks, the (de-)compression
operation is stopped.
In addition, the chunk locks in decompress_chunk_impl
are upgraded to AccessExclusiveLock to ensure the chunk is not deleted
while other transactions are using it.
Fixes #4480
For certain inserts on a distributed hypertable, e.g., involving CTEs
and upserts, plans can be generated that weren't properly handled by
the DataNodeCopy and DataNodeDispatch execution nodes. In particular,
the nodes expect ChunkDispatch as a child node, but PostgreSQL can
sometimes insert a Result node above ChunkDispatch, causing the crash.
Further, behavioral changes in PG14 also caused the DataNodeCopy node
to sometimes wrongly believe a RETURNING clause was present. The check
for returning clauses has been updated to fix this issue.
Fixes #4339
When dealing with intervals with a month component, timezone changes
can result in multiple-day differences in the outcome of these
calculations due to different month lengths. When dealing with
months, we add a 7 day safety buffer.
For all these calculations it is fine if we exclude fewer chunks
than strictly required for the operation; additional exclusion
with exact values will happen in the executor. But under no
circumstances must we exclude too many, because there would be
no way for the executor to get those chunks back.
The initial patch to use now() expressions during planner hypertable
expansion only supported intervals with no day or month component.
This patch adds support for intervals with a day component.
If the interval has a day component, the calculation needs
to take into account daylight saving time switches, whereby a
day is not always exactly 24 hours. We mitigate this by
adding a safety buffer to account for these DST switches when
dealing with intervals with a day component. These calculations
will be repeated with exact values during execution.
Since DST switches seem to range between -1 and 2 hours, we set
the safety buffer to 4 hours.
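As an illustration, for a constraint like `time > now() - interval '1 day'`
the planner only excludes chunks that end before the buffered bound:
SELECT now() - interval '1 day' - interval '4 hours' AS safe_lower_bound;
-- for intervals with a month component the buffer is 7 days instead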
This patch also refactors the tests, since the previous tests
made it hard to tell whether the feature was working after the
constified values had been removed from the plans.
This release adds major new features since the 2.6.1 release.
We deem it moderate priority for upgrading.
This release includes these noteworthy features:
* Optimize continuous aggregate query performance and storage
* The following query clauses and functions can now be used in a continuous
aggregate: FILTER, DISTINCT, ORDER BY as well as [Ordered-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE)
and [Hypothetical-Set Aggregate](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE)
* Optimize now() query planning time
* Improve COPY insert performance
* Improve performance of UPDATE/DELETE on PG14 by excluding chunks
This release also includes several bug fixes.
If you are upgrading from a previous version and were using compression
with a non-default collation on a segmentby column, you should recompress
those hypertables.
**Features**
* #4045 Custom origin support in CAggs
* #4120 Add logging for retention policy
* #4158 Allow ANALYZE command on a data node directly
* #4169 Add support for chunk exclusion on DELETE to PG14
* #4209 Add support for chunk exclusion on UPDATE to PG14
* #4269 Continuous Aggregates finals form
* #4301 Add support for bulk inserts in COPY operator
* #4311 Support non-superuser move chunk operations
* #4330 Add GUC "bgw_launcher_poll_time"
* #4340 Enable now() usage in plan-time chunk exclusion
**Bugfixes**
* #3899 Fix segfault in Continuous Aggregates
* #4225 Fix TRUNCATE error as non-owner on hypertable
* #4236 Fix potential wrong order of results for compressed hypertable with a non-default collation
* #4249 Fix option "timescaledb.create_group_indexes"
* #4251 Fix INSERT into compressed chunks with dropped columns
* #4255 Fix option "timescaledb.create_group_indexes"
* #4259 Fix logic bug in extension update script
* #4269 Fix bad Continuous Aggregate view definition reported in #4233
* #4289 Support moving compressed chunks between data nodes
* #4300 Fix refresh window cap for cagg refresh policy
* #4315 Fix memory leak in scheduler
* #4323 Remove printouts from signal handlers
* #4342 Fix move chunk cleanup logic
* #4349 Fix crashes in functions using AlterTableInternal
* #4358 Fix crash and other issues in telemetry reporter
**Thanks**
* @abrownsword for reporting a bug in the telemetry reporter and testing the fix
* @jsoref for fixing various misspellings in code, comments and documentation
* @yalon for reporting an error with ALTER TABLE RENAME on distributed hypertables
* @zhuizhuhaomeng for reporting and fixing a memory leak in our scheduler
Make the following changes to the telemetry reporter background worker:
- Add a read lock to the current relation that the reporter collects
stats for. This lock protects against concurrent deletion of the
relation, which could lead to errors that would prevent the reporter
from completing its report.
- Set an active snapshot in the telemetry background process for use
when scanning a relation for stats collection.
- Reopen the scan iterator when collecting chunk compression stats for
a relation instead of keeping it open and restarting the scan. The
previous approach seems to cause crashes due to memory corruption of
the scan state. Unfortunately, the exact cause has not been
identified, but the change has been verified to work on a live
running instance (thanks to @abrownsword for the help with
reproducing the crash and testing fixes).
Fixes #4266
Following the work started by #4294 to improve the performance of
Continuous Aggregates by removing the re-aggregation in the user view,
this PR gets rid of the `partialize_agg` and `finalize_agg` aggregate
functions and stores the finalized aggregated (plain) data in the
materialization hypertable.
Because we're not storing partials anymore and have removed the
re-aggregation, it is now possible to create indexes on aggregated
columns in the materialization hypertable in order to improve
performance even more.
Also removed restrictions on the types of aggregates users can perform
with Continuous Aggregates:
* aggregates with DISTINCT
* aggregates with FILTER
* aggregates with FILTER in HAVING clause
* aggregates without combine function
* ordered-set aggregates
* hypothetical-set aggregates
By default, new Continuous Aggregates will be created using this new
format, but the previous version (with partials) will still be supported.
Users can create the previous style by setting the storage parameter
`timescaledb.finalized` to `false` during the creation of the
Continuous Aggregate.
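For example, to create a Continuous Aggregate in the previous
partials-based format (view and table names are illustrative):
CREATE MATERIALIZED VIEW daily_partials
WITH (timescaledb.continuous, timescaledb.finalized = false) AS
SELECT time_bucket('1 day', time) AS bucket, avg(value)
FROM metrics GROUP BY 1;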
Fixes #4233
This implements an optimization to allow now() expressions to be
used during plan-time chunk exclusion. Since now() is STABLE it
would not normally be considered for plan-time chunk exclusion.
To enable this behaviour we convert `column > now()` expressions
into `column > const AND column > now()`. Assuming that time
always moves forward, this is safe even for prepared statements.
This optimization works for SELECT, UPDATE and DELETE.
On hypertables with many chunks this can lead to a considerable
speedup for certain queries.
The following expressions are supported:
- column > now()
- column >= now()
- column > now() - Interval
- column > now() + Interval
- column >= now() - Interval
- column >= now() + Interval
Interval must not have a day or month component as those depend
on timezone settings.
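As an illustration (table name and constant are made up), a query such as
select * from metrics where time > now() - interval '5 minutes';
is planned as if an additional constified constraint were present, i.e.
time > '2022-08-31 12:00:00+00'::timestamptz AND time > now() - interval '5 minutes'
where the constant reflects the plan-time value of now().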
Some microbenchmarks to show the improvements; I took the best of five
runs for all of the queries.
-- hypertable with 1k chunks
-- with optimization
select * from metrics1k where time > now() - '5m'::interval;
Time: 3.090 ms
-- without optimization
select * from metrics1k where time > now() - '5m'::interval;
Time: 145.640 ms
-- hypertable with 5k chunks
-- with optimization
select * from metrics5k where time > now() - '5m'::interval;
Time: 4.317 ms
-- without optimization
select * from metrics5k where time > now() - '5m'::interval;
Time: 775.259 ms
-- hypertable with 10k chunks
-- with optimization
select * from metrics10k where time > now() - '5m'::interval;
Time: 4.853 ms
-- without optimization
select * from metrics10k where time > now() - '5m'::interval;
Time: 1766.319 ms (00:01.766)
-- hypertable with 20k chunks
-- with optimization
select * from metrics20k where time > now() - '5m'::interval;
Time: 6.141 ms
-- without optimization
select * from metrics20k where time > now() - '5m'::interval;
Time: 3321.968 ms (00:03.322)
Speedup with 1k chunks: 47x
Speedup with 5k chunks: 179x
Speedup with 10k chunks: 363x
Speedup with 20k chunks: 540x
This commit backports the Postgres multi-buffer / bulk insert
optimization into the timescale copy operator. If the target chunk
allows it (e.g., if no triggers are defined on the hypertable or the
chunk is not compressed), the data is stored in in-memory buffers
first and then flushed to the chunks in bulk operations.
Implements: #4080
This patch adds the spelling-fix commit to the git blame ignore
list and adds a thank-you to the changelog for the author.
The git blame change couldn't be done in the spelling PR itself
because it references the commit hash.
The query to get the list of saved privileges during extension
upgrade had a bug: it only applied the classoid restriction to
a subset of the entries when it should have been applied to all
returned rows. This led to a failure during extension update
when init privileges for other classoids existed on any of the
relevant objects.
Add the missing variables to the finalization view of Continuous
Aggregates and the corresponding columns to the materialization table.
Cover the case of targets that contain Aggref nodes and Var nodes
that are outside of the Aggref nodes at the same time.
Stop rebuilding the Continuous Aggregate view with ALTER MATERIALIZED
VIEW. Attempt to repair the view at post-update time instead, and fail
gracefully if it is not possible to do so without raw hypertable schema
or data modifications.
Stop rebuilding the Continuous Aggregate view when switching realtime
aggregation on and off. Instead, manipulate the User View by either:
1. removing the UNION ALL right-hand side and the WHERE clause when
disabling realtime aggregation
2. adding the Direct View to the right of a UNION ALL operator and
defining WHERE clauses with the relevant watermark checks when
enabling realtime aggregation
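The resulting user view has roughly this shape (names and the watermark
check are illustrative):
-- realtime aggregation enabled
SELECT * FROM materialization_hypertable WHERE bucket < watermark
UNION ALL
SELECT * FROM direct_view WHERE bucket >= watermark;
-- realtime aggregation disabled: only the materialized part remains
SELECT * FROM materialization_hypertable;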
Fixes #3898
Stop throwing the error "must be owner of hypertable" when a user with
TRUNCATE privilege on the hypertable attempts to TRUNCATE it.
Previously we had a check that required TRUNCATE to be performed
only by the table owner, not taking into account the user's
TRUNCATE privilege, which is sufficient to allow this operation.
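For example (names are illustrative):
GRANT TRUNCATE ON metrics TO app_user;
-- as app_user; previously failed with "must be owner of hypertable"
TRUNCATE metrics;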
Fixes #4183
This release is a patch release. We recommend that you upgrade at the next available opportunity.
**Bugfixes**
* #3974 Fix remote EXPLAIN with parameterized queries
* #4122 Fix segfault on INSERT into distributed hypertable
* #4142 Ignore invalid relid when deleting hypertable
* #4159 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
* #4161 Fix memory handling during scans
* #4186 Fix owner change for distributed hypertable
* #4192 Abort sessions after extension reload
* #4193 Fix relcache callback handling causing crashes
**Thanks**
* @abrownsword for reporting a crash in the telemetry reporter
* @daydayup863 for reporting issue with remote explain
Currently only IMMUTABLE constraints will exclude chunks from an UPDATE
plan; with this patch STABLE expressions will be used to exclude chunks
as well. This is a big performance improvement, as chunks not matching
the partitioning column constraints no longer have to be scanned for
UPDATEs.
Since the codepath for UPDATE is different for PG < 14, this patch only
adds the optimization for PG14.
With this patch the plan for UPDATE on hypertables looks like this:
Custom Scan (HypertableModify) (actual rows=0 loops=1)
  ->  Update on public.metrics_int2 (actual rows=0 loops=1)
        Update on public.metrics_int2 metrics_int2_1
        Update on _timescaledb_internal._hyper_1_1_chunk metrics_int2
        Update on _timescaledb_internal._hyper_1_2_chunk metrics_int2
        Update on _timescaledb_internal._hyper_1_3_chunk metrics_int2
        ->  Custom Scan (ChunkAppend) on public.metrics_int2 (actual rows=0 loops=1)
              Output: '123'::text, metrics_int2.tableoid, metrics_int2.ctid
              Startup Exclusion: true
              Runtime Exclusion: false
              Chunks excluded during startup: 3
              ->  Seq Scan on public.metrics_int2 metrics_int2_1 (actual rows=0 loops=1)
                    Output: metrics_int2_1.tableoid, metrics_int2_1.ctid
                    Filter: (metrics_int2_1."time" = length(version()))
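A plan like the above can be produced by a query of roughly this shape
(reconstructed from the plan output; the column being set is a guess):
EXPLAIN (ANALYZE, VERBOSE, COSTS OFF)
UPDATE metrics_int2 SET data = '123' WHERE time = length(version());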
Also remove unused code from compression_api: the function
policy_compression_get_verbose_log was unused there. Moved it to
policy_utils and renamed it to policy_get_verbose_log so that it can
be used by all policies.
If a session is started and loads (and caches, by OID) functions in the
extension to use them in, for example, a `SELECT` query on a continuous
aggregate, the extension will be marked as loaded internally.
If an `ALTER EXTENSION` is then executed in a separate session, it will
update `pg_extension` to hold the new version, and any other sessions
will see this as the new version, including the session that already
loaded the previous version of the shared library.
Since the pre-update session has loaded some functions from the old
version already, running the same queries with the old named functions
will trigger a reload of the new version of the shared library to get
the new functions (same name, but different OID), but since this has
already been loaded in a different version, it will trigger an error
that GUC variables are re-defined.
Further queries after that will then corrupt the database causing a
crash.
This commit fixes this by recording the loaded version rather than just
whether the library has been loaded, and checking that the version did
not change after a query has been analyzed (in the `post_analyze_hook`).
If the version changed, a fatal error is generated to force an abort of
the session.
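A sketch of the scenario (objects are illustrative):
-- session 1: uses the extension, loading the current shared library
SELECT * FROM my_cagg;
-- session 2: upgrades the extension to a new version
ALTER EXTENSION timescaledb UPDATE;
-- session 1: with this fix, the next query fails with a FATAL error and
-- aborts the session instead of loading a second library version
SELECT * FROM my_cagg;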
Fixes #4191