4692 Commits

Fabrízio de Royes Mello
ca125cf620 Post-release changes for 2.15.0. 2024-05-07 16:44:43 -03:00
Sven Klemm
c5ef48965c Use ts_extract_expr_args for HypertableRestrictInfo
Refactor HypertableRestrictInfo code to make use of the newly added
ts_extract_expr_args function.
2024-05-07 20:03:12 +02:00
Sven Klemm
0075bb0c81 Refactor expression handling for compressed UPDATE/DELETE 2024-05-07 20:03:12 +02:00
Erik Nordström
5bf6a3f502 Fix use of ObjectId datums in compression settings
Use ObjectId datum functions in compression settings instead of int32
functions when storing `regclass` types.

Also fix a minor issue where an array for Datum information was using
the wrong size.
2024-05-07 09:22:15 +02:00
Fabrízio de Royes Mello
defe4ef581 Release 2.15.0
This release contains performance improvements and bug fixes since
the 2.14.2 release. We recommend that you upgrade at the next
available opportunity.

In addition, it includes these noteworthy features:
* Support `time_bucket` with `origin` and/or `offset` on Continuous Aggregates
* Compression improvements:
  - Improve expression pushdown
  - Add minmax sparse indexes when compressing columns with btree indexes
  - Make compression use the default functions
  - Vectorize filters in WHERE clause that contain text equality operators and LIKE expressions

**Deprecation warning**
* Starting with this release, it is no longer possible to create Continuous Aggregates using `time_bucket_ng`; the function will be completely removed in upcoming releases.
* We recommend that users [migrate their old Continuous Aggregate format to the new one](https://docs.timescale.com/use-timescale/latest/continuous-aggregates/migrate/), because support for the old format will be completely removed in upcoming releases, after which migration will no longer be possible.
* This is the last release supporting PostgreSQL 13.

**For on-premise users and this release only**, you will need to run [this SQL script](https://github.com/timescale/timescaledb-extras/blob/master/utils/2.15.X-fix_hypertable_foreign_keys.sql) after running `ALTER EXTENSION`. More details can be found in pull request [#6797](https://github.com/timescale/timescaledb/pull/6797).

**Features**
* #6382 Support for time_bucket with origin and offset in CAggs
* #6696 Improve defaults for compression segment_by and order_by
* #6705 Add sparse minmax indexes for compressed columns that have uncompressed btree indexes
* #6754 Allow DROP CONSTRAINT on compressed hypertables
* #6767 Add metadata table `_timescaledb_internal.bgw_job_stat_history` for tracking job execution history
* #6798 Prevent usage of deprecated time_bucket_ng in CAgg definition
* #6810 Add telemetry for access methods
* #6811 Remove no longer relevant timescaledb.allow_install_without_preload GUC
* #6837 Add migration path for CAggs using time_bucket_ng
* #6865 Update the watermark when truncating a CAgg

**Bugfixes**
* #6617 Fix error in show_chunks
* #6621 Remove metadata when dropping chunks
* #6677 Fix snapshot usage in CAgg invalidation scanner
* #6698 Define meaning of 0 retries for jobs as no retries
* #6717 Fix handling of compressed tables with primary or unique index in COPY path
* #6726 Fix constify cagg_watermark using window function when querying a CAgg
* #6729 Fix NULL start value handling in CAgg refresh
* #6732 Fix CAgg migration with custom timezone / date format settings
* #6752 Remove custom autovacuum setting from compressed chunks
* #6770 Fix plantime chunk exclusion for OSM chunk
* #6789 Fix deletes with subqueries and compression
* #6796 Fix a crash involving a view on a hypertable
* #6797 Fix foreign key constraint handling on compressed hypertables
* #6816 Fix handling of chunks with no constraints
* #6820 Fix a crash when the ts_hypertable_insert_blocker was called directly
* #6849 Use non-orderby compressed metadata in compressed DML
* #6867 Clean up compression settings when deleting compressed cagg
* #6869 Fix compressed DML with constraints of form value OP column
* #6870 Fix bool expression pushdown for queries on compressed chunks

**Thanks**
* @brasic for reporting a crash when the ts_hypertable_insert_blocker was called directly
* @bvanelli for reporting an issue with the jobs retry count
* @djzurawsk for reporting an error when dropping chunks
* @Dzuzepppe for reporting an issue with DELETEs using a subquery on compressed chunks working incorrectly
* @hongquan for reporting a 'timestamp out of range' error during CAgg migrations
* @kevcenteno for reporting an issue with the show_chunks API showing incorrect output when 'created_before/created_after' was used with time-partitioned columns
* @mahipv for starting work on the job history PR
* @rovo89 for reporting that constifying cagg_watermark did not work when using a window function to query a CAgg
2024-05-06 12:40:40 -03:00
Fabrízio de Royes Mello
d8057c4326 Fix policy refresh window too small
Adding a CAgg refresh policy with `start_offset => '2 months'` and
`end_offset => '0 months'` failed because the policy refresh window was
considered too small: it should cover at least two buckets in the valid
time range of timestamptz.

The problem was that when calculating the bucket width for a variable
bucket size (like months) we assumed 31 days per month, but when
converting the end offset to an integer using `interval_to_int64` we use
30 days per month. Fixed by aligning the variable bucket size
calculation to also use 30 days.
2024-05-06 03:50:50 -03:00
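The 31- vs 30-day mismatch described above can be sketched as follows (a minimal Python illustration of the arithmetic; the constants and the two-bucket check mirror the commit message and are not the actual C code):

```python
USECS_PER_DAY = 86_400 * 1_000_000

def month_width_usecs(months: int, days_per_month: int) -> int:
    """Approximate width of an N-month interval in microseconds."""
    return months * days_per_month * USECS_PER_DAY

# start_offset '2 months' converted the way interval_to_int64 does
# (30 days per month); end_offset '0 months' contributes nothing.
refresh_window = month_width_usecs(2, 30)

old_bucket = month_width_usecs(1, 31)  # old estimate: 31 days per month
new_bucket = month_width_usecs(1, 30)  # fixed estimate: 30 days per month

print(refresh_window >= 2 * old_bucket)  # False: 60 days < 62 days
print(refresh_window >= 2 * new_bucket)  # True: 60 days >= 60 days
```

With both conversions using 30 days per month, a 2-month window exactly covers two 1-month buckets and the policy is accepted.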
Fabrízio de Royes Mello
bc644f7965 Release notes header 2.15.0 2024-05-03 09:47:24 -03:00
Sven Klemm
e298ecd532 Don't reuse job id
We shouldn't reuse job ids, so that the log entries for a job remain
easy to recognize. We also need to keep the old job around to avoid
breaking the loading of dumps from older versions.
2024-05-03 09:05:57 +02:00
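The id-allocation policy described above can be modeled in a few lines (a hypothetical Python sketch; the class and starting id are illustrative, not TimescaleDB's catalog code):

```python
class JobRegistry:
    """Monotonically increasing job ids; deleted ids are never reused."""

    def __init__(self, first_id: int = 1000):
        self._next_id = first_id
        self.jobs: dict[int, dict] = {}

    def add_job(self, config: dict) -> int:
        job_id = self._next_id
        self._next_id += 1          # never decremented, even after deletes
        self.jobs[job_id] = config
        return job_id

    def delete_job(self, job_id: int) -> None:
        del self.jobs[job_id]       # the id stays retired

reg = JobRegistry()
a = reg.add_job({"proc": "refresh"})
reg.delete_job(a)
b = reg.add_job({"proc": "retention"})
print(b != a)  # True: history entries for job `a` stay unambiguous
```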
Fabrízio de Royes Mello
fc46eabf99 Introduce release notes header template
This PR introduces the release notes header template using Jinja [1].
It also improves the changelog merge script to include the upcoming
.unreleased/RELEASE_NOTES_HEADER.md.j2, where we'll actually write the
release notes header for the next release.
2024-04-30 09:44:32 -03:00
Sven Klemm
acc73b9d5b Fix bool expression pushdown for queries on compressed chunks
`NOT column` or `column = false` produce different expressions than
other boolean expressions, and these were not handled in the qual
pushdown code. This patch enables pushdown for these expressions and
also enables pushdown for OR expressions on compressed chunks.
2024-04-29 16:12:12 +02:00
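The special boolean forms can be normalized before pushdown roughly like this (an illustrative Python sketch with a made-up expression representation, not the actual planner code):

```python
def normalize_bool_qual(expr):
    """Rewrite boolean shorthand into an explicit (column, op, const) triple.

    Supported input shapes (tuples, hypothetical representation):
      ("var", col)                 -- WHERE boolcol
      ("not", col)                 -- WHERE NOT boolcol
      ("opexpr", col, op, const)   -- already explicit, e.g. col = false
    """
    kind = expr[0]
    if kind == "var":
        return (expr[1], "=", True)
    if kind == "not":
        return (expr[1], "=", False)
    if kind == "opexpr":
        return (expr[1], expr[2], expr[3])
    raise ValueError(f"unhandled qual shape: {kind}")

print(normalize_bool_qual(("not", "deleted")))  # ('deleted', '=', False)
```

Once every form is an explicit comparison, the same pushdown path can handle all of them.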
Sven Klemm
23a29631c4 Remove TimevalInfinity enum
This enum is not used in any codepath.
2024-04-29 14:31:54 +02:00
Sven Klemm
396dd0cb11 Fix compressed DML with constraints of form value OP column
UPDATE/DELETE operations with constraints where the column is on the
right side of the expression and the value on the left side, e.g.
`'a' > column`, were not handled correctly when operating on compressed
chunks.
2024-04-29 09:17:51 +02:00
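The usual fix for this class of bug is to commute the operator so the column always ends up on the left, as PostgreSQL's operator commutators allow. A hedged Python sketch (the representation is invented for illustration):

```python
# Each comparison operator and its commutator, as in the PostgreSQL catalog.
COMMUTATOR = {"<": ">", ">": "<", "<=": ">=", ">=": "<=", "=": "="}

def put_column_on_left(lhs, op, rhs, is_column):
    """If the column sits on the right ('a' > column), flip the comparison."""
    if is_column(rhs) and not is_column(lhs):
        return rhs, COMMUTATOR[op], lhs
    return lhs, op, rhs

is_col = lambda x: x.startswith("col:")
print(put_column_on_left("'a'", ">", "col:name", is_col))
# ('col:name', '<', "'a'")
```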
Sven Klemm
a7890e6411 Clean up compression settings when deleting compressed cagg
When deleting a CAgg whose materialization hypertable had compression
enabled, the compression settings for that hypertable were not removed
when dropping the CAgg.
2024-04-28 07:21:03 +02:00
Fabrízio de Royes Mello
183d309b2c Update the watermark when truncating a CAgg
In #5261 we cached the Continuous Aggregate watermark value in a
metadata table to improve performance by avoiding computing the
watermark at planning time.

Manual DML operations on a CAgg are not recommended; instead, the user
should use the `refresh_continuous_aggregate` procedure. But since we
handle `TRUNCATE` over CAggs by generating the necessary invalidation
logs, it makes sense to also update the watermark.
2024-04-26 16:57:10 -03:00
Jan Nidzwetzki
e4846c7648 Mark variable with PG_USED_FOR_ASSERTS_ONLY
The function continuous_agg_migrate_to_time_bucket contains a variable
that is used only for asserts. This PR marks this variable as
PG_USED_FOR_ASSERTS_ONLY.
2024-04-26 14:54:14 +02:00
Jan Nidzwetzki
f88899171f Add migration for CAggs using time_bucket_ng
The function time_bucket_ng is deprecated. This PR adds a migration path
for existing CAggs. Since time_bucket and time_bucket_ng use different
origin values, a custom origin is set if needed to let time_bucket
create the same buckets as created by time_bucket_ng so far.
2024-04-25 16:08:48 +02:00
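Why a custom origin matters can be seen with a simplified bucketing function (a Python sketch; the two default origins used below are assumptions for illustration, not taken from the source):

```python
from datetime import datetime, timedelta

def time_bucket(width: timedelta, ts: datetime, origin: datetime) -> datetime:
    """Fixed-width bucketing relative to an origin (simplified sketch)."""
    return origin + ((ts - origin) // width) * width

ts, week = datetime(2024, 4, 25), timedelta(days=7)
# Different origins produce differently aligned weekly buckets:
b1 = time_bucket(week, ts, datetime(2000, 1, 3))
b2 = time_bucket(week, ts, datetime(2000, 1, 1))
print(b1 != b2)  # True: the buckets start on different days
# Passing the other origin explicitly makes time_bucket reproduce b2:
print(time_bucket(week, ts, datetime(2000, 1, 1)) == b2)  # True
```

Setting the origin used by `time_bucket_ng` explicitly on the migrated CAgg keeps the existing bucket boundaries stable.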
Jan Nidzwetzki
d7c95279e2 Update changed homebrew path
The path to PostgreSQL has changed. This PR adjusts the needed
environment variable.
2024-04-25 10:55:47 +02:00
Jan Nidzwetzki
3ad948163c Remove MN leftover create distributed hypertable
This commit removes an unused function that was used in MN setups to
create distributed hypertables.
2024-04-25 10:55:27 +02:00
Fabrízio de Royes Mello
a41e5b8da7 Remove relation_oid duplicated function
We already have `ts_get_relation_relid`, which serves the same purpose
as `relation_oid`, so the latter is removed.
2024-04-25 03:26:07 -03:00
Sven Klemm
1072c73bc6 Remove pg12 version check in prove tests 2024-04-24 16:34:27 +02:00
Sven Klemm
1a5370643e Remove conditional for trusted in control file
All postgres versions supported by timescaledb support trusted
extensions, so we no longer need the conditional in the control file.
2024-04-24 15:29:03 +02:00
Sven Klemm
ae7a73abb0 Change error message for disabled feature flags
This intentionally doesn't follow the postgres guidelines for message
and detail, because by default users won't see the detail and we want
to present a clear message without requiring additional verbosity to be
enabled.
2024-04-24 15:26:47 +02:00
Jan Nidzwetzki
6b4e6c9c55 Remove unused ts_chunk_oid_cmp function
This PR removes the unused ts_chunk_oid_cmp function.
2024-04-24 11:24:02 +02:00
Jan Nidzwetzki
668e6a2bb9 Update outdated CAgg error hint
The CAgg error hint regarding adding indexes to non-finalized CAggs
proposed recreating the whole CAgg. However, pointing to the migration
function is the preferred method. This PR changes the error wording.
2024-04-24 09:15:32 +02:00
Fabrízio de Royes Mello
e91e591d0d Use proper macros to check for invalid objects
* INVALID_HYPERTABLE_ID
* INVALID_CHUNK_ID
* OidIsValid
2024-04-23 16:36:28 -03:00
Sven Klemm
1a3c2cbd17 Use non-orderby compressed metadata in compressed DML
Currently the additional metadata derived from index columns is
only used for qualifier pushdown during querying, but not for
decompression during compressed DML. This patch makes use of this
metadata for compressed DML as well.
This will lead to considerable speedup when deleting or updating
compressed chunks with filters on non-segmentby columns.
2024-04-23 15:20:55 +02:00
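The speedup comes from pruning compressed batches by their per-batch min/max metadata before decompressing, which can be sketched as (hypothetical Python model; column naming is illustrative):

```python
def batches_to_decompress(batches, column, value):
    """Only decompress batches whose metadata range can contain `value`."""
    lo, hi = f"{column}_min", f"{column}_max"
    return [b for b in batches if b[lo] <= value <= b[hi]]

batches = [
    {"device_min": 1,  "device_max": 9,  "data": "..."},
    {"device_min": 10, "device_max": 19, "data": "..."},
    {"device_min": 20, "device_max": 29, "data": "..."},
]
# A DELETE ... WHERE device = 15 now touches one batch instead of all three.
print(len(batches_to_decompress(batches, "device", 15)))  # 1
```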
James Guthrie
95b4e246eb Remove stray comment missed in 0b23bab4 2024-04-23 11:39:31 +01:00
Jan Nidzwetzki
8703497535 Fix handling of timezone attr in job scheduler
So far, we loaded the timezone information into the wrong memory
context when fetching a job from the database. As a result, the value
was stored in scratch_mctx and removed too early. This PR moves the
value into the desired memory context.
2024-04-23 08:39:23 +02:00
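The lifetime bug can be illustrated with a toy model of memory contexts (a Python analogy only; real PostgreSQL contexts are C arenas, and the names below are illustrative):

```python
class MemoryContext:
    """Toy model of a PostgreSQL memory context (illustrative only)."""

    def __init__(self, name):
        self.name, self.objects = name, {}

    def strdup(self, key, value):
        self.objects[key] = value
        return value

    def reset(self):
        self.objects.clear()        # everything allocated here goes away

scratch_mctx = MemoryContext("scratch")   # reset between tuples
result_mctx = MemoryContext("result")     # outlives the job fetch

# Bug: the timezone value was kept in scratch_mctx ...
scratch_mctx.strdup("timezone", "Europe/Berlin")
scratch_mctx.reset()                      # ... and vanished too early
print("timezone" in scratch_mctx.objects)  # False

# Fix: allocate it in the longer-lived context instead.
result_mctx.strdup("timezone", "Europe/Berlin")
scratch_mctx.reset()
print("timezone" in result_mctx.objects)   # True
```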
Fabrízio de Royes Mello
d55957abc5 Fix missing pid in job execution history
When `timescaledb.enable_job_execution_logging` is OFF we should track
only errors, but we were not saving the related PID.

Fixed it by using MyProcPid when inserting new tuple in
`_timescaledb_internal.bgw_job_stat_history`.
2024-04-22 17:44:49 -03:00
Fabrízio de Royes Mello
866ffba40e Migrate missing job info to history table
In #6767 and #6831 we introduced the ability to track job execution
history including succeeded and failed jobs.

We migrate records from the old `_timescaledb_internal.job_errors` to
the new `_timescaledb_internal.bgw_job_stat_history` table, but we
missed copying the job information into the JSONB field where we store
detailed information about the job execution.
2024-04-22 14:06:59 -03:00
Fabrízio de Royes Mello
0f1983ef78 Add ts_stat_statements callbacks
Currently we finish the execution of some process utility statements
without executing the other hooks in the chain.

For that reason, neither `ts_stat_statements` nor `pg_stat_statements`
is able to track some utility statements, for example COPY ... FROM.

To be able to track them in `ts_stat_statements`, we introduce some
callbacks in order to hook `pgss_store` from TimescaleDB and store
information about the execution of those statements.

This PR also adds a new GUC `enable_tss_callbacks=true` to
enable or disable the ability to hook `ts_stat_statements` from
TimescaleDB.
2024-04-19 17:32:51 -03:00
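The callback mechanism follows a standard hook-registry pattern, which can be sketched as (names and signatures below are illustrative Python, not the actual C API):

```python
# Minimal callback-registry sketch for notifying ts_stat_statements.
_tss_callbacks = []
enable_tss_callbacks = True   # models the new GUC

def register_tss_callback(fn):
    _tss_callbacks.append(fn)

def process_utility_finished(statement: str, rows: int):
    """Invoked where TimescaleDB finishes a utility statement itself."""
    if not enable_tss_callbacks:
        return
    for fn in _tss_callbacks:
        fn(statement, rows)   # e.g. forward into pgss_store

seen = []
register_tss_callback(lambda stmt, rows: seen.append((stmt, rows)))
process_utility_finished("COPY metrics FROM stdin", 1000)
print(seen)  # [('COPY metrics FROM stdin', 1000)]
```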
Fabrízio de Royes Mello
66c0702d3b Refactor job execution history table
In #6767 we introduced the ability to track job execution history
including succeeded and failed jobs.

The new metadata table `_timescaledb_internal.bgw_job_stat_history` has
two JSONB columns, `config` (storing config information) and
`error_data` (storing the ErrorData information). The problem is that
this approach is not flexible for future changes to history recording,
so this PR refactors the current implementation to use only one JSONB
column named `data` that will store more job information in the
following form:

{
  "job": {
    "owner": "fabrizio",
    "proc_name": "error",
    "scheduled": true,
    "max_retries": -1,
    "max_runtime": "00:00:00",
    "proc_schema": "public",
    "retry_period": "00:05:00",
    "initial_start": "00:05:00",
    "fixed_schedule": true,
    "schedule_interval": "00:00:30"
  },
  "config": {
    "bar": 1
  },
  "error_data": {
    "domain": "postgres-16",
    "lineno": 841,
    "context": "SQL statement \"SELECT 1/0\"\nPL/pgSQL function error(integer,jsonb) line 3 at PERFORM",
    "message": "division by zero",
    "filename": "int.c",
    "funcname": "int4div",
    "proc_name": "error",
    "sqlerrcode": "22012",
    "proc_schema": "public",
    "context_domain": "plpgsql-16"
  }
}
2024-04-19 09:19:23 -03:00
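Assembling the single `data` document can be sketched as follows (a hypothetical Python helper mirroring the JSON shape shown above, not the actual C implementation):

```python
import json

def build_history_data(job, config, error_data=None):
    """Assemble the single JSONB `data` document (hypothetical helper)."""
    data = {"job": job, "config": config}
    if error_data is not None:        # only present for failed runs
        data["error_data"] = error_data
    return json.dumps(data)

row = build_history_data(
    job={"owner": "fabrizio", "proc_name": "error", "scheduled": True},
    config={"bar": 1},
    error_data={"message": "division by zero", "sqlerrcode": "22012"},
)
print(sorted(json.loads(row)))  # ['config', 'error_data', 'job']
```

Keeping everything under one `data` column means future fields can be added without another schema migration.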
Jan Nidzwetzki
e02a473b57 Allow changing real-time flag on deprecated CAggs
PR #6798 prevents the usage of time_bucket_ng in CAgg definitions.
However, flipping the real-time functionality should still be possible.
2024-04-18 21:50:18 +02:00
Alexander Kuzmenkov
069ccbebc4
Correctly determine relation type of OSM chunk as standalone table (#6840)
Previously we only handled the case of an OSM chunk expanded as a child
of a hypertable, so in the case of a direct select it segfaulted while
trying to access an fdw_private which is managed by OSM.
2024-04-18 18:55:11 +00:00
Konstantina Skovola
f4ec0f4ada Fix plantime chunk exclusion for OSM chunk 2024-04-18 18:12:14 +02:00
Jan Nidzwetzki
67bd5a820b Remove CAgg watermark cache
Since #6325 we constify the watermark value of a CAgg during planning,
so the planner calls the watermark function only once. This PR removes
the old code that cached the watermark value to speed up repeated calls
of the watermark function, as it is no longer needed.
2024-04-17 08:02:47 +02:00
gayyappan
07bf49a174 Modify hypertable catalog update API calls
Use the same logic as PR #6773 while updating hypertable catalog
tuples. PR #6773 addresses chunk catalog updates. We first lock the
tuple and then modify the values and update the locked tuple. Replace
ts_hypertable_update with field-specific APIs and use
hypertable_update_catalog_tuple calls consistently.
2024-04-16 13:24:47 -04:00
Sven Klemm
ecf6beae5d Fix FK constraints for compressed chunks
When foreign key support for compressed chunks was added, we moved
the FK constraint from the uncompressed chunk to the compressed chunk
as part of compress_chunk and moved it back as part of decompress_chunk.
With the addition of partially compressed chunks in 2.10.x this approach
was no longer sufficient, and the FK constraint needs to be present on
both the uncompressed and the compressed chunk.

While this patch fixes future compressed chunks, a migration has to be
run after upgrading timescaledb to repair existing chunks affected by
this.

The following code will fix any affected hypertables:
```
CREATE OR REPLACE FUNCTION pg_temp.constraint_columns(regclass, int2[]) RETURNS text[] AS
$$
  SELECT array_agg(attname) FROM unnest($2) un(attnum) LEFT JOIN pg_attribute att ON att.attrelid=$1 AND att.attnum = un.attnum;
$$ LANGUAGE SQL SET search_path TO pg_catalog, pg_temp;

DO $$
DECLARE
  ht_id int;
  ht regclass;
  chunk regclass;
  con_oid oid;
  con_frelid regclass;
  con_name text;
  con_columns text[];
  chunk_id int;

BEGIN

  -- iterate over all hypertables that have foreign key constraints
  FOR ht_id, ht in
    SELECT
      ht.id,
      format('%I.%I',ht.schema_name,ht.table_name)::regclass
    FROM _timescaledb_catalog.hypertable ht
    WHERE
      EXISTS (
        SELECT FROM pg_constraint con
        WHERE
          con.contype='f' AND
          con.conrelid=format('%I.%I',ht.schema_name,ht.table_name)::regclass
      )
  LOOP
    RAISE NOTICE 'Hypertable % has foreign key constraint', ht;

    -- iterate over all foreign key constraints on the hypertable
    -- and check that they are present on every chunk
    FOR con_oid, con_frelid, con_name, con_columns IN
      SELECT con.oid, con.confrelid, con.conname, pg_temp.constraint_columns(con.conrelid,con.conkey)
      FROM pg_constraint con
      WHERE
        con.contype='f' AND
        con.conrelid=ht
    LOOP
      RAISE NOTICE 'Checking constraint % %', con_name, con_columns;
      -- check that the foreign key constraint is present on the chunk

      FOR chunk_id, chunk IN
        SELECT
          ch.id,
          format('%I.%I',ch.schema_name,ch.table_name)::regclass
        FROM _timescaledb_catalog.chunk ch
        WHERE
          ch.hypertable_id=ht_id
      LOOP
        RAISE NOTICE 'Checking chunk %', chunk;
        IF NOT EXISTS (
          SELECT FROM pg_constraint con
          WHERE
            con.contype='f' AND
            con.conrelid=chunk AND
            con.confrelid=con_frelid  AND
            pg_temp.constraint_columns(con.conrelid,con.conkey) = con_columns
        ) THEN
          RAISE WARNING 'Restoring constraint % on chunk %', con_name, chunk;
          PERFORM _timescaledb_functions.constraint_clone(con_oid, chunk);
          INSERT INTO _timescaledb_catalog.chunk_constraint(chunk_id, dimension_slice_id, constraint_name, hypertable_constraint_name) VALUES (chunk_id, NULL, con_name, con_name);
        END IF;

      END LOOP;
    END LOOP;

  END LOOP;

END
$$;

DROP FUNCTION pg_temp.constraint_columns(regclass, int2[]);
```
2024-04-13 16:28:46 +02:00
Sven Klemm
f6b2215560 Bump pgspot version used in CI 2024-04-12 12:50:54 +02:00
Alexander Kuzmenkov
4bd73f5043
Fix backport script when PR references an issue from another repo (#6824)
We were always looking at the referenced issue number in the timescaledb
repo, which is incorrect.
2024-04-12 09:23:01 +00:00
Alexander Kuzmenkov
4420561485
Treat segmentby columns same as compressed columns with default value (#6817)
This is a minor refactoring that will later allow simplifying the
vectorized aggregation code. No functional or performance changes are
expected.
2024-04-12 09:10:16 +00:00
Alexander Kuzmenkov
6e7b6e9a6e
vectorized aggregation as separate plan node (#6784)
This PR is a little too big, but it proved difficult to split into parts
because they are all dependent.

* Move the vectorized aggregation into a separate plan node, which
simplifies working with targetlist in DecompressChunk node.

* Add a post-planning hook that replaces the normal partial aggregation
node with the vectorized aggregation node. The advantage of this
compared to planning on Path stage is that we know which columns support
bulk decompression and which filters are vectorized.

* Use the compressed batch API in vectorized aggregation. This
simplifies the code.

* Support vectorized aggregation after vectorized filters.

* Add a simple generic interface for vectorized aggregate functions. For
now the only function is still `sum(int4)`.

* The parallel plans are now used more often, maybe because the old code
didn't add costs for aggregation and just used the costs from
DecompressChunk, so the costs of parallel plans were less different. The
current code does cost-based planning for normal aggregates and then,
after planning, replaces them with vectorized ones, so we now basically
follow the plan choice that Postgres makes for the usual aggregation.
2024-04-11 17:15:26 +00:00
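The core of a vectorized partial aggregate over one decompressed batch can be sketched as (a simplified Python model; the real implementation operates on arrow-style arrays in C):

```python
def vectorized_sum_int4(values, validity, filter_result):
    """Partial sum over one decompressed batch (simplified sketch).

    `validity` marks non-NULL rows; `filter_result` is the bitmap left
    behind by the vectorized WHERE-clause filters.
    """
    total, count = 0, 0
    for v, ok, keep in zip(values, validity, filter_result):
        if ok and keep:
            total += v
            count += 1
    return total, count

vals = [10, 20, 30, 40]
print(vectorized_sum_int4(vals, [1, 1, 0, 1], [1, 0, 1, 1]))  # (50, 2)
```

Running this per batch and combining the partial results is what the separate plan node does after replacing the normal partial aggregation node.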
Alexander Kuzmenkov
610db31241
Don't copy compressed slot to compressed batch struct (#6806)
There is overhead associated with copying the heap tuple and (un)pinning
the respective heap buffers, which becomes apparent in vectorized
aggregation.

Instead of this, it is enough to copy the by-reference segmentby values
to the per-batch context.

Also we have to copy in the rare case where the compressed data is
inlined into the compressed row and not toasted.
2024-04-11 11:55:49 +00:00
Mats Kindahl
971e6c370e Use ignore words file with codespell
It is not possible to automatically backport pull requests that make
modifications to workflow files, and since the codespell action has a
hard-coded list of ignored words as options, any changes to the ignored
codespell words cannot be backported.

This pull request fixes this by using an ignore words file instead,
which means that adding new words does not require changing a workflow
file, and hence the pull requests can be automatically backported.
2024-04-11 10:51:51 +02:00
Jan Nidzwetzki
8347621016 Check for trigger context before accessing data
The ts_hypertable_insert_blocker function was accessing data from the
trigger context before it was tested that a trigger context actually
exists. This led to a crash when the function was called directly.

Fixes: #6819
2024-04-11 08:36:14 +02:00
Jan Nidzwetzki
c4ebdf6b4f Fix handling of chunks with no constraints
When catalog corruption occurs and a chunk does not contain any
dimension slices, we crash in ts_dimension_slice_cmp(). This patch adds
a proper check and errors out before that code path is reached.
2024-04-10 22:59:13 +02:00
Mats Kindahl
ea2284386b Add telemetry for access methods
Add telemetry for tracking access methods used, number of pages for
each access method, and number of instances using each access method.

Also introduces a type-based function `ts_jsonb_set_value_by_type` that
can generate correct JSONB based on the PostgreSQL type. It will
generate "bare" values for numerics, and strings for anything else
using the output function for the type.

To test this for string values, we update `ts_jsonb_add_interval` to
use this new function, which is calling the output function for the
type, just like `ts_jsonb_set_value_by_type`.
2024-04-10 15:13:29 +02:00
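The type-based JSONB generation described above can be sketched like this (a Python approximation; the type list and helper name follow the commit message, the rest is an assumption):

```python
import json

NUMERIC_TYPES = {"int2", "int4", "int8", "float4", "float8", "numeric"}

def jsonb_set_value_by_type(obj, key, text_value, pg_type):
    """Emit bare JSON numbers for numeric types, strings otherwise (sketch)."""
    if pg_type in NUMERIC_TYPES:
        obj[key] = json.loads(text_value)   # "128" -> 128, "1.5" -> 1.5
    else:
        obj[key] = text_value               # e.g. an interval's output text

doc = {}
jsonb_set_value_by_type(doc, "relpages", "128", "int4")
jsonb_set_value_by_type(doc, "interval", "00:05:00", "interval")
print(json.dumps(doc, sort_keys=True))
# {"interval": "00:05:00", "relpages": 128}
```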
Ante Kresic
7ffdd0716c Reduce locking on decompress_chunk to allow reads
With a recent change, we updated the lock taken by decompress_chunk
to an AccessExclusiveLock on the uncompressed chunk at the start of
this potentially long-running operation. Reducing this lock to
ExclusiveLock enables reads to execute while we are decompressing the
chunk. An AccessExclusiveLock will still be taken on the compressed
chunk at the end of the operation, during its removal.
2024-04-09 16:10:16 +02:00
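Why ExclusiveLock lets reads through while AccessExclusiveLock does not follows directly from PostgreSQL's table-lock conflict matrix, a relevant subset of which is encoded below (per the PostgreSQL documentation; the helper function is illustrative):

```python
# Subset of PostgreSQL's table-lock conflict matrix.
CONFLICTS = {
    "ExclusiveLock": {
        "RowShareLock", "RowExclusiveLock", "ShareUpdateExclusiveLock",
        "ShareLock", "ShareRowExclusiveLock", "ExclusiveLock",
        "AccessExclusiveLock",
    },
    "AccessExclusiveLock": {
        "AccessShareLock", "RowShareLock", "RowExclusiveLock",
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
}

def blocks_plain_selects(held_lock: str) -> bool:
    """SELECT takes AccessShareLock; does the held lock conflict with it?"""
    return "AccessShareLock" in CONFLICTS[held_lock]

print(blocks_plain_selects("AccessExclusiveLock"))  # True: reads blocked
print(blocks_plain_selects("ExclusiveLock"))        # False: reads proceed
```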
Jan Nidzwetzki
626975f09f Prevent usage of time_bucket_ng in CAgg definition
The function timescaledb_experimental.time_bucket_ng() has been
deprecated for two years. This PR removes it from the list of bucketing
functions supported in a CAgg. Existing CAggs using this function will
still be supported; however, no new CAggs using this function can be
created.
2024-04-09 14:55:31 +02:00
Alexander Kuzmenkov
2a30ca428d
Remove restrict from const objects (#6791)
We don't really need it if we systematically use restrict on the
read/write objects.

This is a minor refactoring to avoid confusion, shouldn't actually
change any behavior or code generation.
2024-04-09 12:48:24 +00:00