4685 Commits

Author SHA1 Message Date
Sven Klemm
e298ecd532 Don't reuse job id
We shouldn't reuse job ids, to make it easy to recognize the log
entries for a job. We also need to keep the old job around so as not
to break loading dumps from older versions.
2024-05-03 09:05:57 +02:00
Fabrízio de Royes Mello
fc46eabf99 Introduce release notes header template
This PR introduces the release notes header template using Jinja [1].
It also improves the script that merges changelogs to include the upcoming
.unreleased/RELEASE_NOTES_HEADER.md.j2, where we'll actually write the
release notes header for the next release.
2024-04-30 09:44:32 -03:00
Sven Klemm
acc73b9d5b Fix bool expression pushdown for queries on compressed chunks
`NOT column` or `column = false` result in different expression trees
than other boolean expressions, and these were not handled in the qual
pushdown code. This patch enables pushdown for these expressions and also
enables pushdown for OR expressions on compressed chunks.
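A short sketch of query shapes whose quals can now be pushed down
(table and column names are illustrative):
```
SELECT * FROM metrics WHERE NOT is_valid;                    -- NOT column
SELECT * FROM metrics WHERE is_valid = false;                -- column = false
SELECT * FROM metrics WHERE device_id = 1 OR device_id = 2;  -- OR expression
```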
2024-04-29 16:12:12 +02:00
Sven Klemm
23a29631c4 Remove TimevalInfinity enum
This enum is not used in any code path.
2024-04-29 14:31:54 +02:00
Sven Klemm
396dd0cb11 Fix compressed DML with constraints of form value OP column
UPDATE/DELETE operations with constraints where the column is on the
right side of the expression and the value on the left side, e.g.
`'a' > column`, were not handled correctly when operating on compressed chunks.
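For illustration, constraints of this shape (names assumed):
```
DELETE FROM metrics WHERE 'a' > label;
UPDATE metrics SET value = 0 WHERE 100 > value;
```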
2024-04-29 09:17:51 +02:00
Sven Klemm
a7890e6411 Clean up compression settings when deleting compressed cagg
When dropping a cagg with compression enabled on the materialization
hypertable, the compression settings for that hypertable would not get
removed along with the cagg.
2024-04-28 07:21:03 +02:00
Fabrízio de Royes Mello
183d309b2c Update the watermark when truncating a CAgg
In #5261 we cached the Continuous Aggregate watermark value in a
metadata table to improve performance by avoiding computing the
watermark at planning time.

Manual DML operations on a CAgg are not recommended; the user should
instead use the `refresh_continuous_aggregate` procedure. But since we
handle `TRUNCATE` over CAggs by generating the necessary invalidation
logs, it makes sense to also update the watermark.
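A minimal sketch of both paths (view name illustrative):
```
-- Recommended way to change the contents of a CAgg
CALL refresh_continuous_aggregate('conditions_summary', NULL, NULL);
-- TRUNCATE writes invalidation logs and now also updates the watermark
TRUNCATE conditions_summary;
```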
2024-04-26 16:57:10 -03:00
Jan Nidzwetzki
e4846c7648 Mark variable with PG_USED_FOR_ASSERTS_ONLY
The function continuous_agg_migrate_to_time_bucket contains a variable
that is used only for asserts. This PR marks this variable as
PG_USED_FOR_ASSERTS_ONLY.
2024-04-26 14:54:14 +02:00
Jan Nidzwetzki
f88899171f Add migration for CAggs using time_bucket_ng
The function time_bucket_ng is deprecated. This PR adds a migration path
for existing CAggs. Since time_bucket and time_bucket_ng use different
origin values, a custom origin is set where needed so that time_bucket
creates the same buckets time_bucket_ng has created so far.
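A sketch of the idea (dates shown for illustration): time_bucket
aligns day/week buckets to Monday 2000-01-03 by default, while
time_bucket_ng aligned to 2000-01-01, so an explicit origin bridges
the difference:
```
SELECT time_bucket('7 days', ts, origin => '2000-01-01'::timestamptz)
FROM (VALUES (timestamptz '2024-04-25')) t(ts);
```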
2024-04-25 16:08:48 +02:00
Jan Nidzwetzki
d7c95279e2 Update changed homebrew path
The path to PostgreSQL has changed. This PR adjusts the needed
environment variable.
2024-04-25 10:55:47 +02:00
Jan Nidzwetzki
3ad948163c Remove MN leftover create distributed hypertable
This commit removes an unused function that was used in MN setups to
create distributed hypertables.
2024-04-25 10:55:27 +02:00
Fabrízio de Royes Mello
a41e5b8da7 Remove relation_oid duplicated function
We already have `ts_get_relation_relid`, which serves the same purpose
as `relation_oid`, so the latter is removed.
2024-04-25 03:26:07 -03:00
Sven Klemm
1072c73bc6 Remove pg12 version check in prove tests 2024-04-24 16:34:27 +02:00
Sven Klemm
1a5370643e Remove conditional for trusted in control file
All postgres versions supported by timescaledb support trusted
extensions, so we no longer need the conditional in the control file.
2024-04-24 15:29:03 +02:00
Sven Klemm
ae7a73abb0 Change error message for disabled feature flags
This intentionally doesn't follow the postgres guidelines for message and
detail because, by default, users won't see the detail and we want to
present a clear message without requiring additional verbosity to be enabled.
2024-04-24 15:26:47 +02:00
Jan Nidzwetzki
6b4e6c9c55 Remove unused ts_chunk_oid_cmp function
This PR removes the unused ts_chunk_oid_cmp function.
2024-04-24 11:24:02 +02:00
Jan Nidzwetzki
668e6a2bb9 Update outdated CAgg error hint
The CAgg error hint about adding indexes to non-finalized CAggs
proposed recreating the whole CAgg. However, pointing to the migration
function should be the preferred method, so this PR changes the error
wording.
2024-04-24 09:15:32 +02:00
Fabrízio de Royes Mello
e91e591d0d Use proper macros to check for invalid objects
* INVALID_HYPERTABLE_ID
* INVALID_CHUNK_ID
* OidIsValid
2024-04-23 16:36:28 -03:00
Sven Klemm
1a3c2cbd17 Use non-orderby compressed metadata in compressed DML
Currently the additional metadata derived from index columns is
only used for qualifier pushdown when querying, but not for
decompression during compressed DML. This patch makes use of this
metadata for compressed DML as well.
This will lead to considerable speedup when deleting or updating
compressed chunks with filters on non-segmentby columns.
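A sketch of a statement that benefits (schema illustrative): with
min/max metadata on a non-segmentby column such as ts, batches whose
ranges cannot match the filter are skipped during decompression:
```
DELETE FROM metrics WHERE ts < '2024-01-01';
```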
2024-04-23 15:20:55 +02:00
James Guthrie
95b4e246eb Remove stray comment missed in 0b23bab4 2024-04-23 11:39:31 +01:00
Jan Nidzwetzki
8703497535 Fix handling of timezone attr in job scheduler
So far, we did not load the timezone information into the correct
memory context when fetching a job from the database. The value was
therefore stored in scratch_mctx and removed too early. This PR moves
the value into the desired memory context.
2024-04-23 08:39:23 +02:00
Fabrízio de Royes Mello
d55957abc5 Fix missing pid in job execution history
When `timescaledb.enable_job_execution_logging` is OFF we should
track only errors, but we were not saving the related PID.

Fixed by using MyProcPid when inserting a new tuple into
`_timescaledb_internal.bgw_job_stat_history`.
2024-04-22 17:44:49 -03:00
Fabrízio de Royes Mello
866ffba40e Migrate missing job info to history table
In #6767 and #6831 we introduced the ability to track job execution
history including succeeded and failed jobs.

We migrate records from the old `_timescaledb_internal.job_errors` to
the new `_timescaledb_internal.bgw_job_stat_history` table, but we
missed copying the job information into the JSONB field where we store
detailed information about the job execution.
2024-04-22 14:06:59 -03:00
Fabrízio de Royes Mello
0f1983ef78 Add ts_stat_statements callbacks
Currently we finish the execution of some process utility statements
without executing the other hooks in the chain.

For that reason neither `ts_stat_statements` nor `pg_stat_statements`
is able to track some utility statements, for example COPY ... FROM.

To be able to track them in `ts_stat_statements`, we're introducing
some callbacks in order to hook `pgss_store` from TimescaleDB and store
information about the execution of those statements.

In this PR we're also adding a new GUC `enable_tss_callbacks=true` to
enable or disable the ability to hook `ts_stat_statements` from
TimescaleDB.
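A minimal sketch of toggling the new GUC (assuming it is exposed
under the timescaledb. prefix like other extension GUCs and is
settable per session):
```
SET timescaledb.enable_tss_callbacks = off;
```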
2024-04-19 17:32:51 -03:00
Fabrízio de Royes Mello
66c0702d3b Refactor job execution history table
In #6767 we introduced the ability to track job execution history
including succeeded and failed jobs.

The new metadata table `_timescaledb_internal.bgw_job_stat_history` has
two JSONB columns: `config` (stores config information) and `error_data`
(stores the ErrorData information). The problem is that this approach is
not flexible for future history-recording changes, so this PR refactors
the current implementation to use only one JSONB column named `data`
that stores more job information in the following form:

```
{
  "job": {
    "owner": "fabrizio",
    "proc_name": "error",
    "scheduled": true,
    "max_retries": -1,
    "max_runtime": "00:00:00",
    "proc_schema": "public",
    "retry_period": "00:05:00",
    "initial_start": "00:05:00",
    "fixed_schedule": true,
    "schedule_interval": "00:00:30"
  },
  "config": {
    "bar": 1
  },
  "error_data": {
    "domain": "postgres-16",
    "lineno": 841,
    "context": "SQL statement \"SELECT 1/0\"\nPL/pgSQL function error(integer,jsonb) line 3 at PERFORM",
    "message": "division by zero",
    "filename": "int.c",
    "funcname": "int4div",
    "proc_name": "error",
    "sqlerrcode": "22012",
    "proc_schema": "public",
    "context_domain": "plpgsql-16"
  }
}
```
2024-04-19 09:19:23 -03:00
Jan Nidzwetzki
e02a473b57 Allow changing real-time flag on deprecated CAggs
PR #6798 prevents the usage of time_bucket_ng in CAgg definitions.
However, flipping the real-time functionality should still be possible.
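A sketch of flipping the flag (view name illustrative):
```
ALTER MATERIALIZED VIEW conditions_summary
  SET (timescaledb.materialized_only = false);
```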
2024-04-18 21:50:18 +02:00
Alexander Kuzmenkov
069ccbebc4
Correctly determine relation type of OSM chunk as standalone table (#6840)
Previously we only handled the case of an OSM chunk expanded as a child
of a hypertable, so a direct SELECT from the chunk segfaulted while
trying to access fdw_private, which is managed by OSM.
2024-04-18 18:55:11 +00:00
Konstantina Skovola
f4ec0f4ada Fix plantime chunk exclusion for OSM chunk 2024-04-18 18:12:14 +02:00
Jan Nidzwetzki
67bd5a820b Remove CAgg watermark cache
Since #6325 we constify the watermark value of a CAgg at planning
time, so the planner calls the watermark function only once.
This PR removes the old code that cached the watermark value to speed
up multiple calls of the watermark function.
2024-04-17 08:02:47 +02:00
gayyappan
07bf49a174 Modify hypertable catalog update API calls
Use the same logic as PR 6773 when updating hypertable catalog tuples.
PR 6773 addressed chunk catalog updates: we first lock the tuple, then
modify the values and update the locked tuple. Replace
ts_hypertable_update with field-specific APIs and use
hypertable_update_catalog_tuple calls consistently.
2024-04-16 13:24:47 -04:00
Sven Klemm
ecf6beae5d Fix FK constraints for compressed chunks
When foreign key support for compressed chunks was added we moved
the FK constraint from the uncompressed chunk to the compressed chunk as
part of compress_chunk and moved it back as part of decompress_chunk.
With the addition of partially compressed chunks in 2.10.x this approach
was no longer sufficient and the FK constraint needs to be present on
both the uncompressed and the compressed chunk.

While this patch fixes future compressed chunks, a migration has to be
run after upgrading timescaledb to repair existing chunks affected by
this.

The following code will fix any affected hypertables:
```
CREATE OR REPLACE FUNCTION pg_temp.constraint_columns(regclass, int2[]) RETURNS text[] AS
$$
  SELECT array_agg(attname) FROM unnest($2) un(attnum) LEFT JOIN pg_attribute att ON att.attrelid=$1 AND att.attnum = un.attnum;
$$ LANGUAGE SQL SET search_path TO pg_catalog, pg_temp;

DO $$
DECLARE
  ht_id int;
  ht regclass;
  chunk regclass;
  con_oid oid;
  con_frelid regclass;
  con_name text;
  con_columns text[];
  chunk_id int;

BEGIN

  -- iterate over all hypertables that have foreign key constraints
  FOR ht_id, ht in
    SELECT
      ht.id,
      format('%I.%I',ht.schema_name,ht.table_name)::regclass
    FROM _timescaledb_catalog.hypertable ht
    WHERE
      EXISTS (
        SELECT FROM pg_constraint con
        WHERE
          con.contype='f' AND
          con.conrelid=format('%I.%I',ht.schema_name,ht.table_name)::regclass
      )
  LOOP
    RAISE NOTICE 'Hypertable % has foreign key constraint', ht;

    -- iterate over all foreign key constraints on the hypertable
    -- and check that they are present on every chunk
    FOR con_oid, con_frelid, con_name, con_columns IN
      SELECT con.oid, con.confrelid, con.conname, pg_temp.constraint_columns(con.conrelid,con.conkey)
      FROM pg_constraint con
      WHERE
        con.contype='f' AND
        con.conrelid=ht
    LOOP
      RAISE NOTICE 'Checking constraint % %', con_name, con_columns;
      -- check that the foreign key constraint is present on the chunk

      FOR chunk_id, chunk IN
        SELECT
          ch.id,
          format('%I.%I',ch.schema_name,ch.table_name)::regclass
        FROM _timescaledb_catalog.chunk ch
        WHERE
          ch.hypertable_id=ht_id
      LOOP
        RAISE NOTICE 'Checking chunk %', chunk;
        IF NOT EXISTS (
          SELECT FROM pg_constraint con
          WHERE
            con.contype='f' AND
            con.conrelid=chunk AND
            con.confrelid=con_frelid  AND
            pg_temp.constraint_columns(con.conrelid,con.conkey) = con_columns
        ) THEN
          RAISE WARNING 'Restoring constraint % on chunk %', con_name, chunk;
          PERFORM _timescaledb_functions.constraint_clone(con_oid, chunk);
          INSERT INTO _timescaledb_catalog.chunk_constraint(chunk_id, dimension_slice_id, constraint_name, hypertable_constraint_name) VALUES (chunk_id, NULL, con_name, con_name);
        END IF;

      END LOOP;
    END LOOP;

  END LOOP;

END
$$;

DROP FUNCTION pg_temp.constraint_columns(regclass, int2[]);
```
2024-04-13 16:28:46 +02:00
Sven Klemm
f6b2215560 Bump pgspot version used in CI 2024-04-12 12:50:54 +02:00
Alexander Kuzmenkov
4bd73f5043
Fix backport script when PR references an issue from another repo (#6824)
We were always looking at the referenced issue number in the timescaledb
repo, which is incorrect.
2024-04-12 09:23:01 +00:00
Alexander Kuzmenkov
4420561485
Treat segmentby columns same as compressed columns with default value (#6817)
This is a minor refactoring that will later allow to simplify the
vectorized aggregation code. No functional or performance changes are
expected.
2024-04-12 09:10:16 +00:00
Alexander Kuzmenkov
6e7b6e9a6e
vectorized aggregation as separate plan node (#6784)
This PR is a little too big, but it proved difficult to split into parts
because they are all dependent.

* Move the vectorized aggregation into a separate plan node, which
simplifies working with targetlist in DecompressChunk node.

* Add a post-planning hook that replaces the normal partial aggregation
node with the vectorized aggregation node. The advantage of this
compared to planning on Path stage is that we know which columns support
bulk decompression and which filters are vectorized.

* Use the compressed batch API in vectorized aggregation. This
simplifies the code.

* Support vectorized aggregation after vectorized filters.

* Add a simple generic interface for vectorized aggregate functions. For
now the only function is still `sum(int4)`.

* The parallel plans are now used more often, maybe because the old code
didn't add costs for aggregation and just used the costs from
DecompressChunk, so the costs of parallel plans differed less. The
current code does cost-based planning for normal aggregates and then
replaces them with vectorized ones after planning, so we now basically
follow the plan choice that Postgres makes for the usual aggregation.
2024-04-11 17:15:26 +00:00
Alexander Kuzmenkov
610db31241
Don't copy compressed slot to compressed batch struct (#6806)
There is overhead associated with copying the heap tuple and (un)pinning
the respective heap buffers, which becomes apparent in vectorized
aggregation.

Instead of this, it is enough to copy the by-reference segmentby values
to the per-batch context.

We also still have to copy in the rare case where the compressed data
is inlined into the compressed row rather than toasted.
2024-04-11 11:55:49 +00:00
Mats Kindahl
971e6c370e Use ignore words file with codespell
It is not possible to automatically backport pull requests that make
modifications to workflow files, and since the codespell action has a
hard-coded list of ignored words as options, any changes to the
ignored codespell words cannot be backported.

This pull request fixes this by using an ignore-words file instead,
which means that adding new words does not require changing a workflow
file, so the pull requests can be automatically backported.
2024-04-11 10:51:51 +02:00
Jan Nidzwetzki
8347621016 Check for trigger context before accessing data
The ts_hypertable_insert_blocker function was accessing data from the
trigger context before testing that a trigger context actually
exists. This led to a crash when the function was called directly.

Fixes: #6819
2024-04-11 08:36:14 +02:00
Jan Nidzwetzki
c4ebdf6b4f Fix handling of chunks with no constraints
When catalog corruption occurs and a chunk does not contain any
dimension slices, we crash in ts_dimension_slice_cmp(). This patch adds
a proper check and errors out before that code path is reached.
2024-04-10 22:59:13 +02:00
Mats Kindahl
ea2284386b Add telemetry for access methods
Add telemetry for tracking access methods used, number of pages for
each access method, and number of instances using each access method.

Also introduces a type-based function `ts_jsonb_set_value_by_type` that
can generate correct JSONB based on the PostgreSQL type. It will
generate "bare" values for numerics, and strings for anything else
using the output function for the type.

To test this for string values, we update `ts_jsonb_add_interval` to
use this new function, which calls the output function for the type,
just like `ts_jsonb_set_value_by_type`.
2024-04-10 15:13:29 +02:00
Ante Kresic
7ffdd0716c Reduce locking on decompress_chunk to allow reads
With a recent change, we updated the lock taken by decompress_chunk
to an AccessExclusiveLock on the uncompressed chunk at the start of
this potentially long-running operation. Reducing this lock to
ExclusiveLock enables reads to execute while we are decompressing
the chunk. An AccessExclusive lock is still taken on the compressed
chunk at the end of the operation, during its removal.
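A sketch of the operation whose lock was relaxed; concurrent SELECTs
on the uncompressed chunk can now proceed while it runs (hypertable
name illustrative):
```
SELECT decompress_chunk(c, true) FROM show_chunks('metrics') c;
```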
2024-04-09 16:10:16 +02:00
Jan Nidzwetzki
626975f09f Prevent usage of time_bucket_ng in CAgg definition
The function timescaledb_experimental.time_bucket_ng() has been
deprecated for two years. This PR removes it from the list of bucketing
functions supported in a CAgg. Existing CAggs using this function will
still be supported; however, no new CAggs using this function can be
created.
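For illustration, a definition like the following now errors out at
creation time (schema assumed):
```
CREATE MATERIALIZED VIEW monthly_ng
WITH (timescaledb.continuous) AS
SELECT timescaledb_experimental.time_bucket_ng('1 month', day) AS bucket,
       count(*)
FROM conditions
GROUP BY bucket;
```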
2024-04-09 14:55:31 +02:00
Alexander Kuzmenkov
2a30ca428d
Remove restrict from const objects (#6791)
We don't really need it if we systematically use restrict on the
read/write objects.

This is a minor refactoring to avoid confusion; it shouldn't actually
change any behavior or code generation.
2024-04-09 12:48:24 +00:00
Jan Nidzwetzki
25af8f4741 Improve found attribute handling in release builds
The found variable is not used in release builds, where Asserts are not
evaluated. This leads to unused variables and compiler errors. This PR
fixes the problem by adding PG_USED_FOR_ASSERTS_ONLY to the variable
declaration.
2024-04-09 13:31:13 +02:00
Jan Nidzwetzki
9b6175abe1 Remove allow_install_without_preload GUC
The GUC allow_install_without_preload allowed installing the
extension without the preloaded loader in place. However, this way of
installing TimescaleDB is no longer supported since we request shared
memory and rendezvous variables in the loader. This PR removes this
deprecated way of installing TimescaleDB.
2024-04-09 10:46:33 +02:00
Jan Nidzwetzki
a54a280980 Add ensure calls to CAgg validation function
The CAgg validation logic is quite complex. This PR adds two ensure()
statements to the validation function to make it clearer (e.g., also
for static code analyzers) that we expect a valid bucket function.
2024-04-08 09:00:16 +02:00
Nikhil Sontakke
20f422b26a Remove broken_tables test
This test file was created to handle repairing hypertables that
had broken related metadata in the dimension_slice catalog tables.
The test probably does not make sense today given that we have more
robust referential integrity in our catalog tables, so remove it.
2024-04-08 11:04:28 +05:30
Jan Nidzwetzki
8d9b06294e Support for CAgg with origin/offset parameter
So far, we allowed only CAggs without origin or offset parameters in the
time_bucket definition. This commit adds support for the remaining
time_bucket variants.
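A sketch of a now-supported variant (schema illustrative):
```
CREATE MATERIALIZED VIEW conditions_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', ts, origin => '2000-01-01'::timestamptz) AS bucket,
       avg(temperature) AS avg_temp
FROM conditions
GROUP BY bucket;
```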

Fixes #2265, Fixes #5453, Fixes #5828
2024-04-05 10:30:57 +02:00
Fabrízio de Royes Mello
52094a3103 Track job execution history
In #4678 we added an interface for troubleshooting job failures by
logging them in the metadata table `_timescaledb_internal.job_errors`.

With this PR we extend the existing interface to also store succeeded
executions. A new GUC named `timescaledb.enable_job_execution_logging`
was added to control this new behavior; its default value is `false`.

We renamed the metadata table to `_timescaledb_internal.bgw_job_stat_history`
and added a new view `timescaledb_information.job_history` so that users
with enough permissions can check the job execution history.
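A minimal sketch of enabling the feature and reading the history
(the GUC may need to be set at the system level depending on its
context):
```
ALTER SYSTEM SET timescaledb.enable_job_execution_logging = true;
SELECT pg_reload_conf();
SELECT * FROM timescaledb_information.job_history;
```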
2024-04-04 10:39:28 -03:00
Mats Kindahl
034d577c96 Switch to official Slack API
The existing service does not seem to be stable and does not work, so
switch to using the official Slack API GitHub action instead.