In #6767 we introduced the ability to track job execution history,
including succeeded and failed jobs.
The new metadata table `_timescaledb_internal.bgw_job_stat_history` has
two JSONB columns: `config` (storing the job config) and `error_data`
(storing the ErrorData information). The problem is that this approach
is not flexible enough for future changes to history recording, so this
PR refactors the current implementation to use a single JSONB column
named `data` that stores more job information in the following form:
{
  "job": {
    "owner": "fabrizio",
    "proc_name": "error",
    "scheduled": true,
    "max_retries": -1,
    "max_runtime": "00:00:00",
    "proc_schema": "public",
    "retry_period": "00:05:00",
    "initial_start": "00:05:00",
    "fixed_schedule": true,
    "schedule_interval": "00:00:30"
  },
  "config": {
    "bar": 1
  },
  "error_data": {
    "domain": "postgres-16",
    "lineno": 841,
    "context": "SQL statement \"SELECT 1/0\"\nPL/pgSQL function error(integer,jsonb) line 3 at PERFORM",
    "message": "division by zero",
    "filename": "int.c",
    "funcname": "int4div",
    "proc_name": "error",
    "sqlerrcode": "22012",
    "proc_schema": "public",
    "context_domain": "plpgsql-16"
  }
}
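For reference, a minimal sketch of querying the consolidated column (the table and the `data` column are described above; the JSON paths follow the example payload):

-- Extract job, config and error details from the single "data" column.
SELECT data->'job'->>'proc_name'      AS proc_name,
       data->'config'                 AS config,
       data->'error_data'->>'message' AS error_message
  FROM _timescaledb_internal.bgw_job_stat_history;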
In #4678 we added an interface for troubleshooting job failures by
logging them in the metadata table `_timescaledb_internal.job_errors`.
With this PR we extended the existing interface to also store succeeded
executions. A new GUC named `timescaledb.enable_job_execution_logging`
was added to control this new behavior; its default value is `false`.
We renamed the metadata table to `_timescaledb_internal.bgw_job_stat_history`
and added a new view `timescaledb_information.job_history` so that users
with sufficient permissions can check the job execution history.
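A minimal usage sketch (the GUC and view names are as described above; how the GUC is best applied, e.g. postgresql.conf vs ALTER DATABASE, is an assumption here):

-- Turn on logging of successful executions as well (off by default);
-- in a real deployment this would typically go into postgresql.conf
-- or be set with ALTER DATABASE so the job scheduler picks it up.
SET timescaledb.enable_job_execution_logging = on;

-- Inspect the execution history through the new view.
SELECT * FROM timescaledb_information.job_history;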
This patch drops the catalog table _timescaledb_catalog.hypertable_compression
and stores those settings in _timescaledb_catalog.compression_settings instead.
The storage format is changed: the new table has one entry per relation
instead of one entry per column and has no dependency on hypertables.
All other aspects of compression remain the same. This refactoring is done
to enable per-chunk compression settings in a follow-up patch.
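As a rough illustration of the per-relation layout (the column names below are assumptions, not the exact catalog definition):

-- One row per compressed relation instead of one row per column.
SELECT relid::regclass, segmentby, orderby
  FROM _timescaledb_catalog.compression_settings;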
- Added creation_time attribute to timescaledb catalog table "chunk".
Also, updated corresponding view timescaledb_information.chunks to
include chunk_creation_time attribute.
- A newly created chunk is assigned the creation time during chunk
  creation to handle a new partition range for a given dimension (Time/
  SERIAL/BIGSERIAL/INT/...).
- In case of an already existing chunk, the creation time is updated as
part of running upgrade script. The current timestamp (now()) at the
time of upgrade has been assigned as chunk creation time.
- Similarly, downgrade script is updated to drop the attribute
creation_time from catalog table "chunk".
- All relevant queries/views/test output have been updated accordingly
  (see the example query below).
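As a quick illustration, the new attribute can be inspected through the informational view (a sketch; hypertable_name and chunk_name are assumed to be existing columns of the view):

-- chunk_creation_time is exposed through timescaledb_information.chunks.
SELECT hypertable_name, chunk_name, chunk_creation_time
  FROM timescaledb_information.chunks
 ORDER BY chunk_creation_time DESC;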
Co-authored-by: Nikhil Sontakke <nikhil@timescale.com>
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- to_unix_microseconds(timestamptz)
- to_timestamp(bigint)
- to_timestamp_without_timezone(bigint)
- to_date(bigint)
- to_interval(bigint)
- interval_to_usec(interval)
- time_to_internal(anyelement)
- subtract_integer_from_now(regclass, bigint)
Different num_chunks values were reported by
timescaledb_information.hypertables and
timescaledb_information.chunks.
The view definition of hypertables was
not filtering out dropped and OSM chunks.
Fixes #5338
Since the job error log can contain information from many different
sources and also from many different jobs, it is important to ensure
that visibility of the job error log entries is restricted to job
owners.
This commit extends the view `timescaledb_information.job_errors` with
role-based checks so that a user can only see entries for jobs that they
have permission to view, and restricts the permissions on
`_timescaledb_internal.job_errors` so that users can only view the job
error log through the view. A special case is added so that the
superuser and the database owner can see all log entries, even if there
is no job id associated with the log entry.
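A minimal sketch of the kind of role-based filter this implies (the join and predicate below are assumptions for illustration, not the actual view definition):

-- Show only entries for jobs whose owner role the current user is a member of
-- (the superuser / database owner special case is not shown here).
SELECT e.*
  FROM _timescaledb_internal.job_errors e
  JOIN _timescaledb_config.bgw_job j ON j.id = e.job_id
 WHERE pg_catalog.pg_has_role(current_user, j.owner, 'MEMBER');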
Closes #5217
Currently, the next start of a scheduled background job is
calculated by adding the `schedule_interval` to its finish
time. This does not allow scheduling jobs to execute at fixed
times, as the next execution is "shifted" by the job duration.
This commit introduces the option to execute a job on a fixed
schedule instead. Users are expected to provide an initial_start
parameter on which subsequent job executions are aligned. The next
start is calculated by computing the next time_bucket of the finish
time with initial_start origin.
An `initial_start` parameter is added to the compression, retention,
reorder and continuous aggregate `add_policy` signatures. By passing
it upon policy creation, users indicate that the policy should execute
on a fixed schedule; if `initial_start` is not provided, the policy
keeps a drifting schedule.
For UDAs registered with `add_job`, an additional parameter
`fixed_schedule` is added so users can pick the old drifting behavior
by setting it to false.
Additionally, an optional TEXT parameter, `timezone`, is added to both
add_job and add_policy signatures, to address the 1-hour shift in
execution time caused by DST switches. As internally the next start of
a fixed schedule job is calculated using time_bucket, the timezone
parameter allows using timezone-aware buckets to calculate
the next start.
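A hedged example of how these parameters combine (the procedure name and values are placeholders; the parameter names follow the description above):

-- Fixed schedule: executions are aligned to initial_start, and the
-- timezone makes the underlying time_bucket DST-aware.
SELECT add_job('my_procedure', '1 day',
               initial_start => '2024-01-01 00:00:00+00',
               fixed_schedule => true,
               timezone => 'Europe/Berlin');

-- Drifting schedule (old behavior): next start = finish time + schedule_interval.
SELECT add_job('my_procedure', '1 hour', fixed_schedule => false);

-- Policies accept initial_start (and timezone) in the same way, e.g.:
SELECT add_compression_policy('my_table', '7 days'::interval,
                              initial_start => '2024-01-01 00:00:00+00');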
This commit gives more visibility into job failures by making the
information regarding a job runtime error available in an extension
table (`job_errors`) that users can directly query.
This commit also adds an informational view on top of the table for
convenience.
To prevent the `job_errors` table from growing too large,
a retention job is also set up with a default retention interval
of 1 month. The retention job is registered with a custom check
function that requires that a valid "drop_after" interval be provided
in the config field of the job.
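For example, the retention interval could be adjusted through the job's config (the job id below is a placeholder; `drop_after` is the config key described above):

-- 1234 stands in for the id of the error-log retention job.
SELECT alter_job(1234, config => '{"drop_after": "2 weeks"}');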
Previously users had no way to update the check function
registered with add_job. This commit adds a parameter check_config
to alter_job to allow updating the check function field.
Also, previously the signature expected from a check was of
the form (job_id, config) and there was no validation
that the check function given had the correct signature.
This commit removes the job_id, as it is not required, and also
verifies that the check function has the correct signature when it is
registered with add_job, preventing an error from being thrown at job
runtime.
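A hedged sketch of the resulting shape (names and values are placeholders; whether the check must be a function or a procedure is an assumption here):

-- The check now receives only the config, without a job_id.
CREATE PROCEDURE my_check(config jsonb)
LANGUAGE plpgsql AS $$
BEGIN
  IF config IS NULL OR NOT config ? 'retention_days' THEN
    RAISE EXCEPTION 'config must contain retention_days';
  END IF;
END
$$;

-- Register it up front with add_job, or swap it later via alter_job
-- (1000 is a placeholder job id).
SELECT add_job('my_proc', '1 hour',
               config => '{"retention_days": 30}',
               check_config => 'my_check');
SELECT alter_job(1000, check_config => 'my_check');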
The first step to remove re-aggregation for Continuous Aggregates
is to remove the `chunk_id` from the materialization hypertable.
Also added a new metadata column named `finalized` to the `continuous_cagg`
catalog table in order to record which Continuous Aggregates use the new
finalized version that no longer needs the partials. This flag is
important to maintain backward compatibility with the previous Continuous
Aggregate implementation, which requires the `chunk_id` to refresh data
properly.
TimescaleDB was vulnerable to a privilege escalation attack in
the extension installation script. An attacker could precreate
objects normally owned by the extension and get those objects
used in the installation script since the script would only try
to create them if they did not already exist. Thanks to Pedro
Gallegos for reporting the problem.
This patch changes the schema, table and function creation to fail
and abort the installation when the object already exists instead
of using the existing object.
Security: CVE-2022-24128
Enable ALTER MATERIALIZED VIEW (timescaledb.compress)
This enables compression on the underlying materialized
hypertable. The segmentby and orderby columns for
compression are based on the GROUP BY clause and time_bucket
clause used while setting up the continuous aggregate.
- timescaledb_information.continuous_aggregate view definition change
- Add support for compression policy on continuous aggregates
- Move code from job.c to policy_utils.c
- Add support functions to check compression policy validity for
  continuous aggregates.
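A hedged example of the enabled syntax (the continuous aggregate name and interval are placeholders):

-- Enables compression on the cagg's underlying materialized hypertable;
-- segmentby/orderby are derived from the GROUP BY and time_bucket clauses.
ALTER MATERIALIZED VIEW my_cagg SET (timescaledb.compress = true);

-- A compression policy can then be added on the continuous aggregate.
SELECT add_compression_policy('my_cagg', compress_after => '30 days'::interval);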
The timescaledb_information.chunks view used to return NULL in the
is_compressed field for distributed chunks. This changes the view to
always return true or false, depending on the chunk status flags in the
Access Node. This is a hint and cannot be used as a source of truth for
the compression state of the actual chunks in the Data Nodes.
The compression status of individual chunks for distributed tables is
unknown on the access node. Fix the view so that is_compressed is NULL
instead of false for distributed chunks.
1. Add a compression_state column to the hypertable catalog table
by renaming the existing compressed column.
compression_state is a tri-state column: it indicates whether the
hypertable has compression enabled (value = 1) or is an internal
compression table (value = 2).
2. Save compression settings on the access node when compression
is turned on for a distributed hypertable.
For a distributed hypertable that has compression enabled,
compression_state is set. We don't create any internal tables
on the access node.
Fixes #2660
The timescaledb_information.job_stats view was missing information about
jobs running user actions, since they don't have associated
hypertables. This commit fixes the view so that it includes user jobs in
addition to the predefined policies, which were already covered since all
of them reference a hypertable.
Add the hypertable's schema and name to the continuous aggregates view
in the information schema, since these fields were missing.
The new fields use the same names and order in the view to be
consistent with other information views that reference the hypertable.
Fixes #2653
Removes the unrelated column schedule_interval from the
timescaledb_information.continuous_aggregates view and simplifies it.
Renames the argument cagg in refresh_continuous_aggregate to
continuous_aggregate, as in add_continuous_aggregate_policy.
Part of #2521
This change updates the timescaledb_information.job_stats view to
check whether a job is currently scheduled in the bgw_config table.
If it is not, the `job_status` field will show `Paused` and the
`next_start` field will be NULL.
Fixes #2488
This patch removes enterprise license support and moves the
move_chunk() function under the community license (TSL).
The licensing validation code has been reworked and simplified.
The previously used timescaledb.license_key GUC has been renamed to
timescaledb.license.
This change also makes the testing code stricter about the license
in use. The Apache test suite can now test only apache-licensed
functions.
Fixes #2359
Rename the `refresh_interval` field in
`timescaledb_information.continuous_aggregate` view to match the
parameter name in `add_continuous_aggregate_policy`.
The is_compressed column of the timescaledb_information.chunks
view is defined as TEXT instead of BOOLEAN, as true and false
were specified using string literals.
Fixes #2409
Stats for policies are exposed via
the policy_stats view. Remove continuous aggregate
stats view - the thresholds exposed via this
view are not relevant with the new API.
This change filters materialized hypertables from the hypertables
view, similar to how internal compression hypertables are
filtered.
Materialized hypertables are internal objects created as a side effect
of creating a continuous aggregate, and these internal hypertables are
still listed in the continuous_aggregates view.
Fixes #2383
With the new continuous aggregate API, some of
the parameters used to create a continuous agg are
now obsolete. Remove refresh_lag, max_interval_per_job
and ignore_invalidation_older_than information from
timescaledb_information.continuous_aggregates.
The parameter `cascade_to_materialization` is removed from
`drop_chunks` and `add_drop_chunks_policy` as well as associated tables
and test functions.
Fixes #2137