387 Commits

gayyappan
a2629cbff2 Add ignore_invalidation_older_than to view
Add ignore_invalidation_older_than parameter to
timescaledb_information.continuous_aggregates view

Fixes #1664
2020-02-03 11:31:12 -05:00
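For illustration, the new column can be inspected with a query like the following (a sketch; view_name and refresh_lag are assumed column names of the 1.6-era view and may differ):

```sql
-- List continuous aggregates and how far back they process invalidations:
SELECT view_name, refresh_lag, ignore_invalidation_older_than
FROM timescaledb_information.continuous_aggregates;
```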
Sven Klemm
2728658a0e Release 1.6.0
This release adds major new features and bugfixes since the 1.5.1 release.
We deem it moderate priority for upgrading.

The major new feature in this release allows users to keep the aggregated
data in a continuous aggregate while dropping the raw data with drop_chunks.
This allows users to save storage by keeping only the aggregates.

The semantics of the refresh_lag parameter for continuous aggregates have
been changed to be relative to the current timestamp instead of the maximum
value in the table. This change requires that an integer_now func be set on
hypertables with integer-based time columns in order to use continuous
aggregates on such tables.

We added a timescaledb.ignore_invalidation_older_than parameter for continuous
aggregates. This parameter accepts a time interval (e.g. 1 month). If set,
it limits the amount of time for which to process invalidations. Thus, if
timescaledb.ignore_invalidation_older_than = '1 month', then any modifications
for data older than 1 month from the current timestamp at modification time may
not cause the continuous aggregate to be updated. This limits the amount of work
that a backfill can trigger. By default, all invalidations are processed.

**Major Features**
* #1589 Allow drop_chunks while keeping continuous aggregates

**Minor Features**
* #1568 Add ignore_invalidation_older_than option to continuous aggs
* #1575 Reorder group-by clause for continuous aggregates
* #1592 Improve continuous agg user messages

**Bugfixes**
* #1565 Fix partial select query for continuous aggregate
* #1591 Fix locf treat_null_as_missing option
* #1594 Fix error in compression constraint check
* #1603 Add join info to compressed chunk
* #1606 Fix constify params during runtime exclusion
* #1607 Delete compression policy when drop hypertable
* #1608 Add jobs to timescaledb_information.policy_stats
* #1609 Fix bug with parent table in decompression
* #1624 Fix drop_chunks for ApacheOnly
* #1632 Check for NULL before dereferencing variable

**Thanks**
* @optijon for reporting an issue with locf treat_null_as_missing option
* @acarrera42 for reporting an issue with constify params during runtime exclusion
* @ChristopherZellermann for reporting an issue with the compression constraint check
* @SimonDelamare for reporting an issue with joining hypertables with compression
2020-01-15 18:20:30 +01:00
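As a hedged sketch of the 1.6.0 workflow described above (the hypertable `conditions` and the continuous aggregate `conditions_hourly` are placeholder names):

```sql
-- Only process invalidations for data newer than one month:
ALTER VIEW conditions_hourly
    SET (timescaledb.ignore_invalidation_older_than = '1 month');

-- Drop raw chunks older than three months while keeping the aggregated data:
SELECT drop_chunks(INTERVAL '3 months', 'conditions',
                   cascade_to_materialization => FALSE);
```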
Matvey Arye
e44be9c03a Add jobs to timescaledb_information.policy_stats
Add the compression and continuous aggs job types to the
timescaledb_information.policy_stats view. It is a bug
that it wasn't there before.
2020-01-03 14:18:13 -05:00
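A rough example of checking the view after this change (column names are assumed from the 1.6-era view and may differ):

```sql
-- Compression and continuous aggregate jobs should now show up alongside
-- the other policies:
SELECT hypertable, job_id, job_type, last_run_success, total_runs
FROM timescaledb_information.policy_stats
ORDER BY hypertable, job_id;
```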
Matvey Arye
2c594ec6f9 Keep catalog rows for some dropped chunks
If a chunk is dropped but it has a continuous aggregate that is
not dropped we want to preserve the chunk catalog row instead of
deleting the row. This is to prevent dangling identifiers in the
materialization hypertable. It also preserves the dimension slice
and chunk constraints rows for the chunk, since those will be necessary
when enabling this with multinode and are also needed to recreate the
chunk. The postgres objects associated with the chunk are all
dropped (table, constraints, indexes).

If data is ever reinserted to the same data region, the chunk is
recreated with the same dimension definitions as before. The postgres
objects are simply recreated.
2019-12-30 09:10:44 -05:00
Matvey Arye
5eb047413b Allow drop_chunks while keeping continuous aggs
Allow dropping raw chunks on the raw hypertable while keeping
the continuous aggregate. This allows for downsampling data
and allows users to save on TCO. We only allow dropping
such data when the dropped data is older than the
`ignore_invalidation_older_than` parameter on all the associated
continuous aggs. This ensures that any modifications to the
region of data which was dropped should never be reflected
in the continuous agg and thus avoids semantic ambiguity
if chunks are dropped but then again recreated due to an
insert.

Before we drop a chunk we need to make sure to process any
continuous aggregate invalidations that were registered on
data inside the chunk. Thus we add options to materialization
to perform it transactionally, to only process invalidations,
and to process invalidations only before a given timestamp.

We fix drop_chunks and the policy to properly process
`cascade_to_materialization` as a tri-state variable (unknown,
true, false). Existing policy rows should change false to NULL
(unknown), while true stays true since it was explicitly set.
Remove the form data for bgw_policy_drop_chunk because there
is no good way to represent the tri-state variable in the
form data.

When dropping chunks with cascade_to_materialization = false, all
invalidations on the chunks are processed before dropping the chunk.
If we are so far behind that even the completion threshold is inside
the chunks being dropped, we error. There are 2 reasons that we error:
1) We can't safely process new ranges transactionally without taking
   heavy weight locks and potentially locking the entire system
2) If a completion threshold is that far behind, the system probably has
   some serious issues anyway.
2019-12-30 09:10:44 -05:00
Matvey Arye
08ad7b6612 Add ignore_invalidation_older_than to continuous aggs
We added a timescaledb.ignore_invalidation_older_than parameter for
continuous aggregates. This parameter accepts a time interval (e.g. 1
month). If set, it limits the amount of time for which to process
invalidations. Thus, if
	timescaledb.ignore_invalidation_older_than = '1 month'
then any modifications for data older than 1 month from the current
timestamp at insert time will not cause updates to the continuous
aggregate. This limits the amount of work that a backfill can trigger.
This parameter must be >= 0. A value of 0 means that invalidations are
never processed.

When recording invalidations for the hypertable at insert time, we use
the maximum ignore_invalidation_older_than of any continuous agg attached
to the hypertable as a cutoff for whether to record the invalidation
at all. When materializing a particular continuous agg, we use that
agg's ignore_invalidation_older_than cutoff. However, we have to apply
that cutoff relative to the insert time not the materialization
time to make it easier for users to reason about. Therefore,
we record the insert time as part of the invalidation entry.
2019-12-04 15:47:03 -05:00
Matvey Arye
2f7d69f93b Make continuous agg relative to now()
Previously, refresh_lag in continuous aggs was calculated
relative to the maximum timestamp in the table. Change the
semantics so that it is relative to now(). This is more
intuitive.

Requires an integer_now function to be set on hypertables
with integer-based time dimensions.
2019-11-21 14:17:37 -05:00
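For reference, a minimal pre-2.0 continuous aggregate whose refresh_lag is now interpreted relative to now() (table and column names are placeholders):

```sql
CREATE VIEW conditions_hourly
WITH (timescaledb.continuous, timescaledb.refresh_lag = '1 hour') AS
SELECT time_bucket('1 hour', time) AS bucket,
       device_id,
       avg(temperature) AS avg_temp
FROM conditions
GROUP BY time_bucket('1 hour', time), device_id;
```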
Sven Klemm
1ec16869f8 Release 1.5.1
This maintenance release contains bugfixes since the 1.5.0 release. We deem it low
priority for upgrading.

In particular the fixes contained in this maintenance release address potential
segfaults and no other security vulnerabilities. The bugfixes are related to bloom
indexes and updates from previous versions.

**Bugfixes**
* #1523 Fix bad SQL updates from previous updates
* #1526 Fix hypertable model
* #1530 Set active snapshots in multi-xact index create

**Thanks**
* @84660320 for reporting an issue with bloom indexes
2019-11-12 12:42:43 +01:00
Matvey Arye
122856c1bd Fix update scripts for type functions
Type functions have to be CREATE OR REPLACED on every update
since they need to point to the correct .so. Thus,
split the type definitions into pre, functions,
and post parts and rerun the functions part both on
pre_install and on every update.
2019-11-11 17:10:13 -05:00
Matvey Arye
99f862198e Fix update logic from 1.4.2 to 1.5.0
The update logic from 1.4.2 to 1.5.0 had an error where
the _timescaledb_catalog.hypertable table was altered in such
a way that the table was not re-written. This causes
bugs in catalog processing code. A CLUSTER rewrites the
table. We also backpatch this change to the 1.4.2--1.5.0
script to help anyone building from source.

Also fixes a similar error on _timescaledb_catalog.metadata
introduced in the 1.3.2--1.4.0 update.
2019-11-07 12:12:54 -05:00
Sven Klemm
7b2519eb44 Release 1.5.0
This release adds major new features and bugfixes since the 1.4.2 release.
We deem it moderate priority for upgrading.

This release adds compression as a major new feature.
Multiple type-specific compression options are available in this release
(including DeltaDelta with run-length-encoding for integers and
timestamps; Gorilla compression for floats; dictionary-based compression
for any data type, but specifically for low-cardinality datasets;
and other LZ-based techniques). Individual columns can be compressed with
type-specific compression algorithms, as Postgres' native row-based format
is rolled up into columnar-like arrays on a per-chunk basis.
The query planner then handles transparent decompression for compressed
chunks at execution time.

This release also adds support for basic data tiering by supporting
the migration of chunks between tablespaces, as well as support for
parallel query coordination to the ChunkAppend node.
Previously ChunkAppend would rely on parallel coordination in the
underlying scans for parallel plans.
2019-10-31 17:13:41 +01:00
Matvey Arye
db23139b3c Fix error for exported_uuid in pg_restore
When restoring a database, people would encounter errors if
the restore happened after telemetry had run. This is because
an 'exported_uuid' field would then exist and people would encounter
a "duplicate key value" error when the restore tried to overwrite it.

We fix this by moving this metadata to a different key
in pre_restore and trying to move it back in post_restore.
If the restore creates an exported_uuid, that restored
value is used and the moved version is simply deleted.

We also remove the error redirection in restore so that errors
will show up in tests in the future.

Fixes #1409.
2019-10-30 11:40:24 -04:00
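The pre/post restore steps mentioned above correspond, in the SQL API, to timescaledb_pre_restore() and timescaledb_post_restore(); a typical restore sequence looks roughly like this (a sketch, assuming that mapping):

```sql
-- Before running pg_restore into a database with the extension installed:
SELECT timescaledb_pre_restore();

-- ... run pg_restore against the target database ...

-- Afterwards, return the extension to normal operation; per this fix, the
-- exported_uuid metadata is moved aside and back around these steps:
SELECT timescaledb_post_restore();
```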
gayyappan
87786f1520 Add compressed table size to existing views
Some information views report hypertable sizes. Include
compressed table size in the calculation when applicable.
2019-10-29 19:02:58 -04:00
Matvey Arye
2fe51d2735 Improve (de)compress_chunk API
This commit improves the API of compress_chunk and decompress_chunk:

- have it return the chunk regclass processed (or NULL in the
  idempotent case);
- mark it as STRICT
- add if_not_compressed/if_compressed options for idempotency
2019-10-29 19:02:58 -04:00
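With the options added here, idempotent calls could look roughly like this (the chunk name is illustrative):

```sql
-- Returns the chunk's regclass, or NULL if it is already compressed:
SELECT compress_chunk('_timescaledb_internal._hyper_1_2_chunk',
                      if_not_compressed => TRUE);

-- The reverse operation; a no-op if the chunk is not compressed:
SELECT decompress_chunk('_timescaledb_internal._hyper_1_2_chunk',
                        if_compressed => TRUE);
```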
gayyappan
72588a2382 Restrict constraints on compressed hypertables.
Primary and unique constraints are limited to segment_by and order_by
columns and foreign key constraints are limited to segment_by columns
when creating a compressed hypertable. There are no restrictions on
check constraints.
2019-10-29 19:02:58 -04:00
Matvey Arye
0f3e74215a Split segment meta min_max into two columns
This simplifies the code and the access to the min/max
metadata. Before we used a custom type, but now the min/max
are just the same type as the underlying column and stored as two
columns.

This also removes the custom type that was used before.
2019-10-29 19:02:58 -04:00
gayyappan
43aa49ddc0 Add more information in compression views
Rename compression views to compressed_hypertable_stats and
compressed_chunk_stats and summarize information about compression
status for chunks.
2019-10-29 19:02:58 -04:00
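A quick look at the renamed views (names per this commit; the exact columns vary by version):

```sql
SELECT * FROM timescaledb_information.compressed_hypertable_stats;
SELECT * FROM timescaledb_information.compressed_chunk_stats;
```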
gayyappan
edd3999553 Add trigger to block INSERT on compressed chunk
Prevent insert on compressed chunks by adding a trigger that blocks it.
Enable insert if the chunk gets decompressed.
2019-10-29 19:02:58 -04:00
Matvey Arye
0db50e7ffc Handle drops of compressed chunks/hypertables
This commit adds handling for dropping chunks and hypertables
in the presence of associated compressed objects. If the uncompressed
chunk/hypertable is dropped, then the associated compressed object is
dropped using DROP_RESTRICT unless cascading is explicitly enabled.

Also add a compressed_chunk_id index on compressed tables for
figuring out whether a chunk is compressed or not.

Change a bunch of APIs to use DropBehavior instead of a cascade bool
to be more explicit.

Also test the drop chunks policy.
2019-10-29 19:02:58 -04:00
Matvey Arye
2bf97e452d Push down quals to segment meta columns
This commit pushes down quals on order_by columns to make
use of the SegmentMetaMinMax objects. Namely, =, <, <=, >, >= quals
can now be pushed down.

We also remove filters from decompress node for quals that
have been pushed down and don't need a recheck.

This commit also changes tests to add more segment by and
order-by columns.

Finally, we rename the segment meta accessor functions to give them shorter names.
2019-10-29 19:02:58 -04:00
gayyappan
6e60d2614c Add compress chunks policy support
Add and drop compress chunks policy using bgw
infrastructure.
2019-10-29 19:02:58 -04:00
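A sketch of the 1.5-era policy API added here (the hypertable name is a placeholder; these functions were renamed in later releases):

```sql
-- Compress chunks once their data is older than seven days:
SELECT add_compress_chunks_policy('metrics', INTERVAL '7 days');

-- Remove the policy again:
SELECT remove_compress_chunks_policy('metrics');
```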
Matvey Arye
b9674600ae Add segment meta min/max
Add the type for min/max segment meta object. Segment metadata
objects keep metadata about data in segments (compressed rows).
The min/max variant keeps the min and max values inside the compressed
object. It will be used on compression order by columns to allow
queries that have quals on those columns to be able to exclude entire
segments if no uncompressed rows in the segment may match the qual.

We also add generalized infrastructure for datum serialization
/ deserialization for arbitrary types to and from memory as well
as binary strings.
2019-10-29 19:02:58 -04:00
Matvey Arye
a078781c2e Add decompress_chunk function
This is the opposite dual of compress_chunk.
2019-10-29 19:02:58 -04:00
gayyappan
7a728dc15f Add view for compression size
View for compressed_chunk_size and compressed_hypertable_size
2019-10-29 19:02:58 -04:00
gayyappan
1f4689eca9 Record chunk sizes after compression
Compute chunk size before/after compressing a chunk and record in
catalog table.
2019-10-29 19:02:58 -04:00
gayyappan
44941f7bd2 Add UI for compress_chunks functionality
Add support for compress_chunks function.

This also adds support for compress_orderby and compress_segmentby
parameters in ALTER TABLE. These parameters are used by the
compress_chunks function.

The parsing code will most likely be changed to use the PG raw_parser
function.
2019-10-29 19:02:58 -04:00
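For illustration, enabling compression with the parameters described above looks like this in the released syntax (table and column names are placeholders):

```sql
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'time DESC'
);
```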
gayyappan
1c6aacc374 Add ability to create the compressed hypertable
This happens when compression is turned on for regular hypertables.
2019-10-29 19:02:58 -04:00
Joshua Lockerman
584f5d1061 Implement time-series compression algorithms
This commit introduces 4 compression algorithms
as well as 3 ADTs to support them. The compression
algorithms are time-series optimized. The following
algorithms are implemented:

- DeltaDelta compresses integer and timestamp values
- Gorilla compresses floats
- Dictionary compression handles any data type
  and is optimized for low-cardinality datasets.
- Array stores any data type in an array-like
  structure and does not actually compress it (though
  TOAST-based compression can be applied on top).

These compression algorithms are fully described in
tsl/src/compression/README.md.

The Abstract Data Types that are implemented are
- Vector - A dynamic vector that can store any type.
- BitArray - A dynamic vector to store bits.
- SimpleHash - A hash table implementation from PG12.

More information can be found in
src/adts/README.md
2019-10-29 19:02:58 -04:00
gayyappan
3edc016dfc Add catalog tables to support compression
This commit adds catalog tables that will be used by the
compression infrastructure.
2019-10-29 19:02:58 -04:00
Matvey Arye
7ea492f29e Add last_successful_finish to bgw_job_stats
This allows people to better monitor bgw job health. It
indicates the last time the job made progress.
2019-10-15 19:14:14 -04:00
Sven Klemm
d82ad2c8f6 Add ts_ prefix to all exported functions
This patch adds the `ts_` prefix to exported functions that didn't
have it and removes exports that are not needed.
2019-10-15 14:42:02 +02:00
Matvey Arye
2209133781 Add next_start option to alter_job_schedule
Add the option to set the next start time on a job in the
alter job schedule function. This also adds the ability
to pause jobs by setting next_start to 'infinity'.

Also fix the enterprise license check to only activate for
enterprise jobs.
2019-10-11 15:44:19 -04:00
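A hedged example of the new option (the job id is illustrative):

```sql
-- Pause a job by pushing its next start infinitely far into the future:
SELECT alter_job_schedule(1000, next_start => 'infinity');

-- Resume it by scheduling an immediate run:
SELECT alter_job_schedule(1000, next_start => now());
```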
Matvey Arye
d2f68cbd64 Move the set_integer_now func into Apache2
We decided this should be an OSS capability.
2019-10-11 13:00:55 -04:00
Sven Klemm
33a75ab09d Release 1.4.2
This maintenance release contains bugfixes since the 1.4.1 release.
We deem it medium priority for upgrading.

In particular the fixes contained in this maintenance release address
2 potential segfaults and no other security vulnerabilities.
The bugfixes are related to background workers, OUTER JOINs, ordered
append on space partitioned hypertables and expression indexes.
2019-09-10 10:23:15 +02:00
David Kohn
897fef42b6 Add support for moving chunks to different tablespaces
Adds a move_chunk function that moves a chunk to a different tablespace.
This is implemented as an extension to the reorder command.
Given that the heap, toast tables, and indexes are being rewritten
during the reorder operation, adding the ability to modify the tablespace
is relatively simple and mostly requires adding parameters to the relevant
functions for the destination tablespace (and index tablespace). The tests
do not focus on further exercising the reorder infrastructure, but instead
ensure that tablespace movement and permissions checks properly occur.
2019-08-21 12:07:28 -04:00
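A sketch of the API described above (chunk, index, and tablespace names are placeholders):

```sql
SELECT move_chunk(
    chunk                        => '_timescaledb_internal._hyper_1_4_chunk',
    destination_tablespace       => 'history_space',
    index_destination_tablespace => 'history_space',
    reorder_index                => '_timescaledb_internal._hyper_1_4_chunk_metrics_time_idx'
);
```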
Narek Galstyan
62de29987b Add a notion of now for integer time columns
This commit implements functionality for users to give a custom
definition of now() for integer open dimension typed hypertables.
Such a now() function enables us to talk about intervals in the context
of hypertables with integer time columns. In order to simplify future
code, this commit defines a custom ts_interval type that unites the
usual postgres intervals and integer time dimension intervals under a
single composite type.

The commit also enables adding drop chunks policy on hypertables with
integer time dimensions if a custom now() function has been set.
2019-08-19 23:23:28 +04:00
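For reference, a custom now() for an integer time column might be defined and registered like this (function, table, and column names, as well as the retention threshold, are placeholders):

```sql
-- The hypertable's time column is assumed to hold Unix epoch seconds:
CREATE OR REPLACE FUNCTION unix_now() RETURNS BIGINT
    LANGUAGE SQL STABLE AS
$$ SELECT extract(epoch FROM now())::BIGINT $$;

SELECT set_integer_now_func('devices', 'unix_now');

-- With a now() defined, a drop chunks policy can be added on the
-- integer-time hypertable (threshold in seconds, here ~30 days):
SELECT add_drop_chunks_policy('devices', BIGINT '2592000');
```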
Sven Klemm
03d4ae03d6 Release 1.4.1
This maintenance release contains bugfixes since the 1.4.0 release. We deem it medium
priority for upgrading.

In particular the fixes contained in this maintenance release address 2 potential
segfaults and no other security vulnerabilities. The bugfixes are related to queries
with prepared statements, PL/pgSQL functions and interoperability with other extensions.
More details below.

**Bugfixes**
* #1362 Fix ConstraintAwareAppend subquery exclusion
* #1363 Mark drop_chunks as VOLATILE and not PARALLEL SAFE
* #1369 Fix ChunkAppend with prepared statements
* #1373 Only allow PARAM_EXTERN as time_bucket_gapfill arguments
* #1380 Handle Result nodes gracefully in ChunkAppend

**Thanks**
* @overhacked for reporting an issue with drop_chunks and parallel queries
* @fvannee for reporting an issue with ConstraintAwareAppend and subqueries
* @rrb3942 for reporting a segfault with ChunkAppend and prepared statements
* @mchesser for reporting a segfault with time_bucket_gapfill and subqueries
* @lolizeppelin for reporting and helping debug an issue with ChunkAppend and Result nodes
2019-07-31 21:00:44 +02:00
Sven Klemm
a11910b5d5 Mark drop_chunks as VOLATILE and PARALLEL UNSAFE
The drop_chunks function was incorrectly marked as STABLE and
PARALLEL SAFE; this patch fixes those attributes.
2019-07-22 07:29:33 +02:00
Stephen Polcyn
1dc1850793 Drop_chunks returns list of dropped chunks
Previously, drop_chunks returned an empty table, giving the user
no indication of what (if anything) had happened.
Now, drop_chunks returns a list of the dropped chunks' identifiers in the
same style as show_chunks, with each chunk's schema and table name.

Notably, when show_chunks is called directly before drop_chunks, the
output should be the same.
2019-07-19 12:13:24 -04:00
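To illustrate the symmetry described above (the hypertable name is a placeholder):

```sql
-- Preview which chunks would be dropped...
SELECT show_chunks('conditions', older_than => INTERVAL '6 months');

-- ...then drop them; the result rows now use the same schema.table format:
SELECT drop_chunks(INTERVAL '6 months', 'conditions');
```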
Sven Klemm
a2f2db9cab Release 1.4.0
This release contains major new functionality for continuous aggregates
and adds performance improvements for analytical queries.

In version 1.3.0 we added support for continuous aggregates which
was initially limited to one continuous aggregate per hypertable.
With this release, we remove this restriction and allow multiple
continuous aggregates per hypertable.

This release adds a new custom node ChunkAppend that can perform
execution time constraint exclusion and is also used for ordered
append. Ordered append no longer requires a LIMIT clause and now
supports space partitioning and ordering by time_bucket.
2019-07-17 18:23:13 +02:00
gayyappan
e9df3bc1b6 Fix continuous agg catalog table insert failure
The primary key on continuous_aggs_materialization_invalidation_log
prevents multiple records with the same materialization id. Remove
the primary key to fix this problem.
2019-07-08 14:53:36 -04:00
gayyappan
5a0a73eabd Add columns to continuous_aggregate_stats view
Add more information about job history for continuous aggregate
background worker jobs.
2019-07-08 12:54:10 -04:00
Stephen Polcyn
ff44b33327 Update get_telemetry_report to expected behavior
Previously, get_telemetry_report returned the full report even if telemetry
was disabled. Now it reassures the user that telemetry is disabled and
provides the option to view the report locally.
2019-06-25 19:35:17 +02:00
Sven Klemm
743a22f1fa Add 1.3.2 to update test scripts 2019-06-25 05:11:32 +02:00
gayyappan
60cfe6cc90 Support for multiple continuous aggregates
Allow multiple continuous aggregates to be defined on a hypertable.
2019-06-24 17:05:49 -04:00
Matvey Arye
e049238a07 Adjust permissions on internal functions
The following functions have had permission checks
added or adjusted:
ts_chunk_index_clone
ts_chunk_index_replace
ts_hypertable_insert_blocker_trigger_add
ts_current_license_key
ts_calculate_chunk_interval
ts_chunk_adaptive_set

The following functions have been removed from the regular SQL install.
They are only installed and used in tests:

dimension_calculate_default_range_open
dimension_calculate_default_range_closed
2019-06-24 10:57:38 -04:00
Sven Klemm
70a02b5410 Add 1.3.1 to update test scripts 2019-06-10 21:15:56 +02:00
Matvey Arye
48d9e2ce25 Add CMAKE option for default telemetry setting
Adds a CMAKE option to turn telemetry off by default by specifying
-DSEND_TELEMETRY_DEFAULT=NO. The default is YES or on.
2019-06-10 08:51:57 -04:00
Brian Rowe
aeac52aef6 Rename telemetry_metadata table to just metadata
This change renames the _timescaledb_catalog.telemetry_metadata table to
_timescaledb_catalog.metadata. It also adds a new boolean column to this
table which is used to flag data that should be included in telemetry.

It also renames the src/telemetry/metadata.{h,c} files to
src/telemetry/telemetry_metadata.{h,c} and updates the API to reflect
this. Finally, it also includes the logic to use the new boolean column
when populating the telemetry parse state.
2019-05-17 17:04:42 -07:00
Sven Klemm
bfabb30be0 Release 1.3.0 2019-05-07 02:47:13 +02:00