2418 Commits

Author SHA1 Message Date
Erik Nordström
418f283443 Merge continuous aggregate invalidations
This change implements deduplication and merging of invalidation
entries for continuous aggregates in order to reduce the number of
redundant entries in the continuous aggregate invalidation
log. Merging is done both when copying over entries from the
hypertable to the continuous aggregate invalidation log and when
cutting already existing invalidations in the latter log. Doing this
merging in both steps helps reduce the number of invalidations also
for the continuous aggregates that don't get refreshed by the active
refresh command.

Merging works by scanning invalidations in order of the lowest
modified value, and given this ordering it is possible to merge the
current and next entry into one large entry if they are
overlapping. This can continue until the current and next invalidation
are disjoint or there are no more invalidations to process.
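
The merge described above can be sketched as a sort-and-sweep over ranges. This is an illustrative sketch only, not the code from this change: the tuple layout, the inclusive bounds, and the name `merge_invalidations` are assumptions, and ranges that merely touch (for example 1..2 and 3..4) are treated as disjoint here.

```python
from typing import List, Tuple

# Assumed layout: (lowest_modified_value, greatest_modified_value), inclusive.
Invalidation = Tuple[int, int]


def merge_invalidations(entries: List[Invalidation]) -> List[Invalidation]:
    """Merge overlapping ranges after ordering by lowest modified value."""
    merged: List[Invalidation] = []
    for start, end in sorted(entries):
        if merged and start <= merged[-1][1]:
            # Overlaps the current entry: grow it instead of adding a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Disjoint from the current entry: it becomes the new current entry.
            merged.append((start, end))
    return merged


print(merge_invalidations([(10, 20), (1, 5), (15, 30), (4, 8)]))
# prints [(1, 8), (10, 30)]
```

Because the entries are visited in order of their lowest value, each entry only needs to be compared with the most recently merged one.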

Note, however, that only the continuous aggregate that gets refreshed
will be fully deduplicated. Some redundant entries might exist for
other aggregates since their entries in the continuous aggregate log
aren't cut against the refresh window.

Full deduplication for the refreshed continuous aggregate is only
possible if the continuous aggregate invalidation log is processed
last, since that also includes "old" entries. Therefore, this change
also changes the ordering of how the logs are processed. This also
makes it possible to process the hypertable invalidation log in the
first transaction of the refresh.
2020-08-13 12:35:23 +02:00
Sven Klemm
6874076a85 Split ordered_append test into 2 tests
This patch separates the ordered append join tests from the other
ordered append tests. It also adds additional constraints to some
queries to speed them up. These changes result in a 3x speedup for
regresscheck-shared.
2020-08-12 21:46:31 +02:00
Sven Klemm
f61818f3a7 Limit resultsets for constraint_exclusion_prepared test
This patch limits the resultset for the constraint exclusion test
with prepared statements to make them run in a more reasonable time.
2020-08-12 21:46:31 +02:00
Sven Klemm
02d715f216 Add distributed hypertable to regresscheck-shared
This patch sets up a distributed hypertable in the regresscheck-shared
environment to enable running distributed tests.
2020-08-12 21:46:31 +02:00
Erik Nordström
af0ed90f85 Move invalidation threshold code
This change moves the code to set and get the invalidation threshold
for continuous aggregates to a separate source file for better code
structure.
2020-08-12 11:16:23 +02:00
Erik Nordström
c01faa72f0 Set invalidation threshold during refresh
The invalidation threshold governs the window of data from the head of
a hypertable that shouldn't be subject to invalidations in order to
reduce write amplification during inserts on the hypertable.

When a continuous aggregate is refreshed, the invalidation threshold
must be moved forward (or initialized if it doesn't previously exist)
whenever the refresh window stretches beyond the current threshold.
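
A minimal sketch of the rule above, assuming the threshold is a single integer time value and that the new threshold is the end of the refresh window (the message does not state the exact value). The function name is hypothetical.

```python
from typing import Optional


def next_invalidation_threshold(current: Optional[int], refresh_end: int) -> int:
    """Return the threshold to store after a refresh ending at refresh_end."""
    if current is None:
        # No threshold exists yet: initialize it (assumed: to the window end).
        return refresh_end
    # The threshold only ever moves forward, never backward.
    return max(current, refresh_end)
```

For example, with a current threshold of 100, a refresh ending at 80 leaves the threshold at 100, while a refresh ending at 150 moves it to 150.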

Tests for setting the invalidation threshold are also added, including
new isolation tests for concurrency.
2020-08-12 11:16:23 +02:00
Erik Nordström
80720206df Make refresh_continuous_aggregate a procedure
When a continuous aggregate is refreshed, it also needs to move the
invalidation threshold in case the refresh window stretches beyond the
current threshold. The new invalidation threshold must be set in its
own transaction during the refresh, which can only be done if the
refresh command is a procedure.
2020-08-12 11:16:23 +02:00
Erik Nordström
b8ce74921a Fix refresh of integer-time continuous aggregates
The calculation of the max-size refresh window for integer-based
continuous aggregates used the range of 64-bit integers for all
integer types, while the max ranges for 16- and 32-bit integers are
lower. This change adds the missing range boundaries.
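
To illustrate the boundaries involved, the sketch below clamps a refresh window to the range of the integer time type. The type names follow PostgreSQL's `smallint`, `integer` and `bigint`; the dictionary and the function `clamp_refresh_window` are hypothetical and are not taken from the change itself.

```python
from typing import Tuple

# Value ranges of the PostgreSQL integer types.
INT_TYPE_RANGES = {
    "smallint": (-(2**15), 2**15 - 1),  # 16-bit
    "integer": (-(2**31), 2**31 - 1),   # 32-bit
    "bigint": (-(2**63), 2**63 - 1),    # 64-bit
}


def clamp_refresh_window(time_type: str, start: int, end: int) -> Tuple[int, int]:
    """Clamp a requested window to the range of the given integer type."""
    lo, hi = INT_TYPE_RANGES[time_type]
    return max(start, lo), min(end, hi)


# A window sized for 64-bit integers shrinks to the 16-bit range.
print(clamp_refresh_window("smallint", -(2**63), 2**63 - 1))
# prints (-32768, 32767)
```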
2020-08-12 11:16:23 +02:00
Sven Klemm
e939b7e603 Add policies to update test
This patch adds policies to the update test to ensure their
configuration is properly migrated during updates. This patch
also fixes the inconsistent background job application_name
and adjusts them in the update script.
2020-08-12 02:29:24 +02:00
Sven Klemm
cb801fb215 Run update test on PG 11.8 and 12.3
This patch changes the update test to run on PG 11.8 and 12.3 instead
of 11.0 and 12.0. This patch also adds additional diagnostic output
when errors occur during the update test.
2020-08-12 02:29:24 +02:00
Sven Klemm
d547d61516 Refactor continuous aggregate policy
This patch modifies the continuous aggregate policy to store its
configuration in the jobs table.
2020-08-11 22:57:02 +02:00
Sven Klemm
530cb8296e Add check for unreferenced test files
This patch adds a check for test files not referenced in the
CMakeLists.txt file to CI.
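
A check of this kind could look roughly like the sketch below. It is not the script added by this patch; the function name, the tokenization of the CMakeLists.txt text, and the file names are assumptions for illustration.

```python
import re
from typing import Iterable, List


def unreferenced_tests(test_files: Iterable[str], cmake_text: str) -> List[str]:
    """Return the test file names that never appear as a token in cmake_text."""
    # Split on whitespace, parentheses and quotes so only whole names match.
    tokens = set(re.split(r'[\s()"]+', cmake_text))
    return sorted(name for name in test_files if name not in tokens)


cmake = "set(TEST_FILES\n  append.sql\n  insert.sql)\n"
print(unreferenced_tests(["append.sql", "insert.sql", "orphan.sql"], cmake))
# prints ['orphan.sql']
```

A CI step would then fail whenever the returned list is non-empty.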
2020-08-11 19:16:27 +02:00
Dmitry Simonenko
1a8d0eae06 Add check for distributed hypertable to reorder/move_chunk
Ensure that the move_chunk() and reorder_chunk() functions cannot
be used with a distributed hypertable.
2020-08-11 16:12:54 +03:00
Sven Klemm
f510a39a74 Make application name for bgw jobs unique
This patch changes the application name for background worker jobs
to include the job_id which makes the application name unique and
allows joining against pg_stat_activity to get a list of currently
running background worker processes. This change also makes
identifying misbehaving jobs easier from the postgres log as the
application name can be included in the log line.
2020-08-11 14:56:41 +02:00
gayyappan
eecc93f3b6 Add hypertable_index_size function
Function to compute the size for a specific
index of a hypertable
2020-08-10 18:00:51 -04:00
Sven Klemm
e40d70716e Ignore result of loader test on windows
The loader test fails extremely often on Windows, so this patch
makes the Windows test run not fail when the loader test fails.
2020-08-07 15:40:57 +02:00
Sven Klemm
4409bff025 Add unreferenced test files to CMakeLists
The with_clause_parser and continuous_aggs_drop_chunks tests were
not referenced in the CMakeLists leading to those tests never being
run. This patch adds them to the appropriate file and adjusts the
output.
2020-08-07 15:40:57 +02:00
Dmitry Simonenko
0f60b5b33b Add check for distributed hypertable to continuous aggs
Show an error message if a distributed hypertable is
being used.
2020-08-07 15:31:29 +03:00
Sven Klemm
656d3a4ef4 Update package lists before installing packages in CI 2020-08-07 09:20:14 +02:00
Sven Klemm
3a119e066a Fix telemetry installed_time format
This patch changes the telemetry code to always send the installed_time
timestamp as ISO8601. Previously it was depending on local settings
leading to timestamps not processable by the telemetry receiver.
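
The idea of a settings-independent timestamp can be illustrated as follows. This is only an analogy to the actual telemetry code; the function name is hypothetical, and the sketch assumes a timezone-aware input that is normalized to UTC.

```python
from datetime import datetime, timedelta, timezone


def format_installed_time(ts: datetime) -> str:
    """Format an aware timestamp as ISO 8601 in UTC, ignoring local settings."""
    return ts.astimezone(timezone.utc).isoformat()


cest = timezone(timedelta(hours=2))
print(format_installed_time(datetime(2020, 8, 7, 9, 20, 14, tzinfo=cest)))
# prints 2020-08-07T07:20:14+00:00
```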
2020-08-07 09:20:14 +02:00
Ruslan Fomkin
56b4c10a74 Fix error messages to compression policy
Error messages are improved and formulated in terms of compression
policy.
2020-08-06 19:17:44 +02:00
Sven Klemm
02dae3a5fb Fix background worker scheduler memory consumption
This patch changes how the scheduler handles memory contexts.
Previously only memory allocated during transactions would get
freed and everything else remained allocated.

The scheduler now uses 2 memory contexts for its operation: scheduler_mctx
for long-lived objects and scratch_mctx for short-lived objects.
After every iteration of the scheduling main loop scratch_mctx gets
reset. Special care needs to be taken in regards to memory contexts
since StartTransactionCommand creates and switches to a transaction
memory context which gets deleted on CommitTransactionCommand which
switches CurrentMemoryContext back to TopMemoryContext. So operations
wrapped in Start/CommitTransactionCommand will not happen in scratch_mctx
but will get freed on CommitTransactionCommand.
2020-08-05 17:45:33 +02:00
Ruslan Fomkin
393e5b9c1a Remove enabling enterprise from compression test
Compression is not an enterprise feature anymore. Thus enabling
enterprise is not needed in tests.
2020-08-05 14:25:27 +02:00
Erik Nordström
9a7b4aa003 Process invalidations when refreshing continuous aggregate
This change adds initial support for invalidation processing when
refreshing a continuous aggregate. Note that, currently, invalidations
are only cleared during a refresh, but not yet used to optimize
refreshes. There are two steps to this processing:

1. Invalidations are moved from hypertable invalidation log to the
   cagg invalidation log
2. The cagg invalidation entries are then processed for the continuous
   aggregate that gets refreshed.

The second step involves finding all invalidations that overlap with
the given refresh window and then either deleting them or cutting
them, depending on how they overlap.
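
The delete-or-cut decision in the second step can be sketched as below. The half-open `[start, end)` range convention and the name `cut_invalidation` are assumptions of this sketch, not details taken from the change.

```python
from typing import List, Tuple

# Assumed convention: half-open ranges [start, end).
Range = Tuple[int, int]


def cut_invalidation(inval: Range, window: Range) -> List[Range]:
    """Return what remains of inval after removing the refresh window.

    An empty list means the invalidation is fully covered and can be deleted.
    """
    i_start, i_end = inval
    w_start, w_end = window
    if i_end <= w_start or i_start >= w_end:
        return [inval]  # No overlap: the entry is left untouched.
    remaining: List[Range] = []
    if i_start < w_start:
        remaining.append((i_start, w_start))  # Part before the window survives.
    if i_end > w_end:
        remaining.append((w_end, i_end))  # Part after the window survives.
    return remaining


print(cut_invalidation((0, 30), (10, 20)))
# prints [(0, 10), (20, 30)]
```

An invalidation that straddles the whole window is split in two, one that overlaps only one edge is shortened, and one that lies inside the window disappears.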

Currently, the "invalidation threshold" is not moved up during a
refresh. This would only be required if the refresh window crosses
that threshold and will be addressed in a future change.
2020-08-04 14:22:04 +02:00
Erik Nordström
675eb7dd73 Allow setting snapshot in Scanner
This change adds the ability to set a snapshot to use with scans
executed with the Scanner module. The Scanner uses SnapshotSelf by
default, but this isn't appropriate for certain scans that, e.g.,
don't want to see their own changes. An option to keep the lock
on the scanned relation after the scan is also added.
2020-08-04 14:22:04 +02:00
Sven Klemm
bb891cf4d2 Refactor retention policy
This patch changes the retention policy to store its configuration
in the bgw_job table and removes the bgw_policy_drop_chunks table.
2020-08-03 22:33:54 +02:00
Mats Kindahl
9049a5d3cb Remove requirement of CASCADE from DROP VIEW
To drop a continuous aggregate it was necessary to use the `CASCADE`
keyword, which would then cascade to the materialized hypertable. Since
this can cascade the drop to other objects that are dependent on the
continuous aggregate, this could accidentally drop more objects than
intended.

This commit fixes this by removing the check for `CASCADE` and adding
the materialized hypertable to the list of objects to drop.

Fixes timescale/timescaledb-private#659
2020-08-03 22:01:21 +02:00
gayyappan
9f13fb9906 Add functions for compression stats
Add chunk_compression_stats and hypertable_compression_stats
functions to get before/after compression sizes
2020-08-03 10:19:55 -04:00
Sven Klemm
417908f19b Fix macos build
A recent change changed the macos build to run in release mode
which also changed postgres to be built without assertions.
Since we inherit the assertion setting from postgres this leads
to assertions being disabled for our code as well.
With assertions disabled clang errors on detecting null pointer
dereferences so this patch turns assertions for macos back on.
Since the postgres build is cached, this did not take effect immediately
and remained unnoticed in the CI run against the PR introducing the
change.
2020-08-01 17:24:24 +02:00
Sven Klemm
13e0a5f4c7 Sort test list in pg_regress
This patch changes pg_regress to sort the test list when tests
are controlled with either TESTS or SKIPS. This makes it more
consistent with the unfiltered test run which gets a sorted
list from cmake.
2020-08-01 17:24:24 +02:00
Sven Klemm
4c05168909 Remove obsolete sql update files
Since the minimum version we can update from on PG11 is 1.1.0 we
can get rid of all the previous versions update files as they
are not a valid update source for any current version.
2020-08-01 17:24:24 +02:00
Ruslan Fomkin
f62fd957b7 Test both debug and release on MacOS in CI
Switches MacOS test in PR to run release, so the build time is
reduced. Adds the test in Debug to scheduled job.
2020-07-31 23:08:53 +02:00
Mats Kindahl
590446c6a7 Remove cascade_to_materialization parameter
The parameter `cascade_to_materialization` is removed from
`drop_chunks` and `add_drop_chunks_policy` as well as associated tables
and test functions.

Fixes #2137
2020-07-31 11:21:36 +02:00
gayyappan
c93f963709 Remove chunk_relation_size
Remove chunk_relation_size and chunk_relation_size_pretty
functions
Fix row_number in chunks view
2020-07-30 16:06:04 -04:00
Mats Kindahl
03d2f32178 Add self-reference check to add_data_node
If the access node is adding itself as a data node using `add_data_node`
it will deadlock since transactions will be opened on both the access
node and data node both trying to update the metadata.

This commit fixes this by updating `set_dist_id` to check if the UUID
being added as `dist_uuid` is the same as the `uuid` of the node.  If
that is the case, it raises an error.
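
The check amounts to a UUID comparison, which might be sketched like this. The function `check_dist_id`, the exception class, and the error text are hypothetical stand-ins; they do not reproduce the real `set_dist_id` implementation.

```python
import uuid


class SelfReferenceError(ValueError):
    """Raised when a node tries to register itself as a data node."""


def check_dist_id(node_uuid: uuid.UUID, dist_uuid: uuid.UUID) -> None:
    """Reject a dist_uuid that is identical to the node's own uuid."""
    if node_uuid == dist_uuid:
        # Hypothetical message; failing early avoids the deadlock described above.
        raise SelfReferenceError("cannot add the current node as a data node")


# Different UUIDs pass the check without raising.
check_dist_id(uuid.UUID(int=1), uuid.UUID(int=2))
```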

Fixes #2133
2020-07-30 21:19:33 +02:00
Sven Klemm
0d5f1ffc83 Refactor compress chunk policy
This patch changes the compression policy to store its configuration
in the bgw_job table and removes the bgw_policy_compress_chunks table.
2020-07-30 19:58:37 +02:00
Brian Rowe
68aee5144c Rename add_drop_chunks_policy
This change replaces the add_drop_chunks_policy function with
add_retention_policy.  This also renames the older_than parameter
of that function as retention_window.  Likewise, the
remove_drop_chunks_policy is also being renamed
remove_retention_policy.

Fixes #2119
2020-07-30 09:53:21 -07:00
Ruslan Fomkin
5696668500 Test detach_tablespaces on distributed hypertable
Adds a test to call detach_tablespaces on a distributed hypertable.
Since no tablespaces can be attached to distributed hypertables, the
test detaches 0 tablespaces. Also a test to detach tablespaces on a
data node is added.
2020-07-30 10:05:25 +02:00
Sven Klemm
6a3d31d045 Rename variable to be according to our code style
This changes the parseState variable in the telemetry code to
parse_state to conform with our code style.
2020-07-30 05:03:52 +02:00
Sven Klemm
7527a7deba Add support for calling custom functions to bgw scheduler
This patch adds internal support for calling user-defined
functions and procedures to the background worker scheduler.
2020-07-30 05:03:52 +02:00
Erik Nordström
7c4247c3fb Add test for concurrent continuous aggregate refresh
This change adds a new isolation test for concurrent refreshing on a
continuous aggregate. Although a bucket (group) in a continuous
aggregate should be unique on the GROUP BY columns there is no unique
constraint on the materialized hypertable to protect against duplicate
buckets. Therefore, concurrent refreshes can result in duplicate rows
in the materialized hypertable although such duplicates should not be
possible by the underlying query's definition.
2020-07-30 01:04:32 +02:00
Erik Nordström
84fd3b09b4 Add refresh function for continuous aggregates
This change adds a new refresh function called
`refresh_continuous_aggregate` that allows refreshing a continuous
aggregate over a given window of data, called the "refresh window".

This is the first step in a larger overhaul of the continuous
aggregate feature with the goal of cleaning up the API and separating
policy from the core functionality.

Currently, the refresh function does a brute-force refresh of a window
and it bypasses the whole invalidation framework. Future updates
intend to integrate with this framework (with modifications) to
optimize refreshes. An exclusive lock is taken on the continuous
aggregate's internal materialized hypertable in order to protect
against concurrent refreshing. However, as this serializes refreshes,
we might want to relax this locking in the future to allow, e.g.,
concurrent refreshes of non-overlapping windows.

The new refresh functionality includes basic tests for bad input and
refreshing across different windows. Unfortunately, a bug in the
optimization code for `time_bucket` causes timestamps to overflow the
allowed MAX time. Therefore, refresh windows that are close to the MAX
allowed size are not yet supported or tested.
2020-07-30 01:04:32 +02:00
Sven Klemm
a3a668e654 Fix formatting of query in extension test 2020-07-30 00:00:57 +02:00
Sven Klemm
5a410736a9 Only run chunk_api test on debug build
The chunk_api test requires a debug build for certain test functions.
This patch changes the chunk_api test to only run for debug builds.
2020-07-30 00:00:57 +02:00
gayyappan
7d3b4b5442 New size utils functions
Add hypertable_detailed_size , chunk_detailed_size,
hypertable_size functions.
Remove hypertable_relation_size,
hypertable_relation_size_pretty, and indexes_relation_size_pretty
Remove size information from hypertables view.
2020-07-29 15:30:39 -04:00
Sven Klemm
3e83577916 Refactor reorder policy
This patch changes the reorder policy to store its configuration
in the bgw_job table and removes the bgw_policy_reorder table.
2020-07-29 12:07:13 +02:00
Erik Nordström
09d37fa4f7 Fix memory issues when scanning chunk constraints
A function to lookup the name of a chunk constraint returned a pointer
to string without first copying the string into a safe memory
context. This probably worked by chance because everything in the scan
function ran in the current memory context, including the deforming of
the tuple. However, returning pointers to data in deformed tuples can
easily cause memory corruption with the introduction of other changes
(such as improved memory management).

This memory issue is fixed by explicitly reallocating the string in
the memory context that should be used for any returned data.

Changes are also made to avoid unnecessarily deforming tuples multiple
times in the same scan function.
2020-07-29 10:40:12 +02:00
Erik Nordström
a311f3735d Adopt table scan methods for Scanner
This change makes the Scanner code agnostic to the underlying storage
implementation of the tables it scans. This also fixes a bug that made
it impossible to use non-heap table access methods on a
hypertable. The bug existed because a check is made for existing data
before a table is made into a hypertable. And, since this check reads
data from the table using the Scanner, it must be able to read the
data irrespective of the underlying storage.

As a result of the more generic scan interface, resource management is
also improved by delivering tuples in reference-counted tuple table
slots. A backwards-compatibility layer is used for PG11, which maps
all table access functions to the heap equivalents.
2020-07-29 10:40:12 +02:00
Sven Klemm
aec8758b06 Improve update test github output
This patch adds the diff output of the update test in a separate step
in the workflow and also uploads the update test diff as artifact.
2020-07-28 18:15:14 +02:00
Mats Kindahl
6f64f959db Propagate privileges from hypertables to chunks
Whenever chunks are created, no privileges are added to the chunks.
For accesses that go through the hypertable permission checks are
ignored so reads and writes will succeed anyway. However, for direct
accesses to the chunks, permission checks are done, which creates
problems for, e.g., `pg_dump`.

This commit fixes this by propagating `GRANT` and `REVOKE` statements
to the chunks when executed on the hypertable, and whenever new chunks
are created, privileges are copied from the hypertable.

This commit does not propagate privileges for distributed hypertables;
that is handled in a separate commit.
2020-07-28 17:42:52 +02:00