Add license information to files missing it and fix the license
check script to honor the return code of both the Apache license
check and the TSL license check. Previously, errors occurring during
the Apache license check were not reflected in the return code of
the script, so only the TSL license check was effective.
When the continuous aggregate refresh_interval setting is modified,
the job schedule is not updated until the next scheduled job
runs. This fix addresses that by updating the next_start time of
the background worker (bgw) job.
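A rough sketch of the user-facing effect, assuming the 1.x
`ALTER VIEW ... SET` syntax for continuous aggregate options (the view
name and interval are made up):
```
-- Changing the refresh interval should now also move the job's next_start,
-- so the new interval takes effect without waiting for the next scheduled run.
ALTER VIEW conditions_summary SET (timescaledb.refresh_interval = '30 minutes');
```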
The primary key on continuous_aggs_materialization_invalidation_log
prevents storing multiple records with the same materialization id.
Remove the primary key so that multiple entries per materialization id
are allowed.
Calling `create_hypertable` with a space dimension silently succeeds
without actually creating the space dimension if `num_partitions` is
not specified.
This change ensures that we raise an appropriate error when a user
fails to specify the number of partitions.
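For illustration, assuming a hypothetical "conditions" table with a
"device" column, a call like the first one below should now raise an
error instead of silently skipping the space dimension:
```
-- Should now error: a partitioning column is given but no partition count
SELECT create_hypertable('conditions', 'time', 'device');

-- OK: the number of partitions is specified explicitly
SELECT create_hypertable('conditions', 'time', 'device', 4);
```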
With ordered append, chunk exclusion occurs only along the primary open
"time" dimension, failing to exclude chunks along additional
partitioning dimensions. For instance, a query on a two-dimensional
table "hyper" (time, device), such as
```
SELECT * FROM hyper
WHERE time > '2019-06-11 12:30'
AND device = 1
ORDER BY time;
```
would only exclude chunks based on the "time" column restriction, but
not the "device" column restriction. This causes an unnecessary number
of chunks to be included in the query plan.
This happens because chunk exclusion during ordered append is based
on pre-sorting the set of slices in the primary dimension to
determine ordering. This is followed by a scan for chunks
slice-by-slice in the order of the sorted slices. Since those
scans do not include the restrictions in other dimensions, chunks that
would otherwise not match are included in the result.
This change fixes the issue by using the "regular" chunk scan, which
accounts for multi-dimensional restrictions. This is followed by a sort
of the resulting chunks along the primary "time" dimension.
While this sometimes means sorting a larger set than the initial set of
slices in the primary "time" dimension, the resulting chunk set is
smaller. Sorting chunks also allows doing a secondary ordering
on chunk ID for chunks that belong to the same "time"
slice. While this additional ordering is not required for correct
tuple ordering, it gives slightly nicer EXPLAIN output since chunks
are also ordered by ID.
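As a rough way to observe the change, one might compare plans for the
example query above; with this fix the plan should include only chunks
that satisfy both the "time" and "device" restrictions:
```
EXPLAIN (costs off)
SELECT * FROM hyper
WHERE time > '2019-06-11 12:30'
  AND device = 1
ORDER BY time;
```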
The test had a race condition between the Insert and the Refresh. When
the lock on the invalidation log is released, if the Insert acquires it
first it writes a new invalidation that is picked up by the Refresh; if
the Refresh acquires the lock first, it sees no invalidations. Both are
valid execution paths but caused non-determinism in the tests.
Add checks for `pg_isolation_regress` and `pg_regress`, and do not run
the tests if the programs cannot be found, either installed in the
default directories or provided via the `PG_PATH` or `PG_SOURCE_DIR`
variables.
SIGHUPs can be dropped between background worker startup and
the call into the background worker entrypoint. Namely, Postgres
calls `pqsignal(SIGHUP, SIG_IGN)` inside `StartBackgroundWorker`.
Thus, SIGHUPs will be ignored before the call to the entrypoint.
This creates a possible race condition where a config file change
is not correctly processed (and is instead ignored). We prevent
this by always processing the config file after we set our own
signal handler.
We also fix the tests here in two ways:
1) We disable background workers in this test (or, rather, delete the
line that starts them back up).
2) We put in a new mock timer option that calls the standard timer wait.
This allows us to test proper latch processing in the SIGHUP case.
We believe that this resolves some flakiness in our tests as well.
Make Travis test PRs and other branches on the latest released
PG patch version for each major version. Cron still tests the earliest
supported patch version (and the ABI tests run by cron cover the
tip of each major version).
Previously, the full report was returned even if telemetry was
disabled. Now, the user is reassured that telemetry is disabled and
given the option to view the report locally.
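A hedged sketch of how the report can still be inspected locally; this
assumes `get_telemetry_report` takes a flag (here called
`always_display_report`) to force the report to be shown:
```
-- With telemetry disabled, this should return a notice that telemetry is off
SELECT get_telemetry_report();

-- Passing the flag should show the report locally without sending it anywhere
SELECT get_telemetry_report(always_display_report => true);
```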
The initial implementation for Ordered Append required the ORDER BY
clause expression to match the time partitioning column.
This patch loosens that restriction and also applies the Ordered
Append optimization to queries with ORDER BY on `time_bucket` and
`date_trunc` expressions.
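For example, assuming a hypertable "hyper" partitioned on "time", a
query of this form can now use Ordered Append:
```
SELECT time_bucket('1 hour', time) AS bucket, device, temp
FROM hyper
ORDER BY bucket DESC
LIMIT 100;
```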
The Order display of the ChunkAppend node used the output of
the node instead of its input when resolving the targetlist
index for the order information, leading to an incorrect display
when the Sort column was not passed through or its position
changed.
The regexp-based test output filtering in `runner.sh` does not work
with the Mac OS/BSD version of `sed` since it doesn't use the
extended/modern regexp syntax by default. This makes some tests fail
on Mac OS X/BSD.
This change makes the filter use non-extended regexp syntax to fix
this issue.
We use the InvalidationThreshold as a barrier to ensure that all
transactions see an invalidation threshold. Upon transaction commit,
all transactions grab a lock on said table, while during the first
phase of materialization, the materializer grabs an AccessExclusive
lock on the table. This is supposed to ensure that all
INSERT/UPDATE/DELETEs are ordered strictly before or after the first
phase of materialization, and thus any mutations to newly materialized
data will either be part of the new materialization or append an
invalidation. Unfortunately, the scanner does not hold its locks until
the end of the transaction, and since that was the only way we were
taking a lock on the InvalidationLog during mutations, it was not
functioning as a barrier. To fix this, we now take an explicit lock.
Make sure that you can't add a view to a schema without CREATE
privileges, and that you can't use a function for which you don't
have EXECUTE privileges.
The latter case is also tested with background workers.
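A minimal sketch of the intended behavior, with made-up role, schema,
and table names:
```
-- As a role that has no CREATE privilege on schema "restricted",
-- creating a continuous aggregate view there should now fail.
CREATE VIEW restricted.device_summary
  WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 day', time) AS day, avg(temp)
  FROM conditions
  GROUP BY 1;
```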
The following functions have had permission checks
added or adjusted:
ts_chunk_index_clone
ts_chunk_index_replace
ts_hypertable_insert_blocker_trigger_add
ts_current_license_key
ts_calculate_chunk_interval
ts_chunk_adaptive_set
The following functions have been removed from the regular SQL install.
They are only installed and used in tests:
dimension_calculate_default_range_open
dimension_calculate_default_range_closed
Since some of our background workers now execute user defined
functions, we should make them execute under the roles of
the objects that are associated with them (as defined by
ts_bgw_job_owner).
This prevents attacks such as a UDF executing arbitrary code
under the default BGW user. Currently, the only possible concern
is continuous aggs, but this solution protects all
BGW jobs.
To create a continuous agg you now only need SELECT and
TRIGGER permission on the raw table. To continue refreshing
the continuous agg the owner of the continuous agg needs
only SELECT permission.
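A hedged example of the minimal grants, using the 1.x continuous
aggregate syntax and made-up names:
```
-- The raw table owner grants the minimum needed to create the aggregate
GRANT SELECT, TRIGGER ON conditions TO cagg_user;

-- As cagg_user: creating the continuous aggregate should now succeed
CREATE VIEW conditions_daily
  WITH (timescaledb.continuous) AS
  SELECT time_bucket('1 day', time) AS day, avg(temp)
  FROM conditions
  GROUP BY 1;

-- Continued refreshing only requires that cagg_user keeps SELECT on conditions
```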
This commit adds tests to make sure that removing the
SELECT permission removes the ability to refresh, both using
REFRESH MATERIALIZED VIEW and through a background
worker.
This work also uncovered a divergence in the permission logic for
creating triggers on chunks via CREATE TRIGGER versus when new
chunks are created. This has now been unified: there is a check
to make sure you can create the trigger on the main table, and
then there is a check that the owner of the main table can create
triggers on chunks.
Alter view for continuous aggregates is allowed for the owner of the
view.
Items that are "handled" in process utility start never go through
the standard process utility. Thus they may not have permissions
checks called. This commit goes through all such items and adds
permissions checks as appropriate.
This commit fixes and tests permissions in the following
API calls:
- reorder_chunk (test only)
- alter_job_schedule
- add_drop_chunks_policy
- remove_drop_chunks_policy
- add_reorder_policy
- remove_reorder_policy
- drop_chunks
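For instance, with a hypothetical hypertable "conditions", the policy
calls now verify ownership of the hypertable before making changes:
```
-- Both calls should fail with a permission error unless run by the
-- hypertable owner (or a superuser).
SELECT add_drop_chunks_policy('conditions', INTERVAL '6 months');
SELECT remove_drop_chunks_policy('conditions');
```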
A number of TimescaleDB query optimizations involve operations on
functions. This refactor exposes a function cache that can be used to
quickly identify important functions and get access to relevant
auxiliary functionality and/or information. In particular, certain
functions apply to some type of (time) bucketing expression, e.g.,
expressions involving our own `time_bucket` function or PostgreSQL's
`date_trunc`.
This change recognizes the importance of time bucketing and uses the
function cache to access custom functionality around time bucketing
used in query optimizations. For example, both grouping estimates for
hash aggregates and sort transforms can be quickly accessed to make
better use of indexes when bucketing on a time column.
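For example, a bucketing query like the following (on a hypothetical
hypertable "hyper" with an index on "time") can benefit from both the
grouping estimate and the sort transform:
```
SELECT time_bucket('1 minute', time) AS bucket, avg(temp)
FROM hyper
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
```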
This refactor is also done in anticipation that it will be useful
going forward when other types of optimizations are implemented on
time bucketing expressions, or for other functions that can benefit
from this cache.
When the column a Param references is not a partitioning column,
the constraint is not useful for excluding chunks, so we skip
enabling runtime exclusion in those cases.
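A small sketch with a hypothetical hypertable partitioned on "time"
and "device": only the first statement compares a Param against a
partitioning column, so only it remains a candidate for runtime
exclusion:
```
-- "device" is a partitioning column: runtime chunk exclusion may apply
PREPARE by_device(int) AS
  SELECT * FROM hyper WHERE device = $1;

-- "temp" is not a partitioning column: runtime exclusion is skipped
PREPARE by_temp(float) AS
  SELECT * FROM hyper WHERE temp = $1;
```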
Fixes a use-after-free in chunk_append/exec.c. Also, strip out memory
usage from the EXPLAIN ANALYZE output of Sort nodes because it is not
stable across platforms.
In various places, most notably drop_chunks and show_chunks, we
dispatch based on the type of the "time" column of the hypertable, for
things such as determining which interval type to use. With a custom
partition function, this logic is incorrect, as we should instead be
determining this based on the return type of the partitioning function.
This commit changes all relevant access of dimension.column_type to a
new function, ts_dimension_get_partition_type, which has the correct
behavior: it returns the partitioning function's return type, if one
exists, and otherwise uses the column type. After this commit, all
remaining direct references to column_type should have a comment
explaining why this is appropriate.
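A contrived, hypothetical sketch of the behavior; the function, table,
and parameter names (in particular `time_partitioning_func`) are
assumptions about the 1.x API, not taken from this change:
```
-- The column is text, but the partitioning function returns bigint, so
-- drop_chunks/show_chunks should now dispatch on bigint rather than text.
CREATE FUNCTION ts_to_bigint(ts text) RETURNS bigint
  AS $$ SELECT length($1)::bigint $$ LANGUAGE SQL IMMUTABLE;

SELECT create_hypertable('events', 'ts',
                         chunk_time_interval => 1000,
                         time_partitioning_func => 'ts_to_bigint');

SELECT show_chunks('events', older_than => 500::bigint);
```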
Fixes GitHub issue #1250.
Given a query like:
SELECT * FROM hyper WHERE time_bucket(10, time) < 100
where the `time` column has type `bigint`,
the current `time_bucket` parser assumes the type of the (Const*) 100 to
be the same as the type of the `time` column of the table.
This does not have to be the same for integer types: `time` can be a
`bigint`, but if the operand fits in an `int`, the relevant (Const*)
object will have type `int` (reflected in its `consttype` attribute).
This PR makes sure that we use this type information to extract the
value from the Datum accordingly, rather than relying on the type
information of the column.
This patch makes TimescaleDB use ChunkAppend in places where it
used to use ConstraintAwareAppend before.
ConstraintAwareAppend will still be used for MergeAppend nodes
that cannot be changed to Ordered Append or when ChunkAppend is
disabled.
When a query on a hypertable is identified as benefitting from
execution-time exclusion, Append nodes will be replaced by ChunkAppend
nodes.
This will enable the use of runtime exclusion for joins, lateral
joins, subqueries and correlated subqueries.
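For example, with a hypothetical "devices" table joined laterally
against a hypertable "hyper", chunks can now be excluded at execution
time based on the per-row parameter values:
```
SELECT d.id, latest.*
FROM devices d,
     LATERAL (
       SELECT time, temp
       FROM hyper
       WHERE hyper.device = d.id
         AND hyper.time > d.installed_at
       ORDER BY time DESC
       LIMIT 1
     ) AS latest;
```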
Change the test from using the old x_diff method to the new
direct diff method for checking that the results don't differ
between the optimized and unoptimized cases. This cleans up the
golden files so that in case of success they are
nearly empty instead of containing a diff that must
be checked to ensure that it only has explain (and not tuple)
output. Also combine several test files into query.sql, and
get rid of differences between PG versions.