Previously, the Chunk struct was used to represent both a full
chunk and the stub used for joins. The stub only contained valid
values for some chunk fields and not others; after the join
determined that a Chunk was complete, the remaining fields were
filled in. The fact that a chunk could have only some fields
filled out at different times made the code hard to follow and
error prone.
This change separates the stub state of the chunk into its own
struct that omits the not-yet-filled-in fields. This leverages
the type system to prevent access to invalid fields during the
join phase and makes the code easier to follow.
Removes a duplicate call to setup_append_rel_array and avoids allocating
another append_rel_array with the same values when planning queries
on hypertables.
We added a timescaledb.ignore_invalidation_older_than parameter for
continuous aggregates. This parameter accepts a time interval (e.g. 1
month). If set, it limits the amount of time for which invalidations
are processed. Thus, if
timescaledb.ignore_invalidation_older_than = '1 month'
then any modifications to data older than 1 month from the current
timestamp at insert time will not cause updates to the continuous
aggregate. This limits the amount of work that a backfill can trigger.
The parameter must be >= 0. A value of 0 means that invalidations are
never processed.
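A minimal sketch of how the setting could be applied, assuming the 1.x ALTER VIEW syntax for continuous aggregate options (the view name is illustrative):

-- Ignore invalidations from backfill older than one month at insert time.
ALTER VIEW conditions_hourly SET (
    timescaledb.ignore_invalidation_older_than = '1 month'
);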
When recording invalidations for the hypertable at insert time, we use
the maximum ignore_invalidation_older_than of any continuous agg attached
to the hypertable as a cutoff for whether to record the invalidation
at all. When materializing a particular continuous agg, we use that
agg's ignore_invalidation_older_than cutoff. However, we have to apply
that cutoff relative to the insert time, not the materialization
time, to make it easier for users to reason about. Therefore,
we record the insert time as part of the invalidation entry.
On older point releases (e.g. 10.2) the step size in isolation
tests is smaller leading to "SQL step too long" errors. This
PR splits up the setup step to avoid this error.
For continuous aggregate views like
select time_bucket(), sum(col)
from ...
group by time_bucket(), grpcol;
where grpcol is missing from the select targetlist, the
partialize query's select targetlist is incorrect and the view
cannot be materialized. This PR fixes that issue.
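A concrete sketch of the view shape that triggered the bug (names are illustrative); device_id plays the role of grpcol, appearing in GROUP BY but not in the select targetlist:

CREATE VIEW conditions_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket, sum(temp)
FROM conditions
GROUP BY time_bucket('1 hour', time), device_id;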
This changes the continuous aggregate materialization logic so that
the max_interval_per_job (MIPJ) setting applies to invalidation entries
as well as new ranges in the materialization. The new logic is that the
MIPJ setting limits the sum of work done by the invalidations
and new ranges. Invalidations take precedence, so new ranges
are only processed if there is time left over in the MIPJ
budget after all invalidations are done.
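For reference, a hedged sketch of where the budget is configured, assuming the 1.x continuous aggregate options (the view name and value are illustrative):

-- The per-job budget now covers invalidation reprocessing plus new ranges.
ALTER VIEW conditions_hourly SET (
    timescaledb.max_interval_per_job = '12 hours'
);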
This forces us to calculate the invalidation range during the first
transaction. We still delete and/or cut the invalidation entries
in the second transaction. This change also more neatly separates concerns:
all decisions about the work to be done happen in the first txn, while only
execution happens in the second. Further refactoring could make
this clearer by passing a list of InternalRanges to represent the
work. But this PR is big enough, so that's left to a future refactor.
Note: There is remaining work to be done in breaking up invalidation
entries as created during inserts to constrain the length of the entries.
But that's a separate issue to be addressed in the future.
Refactor the continuous aggregate validation to use our function cache
to check for the bucketing function. This simplifies the code and allows
adding support for other bucketing functions, like date_trunc, later on.
In some cases the _temp variable will not be set because pg_config does
not return any output for a specific flag. This results in an
error when doing a comparison using STREQUAL, and in a build failure.
Wrapping the variable in double quotes fixes the problem.
Previously, refresh_lag in continuous aggs was calculated
relative to the maximum timestamp in the table. Change the
semantics so that it is relative to now(). This is more
intuitive.
Requires an integer_now function to be set on hypertables
with integer-based time dimensions.
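A minimal sketch of that requirement, using the set_integer_now_func API on an illustrative hypertable with a bigint time column:

-- Provide a "current time" for an integer time dimension so that
-- refresh_lag can be computed relative to now().
CREATE OR REPLACE FUNCTION unix_now() RETURNS BIGINT
    LANGUAGE SQL STABLE AS $$ SELECT extract(epoch FROM now())::BIGINT $$;
SELECT set_integer_now_func('conditions', 'unix_now');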
This maintenance release contains bugfixes since the 1.5.0 release. We deem it low
priority for upgrading.
In particular, the fixes contained in this maintenance release address
potential segfaults but no other security vulnerabilities. The bugfixes
are related to bloom indexes and updates from previous versions.
**Bugfixes**
* #1523 Fix bad SQL updates from previous updates
* #1526 Fix hypertable model
* #1530 Set active snapshots in multi-xact index create
**Thanks**
* @84660320 for reporting an issue with bloom indexes
Type functions have to be CREATE OR REPLACEd on every update
since they need to point to the correct .so. Thus,
split the type definitions into pre, functions,
and post parts and rerun the functions part both at
pre_install and on every update.
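A hedged sketch of the pattern used in the "functions" part; the function shown is one of the compressed_data type's support functions, and the exact names and C symbol are assumptions:

-- Re-pointed at the versioned .so on every install and update.
CREATE OR REPLACE FUNCTION _timescaledb_internal.compressed_data_in(CSTRING)
    RETURNS _timescaledb_internal.compressed_data
    AS '@MODULE_PATHNAME@', 'ts_compressed_data_in'
    LANGUAGE C IMMUTABLE STRICT;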
Set active snapshots when creating txns during index
create with timescaledb.transaction_per_chunk. This
is needed for some index types like `bloom`.
Tests not added since we don't want dependencies on contrib modules
like bloom.
Fixes #1521.
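A hedged example of the affected scenario, assuming the contrib bloom module and the timescaledb.transaction_per_chunk index option (table and column names are illustrative):

CREATE EXTENSION IF NOT EXISTS bloom;
-- Builds the index one chunk per transaction; each txn needs an active snapshot.
CREATE INDEX metrics_bloom_idx ON metrics USING bloom (device_id, sensor_id)
    WITH (timescaledb.transaction_per_chunk);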
1. This commit introduces changes to existing plans due
to the addition of new chunks to metrics_ordered_idx.
2. Add tests for constraint aware appends on compressed
tables.
The update logic from 1.4.2 to 1.5.0 had an error where
the _timescaledb_catalog.hypertable table was altered in such
a way that the table was not rewritten. This caused
bugs in the catalog processing code. A CLUSTER forces the
table to be rewritten. We also backpatch this change to the 1.4.2--1.5.0
script to help anyone building from source.
Also fixes a similar error on _timescaledb_catalog.metadata
introduced in the 1.3.2--1.4.0 update.
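For reference, a sketch of the repair, assuming the catalog table is clustered on its primary-key index (the index name is an assumption):

-- Forces a rewrite of the catalog table so on-read defaults are materialized.
CLUSTER _timescaledb_catalog.hypertable USING hypertable_pkey;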
PG11 added an optimization where columns that were added by
an ALTER TABLE that had a DEFAULT value did not cause a table
re-write. Instead, those columns are filled with the default
value on read.
But, this mechanism does not apply to catalog tables and does
not work with our catalog scanning code. This test makes
sure we never have such ALTERs in our updates.
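To illustrate the mechanism being guarded against (a generic example, not the actual test): in PG11+, adding a column with a DEFAULT records the value in pg_attribute instead of rewriting the table:

ALTER TABLE some_table ADD COLUMN note TEXT DEFAULT 'n/a';
-- atthasmissing = true means the default is applied on read, with no rewrite.
SELECT attname, atthasmissing, attmissingval
FROM pg_attribute
WHERE attrelid = 'some_table'::regclass AND attname = 'note';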
The construct used for pushing down produces a warning on certain
older compilers, so while it was correct, this patch changes it to
get rid of the warning and to prevent introducing an imbalance later.
The `test_sanitizer.sh` test failed because source code was being
copied from the host to the container as user `postgres` and this user
did not have read permissions on the mounted directory. This is fixed
by copying the files as `root` and then changing the owner to
`postgres`.
The commit also removes `wait_for_pg` since PostgreSQL server status is
not relevant for the tests since they start their own temporary
instance.
The commit also switches to using here-documents for the execution,
for readability purposes.
The main reason to run ARM tests was not to identify issues with
ARM but to identify 32-bit issues, e.g. int64 passed as a pointer
instead of by value. Those issues don't need ARM emulation and can be
tested with i386, which is much faster.
Fix tests that fail like so:
test=# CREATE CAST (customtype AS bigint)
test-# WITHOUT FUNCTION AS ASSIGNMENT;
ERROR: source and target data types are not physically compatible
A previous change made `UNIX` and `APPLE` build flags mutually
exclusive instead of complementary. This broke builds on, e.g., Mac OS
X.
The changes in this commit will make builds work on Mac OS X again.
When linking the extensions as shared libraries, the linker flags from
`pg_config` are not used. This means that if `PG_PATH` is provided and
refers to a locally compiled Postgres installation, shared libraries
from that installation will not be used. Instead, any default-installed
version of Postgres will be used.
This commit adds `PG_LDFLAGS` to `CMAKE_SHARED_LINKER_FLAGS` and
`CMAKE_MODULE_LINKER_FLAGS`.
To handle the fact that Windows sets some fields to "not recorded" when
they are not available, this commit introduces a CMake function
`get_pg_config` that replaces such values with `<var>-NOTFOUND` so that
they are treated as undefined by CMake.
This release adds major new features and bugfixes since the 1.4.2 release.
We deem it moderate priority for upgrading.
This release adds compression as a major new feature.
Multiple type-specific compression options are available in this release
(including DeltaDelta with run-length-encoding for integers and
timestamps; Gorilla compression for floats; dictionary-based compression
for any data type, but specifically for low-cardinality datasets;
and other LZ-based techniques). Individual columns can be compressed with
type-specific compression algorithms as rows in Postgres' native row-based
format are rolled up into columnar-like arrays on a per-chunk basis.
The query planner then handles transparent decompression for compressed
chunks at execution time.
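A brief usage sketch, assuming the 1.5 compression API (table, column, and interval values are illustrative):

-- Enable compression and compress chunks older than a week.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby = 'time DESC'
);
SELECT compress_chunk(c)
FROM show_chunks('metrics', older_than => INTERVAL '7 days') c;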
This release also adds support for basic data tiering by supporting
the migration of chunks between tablespaces, as well as support for
parallel query coordination to the ChunkAppend node.
Previously ChunkAppend would rely on parallel coordination in the
underlying scans for parallel plans.
Histogram's combine function segfaulted if both state1
and state2 were NULL. I could only reproduce this case in
PG 10. Add a test that hits this with PG 10.4.
Fixes #1490.
When restoring a database, people would encounter errors if
the restore happened after telemetry had run. This is because
an 'exported_uuid' field would then exist, and people would encounter
a "duplicate key value" error when the restore tried to overwrite it.
We fix this by moving this metadata to a different key
in pre_restore and trying to move it back in post_restore.
If the restore creates an exported_uuid, that restored
value is used and the moved version is simply deleted.
We also remove the error redirection in restore so that errors
will show up in tests in the future.
Fixes #1409.
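A sketch of the intended workflow, assuming the existing pre/post restore helpers (how the dump is restored is up to the user):

SELECT timescaledb_pre_restore();   -- moves exported_uuid out of the way
-- run pg_restore against the database here
SELECT timescaledb_post_restore();  -- moves it back, or keeps the restored value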
Several fixes:
- Change incorrect variable name in CMakeLists that prevented tests
from running.
- Add a PG 10.10 test to codecov
- Remove unused CODECOV_FLAGS in travis.yml
The following fields are added:
- num_compressed_hypertables
- compressed_KIND_size
- uncompressed_KIND_size
Where KIND = heap, index, toast.
`num_hypertables` field does NOT count the internal hypertables
used for compressed data.
We also removed internal continuous aggs tables from the
`num_hypertables` count.
We want compressed data to be stored out-of-line whenever possible so
that the headers are colocated and scans on the metadata and segmentbys
are cheap. This commit lowers toast_tuple_target to 128 bytes, so that
more tables will have this occur; using the default size, very often a
non-trivial portion of the data ends up in the main table, and only
very few rows are stored in a page.
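Conceptually, the change amounts to something like the following on the internal compressed tables (a sketch only; the table name is illustrative and this assumes a PG version where the toast_tuple_target storage parameter is available):

ALTER TABLE _timescaledb_internal._compressed_hypertable_2
    SET (toast_tuple_target = 128);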
Prior to PG 10.4, the costing for hashaggs was different (see
PG commit `1007b0`). We fix the tests so they do not change between the
old and new versions by disabling hashagg. We were not testing
for that anyway.
This commit adds tests for DATE, TIMESTAMP, and FLOAT compression and
decompression, NULL compression and decompression in dictionaries and
fixes a bug where the database would refuse to decompress DATEs. This
commit also removes the fallback allowing any binary compatible 8-byte
types to be compressed by our integer compressors as I believe I found
a bug in said fallback last time I reviewed it, and cannot recall what
the bug was. These can be re-added later, with appropriate tests.
We reset the varoattno when creating the equivalence
member for segmentby columns. varoattno is used for
finding equivalence members (EMs) when searching for pathkeys
(although, strangely, not for indexclauses). Without this change,
the code for finding matching EMs differs between the case where attnos
have changed and the case where they haven't.
Fixing this allows the planner to plan a wider variety of paths
for several tests. Because of the way the cost fuzzer in
`compare_path_costs_fuzzily` interacts with disabling seqscans,
some choices the planner makes have changed (pretty much the cost
is dominated by the penalty of the seqscan, so it picks the
first available path). We've changed some enable_seqscan clauses
to get around this and have the planner show what we want in tests.
Also delete transparent_decompression-9.6.out since compression is
disabled on 9.6.
2 fixes:
- The no-ssl telemetry tests should always run in debug tests
- Run EXPLAIN on a query instead of running the query itself, because
the query runs out of memory in our underpowered ARM tests and the bug
we are looking for is in the planner anyway.
This PR adds test infrastructure for running tests with shared tables.
This allows having hypertables with specific configurations usable for
all tests. Since these tests also don't require creating a new database
for each test case, some of the overhead of the normal tests is removed.
While this will lead to much faster query tests, some tests will still
require their own database to test things, but most queries could be moved
to this infrastructure to improve test coverage and speed them up.
Queries with the first/last optimization on compressed chunks
would not properly decompress data but instead access the uncompressed
chunk. This patch fixes the behaviour and also unifies the check
for whether a hypertable has compression.
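An example of a query shape covered by the fix (schema is illustrative); first/last must now read the data through decompression rather than from the original uncompressed chunk:

SELECT device_id, first(temp, time), last(temp, time)
FROM metrics
GROUP BY device_id;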