This PR adds the ability to use multiple different versions of the
timescaledb extension in different databases within the same
PostgreSQL instance (server).
This is accomplished by splitting the extension into two .so files:
1) timescaledb.so -- the code under loader/. Really not a lot of code.
This code MUST remain backwards compatible in the future.
2) timescaledb-version.so -- most of our code. Need not be
backwards compatible.
timescaledb.so becomes a small stub that is preloaded and whose main
reason for existing is to dynamically load the right
timescaledb-version.so when the time comes.
This change allows either of the above .so files to be loaded in
shared_preload_libraries. Loading timescaledb.so allows multiple
versions to be used in different databases in the same instance,
along with smoother upgrades. Loading timescaledb-version.so allows
for finer-grained control and lock-in, and is appropriate in only a
few production environments.
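
For example, preloading the loader stub looks like this (standard
PostgreSQL configuration; a server restart is required afterwards):

```sql
-- Preload the small loader stub rather than a versioned .so, so each
-- database can load its own timescaledb-version.so.
ALTER SYSTEM SET shared_preload_libraries = 'timescaledb';
```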
This PR also adds version checking so that a clear failure message
will be displayed if the .so version does not match the SQL extension
version.
To support multi-version functionality we changed the way SQL update
scripts are generated. Previously, the system used a chain of
intermediate upgrade scripts, so with 3 versions you would have the
update scripts 1--2 and 2--3. This PR changes things so that we
instead produce direct "shortcut" update files: 1--3 and 2--3.
This is done for 2 reasons:
1) Each of the update files should point to
$libdir/timescaledb-current_version, since you cannot guarantee that
the .so for each intermediate version has been installed.
2) You don't want intermediate version updates installed without the
corresponding .so. For example, if you have versions 1, 2, and 3 and
you are installing version 3, you want the update files 1--3 and
2--3 but not 1--2, because if you have 1--2 then a user could do
ALTER EXTENSION timescaledb UPDATE TO '2' even though the .so for
version 2 may not be installed.
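
To illustrate reason 1, each shortcut update file re-points the C
functions at the target version's library. A hypothetical excerpt
(version number and C symbol name are illustrative):

```sql
-- From a hypothetical update file timescaledb--1--3.sql: the function
-- is bound to the version-3 library, so the intermediate version-2
-- .so never needs to be installed.
CREATE OR REPLACE FUNCTION time_bucket(bucket_width INTERVAL, ts TIMESTAMP)
RETURNS TIMESTAMP
AS '$libdir/timescaledb-3', 'ts_timestamp_bucket'
LANGUAGE C IMMUTABLE PARALLEL SAFE;
```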
In order to test this functionality, we add a mock extension version
.so so that we can test extension loading inside the regression
framework.
Changing things like the location of pg_config, other macros,
additional cmake flags, etc. used to require modifying the bootstrap
file. This PR instead passes any flags given to bootstrap on to cmake
itself.
When tables are created with storage parameters (reloptions),
these should be inherited by chunks so that one can set options
such as fillfactor and autovacuum settings on the hypertable.
This change makes chunks inherit the reloptions set on a hypertable
and allows altering options via the ALTER TABLE command.
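
For example (table name and settings are illustrative), options set
with ALTER TABLE on the hypertable now propagate to its chunks:

```sql
-- Reloptions set on the hypertable are now inherited by its chunks.
ALTER TABLE conditions SET (fillfactor = 70,
                            autovacuum_vacuum_threshold = 100);
```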
A hypertable's associated schema is used to create and store internal
data tables (chunks). A hypertable creates tables in that schema,
typically with full superuser permissions, regardless of whether the
hypertable's owner or the current user has permissions for the schema.
If the schema doesn't exist, the hypertable will create it when
creating the first chunk, even if the user or table owner does
not have permissions to create schemas in the database.
This change adds proper permissions checks to create_hypertable() so
that users cannot create hypertables with a custom associated schema
unless they have the proper permissions on the schema or the database.
Chunks are also no longer created with internal schema permissions if
the associated schema is something different from the internal schema.
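
For instance (names illustrative), the following now requires the
current user to have the proper permissions on the custom schema, or
on the database if the schema must be created:

```sql
-- Fails with a permissions error unless the user may use (or create)
-- the custom associated schema.
SELECT create_hypertable('conditions', 'time',
                         associated_schema_name => 'measurement_chunks');
```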
This change is part of an effort to create a consistent way
of dealing with metadata catalog updates, which is currently
a mix of C API and INSERT/UPDATE/DELETE statements from SQL
code. This mix makes catalog handling unnecessarily complex as
there are multiple ways to update metadata, increasing the risk
of security issues with publicly exposed SQL functions. It also
complicates things like cache invalidation, requiring different
mechanisms for C and SQL code. Catalog updates from SQL code
require triggers on metadata tables for cache invalidation that
do not work with native catalog updates.
The creation of chunks has been particularly messy in this regard,
making the code hard to follow, especially the handling of a chunk's
constraints, where dimensional and other constraints were handled
differently. With this change, constraint handling is now consistent
across constraint types with a single API for updating metadata.
Reduce memory usage for out-of-order inserts
The chunk_result_relation_info should be put on the chunk memory
context. This will cause the RRI constraint expressions to also go
onto that context and be correctly freed when the chunk insert state
is destroyed.
Compatibility with pg_upgrade required 2 changes:
1) search_path on functions cannot be blank for pg_upgrade.
2) The timescaledb.restoring GUC had to apply to more code (it is now
checked at a higher level).
`pg_upgrade` must be passed the following option: `-O "-c timescaledb.restoring='on'"`
A change to the COPY test made the sort order ambiguous
for certain tuples, which breaks the test on some machines
where the sort order might differ from the expected one.
This fix makes the sort order more predictable.
Users can now either force the removal of an existing BUILD_DIR
by passing BUILD_FORCE_REMOVE to ./bootstrap, or they will be
prompted to remove the dir if it already exists. Additionally, the
bootstrap script will now exit if there is an error in mkdir or
cmake.
When testing on Windows, these tests were returning differences
because of floating-point differences (sql_query_results_*) and
because the cost was trivially higher (insert_single). Here we
remove the costs, since they were not relevant to the regressions we
want to detect. Further, we truncate the floating-point averages
to 8 decimal places, enough to see significant differences.
A hypertable's tablespaces are now always retrieved from
the tablespace metadata table instead of being cached
with the hypertable. This avoids having to do cache invalidation
when updating the tablespace table.
Tablespaces can now be detached from hypertables using
`tablespace_detach()`. This function can detach a tablespace
either from all tables or from only a specific table.
The ability to detach tablespaces allows more advanced storage
management; for instance, one can detach tablespaces that are
running low on disk space while attaching new ones to replace them.
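
A sketch of the two forms described above (invocations assumed from
the description; names illustrative):

```sql
-- Detach the tablespace from a specific hypertable only.
SELECT tablespace_detach('disk_old', 'conditions');
-- Detach the tablespace from all tables it is attached to.
SELECT tablespace_detach('disk_old');
```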
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
The user should be able to add time dimensions using INTERVAL when
the column type is TIMESTAMP/TIMESTAMPTZ/DATE, so this change adds
that support.
It also adds additional tests and checks for add_dimension, e.g., a
nice error when the table is not a hypertable.
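
A sketch (names illustrative; the parameter name is an assumption):

```sql
-- Add a time dimension specifying its chunk interval as an INTERVAL;
-- the column type here is TIMESTAMPTZ.
SELECT add_dimension('conditions', 'observed_at',
                     chunk_time_interval => INTERVAL '1 day');
```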
For convenience, the user should be able to specify the new
chunk time intervals using INTERVAL datatype if the hypertable is
using a TIMESTAMP/TIMESTAMPTZ/DATE datatype for its time column.
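
For example (table name illustrative):

```sql
-- The new chunk time interval can now be given as an INTERVAL.
SELECT set_chunk_time_interval('conditions', INTERVAL '1 week');
```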
Previously the test generator would create `parallel` tests with a
suffix consisting of the PG major and minor version numbers. This
is unnecessary for PostgreSQL 10 currently since we don't expect
major breakages between minor versions. So now instead the file
is suffixed with just the major version when it is PG10.
Remove the user from the regression tests so that the local test will
work. The tests themselves will still be run with the appropriate
user because of the logic in runner.sh.
This PR fixes the handling of drop_chunks when the hypertable's
time column is of type TIMESTAMP or DATE. Previously, such
hypertables needed drop_chunks to be given a timestamptz in UTC.
Now, drop_chunks can take a DATE or TIMESTAMP. Also, the INTERVAL
version of drop_chunks correctly handles these cases.
A consequence of this change is that drop_chunks cannot be called
on multiple tables (with table_name = NULL or schema_name = NULL)
if the tables have different time column types.
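
A sketch of the supported forms (argument order assumed from the
drop_chunks signature of the time; names illustrative):

```sql
-- Native-type cut-off for a TIMESTAMP time column.
SELECT drop_chunks(TIMESTAMP '2017-01-01 00:00:00', 'conditions');
-- INTERVAL form, relative to now.
SELECT drop_chunks(INTERVAL '3 months', 'conditions');
```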
Windows 64-bit binaries should now be buildable using the cmake
build system either from the command line or from Visual Studio.
Previous issues regarding unresolved symbols have been resolved by
using compatibility header files to properly export symbols or by
getting GUCs via normal APIs.
Plans that have a result relation set (DELETE or UPDATE) should
not optimize subplans that contain append nodes. This is because
PostgreSQL must turn such plans on an inheritance table into
a set of similar plans on each subtable (i.e., it needs to apply
UPDATE and DELETE to each subtable and not just the parent).
Optimizing such plans with, e.g., ConstraintAwareAppend makes
the planner believe it only needs to do the DELETE or UPDATE
on the root table, which means rows in subtables won't be affected.
TimescaleDB cache invalidation happens as a side effect of doing a
full SQL statement (INSERT/UPDATE/DELETE) on a catalog table (via
table triggers). However, triggers aren't invoked when updates are
made using PostgreSQL's internal catalog API, since such updates
bypass the full statement parsing, planning, and execution that
trigger invocation requires.
Since we are now using the regular PostgreSQL catalog update API for
some TimescaleDB catalog operations, we also need to do cache
invalidation for such operations.
This change adds cache invalidation when updating catalogs using the
internal (C) API and also makes the cache invalidation more fine
grained. For instance, caches are no longer invalidated on some
INSERTS that do not affect the validity of objects already in the
cache, such as adding a new chunk.
Append plans that are empty (no children) are replaced at planning-
time with a Result plan node that returns nothing. Such Result nodes
occur typically when querying outside the time range covered by chunks.
However, the ConstraintAwareAppend plan node did not handle the
case of a Result child node, instead expecting either an Append or
MergeAppend node. This is now fixed so that encountering a Result node
means doing nothing.
This change reduces the usage of SECURITY DEFINER on SQL
functions and fixes related permissions issues. It also
properly checks hypertable permissions relative to the current_user
instead of the session_user, which otherwise breaks SET ROLE,
among other things.
The reindex test outputs the OID of a cloned index. This OID might
change with the state of the database, added tests, etc., causing
frequent test failures. The test is now updated to output the name of
the index instead of the OID.
This changes the order in which subdirectories are processed in the
top-level CMakeLists.txt file, so that dependencies from the scripts/
dir are set when processing the src/ dir. Otherwise, the `pgindent`
target won't be enabled on the first CMake run.
reindex allows you to reindex the indexes of only certain chunks,
filtering by time. This is a common use case because a user may
want to reindex chunks once they are no longer getting new data.
reindex also has a recreate option which will not use REINDEX
but will rather CREATE INDEX a new index, DROP the old one, and
RENAME the new index to the old name. This approach has advantages
in terms of blocking reads for a much shorter period of time.
However, it does more work and will use more disk space during the
operation.
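
A hypothetical invocation based on the description above (function
and parameter names are assumptions, not the final API):

```sql
-- Recreate the index on chunks older than three months instead of
-- running REINDEX on them.
SELECT reindex('conditions', 'conditions_device_idx',
               older_than => INTERVAL '3 months',
               recreate => true);
```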
Clustering a table means reordering it according to an index.
This operation requires an exclusive lock and extensive
processing time. On large hypertables with many chunks, CLUSTER
risks blocking a lot of operations by holding locks for a long time.
This is alleviated by processing each chunk in a new transaction,
ensuring locks are only held on one chunk at a time.
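
For example (names illustrative), a standard CLUSTER on a hypertable
now proceeds chunk by chunk:

```sql
-- Reorder the hypertable according to one of its indexes; each chunk
-- is clustered in its own transaction, so locks are held on only one
-- chunk at a time.
CLUSTER conditions USING conditions_time_idx;
```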