A fix for updates of version 0.6.1 was lost in the previous PR that
refactored the update process. This change adds back that fix.
The build process for update scripts now also supports versioned
"origin" modfiles, which are included only in the update scripts that
originate from the particular version given by the origin modfile's
name. Origin modfiles make it possible to add fixes that should be
included only when upgrading from that version.
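
The selection of origin modfiles can be sketched roughly as follows. This is only an illustration, not the actual build logic: the function name, the file names, and the ordering (origin modfile first) are all assumptions.

```python
def files_for_update(from_version, common_files, origin_files):
    """Return the ordered list of files to concatenate for one update script.

    common_files: files included in every update script (hypothetical names).
    origin_files: dict mapping an origin version to its version-specific file.
    """
    files = []
    # An origin modfile is included only when upgrading from its version.
    # Placing it first is an assumption made for this sketch.
    if from_version in origin_files:
        files.append(origin_files[from_version])
    files.extend(common_files)
    return files
```

With a hypothetical `{"0.6.1": "origin-0.6.1.sql"}` mapping, only the update script that starts at 0.6.1 picks up the extra file.
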
Update scripts have so far been built by concatenating all the
historical changes and code from previous versions, leading to bloated
update scripts, a complicated script build process, and the need to keep
old C functions in compat.c since those functions are referenced
during updates.
This change greatly simplifies the way update scripts are built,
significantly reducing the size of update scripts (to basically only
the changeset + current code), and removing the need for compat.c.
A few principles of building scripts must be followed going forward,
as discussed in sql/updates/README.md.
This removes the version suffix from SQL files that are copied from
the source directory to the build directory during the build process.
Versioning the files in this step serves no real purpose and only
tends to clutter up the build dir with extra files every time the
version is bumped, requiring manual cleanup.
The cache invalidation triggers on our catalog tables
aren't used anymore as all modifications to catalog tables
happen using the C API, which doesn't invoke triggers and
has its own cache invalidation functionality.
For multi-version upgrades it is necessary to change the location of
trigger functions before doing anything else in upgrade scripts.
Otherwise, it is possible to trigger an event before you change the
location of the functions, which would load the old shared library
and break the system.
This commit also fixes `/sql/timescaledb--0.8.0--0.9.0.sql` to
come from the release build.
Tables can now hold existing data, which is optionally migrated from
the main table to chunks when create_hypertable() is called.
The data migration is similar to the COPY path, with the single
difference that the inserted/copied tuples come from an existing table
instead of being read from a file. After the data has been migrated,
the main table is truncated.
One potential downside of this approach is that all of this happens in
a single transaction, which means that the table is blocked while
migration is ongoing, preventing inserts by other transactions.
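
As a rough illustration of the migration described above, the following toy model routes rows from an in-memory "main table" into chunks and then truncates it. The fixed-width chunking by integer time and all names are assumptions made for the sketch; the real code path works on tuples and chunk metadata.

```python
def migrate_to_chunks(main_table, chunk_interval):
    """Move all rows from main_table (a list of (time, value) tuples) into chunks.

    Returns a dict mapping each chunk's start time to the rows it received.
    """
    chunks = {}
    for row in main_table:
        time = row[0]
        # Align the row's time to the start of the chunk that covers it.
        chunk_start = (time // chunk_interval) * chunk_interval
        chunks.setdefault(chunk_start, []).append(row)
    # After the data has been migrated, the main table is truncated.
    main_table.clear()
    return chunks
```
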
This fixes at least two bugs:
1) A drop of a referenced table used to drop the associated
FK constraint but not the metadata associated with the constraint.
Fixes #43.
2) A drop of a column removed any indexes associated with the column
but not the metadata associated with the index.
This change improves the handling of tablespaces as follows:
- Add if_not_attached / if_attached options to attach_tablespace() and
detach_tablespace(), respectively
- Block DROP TABLESPACE if it is still attached to a table
- Block REVOKE if it means the table owner no longer has CREATE
permissions on an attached tablespace
- Make error messages follow the PostgreSQL style guide
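
The intended effect of the if_not_attached option can be modeled as below. This is a toy sketch under the assumption that the option turns the "already attached" error into a no-op; the in-memory set and the return values are invented for illustration and are not the real function's interface.

```python
def attach_tablespace(attached, tablespace, hypertable, if_not_attached=False):
    """Record that tablespace is attached to hypertable.

    attached: a set of (tablespace, hypertable) pairs, modified in place.
    Returns True if a new attachment was made, False if it was skipped.
    """
    key = (tablespace, hypertable)
    if key in attached:
        if if_not_attached:
            # Assumed semantics: already attached, so do nothing.
            return False
        raise ValueError(
            'tablespace "%s" is already attached to hypertable "%s"' % key
        )
    attached.add(key)
    return True
```
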
This change refactors the handling of TRUNCATE so
that it is performed directly in process utility without
doing an upcall to PL/pgSQL.
It also adds handling for the ONLY modifier to TRUNCATE,
which shouldn't work on a hypertable. TRUNCATE now generates
an error if TRUNCATE ONLY is used on a hypertable.
When chunks are deleted, dimension slices can be orphaned, i.e., there
are no chunks or chunk constraints that reference such slices. This
change ensures that, when chunks are deleted, orphaned slices are also
deleted.
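
A minimal model of the cleanup, assuming chunk constraints are simple (chunk_id, slice_id) pairs; the data structures and names are illustrative only.

```python
def delete_chunk(chunk_id, chunk_constraints, dimension_slices):
    """Delete a chunk's constraints and any slices they leave orphaned.

    chunk_constraints: list of (chunk_id, slice_id) pairs, modified in place.
    dimension_slices: set of slice ids, modified in place.
    Returns the set of slice ids that were removed as orphans.
    """
    # Slices referenced by the chunk being deleted.
    removed = {s for c, s in chunk_constraints if c == chunk_id}
    chunk_constraints[:] = [(c, s) for c, s in chunk_constraints if c != chunk_id]
    # A slice is orphaned if no remaining constraint references it.
    still_referenced = {s for _, s in chunk_constraints}
    orphaned = removed - still_referenced
    dimension_slices.difference_update(orphaned)
    return orphaned
```
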
Deletes on metadata in the TimescaleDB catalog have so far been a mix
of native deletes using the C-based catalog API and SQL-based DELETE
statements that CASCADE.
This mixed environment is confusing, and SQL-based DELETEs do not
consistently clean up objects that are related to the deleted
metadata.
This change moves towards a C-based API for deletes that consistently
also deletes the dependent objects (such as indexes, tables and
constraints). Ideally, we should prohibit direct manipulation of
catalog tables using SQL statements to avoid ending up in a bad state.
Once all catalog manipulations happen via the native API, we can also
remove the cache invalidation triggers on the catalog tables.
This is a continuation of prior efforts to refactor API functions in C
to:
- improve usage of proper error codes
- use error messages that better conform with the PostgreSQL standard
- improve security by reducing the amount of code that runs under SECURITY DEFINER
- move towards doing all metadata updates using a consistent catalog API
Most importantly, `create_hypertable()` has been refactored in C,
which simplifies a lot of code that previously required
upcalls/downcalls between C code and plpgsql code, or duplicated
functionality between the two environments.
The functions for adding and updating dimensions have been refactored
in C to:
- improve usage of proper error codes
- use messages that better conform with the PostgreSQL standard
- improve security by reducing the amount of code that runs under SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension() and
the number of partitions can now be set using the new
set_number_partitions() function.
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check that intervals are > 0, and smallint intervals
accepted values up to UINT16_MAX instead of INT16_MAX.
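
The corrected bounds check amounts to the following. The function name is hypothetical and the real validation is done in C; the sketch only shows the range that is now accepted.

```python
INT16_MAX = 2**15 - 1   # 32767, the largest value a smallint can hold
UINT16_MAX = 2**16 - 1  # 65535, the upper bound that was wrongly accepted before


def is_valid_smallint_interval(interval):
    """Return True if interval is a usable smallint chunk interval."""
    # The interval must be strictly positive and fit in a signed 16-bit integer.
    return 0 < interval <= INT16_MAX
```
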
This PR adds the ability for multiple different versions of the timescaledb
extension to be used by different databases in the same PostgreSQL
instance (server).
This is accomplished by splitting this extension into two .so files.
1) timescaledb.so -- stuff under loader/. Really not a lot of code.
This code MUST be backwards compatible in the future.
2) timescaledb-version.so (most of our code). Need
not be backwards compatible.
timescaledb.so becomes a small stub which is preloaded and whose main
reason for existing is to dynamically load the right
timescaledb-version.so when the time comes.
This change allows either of the above .so to be loaded in
shared_preload_libraries. However, timescaledb.so allows multiple
versions to be used by different databases in the same instance, along
with smoother upgrades. Using timescaledb-version.so allows for
finer-grained control and lock-in and is appropriate in only a few
production environments.
This PR also adds version checking so that a clear failure message
will be displayed if the .so version does not match the SQL extension
version.
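
Conceptually, the loader's job and the version check look like the sketch below. It is a simplified model and not the loader code: the function names and the error text are assumptions, and only the `$libdir/timescaledb-<version>` naming comes from this description.

```python
def versioned_library_path(sql_version):
    """Name of the versioned library the stub should load for a database."""
    return "$libdir/timescaledb-" + sql_version


def check_version(so_version, sql_version):
    """Fail with a clear message if the .so and SQL extension versions differ."""
    if so_version != sql_version:
        raise RuntimeError(
            "extension version mismatch: shared library version %s, "
            "SQL version %s" % (so_version, sql_version)
        )
```
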
To support multi-version functionality we changed the way SQL update
scripts are generated. Previously, the system used a bunch of
intermediate upgrade scripts. So with 3 versions, you would have an
update script of 1--2, 2--3. But, this PR changes things so that we
produce direct "shortcut" update files: 1--3, 2--3.
This is done for 2 reasons:
1) Each of the update files should point to
$libdir/timescaledb-current_version, since you cannot guarantee that
the previous .so for each intermediate version has been installed.
2) You don't want intermediate version updates installed without the
.so. For example, if you have versions 1,2,3
and you are installing version 3, you want the upgrade files 1--3,
2--3 but not 1--2 because if you have 1--2
then a user could do ALTER EXTENSION timescaledb UPDATE TO 2. But
the .so for version 2 may not be installed.
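
The set of "shortcut" files can be computed as in this sketch. The helper name is made up; the file name pattern follows PostgreSQL's `extension--old--new.sql` convention for update scripts.

```python
def shortcut_update_files(versions, current):
    """List the direct update scripts to install for the current version.

    versions: all released versions, including the current one.
    Only direct paths to `current` are produced, never intermediate steps.
    """
    return [
        "timescaledb--%s--%s.sql" % (old, current)
        for old in versions
        if old != current
    ]
```

For versions 1, 2, and 3 this yields only the 1--3 and 2--3 scripts, so ALTER EXTENSION cannot stop at version 2.
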
In order to test this functionality, we add a mock extension version .so
so that we can test extension loading inside the regression framework.
A hypertable's associated schema is used to create and store internal
data tables (chunks). A hypertable creates tables in that schema,
typically with full superuser permissions, regardless of whether the
hypertable's owner or the current user have permissions for the schema.
If the schema doesn't exist, the hypertable will create it when
creating the first chunk, even though the user or table owner does
not have permissions to create schemas in the database.
This change adds proper permissions checks to create_hypertable() so
that users cannot create hypertables with a custom associated schema
unless they have the proper permissions on the schema or the database.
Chunks are also no longer created with internal schema permissions if
the associated schema is something different from the internal schema.
This change is part of an effort to create a consistent way
of dealing with metadata catalog updates, which is currently
a mix of C API and INSERT/UPDATE/DELETE statements from SQL
code. This mix makes catalog handling unnecessarily complex as
there are multiple ways to update metadata, increasing the risk
of security issues with publicly exposed SQL functions. It also
complicates things like cache invalidation, requiring different
mechanisms for C and SQL code. Catalog updates from SQL code
require triggers on metadata tables for cache invalidation that
do not work with native catalog updates.
The creation of chunks has been particularly messy in this regard,
making the code hard to follow, especially the handling of a chunk's
constraints, where dimensional and other constraints were handled
differently. With this change, constraint handling is now consistent
across constraint types with a single API for updating metadata.
Reduce memory usage for out-of-order inserts
The chunk_result_relation_info should be put on the chunk memory
context. This will cause the rri constraint expr to also go onto
that context and be correctly freed when the chunk insert state
is destroyed.
Compatibility with pg_upgrade required 2 changes:
1) search_path on functions cannot be blank for pg_upgrade.
2) The timescaledb.restoring GUC had to apply to more code (now moved to
higher-level check)
`pg_upgrade` must be passed the following option: `-O "-c timescaledb.restoring='on'"`
A hypertable's tablespaces are now always retrieved from
the tablespace metadata table instead of being cached
with the hypertable. This avoids having to do cache invalidation
when updating the tablespace table.
Tablespaces can now be detached from hypertables using
`tablespace_detach()`. This function can either detach
a tablespace from all tables or only a specific table.
Having the ability to detach tablespaces allows more
advanced storage management; for instance, one can detach
tablespaces that are running low on disk space while attaching
new ones to replace the old ones.
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
The user should be able to add time dimensions using INTERVAL when
the column type is TIMESTAMP/TIMESTAMPTZ/DATE, so this change adds
that support.
It also adds tests and checks for add_dimension, e.g.,
a clear error when the table is not a hypertable.
For convenience, the user should be able to specify the new
chunk time intervals using INTERVAL datatype if the hypertable is
using a TIMESTAMP/TIMESTAMPTZ/DATE datatype for its time column.
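
The conversion from an interval to the integer form used for chunk time intervals can be pictured as follows. This assumes the internal unit for TIMESTAMP/TIMESTAMPTZ/DATE columns is microseconds, and it uses Python's `timedelta` as a stand-in for INTERVAL (which, unlike INTERVAL, has no month component).

```python
from datetime import timedelta


def interval_to_internal(interval):
    """Convert an interval to a whole number of microseconds.

    The microsecond unit is an assumption made for this sketch.
    """
    return interval // timedelta(microseconds=1)
```
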
This PR fixes the handling of drop_chunks when the hypertable's
time field is a TIMESTAMP or DATE field. Previously, such
hypertables needed drop_chunks to be given a timestamptz in UTC.
Now, drop_chunks can take a DATE or TIMESTAMP. Also, the INTERVAL
version of drop_chunks correctly handles these cases.
A consequence of this change is that drop_chunks cannot be called
on multiple tables (with table_name = NULL or schema_name = NULL)
if the tables have different time column types.
Windows 64-bit binaries should now be buildable using the cmake
build system either from the command line or from Visual Studio.
Previous issues regarding unresolved symbols have been resolved
with compatibility header files to properly export symbols or
getting GUCs via normal APIs.
TimescaleDB cache invalidation happens as a side effect of doing a
full SQL statement (INSERT/UPDATE/DELETE) on a catalog table (via
table triggers). However, triggers aren't invoked when using
PostgreSQL's internal catalog API for updates, since PostgreSQL's
catalog tables don't have triggers that require full statement
parsing, planning, and execution.
Since we are now using the regular PostgreSQL catalog update API for
some TimescaleDB catalog operations, we need to do cache invalidation
also on such operations.
This change adds cache invalidation when updating catalogs using the
internal (C) API and also makes the cache invalidation more fine
grained. For instance, caches are no longer invalidated on some
INSERTS that do not affect the validity of objects already in the
cache, such as adding a new chunk.
This change reduces the usage of SECURITY DEFINER on SQL
functions and fixes related permissions issues. It also
properly checks hypertable permissions relative to the current_user
instead of the session_user, which otherwise breaks SET ROLE,
among other things.
reindex allows you to reindex the indexes of only certain chunks,
filtering by time. This is a common use case because a user may
want to reindex chunks once they are no longer getting new data.
reindex also has a recreate option which will not use REINDEX
but will rather CREATE INDEX a new index and then
DROP INDEX / RENAME new_index to old_name. This approach has advantages
in terms of blocking reads for a much shorter period of time. However,
it does more work and will use more disk space during the operation.
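
The recreate path can be summarized by the statements it issues. The sketch below only builds the statement strings; the `_new` suffix, the function name, and the lack of identifier quoting are simplifications for illustration.

```python
def recreate_index_statements(index_name, table_name, columns):
    """Return the SQL statements for rebuilding an index without REINDEX."""
    # Hypothetical temporary name; a real implementation must avoid collisions.
    new_name = index_name + "_new"
    return [
        "CREATE INDEX %s ON %s (%s)" % (new_name, table_name, ", ".join(columns)),
        "DROP INDEX %s" % index_name,
        "ALTER INDEX %s RENAME TO %s" % (new_name, index_name),
    ]
```

Both indexes exist until the DROP, which is where the extra disk usage during the operation comes from.
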
Previously, for timestamps w/o tz, the range_end and range_start were
defined as UTC, but the constraints on the table were written as
the local time at the time of chunk creation. This does not work well
if timezones change over the life of the hypertable.
This change removes the dependency on local time for all timestamp
partitioning. Namely, the range_start and range_end remain as UTC
but the constraints are now always written in UTC too. Since old
constraints correctly describe the data currently in the chunks, the
update script for this change adjusts range_start and range_end
instead of the constraints.
Fixes #300.
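
The problem with local-time constraints can be seen in a small example. It assumes range bounds are stored as microseconds since the Unix epoch in UTC; the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_bound(internal_us, tz):
    """Render an internal microsecond value as a timestamp literal in tz."""
    moment = EPOCH + timedelta(microseconds=internal_us)
    return moment.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S")


range_start = 1_500_000_000 * 1_000_000
# Rendering in UTC always gives the same literal for the same bound.
utc_literal = render_bound(range_start, timezone.utc)
# Rendering in the session's local zone depends on the zone in effect.
local_literal = render_bound(range_start, timezone(timedelta(hours=-5)))
```

Here the same bound renders as `2017-07-14 02:40:00` in UTC but `2017-07-13 21:40:00` at UTC-5, so a constraint written in local time no longer matches the stored range once the zone changes.
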
Functions marked IMMUTABLE should also be parallel safe, but
aren't by default. This change marks all immutable functions
as parallel safe and removes the IMMUTABLE definitions on
some functions that have been wrongly labeled as IMMUTABLE.
If functions that are IMMUTABLE do not have the PARALLEL SAFE
label, then some standard PostgreSQL regression tests will fail
(this is true for PostgreSQL >= 10).
Aggregate functions that have serialize and deserialize support
functions (histogram, last, first, etc.) should have these
support functions marked STRICT.
PostgreSQL's regular test suite will fail when the timescaledb
module is loaded without these functions being marked STRICT.
All partitioning functions now have the signature `int func(anyelement)`.
This cleans up some special handling that was necessary to support
the legacy partitioning function that expected text input.
We now use INT64_MAX and INT64_MIN as the max and min values for
dimension_slice ranges. If a dimension_slice has a range_start of
INT64_MIN or the range_end is INT64_MAX, we remove the corresponding
check constraint on the chunk since it signifies that this end of the
range is infinite. Closed ranges now always have INT64_MIN as range_start
of the first slice and INT64_MAX as range_end of the last slice.
Also, points corresponding to INT64_MAX are always
put in the same slice as INT64_MAX-1 to avoid problems with the
semantics that coordinate < range_end.
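
A sketch of these rules, using hypothetical helper names and a plain string for the check constraint expression:

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def slice_check_constraint(column, range_start, range_end):
    """Build the CHECK expression for a slice, or None if both ends are infinite."""
    conditions = []
    # An end at INT64_MIN or INT64_MAX is treated as infinite and not checked.
    if range_start != INT64_MIN:
        conditions.append("%s >= %d" % (column, range_start))
    if range_end != INT64_MAX:
        conditions.append("%s < %d" % (column, range_end))
    return " AND ".join(conditions) or None


def slice_contains(range_start, range_end, coordinate):
    """Half-open membership test, with INT64_MAX folded into the last slice."""
    if coordinate == INT64_MAX:
        # Otherwise coordinate < range_end could never hold for this point.
        coordinate = INT64_MAX - 1
    return range_start <= coordinate < range_end
```
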