Previously, drop_chunks returned an empty table, giving the user
no indication of what (if anything) had happened.
Now, drop_chunks returns a list of the chunk identifiers in the
same style as show_chunks, with each chunk's schema and table name.
Notably, when show_chunks is called directly before drop_chunks, the
output should be the same.
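As a rough illustration (the hypertable name is hypothetical and the
argument style may differ between versions), the two calls now produce
matching output:

```sql
-- Hypothetical hypertable 'conditions'; both calls list the same chunks,
-- e.g. _timescaledb_internal._hyper_1_2_chunk, schema-qualified.
SELECT show_chunks('conditions', older_than => INTERVAL '3 months');
SELECT drop_chunks(INTERVAL '3 months', 'conditions');
```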
For hypertables that have continuous aggregates, calling drop_chunks now
drops all of the rows in the materialization table that were based on
the dropped chunks. Since we don't know what the correct default
behavior for drop_chunks is, we've added a new argument,
cascade_to_materializations, which must be set to true in order to call
drop_chunks on a hypertable which has a continuous aggregate.
drop_chunks is blocked on the materialization tables of continuous
aggregates.
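A sketch of the new requirement, assuming a hypothetical hypertable
'conditions' that has a continuous aggregate:

```sql
-- Without the flag this call now errors out because 'conditions' has a
-- continuous aggregate; with it, the matching materialized rows are dropped.
SELECT drop_chunks(INTERVAL '3 months', 'conditions',
                   cascade_to_materializations => true);
```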
TimescaleDB has always supported functions on closed (space)
dimensions, i.e., for hash partitioning. However, functions have not
been supported on open (time) dimensions; instead, such columns were
required to have a supported time type (e.g., integer or timestamp).
This restricted the tables that could be time partitioned: tables with
custom "time" types, even if transformable by a function expression
into a supported time type, were not supported.
This change generalizes partitioning so that both open and closed
dimensions can have an associated partitioning function that
calculates a dimensional value. Fortunately, since we already support
functions on closed dimensions, the changes necessary to support this
on any dimension are minimal. Thus, open dimensions now support an
(optional) partitioning function that transforms the input type to a
supported time type (e.g., integer or timestamp type). Any indexes on
such dimensional columns become expression indexes.
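For example, a table with a hypothetical text-encoded time column could
now be partitioned via a conversion function (the table, column, and
function names, as well as the parameter used to supply the function,
are illustrative and may differ in your version):

```sql
-- Assumed: 'events' stores its time as text-encoded epoch seconds.
CREATE FUNCTION unix_text_to_timestamptz(t TEXT) RETURNS TIMESTAMPTZ
    AS $$ SELECT to_timestamp(t::float8) $$ LANGUAGE SQL IMMUTABLE;

SELECT create_hypertable('events', 'time',
                         time_partitioning_func => 'unix_text_to_timestamptz');
```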
Tests have been added for chunk expansion and the hashagg and sort
transform optimizations on tables that are using a time partitioning
function.
Currently, not all of these optimizations are well supported, but this
could potentially be fixed in the future.
Remove the existing PL/pgSQL function that implements drop_chunks,
replacing it with a direct call to the C function, which also
implements the old PL/pgSQL checks in C. Refactor out much of the code
shared between the C implementations of show_chunks and drop_chunks.
TimescaleDB provides an efficient and easy-to-use API to drop individual
chunks from the database through drop_chunks. This PR builds on that
functionality: a new show_chunks function makes it possible to see the
chunks that would be dropped if drop_chunks were run.
Additionally, it adds a newer_than option to drop_chunks (also supported
by show_chunks) that allows seeing/dropping chunks in an interval or
newer than a point in time.
This commit includes:
- Implementation of show_chunks in C
- Additional helper functions to work with chunks
- New version of drop_chunks in SQL that uses show_chunks. This
also adds a newer_than option to drop_chunks
- More enhanced tests of drop_chunks and new tests for show_chunks
Among other reasons, show_chunks was implemented in C in order
to be able to have both the older_than and newer_than arguments be NULL.
This was not possible in SQL because the arguments had to have polymorphic
types and, whether they are used in the function body or not, PL/pgSQL
requires these arguments to typecheck.
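Illustrative calls (the hypertable name is hypothetical; both bounds are
optional and may be NULL):

```sql
SELECT show_chunks('conditions');                                   -- all chunks
SELECT show_chunks('conditions', newer_than => INTERVAL '7 days');  -- recent chunks
-- Chunks between 6 and 3 months old:
SELECT show_chunks('conditions', older_than => INTERVAL '3 months',
                                 newer_than => INTERVAL '6 months');
```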
Add a bool column, created, to the return value of create_hypertable and
add_dimension. When if_not_exists is true and creation is skipped because
the object already exists, created will be false; otherwise it will be
true. This modifies the functions to return metadata even when no object
was created.
Change the return value of add_dimension to return a record consisting
of dimension_id, schema_name, table_name, and column_name. This not only
improves user feedback about the success of the operation but also gives
the function an API returning useful information for non-human consumption.
Change create_hypertable to return a record consisting of
(hypertable_id, schema_name, table_name). This not only improves user
feedback about the success of the operation but also gives the function
an API returning useful information for non-human consumption.
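A sketch of the resulting output (table and column names are
hypothetical, and argument names may differ across versions):

```sql
SELECT * FROM create_hypertable('conditions', 'time', if_not_exists => true);
-- hypertable_id | schema_name | table_name | created

SELECT * FROM add_dimension('conditions', 'device_id',
                            number_partitions => 4, if_not_exists => true);
-- dimension_id | schema_name | table_name | column_name | created
```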
We've decided to adopt the ts_ prefix on all exported C functions in
order to avoid having symbol conflicts with future postgres functions.
We've already started using this prefix on new functions and this commit
adds the prefix to the old functions.
Adaptive chunking uses the min and max value of previous chunks
to estimate their "fill factor". Ideally, min and max should be
retrieved using an index, but if no index exists we fall back
to a heap scan. A heap scan can be very expensive, so we now
raise a WARNING if no index exists.
This change also renames set_adaptive_chunk_sizing() to simply
set_adaptive_chunking().
Users can now (optionally) set a target chunk size and TimescaleDB
will try to adapt the interval length of the first open ("time")
dimension in order to reach that target chunk size. If a hypertable
has more than one open dimension, only the first one will have a
dynamically adapting interval.
Users can optionally specify their own function that calculates the
new dimension interval. They can also set a target size of 0 in order
to estimate a suitable target size for a chunk based on available
memory.
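A minimal sketch, assuming a hypothetical hypertable 'conditions':

```sql
-- Aim for roughly 100MB chunks; per the description above, a target of 0
-- asks TimescaleDB to estimate a suitable size from available memory.
SELECT * FROM set_adaptive_chunking('conditions', '100MB');
```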
Hypertables can now be created from tables that already hold data,
which is optionally migrated from the main table to chunks when
create_hypertable() is called.
The data migration is similar to the COPY path, with the single
difference that the inserted/copied tuples come from an existing table
instead of being read from a file. After the data has been migrated,
the main table is truncated.
One potential downside of this approach is that all of this happens in
a single transaction, which means that the table is blocked while
migration is ongoing, preventing inserts by other transactions.
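For instance, assuming a hypothetical, already populated table
'conditions' (the migrate_data parameter name is an assumption):

```sql
-- Existing rows are moved into chunks and the main table is truncated,
-- all within the same transaction.
SELECT create_hypertable('conditions', 'time', migrate_data => true);
```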
This change improves the handling of tablespaces as follows:
- Add if_not_attached / if_attached options to attach_tablespace() and
detach_tablespace(), respectively
- Block DROP tablespace if it is still attached to a table
- Block REVOKE if it means the table owner no longer has CREATE
permissions on an attached tablespace
- Make error messages follow the PostgreSQL style guide
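Illustrative usage with hypothetical tablespace and hypertable names:

```sql
-- No error is raised if the tablespace is already attached / not attached.
SELECT attach_tablespace('disk1', 'conditions', if_not_attached => true);
SELECT detach_tablespace('disk1', 'conditions', if_attached => true);
```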
This is a continuation of prior efforts to refactor API functions in C
to:
- improve usage of proper error codes
- use error messages that better conform with the PostgreSQL standard
- improve security by avoiding running large amounts of code under SECURITY DEFINER
- move towards doing all metadata updates using a consistent catalog API
Most importantly, `create_hypertable()` has been refactored in C,
which simplifies a lot of code that previously required
upcalls/downcalls between C code and plpgsql code, or duplicated
functionality between the two environments.
The functions for adding and updating dimensions have been refactored
in C to:
- improve usage of proper error codes
- use error messages that better conform with the PostgreSQL standard
- improve security by avoiding running large amounts of code under SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension(), and
the number of partitions can now be set using the new
set_number_partitions() function.
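A sketch with hypothetical table and column names:

```sql
-- Re-running the call is now a no-op instead of an error.
SELECT add_dimension('conditions', 'device_id',
                     number_partitions => 2, if_not_exists => true);
-- Adjust the number of hash partitions afterwards.
SELECT set_number_partitions('conditions', 4);
```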
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check for intervals > 0 and smallint intervals
accepted values up to UINT16_MAX instead of INT16_MAX.
A hypertable's associated schema is used to create and store internal
data tables (chunks). A hypertable creates tables in that schema,
typically with full superuser permissions, regardless of whether the
hypertable's owner or the current user has permissions for the schema.
If the schema doesn't exist, the hypertable will create it when
creating the first chunk, even if the user or table owner does
not have permissions to create schemas in the database.
This change adds proper permissions checks to create_hypertable() so
that users cannot create hypertables with a custom associated schema
unless they have the proper permissions on the schema or the database.
Chunks are also no longer created with internal schema permissions if
the associated schema is something different from the internal schema.
Compatibility with pg_upgrade required 2 changes:
1) search_path on functions cannot be blank for pg_upgrade.
2) The timescaledb.restoring GUC had to apply to more code (now moved to
a higher-level check).
`pg_upgrade` must be passed the following option: `-O "-c timescaledb.restoring='on'"`
Tablespaces can now be detached from hypertables using
`detach_tablespace()`. This function can either detach
a tablespace from all tables or only from a specific table.
Having the ability to detach tablespaces allows for more
advanced storage management; for instance, one can detach
tablespaces that are running low on disk space while attaching
new ones to replace them.
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
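For example (tablespace and table names hypothetical), the hypertable
argument can be omitted to detach the tablespace from all tables:

```sql
SELECT detach_tablespace('old_disk', 'conditions');  -- a single hypertable
SELECT detach_tablespace('old_disk');                -- all hypertables
```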
The user should be able to add time dimensions using INTERVAL when
the column type is TIMESTAMP/TIMESTAMPTZ/DATE, so this change adds
that support.
It also adds more tests and checks for add_dimension, e.g., a nice
error when the table is not a hypertable.
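For example (the second time column is hypothetical, and the interval
parameter name may differ across versions):

```sql
SELECT add_dimension('conditions', 'inserted_at',
                     chunk_time_interval => INTERVAL '1 week');
```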
For convenience, the user should be able to specify the new
chunk time intervals using INTERVAL datatype if the hypertable is
using a TIMESTAMP/TIMESTAMPTZ/DATE datatype for its time column.
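Illustrative call, assuming a hypothetical hypertable with a TIMESTAMPTZ
time column:

```sql
-- Previously this value had to be given as a microsecond count.
SELECT set_chunk_time_interval('conditions', INTERVAL '1 day');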
This PR fixes the handling of drop_chunks when the hypertable's
time column is a TIMESTAMP or DATE. Previously, such
hypertables needed drop_chunks to be given a timestamptz in UTC.
Now, drop_chunks can take a DATE or TIMESTAMP. Also, the INTERVAL
version of drop_chunks correctly handles these cases.
A consequence of this change is that drop_chunks cannot be called
on multiple tables (with table_name = NULL or schema_name = NULL)
if the tables have different time column types.
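For instance, with a hypothetical hypertable whose time column is a
TIMESTAMP:

```sql
SELECT drop_chunks(TIMESTAMP '2017-01-01', 'conditions');
SELECT drop_chunks(INTERVAL '3 months', 'conditions');
```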
This change reduces the usage of SECURITY DEFINER on SQL
functions and fixes related permissions issues. It also
properly checks hypertable permissions relative to the current_user
instead of the session_user, which otherwise breaks SET ROLE,
among other things.
The extension now works with PostgreSQL 10, while
retaining compatibility with version 9.6.
PostgreSQL 10 has numerous internal changes to functions and
APIs, which necessitates various glue code and compatibility
wrappers to seamlessly retain backwards compatibility with older
versions.
Test output might also differ between versions. In particular,
the psql client generates version-specific output with `\d` and
EXPLAINs might differ due to new query optimizations. The test
suite has been modified as follows to handle these issues. First,
tests now use version-independent functions to query system
catalogs instead of using `\d`. Second, changes have been made to
the test suite to be able to verify some test outputs against
version-dependent reference files.
Add a check that time dimension columns are set as NOT NULL in the
main table that a hypertable is created from. If the constraint is
not set, it will be added.
Users might want to implement their own partitioning function
or use the legacy one included with TimescaleDB. This change
adds support for setting the partitioning function in
create_hypertable() and add_dimension().
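A sketch of supplying a custom partitioning function (the table, column,
and function names are hypothetical, and the exact signature and
parameter name required may differ):

```sql
-- A custom hash function taking a single argument and returning an integer.
CREATE FUNCTION my_partition_hash(val ANYELEMENT) RETURNS INTEGER
    AS $$ SELECT hashtext(val::text) $$ LANGUAGE SQL IMMUTABLE;

SELECT create_hypertable('conditions', 'time', 'device_id', 4,
                         partitioning_func => 'my_partition_hash');
```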
All regression tests will now use a non-superuser unless superuser is
necessary. This PR is meant to prevent things like issue #226.
This PR also fixes some more permission bugs found during this testing.
This is part of the ongoing effort to simplify the metadata tables and
to remove any triggers on them that cause side effects.
This change includes the following:
- Remove the on_change_hypertable() trigger on the hypertable catalog
table.
- Remove the TRUNCATE blocking triggers on all metadata tables. If
we think such blocking is important, we should do this in an
event trigger or the processUtility hook.
- Put all SQL files in a single load_order.txt instead of splitting
across three distinct files. Now all SQL files are included in
update scripts as well for simplicity and consistency.
- As a result of removing triggers and related functions, the
setup_main() and restore_timescaledb() functions are no longer
needed. This also further simplifies the database restore process
as calling restore_timescaledb() is no longer needed (or possible).
- Refactor create_hypertable_row() to do more validation before
allocating a new hypertable ID. This avoids incrementing the serial
ID unnecessarily in case some validations fail.
This change refactors the chunk index handling to make better use
of standard PostgreSQL catalog information, while removing the
hypertable_index metadata table and associated triggers, including
those on the chunk_index table. The chunk_index table itself is
also simplified.
A benefit of this refactoring is that indexes are no longer
created using string mangling to construct the CREATE INDEX command
for a chunk, based on the string definition of the hypertable
index. Instead, indexes are created in C using proper index-related
internal data structures.
Chunk indexes can now also be renamed and are added in the parent
index tablespace. Changing tablespace on a hypertable index also
recurses to chunks, as expected. Default indexes that are added when
creating a hypertable use the hypertable's tablespace.
Creating hypertable indexes with the CONCURRENTLY modifier is
currently blocked, due to unclear semantics regarding concurrent
creation over many tables, including how to deal with snapshots.
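Illustrative DDL (the index and tablespace names are hypothetical); both
statements now recurse to the corresponding chunk indexes:

```sql
ALTER INDEX conditions_time_idx RENAME TO conditions_time_idx_renamed;
ALTER INDEX conditions_time_idx_renamed SET TABLESPACE disk2;
```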
Applying triggers to chunks requires taking the definition
of a trigger on a hypertable and executing it on a chunk. Previously
this was done with string replacement in the trigger definition.
This was not especially safe, and thus we moved the logic to C
where we can do proper parsing/deparsing and replacement of the table
name. Another positive aspect is that we got rid of some DDL triggers.
This PR adds support for primary-key, foreign-key, unique, and exclusion
constraints. CHECK and NOT NULL constraints were already supported.
Foreign-key constraints where a hypertable references a plain table are
now supported (while the reverse, where a plain table references a
hypertable, is still not).
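A sketch of the supported direction, with hypothetical tables:

```sql
CREATE TABLE devices (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE readings (
    time      TIMESTAMPTZ NOT NULL,
    device_id INTEGER REFERENCES devices (id),  -- hypertable -> plain table
    value     DOUBLE PRECISION
);
SELECT create_hypertable('readings', 'time');
```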
Previously the default chunk time in microseconds was too large
for a SMALLINT or INTEGER field. Now, we only assign a default
value if the type is TIMESTAMP or TIMESTAMPTZ. Integer timestamps,
such as SMALLINT, INTEGER, and BIGINT, need to be explicitly set
since only the user knows what units the numbers represent.
Further, we check to make sure the chunk time interval is not too
large for SMALLINT and INTEGER so as to avoid confusing problems
later when the user goes to insert.
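For example, with a hypothetical table that uses epoch seconds as its
time column (the chunk_time_interval parameter name is an assumption):

```sql
CREATE TABLE metrics (time BIGINT NOT NULL, value DOUBLE PRECISION);
-- The interval must be given explicitly, in the column's own units
-- (here: seconds).
SELECT create_hypertable('metrics', 'time', chunk_time_interval => 86400);
```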
This adds support for all types of triggers on a hypertable except
AFTER INSERT. UPDATE and DELETE ROW triggers are automatically copied from
a hypertable onto the chunks. Therefore, any trigger defined on the
parent hypertable will apply to any row in any of the chunks as well.
STATEMENT-level triggers and INSERT triggers need not be copied in this
way.
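A sketch of a row trigger that now works on a hypertable (the table, its
modified_at column, and the function are hypothetical):

```sql
CREATE FUNCTION touch_modified() RETURNS TRIGGER AS $$
BEGIN
    NEW.modified_at := now();  -- assumes a modified_at column on the table
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Defined once on the hypertable, the trigger fires for rows in every chunk.
CREATE TRIGGER set_modified
    BEFORE UPDATE ON conditions
    FOR EACH ROW EXECUTE PROCEDURE touch_modified();
```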
A new public-facing API `add_dimension(table, column, ...)`
makes it possible to add additional dimensions (partitioning
columns) to a hypertable.
Currently, new dimensions can only be added to empty tables.
The code has also been refactored with a corresponding
internal function that is called by both `add_dimension()`
and `create_hypertable()`.
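Illustrative usage on an empty, hypothetical hypertable (the
number_partitions argument name is an assumption):

```sql
-- Add a hash-partitioned (space) dimension with four partitions.
SELECT add_dimension('conditions', 'device_id', number_partitions => 4);
```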
Previously, catalog tables were not fully protected from malicious
non-superusers. This PR fixes permission handling by severely
restricting permissions to the catalog and instead using SECURITY
DEFINER functions to alter the catalog when needed without giving
users permission to do those same operations outside of these functions.
In addition, these functions check for proper permissions themselves,
so they are safe to use.
This PR also makes sure that chunk tables have the same owner as the
hypertable and correctly handles `ALTER TABLE...OWNER TO` commands to
keep this info in sync.