Tables can now hold existing data, which is optionally migrated from
the main table to chunks when create_hypertable() is called.
The data migration is similar to the COPY path, with the single
difference that the inserted/copied tuples come from an existing table
instead of being read from a file. After the data has been migrated,
the main table is truncated.
One potential downside of this approach is that all of this happens in
a single transaction, which means that the table is blocked while
migration is ongoing, preventing inserts by other transactions.
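As a rough sketch of the intended usage (the `migrate_data` option name and
the example table are assumptions for illustration, not taken from this text):

```sql
-- A table that already contains rows
CREATE TABLE conditions(time timestamptz NOT NULL, device int, temp float);
INSERT INTO conditions VALUES ('2018-01-01 00:00', 1, 23.4);

-- Assumed option for opting in to migrating the existing rows into chunks
SELECT create_hypertable('conditions', 'time', migrate_data => true);
```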
Sometimes pg_config reports versions like `PostgreSQL 10.2 (Ubuntu
10.2-1.pgdg16.04+1)`. The version regex is therefore adjusted to allow
suffixes after the PostgreSQL version number.
This fixes at least two bugs:
1) A drop of a referenced table used to drop the associated
FK constraint but not the metadata associated with the constraint.
Fixes #43.
2) A drop of a column removed any indexes associated with the column
but not the metadata associated with the index.
This change improves the handling of tablespaces as follows:
- Add if_not_attached / if_attached options to attach_tablespace() and
detach_tablespace(), respectively
- Block DROP TABLESPACE if the tablespace is still attached to a table
- Block REVOKE if it means the table owner no longer has CREATE
permissions on an attached tablespace
- Make error messages follow the PostgreSQL style guide
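A sketch of how the new options might be used (the tablespace/table names
and the argument order are assumptions):

```sql
-- No error if the tablespace is already attached to the hypertable
SELECT attach_tablespace('disk1', 'conditions', if_not_attached => true);

-- No error if the tablespace is not attached to the hypertable
SELECT detach_tablespace('disk1', 'conditions', if_attached => true);
```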
Dropping a schema that a hypertable depends on should clean up
dependent metadata. There are two schemas that matter for hypertables:
the hypertable's schema and the associated schema where chunks are
stored.
This change deals with the above as follows:
- If the hypertable schema is dropped, the hypertable and all chunks
should be deleted as well, including metadata.
- If an associated schema is dropped, the hypertables that use that
associated schema will have their associated schemas reset to the
internal schema.
- Even if no hypertable currently uses the dropped schema as its
  associated schema, there might be chunks that reside in the dropped
  schema (e.g., if the associated schema was changed for their
  hypertables), so the metadata for those chunks should be deleted.
This test adds a case where we use a trigger to automatically populate
a metadata table. Such uses are common in IoT where, for example,
you want to keep metadata associated with devices and you want
new devices to be auto-created.
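For illustration, a trigger of the kind this test exercises might look
roughly like this (the table, column, and function names are invented, and
`readings` is assumed to be the hypertable):

```sql
CREATE TABLE devices(id int PRIMARY KEY, first_seen timestamptz);

CREATE FUNCTION register_device() RETURNS trigger AS $$
BEGIN
    -- Auto-create metadata for any device that shows up in the readings
    INSERT INTO devices VALUES (NEW.device_id, NEW.time)
        ON CONFLICT (id) DO NOTHING;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER auto_register_device BEFORE INSERT ON readings
    FOR EACH ROW EXECUTE PROCEDURE register_device();
```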
The compiler does not seem to like it when I use the MSVC enter/exit
guards for utils/guc.h, so the alternative is to grab the value
via GetConfigOptionByName.
This change refactors the handling of TRUNCATE so
that it is performed directly in process utility without
doing an upcall to PL/pgSQL.
It also adds handling for the ONLY modifier to TRUNCATE,
which should not work on a hypertable; TRUNCATE ONLY on a
hypertable now generates an error.
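A quick sketch of the resulting behavior (the hypertable name is hypothetical):

```sql
TRUNCATE conditions;       -- truncates the hypertable and all of its chunks
TRUNCATE ONLY conditions;  -- now raises an error for a hypertable
```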
Previously stdint.h was not included on Windows so INT16_MAX and
friends were not defined. Additionally, having tablespace_attach
with PG_FUNCTION_ARGS in the header file caused issues during
linking, so a direct call version of the function is now exported
for others to use instead of the PG_FUNCTION_ARGS version.
Two minor warnings about code paths that lack a return statement are
also addressed.
When chunks are deleted, dimension slices can be orphaned, i.e., there
are no chunks or chunk constraints that reference such slices. This
change ensures that, when chunks are deleted, orphaned slices are also
deleted.
Deletes on metadata in the TimescaleDB catalog have so far been a mix
of native deletes using the C-based catalog API and SQL-based DELETE
statements that CASCADE.
This mixed environment is confusing, and SQL-based DELETEs do not
consistently clean up objects that are related to the deleted
metadata.
This change moves towards a C-based API for deletes that consistently
deletes the dependent objects as well (such as indexes, tables, and
constraints). Ideally, we should prohibit direct manipulation of
catalog tables using SQL statements to avoid ending up in a bad state.
Once all catalog manipulation happens via the native API, we can also
remove the cache invalidation triggers on the catalog tables.
This is a continuation of prior efforts to refactor API functions in C
to:
- improve usage of proper error codes
- use error messages that better conform to the PostgreSQL standard
- improve security by avoiding running large amounts of code under SECURITY DEFINER
- move towards doing all metadata updates using a consistent catalog API
Most importantly, `create_hypertable()` has been refactored in C,
which simplifies a lot of code that previously required
upcalls/downcalls between C code and plpgsql code, or duplicated
functionality between the two environments.
Chunk creation needs to be serialized in order
to avoid having multiple processes trying to
create the same chunk and causing conflicts.
This serialization didn't work as expected, because
a lock on the chunk catalog table that implemented
this serialization was prematurely released.
This change fixes that issue and also changes the
serialization to happen around a lock on the
chunk's parent table (the main table) instead. This
change should allow multiple processes to simultaneously
create chunks for different hypertables.
The functions for adding and updating dimensions have been refactored
in C to:
- improve usage of proper error codes
- use error messages that better conform to the PostgreSQL standard
- improve security by avoiding running large amounts of code under SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension(), and
the number of partitions can now be set using the new
set_number_partitions() function.
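A rough usage sketch (the parameter names other than if_not_exists, the
argument order, and the values are assumptions):

```sql
-- No error if the 'device' space dimension already exists
SELECT add_dimension('conditions', 'device', number_partitions => 4,
                     if_not_exists => true);

-- Change the number of partitions for the space dimension
SELECT set_number_partitions('conditions', 8);
```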
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check for intervals > 0 and smallint intervals
accepted values up to UINT16_MAX instead of INT16_MAX.
Issuing "CREATE EXTENSION IF NOT EXISTS timescaledb;" generates
an error if the extension is already loaded. This change makes
this command work as expected, with a NOTICE.
Error messages related to CREATE EXTENSION have also been
updated to better adhere to the official PostgreSQL style
guide, and the command now returns the same or similar error
codes and messages as regular PostgreSQL.
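The expected behavior is roughly:

```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;
-- NOTICE:  extension "timescaledb" already exists, skipping
```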
Currently, chunk indexes are always created in the tablespace of the
index on the main table (which could be the default tablespace), even
if the chunks themselves are created in different tablespaces. This is
problematic in a multi-disk setting where each disk is a separate
tablespace in which chunks are placed. The chunk indexes might exhaust
the space on the common (often default) tablespace, which might not
have a lot of disk space. This also prevents the database, including
index storage, from growing by adding new tablespaces.
Instead, chunk indexes are now created in the "next" tablespace after
that of their chunks to both spread indexes across tablespaces and
avoid colocating indexes with their chunks (for I/O throughput
reasons). To optionally avoid this spreading, one can pin chunk
indexes to a specific tablespace by setting an explicit tablespace on
a main table index.
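For example, pinning might look like this (the index, table, and tablespace
names are made up):

```sql
-- Chunk indexes created from this index stay in tablespace 'history'
-- rather than being spread across the hypertable's tablespaces
CREATE INDEX conditions_device_idx ON conditions (device) TABLESPACE history;
```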
Source code indentation has been updated in PostgreSQL 10 to fix a
number of issues. This update applies this new indentation to the
entire code base.
The new indentation requires a new version of pg_bsd_indent, which can
be found here:
https://git.postgresql.org/git/pg_bsd_indent.git
This commit adds logic for cache pinning to handle subtransactions. It
also makes it easier to find cache pinning leaks. Finally, it fixes
handling of cross-commit operations like VACUUM and CLUSTER.
Previously, such operations incorrectly released the cache pin on the
first commit even though the object was used after that.
We add better accounting for the number of items stored in a subspace
to allow better pruning. Instead of pruning based on the number of
dimension_slices in subsequent dimensions, we now track the total
number of items in the subspace store and prune based on that.
We add two GUC variables:
1) max_open_chunks_per_insert (default: work_mem in bytes / 512, which
assumes an entry is 512 bytes)
2) max_cached_chunks_per_hypertable (default: 100), the maximum number
of cached chunks per hypertable
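Assuming the usual `timescaledb.` prefix for the extension's GUCs, these can
be set like any other setting (the values below are arbitrary):

```sql
SET timescaledb.max_open_chunks_per_insert = 1024;
SET timescaledb.max_cached_chunks_per_hypertable = 100;
```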
Previously, the cache in chunk_dispatch was limited to only hold
the chunk_insert_state for the last time dimension as a consequence
of logic in subspace_store. This has now been relaxed so that a
chunk_dispatch holds the cache for any chunk_insert_states that it
encounters. Logic for the hypertable chunk cache has not been changed.
The rule that we should follow is to limit the subspace store size for
caches that survive across commands. But caches within commands can be
allowed to grow.
Renaming constraints on hypertables is now supported. Using the ONLY
option with RENAME CONSTRAINT is blocked for hypertables. Renaming
constraints on chunks is also blocked.
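A brief sketch of what is now allowed and blocked (names are illustrative):

```sql
-- Supported: renames the constraint on the hypertable and its chunks
ALTER TABLE conditions RENAME CONSTRAINT temp_check TO temperature_check;

-- Blocked: the ONLY modifier cannot be used on a hypertable
ALTER TABLE ONLY conditions RENAME CONSTRAINT temp_check TO temperature_check;
```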
This commit also fixes a possible bug in the chunk_constraint code.
Previously, `chunk_constraint_add_from_tuple` used GETSTRUCT on the
tuple to convert the tuple to `FormData_chunk_constraint`. This
was incorrect since some fields can be NULL. Using GETSTRUCT for
tables with nullable fields is unsafe and was indeed incorrect
in this case.
Flags are now passed on to CMake, the source directory is correctly
resolved, and support for removing the old build directory has been
added. Additionally, this limits the VS solution file to having only
Release targets unless overridden with a flag.
Previously, the executor code adjusted es_processed to set the
number of tuples that have been processed in the completion tag.
That was necessary when we had triggers that re-routed tuples to chunks
and cancelled the inserts on the base table (so those tuples were not
counted in es_processed). We no longer use triggers, so this logic is
no longer needed. In fact, it was mostly dead code because
`additional_tuples` was always 0.
This PR adds the ability to have multiple different versions of the
timescaledb extension used by different databases in the same
PostgreSQL instance (server).
This is accomplished by splitting the extension into two .so files:
1) timescaledb.so -- the code under loader/. Really not a lot of code.
This code MUST be backwards compatible in the future.
2) timescaledb-version.so -- most of our code. It need not be
backwards compatible.
Timescaledb.so becomes a small stub which is preloaded and whose main
reason for existing is to dynamically load the right
timescaledb-version.so when the time comes.
This change allows either of the above .so to be loaded in
shared_preload_libraries. But timescaledb.so allows for multiple
versions used on different databases in the same instance along
with smoother upgrades. Using timescaledb-version.so allows for
finer-grained control and lock-in and is appropriate in only a few
production environments.
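In the typical multi-version setup only the stub is preloaded; assuming the
stub keeps the plain `timescaledb` library name, that amounts to something
like:

```sql
-- Equivalent to shared_preload_libraries = 'timescaledb' in postgresql.conf;
-- requires a server restart to take effect
ALTER SYSTEM SET shared_preload_libraries = 'timescaledb';
```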
This PR also adds version checking so that a clear failure message
will be displayed if the .so version does not match the SQL extension
version.
To support multi-version functionality we changed the way SQL update
scripts are generated. Previously, the system used a bunch of
intermediate upgrade scripts, so with 3 versions you would have update
scripts 1--2 and 2--3. This PR changes things so that we produce
direct "shortcut" update files: 1--3 and 2--3.
This is done for 2 reasons:
1) Each of the update files should point to
$libdir/timescaledb-current_version, since you cannot guarantee that
the .so for each intermediate version has been installed.
2) You don't want intermediate version updates installed without the
corresponding .so. For example, if you have versions 1, 2, and 3 and
you are installing version 3, you want the upgrade files 1--3 and
2--3 but not 1--2, because with 1--2 a user could do
ALTER EXTENSION timescaledb UPDATE TO 2 even though the .so for
version 2 may not be installed.
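With the shortcut files in place, an update jumps straight to the installed
default version, e.g.:

```sql
ALTER EXTENSION timescaledb UPDATE;  -- updates directly to the default version
```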
In order to test this functionality, we add a mock extension version .so
that we can test extension loading inside the regression framework.
Changing things like the location of pg_config and other macros,
additional CMake flags, etc. used to require modifying the bootstrap
file. This PR instead passes any flags given to bootstrap on to CMake
itself.
When tables are created with storage parameters (reloptions),
these should be inherited by chunks so that one can set options
such as fillfactor and autovacuum settings on the hypertable.
This change makes chunks inherit the reloptions set on a hypertable
and allows altering options via the ALTER TABLE command.
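For example (the table name and option values are illustrative):

```sql
-- Storage parameters set on the hypertable are now inherited by its chunks
ALTER TABLE conditions SET (fillfactor = 80, autovacuum_vacuum_threshold = 100);
```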
A hypertable's associated schema is used to create and store internal
data tables (chunks). A hypertable creates tables in that schema,
typically with full superuser permissions, regardless of whether the
hypertable's owner or the current user has permissions for the schema.
If the schema doesn't exist, the hypertable will create it when
creating the first chunk, even if the user or table owner does not
have permission to create schemas in the database.
This change adds proper permissions checks to create_hypertable() so
that users cannot create hypertables with a custom associated schema
unless they have the proper permissions on the schema or the database.
Chunks are also no longer created with internal schema permissions if
the associated schema is something different from the internal schema.
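As a sketch, assuming `associated_schema_name` is the option that selects a
custom associated schema, the new check applies to calls like:

```sql
-- Fails unless the caller has CREATE permission on schema 'archive'
-- (or permission to create it if it does not yet exist)
SELECT create_hypertable('conditions', 'time',
                         associated_schema_name => 'archive');
```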
This change is part of an effort to create a consistent way
of dealing with metadata catalog updates, which is currently
a mix of C API and INSERT/UPDATE/DELETE statements from SQL
code. This mix makes catalog handling unnecessarily complex as
there are multiple ways to update metadata, increasing the risk
of security issues with publicly exposed SQL functions. It also
complicates things like cache invalidation, requiring different
mechanisms for C and SQL code. Catalog updates from SQL code
require triggers on metadata tables for cache invalidation that
do not work with native catalog updates.
The creation of chunks has been particularly messy in this regard,
making the code hard to follow, especially the handling of a chunk's
constraints, where dimensional and other constraints were handled
differently. With this change, constraint handling is now consistent
across constraint types, with a single API for updating metadata.
Reduce memory usage for out-of-order inserts
The chunk_result_relation_info should be put on the chunk memory
context. This will cause the rri constraint expr to also go onto
that context and be correctly freed when the chunk insert state
is destroyed.
Compatibility with pg_upgrade required 2 changes:
1) search_path on functions cannot be blank for pg_upgrade.
2) The timescaledb.restoring GUC had to apply to more code (it is now
checked at a higher level).
`pg_upgrade` must be passed the following option: `-O "-c timescaledb.restoring='on'"`
A change to the COPY test made the sort order ambiguous
for certain tuples, which breaks the test on some machines
where the sort order might differ from the expected one.
This fix makes the sort order more predictable.