48 Commits

Author SHA1 Message Date
Erik Nordström
ae587c9964 Add API function for explicit chunk creation
This adds an internal API function to create a chunk using explicit
constraints (dimension slices). A function to export a chunk in a
format consistent with the chunk creation function is also added.

The chunk export/create functions are needed for distributed
hypertables so that an access node can create chunks on data nodes
according to its own (global) partitioning configuration.
2020-05-27 17:31:09 +02:00
Erik Nordström
36af23ec94 Use flags for cache query options
Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy so this change turns
these booleans into one flag parameter.
2020-04-14 23:12:15 +02:00
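The switch from several boolean parameters to one flag parameter, described in commit 36af23ec94, can be illustrated with a minimal standalone sketch. The flag names, the `Cache` struct, and `cache_fetch()` are hypothetical stand-ins chosen for this example; the commit message does not give the actual identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names; the commit message does not list the real ones. */
typedef enum CacheQueryFlags
{
    CACHE_FLAG_NONE = 0,
    CACHE_FLAG_MISSING_OK = 1 << 0, /* do not fail on a cache miss */
    CACHE_FLAG_NOCREATE = 1 << 1    /* do not create an entry on a miss */
} CacheQueryFlags;

typedef enum CacheResult
{
    CACHE_HIT,
    CACHE_CREATED,
    CACHE_MISS,
    CACHE_ERROR
} CacheResult;

#define CACHE_CAPACITY 4

/* Toy cache holding integer keys only (assumption for illustration). */
typedef struct Cache
{
    int keys[CACHE_CAPACITY];
    size_t nentries;
} Cache;

/* One flags argument replaces a list of boolean parameters. */
CacheResult
cache_fetch(Cache *cache, int key, unsigned int flags)
{
    assert(cache != NULL);

    for (size_t i = 0; i < cache->nentries; i++)
        if (cache->keys[i] == key)
            return CACHE_HIT;

    /* Create the entry unless the caller opted out (or the toy cache is full). */
    if (!(flags & CACHE_FLAG_NOCREATE) && cache->nentries < CACHE_CAPACITY)
    {
        cache->keys[cache->nentries++] = key;
        return CACHE_CREATED;
    }

    /* A miss is only an error when the caller did not pass "missing ok". */
    return (flags & CACHE_FLAG_MISSING_OK) ? CACHE_MISS : CACHE_ERROR;
}
```

A caller combines behaviors with bitwise OR, for example `cache_fetch(&cache, key, CACHE_FLAG_MISSING_OK | CACHE_FLAG_NOCREATE)`, so adding a new option does not change the function signature.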
Erik Nordström
afb4c7ba51 Refactor planner hooks
This change refactors our main planner hooks in `planner.c` with the
intention of providing a consistent way to classify planned relations
across hooks. In our hooks, we'd like to know whether a planned
relation (`RelOptInfo`) is one of the following:

* Hypertable
* Hypertable child (a hypertable can appear as a child of itself)
* Chunk as a child of hypertable (from expansion)
* Chunk as standalone (operation directly on chunk)
* Any other relation

Previously, there was no way to consistently know which of these one
was dealing with. Instead, a mix of various functions was used without
"remembering" the classification for reuse in later sections of the
code.

When classifying relations according to the above categories, the only
source of truth about a relation is our catalog metadata. In case of
hypertables, this is cached in the hypertable cache. However, this
cache is read-through, so, in case of a cache miss, the metadata will
always be scanned to resolve a new entry. To avoid unnecessary
metadata scans, this change introduces a way to do cache-only
queries. This requires maintaining a single warmed cache throughout
planning and is enabled by using a planner-global cache object. The
pre-planning query processing warms the cache by populating it with
all hypertables in the to-be-planned query.
2020-04-14 23:12:15 +02:00
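The relation categories listed in commit afb4c7ba51 could be modeled roughly as follows. This is a standalone sketch, not the code in `planner.c`: the enum values, the `RelInfo` and `PlannerCache` structs, and the `is_chunk` field are assumptions made for illustration, and `Oid` is redefined locally so the example compiles without PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid; /* local stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)

/* Hypothetical names for the five categories in the commit message. */
typedef enum RelType
{
    REL_OTHER,
    REL_HYPERTABLE,
    REL_HYPERTABLE_CHILD,
    REL_CHUNK_CHILD,
    REL_CHUNK_STANDALONE
} RelType;

/* Planner-global cache, warmed before planning with the query's hypertables. */
typedef struct PlannerCache
{
    const Oid *hypertables;
    size_t nhypertables;
} PlannerCache;

typedef struct RelInfo
{
    Oid relid;
    Oid parent_relid; /* InvalidOid when the relation is not a child */
    bool is_chunk;    /* assumed to be known from chunk metadata */
} RelInfo;

/* Cache-only query: a miss never triggers a metadata scan. */
bool
cache_only_is_hypertable(const PlannerCache *cache, Oid relid)
{
    assert(cache != NULL);

    for (size_t i = 0; i < cache->nhypertables; i++)
        if (cache->hypertables[i] == relid)
            return true;
    return false;
}

RelType
classify_relation(const PlannerCache *cache, const RelInfo *rel)
{
    if (rel->parent_relid != InvalidOid)
    {
        if (!cache_only_is_hypertable(cache, rel->parent_relid))
            return REL_OTHER;
        /* A hypertable can appear as a child of itself. */
        return rel->relid == rel->parent_relid ? REL_HYPERTABLE_CHILD : REL_CHUNK_CHILD;
    }

    if (cache_only_is_hypertable(cache, rel->relid))
        return REL_HYPERTABLE;

    return rel->is_chunk ? REL_CHUNK_STANDALONE : REL_OTHER;
}
```

Classifying once and storing the result lets later hooks reuse it instead of calling a mix of lookup functions again.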
Ruslan Fomkin
4dc0693d1f Unify error message if hypertable not found
Refactors multiple implementations of finding hypertables in cache
and failing with different error messages if not found. The
implementations are replaced with calling functions, which encapsulate
a single error message. This provides a unified error message and
removes the need for copy-paste.
2020-01-29 08:10:27 +01:00
David Kohn
f17aeea374 Initial cont agg INSERT/materialization support
This commit adds initial support for the continuous aggregate materialization
and INSERT invalidations.

INSERT path:
  On INSERT, DELETE and UPDATE we log the [max, min] time range that may be
  invalidated (that is, newly inserted, updated, or deleted) to
  _timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log
  will be used to re-materialize these ranges, to ensure that the aggregate
  is up-to-date. Currently these invalidations are recorded by a trigger
  _timescaledb_internal.continuous_agg_invalidation_trigger, which should be
  added to the hypertable when the continuous aggregate is created. This trigger
  stores a cache of min/max values per-hypertable, and on transaction commit
  writes them to the log, if needed. At the moment, we consider them to always
  be needed, unless we're in ReadCommitted mode or weaker, and the min
  invalidated value is greater than the hypertable's invalidation threshold
  (found in _timescaledb_catalog.continuous_aggs_invalidation_threshold)

Materialization path:
  Materialization currently happens in multiple phases: in phase 1 we determine
  the timestamp at which we will end the new set of materializations, then we
  update the hypertable's invalidation threshold to that point, and finally we
  read the current invalidations, then materialize any invalidated rows, the new
  range between the continuous aggregate's completed threshold (found in
  _timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's
  invalidation threshold. After all of this is done we update the completed
  threshold to the invalidation threshold. The portion of this protocol from
  after the invalidations are read, until the completed threshold is written
  (that is, actually materializing, and writing the completion threshold) is
  included with this commit, with the remainder to follow in subsequent ones.
  One important caveat: because the thresholds are exclusive (we invalidate
  all values _less_ than the invalidation threshold) and we store the time
  value as an int64 internally, we can never determine whether the row at
  PG_INT64_MAX is invalidated. To avoid this problem, we never materialize
  the time bucket containing PG_INT64_MAX.
2019-04-26 13:08:00 -04:00
Matvey Arye
34edba16a9 Run clang-format on code 2019-02-05 16:55:16 -05:00
Joshua Lockerman
acc41a7712 Update license header
Only have the copyright in the NOTICE. Hopefully
only having to update one place each year will
keep it consistent.
2019-01-03 11:57:51 -05:00
Joshua Lockerman
4e1e15f079 Add reorder command
New cluster-like command which writes to a new index then swaps,
much like is done for the data table, and only acquires
exclusive locks for said swap. This trades off disk usage for
lower contention: we hold locks for a much shorter period of time,
allowing reads to work concurrently, but we have both the old
and new versions of the table existing at once, approximately
doubling storage usage while reorder is running.

Currently only works on chunks.
2019-01-02 15:43:48 -05:00
David Kohn
5aa1edac15 Refactor compatibility functions and code to support PG11
Introduce PG11 support by introducing compatibility functions for
any whose signatures have changed in PG11. Additionally, refactor
the structure of the compatibility functions found in compat.h by
breaking them out by function (or small set of similar functions)
so that it is easier to see what changed between versions and maintain
changes as more versions are supported.

In general, the philosophy has been to try for forward compatibility
wherever possible, so that we use the latest versions of function interfaces
where we can or where reasonably convenient and mimic the behavior
in older versions as much as possible.
2018-12-12 11:42:33 -05:00
Joshua Lockerman
9de504f958 Add ts_ prefix to everything in headers
Future proofing: if we ever want to make our functions available to
others they’d need to be prefixed to prevent name collisions. In
order to avoid having some functions with the ts_ prefix and
others without, we’re adding the prefix to all non-static
functions now.
2018-12-05 14:43:22 -05:00
Erik Nordström
28e2e6a2f7 Refactor scanner callback interface
This change adds proper result types for the scanner's filter and
tuple handling callbacks. Previously, these callbacks were supposed to
return bool, which was hard to interpret. For instance, for the tuple
handler callback, true meant continue processing the next tuple while
false meant finish the scan. However, this wasn't always clear. Having
proper return types also makes it easier to see from a function's
signature that it is a scanner callback handler, rather than some
other function that can be called directly.
2018-11-08 17:33:26 +01:00
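The idea behind commit 28e2e6a2f7, replacing ambiguous `bool` returns with dedicated result types, can be sketched in standalone C. The enum and function names below are illustrative and not confirmed by the commit message, and the "tuples" are plain integers so the example runs without PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

/* Result of the tuple handler: keep scanning or stop. */
typedef enum ScanTupleResult
{
    SCAN_DONE,
    SCAN_CONTINUE
} ScanTupleResult;

/* Result of the filter: pass the tuple to the handler or skip it. */
typedef enum ScanFilterResult
{
    SCAN_EXCLUDE,
    SCAN_INCLUDE
} ScanFilterResult;

/* The return types make it clear from the signature what a callback is for. */
typedef ScanFilterResult (*filter_func)(int tuple, void *data);
typedef ScanTupleResult (*tuple_found_func)(int tuple, void *data);

/* Scan an array of toy tuples; returns the number of tuples handled. */
size_t
scan_tuples(const int *tuples, size_t ntuples, filter_func filter,
            tuple_found_func tuple_found, void *data)
{
    size_t nhandled = 0;

    assert(tuple_found != NULL);

    for (size_t i = 0; i < ntuples; i++)
    {
        if (filter != NULL && filter(tuples[i], data) == SCAN_EXCLUDE)
            continue;

        nhandled++;

        if (tuple_found(tuples[i], data) == SCAN_DONE)
            break; /* the handler has everything it needs */
    }
    return nhandled;
}

/* Example callbacks. */
typedef struct CollectState
{
    int wanted;
    int found;
} CollectState;

ScanFilterResult
only_even(int tuple, void *data)
{
    (void) data;
    return tuple % 2 == 0 ? SCAN_INCLUDE : SCAN_EXCLUDE;
}

ScanTupleResult
collect_until_full(int tuple, void *data)
{
    CollectState *state = data;

    (void) tuple;
    state->found++;
    return state->found >= state->wanted ? SCAN_DONE : SCAN_CONTINUE;
}
```

Compared with `return true;`, a statement such as `return SCAN_DONE;` states the intent at the point of return.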
Joshua Lockerman
d8e41ddaba Add Apache License header to all C files 2018-10-29 13:28:19 -04:00
Mike Futerko
4f2f1a6eb7 Update the error messages to conform with the style guide; Fix tests
An attempt to unify the error messages to conform with the PostgreSQL error
messages style guide. See the link below:
https://www.postgresql.org/docs/current/static/error-style-guide.html
2018-07-10 12:55:02 -04:00
Erik Nordström
2e1f3b9fd0 Improve memory allocation during cache lookups
Previously, cache lookups were run on the cache's memory
context. While simple, this risked allocating transient (work) data on
that memory context, e.g., when scanning for new cache data during
cache misses.

This change makes scan functions take a memory context, which the
found data should be allocated on. All other data is allocated on the
current memory (typically the transaction's memory context). With this
functionality, a cache can pass its memory context to the scan, thus
avoiding taking on unnecessary memory allocations.
2018-06-22 16:45:07 +02:00
Erik Nordström
71962b86ec Refactor dimension-related API functions
The functions for adding and updating dimensions have been refactored
in C to:

- improve usage of proper error codes
- make messages that better conform with the PostgreSQL standard
- improve security by reducing the amount of code that runs under SECURITY DEFINER

A new if_not_exists option has also been added to add_dimension() and
the number of partitions can now be set using the new
set_number_partitions() function.

A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check for intervals > 0 and smallint intervals
accepted values up to UINT16_MAX instead of INT16_MAX.
2018-01-25 19:02:34 +01:00
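The interval validation fix mentioned in commit 71962b86ec amounts to two checks: the interval must be positive, and it must fit the signed range of the time column's type. A minimal standalone sketch follows; the `TimeColumnType` enum and `interval_is_valid()` are hypothetical names, not the functions in the codebase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enum for integer time column types. */
typedef enum TimeColumnType
{
    TIME_SMALLINT,
    TIME_INTEGER,
    TIME_BIGINT
} TimeColumnType;

/* An interval is valid when it is > 0 and fits the signed type's range. */
bool
interval_is_valid(TimeColumnType type, int64_t interval)
{
    int64_t max;

    switch (type)
    {
        case TIME_SMALLINT:
            max = INT16_MAX; /* not UINT16_MAX: smallint is signed */
            break;
        case TIME_INTEGER:
            max = INT32_MAX;
            break;
        case TIME_BIGINT:
            max = INT64_MAX;
            break;
        default:
            assert(!"unknown time column type");
            return false;
    }

    return interval > 0 && interval <= max;
}
```

With these checks, a smallint interval of 32767 is accepted, while 0, negative values, and anything from 32768 up to 65535 are rejected.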
Erik Nordström
e593876cb0 Refactor tablespace handling
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
2017-12-09 18:27:50 +01:00
Erik Nordström
c4a46ac8a1 Add hypertable cache lookup on ID/pkey
Hypertables can now be looked up through the cache on
ID/pkey in addition to OID.
2017-12-09 18:27:50 +01:00
Erik Nordström
e1a0e819cf Refactor and fix cache invalidation
TimescaleDB cache invalidation happens as a side effect of doing a
full SQL statement (INSERT/UPDATE/DELETE) on a catalog table (via
table triggers). However, triggers aren't invoked when using
PostgreSQL's internal catalog API for updates, since PostgreSQL's
catalog tables don't have triggers that require full statement
parsing, planning, and execution.

Since we are now using the regular PostgreSQL catalog update API for
some TimescaleDB catalog operations, we need to do cache invalidation
also on such operations.

This change adds cache invalidation when updating catalogs using the
internal (C) API and also makes the cache invalidation more fine
grained. For instance, caches are no longer invalidated on some
INSERTS that do not affect the validity of objects already in the
cache, such as adding a new chunk.
2017-11-27 17:33:10 +01:00
Erik Nordström
500563ffe5 Add support for PostgreSQL 10
The extension now works with PostgreSQL 10, while
retaining compatibility with version 9.6.

PostgreSQL 10 has numerous internal changes to functions and
APIs, which necessitates various glue code and compatibility
wrappers to seamlessly retain backwards compatibility with older
versions.

Test output might also differ between versions. In particular,
the psql client generates version-specific output with `\d` and
EXPLAINs might differ due to new query optimizations. The test
suite has been modified as follows to handle these issues. First,
tests now use version-independent functions to query system
catalogs instead of using `\d`. Second, changes have been made to
the test suite to be able to verify some test outputs against
version-dependent reference files.
2017-11-10 09:44:20 +01:00
Erik Nordström
097db3d589 Refactor chunk index handling
This change refactors the chunk index handling to make better use
of standard PostgreSQL catalog information, while removing the
hypertable_index metadata table and associated triggers, including
those on the chunk_index table. The chunk_index table itself is
also simplified.

A benefit of this refactoring is that indexes are no longer
created using string mangling to construct the CREATE INDEX command
for a chunk, based on the string definition of the hypertable
index. Instead, indexes are created in C using proper index-related
internal data structures.

Chunk indexes can now also be renamed and are added in the parent
index tablespace. Changing tablespace on a hypertable index also
recurses to chunks, as expected. Default indexes that are added when
creating a hypertable use the hypertable's tablespace.

Creating Hypertable indexes with the CONCURRENTLY modifier is
currently blocked, due to unclear semantics regarding concurrent
creation over many tables, including how to deal with snapshots.
2017-10-03 10:51:32 +02:00
Erik Nordström
953346c18b Make VACUUM and REINDEX recurse to chunks
Previously, when issued on a hypertable, database maintenance
commands, like VACUUM and REINDEX, only affected the main
table and did not recurse to chunks.

This change fixes that issue, allowing database maintainers
to issue single commands on hypertables that affect all
the data stored in the hypertable.

These commands (VACUUM, REINDEX) only work at the table level
for hypertables. If issued at other levels, e.g., schema, or
database, the behavior is the same as in standard PostgreSQL
as all tables are covered by default.

REINDEX commands that specify a hypertable index do not
recurse as that requires mapping the hypertable
index to the corresponding index on the chunk. This might
be fixed in a future update.
2017-08-15 17:26:52 +02:00
Matvey Arye
44da2c0be6 Run pgindent on code 2017-06-26 18:10:59 -04:00
Erik Nordström
a6309dac48 Fix a number of comments and cleanup unused code 2017-06-22 20:15:38 +02:00
Matvey Arye
ce3d630b6d Run pgindent on code 2017-06-22 20:15:38 +02:00
Erik Nordström
e75cd7e66b Finer grained memory management
Also fix a number of memory allocation bugs
and properly initialize chunks that are allocated
during a scan for chunks.
2017-06-22 20:15:38 +02:00
Matvey Arye
f5d7786eed Change the semantics of range_end to be exclusive 2017-06-22 20:15:38 +02:00
Erik Nordström
700c9c8a79 Refactor insert path in C.
Also in this commit:

- Rename time/space to open/closed for more generality.
- Create a Point data type for mapping a tuple to an
  N-dimensional space.
- Numerous fixes and cleanups.
2017-06-22 20:15:38 +02:00
Erik Nordström
7b8de0c592 Refactor catalog for new schema and add native data types
This is the first stab at updating the table and data type
definitions in the catalog module in the C code. This also
adds functions for natively scanning the dimension and
dimension_slice tables.
2017-06-22 20:15:38 +02:00
Matvey Arye
bfe58b61f7 Refactor towards supporting version upgrades
Clean up the table schema to get rid of legacy tables and functionality
that makes it more difficult to provide an upgrade path.

Notable changes:
* Get rid of legacy tables and code
* Simplify directory structure for SQL code
* Simplify table hierarchy: remove root table and make chunk tables
  inherit directly from main table
* Change chunk table suffix from _data to _chunk
* Simplify schema usage: _timescaledb_internal for internal functions.
  _timescaledb_catalog for metadata tables.
* Remove postgres_fdw dependency
* Improve code comments in sql code
2017-06-08 13:55:05 -04:00
Erik Nordström
2bc60c79e3 Fix time interval field name in hypertable cache entry
This change makes the naming of the time interval field in
the hypertable cache entry consistent with the table schema.
2017-05-23 16:40:00 -04:00
Matvey Arye
b2900f9f85 Disable query optimization on regular tables (non-hypertables)
This PR disables query optimizations on regular tables by default.
The option timescaledb.optimize_plain_tables = 'on' enables them
again. timescaledb.disable_optimizations = 'on' disables all
optimizations (note the change from 'true' to 'on').
2017-05-20 15:31:47 -04:00
Erik Nordström
c60b08e83a Fix DROP EXTENSION
DROP EXTENSION didn't properly reset caches and other saved state
causing various errors related to bad state when the extension was
dropped and/or recreated later.

This patch adds functionality to track the state of the extension and
also signals DROP EXTENSION to other backends that might be running,
allowing them to reset their internal extension state.
2017-03-22 19:43:40 +01:00
Erik Nordström
89692c9761 Add cache statistics and do minor cleanup
Track statistics for cache hits and misses in the cache module.
Currently not exposed to SQL, but might be useful for internal
debugging.
2017-03-22 09:57:44 +01:00
Erik Nordström
852ba7ee97 Avoid allocating hypertable cache storage for negative entries.
The hypertable cache stores negative cache entries to speed up checks
for tables that aren't hypertables. However, a full hypertable cache
entry was allocated for these negative entries, thus wasting cache
storage. With this update, the full entry is only allocated for
positive entries that actually represent hypertables.
2017-03-14 13:31:34 +01:00
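The negative-entry optimization in commit 852ba7ee97 can be sketched as a cache whose entries hold a pointer to the full hypertable data, leaving it NULL for tables that are not hypertables. Everything here is a simplified stand-in: the struct layouts, the fixed-size slot array, the `catalog` array that replaces a metadata scan, and the function names are assumptions for illustration, and the hit/miss counters echo the statistics commit only loosely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid; /* local stand-in for PostgreSQL's Oid */

/* Full entry: only allocated for real hypertables. */
typedef struct Hypertable
{
    Oid relid;
    int num_dimensions;
} Hypertable;

typedef struct CacheEntry
{
    bool used;
    Oid relid;
    Hypertable *hypertable; /* NULL marks a negative entry */
} CacheEntry;

#define CACHE_SLOTS 8

typedef struct HypertableCache
{
    CacheEntry entries[CACHE_SLOTS];
    const Oid *catalog; /* stand-in for the catalog metadata */
    size_t catalog_len;
    unsigned int hits;
    unsigned int misses;
} HypertableCache;

/* Returns the hypertable, or NULL if relid is not a hypertable. */
const Hypertable *
hypertable_cache_get(HypertableCache *cache, Oid relid)
{
    CacheEntry *free_slot = NULL;
    Hypertable *ht = NULL;

    assert(cache != NULL);

    for (size_t i = 0; i < CACHE_SLOTS; i++)
    {
        CacheEntry *entry = &cache->entries[i];

        if (entry->used && entry->relid == relid)
        {
            cache->hits++;
            return entry->hypertable; /* NULL for a negative entry */
        }
        if (!entry->used && free_slot == NULL)
            free_slot = entry;
    }

    cache->misses++;
    assert(free_slot != NULL); /* this sketch does not implement eviction */

    for (size_t i = 0; i < cache->catalog_len; i++)
    {
        if (cache->catalog[i] == relid)
        {
            ht = malloc(sizeof(Hypertable));
            if (ht == NULL)
                return NULL; /* out of memory: do not cache a wrong answer */
            ht->relid = relid;
            ht->num_dimensions = 1;
            break;
        }
    }

    /* Negative entries cost one small slot, not a full Hypertable. */
    free_slot->used = true;
    free_slot->relid = relid;
    free_slot->hypertable = ht;
    return ht;
}

void
hypertable_cache_destroy(HypertableCache *cache)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++)
    {
        free(cache->entries[i].hypertable);
        cache->entries[i].hypertable = NULL;
        cache->entries[i].used = false;
    }
}
```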
Matvey Arye
1f13354bf9 Make the planner use metadata cache
Previously, the planner used a direct query via the SPI interface to
retrieve metadata info needed for query planner functions like query
rewriting. This commit updates the planner to use our caching system.
This is a performance improvement for pretty much all operations,
both data modifications and queries.

For hypertables, this added a cache keyed by the main table OID and
added negative entries (because the planner often needs to know if a
table is /not/ a hypertable).
2017-03-13 14:18:33 -04:00
Erik Nordström
3a03348356 Move catalog table definitions to catalog.h
This patch continues work to consolidate catalog information
and definitions to the catalog module.

It also refactors the naming of some data types to adhere
to camelcase naming convention (Hypertable, PartitionEpoch).
2017-03-10 20:39:20 +01:00
Erik Nordström
d1ad3afd49 Perform native scans for chunks (replicas)
Previously, chunk replicas were retrieved with an SPI query. Now, all
catalog items are retrieved with native scans, with the exception of
newly created chunks.

This commit also refactors the chunk (replica) cache, removing some
data structures that were duplicating information. Now chunks are
cached by their ID (including their replicas) instead of just the
set of replicas. This removes the need for additional data structures,
such as the replica set which looked like a chunk minus time info,
and the cache entry wrapper struct. Another upside is that chunks
can be retrieved from the cache directly by ID.
2017-03-10 17:28:49 +01:00
Matvey Arye
eb09ccc980 Enable catalog to support multiple indexes per table 2017-03-09 11:32:12 -05:00
Matvey Arye
00b69ac010 Change close-chunk logic to use a c-based fastpath
This change is a performance improvement. Previously each insert called
a plpgsql function to check if there is a need to close the chunk. This
patch implements a c-only fastpath for the case when the table size is
less than the configured chunk size.
2017-03-07 12:05:53 -05:00
Matvey Arye
32c45b75b2 formatting with pgindent 2017-03-06 15:20:00 -05:00
Matvey Arye
4d4ac78ef5 cleanup 2017-03-05 10:13:45 -05:00
Matvey Arye
64e8ec1877 Ordering inserts to avoid deadlocks 2017-03-03 10:10:14 -05:00
Erik Nordström
eebd9bbbc1 Use native scan for chunks.
The chunk catalog table is now scanned with a native
scan rather than SPI call.

The scanner module is also updated with the option of
taking locks on found tuples. In the case of chunk
scanning, chunks are typically returned with a share
lock on the tuple.
2017-03-02 12:25:00 +01:00
Erik Nordström
61668c837a Allow heap/index scans to abort early if condition is met.
Aborting scans can be useful when scanning large tables or indexes and
the scan function knows that a condition has been met (e.g., all
tuples needed have been found). For instance, when scanning for
hypertable partitions, one can abort the scan when the found tuple
count equals the number of partitions in the partition epoch.
2017-03-02 12:25:00 +01:00
Erik Nordström
1daec06ce6 Use native scans for partition epochs and partitions.
This patch refactors the code to use native heap/index scans for
finding partition epochs and partitions. It also moves the
partitioning-related code and data structures to partitioning.{c,h}.
2017-03-02 12:25:00 +01:00
Erik Nordström
c92fa28ea8 Add scanner module implementing both heap and index scans. 2017-03-02 12:24:59 +01:00
Erik Nordström
f99669c880 Use direct index scan for hypertable lookups.
This is a first stab at moving from SPI queries
to direct heap/index scans for cleaner and more
efficient code. By doing direct scans, there is no
need to prepare and cache a bunch of query plans
that make the code both slower and more complex.

This patch also adds a catalog module that keeps
cached OIDs and other things for catalog tables.
The cached information is updated every time the
backend switches to a new database. A permission
check is also implemented when accessing the catalog
information, but should probably be extended to
tables and schemas in the future.
2017-03-02 12:24:59 +01:00
Matvey Arye
67ad21ee36 Cache hypertable metadata information for faster inserts.
Add caches for hypertable metadata to make it faster to map
INSERT rows to the chunk they should go into.
2017-03-02 12:24:53 +01:00