This adds an internal API function to create a chunk using explicit
constraints (dimension slices). A function to export a chunk in a
format consistent with the chunk creation function is also added.
The chunk export/create functions are needed for distributed
hypertables so that an access node can create chunks on data nodes
according to its own (global) partitioning configuration.
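A rough sketch of the shape of such an internal, SQL-callable function
(the name, arguments, and return value here are hypothetical):

```c
#include <postgres.h>
#include <fmgr.h>

PG_FUNCTION_INFO_V1(chunk_create_with_constraints);

/* Create a chunk from an explicit set of dimension slices (e.g., exported
 * by an access node), rather than computing the slices locally. */
Datum
chunk_create_with_constraints(PG_FUNCTION_ARGS)
{
    /* 1. identify the hypertable from the first argument
     * 2. deserialize the dimension slices from the second argument
     * 3. create matching dimension slices/constraints and the chunk table,
     *    registering everything in the catalog
     * 4. return the created chunk's metadata */
    PG_RETURN_NULL(); /* placeholder in this sketch */
}
```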
Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy, so this change turns
these booleans into a single flags parameter.
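For illustration, a minimal sketch of what such a flags parameter could
look like (the names here are hypothetical, not the exact ones in the
code):

```c
/* Hypothetical cache-query flags replacing separate boolean parameters. */
typedef enum CacheQueryFlags
{
    CACHE_FLAG_NONE = 0,
    CACHE_FLAG_MISSING_OK = 1 << 0, /* do not fail on a cache miss */
    CACHE_FLAG_NOCREATE = 1 << 1,   /* do not create a new entry on a miss */
} CacheQueryFlags;

/* Before: cache_get_entry(cache, key, missing_ok, no_create)
 * After:  cache_get_entry(cache, key, CACHE_FLAG_MISSING_OK | CACHE_FLAG_NOCREATE) */
```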
This change refactors our main planner hooks in `planner.c` with the
intention of providing a consistent way to classify planned relations
across hooks. In our hooks, we'd like to know whether a planned
relation (`RelOptInfo`) is one of the following:
* Hypertable
* Hypertable child (a hypertable can appear as a child of itself)
* Chunk as a child of hypertable (from expansion)
* Chunk as standalone (operation directly on chunk)
* Any other relation
Previously, there was no consistent way to know which of these
categories a relation belonged to. Instead, a mix of various functions
was used ad hoc, without remembering the classification for reuse in
later parts of the code.
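To make the classification explicit and reusable, an enum along the
following lines can be carried with the planned relation (a sketch; the
exact names may differ):

```c
/* One value per category of planned relation that the hooks care about. */
typedef enum TsRelType
{
    TS_REL_HYPERTABLE,       /* a hypertable's root table */
    TS_REL_CHUNK,            /* a chunk operated on directly (standalone) */
    TS_REL_HYPERTABLE_CHILD, /* a hypertable appearing as a child of itself */
    TS_REL_CHUNK_CHILD,      /* a chunk that is a child of an expanded hypertable */
    TS_REL_OTHER             /* any other relation */
} TsRelType;
```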
When classifying relations according to the above categories, the only
source of truth about a relation is our catalog metadata. In case of
hypertables, this is cached in the hypertable cache. However, this
cache is read-through, so, in case of a cache miss, the metadata will
always be scanned to resolve a new entry. To avoid unnecessary
metadata scans, this change introduces a way to do cache-only
queries. This requires maintaining a single warmed cache throughout
planning and is enabled by using a planner-global cache object. The
pre-planning query processing warms the cache by populating it with
all hypertables in the to-be-planned query.
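A rough sketch of this warm-then-lookup pattern, with hypothetical
names (the Cache and Hypertable types are the extension's own):

```c
#include <postgres.h>

/* Planner-global, warmed hypertable cache. */
static Cache *planner_hcache = NULL;

/* Pre-planning: pin a cache and populate it with every hypertable
 * referenced by the to-be-planned query. */
static void
planner_hcache_warm(Query *parse)
{
    planner_hcache = hypertable_cache_pin();
    /* walk the query's range table, doing one regular (read-through)
     * lookup per relation so all hypertables end up in the cache */
}

/* During planning: cache-only lookup that never triggers a metadata scan. */
static Hypertable *
planner_get_hypertable(Oid relid)
{
    return hypertable_cache_get_entry(planner_hcache, relid,
                                      CACHE_FLAG_MISSING_OK | CACHE_FLAG_NOCREATE);
}
```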
Refactors the multiple implementations of looking up a hypertable in
the cache and failing with different error messages when it isn't
found. These are replaced by calls to a single function that
encapsulates the error message, giving a unified error message and
removing the need for copy-pasted code.
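A minimal sketch of such a helper (the function name and error code
shown here are illustrative, not the exact ones used):

```c
#include <postgres.h>
#include <utils/lsyscache.h>

/* Look up a hypertable in the cache and fail with one canonical error
 * message if the table is not a hypertable. */
static Hypertable *
cache_get_hypertable_or_fail(Cache *cache, Oid relid)
{
    Hypertable *ht = hypertable_cache_get_entry(cache, relid);

    if (ht == NULL)
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_TABLE),
                 errmsg("table \"%s\" is not a hypertable",
                        get_rel_name(relid))));

    return ht;
}
```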
This commit adds initial support for continuous aggregate
materialization and INSERT invalidations.
INSERT path:
On INSERT, DELETE and UPDATE we log the [min, max] time range that may be
invalidated (that is, newly inserted, updated, or deleted) to
_timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log
will be used to re-materialize these ranges, to ensure that the aggregate
is up-to-date. Currently these invalidations are recorded by a trigger,
_timescaledb_internal.continuous_agg_invalidation_trigger, which should be
added to the hypertable when the continuous aggregate is created. This trigger
stores a cache of min/max values per hypertable and, on transaction commit,
writes them to the log, if needed. At the moment, we consider them to always
be needed, unless we're in ReadCommitted mode or weaker and the min
invalidated value is greater than the hypertable's invalidation threshold
(found in _timescaledb_catalog.continuous_aggs_invalidation_threshold).
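In rough C terms, the commit-time decision above amounts to the
following (names are illustrative, not the actual trigger code):

```c
#include <postgres.h>

/* Decide whether the cached [min, max] invalidation range must be written
 * to the invalidation log at transaction commit. */
static bool
invalidation_needs_logging(int64 min_invalidated, int64 invalidation_threshold,
                           bool read_committed_or_weaker)
{
    /* The range may be skipped only under ReadCommitted (or weaker) when
     * everything modified lies above the invalidation threshold. */
    if (read_committed_or_weaker && min_invalidated > invalidation_threshold)
        return false;

    return true;
}
```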
Materialization path:
Materialization currently happens in multiple phases: in phase 1 we determine
the timestamp at which the new set of materializations will end, then we
update the hypertable's invalidation threshold to that point, and finally we
read the current invalidations and materialize any invalidated rows as well
as the new range between the continuous aggregate's completed threshold
(found in _timescaledb_catalog.continuous_aggs_completed_threshold) and the
hypertable's invalidation threshold. After all of this is done we update the
completed threshold to the invalidation threshold. The portion of this
protocol from after the invalidations are read until the completed threshold
is written (that is, actually materializing and writing the completed
threshold) is included in this commit, with the remainder to follow in
subsequent ones.
One important caveat: since the thresholds are exclusive, we invalidate
all values _less_ than the invalidation threshold, and since we store time
values as int64 internally, we can never determine whether the row at
PG_INT64_MAX is invalidated. To avoid this problem, we never materialize
the time bucket containing PG_INT64_MAX.
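A minimal sketch of that clamping rule, assuming a simple bucketing
scheme with origin 0 (the real bucketing logic is more involved):

```c
#include <postgres.h>

/* Clamp the end of a materialization so the bucket containing PG_INT64_MAX
 * is never materialized; since thresholds are exclusive, that bucket could
 * never be invalidated again. Assumes bucket_width > 0 and origin 0. */
static int64
clamp_materialization_end(int64 candidate_end, int64 bucket_width)
{
    int64 last_bucket_start = PG_INT64_MAX - (PG_INT64_MAX % bucket_width);

    return candidate_end < last_bucket_start ? candidate_end : last_bucket_start;
}
```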
New cluster-like command that writes to a new index and then swaps,
much like is done for the data table, and only acquires exclusive
locks for said swap. This trades off disk usage for lower contention:
we hold locks for a much shorter period of time, allowing reads to
proceed concurrently, but both the old and new versions of the table
exist at once, approximately doubling storage usage while the reorder
is running.
Currently only works on chunks.
Introduce PG11 support by adding compatibility functions for any
functions whose signatures have changed in PG11. Additionally, refactor
the structure of the compatibility functions found in compat.h by
breaking them out by function (or small set of similar functions),
so that it is easier to see what changed between versions and to
maintain the changes as more versions are supported.
In general, the philosophy has been to aim for forward compatibility
wherever possible, so that we use the latest version of a function's
interface where we can, or where reasonably convenient, and mimic its
behavior on older versions as much as possible.
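As an illustration of the per-function compat style (the macro name
here is hypothetical), consider heap_attisnull(), which gained a
TupleDesc argument in PG11:

```c
/* Canonical (newest) signature; older versions simply ignore the
 * extra argument. */
#if PG_VERSION_NUM >= 110000
#define heap_attisnull_compat(tup, attnum, tupledesc) \
    heap_attisnull(tup, attnum, tupledesc)
#else
#define heap_attisnull_compat(tup, attnum, tupledesc) \
    heap_attisnull(tup, attnum)
#endif
```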
Future-proofing: if we ever want to make our functions available to
others, they’d need to be prefixed to prevent name collisions. In
order to avoid having some functions with the ts_ prefix and
others without, we’re adding the prefix to all non-static
functions now.
This change adds proper result types for the scanner's filter and
tuple handling callbacks. Previously, these callbacks were supposed to
return bool, which was hard to interpret. For instance, for the tuple
handler callback, true meant continue processing the next tuple while
false meant finish the scan. However, this wasn't always clear. Having
proper return types also makes it easier to see from a function's
signature that it is a scanner callback handler, rather than some
other function that can be called directly.
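The result types look roughly like the following, with comments
describing the intent of each value:

```c
/* Returned by the tuple handler callback. */
typedef enum ScanTupleResult
{
    SCAN_DONE,     /* stop the scan after this tuple */
    SCAN_CONTINUE  /* continue with the next tuple */
} ScanTupleResult;

/* Returned by the filter callback. */
typedef enum ScanFilterResult
{
    SCAN_EXCLUDE, /* skip this tuple */
    SCAN_INCLUDE  /* pass this tuple on to the tuple handler */
} ScanFilterResult;
```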
Previously, cache lookups were run on the cache's memory
context. While simple, this risked allocating transient (work) data on
that memory context, e.g., when scanning for new cache data during
cache misses.
This change makes scan functions take a memory context on which the
found data should be allocated. All other data is allocated on the
current memory context (typically the transaction's memory
context). With this functionality, a cache can pass its own memory
context to the scan, thus avoiding unnecessary allocations on the
cache's context.
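A sketch of the pattern (the function and struct names are
hypothetical; Hypertable is the extension's cached catalog struct):

```c
#include <postgres.h>

/* Scan for a hypertable by relid, allocating only the result on the
 * caller-provided memory context; all transient scan data stays on
 * CurrentMemoryContext (typically transaction-lifetime memory). */
static Hypertable *
hypertable_scan_by_relid(Oid relid, MemoryContext result_mcxt)
{
    Hypertable *ht = NULL;
    MemoryContext old;

    /* ... perform the heap/index scan on CurrentMemoryContext ... */

    old = MemoryContextSwitchTo(result_mcxt);
    ht = palloc0(sizeof(Hypertable));
    /* ... copy the found tuple's fields into ht ... */
    MemoryContextSwitchTo(old);

    return ht;
}
```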
The functions for adding and updating dimensions have been refactored
in C to:
- improve the usage of proper error codes
- make messages better conform to the PostgreSQL standard
- improve security by avoiding running large amounts of code under
  SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension(), and
the number of partitions can now be set using the new
set_number_partitions() function.
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check that intervals were > 0, and smallint
intervals accepted values up to UINT16_MAX instead of INT16_MAX.
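For illustration, the corrected check amounts to something like this
(the function name is hypothetical):

```c
#include <postgres.h>

/* Validate a chunk time interval for a SMALLINT time column: it must be
 * strictly positive and fit in the signed 16-bit range. */
static void
validate_smallint_interval(int64 interval)
{
    if (interval <= 0 || interval > PG_INT16_MAX)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid chunk time interval for SMALLINT time column")));
}
```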
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
TimescaleDB cache invalidation happens as a side effect of doing a
full SQL statement (INSERT/UPDATE/DELETE) on a catalog table (via
table triggers). However, triggers aren't invoked when using
PostgreSQL's internal catalog API for updates, since that API bypasses
the full statement parsing, planning, and execution that trigger
firing requires.
Since we are now using the regular PostgreSQL catalog update API for
some TimescaleDB catalog operations, we need to do cache invalidation
also on such operations.
This change adds cache invalidation when updating catalogs using the
internal (C) API and also makes the cache invalidation more fine
grained. For instance, caches are no longer invalidated on some
INSERTS that do not affect the validity of objects already in the
cache, such as adding a new chunk.
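A sketch of the pattern on the internal update path (the proxy relation
used to broadcast the invalidation is a hypothetical stand-in here):

```c
#include <postgres.h>
#include <access/htup.h>
#include <catalog/indexing.h>
#include <utils/inval.h>
#include <utils/rel.h>

/* Update a TimescaleDB catalog tuple via the internal API and explicitly
 * signal cache invalidation, since no table triggers fire on this path. */
static void
catalog_update_with_invalidation(Relation catalog_rel, HeapTuple tuple,
                                 Oid cache_inval_proxy_relid)
{
    CatalogTupleUpdate(catalog_rel, &tuple->t_self, tuple);
    CacheInvalidateRelcacheByRelid(cache_inval_proxy_relid);
}
```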
The extension now works with PostgreSQL 10, while
retaining compatibility with version 9.6.
PostgreSQL 10 has numerous internal changes to functions and
APIs, which necessitates various glue code and compatibility
wrappers to seamlessly retain backwards compatibility with older
versions.
Test output might also differ between versions. In particular,
the psql client generates version-specific output with `\d` and
EXPLAINs might differ due to new query optimizations. The test
suite has been modified as follows to handle these issues. First,
tests now use version-independent functions to query system
catalogs instead of using `\d`. Second, changes have been made to
the test suite to be able to verify some test outputs against
version-dependent reference files.
This change refactors the chunk index handling to make better use
of standard PostgreSQL catalog information, while removing the
hypertable_index metadata table and associated triggers, including
those on the chunk_index table. The chunk_index table itself is
also simplified.
A benefit of this refactoring is that indexes are no longer
created using string mangling to construct the CREATE INDEX command
for a chunk, based on the string definition of the hypertable
index. Instead, indexes are created in C using proper index-related
internal data structures.
Chunk indexes can now also be renamed and are added in the parent
index tablespace. Changing tablespace on a hypertable index also
recurses to chunks, as expected. Default indexes that are added when
creating a hypertable use the hypertable's tablespace.
Creating hypertable indexes with the CONCURRENTLY modifier is
currently blocked, due to unclear semantics regarding concurrent
creation over many tables, including how to deal with snapshots.
Previously, when issued on a hypertable, database maintenance
commands, like VACUUM and REINDEX, only affected the main
table and did not recurse to chunks.
This change fixes that issue, allowing database maintainers
to issue single commands on hypertables that affect all
the data stored in the hypertable.
These commands (VACUUM, REINDEX) only work at the table level
for hypertables. If issued at other levels, e.g., schema or
database, the behavior is the same as in standard PostgreSQL,
as all tables are covered by default.
REINDEX commands that specify a hypertable index do not
recurse as that requires mapping the hypertable
index to the corresponding index on the chunk. This might
be fixed in a future update.
Also in this commit:
- Rename time/space to open/closed for more generality.
- Create a Point data type for mapping a tuple to an
  N-dimensional space (see the sketch below).
- Numerous fixes and cleanups.
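A rough sketch of such a Point (field names are illustrative):

```c
#include <postgres.h>

/* A point in the N-dimensional partitioning space: one coordinate per
 * dimension (open or closed) of the hypertable. */
typedef struct Point
{
    int16 cardinality; /* number of dimensions in the space */
    uint8 num_coords;  /* number of coordinates currently set */
    /* variable-length array of coordinates, one per dimension */
    int64 coordinates[FLEXIBLE_ARRAY_MEMBER];
} Point;
```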
This is the first stab at updating the table and data type
definitions in the catalog module in the C code. This also
adds functions for natively scanning the dimension and
dimension_slice tables.
Clean up the table schema to get rid of legacy tables and functionality
that makes it more difficult to provide an upgrade path.
Notable changes:
* Get rid of legacy tables and code
* Simplify directory structure for SQL code
* Simplify table hierarchy: remove root table and make chunk tables
  inherit directly from main table
* Change chunk table suffix from _data to _chunk
* Simplify schema usage: _timescaledb_internal for internal functions,
  _timescaledb_catalog for metadata tables
* Remove postgres_fdw dependency
* Improve code comments in SQL code
This PR disables query optimizations on regular tables by default.
The option timescaledb.optimize_plain_tables = 'on' enables them
again. timescaledb.disable_optimizations = 'on' disables all
optimizations (note the change from 'true' to 'on').
DROP EXTENSION didn't properly reset caches and other saved state,
causing various errors related to bad state when the extension was
dropped and/or recreated later.
This patch adds functionality to track the state of the extension and
also signals DROP EXTENSION to other backends that might be running,
allowing them to reset their internal extension state.
The hypertable cache stores negative cache entries to speed up checks
for tables that aren't hypertables. However, a full hypertable cache
entry was allocated for these negative entries, thus wasting cache
storage. With this update, the full entry is only allocated for
positive entries that actually represent hypertables.
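A sketch of the slimmer entry layout (names are hypothetical):

```c
#include <postgres.h>

/* A cache entry keyed by relid; ht is NULL for negative entries, so no
 * full Hypertable payload is allocated for tables that aren't hypertables. */
typedef struct HypertableCacheEntry
{
    Oid relid;      /* cache key: the main table's OID */
    Hypertable *ht; /* NULL for a negative entry */
} HypertableCacheEntry;
```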
Previously, the planner used a direct query via the SPI interface to
retrieve metadata info needed for query planner functions like query
rewriting. This commit updates the planner to use our caching system.
This is a performance improvement for pretty much all operations,
both data modifications and queries.
For hypertables, this adds a cache keyed by the main table OID,
including negative entries (because the planner often needs to know
whether a table is /not/ a hypertable).
This patch continues work to consolidate catalog information
and definitions to the catalog module.
It also refactors the naming of some data types to adhere
to camelcase naming convention (Hypertable, PartitionEpoch).
Previously, chunk replicas were retrieved with an SPI query. Now, all
catalog items are retrieved with native scans, with the exception of
newly created chunks.
This commit also refactors the chunk (replica) cache, removing some
data structures that were duplicating information. Now chunks are
cached by their ID (including their replicas) instead of just the
set of replicas. This removes the need for additional data structures,
such as the replica set which looked like a chunk minus time info,
and the cache entry wrapper struct. Another upside is that chunks
can be retrieved from the cache directly by ID.
This change is a performance improvement. Previously, each insert called
a plpgsql function to check whether the chunk needed to be closed. This
patch implements a C-only fast path for the case when the table size is
less than the configured chunk size.
The chunk catalog table is now scanned with a native
scan rather than SPI call.
The scanner module is also updated with the option of
taking locks on found tuples. In the case of chunk
scanning, chunks are typically returned with a share
lock on the tuple.
Aborting scans can be useful when scanning large tables or indexes and
the scan function knows that a condition has been met (e.g., all
tuples needed have been found). For instance, when scanning for
hypertable partitions, one can abort the scan when the found tuple
count equals the number of partitions in the partition epoch.
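For instance, a tuple-handler callback (using the boolean protocol the
scanner had at this point; the type and field names are hypothetical)
could abort the scan early like this:

```c
/* Called for each matching tuple; returning false aborts the scan. */
static bool
partition_tuple_found(TupleInfo *ti, void *data)
{
    PartitionScanState *state = data;

    state->num_found++;

    /* All partitions of the epoch found: no need to keep scanning. */
    if (state->num_found == state->num_partitions)
        return false;

    return true;
}
```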
This patch refactors the code to use native heap/index scans for
finding partition epochs and partitions. It also moves the
partitioning-related code and data structures to partitioning.{c,h}.
This is a first stab at moving from SPI queries
to direct heap/index scans for cleaner and more
efficient code. By doing direct scans, there is no
need to prepare and cache a bunch of query plans
that make the code both slower and more complex.
This patch also adds a catalog module that keeps
cached OIDs and other things for catalog tables.
The cached information is updated every time the
backend switches to a new database. A permission
check is also implemented when accessing the catalog
information, but should probably be extended to
tables and schemas in the future.
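A sketch of the catalog OID lookup behind that caching (the helper name
is hypothetical; the lookup functions are standard PostgreSQL ones):

```c
#include <postgres.h>
#include <catalog/namespace.h>

#define CATALOG_SCHEMA_NAME "_timescaledb_catalog"

/* Resolve the OID of a TimescaleDB catalog table by name. The catalog
 * module would call this once per database and cache the result, so that
 * later scans can open the relation by OID without repeated name lookups. */
static Oid
catalog_table_get_oid(const char *tablename)
{
    Oid nspid = get_namespace_oid(CATALOG_SCHEMA_NAME, false);

    return get_relname_relid(tablename, nspid);
}
```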