339 Commits

Author SHA1 Message Date
Brian Rowe
aeac52aef6 Rename telemetry_metadata table to just metadata
This change renames the _timescaledb_catalog.telemetry_metadata table to
_timescaledb_catalog.metadata. It also adds a new boolean column to this
table which is used to flag data that should be included in telemetry.

It also renames the src/telemetry/metadata.{h,c} files to
src/telemetry/telemetry_metadata.{h,c} and updates the API to reflect
this. Finally, it includes the logic to use the new boolean column
when populating the telemetry parse state.
2019-05-17 17:04:42 -07:00
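
A minimal sketch of reading the renamed table; the flag column name (include_in_telemetry) fits the layout this commit describes but is otherwise an assumption:

```sql
-- List metadata rows flagged for inclusion in telemetry.
-- The column name include_in_telemetry is assumed for illustration.
SELECT key, value
FROM _timescaledb_catalog.metadata
WHERE include_in_telemetry;
```
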
Sven Klemm
bfabb30be0 Release 1.3.0 2019-05-07 02:47:13 +02:00
Joshua Lockerman
899cd0538d Allow scheduled drop_chunks to cascade to aggs
This commit adds a cascade_to_materializations flag to the scheduled
version of drop_chunks that behaves much like the one from manual
drop_chunks: if drop_chunks runs on a hypertable that has a continuous
aggregate and this flag is not set, the chunks will not be dropped.
2019-04-30 15:46:49 -04:00
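
A hedged sketch of scheduling with the new flag, assuming it is exposed as a named argument of add_drop_chunks_policy (hypertable name illustrative):

```sql
-- Without cascade_to_materializations => true, the scheduled job will not
-- drop chunks that back a continuous aggregate.
SELECT add_drop_chunks_policy('conditions', INTERVAL '2 months',
                              cascade_to_materializations => true);
```
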
Matvey Arye
74f8d204a5 Optimize getting the chunk_id in continuous aggs
We replace chunk_for_tuple with chunk_id_from_relid for getting
chunk id fields when materializing continuous aggs. The old
function required passing in the entire row. This was very slow
because a lot of data was passed around at execution time.

The new function just uses the internal `tableoid` attribute to
convert the table relid to a chunk_id. This is much more efficient.
We also add memoization to the new function because it is most often
called consecutively for the same chunk.
2019-04-29 15:45:23 -04:00
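
For illustration, a query of the shape the materializer benefits from; the per-row `tableoid` system column identifies the chunk relation, so only a relid is passed (table name illustrative):

```sql
-- tableoid is PostgreSQL's per-row relation OID; converting it to a chunk id
-- avoids passing the whole row around at execution time.
SELECT _timescaledb_internal.chunk_id_from_relid(tableoid) AS chunk_id,
       count(*)
FROM conditions
GROUP BY 1;
```
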
Joshua Lockerman
ae3480c2cb Fix continuous_aggs info
This commit switches the remaining JOIN in the continuous_aggs_stats
view to LEFT JOIN. This way we'll still see info from the other columns
even when the background worker has not run yet.
This commit also switches the time fields to output text in the correct
format for the underlying time type.
2019-04-26 13:08:00 -04:00
Joshua Lockerman
3895e5ce0e Add a setting for max an agg materializes per run
Add a setting, max_materialized_per_run, which can be set to prevent a
continuous aggregate from materializing too much of the table in a
single run. This prevents a single run from locking the hypertable
for too long when running on a large data set.
2019-04-26 13:08:00 -04:00
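
The commit message does not say how the setting is surfaced; if it is a per-aggregate storage option, usage might look like this (hypothetical syntax and option name):

```sql
-- Hypothetical: cap how much history a single materialization run covers.
ALTER VIEW conditions_hourly
    SET (timescaledb.max_materialized_per_run = '12 hours');
```
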
gayyappan
b8f9b91e60 Add user view query definition for cont aggs
Add the query definition to
timescaledb_information.continuous_aggregates.

The user query (specified in the CREATE VIEW statement of a continuous
aggregate) is transformed in the process of creating the continuous
aggregate, and this modified query is saved in the pg_rewrite catalog.
In order to display the original query, we create an internal view
which is a replica of the user query. This is used to display the
definition in timescaledb_information.continuous_aggregates.

As an alternative we could save the original user query in our internal
catalogs, but this approach involves replicating a lot of postgres code
and causes portability problems.
2019-04-26 13:08:00 -04:00
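
The saved definition can then be read back from the information view; the column names below are a reasonable guess from the view's purpose rather than confirmed by the message:

```sql
-- Show each continuous aggregate together with its original CREATE VIEW query.
SELECT view_name, view_definition
FROM timescaledb_information.continuous_aggregates;
```
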
Matvey Arye
dc0e250428 Add pg_dump/restore tests for continuous aggs
The data in caggs needs to survive dump/restore. This
test makes sure that caggs that are materialized both
before and after restore are correct.

Two code changes were necessary to make this work:
1) The valid_job_type constraint on bgw_job needed to be altered to add
'continuous_aggregate' as a valid job type.

2) The user_view_query field needed to be changed to text because
dump/restore does not support pg_node_tree.
2019-04-26 13:08:00 -04:00
Joshua Lockerman
45fb1fc2c8 Handle drop_chunks on tables that have cont aggs
For hypertables that have continuous aggregates, calling drop_chunks now
drops all of the rows in the materialization table that were based on
the dropped chunks. Since we don't know what the correct default
behavior for drop_chunks is, we've added a new argument,
cascade_to_materializations, which must be set to true in order to call
drop_chunks on a hypertable which has a continuous aggregate.
drop_chunks is also blocked on the materialization tables of continuous
aggregates.
2019-04-26 13:08:00 -04:00
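
A sketch of the resulting call, assuming named-argument notation against the drop_chunks signature of this release (table name illustrative):

```sql
-- Errors out without the flag, since 'conditions' has a continuous aggregate;
-- with it, the matching rows in the materialization table are dropped too.
SELECT drop_chunks(INTERVAL '2 months', 'conditions',
                   cascade_to_materializations => true);
```
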
gayyappan
18d1607909 Add timescaledb_information views for continuous aggregates
Add timescaledb_information.continuous_aggregate_settings and timescaledb_information.continuous_aggregate_job_stats views
2019-04-26 13:08:00 -04:00
Matvey Arye
19d47daf23 Delete related catalog rows when continuous aggs are dropped
This PR deletes related rows from the following tables:
* completed_threshold
* invalidation_threshold
* hypertable_invalidation_log

The latter two tables are only affected if no other continuous aggs
exist on the raw hypertable.

This commit also adds locks to prevent concurrent raw table inserts
and any access to the materialization table when dropping caggs. It
also moves all locks to the beginning of the function so that the lock
order is easier to track and reason about.

Also added a few formatting fixes.
2019-04-26 13:08:00 -04:00
gayyappan
1cbd8c74f7 Add invalidation trigger for continuous aggs
Add invalidation trigger for DML changes to the hypertable used in
the continuous aggregate query.

Also add user_view_query definition in continuous_agg catalog table.
2019-04-26 13:08:00 -04:00
Joshua Lockerman
0737b370a3 Add the actual bgw job for continuous aggregates
This commit adds the actual background worker job that runs the continuous
aggregate automatically. This job gets created when the continuous aggregate is
created and is deleted when the aggregate is dropped. By default this job will
attempt to run every two bucket widths, and attempts to materialize up to two
bucket widths behind the end of the table.
2019-04-26 13:08:00 -04:00
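
For context, a minimal continuous aggregate of this era, after which the job exists with the defaults described above (table and column names illustrative):

```sql
-- Creating the view registers the background job; dropping the view removes it.
-- With a 1 hour bucket, the job runs roughly every 2 hours and materializes
-- up to 2 hours behind the end of the table.
CREATE VIEW conditions_hourly
    WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       avg(temperature) AS avg_temp
FROM conditions
GROUP BY bucket;
```
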
David Kohn
f17aeea374 Initial cont agg INSERT/materialization support
This commit adds initial support for the continuous aggregate materialization
and INSERT invalidations.

INSERT path:
  On INSERT, DELETE and UPDATE we log the [min, max] time range that may be
  invalidated (that is, newly inserted, updated, or deleted) to
  _timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log
  will be used to re-materialize these ranges, to ensure that the aggregate
  is up-to-date. Currently these invalidations are recorded by a trigger,
  _timescaledb_internal.continuous_agg_invalidation_trigger, which should be
  added to the hypertable when the continuous aggregate is created. This trigger
  stores a cache of min/max values per-hypertable, and on transaction commit
  writes them to the log, if needed. At the moment, we consider them to always
  be needed, unless we're in ReadCommitted mode or weaker and the min
  invalidated value is greater than the hypertable's invalidation threshold
  (found in _timescaledb_catalog.continuous_aggs_invalidation_threshold).

Materialization path:
  Materialization currently happens in multiple phases: in phase 1 we determine
  the timestamp at which we will end the new set of materializations, then we
  update the hypertable's invalidation threshold to that point, and finally we
  read the current invalidations and materialize any invalidated rows plus the
  new range between the continuous aggregate's completed threshold (found in
  _timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's
  invalidation threshold. After all of this is done we update the completed
  threshold to the invalidation threshold. The portion of this protocol from
  after the invalidations are read until the completed threshold is written
  (that is, actually materializing, and writing the completion threshold) is
  included with this commit, with the remainder to follow in subsequent ones.
  One important caveat: since the thresholds are exclusive, we invalidate
  all values _less_ than the invalidation threshold, and since we store time
  values as int64 internally, we can never determine whether the row at
  PG_INT64_MAX is invalidated. To avoid this problem, we never materialize the
  time bucket containing PG_INT64_MAX.
2019-04-26 13:08:00 -04:00
gayyappan
2dbc28df82 Create base infrastructure for continuous aggs
This PR adds a catalog table for storing metadata about
continuous aggregates. It also adds code for creating the
materialization hypertable and 2 views that are used by the
continuous aggregate system:

1) The user view - This is the actual view queried by the end user.
   It is a query on top of the materialized hypertable and is
   responsible for finalizing and combining partials in a manner
   that returns to the user the data as defined by the original
   user-defined view.
2) The partial view - This queries the raw table and returns
   columns as defined in the materialized table. It will be used
   by the materializer to calculate the data that will be inserted
   into the materialization table. Note that the data here is the
   partial state of any aggregates.
2019-04-26 13:08:00 -04:00
Joshua Lockerman
1e486ef2a4 Fix ts_chunk_for_tuple performance
ts_chunk_for_tuple should use the chunk cache.
ts_chunk_for_tuple should be marked stable.
These fixes markedly improve performance.
2019-04-19 12:46:36 -04:00
David Kohn
35a1e357d8 Add functions for turning restoring on/off and setting license key
These functions improve usability and take all the proper steps to
set restoring on/off (and stop/start background workers in the process)
and to set the license key via a function rather than a GUC modification.
2019-04-18 11:59:31 -04:00
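
Assuming these are the timescaledb_pre_restore/timescaledb_post_restore pair and a set_license_key function (names inferred from the message, not stated in it), a restore would be wrapped like this:

```sql
SELECT timescaledb_pre_restore();   -- turn restoring on, stop background workers
-- run pg_restore here
SELECT timescaledb_post_restore();  -- turn restoring off, restart workers

-- Set the license key via a function instead of editing the GUC directly.
SELECT set_license_key('Community');
```
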
Sven Klemm
7961fc77e9 Rename installation_metadata to telemetry_metadata 2019-04-15 21:44:10 +02:00
Matvey Arye
672c41755f Rename files to partialize_finalize
It's better to have more concrete names than just util_aggfns.

Also add TSDLLEXPORT where appropriate for Windows.
2019-04-12 12:12:17 -04:00
Matvey Arye
d7b6ad239b Add support for FINALFUNC_EXTRA
This PR adds support for finalizing aggregates with FINALFUNC_EXTRA. To
do this, we need to pass NULLs corresponding to all of the aggregate
parameters to the ffunc as arguments following the partial state value.
These arguments need to have the correct concrete types.

For polymorphic aggregates, the types cannot be derived from the catalog
but need to be somehow conveyed to the finalize_agg. Two designs were
considered:

1) Encode the type names as part of the partial state (bytea)
2) Pass down the arguments as parameters to the finalize_agg

In the end (2) was picked for the simple reason that (1) would have
increased the size of each partial, sometimes considerably (esp. for small
partial values).

The types are passed down as names not OIDs because in the continuous
agg case using OIDs is not safe for backup/restore and in the clustering
case the datanodes may not have the same type OIDs either.
2019-04-12 12:12:17 -04:00
gayyappan
b45343b3cc Add ability to work with aggregate partials
The ability to get aggregate partials instead of the final state
is important for both continuous aggregation and clustering.

This commit adds the ability to work with aggregate partials.
Namely, a function called _timescaledb_internal.partialize_agg
can now wrap an aggregate and return the partial results as a bytea.

The _timescaledb_internal.finalize_agg aggregate allows you to combine
and finalize partials.

The partialize_agg function works as a marker in the planner to force
the planner to return partial results.

Unfortunately, we could not get the planner to modify the plan directly
to aggregate partials. Instead, finalize_agg is a real aggregate
that performs aggregation on the partial state. Note that it is not
yet parallel.

Aggregates that use FINALFUNC_EXTRA are currently not supported.

Co-authored-by: gayyappan <gayathri@timescale.com>
Co-authored-by: David Kohn <david@timescale.com>
2019-04-12 12:12:17 -04:00
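
A sketch of the capture side under an assumed metrics table; partialize_agg wraps an aggregate call and yields its partial state as a bytea, which finalize_agg can later combine and finalize (finalize_agg's full argument list also carries the aggregate's name and input-type information, so it is not guessed at here):

```sql
-- Partial aggregation state per bucket, as a bytea, instead of the final value.
SELECT time_bucket('1 hour', time) AS bucket,
       _timescaledb_internal.partialize_agg(avg(temperature)) AS avg_partial
FROM conditions
GROUP BY bucket;
```
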
Dmitry Simonenko
4daeb06eee Track hypertables used during process utility hook execution
This patch does the refactoring necessary to support execution of DDL
commands on remote servers.

Basically it extends the cross-module API with ddl_command_start,
ddl_command_end and sql_drop functions.

A hypertables_list variable was added to ProcessUtilityArg. It is used
to keep a list of the hypertables found during Utility/DDL statement
parsing. This information, together with information gathered from other
hook functions, will be used to distinguish distributed hypertables
and forward DDL commands to any remote servers associated with
them.
2019-03-29 13:04:18 +03:00
Sven Klemm
89cb73318d Add support for window functions to gapfill
This patch adds full support for window functions to gapfill queries.
The targetlist for the gapfill node is built from the final targetlist
and pushed down to the aggregation node. locf and interpolate function
calls will be toplevel function calls in the targetlist of the gapfill node.
This patch changes the gapfill code to no longer remove the marker
function calls from the plans, to allow PostgreSQL to properly identify
subexpressions in the targetlist.
2019-03-26 05:14:16 +01:00
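
A sketch of the kind of query this enables, with an illustrative metrics table; the window aggregate runs over the gapfilled series while locf stays a top-level call:

```sql
SELECT time_bucket_gapfill('5 minutes', time) AS bucket,
       locf(avg(value)) AS value,
       -- window function over the gapfilled, aggregated series
       sum(avg(value)) OVER (ORDER BY time_bucket_gapfill('5 minutes', time))
         AS running_sum
FROM metrics
WHERE time >= '2019-03-01' AND time < '2019-03-02'
GROUP BY bucket;
```
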
Sven Klemm
38483358d0 Release 1.2.2 2019-03-14 14:32:27 +01:00
Joshua Lockerman
905cd4becc Add function to determine the chunk for a given row 2019-03-11 16:29:50 -04:00
Sven Klemm
33ef1de542 Add treat_null_as_missing option to locf
When doing a gapfill query with multiple columns that may contain
NULLs, it is not trivial to remove NULL values from individual columns
with a WHERE clause. This new locf option allows those NULL values
to be ignored in gapfill queries with locf.

We drop the old locf function because we don't want two locf functions.
Unfortunately this means any views using locf have to be dropped.
2019-02-16 00:09:38 +01:00
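
A sketch with an illustrative metrics table whose value column contains NULLs:

```sql
-- Without treat_null_as_missing a stored NULL is carried forward as NULL;
-- with it, the last non-NULL value is carried forward instead.
SELECT time_bucket_gapfill('1 hour', time) AS bucket,
       locf(avg(value), treat_null_as_missing => true) AS value
FROM metrics
WHERE time >= '2019-02-01' AND time < '2019-02-02'
GROUP BY bucket;
```
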
Joshua Lockerman
6d9ffe5c7d Release 1.2.1 2019-02-08 19:11:09 -05:00
Joshua Lockerman
4295c04caf Release 1.2.0 2019-01-28 20:47:04 -05:00
Sven Klemm
fd8a5197c8 Make time_bucket_gapfill start and finish optional
Make time_bucket_gapfill start and finish optional; this is in
preparation for deducing them from the WHERE clause. We make this
optional now so as not to introduce a breaking change later. This also
only allows simple expressions for bucket_width, start and
finish because only those can be evaluated safely in gapfill_begin.
2019-01-28 19:07:34 +01:00
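
Both call forms below are now accepted (table illustrative); until deduction from the WHERE clause lands, callers should still expect to supply start and finish:

```sql
-- Full form: bucket_width, time, start, finish.
SELECT time_bucket_gapfill('1 day', time, '2019-01-01', '2019-02-01') AS day,
       avg(value)
FROM metrics
WHERE time >= '2019-01-01' AND time < '2019-02-01'
GROUP BY day;

-- Start and finish may now be omitted syntactically, in preparation for
-- deriving them from the WHERE clause.
SELECT time_bucket_gapfill('1 day', time) AS day,
       avg(value)
FROM metrics
WHERE time >= '2019-01-01' AND time < '2019-02-01'
GROUP BY day;
```
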
Joshua Lockerman
88c7149c2c Fix issues with non-dev versions when generating the update scripts
Without this, we generate multiple rules for the latest script
2019-01-28 10:04:59 -05:00
David Kohn
cf67ddd9b0 Add informational views for policies
Add views so that users can see the parameters for policies they have created,
and a separate view so that they can see the policies that have been created and scheduled on hypertables.
2019-01-25 13:51:52 -05:00
David Kohn
73d3a14665 Rename alter_policy_schedule & main_table for better UI
Rename alter_policy_schedule to alter_job_schedule for consistency with the job_id argument passed in.
Also rename main_table to hypertable in all of the policy-related functions, as they must deal with
hypertables that have already been created.
2019-01-25 13:51:52 -05:00
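
Assuming the renamed signature, rescheduling a policy job might look like this (job id illustrative):

```sql
-- Parameter name assumed from the rename described above.
SELECT alter_job_schedule(1001, schedule_interval => INTERVAL '1 day');
```
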
Sven Klemm
fa61613440 Change time_bucket_gapfill argument names
time_bucket_gapfill used end as an argument name, which is a SQL keyword
and has to be quoted when used. This changes the argument names from
start/end to start/finish.
2019-01-25 18:38:55 +01:00
niksa
319b79c8ec Making chunks_in function internal
This function needs chunk IDs as input. Since chunk IDs are
TimescaleDB-internal metadata, it feels more natural to make this function internal.
2019-01-23 10:04:06 +01:00
niksa
c77f4ab1b3 Explicit chunk exclusion
In some cases the user might already know which chunks need to be scanned to answer
a particular query. Using the `chunks_in` function we can skip calculating the chunks
involved in a particular query, which should result in better performance as well.
A simple example:

`SELECT * FROM hypertable WHERE chunks_in(hypertable, ARRAY[1,2])`
2019-01-19 00:02:01 +01:00
Joshua Lockerman
fdaa7173fb Update telemetry with prettier os info
The info obtained from uname is difficult to work with, so read the OS
name from /etc/os-release if it's available.
2019-01-18 10:23:01 -05:00
Sven Klemm
f89fd07c5b Remove year from SQL file license text
This changes the license text for SQL files to be identical
with the license text for C files.
2019-01-13 23:30:22 +01:00
Joshua Lockerman
65894f08cf Add view displaying info about the current license
Currently the view displays the current edition, expiry date, and
whether the license is expired. We're not displaying the license key
itself in the view as it can get rather long; it can be read via SHOW.
We also do not display the license's ID since that is for internal use.
2019-01-10 17:29:59 -05:00
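
Reading it back, hedged on the view name and columns:

```sql
-- View and column names assumed from the description above.
SELECT edition, expired, expiration_time
FROM timescaledb_information.license;

-- The key itself stays readable via:
SHOW timescaledb.license_key;
```
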
Joshua Lockerman
47b5b7d553 Log which chunks are dropped by background workers
We don't want to do this silently, so that users are
able to debug where their chunks went.
2019-01-10 13:53:38 -05:00
Joshua Lockerman
27cd0fa27d Fix speeling 2019-01-09 17:16:17 -05:00
Joshua Lockerman
fafc98d343 Fix warnings for TSL licenses
So as to reduce the amount of logspam users receive, restrict printing license info
to the following:

  1. On CREATE EXTENSION
       a. in the notice, print the license expiration time, if any
       b. if the license is expired, additionally print that
       c. else if the license will expire within a week, print an additional warning
  2. On the first usage of a TSL function, print if the license is expired or will
     expire within a week
2019-01-08 19:35:50 -05:00
Joshua Lockerman
28265dcc1f Use a fixed file for the latest dev version
When developing a feature across releases, timescaledb updates can get
stuck in the wrong update script, breaking the update process. To avoid
this, we introduce a new file "latest-dev.sql" in which all new updates
should go. During a release, this file gets renamed to
"<previous version>--<current version>.sql" ensuring that all new
updates are released and all updates in other branches will
automatically get redirected to the next update script.
2019-01-07 14:03:05 -05:00
Joshua Lockerman
2a284fc84e Move 1.2.0 updates to the correct file 2019-01-02 15:43:48 -05:00
Sven Klemm
6125111dfa Mark gapfill functions parallel safe
Gapfill functions need to be marked parallel safe to not prevent
parallelism. The gapfill node itself is still parallel restricted,
but child nodes can be parallel.
2019-01-02 15:43:48 -05:00
Joshua Lockerman
4e1e15f079 Add reorder command
New cluster-like command which writes to a new index then swaps,
much like is done for the data table, and only acquires
exclusive locks for said swap. This trades off disk usage for
lower contention: we hold locks for a much shorter period of time,
allowing reads to work concurrently, but we have both the old
and new versions of the table existing at once, approximately
doubling storage usage while reorder is running.

Currently only works on chunks.
2019-01-02 15:43:48 -05:00
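
A sketch of manual use, assuming a reorder_chunk entry point with illustrative chunk and index names:

```sql
-- Rewrites the chunk in index order; exclusive locks are held only for the
-- final swap, at the cost of roughly double the disk usage while running.
SELECT reorder_chunk('_timescaledb_internal._hyper_1_2_chunk',
                     'conditions_time_idx');
```
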
Amy Tai
9ad73249e1 Move enterprise updates to newest update file 2019-01-02 15:43:48 -05:00
Amy Tai
ef43e52107 Add alter_policy_schedule API function 2019-01-02 15:43:48 -05:00
Sven Klemm
5ba740ed98 Add gapfill query support
This patch adds first-level support for gapfill queries, including
support for LOCF (last observation carried forward) and interpolation, without
requiring a join against `generate_series`. This makes it easier to join
timeseries with different or irregular sampling intervals.
2019-01-02 15:43:48 -05:00
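
A minimal gapfill query of this vintage (table illustrative; at this point the range arguments were still required and named start/end):

```sql
-- locf carries the last observation forward; interpolate draws a line
-- between neighboring points. No join against generate_series is needed.
SELECT time_bucket_gapfill('1 hour', time,
                           '2019-01-01 00:00', '2019-01-02 00:00') AS bucket,
       locf(avg(temperature)) AS temp_locf,
       interpolate(avg(temperature)) AS temp_interp
FROM conditions
WHERE time >= '2019-01-01' AND time < '2019-01-02'
GROUP BY bucket;
```
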
Amy Tai
be7c74cdf3 Add logic for automatic DB maintenance functions
This commit adds logic for manipulating the internal metadata tables used to enable users to schedule automatic drop_chunks and recluster policies. This commit includes:

- SQL for creating the policy tables and chunk stats table
- Catalog code and C code for accessing these three tables programmatically
- Implementation of new user API functions: add_*_policy and remove_*_policy
- Stub scheduler logic for running the policies
2019-01-02 15:43:48 -05:00
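
The user-facing calls, sketched against an illustrative hypertable; the argument lists are assumptions based on the add_*/remove_* naming:

```sql
-- Schedule automatic chunk dropping, then remove the policy again.
SELECT add_drop_chunks_policy('conditions', INTERVAL '3 months');
SELECT remove_drop_chunks_policy('conditions');
```
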
Joshua Lockerman
4ff6ac7b91 Initial Timescale-Licensed-Module and License-Key Implementation
This commit adds support for dynamically loaded submodules to timescaledb,
as well as an initial license-key implementation in the tsl subdirectory.
Dynamically loaded modules allow our users to determine which licenses they
wish to use for their version of timescaledb; if they wish to only use
Apache-licensed code, they do not load the Timescale-licensed submodule. Calls
from the Apache-licensed code into the Timescale-licensed submodule are
handled via dynamically set function pointers; see tsl/src/Readme.module.md for
more details.

This commit also adds code for license keys for the ApacheOnly, Community, and
Enterprise editions. The license key determines which features are enabled,
and controls loading the submodule: when a license key that requires the
submodule is installed, the module is automatically loaded.
Currently the ApacheOnly and Community license keys are hardcoded to be
"ApacheOnly" and "Community" respectively. The first version of the enterprise
license key is described in tsl/src/Readme.module.md.
2019-01-02 15:43:48 -05:00