213 Commits

Author SHA1 Message Date
gayyappan
88f693887a Cleanup index on hypertable catalog table
Reorder schema_name + table_name index. Remove
unnecessary constraint.
2020-07-23 11:08:11 -04:00
Sven Klemm
4e41672161 Remove unused functions
Remove functions that are no longer used due to refactorings.
Removes the following functions:
- hypertable_tuple_match_name
- ts_hypertable_get_all_by_name
- ts_hypertable_has_tuples
2020-06-30 18:05:16 +02:00
Dmitry Simonenko
04bcc949c1 Add checks for read-only transactions
This change ensures that API functions and DDL operations
which modify data respect the read-only transaction state
set by the default_transaction_read_only option.
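
A minimal illustration (not part of the original message), assuming a hypothetical table `conditions`:
`SET default_transaction_read_only TO on;`
`SELECT create_hypertable('conditions', 'time'); -- now rejected because the transaction is read-only`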
2020-06-22 17:03:04 +03:00
Mats Kindahl
5d188267e4 Fix crash with create_distributed_hypertable
Function `ts_hypertable_create_from_info` will error if the hypertable
already exists unless the if-not-exists flag is set.  If we reach
this point, either the if-not-exists flag was set or the hypertable did not
exist and was created above.

In `ts_hypertable_create_from_info`, a call to
`ts_dimension_info_validate` with `space_dim_info` will be made if (and
only if) the table did not exist. The function not only validates
the dimension, it also sets the `dimension_id` field.

If the table already existed, `created` will be false and the
`dimension_id` will be set to the invalid OID, which means that
`ts_hypertable_check_partitioning` will crash since it expects a
proper OID to be passed.

This commit fixes that by checking if the hypertable exists prior to
calling `ts_hypertable_create_from_info` in
`ts_hypertable_create_internal` and aborting with an error if the
hypertable already exists.

Fixes #1987
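
A sketch of the affected call pattern, with a hypothetical table `disttable` (not from the original message):
`SELECT create_distributed_hypertable('disttable', 'time');`
`SELECT create_distributed_hypertable('disttable', 'time', if_not_exists => true); -- second call previously hit the crash path described above`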
2020-06-17 08:38:31 +02:00
Sven Klemm
c90397fd6a Remove support for PG9.6 and PG10
This patch removes code support for PG9.6 and PG10. In addition to
removing PG96 and PG10 macros the following changes are done:

- remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
- remove PG_VERSION_SUPPORTS_MULTINODE
2020-06-02 23:48:35 +02:00
Brian Rowe
c7478f04b3 Change default prefix for distributed tables
This change modifies the default 'associated_table_prefix' string
for distributed hypertables from "_hyper_n" to "_dist_hyper_n" where
n is the hypertable id of the table.  This change makes it so that
when the backend hypertable for a distributed hypertable is created
on a data node, it won't have an 'associated_table_prefix' that may
conflict with locally created tables.

As part of this change, chunk names for distributed tables no longer
append the string "dist" as this is now part of the prefix for those
chunk names (i.e. instead of "_hyper_1_1_dist_chunk" we will now
have "_dist_hyper_1_1_chunk").
2020-05-29 12:12:41 -07:00
Ruslan Fomkin
c44a202576 Implement altering replication factor
Implements SQL function set_replication_factor, which changes
replication factor of a distributed hypertable. The change of the
replication factor doesn't affect existing chunks. Newly created
chunks are replicated according to new replication factor.
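
For example (a sketch, assuming a distributed hypertable named `disttable`):
`SELECT set_replication_factor('disttable', 2);`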
2020-05-27 17:31:09 +02:00
Brian Rowe
0017208368 Test dimension add on distributed hypertables
Prior to this change attempting to add a dimension to a distributed
hypertable which currently or previously contained data would fail
with an opaque error.  This change will properly test distributed
hypertables when adding dimensions and will print appropriate errors.
2020-05-27 17:31:09 +02:00
Mats Kindahl
8d28fad66d Error on reference from distributed hypertable
It is not possible to properly reference another table from a
distributed hypertable since this would require replication of the
referenced table.

This commit adds a warning message when a distributed hypertable attempts
to reference any other table using a foreign key.
2020-05-27 17:31:09 +02:00
Dmitry Simonenko
11ef10332e Add number of compressed hypertables to stat
This change includes telemetry fixes which extend HypertablesStat
with num_hypertables_compressed. It also updates the way the
number of regular hypertables is calculated: they are now counted as
non-compressed hypertables not related to continuous aggregates.
2020-05-27 17:31:09 +02:00
Ruslan Fomkin
ef823a3060 Remove unnecessary check from distributed DDL
Since a NULL value for the replication factor in SQL DDL now corresponds
to HYPERTABLE_REGULAR, which is different from
HYPERTABLE_DISTRIBUTED_MEMBER, there is no need to check for a non-NULL
value; comparing with HYPERTABLE_DISTRIBUTED_MEMBER is enough.
2020-05-27 17:31:09 +02:00
Erik Nordström
5933c1785e Avoid attaching data nodes without permissions
This change will make `create_distributed_hypertable` attach only
those data nodes that the creating user has USAGE permission on
instead of trying to attach all data nodes (even when permissions do
not allow it). This prevents an ERROR when a user tries to create a
distributed hypertable and lacks USAGE on one or more data nodes. This
new behavior *only* applies when no explicit set of data nodes is
specified. When an explicit set is specified the behavior remains
unchanged, i.e., USAGE is required on all the explicitly specified
data nodes.

Note that, with distributed hypertables, USAGE permission on a data
node (foreign server) only governs the ability to attach it to a
hypertable. This is analogous to the regular behavior of foreign
servers where USAGE governs who can create foreign tables using the
server. The actual usage of the server once associated with a table is
not affected as that would break the table if permissions are revoked.

The new behavior introduced with this change makes it simpler to use
`create_distributed_hypertable` in a multi-user and multi-permission
environment where users have different permissions on data nodes
(DNs). For instance, imagine user A being allowed to attach DN1 and
DN2, while user B can attach DN2 and DN3. Without this change,
`create_distributed_hypertable` will always fail since it tries to
attach DN1, DN2, and DN3 irrespective of the user that calls the
function. Even worse, if only DN1 and DN2 existed initially, user A
would be able to create distributed hypertables without errors, but,
as soon as DN3 is added, `create_distributed_hypertable` would start
to fail for user A.

The only way to avoid the above described errors when creating
distributed hypertables is for users to pass an explicit set of data
nodes that only includes the data nodes they have USAGE on. If a
user is forced to do that, the result would in any case be the same as
that introduced with this change. Unfortunately, users currently have
no way of knowing which data nodes they have USAGE on unless they
query the PostgreSQL catalogs. In many cases, a user might not even
care about the data nodes they are using if they aren't DBAs
themselves.

To summarize the new behavior:

* If public (everyone) has USAGE on all data nodes, there is no
  change in behavior.
* If a user has USAGE on a subset of data nodes, it will by default
  attach only those data nodes since it cannot attach the other ones
  anyway.
* If the user specifies an explicit set of data nodes to attach, all
  of those nodes still require USAGE permissions. (Behavior
  unchanged.)
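
A sketch of the unchanged explicit-list case, using hypothetical node and table names (parameter names follow the public API and may differ from this commit):
`SELECT create_distributed_hypertable('conditions', 'time', 'device', data_nodes => ARRAY['dn2', 'dn3']);`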
2020-05-27 17:31:09 +02:00
Dmitry Simonenko
0a31e72540 Block tablespace api for a distributed hypertable 2020-05-27 17:31:09 +02:00
Dmitry Simonenko
296d134a1e Add telemetry to track distributed databases
This change extends the telemetry report and adds a new 'distributed_db'
section which includes the following keys: 'distributed_member',
'data_nodes_count' and 'distributed_hypertables_count'.
2020-05-27 17:31:09 +02:00
Erik Nordström
7a49296272 Distribute dimension-related DDL commands
This change ensures that all dimension-related DDL commands on
hypertables are distributed to its data nodes. Most importantly,
`add_dimension()` now applies across all data nodes of a hypertable
when issued on the access node. This ensures the dimension
configuration is the same on all nodes of a particular hypertable.

While the configuration of chunk time interval and number of
partitions do not matter that much for data nodes (since the access
node takes care of sizing chunks), functions like
`set_chunk_time_interval()` and `set_number_partitions()` are
distributed as well. This ensures that dimension-related configuration
changes apply across all nodes.
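
For example, issued on the access node against a hypothetical distributed hypertable `disttable`, these now apply on all of its data nodes:
`SELECT add_dimension('disttable', 'device', number_partitions => 3);`
`SELECT set_chunk_time_interval('disttable', INTERVAL '1 day');`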
2020-05-27 17:31:09 +02:00
niksa
d97c32d0c7 Support distributed drop_chunks
Running `drop_chunks` on a distributed hypertable should remove chunks
from all its data nodes. To make it work we send the same SQL command
to all involved data nodes.
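
For example (a sketch; the drop_chunks argument order has varied across TimescaleDB versions), run on the access node against a hypothetical `disttable`:
`SELECT drop_chunks('disttable', older_than => INTERVAL '1 month');`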
2020-05-27 17:31:09 +02:00
Erik Nordström
7b64bf20f5 Use num data nodes as default num partitions
For convenience, this adds the option to create a distributed
hypertable without specifying the number of partitions in the space
dimension even in the case when no data nodes are specified
(defaulting to the data nodes added to the database).
2020-05-27 17:31:09 +02:00
Erik Nordström
5309cd6c5f Repartition hypertables when attaching data node
Distributed hypertables are now repartitioned when attaching new data
nodes and the current number of partitions (slices) in the first closed
(space) dimension is less than the number of data nodes. Increasing
the number of partitions is necessary to make use of a newly attached
data node. However, repartitioning is optional and can be avoided via
a boolean parameter in `attach_server()`.

In addition to the above repartitioning, this change also adds
informational messages to `create_hypertable` and
`set_number_partitions` to raise awareness of situations when the
number of partitions in the space dimensions is lower than the number
of attached data nodes.
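
A sketch of attaching a node without repartitioning; the argument order and the `repartition` parameter name are assumed here, and the node/table names are hypothetical:
`SELECT attach_server('disttable', 'dn4', repartition => false);`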
2020-05-27 17:31:09 +02:00
Brian Rowe
79fb46456f Rename server to data node
The timescale clustering code so far has been written referring to the
remote databases as 'servers'.  This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest.  In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database.  Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.

As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes.  This change has updated the code to rename
those instances.
2020-05-27 17:31:09 +02:00
Brian Rowe
dd3847a7e0 Rename files in preparation for large refactor
This change includes only the rename changes required by the renaming
of server to data node across the clustering codebase.  This change
is being committed separately from the bulk of the rename changes to
prevent git from losing the file history of renamed files (merging the
rename with extensive code modifications resulted in git treating some
of the file moves as a file delete and new file creation).
2020-05-27 17:31:09 +02:00
Erik Nordström
3943a758a7 Refactor test of distributed hypertables
This refactors the `hypertable_distributed` test to make better use of
the `remote_exec` utility function. The refactoring also makes sure we
actually use space partitioning when testing distributed hypertables.
2020-05-27 17:31:09 +02:00
Erik Nordström
9a52a2819f Make chunk placement consistent across data nodes
Chunks are placed across data nodes based on the ordinal of the slice
in the first space dimension, if such a dimension exists. For
instance, if a chunk belongs to the second slice in the space
dimension, this ordinal number will be used modulo the number of
data nodes to find the data node to place the chunk on.

However, the ordinal is calculated based on the existing slices in the
dimension, and, because slices are created lazily, the ordinal of a
specific slice might vary until all slices are created in the space
dimension. This has the result that chunks aren't consistently placed
on data nodes based on their space partition, prohibiting some
push-down optimizations that rely on consistent partitioning.

This change ensures the ordinal of a space slice is calculated as if
all slices in the dimension are pre-existing. This might still lead to
inconsistencies during times of repartitioning, but fixes issues that
occur initially when no slices exist.
2020-05-27 17:31:09 +02:00
niksa
0da34e840e Fix server detach/delete corner cases
Prevent server delete if the server contains data, unless the user
specifies `force => true`. In case the server is the only data
replica, we don't allow delete/detach unless table/chunks are dropped.
The idea is to have the same semantics for delete as for detach since
delete actually calls detach.

We also try to update pg_foreign_table when we delete a server if there
is another server containing the same chunk.

An internal function is added to enable updating a foreign table's server,
which might be useful in some cases since the foreign table's server is
considered the default server for that particular chunk.

Since this command needs to work even if the server we're trying to
remove is non-responsive, we're not removing any data on the remote
data node.
2020-05-27 17:31:09 +02:00
Dmitry Simonenko
b2fde83d2e Block DDL operations on data nodes
This change adds support for blocking DDL operations on data nodes
when not executed via the access node.

Specifically, DDL operations can only be executed on data nodes if the
request came on a connection from the access node or DDL operations
have been explicitly allowed by setting
`timescaledb.enable_client_ddl_on_data_servers=true`.
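
For example, to explicitly allow client DDL directly on a data node (GUC name taken from this message):
`SET timescaledb.enable_client_ddl_on_data_servers = true;`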
2020-05-27 17:31:09 +02:00
niksa
2fd99c6f4b Block new chunks on data nodes
This functionality enables users to block or allow creation of new
chunks on a data node for one or more hypertables. Use cases for this
include the ability to block new chunks when a data node is running
low on disk space or to affect chunk distribution across data nodes.

Sometimes blocking data nodes for new chunks can make a hypertable
under-replicated. For that case an additional argument `force => true`
can be supplied to force blocking new chunks.

Here are some examples.

Block for a specific hypertable:
`SELECT * FROM block_new_chunks_on_server('server_1', 'disttable');`

Block for all hypertables on the server:
`SELECT * FROM block_new_chunks_on_server('server_1', force => true);`

Unblock:
`SELECT * FROM allow_new_chunks_on_server('server_1', true);`

This change adds the `force` argument to `detach_server` as well.  If
detaching or blocking new chunks will make a hypertable
under-replicated then `force => true` needs to be used.
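
A sketch of the detach case (exact signature assumed; names hypothetical):
`SELECT detach_server('server_1', 'disttable', force => true);`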
2020-05-27 17:31:09 +02:00
Dmitry Simonenko
f6a829669a Distinguish data node hypertables from regular ones
Hypertables created on a data node by an access node (via
`create_distributed_hypertable()`) will now have their
`replication_factor` set to -1. This makes it possible to distinguish
regular data node hypertables from those that are part of a larger
distributed hypertable.

This functionality will be needed for decision making based on the
connection type, for example allowing or blocking DDL commands on a data
node.
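
For illustration, the marker can be inspected on a data node via the catalog (a sketch; column layout may differ across versions):
`SELECT table_name, replication_factor FROM _timescaledb_catalog.hypertable;`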
2020-05-27 17:31:09 +02:00
Erik Nordström
2f43408eb5 Push down partitionwise aggregates to servers
This change adds support for pushing down FULL partitionwise
aggregates to remote servers. Partial partitionwise aggregates cannot
yet be pushed down since that requires a way to tell the remote server
to compute a specific partial.

NOTE: Push-down aggregates are a PG11 only feature as it builds on top
of partitionwise aggregate push-down only available in
PG11. Therefore, a number of query-running tests now only run on PG11,
since these have different output on PG10.

To make push-downs work on a per-server basis, hypertables are now
first expanded into chunk append plans. This is useful to let the
planner do chunk exclusion and cost estimation of individual
chunks. The append path is then converted into a per-server plan by
grouping chunks by servers, with reduced cost because there is only
one startup cost per server instead of per chunk.

Future optimizations might consider avoiding the original per-chunk
plan computation, in order to increase planning speed.

To make use of existing PostgreSQL planning code for partitionwise
aggregates, we need to create range table entries for the server
relations even though these aren't "real" tables in the system. This
is because the planner code expects those entries to be present for
any "partitions" it is planning aggregates on (note that in
"declarative partitioning" all partitions are system tables). For this
purpose, we create a range table entry for each server that points to
the root hypertable relation. This is in a sense "correct" since each
per-server relation is an identical (albeit partial) hypertable on the
remote server. The upside of pointing the server rel's range table
entry to the root hypertable is that the planner can make use of the
indexes on the hypertable for planning purposes. This leads to more
efficient remote queries when, e.g., ordering is important (i.e., we
get push down sorts for free).
2020-05-27 17:31:09 +02:00
Brian Rowe
59e3d7f1bd Add create_distributed_hypertable command
This change adds a variant of the create_hypertable command that will
ensure the created table is distributed.
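
For example (a sketch with hypothetical column and table names):
`SELECT create_distributed_hypertable('conditions', 'time', 'device');`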
2020-05-27 17:31:09 +02:00
Dmitry Simonenko
11aab55094 Add support for basic distributed DDL
This is a straightforward implementation which allows executing a
limited set of DDL commands on distributed hypertables.
2020-05-27 17:31:09 +02:00
Erik Nordström
6ba70029e3 Cleanup chunk servers when dropping dependencies
Chunk server mappings are now cleaned up when dropping chunk tables
and foreign servers. In particular, this works even when such objects
are removed as a result of cascading deletions of other objects.

Some refactoring has been done to the event trigger handling code in
addition to adding support for new event objects.
2020-05-27 17:31:09 +02:00
Brian Rowe
d355c00961 Distribute hypertable creation to data nodes
When creating a hypertable with backend servers (replication_factor >
0), this will deparse the table structure and send the commands to
create it on all of the backend nodes.  It will then send a command
to create the hypertable on each backend.

There is still some more work needed to handle assigning hypertables
with multiple space dimensions and supporting partitioning functions.
2020-05-27 17:31:09 +02:00
Erik Nordström
33f1601e6f Handle constraints, triggers, and indexes on distributed hypertables
In distributed hypertables, chunks are foreign tables and such tables
do not support (or should not support) indexes, certain constraints,
and triggers. Therefore, such objects should not recurse to foreign
table chunks nor add mappings in the `chunk_constraint` or
`chunk_index` tables.

This change ensures that we properly filter out the indexes, triggers,
and constraints that should not recurse to chunks on distributed
hypertables.
2020-05-27 17:31:09 +02:00
Erik Nordström
596be8cda1 Add mappings table for remote chunks
A frontend node will now maintain mappings from a local chunk to the
corresponding remote chunks in a `chunk_server` table.

The frontend creates local chunks as foreign tables and adds entries
to `chunk_server` for each chunk it creates on a remote data node.

Currently, the creation of remote chunks is not implemented, so a
dummy chunk_id for the remote chunk will be added instead for testing
purposes.
2020-05-27 17:31:09 +02:00
Erik Nordström
ece582d458 Add mappings table for remote hypertables
In a multi-node (clustering) setup, TimescaleDB needs to track which
remote servers have data for a particular distributed hypertable. It
also needs to know which servers to place new chunks on and to use in
queries against a distributed hypertable.

A new metadata table, `hypertable_server`, is added to map a local
hypertable ID to a hypertable ID on a remote server. We require that
the remote hypertable has the same schema and name as the local
hypertable.

When a local server is removed (using `DROP SERVER` or our
`delete_server()`), all remote hypertable mappings for that server
should also be removed.
2020-05-27 17:31:09 +02:00
Erik Nordström
ae587c9964 Add API function for explicit chunk creation
This adds an internal API function to create a chunk using explicit
constraints (dimension slices). A function to export a chunk in a
format consistent with the chunk creation function is also added.

The chunk export/create functions are needed for distributed
hypertables so that an access node can create chunks on data nodes
according to its own (global) partitioning configuration.
2020-05-27 17:31:09 +02:00
Erik Nordström
28e9a443b3 Improve handling of "dropped" chunks
The internal chunk API is updated to avoid returning `Chunk` objects
that are marked `dropped=true` along with some refactoring, hardening,
and cleanup of the internal chunk APIs. In particular, apart from
being returned in a dropped state, chunks could also be returned in a
partial state (without all fields set, partial constraints,
etc.). None of this is allowed as of this change. Further, lock
handling was unclear when joining chunk metadata from different
catalog tables. This is made clear by having chunks built within
nested scan loops so that proper locks are held when joining in
additional metadata (such as constraints).

This change also fixes issues with dropped chunks that caused chunk
metadata to be processed many times instead of just once, leading to
potential bugs or bad performance.

In particular, since the introduction of the “dropped” flag, chunk
metadata can exist in two states: 1. `dropped=false`
2. `dropped=true`. When dropping chunks (e.g., via `drop_chunks`,
`DROP TABLE <chunk>`, or `DROP TABLE <hypertable>`) there are also two
modes of dropping: 1. DELETE row and 2. UPDATE row and SET
dropped=true.

The deletion mode and the current state of the chunk lead to a
cross-product resulting in 4 cases when dropping/deleting a chunk:

1. DELETE row when dropped=false
2. DELETE row when dropped=true
3. UPDATE row when dropped=false
4. UPDATE row when dropped=true

Unfortunately, the code didn't distinguish between these cases. In
particular, case (4) should not be able to happen, but since it did it
led to a recursion loop where an UPDATE created a new tuple that was
then recursed to in the same loop, and so on.

To fix this recursion loop and make the code for dropping chunks less
error prone, a number of assertions have been added, including some
new light-weight scan functions to access chunk information without
building a full-blown chunk.

This change also removes the need to provide the number of constraints
when scanning for chunks. This was really just a hint anyway, but it
is no longer needed since all constraints are now joined in.
2020-04-28 13:49:14 +02:00
Erik Nordström
0e9461251b Silence various compiler warnings
This change fixes various compiler warnings that show up on different
compilers and platforms. In particular, MSVC is sensitive to functions
that do not return a value after throwing an error since it doesn't
realize that the code path is not reachable.
2020-04-27 15:02:18 +02:00
Oleg Smirnov
e7f70e354e Fix ts_hypertable_get_all for compressed tables
When calling show_chunks or drop_chunks without specifying
a particular hypertable, TimescaleDB iterates through all
existing hypertables and builds a list. While doing this
it adds the internal '_compressed_hypertable_*' tables,
which leads to incorrect behaviour of the
ts_chunk_get_chunks_in_time_range function. This fix
filters out the internal compressed tables when scanning
in the ts_hypertable_get_all function.
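
For reference, the affected call pattern is the variant without a hypertable argument, e.g. (valid in the 1.x series; later versions require the hypertable):
`SELECT show_chunks();`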
2020-04-15 15:13:59 +02:00
Ruslan Fomkin
ed32d093dc Use table_open/close and PG aggregated directive
Fixes more places to use table_open and table_close, introduced in
PG12. Unifies PG version directives to use the aggregated macro.
2020-04-14 23:12:15 +02:00
Erik Nordström
36af23ec94 Use flags for cache query options
Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy so this change turns
these booleans into one flag parameter.
2020-04-14 23:12:15 +02:00
Joshua Lockerman
949b88ef2e Initial support for PostgreSQL 12
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:

- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
  PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
  planner changes in PostgreSQL (gapfill, first/last, etc.)

Before PostgreSQL 12, planning was organized roughly as
follows:

 1. construct and add `RelOptInfo` for base and appendrels
 2. add restrict info, joins, etc.
 3. perform the actual planning with `make_one_rel`

For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.

However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:

 1. construct and add RelOptInfo for base rels only
 2. add restrict info, joins, etc.
 3. expand appendrels, removing irrelevant declarative partitions
 4. perform the actual planning with make_one_rel

Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.

The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` entails
a number of annoyances:

 1. this hook is called after the sizes of tables are calculated, so we
    must recalculate the sizes of all hypertables, as they will not
    have taken the chunk sizes into account
 2. the table upon which the hook is called will have its paths planned
    under the assumption it has no inheritance children, so if it's a
    hypertable we have to replan its paths

Unfortunately, the code for doing these steps is static, so we need to
copy it into our own codebase, instead of just using PostgreSQL's.

In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:

- In stage 1, the statement is planned as if it was a `SELECT` and all
  leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table,
  discovered in stage 1, directly, not part of an Append.

Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point we wish
to use it. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
2020-04-14 23:12:15 +02:00
gayyappan
91fe723d3a Drop chunks from materialized hypertables
Add support for dropping chunks from materialized
hypertables. drop_chunks_policy can now be set up
for materialized hypertables.
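
A sketch of setting up such a policy (the target name is hypothetical and the 1.x-era API is assumed):
`SELECT add_drop_chunks_policy('materialization_hypertable', INTERVAL '6 months');`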
2020-02-26 11:50:58 -05:00
gayyappan
2702140fa3 Cannot add dimension if table has empty chunks
add_dimension should fail when a table has no
data but still has empty chunks.
Fixes #1623
2020-02-10 10:47:23 -05:00
Ruslan Fomkin
4dc0693d1f Unify error message if hypertable not found
Refactors multiple implementations of finding hypertables in cache
and failing with different error messages if not found. The
implementations are replaced with calling functions, which encapsulate
a single error message. This provides the unified error message and
removes the need for copy-paste.
2020-01-29 08:10:27 +01:00
Matvey Arye
2c594ec6f9 Keep catalog rows for some dropped chunks
If a chunk is dropped but it has a continuous aggregate that is
not dropped we want to preserve the chunk catalog row instead of
deleting the row. This is to prevent dangling identifiers in the
materialization hypertable. It also preserves the dimension slice
and chunk constraint rows for the chunk since those will be necessary
when enabling this with multinode and are necessary to recreate the
chunk too. The postgres objects associated with the chunk are all
dropped (table, constraints, indexes).

If data is ever reinserted to the same data region, the chunk is
recreated with the same dimension definitions as before. The postgres
objects are simply recreated.
2019-12-30 09:10:44 -05:00
Matvey Arye
5eb047413b Allow drop_chunks while keeping continuous aggs
Allow dropping raw chunks on the raw hypertable while keeping
the continuous aggregate. This allows for downsampling data
and allows users to save on TCO. We only allow dropping
such data when the dropped data is older than the
`ignore_invalidation_older_than` parameter on all the associated
continuous aggs. This ensures that any modifications to the
region of data which was dropped should never be reflected
in the continuous agg and thus avoids semantic ambiguity
if chunks are dropped but then recreated again due to an
insert.

Before we drop a chunk we need to make sure to process any
continuous aggregate invalidations that were registered on
data inside the chunk. Thus we add an option to materialization
to perform materialization transactionally, to only process
invalidations, and to process invalidations only before a timestamp.

We fix drop_chunks and policy to properly process
`cascade_to_materialization` as a tri-state variable (unknown,
true, false). Existing policy rows should change false to NULL
(unknown) and true stays as true since it was explicitly set.
Remove the form data for bgw_policy_drop_chunk because there
is no good way to represent the tri-state variable in the
form data.

When dropping chunks with cascade_to_materialization = false, all
invalidations on the chunks are processed before dropping the chunk.
If we are so far behind that even the completion threshold is inside
the chunks being dropped, we error. There are 2 reasons that we error:
1) We can't safely process new ranges transactionally without taking
   heavy weight locks and potentially locking the entire system
2) If a completion threshold is that far behind the system probably has
   some serious issues anyway.
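
A sketch of the explicit false case, using the 1.x-era drop_chunks signature and a hypothetical raw hypertable; the parameter name is assumed to be `cascade_to_materializations` (this message uses the singular form):
`SELECT drop_chunks(INTERVAL '3 months', 'conditions', cascade_to_materializations => false);`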
2019-12-30 09:10:44 -05:00
Matvey Arye
08ad7b6612 Add ignore_invalidation_older_than to continuous aggs
We added a timescaledb.ignore_invalidation_older_than parameter for
continuous aggregates. This parameter accepts a time interval (e.g. 1
month). If set, it limits the amount of time for which to process
invalidations. Thus, if
	timescaledb.ignore_invalidation_older_than = '1 month'
then any modifications for data older than 1 month from the current
timestamp at insert time will not cause updates to the continuous
aggregate. This limits the amount of work that a backfill can trigger.
This parameter must be >= 0. A value of 0 means that invalidations are
never processed.

When recording invalidations for the hypertable at insert time, we use
the maximum ignore_invalidation_older_than of any continuous agg attached
to the hypertable as a cutoff for whether to record the invalidation
at all. When materializing a particular continuous agg, we use that
agg's ignore_invalidation_older_than cutoff. However we have to apply
that cutoff relative to the insert time, not the materialization
time to make it easier for users to reason about. Therefore,
we record the insert time as part of the invalidation entry.
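
A sketch of how the parameter might be set on an existing continuous aggregate view in the 1.x series (view name hypothetical):
`ALTER VIEW conditions_summary SET (timescaledb.ignore_invalidation_older_than = '1 month');`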
2019-12-04 15:47:03 -05:00
Matvey Arye
85d35af140 Fix shared tests on OSX
Fix the test runner to work with the OSX version of tests.
2019-11-07 20:29:19 -05:00
Matvey Arye
2cf66fdf44 Fix hypertable model handling
The hypertable model now has nullable fields but was still using
GETSTRUCT. This is unsafe. Switch to using the proper model
access methods.
2019-11-07 20:29:19 -05:00
Matvey Arye
d2db84fd98 Fix windows compilation
Some minor fixes for compilation in Windows.
2019-10-29 19:02:58 -04:00