79 Commits

Author SHA1 Message Date
Dmitry Simonenko
cb2da81bf7 Fix ts_get_now_internal to use transaction time
Issue: #2167
2020-08-31 14:47:10 +03:00
Erik Nordström
c5a202476e Fix timestamp overflow in time_bucket optimization
An optimization for `time_bucket` transforms expressions of the form
`time_bucket(10, time) < 100` to `time < 100 + 10` in order to do
chunk exclusion and make better use of indexes on the time
column. However, since one bucket is added to the timestamp when doing
this transformation, the timestamp can overflow.

While a check for such overflows already exists, it uses `+Infinity`
(INT64_MAX/DT_NOEND) as the upper bound instead of the actual end of
the valid timestamp range. A further complication arises because
TimescaleDB internally converts timestamps to UNIX epoch time, thus
losing a little bit of the valid timestamp range in the process. Dates
are further restricted by the fact that they are internally first
converted to timestamps (thus limited by the timestamp range) and then
converted to UNIX epoch.

This change fixes the overflow issue by only applying the
transformation if the resulting timestamps or dates stay within the
valid (TimescaleDB-specific) ranges.

A test has also been added to show the valid timestamp and date
ranges, both PostgreSQL and TimescaleDB-specific ones.
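
The guard described above can be sketched as follows, in Python rather than the extension's C. This is a minimal illustration, not the actual implementation: `TS_MIN`, `TS_MAX`, and `transform_upper_bound` are hypothetical names, and the bounds are placeholder values rather than TimescaleDB's real limits.

```python
from typing import Optional

# Placeholder bounds for illustration only; the real TimescaleDB-specific
# limits are narrower than the int64 range and are not reproduced here.
TS_MIN = -(2**62)
TS_MAX = 2**62


def transform_upper_bound(value: int, width: int) -> Optional[int]:
    """Return the bound for `time < value + width`, or None to skip the rewrite.

    Python integers do not overflow, so the range check can happen after the
    addition; C code would have to check before adding.
    """
    bound = value + width
    if bound < TS_MIN or bound > TS_MAX:
        return None  # keep the original time_bucket() expression untouched
    return bound
```

Returning `None` stands in for "leave the expression as it is", so a query near the end of the range stays correct and only loses the optimization.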
2020-08-27 19:16:24 +02:00
Erik Nordström
e1c94484cf Add support for infinite timestamps
The internal conversion functions for timestamps didn't account for
timestamps that are infinite (`-Infinity` or `+Infinity`), and they
would therefore generate an error if such timestamps were
encountered. This change adds extra checks to the conversion functions
to allow infinite timestamps.
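
A minimal Python sketch of such a check, assuming that infinite values pass through the conversion unchanged (the message does not say how they are mapped). The function name is hypothetical. `DT_NOBEGIN` and `DT_NOEND` are the int64 minimum and maximum in PostgreSQL.

```python
INT64_MIN = -(2**63)    # PostgreSQL's DT_NOBEGIN, i.e. -Infinity
INT64_MAX = 2**63 - 1   # PostgreSQL's DT_NOEND, i.e. +Infinity

# Microseconds between the UNIX epoch (1970-01-01) and the PostgreSQL
# epoch (2000-01-01), which are 10957 days apart.
EPOCH_DIFF_USECS = 10957 * 86400 * 1_000_000


def pg_timestamp_to_unix_usecs(ts: int) -> int:
    """Convert a PostgreSQL timestamp (usecs since 2000-01-01) to UNIX usecs."""
    if ts in (INT64_MIN, INT64_MAX):
        return ts  # assumption: infinities are passed through unchanged
    result = ts + EPOCH_DIFF_USECS
    if result >= INT64_MAX:
        # A finite input must not spill into, or past, the +Infinity value.
        raise OverflowError("timestamp out of range")
    return result
```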
2020-08-14 01:52:28 +02:00
Sven Klemm
bb891cf4d2 Refactor retention policy
This patch changes the retention policy to store its configuration
in the bgw_job table and removes the bgw_policy_drop_chunks table.
2020-08-03 22:33:54 +02:00
Erik Nordström
a311f3735d Adopt table scan methods for Scanner
This change makes the Scanner code agnostic to the underlying storage
implementation of the tables it scans. This also fixes a bug that made
it impossible to use non-heap table access methods on a
hypertable. The bug existed because a check is made for existing data
before a table is made into a hypertable. And, since this check reads
data from the table using the Scanner, it must be able to read the
data irrespective of the underlying storage.

As a result of the more generic scan interface, resource management is
also improved by delivering tuples in reference-counted tuple table
slots. A backwards-compatibility layer is used for PG11, which maps
all table access functions to the heap equivalents.
2020-07-29 10:40:12 +02:00
Sven Klemm
c90397fd6a Remove support for PG9.6 and PG10
This patch removes code support for PG9.6 and PG10. In addition to
removing PG96 and PG10 macros the following changes are done:

remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
remove PG_VERSION_SUPPORTS_MULTINODE
2020-06-02 23:48:35 +02:00
Dmitry Simonenko
9b4aae813f Support storage options for distributed hypertables
This change deparses the main table's storage options and includes
them in the CREATE TABLE command that is executed during the
create_distributed_hypertable() call.
2020-05-27 17:31:09 +02:00
Erik Nordström
e2371558f7 Create chunks on remote servers
This change ensures that chunk replicas are created on remote
(datanode) servers whenever a chunk is created in a local distributed
hypertable.

Remote chunks are created using the `create_chunk()` function, which
has been slightly refactored to allow specifying an explicit chunk
table name. The one making the remote call also records the resulting
remote chunk IDs in its `chunk_server` mappings table.

Since remote command invocation without super-user permissions
requires password authentication, the test configuration files have
been updated to require password authentication for a cluster test
user that is used in tests.
2020-05-27 17:31:09 +02:00
Ruslan Fomkin
1ddc62eb5f Refactor header inclusion
Correcting conditions in #ifdefs, adding missing includes, removing
and rearranging existing includes, replacing PG12 with PG12_GE for
forward compatibility. Replaced relation_close with table_close in a
number of places that were missed earlier.
2020-04-14 23:12:15 +02:00
Joshua Lockerman
949b88ef2e Initial support for PostgreSQL 12
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:

- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
  PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
  planner changes in PostgreSQL (gapfill, first/last, etc.)

Before PostgreSQL 12, planning was organized roughly as
follows:

 1. construct and add `RelOptInfo` for base and appendrels
 2. add restrict info, joins, etc.
 3. perform the actual planning with `make_one_rel`

For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn’t have to be too precise
about where we were doing it.

However, with PG12, and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition-pruning. Now planning is organized like:

 1. construct and add RelOptInfo for base rels only
 2. add restrict info, joins, etc.
 3. expand appendrels, removing irrelevant declarative partitions
 4. perform the actual planning with make_one_rel

Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.

The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the first hook we can use to
re-enable inheritance is `set_rel_pathlist_hook`, which entails a
number of annoyances:

 1. this hook is called after the sizes of tables are calculated, so we
    must recalculate the sizes of all hypertables, as they will not
    have taken the chunk sizes into account
 2. the table upon which the hook is called will have its paths planned
    under the assumption it has no inheritance children, so if it's a
    hypertable we have to replan its paths

Unfortunately, the code for doing these is static, so we need to copy
them into our own codebase, instead of just using PostgreSQL's.

In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:

- In stage 1, the statement is planned as if it was a `SELECT` and all
  leaf tables are discovered.
- In stage 2, the original query is planned against each leaf table,
  discovered in stage 1, directly, not part of an Append.

Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not yet initialized at the point where
we wish to do so. This has consequences for how we identify operations
on chunks (sometimes for blocking and sometimes for enabling
functionality).
2020-04-14 23:12:15 +02:00
Michael J. Freedman
416cf13385 Clarify supported intervals in error msg
Error message used to specify that interval must be defined in terms
of days or smaller, which was confusing because we really meant any
fixed interval (e.g., weeks, days, hours, minutes, etc.), but not an
interval that is not of fixed duration (e.g., months or years).
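
The distinction can be sketched in Python as follows. The `Interval` class and `is_fixed_duration` are hypothetical illustrations modeled on PostgreSQL's three interval fields (months, days, microseconds), not the extension's code. Following the message above, days and weeks are treated as fixed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Illustrative stand-in for the three fields of a PostgreSQL interval."""
    months: int = 0  # years are stored as 12 months each
    days: int = 0    # weeks are stored as 7 days each
    usecs: int = 0   # hours, minutes, seconds, and fractions


def is_fixed_duration(iv: Interval) -> bool:
    """An interval is accepted only if it has no month (or year) component."""
    return iv.months == 0
```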
2020-03-05 13:13:04 -05:00
Matvey Arye
ef77c2ace8 Improve continuous agg user messages
Switch from using internal timestamps to more user-friendly
timestamps in our log messages and clean up some messages.
2020-01-02 15:49:04 -05:00
Matvey Arye
92aa77247a Improve minor UIUX
Some small improvements:

- allow alter table with empty segment by if the original definition
  had an empty segment by. Improve error msgs.
- block compression on tables with OIDs
- block compression on tables with RLS
2019-10-29 19:02:58 -04:00
gayyappan
909b0ece78 Block updates/deletes on compressed chunks 2019-10-29 19:02:58 -04:00
Sven Klemm
e2c03e40aa Add support for pathkey pushdown for transparent decompression
This patch adds support for producing ordered output. All
segmentby columns need to be a prefix of the pathkeys and the orderby
specified for the compression needs to exactly match the rest of the
pathkeys.
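
A minimal Python sketch of that matching rule, using plain column names and ignoring sort direction and NULL ordering. The function name is hypothetical, and accepting the segmentby columns in any order within the prefix is an assumption, not something the message states.

```python
from typing import Sequence


def can_push_down_sort(pathkeys: Sequence[str],
                       segmentby: Sequence[str],
                       orderby: Sequence[str]) -> bool:
    """Check whether the requested ordering can be produced directly."""
    n = len(segmentby)
    prefix, rest = list(pathkeys[:n]), list(pathkeys[n:])
    # Assumption: segmentby columns may appear in any order within the prefix.
    if sorted(prefix) != sorted(segmentby):
        return False
    # The compression orderby must match the remaining pathkeys exactly.
    return rest == list(orderby)
```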
2019-10-29 19:02:58 -04:00
Joshua Lockerman
fa26992c4c Improve deltadelta and gorilla compressors
- Add fallback compressors for deltadelta/gorilla
- Add bool compressor for deltadelta
2019-10-29 19:02:58 -04:00
Sven Klemm
d82ad2c8f6 Add ts_ prefix to all exported functions
This patch adds the `ts_` prefix to exported functions that didn't
have it and removes exports that are not needed.
2019-10-15 14:42:02 +02:00
Sven Klemm
a3a49703aa Remove get_function_oid from utils.c
The get_function_oid function was a reimplementation of PostgreSQL
LookupFuncName. This patch removes the function and switches
all callers to use LookupFuncName instead.
2019-09-24 21:13:06 +02:00
Sven Klemm
b86e47a8a1 Fix microsoft compiler warnings
The microsoft compiler can't figure out that elog(ERROR) doesn't
return and warns about functions not returning a value in all code
paths. This patch adds pg_unreachable calls to those functions.
2019-09-16 10:13:21 +02:00
Sven Klemm
468c205a4f Remove attno_find_by_attname and use get_attnum instead
The patch removes the custom implementation to find the attribute
number for a column and uses PostgreSQL get_attnum function instead.
2019-09-13 14:30:18 +02:00
Sven Klemm
7c434d4914 Fix ChunkAppend space partitioning support for ordered append
When ordered append tried to push down the targetlist to child paths,
it assumed the children would be scans on rels, which is not true for
space partitioning, where children might be MergeAppend nodes.
This patch also no longer applies the ordered append optimization
to partial paths because it's not safe to do so.
This patch also adds more tests for space partitioned hypertables.
2019-08-21 23:08:15 +02:00
Narek Galstyan
62de29987b Add a notion of now for integer time columns
This commit implements functionality for users to give a custom
definition of now() for integer open dimension typed hypertables.
Such a now() function enables us to talk about intervals in the context
of hypertables with integer time columns. In order to simplify future
code, this commit defines a custom ts_interval type that unites the
usual postgres intervals and integer time dimension intervals under a
single composite type.

The commit also enables adding drop chunks policy on hypertables with
integer time dimensions if a custom now() function has been set.
2019-08-19 23:23:28 +04:00
gayyappan
5b7eea4cfe Pass int64 using Int64GetDatum when a Datum is required
int64 should be passed to functions that take a Datum parameter using Int64GetDatum.
Depending on the platform, postgres either passes int64 by value or allocs a pointer
to hold this value.
Without this change, we get SEGV on raspberry pi.
2019-05-23 15:44:41 -04:00
Joshua Lockerman
ae3480c2cb Fix continuous_aggs info
This commit switches the remaining JOIN in the continuous_aggs_stats
view to LEFT JOIN. This way we'll still see info from the other columns
even when the background worker has not run yet.
This commit also switches the time fields to output text in the correct
format for the underlying time type.
2019-04-26 13:08:00 -04:00
Joshua Lockerman
0737b370a3 Add the actual bgw job for continuous aggregates
This commit adds the actual background worker job that runs the continuous
aggregate automatically. This job gets created when the continuous aggregate is
created and is deleted when the aggregate is DROPed. By default this job will
attempt to run every two bucket widths, and attempts to materialize up to two
bucket widths behind the end of the table.
2019-04-26 13:08:00 -04:00
David Kohn
f17aeea374 Initial cont agg INSERT/materialization support
This commit adds initial support for the continuous aggregate materialization
and INSERT invalidations.

INSERT path:
  On INSERT, DELETE and UPDATE we log the [min, max] time range that may be
  invalidated (that is, newly inserted, updated, or deleted) to
  _timescaledb_catalog.continuous_aggs_hypertable_invalidation_log. This log
  will be used to re-materialize these ranges, to ensure that the aggregate
  is up-to-date. Currently these invalidations are recorded by a trigger
  _timescaledb_internal.continuous_agg_invalidation_trigger, which should be
  added to the hypertable when the continuous aggregate is created. This trigger
  stores a cache of min/max values per-hypertable, and on transaction commit
  writes them to the log, if needed. At the moment, we consider them to always
  be needed, unless we're in ReadCommitted mode or weaker, and the min
  invalidated value is greater than the hypertable's invalidation threshold
  (found in _timescaledb_catalog.continuous_aggs_invalidation_threshold)

Materialization path:
  Materialization currently happens in multiple phases: in phase 1 we determine
  the timestamp at which we will end the new set of materializations, then we
  update the hypertable's invalidation threshold to that point, and finally we
  read the current invalidations, then materialize any invalidated rows, the new
  range between the continuous aggregate's completed threshold (found in
  _timescaledb_catalog.continuous_aggs_completed_threshold) and the hypertable's
  invalidation threshold. After all of this is done we update the completed
  threshold to the invalidation threshold. The portion of this protocol from
  after the invalidations are read, until the completed threshold is written
  (that is, actually materializing, and writing the completion threshold) is
  included with this commit, with the remainder to follow in subsequent ones.
  One important caveat is that since the thresholds are exclusive, we invalidate
  all values _less_ than the invalidation threshold, and we store timevalue
  as an int64 internally, we cannot ever determine if the row at PG_INT64_MAX is
  invalidated. To avoid this problem, we never materialize the time bucket
  containing PG_INT64_MAX.
2019-04-26 13:08:00 -04:00
Joshua Lockerman
b0bd2775bd Enable optimizing SELECTs within INSERTs
Before this PR only SELECTs would be optimized to exclude unneeded
chunks by our planner. This PR enables such optimizations on SELECTs
found within an INSERT as well. This should speed up commands of the
form

    INSERT INTO <hypertable> (SELECT ... FROM <hypertable> WHERE ...)

We would like to enable this for all commands, but currently DELETE and
UPDATE can not handle them, and cause errors when the optimizations are
enabled.

This commit also fixes an issue that would occur if we tried to exclude
chunks based off of infinite time values.
2019-04-24 14:40:08 -04:00
Sven Klemm
ef9891b2e8 Fix a couple typos 2019-04-15 21:44:10 +02:00
Sven Klemm
1813848cb7 Add time_bucket support to chunk exclusion
This patch adds support for chunk exclusion for time_bucket
expressions in the WHERE clause. The following transformation
is done when building RestrictInfo:

Transform time_bucket calls of the following form in WHERE clause:

time_bucket(width, column) OP value

Since time_bucket always returns the lower bound of the bucket
for lower bound comparisons the width is not relevant and the
following transformation can be applied:

time_bucket(width, column) > value
column > value

Example with values:

time_bucket(10, column) > 109
column > 109

For upper bound comparisons width needs to be taken into account
and we need to extend the upper bound by width to capture all
possible values.

time_bucket(width, column) < value
column < value + width

Example with values:

time_bucket(10, column) < 100
column < 100 + 10

This allows chunk exclusions to work for views with aggregations.
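
The reasoning above can be checked with a small Python sketch. It uses an integer `time_bucket` built on floor division and a hypothetical helper, `rewrite_is_safe`. It illustrates the argument and is not the planner code. The rewritten predicates are looser than the originals, so they can only admit extra rows, never drop matching ones.

```python
from typing import Iterable


def time_bucket(width: int, t: int) -> int:
    """Integer time_bucket: the lower bound of the bucket containing t."""
    return (t // width) * width


def rewrite_is_safe(width: int, value: int, times: Iterable[int]) -> bool:
    """Check that each rewritten predicate admits every row the original admits."""
    for t in times:
        # time_bucket(width, column) > value  implies  column > value
        if time_bucket(width, t) > value and not (t > value):
            return False
        # time_bucket(width, column) < value  implies  column < value + width
        if time_bucket(width, t) < value and not (t < value + width):
            return False
    return True
```

For the values used above, `rewrite_is_safe(10, 109, range(0, 500))` and `rewrite_is_safe(10, 100, range(0, 500))` both hold.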
2019-04-13 04:36:36 +02:00
Joshua Lockerman
e051842fee Add interval to internal conversions, and tests for both this and time conversions
We find ourselves needing to store intervals (specifically time_bucket widths) in
upcoming PRs, so this commit adds that functionality, along with tests that we
perform the conversion in a sensible, round-trippable manner.

This commit fixes a longstanding bug in plan_hashagg where negative time values
would prevent us from using a hashagg. The old logic for to_internal had a flag
that caused the function to return -1 instead of throwing an error, if it could
not perform the conversion. This logic was incorrect, as -1 is a valid time value.
The new logic throws the error unconditionally, and forces the user to CATCH it
if they wish to handle that case. Switching plan_hashagg to using the new logic
fixed the bug.
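
The problem with the sentinel can be illustrated with a short Python sketch. Both functions are hypothetical stand-ins for the old and new `to_internal` behavior, and the conversion is reduced to accepting integers only.

```python
from typing import Any


def to_internal_with_sentinel(value: Any) -> int:
    """Old-style sketch: return -1 when the value cannot be converted."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return -1  # ambiguous: -1 is also a valid time value


def to_internal(value: Any) -> int:
    """New-style sketch: raise unconditionally when conversion fails."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise TypeError(f"cannot convert {value!r} to internal time")
```

With the sentinel version, a caller cannot tell a failed conversion from the legitimate time value -1. With the raising version, the caller has to catch the error explicitly.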

The commit adds a single SQL file, c_unit_tests.sql, to be the driver for all such
pure-C unit tests. Since the tests run quickly, and there is very little work to
be done at the SQL level, it does not seem like each group of such tests requires
their own SQL file.

This commit also updates the test/sql/.gitignore, as some generated files were
missing.
2019-03-29 14:47:41 -04:00
Matvey Arye
34edba16a9 Run clang-format on code 2019-02-05 16:55:16 -05:00
niksa
c77f4ab1b3 Explicit chunk exclusion
In some cases the user might already know which chunks need to be scanned to answer
a particular query. Using the `chunks_in` function we can skip calculating the chunks
involved in a particular query, which should also result in better performance.
A simple example:

`SELECT * FROM hypertable WHERE chunks_in(hypertable, ARRAY[1,2])`
2019-01-19 00:02:01 +01:00
Joshua Lockerman
acc41a7712 Update license header
Only have the copyright in the NOTICE. Hopefully
only having to update one place each year will
keep it consistent.
2019-01-03 11:57:51 -05:00
Joshua Lockerman
888dea71b5 Stop using the extra field for now and other Windows bugs
Something is causing a heap corruption upon setting the license key to
default when we try to use the guc extra on windows. For now stop using
it and just rerun the validation function, if we get to the assign hook
we must have a valid key, so it will never fail.

Also Fixes error message on windows;
turns out windows does not like to print NULL strings.
Don't do that.

Fixes other minor windows bugs.
2019-01-02 15:43:48 -05:00
Sven Klemm
c59a30feed Remove unused functions from utils.c
Remove the following unused functions:
_timescaledb_internal.to_microseconds(TIMESTAMPTZ)
_timescaledb_internal.to_timestamp_pg(BIGINT)
_timescaledb_internal.time_to_internal(anyelement)
2018-12-12 20:54:20 +01:00
David Kohn
5aa1edac15 Refactor compatibility functions and code to support PG11
Introduce PG11 support by introducing compatibility functions for
any whose signatures have changed in PG11. Additionally, refactor
the structure of the compatibility functions found in compat.h by
breaking them out by function (or small set of similar functions)
so that it is easier to see what changed between versions and maintain
changes as more versions are supported.

In general, the philosophy has been to try for forward compatibility
wherever possible, so that we use the latest versions of function interfaces
where we can or where reasonably convenient and mimic the behavior
in older versions as much as possible.
2018-12-12 11:42:33 -05:00
Sven Klemm
ed5067c356 Fix interval_from_now_to_internal timestamptz handling
fix interval_from_now_to_internal to handle timezone properly
for timestamptz and simplify code
2018-12-10 23:24:12 +01:00
Joshua Lockerman
9b52909b17 Add the ability to ignore tests from the command line using IGNORES 2018-12-10 16:36:44 -05:00
niksa
019971c402 Optimize FIRST/LAST aggregate functions
If possible replace aggregate functions FIRST/LAST with subqueries of the form
(SELECT value FROM table WHERE sort IS NOT NULL AND existing-quals ORDER BY sort ASC/DESC
LIMIT 1).
Given a suitable index on sort column, this plan can be much faster than scanning all the
rows and running an aggregate function.
The optimization can't be performed if:
- query uses GROUP BY or WINDOW function
- query contains CTEs
- query contains other aggregate functions (e.g., combining MIN/MAX with FIRST/LAST. We can't
	optimize across different aggregate functions)
- query uses JOIN
- FIRST/LAST used in ORDER BY

Optimization also works with subqueries, or if FIRST/LAST is used in CTE subquery.

In order to standardize existing FIRST/LAST aggregate function with PostgreSQL and
FIRST/LAST optimization, we exclude NULL values in sort by column.
2018-12-10 09:50:55 +01:00
Joshua Lockerman
9de504f958 Add ts_ prefix to everything in headers
Future proofing: if we ever want to make our functions available  to
others they’d need to be prefixed to prevent name collisions. In
order to avoid having some functions with the ts_ prefix and
others without, we’re adding the prefix to all non-static
functions now.
2018-12-05 14:43:22 -05:00
Sven Klemm
b9b439fde4 Remove unused functions from utils.c
Remove int_cmp, create_fmgr and makeRangeVarFromRelid from utils.c
since they were not used and had no test coverage.
2018-11-30 20:12:26 +01:00
Narek Galstyan
9a3402809f Implement show_chunks in C and have drop_chunks use it
Timescale provides an efficient and easy to use api to drop individual
chunks from timescale database through drop_chunks. This PR builds on
that functionality and through a new show_chunks function gives the
opportunity to see the chunks that would be dropped if drop_chunks was run.
Additionally, it adds a newer_than option to drop_chunks (also supported
by show_chunks) that allows seeing/dropping chunks in an interval or newer
than a point in time.

This commit includes:
    - Implementation of show_chunks in C
    - Additional helper functions to work with chunks
    - New version of drop_chunks in sql that uses show_chunks. This
      	  also adds a newer_than option to drop_chunks
    - More enhanced tests of drop_chunks and new tests for show_chunks

Among other reasons, show_chunks was implemented in C in order
to be able to have both older_than and newer_than arguments be null. This
was not possible in SQL because the arguments had to have polymorphic types
and whether they are used in function body or not, PL/pgSQL requires these
arguments to typecheck.
2018-11-28 13:46:07 -05:00
Amy Tai
80e0b05348 Provide helper function creating struct from tuple
Refactored the boilerplate that allocates and copies over data from a tuple to a struct. This is typically used in the scanner context in order to read rows from a SQL table in C.
2018-11-21 15:33:56 -05:00
Joshua Lockerman
d8e41ddaba Add Apache License header to all C files 2018-10-29 13:28:19 -04:00
Erik Nordström
b2130f8039 Move all time_bucket functions to same source file
This change moves all time_bucket-related functions to the same source
file (time_bucket.c) for consistency. There are no changes to code
logic.
2018-10-23 10:44:58 +02:00
Matvey Arye
19299cf349 Make all time_bucket functions STRICT
All time_bucket functions should return NULL on any NULL parameters.
2018-10-15 10:16:10 -04:00
Matvey Arye
debd91478a Move to using macro for time_bucket_ts
Macro is used for 2 reasons:
1) It's more correct in that it doesn't mix Timestamp and TimestampTz
types. There is no implicit conversion of the two beneath the hood.
2) It is slightly faster as it avoids an extra function call. This
is a very performance sensitive function for OLAP queries.
2018-10-15 10:16:10 -04:00
Matvey Arye
297d88551b Add a version of time_bucket that takes an origin
This allows people to explicitly specify the origin point.
2018-10-15 10:16:10 -04:00
Matvey Arye
e74be30925 Move time_bucket epoch to a Monday
Since Monday is the ISO start of the week, it makes sense to move
the time_bucket epoch to start on a Monday. Before the epoch was the
same as the Postgres epoch (2000-01-01, a Saturday).
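
The effect on weekly buckets can be sketched in Python with day-granularity dates. `time_bucket_days` is a hypothetical helper, and using 2000-01-03 (the first Monday after the Postgres epoch) as the new origin is an assumption. The message only says the epoch moves to a Monday.

```python
from datetime import date, timedelta

POSTGRES_EPOCH = date(2000, 1, 1)  # a Saturday
MONDAY_ORIGIN = date(2000, 1, 3)   # assumed new origin, a Monday


def time_bucket_days(width_days: int, d: date,
                     origin: date = MONDAY_ORIGIN) -> date:
    """Return the start of the width_days-wide bucket that contains d."""
    offset = (d - origin).days
    return origin + timedelta(days=(offset // width_days) * width_days)
```

With a 7-day width, buckets anchored at `MONDAY_ORIGIN` always start on a Monday, whereas buckets anchored at `POSTGRES_EPOCH` start on a Saturday.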
2018-10-15 10:16:10 -04:00
Joshua Lockerman
974788516a Prefix public C functions with ts_
We've decided to adopt the ts_ prefix on all exported C functions in
order to avoid having symbol conflicts with future postgres functions.
We've already started using this prefix on new functions and this commit
adds the prefix to the old functions.
2018-09-27 11:45:04 -04:00