56 Commits

Author SHA1 Message Date
Erik Nordström
c09f6013fa Add number of partitions to partition_epoch table.
There are two reasons for adding the partition count to
the partition_epoch table:

* It makes the partition_epoch more self-describing as
  it makes it easy to see how many partitions are
  in the current epoch as well as past ones.
* It simplifies native code that can read the partition
  epoch, allocate memory for the right number of partitions,
  and finally scan the partition table filling in each entry.
2017-02-28 20:07:32 +01:00
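The read-allocate-scan flow described in c09f6013fa can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the struct layouts, field names, and the `partition_epoch_create()` and `partition_epoch_set()` helpers are all hypothetical. It only shows why knowing the partition count up front allows a single exact-size allocation before the partition table is scanned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical in-memory layout; names are illustrative only. */
typedef struct Partition
{
    int32_t id;
    int16_t keyspace_start;
    int16_t keyspace_end;
} Partition;

typedef struct PartitionEpoch
{
    int32_t   id;
    int16_t   num_partitions;   /* read from the partition_epoch row */
    Partition partitions[];     /* flexible array, sized by num_partitions */
} PartitionEpoch;

/* Allocate an epoch with room for exactly num_partitions entries.
 * Returns NULL on invalid input or allocation failure. */
PartitionEpoch *
partition_epoch_create(int32_t id, int16_t num_partitions)
{
    PartitionEpoch *pe;

    if (num_partitions <= 0)
        return NULL;

    pe = calloc(1, sizeof(PartitionEpoch) +
                   sizeof(Partition) * (size_t) num_partitions);
    if (pe == NULL)
        return NULL;

    pe->id = id;
    pe->num_partitions = num_partitions;
    return pe;
}

/* Fill in one entry, as a scan over the partition table would do per row.
 * Returns 0 on success, -1 if index is out of range. */
int
partition_epoch_set(PartitionEpoch *pe, int16_t index, Partition p)
{
    assert(pe != NULL);

    if (index < 0 || index >= pe->num_partitions)
        return -1;

    pe->partitions[index] = p;
    return 0;
}
```

Without the stored count, the caller would have to count the partition rows first or grow the array while scanning.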
Erik Nordström
e3fabf993a Fix chunk-related deadlocks.
This patch fixes two deadlock cases.

The first case occurred as a result of taking partition and chunk
locks in inconsistent orders. When creating the first chunk C1
in a table, concurrent INSERT workers would race to create
that chunk. The result would be that the transactions queue up on
the partition lock P, effectively serializing these transactions.
As a result, these concurrent transactions would insert
at very different offsets in time, one at a time. At some point
in the future, some n'th transaction Tn queued up on P would get
that lock as the preceding inserters T1..T(n-1) finish their inserts
and move on to their next batches. When Tn finally holds P, one of
the preceding workers starts a new transaction that finds that it
needs to close C1, grabbing a lock on C1 and then on P. However,
it will block on P since Tn already holds P. Tn will also believe
it needs to close C1, thus trying to grab a lock on C1, but will
block, causing a deadlock.

The second case can occur on multi-partition hypertables. With
multiple partitions there is more than one open-ended chunk
at a time (one for each partition). This leads to a deadlock case
when two processes try to close (and thus lock) the chunks in
different order. For instance process P1 closes chunk C1 and then
C2, while process P2 locks in order C2 and C1.

The fix for the first case is to remove the partition lock
altogether. As it turns out, this lock is not needed.
Instead, transactions can race to create new chunks, thus causing
conflicts. A conflict in creating a new chunk can safely be
ignored and it also avoids taking unnecessary locks. Removing the
partition lock also avoids the transaction serialization that
happens around this lock, which is especially bad for long-running
transactions (e.g., big INSERT batches).

The fix for the second multi-partition deadlock case is to always
close chunks in chunk ID order. This requires closing chunks at
the end of a transaction, once a transaction knows all the chunks
it needs to close. This also has the added benefit of reducing the
time a transaction holds exclusive locks on chunks, potentially
improving insert performance.
2017-02-23 15:41:57 +01:00
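The second fix in e3fabf993a relies on every transaction acquiring chunk locks in the same global order. A minimal sketch of that idea, assuming chunk IDs are 32-bit integers collected during the transaction, is shown below; `chunk_close_order()` is a hypothetical helper, not a function from the patch, and the actual locking call is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of chunk IDs that avoids subtraction overflow. */
static int
chunk_id_cmp(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/* Sort the chunk IDs a transaction needs to close into ascending order
 * and drop duplicates in place. Returns the number of unique IDs.
 * The caller would then lock and close chunks in array order. */
size_t
chunk_close_order(int32_t *chunk_ids, size_t n)
{
    size_t i;
    size_t last = 0;

    assert(chunk_ids != NULL || n == 0);

    if (n == 0)
        return 0;

    qsort(chunk_ids, n, sizeof(int32_t), chunk_id_cmp);

    for (i = 1; i < n; i++)
    {
        if (chunk_ids[i] != chunk_ids[last])
            chunk_ids[++last] = chunk_ids[i];
    }

    return last + 1;
}
```

Two processes that discover the chunks as (C1, C2) and (C2, C1) both end up locking C1 first, so neither can hold C2 while waiting for C1.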
Olof Rensfelt
406dbb7bd6 Remove clustered tests. 2017-02-22 12:16:48 +01:00
Erik Nordström
4683d8e03e Fix conversion between TIMESTAMP and internal BIGINT UNIX time representation.
Currently, the internal metadata tables for hypertables track time
as a BIGINT integer. Converting hypertable time columns in TIMESTAMP
format to this internal representation requires using Postgres' conversion
functions that are imprecise due to floating-point arithmetic. This patch
adds C-based conversion functions that offer the following conversions
using accurate integer arithmetic:

- TIMESTAMP to UNIX epoch BIGINT in microseconds
- UNIX epoch BIGINT in microseconds to TIMESTAMP
- TIMESTAMP to Postgres epoch BIGINT in microseconds
- Postgres epoch BIGINT in microseconds to TIMESTAMP

The downside of the UNIX epoch functions is that they don't offer the full
date range offered by the Postgres to_timestamp() function. This is
because the required epoch shift might otherwise overflow the BIGINT.
All functions should, however, offer appropriate range checks and will
throw errors if outside the range.
2017-02-14 14:01:26 +01:00
Erik Nordström
ab3070aa89 Fix C-style and compiler warnings
- Add emacs-style indentation settings headers
- Fix compiler warnings related to C style
- Indentation fixes
2017-02-13 11:47:41 +01:00
Erik Nordström
81d966f908 Update README and add user-facing API to drop chunks
- Add complete database setup instructions to the README.
- Add drop_chunks() user-facing API
- Add retention policy section to README
2017-02-09 13:04:20 +01:00
Olof Rensfelt
87035d9916 Merged in perf-improvements (pull request #78)
Performance improvements

Approved-by: ci-vast
2017-02-08 09:09:18 +00:00
Matvey Arye
321babee81 cleanup 2017-02-07 17:39:31 -05:00
Matvey Arye
f12a880d8d prepare hypertable_info plan 2017-02-07 17:09:08 -05:00
Matvey Arye
f1fcf51476 Make inserts use temporary tables for performance. 2017-02-07 17:08:51 -05:00
Matvey Arye
68a060e8a7 various performance fixes 2017-02-07 17:05:10 -05:00
Erik Nordström
839710d567 Fix time conversion bug 2017-02-07 18:38:15 +01:00
Erik Nordström
036c40886c Simplify cluster setup
Setting up single node is now:

```
CREATE EXTENSION IF NOT EXISTS iobeamdb CASCADE;
select setup_single_node();
```

To set up a cluster do (on meta node):
```
CREATE EXTENSION IF NOT EXISTS iobeamdb CASCADE;
select set_meta();
```

on data node:
```
CREATE EXTENSION IF NOT EXISTS iobeamdb CASCADE;
select join_cluster('metadb', 'metahost');
```

This assumes that the commands are issued by the same user on both the
meta node and the data node. Otherwise the data node also has to
specify the user name to use when connecting to the meta node.
2017-02-07 12:22:36 +01:00
Erik Nordström
7b94c573ba Refactor directory structure and tests
- Directory structure now matches common practices
- Regression tests now run with pg_regress via the PGXS infrastructure.
- Unit tests do not integrate well with pg_regress and have to be run
  separately.
- Docker functionality is separate from main Makefile. Run with
  `make -f docker.mk` to build and `make -f docker.mk run` to run
  the database in a container.
2017-01-31 20:14:19 +01:00
Olof Rensfelt
0f3aa8d557 * Add _meta schema to allow all code to be loaded on both meta and nodes.
* Split SQL into functions and setup.
* Remove hash-lib dependency.
* Makes code into PostgreSQL extension.
2016-12-20 16:10:59 +01:00
Olof Rensfelt
3ac5ea7fd6 Added query tests. 2016-12-02 16:12:50 +01:00
Matvey Arye
7fa37f10bc Bug Fix: create index only on local tables 2016-11-29 10:24:36 -05:00
Matvey Arye
77a12411d2 fix time period limits 2016-11-28 22:06:36 -05:00
Matvey Arye
f56701f738 Fix nonexistent hypertable case 2016-11-28 21:49:44 -05:00
Olof Rensfelt
12f260f893 Update unit tests 2016-11-23 18:52:31 +01:00
Matvey Arye
3c72689664 Finishing refactor TODOs & Formatting. Placement and cross-epoch queries. 2016-11-22 16:49:44 -05:00
Matvey Arye
939a4008ad Cleanup after refactor 2016-11-22 16:42:19 -05:00
Matvey Arye
582671f926 Queries again work after previous refactoring 2016-11-22 16:41:28 -05:00
Matvey Arye
0ff42fad02 added ability to insert across partitions 2016-11-22 16:41:27 -05:00
Matvey Arye
1fdb3d43f9 remove last_time_approx 2016-11-22 16:41:27 -05:00
Matvey Arye
bfcb25642e added meta table and communication from data nodes to meta 2016-11-22 16:41:27 -05:00
Matvey Arye
42ee7c8586 starting refactor of clustering and naming logic
This fix allows more flexible placement of tables on a node; better
and more flexible logic for remote placement of chunks on nodes.
2016-11-22 16:41:27 -05:00
Olof Rensfelt
f57f34f1b2 add script to generate csv 2016-11-21 18:39:19 +01:00
Olof Rensfelt
e15da09f45 add test from postgres-kafka-consumer
Added missing files.
2016-11-21 18:39:19 +01:00
Matvey Arye
3950aa6237 rewriting the rest of non-agg queries to return sql, not records 2016-11-10 12:32:57 -05:00
Matvey Arye
5f44f32f86 rewriting by every query to return cte-based sql. 2016-11-09 16:52:31 -05:00
Matvey Arye
117e5ccd47 remove dead code 2016-11-07 12:02:34 -05:00
Matvey Arye
e96509d4c0 fix error codes 2016-11-07 11:25:58 -05:00
Matvey Arye
b15719f9b4 completed aggregate query work 2016-11-03 18:38:36 -04:00
Matvey Arye
60aa45cde3 more work on aggregates 2016-11-03 17:07:36 -04:00
Matvey Arye
7bbe78c3ac better solution to agg queries 2016-11-03 15:12:49 -04:00
Matvey Arye
99981228da fix in aggregate logic 2016-11-03 12:59:01 -04:00
Rob Kiefer
416cc1c041 Fix regression test file. Add Makefile and README 2016-11-02 14:42:14 -04:00
Rob Kiefer
29c4556402 Testing Jenkins PR 2016-11-02 14:16:17 -04:00
Olof Rensfelt
e199a1bcf7 Add setup scripts and cleanup test directories 2016-11-02 17:25:22 +01:00
Olof Rensfelt
696e9d0304 unit_test run script bugfix 2016-11-02 15:27:18 +01:00
Matvey Arye
e6e88ddb99 bug fixes 2016-11-01 22:05:01 -04:00
Matvey Arye
c3d55d339e queries now work with old consumer tests 2016-11-01 14:02:16 -04:00
Olof Rensfelt
b26e9e14e5 Merged in olof_rensfelt/backend-database/ore/test (pull request #2)
add unit tests
2016-11-01 14:54:33 +00:00
Olof Rensfelt
c33e64f2da add unit tests 2016-11-01 12:37:30 +01:00
Rob Kiefer
7c6dee79dc Add restrictions on new namespace names 2016-10-31 13:11:02 -04:00
Rob Kiefer
48da98b5bb Add trigger funcs (on_*, sync_*) to _sysinternal. 2016-10-31 12:49:07 -04:00
Rob Kiefer
e3c1dc2ecc Create schema _sysinternal, put underscored funcs in it. 2016-10-28 17:56:46 -04:00
Matvey Arye
a075899beb test and bug fixes; getting rid of jinja2 2016-10-26 17:02:08 -04:00
Matvey Arye
d300471c86 move cluster fdw setup inside core 2016-10-26 14:43:04 -04:00