18146 Commits

Author SHA1 Message Date
Lukas Joswiak
8a6bb8611a Update Python libfdb_c paths 2022-01-11 09:34:20 -08:00
Lukas Joswiak
bf9b4aeaab Rename libfdb_c in bindings dir 2022-01-11 09:34:20 -08:00
Lukas Joswiak
ff03fe99ff Add a copy of libfdb_c.so to lib for external client use 2022-01-11 09:34:20 -08:00
Andrew Noyes
64090dd2c0 Suppress a small leak that LSAN is reporting
I cannot seem to figure out why LSAN is reporting this, but if it is a
real leak then it's only a few bytes. Better to have the ASAN tests
actually passing IMO.
2022-01-10 13:44:09 -08:00
Jingyu Zhou
db436fb494 Remove unneeded Arena in Requests/Replies
If the Request/Reply doesn't have *Ref types, we typically don't need to have
an Arena.
2022-01-10 10:26:02 -08:00
Kao Makino
95c72bfc1b Fix malformed JSON 2022-01-10 10:19:11 -08:00
Jingyu Zhou
a71712c985 Update txnStateStore doc with new CC initiated recovery
Also explains what "txsPoppedVersion" is and how it is used in recovery.
2022-01-07 13:29:31 -08:00
Aaron Molitor
6e31821bf5 update download links in documentation #6154 2022-01-06 17:53:35 -08:00
Ata E Husain Bohra
936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9c9888e203421290959bd7f2c075d7f.
1.b. This reverts commit d174bb2e06bff01157d16c652073536c54d17f7f.
1.c. This reverts commit 30b05b469c87d9b526b427751c211fb5cf7ff9cd.

2. Update Status.actor to use the ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define "cluster recovery trace event"
   prefix; for now keeping it as "Master", however, it should allow
   smooth transition to "Cluster" prefix as it seems more appropriate.
2022-01-06 12:15:51 -08:00
Jon Fu
5c3f6d13af
Merge pull request #6205 from sfc-gh-jfu/jfu-ignore-test
Ignore RocksDB unit test
2022-01-06 14:46:32 -05:00
Jon Fu
5757e2c93c Ignore RocksDB unit test 2022-01-06 13:34:50 -05:00
Andrew Noyes
e5f943de08
Merge pull request #6179 from sfc-gh-anoyes/anoyes/test-generated-go-up-to-date
Update generated.go, and test to keep it up to date
2022-01-06 09:58:33 -08:00
Jon Fu
a7bfebdd22
Merge pull request #6200 from sfc-gh-jfu/jfu-disable-rocksdb-unit-test
Ignore RocksDB unit test by default because correctness breaks when RocksDB is not included
2022-01-05 23:02:52 -05:00
Jon Fu
93c5efc918 Check RocksDB flag in CMakeLists 2022-01-05 21:48:49 -05:00
Jon Fu
e5f7883f63 Ignore RocksDB unit test by default because correctness breaks when RocksDB is not included 2022-01-05 21:47:05 -05:00
Andrew Noyes
a3f37df94a
Merge pull request #6175 from sfc-gh-anoyes/anoyes/delete-non-virtual-destructor
Enable -Wdelete-non-virtual-dtor for clang build
2022-01-05 15:41:59 -08:00
A.J. Beamon
d3c88040fc
Merge pull request #6196 from sfc-gh-ajbeamon/fix-trace-event
Fix duplicate trace field
2022-01-05 13:30:23 -08:00
A.J. Beamon
59503a397e Fix duplicate trace field. 2022-01-05 11:30:21 -08:00
He Liu
30b3789316
Merge pull request #6195 from liquid-helium/range-deletion-to-single-deletion
Implement single deletion in RocksDB.
2022-01-04 12:55:58 -08:00
Andrew Noyes
e4bbfe468e
Merge pull request #6184 from sfc-gh-anoyes/anoyes/fix-asan-ctest
Fix ctest under ASAN
2022-01-04 10:40:39 -08:00
He Liu
1c2c2783ea Implement single deletion in RocksDB.
If the target deletion range contains a single key, convert it into a
single deletion, to avoid extra cost in RocksDB due to range deletion.
2022-01-04 09:59:40 -08:00
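The conversion described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: it assumes keys are byte strings, that a clear covers the half-open range [begin, end), and that such a range holds exactly one key when end is begin followed by a single 0x00 byte. The names `DeleteOp`, `isSingleKeyRange`, and `planClear` are hypothetical, and no RocksDB calls are made.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of the delete that would be sent to the storage engine.
struct DeleteOp {
    enum class Kind { SingleKeyDelete, RangeDelete } kind;
    std::string begin;
    std::string end; // left empty for SingleKeyDelete
};

// Assumption: the range is half-open, [begin, end). It can contain exactly one
// key only when end is the immediate successor of begin, i.e. begin + '\0'.
bool isSingleKeyRange(const std::string& begin, const std::string& end) {
    return end.size() == begin.size() + 1 && end.back() == '\0' &&
           end.compare(0, begin.size(), begin) == 0;
}

// Choose the cheaper single-key delete when the range covers one key,
// and fall back to a range delete otherwise.
DeleteOp planClear(const std::string& begin, const std::string& end) {
    if (isSingleKeyRange(begin, end)) {
        return { DeleteOp::Kind::SingleKeyDelete, begin, {} };
    }
    return { DeleteOp::Kind::RangeDelete, begin, end };
}
```

Under these assumptions, clearing ["a", "a\x00") would be planned as a single-key delete of "a", while clearing ["a", "c") would remain a range delete.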
Aaron Molitor
30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff5dd66bdbbc5b984688ac3ebb15b901.
2021-12-24 11:25:51 -08:00
Aaron Molitor
d174bb2e06 Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit abd2959702b0027ab23b8d42d8082b79c3b197f3.
2021-12-24 11:25:51 -08:00
Aaron Molitor
bb17e194d9 Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit 1520390bc50614ae7583638c07c033739f40dbfb.
2021-12-24 11:25:51 -08:00
Andrew Noyes
32ebdc6da2 Log status json if cluster is unavailable in fdbcli tests 2021-12-22 15:23:05 -08:00
Andrew Noyes
38a97a2e8f Increase default timeout to 5 minutes for add_fdbclient_test 2021-12-22 15:23:05 -08:00
Andrew Noyes
9ca514401d Suppress known asan stderr message 2021-12-22 15:04:30 -08:00
Andrew Noyes
28971c5181 Fix memory leak. Closes #4482 2021-12-22 15:04:00 -08:00
Andrew Noyes
6fdbd9ae30 Fail test if there's a sev40 in the logs, and output logs if test fails 2021-12-22 15:03:37 -08:00
Andrew Noyes
ebb570422b Format tmp_cluster.py with black 2021-12-22 15:01:54 -08:00
Ata E Husain Bohra
1520390bc5 Refactor: ClusterController driving cluster-recovery state machine
diff-1: Address Jingyu's review comments
diff-2: Introduce ClusterRecovery actor to separate out
        cluster recovery code

At present, cluster recovery process consists of following steps:
1. ClusterController clusterWatchDatabase actor recruits
   master/sequencer process.
2. Sequencer process implements the cluster recovery state machine,
   responsible for recruiting all other processes as well as restoring the
   cluster state.

Patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits "sequencer"
   process like other worker processes compared to current scheme
   where "sequencer" process gets special treatment. In newer scheme
   sequencer is responsible for maintaining/providing
   "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes etc.

NOTE:
Patch has moved the recovery state machine code from
'sequencer' -> 'cluster-controller' process, however, necessary
updates were done for both functionality as well as performance
improvement reasons.

Next Steps:
Cluster recovery documentation will be updated in near future.
2021-12-22 14:06:27 -08:00
Ata E Husain Bohra
abd2959702 Refactor: ClusterController driving cluster-recovery state machine
diff-1: Address Jingyu's review comments

At present, cluster recovery process consists of following steps:
1. ClusterController clusterWatchDatabase actor recruits
   master/sequencer process.
2. Sequencer process implements the cluster recovery state machine,
   responsible for recruiting all other processes as well as restoring the
   cluster state.

Patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits "sequencer"
   process like other worker processes compared to current scheme
   where "sequencer" process gets special treatment. In newer scheme
   sequencer is responsible for maintaining/providing
   "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes etc.

NOTE:
Patch has moved the recovery state machine code from
'sequencer' -> 'cluster-controller' process, however, necessary
updates were done for both functionality as well as performance
improvement reasons.

Next Steps:
Cluster recovery documentation will be updated in near future.
2021-12-22 14:06:27 -08:00
Ata E Husain Bohra
dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, cluster recovery process consists of following steps:
1. ClusterController clusterWatchDatabase actor recruits
   master/sequencer process.
2. Sequencer process implements the cluster recovery state machine,
   responsible for recruiting all other processes as well as restoring the
   cluster state.

Patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits "sequencer"
   process like other worker processes compared to current scheme
   where "sequencer" process gets special treatment. In newer scheme
   sequencer is responsible for maintaining/providing
   "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes etc.

NOTE:
Patch has moved the recovery state machine code from
'sequencer' -> 'cluster-controller' process, however, necessary
updates were done for both functionality as well as performance
improvement reasons.

Next Steps:
Cluster recovery documentation will be updated in near future.
2021-12-22 14:06:27 -08:00
neethuhaneesha
3086941c12
Merge pull request #6149 from neethuhaneesha/rocksdbHistograms
KeyValueStoreRocksDB histograms to track latencies
2021-12-22 11:28:36 -08:00
Neethu Haneesha Bingi
1f30368e71 KeyValueStoreRocksDB histograms to track latencies 2021-12-21 23:09:46 -08:00
Andrew Noyes
fba55557ae Update generated.go, and test to keep it up to date
Also remove some unnecessary cgo stuff, and add a description to
trace_partial_file_suffix
2021-12-21 15:16:50 -08:00
Andrew Noyes
fd33d31ff5 Enable -Wdelete-non-virtual-dtor for clang build
We had been disabling -Wdelete-non-virtual-dtor, because this seems to be done intentionally in the generated code of the actor compiler. I spent some time trying to rewrite it in a way that doesn't literally delete/destroy through a pointer to a base class without a virtual destructor, but I was unable to come up with something that passes correctness. My best guess is that we do this so that we can destroy actor state classes, call callbacks registered on the actor SAV, and then destroy the SAV.

Anyway now we'll detect new usages of deleting through a pointer to a base class without a virtual destructor.
2021-12-20 16:19:31 -08:00
Chris Douglas
6613ec282d
Merge pull request #6164 from cdouglas/awscli-baseimg
Move awscli to base image from YCSB
2021-12-17 12:58:59 -08:00
Chris Douglas
3793e4e5f0 Remove redundant WORKDIR directive 2021-12-17 11:34:32 -08:00
Chris Douglas
fec0fb9e9f Move awscli to base image from YCSB 2021-12-17 11:25:22 -08:00
Aaron Molitor
95d33cb363 copy packaging/docker to PROJECT_BINARY_DIR (undoing part of #5994),
fetch commit_sha from source_code_directory (don't assume we're in the source tree anymore),
allow custom tag (if a parameter is passed in as $1)
update README.md
2021-12-15 15:23:17 -08:00
A.J. Beamon
496000477c
Merge pull request #6144 from sfc-gh-ajbeamon/unify-flags
Convert command line arguments to use hyphens rather than underscores
2021-12-15 10:47:32 -08:00
Renxuan Wang
2227fc2943 Fix a bug in getDesiredCoordinators().
When no workers are chosen, we should return 0 coordinators.
2021-12-15 10:42:28 -08:00
A.J. Beamon
c2a960b0f7 Invoke a different command on fdbbackup that doesn't hang when a cluster file is present but the cluster is unavailable. 2021-12-15 09:34:08 -08:00
zhenfeng yang
76974605c1
support lto (#6140)
* support lto

* use relative path

* add another variable to control lto

* remove unnecessary code
2021-12-14 15:45:07 -08:00
A.J. Beamon
ca47b436ac
Apply suggestions from code review
Co-authored-by: Markus Pilman <markus.pilman@snowflake.com>
2021-12-14 14:44:20 -08:00
A.J. Beamon
16fb079c2d Undo some changes that aren't command line flags. 2021-12-14 12:35:49 -08:00
A.J. Beamon
30e2c2d9a6 Don't use new-style arguments in test harness. 2021-12-14 12:31:12 -08:00
A.J. Beamon
1a893e8d32 Add a test that various binaries properly parse arguments. 2021-12-14 12:03:44 -08:00
A.J. Beamon
5c9b64e414 Backup agent was mistakenly modified in conf files. 2021-12-14 12:02:13 -08:00