372 Commits

Author SHA1 Message Date
Zhe Wu
bd99f4fa3b Log tlog initialization 2022-07-21 13:54:44 -07:00
Markus Pilman
1de37afd52
Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
Jingyu Zhou
b2fded5c51 CC sends recovery txn version during TLog recruitment
This simplifies the logic for TLog to wait for recovery txn before replying
back to peeks.
2022-05-24 14:57:55 -07:00
Zhe Wu
cb73352e36 Don't pop every generation of old log router 2022-05-24 08:47:57 -07:00
Ray Jenkins
dc9e782ccc
OpenTelemetry Tracing Perf Fixes (#6990) 2022-05-02 14:56:51 -05:00
Evan Tschannen
a825eb8a8c fix: when more tlogs are absent than the replication factor we would access invalid memory 2022-04-27 16:53:30 -07:00
Ray Jenkins
1c5bf135d5
Revert "Migrate to OpenTelemetry tracing. (#6855)" (#6941)
This reverts commit 5df3bac110d9b5b88931b008b852433688bb7eb0.
2022-04-25 09:29:56 -05:00
Ray Jenkins
5df3bac110
Migrate to OpenTelemetry tracing. (#6855) 2022-04-20 09:26:37 -05:00
Jingyu Zhou
cfcf0f152c Merge branch 'main-4a085fc84' into vv
Fix Conflicts:
	fdbclient/NativeAPI.actor.cpp
	fdbserver/ClusterRecovery.actor.cpp
	fdbserver/MasterInterface.h
	fdbserver/masterserver.actor.cpp
	flow/error_definitions.h
2022-03-30 22:28:06 -07:00
Dan Lambright
f867474b05 Respond to Jingyu's comments 3/24 2022-03-25 10:50:41 -04:00
Dan Lambright
12e88a8ef5 Error in previous commit 2022-03-23 14:16:45 -04:00
Dan Lambright
f23f451cc4 Fix bug computing tlog count per log group 2022-03-22 14:12:26 -04:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Dan Lambright
d69aa8ae92 retain tlog count per log group, add fix dropped in previous rebase 2022-03-21 15:08:13 -04:00
Dan Lambright
6e507b8c07 refactor unicast recovery 2022-03-17 12:25:50 -04:00
Dan Lambright
b529801407 Respond to Jingyu's comments 2022-03-15 19:17:54 -04:00
Dan Lambright
de73fc03dc fix recovery algorithm 2022-03-14 08:59:58 -04:00
Dan Lambright
9544379cdf rebase 2022-01-20 11:12:33 -05:00
Dan Lambright
adc9055097 Do not restart recovery unless min DV of recovered tlog set goes down 2022-01-11 12:52:05 -05:00
Ata E Husain Bohra
936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine"" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9c9888e203421290959bd7f2c075d7f.
1.b. This reverts commit d174bb2e06bff01157d16c652073536c54d17f7f.
1.c. This reverts commit 30b05b469c87d9b526b427751c211fb5cf7ff9cd.

2. Update Status.actor to use the ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define the "cluster recovery trace event"
   prefix; for now it stays "Master", but the knob should allow a
   smooth transition to the "Cluster" prefix, which seems more appropriate.
2022-01-06 12:15:51 -08:00
Dan Lambright
49e89571fa Set recoverAt to max(all tlogs rv) for recovered (crashed) tLogs in UNICAST mode. 2022-01-04 12:27:20 -05:00
Aaron Molitor
30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff5dd66bdbbc5b984688ac3ebb15b901.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra
dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The Sequencer process implements the cluster recovery state machine,
   which is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where the ClusterController recruits the "sequencer"
   process like any other worker process, whereas in the current scheme
   the "sequencer" process gets special treatment. In the newer scheme the
   sequencer is responsible for maintaining/providing the
   "committed version" (as expected).
2. The ClusterController is already responsible for worker process
   recruitment; in the current scheme the sequencer, though orchestrating
   the recovery state machine, needs to reach out to the ClusterController
   to recruit worker processes etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' -> 'cluster-controller' process; in addition, necessary
updates were made for both functionality and performance
improvement reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Dan Lambright
792d7d288d address review comments 2021-12-19 12:50:59 -05:00
Dan Lambright
0222d8669d fix simulation failures 2021-12-10 09:56:21 -05:00
Dan Lambright
faef404279 system rv is max of tlog's rv 2021-11-15 09:42:01 -05:00
Dan Lambright
4979ccb889 commits recovered if written to every tlog minus failure tolerance. 2021-11-12 12:10:04 -05:00
Dan Lambright
0f99ad582b first cut unicast recovery 2021-11-10 12:31:16 -05:00
Lukas Joswiak
3988b11fd6 Cleanup 2021-11-09 12:29:48 -08:00
Lukas Joswiak
3e2c65bb11 Allow tlog to join another cluster but retain its data 2021-11-09 12:29:48 -08:00
Lukas Joswiak
30867750b5 Add protection against storage and tlog data deletion when joining a new cluster 2021-11-09 12:29:47 -08:00
Dan Lambright
58e1888d8e remove network hop by getting previous commit versions in GetCommitVersionRequest 2021-09-30 11:51:57 -04:00
Sreenath Bodagala
184c134b8a - Resolve merge conflicts 2021-09-17 20:25:16 +00:00
Sreenath Bodagala
2aa3b44d4e Merge remote-tracking branch 'apple-upstream/master' into version-vector-prototype
- Conflicts:
	fdbserver/LogSystem.h
	fdbserver/LogSystemConfig.h
	fdbserver/TagPartitionedLogSystem.actor.cpp

- Files modified during merge:

modified:   fdbserver/LogSystem.cpp
modified:   fdbserver/LogSystemConfig.cpp
2021-09-17 19:36:18 +00:00
Xiaoge Su
abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Xiaoge Su
c32c3b6ec4 fixup! Reformat the code per github's requirement 2021-09-12 14:17:19 -07:00
Xiaoge Su
40648dbb31 fixup! Update code per comment
Also fix the issue that TagPartitionedLogSystem.actor.cpp should include
TagPartitionedLogSystem.actor.h
2021-09-12 14:17:19 -07:00
Xiaoge Su
ecca4edeb4 Create TagPartitionedLogSystem.actor.h
TagPartitionedLogSystem.actor.h contains the struct of TagPartitionedLogSystem.
2021-09-12 14:17:19 -07:00
Sreenath Bodagala
a081c0baa5 Merge remote-tracking branch 'apple-upstream/master' into version-vector-prototype 2021-08-05 22:40:32 +00:00
yao-xiao-github
8609b45354
Add histograms to CommitProxyServer. (#5299) 2021-08-05 09:17:37 -07:00
Andrew Noyes
353efe7db2
Merge pull request #5264 from sfc-gh-tclinkenbeard/fix-more-clang-warnings
Enable more warnings for `clang`
2021-07-29 15:43:54 -07:00
sfc-gh-tclinkenbeard
94a65865d9 Merge remote-tracking branch 'origin/master' into fix-clang-warnings 2021-07-28 12:29:27 -07:00
sfc-gh-tclinkenbeard
c74047c665 Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings 2021-07-28 11:51:02 -07:00
Steve Atherton
507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard
64dc1dc185 Fix -Wreorder-ctor warnings in NativeAPI.actor.cpp and several other files 2021-07-24 00:23:06 -07:00
sfc-gh-tclinkenbeard
e62e6503ac Fix most delete-non-virtual-dtor clang warnings 2021-07-21 23:32:44 -07:00
Dan Lambright
d07c8ce211 wait for prev per tlog 2021-07-15 08:44:48 -04:00
Lukas Joswiak
5338251946 Fix invalid read 2021-07-13 10:44:37 -07:00
Jingyu Zhou
c700feaa6e Address Dan's comments 2021-06-28 11:16:35 -07:00
sfc-gh-tclinkenbeard
371a38e6e5 Merge remote-tracking branch 'origin/master' into remove-extra-copies 2021-06-07 10:26:06 -07:00