87 Commits

Author SHA1 Message Date
A.J. Beamon
cdebda35ab
Merge pull request #5725 from sfc-gh-jfu/jfu-grv-cache
Add transaction option for clients to use cached read versions
2022-03-04 09:17:27 -08:00
A.J. Beamon
250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Jon Fu
ae071a7211 clean up debug trace lines 2022-02-22 17:16:11 -05:00
Jon Fu
e0d3b0a488 format times in trace event 2022-02-16 16:55:14 -05:00
Jon Fu
6199faadc3 fix bug which constantly overwrote start time of last throttled queue 2022-02-16 16:52:38 -05:00
Jon Fu
5846dda410 temporary changes and extra traces for debugging 2022-02-16 15:05:25 -05:00
Jon Fu
6f1c3d50bb add debug traces for testing 2022-02-15 15:08:53 -05:00
Jon Fu
8129c4e21c simplify sidebandsingle workload and be stricter with batch throttling on rk 2022-02-14 13:58:56 -05:00
Jon Fu
a63d218e9d simplify test workload and adjust ratekeeper throttling strategy 2022-02-11 16:41:14 -05:00
Jon Fu
458e708272 addressed code review comments: renamed variables, small functional changes, style changes 2022-02-10 16:17:54 -05:00
Jon Fu
ec2bbf0343 clean up some more trace lines and leftover code snippets 2022-02-07 14:50:04 -05:00
Jon Fu
d8e7fea421 clean up some comments and debug changes 2022-02-02 14:03:32 -05:00
Jon Fu
915e2f6c1c Merge branch 'main' of github.com:apple/foundationdb into jfu-grv-cache 2022-01-20 16:17:20 -05:00
Ata E Husain Bohra
936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine"" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes include:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9c9888e203421290959bd7f2c075d7f.
1.b. This reverts commit d174bb2e06bff01157d16c652073536c54d17f7f.
1.c. This reverts commit 30b05b469c87d9b526b427751c211fb5cf7ff9cd.

2. Update Status.actor to track recovery status through the
   ClusterController interface.
3. Introduce a ServerKnob that defines the "cluster recovery trace event"
   prefix. For now it stays "Master", but the knob should allow a smooth
   transition to the "Cluster" prefix, which seems more appropriate.
2022-01-06 12:15:51 -08:00
Aaron Molitor
30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff5dd66bdbbc5b984688ac3ebb15b901.
2021-12-24 11:25:51 -08:00
Aaron Molitor
d174bb2e06 Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit abd2959702b0027ab23b8d42d8082b79c3b197f3.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra
abd2959702 Refactor: ClusterController driving cluster-recovery state machine
diff-1: Address Jingyu's review comments

At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The sequencer process implements the cluster recovery state machine,
   which is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits the "sequencer"
   process like other worker processes, compared to the current scheme
   where the "sequencer" process gets special treatment. In the newer
   scheme, the sequencer is responsible for maintaining/providing
   the "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes, etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; necessary
updates were also made for both functionality and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Ata E Husain Bohra
dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, the cluster recovery process consists of the following steps:
1. The ClusterController clusterWatchDatabase actor recruits the
   master/sequencer process.
2. The sequencer process implements the cluster recovery state machine,
   which is responsible for recruiting all other processes as well as
   restoring the cluster state.

This patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits the "sequencer"
   process like other worker processes, compared to the current scheme
   where the "sequencer" process gets special treatment. In the newer
   scheme, the sequencer is responsible for maintaining/providing
   the "committed version" (as expected).
2. ClusterController is responsible for worker process recruitment;
   the sequencer, though orchestrating the recovery state machine,
   needs to reach out to the ClusterController to recruit worker
   processes, etc.

NOTE:
The patch moves the recovery state machine code from the
'sequencer' process to the 'cluster-controller' process; necessary
updates were also made for both functionality and performance
reasons.

Next Steps:
Cluster recovery documentation will be updated in the near future.
2021-12-22 14:06:27 -08:00
Jon Fu
2b0ade5250 Change throttling threshold to count loop iterations instead of time 2021-11-25 13:31:55 -05:00
Jon Fu
beae9ccfa1 tweak some knob default settings and trace formatting 2021-11-23 14:58:17 -05:00
Jon Fu
33ee5fa372 add tracing to proxy throttling check codepath 2021-11-22 13:00:53 -05:00
Jon Fu
3f24128da4 Merge branch 'master' of github.com:apple/foundationdb into jfu-grv-cache 2021-11-19 14:46:55 -05:00
Jon Fu
e9c58d9f86 Check for sustained throttling in the proxy to lower threshold time and avoid false positives 2021-11-19 14:33:06 -05:00
Lukas Joswiak
28b72550f3 Remove additional unused tracing 2021-11-10 13:33:49 -08:00
Lukas Joswiak
c93052121f Fix issue where transaction spans would not be recorded 2021-11-10 13:33:49 -08:00
Jon Fu
6d74239760 Track throttling by measuring time spent in queue on the proxy 2021-10-22 13:55:01 -04:00
Jon Fu
f1c8d3fbc8 Add code to disable cache when ratekeeper begins throttling 2021-10-20 15:52:43 -04:00
Jon Fu
44a854772f Merge branch 'master' of github.com:apple/foundationdb into jfu-grv-cache 2021-10-05 12:55:02 -04:00
Jon Fu
d560eb1fea debug time bounds using sim_validation 2021-10-04 14:12:31 -04:00
Xiaoge Su
abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
FDB Formatster
2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
Lukas Joswiak
a605fb3852
Merge pull request #5026 from sfc-gh-ljoswiak/fixes/alp6
Actor sampling
2021-08-11 13:44:17 -07:00
Yao Xiao
f3baedd27f Apply suggestions. 2021-08-04 13:51:56 -07:00
Lukas Joswiak
5dc9a97230 Merge branch 'master' into fixes/alp6 2021-08-01 20:42:52 -07:00
Yao Xiao
82be496ba3 Updated grvRawDist to grvGetCommittedVersionRpcDist. 2021-07-30 18:01:27 -07:00
Yao Xiao
74a7da0179 Add histogram in GrvProxyServer. 2021-07-30 17:54:51 -07:00
sfc-gh-tclinkenbeard
c74047c665 Merge remote-tracking branch 'origin/master' into fix-more-clang-warnings 2021-07-28 11:51:02 -07:00
Lukas Joswiak
3eed4084e2 Merge branch 'master' into fixes/alp6 2021-07-27 11:26:53 -07:00
Lukas Joswiak
59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
Steve Atherton
507c1f11e3 Add .log() to bare TraceEvent() invocations without any .detail()s to avoid clang-tidy warning about immediate destruction of object without use. 2021-07-26 19:55:10 -07:00
sfc-gh-tclinkenbeard
64dc1dc185 Fix -Wreorder-ctor warnings in NativeAPI.actor.cpp and several other files 2021-07-24 00:23:06 -07:00
sfc-gh-tclinkenbeard
6f81155784 Merge remote-tracking branch 'origin/master' into const-serverdbinfo 2021-07-20 10:18:40 -07:00
Steve Atherton
f596a81073 Rename ::TRUE and ::FALSE in BooleanParams to ::True and ::False so as to not conflict with the TRUE and FALSE macros provided by the Windows and MacOS SDKs. 2021-07-17 00:11:40 -07:00
sfc-gh-tclinkenbeard
b2bbdf0d7f Prevent grvProxyServer from modifying ServerDBInfo object 2021-07-11 23:29:36 -07:00
sfc-gh-tclinkenbeard
79ff07a071 Added *BOOLEAN_PARAM macros to enforce documentation of boolean parameters 2021-07-02 15:04:42 -07:00
Renxuan Wang
00599069d2 Add a few ProxyStats fields.
1. Add counts of getRateInfo() and leaseTimeout;
2. Add SystemGRVQueueSize, DefaultGRVQueueSize and BatchGRVQueueSize.
2021-06-22 11:45:58 -07:00
RenxuanW
fe936207a9 Replace lower priority txn request when limit is hit. 2021-06-15 14:00:06 -07:00
Lukas Joswiak
4eca095644 Remove scoped lineage 2021-06-15 11:08:57 -07:00
RenxuanW
f19d256e0d Bug fix: grvLatencyBands should take "GRVLatencyBands" as name. 2021-06-14 17:13:22 -07:00
RenxuanW
29cb735881 Fix batch txn throttling. 2021-06-09 12:51:44 -07:00