Evan Tschannen
b61a911685
removed an ASSERT that was for debugging purposes, and increased the max commit latency, because it can be spuriously triggered by dummy transactions that take 5+ seconds each
2021-04-21 14:30:06 -07:00
Evan Tschannen
e18c9961b4
rewrote tlog recruitment logic so that it is deterministic, to prevent better master exists from triggering spuriously
2021-04-21 00:22:33 -07:00
Lukas Joswiak
c81e1e9519
Add sampling profiler frequency to global config
2021-04-19 22:46:57 -07:00
RenxuanW
4bf7218e8f
Merge pull request #4635 from RenxuanW/priority_logging
...
Log a warning when remote dc is disabled (priority < 0)
2021-04-15 17:00:41 -07:00
Lukas Joswiak
7de23918c0
Add comments, fix erase bug, make optimizations
2021-04-14 10:56:33 -07:00
Lukas Joswiak
c38ddf5eb7
Add comments
2021-04-14 10:56:33 -07:00
Lukas Joswiak
7ba7257cd2
Store global config data on heap
2021-04-14 10:56:33 -07:00
Lukas Joswiak
1c60653c2a
Add fix to conditionally set global config history
2021-04-14 10:56:33 -07:00
Lukas Joswiak
6de28dd916
clang-format
2021-04-14 10:56:33 -07:00
Lukas Joswiak
1260385965
Use object to wrap global configuration history
2021-04-14 10:56:32 -07:00
Lukas Joswiak
fb9a929780
Fix issue with freed memory being accessed
2021-04-14 10:56:32 -07:00
Lukas Joswiak
c3f68831af
Move existing ClientDBInfo variables to global configuration
2021-04-14 10:56:32 -07:00
Lukas Joswiak
7bb0b3d899
Use commit version for global configuration updates
...
FIXME: There is a memory issue where the underlying data for values set
in the `data` field of GlobalConfig will be freed shortly after being
set.
2021-04-14 10:56:32 -07:00
Lukas Joswiak
f1415412f1
Add global configuration framework implementation
2021-04-14 10:56:32 -07:00
Evan Tschannen
bd6db9ca7c
Update fdbserver/ClusterController.actor.cpp
...
Co-authored-by: Markus Pilman <markus.pilman@snowflake.com>
2021-04-13 15:13:45 -07:00
RenxuanW
7be8dab045
Change DcPriorityNegative to CCDcPriorityNegative
2021-04-08 16:00:37 -07:00
RenxuanW
738e7402f7
Log a warning when remote dc is disabled (priority < 0)
2021-04-08 15:36:52 -07:00
RenxuanW
f3d5fa4750
Revert "Log a warning when remote dc's priority doesn't match the original primary."
...
This reverts commit 1d701e8bcfcd01b31949f92e095fd405b4826cfd.
2021-04-08 15:19:43 -07:00
RenxuanW
1d701e8bcf
Log a warning when remote dc's priority doesn't match the original primary.
2021-04-08 14:38:37 -07:00
Evan Tschannen
a90c26f1d0
The master, proxies, and resolver all need to have the same machine class fitness function besides best fit to ensure recruitment is deterministic
...
if the first GRV proxy or resolver is forced to share a process, it should prefer to share with the commit proxy so that the commit proxy has more potential options it can share with
2021-04-08 14:29:12 -07:00
Evan Tschannen
5695a1816f
fix: requiredFitness was being set to one higher than the actual requirement
2021-04-07 21:31:14 -07:00
Evan Tschannen
1b1f73ea16
added comments
2021-04-07 20:40:42 -07:00
Evan Tschannen
4d8dd0b0a0
fix: desired must be greater than or equal to required
2021-04-07 20:32:45 -07:00
Evan Tschannen
14213b0151
code cleanup
2021-04-07 20:06:30 -07:00
Evan Tschannen
15e8b43961
rewrote getWorkersForTLogs to do a much better job of avoiding degraded processes and processes in the same DC as the cluster controller
2021-04-07 19:57:24 -07:00
Evan Tschannen
c27d82cecd
tlog recruitment used a degraded LogClass process over a non-degraded TransactionClass process
...
tlog recruitment would not use TransactionClass processes if it fulfilled the required amount with LogClass processes
Better master exists did not account for how many times a process had been used when comparing recruitments
Better master exists did not account for the fact that tlogs prefer to be in a different dc than the cluster controller
RoleFitness comparison did not properly order count before degraded or bestFit
betterCount was returning worstFit when worstIsDegraded did not match
backupWorker recruitment did not attempt to avoid sharing processes with other roles
If any of the commit_proxy, grv_proxy, or resolver are forced to share a process, allow the recruitment for all of them to share to an equal degree. This change allows BetterMasterExists to be refactored as a tuple comparison
2021-04-07 16:04:08 -07:00
Markus Pilman
50342b5082
fix a second low-latency bug
2021-03-29 13:31:26 -06:00
Markus Pilman
8555723b98
removing testing case
2021-03-26 15:46:54 -06:00
Markus Pilman
43bed1d9dd
Fix bug where betterMasterExist and recruitment disagree
2021-03-26 15:06:59 -06:00
Evan Tschannen
10b6b5d710
If the current configuration does not have a satellite fallback policy we do not care if the old configuration is in fallback mode
2021-03-23 13:02:31 -07:00
A.J. Beamon
99f3bb6d7d
Merge pull request #4509 from sfc-gh-etschannen/feature-bme-count
...
Do not trigger BetterMasterExists if it lowers the number of processes
2021-03-22 13:43:24 -07:00
Zhe Wu
15f3699e22
Add targeting DC ids in the tlog recruitment event trace.
2021-03-19 14:10:38 -07:00
Meng Xu
0cedef123b
Merge pull request #4518 from halfprice/zhewu/log-tlog-recruitment-failure-reason
...
Logging more detailed information during Tlog recruitment
2021-03-19 11:36:05 -07:00
Zhe Wu
58d9f47782
log fitness for excluded workers as well
2021-03-19 11:04:53 -07:00
Zhe Wu
4c00361f1c
Add comment for 'getWorkersForTlogs' method, and addressed TraceEvent formatting comments.
2021-03-18 21:33:43 -07:00
Zhe Wu
9419387295
Update logging field.
2021-03-18 14:53:43 -07:00
Evan Tschannen
2ff63f544e
Update fdbserver/ClusterController.actor.cpp
...
Co-authored-by: Lukas Joswiak <lukas.joswiak@snowflake.com>
2021-03-18 13:45:51 -07:00
Zhe Wu
451b14af09
Log detailed information when a worker is considered as unavailable by the cluster controller for TLog recruitment.
2021-03-18 12:18:03 -07:00
Zhe Wu
6468c5aed6
Fix string join
2021-03-17 23:46:11 -07:00
Zhe Wu
1205650a69
Log the dcid during TLog recruitment, so that we can tell in which DC the recruitment is happening
2021-03-17 23:22:42 -07:00
Evan Tschannen
9aeb69ca1c
added a comment
2021-03-16 14:19:23 -07:00
Evan Tschannen
d0f134c20e
added a comment
2021-03-16 13:17:56 -07:00
Evan Tschannen
2a272e525f
fix compile error
2021-03-16 12:21:21 -07:00
Evan Tschannen
10fd094920
Better master exists should not trigger if it will lower the total number of processes being recruited
2021-03-16 12:14:19 -07:00
FDB Formatster
df90cc89de
apply clang-format to *.c, *.cpp, *.h, *.hpp files
2021-03-10 10:18:07 -08:00
Evan Tschannen
346a4e3ecd
Merge branch 'release-6.3'
...
# Conflicts:
# fdbcli/fdbcli.actor.cpp
# fdbrpc/LoadBalance.actor.h
# fdbrpc/MultiInterface.h
# fdbserver/ClusterController.actor.cpp
# fdbserver/MasterProxyServer.actor.cpp
# fdbserver/masterserver.actor.cpp
2021-03-01 18:52:06 -08:00
Meng Xu
33eb1de00e
Add some comment to log system
...
and resolve review comment by deleting my questions.
2021-02-19 21:44:13 -08:00
Meng Xu
9122be4d81
Add comments to HA code and loadBalance code
2021-02-10 13:51:36 -08:00
Richard Chen
c77d9e4abe
merge conflicts
2020-12-02 21:53:19 +00:00
Markus Pilman
bdd3dbfa7d
remove duplicates
2020-11-10 14:01:07 -07:00