121 Commits

Author SHA1 Message Date
Evan Tschannen
aed2d34bcb Merge branch 'master' into feature-proxy-load-balance
# Conflicts:
#	fdbclient/NativeAPI.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	flow/Knobs.cpp
2020-05-01 09:19:39 -07:00
Evan Tschannen
76fb345dd1 Merge branch 'master' into feature-tree-broadcast
# Conflicts:
#	fdbrpc/FailureMonitor.actor.cpp
2020-04-29 09:51:22 -07:00
Evan Tschannen
ba3e2af473 Merge commit '5288033bcfe40c3ade97c8bf2d04cf31b3f16cb1' into feature-tree-broadcast 2020-04-17 15:17:37 -07:00
Vishesh Yadav
8c8f23bff2 Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning 2020-04-16 00:45:35 -07:00
A.J. Beamon
d8690d31cd Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Net2.actor.cpp
2020-04-15 08:31:30 -07:00
A.J. Beamon
b1172417f5 Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/Net2.actor.cpp
2020-04-14 14:22:12 -07:00
A.J. Beamon
e104a2e3a6 Merge commit 'cf01233f28a2c42908656a39f458a4475c1d44a3' into run-loop-busy-profiler
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/NativeAPI.actor.h
#	fdbserver/fdbserver.actor.cpp
#	flow/Net2.actor.cpp
2020-04-14 14:02:24 -07:00
Evan Tschannen
07cc0a8d74 code cleanup 2020-04-10 17:02:11 -07:00
Vishesh Yadav
13447f439f fdbrpc: Add a constant to onFailedFor()
Since we mark an address as failed when a connection fails, this
patch adds a constant to compensate for the time needed to reconnect and
make sure the endpoint is actually down. This constant is equal to
FAILURE_MIN_DELAY, which was used by the centralized
FailureMonitoringClient that was removed earlier.
2020-04-08 19:34:40 -07:00
Vishesh Yadav
975e6b1d9a Merge remote-tracking branch 'apple/master' into task/issue-1017-slow-machine-poisoning
Removed merge conflict with old build system.
2020-04-08 19:25:13 -07:00
Vishesh Yadav
fdc1048f75 Add knob to turn off marking unstable connections 2020-04-03 15:53:00 -07:00
Vishesh Yadav
1d35f2ff5a Mark a connection as failed for X seconds if it closes too often 2020-04-03 15:53:00 -07:00
tclinken
884e92bb49 Atomically update dependent knobs 2020-04-01 15:18:49 -07:00
Evan Tschannen
e08f0201f1 merge release 6.2 into master 2020-03-17 12:51:47 -07:00
A.J. Beamon
d8cfabe73b Extend the allocation tracing disabling flag to cover more parts of trace logging as a precaution. Make it possible to disable via knob. 2020-03-16 13:59:31 -07:00
Evan Tschannen
303df197cf Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	bindings/c/test/mako/mako.c
#	documentation/sphinx/source/release-notes.rst
#	fdbbackup/backup.actor.cpp
#	fdbclient/NativeAPI.actor.cpp
#	fdbclient/NativeAPI.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/Knobs.cpp
#	fdbserver/Knobs.h
#	fdbserver/LogRouter.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/flow.vcxproj
#	flow/flow.vcxproj.filters
#	versions.target
2020-03-06 18:22:46 -08:00
Evan Tschannen
39050308ff lower accept batch size just to be conservative with the change 2020-03-05 18:17:49 -08:00
Evan Tschannen
820957025f accept connections in batches of 20 to improve performance 2020-03-04 14:24:57 -08:00
Evan Tschannen
924d335aa7 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	flow/Knobs.cpp
#	flow/Knobs.h
2020-02-25 18:25:19 -08:00
Evan Tschannen
65fbe0d0bc revert AcceptSocket priority change because of bad performance results 2020-02-21 19:22:14 -08:00
Evan Tschannen
96258b9809 Merge branch 'release-6.2'
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbcli/fdbcli.actor.cpp
#	fdbclient/ManagementAPI.actor.cpp
#	fdbrpc/FlowTransport.actor.cpp
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/DataDistribution.actor.h
#	fdbserver/DataDistributionQueue.actor.cpp
#	fdbserver/KeyValueStoreMemory.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/QuietDatabase.actor.cpp
#	fdbserver/SkipList.cpp
#	fdbserver/StorageMetrics.actor.h
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	fdbserver/workloads/KVStoreTest.actor.cpp
#	flow/CMakeLists.txt
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/genericactors.actor.cpp
#	flow/serialize.h
2020-02-21 19:09:16 -08:00
Evan Tschannen
f04e311a1e Merge commit 'b46d6e25e24993ab5a5f04091fd3235050b7cd09' into feature-boost-ssl
# Conflicts:
#	fdbserver/SimulatedCluster.actor.cpp
#	flow/Net2.actor.cpp
2020-02-20 17:36:38 -08:00
Evan Tschannen
08c318d28a re-added the connect lock in the fdbcli so that the timeout is not spent before a connection has been initiated (because of the handshake lock) 2020-02-20 10:43:34 -08:00
Evan Tschannen
69b5a1fbe3 more priority improvements 2020-02-20 10:11:43 -08:00
Evan Tschannen
fbd45963d8 The cluster controller waits until no new workers register for 1.0 before starting a bad recruitment 2020-02-19 16:48:30 -08:00
Alex Miller
fe78524bbc
Merge pull request #2678 from sears/networktest_perf
Add some tuning knobs to networktestclient; also, measure latency directly
2020-02-19 14:38:09 -08:00
Evan Tschannen
693e469003 Changed the handshake lock to a BoundedFlowLock, which will enforce that old handshakes complete before starting to initiate new handshakes 2020-02-14 16:49:52 -08:00
Russell Sears
7724c644e5 Add some tuning knobs to networktestclient; also, measure latency directly. 2020-02-13 13:11:54 -08:00
Andrew Noyes
1248d2b8b4 Remove USE_OBJECT_SERIALIZER knob 2020-02-12 10:41:52 -08:00
A.J. Beamon
abb75f7eb7 Add logging to indicate the time spent at each priority that exceeds some minimum busyness threshold 2020-02-07 14:34:24 -08:00
Evan Tschannen
69de430057 separate handshaking from connection to improve pipelining 2020-02-06 16:45:54 -08:00
Evan Tschannen
53d0867a17 limit the number of connections a process can attempt to establish in parallel 2020-02-04 18:15:10 -08:00
Evan Tschannen
4524831456
Merge pull request #2518 from vishesh/task/failmon-remove-server
FailureMonitoring: Server processes no longer need to talk to ClusterController
2020-02-03 17:22:50 -08:00
A.J. Beamon
182dac7cd5 Convert the slow task profiler into a run loop profiler that also logs when the run loop is 100% busy for a knob-configurable duration. 2020-01-28 12:09:37 -08:00
Evan Tschannen
78adbea834 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	flow/Knobs.h
#	versions.target
2020-01-21 21:38:19 -08:00
Evan Tschannen
afd3ec13ff added knobs 2020-01-21 18:58:34 -08:00
Vishesh Yadav
daef5f011a Merge remote-tracking branch 'apple/master' into task/failmon-remove-server 2020-01-21 13:20:15 -08:00
Evan Tschannen
3f9d9d8b84 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	cmake/FlowCommands.cmake
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/StorageServerInterface.h
#	fdbserver/DataDistributionTracker.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	fdbserver/fdbserver.actor.cpp
#	flow/Knobs.h
#	flow/Platform.cpp
#	versions.target
2020-01-16 18:37:47 -08:00
Evan Tschannen
0e916fdbed throttle client TLS errors longer than server errors so that when both happen simultaneously the server throttling will be disabled when the client makes its next attempt 2020-01-12 22:12:18 -08:00
Balachandar Namasivayam
741aa523e6 Establishing a TLS connection through the handshake process is expensive, and the fdbserver process can easily become saturated by repeated TLS handshakes when only a few hundred clients have bad certificates. Hence, throttle the number of handshakes done on the server per client IP if the client has a bad certificate. 2020-01-10 16:19:41 -08:00
Vishesh Yadav
598b2eaeb0 fdbrpc: Add warning when peer is unavailable for long time 2020-01-08 13:55:13 -08:00
Evan Tschannen
83ad9caf54 implemented a load balancing algorithm which evens out the number of requests processed by each proxy 2020-01-08 01:59:01 -08:00
mpilman
7a62d3b526 Changed failure monitor ping delay to 1 second 2019-12-11 11:23:24 -08:00
Evan Tschannen
3c769fcf60 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/ClusterController.actor.cpp
#	fdbserver/MasterProxyServer.actor.cpp
#	versions.target
2019-11-22 15:39:19 -08:00
Evan Tschannen
27cb299d84 simulation can sometimes randomly hang or throw connection_failed, instead of always doing one or the other 2019-11-21 16:24:18 -08:00
Evan Tschannen
2727b91c46 simulation tests network connections failing due to errors instead of just hanging 2019-11-21 12:33:07 -08:00
Meng Xu
c4d1e6e1a9 Trace:Severity:Include SevNoInfo to mute trace
Define SevFRMutationInfo to trace mutations in restore.
2019-11-04 16:18:40 -08:00
Evan Tschannen
3325980c03 Merge branch 'release-6.2'
# Conflicts:
#	CMakeLists.txt
#	documentation/sphinx/source/release-notes.rst
#	fdbserver/DataDistribution.actor.cpp
#	fdbserver/OldTLogServer_6_0.actor.cpp
#	fdbserver/TLogServer.actor.cpp
#	fdbserver/WorkerInterface.actor.h
#	fdbserver/worker.actor.cpp
#	versions.target
2019-10-24 17:38:15 -07:00
mpilman
325a8e4213 remove confusing USE_ODIRECT knob 2019-10-24 11:44:03 -07:00
mpilman
f23392ec5a Don't use O_DIRECT in EIO by default 2019-10-24 11:39:55 -07:00