
Merge release-6.3 into master

A.J. Beamon 2020-05-22 09:25:32 -07:00
parent 696f8c451a
commit d128252e90
69 changed files with 1762 additions and 836 deletions
README.md
bindings/java/src/main/com/apple/foundationdb/tuple
cmake
contrib/transaction_profiling_analyzer
design
documentation/sphinx/source
fdbbackup
fdbcli
fdbclient
fdbmonitor
fdbrpc
fdbserver
fdbservice
flow
tests

@@ -20,7 +20,7 @@ Contributing to FoundationDB can be in contributions to the code base, sharing y
 ### Binary downloads
 
-Developers interested in using the FoundationDB store for an application can get started easily by downloading and installing a binary package. Please see the [downloads page](https://www.foundationdb.org/download/) for a list of available packages.
+Developers interested in using FoundationDB can get started by downloading and installing a binary package. Please see the [downloads page](https://www.foundationdb.org/download/) for a list of available packages.
 
 ### Compiling from source
 
@@ -28,44 +28,24 @@ Developers interested in using the FoundationDB store for an application can get
 Developers on an OS for which there is no binary package, or who would like
 to start hacking on the code, can get started by compiling from source.
 
-Currently there are two build systems: a collection of Makefiles and a
-CMake-based build system. Both of them should currently work for most users,
-and CMake should be the preferred choice as it will eventually become the only
-build system available.
+The official docker image for building is `foundationdb/foundationdb-build`. It has all dependencies installed. To build outside the official docker image you'll need at least these dependencies:
+
+1. Install cmake Version 3.13 or higher [CMake](https://cmake.org/)
+1. Install [Mono](http://www.mono-project.com/download/stable/)
+1. Install [Ninja](https://ninja-build.org/) (optional, but recommended)
 
 If compiling for local development, please set `-DUSE_WERROR=ON` in
 cmake. Our CI compiles with `-Werror` on, so this way you'll find out about
 compiler warnings that break the build earlier.
 
-## CMake
-
-To build with CMake, generally the following is required (works on Linux and
-Mac OS - for Windows see below):
+Once you have your dependencies, you can run cmake and then build:
 
 1. Check out this repository.
-1. Install cmake Version 3.13 or higher [CMake](https://cmake.org/)
-1. Download version 1.67 of [Boost](https://sourceforge.net/projects/boost/files/boost/1.67.0/).
-1. Unpack boost (you don't need to compile it)
-1. Install [Mono](http://www.mono-project.com/download/stable/).
-1. Install a [JDK](http://www.oracle.com/technetwork/java/javase/downloads/index.html). FoundationDB currently builds with Java 8.
 1. Create a build directory (you can have the build directory anywhere you
-   like): `mkdir build`
-1. `cd build`
-1. `cmake -GNinja -DBOOST_ROOT=<PATH_TO_BOOST> <PATH_TO_FOUNDATIONDB_DIRECTORY>`
-1. `ninja`
-
-CMake will try to find its dependencies. However, for LibreSSL this can often be
-problematic (especially if OpenSSL is installed as well). For that we recommend
-passing the argument `-DLibreSSL_ROOT` to cmake. So, for example, if
-LibreSSL is installed under `/usr/local/libressl-2.8.3`, you should call cmake like
-this:
-
-```
-cmake -GNinja -DLibreSSL_ROOT=/usr/local/libressl-2.8.3/ ../foundationdb
-```
-
-FoundationDB will build just fine without LibreSSL, however, the resulting
-binaries won't support TLS connections.
+   like). There is currently a directory in the source tree called build, but you should not use it. See [#3098](https://github.com/apple/foundationdb/issues/3098)
+1. `cd <PATH_TO_BUILD_DIRECTORY>`
+1. `cmake -G Ninja <PATH_TO_FOUNDATIONDB_DIRECTORY>`
+1. `ninja # If this crashes it probably ran out of memory. Try ninja -j1`
 
 ### Language Bindings
 
@@ -120,8 +100,7 @@ create a XCode-project with the following command:
 cmake -G Xcode -DOPEN_FOR_IDE=ON <FDB_SOURCE_DIRECTORY>
 ```
 
-You should create a second build-directory which you will use for building
-(probably with make or ninja) and debugging.
+You should create a second build-directory which you will use for building and debugging.
 
 #### FreeBSD
 
@@ -160,11 +139,8 @@ There are no special requirements for Linux. A docker image can be pulled from
 `foundationdb/foundationdb-build` that has all of FoundationDB's dependencies
 pre-installed, and is what the CI uses to build and test PRs.
 
-If you want to create a package you have to tell cmake what platform it is for.
-And then you can build by simply calling `cpack`. So for debian, call:
-
 ```
-cmake -GNinja <FDB_SOURCE_DIR>
+cmake -G Ninja <FDB_SOURCE_DIR>
 ninja
 cpack -G DEB
 ```
@@ -173,20 +149,15 @@ For RPM simply replace `DEB` with `RPM`.
 
 ### MacOS
 
-The build under MacOS will work the same way as on Linux. To get LibreSSL,
-boost, and ninja you can use [Homebrew](https://brew.sh/). LibreSSL will not be
-installed in `/usr/local`; instead it will stay in `/usr/local/Cellar`. So the
-cmake command will look something like this:
+The build under MacOS will work the same way as on Linux. To get boost and ninja you can use [Homebrew](https://brew.sh/).
 
 ```sh
-cmake -GNinja -DLibreSSL_ROOT=/usr/local/Cellar/libressl/2.8.3 <PATH_TO_FOUNDATIONDB_SOURCE>
+cmake -G Ninja <PATH_TO_FOUNDATIONDB_SOURCE>
 ```
 
-To generate an installable package, you have to call CMake with the corresponding
-arguments and then use cpack to generate the package:
+To generate an installable package, you can use cpack:
 
 ```sh
-cmake -GNinja <FDB_SOURCE_DIR>
 ninja
 cpack -G productbuild
 ```
@@ -198,15 +169,15 @@ that Visual Studio is used to compile.
 
 1. Install Visual Studio 2017 (Community Edition is tested)
 1. Install cmake Version 3.12 or higher [CMake](https://cmake.org/)
-1. Download version 1.67 of [Boost](https://sourceforge.net/projects/boost/files/boost/1.67.0/).
+1. Download version 1.72 of [Boost](https://dl.bintray.com/boostorg/release/1.72.0/source/boost_1_72_0.tar.bz2)
 1. Unpack boost (you don't need to compile it)
-1. Install [Mono](http://www.mono-project.com/download/stable/).
-1. Install a [JDK](http://www.oracle.com/technetwork/java/javase/downloads/index.html). FoundationDB currently builds with Java 8.
+1. Install [Mono](http://www.mono-project.com/download/stable/)
+1. (Optional) Install a [JDK](http://www.oracle.com/technetwork/java/javase/downloads/index.html). FoundationDB currently builds with Java 8
 1. Set `JAVA_HOME` to the unpacked location and JAVA_COMPILE to
    `$JAVA_HOME/bin/javac`.
-1. Install [Python](https://www.python.org/downloads/) if it is not already installed by Visual Studio.
+1. Install [Python](https://www.python.org/downloads/) if it is not already installed by Visual Studio
 1. (Optional) Install [WIX](http://wixtoolset.org/). Without it Visual Studio
-   won't build the Windows installer.
+   won't build the Windows installer
 1. Create a build directory (you can have the build directory anywhere you
    like): `mkdir build`
 1. `cd build`
@@ -218,22 +189,7 @@ that Visual Studio is used to compile.
 Studio will only know about the generated files. `msbuild` is located at
 `c:\Program Files (x86)\MSBuild\14.0\Bin\MSBuild.exe` for Visual Studio 15.
 
-If you want TLS support to be enabled under Windows you currently have to build
-and install LibreSSL yourself as the newer LibreSSL versions are not provided
-for download from the LibreSSL homepage. To build LibreSSL:
-
-1. Download and unpack libressl (>= 2.8.2)
-2. `cd libressl-2.8.2`
-3. `mkdir build`
-4. `cd build`
-5. `cmake -G "Visual Studio 15 2017 Win64" ..`
-6. Open the generated `LibreSSL.sln` in Visual Studio as administrator (this is
-   necessary for the install)
-7. Build the `INSTALL` project in `Release` mode
-
-This will install LibreSSL under `C:\Program Files\LibreSSL`. After that `cmake`
-will automatically find it and build with TLS support.
-
 If you installed WIX before running `cmake` you should find the
 `FDBInstaller.msi` in your build directory under `packaging/msi`.
+
+TODO: Re-add instructions for TLS support [#3022](https://github.com/apple/foundationdb/issues/3022)

@@ -1,5 +1,5 @@
 /*
- * ByteArrayUtil.java
+ * FastByteComparisons.java
  *
  * This source file is part of the FoundationDB open source project
  *

@@ -85,7 +85,17 @@ include(CheckFunctionExists)
 set(CMAKE_REQUIRED_INCLUDES stdlib.h malloc.h)
 set(CMAKE_REQUIRED_LIBRARIES c)
 set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
 set(CMAKE_C_STANDARD 11)
+set(CMAKE_C_STANDARD_REQUIRED ON)
+
+if(NOT WIN32)
+  include(CheckIncludeFile)
+  CHECK_INCLUDE_FILE("stdatomic.h" HAS_C11_ATOMICS)
+  if (NOT HAS_C11_ATOMICS)
+    message(FATAL_ERROR "C compiler does not support c11 atomics")
+  endif()
+endif()
 
 if(WIN32)
 # see: https://docs.microsoft.com/en-us/windows/desktop/WinProg/using-the-windows-headers

@@ -173,44 +173,45 @@ class Mutation(object):
 class BaseInfo(object):
-    def __init__(self, start_timestamp):
-        self.start_timestamp = start_timestamp
+    def __init__(self, bb, protocol_version):
+        self.start_timestamp = bb.get_double()
+        if protocol_version >= PROTOCOL_VERSION_6_3:
+            self.dc_id = bb.get_bytes_with_length()
 
 class GetVersionInfo(BaseInfo):
     def __init__(self, bb, protocol_version):
-        super().__init__(bb.get_double())
+        super().__init__(bb, protocol_version)
         self.latency = bb.get_double()
         if protocol_version >= PROTOCOL_VERSION_6_2:
             self.transaction_priority_type = bb.get_int()
         if protocol_version >= PROTOCOL_VERSION_6_3:
             self.read_version = bb.get_long()
 
 class GetInfo(BaseInfo):
-    def __init__(self, bb):
-        super().__init__(bb.get_double())
+    def __init__(self, bb, protocol_version):
+        super().__init__(bb, protocol_version)
         self.latency = bb.get_double()
         self.value_size = bb.get_int()
         self.key = bb.get_bytes_with_length()
 
 class GetRangeInfo(BaseInfo):
-    def __init__(self, bb):
-        super().__init__(bb.get_double())
+    def __init__(self, bb, protocol_version):
+        super().__init__(bb, protocol_version)
         self.latency = bb.get_double()
         self.range_size = bb.get_int()
         self.key_range = bb.get_key_range()
 
 class CommitInfo(BaseInfo):
-    def __init__(self, bb, full_output=True):
-        super().__init__(bb.get_double())
+    def __init__(self, bb, protocol_version, full_output=True):
+        super().__init__(bb, protocol_version)
         self.latency = bb.get_double()
         self.num_mutations = bb.get_int()
         self.commit_bytes = bb.get_int()
 
         if protocol_version >= PROTOCOL_VERSION_6_3:
             self.commit_version = bb.get_long()
         read_conflict_range = bb.get_key_range_list()
         if full_output:
             self.read_conflict_range = read_conflict_range
@@ -225,22 +226,22 @@ class CommitInfo(BaseInfo):
 class ErrorGetInfo(BaseInfo):
-    def __init__(self, bb):
-        super().__init__(bb.get_double())
+    def __init__(self, bb, protocol_version):
+        super().__init__(bb, protocol_version)
         self.error_code = bb.get_int()
         self.key = bb.get_bytes_with_length()
 
 class ErrorGetRangeInfo(BaseInfo):
-    def __init__(self, bb):
-        super().__init__(bb.get_double())
+    def __init__(self, bb, protocol_version):
+        super().__init__(bb, protocol_version)
         self.error_code = bb.get_int()
         self.key_range = bb.get_key_range()
 
 class ErrorCommitInfo(BaseInfo):
-    def __init__(self, bb, full_output=True):
-        super().__init__(bb.get_double())
+    def __init__(self, bb, protocol_version, full_output=True):
+        super().__init__(bb, protocol_version)
         self.error_code = bb.get_int()
 
         read_conflict_range = bb.get_key_range_list()
@@ -282,33 +283,33 @@ class ClientTransactionInfo:
             if (not type_filter or "get_version" in type_filter):
                 self.get_version = get_version
         elif event == 1:
-            get = GetInfo(bb)
+            get = GetInfo(bb, protocol_version)
             if (not type_filter or "get" in type_filter):
                 # because of the crappy json serializtion using __dict__ we have to set the list here otherwise
                 # it doesn't print
                 if not self.gets: self.gets = []
                 self.gets.append(get)
         elif event == 2:
-            get_range = GetRangeInfo(bb)
+            get_range = GetRangeInfo(bb, protocol_version)
             if (not type_filter or "get_range" in type_filter):
                 if not self.get_ranges: self.get_ranges = []
                 self.get_ranges.append(get_range)
         elif event == 3:
-            commit = CommitInfo(bb, full_output=full_output)
+            commit = CommitInfo(bb, protocol_version, full_output=full_output)
             if (not type_filter or "commit" in type_filter):
                 self.commit = commit
         elif event == 4:
-            error_get = ErrorGetInfo(bb)
+            error_get = ErrorGetInfo(bb, protocol_version)
             if (not type_filter or "error_gets" in type_filter):
                 if not self.error_gets: self.error_gets = []
                 self.error_gets.append(error_get)
         elif event == 5:
-            error_get_range = ErrorGetRangeInfo(bb)
+            error_get_range = ErrorGetRangeInfo(bb, protocol_version)
             if (not type_filter or "error_get_range" in type_filter):
                 if not self.error_get_ranges: self.error_get_ranges = []
                 self.error_get_ranges.append(error_get_range)
         elif event == 6:
-            error_commit = ErrorCommitInfo(bb, full_output=full_output)
+            error_commit = ErrorCommitInfo(bb, protocol_version, full_output=full_output)
             if (not type_filter or "error_commit" in type_filter):
                 if not self.error_commits: self.error_commits = []
                 self.error_commits.append(error_commit)
@@ -978,4 +979,3 @@ def main():
 
 if __name__ == "__main__":
     main()
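The pattern in these hunks — every event parser now takes the producer's protocol version, and `BaseInfo` consumes the new `dc_id` field only when that version is at least 6.3 — is what keeps older trace files readable. Here is a minimal self-contained sketch of the idea; this `ByteBuffer` is a simplified stand-in for the analyzer's own reader, and the version constant is an assumption based on the value used by the C++ event serializer later in this commit:

```python
import struct

# Assumption: mirrors PROTOCOL_VERSION_6_3 in the analyzer; the same value
# appears in the C++ client log event serializer below.
PROTOCOL_VERSION_6_3 = 0x0FDB00B063010001

class ByteBuffer(object):
    """Tiny stand-in for the analyzer's reader: little-endian doubles and ints,
    length-prefixed byte strings."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def get_double(self):
        (v,) = struct.unpack_from('<d', self.data, self.pos)
        self.pos += 8
        return v

    def get_int(self):
        (v,) = struct.unpack_from('<i', self.data, self.pos)
        self.pos += 4
        return v

    def get_bytes_with_length(self):
        n = self.get_int()
        v = self.data[self.pos:self.pos + n]
        self.pos += n
        return v

def parse_base(bb, protocol_version):
    # Same shape as BaseInfo.__init__ above: read the common prefix first,
    # then consume version-gated fields, so pre-6.3 trace files still parse.
    info = {'start_timestamp': bb.get_double()}
    if protocol_version >= PROTOCOL_VERSION_6_3:
        info['dc_id'] = bb.get_bytes_with_length()
    return info
```

For example, `parse_base(ByteBuffer(struct.pack('<d', 1.5)), 0)` decodes a pre-6.3 record, while a 6.3 record would carry the extra length-prefixed datacenter id.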

@@ -7,7 +7,7 @@ Currently, there are several client functions implemented as FDB calls by passin
 - **cluster_file_path**: `get("\xff\xff/cluster_file_path)`
 - **connection_string**: `get("\xff\xff/connection_string)`
 - **worker_interfaces**: `getRange("\xff\xff/worker_interfaces", <any_key>)`
-- **conflicting-keys**: `getRange("\xff\xff/transaction/conflicting_keys/", "\xff\xff/transaction/conflicting_keys/\xff")`
+- **conflicting_keys**: `getRange("\xff\xff/transaction/conflicting_keys/", "\xff\xff/transaction/conflicting_keys/\xff")`
 
 At present, implementations are hard-coded and the pain points are obvious:
 - **Maintainability**: As more features are added, the hard-coded snippets are hard to maintain
@@ -78,4 +78,21 @@ ASSERT(
 	res2[0].value == LiteralStringRef("London") &&
 	res2[1].value == LiteralStringRef("Washington, D.C.")
 );
 ```
+
+## Module
+We introduce the `module` concept after a [discussion](https://forums.foundationdb.org/t/versioning-of-special-key-space/2068) on cross-module reads on the special-key-space. By default, a range read that covers more than one module fails with a `special_keys_cross_module_read` error, and a range read that touches no module fails with a `special_keys_no_module_found` error. The motivation is to avoid unexpected blocking or errors in wide-scope range reads. In particular, suppose you write `getRange("A", "Z")` while all registered calls in `[A, Z)` happen locally, so your code has no error handling. If someone later registers a new call in `[A, Z)` that sometimes throws errors like `time_out()`, your original code is broken. A `module` is like a top-level directory whose calls are homogeneous, so cross-range reads inside one module are allowed by default while cross-module reads are forbidden. Right now, there are two modules available to use (a usage sketch follows this section):
+- TRANSACTION : `\xff\xff/transaction/, \xff\xff/transaction0`, all transaction-related information like *read_conflict_range*, *write_conflict_range*, and *conflicting_keys* (all served locally). Right now we have:
+    - `\xff\xff/transaction/conflicting_keys/, \xff\xff/transaction/conflicting_keys0` : conflicting keys that caused conflicts
+    - `\xff\xff/transaction/read_conflict_range/, \xff\xff/transaction/read_conflict_range0` : read conflict ranges of the transaction
+    - `\xff\xff/transaction/write_conflict_range/, \xff\xff/transaction/write_conflict_range0` : write conflict ranges of the transaction
+- METRICS : `\xff\xff/metrics/, \xff\xff/metrics0`, where all metrics like data-distribution metrics or health metrics are planned to live. These all make RPCs, so `time_out` errors may happen. Right now we have:
+    - `\xff\xff/metrics/data_distribution_stats, \xff\xff/metrics/data_distribution_stats0` : stats info about data distribution
+- WORKERINTERFACE : `\xff\xff/worker_interfaces/, \xff\xff/worker_interfaces0`, which is kept compatible with the previous implementation and thus should not be used to add new functions.
+
+In addition, all single-key ranges are formatted as modules and cannot be used again. In particular, you should call `get`, not `getRange`, on these keys. The existing ones are:
+- STATUSJSON : `\xff\xff/status/json`
+- CONNECTIONSTRING : `\xff\xff/connection_string`
+- CLUSTERFILEPATH : `\xff\xff/cluster_file_path`
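To make the module boundaries concrete, here is a rough Python sketch of reads that respect them, written against the public `fdb` bindings. The key names come from the lists above; the API version, cluster file, and value decoding are assumptions for illustration only.

```python
import fdb

fdb.api_version(630)  # assumption: client API level matching release-6.3
db = fdb.open()       # assumption: a default cluster file is configured

@fdb.transactional
def read_dd_stats(tr):
    # Range read confined to the METRICS module, so it is allowed by default.
    # Because METRICS is RPC-backed, it may still raise time_out.
    begin = b'\xff\xff/metrics/data_distribution_stats'
    end = b'\xff\xff/metrics/data_distribution_stats0'
    return [(kv.key, kv.value) for kv in tr.get_range(begin, end)]

@fdb.transactional
def read_status(tr):
    # STATUSJSON is a single-key module: call get, not getRange.
    return tr.get(b'\xff\xff/status/json')

# By contrast, tr.get_range(b'\xff\xff/metrics/', b'\xff\xff/transaction0')
# would span two modules and fail with special_keys_cross_module_read.
```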

@@ -284,8 +284,6 @@
 },
 "limiting_queue_bytes_storage_server":0,
 "worst_queue_bytes_storage_server":0,
-"limiting_version_lag_storage_server":0,
-"worst_version_lag_storage_server":0,
 "limiting_data_lag_storage_server":{
 	"versions":0,
 	"seconds":0.0

@@ -219,8 +219,9 @@ struct VersionedMutations {
  */
 struct DecodeProgress {
 	DecodeProgress() = default;
-	DecodeProgress(const LogFile& file, std::vector<std::tuple<Arena, Version, int32_t, StringRef>> values)
-	  : file(file), keyValues(values) {}
+	template <class U>
+	DecodeProgress(const LogFile& file, U &&values)
+	  : file(file), keyValues(std::forward<U>(values)) {}
 
 	// If there are no more mutations to pull from the file.
 	// However, we could have unfinished version in the buffer when EOF is true,
@@ -228,7 +229,7 @@ struct DecodeProgress {
 	// should call getUnfinishedBuffer() to get these left data.
 	bool finished() { return (eof && keyValues.empty()) || (leftover && !keyValues.empty()); }
 
-	std::vector<std::tuple<Arena, Version, int32_t, StringRef>>&& getUnfinishedBuffer() { return std::move(keyValues); }
+	std::vector<std::tuple<Arena, Version, int32_t, StringRef>>&& getUnfinishedBuffer() && { return std::move(keyValues); }
 
 	// Returns all mutations of the next version in a batch.
 	Future<VersionedMutations> getNextBatch() { return getNextBatchImpl(this); }
@@ -448,7 +449,7 @@ ACTOR Future<Void> decode_logs(DecodeParams params) {
 	for (; i < logs.size(); i++) {
 		if (logs[i].fileSize == 0) continue;
 
-		state DecodeProgress progress(logs[i], left);
+		state DecodeProgress progress(logs[i], std::move(left));
 		wait(progress.openFile(container));
 		while (!progress.finished()) {
 			VersionedMutations vms = wait(progress.getNextBatch());
@@ -456,7 +457,7 @@ ACTOR Future<Void> decode_logs(DecodeParams params) {
 				std::cout << vms.version << " " << m.toString() << "\n";
 			}
 		}
-		left = progress.getUnfinishedBuffer();
+		left = std::move(progress).getUnfinishedBuffer();
 		if (!left.empty()) {
 			TraceEvent("UnfinishedFile").detail("File", logs[i].fileName).detail("Q", left.size());
 		}

@@ -63,7 +63,7 @@ using std::endl;
 #endif
 #endif
 
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"
 
 #include "flow/SimpleOpt.h"
 #include "flow/actorcompiler.h" // This must be the last #include.
@@ -593,9 +593,7 @@ CSimpleOpt::SOption g_rgRestoreOptions[] = {
 	{ OPT_RESTORE_TIMESTAMP, "--timestamp", SO_REQ_SEP },
 	{ OPT_KNOB, "--knob_", SO_REQ_SEP },
 	{ OPT_RESTORECONTAINER,"-r", SO_REQ_SEP },
-	{ OPT_PREFIX_ADD, "-add_prefix", SO_REQ_SEP }, // TODO: Remove in 6.3
 	{ OPT_PREFIX_ADD, "--add_prefix", SO_REQ_SEP },
-	{ OPT_PREFIX_REMOVE, "-remove_prefix", SO_REQ_SEP }, // TODO: Remove in 6.3
 	{ OPT_PREFIX_REMOVE, "--remove_prefix", SO_REQ_SEP },
 	{ OPT_TAGNAME, "-t", SO_REQ_SEP },
 	{ OPT_TAGNAME, "--tagname", SO_REQ_SEP },
@@ -2709,7 +2707,13 @@ extern uint8_t *g_extra_memory;
 int main(int argc, char* argv[]) {
 	platformInit();
 	int status = FDB_EXIT_SUCCESS;
 
+	std::string commandLine;
+	for(int a=0; a<argc; a++) {
+		if (a) commandLine += ' ';
+		commandLine += argv[a];
+	}
+
 	try {
 #ifdef ALLOC_INSTRUMENTATION
@@ -3366,12 +3370,6 @@ int main(int argc, char* argv[]) {
 		args = NULL;
 	}
 
-	std::string commandLine;
-	for(int a=0; a<argc; a++) {
-		if (a) commandLine += ' ';
-		commandLine += argv[a];
-	}
-
 	delete FLOW_KNOBS;
 	FlowKnobs* flowKnobs = new FlowKnobs;
 	FLOW_KNOBS = flowKnobs;

@@ -49,7 +49,7 @@
 #include "fdbcli/linenoise/linenoise.h"
 #endif
 
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"
 
 #include "flow/actorcompiler.h" // This must be the last #include.
@@ -2423,11 +2423,11 @@ Reference<ReadYourWritesTransaction> getTransaction(Database db, Reference<ReadY
 	return tr;
 }
 
-std::string new_completion(const char *base, const char *name) {
+std::string newCompletion(const char *base, const char *name) {
 	return format("%s%s ", base, name);
 }
 
-void comp_generator(const char* text, bool help, std::vector<std::string>& lc) {
+void compGenerator(const char* text, bool help, std::vector<std::string>& lc) {
 	std::map<std::string, CommandHelp>::const_iterator iter;
 	int len = strlen(text);
@@ -2438,7 +2438,7 @@ void comp_generator(const char* text, bool help, std::vector<std::string>& lc) {
 	for (auto iter = helpMap.begin(); iter != helpMap.end(); ++iter) {
 		const char* name = (*iter).first.c_str();
 		if (!strncmp(name, text, len)) {
-			lc.push_back( new_completion(help ? "help " : "", name) );
+			lc.push_back( newCompletion(help ? "help " : "", name) );
 		}
 	}
@@ -2447,31 +2447,31 @@ void comp_generator(const char* text, bool help, std::vector<std::string>& lc) {
 			const char* name = *he;
 			he++;
 			if (!strncmp(name, text, len))
-				lc.push_back( new_completion("help ", name) );
+				lc.push_back( newCompletion("help ", name) );
 		}
 	}
 }
 
-void cmd_generator(const char* text, std::vector<std::string>& lc) {
-	comp_generator(text, false, lc);
+void cmdGenerator(const char* text, std::vector<std::string>& lc) {
+	compGenerator(text, false, lc);
 }
 
-void help_generator(const char* text, std::vector<std::string>& lc) {
-	comp_generator(text, true, lc);
+void helpGenerator(const char* text, std::vector<std::string>& lc) {
+	compGenerator(text, true, lc);
 }
 
-void option_generator(const char* text, const char *line, std::vector<std::string>& lc) {
+void optionGenerator(const char* text, const char *line, std::vector<std::string>& lc) {
 	int len = strlen(text);
 	for (auto iter = validOptions.begin(); iter != validOptions.end(); ++iter) {
 		const char* name = (*iter).c_str();
 		if (!strncmp(name, text, len)) {
-			lc.push_back( new_completion(line, name) );
+			lc.push_back( newCompletion(line, name) );
 		}
 	}
 }
 
-void array_generator(const char* text, const char *line, const char** options, std::vector<std::string>& lc) {
+void arrayGenerator(const char* text, const char *line, const char** options, std::vector<std::string>& lc) {
 	const char** iter = options;
 	int len = strlen(text);
@@ -2479,32 +2479,57 @@ void array_generator(const char* text, const char *line, const char** options, s
 		const char* name = *iter;
 		iter++;
 		if (!strncmp(name, text, len)) {
-			lc.push_back( new_completion(line, name) );
+			lc.push_back( newCompletion(line, name) );
 		}
 	}
 }
 
-void onoff_generator(const char* text, const char *line, std::vector<std::string>& lc) {
-	const char* opts[] = {"on", "off", NULL};
-	array_generator(text, line, opts, lc);
+void onOffGenerator(const char* text, const char *line, std::vector<std::string>& lc) {
+	const char* opts[] = {"on", "off", nullptr};
+	arrayGenerator(text, line, opts, lc);
 }
 
-void configure_generator(const char* text, const char *line, std::vector<std::string>& lc) {
-	const char* opts[] = {"new", "single", "double", "triple", "three_data_hall", "three_datacenter", "ssd", "ssd-1", "ssd-2", "memory", "memory-1", "memory-2", "memory-radixtree-beta", "proxies=", "logs=", "resolvers=", NULL};
-	array_generator(text, line, opts, lc);
+void configureGenerator(const char* text, const char *line, std::vector<std::string>& lc) {
+	const char* opts[] = {"new", "single", "double", "triple", "three_data_hall", "three_datacenter", "ssd", "ssd-1", "ssd-2", "memory", "memory-1", "memory-2", "memory-radixtree-beta", "proxies=", "logs=", "resolvers=", nullptr};
+	arrayGenerator(text, line, opts, lc);
 }
 
-void status_generator(const char* text, const char *line, std::vector<std::string>& lc) {
-	const char* opts[] = {"minimal", "details", "json", NULL};
-	array_generator(text, line, opts, lc);
+void statusGenerator(const char* text, const char *line, std::vector<std::string>& lc) {
+	const char* opts[] = {"minimal", "details", "json", nullptr};
+	arrayGenerator(text, line, opts, lc);
 }
 
-void kill_generator(const char* text, const char *line, std::vector<std::string>& lc) {
-	const char* opts[] = {"all", "list", NULL};
-	array_generator(text, line, opts, lc);
+void killGenerator(const char* text, const char *line, std::vector<std::string>& lc) {
+	const char* opts[] = {"all", "list", nullptr};
+	arrayGenerator(text, line, opts, lc);
 }
 
-void fdbcli_comp_cmd(std::string const& text, std::vector<std::string>& lc) {
+void throttleGenerator(const char* text, const char *line, std::vector<std::string>& lc, std::vector<StringRef> const& tokens) {
+	if(tokens.size() == 1) {
+		const char* opts[] = { "on tag", "off", "enable auto", "disable auto", "list", nullptr };
+		arrayGenerator(text, line, opts, lc);
+	}
+	else if(tokens.size() >= 2 && tokencmp(tokens[1], "on")) {
+		if(tokens.size() == 2) {
+			const char* opts[] = { "tag", nullptr };
+			arrayGenerator(text, line, opts, lc);
+		}
+		else if(tokens.size() == 6) {
+			const char* opts[] = { "default", "immediate", "batch", nullptr };
+			arrayGenerator(text, line, opts, lc);
+		}
+	}
+	else if(tokens.size() >= 2 && tokencmp(tokens[1], "off") && !tokencmp(tokens[tokens.size()-1], "tag")) {
+		const char* opts[] = { "all", "auto", "manual", "tag", "default", "immediate", "batch", nullptr };
+		arrayGenerator(text, line, opts, lc);
+	}
+	else if(tokens.size() == 2 && (tokencmp(tokens[1], "enable") || tokencmp(tokens[1], "disable"))) {
+		const char* opts[] = { "auto", nullptr };
+		arrayGenerator(text, line, opts, lc);
+	}
+}
+
+void fdbcliCompCmd(std::string const& text, std::vector<std::string>& lc) {
 	bool err, partial;
 	std::string whole_line = text;
 	auto parsed = parseLine(whole_line, err, partial);
@@ -2531,37 +2556,102 @@ void fdbcli_comp_cmd(std::string const& text, std::vector<std::string>& lc) {
 	// printf("final text (%d tokens): `%s' & `%s'\n", count, base_input.c_str(), ntext.c_str());
 
 	if (!count) {
-		cmd_generator(ntext.c_str(), lc);
+		cmdGenerator(ntext.c_str(), lc);
 		return;
 	}
 
 	if (tokencmp(tokens[0], "help") && count == 1) {
-		help_generator(ntext.c_str(), lc);
+		helpGenerator(ntext.c_str(), lc);
 		return;
 	}
 
 	if (tokencmp(tokens[0], "option")) {
 		if (count == 1)
-			onoff_generator(ntext.c_str(), base_input.c_str(), lc);
+			onOffGenerator(ntext.c_str(), base_input.c_str(), lc);
 		if (count == 2)
-			option_generator(ntext.c_str(), base_input.c_str(), lc);
+			optionGenerator(ntext.c_str(), base_input.c_str(), lc);
 	}
 
 	if (tokencmp(tokens[0], "writemode") && count == 1) {
-		onoff_generator(ntext.c_str(), base_input.c_str(), lc);
+		onOffGenerator(ntext.c_str(), base_input.c_str(), lc);
 	}
 
 	if (tokencmp(tokens[0], "configure")) {
-		configure_generator(ntext.c_str(), base_input.c_str(), lc);
+		configureGenerator(ntext.c_str(), base_input.c_str(), lc);
 	}
 
 	if (tokencmp(tokens[0], "status") && count == 1) {
-		status_generator(ntext.c_str(), base_input.c_str(), lc);
+		statusGenerator(ntext.c_str(), base_input.c_str(), lc);
 	}
 
 	if (tokencmp(tokens[0], "kill") && count == 1) {
-		kill_generator(ntext.c_str(), base_input.c_str(), lc);
+		killGenerator(ntext.c_str(), base_input.c_str(), lc);
 	}
+
+	if (tokencmp(tokens[0], "throttle")) {
+		throttleGenerator(ntext.c_str(), base_input.c_str(), lc, tokens);
+	}
 }
+
+std::vector<const char*> throttleHintGenerator(std::vector<StringRef> const& tokens, bool inArgument) {
+	if(tokens.size() == 1) {
+		return { "<on|off|enable auto|disable auto|list>", "[ARGS]" };
+	}
+	else if(tokencmp(tokens[1], "on")) {
+		std::vector<const char*> opts = { "tag", "<TAG>", "[RATE]", "[DURATION]", "[default|immediate|batch]" };
+		if(tokens.size() == 2) {
+			return opts;
+		}
+		else if(((tokens.size() == 3 && inArgument) || tokencmp(tokens[2], "tag")) && tokens.size() < 7) {
+			return std::vector<const char*>(opts.begin() + tokens.size() - 2, opts.end());
+		}
+	}
+	else if(tokencmp(tokens[1], "off")) {
+		if(tokencmp(tokens[tokens.size()-1], "tag")) {
+			return { "<TAG>" };
+		}
+		else {
+			bool hasType = false;
+			bool hasTag = false;
+			bool hasPriority = false;
+			for(int i = 2; i < tokens.size(); ++i) {
+				if(tokencmp(tokens[i], "all") || tokencmp(tokens[i], "auto") || tokencmp(tokens[i], "manual")) {
+					hasType = true;
+				}
+				else if(tokencmp(tokens[i], "default") || tokencmp(tokens[i], "immediate") || tokencmp(tokens[i], "batch")) {
+					hasPriority = true;
+				}
+				else if(tokencmp(tokens[i], "tag")) {
+					hasTag = true;
+					++i;
+				}
+				else {
+					return {};
+				}
+			}
+
+			std::vector<const char*> options;
+			if(!hasType) {
+				options.push_back("[all|auto|manual]");
+			}
+			if(!hasTag) {
+				options.push_back("[tag <TAG>]");
+			}
+			if(!hasPriority) {
+				options.push_back("[default|immediate|batch]");
+			}
+
+			return options;
+		}
+	}
+	else if((tokencmp(tokens[1], "enable") || tokencmp(tokens[1], "disable")) && tokens.size() == 2) {
+		return { "auto" };
+	}
+	else if(tokens.size() == 2 && inArgument) {
+		return { "[ARGS]" };
+	}
+
+	return std::vector<const char*>();
+}
 
 void LogCommand(std::string line, UID randomID, std::string errMsg) {
@@ -3919,7 +4009,7 @@ ACTOR Future<int> cli(CLIOptions opt, LineNoise* plinenoise) {
 							(int)(itr->tpsRate),
 							std::min((int)(itr->expirationTime-now()), (int)(itr->initialDuration)),
 							transactionPriorityToString(itr->priority, false),
-							itr->autoThrottled ? "auto" : "manual",
+							itr->throttleType == TagThrottleType::AUTO ? "auto" : "manual",
 							itr->tag.toString().c_str());
 					}
 				}
@@ -3932,19 +4022,21 @@ ACTOR Future<int> cli(CLIOptions opt, LineNoise* plinenoise) {
 					printf("There are no throttled tags\n");
 				}
 			}
-			else if(tokencmp(tokens[1], "on") && tokens.size() <=6) {
-				if(tokens.size() < 4 || !tokencmp(tokens[2], "tag")) {
-					printf("Usage: throttle on tag <TAG> [RATE] [DURATION]\n");
+			else if(tokencmp(tokens[1], "on")) {
+				if(tokens.size() < 4 || !tokencmp(tokens[2], "tag") || tokens.size() > 7) {
+					printf("Usage: throttle on tag <TAG> [RATE] [DURATION] [PRIORITY]\n");
 					printf("\n");
 					printf("Enables throttling for transactions with the specified tag.\n");
 					printf("An optional transactions per second rate can be specified (default 0).\n");
 					printf("An optional duration can be specified, which must include a time suffix (s, m, h, d) (default 1h).\n");
+					printf("An optional priority can be specified. Choices are `default', `immediate', and `batch' (default `default').\n");
 					is_error = true;
 					continue;
 				}
 
 				double tpsRate = 0.0;
 				uint64_t duration = 3600;
+				TransactionPriority priority = TransactionPriority::DEFAULT;
 
 				if(tokens.size() >= 5) {
 					char *end;
@@ -3968,70 +4060,145 @@ ACTOR Future<int> cli(CLIOptions opt, LineNoise* plinenoise) {
 						continue;
 					}
 					duration = parsedDuration.get();
-				}
 
-				if(duration == 0) {
-					printf("ERROR: throttle duration cannot be 0\n");
-					is_error = true;
-					continue;
+					if(duration == 0) {
+						printf("ERROR: throttle duration cannot be 0\n");
+						is_error = true;
+						continue;
+					}
+				}
+				if(tokens.size() == 7) {
+					if(tokens[6] == LiteralStringRef("default")) {
+						priority = TransactionPriority::DEFAULT;
+					}
+					else if(tokens[6] == LiteralStringRef("immediate")) {
+						priority = TransactionPriority::IMMEDIATE;
+					}
+					else if(tokens[6] == LiteralStringRef("batch")) {
+						priority = TransactionPriority::BATCH;
+					}
+					else {
+						printf("ERROR: unrecognized priority `%s'. Must be one of `default',\n  `immediate', or `batch'.\n", tokens[6].toString().c_str());
+						is_error = true;
+						continue;
+					}
 				}
 
 				TagSet tags;
 				tags.addTag(tokens[3]);
-				wait(ThrottleApi::throttleTags(db, tags, tpsRate, duration, false, TransactionPriority::DEFAULT));
+				wait(ThrottleApi::throttleTags(db, tags, tpsRate, duration, TagThrottleType::MANUAL, priority));
 				printf("Tag `%s' has been throttled\n", tokens[3].toString().c_str());
 			}
 			else if(tokencmp(tokens[1], "off")) {
-				if(tokencmp(tokens[2], "tag") && tokens.size() == 4) {
-					TagSet tags;
-					tags.addTag(tokens[3]);
-					bool success = wait(ThrottleApi::unthrottleTags(db, tags, false, TransactionPriority::DEFAULT)); // TODO: Allow targeting priority and auto/manual
-					if(success) {
-						printf("Unthrottled tag `%s'\n", tokens[3].toString().c_str());
-					}
-					else {
-						printf("Tag `%s' was not throttled\n", tokens[3].toString().c_str());
-					}
-				}
-				else if(tokencmp(tokens[2], "all") && tokens.size() == 3) {
-					bool unthrottled = wait(ThrottleApi::unthrottleAll(db));
-					if(unthrottled) {
-						printf("Unthrottled all tags\n");
-					}
-					else {
-						printf("There were no tags being throttled\n");
-					}
-				}
-				else if(tokencmp(tokens[2], "auto") && tokens.size() == 3) {
-					bool unthrottled = wait(ThrottleApi::unthrottleAuto(db));
-					if(unthrottled) {
-						printf("Unthrottled all auto-throttled tags\n");
-					}
-					else {
-						printf("There were no tags being throttled\n");
-					}
-				}
-				else if(tokencmp(tokens[2], "manual") && tokens.size() == 3) {
-					bool unthrottled = wait(ThrottleApi::unthrottleManual(db));
-					if(unthrottled) {
-						printf("Unthrottled all manually throttled tags\n");
-					}
-					else {
-						printf("There were no tags being throttled\n");
-					}
+				int nextIndex = 2;
+				TagSet tags;
+				bool throttleTypeSpecified = false;
+				Optional<TagThrottleType> throttleType = TagThrottleType::MANUAL;
+				Optional<TransactionPriority> priority;
+
+				if(tokens.size() == 2) {
+					is_error = true;
+				}
+
+				while(nextIndex < tokens.size() && !is_error) {
+					if(tokencmp(tokens[nextIndex], "all")) {
+						if(throttleTypeSpecified) {
+							is_error = true;
+							continue;
+						}
+						throttleTypeSpecified = true;
+						throttleType = Optional<TagThrottleType>();
+						++nextIndex;
+					}
+					else if(tokencmp(tokens[nextIndex], "auto")) {
+						if(throttleTypeSpecified) {
+							is_error = true;
+							continue;
+						}
+						throttleTypeSpecified = true;
+						throttleType = TagThrottleType::AUTO;
+						++nextIndex;
+					}
+					else if(tokencmp(tokens[nextIndex], "manual")) {
+						if(throttleTypeSpecified) {
+							is_error = true;
+							continue;
+						}
+						throttleTypeSpecified = true;
+						throttleType = TagThrottleType::MANUAL;
+						++nextIndex;
+					}
+					else if(tokencmp(tokens[nextIndex], "default")) {
+						if(priority.present()) {
+							is_error = true;
+							continue;
+						}
+						priority = TransactionPriority::DEFAULT;
+						++nextIndex;
+					}
+					else if(tokencmp(tokens[nextIndex], "immediate")) {
+						if(priority.present()) {
+							is_error = true;
+							continue;
+						}
+						priority = TransactionPriority::IMMEDIATE;
+						++nextIndex;
+					}
+					else if(tokencmp(tokens[nextIndex], "batch")) {
+						if(priority.present()) {
+							is_error = true;
+							continue;
+						}
+						priority = TransactionPriority::BATCH;
+						++nextIndex;
+					}
+					else if(tokencmp(tokens[nextIndex], "tag")) {
+						if(tags.size() > 0 || nextIndex == tokens.size()-1) {
+							is_error = true;
+							continue;
+						}
+						tags.addTag(tokens[nextIndex+1]);
+						nextIndex += 2;
+					}
+				}
+
+				if(!is_error) {
+					state const char *throttleTypeString = !throttleType.present() ? "" : (throttleType.get() == TagThrottleType::AUTO ? "auto-" : "manually ");
+					state std::string priorityString = priority.present() ? format(" at %s priority", transactionPriorityToString(priority.get(), false)) : "";
+
+					if(tags.size() > 0) {
+						bool success = wait(ThrottleApi::unthrottleTags(db, tags, throttleType, priority));
+						if(success) {
+							printf("Unthrottled tag `%s'%s\n", tokens[3].toString().c_str(), priorityString.c_str());
+						}
+						else {
+							printf("Tag `%s' was not %sthrottled%s\n", tokens[3].toString().c_str(), throttleTypeString, priorityString.c_str());
+						}
+					}
+					else {
+						bool unthrottled = wait(ThrottleApi::unthrottleAll(db, throttleType, priority));
+						if(unthrottled) {
+							printf("Unthrottled all %sthrottled tags%s\n", throttleTypeString, priorityString.c_str());
+						}
+						else {
+							printf("There were no tags being %sthrottled%s\n", throttleTypeString, priorityString.c_str());
+						}
+					}
 				}
 				else {
-					printf("Usage: throttle off <all|auto|manual|tag> [TAG]\n");
+					printf("Usage: throttle off [all|auto|manual] [tag <TAG>] [PRIORITY]\n");
 					printf("\n");
-					printf("Disables throttling for the specified tag(s).\n");
-					printf("Use `all' to turn off all tag throttles, `auto' to turn off throttles created by\n");
-					printf("the cluster, and `manual' to turn off throttles created manually. Use `tag <TAG>'\n");
-					printf("to turn off throttles for a specific tag\n");
-					is_error = true;
+					printf("Disables throttling for throttles matching the specified filters. At least one filter must be used.\n\n");
+					printf("An optional qualifier `all', `auto', or `manual' can be used to specify the type of throttle\n");
+					printf("affected. `all' targets all throttles, `auto' targets those created by the cluster, and\n");
+					printf("`manual' targets those created manually (default `manual').\n\n");
+					printf("The `tag' filter can be used to turn off only a specific tag.\n\n");
+					printf("The priority filter can be used to turn off only throttles at specific priorities. Choices are\n");
+					printf("`default', `immediate', or `batch'. By default, all priorities are targeted.\n");
 				}
 			}
-			else if((tokencmp(tokens[1], "enable") || tokencmp(tokens[1], "disable")) && tokens.size() == 3 && tokencmp(tokens[2], "auto")) {
+			else if(tokencmp(tokens[1], "enable") || tokencmp(tokens[1], "disable")) {
+				if(tokens.size() != 3 || !tokencmp(tokens[2], "auto")) {
					printf("Usage: throttle <enable|disable> auto\n");
					printf("\n");
@@ -4077,7 +4244,7 @@ ACTOR Future<int> cli(CLIOptions opt, LineNoise* plinenoise) {
 ACTOR Future<int> runCli(CLIOptions opt) {
 	state LineNoise linenoise(
 		[](std::string const& line, std::vector<std::string>& completions) {
-			fdbcli_comp_cmd(line, completions);
+			fdbcliCompCmd(line, completions);
 		},
 		[enabled=opt.cliHints](std::string const& line)->LineNoise::Hint {
 			if (!enabled) {
@@ -4098,18 +4265,32 @@ ACTOR Future<int> runCli(CLIOptions opt) {
 			// being entered.
 			if (error && line.back() != '\\') return LineNoise::Hint(std::string(" {malformed escape sequence}"), 90, false);
 
-			auto iter = helpMap.find(command.toString());
-			if (iter != helpMap.end()) {
-				std::string helpLine = iter->second.usage;
-				std::vector<std::vector<StringRef>> parsedHelp = parseLine(helpLine, error, partial);
-				std::string hintLine = (*(line.end() - 1) == ' ' ? "" : " ");
-				for (int i = finishedParameters; i < parsedHelp.back().size(); i++) {
-					hintLine = hintLine + parsedHelp.back()[i].toString() + " ";
+			bool inArgument = *(line.end() - 1) != ' ';
+			std::string hintLine = inArgument ? " " : "";
+			if(tokencmp(command, "throttle")) {
+				std::vector<const char*> hintItems = throttleHintGenerator(parsed.back(), inArgument);
+				if(hintItems.empty()) {
+					return LineNoise::Hint();
+				}
+				for(auto item : hintItems) {
+					hintLine = hintLine + item + " ";
 				}
-				return LineNoise::Hint(hintLine, 90, false);
-			} else {
-				return LineNoise::Hint();
 			}
+			else {
+				auto iter = helpMap.find(command.toString());
+				if(iter != helpMap.end()) {
+					std::string helpLine = iter->second.usage;
+					std::vector<std::vector<StringRef>> parsedHelp = parseLine(helpLine, error, partial);
+					for (int i = finishedParameters; i < parsedHelp.back().size(); i++) {
+						hintLine = hintLine + parsedHelp.back()[i].toString() + " ";
+					}
+				}
+				else {
+					return LineNoise::Hint();
+				}
+			}
+
+			return LineNoise::Hint(hintLine, 90, false);
 		},
 		1000,
 		false);
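To see what the reworked grammar accepts end to end, here is a hedged Python sketch driving fdbcli non-interactively with its standard `-C`/`--exec` flags; the cluster file path, tag name, rate, and duration are invented for illustration, and the expected responses are the printf strings from the code above:

```python
import subprocess

# Hypothetical example of the new `throttle on` grammar, which now accepts an
# optional priority (`default', `immediate', or `batch') as a final token.
cmd = "throttle on tag myTag 50 1h batch"
out = subprocess.run(
    ["fdbcli", "-C", "/etc/foundationdb/fdb.cluster", "--exec", cmd],
    capture_output=True, text=True)
print(out.stdout)  # expected on success: Tag `myTag' has been throttled

# The reworked `throttle off` takes filters instead of a single mode, e.g.:
#   throttle off tag myTag batch  -> Unthrottled tag `myTag' at batch priority
#   throttle off all              -> Unthrottled all throttled tags
```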

@@ -44,19 +44,28 @@ namespace FdbClientLogEvents {
 	};
 
 	struct Event {
-		Event(EventType t, double ts) : type(t), startTs(ts) { }
+		Event(EventType t, double ts, const Optional<Standalone<StringRef>> &dc) : type(t), startTs(ts) {
+			if (dc.present())
+				dcId = dc.get();
+		}
 		Event() { }
 
-		template <typename Ar> Ar& serialize(Ar &ar) { return serializer(ar, type, startTs); }
+		template <typename Ar> Ar& serialize(Ar &ar) {
+			if (ar.protocolVersion().version() >= (uint64_t) 0x0FDB00B063010001LL) {
+				return serializer(ar, type, startTs, dcId);
+			} else {
+				return serializer(ar, type, startTs);
+			}
+		}
 
 		EventType type{ EVENTTYPEEND };
 		double startTs{ 0 };
+		Key dcId{};
 
 		void logEvent(std::string id, int maxFieldLength) const {}
 	};
 
 	struct EventGetVersion : public Event {
-		EventGetVersion(double ts, double lat) : Event(GET_VERSION_LATENCY, ts), latency(lat) { }
 		EventGetVersion() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -77,22 +86,6 @@ namespace FdbClientLogEvents {
 	// Version V2 of EventGetVersion starting at 6.2
 	struct EventGetVersion_V2 : public Event {
-		EventGetVersion_V2(double ts, double lat, TransactionPriority priority) : Event(GET_VERSION_LATENCY, ts), latency(lat) {
-			switch(priority) {
-				// Unfortunately, the enum serialized here disagrees with the enum used elsewhere for the values used by each priority
-				case TransactionPriority::IMMEDIATE:
-					priorityType = PRIORITY_IMMEDIATE;
-					break;
-				case TransactionPriority::DEFAULT:
-					priorityType = PRIORITY_DEFAULT;
-					break;
-				case TransactionPriority::BATCH:
-					priorityType = PRIORITY_BATCH;
-					break;
-				default:
-					ASSERT(false);
-			}
-		}
 		EventGetVersion_V2() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -115,7 +108,7 @@ namespace FdbClientLogEvents {
 	// Version V3 of EventGetVersion starting at 6.3
 	struct EventGetVersion_V3 : public Event {
-		EventGetVersion_V3(double ts, double lat, TransactionPriority priority, Version version) : Event(GET_VERSION_LATENCY, ts), latency(lat), readVersion(version) {
+		EventGetVersion_V3(double ts, const Optional<Standalone<StringRef>> &dcId, double lat, TransactionPriority priority, Version version) : Event(GET_VERSION_LATENCY, ts, dcId), latency(lat), readVersion(version) {
 			switch(priority) {
 				// Unfortunately, the enum serialized here disagrees with the enum used elsewhere for the values used by each priority
 				case TransactionPriority::IMMEDIATE:
@@ -154,7 +147,7 @@ namespace FdbClientLogEvents {
 	};
 
 	struct EventGet : public Event {
-		EventGet(double ts, double lat, int size, const KeyRef &in_key) : Event(GET_LATENCY, ts), latency(lat), valueSize(size), key(in_key) { }
+		EventGet(double ts, const Optional<Standalone<StringRef>> &dcId, double lat, int size, const KeyRef &in_key) : Event(GET_LATENCY, ts, dcId), latency(lat), valueSize(size), key(in_key) { }
 		EventGet() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -180,7 +173,7 @@ namespace FdbClientLogEvents {
 	};
 
 	struct EventGetRange : public Event {
-		EventGetRange(double ts, double lat, int size, const KeyRef &start_key, const KeyRef & end_key) : Event(GET_RANGE_LATENCY, ts), latency(lat), rangeSize(size), startKey(start_key), endKey(end_key) { }
+		EventGetRange(double ts, const Optional<Standalone<StringRef>> &dcId, double lat, int size, const KeyRef &start_key, const KeyRef & end_key) : Event(GET_RANGE_LATENCY, ts, dcId), latency(lat), rangeSize(size), startKey(start_key), endKey(end_key) { }
 		EventGetRange() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -208,7 +201,6 @@ namespace FdbClientLogEvents {
 	};
 
 	struct EventCommit : public Event {
-		EventCommit(double ts, double lat, int mut, int bytes, const CommitTransactionRequest &commit_req) : Event(COMMIT_LATENCY, ts), latency(lat), numMutations(mut), commitBytes(bytes), req(commit_req) { }
 		EventCommit() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -260,8 +252,8 @@ namespace FdbClientLogEvents {
 	// Version V2 of EventGetVersion starting at 6.3
 	struct EventCommit_V2 : public Event {
-		EventCommit_V2(double ts, double lat, int mut, int bytes, Version version, const CommitTransactionRequest &commit_req)
-			: Event(COMMIT_LATENCY, ts), latency(lat), numMutations(mut), commitBytes(bytes), commitVersion(version), req(commit_req) { }
+		EventCommit_V2(double ts, const Optional<Standalone<StringRef>> &dcId, double lat, int mut, int bytes, Version version, const CommitTransactionRequest &commit_req)
+			: Event(COMMIT_LATENCY, ts, dcId), latency(lat), numMutations(mut), commitBytes(bytes), commitVersion(version), req(commit_req) { }
 		EventCommit_V2() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -314,7 +306,7 @@ namespace FdbClientLogEvents {
 	};
 
 	struct EventGetError : public Event {
-		EventGetError(double ts, int err_code, const KeyRef &in_key) : Event(ERROR_GET, ts), errCode(err_code), key(in_key) { }
+		EventGetError(double ts, const Optional<Standalone<StringRef>> &dcId, int err_code, const KeyRef &in_key) : Event(ERROR_GET, ts, dcId), errCode(err_code), key(in_key) { }
 		EventGetError() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -338,7 +330,7 @@ namespace FdbClientLogEvents {
 	};
 
 	struct EventGetRangeError : public Event {
-		EventGetRangeError(double ts, int err_code, const KeyRef &start_key, const KeyRef & end_key) : Event(ERROR_GET_RANGE, ts), errCode(err_code), startKey(start_key), endKey(end_key) { }
+		EventGetRangeError(double ts, const Optional<Standalone<StringRef>> &dcId, int err_code, const KeyRef &start_key, const KeyRef & end_key) : Event(ERROR_GET_RANGE, ts, dcId), errCode(err_code), startKey(start_key), endKey(end_key) { }
 		EventGetRangeError() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {
@@ -364,7 +356,7 @@ namespace FdbClientLogEvents {
 	};
 
 	struct EventCommitError : public Event {
-		EventCommitError(double ts, int err_code, const CommitTransactionRequest &commit_req) : Event(ERROR_COMMIT, ts), errCode(err_code), req(commit_req) { }
+		EventCommitError(double ts, const Optional<Standalone<StringRef>> &dcId, int err_code, const CommitTransactionRequest &commit_req) : Event(ERROR_COMMIT, ts, dcId), errCode(err_code), req(commit_req) { }
 		EventCommitError() { }
 
 		template <typename Ar> Ar& serialize(Ar &ar) {

@@ -1019,6 +1019,21 @@ struct HealthMetrics {
 	}
 };
 
+struct DDMetricsRef {
+	int64_t shardBytes;
+	KeyRef beginKey;
+
+	DDMetricsRef() : shardBytes(0) {}
+	DDMetricsRef(int64_t bytes, KeyRef begin) : shardBytes(bytes), beginKey(begin) {}
+	DDMetricsRef(Arena& a, const DDMetricsRef& copyFrom)
+	  : shardBytes(copyFrom.shardBytes), beginKey(a, copyFrom.beginKey) {}
+
+	template <class Ar>
+	void serialize(Ar& ar) {
+		serializer(ar, shardBytes, beginKey);
+	}
+};
+
 struct WorkerBackupStatus {
 	LogEpoch epoch;
 	Version version;

@@ -1,28 +0,0 @@
-/*
- * IncludeVersions.h
- *
- * This source file is part of the FoundationDB open source project
- *
- * Copyright 2013-2020 Apple Inc. and the FoundationDB project authors
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-// This is a simple header to isolate the stupidity that results out of two
-// build systems and versions.h include directives
-
-#if defined(CMAKE_BUILD)
-#	include "fdbclient/versions.h"
-#elif !defined(WIN32)
-#	include "versions.h"
-#endif

@@ -92,6 +92,7 @@ void ClientKnobs::initialize(bool randomize) {
 	init( STORAGE_METRICS_TOO_MANY_SHARDS_DELAY, 15.0 );
 	init( AGGREGATE_HEALTH_METRICS_MAX_STALENESS, 0.5 );
 	init( DETAILED_HEALTH_METRICS_MAX_STALENESS, 5.0 );
+	init( TAG_ENCODE_KEY_SERVERS, false ); if( randomize && BUGGIFY ) TAG_ENCODE_KEY_SERVERS = true;
 
 	//KeyRangeMap
 	init( KRM_GET_RANGE_LIMIT, 1e5 ); if( randomize && BUGGIFY ) KRM_GET_RANGE_LIMIT = 10;

@@ -85,6 +85,7 @@ public:
 	double STORAGE_METRICS_TOO_MANY_SHARDS_DELAY;
 	double AGGREGATE_HEALTH_METRICS_MAX_STALENESS;
 	double DETAILED_HEALTH_METRICS_MAX_STALENESS;
+	bool TAG_ENCODE_KEY_SERVERS;
 
 	//KeyRangeMap
 	int KRM_GET_RANGE_LIMIT;

@@ -42,7 +42,6 @@ struct MasterProxyInterface {
 	Optional<Key> processId;
 	bool provisional;

-	Endpoint base;
 	RequestStream< struct CommitTransactionRequest > commit;
 	RequestStream< struct GetReadVersionRequest > getConsistentReadVersion;  // Returns a version which (1) is committed, and (2) is >= the latest version reported committed (by a commit response) when this request was sent
 	                                                                         // (at some point between when this request is sent and when its response is received, the latest version reported committed)
@@ -56,6 +55,7 @@ struct MasterProxyInterface {
 	RequestStream< struct GetHealthMetricsRequest > getHealthMetrics;
 	RequestStream< struct ProxySnapRequest > proxySnapReq;
 	RequestStream< struct ExclusionSafetyCheckRequest > exclusionSafetyCheckReq;
+	RequestStream< struct GetDDMetricsRequest > getDDMetrics;

 	UID id() const { return commit.getEndpoint().token; }
 	std::string toString() const { return id().shortString(); }
@@ -65,18 +65,18 @@ struct MasterProxyInterface {
 	template <class Archive>
 	void serialize(Archive& ar) {
-		serializer(ar, processId, provisional, base);
+		serializer(ar, processId, provisional, commit);
 		if( Archive::isDeserializing ) {
-			commit = RequestStream< struct CommitTransactionRequest >( base.getAdjustedEndpoint(0) );
-			getConsistentReadVersion = RequestStream< struct GetReadVersionRequest >( base.getAdjustedEndpoint(1) );
-			getKeyServersLocations = RequestStream< struct GetKeyServerLocationsRequest >( base.getAdjustedEndpoint(2) );
-			getStorageServerRejoinInfo = RequestStream< struct GetStorageServerRejoinInfoRequest >( base.getAdjustedEndpoint(3) );
-			waitFailure = RequestStream<ReplyPromise<Void>>( base.getAdjustedEndpoint(4) );
-			getRawCommittedVersion = RequestStream< struct GetRawCommittedVersionRequest >( base.getAdjustedEndpoint(5) );
-			txnState = RequestStream< struct TxnStateRequest >( base.getAdjustedEndpoint(6) );
-			getHealthMetrics = RequestStream< struct GetHealthMetricsRequest >( base.getAdjustedEndpoint(7) );
-			proxySnapReq = RequestStream< struct ProxySnapRequest >( base.getAdjustedEndpoint(8) );
-			exclusionSafetyCheckReq = RequestStream< struct ExclusionSafetyCheckRequest >( base.getAdjustedEndpoint(9) );
+			getConsistentReadVersion = RequestStream< struct GetReadVersionRequest >( commit.getEndpoint().getAdjustedEndpoint(1) );
+			getKeyServersLocations = RequestStream< struct GetKeyServerLocationsRequest >( commit.getEndpoint().getAdjustedEndpoint(2) );
+			getStorageServerRejoinInfo = RequestStream< struct GetStorageServerRejoinInfoRequest >( commit.getEndpoint().getAdjustedEndpoint(3) );
+			waitFailure = RequestStream<ReplyPromise<Void>>( commit.getEndpoint().getAdjustedEndpoint(4) );
+			getRawCommittedVersion = RequestStream< struct GetRawCommittedVersionRequest >( commit.getEndpoint().getAdjustedEndpoint(5) );
+			txnState = RequestStream< struct TxnStateRequest >( commit.getEndpoint().getAdjustedEndpoint(6) );
+			getHealthMetrics = RequestStream< struct GetHealthMetricsRequest >( commit.getEndpoint().getAdjustedEndpoint(7) );
+			proxySnapReq = RequestStream< struct ProxySnapRequest >( commit.getEndpoint().getAdjustedEndpoint(8) );
+			exclusionSafetyCheckReq = RequestStream< struct ExclusionSafetyCheckRequest >( commit.getEndpoint().getAdjustedEndpoint(9) );
+			getDDMetrics = RequestStream< struct GetDDMetricsRequest >( commit.getEndpoint().getAdjustedEndpoint(10) );
 		}
 	}
@@ -92,7 +92,8 @@ struct MasterProxyInterface {
 		streams.push_back(getHealthMetrics.getReceiver());
 		streams.push_back(proxySnapReq.getReceiver());
 		streams.push_back(exclusionSafetyCheckReq.getReceiver());
-		base = FlowTransport::transport().addEndpoints(streams);
+		streams.push_back(getDDMetrics.getReceiver());
+		FlowTransport::transport().addEndpoints(streams);
 	}
 };
@@ -391,6 +392,34 @@ struct GetHealthMetricsRequest
 	}
 };

+struct GetDDMetricsReply
+{
+	constexpr static FileIdentifier file_identifier = 7277713;
+	Standalone<VectorRef<DDMetricsRef>> storageMetricsList;
+
+	GetDDMetricsReply() {}
+
+	template <class Ar>
+	void serialize(Ar& ar) {
+		serializer(ar, storageMetricsList);
+	}
+};
+
+struct GetDDMetricsRequest {
+	constexpr static FileIdentifier file_identifier = 14536812;
+	KeyRange keys;
+	int shardLimit;
+	ReplyPromise<struct GetDDMetricsReply> reply;
+
+	GetDDMetricsRequest() {}
+	explicit GetDDMetricsRequest(KeyRange const& keys, const int shardLimit) : keys(keys), shardLimit(shardLimit) {}
+
+	template<class Ar>
+	void serialize(Ar& ar) {
+		serializer(ar, keys, shardLimit, reply);
+	}
+};
+
 struct ProxySnapRequest
 {
 	constexpr static FileIdentifier file_identifier = 22204900;
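
GetDDMetricsRequest and GetDDMetricsReply follow the standard Flow RPC shape: the client serializes the request, including its `ReplyPromise`, onto the proxy's `getDDMetrics` stream, and the proxy fulfills the promise. A hedged sketch of what a server-side loop looks like (the commit's real handler lives in fdbserver; the actor name here is illustrative):

```cpp
// Illustrative proxy-side handler sketch, assuming the usual Flow actor idioms.
ACTOR Future<Void> ddMetricsRequestServer(MasterProxyInterface proxy) {
	loop {
		GetDDMetricsRequest req = waitNext(proxy.getDDMetrics.getFuture());
		GetDDMetricsReply reply;
		// ... populate reply.storageMetricsList for req.keys, capped at req.shardLimit ...
		req.reply.send(reply); // fulfills the ReplyPromise carried inside the request
	}
}
```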

@@ -49,7 +49,7 @@
 #include "flow/TLSConfig.actor.h"
 #include "flow/UnitTest.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"

 #ifdef WIN32
 #define WIN32_LEAN_AND_MEAN
@@ -607,6 +607,8 @@ DatabaseContext::DatabaseContext(Reference<AsyncVar<Reference<ClusterConnectionF
 	registerSpecialKeySpaceModule(SpecialKeySpace::MODULE::TRANSACTION, std::make_unique<ConflictingKeysImpl>(conflictingKeysRange));
 	registerSpecialKeySpaceModule(SpecialKeySpace::MODULE::TRANSACTION, std::make_unique<ReadConflictRangeImpl>(readConflictRangeKeysRange));
 	registerSpecialKeySpaceModule(SpecialKeySpace::MODULE::TRANSACTION, std::make_unique<WriteConflictRangeImpl>(writeConflictRangeKeysRange));
+	registerSpecialKeySpaceModule(SpecialKeySpace::MODULE::METRICS,
+	                              std::make_unique<DDStatsRangeImpl>(ddStatsRange));
 	registerSpecialKeySpaceModule(SpecialKeySpace::MODULE::WORKERINTERFACE, std::make_unique<WorkerInterfacesSpecialKeyImpl>(KeyRangeRef(
 	    LiteralStringRef("\xff\xff/worker_interfaces/"), LiteralStringRef("\xff\xff/worker_interfaces0"))));
 	registerSpecialKeySpaceModule(SpecialKeySpace::MODULE::STATUSJSON, std::make_unique<SingleSpecialKeyImpl>(
@@ -738,7 +740,7 @@ Reference<LocationInfo> DatabaseContext::setCachedLocation( const KeyRangeRef& k
 		locationCache.insert( KeyRangeRef(begin, end), Reference<LocationInfo>() );
 	}
 	locationCache.insert( keys, loc );
-	return std::move(loc);
+	return loc;
 }

 void DatabaseContext::invalidateCache( const KeyRef& key, bool isBackward ) {
@@ -1518,7 +1520,7 @@ ACTOR Future<Optional<Value>> getValue( Future<Version> version, Key key, Databa
 			cx->readLatencies.addSample(latency);
 			if (trLogInfo) {
 				int valueSize = reply.value.present() ? reply.value.get().size() : 0;
-				trLogInfo->addLog(FdbClientLogEvents::EventGet(startTimeD, latency, valueSize, key));
+				trLogInfo->addLog(FdbClientLogEvents::EventGet(startTimeD, cx->clientLocality.dcId(), latency, valueSize, key));
 			}
 			cx->getValueCompleted->latency = timer_int() - startTime;
 			cx->getValueCompleted->log();
@@ -1550,7 +1552,7 @@ ACTOR Future<Optional<Value>> getValue( Future<Version> version, Key key, Databa
 				wait(delay(CLIENT_KNOBS->WRONG_SHARD_SERVER_DELAY, info.taskID));
 			} else {
 				if (trLogInfo)
-					trLogInfo->addLog(FdbClientLogEvents::EventGetError(startTimeD, static_cast<int>(e.code()), key));
+					trLogInfo->addLog(FdbClientLogEvents::EventGetError(startTimeD, cx->clientLocality.dcId(), static_cast<int>(e.code()), key));
 				throw e;
 			}
 		}
@@ -1955,7 +1957,7 @@ void getRangeFinished(Database cx, Reference<TransactionLogInfo> trLogInfo, doub
 	cx->transactionKeysRead += result.size();
 	if( trLogInfo ) {
-		trLogInfo->addLog(FdbClientLogEvents::EventGetRange(startTime, now()-startTime, bytes, begin.getKey(), end.getKey()));
+		trLogInfo->addLog(FdbClientLogEvents::EventGetRange(startTime, cx->clientLocality.dcId(), now()-startTime, bytes, begin.getKey(), end.getKey()));
 	}
 	if( !snapshot ) {
@@ -2195,7 +2197,7 @@ ACTOR Future<Standalone<RangeResultRef>> getRange( Database cx, Reference<Transa
 				wait(delay(CLIENT_KNOBS->WRONG_SHARD_SERVER_DELAY, info.taskID));
 			} else {
 				if (trLogInfo)
-					trLogInfo->addLog(FdbClientLogEvents::EventGetRangeError(startTime, static_cast<int>(e.code()), begin.getKey(), end.getKey()));
+					trLogInfo->addLog(FdbClientLogEvents::EventGetRangeError(startTime, cx->clientLocality.dcId(), static_cast<int>(e.code()), begin.getKey(), end.getKey()));
 				throw e;
 			}
@@ -2449,7 +2451,7 @@ ACTOR Future< Key > getKeyAndConflictRange(
 			conflictRange.send( std::make_pair( rep, k.orEqual ? keyAfter( k.getKey() ) : Key(k.getKey(), k.arena()) ) );
 		else
 			conflictRange.send( std::make_pair( k.orEqual ? keyAfter( k.getKey() ) : Key(k.getKey(), k.arena()), keyAfter( rep ) ) );
-		return std::move(rep);
+		return rep;
 	} catch( Error&e ) {
 		conflictRange.send(std::make_pair(Key(), Key()));
 		throw;
@@ -2975,7 +2977,7 @@ ACTOR static Future<Void> tryCommit( Database cx, Reference<TransactionLogInfo>
 				cx->commitLatencies.addSample(latency);
 				cx->latencies.addSample(now() - tr->startTime);
 				if (trLogInfo)
-					trLogInfo->addLog(FdbClientLogEvents::EventCommit_V2(startTime, latency, req.transaction.mutations.size(), req.transaction.mutations.expectedSize(), ci.version, req));
+					trLogInfo->addLog(FdbClientLogEvents::EventCommit_V2(startTime, cx->clientLocality.dcId(), latency, req.transaction.mutations.size(), req.transaction.mutations.expectedSize(), ci.version, req));
 				return Void();
 			} else {
 				// clear the RYW transaction which contains previous conflicting keys
@@ -3038,7 +3040,7 @@ ACTOR static Future<Void> tryCommit( Database cx, Reference<TransactionLogInfo>
 				TraceEvent(SevError, "TryCommitError").error(e);
 			}
 			if (trLogInfo)
-				trLogInfo->addLog(FdbClientLogEvents::EventCommitError(startTime, static_cast<int>(e.code()), req));
+				trLogInfo->addLog(FdbClientLogEvents::EventCommitError(startTime, cx->clientLocality.dcId(), static_cast<int>(e.code()), req));
 			throw;
 		}
 	}
@@ -3449,7 +3451,7 @@ ACTOR Future<Version> extractReadVersion(DatabaseContext* cx, TransactionPriorit
 	double latency = now() - startTime;
 	cx->GRVLatencies.addSample(latency);
 	if (trLogInfo)
-		trLogInfo->addLog(FdbClientLogEvents::EventGetVersion_V3(startTime, latency, priority, rep.version));
+		trLogInfo->addLog(FdbClientLogEvents::EventGetVersion_V3(startTime, cx->clientLocality.dcId(), latency, priority, rep.version));
 	if (rep.version == 1 && rep.locked) {
 		throw proxy_memory_limit_exceeded();
 	}
@@ -3858,6 +3860,25 @@ Future< StorageMetrics > Transaction::getStorageMetrics( KeyRange const& keys, i
 	}
 }

+ACTOR Future<Standalone<VectorRef<DDMetricsRef>>> waitDataDistributionMetricsList(Database cx, KeyRange keys,
+                                                                                  int shardLimit) {
+	state Future<Void> clientTimeout = delay(5.0);
+	loop {
+		choose {
+			when(wait(cx->onMasterProxiesChanged())) {}
+			when(ErrorOr<GetDDMetricsReply> rep =
+			         wait(errorOr(basicLoadBalance(cx->getMasterProxies(false), &MasterProxyInterface::getDDMetrics,
+			                                       GetDDMetricsRequest(keys, shardLimit))))) {
+				if (rep.isError()) {
+					throw rep.getError();
+				}
+				return rep.get().storageMetricsList;
+			}
+			when(wait(clientTimeout)) { throw timed_out(); }
+		}
+	}
+}
+
 Future<Standalone<VectorRef<KeyRangeRef>>> Transaction::getReadHotRanges(KeyRange const& keys) {
 	return ::getReadHotRanges(cx, keys);
 }
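
waitDataDistributionMetricsList uses NativeAPI's standard retry shape: it races the load-balanced proxy reply against onMasterProxiesChanged() and a fixed five-second client timeout. A hypothetical caller (db and range names are illustrative):

```cpp
// Hypothetical caller sketch: print per-shard byte counts for a key range.
ACTOR Future<Void> printShardSizes(Database db, KeyRange range) {
	Standalone<VectorRef<DDMetricsRef>> metrics =
	    wait(waitDataDistributionMetricsList(db, range, 1000));
	for (const auto& m : metrics) {
		printf("%s: %lld bytes\n", printable(m.beginKey).c_str(), (long long)m.shardBytes);
	}
	return Void();
}
```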

@@ -330,6 +330,8 @@ private:
 };

 ACTOR Future<Version> waitForCommittedVersion(Database cx, Version version);
+ACTOR Future<Standalone<VectorRef<DDMetricsRef>>> waitDataDistributionMetricsList(Database cx, KeyRange keys,
+                                                                                  int shardLimit);

 std::string unprintable( const std::string& );

@@ -312,8 +312,6 @@ const KeyRef JSONSchemas::statusSchema = LiteralStringRef(R"statusSchema(
 	},
 	"limiting_queue_bytes_storage_server":0,
 	"worst_queue_bytes_storage_server":0,
-	"limiting_version_lag_storage_server":0,
-	"worst_version_lag_storage_server":0,
 	"limiting_data_lag_storage_server":{
 		"versions":0,
 		"seconds":0.0

@@ -29,7 +29,9 @@ std::unordered_map<SpecialKeySpace::MODULE, KeyRange> SpecialKeySpace::moduleToB
 	  KeyRangeRef(LiteralStringRef("\xff\xff/worker_interfaces/"), LiteralStringRef("\xff\xff/worker_interfaces0")) },
 	{ SpecialKeySpace::MODULE::STATUSJSON, singleKeyRange(LiteralStringRef("\xff\xff/status/json")) },
 	{ SpecialKeySpace::MODULE::CONNECTIONSTRING, singleKeyRange(LiteralStringRef("\xff\xff/connection_string")) },
-	{ SpecialKeySpace::MODULE::CLUSTERFILEPATH, singleKeyRange(LiteralStringRef("\xff\xff/cluster_file_path")) }
+	{ SpecialKeySpace::MODULE::CLUSTERFILEPATH, singleKeyRange(LiteralStringRef("\xff\xff/cluster_file_path")) },
+	{ SpecialKeySpace::MODULE::METRICS,
+	  KeyRangeRef(LiteralStringRef("\xff\xff/metrics/"), LiteralStringRef("\xff\xff/metrics0")) }
 };

 // This function will move the given KeySelector as far as possible to the standard form:
@@ -164,7 +166,6 @@ SpecialKeySpace::getRangeAggregationActor(SpecialKeySpace* sks, Reference<ReadYo
 	state Optional<SpecialKeySpace::MODULE> lastModuleRead;

 	wait(normalizeKeySelectorActor(sks, ryw, &begin, &lastModuleRead, &actualBeginOffset, &result));
-	// TODO : check if end the boundary of a module
 	wait(normalizeKeySelectorActor(sks, ryw, &end, &lastModuleRead, &actualEndOffset, &result));
 	// Handle all corner cases like what RYW does
 	// return if range inverted
@@ -314,6 +315,37 @@ Future<Standalone<RangeResultRef>> ConflictingKeysImpl::getRange(Reference<ReadY
 	return result;
 }

+ACTOR Future<Standalone<RangeResultRef>> ddStatsGetRangeActor(Reference<ReadYourWritesTransaction> ryw,
+                                                              KeyRangeRef kr) {
+	try {
+		auto keys = kr.removePrefix(ddStatsRange.begin);
+		Standalone<VectorRef<DDMetricsRef>> resultWithoutPrefix =
+		    wait(waitDataDistributionMetricsList(ryw->getDatabase(), keys, CLIENT_KNOBS->STORAGE_METRICS_SHARD_LIMIT));
+		Standalone<RangeResultRef> result;
+		for (const auto& ddMetricsRef : resultWithoutPrefix) {
+			// each begin key is the previous end key, thus we only encode the begin key in the result
+			KeyRef beginKey = ddMetricsRef.beginKey.withPrefix(ddStatsRange.begin, result.arena());
+			// Use json string encoded in utf-8 to encode the values, easy for adding more fields in the future
+			json_spirit::mObject statsObj;
+			statsObj["ShardBytes"] = ddMetricsRef.shardBytes;
+			std::string statsString =
+			    json_spirit::write_string(json_spirit::mValue(statsObj), json_spirit::Output_options::raw_utf8);
+			ValueRef bytes(result.arena(), statsString);
+			result.push_back(result.arena(), KeyValueRef(beginKey, bytes));
+		}
+		return result;
+	} catch (Error& e) {
+		throw;
+	}
+}
+
+DDStatsRangeImpl::DDStatsRangeImpl(KeyRangeRef kr) : SpecialKeyRangeBaseImpl(kr) {}
+
+Future<Standalone<RangeResultRef>> DDStatsRangeImpl::getRange(Reference<ReadYourWritesTransaction> ryw,
+                                                              KeyRangeRef kr) const {
+	return ddStatsGetRangeActor(ryw, kr);
+}
+
 class SpecialKeyRangeTestImpl : public SpecialKeyRangeBaseImpl {
 public:
 	explicit SpecialKeyRangeTestImpl(KeyRangeRef kr, const std::string& prefix, int size)
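
Because DDStatsRangeImpl is registered under the METRICS module, a client can fetch these stats with an ordinary range read on `\xff\xff/metrics/data_distribution_stats/`; each value is a UTF-8 JSON object such as `{"ShardBytes":1234}`. A hedged client-side sketch (transaction options and error handling elided):

```cpp
// Hypothetical client sketch: read DD stats through the special key space.
ACTOR Future<Void> readDDStats(Database db) {
	state ReadYourWritesTransaction tr(db);
	Standalone<RangeResultRef> kvs = wait(tr.getRange(ddStatsRange, CLIENT_KNOBS->TOO_MANY));
	for (const auto& kv : kvs) {
		// kv.key is the shard's begin key (carrying the \xff\xff/metrics/... prefix);
		// kv.value is JSON, e.g. {"ShardBytes":1234}.
		printf("%s -> %s\n", printable(kv.key).c_str(), kv.value.toString().c_str());
	}
	return Void();
}
```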

@@ -51,13 +51,14 @@ protected:

 class SpecialKeySpace {
 public:
 	enum class MODULE {
-		UNKNOWN, // default value for all unregistered range
-		TESTONLY, // only used by correctness tests
-		TRANSACTION,
-		WORKERINTERFACE,
-		STATUSJSON,
 		CLUSTERFILEPATH,
-		CONNECTIONSTRING
+		CONNECTIONSTRING,
+		METRICS, // data-distribution metrics
+		TESTONLY, // only used by correctness tests
+		TRANSACTION, // transaction related info, conflicting keys, read/write conflict range
+		STATUSJSON,
+		UNKNOWN, // default value for all unregistered range
+		WORKERINTERFACE,
 	};

 	Future<Optional<Value>> get(Reference<ReadYourWritesTransaction> ryw, const Key& key);
@@ -152,5 +153,12 @@ public:
 	                                           KeyRangeRef kr) const override;
 };

+class DDStatsRangeImpl : public SpecialKeyRangeBaseImpl {
+public:
+	explicit DDStatsRangeImpl(KeyRangeRef kr);
+	Future<Standalone<RangeResultRef>> getRange(Reference<ReadYourWritesTransaction> ryw,
+	                                            KeyRangeRef kr) const override;
+};
+
 #include "flow/unactorcompiler.h"
 #endif

@@ -54,7 +54,6 @@ struct StorageServerInterface {
 	LocalityData locality;
 	UID uniqueID;

-	Endpoint base;
 	RequestStream<struct GetValueRequest> getValue;
 	RequestStream<struct GetKeyRequest> getKey;
@@ -87,20 +86,19 @@ struct StorageServerInterface {
 		// versioned carefully!

 		if (ar.protocolVersion().hasSmallEndpoints()) {
-			serializer(ar, uniqueID, locality, base);
+			serializer(ar, uniqueID, locality, getValue);
 			if( Ar::isDeserializing ) {
-				getValue = RequestStream<struct GetValueRequest>( base.getAdjustedEndpoint(0) );
-				getKey = RequestStream<struct GetKeyRequest>( base.getAdjustedEndpoint(1) );
-				getKeyValues = RequestStream<struct GetKeyValuesRequest>( base.getAdjustedEndpoint(2) );
-				getShardState = RequestStream<struct GetShardStateRequest>( base.getAdjustedEndpoint(3) );
-				waitMetrics = RequestStream<struct WaitMetricsRequest>( base.getAdjustedEndpoint(4) );
-				splitMetrics = RequestStream<struct SplitMetricsRequest>( base.getAdjustedEndpoint(5) );
-				getStorageMetrics = RequestStream<struct GetStorageMetricsRequest>( base.getAdjustedEndpoint(6) );
-				waitFailure = RequestStream<ReplyPromise<Void>>( base.getAdjustedEndpoint(7) );
-				getQueuingMetrics = RequestStream<struct StorageQueuingMetricsRequest>( base.getAdjustedEndpoint(8) );
-				getKeyValueStoreType = RequestStream<ReplyPromise<KeyValueStoreType>>( base.getAdjustedEndpoint(9) );
-				watchValue = RequestStream<struct WatchValueRequest>( base.getAdjustedEndpoint(10) );
-				getReadHotRanges = RequestStream<struct ReadHotSubRangeRequest>( base.getAdjustedEndpoint(11) );
+				getKey = RequestStream<struct GetKeyRequest>( getValue.getEndpoint().getAdjustedEndpoint(1) );
+				getKeyValues = RequestStream<struct GetKeyValuesRequest>( getValue.getEndpoint().getAdjustedEndpoint(2) );
+				getShardState = RequestStream<struct GetShardStateRequest>( getValue.getEndpoint().getAdjustedEndpoint(3) );
+				waitMetrics = RequestStream<struct WaitMetricsRequest>( getValue.getEndpoint().getAdjustedEndpoint(4) );
+				splitMetrics = RequestStream<struct SplitMetricsRequest>( getValue.getEndpoint().getAdjustedEndpoint(5) );
+				getStorageMetrics = RequestStream<struct GetStorageMetricsRequest>( getValue.getEndpoint().getAdjustedEndpoint(6) );
+				waitFailure = RequestStream<ReplyPromise<Void>>( getValue.getEndpoint().getAdjustedEndpoint(7) );
+				getQueuingMetrics = RequestStream<struct StorageQueuingMetricsRequest>( getValue.getEndpoint().getAdjustedEndpoint(8) );
+				getKeyValueStoreType = RequestStream<ReplyPromise<KeyValueStoreType>>( getValue.getEndpoint().getAdjustedEndpoint(9) );
+				watchValue = RequestStream<struct WatchValueRequest>( getValue.getEndpoint().getAdjustedEndpoint(10) );
+				getReadHotRanges = RequestStream<struct ReadHotSubRangeRequest>( getValue.getEndpoint().getAdjustedEndpoint(11) );
 			}
 		} else {
 			ASSERT(Ar::isDeserializing);
@@ -110,7 +108,6 @@ struct StorageServerInterface {
 			serializer(ar, uniqueID, locality, getValue, getKey, getKeyValues, getShardState, waitMetrics,
 			           splitMetrics, getStorageMetrics, waitFailure, getQueuingMetrics, getKeyValueStoreType);
 			if (ar.protocolVersion().hasWatches()) serializer(ar, watchValue);
-			base = getValue.getEndpoint();
 		}
 	}
 	bool operator == (StorageServerInterface const& s) const { return uniqueID == s.uniqueID; }
@@ -129,7 +126,7 @@ struct StorageServerInterface {
 		streams.push_back(getKeyValueStoreType.getReceiver());
 		streams.push_back(watchValue.getReceiver());
 		streams.push_back(getReadHotRanges.getReceiver());
-		base = FlowTransport::transport().addEndpoints(streams);
+		FlowTransport::transport().addEndpoints(streams);
 	}
 };
@@ -320,6 +317,8 @@ struct GetShardStateRequest {
 struct StorageMetrics {
 	constexpr static FileIdentifier file_identifier = 13622226;
 	int64_t bytes = 0; // total storage
+	// FIXME: currently, neither of bytesPerKSecond or iosPerKSecond are actually used in DataDistribution calculations.
+	// This may change in the future, but this comment is left here to avoid any confusion for the time being.
 	int64_t bytesPerKSecond = 0; // network bandwidth (average over 10s)
 	int64_t iosPerKSecond = 0;
 	int64_t bytesReadPerKSecond = 0;
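
With `Endpoint base` gone, only the first stream's endpoint is serialized and every sibling stream is re-derived from it. The arithmetic mirrors the updated Endpoint::getAdjustedEndpoint in fdbrpc/FlowTransport.h, restated here in isolation for clarity:

```cpp
// Illustrative restatement of the token math (not new code): stream i of an
// interface differs from the base endpoint in both words of its UID.
UID adjustedToken(UID base, uint32_t index) {
	uint32_t newIndex = uint32_t(base.second()) + index;  // low 32 bits carry the stream index
	return UID(base.first() + (uint64_t(index) << 32),    // high word is spread so tokens stay distinct
	           (base.second() & 0xffffffff00000000LL) | newIndex);
}
```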

@@ -46,6 +46,11 @@ const KeyRef keyServersKey( const KeyRef& k, Arena& arena ) {
 	return k.withPrefix( keyServersPrefix, arena );
 }
 const Value keyServersValue( Standalone<RangeResultRef> result, const std::vector<UID>& src, const std::vector<UID>& dest ) {
+	if(!CLIENT_KNOBS->TAG_ENCODE_KEY_SERVERS) {
+		BinaryWriter wr(IncludeVersion()); wr << src << dest;
+		return wr.toValue();
+	}
+
 	std::vector<Tag> srcTag;
 	std::vector<Tag> destTag;
@@ -203,6 +208,9 @@
 const KeyRangeRef writeConflictRangeKeysRange =
     KeyRangeRef(LiteralStringRef("\xff\xff/transaction/write_conflict_range/"),
                 LiteralStringRef("\xff\xff/transaction/write_conflict_range/\xff\xff"));

+const KeyRangeRef ddStatsRange = KeyRangeRef(LiteralStringRef("\xff\xff/metrics/data_distribution_stats/"),
+                                             LiteralStringRef("\xff\xff/metrics/data_distribution_stats/\xff\xff"));
+
 // "\xff/storageCache/[[begin]]" := "[[vector<uint16_t>]]"
 const KeyRangeRef storageCacheKeys( LiteralStringRef("\xff/storageCache/"), LiteralStringRef("\xff/storageCache0") );
 const KeyRef storageCachePrefix = storageCacheKeys.begin;
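
With TAG_ENCODE_KEY_SERVERS off (the default), keyServersValue keeps the legacy encoding: a version-stamped BinaryWriter serialization of the src and dest server-ID vectors. A hedged round-trip sketch of that encoding, assuming Flow's usual serializers:

```cpp
// Sketch of the legacy (non-tag) encoding path.
std::vector<UID> src = { deterministicRandom()->randomUniqueID() };
std::vector<UID> dest;

BinaryWriter wr(IncludeVersion());
wr << src << dest;
Value v = wr.toValue();       // this is what gets stored under \xff/keyServers/...

BinaryReader rd(v, IncludeVersion());
std::vector<UID> src2, dest2;
rd >> src2 >> dest2;          // a reader recovers both vectors
```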

@@ -81,6 +81,7 @@ extern const KeyRangeRef conflictingKeysRange;
 extern const ValueRef conflictingKeysTrue, conflictingKeysFalse;
 extern const KeyRangeRef writeConflictRangeKeysRange;
 extern const KeyRangeRef readConflictRangeKeysRange;
+extern const KeyRangeRef ddStatsRange;

 extern const KeyRef cacheKeysPrefix;

@@ -73,7 +73,7 @@ Key TagThrottleKey::toKey() const {
 	memcpy(str, tagThrottleKeysPrefix.begin(), tagThrottleKeysPrefix.size());
 	str += tagThrottleKeysPrefix.size();
-	*(str++) = autoThrottled ? 1 : 0;
+	*(str++) = (uint8_t)throttleType;
 	*(str++) = (uint8_t)priority;

 	for(auto tag : tags) {
@@ -89,7 +89,7 @@ Key TagThrottleKey::toKey() const {
 TagThrottleKey TagThrottleKey::fromKey(const KeyRef& key) {
 	const uint8_t *str = key.substr(tagThrottleKeysPrefix.size()).begin();
-	bool autoThrottled = *(str++) != 0;
+	TagThrottleType throttleType = TagThrottleType(*(str++));
 	TransactionPriority priority = TransactionPriority(*(str++));
 	TagSet tags;
@@ -99,7 +99,7 @@ TagThrottleKey TagThrottleKey::fromKey(const KeyRef& key) {
 		str += size;
 	}

-	return TagThrottleKey(tags, autoThrottled, priority);
+	return TagThrottleKey(tags, throttleType, priority);
 }

 TagThrottleValue TagThrottleValue::fromValue(const ValueRef& value) {
@@ -164,9 +164,9 @@ namespace ThrottleApi {
 		}
 	}

-	ACTOR Future<Void> throttleTags(Database db, TagSet tags, double tpsRate, double initialDuration, bool autoThrottled, TransactionPriority priority, Optional<double> expirationTime) {
+	ACTOR Future<Void> throttleTags(Database db, TagSet tags, double tpsRate, double initialDuration, TagThrottleType throttleType, TransactionPriority priority, Optional<double> expirationTime) {
 		state Transaction tr(db);
-		state Key key = TagThrottleKey(tags, autoThrottled, priority).toKey();
+		state Key key = TagThrottleKey(tags, throttleType, priority).toKey();

 		ASSERT(initialDuration > 0);
@@ -177,7 +177,7 @@ namespace ThrottleApi {
 		loop {
 			try {
-				if(!autoThrottled) {
+				if(throttleType == TagThrottleType::MANUAL) {
 					Optional<Value> oldThrottle = wait(tr.get(key));
 					if(!oldThrottle.present()) {
 						wait(updateThrottleCount(&tr, 1));
@@ -186,7 +186,7 @@ namespace ThrottleApi {
 				tr.set(key, value);

-				if(!autoThrottled) {
+				if(throttleType == TagThrottleType::MANUAL) {
 					signalThrottleChange(tr);
 				}
@@ -199,28 +199,54 @@ namespace ThrottleApi {
 		}
 	}

-	ACTOR Future<bool> unthrottleTags(Database db, TagSet tags, bool autoThrottled, TransactionPriority priority) {
+	ACTOR Future<bool> unthrottleTags(Database db, TagSet tags, Optional<TagThrottleType> throttleType, Optional<TransactionPriority> priority) {
 		state Transaction tr(db);
-		state Key key = TagThrottleKey(tags, autoThrottled, priority).toKey();
-		state bool removed = false;
+
+		state std::vector<Key> keys;
+		for(auto p : allTransactionPriorities) {
+			if(!priority.present() || priority.get() == p) {
+				if(!throttleType.present() || throttleType.get() == TagThrottleType::AUTO) {
+					keys.push_back(TagThrottleKey(tags, TagThrottleType::AUTO, p).toKey());
+				}
+				if(!throttleType.present() || throttleType.get() == TagThrottleType::MANUAL) {
+					keys.push_back(TagThrottleKey(tags, TagThrottleType::MANUAL, p).toKey());
+				}
+			}
+		}
+
+		state bool removed = false;

 		loop {
 			try {
-				state Optional<Value> value = wait(tr.get(key));
-				if(value.present()) {
-					if(!autoThrottled) {
-						wait(updateThrottleCount(&tr, -1));
+				state std::vector<Future<Optional<Value>>> values;
+				for(auto key : keys) {
+					values.push_back(tr.get(key));
+				}
+
+				wait(waitForAll(values));
+
+				int delta = 0;
+				for(int i = 0; i < values.size(); ++i) {
+					if(values[i].get().present()) {
+						if(TagThrottleKey::fromKey(keys[i]).throttleType == TagThrottleType::MANUAL) {
+							delta -= 1;
+						}
+
+						tr.clear(keys[i]);
+
+						// Report that we are removing this tag if we ever see it present.
+						// This protects us from getting confused if the transaction is maybe committed.
+						// It's ok if someone else actually ends up removing this tag at the same time
+						// and we aren't the ones to actually do it.
+						removed = true;
 					}
+				}

-					tr.clear(key);
+				if(delta != 0) {
+					wait(updateThrottleCount(&tr, delta));
+				}
+				if(removed) {
 					signalThrottleChange(tr);
-
-					// Report that we are removing this tag if we ever see it present.
-					// This protects us from getting confused if the transaction is maybe committed.
-					// It's ok if someone else actually ends up removing this tag at the same time
-					// and we aren't the ones to actually do it.
-					removed = true;
-
 					wait(tr.commit());
 				}
@@ -232,7 +258,7 @@ namespace ThrottleApi {
 		}
 	}

-	ACTOR Future<bool> unthrottleTags(Database db, KeyRef beginKey, KeyRef endKey, bool onlyExpiredThrottles) {
+	ACTOR Future<bool> unthrottleMatchingThrottles(Database db, KeyRef beginKey, KeyRef endKey, Optional<TransactionPriority> priority, bool onlyExpiredThrottles) {
 		state Transaction tr(db);

 		state KeySelector begin = firstGreaterOrEqual(beginKey);
@@ -253,8 +279,12 @@ namespace ThrottleApi {
 					}
 				}

-				bool autoThrottled = TagThrottleKey::fromKey(tag.key).autoThrottled;
-				if(!autoThrottled) {
+				TagThrottleKey key = TagThrottleKey::fromKey(tag.key);
+				if(priority.present() && key.priority != priority.get()) {
+					continue;
+				}
+
+				if(key.throttleType == TagThrottleType::MANUAL) {
 					++manualUnthrottledTags;
 				}
@@ -285,20 +315,22 @@ namespace ThrottleApi {
 		}
 	}

-	Future<bool> unthrottleManual(Database db) {
-		return unthrottleTags(db, tagThrottleKeysPrefix, tagThrottleAutoKeysPrefix, false);
-	}
-
-	Future<bool> unthrottleAuto(Database db) {
-		return unthrottleTags(db, tagThrottleAutoKeysPrefix, tagThrottleKeys.end, false);
-	}
+	Future<bool> unthrottleAll(Database db, Optional<TagThrottleType> tagThrottleType, Optional<TransactionPriority> priority) {
+		KeyRef begin = tagThrottleKeys.begin;
+		KeyRef end = tagThrottleKeys.end;

-	Future<bool> unthrottleAll(Database db) {
-		return unthrottleTags(db, tagThrottleKeys.begin, tagThrottleKeys.end, false);
+		if(tagThrottleType.present() && tagThrottleType == TagThrottleType::AUTO) {
+			begin = tagThrottleAutoKeysPrefix;
+		}
+		else if(tagThrottleType.present() && tagThrottleType == TagThrottleType::MANUAL) {
+			end = tagThrottleAutoKeysPrefix;
+		}
+
+		return unthrottleMatchingThrottles(db, begin, end, priority, false);
 	}

 	Future<bool> expire(Database db) {
-		return unthrottleTags(db, tagThrottleKeys.begin, tagThrottleKeys.end, true);
+		return unthrottleMatchingThrottles(db, tagThrottleKeys.begin, tagThrottleKeys.end, Optional<TransactionPriority>(), true);
 	}

 	ACTOR Future<Void> enableAuto(Database db, bool enabled) {

@@ -107,14 +107,19 @@ struct dynamic_size_traits<TagSet> : std::true_type {
 	}
 };

+enum class TagThrottleType : uint8_t {
+	MANUAL,
+	AUTO
+};
+
 struct TagThrottleKey {
 	TagSet tags;
-	bool autoThrottled;
+	TagThrottleType throttleType;
 	TransactionPriority priority;

-	TagThrottleKey() : autoThrottled(false), priority(TransactionPriority::DEFAULT) {}
-	TagThrottleKey(TagSet tags, bool autoThrottled, TransactionPriority priority)
-	  : tags(tags), autoThrottled(autoThrottled), priority(priority) {}
+	TagThrottleKey() : throttleType(TagThrottleType::MANUAL), priority(TransactionPriority::DEFAULT) {}
+	TagThrottleKey(TagSet tags, TagThrottleType throttleType, TransactionPriority priority)
+	  : tags(tags), throttleType(throttleType), priority(priority) {}

 	Key toKey() const;
 	static TagThrottleKey fromKey(const KeyRef& key);
@@ -139,17 +144,17 @@ struct TagThrottleValue {
 struct TagThrottleInfo {
 	TransactionTag tag;
-	bool autoThrottled;
+	TagThrottleType throttleType;
 	TransactionPriority priority;
 	double tpsRate;
 	double expirationTime;
 	double initialDuration;

-	TagThrottleInfo(TransactionTag tag, bool autoThrottled, TransactionPriority priority, double tpsRate, double expirationTime, double initialDuration)
-	  : tag(tag), autoThrottled(autoThrottled), priority(priority), tpsRate(tpsRate), expirationTime(expirationTime), initialDuration(initialDuration) {}
+	TagThrottleInfo(TransactionTag tag, TagThrottleType throttleType, TransactionPriority priority, double tpsRate, double expirationTime, double initialDuration)
+	  : tag(tag), throttleType(throttleType), priority(priority), tpsRate(tpsRate), expirationTime(expirationTime), initialDuration(initialDuration) {}

-	TagThrottleInfo(TagThrottleKey key, TagThrottleValue value)
-	  : autoThrottled(key.autoThrottled), priority(key.priority), tpsRate(value.tpsRate), expirationTime(value.expirationTime), initialDuration(value.initialDuration)
+	TagThrottleInfo(TagThrottleKey key, TagThrottleValue value)
+	  : throttleType(key.throttleType), priority(key.priority), tpsRate(value.tpsRate), expirationTime(value.expirationTime), initialDuration(value.initialDuration)
 	{
 		ASSERT(key.tags.size() == 1); // Multiple tags per throttle is not currently supported
 		tag = *key.tags.begin();
@@ -160,13 +165,11 @@ namespace ThrottleApi {
 	Future<std::vector<TagThrottleInfo>> getThrottledTags(Database const& db, int const& limit);

 	Future<Void> throttleTags(Database const& db, TagSet const& tags, double const& tpsRate, double const& initialDuration,
-	                          bool const& autoThrottled, TransactionPriority const& priority, Optional<double> const& expirationTime = Optional<double>());
+	                          TagThrottleType const& throttleType, TransactionPriority const& priority, Optional<double> const& expirationTime = Optional<double>());

-	Future<bool> unthrottleTags(Database const& db, TagSet const& tags, bool const& autoThrottled, TransactionPriority const& priority);
+	Future<bool> unthrottleTags(Database const& db, TagSet const& tags, Optional<TagThrottleType> const& throttleType, Optional<TransactionPriority> const& priority);

-	Future<bool> unthrottleManual(Database db);
-	Future<bool> unthrottleAuto(Database db);
-	Future<bool> unthrottleAll(Database db);
+	Future<bool> unthrottleAll(Database db, Optional<TagThrottleType> throttleType, Optional<TransactionPriority> priority);
 	Future<bool> expire(Database db);

 	Future<Void> enableAuto(Database const& db, bool const& enabled);
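
Taken together, the reworked API covers every old entry point. A hedged usage sketch from inside an ACTOR function (db and tag names are illustrative):

```cpp
// Hypothetical usage of the reworked ThrottleApi.
TagSet tags;
tags.addTag(LiteralStringRef("myTag"));

// Manually throttle to 100 tps for an hour at default priority.
wait(ThrottleApi::throttleTags(db, tags, 100.0, 3600.0,
                               TagThrottleType::MANUAL, TransactionPriority::DEFAULT));

// Remove it; the Optionals let a caller match any type/priority combination.
bool removed = wait(ThrottleApi::unthrottleTags(
    db, tags, TagThrottleType::MANUAL, TransactionPriority::DEFAULT));

// Equivalent of the old unthrottleAll(): no filters at all.
bool any = wait(ThrottleApi::unthrottleAll(
    db, Optional<TagThrottleType>(), Optional<TransactionPriority>()));
```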

@@ -21,7 +21,7 @@
 #include "fdbclient/ThreadSafeTransaction.h"
 #include "fdbclient/ReadYourWrites.h"
 #include "fdbclient/DatabaseContext.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"

 // Users of ThreadSafeTransaction might share Reference<ThreadSafe...> between different threads as long as they don't call addRef (e.g. C API follows this).
 // Therefore, it is unsafe to call (explicitly or implicitly) this->addRef in any of these functions.

@@ -77,7 +77,7 @@
 #include "flow/SimpleOpt.h"
 #include "SimpleIni.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"

 #ifdef __linux__
 typedef fd_set* fdb_fd_set;

@@ -802,36 +802,36 @@ ACTOR Future<int> actorFuzz29( FutureStream<int> inputStream, PromiseStream<int>
 std::pair<int,int> actorFuzzTests() {
 	int testsOK = 0;
-	testsOK += testFuzzActor( &actorFuzz0, "actorFuzz0", (vector<int>(),390229,596271,574865) );
-	testsOK += testFuzzActor( &actorFuzz1, "actorFuzz1", (vector<int>(),477566,815578,477566,815578,477566,815578,477566,815578,477566,815578,917160) );
-	testsOK += testFuzzActor( &actorFuzz2, "actorFuzz2", (vector<int>(),476677,930237) );
-	testsOK += testFuzzActor( &actorFuzz3, "actorFuzz3", (vector<int>(),1000) );
-	testsOK += testFuzzActor( &actorFuzz4, "actorFuzz4", (vector<int>(),180600,177605,177605,177605,954508,810052) );
-	testsOK += testFuzzActor( &actorFuzz5, "actorFuzz5", (vector<int>(),1000) );
-	testsOK += testFuzzActor( &actorFuzz6, "actorFuzz6", (vector<int>(),320321,266526,762336,463730,320321,266526,762336,463730,320321,266526,762336,463730,320321,266526,762336,463730,320321,266526,762336,463730,945289) );
-	testsOK += testFuzzActor( &actorFuzz7, "actorFuzz7", (vector<int>(),406152,478841,609181,634881,253861,592023,240597,253861,593023,240597,253861,594023,240597,415949,169335,478331,634881,253861,596023,240597,253861,597023,240597,253861,598023,240597,415949,173335,478331,634881,253861,600023,240597,253861,601023,240597,253861,602023,240597,415949,177335,478331,634881,253861,604023,240597,253861,605023,240597,253861,606023,240597,415949,181335,478331,634881,253861,608023,240597,253861,609023,240597,253861,610023,240597,415949,185335,478331,331905,946924,663973,797073,971923,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,534407,814172,949658) );
-	testsOK += testFuzzActor( &actorFuzz8, "actorFuzz8", (vector<int>(),285937,696473) );
-	testsOK += testFuzzActor( &actorFuzz9, "actorFuzz9", (vector<int>(),141463,397424) );
-	testsOK += testFuzzActor( &actorFuzz10, "actorFuzz10", (vector<int>(),543113,1000) );
-	testsOK += testFuzzActor( &actorFuzz11, "actorFuzz11", (vector<int>(),1000) );
-	testsOK += testFuzzActor( &actorFuzz12, "actorFuzz12", (vector<int>(),970588,981887) );
-	testsOK += testFuzzActor( &actorFuzz13, "actorFuzz13", (vector<int>(),861219) );
-	testsOK += testFuzzActor( &actorFuzz14, "actorFuzz14", (vector<int>(),527098,527098,527098,628047) );
-	testsOK += testFuzzActor( &actorFuzz15, "actorFuzz15", (vector<int>(),582389,240216,732317,582389,240216,732317,582389,240216,732317,582389,240216,732317,582389,240216,732317,884781) );
-	testsOK += testFuzzActor( &actorFuzz16, "actorFuzz16", (vector<int>(),943071,492690,908751,198776,537939) );
-	testsOK += testFuzzActor( &actorFuzz17, "actorFuzz17", (vector<int>(),249436,416782,249436,416782,249436,416782,299183) );
-	testsOK += testFuzzActor( &actorFuzz18, "actorFuzz18", (vector<int>(),337649,395297,807261,517901) );
-	testsOK += testFuzzActor( &actorFuzz19, "actorFuzz19", (vector<int>(),492598,139186,742053,492598,140186,742053,492598,141186,742053,592919) );
-	testsOK += testFuzzActor( &actorFuzz20, "actorFuzz20", (vector<int>(),760082,1000) );
-	testsOK += testFuzzActor( &actorFuzz21, "actorFuzz21", (vector<int>(),806394) );
-	testsOK += testFuzzActor( &actorFuzz22, "actorFuzz22", (vector<int>(),722878,369302,416748) );
-	testsOK += testFuzzActor( &actorFuzz23, "actorFuzz23", (vector<int>(),562792,231437) );
-	testsOK += testFuzzActor( &actorFuzz24, "actorFuzz24", (vector<int>(),847672,835175) );
-	testsOK += testFuzzActor( &actorFuzz25, "actorFuzz25", (vector<int>(),843261,327560,592398) );
-	testsOK += testFuzzActor( &actorFuzz26, "actorFuzz26", (vector<int>(),520263,306397,944232,366272,700651,146918,191890) );
-	testsOK += testFuzzActor( &actorFuzz27, "actorFuzz27", (vector<int>(),313322,196907) );
-	testsOK += testFuzzActor( &actorFuzz28, "actorFuzz28", (vector<int>(),715827,529509,449273,715827,529509,449273,715827,529509,449273,715827,529509,449273,715827,529509,449273,743922) );
-	testsOK += testFuzzActor( &actorFuzz29, "actorFuzz29", (vector<int>(),821092,901028,617942,821092,902028,617942,821092,903028,617942,821092,904028,617942,821092,905028,617942,560881) );
+	testsOK += testFuzzActor( &actorFuzz0, "actorFuzz0", {390229,596271,574865});
+	testsOK += testFuzzActor( &actorFuzz1, "actorFuzz1", {477566,815578,477566,815578,477566,815578,477566,815578,477566,815578,917160});
+	testsOK += testFuzzActor( &actorFuzz2, "actorFuzz2", {476677,930237});
+	testsOK += testFuzzActor( &actorFuzz3, "actorFuzz3", {1000});
+	testsOK += testFuzzActor( &actorFuzz4, "actorFuzz4", {180600,177605,177605,177605,954508,810052});
+	testsOK += testFuzzActor( &actorFuzz5, "actorFuzz5", {1000});
+	testsOK += testFuzzActor( &actorFuzz6, "actorFuzz6", {320321,266526,762336,463730,320321,266526,762336,463730,320321,266526,762336,463730,320321,266526,762336,463730,320321,266526,762336,463730,945289});
+	testsOK += testFuzzActor( &actorFuzz7, "actorFuzz7", {406152,478841,609181,634881,253861,592023,240597,253861,593023,240597,253861,594023,240597,415949,169335,478331,634881,253861,596023,240597,253861,597023,240597,253861,598023,240597,415949,173335,478331,634881,253861,600023,240597,253861,601023,240597,253861,602023,240597,415949,177335,478331,634881,253861,604023,240597,253861,605023,240597,253861,606023,240597,415949,181335,478331,634881,253861,608023,240597,253861,609023,240597,253861,610023,240597,415949,185335,478331,331905,946924,663973,797073,971923,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,295772,923567,559259,559259,559259,325678,679187,534407,814172,949658});
+	testsOK += testFuzzActor( &actorFuzz8, "actorFuzz8", {285937,696473});
+	testsOK += testFuzzActor( &actorFuzz9, "actorFuzz9", {141463,397424});
+	testsOK += testFuzzActor( &actorFuzz10, "actorFuzz10", {543113,1000});
+	testsOK += testFuzzActor( &actorFuzz11, "actorFuzz11", {1000});
+	testsOK += testFuzzActor( &actorFuzz12, "actorFuzz12", {970588,981887});
+	testsOK += testFuzzActor( &actorFuzz13, "actorFuzz13", {861219});
+	testsOK += testFuzzActor( &actorFuzz14, "actorFuzz14", {527098,527098,527098,628047});
+	testsOK += testFuzzActor( &actorFuzz15, "actorFuzz15", {582389,240216,732317,582389,240216,732317,582389,240216,732317,582389,240216,732317,582389,240216,732317,884781});
+	testsOK += testFuzzActor( &actorFuzz16, "actorFuzz16", {943071,492690,908751,198776,537939});
+	testsOK += testFuzzActor( &actorFuzz17, "actorFuzz17", {249436,416782,249436,416782,249436,416782,299183});
+	testsOK += testFuzzActor( &actorFuzz18, "actorFuzz18", {337649,395297,807261,517901});
+	testsOK += testFuzzActor( &actorFuzz19, "actorFuzz19", {492598,139186,742053,492598,140186,742053,492598,141186,742053,592919});
+	testsOK += testFuzzActor( &actorFuzz20, "actorFuzz20", {760082,1000});
+	testsOK += testFuzzActor( &actorFuzz21, "actorFuzz21", {806394});
+	testsOK += testFuzzActor( &actorFuzz22, "actorFuzz22", {722878,369302,416748});
+	testsOK += testFuzzActor( &actorFuzz23, "actorFuzz23", {562792,231437});
+	testsOK += testFuzzActor( &actorFuzz24, "actorFuzz24", {847672,835175});
+	testsOK += testFuzzActor( &actorFuzz25, "actorFuzz25", {843261,327560,592398});
+	testsOK += testFuzzActor( &actorFuzz26, "actorFuzz26", {520263,306397,944232,366272,700651,146918,191890});
+	testsOK += testFuzzActor( &actorFuzz27, "actorFuzz27", {313322,196907});
+	testsOK += testFuzzActor( &actorFuzz28, "actorFuzz28", {715827,529509,449273,715827,529509,449273,715827,529509,449273,715827,529509,449273,715827,529509,449273,743922});
+	testsOK += testFuzzActor( &actorFuzz29, "actorFuzz29", {821092,901028,617942,821092,902028,617942,821092,903028,617942,821092,904028,617942,821092,905028,617942,560881});
 	return std::make_pair(testsOK, 30);
 }
 #endif // WIN32

@@ -24,14 +24,6 @@

 using std::vector;

-inline vector<int>& operator , (vector<int>& v, int a) {
-	v.push_back(a);
-	return v;
-}
-
-inline vector<int>& operator , (vector<int> const& v, int a) {
-	return (const_cast<vector<int>&>(v), a);
-}
-
 inline void throw_operation_failed() { throw operation_failed(); }

 // This is in dsltest.actor.cpp:
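
The deleted operator, overloads existed only so pre-C++11 call sites could spell a vector literal as a parenthesized comma chain; with braced initialization available they are dead weight. The two spellings side by side:

```cpp
// Old spelling: the overloaded comma operator appended each element in turn.
// testFuzzActor( &actorFuzz0, "actorFuzz0", (vector<int>(),390229,596271,574865) );

// New spelling: a braced initializer list builds the same vector directly.
std::vector<int> expected = {390229, 596271, 574865};
```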

@@ -80,16 +80,17 @@ Future<Reference<IAsyncFile>> AsyncFileCached::open_impl( std::string filename,
 	return open_impl(filename, flags, mode, pageCache);
 }

-Future<Void> AsyncFileCached::read_write_impl( AsyncFileCached* self, void* data, int length, int64_t offset, bool writing ) {
-	if (writing) {
+template <bool writing>
+Future<Void> AsyncFileCached::read_write_impl(AsyncFileCached* self,
+                                              typename std::conditional_t<writing, const uint8_t*, uint8_t*> data,
+                                              int length, int64_t offset) {
+	if constexpr (writing) {
 		if (offset + length > self->length)
 			self->length = offset + length;
 	}

 	std::vector<Future<Void>> actors;

-	uint8_t* cdata = static_cast<uint8_t*>(data);
-
 	int offsetInPage = offset % self->pageCache->pageSize;
 	int64_t pageOffset = offset - offsetInPage;
@@ -108,13 +109,16 @@ Future<Void> AsyncFileCached::read_write_impl( AsyncFileCached* self, void* data
 		int bytesInPage = std::min(self->pageCache->pageSize - offsetInPage, remaining);

-		auto w = writing
-			? p->second->write( cdata, bytesInPage, offsetInPage )
-			: p->second->read( cdata, bytesInPage, offsetInPage );
+		Future<Void> w;
+		if constexpr (writing) {
+			w = p->second->write(data, bytesInPage, offsetInPage);
+		} else {
+			w = p->second->read(data, bytesInPage, offsetInPage);
+		}
 		if (!w.isReady() || w.isError())
 			actors.push_back( w );

-		cdata += bytesInPage;
+		data += bytesInPage;
 		pageOffset += self->pageCache->pageSize;
 		offsetInPage = 0;

@@ -28,6 +28,7 @@
 #define FLOW_ASYNCFILECACHED_ACTOR_H

 #include <boost/intrusive/list.hpp>
+#include <type_traits>

 #include "flow/flow.h"
 #include "fdbrpc/IAsyncFile.h"
@@ -166,7 +167,7 @@ public:
 			length = int(this->length - offset);
 			ASSERT(length >= 0);
 		}
-		auto f = read_write_impl(this, data, length, offset, false);
+		auto f = read_write_impl<false>(this, static_cast<uint8_t*>(data), length, offset);
 		if( f.isReady() && !f.isError() ) return length;
 		++countFileCacheReadsBlocked;
 		++countCacheReadsBlocked;
@@ -180,7 +181,7 @@ public:
 		wait(self->currentTruncate);
 		++self->countFileCacheWrites;
 		++self->countCacheWrites;
-		Future<Void> f = read_write_impl(self, const_cast<void*>(data), length, offset, true);
+		Future<Void> f = read_write_impl<true>(self, static_cast<const uint8_t*>(data), length, offset);
 		if (!f.isReady()) {
 			++self->countFileCacheWritesBlocked;
 			++self->countCacheWritesBlocked;
@@ -346,7 +347,10 @@ private:
 		return Void();
 	}

-	static Future<Void> read_write_impl( AsyncFileCached* self, void* data, int length, int64_t offset, bool writing );
+	template <bool writing>
+	static Future<Void> read_write_impl(AsyncFileCached* self,
+	                                    typename std::conditional_t<writing, const uint8_t*, uint8_t*> data,
+	                                    int length, int64_t offset);

 	void remove_page( AFCPage* page );
 };
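
The template-bool rewrite makes the transfer direction a compile-time property: `std::conditional_t` picks a const-correct buffer pointer per instantiation and `if constexpr` discards the untaken branch. A standalone sketch of the same pattern (illustrative, not FDB code):

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Minimal sketch of the read_write_impl pattern: one template serves both
// directions with direction-correct pointer constness.
template <bool writing>
void copyPage(typename std::conditional_t<writing, const uint8_t*, uint8_t*> data,
              uint8_t* page, int length) {
	if constexpr (writing) {
		memcpy(page, data, length); // writing: the caller's buffer is only read
	} else {
		memcpy(data, page, length); // reading: the caller's buffer is written
	}
}
// Usage: copyPage<true>(src, page, n) writes; copyPage<false>(dst, page, n) reads.
```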

@@ -121,7 +121,8 @@ void SimpleFailureMonitor::endpointNotFound(Endpoint const& endpoint) {
 	    .suppressFor(1.0)
 	    .detail("Address", endpoint.getPrimaryAddress())
 	    .detail("Token", endpoint.token);
-	endpointKnownFailed.set(endpoint, true);
+	failedEndpoints.insert(endpoint);
+	endpointKnownFailed.trigger(endpoint);
 }

 void SimpleFailureMonitor::notifyDisconnect(NetworkAddress const& address) {
@@ -132,7 +133,7 @@ void SimpleFailureMonitor::notifyDisconnect(NetworkAddress const& address) {
 Future<Void> SimpleFailureMonitor::onDisconnectOrFailure(Endpoint const& endpoint) {
 	// If the endpoint or address is already failed, return right away
 	auto i = addressStatus.find(endpoint.getPrimaryAddress());
-	if (i == addressStatus.end() || i->second.isFailed() || endpointKnownFailed.get(endpoint)) {
+	if (i == addressStatus.end() || i->second.isFailed() || failedEndpoints.count(endpoint)) {
 		TraceEvent("AlreadyDisconnected").detail("Addr", endpoint.getPrimaryAddress()).detail("Tok", endpoint.token);
 		return Void();
 	}
@@ -149,14 +150,14 @@ Future<Void> SimpleFailureMonitor::onStateChanged(Endpoint const& endpoint) {
 	// failure status for that endpoint can never change (and we could be spuriously triggered by setStatus)
 	// Also returns spuriously when notifyDisconnect is called (which doesn't actually change the state), but callers
 	// check the state so it's OK
-	if (endpointKnownFailed.get(endpoint))
+	if (failedEndpoints.count(endpoint))
 		return Never();
 	else
 		return endpointKnownFailed.onChange(endpoint);
 }

 FailureStatus SimpleFailureMonitor::getState(Endpoint const& endpoint) {
-	if (endpointKnownFailed.get(endpoint))
+	if (failedEndpoints.count(endpoint))
 		return FailureStatus(true);
 	else {
 		auto a = addressStatus.find(endpoint.getPrimaryAddress());
@@ -178,7 +179,7 @@ FailureStatus SimpleFailureMonitor::getState(NetworkAddress const& address) {
 }

 bool SimpleFailureMonitor::onlyEndpointFailed(Endpoint const& endpoint) {
-	if (!endpointKnownFailed.get(endpoint)) return false;
+	if (!failedEndpoints.count(endpoint)) return false;
 	auto a = addressStatus.find(endpoint.getPrimaryAddress());
 	if (a == addressStatus.end())
 		return true;
@@ -187,10 +188,11 @@ bool SimpleFailureMonitor::onlyEndpointFailed(Endpoint const& endpoint) {
 }

 bool SimpleFailureMonitor::permanentlyFailed(Endpoint const& endpoint) {
-	return endpointKnownFailed.get(endpoint);
+	return failedEndpoints.count(endpoint);
 }

 void SimpleFailureMonitor::reset() {
 	addressStatus = std::unordered_map<NetworkAddress, FailureStatus>();
+	failedEndpoints = std::unordered_set<Endpoint>();
 	endpointKnownFailed.resetNoWaiting();
 }

@@ -25,6 +25,7 @@
 #include "flow/flow.h"
 #include "fdbrpc/FlowTransport.h" // Endpoint
 #include <unordered_map>
+#include <unordered_set>

 using std::vector;

@@ -153,6 +154,7 @@ public:
 private:
 	std::unordered_map<NetworkAddress, FailureStatus> addressStatus;
 	YieldedAsyncMap<Endpoint, bool> endpointKnownFailed;
+	std::unordered_set<Endpoint> failedEndpoints;

 	friend class OnStateChangedActorActor;
 };

@@ -122,10 +122,11 @@ const Endpoint& EndpointMap::insert( NetworkAddressList localAddresses, std::vec
 	}

 	UID base = deterministicRandom()->randomUniqueID();
-	for(int i=0; i<streams.size(); i++) {
+	for(uint64_t i=0; i<streams.size(); i++) {
 		int index = adjacentStart+i;
-		streams[i].first->setEndpoint( Endpoint( localAddresses, UID( base.first() | TOKEN_STREAM_FLAG, (base.second()&0xffffffff00000000LL) | index) ) );
-		data[index].token() = Endpoint::Token( base.first() | TOKEN_STREAM_FLAG, (base.second()&0xffffffff00000000LL) | static_cast<uint32_t>(streams[i].second) );
+		uint64_t first = (base.first()+(i<<32)) | TOKEN_STREAM_FLAG;
+		streams[i].first->setEndpoint( Endpoint( localAddresses, UID( first, (base.second()&0xffffffff00000000LL) | index) ) );
+		data[index].token() = Endpoint::Token( first, (base.second()&0xffffffff00000000LL) | static_cast<uint32_t>(streams[i].second) );
 		data[index].receiver = (NetworkMessageReceiver*) streams[i].first;
 	}
@@ -1277,8 +1278,8 @@ void FlowTransport::addEndpoint( Endpoint& endpoint, NetworkMessageReceiver* rec
 	self->endpoints.insert( receiver, endpoint.token, taskID );
 }

-const Endpoint& FlowTransport::addEndpoints( std::vector<std::pair<FlowReceiver*, TaskPriority>> const& streams ) {
-	return self->endpoints.insert( self->localAddresses, streams );
+void FlowTransport::addEndpoints( std::vector<std::pair<FlowReceiver*, TaskPriority>> const& streams ) {
+	self->endpoints.insert( self->localAddresses, streams );
 }

 void FlowTransport::removeEndpoint( const Endpoint& endpoint, NetworkMessageReceiver* receiver ) {

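The loop index above had to become uint64_t because i<<32 on a 32-bit int is undefined behavior once the stream index moves into the high word. The new encoding stamps the index into both halves of the token so that getAdjustedEndpoint (next file) can rederive any sibling endpoint from the first one. A compilable sketch of that round trip; TOKEN_STREAM_FLAG is taken to be the low bit, as in FlowTransport.actor.cpp, but treat the constant here as an assumption:

    #include <cassert>
    #include <cstdint>

    int main() {
        const uint64_t TOKEN_STREAM_FLAG = 1; // assumed low bit
        const uint64_t baseFirst = 0x123456789abcdef0ULL;
        const uint64_t baseSecond = 0xfedcba9800000000ULL;
        for (uint64_t i = 0; i < 5; ++i) {
            // EndpointMap::insert: the i-th stream's token.
            uint64_t first = (baseFirst + (i << 32)) | TOKEN_STREAM_FLAG;
            uint64_t second = (baseSecond & 0xffffffff00000000ULL) | i;
            // getAdjustedEndpoint(i): recover it from the 0-th stream's token.
            uint64_t first0 = baseFirst | TOKEN_STREAM_FLAG;
            uint64_t second0 = baseSecond & 0xffffffff00000000ULL;
            uint32_t newIndex = uint32_t(second0) + uint32_t(i);
            assert(first0 + (i << 32) == first);
            assert(((second0 & 0xffffffff00000000ULL) | newIndex) == second);
        }
        return 0;
    }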
@@ -68,23 +68,17 @@ public:
 	Endpoint getAdjustedEndpoint( uint32_t index ) {
 		uint32_t newIndex = token.second();
 		newIndex += index;
-		return Endpoint( addresses, UID(token.first(), (token.second()&0xffffffff00000000LL) | newIndex) );
+		return Endpoint( addresses, UID(token.first()+(uint64_t(index)<<32), (token.second()&0xffffffff00000000LL) | newIndex) );
 	}

 	bool operator == (Endpoint const& r) const {
-		return getPrimaryAddress() == r.getPrimaryAddress() && token == r.token;
+		return token == r.token && getPrimaryAddress() == r.getPrimaryAddress();
 	}
 	bool operator != (Endpoint const& r) const {
 		return !(*this == r);
 	}

 	bool operator < (Endpoint const& r) const {
-		const NetworkAddress& left = getPrimaryAddress();
-		const NetworkAddress& right = r.getPrimaryAddress();
-		if (left != right)
-			return left < right;
-		else
-			return token < r.token;
+		return addresses.address < r.addresses.address || (addresses.address == r.addresses.address && token < r.token);
 	}

 	template <class Ar>

@@ -109,6 +103,18 @@ public:
 };
 #pragma pack(pop)

+namespace std
+{
+template <>
+struct hash<Endpoint>
+{
+	size_t operator()(const Endpoint& ep) const
+	{
+		return ep.token.hash() + ep.addresses.address.hash();
+	}
+};
+}
+
 class ArenaObjectReader;
 class NetworkMessageReceiver {
 public:

@@ -186,7 +192,7 @@ public:
 	void addEndpoint( Endpoint& endpoint, NetworkMessageReceiver*, TaskPriority taskID );
 	// Sets endpoint to be a new local endpoint which delivers messages to the given receiver

-	const Endpoint& addEndpoints( std::vector<std::pair<struct FlowReceiver*, TaskPriority>> const& streams );
+	void addEndpoints( std::vector<std::pair<struct FlowReceiver*, TaskPriority>> const& streams );
 	void removeEndpoint( const Endpoint&, NetworkMessageReceiver* );
 	// The given local endpoint no longer delivers messages to the given receiver or uses resources

fdbrpc/actorFuzz.py: Normal file → Executable file

@@ -449,7 +449,7 @@ for actor in actors:
 print("std::pair<int,int> actorFuzzTests() {\n\tint testsOK = 0;", file=outputFile)
 for actor in actors:
-    print('\ttestsOK += testFuzzActor( &%s, "%s", (vector<int>(),%s) );' % (actor.name, actor.name, ','.join(str(e) for e in actor.ecx.output)),
+    print('\ttestsOK += testFuzzActor( &%s, "%s", {%s} );' % (actor.name, actor.name, ','.join(str(e) for e in actor.ecx.output)),
           file=outputFile)
 print("\treturn std::make_pair(testsOK, %d);\n}" % len(actors), file=outputFile)
 print('#endif // WIN32\n', file=outputFile)

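The generator change above swaps the old (vector<int>(),1,2,3) spelling for a braced initializer list in the emitted C++. A standalone sketch of what the emitted call now looks like; testFuzzActorLike is a hypothetical stand-in for the real test helper:

    #include <vector>

    // Hypothetical stand-in for the generated call target.
    static int testFuzzActorLike(const std::vector<int>& expected) {
        return static_cast<int>(expected.size());
    }

    int main() {
        // Old emitted form was testFuzzActorLike((vector<int>(), 1, 2, 3)); in plain
        // C++ the comma operator would discard 1, 2, 3, so it relied on an overloaded
        // operator, (an assumption about the old helper).
        // New emitted form: a braced initializer list, plain C++11.
        return testFuzzActorLike({ 1, 2, 3 });
    }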
@@ -85,17 +85,6 @@ void ISimulator::displayWorkers() const
 	return;
 }

-namespace std {
-template<>
-class hash<Endpoint> {
-public:
-	size_t operator()(const Endpoint &s) const
-	{
-		return crc32c_append(0, (const uint8_t*)&s, sizeof(s));
-	}
-};
-}
-
 const UID TOKEN_ENDPOINT_NOT_FOUND(-1, -1);

 ISimulator* g_pSimulator = 0;

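The deleted specialization hashed the raw bytes of Endpoint with crc32c, which silently includes padding bytes and every address field; the replacement in FlowTransport.h combines the member hashes instead (by plain addition there). A sketch of the member-wise pattern using the conventional boost-style combiner; EndpointLike and its fields are illustrative stand-ins:

    #include <cstddef>
    #include <functional>

    struct EndpointLike {
        unsigned long long tokenFirst, tokenSecond; // stand-ins for the UID halves
        unsigned short port;                        // stand-in for the primary address
    };

    namespace std {
    template <>
    struct hash<EndpointLike> {
        size_t operator()(const EndpointLike& e) const {
            // Hash named members only; padding bytes never participate.
            size_t h = hash<unsigned long long>()(e.tokenFirst);
            h ^= hash<unsigned long long>()(e.tokenSecond) + 0x9e3779b9 + (h << 6) + (h >> 2);
            h ^= hash<unsigned short>()(e.port) + 0x9e3779b9 + (h << 6) + (h >> 2);
            return h;
        }
    };
    } // namespace std

The real specialization above simply adds the two member hashes; the xor-shift combiner here is shown only as the more common variant of the same idea.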
@@ -83,21 +83,24 @@ std::map<std::tuple<LogEpoch, Version, int>, std::map<Tag, Version>> BackupProgr
 	auto progressIt = progress.lower_bound(epoch);
 	if (progressIt != progress.end() && progressIt->first == epoch) {
-		if (progressIt != progress.begin()) {
+		std::set<Tag> toCheck = tags;
+		for (auto current = progressIt; current != progress.begin() && !toCheck.empty();) {
+			auto prev = std::prev(current);
 			// Previous epoch is gone, consolidate the progress.
-			auto prev = std::prev(progressIt);
 			for (auto [tag, version] : prev->second) {
-				if (tags.count(tag) > 0) {
+				if (toCheck.count(tag) > 0) {
 					progressIt->second[tag] = std::max(version, progressIt->second[tag]);
+					toCheck.erase(tag);
 				}
 			}
+			current = prev;
 		}
 		updateTagVersions(&tagVersions, &tags, progressIt->second, info.epochEnd, adjustedBeginVersion, epoch);
 	} else {
 		auto rit = std::find_if(
 		    progress.rbegin(), progress.rend(),
 		    [epoch = epoch](const std::pair<LogEpoch, std::map<Tag, Version>>& p) { return p.first < epoch; });
-		if (!(rit == progress.rend())) {
+		while (!(rit == progress.rend())) {
 			// A partial recovery can result in empty epoch that copies previous
 			// epoch's version range. In this case, we should check previous
 			// epoch's savedVersion.

@@ -112,7 +115,9 @@ std::map<std::tuple<LogEpoch, Version, int>, std::map<Tag, Version>> BackupProgr
 				// ASSERT(info.logRouterTags == epochTags[rit->first]);
 				updateTagVersions(&tagVersions, &tags, rit->second, info.epochEnd, adjustedBeginVersion, epoch);
+				break;
 			}
+			rit++;
 		}
 	}

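The consolidation above no longer stops at the immediate predecessor: it keeps stepping back through earlier epochs until every tag in the working set has been accounted for. The underlying idiom is a reverse walk over a std::map with std::prev; a minimal sketch, with int/long standing in for Tag/Version:

    #include <algorithm>
    #include <map>
    #include <set>

    // Fold progress from all earlier epochs into `current`, stopping once every
    // tag in `toCheck` has been seen (mirrors the loop shape above).
    void consolidate(std::map<int, std::map<int, long>>& progress,
                     std::map<int, std::map<int, long>>::iterator current,
                     std::set<int> toCheck) {
        for (auto it = current; it != progress.begin() && !toCheck.empty();) {
            auto prev = std::prev(it);
            for (const auto& [tag, version] : prev->second) {
                if (toCheck.count(tag)) {
                    long& v = current->second[tag];
                    v = std::max(v, version);
                    toCheck.erase(tag);
                }
            }
            it = prev;
        }
    }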
@@ -34,14 +34,17 @@
 #include "flow/actorcompiler.h" // This must be the last #include.

+#define SevDebugMemory SevVerbose
+
 struct VersionedMessage {
 	LogMessageVersion version;
 	StringRef message;
 	VectorRef<Tag> tags;
 	Arena arena; // Keep a reference to the memory containing the message
+	size_t bytes; // arena's size when inserted, which can grow afterwards

 	VersionedMessage(LogMessageVersion v, StringRef m, const VectorRef<Tag>& t, const Arena& a)
-	  : version(v), message(m), tags(t), arena(a) {}
+	  : version(v), message(m), tags(t), arena(a), bytes(a.getSize()) {}
 	const Version getVersion() const { return version.version; }
 	const uint32_t getSubVersion() const { return version.sub; }

@@ -64,6 +67,10 @@ struct VersionedMessage {
 	}
 };

+static bool sameArena(const Arena& a, const Arena& b) {
+	return a.impl.getPtr() == b.impl.getPtr();
+}
+
 struct BackupData {
 	const UID myId;
 	const Tag tag; // LogRouter tag for this worker, i.e., (-2, i)

@@ -84,6 +91,7 @@ struct BackupData {
 	bool stopped = false;
 	bool exitEarly = false; // If the worker is on an old epoch and all backups starts a version >= the endVersion
 	AsyncVar<bool> paused; // Track if "backupPausedKey" is set.
+	Reference<FlowLock> lock;

 	struct PerBackupInfo {
 		PerBackupInfo() = default;

@@ -231,12 +239,14 @@ struct BackupData {
 	  : myId(id), tag(req.routerTag), totalTags(req.totalTags), startVersion(req.startVersion),
 	    endVersion(req.endVersion), recruitedEpoch(req.recruitedEpoch), backupEpoch(req.backupEpoch),
 	    minKnownCommittedVersion(invalidVersion), savedVersion(req.startVersion - 1), popVersion(req.startVersion - 1),
-	    cc("BackupWorker", myId.toString()), pulledVersion(0), paused(false) {
+	    cc("BackupWorker", myId.toString()), pulledVersion(0), paused(false),
+	    lock(new FlowLock(SERVER_KNOBS->BACKUP_LOCK_BYTES)) {
 		cx = openDBOnServer(db, TaskPriority::DefaultEndpoint, true, true);

 		specialCounter(cc, "SavedVersion", [this]() { return this->savedVersion; });
 		specialCounter(cc, "MinKnownCommittedVersion", [this]() { return this->minKnownCommittedVersion; });
 		specialCounter(cc, "MsgQ", [this]() { return this->messages.size(); });
+		specialCounter(cc, "BufferedBytes", [this]() { return this->lock->activePermits(); });
 		logger = traceCounters("BackupWorkerMetrics", myId, SERVER_KNOBS->WORKER_LOGGING_INTERVAL, &cc,
 		                       "BackupWorkerMetrics");
 	}

@@ -310,6 +320,34 @@ struct BackupData {
 		doneTrigger.trigger();
 	}

+	// Erases messages and updates lock with memory released.
+	void eraseMessages(int num) {
+		ASSERT(num <= messages.size());
+		if (num == 0) return;
+
+		if (messages.size() == num) {
+			messages.clear();
+			TraceEvent(SevDebugMemory, "BackupWorkerMemory", myId).detail("ReleaseAll", lock->activePermits());
+			lock->release(lock->activePermits());
+			return;
+		}
+
+		// keep track of each arena and accumulate their sizes
+		int64_t bytes = 0;
+		for (int i = 0; i < num; i++) {
+			const Arena& a = messages[i].arena;
+			const Arena& b = messages[i + 1].arena;
+			if (!sameArena(a, b)) {
+				bytes += messages[i].bytes;
+				TraceEvent(SevDebugMemory, "BackupWorkerMemory", myId)
+				    .detail("Release", messages[i].bytes)
+				    .detail("Arena", (void*)a.impl.getPtr());
+			}
+		}
+		lock->release(bytes);
+		messages.erase(messages.begin(), messages.begin() + num);
+	}
+
 	void eraseMessagesAfterEndVersion() {
 		ASSERT(endVersion.present());
 		const Version ver = endVersion.get();

@@ -637,6 +675,7 @@ ACTOR Future<Void> saveMutationsToFile(BackupData* self, Version popVersion, int
 	state std::vector<Reference<IBackupFile>> logFiles;
 	state std::vector<int64_t> blockEnds;
 	state std::vector<UID> activeUids; // active Backups' UIDs
+	state std::vector<Version> beginVersions; // logFiles' begin versions
 	state KeyRangeMap<std::set<int>> keyRangeMap; // range to index in logFileFutures, logFiles, & blockEnds
 	state std::vector<Standalone<StringRef>> mutations;
 	state int idx;

@@ -655,15 +694,20 @@ ACTOR Future<Void> saveMutationsToFile(BackupData* self, Version popVersion, int
 		const int index = logFileFutures.size();
 		activeUids.push_back(it->first);
 		self->insertRanges(keyRangeMap, it->second.ranges.get(), index);
 		if (it->second.lastSavedVersion == invalidVersion) {
 			if (it->second.startVersion > self->startVersion && !self->messages.empty()) {
 				// True-up first mutation log's begin version
 				it->second.lastSavedVersion = self->messages[0].getVersion();
 			} else {
-				it->second.lastSavedVersion =
-				    std::max(self->popVersion, std::max(self->savedVersion, self->startVersion));
+				it->second.lastSavedVersion = std::max({ self->popVersion, self->savedVersion, self->startVersion });
 			}
+			TraceEvent("BackupWorkerTrueUp", self->myId).detail("LastSavedVersion", it->second.lastSavedVersion);
 		}
+		// The true-up version can be larger than first message version, so keep
+		// the begin versions for later mutation filtering.
+		beginVersions.push_back(it->second.lastSavedVersion);
 		logFileFutures.push_back(it->second.container.get().get()->writeTaggedLogFile(
 		    it->second.lastSavedVersion, popVersion + 1, blockSize, self->tag.id, self->totalTags));
 		it++;

@@ -675,7 +719,7 @@ ACTOR Future<Void> saveMutationsToFile(BackupData* self, Version popVersion, int
 	std::transform(logFileFutures.begin(), logFileFutures.end(), std::back_inserter(logFiles),
 	               [](const Future<Reference<IBackupFile>>& f) { return f.get(); });

-	ASSERT(activeUids.size() == logFiles.size());
+	ASSERT(activeUids.size() == logFiles.size() && beginVersions.size() == logFiles.size());
 	for (int i = 0; i < logFiles.size(); i++) {
 		TraceEvent("OpenMutationFile", self->myId)
 		    .detail("BackupID", activeUids[i])

@@ -698,7 +742,10 @@ ACTOR Future<Void> saveMutationsToFile(BackupData* self, Version popVersion, int
 		std::vector<Future<Void>> adds;
 		if (m.type != MutationRef::Type::ClearRange) {
 			for (int index : keyRangeMap[m.param1]) {
-				adds.push_back(addMutation(logFiles[index], message, message.message, &blockEnds[index], blockSize));
+				if (message.getVersion() >= beginVersions[index]) {
+					adds.push_back(
+					    addMutation(logFiles[index], message, message.message, &blockEnds[index], blockSize));
+				}
 			}
 		} else {
 			KeyRangeRef mutationRange(m.param1, m.param2);

@@ -713,8 +760,10 @@ ACTOR Future<Void> saveMutationsToFile(BackupData* self, Version popVersion, int
 				wr << subm;
 				mutations.push_back(wr.toValue());
 				for (int index : range.value()) {
-					adds.push_back(
-					    addMutation(logFiles[index], message, mutations.back(), &blockEnds[index], blockSize));
+					if (message.getVersion() >= beginVersions[index]) {
+						adds.push_back(
+						    addMutation(logFiles[index], message, mutations.back(), &blockEnds[index], blockSize));
+					}
 				}
 			}
 		}

@@ -791,12 +840,12 @@ ACTOR Future<Void> uploadData(BackupData* self) {
 			    .detail("MsgQ", self->messages.size());
 			// save an empty file for old epochs so that log file versions are continuous
 			wait(saveMutationsToFile(self, popVersion, numMsg));
-			self->messages.erase(self->messages.begin(), self->messages.begin() + numMsg);
+			self->eraseMessages(numMsg);
 		}

 		// If transition into NOOP mode, should clear messages
 		if (!self->pulling) {
-			self->messages.clear();
+			self->eraseMessages(self->messages.size());
 		}

 		if (popVersion > self->savedVersion && popVersion > self->popVersion) {

@@ -810,7 +859,7 @@ ACTOR Future<Void> uploadData(BackupData* self) {
 		}

 		if (self->allMessageSaved()) {
-			self->messages.clear();
+			self->eraseMessages(self->messages.size());
 			return Void();
 		}

@@ -825,6 +874,7 @@ ACTOR Future<Void> pullAsyncData(BackupData* self) {
 	state Future<Void> logSystemChange = Void();
 	state Reference<ILogSystem::IPeekCursor> r;
 	state Version tagAt = std::max(self->pulledVersion.get(), std::max(self->startVersion, self->savedVersion));
+	state Arena prev;

 	TraceEvent("BackupWorkerPull", self->myId);
 	loop {

@@ -850,6 +900,15 @@ ACTOR Future<Void> pullAsyncData(BackupData* self) {
 		// Note we aggressively peek (uncommitted) messages, but only committed
 		// messages/mutations will be flushed to disk/blob in uploadData().
 		while (r->hasMessage()) {
+			if (!sameArena(prev, r->arena())) {
+				TraceEvent(SevDebugMemory, "BackupWorkerMemory", self->myId)
+				    .detail("Take", r->arena().getSize())
+				    .detail("Arena", (void*)r->arena().impl.getPtr())
+				    .detail("Current", self->lock->activePermits());
+				wait(self->lock->take(TaskPriority::DefaultYield, r->arena().getSize()));
+				prev = r->arena();
+			}
 			self->messages.emplace_back(r->version(), r->getMessage(), r->getTags(), r->arena());
 			r->nextMessage();
 		}

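The accounting above is deliberately per-arena, not per-message: pullAsyncData takes FlowLock permits once when the cursor switches to a new arena, and eraseMessages returns a message's bytes only at the last queued message sharing its arena. A sketch of the release-side invariant, with std::shared_ptr standing in for Arena identity (FlowLock itself is not modeled):

    #include <deque>
    #include <memory>

    struct Msg {
        std::shared_ptr<int> arena; // stand-in for Arena; pointer identity means same buffer
        long bytes;                 // arena size recorded when the message was queued
    };

    // Permits to release for the first `num` messages: only the last message
    // backed by a given arena carries that arena's cost (mirrors eraseMessages).
    long bytesToRelease(const std::deque<Msg>& messages, size_t num) {
        long bytes = 0;
        for (size_t i = 0; i < num; ++i) {
            bool lastOfArena = (i + 1 == messages.size()) ||
                               messages[i].arena.get() != messages[i + 1].arena.get();
            if (lastOfArena) bytes += messages[i].bytes;
        }
        return bytes;
    }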
@@ -134,6 +134,7 @@ set(FDBSERVER_SRCS
 	workloads/ConsistencyCheck.actor.cpp
 	workloads/CpuProfiler.actor.cpp
 	workloads/Cycle.actor.cpp
+	workloads/DataDistributionMetrics.actor.cpp
 	workloads/DDBalance.actor.cpp
 	workloads/DDMetrics.actor.cpp
 	workloads/DDMetricsExclude.actor.cpp

@@ -4429,7 +4429,7 @@ ACTOR Future<Void> monitorBatchLimitedTime(Reference<AsyncVar<ServerDBInfo>> db,
 	}
 }

-ACTOR Future<Void> dataDistribution(Reference<DataDistributorData> self)
+ACTOR Future<Void> dataDistribution(Reference<DataDistributorData> self, PromiseStream<GetMetricsListRequest> getShardMetricsList)
 {
 	state double lastLimited = 0;
 	self->addActor.send( monitorBatchLimitedTime(self->dbInfo, &lastLimited) );

@@ -4605,7 +4605,7 @@
 	}

 	actors.push_back( pollMoveKeysLock(cx, lock) );
-	actors.push_back( reportErrorsExcept( dataDistributionTracker( initData, cx, output, shardsAffectedByTeamFailure, getShardMetrics, getAverageShardBytes.getFuture(), readyToStart, anyZeroHealthyTeams, self->ddId ), "DDTracker", self->ddId, &normalDDQueueErrors() ) );
+	actors.push_back( reportErrorsExcept( dataDistributionTracker( initData, cx, output, shardsAffectedByTeamFailure, getShardMetrics, getShardMetricsList, getAverageShardBytes.getFuture(), readyToStart, anyZeroHealthyTeams, self->ddId ), "DDTracker", self->ddId, &normalDDQueueErrors() ) );
 	actors.push_back( reportErrorsExcept( dataDistributionQueue( cx, output, input.getFuture(), getShardMetrics, processingUnhealthy, tcis, shardsAffectedByTeamFailure, lock, getAverageShardBytes, self->ddId, storageTeamSize, configuration.storageTeamSize, &lastLimited ), "DDQueue", self->ddId, &normalDDQueueErrors() ) );
 	vector<DDTeamCollection*> teamCollectionsPtrs;

@@ -4856,6 +4856,7 @@ ACTOR Future<Void> ddExclusionSafetyCheck(DistributorExclusionSafetyCheckRequest
 ACTOR Future<Void> dataDistributor(DataDistributorInterface di, Reference<AsyncVar<struct ServerDBInfo>> db ) {
 	state Reference<DataDistributorData> self( new DataDistributorData(db, di.id()) );
 	state Future<Void> collection = actorCollection( self->addActor.getFuture() );
+	state PromiseStream<GetMetricsListRequest> getShardMetricsList;
 	state Database cx = openDBOnServer(db, TaskPriority::DefaultDelay, true, true);
 	state ActorCollection actors(false);
 	self->addActor.send(actors.getResult());

@@ -4864,7 +4865,7 @@ ACTOR Future<Void> dataDistributor(DataDistributorInterface di, Reference<AsyncV
 	try {
 		TraceEvent("DataDistributorRunning", di.id());
 		self->addActor.send( waitFailureServer(di.waitFailure.getFuture()) );
-		state Future<Void> distributor = reportErrorsExcept( dataDistribution(self), "DataDistribution", di.id(), &normalDataDistributorErrors() );
+		state Future<Void> distributor = reportErrorsExcept( dataDistribution(self, getShardMetricsList), "DataDistribution", di.id(), &normalDataDistributorErrors() );

 		loop choose {
 			when ( wait(distributor || collection) ) {

@@ -4876,6 +4877,17 @@ ACTOR Future<Void> dataDistributor(DataDistributorInterface di, Reference<AsyncV
 				TraceEvent("DataDistributorHalted", di.id()).detail("ReqID", req.requesterID);
 				break;
 			}
+			when ( state GetDataDistributorMetricsRequest req = waitNext(di.dataDistributorMetrics.getFuture()) ) {
+				ErrorOr<Standalone<VectorRef<DDMetricsRef>>> result = wait(errorOr(brokenPromiseToNever(
+				    getShardMetricsList.getReply(GetMetricsListRequest(req.keys, req.shardLimit)))));
+				if ( result.isError() ) {
+					req.reply.sendError(result.getError());
+				} else {
+					GetDataDistributorMetricsReply rep;
+					rep.storageMetricsList = result.get();
+					req.reply.send(rep);
+				}
+			}
 			when(DistributorSnapRequest snapReq = waitNext(di.distributorSnapReq.getFuture())) {
 				actors.add(ddSnapCreate(snapReq, db));
 			}

@@ -107,6 +107,15 @@ struct GetMetricsRequest {
 	GetMetricsRequest( KeyRange const& keys ) : keys(keys) {}
 };

+struct GetMetricsListRequest {
+	KeyRange keys;
+	int shardLimit;
+	Promise<Standalone<VectorRef<DDMetricsRef>>> reply;
+
+	GetMetricsListRequest() {}
+	GetMetricsListRequest( KeyRange const& keys, const int shardLimit ) : keys(keys), shardLimit(shardLimit) {}
+};
+
 struct TeamCollectionInterface {
 	PromiseStream< GetTeamRequest > getTeam;
 };

@@ -203,6 +212,7 @@ Future<Void> dataDistributionTracker(
 	PromiseStream<RelocateShard> const& output,
 	Reference<ShardsAffectedByTeamFailure> const& shardsAffectedByTeamFailure,
 	PromiseStream<GetMetricsRequest> const& getShardMetrics,
+	PromiseStream<GetMetricsListRequest> const& getShardMetricsList,
 	FutureStream<Promise<int64_t>> const& getAverageShardBytes,
 	Promise<Void> const& readyToStart,
 	Reference<AsyncVar<bool>> const& zeroHealthyTeams,

@@ -813,12 +813,60 @@ ACTOR Future<Void> fetchShardMetrics( DataDistributionTracker* self, GetMetricsR
 	return Void();
 }

+ACTOR Future<Void> fetchShardMetricsList_impl( DataDistributionTracker* self, GetMetricsListRequest req ) {
+	try {
+		loop {
+			// used to control shard limit
+			int shardNum = 0;
+			// list of metrics, regenerate on loop when full range unsuccessful
+			Standalone<VectorRef<DDMetricsRef>> result;
+			Future<Void> onChange;
+			for (auto t : self->shards.containedRanges(req.keys)) {
+				auto &stats = t.value().stats;
+				if( !stats->get().present() ) {
+					onChange = stats->onChange();
+					break;
+				}
+				result.push_back_deep(result.arena(),
+				                      DDMetricsRef(stats->get().get().metrics.bytes, KeyRef(t.begin().toString())));
+				++shardNum;
+				if (shardNum >= req.shardLimit) {
+					break;
+				}
+			}
+
+			if( !onChange.isValid() ) {
+				req.reply.send( result );
+				return Void();
+			}
+
+			wait( onChange );
+		}
+	} catch( Error &e ) {
+		if( e.code() != error_code_actor_cancelled && !req.reply.isSet() )
+			req.reply.sendError(e);
+		throw;
+	}
+}
+
+ACTOR Future<Void> fetchShardMetricsList( DataDistributionTracker* self, GetMetricsListRequest req ) {
+	choose {
+		when( wait( fetchShardMetricsList_impl( self, req ) ) ) {}
+		when( wait( delay( SERVER_KNOBS->DD_SHARD_METRICS_TIMEOUT ) ) ) {
+			req.reply.sendError(timed_out());
+		}
+	}
+	return Void();
+}
+
 ACTOR Future<Void> dataDistributionTracker(
 	Reference<InitialDataDistribution> initData,
 	Database cx,
 	PromiseStream<RelocateShard> output,
 	Reference<ShardsAffectedByTeamFailure> shardsAffectedByTeamFailure,
 	PromiseStream<GetMetricsRequest> getShardMetrics,
+	PromiseStream<GetMetricsListRequest> getShardMetricsList,
 	FutureStream<Promise<int64_t>> getAverageShardBytes,
 	Promise<Void> readyToStart,
 	Reference<AsyncVar<bool>> anyZeroHealthyTeams,

@@ -847,6 +895,9 @@ ACTOR Future<Void> dataDistributionTracker(
 			when( GetMetricsRequest req = waitNext( getShardMetrics.getFuture() ) ) {
 				self.sizeChanges.add( fetchShardMetrics( &self, req ) );
 			}
+			when( GetMetricsListRequest req = waitNext( getShardMetricsList.getFuture() ) ) {
+				self.sizeChanges.add( fetchShardMetricsList( &self, req ) );
+			}
 			when( wait( self.sizeChanges.getResult() ) ) {}
 		}
 	} catch (Error& e) {

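fetchShardMetricsList_impl above retries wholesale: if any shard's stats are not yet present, it grabs that shard's onChange future, waits, and rebuilds the entire result from scratch. A condensed sketch of the same shape, using a condition_variable in place of the flow primitives:

    #include <condition_variable>
    #include <mutex>
    #include <optional>
    #include <vector>

    std::mutex m;
    std::condition_variable statsChanged;           // analog of stats->onChange()
    std::vector<std::optional<long>> shardStats(4); // nullopt = stats not yet present

    // Rebuild the full result on every wakeup; a partially built vector is
    // discarded whenever any shard is still pending, exactly like the loop above.
    std::vector<long> fetchAll() {
        std::unique_lock<std::mutex> lk(m);
        for (;;) {
            std::vector<long> result;
            bool pending = false;
            for (const auto& s : shardStats) {
                if (!s) { pending = true; break; }
                result.push_back(*s);
            }
            if (!pending) return result;
            statsChanged.wait(lk);                  // analog of wait(onChange)
        }
    }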
@@ -32,6 +32,7 @@ struct DataDistributorInterface {
 	struct LocalityData locality;
 	RequestStream<struct DistributorSnapRequest> distributorSnapReq;
 	RequestStream<struct DistributorExclusionSafetyCheckRequest> distributorExclCheckReq;
+	RequestStream<struct GetDataDistributorMetricsRequest> dataDistributorMetrics;

 	DataDistributorInterface() {}
 	explicit DataDistributorInterface(const struct LocalityData& l) : locality(l) {}

@@ -48,7 +49,7 @@ struct DataDistributorInterface {
 	template <class Archive>
 	void serialize(Archive& ar) {
-		serializer(ar, waitFailure, haltDataDistributor, locality, distributorSnapReq, distributorExclCheckReq);
+		serializer(ar, waitFailure, haltDataDistributor, locality, distributorSnapReq, distributorExclCheckReq, dataDistributorMetrics);
 	}
 };

@@ -66,6 +67,33 @@ struct HaltDataDistributorRequest {
 	}
 };

+struct GetDataDistributorMetricsReply {
+	constexpr static FileIdentifier file_identifier = 1284337;
+	Standalone<VectorRef<DDMetricsRef>> storageMetricsList;
+
+	GetDataDistributorMetricsReply() {}
+
+	template <class Ar>
+	void serialize(Ar& ar) {
+		serializer(ar,storageMetricsList);
+	}
+};
+
+struct GetDataDistributorMetricsRequest {
+	constexpr static FileIdentifier file_identifier = 1059267;
+	KeyRange keys;
+	int shardLimit;
+	ReplyPromise<struct GetDataDistributorMetricsReply> reply;
+
+	GetDataDistributorMetricsRequest() {}
+	explicit GetDataDistributorMetricsRequest(KeyRange const& keys, const int shardLimit) : keys(keys), shardLimit(shardLimit) {}
+
+	template<class Ar>
+	void serialize(Ar& ar) {
+		serializer(ar, keys, shardLimit, reply);
+	}
+};
+
 struct DistributorSnapRequest
 {
 	constexpr static FileIdentifier file_identifier = 22204900;

@@ -7,7 +7,7 @@
 #include "fdbserver/FDBExecHelper.actor.h"
 #include "flow/Trace.h"
 #include "flow/flow.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"
 #include "fdbserver/Knobs.h"
 #include "flow/actorcompiler.h" // This must be the last #include.

@@ -387,7 +387,8 @@ void ServerKnobs::initialize(bool randomize, ClientKnobs* clientKnobs, bool isSi
 	init( BACKUP_TIMEOUT, 0.4 );
 	init( BACKUP_NOOP_POP_DELAY, 5.0 );
 	init( BACKUP_FILE_BLOCK_BYTES, 1024 * 1024 );
-	init( BACKUP_UPLOAD_DELAY, 10.0 ); if( randomize && BUGGIFY ) BACKUP_UPLOAD_DELAY = deterministicRandom()->random01() * 20; // TODO: Increase delay range
+	init( BACKUP_LOCK_BYTES, 3e9 ); if(randomize && BUGGIFY) BACKUP_LOCK_BYTES = deterministicRandom()->randomInt(1024, 4096) * 1024;
+	init( BACKUP_UPLOAD_DELAY, 10.0 ); if(randomize && BUGGIFY) BACKUP_UPLOAD_DELAY = deterministicRandom()->random01() * 60;

 	//Cluster Controller
 	init( CLUSTER_CONTROLLER_LOGGING_DELAY, 5.0 );

@@ -629,6 +630,13 @@ void ServerKnobs::initialize(bool randomize, ClientKnobs* clientKnobs, bool isSi
 	init( REDWOOD_DEFAULT_PAGE_SIZE, 4096 );
 	init( REDWOOD_KVSTORE_CONCURRENT_READS, 64 );
 	init( REDWOOD_PAGE_REBUILD_FILL_FACTOR, 0.66 );
+	init( REDWOOD_LAZY_CLEAR_BATCH_SIZE_PAGES, 10 );
+	init( REDWOOD_LAZY_CLEAR_MIN_PAGES, 0 );
+	init( REDWOOD_LAZY_CLEAR_MAX_PAGES, 1e6 );
+	init( REDWOOD_REMAP_CLEANUP_BATCH_SIZE, 5000 );
+	init( REDWOOD_REMAP_CLEANUP_VERSION_LAG_MIN, 4 );
+	init( REDWOOD_REMAP_CLEANUP_VERSION_LAG_MAX, 15 );
+	init( REDWOOD_LOGGING_INTERVAL, 5.0 );

 	// clang-format on

@@ -179,7 +179,7 @@ public:
 	int64_t DD_SS_FAILURE_VERSIONLAG; // Allowed SS version lag from the current read version before marking it as failed.
 	int64_t DD_SS_ALLOWED_VERSIONLAG; // SS will be marked as healthy if it's version lag goes below this value.
 	double DD_SS_STUCK_TIME_LIMIT; // If a storage server is not getting new versions for this amount of time, then it becomes undesired.
 	// TeamRemover to remove redundant teams
 	bool TR_FLAG_DISABLE_MACHINE_TEAM_REMOVER; // disable the machineTeamRemover actor
 	double TR_REMOVE_MACHINE_TEAM_DELAY; // wait for the specified time before try to remove next machine team

@@ -313,6 +313,7 @@ public:
 	double BACKUP_TIMEOUT; // master's reaction time for backup failure
 	double BACKUP_NOOP_POP_DELAY;
 	int BACKUP_FILE_BLOCK_BYTES;
+	int64_t BACKUP_LOCK_BYTES;
 	double BACKUP_UPLOAD_DELAY;

 	//Cluster Controller

@@ -561,6 +562,13 @@ public:
 	int REDWOOD_DEFAULT_PAGE_SIZE; // Page size for new Redwood files
 	int REDWOOD_KVSTORE_CONCURRENT_READS; // Max number of simultaneous point or range reads in progress.
 	double REDWOOD_PAGE_REBUILD_FILL_FACTOR; // When rebuilding pages, start a new page after this capacity
+	int REDWOOD_LAZY_CLEAR_BATCH_SIZE_PAGES; // Number of pages to try to pop from the lazy delete queue and process at once
+	int REDWOOD_LAZY_CLEAR_MIN_PAGES; // Minimum number of pages to free before ending a lazy clear cycle, unless the queue is empty
+	int REDWOOD_LAZY_CLEAR_MAX_PAGES; // Maximum number of pages to free before ending a lazy clear cycle, unless the queue is empty
+	int REDWOOD_REMAP_CLEANUP_BATCH_SIZE; // Number of queue entries for remap cleanup to process and potentially coalesce at once.
+	int REDWOOD_REMAP_CLEANUP_VERSION_LAG_MIN; // Number of versions between head of remap queue and oldest retained version before remap cleanup starts
+	int REDWOOD_REMAP_CLEANUP_VERSION_LAG_MAX; // Number of versions between head of remap queue and oldest retained version before remap cleanup may stop
+	double REDWOOD_LOGGING_INTERVAL;

 	ServerKnobs();
 	void initialize(bool randomize = false, ClientKnobs* clientKnobs = NULL, bool isSimulated = false);

@@ -33,7 +33,6 @@ typedef uint64_t DBRecoveryCount;
 struct MasterInterface {
 	constexpr static FileIdentifier file_identifier = 5979145;
 	LocalityData locality;
-	Endpoint base;
 	RequestStream< ReplyPromise<Void> > waitFailure;
 	RequestStream< struct TLogRejoinRequest > tlogRejoin; // sent by tlog (whether or not rebooted) to communicate with a new master
 	RequestStream< struct ChangeCoordinatorsRequest > changeCoordinators;

@@ -49,13 +48,12 @@ struct MasterInterface {
 		if constexpr (!is_fb_function<Archive>) {
 			ASSERT(ar.protocolVersion().isValid());
 		}
-		serializer(ar, locality, base);
+		serializer(ar, locality, waitFailure);
 		if( Archive::isDeserializing ) {
-			waitFailure = RequestStream< ReplyPromise<Void> >( base.getAdjustedEndpoint(0) );
-			tlogRejoin = RequestStream< struct TLogRejoinRequest >( base.getAdjustedEndpoint(1) );
-			changeCoordinators = RequestStream< struct ChangeCoordinatorsRequest >( base.getAdjustedEndpoint(2) );
-			getCommitVersion = RequestStream< struct GetCommitVersionRequest >( base.getAdjustedEndpoint(3) );
-			notifyBackupWorkerDone = RequestStream<struct BackupWorkerDoneRequest>( base.getAdjustedEndpoint(4) );
+			tlogRejoin = RequestStream< struct TLogRejoinRequest >( waitFailure.getEndpoint().getAdjustedEndpoint(1) );
+			changeCoordinators = RequestStream< struct ChangeCoordinatorsRequest >( waitFailure.getEndpoint().getAdjustedEndpoint(2) );
+			getCommitVersion = RequestStream< struct GetCommitVersionRequest >( waitFailure.getEndpoint().getAdjustedEndpoint(3) );
+			notifyBackupWorkerDone = RequestStream<struct BackupWorkerDoneRequest>( waitFailure.getEndpoint().getAdjustedEndpoint(4) );
 		}
 	}

@@ -66,7 +64,7 @@ struct MasterInterface {
 		streams.push_back(changeCoordinators.getReceiver());
 		streams.push_back(getCommitVersion.getReceiver(TaskPriority::GetConsistentReadVersion));
 		streams.push_back(notifyBackupWorkerDone.getReceiver());
-		base = FlowTransport::transport().addEndpoints(streams);
+		FlowTransport::transport().addEndpoints(streams);
 	}
 };

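With the Endpoint base member gone, only waitFailure crosses the wire and every other stream is reconstituted from an adjacent endpoint index, so the index order in serialize() must track the push_back order in initEndpoints() exactly. A sketch of pinning that contract down in one place; the names are illustrative, not FDB declarations:

    // Illustrative only: keep the adjacent-endpoint indices next to the interface
    // so serialize() and initEndpoints() cannot drift apart.
    enum MasterEndpointIndex : int {
        WaitFailureIdx = 0, // the one stream actually serialized; anchors the block
        TlogRejoinIdx = 1,
        ChangeCoordinatorsIdx = 2,
        GetCommitVersionIdx = 3,
        NotifyBackupWorkerDoneIdx = 4,
        MasterEndpointCount
    };
    static_assert(MasterEndpointCount == 5,
                  "update serialize() and initEndpoints() together when adding a stream");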
@@ -1756,6 +1756,25 @@ ACTOR Future<Void> healthMetricsRequestServer(MasterProxyInterface proxy, GetHea
 	}
 }

+ACTOR Future<Void> ddMetricsRequestServer(MasterProxyInterface proxy, Reference<AsyncVar<ServerDBInfo>> db)
+{
+	loop {
+		choose {
+			when(state GetDDMetricsRequest req = waitNext(proxy.getDDMetrics.getFuture()))
+			{
+				ErrorOr<GetDataDistributorMetricsReply> reply = wait(errorOr(db->get().distributor.get().dataDistributorMetrics.getReply(GetDataDistributorMetricsRequest(req.keys, req.shardLimit))));
+				if ( reply.isError() ) {
+					req.reply.sendError(reply.getError());
+				} else {
+					GetDDMetricsReply newReply;
+					newReply.storageMetricsList = reply.get().storageMetricsList;
+					req.reply.send(newReply);
+				}
+			}
+		}
+	}
+}
+
 ACTOR Future<Void> monitorRemoteCommitted(ProxyCommitData* self) {
 	loop {
 		wait(delay(0)); //allow this actor to be cancelled if we are removed after db changes.

@@ -1996,6 +2015,7 @@ ACTOR Future<Void> masterProxyServerCore(
 	addActor.send(readRequestServer(proxy, addActor, &commitData));
 	addActor.send(rejoinServer(proxy, &commitData));
 	addActor.send(healthMetricsRequestServer(proxy, &healthMetricsReply, &detailedHealthMetricsReply));
+	addActor.send(ddMetricsRequestServer(proxy, db));

 	// wait for txnStateStore recovery
 	wait(success(commitData.txnStateStore->readValue(StringRef())));

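ddMetricsRequestServer above is a pure forwarder: it asks the data distributor, then either relays the error or repackages the payload into the proxy's reply type. A sketch of that translate-or-propagate shape, with std::variant standing in for ErrorOr and the reply types renamed as stand-ins:

    #include <stdexcept>
    #include <string>
    #include <variant>

    struct DownstreamReply { long payload; }; // analog of GetDataDistributorMetricsReply
    struct UpstreamReply { long payload; };   // analog of GetDDMetricsReply
    using ErrorOrReply = std::variant<std::string, DownstreamReply>; // error text or value

    UpstreamReply translateOrPropagate(const ErrorOrReply& r) {
        if (const auto* err = std::get_if<std::string>(&r))
            throw std::runtime_error(*err); // analog of req.reply.sendError(...)
        // analog of building newReply and req.reply.send(newReply)
        return UpstreamReply{ std::get<DownstreamReply>(r).payload };
    }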
@@ -782,7 +782,7 @@ ACTOR Future<Void> monitorThrottlingChanges(RatekeeperData *self) {
 					TransactionTag tag = *tagKey.tags.begin();
 					Optional<ClientTagThrottleLimits> oldLimits = self->throttledTags.getManualTagThrottleLimits(tag, tagKey.priority);

-					if(tagKey.autoThrottled) {
+					if(tagKey.throttleType == TagThrottleType::AUTO) {
 						updatedTagThrottles.autoThrottleTag(self->id, tag, 0, tagValue.tpsRate, tagValue.expirationTime);
 					}
 					else {

@@ -819,7 +819,7 @@ void tryAutoThrottleTag(RatekeeperData *self, StorageQueueInfo const& ss, RkTagT
 			TagSet tags;
 			tags.addTag(ss.busiestTag.get());

-			self->addActor.send(ThrottleApi::throttleTags(self->db, tags, clientRate.get(), SERVER_KNOBS->AUTO_TAG_THROTTLE_DURATION, true, TransactionPriority::DEFAULT, now() + SERVER_KNOBS->AUTO_TAG_THROTTLE_DURATION));
+			self->addActor.send(ThrottleApi::throttleTags(self->db, tags, clientRate.get(), SERVER_KNOBS->AUTO_TAG_THROTTLE_DURATION, TagThrottleType::AUTO, TransactionPriority::DEFAULT, now() + SERVER_KNOBS->AUTO_TAG_THROTTLE_DURATION));
 		}
 	}
 }

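The call sites above trade a bare bool (true meant auto-throttled) for TagThrottleType::AUTO, which documents the policy at the point of use and leaves room for more kinds later. A minimal sketch of the pattern; TagThrottleTypeLike is an illustrative stand-in, not the real enum:

    #include <cstdio>

    enum class TagThrottleTypeLike { AUTO, MANUAL }; // illustrative stand-in

    // Before: throttle(tag, rate, /*autoThrottled=*/true) was opaque at the call site.
    // After: the enum names the policy where it is used.
    void throttle(const char* tag, double rate, TagThrottleTypeLike type) {
        std::printf("%s -> %.1f tps (%s)\n", tag, rate,
                    type == TagThrottleTypeLike::AUTO ? "auto" : "manual");
    }

    int main() {
        throttle("busy_tag", 100.0, TagThrottleTypeLike::AUTO);
        return 0;
    }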
@@ -217,6 +217,9 @@ ACTOR static Future<Void> _parsePartitionedLogFileOnLoader(
 	VersionedMutationsMap::iterator it;
 	bool inserted;
 	std::tie(it, inserted) = kvOps.emplace(msgVersion, MutationsVec());
+	// A clear mutation can be split into multiple mutations with the same (version, sub).
+	// See saveMutationsToFile(). Current tests only use one key range per backup, thus
+	// only one clear mutation is generated (i.e., always inserted).
 	ASSERT(inserted);

 	ArenaReader rd(buf.arena(), StringRef(message, msgSize), AssumeVersion(currentProtocolVersion));

@@ -31,7 +31,7 @@
 #include "fdbclient/ManagementAPI.actor.h"
 #include "fdbclient/NativeAPI.actor.h"
 #include "fdbclient/BackupAgent.actor.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"
 #include "flow/actorcompiler.h" // This must be the last #include.

 #undef max

@@ -728,7 +728,7 @@ StringRef setK(Arena& arena, int i) {
 #include "fdbserver/ConflictSet.h"

 struct ConflictSet {
-	ConflictSet() : oldestVersion(0) {}
+	ConflictSet() : oldestVersion(0), removalKey(makeString(0)) {}
 	~ConflictSet() {}

 	SkipList versionHistory;

@@ -377,9 +377,9 @@ JsonBuilderObject getLagObject(int64_t versions) {
 struct MachineMemoryInfo {
 	double memoryUsage;
-	double numProcesses;
+	double aggregateLimit;

-	MachineMemoryInfo() : memoryUsage(0), numProcesses(0) {}
+	MachineMemoryInfo() : memoryUsage(0), aggregateLimit(0) {}

 	bool valid() { return memoryUsage >= 0; }
 	void invalidate() { memoryUsage = -1; }

@@ -613,11 +613,12 @@ ACTOR static Future<JsonBuilderObject> processStatusFetcher(
 		try {
 			ASSERT(pMetrics.count(workerItr->interf.address()));
 			const TraceEventFields& processMetrics = pMetrics[workerItr->interf.address()];
+			const TraceEventFields& programStart = programStarts[workerItr->interf.address()];

 			if(memInfo->second.valid()) {
-				if(processMetrics.size() > 0) {
+				if(processMetrics.size() > 0 && programStart.size() > 0) {
 					memInfo->second.memoryUsage += processMetrics.getDouble("Memory");
-					++memInfo->second.numProcesses;
+					memInfo->second.aggregateLimit += programStart.getDouble("MemoryLimit");
 				}
 				else
 					memInfo->second.invalidate();

@@ -789,19 +790,21 @@ ACTOR static Future<JsonBuilderObject> processStatusFetcher(
 				memoryObj.setKeyRawNumber("unused_allocated_memory", processMetrics.getValue("UnusedAllocatedMemory"));
 			}

+			int64_t memoryLimit = 0;
 			if (programStarts.count(address)) {
-				auto const& psxml = programStarts.at(address);
+				auto const& programStartEvent = programStarts.at(address);

-				if(psxml.size() > 0) {
-					memoryObj.setKeyRawNumber("limit_bytes",psxml.getValue("MemoryLimit"));
+				if(programStartEvent.size() > 0) {
+					memoryLimit = programStartEvent.getInt64("MemoryLimit");
+					memoryObj.setKey("limit_bytes", memoryLimit);

 					std::string version;
-					if (psxml.tryGetValue("Version", version)) {
+					if (programStartEvent.tryGetValue("Version", version)) {
 						statusObj["version"] = version;
 					}

 					std::string commandLine;
-					if (psxml.tryGetValue("CommandLine", commandLine)) {
+					if (programStartEvent.tryGetValue("CommandLine", commandLine)) {
 						statusObj["command_line"] = commandLine;
 					}
 				}

@@ -813,10 +816,10 @@ ACTOR static Future<JsonBuilderObject> processStatusFetcher(
 					availableMemory = mMetrics[address].getDouble("AvailableMemory");
 					auto machineMemInfo = machineMemoryUsage[workerItr->interf.locality.machineId()];
-					if (machineMemInfo.valid()) {
-						ASSERT(machineMemInfo.numProcesses > 0);
-						int64_t memory = (availableMemory + machineMemInfo.memoryUsage) / machineMemInfo.numProcesses;
-						memoryObj["available_bytes"] = std::max<int64_t>(memory, 0);
+					if (machineMemInfo.valid() && memoryLimit > 0) {
+						ASSERT(machineMemInfo.aggregateLimit > 0);
+						int64_t memory = (availableMemory + machineMemInfo.memoryUsage) * memoryLimit / machineMemInfo.aggregateLimit;
+						memoryObj["available_bytes"] = std::min<int64_t>(std::max<int64_t>(memory, 0), memoryLimit);
 					}
 				}

@@ -1725,10 +1728,6 @@ ACTOR static Future<JsonBuilderObject> workloadStatusFetcher(Reference<AsyncVar<
 		(*qos).setKeyRawNumber("worst_queue_bytes_storage_server", ratekeeper.getValue("WorstStorageServerQueue"));
 		(*qos).setKeyRawNumber("limiting_queue_bytes_storage_server", ratekeeper.getValue("LimitingStorageServerQueue"));
-		// TODO: These can be removed in the next release after 6.2
-		(*qos).setKeyRawNumber("worst_version_lag_storage_server", ratekeeper.getValue("WorstStorageServerVersionLag"));
-		(*qos).setKeyRawNumber("limiting_version_lag_storage_server", ratekeeper.getValue("LimitingStorageServerVersionLag"));
 		(*qos)["worst_data_lag_storage_server"] = getLagObject(ratekeeper.getInt64("WorstStorageServerVersionLag"));
 		(*qos)["limiting_data_lag_storage_server"] = getLagObject(ratekeeper.getInt64("LimitingStorageServerVersionLag"));
 		(*qos)["worst_durability_lag_storage_server"] = getLagObject(ratekeeper.getInt64("WorstStorageServerDurabilityLag"));

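The new available_bytes formula credits each process with its MemoryLimit-weighted share of machine memory rather than an even split. Worked numbers (invented for illustration): with 10 GB free, 6 GB used, and two processes limited to 8 GB and 4 GB, the 4 GB process is now reported (10+6)*4/12, roughly 5.3 GB, then clamped to its 4 GB limit; the old even split would have claimed 8 GB available for it, above its own limit. The same arithmetic as a sketch (availableMemory and memoryUsage are doubles in the code above, which also avoids 64-bit overflow in the product):

    #include <algorithm>
    #include <cstdint>
    #include <cstdio>

    int main() {
        const double GB = double(1LL << 30);
        const double available = 10 * GB, machineUsage = 6 * GB, aggregateLimit = 12 * GB;
        for (int64_t limit : { int64_t(8 * GB), int64_t(4 * GB) }) {
            int64_t memory = int64_t((available + machineUsage) * limit / aggregateLimit);
            int64_t reported = std::min(std::max<int64_t>(memory, 0), limit);
            std::printf("limit %.0f GB -> available_bytes %.2f GB\n", limit / GB, reported / GB);
        }
        return 0;
    }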
@@ -36,7 +36,6 @@ struct TLogInterface {
 	LocalityData filteredLocality;
 	UID uniqueID;
 	UID sharedTLogID;
-	Endpoint base;

 	RequestStream< struct TLogPeekRequest > peekMessages;
 	RequestStream< struct TLogPopRequest > popMessages;

@@ -75,7 +74,7 @@ struct TLogInterface {
 		streams.push_back(disablePopRequest.getReceiver());
 		streams.push_back(enablePopRequest.getReceiver());
 		streams.push_back(snapRequest.getReceiver());
-		base = FlowTransport::transport().addEndpoints(streams);
+		FlowTransport::transport().addEndpoints(streams);
 	}

 	template <class Ar>

@@ -83,19 +82,18 @@ struct TLogInterface {
 		if constexpr (!is_fb_function<Ar>) {
 			ASSERT(ar.isDeserializing || uniqueID != UID());
 		}
-		serializer(ar, uniqueID, sharedTLogID, filteredLocality, base);
+		serializer(ar, uniqueID, sharedTLogID, filteredLocality, peekMessages);
 		if( Ar::isDeserializing ) {
-			peekMessages = RequestStream< struct TLogPeekRequest >( base.getAdjustedEndpoint(0) );
-			popMessages = RequestStream< struct TLogPopRequest >( base.getAdjustedEndpoint(1) );
-			commit = RequestStream< struct TLogCommitRequest >( base.getAdjustedEndpoint(2) );
-			lock = RequestStream< ReplyPromise< struct TLogLockResult > >( base.getAdjustedEndpoint(3) );
-			getQueuingMetrics = RequestStream< struct TLogQueuingMetricsRequest >( base.getAdjustedEndpoint(4) );
-			confirmRunning = RequestStream< struct TLogConfirmRunningRequest >( base.getAdjustedEndpoint(5) );
-			waitFailure = RequestStream< ReplyPromise<Void> >( base.getAdjustedEndpoint(6) );
-			recoveryFinished = RequestStream< struct TLogRecoveryFinishedRequest >( base.getAdjustedEndpoint(7) );
-			disablePopRequest = RequestStream< struct TLogDisablePopRequest >( base.getAdjustedEndpoint(8) );
-			enablePopRequest = RequestStream< struct TLogEnablePopRequest >( base.getAdjustedEndpoint(9) );
-			snapRequest = RequestStream< struct TLogSnapRequest >( base.getAdjustedEndpoint(10) );
+			popMessages = RequestStream< struct TLogPopRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(1) );
+			commit = RequestStream< struct TLogCommitRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(2) );
+			lock = RequestStream< ReplyPromise< struct TLogLockResult > >( peekMessages.getEndpoint().getAdjustedEndpoint(3) );
+			getQueuingMetrics = RequestStream< struct TLogQueuingMetricsRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(4) );
+			confirmRunning = RequestStream< struct TLogConfirmRunningRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(5) );
+			waitFailure = RequestStream< ReplyPromise<Void> >( peekMessages.getEndpoint().getAdjustedEndpoint(6) );
+			recoveryFinished = RequestStream< struct TLogRecoveryFinishedRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(7) );
+			disablePopRequest = RequestStream< struct TLogDisablePopRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(8) );
+			enablePopRequest = RequestStream< struct TLogEnablePopRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(9) );
+			snapRequest = RequestStream< struct TLogSnapRequest >( peekMessages.getEndpoint().getAdjustedEndpoint(10) );
 		}
 	}
 };

File diff suppressed because it is too large

@@ -54,7 +54,7 @@
 #include "fdbrpc/AsyncFileCached.actor.h"
 #include "fdbserver/CoroFlow.h"
 #include "flow/TLSConfig.actor.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"
 #include "fdbmonitor/SimpleIni.h"

@@ -4101,3 +4101,4 @@ void versionedMapTest() {
 	printf("Memory used: %f MB\n",
 	       (after - before)/ 1e6);
 }

@@ -399,9 +399,11 @@ struct BackupAndParallelRestoreCorrectnessWorkload : TestWorkload {
 			if (!self->locked && BUGGIFY) {
 				TraceEvent("BARW_SubmitBackup2", randomID).detail("Tag", printable(self->backupTag));
 				try {
+					// Note the "partitionedLog" must be false, because we change
+					// the configuration to disable backup workers before restore.
 					extraBackup = backupAgent.submitBackup(
 					    cx, LiteralStringRef("file://simfdb/backups/"), deterministicRandom()->randomInt(0, 100),
-					    self->backupTag.toString(), self->backupRanges, true, self->usePartitionedLogs);
+					    self->backupTag.toString(), self->backupRanges, true, false);
 				} catch (Error& e) {
 					TraceEvent("BARW_SubmitBackup2Exception", randomID)
 					    .error(e)

@@ -138,11 +138,9 @@ bool checkTxInfoEntryFormat(BinaryReader &reader) {
 	while (!reader.empty()) {
 		// Get EventType and timestamp
-		FdbClientLogEvents::EventType event;
+		FdbClientLogEvents::Event event;
 		reader >> event;
-		double timeStamp;
-		reader >> timeStamp;
-		switch (event)
+		switch (event.type)
 		{
 		case FdbClientLogEvents::GET_VERSION_LATENCY:
 			parser->parseGetVersion(reader);

@@ -166,7 +164,7 @@ bool checkTxInfoEntryFormat(BinaryReader &reader) {
 			parser->parseErrorCommit(reader);
 			break;
 		default:
-			TraceEvent(SevError, "ClientTransactionProfilingUnknownEvent").detail("EventType", event);
+			TraceEvent(SevError, "ClientTransactionProfilingUnknownEvent").detail("EventType", event.type);
 			return false;
 		}
 	}

@@ -34,6 +34,7 @@ static const char* logTypes[] = {
 	"log_version:=2", "log_version:=3", "log_version:=4"
 };
 static const char* redundancies[] = { "single", "double", "triple" };
+static const char* backupTypes[] = { "backup_worker_enabled:=0", "backup_worker_enabled:=1" };

 std::string generateRegions() {
 	std::string result;

@@ -271,7 +272,7 @@ struct ConfigureDatabaseWorkload : TestWorkload {
 		if(g_simulator.speedUpSimulation) {
 			return Void();
 		}
-		state int randomChoice = deterministicRandom()->randomInt(0, 7);
+		state int randomChoice = deterministicRandom()->randomInt(0, 8);
 		if( randomChoice == 0 ) {
 			wait( success(
 			    runRYWTransaction(cx, [=](Reference<ReadYourWritesTransaction> tr) -> Future<Optional<Value>>

@@ -322,6 +323,10 @@ struct ConfigureDatabaseWorkload : TestWorkload {
 		else if ( randomChoice == 6 ) {
 			// Some configurations will be invalid, and that's fine.
 			wait(success( IssueConfigurationChange( cx, logTypes[deterministicRandom()->randomInt( 0, sizeof(logTypes)/sizeof(logTypes[0]))], false ) ));
+		} else if (randomChoice == 7) {
+			wait(success(IssueConfigurationChange(
+			    cx, backupTypes[deterministicRandom()->randomInt(0, sizeof(backupTypes) / sizeof(backupTypes[0]))],
+			    false)));
 		} else {
 			ASSERT(false);
 		}

@@ -0,0 +1,108 @@ (new file: workloads/DataDistributionMetrics.actor.cpp)
/*
* DataDistributionMetrics.actor.cpp
*
* This source file is part of the FoundationDB open source project
*
* Copyright 2013-2018 Apple Inc. and the FoundationDB project authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <boost/lexical_cast.hpp>
#include "fdbclient/ReadYourWrites.h"
#include "fdbserver/workloads/workloads.actor.h"
#include "flow/actorcompiler.h" // This must be the last include
struct DataDistributionMetricsWorkload : KVWorkload {
int numTransactions;
int writesPerTransaction;
int transactionsCommitted;
int numShards;
int64_t avgBytes;
DataDistributionMetricsWorkload(WorkloadContext const& wcx)
: KVWorkload(wcx), transactionsCommitted(0), numShards(0), avgBytes(0) {
numTransactions = getOption(options, LiteralStringRef("numTransactions"), 100);
writesPerTransaction = getOption(options, LiteralStringRef("writesPerTransaction"), 1000);
}
static Value getRandomValue() {
return Standalone<StringRef>(format("Value/%08d", deterministicRandom()->randomInt(0, 10e6)));
}
ACTOR static Future<Void> _start(Database cx, DataDistributionMetricsWorkload* self) {
state int tNum;
for (tNum = 0; tNum < self->numTransactions; ++tNum) {
loop {
state ReadYourWritesTransaction tr(cx);
try {
state int i;
for (i = 0; i < self->writesPerTransaction; ++i) {
tr.set(StringRef(format("Key/%08d", tNum * self->writesPerTransaction + i)), getRandomValue());
}
wait(tr.commit());
++self->transactionsCommitted;
break;
} catch (Error& e) {
wait(tr.onError(e));
}
}
}
return Void();
}
ACTOR static Future<bool> _check(Database cx, DataDistributionMetricsWorkload* self) {
if (self->transactionsCommitted == 0) {
TraceEvent(SevError, "NoTransactionsCommitted");
return false;
}
state Reference<ReadYourWritesTransaction> tr =
Reference<ReadYourWritesTransaction>(new ReadYourWritesTransaction(cx));
try {
state Standalone<RangeResultRef> result = wait(tr->getRange(ddStatsRange, 100));
ASSERT(!result.more);
self->numShards = result.size();
if (self->numShards < 1) return false;
state int64_t totalBytes = 0;
for (int i = 0; i < result.size(); ++i) {
ASSERT(result[i].key.startsWith(ddStatsRange.begin));
totalBytes += readJSONStrictly(result[i].value.toString()).get_obj()["ShardBytes"].get_int64();
}
self->avgBytes = totalBytes / self->numShards;
// fetch data-distribution stats for a smaller range
state int idx = deterministicRandom()->randomInt(0, result.size());
Standalone<RangeResultRef> res = wait(tr->getRange(
KeyRangeRef(result[idx].key, idx + 1 < result.size() ? result[idx + 1].key : ddStatsRange.end), 100));
ASSERT_WE_THINK(res.size() == 1 &&
                res[0] == result[idx]); // This holds for now; however, it is unclear whether data distribution can change the shard count within this range
} catch (Error& e) {
TraceEvent(SevError, "FailedToRetrieveDDMetrics").detail("Error", e.what());
return false;
}
return true;
}
virtual std::string description() { return "DataDistributionMetrics"; }
virtual Future<Void> setup(Database const& cx) { return Void(); }
virtual Future<Void> start(Database const& cx) { return _start(cx, this); }
virtual Future<bool> check(Database const& cx) { return _check(cx, this); }
virtual void getMetrics(vector<PerfMetric>& m) {
m.push_back(PerfMetric("NumShards", numShards, true));
m.push_back(PerfMetric("AvgBytes", avgBytes, true));
}
};
WorkloadFactory<DataDistributionMetricsWorkload> DataDistributionMetricsWorkloadFactory("DataDistributionMetrics");
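
The check above relies on each value under `ddStatsRange` parsing as JSON with a `ShardBytes` field. As a reference, a client-side reader of the same special keys might look roughly like the sketch below (the JSON shape `{"ShardBytes": <int64>}` is the only part this workload actually depends on; retry handling is omitted):

```cpp
// Sketch: summing shard sizes from the data-distribution stats special keys,
// inside a flow actor. Not the committed code.
ACTOR Future<int64_t> totalShardBytes(Database cx) {
	state ReadYourWritesTransaction tr(cx);
	state int64_t total = 0;
	Standalone<RangeResultRef> stats = wait(tr.getRange(ddStatsRange, CLIENT_KNOBS->TOO_MANY));
	for (int i = 0; i < stats.size(); ++i) {
		total += readJSONStrictly(stats[i].value.toString()).get_obj()["ShardBytes"].get_int64();
	}
	return total;
}
```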

@@ -50,6 +50,22 @@ struct TagThrottleApiWorkload : TestWorkload {

 	virtual void getMetrics(vector<PerfMetric>& m) {}

+	static Optional<TagThrottleType> randomTagThrottleType() {
+		Optional<TagThrottleType> throttleType;
+		switch(deterministicRandom()->randomInt(0, 3)) {
+		case 0:
+			throttleType = TagThrottleType::AUTO;
+			break;
+		case 1:
+			throttleType = TagThrottleType::MANUAL;
+			break;
+		default:
+			break;
+		}
+
+		return throttleType;
+	}
+
 	ACTOR Future<Void> throttleTag(Database cx, std::map<std::pair<TransactionTag, TransactionPriority>, TagThrottleInfo> *manuallyThrottledTags) {
 		state TransactionTag tag = TransactionTagRef(deterministicRandom()->randomChoice(DatabaseContext::debugTransactionTagChoices));
 		state TransactionPriority priority = deterministicRandom()->randomChoice(allTransactionPriorities);
@@ -60,7 +76,7 @@ struct TagThrottleApiWorkload : TestWorkload {
 		tagSet.addTag(tag);

 		try {
-			wait(ThrottleApi::throttleTags(cx, tagSet, rate, duration, false, priority));
+			wait(ThrottleApi::throttleTags(cx, tagSet, rate, duration, TagThrottleType::MANUAL, priority));
 		}
 		catch(Error &e) {
 			state Error err = e;
@@ -72,7 +88,7 @@ struct TagThrottleApiWorkload : TestWorkload {
 			throw err;
 		}

-		manuallyThrottledTags->insert_or_assign(std::make_pair(tag, priority), TagThrottleInfo(tag, false, priority, rate, now() + duration, duration));
+		manuallyThrottledTags->insert_or_assign(std::make_pair(tag, priority), TagThrottleInfo(tag, TagThrottleType::MANUAL, priority, rate, now() + duration, duration));

 		return Void();
 	}
@@ -82,26 +98,30 @@ struct TagThrottleApiWorkload : TestWorkload {
 		TagSet tagSet;
 		tagSet.addTag(tag);

-		state bool autoThrottled = deterministicRandom()->coinflip();
-		TransactionPriority priority = deterministicRandom()->randomChoice(allTransactionPriorities);
+		state Optional<TagThrottleType> throttleType = TagThrottleApiWorkload::randomTagThrottleType();
+		Optional<TransactionPriority> priority = deterministicRandom()->coinflip() ? Optional<TransactionPriority>() : deterministicRandom()->randomChoice(allTransactionPriorities);

 		state bool erased = false;
-		state double expiration = 0;
-		if(!autoThrottled) {
-			auto itr = manuallyThrottledTags->find(std::make_pair(tag, priority));
-			if(itr != manuallyThrottledTags->end()) {
-				expiration = itr->second.expirationTime;
-				erased = true;
-				manuallyThrottledTags->erase(itr);
+		state double maxExpiration = 0;
+		if(!throttleType.present() || throttleType.get() == TagThrottleType::MANUAL) {
+			for(auto p : allTransactionPriorities) {
+				if(!priority.present() || priority.get() == p) {
+					auto itr = manuallyThrottledTags->find(std::make_pair(tag, p));
+					if(itr != manuallyThrottledTags->end()) {
+						maxExpiration = std::max(maxExpiration, itr->second.expirationTime);
+						erased = true;
+						manuallyThrottledTags->erase(itr);
+					}
+				}
 			}
 		}

-		bool removed = wait(ThrottleApi::unthrottleTags(cx, tagSet, autoThrottled, priority));
+		bool removed = wait(ThrottleApi::unthrottleTags(cx, tagSet, throttleType, priority));
 		if(removed) {
-			ASSERT(erased || autoThrottled);
+			ASSERT(erased || !throttleType.present() || throttleType.get() == TagThrottleType::AUTO);
 		}
 		else {
-			ASSERT(expiration < now());
+			ASSERT(maxExpiration < now());
 		}

 		return Void();
@@ -113,7 +133,7 @@ struct TagThrottleApiWorkload : TestWorkload {
 		int manualThrottledTags = 0;
 		int activeAutoThrottledTags = 0;
 		for(auto &tag : tags) {
-			if(!tag.autoThrottled) {
+			if(tag.throttleType == TagThrottleType::MANUAL) {
 				ASSERT(manuallyThrottledTags->find(std::make_pair(tag.tag, tag.priority)) != manuallyThrottledTags->end());
 				++manualThrottledTags;
 			}
@@ -139,34 +159,32 @@ struct TagThrottleApiWorkload : TestWorkload {
 	}

 	ACTOR Future<Void> unthrottleTagGroup(Database cx, std::map<std::pair<TransactionTag, TransactionPriority>, TagThrottleInfo> *manuallyThrottledTags) {
-		state int choice = deterministicRandom()->randomInt(0, 3);
+		state Optional<TagThrottleType> throttleType = TagThrottleApiWorkload::randomTagThrottleType();
+		state Optional<TransactionPriority> priority = deterministicRandom()->coinflip() ? Optional<TransactionPriority>() : deterministicRandom()->randomChoice(allTransactionPriorities);

-		if(choice == 0) {
-			bool unthrottled = wait(ThrottleApi::unthrottleAll(cx));
-			bool unthrottleExpected = false;
-			for(auto itr = manuallyThrottledTags->begin(); itr != manuallyThrottledTags->end(); ++itr) {
-				if(itr->second.expirationTime > now()) {
-					unthrottleExpected = true;
+		bool unthrottled = wait(ThrottleApi::unthrottleAll(cx, throttleType, priority));
+		if(!throttleType.present() || throttleType.get() == TagThrottleType::MANUAL) {
+			bool unthrottleExpected = false;
+			bool empty = manuallyThrottledTags->empty();
+			for(auto itr = manuallyThrottledTags->begin(); itr != manuallyThrottledTags->end();) {
+				if(!priority.present() || priority.get() == itr->first.second) {
+					if(itr->second.expirationTime > now()) {
+						unthrottleExpected = true;
+					}
+					itr = manuallyThrottledTags->erase(itr);
+				}
+				else {
+					++itr;
 				}
 			}
-			ASSERT(!unthrottleExpected || unthrottled);
-			manuallyThrottledTags->clear();
-		}
-		else if(choice == 1) {
-			bool unthrottled = wait(ThrottleApi::unthrottleManual(cx));
-			bool unthrottleExpected = false;
-			for(auto itr = manuallyThrottledTags->begin(); itr != manuallyThrottledTags->end(); ++itr) {
-				if(itr->second.expirationTime > now()) {
-					unthrottleExpected = true;
-				}
+
+			if(throttleType.present()) {
+				ASSERT((unthrottled && !empty) || (!unthrottled && !unthrottleExpected));
 			}
-			ASSERT((unthrottled && !manuallyThrottledTags->empty()) || (!unthrottled && !unthrottleExpected));
-			manuallyThrottledTags->clear();
-		}
-		else {
-			bool unthrottled = wait(ThrottleApi::unthrottleAuto(cx));
+			else {
+				ASSERT(unthrottled || !unthrottleExpected);
+			}
 		}

 		return Void();
@@ -176,7 +194,7 @@ struct TagThrottleApiWorkload : TestWorkload {
 		if(deterministicRandom()->coinflip()) {
 			wait(ThrottleApi::enableAuto(cx, true));
 			if(deterministicRandom()->coinflip()) {
-				bool unthrottled = wait(ThrottleApi::unthrottleAuto(cx));
+				bool unthrottled = wait(ThrottleApi::unthrottleAll(cx, TagThrottleType::AUTO, Optional<TransactionPriority>()));
 			}
 		}
 		else {
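
The net effect of this hunk is that one filtered entry point, `ThrottleApi::unthrottleAll(cx, throttleType, priority)`, replaces the separate `unthrottleAll`/`unthrottleManual`/`unthrottleAuto` variants, with an empty `Optional` acting as a wildcard for that dimension. Going by the call sites above, usage inside a flow actor looks roughly like this (a sketch, not committed code):

```cpp
// Empty Optionals mean "any throttle type" / "any priority".
bool clearedEverything = wait(ThrottleApi::unthrottleAll(cx, Optional<TagThrottleType>(), Optional<TransactionPriority>()));

// Only auto throttles, at any priority -- the replacement for unthrottleAuto(cx):
bool clearedAuto = wait(ThrottleApi::unthrottleAll(cx, TagThrottleType::AUTO, Optional<TransactionPriority>()));

// Only the manual throttle on one tag set, at a specific priority:
TagSet tags;
tags.addTag(LiteralStringRef("hot_tag")); // hypothetical tag name
bool clearedTag = wait(ThrottleApi::unthrottleTags(cx, tags, TagThrottleType::MANUAL, TransactionPriority::DEFAULT));
```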

@@ -30,7 +30,7 @@

 #include "flow/SimpleOpt.h"
 #include "fdbmonitor/SimpleIni.h"
-#include "fdbclient/IncludeVersions.h"
+#include "fdbclient/versions.h"

 // For PathFileExists
 #include "Shlwapi.h"

@@ -73,7 +73,7 @@ class ThreadPool : public IThreadPool, public ReferenceCounted<ThreadPool> {
 		void operator()() { Thread::dispatch(action); action = NULL; }
 		~ActionWrapper() { if (action) { action->cancel(); } }
 	private:
-		void operator=(ActionWrapper const&);
+		ActionWrapper &operator=(ActionWrapper const&);
 	};
 public:
 	ThreadPool() : dontstop(ios), mode(Run) {}
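
Two things happen in that one-liner: the assignment operator gains the conventional reference return type, and it stays private and undefined, which is the pre-C++11 spelling of "non-assignable". In C++11 and later the same intent is usually written with `= delete`; a sketch of the equivalent (the diff only touches assignment, so copy construction is left as-is):

```cpp
class ActionWrapper {
	// ... members as before ...
public:
	// Modern equivalent of declaring the operator private without a definition:
	ActionWrapper& operator=(ActionWrapper const&) = delete;
};
```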

@@ -287,7 +287,7 @@ ACTOR static Future<Void> readEntireFile( std::string filename, std::string* des
 		throw file_too_large();
 	}
 	destination->resize(filesize);
-	wait(success(file->read(const_cast<char*>(destination->c_str()), filesize, 0)));
+	wait(success(file->read(&destination[0], filesize, 0)));
 	return Void();
 }
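
The old call wrote through `const_cast<char*>(destination->c_str())`, and mutating a string through its `c_str()` pointer is undefined behavior; the fix is to write into the string's own storage after `resize()`, which C++11 guarantees is contiguous and writable. Outside flow, the same pattern in plain C++ looks like this (a standalone sketch, not FDB code):

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Read an entire file into a std::string by sizing the buffer up front and
// writing directly into the string's own storage.
std::string readAll(const std::string& path) {
	std::ifstream in(path, std::ios::binary | std::ios::ate);
	if (!in) throw std::runtime_error("cannot open " + path);
	std::string data(static_cast<size_t>(in.tellg()), '\0');
	in.seekg(0);
	if (!data.empty()) in.read(&data[0], data.size()); // valid since C++11
	return data;
}
```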

@@ -235,6 +235,17 @@ struct NetworkAddress {
 	bool isTLS() const { return (flags & FLAG_TLS) != 0; }
 	bool isV6() const { return ip.isV6(); }

+	size_t hash() const {
+		size_t result = 0;
+		if (ip.isV6()) {
+			uint16_t* ptr = (uint16_t*)ip.toV6().data();
+			result = ((size_t)ptr[5] << 32) | ((size_t)ptr[6] << 16) | ptr[7];
+		} else {
+			result = ip.toV4();
+		}
+		return (result << 16) + port;
+	}
+
 	static NetworkAddress parse(std::string const&); // May throw connection_string_invalid
 	static Optional<NetworkAddress> parseOptional(std::string const&);
 	static std::vector<NetworkAddress> parseList( std::string const& );
@@ -270,14 +281,7 @@ namespace std
 	{
 		size_t operator()(const NetworkAddress& na) const
 		{
-			size_t result = 0;
-			if (na.ip.isV6()) {
-				uint16_t* ptr = (uint16_t*)na.ip.toV6().data();
-				result = ((size_t)ptr[5] << 32) | ((size_t)ptr[6] << 16) | ptr[7];
-			} else {
-				result = na.ip.toV4();
-			}
-			return (result << 16) + na.port;
+			return na.hash();
 		}
 	};
 }
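
Hoisting the body into `NetworkAddress::hash()` leaves one definition for `std::hash` and any other hasher to share; note that for IPv6 it folds in only three 16-bit words of the address (plus the port), a deliberately cheap rather than full-width hash. Callers are unaffected; a usage sketch:

```cpp
#include <unordered_map>

// std::hash<NetworkAddress> now forwards to the member function, so unordered
// containers behave exactly as before.
std::unordered_map<NetworkAddress, int> connectionCounts;
NetworkAddress addr = NetworkAddress::parse("127.0.0.1:4500"); // may throw connection_string_invalid
connectionCounts[addr]++;
size_t h = addr.hash(); // same value std::hash<NetworkAddress>{}(addr) produces
```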

@@ -46,6 +46,7 @@ if(WITH_PYTHON)
   add_fdb_test(TEST_FILES BlobStore.txt IGNORE)
   add_fdb_test(TEST_FILES ConsistencyCheck.txt IGNORE)
   add_fdb_test(TEST_FILES DDMetricsExclude.txt IGNORE)
+  add_fdb_test(TEST_FILES DataDistributionMetrics.txt IGNORE)
   add_fdb_test(TEST_FILES DiskDurability.txt IGNORE)
   add_fdb_test(TEST_FILES FileSystem.txt IGNORE)
   add_fdb_test(TEST_FILES Happy.txt IGNORE)

@@ -0,0 +1,21 @@
testTitle=DataDistributionMetrics
    testName=Cycle
    transactionsPerSecond=2500.0
    testDuration=10.0
    expectedRate=0.025

    testName=DataDistributionMetrics
    numTransactions=100
    writesPerTransaction=1000

    testName=Attrition
    machinesToKill=1
    machinesToLeave=3
    reboot=true
    testDuration=10.0

    testName=Attrition
    machinesToKill=1
    machinesToLeave=3
    reboot=true
    testDuration=10.0