foundationdb

mirror of https://github.com/apple/foundationdb.git synced 2025-05-31 01:37:54 +08:00

blob: allow for alignment of granules to tuple boundaries (#7746)

* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. This fixes this issue by loading the tenant data before
completing recovery such that continued actions on existing blob
granules will have access to the tenant data.

Example scenario with failover, in which splits are restarted before the
tenant data is loaded:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.
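
The alignment step described above can be sketched roughly as follows. This
is a minimal illustration, not the code from the commit: keys are modeled as
already-decoded Python tuples (or opaque bytes for non-tuple keys), the
function names are hypothetical, and the reading of
BG_KEY_TUPLE_TRUNCATE_OFFSET as "number of trailing tuple elements to drop"
is an assumption.

```python
def align_key(key, truncate_offset):
    """Truncate a tuple key by dropping its last `truncate_offset` elements.

    Assumption: the knob counts trailing elements to drop. Non-tuple keys
    (modeled here as bytes) and tuples too short to truncate are returned
    unchanged.
    """
    if not isinstance(key, tuple) or truncate_offset <= 0:
        return key
    if len(key) <= truncate_offset:
        return key
    return key[: len(key) - truncate_offset]


def align_split_points(split_points, truncate_offset):
    """Align sorted candidate split points and drop consecutive duplicates.

    Keys that truncate to the same prefix collapse into one split point, so
    all keys sharing that prefix stay in one granule, even if that granule
    ends up larger than the target size.
    """
    aligned = []
    for key in split_points:
        candidate = align_key(key, truncate_offset)
        if not aligned or aligned[-1] != candidate:
            aligned.append(candidate)
    return aligned


# Hypothetical split points as returned by a metrics query.
points = [("user", 1, "a"), ("user", 1, "z"), ("user", 2, "b"), b"raw"]
print(align_split_points(points, 1))
```

Here the two keys under ("user", 1) collapse into a single boundary, which is
the "keep all aligned keys together" behavior the commit message describes.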

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more in line with clocksweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   clear what we're accomplishing.
3. Rename canMergeNow() -> mergeEligible().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.
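
The merge scan described above might look something like the sketch below.
It is an illustration under assumptions: granules are modeled as
(begin, end, eligible) triples in key order, hard boundaries as a set of
keys, and find_merge_runs is a hypothetical name rather than a function in
the blob manager.

```python
def find_merge_runs(granules, hard_boundaries):
    """Return (begin, end) ranges of adjacent eligible granules to merge.

    A run is submitted and a new one started whenever a granule begins at a
    hard boundary, so a merged range never spans a hard boundary.
    """
    runs = []
    current = []

    def flush():
        # Only two or more adjacent granules form a meaningful merge.
        if len(current) >= 2:
            runs.append((current[0][0], current[-1][1]))
        current.clear()

    for begin, end, eligible in granules:
        if begin in hard_boundaries:
            flush()  # submit what we found; restart at the boundary
        if eligible:
            current.append((begin, end))
        else:
            flush()  # an ineligible granule also ends the run
    flush()
    return runs


granules = [
    ("a", "c", True),
    ("c", "f", True),
    ("f", "k", True),
    ("k", "p", False),
    ("p", "z", True),
]
print(find_merge_runs(granules, hard_boundaries={"f"}))
print(find_merge_runs(granules, hard_boundaries=set()))
```

With the boundary at "f", only the first two granules are merged; without
it, the first three would be merged into one range.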

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degenerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds new BlobGranuleMergeBoundary items to the
BlobGranuleSplitPoints state. BlobGranuleMergeBoundary contains state
saying if it is a left or right boundary. Like hard boundaries, this
information is used to force merging of like granules first.

We read the BlobGranuleMergeBoundary map into memory at recovery.

2022-08-02 16:06:25 -05:00

| Name | Last commit message | Commit date |
| --- | --- | --- |
| bindings | Make Tuple::pack return a Standalone<StringRef> (#7764) | 2022-08-02 12:45:56 -07:00 |
| cmake | Python_EXECUTABLE to Python3_EXECUTABLE | 2022-07-29 14:57:29 -07:00 |
| contrib | C shim library: API for setting client library path; additional tests (#7702) | 2022-07-29 11:45:45 +02:00 |
| design | Merge pull request #7575 from xis19/lb | 2022-07-19 17:50:25 -07:00 |
| documentation | Add test coverage for SpecialKeyRangeAsyncImpl::getRange (#7671) | 2022-08-02 12:04:40 -07:00 |
| fdbbackup | Use narrow includes in key backed types; remove some unnecessary return statements. | 2022-07-14 13:26:43 -07:00 |
| fdbcli | Adding more blobrange cli commands and a couple other tweaks (#7727) | 2022-07-29 08:20:45 -07:00 |
| fdbclient | blob: allow for alignment of granules to tuple boundaries (#7746) | 2022-08-02 16:06:25 -05:00 |
| fdbkubernetesmonitor | Add docs for how to run the operator with the unified image | 2022-07-04 15:25:25 +01:00 |
| FDBLibTLS | add WolfSSL support (#6682) | 2022-04-28 16:53:38 -07:00 |
| fdbmonitor | fix fdbmonitor | 2022-06-27 19:11:24 -06:00 |
| fdbrpc | Fix bug in token cache unit test where the expiration time was underflowing. | 2022-07-30 14:22:24 -07:00 |
| fdbserver | blob: allow for alignment of granules to tuple boundaries (#7746) | 2022-08-02 16:06:25 -05:00 |
| fdbservice | Add missing dependencies for fdbservice | 2022-06-29 23:43:38 +02:00 |
| flow | Merge pull request #7763 from sfc-gh-xwang/feature/main/unittest | 2022-08-02 13:30:30 -07:00 |
| flowbench | Place generateRandomData() under {I\|Deterministic}Random | 2022-07-20 13:21:11 +02:00 |
| layers | Fix comments to use transaction_too_old instead of past_version | 2019-04-24 18:50:57 -07:00 |
| packaging | C shim library: API for setting client library path; additional tests (#7702) | 2022-07-29 11:45:45 +02:00 |
| recipes | update version to 7.2.0 | 2022-04-11 23:23:27 -05:00 |
| tests | blob: allow for alignment of granules to tuple boundaries (#7746) | 2022-08-02 16:06:25 -05:00 |
| .clang-format | apply clang-format to *.c, *.cpp, *.h, *.hpp files | 2021-03-10 10:18:07 -08:00 |
| .git-blame-ignore-revs | Ignore format commits in git blame | 2022-06-27 17:32:43 -04:00 |
| .gitignore | createdTime based storage wiggler (#6219) | 2022-02-04 15:04:30 -08:00 |
| ACKNOWLEDGEMENTS | Add ACKNOWLEDGEMENTS. Replace memcpy with advsimd implementation. | 2021-08-23 19:12:52 -07:00 |
| CMakeLists.txt | Update Python target name to Python3 | 2022-07-29 14:57:29 -07:00 |
| CODE_OF_CONDUCT.md | Updates markdown link to Contributor Covenant homepage in the Code of Conduct. | 2018-04-18 01:08:55 -04:00 |
| CONTRIBUTING.md | Update CONTRIBUTING.md | 2022-02-04 13:29:49 -06:00 |
| fdb.cluster.cmake | Fix port to match sandbox foundationdb.conf | 2019-04-03 13:49:44 -07:00 |
| LICENSE | Initial repository commit | 2017-05-25 13:48:44 -07:00 |
| pull_request_template.md | Rename master to main in PR template | 2022-04-11 08:53:33 +01:00 |
| README.md | Update Badge URL in README.md | 2022-05-05 16:34:58 -05:00 |
| versions.target.cmake | use FDB_VERSION in lieu of PROJECT_VERSION or CMAKE_PROJECT_VERSION | 2021-11-29 15:11:20 -08:00 |

README.md

FoundationDB is a distributed database designed to handle large volumes of structured data across clusters of commodity servers. It organizes data as an ordered key-value store and employs ACID transactions for all operations. It is especially well-suited for read/write workloads but also has excellent performance for write-intensive workloads. Users interact with the database through API language bindings.

To learn more about FoundationDB, visit foundationdb.org

Documentation

Documentation can be found online at https://apple.github.io/foundationdb/. The documentation covers details of API usage, background information on design philosophy, and extensive usage examples. Docs are built from the source in this repo.

Forums

The FoundationDB Forums are the home for most of the discussion and communication about the FoundationDB project. We welcome your participation! We want FoundationDB to be a great project to be a part of and, as part of that, have established a Code of Conduct to define what constitutes permissible modes of interaction.

Contributing

Contributing to FoundationDB can take the form of contributions to the code base, sharing your experience and insights with the community on the Forums, or contributing to projects that make use of FoundationDB. Please see the contributing guide for more specifics.

Getting Started

Binary downloads

Developers interested in using FoundationDB can get started by downloading and installing a binary package. Please see the downloads page for a list of available packages.

Compiling from source

Developers on an OS for which there is no binary package, or who would like to start hacking on the code, can get started by compiling from source.

The official Docker image for building is foundationdb/build, which has all dependencies installed. The Docker image definitions used by FoundationDB team members can be found in the dedicated repository.

To build outside the official docker image you'll need at least these dependencies:

Install CMake version 3.13 or higher
Install Mono
Install Ninja (optional, but recommended)

If compiling for local development, please set -DUSE_WERROR=ON in cmake. Our CI compiles with -Werror on, so this way you'll find out about compiler warnings that break the build earlier.

Once you have your dependencies, you can run cmake and then build:

Check out this repository.
Create a build directory (you can have the build directory anywhere you like). There is currently a directory in the source tree called build, but you should not use it. See #3098.
cd <PATH_TO_BUILD_DIRECTORY>
cmake -G Ninja <PATH_TO_FOUNDATIONDB_DIRECTORY>
ninja # If this crashes it probably ran out of memory. Try ninja -j1
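
The steps above can also be scripted. The following is a rough sketch of a driver that performs an out-of-source configure and build; the function names and parameters are hypothetical, and only the cmake and ninja options already mentioned in this README are used. The single-job retry mirrors the "ran out of memory" hint above.

```python
import os
import subprocess


def cmake_configure_args(source_dir, generator="Ninja", werror=True,
                         export_compile_commands=False):
    """Build the cmake command line for an out-of-source configure step."""
    args = ["cmake", "-G", generator]
    if werror:
        # CI compiles with -Werror, so enable it locally to catch warnings early.
        args.append("-DUSE_WERROR=ON")
    if export_compile_commands:
        args.append("-DCMAKE_EXPORT_COMPILE_COMMANDS=ON")
    args.append(source_dir)
    return args


def configure_and_build(source_dir, build_dir):
    """Configure in build_dir, then run ninja, retrying once with one job."""
    os.makedirs(build_dir, exist_ok=True)
    subprocess.run(cmake_configure_args(source_dir), cwd=build_dir, check=True)
    result = subprocess.run(["ninja"], cwd=build_dir)
    if result.returncode != 0:
        # A failed parallel build may have run out of memory; retry serially.
        subprocess.run(["ninja", "-j1"], cwd=build_dir, check=True)
```

Calling configure_and_build with the path to the checkout and a build directory outside the source tree reproduces the manual steps. Note that the retry also fires on ordinary compile errors, so it will simply fail a second time in that case.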

Language Bindings

The language bindings that are supported by cmake each have a README.md file in the corresponding bindings/lang directory.

Generally, cmake will build all language bindings for which it can find all necessary dependencies. After each successful cmake run, cmake will tell you which language bindings it is going to build.

Generating `compile_commands.json`

CMake can build a compilation database for you. However, the default generated one is not very useful, as it describes the generated files. When running make, the build system will create another compile_commands.json file in the source directory. This can then be used by tools like CCLS, CQuery, etc., giving you code completion and code navigation in flow. It is not yet perfect (it will show a few errors), but we are constantly working on improving the development experience.

CMake will not produce a compile_commands.json by default; you must pass -DCMAKE_EXPORT_COMPILE_COMMANDS=ON. This also enables the target processed_compile_commands, which rewrites compile_commands.json to describe the actor compiler source files rather than the post-processed output files, and places the result in the source directory. This file should then be picked up automatically by any tooling.

Note that if building inside of the foundationdb/build docker image, the resulting paths will still be incorrect and require manual fixing. One will wish to re-run cmake with -DCMAKE_EXPORT_COMPILE_COMMANDS=OFF to prevent it from reverting the manual changes.
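
One way to do the manual path fixing mentioned above is a small script that rewrites the container paths to host paths. This is only a sketch: the field names follow the Clang JSON compilation database format, while the function names and the two path prefixes are placeholders you would replace with the paths from your own setup.

```python
import json


def rewrite_compile_commands(entries, old_prefix, new_prefix):
    """Return a copy of the entries with old_prefix replaced by new_prefix."""
    fixed = []
    for entry in entries:
        new_entry = dict(entry)
        # Path-valued fields: replace the prefix only at the start.
        for field in ("directory", "file", "output"):
            value = new_entry.get(field)
            if isinstance(value, str) and value.startswith(old_prefix):
                new_entry[field] = new_prefix + value[len(old_prefix):]
        # The compile command may embed the path anywhere (-I flags, etc.).
        if isinstance(new_entry.get("command"), str):
            new_entry["command"] = new_entry["command"].replace(old_prefix, new_prefix)
        if isinstance(new_entry.get("arguments"), list):
            new_entry["arguments"] = [
                arg.replace(old_prefix, new_prefix) for arg in new_entry["arguments"]
            ]
        fixed.append(new_entry)
    return fixed


def rewrite_file(path, old_prefix, new_prefix):
    """Rewrite a compile_commands.json file in place."""
    with open(path) as f:
        entries = json.load(f)
    with open(path, "w") as f:
        json.dump(rewrite_compile_commands(entries, old_prefix, new_prefix), f, indent=2)


# Placeholder prefixes: the container mount point and the host checkout.
sample = [{
    "directory": "/docker/build",
    "file": "/docker/src/flow/flow.cpp",
    "command": "clang++ -I/docker/src/flow -c /docker/src/flow/flow.cpp",
}]
print(rewrite_compile_commands(sample, "/docker", "/home/me/fdb"))
```

The plain string replacement in the command field is deliberately simple and would also rewrite unrelated occurrences of the prefix, so choose a prefix that is specific enough.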

Using IDEs

CMake has built-in support for a number of popular IDEs. However, because flow files are precompiled with the actor compiler, an IDE will not be very useful: the user will only be presented with the generated code, which is not what they want to edit and get IDE features for.

The good news is that it is possible to generate project files for editing flow with a supported IDE. There is a CMake option called OPEN_FOR_IDE which will generate a project that can be opened in an IDE for editing. You won't be able to build this project, but you will be able to edit the files and get most of the editing and navigation features your IDE supports.

For example, if you want to use Xcode to make changes to FoundationDB, you can create an Xcode project with the following command:

cmake -G Xcode -DOPEN_FOR_IDE=ON <FDB_SOURCE_DIRECTORY>

You should create a second build directory, which you will use for building and debugging.

FreeBSD

Check out this repo on your server.
Install compile-time dependencies from ports.
(Optional) Use tmpfs & ccache for significantly faster repeat builds.
(Optional) Install a JDK for Java Bindings. FoundationDB currently builds with Java 8.
Navigate to the directory where you checked out the foundationdb repo.

Build from source.

sudo pkg install -r FreeBSD \
    shells/bash devel/cmake devel/ninja devel/ccache  \
    lang/mono lang/python3 \
    devel/boost-libs devel/libeio \
    security/openssl
mkdir .build && cd .build
cmake -G Ninja \
    -DUSE_CCACHE=on \
    -DUSE_DTRACE=off \
    ..
ninja -j 10
# run fast tests
ctest -L fast
# run all tests
ctest --output-on-failure -v

Linux

There are no special requirements for Linux. A docker image can be pulled from foundationdb/build that has all of FoundationDB's dependencies pre-installed, and is what the CI uses to build and test PRs.

cmake -G Ninja <FDB_SOURCE_DIR>
ninja
cpack -G DEB

For RPM simply replace DEB with RPM.

MacOS

The build under MacOS will work the same way as on Linux. To get boost and ninja you can use Homebrew.

cmake -G Ninja <PATH_TO_FOUNDATIONDB_SOURCE>

To generate an installable package:

ninja
$SRCDIR/packaging/osx/buildpkg.sh . $SRCDIR

Windows

Under Windows, only Visual Studio with ClangCl is supported.

Install Visual Studio 2019 (IDE or Build Tools), and enable LLVM support
Install CMake 3.15 or higher
Download Boost 1.77.0
Unpack boost to C:\boost, or use -DBOOST_ROOT=<PATH_TO_BOOST> with cmake if unpacked elsewhere
Install Python if it is not already installed by Visual Studio
(Optional) Install OpenJDK 11 to build Java bindings
(Optional) Install OpenSSL 3.x to build with TLS support
(Optional) Install WIX Toolset to build Windows installer
mkdir build && cd build
cmake -G "Visual Studio 16 2019" -A x64 -T ClangCl <PATH_TO_FOUNDATIONDB_SOURCE>
msbuild /p:Configuration=Release foundationdb.sln
To increase build performance, use /p:UseMultiToolTask=true and /p:CL_MPCount=<NUMBER_OF_PARALLEL_JOBS>.
