1
0
mirror of https://github.com/apple/foundationdb.git synced 2025-05-31 01:37:54 +08:00
Dennis Zhou b34a54fa7f
blob: allow for alignment of granules to tuple boundaries ()
* blob: read TenantMap during recovery

Future functionality in the blob subsystem will rely on the tenant data
being loaded. This fixes this issue by loading the tenant data before
completing recovery such that continued actions on existing blob
granules will have access to the tenant data.

Example scenario with failover, splits are restarted before loading the
tenant data:
BM - BlobManager
epoch 3:                        epoch 4:
  BM record intent to split.
  Epoch fails.
                                BM recovery begins.
  BM fails to persist split.
                                BM recovery finishes.
                                BM.checkBlobWorkerList()
                                  maybeSplitRange().
                                BM.monitorClientRanges().
                                  loads tenant data.

bin/fdbserver -r simulation -f tests/slow/BlobGranuleCorrectness.toml \
    -s 223570924 -b on  --crash --trace_format json

* blob: add tuple key truncation for blob granule alignment

FDB has a backup system available using the blob manager and blob
granule subsystem. If we want to audit the data in the blobs, it's a lot
easier if we can align them to something meaningful.

When a blob granule is being split, we ask the storage metrics system
for split points as it holds approximate data distribution metrics.
These keys are then processed to determine if they are a tuple and
should be truncated according to the new knob,
BG_KEY_TUPLE_TRUNCATE_OFFSET.

Here we keep all aligned keys together in the same granule even if it is
larger than the allowed granule size. The following commit will address
this by adding merge boundaries.

* blob: minor clean ups in merging code

1. Rename mergeNow -> seen. This is more inline with clocksweep naming
   and removes the confusion between mergeNow and canMergeNow.
2. Make clearMergeCandidate() reset to MergeCandidateCannotMerge to make
   a clear distinction what we're accomplishing.
3. Rename canMergeNow() -> mergeEligble().

* blob: add explicit (hard) boundaries

Blob ranges can be specified either through explicit ranges or at the
tenant level. Right now this is managed implicitly. This commit aims to
make it a little more explicit.

Blobification begins in monitorClientRanges() which parses either the
explicit blob ranges or the tenant map. As we do this and add new
ranges, let's explicitly track what is a hard boundary and what isn't.

When blob merging occurs, we respect this boundary. When a hard boundary
is encountered, we submit the found eligible ranges and start looking
for a new range beginning with this hard boundary.

* blob: create BlobGranuleSplitPoints struct

This is a setup for the following commit. Our goal here is to provide a
structure for split points to be passed around. The need is for us to be
able to carry uncommitted state until it is committed and we can apply
these mutations to the in-memory data structures.

* blob: implement soft boundaries

An earlier commit establishes the need to create data boundaries within
a tenant. The reality is we may encounter a set of keys that degnerate
to the same key prefix. We'll need to be able to split those across
granules, but we want to ensure we merge the split granules together
before merging with other granules.

This adds to the BlobGranuleSplitPoints state of new
BlobGranuleMergeBoundary items. BlobGranuleMergeBoundary contains state
saying if it is a left or right boundary. This information is used to,
like hard boundaries, force merging of like granules first.

We read the BlobGranuleMergeBoundary map into memory at recovery.
2022-08-02 16:06:25 -05:00
2022-04-28 16:53:38 -07:00
2022-06-27 19:11:24 -06:00
2022-04-11 23:23:27 -05:00
2022-02-04 13:29:49 -06:00
2017-05-25 13:48:44 -07:00
2022-05-05 16:34:58 -05:00

FoundationDB logo

Build Status

FoundationDB is a distributed database designed to handle large volumes of structured data across clusters of commodity servers. It organizes data as an ordered key-value store and employs ACID transactions for all operations. It is especially well-suited for read/write workloads but also has excellent performance for write-intensive workloads. Users interact with the database using API language binding.

To learn more about FoundationDB, visit foundationdb.org

Documentation

Documentation can be found online at https://apple.github.io/foundationdb/. The documentation covers details of API usage, background information on design philosophy, and extensive usage examples. Docs are built from the source in this repo.

Forums

The FoundationDB Forums are the home for most of the discussion and communication about the FoundationDB project. We welcome your participation! We want FoundationDB to be a great project to be a part of and, as part of that, have established a Code of Conduct to establish what constitutes permissible modes of interaction.

Contributing

Contributing to FoundationDB can be in contributions to the code base, sharing your experience and insights in the community on the Forums, or contributing to projects that make use of FoundationDB. Please see the contributing guide for more specifics.

Getting Started

Binary downloads

Developers interested in using FoundationDB can get started by downloading and installing a binary package. Please see the downloads page for a list of available packages.

Compiling from source

Developers on an OS for which there is no binary package, or who would like to start hacking on the code, can get started by compiling from source.

The official docker image for building is foundationdb/build which has all dependencies installed. The Docker image definitions used by FoundationDB team members can be found in the dedicated repository..

To build outside the official docker image you'll need at least these dependencies:

  1. Install cmake Version 3.13 or higher CMake
  2. Install Mono
  3. Install Ninja (optional, but recommended)

If compiling for local development, please set -DUSE_WERROR=ON in cmake. Our CI compiles with -Werror on, so this way you'll find out about compiler warnings that break the build earlier.

Once you have your dependencies, you can run cmake and then build:

  1. Check out this repository.
  2. Create a build directory (you can have the build directory anywhere you like). There is currently a directory in the source tree called build, but you should not use it. See #3098
  3. cd <PATH_TO_BUILD_DIRECTORY>
  4. cmake -G Ninja <PATH_TO_FOUNDATIONDB_DIRECTORY>
  5. ninja # If this crashes it probably ran out of memory. Try ninja -j1

Language Bindings

The language bindings that are supported by cmake will have a corresponding README.md file in the corresponding bindings/lang directory.

Generally, cmake will build all language bindings for which it can find all necessary dependencies. After each successful cmake run, cmake will tell you which language bindings it is going to build.

Generating compile_commands.json

CMake can build a compilation database for you. However, the default generated one is not too useful as it operates on the generated files. When running make, the build system will create another compile_commands.json file in the source directory. This can than be used for tools like CCLS, CQuery, etc. This way you can get code-completion and code navigation in flow. It is not yet perfect (it will show a few errors) but we are constantly working on improving the development experience.

CMake will not produce a compile_commands.json, you must pass -DCMAKE_EXPORT_COMPILE_COMMANDS=ON. This also enables the target processed_compile_commands, which rewrites compile_commands.json to describe the actor compiler source file, not the post-processed output files, and places the output file in the source directory. This file should then be picked up automatically by any tooling.

Note that if building inside of the foundationdb/build docker image, the resulting paths will still be incorrect and require manual fixing. One will wish to re-run cmake with -DCMAKE_EXPORT_COMPILE_COMMANDS=OFF to prevent it from reverting the manual changes.

Using IDEs

CMake has built in support for a number of popular IDEs. However, because flow files are precompiled with the actor compiler, an IDE will not be very useful as a user will only be presented with the generated code - which is not what she wants to edit and get IDE features for.

The good news is, that it is possible to generate project files for editing flow with a supported IDE. There is a CMake option called OPEN_FOR_IDE which will generate a project which can be opened in an IDE for editing. You won't be able to build this project, but you will be able to edit the files and get most edit and navigation features your IDE supports.

For example, if you want to use XCode to make changes to FoundationDB you can create a XCode-project with the following command:

cmake -G Xcode -DOPEN_FOR_IDE=ON <FDB_SOURCE_DIRECTORY>

You should create a second build-directory which you will use for building and debugging.

FreeBSD

  1. Check out this repo on your server.

  2. Install compile-time dependencies from ports.

  3. (Optional) Use tmpfs & ccache for significantly faster repeat builds

  4. (Optional) Install a JDK for Java Bindings. FoundationDB currently builds with Java 8.

  5. Navigate to the directory where you checked out the foundationdb repo.

  6. Build from source.

    sudo pkg install -r FreeBSD \
        shells/bash devel/cmake devel/ninja devel/ccache  \
        lang/mono lang/python3 \
        devel/boost-libs devel/libeio \
        security/openssl
    mkdir .build && cd .build
    cmake -G Ninja \
        -DUSE_CCACHE=on \
        -DUSE_DTRACE=off \
        ..
    ninja -j 10
    # run fast tests
    ctest -L fast
    # run all tests
    ctest --output-on-failure -v
    

Linux

There are no special requirements for Linux. A docker image can be pulled from foundationdb/build that has all of FoundationDB's dependencies pre-installed, and is what the CI uses to build and test PRs.

cmake -G Ninja <FDB_SOURCE_DIR>
ninja
cpack -G DEB

For RPM simply replace DEB with RPM.

MacOS

The build under MacOS will work the same way as on Linux. To get boost and ninja you can use Homebrew.

cmake -G Ninja <PATH_TO_FOUNDATIONDB_SOURCE>

To generate a installable package,

ninja
$SRCDIR/packaging/osx/buildpkg.sh . $SRCDIR

Windows

Under Windows, only Visual Studio with ClangCl is supported

  1. Install Visual Studio 2019 (IDE or Build Tools), and enable llvm support
  2. Install CMake 3.15 or higher
  3. Download Boost 1.77.0
  4. Unpack boost to C:\boost, or use -DBOOST_ROOT=<PATH_TO_BOOST> with cmake if unpacked elsewhere
  5. Install Python if is not already installed by Visual Studio
  6. (Optional) Install OpenJDK 11 to build Java bindings
  7. (Optional) Install OpenSSL 3.x to build with TLS support
  8. (Optional) Install WIX Toolset to build Windows installer
  9. mkdir build && cd build
  10. cmake -G "Visual Studio 16 2019" -A x64 -T ClangCl <PATH_TO_FOUNDATIONDB_SOURCE>
  11. msbuild /p:Configuration=Release foundationdb.sln
  12. To increase build performance, use /p:UseMultiToolTask=true and /p:CL_MPCount=<NUMBER_OF_PARALLEL_JOBS>
Description
FoundationDB - the open source, distributed, transactional key-value store
Readme Apache-2.0 252 MiB
Languages
C++ 68.2%
C 18.7%
Python 4.1%
Java 3.2%
Go 1.5%
Other 4.1%