Found by simulation:
seed: -f tests/slow/ApiCorrectnessAtomicRestore.toml -s 177856328 -b on
Commit: 51ad8428e0fbe1d82bc76cf42b1579f51ecf2773
Compiler: clang++
Env: Rhel9 okteto
applyMutations() had processed versions 801400000-803141392 and called sendCommitTransactionRequest(),
which was going to update the apply begin version to 803141392, but it DID NOT wait for the
transaction to commit.
Then an update to the apply end version to 845345760 picked up the PREVIOUS apply begin version
801400000, and thus started another applyMutations() with version range 801400000-845345760. Because
the previous applyMutations() had finished without waiting for the transaction commit, the starting
version was wrong. As a result, this applyMutations() re-processed version range 801400000-803141392.
The test failed during the re-processing, because mutations are missing for the overlapped range.
The fix is to wait for the transaction to commit in sendCommitTransactionRequest().
This bug probably affects DR as well.
See rdar://146877552
20250317-162835-jzhou-ff4c4d6d7c51bfed
This patch adds TLS support for GrpcServer and AsyncGrpcClient by
implementing `GrpcCredentialsProvider` and using that to get channel
credentials. It adds `FlowGrpc` which is a flow global instance, and
initializes TLS credentials that are consistent with the ones provided
to FlowTransport.
- Added `FlowGrpc` to manage gRPC server initialization and TLS
configuration globally.
- `GrpcCredentialsProvider` abstracts secure/insecure communications
configurations for server/clients.
- Introduced `GrpcTlsCredentialProvider` for dynamic TLS certificate
reloading from filesystem and `GrpcTlsCredentialStaticProvider` for
static in-memory credentials.
- Updated `GrpcServer` to accept a `GrpcCredentialProvider`, enabling
dynamic TLS credential management.
- Modified `fdbserver` to use `FlowGrpc::init()` for gRPC server
initialization instead of `GrpcServer::initInstance()`, aligning it
with FlowTransport behavior.
- Modified `GrpcServer::run()` to use the provided
`GrpcCredentialProvider` instead of hardcoded insecure credentials.
Testing:
- Implemented a basic mTLS test case (`/fdbrpc/grpc/basic_tls`) to
verify secure gRPC connections using
`GrpcTlsCredentialStaticProvider`.
Todo:
- Generate certificates during test runs instead of statically.
- Add test for `GrpcTlsCredentialProvider` which reads keys/certs from
filesystem and monitors changes.
- Verify peer rules/criteria like FDB's --verify-peer feature.
`getFuture()` should be called before `post`, because the `send`/`sendError`
operations in `ThreadReturnPromise` move the underlying `Promise` into
`tagAndForward()`.
Ideally, `ThreadReturnPromise` behavior should stay consistent with
`Promise`. However, it relies on the invariant that there is always exactly
one owner of its internal `Promise` -- either itself or `tagAndForward()` --
which is necessary to ensure that only one thread can operate on the
Promise's internal state (ref count, flags, etc.) and avoid race conditions.
This patch (1) makes sure that in the `post()` case we get the future
first, (2) adds an ASSERT since this should never happen, (3) adds
documentation for future users, and (4) adds a test case for potentially
fixing this in the future.
* ENABLE_VERSION_VECTOR_REPLY_RECOVERY can be true only if ENABLE_VERSION_VECTOR_TLOG_UNICAST is true
* Respond to review comments
---------
Co-authored-by: Dan Lambright <hlambright@apple.com>
Tighten up options for bulk*. Compound 'local' and 'blobstore' as 'dump'/'load'. Ditto for 'history'.
Make it so 'bulkload mode' works like 'bulkdump mode': i.e. it dumps the current mode.
If mode is not on for bulk*, ERROR in the same manner as for writemode.
Make it so we can return bulk* subcommand-specific help rather than dumping all help when there is an issue.
Make the commands match in the ctest.
* - Extend the unicast-based recovery algorithm to do the replication policy check
* - Review comments related changes
* - Review and compilation related changes
* Hash the file before uploading. Add the hash as a tag after a successful
multipart upload. On download, after the file is on disk,
compute its hash and compare it to the tag we get from S3.
* fdbclient/CMakeLists.txt
Be explicit what s3client needs.
* fdbclient/S3BlobStore.actor.cpp
* fdbclient/include/fdbclient/S3BlobStore.h
Add putObjectTags and getObjectTags
* fdbclient/S3Client.actor.cpp
Add calculating checksum, adding it as
tags on upload, fetching on download,
and verifying match if present.
Clean up includes.
Less logging.
* fdbclient/tests/s3client_test.sh
Less logging.
* Make failed checksum check an error (and mark non-retryable)
---------
Co-authored-by: michael stack <stack@duboce.com>
* Track shard moves for version vector
* Don't broadcast to all tlogs when a different commit proxy had a metadata mutation, unless on shard moves
* update lastShardMove on resolver
* Respond to review comments
---------
Co-authored-by: Dan Lambright <hlambright@apple.com>
* fdbcli/BulkDumpCommand.actor.cpp
* fdbcli/BulkLoadCommand.actor.cpp
Print out the bulkdump description rather than the usage so the user
has a chance of figuring out what they entered incorrectly.
Make bulkdump and bulkload align by using 'cancel' instead of
'clear' in both and by ordering the sub-commands the same for
bulkload and bulkdump. Add more help to the description.
Bulkload was missing mention of the jobid needed when
specifying a bulkload.
* documentation/sphinx/source/bulkdump.rst
s/clearBulkDumpJob/cancelBulkDumpJob/
Co-authored-by: stack <stack@duboce.com>
Simulation found an assertion failure in SS:
ASSERT(rollbackVersion >= data->storageVersion());
The reason is that the storage version was updated to a version larger than the
forced recovery version, because max_read_transaction_life_versions is only 1'000'000.
Also added debugging for cumulative checksum mutations.
See rdar://144550725
20250309-185039-jzhou-5145c65b0e8071b7
* * fdbclient/S3Client.actor.cpp
Change field names so they are capitalized (convention).
Add duration as a field to traces.
* fdbserver/BulkLoadUtil.actor.cpp
When the job-manifest is big, processing blocks
for so long that getBulkLoadJobFileManifestEntryFromJobManifestFile
fails.
* Make bulkload file reads and writes async and memory parsimonious.
In tests at scale, processing a large job-manifest.txt was blocking
and causing the bulk job to fail. This is part 1 of two patches.
The second addresses the data copies added below when we
made methods ACTORs (an ACTOR doesn't allow passing by reference).
* fdbserver/BulkDumpUtil.actor.cpp
Removed writeStringToFile and bulkDumpFileCopy in favor of new methods
in BulkLoadUtil. Made the hosting functions ACTORs so they could wait on
async calls.
* fdbserver/BulkLoadUtil.actor.cpp
Added async read and write functions.
* fdbserver/DataDistribution.actor.cpp
Making uploadBulkDumpJobManifestFile async made it so big bulkloads
work.
* fix memory corruption in writeBulkFileBytes and fix read options in getBulkLoadJobFileManifestEntryFromJobManifestFile
* If read or write < 1MB, do it in a single read else do multiple read/writes
* * packaging/docker/fdb-aws-s3-credentials-fetcher/fdb-aws-s3-credentials-fetcher.go
Just be blunt and write out the credentials. Trying to figure out when the
blob credentials have expired is error-prone.
Co-authored-by: michael stack <stack@duboce.com>
Co-authored-by: Zhe Wang <zhe.wang@wustl.edu>
The copy constructor can be added back if necessary. Meanwhile, it's simpler to keep the
ThreadReturnPromise* family move-only and avoid scattering copies all over the place.
When a value/error is sent via `ThreadReturnPromiseStream`, we assume that the underlying
`PromiseStream` will be alive when the client waits. However, if the last
`ThreadReturnPromiseStream` is destroyed after sending values/end_of_stream(), the underlying
`PromiseStream` is destroyed as well, resulting in `broken_promise`. This happens because the
actual work of sending the value/error is deferred to the main thread.
This is likely to happen because the sender has done its work and isn't supposed to check whether
the client got the value; hence there is little reason to keep the promise. Meanwhile, the client
is free to read values from its future whenever it needs to.
This patch holds a reference to the underlying `NotifiedQueue` by copying the `PromiseStream` until
the value/error is sent. The added test would fail without this patch.
This patch has two sets of changes:
- Whenever a service is registered with or removed from the server, we need to restart the gRPC
  server. GrpcServer provides some methods that worker actors can use so that the lifetime of the
  services they register is tied to the lifetime of the worker role itself.
- Replace asio::thread_pool with AsyncTaskExecutor in both client and server.
This patch implements `AsyncTaskExecutor` for asynchronous execution of tasks in a separate thread
pool. We already have `IThreadPool`; however, its API is better suited to bigger tasks. This just
provides an easier-to-use API.
There is `AsyncTaskThread`, which is similar in nature, but this component does not re-wrap
IThreadPool and hence can have multiple worker threads. We can potentially replace AsyncTaskThread
with this component by setting `num_threads = 1`.
TODO: Move this to `flow/include` instead of here.
* During commits with version vector enabled, compute the location list only once, as recalculating
could generate a different random number, and hence a different set of locations.
* Respond to review comments.
* Select replicas from locations returned from resolver.
* Respond to review comments
---------
Co-authored-by: Dan Lambright <hlambright@apple.com>