Commit Graph

240 Commits

Author SHA1 Message Date
Andrew Mayorov 5d7b2e2ce6
fix(dsrepl): attempt leadership transfer on terminate
In addition to on removal. The reasoning is basically the same: try to
avoid situations when log entries are replicated (or will be considered
replicated when the new leader is elected) but the leader terminates
before replying to the client.

To be clear: this is a stupid solution. Something much more robust is
needed.
2024-04-15 22:05:24 +02:00
Andrew Mayorov 89f42f1171
fix(dsrepl): make placeholder shard process permanent under supervisor 2024-04-15 16:43:52 +02:00
Andrew Mayorov c4d1360b96
fix(dsrepl): trigger election for new ra servers unconditionallly
Otherwise we might end up in a situation when there's no member online
yet at the time of the election trigger, and the election will never
happen.
2024-04-15 16:42:29 +02:00
Andrew Mayorov d12e907209
fix(dsrepl): correctly handle ra membership change command results
Before this change, results similar to `{error, {no_more_servers_to_try,
[{error, nodedown}, {error, not_member}]}}` were considered retryable
failures, which is incorrect.
2024-04-08 22:44:34 +02:00
Andrew Mayorov 3223797ae5
fix(dsrepl): attempt leadership transfer before server removal
This should make it much less likely to hit weird edge cases that lead
to duplicate Raft log entries because of client retries upon receiving
`shutdown` from the leader being removed.
2024-04-08 22:43:58 +02:00
Andrew Mayorov 1e95bd4da6
test(dsrepl): test unresponsive nodes removal / node restarts 2024-04-08 21:27:56 +02:00
Andrew Mayorov 7a836317ac
fix(dsrepl): trigger unfinished shard transition upon startup
Also provide a trivial API to trigger them by hand.
2024-04-08 16:12:42 +02:00
Andrew Mayorov 75bb7f5cdc
fix(dsrepl): retry only `{add, Site}` crashed membership transitions
To minimize the potential negative impact of removal transitions that
crash for some unknown and unusual reasons.
2024-04-08 16:04:33 +02:00
Andrew Mayorov 4c0cc079c2
fix(dsrepl): apply unnecessary rebalancing transitions cleanly 2024-04-08 13:25:45 +02:00
Andrew Mayorov dcde30c38a
test(dsrepl): add two more testcases for rebalancing 2024-04-08 13:22:31 +02:00
Andrew Mayorov 2ace9bb893
chore(dsrepl): sprinkle few comments and typespecs for exports 2024-04-07 22:51:56 +02:00
Andrew Mayorov ecaad348a7
chore(dsrepl): update few outdated comments / TODOs 2024-04-07 22:51:56 +02:00
Andrew Mayorov 6293efb995
fix(dsrepl): retry crashed membership transitions 2024-04-07 22:51:56 +02:00
Andrew Mayorov 826ce5806d
fix(dsrepl): ensure that new member UID matches server's UID
Before that change, UIDs supplied in the `ra:add_member/3` were not
the same as those servers were using. This haven't caused any issues
for some reason, but it's better to ensure that UIDs are the same.
2024-04-07 22:31:24 +02:00
Andrew Mayorov 556ffc78c9
feat(dsrepl): implement membership changes and rebalancing 2024-04-05 18:57:28 +02:00
Andrew Mayorov d6058b7f51
feat(dsrepl): allow to subscribe to DB metadata changes
Currently, only shard metadata changes are announced to the
subscribers.
2024-04-05 17:40:55 +02:00
Andrew Mayorov a07295d3bc
fix(ds): address shards in the supervisor properly 2024-04-05 17:40:38 +02:00
ieQu1 a62db08676
feat(ds): Add REST API for durable storage 2024-04-05 15:22:06 +02:00
ieQu1 d09787d1a6
fix(ds): Fix return types in replication_layer_meta 2024-04-05 15:22:06 +02:00
Andrew Mayorov 70396e9766
Merge pull request #12825 from keynslug/feat/EMQX-12110/repl-meta-api
feat(dsrepl): add APIs to manage DB replication sites
2024-04-04 22:32:03 +02:00
Andrew Mayorov df6c5b35fe
feat(dsrepl): add more primitive operations to modify DB sites 2024-04-04 21:22:49 +02:00
Andrew Mayorov bb8ffee18c
feat(dsrepl): add API to get current DB replication sites 2024-04-04 21:22:02 +02:00
Andrew Mayorov ad52f7838e
feat(dsrepl): add APIs to manage DB replication sites 2024-04-04 21:22:01 +02:00
Thales Macedo Garitezi c57c36adb2 feat(ds): clear all checkpoints when (re)starting storage layer
Fixes https://emqx.atlassian.net/browse/EMQX-12143
2024-04-04 14:05:52 -03:00
ieQu1 f37ed3a40a fix(ds): Limit the number of retries in egress to 0 2024-04-03 16:38:49 +02:00
ieQu1 2bbfada7af
fix(ds): Make async batches truly async 2024-04-03 11:57:47 +02:00
ieQu1 92ca90c0ca
fix(ds): Improve egress logging 2024-04-03 11:57:47 +02:00
ieQu1 ae5935e7f7
test(ds): Attempt to stabilize metrics_worker tests in CI 2024-04-02 19:14:10 +02:00
ieQu1 4382971443
fix(ds): Preserve errors in the egress 2024-04-02 16:47:43 +02:00
ieQu1 94ca7ad0f8
feat(ds): Report counters for LTS storage layout 2024-04-02 16:47:43 +02:00
ieQu1 b379f331de
fix(sessds): Handle errors when storing messages 2024-04-02 16:47:41 +02:00
ieQu1 f41e538526
feat(sessds): Observe next time 2024-04-02 16:45:52 +02:00
ieQu1 75b092bf0e
fix(ds): Actually retry sending batch 2024-04-02 16:45:49 +02:00
ieQu1 0de255cac8
feat(ds): Report egress flush time 2024-04-02 16:25:04 +02:00
ieQu1 044f3d4ef5
fix(ds): Don't reverse entries in the atomic batch 2024-04-02 16:25:04 +02:00
ieQu1 606f2a88cd
feat(ds): Add egress metrics 2024-04-02 16:25:04 +02:00
ieQu1 c9de336234
feat(ds): Add metrics worker to the builtin db supervision tree 2024-04-02 16:25:04 +02:00
Andrew Mayorov 778e897f1f
chore(dsrepl): describe snapshot ownership and few shortcomings 2024-04-02 13:48:51 +02:00
Andrew Mayorov c666c65c6a
test(ds): factor out storage iteration into helper module 2024-04-02 13:48:51 +02:00
Andrew Mayorov 7cebf598a8
chore(dsrepl): simplify snapshot transfer code a bit
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-04-02 13:48:51 +02:00
Andrew Mayorov e029b8f996
test(dsrepl): wait for whole cluster readiness
To minimize the chance of flaky tests due to the shards not being
completely online.

Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-04-02 13:48:50 +02:00
Andrew Mayorov e8b06a6a9f
chore(dsrepl): mark few more BPAPI targets as obsolete 2024-04-02 13:48:50 +02:00
Andrew Mayorov d31cd0c728
feat(ds): ensure LTS state ids are deterministic 2024-04-02 13:48:50 +02:00
Andrew Mayorov 2cd357a5bd
fix(ds): ensure store batch is idempotent wrt generations 2024-04-02 13:48:50 +02:00
Andrew Mayorov 77a022bd93
feat(dsrepl): transfer storage snapshot during ra snapshot recovery 2024-04-02 13:48:49 +02:00
Andrew Mayorov b8b9b7739b
chore(ds): slightly simplify working with storage generations 2024-04-02 13:48:08 +02:00
Andrew Mayorov fa66a640c3
fix(dsrepl): handle RPC errors gracefully when storage is down 2024-03-28 15:17:01 +01:00
Ivan Dyachkov db9efb9317 chore: bump apps versions 2024-03-28 10:19:09 +01:00
Thales Macedo Garitezi 796c04e7a8 test: fix flaky test
We should emit the trace event before replying to callers.

Example failure:

https://github.com/emqx/emqx/actions/runs/8378977952/job/22946318696#step:6:182

```
 =CRITICAL REPORT==== 21-Mar-2024::17:45:37.676024 ===
"check stage" failed: error
{assertMatch,[{module,emqx_ds_storage_bitfield_lts_SUITE},
              {line,270},
              {expression,"? of_kind ( emqx_ds_replication_layer_egress_flush , Trace )"},
              {pattern,"[ # { batch := [ _ , _ , _ ] } ]"},
              {value,[]}]}
Stacktrace: [{emqx_ds_storage_bitfield_lts_SUITE,
                 '-t_atomic_store_batch/1-fun-1-',1,
                 [{file,
                      "/__w/emqx/emqx/apps/emqx_durable_storage/test/emqx_ds_storage_bitfield_lts_SUITE.erl"},
                  {line,270}]},
             {emqx_ds_storage_bitfield_lts_SUITE,t_atomic_store_batch,1,
                 [{file,
                      "/__w/emqx/emqx/apps/emqx_durable_storage/test/emqx_ds_storage_bitfield_lts_SUITE.erl"},
                  {line,249}]}]
```
2024-03-21 15:47:29 -03:00
Thales Macedo Garitezi 68af211130 fix(ds): reply sync callers after raft store failure 2024-03-21 15:40:21 -03:00