Commit Graph

192 Commits

Author SHA1 Message Date
Thales Macedo Garitezi 796c04e7a8 test: fix flaky test
We should emit the trace event before replying to callers.

Example failure:

https://github.com/emqx/emqx/actions/runs/8378977952/job/22946318696#step:6:182

```
 =CRITICAL REPORT==== 21-Mar-2024::17:45:37.676024 ===
"check stage" failed: error
{assertMatch,[{module,emqx_ds_storage_bitfield_lts_SUITE},
              {line,270},
              {expression,"? of_kind ( emqx_ds_replication_layer_egress_flush , Trace )"},
              {pattern,"[ # { batch := [ _ , _ , _ ] } ]"},
              {value,[]}]}
Stacktrace: [{emqx_ds_storage_bitfield_lts_SUITE,
                 '-t_atomic_store_batch/1-fun-1-',1,
                 [{file,
                      "/__w/emqx/emqx/apps/emqx_durable_storage/test/emqx_ds_storage_bitfield_lts_SUITE.erl"},
                  {line,270}]},
             {emqx_ds_storage_bitfield_lts_SUITE,t_atomic_store_batch,1,
                 [{file,
                      "/__w/emqx/emqx/apps/emqx_durable_storage/test/emqx_ds_storage_bitfield_lts_SUITE.erl"},
                  {line,249}]}]
```
2024-03-21 15:47:29 -03:00
Thales Macedo Garitezi 68af211130 fix(ds): reply sync callers after raft store failure 2024-03-21 15:40:21 -03:00
Thales Macedo Garitezi 70737a437a fix(ds): add caller to pending replies before flushing 2024-03-21 14:39:21 -03:00
Andrew Mayorov fe50a1711b
fix(ds-egress): drop pending batch on failures
Before this commit, messages in the current batch would be retried as
part of the next batch. This could have led to message duplication, which
is probably not what the user wants by default.
2024-03-20 13:20:25 +01:00
Andrew Mayorov a1f5de3f5b
fix(dsrepl): turn memoize into a safer function 2024-03-20 13:20:24 +01:00
Andrew Mayorov d39ca41070
chore(dsrepl): mark per-node `add_generation` RPC target obsolete
Also annotate internal exports with comments according to their
intended use.
2024-03-20 13:20:24 +01:00
Andrew Mayorov 35b18f9125
fix(dsrepl): properly handle error conditions in generation mgmt
Also update a few outdated typespecs and make error reasons easier
to comprehend.
2024-03-20 13:20:24 +01:00
Andrew Mayorov f2268aa69a
fix(dsrepl): use correct base dir for ra system stuff
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-03-20 13:20:24 +01:00
Andrew Mayorov 404e919494
refactor(dsrepl): make shard allocator more robust and consistent
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-03-20 13:20:24 +01:00
Andrew Mayorov 0e18bd6e80
fix(dsrepl): increase replication site id bitsize back
In order to reduce the chance of site id collisions to practically zero.
2024-03-20 13:20:24 +01:00
Andrew Mayorov ac9700dd28
fix(dsrepl): split shard allocator into a separate module 2024-03-20 13:20:23 +01:00
Andrew Mayorov 1b647035d0
chore(dsrepl): make dialyzer a bit happier 2024-03-20 13:20:23 +01:00
Andrew Mayorov 611b3f0e07
feat(dsrepl): use more straightforward way to drop ra shards 2024-03-20 13:20:23 +01:00
Andrew Mayorov 74881e8706
feat(dsrepl): make storage layer unaware of granularity of time
Storage also becomes a bit more pure, depending on the upper layer to
provide the timestamps, which also makes it possible to handle more
operations idempotently.
2024-03-20 13:20:23 +01:00
Andrew Mayorov 3cb36a5619
feat(ds-lts): extract timestamp from storage key itself
1. This avoids the need to deserialize the message to get the timestamp.
2. It also makes it possible to decouple the storage key timestamp from the
   message timestamp, which might be useful for replication purposes.
2024-03-19 20:21:56 +01:00
Andrew Mayorov 5cc0246351
feat(dsrepl): allow to tune select ra options 2024-03-19 20:21:55 +01:00
Andrew Mayorov 54b5adf868
feat(dsrepl): allocate shards predictably
To ensure strictly optimal and fair shard allocation across the
cluster. Before this commit it was quite easy to end up with
an allocation significantly skewed towards some node, because
of the nature of randomness and the relatively small number of
shards.
2024-03-19 20:21:55 +01:00
Andrew Mayorov 887e151be5
fix(dsrepl): handle errors gracefully in shard egress process
Also add cooldown on timeout / unavailability.
2024-03-19 20:21:53 +01:00
Andrew Mayorov e16aee99b4
chore(dsrepl): clarify how to perform leadership transfer in runtime 2024-03-19 20:21:18 +01:00
Andrew Mayorov 00d509f27b
feat(dsrepl): prefer local replica in read path
To optimize out any unnecessary RPCs. Given that the load should be
spread evenly across the cluster, choosing a non-leader node is
not a priority.
2024-03-19 20:11:42 +01:00
Andrew Mayorov 19305c223c
fix(dsrepl): require majority for replication-related tables 2024-03-19 20:11:42 +01:00
Andrew Mayorov f89909f60c
fix(dsrepl): tolerate trigger election timeouts for existing servers 2024-03-19 20:11:42 +01:00
Andrew Mayorov 3b59cf2ebf
feat(dsrepl): move shard allocation to separate process
That starts shard and egress processes only when shards are fully
allocated.
2024-03-19 20:11:41 +01:00
Andrew Mayorov 4dafbf21f6
fix(dsrepl): make db + shard part of machine state
It doesn't feel right, but right now it is the easiest way to have it
in the scope of `apply/3`, because `init/1` isn't actually invoked
for ra machines recovered from the existing log / snapshot.
2024-03-19 20:11:41 +01:00
Andrew Mayorov d19128ed65
feat(dsrepl): cache shard metadata in persistent terms 2024-03-19 20:11:41 +01:00
Andrew Mayorov e6c2c2fb07
feat(dsrepl): manage generations / db config through ra machine 2024-03-19 20:11:39 +01:00
Andrew Mayorov 5e94bdb932
feat(dsrepl): allocate shards once predefined number of sites online
Before this commit the most likely shard allocation outcome was
that all shards were allocated to just one node.
2024-03-19 20:11:03 +01:00
Andrew Mayorov be793e4735
fix(dsrepl): reassign timestamp at the time of submission
This is needed to ensure total message order for a shard, and
guarantee that no messages will be written "in the past", which
may break replay consistency.
2024-03-19 20:11:01 +01:00
Andrew Mayorov 146f082fdc
feat(dsrepl): implement raft-based replication
Still very rough but mostly working.
2024-03-19 20:09:44 +01:00
Ivan Dyachkov f2dc940436 Merge remote-tracking branch 'upstream/release-56' into 0319-sync-release56 2024-03-19 15:20:08 +01:00
Thales Macedo Garitezi 11ae04d810 test(ds): rm unused var warning 2024-03-14 17:16:24 -03:00
Thales Macedo Garitezi 2ebc8dcc55 fix(ds): use `infinity` timeout when storing batches 2024-03-14 10:17:18 -03:00
Thales Macedo Garitezi 6af01b916e feat(ds): implement `get_delete_streams`, `make_delete_iterator` and `delete_next` callbacks for builtin storage
Part of https://emqx.atlassian.net/browse/EMQX-11841
2024-03-08 09:56:46 -03:00
Andrew Mayorov f7e3afde16
test(ds): avoid introducing new macros 2024-03-07 16:49:20 +01:00
Andrew Mayorov 69427dc42d
test(ds): add tests for error mapping and replay recovery 2024-03-07 12:59:58 +01:00
Andrew Mayorov 09905d78cd
chore(ds): make error handling slightly simpler
Co-Authored-By: Thales Macedo Garitezi <thalesmg@gmail.com>
2024-03-07 12:59:57 +01:00
Andrew Mayorov b39c710ec2
fix(ds): tidy up few typespecs 2024-03-07 12:59:57 +01:00
Andrew Mayorov 2146d9e1fe
feat(ds): introduce error classes in critical API functions
For now, only recoverable / unrecoverable errors are introduced.
2024-03-07 12:59:57 +01:00
Thales Macedo Garitezi 5d87d400f4 feat(ds): add atomic store API
Part of https://emqx.atlassian.net/browse/EMQX-11841
2024-03-06 15:24:14 -03:00
Thales Macedo Garitezi 06334798a5 fix(ds): fix `drop_generation` typespec
This typespec fix will be used downstream by other backends.
2024-03-04 14:15:59 -03:00
Ilya Averyanov b706caf294 feat(ds): export types 2024-02-29 14:27:18 +03:00
Ilya Averyanov d5ae0e5c53 feat(ds): update delete/count interface 2024-02-28 22:51:24 +03:00
Ilya Averyanov b010d34640 chore(ds): add delete callbacks 2024-02-26 17:35:13 +03:00
Zaiming (Stone) Shi 46877e979b chore: update copyright-year 2024-02-23 08:21:06 +01:00
Thales Macedo Garitezi d469f4158e chore: bump app vsns 2024-02-20 16:53:57 -03:00
ieQu1 8cfb22f0b8
fix(ds): Retry getting the shard leader 2024-02-16 12:42:48 +01:00
ieQu1 280fcd8c52
Merge pull request #12437 from ieQu1/dev/optimize_make_filter
Optimize emqx_ds_bitmask_keymapper:make_filter function.
2024-02-05 17:32:28 +01:00
ieQu1 4665837cf0 fix(ds): Apply review remarks 2024-02-05 16:52:06 +01:00
ieQu1 c7888ad1f1
Merge pull request #12475 from ieQu1/dev/lean-stream
Use a more compact data structure to represent streams
2024-02-05 13:55:24 +01:00
ieQu1 698ba3f271 fix(ds): Optimize emqx_ds_bitmask_keymapper:make_filter
This optimization makes idle polling faster
2024-02-05 10:54:19 +01:00