Commit Graph

37 Commits

Author SHA1 Message Date
Andrew Mayorov b3e7e51094
test(bridge): drop unnecessary cleanup routines
Since `end_per_testcase` cleans out all the resources anyway.
2023-03-15 19:17:29 +03:00
Thales Macedo Garitezi 8fbb948b6f test: fix flaky mqtt bridge test
Sometimes, this test fails because the metrics are still in the
inflight phase.
2023-03-06 09:09:33 -03:00
Andrew Mayorov cbb2885499
fix(mqtt-bridge): allow to configure `clean_start` for ingresses 2023-02-10 19:15:48 +03:00
Andrew Mayorov 7002fe2ef4
fix(mqtt-bridge): disallow QoS 2 on ingress bridges 2023-02-10 17:17:59 +03:00
Andrew Mayorov 48efe552d1
test(mqtt-bridge): manage test flaps by waiting for flush events 2023-02-09 14:31:42 +03:00
Zaiming (Stone) Shi a71d983ff8 test: fix assertMatch patter to make rebar3 fmt work 2023-02-02 16:37:45 +01:00
Andrew Mayorov 0912f13c1f
fix(mqtt-bridge): stop respecting `clean_start` config parameter
We are ignoring the user configuration because there's currently no
reliable way to ensure proper session recovery according to the MQTT
spec.
2023-02-02 12:55:10 +03:00
Andrew Mayorov 8a46cb974e
test(mqtt-bridge): test async bridge reconnects seamlessly 2023-02-01 16:28:16 +03:00
Andrew Mayorov 5ebceb20d2
test(mqtt-bridge): also test reconfiguration 2023-02-01 16:23:58 +03:00
Andrew Mayorov ad88938d34
refactor: reuse some parts of test code for brewity 2023-02-01 16:22:19 +03:00
Andrew Mayorov d0c10b59aa
feat(mqtt-bridge): avoid middleman process
Instead, supervise `emqtt` client process directly.
2023-01-31 17:59:03 +03:00
Stefan Strigler 7005b71ddf style: fix typo in comment 2023-01-27 11:43:51 +01:00
Thales Macedo Garitezi ca4a262b75 refactor: re-organize dealing with unrecoverable errors 2023-01-20 12:00:17 -03:00
Thales Macedo Garitezi 6fa6c679bb feat(buffer_worker): add expiration time to requests
With this, we avoid performing work or replying to callers that are no
longer waiting on a result.

Also introduces two new counters:

- `dropped.expired` :: happens when a request expires before being
  sent downstream
- `late_reply` :: when a response is receive from downstream, but the
  caller is no longer for a reply because the request has expired, and
  the caller might even have retried it.
2023-01-20 11:36:52 -03:00
Thales Macedo Garitezi 47f796dd12 refactor: rename `emqx_resource_worker` -> `emqx_resource_buffer_worker`
To make it more clear that it's purpose is serve as a buffering layer.
2023-01-18 16:15:34 -03:00
Thales Macedo Garitezi fa01deb3eb chore: retry as much as possible, don't reply to caller too soon 2023-01-17 16:49:15 -03:00
Thales Macedo Garitezi 478fcc6ffd test: fix flaky test 2023-01-17 16:48:48 -03:00
Thales Macedo Garitezi 32a9e60313 feat(buffer_worker): also use the inflight table for sync requests
Related: https://emqx.atlassian.net/browse/EMQX-8692

This should also correctly account for `retried.*` metrics for sync
requests.

Also fixes cases where race conditions for retrying async requests
could potentially lead to inconsistent metrics.

Fixes more cases where a stale reference to `replayq` was being held
accidentally after a `pop`.
2023-01-17 16:48:48 -03:00
Stefan Strigler 67909f0b40 fix: testing metrics for emqx_bridge_mqtt_SUITE 2023-01-16 12:10:06 +01:00
JimMoen 54ebc27d24
Merge pull request #9672 from JimMoen/0103-fix-mqtt-bridge
Fix the problem that the bridge is not available when the Payload template is empty in the MQTT bridge.
2023-01-16 09:57:20 +08:00
Erik Timan f1c58c34ed test(emqx_bridge): fix fetching of metrics in emqx_bridge_mqtt_SUITE 2023-01-13 14:19:23 +01:00
JimMoen b7259d9a20
test(mqtt-bridge): use empty payload template for ingress/egress mqtt bridge 2023-01-13 18:12:53 +08:00
Thales Macedo Garitezi fd360ac6c0 feat(buffer_worker): refactor buffer/resource workers to always use queue
This makes the buffer/resource workers always use `replayq` for
queuing, along with collecting multiple requests in a single call.
This is done to avoid long message queues for the buffer workers and
rely on `replayq`'s capabilities of offloading to disk and detecting
overflow.

Also, this deprecates the `enable_batch` and `enable_queue` resource
creation options, as: i) queuing is now always enables; ii) batch_size
> 1 <=> batch_enabled.  The corresponding metric
`dropped.queue_not_enabled` is dropped, along with `batching`.  The
batching is too ephemeral, especially considering a default batch time
of 20 ms, and is not shown in the dashboard, so it was removed.
2023-01-05 10:15:09 -03:00
Thales Macedo Garitezi 7e02eac3bc
Merge pull request #9619 from thalesmg/refactor-gauges-v50
refactor(metrics): use absolute gauge values rather than deltas (v5.0)
2023-01-02 10:56:47 -03:00
Zaiming (Stone) Shi dbc10c2eed chore: update copyright year 2023 2023-01-02 09:22:27 +01:00
Thales Macedo Garitezi 8b060a75f1 refactor(metrics): use absolute gauge values rather than deltas
https://emqx.atlassian.net/browse/EMQX-8548

Currently, we face several issues trying to keep resource metrics
reasonable.  For example, when a resource is re-created and has its
metrics reset, but then its durable queue resumes its previous work
and leads to strange (often negative) metrics.

Instead using `counters` that are shared by more than one worker to
manage gauges, we introduce an ETS table whose key is not only scoped
by the Resource ID as before, but also by the worker ID.  This way,
when a worker starts/terminates, they should set their own gauges to
their values (often 0 or `replayq:count` when resuming off a queue).
With this scoping and initialization procedure, we'll hopefully avoid
hitting those strange metrics scenarios and have better control over
the gauges.
2022-12-30 16:51:24 -03:00
Zaiming (Stone) Shi 0b43ae621d ci: dump docker-compose log if failed to run ct 2022-12-29 09:23:11 +01:00
Thales Macedo Garitezi 35dc75b7ed feat(mqtt): add option to customize clientid prefix for egress bridges
https://emqx.atlassian.net/browse/EMQX-8445

Currently the bridge client’s client ID is prefixed with the resource
ID.

Sometimes it’s useful for users to have control of this prefix,
e.g. prefix based ACL rules in the target broker.
2022-12-23 09:50:26 -03:00
William Yang 6b6bb09d4a fix: ingress only bridge causes egress bridge traffic stop 2022-12-09 15:36:20 +01:00
Zaiming (Stone) Shi 31098d6c67 test: fix bridge api tests, metrics is now a separate api 2022-12-06 19:03:32 +01:00
Thales Macedo Garitezi 2d01726b22 fix: account calls when resource is not connected as matched 2022-10-13 15:32:04 -03:00
Thales Macedo Garitezi f0ff32c031 test: fix tests after counter changes 2022-10-11 17:45:48 -03:00
Shawn 9aa7e826cb refactor(resource): fast resume resource worker if inflight msgs are ACKed 2022-09-17 00:34:30 +08:00
Shawn d5d3972ff5 chore: add test cases for MQTT Bridge reconnecting 2022-09-15 10:19:33 +08:00
Shawn b9ae4ea276 refactor: rename some metrics for emqx_resource 2022-09-13 14:04:25 +08:00
Shawn 73e19d84ee feat: use the new metrics to bridge APIs 2022-08-30 23:47:58 +08:00
Shawn 55c18c0b5f fix(bridges): update the test cases for new config structure 2022-08-22 18:24:59 +08:00