yuanbiao/emqx - emqx

Commit Graph

Author	SHA1	Message	Date
Serge Tupchii	b5eda9f0d1	perf(emqx_resource): don't reactivate alarms on reoccurring errors Avoid unnecessary calls to activate an alarm if it has been already activated. Fixes: EMQX-9529/#10357	2023-04-20 16:37:33 +03:00
Thales Macedo Garitezi	cb995e2033	fix(buffer_worker): avoid sending late reply messages to callers Fixes https://emqx.atlassian.net/browse/EMQX-9635 During a sync call from process `A` to a buffer worker `B`, its call to the underlying resource `C` can be very slow. In those cases, `A` will receive a timeout response and expect no more messages from `B` nor `C`. However, prior to this fix, if `B` is stuck in a long sync call to `C` and then gets its response after `A` timed out, `B` would still send the late response to `A`, polluting its mailbox.	2023-04-19 18:27:10 -03:00
Thales Macedo Garitezi	f8d5d53908	feat(buffer_worker): decouple query mode from underlying connector call mode Fixes https://emqx.atlassian.net/browse/EMQX-9129 Currently, if a user configures a bridge with query mode sync, then all calls to the underlying driver/connector ("inner calls") will always be synchronous, regardless of its support for async calls. Since buffer workers always support async queries ("outer calls"), we should decouple those two call modes (inner and outer), and avoid exposing the inner call configuration to the user to avoid complexity. There are two situations when we want to force synchronous calls to the underlying connector even if it supports async: 1) When using `simple_sync_query`, since we are bypassing the buffer workers; 2) When retrying the inflight window, to avoid overwhelming the driver.	2023-03-23 13:40:31 -03:00
Andrew Mayorov	e411c5d5f8	refactor(resman): work with state cache atomically Also ensure that cache entries are always consistent with `Data`, so that most of the code could rely on reading the cached entry most of the time.	2023-03-15 19:17:30 +03:00
Zaiming (Stone) Shi	c97d17cc91	test: refactor to loop wait for counters	2023-02-24 09:02:03 +01:00
Zaiming (Stone) Shi	3a6dbbdd05	refactor(buffer_worker): ensure flush message is never missed	2023-02-23 20:11:00 +01:00
Zaiming (Stone) Shi	dbfdeec5e9	fix(buffer_worker): log unknown async replies	2023-02-23 12:55:49 +01:00
Andrew Mayorov	d8d06a260f	test(buffer): add test on inflight overflow w/ async queries This testcase should verify that the buffer will retry all inflight queries failed with recoverable errors + flush all outstanding queries. Co-authored-by: ieQu1 <99872536+ieQu1@users.noreply.github.com>	2023-02-08 14:08:04 +03:00
Thales Macedo Garitezi	6fa6c679bb	feat(buffer_worker): add expiration time to requests With this, we avoid performing work or replying to callers that are no longer waiting on a result. Also introduces two new counters: - `dropped.expired` :: happens when a request expires before being sent downstream - `late_reply` :: when a response is received from downstream, but the caller is no longer waiting for a reply because the request has expired, and the caller might even have retried it.	2023-01-20 11:36:52 -03:00
Thales Macedo Garitezi	fa01deb3eb	chore: retry as much as possible, don't reply to caller too soon	2023-01-17 16:49:15 -03:00
Thales Macedo Garitezi	006b4bda97	feat(buffer_worker): monitor async workers and cancel their inflight requests upon death	2023-01-17 16:48:48 -03:00
Thales Macedo Garitezi	731ac6567a	fix(buffer_worker): don't retry all kinds of inflight requests Some requests should not be retried during the blocked state. For example, if some async requests are just taking some time to process, we should avoid retrying them periodically, lest we risk overloading the downstream further.	2023-01-17 16:48:48 -03:00
Thales Macedo Garitezi	c383558467	fix(buffer): fix `replayq` usages in buffer workers (5.0) https://emqx.atlassian.net/browse/EMQX-8700 Fixes a few errors in the usage of `replayq` queues. - Close `replayq` when `emqx_resource_worker` terminates. - Do not keep old references to `replayq` after any `pop`s. - Clear `replayq`'s data directories when removing a resource.	2023-01-17 16:48:48 -03:00
Thales Macedo Garitezi	fd360ac6c0	feat(buffer_worker): refactor buffer/resource workers to always use queue This makes the buffer/resource workers always use `replayq` for queuing, along with collecting multiple requests in a single call. This is done to avoid long message queues for the buffer workers and rely on `replayq`'s capabilities of offloading to disk and detecting overflow. Also, this deprecates the `enable_batch` and `enable_queue` resource creation options, as: i) queuing is now always enabled; ii) batch_size > 1 <=> batch_enabled. The corresponding metric `dropped.queue_not_enabled` is dropped, along with `batching`. The batching is too ephemeral, especially considering a default batch time of 20 ms, and is not shown in the dashboard, so it was removed.	2023-01-05 10:15:09 -03:00
Zaiming (Stone) Shi	dbc10c2eed	chore: update copyright year 2023	2023-01-02 09:22:27 +01:00
Thales Macedo Garitezi	f0ff32c031	test: fix tests after counter changes	2022-10-11 17:45:48 -03:00
Shawn	9aa7e826cb	refactor(resource): fast resume resource worker if inflight msgs are ACKed	2022-09-17 00:34:30 +08:00
Shawn	26234d38b9	fix: mark the async msg 'queuing' not 'sent.inflight' on recoverable_error	2022-09-02 18:41:43 +08:00
Shawn	6b0ccfbc43	refactor: rename the error return resource_down -> recoverable_error	2022-08-26 17:11:12 +08:00
Shawn	6203a01320	feat: add inflight window to emqx_resource	2022-08-11 08:36:35 +08:00
Shawn	82550a585a	fix: add test cases for query async	2022-08-10 00:45:34 +08:00
Shawn	efd6c56dd9	fix: test cases for batch query sync	2022-08-10 00:45:34 +08:00
Shawn	35fe70b887	feat: support async callback to connector modules	2022-08-10 00:34:35 +08:00
Shawn	a2afdeeb48	feat: add test cases for batching query	2022-08-10 00:34:35 +08:00

24 Commits