(Almost?) fixes https://emqx.atlassian.net/browse/EMQX-9637
During the course of performance tests comparing the performance of
e5.0.3 and e4.4.16 regarding the webhook bridge in sync mode, we
observed that the throughput in e5.0.3 (sync) was much lower than in
e4.4.16: ~ 9 k msgs / s vs. ~ 50 k msgs / s, respectively.
Analyzing `observer_cli` output, we noticed that a lot of the time
both buffer workers and ehttpc processes was spent in `ets:info/2`.
That function was called to check the size of the inflight table when
updating metrics and checking if the inflight table was full. Other
uses of `ets:info/2` were contained inside the arguments to some
`?tp/2` macro usages (https://github.com/kafka4beam/snabbkaffe/pull/60).
By using a specific record to track the size of the table, we managed
to improve the bridge performance to ~ 45 k msgs / s in sync mode.
Port of https://github.com/emqx/emqx/pull/10154 for `release-50`
Fixes https://emqx.atlassian.net/browse/EMQX-9099
Originally, the `resume_interval`, which is what defines how often a
buffer worker will attempt to retry its inflight window, was set to
the same as the `health_check_interval`. This had the problem that,
with default values, `health_check_interval = request_timeout`. This
meant that, if a buffer worker with those configs were ever blocked,
all requests would have timed out by the time it retried them.
Here we change the default `resume_interval` to a reasonable value
dependent on `health_check_interval` and `request_timeout`, and also
expose that as a hidden parameter for fine tuning if necessary.
Fixes https://emqx.atlassian.net/browse/EMQX-9130
Since buffer workers always support async calls ("outer calls"), we
should decouple those two call modes (inner and outer), and avoid
exposing the inner call configuration to user to avoid complexity.
For bridges that currently only allow sync query modes, we should
allow them to be configured with async. That means basically all
bridge types except Kafka Producer.
This commit makes it possible to use the ${var} syntax to refer to
variables in the payload of the message in the collection field.
This makes it possible to select which collection to insert into
dynamically.
Fixes:
https://emqx.atlassian.net/browse/EMQX-9246
Fixes https://emqx.atlassian.net/browse/EMQX-9392
The returned information does not allow to diagnose the issue (i.e.: a
connection issue due to the wrong host and port, the wrong password
failing authn). However, such information is printed to the logs.
This changes the returned error to the API so that the user is hinted
at looking at the logs for further investigation of the error.
To make the configuration more intuitive and avoid exposing more
parameters to the user, we should:
1) Remove reset_by_subscriber as an enum constructor for
`offset_reset_policy`, as that might make the consumer hang
indefinitely without manual action.
2) Set the `begin_offset` `brod_consumer` parameter to `earliest` or
`latest` depending on the value of `offset_reset_policy`, as that’s
probably the user’s intention.
Fixes https://emqx.atlassian.net/browse/EMQX-9129
Currently, if an user configures a bridge with query mode sync, then
all calls to the underlying driver/connector ("inner calls") will
always be synchronous, regardless of its support for async calls.
Since buffer workers always support async queries ("outer calls"), we
should decouple those two call modes (inner and outer), and avoid
exposing the inner call configuration to user to avoid complexity.
There are two situations when we want to force synchronous calls to
the underlying connector even if it supports async:
1) When using `simple_sync_query`, since we are bypassing the buffer
workers;
2) When retrying the inflight window, to avoid overwhelming the
driver.