Commit Graph

192 Commits

Author SHA1 Message Date
Thales Macedo Garitezi 8aa7c014e7 perf(buffer_worker): avoid calling `ets:info/2`
(Almost?) fixes https://emqx.atlassian.net/browse/EMQX-9637

During performance tests comparing e5.0.3 and e4.4.16 on the webhook
bridge in sync mode, we observed that the throughput in e5.0.3 (sync)
was much lower than in e4.4.16: ~ 9 k msgs / s vs. ~ 50 k msgs / s,
respectively.

Analyzing `observer_cli` output, we noticed that a lot of the time in
both buffer workers and ehttpc processes was spent in `ets:info/2`.
That function was called to check the size of the inflight table when
updating metrics and checking if the inflight table was full.  Other
uses of `ets:info/2` were contained inside the arguments to some
`?tp/2` macro usages (https://github.com/kafka4beam/snabbkaffe/pull/60).

By using a specific record to track the size of the table, we managed
to improve the bridge performance to ~ 45 k msgs / s in sync mode.
2023-05-02 17:05:32 -03:00
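The idea in the commit above, keeping an explicit size counter next to the table instead of asking the table for its size on every check, can be sketched as follows. This is a minimal Python illustration, not the Erlang code from the commit; `InflightTable`, `max_inflight`, and the method names are hypothetical.

```python
class InflightTable:
    """Toy inflight window that tracks its own size.

    Hypothetical stand-in for a table plus a dedicated size record.
    """

    def __init__(self, max_inflight):
        self.max_inflight = max_inflight
        self._entries = {}
        self._size = 0  # updated on every insert/ack, never recomputed

    def insert(self, ref, request):
        # Only grow the counter for new references; overwrites keep the size.
        if ref not in self._entries:
            self._size += 1
        self._entries[ref] = request

    def ack(self, ref):
        # Only shrink the counter if the reference was actually present.
        if ref in self._entries:
            del self._entries[ref]
            self._size -= 1

    def size(self):
        return self._size

    def is_full(self):
        return self._size >= self.max_inflight
```

Reading the counter is a constant-time lookup, so metric updates and fullness checks no longer have to query the table itself.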
Zaiming (Stone) Shi b58d3e8f94
Merge pull request #10529 from zmstone/0426-ensure-buffer-worker-monitors-cassandra-conn-pid
0426 ensure buffer worker monitors cassandra conn pid
2023-04-27 22:54:30 +02:00
Zaiming (Stone) Shi c83d630c97 fix(cassandra): ensure async calls return connection pid
so the buffer worker can monitor it and perform retries
if the connection restarted
2023-04-26 14:33:37 +02:00
firest 9eccfa5cdf fix(dynamo): fix test case errors 2023-04-26 17:03:01 +08:00
firest d826b0921d fix(dynamo): separate the implementation of connector and client of Dynamo bridge 2023-04-23 10:57:36 +08:00
firest d865998a63 fix(rocketmq): fix test cases 2023-04-21 11:02:14 +08:00
firest e89f4d4565 fix(dynamo): fix terminology errors
- Changed `username` to `aws_access_key_id`
- Changed `password` to `aws_secret_access_key`
2023-04-19 16:26:42 +08:00
firest bc353b0a06 fix(dynamo): change `database` to `table` in the schema of the DynamoDB bridge
There is no `database` concept in DynamoDB; the correct term is `table`.
2023-04-19 15:56:29 +08:00
Stefan Strigler f0c13e0134 fix: stale ref to emqx_map_lib 2023-04-14 18:43:53 +02:00
Stefan Strigler edd1bc579f fix: stale ref to emqx_json after rebase 2023-04-14 16:32:42 +02:00
Stefan Strigler 062ce5f819 refactor: rename emqx_map_lib to emqx_utils_maps 2023-04-14 13:41:34 +02:00
Stefan Strigler 9c11bfce80 refactor: rename emqx_misc to emqx_utils 2023-04-14 13:41:27 +02:00
Stefan Strigler f8e9e54393 refactor: move emqx_json to emqx_utils_json 2023-04-14 13:31:27 +02:00
JimMoen 9ce0dbae7b
test: fix batch_size parameter 2023-04-14 13:11:45 +08:00
JimMoen 2484a79c7a
test: create bridge with invalid password 2023-04-14 12:52:08 +08:00
JimMoen 437096985b
test: fix SUITE case `t_create_disconnected/1` 2023-04-14 10:02:47 +08:00
JimMoen a379d909bf
test: use type VARCHAR to use utf8 encoding in sqlserver 2023-04-14 10:02:46 +08:00
JimMoen c366267b0f
test: MS SQL Server data bridge 2023-04-14 10:02:46 +08:00
Thales Macedo Garitezi 9acfe00498
Merge pull request #10347 from thalesmg/refactor-kafka-bridge-dirs-v50
refactor(kafka_bridge): move kafka bridge into its own app
2023-04-13 13:26:36 -03:00
firest a4d9234b24 test(dynamo): remove the flaky test case 2023-04-13 12:02:22 +08:00
Thales Macedo Garitezi 871ee90b3e refactor(kafka_bridge): move kafka bridge into its own app
Fixes https://emqx.atlassian.net/browse/EMQX-9481
2023-04-12 13:54:45 -03:00
Ivan Dyachkov f01e2f358b
Merge pull request #10367 from id/0411-sync-release-50-back-to-master
0411 sync release 50 back to master
2023-04-12 17:23:17 +02:00
Ivan Dyachkov bdffa925db chore: merge upstream/master release-50 2023-04-12 15:30:20 +02:00
JianBo He 9560fdc5a2 chore: typo fixes 2023-04-12 14:16:40 +08:00
JianBo He 30bdffe318 feat: support async and batch callback for cassandra connector 2023-04-10 15:08:10 +08:00
Thales Macedo Garitezi 4c24b08244 fix(rule_action): fix metrics for bridges returning `async_return`
Kafka Producer, when called asynchronously, will return
`{async_return, {ok, pid()}}`, which currently counts as an unknown failure.
2023-04-06 16:00:01 -03:00
Thales Macedo Garitezi 5d5b7ea215
Merge pull request #10306 from thalesmg/enable-async-buffer-workers-all-bridges-v50
feat(bridges): enable async query mode for all bridges with buffer workers
2023-04-04 17:10:46 -03:00
Thales Macedo Garitezi 0b6fd7fe14 fix(buffer_worker): check request timeout and health check interval
Port of https://github.com/emqx/emqx/pull/10154 for `release-50`

Fixes https://emqx.atlassian.net/browse/EMQX-9099

Originally, the `resume_interval`, which is what defines how often a
buffer worker will attempt to retry its inflight window, was set to
the same as the `health_check_interval`.  This had the problem that,
with default values, `health_check_interval = request_timeout`.  This
meant that, if a buffer worker with those configs were ever blocked,
all requests would have timed out by the time it retried them.

Here we change the default `resume_interval` to a reasonable value
dependent on `health_check_interval` and `request_timeout`, and also
expose that as a hidden parameter for fine tuning if necessary.
2023-04-04 08:58:36 -03:00
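One way to derive such a default is sketched below. The commit message does not state the exact formula; taking the smaller of `health_check_interval` and a third of `request_timeout` is an assumption made for illustration, as are the function name and the millisecond units.

```python
def default_resume_interval(request_timeout_ms, health_check_interval_ms):
    """Pick a retry interval short enough to retry before requests expire.

    Assumed rule (not confirmed by the commit message): at most a third of
    the request timeout, never longer than the health check interval, and
    at least 1 ms.
    """
    return max(1, min(health_check_interval_ms, request_timeout_ms // 3))
```

With equal values for both settings, for example 15000 ms each, this rule gives a resume interval of 5000 ms, so a blocked worker gets a chance to retry before its requests time out.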
Thales Macedo Garitezi a8f8228a12
Merge pull request #10308 from thalesmg/test-increase-peer-timeout-v50
test(peer): increase init and startup timeout for peer nodes
2023-04-03 15:46:13 -03:00
Thales Macedo Garitezi f3ffc02bff feat(bridges): enable async query mode for all bridges with buffer workers
Fixes https://emqx.atlassian.net/browse/EMQX-9130

Since buffer workers always support async calls ("outer calls"), we
should decouple those two call modes (inner and outer), and avoid
exposing the inner call configuration to the user to avoid complexity.

For bridges that currently only allow sync query modes, we should
allow them to be configured with async.  That means basically all
bridge types except Kafka Producer.
2023-04-03 14:49:51 -03:00
Thales Macedo Garitezi 8b5a717a1f test(peer): increase init and startup timeout for peer nodes
Attempt to stabilize tests that use cluster nodes.
2023-04-03 13:20:22 -03:00
Thales Macedo Garitezi ec1871ffde test(janitor): catch each callback invocation 2023-04-03 10:20:19 -03:00
zhongwencool d63680cf25
Merge pull request #10307 from emqx/release-50
Sync release-50 back to master
2023-04-02 11:36:41 +08:00
Kjell Winblad 58898ea11d
Merge pull request #10294 from kjellwinblad/kjell/feat/collection_var_syntax_mongodb/EMQX-9246
feat: (MongoDB bridge) use ${var} syntax in MongoDB collection field
2023-03-31 17:18:27 +02:00
Thales Macedo Garitezi 5011486b18 fix(kafka_consumer): return better error messages when probing kafka consumer bridge
Fixes https://emqx.atlassian.net/browse/EMQX-9422
2023-03-31 11:33:15 -03:00
Kjell Winblad e808fef1e4 feat: (MongoDB bridge) use ${var} syntax for MongoDB collection
This commit makes it possible to use the ${var} syntax to refer to
variables in the payload of the message in the collection field.
This makes it possible to select which collection to insert into
dynamically.

Fixes:
https://emqx.atlassian.net/browse/EMQX-9246
2023-03-30 17:49:56 +02:00
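The placeholder substitution described in the commit above can be sketched like this. It is a minimal Python illustration, not the bridge's implementation; the function name `render_collection` and the dotted-path lookup for nested fields are assumptions.

```python
import re

# Matches ${name} and captures the text between the braces.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def render_collection(template, payload):
    """Replace each ${var} in `template` with the matching payload value.

    Assumption: a dotted name such as ${a.b} walks nested dictionaries.
    A missing key raises KeyError instead of inserting a default.
    """

    def lookup(match):
        value = payload
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)
```

A template without placeholders is returned unchanged, so a fixed collection name keeps working.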
Thales Macedo Garitezi 632bffd451 fix: return friendly message when kafka producer fails to start (rv5.0)
Fixes https://emqx.atlassian.net/browse/EMQX-9392

The returned information does not make it possible to diagnose the
issue (e.g. a connection issue due to a wrong host and port, or a wrong
password failing authn).  However, such information is printed to the
logs.

This changes the error returned to the API so that the user is pointed
to the logs for further investigation of the error.
2023-03-30 11:51:36 -03:00
Zaiming (Stone) Shi 81a104690d test: fix flaky influxdb test 2023-03-30 16:19:22 +02:00
Zaiming (Stone) Shi 80eb9d7542
Merge pull request #10252 from emqx/release-50
0327 merge release-50 to master
2023-03-29 12:33:17 +02:00
Thales Macedo Garitezi 64faccf50b test: fix flaky kafka consumer test 2023-03-28 14:50:55 -03:00
Thales Macedo Garitezi 1a7ca7235e
Merge pull request #10249 from thalesmg/fix-kafka-offset-doc-rv50
feat(kafka_consumer): tie `offset_reset_policy` and `begin_offset` together (rv5.0)
2023-03-28 11:37:46 -03:00
Thales Macedo Garitezi 69fc1123ee refactor: change enum constructors and improve docs 2023-03-27 17:30:17 -03:00
Thales Macedo Garitezi 5cf09209cd feat: tie `offset_reset_policy` and `begin_offset` together
To make the configuration more intuitive and avoid exposing more
parameters to the user, we should:

1) Remove `reset_by_subscriber` as an enum constructor for
`offset_reset_policy`, as that might make the consumer hang
indefinitely without manual action.

2) Set the `begin_offset` `brod_consumer` parameter to `earliest` or
`latest` depending on the value of `offset_reset_policy`, as that’s
probably the user’s intention.
2023-03-27 14:20:31 -03:00
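The coupling described in the commit above amounts to a small mapping from the policy to the starting offset. The sketch below is a Python illustration only; the function name is hypothetical, and the policy values are assumed to be spelled `earliest` and `latest`.

```python
def begin_offset_for(offset_reset_policy):
    """Derive the consumer's starting offset from the reset policy.

    Assumption: the policy is given as the string "earliest" or "latest".
    Any other value, including the removed `reset_by_subscriber`, is rejected.
    """
    allowed = {"earliest": "earliest", "latest": "latest"}
    if offset_reset_policy not in allowed:
        raise ValueError(f"unsupported offset_reset_policy: {offset_reset_policy!r}")
    return allowed[offset_reset_policy]
```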
Zaiming (Stone) Shi da5e6f3d0a test: test with only one Kafka partition for bad config recover test 2023-03-27 17:38:34 +02:00
JianBo He bfa5922209
Merge pull request #10140 from HJianBo/cassa
feat: support cassandra data bridge
2023-03-27 10:23:02 +08:00
Thales Macedo Garitezi ff272a2071
Merge pull request #10206 from thalesmg/decouple-buffer-worker-query-call-mode-v50
feat(buffer_worker): decouple query mode from underlying connector call mode
2023-03-24 13:49:00 -03:00
Thales Macedo Garitezi f8d5d53908 feat(buffer_worker): decouple query mode from underlying connector call mode
Fixes https://emqx.atlassian.net/browse/EMQX-9129

Currently, if a user configures a bridge with query mode sync, then
all calls to the underlying driver/connector ("inner calls") will
always be synchronous, regardless of its support for async calls.

Since buffer workers always support async queries ("outer calls"), we
should decouple those two call modes (inner and outer), and avoid
exposing the inner call configuration to the user to avoid complexity.

There are two situations when we want to force synchronous calls to
the underlying connector even if it supports async:

1) When using `simple_sync_query`, since we are bypassing the buffer
workers;
2) When retrying the inflight window, to avoid overwhelming the
driver.
2023-03-23 13:40:31 -03:00
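The decision described in the commit above can be sketched as a single function. This is a Python illustration of one reading of the message, not the buffer worker's code: the function and parameter names are hypothetical, and it assumes that the inner call is asynchronous whenever the connector supports it and neither exception applies.

```python
def inner_call_mode(connector_supports_async,
                    simple_sync_query=False,
                    retrying_inflight=False):
    """Choose how to call the underlying connector ("inner call").

    The user-facing query mode ("outer call") is deliberately not an input:
    the two modes are decoupled.
    """
    # Forced sync: the buffer workers are bypassed, or the inflight window
    # is being retried and the driver should not be overwhelmed.
    if simple_sync_query or retrying_inflight:
        return "sync"
    return "async" if connector_supports_async else "sync"
```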
JianBo He 9b63bdc1e0 chore: apply review suggestions
- Rename sql to cql
- Add tests for `bridges_probe` API
2023-03-23 15:27:34 +08:00
JianBo He 8cbbc9f271 Merge remote-tracking branch 'upstream/master' into cassa 2023-03-23 11:53:17 +08:00
Thales Macedo Garitezi ddffba0355
Merge pull request #10154 from thalesmg/fix-buffer-worker-default-req-timeout
fix(buffer_worker): calculate default `resume_interval` based on `request_timeout` and `health_check_interval`
2023-03-22 20:21:04 -03:00