Commit Graph

48 Commits

Author SHA1 Message Date
Thales Macedo Garitezi 890970345b test: reduce flakiness 2023-12-22 15:31:56 -03:00
Thales Macedo Garitezi 85963d3b45 test: restore connect timeout for test case 2023-12-22 14:10:24 -03:00
zhongwencool 2d5f7e0a6d
Merge pull request #12215 from thalesmg/sync-r54-m-20231221
sync r54 to master
2023-12-22 10:24:29 +08:00
Zaiming (Stone) Shi f4286f3208 test(gcp_pubsub): increase wait timeout and fix fault injection 2023-12-21 20:59:51 +01:00
Thales Macedo Garitezi b2c82ab052 test(gcp_pubsub_consumer): yet another attempt to stabilize tests
Hopefully, should work better after https://github.com/emqx/emqx/pull/12197
2023-12-20 13:21:13 -03:00
Thales Macedo Garitezi 5128c11542 test(gcp_pubsub_consumer): another attempt to stabilize flaky tests 2023-12-18 17:37:58 -03:00
Zaiming (Stone) Shi 22f7cc1622 test: replace 'slave' and 'ct_slave' with 'peer' 2023-12-01 08:07:09 +01:00
Ivan Dyachkov ec10c51073 Merge remote-tracking branch 'upstream/release-53' into 1129-sync-r53 2023-11-30 19:51:12 +01:00
Thales Macedo Garitezi 62b763a8f8 test(gcp_pubsub_consumer): even more adjustments 2023-11-29 15:12:47 -03:00
Zaiming (Stone) Shi 14644988e0 chore: change triple-quotes to single-quotes 2023-11-29 16:15:18 +01:00
Thales Macedo Garitezi 095e7c4ecb test(flaky): more adjustments 2023-11-28 13:41:31 -03:00
Thales Macedo Garitezi f2dbddc315 test: attempting to stabilize more flaky tests 2023-11-27 11:36:32 -03:00
Thales Macedo Garitezi 261fe8a831 Merge remote-tracking branch 'origin/release-53' into sync-r53-m-20231124 2023-11-24 10:10:09 -03:00
Thales Macedo Garitezi e95ec5b150 test: fix another flaky test 2023-11-24 09:24:21 -03:00
Thales Macedo Garitezi f8fd95c683 Merge remote-tracking branch 'origin/release-53' into sync-r53-m-20231124 2023-11-24 09:22:24 -03:00
Thales Macedo Garitezi db83457d13 test: fix flaky test
The cause was that the call `sys:terminate/2` was timing out...

`exit/2` doesn't always work:

```
 2023-11-22 19:14:40.974
killed async workers

Error: -22T19:14:40.974563+00:00 [error] crasher: initial call: gun:proc_lib_hack/5, pid: <0.15908.7>, registered_name: [], exit: {{{owner_gone,killed},[{gun,owner_gone,1,[{file,"gun.erl"},{line,970}]},{gun,proc_lib_hack,5,[{file,"gun.erl"},{line,649}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]},[{gun,proc_lib_hack,5,[{file,"gun.erl"},{line,654}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [gun_sup,<0.15387.7>], message_queue_len: 0, messages: [], links: [<0.15388.7>], dictionary: [], trap_exit: false, status: running, heap_size: 987, stack_size: 28, reductions: 1822; neighbours:
Error: -22T19:14:40.998051+00:00 [error] Supervisor: {local,gun_sup}. Context: child_terminated. Reason: {{owner_gone,killed},[{gun,owner_gone,1,[{file,"gun.erl"},{line,970}]},{gun,proc_lib_hack,5,[{file,"gun.erl"},{line,649}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}. Offender: id=gun,pid=<0.15908.7>.
2023-11-22T19:15:41.088752+00:00 [critical] Run stage failed: error:{badmatch,{timeout,#{expected_remaining => 1,mailbox => {messages,[]},msgs_so_far => []}}}, Stacktrace: [{emqx_bridge_gcp_pubsub_consumer_SUITE,'-t_async_worker_death_mid_pull/1-fun-17-',3,[{file,"/emqx/apps/emqx_bridge_gcp_pubsub/test/emqx_bridge_gcp_pubsub_consumer_SUITE.erl"},{line,1576}]},{emqx_bridge_gcp_pubsub_consumer_SUITE,t_async_worker_death_mid_pull,1,[{file,"/emqx/apps/emqx_bridge_gcp_pubsub/test/emqx_bridge_gcp_pubsub_consumer_SUITE.erl"},{line,1505}]}], Trace dump: "/emqx/_build/test/logs/ct_run.test@127.0.0.1.2023-11-22_19.14.27/snabbkaffe/1700680540975786370.log", mfa: undefined
Error: -22T19:15:46.095702+00:00 [error] crasher: initial call: gun:proc_lib_hack/5, pid: <0.15934.7>, registered_name: [], exit: {{{owner_gone,killed},[{gun,owner_gone,1,[{file,"gun.erl"},{line,970}]},{gun,proc_lib_hack,5,[{file,"gun.erl"},{line,649}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]},[{gun,proc_lib_hack,5,[{file,"gun.erl"},{line,654}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [gun_sup,<0.15387.7>], message_queue_len: 0, messages: [], links: [<0.15388.7>], dictionary: [], trap_exit: false, status: running, heap_size: 610, stack_size: 28, reductions: 1471; neighbours:
Error: -22T19:15:46.095192+00:00 [error] Supervisor: {local,ehttpc_sup}. Context: shutdown_error. Reason: killed. Offender: id={ehttpc_pool_sup,<<98,114,105,100,103,101,58,103,99,112,95,112,117,98,115,117,98,95,99,111,110,115,117,109,101,114,58,116,95,97,115,121,110,99,95,119,111,114,107,101,114,95,100,101,97,116,104,95,109,105,100,95,112,117,108,108,45,53,55,54,52,54,48,55,53,50,51,48,51,52,50,50,55,53,49>>},pid=<0.15903.7>.
Error: -22T19:15:46.095470+00:00 [error] Supervisor: {<0.15906.7>,ehttpc_worker_sup}. Context: shutdown_error. Reason: killed. Offender: id={worker,1},pid=<0.15924.7>.
Error: -22T19:15:46.096762+00:00 [error] Supervisor: {local,gun_sup}. Context: child_terminated. Reason: {{owner_gone,killed},[{gun,owner_gone,1,[{file,"gun.erl"},{line,970}]},{gun,proc_lib_hack,5,[{file,"gun.erl"},{line,649}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}. Offender: id=gun,pid=<0.15934.7>.
Warning: 2T19:15:46.098278+00:00 [warning] msg: remove_local_resource_failed, mfa: emqx_resource:remove_local/1(362), error: {error,timeout}, resource_id: <<"bridge:gcp_pubsub_consumer:t_async_worker_death_mid_pull-576460752303422751">>
Error: -22T19:15:46.149090+00:00 [error] Generic server <0.15904.7> terminating. Reason: killed. Last message: {'EXIT',<0.15903.7>,killed}. State: {state,<<"bridge:gcp_pubsub_consumer:t_async_worker_death_mid_pull-576460752303422751">>,1,random}.
Error: -22T19:15:46.149525+00:00 [error] crasher: initial call: ehttpc_pool:init/1, pid: <0.15904.7>, registered_name: [], exit: {killed,[{gen_server,decode_msg,9,[{file,"gen_server.erl"},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [<0.15903.7>,ehttpc_sup,<0.15731.7>], message_queue_len: 0, messages: [], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 376, stack_size: 28, reductions: 3428; neighbours:
```
2023-11-22 17:41:32 -03:00
Thales Macedo Garitezi fc849f0c05 ci(test): add info to help diagnose flaky test 2023-11-22 12:36:10 -03:00
Thales Macedo Garitezi 9e1796ec4f feat(gcp_pubsub_producer): migrate GCP PubSub producer to actions
Fixes https://emqx.atlassian.net/browse/EMQX-11157
2023-11-21 14:22:42 -03:00
Kjell Winblad 9dc3a169b3 feat: split bridges into a connector part and a bridge part
Co-authored-by: Thales Macedo Garitezi <thalesmg@gmail.com>
Co-authored-by: Stefan Strigler <stefan.strigler@emqx.io>
Co-authored-by: Zaiming (Stone) Shi <zmstone@gmail.com>

Several bridges should be able to share a connector pool defined by a
single connector. It should be possible to enable and disable
connectors, just as bridges can be enabled and disabled. There should
also be an API for checking the status of a connector and for adding,
editing, and deleting connectors, similar to the current bridge API.

Issues:
https://emqx.atlassian.net/browse/EMQX-10805
2023-10-30 14:48:47 +01:00
Zaiming (Stone) Shi 7c2f87fabe test: merge broker and router boot modules 2023-09-06 21:36:16 +02:00
Thales Macedo Garitezi e5041de9cc fix(gcp_consumer): handle 401 errors
Fixes https://emqx.atlassian.net/browse/EMQX-10852
2023-08-24 09:03:34 -03:00
Thales Macedo Garitezi 029b461a13
Merge pull request #11448 from thalesmg/gcp-consumer-403-20230815
fix(gcp_consumer): handle 403 responses
2023-08-17 09:04:00 -03:00
Thales Macedo Garitezi ba956ebe88 fix(gcp_consumer): handle 403 responses
Fixes https://emqx.atlassian.net/browse/EMQX-10736
2023-08-15 13:20:20 -03:00
Thales Macedo Garitezi 23f5cea482 feat: handle strange key values when resolving placeholders 2023-08-14 13:39:38 -03:00
Thales Macedo Garitezi 82b8538041 feat(gcp_producer): add support for defining message attributes and ordering key
Fixes https://emqx.atlassian.net/browse/EMQX-10652
2023-08-14 10:33:17 -03:00
Thales Macedo Garitezi 5c8dc092a1 fix(http_bridge): don't attempt to convert headers to atoms
Fixes https://emqx.atlassian.net/browse/EMQX-10653
2023-08-07 13:08:34 -03:00
Ivan Dyachkov 243b8f5b67 chore: merge 'upstream/master' into v5.1.2 2023-07-21 13:25:46 +02:00
Thales Macedo Garitezi 6cd503865b fix(machine_boot): ensure `emqx_bridge` starts after its companion apps
We need to reverse the dependency of `emqx_bridge` and `emqx_bridge_*`, because the former
loads and starts bridges during its application startup.  If the individual bridge
application being loaded has not started with its dependencies, the supervision tree will
not be ready for that.
2023-07-20 13:11:44 -03:00
Thales Macedo Garitezi 05c3e023a9 chore(gcp_pubsub_consumer): unhide GCP PubSub Consumer bridge for e5.2.0
Fixes https://emqx.atlassian.net/browse/EMQX-10506
2023-07-17 11:24:21 -03:00
Thales Macedo Garitezi 01b143c5ad fix(resource): don't destruct error tuple
Otherwise, `emqx_resource:query` won't correctly deem the resource to
be unhealthy when there's an extra message.
2023-07-13 16:12:33 -03:00
Thales Macedo Garitezi 0dff428efb
Merge pull request #11262 from thalesmg/fix-gcp-consumer-hc-20230712-master
fix(gcp_pubsub_consumer): fail health check when there are no workers
2023-07-13 15:09:02 -03:00
Thales Macedo Garitezi be7918aa41 fix(gcp_pubsub_consumer): fail health check when there are no workers
`ecpool` already returns an error even if the worker process is dead,
but we add the empty worker list clause here just for completeness.
2023-07-12 16:31:21 -03:00
Kjell Winblad f28510b3ad refactor: HTTP connector into emqx_bridge_http app 2023-07-12 14:46:43 +02:00
Thales Macedo Garitezi 1a058f6890 test(gcp_consumer): attempts to reduce flakiness 2023-06-30 12:49:23 -03:00
Thales Macedo Garitezi 30e0b4be54 test(gcp_pubsub_consumer): add more tests and improve bridge
Fixes https://emqx.atlassian.net/browse/EMQX-10309
2023-06-28 14:08:40 -03:00
Thales Macedo Garitezi 22356b7c25 chore: hide gcp pubsub consumer until e5.2.0 2023-06-22 10:05:52 -03:00
Thales Macedo Garitezi 0463828e84 refactor(gcp_pubsub): transform connector into opaque client 2023-06-20 15:27:42 -03:00
Thales Macedo Garitezi 2ac2d4c037 refactor: addressing review comments 2023-06-20 11:15:13 -03:00
Thales Macedo Garitezi b442910ff1 feat(gcp_pubsub_consumer): implement GCP PubSub Consumer bridge
Fixes https://emqx.atlassian.net/browse/EMQX-10281
2023-06-19 16:04:12 -03:00
Thales Macedo Garitezi bb49482529 refactor(gcp_pubsub): move logging from connector to bridge 2023-06-19 15:59:00 -03:00
Thales Macedo Garitezi 5375421954 refactor(gcp_pubsub): split connector into producer and reusable parts 2023-06-19 15:59:00 -03:00
Thales Macedo Garitezi 260fae296b feat(gcp_pubsub): generate jwt tokens on demand without workers (5.1)
Fixes https://emqx.atlassian.net/browse/EMQX-9603

Rather than relying on a JWT worker to produce and refresh tokens, we
could just produce them on demand when pushing the messages to GCP
PubSub.  That can generate a bit of extra work (as multiple processes
might realize it’s time to refresh the JWT and do so), but that
shouldn’t be much.  In return, we avoid any possibility of not having
a fresh JWT when pushing messages.
2023-06-06 13:19:24 -03:00
Thales Macedo Garitezi 99796224d8 refactor(resource): rename `request_timeout` -> `request_ttl`
See
https://emqx.atlassian.net/wiki/spaces/P/pages/612368639/open+e5.1+remove+auto+restart+interval+from+buffer+worker+resource+options
2023-06-01 13:01:53 -03:00
Thales Macedo Garitezi 1e25ebb64c test(gcp_pubsub): attempt to fix flakiness
https://github.com/emqx/emqx/actions/runs/5125118728/jobs/9218520994?pr=10887#step:8:309
```
  =CRITICAL REPORT==== 30-May-2023::19:19:34.887082 ===
  "check stage" failed: error
  {assertMatch,[{module,emqx_bridge_gcp_pubsub_SUITE},
                {line,1066},
                {expression,"? of_kind ( gcp_pubsub_request_failed , Trace )"},
                {pattern,"[ # { reason := Error , connector := ResourceId } | _ ]"},
                {value,[#{connector =>
                              <<"bridge:gcp_pubsub:emqx_bridge_gcp_pubsub_SUITE0005FCEE15534E9CD4CD02004CF10000">>,
                          msg => gcp_pubsub_request_failed,query_mode => async,
                          reason => {closed,"The connection was lost."},
                          recoverable_error => true,
                          '~meta' =>
                              #{gl => <0.17903.2>,
                                location =>
                                    #Fun<emqx_bridge_gcp_pubsub_connector.19.19548918>,
                                node => 'test@127.0.0.1',pid => <0.19724.2>,
                                time => -576460610660164}}]}]}
  Stacktrace: [{emqx_bridge_gcp_pubsub_SUITE,
                   '-do_econnrefused_or_timeout_test/2-fun-2-',3,
                   [{file,
                        "/__w/emqx/emqx/source/apps/emqx_bridge_gcp_pubsub/test/emqx_bridge_gcp_pubsub_SUITE.erl"},
                    {line,1066}]},
               {emqx_bridge_gcp_pubsub_SUITE,do_econnrefused_or_timeout_test,2,
                   [{file,
                        "/__w/emqx/emqx/source/apps/emqx_bridge_gcp_pubsub/test/emqx_bridge_gcp_pubsub_SUITE.erl"},
                    {line,1022}]}]
```
2023-05-31 10:19:55 -03:00
Thales Macedo Garitezi 324459990f Merge branch 'release-50' into merge-r50-into-v50-20230524 2023-05-24 12:54:15 -03:00
Thales Macedo Garitezi 4565acc600 fix: handle `infinity` timeout option in `ehttpc` (r5.0)
Fixes https://emqx.atlassian.net/browse/EMQX-9987
2023-05-24 10:33:54 -03:00
Thales Macedo Garitezi 7d798c10e9 perf(buffer_worker): flush metrics periodically inside buffer worker process
Fixes https://emqx.atlassian.net/browse/EMQX-9905

Since calling `telemetry` is costly in a hot path, we instead collect
metrics inside the buffer worker's state and periodically flush them,
rather than immediately as events happen.
2023-05-22 09:11:23 -03:00
Thales Macedo Garitezi a9bd91fcff refactor(gcp_pubsub): move GCP PubSub Bridge to its own app
Fixes https://emqx.atlassian.net/browse/EMQX-9536

Note: since GCP PubSub is not shared by any authn/authz backend,
there's no need to separate its connector into another app.
2023-04-19 13:24:32 -03:00