When EMQX boots up, it tries to get the latest config from peer
(core type) nodes. If none of the nodes reply, the node decides
to boot with its local config (and replay the committed changes) if
the commit table was loaded from local disk (an indication that the
data is the latest); otherwise it sleeps for 1-2 seconds and
retries.
This leads to a race condition, e.g. in a two-node cluster:
1. node1 boots up
2. node2 boots up and copies the mnesia table from node1
3. node1 restarts before node2 can sync cluster.hocon from it
4. node1 boots up and copies the mnesia table from node2
Now both node1 and node2 have the mnesia `load_node` pointing
to each other (i.e. not a local disk load).
Prior to this fix, the nodes would wait for each other indefinitely.
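The deadlock can be illustrated with a small model of the pre-fix boot decision. This is only a sketch in Python, not the EMQX implementation: the `Node` class, the `boot_decision` function and the returned strings are hypothetical names, and it assumes that neither peer replies because each is itself still waiting to boot.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of a booting node (not an EMQX data structure)."""
    name: str
    load_node: str              # node the commit table was loaded from
    peer_replied: bool = False  # did any core peer return its config?


def boot_decision(node: Node) -> str:
    """Pre-fix decision as described in the text above."""
    if node.peer_replied:
        return "use_peer_config"
    if node.load_node == node.name:
        # Table was loaded from local disk, so local data is taken as latest.
        return "use_local_config"
    # Table was copied from a peer: sleep 1-2 seconds and try again.
    return "retry"


# State after step 4 of the race: each node loaded the table from the other.
node1 = Node("node1", load_node="node2")
node2 = Node("node2", load_node="node1")
decisions = [boot_decision(n) for n in (node1, node2)]
print(decisions)  # ['retry', 'retry']

# A node that loaded its table from local disk boots with its local config.
standalone = boot_decision(Node("node1", load_node="node1"))
print(standalone)  # use_local_config
```

Since both nodes return `retry` on every round and nothing changes between rounds, neither of them ever finishes booting.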
This commit fixes the issue by allowing a node to boot
with its local config if it is not lagging behind.
Fixes https://emqx.atlassian.net/browse/EMQX-11330
Following feedback from the Product team, `bridges_v2` should be renamed to `actions` everywhere.
We'll start with the public-facing APIs:
- HTTP API
- Hocon schema root key
Fixes https://emqx.atlassian.net/browse/EMQX-11086
There’s currently a metric inconsistency due to the internal buffering nature of Kafka
Producer (wolff).
We use `simple_sync_query` to call the Kafka Producer bridge. If that call times out, it
is counted as failed, even though the message is buffered in wolff and later sent
successfully.
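A toy model makes the mismatch concrete. This is a sketch in Python under stated assumptions, not wolff's or EMQX's actual API: `BufferingProducer`, `sync_query`, the `times_out` flag and the metric names are all hypothetical, and the timeout is simulated with a flag rather than a real clock.

```python
from collections import deque


class BufferingProducer:
    """Hypothetical stand-in for a buffering producer (not wolff's API)."""

    def __init__(self):
        self.buffer = deque()
        self.delivered = []

    def enqueue(self, msg):
        self.buffer.append(msg)

    def flush(self):
        # Send everything that is currently buffered.
        while self.buffer:
            self.delivered.append(self.buffer.popleft())


def sync_query(producer, msg, metrics, *, times_out):
    """Synchronous call; `times_out` simulates the caller giving up early."""
    producer.enqueue(msg)  # the message is buffered in either case
    if times_out:
        metrics["failed"] += 1  # counted as failed, but still in the buffer
        return "timeout"
    producer.flush()
    metrics["success"] += 1
    return "ok"


producer = BufferingProducer()
metrics = {"success": 0, "failed": 0}

result = sync_query(producer, "m1", metrics, times_out=True)
producer.flush()  # later, the producer sends the buffered message

print(result, metrics, producer.delivered)
```

In this model the metrics end up reporting one failure and no successes, while the message has in fact been delivered.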
Fixes https://emqx.atlassian.net/browse/EMQX-10944
Also updates ekka -> 0.15.15, mria -> 0.6.4
How to test
===========
1. Start 2 or more EMQX nodes and join them into a cluster.
2. Stop them in order.
3. Start only the first node that was stopped in the previous step.
4. Wait until the log is printed.
Or, more easily:
1. Start 2 or more EMQX nodes and join them into a cluster.
2. Stop all but one.
3. Run `mria_mnesia:diagnosis([]).` on that node.
Example output
==============
```
Check check_open_ports should get ok but got #{msg =>
"some ports are unreachable",
results =>
#{'emqx@172.100.239.4' =>
#{open_ports =>
#{4370 => false,
5370 =>
false},
ports_to_check =>
[4370,5370],
resolved_ips =>
[{172,100,239,
4}],
status =>
bad_ports},
'emqx@172.100.239.5' =>
#{open_ports =>
#{4370 => false,
5370 =>
false},
ports_to_check =>
[4370,5370],
resolved_ips =>
[{172,100,239,
5}],
status =>
bad_ports}}}
```
After one node is back:
```
Check check_open_ports should get ok but got #{msg =>
"some ports are unreachable",
results =>
#{'emqx@172.100.239.4' =>
#{ports_to_check =>
[4370,5370],
resolved_ips =>
[{172,100,239,
4}],
status => ok},
'emqx@172.100.239.5' =>
#{open_ports =>
#{4370 => false,
5370 =>
false},
ports_to_check =>
[4370,5370],
resolved_ips =>
[{172,100,239,
5}],
status =>
bad_ports}}}
```