Break the habit with acceptance of waiting recorders to time out #555

bertllll · 2024-11-12T01:03:19Z

I have a reasoned assumption that there are tests in the codebase which make use of time-outs that originally were intended just for the state of emergency, when beyond this point the test would be considered hanging.
These timeouts were implemented through the special auxiliary Actor called SystemKiller. This concept is probably used at also some other places, and it should be dealt with across the board, but here I'm talking mainly about the specific case when a test is set to be ended by so called StopConditions.

As you might know, the Actor system never stops unless it's asked to. It had caused a lot of issues in writing good tests because only System::current().stop() wouldn't serve us always, as some part of the planned actions in the Act mightn't have been carried out fully. Therefore the test wasn't reliable because of races of threads. The StopConditions allow specifying a message or a set of them for a recorder impersonating one of our normal Actors, these are then the conditions of stopping the system as soon as the recorder receives all the conditioned messages.

Because there is always a chance that the recorder doesn't receive them as the tested code might work differently than the developer thought it would - which points probably to a design issue - we put in also the SystemKiller, ensuring no more than 10 seconds for waiting if the messages would arrive.

The problem is we exploit this security measure sometimes. For example, I've seen a test where the requirement was to prove that a certain was NOT sent. Who has ever thought about these tests thoroughly must know it's one of the trickiest requirements possible. There isn't a good recommended approach solving all such difficulties of the person writing tests like these.

This card originally came to existence with some bigger aspiration (of setting helpers for asserting on messages that don't happen) but I later realized that a general solution would be probably an overkill, as I knew how hard designing and implementing it could be. (For example, holistically approached and with really a clear supervision over the activity, we would have to have some smart wrappers around the web3 functions, perhaps the recipients only. Which I find quite a big change.).

Goals:
So, the main goal comes with a change that the time-out being easily overlooked for it is so polite and hidden but we actually want the test terminate as a failure. As a matter of fact, we maybe even don't want any services from the SystemKiller, a standard panic firing off after the chosen delay would do fine. I'm convinced that both with or without the special Actor means we need a treatment by panic.

Once the above suggested changes are deployed I suspect running our full test suit won't be possible anymore. I have clues that there should be at least two tests that are going to blow up.

We should then go through these failing tests and try to fix them accordingly, but in a more favorable manner.

If we ran into some which try to prove an unsent message, for now, I'd only recommend to hold a possibly quite fruitful discussion about this phenomenon. Finding an effective solution would give us an avenue that we've desired to have many times but we've ever given up defining for our otherwise already rich mental tool set for writing code in Rust and for Actix.

The text was updated successfully, but these errors were encountered:

bertllll added this to MASQ Node v2 Nov 12, 2024

bertllll converted this from a draft issue Nov 12, 2024

bertllll mentioned this issue Nov 12, 2024

GH-744 - Set BlockchainBridge free from blocking caused by RPC calls #456

Open

bertllll changed the title ~~Test utility for asserting unsent messages~~ Break the habit with acceptance of waiting recorders to time out Nov 13, 2024

bertllll added enhancement New feature or request invalid This doesn't seem right test-tooling labels Nov 13, 2024

kauri-hero moved this from 🆕 New to 🔖 Ready in MASQ Node v2 Dec 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Break the habit with acceptance of waiting recorders to time out #555

Break the habit with acceptance of waiting recorders to time out #555

bertllll commented Nov 12, 2024 •

edited

Loading

Break the habit with acceptance of waiting recorders to time out #555

Break the habit with acceptance of waiting recorders to time out #555

Comments

bertllll commented Nov 12, 2024 • edited Loading

bertllll commented Nov 12, 2024 •

edited

Loading