Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Break the habit with acceptance of waiting recorders to time out #555

Open
bertllll opened this issue Nov 12, 2024 · 0 comments
Open

Break the habit with acceptance of waiting recorders to time out #555

bertllll opened this issue Nov 12, 2024 · 0 comments
Labels
enhancement New feature or request invalid This doesn't seem right test-tooling

Comments

@bertllll
Copy link
Contributor

bertllll commented Nov 12, 2024

I have a reasoned assumption that there are tests in the codebase which make use of time-outs that originally were intended just for the state of emergency, when beyond this point the test would be considered hanging.
These timeouts were implemented through the special auxiliary Actor called SystemKiller. This concept is probably used at also some other places, and it should be dealt with across the board, but here I'm talking mainly about the specific case when a test is set to be ended by so called StopConditions.

As you might know, the Actor system never stops unless it's asked to. It had caused a lot of issues in writing good tests because only System::current().stop() wouldn't serve us always, as some part of the planned actions in the Act mightn't have been carried out fully. Therefore the test wasn't reliable because of races of threads. The StopConditions allow specifying a message or a set of them for a recorder impersonating one of our normal Actors, these are then the conditions of stopping the system as soon as the recorder receives all the conditioned messages.

Because there is always a chance that the recorder doesn't receive them as the tested code might work differently than the developer thought it would - which points probably to a design issue - we put in also the SystemKiller, ensuring no more than 10 seconds for waiting if the messages would arrive.

The problem is we exploit this security measure sometimes. For example, I've seen a test where the requirement was to prove that a certain was NOT sent. Who has ever thought about these tests thoroughly must know it's one of the trickiest requirements possible. There isn't a good recommended approach solving all such difficulties of the person writing tests like these.

This card originally came to existence with some bigger aspiration (of setting helpers for asserting on messages that don't happen) but I later realized that a general solution would be probably an overkill, as I knew how hard designing and implementing it could be. (For example, holistically approached and with really a clear supervision over the activity, we would have to have some smart wrappers around the web3 functions, perhaps the recipients only. Which I find quite a big change.).

Goals:
So, the main goal comes with a change that the time-out being easily overlooked for it is so polite and hidden but we actually want the test terminate as a failure. As a matter of fact, we maybe even don't want any services from the SystemKiller, a standard panic firing off after the chosen delay would do fine. I'm convinced that both with or without the special Actor means we need a treatment by panic.

Once the above suggested changes are deployed I suspect running our full test suit won't be possible anymore. I have clues that there should be at least two tests that are going to blow up.

We should then go through these failing tests and try to fix them accordingly, but in a more favorable manner.

If we ran into some which try to prove an unsent message, for now, I'd only recommend to hold a possibly quite fruitful discussion about this phenomenon. Finding an effective solution would give us an avenue that we've desired to have many times but we've ever given up defining for our otherwise already rich mental tool set for writing code in Rust and for Actix.

@bertllll bertllll converted this from a draft issue Nov 12, 2024
@bertllll bertllll changed the title Test utility for asserting unsent messages Break the habit with acceptance of waiting recorders to time out Nov 13, 2024
@bertllll bertllll added enhancement New feature or request invalid This doesn't seem right test-tooling labels Nov 13, 2024
@kauri-hero kauri-hero moved this from 🆕 New to 🔖 Ready in MASQ Node v2 Dec 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request invalid This doesn't seem right test-tooling
Projects
Status: 🔖 Ready
Development

No branches or pull requests

1 participant