Concurrency is hard to test: you usually have to run tests over a 5 week period, and when you do find an issue it is hard to reproduce.
With lots of threads, the program's state and the thread interleaving change from run to run, so a concurrency issue becomes hard to reproduce. This is hope-based testing: we run on n machines and hope that we find a concurrency bug. It is costly, both in running all the processes on multiple machines and in the resources needed to manage the logistics of running all the tests. More importantly, if you do find a concurrency issue, can you reproduce the bug?
So what we need is control over the interleaving of the program. We need to be able to run the test again and again in a loop, record the interleaving each time, and, if we find a bug, repeat that test with the same interleaving to reproduce it.
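As a rough illustration of the record-and-replay idea, here is a minimal Python sketch. It is a toy model, not how any real tool works: the "threads" are generators, each `yield` stands in for a point where a scheduler could switch tasks, and the names `increment`, `run` and `find_failing_schedule` are made up for this example.

```python
import random


def increment(shared):
    # Non-atomic read-modify-write. The `yield` marks the point where
    # the scheduler may switch to another task (hypothetical model).
    tmp = shared["counter"]
    yield
    shared["counter"] = tmp + 1


def run(schedule=None, seed=None, n_tasks=2):
    """Run the tasks under a controlled interleaving.

    If `schedule` is given, replay it step by step. Otherwise choose
    the next task at random (driven by `seed`) and record each choice.
    Returns (final_counter, recorded_schedule).
    """
    shared = {"counter": 0}
    tasks = {tid: increment(shared) for tid in range(n_tasks)}
    rng = random.Random(seed)
    replay = iter(schedule) if schedule is not None else None
    recorded = []
    while tasks:
        tid = next(replay) if replay is not None else rng.choice(sorted(tasks))
        recorded.append(tid)
        try:
            next(tasks[tid])      # let the chosen task run one step
        except StopIteration:
            del tasks[tid]        # the task has finished
    return shared["counter"], recorded


def find_failing_schedule(max_runs=1000):
    # Run again and again in a loop, keeping the interleaving of the
    # first run that loses an update.
    for seed in range(max_runs):
        result, schedule = run(seed=seed)
        if result != 2:
            return schedule
    return None
```

Calling `run(schedule=find_failing_schedule())` replays the recorded interleaving, so the lost update shows up on every run instead of only occasionally.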
CHESS is a tool for systematic and disciplined concurrency testing. Given a concurrent test, CHESS systematically enumerates the possible thread schedules to find hard-to-find concurrency errors, including assertion violations, deadlocks, data-races, and atomicity violations.
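To make "systematically enumerates the possible thread schedules" concrete, here is a small Python sketch of the idea. It is not CHESS's implementation or API, only a toy under stated assumptions: tasks are generators, each task takes a fixed number of steps, and `worker`, `all_schedules`, `run_schedule` and `explore` are hypothetical names.

```python
STEPS_PER_TASK = 2  # one step to read, one step to write


def worker(shared):
    # Non-atomic increment; the `yield` is the only switch point.
    tmp = shared["counter"]
    yield
    shared["counter"] = tmp + 1


def all_schedules(n_tasks, steps):
    """Yield every distinct interleaving as a list of task ids."""
    def extend(prefix, remaining):
        if not any(remaining):
            yield list(prefix)
            return
        for tid, left in enumerate(remaining):
            if left:
                remaining[tid] -= 1
                prefix.append(tid)
                yield from extend(prefix, remaining)
                prefix.pop()
                remaining[tid] += 1
    yield from extend([], [steps] * n_tasks)


def run_schedule(schedule, n_tasks=2):
    # Execute one fixed interleaving and return the final counter.
    shared = {"counter": 0}
    tasks = [worker(shared) for _ in range(n_tasks)]
    for tid in schedule:
        next(tasks[tid], None)  # default swallows the end of the generator
    return shared["counter"]


def explore(n_tasks=2):
    """Try every schedule; collect those that violate the assertion."""
    total, failures = 0, []
    for schedule in all_schedules(n_tasks, STEPS_PER_TASK):
        total += 1
        if run_schedule(schedule, n_tasks) != n_tasks:
            failures.append(schedule)
    return total, failures
```

For two tasks, `explore()` covers all 6 interleavings and reports the 4 in which both reads happen before either write. The number of schedules grows very quickly with more tasks and steps, so exhaustive enumeration like this only works for small tests.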
Tools And Techniques To Identify Concurrency Issues