Test mpi versions #20
Codecov Report
@@           Coverage Diff           @@
##             main      #20   +/-   ##
=======================================
  Coverage   74.29%   74.29%
=======================================
  Files          41       41
  Lines        2587     2587
=======================================
  Hits         1922     1922
  Misses        665      665
2f39e76 to aaa4638
aaa4638 to 75a5004
I am not sure why it is still in fail-fast mode. It would be nice to run all these tests, potentially with xfail on the configurations known to cause problems?
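A minimal sketch of how the xfail idea could be wired up, assuming the failing configurations are the ones reported in this PR (openmpi on ubuntu-18.04 because of #12, mpich on ubuntu-18.04 and ubuntu-20.04 because of #19). The names `KNOWN_FAILURES` and `xfail_reason` are hypothetical and not part of dicodile.

```python
# Configurations reported as failing in this PR, mapped to the related issue.
KNOWN_FAILURES = {
    ("openmpi", "ubuntu-18.04"): "#12",
    ("mpich", "ubuntu-18.04"): "#19",
    ("mpich", "ubuntu-20.04"): "#19",
}


def xfail_reason(mpi, os_name):
    """Return the issue reference if (mpi, os_name) is known to fail, else None."""
    return KNOWN_FAILURES.get((mpi.lower(), os_name.lower()))
```

The returned value could then feed a marker such as `pytest.mark.xfail(reason is not None, reason=str(reason))`, with the MPI implementation and OS passed in from the CI matrix; how those two values reach the test session is left open here.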
They are stopped because of a timeout. The tests with mpich hang at some point due to #19, and the job then waits until the maximum timeout for GitHub Actions is reached. When that time is over, the jobs are cancelled.
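One way to avoid waiting for the maximum job timeout is to give the test command its own, shorter limit. The sketch below is only an illustration using the standard library; `run_with_timeout` is a hypothetical helper, and the command and limit passed to it are up to the caller.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Example: a command that sleeps longer than the allowed time.
    code = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 1)
    print("timed out" if code is None else f"exit code {code}")
```

Note that only the direct child is killed on timeout; processes spawned through MPI by that child may survive and would still need to be cleaned up separately.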
@tomMoral The main problem for tests with mpich is that we need to run the tests with
@tomMoral As far as I understand, this message is due to the Singleton feature not being implemented in mpich; see the mpich issue on GitHub. Details are explained in #19. I think with mpich we need to run the tests with:
Note: Actually, we can use the same command for both mpich and openmpi.
As the hostfile formats for mpich and openmpi are not the same
When we use the above command with:
I think the openmpi version should be able to stop spawned processes properly. That makes me think that the code that stops spawned processes might not be reliable.
@tomMoral I tried using mpich with a very simple MPI program that spawns a number of processes (it gets the hostfile from the environment) to see whether the problem arises from the dicodile code. With openmpi I can run the program as:
If I do the same with mpich, I get the above error; i.e., I think this is really due to Singleton not being implemented in mpich. I propose to change the testing command to
and fix the hanging problem and other possible problems afterwards. WDYT?
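For reference, a spawning test program of the kind described above could look roughly like the following mpi4py sketch. It is not the program used in this thread: the environment variable name `MPI_HOSTFILE`, the helper names, and the trivial worker are assumptions made for illustration, and `add-hostfile` is an Open MPI info key that other implementations may not honor.

```python
import os
import sys

# Worker side (assumed): connect to the parent, then disconnect cleanly.
WORKER_CODE = "from mpi4py import MPI; MPI.Comm.Get_parent().Disconnect()"


def hostfile_from_env(var="MPI_HOSTFILE"):
    """Return the hostfile path from the environment, or None if unset or empty."""
    return os.environ.get(var) or None


def spawn_workers(n_workers):
    """Spawn n_workers Python processes and wait for them to disconnect."""
    from mpi4py import MPI  # imported lazily so the helper above works without MPI

    info = MPI.Info.Create()
    hostfile = hostfile_from_env()
    if hostfile is not None:
        # Open MPI specific key; an assumption for any other implementation.
        info.Set("add-hostfile", hostfile)
    comm = MPI.COMM_SELF.Spawn(
        sys.executable, args=["-c", WORKER_CODE], maxprocs=n_workers, info=info
    )
    comm.Disconnect()
    info.Free()
```

Calling `spawn_workers(2)` from a script started as a plain Python process, without `mpiexec` or `mpirun`, exercises the singleton initialization path discussed in this comment; starting the same script through the MPI launcher avoids it.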
fac864b to cf55093

Runs the tests:
Tests with openmpi on ubuntu-18.04 fail due to #12.
Tests with mpich on both ubuntu-18.04 and ubuntu-20.04 fail due to #19.