View Issue Details Jump to Notes ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0008056CMakeCTestpublic2008-11-07 10:452009-09-14 21:22
ReporterRoscoe A. Bartlett 
Assigned ToBill Hoffman 
PrioritynormalSeveritymajorReproducibilityalways
StatusclosedResolutionfixed 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0008056: ctest -j N does not work correctly with expected failing tests
DescriptionI just checkout, built and installed the most recent version of CMake on 2008/11/07. I tested the -j N option with ctest and it mostly works correctly but it does not work correctly for tests that are expected to fail (i.e. they return non-zero value) but there is a REGEX given to determine success.

Here is one example. I one test defined with a helper macro as:

===========================================================

TRILINOS_ADD_TEST(
  UnitTest_BadUnitTest
  NAME UnitTest_BadUnitTest_end_result_failed
  DIRECTORY UnitTest
  ARGS "--teuchos-suppress-startup-banner"
  NUM_MPI_PROCS 1
  COMM serial mpi
  PASS_REGULAR_EXPRESSION "End Result: TEST FAILED"
  )

===========================================================

In the macro TRILINOS_ADD_TEST(...), the argument PASS_REGULAR_EXPRESSION "End Result: TEST FAILED" is added as test property as you would expect.

Here is the output from the test:

===========================================================

Start testing: Nov 07 07:52 MST
----------------------------------------------------------
Changing directory into: /home/rabartl/PROJECTS/Trilinos.base/BUILDS/CMAKE/GCC-4.1.2/SERIAL_OPT/packages/teuchos/test
7/109 Testing: Teuchos_UnitTest_BadUnitTest_end_result_failed
7/109 Test: Teuchos_UnitTest_BadUnitTest_end_result_failed
Command: "/home/rabartl/PROJECTS/Trilinos.base/BUILDS/CMAKE/GCC-4.1.2/SERIAL_OPT/packages/teuchos/test/UnitTest/UnitTest_BadUnitTest.exe" "--teuchos-suppress-startup-banner"
Directory: /home/rabartl/PROJECTS/Trilinos.base/BUILDS/CMAKE/GCC-4.1.2/SERIAL_OPT/packages/teuchos/test
"Teuchos_UnitTest_BadUnitTest_end_result_failed" start time: Nov 07 07:52 MST
Output:
----------------------------------------------------------

***
*** Unit test suite ...
***


Sorting tests by group name then by test name ...

Running unit tests ...

0. Int_Assignment_UnitTest ... [Passed]
1. Int_BadAssignment_UnitTest ...
 i2 = 5 == i1 = 4 : failed
 [FAILED]
 Location: /home/rabartl/PROJECTS/Trilinos.base/Trilinos/packages/teuchos/test/UnitTest/Failing_UnitTest.cpp:7
 
2. Int_Basic_UnitTest ... [Passed]
3. VectorInt_OutOfRangeAt_UnitTest ...
 
 p=0: *** Caught standard std::exception of type 'std::out_of_range' :
 
  vector::_M_range_check
 [FAILED]
 Location: /home/rabartl/PROJECTS/Trilinos.base/Trilinos/packages/teuchos/test/UnitTest/Failing_UnitTest.cpp:15
 
4. someFunc_ComplexDouble_test1_UnitTest ... [Passed]
5. someFunc_ComplexFloat_test1_UnitTest ... [Passed]
6. someFunc_double_test1_UnitTest ... [Passed]
7. someFunc_float_test1_UnitTest ... [Passed]

Summary: total = 8, run = 8, passed = 6, failed = 2

End Result: TEST FAILED
----------------------------------------------------------
"Teuchos_UnitTest_BadUnitTest_end_result_failed" end time: Nov 07 07:52 MST
"Teuchos_UnitTest_BadUnitTest_end_result_failed" time elapsed: 00:00:00
----------------------------------------------------------

End testing: Nov 07 07:52 MST

===========================================================

Note that the above test gives the correct string, but 'ctest -j 8' returned that it failed.

Another problem with 'ctest -j N' is that it does not respect the '-W' option to widen the output as shown by:

===========================================================

[rabartl@godel SERIAL_OPT]$ ctest -W 100 -j 8 -E TrilinosDep -R Teuchos_UnitTest
Start processing tests
Test project /home/rabartl/PROJECTS/Trilinos.base/BUILDS/CMAKE/GCC-4.1.2/SERIAL_OPT
20: 20/120 Testing Teuchos_UnitTest_BadUnitTest_I***Failed
21: 21/120 Testing Teuchos_UnitTest_BadUnitTest_I***Failed
19: 19/120 Testing Teuchos_UnitTest_BadUnitTest_e***Failed
16: 16/120 Testing Teuchos_UnitTest_UnitTests ... Passed
18: 18/120 Testing Teuchos_UnitTest_BadUnitTest_e***Failed
17: 17/120 Testing Teuchos_UnitTest_BadUnitTest_f***Failed

17% tests passed, 5 tests failed out of 6

===========================================================

The simultaneous serial testing feature is an important feature for Trilinos in order to speed up pre-checking testing, to give faster continuous integration feedback, and to better utilize sparse testing machines.




TagsNo tags attached.
Attached Files

 Relationships

  Notes
(0014063)
Roscoe A. Bartlett (reporter)
2008-11-07 17:54

The issue of not respecting the -W option and having different behavior with respect to pass/fail tests would seem to suggest that CTest has some duplication issues. Does CTest need to be refactored to remove some duplication (that is not being consistently maintained as is the problem with all duplication)?
(0014404)
Bill Hoffman (manager)
2008-12-18 22:03

I have fixed the pass/fail regex stuff. I have not yet fixed the -W stuff. It is not so much a duplication issue. The -j stuff runs a ctest process for each test. To do this efficiently, it parses all the tests, then creates a temporary cache file to store the test information so that the sub-ctests can load it quick. It was not saving all of the test information, it now is. (also, I put a comment in place to make sure it stays up-to-date.) For the -W stuff, ctest should pass the -W to the sub ctest and all should be good.
(0014437)
Roscoe A. Bartlett (reporter)
2008-12-24 14:03

Bill,

I have updated my CVS copy of CMake and 'ctest -jN' does now seem to
run all of the tests correctly. However, the parallel performance
does not look very good. Here is a set of experiments that I ran:

Here, I want to look at the runtime savings that running with -jN on
godel for different values of N (1 to 8).

1) Running serial debug tests without -jN:

    $ time ctest -W 100
    
    ...
    
    
    real 3m48.493s
    user 3m42.992s
    sys 0m8.495s

2) Running serial debug tests with -j2:

    $ time ctest -W 100 -j2
    
    ...
    
    100% tests passed, 0 tests failed out of 248
    
    real 2m25.877s
    user 3m50.630s
    sys 0m34.104s

3) Running serial debug tests with -j4:

    $ time ctest -W 100 -j4
    
    ...
    
    100% tests passed, 0 tests failed out of 248
    
    real 1m47.034s
    user 3m51.593s
    sys 0m41.900s

4) Running serial debug tests with -j8:

    $ time ctest -W 100 -j8
    
    ...
    
    100% tests passed, 0 tests failed out of 248
    
    real 1m40.348s
    user 3m52.744s
    sys 0m57.647s

These are not good speedups by even parallel MPI standards. I suspect
that the algorithm used to implement parallel runs is not very good or
something.

Here is the table of times and speedups for using 'ctest -jN' with the
entire serial debug test suite of 248 tests:


    # Processors (-jN) Wallclock time (sec) Speedup Efficiency
    1 228 - 1.00
    2 145 1.57 0.79
    4 107 2.13 0.53
    8 100 2.28 0.29
 
This is terrible parallel efficiency, even by MPI standards. Moving
from 1 to 2 processors is worthwile but moving to 4 processers is not
and moving to 8 is a waste of processors.

I can only assume that CTest is using a bad algorithm for scheduling
the jobs.

How does parallel compiling with -jN with GNU Make compare on the same
machine? Let's run an experiment of just compiling the Teuchos
library in parallel. There are 60 *.o files get get built by this.

Here are the raw timing experiments:

1) Compiling Teuchos_libs with -j1:

    $ make clean
    $ time make Teuchos_libs -j1
    
    ...
    
    real 0m37.736s
    user 0m29.996s
    sys 0m5.852s

2) Compiling Teuchos_libs with -j2:

    $ make clean
    $ time make Teuchos_libs -j2
    
    ...
    
    real 0m18.684s
    user 0m30.148s
    sys 0m5.917s

3) Compiling Teuchos_libs with -j4:

    $ make clean
    $ time make Teuchos_libs -j4
    
    ...
    
    real 0m9.977s
    user 0m30.417s
    sys 0m6.146s

4) Compiling Teuchos_libs with -j8:

    $ make clean
    $ time make Teuchos_libs -j8
    
    ...
    
    
    real 0m5.534s
    user 0m30.774s
    sys 0m6.688s

Here is the table of times and speedups for using 'make Teuchos_libs
-jN':

    # Processors (-jN) Wallclock time (sec) Speedup Efficiency
    1 37.7 - 1.00
    2 18.7 2.02 1.01
    4 10.0 3.77 0.94
    8 5.5 6.82 0.85
 
As you can see, we are getting almost perfect speedup and parallel
efficiency on four processors and very good speedup on 8 processors.
This shows that GNU Make has a reasonably algorithm for controlling
the parallel builds.

What this shows to me is that CTest is not using a very good parallel
algorithm for running the tests.

What algorithm is CTest using the schedule the jobs and what can we do
to improve this?
(0014438)
Roscoe A. Bartlett (reporter)
2008-12-24 14:08

Also, I see that -jN and -T memcheck do not work together. Since memory testing takes a lot more time than just testing, we would save a lot of our testing resources if we could get good parallel scalability in running memory tests.
(0014439)
Bill Hoffman (manager)
2008-12-24 15:35

I will have to take a look. The algorithm is pretty simple. I can think of a few things that might be the problem...

1. I run a new ctest process for each job, that might be adding extra overhead.
2. I am using process execution and no threads so I have to keep checking for output or the end of the process in a loop. When a process ends, it starts a new one.

If you want to have a look the code is all in CTest/cmCTestMultiProcessHandler.cxx and CTest/cmProcess.cxx

This code is pretty new. I don't even think it will work with dashboards yet. Right now it runs the tests and that is about it.
(0014440)
Bill Hoffman (manager)
2008-12-24 15:46

You might want to try changing this parameter:

  int pipe = p->CheckOutput(.1, out, err);

That is the timeout for each check for output on the running tests.
(0014452)
Bill Hoffman (manager)
2008-12-29 18:11

I am having a hard time reproducing this. I created a test in CMake, Tests/JCTest. It does something like this:
foreach(f 1 2 3 4 5 6 7 8 9 10 11 12 12 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29 30)
  add_test(TestTime${f} TestTime 50000000)
endforeach(f)

#include <stdio.h>
#include <stdlib.h>

TestTime.cxx
int main(int ac, char** av)
{
  float d = 10.0;
  for(int i =0; i < atoi(av[1]); i++)
    {
    d *= .2;
    }
  printf("%f", d);
}


If I run this on an 8 core machine, I get these results:
# time TestTime 50000000 0m3.411s (running just the test program)
# no -j 1m38.422
# -j1 1m40.538s
# -j2 0m53.607s
# -j4 0m28.833s
# -j6 0m20.268s
# -j8 0m14.677s

It seems to be scaling very well.
3.4 * 29
98.6
98.6 / 8
12
Only 2 seconds slower than optimal.

Can you try running my test on your machine?

Just checkout a recent CVS CMAke, then do this:
cd Tests/JCTest
mkdir b
cd b
cmake ..
ctest -j 8
  

You may have to modify CMakeLists.txt number of iterations passed into TestTime based on your machine speed. I did not want to use a sleep, I wanted to make sure the machine was really busy.
(0014886)
Bill Hoffman (manager)
2009-02-10 14:24

-W is now working with -j
(0014921)
Roscoe A. Bartlett (reporter)
2009-02-13 00:52

Sorry I did not respond sooner but these got sent to my Junk Mail folder and I just happened to see them :-)

I will try this out again the next time I update my CVS version.
(0017514)
Bill Hoffman (manager)
2009-09-14 21:22

This should be fixed now, and all the code is the same for j1 and jN so it should not happen again...

 Issue History
Date Modified Username Field Change
2008-11-07 10:45 Roscoe A. Bartlett New Issue
2008-11-07 17:54 Roscoe A. Bartlett Note Added: 0014063
2008-12-16 16:59 Bill Hoffman Status new => assigned
2008-12-16 16:59 Bill Hoffman Assigned To => Bill Hoffman
2008-12-18 22:03 Bill Hoffman Note Added: 0014404
2008-12-24 14:03 Roscoe A. Bartlett Note Added: 0014437
2008-12-24 14:08 Roscoe A. Bartlett Note Added: 0014438
2008-12-24 15:35 Bill Hoffman Note Added: 0014439
2008-12-24 15:46 Bill Hoffman Note Added: 0014440
2008-12-29 18:11 Bill Hoffman Note Added: 0014452
2009-02-10 14:24 Bill Hoffman Note Added: 0014886
2009-02-13 00:52 Roscoe A. Bartlett Note Added: 0014921
2009-09-14 21:22 Bill Hoffman Note Added: 0017514
2009-09-14 21:22 Bill Hoffman Status assigned => closed
2009-09-14 21:22 Bill Hoffman Resolution open => fixed


Copyright © 2000 - 2018 MantisBT Team