Chapter 1
Introduction

 1.1 Listing of CAS systems tested
 1.2 Results
 1.3 Performance per integrand type
 1.4 Maximum leaf size ratio for each CAS against the optimal result
 1.5 Pass/Fail per test file for each CAS system
 1.6 Timing
 1.7 Verification
 1.8 Important notes about some of the results
 1.9 Design of the test system

This report gives the results of running the computer algebra independent integration test suite.

The current number of problems in this test suite is 3809.

1.1 Listing of CAS systems tested

The following are the CAS systems tested:

  1. Mathematica 13.0.1 (February 17, 2022) on Windows 10.
  2. Rubi 4.16.1 (Dec 19, 2018) on Mathematica 13.0.1 on Windows 10.
  3. Maple 2022.1 (June 1, 2022) on Windows 10.
  4. Maxima 5.46 (April 13, 2022) using Lisp SBCL 2.0.1.debian on Ubuntu 20.04 Linux under the Windows 10 WSL 2.0 subsystem, via sagemath 9.6.
  5. Fricas 1.3.7 (June 30, 2021) based on ecl 21.2.1 on Ubuntu 20.04 Linux under the Windows 10 WSL 2.0 subsystem, via sagemath 9.6.
  6. Giac/Xcas 1.9.0-7 (April 2022) on Ubuntu 20.04 Linux under the Windows 10 WSL 2.0 subsystem. Direct testing using the C++ API.
  7. Sympy 1.10.1 (March 20, 2022) using Python 3.10.4 on Ubuntu 20.04 Linux under the Windows 10 WSL 2.0 subsystem, via sagemath 9.6.
  8. Mupad using Matlab 2021a with Symbolic Math Toolbox Version 8.7 on Windows 10.
  9. Mathics 4.0 via sagemath 9.6.

Maxima, Fricas and Mathics were called through Sagemath. This was done using the Sagemath integrate command, changing the name of the algorithm to select the CAS system. Mathics was called using its own interface in Sagemath, as in this example:

from sage.interfaces.mathics import mathics 
res = mathics('Integrate[Sin[x]/(3 + Cos[x])^2,x]')
 

Sympy was called directly from Python. Giac was also called directly via its C++ interface.

1.2 Results

Important note: A number of problems in this test suite have no antiderivative in closed form. This means the antiderivative of these integrals cannot be expressed in terms of elementary functions, special functions or Hypergeometric2F1 functions. RootSum and RootOf are not allowed.

If a CAS returns such an integral unevaluated within the time limit, then the result is counted as passed and assigned an A grade.

However, if the CAS times out, then it is assigned an F grade even if the integral is not integrable, as this implies the CAS could not determine within the time limit that the integral is not integrable.

If a CAS returns an antiderivative for such an integral, it is assigned an A grade automatically. This special result is listed in the introduction section of each individual test report so that it is easy to identify, as it can be an important result to investigate.

The results given in the table below reflect the above.

System        % solved (count)    % failed (count)
Mathematica   99.61 (3794)        0.39 (15)
Rubi          99.50 (3790)        0.50 (19)
Fricas        89.16 (3396)        10.84 (413)
Maple         87.63 (3338)        12.37 (471)
Giac          79.42 (3025)        20.58 (784)
Maxima        75.58 (2879)        24.42 (930)
Mupad         73.33 (2793)        26.67 (1016)
Sympy         67.68 (2578)        32.32 (1231)
Mathics       67.08 (2555)        32.92 (1254)
Table 1.1: Percentage solved for each CAS

The table below gives an additional breakdown of the quality of the antiderivatives generated by each CAS. Grades are given using the letters A, B, C and F, with A being the best quality. Grading is done by comparing the generated antiderivative with the optimal antiderivative included in the test suite. The following table describes the meaning of these grades.

Grade   Description

A       Integral was solved and the antiderivative is optimal in quality and leaf size.

B       Integral was solved and the antiderivative is optimal in quality, but its leaf size is larger than twice the leaf size of the optimal antiderivative.

C       Integral was solved and the antiderivative is non-optimal in quality. This can be due to one or more of the following reasons:

          1. The antiderivative contains a hypergeometric function and the optimal antiderivative does not.
          2. The antiderivative contains a special function and the optimal antiderivative does not.
          3. The antiderivative contains the imaginary unit and the optimal antiderivative does not.

F       Integral was not solved. Either the integral was returned unevaluated within the time limit, or it timed out, or the CAS hung, crashed or raised an exception.

Table 1.2: Description of grading applied to integration result
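The grading rules in Table 1.2 can be summarized in a short sketch. The function name grade_result and its boolean flags are hypothetical and are not taken from the test system. The factor of two comes from the description of grade B, and the three flags mirror the reasons listed for grade C. The special handling of integrals with no closed-form antiderivative is not covered here.

```python
def grade_result(solved, leaf_size, optimal_leaf_size,
                 extra_hypergeometric=False,
                 extra_special_function=False,
                 extra_imaginary_unit=False):
    """Return 'A', 'B', 'C' or 'F' following the rules of Table 1.2.

    All names are illustrative assumptions, not the report's code.
    Each extra_* flag means: the result contains that feature and the
    optimal antiderivative does not.
    """
    if not solved:
        return "F"   # unevaluated, timed out, hung, crashed or raised
    if extra_hypergeometric or extra_special_function or extra_imaginary_unit:
        return "C"   # solved, but non-optimal in quality
    if leaf_size > 2 * optimal_leaf_size:
        return "B"   # optimal quality, more than twice the optimal size
    return "A"       # optimal in quality and leaf size


print(grade_result(True, 40, 30))    # A
print(grade_result(True, 100, 30))   # B
```

Quality is checked before size in this sketch because grade B requires the result to be optimal in quality.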

Grading is implemented for all CAS systems in this version except Mupad, where a grade of B is automatically assigned as a placeholder to all integrals it completes on time.

The following table summarizes the grading results.

System        % A grade   % B grade   % C grade   % F grade
Rubi          98.84       0.45        0.21        0.50
Mathematica   86.95       5.22        7.43        0.39
Maple         73.35       9.43        4.86        12.37
Fricas        67.60       21.37       0.18        10.84
Maxima        65.00       10.19       0.39        24.42
Giac          63.64       15.02       0.76        20.58
Sympy         51.17       9.95        6.56        32.32
Mathics       42.66       6.09        18.33       32.92
Mupad         0.11        73.22       0.00        26.67
Table 1.3: Antiderivative Grade distribution for each CAS

The following is a bar chart illustration of the data in the above table.

[Figure: bar chart of the grade distribution for each CAS]

The figure below compares the CAS systems for each grade level.

[Figure: comparison of the CAS systems at each grade level]

1.2.1 Time and leaf size performance

The table below summarizes the performance of each CAS system in terms of time used and leaf size of results.

System        Mean time (sec)   Mean size   Normalized mean   Median size   Normalized median
Rubi          0.05              78.61       1.2               42            1
Giac          0.13              129.69      1.95              43            1.2
Maple         0.2               557.56      6.4               35            0.93
Mupad         0.25              76.16       1.35              28            0.87
Maxima        0.3               70.76       1.32              32            0.9
Fricas        0.47              145.22      1.78              41            1.13
Mathematica   0.85              81.31       1.2               39            1
Sympy         3.48              156.92      2.82              37            1.07
Mathics       7.02              110.26      2.04              33            1
Table 1.4: Time and leaf size performance for each CAS
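The report does not define the normalized columns of Table 1.4. One plausible reading, assumed here, is that each leaf size is divided by the leaf size of the optimal antiderivative before the mean or median is taken. The sketch below follows that assumption; the function name leaf_size_summary and the sample numbers are illustrative only.

```python
from statistics import mean, median


def leaf_size_summary(cas_sizes, optimal_sizes):
    """Summarize leaf sizes for one CAS.

    Assumption: 'normalized' means the per-problem ratio
    (CAS leaf size) / (optimal leaf size).
    """
    ratios = [c / o for c, o in zip(cas_sizes, optimal_sizes)]
    return {
        "mean_size": mean(cas_sizes),
        "median_size": median(cas_sizes),
        "normalized_mean": mean(ratios),
        "normalized_median": median(ratios),
    }


# Made-up leaf sizes for three solved problems.
summary = leaf_size_summary([10, 40, 90], [10, 20, 30])
print(summary["normalized_mean"], summary["normalized_median"])
```

Under this reading, a normalized value of 1 means the result is the same size as the optimal antiderivative, and a value below 1 means it is smaller.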

1.3 Performance per integrand type

The following are the different integrand types the test suite contains.

  1. Independent tests.
  2. Algebraic Binomial problems (products involving powers of binomials and monomials).
  3. Algebraic Trinomial problems (products involving powers of trinomials, binomials and monomials).
  4. Miscellaneous Algebraic functions.
  5. Exponentials.
  6. Logarithms.
  7. Trigonometric.
  8. Inverse Trigonometric.
  9. Hyperbolic functions.
  10. Inverse Hyperbolic functions.
  11. Special functions.
  12. Sam Blake input file.
  13. Waldek Hebisch input file.

The following table gives the percentage solved by each CAS per integrand type.

Integrand type           Problems  Rubi     Mathematica  Maple    Maxima   Fricas   Sympy    Giac     Mupad    Mathics
Independent tests        1892      99.00    99.21        93.71    81.98    94.77    72.89    86.63    82.03    71.78
Algebraic Binomial       1917      100.00   100.00       81.64    69.27    83.62    62.55    72.30    64.74    62.44
Algebraic Trinomial      0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Algebraic Miscellaneous  0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Exponentials             0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Logarithms               0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Trigonometric            0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Inverse Trigonometric    0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Hyperbolic               0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Inverse Hyperbolic       0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Special functions        0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Sam Blake file           0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Waldek Hebisch file      0         0.00     0.00         0.00     0.00     0.00     0.00     0.00     0.00     0.00
Table 1.5: Percentage solved per integrand type

In addition to the above table, a 3D chart is made for each type of integrand listed above, showing how each CAS performed on that specific integrand type.

These charts and the table above can be used to show the relative strengths and weaknesses of each CAS in the area of integration.

[Figures: 3D charts of CAS performance per integrand type]

1.4 Maximum leaf size ratio for each CAS against the optimal result

The following table gives the largest ratio found in each test file, between each CAS antiderivative and the optimal antiderivative.

For each test input file, the problem with the largest ratio \(\frac {\text {CAS leaf size}}{\text {Optimal leaf size}}\) is recorded with the corresponding problem number.

In each column in the table below, the first number is the maximum leaf size ratio, and the number that follows inside the parentheses is the problem number in that specific file where this maximum ratio was found. This ratio is determined only when the CAS solved the problem and an optimal antiderivative is known.

If a CAS was unable to solve any of the integrals in the input test file, or if the leaf size could not be obtained for any of its results in the file, then a zero is used for the ratio and -1 is used for the problem number.

This makes it easy to locate the problem. In the future, a direct link will be added as well.

#   Rubi            Mathematica     Maple           Maxima          FriCAS          Sympy           Giac            Mupad           Mathics
1   1.0 (1)         3.9 (50)        16.9 (114)      3.8 (169)       4.0 (45)        7.5 (169)       4.3 (45)        42.4 (169)      8.9 (156)
2   7.3 (21)        5.0 (20)        3.6 (17)        1.9 (4)         14.3 (13)       16.8 (5)        6.5 (2)         3.3 (26)        2.4 (19)
3   1.0 (1)         1.1 (14)        17.0 (6)        11.1 (7)        2.0 (8)         1.9 (5)         2.5 (3)         11.3 (5)        1.4 (5)
4   6.4 (5)         14.3 (13)       14.7 (46)       16.6 (43)       5.5 (43)        4.8 (40)        7.1 (8)         6.9 (4)         3.2 (16)
5   1.0 (1)         54.7 (278)      12737.8 (278)   8.1 (280)       7.7 (280)       39.8 (123)      26.9 (141)      14.1 (204)      17.9 (175)
6   1.0 (1)         1.4 (3)         2.2 (4)         1.9 (1)         1.4 (7)         0.8 (4)         2.2 (5)         1.3 (3)         1.3 (6)
7   2.2 (3)         5.6 (7)         1.8 (3)         2.8 (3)         6.7 (9)         45.4 (9)        2.6 (3)         1.7 (3)         1.7 (7)
8   1.6 (50)        5.3 (31)        5.1 (40)        6.5 (11)        5.0 (42)        26.4 (71)       7.0 (40)        22.5 (70)       7.3 (56)
9   1.2 (365)       7.2 (80)        3.7 (296)       12.1 (328)      4.2 (341)       8.1 (75)        16.9 (328)      6.0 (9)         13.1 (29)
10  3.2 (335)       93.0 (452)      3343.5 (327)    36.9 (399)      32.1 (595)      76.3 (215)      24.5 (537)      12.8 (253)      41.9 (586)
11  529.0 (82)      127.0 (82)      317.0 (82)      2.7 (2)         70.0 (82)       41.3 (17)       49.1 (50)       207.0 (82)      15.8 (19)
12  1.8 (6)         2.3 (4)         1.2 (8)         1.5 (2)         3.3 (3)         3.4 (3)         2.1 (2)         0.9 (8)         5.4 (3)
13  7.1 (369)       23.8 (1323)     30.9 (1323)     32.9 (1323)     32.9 (1323)     136.1 (671)     49.1 (1444)     38.1 (1323)     61.3 (671)
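The rule used to fill each cell of the table above can be sketched as follows. This is a minimal illustration, not the code used by the test system: the function name max_leaf_ratio and the tuple layout of its input are assumptions. The (0, -1) sentinel follows the convention described before the table.

```python
def max_leaf_ratio(results):
    """Return (largest ratio, problem number) for one CAS on one test file.

    results: list of (problem_number, cas_leaf_size, optimal_leaf_size).
    cas_leaf_size is None when the CAS failed or no leaf size is available;
    optimal_leaf_size is None when no optimal antiderivative is known.
    """
    best_ratio, best_problem = 0.0, -1   # sentinel when nothing qualifies
    for problem, cas_size, optimal_size in results:
        if cas_size is None or not optimal_size:
            continue                      # skip problems that do not qualify
        ratio = cas_size / optimal_size
        if ratio > best_ratio:
            best_ratio, best_problem = ratio, problem
    return round(best_ratio, 1), best_problem


# Made-up data: problem 2 failed, problem 3 has the largest ratio.
print(max_leaf_ratio([(1, 30, 10), (2, None, 12), (3, 90, 20)]))  # (4.5, 3)
print(max_leaf_ratio([(1, None, 10)]))                            # (0.0, -1)
```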

1.5 Pass/Fail per test file for each CAS system

The following table gives the number of passed integrals and the number of failed integrals per test number. There are 13 tests. Each test corresponds to one input file.

#   Rubi        Mathematica Maple       Maxima      FriCAS      Sympy       Giac        Mupad       Mathics
    Pass  Fail  Pass  Fail  Pass  Fail  Pass  Fail  Pass  Fail  Pass  Fail  Pass  Fail  Pass  Fail  Pass  Fail
1   175   0     175   0     173   2     166   9     172   3     158   17    170   5     169   6     156   19
2   33    2     35    0     28    7     15    20    24    11    7     28    17    18    9     26    6     29
3   13    1     14    0     12    2     8     6     12    2     9     5     10    4     11    3     10    4
4   48    2     50    0     33    17    24    26    48    2     19    31    42    8     12    38    15    35
5   279   5     284   0     282   2     252   32    281   3     253   31    268   16    270   14    248   36
6   3     4     7     0     5     2     3     4     7     0     5     2     5     2     7     0     5     2
7   7     2     9     0     9     0     7     2     9     0     5     4     9     0     9     0     3     6
8   113   0     113   0     113   0     111   2     112   1     105   8     111   2     106   7     104   9
9   376   0     376   0     376   0     374   2     376   0     345   31    375   1     372   4     349   27
10  705   0     705   0     655   50    564   141   653   52    436   269   590   115   542   163   426   279
11  113   3     101   15    79    37    20    96    91    25    29    87    34    82    37    79    28    88
12  8     0     8     0     8     0     7     1     8     0     8     0     8     0     8     0     8     0
13  1917  0     1917  0     1565  352   1328  589   1603  314   1199  718   1386  531   1241  676   1197  720
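As a consistency check, the per-file counts can be summed to recover the overall totals reported in Table 1.1. The sketch below does this for the Rubi column, with the (pass, fail) pairs copied from the table above. The helper name totals is illustrative.

```python
# (pass, fail) per test file for Rubi, copied from the table above.
rubi = [(175, 0), (33, 2), (13, 1), (48, 2), (279, 5), (3, 4), (7, 2),
        (113, 0), (376, 0), (705, 0), (113, 3), (8, 0), (1917, 0)]


def totals(per_file):
    """Sum pass and fail counts and compute the percentage solved."""
    passed = sum(p for p, _ in per_file)
    failed = sum(f for _, f in per_file)
    percent = round(100 * passed / (passed + failed), 2)
    return passed, failed, percent


print(totals(rubi))  # (3790, 19, 99.5)
```

The result agrees with the Rubi row of Table 1.1, and the pass and fail counts add up to the 3809 problems in the suite.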

1.6 Timing

The command AbsoluteTiming[] was used in Mathematica to obtain the elapsed time for each integrate call. In Maple, the command Usage was used as in the following example

cpu_time := Usage(assign ('result_of_int',int(expr,x)),output='realtime')

For all other CAS systems, the elapsed time for each integral was obtained as the difference between the time just after the call completed and the time just before the call was made. This was done using Python’s time.time() call.
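The timing approach described above can be sketched as a small wrapper. Only time.time() is taken from the report; the wrapper name timed_call is hypothetical, and the built-in sum is used here as a stand-in for a CAS integrate call.

```python
import time


def timed_call(func, *args):
    """Call func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.time()              # time just before the call
    result = func(*args)
    elapsed = time.time() - start    # time just after, minus time before
    return result, elapsed


# Stand-in workload; a real run would pass the CAS integrate function.
result, elapsed = timed_call(sum, range(1000))
print(result, elapsed)
```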

All elapsed times shown are in seconds. A time limit of 3 CPU minutes was used for each integral. If the integrate command did not complete within this time limit, the integral was aborted and considered to have failed and assigned an F grade. The time used by failed integrals due to time out was not counted in the final statistics.
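The exclusion of timed-out integrals from the timing statistics can be illustrated as below. This is a sketch under assumptions: the record layout (elapsed seconds, timed-out flag) and the name mean_time are invented for the example, and the sample values are made up.

```python
TIME_LIMIT = 180.0  # 3 minutes, expressed in seconds


def mean_time(records):
    """Mean elapsed time over integrals that did not time out.

    records: list of (elapsed_seconds, timed_out) pairs.
    """
    used = [t for t, timed_out in records if not timed_out]
    return sum(used) / len(used) if used else 0.0


# The third integral hit the time limit and is left out of the mean.
records = [(0.25, False), (0.75, False), (TIME_LIMIT, True)]
print(mean_time(records))  # 0.5
```

Leaving out the timed-out entries keeps a few 180 second failures from dominating the mean time of a CAS.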

1.7 Verification

A verification phase was applied to the result of integration for Rubi, Mathematica and Mathics. A future version of this report will implement verification for the other CAS systems. For the integrals whose result was not run through a verification phase, it is assumed that the antiderivative was correct.

The verification phase also had a 3 minute timeout. An integral whose result was not verified could still be correct, but further investigation is needed on those integrals. These integrals were marked in the summary table below and also in each integral's separate section so they are easy to identify and locate.
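The report does not spell out how verification is carried out. The usual idea is to differentiate the returned antiderivative and compare the result with the integrand. The sketch below is a purely numeric spot-check of that idea, offered as a stand-in and not as the method used in the report. The function name, step size and tolerance are assumptions. The integrand is the one from the Mathics example in section 1.1.

```python
import math


def numerically_verified(integrand, antiderivative, points, h=1e-5, tol=1e-6):
    """Check d/dx antiderivative == integrand at sample points.

    Uses a central finite difference, so this is only a spot-check
    and not a symbolic proof of correctness.
    """
    for x in points:
        derivative = (antiderivative(x + h) - antiderivative(x - h)) / (2 * h)
        expected = integrand(x)
        if abs(derivative - expected) > tol * max(1.0, abs(expected)):
            return False
    return True


def integrand(x):
    return math.sin(x) / (3 + math.cos(x)) ** 2


def candidate(x):
    # Differentiating 1/(3 + cos(x)) gives sin(x)/(3 + cos(x))^2.
    return 1 / (3 + math.cos(x))


print(numerically_verified(integrand, candidate, [0.3, 1.1, 2.5]))  # True
```

A result that fails such a check is not necessarily wrong (branch cuts and sample points near singularities can mislead a numeric test), which matches the report's remark that unverified results need further investigation.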

1.8 Important notes about some of the results

1.8.1 Important note about Maxima results

Since the tests were run in batch mode using an automated script, any integral where Maxima needed an interactive response from the user to answer a question during the evaluation of the integral will fail.

The exception raised is ValueError. Therefore the Maxima result is lower than what would be obtained if Maxima were run directly and each question were answered correctly.

Such failures were not counted for each test file, but as an example, for the Timofeev test file there were about 14 such integrals out of a total of 705, or about 2 percent. This percentage can be higher or lower depending on the specific input test file.

Such integrals can be identified by looking at the output of the integration in each section for Maxima. The exception message will indicate the cause of error.
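A batch script can record such failures instead of stopping, along the following lines. This is a minimal sketch, not the report's script: run_integral and asks_question are hypothetical names, and the stub below only imitates a call that raises ValueError with an invented message.

```python
def run_integral(integrate, integrand):
    """Run one integration and record any exception instead of aborting."""
    try:
        return {"status": "ok", "result": integrate(integrand)}
    except Exception as exc:
        # Keep the exception type and message so the cause can be read later.
        return {
            "status": "failed",
            "error": f"Exception raised: {type(exc).__name__}",
            "message": str(exc),
        }


def asks_question(integrand):
    """Stub standing in for a CAS call that needs interactive input."""
    raise ValueError("interactive answer required (illustrative message)")


outcome = run_integral(asks_question, "1/(x^2 + a)")
print(outcome["error"])  # Exception raised: ValueError
```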

Maxima integrate was run using SageMath with the following settings set by default

'besselexpand : true' 
'display2d : false' 
'domain : complex' 
'keepfloat : true' 
'load(to_poly_solve)' 
'load(simplify_sum)' 
'load(abs_integrate)' 
'load(diag)'
 

SageMath's automatic loading of Maxima's abs_integrate was found to cause some problems, so the following code was added to disable this effect.

 from sage.interfaces.maxima_lib import maxima_lib 
 maxima_lib.set('extra_definite_integration_methods', '[]') 
 maxima_lib.set('extra_integration_methods', '[]')
 

See https://ask.sagemath.org/question/43088/integrate-results-that-are-different-from-using-maxima/ for reference.

1.8.2 Important note about FriCAS result

There were a few integrals which failed due to the SageMath interface and not because FriCAS could not do the integration.

These fail with the error Exception raised: NotImplementedError.

The number of such cases seems to be very small, about 1 or 2 percent of all integrals. These can be identified by looking at the exception message given in the result.

1.8.3 Important note about finding leaf size of antiderivative

For Mathematica, Rubi, Mathics and Maple, the built-in system function LeafCount was used to find the leaf size of each antiderivative.

The other CAS systems (SageMath and Sympy) do not have a special built-in function for this purpose at this time. Therefore the leaf size of antiderivatives returned through SageMath, such as those of Fricas, was determined using the following function, thanks to user slelievre at https://ask.sagemath.org/question/57123/could-we-have-a-leaf_count-function-in-base-sagemath/

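from sage.all import SR  # symbolic ring; already defined in an interactive Sage session

def tree_size(expr):
    r"""
    Return the tree size of this expression.
    """
    if expr not in SR:
        # deal with lists, tuples, vectors
        return 1 + sum(tree_size(a) for a in expr)
    expr = SR(expr)
    x, aa = expr.operator(), expr.operands()
    if x is None:
        # an atom (symbol or number) counts as one
        return 1
    else:
        # one for the operator plus the size of each operand
        return 1 + sum(tree_size(a) for a in aa)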
 

For Sympy, which was called directly from Python, the following code was used to obtain the leaf size of its result

from sympy import count_ops

try:
    # 1.7 is a fudge factor since count_ops is on the low side of the actual leaf count
    leafCount = round(1.7 * count_ops(anti))
except Exception as ee:
    # fall back to a leaf size of 1 if the operations cannot be counted
    leafCount = 1
 

For Giac, the call taille(anti_derivative,RAND_MAX); is used to find leaf size.
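The idea behind all of these leaf size measures is a count of the nodes in the expression tree. The following pure Python sketch shows the same recursion on nested tuples, independent of any CAS. The tuple encoding and the name leaf_count are assumptions made for illustration, and the number obtained depends on how an expression is represented, so it will not in general match the value reported by a particular CAS.

```python
def leaf_count(tree):
    """Count nodes of an expression tree written as nested tuples.

    An atom (string or number) counts as 1. A tuple is
    (operator, operand, ...) and counts 1 for the operator plus the
    count of each operand.
    """
    if not isinstance(tree, tuple):
        return 1
    _operator, *operands = tree
    return 1 + sum(leaf_count(arg) for arg in operands)


# sin(x)^2/2 encoded as (1/2) * sin(x)^2
expr = ("*", ("/", 1, 2), ("^", ("sin", "x"), 2))
print(leaf_count(expr))  # 8
```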

1.8.4 Important note about Mupad results

Matlab’s symbolic toolbox does not have a leaf count function to measure the size of the antiderivative. Maple was used to determine the leaf size of Mupad output by post processing Mupad result.

Currently no grading of the antiderivative is implemented for Mupad. If it can integrate the problem, the result is assigned a B grade automatically as a placeholder. In the future, when a grading function is implemented for Mupad, the tests will be rerun.

The following is an example of using Matlab’s symbolic toolbox (Mupad) to solve an integral

integrand = evalin(symengine,'cos(x)*sin(x)') 
the_variable = evalin(symengine,'x') 
anti = int(integrand,the_variable)
 

Which gives sin(x)^2/2

1.9 Design of the test system

The following diagram gives a high level view of the current test build system.