These micro-benchmarks, while not comprehensive, do test compiler performance on a range of common code patterns, such as function calls, string parsing, sorting, numerical loops, random number generation, recursion, and array operations.
It is important to note that the benchmark codes are not written for
absolute maximal performance (the fastest code to compute
recursion_fibonacci(20) is the constant literal 6765). Instead,
the benchmarks are written to test the performance of identical
algorithms and code patterns implemented in each language. For
example, the Fibonacci benchmarks all use the same (inefficient)
doubly-recursive algorithm, and the pi summation benchmarks use the
same for-loop. The “algorithm” for matrix multiplication is to call
the most obvious built-in/standard random-number and matmul routines
(or to directly call BLAS if the language does not provide a
high-level matmul), except where a matmul/BLAS call is not possible
(such as in JavaScript).
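To make the "identical algorithm" idea concrete, here is a minimal Python sketch of the two patterns named above: the deliberately inefficient doubly-recursive Fibonacci and a plain for-loop summation. The name recursion_fibonacci comes from the text; pi_sum, its terms parameter, and the default of 10,000 terms are assumptions chosen for illustration and may not match the actual benchmark sources.

```python
import math


def recursion_fibonacci(n):
    """Doubly-recursive Fibonacci: two recursive calls per level, no memoization."""
    if n < 2:
        return n
    return recursion_fibonacci(n - 1) + recursion_fibonacci(n - 2)


def pi_sum(terms=10_000):
    """Sum 1/k^2 for k = 1..terms with an explicit loop (hypothetical term count).

    The series converges to pi^2 / 6, so the result can be sanity-checked.
    """
    total = 0.0
    for k in range(1, terms + 1):
        total += 1.0 / (k * k)
    return total


if __name__ == "__main__":
    # The text states that recursion_fibonacci(20) evaluates to 6765.
    print(recursion_fibonacci(20))
    # The truncated series should sit close to its limit pi^2 / 6.
    print(abs(pi_sum() - math.pi ** 2 / 6) < 1e-3)
```

The point of such code is not speed: every language runs the same recursive calls and the same loop, so the timings reflect how well each implementation handles that pattern rather than how clever the algorithm is.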
The data presented here is generated using this IJulia benchmarks notebook.
C and Fortran are compiled with gcc 4.8.5, taking the best timing from all optimization levels (-O0 through -O3). C, Fortran, Go, Julia, Lua, Python, and Octave use OpenBLAS v0.2.19 for matrix operations; Mathematica uses Intel(R) MKL. The Python environment is Anaconda Python v3.6.3. The Python implementations of matrix_statistics and matrix_multiply use NumPy v1.13.1 and OpenBLAS v0.2.19 functions; the rest are pure Python implementations. Raw benchmark numbers in CSV format are available here and the benchmark source code for each language can be found in the perf. files listed here.
These micro-benchmark results were obtained on a single core (serial execution) on an Intel(R) Core(TM) i7-3960X 3.30GHz CPU with 64GB of 1600MHz DDR3 RAM, running openSUSE LEAP 42.3 Linux.
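For readers who want to time a similar micro-benchmark serially on their own machine, the following is a rough sketch of a best-of-N timer in Python. It is not the harness used to produce the numbers here: the helper name best_time, the repeats parameter, and the choice to keep the minimum over repeated runs are all assumptions made for illustration.

```python
import time


def best_time(func, repeats=5):
    """Return the smallest wall-clock time (in seconds) over `repeats` calls.

    Hypothetical helper: keeping the minimum is one common way to reduce
    noise from other processes when timing on a single core.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


if __name__ == "__main__":
    # Example workload: sort a reversed list of 1000 integers.
    seconds = best_time(lambda: sorted(range(1000, 0, -1)))
    print(f"best of 5: {seconds * 1e3:.3f} ms")
```

Absolute timings from such a script depend on the CPU, memory, and operating system, so they are only comparable with other runs on the same machine.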