News and releases!

Features for MPIR 2.1.1

Bug Fixes:

  • Fixed a long-standing but latent bug in some Windows assembler code that has only now been triggered by recent changes in higher-level code. In outline, the K8/K10 mpn_sublsh_n function entry point (in the file mpn/x86_64w/k8/sublsh_n.asm) was not being set up correctly. Thanks to Case Vanhorsen for reporting this bug.
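
For readers unfamiliar with the function named in the fix above, the sketch below models what an mpn_sublsh_n-style routine is generally understood to compute: an n-limb subtraction of a left-shifted operand, with a borrow-out. It is written in Python for brevity and is not the MPIR assembler, nor its exact signature. The names sublsh_n, limbs_to_int, int_to_limbs and LIMB_BITS are hypothetical, and the 64-bit limb width is an assumption.

```python
LIMB_BITS = 64  # assumed limb width (x86_64); hypothetical constant
LIMB_MASK = (1 << LIMB_BITS) - 1


def limbs_to_int(limbs):
    """Interpret a little-endian list of limbs as one integer."""
    value = 0
    for i, limb in enumerate(limbs):
        value |= limb << (i * LIMB_BITS)
    return value


def int_to_limbs(value, n):
    """Split a non-negative integer into n little-endian limbs."""
    return [(value >> (i * LIMB_BITS)) & LIMB_MASK for i in range(n)]


def sublsh_n(up, vp, shift):
    """Model r = u - (v << shift) on n-limb operands.

    Returns (result_limbs, borrow). The result is reduced modulo
    2**(n * LIMB_BITS); borrow is the number of multiples of that
    modulus that had to be added to make the difference non-negative.
    """
    n = len(up)
    assert len(vp) == n and 0 < shift < LIMB_BITS
    modulus = 1 << (n * LIMB_BITS)
    diff = limbs_to_int(up) - (limbs_to_int(vp) << shift)
    # Smallest borrow such that diff + borrow * modulus >= 0.
    borrow = (-diff + modulus - 1) // modulus if diff < 0 else 0
    return int_to_limbs(diff + borrow * modulus, n), borrow
```

Because the shifted operand can exceed n limbs, the borrow in this model may be larger than 1, which is one reason such routines need careful entry and exit handling.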

Speedups:

  • None

Features:

  • Initial build with Visual Studio Express 2010

Changes:

  • None

Features for MPIR 2.1.0

Bug Fixes:

  • Fixed the xgcd normalisation issue and redid the tuning code for gcd and xgcd
  • Fixes for compiling with GCC 4.5.0 on Itanium

Speedups:

  • None

Features:

  • Initial build with Visual Studio 2010

Changes:

  • Export new function mpn_sqr

Features for MPIR 2.0.0

License:

  • Switched to overall LGPL v3+

Bug Fixes:

  • Fixed a bug in the probable prime code (reported by Xiangyu Liu)
  • Fixed a build issue on 32 bit p6 Apples
  • Fixed demos/pollard_rho
  • Numerous tuning bug fixes
  • Fixed a build issue for icc compiler

Speedups:

  • Sped up squaring code
  • Minor speedup to toom4 code
  • Sped up x86_64 divrem_1 when divisor is 64 bits (from GMP)
  • Sped up x86_64 divrem_2 (from GMP)
  • Sped up GCD and GCDEXT by an improved nhgcd2.c (from GMP)
  • Sped up addmul code for Itanium (by Jason Martin)
  • Large number of new and sped up Itanium assembly functions (by Torbjorn Granlund)

Features:

  • Toom8.5 code (by Marco Bodrato); see the paper M. Bodrato, "High degree Toom'n'half for balanced and unbalanced multiplication", in E. Antelo, D. Hough and P. Ienne, editors, Proceedings of the 20th IEEE Symposium on Computer Arithmetic, IEEE, Tubingen, Germany, July 25-27, 2011, pp. 15--22.
  • Schoolbook Euclidean division code (by Torbjorn Granlund)
  • Divide and conquer Euclidean division code (by Torbjorn Granlund and Marco Bodrato, adapted to use David Harvey's middle product based approximate quotient code)
  • Asymptotically fast division code (by William Hart), based on Paul Zimmermann's mpn_invert and some reuse of the divide and conquer code.
  • New mpn_tdiv_q and mpn_tdiv_qr code (by Torbjorn Granlund)
  • Schoolbook Hensel division code (largely by Niels Moller)
  • Divide and conquer Hensel division code (by Niels Moller, Torbjorn Granlund and David Harvey)
  • New mpn_divexact code and mpz_divexact to match (by Torbjorn Granlund)
  • New mpn_rootrem, mpz_rootrem and mpz_root code (by Paul Zimmermann and Torbjorn Granlund)
  • New mpn_neg, mpn_sqr, mpn_zero, mpn_and_n, mpn_ior_n, mpn_xor_n, mpn_xnor_n, mpn_nand_n, etc (by Torbjorn Granlund)
  • New string input/output code (by Torbjorn Granlund)
  • New mp_bitcnt_t type for multiple precision bit counts
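
The Hensel division items above refer to division that proceeds from the least significant limb upward, using an inverse of the divisor modulo the limb base. A minimal sketch of the single-limb exact-division case follows, in Python for brevity. It assumes 64-bit limbs and an odd divisor, it is not MPIR's code, and the names inverse_mod_limb and divexact_1_sketch are hypothetical.

```python
LIMB_BITS = 64  # assumed limb width; hypothetical constant
LIMB_MASK = (1 << LIMB_BITS) - 1


def inverse_mod_limb(d):
    """Return the inverse of odd d modulo 2**LIMB_BITS (Newton iteration)."""
    assert d & 1, "only odd values are invertible modulo a power of two"
    inv = d  # d * d == 1 (mod 8), so d is its own inverse to 3 bits
    for _ in range(5):  # precision doubles: 3 -> 6 -> 12 -> 24 -> 48 -> 96 bits
        inv = (inv * (2 - d * inv)) & LIMB_MASK
    return inv


def divexact_1_sketch(limbs, d):
    """Divide a little-endian limb list by an odd single-limb divisor d.

    Works from the low limb upward. Returns (quotient_limbs, remainder);
    the Hensel-style remainder is 0 exactly when d divides the input.
    """
    dinv = inverse_mod_limb(d)
    quotient = []
    borrow = 0
    for limb in limbs:
        t = limb - borrow
        q = (t * dinv) & LIMB_MASK      # q * d == t (mod 2**LIMB_BITS)
        quotient.append(q)
        borrow = (q * d - t) >> LIMB_BITS  # high part carried to the next limb
    return quotient, borrow
```

For example, divexact_1_sketch([15, 3], 3) divides 3 * 2**64 + 15 by 3 and yields the limbs [5, 1] with a zero remainder. No trial quotient estimation is needed, which is what makes this style of division attractive when the result is known to be exact.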

Changes:

  • Removed benchmark 0.1 code from tarball
  • Updated GMP_VERSION to "5.0.1"

Features for MPIR 1.3.0

Bug Fixes:

  • Fixes to the build system to better support MinGW
  • Fixed a memory leak in lehmer GCD code
  • Fixed a CPU misidentification on BSD
  • Fixed a BSD install issue
  • Fixed a make try warning on Solaris
  • Fixed make distclean to clean up properly after a fat binary build
  • Fixed a bug in make distcheck
  • Fixed mpf_eq bug (reported on GMP list)
  • Fixed non-uniformness of mpz_urandomm
  • Fixed mpf exponent printing issue (reported on GMP list)
  • Fixed bug in sparc32/v9 add/sub code
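
The mpz_urandomm item above concerns non-uniform output. The notes do not say what the cause was, so the following is only a general illustration of one common source of bias (reducing a fixed number of random bits modulo n) and the usual remedy (rejection sampling). It is a Python sketch, not MPIR's implementation, and the names modulo_bias_counts and uniform_below are hypothetical.

```python
import random


def modulo_bias_counts(bits, n):
    """Count how often each residue occurs when every bits-bit value is
    reduced modulo n. Unequal counts mean the reduction is biased."""
    counts = [0] * n
    for value in range(1 << bits):
        counts[value % n] += 1
    return counts


def uniform_below(n, rng=random):
    """Return a uniformly distributed integer in [0, n) by rejection.

    Draws just enough bits to cover n - 1 and retries whenever the
    candidate falls outside the range, so every result is equally likely.
    """
    assert n > 0
    bits = (n - 1).bit_length()
    if bits == 0:
        return 0  # n == 1: only one possible value
    while True:
        candidate = rng.getrandbits(bits)
        if candidate < n:
            return candidate
```

With 3 random bits and n = 3, modulo_bias_counts(3, 3) gives [3, 3, 2]: the residue 2 is under-represented. Rejection sampling avoids this at the cost of fewer than two draws on average.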

Speedups:

  • Unbalanced Toom 4 multiplication
  • Toom 53 multiplication
  • New fast single limb gcd and gcdext routines
  • Switched on ngcd based Lehmer GCD routine
  • Strassen multiplication for 2x2 matrices to speed up ngcd and ngcdext
  • Switched on new MPN_ZERO and mpn_store assembly routines in FFT code
  • Left and right shift assembly code for x86_64
  • Rewrote generic mullow and mulhi functions
  • New mpz factorial code and tuning (contributed by Robert Gerbicz using an algorithm of Schonhage - see page 226 in: "Fast Algorithms, A Multitape Turing Machine Implementation" by A. Schonhage, A. F. W. Grotefeld and E. Vetter, BI Wissenschafts-Verlag, Mannheim, 1994)
  • Updating of 32 bit Windows support for AMD64, p3 and p4
  • Core2/penryn and nehalem mpn_store assembly code
  • Core2/penryn copyi assembly code
  • Better 32 bit k8/k10 and Nehalem assembly code
  • Initial support for via Nano
  • New mpn_rootrem code
  • Select better assembly code for Atom 64 bit
  • New faster mpz_tdiv_q code
  • Faster division and exact division by a single limb on x86_64
  • Core2/penryn and nehalem addlsh_n assembly code
  • K8/k10 addlsh_n, sublsh_n assembly functions, including carry in variants
  • K8/k10 inclsh_n, declsh_n assembly code

Features:

  • Middle product multiplication (by David Harvey) - see his paper "The Karatsuba middle product for integers"
  • Optimised k8/k10 and Nehalem assembly code for add_err1_n, sub_err1_n used by mulmid
  • Speed program accepts lines of data from a text file
  • A batch script to build MPIR with MSVC using a configure/make-like syntax
  • Complete rewrite of the benchmark program in C by Brian Gladman
  • New primality test code written by T. R. Nicely used as a benchmark case, adapted with the help of Jeff Gilchrist
  • mpn_lshift2 and mpn_rshift2 assembly functions
  • Latest Yasm assembler
  • sb_divappr_q, schoolbook approximate quotient
  • dc_divappr_q, divide and conquer approximate quotient (by David Harvey)
  • Script for setting all version numbers automatically when doing a release
  • mpn_neg_n function
  • New mpn_mulmod_2expp1 and mpn_mulmod_2expm1 functions
  • Benchmark for mpn functions
  • New k8 mpn_lshiftc assembler function
  • Macro functions inclsh1, declsh1
  • The try program now tests macro functions
  • Macros for memory managers to determine when reallocations are likely to occur
  • New function mpz_nthroot
  • New mpz_next_likely_prime, mpz_probable_prime_p and mpz_likely_prime functions
  • Factor out trial division function from primality test code
  • New mpf_rrandomb without global state
  • New mpn_urandomb, mpn_urandomm, mpn_rrandom and mpn_randomb functions without global state
  • New mpn invert code (contributed by Paul Zimmermann), used in division code
  • New generic divrem_hensel functions
  • Implement Peter Montgomery's mpn_mod_1_k algorithms
  • Optimised AMD, core2/penryn, atom, nehalem assembly functions for mpn_mod_1_?
  • New assembly code for AMD divrem_hensel_qr_1, divrem_hensel_r_1
  • New AMD, core2/penryn, atom, nehalem assembly functions mpn_rsh_divrem_hensel_qr_1_2
  • New optimised AMD, core2/penryn, atom, nehalem assembly functions mpn_divrem_hensel_qr_1_2
  • New generic functions mpn_rsh_divrem_hensel_qr_1_?
  • New generic mpn_tdiv_q function (based on mulmid/dc_divappr_q code)
  • Improved Windows timing code
  • Added architecture directory k102 for Phenom II assembly code
  • Support for new family 6, model 30 CPU
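
The mpn_mulmod_2expp1 and mpn_mulmod_2expm1 items above name multiplication modulo 2^n + 1 and 2^n - 1. The arithmetic idea behind such routines can be sketched on Python integers as follows. This does not reproduce MPIR's limb-level interface or operand representation, and the function names below are hypothetical stand-ins chosen to echo the list entries.

```python
def mulmod_2expm1_sketch(a, b, n):
    """Return a * b mod (2**n - 1), using 2**n == 1 (mod 2**n - 1).

    The high part of the product is folded onto the low part by addition.
    """
    mask = (1 << n) - 1
    assert 0 <= a <= mask and 0 <= b <= mask
    product = a * b
    while product >> n:
        product = (product & mask) + (product >> n)
    return 0 if product == mask else product  # 2**n - 1 itself is 0


def mulmod_2expp1_sketch(a, b, n):
    """Return a * b mod (2**n + 1), using 2**n == -1 (mod 2**n + 1).

    The high part of the product is folded onto the low part by
    subtraction; inputs are residues in the range 0 .. 2**n.
    """
    modulus = (1 << n) + 1
    mask = (1 << n) - 1
    assert 0 <= a < modulus and 0 <= b < modulus
    product = a * b
    result = (product & mask) - (product >> n)
    if result < 0:
        result += modulus
    return result
```

Both reductions need only shifts, masks and an addition or subtraction, with no division, which is why moduli of this shape are convenient in FFT-based multiplication.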

Changes:

  • Removed requirement to type make install-gmpcompat
  • Make check tests both static and dynamic libraries where code differs
  • Changed library version numbers from x.y to x.y.z when doing a new minor release
  • Removed numerous extremely old deprecated functions
  • Removed mpbsd support from MPIR
  • Removed ancient ansi2knr conversion

Features for MPIR 1.2.0

Bug fixes:

  • None

Speedups:

  • Add new FFT code (retrieved from http://www.loria.fr/~kruppaal/mul_fft-4.2.1.1.tgz), based on the original FFT code written by Paul Zimmermann, subsequently rewritten by Pierrick Gaudry, Alexander Kruppa and Paul Zimmermann according to the ideas presented in their ISSAC paper ("A GMP-based implementation of Schonhage-Strassen's large integer multiplication algorithm", Pierrick Gaudry, Alexander Kruppa and Paul Zimmermann, Proceedings of the 2007 International Symposium on Symbolic and Algebraic Computation, pp. 167-174), and revised by Torbjorn Granlund, with numerous bug fixes due to William Hart and improvements for Windows due to Brian Gladman
  • Add tuning parameters for new FFT for most modern processors
  • Write tuning code for new FFT
  • Implement Toom32, unbalanced Toom3, Toom42
  • Optimise Toom3 and Toom3 squaring code using better sequences
  • Factor out Toom4/7 interpolate sequences and switch to twos complement
  • Optimise memory usage in Toom 3, 4 and 7 routines
  • Many new highly optimised assembly routines for x86_64 architectures
  • Fast XGCD based on Moller's ngcd algorithm
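
The Toom variants listed above all follow the same pattern: split the operands, evaluate at a few points, multiply the pieces, and interpolate. The smallest member of that family, Karatsuba (Toom-2), is sketched below on Python integers as an illustration. It is not the Toom32, Toom3 or Toom42 code from this release, and the function name and threshold parameter are hypothetical.

```python
def karatsuba(x, y, threshold_bits=64):
    """Multiply non-negative integers with Karatsuba (Toom-2) splitting.

    Three half-size products replace the four of the schoolbook method.
    Operands at or below threshold_bits use the built-in multiply.
    """
    assert x >= 0 and y >= 0 and threshold_bits >= 8
    if x.bit_length() <= threshold_bits or y.bit_length() <= threshold_bits:
        return x * y

    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x0, x1 = x & mask, x >> half  # x = x1 * 2**half + x0
    y0, y1 = y & mask, y >> half  # y = y1 * 2**half + y0

    low = karatsuba(x0, y0, threshold_bits)
    high = karatsuba(x1, y1, threshold_bits)
    # (x0 + x1) * (y0 + y1) - low - high == x0 * y1 + x1 * y0
    middle = karatsuba(x0 + x1, y0 + y1, threshold_bits) - low - high

    return (high << (2 * half)) + (middle << half) + low
```

Higher Toom variants split into more pieces and evaluate at more points, and the unbalanced ones split the two operands into different numbers of pieces.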

Features:

  • Modified speed program to be able to add values from columns together

Changes:

  • None

Features for MPIR 1.1.0

Bug fixes:

  • Work around a linker bug in Apple Darwin Tiger
  • Resolve an issue causing a build failure on recent Cygwin32's
  • Fixed development test code to do proper overlap tests for functions with three source operands

Speedups:

  • Added numerous assembly optimised functions for division by certain small constants and euclidean division by a single limb (Jason Moxham)
  • Optimised mul_2 and addmul_2 (Jason Moxham)
  • Added Toom 4 and Toom 7 multiplication for balanced operands (William Hart)
  • Small speedup for mpz_mul for small operands when not aliased

Features:

  • Complete rearrangement of cpu detection code to explicitly support k8, k10, pentium4, prescott, netburst, netburstlahf, core2, core, penryn, atom, nehalem
  • Factored out x86/x86_64 detection for both ordinary and fat builds into cpuid.c
  • Distribute mpirbench with mpir (new make bench option)
  • Added __GMP_CC and __GMP_CFLAGS, __MPIR_CC and __MPIR_CFLAGS to gmp/mpir.h
  • Report when CPU is not identified (try sensible defaults)
  • Support Pentium 4's that do not support LAHF/SAHF instructions
  • Support Pathscale gcc on MIPS64
  • Addition of assembly optimised subadd_n function

Changes:

  • Re-enabled mpbsd functionality

Features for MPIR 1.0.0

Bug fixes:

  • Building outside the source tree is now possible
  • Bug removed from Windows Assembler file dive_1.asm
  • Fat binary support for Core 2 64 bit fixed
  • x86_64 fat binary support on Sun machines with gcc fixed
  • Build failure on Sun machines using later versions of gcc fixed
  • Aliasing bug in mpz_urandomm fixed

Speedups:

  • Dramatic speedups for K8 assembly code (due primarily to Jason Moxham)
  • Assembly support for K10
  • Significant speedups for Core 2 assembly (due primarily to Jason Moxham)
  • Some mpn assembler functions that were not being used in the mpz layer, due to missing HAVE_NATIVE flags, are now used
  • Nocona processors now use Core 2 assembly functions instead of generic C

Features:

  • Emit mpir binaries and mpir.h and offer support for gmp compatibility
  • Build support for Intel Atom
  • Unrecognised Intel 64 machines now default to Core 2 assembly support
  • Some new, undocumented mpn functions
  • Try, speed and tune now available for Windows MSVC build

Features for MPIR 0.9.0

Bug fixes:

  • Sun CC support
  • C99 support in gmp.h
  • Support for Apple GCC compiler
  • Merged numerous patches for GMP 4.2.1 from GMP website
  • Corrections in documentation including function prototypes
  • Build support for Sparc-Solaris
  • Support for Core 2 Solaris
  • Support for SiCortex MIPS
  • Distinguish and detect P4, Nocona, Prescott
  • Support numerous recent Intel family 6 and Dunnington processors
  • Fixed bugs in perfect power detection

Speedups:

  • Jason Martin's Core 2 assembly patches
  • Niels Möller's GCD patches
  • Pierrick Gaudry's AMD64 assembly patches
  • Tuning flags for P4, Prescott, Nocona and Core 2

Features:

  • x86_64 code to Yasm format (Yasm supplied with MPIR)
  • Support for building on MSVC
  • Fat binary support for AMD x86_64

Other changes:

  • Disabled nails support
  • Removed macos port