Logo Hisseo

Linear interpolation

The following code takes as input two arrays x[0..n-1] and y[0..n-1] representing a sampling yi = f(xi) of a real function f. Given a real z, it first finds an interval [xi-1,xi] where z lies, then returns an approximation of f(z) by linearly interpolating between (xi-1,yi-1) and (xi,yi).

This program is almost trivial to prove when using a perfect mathematical model for computation, but when we precisely model floating-point computations, we need extra information to guarantee the absence of overflow. This is done by giving 2 bounds UPPER and LOWER, so that any number involved in the given arrays are not greater than UPPER, and moreover the steps of samplings in x must be greater than LOWER. This last condition is needed to avoid overflow in the division of the last return statement.

The paper proof of absence of overflow in that division is as follows: let's assume UPPER=2^u and LOWER=2^(-l). In that division we have as numerator the product of z - x[i-1] and y[i] - y[i-1] which both belong to interval [ -2 UPPER, 2 UPPER ]. The denominator is x[i] - x[i-1] which is greater than LOWER. Thus, division does not overflow iff ((2 UPPER)^2) / LOWER is less or equal the maximal representable number in 64bit format, that is 2^1024 - 1. This reduces to (2^{u+1} ^ 2) / 2^(-l) < 2^1024, i.e. 2u+2 + l < 1024, i.e. 2u+l < 1022.

In the code below, we choose u and l satisfying condition above and close to each other, to have UPPER and LOWER of a similar order of magnitude, which gives u=340 and l=341.

#pragma JessieFloatModel(strict)

#define UPPER 0x1p340
#define LOWER 0x1p-341

/*@ predicate min_step{L}(double t[], integer a, integer b) = 
  @    \forall integer i; a < i <= b ==> t[i] - t[i-1] >= LOWER;

/*@ lemma min_step_increasing{L}:
  @   \forall double t[], integer a, b;
  @     min_step(t,a,b) ==> \forall integer i,j; 
  @          a <= i <= j <= b ==> t[i] <= t[j];

//@ predicate bounded(double z) = -UPPER <= z <= UPPER;

/*@ predicate array_bounded{L}(double t[],int n) = 
  @   \forall integer i; 0 <= i < n ==> bounded(t[i]);

//@ ghost int i_interp;

/*@ requires n >= 1 && \valid_range(x,0,n-1) && \valid_range(y,0,n-1);
  @ requires min_step(x,0,n-1);
  @ requires bounded(z);
  @ requires array_bounded(x,n) ;
  @ requires array_bounded(y,n);
  @ ensures z <= x[0] ==> \result == y[0];
  @ ensures z > x[n-1] ==> \result == y[n-1];
double interp_lin(double x[], double y[], int n, double z) {
  int i;
  if (z <= x[0]) return y[0];

  /*@ loop invariant 1 <= i <= n;
    @ loop invariant z > x[i-1];
    @ loop variant n-i;
  for (i=1; i= LOWER;
  return yim1+zmxim1*(yi-yim1)/(xi-xim1);
  //  return y[i-1]+(z-x[i-1])*(y[i]-y[i-1])/(x[i]-x[i-1]);

This is proved by a combination of automated provers (SMT solvers and Gappa) and the Coq proof assistant.

Coq is needed to prove the lemma min_step_increasing which need an induction on j.

Gappa deals with all the floating-point overflow checking, including the division: it shows that Gappa is able to automatically perform the reasoning we made on paper above.

The remaining verification conditions, which are related to all the non floating-point aspects of the code are handled by Alt-Ergo or Z3.

Notice: the code above introduces extra auxiliary variables to store values of x[i-1], x[i], y[i-1], y[i]. This was needed to perform the proof, because of some unsufficient feature of the translation from Why to Gappa. This issue is under investigation.