\documentstyle[11pt]{article}
\newcommand\draft{\textheight=10.5truein \textwidth=7.5truein % draft print
\hoffset=-1.5truein \voffset=-1truein}
%\newcommand\quality{\textheight=220mm \textwidth=160mm \hoffset=-20mm} %[11pt]
\newcommand\quality{\textheight=240mm \textwidth=160mm \hoffset=-20mm} %[11pt]
%\draft
\quality
\topmargin 0Truein
\def\be{\begin{equation}} \def\ee{\end{equation}}
\def\IR{\hbox{\rm I\kern-.2em\hbox{\rm R}}}
\def\IZ{\hbox{{\rm Z}\kern-.3em{\rm Z}}}
\def\n{\noindent} \def\D{{\cal D}}
\def\dist{\, {\rm dist}} \def\mod1{\,({\rm mod\ } 1)\,}
\def\ep{\varepsilon} \def\la{\lambda}
\newcommand\Lbrace{\left\lbrace} \newcommand\Rbrace{\right\rbrace}
\newcommand\Lpar{\left(} \newcommand\Rpar{\right)}
%=========================================================================
\begin{document}
\title{Round-off induced instabilities and some unusual arithmetics}
\author{Michael Blank
\thanks{On leave from Russian Academy of Sciences, Inst. for
Information Transmission Problems,
Ermolovoy Str. 19, 101447, Moscow, Russia.}
\\ \\
C.N.R.S., Observatoire de Nice, BP 229, \\ 06304 Nice Cedex 4, France,
e-mail: blank@obs-nice.fr}
\date{February 14, 1995} % \today}
\maketitle
\begin{abstract} We consider some unusual phenomena that appear in long
consecutive calculations due to a new sort of round-off errors of Pentium
floating point division bug type. Mathematical model of these round-off
errors is proposed and new phenomena in numerical simulations of
dynamical systems (differential equations) and other numerical procedures
are discussed. It is shown also that these new phenomena could appear not
only with Pentium floating point division bugs, but in a more general
situation. \end{abstract}
\section{Introduction}\label{secI}
When solving differential equations or other dynamical systems on a
computer, we inevitably run into perturbations caused by various types of
discretizations. The first of them is time discretization, that is, the
transition from differential equations or flows to difference equations
or maps. The second type is space discretization, that is, the transition
from continuous phase space to a discrete lattice. In numerical
simulations these discretizations correspond to round-off errors.
Recently discovered Pentium floating point division bug attracted a new
attention to the well known problem of round-off errors and gave a
possibility to take a fresh look to the problem.
A review of unusual phenomena that appear under the action of round-offs
was published recently in \cite{B1}. The model of round-off errors,
proposed in \cite{B1,B2,B3}, is the following. Consider a difference
equation $x_{n+1}=f(x_n)$, acting on some compact subset
$X~\subset~\IR^d$ (for example, on the unit interval $X=[0,1]$). As an
example of type of maps under consideration, one could have in mind a so
called ``tent-map'' - one-dimensional ($d=1$) map $f_T:[0,1]~\to~[0,1]$,
defined by the relation $f_T(x)=a(1/2-|1/2-x|)$ (see Fig.~1.a), or
another one-dimensional map - a dyadic map $f_2(x)=\{2x\}$ (see
Fig.~1.b). Here $\{.\}$ means the fractional part of a number.
\bigskip \hskip2Truecm
%\framebox{ %figure 1.a
\setlength{\unitlength}{1mm}
\begin{picture}(50,40) \put(0,0){\framebox(40,40){}}
\put(0,0){\line(1,1){40}} \thicklines
\put(0,0){\line(3,5){20}} \put(40,0){\line(-3,5){20}}
\put(-20,35){Figure 1.a} \put(2,35){$f_T$}
\end{picture} %}
%\framebox{ %figure 1.b
\setlength{\unitlength}{1mm} \begin{picture}(20,40) \end{picture}
\begin{picture}(50,40) \put(0,0){\framebox(40,40){}}
\put(0,0){\line(1,1){40}} \thicklines
\put(0,0){\line(1,2){20}} \put(20,0){\line(1,2){20}}
\put(-10,35){1.b} \put(2,35){$f_2$}
\end{picture} %}
\hskip1cm
What do we mean by space discretization of such a system? Actually,
in computer simulations we do not have a continuous compact phase
space $X \subset \IR^d$, but an ordered discrete lattice of points
$X_\ep \subset X$, and after the calculation of the next point
$x_{n+1}$ we in fact do not have the precise value of $f(x_n)$ but
only the nearest point to it in our numerical lattice. The parameter
$\ep$ (discretization diameter) corresponds to the maximal distance
between neighboring points of this lattice. Let us denote the operator
that associates to each point $x \in X$ its nearest point on the
lattice $x_\ep \in X_\ep$ by $D_\ep:X\to~X_\ep$ (if there are several
such points we choose the point with the minimal index). Then the
discretized map is $f_\ep = D_\ep \circ f$ and the discretized system
is a pair $(f_\ep, X_\ep)$.
{}From the first sight, it seems, that this model is the most general,
as possible, especially, if we take into account that the lattice
$X_\ep$ needs not to be uniform. However any model of this type
preserves the order in the following sense. In the one-dimensional
case ($d=1$) for any two points $x \le y \in \IR^1$ the natural
order preserves under discretization: $D_\ep(x) \le D_\ep(y)$. For
uniform $d$-dimensional discretizations (or independent arbitrary
discretizations on each coordinate) for any $i=1,2,\dots,d$ the
inequality $x_i \le y_i$ implies $(D_\ep(x))_i \le (D_\ep(y))_i$,
where $x_i$ is the $i$-th coordinate of a vector $x \in \IR^d$.
As it will be seen later this consideration could be crucial in
some computer simulations.
The Pentium floating point division bug, according to \cite{P5} and
also to Cleve Moler and Thomas Nicely reports consists in the following.
Double precision divisions involving operands with certain bit
patterns can produce incorrect results. The most dramatic example
was given by by Tim Coe of Vitesse Semiconductor:
$x=4195835, \,$
$y=3145727, \,$
$z=x-(x/y)*y.$
With exact computation, $z$ would be zero, but, on the Pentium,
$z=256.$
According to \cite{P5} such the bit patterns that could produce
pathological results are very rare, and therefore it is quite
improbable to find something of this type in calculations with
a short number of consecutive steps. On the contrary, dealing with
numerical simulations of dynamical systems or partial differential
equations, one calculates very long trajectories of the system,
i.e. consecutive iterations of the map $f$. Therefore the probability
to run into the point with a ``bad'' bit pattern during the calculation
of a long trajectory with the initial point chosen at random, could
be very large, even if a percentage of ``bad'' bit patterns is very
small. Actually exactly in such a kind of calculations the considered
bug was discovered firstly by T.~Nicely \cite{Ci}, who investigated
asymptotic properties of the sum of reciprocals to prime numbers.
It not known yet how many operands cause the Pentium's floating
point division to fail (T.~Nicely estimates the probability of this
event as $10^{-9}$ \cite{Ci}), or even what operands produce the largest
relative error.
Remark, that the hardware errors of this type are not quite new.
Consider, for example, well known errors in the implementations of
trigonometric functions, random number generators, etc, on various
computers. Consider now in detail how a function is calculated on a
computer. Let, for example, $f(x)=\{\pi x\}$. Then in a computer
implementation of this function we shall obtain
$f_\ep(x) = \{D_\ep(D_\ep(\pi)~\,~D_\ep(x))\}$, rather than
$D_\ep(\{\pi x\})$. The difference is that one should consider the
round-off error on every elementary step of the calculation. Clearly, when
$\ep$ is small enough, the difference between these two values is also
small. Formally, denoting the new discretization by $\D_\ep$, one could
write the discretized system as a pair $(\D_\ep f, X_\ep)$, however now
$\D_\ep f$ is not a composition of the discretization $\D_\ep$ and the
map $f$, but a more complex procedure, depending on the decomposition of
the map into elementary exact calculated procedures and on the hardware
errors of Pentium flaw type. The assumption that the number of ``bad''
points is small enough simply means that the proportion of the points in
the numerical lattice such that
$|D_\ep\circ f(x)-\D_\ep f(x)|\sim\ep$ is close to $1$.
One could tell that the only difference between $D_\ep$-discretizations
and $\D_\ep$-dis\-cre\-tizations consists in the fact, that the round-off
error is larger in the latter case and there exists another lattice
(with larger $\ep$) corresponding to the $\D_\ep$-discretization.
However this is not so. Indeed, even if the magnitude of errors due to
$\D_\ep$-discretization is small enough, say of order $\ep$, the
operator $\D_\ep$ may not preserve order, which as we show further leads
to new phenomena in the behavior of discretized systems or numerical
procedures.
It could be surprising, but the exact calculation of the usual round-off
error on the every elementary step could bring to the absence of the
order preserving property. Consider an example. Let
$x=1, \, y=1+3\ep/5, \, u=1-\ep/3, \, v=1+2\ep/5$. Then $x/y>u/v$, while
$\D_\ep(x/y)=D_\ep(x)/D_\ep(y)=1/2<1=D_\ep(u)/D_\ep(v)=\D_\ep(u/v)$ for
the uniform $D_\ep$-discretization with any $\ep>0$. Clearly such
situations could not appear with other arithmetical operations, and even
with the division they are very rare, exactly as in the case of the
Pentiom flaw.
There are two typical situations in which very rare errors could lead to
a visible result in numerical simulations. One of them is the calculation
of very long trajectories or consecutive statistics, when the probability
to run into the mistake becomes large. The second situation corresponds
to the fact that this type of perturbations does not preserve order, and,
thus, when order is important, even fine enough $\D_\ep$-perturbations
could alter the result not only quantitatively, but qualitatively.
In \cite{B1} a series of examples was considered when usual
discretizations with arbitrary small diameters (corresponding to
arbitrary small round-off errors) lead to a very drastic difference in
behavior of the perturbed (discretized) systems with respect to the
original ones. On the one hand rigorous conditions under which
statistical (the roughest) properties of the system survive
discretization were discussed there, and on another hand examples where,
even in the limit of vanishing perturbation, a ``localization''
phenomenon takes place: trajectories which should normally be dense
remain confined to a small number of points. In the present paper we
shall discuss new phenomena that could appear under the action of
$\D_\ep$-discretizations to compare the behavior of dynamical systems or
numerical procedures under $D_\ep$- and $\D_\ep$-discretizations, when
round-off errors are small enough.
\section{Periodic trajectories}\label{secII}
Remember that an orbit or a trajectory of a map $f$ is a sequence of
points $\{x_k\}_1^\infty$ such that $x_{k+1}=f(x_k)$ for all $k$. For
the case of a periodic orbit of period $n$ such a sequence is formed by a
repetition of a finite collection of different points $(x_1, x_2, \cdots,
x_n)$ and therefore it is defined by any of these points (for example
$x_1$) and by the period. We call such a point periodic with period $n$.
In contrast with the original dynamical system, for every map $f$, the
set of non-wandering points (that is the broadest nontrivial invariant
set) of the discretized system coincides with the set of periodic points.
Indeed, the phase space of the discretized map $X_\ep$ consists of only a
finite number of points, and thus any trajectory will eventually hit into
some periodic trajectory. Therefore, the investigation of the asymptotic
properties of the perturbed system is reduced to an investigation of
solely periodic orbits.
``Period multiplication'' phenomenon, described in \cite{B1} (Theorem 1),
consists in the emergence in a neighborhood of the original cycle of a
new cycle, the period of which is a multiple of that of the parent cycle.
In the 1-dimensional case ($d=1$) only the period doubling could take
place, while for $d>1$ any multiple period could appear (Theorem 1). In a
numerical investigation of a bifurcation structure of the system
(especially in the well known Feigenbaum scenario of the transition to
chaos by repeated period doubling) this leads to a very complex problem
to distinguish the real bifurcations from ``numerical'' ones.
The general scheme of the period doubling of a stable fixed point is
the following:
\bigskip \hskip 3Truecm
%\framebox{
\setlength{\unitlength}{1mm}
\def\xl#1#2{\newcount\x \x=#1 \newcount\y \y=#2
\put(\x,#2){\line(0,1){6}}
\advance\x by 14 \put(\x,#2){\line(0,1){6}}
\advance\x by 14 \put(\x,#2){\line(0,1){6}}
\advance\y by 3 \put(26,\y){\circle*{1}}
\advance\y by 1 \put(25,\y){c}}
\begin{picture}(50,40) \put(0,0){\framebox(40,40){}}
\put(5,30){\line(1,0){30}}
\put(5,20){\line(1,0){30}}
\put(5,10){\line(1,0){30}}
\thicklines \xl{6}{27} \xl{6}{17} \xl{6}{7} \thinlines
\put(26,30){\vector(-1,0){6}}
\put(20,30){\vector(1,-1){10}} \put(28,20){\vector(1,0){6}}
\put(34,20){\vector(-2,-1){20}} \put(14,10){\vector(1,0){6}}
\put(15,-5){Figure 2.}
\end{picture} %}
\bigskip \bigskip
The proof of this theorem is based on the order-preserving property for
both the map $f$ (locally around the periodic trajectory) and the
discretization $D_\ep$. Clearly, this is not the case with
$\D_\ep$-discretizations, and therefore even for $d=1$ one could find
period multiplication phenomena with arbitrary multipliers. Consider, for
example, the situation above, when $\D_\ep(c) \ne D_\ep(c)$, where $c \in
X$ is a stable fixed point of the map $f$, i.e., $f(c)=c$.
``Period multiplication'' does not depend on the stability of the original
periodic trajectory and in the $\D_\ep$-discretizations case large period
multipliers could appear even when the $\D_\ep$-round-off error is of the
same order as the usual round-off error. On the other hand, the
$\D_\ep$-round-off errors could prevent ``period multiplication'' even if
all the conditions for its existence are satisfied.
The problem of the description of general effects of space discretization is of
special interest in the case of chaotic dynamical systems (see, for example,
\cite{B3} and referencies therein), because all their trajectories are unstable
and therefore the influence of perturbations on individual trajectories may
grow (exponentially) in time. A typical example is the ``dyadic'' map (see Fig
1.b). The iterates of almost all initial points are uniformly distributed and
all features of developed chaos are visible. The dyadic map has periodic
trajectories of all periods and the set of its periodic points is dense on the
interval $[0,1]$. More complex examples occur in the three body problem in
celestial mechanics, the H\'enon mapping $(x,y) \to (1-ax^2+y,bx)$ and others
(see \cite{B3} and references therein for further examples). Questions about
properties of discretized chaotic systems are an active area of mathematical
investigation and many problems are still open. Practically, there are only a
few results, showing that orbits of the discretized chaotic systems may have a
connection to the original orbits. Among these results the main general one
shows the connection of the orbit of weakly perturbed systems (with arbitrary
type of perturbation) to the original orbit. This result (shadowing property)
is due to D.V.~Anosov \cite{An}, who showed that for a smooth hyperbolic system
for each orbit of a weakly perturbed system there exists uniformly closed orbit
of the original one.
The shadowing orbit may be untypical in the sense that the distribution of its
points (which is the most important feature of a chaotic system) differs from
the typical one, defined by the Sinai-Bowen-Ruelle measure \cite{BPSJ}.
Theorem~2 \cite{B1} shows that this situation may often happen in case of the
usual arbitrary fine discretizations. Indeed, the ``localization phenomenon'',
described in this theorem corresponds to a situation, when trajectories, which
should normally be dense, remain confined to a small number of points. The
simplest example, when this phenomenon takes place is the dyadic map under
binary uniform discretizations ($\ep=2^{-n}$). Let $f(x):=\{2x-q/2^n\}$ and let
$q\in~\IZ^1$. Then the uniformly $2^{-n}$-discretized map has a fixed poit
$q/2^n$ and under the action of the discretized map any point $x\in~X_{2^{-n}}$
eventually hits to this point after not more than $n$ iterations. However, in a
small neighborhood of the point $q/2^n$ there are points $\beta\in~X_{2^{-n}}$
such that $\min\{k:\;\Lpar D_{2^{-n}}\circ f\Rpar^k(\beta)=q/2^n\}=n$. Suppose
now, that $\D_{2^{-n}}(q/2^n)=\beta$ and there are no $\D_\ep$-errors during
the first $n$ iterates of the point $\beta$. This assumption is quite natural,
because the ``bad'' points supposed to be very rare. Then the
$\D_\ep$-discretized system has a $n$-periodic trajectory, which certainly
differs from the statement above.
As we mentioned in Section~\ref{secI}, in spite of the fact that
$\D_\ep$-round-off errors are very rare, their magnitude could be large enough
(remember the example of Tim Coe) and thus Anosov's theorem about the shadowing
property could not be applyed here. Actually this type of perturbations may be
considered as a noise with very rare large fluctuations. A generalized
shadowing property for perturbations small in average was considered in
\cite{B5}, where under the same assumptions as in Anosov's theorem the
existence of shadowing in average trajectories was proved.
\section{Neutral periodic trajectories}\label{secIV}
Consider now the influence of discretizations on systems, such that the
trajectories are neither stable nor unstable, or such they have a local
invariant component of this type. Systems of this type appear in various
mathematical models of intermittent behavior (for example, a map
$x \to \{x+x^2\}$) and therefore their investigation is very important.
The simplest example of a map with neutral trajectories is a rotation
$f^{(\alpha)}$ of points in the $d$-dimensional plane $\IR^d$ by a fixed
angle $\alpha \in \IR^{d-1}$ around the origin. In this case the
investigation of individual trajectories has no sense, and only the
global properties like ergodicity, stability etc. are of interest. Once
more, statistical approach is quite useful here for the case of usual
space discretizations, and due to the same reason as before, it could not
be applied for $\D_\ep$-discretizations.
Numerical simulations of $D_\ep$-discretizations of the two-dimensional
irrational rotation in \cite{B1} shows (see Fig.~3.a) that the asymptotic
trajectories look like thick clusters of quasy circles with gaps (without
any trajectory there) between them. Typical numerical trajectory is
filling a large part of such cluster, and for any trajectory the distance
to the origin remains bounded.
The main question here is if there are numerical trajectories going to
infinity or to the origin. Numerical investigations give a negative answer
to this question (see Fig.~3.a). However, due to the thickness of
numerical trajectories in the clusters, the appearance of ``bad'' points,
such that $\D_\ep(x)\ne~x$, in these clusters becomes quite probable, and,
thus, a trajectory could go very far from the origin, or could return to
the origin.
Consider now $\D_\ep$-discretizations. This means that the elements of
the rotation matrix are calculated with the precision $\ep$ and all the
multiplications are done with the same precision. The result is shown on
Figure~3.b. The difference becomes especially striking if one takes into
account that all the points outside of the circle of radius $0.45$ are
going out of the circle of radius $10^6$, thus the asymptotic absolute
error is of order $10^6$, which gives actually a positive answer to the
question above. To explain this phenomenon consider a ``toy'' model of the
rotation to the angle $\pi$ on the one-dimensional integer lattice:
$x \to -D_1(x/p) \, p$, where $p>1$ is an integer. This map has a fixed point
$0$ and 2-periodic trajectories ($kp~\longleftrightarrow~-kp, \, k=1,2,\dots$)
with the distance $p$ between them. Even more interesting features could
be found in the integer version of the rotation to the angle $2\pi$:
$$ x \to x + D_1(x/p) \, p - D_1((x-1)/p) \, p - D_1(1/p) \, p
\equiv x + D_1(x/p) \, p - D_1((x-1)/p) \, p. $$
In the exact calculation without the discretization this is the identical
map. However for $p=2q+1, \, q \in \IZ_+$, the discretized map
preserves all the integer points of the real line except the points
$x=kp+q+1, \, k=0,\pm~1, \pm~2, \dots$ going to plus infinity.
\section{Instability of sorting procedures}\label{secV}
Order preserving property is especially important in numerical
realizations of various sorting procedures. Let us start from the well
known and very efficient procedure to find zeroes of a one-dimensional
function. The scheme is the following. On the every step of the procedure
the interval, where the zero is located is divided into two equal parts
and, according to the value of the function in the boundary point, the
left (right) subinterval is chosen. The procedure converges exponentially
even in the presense of $\D_\ep$-round-off errors, but the solution may
be quite far from the real zero of the function. Indeed, if, during the
long sequence of comparisons, we only once run into the
$\D_\ep$-round-off error, then all the consequent steps are wrong.
This situation is far from from the worst in the presense of
$\D_\ep$-round-off errors, because the error, while very probable, does
not grow in time. The latter case becomes typical in consecutive sorting
procedures. Modern sorting algorithms like ``submassive merging'',
``priority queues'' etc. (see \cite{Kn} for details) use tree-like
structures in sorting and therefore even the only one small
$\D_\ep$-round-off error could switch the procedure to another branch of
the tree, and therefore to bring to a large sorting error.
Surprisingly, application of various sorting procedures becomes an
efficient tool in modern mathematical physics. Recently an algorithm
heavily based on sorting was proposed in \cite{VDFN} for numerical
investigation of Burgers' equation with random initial data.
Implementation of the algorithm leads to a consecutive sorting of a very
large amount of data (about $10^7$ points) in each step, thus, the
appearance of $\D_\ep$-round-off errors becomes unavoidable.
\section*{Acknowledgements} I would like to thank Uriel Frisch, who
actually proposed the idea to investigate Pentium-type errors, and
Michel Henon for helpfull discussions. This work was supported by the
French Ministry of Higher Education.
%--------------------------------------------------------
\newpage
\begin{thebibliography}{99}
\bibitem{An} D. V. Anosov, Geodesic flows on closed Riemannian
manifolds with negative curvature, {\em Trudi. Math. Inst. Russian
Ac. of Sci}, {\bf 90}, 1967, pp.~210--283.
\bibitem{B1} M. L. Blank, Pathologies generated by round-off in
dynamical systems, {\em Physica} D {\bf 78}:1-2, 1994, pp.~93--114.
\bibitem{B2} M. L. Blank, Ergodic properties of discretizations of
dynamical systems, {\em Docl. Russian Ac. of Sci.}, {\bf 278}:4,
1984, pp.~779--782.
\bibitem{B3} M. L. Blank, Small perturbations of chaotic dynamical
systems, {\em Uspekhi Matem. Nauk.}, {\bf 44}:6, 1989, pp.~3--28.
\bibitem{B5} M. L. Blank, Metric properties of $\ep$-trajectories of
dynamical systems with stochastic behavior, {\em Ergodic Theory and
Dynamical Systems}, {\bf 8}:3, 1988, pp.~365--378.
\bibitem{BPSJ} L. A. Bunimovich, Ya. G. Pesin, Ya. G. Sinai,
M. V. Jacobson, Ergodic theory of smooth dynamical systems.
Modern problems of mathematics. Fundamental trends, {\bf 2}, 1985,
pp.~113--231.
\bibitem{Ci} B. Cipra, How number theory got the best of the Pentium
chip, {\em Science} {\bf 267}, 1995, p.~175.
\bibitem{Kn} D. E. Knut, The art of computer programming. {\bf 3} Sorting
and searching, Addison-Wesley, 1973.
\bibitem{P5} H. P. Sharangpani, M. L. Barton, Statistical analysis of
floating point flaw in the Pentium$^{TM}$ processor, Preprint, 1994.
\bibitem{VDFN} M. Vergassola, B. Dubrulle, U. Frisch, A. Noullez,
Burgers' equation, devil's staircases and the mass distribution for
large-scale structures, {\em Astron. \& Astrophys.} {\bf 289}, 1994,
pp.~325--356.
\end{thebibliography}
\end{document}