\magnification=1200
\def\qed{\unskip\kern 6pt\penalty 500\raise -2pt\hbox
{\vrule\vbox to 10pt{\hrule width 4pt\vfill\hrule}\vrule}}
\null\vskip 4truecm
\centerline{POSITIVITY OF ENTROPY PRODUCTION}
\bigskip
\centerline{IN NONEQUILIBRIUM STATISTICAL MECHANICS.}
\bigskip
\bigskip
\centerline{by David Ruelle\footnote{*}{IHES (91440 Bures sur Yvette, France),
and Math. Dept., Rutgers University (New Brunswick, NJ 08903, USA).}}
\bigskip
\bigskip
\indent
{\it Abstract.} We analyze different mechanisms of entropy
production in statistical mechanics, and propose formulae for the
entropy production rate $e(\mu)$ in a state $\mu$. When $\mu$ is a
steady state describing the long term behavior of a system we show that
$e(\mu)\ge0$, and sometimes we can prove $e(\mu)>0$.
\bigskip
\bigskip
\indent
{\it Key words and phrases:} ensemble, entropy production,
folding entropy, nonequilibrium stationary state, nonequilibrium
statistical mechanics, SRB state, thermostat.
\vfill\eject
\centerline{0. Introduction.}
\bigskip
\bigskip
\indent
The study of nonequilibrium statistical mechanics leads
naturally to the introduction of {\it nonequilibrium states}. These are
probability measures $\mu$ on the phase space of the system, suitably
chosen and stationary (in principle) under the nonequilibrium time
evolution. In the present paper we analyze the entropy production
$e(\mu)$ for such nonequilibrium states, and show that it is positive
(more precisely $\ge0$, sometimes one can prove $>0$). That the
positivity of $e(\mu)$ needs a proof was repeatedly pointed out by G.
Gallavotti and E.G.D. Cohen\footnote{*}{In their seminal paper [15] for
instance they state: ``positivity rests on numerical evidence'', and refer
to [12]. }
. Here we shall emphasize the physics of the problem
and be particularly concerned with a proper choice of mathematical
framework and definitions; the proof that $e(\mu) \ge0$ will then be
relatively easy.
\medskip\indent
{\it Thermostatting}
\medskip\indent
We shall think of a physical system having a finite (possibly
large) number of degrees of freedom. The phase space ${\cal S}$ is thus a
finite dimensional manifold, with a symplectic structure and therefore a
natural volume element. In the situation of {\it equilibrium
statistical mechanics} there are conservative forces acting on the
system. A Hamiltonian $H$ is thus defined on ${\cal S}$, and energy is
conserved, {\it i.e.}, time evolution is restricted to an energy shell
${\cal S}_E=\{x:H(x)=E\}$ of ${\cal S}$, where $E$ typically ranges from
some lower bound $E_0$ to $+\infty$ (this is because the potential
energy is $\ge E_0$ and the kinetic energy takes all values $\ge 0$).
While ${\cal S}$ is noncompact and has infinite volume, ${\cal S}_E$ is
compact and has finite volume.
\medskip\indent
In the case of {\it nonequilibrium statistical mechanics} we
have nonconservative forces and, although we may be able to define a
natural energy function, with values in $[E_0,+\infty)$, the energy is in
general not conserved. Typically the point representing the system
wanders away to infinity in the noncompact phase space ${\cal S}$ while
the system heats up, {\it i.e.}, its energy tends to $+\infty$. In this
situation it is not possible to define time averages corresponding to a
probability measure on ${\cal S}$, {\it i.e.}, it is not possible to
introduce nonequilibrium states. This difficulty follows from the
noncompactness of phase space and is not tied to the special physical
meaning of the energy. (The same difficulty arises in diffusion
problems where the energy is constant but the configuration space is
infinite).
\medskip\indent
Physically, the way to avoid heating up the system is to put
it in contact with a thermostat. One can idealize the thermostat as a
random interaction (with a ``heat bath''). The study of entropy
production remains to be done in this framework, and should separate the randomness (or entropy) introduced
by the thermostat, and that created by the system itself.
\medskip\indent
It is also possible to constrain the time evolution by brute
force to some compact manifold $M \subset {\cal S}$. Consider for
instance a system satisfying Hamiltonian equations of motion:
$$ \dot X=J {\partial}_X H $$
where $X=(p,q)$ and $J{\partial}_X=(-{\partial}_q,{\partial}_p)$. The energy is conserved because $\dot H={\partial}_X H.{\dot
X}={\partial}_X H.J{\partial}_X H=0$. Let us add an external driving
term $F$ so that the time evolution is now
$$ \dot X=J{\partial}_X H+F(X) $$
This will in general heat up the system because
$$ \dot H=\partial_X H.\dot X=\partial_X H.F(X) \ne 0 $$
but if we replace $F$ by
$$ F-{({\partial}_X H.F) \over ({\partial}_X H.{\partial}_X H)}.{\partial}_X H $$
then $H$ is preserved: this is the so-called {\it Gaussian thermostat}
(see Hoover[18]).
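This projection is easy to check numerically. The sketch below (the harmonic Hamiltonian $H=(p^2+q^2)/2$ and the constant driving force are our own illustrative choices, not taken from the text) verifies that the Gaussian-thermostatted vector field satisfies $\dot H={\partial}_X H.\dot X=0$ identically.

```python
# Illustrative choices (not from the text): H(p, q) = (p^2 + q^2)/2
# and a constant nonconservative drive acting on the momentum.
def grad_H(X):
    p, q = X
    return [p, q]                      # gradient of (p^2 + q^2)/2

def F(X):
    return [1.0, 0.0]                  # external nonconservative force

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def thermostatted_field(X):
    """dX/dt = J grad H + F_proj: the Gaussian thermostat removes the
    component of F along grad H, so that dH/dt = 0 identically."""
    g = grad_H(X)
    f = F(X)
    c = dot(g, f) / dot(g, g)
    f_proj = [fi - c * gi for fi, gi in zip(f, g)]
    # Hamiltonian part: with X = (p, q), (dp/dt, dq/dt) = (-dH/dq, dH/dp)
    ham = [-g[1], g[0]]
    return [h + fp for h, fp in zip(ham, f_proj)]

# dH/dt = grad H . dX/dt vanishes at every phase point
for X in [[0.3, -1.2], [1.0, 0.5], [-0.7, 2.0]]:
    assert abs(dot(grad_H(X), thermostatted_field(X))) < 1e-12
```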
\medskip\indent
To summarize, we want to act on our system to keep it outside
of equilibrium, but also impose a thermostat to prevent heating.
Physically this means that we pump entropy out of the system, while
keeping the energy fixed.
\medskip\indent
From now on we shall consider a time evolution on a compact
manifold $M$. We shall forget the symplectic structure (this is no
longer relevant because we no longer have a Hamiltonian). We shall
however need the volume element $dx$ to define the statistical
mechanical entropy $S(\underline\rho)=-\int dx \underline\rho (x) \hbox{log}\underline\rho (x)$
of a probability density $\underline\rho$ on $M$. Equivalent volume
elements will be equivalent for our purposes because changing $dx$ to
$\phi(x)dx$ replaces $S(\underline\rho)$ by $S(\underline\rho)+\int dx \underline\rho (x)
\hbox{log}\phi (x)$; the additive term is bounded independently of
$\underline\rho$, and will play no significant role in our considerations. We may
thus take for $dx$ the volume element associated with any Riemann
metric. Note that $S(\underline\rho)$ is the physical entropy when
$\underline\rho$ is a thermodynamic equilibrium state, but we can extend
the definition to arbitrary $\underline\rho$ such that
$S(\underline\rho)$ is finite.
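The effect of changing the volume element can be checked on a discretized example (the density $\underline\rho$ and the weight $\phi$ below are arbitrary illustrative choices): replacing $dx$ by $\phi(x)dx$ turns the density into $\underline\rho/\phi$, and the entropy indeed shifts by $\int dx\underline\rho(x)\hbox{log}\phi(x)$.

```python
import math

# Discretized check on [0, 1]: with volume element phi(x) dx the density
# of the same measure becomes rho/phi, and S picks up the additive term
# integral of rho * log(phi).
N = 1000
xs = [(k + 0.5) / N for k in range(N)]
h = 1.0 / N

rho = [1.0 + 0.5 * math.sin(2 * math.pi * x) for x in xs]   # a density
phi = [math.exp(math.cos(2 * math.pi * x)) for x in xs]     # new volume

S = -sum(r * math.log(r) for r in rho) * h                   # w.r.t. dx
S_phi = -sum((r / p) * math.log(r / p) * p                   # w.r.t. phi dx
             for r, p in zip(rho, phi)) * h
shift = sum(r * math.log(p) for r, p in zip(rho, phi)) * h

assert abs(S_phi - (S + shift)) < 1e-12
```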
\medskip\indent
The fact that we take seriously the expression
$S(\underline\rho)=-\int dx\underline\rho(x)\hbox{log}\underline\rho(x)$
for the entropy seems to be at variance with the point of view defended
by Lebowitz [20], who prefers to give physical meaning to a {\it Boltzmann
entropy} different from $S(\underline\rho)$. There is however no necessary
contradiction between the two points of view, which correspond to
idealizations of different physical situations. Specifically, Lebowitz
discusses the entropy of states which are locally close to equilibrium,
while here we analyze entropy production for certain particular steady
states (which may be far from equilibrium).
\medskip\indent
{\it Pumping entropy out of the system.}
\medskip\indent
We have now reduced our mathematical framework to a smooth
time evolution on a compact manifold $M$. We may also discretize the
time (using a time one map $f$ or a Poincar\'e first return map $f$) and
consider that the time evolution is given by iterates of $f:M \to M$.
Even though the mathematical setup is now just that of a smooth
dynamical system $(M,f)$, there remains the problem of studying how entropy
is pumped out of the system, and how nonequilibrium states are defined.
We shall consider three cases.
\medskip\indent
(i) $f$ is a diffeomorphism (hence $f^{-1}$ is defined).
Nonequilibrium states $\mu$ may be defined by time averages corresponding
to orbits $(f^k x)$ where $x \in {\cal V}(\mu)$ and ${\cal V}(\mu)$ has
positive Riemann volume: vol${\cal V}(\mu)>0$. More precisely, let $\delta(x)$
denote the unit mass at $x$; we may say that $\mu$ is a nonequilibrium
state if $\mu=\lim_{m \to \infty} (1/m) \sum_{k=0}^{m-1} \delta (f^k x)$ for
all $x \in {\cal V}(\mu)$, and vol${\cal V}(\mu)>0$;
special examples are the so-called SRB states. We shall see that
entropy is pumped out of $\mu$ because $f$ contracts volume elements (in
the average)\footnote{*}{See Chernov, Eyink, Lebowitz, and
Sinai [5], Chernov and Lebowitz [6] for models with phase space contraction. }.
\medskip\indent
(ii) $f$ is a noninvertible map. Here the folding of the phase
space caused by $f$ acts to pump entropy out of the
system\footnote{**}{A model with folding of phase space has been
considered by Chernov and Lebowitz [7].}. Nonequilibrium
states may be defined as limits of states $(1/m)\sum_{k=0}^{m-1} f^k
\rho$ with $\rho$ absolutely continuous with respect to the volume.
\medskip\indent
(iii) $f$ has a nonattracting set $A$ which carries a
nonequilibrium state $\mu$ associated with a diffusion process
\footnote{\dag}{See Gaspard and Nicolis [17], Dorfman and Gaspard [10].}.
Specifically, let $A$ be an $f$-invariant subset of $M$ which is
not attracting. If $U$ is a small neighborhood of $A$, $fU$ is not
contained in $U$. Let $\rho$ be the Riemann volume normalized to
$U$. Then $f\rho$ is not supported in $U$. We multiply by the
characteristic function of $U$ and normalize to obtain a new probability
measure ${\rho}_1={\parallel\chi_U .f\rho\parallel}^{-1} \chi_U.f\rho$.
Iterating this process $m$ times we obtain $\rho_m$, and define
$$ \rho^{(m)}={1\over m}\sum_{k=1}^m f^{-k} \rho_m $$
In the Axiom A case we shall see (Section 3 below) that $\rho^{(m)}$
tends to an $f$-invariant probability measure $\mu$ giving to
$h(\mu)-\sum\hbox{ positive Lyapunov exponents of }\mu$ its maximum value
$P$ ($h$ is the {\it Kolmogorov-Sinai entropy} and the {\it pressure}
$P$ is $\le 0$). One can argue (see [19], [11], and below) that the volume
of the points $x\in U$ such that $fx,...,f^m x \in U$ behaves like
$e^{mP}$. Here again entropy is pumped out of the
system by getting rid of the part of $f\rho$ outside of $U$, and $\mu$
may be interpreted as a nonequilibrium state.
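The volume estimate $e^{mP}$ can be illustrated on the simplest open expanding map (our own choice of example, not discussed in the text): $f(x)=3x$ mod $1$ with $U=[0,1/3]\cup[2/3,1]$, whose invariant set $A$ is the middle-third Cantor set. Here $h(\mu)=\hbox{log}2$, the Lyapunov exponent is $\hbox{log}3$, so $P=\hbox{log}(2/3)<0$, and the surviving volume is exactly $(2/3)^{m+1}$.

```python
import math

# Illustrative open system: f(x) = 3x mod 1 with the hole (1/3, 2/3),
# i.e. U = [0,1/3] u [2/3,1]. The set {x : x, fx, ..., f^m x in U}
# consists of 2^(m+1) intervals of length 3^(-(m+1)).
def surviving_volume(m):
    intervals = [(0.0, 1.0)]
    for _ in range(m + 1):
        refined = []
        for (lo, hi) in intervals:
            w = (hi - lo) / 3.0
            refined.append((lo, lo + w))    # first third survives
            refined.append((hi - w, hi))    # last third survives
        intervals = refined
    return sum(hi - lo for (lo, hi) in intervals)

m = 12
vol = surviving_volume(m)
assert abs(vol - (2.0 / 3.0) ** (m + 1)) < 1e-12
P_est = math.log(vol) / (m + 1)             # approaches P = log(2/3)
assert abs(P_est - math.log(2.0 / 3.0)) < 1e-9
```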
\medskip\indent
For a recent, physically oriented, review of nonequilibrium
statistical mechanics we refer the reader to Dorfman [9]. He discusses in
particular calculations using periodic orbits, as advocated by
Cvitanovi\'c (see Artuso, Aurell, and Cvitanovi\'c [1], Cvitanovi\'c,
Eckmann, and Gaspard [8]).
\medskip\indent
{\it Towards physical applications}
\medskip\indent
The {\it ergodic hypothesis} states that the Liouville measure
restricted to an energy shell ${\cal S}_E$ (for a Hamiltonian system) is
ergodic under time evolution. This serves to justify the ensembles of
statistical mechanics and, while the ergodic hypothesis is likely to be
false in general, it is apparently almost true in the sense that the
application of equilibrium statistical mechanics to real systems has
been extremely successful.
\medskip\indent
One may try to base nonequilibrium statistical mechanics on a
principle similar to the ergodic hypothesis. Here one assumes that the
dynamical system $(M,f)$ describing time evolution is {\it hyperbolic}
in some sense
\footnote{*}{See Eckmann and Ruelle [11] for definitions and a physically
oriented review of dynamical systems.}
, and that time averages are given by particular
probability measures called SRB measures; these are the
nonequilibrium states which replace the microcanonical {\it ensemble} of
equilibrium statistical mechanics. The SRB states correspond to
time averages for a set of positive measure of initial conditions. They
are characterized by smoothness along unstable directions or
equivalently by a variational principle
\footnote{**}{The approach just indicated to the study of nonequilibrium
systems was advocated early in lectures by the present author (G.
Gallavotti mentions the date of 1973); for the case of turbulence see
[24]. Other people familiar with SRB measures would have had similar
ideas, but these have started to be useful only with the recent (1995)
work of Gallavotti and Cohen [15], [16]. The mathematical study of SRB
states was made by Sinai [26] for Anosov diffeomorphisms, Ruelle [23]
for Axiom A diffeomorphisms, Bowen and Ruelle [4] for Axiom A flows.
The very nontrivial extension to nonuniformly hyperbolic systems is due to
Ledrappier and Young [22].}.
\medskip\indent
The assumption that the systems of nonequilibrium statistical
mechanics are hyperbolic and described by SRB measures is unlikely to
be exactly true, but it is reasonable to expect that it is approximately
true in the sense that it gives correct physical predictions in the
limit of large systems (thermodynamic limit).
\medskip\indent
Actual physical predictions were obtained only after
Gallavotti and Cohen [15] supplemented the hyperbolicity assumption by the
{\it reversibility} assumption. The latter assumes that there is a map
$i:M\to M$ such that $i^2=1$, $fi=if^{-1}$.
\medskip\indent
The {\it chaotic hypothesis} of Gallavotti and Cohen [15],
[16] (see also Gallavotti [13],[14])
states that physically correct results (for nonequilibrium systems in
the thermodynamic limit) will be obtained by assuming reversibility and
treating the system as if it were hyperbolic (in fact Anosov). An
essential role in the inspiration of Gallavotti and Cohen was played by
the numerical results and analysis by Evans, Cohen and Morriss [12].
\medskip\indent
{\it Example.}
\medskip\indent
Consider a Hamiltonian $H(X)={1\over 2}(p,M^{-1}p)+U(q)$ where
$M$ is the mass matrix, and $U$ is the potential energy. We denote by
$f^t X$ (with $f^0 X=X$) the solution of Hamilton's equation $\dot
X=J{\partial}_X H$. Defining $i(p,q)=(-p,q)$ we find that $f^t
i=if^{-t}$, which expresses reversibility. Reversibility is preserved
if we introduce an external force $F=(\Phi (q),0)$, and again if we add
a Gaussian thermostat.
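For the harmonic special case $H=(p^2+q^2)/2$ (unit mass and frequency, our own illustrative choice) the flow is an explicit rotation of the $(p,q)$ plane, and the reversibility relation $f^t i=if^{-t}$ can be checked directly.

```python
import math

# Illustrative special case: H = (p^2 + q^2)/2, whose Hamiltonian flow
# f^t is an explicit rotation of the (p, q) plane.
def flow(t, X):
    p, q = X
    c, s = math.cos(t), math.sin(t)
    return (p * c - q * s, q * c + p * s)

def i(X):                      # time reversal i(p, q) = (-p, q)
    p, q = X
    return (-p, q)

X, t = (0.7, -1.3), 0.9
# i is an involution and conjugates the flow to its inverse: f^t i = i f^-t
assert i(i(X)) == X
a, b = flow(t, i(X)), i(flow(-t, X))
assert abs(a[0] - b[0]) < 1e-12 and abs(a[1] - b[1]) < 1e-12
```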
\medskip\indent
{\it Scope of the paper.}
\medskip\indent
In what follows we shall analyze entropy production and its
positivity for the three cases outlined earlier: (i) diffeomorphism,
(ii) noninvertible map, (iii) map near a nonattracting set. The
treatment of these three cases will be somewhat uneven because the
existing mathematical results range from detailed in case (i) to
rather limited in case (iii). Since the emphasis of this paper
is on having the physics straight we have allowed the uneven
mathematical treatment, but suggested some conjectural extensions of the
results that are proved. The possibility of a unified presentation will
depend on further progress in the ergodic theory of differentiable
dynamical systems.
\medskip\indent
{\it Acknowledgements.}
\medskip\indent
I am greatly indebted to Giovanni Gallavotti and Joel Lebowitz
for discussion of the basic concepts and issues involved in the present
paper, and to Jean-Pierre Eckmann for critical reading of the manuscript.
\vfill\eject
\centerline{1. Entropy production for diffeomorphisms.}
\bigskip
\bigskip
Let $M$ be a compact manifold, and $f:M\to M$ a $C^1$
diffeomorphism. Choosing a Riemann metric on $M$, let
$\rho(dx)=\underline\rho(x)dx$ be a probability measure with density $\underline\rho$ with respect to the Riemann
volume element $dx$. The direct image $\rho_1 =f\rho$ has density
$\underline\rho_1 (x) =\underline\rho(f^{-1}x) / J(f^{-1}x) $
where $J(x)$ is the absolute value of the Jacobian of $f$ at $x$
(computed with respect to the Riemann metric).
The statistical mechanical entropy associated with $\underline\rho$ is
$$ S(\underline\rho)=-\int dx \underline\rho(x)\log\underline\rho(x) $$
(This means that $dx$ is interpreted as the phase space volume element;
if $dx$ is the configuration space volume element, then $S(\underline\rho)$ is the
configurational entropy). The entropy $S(\underline\rho)$ will have to be
distinguished from the Kolmogorov-Sinai (time) entropy $h(\mu )$ of an
$f$-invariant state $\mu$ used below.
The entropy associated with $\underline\rho_1$ is
$$ S(\underline\rho_1)=-\int dy \underline\rho_1 (y)\log \underline\rho_1 (y) $$
$$=-\int dy {{\underline\rho(f^{-1} y)}\over{J(f^{-1} y)}} [\log\underline\rho(f^{-1}y)-\log J(f^{-1}y)] $$
$$=-\int dx \underline\rho(x)[\log \underline\rho(x)-\log J(x)]. $$
\medskip\indent
The entropy put into the system in one time step is thus
$$ S(\underline\rho_1)-S(\underline\rho)=\int dx \underline\rho(x)\log J(x) $$
This means that the entropy pumped out of the system, or produced by the
system, is
$$ -\int dx \underline\rho(x)\log J(x) $$
Let ${\underline\rho}_m$ be the density of the measure $\rho_m=f^m
\rho$. If $\underline\rho_m$ tends vaguely
\footnote{*}{The vague topology is the $w^*$-topology on the space of
measures considered as dual of the space of continuous functions. We
denote a vague limit by v.lim.}
to $\mu$ when $m\to\infty$, the entropy production
$$-[S({\underline\rho}_{m+1})-S({\underline\rho}_m)]=-\int dx \underline\rho_m(x)\log J(x) $$
tends to
$$ -\int \mu(dx)\log J(x) $$
It is thus natural to take as definition of the entropy production for
an arbitrary $f$-invariant probability measure $\mu$ the expression
$$ e_f(\mu)=-\int \mu(dx)\log J(x) $$
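When the vague limit holds, $e_f(\mu)$ is the time average of $-\log J$ along a typical orbit. A minimal numerical sketch (the H\'enon map with the usual parameters is our own choice of example, not taken from the text): its Jacobian is constant, $|\det Df|=b$, so the orbit average of $-\log J$ equals $-\log b$, which by Lemma 1.1 below is minus the sum of the Lyapunov exponents.

```python
import math

# Illustrative example: the Henon map
#   f(x, y) = (1 - a x^2 + y, b x),  a = 1.4, b = 0.3.
# Its Jacobian is constant, |det Df| = b, so the orbit average of
# -log J equals -log b.
a, b = 1.4, 0.3

def f(x, y):
    return 1.0 - a * x * x + y, b * x

def log_J(x, y):
    # det Df = (-2 a x) * 0 - 1 * b = -b  (independent of the point)
    return math.log(abs(-b))

x, y = 0.0, 0.0
for _ in range(1000):            # relax onto the attractor
    x, y = f(x, y)

m, total = 5000, 0.0
for _ in range(m):
    total += log_J(x, y)
    x, y = f(x, y)

e_f = -total / m                 # entropy production estimate
assert abs(e_f - (-math.log(b))) < 1e-9
```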
\indent
In the rest of this Section we take $\mu$ to be ergodic, so that
the Lyapunov exponents are constant ($\mu$-a.e.). The general case is
obtained by representing $\mu$ as an integral over its ergodic
components.
\medskip\indent
{\it 1.1 Lemma.} {\it The entropy production $e_f(\mu)$ is
independent of the choice of
Riemann metric and equal to minus the sum of the Lyapunov exponents of
$\mu$ with respect to $f$.}
\medskip\indent
This follows from the Oseledec multiplicative ergodic theorem
in the form given in [11].\qed
\medskip\indent
We remind the reader that the Kolmogorov-Sinai entropy
$h(\mu)$ is the amount of information produced by $f$ in the state $\mu$
(see for instance Billingsley [2]). We always have
$$ h(\mu)\le\sum\hbox{ positive Lyapunov exponents } \eqno(1.1) $$
(this inequality is due to Ruelle, see [11]). We call $\mu$ an SRB measure
(see Ledrappier-Young [22], Eckmann-Ruelle [11]) if
$$ h(\mu)=\sum\hbox{ positive Lyapunov exponents } \eqno(1.2) $$
({\it Pesin identity}). If $f$ is of class $C^2$, the above condition is
equivalent to $\mu$ having conditional probabilities on unstable
manifolds absolutely continuous with respect to Lebesgue measure
(Ledrappier-Young [22]). If $f$ is $C^2$ and $\mu$ has no vanishing
Lyapunov exponent, then there is a set of positive Riemann volume of
points $x\in M$ with time averages $(1/N)\sum_{k=0}^{N-1} \delta({f^k}x)$
tending vaguely to $\mu$ (this result is due to Pugh and Shub, see [11]).
\medskip\indent
{\it 1.2 Theorem.} {\it Let $f$ be a $C^1$ diffeomorphism and
$\mu$ an $f$-invariant probability measure on the compact manifold $M$.
\medskip\indent
(a) If $\mu$ is an SRB measure then $e_f (\mu)\ge 0$.
\medskip\indent
(b) Let $f$ be $C^{1+\alpha}$ with $\alpha>0$ and $\mu$ be an SRB measure. If $\mu$ is
singular with respect to $dx$ and has no vanishing Lyapunov exponent,
then $e_f (\mu)>0$.
\medskip\indent
(c) For every $a$
$$ {\rm{vol}} \{x:{1\over m}\sum_{k=0}^{m-1} \log J(f^k x)\ge a\}\le
e^{-ma} {\rm{vol}} M $$
In particular if ${\cal V}(\mu)=\{x:{\rm{v.lim}}_{m\to\infty}
(1/m)\sum_{k=0}^{m-1}\delta(f^k x)=\mu\}$ and $e_f (\mu)<0$, then
$\rm{vol}{\cal V}(\mu)=0$.}
\medskip\indent
We have denoted by vol the Riemann volume in $M$. In view of
the result of Pugh and Shub mentioned above, (a) follows from (c) if
$f$ is $C^2$ and $\mu$ has no zero characteristic exponent.
Here is a direct proof of (a): if $\mu$ is SRB we have
$$ e_f (\mu)=-\sum \hbox{ Lyapunov exponents } $$
$$=[h(\mu)-\sum \hbox{ positive Lyapunov exponents }]-[h(\mu)+\sum
\hbox{ negative Lyapunov exponents }] $$
$$=[h(\mu)-\sum \hbox{ positive Lyap. exp. w.r.t. }f ]-[h(\mu)-\sum
\hbox{ positive Lyap. exp. w.r.t. }f^{-1} ] $$
$$ \ge 0 $$
where we have used (1.1) and (1.2).
\medskip\indent
To prove (b) notice that if $\mu$ is SRB and $e_f (\mu)=0$
then, according to (a),
$$ h(\mu)=\sum \hbox{positive Lyapunov exponents}=-\sum \hbox{negative Lyapunov
exponents} $$
This implies that $\mu$ is absolutely continuous with respect to $dx$
(see Ledrappier [21] Cor.(5.6)) if $f$ is of class $C^{1+\alpha}$ and $\mu$ has no vanishing
Lyapunov exponent. Since $\mu$ was assumed singular with respect to
$dx$, this is a contradiction, which proves (b).
\medskip\indent
To prove (c) write
$$ {\cal V}(m)=\{x:{1\over m}\sum_{k=0}^{m-1} \log J(f^k x) \ge a \} $$
We have thus
$$ \hbox{vol} M \ge \hbox{vol} f^m {\cal V} (m)=\int_{{\cal V}(m)}
\prod_{k=0}^{m-1} J(f^k x) dx $$
$$ \ge e^{ma} \hbox{vol} {\cal V} (m) $$
as announced.\qed
\medskip\indent
{\it 1.3 Corollary.} {\it If $\mu$ is an SRB measure with respect to
both $f$ and $f^{-1}$, then $e_f(\mu)=0$. }
\medskip\indent
We have indeed $e_f(\mu)\ge0$, and
$e_{f^{-1}}(\mu)=-e_f(\mu)\ge0$. (As pointed out to the author by Joel
Lebowitz, this covers the case of the microcanonical ensemble).\qed
\vfill\eject
\centerline{2. Entropy production for noninvertible maps.}
\bigskip
\bigskip
\indent
{\it Standing assumptions.}
\medskip\indent
Let $M$ be a compact Riemann manifold, possibly with boundary.
We denote by vol the Riemann volume and by $dx$ the volume element. We
assume that a closed set $\Sigma\subset M$ is given, containing the
boundary of $M$, and $f:M\backslash \Sigma \to M$ such that the
following properties are satisfied
\medskip\indent
(A1) vol $\Sigma =0$
\medskip\indent
(A2) There are disjoint open sets $D_1,...,D_N$ such that $M\backslash
\Sigma ={\cup}_{\alpha =1}^N D_{\alpha}$, and $f\mid
D_{\alpha}$ is a homeomorphism to $fD_\alpha$, absolutely continuous
with respect to vol. The Jacobian $J$ of $f$ is continuous in $M\backslash
\Sigma$ and satisfies
$$ \hbox{inf}_{x\notin \Sigma}J(x)\ge e^{-K}>0 $$
(A3) For all pairs $(\alpha ,\beta)$, $fD_{\alpha}$ and $fD_{\beta}$ are
either disjoint or identical.
\medskip\indent
{\it Comments.}
\medskip\indent
It is convenient to use a map $f$ defined outside of an
{\it excluded} set $\Sigma$. In particular this allows discontinuities on
$\Sigma$. When considering the direct image $f\mu$ of a measure $\mu
\ge 0$ on $M$ by $f$, we shall have to assume that $\mu (\Sigma )=0$.
(We have made such an assumption for the measure vol).
\medskip\indent
Condition (A3) might seem very strong, but can be arranged to
hold under the weaker assumption
$$ \hbox{vol} (fD_{\alpha}\cap \partial fD_{\beta})=0 $$
for all pairs $(\alpha ,\beta )$. Let indeed $( D_{\gamma}^1 )$ be the
family of open sets $\cap_{\alpha =1}^N (fD_{\alpha})^\sim $ where
$(fD_{\alpha})^\sim $ is either $fD_{\alpha}$ or $M\backslash
\hbox{clos} fD_{\alpha}$ for each $\alpha$. Let $D_{\alpha
\gamma}^*=D_\alpha \cap f^{-1} D_{\gamma}^1$ and $\Sigma^* =M\backslash
\cup_\alpha \cup_\gamma D_{\alpha \gamma}^*$, then (A1), (A2), (A3) hold
when $\Sigma$, $(D_\alpha)$ are replaced by $\Sigma^*$, $(D_{\alpha
\gamma}^*)$. When considering the direct image $f\mu$, we shall now
have to assume that $\mu(\Sigma^*)=0$.
\medskip\indent
{\it Refining $(D_{\alpha})$.}
\medskip\indent
Let $fD_{\alpha}=D_{\gamma}^1$. We may write
$$ D_{\gamma}^1={\Sigma}_{\gamma}^1 \cup D_{\gamma 1}^1 \cup ...
\cup D_{\gamma n}^1 $$
where vol ${\Sigma}_{\gamma}^1=0$ and the disjoint open sets
$D_{\gamma 1}^1,...,D_{\gamma n}^1$ are small. Writing $D_{\alpha
i}=D_{\alpha} \cap f^{-1} D_{\gamma i}^1$, we may replace $(D_\alpha)$
by a family $(D_{\alpha i})$ of arbitrarily small sets. In other
words we may refine the family $(D_\alpha)$ to a new family
$(D_{\alpha}^*)$ (with $\alpha \in \{1,...,N^* \}$ and an excluded set
${\Sigma}^*$) so that (A1),(A2), (A3) still hold and the sets
$D_{\alpha}^*$ are arbitrarily small.
\medskip\indent
In the study of a measure $\mu \ge 0$ with $\mu(\Sigma)=0$ we can arrange that
$(f\mu)({\Sigma}_{\gamma}^1)=0$, implying that $\mu ({\Sigma}^*) =0$.
\vfill\eject
\indent
{\it Folding entropy. }
\medskip\indent
Let $\mu$ be a positive measure on $M \backslash \Sigma $.
(We may also consider $\mu$ as a positive measure on $M$ such that
$\mu(\Sigma)=0$). Our assumptions imply that there is a
{\it disintegration} of $\mu$ associated with the map $f$ (see Bourbaki
[3] par.3). In general this
means that we have the integral representation
$$ \mu=\int {\mu}_1 (dx) {\nu}_x $$
where ${\mu}_1=f\mu$ is the direct image of $\mu$ by $f$, and
${\nu}_x$ is a probability measure with ${\nu}_x (f^{-1}\{x\})=1$.
This representation is essentially unique. Here we may assume that ${\nu}_x$ is atomic
(with at most $N$ atoms) and write
$$ H({\nu}_x)=-\sum_{\alpha} p_{\alpha} \hbox{log } p_{\alpha}
$$
where the $p_{\alpha}$ are the masses of the atoms of ${\nu}_x$. (In
the general case we would write $H({\nu}_x)=+\infty $ if $\nu_x$ is
nonatomic). We let now
$$ F(\mu)=F_f (\mu)=\int {\mu}_1 (dx) H({\nu}_x) $$
and call $F(\mu)$ the {\it folding entropy} of $\mu$ with respect to
$f$.
\medskip\indent
Let again $D_{\gamma}^1=fD_{\alpha}$. By the concavity of $t
\mapsto -t{\hbox {log}}t$, we have
$$ ({\mu}_1 (D_{\gamma}^1))^{-1}\int_{D_{\gamma}^1} \mu_1 (dx)
H({\nu_x}) \le -\sum_{\alpha : \gamma(\alpha)=\gamma}
{{\mu(D_{\alpha})} \over {\mu_1 (D_{\gamma}^1)}} \hbox{log}{{\mu(D_{\alpha})}
\over {\mu_1 (D_{\gamma}^1)}} $$
Therefore, when $(D_{\alpha})$ is replaced by $(D_{\alpha}^*)$ which
consists of smaller and smaller sets, the expression
$$ F^* (\mu)=\sum_{\gamma} \mu_1 (D_{\gamma}^{*1})
[-\sum_{\alpha :\gamma (\alpha) =\gamma} {{\mu(D_{\alpha}^*)} \over
{\mu_1 (D_{\gamma}^{*1})}} \hbox{log}{{\mu(D_{\alpha}^*)} \over
{\mu_1 (D_{\gamma}^{*1})}} ] $$
tends to $F(\mu)=\int \mu_1 (dx) H(\nu_x)$ from above.
\medskip\indent
{\it 2.1 Proposition. Let $P$ be the set of probability measures
on $M$ with the vague topology and
$$ I=\{ \mu \in P : \mu \hbox{ is } f-{\hbox{invariant}} \} $$
$$ P_{\backslash \Sigma}=\{ \mu \in P : \mu (\Sigma)=0 \} $$
$$ I_{\backslash \Sigma}=I \cap P_{\backslash \Sigma} $$
\indent
(a) The function $F: P_{\backslash\Sigma} \to {\bf R}$ (with values in
$[0,\log N]$) is concave upper semicontinuous (u.s.c.).
\medskip\indent
(b) The restriction of $F$ to $I_{\backslash\Sigma}$ is affine u.s.c.}
\medskip\indent
Since $H(\nu_x)$ takes values in $[0,{\hbox{log}}N]$, so does
$F$. To prove concavity we have to estimate $F$ at $\mu^\prime$,
$\mu^{\prime \prime}$ and $\mu=(1-t)\mu^\prime +t \mu^{\prime
\prime}$, with $\mu^\prime$, $\mu^{\prime \prime}\in
P_{\backslash\Sigma}$. We may choose $(D_{\alpha}^*)$ arbitrarily fine so that
$\mu^\prime (\Sigma^*)=\mu^{\prime\prime} (\Sigma^*)=0$; therefore
$F(\mu)={\hbox{lim}}{F^*}(\mu)$,
$F(\mu^\prime)={\hbox{lim}}{F^*}(\mu^\prime)$,
$F(\mu^{\prime\prime})={\hbox{lim}}{F^*}(\mu^{\prime\prime})$.
Concavity of $F$ follows from the concavity of $t \mapsto
{F^*}((1-t)\mu^\prime +t\mu^{\prime\prime})$, or the convexity of
$$ t \mapsto \sum_{\alpha}
[(1-t)u_{\alpha}+tv_{\alpha}]{\hbox{log}}{{(1-t)u_{\alpha}+tv_{\alpha}}
\over {\sum_{\beta} ((1-t)u_{\beta}+tv_{\beta})}} $$
\indent
Since $P$ is metrizable (with the vague topology), we prove
upper semicontinuity of $F$ by showing that if $\rho^{(m)}$, $\mu \in
P_{\backslash \Sigma}$, and if the sequence $(\rho^{(m)})$ tends to $\mu$,
then $F(\mu) \ge {\hbox{lim sup}}_{m\to\infty}F(\rho^{(m)})$. We may choose
$(D_{\alpha}^*)$ arbitrarily fine so that $\mu(\Sigma^*)=0$ and
$\rho^{(m)}(\Sigma^*)=0$ for all $m$; $F(\mu)$ and $F(\rho^{(m)})$ are
thus limits of $F^*(\mu)$ and $F^*(\rho^{(m)})$. Since
$\mu(\Sigma^*)=\rho^{(m)}(\Sigma^*)=0$, $F^*$ is continuous for the
vague topology on the set $S=\{\mu\}\cup\{\rho^{(m)}:m \in {\bf N}\}$,
and $F|S$ is thus the limit of a decreasing family of continuous
functions, hence upper semicontinuous. This proves (a).
\medskip\indent
To prove (b) we remark that $\mu^\prime$,
$\mu^{\prime\prime}$ are absolutely continuous with respect to
$\mu=(1-t)\mu^\prime +t \mu^{\prime\prime}$ (if $t \ne 0,1$) and let
$g^\prime={{\delta \mu^\prime} / {\delta
\mu}}$, $g^{\prime\prime}={{\delta \mu^{\prime\prime}} / {\delta
\mu}}$. If $\mu^\prime$,$\mu^{\prime\prime} \in I$ the functions $g^\prime$,
$g^{\prime\prime}$ are $f$-invariant. Therefore
$$ \mu^\prime (dy)=(g^\prime \mu)(dy)=g^\prime (y) \mu
(dy)=\int \mu_1 (dx) g^\prime (y) \nu_x (dy) $$
$$ =\int \mu_1 (dx) g^\prime (x) \nu_x (dy)=\int (g^\prime \mu)_1 (dx)
\nu_x (dy)=\int\mu_1^\prime (dx) \nu_x (dy) $$
and similarly for $g^{\prime\prime} \mu$. Therefore
$$ (1-t)F(\mu^\prime)+tF(\mu^{\prime\prime})=(1-t)\int
\mu_1^\prime (dx) H(\nu_x) +t \int \mu_1^{\prime\prime} (dx) H(\nu_x)
=\int \mu_1 (dx) H(\nu_x) =F(\mu) $$
This completes the proof of the proposition.\qed
\medskip\indent
{\it Extension}
\medskip\indent
If $P$ denotes the set of positive measures on $M$ (rather
than the probability measures), we have
$$ F(\mu) \in [0,\parallel\mu\parallel \hbox{log} N] $$
Apart from that the above proposition remains true, with the same
proof. In fact since $F(\lambda \mu)=\lambda F(\mu)$ for $\lambda \ge
0$, $\mu \ge 0$, $\mu(\Sigma)=0$, the extension of $F$ from
probability measures to positive measures is trivial.
\medskip\indent
{\it Entropy production}
\medskip\indent
We define now the {\it entropy production} $e_f (\mu)$ for a
dynamical system $(M, f)$ satisfying our standing assumptions and $\mu
\in P_{\backslash\Sigma}$ ({\it i.e.}, $\mu$ is a probability measure such that
$\mu (\Sigma)=0$). We write
$$ e_f (\mu)=F(\mu)- \mu (\hbox{log} J) $$
This definition will be motivated below, first when $\mu$ is defined
by a density, then more generally.
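As a simple consistency check (the piecewise-linear map below is our own illustrative choice, not from the text): for a map of $[0,1]$ with two full linear branches of slopes $3$ and $3/2$, Lebesgue measure $\mu$ is invariant, $\nu_x$ has atoms of mass $1/3$ and $2/3$, and the folding entropy exactly cancels $\mu(\hbox{log}J)$, so that $e_f(\mu)=0$: an invariant state absolutely continuous with respect to the volume produces no entropy.

```python
import math

# Illustrative map: two full linear branches on [0, 1],
#   f(x) = 3 x            on D_1 = (0, 1/3)    (J = 3),
#   f(x) = (3/2)(x - 1/3) on D_2 = (1/3, 1)    (J = 3/2).
# Lebesgue measure mu is f-invariant (1/3 + 2/3 = 1), and nu_x has two
# atoms of mass 1/3 and 2/3 (the inverse slopes), independently of x;
# here these masses coincide with the Lebesgue measures of the branches.
masses = [1.0 / 3.0, 2.0 / 3.0]                # atoms of nu_x
log_J = [math.log(3.0), math.log(1.5)]         # log J on each branch

F = -sum(p * math.log(p) for p in masses)      # folding entropy F(mu)
mu_log_J = sum(p * lj for p, lj in zip(masses, log_J))   # mu(log J)
e_f = F - mu_log_J

# an invariant state absolutely continuous w.r.t. volume produces
# no entropy
assert abs(e_f) < 1e-12
```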
\medskip\indent
{\it 2.2 Proposition.
\medskip\indent
(a) $e_f (\mu)$ is independent of the choice of Riemann
metric on $M$.
\medskip\indent
(b) $e_f$ is concave u.s.c. on $P_{\backslash \Sigma}$, and affine
u.s.c. on $I_{\backslash \Sigma}$.
\medskip\indent
(c) If the probability measures $\rho^{(m)}$ are absolutely
continuous with respect to Riemann volume, and tend vaguely to $\mu \in
P_{\backslash \Sigma}$ we have}
$$ {\hbox{lim sup}}_{m \to \infty} e_f (\rho^{(m)}) \le
e_f(\mu) $$
\indent
A change of Riemann metric replaces $\hbox{log}J$ by $\hbox{log}J + \Phi -\Phi \circ f$, so that $\mu (\hbox{log}J)$ and $e_f(\mu)$ are not changed.
This proves (a).
\medskip\indent
The function $K+\log J$ is $\ge 0$ and continuous on $M
\backslash \Sigma$. Let $(\chi_n)$ be an increasing sequence of
continuous functions $M \to [0,1]$, vanishing on $\Sigma$ and
tending to 1 on $M \backslash \Sigma$. Then $((K+ \log J).\chi_n)$
is an increasing sequence of continuous positive functions
tending to $K+\hbox{log}J$ on $M \backslash \Sigma$. Therefore
$$ \mu \mapsto \mu (K+\hbox{log}J)=K+\mu(\log J) $$
is affine l.s.c. on $P_{\backslash \Sigma}$, and
$$ \mu \mapsto -\mu (\hbox{log}J) $$
is affine u.s.c. on $P_{\backslash \Sigma}$. Together with
Proposition 2.1 this proves (b).
\medskip\indent
To prove (c) we note that, since vol$\Sigma=0$, we have
$\rho^{(m)}(\Sigma)=0$. It suffices then to apply (b).\qed
\medskip\indent
{\it Entropy associated with a density}
\medskip\indent
Let $\rho$ be a probability measure with density
$\underline\rho$ with respect to Riemann volume, {\it i.e.}, $\rho
(dx)={\underline\rho}(x) dx$. If $dx$ is interpreted as phase space
volume element, the statistical mechanical entropy associated with
$\rho$ is
$$ S(\underline\rho)=- \int dx {\underline \rho}(x)
\hbox{log}{\underline \rho}(x) $$
Using the concavity of the log we have
$$ S(\underline\rho)=\int dx {\underline \rho}(x) \hbox{log}{1 \over
{\underline \rho}(x)} \le \hbox{log} \int dx {{\underline \rho}(x)
\over {\underline \rho}(x)} =\hbox{log vol}M \eqno(2.1) $$
so that $S(.)$ takes values in $[-\infty, \hbox{log vol}M]$, the value
$-\infty$ being allowed.
\medskip\indent
If $\psi_\alpha$ is the inverse of $f|D_\alpha$, the direct
image $\rho_1=f\rho$ has density
$$ {\underline\rho}_1=\sum_\alpha ({\underline\rho}\circ
\psi_\alpha).(\bar J \circ \psi_\alpha) $$
where $\bar J=1/J$, and characteristic functions of the sets $f
D_\alpha$ have been omitted. Define
$$ p_\alpha (x) ={1 \over {\underline\rho}_1 (x)} \underline\rho (\psi_\alpha
x) {\bar J} (\psi_\alpha x) $$
$$ \nu_x =\sum_\alpha p_\alpha (x) \delta (\psi_\alpha x) $$
where $\delta (x)$ denotes the unit mass at $x$. Note that $f\nu_x
=\delta (x)$. We have the disintegration
$$ \rho =\int dx {\underline\rho}_1 (x) \nu_x \eqno(2.2) $$
and therefore
$$ F(\rho)=\int dx {\underline\rho}_1 (x) H(\nu_x) \eqno(2.3) $$
\medskip\indent
Note also the identity
\footnote {*}{We are applying the formula (familiar in equilibrium
statistical mechanics)
$$ \log\sum_i e^{-U(i)}=-\sum_i p_i \log p_i -\sum_i p_i U(i)
$$
with $p_i=e^{-U(i)} / \sum_j e^{-U(j)}$. }
$$ \hbox{log}{\underline\rho}_1 (x) =-\sum_\alpha p_\alpha (x)
\hbox{log} p_\alpha (x) +\sum_\alpha p_\alpha (x)
[\hbox{log}{\underline\rho}(\psi_\alpha x)+\hbox{log} {\bar J}
(\psi_\alpha x)] $$
$$ =H(\nu_x)+\nu_x (\hbox{log}{\underline\rho})+\nu_x
(\hbox{log} \bar J) $$
Therefore, using (2.3) and (2.2),
$$ -S({\underline\rho}_1)=\int dx {\underline\rho}_1(x)
\hbox{log}{\underline\rho}_1(x) $$
$$ =\int dx {\underline\rho}_1(x) H(\nu_x)
+\int dx {\underline\rho}_1(x) \nu_x(\hbox{log} \underline\rho )
+\int dx {\underline\rho}_1(x) \nu_x (\hbox{log} \bar J )
=F(\rho)+\rho(\hbox{log} \underline\rho ) +\rho(\hbox{log}
\bar J ) $$
and, if $S(\underline\rho)=-\rho(\log \underline\rho)$ is $\ne -\infty$,
$$ -[S({\underline\rho}_1)-S(\underline\rho)]=F(\rho)+\rho
(\hbox{log}{\bar J}) \eqno(2.4) $$
The right hand side has values $\le \hbox{log}N+K$ so that
$S({\underline\rho}_1) \ne -\infty$ when $S(\underline\rho) \ne -\infty$.
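\medskip\indent
The balance (2.4) can be verified numerically on a concrete example. The
sketch below (our own choice, not from the text) uses the doubling map
$f(x)=2x$ mod $1$ on $[0,1]$, with two branches, $J=2$, $\bar J=1/2$, and
inverse branches $\psi_1 x=x/2$, $\psi_2 x=(x+1)/2$; the density and grid
size are arbitrary:

```python
import math

log = math.log
n = 20000
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]

# a smooth probability density on [0,1] (midpoint sums give total mass 1)
rho = [1.0 + 0.5 * math.cos(2 * math.pi * x) for x in xs]

def S(dens):
    # S = -int rho log rho, as a Riemann sum on the grid
    return -sum(r * log(r) * dx for r in dens if r > 0)

def rho_at(x):
    # piecewise-constant representative of rho on the grid
    return rho[min(int(x / dx), n - 1)]

# push-forward under f(x) = 2x mod 1:
# rho1(x) = (1/2) [rho(x/2) + rho((x+1)/2)]
rho1 = [0.5 * (rho_at(x / 2) + rho_at((x + 1) / 2)) for x in xs]

# folding entropy F(rho) = int rho1(x) H(nu_x) dx, with
# p_alpha(x) = rho(psi_alpha x) Jbar(psi_alpha x) / rho1(x), Jbar = 1/2
F = 0.0
for x, r1 in zip(xs, rho1):
    for psi_x in (x / 2, (x + 1) / 2):
        pa = rho_at(psi_x) * 0.5 / r1
        F -= pa * log(pa) * r1 * dx

lhs = -(S(rho1) - S(rho))   # entropy decrease in one step
rhs = F - log(2)            # F(rho) + rho(log Jbar)
assert abs(lhs - rhs) < 1e-9
```

Note that here $e_f(\rho)\le 0$ for a non-invariant $\rho$; positivity is
only claimed for the stationary states $\mu$ considered below.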
\medskip\indent
{\it 2.3 Proposition. Let $S(\underline\rho) \ne -\infty$.
\medskip\indent
(a) The entropy production associated with the density
$\underline\rho$ is
$$
-[S({\underline\rho}_1)-S(\underline\rho)]=F(\rho)+\rho(\hbox{log} \bar
J)=e_f (\rho) $$
\indent
(b) If the probability measures $\rho^{(m)}$ are absolutely
continuous with respect to Riemann volume, and tend vaguely to $\mu$ such
that $\mu(\Sigma)=0$, we have}
$$ e_f(\mu) \ge \hbox{lim sup}_{m \to \infty}
[-S({\underline\rho}_1^{(m)})+S({\underline\rho}^{(m)})] $$
\indent
(a) follows from (2.4); (b) follows from (a) and Proposition
2.2(c).\qed
\medskip\indent
{\it Physical discussion. }
\medskip\indent
The above proposition is our justification for defining $e_f
(\mu)$ as the entropy production associated with $\mu \in
P_{\backslash \Sigma}$.
Note that the definition of $e_f (\mu)$ depends only on $\mu$ and $f$
and not on the choice of an approximation of $\mu$ by absolutely
continuous measures ${\mu}^{(m)}$. However we only have the
inequality
$$ e_f (\mu)=F(\mu)+\mu (\hbox{log} \bar J) \ge {\hbox {lim
sup}}_{m \to \infty} [F({\rho}^{(m)} )+{\rho}^{(m)} ({\hbox{log}}\bar J)] $$
where one might hope for an equality. The term $\mu (\log \bar J)$
poses no serious problem in this respect: if we assume
that $\hbox{log}J$ is bounded we have
$$ \mu({\hbox{log}}\bar J)={\hbox{lim}}_{m \to \infty}
{\rho}^{(m)} (\hbox{log}\bar J) $$
For the term $F(\mu)$ there might however be a discontinuity of $F$ at
$\mu$. What this means is that some mass of $\rho ^{(m)}$
gets "folded more" in the limit $\rho ^{(m)} \to \mu$. For instance $f$
might be injective on supp$\rho ^{(m)}$ but not on supp$\mu$; this
would give $F(\rho ^{(m)})=0$, but possibly $F(\mu) >0$.
\medskip\indent
Physically one should only think of $\mu$ as an idealization
of $\rho ^{(m)}$ for large $m$. When the map $f$ "folds together" some
mass of $\mu$ it almost folds together the corresponding mass of $\rho
^{(m)}$ and, in a coarse-grained description, it thus makes sense to
replace $F(\rho ^{(m)} )$ by $F(\mu)$ and to interpret the latter as the
physical folding entropy of our system.
\medskip\indent
Take ${\rho}^{(m)}=f^m \rho$ and suppose that ${\rho}^{(m)}
\to \mu \in I_{\backslash \Sigma}$. In the step between time $m$ and
time $m+1$, the entropy production is
$$ -S({\underline\rho}^{(m+1)})+S({\underline
\rho}^{(m)})=F({\rho}^{(m)})+{\rho}^{(m)}(\hbox{log}\bar J) $$
which we approximate by $F(\mu)+\mu({\hbox{log}}\bar J)$. This seems
to mean that $S({\underline\mu})$ increases by a fixed amount at each time
step, which is absurd since $\mu$ does not depend on time. In fact,
typically, $\mu$ is singular, i.e., its density $\underline\mu$ does not
exist, and we should write $S(\underline\mu)=-\infty$. We shall argue later
that the entropy production is positive; the system produces this
entropy by having its own entropy $S({\underline\rho}^{(m)})$ decrease towards
$-\infty$ when $m \to \infty$. The entropy produced is absorbed (or
transferred to the outside world) by the time evolution $f$ (i.e., by
the forces which cause the time evolution).
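\medskip\indent
The decrease of $S({\underline\rho}^{(m)})$ at a constant rate can be made
explicit on a toy example (our own, with an arbitrarily chosen map): an
injective piecewise affine map with constant Jacobian $J=1/3$, for which
$F=0$ and the entropy production per step is $\rho(\log\bar J)=\log 3$. The
push-forward of a piecewise constant density can be tracked exactly:

```python
import math

# Toy injective map with Jacobian J = 1/3 everywhere (an assumed example):
#   f(x) = x/3            on [0, 1/2)
#   f(x) = x/3 + 2/3      on [1/2, 1]
# Since f is injective, F(rho) = 0 and e_f(rho) = rho(log Jbar) = log 3.

def push(pieces):
    """Exact push-forward of a piecewise-constant density under f."""
    out = []
    for a, b, v in pieces:
        # split each piece at 1/2 and apply the affine branch to each part
        for lo, hi, shift in ((a, min(b, 0.5), 0.0),
                              (max(a, 0.5), b, 2.0 / 3.0)):
            if hi > lo:
                out.append((lo / 3 + shift, hi / 3 + shift, 3 * v))
    return out

def S(pieces):   # S = -int rho log rho, exact for piecewise-constant rho
    return -sum((b - a) * v * math.log(v) for a, b, v in pieces)

pieces = [(0.0, 1.0, 1.0)]       # uniform initial density, S = 0
S0 = S(pieces)
for m in range(1, 6):
    pieces = push(pieces)
    # entropy decreases by exactly log 3 at each step: S_m = S_0 - m log 3
    assert abs(S(pieces) - (S0 - m * math.log(3))) < 1e-12
```

Here $S({\underline\rho}^{(m)})=S({\underline\rho}^{(0)})-m\log 3\to-\infty$,
while the mass accumulates on a Cantor-like attractor carrying the singular
limit $\mu$.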
\medskip\indent
Let $\rho \mapsto \rho * \theta$ denote the action of a
stochastic diffusion operator $\theta$ close to the identity operator.
Let us replace the time evolution $\rho \mapsto f\rho$ by the "noisy
evolution" $\rho \mapsto (f\rho)*\theta$. We assume that this stochastic
evolution has a steady state $\mu_{\theta}$ tending to $\mu$ when
$\theta \to \hbox{identity}$. Here $S({\underline\mu}_\theta)$ is finite
and we can see that the entropy production is due to the diffusion
$*\theta$. We may indeed write
$$
-S({\underline\mu}^\prime)+S(\underline\mu)=S({\underline\mu}^{\prime\prime})-S({\underline\mu}^\prime) $$
where $\underline\mu$, ${\underline\mu}^\prime$, ${\underline\mu}^{\prime\prime}$ are the
densities associated respectively with
${\mu}_{\theta}$, $f{\mu}_{\theta}$, and
$(f{\mu}_{\theta})*{\theta}={\mu}_{\theta}$. The left hand side in the
above formula is our familiar expression for the entropy production,
and the right hand side is the entropy produced by the diffusion. Let
${\mu}^{(m)}$ be obtained from $\rho$ by the noisy evolution after $m$
time steps. Because ${\mu}^{(m)}$ is smeared as compared with
${\rho}^{(m)}={f^m}\rho$, we expect that the folding entropy
$F({\mu}^{(m)})$ will be close to $F({\mu}_\theta)$ or $F(\mu)$. This
is further justification for our choice of the definition $e_f (\mu)$
for the entropy production.
\medskip\indent
{\it Positivity of entropy production.}
\medskip\indent
The following result, showing that $e_f (\mu) \ge 0$ for
physically reasonable $\mu$, is close to the results obtained when $f$
is a diffeomorphism. The proof is remarkably simple.
\medskip\indent
{\it 2.4 Theorem.
Let $\rho$ be a probability measure with density
$\underline\rho$ on $M$. If $S(\underline\rho)$ is finite and if $\mu$ is a vague
limit of the measures $\rho^{(m)}=(1/m)\sum_{k=0}^{m-1} f^k \rho$ when
$m \to \infty$, then $e_f (\mu) \ge 0$.}
\medskip\indent
By Proposition 2.2(c) and 2.2(b) respectively we have
$$ e_f(\mu) \ge \hbox{lim sup}_{m \to \infty} e_f (\rho^{(m)} ) $$
$$ e_f(\rho^{(m)}) \ge {1 \over m}\sum_{k=0}^{m-1} e_f(f^k\rho) $$
\indent
Using Proposition 2.3(a), we also have
$$ \sum_{k=0}^{m-1} e_f (f^k \rho)
=-S({\underline\rho}_m)+S(\underline\rho) $$
Therefore
$$ e_f(\mu)\ge\hbox{lim sup}_{m\to\infty}{1\over
m}[-S({\underline\rho}_m)+S(\underline\rho)] $$
Since $-S({\underline\rho}_m)\ge -\hbox{log vol}M$ (by (2.1) above) we obtain
$e_f(\mu) \ge0$.\qed
\medskip\indent
{\it Alternate approach.}
\medskip\indent
Instead of our standing assumption, let us suppose that $M$ is
a compact manifold and $f:M \to M$ a $C^1$ map. One may then conjecture that
$$ h(\mu)\le F(\mu)+|\sum \hbox{negative Lyapunov exponents}| $$
when $\mu$ is an $f$-ergodic probability measure. (If our standing
assumptions hold and $f$ is piecewise $C^1$, with $\mu(\Sigma)=0$, this
can be proved along the lines of Ruelle [25]). For an SRB state $\mu$ we have
$$ h(\mu)=\sum\hbox{positive Lyapunov exponents} $$
and our conjecture implies
$$ e_f (\mu)=F(\mu)-\sum\hbox{positive Lyap.
exp.}+|\sum\hbox{negative Lyap. exp.}| $$
$$ \ge h(\mu)-\sum\hbox{positive Lyap. exp.}=0 $$
\vfill\eject
\centerline{3. Entropy production associated with diffusion.}
\bigskip
\bigskip
\indent
Let $M$ be a compact manifold, $f:M \to M$ a diffeomorphism, and
$A$ a compact $f$-invariant subset of $M$. Given a small open
neighborhood $U$ of $A$, we define
$$ U_m =\{ x:f^k x \in U \hbox{ for }k=0,...,m\} $$
Since we do not assume that the set $A$ is attracting, mass will in
general leak out of $U$, {\it i.e.}, vol $U_m \to 0$ when $m \to \infty$.
It is conjectured (see Kantz and Grassberger [19], Eckmann and Ruelle
[11], Gaspard and Nicolis [17]), that in many cases vol $U_m \approx e^{mP}$,
and the escape rate from $A$ under $f$ is (up to change of sign)
$$ P=P_{Af}={\sup}_{\rho\in\partial I_A}\{h(\rho)-\sum\hbox{positive
Lyap. exp. for } (\rho ,f)\}\le 0 $$
where $\partial I_A$ is the set of $f$-ergodic probability measures with
support in $A$.
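\medskip\indent
As an illustration of the escape rate (our own toy example, non-invertible
and hence outside the Axiom A setting), take the tripling map $f(x)=3x$
mod $1$ with the hole $(1/3,2/3)$, so $U=[0,1/3]\cup[2/3,1]$. Then
$\hbox{vol}\,U_m=(2/3)^{m+1}$ exactly, and ${1\over m}\log\hbox{vol}\,U_m\to
P=\log{2\over3}$, in agreement with $h-\lambda=\log 2-\log 3$ for the
measure of maximal entropy on the surviving Cantor set:

```python
import math

# Tripling map f(x) = 3x mod 1 with hole H = (1/3, 2/3): an illustrative
# (non-invertible) repeller.  U = [0,1/3] u [2/3,1], and U_m is the set of
# points staying in U for times 0,...,m; it is a union of ternary intervals.

def refine(intervals):
    """Replace each interval by the two of its f-preimages that lie in U."""
    return [((a + k) / 3, (b + k) / 3)
            for a, b in intervals for k in (0, 2)]

U_m = [(0.0, 1.0)]                 # start from M; one refinement gives U_0
m = 11
for _ in range(m + 1):
    U_m = refine(U_m)
vol = sum(b - a for a, b in U_m)   # vol U_m = (2/3)^(m+1) exactly
assert abs(vol - (2.0 / 3.0) ** (m + 1)) < 1e-12

# (1/m) log vol U_m -> P = log(2/3) as m -> infinity
assert abs(math.log(vol) / m - math.log(2.0 / 3.0)) < 0.05
```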
\medskip\indent
If $\chi_m$ is the characteristic function of $U_m$, let $\rho_{[m]}$
and ${\underline\rho}_{[m]}^*$ be given by
$$ \rho_{[m]}(dx)={{\chi_m (x)} \over {\hbox{vol}U_m}} dx $$
$$ (f^m \rho_{[m]})(dx)={\underline\rho}_{[m]}^* (x) dx $$
Then we may define the entropy production associated with escape from
$A$ as
$$ e_A=\lim_{m\to\infty} {1\over m}
[S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]}^*)] \eqno(3.1)$$
if this limit exists.
\medskip\indent
The next proposition deals with the Axiom A case
\footnote{*}{Smale's foundational article [27] is still a convenient
introduction to hyperbolic dynamical systems (with the definition of
Axiom A diffeomorphisms, basic sets, etc.). For further references see [11].}
, which is well understood mathematically. One may conjecture that results
obtained in that case hold much more generally, but proofs are lacking at
this time.
\medskip\indent
{\it 3.1 Proposition. Let $A$ be a basic set for the $C^2$ Axiom A
diffeomorphism $f$, and $U_m$, $P$, $\rho_{[m]}$, ${\underline\rho}_{[m]}^*$
be as above.}
\medskip\indent
(a) $\hbox{lim}_{m\to\infty} {1\over m}\hbox{log vol }U_m=P$
\medskip\indent
(b) {\it There is a unique $f$-ergodic probability measure $\mu$
on $A$ such that}
$$ h(\mu)-\sum\hbox{positive Lyap. exp. for }(\mu,f)=P $$
{\it
\medskip\indent
(c) Define}
$$ \rho^{(m)} ={1\over m} \sum_{k=0}^{m-1} f^k \rho_{[m]} $$
{\it then} $\hbox{v-lim }\rho^{(m)} =\mu$ {\it when $m\to\infty$.
\medskip\indent
(d) The limit (3.1) defining $e_A$ exists, and
$$ e_A=\lim_{m\to\infty} {1\over
m}[S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]}^*)]=-P_{Af}-\mu(\log J) $$
(where $J$ is the absolute value of the Jacobian of $f$).}
\medskip\indent
(a) can readily be extracted from Bowen and Ruelle [4] where a
slightly weaker result is proved (and flows are considered instead of
diffeomorphisms).
\medskip\indent
If $J^u$ denotes the Jacobian in the unstable direction,
$\log J^u$ is H\"older continuous on $A$, and since $f|A$ is topologically
transitive, there is a unique equilibrium state $\mu$ maximizing
$h(\mu)-\mu(\log J^u)$ (see Ruelle [23]). This proves (b).
\medskip\indent
The {\it volume lemma} of [4] establishes a close relation
between $\rho_{[m]}$ and $\mu$. In fact it follows from [4] that any
vague limit of $\rho^{(m)}$ when $m\to\infty$ is
absolutely continuous with respect to $\mu$. Such a limit is also
$f$-invariant and, since $\mu$ is ergodic, equal to $\mu$. This proves
(c).
\medskip\indent
Since ${\underline\rho}_{[m]}(x)={{\chi_m}(x)/ {\hbox{vol }U_m}}$,
we have
$$ S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]})
=\hbox{log vol }U_m-\hbox{log vol }U_0 $$
and (a) yields
$$ \lim_{m\to\infty} {1\over m}
[S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]})]=-P_{Af} $$
We also have
$$ S({\underline\rho}_{[m]})-S({\underline\rho}_{[m]}^*)
=-\int dx {\underline\rho}_{[m]} (x) \log \prod_{k=0}^{m-1} J(f^k x)
=-m \int \rho^{(m)} (dx) \log J(x) $$
hence, using (c),
$$ \lim_{m\to\infty} {1\over m}[S({\underline\rho}_{[m]})-S({\underline\rho}_{[m]}^*)]
=-\mu (\log J) $$
and (d) follows.\qed
\medskip\indent
In conclusion the entropy production $e_A$ associated with
escape from the Axiom A basic set $A$ under $f$ is
$$ e_{Af}(\mu)=-P_{Af}-\mu(\log J) $$
This may be taken as a {\it definition} of $e_{Af}(\mu)$ for all
$\mu\in I_A$, when $A$ is an $f$-invariant set, $f$ is not
necessarily an Axiom A diffeomorphism, and $I_A$ is the set of
$f$-invariant probability measures with support in $A$. Notice that
$e_{Af}(\mu)\ne e_f(\mu)$ unless $P_{Af}=0$; this corresponds to the fact
that $e_{Af}$ and $e_f$ describe different processes of entropy production
(they coincide if $A$ is an attracting set). It is readily seen that
$e_{Af}(\mu)$ is independent of the choice of Riemann metric. Here again we
shall prove positivity of the entropy production.
\medskip\indent
{\it 3.2 Proposition. Let $\mu\in\partial I_A$ satisfy the following
extension of the Pesin identity}
$$ h(\mu)-\sum\hbox{positive Lyapunov exponents}=P_{Af} $$
{\it We have then
$$ e_{Af}(\mu)\ge -P_{Af^{-1}} \ge 0 \eqno(3.2) $$}
\medskip\indent
We have indeed
$$ e_{Af}(\mu)=-h(\mu)+\sum\hbox{positive Lyap. exp. for
}(\mu,f)-\sum\hbox{Lyap. exp. for }(\mu,f) $$
$$ =-h(\mu)-\sum\hbox{negative Lyap. exp. for }(\mu,f) $$
$$ =-[h_{f^{-1}}(\mu)-\sum\hbox{positive Lyap. exp. for
}(\mu,f^{-1})] $$
$$ \ge -P_{Af^{-1}} $$
and (3.2) follows since $P_{Af^{-1}}\le 0$.\qed
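\medskip\indent
As a worked example (a linear toy model of our own, not taken from the
text), consider a linear horseshoe with two branches, uniform expansion $3$
and uniform contraction $c$; disjointness of the image strips forces
$c\le 1/2$. For the measure of maximal entropy $\mu_0$ one has
$h(\mu_0)=\log 2$, exponents $\log 3$ and $\log c$, and the extended Pesin
identity holds:

```python
import math

# Linear Smale horseshoe (assumed toy model): two branches, uniform
# expansion lam_u = 3, uniform contraction lam_s = c.  Every invariant
# measure has Lyapunov exponents log 3 and log c; the measure of maximal
# entropy mu_0 has h(mu_0) = log 2 and attains both pressure suprema.
log = math.log

def entropy_production(c):
    h = log(2)                       # h(mu_0)
    P_f = h - log(3)                 # P_Af    = h - (positive exponent of f)
    P_finv = h - abs(log(c))         # P_Af^-1 : positive exponent of f^-1 is |log c|
    mu_log_J = log(3) + log(c)       # mu_0(log J), with J = lam_u * lam_s
    e = -P_f - mu_log_J              # e_Af(mu_0)
    return e, -P_finv

# c <= 1/2 guarantees e_Af(mu_0) = |log c| - log 2 >= 0
for c in (0.5, 0.2, 0.05):
    e, lower = entropy_production(c)
    assert abs(e - lower) < 1e-12    # equality in (3.2) for this linear model
    assert e >= -1e-12               # positivity of entropy production
```

Equality holds in (3.2) here because $\mu_0$ also satisfies the Pesin
identity for $f^{-1}$.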
\medskip\indent
{\it 3.3 Remarks.}
\medskip\indent
(a) Proposition 3.2 holds without restriction, but the
interpretations of $e_{Af}(\mu)$ as entropy production and of $|P_{Af}|$
as escape rate are guaranteed only in the Axiom A case. For more
general situations such interpretations remain conjectural.
\medskip\indent
(b) In the Axiom A case $e_{Af}(\mu)=0$ implies that
$P_{Af^{-1}}=0$, {\it i.e.}, $A$ is an attractor for $f^{-1}$, and $\mu$
is the corresponding SRB measure on $A$.
\vfill\eject
\centerline{References}
\bigskip
[1] R.Artuso, E.Aurell and P.Cvitanovi\'c. "Recycling of strange sets: I.
Cycle expansions, II. Applications." Nonlinearity {\bf
3},325-359,361-386(1990).

[2] P.Billingsley. {\it Ergodic theory and information.} John Wiley, New
York, 1965.

[3] N.Bourbaki. {\it El\'ements de math\'ematiques.} {\bf VI.} {\it
Int\'egration.} Ch.6 {\it Int\'egration vectorielle}. Hermann, Paris,
1959.

[4] R.Bowen and D.Ruelle. "The ergodic theory of Axiom A flows." Invent.
Math. {\bf 29},181-202(1975).

[5] N.I.Chernov, G.L.Eyink, J.L.Lebowitz, and Ya.G.Sinai. "Steady-state
electrical conduction in the periodic Lorentz gas." Commun. Math. Phys.
{\bf 154},569-601(1993).

[6] N.I.Chernov and J.L.Lebowitz. "Stationary shear flow in boundary
driven Hamiltonian systems." Phys. Rev. Letters {\bf 75},2831-2834(1995).

[7] N.I.Chernov and J.L.Lebowitz. In preparation.

[8] P.Cvitanovi\'c, J.-P.Eckmann and P.Gaspard. "Transport properties of
the Lorentz gas in terms of periodic orbits." Chaos, Solitons and Fractals
{\bf 4}(1995).

[9] J.R.Dorfman. {\it From molecular chaos to dynamical chaos}. Lecture
notes. Maryland, 1995.

[10] J.R.Dorfman and P.Gaspard. "Chaotic scattering theory of transport and
reaction rate coefficients." Phys. Rev. {\bf E51},28(1995).

[11] J.-P.Eckmann and D.Ruelle. "Ergodic theory of strange attractors."
Rev. Mod. Phys. {\bf 57},617-656(1985).

[12] D.J.Evans, E.G.D.Cohen and G.P.Morriss. "Probability of second law
violations in shearing steady flows." Phys. Rev. Letters {\bf
71},2401-2404(1993).

[13] G.Gallavotti. "Reversible Anosov diffeomorphisms and large
deviations." Math. Phys. Electronic J. {\bf 1},1-12(1995).

[14] G.Gallavotti. "Chaotic hypothesis: Onsager reciprocity and
fluctuation-dissipation theorem." Preprint.

[15] G.Gallavotti and E.G.D.Cohen. "Dynamical ensembles in nonequilibrium
statistical mechanics." Phys. Rev. Letters {\bf 74},2694-2697(1995).

[16] G.Gallavotti and E.G.D.Cohen. "Dynamical ensembles in stationary
states." J. Statist. Phys. {\bf 80},931-970(1995).

[17] P.Gaspard and G.Nicolis. "Transport properties, Lyapunov exponents,
and entropy per unit time." Phys. Rev. Letters {\bf 65},1693-1696(1990).

[18] W.G.Hoover. {\it Molecular dynamics}. Lecture Notes in Physics {\bf
258}. Springer, Heidelberg, 1986.

[19] H.Kantz and P.Grassberger. "Repellers, semi-attractors, and
long-lived chaotic transients." Physica {\bf 17D},75-86(1985).

[20] J.L.Lebowitz. "Boltzmann's entropy and time's arrow." Physics Today
{\bf 46},No 9,32-38(1993).

[21] F.Ledrappier. "Propri\'et\'es ergodiques des mesures de Sinai."
Publ. math. IHES {\bf 59},163-188(1984).

[22] F.Ledrappier and L.S.Young. "The metric entropy of diffeomorphisms:
I. Characterization of measures satisfying Pesin's formula, II.
Relations between entropy, exponents and dimension." Ann. of Math. {\bf
122},509-539,540-574(1985).

[23] D.Ruelle. "A measure associated with Axiom A attractors." Am. J.
Math. {\bf 98},619-654(1976).

[24] D.Ruelle. "Sensitive dependence on initial condition and turbulent
behavior of dynamical systems." Ann. N.Y. Acad. Sci. {\bf 316},408-416(1978).

[25] D.Ruelle. "An inequality for the entropy of differentiable maps."
Bol. Soc. Bras. Mat. {\bf 9},83-87(1978).

[26] Ya.G.Sinai. "Gibbs measures in ergodic theory." Usp. Mat. Nauk {\bf
27},No 4,21-64(1972) [Russian Math. Surveys {\bf 27},No 4,21-69(1972)].

[27] S.Smale. "Differentiable dynamical systems." Bull. Amer. Math.
Soc. {\bf 73},747-817(1967).
\vfill\eject
\bye