1 Introduction

The modern study of Iterated Function Systems (IFS for short) goes back to the early 1980s, with the works of J. Hutchinson [20] and M. Barnsley [3]. In these papers the theory was unified from both the geometric and the analytical points of view, generating what we call today the Hutchinson-Barnsley theory for IFS. An IFS is a family of maps acting from a set to itself. Under suitable contraction hypotheses there exists an invariant compact set called the fractal attractor. Moreover, if we add weights satisfying suitable continuity hypotheses, the IFS acts on probabilities and admits an invariant probability whose support is the fractal attractor. We observe that several works on geometric features of fractals were produced in the previous decades by Mandelbrot and others, but after the 80's the IFS assumed the central role in the generation and study of fractals and their applications.

For a dynamical system given by a map \(T: X \rightarrow X\), an initial point \(x_0 \in X\) is iterated by T, producing the orbit \(\{x_0, T(x_0), T^2(x_0), \ldots \}\), whose limit or cluster points are the objects of main interest from a dynamical point of view. On the other hand, for an IFS \((X, \tau _\theta )_{\theta \in \Theta }\), we iterate the initial point by choosing at each step a possibly different map \(\tau _{\theta }: X\rightarrow X\), indexed by the generally finite set \(\Theta \), producing multiple orbits \(\{Z_j, j \ge 0\}=\{Z_0=x_0, Z_1=\tau _{\theta _0}(x_0), Z_2= \tau _{\theta _1}(\tau _{\theta _0}(x_0)), \ldots \}\). We notice that the orbit is now a family of orbits controlled by the sequence \((\theta _0, \theta _1, \ldots ) \in \Theta ^{{\mathbb {N}}}\). To avoid this complication Hutchinson defined the fractal operator \(F: K(X) \rightarrow K(X)\) by

$$\begin{aligned} F(B)=\bigcup _{\theta \in \Theta }\tau _{\theta }(B), \end{aligned}$$

for \(B \in K(X)\), where K(X) is the family of nonempty compact sets of X. This operator is called the Hutchinson-Barnsley operator, and a compact set \(\Omega \) is invariant, or fractal, if \(F(\Omega )=\Omega \). Additionally, \(\Omega \) is a fractal attractor if the orbit of B by F, given by \(\{B, F(B), F^2(B), \ldots \}\), converges to \(\Omega \) w.r.t. the Hausdorff-Pompeiu metric, for any \(B \in K(X)\) (see [5] for details on the Hausdorff-Pompeiu metric).
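
To fix ideas, here is a minimal numerical sketch (not from the paper) of the Hutchinson-Barnsley operator in action: for the three planar contractions \(\tau _j(x)=x/2+v_j/2\), with \(v_j\) the vertices of a triangle, iterating F on any nonempty compact set B produces finite sets converging to the Sierpinski triangle in the Hausdorff-Pompeiu metric.

```python
# A minimal sketch (illustration only): iterating the Hutchinson-Barnsley
# operator F(B) = tau_1(B) U tau_2(B) U tau_3(B), with tau_j(x) = x/2 + v_j/2.
import numpy as np

VERTICES = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])

def hutchinson(B):
    """One application of F: the union of the images tau_j(B)."""
    return np.vstack([B / 2 + v / 2 for v in VERTICES])

B = np.array([[0.3, 0.4]])   # an arbitrary nonempty compact set (a point)
for _ in range(10):
    B = hutchinson(B)        # F^10(B): 3**10 points near the attractor
print(B.shape)
```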

Another possible point of view for understanding the dynamics of an IFS is the probabilistic one. Suppose X is now a metric space and let \({\mathcal {P}}(X)\) be the set of Borel probability measures defined on X. In this case we consider that, at each step, the function to be iterated is chosen according to some probability, which produces a stochastic process \(\{X_{0}, X_{1}, X_{2}, \ldots \}\) where each \(X_{j+1}\in X\) is randomly chosen according to a distribution obtained from the previous \(X_{j}\) by a transition kernel using the IFS law. To illustrate this, we consider the classical case of IFS with constant probabilities studied by Hutchinson, Barnsley and many others in the early 1980s: we take \(\Theta =\{1,2,\ldots , n\}\), meaning that we have a finite number of maps, and each one is chosen according to a probability \(p_j>0\) where \(p_1+ \cdots + p_n=1\), constituting an IFS with probabilities (IFSp for short). We denote by \(C(X,\mathbb {R})\) the Banach space of all real continuous functions equipped with the supremum norm \(\Vert \; \Vert _{\infty }\). Under these conditions the classic transfer operator (also called Ruelle operator, see Ruelle [29, 30], Walters [33] and Fan [14]) is given by

$$\begin{aligned} B_{q}(f)(x)=\int _{\Theta } f(\tau _{\theta }(x)) dq_{x}(\theta )= \sum _{j=1}^{n} p_j \, f(\tau _{j}(x)), \end{aligned}$$
(1)

for any \(f \in C(X,\mathbb {R})\), where the measure \(q_{x}\) is given by \(dq_{x}(\theta )= \sum _{j=1}^{n} p_j \delta _{j}(\theta ), \; \forall x \in X\). The dual of \(B_{q}\) is a Markov operator acting on measures (see [11, 22] and [3] for details on Markov operators and their connection with IFS). The operator \({\mathcal {L}}_{q}(\mu ):=B_{q}^*(\mu )\) is implicitly defined by the property

$$\begin{aligned} \int _{X} f d{\mathcal {L}}_{q}(\mu )= \int _{X} B_{q}(f)(x) d\mu , \end{aligned}$$
(2)

for any \(f \in C(X,\mathbb {R})\). Given an initial distribution \(\mu _{0}\), we iterate it by the Markov operator \({\mathcal {L}}_{q}: {\mathcal {P}}(X) \rightarrow {\mathcal {P}}(X)\), obtaining the distributions \(\mu _{0}, \mu _{1}={\mathcal {L}}_{q}(\mu _{0}), \mu _{2}={\mathcal {L}}_{q}^2(\mu _{0}), \ldots \in {\mathcal {P}}(X)\). We have \(X_n \sim \mu _n\) for all \(n \ge 0\). Analogously to the fractal attractor, we say that \(\mu \in {\mathcal {P}}(X)\) is an invariant measure if \({\mathcal {L}}_{q}(\mu )=\mu \), and that \(\mu \in {\mathcal {P}}(X)\) is an attracting invariant measure (or Hutchinson-Barnsley measure) if \({\mathcal {L}}_{q}^j(\mu _{0})\) converges to \(\mu \) w.r.t. the Monge-Kantorovich metric (see [20]), for any \(\mu _{0}\in {\mathcal {P}}(X)\). It is possible to prove that the support of the attracting invariant measure is the fractal attractor (see [20]).

The final feature of IFS dynamics we need to understand is the connection between IFS orbits and invariant measures. The first connection is given by a celebrated result due to M. Barnsley, known as the Chaos Game Theorem (CGT for short), which claims that, from the initial probabilities \(p_j\)'s, we can build a probability \({\mathbb {P}}\) over the space \(\Theta ^{{\mathbb {N}}}\) such that for \({\mathbb {P}}\)-a.e. \((\theta _0, \theta _1, \ldots ) \in \Theta ^{{\mathbb {N}}}\) the corresponding orbit \(\{x_0, \tau _{\theta _0}(x_0), \tau _{\theta _1}(\tau _{\theta _0}(x_0)), \ldots \}\) approximates the fractal attractor \(\Omega \), for any initial point \(x_0\). The second connection is given by Elton's Ergodic Theorem (EET for short) [13], claiming that, from the initial probabilities \(p_j\)'s, we can build a probability \({\mathbb {P}}\) over the space \(\Theta ^{{\mathbb {N}}}\) such that for \({\mathbb {P}}\)-a.e. \((\theta _0, \theta _1, \ldots ) \in \Theta ^{{\mathbb {N}}}\) the asymptotic average of visits of the orbit \(\{x_0, \tau _{\theta _0}(x_0), \tau _{\theta _1}(\tau _{\theta _0}(x_0)), \ldots \}\) to a measurable set \(B \subset X\) is equal to \(\mu (B)\), provided \(\mu (\partial B)=0\), analogously to the usual Birkhoff ergodic theorem for a single map, where \(\mu \) is the invariant measure of the IFS under consideration. For continuous functions it means that

$$\begin{aligned} \frac{1}{N} \left( f(x_0)+f(\tau _{\theta _0}(x_0)) +\cdots + f(\tau _{\theta _{N-1}}( \cdots \tau _{\theta _0}(x_0)))\right) \rightarrow \int _X f d\mu , \end{aligned}$$

for any \(f \in C(X,\mathbb {R})\), as \(N \rightarrow \infty \). In other words,

$$\begin{aligned} \frac{1}{N} \left( \delta _{x_0}+\delta _{\tau _{\theta _0}(x_0)}+\cdots + \delta _{\tau _{\theta _{N-1}}( \cdots \tau _{\theta _0}(x_0))}\right) \rightarrow \mu , \end{aligned}$$

in distribution. In summary, the CGT and the EET are random procedures to approximate the fractal attractor and the invariant measure, respectively.
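
The following sketch (an illustration under the assumptions above, not code from the paper) runs the chaos game for the same three-map IFS with constant probabilities \(p_j=1/3\) and computes a Birkhoff-type time average as in the EET.

```python
# A sketch of the CGT/EET random procedures: a random orbit
# x_{k+1} = tau_{theta_k}(x_k) with theta_k i.i.d. of law p.
import numpy as np

rng = np.random.default_rng(0)
VERTICES = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
p = np.array([1 / 3, 1 / 3, 1 / 3])

x = np.array([0.2, 0.2])          # any initial point x_0
orbit = np.empty((100_000, 2))
for k in range(orbit.shape[0]):
    j = rng.choice(3, p=p)        # theta_k ~ p
    x = x / 2 + VERTICES[j] / 2   # x_{k+1} = tau_j(x_k)
    orbit[k] = x

# CGT: the orbit fills in the attractor; EET: time averages along the
# orbit approximate the integral against the invariant measure mu.
print(orbit[:, 0].mean())         # ~ 0.5 for f(x, y) = x, by symmetry
```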

The study of the conditions under which a given IFS has a fractal attractor which is the support of an invariant measure is called the Hutchinson-Barnsley theory. Such conditions have been considerably relaxed and generalized in several ways in the past forty years. A first generalization, still for \(\Theta =\{1,2,\ldots , n\}\), was for IFSp where the constant probabilities \(p_j>0\) with \(p_1+ \cdots + p_n=1\) were replaced by variable probabilities \(p_j(x)>0\) with \(p_1(x)+ \cdots + p_n(x)=1\) for all \(x \in X\). Now, the transfer operator is defined by

$$\begin{aligned} B_{q}(f)(x)=\int _{\Theta } f(\tau _{\theta }(x)) dq_{x}(\theta )= \sum _{j=1}^{n} p_j(x) \, f(\tau _{j}(x)), \end{aligned}$$
(3)

for any \(f \in C(X,\mathbb {R})\), where the measure \(q_{x}\) is given by \(dq_{x}(\theta )= \sum _{j=1}^{n} p_j(x) \delta _{j}(\theta ), \; \forall x \in X\). Very general conditions for the existence of the invariant measure for such IFS are given in [4]. We point out that the EET was also proved for variable probabilities and finitely many maps in [13].

In Fan [14], 1999, the condition \(p_1(x)+ \cdots + p_n(x)=1\) is finally dropped, assuming only that each \(p_\theta (x)\ge 0\) for \(\theta \in \{1,\ldots ,n\}\). In this work Fan studies a contractive system, which is a triplet \((X, \tau _\theta , p_\theta )_{\theta \in \{1,\ldots ,n\}}\), where each \(\tau _\theta \) is a contractive map and each \(p_\theta (x)\ge 0\) for \(\theta =1,\ldots ,n\), generalizing the notion of IFS with probabilities. In this setting, Fan proves a Ruelle-Perron-Frobenius theorem (RPF theorem, for short), meaning the existence of a positive eigenfunction for the operator \(B_{q}\) and an eigenmeasure for the dual operator \(B_{q}^*\) with the same eigenvalue, which is the spectral radius of \(B_{q}\).

The next key improvement was given by Stenflo [32], where random iterations are used to represent the iterations of a so-called IFS with probabilities, \((X, \tau _\theta , \mu )_{\theta \in \Theta }\), for an arbitrary measurable space \(\Theta \). The approach here is slightly different from the previous works on IFS with probabilities: instead of considering weights, the iterations from \(Z_0 \in X\) are \(Z_{j+1}=\tau _{I_j}(Z_{j})\), governed by a sequence of i.i.d. variables \(\{I_j \in \Theta \}_{j \in {\mathbb {N}}}\) with distribution \(\mu \), generating a Markov chain \(\{Z_{j},\; j \ge 0\}\) with transfer operator given by

$$\begin{aligned} B_{q}(f)(x)=\int _{\Theta } f(\tau _{\theta }(x)) dq_{x}(\theta )= \int _{\Theta } f(\tau _{\theta }(x)) d\mu (\theta ), \end{aligned}$$
(4)

where the measure \(q_{x}\) is given by \(dq_{x}(\theta )= d\mu (\theta ), \; \forall x \in X\), for any \(f \in C(X,\mathbb {R})\). The main goal of Stenflo [32] is to establish, when \(B_{q}\) is Feller, the existence of a unique attracting invariant measure \(\pi \) for this Markov chain.

The approach in our paper is a generalization of Stenflo [32], given by (4), and of the classical settings (1) and (3). We take a family \(q_{x}(\cdot ) \in {\mathcal {M}}(\Theta )\), indexed by \(x \in X\), generating a Markov chain with transfer operator given by

$$\begin{aligned} B_q(f)(x)=\int _{\Theta } f(\tau _\theta (x)) dq_{x}(\theta ), \end{aligned}$$

for any \(f \in C(X,\mathbb {R})\). The meaning of the distribution \(q_{x}(\cdot ) \in {\mathcal {M}}(\Theta )\) is the following: the position x of the previous iteration of the IFS determines the distribution \(q_{x}(\cdot )\) of \(\theta \), which is used to choose the function \(\tau _\theta \) and produce the new point \(\tau _\theta (x)\). When \(q_{x}=\mu \) for all \(x \in X\), that is, when the distribution is constant, we recover the setting of Stenflo [32].
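
For a concrete (hypothetical) instance of this dependence, the sketch below iterates an IFS on \(X=[0,1]\) with \(\Theta =\{0,1\}\), where the weight \(q_{x}(\{0\})\) varies continuously with the current position x; the kernel here happens to be normalized, so it is in fact an IFS with variable probabilities.

```python
# A hypothetical place-dependent kernel q_x on Theta = {0, 1}: the
# current position x determines the law used to draw the next map
# (a sketch of the general mechanism, not code from the paper).
import numpy as np

rng = np.random.default_rng(1)
taus = [lambda x: x / 2, lambda x: x / 2 + 1 / 2]   # maps on X = [0, 1]

def q_weights(x):
    """(q_x({0}), q_x({1})): weights varying continuously with x."""
    w0 = 0.25 + 0.5 * x              # stays in [0.25, 0.75]
    return np.array([w0, 1.0 - w0])  # normalized kernel

x = 0.3
for _ in range(20):
    theta = rng.choice(2, p=q_weights(x))   # theta ~ q_x
    x = taus[theta](x)                      # new point tau_theta(x)
print(x)
```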

In our setting the IFSm \((X, \tau _\theta , q)_{\theta \in \Theta }\) can be studied through the sample paths of the Markov process \(\{Z_{j},\; j \ge 0\}\) with initial distribution \(\mu _0=\mu \in {\mathcal {M}}(X)\) and \(\mu _{j+1}= \mathcal {L}_{ q}(\mu _{j})\), where for any \(\nu \in {\mathcal {M}}(X)\),

$$\begin{aligned} \int _{X} f(x) d\mathcal {L}_{ q}(\nu )(x)= \int _{X} B_q(f)(x) d\nu (x), \end{aligned}$$

for any \(f \in C(X,\mathbb {R})\). Such a degree of generality is necessary to enlarge the range of applications of the IFS theory, especially the thermodynamic formalism. In Sect. 7 we present a situation where we believe the tools developed in the previous sections can be applied when analyzing an interesting problem in economics.

Our goal is to present a complete theory of thermodynamical formalism for these IFS with measures, that is, good definitions for transfer operators, invariant measures, entropy, pressure, equilibrium measures and a variational principle. Finally, we want to use these tools to characterize the solutions of the ergodic optimization problem.

For the sake of completeness we would like to point out that we do not prove an RPF theorem for these systems, only the existence of positive eigenfunctions, but we establish all the results that can be derived if such a property is assumed. To the best of our knowledge the RPF theorem for IFSm has not been established, and it is a very hard problem. There are several works on the matter of finding IFS for which the RPF theorem holds; those IFS are said to have the RPF property. In 2009 Lopes and Oliveira [24] studied such systems, renaming them weighted systems, or IFS with weights, having the RPF property, and produced a self-contained notion of entropy and topological pressure through a variational principle for holonomic measures, allowing them to establish a thermodynamical formalism for IFS. Other approaches to IFS thermodynamic formalism were developed by Urbański [18, 25, 31] and many others.

It is worth mentioning that in Urbański et al. [18] a thermodynamic formalism for conformal infinite (countable) iterated function systems is presented, using the conformal structure via partition functions. Also, in Käenmäki [21] a thermodynamical formalism for IFS is studied with the help of cylinder functions, where a general IFS means \((X, \tau _\theta )_{\theta \in \Theta }\) with \(\Theta \) an increasing union of finite alphabets. In Lopes et al. [23] a thermodynamic formalism for shift spaces taking values in a compact metric space is presented; this problem is closely related to the thermodynamic formalism for IFS when we associate to the preimages of the shift map respective maps, producing an infinite IFS. Also, in [1] a variational principle for the specific entropy in the context of symbolic dynamics with compact metric space alphabets was developed, generalizing somewhat the results in [24].

In our work we extend the variational results in Lopes and Oliveira [24] and, more recently, the preprint Cioletti and Oliveira [8], to a general IFS called IFS with measures (IFSm), \((X, \tau _\theta , q)_{\theta \in \Theta }\), for an arbitrary compact space \(\Theta \) (see Dumitru [12] for the Hutchinson-Barnsley theory for such infinite systems or Lukawska [17] for infinite countable ones; other related articles are [7, 15, 19, 26,27,28]). We point out that we do not use partitions to construct our entropy, only an integral formulation, and the variational principle is taken over the holonomic measures, which enclose the invariant ones.

The structure of the paper is the following: in Sect. 2 we present the basic definitions on IFS with measures (IFSm) and a fundamental result about the eigenspace associated to the maximal eigenvalue of the transfer operator. We also prove the existence of a positive eigenfunction for the transfer operator associated to the spectral radius and give a constructive proof of the existence of equilibrium states. In Sect. 3 we define the Markov operator, which in the case of a normalized IFSm gives the evolution of the distribution of the associated Markov process, and show that the set of eigenmeasures for it is non-empty. In Sect. 4 we introduce holonomic measures, which play the role of invariant measures in the IFS setting. In Sect. 5 we define the entropy of an IFSm and the topological pressure of a given potential function, as well as the concept of equilibrium states. In a remark at the end of that section we show how the classical thermodynamical formalism for a dynamical system is a particular case of the IFSm thermodynamic formalism. In Sect. 6 a uniqueness result for the equilibrium states is obtained. Finally, in Sect. 7 we present a possible application in economic theory of the theory developed in the previous sections.

2 IFSm and Transfer Operator

In this section we set up the basic notation and present a fundamental result about the eigenspace associated to the maximal eigenvalue (or spectral radius) of the transfer operator.

In this paper \({\textbf {X}}\) and \(\Theta \) are compact metric spaces, equipped with their Borel \(\sigma \)-algebras \(\mathscr {B}({\textbf {X}})\) and \(\mathscr {B}(\Theta )\), respectively.

The Banach space of all real continuous functions equipped with the supremum norm is denoted by \(C({\textbf {X}},\mathbb {R})\). Its topological dual, as usual, is identified with \(\mathscr {M}_{s}({\textbf {X}})\), the space of all finite Borel signed measures endowed with the total variation norm. We use the notation \(\mathscr {M}_{1}(X)\) for the set of all Borel probability measures over X, supplied with the weak-\(*\) topology. Since we are assuming that X is a compact metric space, the topological space \(\mathscr {M}_{1}(X)\) is compact and metrizable.

Take \(\text {q}={(\text {q}_{x})}_{x\in {\textbf {X}}}\) a collection of measures on \(\mathscr {B}(\Theta )\), such that

  • (q1) \(\displaystyle \sup \text {q}\equiv \sup _{x\in X}\text {q}_{x}(\Theta ) < \infty \),

  • (q2) \(\displaystyle \inf \text {q}\equiv \inf _{x\in {\textbf {X}}}\text {q}_{x}(\Theta ) > 0\),

  • (q3) \(\displaystyle x \mapsto \text {q}_{x}(A)\) is a Borel map, i.e, is \(\mathscr {B}({\textbf {X}})\)-measurable for all fixed \(A\in \mathscr {B}(\Theta )\),

  • (q4) \(\displaystyle x \mapsto \text {q}_{x}\) is \(\hbox {weak}^{*}\)-continuous.

An Iterated Function System with measures q, IFSm for short, is a triple \(\mathcal {R}_{\text {q}}=({\textbf {X}}, \tau , \text {q})\), where \(\tau ={(\tau _{\theta })}_{\theta \in \Theta }\) is a collection of functions from \({\textbf {X}}\) to itself with the following property

  • (\(\tau 1\)) \(\tau : \Theta \times {\textbf {X}} \rightarrow {\textbf {X}}\), where \(\tau (\theta , x) = \tau _{\theta }(x)\), is continuous.

The IFSm \(\mathcal {R}_{\text {q}}\) is said to be normalized if \(\text {q}_{x}\) is a probability measure for all \(x\in {\textbf {X}}\).

Definition 2.1

Let \(\mathcal {R}_{\text {q}}=({\textbf {X}},\tau ,\text {q})\) be an IFSm. The Transfer Operator associated to \(\mathcal {R}_{\text {q}}\) is defined by:

$$\begin{aligned} \text {B}_{\text {q}}(f)(x) = \int _{\Theta }f(\tau _{\theta }(x))\,\text {d}\text {q}_{x}(\theta ), \qquad \forall x \in X. \end{aligned}$$

\(\text {B}_{\text {q}}\) is well defined. In fact, \(\text {B}_{\text {q}}\) is a bounded (hence continuous) operator, since

$$\begin{aligned} \left\Vert \text {B}_{\text {q}}(f)\right\Vert _{\infty } = \sup _{x}\left|\int \,f(\tau _{\theta }(x))\,\text {d}\text {q}_{x}(\theta )\right| \le \sup \text {q}\left\Vert f\right\Vert _{\infty } < \infty . \end{aligned}$$

Furthermore, for fixed \(f \in C({\textbf {X}},\mathbb {R})\) and \(x\in {\textbf {X}}\), given \(\varepsilon > 0\), take \(\delta >0\) such that

$$\begin{aligned} \sup _{\theta \in \Theta }\left| f(\tau _{\theta }(x))-f(\tau _{\theta }(y))\right| <\frac{\varepsilon }{2\sup \text {q}}, \end{aligned}$$

and

$$\begin{aligned} \left|\int _{\Theta }f(\tau _{\theta }(x))d\text {q}_{x}(\theta ) - \int _{\Theta }f(\tau _{\theta }(x))d\text {q}_{y}(\theta )\right| < \frac{\varepsilon }{2}, \end{aligned}$$

for all \(y\in {\textbf {X}}\) with \(d(x,y) < \delta \). Then,

$$\begin{aligned} \vert&\text {B}_{\text {q}}(f)(x) - \text {B}_{\text {q}}(f)(y)\vert = \left|\int _{\Theta }f(\tau _{\theta }(x))\,\text {d}\text {q}_{x}(\theta ) - \int _{\Theta }f(\tau _{\theta }(y))\,\text {d}\text {q}_{y}(\theta )\right|\\&\le \int _{\Theta }\left|f(\tau _{\theta }(x))-f(\tau _{\theta }(y))\right|\,\text {d}\text {q}_{x}(\theta ) +\left|\int _{\Theta }f(\tau _{\theta }(y))\,\text {d}\text {q}_{x}(\theta )- \int _{\Theta }f(\tau _{\theta }(y))\,\text {d}\text {q}_{y}(\theta )\right|\\&<\frac{\varepsilon }{2\sup \text {q}}\int _{\Theta }\,\text {d}\text {q}_{x}(\theta ) + \frac{\varepsilon }{2} = \frac{\varepsilon }{2} + \frac{\varepsilon }{2}=\varepsilon . \end{aligned}$$

This shows that, for \(f\in C({\textbf {X}},\mathbb {R})\) and \(x\in {\textbf {X}}\), given \(\varepsilon >0\), there is \(\delta >0\) such that \(\left|\text {B}_{\text {q}}(f)(x) - \text {B}_{\text {q}}(f)(y)\right| < \varepsilon \) whenever \(d(x,y)<\delta \); therefore \(\text {B}_{\text {q}}(f)\) is continuous.
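
For a finite \(\Theta \) the transfer operator reduces to a weighted sum, which makes it easy to evaluate numerically. The sketch below is an illustration only, with hypothetical maps and weights, and a kernel that is not normalized, as (q1)-(q2) allow.

```python
# A numerical sketch of the transfer operator for Theta = {0, 1}:
# B_q(f)(x) = sum_j q_x({j}) f(tau_j(x)); hypothetical maps and weights.
import numpy as np

taus = [lambda x: x / 2, lambda x: x / 2 + 1 / 2]

def q_weights(x):
    # Not normalized: only boundedness (q1) and positivity (q2) hold.
    return np.array([0.25 + 0.5 * x, 0.8])

def B_q(f):
    """Return the function x -> B_q(f)(x)."""
    return lambda x: sum(w * f(tau(x))
                         for w, tau in zip(q_weights(x), taus))

f = lambda x: np.cos(np.pi * x)
Bf = B_q(f)
print(Bf(0.0), Bf(0.5), Bf(1.0))   # B_q(f) at a few points of X = [0, 1]
```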

When dealing with the spectral radius of a positive operator, via Gelfand's formula, we will need to consider the norm of the iterates of such an operator, which is defined by a supremum attained at the constant function 1, as we will see in the proof of Theorem 2.5. For this reason we will need the following proposition:

Proposition 2.2

Let \(\mathcal {R}_{\text {q}}=({\textbf {X}}, \tau , \text {q})\) be an IFSm. Then for the N-th iterate of \(\text {B}_{\text {q}}\) we have

$$\begin{aligned} \text {B}_{\text {q}}^{N}(1)(x) = \int _{\Theta ^{N}} \,\text {dP}_{x}^{\text {q}}(\theta _{0},\ldots , \theta _{N-1}) \end{aligned}$$

where

\(\displaystyle \,\text {dP}_{x}^{\text {q}}(\theta _{0},\ldots , \theta _{N-1}) \equiv \prod _{j=1}^{N}\,\text {d}\text {q}_{x_{N-j}}(\theta _{N-j})\), \(x_0=x\) and \(x_{j+1}=\tau _{\theta _{j}}x_j\).

Proof

This expression can be obtained by a formal induction on N. For \(N=2\) and \(x=x_{0}\), we have

$$\begin{aligned} \text {B}_{\text {q}}^{2}(1)(x_{0})&=\int _{\Theta } \text {B}_{\text {q}}(1)(\tau _{\theta _{0}}(x_{0}))\,\text {d}\text {q}_{x_{0}}(\theta _{0})\\&=\int _{\Theta }\int _{\Theta }\,\text {d}\text {q}_{x_{1}}(\theta _{1})\,\text {d}\text {q}_{x_{0}}(\theta _{0})\\&=\int _{\Theta ^{2}}\,\text {dP}_{x}^{\text {q}}(\theta _{0},\theta _{1}). \end{aligned}$$

And, if

$$\begin{aligned}\text {B}_{\text {q}}^{N}(1)(x) = \int _{\Theta ^{N}}\,\text {dP}_{x}^{\text {q}}(\theta _{0},\ldots ,\theta _{N-1}),\end{aligned}$$

then

$$\begin{aligned} \text {B}_{\text {q}}^{N+1}(1)(x)&= \int _{\Theta }\text {B}_{\text {q}}^{N}(1)(x_{1})\,\text {d}\text {q}_{x_{0}}(\theta _{0})\\&=\int _{\Theta }\int _{\Theta ^{N}}\,\text {dP}_{x_{1}}^{\text {q}}(\theta _{1},\ldots ,\theta _{N})\,\text {d}\text {q}_{x_{0}}(\theta _{0})\\&=\int _{\Theta }\cdots \int _{\Theta }\left( \prod _{j=0}^{N-1}\,\text {d}\text {q}_{x_{N-j}}(\theta _{N-j})\right) \,\text {d}\text {q}_{x_{0}}(\theta _{0})\\&=\int _{\Theta }\cdots \int _{\Theta }\prod _{j=1}^{N+1}\,\text {d}\text {q}_{x_{N+1-j}}(\theta _{N+1-j})\\&=\int _{\Theta ^{N+1}}\,\text {dP}_{x}^{\text {q}}(\theta _{0},\ldots ,\theta _{N}). \end{aligned}$$

\(\square \)

Remark 2.3

The formal notation used for \(\,\text {P}_{x}^{\text {q}}\) in fact means that \(\,\text {P}_{x}^{\text {q}}\) is a measure on \(\Theta ^{N}\) defined by

$$\begin{aligned}\,\text {P}_{x}^{\text {q}}(\Theta _{0}\times \cdots \times \Theta _{N-1}) = \int _{\Theta _{0}}\cdots \int _{\Theta _{N-1}}\,\text {d}\text {q}_{x_{N-1}}(\theta _{N-1})\cdots \,\text {d}\text {q}_{x_{0}}(\theta _{0}).\end{aligned}$$

In the case \(N=2\) for instance,

$$\begin{aligned} \,\text {P}_{x}^{\text {q}}(\Theta _{0}\times \Theta _{1})&=\int _{\Theta _{0}}\int _{\Theta _{1}}\,\text {d}\text {q}_{\tau _{\theta _{0}}x}(\theta _{1})\,\text {d}\text {q}_{x}(\theta _{0})\\&= \int _{\Theta _{0}}\text {q}_{\tau _{\theta _{0}}x}(\Theta _{1})\,\text {d}\text {q}_{x}(\theta _{0}). \end{aligned}$$

Note that \(\text {q}_{\tau _{\theta _{0}}(x_{0})}(\Theta _{1})\), with fixed \(\Theta _{1}\) and \(x_{0}\), is a measurable function of \(\theta _{0}\): indeed, if \(A\in \mathscr {B}(\Theta )\), the function \(f_{A}:{\textbf {X}}\rightarrow \mathbb {R}\) defined by \(f_{A}(x) = \text {q}_{x}(A)\) is measurable by (q3), and (\(\tau 1\)) implies that \(\tau \) is measurable. Thus, \(F_{A} := f_{A}\circ \tau \) is measurable.

Proposition 2.4

If \(f:{\textbf {X}}\rightarrow \mathbb {R}\) is a measurable nonnegative function and \(\Theta _{0}\in \mathscr {B}(\Theta )\), then

$$\begin{aligned}H(x) := \int _{\Theta _{0}} f\circ \tau (\theta , x)\,\text {d}\text {q}_{x}(\theta ),\end{aligned}$$

is measurable.

Using Proposition 2.4, it is a simple induction to prove that

$$\begin{aligned} x\mapsto \,\text {P}_{x}^{\text {q}}(\Theta _{0}\times \cdots \times \Theta _{N-1})= \int _{\Theta _{0}}\,\text {P}_{\tau _{\theta _{0}}x_{0}}^{\text {q}}(\Theta _{1}\times \cdots \times \Theta _{N-1})\,\text {d}\text {q}_{x_{0}}(\theta _{0}) \end{aligned}$$

is measurable for any \(\Theta _{i}\in \mathscr {B}(\Theta )\).

In this way we conclude that \(\,\text {P}_{x}^{\text {q}}\) is well defined for each space \(\Theta ^{N}\).

Theorem 2.5

Let \(\mathcal {R}_{\text {q}}=({\textbf {X}}, \tau , \text {q})\) be an IFSm and suppose that there are a positive number \(\rho \) and a strictly positive continuous function \(h: {\textbf {X}}\rightarrow \mathbb {R}\) such that \(\text {B}_{\text {q}}(h)=\rho h\). Then the following limit exists:

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \ln \left( \text {B}_{\text {q}}^{N}(1) (x) \right) = \ln \rho \end{aligned}$$
(5)

where the convergence is uniform in x and \(\rho =\rho (\text {B}_{\text {q}})\) is the spectral radius of \(\text {B}_{\text {q}}\) acting on \(C(X,\mathbb {R})\).

Proof

From the hypothesis we can build a normalized IFSm \(\mathcal {R}_{\text {p}}=({\textbf {X}},\tau ,\text {p})\) where

$$\begin{aligned} \,\text {d}\text {p}_{x}(\theta ) = \frac{h(\tau _{\theta }(x))}{\rho h(x)}\,\text {d}\text {q}_{x}(\theta ).\end{aligned}$$

Note that \(\,\text {dP}_{x}^{\text {q}}\) and \(\,\text {dP}_{x}^{\text {p}}\) are related in the following way

$$\begin{aligned} \,\text {dP}_{x}^{\text {q}}(\theta _{0},\ldots ,\theta _{N-1})&=\prod _{j=1}^{N}\,\text {d}\text {q}_{x_{N-j}}(\theta _{N-j})\\&=\prod _{j=1}^{N}\frac{\rho h(x_{N-j})}{h(\tau _{\theta _{N-j}}(x_{N-j}))}\,\text {d}\text {p}_{x_{N-j}}(\theta _{N-j})\\&=\rho ^{N}\prod _{j=1}^{N}\frac{h(x_{N-j})}{h(x_{N-j+1})}\,\text {d}\text {p}_{x_{N-j}}(\theta _{N-j})\\&=\rho ^{N}\frac{h(x_{0})}{h(x_{N})}\prod _{j=1}^{N}\,\text {d}\text {p}_{x_{N-j}}(\theta _{N-j})\\&=\rho ^{N}\frac{h(x_{0})}{h(x_{N})}\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots ,\theta _{N-1}) \end{aligned}$$

where \(x_0=x\) and \(x_{j+1}=\tau _{\theta _{j}}x_j\).

Since \({\textbf {X}}\) is compact and h is a strictly positive continuous function, we have constants \(0<a<1<b\), independent of x, such that

$$\begin{aligned} a \le h(x_{0})/h(x_{N}) \le b.\end{aligned}$$

Using Proposition 2.2 and the above inequalities, we obtain, for any fixed \(N\in \mathbb {N}\), the following expression:

$$\begin{aligned} \frac{1}{N}\ln (\text {B}_{\text {q}}^{N}(1)(x))&= \frac{1}{N}\ln \left( \int _{\Theta ^{N}} \,\text {dP}_{x}^{\text {q}}(\theta _{0},\ldots , \theta _{N-1}) \right) \\&= \frac{1}{N}\ln \left( \int _{\Theta ^{N}} \rho ^{N}\frac{h(x_{0})}{h(x_{N})}\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots , \theta _{N-1}) \right) \\&= \ln \rho + \frac{1}{N}\ln \left( \int _{\Theta ^{N}} \frac{h(x_{0})}{h(x_{N})}\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots , \theta _{N-1}) \right) . \end{aligned}$$

Furthermore,

$$\begin{aligned} \frac{1}{N}\ln \left( \int _{\Theta ^{N}} \frac{h(x_{0})}{h(x_{N})}\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots , \theta _{N-1}) \right)&\ge \frac{1}{N}\ln \left( \int _{\Theta ^{N}} a\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots , \theta _{N-1}) \right) \\&= \frac{1}{N}\ln a + \frac{1}{N}\ln \int _{\Theta ^{N}}\,\text {dP}_{x}^{\text {p}}\\&= \frac{1}{N}\ln a \xrightarrow {\,N\rightarrow \infty \,} 0\\ \end{aligned}$$

and

$$\begin{aligned} \frac{1}{N}\ln \left( \int _{\Theta ^{N}} \frac{h(x_{0})}{h(x_{N})}\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots , \theta _{N-1}) \right)&\le \frac{1}{N}\ln \left( \int _{\Theta ^{N}} b\,\text {dP}_{x}^{\text {p}}(\theta _{0},\ldots , \theta _{N-1}) \right) \\&= \frac{1}{N}\ln b + \frac{1}{N}\ln \int _{\Theta ^{N}}\,\text {dP}_{x}^{\text {p}}\\&= \frac{1}{N}\ln b\xrightarrow {\,N\rightarrow \infty \,} 0. \end{aligned}$$

Therefore, for every \(N\ge 1\) we have

$$\begin{aligned} \sup _{x\in {\textbf {X}}}\left|\frac{1}{N}\ln \left( \text {B}_{\text {q}}^{N}(1)(x)\right) - \ln \rho \right| \le \frac{\ln b-\ln a }{N}, \end{aligned}$$

which proves (5). Now using Gelfand's formula for the spectral radius and the fact that, for a positive operator T defined on \(C(X,\mathbb {R})\), we have \(\left\Vert T\right\Vert =\left\| T(1) \right\| _{\infty }\), where \(\left\Vert T\right\Vert \) denotes the usual operator norm \(\left\Vert T\right\Vert = \sup _{\{ \Vert f\Vert _{\infty } \le 1\}} \Vert T(f) \Vert _{\infty }\), we have

$$\begin{aligned} \left|\ln \rho (\text {B}_{\text {q}}) - \ln \rho \right|&= \left|\ln \left( \limsup _{N\rightarrow \infty } \left\Vert \text {B}_{\text {q}}^{N}\right\Vert ^{\frac{1}{N}}\right) - \ln \rho \right| = \limsup _{N\rightarrow \infty } \left|\frac{1}{N} \ln \left\Vert \text {B}_{\text {q}}^{N}\right\Vert - \ln \rho \right|\\&= \limsup _{N\rightarrow \infty } \left| \frac{1}{N} \ln \left( \left\| \text {B}_{\text {q}}^{N}(1)\right\| _{\infty } \right) - \ln \rho \right|\\&\le \limsup _{N\rightarrow \infty } \ \ \sup _{x\in {\textbf {X}}} \left|\frac{1}{N} \ln \left( \text {B}_{\text {q}}^{N}(1)(x) \right) - \ln \rho \right|\\&\le \limsup _{N\rightarrow \infty } \frac{\ln b -\ln a }{N} =0. \end{aligned}$$

\(\square \)
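
The limit (5) suggests a direct numerical procedure: iterate \(\text {B}_{\text {q}}\) on the constant function 1 and track the logarithmic growth rate. The sketch below (illustration only, reusing the hypothetical two-map system above, with grid interpolation and a renormalization at each step for numerical stability) approximates \(\ln \rho (\text {B}_{\text {q}})\).

```python
# A sketch of formula (5): estimate ln(rho) as the growth rate of
# B_q^N(1), computed on a grid over X = [0, 1] (illustration only).
import numpy as np

xs = np.linspace(0.0, 1.0, 201)
taus = [lambda x: x / 2, lambda x: x / 2 + 1 / 2]
W = np.stack([0.25 + 0.5 * xs, 0.8 * np.ones_like(xs)])  # q_x({j}) on grid

def apply_Bq(f_vals):
    """One application of B_q to a function given by its grid values."""
    return sum(W[j] * np.interp(taus[j](xs), xs, f_vals) for j in range(2))

f, log_norm, N = np.ones_like(xs), 0.0, 500
for _ in range(N):
    f = apply_Bq(f)
    c = f.max()            # renormalize to avoid overflow ...
    log_norm += np.log(c)  # ... and accumulate the log of the mass
    f /= c
print(log_norm / N)        # ~ ln(rho), uniformly in x by Theorem 2.5
```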

We will now address the question of existence of positive eigenfunctions for the transfer operator associated to the spectral radius, and give a constructive proof of the existence of equilibrium states.

Let \(\mathcal {R}_{\text {q}}=({\textbf {X}}, \tau , \text {q})\) be an IFSm and assume that there is a probability \(\mu \) on \(\Theta \) such that \(\text {q}_{x} \ll \mu \) for all \(x \in {\textbf {X}}\) and that \(J:\Theta \times {\textbf {X}}\rightarrow \mathbb {R}\), defined by \(J(\theta ,x) := \frac{\,\text {d}\text {q}_{x}}{\,\text {d}\mu }(\theta )\), is a continuous function. Define \(u(\theta ,x) = \log J(\theta ,x)\), and consider a parametric family of variable discount functions \(\delta _{n}:[0,+\infty )\rightarrow \mathbb {R}\), where \(\delta _{n}(t) \rightarrow I(t) = t\) pointwise when \(n\rightarrow \infty \), together with the normalized limits \(\lim _{n}(w_{n}(x) - \max w_{n})\) of the fixed points

$$\begin{aligned} w_{n}(x) := \log \int _{\Theta }e^{u(\theta ,x)+\delta _{n}(w_{n}(\tau (\theta ,x)))}\,\text {d}\mu = \log \int _{\Theta }e^{\delta _{n}(w_{n}(\tau (\theta ,x)))}\,\text {d}\text {q}_{x}(\theta ) \end{aligned}$$

for a discounted transfer operator (see [9], Definition 3.2 for the operator definition and Theorem 3.6 for the existence of the fixed points) associated to a variable discount decision-making process. We now check that the requirements in [9] are fulfilled in our setting. Consider \(S_{n}:=({\textbf {X}},\Theta ,\Psi ,\tau ,u,\delta _{n})\), where \(\Psi (x) = \Theta \) for all \(x\in {\textbf {X}}\), and assume that the sequence \((\delta _{n})\) satisfies the admissibility conditions:

  1. the contraction modulus \(\gamma _{n}\) of \(\delta _{n}\) is also a variable discount function;

  2. \(\delta _n(0) = 0\) and \(\delta _{n}(t) \le t\) for any \(t\in (0,+\infty )\);

  3. for any fixed \(\alpha > 0\) we have \(\delta _{n}(t+\alpha ) - \delta _{n}(t) \rightarrow \alpha \) when \(n\rightarrow \infty \), uniformly in \(t > 0\).

Theorem 2.6

Let \(\mathcal {R}_{\text {q}}\) and \((\delta _{n})\) be as above, and suppose that the function u defined above satisfies:

  1. u is uniformly \(\delta \)-bounded for \((\delta _{n})\);

  2. u is uniformly \(\delta \)-dominated for \((\delta _{n})\).

Then there exists a positive and continuous eigenfunction h such that \(\text {B}_{\text {q}}(h) = \rho (\text {B}_{\text {q}})h\).

Proof

Theorem 3.28 of [9] implies that there exist \(k\in [0,\left\Vert u\right\Vert _{\infty }]\) and a continuous, positive function \(\varphi (x):=e^{h(x)}\) with

$$\begin{aligned}e^{k}\varphi (x) = \int _{\Theta }\varphi \circ \tau (\theta ,x)\,e^{u(\theta ,x)}\,\text {d}\mu (\theta ) = \text {B}_{\text {q}}(\varphi )(x), \end{aligned}$$

for all \(x\in {\textbf {X}}\). Now apply Theorem 2.5 to conclude that \(e^{k}=\rho (\text {B}_{\text {q}})\), and the theorem is proven. \(\square \)

3 Markov Operator and its Eigenmeasures

In this section we define the Markov Operator, which in the case of a normalized IFSm gives the evolution of the distribution of the associated Markov Process, and show that the set of eigenmeasures for it is non-empty.

Definition 3.1

The Markov Operator is the unique bounded linear operator satisfying

$$\begin{aligned} \int _{{\textbf {X}}} f\, \text {d}[\mathcal {L}_{ q} (\mu )] = \int _{{\textbf {X}}} \text {B}_{\text {q}}(f)\, \text {d}\mu , \end{aligned}$$

for all \(\mu \in \mathscr {M}_s(X)\) and \(f\in C({\textbf {X}},\mathbb {R})\).

In the case of a normalized IFSm, we can consider the Markov Process \(\{Z_j, j\ge 0\}\) with initial distribution \(Z_0\sim \mu _0\), where \(\mu _0 \in \mathscr {M}_{1}(X)\), and \(Z_{j+1}=\tau _{{\theta }_j}(Z_j)\) for \(j \ge 0\), where \({{\theta }_j} \sim q_{Z_j}\). Then, if \(Z_j \sim \mu _j\), we have \(\mu _{j+1}= \mathcal {L}_{\text {q}}(\mu _j)\).
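
A particle approximation makes this evolution concrete: the sketch below (illustration only, reusing the hypothetical normalized kernel from earlier) pushes a cloud of samples \(Z_j \sim \mu _j\) through one step of the chain at a time, so the empirical law of the cloud tracks \(\mu _{j}= \mathcal {L}_{\text {q}}^{j}(\mu _0)\).

```python
# A particle sketch of mu_{j+1} = L_q(mu_j) for a normalized IFSm:
# each particle moves by Z_{j+1} = tau_theta(Z_j) with theta ~ q_{Z_j}.
import numpy as np

rng = np.random.default_rng(2)

def q0(x):                      # q_x({0}); q_x({1}) = 1 - q0(x)
    return 0.25 + 0.5 * x

Z = rng.uniform(0.0, 1.0, size=5000)     # Z_0 ~ mu_0 (uniform, say)
for _ in range(50):
    pick0 = rng.random(Z.size) < q0(Z)   # theta_j ~ q_{Z_j}, vectorized
    Z = np.where(pick0, Z / 2, Z / 2 + 1 / 2)

print(Z.mean(), Z.var())   # moments of an approximation of mu_50
```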

Theorem 3.2

Let \(\mathcal {R}_{\text {q}}=({\textbf {X}}, \tau , \text {q})\) be an IFSm. Then there exists a positive number \(\rho \le \rho (\text {B}_{\text {q}})\) such that the set \( \mathcal {G}^{*}(\text {q}) = \{ \nu \in {\mathscr {M}}_1(X): {\mathcal {L}}_{ q}\nu =\rho \nu \} \) is not empty.

Proof

Notice that the mapping

$$\begin{aligned} \mathscr {M}_1(X)\ni \gamma \mapsto \frac{{\mathcal {L}}_{ q}(\gamma ) }{{\mathcal {L}}_{ q}(\gamma )(X)} \end{aligned}$$

sends \({\mathscr {M}}_1(X)\) to itself. Since \({\mathscr {M}}_1(X)\) is convex and compact in the weak-\(*\) topology, which is Hausdorff because X is a compact metric space, it follows from the continuity of \({\mathcal {L}}_{ q}\) and the Tychonoff-Schauder theorem that there is at least one probability measure \(\nu \) satisfying \({\mathcal {L}}_{ q}(\nu )=({\mathcal {L}}_{ q}(\nu )(X))\, \nu \).

We claim that

$$\begin{aligned} \inf _{x\in {\textbf {X}}}\text {q}_{x}(\Theta ) \le {\mathcal {L}}_{ q}(\gamma )({\textbf {X}}) \le \sup _{x\in {\textbf {X}}} \text {q}_{x}(\Theta ) \end{aligned}$$
(6)

for every \(\gamma \in {\mathscr {M}}_1(X)\).

Indeed,

$$\begin{aligned} \text {B}_{\text {q}}(1)(x)&= \int _{\Theta }1\,\text {d}\text {q}_{x}(\theta ) = \text {q}_{x}(\Theta ),\\ \mathcal {L}_{\text {q}}(\gamma )({\textbf {X}})&= \int _{{\textbf {X}}}1\,\text {d}[\mathcal {L}_{\text {q}}\gamma ]=\int _{{\textbf {X}}}\text {B}_{\text {q}}(1)\,\text {d}\gamma = \int _{{\textbf {X}}}\text {q}_{x}(\Theta )\,\text {d}\gamma (x),\\ 0&< \inf _{x\in {\textbf {X}}}\text {q}_{x}(\Theta ) \le \int _{{\textbf {X}}}\text {q}_{x}(\Theta )\,\text {d}\gamma (x) \le \sup _{x\in {\textbf {X}}}\text {q}_{x}(\Theta ) < \infty .\end{aligned}$$

From the inequality (6) it follows that

$$\begin{aligned} 0< \rho \equiv \sup \{ \mathcal {L}_{\text {q}}(\nu )(X): \mathcal {L}_{\text {q}}(\nu ) = (\mathcal {L}_{\text {q}}(\nu )(X))\, \nu \} < +\infty . \end{aligned}$$

By a compactness argument one can show the existence of \(\nu \in \mathscr {M}_{1}({\textbf {X}})\) such that \(\mathcal {L}_{\text {q}}\nu =\rho \nu \). Indeed, let \({(\nu _n)}_{n\in {\mathbb {N}}}\) be a sequence of such fixed points with \(\mathcal {L}_{\text {q}}(\nu _n)({\textbf {X}})\uparrow \rho \) as n goes to infinity. Since \(\mathscr {M}_1(X)\) is a compact metric space in the weak-\(*\) topology we can assume, up to a subsequence, that \(\nu _n\rightharpoonup \nu \). This convergence, together with the continuity of \(\mathcal {L}_{\text {q}}\), provides

$$\begin{aligned} \mathcal {L}_{\text {q}}\nu = \lim _{n\rightarrow \infty }\mathcal {L}_{\text {q}}\nu _n = \lim _{n\rightarrow \infty }\mathcal {L}_{\text {q}}(\nu _n)(X)\nu _n = \rho \, \nu , \end{aligned}$$

thus showing that the set \( \mathcal {G}^{*}( {q}) \equiv \{ \nu \in \mathscr {M}_1(X): \mathcal {L}_{\text {q}}\nu =\rho \, \nu \} \ne \emptyset \).

To finish the proof we observe that by using any \(\nu \in \mathcal {G}^{*}(\text {q})\), we get the following inequality

$$\begin{aligned} \rho ^N = \int _{X} \text {B}_{\text {q}}^{N}(1)\, \text {d}\nu \le \left\Vert \text {B}_{\text {q}}^{N}\right\Vert . \end{aligned}$$

From this inequality and Gelfand’s Formula, it follows that \(\rho \le \rho (B_{ {q}})\). \(\square \)
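
The normalization map \(\gamma \mapsto \mathcal {L}_{\text {q}}(\gamma )/\mathcal {L}_{\text {q}}(\gamma )({\textbf {X}})\) used in the proof also suggests a numerical scheme. The sketch below (illustration only; measures are discretized as masses on a grid with crude binning, and convergence of the iteration is assumed rather than guaranteed) iterates it for the hypothetical two-map system and reads off an approximation of \(\rho \).

```python
# A sketch of the fixed-point map in Theorem 3.2: iterate
# gamma -> L_q(gamma) / L_q(gamma)(X) on grid-discretized measures.
import numpy as np

xs = np.linspace(0.0, 1.0, 201)
taus = [lambda x: x / 2, lambda x: x / 2 + 1 / 2]
W = np.stack([0.25 + 0.5 * xs, 0.8 * np.ones_like(xs)])  # q_x({j})

m = np.ones_like(xs) / xs.size           # gamma_0: uniform masses
for _ in range(200):
    new = np.zeros_like(m)
    for j, tau in enumerate(taus):
        # L_q pushes the mass at x_i to tau_j(x_i), weighted by q_x({j})
        idx = np.clip(np.searchsorted(xs, tau(xs)), 0, xs.size - 1)
        np.add.at(new, idx, W[j] * m)
    rho_est = new.sum()                  # total mass L_q(gamma)(X)
    m = new / rho_est                    # renormalize to a probability
print(rho_est)   # at a fixed point, L_q(nu) = rho * nu with this rho
```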

4 Holonomic Measure and Disintegrations

Now we introduce holonomic measures, which play the role of invariant measures in the IFS setting.

An invariant measure for a classical dynamical system on a compact space is a measure \(\mu \) satisfying, for all \(f\in C(X,\mathbb {R})\),

$$\begin{aligned} \int _{{\textbf {X}}} f(T(x))\,\text {d}\mu = \int _{{\textbf {X}}} f(x)\,\text {d}\mu , \quad \text {equivalently}\quad \int _{{\textbf {X}}} f(T(x))-f(x)\,\text {d}\mu = 0. \end{aligned}$$

From the Ergodic Theory point of view the natural generalization of this concept for an IFS \(\mathcal {R}=({\textbf {X}}, \tau )\) is the concept of holonomy.

Consider the Cartesian product space \(\Omega \equiv {\textbf {X}}\times \Theta \) and, for each \(f\in C(X,\mathbb {R})\), its “\(\Theta \)-differential” \(\text {d}_{}f: \Omega \rightarrow \mathbb {R}\) defined by \([\text {d}_{x}f](\theta )\equiv f(\tau _{\theta } (x)) -f(x)\).

Definition 4.1

A measure \({\hat{\mu }}\) over \(\Omega \) is said to be holonomic with respect to an IFS \(\mathcal {R}\) if for all \(f\in C(X,\mathbb {R})\) we have

$$\begin{aligned} \int _{\Omega }[\text {d}_{x}f](\theta ) \, d{\hat{\mu }}(x,\theta )=0. \end{aligned}$$

Notation,

$$\begin{aligned} \displaystyle \mathcal {H}\left( \mathcal {R}\right) \equiv \{{\hat{\mu }} \, | \, {\hat{\mu }} \text { is a holonomic probability measure with respect to}\ \mathcal {R}\}. \end{aligned}$$

Since \(\Omega \) is compact, the set of all holonomic probability measures is obviously convex and compact. It is also not empty because, \(\Omega \) being compact, any average

$$\begin{aligned} {\hat{\mu }}_{N} \equiv \frac{1}{N} \sum _{j=0}^{N-1} \delta _{(x_j, {\theta }_j)}, \end{aligned}$$

where \(x_{j+1} = \tau _{{\theta }_j}(x_j)\) and \(x_0\in {\textbf {X}}\) is fixed, has its cluster points in \(\mathcal {H}\left( \mathcal {R}\right) \). Indeed, for all \(N\ge 1\) we have the following identity

$$\begin{aligned} \int _{\Omega } [\text {d}_{x}f](\theta ) \, d{\hat{\mu }}_{N}(x,\theta )&= \frac{1}{N} \sum _{j=0}^{N-1} [\text {d}_{x_{j}}f](\theta _j) = \frac{1}{N} (f(\tau _{\theta _{N-1}}(x_{N-1}))-f(x_0) ). \end{aligned}$$

From the above expression it is easy to see that if \({\hat{\mu }}\) is a cluster point of the sequence \({({\hat{\mu }}_N)}_{N\ge 1}\), then there is a subsequence \({(N_k)}_{k\in {\mathbb {N}}}\) such that

$$\begin{aligned} \int _{\Omega } [\text {d}_{x}f](\theta ) \, d{\hat{\mu }}(x,\theta )&= \lim _{k\rightarrow \infty } \int _{\Omega } [\text {d}_{x}f](\theta ) \, d{\hat{\mu }}_{N_k}(x,\theta )\\&= \lim _{k\rightarrow \infty }\frac{1}{N_{k}} (f(x_{N_k})-f(x_0) ) =0. \end{aligned}$$
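
The telescoping identity above is easy to see numerically: along any orbit, whatever the driving sequence, the integral of \([\text {d}_{x}f](\theta )\) against \({\hat{\mu }}_{N}\) equals \((f(x_N)-f(x_0))/N\). A minimal sketch (illustration only, with the hypothetical two-map system used before):

```python
# A sketch of the holonomic averages mu_hat_N: the integral of the
# Theta-differential d_x f telescopes to (f(x_N) - f(x_0)) / N -> 0.
import numpy as np

rng = np.random.default_rng(3)
taus = [lambda x: x / 2, lambda x: x / 2 + 1 / 2]
f = lambda x: np.sin(2 * np.pi * x)

x, N, total = 0.3, 10_000, 0.0
for _ in range(N):
    x_next = taus[rng.integers(2)](x)   # any driving sequence works
    total += f(x_next) - f(x)           # [d_x f](theta_j)
    x = x_next
print(total / N)    # ~ 0: cluster points of mu_hat_N are holonomic
```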

Theorem 4.2

(Disintegration). Let X and Y be compact metric spaces, \({\hat{\mu }}:\mathscr {B}(Y)\rightarrow [0,1]\) a Borel probability measure, \(T:Y \rightarrow X\) a Borel measurable function, and for each \(A\in \mathscr {B}(X)\) define a probability measure \(\mu (A)\equiv {\hat{\mu }}(T^{-1}(A))\). Then there exists a family of Borel probability measures \({(\mu _{x})}_{x \in X}\) on Y, uniquely determined \(\mu \)-a.e., such that

  1. \(\mu _{x}(Y\backslash T^{-1}(x)) = 0 \), \(\mu \)-a.e.;

  2. \(\displaystyle \int _{Y} f\, d{\hat{\mu }} = \int _{X}\left[ \int _{T^{-1}(x)} \!\!\! f(y)\ d\mu _{x}(y)\right] d\mu (x)\).

This decomposition is called the disintegration of \({\hat{\mu }}\), with respect to T.

Proof

For a proof of this theorem, see [10] p.78 or [2], Theorem 5.3.1. \(\square \)

In this paper we are interested in disintegrations in cases where Y is the Cartesian product \(\Omega \equiv {\textbf {X}}\times \Theta \) and \(T:\Omega \rightarrow {\textbf {X}}\) is the projection on the first coordinate. In such cases, if \({\hat{\mu }}\) is any Borel probability measure on \(\Omega \), it follows from the first conclusion of Theorem 4.2 that the disintegration of \({\hat{\mu }}\) provides for each \(x\in {\textbf {X}}\) a unique probability measure \(\mu _{x}\) (\(\mu \)-a.e.) supported on \(\{x\}\times \Theta \). So we can write the disintegration of \({\hat{\mu }}\) as \(\text {d}{\hat{\mu }}(x,\theta )= \text {d}\mu _{x}(\theta )\text {d}\mu (x)\), where we abuse notation by identifying \(\mu _x(\{x\}\times A)\) with \(\mu _x(A)\).

Consider any IFSm \(\mathcal {R}_{\text {q}}=({\textbf {X}},\tau ,\text {q})\) and a probability \(\nu \in {\mathscr {M}}_{1}({\textbf {X}})\) such that the Markov operator associated to the IFSm \(\mathcal {R}_{\text {q}}\) satisfies

$$\begin{aligned} \mathcal {L}_{\text {q}}(\nu ) = \nu , \end{aligned}$$

then it is possible to define a holonomic probability measure \({\hat{\mu }}\in \mathcal {H}\left( \mathcal {R}\right) \) given by \(\text {d}{\hat{\mu }}(x,\theta )= \text {d}q_{x}(\theta ) \,\text {d}\nu (x)\). Indeed, for fixed \(f \in C(X,{\mathbb {R}})\), we obtain

$$\begin{aligned} \int _{\Omega }[\text {d}_{x}f](\theta ) \,\text {d}{\hat{\mu }}(x,\theta ) = 0 \Longleftrightarrow \int _{\Omega }f(\tau _{\theta }(x)) \,\text {d}{\hat{\mu }}(x,\theta ) = \int _{\Omega } f(x) \,\text {d}{\hat{\mu }}(x,\theta ). \end{aligned}$$

But,

$$\begin{aligned} \int _{{\textbf {X}}}\int _{\Theta } f(\tau _{\theta }(x)) \,\text {d}q_{x}(\theta )\text {d}\nu (x) = \int _{{\textbf {X}}}\int _{\Theta } f(x) \,\text {d}q_{x}(\theta )\text {d}\nu (x). \end{aligned}$$

if, and only if,

$$\begin{aligned} \int _{{\textbf {X}}}\text {B}_{\text {q}}(f)\,\text {d}\nu = \int _{{\textbf {X}}} f \,d\nu , \; \forall f \in C(X,{\mathbb {R}}). \end{aligned}$$

which is equivalent to \(\mathcal {L}_{\text {q}}(\nu ) = \nu \). Since any disintegration of \({\hat{\mu }}\) has marginals \(\mu _{x}=q_x\), \(\nu \)-a.e., this Borel probability measure \({\hat{\mu }}\) on \(\Omega \) will be called a holonomic lifting of \(\nu \), with respect to \(\mathcal {R}_{\text {q}}\).

5 Entropy and Pressure for IFSm

We now define two concepts of entropy, compare them, show sufficient conditions for them to be equal, and introduce the topological pressure of a given potential, as well as the concept of equilibrium states. We show in this section a first result on the existence of equilibrium states. In all that follows, the a priori measure has a special role (see [23]).

As in the previous section, the mapping \(T:\Omega \rightarrow X\) denotes the projection on the first coordinate. Even when not explicitly mentioned, any disintegration of a probability measure \({\hat{\nu }}\) defined over \(\Omega \) will from now on be considered with respect to T.

Definition 5.1

(Variational Entropy). Let \(\mathcal {R}\) be an IFS, \({\hat{\nu }} \in \mathcal {H}\left( \mathcal {R}\right) \), \(\mu \) a probability on \(\Theta \) and \(d{\hat{\nu }}(x,\theta )=d\nu _{x}(\theta )d\nu (x)\) a disintegration of \({\hat{\nu }}\), with respect to T. The variational entropy of \({\hat{\nu }}\) with a priori probability \(\mu \) is defined by

$$\begin{aligned} h_{\text {v}}^{\mu }({\hat{\nu }}) \equiv \inf _{ \begin{array}{c} g\,\in \,C({\textbf {X}}, \mathbb {R}) \\ g>0 \end{array}} \left\{ \int _{{\textbf {X}}} \ln \frac{\text {B}_{\mu }(g)}{ g } \,\text {d}\nu \right\} . \end{aligned}$$

Definition 5.2

When \(\text {q}={(\text {q}_{x})}_{x\in {\textbf {X}}}\) is a family of measures on \(\Theta \), \(\mu \) is a probability on \(\Theta \), and \(\nu \) is a probability on \({\textbf {X}}\), we say that the family \(\text {q}\) is \(\nu \)-almost everywhere absolutely continuous with respect to \(\mu \) when \(\text {q}_{x}\ll \mu \) for \(\nu \)-almost every x in \({\textbf {X}}\), and we write \(\text {q}\ll _{\nu }\mu \).

If \(\text {q}\ll _{\nu }\mu \), we define \(J_{x}(\theta )\) such that \(J_{x} = \frac{\,\text {d}\text {q}_{x}}{\,\text {d}\mu }\) when \(\text {q}_{x}\ll \mu \) and \(J_{x}(\theta ) = 0\) otherwise.

Definition 5.3

(Average entropy). Let \(\mathcal {R}\) be an IFS, \({\hat{\nu }} \in \mathcal {H}\left( \mathcal {R}\right) \), \(d{\hat{\nu }}(x,\theta )=d\nu _{x}(\theta )d\nu (x)\) a disintegration of \({\hat{\nu }}\) with respect to T and \(\mu \) a probability on \(\Theta \) such that \((\nu _{x})\ll _{\nu }\mu \). The average entropy of \({\hat{\nu }}\) with respect to \(\mu \) is defined by

$$\begin{aligned} h_{\text {a}}^{\mu }({\hat{\nu }}) \equiv -\int _{\Omega } \ln J_{x}(\theta )\,\text {d}\nu _{x}(\theta )\,\text {d}\nu (x). \end{aligned}$$
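
For finite \(\Theta \) the average entropy is a simple double integral. As a hypothetical illustration (not from the paper), take \(\Theta =\{0,1\}\), the a priori measure \(\mu \) uniform on \(\Theta \), and a disintegration \(\nu _{x}=(w(x), 1-w(x))\); then \(J_{x}(j)= \nu _{x}(\{j\})/(1/2)\) and \(h_{\text {a}}^{\mu }({\hat{\nu }})\) can be estimated by Monte Carlo over samples of \(\nu \).

```python
# A Monte Carlo sketch of the average entropy h_a for Theta = {0, 1}
# with mu uniform, so J_x(j) = 2 * nu_x({j}) (illustration only).
import numpy as np

def h_a(w_of_x, nu_samples):
    """Estimate -int_X int_Theta ln J_x(theta) dnu_x(theta) dnu(x)."""
    w = w_of_x(nu_samples)              # nu_x({0}) at samples of nu
    return -(w * np.log(2 * w) + (1 - w) * np.log(2 * (1 - w))).mean()

rng = np.random.default_rng(4)
samples = rng.uniform(0.0, 1.0, 100_000)   # stand-in samples from nu
print(h_a(lambda x: 0.25 + 0.5 * x, samples))  # always <= 0 here
```

Since \(-\sum _j w_j \ln (2 w_j)\) is the Shannon entropy of \((w,1-w)\) minus \(\ln 2\), the estimate is indeed \(\le 0\), consistent with Theorem 5.6 below.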

Definition 5.4

(Optimal Function). Let \(\mathcal {R}\) be an IFS, \({\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) \), \(\text {d}{\hat{\nu }}(x,\theta )=\text {d}\nu _{x}(\theta )\,\text {d}\nu (x)\) a disintegration of \({\hat{\nu }}\) with respect to T and \(\mu \) a probability on \(\Theta \) such that \((\nu _{x})\ll _{\nu }\mu \). We say that a positive function \(g\in C({\textbf {X}},\mathbb {R})\) is optimal, with respect to the system of measures associated to the disintegration \(\text {d}{\hat{\nu }}(x,\theta )=\text {d}\nu _{x}(\theta )\,\text {d}\nu (x)\), if we have

$$\begin{aligned} J_{x}(\theta ) = \frac{g(\tau _{\theta }(x))}{\text {B}_{\mu }(g)(x)}. \end{aligned}$$

Proposition 5.5

Let \(\,\text {d}\text {q}_{x}(\theta ) = Q_{x}(\theta )\,\text {d}\mu (\theta )\) and \(\,\text {d}\text {p}_{x}(\theta ) = P_{x}(\theta )\,\text {d}\mu (\theta )\) be probability measures, and suppose P is continuous, positive and bounded away from zero, while Q is a positive function which is integrable with respect to \(\,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x)\). Then we have

$$\begin{aligned}-\int _{{\textbf {X}}}\int _{\Theta }\log (Q_{x}(\theta ))\,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x) \le -\int _{{\textbf {X}}}\int _{\Theta }\log (P_{x}(\theta )) \,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x).\end{aligned}$$

Proof

Using Jensen’s Inequality on \(f(x) = -x \log (x)\), which is a concave function, we have

$$\begin{aligned} \int _{{\textbf {X}}}\int _{\Theta }-\log&\left( \frac{Q_{x}(\theta )}{P_{x}(\theta )}\right) \frac{Q_{x}(\theta )}{P_{x}(\theta )}\,\text {d}\text {p}_{x}(\theta )\,\text {d}\nu (x)\\&=\int _{\Omega } f\left( \frac{Q_{x}(\theta )}{P_{x}(\theta )}\right) \,\text {d}\text {p}_{x}(\theta )\,\text {d}\nu (x)\\&\le f\left( \int _{\Omega }\frac{Q_{x}(\theta )}{P_{x}(\theta )}\,\text {d}\text {p}_{x}(\theta )\,\text {d}\nu (x)\right) = f(1) = 0, \end{aligned}$$

where the last equality uses that \(\int _{\Theta }\frac{Q_{x}}{P_{x}}\,\text {d}\text {p}_{x} = \int _{\Theta }Q_{x}\,\text {d}\mu = \text {q}_{x}(\Theta ) = 1\).

Then,

$$\begin{aligned} \int _{{\textbf {X}}}\int _{\Theta }&-\log \left( \frac{Q_{x}(\theta )}{P_{x}(\theta )}\right) \frac{Q_{x}(\theta )}{P_{x}(\theta )} P_{x}(\theta )\,\text {d}\mu (\theta )\,\text {d}\nu (x)\\ {}&= \int _{{\textbf {X}}}\int _{\Theta }-\log \left( \frac{Q_{x}(\theta )}{P_{x}(\theta )}\right) \,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x) \le 0. \end{aligned}$$

Therefore,

$$\begin{aligned} -\int _{{\textbf {X}}}\int _{\Theta }\log \left( Q_{x}(\theta )\right) \,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x) \le -\int _{{\textbf {X}}}\int _{\Theta }\log \left( P_{x}(\theta )\right) \,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x). \end{aligned}$$

\(\square \)

Theorem 5.6

Let \(\mathcal {R}\) be an IFS, \({\hat{\nu }} \in \mathcal {H}\left( \mathcal {R}\right) \), \(\text {d}{\hat{\nu }}(x,\theta )=\text {d}\nu _{x}(\theta )\,\text {d}\nu (x)\) a disintegration of \({\hat{\nu }}\) with respect to T, and \(\mu \) a probability on \(\Theta \) such that \((\nu _{x})\ll _{\nu }\mu \). Then

$$\begin{aligned} h_{\text {a}}^{\mu }({\hat{\nu }}) \le h_{\text {v}}^{\mu }({\hat{\nu }}) \le 0. \end{aligned}$$

Moreover, if there exists some optimal function \(\phi \), with respect to \({\mathcal {R}}_{ {q}}\), then

$$\begin{aligned} h_{\text {a}}^{\mu }({\hat{\nu }}) = h_{\text {v}}^{\mu }({\hat{\nu }}) = \int _{{\textbf {X}}} \ln \frac{\text {B}_{\mu }(\phi )}{\phi } \,\text {d}\nu . \end{aligned}$$

Proof

From the definition of variational entropy we obtain

$$\begin{aligned} h_{\text {v}}^{\mu }({\hat{\nu }}) = \inf _{ \begin{array}{c} g\in C({\textbf {X}}, \mathbb {R}) \\ g>0 \end{array} } \left\{ \int _{{\textbf {X}}} \ln \frac{\text {B}_{\mu }(g)}{ g } \,\text {d}\nu \right\} \le \int _{{\textbf {X}}} \ln \frac{\text {B}_{\mu }(1)}{1}\,\text {d}\nu = 0. \end{aligned}$$

It remains to show that \(h_{\text {a}}^{\mu }({\hat{\nu }}) \le h_{\text {v}}^{\mu }({\hat{\nu }})\). Let \(g:X\rightarrow \mathbb {R}\) be a continuous positive function and define, for each \(x\in X\), a probability \(\text {p}_{x}\) by \(\,\text {d}\text {p}_{x}(\theta )=[g(\tau _{\theta }(x))/\text {B}_{\mu }(g)(x)]\,\text {d}\mu (\theta )\). From Proposition 5.5 and the properties of the holonomic measures we get the following inequalities for any continuous and positive function g:

$$\begin{aligned} h_{\text {a}}^{\mu }({\hat{\nu }})&= -\int _{{\textbf {X}}}\int _{\Theta } \ln J_{x}(\theta )\,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x) \\&\le -\int _{{\textbf {X}}}\int _{\Theta } \ln \left( \frac{g\circ \tau _{\theta }}{\text {B}_{\mu }(g)}\right) \,\text {d}\text {q}_{x}(\theta )\,\text {d}\nu (x) \\&= -\int _{{\textbf {X}}} \left[ \int _{\Theta } \ln (g\circ \tau _{\theta })\,\text {d}\text {q}_{x}(\theta ) - \int _{\Theta } \ln (\text {B}_{\mu }(g))\,\text {d}\text {q}_{x}(\theta ) \right] \,\text {d}\nu (x) \\&= -\int _{{\textbf {X}}} \text {B}_{\text {q}}(\ln g) \,\text {d}\nu + \int _{{\textbf {X}}} \ln (\text {B}_{\mu }(g)) \,\text {d}\nu \\&= -\int _{{\textbf {X}}}\ln g\,\text {d}\nu + \int _{{\textbf {X}}}\ln (\text {B}_{\mu }(g))\,\text {d}\nu \\&= \int _{{\textbf {X}}} \ln \frac{\text {B}_{\mu }(g)}{g}\,\text {d}\nu , \end{aligned}$$

Therefore, \(h_{\text {a}}^{\mu }({\hat{\nu }}) \le h_{\text {v}}^{\mu }({\hat{\nu }})\). Furthermore, if \(J_{x}(\theta ) = \phi \circ \tau _{\theta }(x) / \text {B}_{\mu }(\phi )(x)\) for some positive continuous function \(\phi \), then \(h_{\text {a}}^{\mu }({\hat{\nu }}) = \int _{{\textbf {X}}} \log \frac{\text {B}_{\mu }(\phi )}{\phi }\,\text {d}\nu = h_{\text {v}}^{\mu }({\hat{\nu }})\). \(\square \)

Remark 5.7

We would like to address a very important distinction between \(h_{\text {v}}^{\mu }({\hat{\nu }})\) and \(h_{\text {a}}^{\mu }({\hat{\nu }})\). The first one, the variational entropy, is to be used in the variational principle and to define the equilibrium states, with no additional requirements except that \(\mu \) is a probability. On the other hand, the average entropy is only defined for holonomic probabilities having a marginal absolutely continuous with respect to \(\mu \), and it will not be used in a variational principle. The only role of this quantity is to be a lower bound for the variational entropy, when it exists.

Definition 5.8

Let \(\psi : {\textbf {X}}\rightarrow \mathbb {R}\) be a positive continuous function, \(\mu \) a probability on \(\Theta \), and \(\mathcal {R}_{\psi }=({\textbf {X}}, \tau , \text {q})\) an IFSm, where \(\,\text {d}\text {q}_{x}(\theta ) = \psi \circ \tau _{\theta }(x)\,\text {d}\mu (\theta )\). The topological pressure of \(\psi \), with respect to \(\mathcal {R}_{\psi }\), is defined by

$$\begin{aligned} P(\psi ) = \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\inf _{ \begin{array}{c} g\,\in \,C({\textbf {X}}, \mathbb {R}) \\ g>0 \end{array}}\left\{ \int _{{\textbf {X}}}\ln \frac{\text {B}_{\text {q}}(g)}{g}\,\text {d}\nu \right\} , \end{aligned}$$
(7)

where \(\nu := \pi _{*}{\hat{\nu }}\), for \(\pi :\Omega \rightarrow {\textbf {X}}\) the projection onto \({\textbf {X}}\).

Lemma 5.9

Let \(\psi : {\textbf {X}}\rightarrow \mathbb {R}\) be a positive continuous function and \(\mathcal {R}_{\psi } = ({\textbf {X}}, \tau , \text {q})\) be the IFSm defined above, where \(\,\text {d}\text {q}_{x}(\theta ) = \psi \circ \tau _{\theta }(x)\,\text {d}\mu (\theta )\). Then, the topological pressure of \(\psi \) is alternatively given by

$$\begin{aligned} P(\psi ) = \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) } \left\{ h_{\text {v}}^{\mu }({\hat{\nu }}) + \int _{{\textbf {X}}}\log \psi \,\text {d}\nu \right\} . \end{aligned}$$

Proof

First, note that \(\text {B}_{\text {q}}(g) = \text {B}_{\mu }(g\cdot \psi )\) where \((g\cdot \psi )(x)=g(x)\psi (x)\). In fact

$$\begin{aligned} \text {B}_{\text {q}}(g)(x) = \int _{\Theta } g\circ \tau _{\theta }(x) \psi \circ \tau _{\theta }(x)\,\text {d}\mu (\theta ) = \int _{\Theta } (g\cdot \psi )\circ \tau _{\theta }(x)\,\text {d}\mu (\theta ) = \text {B}_{\mu }(g\cdot \psi )(x). \end{aligned}$$

To finish the proof, we only need to use the pressure’s definition and some basic properties as follows:

$$\begin{aligned} P(\psi )&= \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\inf _{g> 0}\left\{ \int _{{\textbf {X}}}\ln \frac{\text {B}_{\text {q}}(g)}{g}\,\text {d}\nu \right\} ,\\&= \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\inf _{g> 0}\left\{ \int _{{\textbf {X}}}\log \psi \,\text {d}\nu - \int _{{\textbf {X}}}\log \psi \,\text {d}\nu + \int _{{\textbf {X}}}\ln \frac{\text {B}_{\text {q}}(g)}{g}\,\text {d}\nu \right\} \\&= \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\inf _{g> 0}\left\{ \int _{{\textbf {X}}}\log \psi \,\text {d}\nu + \int _{{\textbf {X}}}\ln \frac{\text {B}_{\mu }(g\cdot \psi )}{g\cdot \psi }\,\text {d}\nu \right\} \\&= \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\left\{ \int _{{\textbf {X}}}\log \psi \,\text {d}\nu + \inf _{{\tilde{g}} > 0}\int _{{\textbf {X}}}\ln \frac{\text {B}_{\mu }({\tilde{g}})}{{\tilde{g}}}\,\text {d}\nu \right\} \\&= \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\left\{ h_{\text {v}}^{\mu }({\hat{\nu }}) + \int _{{\textbf {X}}}\log \psi \,\text {d}\nu \right\} . \end{aligned}$$

\(\square \)
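
The identity \(\text {B}_{\text {q}}(g) = \text {B}_{\mu }(g\cdot \psi )\) is also easy to sanity-check numerically; the sketch below does so for a hypothetical two-map system with \(\mu \) uniform on \(\Theta =\{0,1\}\).

```python
# A numerical check (illustration only) of B_q(g) = B_mu(g * psi) when
# dq_x(theta) = psi(tau_theta(x)) dmu(theta), for Theta = {0, 1}.
import numpy as np

taus = [lambda x: x / 2, lambda x: x / 2 + 1 / 2]
mu = np.array([0.5, 0.5])               # a priori measure on Theta
psi = lambda x: 1.0 + x ** 2            # positive continuous potential
g = lambda x: np.exp(-x)

def B_mu(f, x):
    return sum(m * f(tau(x)) for m, tau in zip(mu, taus))

def B_q(f, x):                          # dq_x = psi(tau_theta(x)) dmu
    return sum(m * psi(tau(x)) * f(tau(x)) for m, tau in zip(mu, taus))

x = 0.7
print(B_q(g, x), B_mu(lambda y: g(y) * psi(y), x))   # the two agree
```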

Remark 5.10

Note that if \(\,\text {d}\text {q}_{x}(\theta ) = \frac{\psi \circ \tau _{\theta }(x)}{\text {B}_{\mu }(\psi )(x)}\,\text {d}\mu (\theta )\), then by Theorem 3.2 there exist \(\rho > 0\) and \(\nu \) such that \(\mathcal {L}_{\text {q}}(\nu ) = \rho \nu \).

But,

$$\begin{aligned} \rho&= \int _{{\textbf {X}}}\text {d}\mathcal {L}_{\text {q}}(\nu ) = \int _{{\textbf {X}}} \text {B}_{\text {q}}(1)(x) \,\text {d}\nu (x)= \int _{\Omega }\frac{\psi \circ \tau _{\theta }(x)}{\text {B}_{\mu }(\psi )(x)}\,\text {d}\mu (\theta )\,\text {d}\nu (x)\\&=\int _{{\textbf {X}}}{\text {B}_{\mu }(\psi )(x)}^{-1}\int _{\Theta }\psi \circ \tau _{\theta }(x)\,\text {d}\mu (\theta )\,\text {d}\nu (x) = \int _{{\textbf {X}}}\,\text {d}\nu = 1. \end{aligned}$$

Therefore we have

$$\begin{aligned}P(\psi ) \ge \sup _{\{\nu \, |\, \mathcal {L}_{\text {q}}(\nu ) = \nu \}}\int _{{\textbf {X}}}\ln B_{\mu }(\psi ) \,\text {d}\nu .\end{aligned}$$

Definition 5.11

(Equilibrium States). Let \(\mathcal {R}\) be an IFS, \({\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) \) and \(\mu \) a probability on \(\Theta \). Let \(\psi : {\textbf {X}}\rightarrow \mathbb {R}\) be a positive continuous function. We say that the holonomic measure \({\hat{\nu }}\) is an equilibrium state for \((\psi ,\mu )\) if

$$\begin{aligned} h_{\text {v}}^{\mu }({\hat{\nu }}) + \int _{{\textbf {X}}}\log \psi \,\text {d}\nu = P(\psi ). \end{aligned}$$

Lemma 5.12

Let \({\textbf {X}}\) and \({\textbf {Y}}\) be compact metric spaces and \(T:{\textbf {Y}}\rightarrow {\textbf {X}}\) be a continuous mapping. Then the push-forward mapping \(\Phi _{T}\equiv \Phi :{\mathscr {M}}_{1}({\textbf {Y}})\rightarrow {\mathscr {M}}_{1}({\textbf {X}})\) given by

$$\begin{aligned} \Phi ({\hat{\mu }})(A)= {\hat{\mu }}(T^{-1}(A)), \quad \text {where}\ {\hat{\mu }}\in {\mathscr {M}}_{1}({\textbf {Y}})\ \text {and} \ A\in {\mathscr {B}}({\textbf {X}}) \end{aligned}$$

is weak-\(*\) to weak-\(*\) continuous.

Proof

Since we are assuming that \({\textbf {X}}\) and \({\textbf {Y}}\) are compact metric spaces, the weak-\(*\) topologies of both \({\mathscr {M}}_{1}({\textbf {Y}})\) and \({\mathscr {M}}_{1}({\textbf {X}})\) are metrizable. Therefore it is enough to prove that \(\Phi \) is sequentially continuous. Let \(({\hat{\mu }}_n)_{n\in {\mathbb {N}}}\) be a sequence in \({\mathscr {M}}_{1}({\textbf {Y}})\) such that \({\hat{\mu }}_n\rightharpoonup {\hat{\mu }}\). For any continuous real function \(f:{\textbf {X}}\rightarrow {\mathbb {R}}\) we have, from the change of variables theorem, that

$$\begin{aligned} \int _{{\textbf {X}}} f\, d[\Phi ({\hat{\mu }}_n)] = \int _{{\textbf {Y}}} f\circ T\, d{\hat{\mu }}_n, \end{aligned}$$

for any \(n\in {\mathbb {N}}\). From the definition of the weak-\(*\) topology, it follows that the right hand side above converges when \(n\rightarrow \infty \), and we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{{\textbf {X}}} f\, d[\Phi ({\hat{\mu }}_n)] = \lim _{n\rightarrow \infty }\int _{{\textbf {Y}}} f\circ T\, d{\hat{\mu }}_n = \int _{{\textbf {Y}}} f\circ T\, d{\hat{\mu }} = \int _{{\textbf {X}}} f\, d[\Phi ({\hat{\mu }})]. \end{aligned}$$

The last equality shows that \(\Phi ({\hat{\mu }}_n)\rightharpoonup \Phi ({\hat{\mu }})\) and consequently the weak-\(*\) to weak-\(*\) continuity of \(\Phi \). \(\square \)

For any \({\hat{\nu }}\in {\mathcal {H}}({\mathcal {R}})\) it is always possible to disintegrate it as \(d{\hat{\nu }}(x,i)= d\nu _{x}(i)d[\Phi ({\hat{\nu }})](x)\), where \(\Phi ({\hat{\nu }})\equiv \nu \) is the probability measure on \({\mathscr {B}}({\textbf {X}})\), defined for any \(A\in {\mathscr {B}}({\textbf {X}})\) by

$$\begin{aligned} \nu (A)\equiv \Phi ({\hat{\nu }})(A) \equiv {\hat{\nu }}(T^{-1}(A)), \end{aligned}$$
(8)

where \(T:\Omega \rightarrow {\textbf {X}}\) is the canonical projection of the first coordinate. This observation together with the previous lemma allow us to define a continuous mapping from \({\mathcal {H}}({\mathcal {R}})\) to \({\mathscr {M}}_{1}({\textbf {X}})\) given by \({\hat{\nu }}\longmapsto \Phi ({\hat{\nu }})\equiv \nu \).
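For instance (an illustrative special case), when \(\Theta =\{1,\ldots ,n\}\) is finite the disintegration simply says that, for every \(f\in C(\Omega ,{\mathbb {R}})\),

$$\begin{aligned} \int _{\Omega } f(x,i)\, d{\hat{\nu }}(x,i) = \int _{{\textbf {X}}} \sum _{i=1}^{n} f(x,i)\, \nu _{x}(i)\, d\nu (x), \end{aligned}$$

where each \(\nu _{x}=(\nu _{x}(1),\ldots ,\nu _{x}(n))\) is a probability vector depending measurably on \(x\).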

We now prove a theorem ensuring the existence of equilibrium states for any positive continuous function \(\psi \). Although this theorem has a clear and elegant proof and works in great generality, it has the disadvantage of providing no description of the set of equilibrium states.

Theorem 5.13

(Existence of Equilibrium States). Let \({\mathcal {R}}\) be an IFS, \(\psi :{\textbf {X}}\rightarrow {\mathbb {R}}\) a positive continuous function and \(\mu \) a probability on \(\Theta \). Then the set of equilibrium states for \((\psi ,\mu )\) is not empty.

Proof

As we observed above we can define a weak-\(*\) to weak-\(*\) continuous mapping

$$\begin{aligned} {\mathcal {H}}({\mathcal {R}})\ni {\hat{\nu }}\longmapsto \nu \in {\mathscr {M}}_{1}({\textbf {X}}), \end{aligned}$$

where \(d{\hat{\nu }}(x,i)=d\nu _{x}(i)d\nu (x)\) is the disintegration of \({\hat{\nu }}\) constructed above. From this observation, it follows that for any fixed positive continuous \(g\) the mapping \( {\mathcal {H}}({\mathcal {R}})\ni {\hat{\nu }} \longmapsto \int _{\textbf {X}}\log (\text {B}_{\mu }(g)/ g) \, d\nu \) is continuous with respect to the weak-\(*\) topology. Therefore the mapping

$$\begin{aligned} {\mathcal {H}}({\mathcal {R}})\ni {\hat{\nu }} \longmapsto \inf _{ \begin{array}{c} g\in C({\textbf {X}}, {\mathbb {R}}) \\ g>0 \end{array} } \left\{ \int _{{\textbf {X}}} \log \frac{\text {B}_{\mu }(g)}{g}\, d\nu \right\} \equiv h_{\text {v}}^{\mu }({\hat{\nu }}) \end{aligned}$$

is upper semi-continuous (USC), being an infimum of weak-\(*\) continuous functionals; by standard results, the following mapping is then also USC:

$$\begin{aligned} {\mathcal {H}}({\mathcal {R}})\ni {\hat{\nu }} \longmapsto h_{\text {v}}^{\mu }({\hat{\nu }})+ \int _{{\textbf {X}}} \log ( \psi (x))\, d\nu (x). \end{aligned}$$

Since \({\mathcal {H}}({\mathcal {R}})\) is compact in the weak-\(*\) topology and the above mapping is USC, it follows that this mapping attains its supremum at some \({\hat{\mu }}\in {\mathcal {H}}({\mathcal {R}})\) (whose projection we denote by \(\mu \), not to be confused with the a priori probability on \(\Theta \)), i.e.,

$$\begin{aligned} \sup _{{\hat{\nu }} \in {\mathcal {H}}({\mathcal {R}})} \left\{ h_{\text {v}}^{\mu }({\hat{\nu }}) + \int _{{\textbf {X}}} \log \psi \, d\nu \right\} =h_{\text {v}}^{\mu }({\hat{\mu }}) + \int _{{\textbf {X}}} \log \psi \, d\mu , \end{aligned}$$

thus proving the existence of at least one equilibrium state. \(\square \)

Let \(\psi : {\textbf {X}}\rightarrow \mathbb {R}\) be a positive continuous function, \(\mu \) a probability on \(\Theta \), and \(\mathcal {R}_{\text {q}}=({\textbf {X}}, \tau , \text {q})\) an IFSm, where \(\,\text {d}\text {q}_{x}(\theta ) = \psi \circ \tau _{\theta }(x)\,\text {d}\mu (\theta )\). Suppose there is a positive continuous function \(h\) such that \(\text {B}_{\text {q}}(h) = \text {B}_{\mu }(h\cdot \psi ) = \rho (\text {B}_{\text {q}})h\). Then we can define, following Definition 5.4, \(\mathcal {R}_{\text {p}}=({\textbf {X}},\tau ,\text {p})\) where

$$\begin{aligned} \frac{\,\text {d}\text {p}_{x}}{\,\text {d}\mu }(\theta ) := \frac{(h\cdot \psi )\circ \tau _{\theta }(x)}{\text {B}_{\mu }(h\cdot \psi )(x)} = \frac{(h\cdot \psi )\circ \tau _{\theta }(x)}{\rho (\text {B}_{\text {q}})h(x)} = \frac{h\circ \tau _{\theta }(x)}{\rho (\text {B}_{\text {q}})h(x)}\cdot \frac{\,\text {d}\text {q}_{x}}{\,\text {d}\mu }(\theta ). \end{aligned}$$
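Note that each \(\text {p}_{x}\) is indeed a probability on \(\Theta \): integrating the density above against \(\mu \) gives

$$\begin{aligned} \int _{\Theta }\frac{(h\cdot \psi )\circ \tau _{\theta }(x)}{\text {B}_{\mu }(h\cdot \psi )(x)}\,\text {d}\mu (\theta ) = \frac{\text {B}_{\mu }(h\cdot \psi )(x)}{\text {B}_{\mu }(h\cdot \psi )(x)} = 1. \end{aligned}$$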

The IFSm \(\mathcal {R}_{\text {p}}\) is called the normalization of \(\mathcal {R}_{\text {q}}\). Let \(\nu \) be such that \(\mathcal {L}_{\text {p}}(\nu ) = \nu \) and let \({\hat{\nu }}\) be the holonomic lifting of \(\nu \). Then by Theorem 5.6 we know that

$$\begin{aligned} h_{\text {v}}^{\mu }({\hat{\nu }}) = \int _{{\textbf {X}}}\log \frac{\text {B}_{\mu }(h\cdot \psi )}{h\cdot \psi }\,\text {d}\nu = \log \rho (\text {B}_{\text {q}}) - \int _{{\textbf {X}}}\log \psi \,\text {d}\nu . \end{aligned}$$

Then, choosing this \({\hat{\nu }}\) as a particular measure in the supremum given in Lemma 5.9, we get \(P(\psi ) \ge h_{\text {v}}^{\mu }({\hat{\nu }}) + \int _{{\textbf {X}}}\log \psi \,\text {d}\nu = \log \rho (\text {B}_{\text {q}})\).

On the other hand, recall that the pressure, defined in expression (7), is

$$\begin{aligned} P(\psi ) = \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) }\inf _{ \begin{array}{c} g\in C({\textbf {X}},\mathbb {R}) \\ g > 0 \end{array} }\left\{ \int _{{\textbf {X}}}\log \frac{\text {B}_{\text {q}}(g)}{g}\,\text {d}\nu \right\} , \end{aligned}$$

hence,

$$\begin{aligned} \inf _{ \begin{array}{c} g\in C({\textbf {X}},\mathbb {R}) \\ g > 0 \end{array} }\left\{ \int _{{\textbf {X}}}\log \frac{\text {B}_{\text {q}}(g)}{g}\,\text {d}\nu \right\} \le \int _{{\textbf {X}}}\log \frac{\text {B}_{\text {q}}(h)}{h}\,\text {d}\nu = \log \rho (\text {B}_{\text {q}}). \end{aligned}$$

Taking the supremum over \(\mathcal {H}\left( \mathcal {R}\right) \) on both sides of the above inequality, we get \(P(\psi )\le \log \rho (\text {B}_{\text {q}})\). Since the reverse inequality was already shown, we conclude that

$$\begin{aligned} P(\psi ) = \log \rho (\text {B}_{\text {q}}). \end{aligned}$$
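The identity \(P(\psi ) = \log \rho (\text {B}_{\text {q}})\) can be explored numerically. The sketch below is our illustration; the IFSm, the grid and the potential are hypothetical choices, not taken from the text. It discretizes \({\textbf {X}}=[0,1]\), takes the two affine contractions \(\tau _1(x)=x/2\) and \(\tau _2(x)=(x+1)/2\), the uniform a priori measure \(\mu \) on \(\Theta =\{1,2\}\) and \(\psi (x)=e^{x}\), and approximates \(\rho (\text {B}_{\text {q}})\) by power iteration, so that \(P(\psi )\approx \log \rho \).

```python
import numpy as np

# Hypothetical IFSm: X = [0,1], tau_1(x) = x/2, tau_2(x) = (x+1)/2,
# mu uniform on Theta = {1, 2}, psi(x) = exp(x).
xs = np.linspace(0.0, 1.0, 501)            # grid discretization of X
psi = np.exp
taus = [lambda x: 0.5 * x, lambda x: 0.5 * x + 0.5]
weights = [0.5, 0.5]                       # a priori measure mu

def B_q(g):
    """B_q(g)(x) = sum_theta mu(theta) * psi(tau_theta(x)) * g(tau_theta(x)),
    with g known on the grid and evaluated by linear interpolation."""
    out = np.zeros_like(xs)
    for w, tau in zip(weights, taus):
        y = tau(xs)
        out += w * psi(y) * np.interp(y, xs, g)
    return out

# Power iteration: for this positive operator the normalized iterates converge
# to the eigenfunction h, and the normalization factors converge to rho(B_q).
g = np.ones_like(xs)
for _ in range(500):
    Bg = B_q(g)
    rho = Bg.max()
    g = Bg / rho

print("rho(B_q) ~", rho, "  P(psi) ~ log(rho) =", np.log(rho))
```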

Remark 5.14

We end this section by showing that the IFSm Thermodynamic Formalism generalizes the Thermodynamical Formalism for a dynamical system. Let \(\Theta \) be a compact metric space and \({\textbf {X}}= \Theta ^\mathbb {N}\). For each \(\theta \in \Theta \) define \(\sigma _\theta (x_1, x_2, \ldots ) = (\theta , x_1, x_2, \ldots )\), the inverse branches of the right shift \(\sigma \). Take an a priori probability \(\mu \) on \(\Theta \). Let \(\psi : {\textbf {X}}\rightarrow \mathbb {R}\) be a positive potential and \(A = \log \psi \). Now we define \(\,\text {d}\text {q}_x(\theta ) = e^{A\circ \sigma _\theta (x)}\,\text {d}\mu (\theta )\). We have

$$\begin{aligned} \text {B}_{\text {q}}(\phi )(x) = \int _\Theta e^{A\circ \sigma _\theta (x)}\phi \circ \sigma _\theta (x)\,\text {d}\mu (\theta ) = L_A(\phi )(x), \end{aligned}$$

where \(L_A\) is the Ruelle Operator for the right shift \(\sigma \) and the potential A (see [23] for more details). By Definition 5.8,

$$\begin{aligned} P(\psi )&= \sup _{{\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) } \inf _{g> 0} \left\{ \int _{{\textbf {X}}}\log \frac{\text {B}_{\text {q}}(g)}{g}\,\text {d}\nu \right\} \\&= \sup _{\nu \in {\mathcal {M}}_\sigma } \inf _{g > 0} \left\{ \int _{{\textbf {X}}}\log \frac{L_{A}(g)}{g}\,\text {d}\nu \right\} . \end{aligned}$$

The last expression is exactly the pressure of the potential \(A\) in Thermodynamical Formalism. Suppose there is a positive continuous function \(\phi _A\) such that \(\text {B}_{\text {q}}(\phi _A) = L_{A}(\phi _A) = \lambda _A \phi _A\), where \(\lambda _A = \rho (\text {B}_{\text {q}})\). Then \(P(e^A) = \log \lambda _A\). For instance, we know that if \(A\) is Hölder, then such a function \(\phi _A > 0\) exists. From this example, we can see that the IFSm Thermodynamic Formalism, in a certain sense, generalizes the Thermodynamical Formalism for a dynamical system: when we look at the family \(\{\sigma _\theta \}_{\theta \in \Theta }\) of functions, we are looking at the inverse branches of the dynamical system.
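As a simple illustration (our example, under the extra assumption that the potential depends only on the first coordinate): if \(A(x)=f(x_1)\) for some continuous \(f:\Theta \rightarrow \mathbb {R}\), then \(A\circ \sigma _\theta (x)=f(\theta )\) for every \(x\), so

$$\begin{aligned} L_A(1)(x) = \int _{\Theta } e^{f(\theta )}\,\text {d}\mu (\theta ), \end{aligned}$$

a constant. Hence \(\phi _A\equiv 1\) is an eigenfunction, \(\lambda _A = \int _{\Theta } e^{f(\theta )}\,\text {d}\mu (\theta )\), and \(P(e^A) = \log \int _{\Theta } e^{f(\theta )}\,\text {d}\mu (\theta )\).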

6 Pressure Differentiability and Equilibrium States

In this section we prove a uniqueness result for the equilibrium states introduced in the previous section. To do so, we consider the functional \(p:C({\textbf {X}},{\mathbb {R}})\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} p(\varphi ) = P(\exp (\varphi )). \end{aligned}$$
(9)

It is immediate to verify that \(p\) is a convex and finite-valued functional. We say that a Borel signed measure \(\nu \in {\mathscr {M}}_{s}({\textbf {X}})\) is a subgradient of \(p\) at \(\varphi \) if it satisfies the following subgradient inequality:

$$\begin{aligned} p(\eta )\ge p(\varphi )+\nu (\eta -\varphi ) \;\; \text{ for } \text{ any }\;\; \eta \in C({\textbf {X}},{\mathbb {R}}). \end{aligned}$$

The set of all subgradients at \(\varphi \) is called the subdifferential of \(p\) at \(\varphi \) and denoted by \(\partial p(\varphi )\). It is well known that if \(p\) is continuous, then \(\partial p(\varphi )\ne \emptyset \) for any \(\varphi \in C({\textbf {X}},{\mathbb {R}})\).

We observe that for any pair \(\varphi ,\eta \in C({\textbf {X}},{\mathbb {R}})\) and \(0<t<s\), the following inequality follows from the convexity of \(p\):

$$\begin{aligned} s( p(\varphi +t\eta )-p(\varphi ))\le t(p(\varphi +s\eta )-p(\varphi )). \end{aligned}$$

In particular, the one-sided directional derivative \(d^{+}p(\varphi ):C({\textbf {X}},{\mathbb {R}})\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} d^{+}p(\varphi )(\eta ) = \lim _{t\downarrow 0} \frac{p(\varphi +t\eta )-p(\varphi )}{t} \end{aligned}$$

is well-defined for any \(\varphi \in C({\textbf {X}},{\mathbb {R}})\); indeed, the above inequality shows that the difference quotient is non-decreasing in \(t\) and bounded below, so the limit exists as \(t\downarrow 0\).

Theorem 6.1

For any fixed \(\varphi \in C({\textbf {X}},{\mathbb {R}})\) we have

  1. the signed measure \(\nu \in \partial p(\varphi )\) iff \(\nu (\eta )\le d^{+}p(\varphi )(\eta )\) for all \(\eta \in C({\textbf {X}},{\mathbb {R}})\);

  2. the set \(\partial p(\varphi )\) is a singleton iff \(d^{+}p(\varphi )\) is the Gâteaux derivative of \(p\) at \(\varphi \).

Proof

This theorem is a consequence of Theorem 7.16 and Corollary 7.17 in [6]. \(\square \)

Corollary 6.2

Let \(\mathcal {R}\) be an IFS, \(\psi :{\textbf {X}}\rightarrow {\mathbb {R}}\) a positive continuous function, \(\mu \) a probability on \(\Theta \), and \(\Phi ({\hat{\nu }}) = \nu \) for \({\hat{\nu }}\in \mathcal {H}\left( \mathcal {R}\right) \), where \(\nu \) is given by disintegration with respect to \(T\). If the functional \(p\) defined in (9) is Gâteaux differentiable at \(\varphi \equiv \log \psi \), then

$$\begin{aligned} \# \{\Phi ({\hat{\mu }}):\ {\hat{\mu }}\ \text {is an equilibrium state for}\ \psi \} =1. \end{aligned}$$

Proof

Suppose that \({\hat{\mu }}\) is an equilibrium state for \(\psi \) and write \(\mu = \Phi ({\hat{\mu }})\). Then we have from the definition of the pressure that

$$\begin{aligned} p(\varphi +t\eta ) -p(\varphi )&= P(\psi \exp (t\eta ))-P(\psi )\\&\ge h_{\text {v}}^{\mu }({\hat{\mu }})+\int _{{\textbf {X}}} \log \psi \ d\mu + \int _{{\textbf {X}}} t\eta \, d\mu - h_{\text {v}}^{\mu }({\hat{\mu }})-\int _{{\textbf {X}}} \log \psi \ d\mu \\&= t\int _{{\textbf {X}}} \eta \, d\mu . \end{aligned}$$

Dividing by \(t\) and letting \(t\downarrow 0\), we obtain \(\mu (\eta )\le d^{+}p(\varphi )(\eta )\) for all \(\eta \in C({\textbf {X}},{\mathbb {R}})\); hence, by item 1 of Theorem 6.1, \(\mu \in \partial p(\varphi )\). Since we are assuming that \(p\) is Gâteaux differentiable at \(\varphi \), item 2 of Theorem 6.1 ensures that \(\partial p(\varphi )\) is a singleton, so \(\partial p(\varphi )=\{\mu \}\). Therefore, for every equilibrium state \({\hat{\mu }}\) for \(\psi \), the projection \(\Phi ({\hat{\mu }})\) is the unique element of \(\partial p(\varphi )\), thus finishing the proof. \(\square \)

7 Possible Application in Economics

In Gupta et al. [16] a chaos game is used to represent a time series as a PC plot and to compare similarities and dissimilarities across different time frames, such as during the global COVID-19 pandemic. More precisely, the authors consider the set \(X=[0,1]^2\) as the base space and the four linear contractions

$$\begin{aligned} \left\{ \begin{array}{l} \tau _{A}(x,y)=(0.5 x ,0.5 y ) \\ \tau _{B}(x,y)=(0.5 x+0.5,0.5 y ) \\ \tau _{C}(x,y)=(0.5 x ,0.5 y+0.5) \\ \tau _{D}(x,y)=(0.5 x+0.5,0.5 y+0.5) \end{array} \right. \end{aligned}$$
(10)

so that \((X, \tau _\theta )_{\theta \in \Theta }\), with \(\Theta =\{A,B,C,D\}\), is a classical contractive system whose attractor is \(X\) itself. Consider the identification:

A – if the market falls more than 0.01% of the previous value,

B – if the market falls less than 0.01% of the previous value,

C – if the market gains less than 0.01% of the previous value and

D – if the market gains more than 0.01% of the previous value,

In this way, the time series of length \(N\) associated with a certain economic indicator is translated into a genetic sequence

$$\begin{aligned} \gamma =(DACCADCDACDC\ldots AACCBADD) \in \Theta ^N. \end{aligned}$$

Fixing an arbitrary initial point \(Z_0=(x_0,y_0)=(0.5, 0.5)\), the chaos game consists in iterating \((x_0,y_0)\) by the maps prescribed by \(\gamma \): \(Z_1=(x_1,y_1)=\tau _{D}(x_0,y_0)\), \(Z_2=(x_2,y_2)=\tau _{A}(x_1,y_1)\), \(Z_3=(x_3,y_3)=\tau _{C}(x_2,y_2), \ldots \). Considering \(M\ge 2\) and the dyadic partition of \(X\) given by

$$\begin{aligned} \bigcup _{\gamma ' \in \Theta ^M} \tau _{\gamma '_{M}}( \cdots (\tau _{\gamma '_{1}}(X))), \end{aligned}$$

the PC plot \(W\) is a grey scale picture where the color of each individual part \(\Lambda =\tau _{\gamma '_{M}}( \cdots (\tau _{\gamma '_{1}}(X)))\) is the frequency of visits of the chaos game orbit \(\{Z_j,\, j\ge 0\}\) to \(\Lambda \), that is,

$$\begin{aligned} W(\Lambda )= \frac{1}{N} \, \sharp \{j=0,\ldots ,N-1\, | \, Z_j \in \Lambda \} \sim \mu (\Lambda ). \end{aligned}$$

Obviously, \(\nu _{N}=\sum _{\Lambda } W(\Lambda ) \delta _{(x_{\Lambda }, \, y_{\Lambda })}\), where \((x_{\Lambda }, \, y_{\Lambda }) \in \Lambda \) is arbitrary, is a discrete probability and, if \(\mu (\partial \Lambda )=0\), then by the EET ([13], Corollary 2), as \(N \rightarrow \infty \), \(\nu _{N}\) converges in distribution to the invariant measure \(\mu \) of the IFS with probabilities \((X, \tau _\theta , p_\theta )_{\theta \in \Theta }\), where \(p_A, p_B, p_C, p_D\) are the relative frequencies of the symbols \(A, B, C, D\) in \(\gamma \), respectively.
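The whole construction is straightforward to implement. The following is a minimal sketch (our illustration; the genetic sequence is drawn at random with the frequencies of the example below, since no actual market data is reproduced here) which runs the chaos game for the maps in (10) and accumulates the PC plot values \(W(\Lambda )\) on the dyadic partition of depth \(M\).

```python
import numpy as np

rng = np.random.default_rng(0)

# The four contractions of (10), indexed by the symbols A, B, C, D.
MAPS = {
    "A": lambda p: 0.5 * p,
    "B": lambda p: 0.5 * p + np.array([0.5, 0.0]),
    "C": lambda p: 0.5 * p + np.array([0.0, 0.5]),
    "D": lambda p: 0.5 * p + np.array([0.5, 0.5]),
}

# Hypothetical genetic sequence gamma; a real one would be read off a time series.
N = 100_000
gamma = rng.choice(list("ABCD"), size=N, p=[0.39, 0.17, 0.15, 0.29])

M = 4                             # depth of the dyadic partition (2^M x 2^M cells)
W = np.zeros((2 ** M, 2 ** M))    # visit counts per cell Lambda
z = np.array([0.5, 0.5])          # Z_0
for s in gamma:                   # chaos game: Z_{j+1} = tau_{gamma_j}(Z_j)
    z = MAPS[s](z)
    i, j = np.minimum((z * 2 ** M).astype(int), 2 ** M - 1)
    W[i, j] += 1
W /= N                            # W[i, j] ~ mu(Lambda): the PC plot values
```

Rendering \(W\) as a grey scale image yields a picture like Fig. 1, and by the EET discussion above the values \(W(\Lambda )\) approximate the invariant measure of the IFS with these probabilities.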

For instance, if \(N=100\) and a certain time series produces the genetic sequence

$$\begin{aligned} \gamma =( A , A , D , D , A ,\ldots , A , D , B , C , C , B , A , D) \in \{A,B,C,D\}^{100}, \end{aligned}$$

we obtain the frequencies \([p_A, p_B, p_C, p_D]=[0.39, \,0.17, \,0.15,\, 0.29]\), and considering \(M=4\) we obtain the PC plot in Fig. 1, which is an approximation of the invariant measure \(\mu \) of the associated IFS with probabilities \([0.39, \,0.17, \,0.15,\, 0.29]\).

Fig. 1: PC plot where each square represents one element \(\Lambda \) of the dyadic partition and the grey scale value is \(0 \le \frac{1}{N} \, \sharp \{j=0,\ldots ,N-1\, | \, Z_j \in \Lambda \} \le 1\).

In order to generalize this idea, we need to consider an infinite compact continuous range of values of the economic indicator, such as \(\Theta =[0\%, 100\%]\), instead of only four values \(\Theta =\{A,B,C,D\}\). Also, it is not reasonable to assume that the probability of a change of \(\theta \%\) in the indicator is independent of the current state of the indicator: the distribution of the occurrence of \(\theta \in [0\%, 100\%]\), given the current state \(Z \in X\), must be a probability measure \(q_{Z}( \cdot )\) on \([0\%, 100\%]\). Therefore, we believe the theory developed in the previous sections should be used when making this generalization.