Abstract
In this paper, we establish a version of the central limit theorem for Markov–Feller continuous time processes (with a Polish state space) that are exponentially ergodic in the bounded-Lipschitz distance and enjoy a continuous form of the Foster–Lyapunov condition. As an example, we verify the assumptions of our main result for a specific piecewise-deterministic Markov process, whose deterministic component evolves according to continuous semiflows, switched randomly at the jump times of a Poisson process.
1 Introduction
We are concerned with the asymptotic behavior of the process \(\Gamma _g:=\{t^{-1/2}\hspace{-0.1cm}\int _0^t g\left( \Psi (s)\right) ds\}_{t\ge 0}\), where \(\Psi :=\{\Psi (t)\}_{t\ge 0}\) is a non-stationary, time-homogeneous Markov process evolving on a Polish metric space E, with an arbitrary transition semigroup \(\{P(t)\}_{t\ge 0}\), and \(g:E\rightarrow \mathbb {R}\) is a bounded Lipschitz continuous observable. More specifically, our main goal is to provide some testable conditions on \(\{P(t)\}_{t\ge 0}\) under which \(\Psi \) has a unique invariant distribution, say \(\mu _*\), and \(\Gamma _{\bar{g}}(t)\), with \({\bar{g}}:=g-\int _E g\,d\mu _*\), converges in law (as \(t\rightarrow \infty \)) to a centered normal random variable, or, in other words, under which the process \(\{{\bar{g}}(\Psi (t))\}_{t\ge 0}\) obeys the central limit theorem (CLT).
The CLT is undoubtedly one of the fundamental results in probability theory and statistics. Initially formulated for independent and identically distributed random variables, it was later generalized to martingales (see [31]), which provided a framework for proving various versions of the CLT pertaining to Markov processes. The first results in this field dealt with stationary (discrete-time) Markov chains for which the existence of a \(\mu _*\)-square integrable solution to the Poisson equation is guaranteed (see, e.g., [14, 17, 18]). In later years, many attempts have been made to relax this assumption. For instance, [26] refers to the so-called reversible Markov chains and is based on approximating (in a certain sense) the solutions of the Poisson equation, while [33] introduces a testable condition relying on the convergence of a certain series. Another noteworthy article is [24], where, among other things, the principal hypothesis of [33] is verified by assuming a subgeometric rate of convergence of the Markov chain’s distribution to a stationary one in terms of the Wasserstein distance. Furthermore, it should be mentioned that, over the years, the CLT has also been established for certain stationary Markov processes with continuous time parameter; see, e.g., [3, 35] and the results of [22] for ergodic processes with normal generators (extending those of [18]).
In recent times, however, most attention has been paid to non-stationary Markov processes. Some classical results on the CLT in this case can be found in [34]. They involve positive Harris recurrent and aperiodic (discrete-time) Markov chains (or, equivalently, those which are irreducible and ergodic in the total variation norm) for which a drift condition towards petite sets is fulfilled (which guarantees the existence of a suitable solution to the Poisson equation). Such requirements are, however, practically unattainable in non-locally compact state spaces. A version of the CLT for a subclass of non-stationary Markov chains evolving on a general (Polish) metric space, based on a kind of geometric ergodicity in the bounded-Lipschitz distance and a ‘second order’ Foster–Lyapunov type condition (where a solution to the Poisson equation is not required), is established in [10]. In the context of processes with continuous time parameter, probably the most general result of this kind to date, but relying on exponential ergodicity in the Wasserstein distance (additionally, distinct in nature from that assumed in [10] or [12]), is stated in [28]. An analogous result in the discrete-time case can be found in [19].
The results established in this article are mainly inspired by [28]. A major motivation for the current study was the inability to directly apply the CLT established by Komorowski and Walczuk [28] to some subclass of piecewise-deterministic Markov processes (PDMPs), at least under (relatively natural) conditions imposed in [12, Proposition 7.2] (see also [9, 13]).
The problem lies in establishing the exponential mixing property in the sense of condition (H1) employed in [28] (cf. also [19]), which requires a form of Lipschitz continuity of each P(t) with respect to the Wasserstein distance \(d_{\text {W}}\) (see, e.g., [28, p. 5] for its definition). More precisely, the authors assume the existence of \(\gamma >0\) and \(c<\infty \) such that for any two Borel probability measures \(\mu \) and \(\nu \) (with finite first moments) the following holds:
$$\begin{aligned} d_{\text {W}}\left( \mu P(t),\nu P(t)\right) \le c\,e^{-\gamma t}\,d_{\text {W}}(\mu ,\nu )\quad \text {for all}\quad t\ge 0. \end{aligned}$$ (1.1)
We have therefore recognized the need to provide a new, somewhat more practical, criterion involving a weaker form of the above requirement, similar to those occurring, for instance, in [9, 12, 13, 25, 39] (cf. also [20]). More precisely, instead of (1.1), we assume that there exist \(\gamma >0\), a continuous function \(V:E\rightarrow [0,\infty )\), and constants \(\beta >0\), \(\delta \in (0,1)\) such that, for any two Borel probability measures \(\mu \) and \(\nu \),
$$\begin{aligned} d_{\text {FM}}\left( \mu P(t),\nu P(t)\right) \le \beta \left( \left\langle V,\mu \right\rangle +\left\langle V,\nu \right\rangle +1\right) ^{\delta }e^{-\gamma t}\quad \text {for all}\quad t\ge 0, \end{aligned}$$ (1.2)
where \(d_{\text {FM}}\) stands for the bounded-Lipschitz distance, also known as the Fortet–Mourier metric (cf., e.g., [30, p. 236] or [6, p. 192]). An additional advantage of our approach is that this metric is weaker than the Wasserstein one in a manner that, among other things, enables the use of a coupling argument (introduced by M. Hairer in [20] and further applied, e.g., in [9, 10, 12, 13, 25, 37, 39]) to reach an exponential mixing property with respect to \(d_{\text {FM}}\), which fails when such mixing is demanded in terms of \(d_{\text {W}}\).
Hypotheses (H2) and (H3) used in [28] roughly ensure that the semigroup \(\{P(t)\}_{t\ge 0}\) preserves the finiteness of moments of measures of a given order greater than 2. In the present paper, they are both replaced by a strengthened version of the continuous Lyapunov condition, from which such properties are usually derived in practice.
As mentioned above, the proof of our main result, Theorem 3.1, follows in many places the reasoning presented in [28]. Nevertheless, it should be emphasized that, without a Lipschitz-type assumption on the semigroup \(\{P(t)\}_{t\ge 0}\) such as (1.1) (or its discrete-time analogue, employed, e.g., in [19]), proving the principal limit theorems, such as the central one or the law of the iterated logarithm, requires more subtle arguments, which is reflected, e.g., in [10, 11] or [27]. Most importantly, under condition (1.2), the so-called corrector function \(\chi :E\rightarrow \mathbb {R}\), given by
$$\begin{aligned} \chi (x):=\int _0^{\infty } P(t){\bar{g}}(x)\,dt\quad \text {for}\quad x\in E, \end{aligned}$$
need not be Lipschitzian (which is a crucial ingredient in the proof of [28, Theorem 2.1]), but is only continuous. Another problem arising in our setting is that the weak convergence of the process distribution (towards the stationary one), guaranteed by (1.2), yields the convergence of the corresponding integrals only as long as the integrands are (apart from being continuous) bounded, which is not required when using the Wasserstein distance. This fact prevents, among other things, a direct adaptation of the final argument used in the proof of [28, Lemma 5.5]. We have overcome this obstacle (see Lemma 4.9) by making use of [6, Lemma 8.4.3], which allows replacing the boundedness of the integrand by its uniform integrability with respect to the family of measures constituting the convergent sequence under consideration. Finally, let us point out that a key role in our proof is played by Lemma 4.5. In contrast, results of this nature are not needed in [28], since they are, in a sense, built a priori into hypotheses (H2) and (H3).
The article is organized as follows. In Sect. 2, we fix the notation used throughout the paper and recall some basic definitions and facts from measure theory and the theory of Markov semigroups. We also quote a version of the CLT for martingales, crucial for the line of reasoning presented in the paper. Section 3 is devoted to formulating the assumptions and the main result, namely Theorem 3.1. In this part, it is also shown that the assumptions employed imply the existence of a unique invariant distribution of \(\Psi \). The proof of the main theorem, along with all auxiliary results, is given in Sect. 4. Further, in Sect. 5, drawing on some ideas from [3], we provide a concise representation of the variance of the limiting normal distribution (involved in Theorem 3.1). Additionally, in Sect. 6, we derive a straightforward conclusion from [3, Theorem 2.1] concerning the functional CLT in the stationary case. Finally, in Sect. 7, we demonstrate the usefulness of the main result by applying it to establish the CLT for the PDMPs considered in [12].
2 Preliminaries
First of all, put \(\mathbb {N}_0:=\mathbb {N}\cup \{0\}\) and \(\mathbb {R}_+=[0,\infty )\). In what follows, we shall consider a complete separable metric space \((E,\rho )\), endowed with its Borel \(\sigma \)-field \(\mathcal {B}(E)\). By \({\text {B}}_b(E)\) we will denote the Banach space of all real-valued, Borel measurable bounded functions on E, equipped with the supremum norm \(\Vert \cdot \Vert _{\infty }\). The subspaces of \({\text {B}}_b(E)\) consisting of all continuous functions and all Lipschitz continuous functions shall be denoted by \({\text {C}}_b(E)\) and \({\text {Lip}}_b(E)\), respectively. Throughout the paper, we will also refer to a particular subset \({\text {Lip}}_{b,1}(E)\) of \({\text {Lip}}_b(E)\), defined as
where the norm \(\Vert \cdot \Vert _{\text {BL}}\) is given by
Furthermore, we will write \(\mathcal {M}(E)\) for the space of all finite non-negative Borel measures on E. The subset of \(\mathcal {M}(E)\) consisting of all probability measures will be, in turn, denoted by \(\mathcal {M}_1(E)\). Moreover, for any given Borel measurable function \(V:E\rightarrow \mathbb {R}_+\) and any \(r>0\), let us define the subset \(\mathcal {M}^V_{1,r}(E)\) of \(\mathcal {M}_1(E)\) consisting of all measures with finite r-th moment with respect to V, that is,
$$\begin{aligned} \mathcal {M}^V_{1,r}(E):=\left\{ \mu \in \mathcal {M}_1(E):\, \left\langle V^r,\mu \right\rangle <\infty \right\} . \end{aligned}$$
Clearly, for any \(s\in (0,2]\), we have \(\mathcal {M}^V_{1,2}(E)\subset \mathcal {M}_{1,s}^V(E)\), since \(\int _E V^s\,d\mu \le (\int _E V^2\,d\mu )^{s/2}\) for every \(\mu \in \mathcal {M}_1(E)\), due to the Hölder inequality. Also worth noting here is that, for every \(r>0\), \(\mathcal {M}^V_{1,r}(E)\) contains all Dirac measures \(\delta _x\), \(x\in E\), and, when V is continuous, all compactly supported Borel probability measures on E.
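The displayed inclusion is a one-line application of the Hölder inequality with exponents \(2/s\) and \(2/(2-s)\) (for \(s\in (0,2)\); the case \(s=2\) is trivial):

```latex
\int_E V^s \, d\mu
  = \int_E V^s \cdot 1 \, d\mu
  \le \left( \int_E \left(V^s\right)^{2/s} d\mu \right)^{s/2}
      \left( \int_E 1^{2/(2-s)} \, d\mu \right)^{(2-s)/2}
  = \left( \int_E V^2 \, d\mu \right)^{s/2},
```

where the last equality uses \(\mu (E)=1\).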
For brevity, we will often write \(\langle f,\mu \rangle \) for the Lebesgue integral \(\int _Ef\,d\mu \) of a Borel measurable function \(f:E\rightarrow \mathbb {R}\) with respect to a signed Borel measure \(\mu \), provided that it exists.
To evaluate the distance between measures, we will use the Fortet–Mourier metric (equivalent to the one induced by the Dudley norm), which on \(\mathcal {M}(E)\) is given by
$$\begin{aligned} d_{\text {FM},\rho }(\mu ,\nu ):=\sup \left\{ \left| \left\langle f,\mu \right\rangle -\left\langle f,\nu \right\rangle \right| :\; f\in {\text {Lip}}_{b,1}(E)\right\} \quad \text {for}\quad \mu ,\nu \in \mathcal {M}(E). \end{aligned}$$
Let us recall that a sequence \(\{\mu _n\}_{n\in \mathbb {N}_0}\subset \mathcal {M}(E)\) of measures is called weakly convergent to a measure \(\mu \in \mathcal {M}(E)\), which is denoted by \(\mu _n{\mathop {\rightarrow }\limits ^{w}}\mu \), if for any \(f\in {\text {C}}_b(E)\) we have \(\lim _{n\rightarrow \infty }\langle f,\mu _n\rangle =\langle f,\mu \rangle \). It is well-known (see, e.g., [15, Theorems 8 and 9]) that, if \((E,\rho )\) is Polish (which is the case here), then the weak convergence of any sequence of probability measures is equivalent to its convergence in the Fortet–Mourier distance, and also the space \((\mathcal {M}_1(E), d_{\text {FM},\rho })\) is complete.
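As a purely numerical illustration of this equivalence (our example, not part of the paper's argument): for any single test function f, the discrepancy \(|\langle f,\mu _n\rangle -\langle f,\mu \rangle |\) is, up to a normalizing constant depending on the exact convention for \(\Vert \cdot \Vert _{\text {BL}}\), a lower bound for \(d_{\text {FM},\rho }(\mu _n,\mu )\). The sketch below takes \(E=\mathbb {R}\), \(\mu _n:=N(1/n,1){\mathop {\rightarrow }\limits ^{w}}\mu :=N(0,1)\), and the bounded 1-Lipschitz function tanh; all choices here are ours.

```python
import numpy as np

def fm_lower_bound(f, m1, m2):
    # |<f, N(m1,1)> - <f, N(m2,1)>|, computed by a plain Riemann sum.
    # For a bounded 1-Lipschitz f this bounds the Fortet-Mourier
    # distance from below (up to a normalizing constant).
    x = np.linspace(-12.0, 12.0, 24001)
    dx = x[1] - x[0]
    density = lambda m: np.exp(-0.5 * (x - m) ** 2) / np.sqrt(2.0 * np.pi)
    return abs(np.sum(f(x) * (density(m1) - density(m2))) * dx)

# mu_n = N(1/n, 1) converges weakly to mu = N(0, 1); the lower
# bounds extracted from the single test function tanh shrink to 0.
gaps = [fm_lower_bound(np.tanh, 1.0 / n, 0.0) for n in (1, 10, 100)]
print(gaps)
```

The decreasing values reflect the weak convergence \(\mu _n{\mathop {\rightarrow }\limits ^{w}}\mu \); of course, a genuine evaluation of \(d_{\text {FM},\rho }\) would require a supremum over all of \({\text {Lip}}_{b,1}(E)\).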
In fact, according to [6, Lemma 8.4.3], the weak convergence \(\mu _n{\mathop {\rightarrow }\limits ^{w}}\mu \) of probability measures ensures the convergence of the corresponding integrals even in the case of continuous but not necessarily bounded functions, provided that they are uniformly integrable with respect to \(\{\mu _n\}_{n\in \mathbb {N}_0}\). An easily verifiable condition guaranteeing this property, and thus also the above indicated statement, is presented in the following result:
Lemma 2.1
Let \(\{\mu _n\}_{n\in \mathbb {N}_0}\subset \mathcal {M}_1(E)\) be weakly convergent to some \(\mu \in \mathcal {M}_1(E)\). Then, for every continuous function \(h:E\rightarrow \mathbb {R}\) satisfying \(\sup _{n\in \mathbb {N}_0} \left\langle |h|^q,\mu _n\right\rangle <\infty \) with some \(q>1\), we have \(\lim _{n\rightarrow \infty }\left\langle h,\mu _n\right\rangle =\left\langle h,\mu \right\rangle \).
Proof
Let \(h:E\rightarrow \mathbb {R}\) be a continuous function such that \(M:=\sup _{n\in \mathbb {N}_0} \left\langle |h|^q,\mu _n\right\rangle <\infty \) with some \(q>1\). Then, for every \(R>0\), we have
$$\begin{aligned} \int _{\{|h|\ge R\}} |h|\,d\mu _n\le \int _{\{|h|\ge R\}} \frac{|h|^q}{R^{q-1}}\,d\mu _n\le \frac{M}{R^{q-1}}\quad \text {for all}\quad n\in \mathbb {N}_0. \end{aligned}$$
Keeping in mind that \(q>1\), we therefore see that
$$\begin{aligned} \lim _{R\rightarrow \infty }\,\sup _{n\in \mathbb {N}_0}\int _{\{|h|\ge R\}} |h|\,d\mu _n=0, \end{aligned}$$
but this, according to [6, Lemma 8.4.3], already implies the assertion of the lemma. \(\square \)
2.1 Markov Semigroups
Let us now recall some basic concepts from the theory of Markov operators, to be employed in the remainder of the paper.
A function \(P: E \times \mathcal {B}(E) \rightarrow [0, 1]\) is called a stochastic kernel if \(E\ni x\mapsto P(x, A)\) is a Borel measurable map for each \(A\in \mathcal {B}(E)\), and \(\mathcal {B}(E)\ni A \mapsto P(x, A)\) is a Borel probability measure for each \(x\in E\). The composition of any two such kernels, say P and Q, is defined by
$$\begin{aligned} PQ(x,A):=\int _E Q(y,A)\,P(x,dy)\quad \text {for any}\quad x\in E,\;A\in \mathcal {B}(E). \end{aligned}$$ (2.1)
Given a stochastic kernel P, we can define two corresponding operators (here denoted by the same symbol, according to the convention employed, e.g., in [34, 35]); one acting on \(\mathcal {M}(E)\), defined by
$$\begin{aligned} \mu P(A):=\int _E P(x,A)\,\mu (dx)\quad \text {for}\quad \mu \in \mathcal {M}(E),\;A\in \mathcal {B}(E), \end{aligned}$$ (2.2)
and the second one acting on \({\text {B}}_b(E)\), given by
$$\begin{aligned} Pf(x):=\int _E f(y)\,P(x,dy)\quad \text {for}\quad f\in {\text {B}}_b(E),\;x\in E. \end{aligned}$$ (2.3)
These operators are related to each other via
$$\begin{aligned} \left\langle Pf,\mu \right\rangle =\left\langle f,\mu P\right\rangle \quad \text {for any}\quad f\in {\text {B}}_b(E),\;\mu \in \mathcal {M}(E). \end{aligned}$$ (2.4)
The operator \((\cdot )P: \mathcal {M}(E)\rightarrow \mathcal {M}(E)\), given by (2.2), is called a (regular) Markov operator, and \(P(\cdot ): {\text {B}}_b(E) \rightarrow {\text {B}}_b(E)\), defined by (2.3), is said to be its dual operator. Obviously, the Markov operator (resp. its dual) corresponding to the composition of two given kernels in the sense of (2.1) is just the usual composition of the Markov operators (resp. their duals) induced by these kernels. Let us also highlight that the right-hand sides of (2.3) and (2.4), in fact, make sense for all Borel measurable, bounded below functions, and we will often write Pf also for such functions f (obviously, in that case, Pf takes values in \(\mathbb {R}\cup \{\infty \}\)).
Let \(\mathbb {T}\in \{\mathbb {R}_+,\mathbb {N}_0\}\). A family of stochastic kernels \(\{P(t)\}_{t\in \mathbb {T}}\) (or the induced family of Markov operators) is called a (regular) Markov semigroup whenever
$$\begin{aligned} P(0)(x,\cdot )=\delta _x\quad \text {and}\quad P(s+t)=P(s)P(t)\quad \text {for all}\quad x\in E,\; s,t\in \mathbb {T}, \end{aligned}$$
and, if \(\mathbb {T}=\mathbb {R}_+\), the map \(\mathbb {R}_+\times E\ni (t,x)\mapsto P(t)(x,A)\) is \(\mathcal {B}(\mathbb {R}_+\times E)/\mathcal {B}(\mathbb {R})\)-measurable for any \(A\in \mathcal {B}(E)\) (cf. the definition of transition function in [16, p. 156]).
We call a Markov semigroup \(\{P(t)\}_{t\in \mathbb {T}}\) Feller if, for each \(t\in \mathbb {T}\), the dual operator \(P(t)(\cdot )\) preserves continuity of bounded functions, i.e., \(P(t)({\text {C}}_b(E))\subset {\text {C}}_b(E)\). A measure \(\mu _*\in \mathcal {M}(E)\) is said to be invariant for such a semigroup \(\{P(t)\}_{t\in \mathbb {T}}\) whenever \({\mu }_*P(t)={\mu }_*\) for every \(t\in \mathbb {T}\). Obviously, these concepts can be also referred to a single Markov operator.
Given a Markov semigroup \(\{P(t)\}_{t\in \mathbb {T}}\) of stochastic kernels on \(E\times \mathcal {B}(E)\), by a time-homogeneous Markov process with transition semigroup \(\{P(t)\}_{t\in \mathbb {T}}\) we mean a family of E-valued random variables \(\Psi :=\{\Psi (t)\}_{t\in \mathbb {T}}\) on some probability space \((\Omega ,\mathcal {F},\mathbb {P})\) such that, for any \(s, t\in \mathbb {T}\), \(A \in \mathcal {B}(E)\), and \(x\in E\),
where \(\{\mathcal {F}(s)\}_{s\in \mathbb {T}}\) is the natural filtration of \(\Psi \). Obviously, \(\Psi \) can also be regarded as a process on the probability space \((\Omega ,\mathcal {F}_{\infty },\mathbb {P}|_{\mathcal {F}_{\infty }})\) with \(\mathcal {F}_{\infty }:=\sigma \left( \{\Psi (t):\,t\in \mathbb {T}\}\right) \). The distribution of \(\Psi (0)\) is referred to as the initial one. If \(\mathbb {T}=\mathbb {N}_0\) and \(P(n)=P(1)^n\) (i.e., P(n) is the nth iteration of P(1)) for every \(n\in \mathbb {N}\), then \(\Psi \) satisfying (2.5) is usually called a Markov chain (rather than a process), and the kernel P(1) is then said to be the one-step transition law of this chain. Let us also note that, letting \(\mu (t)\) be the distribution of \(\Psi (t)\) for every \(t\in \mathbb {T}\), we have \(\mu (s+t) = \mu (s)P(t)\) for any \(s, t\in \mathbb {T}\), and thus it is indeed reasonable to call \(\{P(t)\}_{t\in \mathbb {T}}\) a transition semigroup.
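For a finite state space, the semigroup relations just discussed can be checked by direct matrix computation. The following sketch (our toy illustration, with an arbitrary 3-state one-step kernel) verifies \(P(n+m)=P(n)P(m)\) and \(\mu (s+t)=\mu (s)P(t)\) for a discrete-time chain:

```python
import numpy as np

# One-step transition law P(1) of a 3-state Markov chain (rows sum to 1).
P1 = np.array([[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3],
               [0.2, 0.2, 0.6]])

def P(n):
    # P(n) = P(1)^n, the n-step transition kernel of the chain.
    return np.linalg.matrix_power(P1, n)

mu0 = np.array([1.0, 0.0, 0.0])   # initial distribution (a Dirac mass)

# Semigroup property P(n+m) = P(n)P(m) ...
assert np.allclose(P(5), P(2) @ P(3))
# ... and the flow of marginals mu(s+t) = mu(s)P(t).
assert np.allclose(mu0 @ P(7), (mu0 @ P(4)) @ P(3))
print("semigroup identities verified")
```

Here left multiplication of a row vector by the matrix plays the role of the Markov operator \(\mu \mapsto \mu P\), while right multiplication of the matrix by a column vector corresponds to the dual operator \(f\mapsto Pf\).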
Let us also recall that a continuous-time stochastic process \(\Psi =\{\Psi (t)\}_{t\in \mathbb {R}_+}\), adapted to a filtration \(\{\mathcal {F}(t)\}_{t\in \mathbb {R}_+}\), is called jointly (resp. progressively) measurable if the map \(\mathbb {R}_+\times \Omega \ni (t,\omega )\mapsto \Psi (t)(\omega )\in E\) (resp. its restriction to \([0,t]\times \Omega \)) is \(\mathcal {B}(\mathbb {R}_+)\otimes \mathcal {F}_{\infty } / \mathcal {B}(E)\)-measurable (resp. \(\mathcal {B}([0,t])\otimes \mathcal {F}(t) / \mathcal {B}(E)\)-measurable for every \(t>0\)). As is well known, every adapted process with right- or left-continuous sample paths is progressively measurable, and thus also jointly measurable.
Throughout the paper, for a given Markov process \(\Psi =\{\Psi (t)\}_{t\in \mathbb {T}}\), we shall use the so-called Dynkin set-up (see, e.g., [36, p. 7]), which brings a family \(\{\mathbb {P}_x:\,x\in E\}\) of probability measures on \(\mathcal {F}\) such that, for every \(x\in E\), one has \(\mathbb {P}_x(\Psi (0)=x)=1\) and (2.5) is fulfilled with \(\mathbb {P}_x\) in place of \(\mathbb {P}\). Obviously, \(\mathbb {P}_x\) may then be thought of as the conditional distribution of \(\mathbb {P}\), given the initial state x of \(\Psi \), i.e., \(\mathbb {P}_x(\cdot ):=\mathbb {P}(\cdot \,|\,\Psi (0)=x)\). Within this framework, for each \(\mu \in \mathcal {M}_1(E)\), one can define
$$\begin{aligned} \mathbb {P}_{\mu }(F):=\int _E \mathbb {P}_x(F)\,\mu (dx)\quad \text {for}\quad F\in \mathcal {F}, \end{aligned}$$
and check that \(\Psi \) has the Markov property in the sense of (2.5) relative to \(\mathbb {P}_{\mu }\), with the given transition semigroup and initial law \(\mu \). For any \(\mu \in \mathcal {M}_1(E)\) and \(x\in E\), the expectation operators w.r.t. \(\mathbb {P}_{\mu }\) and \(\mathbb {P}_x (=\mathbb {P}_{\delta _x})\) will be denoted by \(\mathbb {E}_{\mu }\) and \(\mathbb {E}_x\), respectively. It is easy to see that, given any Borel measurable, bounded below function \(f:E\rightarrow \mathbb {R}\), we then have
$$\begin{aligned} \mathbb {E}_{\mu }\left( f(\Psi (t))\right) =\left\langle P(t)f,\mu \right\rangle \quad \text {for all}\quad t\in \mathbb {T}, \end{aligned}$$
and thus it follows from (2.5) that
$$\begin{aligned} \mathbb {E}_{\mu }\left( f(\Psi (s+t))\,|\,\mathcal {F}(s)\right) =P(t)f(\Psi (s))\quad \mathbb {P}_{\mu }\text {-a.s.}\quad \text {for any}\quad s,t\in \mathbb {T}. \end{aligned}$$
In practice, it will be convenient to work in the following canonical setting. Given a Markov process \(\Psi \), we put \({\widetilde{\Omega }}:=\{\Psi (\cdot )(\omega ):\; \omega \in \Omega \}\subset E^{\mathbb {T}}\), and, for every \(t\in \mathbb {T}\), we define the projection \({\widetilde{\Psi }}(t):{\widetilde{\Omega }}\rightarrow E\) by \({\widetilde{\Psi }}(t)({\tilde{\omega }}):={\tilde{\omega }}(t)\) for any \({\tilde{\omega }}\in {\widetilde{\Omega }}\), as well as the \(\sigma \)-fields
$$\begin{aligned} \widetilde{\mathcal {F}}(t):=\sigma \left( {\widetilde{\Psi }}(s):\;s\in \mathbb {T},\,s\le t\right) \quad \text {and}\quad \widetilde{\mathcal {F}}_{\infty }:=\sigma \left( {\widetilde{\Psi }}(t):\;t\in \mathbb {T}\right) . \end{aligned}$$
Further, we introduce the map \(\pi :\Omega \rightarrow {\widetilde{\Omega }}\) given by \(\pi (\omega )(t):=\Psi (t)(\omega )\) for all \(\omega \in \Omega \) and \(t\in \mathbb {T}\). Then \({\widetilde{\Psi }}(t)\circ \pi =\Psi (t)\) for every \(t\in \mathbb {T}\), and it follows easily that \(\pi \) is both \(\mathcal {F}_{\infty }/\widetilde{\mathcal {F}}_{\infty }\)- and \(\mathcal {F}(t)/\widetilde{\mathcal {F}}(t)\)-measurable for each \(t\in \mathbb {T}\). Now, let us define \({\widetilde{\mathbb {P}}}_{\mu }({\widetilde{F}}):=\mathbb {P}_{\mu }(\pi ^{-1}({\widetilde{F}}))\) for any \({\widetilde{F}}\in \widetilde{\mathcal {F}}_{\infty }\) and \(\mu \in \mathcal {M}_1(E)\). Then, for each \(\mu \in \mathcal {M}_1(E)\), \({\widetilde{\mathbb {P}}}_{\mu }\) is a probability measure on \(\widetilde{\mathcal {F}}_{\infty }\), and the processes \({\widetilde{\Psi }}\) and \(\Psi \) have the same finite-dimensional distributions under \({\widetilde{\mathbb {P}}}_{\mu }\) and \(\mathbb {P}_{\mu }\), respectively, i.e.,
for all \(n\in \mathbb {N}_0\), \(t_0\le t_1\le \ldots \le t_n\) in \(\mathbb {T}\), and \(A_0,\ldots ,A_n\in \mathcal {B}(E)\). In particular, \({\widetilde{\Psi }}\) is then a Markov process w.r.t. \(\{\widetilde{\mathcal {F}}(t)\}_{t\in \mathbb {T}}\) with the same transition semigroup as \(\Psi \) (cf. the proof of [5, Theorem 4.3]). Moreover, it is not hard to check that, if \(\Psi \) (with \(\mathbb {T}=\mathbb {R}_+\)) is jointly (resp. progressively) measurable, then \({\widetilde{\Psi }}\) is jointly (resp. progressively) measurable as well. One important advantage of this canonical setting is that we can consider the shift operators \(\Theta _t:{\widetilde{\Omega }}\rightarrow {\widetilde{\Omega }}\), \(t\in \mathbb {T}\), defined by
$$\begin{aligned} \Theta _t({\tilde{\omega }})(s):={\tilde{\omega }}(s+t)\quad \text {for all}\quad s\in \mathbb {T},\;{\tilde{\omega }}\in {\widetilde{\Omega }}, \end{aligned}$$
which allows one to write \({\widetilde{\Psi }}(s)\circ \Theta _t={\widetilde{\Psi }}(s+t)\) for any \(s,t\in \mathbb {T}\). This, among others, enables the use of the Birkhoff ergodic theorem in terms of the Markov process under consideration.
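To illustrate the kind of conclusion this shift-operator framework enables (a toy example of ours, not taken from the paper): for an ergodic two-state chain, the Birkhoff time average of an observable g along a single trajectory converges a.s. to the spatial average \(\langle g,\mu _*\rangle \).

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-state chain: switching probabilities 0 -> 1 and 1 -> 0.
p, q = 0.3, 0.2
mu_star = np.array([q / (p + q), p / (p + q)])   # stationary distribution
g = np.array([1.0, 5.0])                          # observable g: {0,1} -> R

# Simulate one long trajectory and form the Birkhoff (time) average.
n, x, total = 200_000, 0, 0.0
for _ in range(n):
    total += g[x]
    flip = rng.random()
    x = (1 if flip < p else 0) if x == 0 else (0 if flip < q else 1)

time_avg = total / n
space_avg = g @ mu_star                           # <g, mu_star> = 3.4
print(time_avg, space_avg)
```

With the parameters above, \(\mu _*=(0.4,0.6)\) and \(\langle g,\mu _*\rangle =3.4\); the simulated time average lands close to this value, as the Birkhoff ergodic theorem predicts.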
2.2 A Version of the CLT for Martingales
While proving the main result of this article, we will refer to [28, Theorem 5.1], which is quoted below for the convenience of the reader.
Let \((\Omega ,\{\mathcal {F}_n\}_{n\in \mathbb {N}_0},\mathcal {F},\mathbb {P})\) be a filtered probability space with trivial \(\mathcal {F}_0\), and consider a square integrable martingale \(\{m_n\}_{n\in \mathbb {N}_0}\), as well as the sequence \(\{z_n\}_{n\in \mathbb {N}}\) of its increments, given by \(z_n=m_n-m_{n-1}\) for \(n\in \mathbb {N}\). Further, define \(\langle m\rangle _n\), \(n\in \mathbb {N}\), as
$$\begin{aligned} \langle m\rangle _n:=\sum _{i=1}^{n}\mathbb {E}\left( z_i^2\,|\,\mathcal {F}_{i-1}\right) . \end{aligned}$$
Theorem 2.1
([28, Theorem 5.1]) Suppose that the following conditions hold:
-
(M1)
For every \(\varepsilon >0\) we have
$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=0}^{n-1}\mathbb {E}\left( z_{i+1}^2 \mathbb {1}_{\left\{ \left| z_{i+1}\right| \ge \varepsilon \sqrt{n}\right\} }\right) =0 \end{aligned}$$ -
(M2)
We have \(\sup _{n\ge 1}\mathbb {E}\left( z_n^2\right) <\infty \) and there exists \(\sigma \in [0,\infty )\) such that
$$\begin{aligned} \lim _{k\rightarrow \infty }\limsup _{l\rightarrow \infty }\frac{1}{l}\sum _{j=1}^l\mathbb {E}\left| \frac{1}{k}\mathbb {E}\left( \langle m\rangle _{jk}-\langle m\rangle _{(j-1)k}|\mathcal {F}_{(j-1)k}\right) -\sigma ^2\right| =0. \end{aligned}$$ -
(M3)
For every \(\varepsilon >0\) we have
$$\begin{aligned} \lim _{k\rightarrow \infty }\limsup _{l\rightarrow \infty }\frac{1}{kl}\sum _{j=1}^l\sum _{i=(j-1)k}^{jk-1}\mathbb {E} \left( \left( 1+z_{i+1}^2\right) \mathbb {1}_{\left\{ \left| m_i-m_{(j-1)k}\right| \ge \varepsilon \sqrt{kl}\right\} } \right) =0. \end{aligned}$$
Then
$$\begin{aligned} \lim _{n\rightarrow \infty }\mathbb {E}\left| \frac{\langle m\rangle _n}{n}-\sigma ^2\right| =0, \end{aligned}$$ (2.10)
and \(\{m_n\}_{n\in \mathbb {N}_0}\) obeys the CLT, i.e.,
$$\begin{aligned} \lim _{n\rightarrow \infty }\mathbb {P}\left( \frac{m_n}{\sqrt{n}}\le u\right) =\Phi _{\sigma }(u)\quad \text {for every}\quad u\in \mathbb {R}, \end{aligned}$$ (2.11)
where \(\Phi _{\sigma }\) is the distribution function of a centered normal law with variance \(\sigma ^2\).
Obviously, the centered normal distribution with zero variance (i.e., \(\sigma ^2=0\)) is viewed as the Dirac measure at 0.
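A quick sanity check of the theorem's conclusion on the simplest example (our simulation, not from [28]): for i.i.d. increments \(z_i=\pm 1\) with equal probabilities, the predictable quadratic variation equals n, so \(\sigma ^2=1\) and \(m_n/\sqrt{n}\) should be approximately standard normal.

```python
import numpy as np

rng = np.random.default_rng(42)

n_steps, n_paths = 400, 5000
# z_i = +/-1 fair-coin increments: square integrable martingale differences.
z = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
m_n = z.sum(axis=1)                     # terminal martingale value per path
scaled = m_n / np.sqrt(n_steps)

# Empirical mean ~ 0 and variance ~ sigma^2 = 1, as the CLT predicts.
print(scaled.mean(), scaled.var())
```

Conditions (M1)–(M3) are trivial here (the increments are bounded and stationary); the simulation merely illustrates the limiting behavior they guarantee.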
Remark 2.1
Basically, almost all known central limit criteria for martingales rely on Brown's [7, Theorem 2], which, in turn, is based on the celebrated Lindeberg condition. In the result above, this condition appears as hypothesis (M1) and is also reflected in assumption (M3). The Lindeberg condition originates from [4, Theorem 27.2] (see also [32, pp. 292–294]), that is, the Lindeberg–Feller CLT, which concerns independent, but not necessarily identically distributed, random variables. Brown's result, apart from this condition, involves the assumption that
$$\begin{aligned} \frac{\langle m\rangle _n}{\mathbb {E}\left( \langle m\rangle _n\right) }\rightarrow 1\quad \text {in probability, as}\;\; n\rightarrow \infty . \end{aligned}$$
This requirement is, however, relatively hard to verify in practice, and thus it is often replaced with other assumptions. For instance, the martingale CLT given in [34, Theorem D.6.4] requires instead that
which can be simply written as \(\lim _{n \rightarrow \infty } \langle m\rangle _n/n = \sigma ^2\) a.s. This assumption, together with the Lindeberg condition, enables one to derive (2.10), which then yields (2.11). Hypotheses (M2) and (M3), adapted here from [28], although more technical, prove to be less restrictive than (2.12). As emphasized by the authors, [28, Theorem 5.1] was inspired by the proof of [35, Theorem 2.1], concerning martingales with stationary increments, where the only assumption is that \(\mathbb {E}|\langle m\rangle _n/n -\sigma ^2| \rightarrow 0\) as \(n\rightarrow \infty \). Conditions (M1)–(M3) therefore constitute, in a way, a substitute for this assumption and for stationarity, expressed in the spirit of Lindeberg-type conditions and the aforementioned hypothesis (2.12) from [34].
3 Assumptions and Formulation of the Main Result
Let \(\Psi =\{\Psi (t)\}_{t\in \mathbb {R}_+}\) be a jointly measurable, E-valued time-homogeneous Markov process with transition semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\). We assume that the process is given in the Dynkin set-up with a suitable family \(\{\mathbb {P}_{\mu }:\, \mu \in \mathcal {M}_1(E)\}\) of probability measures. Furthermore, for the purposes of the analysis, we will identify \(\Psi \) with the canonical (and also jointly measurable) process \({\widetilde{\Psi }}\) defined in Sect. 2.1, dropping all the tildes used in the definition of the latter.
To state the main result of this paper, we need to employ several conditions regarding the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\). Firstly, we assume that
-
(A1)
\(\{P(t)\}_{t\in \mathbb {R}_+}\) has the Feller property;
and, secondly, we require the existence of a continuous function \(V:E \rightarrow \mathbb {R}_+\) such that the following holds:
-
(A2)
\(\{P(t)\}_{t\in \mathbb {R}_+}\) is V-exponentially mixing in the metric \(d_{\text {FM},\rho }\), in the sense that there exist constants \(\gamma >0\) and \(\beta >0\) such that
$$\begin{aligned} d_{\text {FM},\rho } (\mu P(t), \nu P(t))\le \beta \left( \left\langle V,\mu \right\rangle +\left\langle V,\nu \right\rangle +1\right) ^{1/2}e^{-\gamma t}\quad \text {for all}\quad \mu ,\nu \in \mathcal {M}_{1,1}^V(E),\;t\in \mathbb {R}_+. \end{aligned}$$ -
(A3)
there exist \(A,B\ge 0\) and \(\Gamma >0\) such that
$$\begin{aligned} P(t)V^2(x)\le A e^{-\Gamma t}V^2(x)+B\;\;\;\text {for all}\;\;\;x\in E,\;t\in \mathbb {R}_+. \end{aligned}$$
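As a concrete instance of (A3) (our example, not one treated in the paper): for the Ornstein–Uhlenbeck semigroup generated by \(dX=-X\,dt+dW\) on \(E=\mathbb {R}\), with \(V(x)=|x|\), one has the exact second-moment formula \(P(t)V^2(x)=x^2e^{-2t}+\tfrac{1}{2}(1-e^{-2t})\), so (A3) holds with \(A=1\), \(\Gamma =2\), \(B=1/2\). The sketch below verifies the inequality on a grid of states and times.

```python
import numpy as np

def PtV2(x, t):
    # Exact E_x[X_t^2] for the OU process dX = -X dt + dW.
    return x**2 * np.exp(-2.0 * t) + 0.5 * (1.0 - np.exp(-2.0 * t))

A, Gamma, B = 1.0, 2.0, 0.5
xs = np.linspace(-10.0, 10.0, 201)
ts = np.linspace(0.0, 5.0, 101)
# Check P(t)V^2(x) <= A e^{-Gamma t} V^2(x) + B pointwise on the grid.
ok = all(PtV2(x, t) <= A * np.exp(-Gamma * t) * x**2 + B + 1e-12
         for x in xs for t in ts)
print(ok)
```

The inequality is in fact an identity up to the term \(\tfrac{1}{2}(1-e^{-2t})\le B\), which is why this particular Lyapunov function works with such clean constants.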
Remark 3.1
Assumption (A3) is a strengthened form of the Lyapunov condition (see, e.g., [8, Definition 2.1]) and, among other things, it ensures that the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) leaves the set \(\mathcal {M}_{1,2}^V(E)\) invariant. We can say even more, namely, for any \(\mu \in \mathcal {M}_{1,2}^V(E)\) we have \(\sup _{t\in \mathbb {R}_+} \left\langle V^2,\mu P(t)\right\rangle <\infty \), since
$$\begin{aligned} \left\langle V^2,\mu P(t)\right\rangle =\left\langle P(t)V^2,\mu \right\rangle \le Ae^{-\Gamma t}\left\langle V^2,\mu \right\rangle +B\le A\left\langle V^2,\mu \right\rangle +B\quad \text {for all}\quad t\in \mathbb {R}_+. \end{aligned}$$
In Sect. 4.1, we will derive (in Lemma 4.2) a variant of (A3) concerning \(P(t)\mathcal {C}^p\) for \(p\in (0,4]\) and \(\mathcal {C}=\kappa (V+1)^{1/2}\) (with some constant \(\kappa \)), which leads to a conclusion analogous to that above (see Corollary 4.1). These observations will be essential for proving Lemma 4.5, which, in turn, plays a key role in verifying the hypotheses of Theorem 2.1 for a suitable martingale (defined in Sect. 4.2).
We will now show that the conjunction of the above-stated conditions implies that the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) is, in fact, V-exponentially ergodic in \(d_{\text {FM},\rho }\).
Lemma 3.1
If conditions (A1)–(A3) hold with some continuous function \(V:E \rightarrow \mathbb {R}_+\), then \(\{P(t)\}_{t\in \mathbb {R}_+}\) possesses a unique invariant probability measure \(\mu _*\), and \(\mu _*\in \mathcal {M}_{1,2}^V(E)\). Moreover, if such a measure exists, condition (A2) is equivalent to the following one: there exist \(\gamma >0\) and \(\kappa >0\) such that
$$\begin{aligned} d_{\text {FM},\rho }\left( \mu P(t),\mu _*\right) \le \kappa \left( \left\langle V,\mu \right\rangle +1\right) ^{1/2}e^{-\gamma t}\quad \text {for all}\quad \mu \in \mathcal {M}_{1,1}^V(E),\;t\in \mathbb {R}_+. \end{aligned}$$ (3.1)
Proof
Let \(x_0\in E\). Then from hypotheses (A2), (A3) and the inequality \(PV\le (PV^2)^{1/2}\) it follows that, for any \(s,t\ge 0\),
$$\begin{aligned} d_{\text {FM},\rho }\left( \delta _{x_0}P(s+t),\delta _{x_0}P(t)\right) \le \beta \left( \left\langle V,\delta _{x_0}P(s)\right\rangle +V(x_0)+1\right) ^{1/2}e^{-\gamma t}\le W(x_0)e^{-\gamma t}, \end{aligned}$$ (3.2)
where \(W(x_0):=\beta (V(x_0)+(AV^2(x_0)+B)^{1/2}+1)^{1/2}\). Hence, for every \(t\ge 0\) and any \(n,m\in \mathbb {N}\), we have
$$\begin{aligned} d_{\text {FM},\rho }\left( \delta _{x_0}P(t+n),\delta _{x_0}P(t+m)\right) \le W(x_0)e^{-\gamma (t+\min \{n,m\})}. \end{aligned}$$
This shows that \(\{\delta _{x_0}P(t+n)\}_{n\in \mathbb {N}}\) is a Cauchy sequence w.r.t. \(d_{\text {FM},\rho }\) for every \(t\ge 0\). Consequently, since the space \((\mathcal {M}_1(E), d_{\text {FM},\rho })\) is complete, each such sequence is convergent in this space. On the other hand, (3.2) also implies that
$$\begin{aligned} d_{\text {FM},\rho }\left( \delta _{x_0}P(t+n),\delta _{x_0}P(s+n)\right) \le W(x_0)e^{-\gamma (\min \{s,t\}+n)}\rightarrow 0,\quad \text {as}\;\;n\rightarrow \infty ,\;\;\text {for any}\;\;s,t\ge 0, \end{aligned}$$
which, in turn, guarantees that all the sequences \(\{\delta _{x_0}P(t+n)\}_{n\in \mathbb {N}}\), \(t\ge 0\), have the same limit, say \(\mu _*\in \mathcal {M}_1(E)\). Obviously, this is equivalent to saying that \(\delta _{x_0}P(t+n){\mathop {\rightarrow }\limits ^{w}} \mu _*\) for all \(t\ge 0\).
Now, using (A1) we can conclude that \(\mu _*\) is invariant for \(\{P(t)\}_{t\in \mathbb {R}_+}\). Indeed, for any \(t\ge 0\) and \(f\in {\text {C}}_b(E)\), we get
$$\begin{aligned} \left\langle f,\mu _*P(t)\right\rangle =\left\langle P(t)f,\mu _*\right\rangle =\lim _{n\rightarrow \infty }\left\langle P(t)f,\delta _{x_0}P(n)\right\rangle =\lim _{n\rightarrow \infty }\left\langle f,\delta _{x_0}P(t+n)\right\rangle =\left\langle f,\mu _*\right\rangle , \end{aligned}$$
which proves that \(\mu _* P(t)=\mu _*\) for all \(t\ge 0\). Moreover, (A2) yields that \(\delta _x P(t){\mathop {\rightarrow }\limits ^{w}} \mu _*\), as \(t\rightarrow \infty \), for every \(x\in E\), and thus, using Lebesgue’s dominated convergence theorem, one can deduce that, in fact, \(\mu P(t){\mathop {\rightarrow }\limits ^{w}} \mu _*\) for every \(\mu \in \mathcal {M}_1(E)\). This implies that \(\mu _*\) has to be the unique invariant probability measure of \(\{P(t)\}_{t\in \mathbb {R}_+}\).
Further, observe that \(\mu _*\in \mathcal {M}_{1,2}^V(E)\). To see this, let \(V_k:=\min \{V^2,k\}\) for \(k\in \mathbb {N}\). Since \(\delta _{x_0}P(t){\mathop {\rightarrow }\limits ^{w}}\mu _*\) as \(t\rightarrow \infty \) and \(\{V_k\}_{k\in \mathbb {N}}\subset {\text {C}}_b(E)\), applying (A3), we get
$$\begin{aligned} \left\langle V_k,\mu _*\right\rangle =\lim _{t\rightarrow \infty }\left\langle V_k,\delta _{x_0}P(t)\right\rangle \le \limsup _{t\rightarrow \infty }\left( Ae^{-\Gamma t}V^2(x_0)+B\right) =B\quad \text {for every}\;\;k\in \mathbb {N}. \end{aligned}$$
Obviously \(V_k(x)\uparrow V^2(x)\) for any \(x\in E\). Hence, we can use the Lebesgue monotone convergence theorem to conclude that \(\left\langle V^2, \mu _*\right\rangle =\lim _{k\rightarrow \infty } \left\langle V_k, \mu _*\right\rangle \le B<\infty \), which is the desired claim.
To prove the second statement of the lemma, let \(\mu _*\in \mathcal {M}_{1,2}^V(E)\) be an invariant measure of \(\{P(t)\}_{t\in \mathbb {R}_+}\). Then (A2) implies that, for every \(\mu \in \mathcal {M}_{1,1}^V(E)\),
$$\begin{aligned} d_{\text {FM},\rho }\left( \mu P(t),\mu _*\right) =d_{\text {FM},\rho }\left( \mu P(t),\mu _*P(t)\right) \le \beta \left( \left\langle V,\mu \right\rangle +\left\langle V,\mu _*\right\rangle +1\right) ^{1/2}e^{-\gamma t}, \end{aligned}$$
whence (3.1) holds with \(\kappa :=\beta (\left\langle V,\mu _*\right\rangle +1)^{1/2}.\) Conversely, if (3.1) is fulfilled, then
$$\begin{aligned} d_{\text {FM},\rho }\left( \mu P(t),\nu P(t)\right)&\le d_{\text {FM},\rho }\left( \mu P(t),\mu _*\right) +d_{\text {FM},\rho }\left( \mu _*,\nu P(t)\right) \\&\le \kappa \left( \left( \left\langle V,\mu \right\rangle +1\right) ^{1/2}+\left( \left\langle V,\nu \right\rangle +1\right) ^{1/2}\right) e^{-\gamma t}\le 2\kappa \left( \left\langle V,\mu \right\rangle +\left\langle V,\nu \right\rangle +1\right) ^{1/2}e^{-\gamma t} \end{aligned}$$
for any \(\mu ,\nu \in \mathcal {M}_{1,1}^V(E)\), which means that (A2) holds with \(\beta :=2\kappa \). \(\square \)
Throughout the rest of the paper, upon assuming (A1)–(A3), the unique invariant probability measure of \(\{P(t)\}_{t\in \mathbb {R}_+}\) (which belongs to \(\mathcal {M}_{1,2}^V(E)\)) will be denoted by \(\mu _*\). Moreover, we define
$$\begin{aligned} \mathcal {C}:=\kappa (V+1)^{1/2}, \end{aligned}$$ (3.3)
where \(\kappa \) is the constant featured in (3.1). Then, Lemma 3.1 yields that
$$\begin{aligned} d_{\text {FM},\rho }\left( \delta _xP(t),\mu _*\right) \le \mathcal {C}(x)e^{-\gamma t}\quad \text {for all}\quad x\in E,\;t\in \mathbb {R}_+. \end{aligned}$$ (3.4)
The main result of this paper reads as follows:
Theorem 3.1
Suppose that the transition semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) of \(\Psi \) satisfies hypotheses (A1)–(A3) with some continuous function \(V:E\rightarrow \mathbb {R}_+\). Then it possesses a unique invariant probability measure \(\mu _*\), which belongs to \(\mathcal {M}_{1,2}^V(E)\), and, for every \(g\in {\text {Lip}}_b(E)\), the CLT holds for the process \(\{{\bar{g}}(\Psi (t))\}_{t\in \mathbb {R}_+}\) with \({\bar{g}}:=g-\langle g,\mu _*\rangle \), independently of the initial distribution \(\mu \in \mathcal {M}_1(E)\) of \(\Psi \), that is,
$$\begin{aligned} \lim _{t\rightarrow \infty }\mathbb {P}_{\mu }\left( \frac{1}{\sqrt{t}}\int _0^t {\bar{g}}(\Psi (s))\,ds\le u\right) =\Phi _{\sigma }(u)\quad \text {for every}\quad u\in \mathbb {R}, \end{aligned}$$ (3.5)
where \(\Phi _{\sigma }\) is the distribution function of a centered normal law with the variance \(\sigma ^2<\infty \) of the form
Remark 3.2
In fact, our main result remains valid (with almost the same proof, except for some obvious minor changes) under slightly weaker conditions than (A2) and (A3).
Firstly, the exponent 1/2 on the right-hand side of the inequality in (A2) can be replaced by any other number \(\delta \in (0,1)\). Then (3.1) holds with \(\delta \) in place of 1/2, and thus one may consider \(\mathcal {C}:=\kappa (V+1)^{\delta }\) instead of (3.3). The upcoming Lemma 4.2, Corollary 4.1 and Lemma 4.5 can then be established for \(p\in (0, 2\delta ^{-1}]\) (rather than \(p\in (0, 4]\)), which is sufficient to prove the main result.
Secondly, (A3) can be somewhat relaxed by assuming instead that
with some \(r\ge 2\) (instead of \(r=2\)). Clearly, for \(r\in (0,2)\), condition (3.7) implies (A3) (with 2A and \(2A+B\) in place of A and B, respectively). Moreover, considering (3.7) with \(t=0\) shows that V must be bounded in this case. Hence, if \(r<2\), then every probability measure has all of its moments w.r.t. V finite, and thus the assertion of Corollary 4.1 is trivially satisfied. Nevertheless, since V usually occurs in the form \(V=\rho (\cdot ,x_*)\) (with an arbitrary \(x_*\in E\)), such a case is essentially tantamount to assuming the boundedness of the state space, which is a very restrictive requirement.
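To spell out the boundedness claim (a sketch, assuming that (3.7) takes the form \(P(t)V^2\le Ae^{-\Gamma t}V^r+B\), which is consistent with the surrounding discussion): setting \(t=0\) yields

```latex
V^2(x) \;=\; P(0)V^2(x) \;\le\; A\,V^r(x) + B
\qquad\text{for all } x\in E,
```

and, when \(r<2\), the left-hand side would outgrow the right-hand side along any sequence with \(V(x_n)\rightarrow \infty \); hence \(\sup _{x\in E}V(x)<\infty \).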
Regardless of the above, we will stick with the initial version of the assumptions, with \(\delta =1/2\) and \(r=2\), to make the proofs easier to follow.
4 Proof of the Main Result
4.1 Some Auxiliary Facts
First of all, let us note that, due to the boundedness of g, it suffices to prove the CLT for \(\{{\bar{g}}(\Psi (t))\}_{t\in \mathbb {R}_+}\) along the integers.
Lemma 4.1
Let \(\mu \in \mathcal {M}_1(E)\). Then, for any \(g\in {\text {B}}_b(E)\) and
the process \(\{I(t)/\sqrt{t}\}_{t>0}\) converges in law (to some \(\mathbb {R}\)-valued random variable) under \(\mathbb {P}_{\mu }\) (as \(t\rightarrow \infty \)) whenever the sequence \(\{I(n)/\sqrt{n}\}_{n\in \mathbb {N}}\) does.
Proof
Fix an arbitrary \(\varepsilon >0\) and choose \(n_0\in \mathbb {N}\) such that \(n_0^{-1/2}<\varepsilon /(2\Vert g\Vert _{\infty })\). Then, taking into account the boundedness of g, for any \(n\ge n_0\) and any \(t\in [n,n+1)\), we get
which yields that
Finally, using the Chebyshev inequality and the fact that \(n^{-1/2}-(n+1)^{-1/2}\le n^{-3/2}\), we infer that, for every \(n\ge n_0\),
Consequently,
which implies the desired claim. \(\square \)
Additionally, as is evident from the following remark, we may assume without loss of generality that the initial distribution \(\mu \) of \(\Psi \) belongs to \(\mathcal {M}_{1,2}^V(E)\).
Remark 4.1
If (3.5) holds for all \(\mu \in \{\delta _x:\,x\in E\}\), then it is valid for all \(\mu \in \mathcal {M}_1(E)\). This follows directly from (2.6) by applying the Lebesgue dominated convergence theorem.
Another simple observation, to be used later, expresses the Lyapunov condition assumed in (A3) in terms of the function \(\mathcal {C}:E\rightarrow \mathbb {R}_+\) given by (3.3).
Lemma 4.2
If \(\{P(t)\}_{t\in \mathbb {R}_+}\) enjoys hypothesis (A3) with some Borel measurable function \(V:E\rightarrow \mathbb {R}_+\), then, for any \(p\in (0,4]\), there exist constants \(A_{p},B_{p}\ge 0\) and \(\Gamma _{p}>0\) such that the map \(\mathcal {C}:E\rightarrow \mathbb {R}_+\) given by (3.3) satisfies
Proof
Let \(x\in E\), \(t\in \mathbb {R}_+\). Using the fact that, for any \(r>0\), there exists a positive constant \(\zeta \) (precisely, \(\zeta =1\) for \(r\le 1\) or \(\zeta =2^{r-1}\) for \(r>1\)) such that
we obtain \(\mathcal {C}^p=\kappa ^p(V+1)^{p/2}\le \kappa ^p \zeta (V^{p/2}+1)\). Further, the Hölder inequality (used with exponent 4/p) yields that \(P(t)V^{p/2}(x)\le (P(t)V^2(x))^{p/4}\). Consequently, applying hypothesis (A3) and again inequality (4.1) (with \(r=p/4\in (0,1]\)), we can conclude that
where \(A_p:=\kappa ^p\zeta A^{p/4}\), \(B_p:=\kappa ^p\zeta (B^{p/4}+1)\) and \(\Gamma _p:=(p\Gamma )/4\). The proof is now complete. \(\square \)
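For the reader's convenience, the chain of estimates described in the proof can be written out in full; the following is a reconstruction from the constants displayed above, with the final inequality using (4.1) once more with \(r=p/4\):

```latex
\begin{aligned}
P(t)\mathcal{C}^p(x)
  &\le \kappa^p\zeta\bigl(P(t)V^{p/2}(x)+1\bigr)
   \le \kappa^p\zeta\Bigl(\bigl(P(t)V^2(x)\bigr)^{p/4}+1\Bigr)\\
  &\le \kappa^p\zeta\Bigl(\bigl(Ae^{-\Gamma t}V^2(x)+B\bigr)^{p/4}+1\Bigr)
   \le \kappa^p\zeta\Bigl(A^{p/4}e^{-(p\Gamma/4)t}V^{p/2}(x)+B^{p/4}+1\Bigr)\\
  &= A_p\,e^{-\Gamma_p t}\,V^{p/2}(x)+B_p.
\end{aligned}
```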
As a straightforward consequence of Lemma 4.2, we can deduce that, for any initial measure \(\nu \) with a finite second moment w.r.t. V, the fourth moment of the distribution of \(\Psi \) w.r.t. \(\mathcal {C}\) is bounded uniformly in t.
Corollary 4.1
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) fulfills hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\). Then, for every \(p\in (0,4]\) and the function \(\mathcal {C}:E\rightarrow \mathbb {R}_+\) given by (3.3), we have
Proof
Let \(p\in (0,4]\). Taking constants \(A_p,B_p\ge 0\) and \(\Gamma _p>0\) for which the assertion of Lemma 4.2 is valid, we get
for any \(t\ge 0\) and \(\mu \in \mathcal {M}_{1,2}^V(E)\). Since \(\mu _*\in \mathcal {M}_{1,2}^V(E)\) due to Lemma 3.1, applying the same estimation with \(\mu =\mu _*\) also gives \(\left\langle \mathcal {C}^p,\mu _*\right\rangle <\infty \). \(\square \)
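The estimation referred to in the proof presumably runs as follows (a sketch, using the duality \(\left\langle \mathcal {C}^p,\mu P(t)\right\rangle =\left\langle P(t)\mathcal {C}^p,\mu \right\rangle \) and, in the last step, Jensen's inequality for the concave map \(s\mapsto s^{p/4}\)):

```latex
\bigl\langle \mathcal{C}^p,\,\mu P(t)\bigr\rangle
  = \bigl\langle P(t)\mathcal{C}^p,\,\mu\bigr\rangle
  \le A_p\,e^{-\Gamma_p t}\bigl\langle V^{p/2},\mu\bigr\rangle + B_p
  \le A_p\,\bigl\langle V^2,\mu\bigr\rangle^{p/4} + B_p \;<\;\infty.
```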
4.2 A Martingale-Based Decomposition for the Given Markov Process
Suppose that the transition semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) of \(\Psi \) fulfills hypotheses (A1)–(A3) with some continuous function \(V:E\rightarrow \mathbb {R}_+\) (and therefore it is V-exponentially ergodic by Lemma 3.1). Further, fix an arbitrary \(g\in {\text {Lip}}_{b,1}(E)\), and let \({\bar{g}}:=g-\langle g,\mu _*\rangle \). Then, by (3.4), we have
for every \(x\in E\), \(t\ge 0\) and some \(\gamma >0\), whence
This, in turn, allows us to define the corrector function \(\chi :E\rightarrow \mathbb {R}\) as
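Consistently with the identity derived in Remark 4.4(ii) below, the definition (4.5) of the corrector presumably reads

```latex
\chi(x) \;:=\; \int_0^{\infty} P(u)\bar g(x)\,du, \qquad x\in E,
```

the integral converging absolutely thanks to the exponential decay of \(u\mapsto |P(u){\bar{g}}(x)|\) guaranteed by (4.3).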
Remark 4.2
The function \(\chi \) is continuous. Indeed, let \(x_0\in E\). From (4.3) and the continuity of \(\mathcal {C}\) it follows that there exists \(\delta >0\) such that, for all \(t\ge 0\) and any \(x\in E\) with \(\rho (x_0,x)<\delta \), we have
This, in turn, allows one to apply the Lebesgue dominated convergence theorem, which, together with (A1), gives \(\lim _{x\rightarrow x_0} \chi (x)=\chi (x_0)\).
Remark 4.3
Note that (4.4) implies that, given \(p>0\) and \(\mu \in \mathcal {M}_1(E)\), we have
for all \(t\ge 0\).
Remark 4.4
It is also worth paying attention to certain consequences of Fubini’s theorem:
(i)
Since the map \(\mathbb {R}_+\times \Omega \ni (t,\omega )\mapsto \Psi (t)(\omega )\) is \(\mathcal {B}(\mathbb {R}_+)\otimes \mathcal {F}_{\infty }/\mathcal {B}(E)\)-measurable, it follows that, for any \(\mu \in \mathcal {M}_1(E)\) and any \(T>0\), we have
$$\begin{aligned} \mathbb {E}_{\mu }\left( \int _0^T {\bar{g}}(\Psi (t))\,dt \right) =\int _0^T \mathbb {E}_{\mu }\left( {\bar{g}}(\Psi (t)) \right) \,dt. \end{aligned}$$
(ii)
The assumed product measurability of the map \(\mathbb {R}_+\times E\ni (u,y)\mapsto P(u)\mathbb {1}_A(y)\) with any \(A\in \mathcal {B}(E)\) (see Sect. 2.1) easily implies the measurability of \((u,y)\mapsto P(u){\bar{g}}(y)\), which, in turn, ensures that \(\mathbb {R}_+\times \Omega \ni (u,\omega )\mapsto P(u){\bar{g}}(\Psi (t)(\omega ))\) is \(\mathcal {B}(\mathbb {R}_+)\otimes \mathcal {F}(t)/\mathcal {B}(\mathbb {R})\)-measurable for any \(t\ge 0\). Moreover, it follows from (4.3) and Lemma 4.2 that the latter is also integrable with respect to \(du\otimes \mathbb {P}_x(d\omega )\) for every \(x\in E\). Hence, using (2.7), for any \(x\in E\) and \(t\ge 0\), we have
$$\begin{aligned} P(t)\chi (x)&=\mathbb {E}_x\left( \chi (\Psi (t))\right) =\mathbb {E}_x\left( \int _0^{\infty }P(u){\bar{g}}(\Psi (t))\,du \right) \\&=\int _0^{\infty } \mathbb {E}_{x}\left( P(u){\bar{g}}(\Psi (t))\right) du=\int _0^{\infty } P(t+u){\bar{g}}(x)du. \end{aligned}$$
Let us now define the processes
and observe that
Moreover, let \(\{Z(n)\}_{n\in \mathbb {N}_0}\) stand for the sequence of the increments of \(\{M(n)\}_{n\in \mathbb {N}_0}\), that is,
It is worth noting here that \(\sigma ^2\), defined by (3.6), can be now expressed as
It is well-known that \(\{M(t)\}_{t\in \mathbb {R}_+}\) is a martingale; see, e.g., [28, Proposition 5.2]. Nevertheless, for the convenience of the reader, we provide a somewhat more detailed proof of this fact below, demonstrating the finiteness of \(\sigma ^2\) at the same time.
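For orientation, the objects (4.6)–(4.10) fit the classical corrector decomposition; with the definitions above, the displays omitted here presumably read

```latex
\begin{gathered}
M(t) := \chi(\Psi(t)) - \chi(\Psi(0)) + \int_0^t \bar g(\Psi(s))\,ds,
\qquad
R(t) := \frac{\chi(\Psi(0)) - \chi(\Psi(t))}{\sqrt{t}},\\
\frac{1}{\sqrt{t}}\int_0^t \bar g(\Psi(s))\,ds = \frac{M(t)}{\sqrt{t}} + R(t),
\qquad
Z(n) := M(n) - M(n-1),
\qquad
\sigma^2 = \mathbb{E}_{\mu_*}\bigl(Z^2(1)\bigr),
\end{gathered}
```

in which case the CLT for the martingale \(\{M(n)\}_{n\in \mathbb {N}_0}\), combined with the vanishing of the remainder R(t), yields the conclusion of Theorem 3.1.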
Lemma 4.3
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous function \(V:E\rightarrow \mathbb {R}_+\). Then, for every \(\mu \in \mathcal {M}_{1,2}^V(E)\), the process \(\{M(t)\}_{t\in \mathbb {R}_+}\), given by (4.6), is a martingale in \(\mathcal {L}^4(\mathbb {P}_{\mu })\) w.r.t. to the natural filtration \(\{\mathcal {F}(t)\}_{t\in \mathbb {R}_+}\) of \(\Psi \). In particular, the variance \(\sigma ^2\), specified by (4.10), is then finite.
Proof
Let \(\mu \in \mathcal {M}_{1,2}^V(E)\) and \(t\ge 0\). Using inequality (4.1) twice, with \(r=4\) and \(\zeta =2^3\), we have
Hence, according to Remark 4.3, it follows that
which shows that \(\mathbb {E}_{\mu }\left( |M(t)|^4\right) <\infty \) due to Corollary 4.1 (applied for \(p=4\)). Clearly, since \(\mu _*\in \mathcal {M}_{1,2}^V(E)\) (according to Lemma 3.1), this also yields the finiteness of \(\sigma ^2\).
Now, let \(s,t\in \mathbb {R}_+\) be such that \(s<t\). Keeping in mind Remark 4.4(i), we can write
Then, applying the Markov property and (2.7), we see that
where the last equality follows from Remark 4.4(ii). Consequently, returning to (4.12), we finally infer that \(\mathbb {E}_{\mu }\left( M(t)|\mathcal {F}(s)\right) =M(s)\), which completes the proof. \(\square \)
Obviously, identity (4.8), combined with Lemma 4.1, reduces the proof of our main result to showing the CLT for the martingale \(\{M(n)\}_{n\in \mathbb {N}_0}\), provided that \(\{R(n)\}_{n\in \mathbb {N}}\) converges in law to 0. The latter, in turn, follows from the observation below:
Lemma 4.4
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous function \(V:E\rightarrow \mathbb {R}_+\). Then, for \(\{R(t)\}_{t>0}\) given by (4.7) and any \(\mu \in \mathcal {M}_{1,2}^V(E)\), we have \(R(t) \rightarrow 0\) in \(\mathcal {L}^1(\mathbb {P}_{\mu })\) as \(t\rightarrow \infty \).
Proof
Taking into account Remark 4.3, we see that
which, in conjunction with Corollary 4.1, gives the desired claim. \(\square \)
4.3 Verification of the Hypotheses of Theorem 2.1
The aim of this section is to show that the martingale \(\{M(n)\}_{n\in \mathbb {N}_0}\), determined by (4.6), fulfills the hypotheses (M1)–(M3) of Theorem 2.1, and thus obeys the CLT. Before we proceed to verify these conditions, let us make the following crucial observation:
Lemma 4.5
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\), and let \(\{Z(n)\}_{n\in \mathbb {N}}\) be given by (4.9). Then, for every \(p\in (0,4]\), there exists \(\Gamma _p>0\) such that, for any \(\mu \in \mathcal {M}_{1,2}^V(E)\) and certain constants \({\tilde{A}}_p,{\tilde{B}}_p\ge 0\) (depending on \(\mu \)), we have
and, in particular,
Proof
Let \(p\in (0,4]\), \(i\in \mathbb {N}\) and \(t\in \mathbb {R}_+\). Using inequality (4.1) we can choose \(\zeta >0\) so that, for every \(x\in E\),
and also
Consequently, appealing to Remark 4.3, for any \(x\in E\), we get
Now, take any constants \(\Gamma _p>0\) and \(A_p,B_p\ge 0\) for which the assertion of Lemma 4.2 is valid, and let \(\mu \in \mathcal {M}_{1,2}^V(E)\). Then
As a consequence of (4.15) and (4.17), we obtain
and, since \(\left\langle V^{p/2},\mu \right\rangle <\infty \), we can finally deduce that (4.13) holds with
Obviously, (4.14) follows directly from (4.13) (applied with \(\mu =\mu _*\)), since \(\mu _*\in \mathcal {M}_{1,2}^V(E)\) by virtue of Lemma 3.1. \(\square \)
While proving that condition (M2) holds, we will also need the continuity of the maps \(E\ni x \mapsto \mathbb {E}_x\left( M^2(t)\right) \) (for every \(t\in \mathbb {R}_+\)), which is shown in the following two lemmas.
Lemma 4.6
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\). Then the map \(E\ni x\mapsto \mathbb {E}_x(\chi (\Psi (t)))\) is also continuous for any \(t\in \mathbb {R}_+\).
Proof
Let \(t\in \mathbb {R}_+\), \(x_0\in E\), and observe that (4.3) gives
Hence, referring to Lemma 4.2 and using the continuity of \(V^{1/2}\), we can choose constants \(A_1,B_1\ge 0\) and \(\Gamma _1>0\), as well as \(\delta >0\), so that, for any \(u,v\in \mathbb {R}_+\) and all \(x\in E\) satisfying \(\rho (x_0,x)<\delta \),
Consequently, having in mind Remark 4.4(ii) and (2.7), we can apply the Lebesgue dominated convergence theorem to conclude that, for any \(x_0\in E\),
where the second equality follows from (A1). The proof is now complete. \(\square \)
Lemma 4.7
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\). Then, for every \(t\in \mathbb {R}_+\), the map \(E\ni x\mapsto \mathbb {E}_x(M^2(t))\), with M(t) given by (4.6), is continuous.
Proof
Let \(t\in \mathbb {R}_+\) and observe that, for any \(x\in E\), we have
Consequently, in view of Lemma 4.6 and the continuity of \(\chi \) (demonstrated in Remark 4.2), it suffices to show the continuity of the following maps:
According to (4.3) we have
Hence, using the Fubini theorem in a manner similar to that in Remark 4.4(ii) gives
Let us now fix an arbitrary \(x_0\in E\). Taking any constants \(A_2,B_2\ge 0\) (and \(\Gamma _2>0\)) for which the assertion of Lemma 4.2 is valid with \(p=2\), and having in mind the continuity of V, we can choose \(\delta >0\) such that, for any \(u,v\in \mathbb {R}_+\) and all \(x\in E\) with \(\rho (x_0,x)<\delta \),
Applying the Lebesgue dominated convergence theorem we can therefore conclude that
where the penultimate equality follows from (A1). Hence \(x\mapsto \mathbb {E}_x\left( \chi ^2\left( \Psi (t)\right) \right) \) is continuous at \(x_0\), and thus at every point of E.
To prove the continuity of the remaining two maps, let us first note that a direct application of Fubini’s theorem yields that, for any \(t\in \mathbb {R}_+\) and every integrable \(f:[0,t]\rightarrow \mathbb {R}\),
As a consequence, arguing analogously as in Remark 4.4(i), we get
Further, referring to the properties of conditional expectation, the Markov property and identity (2.7), we obtain
Let \(x_0\in E\). Proceeding analogously as before, we can now use (4.3) and then refer to Lemma 4.2 and the continuity of \(V^{1/2}\) to finally conclude that there exist constants \(A_1,B_1\ge 0\) and some \(\delta >0\) such that, for any \(u,v\in \mathbb {R}_+\) and all \(x\in E\) satisfying \(\rho (x_0,x)<\delta \),
This enables us to apply the Lebesgue dominated convergence theorem, which yields that
where the penultimate equality follows from (A1). In view of arbitrariness of \(x_0\), we have thus shown that the map \(x\mapsto \mathbb {E}_x((\int _0^t{\bar{g}}(\Psi (s))\,ds)^2)\) is continuous.
Finally, taking into account that
due to (4.3), and reasoning similarly as in Remark 4.4, we see that, for any \(x\in E\),
Hence, again, fixing \(x_0\in E\) and referring sequentially to (4.3), Lemma 4.2 and the continuity of \(V^{1/2}\), we can estimate the integrand on the right-hand side similarly as in (4.18) (with \(t+v\) in place of v). This, analogously as before, enables the use of the Lebesgue dominated convergence theorem to deduce the continuity of \(x\mapsto \mathbb {E}_x(\chi (\Psi (t))\int _0^{t}{\bar{g}}(\Psi (s))\,ds)\). The proof is now complete. \(\square \)
We can now proceed to verifying hypotheses (M1)–(M3).
Lemma 4.8
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\), and that \(\mu \in \mathcal {M}_{1,2}^V(E)\). Then, the sequence \(\{Z(n)\}_{n\in \mathbb {N}}\) of martingale increments, given by (4.9), fulfills property (M1) with \(\mathbb {E}=\mathbb {E}_{\mu }\).
Proof
Let \(\varepsilon >0\) and \(N\in \mathbb {N}\). By the Markov property, for every \(n\in \mathbb {N}_0\),
whence, using the properties of the conditional expectation, we get
This allows us to write
where
Now, let \(\delta \in (0,2]\). Using successively the Hölder inequality (with exponent \((2+\delta )/2\)) and the Markov inequality, we obtain
for all \(x\in E\), which, in turn, implies that
On the other hand, Lemma 4.5 guarantees that \(\sup _{N\in \mathbb {N}}\int _E\mathbb {E}_x\left( |Z(1)|^{2+\delta }\right) \mu U_N(dx)<\infty \). Hence \(\lim _{N\rightarrow \infty } \left\langle G_N,\, \mu U_N\right\rangle =0\), which, due to (4.19), gives the desired claim. \(\square \)
Lemma 4.9
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\), and that \(\mu \in \mathcal {M}_{1,2}^V(E)\). Then, the sequence \(\{Z(n)\}_{n\in \mathbb {N}}\) of martingale increments, given by (4.9), fulfills property (M2) with \(\sigma ^2\) specified by (4.10), and \(\mathbb {E}=\mathbb {E}_{\mu }\).
Proof
Obviously, the first part of condition (M2), including the finiteness of \(\sigma ^2\), follows directly from Lemma 4.5.
Keeping in mind (2.9) and using the properties of the conditional expectation, as well as the Markov property, for any \(j,k\in \mathbb {N}\), we obtain
Hence, for any given \(l,k\in \mathbb {N}\), we may write
where
Therefore, it now suffices to show that
To do this, we will use Lemma 2.1 and the Birkhoff ergodic theorem.
Let us first observe that, according to Lemma 4.7, \(H_k\) is continuous for every \(k\in \mathbb {N}\). Further, fix \(k\in \mathbb {N}\), \(q\in (1,2]\), and let \(\Gamma _{2q}>0\) and \({\tilde{A}}_{2q},{\tilde{B}}_{2q}\ge 0\) be the constants for which assertion (4.13) of Lemma 4.5 is valid with the given \(\mu \). Then, using (4.1) with \(r=q\) and \(\zeta =2^{q-1}\), as well as the Jensen inequality, for every \(l\in \mathbb {N}\), we obtain
Moreover, referring to the V-ergodicity of \(\{P(t)\}_{t\in \mathbb {R}_+}\) (established in Lemma 3.1), as well as the Cesàro mean convergence theorem, we see that \(\{l^{-1}\sum _{j=1}^l\mu P((j-1)k)\}_{l\in \mathbb {N}}\) converges weakly to \(\mu _*\). This observation, together with the above estimate and the continuity of \(H_k\), enables us to use Lemma 2.1, which gives
In view of the above, it remains to prove that \(\lim _{k\rightarrow \infty }\left\langle |H_k|, \mu _*\right\rangle =0\). To this end, consider the subfamily \(\{\Theta _k\}_{k\in \mathbb {N}_0}\) of the shift operators, defined according to (2.8), and note that \(\Theta _k=\Theta _1^k\) for every \(k\in \mathbb {N}_0\). Since \(\mu _*\) is the unique invariant distribution of \(\Psi \), it follows that \(\Theta _1\) is an (\(\mathcal {F}_{\infty }\)-measurable) ergodic transformation preserving the measure \(\mathbb {P}_{\mu _*}\), i.e., \(\mathbb {P}_{\mu _*}(\Theta _1^{-1}(F))=\mathbb {P}_{\mu _*}(F)\) for every \(F\in \mathcal {F}_{\infty }\), and \(\mathbb {P}_{\mu _*}(F)\in \{0,1\}\) whenever \(F\in \mathcal {F}_{\infty }\) satisfies \(\Theta _1^{-1}(F)=F\). Moreover, it is easily seen that \(Z^2(k)=Z^2(1)\circ \Theta _{k-1}=Z^2(1)\circ \Theta _1^{k-1}\) for any \(k\in \mathbb {N}\), and by Lemma 4.5 we know that \(Z^2(1)\in \mathcal {L}^1(\mathbb {P}_{\mu _*})\). Consequently, the von Neumann mean ergodic theorem adapted to \(\mathcal {L}^p\) spaces (see [38, Corollary 1.14.1] or [29, Theorem 1.4] and the discussion after it), together with [29, Proposition 1.6], ensures that
Finally, taking into account that
for every \(k\in \mathbb {N}\), we get \(\lim _{k\rightarrow \infty } \left\langle |H_k|, \mu _*\right\rangle =0\). Obviously, in view of (4.23), we can now conclude that (4.22) holds, which completes the proof. \(\square \)
Lemma 4.10
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\), and that \(\mu \in \mathcal {M}_{1,2}^V(E)\). Then, the sequence \(\{Z(n)\}_{n\in \mathbb {N}}\) of martingale increments, given by (4.9), enjoys property (M3) with \(\mathbb {E}=\mathbb {E}_{\mu }\).
Proof
Let \(\varepsilon >0\) and fix \(k,l,j\in \mathbb {N}\), \(n\ge (j-1)k\) arbitrarily. The Markov property yields that
Thus, taking the expectation of both sides we get
Now, summing up the above equality over all \(n=(j-1)k,\ldots ,jk-1\), then over all \(j=1,\ldots ,l\), and finally dividing both sides of the resulting identity by kl, we obtain
where
It is now clear that, to end the proof, we need to show that
From the Hölder inequality it follows that, for any \(x\in E\), \(k,l\in \mathbb {N}\) and \(n\in \mathbb {N}_0\), we have
which (again by the Hölder inequality) yields that, for any \(k,l\in \mathbb {N}\),
Now, fix \(k,l,j\in \mathbb {N}\) and \(n\in \mathbb {N}_0\). Taking, for \(p\in \{2,4\}\), any constants \(\Gamma _p>0\) and \({\tilde{A}}_p, {\tilde{B}}_p\ge 0\) for which assertion (4.13) of Lemma 4.5 is valid with p, we see that the first integral on the right-hand side of (4.24) can be estimated by
Further, applying the Markov inequality and referring to Remark 4.3, as well as to the definition of M(t) (given in (4.6)), we obtain
Then, choosing \(A_1,B_1\ge 0\) (and \(\Gamma _1>0\)) so that the assertion of Lemma 4.2 is valid, we can estimate the second integral as follows:
where \({\hat{A}}=2\Vert g\Vert _{\text {BL}}(\gamma \varepsilon )^{-1}(A_1\langle V^{1/2},\mu \rangle +B_1)\) and \({\hat{B}}=\Vert {\bar{g}}\Vert _{\infty }\varepsilon ^{-1}\). Finally, combining (4.24) with (4.25) and (4.26) gives
for any \(k,l\in \mathbb {N}\). This, in turn, implies that
which completes the proof. \(\square \)
4.4 Finalization of the Proof
Armed with the results obtained in the previous subsections, we are now in a position to finalize the proof of the main theorem.
Proof (Proof of Theorem 3.1)
First of all, according to Lemma 3.1, \(\{P(t)\}_{t\in \mathbb {R}_+}\) admits a unique invariant probability measure \(\mu _*\), which belongs to \(\mathcal {M}_{1,2}^V(E)\). Further, taking into account identity (4.8) and Lemma 4.4, along with insights from Lemma 4.1 and Remark 4.1, it suffices to prove that the CLT holds (with \(\sigma ^2\) given by (3.6)) for the sequence \(\{M(n)\}_{n\in \mathbb {N}_0}\), determined by (4.6), whenever the initial distribution \(\mu \) of \(\Psi \) belongs to \(\mathcal {M}_{1,2}^V(E)\). Since, due to Lemma 4.3, \(\{M(n)\}_{n\in \mathbb {N}_0}\) is a martingale and \(\sigma ^2<\infty \), the proof of that reduces to verifying conditions (M1)–(M3) of Theorem 2.1 (with \(\mathbb {P}=\mathbb {P}_{\mu }\)). This, in turn, has been done in Lemmas 4.8–4.10, respectively. The proof of Theorem 3.1 is therefore complete.
5 A Representation of \(\sigma ^2\)
In this section, we provide a relatively simple representation of the variance \(\sigma ^2\) involved in Theorem 3.1. The line of reasoning presented below draws heavily on ideas from [3].
Suppose that the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) enjoys conditions (A1)–(A3) with a continuous function \(V:E\rightarrow \mathbb {R}_+\), and let \(\mu _*\) denote its unique invariant probability measure. Further, consider the space \(\mathbb {L}:=\mathcal {L}^2(\mu _*)\) of all Borel measurable and \(\mu _*\)-square integrable functions from E to \(\mathbb {R}\) (precisely, the corresponding quotient space under the relation of \(\mu _*\)-a.e. equality), endowed with the norm
Since, given \(p\in \{1,2\}\), \(f\in \mathbb {L}\) and \(t\in \mathbb {R}_+\), we have \(\left\langle P(t)(|f|^p),\mu _*\right\rangle =\left\langle |f|^p,\mu _*\right\rangle <\infty \), it follows that \(\langle |f|^p, \delta _xP(t)\rangle =P(t)(|f|^p)(x)<\infty \) for \(\mu _*\text {-a.e. } x\in E\). Thus, bearing in mind (2.3), we can identify \(P(t)f^p\) with a real-valued Borel measurable function that coincides with the map \(x\mapsto \left\langle f^p,\delta _x P(t)\right\rangle \) on some Borel set of full measure \(\mu _*\) (on which \(P(t)(|f|^p)<\infty \)), and vanishes outside this set. Moreover, accounting for this identification and the fact that
we see that \(P(t)f\in \mathbb {L}\) and \(\left\| P(t)f\right\| _2\le \left\| f\right\| _2\). Consequently, \(\{P(t)\}_{t\in \mathbb {R}_+}\) can be viewed as a contraction semigroup on \(\mathbb {L}\).
Now, let \(\mathbb {L}_0\) denote the center of the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) on \(\mathbb {L}\), that is
Then, the infinitesimal generator A of \(\{P(t)\}_{t\in \mathbb {R}_+}\) can be defined on the domain
by
Before we state the main result of this section, let us observe that the corrector function \(\chi \), given by (4.5), belongs to \(\mathbb {L}\). To see this, it suffices to note that Remark 4.3 (applied with \(\mu =\mu _*\)), together with Corollary 4.1, yields that
Having established this, we can now prove the following:
Theorem 5.1
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\). Then \(\sigma ^2\), given by (4.10), takes the form
whenever \(g\in {\text {Lip}}_b(E)\), involved in (4.5), is such that, for every \(x\in E\), the map \(t\mapsto P(t)g(x)\) is continuous at \(t=0\).
Proof
First of all, observe that \(\chi \in D_A\) and \(A\chi =-{\bar{g}}\). To show this, note that
In view of the continuity at 0 of \(P(\cdot )g(x)\), \(x\in E\), we have (see, e.g., [21, Lemma 5.5.1])
Hence, using the Lebesgue dominated convergence theorem, we can conclude that
which implies the desired claim.
Now, for each \(n\in \mathbb {N}\), consider the sequence \(\{Z_n(k)\}_{k\le n}\) of the increments of \(\{M(k/n)\}_{k\le n}\), defined by
Let \(n\in \mathbb {N}\) be arbitrarily fixed. Since \(\{M(t)\}_{t\in \mathbb {R}_+}\) is a martingale with respect to the natural filtration of \(\Psi \), and \(\mu _*\) is an invariant distribution of \(\Psi \), it follows that \(\{Z_n(k)\}_{k\le n}\) forms a sequence of pairwise orthogonal and identically distributed random variables on \((\Omega ,\mathcal {F}_{\infty },\mathbb {P}_{\mu _*})\) with mean 0. In view of this, we have
and the expectation on the right-hand side can be expressed as
Let us define
Then, keeping in mind (2.7) and taking into account that
we can write
for \(\mu _*\text {-a.e. } x\in E\). This, together with the invariance of \(\mu _*\), shows that the first term on the right-hand side of (5.3) can be expressed as
Hence, putting
we can write (5.3) as
Consequently, (5.2) then takes the form
Since \(\chi ,\epsilon _n\in \mathbb {L}\) for every \(n\in \mathbb {N}\), we can apply the Cauchy-Schwarz inequality to obtain
which, together with the fact that \(\lim _{n\rightarrow \infty }\left\| \epsilon _n\right\| _2=0\), yields that
It now remains to prove that the sequences \((n\delta _n)_{n\in \mathbb {N}}\) and \((n\Delta _n)_{n\in \mathbb {N}}\) also tend to 0 as \(n\rightarrow \infty \). Note that, for every \(n\in \mathbb {N}\), we have
which means that \(0\le n\delta _n\le n^{-1}\left\| {\bar{g}}\right\| _{\infty }^2\) for all \(n\in \mathbb {N}\), whence \(n\delta _n\rightarrow 0\) as \(n\rightarrow \infty \). Moreover, using sequentially the Cauchy-Schwarz inequality, (5.4) and (5.6), we can conclude that, for each \(n\in \mathbb {N}\),
which gives
This, in conjunction with (5.5), shows that \(n\Delta _n\rightarrow 0\) as \(n\rightarrow \infty \), and therefore the proof is complete. \(\square \)
6 A Note on the Functional CLT in the Stationary Case
Although investigating the functional central limit theorem (FCLT), also known as Donsker’s invariance principle, is not the primary focus of this paper, it is worth noting that, under the employed assumptions, [3, Theorem 2.1] implies the validity of such a theorem in the case where \(\Psi \) is stationary.
To state the relevant result, assume \(\mu _*\) to be the unique invariant probability measure of \(\{P(t)\}_{t\in \mathbb {R}_+}\) and, given \(g\in {\text {Lip}}_b(E)\), consider the sequence \(\big \{\Gamma _g^{(n)}\big \}_{n\in \mathbb {N}}\) of stochastic processes defined by
with \({\bar{g}}=g-\left\langle g,\mu _*\right\rangle \). Further, let \({\text {C}}_0(\mathbb {R}_+)\) be the space of all real-valued continuous functions on \(\mathbb {R}_+\) starting at zero, endowed with the topology of uniform convergence on compact sets. Since g is bounded, it is easily seen that the processes \(\Gamma _{{\bar{g}}}^{(n)}\), \(n\in \mathbb {N}\), can be regarded as random variables with values in \({\text {C}}_0(\mathbb {R}_+)\). Apart from that, let \(\mathbb {W}_{\sigma }\) stand for the Wiener measure on (the Borel \(\sigma \)-field of) \({\text {C}}_0(\mathbb {R}_+)\) with zero drift and variance \(\sigma ^2\).
When \(\mu \) is the initial distribution of \(\Psi \), the process \(\{{\bar{g}}(\Psi (t))\}_{t\in \mathbb {R}_+}\) is said to obey the FCLT if, for some \(\sigma \ge 0\), the distributions of \(\Gamma _{{\bar{g}}}^{(n)}\) on \({\text {C}}_0(\mathbb {R}_+)\) (under \(\mathbb {P}_{\mu }\)) converge weakly to the Wiener measure \(\mathbb {W}_{\sigma }\), i.e.,
Proposition 6.1
Suppose that \(\{P(t)\}_{t\in \mathbb {R}_+}\) satisfies hypotheses (A1)–(A3) with some continuous \(V:E\rightarrow \mathbb {R}_+\), and that \(t\mapsto P(t)g(x)\) is continuous at \(t=0\) for every \(x\in E\). Furthermore, assume that \(\Psi \) is progressively measurable and stationary, i.e., its initial distribution is the (unique) invariant one \(\mu _*\). Then, for every \(g\in {\text {Lip}}_b(E)\), the process \(\{{\bar{g}}(\Psi (t))\}_{t\in \mathbb {R}_+}\) obeys the FCLT with \(\sigma ^2\) given by (5.1), that is, (6.1) holds with \(\mu =\mu _*\).
Proof
First of all, by Lemma 3.1, the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) admits a unique invariant measure \(\mu _*\in \mathcal {M}_1(E)\) and, therefore, \(\Psi \) is ergodic in the sense of [3]; i.e., \(\mathbb {P}_{\mu _*}(F)\in \{0,1\}\) for every \(F\in \mathcal {F}_{\infty }\) satisfying \(\Theta _t^{-1}(F)=F\) for all \(t>0\). Further, let \(A: D_A\rightarrow \mathbb {L}_0\) be the infinitesimal generator of \(\{P(t)\}_{t\in \mathbb {R}_+}\) defined in Sect. 5, and let \(\chi \) denote the corrector function, specified by (4.5). From the first step of the proof of Theorem 5.1 it follows that \(-\chi \in D_A\) and \({\bar{g}}=A(-\chi )\). This, in particular, means that \({\bar{g}}\) is in the range of A. Hence, we now see that the assertion follows directly from [3, Theorem 2.1]. \(\square \)
7 An Example of Application to Some PDMPs
We shall end this paper by applying our main result, i.e., Theorem 3.1 (and Proposition 6.1), to establish the CLT (and, in the stationary case, the FCLT) for the PDMPs considered in [12]. More specifically, we will be concerned with a process involving a deterministic motion governed by a finite number of semiflows, punctuated by random jumps occurring after independent, exponentially distributed waiting times \(\Delta \tau _n\) with a common rate \(\lambda \). The state right after a jump depends randomly on the one immediately preceding the jump, with its probability distribution governed by an arbitrary transition law J.
Let \((Y,\rho _Y)\) be a complete separable metric space, and let I be a finite set, endowed with the discrete metric \(\textbf{d}\), i.e., \(\textbf{d}(i,j)=1\) for \(i\ne j\) and \(\textbf{d}(i,j)=0\) otherwise. Moreover, put \(X:=Y\times I\), \({\bar{X}}:=X\times \mathbb {R}_+\), and let \(\rho _{X,c}\) denote the metric in X defined by
where c is a given positive constant.
Further, suppose that we are given an arbitrary stochastic kernel J on \(Y\times \mathcal {B}(Y)\) and some constant \(\lambda >0\). In addition to that, consider a stochastic matrix \(\{\pi _{ij}\}_{i,j\in I}\subset \mathbb {R}_+\) such that \(\min _{i\in I} \pi _{ij_0}>0\) for some \(j_0\in I\) and a collection \(\{S_i\}_{i\in I}\) of (jointly) continuous semiflows from \(\mathbb {R}_+\times Y\) to Y. By saying that \(S_i\) is a semiflow we mean as usual that
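For completeness, the standard semiflow requirements (which the omitted display presumably states) are

```latex
S_i(0,\,y) = y
\quad\text{and}\quad
S_i(s+t,\,y) = S_i\bigl(s,\,S_i(t,y)\bigr)
\qquad\text{for all } s,t\in\mathbb{R}_+,\ y\in Y.
```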
Obviously, in practical applications, one usually deals with semiflows generated by unique solutions to certain particular Cauchy problems for autonomous differential equations (see, e.g., [12, Examples 7.2, 7.3]), and their properties are investigated through the operators involved in these equations (such as, e.g., smooth vector fields on \(\mathbb {R}^d\) in [2, §5] or [1, §4]).
We shall investigate a stochastic process \(\Psi :=\{(Y(t),\xi (t))\}_{t\in \mathbb {R}_+}\), evolving on the space \((X,\rho _{X,c})\) in such a way that
and \({\bar{\Phi }}:=\{(Y(\tau _n),\xi (\tau _n),\tau _n)\}_{n\in \mathbb {N}_0}=\{(\Psi (\tau _n),\tau _n)\}_{n\in \mathbb {N}_0}\) is an \({\bar{X}}\)-valued time-homogeneous Markov chain with one-step transition law \({\bar{P}}:{\bar{X}}\times \mathcal {B}({\bar{X}})\rightarrow [0,1]\) given by
for any \(y\in Y\), \(i\in I\), \(s\in \mathbb {R}_+\) and \({\bar{A}}\in \mathcal {B}({\bar{X}})\).
Obviously, \(\Phi :=\{\Psi (\tau _n)\}_{n\in \mathbb {N}_0}\), \(\{\xi (\tau _n)\}_{n\in \mathbb {N}_0}\) and \(\{\tau _n\}_{n\in \mathbb {N}_0}\) are then also Markov chains w.r.t. their own natural filtrations, and, for every \(n\in \mathbb {N}_0\), their transition laws satisfy
where \({\bar{\mu }}=\mu \otimes \delta _0\), and \(\mu \in \mathcal {M}_1(X)\) stands for the initial measure of \(\Phi \) (and thus of \(\Psi \)). Importantly, (7.4) implies that the increments \(\Delta \tau _n:=\tau _{n}-\tau _{n-1}\), \(n\in \mathbb {N}\), form a sequence of independent and exponentially distributed random variables with the same rate \(\lambda \), and therefore \(\tau _n\uparrow \infty \), as \(n\rightarrow \infty \), \(\mathbb {P}_{{\bar{\mu }}}\)-a.s. (which, in turn, yields that (7.2) is well-defined).
It is not hard to check that \(\Psi \), defined as above, is a (piecewise-deterministic) time-homogeneous Markov process. Clearly, such a process is progressively measurable, as it has right-continuous sample paths. In the remainder of the paper, \(\{P(t)\}_{t\in \mathbb {R}_+}\) will stand for the transition semigroup of this process. Let us emphasize here that, since \(\{P(t)\}_{t\in \mathbb {R}_+}\) is stochastically continuous (at \(t=0\)) by [12, Lemma 5.1 (iii)], it indeed also satisfies the last condition required in the definition of a transition semigroup adopted in this paper, i.e., the map \((t,x)\mapsto P(t)(x,A)\) is \(\mathcal {B}(\mathbb {R}_+\times E)/\mathcal {B}(\mathbb {R})\)-measurable (see [40, Proposition 3.4.5]).
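To fix intuition, the dynamics just described can be simulated directly in a concrete toy instance. The sketch below is purely illustrative and all of its concrete ingredients are our own choices, not the paper's: \(Y=\mathbb {R}\), \(I=\{0,1\}\), two contracting semiflows, a "halve and perturb" kernel J, and uniform switching \(\pi _{ij}=1/2\).

```python
import math
import random

def simulate_pdmp(y0, i0, t_max, lam=1.0, seed=0):
    """Simulate one sample path of a toy PDMP on X = R x {0, 1}:
    deterministic motion along the current semiflow S_i, interrupted
    at the jump times of a Poisson(lam) process, where the state is
    redrawn via a toy kernel J and the flow index via pi_ij = 1/2.
    Returns Psi(t_max) = (Y(t_max), xi(t_max))."""
    rng = random.Random(seed)
    flows = [
        lambda s, y: y * math.exp(-s),                # S_0: decay toward 0
        lambda s, y: 1.0 + (y - 1.0) * math.exp(-s),  # S_1: decay toward 1
    ]
    y, i, t = y0, i0, 0.0
    while True:
        dt = rng.expovariate(lam)          # Delta tau_n ~ Exp(lam), i.i.d.
        if t + dt >= t_max:                # no further jump before t_max:
            return flows[i](t_max - t, y), i
        y = flows[i](dt, y)                # flow until the jump time tau_n
        y = 0.5 * y + rng.gauss(0.0, 0.1)  # post-jump state drawn from J
        i = rng.randrange(2)               # switch semiflow (uniform pi)
        t += dt

y_T, i_T = simulate_pdmp(5.0, 0, 10.0, seed=1)
```

Note how the construction mirrors the definition of \(\Psi \): between consecutive jump times the path follows the current semiflow, and at each \(\tau _n\) both the continuous coordinate and the flow index are resampled.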
In [12], we have proposed conditions (J1), (J2) on the kernel J and (S1)-(S3) on the semiflows \(S_i\) under which \(\{P(t)\}_{t\in \mathbb {R}_+}\) enjoys hypothesis (A2) with \(\rho =\rho _{X,c}\), specified by (7.1), and V given by
with some arbitrarily fixed \(y^*\in Y\), provided that the constants involved in [12, (J1) and (S2)] are interrelated by inequality [12, (4.8)], and that c is sufficiently large. More precisely, this follows from [12, Lemma 6.2], applied together with [12, Proposition 7.1]. Moreover, statement (i) of [12, Lemma 5.1] yields that, if J is Feller, then so is \(\{P(t)\}_{t\in \mathbb {R}_+}\), i.e., (A1) holds.
To apply Theorem 3.1 (and Proposition 6.1), it therefore suffices to verify when (A3) is met. We will show that this hypothesis does hold upon assuming the following conditions:
-
(J1’)
There exist \(a,b\ge 0\) for which
$$\begin{aligned} J\rho _Y^2(\cdot ,y^*)(y)\le a \rho _Y^2(y,y^*)+b\quad \text {for}\quad y\in Y, \end{aligned}$$
-
(S0)
There exist \(R,M\ge 0\) and \(N\in \mathbb {N}\) such that
$$\begin{aligned} \rho _Y(S_i(t,y), y^*)\le R\,\rho _Y(y,y^*)+M\left( t^N+1\right) \quad \text {for}\quad y\in Y,\;t\ge 0,\; i\in I; \end{aligned}$$
with \(2aR^2<1\). Obviously, [12, (J1)] can be derived from (J1’) by using the Hölder inequality, which shows that the former holds with \({\tilde{a}}:=\sqrt{a}\) and \({\tilde{b}}:=\sqrt{b}\). At the end of this section, we will also demonstrate how to link (S0) with the aforementioned hypotheses [12, (S1)-(S3)].
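For the reader's convenience, the derivation just indicated can be written out in one line, using Jensen's inequality for the probability measure \(J(y,\cdot )\) together with the subadditivity of the square root:
$$\begin{aligned} J\rho _Y(\cdot ,y^*)(y)\le \left( J\rho _Y^2(\cdot ,y^*)(y)\right) ^{1/2}\le \left( a\,\rho _Y^2(y,y^*)+b\right) ^{1/2}\le \sqrt{a}\,\rho _Y(y,y^*)+\sqrt{b}\quad \text {for}\quad y\in Y, \end{aligned}$$
which is exactly [12, (J1)] with \({\tilde{a}}=\sqrt{a}\) and \({\tilde{b}}=\sqrt{b}\).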
Lemma 7.1
Suppose that (J1’) and (S0) hold with \(a,R\ge 0\) such that \(2aR^2<1\). Then there exist constants \({\Gamma }>0\) and \(C\ge 0\) such that, for every \(t_0\ge 0\) and the function \(U_{t_0}:{\bar{X}}\rightarrow \mathbb {R}_+\) given by
we have
Proof
Fix \(t_0\ge 0\), and define \(\eta :=2a R^2<1\) and \(m:=2N\) with N given in (S0). Then, using sequentially (J1’), (S0) and (4.1), we infer that, for any \((y,i)\in X\) and \(t\ge 0\),
In what follows, we will show inductively that for every \(n\in \mathbb {N}\)
For \(n=1\) and any \((y,i,s)\in {\bar{X}}\) we get
and therefore from (7.7) it follows that, for \(s\le t_0\),
and \({\bar{P}}U_{t_0}(y,i,s)=0\) for \(s>t_0\). Hence (7.8) holds with \(n=1\). Now, suppose that (7.8) holds for some arbitrarily fixed \(n\in \mathbb {N}\). Then, by the identity \({\bar{P}}^{n+1}U_{t_0}={\bar{P}}({\bar{P}}^n U_{t_0})\) and the induction hypothesis, we obtain
Consequently, using again (7.7), for \(s\le t_0\), we get
Further, taking into account that
we can complete the above estimate (in the case where \(s\le t_0\)) as follows:
Obviously, \({\bar{P}}^{n+1}U_{t_0}(y,i,s)=0\) for \(s>t_0\). According to the induction principle, we can therefore conclude that (7.8) indeed holds for all \(n\in \mathbb {N}\).
Now, observe that inequality (7.8) applied with \(s=0\) gives
for all \((y,i)\in X\) and \(n\in \mathbb {N}_0\) (for \(n=0\), this follows trivially from the definition of \(U_{t_0}\)). Finally, we obtain
where \({{\Gamma }}:=(1-\eta )\lambda >0\) and \( {C}:=D(1-\eta )^{-1}\left( 1+\lambda ^{-m}\right) \ge 0\), which completes the proof. \(\square \)
Proposition 7.1
Suppose that (J1’) and (S0) hold with \(a,R\ge 0\) such that \(2aR^2<1\). Then \(\{P(t)\}_{t\in \mathbb {R}_+}\) enjoys hypothesis (A3) with V given by (7.5).
Proof
Let \({{\Gamma }}>0\) and \(C\ge 0\) be the constants for which the assertion of Lemma 7.1 is valid. Further, fix \(t\ge 0\), \(x=(y,i)\in X\), and put \({\bar{x}}:=(x,0)\). Then, we can write
Now, let \(\{\mathcal {F}_n\}_{n\in \mathbb {N}_0}\) denote the natural filtration of the chain \(\Phi \), and fix an arbitrary \(n\in \mathbb {N}_0\). Taking into account that \(\mathbb {P}_{{\bar{x}}}(\tau _{n+1}>t\,|\,\mathcal {F}_n)=\mathbb {P}_x(\tau _{n+1}>t\,|\,\tau _n)=e^{-\lambda (t-\tau _n)}\) on the set \(\{\tau _n\le t\}\) (which follows from (7.4)), we get
Further, letting \(m:=2N\), and using (S0) and (4.1), we can conclude that
Taking the expectation of both sides of this inequality gives
If we now sum up both sides of (7.10) over all \(n\in \mathbb {N}_0\), then, returning to (7.9), we can deduce that
On the other hand, it follows from Lemma 7.1 that
where \(U_t:{\bar{X}}\rightarrow \mathbb {R}_+\) is given by (7.6) (with t in place of \(t_0\)). Moreover, having in mind that \(\tau _n\) has the Erlang distribution, we get
Finally, applying (7.11)–(7.13), we infer that
with \(A:=2R^2\) and \(B:=2\left( R^2 {C}+2\,M^2\left( m!\lambda ^{-m}+1 \right) +2\,M^2\sup _{t \ge 0}e^{-\lambda t} \left( t^m+1\right) \right) \). \(\square \)
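The Erlang moment identity invoked above, \(\mathbb {E}\tau _n^m=(n+m-1)!/((n-1)!\,\lambda ^m)\) for \(\tau _n\) being a sum of n i.i.d. \(\text {Exp}(\lambda )\) waiting times, is easy to sanity-check by simulation. The sketch below is a standalone numerical illustration (the function name is ours), not part of the proof:

```python
import math
import random

def erlang_moment_mc(n, lam, m, samples=200_000, seed=0):
    """Monte Carlo estimate of E[tau_n^m], where tau_n is a sum of n
    i.i.d. Exp(lam) variables and hence Erlang(n, lam)-distributed."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        tau = sum(rng.expovariate(lam) for _ in range(n))
        total += tau ** m
    return total / samples

# Closed form: E[tau_n^m] = (n + m - 1)! / ((n - 1)! * lam**m);
# here n = 3, m = 2, lam = 2, so the exact value is 4!/(2! * 2^2) = 3.
exact = math.factorial(4) / (math.factorial(2) * 2.0 ** 2)
approx = erlang_moment_mc(3, 2.0, 2)
```

With the seed fixed, the Monte Carlo estimate lands within a small tolerance of the exact value 3.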
Let us now consider two conditions constituting strengthened forms of [12, (S1)], namely:
-
(S1\(^{*}\))
There exist \(M\ge 0\) and \(N\in \mathbb {N}\) such that
$$\begin{aligned} \max _{i\in I}\sup _{y\in Y}\rho _Y(S_i(t,y), y)\le M\left( t^N+1\right) \quad \text {for}\quad t\ge 0; \end{aligned}$$
-
(S1’)
There exist \(M\ge 0\) and \(N\in \mathbb {N}\) such that
$$\begin{aligned} \max _{i\in I} \rho _Y(S_i(t,y^*), y^*)\le M\left( t^N+1\right) \quad \text {for}\quad t\ge 0. \end{aligned}$$
In addition to that, recall condition [12, (S2)]:
-
(S2)
There exist \(L\ge 1\) and \(\alpha <\lambda \) such that
$$\begin{aligned} \rho _Y(S_i(t,y_1),S_i(t,y_2))\le Le^{\alpha t}\rho _Y(y_1,y_2)\quad \text {for}\quad t\ge 0,\;y_1,y_2\in Y,\;i\in I. \end{aligned}$$
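As a simple illustration of (S2), the semiflow of the dissipative scalar ODE \(y'=1-y\), namely \(S(t,y)=1+(y-1)e^{-t}\), satisfies (S2) with \(L=1\) and \(\alpha =-1\) (and also (S1') with \(y^*=1\)). The sketch below (an illustrative numerical check of ours, not part of the argument) confirms the exponential contraction of two trajectories via a crude Euler integration:

```python
import math

def euler_flow(f, y0, t, steps=10_000):
    """Explicit Euler approximation of the semiflow S(t, y0) of y' = f(y)."""
    y, h = y0, t / steps
    for _ in range(steps):
        y += h * f(y)
    return y

f = lambda y: 1.0 - y          # dissipative: flow S(t, y) = 1 + (y - 1)e^{-t}
t, y1, y2 = 2.0, 5.0, -3.0
gap0 = abs(y1 - y2)
gap_t = abs(euler_flow(f, y1, t) - euler_flow(f, y2, t))
# (S2) with L = 1, alpha = -1 predicts:  gap_t <= exp(-t) * gap0
```

For this linear ODE the Euler gap is exactly \((1-h)^{\text {steps}}\,\text {gap}_0\), which lies slightly below the continuous-time bound \(e^{-t}\,\text {gap}_0\).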
Remark 7.1
Using the triangle inequality, it is easy to verify the following implications:
where [12, (S3)] is obtained (depending on the assumptions) with
It is worth stressing here that conditions (S1’) and (S2) with \(\alpha \le 0\) and \(L=1\) are met, e.g., by the semiflows generated by a wide class of dissipative differential equations in Hilbert spaces. If the operators involved in such equations are bounded, then even (S1\(^{*}\)) holds. This is explained in detail in [12, Remark 4.3], which, in turn, is based on [23, Chapter 5.2].
Formulating the main result of this section, apart from hypotheses (S1\(^{*}\)), (S1’), (S2) and (J1’), we shall also use condition [12, (J2)], which ensures the existence of a Markovian coupling of J with certain specific properties. Let us quote it to make our result self-contained. Below, a substochastic kernel will mean a map defined similarly to a stochastic kernel (see Sect. 2.1), but allowed to be a subprobability measure with respect to the second variable.
-
(J2)
There exists a substochastic kernel \(Q_J: Y^2\times \mathcal {B}(Y^2)\rightarrow [0,1]\) satisfying
$$\begin{aligned} Q_J((y_1,y_2), B\times Y)\le J(y_1,B)\quad \text {and}\quad Q_J((y_1,y_2), Y\times B)\le J(y_2,B) \end{aligned}$$
for any \(y_1,y_2\in Y\) and \(B\in \mathcal {B}(Y)\), such that, for certain constants \({\tilde{a}}, l\ge 0\), we have
$$\begin{aligned}&\int _{Y^2}\rho _Y(u_1,u_2)\,Q_J((y_1,y_2),du_1\times du_2)\le {\tilde{a}}\rho _Y(y_1,y_2)\quad \text {for any}\quad y_1,y_2\in Y,\\&\quad \inf _{(y_1,y_2)\in Y^2} Q_J\big ((y_1,y_2), \{(u_1,u_2)\in Y^2:\, \rho _Y(u_1,u_2)\le {\tilde{a}}\rho _Y(y_1,y_2) \}\big )>0,\\&\quad Q_J((y_1,y_2), Y^2)\ge 1-l\rho _Y(y_1,y_2) \quad \text {for any}\quad y_1,y_2\in Y. \end{aligned}$$
We can now state the announced CLT-type criterion for the PDMP under consideration.
Theorem 7.1
Suppose that the kernel J is Feller, (J1’) holds (with certain \(a,b\ge 0\)), and that (J2) is satisfied with \({\tilde{a}}:=\sqrt{a}\). Moreover, assume that one of the following statements is fulfilled:
-
(i)
(S1\(^{*}\)) and (S2) hold with \(\alpha <\lambda \) and \(L\ge 1\) such that \(a<\min \{L^{-2}(1-\alpha \lambda ^{-1})^2,\, 2^{-1}\}\),
-
(ii)
(S1’) and (S2) are satisfied with \(\alpha \le 0\) and \(L\ge 1\) such that \(a<(2\,L^2)^{-1}\).
Then the assertions of both Theorem 3.1 and Proposition 6.1 are valid for the semigroup \(\{P(t)\}_{t\in \mathbb {R}_+}\) determined by (7.2) and (7.3), with V given by (7.5) and \(\sigma ^2\) satisfying (5.1), provided that the constant c, involved in (7.1), is sufficiently large.
Proof
As mentioned earlier, (J1’) implies [12, (J1)] with \({\tilde{a}}=\sqrt{a}\), and (J2) is just assumed. Further, according to Remark 7.1, each of hypotheses (i), (ii) guarantees that [12, (S1)-(S3)] hold with \(\alpha ,L\) satisfying the inequality \({\tilde{a}}L+\alpha /\lambda <1\), which coincides with that assumed in [12, (4.8)]. As has already been said, this, together with the Feller property of J, suffices for hypotheses (A1) and (A2) to hold. On the other hand, appealing again to Remark 7.1, we also know that (S0) is satisfied with \(R=1\) in case (i) or with \(R=L\) in case (ii), and \(2aR^2<1\) in each of these cases. Consequently, Proposition 7.1 yields that hypothesis (A3) holds as well. Hence, Theorem 3.1 is indeed valid for the semigroup being considered.
Furthermore, since \(\{P(t)\}_{t\in \mathbb {R}_+}\) is stochastically continuous (at \(t=0\)) by statement (iii) of [12, Lemma 5.1], it follows that \(\sigma ^2\) can be represented as indicated in Theorem 5.1, and that Proposition 6.1 is also applicable to the given semigroup. \(\square \)
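The conclusion of Theorem 7.1 can also be observed empirically. The sketch below draws Monte Carlo samples of \(\Gamma _{{\bar{g}}}(t)=t^{-1/2}\int _0^t {\bar{g}}(\Psi (s))\,ds\) for a toy instance of the PDMP (two contracting flows on \(\mathbb {R}\), jumps \(y\mapsto y/2+\text {noise}\), uniform switching). Every concrete choice here, including \(g=\tanh \) and the centering by a long-run time average in place of \(\int _X g\,d\mu _*\), is ours and purely for illustration:

```python
import math
import random

def clt_samples(T, reps, lam=1.0, dt=0.02, seed=3):
    """Monte Carlo sample of Gamma(T) = T^{-1/2} * int_0^T (g(Psi(s)) - m) ds
    for a toy PDMP on R x {0, 1}: two contracting semiflows, post-jump states
    y -> y/2 + Gaussian noise (the kernel J), and uniform switching
    (pi_ij = 1/2).  Here g = tanh, and m is a long-run time-average
    estimate of the mean of g under the invariant law."""
    rng = random.Random(seed)
    flows = [lambda s, y: y * math.exp(-s),                 # S_0: toward 0
             lambda s, y: 1.0 + (y - 1.0) * math.exp(-s)]   # S_1: toward 1

    def integral_of_g(horizon):
        y, i, t = 0.0, 0, 0.0
        next_jump = rng.expovariate(lam)
        acc = 0.0
        while t < horizon:
            step = min(dt, next_jump - t, horizon - t)
            acc += math.tanh(y) * step          # left-endpoint quadrature
            y = flows[i](step, y)
            t += step
            if next_jump - t <= 1e-9:           # a jump occurs now
                y = 0.5 * y + rng.gauss(0.0, 0.1)
                i = rng.randrange(2)
                next_jump = t + rng.expovariate(lam)
        return acc

    m = integral_of_g(2000.0) / 2000.0          # estimate of int g dmu_*
    return [(integral_of_g(T) - m * T) / math.sqrt(T) for _ in range(reps)]

sample = clt_samples(50.0, 200)
```

For moderate horizons the resulting sample is approximately centered, with a stable variance, as the CLT predicts; a histogram of `sample` is close to a centered Gaussian.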
Remark 7.2
By saying that c should be sufficiently large, we mean that it is supposed to be greater than some specific value depending on the constants and functions involved in conditions (J1’), (S2) and [12, (S3)] (which ensures that [12, Proposition 7.1] is valid). To be more precise, letting \(\varphi :\mathbb {R}_+\rightarrow \mathbb {R}_+\) and \(\mathcal {L}:Y\rightarrow \mathbb {R}_+\) be the functions for which (S3) holds (see Remark 7.1), one may require that
where \(K_{\varphi }:=\int _0^{\infty } \varphi (t)e^{-\lambda t}\,dt\), \(M_{\varphi }:=\sup _{t\in \mathbb {T}}\varphi (t)\) with \(\mathbb {T}\subset [0,\infty )\) being an arbitrary bounded interval of positive length for which \(\sup _{t\in \mathbb {T}}e^{\alpha t}\le \lambda (\lambda -\alpha )^{-1}\), and \(M_{\mathcal {L}}:=\sup _{\rho _Y(y_*,y)<r}\mathcal {L}(y)\) with \(r:=4b(1-a)^{-1}\) and
Important examples of the PDMPs under consideration are those with J defined as the transition law of a random iterated function system. In this case, J takes the form
where \(\{w_{\theta }:\,\theta \in \Theta \}\) is an arbitrary family of continuous transformations from Y to itself, indexed by the elements of some topological measure space \((\Theta ,\vartheta )\), and \(\{p_{\theta }:\,\theta \in \Theta \}\) is an associated family of state-dependent densities with respect to \(\vartheta \). In [12, Proposition 7.2], we have provided a set of conditions guaranteeing that J of this form satisfies hypotheses [12, (J1) and (J2)] with \(Q_J\) given by
for any \(y_1,y_2\in Y\) and \(C\in \mathcal {B}(Y^2)\). These conditions can be easily modified to also ensure (J1’), as required in Theorem 7.1. The details are left to the reader.
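For a finite index set \(\Theta \), sampling from a kernel of this form reduces to drawing an index with the place-dependent probabilities and applying the corresponding map. The sketch below is a discrete toy instance; the maps \(w_\theta \) and densities \(p_\theta \) are our own illustrative choices:

```python
import random

def ifs_step(y, maps, probs, rng):
    """One draw from the IFS transition law
    J(y, .) = sum_theta p_theta(y) * delta_{w_theta(y)}
    for a finite index set Theta (a discrete instance of the kernel
    above; maps and probabilities are purely illustrative)."""
    weights = [p(y) for p in probs]
    theta = rng.choices(range(len(maps)), weights=weights)[0]
    return maps[theta](y)

maps = [lambda y: 0.5 * y,                    # w_0: contraction toward 0
        lambda y: 0.5 * y + 1.0]              # w_1: contraction toward 2
probs = [lambda y: 1.0 / (1.0 + abs(y)),      # p_0(y), place-dependent
         lambda y: abs(y) / (1.0 + abs(y))]   # p_1(y) = 1 - p_0(y)

rng = random.Random(0)
y, traj = 0.3, []
for _ in range(1000):
    y = ifs_step(y, maps, probs, rng)
    traj.append(y)
```

Since both maps leave the interval [0, 2] invariant, the trajectory stays there; this is the discrete-\(\Theta \) analogue of iterating J as the jump kernel of the PDMP.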
References
Benaïm, M., Le Borgne, S., Malrieu, F., Zitt, P.-A.: Quantitative ergodicity for some switched dynamical systems. Electron. Commun. Probab. 17, 1–14 (2012)
Benaïm, M., Le Borgne, S., Malrieu, F., Zitt, P.-A.: Qualitative properties of certain piecewise deterministic Markov processes. Ann. Inst. Henri Poincaré Probab. 51(3), 1040–1075 (2015)
Bhattacharya, R.N.: On the functional central limit theorem and the law of the iterated logarithm for Markov processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete 60, 185–201 (1982)
Billingsley, P.: Probability and Measure, 2nd edn. Wiley, New York (1986)
Blumenthal, R.M., Getoor, R.K.: Markov Processes and Potential Theory. Pure and Applied Mathematics. Academic Press, New York (1968)
Bogachev, V.I.: Measure Theory, vol. II. Springer-Verlag, Berlin (2007)
Brown, B.M.: Martingale central limit theorems. Ann. Math. Stat. 42(1), 59–66 (1971)
Cloez, B., Hairer, M.: Exponential ergodicity for Markov processes with random switching. Bernoulli 21(1), 505–536 (2015)
Czapla, D., Horbacz, K., Wojewódka-Ściążko, H.: Ergodic properties of some piecewise-deterministic Markov process with application to gene expression modelling. Stoch. Proc. Appl. 130(5), 2851–2885 (2020)
Czapla, D., Horbacz, K., Wojewódka-Ściążko, H.: A useful version of the central limit theorem for a general class of Markov chains. J. Math. Anal. Appl. 484(1), 123725 (2020)
Czapla, D., Horbacz, K., Wojewódka-Ściążko, H.: The Strassen invariance principle for certain non-stationary Markov-Feller chains. Asymptot. Anal. 121(1), 1–34 (2021)
Czapla, D., Horbacz, K., Wojewódka-Ściążko, H.: Exponential ergodicity in the bounded-Lipschitz distance for some piecewise-deterministic Markov processes with random switching between flows. Nonlinear Anal. 215, 112678 (2022)
Czapla, D., Kubieniec, J.: Exponential ergodicity of some Markov dynamical systems with application to a Poisson driven stochastic differential equation. Dyn. Syst. 34(1), 130–156 (2019)
Derriennic, Y., Lin, M.: The central limit theorem for Markov chains with normal transition operators, started at a point. Probab. Theory Related Fields 119(4), 508–528 (2001)
Dudley, R.: Convergence of Baire measures. Studia Math. 27, 251–268 (1966)
Ethier, S.N., Kurtz, T.G.: Markov Processes. Characterization and Convergence. Wiley, Hoboken, New Jersey (1986)
Gordin, M.I., Lifšic, B.A.: Central limit theorem for stationary Markov processes. Dokl. Akad. Nauk SSSR 239(4), 766–767 (1978)
Gordin, M.I., Lifšic, B.A.: A remark about a Markov process with normal transition operator. In: Third Vilnius Conference on Probability and Statistics, vol. 1, pp. 147–148 (1981)
Gulgowski, J., Hille, S.C., Szarek, T., Ziemlańska, M.A.: Central limit theorem for some non-stationary Markov chains. Stud. Math. 246, 109–131 (2019)
Hairer, M.: Exponential mixing properties of stochastic PDEs through asymptotic coupling. Probab. Theory Related Fields 124, 345 (2002)
Heil, C.: Introduction to Real Analysis. Springer (2019)
Holzmann, H.: The central limit theorem for stationary Markov processes with normal generator–with applications to hypergroups. Stochastics 77(4), 371–380 (2005)
Ito, K., Kappel, F.: Evolution Equations and Approximations. Advances in Mathematics for Applied Sciences, vol. 61. World Scientific, New Jersey (2002)
Jin, R., Tan, A.: Central limit theorems for Markov chains based on their convergence rates in Wasserstein distance. Preprint, arXiv:2002.09427 (2020)
Kapica, R., Ślęczka, M.: Random iteration with place dependent probabilities. Probab. Math. Statist. 40(1), 119–137 (2020)
Kipnis, C., Varadhan, S.R.S.: Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys. 104, 1–19 (1986)
Komorowski, T., Peszat, S., Szarek, T.: Passive tracer in a flow corresponding to two-dimensional stochastic Navier–Stokes equations. Nonlinearity 26(7), 1999–2026 (2013)
Komorowski, T., Walczuk, A.: Central limit theorem for Markov processes with spectral gap in the Wasserstein metric. Stoch. Proc. Appl. 122(5), 2155–2184 (2012)
Krengel, U.: Ergodic Theorems. With a supplement by Antoine Brunel. De Gruyter Studies in Mathematics, vol. 6. Walter de Gruyter, Berlin, New York (1985)
Lasota, A.: From fractals to stochastic differential equations. In: Garbaczewski, P., Wolf, M., Weron, A. (eds.) Chaos – The Interplay Between Stochastic and Deterministic Behaviour (Proceedings of the XXXIst Winter School of Theoretical Physics, Karpacz, Poland, 1995), Lecture Notes in Physics, vol. 457, pp. 235–255. Springer, Berlin (1995)
Lévy, P.: Propriétés asymptotiques des sommes de variables indépendantes ou enchainées. Journal des mathématiques pures et appliquées Series 9 14(4), 347–402 (1935)
Loève, M.: Probability Theory 1, 4th edn. Springer-Verlag, New York (1977)
Maxwell, M., Woodroofe, M.: Central limit theorems for additive functionals of Markov chains. Ann. Probab. 28, 713–724 (2000)
Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Springer-Verlag, Berlin, Heidelberg, New York (1993)
Komorowski, T., Landim, C., Olla, S.: Fluctuations in Markov Processes. Time Symmetry and Martingale Approximation. Springer, Berlin, Heidelberg (2012)
Sharpe, M.: General Theory of Markov Processes. Pure and Applied Mathematics, vol. 133. Academic Press (1988)
Ślęczka, M.: The rate of convergence for iterated function systems. Stud. Math. 205, 201–214 (2011)
Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Mathematics, vol. 79. Springer-Verlag, Berlin, Heidelberg, New York (1982)
Wojewódka, H.: Exponential rate of convergence for some Markov operators. Stat. Probab. Lett. 83(10), 2337–2347 (2013)
Worm, D.T.H.: Semigroups on spaces of measures. PhD thesis, Leiden University, Leiden, The Netherlands (2010)
Acknowledgements
The authors thank the anonymous referees for their constructive comments that have helped to significantly improve the paper. The work of H.W.-Ś. is supported by the project Near-term Quantum Computers: challenges, optimal implementations and applications under grant no. POIR.04.04.00-00-17C1/18-00, which is carried out within the Team-Net programme of the Foundation for Polish Science, co-financed by the European Union under the European Regional Development Fund.
Contributions
All authors contributed extensively to the work presented in the paper. HW-Ś conceptualized central ideas and wrote the original draft of the article, which was then reviewed by DC and KH. DC completed certain proofs, as well as added Sects. 5, 6 and 7 to the article.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Cite this article
Czapla, D., Horbacz, K. & Wojewódka-Ściążko, H. The Central Limit Theorem for Markov Processes that are Exponentially Ergodic in the Bounded-Lipschitz Norm. Qual. Theory Dyn. Syst. 23, 7 (2024). https://doi.org/10.1007/s12346-023-00862-4
Keywords
- Markov process
- Central limit theorem
- Martingale method
- Exponential ergodicity
- Bounded-Lipschitz distance