Skip to main content
Log in

A 2-class maintenance model with dynamic server behavior

  • Original Paper
  • Published:
TOP Aims and scope Submit manuscript

Abstract

We analyze a 2-class maintenance system within a single-server polling model framework. There are \(C+f\) machines in the system, where C is the cap on the number of machines that can be turned on simultaneously (and hence, be at risk of failure), and the excess f machines comprise a maintenance float which can be used to replace machines that are taken down for repair. The server’s behavior is dynamic, capable of switching queues upon a machine failure or service completion depending on both queue lengths. This generalized server behavior permits the analysis of several classic service policies, including preemptive resume priority, non-preemptive priority, and exhaustive. More complicated polices can also be considered, such as threshold-based ones and a version of the Bernoulli service rule. The system is modeled as a level-dependent quasi-birth-and-death process and matrix analytic methods are used to find the steady-state joint queue length distribution, as well as the distribution for the sojourn time of a broken machine. An upper bound on the expected number of working machines as a function of C is derived, and Little’s Law is used to find the relationship between the expected number of working machines and the expected sojourn time of a failed machine when \(f=0\) or \(f \ge 1\). Several numerical examples are presented, including how one might optimize an objective function depending on the mean number of working machines, with penalty costs attributed to increasing C or f.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12

Similar content being viewed by others

References

  • Abboud NE (1996) The Markovian two-echelon repairable item provisioning problem. J Oper Res Soc 47(2):284–296

    Article  Google Scholar 

  • Asmussen S, Nerman O, Olsson M (1996) Fitting phase-type distributions via the EM algorithm. Scand J Stat 23(4):419–441

    Google Scholar 

  • Avrachenkov K, Perel E, Yechiali U (2016) Finite-buffer polling systems with threshold-based switching policy. TOP 24(3):541–571

    Article  Google Scholar 

  • Avram F, Gómez-Corral A (2006) On the optimal control of a two-queue polling model. Oper Res Lett 34(3):339–348

    Article  Google Scholar 

  • Blanc JPC (1990) A numerical approach to cyclic-service queueing models. Queue Syst 6(1):173–188

    Article  Google Scholar 

  • Blanc JPC (1991) The power-series algorithm applied to cyclic polling systems. Commun Stat Stoch Models 7(4):527–545

    Article  Google Scholar 

  • Blanc JPC, van der Mei RD (1995) Optimization of polling systems with Bernoulli schedules. Perform Eval 22(2):139–158

    Article  Google Scholar 

  • Boon MAA (2011) Polling models: from theory to traffic intersections. Dissertation, Eindhoven University of Technology

  • Boon MAA, van der Mei RD, Winands EMM (2011) Applications of polling systems. Surv Oper Res Manag Sci 16(2):67–82

    Google Scholar 

  • Boxma OJ, Koole GM, Mitrani I (1995) Polling models with threshold switching. In: Baccelli F, Jean-Marie A, Mitrani I (eds) Quantitative methods in parallel systems. Esprit basic research series. Springer, Berlin, Heidelberg

    Google Scholar 

  • Buyukkramikli NC, van Ooijen HPG, Bertrand JWM (2015) Integrating inventory control and capacity management at a maintenance service provider. Ann Oper Res 231(1):185–206

    Article  Google Scholar 

  • Gaver D, Jacobs P, Latouche G (1984) Finite birth-and-death models in randomly changing environments. Adv Appl Prob 16(4):715–731

    Article  Google Scholar 

  • Granville K, Drekic S (2018) A 2-class maintenance model with a finite population and competing exponential failure rates. Queue Models Serv Manag 1(1):141–176

    Google Scholar 

  • Gross D, Miller DR, Soland RM (1983) A closed queueing network model for multi-echelon repairable item provisioning. IIE Trans 15(4):344–352

    Article  Google Scholar 

  • He QM (2014) Fundamentals of matrix-analytic methods, vol 365. Springer, New York

    Book  Google Scholar 

  • Iravani SM, Kolfal B (2005) When does the \(c\mu \) rule apply to finite-population queueing systems? Oper Res Lett 33(3):301–304

    Article  Google Scholar 

  • Iravani SM, Krishnamurthy V, Chao GH (2007) Optimal server scheduling in nonpreemptive finite-population queueing systems. Queue Syst 55(2):95–105

    Article  Google Scholar 

  • Keilson J, Servi LD (1986) Oscillating random walk models for GI/G/1 vacation systems with Bernoulli schedules. J Appl Prob 23(3):790–802

    Article  Google Scholar 

  • Kim SK, Dshalalow JH (2003) A versatile stochastic maintenance model with reserve and super-reserve machines. Methodol Comput Appl Prob 5(1):59–84

    Article  Google Scholar 

  • Lakatos L, Szeidl L, Telek M (2012) Introduction to queueing systems with telecommunication applications. Springer Science & Business Media, Berlin

    Google Scholar 

  • Lee DS, Sengupta B (1993) Queueing analysis of a threshold based priority scheme for ATM networks. IEEE/ACM Trans Netw (TON) 1(6):709–717

    Article  Google Scholar 

  • Levy H, Sidi M (1990) Polling systems: applications, modeling and optimization. IEEE Trans Commun COM 38(10):1750–1760

    Article  Google Scholar 

  • Liang WK, Balcıo\(\tilde{{\rm g}}\)lu B, Svaluto R (2013) Scheduling policies for a repair shop problem. Ann Oper Res 211(1):273–288

  • Lin C, Madu CN, Kuei CH (1994) A closed queuing maintenance network for a flexible manufacturing system. Microelectron Reliab 34(11):1733–1744

    Article  Google Scholar 

  • Little JD (1961) A proof for the queuing formula: \(L= \lambda W\). Oper Res 9(3):383–387

    Article  Google Scholar 

  • Mack C (1957) The efficiency of \(N\) machines uni-directionally patrolled by one operative when walking time is constant and repair times are variable. J R Stat Soc Ser B (Methodol) 19(1):173–178

    Google Scholar 

  • Mack C, Murphy T, Webb N (1957) The efficiency of \(N\) machines uni-directionally patrolled by one operative when walking time and repair times are constants. J R Stat Soc Ser B (Methodol) 19(1):166–172

    Google Scholar 

  • Madu CN (1988) A closed queueing maintenance network with two repair centres. J Oper Res Soc 39(10):959–967

    Article  Google Scholar 

  • Meilijson I, Yechiali U (1977) On optimal right-of-way policies at a single-server station when insertion of idle times is permitted. Stoch Process Their Appl 6(1):25–32

    Article  Google Scholar 

  • Perel E, Yechiali U (2017) Two-queue polling systems with switching policy based on the queue that is not being served. Stochastic Models 33(3):1–21

    Article  Google Scholar 

  • Righter R (2002) Optimal maintenance and operation of a system with backup components. Prob Eng Inform Sci 16(3):339–349

    Article  Google Scholar 

  • Ross SM (2014) Introduction to probability models. Academic press, San Diego

    Google Scholar 

  • Syski R (1992) Passage times for Markov chains. IOS Press, Amsterdam

    Google Scholar 

  • Takagi H (1988) Queueing analysis of polling models. ACM Comput Surv 20(1):5–28

    Article  Google Scholar 

  • Van Mieghem JA (1995) Dynamic scheduling with convex delay costs: The generalized \(c \mu \) rule. Ann Appl Prob 5(3):809–833

    Article  Google Scholar 

  • Vishnevskii VM, Semenova OV (2006) Mathematical methods to study the polling systems. Automation and Remote Control 67(2):173–220

    Article  Google Scholar 

  • Weststrate JA, van der Mei RD (1994) Waiting times in a two-queue model with exhaustive and Bernoulli service. Zeitschrift für Operations Research 40(3):289–303

    Google Scholar 

Download references

Acknowledgements

Steve Drekic and Kevin Granville acknowledge the financial support from the Natural Sciences and Engineering Research Council of Canada through its Discovery Grants program (RGPIN-2016-03685) and Postgraduate Scholarship-Doctoral program, respectively. The authors are also very thankful to the referee for their helpful comments and suggestions, which have improved the presentation of the paper as well as widened the scope of the discussion in one of our numerical examples.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Kevin Granville.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 Proof of Theorem 1

We begin by remarking that the infinitesimal generator subblocks \(Q_{0,1,0}^{[C,f]}\) and \(( UD )_{0,0}^{[C,f]}\) both comprise \(1+s_0\) identical rows equal to \(C\alpha _1 {\underline{\gamma }}_{01}^{[+0]} \otimes {\underline{\beta }}_1\) and \(C\alpha _2 {\underline{\gamma }}_{02}^{[+0]} \otimes {\underline{\beta }}_2\), respectively. This implies that given a machine failure has occurred, the CTMC transitions away from any of the empty queue states \(\{(0,0,0,0,0,0)\} \cup \{(0,0,5,y,0,0), y = 1,2,\ldots ,s_0\}\) in an identical fashion. This observation immediately follows from our assumption that interrupting and switching away from a class-i switch-in is treated the same as the server switching away from class i itself.

We now consider the differences between a system containing k machines where \([C,f]=[k,0]\) or \([C,f]=[k-1,1]\). The two systems will act identically, in terms of infinitesimal generator construction, with the exception of the rows for states where all k machines are functional (the first system puts the \(k\mathrm{th}\) machine to use, while the second stores it in the maintenance float). In either case, the total time spent visiting any combination of the empty queue states between the previous service completion and the next observed failure will have an exponential distribution with rate \(C\alpha \) (i.e., \(\text {Exp}(C \alpha )\)). Hence, we may adjust the CTMCs and consolidate the empty queue states into a single state (0, 0) with steady-state probability \(\pi _{0,0}^{[C,f]} = {\underline{\pi }}_{0,0}^{[C,f]} {\underline{e}}_{1+s_0}'\), such that we do not track the potential phase-type class-0 switch-in time and the sojourn time in this state is simply the time until the next machine failure. Note that this consolidation will not affect the other steady-state probabilities due to the identical rows of \(Q_{0,1,0}^{[C,f]}\) and \(( UD )_{0,0}^{[C,f]}\) which each are now just present once, corresponding to transitions out of state (0, 0). Thus, we have at steady state

$$\begin{aligned} \text {E}\big [N_W^{\big [C,f\big ]}\big ]&= \text {E}\big [\min \big \{C,C+f-X_1^{\big [C,f\big ]}-X_2^{\big [C,f\big ]}\big \}\big ] \nonumber \\&= \sum _{m,n,l,y,y_1,y_2} \min \big \{C,C+f-m-n\big \} \pi _{m,n,l,y,y_1,y_2}^{\big [C,f\big ]} \nonumber \\&= C\left( \pi _{0,0,0,0,0,0}^{\big [C,f\big ]}+\sum _{y=1}^{s_0} \pi _{0,0,5,y,0,0}^{\big [C,f\big ]}\right) \nonumber \\&\quad + \sum _{m+n \ne 0} \sum _{l,y,y_1,y_2} \min \big \{C,C+f-m-n\big \} \pi _{m,n,l,y,y_1,y_2}^{\big [C,f\big ]} \nonumber \\&= C\pi _{0,0}^{\big [C,f\big ]} + \sum _{m+n \ne 0} \sum _{l,y,y_1,y_2} \min \big \{C,C+f-m-n\big \} \pi _{m,n,l,y,y_1,y_2}^{\big [C,f\big ]}. \end{aligned}$$
(36)

That is, the expected number of working machines will be the same in the original CTMCs and the corresponding adjusted CTMCs with the consolidated empty queue state.

Let \(\psi _{0,0}^{[C,f]}\) and \(\psi _{m,n,l,y,y_1,y_2}^{[C,f]}\) denote the steady-state probabilities of the embedded discrete-time Markov chain, or DTMC (e.g., Syski 1992, p. 14), describing an adjusted CTMC with a given [Cf]. As the generators for [k, 0] and \([k-1,1]\) are now identical outside of the first rows for state (0, 0), which for [k, 0] is

$$\begin{aligned} \left[ \begin{array}{ccccccccc} -k\alpha&k\alpha _2 {\underline{\gamma }}_{02}^{[+0]} \otimes {\underline{\beta }}_2&{\underline{0}}&\cdots&{\underline{0}}&k\alpha _1 {\underline{\gamma }}_{01}^{[+0]} \otimes {\underline{\beta }}_1&{\underline{0}}&\cdots&{\underline{0}} \end{array} \right] \end{aligned}$$

and for \([k-1,1]\) is

$$\begin{aligned} \left[ \begin{array}{ccccccccc} -(k-1)\alpha&(k-1)\alpha _2 {\underline{\gamma }}_{02}^{[+0]} \otimes {\underline{\beta }}_2&{\underline{0}}&\cdots&{\underline{0}}&(k-1)\alpha _1 {\underline{\gamma }}_{01}^{[+0]} \otimes {\underline{\beta }}_1&{\underline{0}}&\cdots&{\underline{0}} \end{array} \right] , \end{aligned}$$

it is clear that while the steady-state probabilities for the CTMCs differ, it holds that \(\psi _{0,0}^{[k,0]} = \psi _{0,0}^{[k-1,1]}\) and \(\psi _{m,n,l,y,y_1,y_2}^{[k,0]} = \psi _{m,n,l,y,y_1,y_2}^{[k-1,1]}\).

It is known from the theory of semi-Markov processes (e.g., Ross 2014, p. 445) that if the long-run proportion of transitions by a semi-Markov process into state i is \(\pi _i\) (i.e., the steady-state probability of the embedded DTMC being in state i) and the amount of time spent in state i before transitioning away has mean \(\mu _i\), then the long-run proportion of time that the semi-Markov process is in state i is

$$\begin{aligned} \frac{\pi _i \mu _i}{\sum _{j=1}^N \pi _j \mu _j}, \end{aligned}$$
(37)

where N is the total number of states. Since we are considering CTMCs, the time spent in a state is exponentially distributed with a mean equal to the negative inverse of that state’s corresponding main diagonal element from the infinitesimal generator. Let \(\mu _{m,n,l,y,y_1,y_2}^{[k,0]} = \mu _{m,n,l,y,y_1,y_2}^{[k-1,1]}\) denote the mean time spent in a visit to state \((m,n,l,y,y_1,y_2)\), and \(\mu _{0,0}^{[k,0]} = \frac{1}{k\alpha }\) and \(\mu _{0,0}^{[k-1,1]} = \frac{1}{(k-1)\alpha }\) be the mean times spent in visits to the empty queue state in either adjusted CTMC. We then have

$$\begin{aligned} \pi _{m,n,l,y,y_1,y_2}^{\big [C,f\big ]} = \frac{\psi _{m,n,l,y,y_1,y_2}^{\big [C,f\big ]} \mu _{m,n,l,y,y_1,y_2}^{\big [C,f\big ]}}{\psi _{0,0}^{\big [C,f\big ]} \mu _{0,0}^{\big [C,f\big ]} + \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \psi _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]} \mu _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]}} \end{aligned}$$

and

$$\begin{aligned} \pi _{0,0}^{\big [C,f\big ]} = \frac{\psi _{0,0}^{\big [C,f\big ]} \mu _{0,0}^{\big [C,f\big ]}}{\psi _{0,0}^{\big [C,f\big ]} \mu _{0,0}^{\big [C,f\big ]} + \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \psi _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]} \mu _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]}}. \end{aligned}$$

Let

$$\begin{aligned} D^{\big [C,f\big ]} = \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \psi _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]} \mu _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]}, \end{aligned}$$

which we know satisfies \(D^{[k,0]} = D^{[k-1,1]}\). It now follows that

$$\begin{aligned} \pi _{m,n,l,y,y_1,y_2}^{[k,0]}&= \frac{\psi _{m,n,l,y,y_1,y_2}^{[k,0]} \mu _{m,n,l,y,y_1,y_2}^{[k,0]}}{\psi _{0,0}^{[k,0]} \mu _{0,0}^{[k,0]} + D^{[k,0]}} \nonumber \\&= \frac{\psi _{m,n,l,y,y_1,y_2}^{[k-1,1]} \mu _{m,n,l,y,y_1,y_2}^{[k-1,1]}}{\psi _{0,0}^{[k-1,1]} \mu _{0,0}^{[k-1,1]} + D^{[k-1,1]}} \times \frac{\psi _{0,0}^{[k-1,1]} \mu _{0,0}^{[k-1,1]} + D^{[k-1,1]}}{\psi _{0,0}^{[k,0]} \mu _{0,0}^{[k,0]} + D^{[k,0]}} \nonumber \\&= \pi _{m,n,l,y,y_1,y_2}^{[k-1,1]} c_k, \end{aligned}$$
(38)

where

$$\begin{aligned} c_k = \frac{\psi _{0,0}^{\big [k-1,0\big ]} \mu _{0,0}^{\big [k-1,1\big ]} + D^{\big [k-1,1\big ]}}{\psi _{0,0}^{\big [k,0\big ]} \mu _{0,0}^{\big [k,0\big ]} + D^{\big [k,0\big ]}} = \frac{\frac{1}{(k-1)\alpha }\psi _{0,0}^{\big [k,0\big ]} + D^{\big [k,0\big ]}}{\frac{1}{k\alpha }\psi _{0,0}^{\big [k,0\big ]} + D^{\big [k,0\big ]}} > 1. \end{aligned}$$
(39)

Similarly,

$$\begin{aligned} \pi _{0,0}^{\big [k,0\big ]}&= \frac{\psi _{0,0}^{\big [k,0\big ]} \mu _{0,0}^{\big [k,0\big ]}}{\psi _{0,0}^{\big [k,0\big ]} \mu _{0,0}^{\big [k,0\big ]} + D^{\big [k,0\big ]}} \nonumber \\&= \frac{\psi _{0,0}^{\big [k-1,1\big ]} \mu _{0,0}^{\big [k-1,1\big ]}\left( \frac{k-1}{k}\right) }{\psi _{0,0}^{\big [k-1,1\big ]} \mu _{0,0}^{\big [k-1,1\big ]} + D^{\big [k-1,1\big ]}} \times \frac{\psi _{0,0}^{\big [k-1,1\big ]} \mu _{0,0}^{\big [k-1,1\big ]} + D^{\big [k-1,1\big ]}}{\psi _{0,0}^{\big [k,0\big ]} \mu _{0,0}^{\big [k,0\big ]} + D^{\big [k,0\big ]}} \nonumber \\&= \pi _{0,0}^{\big [k-1,1\big ]} \left( \frac{k-1}{k}\right) c_k. \end{aligned}$$
(40)

Note that we can find an upper bound on \(c_k\). As the steady-state probabilities for both cases must, respectively, sum to 1, using Eqs. (38) and (40), it must simultaneously hold that

$$\begin{aligned} 1&= \pi _{0,0}^{\big [k,0\big ]} + \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \pi _{x_1,x_2,w,z,z_1,z_2}^{\big [C,f\big ]} \\&= \pi _{0,0}^{\big [k-1,1\big ]} \left( \frac{k-1}{k}\right) c_k + c_k \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \pi _{m,n,l,y,y_1,y_2}^{\big [k-1,1\big ]} \end{aligned}$$

and

$$\begin{aligned} 1=\pi _{0,0}^{\big [k-1,1\big ]} + \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \pi _{m,n,l,y,y_1,y_2}^{\big [k-1,1\big ]}. \end{aligned}$$

Clearly, as every probability is non-negative, by Eq. (39),

$$\begin{aligned} c_k \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \pi _{m,n,l,y,y_1,y_2}^{\big [k-1,1\big ]} > \sum _{x_1+x_2 \ne 0} \sum _{w,z,z_1,z_2} \pi _{m,n,l,y,y_1,y_2}^{\big [k-1,1\big ]} \end{aligned}$$

implying that we must have

$$\begin{aligned} \pi _{0,0}^{\big [k-1,1\big ]} \left( \frac{k-1}{k}\right) c_k < \pi _{0,0}^{\big [k-1,1\big ]}, \end{aligned}$$

or equivalently,

$$\begin{aligned} 1< c_k < \frac{k}{k-1}. \end{aligned}$$

Finally, using Eqs. (36)–(40),

$$\begin{aligned}&\text {E}\big [N_W^{\big [k,0\big ]}\big ] \\&\quad = k \pi _{0,0}^{\big [k,0\big ]} + \sum _{m+n \ne 0} \sum _{l,y,y_1,y_2} \min \big \{k,k+0-m-n\big \} \pi _{m,n,l,y,y_1,y_2}^{\big [k,0\big ]} \\&\quad = k \pi _{0,0}^{\big [k,0\big ]} + \sum _{m+n \ne 0} \sum _{l,y,y_1,y_2} \big (k-m-n\big ) \pi _{m,n,l,y,y_1,y_2}^{\big [k,0\big ]} \\&\quad = k \pi _{0,0}^{\big [k-1,1\big ]} \left( \frac{k-1}{k} \right) c_k + \sum _{m+n \ne 0} \sum _{l,y,y_1,y_2} \big (k-m-n\big ) \pi _{m,n,l,y,y_1,y_2}^{\big [k-1,1\big ]} c_k \\&\quad = c_k\big (\big (k-1\big ) \pi _{0,0}^{\big [k-1,1\big ]} + \sum _{m+n \ne 0} \sum _{l,y,y_1,y_2} \min \big \{k \! - \! 1,k \! - \! 1 \! + \! 1 \! - \! m \! - \! n\big \} \pi _{m,n,l,y,y_1,y_2}^{\big [k-1,1\big ]} \big ) \\&\quad = c_k \text {E}\big [N_W^{\big [k-1,1\big ]}\big ] \\&\quad > \text {E}\big [N_W^{\big [k-1,1\big ]}\big ]. \end{aligned}$$

\(\square \)

1.2 Proof of Theorem 2

In order to consider the limit of the expected number of working machines, we need to first find an expression for \(\text {E}[N_W^{[C,f]}]\). Similar to Abboud (1996), we consider the number of working machines as a subsystem and apply the result of Little (1961). Recall that Little’s Law states that the expected number of “customers” in a system (\(\text {E}[L]\)) is equal to the product of their average arrival rate (\(\lambda \)) and the expected amount of time that a customer spends in the system (\(\text {E}[W]\)).

As we are treating the number of working machines as the subsystem, not the number of functional machines, it is clear that W is simply the time until a working machine fails. Thus, we have \(W \sim \text {Exp}(\alpha )\), and so

$$\begin{aligned} \text {E}[W] = \frac{1}{\alpha }, \end{aligned}$$
(41)

which is independent of C, f, and the service policy. Next, we require the limiting aggregate rate that machines fail and are repaired, which we define as \(\lambda _r^{[C,f]}\), which is the effective average “arrival rate” of repaired machines satisfying

$$\begin{aligned} \text {E}\big [N_W^{\big [C,f\big ]}\big ] = \lambda _r^{\big [C,f\big ]} \text {E}\big [W\big ] = \frac{\lambda _r^{\big [C,f\big ]}}{\alpha }. \end{aligned}$$
(42)

We cite a result from the theory of renewal reward processes (e.g., Ross 2014, p. 427), describing a system which earns a reward \(R_n\) after the \(n\mathrm{th}\) renewal of a renewal process \(\{N(t), t \ge 0\}\) with interarrival times \(X_n\), \(n \in {\mathbb {Z}}^+\), where the \(R_n\)s are iid, but may depend on \(X_n\). The total amount of rewards that have accumulated by time \(t \ge 0\) is

$$\begin{aligned} R(t) = \sum _{n=1}^{N(t)} R_n, \end{aligned}$$

and it is known that the long run rate at which rewards are earned is

$$\begin{aligned} \lim _{t \rightarrow \infty } \frac{R\big (t\big )}{t} = \frac{\text {E}\big [R\big ]}{\text {E}\big [X\big ]}. \end{aligned}$$
(43)

We now define a renewal process based on our adjusted model from the proof of Theorem 1 with [Cf] machines, such that a renewal occurs whenever the adjusted CTMC enters the empty queue state (0, 0) (i.e., at time instants immediately after a repair which leaves all machines functional). At the end of each renewal, we receive a reward of 1 unit per observed service completion during that cycle. Applying Eq. (43) to this renewal process will result in the aggregate rate at which machines are repaired. That is, if we let \(\text {E}[ BP ^{[C,f]}]\) denote the mean duration of a busy period (i.e., the time between a failure to an empty system and when the system is empty again), then

$$\begin{aligned} \lambda _r^{\big [C,f\big ]} = \frac{\text {E}\big [\text {Number of repairs in } BP ^{\big [C,f\big ]}\big ]}{\text {E}\big [\text {Time until first failure at full capacity}\big ] + \text {E}\big [ BP ^{\big [C,f\big ]}\big ]}. \end{aligned}$$
(44)

Let \( BP _{\mathrm{ser}}^{[C,f]}\) and \( BP _{\mathrm{swi}}^{[C,f]}\) denote the time spent serving or switching during a busy period, respectively, such that \( BP ^{[C,f]} = BP _{\mathrm{ser}}^{[C,f]} + BP _{\mathrm{swi}}^{[C,f]}\). Note that regardless of order caused by a particular service policy, every machine that fails during (or initiating) the busy period must eventually be served. Since we assume that any preempted services are resumed when the server returns, no work is lost due to switch-ins. Therefore, if for example a class-2 repair time has the potential to be interrupted until some number of class-1 repairs are completed, the total expected time to repair that class-2 machine is still \(-{\underline{\beta }}_2 B_2^{-1} {\underline{e}}_{b_2}'\). Thus, if we let \(N_{ BP }\) be the number of repairs in \( BP ^{[C,f]}\), then \( BP _{\mathrm{ser}}^{[C,f]}\) can be represented as the sum of all total service times observed during the busy period

$$\begin{aligned} BP _{\mathrm{ser}}^{\left[ C,f\right] } = \sum _{n=1}^{N_{ BP }} Z_n^m, \end{aligned}$$

where \(Z_n^m\), \(n=1,2,\ldots \), are iid random service times which are mixtures of \(\text {PH}({\underline{\beta }}_i,B_i)\) distributions, \(i=1,2\), with weights \(\alpha _1 / \alpha \) and \(\alpha _2 / \alpha \), having mean

$$\begin{aligned} \text {E}\left[ Z^m\right] = -\left( \frac{\alpha _1}{\alpha }\right) {\underline{\beta }}_1 B_1^{-1} {\underline{e}}_{b_1}' - \left( \frac{\alpha _2}{\alpha }\right) {\underline{\beta }}_2 B_2^{-1} {\underline{e}}_{b_2}'. \end{aligned}$$

Therefore, it follows that

$$\begin{aligned} \text {E}\left[ BP _{\mathrm{ser}}^{\left[ C,f\right] }\right] = \text {E}\left[ \text {Number of repairs in } BP ^{\left[ C,f\right] }\right] \text {E}\left[ Z^m\right] , \end{aligned}$$

and Eq. (44) becomes

$$\begin{aligned} \lambda _r^{\big [C,f\big ]} = \frac{\text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ] / \text {E}\big [Z^m\big ]}{\frac{1}{C\alpha } + \text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ] + \text {E}\big [ BP _{\mathrm{swi}}^{\big [C,f\big ]}\big ]}. \end{aligned}$$
(45)

It should be noted that the distributions of \(N_{ BP }\), \( BP _{\mathrm{ser}}^{[C,f]}\), and \( BP _{\mathrm{swi}}^{[C,f]}\) (and hence \( BP ^{[C,f]}\)) depend not only on C and f, but also on the switch-in decision probabilities. For example, a class-1 preemptive resume priority discipline will always choose to clear out the small jobs as they arrive, which will result in those machines being able to fail again sooner than if the class-2 queue had to be emptied first, hence making it more likely that the server will need to repair more total machines during that busy period in comparison to other policies. We note however that the sole act of serving more machines during a busy period, and hence between renewals, does not necessarily mean that its resulting \(\lambda _f^{[C,f]}\) will be smaller or larger, as it very much also depends on whether these extra switches (relative to other disciplines) cause idle periods due to non-zero switch-in times.

We now consider the first of three cases, where \(\gamma _{ji}^{[0]} = 1 ~ \forall ~ i,j \in \{0,1,2\}, i \ne j\). Clearly, this implies that \(\text {E}[ BP _{\mathrm{swi}}^{[C,f]}]=0\), and Eq. (45) simplifies to

$$\begin{aligned} \lambda _r^{\big [C,f\big ]} = \frac{\text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ] / \text {E}\big [Z^m\big ]}{\frac{1}{C\alpha } + \text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ]}. \end{aligned}$$
(46)

Since \(\text {E}[ BP _{\mathrm{ser}}^{[C,f]}] \ge \text {E}[Z^m] > 0 ~ \forall ~ C=1,2,\ldots \) and \(\text {E}[ BP _{\mathrm{ser}}^{[C,f]}]\) is an increasing function in C (as we will discuss shortly), by taking the limit of Eq. (46), we observe that

$$\begin{aligned} \lambda _r^{\big [\infty \big ]}&= \lim _{C \rightarrow \infty } \lambda _r^{\big [C,f\big ]} \nonumber \\&= \lim _{C \rightarrow \infty } \frac{\text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ] / \text {E}\big [Z^m\big ]}{\frac{1}{C\alpha } + \text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ]} \nonumber \\&= \lim _{C \rightarrow \infty } \left( \frac{1}{\frac{1}{C\alpha \text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ]} + 1}\right) \frac{1}{\text {E}\big [Z^m\big ]} \nonumber \\&= \frac{-\alpha }{\alpha _1 {\underline{\beta }}_1 B_1^{-1} {\underline{e}}_{b_1}' + \alpha _2{\underline{\beta }}_2 B_2^{-1} {\underline{e}}_{b_2}'}. \end{aligned}$$
(47)

Therefore, Eq. (17) follows immediately from Little’s Law and Eqs. (41) and (47).

Next, suppose that only switches out of or into class 0 can have positive durations. It then follows that \(\text {E}[ BP _{\mathrm{swi}}^{[C,f]}]\) is a constant with respect to C, and so it still holds that

$$\begin{aligned} \lambda _r^{\big [\infty \big ]} = \lim _{C \rightarrow \infty } \left( \frac{1}{\frac{\big (C\alpha \big )^{-1} + \text {E}\big [ BP _{\mathrm{swi}}^{\big [C,f\big ]}\big ]}{\text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ]} + 1}\right) \frac{1}{\text {E}\big [Z^m\big ]} = \frac{-\alpha }{\alpha _1 {\underline{\beta }}_1 B_1^{-1} {\underline{e}}_{b_1}' + \alpha _2{\underline{\beta }}_2 B_2^{-1} {\underline{e}}_{b_2}'},\end{aligned}$$

resulting in the statement of Eq. (17).

Finally, we consider the cases where positive switch-in times are observable in at least one direction between the class-1 and class-2 queues (i.e., \(\gamma _{12}^{[0]}\) and/or \(\gamma _{21}^{[0]}\) are less than 1). We now make the seemingly obvious claim that both \(\text {E}[ BP _{\mathrm{ser}}^{[C,f]}]\) and \(\text {E}[ BP _{\mathrm{swi}}^{[C,f]}]\) are increasing functions in C. This is intuitive, as increasing C increases the probability flow, and hence the transition probabilities, for a given state to states within the CTMC corresponding to longer queue lengths. Also, increasing C increases the maximum total queue lengths that if visited, represent more potential total work that must be completed before the end of the busy period than a corresponding “full queue” state (i.e., \(X_1+X_2=C+f\)) in a maintenance system with a smaller C. Thus, the expected number of machine failures within a renewal period must increase with C, implying that \(\text {E}[ BP _{\mathrm{ser}}^{[C,f]}]\) is an increasing function in C.

If machine failures are more frequent, then it also follows that the probability of observing no arrivals to the opposite queue while emptying their current queue goes to zero as \(C \rightarrow \infty \). To see this, consider the system at the start of a class-i service while \(X_i = 1\) and \(X_j = 0\), \(j \ne i\). If we assume that \(f \ge 1\) and let \(W_C \sim \text {Exp}(C\alpha )\) and \(Z_i \sim \text {PH}({\underline{\beta }}_i,B_i)\) be independent random variables, then the probability of having no failures during this class-i service is \(P(W_C > Z_i)\), where

$$\begin{aligned} P(W_C > Z_i) = \int _0^\infty e^{-C\alpha z} {\underline{\beta }}_i \exp \{B_i z\} {\underline{e}}_{b_i}' dz = \text {E}[e^{-C\alpha Z_i}] = {\tilde{Z}}_i(C\alpha ) \end{aligned}$$

is the Laplace transform of \(Z_i\) at \(C\alpha \). If instead we had \(f=0\), then C would be replaced by \(C-1\) in the above equation. Applying the dominated convergence theorem, it is easy to confirm that

$$\begin{aligned} \lim _{C \rightarrow \infty } P(W_C > Z_i) = \lim _{C \rightarrow \infty } {\tilde{Z}}_i(C\alpha ) = 0. \end{aligned}$$

Thus, as we increase C, it becomes more likely that there is a combination of class-1 and/or class-2 arrivals by the end of the service. If at least one failure was from class j, \(j \ne i\), then the server will have to undergo a class-j switch-in after eventually emptying the class-i queue. If every failure was class i, then the server will have at least one more independent and probabilistically identical opportunity to observe class-j failures before either switching to class j or to class 0 (and ending the busy period). Thus, the expected number of transitions between queues after emptying a queue increases with C, which are present for every service policy. Similarly, the number of switches from positive queue lengths will be non-decreasing in C due to the CTMC spending more time at higher queue lengths, as discussed previously. Therefore, we can conclude that \(\text {E}[ BP _{\mathrm{swi}}^{[C,f]}]\) is also an increasing function in C.

Now, we rewrite Eq. (45) as

$$\begin{aligned} \lambda _r^{\big [C,f\big ]} = \left( 1 + \frac{1}{C\alpha \text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ]} + \frac{\text {E}\big [ BP _{\mathrm{swi}}^{\big [C,f\big ]}\big ]}{\text {E}\big [ BP _{\mathrm{ser}}^{\big [C,f\big ]}\big ]}\right) ^{-1} \frac{1}{\text {E}\big [Z^m\big ]}. \end{aligned}$$
(48)

Clearly,

$$\begin{aligned} \lim _{C \rightarrow \infty } \frac{1}{C\alpha \text {E}\left[ BP _{\mathrm{ser}}^{\left[ C,f\right] }\right] } = 0, \end{aligned}$$

and so the limit of \(\lambda _r^{[C,f]}\) depends on the rates at which \(\text {E}[ BP _{\mathrm{swi}}^{[C,f]}]\) and \(\text {E}[ BP _{\mathrm{ser}}^{[C,f]}]\) increase with C. If they increase at a comparable rate, i.e.,

$$\begin{aligned} \lim _{C \rightarrow \infty } \frac{\text {E}\left[ BP _{\mathrm{swi}}^{\left[ C,f\right] }\right] }{\text {E}\left[ BP _{\mathrm{ser}}^{\left[ C,f\right] }\right] } = d > 0, \end{aligned}$$

then

$$\begin{aligned} \lambda _r^{\left[ \infty \right] } = \left( \frac{1}{1+d}\right) \frac{1}{\text {E}\left[ Z^m\right] } < \frac{1}{\text {E}\left[ Z^m\right] }, \end{aligned}$$

implying a strict inequality in Eq. (16) after applying Little’s Law and Eq. (41). It also follows that if

$$\begin{aligned} \lim _{C \rightarrow \infty } \frac{\text {E}\left[ BP _{\mathrm{swi}}^{\left[ C,f\right] }\right] }{\text {E}\left[ BP _{\mathrm{ser}}^{\left[ C,f\right] }\right] } = 0, \end{aligned}$$

then Eq. (16) is an equality. \(\square \)

1.3 Algorithm for Sect. 5.3: smart Bernoulli optimization

Letting \(\mathrm{precision} \in {\mathbb {Z}}^+\) denote the number of decimal places we are interested in approximating to and \(\text {E}[N_W](p_2^{\mathrm{SB}})\) represent the expected number of working machines as a function of \(p_2^{\mathrm{SB}}\), we apply:

figure a

What this algorithm does in iteration \(i \in \{1,2,\ldots ,\text {precision}\}\) is divide an interval of probabilities into increments of width \(10^{-i}\), solve for \(\text {E}[N_W]\) at each \(p_2^{\mathrm{SB}}\) which separate the increments and determine which of these resulted in the maximum value, then restart the loop for the next i investigating an interval with length \(2 \times 10^{-i}\) centered around that probability, or if it is a boundary value of 0 or 1, an interval of length \(10^{-i}\) including the said boundary. The above is a condensed version of the algorithm for readability and space considerations, which may have its efficiency improved slightly by being altered to not re-calculate \(\text {E}[N_W]\) at any previously considered \(p_2^{\mathrm{SB}}\)s. We do not propose this algorithm for its speed, but rather for its accuracy to a given decimal place without the need of derivatives, and the fact that it is able to return a probability of exactly 0 or 1.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Granville, K., Drekic, S. A 2-class maintenance model with dynamic server behavior. TOP 28, 34–96 (2020). https://doi.org/10.1007/s11750-019-00509-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11750-019-00509-1

Keywords

Mathematics Subject Classification

Navigation