Qualitative Analysis of VASS-Induced MDPs

We consider infinite-state Markov decision processes (MDPs) that are induced by extensions of vector addition systems with states (VASS). Verification conditions for these MDPs are described by reachability and Büchi objectives w.r.t. given sets of control-states. We study the decidability of some qualitative versions of these objectives, i.e., the decidability of whether such objectives can be achieved surely, almost-surely, or limit-surely. While most such problems are undecidable in general, some are decidable for large subclasses in which either only the controller or only the random environment can change the counter values (while the other side can only change control-states).

Since a strategy of Player 1 induces a probability distribution of runs of the MDP, the objective of an MDP is defined in terms of this distribution, e.g., if the probability of satisfying a reachability/Büchi objective is at least a given constant. The special case where this constant is 1 is a key example of a qualitative objective. Here one asks whether Player 1 has a strategy that achieves an objective surely (all runs satisfy the property) or almost-surely (the probability of the runs satisfying the property is 1).
Most classical work on algorithms for MDPs and stochastic games has focused on finite-state systems (e.g., [14,19,11]), but more recently several classes of infinite-state systems have been considered as well. For instance, MDPs and stochastic games on infinite-state probabilistic recursive systems (i.e., probabilistic pushdown automata with unbounded stacks) [13] and on one-counter systems [7,6] have been studied. Another infinite-state probabilistic model, which is incomparable to recursive systems, is a suitable probabilistic extension of Vector Addition Systems with States (VASS; a.k.a. Petri nets), which have a finite number of unbounded counters holding natural numbers.
Our contribution. We study the decidability of probability-1 qualitative reachability and Büchi objectives for infinite-state MDPs that are induced by suitable probabilistic extensions of VASS that we call VASS-MDPs. (Most quantitative objectives in probabilistic VASS are either undecidable, or the solution is at least not effectively expressible in (R, +, ∗, ≤) [2].) It is easy to show that, for general VASS-MDPs, even the simplest of these problems, (almost-)sure reachability, is undecidable. Thus we consider two monotone subclasses: 1-VASS-MDPs and P-VASS-MDPs. In 1-VASS-MDPs, only Player 1 can modify counter values while the probabilistic player can only change control-states, whereas for P-VASS-MDPs it is vice versa. Still, these two models induce infinite-state MDPs. Unlike for finite-state MDPs, it is possible that the value of the MDP, in the game-theoretic sense, is 1, even though there is no single strategy that achieves value 1. For example, there can exist a family of strategies σ_ε for every ε > 0, where playing σ_ε ensures a probability ≥ 1 − ε of reaching a given target state, but no strategy ensures probability 1. In this case, one says that the reachability property holds limit-surely, but not almost-surely (i.e., unlike in finite-state MDPs, almost-surely and limit-surely do not coincide in infinite-state MDPs).
We show that even for P-VASS-MDPs, all sure/almost-sure/limit-sure reachability/Büchi problems are still undecidable. However, in the deadlock-free subclass of P-VASS-MDPs, the sure reachability/Büchi problems become decidable (while the other problems remain undecidable). In contrast, for 1-VASS-MDPs, the sure/almost-sure/limit-sure reachability problem and the sure/almost-sure Büchi problem are decidable.
Our decidability results rely on two different techniques. For the sure and almost-sure problems, we prove that they can be reduced to the model-checking problem over VASS of a restricted fragment of the modal µ-calculus that has been proved decidable in [3]. For the limit-sure reachability problem in 1-VASS-MDPs, we use an algorithm which at each iteration reduces the dimension of the considered VASS while preserving the limit-sure reachability properties.
Although we do not consider the class of qualitative objectives referring to the probability of (repeated) reachability being strictly greater than 0, we observe that reachability on VASS-MDPs in such a setting is equivalent to reachability on standard VASS (though this correspondence does not hold for repeated reachability).

Outline. In Section 2 we define basic notations and how VASS induce Markov decision processes. In Sections 3 and 4 we consider verification problems for P-VASS-MDPs and 1-VASS-MDPs, respectively. In Section 5 we summarize the decidability results (Table 1) and outline future work.

Models and verification problems
Let N (resp. Z) denote the set of nonnegative integers (resp. integers). For two integers i ≤ j we use [i..j] to denote the set {k ∈ Z | i ≤ k ≤ j}. Given a set X and n ∈ N \ {0}, X^n is the set of n-dimensional vectors with values in X. We use 0 to denote the vector with 0(i) = 0 for all i ∈ [1..n]. The classical order ≤ on Z^n is defined by v ≤ w if and only if v(i) ≤ w(i) for all i ∈ [1..n]. We also define the operation + on n-dimensional vectors of integers componentwise, i.e., for v, v′ ∈ Z^n, the vector v + v′ is defined by (v + v′)(i) = v(i) + v′(i) for all i ∈ [1..n]. Given a set S, we use S* (respectively S^ω) to denote the set of finite (respectively infinite) sequences of elements of S. We now recall the notion of well-quasi-ordering (abbreviated wqo). A quasi-order (A, ⪯) is a wqo if for every infinite sequence of elements a1, a2, ... in A there exist two indices i < j such that ai ⪯ aj. For n > 0, (N^n, ≤) is a wqo (this is Dickson's Lemma). Given a set A with an ordering ⪯ and a subset B ⊆ A, the set B is said to be upward closed in A if a1 ∈ B, a2 ∈ A and a1 ⪯ a2 imply a2 ∈ B.
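As a small illustration of Dickson's Lemma (the fact that (N^n, ≤) is a wqo), the following sketch searches a finite sequence of vectors for the promised dominating pair; the function name and encoding are ours, purely for illustration.

```python
def first_dominating_pair(seq):
    """Return the first indices (i, j) with i < j and seq[i] <= seq[j]
    componentwise. By Dickson's Lemma such a pair exists in every
    infinite sequence over N^n, so for long enough finite sequences
    this search succeeds."""
    for j in range(len(seq)):
        for i in range(j):
            if all(a <= b for a, b in zip(seq[i], seq[j])):
                return (i, j)
    return None

# A sequence over N^2 whose length-3 prefix has no dominating pair,
# although one must appear eventually.
seq = [(2, 0), (1, 1), (0, 2), (1, 2)]
print(first_dominating_pair(seq))  # (1, 3): (1,1) <= (1,2)
```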

Markov decision processes
A probability distribution on a countable set X is a function f : X → [0, 1] such that Σ_{x∈X} f(x) = 1. We use D(X) to denote the set of all probability distributions on X. We first recall the definition of Markov decision processes.
Definition 1 (MDPs). A Markov decision process (MDP) M is a tuple ⟨C, C1, CP, A, →, p⟩ where: C is a countable set of configurations partitioned into C1 and CP (that is, C = C1 ∪ CP and C1 ∩ CP = ∅); A is a set of actions; → ⊆ C × A × C is a transition relation; p : CP → D(C) is a partial function which assigns to some configurations in CP probability distributions on C such that p(c)(c′) > 0 if and only if c -a-> c′ for some a ∈ A.
Note that our definition is equivalent to viewing MDPs as games played between a nondeterministic player (Player 1) and a probabilistic player (Player P). The set C1 contains the nondeterministic configurations (or configurations of Player 1) and the set CP contains the probabilistic configurations (or configurations of Player P). Given two configurations c, c′ in C, we write c → c′ whenever there exists a ∈ A such that c -a-> c′. We say that a configuration c ∈ C is a deadlock if there does not exist c′ ∈ C such that c → c′. We use C1^df (resp. CP^df) to denote the configurations of Player 1 (resp. of Player P) which are not deadlocks (df stands here for deadlock-free).
A play of the MDP M = ⟨C, C1, CP, A, →, p⟩ is either an infinite sequence of configurations c0 · c1 · c2 · · · such that ci → ci+1 for all i ∈ N, or a finite sequence c0 · c1 · · · ck such that ci → ci+1 for all i ∈ [0..k−1]. We call the first kind of play an infinite play, and the second one a finite play. A play is said to be maximal whenever it is infinite or ends in a deadlock configuration; these latter plays are called deadlocked plays. We use Ω to denote the set of maximal plays. For a finite play ρ = c0 · c1 · · · ck, we use last(ρ) to denote its last configuration ck. We use Ω1^df to denote the set of finite plays ρ such that last(ρ) ∈ C1^df. A strategy for Player 1 is a function σ : Ω1^df → C such that, for all ρ ∈ Ω1^df and c ∈ C, if σ(ρ) = c then last(ρ) → c. Intuitively, given a finite play ρ, which represents the history of the game so far, the strategy represents the choice of Player 1 among the different possible successor configurations of last(ρ). We use Σ to denote the set of all strategies for Player 1. Given a strategy σ ∈ Σ, an infinite play c0 · c1 · c2 · · · respects σ if, for every i ∈ N with ci ∈ C1, we have ci+1 = σ(c0 · c1 · · · ci). We define finite plays that respect σ similarly. Let Plays(M, c, σ) ⊆ Ω be the set of all maximal plays of M that start from c and that respect σ.
Note that once a starting configuration c0 ∈ C and a strategy σ have been chosen, the MDP is reduced to an ordinary stochastic process. We define an event A ⊆ Ω as a measurable set of plays and we use P(M, c, σ, A) to denote the probability of event A starting from c ∈ C under strategy σ. The notation P+(M, c, A) will be used to represent the maximal probability of event A starting from c, defined as P+(M, c, A) = sup_{σ∈Σ} P(M, c, σ, A).
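To make the quantities P(M, c, σ, A) and P+(M, c, A) concrete, here is a sketch of value iteration for maximal reachability probabilities on a finite-state MDP; the data layout, names and the toy example are our own assumptions for illustration, not constructions from the paper.

```python
def max_reach_prob(states_1, states_p, succ, prob, target, iters=200):
    """Value iteration for the maximal probability of reaching `target`
    in a finite MDP: Player-1 states maximize over successors,
    probabilistic states average over the given distribution."""
    val = {s: 1.0 if s in target else 0.0
           for s in states_1 | states_p | target}
    for _ in range(iters):
        new = {}
        for s in val:
            if s in target:
                new[s] = 1.0
            elif s in states_1:
                new[s] = max((val[t] for t in succ[s]), default=0.0)
            else:
                new[s] = sum(p * val[t] for t, p in prob[s].items())
        val = new
    return val

# Toy MDP: from `a`, Player 1 chooses `b` or a sink; from `b`, Player P
# goes to the target `goal` with probability 1/2 and back to `a` otherwise.
states_1, states_p = {"a", "sink"}, {"b"}
succ = {"a": ["b", "sink"], "sink": ["sink"]}
prob = {"b": {"goal": 0.5, "a": 0.5}}
v = max_reach_prob(states_1, states_p, succ, prob, target={"goal"})
print(round(v["a"], 3))  # → 1.0 (retrying the coin flip forever succeeds a.s.)
```

Here the supremum over strategies is attained by a single strategy; the point of the later 1-VASS-MDP examples is precisely that in infinite-state MDPs it need not be.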
We now extend VASS with probabilistic choices and with nondeterministic choices made by a controller. We call this new model VASS-MDPs. We first recall the definition of Vector Addition Systems with States.
Definition 2 (Vector Addition System with States). For n > 0, an n-dimensional Vector Addition System with States (VASS) is a tuple S = ⟨Q, T⟩ where Q is a finite set of control states and T ⊆ Q × Z^n × Q is the transition relation, labelled with vectors of integers.
In the sequel, we will not always make precise the dimension of the considered VASS. Configurations of a VASS are pairs ⟨q, v⟩ ∈ Q × N^n. Given a configuration ⟨q, v⟩ and a transition t = ⟨q, z, q′⟩ in T, we say that t is enabled at ⟨q′′, v⟩ if q = q′′ and v + z ≥ 0. Let En(⟨q, v⟩) be the set {t ∈ T | t is enabled at ⟨q, v⟩}.
An n-dimensional VASS S induces a labelled transition system ⟨C, T, →⟩ where C = Q × N^n is the set of configurations and the transition relation → ⊆ C × T × C is defined by ⟨q, v⟩ -t-> ⟨q′, v′⟩ if and only if t = ⟨q, z, q′⟩ ∈ T is enabled at ⟨q, v⟩ and v′ = v + z. VASS are sometimes seen as programs manipulating integer variables, a.k.a. counters. When a transition of a VASS changes the i-th value of a vector v, we will sometimes say that it modifies the value of the i-th counter. We now show how we add probability distributions to VASS.
Definition 3 (VASS-MDP). A VASS-MDP is a tuple S = ⟨Q, Q1, QP, T, τ⟩ where ⟨Q, T⟩ is a VASS whose set of control states Q is partitioned into Q1 and QP, and τ : T → N \ {0} is a function assigning to each transition a weight, which is a positive natural number.
Nondeterministic (resp. probabilistic) choices are made from control states in Q1 (resp. QP). The subset of transitions leaving control states of Q1 (resp. control states of QP) is denoted by T1 (resp. TP). Hence T = T1 ∪ TP with T1 ⊆ Q1 × Z^n × Q and TP ⊆ QP × Z^n × Q. A VASS-MDP S = ⟨Q, Q1, QP, T, τ⟩ induces an MDP M_S = ⟨C, C1, CP, T, →, p⟩ where: ⟨C, T, →⟩ is the labelled transition system associated with the VASS ⟨Q, T⟩; C1 = Q1 × N^n and CP = QP × N^n; and for all c ∈ CP^df and c′ ∈ C with c → c′, the probability of going from c to c′ is defined by p(c)(c′) = (Σ_{t ∈ En(c) such that c -t-> c′} τ(t)) / (Σ_{t ∈ En(c)} τ(t)). Note that in the case c → c′ there exists at least one transition in En(c), and consequently the sum Σ_{t ∈ En(c)} τ(t) is never equal to 0. Also, we could have restricted the weights to be assigned only to transitions leaving a control state in QP, since the weights assigned to the other transitions are not taken into account. A VASS-MDP is deadlock-free if its underlying VASS is deadlock-free.
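The induced probabilities of Definition 3 can be sketched as follows for a toy 1-dimensional VASS-MDP; the tuple encoding of transitions is an assumption made for illustration.

```python
def enabled(transitions, q, v):
    """Transitions (q, z, q') enabled at configuration (q, v): the
    source state matches and v + z stays componentwise nonnegative."""
    return [(s, z, t) for (s, z, t) in transitions
            if s == q and all(a + b >= 0 for a, b in zip(v, z))]

def step_distribution(transitions, tau, q, v):
    """Distribution over successors of a probabilistic configuration
    (q, v): each enabled transition t fires with probability
    tau[t] / (total weight of enabled transitions)."""
    en = enabled(transitions, q, v)
    total = sum(tau[t] for t in en)
    dist = {}
    for (s, z, t) in en:
        succ = (t, tuple(a + b for a, b in zip(v, z)))
        dist[succ] = dist.get(succ, 0) + tau[(s, z, t)] / total
    return dist

# From state "p" either decrement (weight 2) or move to "q" without
# touching the counter (weight 1).
T = [("p", (-1,), "p"), ("p", (0,), "q")]
tau = {T[0]: 2, T[1]: 1}
print(step_distribution(T, tau, "p", (1,)))  # both enabled: probabilities 2/3 and 1/3
print(step_distribution(T, tau, "p", (0,)))  # decrement disabled: only ("q",(0,))
```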
Finally, as in [18] or [3], we will see that to gain decidability it is useful to restrict the power of the nondeterministic player or of the probabilistic player by restricting their ability to modify the counter values, hence letting them only choose a control location. This leads to the two following definitions: a P-VASS-MDP is a VASS-MDP ⟨Q, Q1, QP, T, τ⟩ such that for all ⟨q, z, q′⟩ ∈ T1 we have z = 0, and a 1-VASS-MDP is a VASS-MDP ⟨Q, Q1, QP, T, τ⟩ such that for all ⟨q, z, q′⟩ ∈ TP we have z = 0. In other words, in a P-VASS-MDP, Player 1 cannot change the counter values when taking a transition, and in a 1-VASS-MDP it is Player P that cannot do so.

Verification problems for VASS-MDPs
We consider qualitative verification problems for VASS-MDPs, taking as objectives control-state reachability and repeated reachability. To simplify the presentation, we consider a single target control-state qF ∈ Q. However, our positive decidability results easily carry over to sets of target control-states (while the negative ones trivially do). Note, however, that asking to reach a fixed target configuration like ⟨qF, 0⟩ is a very different problem (cf. [2]).
Let S = ⟨Q, Q1, QP, T, τ⟩ be a VASS-MDP and M_S its associated MDP. Given a control state qF ∈ Q, we denote by ◇qF the set of infinite plays c0 · c1 · · · and deadlocked plays c0 · · · cl of M_S for which there exists an index k ∈ N such that ck = ⟨qF, v⟩ for some v ∈ N^n. Similarly, □◇qF denotes the set of infinite plays c0 · c1 · · · of M_S for which the set {i ∈ N | ci = ⟨qF, v⟩ for some v ∈ N^n} is infinite. Since M_S is an MDP with a countable number of configurations, the sets of plays ◇qF and □◇qF are measurable (for more details see for instance [4]) and are hence events for M_S. Given an initial configuration c0 ∈ Q × N^n and a control-state qF ∈ Q, we consider the following questions:
1. The sure reachability problem: Does there exist a strategy σ ∈ Σ such that Plays(M_S, c0, σ) ⊆ ◇qF?
2. The almost-sure reachability problem: Does there exist a strategy σ ∈ Σ such that P(M_S, c0, σ, ◇qF) = 1?
3. The limit-sure reachability problem: Does P+(M_S, c0, ◇qF) = 1?
4. The sure repeated reachability problem: Does there exist a strategy σ ∈ Σ such that Plays(M_S, c0, σ) ⊆ □◇qF?
5. The almost-sure repeated reachability problem: Does there exist a strategy σ ∈ Σ such that P(M_S, c0, σ, □◇qF) = 1?
6. The limit-sure repeated reachability problem: Does P+(M_S, c0, □◇qF) = 1?
Note that sure reachability implies almost-sure reachability, which itself implies limit-sure reachability, but not vice versa, as shown by the counterexamples in Figure 1 (see also [7]). The same holds for repeated reachability. For the sure problems, probabilities are not taken into account, so these problems can be interpreted as two-player reachability games played on the transition system of S. Such games have been studied for instance in [18,1,3]. Finally, VASS-MDPs subsume deadlock-free VASS-MDPs, and thus decidability (resp. undecidability) results carry over to the smaller (resp. larger) class.

Figure 1: The circles (resp. squares) are the control states of Player 1 (resp. Player P). All transitions have the same weight 1. From ⟨q0, 0⟩, the state qF is reached almost-surely, but not surely, due to the possible run with an infinite loop at q0 (which has probability zero). From ⟨q1, 0⟩, the state qF can be reached limit-surely (by a family of strategies that repeats the loop at q1 more and more often), but not almost-surely (or surely), since every strategy has a chance of getting stuck at state q2 with counter value zero.
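The limit-sure-but-not-almost-sure situation can be made quantitative in a stylized version of the second example, under the assumption (ours, for illustration) that each probabilistic round independently fails with probability 1/2: the strategy that pumps the counter to n before leaving q1 wins with probability 1 − 2^{-n}, so the supremum over strategies is 1 although no single strategy attains it.

```python
def reach_prob(n, p_fail=0.5):
    """Probability that strategy sigma_n (pump the counter to n, then
    enter the probabilistic phase) reaches q_F, in a stylized model
    where each probabilistic round independently burns one counter
    unit with probability p_fail before success. The exact transition
    structure is an illustrative assumption, not Figure 1 itself."""
    return 1.0 - p_fail ** n

for n in (1, 5, 20):
    print(n, reach_prob(n))
# The supremum over n is 1 (limit-sure), yet every fixed n leaves a
# failure probability of 0.5**n > 0, so no strategy wins almost-surely.
```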

Undecidability in the general case
It was shown in [1] that the sure reachability problem is undecidable for (2-dimensional) two-player VASS. From this we can deduce that the sure reachability problem is undecidable for VASS-MDPs. We now present a similar proof to show the undecidability of the almost-sure reachability problem for VASS-MDPs.
For all of our undecidability results we use reductions from the undecidable control-state reachability problem for Minsky machines. A Minsky machine is a tuple ⟨Q, T⟩ where Q is a finite set of states and T is a finite set of transitions manipulating two counters, say x1 and x2. Each transition is a triple of the form ⟨q, x_i = 0?, q′⟩ (counter x_i is tested for 0), ⟨q, x_i := x_i + 1, q′⟩ (counter x_i is incremented) or ⟨q, x_i := x_i − 1, q′⟩ (counter x_i is decremented). Configurations of a Minsky machine are triples in Q × N × N. The transition relation ⇒ between configurations of the Minsky machine is then defined in the obvious way. Given an initial state qI and a final state qF, the control-state reachability problem asks whether there exists a sequence of configurations leading from ⟨qI, 0, 0⟩ to ⟨qF, v1, v2⟩ for some v1, v2 ∈ N. This problem is known to be undecidable [16]. W.l.o.g. we assume that Minsky machines are deadlock-free and deterministic (i.e., each configuration has a unique successor) and that the only transition leaving qF is a self-loop on qF.

We now show how to reduce the control-state reachability problem to the almost-sure and limit-sure reachability problems in deadlock-free VASS-MDPs. From a Minsky machine, we construct a deadlock-free 2-dimensional VASS-MDP in which the control states of Player 1 are exactly the control states of the Minsky machine. The encoding is presented in Figure 2, where the circles (resp. squares) are the control states of Player 1 (resp. Player P), and each edge has weight 1. The state ⊥ is an absorbing state whose unique outgoing transition is a self-loop that does not affect the values of the counters. This encoding allows us to deduce our first result.

Theorem 1. The sure, almost-sure and limit-sure (repeated) reachability problems are undecidable for 2-dimensional deadlock-free VASS-MDPs.
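For concreteness, here is a small interpreter for deterministic two-counter Minsky machines in the guarded-command style described above; the tuple encoding of transitions is an illustrative assumption, not the paper's notation.

```python
def run_minsky(transitions, q0, q_target, max_steps=10_000):
    """Step a deterministic two-counter Minsky machine from (q0, 0, 0)
    and report whether q_target is reached within max_steps steps.
    Determinism means at most one guarded command is enabled per
    configuration."""
    q, c = q0, [0, 0]
    for _ in range(max_steps):
        if q == q_target:
            return True
        for (src, kind, i, dst) in transitions:
            if src != q:
                continue
            if kind == "zero?" and c[i] == 0:
                q = dst; break
            if kind == "inc":
                c[i] += 1; q = dst; break
            if kind == "dec" and c[i] > 0:
                c[i] -= 1; q = dst; break
        else:
            return False  # deadlock: no enabled transition
    return False

# q0: increment x1 twice, then move all of x1 into x2, then reach qF.
prog = [("q0", "inc", 0, "q1"), ("q1", "inc", 0, "q2"),
        ("q2", "dec", 0, "q2b"), ("q2b", "inc", 1, "q2"),
        ("q2", "zero?", 0, "qF")]
print(run_minsky(prog, "q0", "qF"))  # → True
```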
In the special case of 1-dimensional VASS-MDPs, the sure and almost-sure reachability problems are decidable [7].

Model-checking µ-calculus on single-sided VASS
It is well-known that there is a strong connection between model-checking branching-time logics and games; in our case we have in fact undecidability results for simple reachability games played on a VASS and for the model-checking of VASS with expressive branching-time logics [12]. However, for the latter, decidability can be regained by imposing restrictions on the VASS structure [3], as we now recall. We say that a VASS ⟨Q, T⟩ is (Q1, Q2)-single-sided iff Q1 and Q2 form a partition of the set of states Q such that for all transitions ⟨q, z, q′⟩ in T with q ∈ Q2 we have z = 0; in other words, only the transitions leaving a state of Q1 are allowed to change the values of the counters. In [3] it has been shown that, thanks to a reduction to games played on a single-sided VASS with parity objectives, a large fragment of the µ-calculus called L^sv_µ has a decidable model-checking problem over single-sided VASS. The idea of this fragment is that the "always" operator □ is guarded with a predicate enforcing the current control state to belong to Q2. Formally, the syntax of L^sv_µ for (Q1, Q2)-single-sided VASS is given by the following grammar:

φ ::= q | X | φ ∧ φ | φ ∨ φ | ◇φ | Q2 ∧ □φ | µX.φ | νX.φ

where q ranges over Q, Q2 stands for the formula ⋁_{q∈Q2} q, and X belongs to a set of variables 𝒳. The semantics of L^sv_µ is defined as usual: it associates to a formula φ and an environment ε : 𝒳 → 2^C a subset of configurations ⟦φ⟧_ε. We use ε0 to denote the environment which assigns the empty set to every variable. Given an environment ε, a variable X ∈ 𝒳 and a subset of configurations C′, we use ε[X := C′] to denote the environment ε′ which is equal to ε except on the variable X, where ε′(X) = C′. Finally, the notation ⟦φ⟧ corresponds to the interpretation ⟦φ⟧_{ε0}. The problem of model-checking single-sided VASS with L^sv_µ can then be defined as follows: given a single-sided VASS ⟨Q, T⟩, an initial configuration c0 and a formula φ of L^sv_µ, do we have c0 ∈ ⟦φ⟧?

Theorem 2 ([3]). The model-checking problem for L^sv_µ over single-sided VASS is decidable.
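On a finite transition system, the fixpoint semantics of µ-calculus formulae can be computed by Kleene iteration; the sketch below evaluates µX.(target ∨ ◇X), the set of states that can reach a target. This finite-state evaluator is only meant to illustrate the semantics; the decidability result of [3] for (infinite-state) single-sided VASS rests on entirely different arguments.

```python
def mu_diamond(states, succ, target):
    """Least fixpoint of X = target ∪ Pre(X), i.e. the states
    satisfying muX.(target ∨ <>X): those that can reach `target`.
    Kleene iteration terminates on finite state spaces."""
    X = set(target)
    while True:
        new = X | {s for s in states if any(t in X for t in succ.get(s, []))}
        if new == X:
            return X
        X = new

succ = {"a": ["b"], "b": ["c"], "c": ["c"], "d": ["d"]}
print(sorted(mu_diamond({"a", "b", "c", "d"}, succ, {"c"})))  # ['a', 'b', 'c']
```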

Verification of P-VASS-MDPs
In [3] it is proved that parity games played on single-sided deadlock-free VASS are decidable (this entails the decidability of model-checking L^sv_µ over single-sided VASS). We will see here that in the case of P-VASS-MDPs, in which only the probabilistic player can modify the counters, the decidability status depends on the presence of deadlocks in the system.

Undecidability in the presence of deadlocks
We point out that the reduction presented in Figure 2 to prove Theorem 1 does not carry over to P-VASS-MDPs, because in that construction both players have the ability to change the counter values. However, it is possible to perform a similar reduction leading to the undecidability of verification problems for P-VASS-MDPs, the main difference being that we crucially exploit the fact that the P-VASS-MDP can contain deadlocks.
We now explain the idea behind our encoding of Minsky machines into P-VASS-MDPs. Intuitively, Player 1 chooses a transition of the Minsky machine to simulate, anticipating the modification of the counter values, and Player P is then in charge of performing the change. If Player 1 chooses a transition with a decrement while the corresponding counter value is actually 0, then Player P ends up in a deadlock state and consequently the desired control state will not be reached. Furthermore, if Player 1 decides to perform a zero-test when the counter value is strictly positive, then Player P is able to punish this choice by entering a deadlock state. Similarly to the proof of Theorem 1, Player P can test whether the value of a counter is strictly greater than 0 by decrementing it. The encoding of the Minsky machine is presented in Figure 3. Note that no outgoing edge of Player 1's states changes the counter values. Furthermore, Player P reaches the control state ⊥ if and only if Player 1 chooses to take a transition with a zero-test when the value of the tested counter is not equal to 0. Note also that, with the encoding of the transition ⟨q3, x2 := x2 − 1, q4⟩, when Player P is in the control state between q3 and q4, it is in a deadlock if the value of the second counter equals 0. In the sequel we will see that in P-VASS-MDPs without deadlocks the sure reachability problem becomes decidable. From this encoding we deduce the following result.
Theorem 3. The sure, almost-sure and limit-sure (repeated) reachability problems are undecidable for 2-dimensional P-VASS-MDPs.

Sure (repeated) reachability in deadlock-free P-VASS-MDPs
Unlike in the case of general P-VASS-MDPs, we will see that the sure (repeated) reachability problem is decidable for deadlock-free P-VASS-MDPs. Let S = ⟨Q, Q1, QP, T, τ⟩ be a deadlock-free P-VASS-MDP, M_S = ⟨C, C1, CP, →, p⟩ its associated MDP and qF ∈ Q a control state. Note that because the P-VASS-MDP S is deadlock-free, Player P cannot take the play into a deadlock in order to avoid the control state qF, but has to deal only with infinite plays. Since S is a P-VASS-MDP, the VASS ⟨Q, T⟩ is (QP, Q1)-single-sided. In [18,1] it has been shown that control-state reachability games on deadlock-free single-sided VASS are decidable, and this result has been extended to parity games in [3]. This implies the decidability of sure (repeated) reachability in deadlock-free P-VASS-MDPs. However, to obtain a generic way of verifying these systems, we construct formulae of L^sv_µ that characterize the sets of winning configurations and then use the result of Theorem 2. Let V^P_S be the set of configurations from which the answer to the sure reachability problem (with qF as the state to be reached) is negative, and let W^P_S be the analogous set for the sure repeated reachability problem. Lemma 1 relates these two sets to formulae of L^sv_µ (where QP corresponds to the formula ⋁_{q∈QP} q and Q1 corresponds to the formula ⋁_{q∈Q1} q).

Note that we use (QP ∨ (Q1 ∧ X)) instead of (QP ∨ X) so that the formulae are in the guarded fragment of the µ-calculus. Since the two formulae belong to L^sv_µ for the (QP, Q1)-single-sided VASS ⟨Q, T⟩, decidability follows directly from Theorem 2.

Theorem 4. The sure reachability and sure repeated reachability problems are decidable for deadlock-free P-VASS-MDPs.

Almost-sure and limit-sure reachability in deadlock-free P-VASS-MDPs
We have seen that, unlike in the general case, the sure reachability and sure repeated reachability problems are decidable for deadlock-free P-VASS-MDPs, deadlock-freeness being necessary to obtain decidability. For the corresponding almost-sure and limit-sure problems we now show undecidability, again using a reduction from the reachability problem for two-counter Minsky machines, as shown in Figure 4. The main difference from the construction used in the proof of Theorem 3 lies in the addition of a self-loop in the encoding of the transitions that decrement a counter, in order to avoid deadlocks. If Player 1, from a configuration ⟨q3, v⟩, chooses the transition ⟨q3, x2 := x2 − 1, q4⟩ which decrements the second counter, then the probabilistic state with the self-loop is entered, and there are two possible cases: if v(2) > 0 then the probability of staying forever in this loop is 0 and the probability of eventually going to state q4 is 1; on the other hand, if v(2) = 0 then the probability of staying forever in the self-loop is 1, since the other transition leaving the state of Player P, the one which actually performs the decrement of the second counter, is not available. Note that such a construction does not work in the case of sure reachability, because the path that stays forever in the loop is a valid path. This allows us to deduce the following result for deadlock-free P-VASS-MDPs.
Theorem 5. The almost-sure and limit-sure (repeated) reachability problems are undecidable for 2-dimensional deadlock-free P-VASS-MDPs.

Verification of 1-VASS-MDPs

In this section we provide decidability results for the subclass of 1-VASS-MDPs. As for deadlock-free P-VASS-MDPs, the proofs for the sure and almost-sure problems use the decidability of L^sv_µ over single-sided VASS, whereas the technique used to show decidability of limit-sure reachability is different.

Sure problems in 1-VASS-MDPs
First we show that, unlike for P-VASS-MDPs, deadlocks do not matter for 1-VASS-MDPs. The idea is that if a deadlock occurs in a probabilistic configuration, then the corresponding control state has no outgoing edge at all (by the defining property of 1-VASS-MDPs, probabilistic transitions do not touch the counters), and hence one can add an edge to a new absorbing state; the same can be done for the states of Player 1. Such a construction does not work for P-VASS-MDPs, because there deadlocks in probabilistic configurations may depend on the counter values, and not just on the current control-state.

Lemma 2. The sure (resp. almost-sure, resp. limit-sure) (repeated) reachability problem for 1-VASS-MDPs reduces to the sure (resp. almost-sure, resp. limit-sure) (repeated) reachability problem for deadlock-free 1-VASS-MDPs.
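A sketch of the construction behind Lemma 2; the details here (in particular the choice to give every Player-1 state an unconditional escape edge) are our own assumptions, not the paper's exact definition.

```python
def remove_deadlocks(Q1, QP, T, tau):
    """Add a fresh absorbing state BOT with a zero-effect self-loop,
    route every probabilistic control-state without outgoing
    transitions to BOT, and give every Player-1 state a zero-effect
    escape edge to BOT, which Player 1 only needs when no other
    transition is enabled (illustrative sketch of Lemma 2)."""
    dim = len(T[0][1])
    zero = tuple([0] * dim)
    T2, tau2 = list(T), dict(tau)
    loop = ("BOT", zero, "BOT")
    T2.append(loop); tau2[loop] = 1
    for q in QP:
        if not any(src == q for (src, _, _) in T):
            t = (q, zero, "BOT"); T2.append(t); tau2[t] = 1
    for q in Q1:
        t = (q, zero, "BOT"); T2.append(t); tau2[t] = 1
    return Q1 | {"BOT"}, QP, T2, tau2  # BOT is given to Player 1 (a choice)

# "a" can only decrement, so (a, 0) would deadlock; "p" has no edge at all.
Q1n, QPn, T2, tau2 = remove_deadlocks({"a"}, {"p"},
                                      [("a", (-1,), "p")],
                                      {("a", (-1,), "p"): 1})
print(all(any(src == q for (src, _, _) in T2) for q in Q1n | QPn))  # True
```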
Hence in the sequel we will consider only deadlock-free 1-VASS-MDPs. Let S = ⟨Q, Q1, QP, T, τ⟩ be a deadlock-free 1-VASS-MDP. Concerning the sure (repeated) reachability problems, we can directly reuse Lemma 1 and show that the complements of the formulae given there belong to L^sv_µ for the (Q1, QP)-single-sided VASS ⟨Q, T⟩ (in fact the correctness of Lemma 1 did not depend on the fact that we were considering P-VASS-MDPs). Theorem 2 then allows us to recover the decidability results already known from [18] (for sure reachability) and [3] (for sure repeated reachability).

Theorem 6. The sure (repeated) reachability problem is decidable for 1-VASS-MDPs.

Almost-sure problems in 1-VASS-MDPs
We now move to the almost-sure problems for 1-VASS-MDPs. We consider a deadlock-free 1-VASS-MDP S = ⟨Q, Q1, QP, T, τ⟩ and its associated MDP M_S = ⟨C, C1, CP, →, p⟩. We will see that, unlike for P-VASS-MDPs, it is here also possible to characterize by formulae of L^sv_µ the following two sets: the set V^1_AS of configurations from which Player 1 has a strategy to reach the control state qF with probability 1, and the set W^1_AS of configurations from which Player 1 has a strategy to visit qF infinitely often with probability 1.
We begin by introducing the following formula of L^sv_µ based on the variables X and Y:

InvPre(X, Y) = (Q1 ∧ ◇(X ∧ Y)) ∨ (QP ∧ □X ∧ ◇Y)

Intuitively, this formula represents the set of configurations from which (i) Player 1 can make a transition into the intersection of the sets characterized by the variables X and Y, and (ii) Player P can make a transition into Y and cannot avoid making a transition into X.
Almost-sure reachability. We now prove that V^1_AS can be characterized by the following formula of L^sv_µ: νX.µY.(qF ∨ InvPre(X, Y)). Note that a similar result exists for finite-state MDPs, see e.g. [9]; this result does not extend to infinite-state MDPs in general, but in the case of VASS-MDPs it can be applied. Before proving this we need some intermediate results.

We denote by E the set ⟦νX.µY.(qF ∨ InvPre(X, Y))⟧_{ε0}. Since νX.µY.(qF ∨ InvPre(X, Y)) is a formula of L^sv_µ interpreted over the single-sided VASS ⟨Q, T⟩, we can show that E is an upward-closed set. We also need a lemma stating that there exist N ∈ N and a strategy for Player 1 such that, from any configuration of E, Player 1 can reach the control state qF in at most N steps while Player P cannot take the play outside of E. The fact that we can bound the number of steps is crucial for showing that ⟦νX.µY.(qF ∨ InvPre(X, Y))⟧_{ε0} is equal to V^1_AS; for infinite-state MDPs where this property does not hold, our techniques do not apply.

Lemma 3. There exist N ∈ N and a strategy σ of Player 1 such that for every c ∈ E there is a play in Plays(M_S, c, σ) that reaches a configuration with control state qF within N steps, while every play in Plays(M_S, c, σ) remains inside E.

This lemma allows us to characterize V^1_AS by a formula of L^sv_µ. The proof of this characterization uses the fact that the number of steps is bounded, and also the fact that the sets described by closed L^sv_µ formulae are upward-closed; this makes the fixpoint iteration terminate in a finite number of steps. Since ⟨Q, T⟩ is (Q1, QP)-single-sided and the formula associated with V^1_AS belongs to L^sv_µ, Theorem 2 yields the following theorem.

Theorem 7. The almost-sure reachability problem is decidable for 1-VASS-MDPs.
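For a finite-state MDP, the nested fixpoint νX.µY.(qF ∨ InvPre(X, Y)) can be evaluated directly, which may help build intuition for the characterization of V^1_AS. This is the classical finite-state algorithm in the spirit of [9], not the VASS procedure; the encoding is our own.

```python
def almost_sure_reach(states_1, states_p, succ, target):
    """Nested fixpoint nu X. mu Y. (target ∨ InvPre(X, Y)) on a finite
    MDP. InvPre(X, Y): Player-1 states with some successor in X ∩ Y;
    Player-P states whose successors all lie in X, at least one in Y."""
    states = states_1 | states_p

    def invpre(X, Y):
        win1 = {s for s in states_1 if any(t in X and t in Y for t in succ[s])}
        winp = {s for s in states_p
                if all(t in X for t in succ[s]) and any(t in Y for t in succ[s])}
        return win1 | winp

    X = set(states)
    while True:  # greatest fixpoint: X shrinks
        Y = set()
        while True:  # least fixpoint: Y grows
            newY = (target | invpre(X, Y)) & states
            if newY == Y:
                break
            Y = newY
        if Y == X:
            return X
        X = Y

# Player 1 owns "a" and the absorbing target "goal"; Player P owns "b"
# (a fair choice between "goal" and retrying via "a") and the sink "bad".
succ = {"a": ["b"], "b": ["goal", "a"], "bad": ["bad"], "goal": ["goal"]}
print(sorted(almost_sure_reach({"a", "goal"}, {"b", "bad"}, succ, {"goal"})))
# → ['a', 'b', 'goal']
```

If instead "b" could fall into "bad" (replace the edge back to "a" by one to "bad"), the winning set collapses to {"goal"}: the outer fixpoint removes states whose probabilistic successors can leave X.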
Almost-sure repeated reachability. For almost-sure repeated reachability we reuse the previously introduced formula InvPre(X, Y). Reasoning similarly to the previous case, we can provide a characterization of the set W^1_AS by a formula of L^sv_µ. As previously, this allows us to deduce the decidability of the almost-sure repeated reachability problem for 1-VASS-MDPs.

Theorem 8. The almost-sure repeated reachability problem is decidable for 1-VASS-MDPs.

Limit-sure reachability in 1-VASS-MDP
We consider a slightly more general version of the limit-sure reachability problem with a set X ⊆ Q of target states instead of a single state q F , i.e., the standard case corresponds to X = {q F }.
We extend the set of natural numbers N to N* = N ∪ {*} by adding an element * ∉ N with * + j = * − j = * and j < * for all j ∈ N. We then consider the set of vectors N*^d. The projection of a vector v that eliminates the component indexed by k is defined by proj_k(v)(i) = v(i) if i ≠ k and proj_k(v)(i) = * otherwise. Let Q_c represent control-states indexed by a color; the coloring functions col_i : Q → Q_c create colored copies of control-states by col_i(q) = q_i.
Given a 1-VASS-MDP S = ⟨Q, Q1, QP, T, τ⟩ of dimension d, an index k ≤ d and a color i, the colored projection proj_{k,i}(S) is obtained by removing component k from every transition and coloring every control-state with color i. Here proj_{k,i}(T) = {proj_{k,i}(t) | t ∈ T} is the projection of the set of transitions T, and proj_{k,i}(t) = ⟨col_i(x), proj_k(z), col_i(y)⟩ is the projection of a transition t = ⟨x, z, y⟩, obtained by removing component k and coloring the states x and y with color i. The transition weights carry over, i.e., τ_{k,i}(proj_{k,i}(t)) = τ(t). We define the functions state and count which return, for a configuration c = ⟨q, v⟩, its control-state q and its vector v of counter values, respectively. For any two configurations c1 and c2, we write c1 ≺ c2 to denote that state(c1) = state(c2) and there exists a nonempty set of indices I such that count(c1)(i) < count(c2)(i) for every i ∈ I, whereas count(c1)(j) = count(c2)(j) for every index j ∉ I with 0 < j ≤ d.

Algorithm 1 reduces the dimension of the limit-sure reachability problem for 1-VASS-MDPs by a construction resembling the Karp-Miller tree [15]. It takes as input a 1-VASS-MDP S of some dimension d > 0 with a set of target states X. It outputs a new 1-VASS-MDP S′ of dimension d − 1 and a new set of target states X′ such that M_S can limit-surely reach X iff M_{S′} can limit-surely reach X′. In particular, in the base case where d − 1 = 0, the new system S′ has dimension zero and thus induces a finite-state MDP M_{S′}, for which limit-sure reachability of X′ coincides with almost-sure reachability of X′, which is known to be decidable in polynomial time. Algorithm 1 starts by exploring all branches of the computation tree of S (adding them to S′ as the so-called initial uncolored part) until it encounters a configuration that is either (1) equal to, or (2) strictly larger than a configuration encountered previously on the same branch. In case (1) it just adds a back loop to the point where the configuration was encountered previously. In case (2), it adds a modified copy of S (identified by a unique color) to S′.
This so-called colored subsystem is similar to S except that those counters that have strictly increased along the branch are removed. The intuition is that these counters could be pumped to arbitrarily high values and thus present no obstacle to reaching the target. Since the initial uncolored part is necessarily finite (by Dickson's Lemma) and each of the finitely many colored subsystems only has dimension d−1 (since a counter is removed; possibly a different one in different colored subsystems), the resulting 1-VASS-MDP S ′ has dimension d − 1. The set of target states X ′ is defined as the union of all appearances of states in X in the uncolored part, plus all colored copies of states from X in the colored subsystems.
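The exploration phase described above can be sketched as follows. This is an illustrative reconstruction, not the paper's pseudocode: the representation of S as a dict `trans` mapping a control-state to (counter update, successor state) pairs is an assumption, and probabilistic branching is not distinguished from controlled branching.

```python
# Illustrative sketch of the exploration loop of Algorithm 1 (names are ours).

def explore(q0, v0, trans):
    """Explore the computation tree until every branch hits case (1) or (2)."""
    back_loops, colored_copies = [], []
    stack = [(q0, tuple(v0), [])]  # (state, counters, ancestors on this branch)
    while stack:
        q, v, ancestors = stack.pop()
        # Case (1): exactly this configuration occurred before on the branch:
        # record a back loop to its earlier occurrence.
        if (q, v) in ancestors:
            back_loops.append(((q, v), ancestors.index((q, v))))
            continue
        # Case (2): a strictly smaller configuration with the same control-state
        # occurred before (the relation "prec" of the text): the strictly
        # increased counters can be pumped, so a colored copy of S with those
        # counters projected out would be added here.
        anc = next((a for a in ancestors
                    if a[0] == q and a[1] != v
                    and all(x <= y for x, y in zip(a[1], v))), None)
        if anc is not None:
            increased = [k for k, (x, y) in enumerate(zip(anc[1], v)) if x < y]
            colored_copies.append(((q, v), increased))
            continue
        # Otherwise keep unfolding; counters must stay non-negative.
        for delta, q2 in trans.get(q, []):
            v2 = tuple(x + d for x, d in zip(v, delta))
            if all(x >= 0 for x in v2):
                stack.append((q2, v2, ancestors + [(q, v)]))
    return back_loops, colored_copies
```

By Dickson's Lemma, every branch eventually falls into case (1) or case (2), so the loop terminates.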
Lemma 6. Algorithm 1 terminates.

By Dickson's Lemma, the condition on line 7 or on line 19 of the algorithm must eventually hold on every branch of the explored computation tree; thus the algorithm terminates.
The next lemma states the correctness of Algorithm 1. Let S = ⟨Q, Q_1, Q_P, T, τ⟩ be a 1-VASS-MDP of dimension d > 0 with initial configuration c_0 = ⟨q_0, v⟩ and X ⊆ Q a set of target states. Let S′ = ⟨Q′, Q′_1, Q′_P, T′, τ′⟩ with initial configuration c′_0 = ⟨q′_0, 0⟩ and set of target states X′ ⊆ Q′ be the (d − 1)-dimensional 1-VASS-MDP produced by Algorithm 1. As described above, we have the following relation between these two systems.
By applying the result of the previous lemma iteratively until we obtain a finite-state MDP, we can deduce the following theorem.
Theorem 9. The limit-sure reachability problem for 1-VASS-MDP is decidable. Table 1 summarizes our results on the decidability of verification problems for subclasses of VASS-MDP. The exact complexity of most problems is still open.

Conclusion and Future Work
Algorithm 1 Reducing the dimension of the limit-sure reachability problem.
The decidability of the limit-sure repeated reachability problem for 1-VASS-MDP is open. A hint of its difficulty is given by the fact that there are instances where the property holds even though a small chance of reaching a deadlock cannot be avoided from any reachable configuration. In particular, a solution would require an analysis of the long-run behavior of multi-dimensional random walks induced by probabilistic VASS. However, these may exhibit strange nonregular behaviors for dimensions ≥ 3, as described in [8] (Section 5).

A.1 Proof of Lemma 1
Although the result of this lemma is quite standard, we provide the proof for completeness, in particular to ensure that the reasoning is not affected by the fact that we are dealing with infinite-state systems.
Proof. We denote by U the set νX.((⋁_{q∈Q\{q_F}} q) ∧ (Q_1 ∨ ♦X) ∧ (Q_P ∨ □X)), i.e., the configurations that avoid q_F, all of whose successors stay in X when they belong to Player 1, and at least one of whose successors stays in X when they are probabilistic. We consider the function g : 2^C → 2^C such that for each set of configurations C′ ⊆ C, g(C′) is the evaluation of (⋁_{q∈Q\{q_F}} q) ∧ (Q_1 ∨ ♦X) ∧ (Q_P ∨ □X) where X is interpreted as C′. Note that U is then the greatest fixpoint of g and hence U = g(U).
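On a finite MDP, such a greatest fixpoint can be computed by iteratively removing configurations that violate one of the conjuncts. The following sketch is illustrative only (names are ours, and a deadlock-free finite system given by successor sets is assumed):

```python
# Illustrative sketch: greatest fixpoint nu X. (not q_F) /\ (Q_1 => all succ
# in X) /\ (Q_P => some succ in X), computed by iterated removal.

def gfp_avoid(states, player1, succ, qF):
    """Return the configurations from which Player 1 cannot surely reach qF."""
    U = {s for s in states if s not in qF}
    changed = True
    while changed:
        changed = False
        for s in set(U):
            ok = (all(t in U for t in succ[s]) if s in player1
                  else any(t in U for t in succ[s]))
            if not ok:
                U.discard(s)  # s violates its conjunct: remove and re-iterate
                changed = True
    return U
```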
We first prove that U ⊆ V P S . Let c be a configuration in C such that c / ∈ V P S . Then there exists a strategy σ ∈ Σ such that Plays(M S , c, σ) ⊆ ♦q F . We consider such a strategy σ and we reason by contradiction assuming that c ∈ U .
Let us show that since c ∈ U , there exists an infinite play c 0 → c 1 → c 2 → . . . in Plays(M S , c, σ) such that c 0 = c and c i in U for all i ∈ N. We prove in fact by induction that if c 0 → c 1 . . . → c k is a finite play in M S respecting σ with c 0 = c and such that c i ∈ U for i ∈ [0..k], then there exists c k+1 ∈ U such that c 0 → c 1 . . . → c k+1 is a play in M S respecting σ. The base case is obvious since c ∈ U . We assume now that c 0 → c 1 . . . → c k is a finite play in M S respecting σ such that c i ∈ U for i ∈ [0..k]. Because there is no deadlock, if c k ∈ C 1 then there exists c k+1 ∈ C satisfying c k+1 = σ(c 0 → c 1 . . . → c k ). Furthermore since c k ∈ U and U = g(U ), we deduce that, for all c ∈ C such that c k → c, we have c ∈ U , and consequently c k+1 ∈ U . On the other hand, if c k ∈ C P , then since c k ∈ U , there exists c k+1 ∈ U such that c k → c k+1 . In both cases, we have that c 0 → c 1 . . . → c k → c k+1 is a play in M S which respects σ.
We deduce the existence of an infinite play c_0 → c_1 → c_2 → ... in Plays(M_S, c, σ) such that c_0 = c and c_i ∈ U for all i ∈ N. Note that, because U = g(U), every configuration in U has its control-state in Q \ {q_F}, so this play never visits q_F. However, since c_0 → c_1 → c_2 → ... belongs to Plays(M_S, c, σ), we also have c_0 → c_1 → c_2 → ... ∈ ♦q_F, which is a contradiction. Consequently, we have c ∉ U, and this allows us to conclude that U ⊆ V^P_S.
We now prove that V^P_S ⊆ U. By the Knaster-Tarski Theorem, since U is the greatest fixpoint of g, we know that U = ⋃{C′ ⊆ C | C′ ⊆ g(C′)}. It hence suffices to show that V^P_S ⊆ g(V^P_S). Let c = ⟨q, v⟩ be a configuration in V^P_S. First note that by definition of V^P_S, we have q ≠ q_F. We then reason by a case analysis to prove that c ∈ g(V^P_S). First assume c ∈ C_1. Then by definition of V^P_S, for all c′ = ⟨q′, v′⟩ in C satisfying c → c′, we have c′ ∈ V^P_S; otherwise Player 1 would have a strategy to surely reach q_F from c, consisting of taking the transition leading to c′ and then following a strategy that surely reaches q_F from c′. This allows us to deduce that c ∈ g(V^P_S). Assume now that c ∈ C_P. Then, because c ∈ V^P_S and S is deadlock free, there necessarily exists c′ such that c → c′ and c′ ∈ V^P_S (otherwise Player 1 would have a strategy to surely reach q_F from every c′ with c → c′, and hence c would not be in V^P_S). So also in this case we have c ∈ g(V^P_S). Hence V^P_S ⊆ g(V^P_S), which allows us to deduce that V^P_S ⊆ U. ⊓⊔

Proof. We denote by U the set µY.νX.
and we consider the function h : 2^C → 2^C such that for each set of configurations C′ ⊆ C, h(C′) is obtained by evaluating the inner νX subformula with Y interpreted as C′. Note that U is then the least fixpoint of h.
We first prove that U ⊆ W^P_S. Since U is the least fixpoint of h, by the Knaster-Tarski Theorem we know that U = ⋂{C′ ⊆ C | h(C′) ⊆ C′}. We will hence show that h(W^P_S) ⊆ W^P_S, from which we will get U ⊆ W^P_S. For this, we consider the function g_{W^P_S} : 2^C → 2^C which, for each set of configurations C′ ⊆ C, evaluates the body of the inner fixpoint with X interpreted as C′ and Y as W^P_S. The set V = h(W^P_S) is then by definition the greatest fixpoint of g_{W^P_S} (so we have as well V = g_{W^P_S}(V)). Hence we need to show that V ⊆ W^P_S. For this we will assume that c ∉ W^P_S and show that c ∉ V. Let c ∉ W^P_S. Hence there exists a strategy σ ∈ Σ such that Plays(M_S, c, σ) ⊆ □♦q_F. We reason now by contradiction, assuming that c ∈ V. We will show that either there exists a finite play c_0 → c_1 → ... → c_k in M_S which respects σ and with c_k ∈ W^P_S ∩ q_F, or there exists an infinite play c_0 → c_1 → c_2 → ... in Plays(M_S, c, σ) such that c_0 = c and c_i ∉ q_F for all i ∈ N \ {0}. We prove in fact by induction that if c_0 → c_1 → ... → c_k is a finite play in M_S respecting σ with c_0 = c and such that c_i ∈ V and c_i ∉ W^P_S for i ∈ [0..k], then there exists c_{k+1} ∈ C such that either c_{k+1} ∈ W^P_S ∩ q_F or (c_{k+1} ∈ V and c_{k+1} ∉ q_F), and such that c_0 → c_1 → ... → c_{k+1} is a play in M_S respecting σ. We proceed with the base case. First note that c ∈ V and c ∉ W^P_S. We recall that V = g_{W^P_S}(V).
- If c ∈ C_1, then let c_1 = σ(c). Since c ∈ g_{W^P_S}(V) and c ∈ C_1, we necessarily have that either (c_1 ∈ V and c_1 ∉ q_F) or c_1 ∈ W^P_S ∩ q_F, and c → c_1 is a play in M_S respecting σ.
-If c ∈ C P , then by definition of g W P S (V ), there exists necessarily c 1 such that either (c 1 ∈ V and c 1 / ∈ q F ) or c 1 ∈ W P S ∩ q F and c → c 1 . Furthermore this allows us to deduce that c → c 1 is a play in M S respecting σ.
For the inductive case, the proof works exactly the same way.
From this we deduce that either there exists a finite play c → c_1 → ... → c_k in M_S respecting σ with c_k ∈ W^P_S ∩ q_F, or there exists an infinite play c → c_1 → c_2 → ... in Plays(M_S, c, σ) such that c_0 = c and c_i ∉ q_F for all i ∈ N. We recall that we have Plays(M_S, c, σ) ⊆ □♦q_F and proceed by a case analysis to show a contradiction: in the first case, σ shifted to the suffix starting at c_k still surely ensures □♦q_F, contradicting c_k ∈ W^P_S; in the second case, the infinite play never visits q_F, contradicting Plays(M_S, c, σ) ⊆ □♦q_F directly. We hence deduce that c ∉ V, and consequently we have shown that U ⊆ W^P_S.

We now prove that W^P_S ⊆ U. To do that we will instead show that the complement of U, denoted by U̅, is included in the complement of W^P_S, which is the set W^1_S = {c ∈ C | ∃σ ∈ Σ s.t. Plays(M_S, c, σ) ⊆ □♦q_F}. Note that then we have as well that U̅ = µX.
By adapting the proof of Lemma 8, we can deduce that U′ = {c ∈ C | ∃σ ∈ Σ s.t. Plays(M_S, c, σ) ⊆ ♦T} (where ♦T denotes the plays that eventually reach the set T). We can now prove that U̅ ⊆ W^1_S. Let c ∈ U̅. Since U̅ ⊆ {c ∈ C | ∃σ ∈ Σ s.t. Plays(M_S, c, σ) ⊆ ♦T}, from c Player 1 can surely reach T; by definition of T, when the play first reaches T it is in q_F, and Player 1 can then ensure that a successor state belongs to U̅. Hence, performing this reasoning iteratively, we can build a strategy for Player 1 to surely visit q_F infinitely often from c. ⊓⊔

Proof. Let S = ⟨Q, Q_1, Q_P, T, τ⟩ be a 1-VASS-MDP and M_S = ⟨C, C_1, C_P, →, p⟩ its associated MDP. If a configuration ⟨q, v⟩ ∈ C_P is a deadlock, then there is no outgoing edge in S from the control-state q; hence each time a play reaches this configuration, Player 1 loses, so we can add a self-loop without any effect on the counters to this state in order to remove the deadlock. For the states q ∈ Q_1, we add an outgoing transition which does not modify the counter values and which leads to a new absorbing control-state with a self-loop, so that if a play reaches a configuration ⟨q, v⟩ that is a deadlock in S, in the new game arena the only choice for Player 1 is to go to this new absorbing state, where he loses just as he loses in S because of the deadlock. ⊓⊔

B.2 Proof of Theorem 6
If we define the two following sets of configurations V^1_S = {c ∈ C | ∃σ ∈ Σ such that Plays(M_S, c, σ) ⊆ ♦q_F} and W^1_S = {c ∈ C | ∃σ ∈ Σ such that Plays(M_S, c, σ) ⊆ □♦q_F}, we have the following result: Proof. Since we are looking at deadlock-free VASS-MDPs, we can reuse the formulae given by Lemmas 8 and 9. In fact, note that we have V^1_S = V̅^P_S and W^1_S = W̅^P_S (the complements of V^P_S and W^P_S). Hence, by taking the complements of the formulae of L^sv_µ, we obtain the desired result. For V^1_S, from the formula describing V^P_S, we obtain the formula µX. q_F ∨ (Q_1 ∧ ♦X) ∨ (¬Q_1 ∧ □X), which is equivalent to µX. q_F ∨ (Q_1 ∧ ♦X) ∨ (Q_P ∧ □X). For W^1_S, from the formula describing W^P_S, we obtain the formula νY.µX.
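On a finite MDP, the least-fixpoint formula for sure reachability corresponds to the usual backward induction: at a Player-1 state some successor must already be winning, at a probabilistic state all successors must be. A minimal sketch under an assumed finite successor-set representation (all names are illustrative):

```python
# Illustrative sketch: least fixpoint mu X. q_F \/ (Q_1 /\ some succ in X)
# \/ (Q_P /\ all succ in X) on a finite MDP.

def sure_reach(states, player1, succ, target):
    """Return the states from which Player 1 surely reaches the target."""
    win = set(target)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            succs = succ[s]
            ok = (any(t in win for t in succs) if s in player1
                  else bool(succs) and all(t in win for t in succs))
            if ok:
                win.add(s)
                changed = True
    return win
```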
We will say that an environment ε : X → 2 C is upward-closed if for each variable X ∈ X , ε(X) is upward-closed (we take as order ≤ for the configurations, the classical one such that q, v ≤ q ′ , v ′ iff q = q ′ and v ≤ v ′ ) . Whereas it is not true that for any VASS, any formula φ ∈ L sv µ and any upward closed environment ε : X → 2 C the set φ ε is upward closed, we now prove that on single-sided VASS this property holds.
Lemma 11. For any formula φ ∈ L sv µ and any upward closed environment ε, φ ε evaluated over the configurations of the (Q 1 , Q P )-single-sided VASS Q, T is an upward closed set.
Proof. The proof is by induction on the length of the formula φ. For formulae of the form q, the result is due to the fact that the set of considered regions is upward closed. For formulae of the form X, the result comes from the assumption on the considered environment. For formulae of the form φ ∧ ψ and µX.φ, the result can be obtained using the induction hypothesis and the fact that the intersection of upward-closed sets is an upward-closed set. For formulae of the form φ ∨ ψ and νX.φ, the result can be obtained using the induction hypothesis and the fact that the union of upward-closed sets is an upward-closed set. Now we consider formulae of the form ♦φ, assuming that for any upward-closed environment ε, ⟦φ⟧ε is an upward-closed set. Let c_1 ∈ ⟦♦φ⟧ε. Then there exists c′_1 ∈ ⟦φ⟧ε such that c_1 → c′_1. Let c_2 ∈ C such that c_1 ≤ c_2. Since we are considering VASS, we know that there exists c′_2 ∈ C such that c_2 → c′_2 and c′_1 ≤ c′_2. Since ⟦φ⟧ε is upward closed, we have c′_2 ∈ ⟦φ⟧ε, hence c_2 belongs to ⟦♦φ⟧ε. This proves that ⟦♦φ⟧ε is upward closed. Now we consider formulae of the form Q_P ∧ □φ, assuming that for any upward-closed environment ε, ⟦φ⟧ε is an upward-closed set. Let c_1 = ⟨q_1, v_1⟩ ∈ ⟦Q_P ∧ □φ⟧ε, let c_2 = ⟨q_2, v_2⟩ ∈ C with c_1 ≤ c_2, and consider any c′_2 ∈ C such that c_2 → c′_2. By definition of the order ≤ on the set of configurations, we have q_1 = q_2. Since q_1 ∈ Q_P, by the definition of single-sided VASS, we know that the transition which leads from c_2 to c′_2 can also be taken from c_1 (this is because the outgoing transitions from control-states in Q_P do not modify the counter values); hence there exists c′_1 ∈ C such that c_1 → c′_1 and, since c_1 ≤ c_2, we have c′_1 ≤ c′_2. Furthermore, we have c′_1 ∈ ⟦φ⟧ε and since, by induction, this last set is upward closed, we deduce c′_2 ∈ ⟦φ⟧ε. This allows us to conclude that c_2 belongs to ⟦Q_P ∧ □φ⟧ε, which is hence an upward-closed set.
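Upward-closedness is what makes such fixpoint computations effective: by Dickson's Lemma, an upward-closed set of configurations is determined by its finitely many minimal elements. The following sketch of this standard representation is illustrative (configurations as (state, vector) pairs, names ours), not the paper's algorithm:

```python
# Illustrative sketch: representing an upward-closed set by its minimal basis.

def leq(c1, c2):
    """The order used above: same control-state, componentwise <= counters."""
    return c1[0] == c2[0] and all(x <= y for x, y in zip(c1[1], c2[1]))

def minimize(configs):
    """Keep only the minimal elements of a finite set of configurations."""
    return {c for c in configs
            if not any(leq(d, c) and d != c for d in configs)}

def member(c, basis):
    """c is in the upward closure of `basis` iff some basis element is below it."""
    return any(leq(b, c) for b in basis)
```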

B.4 Proof of Lemma 3
Proof. We consider the function h : 2^C → 2^C which associates to each set of configurations C′ ⊆ C the set h(C′) = ⟦q_F ∨ InvPre(X, Y)⟧ε0[X:=E, Y:=C′]. We define a sequence of sets (F_i)_{i∈N} included in C as follows: F_0 = ∅ and F_{i+1} = h(F_i) for all i ∈ N. Using Lemma 11 and the fact that the union of upward-closed sets is an upward-closed set, we can prove that F_i is upward closed for all i ∈ N. Furthermore, we have F_i ⊆ F_{i+1} for all i ∈ N. Since (F_i)_{i∈N} is an increasing sequence of upward-closed sets included in C and since (C, ≤) is a wqo, from the theory of wqo we know that there exists N ∈ N such that F_i = F_{i+1} for all i ≥ N. We consider also the function g : 2^C → 2^C, which associates to each set of configurations C′ ⊆ C the set g(C′) = ⟦µY. q_F ∨ InvPre(X, Y)⟧ε0[X:=C′]. By definition, E is the greatest fixpoint of the function g, hence E = ⟦µY. q_F ∨ InvPre(X, Y)⟧ε0[X:=E]. E is then also the least fixpoint of the function h and consequently, by definition of the sequence (F_i)_{i∈N}, we know that E = ⋃_{i∈N} F_i. This allows us to deduce that E = F_N. We point out that F_1 = ⟦q_F⟧.
We now define a strategy σ for Player 1 which will be memoryless on E (i.e., the strategy will only depend on the current configuration), given as a function σ : E ∩ C_1 → C (note that since with this strategy all the plays starting from E stay in E, we do not need to define it precisely on the entire set C_1, and we assume that, on the set C_1 \ E, the strategy can choose any one of the possible successor configurations). Let c ∈ E ∩ C_1 with c ∉ F_1. We denote by j ∈ [2..N] the smallest index such that c ∈ F_j and c ∉ F_{j−1}, and we define σ(c) as a configuration c′ such that c → c′ and c′ ∈ F_{j−1} (by definition of the sequence (F_i)_{i∈N}, such a c′ necessarily exists). If c belongs to F_1, then the strategy chooses any one of the possible successors.
Let c 0 ∈ E. We show that there exists a play c 0 ·c 1 ·c 2 ·. . . in Plays(M S , c 0 , σ) that satisfies the three properties of the lemma. In fact, we consider the play such that in all configurations c ∈ E ∩ C P and if j ∈ [2..N ] is the smallest index such that c ∈ F j and c / ∈ F j−1 , Player P chooses c ′ such that c → c ′ and c ′ ∈ F j−1 . Hence in this play it is obvious that in less than N steps, the play will reach a configuration in F 1 = q F and the points 2. and 3. also hold for this play by definition of the sets (F i ) i∈N . ⊓ ⊔
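The strategy construction of the two preceding paragraphs can be sketched as follows, assuming the finitely many levels F_1 ⊆ ... ⊆ F_N are given explicitly as finite sets (an assumption made for illustration only; all names are ours):

```python
# Illustrative sketch: memoryless level-decreasing strategy extraction.

def extract_strategy(levels, player1, succ):
    """levels = [F_1, ..., F_N]; at a Player-1 configuration in F_j \\ F_{j-1},
    move to some successor in F_{j-1} (it exists by construction)."""
    sigma = {}
    for j in range(1, len(levels)):
        for c in levels[j] - levels[j - 1]:
            if c in player1:
                sigma[c] = next(t for t in succ[c] if t in levels[j - 1])
    return sigma
```

Following σ, every play from E descends one level per step and thus reaches F_1 within N steps, as used in the proof.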

B.5 Proof of Lemma 4
We now prove that the set E is included in V^1_AS. The techniques used here are quite similar to those presented in [2] to prove decidability of the sure reachability problem in probabilistic VASS (without nondeterminism).

Lemma 12. E ⊆ V^1_AS.
Proof. We consider the integer N ∈ N and the strategy σ of Player 1 given by Lemma 3. For each q ∈ Q_P, let Out(q) = {(q, z, q′) ∈ T | q′ ∈ Q and z ∈ Z^d} be the set of transitions going out of q. We also denote by L_q the cardinality of Out(q), by W_q the sum Σ_{t∈Out(q)} τ(t), and by Min_q the minimal element of {τ(t) | t ∈ Out(q)}. By definition of VASS-MDPs, we know that for a configuration ⟨q, v⟩ ∈ C_P and any configuration c′ ∈ C such that ⟨q, v⟩ → c′, we have p(⟨q, v⟩)(c′) ≥ Min_q / (L_q · W_q). We denote by β the minimal element of the set {Min_q / (L_q · W_q) | q ∈ Q_P}. Then for any configuration c ∈ C_P and c′ ∈ C such that c → c′, we have p(c)(c′) ≥ β. Note that necessarily β > 0. Let c_0 ∈ E \ ⟦q_F⟧ and let c_0 · c_1 · c_2 ⋯ be a play in Plays(M_S, c_0, σ) such that c_i ∉ ⟦q_F⟧ for all i ∈ N. From Lemma 3, we know that all the c_i belong to E and that from each c_i the target is reached within N steps with probability at least β^N, i.e., P(M_S, c_i, σ, ♦q_F) ≥ β^N. This allows us to deduce that the probability of never visiting q_F from c_0 following σ is at most lim_{k→∞} (1 − β^N)^k, and since β > 0 this limit is 0. Consequently P(M_S, c_0, σ, ♦q_F) = 1 and c_0 ∈ V^1_AS.
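The probability bound used above can be illustrated numerically: if from every configuration of E the target is reached within N steps with probability at least β^N, then avoiding it for k·N steps has probability at most (1 − β^N)^k, which vanishes as k grows. A small sketch (names ours):

```python
# Illustrative bound on the probability of avoiding the target for k rounds
# of N steps, given a per-round reaching probability of at least beta**N.

def avoid_bound(beta, N, k):
    return (1.0 - beta ** N) ** k
```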
We now prove the opposite direction. For this we use a technique similar to the ones presented in the proof of Lemma 5.29 in [5].
Proof. Let c_0 ∈ V^1_AS. So there exists a strategy σ for Player 1 such that P(M_S, c_0, σ, ♦q_F) = 1. Let D be the following set of configurations: {c ∈ C | ∃c_0 · c_1 ⋯ ∈ Plays(M_S, c_0, σ) s.t. ∃i ∈ N for which c_i = c and ∀0 ≤ j < i, c_j ∉ ⟦q_F⟧}. Clearly c_0 belongs to D. We will show that D ⊆ E.
We consider the two functions g, h : 2^C → 2^C and the sequence of sets (F^D_i)_{i∈N} defined as in the proof of Lemma 3. Let c ∈ D. Since P(M_S, c_0, σ, ♦q_F) = 1, we know that there exists a strategy σ′ such that P(M_S, c, σ′, ♦q_F) = 1 (otherwise we would have P(M_S, c_0, σ, ♦q_F) < 1). Hence there is a play c · c′_1 · c′_2 ⋯ ∈ Plays(M_S, c, σ′) for which there exists i ∈ N satisfying c′_i ∈ ⟦q_F⟧ and c′_j ∉ ⟦q_F⟧ for all 0 ≤ j < i. Note also that, by definition of D, for all 0 ≤ j ≤ i, c′_j belongs to D, and if c′_j ∈ C_P then all its successors belong to D as well. From this we obtain D ⊆ g(D). We know that E is the greatest fixpoint of g; by the Knaster-Tarski Theorem, E = ⋃{C′ ⊆ C | C′ ⊆ g(C′)}, and hence D ⊆ E. Since c_0 ∈ D, we deduce that c_0 ∈ E.
We first prove that W 1 AS is included in F .
Proof. Let c_0 ∈ W^1_AS. So there exists a strategy σ for Player 1 such that P(M_S, c_0, σ, □♦q_F) = 1. Let T be the following set of configurations: {c ∈ C | ∃c_0 · c_1 ⋯ ∈ Plays(M_S, c_0, σ) s.t. ∃i ∈ N for which c_i = c}. Necessarily, we have T ∩ ⟦q_F⟧ ≠ ∅ and c_0 ∈ T. We consider the function g : 2^C → 2^C such that for each set of configurations C′ ⊆ C, we have g(C′) = ⟦InvPre(X, µY.(q_F ∨ InvPre(X, Y)))⟧ε0[X:=C′]. We will prove that T ⊆ g(T).
Let c ∈ T. There necessarily exists a strategy σ′ for Player 1 such that P(M_S, c, σ′, ♦q_F) > 0, otherwise we would have P(M_S, c_0, σ, □♦q_F) < 1. Hence there exists a play in M_S respecting σ′ of the form c · c′_1 ⋯ c′_k with c′_k ∈ ⟦q_F⟧. Using a similar reasoning to that done in Lemma 13, we deduce that all the configurations of this play belong to ⟦µY.(q_F ∨ InvPre(X, Y))⟧ε0[X:=T]. Furthermore, if c ∈ C_P, then for all configurations c′′ ∈ C such that c → c′′ we also have c′′ ∈ T; hence c ∈ g(T) (using the definition of InvPre(X, Y)). This implies T ⊆ g(T).
Since F is the greatest fixpoint of g, by the Knaster-Tarski Theorem we deduce that T ⊆ F, and hence c_0 ∈ F. We now prove the left to right inclusion.

Lemma. F ⊆ W^1_AS.
Proof. We consider the function h : 2^C → 2^C which associates to each set of configurations C′ ⊆ C the set h(C′) = ⟦q_F ∨ InvPre(X, Y)⟧ε0[X:=F, Y:=C′], and we define a sequence of sets (F_i)_{i∈N} included in C as in the proof of Lemma 3. As for the proof of Lemma 3, we know that there exists N ∈ N such that F_i = F_{i+1} for all i ≥ N and that ⟦µY. q_F ∨ InvPre(X, Y)⟧ε0[X:=F] = F_N. Since F = ⟦νX.InvPre(X, µY.(q_F ∨ InvPre(X, Y)))⟧ε0, using again fixpoint theory, we know that F = ⟦InvPre(X, µY.(q_F ∨ InvPre(X, Y)))⟧ε0[X:=F] and consequently F = ⟦InvPre(X, Y)⟧ε0[X:=F, Y:=F_N].
We now define a strategy σ for Player 1 which will be memoryless on F (i.e., the strategy will only depend on the current configuration). The strategy σ will be described as a function σ : F ∩ C_1 → C (note that since with this strategy all the plays starting from F will stay in F, we do not need to define it precisely on the entire set C_1; we assume that on the set C_1 \ F, the strategy can choose any one of the possible successor states). Let c ∈ C_1 ∩ F and consider the following two cases. If c ∈ F \ F_N or c ∈ F_1, we define σ(c) as a configuration c′ ∈ F_N ∩ F such that c → c′; by definition of F such a configuration necessarily exists. If c ∈ F_N \ F_1, we denote by j ∈ [2..N] the smallest index such that c ∈ F_j and c ∉ F_{j−1}, and we define σ(c) as a configuration c′ such that c → c′ and c′ ∈ F_{j−1} ∩ F (by definition of F and of the sequence (F_i)_{i∈N}, such a c′ necessarily exists).
By construction of the strategy σ and using a similar reasoning to the one performed in the proof of Lemma 12, we can prove that for all c ∈ F, we have P(M_S, c, σ, ♦q_F) = 1. We would now like to prove that for each c ∈ F, P(M_S, c, σ, □♦q_F) = 1.
We denote by □¬q_F the set of infinite plays c_0 · c_1 ⋯ of M_S such that c_i ∉ ⟦q_F⟧ for all i ∈ N. Then for each c ∈ F, we have P(M_S, c, σ, □¬q_F) = 0. We also denote by ♦□¬q_F the set of infinite plays c_0 · c_1 ⋯ of M_S for which there exists i ∈ N such that c_j ∉ ⟦q_F⟧ for all j ≥ i. Then for each c ∈ F, we have P(M_S, c, σ, □♦q_F) = 1 − P(M_S, c, σ, ♦□¬q_F). We will now prove that for each c ∈ F, P(M_S, c, σ, ♦□¬q_F) = 0.
For c ∈ C, i ∈ N and d ∈ C \ ⟦q_F⟧, let Π_{c−i−d} be the set of finite plays of M_S of the form c_0 · c_1 ⋯ c_k starting at c, ending at the configuration d, and passing exactly i times through configurations in ⟦q_F⟧. For c ∈ F, i ∈ N and d ∈ F \ ⟦q_F⟧, we define Δ_{c−i−d} as the set of infinite plays of the form ρ · ρ′, where ρ is a finite play in Π_{c−i−d} and ρ′ is an infinite play that never visits ⟦q_F⟧, and we let Δ_{c−i} be the union of the sets Δ_{c−i−d} over all d ∈ F \ ⟦q_F⟧. Intuitively, Δ_{c−i} is the set of infinite plays starting from c which visit ⟦q_F⟧ exactly i times. Hence, for all c ∈ F and i ∈ N, we have P(M_S, c, σ, Δ_{c−i}) = 0: all the configurations reached from c following σ belong to F (by definition of σ and F), the plays in Δ_{c−i} decompose into a finite prefix in some Π_{c−i−d} followed by an infinite play avoiding ⟦q_F⟧, and for all d ∈ F we have P(M_S, d, σ, □¬q_F) = 0. Finally, for all c ∈ F, we have Plays(M_S, c, σ) ∩ ♦□¬q_F ⊆ ⋃_{i∈N} Δ_{c−i}: if an infinite play belongs to Plays(M_S, c, σ) ∩ ♦□¬q_F, then it passes only a finite number of times through ⟦q_F⟧. From this inclusion and the previous equality, we deduce that P(M_S, c, σ, ♦□¬q_F) ≤ Σ_{i∈N} P(M_S, c, σ, Δ_{c−i}) = 0. Hence, for all c ∈ F, we have P(M_S, c, σ, □♦q_F) = 1, which allows us to conclude that F ⊆ W^1_AS. ⊓⊔

B.7 Proof of Lemma 6
Proof. Algorithm 1 explores an unfolding of the computation tree of S, which is finitely branching since |T| is finite. The number of counters is fixed, and therefore, by Dickson's Lemma, (N^d, ≤) is a well quasi-ordering. Therefore, on every branch we eventually satisfy either the condition of line 19 or that of line 7. In the former case, a loop in the derived system S′ is created, and the exploration of the current branch stops. In the latter case, a finitary description of a new colored (possibly infinite-state) subsystem is added to S′ by adding finitely many states, transitions and configurations to Q′, T′ and X′, respectively. Also in this case, the exploration of the current branch stops. Since the exploration is finitely branching, and every branch eventually stops, the algorithm terminates. ⊓⊔

B.8 Proof of Lemma 7

Proof. Let us assume that P^+(M_S, c_0, ♦X) = 1. Therefore, there exists a family of strategies that make the probability of reaching X arbitrarily close to 1.
The strategy σ ′ ǫ will use the same moves on M S ′ as σ ǫ on M S , which is possible due to the way how M S ′ is constructed from M S by Algorithm 1. By construction, for every reachable configuration in M S there is a corresponding configuration in M S ′ , and this correspondence can be maintained stepwise in the moves of the game.
For the initial uncolored part of M S ′ , this is immediate, since S ′ is derived from the unfolding of the game tree of S. The correspondence is expressed by the function λ. Each current state of M S ′ is labeled by the corresponding current configuration of M S .
In the colored subsystems, the corresponding configuration in system M S ′ is a projection of a configuration in M S . For any transition t ∈ T that is controlled by player 1 from a configuration in M S , there exists a transition t ′ ∈ T ′ that belongs to player 1 in the corresponding configuration in M S ′ , such that this transition leads to the corresponding state. This is achieved by the projection and the fact that the 1-VASS-MDP game is monotone w.r.t. player 1, i.e., larger configurations always benefit the player (by allowing the same moves or even additional moves).
We now show a property on how probabilistic transitions in M S and M S ′ correspond to each other: For every probabilistic transition t ∈ T from a configuration in M S , there exists a probabilistic transition t ′ ∈ T ′ in the corresponding configuration in M S ′ , and vice-versa, such that these transitions have the same probability. In particular, a configuration in M S ′ does not allow any additional probabilistic transitions compared to its corresponding configuration in M S (though it may allow additional transitions controlled by player 1).
The first part of this statement follows from the monotonicity of the projection function and the monotonicity of the transitions w.r.t. the size of the configurations. For the second part we need to show that for every probabilistic transition t′ = ⟨col_i(x), proj_k(op), col_i(y)⟩ ∈ T′ from a configuration in M_{S′}, there exists a probabilistic transition t = ⟨x, 0, y⟩ ∈ T_P in the corresponding configuration in S, such that the probabilities of these transitions are equal. This latter fact holds only because we are considering 1-VASS-MDPs, where only Player 1 can change the counters, whereas the probabilistic transitions can only change the control-states. I.e., the 'larger' projected configurations in M_{S′} do not enable additional probabilistic transitions, since in 1-VASS-MDPs these depend only on the control-state.
Proof. We use the assumed family of strategies on M S ′ that witnesses the property P + (M S ′ , c ′ 0 , ♦X ′ ) = 1 to synthesize a family of strategies on M S that witnesses P + (M S , c 0 , ♦X ) = 1.
First we establish some basic properties of the system S ′ . It is a 1-VASS-MDP of dimension d−1 with initial configuration c ′ 0 , and consists of several parts. The initial uncolored part induces a finite-state MDP. Moreover, S ′ contains finitely many subsystems of distinct colors, where each subsystem is a 1-VASS-MDP of dimension d − 1 obtained from S by projecting out one component of the integer vector. For color i, let k(i) be the projected component of the vector (see line 9 of the algorithm). Each colored subsystem of dimension d − 1 induces an MDP that may be infinite-state (unless d = 1, in which case it is finite-state).
Note that colored subsystems are not reachable from each other, i.e., a color, once reached, is preserved. Each colored subsystem has its own initial configuration (created in lines 11-12 of Alg. 1). Let m be the number of colors in S ′ and r i the initial configuration of the subsystem of color i (where 0 ≤ i ≤ m − 1).
Let us now consider only those colored subsystems in which the target set X′ can be reached limit-surely, i.e., let J = {i | 0 ≤ i ≤ m−1 and P^+(M_{S′}, r_i, ♦X′) = 1} be the set of good colors, and let R = {r_j | j ∈ J} and R̄ = {r_j | j ∉ J}. Further, let X′_f be the restriction of X′ to the finite uncolored part of S′ (i.e., only those parts added in line 25 of Alg. 1).
We now establish the existence of certain strategies in subsystems of S ′ . These will later serve as building blocks for our strategies on M S .
Since we assumed that P^+(M_{S′}, c′_0, ♦X′) = 1, there exists a family of strategies that makes the probability of reaching X′ arbitrarily close to one. In particular, these strategies must also make the probability of reaching configurations in R̄ arbitrarily close to zero. Thus we obtain P^+(M_{S′}, c′_0, ♦(X′_f ∪ R)) = 1, i.e., we can limit-surely reach X′_f ∪ R. Since, for this objective, only the finite uncolored part of M_{S′} is relevant, this is a problem for a finite-state MDP, where limit-surely and almost-surely coincide. So there exists a partial strategy σ, for the uncolored part of M_{S′}, such that, starting in c′_0, the set X′_f ∪ R is reached almost surely.

B.9 Proof of Theorem 9

Proof. Let S = ⟨Q, Q_1, Q_P, T, τ⟩ be a 1-VASS-MDP of dimension d > 0 with initial configuration c_0 = ⟨q_0, v⟩ and X ⊆ Q a set of target states. We show decidability of P^+(M_S, c_0, ♦X) = 1 by induction on d.
Base case d = 0. If S has 0 counters, then M_S is a finite-state MDP and thus limit-sure reachability coincides with almost-sure reachability, which is decidable.
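The base case relies on the standard polynomial-time computation of the almost-sure reachability set in a finite MDP, which can be sketched as iterated removal of states from which the target can no longer be guaranteed. This is an illustrative implementation (names are ours) assuming target states have been made absorbing:

```python
# Illustrative sketch: almost-sure reachability winning set of a finite MDP,
# assuming target states are absorbing sinks.

def almost_sure_reach(states, player1, succ, target):
    """Largest W s.t. the target is reachable inside W and no probabilistic
    state of W has an edge leaving W."""
    W = set(states)
    while True:
        # Probabilistic states with an edge leaving W must be discarded;
        # Player 1 simply never uses edges leaving W.
        W2 = {s for s in W if s in player1 or succ[s] <= W}
        # Keep only states that can reach the target while staying in W2.
        can = set(target) & W2
        changed = True
        while changed:
            new = {s for s in W2 - can if succ[s] & can}
            changed = bool(new)
            can |= new
        if can == W:
            return can
        W = can
```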