DIGGING INPUT-DRIVEN PUSHDOWN AUTOMATA

Input-driven pushdown automata (IDPDA) are pushdown automata where the next action on the pushdown store (push, pop, nothing) is solely governed by the input symbol. Nowadays such devices are usually defined so that popping from the empty pushdown does not block the computation but continues it with an empty pushdown. Here, we consider IDPDAs that have a more balanced behavior concerning pushing and popping. Digging input-driven pushdown automata (DIDPDA) are basically IDPDAs that, when forced to pop from the empty pushdown, dig a hole of the shape of the popped symbol in the bottom of the pushdown. Popping further symbols from a pushdown having a hole at the bottom deepens the current hole further. The hole can only be filled up by pushing the symbols previously popped. We study the impact of the new behavior of DIDPDAs on their power and compare their capacities with the capacities of ordinary IDPDAs and of tinput-driven pushdown automata, which are basically IDPDAs whose input may be preprocessed by a length-preserving finite state transducer. It turns out that the capabilities are incomparable. We address the determinization of DIDPDAs and their descriptional complexity, closure properties, and decidability questions.

Mathematics Subject Classification. 68Q45, 68Q15.

Received December 6, 2019. Accepted March 25, 2021.


Introduction
In connection with an upper bound for the space needed for the recognition of deterministic context-free languages, a subclass of pushdown automata has been introduced in [20]. These so-called input-driven pushdown automata (IDPDA) work in real time and, more importantly, in an input-driven way. That is, no moves on empty input are allowed, and the actions on the pushdown store are dictated by the input symbols. To this end, the input alphabet is partitioned into three subsets, where one subset contains symbols on which the automaton pushes a symbol onto the pushdown store, one subset contains symbols on which the automaton pops a symbol, and one subset contains symbols on which the automaton leaves the pushdown unchanged and makes a state change only. The results in [20] and the follow-up papers [7,24] comprise the equivalence of nondeterministic and deterministic models and the proof that the membership problem is solvable in logarithmic space.
The investigation of input-driven pushdown automata has been revisited in [1,2], where such devices are called visibly pushdown automata or nested word automata. Some of the results are descriptional complexity aspects for the determinization as well as closure properties and decidability questions which turned out to be similar to those of finite automata. Further aspects such as the minimization of IDPDAs and a comparison with other subfamilies of deterministic context-free languages have been studied in [5,6]. A recent survey with many valuable references on complexity aspects of input-driven pushdown automata may be found in [22].
The properties and features of IDPDAs revived the research on input-driven pushdown languages and triggered the study of further input-driven automata types, such as input-driven variants of, for example, multi-pushdown automata or more general auxiliary storages [14,18], (ordered) multi-pushdown automata [4], scope-bounded multi-pushdown automata [16], stack automata [3], queue automata [12], etc. In particular, input-driven ordered pushdown automata obey the limitation that a pushdown can be popped only if all the lower indexed pushdowns are empty. Nevertheless, even the variant with two pushdowns accepts non-context-free languages as crossing dependencies and cannot be determinized [15].
However, the questions asked in connection with IDPDAs have widened since the early papers from the eighties. Additionally, the definition of input-driven (visibly) pushdown automata has changed at a certain point. In the early papers, IDPDAs are defined as ordinary real-time pushdown automata whose behavior on the pushdown is solely driven by the input symbols. The problem that an IDPDA could be forced to pop from the empty pushdown has not been addressed. So, following common sense, one could say that in such a situation the computation blocks. Later [1], popping from the empty pushdown was simply allowed by defining that popping from the empty pushdown results in an empty pushdown and the computation may continue. While it seems natural to overcome the problem by defining some action in such situations and to continue the computation, it is somehow unbalanced and, thus, artificially simple to say that the pushdown remains empty. In this way, in fact, for every input symbol that requires a push operation some symbol is pushed, but not for every input symbol that requires a pop operation some symbol is popped. So, an interesting question is to what extent this definition gives additional power to the devices or whether it restricts their capacity.
Here, we consider input-driven pushdown automata that behave differently when they have to pop from the empty pushdown. In order to implement a more balanced behavior, one can imagine that popping a specific symbol from the empty pushdown digs a hole in the bottom of the pushdown, where the hole has the shape of the symbol. This means that the hole can only be filled up by pushing that symbol again. Popping further symbols from a pushdown having a hole at the bottom deepens the current hole further. As usual, for a transition the digging input-driven pushdown automaton looks at the pushdown from above. If the pushdown has a hole, the automaton 'sees' the bottom of the hole, that is, the (shape of the) symbol which was popped last. These devices are called digging input-driven pushdown automata (DIDPDA) and are studied in the sequel. In particular, the basic and formal definition is given and clarified by an introductory example in the next section. In Section 3, the computational capacity of DIDPDAs is studied. Of particular interest is the impact of the new behavior of DIDPDAs on their power. Among others, the capacities are compared with the capacities of ordinary IDPDAs and of so-called tinput-driven pushdown automata, which are basically IDPDAs whose input may be preprocessed by length-preserving finite state transducers [13]. It turns out that the family of languages accepted by DIDPDAs is incomparable with the families accepted by the other devices. So, digging may give power to the machines and, on the other hand, allowing to pop from the empty pushdown without getting stuck can be utilized to perform computations that are impossible for machines that actually have to perform the action required by the input symbol. Section 4 considers the determinization of DIDPDAs and its descriptional complexity, where upper and lower bounds on the size are given that match in the order of magnitude of the second exponent.
Closure properties and decidability questions for digging pushdown automata are studied in Section 5. We distinguish here in particular the important special case that all automata involved share the same partition of the input alphabet [1] from the general one. Finally, in Section 6 we consider the family of all languages that are either accepted by DIDPDAs or by IDPDAs and we study again the closure properties and decidability questions for this new family. We would like to note that a preliminary version of this work was presented at the 11th Workshop on Non-Classical Models of Automata and Applications (NCMA 2019), Valencia, Spain, July 2-3, 2019, and it is published in [11].

Preliminaries
We denote the non-negative integers {0, 1, 2, . . .} by N. Let Σ* denote the set of all words over the finite alphabet Σ. The empty word is denoted by λ, and Σ⁺ = Σ* \ {λ}. The set of words of length at most n ≥ 0 is denoted by Σ^{≤n}. The reversal of a word w is denoted by w^R. For the length of w we write |w|. For the number of occurrences of a symbol a in w, the notation |w|_a is used. We use ⊆ for inclusions and ⊂ for strict inclusions. We write 2^S for the power set and |S| for the cardinality of a set S. We say that two language families L_1 and L_2 are incomparable if L_1 is not a subset of L_2 and vice versa.
A classical deterministic pushdown automaton (DPDA) is called real-time if no moves on empty input are allowed. A DPDA is called input-driven (IDPDA) if the next input symbol defines the next action on the pushdown store, that is, pushing a symbol onto the pushdown store, popping a symbol from the pushdown store, or changing the state without modifying the pushdown store. To this end, the input alphabet Σ is partitioned into the sets Σ_D, Σ_R, and Σ_N, that control the actions push (D), pop (R), and state change only (N). However, if the next input symbol forces the IDPDA to pop a symbol from the empty pushdown, the computation does not get stuck but continues with an empty pushdown. At this point, a digging input-driven pushdown automaton shows the different behavior mentioned above. In order to implement this behavior, a copy of the pushdown alphabet is used to represent the symbols below the surface. Moreover, when the transition function pops from the pushdown, it explicitly gives the symbol to be popped. Formally, a digging input-driven pushdown automaton (DIDPDA) is a system M = ⟨Q, Σ, Γ, q_0, F, ⊥, δ_D, δ_R, δ_N⟩, where
1. Q is the finite set of internal states,
2. Σ is the finite set of input symbols partitioned into the disjoint sets Σ_D, Σ_R, and Σ_N,
3. Γ is the finite set of pushdown symbols, where Γ̄ = { X̄ | X ∈ Γ } is a copy of Γ,
4. q_0 ∈ Q is the initial state,
5. F ⊆ Q is the set of accepting states,
6. ⊥ ∉ Γ is the empty pushdown symbol,
7. δ_D : Q × Σ_D × (Γ ∪ {⊥} ∪ Γ̄) → Q × { push(h) | h ∈ Γ } is a partial transition function, where there are no transitions δ_D(p, a, ḡ) = (q, push(h)) with h ≠ g,
8. δ_R : Q × Σ_R × (Γ ∪ {⊥} ∪ Γ̄) → Q × { pop(h) | h ∈ Γ } is a partial transition function, where there are no transitions δ_R(p, a, g) = (q, pop(h)) with h ≠ g,
9. δ_N : Q × Σ_N × (Γ ∪ {⊥} ∪ Γ̄) → Q is a partial transition function.
A configuration of a DIDPDA M = ⟨Q, Σ, Γ, q_0, F, ⊥, δ_D, δ_R, δ_N⟩ is a triple (q, w, s), where q ∈ Q is the current state, w ∈ Σ* is the unread part of the input, and s ∈ Γ* ∪ Γ̄* denotes the current pushdown content. If s ∈ Γ*, we have an ordinary pushdown content, where the leftmost symbol of s is at the top of the pushdown store. If s ∈ Γ̄*, we have a pushdown content below the surface, where the rightmost symbol of s, at the bottom of the pushdown store, is seen by M.
The initial configuration for an input string w is set to (q_0, w, λ). During the course of its computation, M runs through a sequence of configurations. One step from a configuration to its successor configuration is denoted by ⊢.
Let a ∈ Σ, w ∈ Σ*, and s ∈ Γ* ∪ Γ̄*. We set
1. (q, aw, zs) ⊢ (q', w, hzs), if a ∈ Σ_D, z, h ∈ Γ, s ∈ Γ*, and δ_D(q, a, z) = (q', push(h)),
2. (q, aw, λ) ⊢ (q', w, h), if a ∈ Σ_D, h ∈ Γ, and δ_D(q, a, ⊥) = (q', push(h)),
3. (q, aw, s h̄) ⊢ (q', w, s), if a ∈ Σ_D, h ∈ Γ, s ∈ Γ̄*, and δ_D(q, a, h̄) = (q', push(h)),
4. (q, aw, hs) ⊢ (q', w, s), if a ∈ Σ_R, h ∈ Γ, s ∈ Γ*, and δ_R(q, a, h) = (q', pop(h)),
5. (q, aw, λ) ⊢ (q', w, h̄), if a ∈ Σ_R, h ∈ Γ, and δ_R(q, a, ⊥) = (q', pop(h)),
6. (q, aw, s z̄) ⊢ (q', w, s z̄ h̄), if a ∈ Σ_R, z, h ∈ Γ, s ∈ Γ̄*, and δ_R(q, a, z̄) = (q', pop(h)),
7. (q, aw, s) ⊢ (q', w, s), if a ∈ Σ_N and δ_N(q, a, x) = q', where x ∈ Γ ∪ {⊥} ∪ Γ̄ is the symbol currently seen by M.

Figure 1. A deterministic digging input-driven pushdown automaton accepting the language { w ∈ {a, b}* | |w|_a = |w|_b }. Edges corresponding to transitions δ(p, a, g) = (q, op(h)) are labeled a, g on the first line and op(h) on the second line of the label. Symbol Z may be X or Y.
So, whenever the pushdown store is empty, the successor configuration is computed by the transition functions with the special empty pushdown symbol ⊥. Note that s ∈ Γ* ∪ Γ̄*; that is, if one symbol of s belongs to Γ then all symbols of s belong to Γ, and similarly for Γ̄. As usual, we define the reflexive and transitive closure of ⊢ by ⊢*. The language accepted by the DIDPDA M is the set L(M) of words for which there exists some computation beginning in the initial configuration and ending in a configuration in which the whole input is read and an accepting state is entered. Formally, L(M) = { w ∈ Σ* | (q_0, w, λ) ⊢* (q, λ, s) with q ∈ F and s ∈ Γ* ∪ Γ̄* }. In general, the family of all languages accepted by an automaton of some type X will be denoted by L(X). In order to clarify this notion, we continue with an example.
Example 2.2. The DIDPDA M depicted in Figure 1 accepts the language L = { w ∈ {a, b}* | |w|_a = |w|_b }.
The basic idea of the construction is as follows. State q_0 is entered whenever the pushdown gets empty. In order to detect that this will happen after the next step, the first symbol pushed onto the pushdown is X, while further symbols on top of X are Y's (Transitions (1), (2), (3)). Similarly, if the pushdown content falls below the surface, the first symbol popped is an X, while further symbols below X are Y's (Transitions (6), (9), (10)). In this way, after popping an X from above the surface or pushing an X under the surface, the pushdown gets empty and M enters state q_0 (Transitions (4), (7)). Since there is a push operation for any input symbol a and a pop operation for any input symbol b, the input string belongs to L if and only if after its processing the pushdown is empty, that is, M enters state q_0.
We note that the language of the previous example can be accepted by finite state models with unconventional reading order as well. For example, it can be accepted by finite automata with translucent letters [21] or by jumping finite automata [19].
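The digging pushdown discipline defined above can be sketched operationally as follows; this is a minimal illustration with its own encoding (mode flag plus Python list), not the paper's notation. The pushdown is a pair (mode, symbols): mode "above" holds ordinary content with the top as the last list entry, mode "below" holds the hole, with the deepest dug symbol as the last entry.

```python
def seen(mode, symbols):
    """Symbol the automaton sees: top symbol, bottom of the hole, or ⊥."""
    if not symbols:
        return "⊥"
    return symbols[-1] if mode == "above" else ("bar", symbols[-1])

def push(mode, symbols, z):
    if mode == "above":
        return "above", symbols + [z]
    # Below the surface, a hole can only be filled by the symbol of its shape.
    assert symbols[-1] == z, "hole must be filled by the previously dug symbol"
    rest = symbols[:-1]
    return ("below", rest) if rest else ("above", [])

def pop(mode, symbols, h):
    if mode == "above" and symbols:
        assert symbols[-1] == h, "a DIDPDA pops the symbol that is on top"
        return "above", symbols[:-1]
    # Popping from the empty pushdown digs (or deepens) a hole of shape h.
    return "below", ([] if mode == "above" else symbols) + [h]

# Dig a, then b, then refill in reverse order: the pushdown ends up empty.
s = pop(*pop("above", [], "a"), "b")
print(seen(*s))                     # → ('bar', 'b')
s = push(*push(*s, "b"), "a")
print(s)                            # → ('above', [])
```

Note how pushing below the surface may only restore the symbol that was dug, mirroring the restriction on δ_D in the formal definition.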

Computational capacity
Any language accepted by some deterministic or nondeterministic IDPDA is accepted by a real-time deterministic pushdown automaton as well. On the other hand, for example, the language { a^n $ a^n | n ≥ 0 } is not accepted by any IDPDA. This immediately implies that the language family accepted by IDPDAs is a proper subset of the real-time deterministic context-free languages. Here, we start to explore the relationships between DIDPDAs and other types of devices by presenting a construction that effectively converts any given DIDPDA into an equivalent real-time deterministic pushdown automaton.

Theorem 3.1.
The family L(DIDPDA) is included in the family of languages accepted by real-time deterministic pushdown automata.

Proof. Let M = ⟨Q, Σ, Γ, q_0, F, ⊥, δ_D, δ_R, δ_N⟩ be an arbitrary DIDPDA. We will construct an equivalent real-time DPDA M' = ⟨Q', Σ, Γ, q'_0, F', ⊥, δ⟩ as follows.
Basically, M' simulates M, where it remembers in its states whether the pushdown of M is above or below the surface. Since M' can freely push or pop symbols independently of the current input symbol, a pushdown below the surface is simulated by pushing the dug symbols onto the pushdown instead of popping them from the empty pushdown. To implement this behavior, we define Q' = Q × {↑, ↓}, q'_0 = (q_0, ↑), and F' = F × {↑, ↓}.
By construction, a word w is accepted by M' if and only if w is accepted by M. Inspecting δ shows that M' is indeed a deterministic pushdown automaton working in real time.
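The proof idea, an ordinary stack plus a direction flag replacing the digging pushdown, can be sketched as follows. This is an illustration of the construction only: the transition functions are abstracted into a list of push/pop actions, and below the surface (flag "down") the roles of push and pop are swapped, so the ordinary stack stores the hole that the DIDPDA has dug.

```python
def simulate(ops):
    """ops: list of ('push', z) / ('pop', z) actions of a DIDPDA run."""
    stack, flag = [], "up"
    for op, z in ops:
        if flag == "up":
            if op == "push":
                stack.append(z)
            elif stack:
                stack.pop()                  # ordinary pop from the top
            else:
                flag = "down"                # pop from empty: dig below
                stack.append(z)
        else:                                # stack now stores the dug hole
            if op == "pop":
                stack.append(z)              # deepening the hole = a push
            else:
                assert stack[-1] == z, "hole can only be filled by its shape"
                stack.pop()                  # filling the hole = a pop
                if not stack:
                    flag = "up"              # hole completely filled
    return flag, stack

print(simulate([("pop", "X"), ("pop", "Y"), ("push", "Y"), ("push", "X")]))
# → ('up', [])
```

The flag corresponds exactly to the second state component {↑, ↓} of the construction.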
Since DIDPDAs also either always have to push or always have to pop when reading the a's, the language { a^n $ a^n | n ≥ 0 } from above is clearly not accepted by any DIDPDA. So, the inclusion of Theorem 3.1 is strict. In [13], input-driven automata are extended in such a way that the input is preprocessed by a deterministic injective and length-preserving finite state transducer. The corresponding devices are called tinput-driven pushdown automata (TDPDA). In this way, for example, the language { a^n $ a^n | n ≥ 0 } is accepted by such a device. In general, it turned out that TDPDAs are strictly more powerful than IDPDAs but still not as powerful as real-time deterministic pushdown automata. It is shown that, for example, the language { a^n b^{n+m} a^m | n, m ≥ 1 } is not accepted by any TDPDA. In order to compare the computational capacities of IDPDAs and TDPDAs with those of DIDPDAs, we next show that this language is accepted by some deterministic digging input-driven pushdown automaton.

Lemma 3.3.
There is a language accepted by some DIDPDA that cannot be accepted by any TDPDA and, thus, cannot be accepted by any IDPDA.
Proof. We use L = { a^n b^{n+m} a^m | n, m ≥ 1 } as a witness language. It is known [13] that L does not belong to L(TDPDA). Since L(IDPDA) ⊂ L(TDPDA), L cannot be accepted by any IDPDA.
On the other hand, by Example 2.2, there is some DIDPDA accepting the language L' = { w ∈ {a, b}* | |w|_a = |w|_b }. By simulating a deterministic finite automaton in parallel to the actual computation, it can immediately be seen that the family L(DIDPDA) is closed under intersection with regular languages. Therefore, L = L' ∩ a⁺b⁺a⁺ belongs to L(DIDPDA) as well.
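The parallel simulation of a finite automaton argued here can be sketched as follows. The wrapper `intersect_with_regular`, the abstraction of the DIDPDA part as a boolean function, and the toy DFA for a⁺b⁺a⁺ are all assumptions of this illustration, not the paper's constructions.

```python
def intersect_with_regular(run_didpda, dfa_delta, dfa_start, dfa_final):
    """run_didpda(w) -> bool abstracts the DIDPDA; the returned checker
    additionally tracks a DFA state, mimicking the product in the state set."""
    def run(w):
        q = dfa_start
        for c in w:
            q = dfa_delta[(q, c)]           # DFA simulated in parallel
        return run_didpda(w) and q in dfa_final
    return run

# Toy instance: DIDPDA part = balanced a's and b's (as in Example 2.2),
# DFA part = membership in a+b+a+.
balanced = lambda w: w.count("a") == w.count("b")
delta = {("0", "a"): "1", ("1", "a"): "1", ("1", "b"): "2", ("2", "b"): "2",
         ("2", "a"): "3", ("3", "a"): "3", ("3", "b"): "4",
         ("4", "a"): "4", ("4", "b"): "4"}
check = intersect_with_regular(balanced, delta, "0", {"3"})
print(check("abba"))    # → True  (in a+b+a+ and balanced)
print(check("ab"))      # → False (balanced, but not in a+b+a+)
```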
So, compared with IDPDAs that may arbitrarily pop from the empty pushdown, digging may give power to the machines. On the other hand, allowing to pop from the empty pushdown without getting stuck can be utilized to perform computations that are impossible for machines that actually have to perform the action required by the input symbol.

Lemma 3.4.
There is a language accepted by some IDPDA and, thus, by some TDPDA, that cannot be accepted by any DIDPDA.
Proof. We use the language L = { a^k b^ℓ a^m b^n | 1 ≤ k < ℓ and 1 ≤ m < n } as a witness language.
An IDPDA accepting the language L pushes upon reading a's and pops upon reading b's. Then, it has to check whether the input format is correct, and whether it pops at least once from the empty pushdown while reading each of the sequences of b's. So, L belongs to the family L(IDPDA). Since L(IDPDA) ⊂ L(TDPDA), L is accepted by some TDPDA as well.

Now, we assume that some DIDPDA M = ⟨Q, {a, b}, Γ, q_0, F, ⊥, δ_D, δ_R, δ_N⟩ accepts the language L. If at least one of Σ_D and Σ_R is empty, M accepts a regular language. Since L is non-regular, we have to consider the two cases with Σ_N = ∅ and either Σ_D = {a}, Σ_R = {b} or Σ_D = {b}, Σ_R = {a}.

So, let Σ_D = {a}, Σ_R = {b}, and Σ_N = ∅. First we choose some arbitrary k ≥ 1. Let (q_0, a^k b^k, λ) ⊢^{2k} (q_1, λ, λ) be the computation of M on input a^k b^k. Continuing the computation on a factor b^i with i > |Q| · |Γ| large enough drives M into a cycle of length c ≥ 1, possibly after an initial part of length 0 ≤ c_0 ≤ |Q| · |Γ|. Next, we set j_1 = (c + 1) · |Q|, ℓ = k + c_0 + j_1 · c, m = c · |Q|, and consider the accepting computation of M on input a^k b^ℓ a^m b^{m+1}. Since, for the continuation of the computation on the following factor a^m, there are at most c · |Q| different possibilities for being in a state and seeing a pushdown symbol, and m = c · |Q|, such a situation appears at least twice while processing a^m, say with state q_3 and pushdown symbol z̄_r, for some 1 ≤ r ≤ c. In general, in this phase M runs again through a cycle, say of length ĉ ≥ 1, possibly after an initial part of length ĉ_0, where 0 ≤ ĉ_0 < ĉ_0 + ĉ ≤ c · |Q| = m. Now, we consider the computation of M on input a^k b^ℓ a^{m+c·ĉ} b^{m+1}. Since ĉ ≤ c · |Q|, the computation ends accepting as well. So, the input a^k b^ℓ a^{m+c·ĉ} b^{m+1} is accepted, but it does not belong to L, since m + c·ĉ ≥ m + 1, a contradiction.

Finally, the case Σ_D = {b}, Σ_R = {a}, and Σ_N = ∅ has to be considered.
The contradiction in this case follows almost literally as in the first case, with the roles played by pushing and popping and by symbols from Γ̄ and Γ interchanged.
By the previous lemmas we obtain that the family L(DIDPDA) is incomparable with each of the families L(IDPDA) and L(TDPDA).

Determinization
It is well known that nondeterministic IDPDAs can be determinized. Okhotin and Salomaa [22] traced this result back to [24]. They give a clear proof showing that 2^{n²} states are sufficient to simulate an n-state nondeterministic IDPDA by a deterministic one. However, in [22] and [1] IDPDAs are considered in a certain normal form. That is, neither the push nor the state-change-only operations depend on the topmost pushdown symbol. Since any deterministic pushdown automaton and any deterministic input-driven pushdown automaton can be converted to this normal form, the general computational capacity does not change. But the conversion affects the number of states, since the topmost pushdown symbol has to be remembered in the states. Thus, when we compare such automata with the original definition of IDPDAs in [24], which is based on the usual definition of pushdown automata and does not require a normal form, the state complexity bound of 2^{n²} achieved for the determinization of IDPDAs in normal form is lower than in general. Moreover, results in [8,9] show that states can be traded for pushdown symbols and vice versa. Hence, the size of a pushdown automaton is affected by the size of its state set as well as by the number of pushdown symbols. Since here we are closer to the original definition, the size of a digging input-driven pushdown automaton is measured not only by its states but as the product of the number of states and the number of pushdown symbols. Accordingly, we define the function size that maps a digging input-driven pushdown automaton to its size.

Definition 4.1. A nondeterministic digging input-driven pushdown automaton, abbreviated as NDIDPDA, is a system M = ⟨Q, Σ, Γ, q_0, F, ⊥, δ_D, δ_R, δ_N⟩, where Q, Σ, Γ, q_0, and F are defined as for DIDPDAs. The transition functions are now nondeterministic, that is, they map into the power sets of their former ranges, where the restrictions from the definition of DIDPDAs are adapted.
As usual, an input is accepted if there is an accepting computation on it.
The next theorem shows how NDIDPDAs can be determinized. Basically, the idea of the proof is along the lines of [22].

Theorem 4.2.
For every NDIDPDA M, an equivalent DIDPDA M' can effectively be constructed.

Proof.
The states of M' have the following interpretation. Each triple (p, q, z) says either that if M enters state p on pushing z onto a pushdown whose content is above the surface, then state q is reachable when M sees this z on top of the pushdown again, or that if M enters state p on popping z from a pushdown whose content is below the surface, then state q is reachable when M sees this z̄ at the bottom of the pushdown again.
Since M is nondeterministic, the states of M' are subsets of such triples. So, the initial state of M' consists of the triple (q_0, q_0, ⊥), where state q_0 is reached from state q_0 with empty pushdown simply by doing nothing.
During the computation, M' pushes its current state together with the current input symbol whenever the current input symbol enforces a push (of M as well as of M') onto a pushdown whose content is above the surface. Similarly, M' pops its current state together with the current input symbol whenever the current input symbol enforces a pop (of M as well as of M') from a pushdown whose content is below the surface. So, we will define the set of pushdown symbols as Γ' = 2^{Q×Q×(Γ∪{⊥})} × (Σ_D ∪ Σ_R).
So, in the first case, M' pushes the current context of the simulation together with the current input symbol, and starts to compute the reachable states with the new top-of-pushdown symbol of M. In the second case, the context of the simulation popped from the pushdown last (which is seen by M' now) is combined with the simulation that started after that popping (which is given by the state of M'). If both computation paths fit together, they are used to define the triples of the new state of M'.

Now let a ∈ Σ_N be the current input symbol, P be the current state of M', and (Z, x) ∈ Γ'. Then we define δ'_N(P, a, (Z, x)) = P', where P' = { (p, q', z) | there is (p, q, z) ∈ P such that q' ∈ δ_N(q, a, z) }. The definition for barred symbols is almost identical: if M' sees the barred copy of (Z, x), then δ'_N yields P' = { (p, q', z) | there is (p, q, z) ∈ P such that q' ∈ δ_N(q, a, z̄) }. So, M' directly simulates one step of M in all currently traced computation paths.

Finally, let a ∈ Σ_R be the current input symbol, P be the current state of M', and (Z, x) ∈ Γ'. The definition of δ'_R is similar to the definition of δ'_D. Here the roles played by pushing and popping and by pushdown contents above and below the surface are interchanged. We define δ'_R(P, a, (Z, x)) = (P', pop((Z, x))), where the triples of P' are obtained by combining the computation paths traced in P with the context Z popped from the pushdown. If the symbol seen is the barred copy of (Z, x), then we define δ'_R(P, a, (Z, x)) = (P', pop((P, a))), where P' = { (q', q', z') | there is (q, r, z) ∈ P such that (q', pop(z')) ∈ δ_R(r, a, z̄) }.
By the construction and the choice of the set of accepting states, when M' reaches an accepting state, this state contains a triple that says that M can reach an accepting state. Conversely, if some computation path of M ends accepting, it is simulated by M' and, thus, M' accepts as well.
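The simplest case of this subset construction, the update of a state of the deterministic automaton on a neutral symbol, can be sketched as follows. Each triple (p, q, z) traces one nondeterministic path of M: p was entered when z became the visible symbol, and q is currently reachable on that path. The dictionary encoding of δ_N and all names are assumptions of this illustration.

```python
def step_neutral(P, a, delta_N):
    """Update a set P of triples (p, q, z) on a neutral input symbol a:
    the middle component advances by every nondeterministic choice of M."""
    return frozenset(
        (p, q2, z)
        for (p, q, z) in P
        for q2 in delta_N.get((q, a, z), ())
    )

# Toy NDIDPDA fragment: in q1, seeing Z, reading 'a' may lead to q2 or q3.
delta_N = {("q1", "a", "Z"): {"q2", "q3"}}
P = frozenset({("q0", "q1", "Z")})
print(sorted(step_neutral(P, "a", delta_N)))
# → [('q0', 'q2', 'Z'), ('q0', 'q3', 'Z')]
```

The push and pop cases additionally store or recombine such sets on the pushdown, as described above.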
Given some nondeterministic digging input-driven pushdown automaton, an upper bound for the size of an equivalent DIDPDA can immediately be derived from Theorem 4.2. In order to obtain lower bounds, for k ≥ 1, we consider the witness languages

L_k = { w#u | w ∈ {a, b, $}*, u ∈ {a, b}^k, and $u$ is a factor of w$ },

that is, the words w#u such that u occurs in w as a block delimited by $'s (or by a $ and the end of w).

Lemma 4.3.
For k ≥ 1, the language L_k is accepted by some NDIDPDA with 2k + 4 states and 2^k pushdown symbols.
Proof. We set Q = {q_0, q_1, . . . , q_{k+1}, p_0, p_1, . . . , p_{k+1}}, and we specify the transition functions for x, v_1, v_2, . . . , v_k ∈ {a, b}; the transitions are referred to by the numbers (1)–(10) below. The basic idea of the construction is as follows. In the initial step (uniquely determined by state q_0 and empty pushdown), M guesses a factor v ∈ {a, b}^k and pushes it (Transition (1)). Moreover, state q_0 with a non-empty pushdown is used to read the prefix of the input and, on some $, to guess whether the next input factor is u (Transitions (1), (2), (5)). Whenever in this phase a $ enforces a push operation, the guessed v at the top of the pushdown is pushed again (Transition (2)).
When the guess is that u comes next, the state q_1 is entered. Then, states q_1, q_2, . . . , q_k are used to match the input factor u read with the factor v on top of the pushdown (Transition (6)). If and only if the first k symbols of u match v, then M is in state q_{k+1}. Afterwards the computation continues if and only if the next input symbol is $ or #, that is, if and only if u = v. If the symbol is #, state p_1 is entered (Transition (7)). If the symbol is $, state p_0 is used to read the remaining input up to the symbol #, where M behaves similarly as for state q_0 (Transitions (4), (8)). When M reaches the # in state p_0, it changes to state p_1 (Transition (9)).
Finally, states p 1 , p 2 , . . . , p k are used to match the remaining input suffix read with the factor v on top of the pushdown (Transition (10)). If and only if the first k symbols of the suffix match v then M is in state p k+1 . Since p k+1 is the sole accepting state and the transition functions are undefined for p k+1 , M halts. So, it accepts, if and only if it has read the input entirely, that is, if and only if the suffix equals v and, thus equals u, and therefore the input belongs to L k .
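A brute-force membership test for the witness language L_k can be sketched as follows. It assumes the reading suggested by the construction above: a word belongs to L_k iff it has the form w#u with u ∈ {a, b}^k and u occurring in w as a block delimited by $'s (or by a $ and the end of w).

```python
def in_Lk(word, k):
    """Check membership in L_k by string matching (illustration only)."""
    if word.count("#") != 1:
        return False
    w, u = word.split("#")
    if len(u) != k or not set(u) <= {"a", "b"} or not set(w) <= {"a", "b", "$"}:
        return False
    return ("$" + u + "$") in (w + "$")     # u as a $-delimited block of w

print(in_Lk("$ab$ba$#ba", 2))   # → True  ('ba' occurs as a block)
print(in_Lk("$ab$ba$#aa", 2))   # → False
```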
So, the size of M is (2k + 4) · 2^k.

The next lemma shows that the upper bound derived from the determinization is tight in the order of the second exponent. We recall that the size of a digging IDPDA is defined as the product of the number of states and the number of pushdown symbols.

Lemma 4.4.
For k ≥ 1, every DIDPDA accepting the language L_k has a size of at least 2^{2^k}.

Proof. Let M = ⟨Q, Σ, Γ, q_0, F, ⊥, δ_D, δ_R, δ_N⟩ be a DIDPDA that accepts L_k. We consider subsets of {a, b}^k, and for any such subset S = {v_1, v_2, . . . , v_m} ⊆ {a, b}^k, let the ordering of the elements be arbitrarily fixed. Then, a word w_S is defined as $v_1 $v_2 · · · $v_m $. Assume now that, for two different such subsets R and S, automaton M is in the same state and sees the same symbol at the pushdown after processing the words w_R and w_S. Since R and S are different, there is a word u belonging to one set but not to the other, say u ∈ R \ S.
Next, we distinguish three cases depending on the pushdown symbol that M sees after processing w_R and w_S.
If this symbol is ⊥, the words w_R and w_S are extended by #u. So, w_R#u is accepted if and only if w_S#u is accepted. This is a contradiction, since u ∈ R \ S and, thus, w_R#u ∈ L_k and w_S#u ∉ L_k.

In the next case, M sees a symbol from Γ at the pushdown after processing w_R and w_S. This implies that Σ_D is not empty and contains at least one of the symbols a, b, or $. We denote one of these symbols from Σ_D by x and extend w_R and w_S by x^{k+1}#u. The computations on both words are (q_0, w_R x^{k+1}#u, ⊥) ⊢* (q_1, x^{k+1}#u, zs) ⊢* (q_2, #u, s'zs) and (q_0, w_S x^{k+1}#u, ⊥) ⊢* (q_1, x^{k+1}#u, zŝ) ⊢* (q_2, #u, s'zŝ), where s' ∈ Γ^{k+1}, q_1, q_2 ∈ Q, z ∈ Γ, and s, ŝ ∈ Γ*. Since |u| = k, we conclude that the continuations of the computations do not depend on s or ŝ at the pushdown. So, w_R x^{k+1}#u is accepted if and only if w_S x^{k+1}#u is accepted. As before, this is a contradiction, since u ∈ R \ S and, thus, w_R x^{k+1}#u ∈ L_k and w_S x^{k+1}#u ∉ L_k.

In the last case, M sees a symbol from Γ̄ at the pushdown after processing w_R and w_S. This implies that Σ_R is not empty and contains at least one symbol from the set {a, b, $}. As in the case before, we denote one of these symbols from Σ_R by x and extend w_R and w_S by x^{k+1}#u. The computations on both words are now (q_0, w_R x^{k+1}#u, ⊥) ⊢* (q_1, x^{k+1}#u, s z̄) ⊢* (q_2, #u, s z̄ s') and (q_0, w_S x^{k+1}#u, ⊥) ⊢* (q_1, x^{k+1}#u, ŝ z̄) ⊢* (q_2, #u, ŝ z̄ s'), where s' ∈ Γ̄^{k+1}, q_1, q_2 ∈ Q, z̄ ∈ Γ̄, and s, ŝ ∈ Γ̄*. Since |u| = k, we conclude that the continuations of the computations do not depend on s or ŝ at the pushdown. So, w_R x^{k+1}#u is accepted if and only if w_S x^{k+1}#u is accepted. As before, this is a contradiction, since u ∈ R \ S and, thus, w_R x^{k+1}#u ∈ L_k and w_S x^{k+1}#u ∉ L_k.

These contradictions show that the assumption that M is in the same state and sees the same symbol at the pushdown for two different subsets R and S is wrong. Since there are 2^{2^k} different subsets, we conclude |Q| · |Γ| ≥ 2^{2^k}. Therefore, the size of M is at least 2^{2^k}.

Closure properties and decidability questions
For input-driven pushdown automata, strong closure properties are shown in [1], provided that all automata involved share the same partition of the input alphabet. Here, we distinguish this important special case from the general one. For easier writing, we call the partition of an input alphabet a signature, and say that two signatures are compatible if they coincide.

Before we start to investigate the closure properties in detail, we state the following preparatory lemma, which ensures that the input is completely read in every computation.

Lemma 5.1.
Any DIDPDA can effectively be converted to an equivalent DIDPDA that accepts or rejects its input only after having read the input entirely.
Proof. Let M be a DIDPDA. By definition, an input is accepted by M only after having read it entirely. An input is rejected by M if the computation of M ends in a non-accepting state after having read the input entirely, or if the computation of M blocks since the transition function is undefined for the current situation. In the latter case, we have to ensure that the remaining input is processed. So, for the construction of an equivalent DIDPDA M', it is sufficient to add transitions to a new non-accepting state s⁻ whenever a transition is undefined. Once in state s⁻, M' still has to obey the signature: For any input symbol from Σ_N, M' stays in state s⁻. For any input symbol a ∈ Σ_D, we define δ_D(s⁻, a, g) = (s⁻, push(g)) for g ∈ Γ, δ_D(s⁻, a, ⊥) = (s⁻, push(g)) for some g ∈ Γ, and δ_D(s⁻, a, ḡ) = (s⁻, push(g)) for ḡ ∈ Γ̄. For any input symbol a ∈ Σ_R, we define δ_R(s⁻, a, g) = (s⁻, pop(g)) for g ∈ Γ, δ_R(s⁻, a, ⊥) = (s⁻, pop(g)) for some g ∈ Γ, and δ_R(s⁻, a, ḡ) = (s⁻, pop(g)) for ḡ ∈ Γ̄. In this way, all non-accepting computations of M now end in a non-accepting state of M' after having read the input entirely.

Now, we will first turn to the closure under the Boolean operations.

Lemma 5.2.
Let M_1 and M_2 be two DIDPDAs with compatible signatures. Then DIDPDAs accepting the languages L(M_1) ∩ L(M_2), L(M_1) ∪ L(M_2), and the complement of L(M_1) can effectively be constructed.

Proof. First of all, we apply Lemma 5.1 and assume that all DIDPDAs considered read their input entirely before accepting or rejecting. For the closure under intersection, we can use the cross-product construction that is known from the closure under intersection for IDPDAs. Basically, both DIDPDAs can be simulated on two tracks of a new DIDPDA having pairs of states as state set and pairs of pushdown symbols as pushdown symbols. Since the signatures are compatible, the push and pop operations of both DIDPDAs take place at the same time. Hence, both pushdown stores can be simulated and maintained by one pushdown store. Finally, the input is accepted if both simulated computations are accepting.
For the closure under complementation, we first notice that in every accepting or rejecting computation the input is completely read. Thus, we can construct a new DIDPDA for the complement by interchanging accepting and rejecting states. The closure under complementation also implies the closure under union by applying De Morgan's laws.
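The cross-product idea can be sketched as follows, with both machines given as abstract transition functions; all encodings here are assumptions of this illustration, and the digging below the surface is omitted, the point being only that compatible signatures keep both pushdowns in lockstep, so one stack of symbol pairs suffices.

```python
def product_run(w, sig, d1, d2, start, accepting):
    """sig maps each input symbol to 'D', 'R', or 'N' (the shared signature);
    d1, d2: (state, input, seen_symbol) -> (new_state, pushdown_symbol)."""
    (q1, q2), stack = start, []
    for a in w:
        seen1, seen2 = stack[-1] if stack else ("⊥", "⊥")
        q1, z1 = d1(q1, a, seen1)
        q2, z2 = d2(q2, a, seen2)
        if sig[a] == "D":
            stack.append((z1, z2))       # both machines push in the same step
        elif sig[a] == "R" and stack:
            stack.pop()                  # both machines pop in the same step
    return (q1, q2) in accepting

# Toy instance: two copies of a machine that just flips its state on every
# symbol; acceptance requires both components back in state 0.
d = lambda q, a, g: (1 - q, "X")
sig = {"a": "D", "b": "R"}
print(product_run("ab", sig, d, d, (0, 0), {(0, 0)}))    # → True
print(product_run("aba", sig, d, d, (0, 0), {(0, 0)}))   # → False
```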
Next, we turn to non-closure results.

Lemma 5.3.
The family L(DIDPDA) is not closed under length-preserving homomorphism, inverse homomorphism, and concatenation. Moreover, it is not closed under union and intersection if the signatures are incompatible.

Proof. The languages L_1 = { a^n $ a^n | n ≥ 1 } and L_2 = { a^n $ b^{2n} | n ≥ 1 } are each not accepted by any DIDPDA. On the other hand, L'_1 = { a^n $ b^n | n ≥ 1 } and L'_2 = { a^{2n} $ b^{2n} | n ≥ 1 } are each accepted by some DIDPDA. Now, consider the length-preserving homomorphism h_1 : {a, b, $}* → {a, $}* mapping both a and b to a and $ to $, and the homomorphism h_2 : {a, b, $}* → {a, b, $}* mapping a to aa, b to b, and $ to $. Then, h_1(L'_1) = L_1 and h_2^{-1}(L'_2) = L_2, which implies that L(DIDPDA) is closed neither under length-preserving homomorphism nor under inverse homomorphism.
For the non-closure under intersection, consider L_3 = { a^n b^n c^m | n, m ≥ 1 } and L_4 = { a^n b^m c^m | n, m ≥ 1 }. Both languages can be accepted by DIDPDAs, but with incompatible signatures. However, L_3 ∩ L_4 = { a^n b^n c^n | n ≥ 1 } is a non-context-free language, which gives the non-closure under intersection. Since L(DIDPDA) is closed under complementation by Lemma 5.2, the non-closure under union can be derived as well.
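Within bounded exponents, the shape of the intersection can be confirmed by a small brute-force computation (the bound B is arbitrary):

```python
B = 7  # arbitrary bound on the exponents n and m
L3 = {'a' * n + 'b' * n + 'c' * m for n in range(1, B) for m in range(1, B)}
L4 = {'a' * n + 'b' * m + 'c' * m for n in range(1, B) for m in range(1, B)}

# within the bound, the intersection is exactly { a^n b^n c^n | n >= 1 }
assert L3 & L4 == {'a' * n + 'b' * n + 'c' * n for n in range(1, B)}
```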
To obtain the non-closure under concatenation, we consider the language L_5 = { a^n b^m | n, m ≥ 1 and n < m } that is accepted by some DIDPDA. It is known from the proof of Lemma 3.4 that L_5 · L_5 is not accepted by any DIDPDA, which shows that L(DIDPDA) is not closed under concatenation even with compatible signatures.
We continue with studying the unary operations Kleene star and reversal. It is known that the family L(IDPDA) is closed under both operations. However, for the reversal the construction for IDPDAs works by considering the inverted signature Σ^{-1}, in which Σ_D and Σ_R are interchanged. This means that symbols from Σ_R then imply push operations, whereas symbols from Σ_D imply pop operations. If we require that the IDPDA for the reversal has to have the identical signature, then L(IDPDA) is no longer closed under reversal. This can be seen using the language { a^n b^n | n ≥ 1 }, which is accepted by some IDPDA with Σ_D = {a} and Σ_R = {b}. Any IDPDA with the same signature is clearly unable to accept the reversal { b^n a^n | n ≥ 1 }. Thus, the family L(IDPDA) is closed under reversal with arbitrary signatures, but not under reversal with identical signatures.

Lemma 5.4. The family L(DIDPDA) is closed neither under Kleene star nor under reversal.

Proof. We consider again the language L used in the proof of Lemma 3.4, which is the concatenation of the language L_5 = { a^n b^m | n, m ≥ 1 and n < m } with itself, where L_5 itself can be accepted by some DIDPDA.
Let us assume that L(DIDPDA) is closed under Kleene star. Then, the language L_5^* ∩ a^+b^+a^+b^+ is accepted by some DIDPDA, since the intersection with the regular set a^+b^+a^+b^+ can be checked in the state set. However, L_5^* ∩ a^+b^+a^+b^+ = L, which gives a contradiction, since L is not accepted by any DIDPDA.

For the non-closure under reversal, we consider the language L^R = { b^n a^m b^ℓ a^k | 1 ≤ m < n and 1 ≤ k < ℓ }, which is accepted by a DIDPDA M as follows. The first b-block digs a b-hole. The following a-block fills this b-hole. As soon as the b-hole is completely filled, the input has to be rejected and a blocking non-accepting state is entered. If the b-hole has a depth of at least one, the computation continues and the following b-block digs a c-hole, where the first dug symbol is c′. Then, the final a-block fills this c-hole. The input is accepted if the symbol c′ is not filled. As soon as the symbol c′ is filled, a blocking non-accepting state is entered. Thus, M accepts if the first a-block is strictly shorter than the first b-block and the second a-block is strictly shorter than the second b-block. Assume now that the family L(DIDPDA) is closed under reversal. Then, (L^R)^R = L is accepted by some DIDPDA. However, Lemma 3.4 shows that L is not accepted by any DIDPDA. This gives a contradiction and the desired non-closure result.
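The block-by-block behaviour of the DIDPDA M described above can be simulated directly. Representing the hole as a list with a marked symbol 'C1' standing for c′, and parsing the block structure with a regular expression, are simplifications of this sketch:

```python
import re

def accepts_LR(w):
    """Simulate the sketched DIDPDA for
    L^R = { b^n a^m b^l a^k | 1 <= m < n and 1 <= k < l }.
    Signature: b pops (digs a hole), a pushes (fills the hole)."""
    if not re.fullmatch(r'b+a+b+a+', w):
        return False                      # block format tracked in the states
    i = w.index('a')                      # length n of the first b-block
    j = w.index('b', i)                   # first a-block ends here
    t = w.index('a', j)                   # second b-block ends here
    m, l, k = j - i, t - j, len(w) - t
    hole = ['B'] * i                      # first b-block digs a b-hole
    for _ in range(m):                    # first a-block fills the b-hole
        hole.pop()
        if not hole:
            return False                  # b-hole completely filled: reject
    hole += ['C1'] + ['C'] * (l - 1)      # second b-block digs a c-hole
    for _ in range(k):                    # final a-block fills the c-hole
        if hole.pop() == 'C1':
            return False                  # marked symbol c' filled: reject
    return True                          # c' still dug out: m < n and k < l
```

On 'bbabba', for instance, the b-hole still has depth one after the first a-block, and the final a fills only an unmarked c-symbol, so the word is accepted.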
Next, we will discuss some decidability questions for DIDPDAs. Let us recall that a decidability problem is undecidable whenever the set of all instances for which the answer is "yes" is not recursive, whereas it is semidecidable whenever the set of all instances for which the answer is "yes" is recursively enumerable. The family L(DIDPDA) is a subset of the deterministic context-free languages. Thus, all decidability questions that are decidable for DPDAs are decidable for DIDPDAs as well.
It is known that the inclusion problem for deterministic context-free languages is undecidable. However, for DIDPDAs with compatible signatures it is decidable. Proof. It is shown in [13] that the inclusion problem is not semidecidable for two IDPDAs with not necessarily compatible signatures. Moreover, it can be assumed that the pushdown store of the two IDPDAs is in fact used only as a counter. The basic idea of the proof in [13] is to construct two IDPDAs M and M′ such that the intersection L(M) ∩ L(M′) represents a suitably encoded version of the valid computations of a counter machine. Then, the inclusion problem can be related to the emptiness problem for counter machines, which is known to be not semidecidable.
Finally, we look at the computational complexity of the decidable questions.
Theorem 5.8. The emptiness problem for DIDPDAs and NDIDPDAs is P-complete. The universality, equivalence, and inclusion problems for DIDPDAs are P-complete. The universality, equivalence, and inclusion problems for NDIDPDAs are EXPTIME-complete.
Proof. The emptiness problem for general (nondeterministic) pushdown automata is in P. This implies that the emptiness problem for DIDPDAs and NDIDPDAs is in P as well. Since the inclusion problem can be reduced to the emptiness problem by the proof of Theorem 5.6, and the constructions involved are intersection and complementation, which cause only a polynomial blow-up, the inclusion problem for DIDPDAs belongs to P. Hence, the universality and equivalence problems for DIDPDAs belong to P as well. The P-hardness of the emptiness problem for deterministic and nondeterministic IDPDAs is shown in [17] by a reduction from the alternating graph reachability problem. A different proof is given in [22], by a reduction from the monotone circuit value problem. The basic idea is to construct a deterministic IDPDA that accepts a unique string if and only if the given circuit evaluates to 1. Since the accepted string is well-nested with respect to the signature of the IDPDA, in the accepting computation there is no situation in which a pop operation on the empty pushdown takes place. Thus, the deterministic IDPDA is in fact a DIDPDA for the accepting computation. For the rejecting computations, we use the transitions of the given deterministic IDPDA, but enter, similar to the construction in the proof of Lemma 5.1, a new rejecting state that is never left if a situation occurs in which a pop operation on the empty pushdown takes place. Hence, the emptiness problem for DIDPDAs and NDIDPDAs is P-hard as well. This also gives the P-hardness of the equivalence and inclusion problems for DIDPDAs, since one can test equivalence with the empty set and inclusion in the empty set, respectively. The P-hardness of the universality problem can be obtained similarly by taking into account that DIDPDAs are closed under complementation by Lemma 5.2.
The universality, equivalence, and inclusion problems for NDIDPDAs are in EXPTIME, since by Theorem 4.2 every NDIDPDA can be converted into an equivalent DIDPDA with an exponential blow-up, and then the problems considered can be solved for DIDPDAs as described above. The EXPTIME-hardness of universality is shown in [1] (see also [22]) for nondeterministic IDPDAs by a reduction from the membership problem for alternating linear-space Turing machines. The basic idea is to construct a nondeterministic IDPDA M′ for a given Turing machine M and an input w such that L(M′) = Σ^* if and only if w ∉ L(M). In detail, the automaton M′ accepts every input except valid encodings of accepting computation trees of M on w. This is realized by guessing a position while reading the input string and checking whether an error really occurs that makes the encoding invalid. Only in this case is the input accepted. The pushdown store of the IDPDA is basically used to check whether a part of the input string is not of the form x y x^R for non-empty strings x and y. Now, we have to ensure that this behavior can also be realized by an NDIDPDA M″. The guessing and checking of an error can be realized in the same way as in the IDPDA M′. However, we have to ensure that the signature of the input is obeyed before and after the guessing and checking phase, and that the guessing and checking phase does not start with a hole in the pushdown store. The first property is obtained by applying again a construction similar to that in the proof of Lemma 5.1. We observe that a valid encoding of an accepting computation tree is a well-nested string. This means that as soon as M″ pops from the empty pushdown store, we can enter an accepting state s⁺ which is never left until the complete input is read. This also ensures the second property: if the guessing and checking phase starts, then we know that there is no hole in the pushdown store and we can proceed as in M′.
Whenever in the following computation an error is detected or M″ pops from the empty pushdown store, we enter the accepting state s⁺. Thus, we obtain the EXPTIME-hardness of universality for NDIDPDAs in a similar way as for nondeterministic IDPDAs. This also gives the EXPTIME-hardness of the equivalence and inclusion problems, since we can test equivalence with Σ^* or inclusion of Σ^*.
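The shape test that the pushdown store performs in the universality reduction can be illustrated by a small function; this is only a sketch of the property being checked, not of the automaton itself:

```python
def has_form_xyxr(w):
    """Does w factor as x + y + reverse(x) with non-empty x and y?"""
    n = len(w)
    # x of length i needs a non-empty middle part: n - 2*i >= 1
    return any(w[:i] == w[n - i:][::-1] for i in range(1, (n - 1) // 2 + 1))
```

A nondeterministic pushdown can realize this test by guessing the border of x, pushing x, skipping y, and matching the suffix against the store.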

On the union of the families accepted by DIDPDAs and IDPDAs
The preceding sections showed that DIDPDAs as well as IDPDAs are interesting special cases of deterministic pushdown automata, since both classes have strong closure properties as well as the decidability of inclusion in case of compatible signatures which is in contrast to general deterministic pushdown automata. Furthermore, it has been shown in Theorem 3.5 that the language families L (DIDPDA) and L (IDPDA) are incomparable. Thus, it is natural to consider the union of both language families, that is, the family of languages which are accepted by DIDPDAs or IDPDAs, and to ask whether the new language family shares the closure properties as well as the decidability of inclusion with its subfamilies. In this section, we will investigate these questions and it turns out that the new family is no longer closed under union and intersection with compatible signatures. Moreover, the inclusion problem for two automata becomes non-semidecidable even if both automata have the same signature. This is in strong contrast to the results known for DIDPDAs as well as for IDPDAs.
Let us denote by L the family of languages that are accepted by DIDPDAs or IDPDAs. Formally, we define L = L(DIDPDA) ∪ L(IDPDA). Since the families L(DIDPDA) and L(IDPDA) are incomparable, it is clear that both families are proper subsets of L. Moreover, L is properly included in the family of real-time deterministic context-free languages. A witness is again the language { a^n $ a^n | n ≥ 1 }, which is accepted neither by any DIDPDA nor by any IDPDA.
Considering the closure properties of L, we have the following positive closure result.

Lemma 6.1. The family L is closed under complementation.

Next, we turn to non-closure results.
Lemma 6.2. The family L is not closed under union, intersection, and concatenation, even with compatible signatures. It is closed under neither Kleene star, reversal, inverse homomorphism, nor length-preserving homomorphism.
Proof. We consider L_3 = { a^n b^n c^m | n, m ≥ 1 } and L_4 = { a^n b^m c^m | n, m ≥ 1 } that have already been used in the proof of Lemma 5.3. Language L_4 can be accepted by an IDPDA with the signature Σ_D = {b}, Σ_R = {a, c}, and Σ_N = ∅, while L_3 is accepted by a DIDPDA with the same signature. Thus, L_3 ∩ L_4 belongs to L if L is closed under intersection with compatible signatures. Since L_3 ∩ L_4 is a non-context-free language, we obtain the non-closure under intersection with compatible signatures. Since L is closed under complementation by Lemma 6.1, the non-closure under union with compatible signatures can be derived as well.

Next, we consider the language L_5 = { a^n b^m | n, m ≥ 1 and n < m } that is accepted by some IDPDA with the signature Σ_D = {a}, Σ_R = {b}, and Σ_N = ∅, as well as the language L_6 = { w ∈ {a, b}^+ | |w|_a < |w|_b } that is accepted by some DIDPDA with the same signature. Now, assume that L is closed under concatenation. Then, L_5 · L_6 belongs to L. If L_5 · L_6 is accepted by some DIDPDA, then the intersection (L_5 · L_6) ∩ a^+b^+a^+b^+ is accepted by a DIDPDA as well. However, this intersection gives the language L = { a^k b^ℓ a^m b^n | 1 ≤ k < ℓ and 1 ≤ m < n } used in the proof of Lemma 3.4, and it is known that L is not accepted by any DIDPDA. Thus, L_5 · L_6 has to be accepted by an IDPDA. However, a straightforward proof shows that L_5 · L_6 cannot be accepted by any IDPDA either. This gives the contradiction and the non-closure under concatenation with compatible signatures.

Next, we assume that L is closed under Kleene star. We define the language L_7 = L_5 ∪ cL_6 and observe that L_7 is accepted by some DIDPDA and, hence, belongs to L. Since L is closed under Kleene star, the language L_7^* belongs to L as well and is accepted by some DIDPDA or IDPDA. If L_7^* is accepted by some DIDPDA, then L_7^* ∩ a^+b^+ca^+b^+ is accepted by some DIDPDA as well.
However, L_7^* ∩ a^+b^+ca^+b^+ = { a^k b^ℓ c a^m b^n | 1 ≤ k < ℓ and 1 ≤ m < n } = L′, which is basically the language L used in the proof of Lemma 3.4. Then, a proof almost identical to the proof of Lemma 3.4 shows that L′ is not accepted by any DIDPDA, which is a contradiction. Thus, L_7^* is accepted by some IDPDA. Hence, the language L_7^* ∩ c{a, b}^+ = cL_6 is accepted by an IDPDA as well. Again, a straightforward proof shows that cL_6 cannot be accepted by any IDPDA, which gives the contradiction and the non-closure under Kleene star.

Now, we assume that L is closed under reversal. We define the language L_8 = (L_5 cL_5)^R ∪ cL_6 and note that L_8 is accepted by some DIDPDA, since (L_5 cL_5)^R can be accepted by some DIDPDA similar to the construction given in the proof of Lemma 5.4. Hence, L_8 belongs to L. Since L is closed under reversal, the language L_8^R belongs to L as well and is accepted by some DIDPDA or IDPDA. If L_8^R is accepted by some DIDPDA, then we consider L_8^R ∩ a^+b^+ca^+b^+ and obtain again the language L′, which is not accepted by any DIDPDA. Thus, L_8^R is accepted by an IDPDA, which implies that L_8^R ∩ {a, b}^+c = L_6c is accepted by an IDPDA as well. Again, a straightforward proof shows that L_6c cannot be accepted by any IDPDA, which gives the contradiction and the non-closure under reversal.
The proofs for the non-closure under length-preserving homomorphism and inverse homomorphism are identical to the proofs given for Lemma 5.3: the languages L_1 = { a^n $ a^n | n ≥ 1 } and L_2 = { a^n $ b^{2n} | n ≥ 1 }, which are each accepted by neither any DIDPDA nor any IDPDA, are obtained as the length-preserving homomorphic image and the inverse homomorphic image of the languages L_1′ = { a^n $ b^n | n ≥ 1 } and L_2′ = { a^{2n} $ b^{2n} | n ≥ 1 }, respectively, which in turn are each accepted by some DIDPDA or IDPDA.
All results on the closure properties of Sections 5 and 6 are summarized in Table 1. The inclusion problem is known to be solvable in polynomial time if both automata have compatible signatures and, moreover, both are DIDPDAs or both are IDPDAs. Our next result shows that this situation changes drastically if we consider the inclusion problem for two automata with the same signature, but one automaton being a DIDPDA and the other one being an IDPDA. In this situation, it turns out that the inclusion problem becomes undecidable and, moreover, not even semidecidable.
The non-semidecidability of the inclusion problem for automata from L that even have identical signatures is shown by a reduction from the emptiness problem for deterministic linearly space bounded one-tape, one-head Turing machines, so-called linear bounded automata (LBA). We remark that the non-semidecidability of the latter problem may be obtained by applying the result shown in [23] that every recursively enumerable language L accepted by some Turing machine can be represented as the homomorphic image h(L′) of a context-sensitive language L′ accepted by some LBA. Since L = h(L′) is empty if and only if L′ is empty, the emptiness problem for Turing machines, which is known to be not semidecidable, is reduced to the emptiness problem for LBAs.

Table 1. Closure properties of the language families discussed. Symbols ∪_c, ∩_c, and ·_c denote union, intersection, and concatenation with compatible signatures. Such operations are not defined for DFAs and DPDAs and are marked with '-'. The abbreviations h_{l.p.} and h^{-1} denote length-preserving homomorphism and inverse homomorphism.
For the reduction of the emptiness problem for LBAs to the inclusion problem for automata from L with identical signatures, we encode histories of LBA computations into single words that are called valid computations (see, for example, [10]). We may assume that LBAs get their input in between two endmarkers, make no stationary moves, accept by halting in some unique state f on the leftmost input symbol after an even number of steps, and are sweeping, that is, the read-write head changes its direction at the endmarkers only. Let δ be the transition function of some LBA M and Q be its state set, where q_0 is the initial state, T with T ∩ Q = ∅ is the tape alphabet containing the endmarkers ▷ and ◁, and Σ ⊂ T is the input alphabet. Since M is sweeping, the set of states can be partitioned into sets Q_R and Q_L of states appearing in left-to-right and in right-to-left moves, respectively. A configuration of M can be written as a string of the form T^*QT^* such that t_1 t_2 ⋯ t_i s t_{i+1} ⋯ t_n is used to express that M is in state s, t_1 t_2 ⋯ t_n is the tape inscription, for s ∈ Q_R tape symbol t_{i+1} is scanned, and for s ∈ Q_L tape symbol t_i is scanned. Now, we consider words of the form w_0 w_1^R w_2 ⋯ w_{2m-1}^R w_{2m}, where the w_i ∈ {▷}T^*QT^*{◁} are configurations of M, w_0 is an initial configuration of the form ▷ q_0 Σ^* ◁, w_{2m} ∈ {▷f}T^*{◁} is a halting, that is, accepting configuration, and w_{i+1} is the successor configuration of w_i. These words are encoded so that every state symbol is merged together with both of its adjacent symbols into a metasymbol. To this end, we assume that the LBA input is nonempty, and rewrite every substring of w_0 w_1^R w_2 ⋯ w_{2m-1}^R w_{2m} having the form t q t′, with q ∈ Q and t, t′ ∈ T, to the metasymbol [t, q, t′]. Additionally, we consider copies Q̄ and T̄ of the encoding alphabet and use the overlined alphabet for every reversed configuration. The set of these encodings is defined to be the set of valid computations of M. We denote it by VALC(M).
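The encoding can be sketched as follows. Representing tape symbols as Python strings, endmarkers as '>' and '<', metasymbols as triples, and the overline as a ('bar', x) tag are assumptions of this illustration:

```python
def merge_metasymbols(conf, states):
    """Rewrite every substring t q t' (q a state) of a configuration,
    given as a list of symbols, to the metasymbol (t, q, t')."""
    out, i = [], 0
    while i < len(conf):
        if i + 2 < len(conf) and conf[i + 1] in states:
            out.append((conf[i], conf[i + 1], conf[i + 2]))
            i += 3
        else:
            out.append(conf[i])
            i += 1
    return out

def encode_valc(confs, states):
    """Encode w0 w1^R w2 ... : every odd-indexed configuration is
    reversed and written over the overlined copy of the alphabet."""
    word = []
    for idx, conf in enumerate(confs):
        seq = list(reversed(conf)) if idx % 2 == 1 else list(conf)
        ms = merge_metasymbols(seq, states)
        if idx % 2 == 1:
            ms = [('bar', x) for x in ms]   # overlined alphabet
        word.extend(ms)
    return word
```

On a two-configuration toy sequence, the first configuration keeps its orientation while the second is reversed, merged into metasymbols, and overlined.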
Example 6.3. We consider the following computation of an LBA on input x_1x_2x_3, where each configuration consists of the current tape inscription, the current state, and the current position of the read-write head, with q_0, …, q_3, f ∈ Q_R and p_0, p_1, p_2, p_3 ∈ Q_L:

1: (▷x_1x_2x_3◁, q_0, 1)    2: (▷y_1x_2x_3◁, q_1, 2)    3: (▷y_1y_2x_3◁, q_2, 3)
4: (▷y_1y_2y_3◁, q_3, 4)    5: (▷y_1y_2y_3◁, p_3, 3)    6: (▷y_1y_2z_3◁, p_2, 2)
7: (▷y_1z_2z_3◁, p_1, 1)    8: (▷z_1z_2z_3◁, p_0, 0)    9: (▷z_1z_2z_3◁, f, 1)

The corresponding valid computation is the encoding of w_0 w_1^R w_2 w_3^R w_4 w_5^R w_6 w_7^R w_8, where w_0, …, w_8 denote the nine configurations above written as strings, with every state merged with its two adjacent symbols into a metasymbol and every reversed configuration written over the overlined alphabet.

Lemma 6.4. Let M be an LBA. Then an IDPDA M_1 and a DIDPDA M_2 with the same signature can be constructed such that L(M_1) ∩ L(M_2) = VALC(M).

Proof. Let us first describe an IDPDA M_1 with the signature Σ_D = Q̄ ∪ T̄ ∪ (T̄ × Q̄ × T̄), Σ_R = Q ∪ T ∪ (T × Q × T), and Σ_N = ∅. First of all, the state set of M_1 is used to check the correct format of the input. This means that the first configuration is an initial configuration, the last configuration is an accepting configuration, the alphabet of the configurations alternates between non-overlined and overlined symbols, and every configuration itself is of a correct format. It should be noted that, due to the definition of VALC(M), every configuration has the same length and any two adjacent configurations differ on two adjacent positions only, disregarding the overlining of symbols and the merging into metasymbols. Now, the main task for M_1 is to check, starting with the third configuration, whether every non-overlined configuration is the successor configuration of its preceding overlined configuration. To this end, M_1 reads and ignores the first configuration while its pushdown store remains empty. Then, every next configuration (with overlined symbols) is read and every symbol is pushed onto the pushdown store. Then, the following configuration (with non-overlined symbols) is read and checked against the pushdown store.
If a metasymbol of the form [t, q, t′] is popped off or read in the input, M_1 checks within the next two transitions, by using its state set, whether the two input symbols encode a correct part of the successor configuration based on the two symbols popped off the pushdown store. If the check is positive, the computation continues; otherwise, the computation ends non-accepting. If neither the popped symbol nor the input symbol read is a metasymbol and both symbols are equal, the computation continues; if they differ, the computation ends non-accepting. Since all configurations have to have the same length, any pop operation on the empty pushdown store after the first configuration leads to a non-accepting end of the computation as well. Otherwise, the check of the next two configurations is started. In this way, the input is accepted if it is correctly formatted and every non-overlined configuration is the successor configuration of its preceding overlined configuration.

The construction of a DIDPDA M_2 with the same signature is similar, since it has to be checked whether every overlined configuration is the successor configuration of its preceding non-overlined configuration. To this end, the first configuration is read, which digs a hole of the shape of the configuration read. The next overlined configuration is read and checked against the hole in the pushdown store as follows: if a metasymbol of the form [t, q, t′] is seen at the bottom of the hole in the pushdown store or read in the input, M_2 checks within the next two transitions, by using its state set, whether the two input symbols encode a correct part of the successor configuration based on the two symbols which are pushed into the hole in the pushdown store. If the check is positive, the computation continues; otherwise, the computation ends non-accepting.
If neither the symbol seen at the bottom of the hole in the pushdown store nor the input symbol read is a metasymbol and both symbols are equal, the computation continues by pushing the adequate symbol into the hole in the pushdown store; if they differ, the computation ends non-accepting. Since all configurations have to have the same length, any push operation on the empty pushdown store leads to a non-accepting end of the computation as well. Otherwise, the check of the next two configurations is started. In this way, the input is accepted if every overlined configuration is the successor configuration of its preceding non-overlined configuration.
Altogether, the inputs that are accepted by M 1 and M 2 are those strings which are correctly formatted and where every non-overlined configuration is the successor configuration of its preceding overlined configuration as well as every overlined configuration is the successor configuration of its preceding non-overlined configuration, hence a valid computation of the given LBA.