RELATIONS OF CONTEXTUAL GRAMMARS WITH STRICTLY LOCALLY TESTABLE SELECTION LANGUAGES

We continue the research on the generative capacity of contextual grammars where contexts are adjoined around whole words (externally) or around subwords (internally) which belong to special regular selection languages. All languages generated by contextual grammars where all selection languages are elements of a certain subregular language family form again a language family. We investigate the computational capacity of contextual grammars with strictly locally testable selection languages and compare those families to families which are based on finite, monoidal, nilpotent, combinational, definite, suffix-closed, ordered, commutative, circular, non-counting, power-separating, or union-free languages. With these results, also an open problem regarding ordered and non-counting selection languages is solved.


Introduction
Contextual grammars were introduced by Solomon Marcus in [17] as a formal model that might be used for the generation of natural languages. The derivation steps consist of adjoining contexts to given sentences, starting from a finite set. A context is given by a pair (u, v) of words. The external adjoining to a word x gives the word uxv, and the internal adjoining gives all words $x_1 u x_2 v x_3$ with $x_1 x_2 x_3 = x$. Contextual grammars where the contexts are adjoined externally or internally are called external or internal contextual grammars, respectively. The two cases differ in two main respects in the way words are derived. First, in an internal contextual grammar, a context can possibly be inserted into a sentential form at more than one place, so the derivation becomes in some sense non-deterministic; in an external grammar, once a context has been selected, there is at most one way to insert it: wrapped around the sentential form, provided this word is in the selection language of the context. Second, if a context can be added internally, then it can be added arbitrarily often (because the subword around which the context is wrapped does not change), which does not necessarily hold for external grammars.
Given the linguistic motivation, it is natural to require that certain contexts can only be used if the subword x or $x_2$ (around which the contexts are to be adjoined) satisfies some condition. One possibility is to require that the word x or $x_2$ belongs to a language S, called the selection language, associated with the context. Usually, it is also required that the language S belongs to some language family F. The first investigations in this direction were presented in [15], where the language families of the Chomsky hierarchy were chosen as F; for further information, we refer to [15,19,22] and references therein.
For practical reasons, however, it is natural to consider 'simple' languages as selection languages or to restrict F to be a language family of 'simple' languages. Very often, subregular language families were used as an approach to simplification. Typical examples for such languages are finite, monoidal, nilpotent, combinational, definite, suffix-closed, commutative, circular, ordered, non-counting, power-separating, strictly locally (k-)testable, and union-free languages and languages which are defined by syntactic restrictions (as the families of languages which can be accepted by deterministic finite automata with at most n states or can be generated by right-linear grammars with at most n nonterminals/productions/symbols).
The study of contextual grammars with selection in special regular sets was started in [3] and continued in [4,7,8,28]. For the families defined by syntactic restrictions, in the external case as well as in the internal case, infinite hierarchies with respect to all parameters mentioned above were obtained. Moreover, the hierarchy of external contextual languages with selection by the subregular families mentioned above was almost completely determined. But for the internal derivation mode, the results concern only finite, monoidal, nilpotent, combinational, definite, suffix-closed, commutative, circular, and ordered selection languages.
A recent survey can be found in [28].
In the present paper, we first relate the families of strictly locally (k-)testable languages to some other families mentioned above.
For external contextual grammars with selections, we solve open problems regarding the relation between the families obtained by ordered, non-counting, and strictly locally 1-testable selection languages.
For internal contextual grammars, we investigate the impact of strictly locally testable selection languages on the generative capacity and compare it to those of the families which are based on finite, monoidal, nilpotent, combinational, definite, suffix-closed, ordered, commutative, circular, non-counting, power-separating, or union-free selection languages.
In the end, we mention some open problems. This paper is an extension of [9].

Preliminaries
After giving some notations used in this paper, we first recall the subregular families of languages under investigation and then recall the contextual grammars with external or internal language generating modes.
We assume that the reader is familiar with the basic concepts of the theory of automata and formal languages. For details, we refer to [22].
Given an alphabet V, we denote by $V^*$ and $V^+$ the set of all words and the set of all non-empty words over V, respectively. The empty word is denoted by λ. By $V^k$ and $V^{\le k}$ for some natural number k, we denote the set of all words over the alphabet V with exactly k letters and the set of all words over V with at most k letters, respectively. For a word w, we denote the length of w by |w|.
A deterministic finite automaton is a quintuple $A = (V, Z, z_0, F, \delta)$ where V is a non-empty, finite set, called the input alphabet, Z is a non-empty, finite set whose elements are called states, $z_0 \in Z$ is the so-called start state, $F \subseteq Z$ is a set whose elements are called accepting states, and δ is a transition mapping $\delta : Z \times V \to Z$. The transition mapping δ is extended to a function $\delta^* : Z \times V^* \to Z$ as follows: $\delta^*(z, \lambda) = z$ for all $z \in Z$ and $\delta^*(z, aw) = \delta^*(\delta(z, a), w)$ for $z \in Z$, $a \in V$, and $w \in V^*$. We denote the extended function also by δ.
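The extended transition function can be sketched directly from its recursive definition. The automaton below is a hypothetical example chosen for illustration (it is not taken from this paper); it accepts the words over {a, b} that contain exactly one letter b.

```python
from typing import Dict, Set, Tuple

# Hypothetical example automaton (an assumption for illustration only):
# it accepts the words over {a, b} with exactly one occurrence of b.
DELTA: Dict[Tuple[str, str], str] = {
    ("z0", "a"): "z0", ("z0", "b"): "z1",
    ("z1", "a"): "z1", ("z1", "b"): "z2",
    ("z2", "a"): "z2", ("z2", "b"): "z2",
}
START = "z0"
ACCEPTING: Set[str] = {"z1"}


def delta_star(state: str, word: str) -> str:
    """Extended transition function.

    delta*(z, lambda) = z and delta*(z, aw) = delta*(delta(z, a), w).
    """
    if word == "":
        return state
    return delta_star(DELTA[(state, word[0])], word[1:])


def accepts(word: str) -> bool:
    """A word is accepted if reading it from the start state ends in F."""
    return delta_star(START, word) in ACCEPTING
```

The recursion mirrors the two defining equations; for long inputs, an iterative loop over the letters would avoid Python's recursion limit.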

Definitions of Subregular Language Families
We consider the following restrictions for regular languages. Let L be a language over an alphabet V. We say that the language L, with respect to the alphabet V, is
- monoidal if and only if $L = V^*$,
- nilpotent if and only if it is finite or its complement $V^* \setminus L$ is finite,
- combinational if and only if it has the form $L = V^* X$ for some subset $X \subseteq V$,
- definite if and only if it can be represented in the form $L = A \cup V^* B$ where A and B are finite subsets of $V^*$,
- suffix-closed (or fully initial or multiple-entry language) if and only if, for any two words $x \in V^*$ and $y \in V^*$, the relation $xy \in L$ implies the relation $y \in L$,
- ordered if and only if the language is accepted by some deterministic finite automaton with an input alphabet V, a finite set Z of states, a start state $z_0 \in Z$, a set $F \subseteq Z$ of accepting states and a transition mapping δ where $(Z, \preceq)$ is a totally ordered set and, for any input symbol $a \in V$, the relation $z \preceq z'$ implies $\delta(z, a) \preceq \delta(z', a)$,
- commutative if and only if it contains with each word also all permutations of this word,
- circular if and only if it contains with each word also all circular shifts of this word,
- non-counting (or star-free) if and only if there is a natural number $k \ge 1$ such that, for any three words $x \in V^*$, $y \in V^*$, and $z \in V^*$, it holds $x y^k z \in L$ if and only if $x y^{k+1} z \in L$,
- power-separating if and only if there is a natural number $m \ge 1$ such that, for any word $x \in V^*$, either the equality $J_x^m \cap L = \emptyset$ or the inclusion $J_x^m \subseteq L$ holds where $J_x^m = \{ x^n \mid n \ge m \}$,
- union-free if and only if L can be described by a regular expression which is only built by product and star,
- strictly locally k-testable if and only if there are three subsets B, I, and E of $V^k$ such that any word $a_1 a_2 \dots a_n$ with $n \ge k$ and $a_i \in V$ for $1 \le i \le n$ belongs to the language L if and only if $a_1 a_2 \dots a_k \in B$, $a_{j+1} a_{j+2} \dots a_{j+k} \in I$ for $1 \le j \le n-k-1$, and $a_{n-k+1} a_{n-k+2} \dots a_n \in E$,
- strictly locally testable if and only if it is strictly locally k-testable for some natural number k.
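The condition for ordered languages refers to a concrete automaton together with a total order on its states. As a minimal sketch, the following code tests whether a given order makes a given transition table monotone and, by brute force, whether some order does. Both example automata are hypothetical and serve only as illustration. A negative answer concerns only the automaton at hand; it does not by itself show that the accepted language is not ordered.

```python
from itertools import permutations
from typing import Dict, List, Optional, Sequence, Tuple

Delta = Dict[Tuple[str, str], str]


def is_monotone(order: Sequence[str], alphabet: Sequence[str], delta: Delta) -> bool:
    """Check that z <= z' implies delta(z, a) <= delta(z', a) for all letters a."""
    rank = {z: i for i, z in enumerate(order)}
    return all(
        rank[delta[(z, a)]] <= rank[delta[(z2, a)]]
        for a in alphabet
        for i, z in enumerate(order)
        for z2 in order[i:]
    )


def find_order(states: Sequence[str], alphabet: Sequence[str], delta: Delta) -> Optional[List[str]]:
    """Try all total orders of the states (feasible only for small automata)."""
    for order in permutations(states):
        if is_monotone(order, alphabet, delta):
            return list(order)
    return None


ALPHABET = ["a", "b"]
# Hypothetical automaton 1: counts the letter b up to two.
ONE_B: Delta = {
    ("z0", "a"): "z0", ("z0", "b"): "z1",
    ("z1", "a"): "z1", ("z1", "b"): "z2",
    ("z2", "a"): "z2", ("z2", "b"): "z2",
}
# Hypothetical automaton 2: the letter a swaps the two states.
PARITY: Delta = {
    ("e", "a"): "o", ("o", "a"): "e",
    ("e", "b"): "e", ("o", "b"): "o",
}
```

For the first automaton the order z0, z1, z2 works; for the second one no order can work, because the letter a exchanges the two states.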
We remark that monoidal, nilpotent, combinational, definite, ordered, union-free, and strictly locally (k-)testable languages are regular, whereas non-regular languages of the other types mentioned above exist. Here, we consider among the commutative, circular, suffix-closed, non-counting, and power-separating languages only those which are also regular.
These languages are of interest because they often have very different characterizations. We mention here four examples.
A language is non-counting if and only if it can be obtained from the empty set and the languages consisting of the empty word or of letters of the alphabet by using union, concatenation, and complement (see [18]). A language is non-counting if and only if its syntactic monoid is aperiodic (Schützenberger's Theorem). A language is nilpotent if and only if its syntactic monoid is nilpotent. A language is suffix-closed if and only if it can be accepted by a finite automaton where all states serve as initial states (see [12]).
Moreover, the languages are sometimes of interest for 'practical' reasons. Again, we mention four facts.
To check membership of a definite language, it is sufficient to consider only suffixes of a certain length.
To check membership of a strictly locally k-testable language, it is sufficient to move a window of size k over the word.
A strictly locally testable language characterized by three finite sets B, I, and E as above which includes additionally a finite set F of words which are shorter than those of the sets B, I, and E is denoted by [B, I, E, F ].
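The window-based membership test for a language given as [B, I, E, F] can be sketched as follows. The function name and the example sets are assumptions made for illustration; the index range for the proper infixes follows the definition above (positions j with 1 ≤ j ≤ n − k − 1).

```python
from typing import Set


def slt_member(word: str, k: int, B: Set[str], I: Set[str], E: Set[str], F: Set[str]) -> bool:
    """Membership in the strictly locally k-testable language [B, I, E, F]."""
    n = len(word)
    if n < k:
        # Short words are listed explicitly in F.
        return word in F
    if word[:k] not in B or word[n - k:] not in E:
        return False
    # Proper infixes: windows that are neither the prefix nor the suffix.
    return all(word[j:j + k] in I for j in range(1, n - k))


# Hypothetical example with k = 1: all non-empty words over {a, b} starting with a.
STARTS_WITH_A = (1, {"a"}, {"a", "b"}, {"a", "b"}, set())
# [{aa}, {}, {aa}, {}] with k = 2 contains exactly the words aa and aaa.
TWO_OR_THREE_A = (2, {"aa"}, set(), {"aa"}, set())
```

The second example illustrates that a word of length exactly k has no proper infix, so only B and E are consulted for it.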
As the set of all families under consideration, we set

Hierarchy of Subregular Language Families
Many inclusion relations and incomparabilities between these families have been proved in the past, see [28] for a survey.We now insert the families of the strictly locally (k-)testable languages into the existing hierarchy.
The families of strictly locally k-testable languages form an infinite hierarchy of proper inclusions.This is shown in [21] with the witness languages From [18], we know the proper inclusion SLT ⊂ NC .In [4], the proper inclusions COMB ⊂ SLT 1 and DEF ⊂ SLT as well as the incomparability of each family SLT k for k ≥ 1 with the families FIN , NIL, and DEF were mentioned but not proved.This will be done in the sequel.We first give a witness language which will be useful in all these proofs.
Suppose, this language is definite.Then there are two finite subsets D s ⊂ {a, b} * and D e ⊂ {a, b} * such that The word ab k a belongs to the language L SLT 1,¬DEF but not to the subset D s due to its length.Hence, ab k a ∈ {a, b} * D e and also ab k a ∈ {a, b} + D e due to the length of the word.Then we have also b k+1 a ∈ {a, b} + D e and, therefore, which is a contradiction.Thus, L SLT 1,¬DEF / ∈ DEF .
The language L SLT 1 ,¬DEF is a witness language for the properness of the three inclusions stated in the following lemmas.
Proof. We first prove that MON is included in $SLT_1$. Let L be a monoidal language over an alphabet V. Then $L = V^*$. With [V, V, V, {λ}], we have a representation of the language L as a strictly locally 1-testable language. Hence, $MON \subseteq SLT_1$.
A witness language for the properness is the language L SLT 1,¬DEF which, according to Lemma 2.1, belongs to the class SLT 1 but not to DEF and not to MON because MON ⊆ DEF .Proof.We first prove that COMB is included in SLT 1 .Let L be a combinational language over an alphabet V .Then L = V * X for some subset X ⊆ V .With [V, V, X, ∅], we have a representation of the language L as a strictly locally 1-testable language.Hence, COMB ⊆ SLT 1 .
A witness language for the properness is the language L SLT 1,¬DEF which, according to Lemma 2.1, belongs to the class SLT 1 but not to DEF and not to COMB because COMB ⊆ DEF .Proof.We first prove DEF ⊆ SLT .Let L be a definite language over an alphabet V .Then L = D s ∪ V * D e for some finite subsets be the set of all words of L with a length smaller than k, -B = V k and I = V k the set of all words over the alphabet V with a length of k, -E = V * D e ∩ V k the set of all words of the set V * D e with length k, and L ′ be the strictly locally k-testable language represented by [B, I, E, F ]. We now prove that L = L ′ holds.
We first show $L \subseteq L'$. Let $w \in L$. If $|w| < k$, then $w \in F$ and, hence, $w \in L'$. Otherwise, $w \in V^* D_e$ and there are words $w_0$ and $w_1$ such that $w = w_1 w_0$ and $w_0 \in V^* D_e \cap V^k$ (the word $w_0$ is the suffix of w of length k). Every subword of w of length k belongs to the set $V^k$. Hence, the prefix of w of length k belongs to the set B, every proper infix of w of length k belongs to the set I, and the suffix $w_0$ belongs to the set E. Therefore, we have $w \in L'$ also in this case.
We now show Since L = L ′ and L ′ ∈ SLT k by construction, we also have that L ∈ SLT k and, thus, also L ∈ SLT .A witness language for the properness of the inclusion DEF ⊆ SLT is the language L SLT 1 ,¬DEF which, according to Lemma 2.1, belongs to the class SLT 1 and therefore also to SLT but not to DEF .
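The construction used for the inclusion DEF ⊆ SLT can be checked by brute force on small instances. The sketch below assumes that k is chosen as one more than the length of a longest word in D_s ∪ D_e; this choice is an assumption of the sketch, not a quotation of the proof. All function names and the example sets are hypothetical.

```python
from itertools import product
from typing import Iterator, Set, Tuple


def words_up_to(alphabet: Set[str], n: int) -> Iterator[str]:
    """All words over the alphabet with at most n letters."""
    for length in range(n + 1):
        for letters in product(sorted(alphabet), repeat=length):
            yield "".join(letters)


def definite_member(word: str, D_s: Set[str], D_e: Set[str]) -> bool:
    """Membership in the definite language D_s united with V* D_e."""
    return word in D_s or any(word.endswith(d) for d in D_e)


def definite_to_slt(alphabet: Set[str], D_s: Set[str], D_e: Set[str]) -> Tuple[int, Set[str], Set[str], Set[str], Set[str]]:
    """Build (k, B, I, E, F) with B = I = V^k, E = V* D_e within V^k, F = short words of L."""
    # Assumption: k exceeds the length of every word in D_s and D_e.
    k = max((len(w) for w in D_s | D_e), default=0) + 1
    V_k = {"".join(t) for t in product(sorted(alphabet), repeat=k)}
    F = {w for w in words_up_to(alphabet, k - 1) if definite_member(w, D_s, D_e)}
    E = {w for w in V_k if any(w.endswith(d) for d in D_e)}
    return k, V_k, set(V_k), E, F


def slt_member(word: str, k: int, B: Set[str], I: Set[str], E: Set[str], F: Set[str]) -> bool:
    """Membership in the strictly locally k-testable language [B, I, E, F]."""
    n = len(word)
    if n < k:
        return word in F
    if word[:k] not in B or word[n - k:] not in E:
        return False
    return all(word[j:j + k] in I for j in range(1, n - k))
```

Comparing both membership tests on all words up to a fixed length gives evidence for L = L′ on a concrete instance; it is of course no substitute for the proof.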
The language L SLT 1 ,¬DEF from Lemma 2.1 serves also partially for proving the incomparability of the families of the strictly locally k-testable languages with the families of the finite languages, of the nilpotent languages, and of the definite languages.Proof.Due to the inclusion relations, it suffices to show that there is a language in the class SLT 1 (which belongs also to each other family SLT k for k > 1) but which is not definite (and hence neither nilpotent nor finite) and that there are languages L k (for k ≥ 1) which are finite (and, hence, nilpotent and definite) but not strictly locally k-testable.
A language for the first case is L SLT 1,¬DEF from Lemma 2.1 since Languages for the other incomparabilities are L k = {a} k+1 for k ≥ 1.Every such language L k is finite.Let k be a natural number.Suppose that the language L k is also strictly locally k-testable.Then it is represented by [{a} k , ∅, {a} k , ∅].But then, we also have The incomparabilities of the families of the strictly locally (k-)testable languages to the families UF of the union-free languages, SUF of the suffix-closed languages, COMM of the commutative languages, and CIRC of the circular languages follow, due to the inclusion relations, from the incomparabilities of the classes UF , SUF , COMM , and CIRC to the classes COMB of the combinational languages and NC of the non-counting languages which were proved in [14].
Regarding the class ORD of the ordered languages, we have the relations below for which we use the witness languages given in the following lemmas.
Proof.The language L ORD,¬SLT is accepted by a deterministic finite automaton where the transition function is given by the following table (the order is z 0 ⪯ z 1 ⪯ z 2 , start state is z 0 , accepting state is z 1 ): Assume that L ORD,¬SLT = [B, I, E, F ] ∈ SLT k for some natural number k ≥ 1.Since a k ba 2k ∈ L ORD,¬SLT , we obtain a k ∈ B, a k−r ba r−1 ∈ I for 1 ≤ r ≤ k, a k ∈ I, and a k ∈ E. Consequently, we have a k ba k ba 2k ∈ L ORD,¬SLT which contradicts the structure of words in L ORD,¬SLT .Hence, L ORD,¬SLT / ∈ SLT .
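The closure argument in this proof can be illustrated mechanically: if a word w has the same prefix and suffix of length k as a word w′ and every proper infix of length k of w′ also occurs as a proper infix of w, then every strictly locally k-testable language that contains w also contains w′. The following sketch (function names are hypothetical) checks this sufficient condition for the two words of the proof.

```python
from typing import Set, Tuple


def k_profile(word: str, k: int) -> Tuple[str, Set[str], str]:
    """Prefix, set of proper infixes, and suffix of length k (requires len(word) >= k)."""
    n = len(word)
    if n < k:
        raise ValueError("word is shorter than the window size")
    infixes = {word[j:j + k] for j in range(1, n - k)}
    return word[:k], infixes, word[n - k:]


def forces(witness: str, other: str, k: int) -> bool:
    """Sufficient condition: every [B, I, E, F] of order k containing `witness` contains `other`."""
    p1, i1, s1 = k_profile(witness, k)
    p2, i2, s2 = k_profile(other, k)
    return p1 == p2 and s1 == s2 and i2 <= i1


def pair_from_proof(k: int) -> Tuple[str, str]:
    """The words a^k b a^(2k) and a^k b a^k b a^(2k) used in the proof."""
    a, b = "a", "b"
    return a * k + b + a * (2 * k), a * k + b + a * k + b + a * (2 * k)
```

The two letters b in the second word are separated by k letters a, so no window of length k contains both of them; this is why no new window appears.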
Proof.It is easy to see that Assume that the language L SLT 2,¬ORD is accepted by an ordered deterministic finite automaton A = ({a, b, c}, Z, z 0 , F, δ).For any word w ∈ {a, b, c} * , let [w] be the state of A after reading w: Since there are only finitely many states, there is a number Since there are only finitely many states, there is a number . By iteration, we obtain Since there are only finitely many states, there is a number . By iteration, we obtain Since there are only finitely many states, there is a number Hence, the language L SLT 2,¬ORD is not accepted by an ordered deterministic finite automaton: With the help of the languages from the Lemmas 2.6 and 2.7, we prove the following results.
Proof.We first prove the inclusion SLT 1 ⊆ ORD and show how a strictly locally 1-testable language L over an alphabet V can be accepted by an ordered automaton.For such a language L = [B, I, E, F ], we have B ⊆ V , I ⊆ V , E ⊆ V , and F ⊆ {λ}.We construct the following deterministic finite automaton: where and the transition function δ is given by the diagram in Figure 1 (z 1 is an accepting state if and only if λ ∈ F ).
In order to prove that L(A) = L, we first show that every word w ∈ L is accepted by A and then that every word w ∈ V * \ L is not accepted by A.
The properness of the inclusion follows from Lemma 2.6 with the witness language L ORD,¬SLT .
Lemma 2.9. The classes SLT and $SLT_k$ for $k \ge 2$ are incomparable to the class ORD.
Proof. From Lemma 2.6, we know that there is a language (namely $L_{ORD,\neg SLT}$) in ORD which does not belong to any class $SLT_k$ for $k \ge 2$ and, hence, also not to the class SLT. From Lemma 2.7, we know that there is a language (namely $L_{SLT_2,\neg ORD}$) in $SLT_2$ and, hence, also in $SLT_k$ for $k > 2$ and SLT which does not belong to the class ORD.
If we combine these results with those mentioned in [28], we obtain the following statement.
Theorem 2.10. The inclusion relations presented in Figure 2 hold. An arrow from an entry X to an entry Y depicts the proper inclusion X ⊂ Y; if two families are not connected by a directed path, then they are incomparable.
An edge label in Figure 2 refers to a paper or a lemma in the present paper where the respective inclusion is proved. The other inclusions are easy to see. Proofs for the incomparabilities which are not related to strictly locally testable languages can be found in [27].

Contextual grammars
Let F ∈ F be a family of languages. A contextual grammar with selection in F is a triple G = (V, P, A) where
- V denotes an alphabet,
- P is a finite set of pairs (S, C) with a language S over some subset U of the alphabet V which belongs to the family F with respect to the alphabet U and a finite set C of contexts (pairs of words over V), and
- A is a finite set of words over V.
The set V is called the basic alphabet; for a selection pair (S, C) ∈ P, the language S is called a selection language and the set C is called a set of contexts of the grammar G; the elements of A are called axioms.
We now define the derivation modes for contextual grammars with selection.
If the derivation mode is known from the context, we omit the index α. For a family L of languages, we denote by EC(L) and IC(L) the family of all languages generated externally and internally, respectively, by contextual grammars with selection in L (where all selection languages belong to the family L).
It follows from the definition that the subset relation is preserved under the use of contextual grammars: if we allow more, we do not obtain less.
Lemma 2.11. For any two language classes X and Y with X ⊆ Y, we have the inclusions EC(X) ⊆ EC(Y) and IC(X) ⊆ IC(Y).
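The two derivation modes can be sketched in code along the lines of the informal description in the introduction: externally, a context (u, v) is wrapped around the whole word x if x belongs to the selection language; internally, it is wrapped around any subword x_2 of x = x_1 x_2 x_3 that belongs to the selection language. In this sketch, selection languages are modelled as Python predicates, and the example grammar is hypothetical. Since adjoining never shortens a word, all generated words up to a length bound can be enumerated by discarding longer words.

```python
from typing import Callable, List, Set, Tuple

Context = Tuple[str, str]
Rule = Tuple[Callable[[str], bool], List[Context]]  # (selection predicate, contexts)
Step = Callable[[str, List[Rule]], Set[str]]


def external_step(word: str, rules: List[Rule]) -> Set[str]:
    """All words u + word + v where word lies in the selection language of (u, v)."""
    out: Set[str] = set()
    for in_selection, contexts in rules:
        if in_selection(word):
            out |= {u + word + v for u, v in contexts}
    return out


def internal_step(word: str, rules: List[Rule]) -> Set[str]:
    """All words x1 + u + x2 + v + x3 with word = x1 x2 x3 and x2 in the selection language."""
    out: Set[str] = set()
    n = len(word)
    for i in range(n + 1):
        for j in range(i, n + 1):
            x1, x2, x3 = word[:i], word[i:j], word[j:]
            for in_selection, contexts in rules:
                if in_selection(x2):
                    out |= {x1 + u + x2 + v + x3 for u, v in contexts}
    return out


def generate(axioms: Set[str], rules: List[Rule], step: Step, max_len: int) -> Set[str]:
    """All generated words of length at most max_len."""
    seen = {w for w in axioms if len(w) <= max_len}
    frontier = set(seen)
    while frontier:
        frontier = {
            w2 for w in frontier for w2 in step(w, rules) if len(w2) <= max_len
        } - seen
        seen |= frontier
    return seen


# Hypothetical grammar: monoidal selection language, context (a, b), axiom lambda.
EXAMPLE_RULES: List[Rule] = [(lambda w: True, [("a", "b")])]
```

With this example grammar, the external mode yields only words of the form a^n b^n, whereas the internal mode also yields words such as abab.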

External contextual grammars
When we speak about contextual grammars in this subsection, we mean contextual grammars with external derivation (also called external contextual grammars). A language of an external contextual grammar is a language which is externally generated.
In [3], contextual grammars were investigated where the selection languages are finite, monoidal, combinational, definite, nilpotent, commutative, or suffix-closed and a hierarchy of the language families generated was presented. In the papers [7,26], results on the power of external contextual grammars with circular, ordered, union-free, or definite selection languages are given. The language families generated by such systems were inserted into the hierarchy from [3]. Furthermore, subregular language families $F_n$ were considered and integrated which are obtained by restricting to n states, non-terminal symbols, or production rules to accept or to generate regular languages [28]. We consider here only subregular families defined by structural properties (not resources).
We now present a witness language to prove a proper inclusion and incomparabilities regarding ordered languages as selection languages.where the selection languages are ordered: For {a, b} * , only one state is needed; the other selection language is accepted by a deterministic finite automaton where the transition function is given by δ(z 0 , a) = z 0 and δ(z Assume that the language L ORD,¬SLT can be generated by a contextual grammar with strictly locally testable selection languages.The subset {c}{a} * {b}{a} * {c} of L ORD,¬SLT is infinite.Therefore, there is an infinite selection language S ⊆ {a} * {b}{a} * which is used to obtain words of the set {c}{a} * {b}{a} * {c}.Hence, to S belongs some context (u, v) with u ∈ {c}{a} * and v ∈ {a} * {c}.If S is a strictly locally k-testable language, then S = [B, I, E, F ] and a k ∈ B ∩ I ∩ E.Then, we have also a k ∈ S. Therefore, a word from the set {c}{a} * {c} is generated which does not belong to the language L ORD,¬SLT .This contradiction implies that L ORD,¬SLT / ∈ EC(SLT ).
We now prove the mentioned proper inclusion.
Many incomparability results have been published in [3,4,28]. The only open questions are whether the class EC(ORD) is incomparable to the classes EC(SLT) and $EC(SLT_k)$ for $k \ge 2$. With the following results, we close this gap.
From the axioms, we obtain by the first selection component all words of the sublanguage V + and by the second selection component all words of the sublanguage {e}L{e}.Hence, L SLT 2 ,¬ORD = L(G).
We now show that L SLT 2,¬ORD / ∈ EC(ORD).Suppose that there is a contextual grammar with L(G ′ ) = L SLT 2,¬ORD and the property that, in every selection component (S, C) ∈ P, the selection language S is accepted by an ordered deterministic finite automaton.Since L(G ′ ) = L SLT 2,¬ORD and the language L is infinite, the set P contains a selection component (S, C) where the context (e, e) is contained in the set C and S is a subset of L.
We now consider all and only those selection components where the context (e, e) is contained.Assume that for every such selection language S and every word p ∈ V * there are a natural number n ≥ 1 and a letter x ∈ {a, b, c} such that for every word w ∈ S the word pdx n is not a prefix of the word w.Then there are a selection language S ¬x1 (with x 1 ∈ V ) and n 1 ≥ 1 such that every word w ∈ S ¬x1 does not contain dx n1 1 as a prefix.Since every word of the language L occurs in some selection language, there is a selection language S x1,¬x2 (with x 2 ∈ V ) which contains a word with the prefix dx n1 1 .According to our assumption, there is a natural number n 2 ≥ 1 such that every word w ∈ S x1,¬x2 does not contain dx n1 1 dx n2 2 as a prefix.Further, there are a letter x 3 ∈ V , a selection language S x1,x2,¬x3 , and a natural number n 3 ≥ 1 such that dx n1 1 dx n2 2 occurs as a prefix in some word of S x1,x2,¬x3 but not dx n1 1 dx n2 2 dx n3 3 .From this argumentation follows that there are infinitely many selection languages which is a contradiction.Therefore, the assumption does not hold but the contrary: There are a selection language (for inserting the context (e, e)) and a word p ∈ V * such that for all n ≥ 1 and all letters x ∈ V there is a word w ∈ S such that pdx n is a prefix of the word w, more formally: Let S be such a selection language, p be such a word, and A = (V, Z, z 0 , F, δ) a deterministic finite automaton which accepts the language S and is ordered.Additionally, for every word w ∈ V * , let [w] = δ(z 0 , w) be the state of the automaton A after reading the word w.The states [pda], [pdb], and [pdc] are pairwise different as can be seen as follows: All the words pda n , pdb n , and pdc n with n ≥ 1 occur as prefixes of words in S.  .By iteration, we obtain Since there are only finitely many states, there is a number . By iteration, we obtain Since there are only finitely many states, there is a number . 
By iteration, we obtain Since there are only finitely many states, there is a number . By iteration, we obtain Since there are only finitely many states, there is a number Thus, the selection language S is not accepted by an ordered deterministic finite automaton which is a contradiction to the assumption that every selection language of the contextual grammar G ′ is ordered.Therefore, L SLT 2,¬ORD / ∈ EC(ORD) and together with the first part, the statement of the lemma is proved.
Lemma 3.4.The class EC(ORD) is incomparable to the classes EC(SLT ) and EC(SLT k ) for k ≥ 2.
Proof. Due to the inclusion relations, it suffices to show that there are languages From Lemma 3.3, we know for From Lemma 3.1, we have with
Figure 3. Hierarchy of language families by external contextual grammars with selection languages defined by structural properties. An edge label refers to the paper or lemma where the respective inclusion is proved.
Together with the previous results, we also close another open question, namely whether the inclusion EC(ORD) ⊆ EC(NC) is proper.
Lemma 3.5. We have the proper inclusion EC(ORD) ⊂ EC(NC).
Proof. The inclusion follows from the proper inclusion ORD ⊂ NC (see [24]) and Lemma 2.11. The properness follows from Lemma 3.3 with the witness language $L_{SLT_2,\neg ORD}$ which belongs to the class $EC(SLT_2)$ and, hence, also to the class EC(NC) (see [4]), but not to the class EC(ORD).
Summarizing, we have the following result.
Theorem 3.6. The inclusion relations presented in Figure 3 hold. An arrow from an entry X to an entry Y depicts the proper inclusion X ⊂ Y. If two families X and Y are not connected by a directed path, then X and Y are incomparable.

Internal contextual grammars
When we speak about contextual grammars in this subsection, we mean contextual grammars with internal derivation (also called internal contextual grammars). A language of an internal contextual grammar is a language which is internally generated.
In [8], such contextual grammars were investigated where the selection languages belong to families $F_n$ which are obtained by restriction to n states or n non-terminal symbols, productions, or symbols to accept or to generate regular languages and the effect of finite, monoidal, nilpotent, combinational, definite, ordered, regular commutative, regular circular, regular suffix-closed, and union-free selection languages on the generative capacity of internal contextual grammars was studied. We consider here only subregular families defined by structural properties (not resources).
In contrast to the external derivation mode, contextual grammars can internally apply a context infinitely often if it can be applied once. If a word contains a subword which belongs to a selection language, then the word obtained by inserting the context also contains a subword (namely the same as before) which belongs to this selection language. One consequence of this difference is that finite selection languages do not only yield finitely many words, as they do for contextual grammars working in the external mode. Another consequence is that 'outer' parts of a word do not have to be added at the end of the derivation process but can be produced at some time whereas 'inner' parts can be 'blown up' later. For this reason, the results obtained for external contextual grammars are not of much help here.
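A small experiment makes the first consequence concrete. The grammar below is a hypothetical example (it does not occur in this paper): it has the axiom ab, the finite selection language {ab}, and the single context (a, b). Externally, the context can be applied once only; internally, the subword ab survives every insertion, so the context can be applied again and again.

```python
from typing import Callable, Set, Tuple

Context = Tuple[str, str]
Step = Callable[[str, Set[str], Context], Set[str]]


def external_step(word: str, selection: Set[str], context: Context) -> Set[str]:
    """Wrap the context around the whole word if the word is in the selection language."""
    u, v = context
    return {u + word + v} if word in selection else set()


def internal_step(word: str, selection: Set[str], context: Context) -> Set[str]:
    """Wrap the context around every subword that is in the selection language."""
    u, v = context
    out: Set[str] = set()
    for i in range(len(word) + 1):
        for j in range(i, len(word) + 1):
            if word[i:j] in selection:
                out.add(word[:i] + u + word[i:j] + v + word[j:])
    return out


def closure(axiom: str, step: Step, selection: Set[str], context: Context, max_len: int) -> Set[str]:
    """All words of length at most max_len derivable from the axiom (axiom assumed short enough)."""
    seen = {axiom}
    frontier = {axiom}
    while frontier:
        frontier = {
            w2 for w in frontier for w2 in step(w, selection, context) if len(w2) <= max_len
        } - seen
        seen |= frontier
    return seen


# Hypothetical grammar with a finite selection language.
AXIOM, SELECTION, CONTEXT = "ab", {"ab"}, ("a", "b")
```

Up to length 10, the external closure is {ab, aabb}, while the internal closure contains a^n b^n for every n from 1 to 5.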
According to Lemma 2.11, we have the inclusion IC(X) ⊆ IC(Y ) whenever we have the proper inclusion X ⊂ Y for two families of languages X and Y .
We now present witness languages for proving the properness of the inclusions and IC(DEF ) ⊂ IC(SLT ) ⊂ IC(NC ).Assume that L = L(G) for some internal contextual grammar G with combinational selection languages.Then, for sufficiently large n (which is larger than the sum of the longest length of axioms in G and the maximum of |uv| for contexts (u, v) of G), we have a derivation x =⇒ ac n bd n .Because x ∈ L holds, the used context (α, β) contains no letter a and no letter b (otherwise, we can produce a word with more than two occurrences of a or b), we have x = ac m bd m , α = c n−m , β = d n−m , and the context is wrapped around a subword c t bd s for some numbers t and s with m ≥ t ≥ 0, m ≥ s ≥ 0. Since the selection language C is combinational, we get ac m bd s ∈ C by c t bd s ∈ C. Therefore, we have the derivation x = ac m bd s d m−s =⇒ αac m bd s βd m−s = c n−m ac m bd n , i. e., we can derive a word not in L. Thus, L / ∈ IC(COMB ).
Lemma 3.8.Let n be a natural number with n ≥ 2 and Proof.Let n be a natural number with n ≥ 2 and L n the language mentioned in the claim.The language L n is generated by the contextual grammar with a finite selection language.
The language L n is not generated by a contextual grammar where all selection languages belong to the family SLT n−1 .Assume the contrary.Since the subset { a m b 2n c m | m ≥ n } of L n is infinite, there is a selection language S = [B, I, E, F ] used with a word a p b 2n c q for two natural numbers p ≥ 0 and q ≥ 0. As S ∈ SLT n−1 , we have b n−1 ∈ I. Then also the word a p b n c q belongs to the selection language which is a subword of the word a n−1 b n c n−1 ∈ L n .Hence, another word with exactly n letters b would be generated which is a contradiction to the form of the words in the language L n .Lemma 3.9.
Proof.The language L can be generated by the contextual grammar Assume that the language L can be generated by a contextual grammar G ′ with definite selection languages.Let S i = A i ∪ V * B i for 1 ≤ i ≤ q be the selection languages of G ′ .Further, let Since the language L is infinite and the number of the letters a and b are unbounded in its words, there is a word a r b s c r d s ∈ L with r ≥ p and s ≥ p such that from this word another one is generated.Hence, there is a selection language S i with 1 ≤ i ≤ q which contains a word which is a subword of a r b s c r d s .This word is a r ′ b s c r ′′ with 1 ≤ r ′ ≤ r and 1 ≤ r ′′ ≤ r or b s ′ c r d s ′′ with 1 ≤ s ′ ≤ s and 1 ≤ s ′′ ≤ s in order to maintain the form of the words of the language.Since S i is definite and s − 1 + r ′′ ≥ p or r − 1 + s ′′ ≥ p, the word b s−1 c r ′′ or c r−1 d s ′′ also belongs to the selection language S i .But then a letter a would be inserted inside the b-block or a letter b would be inserted inside the c-block.In both cases, a word would be generated which does not belong to the language L. Therefore, the language L cannot be generated by a contextual grammar with definite selection languages.
Proof.The language L can be generated by the contextual grammar ({a, b} , {({a} * {b}{a} * {b}{a} * {b}{a} * , {(a, a)})} , {abababaabababa}) where the selection language is ordered since it is accepted by the deterministic finite automaton shown below where the transition function is given by the table next to it (the order is z 0 ⪯ z 1 ⪯ z 2 ⪯ z 3 ⪯ z 4 , the start state is z 0 , the accepting state is z 3 ): Assume that the language L can be generated by a contextual grammar with strictly locally testable selection languages.The length of each a-block is unbounded.Therefore, there is an infinite selection language S ⊆ {a} * {b}{a} * {b}{a} * {b}{a} * used where the lengths of the a-blocks between the letters b are unbounded (otherwise, there would be a maximal length of one of the a-blocks in the words of the language L) and which has a context (a l , a l ) associated to it (otherwise, a word would be generated which has not the required form of the words of the language L).If S is a strictly locally k-testable language, then it contains with a word a q ba r ba s ba t with q ≥ 0, r ≥ k, s ≥ k, and t ≥ 0 also the word a q ba r ba s ba s ba t .Adding the context (a l , a l ) around such a subword of a word of L would yield a word which does not belong to the language L (a word with a wrong format).This contradiction implies that L / ∈ IC(SLT ).
We now prove the proper inclusions mentioned above.
Proof. The inclusion $IC(SLT_1) \subseteq IC(ORD)$ follows from Lemmas 2.8 and 2.11. A witness language for the properness is the language $L \in IC(ORD) \setminus IC(SLT)$ from Lemma 3.10.
The incomparabilities of the families IC(COMM) and IC(CIRC) with the families $IC(SLT_k)$ for $k \ge 1$ and IC(SLT) follow from the incomparabilities of the sets IC(COMM) and IC(CIRC) with the sets IC(COMB) and IC(NC) shown in [28], since $IC(COMB) \subseteq IC(SLT_k) \subseteq IC(SLT) \subseteq IC(NC)$ holds for $k \ge 1$.

Regarding IC(SUF), we know from [8] that there is a language in the set $IC(COMB) \setminus IC(SUF)$ which also belongs to each set $IC(SLT_k)$ for $k \ge 1$ and to IC(SLT) due to the inclusion relations. However, it is still open whether there is a language in the set $IC(SUF) \setminus IC(NC)$ (which would not belong to subsets of IC(NC) either). So, we cannot use the same method as for the classes IC(COMM) and IC(CIRC).
In the sequel, we show that, for every number $k \ge 1$, there is a language which belongs to the set IC(SUF) but not to $IC(SLT_k)$. We first note that the internal contextual grammar $(\{c, d\}, \{(\{c, d\}^*, \{(c, d)\})\}, \{\lambda\})$ generates the Dyck language $D$ over $\{c, d\}$. For $k \ge 1$, we set

and define $K_k$ as the language obtained from $K''_k$ by inserting a word of $D$ at any position.

Lemma 3.13. For all $k \ge 1$, we have $K_k \in IC(SUF) \setminus IC(SLT_k)$.
Proof. The internal contextual grammar

with a suffix-closed selection language generates the language $K_k$. Thus, we have $K_k \in IC(SUF)$.

Assume that $K_k = L(G)$ for some internal contextual grammar $G$ where all selection languages are strictly locally $k$-testable. Let $n$ be sufficiently large. Then there is a derivation $x \Longrightarrow c^n a^{k+1} b d^n \in K_k$. Since $x \in K_k$ holds, the used context $(\alpha, \beta)$ contains no letter $a$ and no letter $b$. We have $x = c^m a^{k+1} b d^m$, $\alpha = c^{n-m}$, $\beta = d^{n-m}$, and the context is wrapped around a subword $c^t a^{k+1} b d^s$ for some numbers $t$ and $s$ with $m \ge t \ge 0$ and $m \ge s \ge 0$.

Let $t \ge k$. Since the strictly locally $k$-testable selection language $C$ which is used contains the word $c^t a^{k+1} b d^s$, it also contains the word $c^k a^{3k} b d^s$. Analogously, if $s \ge k$, the selection language $C$ also contains the word $c^t a^{3k} b d^k$. Let $k' = \min\{k, t\}$ and $k'' = \min\{k, s\}$. Then we have $c^{k'} a^{3k} b d^{k''} \in C$. Hence, we have the derivation

which produces a word not in $K_k$. Therefore, $K_k \notin IC(SLT_k)$.
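The claim that the internal contextual grammar $(\{c, d\}, \{(\{c, d\}^*, \{(c, d)\})\}, \{\lambda\})$ generates the Dyck language $D$ can be illustrated by a bounded simulation. The Python sketch below is not part of the paper; the function names `internal_step`, `generate`, and `is_dyck` are hypothetical, and the check only covers words up to a fixed length, so it supports the claim without proving it.

```python
from itertools import product

def internal_step(word, context, in_selection):
    """All words x1 u x2 v x3 with x1 x2 x3 = word and x2 in the selection language."""
    u, v = context
    results = set()
    n = len(word)
    for i in range(n + 1):
        for j in range(i, n + 1):
            if in_selection(word[i:j]):
                results.add(word[:i] + u + word[i:j] + v + word[j:])
    return results

def generate(axioms, rules, max_len):
    """All words of length <= max_len derivable from the axioms.

    `rules` is a list of pairs (membership test, list of contexts).
    Terminates because every context used here is non-empty.
    """
    seen = {w for w in axioms if len(w) <= max_len}
    frontier = set(seen)
    while frontier:
        new = set()
        for w in frontier:
            for in_selection, contexts in rules:
                for ctx in contexts:
                    for x in internal_step(w, ctx, in_selection):
                        if len(x) <= max_len and x not in seen:
                            new.add(x)
        seen |= new
        frontier = new
    return seen

def is_dyck(word):
    """Balanced words over {c, d}, with c as opening and d as closing letter."""
    depth = 0
    for ch in word:
        depth += 1 if ch == "c" else -1
        if depth < 0:
            return False
    return depth == 0

# Selection language {c, d}*, single context (c, d), axiom: the empty word.
rules = [(lambda w: set(w) <= {"c", "d"}, [("c", "d")])]
generated = generate({""}, rules, 6)

# Brute-force enumeration of all Dyck words of length at most 6.
expected = {"".join(p) for n in range(7)
            for p in product("cd", repeat=n) if is_dyck("".join(p))}

print(generated == expected)
```

Because the selection language is $\{c, d\}^*$, every subword (including the empty one) may be wrapped, so each step inserts a letter $c$ at some position and a letter $d$ at some later position.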
From [18] and by Lemma 2.11, we know the inclusion $IC(SLT) \subseteq IC(NC)$; from [28], we have the relation $IC(ORD) \subseteq IC(NC)$. Here, we have shown with Lemma 3.10 that there is a language in the family IC(ORD) which does not belong to the family IC(SLT). The question whether the family IC(SLT) is a proper subset of the family IC(ORD) or whether these two families are incomparable is left open. Summarizing, we have the following result.
Theorem 3.16. The inclusion relations presented in Figure 4 hold. An arrow from an entry $X$ to an entry $Y$ depicts the proper inclusion $X \subset Y$; the dashed arrow from IC(ORD) to IC(NC) indicates that it is not known so far whether the inclusion is proper or whether equality holds. If two families are not connected by a directed path, then they are incomparable with the exception of the family IC(SUF) and the families IC(ORD), IC(NC), and IC(SLT) where $IC(ORD) \not\subseteq IC(SUF)$, $IC(NC) \not\subseteq IC(SUF)$, and $IC(SLT) \not\subseteq IC(SUF)$ hold, and with the exception of the family IC(ORD) and the families $IC(SLT_k)$ for $k \ge 2$ and IC(SLT), where $IC(ORD) \not\subseteq IC(SLT_k)$ for $k \ge 2$ and $IC(ORD) \not\subseteq IC(SLT)$ hold.

Conclusions
The inclusion relations obtained for the families of languages generated by external or internal contextual grammars are in most cases the same as for the families where the selection languages are taken from.
For further research, the open questions already mentioned should be considered: What is the relation between the families IC(SLT) and IC(ORD); in particular, is there a language in the set $IC(SLT_2) \setminus IC(ORD)$? Is the family IC(SUF) incomparable to the family IC(SLT) or is it a proper subset? Additionally, it remains to investigate the relations of the family IC(SUF) to the families IC(ORD) and IC(NC).
In [28], two independent hierarchies have been obtained for each type of contextual grammars, one based on selection languages defined by structural properties (as considered in the present paper), the other based on resources (number of non-terminal symbols, production rules, or states). These hierarchies should be merged.
The families of languages which are locally ($k$-)testable (not necessarily in the strict sense) are the Boolean closure of the families in the strict sense. For contextual grammars where the selection languages are intersections or unions of strictly locally ($k$-)testable languages, nothing has to be done since the classes $SLT_k$ for $k \ge 1$ and SLT are closed under intersection and, for union in a selection pair $(S_1 \cup S_2, C)$, one can take several selection pairs $(S_1, C)$, $(S_2, C)$ instead. It remains to investigate the impact of locally ($k$-)testable selection languages which are the complement of a strictly locally ($k$-)testable language.
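The closure of $SLT_k$ under intersection can be made concrete with a small sketch. It rests on an assumption that may differ in detail from the definition used in this paper: a strictly locally $k$-testable language is described by sets $B$, $I$, $E$ of words of length $k$ (allowed prefixes, factors, and suffixes) together with a finite set $F$ of shorter words. The class name `SLTk` and the function `intersect` are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class SLTk:
    """Assumed representation of a strictly locally k-testable language."""
    k: int
    B: frozenset  # allowed prefixes of length k
    I: frozenset  # allowed factors of length k
    E: frozenset  # allowed suffixes of length k
    F: frozenset  # words shorter than k that belong to the language

    def __contains__(self, w):
        k = self.k
        if len(w) < k:
            return w in self.F
        factors = {w[i:i + k] for i in range(len(w) - k + 1)}
        return (w[:k] in self.B
                and w[len(w) - k:] in self.E
                and factors <= self.I)

def intersect(x, y):
    """Intersect the defining sets componentwise (same k required)."""
    if x.k != y.k:
        raise ValueError("both languages must use the same k")
    return SLTk(x.k, x.B & y.B, x.I & y.I, x.E & y.E, x.F & y.F)

# Two example languages over {a, b} with k = 2 (chosen for illustration only).
x = SLTk(2, frozenset({"aa", "ab"}), frozenset({"aa", "ab", "ba"}),
         frozenset({"aa", "ab", "ba"}), frozenset({"", "a"}))
y = SLTk(2, frozenset({"ab", "ba"}), frozenset({"aa", "ab", "ba"}),
         frozenset({"ba", "aa"}), frozenset({"a", "b"}))
both = intersect(x, y)

# Exhaustive comparison on all words over {a, b} up to length 6.
words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
agree = all((w in both) == (w in x and w in y) for w in words)
print(agree)
```

Under this representation, a word satisfies both descriptions exactly when its prefix, factors, and suffix lie in the componentwise intersections, which is why no new selection languages are needed for intersections.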
Additionally, other subfamilies of regular languages could be taken into consideration. Recently, external contextual grammars have been investigated where the selection languages are ideals or codes [5,6]. This research could be extended to internal contextual grammars with ideals or codes as selection languages.

Lemma 2.5. The classes $SLT_k$ for $k \ge 1$ are incomparable to the classes FIN, NIL, and DEF.

Lemma 3.14. The families $IC(SLT_k)$ for $k \ge 1$ are incomparable to the family IC(SUF).

It is left open whether the family IC(SUF) is also incomparable to the family IC(SLT) or whether it is a proper subset (since we know already the relation $IC(SLT) \setminus IC(SUF) \ne \emptyset$). Now we investigate the relations of the families IC(FIN), IC(NIL), and IC(DEF) to the families $IC(SLT_k)$ for $k \ge 1$ as well as the relation of the family IC(SLT) to the family IC(ORD).

Figure 4. Hierarchy of language families by internal contextual grammars with selection languages defined by structural properties. An edge label refers to the paper where the respective inclusion is proved.
The state [pda] differs from the states [pdb] and [pdc] because there is a word $aw$ such that $[pdaaw] \in F$ but $[pdbaw] \notin F$

Due to the structure of the language $L$, the other possibilities (permutations of the letters $a$, $b$, and $c$) do not need to be considered. The state [pd] differs from each of the states [pda], [pdb], and [pdc] because there is a word $cw$ such that $[pdcw] \in F$ but $[pdacw] \notin F$ and $[pdbcw] \notin F$ and there is a word $aw$ such that $[pdaw] \in F$ but $[pdcaw] \notin F$. We now investigate all possibilities for the position of the state [pd] in the order of states (this is similar to the case distinction in the proof of Lemma 2.7).