SYNTHESIZING FUZZY TREE AUTOMATA

Fuzzy tree automata are mathematical devices for modeling and analyzing vaguely defined tree structures. The behavior of a fuzzy tree automaton generates a fuzzy tree language by mapping a set of regular trees on a ranked alphabet to fuzzy membership values. It calculates the membership grade of trees using a set of rules that process their structural characteristics. This paper deals with constructing fuzzy tree automata models whose behavior satisfies a set of given logical propositions (called properties) on the structure of trees. Our goal is uncertainty modeling by synthesizing fuzzy tree automata whose behavior is described by fuzzy linguistic variables. In this regard, we first provide several patterns and heuristic techniques for constructing fuzzy tree automata that satisfy simple properties. Then, we develop a method for modeling complex propositional formulas based on the conversion of a logical formula into a computation tree, as well as a step-by-step combination of models.


Introduction
Studies of the behavior of complex systems have led to the development of various computational models. Different types of modeling tools make different kinds of analysis available. One powerful computational model is the automaton [12,20,38], an abstract machine with a set of transition rules. The transition rules enable the automaton to move through a series of configurations. Each configuration is specified using a set of symbols called states. Some states are called final states: if the automaton enters one of these states after processing, the input is accepted; otherwise, it is rejected.
Modeling uncertainty has led to the development of fuzzy automata [36], a class of automata with fuzzy transition rules and fuzzy configurations. A fuzzy automaton calculates a fuzzy grade of acceptance for each input term. Likewise, Fuzzy Tree Automata (FTA) [9,13,14,31] deal with the logic of computation by processing tree structures and calculating the grade of acceptance for each input tree based on its structural characteristics. A Fuzzy Tree Language (FTL) is a fuzzy set of trees that is recognized by an FTA [4,9,32,33,36].
An FTA calculates the membership grade of trees in its language by a nonlinear mapping known as the behavior of the automaton [31]. The analysis of the behavior of automata [3,32,33] involves the analysis of the states of automata and how these states change. Conversely, modeling an FTL is the task of deriving the states and transition rules of an FTA that accepts it. In this paper, we refer to this task as the synthesis of FTA. The FTL to be modeled can be expressed explicitly by formal terms [4] or implicitly by some specifications [23,47]. Here, we consider languages expressed by implicit descriptions through logical propositions.
The process of realizing an implementation from a specification (constructing a model from specified properties) is a complicated task, especially when the properties are expressed vaguely and imprecisely. This paper provides a series of practical techniques in the form of patterns that can be used in approximating an FTA to model an FTL expressed by linguistic variables [49,50]. We deal with this problem in three main steps. The first step is modeling atomic propositions, which are simple statements or assertions on trees. We provide several patterns for modeling atomic propositions that are either true or false (nonfuzzy atomic propositions), as well as those that are not precisely defined and have a degree of truth (fuzzy atomic propositions). The second step is modeling the combination of two atomic propositions by the logical connectives ∨, ∧, and ¬. In this step, we improve the model by defining multi-index states, refactoring transition rules, and developing final state membership functions by logical connectives, to show how combining two models can support the combination of the corresponding properties. The final step is modeling a set of more complex properties that express an FTL. To construct an FTA corresponding to all of these properties, we first organize them into an expression tree [42] and then employ a bottom-up construction method.
The modeling patterns and techniques presented in this article are useful in three ways. First, each pattern is related to a family of properties and can be reused for similar modeling tasks. Second, the proposed multi-indexing techniques for FTA integration can be used for automata combination in general. Third, a systematic approach to the development of complex models is provided.
This paper is organized as follows. Section 2 addresses some of the works related to this research, and Section 3 includes the basic concepts of fuzzy sets, ranked alphabets, and FTA. The proposed methods and techniques for modeling some patterns of FTL are presented in Section 4. Finally, Section 5 summarizes the achievements of this research.

Related work
The synthesis and analysis of systems by computational models are usually associated with uncertainty modeling and approximate reasoning [2,10]. Fuzzy logic is capable of overcoming vagueness, imprecision and the absence of concrete data in order to generate robust models that drive decisions under uncertainty [41].
Researchers in various fields have defined and developed different fuzzy models to perform computation and analysis. Fuzzy automata are computational models that can describe practical processes [27]. Ying and Lin [46] used fuzzy automata as the model of fuzzy discrete event systems and developed online learning algorithms to adjust their parameters. They applied stochastic gradient descent to optimize the transition matrix. Also, Du and Zhu's research on the analysis of social networks by fuzzy automata led to the definition of a fuzzy relational structure and to deducing knowledge by computing lower and upper approximations based on bisimulation [8]. The bisimulation relation models equivalence between states of nondeterministic automata [45]. An equivalence relation between the states of one automaton is used to minimize the automaton, and an equivalence relation between the states of two automata is usually used to compare their behaviors [13,14,32].
The challenge of modeling hybrid automata is addressed in [28]. The idea is to construct the model by processing a set of unlabeled data belonging to the language of the model. The authors clustered the data to obtain language subclasses, each labeled with a state symbol, and then modeled the transition rules by searching for binary classification boundaries. Soto et al. [40] also focused on creating models from experimental data. They synthesized linear hybrid automata and developed an adaptive method for optimizing the model to minimize the number of modes. They also developed another adaptive algorithm to optimize the model by increasing its precision. There are other works on the training of hybrid automata by analysis of input/output traces and machine learning techniques (see [25,29]).
A cellular automaton (CA) is a computational model for systems in which changes occur at any point (cell) based on the status or history of its neighboring points [6,30]. A CA consists of a collection of cells arranged in a multidimensional space and a dynamical rule (transition function) that updates their configuration synchronously. The transition function calculates the state of each cell based on the states of its neighbors, and these simple local interactions and computations between cells form a complex global behavior [30]. The set of transition rules of a CA is a type of knowledge that Li and Yeh [26] reconstructed by applying data mining techniques. They constructed a CA model for geographical phenomena by processing a series of spatial data, including the layers of urban development, proximity variables, neighborhood conditions, and physical attributes. Also, He et al. [18] used deep learning techniques to extract the transition rules of a CA and constructed a prediction model of the urban expansion pattern. They trained a convolutional neural network using geographical data of the study area. Roodposhti et al. [39] provided a dictionary of trusted rules (called DoTRules) to calculate transition potential in CA models of land-use/cover change. DoTRules supports generating transparent transition rules and quantifies uncertainty by estimating the frequency and entropy of major land-use transitions.
The works mentioned above can be categorized in the area of active learning of automata models [21], which is the inference of models from observations. This paper aims to provide some synthesis patterns for FTA models corresponding to FTL that are described not by formal terms but by fuzzy linguistic variables. Fuzzy linguistic variables are powerful tools for the approximate characterization of concepts that are not defined well enough to be described in quantitative terms [22,49,50]. They can be used in the fuzzy modeling of requirements to obtain a method for fuzzy intelligent requirement engineering from natural language to computer-aided design models [11]. Linguistic variables are also employed to develop analytical tools for data mining and time series prediction [15]. Of course, linguistic variables and fuzzy membership functions require systematic methods for their definition and evaluation [37].

Fuzzy sets
The basic concepts used here are those of Zadeh's fuzzy logic and fuzzy set theory [16,48,51]. Throughout the article, we use the unit interval [0, 1] ⊂ R.
Definition 3.1. Let X be a collection of elements or objects. A fuzzy set A in X is a set of ordered pairs A = {(x, µ_A(x)) | x ∈ X}, where µ_A(x) is called the membership function or grade of membership of x in A and (generally) lies in [0, 1]. We denote by X the set of all fuzzy sets on X. Also, a fuzzy number refers to the fuzzy set representing the possible values for a number.

Ranked alphabet and fuzzy tree Automata
The definitions in this section are borrowed from [7,32,33]. The set of natural numbers (including zero) is denoted by N. An alphabet Σ is a finite and nonempty set of symbols. A ranked alphabet is a couple (Σ, Arity), where Σ is the disjoint union of the sets of n-ary symbols Σ_n = {σ | Arity(σ) = n} for all n ∈ N. The symbols of arity 0 are called constants.
Let Q be a set of constants called variables. The set of trees over Σ and Q is denoted by T_Σ(Q), and we write T_Σ for T_Σ(∅). The size and the height of a tree t ∈ T_Σ(Q), denoted by S(t) and H(t) respectively, are inductively defined by
-S(t) = 0 and H(t) = 0 if t ∈ Q,
-S(t) = 1 and H(t) = 1 if t ∈ Σ_0,
-S(t) = 1 + S(t_1) + . . . + S(t_n) and H(t) = 1 + max{H(t_1), . . ., H(t_n)} if t = σ(t_1, . . ., t_n) with σ ∈ Σ_n.
Definition 3.2. A fuzzy tree automaton is a system M = (Σ, Q, Γ, δ, , ρ, β), where:
1. Σ is a finite ranked alphabet, whose elements are called input symbols.
2. Q is a finite set of symbols called states.
3. Γ : Q → [0, 1] is a fuzzy set on Q, called the set of final states.

Synthesizing FTA corresponding to a linguistically described FTL
Let an FTL be described by a set of logical statements, some of which may be ambiguous. A logical statement is called a property in general, and an atomic proposition when it is simple and not composed of other statements [5]. Our goal is to define an FTA that recognizes this language as accurately as possible. In this regard, we provide a number of patterns, each of which includes three main parts: a generic title, a property (FTL) and the corresponding model (FTA). The patterns presented in this section are defined based on some structural patterns on XML data [1,17,19,24,34,35,43,44].

Modeling atomic propositions
An atomic proposition is a sentence that has a truth value and cannot be decomposed into simpler sentences. In this section, we present some patterns of atomic propositions and the tricks required for modeling them by FTA. The patterns include fuzzy/nonfuzzy restrictions on the height and size of trees, fuzzy/nonfuzzy symbol counting, symbol detection, and tree structure pattern recognition.

Pattern 4.1. Threshold on the height of trees
Property: "The height of the tree is less than H".
In Model 4.1, the processing of constant symbols results in state q_1. For nodes that have children, given that i is the largest state index of the children, processing results in q_{i+1} if i < H and in q_H otherwise. Each tree is processed bottom-up (from leaves to root): a node can be processed as soon as it has no children or all of its children have been processed. Here, q_h represents the state of nodes at the root of trees whose height is h. For trees whose height is greater than or equal to H, the desired condition is violated, and for all of them we use state q_H with Γ(q_H) = 0. Figure 1 shows an example of the step-by-step running of two different FTA derived from Model 4.1 on the same tree structure. In Figure 1(a), states q_1 to q_4 are generated in steps 1 to 4, respectively. In Figure 1(b), after entering state q_2, the desired condition is violated, no new states are generated at parent nodes, and this state is repeated up to the root of the tree.
Model: Let Ĥ be the fuzzy number denoting approximately H and U ∈ N be the smallest number that for any [...] δ(q_i, q_{min{i+1,U}}, g) = 1 and δ(q_i, q_j, q_{min{max{i,j}+1,U}}, f) = 1 for i, j ≤ U.
Model 4.2 is similar to Model 4.1 except for the final states Γ. Here, the value of approximately H is considered as the fuzzy number Ĥ, and state symbol q_h is intended for the root of trees of height h. So, the membership value Γ(q_h) is set according to the membership value µ_Ĥ(h). For example, let H = 7 and M be an FTA constructed based on Model 4.2, where "approximately 7" is defined by the triangular fuzzy number 7̂ = T(3, 7, 11). Now, for the tree t = f(a, g(f(b, a))) shown in Figure 1, we have L_M(t) = 0.25, because processing t with M leads to state q_4 and µ_7̂(4) = (4 − 3)/(7 − 3) = 0.25.

Pattern 4.3. Threshold on the size of trees
Property: "The size of the tree is greater than S".
In Model 4.3, q_i is the state symbol for the root of trees with size i, where i ≤ S. For trees whose size is greater than S, the desired condition is reached, and we use state q_{S+1} with Γ(q_{S+1}) = 1 for them. Figure 2 shows an example of the step-by-step running of two different FTA derived from Model 4.3 on the same tree structure. In Figure 2(a), S ≥ 5, which implies {q_1, q_2, . . ., q_6} ⊆ Q. So, states q_1, q_3, q_4 and q_6 are generated in steps 1 to 4, respectively. In Figure 2(b), Q = {q_1, q_2, q_3}; after entering state q_3, the desired condition is satisfied, no new state is generated at parent nodes, and this state is repeated up to the root of the tree.
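The bottom-up runs of Models 4.1 to 4.3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the tuple encoding of trees, all function names, and the helper `triangular` are introduced here and are not part of the models. In addition, U is taken to be the upper bound of the support of Ĥ (11 for T(3, 7, 11)), and every state q_i with i < H is taken to be final in Model 4.1.

```python
# Trees are nested tuples: ("f", left, right), ("g", child), ("a",), ("b",).
# This encoding and all function names are illustrative assumptions.

def height_state(tree, cap):
    """Index i of the state q_i reached at the root (Models 4.1 and 4.2).

    Leaves reach q_1; an inner node reaches q_{min(i + 1, cap)}, where i is
    the largest state index among its children.
    """
    _symbol, *children = tree
    if not children:
        return min(1, cap)
    return min(max(height_state(c, cap) for c in children) + 1, cap)


def size_state(tree, s):
    """Index i of the state q_i reached at the root (Model 4.3).

    q_i stands for size i while i <= s; larger sizes collapse to q_{s+1}.
    """
    _symbol, *children = tree
    return min(1 + sum(size_state(c, s) for c in children), s + 1)


def triangular(left, peak, right):
    """Membership function of the triangular fuzzy number T(left, peak, right)."""
    def mu(x):
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu


def grade_height_less_than(tree, h):
    """Model 4.1, assuming Gamma(q_i) = 1 for i < h and Gamma(q_h) = 0."""
    return 1.0 if height_state(tree, h) < h else 0.0


def grade_height_approximately(tree, mu, u):
    """Model 4.2: Gamma(q_h) = mu(h), with state indices capped at u."""
    return mu(height_state(tree, u))


def grade_size_greater_than(tree, s):
    """Model 4.3: only q_{s+1} is final."""
    return 1.0 if size_state(tree, s) == s + 1 else 0.0


t = ("f", ("a",), ("g", ("f", ("b",), ("a",))))  # the tree f(a, g(f(b, a)))
print(height_state(t, 5))
print(grade_height_approximately(t, triangular(3, 7, 11), 11))
print(grade_size_greater_than(t, 5), grade_size_greater_than(t, 6))
```

For the tree of Figures 1 and 2, the sketch reaches q_4 for the height and returns the grade 0.25 computed in the example for "approximately 7".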

Pattern 4.4. Constraint with iterative pattern on height
Property: "The height of the tree is a multiple of M ".
In Model 4.4, the indices of state symbols represent the remainder after dividing the height of trees by M. Since the trees are processed bottom-up, at each step the index of the reached state is increased by 1 until state q_{M−1} is reached; then the index value is reset.

Pattern 4.5. Restriction that leads to a non-regular tree language
Property: "The size of the tree is a prime number".

Model:
The set of trees whose size (or height) is a prime number is not a regular tree language and cannot be recognized by an FTA (Example 1.2.2 in [7]).
According to the Pumping Lemma for recognizable tree languages [7], it can be shown that Pattern 4.5 does not correspond to a recognizable FTL. Understanding the patterns that lead to unrecognizable tree languages, in combination with the closure properties of recognizable tree languages, enables us to deduce whether a given language is recognizable or not. Detecting the class of the tree language of an atomic proposition is useful, especially when we are modeling a language described by more complex properties, because it can prevent a large number of calculations resulting from modeling and combining them. Moreover, a modeling tool can have separate libraries for patterns related to different language classes.

Pattern 4.6. Semi-atomic constraint with iterative pattern on the size of trees
Property: "The size of the tree is a multiple of M and N".
The property in Pattern 4.6 is not atomic because it can be broken down into the properties "The size of the tree is a multiple of M" and "The size of the tree is a multiple of N". However, it is equivalent to the atomic property "The size of the tree is a multiple of K", where K is the least common multiple of M and N. So, we call it a "semi-atomic property". Note that detecting and optimizing semi-atomic properties can reduce modeling costs by reducing time complexity and memory requirements.
Model: Q = {q_0, q_1}, Γ(q_0) = 0 and Γ(q_1) = 1. δ(q_0, a) = 1, δ(q_0, b) = 1, δ(q_i, q_1, g) = 1 and δ(q_i, q_j, q_{max{i,j}}, f) = 1 for i, j ∈ {0, 1}.
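The reduction of the semi-atomic property of Pattern 4.6 to a single remainder counter, and the two-state model listed above, can be sketched as follows. This is a minimal sketch, not the paper's construction: the tuple encoding of trees and all function names are assumptions, and the remainder states q_0, . . ., q_{K−1} over the size are an assumed reading of "multiple of K" in which only q_0 is final.

```python
from math import gcd

# Trees are nested tuples: ("f", left, right), ("g", child), ("a",), ("b",).
# This encoding and all function names are illustrative assumptions.

def lcm(m, n):
    """Least common multiple K of M and N."""
    return m * n // gcd(m, n)


def size_remainder(tree, k):
    """Index r of the state q_r at the root, where r = size(tree) mod k.

    The remainder of a node is (1 + sum of the children's remainders) mod k,
    so k states are enough regardless of the size of the tree.
    """
    _symbol, *children = tree
    return (1 + sum(size_remainder(c, k) for c in children)) % k


def grade_size_multiple_of_both(tree, m, n):
    """One automaton with lcm(m, n) states instead of a product of two."""
    return 1.0 if size_remainder(tree, lcm(m, n)) == 0 else 0.0


def two_state_index(tree):
    """Run of the two-state model: a, b -> q_0; g(q_i) -> q_1; f -> q_max."""
    symbol, *children = tree
    if symbol in ("a", "b"):
        return 0
    kids = [two_state_index(c) for c in children]
    if symbol == "g":
        return 1
    return max(kids)  # symbol f


t = ("f", ("a",), ("g", ("f", ("b",), ("a",))))  # f(a, g(f(b, a))), size 6
print(grade_size_multiple_of_both(t, 2, 3))
print(two_state_index(t), two_state_index(("f", ("a",), ("b",))))
```

Using K = lcm(M, N) keeps the state set at K symbols, whereas combining the two separate remainder models would need M · N paired states. As written, the rules of the two-state model reach q_1 exactly when some node of the tree is labeled g.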
In Model 4.9, the input pattern has 5 nodes, and the FTA uses state symbols {q_1, . . ., q_5} for step-by-step labeling of the input tree during pattern recognition. Also, state symbol q_6 corresponds to the situation in which a node violates all sub-structures of the pattern while pattern t is not met. Figure 3 shows the steps of running the FTA derived from Model 4.9 on the tree structure t = f(b, f(g(t′), a)) where t′ ∈ T_Σ. Here, we use a set of labels to describe the meaning of states:
-A : The current node is a.
-B : The current node is b.
-G : The current node is g and pattern t has not been found yet.
-F : The current pattern is f(g(−), a) and pattern t has not been found yet.
-T : Pattern t is found.
According to Figure 3, if pattern t is found in the processed tree, the model remains in state q_5, and the processing continues in the same way up to the root of the tree. If pattern t has not been found yet, it is detected step by step according to all subtree decompositions.

Pattern 4.10. Comparing the number of symbols
Property: "The number of a and b is equal in all subtrees with root f".
State symbol q_3 in the model of Pattern 4.10 is a trap state: if the condition is violated at a node, processing leads to q_3, this state symbol is propagated to all parent nodes, and the tree is rejected. When the processed tree has the form a, g(a) or g(g(. . . g(a))), it leads to q_1, and for the forms b, g(b) and g(g(. . . g(b))), it leads to q_2. So, if a tree (subtree) leads to q_1 or q_2, it contains exactly one a or exactly one b, respectively. Consequently, for each node f whose children have been processed, there are only three acceptable cases, f(q_0, q_0), f(q_1, q_2) and f(q_2, q_1), all of which lead to q_0. State symbol q_0 means that the numbers of symbols a and b are equal in the corresponding tree (subtree). Other cases (e.g., f(q_0, q_1), f(q_0, q_2) and f(q_1, q_1)) are not acceptable because they result from trees (e.g., f(f(a, b), a), f(f(b, a), g(g(b))) and f(g(a), a)) that violate the condition.
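The transition behavior described for Pattern 4.10 can be traced with a short Python sketch. It is only an illustration under stated assumptions: the tuple encoding of trees and the function name are introduced here, and the rule g(q_i) → q_i is assumed for every state (the text states it only for chains over a or b and for the trap state). The sketch reports the state reached at the root and leaves the choice of final states open.

```python
# Trees are nested tuples: ("f", left, right), ("g", child), ("a",), ("b",).
# This encoding and the function name are illustrative assumptions.

TRAP = 3                                  # index of the trap state q_3
BALANCED_PAIRS = {(0, 0), (1, 2), (2, 1)}  # the three acceptable cases for f


def balance_state(tree):
    """Index of the state reached at the root of the tree.

    q_1: a single a (possibly under g nodes); q_2: a single b;
    q_0: equal numbers of a and b; q_3: the condition was violated somewhere.
    """
    symbol, *children = tree
    kids = tuple(balance_state(c) for c in children)
    if symbol == "a":
        return 1
    if symbol == "b":
        return 2
    if symbol == "g":
        return kids[0]  # assumed: g copies the state of its child
    if symbol == "f":
        return 0 if kids in BALANCED_PAIRS else TRAP
    raise ValueError(f"unknown symbol: {symbol!r}")


a, b = ("a",), ("b",)
print(balance_state(("f", ("f", a, b), ("f", b, a))))  # balanced everywhere
print(balance_state(("f", ("f", a, b), a)))            # f(q_0, q_1)
print(balance_state(("f", ("g", a), a)))               # f(q_1, q_1)
```

Because every pair containing q_3 is outside the three acceptable cases, the trap state propagates to the root without a separate rule.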
The FTL presented in Pattern 4.10 cannot be extended to support adding a new symbol of rank 2 or more to Σ. For example, if a new symbol c ∈ Σ_2 is added to Σ, the pattern becomes a context-free tree language and cannot be represented by a finite tree automaton (Fig. 4). Context-free tree languages can be recognized by pushdown tree automata [7].
Pattern 4.11. Special context-free symbol pattern
Property: "The number of a and b is equal if the root of the tree is f".
Model: This property cannot be modeled with a finite FTA. Pushdown tree automata and tree automata with an infinite number of states are two classes of tree automata that can recognize it.
Proposition 4.12. Pattern 4.11 is not a recognizable tree language.
Proof. Let L denote the corresponding tree language, and suppose that L is recognizable by an FTA M having k state symbols. Now, let t = f(t_1, t_2) with t_1 = f(f(. . . As k is the cardinality of the state set of M, there are two distinct nodes f along t_1 labeled with the same state. Therefore, one could cut the first branch between these two positions, leading to a tree t′ = f(t′_1, t_2) with t′_1 = f(f(. . . f_j(a, a) . . ., a), a), where j < k + 1 and a successful run of M exists for t′. This contradicts L(M) = L.

Pattern 4.13. Fuzzy comparison of the symbols
Property: "The number of a and b is almost equal in all subtrees with root f".
Model: Let Ẑ be a symmetric fuzzy number corresponding to "almost zero", and let L, R ∈ Z be the bounds of Ẑ such that for every x ∈ Z it holds that µ_Ẑ(x) > 0 ⇔ x ∈ (L, R). Also, consider the mapping γ : Z → [L, R], which keeps the state indices within [L, R]. Now, the corresponding model is: Q = {q_L, . . ., q_R} and Γ(q_i) = µ_Ẑ(i) for L ≤ i ≤ R. δ(q_1, a) = 1, δ(q_{−1}, b) = 1, δ(q_k, q_k, g) = 1, δ(q_i, q_j, q_{γ(i+j)}, f) = 1, δ(q_L, q_k, q_L, f) = 1, δ(q_k, q_L, q_L, f) = 1, δ(q_k, q_R, q_R, f) = 1 and δ(q_R, q_k, q_R, f) = 1, where k ∈ [L, R] and i, j ∈ (L, R).
In the model presented in Pattern 4.13, for i ∈ (L, R), each state symbol q_i specifies the state of the nodes where the number of a's minus the number of b's in the corresponding subtree is i. Also, q_L and q_R are two trap state symbols indicating that the condition is violated in the current subtree or in at least one of its subtrees.
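The model of Pattern 4.13 can be explored with the following Python sketch. Several points are assumptions made only for illustration: the tuple encoding of trees, the function names, the choice Ẑ = T(−3, 0, 3) (so L = −3 and R = 3), and in particular the form of γ, which is taken here to clamp an integer into [L, R]. When both children are in trap states, the rules allow either trap; the sketch returns the first one, which does not change the grade because Γ(q_L) = Γ(q_R) = 0.

```python
# Trees are nested tuples: ("f", left, right), ("g", child), ("a",), ("b",).
# Encoding, names, the fuzzy number and the clamping gamma are assumptions.

def almost_zero(x, low=-3, high=3):
    """Symmetric triangular membership T(low, 0, high) for "almost zero"."""
    if x <= low or x >= high:
        return 0.0
    return (x - low) / (0 - low) if x <= 0 else (high - x) / (high - 0)


def difference_state(tree, low=-3, high=3):
    """Index i of the state q_i at the root: (#a - #b), clamped to [low, high]."""
    symbol, *children = tree
    kids = [difference_state(c, low, high) for c in children]
    if symbol == "a":
        return 1
    if symbol == "b":
        return -1
    if symbol == "g":
        return kids[0]
    if symbol == "f":
        for k in kids:
            if k in (low, high):  # trap states q_L and q_R are absorbing
                return k
        return max(low, min(high, kids[0] + kids[1]))  # assumed gamma(i + j)
    raise ValueError(f"unknown symbol: {symbol!r}")


def grade_almost_equal(tree, low=-3, high=3):
    """Gamma(q_i) = mu(i) applied to the state reached at the root."""
    return almost_zero(difference_state(tree, low, high), low, high)


a, b = ("a",), ("b",)
print(grade_almost_equal(("f", a, b)))                          # difference 0
print(grade_almost_equal(("f", a, ("f", a, b))))                # difference 1
print(grade_almost_equal(("f", ("f", a, a), ("f", a, a))))      # reaches q_R
```

A tree such as f(f(f(a, a), f(a, a)), f(f(b, b), f(b, b))) contains as many a's as b's overall, but its left subtree already reaches the trap state q_R, so the sketch assigns it grade 0.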

Modeling properties with logical connectives
In this section, we first provide three sample patterns that use multiple indexing of state symbols and more advanced membership functions for final states to model fuzzy properties that combine atomic propositions by the logical connectives ∨, ∧, and ¬. Then, we present a general method for combining different FTA to model more complex properties.
Proof. According to the theorem, L(M_3) = L(M_1) ∩ L(M_2), L(M_4) = L(M_1) ∪ L(M_2), and L(M_5) is the complement of L(M_1). Since the class of recognizable fuzzy tree languages is closed under intersection, union and complement [4,9], the fuzzy tree languages L(M_3), L(M_4) and L(M_5) are recognizable, and the FTA M_3, M_4 and M_5 are realizable. Now, we use the modeling tricks presented in Pattern 4.14 for constructing M_3. Also, we construct M_4 under the assumption that M_1 and M_2 are complete. Then, we construct M_5 under the assumption that M_1 is complete and deterministic and that the membership grade of all its transition rules is 1.
Corollary 4.18. Assume that AP is a set of atomic propositions such that each a ∈ AP is equivalent to a recognizable FTL. Then, any property that is a combination of the atomic propositions by ∨, ∧, and ¬ can be modeled by an FTA.
Proof. Assume that T is the binary expression tree of the property to be modeled, where its leaves are the atomic propositions and its inner nodes are the logical operations ∨, ∧, and ¬. Algorithm 1 provides a systematic method for modeling T. In lines 2 and 3, each a ∈ AP is modeled by an FTA M_a, which means that all leaves of T are modeled. Then, using an iterative loop (lines 4 to 6), the algorithm models each node whose children are modeled (Thm. 4.17) until all nodes of the tree are modeled. Clearly, the FTA corresponding to the root of T is the desired model of the property.
Corollary 4.19. Assume that there exists a set of properties, each one equivalent to a recognizable FTL. Then, there exists an FTA satisfying all of these properties.
Proof. Let P = {p_1, . . ., p_n} be the set of properties to be modeled, where n ∈ N. Now, set P_all = p_1 ∧ . . . ∧ p_n and construct its FTA by Corollary 4.18.
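The bottom-up combination used in Corollary 4.18 and Corollary 4.19 can be sketched in Python as follows. This is a minimal sketch under explicit assumptions, not the construction of the theorem: every model is taken to be complete and deterministic with all transition grades equal to 1, the connectives act on final-state grades through min, max and 1 − x (Zadeh's operators, assumed here), and the class `DetFTA`, the tuple encodings and all function names are hypothetical. Paired states play the role of multi-index state symbols.

```python
from dataclasses import dataclass
from typing import Callable

# Trees: ("f", left, right), ("g", child), ("a",), ("b",).
# Formulas: an atom name, ("not", e), ("and", e1, e2) or ("or", e1, e2).

@dataclass
class DetFTA:
    step: Callable   # (symbol, tuple of child states) -> state
    gamma: Callable  # state -> final membership grade in [0, 1]

    def run(self, tree):
        symbol, *children = tree
        return self.step(symbol, tuple(self.run(c) for c in children))

    def grade(self, tree):
        return self.gamma(self.run(tree))


def product(m1, m2, combine):
    """Run m1 and m2 in parallel on paired states (multi-index states)."""
    def step(symbol, kids):
        return (m1.step(symbol, tuple(k[0] for k in kids)),
                m2.step(symbol, tuple(k[1] for k in kids)))
    return DetFTA(step, lambda q: combine(m1.gamma(q[0]), m2.gamma(q[1])))


def negate(m):
    return DetFTA(m.step, lambda q: 1.0 - m.gamma(q))


def build(expr, atoms):
    """Model the children of a formula node first, then the node itself."""
    if isinstance(expr, str):
        return atoms[expr]
    op, *args = expr
    models = [build(arg, atoms) for arg in args]
    if op == "not":
        return negate(models[0])
    if op == "and":
        return product(models[0], models[1], min)
    if op == "or":
        return product(models[0], models[1], max)
    raise ValueError(f"unknown connective: {op!r}")


def all_of(exprs):
    """p_1 AND ... AND p_n as a nested binary formula."""
    expr = exprs[0]
    for e in exprs[1:]:
        expr = ("and", expr, e)
    return expr


def height_less_than(h):
    return DetFTA(lambda s, kids: min(max(kids, default=0) + 1, h),
                  lambda q: 1.0 if q < h else 0.0)


def size_greater_than(s):
    return DetFTA(lambda sym, kids: min(1 + sum(kids), s + 1),
                  lambda q: 1.0 if q == s + 1 else 0.0)


def height_approximately(left, peak, right):
    def mu(x):
        if x <= left or x >= right:
            return 0.0
        return (x - left) / (peak - left) if x <= peak else (right - x) / (right - peak)
    return DetFTA(lambda s, kids: min(max(kids, default=0) + 1, right), mu)


atoms = {"short": height_less_than(5), "big": size_greater_than(5),
         "near7": height_approximately(3, 7, 11)}
t = ("f", ("a",), ("g", ("f", ("b",), ("a",))))  # f(a, g(f(b, a)))
print(build(("and", "near7", ("or", "big", ("not", "short"))), atoms).grade(t))
print(build(all_of(["short", "big", "near7"]), atoms).grade(t))
```

The recursion in `build` visits the formula in the same order as the loop of Algorithm 1: a node is modeled only after its children have been modeled, and the model at the root is the result.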

Conclusion
In this paper, we aimed to model a family of fuzzy tree languages in which some features of trees are described using linguistic variables. In this regard, we provided some heuristic tricks and techniques to employ fuzzy tree automata for modeling vaguely defined sets of regular trees on a ranked alphabet. Also, a general method of combining these models for constructing a synthetic fuzzy tree automaton was developed. This combination is based on the multiple indexing of states and the generalization of the transition rules of fuzzy tree automata. It provides a systematic method for modeling a fuzzy tree language described by multiple properties. We also introduced a set of modeling patterns that encode some crisp and fuzzy atomic propositions (simple properties) into fuzzy tree automata by defining the set of states, the final states, and the set of transition rules of the desired automata. These models include patterns such as symbol existence/counting, sub-structure detection, thresholds on height and size, fuzzy numbers, as well as the handling of imprecision and vagueness, which are used in the linguistic description of fuzzy tree languages. In addition, we provided some examples that use multiple indexing of state symbols for combining the models, and then we proved that if two properties are satisfiable by some fuzzy tree languages, their combination with logical connectives such as ∨, ∧, and ¬ can be modeled by some fuzzy tree automata (Thm. 4.17). As a result of the theorem, every complex property, or even a set of properties, can be organized into a binary expression tree to construct a fuzzy tree automaton model recognizing the corresponding fuzzy tree language.

Figure 1. Steps of running two different FTA corresponding to Model 4.1 on tree f(a, g(f(b, a))).

Pattern 4.3. Threshold on the size of trees

Figure 2. Steps of running two different FTA corresponding to Model 4.3 on tree f(a, g(f(b, a))).

Pattern 4.8. Symbol counting
Property: "The tree has at least M nodes labeled b".

Figure 4. There is no FTA that can compare the number of a in subtree t_L with the number of b in subtree t_R.

Pattern 4.13. Fuzzy comparison of the symbols
Property: "The number of a and b is almost equal in all subtrees with root f".

Algorithm 1: The modeling of a complex property by an FTA.
input : P // the input formula (a complex property)
output: M // the FTA model of the input formula
1 T ← expression_tree(P) // construct the expression tree
2 forall node ∈ T.leaves do // modeling the atomic propositions
3     node.model ← modeling(node.property)
4 while T.root.model = ∅ do // bottom-up modeling of T
5     node ← T.get_next() // select a node that is not modeled but its children are modeled
6     node.model ← modeling(node.property) // construct the model
7 M ← T.root.model // the model at the root of T is the result