TY - JOUR
T1 - Ternary directed acyclic word graphs
AU - Miyamoto, Satoru
AU - Inenaga, Shunsuke
AU - Takeda, Masayuki
AU - Shinohara, Ayumi
PY - 2004/11/29
Y1 - 2004/11/29
N2 - Given a set S of strings, a DFA accepting 5 offers a very time-efficient solution to the pattern matching problem over S. The key is how to implement such a DFA in the trade-off between time and space, and especially the choice of how to implement the transitions of each state is critical. Bentley and Sedgewick proposed an effective tree structure called ternary trees. The idea of ternary trees is to 'implant' the process of binary search for transitions into the structure of the trees themselves. This way the process of binary search becomes visible, and the implementation of the trees becomes quite easy. The directed acyclic word graph (DAWG) of a string w is the smallest DFA that accepts all suffixes of w, and requires only linear space. We apply the scheme of ternary trees to DAWGs, introducing a new data structure named ternary DAWGs (TDAWGs). Furthermore, the scheme of AVL trees is applied to the TDAWGs, yielding a more time-efficient structure AVL TDAWGs. We also perform some experiments that show the efficiency of TDAWGs and AVL TDAWGs, compared to DAWGs in which transitions are implemented by linked lists.
AB - Given a set S of strings, a DFA accepting 5 offers a very time-efficient solution to the pattern matching problem over S. The key is how to implement such a DFA in the trade-off between time and space, and especially the choice of how to implement the transitions of each state is critical. Bentley and Sedgewick proposed an effective tree structure called ternary trees. The idea of ternary trees is to 'implant' the process of binary search for transitions into the structure of the trees themselves. This way the process of binary search becomes visible, and the implementation of the trees becomes quite easy. The directed acyclic word graph (DAWG) of a string w is the smallest DFA that accepts all suffixes of w, and requires only linear space. We apply the scheme of ternary trees to DAWGs, introducing a new data structure named ternary DAWGs (TDAWGs). Furthermore, the scheme of AVL trees is applied to the TDAWGs, yielding a more time-efficient structure AVL TDAWGs. We also perform some experiments that show the efficiency of TDAWGs and AVL TDAWGs, compared to DAWGs in which transitions are implemented by linked lists.
KW - AVL trees
KW - Deterministic finite state automata
KW - Directed acyclic word graphs
KW - Pattern matching on strings
KW - Ternary search trees
UR - http://www.scopus.com/inward/record.url?scp=9544224164&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=9544224164&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2004.07.008
DO - 10.1016/j.tcs.2004.07.008
M3 - Article
AN - SCOPUS:9544224164
SN - 0304-3975
VL - 328
SP - 97
EP - 111
JO - Theoretical Computer Science
JF - Theoretical Computer Science
IS - 1-2
ER -