Compressed pattern matching for Sequitur

S. Mitarai, M. Hirao, T. Matsumoto, A. Shinohara, M. Takeda, S. Arikawa

Research output: Contribution to journalConference articlepeer-review

12 Citations (Scopus)

Abstract

Sequitur due to Nevill-Manning and Witten. [19] is a powerful program to infer a phrase hierarchy from the input text, that also provides extremely effective compression of large quantities of semi-structured text [18]. In this paper, we address the problem of searching in Sequitur compressed text directly. We show a compressed pattern matching algorithm that finds a pattern in compressed text without explicit decompression. We show that our algorithm is approximately 1.27 times faster than a decompression followed by an ordinal search.

Original languageEnglish
Pages (from-to)469-478
Number of pages10
JournalProceedings of the Data Compression Conference
Publication statusPublished - 2001
EventData Compression Conference - Snowbird, UT, United States
Duration: 2001 Mar 272001 Mar 29

Fingerprint

Dive into the research topics of 'Compressed pattern matching for Sequitur'. Together they form a unique fingerprint.

Cite this