Shift-and approach to pattern matching in LZW compressed text

Takuya Kida, Masayuki Takeda, Ayumi Shinohara, Setsuo Arikawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

38 Citations (Scopus)

Abstract

This paper considers the Shift-And approach to the problem of pattern matching in LZW compressed text, and gives a new algorithm that solves it. The algorithm is particularly fast when the pattern length is at most 32, or the word length. After an O(m + |Σ|) time and O(|Σ|) space preprocessing of a pattern, it scans an LZW compressed text in O(n + r) time and reports all occurrences of the pattern, where n is the compressed text length, m is the pattern length, and r is the number of pattern occurrences. Experimental results show that it runs approximately 1.5 times faster than decompression followed by a simple search using the Shift-And algorithm. Moreover, like the Shift-And algorithm, it can be extended to generalized pattern matching, to pattern matching with k mismatches, and to multiple pattern matching.
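For orientation, the classical Shift-And algorithm on uncompressed text, which the abstract uses as the search step of its baseline, can be sketched as follows. This is a minimal illustration and not the paper's LZW-aware algorithm; the function names `build_masks` and `shift_and_search` are hypothetical, and Python's unbounded integers stand in for the single machine word that makes the method fast for patterns of length at most 32.

```python
def build_masks(pattern):
    """Map each pattern character to a bitmask; bit i is set if pattern[i] == ch."""
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    return masks


def shift_and_search(text, pattern):
    """Return the start index of every occurrence of pattern in text (illustrative sketch)."""
    m = len(pattern)
    if m == 0:
        return []
    masks = build_masks(pattern)
    accept = 1 << (m - 1)  # bit m-1 set means the whole pattern has just matched
    state = 0              # bit i set means pattern[0..i] matches a suffix of the text read so far
    hits = []
    for j, ch in enumerate(text):
        # Extend every partial match by one character and start a new one at bit 0.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


print(shift_and_search("abracadabra", "abra"))
```

Each text character costs one shift, one OR, and one AND, which is why the scan is linear in the text length once the masks are built.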

Original language: English
Title of host publication: Combinatorial Pattern Matching - 10th Annual Symposium, CPM 1999, Proceedings
Editors: Mike Paterson, Maxime Crochemore
Publisher: Springer Verlag
Pages: 1-13
Number of pages: 13
ISBN (Print): 3540662782, 9783540662785
Publication status: Published - 1999
Event: 10th Annual Symposium on Combinatorial Pattern Matching, CPM 1999 - Warwick, United Kingdom
Duration: 1999 Jul 22 to 1999 Jul 24

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1645
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th Annual Symposium on Combinatorial Pattern Matching, CPM 1999
Country/Territory: United Kingdom
City: Warwick
Period: 99/7/22 to 99/7/24
