We present new algorithms for different versions of the Submass Finding Problem: Given a string *s* over a weighted alphabet, i.e., an alphabet with a mass function answer efficiently questions of the type, whether, for a set of input masses {*M*_{1}, . . . , *M** _{k}*},

*s*has substrings with masses

*M*

_{1}, . . . ,

*M*

*; and for each*

_{k}*M*

*where*

_{i}*s*has a substring with mass

*M*

*, find one or all occurrences of such substrings. Furthermore, our algorithms allow us to compute efficiently the number of different submasses of*

_{i}*s*(masses of substrings). We encode submasses in polynomials and use Fast Fourier Transform for polynomial multiplications. Our main algorithm runs in time where us the total mass of string

*s*. Since methods for compressing sparse polynomials are know, this can be viewed as where denotes the number of different submasses of

*s*. This makes the runtime independent of the size of the individual masses of characters, and instead only dependent on hence optimal in a sense. We give two variations of this algorithm for the different problems, a Las Vegas randomized algorithm, and a deterministic algorithm, which are both considerably faster than naive solutions if is not too large.

By:* Nikhil Bansal, Mark Cieliebak, Zsuzsanna Lipták*

Published in: Lecture Notes in Computer Science, volume 3109, (no ), pages 194-204 in 2004

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.

Questions about this service can be mailed to reports@us.ibm.com .