Preliminary Results On the Discovery of Patterns of Amino Acids Common to Sequences of Core Histones 3 & 4

        In contrast to pattern matching, which is solvable in polynomial time,
        pattern discovery is NP-hard and therefore a challenging problem. Using
        Teiresias, a newly developed algorithm, we have explored (see [5]) a number of
        test cases in order to identify patterns of amino acids that appear across
        sequences (common motifs) as well as within individual sequences (internal
        repeats). In that work, only a small subset of the available results was
        presented, in order to: (a) validate the approach through the discovery of
        previously reported patterns; (b) demonstrate the algorithm's capability to
        automatically identify highly selective patterns particular to the sequences
        under consideration; and (c) discover previously unidentified patterns in the
        well-studied example cases used. One of those example cases was the core
        histone 3 and 4 families of proteins. In this report, we present the full
        range of results obtained for this particular set of proteins.

By: Isidore Rigoutsos, Aris Floratos, Christos Ouzounis (EMBL Cambridge)

Published in: RC20804 in 1997


This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
