Figure 1

(a) An illustration of structure-based sequence alignment and hidden state paths. In Sequences 1 and 2, uppercase and lowercase letters represent aligned core blocks and unaligned regions, respectively. If two corresponding unaligned regions bounded by the same two core blocks differ in length, we split the shorter one into two pieces and introduce contiguous gaps in the middle. At both the N-terminal and C-terminal ends, the shorter unaligned region is pushed toward the core blocks. Secondary structure (ss) types (helix, "h"; strand, "e"; coil, "c") are shown for Sequence 1. The hidden state paths for the three models are shown below the amino acid sequences. (b) Model structure of HMM_1_1_0. Residue pairs in unaligned regions are modeled using the same match state ("M") as those in the aligned core blocks. Insertions in the first and second sequences are modeled using states "X" and "Y", respectively. (c) Model structure of HMM_1_1_1. Residue pairs in the unaligned regions are modeled using a match state ("U") distinct from the match state used in the core blocks ("M"). (d) Model structure of HMM_1_3_1. Residue pairs in aligned core blocks are modeled using three match states ("H", "S", "C") according to the three secondary structure types of the first sequence. In (b), (c) and (d), match states are shown as squares and insertion states as diamonds. The begin state, the end state, and the transitions to and from them are present in these models but are not shown.
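The gap-placement rule and the three hidden state paths described in panel (a) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `pad_unaligned` and `state_path` are hypothetical, the toy sequences are invented, the shorter internal region is assumed to be split at its midpoint (the caption does not say where the split falls), and the secondary structure of Sequence 1 is assumed to be given per alignment column, with "-" where Sequence 1 has a gap.

```python
SS_TO_STATE = {"h": "H", "e": "S", "c": "C"}  # helix, strand, coil


def pad_unaligned(a, b, position="internal"):
    """Pad two unaligned regions to equal length with '-' characters.

    position: 'n_term', 'c_term', or 'internal'.
    """
    if len(a) == len(b):
        return a, b
    a_is_short = len(a) < len(b)
    short, long_ = (a, b) if a_is_short else (b, a)
    gaps = "-" * (len(long_) - len(short))
    if position == "n_term":
        padded = gaps + short  # residues pushed right, toward the core block
    elif position == "c_term":
        padded = short + gaps  # residues pushed left, toward the core block
    else:
        half = len(short) // 2  # assumption: split at the midpoint
        padded = short[:half] + gaps + short[half:]
    return (padded, b) if a_is_short else (a, padded)


def state_path(row1, row2, ss1, model):
    """Return the hidden state path for one gapped pairwise alignment.

    row1, row2: aligned rows (uppercase = core block, lowercase = unaligned).
    ss1: per-column secondary structure of Sequence 1 ('h', 'e', 'c', or '-').
    model: 'HMM_1_1_0', 'HMM_1_1_1', or 'HMM_1_3_1'.
    """
    path = []
    for r1, r2, ss in zip(row1, row2, ss1):
        if r2 == "-":
            path.append("X")  # residue present only in Sequence 1
        elif r1 == "-":
            path.append("Y")  # residue present only in Sequence 2
        elif model == "HMM_1_1_0":
            path.append("M")  # a single match state for every residue pair
        elif r1.islower():
            path.append("U")  # residue pair in an unaligned region
        elif model == "HMM_1_1_1":
            path.append("M")  # residue pair in a core block
        else:
            path.append(SS_TO_STATE[ss])  # HMM_1_3_1: core pair by ss type
    return "".join(path)


# Toy example (invented sequences): N-terminal end, core, internal, core, C-terminal end.
n1, n2 = pad_unaligned("mk", "s", "n_term")
i1, i2 = pad_unaligned("gh", "gtwn", "internal")
c1, c2 = pad_unaligned("k", "", "c_term")
row1 = n1 + "ACD" + i1 + "FG" + c1
row2 = n2 + "ACE" + i2 + "FG" + c2
ss1 = "cchhhc--ceec"

for model in ("HMM_1_1_0", "HMM_1_1_1", "HMM_1_3_1"):
    print(model, state_path(row1, row2, ss1, model))
```

In this toy case the two rows become `mkACDg--hFGk` and `-sACEgtwnFG-`, and the three models differ only in how they label the columns where both sequences have a residue.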