RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy8927
         (1196 letters)



>gnl|CDD|221734 pfam12722, Hid1, High-temperature-induced dauer-formation protein.
            Hid1 (high-temperature-induced dauer-formation protein 1)
            represents proteins approximately 800 residues long
            and is conserved from fungi to humans. It contains up to
            seven potential transmembrane domains separated by
            regions of low complexity. Functionally it might be
            involved in vesicle secretion or be an inter-cellular
            signalling protein or be a novel insulin receptor.
          Length = 813

 Score =  414 bits (1066), Expect = e-129
 Identities = 201/530 (37%), Positives = 254/530 (47%), Gaps = 124/530 (23%)

Query: 684  KFLYYVLKSSDVLDILVPILYHLNDSRADQSRVGLMHIGVFILLLLSGERNFGVRLNKPY 743
            +F  YV      LD+LV ILY+L   R D S+ GL+ + VFILL LS E+NFGV+LNKP+
Sbjct: 388  RFRNYVADK-RALDLLVFILYYLFYYRNDPSKKGLVPLCVFILLYLSSEKNFGVKLNKPF 446

Query: 744  SA--AVP--MDIPVFTGTHADLLMIVFHKVITSGHNRLQPLFDCLFTILVNVSPYLKTLS 799
            +A   +P    +PVFTGT+AD L+I    +IT+G  RLQ LF CL               
Sbjct: 447  NANNKLPNFFKLPVFTGTYADFLIIQICNIITTGKKRLQQLFPCL--------------- 491

Query: 800  MVASTKLLHLLEAFASPAFLYTGETNHHLVFFLLEIFNNIIQYQFDVSPYLKTLSMVAST 859
                                             LEI  N+        PYL  LS  AS+
Sbjct: 492  ---------------------------------LEILYNLS-------PYLGGLSYNASS 511

Query: 860  KLLHLLEAFASPAFLYTGETNHHLVFFLLEIFNNIIQYQFDGNSNLVYTIIRKRQIFYAL 919
            KLLHL   F+SP FL +   NH L+  LLE FNNIIQYQFDGNSNL+Y I+R R++F  L
Sbjct: 512  KLLHLFSKFSSPWFLLSNPFNHDLLALLLEAFNNIIQYQFDGNSNLLYAILRHRKVFDQL 571

Query: 920  SNLPSDSASIHRALSVSGGRKRSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPG 979
             NL  DS+      S                    S S  +  +  +   S+    +E  
Sbjct: 572  RNLILDSSQEEDERS------------------NQSASGSLSDNPSNDNDSRSPSLSEVP 613

Query: 980  TLNATLADTPGPDKAEDLQSIDSCEYIWERGVGFATSPARCPAHESARTELLKLLLTCFS 1039
              N +LA T   D A    S               TS A  P   ++             
Sbjct: 614  EENKSLAITDDFDPASRENS---------------TSEAAAPPSVNS------------- 645

Query: 1040 ETIYQPPQGIEKMTEKESA-HPGGENENVVAKPSSISSPIQHSHPRVWTPTPEWVKSWKD 1098
                Q     EK   K  A        N   +P   S  +       + PT +WV+SWK 
Sbjct: 646  -VPLQLQGPSEKDRGKNPAGSLAFSRLNSATRPKWPS-GLSSKSKEKFPPTSDWVESWKG 703

Query: 1099 KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEIL-----KFLQHGTLVGLLP-------V 1146
            KLPLQTI+RL++VL+PQV KICI KGLT ESE+L     KFL  GT+VGLLP       V
Sbjct: 704  KLPLQTILRLIKVLLPQVPKICIIKGLTYESELLVLNIEKFLFDGTIVGLLPQLLYDLSV 763

Query: 1147 PHPILIRKYQANSGTTTWFRTYMWGVIYLRNV---DPPIWYDTDVKLFEI 1193
            PHPI IRK+Q N+ +  W+R+ +WGVI+ RN     P IW  TDVKLF+I
Sbjct: 764  PHPIKIRKFQWNALSLGWYRSLLWGVIFNRNSGLGTPNIWNGTDVKLFKI 813



 Score =  240 bits (615), Expect = 2e-66
 Identities = 150/479 (31%), Positives = 213/479 (44%), Gaps = 84/479 (17%)

Query: 191 PQDASDNTFWNQFWSENVTNAQDIFTLIPAAEIRALREEAPSNLATLCYKAVEKLVNAVD 250
             D  D+ FW QFW E   + +D+F+LI  A+IR +R+  P NLATL      +L+   +
Sbjct: 23  IVDPDDDAFWEQFW-ELPESTEDVFSLITPADIRKIRDNNPENLATLIRFLCLRLILLAN 81

Query: 251 SSCRMAH--EQQAVLNCTRLLTRLIPYIFEDM---DWRG-FFWSSLPSKST----SDDNE 300
           S         +  +LNC RLLTRL+PYIFED    +W   FFWS+ P        SD   
Sbjct: 82  SPSFPDSLAPRSELLNCIRLLTRLLPYIFEDEYLEEWEDDFFWSTNPKPLRGAQASDVLF 141

Query: 301 GEGSTPAEDVESVPLAQSLLNAICDLLFCPDFTVASNKCTCQVFLML------GTQDNLF 354
            E  +  ED ++ PL + LL+A+ DLLFCPDFTV S + +             GT  + +
Sbjct: 142 DESQSEEEDEDAKPLGEELLDALVDLLFCPDFTVPSPRSSKGKVSYCIWESGVGTNTSPY 201

Query: 355 INYLSRIHRDEDFQFVLAGFSR-LYQDGNRTRGLALTRHLGTQDNLFINYLSRIHRDEDF 413
            N     +R E  + +L  FS  +Y   +          L +  N F+ YL         
Sbjct: 202 SNPELDSNRLEILRLLLTLFSESMYLSPS----------LSSNGNKFLTYLCTS---APR 248

Query: 414 QFVLAGFSRLLNNPLLQTYLPNSTK------KIEFH---QELLV-----LFWKMCDY-NK 458
             VL     LLN   +  Y PN  K              +E+LV     L   M  Y   
Sbjct: 249 HEVLCLLCSLLN--TVCRYNPNGWKEYGLPYNNSVFKDSREILVEYSLQLLLVMLLYPIP 306

Query: 459 TLTQLAGFSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDYNKKFLYYVLKSSDVLD 518
           + T+L+       N   +    N   ++   ++L            +F+         LD
Sbjct: 307 SSTRLSFLYLGSKNSKPKNLFRNYLSRLHREEDL------------QFI---------LD 345

Query: 519 ILVPILYH----LNDSRADQSRVGLMHIGVFILLL------KKFLYYVLKSSDVLDILVP 568
            L  IL +     +    +  +V      + ILL       K+F  YV      LD+LV 
Sbjct: 346 GLTRILKNPIQANSSYLPNSPKVVPWAPEMLILLWELIQCNKRFRNYVAD-KRALDLLVF 404

Query: 569 ILYHLNDSRADQSRVGLMHIGVFILLLLSGERNFGVRLNKPYSA--AVP--MDIPVFTG 623
           ILY+L   R D S+ GL+ + VFILL LS E+NFGV+LNKP++A   +P    +PVFTG
Sbjct: 405 ILYYLFYYRNDPSKKGLVPLCVFILLYLSSEKNFGVKLNKPFNANNKLPNFFKLPVFTG 463



 Score =  216 bits (552), Expect = 3e-58
 Identities = 89/257 (34%), Positives = 122/257 (47%), Gaps = 41/257 (15%)

Query: 1   MGNSDTKLNFRQAVIQLTSKTQPIDASDNTFWNQFWSENVTNAQDIFTLIPSAEIRALRE 60
           MGNSD+KLNFR  + +L  K   +D  D+ FW QFW E   + +D+F+LI  A+IR +R+
Sbjct: 1   MGNSDSKLNFRNHIFRLGQKRLIVDPDDDAFWEQFW-ELPESTEDVFSLITPADIRKIRD 59

Query: 61  EAPSNLATLCYKAVEKLVNAVDSSCRMAH--EQQAVLNCTRLLTRLIPYIFEDM---DWR 115
             P NLATL      +L+   +S         +  +LNC RLLTRL+PYIFED    +W 
Sbjct: 60  NNPENLATLIRFLCLRLILLANSPSFPDSLAPRSELLNCIRLLTRLLPYIFEDEYLEEWE 119

Query: 116 G-FFWSSLPSKST----SDENEGEGPDKAEDL---------------------------- 142
             FFWS+ P        SD    E   + ED                             
Sbjct: 120 DDFFWSTNPKPLRGAQASDVLFDESQSEEEDEDAKPLGEELLDALVDLLFCPDFTVPSPR 179

Query: 143 QSIDSCEY-IWERGVGFATSPARCPAHESARTELLKLLLTCFSETIYQPPQDASDNTFWN 201
            S     Y IWE GVG  TSP   P  +S R E+L+LLLT FSE++Y  P  +S+   + 
Sbjct: 180 SSKGKVSYCIWESGVGTNTSPYSNPELDSNRLEILRLLLTLFSESMYLSPSLSSNGNKFL 239

Query: 202 QFWSENVTNAQDIFTLI 218
            +         ++  L+
Sbjct: 240 TYLC-TSAPRHEVLCLL 255



 Score = 75.5 bits (186), Expect = 1e-13
 Identities = 25/50 (50%), Positives = 30/50 (60%), Gaps = 1/50 (2%)

Query: 998  QSIDSCEY-IWERGVGFATSPARCPAHESARTELLKLLLTCFSETIYQPP 1046
             S     Y IWE GVG  TSP   P  +S R E+L+LLLT FSE++Y  P
Sbjct: 180  SSKGKVSYCIWESGVGTNTSPYSNPELDSNRLEILRLLLTLFSESMYLSP 229


>gnl|CDD|220375 pfam09742, Dymeclin, Dyggve-Melchior-Clausen syndrome protein.
           Dymeclin (Dyggve-Melchior-Clausen syndrome protein)
           contains a large number of leucine and isoleucine
           residues and a total of 17 repeated dileucine motifs. It
           is characteristically about 700 residues long and
           present in plants and animals. Mutations in the gene
           coding for this protein in humans give rise to the
           disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM
           223800) which is an autosomal-recessive disorder
           characterized by the association of a
           spondylo-epi-metaphyseal dysplasia and mental
           retardation. DYM transcripts are widely expressed
           throughout human development and Dymeclin is not an
           integral membrane protein of the ER, but rather a
           peripheral membrane protein dynamically associated with
           the Golgi apparatus.
          Length = 659

 Score =  224 bits (573), Expect = 5e-62
 Identities = 132/641 (20%), Positives = 201/641 (31%), Gaps = 215/641 (33%)

Query: 1   MGNSDTK---LNFRQAVIQLTSKTQPIDASDNTFWNQFWSENV---TNAQDIFTLIPSAE 54
           MG S +    L+FR+A+I+L   T P+   D+ FW++  S ++   T++ DIFTL+ +A 
Sbjct: 1   MGASISTESKLSFREALIRL-VGTDPVPPDDDPFWDELLSFSILIPTSSSDIFTLLDAAL 59

Query: 55  IRALREEAPSNLATLCYKAVEKLVNAVDSSCRMA--HEQQAVLNCTRLLTRLIPYIFEDM 112
            R LR   P+NLAT    A+ ++     S  +++   +    +N  R+L R+I Y+ E  
Sbjct: 60  ERILRSLFPNNLATGNLAALVRVFLRQLSELKISEDKQDSQAINALRILRRIIKYLIESG 119

Query: 113 ---DWRGFFWS--------SLPSKSTSDENEGEGPDKAEDLQSIDSCEYIWERGVGFATS 161
              +  GFFWS         L +       +   P K  DL ++ S E   E  VG   S
Sbjct: 120 SESELLGFFWSTDPAGGVRPLATTLFEALTDLLFPAKGADLSTLSSFEDFLEALVGILVS 179

Query: 162 PARCPAHESARTELLKLLLTCFSETIYQPPQDASDNTFWNQFWSENVTNAQDIFTLIPAA 221
           P    A      E+L+LLL   S  +Y  P D S + F+    S                
Sbjct: 180 PPVNDATYLIHLEILRLLLVLLSSQLYIDPSDESGSPFYRSITS---------------- 223

Query: 222 EIRALREEAPSNLATLCYKAVEKLVNA-VDSSCRMAHEQQAVLNCTRLLTRLIPYIFEDM 280
                                  L  + +++            +         P +   +
Sbjct: 224 ---------------AENSHAGPLFTSLLNNFLARDPV-PYPYSHLLFSDGSQPGVLFGI 267

Query: 281 DWRGFFWSSLPSKSTSDDNEGEGSTPAEDVESVPLAQSLLNAICDLLFCPDFTVASNKCT 340
                  SSL             S+        PLA   L  +  LL         +   
Sbjct: 268 --ASSSSSSLVFTFGGGKVSNSESSR------SPLANVSLQLLLVLLDHCPPESPKD--- 316

Query: 341 CQVFLMLGTQDNLFINYLSRIHRDEDFQFVLAGFSRLYQDGNRTRGLALTRHLGTQDNLF 400
              F        L + YLS I R                  +   G              
Sbjct: 317 -NPF-------RLALFYLSDIERSSSV--------------SSLSGSE------------ 342

Query: 401 INYLSRIHRDEDFQFVLAGFSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDYNKTL 460
                           L  FS L           +S  K+ FH ELL+L +         
Sbjct: 343 ---------------NLIDFSAL----------YDSLCKLLFHDELLLLLY--------- 368

Query: 461 TQLAGFSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDYNKKFLYYVLKSSDVLDIL 520
                                                K+   N +FL YVL  SD+ ++ 
Sbjct: 369 -------------------------------------KLLHRNSRFLSYVLSRSDLDNL- 390

Query: 521 VPILYHLNDSRADQSRVGLMHIGVFILLLKKFLYYVLKSSDVLDILVPILYHLNDSRADQ 580
                                                        +VPIL  L D+ +DQ
Sbjct: 391 ---------------------------------------------IVPILELLYDAESDQ 405

Query: 581 SRVGLMHIGVFILLLLSGERNFGVRLNKPYSAAVPMDIPVF 621
           S    +++ + ILL+LS +RNF   L+K Y  +VP      
Sbjct: 406 SNSHHIYMALIILLILSQDRNFNRSLHKTYIKSVPWYTERS 446



 Score =  220 bits (562), Expect = 1e-60
 Identities = 74/246 (30%), Positives = 110/246 (44%), Gaps = 43/246 (17%)

Query: 684 KFLYYVLKSSDVLDILVPILYHLNDSRADQSRVGLMHIGVFILLLLSGERNFGVRLNKPY 743
           +FL YVL  SD+ +++VPIL  L D+ +DQS    +++ + ILL+LS +RNF   L+K Y
Sbjct: 376 RFLSYVLSRSDLDNLIVPILELLYDAESDQSNSHHIYMALIILLILSQDRNFNRSLHKTY 435

Query: 744 SAAVPMDIPV--FTGTHADLLMIVFHKVITSGHNRLQ--PLFDCLFTILVNVSPYLKTLS 799
             +VP       F  +   LL++V  + I     +L+   L      IL N+SPY K LS
Sbjct: 436 IKSVPWYTERSLFEISLGSLLVLVLIRTIQYNMKKLRDKYLHTNCLAILANMSPYFKNLS 495

Query: 800 MVASTKLLHLLEAFASPAFLYTGETNHHLVFFLLEIFNNIIQYQFDVSPYLKTLSMVAST 859
             A+ +L+ LLE  +   F  +   N  L                              +
Sbjct: 496 PYAAQRLVSLLELLSRKHFKLSSLINDRLS------------------------ESSEFS 531

Query: 860 KLLHLLEAFASPAFLYTGETNHHLVFFLLEIFNNIIQYQFDGNSNLVYTIIRKRQIFYAL 919
             L +LE                 +  LLEI N+I+ YQ D N  LVY ++RKR++F   
Sbjct: 532 DDLAVLEEV---------------LRLLLEILNSILTYQLDSNPELVYALLRKRELFEQF 576

Query: 920 SNLPSD 925
            N    
Sbjct: 577 RNDHPA 582



 Score = 95.6 bits (238), Expect = 6e-20
 Identities = 34/80 (42%), Positives = 44/80 (55%), Gaps = 3/80 (3%)

Query: 1095 SWKDKLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILKFLQHGTLV-GLLPVP-HPILI 1152
                +LPLQ I R+L+    +VE I  D G  D SEIL+ ++ GTLV    P+   P LI
Sbjct: 581  PAFQELPLQNIDRVLEFFNSRVESIGADIGS-DVSEILEVIEKGTLVWPSDPLKKFPPLI 639

Query: 1153 RKYQANSGTTTWFRTYMWGV 1172
             KY    GT  +F  Y+WGV
Sbjct: 640  FKYVEEEGTEEFFIPYVWGV 659



 Score = 63.6 bits (155), Expect = 5e-10
 Identities = 24/67 (35%), Positives = 29/67 (43%)

Query: 980  TLNATLADTPGPDKAEDLQSIDSCEYIWERGVGFATSPARCPAHESARTELLKLLLTCFS 1039
            TL   L D   P K  DL ++ S E   E  VG   SP    A      E+L+LLL   S
Sbjct: 143  TLFEALTDLLFPAKGADLSTLSSFEDFLEALVGILVSPPVNDATYLIHLEILRLLLVLLS 202

Query: 1040 ETIYQPP 1046
              +Y  P
Sbjct: 203  SQLYIDP 209


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 39.0 bits (91), Expect = 0.017
 Identities = 39/165 (23%), Positives = 56/165 (33%), Gaps = 12/165 (7%)

Query: 937  GGRKRSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTLNATLADTPGPDKAED 996
                RS+P   L T AP+SP+      SP+  G     P  P    A+   +P PD +E 
Sbjct: 81   ANESRSTPTWSLSTLAPASPA---REGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEM 137

Query: 997  LQSIDSCEYIWERGVGFATSPARCPAHESARTELLKLLLTCFSETIYQPPQGIEKMTEKE 1056
            L+ + S           A +     A ++A +    L L+   ET   P     +     
Sbjct: 138  LRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEETARAPSSPPAEPPPST 197

Query: 1057 SAHPGGENENVVAKPSSISSPIQHSHPRVWTPTPEWVKSWKDKLP 1101
                         +P   SSPI  S      P P   +S  D   
Sbjct: 198  PPAAAS------PRPPRRSSPISASASS---PAPAPGRSAADDAG 233


>gnl|CDD|237782 PRK14666, uvrC, excinuclease ABC subunit C; Provisional.
          Length = 694

 Score = 37.9 bits (88), Expect = 0.035
 Identities = 33/134 (24%), Positives = 46/134 (34%), Gaps = 23/134 (17%)

Query: 941  RSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTLNATLADTPG-------PDK 993
            R+   P  L +   +  A  +P+  +         A    L  TLAD  G       P  
Sbjct: 373  RTGTAPTSLANVSHADPAVAQPTQAATLAGAAPKGATHLMLEETLADLRGGPVRIVPPRN 432

Query: 994  AEDLQSIDSCEYIWERGVGFATSPARCPAHESARTELLKLLLTCFSETIYQPPQGIEKMT 1053
              + + +D            A S AR  A   A T L  LL       +  PP  IE + 
Sbjct: 433  PAENRLVD-----------MAMSNAREEARRKAETPLQDLLARALH--LSGPPHRIEAV- 478

Query: 1054 EKESAHPGGENENV 1067
              + +H GG N  V
Sbjct: 479  --DVSHTGGRNTRV 490


>gnl|CDD|238159 cd00256, VATPase_H, VATPase_H, regulatory vacuolar ATP synthase
           subunit H (Vma13p); activation component of the
           peripheral V1 complex of V-ATPase, a heteromultimeric
           enzyme which uses ATP to actively transport protons
           into organelles and extracellular compartments. The
           topology is that of a superhelical spiral, in part the
           geometry is similar to superhelices composed of
           armadillo repeat motifs, as found in importins for
           example.
          Length = 429

 Score = 36.6 bits (85), Expect = 0.086
 Identities = 42/221 (19%), Positives = 73/221 (33%), Gaps = 49/221 (22%)

Query: 352 NLFINYLSRIHRDEDFQFVLAGFSRLYQDGNRTRGLAL---TRHLGTQDNLFINYLSRIH 408
             F+N LS+I +D+  ++VL     + Q+ + TR                 F N L+R  
Sbjct: 56  KTFVNLLSQIDKDDTVRYVLTLIDDMLQE-DDTRVKLFHDDALLKKKTWEPFFNLLNR-- 112

Query: 409 RDEDFQFVLAGFSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDYNKTLTQLAGF-- 466
                QF++     +L              K+E   +L   F  + +    +T       
Sbjct: 113 ---QDQFIVHMSFSILA-----KLACFGLAKME-GSDLDYYFNWLKEQLNNITNNDYVQT 163

Query: 467 -SRLLNN-----------------PLLQTYLPNSTKKIEFHQELLVLFW------KMCDY 502
            +R L                   P L   L N+T   +   + +   W         + 
Sbjct: 164 AARCLQMLLRVDEYRFAFVLADGVPTLVKLLSNATLGFQLQYQSIFCIWLLTFNPHAAEV 223

Query: 503 NKKF-----LYYVLKSS---DVLDILVPILYHLNDSRADQS 535
            K+      L  +LK S    V+ I++ I  +L   R D+ 
Sbjct: 224 LKRLSLIQDLSDILKESTKEKVIRIVLAIFRNLISKRVDRE 264


>gnl|CDD|147982 pfam06112, Herpes_capsid, Gammaherpesvirus capsid protein.  This
           family consists of several Gammaherpesvirus capsid
           proteins. The exact function of this family is unknown.
          Length = 148

 Score = 32.5 bits (74), Expect = 0.57
 Identities = 27/114 (23%), Positives = 39/114 (34%), Gaps = 13/114 (11%)

Query: 881 HHLVFFLLEIFNNIIQYQFDGNSNLVYTIIRKRQIFYALSNLPSDSASIHRALSVSGGRK 940
           ++LVF        I Q+ +D     ++ I RK+ +       P  S+SI  ALS S    
Sbjct: 47  NYLVFL-------IAQHCYDQYVRRMHGIRRKKHLQALRGAGPQTSSSIGSALSASS--- 96

Query: 941 RSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTLNATLADTPGPDKA 994
            S+   P   +  S  S     S P    S  +L              P   K 
Sbjct: 97  SSASGVPGGANQLSGSSGSALSSGPGSLSSSSSLSGSGAG---AGDTAPSSSKK 147


>gnl|CDD|191356 pfam05725, FNIP, FNIP Repeat.  This repeat is approximately 22
           residues long and is only found in Dictyostelium
           discoideum. It appears to be related to pfam00560
           (personal obs:C Yeats). The alignment consists of two
           tandem repeats. It is termed the FNIP repeat after the
           pattern of conserved residues.
          Length = 44

 Score = 29.4 bits (67), Expect = 0.86
 Identities = 19/65 (29%), Positives = 22/65 (33%), Gaps = 24/65 (36%)

Query: 425 NNPLLQTYLPNSTKKIEFHQELLVLFWKMCDYNKTLTQLAGFSRLLNNPLLQTYLPNSTK 484
           N PL +  LPNS K + F                            N PL    LPNS K
Sbjct: 2   NQPLEKGSLPNSLKSLIFGDSF------------------------NQPLEIGVLPNSLK 37

Query: 485 KIEFH 489
            +EF 
Sbjct: 38  SLEFG 42


>gnl|CDD|130706 TIGR01645, half-pint, poly-U binding splicing factor, half-pint
            family.  The proteins represented by this model contain
            three RNA recognition motifs (rrm: pfam00076) and have
            been characterized as poly-pyrimidine tract binding
            proteins associated with RNA splicing factors. In the
            case of PUF60 (GP|6176532), in complex with p54, and in
            the presence of U2AF, facilitates association of U2 snRNP
            with pre-mRNA.
          Length = 612

 Score = 33.1 bits (75), Expect = 1.0
 Identities = 34/136 (25%), Positives = 47/136 (34%), Gaps = 27/136 (19%)

Query: 935  VSGGRKRSSPLPPLLTSAPS----SPSAGVEPSSPSMEGSKPALPAEPGTLNAT------ 984
            VS  +K +  +PPL  +AP+     P     P  P      P+L A PG +  T      
Sbjct: 348  VSSAKKEAEEVPPLPQAAPAVVKPGPMEIPTPVPPPGLAI-PSLVAPPGLVAPTEINPSF 406

Query: 985  LADTPGPDKAEDLQSIDSCEYIWERGVGF----ATSPARCPAHESARTELLKLLLTCFSE 1040
            LA      K E L             V F     T   + P+ E   +E  K+L      
Sbjct: 407  LASPRKKMKREKL------------PVTFGALDDTLAWKEPSKEDQTSEDGKMLAIMGEA 454

Query: 1041 TIYQPPQGIEKMTEKE 1056
                  +  +K  EKE
Sbjct: 455  AAALALEPKKKKKEKE 470


>gnl|CDD|216257 pfam01034, Syndecan, Syndecan domain.  Syndecans are transmembrane
            heparan sulfate proteoglycans which are implicated in the
            binding of extracellular matrix components and growth
            factors.
          Length = 207

 Score = 32.0 bits (73), Expect = 1.3
 Identities = 23/81 (28%), Positives = 30/81 (37%), Gaps = 17/81 (20%)

Query: 942  SSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPAL--------PAEPGTLNATLAD-----T 988
            +S  PP LT+  SSPS     +S S + S            P+E  T  AT        T
Sbjct: 73   TSATPPKLTTTSSSPSNDTTTASTSTKTSPTVSTTVTTTTSPSETDTEEATTTVSTETPT 132

Query: 989  PGPDKAEDLQSIDSCEYIWER 1009
             G   A    S    + + ER
Sbjct: 133  EGGSSAATDPS----KNLLER 149


>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
          Length = 943

 Score = 32.4 bits (73), Expect = 1.8
 Identities = 19/65 (29%), Positives = 27/65 (41%), Gaps = 9/65 (13%)

Query: 944 PLPPLLTSAPSSPSAGV-EPSSPSMEGSKPALPAEPGT--LNATLADTPGPD------KA 994
           PLPP L      P   + +P +   +  +   P E      + T ADTP PD      K 
Sbjct: 720 PLPPKLPRDEEFPFEPIGDPDAEQPDDIEFFTPPEEERTFFHETPADTPLPDILAEEFKE 779

Query: 995 EDLQS 999
           ED+ +
Sbjct: 780 EDIHA 784


>gnl|CDD|152960 pfam12526, DUF3729, Protein of unknown function (DUF3729).  This
           family of proteins is found in viruses. Proteins in this
           family are typically between 145 and 1707 amino acids in
           length. The family is found in association with
           pfam01443, pfam01661, pfam05417, pfam01660, pfam00978.
           There is a single completely conserved residue L that
           may be functionally important.
          Length = 115

 Score = 30.4 bits (69), Expect = 1.9
 Identities = 15/69 (21%), Positives = 18/69 (26%), Gaps = 5/69 (7%)

Query: 931 RALSVSGGRKRSSPLPPLLTSAPSS--PSAGVEPSSPSMEGSKPALP--AEPGTLNATLA 986
           R  S SG     SP        P            +P    +   LP  +EP        
Sbjct: 18  RTWSTSGFSSCFSPPESAHPDDPPPVGDPRPPVVDTPPPVSAVWVLPPPSEPAAPPPDPE 77

Query: 987 DTPGPDKAE 995
             P P  A 
Sbjct: 78  P-PVPGPAG 85


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 32.2 bits (73), Expect = 2.2
 Identities = 16/82 (19%), Positives = 23/82 (28%), Gaps = 12/82 (14%)

Query: 923  PSDSASIHRALSVSGGRKRSSPLPPLLTSAPS-----------SPSAGVEPSSPSMEGSK 971
            P   +    A         + P P      P+                 + SSP     +
Sbjct: 2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRR 2685

Query: 972  PALPAEPGTLNATLADTPGPDK 993
             A     G+L  +LAD P P  
Sbjct: 2686 RAARPTVGSL-TSLADPPPPPP 2706



 Score = 32.2 bits (73), Expect = 2.5
 Identities = 15/72 (20%), Positives = 23/72 (31%), Gaps = 1/72 (1%)

Query: 923  PSDSASIHRALSVSGGRKRSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTLN 982
            P    S           +++SP  P   + P+ P+    P  P+     P   A P    
Sbjct: 2712 PHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPAR-PPTTAGPPAPA 2770

Query: 983  ATLADTPGPDKA 994
               A   GP + 
Sbjct: 2771 PPAAPAAGPPRR 2782



 Score = 30.7 bits (69), Expect = 6.9
 Identities = 18/71 (25%), Positives = 23/71 (32%)

Query: 922  LPSDSASIHRALSVSGGRKRSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTL 981
            LP  +       SV   R    P  P +TS    P A  + + P         P  P   
Sbjct: 2555 LPPAAPPAAPDRSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPP 2614

Query: 982  NATLADTPGPD 992
            +    DT  PD
Sbjct: 2615 SPLPPDTHAPD 2625


>gnl|CDD|218621 pfam05518, Totivirus_coat, Totivirus coat protein. 
          Length = 753

 Score = 32.1 bits (73), Expect = 2.4
 Identities = 42/166 (25%), Positives = 66/166 (39%), Gaps = 34/166 (20%)

Query: 863  HLLEAFASPAFLYTGETNHHLV------FFLLEIFNNIIQYQFDGN--------SNLVYT 908
            HLL A A    +  G  + HL       +F +E   ++I + F G         S  +Y 
Sbjct: 385  HLLTAGAR-YLVADGNPDRHLRMGTVAPYFWVEP-TSLIPHDFFGTPAESAGYGSQCIYG 442

Query: 909  IIRKRQIFYALSNLPSDSASIHRALSVSGGRKRSSPLPPLLTSAPSSPSAGVE----PSS 964
              R +  F A+  L  +  + + A  V     R+SPL     + P +  A +      S+
Sbjct: 443  TRRTQPAFEAIE-LTGERDTTYSAYRVRMRSARTSPLLAHTNNHPGNGLANLRVRQFDSN 501

Query: 965  PSMEGSKPALPAEPGTLNATLADTPGPDKAEDLQSIDSCEYIWERG 1010
              +      LP +PG  NAT        K E   ++D  +Y+W RG
Sbjct: 502  AVV------LPGQPGPTNATCR-----HKVEARLTLD--DYLWGRG 534


>gnl|CDD|188696 cd08742, RGS_RGS12, Regulator of G protein signaling (RGS) domain
           found in the RGS12 protein.  RGS (Regulator of G-protein
           Signaling) domain is an essential part of the RGS12
           protein. RGS12 is a member of the RA/RGS subfamily of
           RGS proteins family, a diverse group of multifunctional
           proteins that regulate cellular signaling events
           downstream of G-protein coupled receptors (GPCRs). As a
           major G-protein regulator, RGS domain containing
           proteins are involved in many crucial cellular processes
           such as regulation of intracellular trafficking, glial
           differentiation, embryonic axis formation, skeletal and
           muscle development, and cell migration during early
           embryogenesis. RGS12 belongs to the R12 RGS subfamily,
           which includes RGS10 and RGS14, all of which are highly
           selective for G-alpha-i1 over G-alpha-q.  RGS12 exists in
           multiple splice variants: RGS12s (short) contains the
           core RGS/RBD/GoLoco domains, while RGS12L (long) has
           additional N-terminal PDZ and PTB domains. RGS12 splice
           variants show distinct expression patterns, suggesting
           that they have discrete functions during mouse
           embryogenesis. RGS12 also may play a critical role in
           coordinating Ras-dependent signals that are required for
           promoting and maintaining neuronal differentiation.
          Length = 115

 Score = 30.0 bits (67), Expect = 2.9
 Identities = 15/37 (40%), Positives = 22/37 (59%), Gaps = 2/37 (5%)

Query: 420 FSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDY 456
           F RLL +P+   Y     +K EF +E  +LFW+ C+Y
Sbjct: 1   FERLLQDPVGVRYFSEFLRK-EFSEEN-ILFWQACEY 35



 Score = 30.0 bits (67), Expect = 2.9
 Identities = 15/37 (40%), Positives = 22/37 (59%), Gaps = 2/37 (5%)

Query: 466 FSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDY 502
           F RLL +P+   Y     +K EF +E  +LFW+ C+Y
Sbjct: 1   FERLLQDPVGVRYFSEFLRK-EFSEEN-ILFWQACEY 35


>gnl|CDD|215130 PLN02217, PLN02217, probable pectinesterase/pectinesterase
           inhibitor.
          Length = 670

 Score = 31.6 bits (71), Expect = 3.3
 Identities = 11/55 (20%), Positives = 22/55 (40%)

Query: 942 SSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTLNATLADTPGPDKAED 996
           +SP    L S P++PS  V PS+        +    P +  +++        + +
Sbjct: 597 TSPPAGHLGSPPATPSKIVSPSTSPPASHLGSPSTTPSSPESSIKVASTETASPE 651


>gnl|CDD|224877 COG1966, CstA, Carbon starvation protein, predicted membrane
           protein [Signal transduction mechanisms].
          Length = 575

 Score = 31.5 bits (72), Expect = 3.5
 Identities = 38/152 (25%), Positives = 64/152 (42%), Gaps = 30/152 (19%)

Query: 585 LMHIGVFILLLLSGERNFG---VRL--NKPYSA-AVPMDIP--VFTGLVLRADQSRVGLM 636
              +   ILL+L G   F     +L  N P+    V + IP  V  G+ L   +  +G+ 
Sbjct: 131 FFLLLALILLILVGA-VFAAVIAKLLANSPWGVFTVFLTIPLAVLMGIYLYRLRGNMGIS 189

Query: 637 HIGVFILLLLSGERNFGVRLNKPYSAAVPMDIPVFTGNSINILMEDLKFLYYVLKSS-DV 695
            +    LL+L+              + VP+D P+FTG+ + ++   + F+Y  + S   V
Sbjct: 190 SVIGLALLILA----------IYLGSVVPIDAPIFTGSLVILVWIIILFIYAFIASVLPV 239

Query: 696 LDILVPILYHLNDSRADQSRVGLMHIGVFILL 727
             +L P   +LN         G + IG  + L
Sbjct: 240 WLLLQPR-DYLN---------GFLLIGGLVGL 261


>gnl|CDD|219094 pfam06583, Neogenin_C, Neogenin C-terminus.  This family represents
           the C-terminus of eukaryotic neogenin precursor
           proteins, which contains several potential
           phosphorylation sites. Neogenin is a member of the N-CAM
           family of cell adhesion molecules (and therefore
           contains multiple copies of pfam00047 and pfam00041) and
           is closely related to the DCC tumour suppressor gene
           product - these proteins may play an integral role in
           regulating differentiation programmes and/or cell
           migration events within many adult and embryonic
           tissues.
          Length = 295

 Score = 31.1 bits (70), Expect = 3.7
 Identities = 21/62 (33%), Positives = 27/62 (43%), Gaps = 13/62 (20%)

Query: 941 RSSPLPPLLTSA-------------PSSPSAGVEPSSPSMEGSKPALPAEPGTLNATLAD 987
           R +P PP L +A              S P+A V P+ P    + PALPA   T+   L  
Sbjct: 154 RQTPEPPYLPAAQSESSNAAEEAPSRSIPTAHVRPTHPLKSFAVPALPASMSTIEPKLPS 213

Query: 988 TP 989
           TP
Sbjct: 214 TP 215


>gnl|CDD|182175 PRK09971, PRK09971, xanthine dehydrogenase subunit XdhB;
           Provisional.
          Length = 291

 Score = 30.4 bits (69), Expect = 5.1
 Identities = 10/37 (27%), Positives = 18/37 (48%), Gaps = 3/37 (8%)

Query: 401 INYLSRIHRDEDFQFV---LAGFSRLLNNPLLQTYLP 434
           I  L  I   ED          F++++ +P++Q +LP
Sbjct: 55  IAELRGITLAEDGSIRIGAATTFTQIIEDPIIQKHLP 91


>gnl|CDD|233045 TIGR00601, rad23, UV excision repair protein Rad23.  All proteins
           in this family for which functions are known are
           components of a multiprotein complex used for targeting
           nucleotide excision repair to specific parts of the
           genome. In humans, Rad23 complexes with the XPC protein.
           This family is based on the phylogenomic analysis of JA
           Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 378

 Score = 30.6 bits (69), Expect = 5.4
 Identities = 22/93 (23%), Positives = 36/93 (38%), Gaps = 6/93 (6%)

Query: 907 YTIIRKRQIFYALSNLPSDSASIHRALSVSGGRKRSSPLPPLLTSAPSS--PSAGVEPSS 964
           Y I  K  +   +S   + +  +    +        +P PP   ++  S  P++ VE  S
Sbjct: 62  YKIKEKDFVVVMVSKPKTGTGKVAPPAATPTSAPTPTPSPPASPASGMSAAPASAVEEKS 121

Query: 965 PSMEGSKPALPAEPGTLNATLADTPGPDKAEDL 997
           PS E +    P  P    +T   + G D A  L
Sbjct: 122 PSEESATATAPESP----STSVPSSGSDAASTL 150


>gnl|CDD|234938 PRK01297, PRK01297, ATP-dependent RNA helicase RhlB; Provisional.
          Length = 475

 Score = 30.7 bits (69), Expect = 5.9
 Identities = 18/63 (28%), Positives = 25/63 (39%), Gaps = 5/63 (7%)

Query: 931 RALSVSGGRKRSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGTLNATLADTPG 990
           +AL    G+  +    P    AP SP+A   P  P+   +     A P    A  A+ P 
Sbjct: 3   KALKKIFGKGEAEQPAP----APPSPAAAPAPPPPAKTAAPATKAAAPAA-AAPRAEKPK 57

Query: 991 PDK 993
            DK
Sbjct: 58  KDK 60


>gnl|CDD|223033 PHA03291, PHA03291, envelope glycoprotein I; Provisional.
          Length = 401

 Score = 30.3 bits (68), Expect = 6.7
 Identities = 21/93 (22%), Positives = 30/93 (32%), Gaps = 19/93 (20%)

Query: 942  SSPLPPLLTSAP----SSPSAGVEPSSPSMEGSKPALPAEPGTLNATLADTPGPDKAEDL 997
            ++P P   T+A      +PS    P S ++      + A           TP P      
Sbjct: 206  ATPRPTPRTTASPETTPTPSTTTSPPSTTIPAPSTTIAAPQAGTTPEAEGTPAPPTP--- 262

Query: 998  QSIDSCEYIWERGVGFATSPARCPAHESARTEL 1030
                        G G A      PA E++R EL
Sbjct: 263  ------------GGGEAPPANATPAPEASRYEL 283


>gnl|CDD|162054 TIGR00815, sulP, high affinity sulphate transporter 1.  The SulP
           family is a large and ubiquitous family with over 30
           sequenced members derived from bacteria, fungi, plants
           and animals. Many organisms including Bacillus subtilis,
           Synechocystis sp, Saccharomyces cerevisiae, Arabidopsis
           thaliana and Caenorhabditis elegans possess multiple
           SulP family paralogues. Many of these proteins are
           functionally characterized, and all are sulfate uptake
           transporters. Some transport their substrate with high
           affinities, while others transport it with relatively
           low affinities. Most function by SO42-:H+ symport, but
           SO42-:HCO3- antiport has been reported for the rat
           protein (spP45380). The bacterial proteins vary in size
           from 434 residues to 566 residues with one exception, a
           Mycobacterium tuberculosis protein with 784 residues.
           The eukaryotic proteins vary in size from 611 residues
           to 893 residues with one exception, a protein designated
           "early nodulin 70 protein" from Glycine max which is
           reported to be of 485 residues. Thus, the eukaryotic
           proteins are almost without exception larger than the
           prokaryotic proteins. These proteins exhibit 10-13
           putative transmembrane a-helical spanners (TMSs)
           depending on the protein. The phylogenetic tree for the
           SulP family reveals five principal branches. Three of
           these are bacterial specific as follows: one bears a
           single protein from M. tuberculosis; a second bears two
           proteins, one from M. tuberculosis, the other from
           Synechocystis sp, and the third bears all remaining
           prokaryotic proteins. The remaining two clusters bear
           only eukaryotic proteins with the animal proteins all
           localized to one branch and the plant and fungal
           proteins localized to the other. The generalized
           transport reactions catalyzed by SulP family proteins
           are: (1) SO42- (out) + nH+ (out) --> SO42- (in) + nH+
           (in). (2) SO42- (out) + nHCO3- (in) <--> SO42- (in) + nHCO3-
           (out) [Transport and binding proteins, Anions].
          Length = 563

 Score = 30.4 bits (69), Expect = 7.2
 Identities = 31/157 (19%), Positives = 58/157 (36%), Gaps = 19/157 (12%)

Query: 672 TGNSINILMEDLK-FLYY--VLKSSDVLDILVPILYHLNDSRADQSRVGLMHIGVFILLL 728
           TG +I I +  LK  L        +D L +++     L ++        +  IG+ +LL 
Sbjct: 135 TGAAITIGLSQLKGLLGISIFNTRTDTLGVVISTWAGLPNTHNWNWCTLV--IGLVLLLF 192

Query: 729 LSGERNFGVRLNKPYSAAVPM---------DIPVFTGTHADLLMIVFHKVITSGHNRLQP 779
           L   +  G R  K   A                         + I+ H  I SG +   P
Sbjct: 193 LLYTKKLGKRNKKLLFAPAVAPLLVVILATLAVTIGLHKKQGVSILGH--IPSGLSFFPP 250

Query: 780 L-FDCLFTILVNVSPYLKTLSMVASTKLLHLLEAFAS 815
           +  D  + +L  ++P    +++V   + + +  +FA 
Sbjct: 251 ITLD--WELLPTLAPDAIAIAIVGLIESIAIARSFAR 285


>gnl|CDD|218397 pfam05044, Prox1, Homeobox prospero-like protein (PROX1).  The
           homeobox gene Prox1 is expressed in a subpopulation of
           endothelial cells that, after budding from veins, gives
           rise to the mammalian lymphatic system. Prox1 has been
           found to be an early specific marker for the developing
           liver and pancreas in the mammalian foregut endoderm.
           This family contains an atypical homeobox domain.
          Length = 908

 Score = 30.4 bits (68), Expect = 7.5
 Identities = 13/49 (26%), Positives = 16/49 (32%), Gaps = 1/49 (2%)

Query: 932 ALSVSGGRKRSSPLPPLLTSAPSSPSAGVEPSSPSMEGSKPALPAEPGT 980
           AL  + G    +   PL  S+ S  S G  P         P   A P  
Sbjct: 603 ALRDAVGPAAGTHHQPLHPSSLS-ASMGFHPPPFRHPFPLPLTVAIPNP 650


>gnl|CDD|188697 cd08743, RGS_RGS14, Regulator of G protein signaling (RGS) domain
           found in the RGS14 protein.  RGS (Regulator of G-protein
           Signaling) domain is an essential part of the RGS14
           protein. RGS14 is a member of the RA/RGS subfamily of
           the RGS protein family, a diverse group of multifunctional
           proteins that regulate cellular signaling events
           downstream of G-protein coupled receptors (GPCRs). As a
           major G-protein regulator, RGS domain containing
           proteins are involved in many crucial cellular processes
           such as regulation of intracellular trafficking, glial
           differentiation, embryonic axis formation, skeletal and
           muscle development, and cell migration during early
           embryogenesis. RGS14 belongs to the R12 RGS subfamily,
           which includes RGS10 and RGS12, all of which are highly
           selective for G-alpha-i1 over G-alpha-q.  RGS14 binds
           and regulates the subcellular localization and
           activities of H-Ras and Raf kinases in cells and
           thereby integrates G protein and Ras/Raf signaling
           pathways.
          Length = 129

 Score = 28.8 bits (64), Expect = 8.0
 Identities = 23/63 (36%), Positives = 28/63 (44%), Gaps = 7/63 (11%)

Query: 420 FSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCDY-----NKTLTQLAGFSRLLNNPL 474
           F RLL +PL   Y     KK EF  E  V FWK C+           QLA  +R + N  
Sbjct: 11  FERLLQDPLGVEYFTEFLKK-EFSAE-NVNFWKACERFQQIPASDTQQLAQEARKIYNEF 68

Query: 475 LQT 477
           L +
Sbjct: 69  LSS 71



 Score = 28.5 bits (63), Expect = 9.6
 Identities = 17/36 (47%), Positives = 19/36 (52%), Gaps = 2/36 (5%)

Query: 466 FSRLLNNPLLQTYLPNSTKKIEFHQELLVLFWKMCD 501
           F RLL +PL   Y     KK EF  E  V FWK C+
Sbjct: 11  FERLLQDPLGVEYFTEFLKK-EFSAE-NVNFWKACE 44


>gnl|CDD|184287 PRK13735, PRK13735, conjugal transfer mating pair stabilization
           protein TraG; Provisional.
          Length = 942

 Score = 30.1 bits (68), Expect = 9.6
 Identities = 11/52 (21%), Positives = 22/52 (42%), Gaps = 5/52 (9%)

Query: 16  QLTSKTQPIDASDNTFWNQFWSENVTN-----AQDIFTLIPSAEIRALREEA 62
           ++ S+T+ +    +   +Q +++ V       A+ I T   S EI   R   
Sbjct: 744 EMASRTESMSGQMSENLSQQFAQYVMKHAPQDAEAILTNTSSPEIAERRRAM 795


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.322    0.137    0.414 

Gapped
Lambda     K      H
   0.267   0.0702    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 60,772,031
Number of extensions: 6044764
Number of successful extensions: 4906
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4848
Number of HSP's successfully gapped: 59
Length of query: 1196
Length of database: 10,937,602
Length adjustment: 108
Effective length of query: 1088
Effective length of database: 6,147,370
Effective search space: 6688338560
Effective search space used: 6688338560
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 65 (28.9 bits)
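
The bit values in parentheses in the statistics block above can be reproduced from the reported Lambda and K values with the standard Karlin-Altschul conversion, S' = (Lambda * S - ln K) / ln 2 for score thresholds and Lambda * X / ln 2 for drop-off values. The sketch below is a consistency check only, not part of the RPS-BLAST output. Which Lambda/K pair applies to which parameter (ungapped for S1 and X1, gapped for S2, X2 and X3) is an assumption inferred from which pair matches the printed figures, and the function names are hypothetical. Per-hit bit scores in the alignments are not checked here.

```python
import math

# Values copied from the statistics block of this report.
UNGAPPED = (0.322, 0.137)    # (Lambda, K), ungapped
GAPPED = (0.267, 0.0702)     # (Lambda, K), gapped


def bit_score(raw_score, lam, k):
    """Karlin-Altschul conversion of a raw score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def dropoff_bits(raw_dropoff, lam):
    """Convert an X drop-off value to bits (no K term involved)."""
    return lam * raw_dropoff / math.log(2)


# Assumed pairing: ungapped parameters for S1/X1, gapped for S2/X2/X3.
s1_bits = bit_score(41, *UNGAPPED)
s2_bits = bit_score(65, *GAPPED)
x1_bits = dropoff_bits(16, UNGAPPED[0])
x2_bits = dropoff_bits(38, GAPPED[0])
x3_bits = dropoff_bits(64, GAPPED[0])

# Effective lengths: query length minus the length adjustment,
# multiplied by the effective database length reported above.
effective_query = 1196 - 108
search_space = effective_query * 6_147_370

if __name__ == "__main__":
    for name, value in [("S1", s1_bits), ("S2", s2_bits), ("X1", x1_bits),
                        ("X2", x2_bits), ("X3", x3_bits)]:
        print(f"{name}: {value:.1f} bits")
    print("Effective length of query:", effective_query)
    print("Effective search space:", search_space)
```

Because the report prints Lambda and K rounded to three significant figures, the recomputed values agree with the printed ones only to about one decimal place.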