Citrus sinensis ID: 026921


Local Sequence Feature Prediction

Prediction (Method) and Result
Tracks below are listed in this order, one per row.
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-
MLFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSMYPEPPSFSNPVDLDIAGDADGSGTPSDDPSQGTIRSSPKMISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSREDGASSSMAEYEDGSQSENYDEDEFYPEDDFNDDDEFYDSDSSDVVPLGSLPVDHIAERLQRK
ccEEEEEcccccEEEEHHHHHHHHHccccEEEEcccccccHHHHHHHHHHHccccEEEEEEcccEEEEEEccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHHHHHHHHHHccccccccHHHHHHHHHHHccccccccccccEEEccccccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHcc
ccHHHHHHHHcccEEEEcHHHHHHHccccEEEEEcccccHHHHHHHHHHHHHcccEEEEEEcccEEEEEEccccccccccccccccccccccccccccccccccccccHHHcccHHHHHHHHHHHHcccEEEHHHccccHHHHHHHHHHHHHHHHHHHccccEEEEEcccccccccccccccccccccccccccccccccccHHHHHcccccccccccccHHHHHHHHccc
MLFFLHIAAKNGVYLTLVRDVRnafegsslvkvnckgmhasdyKKLGAKLKELvpcvllsfddeQILMWRGkdwksmypeppsfsnpvdldiagdadgsgtpsddpsqgtirsspKMISLWKRAIESTKALVLDeinlgpddLLKKVEEFEGISQAAEHSYPALvlsredgasssmaeyedgsqsenydedefypeddfndddefydsdssdvvplgslpvdHIAERLQRK
MLFFLHIAAKNGVYLTLVRDVRNAFEGsslvkvnckgMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSMYPEPPSFSNPVDLDIAGDADGsgtpsddpsqgtirsspKMISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISqaaehsypaLVLSREDGASSSMAEyedgsqsenydeDEFYPEDDFNDDDEFYDSDssdvvplgslpvdhiaerlqrk
MLFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSMYPEPPSFSNPVDLDIAGDADGSGTPSDDPSQGTIRSSPKMISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSREDGASSSMAEYEDGSQSenydedefypeddfndddefydsdssdVVPLGSLPVDHIAERLQRK
*LFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSM****************************************ISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGI******************************************************************************
MLFFLH*AAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKD**********************************************LWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSY*******************************************FYD***SDVVPLGSLPVDHIAERL***
MLFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSMYPEPPSFSNPVDLDIAGDA***************RSSPKMISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSRE********************EDEFYPEDDFNDDDEFYDSDSSDVVPLGSLPVDHIAERLQRK
MLFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSMYPEPPSF**************************IRSSPKMISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSR****************************DDFNDDDEFYDSDSSDVVPLGSLPVDHIAERLQ**
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MLFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKDWKSMYPEPPSFSNPVDLDIAGDADGSGTPSDDPSQGTIRSSPKMISLWKRAIESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSREDGASSSMAEYEDGSQSENYDEDEFYPEDDFNDDDEFYDSDSSDVVPLGSLPVDHIAERLQRK
no confident homologs detected
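
The per-residue tracks above can be turned into residue intervals with a few lines of code. The following is a minimal sketch, assuming that lower-case letters (SEG, DISEMBL rows) and `*` characters (disorder rows) mark the predicted residues; the page does not state this convention explicitly, and the helper name `runs` and the toy strings are hypothetical.

```python
from itertools import groupby

def runs(track, is_hit):
    """Return 1-based inclusive (start, end) intervals where is_hit(char) is true."""
    intervals = []
    pos = 1
    for hit, group in groupby(track, key=is_hit):
        length = len(list(group))
        if hit:
            intervals.append((pos, pos + length - 1))
        pos += length
    return intervals

# Toy tracks for illustration only, not the full rows shown above.
seg_like = "MLFFnafegKKLG"    # assumed: lower case = masked residue
disorder_like = "*LFF***KK"   # assumed: '*' = flagged residue

masked = runs(seg_like, str.islower)
flagged = runs(disorder_like, lambda c: c == "*")
print(masked)
print(flagged)
```

The same function can be applied to any full track row by passing the appropriate predicate, for example `lambda c: c == "h"` for the helix segments in the MEMSAT rows.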

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   231     2.2.26 [Sep-21-2011]
Q9LDA9  564     CRS2-associated factor 2,  yes       no        0.935    0.382    0.550     1e-57
Q84N48  611     CRS2-associated factor 2,  N/A       no        0.948    0.358    0.495     3e-50
Q657G7  607     CRS2-associated factor 2,  yes       no        0.948    0.360    0.457     2e-46
Q9SL79  701     CRS2-associated factor 1,  no        no        0.316    0.104    0.589     3e-22
Q5VMQ5  701     CRS2-associated factor 1,  no        no        0.307    0.101    0.647     4e-22
Q84N49  674     CRS2-associated factor 1,  N/A       no        0.307    0.105    0.633     2e-21
Q8VYD9  405     CRS2-associated factor 1,  no        no        0.290    0.165    0.582     5e-18
Q6Z4U2  428     CRS2-associated factor 1,  no        no        0.285    0.154    0.560     1e-16
Q0J7J7  366     CRS2-associated factor 2,  no        no        0.277    0.174    0.515     8e-14
Q9FFU1  358     CRS2-associated factor 2,  no        no        0.277    0.178    0.5       1e-13
>sp|Q9LDA9|CAF2P_ARATH CRS2-associated factor 2, chloroplastic OS=Arabidopsis thaliana GN=At1g23400 PE=2 SV=1
 Score =  223 bits (567), Expect = 1e-57,   Method: Compositional matrix adjust.
 Identities = 125/227 (55%), Positives = 166/227 (73%), Gaps = 11/227 (4%)

Query: 9   AKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILM 68
           +KNGVY++LV+DVR+AFE SSLVKV+C G+  SDYKK+GAKLKELVPCVLLSFDDEQILM
Sbjct: 343 SKNGVYVSLVKDVRDAFELSSLVKVDCPGLEPSDYKKIGAKLKELVPCVLLSFDDEQILM 402

Query: 69  WRGKDWKSMYPEPPSFSNPVDLDIAGDADGSGTPSDDPS---QGTIRSSPKMISLWKRAI 125
           WRG++WKS + + P   +  + +   + D S  PS++ +     T  SSPKMISLW+RA+
Sbjct: 403 WRGREWKSRFVDNPLIPSLSETNTTNELDPSDKPSEEQTVANPSTTISSPKMISLWQRAL 462

Query: 126 ESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSREDGASSSMAEYEDGSQS 185
           ES+KA++L+E++LGPDDLLKKVEE EG S AAEH+Y A+VLS  DGA+    + +D S+ 
Sbjct: 463 ESSKAVILEELDLGPDDLLKKVEELEGTSLAAEHTYTAMVLSNTDGAAEDYVDEKDRSE- 521

Query: 186 ENYDEDEFYPEDDFNDDDEFYDSDSSDVV-PLGSLPVDHIAERLQRK 231
                 E+Y + D + DDE  D +S D V P+GSLPVD I  +L+ +
Sbjct: 522 ------EYYSDIDDDFDDECSDDESLDPVGPVGSLPVDKIVRKLRER 562
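
The statistics line of a BLAST alignment can be converted to fractions programmatically. This is a minimal sketch that parses the `Identities`, `Positives` and `Gaps` counts from the line reported above; the function name `alignment_stats` is hypothetical, and the regular expression assumes the line layout shown in this report.

```python
import re

# Matches e.g. "Identities = 125/227" and captures name, count, aligned columns.
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")

def alignment_stats(line):
    """Map each statistic name to (count, aligned_columns, fraction)."""
    return {
        name: (int(n), int(d), int(n) / int(d))
        for name, n, d in STAT.findall(line)
    }

line = " Identities = 125/227 (55%), Positives = 166/227 (73%), Gaps = 11/227 (4%)"
stats = alignment_stats(line)
for name, (n, d, frac) in stats.items():
    print(f"{name}: {n}/{d} = {frac:.4f}")
```

The identity ratio 125/227 is about 0.5507, which appears to correspond to the 0.550 shown in the Identity column of the summary table for Q9LDA9.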




Essential protein required for the splicing of group IIB introns in chloroplasts. Forms splicing particles with CRS2. Interacts with RNA and confers intron specificity of the splicing particles.
Arabidopsis thaliana (taxid: 3702)
>sp|Q84N48|CAF2P_MAIZE CRS2-associated factor 2, chloroplastic OS=Zea mays GN=CAF2 PE=1 SV=1
>sp|Q657G7|CAF2P_ORYSJ CRS2-associated factor 2, chloroplastic OS=Oryza sativa subsp. japonica GN=Os01g0323300 PE=2 SV=1
>sp|Q9SL79|CAF1P_ARATH CRS2-associated factor 1, chloroplastic OS=Arabidopsis thaliana GN=At2g20020 PE=1 SV=2
>sp|Q5VMQ5|CAF1P_ORYSJ CRS2-associated factor 1, chloroplastic OS=Oryza sativa subsp. japonica GN=Os01g0495900 PE=2 SV=1
>sp|Q84N49|CAF1P_MAIZE CRS2-associated factor 1, chloroplastic OS=Zea mays GN=CAF1 PE=1 SV=1
>sp|Q8VYD9|CAF1M_ARATH CRS2-associated factor 1, mitochondrial OS=Arabidopsis thaliana GN=At4g31010 PE=2 SV=1
>sp|Q6Z4U2|CAF1M_ORYSJ CRS2-associated factor 1, mitochondrial OS=Oryza sativa subsp. japonica GN=Os08g0174900 PE=2 SV=1
>sp|Q0J7J7|CAF2M_ORYSJ CRS2-associated factor 2, mitochondrial OS=Oryza sativa subsp. japonica GN=Os08g0188000 PE=2 SV=2
>sp|Q9FFU1|CAF2M_ARATH CRS2-associated factor 2, mitochondrial OS=Arabidopsis thaliana GN=At5g54890 PE=2 SV=1

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      231
224110940  506     predicted protein [Populus trichocarpa]   0.956    0.436    0.651     1e-70
297739073  422     unnamed protein product [Vitis vinifera]  0.935    0.511    0.630     2e-70
255568848  561     conserved hypothetical protein [Ricinus   0.965    0.397    0.647     4e-70
359473240  560     PREDICTED: LOW QUALITY PROTEIN: CRS2-ass  0.926    0.382    0.631     5e-70
225425575  561     PREDICTED: CRS2-associated factor 2, chl  0.935    0.385    0.634     4e-69
356524038  593     PREDICTED: CRS2-associated factor 2, chl  0.852    0.332    0.572     3e-58
449434945  602     PREDICTED: LOW QUALITY PROTEIN: CRS2-ass  0.913    0.350    0.552     5e-58
449478585  603     PREDICTED: LOW QUALITY PROTEIN: CRS2-ass  0.913    0.349    0.552     6e-58
147771140  306     hypothetical protein VITISV_034260 [Viti  0.878    0.663    0.577     4e-57
307135966  603     RNA splicing factor [Cucumis melo subsp.  0.926    0.354    0.555     5e-57
>gi|224110940|ref|XP_002315689.1| predicted protein [Populus trichocarpa] gi|222864729|gb|EEF01860.1| predicted protein [Populus trichocarpa]
 Score =  272 bits (695), Expect = 1e-70,   Method: Compositional matrix adjust.
 Identities = 148/227 (65%), Positives = 178/227 (78%), Gaps = 6/227 (2%)

Query: 9   AKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILM 68
           AKNGVY+TLVRDVR AFEGS LVKV+CKGM  SDYKKLGAKLK+LVPCVLLSFDDEQILM
Sbjct: 282 AKNGVYITLVRDVRAAFEGSPLVKVDCKGMEPSDYKKLGAKLKDLVPCVLLSFDDEQILM 341

Query: 69  WRGKDWKSMYPEP-PSFSNPVDLDIAGDADGSGTPSDDPSQG---TIRSSPKMISLWKRA 124
           WRG+DWKSMYPE  PS S P +LDIA  +D SG   DD        + SSPKM+ LWK A
Sbjct: 342 WRGQDWKSMYPEARPSISFPAELDIASGSDDSGKSDDDCDNSDAKILSSSPKMMLLWKHA 401

Query: 125 IESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSREDGASSSMAEYEDGSQ 184
           +ES KA++LDEI+LGPD LL KVEEFEGISQA EHSYPALV+S EDG+S+S++ +ED S 
Sbjct: 402 LESNKAILLDEIDLGPDALLTKVEEFEGISQATEHSYPALVMSSEDGSSNSISTFEDDSH 461

Query: 185 SENYDEDEFYPEDDFNDDDEFYDSDSSDVVPLGSLPVDHIAERLQRK 231
           SEN+ ED+ Y +D++ D + F + ++S   P GSL +D IAE+L +K
Sbjct: 462 SENFSEDDMYSDDEYYDSESFEELETS--APPGSLSIDLIAEKLDKK 506




Source: Populus trichocarpa
Species: Populus trichocarpa
Genus: Populus
Family: Salicaceae
Order: Malpighiales
Class:
Phylum: Streptophyta
Superkingdom: Eukaryota

>gi|297739073|emb|CBI28562.3| unnamed protein product [Vitis vinifera]
>gi|255568848|ref|XP_002525395.1| conserved hypothetical protein [Ricinus communis] gi|223535358|gb|EEF37033.1| conserved hypothetical protein [Ricinus communis]
>gi|359473240|ref|XP_003631275.1| PREDICTED: LOW QUALITY PROTEIN: CRS2-associated factor 2, chloroplastic-like [Vitis vinifera]
>gi|225425575|ref|XP_002267079.1| PREDICTED: CRS2-associated factor 2, chloroplastic [Vitis vinifera] gi|297739063|emb|CBI28552.3| unnamed protein product [Vitis vinifera]
>gi|356524038|ref|XP_003530640.1| PREDICTED: CRS2-associated factor 2, chloroplastic-like [Glycine max]
>gi|449434945|ref|XP_004135256.1| PREDICTED: LOW QUALITY PROTEIN: CRS2-associated factor 2, chloroplastic-like [Cucumis sativus]
>gi|449478585|ref|XP_004155360.1| PREDICTED: LOW QUALITY PROTEIN: CRS2-associated factor 2, chloroplastic-like [Cucumis sativus]
>gi|147771140|emb|CAN74182.1| hypothetical protein VITISV_034260 [Vitis vinifera]
>gi|307135966|gb|ADN33825.1| RNA splicing factor [Cucumis melo subsp. melo]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology Terms Detected by BLAST

ID                  Length  Definition                      Q cover  H cover  Identity  E-value
Query               231
TAIR|locus:2028100  564     CAF2 [Arabidopsis thaliana (ta  0.939    0.384    0.513     4.3e-54
TAIR|locus:2061604  701     CAF1 [Arabidopsis thaliana (ta  0.480    0.158    0.451     1.6e-23
TAIR|locus:2126694  405     AT4G31010 [Arabidopsis thalian  0.432    0.246    0.457     3.2e-18
TAIR|locus:2160195  358     AT5G54890 [Arabidopsis thalian  0.272    0.175    0.507     2.7e-13
TAIR|locus:2028100 CAF2 [Arabidopsis thaliana (taxid:3702)]
 Score = 559 (201.8 bits), Expect = 4.3e-54, P = 4.3e-54
 Identities = 116/226 (51%), Positives = 153/226 (67%)

Query:     9 AKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILM 68
             +KNGVY++LV+DVR+AFE SSLVKV+C G+  SDYKK+GAKLKELVPCVLLSFDDEQILM
Sbjct:   343 SKNGVYVSLVKDVRDAFELSSLVKVDCPGLEPSDYKKIGAKLKELVPCVLLSFDDEQILM 402

Query:    69 WRGKDWKSMYPEPPSFSNPVDLDIAGDADGSGTPSDDPS---QGTIRSSPKMISLWKRAI 125
             WRG++WKS + + P   +  + +   + D S  PS++ +     T  SSPKMISLW+RA+
Sbjct:   403 WRGREWKSRFVDNPLIPSLSETNTTNELDPSDKPSEEQTVANPSTTISSPKMISLWQRAL 462

Query:   126 ESTKALVLDEINLGPDDLLKKVEEFEGISQAAEHSYPALVLSREDGASSSMAEYEDGSQS 185
             ES+KA++L+E++LGPDDLLKKVEE EG S AAEH+Y A+VLS  DGA+    + +D S+ 
Sbjct:   463 ESSKAVILEELDLGPDDLLKKVEELEGTSLAAEHTYTAMVLSNTDGAAEDYVDEKDRSEE 522

Query:   186 XXXXXXXXXXXXXXXXXXXXXXXXXXXVVPLGSLPVDHIAERLQRK 231
                                        V P+GSLPVD I  +L+ +
Sbjct:   523 YYSDIDDDFDDECSDDESLDP------VGPVGSLPVDKIVRKLRER 562




GO:0003723 "RNA binding" evidence=IEA
GO:0009507 "chloroplast" evidence=ISM
GO:0000373 "Group II intron splicing" evidence=IDA
TAIR|locus:2061604 CAF1 [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2126694 AT4G31010 [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2160195 AT5G54890 [Arabidopsis thaliana (taxid:3702)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer from SWISS-PROT was detected by BLAST.

EC Number Prediction by EzyPred Server

Failed to connect to the EzyPred server.

EC Number Prediction by EFICAz Software

No EC number assignment; probably not an enzyme.


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input: gw1.X.5327.1, hypothetical protein (506 aa) (Populus trichocarpa)
Predicted Functional Partners: Sorry, there are no predicted associations at the current settings.

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       231
pfam01985   84      pfam01985, CRS1_YhbY, CRS1 / YhbY (CRM) domain      4e-11
smart01103  84      smart01103, CRS1_YhbY, Escherichia coli YhbY is as  1e-09
>gnl|CDD|190184 pfam01985, CRS1_YhbY, CRS1 / YhbY (CRM) domain
 Score = 57.1 bits (139), Expect = 4e-11
 Identities = 17/61 (27%), Positives = 29/61 (47%)

Query: 10 KNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMW 69
          KNG+   +V ++  A E   L+KV   G    D K++  +L E     L+      I+++
Sbjct: 24 KNGLTEGVVEEIDEALEKHELIKVKVLGNDREDRKEIAEELAEKTGAELVQVIGRTIVLY 83

Query: 70 R 70
          R
Sbjct: 84 R 84


Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome. Length = 84

>gnl|CDD|198171 smart01103, CRS1_YhbY, Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly

Conserved Domains Detected by HHsearch

ID         Length  Definition                                          Probability
Query      231
PF01985    84      CRS1_YhbY: CRS1 / YhbY (CRM) domain; InterPro: IPR  99.13
TIGR00253  95      RNA_bind_YhbY putative RNA-binding protein, YhbY f  97.3
COG1534    97      Predicted RNA-binding protein containing KH domain  97.02
PRK10343   97      RNA-binding protein YhbY; Provisional               96.24
KOG1990    564     consensus Poly(A)-specific exoribonuclease PARN [R  95.86
>PF01985 CRS1_YhbY: CRS1 / YhbY (CRM) domain; InterPro: IPR001890 The CRM domain is an ~100-amino acid RNA-binding domain
Probab=99.13  E-value=3.7e-11  Score=89.36  Aligned_cols=69  Identities=25%  Similarity=0.341  Sum_probs=61.4

Q ss_pred             chhhhhhccCceEeehhHHHHHhhccCceEEEcCCCCCchhHHHhhhhhhcccCeeEEeecCceEEEEe
Q 026921            2 LFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWR   70 (231)
Q Consensus         2 l~~l~kL~kNGvY~~lV~~VrdAF~~~elVrIDC~gl~~sDyKKIGaKLrdLVpCvllsF~~eqIlmWR   70 (231)
                      |=|+..++|||++-++++.+++||+.+|||||.|.+-.+.|.+.+...|.+..+|.++...+.++++||
T Consensus        16 l~p~v~IGk~Glt~~vi~~i~~~l~~~eLvKVk~~~~~~~~~~~~~~~l~~~t~~~~V~~iG~~~vlyR   84 (84)
T PF01985_consen   16 LKPVVQIGKNGLTDGVIEEIDDALEKHELVKVKVLGNCREDRKEIAEQLAEKTGAEVVQVIGRTIVLYR   84 (84)
T ss_dssp             C--SEEE-TTSS-HHHHHHHHHHHHHHSEEEEEETT--HHHHHHHHHHHHHHHTEEEEEEETTEEEEEE
T ss_pred             CCCeEEECCCCCCHHHHHHHHHHHHhCCeeEEEEccCCHHHHHHHHHHHHHHhCCEEEEEECCEEEEEC
Confidence            347788999999999999999999999999999999999999999999999999999999999999998
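
The `key=value` summary line that precedes each HHsearch alignment can be read into a dictionary for filtering or sorting hits. This is a minimal sketch, assuming the whitespace-separated layout shown above; the function name `parse_hh_header` is hypothetical, and percent signs are simply stripped before conversion to float.

```python
def parse_hh_header(line):
    """Parse an HHsearch summary line of key=value tokens into a dict of floats."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        # "Identities=25%" carries a percent sign; keep the number only.
        fields[key] = float(value.rstrip("%"))
    return fields

header = ("Probab=99.13  E-value=3.7e-11  Score=89.36  Aligned_cols=69  "
          "Identities=25%  Similarity=0.341  Sum_probs=61.4")
hit = parse_hh_header(header)
print(hit["Probab"], hit["E-value"], hit["Aligned_cols"])
```

A hit list parsed this way could then be filtered on `hit["Probab"]`; any cutoff used for that would be a choice of the analyst and is not specified in this report.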



The name chloroplast RNA splicing and ribosome maturation (CRM) has been suggested to reflect the functions established for the four characterised members of the family: Zea mays (Maize) CRS1 (Q9FYT6 from SWISSPROT), CAF1 (Q84N49 from SWISSPROT) and CAF2 (Q84N48 from SWISSPROT) proteins and the Escherichia coli protein YhbY (P0AGK4 from SWISSPROT). The CRM domain is found in eubacteria, archaea, and plants. The CRM domain is represented as a stand-alone protein in archaea and bacteria, and in single- and multi-domain proteins in plants. It has been suggested that prokaryotic CRM proteins existed as ribosome-associated proteins prior to the divergence of archaea and bacteria, and that they were co-opted in the plant lineage as RNA binding modules by incorporation into diverse protein contexts. Plant CRM domains are predicted to reside not only in the chloroplast, but also in the mitochondrion and the nucleo/cytoplasmic compartment. The diversity of the CRM domain family in plants suggests a diverse set of RNA targets. The CRM domain is a compact alpha/beta domain consisting of a four-stranded beta sheet and three alpha helices with an alpha-beta-alpha-beta-alpha-beta-beta topology. The beta sheet face is basic, consistent with a role in RNA binding. Proximal to the basic beta sheet face is another moiety that could contribute to nucleic acid recognition. Connecting strand beta1 and helix alpha2 is a loop with a six amino acid motif, GxxG flanked by large aliphatic residues, within which one 'x' is typically a basic residue. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing.
In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome. GO: 0003723 RNA binding; PDB: 1RQ8_A 1JO0_B 1LN4_A.

>TIGR00253 RNA_bind_YhbY putative RNA-binding protein, YhbY family
>COG1534 Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Translation, ribosomal structure and biogenesis]
>PRK10343 RNA-binding protein YhbY; Provisional
>KOG1990 consensus Poly(A)-specific exoribonuclease PARN [Replication, recombination and repair]

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   231
1jo0_A  98      Hypothetical protein HI1333; structural genomics,   98.4
1rq8_A  104     Conserved hypothetical protein; structural genomic  98.15
>1jo0_A Hypothetical protein HI1333; structural genomics, YHBY_HAEI structure 2 function project, S2F, unknown function; 1.37A {Haemophilus influenzae} SCOP: d.68.4.1 PDB: 1ln4_A
Probab=98.40  E-value=1.6e-07  Score=71.97  Aligned_cols=71  Identities=14%  Similarity=0.200  Sum_probs=67.8

Q ss_pred             chhhhhhccCceEeehhHHHHHhhccCceEEEcCCCCCchhHHHhhhhhhcccCeeEEeecCceEEEEecC
Q 026921            2 LFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGK   72 (231)
Q Consensus         2 l~~l~kL~kNGvY~~lV~~VrdAF~~~elVrIDC~gl~~sDyKKIGaKLrdLVpCvllsF~~eqIlmWRGk   72 (231)
                      |=|+...||||+--+++..+++|++.+|||||-|.+-.+.|.+.+...|.+...|.++..-+..+++||+.
T Consensus        18 l~pvv~IGk~GlT~~vi~ei~~aL~~~ELIKVkvl~~~~~~~~e~a~~la~~t~a~~Vq~IG~~~vLyR~~   88 (98)
T 1jo0_A           18 LNPVVMLGGNGLTEGVLAEIENALNHHELIKVKVAGADRETKQLIINAIVRETKAAQVQTIGHILVLYRPS   88 (98)
T ss_dssp             BCCSEEECTTCSCHHHHHHHHHHHHHHSEEEEEETTCCHHHHHHHHHHHHHHHCCEEEEEETTEEEEECCC
T ss_pred             CCCeEEECCCCCCHHHHHHHHHHHHHCCeEEEEEeCCCHHHHHHHHHHHHHHhCCEEEEEECCEEEEEccC
Confidence            34778899999999999999999999999999999999999999999999999999999999999999987



>1rq8_A Conserved hypothetical protein; structural genomics, SAV1595, YHBY, UPF0044, unknown function; NMR {Staphylococcus aureus} SCOP: d.68.4.1

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID        Length  Definition                                          E-value
Query     231
d1jo0a_   97      d.68.4.1 (A:) YhbY homologue HI1333 {Haemophilus i  3e-08
>d1jo0a_ d.68.4.1 (A:) YhbY homologue HI1333 {Haemophilus influenzae [TaxId: 727]} Length = 97

class: Alpha and beta proteins (a+b)
fold: IF3-like
superfamily: YhbY-like
family: YhbY-like
domain: YhbY homologue HI1333
species: Haemophilus influenzae [TaxId: 727]
 Score = 47.8 bits (114), Expect = 3e-08
 Identities = 9/65 (13%), Positives = 24/65 (36%)

Query: 9  AKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILM 68
            NG+   ++ ++ NA     L+KV   G      + +   +        +      +++
Sbjct: 25 GGNGLTEGVLAEIENALNHHELIKVKVAGADRETKQLIINAIVRETKAAQVQTIGHILVL 84

Query: 69 WRGKD 73
          +R  +
Sbjct: 85 YRPSE 89


Homologous Domains Detected by HHsearch

ID        Length  Definition                                          Probability
Query     231
d1jo0a_   97      YhbY homologue HI1333 {Haemophilus influenzae [Tax  99.25
d1rq8a_   96      Hypothetical protein SAV1595 {Staphylococcus aureu  99.18
>d1jo0a_ d.68.4.1 (A:) YhbY homologue HI1333 {Haemophilus influenzae [TaxId: 727]}
class: Alpha and beta proteins (a+b)
fold: IF3-like
superfamily: YhbY-like
family: YhbY-like
domain: YhbY homologue HI1333
species: Haemophilus influenzae [TaxId: 727]
Probab=99.25  E-value=2e-12  Score=96.23  Aligned_cols=72  Identities=14%  Similarity=0.210  Sum_probs=68.7

Q ss_pred             chhhhhhccCceEeehhHHHHHhhccCceEEEcCCCCCchhHHHhhhhhhcccCeeEEeecCceEEEEecCC
Q 026921            2 LFFLHIAAKNGVYLTLVRDVRNAFEGSSLVKVNCKGMHASDYKKLGAKLKELVPCVLLSFDDEQILMWRGKD   73 (231)
Q Consensus         2 l~~l~kL~kNGvY~~lV~~VrdAF~~~elVrIDC~gl~~sDyKKIGaKLrdLVpCvllsF~~eqIlmWRGk~   73 (231)
                      |-|+.+++|||++.++|+.+.+||+.+|||||.|.+..++|.+.+..+|.+..+|.+|.+-+.++++||+..
T Consensus        18 lkp~v~IGk~Glt~~vi~ei~~al~~~ELIKvki~~~~~~~~~~~~~~l~~~t~~~vV~~iG~~~ilYR~~~   89 (97)
T d1jo0a_          18 LNPVVMLGGNGLTEGVLAEIENALNHHELIKVKVAGADRETKQLIINAIVRETKAAQVQTIGHILVLYRPSE   89 (97)
T ss_dssp             BCCSEEECTTCSCHHHHHHHHHHHHHHSEEEEEETTCCHHHHHHHHHHHHHHHCCEEEEEETTEEEEECCCS
T ss_pred             CCCEEEECCCCCCHHHHHHHHHHHHhCCeEEEEEcCCCHHHHHHHHHHHHHHhCCEEEEEECCEEEEEcCCC
Confidence            457889999999999999999999999999999999999999999999999999999999999999999743



>d1rq8a_ d.68.4.1 (A:) Hypothetical protein SAV1595 {Staphylococcus aureus, (strain Mu50 / ATCC 700699) [TaxId: 1280]}