CDS 1702 - 2034 /gene="1" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_1" /note=Original Glimmer call @bp 1702 has strength 11.63; Genemark calls start at 1702 /note=SSC: Start = 1702, Stop = 2034. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.12 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 333 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 274, Function= function unknown, EValue= 3.0E-61. NCBIBLAST= . HHPRED= Accession= PF04648.16, Description= MF_alpha ; Yeast mating factor alpha hormone, Probability= 65.3. Coverage= 10.0, SubjectRange= 1:12, QueryRange= 1:45. CDD= . /note=In starterator, there are only two phages in this Pham and they both have the same start. /note=Phages db Blast and HHPRED both have inconclusive results that have very low probability. /note=NCBI Blast: N/A /note=Deep TMHMM: N/A CDS 2090 - 2338 /gene="2" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_2" /note=Original Glimmer call @bp 2090 has strength 0.87; Genemark calls start at 2090 /note=SSC: Start = 2090, Stop = 2338. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 249 bp is not the longest possible ORF. GAP: 55 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . 
/note=Starterator only has Francesca and Dorin, but both of them call the same start /note=No good results from PhagesDB blast or NCBI blast /note=HHPRED probabilities are too low and e-values too high /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 2351 - 2737 /gene="3" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_3" /note=Original Glimmer call @bp 2351 has strength 1.77; Genemark calls start at 2345 /note=SSC: Start = 2351, Stop = 2737. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.12 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 387 bp is not the longest possible ORF. GAP: 12 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 276, Function= function unknown, EValue= 1.0E-11. NCBIBLAST= PhageName= hypothetical protein FDI69_gp002 [Rhodococcus phage Trina] >ref|YP_009615270.1| hypothetical protein FDI69_gp123 [Rhodococcus phage Trina] >gb|ASZ74821.1| hypothetical protein SEA_TRINA_2 [Rhodococcus phage Trina] >gb|ASZ75062.1| hypothetical protein SEA_TRINA_284 [Rhodococcus phage Trina], Coverage= 87.5, SubjectRange= 1:109, QueryRange= 1:122, EValue= 0.00956089. HHPRED= Accession= PF17359.6, Description= DUF5385 ; Family of unknown function (DUF5385), Probability= 52.0. Coverage= 55.4688, SubjectRange= 39:158, QueryRange= 39:127. CDD= . /note=Starterator only has Francesca and Dorin, but both of them call the same start /note=Didn't change the start because of the final score /note=No good results for NCBI blast, PhagesDB, or HHPRED /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 2831 - 2974 /gene="4" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_4" /note=Original Glimmer call @bp 2855 has strength 2.96; Genemark calls start at 2831 /note=SSC: Start = 2831, Stop = 2974. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.49 is the highest start score. 
SCS: Start is not called by Glimmer and is called by Genemark. LO: 144 bp is the longest possible ORF. GAP: 93 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF21689.1, Description= TorS_sensor_domain ; Sensor protein TorS, sensor domain, Probability= 73.3. Coverage= 74.4681, SubjectRange= 46:81, QueryRange= 46:45. CDD= . /note=Starterator has only Francesca and Dorin, but both call the same start /note=HHPRED, NCBI Blast, and Phagesdb Blast are all inconclusive and have terrible results /note=Conserved Domains: N/A /note=Deep TMHMM: N/A CDS 2986 - 3720 /gene="5" /product="glycosylase" /function="glycosylase" /locus tag="Francesca_5" /note=Original Glimmer call @bp 2986 has strength 8.83; Genemark calls start at 3091 /note=SSC: Start = 2986, Stop = 3720. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.282 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 735 bp is the longest possible ORF. GAP: 11 bp. ST: SS=NA. F: glycosylase. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 6, Function= function unknown, EValue= 1.0E-111. NCBIBLAST= PhageName= hypothetical protein FDI65_gp09 [Mycobacterium phage Rockstar] >gb|AEK07440.1| hypothetical protein ROCKSTAR_77 [Mycobacterium phage Rockstar] >gb|QGJ97365.1| hypothetical protein PBI_ISCA_85 [Mycobacterium phage Isca], Coverage= 80.3279, SubjectRange= 1:191, QueryRange= 1:243, EValue= 3.16967E-52. HHPRED= Accession= 3FHG_A, Description= N-glycosylase/DNA lyase; ogg, helix-hairpin-helix, glycosylase, 8-oxoguanine, 8-oxoG, SsOGG, DNA damage, DNA repair, Glycosidase, Hydrolase, Lyase, Multifunctional enzyme, Nuclease; HET: SO4, GOL; 1.9A {Sulfolobus solfataricus}, Probability= 94.9. Coverage= 32.377, SubjectRange= 126:206, QueryRange= 126:244. CDD= . /note=PhagesDB blast and HHPRED have very good evidence for glycosylase, but HHPRED e-values are slightly high. 
RDJ: No, the HHPRED values are not so great, but Shagrat_113 has the same HHPRED hit with slightly better E-value and was called glycosylase. /note=NCBI blast didn't have good evidence /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 3799 - 4467 /gene="6" /product="Lsr2-like DNA bridging protein" /function="Lsr2-like DNA bridging protein" /locus tag="Francesca_6" /note=Original Glimmer call @bp 3799 has strength 12.92; Genemark calls start at 3799 /note=SSC: Start = 3799, Stop = 4467. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.456 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 669 bp is the longest possible ORF. GAP: 78 bp. ST: SS=NA. F: Lsr2-like DNA bridging protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 7, Function= function unknown, EValue= 1.0E-123. NCBIBLAST= . HHPRED= Accession= 2KNG_A, Description= Protein lsr2; DNA-binding domain, Immune response, DNA BINDING PROTEIN; NMR {Mycobacterium tuberculosis}, Probability= 95.7. Coverage= 15.3153, SubjectRange= 11:45, QueryRange= 11:119. CDD= Accession= pfam11774, Coverage= 25.2252, SubjectRange= 51:104, QueryRange= 51:115, EValue= 5.21058E-5. /note=Starterator only has Francesca and Dorin, but both of them call the same start /note=Great coding potential /note=PhagesDB blast and HHPRED have good evidence for this function /note=NCBI blast has no results /note=Conserved Domain has some weak evidence for function. RDJ: conserved domain evidence is actually decent /note=Deep TMHMM: N/A CDS 4534 - 4656 /gene="7" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_7" /note=Original Glimmer call @bp 4534 has strength 9.22; Genemark calls start at 4534 /note=SSC: Start = 4534, Stop = 4656. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 123 bp is the longest possible ORF. 
GAP: 66 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 8, Function= function unknown, EValue= 9.0E-15. NCBIBLAST= . HHPRED= Accession= SCOP_d1hw1a2, Description= a.78.1.1 (A:79-230) Fatty acid responsive transcription factor FadR, C-terminal domain {Escherichia coli [TaxId: 562]} | CLASS: All alpha proteins, FOLD: GntR ligand-binding domain-like, SUPFAM: GntR ligand-binding domain-like, FAM: GntR ligand-binding domain-like, Probability= 41.9. Coverage= 47.5, SubjectRange= 120:139, QueryRange= 120:40. CDD= . /note=Starterator only has Francesca and Dorin, but both of them call the same start /note=No good results for phagesDB or HHPRED /note=No results for NCBI blast /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 4653 - 4748 /gene="8" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_8" /note=Genemark calls start at 4653 /note=SSC: Start = 4653, Stop = 4748. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.315 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 96 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 8, Function= function unknown, EValue= 7.0E-10. NCBIBLAST= . HHPRED= Accession= PF12273.12, Description= RCR ; Chitin synthesis regulation, resistance to Congo red, Probability= 88.6. Coverage= 58.0645, SubjectRange= 2:20, QueryRange= 2:30. CDD= . /note=No good evidence/no results in HHPRED, NCBI blast, and PhagesDB blast /note=Start wasn't changed because of good scores and coding potential /note=Conserved Domain: N/A /note=Deep TMHMM: evidence for transmembrane protein CDS complement (4987 - 5334) /gene="9" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_9" /note=Original Glimmer call @bp 5334 has strength 8.45; Genemark calls start at 5334 /note=SSC: Start = 5334, Stop = 4987. (Reverse). 
CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 348 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 9, Function= function unknown, EValue= 3.0E-48. NCBIBLAST= . HHPRED= Accession= PF17280.6, Description= DUF5345 ; Family of unknown function (DUF5345), Probability= 77.3. Coverage= 41.7391, SubjectRange= 30:78, QueryRange= 30:77. CDD= . /note=HHPRED: low probability/inconclusive /note=NCBI Blast: N/A /note=Conserved Domain: N/A /note=Deep TMHMM: 2 transmembrane domains CDS complement (5336 - 6016) /gene="10" /product="lysin B" /function="lysin B" /locus tag="Francesca_10" /note=Original Glimmer call @bp 6016 has strength 3.47; Genemark calls start at 6016 /note=SSC: Start = 6016, Stop = 5336. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.648 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 681 bp is the longest possible ORF. GAP: 30 bp. ST: SS=NA. F: lysin B. FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 31, Function= lysin B, EValue= 7.0E-33. NCBIBLAST= PhageName= lysin B [Rhodococcus phage Weasels2] >gb|AOZ63621.1| lysin B [Rhodococcus phage Weasels2], Coverage= 96.4602, SubjectRange= 1:218, QueryRange= 1:220, EValue= 2.73625E-37. HHPRED= Accession= 3HC7_A, Description= Gene 12 protein; alpha/beta sandwich, CELL ADHESION; 2.0A {Mycobacterium phage D29}, Probability= 99.9. Coverage= 97.7876, SubjectRange= 2:253, QueryRange= 2:222. CDD= Accession= pfam08237, Coverage= 37.1681, SubjectRange= 1:81, QueryRange= 1:112, EValue= 0.00226851. /note=HHPRED shows high probability for hydrolase /note=NCBI Blast has low probability for everything /note=Conserved Domains/Deep TMHMM: N/A /note=We can call this gene lysin B because gene 25 is a lysin A glycosyl hydrolase domain. 
CDS complement (6047 - 6253) /gene="11" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_11" /note=Original Glimmer call @bp 6253 has strength 4.16; Genemark calls start at 6253 /note=SSC: Start = 6253, Stop = 6047. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 207 bp is the longest possible ORF. GAP: -11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 11, Function= function unknown, EValue= 5.0E-37. NCBIBLAST= PhageName= hypothetical protein [Roseibium sp.] >gb|MBO6858359.1| hypothetical protein [Roseibium sp.], Coverage= 82.3529, SubjectRange= 4:59, QueryRange= 4:61, EValue= 0.0125468. HHPRED= Accession= cd21689, Description= stalk_CoV_Nsp13-like; stalk domain of coronavirus Nsp13 helicase and related proteins. This model represents the stalk domain of coronavirus non-structural protein 13 (Nsp13) helicase, found in the Nsp3s of alpha-, beta-, gamma-, and deltacoronaviruses, including Severe Acute Respiratory Syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also called 2019 novel CoV or 2019-nCoV), and Middle East respiratory syndrome coronavirus (MERS-CoV)., Probability= 76.6. Coverage= 42.6471, SubjectRange= 6:35, QueryRange= 6:43. CDD= . /note=Starterator only has Francesca and Dorin and both call the same start /note=Bad results for NCBI blast, HHPRED, and phagesdb blast /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (6243 - 6512) /gene="12" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_12" /note=Original Glimmer call @bp 6512 has strength 4.0; Genemark calls start at 6512 /note=SSC: Start = 6512, Stop = 6243. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.299 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 270 bp is not the longest possible ORF. GAP: -26 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 12, Function= function unknown, EValue= 1.0E-45. NCBIBLAST= . HHPRED= Accession= PF10049.13, Description= DUF2283 ; Protein of unknown function (DUF2283), Probability= 92.7. Coverage= 50.5618, SubjectRange= 1:46, QueryRange= 1:78. CDD= . /note=Starterator only has Francesca and Dorin, but both call the same start /note=No results for NCBI blast /note=HHPRED has high probability for a protein of unknown function /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (6487 - 6726) /gene="13" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_13" /note=Original Glimmer call @bp 6726 has strength 6.19; Genemark calls start at 6726 /note=SSC: Start = 6726, Stop = 6487. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.558 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 240 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 13, Function= function unknown, EValue= 2.0E-41. NCBIBLAST= PhageName= hypothetical protein BJD55_gp111 [Gordonia phage Yvonnetastic] >gb|AMS02655.1| hypothetical protein SEA_YVONNETASTIC_111 [Gordonia phage Yvonnetastic], Coverage= 92.4051, SubjectRange= 1:76, QueryRange= 1:73, EValue= 4.14308E-5. HHPRED= Accession= PF18843.5, Description= LPD28 ; Large polyvalent protein associated domain 28, Probability= 65.4. Coverage= 58.2278, SubjectRange= 7:51, QueryRange= 7:51. CDD= Accession= pfam12929, Coverage= 45.5696, SubjectRange= 79:108, QueryRange= 79:60, EValue= 0.00411835. 
/note=Starterator only has Dorin and Francesca and both call the same start /note=No good evidence for HHPRED or NCBI Blast /note=Conserved Domain: very weak evidence /note=Deep TMHMM: N/A CDS complement (6723 - 7109) /gene="14" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_14" /note=Original Glimmer call @bp 7109 has strength 7.96; Genemark calls start at 7109 /note=SSC: Start = 7109, Stop = 6723. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.98 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 387 bp is not the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Aminay, ProteinNumber= 95, Function= function unknown, EValue= 1.6. NCBIBLAST= PhageName= hypothetical protein MM2B0307_0317 [Mycobacteroides abscessus subsp. bolletii 2B-0307], Coverage= 89.8438, SubjectRange= 8:120, QueryRange= 8:118, EValue= 5.10227E-12. HHPRED= Accession= PF01481.20, Description= Arteri_nucleo ; Arterivirus nucleocapsid protein, Probability= 58.7. Coverage= 39.8438, SubjectRange= 51:95, QueryRange= 51:52. CDD= . /note=Starterator only has Dorin and Francesca but both of them call the same start /note=The start was changed because the coding potential covered the new start better /note=HHPRED, NCBI blast, and PhagesDB are all inconclusive /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (7102 - 7416) /gene="15" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_15" /note=Original Glimmer call @bp 7347 has strength 2.53; Genemark calls start at 7416 /note=SSC: Start = 7416, Stop = 7102. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.606 is not the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 315 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . 
NCBIBLAST= PhageName= hypothetical protein SEA_MCWOLFISH_74 [Mycobacterium phage McWolfish] >gb|AZF96509.1| hypothetical protein SEA_KHALEESI_75 [Mycobacterium phage Khaleesi] >gb|WNT45849.1| hypothetical protein SEA_PURDUEPETE_76 [Mycobacterium phage PurduePete], Coverage= 93.2692, SubjectRange= 52:153, QueryRange= 52:99, EValue= 2.39945E-35. HHPRED= . CDD= . /note=No evidence for function CDS complement (7416 - 7613) /gene="16" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_16" /note=Original Glimmer call @bp 7613 has strength 4.06; Genemark calls start at 7613 /note=SSC: Start = 7613, Stop = 7416. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.725 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 198 bp is not the longest possible ORF. GAP: -14 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 3HCZ_A, Description= Possible thiol-disulfide isomerase; APC61559.2, thiol-disulfide isomerase, Cytophaga hutchinsonii ATCC, structural genomics, PSI-2, protein structure initiative, midwest center for structural; HET: SO4; 1.88A {Cytophaga hutchinsonii} SCOP: c.47.1.0, l.1.1.1, Probability= 80.6. Coverage= 53.8462, SubjectRange= 91:126, QueryRange= 91:65. CDD= . /note=Starterator only had Dorin and Francesca and they both called the same start /note=Phagesdb Blast, NCBI Blast, and HHPRED are all inconclusive /note=Conserved Domains: N/A /note=TMHMM: N/A CDS complement (7600 - 7926) /gene="17" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_17" /note=Original Glimmer call @bp 7884 has strength 5.43; Genemark calls start at 7884 /note=SSC: Start = 7926, Stop = 7600. (Reverse). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 1.501 is not the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 327 bp is the longest possible ORF. GAP: -46 bp. 
ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 17, Function= function unknown, EValue= 3.0E-50. NCBIBLAST= . HHPRED= Accession= 6S0K_k, Description= Cytoskeleton protein RodZ; Ribosome nascent chain in complex with SecA, RIBOSOME; HET: MG; 3.1A {Escherichia coli}, Probability= 68.7. Coverage= 20.3704, SubjectRange= 6:28, QueryRange= 6:34. CDD= . /note=Start was changed because coding potential covers the new start much better /note=No results for NCBI blast /note=Terrible results for HHPRED and PhagesDB blast /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (7881 - 8168) /gene="18" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_18" /note=Original Glimmer call @bp 8168 has strength 4.02; Genemark calls start at 8168 /note=SSC: Start = 8168, Stop = 7881. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 288 bp is the longest possible ORF. GAP: 8 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 18, Function= function unknown, EValue= 4.0E-52. NCBIBLAST= . HHPRED= Accession= PF20088.3, Description= DUF6480 ; Family of unknown function (DUF6480), Probability= 82.8. Coverage= 23.1579, SubjectRange= 76:98, QueryRange= 76:27. CDD= . /note=Starterator only has Francesca and Dorin, but both have the same start /note=Bad results for HHPRED and PhagesDB /note=No results for NCBI blast /note=Conserved Domain: N/A /note=Deep TMHMM: only one so not enough evidence CDS complement (8177 - 8563) /gene="19" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_19" /note=Original Glimmer call @bp 8563 has strength 8.26; Genemark calls start at 8563 /note=SSC: Start = 8563, Stop = 8177. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is the highest start score. 
SCS: Start is called by Glimmer and is called by Genemark. LO: 387 bp is not the longest possible ORF. GAP: -20 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Peregrin, ProteinNumber= 31, Function= function unknown, EValue= 2.0E-14. NCBIBLAST= PhageName= hypothetical protein PBI_GRAYSON_33 [Rhodococcus phage Grayson], Coverage= 98.4375, SubjectRange= 1:123, QueryRange= 1:126, EValue= 1.41757E-14. HHPRED= Accession= PF20215.2, Description= DUF6575 ; Family of unknown function (DUF6575), Probability= 99.9. Coverage= 96.875, SubjectRange= 5:120, QueryRange= 5:127. CDD= . /note=HHPRED only says unknown function /note=Bad results for NCBI blast and PhagesDB blast /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (8544 - 8834) /gene="20" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_20" /note=Original Glimmer call @bp 8834 has strength 6.88; Genemark calls start at 8834 /note=SSC: Start = 8834, Stop = 8544. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.726 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 291 bp is the longest possible ORF. GAP: -23 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF11116.12, Description= DUF2624 ; Protein of unknown function (DUF2624), Probability= 53.1. Coverage= 30.2083, SubjectRange= 9:38, QueryRange= 9:86. CDD= . /note=Starterator only has Francesca and Dorin, but both have the same start /note=Bad results for HHPRED and PhagesDB blast /note=No results for NCBI blast /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (8812 - 9279) /gene="21" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_21" /note=Original Glimmer call @bp 9279 has strength 7.56; Genemark calls start at 9279 /note=SSC: Start = 9279, Stop = 8812. (Reverse). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 3.211 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 468 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein [Nocardia farcinica], Coverage= 95.4839, SubjectRange= 158:304, QueryRange= 158:149, EValue= 0.00286316. HHPRED= Accession= PF07924.15, Description= NuiA ; Nuclease A inhibitor-like protein, Probability= 51.9. Coverage= 21.2903, SubjectRange= 75:108, QueryRange= 75:62. CDD= . /note=Starterator only has Francesca and Dorin, but both call the same start /note=HHPRED, NCBI blast, and PhagesDB blast are all inconclusive and have bad results /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (9276 - 9506) /gene="22" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_22" /note=Original Glimmer call @bp 9506 has strength 4.91; Genemark calls start at 9506 /note=SSC: Start = 9506, Stop = 9276. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 231 bp is the longest possible ORF. GAP: 6 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein FDI69_gp031 [Rhodococcus phage Trina] >gb|ASZ74848.1| hypothetical protein SEA_TRINA_31 [Rhodococcus phage Trina], Coverage= 93.4211, SubjectRange= 1:62, QueryRange= 1:71, EValue= 4.95371E-10. HHPRED= Accession= SCOP_d1wjpa2, Description= g.37.1.1 (A:43-66) Zinc finger protein 295, ZNF295 {Human (Homo sapiens) [TaxId: 9606]} | CLASS: Small proteins, FOLD: beta-beta-alpha zinc fingers, SUPFAM: beta-beta-alpha zinc fingers, FAM: Classic zinc finger, C2H2, Probability= 95.2. Coverage= 21.0526, SubjectRange= 3:19, QueryRange= 3:36. CDD= . 
/note=HHPRED has evidence for 'zinc fingers.' Looking on Sea Phages, we were instructed to call this a hypothetical protein /note=NCBI blast and PhagesDB blast do not have good results /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (9513 - 9650) /gene="23" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_23" /note=Original Glimmer call @bp 9650 has strength 8.55; Genemark calls start at 9650 /note=SSC: Start = 9650, Stop = 9513. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.689 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 138 bp is the longest possible ORF. GAP: 18 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF15377.10, Description= DUF4604 ; Domain of unknown function (DUF4604), Probability= 85.0. Coverage= 51.1111, SubjectRange= 53:82, QueryRange= 53:28. CDD= . /note=Starterator only has Dorin and Francesca, but both have the same start /note=HHPRED, NCBI blast, and PhagesDB are all inconclusive /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (9669 - 9842) /gene="24" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_24" /note=Genemark calls start at 9842 /note=SSC: Start = 9842, Stop = 9669. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.06 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 174 bp is the longest possible ORF. GAP: 102 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_35 [Rhodococcus phage NiceHouse], Coverage= 98.2456, SubjectRange= 1:60, QueryRange= 1:56, EValue= 5.44162E-8. HHPRED= Accession= PF09526.14, Description= DUF2387 ; Probable metal-binding protein (DUF2387), Probability= 81.6. Coverage= 21.0526, SubjectRange= 7:19, QueryRange= 7:50. CDD= . 
/note=HHPRED, NCBI blast, and PhagesDB blast are all inconclusive /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (9945 - 11054) /gene="25" /product="hydrolase" /function="hydrolase" /locus tag="Francesca_25" /note=Original Glimmer call @bp 11054 has strength 4.5; Genemark calls start at 11054 /note=SSC: Start = 11054, Stop = 9945. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1110 bp is not the longest possible ORF. GAP: 78 bp. ST: SS=NA. F: hydrolase. FS: PHDBLAST= . NCBIBLAST= PhageName= DUF4185 domain-containing protein [Labilithrix sp.], Coverage= 99.729, SubjectRange= 113:449, QueryRange= 113:369, EValue= 3.6285E-84. HHPRED= Accession= 8HHV_D, Description= endo-alpha-D-arabinanase; D-arabinan, anomer-retaining glycoside hydrolase, HYDROLASE; HET: GOL; 1.6A {Microbacterium arabinogalactanolyticum}, Probability= 100.0. Coverage= 99.458, SubjectRange= 8:325, QueryRange= 8:369. CDD= . /note=Phagesdb blast not informative /note=HHPRED suggests a glycoside hydrolase domain /note=NCBI blast not informative /note=Conserved Domain: not informative /note=Deep TMHMM: N/A CDS complement (11133 - 11861) /gene="26" /product="ThyX-like thymidylate synthase" /function="ThyX-like thymidylate synthase" /locus tag="Francesca_26" /note=Original Glimmer call @bp 11861 has strength 6.78; Genemark calls start at 11861 /note=SSC: Start = 11861, Stop = 11133. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 729 bp is the longest possible ORF. GAP: 7 bp. ST: SS=NA. F: ThyX-like thymidylate synthase. FS: PHDBLAST= PhageName= Peregrin, ProteinNumber= 41, Function= ThyX-like thymidylate synthase, EValue= 6.0E-76. 
NCBIBLAST= PhageName= ThyX-like thymidylate synthase [Rhodococcus phage Peregrin], Coverage= 98.7603, SubjectRange= 1:238, QueryRange= 1:239, EValue= 1.07724E-92. HHPRED= Accession= 6J61_B, Description= Flavin-dependent thymidylate synthase; Thymidylate synthase, pyrimidine nucleotide biosynthetic pathway, C-terminal domain, Structural Genomics, TRANSFERASE; HET: FAD, PO4; 2.5A {Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579)}, Probability= 100.0. Coverage= 95.8678, SubjectRange= 6:221, QueryRange= 6:240. CDD= Accession= pfam02511, Coverage= 84.7107, SubjectRange= 4:185, QueryRange= 4:228, EValue= 2.71601E-30. /note=Phagesdb blast suggests above function /note=HHPRED has excellent results for ThyX-like thymidylate synthase /note=NCBI blast does not have strong results /note=Conserved Domain doesn't have great results but also suggests the chosen function /note=Deep TMHMM: N/A CDS complement (11869 - 12435) /gene="27" /product="glycosyltransferase" /function="glycosyltransferase" /locus tag="Francesca_27" /note=Original Glimmer call @bp 12435 has strength 6.05; Genemark calls start at 12435 /note=SSC: Start = 12435, Stop = 11869. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.925 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 567 bp is not the longest possible ORF. GAP: 105 bp. ST: SS=NA. F: glycosyltransferase. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein PBI_GRAYSON_46 [Rhodococcus phage Grayson], Coverage= 98.4043, SubjectRange= 1:179, QueryRange= 1:185, EValue= 9.48451E-50. HHPRED= Accession= 1XSF_A, Description= Probable resuscitation-promoting factor rpfB; LYSOZYME-LIKE STRUCTURE, CELL CYCLE, HYDROLASE; NMR {Mycobacterium tuberculosis} SCOP: d.2.1.8, Probability= 99.3. Coverage= 40.9574, SubjectRange= 29:106, QueryRange= 29:106. CDD= Accession= pfam06737, Coverage= 38.2979, SubjectRange= 4:75, QueryRange= 4:101, EValue= 4.56842E-40. 
/note=In HHPRED, we found the function to be a resuscitation-promoting factor RpfB. On the Sea-Phages forum, it was discussed that the actual function in a phage for this is glycosyltransferase. /note=Excellent results for HHPRED /note=Bad results for NCBI blast and Conserved Domain /note=Deep TMHMM: N/A CDS complement (12541 - 12801) /gene="28" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_28" /note=Original Glimmer call @bp 12768 has strength 9.77; Genemark calls start at 12768 /note=SSC: Start = 12801, Stop = 12541. (Reverse). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.014 is not the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 261 bp is the longest possible ORF. GAP: 11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 1L8D_B, Description= DNA double-strand break repair rad50 ATPase; zinc finger, rad50, DNA repair, Recombination, hook motif, REPLICATION; HET: CIT, PO4; 2.2A {Pyrococcus furiosus} SCOP: h.4.12.1, Probability= 95.9. Coverage= 88.3721, SubjectRange= 49:111, QueryRange= 49:78. CDD= . /note=Deep TMHMM, Conserved Domain, and NCBI blast did not have results /note=We were considering calling this a MRE11 double-strand break endo/exonuclease based on research on Sea Phages and HHPRED results, but we did not call this function because we did not have enough evidence and HHPRED e-values were too high /note=The start was changed because the new start is the longest ORF and has the best coding potential CDS complement (12813 - 13589) /gene="29" /product="methyltransferase" /function="methyltransferase" /locus tag="Francesca_29" /note=Original Glimmer call @bp 13589 has strength 9.64; Genemark calls start at 13589 /note=SSC: Start = 13589, Stop = 12813. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.705 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 777 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: methyltransferase. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 47, Function= methyltransferase, EValue= 8.0E-95. NCBIBLAST= PhageName= methyltransferase [Rhodococcus phage Trina] >gb|ASZ74864.1| methyltransferase [Rhodococcus phage Trina], Coverage= 98.4496, SubjectRange= 4:256, QueryRange= 4:258, EValue= 5.23738E-116. HHPRED= Accession= PF05050.16, Description= Methyltransf_21 ; Methyltransferase FkbM domain, Probability= 99.4. Coverage= 58.9147, SubjectRange= 1:169, QueryRange= 1:226. CDD= Accession= TIGR01444, Coverage= 50.7752, SubjectRange= 4:142, QueryRange= 4:204, EValue= 1.37119E-24. /note=HHPRED has very good results for methyltransferase /note=NCBI blast and Conserved Domain have weak evidence for methyltransferase /note=Phagesdb blast also has evidence for methyltransferase /note=Deep TMHMM: N/A CDS complement (13586 - 13960) /gene="30" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_30" /note=Original Glimmer call @bp 13960 has strength 7.74; Genemark calls start at 13960 /note=SSC: Start = 13960, Stop = 13586. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.598 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 375 bp is not the longest possible ORF. GAP: -17 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein PBI_GRAYSON_52 [Rhodococcus phage Grayson], Coverage= 93.5484, SubjectRange= 12:126, QueryRange= 12:124, EValue= 1.80712E-6. HHPRED= Accession= 2Q00_B, Description= Orf c02003 protein; P95883, NESG, SSO2109, Structural Genomics, PSI-2, Protein Structure Initiative, Northeast Structural Genomics Consortium, UNKNOWN FUNCTION; 2.4A {Sulfolobus solfataricus P2}, Probability= 83.1. Coverage= 25.0, SubjectRange= 89:120, QueryRange= 89:71. CDD= . 
/note=Starterator only has Francesca and Dorin, but both call the same start /note=HHPRED, NCBI blast, and PhagesDB blast all do not have strong evidence for anything /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS complement (13944 - 14057) /gene="31" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_31" /note=Original Glimmer call @bp 14057 has strength 5.26 /note=SSC: Start = 14057, Stop = 13944. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.777 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 114 bp is not the longest possible ORF. GAP: 62 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 4Z61_C, Description= Somatic embryogenesis receptor kinase 2; hormone receptor, complex, TRANSFERASE; HET: TYS, NAG; 2.75A {Daucus carota}, Probability= 81.1. Coverage= 91.8919, SubjectRange= 1:35, QueryRange= 1:35. CDD= . /note=starterator just has Dorin and Francesca and they both call the same start /note=NCBI Blast, Phagesdb Blast, and HHPRED are all inconclusive /note=Conserved Domains: N/A /note=Deep TMHMM: N/A CDS 14120 - 14338 /gene="32" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_32" /note=Original Glimmer call @bp 14120 has strength 0.83 /note=SSC: Start = 14120, Stop = 14338. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.797 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 219 bp is the longest possible ORF. GAP: 62 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= cd17025, Description= T3SC_IA_ShcF-like; Class IA type III secretion system chaperone protein, similar to Pseudomonas syringae ShcF., Probability= 44.9. Coverage= 70.8333, SubjectRange= 64:115, QueryRange= 64:57. CDD= . 
/note=starterator just has Dorin and Francesca and they call different starts /note=NCBI Blast, Phagesdb Blast, and HHPRED are all inconclusive /note=Conserved Domains: N/A /note=Deep TMHMM: N/A CDS complement (14559 - 15203) /gene="33" /product="serine hydrolase" /function="serine hydrolase" /locus tag="Francesca_33" /note=Original Glimmer call @bp 15203 has strength 7.5; Genemark calls start at 15203 /note=SSC: Start = 15203, Stop = 14559. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 645 bp is the longest possible ORF. GAP: 119 bp. ST: SS=NA. F: serine hydrolase. FS: PHDBLAST= PhageName= Shagrat, ProteinNumber= 108, Function= serine hydrolase, EValue= 3.0E-63. NCBIBLAST= PhageName= cutinase family protein [Prescottella agglutinans] >gb|MDH6285068.1| cutinase [Prescottella agglutinans], Coverage= 94.3925, SubjectRange= 2:198, QueryRange= 2:206, EValue= 5.98719E-70. HHPRED= Accession= 7CW1_B, Description= Cutinase-like enzyme; cutinase-like enzyme, biodegradable plastic degrading enzyme, alpha/beta hydrolase fold, hydrolase; HET: CAD; 1.7A {Pseudozyma antarctica}, Probability= 99.8. Coverage= 75.7009, SubjectRange= 2:182, QueryRange= 2:198. CDD= . /note=HHPRED has a lot of strong evidence for different hydrolases, so we called it just hydrolase; RDJ: cutinase domain suggests serine hydrolase /note=NCBI blast and PhagesDB blast inconclusive /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 15323 - 15667 /gene="34" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_34" /note=Original Glimmer call @bp 15323 has strength 4.56; Genemark calls start at 15323 /note=SSC: Start = 15323, Stop = 15667. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.036 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 345 bp is the longest possible ORF. GAP: 119 bp. 
ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein FDI69_gp055 [Rhodococcus phage Trina] >gb|ASZ74871.1| hypothetical protein SEA_TRINA_55 [Rhodococcus phage Trina], Coverage= 94.7368, SubjectRange= 3:110, QueryRange= 3:114, EValue= 1.89939E-25. HHPRED= Accession= 4PT7_C, Description= Replication initiator A family protein; replication initiation, multidrug resistance, PROTEIN BINDING; HET: SO4; 2.35A {Staphylococcus aureus CA-347}, Probability= 98.3. Coverage= 64.9123, SubjectRange= 31:111, QueryRange= 31:103. CDD= . /note=HHPRED had some evidence for replication initiation protein, but e-values were not as strong as they should be, so called hypothetical protein /note=NCBI blast and PhagesDB blast inconclusive /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 15664 - 15939 /gene="35" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_35" /note=Original Glimmer call @bp 15664 has strength 8.27; Genemark calls start at 15664 /note=SSC: Start = 15664, Stop = 15939. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.881 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 276 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Sham, ProteinNumber= 15, Function= function unknown, EValue= 0.022. NCBIBLAST= . HHPRED= Accession= 5OYM_B, Description= PC4 and SFRS1-interacting protein; Integrase binding, transciptional co-activator, domain swap, TRANSCRIPTION; 2.05A {Homo sapiens} SCOP: l.1.1.1, a.48.4.1, Probability= 76.6. Coverage= 59.3407, SubjectRange= 44:98, QueryRange= 44:66. CDD= . 
/note=Starterator only has Dorin and Francesca but they both have the same start /note=HHPRED, NCBI blast, and phagesDB blast are all inconclusive because there is not strong evidence /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 15936 - 16943 /gene="36" /product="glycosyltransferase" /function="glycosyltransferase" /locus tag="Francesca_36" /note=Original Glimmer call @bp 15936 has strength 3.46; Genemark calls start at 15936 /note=SSC: Start = 15936, Stop = 16943. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 3.12 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1008 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: glycosyltransferase. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 56, Function= glycosyltransferase, EValue= 1.0E-109. NCBIBLAST= PhageName= glycosyltransferase [Rhodococcus phage Trina] >gb|ASZ74872.1| glycosyltransferase [Rhodococcus phage Trina], Coverage= 96.1194, SubjectRange= 1:322, QueryRange= 1:322, EValue= 4.71896E-133. HHPRED= Accession= SCOP_d3c48a1, Description= c.87.1.0 (A:1-418) automated matches {Corynebacterium glutamicum [TaxId: 1718]} | CLASS: Alpha and beta proteins (a/b), FOLD: UDP-Glycosyltransferase/glycogen phosphorylase, SUPFAM: UDP-Glycosyltransferase/glycogen phosphorylase, FAM: automated matches, Probability= 100.0. Coverage= 98.806, SubjectRange= 1:412, QueryRange= 1:332. CDD= Accession= COG0438, Coverage= 72.8358, SubjectRange= 149:377, QueryRange= 149:326, EValue= 2.16493E-8. 
/note=HHPRED has many very strong results for glycosyltransferase /note=NCBI blast and phagesDB blast have weaker evidence for glycosyltransferase /note=Conserved Domain has weak evidence for glycosyltransferase /note=Deep TMHMM: N/A CDS 17107 - 19068 /gene="37" /product="ribonucleotide reductase" /function="ribonucleotide reductase" /locus tag="Francesca_37" /note=Original Glimmer call @bp 17107 has strength 8.92; Genemark calls start at 17107 /note=SSC: Start = 17107, Stop = 19068. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1962 bp is not the longest possible ORF. GAP: 163 bp. ST: SS=NA. F: ribonucleotide reductase. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 37, Function= function unknown, EValue= 0.0. NCBIBLAST= PhageName= ribonucleoside-triphosphate reductase [bacterium], Coverage= 97.5498, SubjectRange= 5:642, QueryRange= 5:644, EValue= 0.0. HHPRED= Accession= cd01676, Description= RNR_II_monomer; Class II ribonucleotide reductase, monomeric form. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides., Probability= 100.0. Coverage= 96.1715, SubjectRange= 1:658, QueryRange= 1:641. CDD= Accession= cd01676, Coverage= 90.8116, SubjectRange= 9:636, QueryRange= 9:613, EValue= 0.0. /note=HHPRED has very strong results for ribonucleotide reductase /note=NCBI Blast and Phagesdb Blast have low probability of ribonucleotide reductase /note=Conserved Domains: low probability of 4. /note=Deep TMHMM: N/A CDS 19184 - 21268 /gene="38" /product="lysin A" /function="lysin A" /locus tag="Francesca_38" /note=Original Glimmer call @bp 19184 has strength 9.88; Genemark calls start at 19184 /note=SSC: Start = 19184, Stop = 21268. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. 
SCS: Start is called by Glimmer and is called by Genemark. LO: 2085 bp is the longest possible ORF. GAP: 115 bp. ST: SS=NA. F: lysin A. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 51, Function= lysin A, EValue= 1.0E-148. NCBIBLAST= PhageName= lysin A [Rhodococcus phage NiceHouse], Coverage= 66.8588, SubjectRange= 4:461, QueryRange= 4:464, EValue= 1.26914E-168. HHPRED= Accession= 4PMO_B, Description= Tat-secreted protein Rv2525c; Tat secretion GH25, unknown function; HET: GOL, FMT; 1.33A {Mycobacterium tuberculosis}, Probability= 99.9. Coverage= 29.5389, SubjectRange= 32:234, QueryRange= 32:691. CDD= Accession= pfam08924, Coverage= 17.5793, SubjectRange= 1:124, QueryRange= 1:620, EValue= 1.19126E-28. /note=HHPRED had a lot of strong evidence that this protein has an unknown function or is a hypothetical protein /note=NCBI blast, phagesDB, and Conserved Domain did not have very strong evidence /note=Deep TMHMM: N/A CDS 21303 - 21839 /gene="39" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_39" /note=Original Glimmer call @bp 21303 has strength 1.2; Genemark calls start at 21303 /note=SSC: Start = 21303, Stop = 21839. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.414 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 537 bp is the longest possible ORF. GAP: 34 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= PhageName= Thibault, ProteinNumber= 132, Function= HNH endonuclease, EValue= 1.0E-23. NCBIBLAST= PhageName= NUMOD4 motif-containing HNH endonuclease [Rhodococcus rhodochrous], Coverage= 97.191, SubjectRange= 12:192, QueryRange= 12:175, EValue= 2.9152E-31. HHPRED= . CDD= Accession= pfam07463, Coverage= 28.0899, SubjectRange= 1:48, QueryRange= 1:52, EValue= 5.90586E-13. 
/note=HHPRED has very strong results for HNH endonuclease /note=Phages DB blast also has good evidence for HNH endonuclease /note=NCBI blast and Conserved Domain have weak evidence for HNH endonuclease as well /note=Deep TMHMM: N/A CDS 22173 - 22898 /gene="40" /product="serine protease" /function="serine protease" /locus tag="Francesca_40" /note=Original Glimmer call @bp 22173 has strength 7.4; Genemark calls start at 22173 /note=SSC: Start = 22173, Stop = 22898. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 726 bp is not the longest possible ORF. GAP: 333 bp. ST: SS=NA. F: serine protease. FS: PHDBLAST= . NCBIBLAST= PhageName= serine protease [Rhodococcus sp. T7] >gb|KAF0960058.1| hypothetical protein MLGJGCBP_06810 [Rhodococcus sp. T7], Coverage= 99.1701, SubjectRange= 1:223, QueryRange= 1:240, EValue= 1.70838E-40. HHPRED= Accession= 6KBR_A, Description= Kallikrein-4; Protein engineering, Cystine knot protein, Protease inhibitor, Structural analysis, PROTEIN BINDING, HYDROLASE-HYDROLASE INHIBITOR complex; 2.0A {Homo sapiens}, Probability= 99.7. Coverage= 75.1037, SubjectRange= 52:250, QueryRange= 52:230. CDD= . /note=HHPRED has a lot of good results for serine protease /note=NCBI blast has weak results for serine protease /note=PhagesDB blast is inconclusive /note=Conserved Domain: inconclusive /note=Deep TMHMM: only one, so inconclusive CDS 22959 - 24470 /gene="41" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_41" /note=Original Glimmer call @bp 22959 has strength 7.83; Genemark calls start at 22959 /note=SSC: Start = 22959, Stop = 24470. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 0.075 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1512 bp is the longest possible ORF. GAP: 60 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 55, Function= function unknown, EValue= 6.0E-83. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_55 [Rhodococcus phage NiceHouse], Coverage= 99.8012, SubjectRange= 1:502, QueryRange= 1:502, EValue= 1.62887E-88. HHPRED= Accession= 3IPF_A, Description= uncharacterized protein; Q251Q8_DESHY, NESG, DhR8c, Structural Genomics, PSI-2, Protein Structure Initiative, Northeast Structural Genomics Consortium, unknown function; 1.988A {Desulfitobacterium hafniense}, Probability= 37.9. Coverage= 6.75944, SubjectRange= 15:49, QueryRange= 15:177. CDD= . /note=HHPRED, NCBI blast, and phagesDB blast inconclusive and no good evidence /note=Conserved Domain: N/A /note=Deep TMHMM: N/A CDS 24494 - 24841 /gene="42" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_42" /note=Original Glimmer call @bp 24494 has strength 10.11; Genemark calls start at 24494 /note=SSC: Start = 24494, Stop = 24841. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.233 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 348 bp is the longest possible ORF. GAP: 23 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein [Oxalobacteraceae bacterium], Coverage= 80.0, SubjectRange= 19:103, QueryRange= 19:108, EValue= 1.0116E-17. HHPRED= Accession= PF08861.14, Description= DUF1828 ; Domain of unknown function DUF1828, Probability= 93.3. Coverage= 70.4348, SubjectRange= 7:84, QueryRange= 7:100. CDD= Accession= PRK08560, Coverage= 42.6087, SubjectRange= 79:126, QueryRange= 79:98, EValue= 0.00466047. 
/note=HHPRED, NCBI blast, and phagesDB are all inconclusive and have very weak evidence /note=Conserved Domain: inconclusive /note=Deep TMHMM: N/A CDS 24841 - 26553 /gene="43" /product="portal protein" /function="portal protein" /locus tag="Francesca_43" /note=Original Glimmer call @bp 24841 has strength 10.22; Genemark calls start at 24841 /note=SSC: Start = 24841, Stop = 26553. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.108 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1713 bp is not the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: portal protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 69, Function= portal protein, EValue= 1.0E-157. NCBIBLAST= PhageName= portal protein [Rhodococcus phage Trina] >gb|ASZ74883.1| portal protein [Rhodococcus phage Trina], Coverage= 95.9649, SubjectRange= 5:562, QueryRange= 5:570, EValue= 0.0. HHPRED= Accession= 6TE9_A, Description= Phage portal protein, HK97 family; "neck", "portal", "capsid", "tail tube", VIRUS; 3.58A {Rhodobacter capsulatus}, Probability= 100.0. Coverage= 66.1404, SubjectRange= 47:396, QueryRange= 47:468. CDD= Accession= TIGR01540, Coverage= 56.8421, SubjectRange= 2:306, QueryRange= 2:408, EValue= 2.80673E-28. /note=HHPRED and phagesDB blast have excellent results for portal protein /note=NCBI blast and Conserved Domain have weak evidence for portal protein /note=Deep TMHMM: N/A CDS 26611 - 27900 /gene="44" /product="capsid maturation protease" /function="capsid maturation protease" /locus tag="Francesca_44" /note=Original Glimmer call @bp 26611 has strength 6.56; Genemark calls start at 26611 /note=SSC: Start = 26611, Stop = 27900. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 0.947 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1290 bp is the longest possible ORF. GAP: 57 bp. ST: SS=NA. F: capsid maturation protease. 
FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 44, Function= function unknown, EValue= 0.0. NCBIBLAST= PhageName= head maturation protease [Rhodococcus phage Trina] >gb|ASZ74885.1| capsid maturation protease [Rhodococcus phage Trina], Coverage= 96.0373, SubjectRange= 17:434, QueryRange= 17:424, EValue= 1.19923E-79. HHPRED= Accession= PF04586.21, Description= Peptidase_S78 ; Caudovirus prohead serine protease, Probability= 99.6. Coverage= 35.6643, SubjectRange= 4:158, QueryRange= 4:174. CDD= Accession= COG5271, Coverage= 39.627, SubjectRange= 3804:3949, QueryRange= 3804:414, EValue= 4.87296E-4. /note=Does not have the most annotated start, start found in 7/101 in pham and called 57% of the time it is present. /note=Called start has poor (though still best of the group) zscores. Covers all cp and is LORF. Any other start would increase gap, further shortening the gene. /note=PhagesDB Blast: Dorin, Trina, CB cluster, and BE2 cluster are all hit. Capsid maturation protein is their function call (not Dorin, still being annotated) /note=HHPRED: good scores (evalue and probability) for serine protease /note=NCBI Blast: Good evalues (better than HHPRED) for head maturation protease and capsid maturation protease (Rhodococcus phages). /note=Conserved Domain Database: has some indications of more domains, but all have bad evalues and coverages. /note=Going with the better evalues/weight of NCBI Blast for function call CDS 27968 - 28978 /gene="45" /product="major capsid protein" /function="major capsid protein" /locus tag="Francesca_45" /note=Original Glimmer call @bp 27968 has strength 7.56; Genemark calls start at 27980 /note=SSC: Start = 27968, Stop = 28978. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 1011 bp is the longest possible ORF. GAP: 67 bp. ST: SS=NA. F: major capsid protein. 
FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 45, Function= function unknown, EValue= 0.0. NCBIBLAST= PhageName= virion structural protein [Rhodococcus phage Trina] >gb|ASZ74886.1| major capsid protein [Rhodococcus phage Trina], Coverage= 99.7024, SubjectRange= 3:333, QueryRange= 3:336, EValue= 1.65652E-139. HHPRED= Accession= 6TSU_A4, Description= Major capsid protein Rcc01687; "capsid", "jelly roll", "spike", "HK97", VIRUS; 3.42A {Rhodobacter capsulatus DE442}, Probability= 100.0. Coverage= 97.3214, SubjectRange= 86:385, QueryRange= 86:333. CDD= . /note=Starterator: has and calls the most annotated start /note=Best zscore, covers all cp, and LORF /note=PhagesDB Blast: similar to Dorin, Trina, CB cluster, and NiceHouse (all but Dorin called major capsid protein) /note=HHPRED: really good hits (good evalues, probability, and coverage) for major capsid protein. /note=NCBI Blast: great evalues (1.6e-139) for major capsid protein in Rhodococcus phages. /note=Accepted functions list has nothing to add...all good here CDS 29036 - 29464 /gene="46" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_46" /note=Original Glimmer call @bp 29057 has strength 8.04; Genemark calls start at 29057 /note=SSC: Start = 29036, Stop = 29464. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.445 is not the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 429 bp is the longest possible ORF. GAP: 57 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 46, Function= function unknown, EValue= 3.0E-77. NCBIBLAST= PhageName= hypothetical protein FDI69_gp073 [Rhodococcus phage Trina] >gb|ASZ74887.1| hypothetical protein SEA_TRINA_73 [Rhodococcus phage Trina], Coverage= 95.0704, SubjectRange= 1:134, QueryRange= 1:142, EValue= 1.31371E-14. HHPRED= Accession= PF21488.1, Description= YqbF_HeH ; YqbF, HeH motif, Probability= 97.8. 
Coverage= 28.169, SubjectRange= 3:38, QueryRange= 3:49. CDD= . /note=Chosen start is slightly better in RBS scores, is slightly longer, and is the LORF. In starterator, only Dorin also calls this start in a pham of 7 members. /note=PhagesDB Blast has Dorin, Trina, NiceHouse, and CB cluster genes. /note=HHPRED: has no good hits (poor evalues and coverage) /note=NCBI Blast has no good hits CDS 29461 - 30330 /gene="47" /product="head-to-tail adaptor" /function="head-to-tail adaptor" /locus tag="Francesca_47" /note=Original Glimmer call @bp 29461 has strength 9.41; Genemark calls start at 29461 /note=SSC: Start = 29461, Stop = 30330. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.968 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 870 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: head-to-tail adaptor. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 47, Function= function unknown, EValue= 1.0E-165. NCBIBLAST= PhageName= head-to-tail adaptor [Rhodococcus phage NiceHouse], Coverage= 98.6159, SubjectRange= 3:293, QueryRange= 3:289, EValue= 6.66631E-49. HHPRED= Accession= 8HQO_S, Description= Head completion protein; Neck, Portal, T5, VIRUS, VIRAL PROTEIN; 3.2A {Escherichia phage DT57C}, Probability= 99.9. Coverage= 63.6678, SubjectRange= 1:170, QueryRange= 1:284. CDD= . /note=Starterator: does not call the most annotated, found in 2/7 of the pham and called 100% of the time it is found (Dorin and Francesca) /note=Covers all cp, has passable zscores (the best would dramatically shorten the gene) and is LORF. /note=PhagesDB Blast has hits for NiceHouse, Dorin, Trina, and several CB cluster phages. /note=HHPRED: a couple of hits for Head completion protein/Neck 1 protein/Neck portal protein. All have good evalues and probabilities. /note=NCBI Blast: a few good hits with high coverage but low identity. All have good evalues. 
Functions include head-to-tail protein (called in NiceHouse) and the rest are hypothetical proteins. /note=Checked approved functions list and it hits the requirements of containing the HK97. CDS 30331 - 30750 /gene="48" /product="head-to-tail stopper" /function="head-to-tail stopper" /locus tag="Francesca_48" /note=Original Glimmer call @bp 30331 has strength 8.3; Genemark calls start at 30331 /note=SSC: Start = 30331, Stop = 30750. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.581 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 420 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: head-to-tail stopper. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 48, Function= function unknown, EValue= 7.0E-78. NCBIBLAST= PhageName= head-to-tail stopper [Rhodococcus phage NiceHouse], Coverage= 99.2806, SubjectRange= 1:139, QueryRange= 1:138, EValue= 6.11846E-36. HHPRED= Accession= 6TE9_E, Description= Stopper protein Rcc01689; "neck", "portal", "capsid", "tail tube", VIRUS; 3.58A {Rhodobacter capsulatus}, Probability= 93.3. Coverage= 58.2734, SubjectRange= 1:80, QueryRange= 1:86. CDD= . /note=Starterator: does not have the most annotated, start called in 4/98 in the pham. Trina, NiceHouse, and Dorin also called this start. /note=Z-score could be improved if we moved the start, but this would shorten the gene and is not supported by starterator. As is, the start covers all the cp and is LORF. /note=PhagesDB Blast has a hit on Dorin, NiceHouse, and Trina as well as members of the BE2 cluster. NiceHouse calls function as Head-to-tail stopper, all others as function unknown. /note=HHPRED: no good hits /note=NCBI Blast: Good evalues and coverage for head-to-tail protein (in NiceHouse). The rest of the hits on this blast were for hypothetical protein. 
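The LO and GAP figures in the forward-strand records above follow from simple coordinate arithmetic. A minimal sketch, assuming ORF length is stop minus start plus one and GAP is this gene's start minus the previous gene's stop minus one; both formulas are inferred from the numbers in these records rather than taken from the annotation software, and the function names are hypothetical:

```python
def orf_length(start: int, stop: int) -> int:
    """Length in bp of a forward-strand ORF with inclusive 1-based coordinates."""
    return stop - start + 1


def gap_to_previous(prev_stop: int, start: int) -> int:
    """Bases between the previous gene's stop and this gene's start.

    Negative values mean the two genes overlap (assumed convention).
    """
    return start - prev_stop - 1


# (gene, start, stop) copied from the forward-strand records for genes 46-48.
genes = [(46, 29036, 29464), (47, 29461, 30330), (48, 30331, 30750)]

# Walk consecutive gene pairs and print: gene number, ORF length, gap.
for (_, _, prev_stop), (gene, start, stop) in zip(genes, genes[1:]):
    print(gene, orf_length(start, stop), gap_to_previous(prev_stop, start))
```

Under these assumptions the computed values agree with the LO and GAP entries recorded for genes 47 and 48. Reverse-strand genes are not covered by this sketch.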
CDS 30750 - 31547 /gene="49" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_49" /note=Original Glimmer call @bp 30750 has strength 9.9; Genemark calls start at 30750 /note=SSC: Start = 30750, Stop = 31547. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.539 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 798 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 49, Function= function unknown, EValue= 1.0E-152. NCBIBLAST= PhageName= hypothetical protein FDI69_gp076 [Rhodococcus phage Trina] >gb|ASZ74890.1| hypothetical protein SEA_TRINA_76 [Rhodococcus phage Trina], Coverage= 98.8679, SubjectRange= 7:265, QueryRange= 7:265, EValue= 6.92112E-44. HHPRED= Accession= PF12685.11, Description= SpoIIIAH ; SpoIIIAH-like protein, Probability= 55.9. Coverage= 16.9811, SubjectRange= 106:153, QueryRange= 106:263. CDD= . /note=Starterator: does not call the most annotated start, but called 100% of time when present (2/2; Francesca and Dorin). /note=Called start covers all cp, has the best zscore, and is the LORF. /note=PhagesDB Blast: matches with many in the BE1 cluster as well as Dorin, NiceHouse, and Trina. /note=HHPRED: no good hits, all have poor evalues /note=NCBI Blast: good evalues for hypothetical proteins in Rhodococcus phages as well as a Streptomyces phage. CDS 31544 - 32083 /gene="50" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_50" /note=Original Glimmer call @bp 31544 has strength 7.53; Genemark calls start at 31544 /note=SSC: Start = 31544, Stop = 32083. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.416 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 540 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 66, Function= function unknown, EValue= 4.0E-25. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_66 [Rhodococcus phage NiceHouse], Coverage= 97.2067, SubjectRange= 4:175, QueryRange= 4:176, EValue= 8.68541E-29. HHPRED= Accession= 6TE9_F, Description= Tail terminator protein Rcc01690; "neck", "portal", "capsid", "tail tube", VIRUS; 3.58A {Rhodobacter capsulatus}, Probability= 47.7. Coverage= 27.3743, SubjectRange= 41:94, QueryRange= 41:107. CDD= . /note=Suggested Start is LORF, has best scores, has smallest gap/overlap, only start that contains all CP. /note= /note=All significant BLAST hits are for hypothetical protein, no significant HHPRED hits. CDS 32173 - 32847 /gene="51" /product="major tail protein" /function="major tail protein" /locus tag="Francesca_51" /note=Original Glimmer call @bp 32173 has strength 10.87; Genemark calls start at 32173 /note=SSC: Start = 32173, Stop = 32847. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 675 bp is the longest possible ORF. GAP: 89 bp. ST: SS=NA. F: major tail protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 67, Function= major tail protein, EValue= 1.0E-50. NCBIBLAST= PhageName= protein with Ig domain [Rhodococcus phage Trina] >gb|ASZ74892.1| hypothetical protein SEA_TRINA_78 [Rhodococcus phage Trina], Coverage= 98.6607, SubjectRange= 1:219, QueryRange= 1:221, EValue= 2.53625E-74. HHPRED= Accession= 6XGR_M, Description= YSD1_22 major tail protein; Bacteriophage tail, helical assembly, VIRAL PROTEIN; 3.5A {Bacteriophage sp.}, Probability= 99.0. Coverage= 92.4107, SubjectRange= 2:266, QueryRange= 2:208. CDD= . /note=Start kept due to being longest ORF, best scores, and covering all CP. Gene location, BLAST results and the top HHpred result provide evidence for calling this protein a major tail protein. 
CDS 32888 - 33412 /gene="52" /product="tail assembly chaperone" /function="tail assembly chaperone" /locus tag="Francesca_52" /note=Original Glimmer call @bp 32888 has strength 11.14; Genemark calls start at 32915 /note=SSC: Start = 32888, Stop = 33412. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.042 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 525 bp is not the longest possible ORF. GAP: 40 bp. ST: SS=NA. F: tail assembly chaperone. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 79, Function= tail assembly chaperone, EValue= 1.0E-25. NCBIBLAST= PhageName= hypothetical protein FDI69_gp079 [Rhodococcus phage Trina] >gb|ASZ74893.1| tail assembly chaperone [Rhodococcus phage Trina], Coverage= 91.3793, SubjectRange= 3:159, QueryRange= 3:172, EValue= 4.25761E-28. HHPRED= Accession= PF11836.12, Description= Phage_TAC_11 ; Phage tail tube protein, GTA-gp10, Probability= 97.1. Coverage= 77.0115, SubjectRange= 5:90, QueryRange= 5:156. CDD= . /note=Start kept due to covering all CP and not having enough evidence to change it despite mediocre scores. Function likely a tail assembly chaperone due to BLAST evidence as well as its location directly upstream of the tape measure protein. /note= /note=4/16 NOTES: /note=Francesca shows a translational frameshift in this gene. /note=Gene 52 starts at 32888 and ends at 33412. This gene (52) will slip frames at base 33397 (the A base is used twice in translation). This creates an alternative version of the gene that starts at 32888 and ends at 33732. /note=There is a fair amount of coding potential that supports this conclusion. /note=Near the end of Francesca's gene 52 there is the sequence GGGAAAA where the gene slips at the first A. 
This exact slip sequence seems to be conserved across clusters CE and CR, as evidenced by similarity between Dorin, Francesca, and NiceHouse CDS 33738 - 40337 /gene="53" /product="tape measure protein" /function="tape measure protein" /locus tag="Francesca_53" /note=Original Glimmer call @bp 33738 has strength 7.45; Genemark calls start at 33738 /note=SSC: Start = 33738, Stop = 40337. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.648 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 6600 bp is the longest possible ORF. GAP: 325 bp. ST: SS=NA. F: tape measure protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 70, Function= tape measure protein, EValue= 0.0. NCBIBLAST= PhageName= tape measure protein [Rhodococcus phage NiceHouse], Coverage= 66.3029, SubjectRange= 861:2309, QueryRange= 861:2197, EValue= 5.63128E-153. HHPRED= Accession= 7ZHJ_g, Description= Pore-forming tail tip protein pb2; Bacteriophage, Siphophage, T5, baseplate, VIRAL PROTEIN; 3.53A {Escherichia phage T5}, Probability= 99.2. Coverage= 11.5962, SubjectRange= 41:311, QueryRange= 41:380. CDD= Accession= TIGR01760, Coverage= 15.0068, SubjectRange= 1:317, QueryRange= 1:499, EValue= 2.35993E-22. /note=Start kept due to covering all CP, longest ORF and good scores. Function called as tape measure protein due to strong BLAST results and conserved domain results as well as its position in the genome. CDS 40330 - 40776 /gene="54" /product="minor tail protein" /function="minor tail protein" /locus tag="Francesca_54" /note=Original Glimmer call @bp 40330 has strength 4.48; Genemark calls start at 40330 /note=SSC: Start = 40330, Stop = 40776. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 447 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: minor tail protein. 
FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 71, Function= minor tail protein, EValue= 1.0E-29. NCBIBLAST= PhageName= minor tail protein [Rhodococcus phage NiceHouse], Coverage= 94.5946, SubjectRange= 3:135, QueryRange= 3:148, EValue= 8.62635E-34. HHPRED= Accession= PF20458.2, Description= DUF6711 ; Family of unknown function (DUF6711), Probability= 99.9. Coverage= 91.8919, SubjectRange= 4:134, QueryRange= 4:148. CDD= . /note=Start kept due to covering all CP, longest ORF and best scores. Function being called as minor tail protein supported by strong BLAST results as well as its location on genome. CDS 40780 - 46137 /gene="55" /product="minor tail protein" /function="minor tail protein" /locus tag="Francesca_55" /note=Original Glimmer call @bp 40780 has strength 5.31; Genemark calls start at 40780 /note=SSC: Start = 40780, Stop = 46137. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.325 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 5358 bp is the longest possible ORF. GAP: 3 bp. ST: SS=NA. F: minor tail protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 83, Function= minor tail protein, EValue= 0.0. NCBIBLAST= PhageName= minor tail protein [Rhodococcus phage Trina] >gb|ASZ74897.1| minor tail protein [Rhodococcus phage Trina], Coverage= 64.4818, SubjectRange= 285:1338, QueryRange= 285:1511, EValue= 0.0. HHPRED= Accession= 8EON_E, Description= Baseplate hub gp41; Pseudomonas, phage, baseplate, VIRUS;{Pseudomonas phage vB_PaeM_E217}, Probability= 96.8. Coverage= 12.8291, SubjectRange= 1:206, QueryRange= 1:715. CDD= Accession= pfam05345, Coverage= 2.46499, SubjectRange= 1:49, QueryRange= 1:1600, EValue= 6.93037E-4. /note=Start kept due to being longest ORF, best scores, and covering all CP. Function called as minor tail protein due to conserved domain results, very strong BLAST results and its location on the genome. 
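The LO and GAP values in the SSC notes, and the frameshift coordinates recorded for gene 52, follow from simple coordinate arithmetic. The following is a minimal Python sketch, assuming 1-based inclusive coordinates on the forward strand. The function names are hypothetical and are not part of Glimmer, GeneMark, or any annotation tool.

```python
def orf_length(start, stop):
    """Length in bp of a forward-strand feature with 1-based inclusive coordinates."""
    return stop - start + 1


def gap(prev_stop, start):
    """Bases between the previous gene's stop and this gene's start.

    A negative value means the two genes overlap.
    """
    return start - prev_stop - 1


def frameshift_length(start, slip, stop):
    """Total translated length when the base at `slip` is read twice.

    Assumption: the slip is modeled as two segments, start..slip and slip..stop,
    so the shared base is counted in both.
    """
    return orf_length(start, slip) + orf_length(slip, stop)


# Coordinates taken from the notes for genes 52 to 55.
print(orf_length(40780, 46137))   # gene 55 LO
print(gap(40337, 40330))          # gene 54 GAP (overlaps gene 53)
print(gap(40776, 40780))          # gene 55 GAP
total = frameshift_length(32888, 33397, 33732)
print(total, total % 3)           # gene 52 frameshifted product
```

The first three results match the recorded values (5358 bp, -8 bp, 3 bp). The frameshifted gene 52 product comes to 846 translated bases, a whole number of codons, which is consistent with the slip position and alternative stop given in the notes.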
CDS 46166 - 46585 /gene="56" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_56" /note=Original Glimmer call @bp 46166 has strength 6.9; Genemark calls start at 46166 /note=SSC: Start = 46166, Stop = 46585. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.583 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 420 bp is the longest possible ORF. GAP: 28 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 84, Function= function unknown, EValue= 1.0E-9. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_73 [Rhodococcus phage NiceHouse], Coverage= 98.5611, SubjectRange= 2:124, QueryRange= 2:139, EValue= 7.18652E-4. HHPRED= Accession= SCOP_d2ia7a1, Description= d.373.1.1 (A:23-133) Uncharacterized protein GSU0986 {Geobacter sulfurreducens [TaxId: 35554]} | CLASS: Alpha and beta proteins (a+b), FOLD: gpW/gp25-like, SUPFAM: gpW/gp25-like, FAM: gpW/gp25-like, Probability= 80.7. Coverage= 28.777, SubjectRange= 65:106, QueryRange= 65:138. CDD= . /note=Start kept due to covering all CP, longest ORF and good scores. Function unknown due to lack of significant HHpred results and BLAST results with called functions. CDS 46582 - 47379 /gene="57" /product="minor tail protein" /function="minor tail protein" /locus tag="Francesca_57" /note=Original Glimmer call @bp 46582 has strength 7.81; Genemark calls start at 46582 /note=SSC: Start = 46582, Stop = 47379. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.887 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 798 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: minor tail protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 74, Function= minor tail protein, EValue= 1.0E-76. 
NCBIBLAST= PhageName= minor tail protein [Rhodococcus phage NiceHouse], Coverage= 100.0, SubjectRange= 1:264, QueryRange= 1:265, EValue= 2.51903E-95. HHPRED= Accession= SCOP_d4v0fa1, Description= b.18.1.30 (A:4-170) Endoglucanase H {Ruminiclostridium thermocellum [TaxId: 572545]} | CLASS: All beta proteins, FOLD: Galactose-binding domain-like, SUPFAM: Galactose-binding domain-like, FAM: CBM11, Probability= 98.8. Coverage= 52.0755, SubjectRange= 3:165, QueryRange= 3:232. CDD= . /note=Start kept due to being longest ORF, okay scores, covering all CP and Starterator. Function called as minor tail protein due to strong BLAST results and supported by its location on the genome close to the tape measure protein. CDS 47380 - 51924 /gene="58" /product="minor tail protein" /function="minor tail protein" /locus tag="Francesca_58" /note=Original Glimmer call @bp 47380 has strength 4.09; Genemark calls start at 47380 /note=SSC: Start = 47380, Stop = 51924. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.449 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 4545 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: minor tail protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 75, Function= minor tail protein, EValue= 0.0. NCBIBLAST= PhageName= minor tail protein [Rhodococcus phage NiceHouse], Coverage= 99.6697, SubjectRange= 5:1472, QueryRange= 5:1513, EValue= 0.0. HHPRED= Accession= 6TPW_A, Description= Receptor-type tyrosine-protein phosphatase F; Fibronectin type-III, adhesion protein, CELL ADHESION; HET: SO4; 2.9A {Homo sapiens}, Probability= 99.8. Coverage= 27.2127, SubjectRange= 13:399, QueryRange= 13:458. CDD= . /note=Start kept due to best scores, covering all CP, longest ORF and Starterator. Function called as minor tail protein due to very strong BLAST results; function call also supported by its location on the genome. 
Significant HHpred hits for receptor-type tyrosine-protein phosphatase were found but were not as significant as BLAST results and so were not used in calling function. CDS 51942 - 52589 /gene="59" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_59" /note=Original Glimmer call @bp 51942 has strength 2.16; Genemark calls start at 51942 /note=SSC: Start = 51942, Stop = 52589. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 648 bp is the longest possible ORF. GAP: 17 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 59, Function= function unknown, EValue= 5.0E-20. NCBIBLAST= . HHPRED= Accession= 7M3M_A, Description= Capsid protein 2; canine parvovirus, CPV, VIRUS; 2.26A {Canine parvovirus type 2}, Probability= 65.6. Coverage= 18.1395, SubjectRange= 10:48, QueryRange= 10:214. CDD= . /note=Start kept due to best scores, longest ORF and covering all CP. Function unknown due to lack of significant BLAST and HHpred results. CDS 52622 - 52831 /gene="60" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_60" /note=Original Glimmer call @bp 52622 has strength 5.58; Genemark calls start at 52622 /note=SSC: Start = 52622, Stop = 52831. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.149 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 210 bp is the longest possible ORF. GAP: 32 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Talia1610_Draft, ProteinNumber= 276, Function= function unknown, EValue= 0.55. NCBIBLAST= . HHPRED= Accession= 3W92_B, Description= Thioester coiled coil peptide; Transcription; HET: TYZ, MCR; 1.35A {N/A}, Probability= 89.0. Coverage= 37.6812, SubjectRange= 6:32, QueryRange= 6:55. CDD= . 
/note=Suggested start is LORF, has best scores, has smallest gap. /note= /note=No significant BLAST or HHPRED hits but good CP, so hypothetical protein is called. CDS 52832 - 53263 /gene="61" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_61" /note=Original Glimmer call @bp 52832 has strength 9.24; Genemark calls start at 52832 /note=SSC: Start = 52832, Stop = 53263. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 432 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 78, Function= function unknown, EValue= 4.0E-44. NCBIBLAST= PhageName= hypothetical protein FDI69_gp089 [Rhodococcus phage Trina] >gb|ASZ74903.1| hypothetical protein SEA_TRINA_89 [Rhodococcus phage Trina], Coverage= 99.3007, SubjectRange= 4:145, QueryRange= 4:143, EValue= 8.27157E-51. HHPRED= Accession= 7C96_A, Description= RxLR effector protein Avh6; Complex, Inhibitor, Self ubiquitination, negative regulatory of Plant immunity, IMMUNE SYSTEM; HET: MSE; 2.51A {Phytophthora sojae (strain P6497)}, Probability= 81.8. Coverage= 17.4825, SubjectRange= 1:26, QueryRange= 1:35. CDD= . /note=Suggested start is LORF, start that contains the most CP, has best scores, shortest gap. /note= /note=All significant BLAST hits are for hypothetical proteins. No significant HHPRED hits. CDS 53332 - 54090 /gene="62" /product="metallophosphoesterase" /function="metallophosphoesterase" /locus tag="Francesca_62" /note=Original Glimmer call @bp 53332 has strength 8.83; Genemark calls start at 53332 /note=SSC: Start = 53332, Stop = 54090. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.438 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 759 bp is the longest possible ORF. GAP: 68 bp. ST: SS=NA. F: metallophosphoesterase. 
FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 79, Function= metallophosphoesterase, EValue= 5.0E-67. NCBIBLAST= PhageName= metallophosphoesterase [Rhodococcus phage NiceHouse], Coverage= 99.2064, SubjectRange= 1:255, QueryRange= 1:250, EValue= 7.10729E-80. HHPRED= Accession= 2A22_B, Description= vacuolar protein sorting 29; VACUOLAR PROTEIN SORTING PROTEIN, ALPHA-BETA-BETA-ALPHA SANDWICH, Structural Genomics, Structural Genomics Consortium, SGC, PROTEIN TRANSPORT; 2.198A {Cryptosporidium parvum} SCOP: d.159.1.7, l.1.1.1, Probability= 99.8. Coverage= 93.6508, SubjectRange= 23:202, QueryRange= 23:238. CDD= . /note=Suggested Start is LORF, contains all CP, has best scores, and is called 100% of the time when present (with >70 phages calling this start). /note= /note=BLAST had multiple Q1:T1 results with metallophosphoesterase functions, all with very good E-values. CDS 54093 - 54257 /gene="63" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_63" /note=Original Glimmer call @bp 54093 has strength 9.58; Genemark calls start at 54093 /note=SSC: Start = 54093, Stop = 54257. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.003 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 165 bp is the longest possible ORF. GAP: 2 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF03672.17, Description= UPF0154 ; Uncharacterised protein family (UPF0154), Probability= 80.9. Coverage= 66.6667, SubjectRange= 9:43, QueryRange= 9:44. CDD= . /note=Suggested Start is LORF, contains all CP, has best scores, also called in Francesca. /note= /note=No significant BLAST or HHPRED results, signifying unknown function. 
CDS 54247 - 55059 /gene="64" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_64" /note=Original Glimmer call @bp 54247 has strength 6.04; Genemark calls start at 54247 /note=SSC: Start = 54247, Stop = 55059. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.031 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 813 bp is not the longest possible ORF. GAP: -11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 91, Function= function unknown, EValue= 3.0E-59. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_80 [Rhodococcus phage NiceHouse], Coverage= 96.2963, SubjectRange= 5:264, QueryRange= 5:263, EValue= 3.56043E-70. HHPRED= Accession= cd19437, Description= lipocalin_apoD-like; apolipoprotein D and similar proteins. Human apolipoprotein D (ApoD) is a small glycoprotein associated with high density lipoproteins (HDL) in plasma., Probability= 47.2. Coverage= 17.4074, SubjectRange= 107:153, QueryRange= 107:202. CDD= . /note=Start contains all CP, has best scores, also called in Francesca. /note= /note=All BLAST and HHPRED results are for proteins with unknown functions. CDS 55074 - 55319 /gene="65" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_65" /note=Original Glimmer call @bp 55074 has strength 8.07; Genemark calls start at 55056 /note=SSC: Start = 55074, Stop = 55319. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 246 bp is not the longest possible ORF. GAP: 14 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= OneDirection, ProteinNumber= 19, Function= function unknown, EValue= 3.0E-9. 
NCBIBLAST= PhageName= hypothetical protein [Gordonia alkanivorans] >gb|MDJ0010088.1| hypothetical protein [Gordonia alkanivorans] >gb|MDJ0495722.1| hypothetical protein [Gordonia alkanivorans], Coverage= 97.5309, SubjectRange= 1:77, QueryRange= 1:79, EValue= 5.25556E-10. HHPRED= Accession= PF04531.17, Description= Phage_holin_1 ; Bacteriophage holin, Probability= 87.7. Coverage= 71.6049, SubjectRange= 10:71, QueryRange= 10:70. CDD= . /note=Suggested start has best scores, is called 100% of the time when present (in >60 phages) and contains all CP. /note= /note=All significant BLAST and HHPRED results were for proteins with unknown function, leading to a hypothetical protein call. CDS 55399 - 55863 /gene="66" /product="helix-turn-helix DNA binding domain" /function="helix-turn-helix DNA binding domain" /locus tag="Francesca_66" /note=Original Glimmer call @bp 55399 has strength 2.14; Genemark calls start at 55459 /note=SSC: Start = 55399, Stop = 55863. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.776 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 465 bp is not the longest possible ORF. GAP: 79 bp. ST: SS=NA. F: helix-turn-helix DNA binding domain. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 82, Function= helix-turn-helix DNA-binding domain protein, EValue= 2.0E-38. NCBIBLAST= PhageName= helix-turn-helix DNA-binding domain protein [Rhodococcus phage NiceHouse], Coverage= 90.9091, SubjectRange= 7:146, QueryRange= 7:140, EValue= 1.80352E-45. HHPRED= Accession= SCOP_d6hn7b1, Description= a.6.1.5 (B:1-69) automated matches {Escherichia phage [TaxId: 10710]} | CLASS: All alpha proteins, FOLD: Putative DNA-binding domain, SUPFAM: Putative DNA-binding domain, FAM: Terminase gpNU1 subunit domain, Probability= 98.7. Coverage= 40.2597, SubjectRange= 2:56, QueryRange= 2:108. CDD= . /note=Start contains all CP, has the best scores out of CP-containing starts. 
However, start has longer gap than other potential starts. /note= /note=Blast results are for HTH protein. However, HHPRED was down at the time of annotation, so this should be reviewed further. /note=HHPred alignment shows clear HTH domain CDS 55860 - 56180 /gene="67" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_67" /note=Original Glimmer call @bp 55860 has strength 9.94; Genemark calls start at 55860 /note=SSC: Start = 55860, Stop = 56180. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.923 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 321 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 94, Function= function unknown, EValue= 8.0E-13. NCBIBLAST= PhageName= hypothetical protein FDI69_gp094 [Rhodococcus phage Trina] >gb|ASZ74908.1| hypothetical protein SEA_TRINA_94 [Rhodococcus phage Trina], Coverage= 86.7924, SubjectRange= 1:87, QueryRange= 1:92, EValue= 1.28086E-13. HHPRED= Accession= PF19619.3, Description= DUF6124 ; Family of unknown function (DUF6124), Probability= 84.2. Coverage= 55.6604, SubjectRange= 11:75, QueryRange= 11:70. CDD= . /note=Suggested start is LORF, and is only start that contains all CP. Supported by starterator. /note= /note=All BLAST results with good E-value are for hypothetical proteins, therefore orf called hypothetical protein. CDS 56173 - 56337 /gene="68" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_68" /note=Original Glimmer call @bp 56173 has strength 4.87; Genemark calls start at 56173 /note=SSC: Start = 56173, Stop = 56337. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.881 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 165 bp is not the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . 
HHPRED= . CDD= . /note=Suggested start contains ALMOST all CP, but not all. Some earlier starts have much better scores and contain all CP, but have huge overlap, so decision for start is difficult. There could be a potential slippery sequence ~56090 (black arrow on GeneMark). /note= /note=BLAST and HHPRED yielded no good results, leading to a hypothetical protein call. CDS 56339 - 56674 /gene="69" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_69" /note=Original Glimmer call @bp 56339 has strength 11.0; Genemark calls start at 56339 /note=SSC: Start = 56339, Stop = 56674. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.459 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 336 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 85, Function= function unknown, EValue= 2.0E-9. NCBIBLAST= . HHPRED= Accession= 4EAH_C, Description= Formin-like protein 3; ATP binding, cytoskeleton, formin, FMNL3, actin, PROTEIN BINDING; HET: ATP, ACT; 3.4A {Oryctolagus cuniculus}, Probability= 74.0. Coverage= 54.0541, SubjectRange= 329:389, QueryRange= 329:98. CDD= . /note=Suggested start is LORF, only start that contains (almost) all CP, has best scores. /note= /note=All significant BLAST results are for hypothetical proteins. CDS 56658 - 57965 /gene="70" /product="DnaB-like dsDNA helicase" /function="DnaB-like dsDNA helicase" /locus tag="Francesca_70" /note=Original Glimmer call @bp 56658 has strength 10.18; Genemark calls start at 56658 /note=SSC: Start = 56658, Stop = 57965. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.884 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1308 bp is the longest possible ORF. GAP: -17 bp. ST: SS=NA. F: DnaB-like dsDNA helicase. 
FS: PHDBLAST= PhageName= Trina, ProteinNumber= 97, Function= DNA helicase, EValue= 1.0E-145. NCBIBLAST= PhageName= DNA helicase [Rhodococcus phage Trina] >gb|ASZ74911.1| DNA helicase [Rhodococcus phage Trina], Coverage= 96.7816, SubjectRange= 1:422, QueryRange= 1:421, EValue= 0.0. HHPRED= Accession= 4ZC0_A, Description= Replicative DNA helicase; Helicase ATPase DNA replication, dodecamer, hydrolase; HET: TBR; 6.7A {Helicobacter pylori}, Probability= 100.0. Coverage= 94.4828, SubjectRange= 45:505, QueryRange= 45:413. CDD= Accession= COG0305, Coverage= 68.046, SubjectRange= 122:415, QueryRange= 122:390, EValue= 1.21118E-18. /note=Suggested start is LORF, has best scores, contains all CP. /note= /note=Majority of BLAST hits coded for DNA helicase. However, many hits were for alternatives of helicase such as replicative helicase, dsDNA helicase, DnaB-like dsDNA helicase etc. Function should be further reviewed. CDS 57978 - 58259 /gene="71" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_71" /note=Original Glimmer call @bp 57978 has strength 5.1; Genemark calls start at 57978 /note=SSC: Start = 57978, Stop = 58259. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.207 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 282 bp is the longest possible ORF. GAP: 12 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 98, Function= function unknown, EValue= 7.0E-14. NCBIBLAST= PhageName= hypothetical protein SEA_MOAB_57 [Streptomyces phage Moab], Coverage= 96.7742, SubjectRange= 1:89, QueryRange= 1:90, EValue= 3.50833E-13. HHPRED= Accession= 4F54_A, Description= Uncharacterized protein; PF13590 family protein, DUF4136, Structural Genomics, Joint Center for Structural Genomics, JCSG, Protein Structure Initiative, PSI-BIOLOGY; HET: SO4, MLY, MSE; 1.6A {Bacteroides thetaiotaomicron}, Probability= 85.0. 
Coverage= 55.914, SubjectRange= 30:89, QueryRange= 30:55. CDD= . /note=Suggested Start is most commonly called start for pham, is LORF, start that contains most CP. /note= /note=Almost all significant BLAST hits are for hypothetical proteins, no significant HHPRED hits. CDS 58274 - 59269 /gene="72" /product="DNA primase" /function="DNA primase" /locus tag="Francesca_72" /note=Original Glimmer call @bp 58316 has strength 3.2; Genemark calls start at 58256 /note=SSC: Start = 58274, Stop = 59269. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.95 is not the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 996 bp is not the longest possible ORF. GAP: 14 bp. ST: SS=NA. F: DNA primase. FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 99, Function= DNA primase, EValue= 2.0E-77. NCBIBLAST= PhageName= DNA primase [Rhodococcus phage Weasels2] >gb|AOZ63688.1| DNA primase [Rhodococcus phage Weasels2], Coverage= 99.6979, SubjectRange= 1:327, QueryRange= 1:331, EValue= 2.30775E-94. HHPRED= Accession= 2AU3_A, Description= DNA primase; Zinc Ribbon, TOPRIM, RNA POLYMERASE, DNA REPLICATION, TRANSFERASE; 2.0A {Aquifex aeolicus}, Probability= 100.0. Coverage= 94.8641, SubjectRange= 11:350, QueryRange= 11:329. CDD= Accession= TIGR01391, Coverage= 73.1118, SubjectRange= 37:329, QueryRange= 37:284, EValue= 4.38493E-23. /note=Start changed to include all CP. New start has similar scores to previous start. This new start is also supported by Starterator as Francesca did not have most annotated start. Function called as DNA primase due to strong BLAST and HHPred results in addition to a DNA primase conserved domain. CDS 59375 - 59824 /gene="73" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_73" /note=Original Glimmer call @bp 59375 has strength 1.95; Genemark calls start at 59417 /note=SSC: Start = 59375, Stop = 59824. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 2.193 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 450 bp is not the longest possible ORF. GAP: 105 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= PhageName= Gilson, ProteinNumber= 237, Function= HNH endonuclease, EValue= 5.0E-25. NCBIBLAST= PhageName= HNH endonuclease [Streptomyces phage Gilson] >gb|AZU97282.1| HNH endonuclease [Streptomyces phage Gilson] >gb|WNN94795.1| HNH endonuclease [Streptomyces phage Phredrick], Coverage= 91.2752, SubjectRange= 11:137, QueryRange= 11:142, EValue= 4.30856E-28. HHPRED= Accession= 5X42_A, Description= DotN; Type IV secretion system, Coupling protein complex, Effector translocation, PROTEIN TRANSPORT; 1.8A {Legionella pneumophila}, Probability= 98.2. Coverage= 44.9664, SubjectRange= 23:84, QueryRange= 23:139. CDD= Accession= cd00085, Coverage= 39.5973, SubjectRange= 2:55, QueryRange= 2:133, EValue= 2.35844E-5. /note=Start kept due to good scores and covering all CP. Function called HNH endonuclease due to BLAST and HHPRED results as well as conserved domain evidence. Protein sequence meets criteria by having two H`s within 30 amino acids of each other and having at least one N between them. CDS 59878 - 60687 /gene="74" /product="SSB protein" /function="SSB protein" /locus tag="Francesca_74" /note=Original Glimmer call @bp 59878 has strength 7.25; Genemark calls start at 59878 /note=SSC: Start = 59878, Stop = 60687. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 810 bp is the longest possible ORF. GAP: 53 bp. ST: SS=NA. F: SSB protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 89, Function= ERF family ssDNA binding protein, EValue= 9.0E-39. NCBIBLAST= PhageName= ERF family ssDNA binding protein [Rhodococcus phage NiceHouse], Coverage= 93.3085, SubjectRange= 5:236, QueryRange= 5:258, EValue= 7.57985E-42. 
HHPRED= Accession= 8GME_B, Description= Single-stranded DNA-binding protein; T4, gp32, Dda, complex, DNA BINDING PROTEIN-DNA complex; 4.98A {Tequatrovirus T4}, Probability= 99.9. Coverage= 78.8104, SubjectRange= 8:236, QueryRange= 8:224. CDD= . /note=Start kept due to covering all CP, good scores and being longest ORF. Function called as single strand binding protein (SSB protein) due to strong BLAST and HHpred results. /note=*Approved function list states that proteins listed as ssDNA binding protein should be labeled as SSB protein. CDS 60735 - 61208 /gene="75" /product="endonuclease VII" /function="endonuclease VII" /locus tag="Francesca_75" /note=Genemark calls start at 60735 /note=SSC: Start = 60735, Stop = 61208. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 0.231 is not the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 474 bp is the longest possible ORF. GAP: 47 bp. ST: SS=NA. F: endonuclease VII. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 61, Function= endonuclease VII, EValue= 7.0E-21. NCBIBLAST= PhageName= endonuclease VII [Rhodococcus phage NiceHouse], Coverage= 82.8026, SubjectRange= 2:124, QueryRange= 2:134, EValue= 6.90078E-23. HHPRED= Accession= 3GOX_A, Description= Restriction endonuclease Hpy99I; ENDONUCLEASE-DNA COMPLEX, RESTRICTION ENZYME, HPY99I, PSEUDOPALINDROME, HYDROLASE-DNA COMPLEX; HET: 1PE; 1.5A {Helicobacter pylori}, Probability= 99.7. Coverage= 78.9809, SubjectRange= 70:199, QueryRange= 70:136. CDD= Accession= pfam02945, Coverage= 45.2229, SubjectRange= 10:81, QueryRange= 10:133, EValue= 3.09497E-9. /note=Start kept due to covering all CP, longest ORF and Starterator. Function called as an endonuclease VII due to strong BLAST, HHpred, and conserved domain results. 
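The HNH criterion cited in the gene 73 note (two H`s within 30 amino acids of each other with at least one N between them) can be checked mechanically. The following is a minimal Python sketch. The function name and the example sequence are hypothetical, and "within 30 amino acids" is interpreted here as a position difference of at most 30, which is an assumption rather than a definition from the approved function list.

```python
def meets_hnh_criterion(protein, window=30):
    """Return True if two H residues at most `window` positions apart
    have at least one N between them."""
    seq = protein.upper()
    h_positions = [i for i, aa in enumerate(seq) if aa == "H"]
    for a_idx, a in enumerate(h_positions):
        for b in h_positions[a_idx + 1:]:
            if b - a > window:
                break  # any later H is even farther from `a`
            if "N" in seq[a + 1:b]:
                return True
    return False


# Made-up sequence for illustration only; this is not the Francesca gene 73 product.
example = "MKTHAALNGGKRHLV"
print(meets_hnh_criterion(example))
```

A check like this only screens for the sequence pattern. The function call for gene 73 still rests on the BLAST, HHPRED, and conserved domain evidence recorded in its notes.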
CDS 61162 - 64965 /gene="76" /product="DnaE-like DNA polymerase III (alpha)" /function="DnaE-like DNA polymerase III (alpha)" /locus tag="Francesca_76" /note=Original Glimmer call @bp 61162 has strength 6.09; Genemark calls start at 61162 /note=SSC: Start = 61162, Stop = 64965. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 3804 bp is the longest possible ORF. GAP: -47 bp. ST: SS=NA. F: DnaE-like DNA polymerase III (alpha). FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 90, Function= DnaE-like DNA polymerase III, EValue= 0.0. NCBIBLAST= PhageName= DnaE-like DNA polymerase III [Rhodococcus phage NiceHouse], Coverage= 99.2897, SubjectRange= 1:1250, QueryRange= 1:1261, EValue= 0.0. HHPRED= Accession= 2HPI_A, Description= DNA polymerase III alpha subunit; Pol-beta-like Nucleotidyltransferase fold, TRANSFERASE; 3.0A {Thermus aquaticus}, Probability= 100.0. Coverage= 98.5004, SubjectRange= 2:1108, QueryRange= 2:1249. CDD= Accession= COG0587, Coverage= 83.3465, SubjectRange= 3:876, QueryRange= 3:1060, EValue= 0.0. /note=Start kept due to being longest ORF, covering all CP and having good scores. Function called as DnaE-like DNA polymerase III (alpha) due to very strong BLAST, HHpred and conserved domain results. CDS 64940 - 65167 /gene="77" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_77" /note=Original Glimmer call @bp 64940 has strength 4.35; Genemark calls start at 64940 /note=SSC: Start = 64940, Stop = 65167. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.648 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 228 bp is not the longest possible ORF. GAP: -26 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 91, Function= function unknown, EValue= 0.001. NCBIBLAST= .
HHPRED= Accession= PF07875.16, Description= Coat_F ; Coat F domain, Probability= 59.6. Coverage= 25.3333, SubjectRange= 43:62, QueryRange= 43:54. CDD= . /note=Start kept due to covering all CP and ok scores. Function unknown due to lack of significant BLAST or HHpred results. CDS 65164 - 65496 /gene="78" /product="MazG-like nucleotide pyrophosphohydrolase" /function="MazG-like nucleotide pyrophosphohydrolase" /locus tag="Francesca_78" /note=Original Glimmer call @bp 65164 has strength 5.32; Genemark calls start at 65164 /note=SSC: Start = 65164, Stop = 65496. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 333 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: MazG-like nucleotide pyrophosphohydrolase. FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 104, Function= nucleotide pyrophosphohydrolase, EValue= 3.0E-28. NCBIBLAST= PhageName= MazG-like pyrophosphatase [Rhodococcus phage Weasels2] >gb|AOZ63693.1| nucleotide pyrophosphohydrolase [Rhodococcus phage Weasels2], Coverage= 100.0, SubjectRange= 1:110, QueryRange= 1:110, EValue= 2.25111E-33. HHPRED= Accession= 2Q73_B, Description= Hypothetical protein; MazG, Vibrio, NTP-PPase, HYDROLASE; 1.8A {Vibrio sp. DAT722} SCOP: a.204.1.0, Probability= 99.4. Coverage= 90.0, SubjectRange= 1:97, QueryRange= 1:103. CDD= Accession= cd11541, Coverage= 81.8182, SubjectRange= 1:90, QueryRange= 1:97, EValue= 4.74281E-20. /note=Start is LORF and contains all CP. /note=HHPRED and BLAST call for MazG with high confidence. Approved function list does not have any special criteria. CDS 65493 - 66500 /gene="79" /product="RecA-like DNA recombinase" /function="RecA-like DNA recombinase" /locus tag="Francesca_79" /note=Original Glimmer call @bp 65493 has strength 5.87; Genemark calls start at 65493 /note=SSC: Start = 65493, Stop = 66500. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 2.881 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1008 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: RecA-like DNA recombinase. FS: PHDBLAST= PhageName= Emma1919, ProteinNumber= 66, Function= RecA-like DNA recombinase, EValue= 7.0E-68. NCBIBLAST= PhageName= RecA-like DNA recombinase [Streptomyces phage Gilson] >gb|QQV92432.1| RecA-like DNA recombinase [Streptomyces phage MeganTheeKilla] >gb|QZE11203.1| RecA-like DNA recombinase [Streptomyces phage Forrest] >gb|QZE11430.1| RecA-like DNA recombinase [Streptomyces phage Jada] >gb|URQ04679.1| RecA-like DNA recombinase [Streptomyces phage Emma1919] >gb|AZU97143.1| RecA-like DNA recombinase [Streptomyces phage Gilson], Coverage= 97.6119, SubjectRange= 2:335, QueryRange= 2:329, EValue= 8.36909E-81. HHPRED= Accession= 3HR8_A, Description= Protein recA; Alpha and beta proteins (a/b, a+b), ATP-binding, Cytoplasm, DNA damage, DNA recombination, DNA repair, DNA-binding, Nucleotide-binding; 1.95A {Thermotoga maritima}, Probability= 100.0. Coverage= 98.806, SubjectRange= 10:341, QueryRange= 10:334. CDD= Accession= cd00983, Coverage= 82.3881, SubjectRange= 60:324, QueryRange= 60:328, EValue= 1.54626E-33. /note=Start is LORF and contains all CP. /note=BLAST, HHPRED, and conserved domain database call for RecA with high certainty. /note= /note=Requires meeting of following criteria in HHPRED: /note=https://seaphages.org/forums/topic/5567/ /note=It does! Has: 1) alignment to N-terminal domain; 2) complete ATPase domain with ATP binding site (Walker A motif), Walker B motif, hydrolytic E, and hydrolytic E/D KNK motif, and 3) C-terminal domain alignment CDS 66490 - 66822 /gene="80" /product="Holliday junction resolvase" /function="Holliday junction resolvase" /locus tag="Francesca_80" /note=Original Glimmer call @bp 66490 has strength 7.0; Genemark calls start at 66490 /note=SSC: Start = 66490, Stop = 66822. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 2.573 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 333 bp is not the longest possible ORF. GAP: -11 bp. ST: SS=NA. F: Holliday junction resolvase. FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 106, Function= holliday junction resolvase, EValue= 1.0E-31. NCBIBLAST= PhageName= Holliday junction resolvase [Rhodococcus phage Weasels2] >gb|AOZ63695.1| holliday junction resolvase [Rhodococcus phage Weasels2], Coverage= 99.0909, SubjectRange= 1:109, QueryRange= 1:109, EValue= 1.76008E-38. HHPRED= Accession= 7BGS_A, Description= Holliday junction resolvase; archeal holliday junction resolvase helicase DNA binding enzyme phage 15-6 thermus thermophilus, RECOMBINATION; HET: SO4, MSE; 2.5A {Thermus thermophilus phage 15-6}, Probability= 99.5. Coverage= 90.9091, SubjectRange= 5:146, QueryRange= 5:102. CDD= . /note=Start kept due to good RBS and captures all coding potential /note= /note=Good data from Blast, HHPRED, for function CDS 66819 - 67151 /gene="81" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_81" /note=Original Glimmer call @bp 66819 has strength 9.34; Genemark calls start at 66819 /note=SSC: Start = 66819, Stop = 67151. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.085 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 333 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 107, Function= function unknown, EValue= 5.0E-16. NCBIBLAST= PhageName= hypothetical protein FDI69_gp107 [Rhodococcus phage Trina] >gb|ASZ74921.1| hypothetical protein SEA_TRINA_107 [Rhodococcus phage Trina], Coverage= 96.3636, SubjectRange= 4:108, QueryRange= 4:110, EValue= 2.33876E-17. HHPRED= . CDD= . 
/note=Start kept due to RBS and captures all CP /note=function decided based on strong correlation in Blast results CDS 67206 - 67964 /gene="82" /product="Cas4 exonuclease" /function="Cas4 exonuclease" /locus tag="Francesca_82" /note=Original Glimmer call @bp 67206 has strength 8.28; Genemark calls start at 67206 /note=SSC: Start = 67206, Stop = 67964. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.768 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 759 bp is the longest possible ORF. GAP: 54 bp. ST: SS=NA. F: Cas4 exonuclease. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 96, Function= Cas4 family exonuclease, EValue= 3.0E-29. NCBIBLAST= PhageName= Dna2/Cas4 domain-containing protein [Alphaproteobacteria bacterium], Coverage= 92.4603, SubjectRange= 31:259, QueryRange= 31:248, EValue= 1.52673E-34. HHPRED= Accession= cd09637, Description= Cas4_I-A_I-B_I-C_I-D_II-B; CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA., Probability= 99.8. Coverage= 78.1746, SubjectRange= 1:177, QueryRange= 1:235. CDD= Accession= PHA01622, Coverage= 79.7619, SubjectRange= 12:199, QueryRange= 12:235, EValue= 9.12202E-10. /note=Start kept due to good RBS, captures all CP /note=very good results from blast and HHPRED for function CDS 67961 - 68503 /gene="83" /product="RuvC-like resolvase" /function="RuvC-like resolvase" /locus tag="Francesca_83" /note=Original Glimmer call @bp 67961 has strength 3.55; Genemark calls start at 67961 /note=SSC: Start = 67961, Stop = 68503. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.775 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 543 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: RuvC-like resolvase. 
FS: PHDBLAST= PhageName= Trina, ProteinNumber= 109, Function= RuvC-like resolvase, EValue= 1.0E-39. NCBIBLAST= PhageName= RuvC-like resolvase [Rhodococcus phage Trina] >gb|ASZ74923.1| RuvC-like resolvase [Rhodococcus phage Trina], Coverage= 98.8889, SubjectRange= 2:179, QueryRange= 2:180, EValue= 3.12764E-48. HHPRED= Accession= SCOP_d1hjra_, Description= c.55.3.6 (A:) RuvC resolvase {Escherichia coli [TaxId: 562]} | CLASS: Alpha and beta proteins (a/b), FOLD: Ribonuclease H-like motif, SUPFAM: Ribonuclease H-like, FAM: RuvC resolvase, Probability= 99.9. Coverage= 98.3333, SubjectRange= 1:154, QueryRange= 1:179. CDD= Accession= cd00529, Coverage= 93.8889, SubjectRange= 1:147, QueryRange= 1:171, EValue= 3.81564E-5. /note=Start kept due to good RBS and captures all CP /note=Good blast results along with a few HHPRED results CDS 68500 - 68637 /gene="84" /product="helix-turn-helix DNA binding domain" /function="helix-turn-helix DNA binding domain" /locus tag="Francesca_84" /note=Original Glimmer call @bp 68500 has strength 3.66; Genemark calls start at 68500 /note=SSC: Start = 68500, Stop = 68637. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.944 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 138 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: helix-turn-helix DNA binding domain. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 98, Function= helix-turn-helix DNA-binding domain protein, EValue= 3.0E-6. NCBIBLAST= PhageName= helix-turn-helix DNA-binding domain protein [Rhodococcus phage NiceHouse], Coverage= 95.5556, SubjectRange= 5:47, QueryRange= 5:44, EValue= 3.98904E-7. HHPRED= Accession= PF05344.15, Description= DUF746 ; Domain of Unknown Function (DUF746), Probability= 97.0. Coverage= 77.7778, SubjectRange= 1:36, QueryRange= 1:42. CDD= . 
/note=Start kept although the CP is not strong throughout the gene and also the start isn`t the suggested start /note=Function could be the one shown but the evidence isn`t super strong /note=HHPred alignment confirms HTH CDS 68634 - 69149 /gene="85" /product="thymidylate kinase" /function="thymidylate kinase" /locus tag="Francesca_85" /note=Original Glimmer call @bp 68634 has strength 5.8; Genemark calls start at 68634 /note=SSC: Start = 68634, Stop = 69149. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.085 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 516 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: thymidylate kinase. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 112, Function= HTH DNA binding protein, EValue= 2.0E-27. NCBIBLAST= PhageName= thymidylate kinase [Rhodococcus phage NiceHouse], Coverage= 95.3216, SubjectRange= 3:168, QueryRange= 3:166, EValue= 2.94183E-26. HHPRED= Accession= 4YER_A, Description= ABC transporter ATP-binding protein; PF00005 family protein, P-loop containing nucleoside triphosphate hydrolases fold, Structural Genomics, Joint Center for Structural Genomics; HET: MSE, ADP; 2.35A {Thermotoga maritima}, Probability= 99.5. Coverage= 91.8129, SubjectRange= 31:206, QueryRange= 31:158. CDD= . /note=Suggested start is LORF, called 100% of time when present, only start with all CP, has good scores. /note= /note=Most good BLAST and HHPRED hits are for thymidylate kinase, with multiple Q1:T1 BLAST hits. A few hits were for HTH binding protein, which should be further reviewed. CDS 69130 - 69282 /gene="86" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_86" /note=Original Glimmer call @bp 69130 has strength 5.06; Genemark calls start at 69130 /note=SSC: Start = 69130, Stop = 69282. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.934 is not the highest start score. 
SCS: Start is called by Glimmer and is called by Genemark. LO: 153 bp is the longest possible ORF. GAP: -20 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 100, Function= function unknown, EValue= 1.0E-11. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_100 [Rhodococcus phage NiceHouse], Coverage= 90.0, SubjectRange= 1:45, QueryRange= 1:45, EValue= 2.57681E-13. HHPRED= Accession= PF06676.15, Description= DUF1178 ; Protein of unknown function (DUF1178), Probability= 99.1. Coverage= 80.0, SubjectRange= 1:55, QueryRange= 1:41. CDD= . /note=Suggested start is LORF, only start that contains all CP, has best final score. /note= /note=All significant BLAST hits were for hypothetical proteins. CDS 69282 - 69584 /gene="87" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_87" /note=Original Glimmer call @bp 69282 has strength 6.31; Genemark calls start at 69282 /note=SSC: Start = 69282, Stop = 69584. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.317 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 303 bp is not the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 101, Function= function unknown, EValue= 7.0E-22. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_101 [Rhodococcus phage NiceHouse], Coverage= 90.0, SubjectRange= 2:91, QueryRange= 2:94, EValue= 1.43907E-24. HHPRED= Accession= 6A7K_B, Description= Tlr0636 protein; NADH dehydrogenase-like complex, NDH-1, cyclic electron flow (CEF), Ferredoxin, ELECTRON TRANSPORT; 1.9A {Thermosynechococcus elongatus (strain BP-1)}, Probability= 90.7. Coverage= 62.0, SubjectRange= 9:72, QueryRange= 9:99. CDD= . /note=Suggested start contains all CP, has best scores, Has lowest gap/overlap, called 100% of time when present /note= /note=All significant BLAST hits are for hypothetical proteins. 
CDS 69577 - 70095 /gene="88" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_88" /note=Original Glimmer call @bp 69577 has strength 12.19; Genemark calls start at 69577 /note=SSC: Start = 69577, Stop = 70095. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 519 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 115, Function= function unknown, EValue= 2.0E-24. NCBIBLAST= PhageName= hypothetical protein FDI69_gp115 [Rhodococcus phage Trina] >gb|ASZ74929.1| hypothetical protein SEA_TRINA_115 [Rhodococcus phage Trina], Coverage= 96.5116, SubjectRange= 3:168, QueryRange= 3:170, EValue= 2.09986E-25. HHPRED= . CDD= . /note=Suggested start is LORF, has best scores, only start that contains all CP. /note= /note=All significant BLAST hits are for hypothetical proteins, no significant HHPRED hits. CDS 70092 - 71906 /gene="89" /product="terminase" /function="terminase" /locus tag="Francesca_89" /note=Original Glimmer call @bp 70092 has strength 5.28; Genemark calls start at 70092 /note=SSC: Start = 70092, Stop = 71906. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.992 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1815 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: terminase. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 103, Function= terminase large subunit, EValue= 1.0E-163. NCBIBLAST= PhageName= terminase [Rhodococcus phage Trina] >gb|ASZ74930.1| terminase [Rhodococcus phage Trina], Coverage= 99.6689, SubjectRange= 4:590, QueryRange= 4:604, EValue= 0.0. HHPRED= Accession= 6Z6D_A, Description= Terminase large subunit; genome packaging, bacteriophage, ATPase, nuclease, VIRAL PROTEIN; HET: BR; 2.2A {Enterobacteria phage HK97}, Probability= 100.0. 
Coverage= 88.245, SubjectRange= 15:498, QueryRange= 15:553. CDD= . /note=Suggested start is LORF, only start containing all CP, has best scores. /note= /note=Majority of significant BLAST hits are for terminase, large subunit, with many Q1:T1 hits. If a small subunit gene is not found within this genome, then the function should be changed to Terminase. If a small subunit is found, then gene should be called terminase, large subunit. CDS 71906 - 72463 /gene="90" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_90" /note=Original Glimmer call @bp 71906 has strength 3.33; Genemark calls start at 71906 /note=SSC: Start = 71906, Stop = 72463. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 558 bp is not the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= . NCBIBLAST= PhageName= HNH endonuclease [Rhodococcus phage NiceHouse], Coverage= 87.5676, SubjectRange= 15:173, QueryRange= 15:170, EValue= 2.26605E-63. HHPRED= Accession= 5H0M_A, Description= HNH endonuclease; Thermophilic bacteriophage, HNH Endonuclease, DNA nicking, HYDROLASE; 1.52A {Geobacillus virus E2}, Probability= 97.7. Coverage= 36.7568, SubjectRange= 59:125, QueryRange= 59:78. CDD= Accession= pfam01844, Coverage= 27.5676, SubjectRange= 1:47, QueryRange= 1:77, EValue= 4.07882E-7. /note=Start contains all CP. /note=HHPRED, BLAST, and conserved domain database all call for HNH endonuclease with high certainty. Meets approved function criteria; protein sequence contains N surrounded by H`s within a 30 amino acid length. CDS 72465 - 72917 /gene="91" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_91" /note=Original Glimmer call @bp 72465 has strength 10.03; Genemark calls start at 72465 /note=SSC: Start = 72465, Stop = 72917. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 453 bp is not the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_105 [Rhodococcus phage NiceHouse], Coverage= 90.0, SubjectRange= 8:142, QueryRange= 8:143, EValue= 2.86548E-11. HHPRED= . CDD= . /note=Start contains all CP, good Glimmer score. /note=BLAST has hits for hypothetical protein and HHPRED is insignificant. tRNA 72992 - 73065 /gene="92" /product="tRNA-Glu(ttc)" /locus tag="FRANCESCA_92" /note=tRNA-Glu(ttc) CDS 73066 - 73506 /gene="93" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_93" /note=Original Glimmer call @bp 73006 has strength 3.55; Genemark calls start at 73066 /note=SSC: Start = 73066, Stop = 73506. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.193 is not the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 441 bp is not the longest possible ORF. GAP: 148 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein GCM10010472_39880 [Pseudonocardia halophobica] >dbj|GLL14256.1| hypothetical protein GCM10017577_54030 [Pseudonocardia halophobica], Coverage= 97.2603, SubjectRange= 51:199, QueryRange= 51:143, EValue= 8.23774E-45. HHPRED= Accession= 6GHC_A, Description= 5-methylcytosine-specific restriction enzyme A; HNH ENDONUCLEASE, MODIFICATION DEPENDENT RESTRICTION, 5-METHYLCYTOSINE, 5MC, 5-HYDROXYMETHYLCYTOSINE, 5HMC, BBA-ME NUCLEASE, ScoMcrA, HYDROLASE; 2.85A {Escherichia coli (strain K12)}, Probability= 98.1. Coverage= 43.8356, SubjectRange= 202:269, QueryRange= 202:141. CDD= Accession= cd00085, Coverage= 29.4521, SubjectRange= 14:55, QueryRange= 14:136, EValue= 2.10586E-5. /note=Start changed from uncertain Glimmer call (low Glimmer score), chosen start contains CP. 
/note=HHPRED and BLAST have hits that call for HNH endonuclease with high query cover. Phage frequency also reveals HNH endonuclease. Meets criteria; amino acid sequence contains N`s found between H`s within a 30 sequence length. tRNA 73507 - 73578 /gene="94" /product="tRNA-Gly(gcc)" /locus tag="FRANCESCA_94" /note=tRNA-Gly(gcc) tRNA 73758 - 73831 /gene="95" /product="tRNA-Glu(ctc)" /locus tag="FRANCESCA_95" /note=tRNA-Glu(ctc) tRNA 73896 - 73968 /gene="96" /product="tRNA-Pro(agg)" /locus tag="FRANCESCA_96" /note=tRNA-Pro(agg) tRNA 73981 - 74053 /gene="97" /product="tRNA-Gly(tcc)" /locus tag="FRANCESCA_97" /note=tRNA-Gly(tcc) CDS 74088 - 74513 /gene="98" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_98" /note=Original Glimmer call @bp 74088 has strength 7.9; Genemark calls start at 74088 /note=SSC: Start = 74088, Stop = 74513. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.798 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 426 bp is not the longest possible ORF. GAP: 581 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein [Mycolicibacterium fortuitum] >gb|MCV7141669.1| hypothetical protein [Mycolicibacterium fortuitum] >gb|MDV7195623.1| hypothetical protein [Mycolicibacterium fortuitum] >gb|MDV7209272.1| hypothetical protein [Mycolicibacterium fortuitum] >gb|MDV7231141.1| hypothetical protein [Mycolicibacterium fortuitum] >gb|MDV7262718.1| hypothetical protein [Mycolicibacterium fortuitum], Coverage= 64.539, SubjectRange= 15:103, QueryRange= 15:101, EValue= 3.20791E-6. HHPRED= . CDD= . /note=Start contains all CP. /note=BLAST top hit is hypothetical protein, and HHPRED is inconclusive. 
CDS 74510 - 74677 /gene="99" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_99" /note=Original Glimmer call @bp 74510 has strength 2.52; Genemark calls start at 74510 /note=SSC: Start = 74510, Stop = 74677. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.945 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 168 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start kept: best RBS, all CP captured, suggested start /note=No viable functions from Blast, HHPRED, or TMHMM CDS 74670 - 74987 /gene="100" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_100" /note=Original Glimmer call @bp 74670 has strength 9.24; Genemark calls start at 74670 /note=SSC: Start = 74670, Stop = 74987. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.867 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 318 bp is not the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Peregrin, ProteinNumber= 174, Function= function unknown, EValue= 2.0E-15. NCBIBLAST= PhageName= hypothetical protein GCM10025732_47720 [Glycomyces mayteni], Coverage= 97.1429, SubjectRange= 8:113, QueryRange= 8:103, EValue= 7.294E-19. HHPRED= . CDD= . /note=Start kept: good RBS, captures all CP, not suggested start though /note=Solid Blast results for hypothetical protein, no good HHPRED or TMHMM results CDS 75063 - 75323 /gene="101" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_101" /note=Original Glimmer call @bp 75063 has strength 5.03; Genemark calls start at 75063 /note=SSC: Start = 75063, Stop = 75323. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.014 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 261 bp is the longest possible ORF. GAP: 75 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein [Candidatus Solirubrobacter pratensis], Coverage= 56.9767, SubjectRange= 28:76, QueryRange= 28:56, EValue= 3.08653E-9. HHPRED= . CDD= . /note=Start kept: good RBS, captures all CP and suggested start /note=No viable function results from Blast, HHPRED or TMHMM tRNA 75378 - 75451 /gene="102" /product="tRNA-Asn(gtt)" /locus tag="FRANCESCA_102" /note=tRNA-Asn(gtt) tRNA 75595 - 75678 /gene="103" /product="tRNA-Tyr(gta)" /locus tag="FRANCESCA_103" /note=tRNA-Tyr(gta) tRNA 75905 - 75976 /gene="104" /product="tRNA-Trp(cca)" /locus tag="FRANCESCA_104" /note=tRNA-Trp(cca) tRNA 75988 - 76059 /gene="105" /product="tRNA-Thr(cgt)" /locus tag="FRANCESCA_105" /note=tRNA-Thr(cgt) tRNA 76098 - 76173 /gene="106" /product="tRNA-Leu(tag)" /locus tag="FRANCESCA_106" /note=tRNA-Leu(tag) CDS 76201 - 76398 /gene="107" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_107" /note=Original Glimmer call @bp 76201 has strength 11.49; Genemark calls start at 76201 /note=SSC: Start = 76201, Stop = 76398. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 198 bp is the longest possible ORF. GAP: 877 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . 
/note=Start kept: good RBS, captures all CP, suggested start /note=No good results from Blast, HHPRED, or TMHMM tRNA 76407 - 76478 /gene="108" /product="tRNA-Met(cat)" /locus tag="FRANCESCA_108" /note=tRNA-Met(cat) tRNA 76666 - 76739 /gene="109" /product="tRNA-Lys(ctt)" /locus tag="FRANCESCA_109" /note=tRNA-Lys(ctt) tRNA 76905 - 76979 /gene="110" /product="tRNA-Lys(ttt)" /locus tag="FRANCESCA_110" /note=tRNA-Lys(ttt) tRNA 77102 - 77174 /gene="111" /product="tRNA-Arg(cct)" /locus tag="FRANCESCA_111" /note=tRNA-Arg(cct) CDS 77205 - 77453 /gene="112" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_112" /note=Original Glimmer call @bp 77205 has strength 2.15; Genemark calls start at 77205 /note=SSC: Start = 77205, Stop = 77453. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 249 bp is the longest possible ORF. GAP: 806 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start kept: good RBS, captures all CP, suggested start /note=No good Blast or HHPRED results /note=Very good TMHMM results so membrane protein called CDS 77453 - 77692 /gene="113" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_113" /note=Original Glimmer call @bp 77453 has strength 3.3; Genemark calls start at 77453 /note=SSC: Start = 77453, Stop = 77692. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.628 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 240 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . 
/note=Start kept: good RBS, all CP covered, however not suggested start /note=No good Blast, HHPRED or TMHMM results available CDS 77689 - 77946 /gene="114" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_114" /note=Original Glimmer call @bp 77689 has strength 4.83; Genemark calls start at 77689 /note=SSC: Start = 77689, Stop = 77946. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 258 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 4WSF_A, Description= Serine/threonine-protein phosphatase 4 regulatory subunit 3; phosphatase EVH1 domain, signaling protein; 1.501A {Drosophila melanogaster}, Probability= 45.1. Coverage= 45.8824, SubjectRange= 18:56, QueryRange= 18:57. CDD= . /note=Start: 77689 End: 77946. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. tRNA 77967 - 78041 /gene="115" /product="tRNA-Gln(ctg)" /locus tag="FRANCESCA_115" /note=tRNA-Gln(ctg) tRNA 78115 - 78189 /gene="116" /product="tRNA-His(gtg)" /locus tag="FRANCESCA_116" /note=tRNA-His(gtg) CDS 78190 - 78570 /gene="117" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_117" /note=Original Glimmer call @bp 78190 has strength 7.58; Genemark calls start at 78190 /note=SSC: Start = 78190, Stop = 78570. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.81 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 381 bp is the longest possible ORF. GAP: 243 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 78190 End: 78570. 
The CP is good, longest ORF, almost best z score and final score but it lengthens the gene significantly. No hits on PhagesDB or NCBI blast. No HHPred either. CDS 78563 - 78970 /gene="118" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_118" /note=Original Glimmer call @bp 78563 has strength 6.89; Genemark calls start at 78563 /note=SSC: Start = 78563, Stop = 78970. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.194 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 408 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 78563 End: 78970. The CP is good, longest ORF, best z score and best final score. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. CDS 79064 - 79372 /gene="119" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_119" /note=Original Glimmer call @bp 79064 has strength 6.21; Genemark calls start at 79064 /note=SSC: Start = 79064, Stop = 79372. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 309 bp is the longest possible ORF. GAP: 93 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 79064 End: 79372. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to major capsid hexamer protein, but nothing else points to it. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. 
tRNA 79380 - 79469 /gene="120" /product="tRNA-Ser(gct)" /locus tag="FRANCESCA_120" /note=tRNA-Ser(gct) CDS 79425 - 79727 /gene="121" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_121" /note=Original Glimmer call @bp 79470 has strength 7.09; Genemark calls start at 79470 /note=SSC: Start = 79425, Stop = 79727. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.945 is the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 303 bp is the longest possible ORF. GAP: 52 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 79425 End: 79727. The CP is good, longest ORF, best z score and best final score. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. Changed the start because of bad RBS scores and the CP still covers it. CDS 79801 - 81792 /gene="122" /product="rIIA-like protein" /function="rIIA-like protein" /locus tag="Francesca_122" /note=Original Glimmer call @bp 79801 has strength 5.55; Genemark calls start at 79801 /note=SSC: Start = 79801, Stop = 81792. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1992 bp is the longest possible ORF. GAP: 73 bp. ST: SS=NA. F: rIIA-like protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 156, Function= rIIA-like protein, EValue= 2.0E-75. NCBIBLAST= PhageName= RIIA lysis inhibitor [Rhodococcus phage Trina] >gb|ASZ74953.1| rIIA-like protein [Rhodococcus phage Trina], Coverage= 98.9442, SubjectRange= 1:637, QueryRange= 1:656, EValue= 8.73206E-82. HHPRED= . CDD= Accession= COG1389, Coverage= 20.0603, SubjectRange= 40:184, QueryRange= 40:177, EValue= 2.14741E-4. /note=Start: 79801 End: 81792. The CP is good, longest ORF, best z score best final score. PhagesDB function frequency points to rIIA-like protein. 
There are some hits on Phages DB and NCBI blast but there are also some other hits calling it something else as well. Looking into the approved function list it said that rIIA-like protein can be found using UniProt-SwissProt-viral database on HHPred. When using HHPred and only using the UniProt-SwissProt-viral database we got a hit with 100% probability and a good e-value for the rIIA-like protein. The rIIA-like protein is also located before the rIIB-like protein which is what the next gene is. CDS 81793 - 82872 /gene="123" /product="rIIB-like protein" /function="rIIB-like protein" /locus tag="Francesca_123" /note=Original Glimmer call @bp 81793 has strength 6.09; Genemark calls start at 81793 /note=SSC: Start = 81793, Stop = 82872. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1080 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: rIIB-like protein. FS: PHDBLAST= PhageName= Peregrin, ProteinNumber= 117, Function= rIIB-like protein, EValue= 5.0E-53. NCBIBLAST= PhageName= rIIB-like protein [Rhodococcus phage Peregrin], Coverage= 99.4429, SubjectRange= 1:360, QueryRange= 1:357, EValue= 9.97898E-59. HHPRED= . CDD= . /note=Hits on Phagesdb Function Frequency, PhagesDB Blast, and NCBI blast point to rIIB-like protein. Forward Start: 81793, End: 82872. Length 1080. Best Z and Final score, longest ORF. Good CP. CDS 82887 - 83045 /gene="124" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_124" /note=Original Glimmer call @bp 82887 has strength 5.17; Genemark calls start at 82887 /note=SSC: Start = 82887, Stop = 83045. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 159 bp is the longest possible ORF. GAP: 14 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 122, Function= function unknown, EValue= 4.0E-8. NCBIBLAST= PhageName= hypothetical protein FDH04_gp122 [Rhodococcus phage Weasels2] >gb|AOZ63711.1| hypothetical protein SEA_WEASELS2_122 [Rhodococcus phage Weasels2], Coverage= 90.3846, SubjectRange= 2:48, QueryRange= 2:49, EValue= 3.247E-9. HHPRED= . CDD= . /note=Start: 82887 End: 83045. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB had hits to all phages in cluster CB, but function is unknown. So did HHPred. NCBI blast had hits to hypothetical proteins. No conserved domains. CDS 83133 - 83294 /gene="125" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_125" /note=Original Glimmer call @bp 83133 has strength 6.75; Genemark calls start at 83133 /note=SSC: Start = 83133, Stop = 83294. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 162 bp is the longest possible ORF. GAP: 87 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 83133 End: 83294. The CP is good, longest ORF, second best z score and best final score. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. tRNA 83348 - 83419 /gene="126" /product="tRNA-Thr(tgt)" /locus tag="FRANCESCA_126" /note=tRNA-Thr(tgt) tRNA 83516 - 83588 /gene="127" /product="tRNA-Thr(ggt)" /locus tag="FRANCESCA_127" /note=tRNA-Thr(ggt) CDS 83609 - 84265 /gene="128" /product="DNA methyltransferase" /function="DNA methyltransferase" /locus tag="Francesca_128" /note=Original Glimmer call @bp 83609 has strength 7.14; Genemark calls start at 83609 /note=SSC: Start = 83609, Stop = 84265. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.264 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 657 bp is not the longest possible ORF. GAP: 314 bp. ST: SS=NA. F: DNA methyltransferase. FS: PHDBLAST= PhageName= Grayson, ProteinNumber= 224, Function= DNA methyltransferase, EValue= 3.0E-82. NCBIBLAST= PhageName= DNA methyltransferase [Rhodococcus phage Grayson], Coverage= 97.2477, SubjectRange= 2:213, QueryRange= 2:216, EValue= 8.81451E-87. HHPRED= Accession= 4U7T_C, Description= DNA (cytosine-5)-methyltransferase 3A; DNA methyltransferase, active form, TRANSFERASE-TRANSFERASE REGULATOR complex; HET: SAH; 2.9A {Homo sapiens}, Probability= 99.2. Coverage= 98.1651, SubjectRange= 166:439, QueryRange= 166:217. CDD= Accession= COG0270, Coverage= 49.0826, SubjectRange= 1:118, QueryRange= 1:107, EValue= 6.3423E-14. /note=Hits by Phagesdb function frequency, PhagesDB BLAST, HHPred and NCBI BLAST all point to DNA methyltransferase. Start: 83609, End: 84265. High Z score, middle-high final score. Good CP. tRNA 84274 - 84347 /gene="129" /product="tRNA-Gln(ttg)" /locus tag="FRANCESCA_129" /note=tRNA-Gln(ttg) CDS 84371 - 84532 /gene="130" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_130" /note=Original Glimmer call @bp 84371 has strength 9.42; Genemark calls start at 84371 /note=SSC: Start = 84371, Stop = 84532. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 162 bp is the longest possible ORF. GAP: 105 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 84371 End: 84532. The CP is good, longest ORF, best z score and best final score. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. 
tRNA 84634 - 84706 /gene="131" /product="tRNA-Cys(gca)" /locus tag="FRANCESCA_131" /note=tRNA-Cys(gca) CDS 84718 - 84903 /gene="132" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_132" /note=Original Glimmer call @bp 84718 has strength 9.73; Genemark calls start at 84718 /note=SSC: Start = 84718, Stop = 84903. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 186 bp is not the longest possible ORF. GAP: 185 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein PBI_PEREGRIN_124 [Rhodococcus phage Peregrin], Coverage= 75.4098, SubjectRange= 6:51, QueryRange= 6:55, EValue= 1.55068E-5. HHPRED= . CDD= . /note=Start: 84718 End: 84903. The CP is good, not longest ORF, best z score and best final score. Same start was kept in Dorin. CP still covers it all. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. CDS 84903 - 85178 /gene="133" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_133" /note=Original Glimmer call @bp 84903 has strength 1.15; Genemark calls start at 84903 /note=SSC: Start = 84903, Stop = 85178. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.469 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 276 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 84903 End: 85178. The CP is good, longest ORF, third best z score and best final score. Start lengthens the gene a lot so no change in start. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. 
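The LO length and GAP values in the records above can be reproduced from the coordinates alone. The sketch below is an assumption inferred from the numbers in this file, not taken from the annotation software's documentation: length is taken as stop minus start plus one, and GAP appears to be measured from the stop of the previous CDS (tRNA features in between seem to be skipped). The function names are hypothetical.

```python
def orf_length(start: int, stop: int) -> int:
    """Length in bp of a forward feature with inclusive coordinates."""
    return stop - start + 1


def gap_to_previous_cds(prev_cds_stop: int, start: int) -> int:
    """Bases between the previous CDS stop and this start.

    A negative value means the two genes overlap.
    Assumption: the previous *CDS* is used, ignoring tRNA features.
    """
    return start - prev_cds_stop - 1


# Gene 133 (84903 - 85178), preceded by gene 132 ending at 84903
print(orf_length(84903, 85178))           # LO reported as 276 bp
print(gap_to_previous_cds(84903, 84903))  # GAP reported as -1 bp

# Gene 132 (start 84718); previous CDS is gene 130 ending at 84532,
# with tRNA 131 lying in between
print(gap_to_previous_cds(84532, 84718))  # GAP reported as 185 bp
```

A GAP of -1 or -4 corresponds to an overlap of 1 or 4 bases between adjacent genes, which is why those values appear without further comment in several notes.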
tRNA 85188 - 85262 /gene="134" /product="tRNA-Ile(gat)" /locus tag="FRANCESCA_134" /note=tRNA-Ile(gat) CDS 85290 - 85553 /gene="135" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_135" /note=Original Glimmer call @bp 85290 has strength 11.77; Genemark calls start at 85290 /note=SSC: Start = 85290, Stop = 85553. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 264 bp is not the longest possible ORF. GAP: 111 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 85290 End: 85553. The CP is good, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hit to Huwbert, but it is function unknown. No hits on HHPred and NCBI blast. No conserved domains. tRNA 85563 - 85661 /gene="136" /product="tRNA-Ser(tga)" /locus tag="FRANCESCA_136" /note=tRNA-Ser(tga) tRNA 85779 - 85851 /gene="137" /product="tRNA-Val(tac)" /locus tag="FRANCESCA_137" /note=tRNA-Val(tac) tRNA 85853 - 85926 /gene="138" /product="tRNA-Met(cat)" /locus tag="FRANCESCA_138" /note=tRNA-Met(cat) CDS 85964 - 86212 /gene="139" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_139" /note=Original Glimmer call @bp 85964 has strength 5.98; Genemark calls start at 85964 /note=SSC: Start = 85964, Stop = 86212. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.264 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 249 bp is not the longest possible ORF. GAP: 410 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 85964 End: 86212. The CP is good, longest ORF, second best z score and second best final score. Start lengthens the gene a lot so no change in start. No hits on PhagesDB and NCBI Blast and nothing on HHPred either. 
CDS 86213 - 86776 /gene="140" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_140" /note=Original Glimmer call @bp 86213 has strength 7.15; Genemark calls start at 86213 /note=SSC: Start = 86213, Stop = 86776. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.934 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 564 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= PhageName= Finkle, ProteinNumber= 27, Function= function unknown, EValue= 5.0E-22. NCBIBLAST= PhageName= hypothetical protein GCM10010170_053480 [Dactylosporangium salmoneum], Coverage= 93.0481, SubjectRange= 115:292, QueryRange= 115:186, EValue= 1.03896E-41. HHPRED= Accession= PF14410.10, Description= GH-E ; HNH/ENDO VII superfamily nuclease with conserved GHE residues, Probability= 94.5. Coverage= 35.2941, SubjectRange= 1:68, QueryRange= 1:141. CDD= Accession= pfam07510, Coverage= 51.8717, SubjectRange= 2:102, QueryRange= 2:149, EValue= 9.4304E-8. /note=Start: 86213 End: 86776. The CP is good, longest ORF, best z score and best final score. PhagesDB and NCBI blast hit for unknown function but NCBI also hit for HNH endonuclease. The conserved domain hit for unknown function. Looking into the approved function list and the forums linked in the list under the HNH endonuclease I was able to find out that it was not possible to be HNH endonuclease. /note=Actually there is an HNN (HNN and HNK can be an alternative to HNH) CDS 86785 - 87210 /gene="141" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_141" /note=Original Glimmer call @bp 86785 has strength 9.33; Genemark calls start at 86785 /note=SSC: Start = 86785, Stop = 87210. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 426 bp is not the longest possible ORF. GAP: 8 bp. 
ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 86785 End: 87210. The CP is good, not the longest ORF, best z score and best final score. The first start barely lengthens the gene and causes a small overlap with worse RBS scores. No hits on NCBI or PhagesDB blast. No hits on HHPred. CDS 87207 - 87497 /gene="142" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_142" /note=Original Glimmer call @bp 87207 has strength 1.41; Genemark calls start at 87207 /note=SSC: Start = 87207, Stop = 87497. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.487 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 291 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein KNU44_gp112 [Mycobacterium phage CicholasNage] >gb|QBP29888.1| hypothetical protein SEQ_HALENA_114 [Mycobacterium phage Halena] >gb|QDK04108.1| hypothetical protein SEA_AVADAKEDAVRA_114 [Mycobacterium phage AvadaKedavra] >gb|QGJ93126.1| hypothetical protein SEA_ZARIA_116 [Mycobacterium phage Zaria] >gb|QWT30632.1| hypothetical protein SEA_ROSE5_115 [Mycobacterium phage Rose5] >gb|UEM46389.1| hypothetical protein SEA_ENCELADUS_111 [Mycobacterium phage Enceladus] >gb|WMI34701.1| hypothetical protein SEA_CALM_119 [Mycobacterium phage Calm], Coverage= 57.2917, SubjectRange= 1:55, QueryRange= 1:55, EValue= 8.39315E-5. HHPRED= . CDD= . /note=Start: 87207 End: 87497. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. CDS 87651 - 87872 /gene="143" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_143" /note=Original Glimmer call @bp 87651 has strength 9.39; Genemark calls start at 87651 /note=SSC: Start = 87651, Stop = 87872. 
(Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 222 bp is the longest possible ORF. GAP: 153 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 87651 End: 87872. The CP is good, not the longest ORF, not best z score or final score but it gets lengthened a lot by keeping start. No hits on NCBI or PhagesDB blast. No hits on HHPred. CDS 87814 - 87981 /gene="144" /product="John Cena" /function="John Cena" /locus tag="Francesca_144" /note=Original Glimmer call @bp 87856 has strength 0.13 /note=SSC: Start = 87814, Stop = 87981. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.529 is not the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 168 bp is not the longest possible ORF. GAP: -59 bp. ST: SS=NA. F: John Cena. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . CDS 87869 - 87976 /gene="145" /product="https://www.youtube.com/watch?v=xvFZjo5PgG0" /function="https://www.youtube.com/watch?v=xvFZjo5PgG0" /locus tag="Francesca_145" /note=Genemark calls start at 87869 /note=SSC: Start = 87869, Stop = 87976. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.705 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 108 bp is the longest possible ORF. GAP: -113 bp. ST: SS=NA. F: https://www.youtube.com/watch?v=xvFZjo5PgG0. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . CDS 88134 - 88556 /gene="146" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_146" /note=Original Glimmer call @bp 88134 has strength 7.66; Genemark calls start at 88134 /note=SSC: Start = 88134, Stop = 88556. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 423 bp is the longest possible ORF. GAP: 157 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= excalibur calcium-binding domain-containing protein [Streptomyces sp. DSM 42041] >gb|MDT0377277.1| excalibur calcium-binding domain-containing protein [Streptomyces sp. DSM 42041], Coverage= 92.1429, SubjectRange= 27:143, QueryRange= 27:137, EValue= 1.13362E-8. HHPRED= Accession= 5J8T_A, Description= Choline binding protein; Excalibur, Choline-binding Protein L, Pneumococcal Adhesion, hydrolase; HET: CA; NMR {Streptococcus pneumoniae}, Probability= 95.2. Coverage= 28.5714, SubjectRange= 6:47, QueryRange= 6:140. CDD= Accession= pfam05901, Coverage= 22.8571, SubjectRange= 4:36, QueryRange= 4:137, EValue= 7.68232E-11. /note=Longest ORF, good coding potential, no significant BLAST hits, no transmembrane domains. Hits on HHPred and conserved domains for Excalibur calcium binding domains. /note=excalibur calcium binding domain noted but no known function tRNA 88599 - 88672 /gene="147" /product="tRNA-Ala(tgc)" /locus tag="FRANCESCA_147" /note=tRNA-Ala(tgc) tRNA 88794 - 88866 /gene="148" /product="tRNA-Asp(gtc)" /locus tag="FRANCESCA_148" /note=tRNA-Asp(gtc) CDS 88894 - 89139 /gene="149" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_149" /note=Original Glimmer call @bp 88894 has strength 5.47; Genemark calls start at 88894 /note=SSC: Start = 88894, Stop = 89139. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 246 bp is the longest possible ORF. GAP: 337 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Grayson, ProteinNumber= 127, Function= function unknown, EValue= 6.0E-8. NCBIBLAST= PhageName= hypothetical protein PBI_GRAYSON_127 [Rhodococcus phage Grayson], Coverage= 97.5309, SubjectRange= 1:79, QueryRange= 1:79, EValue= 8.90682E-8. 
HHPRED= Accession= 4F98_A, Description= hypothetical protein; PF10976 family protein, DUF2790, Structural Genomics, Joint Center for Structural Genomics, JCSG, Protein Structure Initiative, PSI-BIOLOGY; HET: MSE; 1.26A {Pseudomonas aeruginosa}, Probability= 90.9. Coverage= 61.7284, SubjectRange= 17:59, QueryRange= 17:68. CDD= . /note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS 89140 - 89310 /gene="150" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_150" /note=Original Glimmer call @bp 89140 has strength 8.02; Genemark calls start at 89140 /note=SSC: Start = 89140, Stop = 89310. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.456 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 171 bp is the longest possible ORF. GAP: 0 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 146, Function= function unknown, EValue= 3.0E-27. NCBIBLAST= . HHPRED= Accession= PF18536.5, Description= DUF5623 ; Domain of unknown function (DUF5623), Probability= 72.7. Coverage= 60.7143, SubjectRange= 36:71, QueryRange= 36:40. CDD= . /note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS 89303 - 89536 /gene="151" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_151" /note=Original Glimmer call @bp 89303 has strength 8.31; Genemark calls start at 89303 /note=SSC: Start = 89303, Stop = 89536. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 234 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 147, Function= function unknown, EValue= 6.0E-35. 
NCBIBLAST= PhageName= hypothetical protein [Nocardia cyriacigeorgica], Coverage= 89.6104, SubjectRange= 13:80, QueryRange= 13:74, EValue= 1.48267E-6. HHPRED= Accession= 4F98_A, Description= hypothetical protein; PF10976 family protein, DUF2790, Structural Genomics, Joint Center for Structural Genomics, JCSG, Protein Structure Initiative, PSI-BIOLOGY; HET: MSE; 1.26A {Pseudomonas aeruginosa}, Probability= 90.0. Coverage= 58.4416, SubjectRange= 22:59, QueryRange= 22:62. CDD= . /note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS 89600 - 89776 /gene="152" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_152" /note=Original Glimmer call @bp 89600 has strength 4.62; Genemark calls start at 89600 /note=SSC: Start = 89600, Stop = 89776. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.537 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 177 bp is the longest possible ORF. GAP: 63 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 151, Function= function unknown, EValue= 1.0E-29. NCBIBLAST= PhageName= hypothetical protein PP304_gp197 [Gordonia phage Phendrix] >gb|QDK02673.1| hypothetical protein SEA_PHENDRIX_148 [Gordonia phage Phendrix], Coverage= 100.0, SubjectRange= 1:72, QueryRange= 1:58, EValue= 0.00255406. HHPRED= Accession= SCOP_d2h88b2, Description= a.1.2.1 (B:115-246) Succinate dehydogenase {Chicken (Gallus gallus) [TaxId: 9031]} | CLASS: All alpha proteins, FOLD: Globin-like, SUPFAM: alpha-helical ferredoxin, FAM: Fumarate reductase/Succinate dehydogenase iron-sulfur protein, C-terminal domain, Probability= 50.4. Coverage= 44.8276, SubjectRange= 105:132, QueryRange= 105:56. CDD= . 
/note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains tRNA 89867 - 89938 /gene="153" /product="tRNA-Pro(tgg)" /locus tag="FRANCESCA_153" /note=tRNA-Pro(tgg) CDS 89986 - 90429 /gene="154" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_154" /note=Original Glimmer call @bp 89986 has strength 4.01; Genemark calls start at 89986 /note=SSC: Start = 89986, Stop = 90429. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.869 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 444 bp is the longest possible ORF. GAP: 209 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= PhageName= JimJam, ProteinNumber= 109, Function= HNH endonuclease, EValue= 4.0E-44. NCBIBLAST= PhageName= HNH endonuclease [Streptomyces phage JimJam], Coverage= 93.1973, SubjectRange= 2:137, QueryRange= 2:139, EValue= 6.32763E-49. HHPRED= Accession= 3GOX_A, Description= Restriction endonuclease Hpy99I; ENDONUCLEASE-DNA COMPLEX, RESTRICTION ENZYME, HPY99I, PSEUDOPALINDROME, HYDROLASE-DNA COMPLEX; HET: 1PE; 1.5A {Helicobacter pylori}, Probability= 99.0. Coverage= 89.1156, SubjectRange= 75:181, QueryRange= 75:133. CDD= Accession= pfam12898, Coverage= 20.4082, SubjectRange= 48:76, QueryRange= 48:34, EValue= 0.00219761. /note=Longest ORF, decent coding potential, BLAST and HHPred hits to HNH endonuclease: HNK sequence found tRNA 90420 - 90492 /gene="155" /product="tRNA-Phe(gaa)" /locus tag="FRANCESCA_155" /note=tRNA-Phe(gaa) CDS 90520 - 90729 /gene="156" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_156" /note=Original Glimmer call @bp 90520 has strength 3.89; Genemark calls start at 90520 /note=SSC: Start = 90520, Stop = 90729. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.184 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 210 bp is not the longest possible ORF. GAP: 90 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 90520 End: 90729. The CP is good, long ORF, high z score high final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. CDS 90738 - 91070 /gene="157" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_157" /note=Original Glimmer call @bp 90738 has strength 4.03; Genemark calls start at 90738 /note=SSC: Start = 90738, Stop = 91070. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 333 bp is the longest possible ORF. GAP: 8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 90738 End: 91070. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. CDS 91080 - 91436 /gene="158" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_158" /note=Original Glimmer call @bp 91080 has strength 4.4; Genemark calls start at 91080 /note=SSC: Start = 91080, Stop = 91436. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.788 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 357 bp is the longest possible ORF. GAP: 9 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 153, Function= function unknown, EValue= 1.0E-68. NCBIBLAST= PhageName= hypothetical protein FDI69_gp219 [Rhodococcus phage Trina] >gb|ASZ74967.1| hypothetical protein SEA_TRINA_183 [Rhodococcus phage Trina], Coverage= 52.5424, SubjectRange= 21:82, QueryRange= 21:75, EValue= 0.00244408. 
HHPRED= Accession= SCOP_d1wjpa2, Description= g.37.1.1 (A:43-66) Zinc finger protein 295, ZNF295 {Human (Homo sapiens) [TaxId: 9606]} | CLASS: Small proteins, FOLD: beta-beta-alpha zinc fingers, SUPFAM: beta-beta-alpha zinc fingers, FAM: Classic zinc finger, C2H2, Probability= 86.6. Coverage= 13.5593, SubjectRange= 1:17, QueryRange= 1:84. CDD= . /note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS 91399 - 91665 /gene="159" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_159" /note=Original Glimmer call @bp 91399 has strength 5.32; Genemark calls start at 91405 /note=SSC: Start = 91399, Stop = 91665. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.573 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 267 bp is not the longest possible ORF. GAP: -38 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 235, Function= function unknown, EValue= 6.0E-10. NCBIBLAST= PhageName= hypothetical protein [Prescottella equi], Coverage= 96.5909, SubjectRange= 13:95, QueryRange= 13:88, EValue= 1.59259E-12. HHPRED= . CDD= . /note=Start: 91399 End: 91665. The CP is good, not longest ORF but any change significantly impacts scores, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. CDS 91658 - 91795 /gene="160" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_160" /note=Original Glimmer call @bp 91658 has strength 1.81; Genemark calls start at 91658 /note=SSC: Start = 91658, Stop = 91795. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.627 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 138 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 159, Function= function unknown, EValue= 5.0E-21. NCBIBLAST= . HHPRED= . CDD= . /note=Start: 91658 End: 91795. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. CDS 91806 - 92111 /gene="161" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_161" /note=Original Glimmer call @bp 91806 has strength 6.97; Genemark calls start at 91806 /note=SSC: Start = 91806, Stop = 92111. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 306 bp is the longest possible ORF. GAP: 10 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start: 91806 End: 92111. The CP is good, longest ORF, best z score best final score. PhagesDb function frequency hit to nothing. Phages DB blast had hits to nothing. So did HHPred and NCBI blast. No conserved domains. tRNA 92248 - 92321 /gene="162" /product="tRNA-Arg(tct)" /locus tag="FRANCESCA_162" /note=tRNA-Arg(tct) CDS 92378 - 92659 /gene="163" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_163" /note=Original Glimmer call @bp 92378 has strength 7.14; Genemark calls start at 92378 /note=SSC: Start = 92378, Stop = 92659. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 282 bp is the longest possible ORF. GAP: 266 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 159, Function= function unknown, EValue= 8.0E-51. NCBIBLAST= PhageName= hypothetical protein [Clostridium perfringens], Coverage= 97.8495, SubjectRange= 2:89, QueryRange= 2:93, EValue= 7.65335E-18. 
HHPRED= Accession= 1HIC_A, Description= HIRUDIN VARIANT; HIRUDIN; NMR {Hirudo medicinalis} SCOP: g.3.15.2, Probability= 50.1. Coverage= 19.3548, SubjectRange= 15:34, QueryRange= 15:84. CDD= . /note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS 92697 - 92873 /gene="164" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_164" /note=Original Glimmer call @bp 92697 has strength 7.61; Genemark calls start at 92697 /note=SSC: Start = 92697, Stop = 92873. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 177 bp is not the longest possible ORF. GAP: 37 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 160, Function= function unknown, EValue= 3.0E-30. NCBIBLAST= . HHPRED= Accession= 6Q2Z_A, Description= UPF0339 family protein; conserved hypothetical protein, UNKNOWN FUNCTION; NMR {Haloferax volcanii (strain ATCC 29605 / DSM 3757 / JCM 8879 / NBRC 14742 / NCIMB 2012 / VKM B-1768 / DS2)} SCOP: d.348.1.0, Probability= 75.5. Coverage= 60.3448, SubjectRange= 14:50, QueryRange= 14:37. CDD= . /note=Start: 92697. End: 92873. The CP is good, not longest ORF but this start was called in Dorin, best z score, best final score. No significant hits to anything, HHPred hits had high e-values and are insignificant, no DeepTmhmm hits or conserved domains CDS 92976 - 93113 /gene="165" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_165" /note=Original Glimmer call @bp 92976 has strength 6.58; Genemark calls start at 92976 /note=SSC: Start = 92976, Stop = 93113. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 138 bp is the longest possible ORF. GAP: 102 bp. ST: SS=NA. 
F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 161, Function= function unknown, EValue= 9.0E-20. NCBIBLAST= PhageName= hypothetical protein [Nocardia nova], Coverage= 68.8889, SubjectRange= 4:34, QueryRange= 4:38, EValue= 0.0395953. HHPRED= Accession= 5H7U_A, Description= Eukaryotic translation initiation factor 3 subunit C; translation, initiation factor, eukaryotic initiation factor; NMR {Saccharomyces cerevisiae (strain ATCC 204508 / S288c)}, Probability= 92.9. Coverage= 77.7778, SubjectRange= 83:118, QueryRange= 83:39. CDD= . /note=Start: 92976 End: 93113. The CP is good, longest ORF, best z score, best final score. No significant hits to anything, HHPred hits had high e-values and are insignificant, no DeepTmhmm hits or conserved domains CDS 93125 - 93364 /gene="166" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_166" /note=Original Glimmer call @bp 93125 has strength 9.24; Genemark calls start at 93125 /note=SSC: Start = 93125, Stop = 93364. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.138 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 240 bp is not the longest possible ORF. GAP: 11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 162, Function= function unknown, EValue= 1.0E-36. NCBIBLAST= . HHPRED= Accession= PF10768.13, Description= FliX ; Class II flagellar assembly regulator, Probability= 63.6. Coverage= 44.3038, SubjectRange= 70:105, QueryRange= 70:39. CDD= . 
/note=Not longest ORF but good scores and no overlap, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains tRNA 93459 - 93547 /gene="167" /product="tRNA-Leu(taa)" /locus tag="FRANCESCA_167" /note=tRNA-Leu(taa) CDS 93586 - 94017 /gene="168" /product="HNH endonuclease" /function="HNH endonuclease" /locus tag="Francesca_168" /note=Original Glimmer call @bp 93586 has strength 2.82 /note=SSC: Start = 93586, Stop = 94017. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.414 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 432 bp is the longest possible ORF. GAP: 221 bp. ST: SS=NA. F: HNH endonuclease. FS: PHDBLAST= PhageName= GodonK, ProteinNumber= 149, Function= HNH endonuclease, EValue= 1.0E-19. NCBIBLAST= PhageName= HNH endonuclease [Nevskiaceae bacterium], Coverage= 83.2168, SubjectRange= 18:136, QueryRange= 18:142, EValue= 1.16646E-33. HHPRED= Accession= 6GHC_A, Description= 5-methylcytosine-specific restriction enzyme A; HNH ENDONUCLEASE, MODIFICATION DEPENDENT RESTRICTION, 5-METHYLCYTOSINE, 5MC, 5-HYDROXYMETHYLCYTOSINE, 5HMC, BBA-ME NUCLEASE, ScoMcrA, HYDROLASE; 2.85A {Escherichia coli (strain K12)}, Probability= 98.0. Coverage= 44.0559, SubjectRange= 200:268, QueryRange= 200:138. CDD= Accession= cd00085, Coverage= 35.6643, SubjectRange= 6:55, QueryRange= 6:134, EValue= 9.79583E-6. /note=Longest ORF, good coding potential, significant BLAST and HHPred hits to HNH endonuclease, conserved domain of HNH endonuclease: HNH and HNK sequences found in protein CDS 94044 - 94259 /gene="169" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_169" /note=Original Glimmer call @bp 94044 has strength 8.58; Genemark calls start at 94044 /note=SSC: Start = 94044, Stop = 94259. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.944 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 216 bp is not the longest possible ORF. GAP: 26 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dmitri, ProteinNumber= 9, Function= head-to-tail adaptor, EValue= 2.7. NCBIBLAST= . HHPRED= Accession= SCOP_d1rjoa2, Description= d.17.2.1 (A:9-96) Copper amine oxidase, domains 1 and 2 {Arthrobacter globiformis [TaxId: 1665]} | CLASS: Alpha and beta proteins (a+b), FOLD: Cystatin-like, SUPFAM: Amine oxidase N-terminal region, FAM: Amine oxidase N-terminal region, Probability= 70.3. Coverage= 26.7606, SubjectRange= 4:23, QueryRange= 4:27. CDD= . /note=Start called in all members of the CG cluster and by proxy, this pham. Not the longest ORF, but the longest ORF adds no coding potential. There is not enough information to suggest a known specific function, therefore the protein is being called as a hypothetical protein. CDS 94277 - 94789 /gene="170" /product="nucleotidyl transferase" /function="nucleotidyl transferase" /locus tag="Francesca_170" /note=Original Glimmer call @bp 94277 has strength 1.93; Genemark calls start at 94277 /note=SSC: Start = 94277, Stop = 94789. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.874 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 513 bp is the longest possible ORF. GAP: 17 bp. ST: SS=NA. F: nucleotidyl transferase. FS: PHDBLAST= PhageName= Peregrin, ProteinNumber= 241, Function= tRNA nucleotidyltransferase, EValue= 3.0E-34. NCBIBLAST= PhageName= nucleotidyltransferase [Rhodococcus phage Weasels2] >gb|AOZ63827.1| nucleotidyltransferase [Rhodococcus phage Weasels2], Coverage= 98.2353, SubjectRange= 2:167, QueryRange= 2:169, EValue= 3.34049E-40. HHPRED= Accession= PF10127.13, Description= RlaP ; RNA repair pathway DNA polymerase beta family, Probability= 99.9. Coverage= 90.0, SubjectRange= 19:199, QueryRange= 19:155. CDD= Accession= PHA02603, Coverage= 92.9412, SubjectRange= 1:178, QueryRange= 1:158, EValue= 5.10063E-7. 
/note=Not the most annotated start (this gene does not have the most annotated start), but the start is called 100% of the time when present. Longest ORF. The call for tRNA nucleotidyltransferase was due to strong BLAST and HHPred matches with several other phages. CDS 94791 - 95066 /gene="171" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_171" /note=Original Glimmer call @bp 94791 has strength 3.94; Genemark calls start at 94791 /note=SSC: Start = 94791, Stop = 95066. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.365 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 276 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 166, Function= function unknown, EValue= 8.0E-51. NCBIBLAST= PhageName= hypothetical protein [Candidatus Methanoperedens sp.], Coverage= 79.1209, SubjectRange= 1:62, QueryRange= 1:72, EValue= 2.5559E-4. HHPRED= Accession= PF09986.13, Description= DUF2225 ; Uncharacterized protein conserved in bacteria (DUF2225), Probability= 96.1. Coverage= 59.3407, SubjectRange= 3:58, QueryRange= 3:64. CDD= . /note=The current selected start is the best because it has good coding potential, the smallest gap and the best RBS score. Although this start has not been annotated yet (because it is found in only Dorin and Francesca), it is called 100% of the time. /note=HHPRED results show 96.1% probability with Cpxc domain, which is presumed to be functionally uncharacterized. /note=No conserved domain. No transmembrane domain. 
tRNA 95107 - 95180 /gene="172" /product="tRNA-Arg(acg)" /locus tag="FRANCESCA_172" /note=tRNA-Arg(acg) tRNA 95305 - 95380 /gene="173" /product="tRNA-Leu(caa)" /locus tag="FRANCESCA_173" /note=tRNA-Leu(caa) CDS 95408 - 95695 /gene="174" /product="thioredoxin" /function="thioredoxin" /locus tag="Francesca_174" /note=Original Glimmer call @bp 95408 has strength 8.22; Genemark calls start at 95408 /note=SSC: Start = 95408, Stop = 95695. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 288 bp is not the longest possible ORF. GAP: 341 bp. ST: SS=NA. F: thioredoxin. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 169, Function= function unknown, EValue= 1.0E-48. NCBIBLAST= PhageName= thioredoxin family protein [Verrucomicrobiota bacterium], Coverage= 64.2105, SubjectRange= 48:112, QueryRange= 48:65, EValue= 1.89896E-8. HHPRED= Accession= 3QOU_A, Description= protein ybbN; thioredoxin-like fold, tetratricopeptide repeat, lysine dimethylation, PROTEIN BINDING; HET: MLY; 1.8A {Escherichia coli}, Probability= 99.7. Coverage= 64.2105, SubjectRange= 27:88, QueryRange= 27:62. CDD= Accession= cd02947, Coverage= 88.4211, SubjectRange= 14:91, QueryRange= 14:87, EValue= 2.81758E-11. /note=The current selected start is the best because it has good coding potential, the "smallest" gap and the best RBS score. Although this start has not been annotated yet (because it is found in only Dorin and Francesca), it is called 100% of the time. NCBI Blast results show a function of thioredoxin in Phage Lilbooboo. /note=Conserved domain: TRX_superfamily, which includes proteins that exclusively encode a TRX domain. /note=HHPRED shows good e-values and 99.7% probability for protein ybbN, which is a Trx-like protein. 
CDS 95682 - 95963 /gene="175" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_175" /note=Original Glimmer call @bp 95682 has strength 4.0; Genemark calls start at 95709 /note=SSC: Start = 95682, Stop = 95963. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.277 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 282 bp is the longest possible ORF. GAP: -14 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= JasonD_Draft, ProteinNumber= 49, Function= function unknown, EValue= 0.24. NCBIBLAST= . HHPRED= Accession= 6ANZ_A, Description= Uncharacterized protein; SSGCID, Neisseria gonorrhoeae, hypothetical protein, uncharacterized protein, iodide phasing, Structural Genomics, Seattle Structural Genomics Center for; HET: SO4; 1.6A {Neisseria gonorrhoeae (strain NCCP11945)}, Probability= 53.3. Coverage= 93.5484, SubjectRange= 1:101, QueryRange= 1:88. CDD= . /note=The current selected start is the best because it has good coding potential and the best RBS score. Although this start has not been annotated yet (because it is found in only Dorin and Francesca), it is called 100% of the time. /note=No NCBI Blast results /note=No transmembrane or conserved domains /note=HHPRED shows results for family of unknown function. CDS 95972 - 96112 /gene="176" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_176" /note=Genemark calls start at 95972 /note=SSC: Start = 95972, Stop = 96112. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.456 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 141 bp is the longest possible ORF. GAP: 8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 171, Function= function unknown, EValue= 4.0E-20. NCBIBLAST= . 
HHPRED= Accession= 8B1R_P, Description= Probable RecBCD inhibitor gp5.9; Homologous recombination, DNA repair, phage, Helicase, Nuclease, Inhibitor, Protein complex, Enzyme, DNA mimic, DNA BINDING PROTEIN; 3.2A {Escherichia coli}, Probability= 81.7. Coverage= 89.1304, SubjectRange= 8:49, QueryRange= 8:42. CDD= . /note=LORF and best scores; start: 95,972, stop: 96,112; no major hits to anything, HHPred hits are insignificant CDS 96114 - 96284 /gene="177" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_177" /note=Original Glimmer call @bp 96114 has strength 9.47; Genemark calls start at 96114 /note=SSC: Start = 96114, Stop = 96284. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 171 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 187, Function= function unknown, EValue= 2.0E-8. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_187 [Rhodococcus phage NiceHouse], Coverage= 87.5, SubjectRange= 7:55, QueryRange= 7:51, EValue= 3.24604E-8. HHPRED= Accession= PF04534.16, Description= Herpes_UL56 ; Herpesvirus UL56 protein, Probability= 37.1. Coverage= 35.7143, SubjectRange= 79:99, QueryRange= 79:55. CDD= . /note=Longest ORF, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS 96281 - 96517 /gene="178" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_178" /note=Original Glimmer call @bp 96281 has strength 5.71; Genemark calls start at 96281 /note=SSC: Start = 96281, Stop = 96517. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.151 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 237 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= Camille, ProteinNumber= 16, Function= tail terminator, EValue= 3.7. NCBIBLAST= PhageName= chorismate mutase [Oscillospiraceae bacterium] >gb|MDD7428177.1| prephenate dehydratase domain-containing protein [Oscillospiraceae bacterium] >gb|MDY2847927.1| prephenate dehydratase domain-containing protein [Oscillospiraceae bacterium], Coverage= 61.5385, SubjectRange= 4:51, QueryRange= 4:53, EValue= 7.85687E-4. HHPRED= Accession= 5HUB_A, Description= Chorismate mutase; chorismate mutase, shikimate pathway, pericyclic reaction, Isomerase; 1.06A {Corynebacterium glutamicum}, Probability= 99.1. Coverage= 94.8718, SubjectRange= 13:85, QueryRange= 13:77. CDD= Accession= smart00830, Coverage= 57.6923, SubjectRange= 1:45, QueryRange= 1:53, EValue= 0.00118804. /note=The current selected start is the best because it has good coding potential, the smallest overlap and the best RBS score. HHPRED shows results of chorismate mutase with 99.1% probability but very low e-value. Although Phagesdb blasts to gene 16 in Camille with function of tail terminator, the e-value is poor (3.7) and there is no evidence (from the approved functions list) to call it a tail terminator. /note=Conserved domain: Chorismate mutase /note=No transmembrane domain. CDS 96640 - 97269 /gene="179" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_179" /note=Original Glimmer call @bp 96640 has strength 7.47; Genemark calls start at 96640 /note=SSC: Start = 96640, Stop = 97269. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.31 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 630 bp is the longest possible ORF. GAP: 122 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 174, Function= function unknown, EValue= 1.0E-120. NCBIBLAST= . HHPRED= Accession= PF11023.12, Description= DUF2614 ; Zinc-ribbon containing domain, Probability= 81.6. 
Coverage= 44.9761, SubjectRange= 1:105, QueryRange= 1:125. CDD= . /note=Start used in all CG phages. Longest ORF. Two predicted transmembrane domains. Not enough information otherwise to point to a known function, thus leading to this protein being called as a hypothetical protein. tRNA 97299 - 97373 /gene="180" /product="tRNA-Ile(tat)" /locus tag="FRANCESCA_180" /note=tRNA-Ile(tat) CDS 97348 - 97947 /gene="181" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_181" /note=Original Glimmer call @bp 97684 has strength 4.06; Genemark calls start at 97402 /note=SSC: Start = 97348, Stop = 97947. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.776 is not the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 600 bp is not the longest possible ORF. GAP: 78 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Weasels2, ProteinNumber= 176, Function= function unknown, EValue= 0.2. NCBIBLAST= . HHPRED= Accession= PF14316.10, Description= DUF4381 ; Domain of unknown function (DUF4381), Probability= 73.2. Coverage= 57.2864, SubjectRange= 13:90, QueryRange= 13:173. CDD= . /note=Start changed from 97684 to 97348 in order to contain all coding capacity and to eliminate a gap of over 400. The protein has been called as a hypothetical protein due to a lack of information to suggest a known function. However, the membrane predictions are certainly interesting. CDS 97958 - 98995 /gene="182" /product="RNA ligase" /function="RNA ligase" /locus tag="Francesca_182" /note=Original Glimmer call @bp 97958 has strength 7.39; Genemark calls start at 97958 /note=SSC: Start = 97958, Stop = 98995. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1038 bp is the longest possible ORF. GAP: 10 bp. ST: SS=NA. F: RNA ligase. 
FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 194, Function= RNA ligase, EValue= 5.0E-56. NCBIBLAST= PhageName= RNA ligase [Rhodococcus phage NiceHouse], Coverage= 89.5652, SubjectRange= 1:318, QueryRange= 1:309, EValue= 9.6847E-62. HHPRED= Accession= 5TT6_A, Description= T4 RNA ligase 1; metal catalysis, covalent nucleotidyltransferase, lysyl-AMP, LIGASE; HET: ATP; 2.187A {Enterobacteria phage T4}, Probability= 100.0. Coverage= 93.3333, SubjectRange= 26:369, QueryRange= 26:332. CDD= Accession= pfam09511, Coverage= 52.1739, SubjectRange= 1:221, QueryRange= 1:233, EValue= 1.12016E-19. /note=Start: 97,958 End: 98,995. The CP is good, longest ORF, best z score best final score. Phages DB blast and HHPred had hits to RNA ligase, and conserved domain had hits to RNA ligase. CDS complement (98992 - 99378) /gene="183" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_183" /note=Original Glimmer call @bp 99378 has strength 3.34; Genemark calls start at 99378 /note=SSC: Start = 99378, Stop = 98992. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.643 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 387 bp is not the longest possible ORF. GAP: 7 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 179, Function= function unknown, EValue= 3.0E-45. NCBIBLAST= PhageName= polysaccharide deacetylase family protein [Streptomyces resistomycificus] >gb|KUN99502.1| hypothetical protein AQJ84_11180 [Streptomyces resistomycificus], Coverage= 92.9688, SubjectRange= 1:108, QueryRange= 1:119, EValue= 0.00642369. HHPRED= Accession= 7QOI_AD, Description= Major capsid protein gp32; crAssphage, bacteriophage, virus, DNA virus, portal, vertex, capsid, connector; HET: MG; 3.62A {Bacteroides phage crAss001}, Probability= 34.8. Coverage= 28.125, SubjectRange= 134:169, QueryRange= 134:113. CDD= . 
/note=Not longest ORF but best scores and no gap, good coding potential, no significant BLAST or HHPred hits, no conserved or transmembrane domains CDS complement (99386 - 100456) /gene="184" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_184" /note=Original Glimmer call @bp 100456 has strength 3.99; Genemark calls start at 100456 /note=SSC: Start = 100456, Stop = 99386. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.108 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1071 bp is the longest possible ORF. GAP: 14 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 180, Function= function unknown, EValue= 0.0. NCBIBLAST= PhageName= hydrolase [Rhodococcus phage NiceHouse], Coverage= 48.0337, SubjectRange= 1:165, QueryRange= 1:171, EValue= 7.23762E-12. HHPRED= Accession= 1WCK_A, Description= BCLA PROTEIN; COLLAGEN-LIKE PROTEIN, BACTERIAL SURFACE ANTIGEN, JELLY-ROLL TOPOLOGY, STRUCTURAL PROTEIN; 1.36A {BACILLUS ANTHRACIS}, Probability= 83.5. Coverage= 18.8202, SubjectRange= 123:195, QueryRange= 123:333. CDD= . /note=Suggested start was selected due to good scores and had good CP. There was some evidence that this gene could code for a hydrolase, but not enough to officially declare it. CDS complement (100471 - 102375) /gene="185" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_185" /note=Original Glimmer call @bp 102375 has strength 9.48; Genemark calls start at 102375 /note=SSC: Start = 102375, Stop = 100471. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.119 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1905 bp is not the longest possible ORF. GAP: 63 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Stormageddon, ProteinNumber= 26, Function= minor tail protein, EValue= 1.0E-64. 
NCBIBLAST= PhageName= hydrolase [Arthrobacter phage Qui] >gb|QED11527.1| minor tail protein [Arthrobacter phage Qui] >gb|QOC56359.1| minor tail protein [Arthrobacter phage Paella], Coverage= 55.5205, SubjectRange= 67:447, QueryRange= 67:594, EValue= 9.79402E-41. HHPRED= Accession= 3QC7_A, Description= Head fiber protein; supercoiled triple repeating helix-turn-helix, VIRAL PROTEIN; 1.52A {Bacillus phage phi29}, Probability= 98.5. Coverage= 12.3028, SubjectRange= 8:136, QueryRange= 8:112. CDD= . /note=The selected start is called 100% of the time when present. CP is good/includes start and has good final/z scores. HHPred yielded significant hits for head fiber protein. NCBI Blast produced good e-values for hydrolase. However, because both hits had high likelihoods, further discussion established that a minor tail protein cannot be claimed unless the gene is near a moderate/average "long" size. The head protein call was likewise set aside, as it was an effect of the hits for a minor tail protein. This led to the decision to call it a hypothetical protein. CDS complement (102439 - 103563) /gene="186" /product="minor tail protein" /function="minor tail protein" /locus tag="Francesca_186" /note=Original Glimmer call @bp 103563 has strength 11.56; Genemark calls start at 103563 /note=SSC: Start = 103563, Stop = 102439. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1125 bp is the longest possible ORF. GAP: 24 bp. ST: SS=NA. F: minor tail protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 205, Function= function unknown, EValue= 1.0E-113. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_197 [Rhodococcus phage NiceHouse], Coverage= 98.9305, SubjectRange= 1:368, QueryRange= 1:370, EValue= 2.62803E-127. 
HHPRED= Accession= 8I4M_j, Description= Fiber protein(gp 28) of the cyanophage P-SCSP1u; Whole virus, Capsid, cyanophage, T7-like virus, VIRUS; 3.81A {Prochlorococcus phage P-SCSP1u}, Probability= 88.4. Coverage= 12.0321, SubjectRange= 523:577, QueryRange= 523:64. CDD= Accession= COG5301, Coverage= 44.6524, SubjectRange= 189:355, QueryRange= 189:326, EValue= 1.44205E-6. /note=Not most annotated start, but shared with all members of the CG cluster. This gene has some strong hits for tail fiber protein as well as head decoration protein, which we refer to as capsid decoration protein. This place in the genome would be an odd place for a tail fiber protein, so the capsid decoration protein seems more likely, but we don't have enough evidence to support this latter function, so we are calling it as a hypothetical protein. /note=Minor tail proteins can be in non-canonical positions CDS complement (103588 - 104409) /gene="187" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_187" /note=Original Glimmer call @bp 104409 has strength 5.34; Genemark calls start at 104409 /note=SSC: Start = 104409, Stop = 103588. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.16 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 822 bp is not the longest possible ORF. GAP: 98 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 183, Function= function unknown, EValue= 1.0E-158. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_198 [Rhodococcus phage NiceHouse], Coverage= 99.6337, SubjectRange= 1:252, QueryRange= 1:272, EValue= 6.09386E-28. HHPRED= Accession= PF18667.5, Description= BppU_IgG ; Baseplate upper protein immunoglobulin like domain, Probability= 71.1. Coverage= 17.5824, SubjectRange= 4:69, QueryRange= 4:267. CDD= . /note=HHPred yielded no significant hits. NCBI Blast yielded good e-values for hypothetical protein. 
CDS 104508 - 104756 /gene="188" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_188" /note=Original Glimmer call @bp 104508 has strength 8.33; Genemark calls start at 104508 /note=SSC: Start = 104508, Stop = 104756. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.299 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 249 bp is the longest possible ORF. GAP: 98 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 184, Function= function unknown, EValue= 3.0E-29. NCBIBLAST= . HHPRED= Accession= PF18571.5, Description= VWA_3_C ; von Willebrand factor type A C-terminal domain, Probability= 44.5. Coverage= 25.6098, SubjectRange= 9:30, QueryRange= 9:55. CDD= . /note=This start has the best scores and has good coding potential. This has no significant hits for any particular function. CDS 104758 - 104988 /gene="189" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_189" /note=Original Glimmer call @bp 104758 has strength 10.06; Genemark calls start at 104758 /note=SSC: Start = 104758, Stop = 104988. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.98 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 231 bp is not the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 185, Function= function unknown, EValue= 7.0E-38. NCBIBLAST= . HHPRED= . CDD= . /note=Start is called 100% of the time when present and shares this selected start with Dorin. The selected start has good coding potential as well as final/z-scores. HHPred results were not significant and NCBI Blast did not produce any results. 
CDS 104990 - 105181 /gene="190" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_190" /note=Original Glimmer call @bp 104990 has strength 5.11; Genemark calls start at 104990 /note=SSC: Start = 104990, Stop = 105181. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 192 bp is not the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 186, Function= function unknown, EValue= 2.0E-26. NCBIBLAST= . HHPRED= Accession= PF10112.13, Description= Halogen_Hydrol ; 5-bromo-4-chloroindolyl phosphate hydrolysis protein, Probability= 88.7. Coverage= 65.0794, SubjectRange= 2:41, QueryRange= 2:52. CDD= . /note=The start is really good, low gap, high/good coding potential. No strong hits for any particular function. CDS 105184 - 106086 /gene="191" /product="polynucleotide kinase" /function="polynucleotide kinase" /locus tag="Francesca_191" /note=Original Glimmer call @bp 105184 has strength 5.26; Genemark calls start at 105184 /note=SSC: Start = 105184, Stop = 106086. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 903 bp is the longest possible ORF. GAP: 2 bp. ST: SS=NA. F: polynucleotide kinase. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 187, Function= function unknown, EValue= 1.0E-174. NCBIBLAST= PhageName= polynucleotide kinase [Rhodococcus phage NiceHouse], Coverage= 99.3333, SubjectRange= 4:301, QueryRange= 4:300, EValue= 2.30672E-106. HHPRED= Accession= 1LTQ_A, Description= POLYNUCLEOTIDE KINASE; KINASE, PHOSPHATASE, ALPHA/BETA, P-LOOP, TRANSFERASE; HET: MSE, ADP; 2.33A {Enterobacteria phage T4} SCOP: c.37.1.1, c.108.1.9, Probability= 100.0. Coverage= 99.6667, SubjectRange= 2:301, QueryRange= 2:300. CDD= . 
/note=Contains the suggested start and called 99% of the time when present. The selected start has good CP and good final/z-scores. Significant hits in HHPred. NCBI Blast had good e-values. CDS 106107 - 106376 /gene="192" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_192" /note=Original Glimmer call @bp 106107 has strength 6.68; Genemark calls start at 106107 /note=SSC: Start = 106107, Stop = 106376. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.06 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 270 bp is the longest possible ORF. GAP: 20 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 188, Function= function unknown, EValue= 5.0E-43. NCBIBLAST= . HHPRED= Accession= PF17255.6, Description= EbsA ; EbsA-like protein, Probability= 82.9. Coverage= 95.5056, SubjectRange= 16:95, QueryRange= 16:87. CDD= . /note=Good start with great scores, had good coding potential, and has synteny within the CG cluster. No hits for any particular function. CDS 106373 - 106546 /gene="193" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_193" /note=Genemark calls start at 106373 /note=SSC: Start = 106373, Stop = 106546. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.253 is not the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 174 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Shagrat, ProteinNumber= 21, Function= tape measure protein, EValue= 6.1. NCBIBLAST= . HHPRED= Accession= PF11772.12, Description= EpuA ; DNA-directed RNA polymerase subunit beta, Probability= 82.9. Coverage= 64.9123, SubjectRange= 3:41, QueryRange= 3:40. CDD= . /note=Start shared with the other member of the CG cluster, Dorin. This protein has been called as a hypothetical due to a lack of information to suggest a known function. 
CDS 106557 - 106754 /gene="194" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_194" /note=Original Glimmer call @bp 106557 has strength 3.6; Genemark calls start at 106557 /note=SSC: Start = 106557, Stop = 106754. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 198 bp is the longest possible ORF. GAP: 10 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Poco6, ProteinNumber= 015, Function= function unknown, EValue= 7.9. NCBIBLAST= . HHPRED= Accession= PF15643.10, Description= Tox-PL-2 ; Papain fold toxin 2, Probability= 51.3. Coverage= 16.9231, SubjectRange= 81:92, QueryRange= 81:13. CDD= . /note=Start chosen in all members of the CG cluster. The protein has been called as a hypothetical protein as there is no information to suggest a known function. CDS 106812 - 106961 /gene="195" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_195" /note=Original Glimmer call @bp 106812 has strength 11.28; Genemark calls start at 106812 /note=SSC: Start = 106812, Stop = 106961. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 150 bp is the longest possible ORF. GAP: 57 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 191, Function= function unknown, EValue= 2.0E-20. NCBIBLAST= . HHPRED= Accession= 8HHF_B, Description= Cell division protein FtsB; Bacterial cell division, divisome, FtsB, FtsL, FtsQ, FtsBLQ, FtsQLB, membrane protein complex, heterotrimer, MEMBRANE PROTEIN; 3.04A {Escherichia coli}, Probability= 93.5. Coverage= 83.6735, SubjectRange= 46:88, QueryRange= 46:43. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. 
Coding potential is good and the selected start has good final/z-scores. Significant hits in HHPred for cell division protein. NCBI Blast yielded no values. Determined hypothetical protein due to inconsistency between the two. CDS 106975 - 107139 /gene="196" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_196" /note=Genemark calls start at 106975 /note=SSC: Start = 106975, Stop = 107139. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.31 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 165 bp is not the longest possible ORF. GAP: 13 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF13129.10, Description= DUF3953 ; Protein of unknown function (DUF3953), Probability= 82.9. Coverage= 64.8148, SubjectRange= 3:39, QueryRange= 3:42. CDD= . /note=No Blast results, suggested start was good. CP was good and evidence presents itself as hypothetical protein CDS 107142 - 107369 /gene="197" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_197" /note=Original Glimmer call @bp 107139 has strength 6.87; Genemark calls start at 107139 /note=SSC: Start = 107142, Stop = 107369. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 228 bp is not the longest possible ORF. GAP: 2 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 193, Function= function unknown, EValue= 2.0E-36. NCBIBLAST= . HHPRED= Accession= PF20556.2, Description= DUF6768 ; Family of unknown function (DUF6768), Probability= 77.4. Coverage= 68.0, SubjectRange= 44:103, QueryRange= 44:52. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. We decided to change the start (from start one to start two) to increase the scores and minimize the gene overlap. 
There is one transmembrane present, however the presence of a single one is not enough to conclude a transmembrane protein. CDS 107439 - 108461 /gene="198" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_198" /note=Original Glimmer call @bp 107439 has strength 11.05; Genemark calls start at 107439 /note=SSC: Start = 107439, Stop = 108461. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.98 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 1023 bp is the longest possible ORF. GAP: 69 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 194, Function= function unknown, EValue= 0.0. NCBIBLAST= PhageName= hypothetical protein FDI69_gp197 [Rhodococcus phage Trina] >gb|ASZ74989.1| hypothetical protein SEA_TRINA_210 [Rhodococcus phage Trina], Coverage= 98.2353, SubjectRange= 1:341, QueryRange= 1:334, EValue= 1.16649E-117. HHPRED= Accession= PF06067.15, Description= DUF932 ; Domain of unknown function (DUF932), Probability= 99.9. Coverage= 68.5294, SubjectRange= 1:221, QueryRange= 1:326. CDD= Accession= TIGR03299, Coverage= 89.7059, SubjectRange= 7:298, QueryRange= 7:322, EValue= 3.411E-37. /note=Has good start, and strong coding potential, shares this start with other phages. This gene has hits for some conserved domains, but they either have an unknown function or the phage plasmid-like protein, which is not super specific either. Additionally, these are not noted in the approved functions list, so it could be looked into further in regard to the plasmid-like protein conserved domain, but for now we are calling it as a hypothetical protein. CDS 108520 - 109041 /gene="199" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_199" /note=Original Glimmer call @bp 108520 has strength 6.13; Genemark calls start at 108520 /note=SSC: Start = 108520, Stop = 109041. (Forward). 
CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.467 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 522 bp is the longest possible ORF. GAP: 58 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 195, Function= function unknown, EValue= 2.0E-90. NCBIBLAST= . HHPRED= Accession= SCOP_d1hroa_, Description= a.3.1.1 (A:) Cytochrome c2 {Rhodopila globiformis [TaxId: 1071]} | CLASS: All alpha proteins, FOLD: Cytochrome c, SUPFAM: Cytochrome c, FAM: monodomain cytochrome c, Probability= 67.0. Coverage= 9.24856, SubjectRange= 88:104, QueryRange= 88:167. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is good and the selected start has good final/z-scores. No significant hits in HHPred. NCBI Blast yielded no values. CDS 109043 - 109390 /gene="200" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_200" /note=Original Glimmer call @bp 109043 has strength 7.59; Genemark calls start at 109043 /note=SSC: Start = 109043, Stop = 109390. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.342 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 348 bp is not the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=No NCBI Blast hits, went with suggested start. Good CP evidence presents itself as hypothetical protein CDS 109412 - 109855 /gene="201" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_201" /note=Genemark calls start at 109412 /note=SSC: Start = 109412, Stop = 109855. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.164 is not the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 444 bp is the longest possible ORF. GAP: 21 bp. ST: SS=NA. F: membrane protein. 
FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 197, Function= function unknown, EValue= 2.0E-77. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_210 [Rhodococcus phage NiceHouse], Coverage= 82.3129, SubjectRange= 1:121, QueryRange= 1:133, EValue= 1.05836E-7. HHPRED= Accession= 7DYR_E, Description= PTS system mannose-specific EIIC component; MceA, Microcin E492, Bacteriocin, PTS, ManYZ, transporter, Mannose, PROTEIN TRANSPORT; HET: MAN;{Escherichia coli (strain K12)}, Probability= 94.1. Coverage= 95.9184, SubjectRange= 122:252, QueryRange= 122:145. CDD= . /note=Does not contain the most annotated start, but called 100% of the time when present. Coding potential is good and the selected start has good final/z-scores. Significant hits in HHPred. NCBI Blast yielded hypothetical protein, called in NiceHouse. NiceHouse also called function unknown. Determining function as hypothetical protein. CDS 109865 - 110146 /gene="202" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_202" /note=Original Glimmer call @bp 109865 has strength 7.45; Genemark calls start at 109865 /note=SSC: Start = 109865, Stop = 110146. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 282 bp is the longest possible ORF. GAP: 9 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 214, Function= function unknown, EValue= 3.0E-10. NCBIBLAST= PhageName= hypothetical protein FDI69_gp193 [Rhodococcus phage Trina] >gb|ASZ74993.1| hypothetical protein SEA_TRINA_214 [Rhodococcus phage Trina], Coverage= 82.7957, SubjectRange= 16:88, QueryRange= 16:93, EValue= 5.12747E-10. HHPRED= Accession= PF20542.2, Description= DUF6757 ; Family of unknown function (DUF6757), Probability= 65.3. Coverage= 43.0108, SubjectRange= 2:40, QueryRange= 2:67. CDD= . /note=Most evidence presents itself as a hypothetical protein. 
CP is good; only one start was presented CDS 110158 - 110313 /gene="203" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_203" /note=Original Glimmer call @bp 110158 has strength 8.74; Genemark calls start at 110158 /note=SSC: Start = 110158, Stop = 110313. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 156 bp is the longest possible ORF. GAP: 11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 6O7B_B, Description= Csm4; Type III-A CRISPR-Cas system, Csm1-Csm4 cassette in complex with cA4, IMMUNE SYSTEM, immune system-dna complex, immune; 2.4A {Thermococcus onnurineus (strain NA1)}, Probability= 48.6. Coverage= 58.8235, SubjectRange= 49:72, QueryRange= 49:51. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is good and the selected start has good final/z-scores. No significant hits in HHPred. NCBI Blast yielded no values. CDS 110355 - 110813 /gene="204" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_204" /note=Original Glimmer call @bp 110355 has strength 9.55; Genemark calls start at 110355 /note=SSC: Start = 110355, Stop = 110813. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.98 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 459 bp is the longest possible ORF. GAP: 41 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein FDI69_gp192 [Rhodococcus phage Trina] >gb|ASZ74994.1| hypothetical protein SEA_TRINA_215 [Rhodococcus phage Trina], Coverage= 100.0, SubjectRange= 1:150, QueryRange= 1:152, EValue= 3.16427E-24. HHPRED= . CDD= . /note=Start has good CP and the best RBS scores. There were no significant hits for function. 
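The LO and GAP values in the records above can be checked from the start and stop coordinates alone. The following is a minimal sketch, not the annotation tool's own code: the function names are hypothetical, and the two formulas are inferred from the coordinates of genes 202 to 204 (inclusive coordinates, forward strand), not taken from any tool documentation.

```python
def orf_length(start: int, stop: int) -> int:
    """Length in bp of a feature whose start and stop are both inclusive."""
    return stop - start + 1


def gap_to_previous(prev_stop: int, start: int) -> int:
    """Bases between the previous stop and this start; negative means overlap."""
    return start - prev_stop - 1


# (gene, start, stop) copied from the records for genes 202, 203 and 204.
genes = [
    ("202", 109865, 110146),
    ("203", 110158, 110313),
    ("204", 110355, 110813),
]

# Walk consecutive pairs so each gene is compared with the one before it.
for (_, _, prev_stop), (gene, start, stop) in zip(genes, genes[1:]):
    print(gene, orf_length(start, stop), gap_to_previous(prev_stop, start))
```

For genes 203 and 204 this reproduces the recorded LO values (156 bp and 459 bp) and GAP values (11 bp and 41 bp). A negative result corresponds to an overlap with the preceding gene.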
CDS 110806 - 111204 /gene="205" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_205" /note=Original Glimmer call @bp 110806 has strength 7.88; Genemark calls start at 110806 /note=SSC: Start = 110806, Stop = 111204. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.934 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 399 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 204, Function= function unknown, EValue= 8.0E-75. NCBIBLAST= PhageName= hypothetical protein FDI69_gp191 [Rhodococcus phage Trina] >gb|ASZ74995.1| hypothetical protein SEA_TRINA_216 [Rhodococcus phage Trina], Coverage= 97.7273, SubjectRange= 15:143, QueryRange= 15:131, EValue= 4.65905E-5. HHPRED= Accession= 2DJ6_C, Description= hypothetical protein PH0634; 6-pyruvoyl tetrahydrobiopterin synthase (PTPS), Structural Genomics, NPPSFA, National Project on Protein Structural and Functional Analyses, RIKEN; 2.1A {Pyrococcus horikoshii} SCOP: d.96.1.0, Probability= 57.1. Coverage= 12.1212, SubjectRange= 17:32, QueryRange= 17:92. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is good and the selected start has good final/z-scores. HHPred had no significant hits. NCBI Blast yielded only hypothetical protein with good e-value. CDS 111161 - 111334 /gene="206" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_206" /note=Original Glimmer call @bp 111161 has strength 1.82 /note=SSC: Start = 111161, Stop = 111334. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.797 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 174 bp is the longest possible ORF. GAP: -44 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . 
/note=Delete because of weak CP and a large overlap with the previous gene; one point in favor of keeping it, however, is that the same gene is called in Dorin /note=Also, despite the significant overlap (-44 bp), the RBS score is quite strong. CDS 111315 - 111692 /gene="207" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_207" /note=Original Glimmer call @bp 111315 has strength 5.46; Genemark calls start at 111315 /note=SSC: Start = 111315, Stop = 111692. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.218 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 378 bp is not the longest possible ORF. GAP: -20 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 218, Function= function unknown, EValue= 2.0E-10. NCBIBLAST= PhageName= hypothetical protein FDI69_gp189 [Rhodococcus phage Trina] >gb|ASZ74997.1| hypothetical protein SEA_TRINA_218 [Rhodococcus phage Trina], Coverage= 95.2, SubjectRange= 2:119, QueryRange= 2:121, EValue= 5.28711E-10. HHPRED= Accession= PF14445.10, Description= Prok-RING_2 ; Prokaryotic RING finger family 2, Probability= 65.5. Coverage= 41.6, SubjectRange= 3:55, QueryRange= 3:97. CDD= . /note=Does not contain the most annotated start, but called 100% of the time when present. Coding potential is good and the selected start has good final/z-scores. HHPred had no significant hits. NCBI Blast yielded hypothetical protein with good e-values. CDS 111763 - 112185 /gene="208" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_208" /note=Original Glimmer call @bp 111763 has strength 11.82; Genemark calls start at 111763 /note=SSC: Start = 111763, Stop = 112185. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.069 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 423 bp is not the longest possible ORF. GAP: 70 bp. ST: SS=NA.
F: Hypothetical Protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 219, Function= function unknown, EValue= 9.0E-7. NCBIBLAST= . HHPRED= Accession= 2MXE_A, Description= MvaT; TRANSCRIPTION REGULATOR; NMR {Pseudomonas aeruginosa PAO1}, Probability= 28.8. Coverage= 8.57143, SubjectRange= 5:17, QueryRange= 5:99. CDD= . /note=No hits on NCBI Blast; CP is good. Evidence presents itself as hypothetical protein CDS 112281 - 112466 /gene="209" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_209" /note=Original Glimmer call @bp 112281 has strength 7.02; Genemark calls start at 112281 /note=SSC: Start = 112281, Stop = 112466. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 186 bp is the longest possible ORF. GAP: 95 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 206, Function= function unknown, EValue= 3.0E-30. NCBIBLAST= . HHPRED= Accession= PF04697.17, Description= Pinin_SDK_N ; pinin/SDK conserved region, Probability= 82.7. Coverage= 49.1803, SubjectRange= 1:31, QueryRange= 1:31. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is good and the selected start has good final/z-scores. HHPred had no significant hits. NCBI Blast yielded no results. CDS 112469 - 112705 /gene="210" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_210" /note=Original Glimmer call @bp 112469 has strength 3.22; Genemark calls start at 112469 /note=SSC: Start = 112469, Stop = 112705. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.617 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 237 bp is the longest possible ORF. GAP: 2 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 208, Function= function unknown, EValue= 1.0E-41. NCBIBLAST= PhageName= hypothetical protein FDJ30_gp122 [Streptomyces phage BillNye] >gb|AVD99311.1| hypothetical protein SEA_BILLNYE_134 [Streptomyces phage BillNye] >gb|QBZ72394.1| hypothetical protein SEA_CIRCINUS_135 [Streptomyces phage Circinus], Coverage= 96.1538, SubjectRange= 4:79, QueryRange= 4:75, EValue= 4.80157E-6. HHPRED= Accession= PF10879.12, Description= DUF2674 ; Protein of unknown function (DUF2674), Probability= 55.6. Coverage= 41.0256, SubjectRange= 17:41, QueryRange= 17:39. CDD= . /note=The chosen start has good coding potential and RBS scores. There are no significant hits for function. CDS 112702 - 112986 /gene="211" /product="WhiB family transcription factor" /function="WhiB family transcription factor" /locus tag="Francesca_211" /note=Original Glimmer call @bp 112702 has strength 5.05; Genemark calls start at 112702 /note=SSC: Start = 112702, Stop = 112986. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.06 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 285 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: WhiB family transcription factor. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 209, Function= function unknown, EValue= 1.0E-52. NCBIBLAST= PhageName= WhiB family transcription factor [Rhodococcus phage NiceHouse], Coverage= 97.8723, SubjectRange= 7:98, QueryRange= 7:93, EValue= 8.66069E-21. HHPRED= Accession= 6ONO_A, Description= Transcription regulator WhiB1; Iron-sulfur cluster, transcription regulation, redox-sensing, TRANSCRIPTION; HET: PEG, SF4, MSE; 1.85A {Mycobacterium tuberculosis H37Rv}, Probability= 99.8. Coverage= 74.4681, SubjectRange= 1:74, QueryRange= 1:72. CDD= Accession= pfam02467, Coverage= 61.7021, SubjectRange= 1:58, QueryRange= 1:61, EValue= 1.60393E-6. 
/note=Does not contain the most annotated start, but called 100% of the time when present. Coding potential is good; however, there is a gap. Changing the start would remove the gap, but the scores would get significantly worse. For this reason, we kept the selected start. HHPred has significant hits. NCBI Blast yielded only good e-values. NiceHouse also called the same function. Function was present in a conserved domain with a good e-value. CDS 112967 - 113269 /gene="212" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_212" /note=Original Glimmer call @bp 112967 has strength 4.18; Genemark calls start at 112967 /note=SSC: Start = 112967, Stop = 113269. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.039 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 303 bp is the longest possible ORF. GAP: -20 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 219, Function= function unknown, EValue= 2.0E-11. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_219 [Rhodococcus phage NiceHouse], Coverage= 76.0, SubjectRange= 15:87, QueryRange= 15:100, EValue= 7.83792E-12. HHPRED= Accession= PF12875.11, Description= DUF3826 ; Protein of unknown function (DUF3826), Probability= 62.9. Coverage= 37.0, SubjectRange= 111:146, QueryRange= 111:80. CDD= . /note=Went with suggested start, CP is good. Evidence presents itself as hypothetical protein CDS 113283 - 113450 /gene="213" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_213" /note=Original Glimmer call @bp 113262 has strength 6.38; Genemark calls start at 113262 /note=SSC: Start = 113283, Stop = 113450. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.131 is the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. LO: 168 bp is not the longest possible ORF. GAP: 13 bp. ST: SS=NA. F: Hypothetical Protein.
FS: PHDBLAST= PhageName= NiceHouse, ProteinNumber= 220, Function= function unknown, EValue= 0.56. NCBIBLAST= . HHPRED= Accession= 7Q21_V, Description= Actinobacterial supercomplex, subunit C (AscC); MEMBRANE PROTEIN, CRYO-EM, RESPIRATORY SUPERCOMPLEX, ACTINOBACTERIA, ELECTRON TRANSPORT; HET: 9XX, TWT, MQ9, HEC, TRD, HAS, PLM, 9YF, HEM, CDL, 7PH; 3.0A {Corynebacterium glutamicum ATCC 13032}, Probability= 64.3. Coverage= 29.0909, SubjectRange= 48:66, QueryRange= 48:32. CDD= . /note=HHPred has no significant hits. NCBI Blast yielded no results. Start was changed to eliminate the gap between genes and improve final/z-scores. Coding potential of the selected start is good. After changing the start, there are no significant hits for functions. CDS 113530 - 113889 /gene="214" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_214" /note=Original Glimmer call @bp 113530 has strength 8.96; Genemark calls start at 113530 /note=SSC: Start = 113530, Stop = 113889. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.973 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 360 bp is the longest possible ORF. GAP: 79 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Trina, ProteinNumber= 225, Function= function unknown, EValue= 3.0E-18. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_221 [Rhodococcus phage NiceHouse], Coverage= 76.4706, SubjectRange= 1:91, QueryRange= 1:91, EValue= 2.1022E-19. HHPRED= Accession= PF10828.12, Description= DUF2570 ; Protein of unknown function (DUF2570), Probability= 97.9. Coverage= 83.1933, SubjectRange= 3:97, QueryRange= 3:108. CDD= . /note=Went with suggested start, CP is good. 
Evidence presents itself as hypothetical protein CDS 113879 - 114034 /gene="215" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_215" /note=Original Glimmer call @bp 113879 has strength 6.49; Genemark calls start at 113879 /note=SSC: Start = 113879, Stop = 114034. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.12 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 156 bp is the longest possible ORF. GAP: -11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 213, Function= function unknown, EValue= 2.0E-23. NCBIBLAST= . HHPRED= Accession= cd22266, Description= AcrIE1; Anti-CRISPR type I subtype E1. AcrIE1 (also known as AcrE1) is an anti-CRISPR (Acr) protein which binds as a homodimer to and inactivates the CRISPR-associated helicase/nuclease Cas3 protein., Probability= 20.1. Coverage= 23.5294, SubjectRange= 11:23, QueryRange= 11:24. CDD= . /note=Does not contain the most annotated start, but called 100% of the time when present. Coding potential is good and the selected start has good final/z-scores. HHPred has no significant hits. NCBI Blast yielded no results. CDS 114027 - 114284 /gene="216" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_216" /note=Original Glimmer call @bp 114027 has strength 6.36; Genemark calls start at 114027 /note=SSC: Start = 114027, Stop = 114284. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.2 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 258 bp is not the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=The start has good CP and RBS scores. There are no significant hits for function in NCBI or HHPred. 
CDS 114470 - 114946 /gene="217" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_217" /note=Original Glimmer call @bp 114470 has strength 4.48; Genemark calls start at 114536 /note=SSC: Start = 114470, Stop = 114946. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.846 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 477 bp is the longest possible ORF. GAP: 185 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 216, Function= function unknown, EValue= 1.0E-90. NCBIBLAST= PhageName= hypothetical protein FDI69_gp177 [Rhodococcus phage Trina] >gb|ASZ75009.1| hypothetical protein SEA_TRINA_230 [Rhodococcus phage Trina], Coverage= 56.962, SubjectRange= 16:87, QueryRange= 16:105, EValue= 5.83795E-6. HHPRED= Accession= PF07093.15, Description= SGT1 ; SGT1 protein, Probability= 23.2. Coverage= 32.2785, SubjectRange= 109:151, QueryRange= 109:54. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is good and the selected start has good final/z-scores. HHPred has no significant hits. NCBI Blast yielded no results. CDS 114957 - 115286 /gene="218" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_218" /note=Original Glimmer call @bp 114957 has strength 9.51; Genemark calls start at 115014 /note=SSC: Start = 114957, Stop = 115286. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 330 bp is the longest possible ORF. GAP: 10 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 217, Function= function unknown, EValue= 3.0E-59. NCBIBLAST= . HHPRED= Accession= PF20715.1, Description= DUF6827 ; Domain of unknown function (DUF6827), Probability= 50.2. 
Coverage= 49.5413, SubjectRange= 2:58, QueryRange= 2:95. CDD= . /note=went with suggested start; CP is good. Evidence is presented as hypothetical protein CDS 115286 - 115912 /gene="219" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_219" /note=Original Glimmer call @bp 115286 has strength 9.67; Genemark calls start at 115286 /note=SSC: Start = 115286, Stop = 115912. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 627 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 218, Function= function unknown, EValue= 3.0E-93. NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_151 [Rhodococcus phage NiceHouse], Coverage= 86.0577, SubjectRange= 1:170, QueryRange= 1:179, EValue= 2.67474E-13. HHPRED= Accession= 3ZG9_A, Description= PENICILLIN-BINDING PROTEIN 4; PENICILLIN-BINDING PROTEIN; HET: GOL, DXF; 1.804A {LISTERIA MONOCYTOGENES}, Probability= 60.7. Coverage= 11.0577, SubjectRange= 24:47, QueryRange= 24:90. CDD= . /note=HHPred has no significant hits. NCBI Blast yielded no results. CDS 115905 - 116093 /gene="220" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_220" /note=Original Glimmer call @bp 115905 has strength 6.63; Genemark calls start at 115905 /note=SSC: Start = 115905, Stop = 116093. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.857 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 189 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 219, Function= function unknown, EValue= 2.0E-26. NCBIBLAST= . HHPRED= Accession= cd16250, Description= EFh_DTNB; EF-hand-like motif found in beta-dystrobrevin. 
Beta-dystrobrevin, also termed dystrobrevin beta (DTN-B), is a dystrophin-related protein that is restricted to non-muscle tissues and is abundantly expressed in brain, lung, kidney, and liver., Probability= 84.2. Coverage= 82.2581, SubjectRange= 49:100, QueryRange= 49:60. CDD= . /note=Good scores, coding potential, and longest ORF. No significant hits for any particular function. CDS 116109 - 116390 /gene="221" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_221" /note=Original Glimmer call @bp 116265 has strength 4.14; Genemark calls start at 116109 /note=SSC: Start = 116109, Stop = 116390. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.075 is not the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 282 bp is the longest possible ORF. GAP: 15 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Changed start to capture all CP, even though RBS scores are not good. After start change, membrane domains are found. Same start is found in Dorin. CDS 116394 - 116876 /gene="222" /product="DprA-like DNA processing chain A" /function="DprA-like DNA processing chain A" /locus tag="Francesca_222" /note=Original Glimmer call @bp 116394 has strength 5.76; Genemark calls start at 116394 /note=SSC: Start = 116394, Stop = 116876. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.342 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 483 bp is the longest possible ORF. GAP: 3 bp. ST: SS=NA. F: DprA-like DNA processing chain A. FS: PHDBLAST= PhageName= Appletree2, ProteinNumber= 84, Function= function unknown, EValue= 4.0E-19. NCBIBLAST= PhageName= hypothetical protein SEA_APPLETREE2_84 [Mycobacterium phage Appletree2], Coverage= 98.125, SubjectRange= 3:134, QueryRange= 3:158, EValue= 8.88106E-21. 
HHPRED= Accession= SCOP_d2nx2a1, Description= c.129.1.2 (A:2-177) Hypothetical protein YpsA {Bacillus subtilis [TaxId: 1423]} | CLASS: Alpha and beta proteins (a/b), FOLD: MCP/YpsA-like, SUPFAM: MCP/YpsA-like, FAM: YpsA-like, Probability= 99.6. Coverage= 98.75, SubjectRange= 1:166, QueryRange= 1:159. CDD= . /note=The chosen start has the best RBS scores with a reasonable gap. Start also has good CP. HHPred hits to YpsA proteins but also DPR DNA processing chain proteins, which is an approved function. CDS 116886 - 117149 /gene="223" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_223" /note=Original Glimmer call @bp 116886 has strength 6.79; Genemark calls start at 116904 /note=SSC: Start = 116886, Stop = 117149. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.788 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 264 bp is the longest possible ORF. GAP: 9 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 222, Function= function unknown, EValue= 1.0E-42. NCBIBLAST= . HHPRED= Accession= 6QX9_A1, Description= Splicing factor 3A subunit 1,Splicing factor 3A subunit 1,Splicing factor 3A subunit 1; RNP complex, splicing, RNA, protein, spliceosome; HET: IHP, M7M, GTP; 3.28A {Homo sapiens}, Probability= 63.8. Coverage= 89.6552, SubjectRange= 159:257, QueryRange= 159:85. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is good and the selected start has good final/z-scores. HHPred has no significant hits. NCBI Blast yielded no results. CDS 117149 - 117322 /gene="224" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_224" /note=Original Glimmer call @bp 117149 has strength 4.36; Genemark calls start at 117149 /note=SSC: Start = 117149, Stop = 117322. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 2.746 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 174 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 223, Function= function unknown, EValue= 3.0E-27. NCBIBLAST= . HHPRED= Accession= SCOP_d2fm9a1, Description= a.257.1.1 (A:49-263) Cell invasion protein SipA, N-terminal domain {Salmonella typhimurium [TaxId: 90371]} | CLASS: All alpha proteins, FOLD: SipA N-terminal domain-like, SUPFAM: SipA N-terminal domain-like, FAM: SipA N-terminal domain-like, Probability= 84.3. Coverage= 28.0702, SubjectRange= 197:213, QueryRange= 197:23. CDD= . /note=Has good scores, good coding potential, small overlap and longest ORF. No hits for any particular function. CDS 117346 - 117471 /gene="225" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_225" /note=Original Glimmer call @bp 117346 has strength 2.11 /note=SSC: Start = 117346, Stop = 117471. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.638 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 126 bp is not the longest possible ORF. GAP: 23 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 224, Function= function unknown, EValue= 1.0E-20. NCBIBLAST= . HHPRED= Accession= cd16819, Description= SP-RING_PIAS2; SP-RING finger found in protein inhibitor of activated STAT protein 2 (PIAS2) and similar proteins., Probability= 84.4. Coverage= 21.9512, SubjectRange= 40:49, QueryRange= 40:36. CDD= . /note=Called 100% of the time when present and shares this start with Dorin. Coding potential is okay but uneven (frequent inclines and declines); the selected start has good final/z-scores. HHPred has no significant hits. NCBI Blast yielded no results.
CDS 117518 - 117676 /gene="226" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_226" /note=Original Glimmer call @bp 117518 has strength 4.01 /note=SSC: Start = 117518, Stop = 117676. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.105 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 159 bp is the longest possible ORF. GAP: 46 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 225, Function= function unknown, EValue= 3.0E-24. NCBIBLAST= . HHPRED= Accession= PF06785.15, Description= UPF0242 ; Uncharacterised protein family (UPF0242) N-terminus, Probability= 52.6. Coverage= 30.7692, SubjectRange= 1:17, QueryRange= 1:42. CDD= . /note=Has good scores. The coding potential is not great, but we don't think the gene should be deleted since it is also called in Dorin. It has no hits for any particular function. CDS 117737 - 118456 /gene="227" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_227" /note=Original Glimmer call @bp 117737 has strength 8.49; Genemark calls start at 117737 /note=SSC: Start = 117737, Stop = 118456. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.596 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 720 bp is the longest possible ORF. GAP: 60 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 226, Function= function unknown, EValue= 1.0E-140. NCBIBLAST= . HHPRED= Accession= 3IOX_A, Description= AgI/II; alpha helix, PPII helix, supersandwich fold, surface adhesin, Cell wall, Peptidoglycan-anchor, CELL ADHESION; HET: PMS; 1.8A {Streptococcus mutans}, Probability= 25.0. Coverage= 12.1339, SubjectRange= 447:476, QueryRange= 447:153. CDD= . /note=Called 100% of the time when present and shares this start with Dorin.
Coding potential is good and the selected start has good final/z-scores. HHPred has no significant hits. NCBI Blast yielded no results. CDS 118468 - 118689 /gene="228" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_228" /note=Original Glimmer call @bp 118468 has strength 7.6; Genemark calls start at 118468 /note=SSC: Start = 118468, Stop = 118689. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 222 bp is not the longest possible ORF. GAP: 11 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Muntaha, ProteinNumber= 190, Function= function unknown, EValue= 0.96. NCBIBLAST= . HHPRED= Accession= PF14584.10, Description= DUF4446 ; Protein of unknown function (DUF4446), Probability= 84.4. Coverage= 39.726, SubjectRange= 84:114, QueryRange= 84:51. CDD= . /note=Went with suggested start; CP is good. No hits on NCBI Blast and evidence presents itself as hypothetical protein. CDS 118698 - 119183 /gene="229" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_229" /note=Original Glimmer call @bp 118698 has strength 10.78; Genemark calls start at 118698 /note=SSC: Start = 118698, Stop = 119183. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 486 bp is the longest possible ORF. GAP: 8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 228, Function= function unknown, EValue= 7.0E-93. NCBIBLAST= . HHPRED= Accession= cd16400, Description= ParB_Srx_like_nuclease; ParB/Srx_like nuclease and putative transcriptional regulators related to SbnI.
This family contains a Pyrococcus Furiosus enzyme reported to have DNA nuclease activity and resembles the N-terminal domain of ParB proteins of the parABS bacterial chromosome partitioning system., Probability= 96.3. Coverage= 42.8571, SubjectRange= 19:70, QueryRange= 19:136. CDD= . /note=HHPred has significant hits. NCBI Blast yielded no results. CDS 119183 - 119326 /gene="230" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_230" /note=Original Glimmer call @bp 119183 has strength 8.84; Genemark calls start at 119183 /note=SSC: Start = 119183, Stop = 119326. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 144 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 229, Function= function unknown, EValue= 4.0E-21. NCBIBLAST= . HHPRED= Accession= PF21184.1, Description= HAT1_C_fung ; Fungal HAT1, C-terminal, Probability= 70.7. Coverage= 25.5319, SubjectRange= 11:23, QueryRange= 11:22. CDD= . /note=The chosen start has the best RBS scores, a reasonable gap, and good CP. There were no significant hits in HHPred or NCBI Blast. CDS 119319 - 119531 /gene="231" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_231" /note=Original Glimmer call @bp 119319 has strength 8.35; Genemark calls start at 119319 /note=SSC: Start = 119319, Stop = 119531. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.096 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 213 bp is the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 230, Function= function unknown, EValue= 2.0E-33. NCBIBLAST= . 
HHPRED= Accession= 2JS5_A, Description= Uncharacterized protein; homodimer, protein structure, NMR spectroscopy, Structural Genomics, PSI-2, Protein Structure Initiative, Northeast Structural Genomics Consortium, NESG; NMR {Methylococcus capsulatus}, Probability= 30.7. Coverage= 34.2857, SubjectRange= 2:26, QueryRange= 2:25. CDD= . /note=Went with suggested start; CP is good. No hits on NCBI blasts and evidence presents itself as hypothetical protein CDS 119534 - 119746 /gene="232" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_232" /note=Original Glimmer call @bp 119534 has strength 5.45; Genemark calls start at 119534 /note=SSC: Start = 119534, Stop = 119746. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.812 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 213 bp is the longest possible ORF. GAP: 2 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 231, Function= function unknown, EValue= 6.0E-35. NCBIBLAST= . HHPRED= Accession= PF18956.4, Description= DUF5699 ; Family of unknown function (DUF5699), Probability= 81.6. Coverage= 22.8571, SubjectRange= 32:48, QueryRange= 32:62. CDD= . /note=Went with suggested start; CP is good. No hits on NCBI Blast and evidence presents itself as hypothetical protein. CDS 119746 - 120438 /gene="233" /product="DNA methyltransferase" /function="DNA methyltransferase" /locus tag="Francesca_233" /note=Original Glimmer call @bp 119746 has strength 6.08; Genemark calls start at 119746 /note=SSC: Start = 119746, Stop = 120438. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.976 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 693 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: DNA methyltransferase. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 232, Function= function unknown, EValue= 1.0E-135. 
NCBIBLAST= PhageName= DNA methyltransferase [Mycobacterium phage Ruotula], Coverage= 97.3913, SubjectRange= 3:210, QueryRange= 3:224, EValue= 6.82227E-94. HHPRED= Accession= SCOP_d3ubta_, Description= c.66.1.26 (A:) automated matches {Haemophilus aegyptius [TaxId: 197575]} | CLASS: Alpha and beta proteins (a/b), FOLD: S-adenosyl-L-methionine-dependent methyltransferases, SUPFAM: S-adenosyl-L-methionine-dependent methyltransferases, FAM: C5 cytosine-specific DNA methylase, DCM, Probability= 99.6. Coverage= 56.087, SubjectRange= 1:160, QueryRange= 1:133. CDD= Accession= COG0270, Coverage= 46.087, SubjectRange= 1:119, QueryRange= 1:106, EValue= 1.79818E-14. /note=Has good scores for the start, it is the longest ORF, has good coding potential, and a smaller overlap. It has significant hits for DNA methyltransferase in the HHPred, NCBI blast, as well as the list of similar phages and conserved domains. CDS 120438 - 120716 /gene="234" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_234" /note=Original Glimmer call @bp 120438 has strength 3.67; Genemark calls start at 120498 /note=SSC: Start = 120438, Stop = 120716. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.215 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 279 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= SadLad, ProteinNumber= 22, Function= terminase, EValue= 0.065. NCBIBLAST= . HHPRED= Accession= 7SPP_C, Description= VNAR 2C02; RBD, VIRAL PROTEIN, VNAR, VIRAL PROTEIN-IMMUNE SYSTEM complex; HET: NAG, EDO; 1.96A {Severe acute respiratory syndrome coronavirus 2}, Probability= 78.8. Coverage= 16.3043, SubjectRange= 100:115, QueryRange= 100:60. CDD= . /note=Kept start. Best rbs scores that cover all cp. /note=No clear function call. Possibly terminase subunit. 
CDS 120722 - 120871 /gene="235" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_235" /note=Original Glimmer call @bp 120722 has strength 9.83; Genemark calls start at 120722 /note=SSC: Start = 120722, Stop = 120871. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 150 bp is not the longest possible ORF. GAP: 5 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 7AR7_c, Description= Transmembrane protein; Complex-I Arabidopsis, ELECTRON TRANSPORT; HET: T7X, 8Q1, UQ9, PTY, LMN, FMN, NDP, SF4, PC7, PSF, PGT;{Arabidopsis thaliana}, Probability= 80.0. Coverage= 59.1837, SubjectRange= 17:46, QueryRange= 17:41. CDD= . /note=Kept start. Best rbs scores and covers all cp. /note=No clear function call. Phamerator shows a transmembrane domain from amino acid position 14-34. HHPred suggests something with electron transport but the E value is terrible. CDS 120878 - 121084 /gene="236" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_236" /note=Original Glimmer call @bp 120878 has strength 14.11; Genemark calls start at 120878 /note=SSC: Start = 120878, Stop = 121084. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.357 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 207 bp is the longest possible ORF. GAP: 6 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= 6DM9_C, Description= DHD15_extended_A; Computational design, heterodimer, coiled-coil, DE NOVO PROTEIN; HET: FME; 2.25A {synthetic construct}, Probability= 78.2. Coverage= 48.5294, SubjectRange= 19:52, QueryRange= 19:40. CDD= . /note=Kept start. Covers as much cp as possible with good rbs scores. Previous stop codon cuts off about 6 bases. /note=No clear function call. 
CDS 121378 - 121857 /gene="237" /product="SprT-like protease" /function="SprT-like protease" /locus tag="Francesca_237" /note=Genemark calls start at 121378 /note=SSC: Start = 121378, Stop = 121857. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 480 bp is the longest possible ORF. GAP: 293 bp. ST: SS=NA. F: SprT-like protease. FS: PHDBLAST= PhageName= SirPhilip, ProteinNumber= 90, Function= SprT-like protease, EValue= 3.0E-33. NCBIBLAST= PhageName= SprT-like protease [Mycobacterium phage SirPhilip] >gb|ASR85292.1| SprT-like protease [Mycobacterium phage SirPhilip], Coverage= 85.5346, SubjectRange= 45:175, QueryRange= 45:136, EValue= 7.59587E-36. HHPRED= Accession= 6MDW_A, Description= SprT-like domain-containing protein Spartan; DPC repair protease, DNA BINDING PROTEIN; HET: FLC, MLZ, ADP; 1.5A {Homo sapiens}, Probability= 99.8. Coverage= 89.3082, SubjectRange= 22:192, QueryRange= 22:147. CDD= Accession= pfam10263, Coverage= 69.8113, SubjectRange= 4:131, QueryRange= 4:125, EValue= 2.38213E-11. /note=Kept start. Best rbs scores, covers all cp. /note=Starterator: Francesca has start 64, which is called about 13% of the time when present, but we decided not to change to start 64 because it would have worse rbs scores. In this pham there is a lot of variability in which start is called, and the chosen start seems dependent on the cluster. Since we are forming our own cluster, this supports choosing a start that is not commonly called. /note=Strong evidence that this is a SprT-like protease: PhagesDB, HHPred, and NCBI BLAST all have at least good hits, with the top HHPred hit being excellent (99.8 probability, 89% coverage, 3.5E-19). CDS 121867 - 122175 /gene="238" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_238" /note=Genemark calls start at 121867 /note=SSC: Start = 121867, Stop = 122175. (Forward).
CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 309 bp is the longest possible ORF. GAP: 9 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein SEA_NICEHOUSE_245 [Rhodococcus phage NiceHouse], Coverage= 94.1176, SubjectRange= 2:97, QueryRange= 2:99, EValue= 1.45402E-30. HHPRED= Accession= 1KRX_A, Description= NITROGEN REGULATION PROTEIN NR(I); two component signal transduction, receiver domain, BeF3, phosphorylation, Bacterial nitrogen regulatory protein, SIGNALING PROTEIN; HET: BEF; NMR {Salmonella typhimurium} SCOP: c.23.1.1, Probability= 98.7. Coverage= 94.1176, SubjectRange= 4:107, QueryRange= 4:98. CDD= . /note=Kept start. Only option for start that covers all possible cp, best rbs scores. /note=No clear function call. Lots of high probability hhpred hits but none are listed on the approved function list. /note=-Come back to check for function: good hits to "phosphorylation" and "regulator" and "transcription factor". CDS complement (121246 - 122178) /gene="239" /product="nothingburger" /function="nothingburger" /locus tag="Francesca_239" /note=Original Glimmer call @bp 122178 has strength 4.04 /note=SSC: Start = 122178, Stop = 121246. (Reverse). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.91 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 933 bp is the longest possible ORF. GAP: 58 bp. ST: SS=NA. F: nothingburger. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Kept start. Almost no coding potential, and completely overlaps two forward genes. /note=Should probably delete this gene. CDS 122237 - 122809 /gene="240" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_240" /note=Original Glimmer call @bp 122237 has strength 6.3; Genemark calls start at 122237 /note=SSC: Start = 122237, Stop = 122809. (Forward). 
CP: Does not contain all GeneMarkHost capacity. SD: ZScore 3.138 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 573 bp is not the longest possible ORF. GAP: 58 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Gibbous, ProteinNumber= 54, Function= RecA-like DNA recombinase, EValue= 0.14. NCBIBLAST= PhageName= hypothetical protein [Pseudonocardia sp. C8] >gb|MBC3189465.1| hypothetical protein [Pseudonocardia sp. C8], Coverage= 93.6842, SubjectRange= 23:192, QueryRange= 23:183, EValue= 8.1267E-28. HHPRED= Accession= PF09629.14, Description= YorP ; YorP protein, Probability= 90.9. Coverage= 36.3158, SubjectRange= 5:67, QueryRange= 5:159. CDD= . /note=Kept start. Best rbs scores and covers the most cp. Small cp cutoff at start. /note=No clear function call, but there is a moderate HHPred hit to YorP protein (probability 90.9, coverage 36%) and a weak PhagesDB hit to RecA-like DNA recombinase (EValue 0.14). CDS 122945 - 123214 /gene="241" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_241" /note=Original Glimmer call @bp 122945 has strength 5.27; Genemark calls start at 122945 /note=SSC: Start = 122945, Stop = 123214. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.795 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 270 bp is the longest possible ORF. GAP: 135 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= PhageName= hypothetical protein KHQ85_gp017 [Gordonia phage Skog] >gb|QIG58169.1| hypothetical protein SEA_SKOG_17 [Gordonia phage Skog], Coverage= 100.0, SubjectRange= 3:98, QueryRange= 3:89, EValue= 5.91598E-11. HHPRED= . CDD= . /note=Kept start. Only start that covers all cp, okay rbs scores. This start is the most annotated start. /note=No clear function call.
CDS 123216 - 123614 /gene="242" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_242" /note=Original Glimmer call @bp 123216 has strength 8.44; Genemark calls start at 123216 /note=SSC: Start = 123216, Stop = 123614. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.223 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 399 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF19698.3, Description= DUF6197 ; Family of unknown function (DUF6197), Probability= 99.8. Coverage= 91.6667, SubjectRange= 5:139, QueryRange= 5:131. CDD= . /note=Kept start. Some cp cutoff but this start covers the most cp possible. /note=No clear function call. CDS 123673 - 123897 /gene="243" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_243" /note=Original Glimmer call @bp 123673 has strength 8.06; Genemark calls start at 123673 /note=SSC: Start = 123673, Stop = 123897. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.377 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 225 bp is the longest possible ORF. GAP: 58 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF07494.15, Description= Reg_prop ; Two component regulator propeller, Probability= 87.3. Coverage= 18.9189, SubjectRange= 10:24, QueryRange= 10:29. CDD= . /note=Kept start. Great and best rbs scores, covers all cp possible (previous stop codon cuts it off). /note=No clear function call. CDS 123899 - 124078 /gene="244" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_244" /note=Original Glimmer call @bp 123899 has strength 3.25; Genemark calls start at 123899 /note=SSC: Start = 123899, Stop = 124078. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 2.617 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 180 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF18067.5, Description= Lipase_C ; Lipase C-terminal domain, Probability= 88.9. Coverage= 33.8983, SubjectRange= 19:39, QueryRange= 19:23. CDD= . /note=Kept start. Good rbs scores and covers all cp possible. Loses some cp in the beginning, but there would be overlap and there`s a stop codon cutting it off. CDS 124136 - 124378 /gene="245" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_245" /note=Original Glimmer call @bp 124136 has strength 6.66; Genemark calls start at 124136 /note=SSC: Start = 124136, Stop = 124378. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 243 bp is the longest possible ORF. GAP: 57 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= Accession= PF04695.17, Description= Pex14_N ; Pex14 N-terminal domain, Probability= 74.7. Coverage= 22.5, SubjectRange= 27:45, QueryRange= 27:63. CDD= . /note=Kept start, best rbs scores, covers all cp. /note=No clear function call. CDS 124371 - 124568 /gene="246" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_246" /note=Original Glimmer call @bp 124371 has strength 4.71; Genemark calls start at 124392 /note=SSC: Start = 124371, Stop = 124568. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.215 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 198 bp is not the longest possible ORF. GAP: -8 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 245, Function= function unknown, EValue= 1.0E-27. NCBIBLAST= . HHPRED= . 
CDD= Accession= pfam14155, Coverage= 49.2308, SubjectRange= 1:32, QueryRange= 1:40, EValue= 3.54627E-5. /note=Start chosen for covering coding potential, having good z and f scores with little overlap, and being the most called start. CDS 124572 - 124799 /gene="247" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_247" /note=Original Glimmer call @bp 124572 has strength 8.28; Genemark calls start at 124572 /note=SSC: Start = 124572, Stop = 124799. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.944 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 228 bp is the longest possible ORF. GAP: 3 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 246, Function= function unknown, EValue= 2.0E-37. NCBIBLAST= . HHPRED= . CDD= . /note=Start selected for being most called, covering coding potential, having good z and f scores, and being the longest ORF CDS 124796 - 125230 /gene="248" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_248" /note=Original Glimmer call @bp 124796 has strength 3.76; Genemark calls start at 124796 /note=SSC: Start = 124796, Stop = 125230. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.533 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 435 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 247, Function= function unknown, EValue= 1.0E-84. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for being most called and covering all coding potential. CDS 125274 - 125432 /gene="249" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_249" /note=Original Glimmer call @bp 125274 has strength 5.09; Genemark calls start at 125262 /note=SSC: Start = 125274, Stop = 125432. (Forward). 
CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.618 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 159 bp is not the longest possible ORF. GAP: 43 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 248, Function= function unknown, EValue= 1.0E-23. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for being most called, covering most of the coding potential, and having no overlap. CDS 125449 - 125772 /gene="250" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_250" /note=Original Glimmer call @bp 125449 has strength 4.47; Genemark calls start at 125449 /note=SSC: Start = 125449, Stop = 125772. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.955 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 324 bp is the longest possible ORF. GAP: 16 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 249, Function= function unknown, EValue= 1.0E-60. NCBIBLAST= . HHPRED= Accession= 7WHG_G, Description= Lokiarchaeota gelsolin (2DGel); Asgard, gelsolin, actin, filament, STRUCTURAL PROTEIN; HET: HIC, ADP; 3.25A {Oryctolagus cuniculus}, Probability= 97.7. Coverage= 54.2056, SubjectRange= 288:334, QueryRange= 288:93. CDD= . /note=Start chosen for being most called and covering all coding potential. /note= /note=Some HHPred evidence suggests this may be a DNA structural protein--more research needed CDS 125781 - 126008 /gene="251" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_251" /note=Original Glimmer call @bp 125781 has strength 6.22; Genemark calls start at 125781 /note=SSC: Start = 125781, Stop = 126008. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 228 bp is the longest possible ORF. 
GAP: 8 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 250, Function= function unknown, EValue= 6.0E-38. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for covering all coding potential and being most called. CDS 126079 - 126390 /gene="252" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_252" /note=Original Glimmer call @bp 126079 has strength 5.72; Genemark calls start at 126079 /note=SSC: Start = 126079, Stop = 126390. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 312 bp is not the longest possible ORF. GAP: 70 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 251, Function= function unknown, EValue= 6.0E-54. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for being most called and covering all coding potential. CDS 126495 - 126641 /gene="253" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_253" /note=Original Glimmer call @bp 126495 has strength 6.49; Genemark calls start at 126495 /note=SSC: Start = 126495, Stop = 126641. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.217 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 147 bp is not the longest possible ORF. GAP: 104 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 252, Function= function unknown, EValue= 2.0E-21. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for being most called, decent z and f scores, and calling most of the coding potential. CDS 126657 - 126830 /gene="254" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_254" /note=Original Glimmer call @bp 126657 has strength 12.45; Genemark calls start at 126657 /note=SSC: Start = 126657, Stop = 126830. (Forward). 
CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.217 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 174 bp is the longest possible ORF. GAP: 15 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 253, Function= function unknown, EValue= 4.0E-26. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for good z and f scores, being the longest ORF, and covering all coding potential. CDS 126841 - 127065 /gene="255" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_255" /note=Original Glimmer call @bp 126841 has strength 14.08; Genemark calls start at 126841 /note=SSC: Start = 126841, Stop = 127065. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 225 bp is not the longest possible ORF. GAP: 10 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 254, Function= function unknown, EValue= 4.0E-34. NCBIBLAST= . HHPRED= . CDD= . /note=Chosen for good z and f scores, being the most called, and covering all coding potential. CDS 127065 - 127511 /gene="256" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_256" /note=Original Glimmer call @bp 127065 has strength 5.95; Genemark calls start at 127065 /note=SSC: Start = 127065, Stop = 127511. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.686 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 447 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 255, Function= function unknown, EValue= 1.0E-81. 
NCBIBLAST= PhageName= hypothetical protein EVB79_052 [Rhizobium phage RHph_N3_13] >gb|QIG69878.1| hypothetical protein F67_I3_11_052 [Rhizobium phage RHph_I3_11], Coverage= 87.8378, SubjectRange= 12:136, QueryRange= 12:141, EValue= 7.32582E-12. HHPRED= . CDD= . /note=Start chosen for good z and f scores, being the longest ORF, and covering all coding potential. CDS 127514 - 127687 /gene="257" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_257" /note=Original Glimmer call @bp 127514 has strength 3.05; Genemark calls start at 127514 /note=SSC: Start = 127514, Stop = 127687. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.925 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 174 bp is not the longest possible ORF. GAP: 2 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 256, Function= function unknown, EValue= 1.0E-25. NCBIBLAST= . HHPRED= . CDD= . /note=Start has good z and f scores and covers all coding potential CDS 127684 - 127974 /gene="258" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_258" /note=Original Glimmer call @bp 127684 has strength 4.61; Genemark calls start at 127684 /note=SSC: Start = 127684, Stop = 127974. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.944 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 291 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 257, Function= function unknown, EValue= 4.0E-52. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen for good coding potential, good z and f scores, and being the longest ORF. 
CDS 128076 - 128237 /gene="259" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_259" /note=Original Glimmer call @bp 128076 has strength 6.62; Genemark calls start at 128076 /note=SSC: Start = 128076, Stop = 128237. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 3.219 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 162 bp is the longest possible ORF. GAP: 101 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 258, Function= function unknown, EValue= 8.0E-27. NCBIBLAST= PhageName= hypothetical protein L3Y19_gp079 [Gordonia phage Neville] >gb|AXQ64448.1| hypothetical protein SEA_NEVILLE_79 [Gordonia phage Neville], Coverage= 86.7924, SubjectRange= 7:52, QueryRange= 7:47, EValue= 1.53677E-11. HHPRED= . CDD= . /note=Start chosen because it`s the most called among the CG cluster, has good z and f scores, is the longest ORF, and covers most coding potential. CDS 128406 - 128813 /gene="260" /product="membrane protein" /function="membrane protein" /locus tag="Francesca_260" /note=Original Glimmer call @bp 128406 has strength 5.22; Genemark calls start at 128406 /note=SSC: Start = 128406, Stop = 128813. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 1.281 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 408 bp is not the longest possible ORF. GAP: 168 bp. ST: SS=NA. F: membrane protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 260, Function= function unknown, EValue= 9.0E-39. NCBIBLAST= . HHPRED= . CDD= . /note=Start covers all potential. Similarities in hypothetical protein to fellow CG phage Dorin CDS 128813 - 128998 /gene="261" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_261" /note=Original Glimmer call @bp 128813 has strength 6.8; Genemark calls start at 128813 /note=SSC: Start = 128813, Stop = 128998. 
(Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.388 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 186 bp is the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 261, Function= function unknown, EValue= 6.0E-32. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen because it is most called, has good z and f scores, is the longest ORF, and covers all coding potential. CDS 128964 - 129188 /gene="262" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_262" /note=Original Glimmer call @bp 128964 has strength 3.8 /note=SSC: Start = 128964, Stop = 129188. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.709 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 225 bp is the longest possible ORF. GAP: -35 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 262, Function= function unknown, EValue= 9.0E-41. NCBIBLAST= . HHPRED= . CDD= . /note=Kept start because it`s the longest ORF, captures all coding potential, and is called 100% of the time. /note= /note=Evidence indicates this is a shared hypothetical protein. CDS 129192 - 129557 /gene="263" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_263" /note=Original Glimmer call @bp 129192 has strength 7.67; Genemark calls start at 129192 /note=SSC: Start = 129192, Stop = 129557. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.673 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 366 bp is the longest possible ORF. GAP: 3 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 263, Function= function unknown, EValue= 3.0E-67. NCBIBLAST= PhageName= hypothetical protein [Streptomyces sp. 
CS081A] >gb|PVC73505.1| hypothetical protein DBP18_14255 [Streptomyces sp. CS081A], Coverage= 89.2562, SubjectRange= 57:156, QueryRange= 57:112, EValue= 2.71147E-10. HHPRED= Accession= PF19698.3, Description= DUF6197 ; Family of unknown function (DUF6197), Probability= 99.8. Coverage= 96.6942, SubjectRange= 8:140, QueryRange= 8:118. CDD= . /note=Start chosen because it has good z and f scores, is the longest ORF, and covers as much of the coding potential as it can. CDS 129532 - 129873 /gene="264" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_264" /note=Original Glimmer call @bp 129532 has strength 10.76; Genemark calls start at 129532 /note=SSC: Start = 129532, Stop = 129873. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.183 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 342 bp is the longest possible ORF. GAP: -26 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 264, Function= function unknown, EValue= 6.0E-62. NCBIBLAST= PhageName= hypothetical protein [Planosporangium mesophilum], Coverage= 96.4602, SubjectRange= 16:128, QueryRange= 16:111, EValue= 1.84134E-8. HHPRED= Accession= PF19698.3, Description= DUF6197 ; Family of unknown function (DUF6197), Probability= 99.9. Coverage= 99.115, SubjectRange= 7:136, QueryRange= 7:113. CDD= . /note=Chose start because it is the longest ORF with the most coding potential, and is the most frequently called among the CG cluster and all phages. CDS 129870 - 130199 /gene="265" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_265" /note=Original Glimmer call @bp 129870 has strength 9.73; Genemark calls start at 129870 /note=SSC: Start = 129870, Stop = 130199. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.223 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. 
LO: 330 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 265, Function= function unknown, EValue= 2.0E-63. NCBIBLAST= PhageName= hypothetical protein GOOTI_034_00110 [Gordonia otitidis NBRC 100426], Coverage= 70.6422, SubjectRange= 6:84, QueryRange= 6:81, EValue= 5.32865E-4. HHPRED= . CDD= . /note=Start chosen for good z and f scores, being the most called start, covering all coding potential, and being the longest ORF. CDS 130186 - 130470 /gene="266" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_266" /note=Original Glimmer call @bp 130186 has strength 8.79; Genemark calls start at 130186 /note=SSC: Start = 130186, Stop = 130470. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.787 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 285 bp is the longest possible ORF. GAP: -14 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 266, Function= function unknown, EValue= 2.0E-49. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen because it covers all coding potential, is called 100% of the time, and is the longest ORF. CDS 130470 - 130631 /gene="267" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_267" /note=Genemark calls start at 130470 /note=SSC: Start = 130470, Stop = 130631. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.134 is the highest start score. SCS: Start is not called by Glimmer and is called by Genemark. LO: 162 bp is not the longest possible ORF. GAP: -1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 267, Function= function unknown, EValue= 2.0E-26. NCBIBLAST= . HHPRED= . CDD= . /note=Start chosen because it is most called, covers all coding potential, and has good z and f scores. 
/note= /note=Some evidence suggests similarity to a DNA methylase, but more investigation will be required. CDS 130655 - 130876 /gene="268" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_268" /note=Original Glimmer call @bp 130655 has strength 3.72; Genemark calls start at 130655 /note=SSC: Start = 130655, Stop = 130876. (Forward). CP: Does not contain all GeneMarkHost capacity. SD: ZScore 2.673 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 222 bp is not the longest possible ORF. GAP: 23 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 268, Function= function unknown, EValue= 7.0E-42. NCBIBLAST= PhageName= hypothetical protein SEA_TROGGLEHUMPER_89 [Rhodococcus phage Trogglehumper], Coverage= 73.9726, SubjectRange= 1:68, QueryRange= 1:54, EValue= 1.95252E-4. HHPRED= . CDD= . /note=Start chosen because of its being called by both CG phages and capturing most of the coding potential. /note= /note=BLAST suggests a hypothetical protein, possibly related to non-CG phage Trogglehumper. CDS 130873 - 131199 /gene="269" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_269" /note=Original Glimmer call @bp 130873 has strength 9.45; Genemark calls start at 130873 /note=SSC: Start = 130873, Stop = 131199. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.014 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 327 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 269, Function= function unknown, EValue= 5.0E-60. NCBIBLAST= PhageName= hypothetical protein MARCHEWKA_03470 [Brevundimonas phage vB_BpoS-Marchewka], Coverage= 73.1481, SubjectRange= 17:96, QueryRange= 17:99, EValue= 1.18922E-4. 
HHPRED= Accession= PF19698.3, Description= DUF6197 ; Family of unknown function (DUF6197), Probability= 99.9. Coverage= 96.2963, SubjectRange= 6:141, QueryRange= 6:106. CDD= . /note=Start chosen for having good z and f scores, being the longest ORF, being the most called, and covering all coding potential. /note= /note=Substantial BLAST and HHPred evidence suggests this is a hypothetical protein. CDS 131201 - 131398 /gene="270" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_270" /note=Original Glimmer call @bp 131201 has strength 9.63; Genemark calls start at 131201 /note=SSC: Start = 131201, Stop = 131398. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.31 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 198 bp is the longest possible ORF. GAP: 1 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 270, Function= function unknown, EValue= 4.0E-31. NCBIBLAST= PhageName= type IIA topoisomerase [Bacillus phage Mgbh1] >gb|AMQ66727.1| type IIA topoisomerase [Bacillus phage Mgbh1], Coverage= 78.4615, SubjectRange= 3:58, QueryRange= 3:52, EValue= 1.51383E-5. HHPRED= . CDD= . /note=Start chosen because it is the longest ORF, is called 100% of the time, and captures all coding potential. /note= /note=HHPred and BLAST results suggest a hypothetical protein, perhaps a topoisomerase. CDS 131388 - 131597 /gene="271" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_271" /note=Original Glimmer call @bp 131388 has strength 9.6; Genemark calls start at 131388 /note=SSC: Start = 131388, Stop = 131597. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.23 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 210 bp is not the longest possible ORF. GAP: -11 bp. ST: SS=NA. F: Hypothetical Protein. 
FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 271, Function= function unknown, EValue= 4.0E-33. NCBIBLAST= . HHPRED= Accession= 2L02_B, Description= Uncharacterized protein; Structural Genomics, NORTHEAST STRUCTURAL GENOMICS CONSORTIUM (NESG), PSI-2, Protein Structure Initiative, Unknown function; NMR {Bacteroides thetaiotaomicron}, Probability= 97.1. Coverage= 92.7536, SubjectRange= 2:63, QueryRange= 2:65. CDD= . /note=Start chosen because it is the most called, has good z and final scores, and covers all coding potential. CDS 131621 - 131848 /gene="272" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_272" /note=Original Glimmer call @bp 131621 has strength 6.35; Genemark calls start at 131621 /note=SSC: Start = 131621, Stop = 131848. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.456 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 228 bp is the longest possible ORF. GAP: 23 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 272, Function= function unknown, EValue= 3.0E-38. NCBIBLAST= . HHPRED= . CDD= . /note=Selected start because it is longest ORF, covers all coding potential, and is called 100% of the time. /note= /note=Shares similar hypothetical protein with fellow CG phage Dorin, no clear function CDS 131835 - 132017 /gene="273" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_273" /note=Original Glimmer call @bp 131835 has strength 2.03; Genemark calls start at 131841 /note=SSC: Start = 131835, Stop = 132017. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.688 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 183 bp is not the longest possible ORF. GAP: -14 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 272, Function= function unknown, EValue= 2.0E-30. 
NCBIBLAST= . HHPRED= . CDD= . /note=good z score and will cover all CP /note=No clear function call CDS 132068 - 132520 /gene="274" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_274" /note=Original Glimmer call @bp 132068 has strength 10.03; Genemark calls start at 132068 /note=SSC: Start = 132068, Stop = 132520. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.944 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 453 bp is not the longest possible ORF. GAP: 50 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 274, Function= function unknown, EValue= 4.0E-81. NCBIBLAST= . HHPRED= . CDD= . /note=Chose start for good scores, being the most called, and covering coding potential. /note= /note=Shares PhagesDB BLAST results with fellow CG phage Dorin CDS 134588 - 134920 /gene="275" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_275" /note=Original Glimmer call @bp 134588 has strength 11.63; Genemark calls start at 134588 /note=SSC: Start = 134588, Stop = 134920. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.12 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 333 bp is the longest possible ORF. GAP: 2067 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 274, Function= function unknown, EValue= 3.0E-61. NCBIBLAST= . HHPRED= . CDD= . /note=decent z score and will cover all CP /note=start is called 100% of times when present /note=no good HHPRED hits and no NCBI hits CDS 134976 - 135224 /gene="276" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_276" /note=Original Glimmer call @bp 134976 has strength 0.87; Genemark calls start at 134976 /note=SSC: Start = 134976, Stop = 135224. (Forward). CP: Does not contain all GeneMarkHost capacity. 
SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 249 bp is not the longest possible ORF. GAP: 55 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Chose start because it is most called and covers most coding potential. /note=No clear function call CDS 135237 - 135623 /gene="277" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_277" /note=Original Glimmer call @bp 135237 has strength 1.77; Genemark calls start at 135231 /note=SSC: Start = 135237, Stop = 135623. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.12 is the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 387 bp is not the longest possible ORF. GAP: 12 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 276, Function= function unknown, EValue= 1.0E-11. NCBIBLAST= PhageName= hypothetical protein FDI69_gp002 [Rhodococcus phage Trina] >ref|YP_009615270.1| hypothetical protein FDI69_gp123 [Rhodococcus phage Trina] >gb|ASZ74821.1| hypothetical protein SEA_TRINA_2 [Rhodococcus phage Trina] >gb|ASZ75062.1| hypothetical protein SEA_TRINA_284 [Rhodococcus phage Trina], Coverage= 87.5, SubjectRange= 1:109, QueryRange= 1:122, EValue= 0.00956089. HHPRED= . CDD= . /note=Start chosen because it is the most called, covers all coding potential, and has good z and final scores. /note= /note=Evidence from BLAST results suggests this is a hypothetical protein. CDS 135726 - 135860 /gene="278" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_278" /note=Original Glimmer call @bp 135741 has strength 2.96; Genemark calls start at 135717 /note=SSC: Start = 135726, Stop = 135860. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.49 is the highest start score. SCS: Start is not called by Glimmer and is not called by Genemark. 
LO: 135 bp is not the longest possible ORF. GAP: 102 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Chose start because it covers all coding potential and is most called. /note=No clear function call CDS 135872 - 136606 /gene="279" /product="glycosylase" /function="glycosylase" /locus tag="Francesca_279" /note=Original Glimmer call @bp 135872 has strength 8.83; Genemark calls start at 135977 /note=SSC: Start = 135872, Stop = 136606. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.282 is not the highest start score. SCS: Start is called by Glimmer and is not called by Genemark. LO: 735 bp is the longest possible ORF. GAP: 11 bp. ST: SS=NA. F: glycosylase. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 5, Function= function unknown, EValue= 1.0E-138. NCBIBLAST= PhageName= glycosylase [Mycobacterium phage MK4], Coverage= 80.3279, SubjectRange= 1:191, QueryRange= 1:243, EValue= 3.46086E-51. HHPRED= Accession= 3FHG_A, Description= N-glycosylase/DNA lyase; ogg, helix-hairpin-helix, glycosylase, 8-oxoguanine, 8-oxoG, SsOGG, DNA damage, DNA repair, Glycosidase, Hydrolase, Lyase, Multifunctional enzyme, Nuclease; HET: SO4, GOL; 1.9A {Sulfolobus solfataricus}, Probability= 94.9. Coverage= 32.377, SubjectRange= 126:206, QueryRange= 126:244. CDD= . /note=good z score and covers all CP /note=good NCBI hits but there is also a good hit for glycosylase /note=Starterator calls start 6, but that start would shorten the gene and does not have as good a z score /note= /note=Called glycosylase based on synteny of the direct terminal repeat and the blast evidence slightly below the ncbi evidence selected. CDS 136685 - 137353 /gene="280" /product="Lsr2-like DNA bridging protein" /function="Lsr2-like DNA bridging protein" /locus tag="Francesca_280" /note=Original Glimmer call @bp 136685 has strength 12.92; Genemark calls start at 136685 /note=SSC: Start = 136685, Stop = 137353. (Forward). CP: Does contain all GeneMarkHost capacity. 
SD: ZScore 2.456 is not the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 669 bp is the longest possible ORF. GAP: 78 bp. ST: SS=NA. F: Lsr2-like DNA bridging protein. FS: PHDBLAST= PhageName= Dorin_Draft, ProteinNumber= 281, Function= function unknown, EValue= 1.0E-123. NCBIBLAST= . HHPRED= Accession= 2KNG_A, Description= Protein lsr2; DNA-binding domain, Immune response, DNA BINDING PROTEIN; NMR {Mycobacterium tuberculosis}, Probability= 95.7. Coverage= 15.3153, SubjectRange= 11:45, QueryRange= 11:119. CDD= Accession= pfam11774, Coverage= 25.2252, SubjectRange= 51:104, QueryRange= 51:115, EValue= 5.21058E-5. /note=Start chosen because it is the most called, it covers all coding potential, and is the longest ORF. /note= /note=Substantial BLAST and HHPred evidence suggests this may be an LSR2-like protein. Synteny with Dorin confirms this CDS 137420 - 137542 /gene="281" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_281" /note=Original Glimmer call @bp 137420 has strength 9.22; Genemark calls start at 137420 /note=SSC: Start = 137420, Stop = 137542. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 3.278 is the highest start score. SCS: Start is called by Glimmer and is called by Genemark. LO: 123 bp is the longest possible ORF. GAP: 66 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= PhageName= Francesca_Draft, ProteinNumber= 7, Function= function unknown, EValue= 3.0E-15. NCBIBLAST= . HHPRED= . CDD= . /note=decent z score and will cover all CP /note=start called 100% of times when present /note=no good HHPRED hits and no NCBI hits CDS 137539 - 137634 /gene="282" /product="Hypothetical Protein" /function="Hypothetical Protein" /locus tag="Francesca_282" /note=Genemark calls start at 137539 /note=SSC: Start = 137539, Stop = 137634. (Forward). CP: Does contain all GeneMarkHost capacity. SD: ZScore 2.315 is the highest start score. 
SCS: Start is not called by Glimmer and is called by Genemark. LO: 96 bp is the longest possible ORF. GAP: -4 bp. ST: SS=NA. F: Hypothetical Protein. FS: PHDBLAST= . NCBIBLAST= . HHPRED= . CDD= . /note=Start selected because it is the longest ORF, the most called start, and covers all coding potential. /note=TMHMM shows a membrane domain