CDS 64 - 612 /gene="1" /product="gp1" /function="terminase, small subunit" /locus tag="Windest_1" /note=Original Glimmer call @bp 253 has strength 1.11; Genemark calls start at 109 /note=SSC: 64-612 CP: no SCS: both-cs ST: NI BLAST-Start: [terminase small subunit [Arthrobacter phage RadFad]],,NCBI, q1:s1 100.0% 2.28711E-126 GAP: 0 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.612, -3.404185773875114, no F: terminase, small subunit SIF-BLAST: ,,[terminase small subunit [Arthrobacter phage RadFad]],,UYL86558,100.0,2.28711E-126 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes. Glimmer and GeneMark both call the gene, and it has coding potential in GeneMarkS as well. /note=What is the start site?: 64. It has the better Z-score and final score, it includes all coding potential, and the manual annotations in Starterator chose this start site the majority of the time when present. /note=What is the function?: The best matches in HHPred call a terminase small subunit, but the probabilities are below 90%. Based on synteny and BLAST (both NCBI and PhagesDB), it is called this function in the non-draft genomes, so we will call it as well. /note=Reviewed: KJ, TH
CDS 599 - 1678 /gene="2" /product="gp2" /function="terminase, large subunit (ATPase domain)" /locus tag="Windest_2" /note=Original Glimmer call @bp 599 has strength 4.47; Genemark calls start at 599 /note=SSC: 599-1678 CP: no SCS: both ST: NI BLAST-Start: [terminase large subunit [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 0.0 GAP: -14 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.139, -4.863452397416132, no F: terminase, large subunit (ATPase domain) SIF-BLAST: ,,[terminase large subunit [Arthrobacter phage Isolde] ],,YP_010656092,99.7215,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, Glimmer and GeneMark both call the gene.
It has coding potential on GeneMarkS. /note=What is the start site?: 599. It includes all coding potential; it is called by most of the manual annotations when the start site is present (Starterator); the next closest start site would eliminate 200+ bp of coding potential; and the Z-score and final score were acceptable. /note=What is the function?: HHPred calls it a terminase, large subunit. AY cluster phages can have two separate domains of the large subunit of the terminase, and this phage has both. Genes in similar phages are called ATPase domains of the large subunit of terminase (PhagesDB and NCBI BLAST). /note=Reviewed: KJ, TH
CDS 1647 - 2258 /gene="3" /product="gp3" /function="terminase, large subunit (nuclease domain)" /locus tag="Windest_3" /note=Original Glimmer call @bp 1647 has strength 9.04; Genemark calls start at 1647 /note=SSC: 1647-2258 CP: no SCS: both ST: NI BLAST-Start: [terminase large subunit [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 6.02236E-150 GAP: -32 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.202, -4.344614665115397, yes F: terminase, large subunit (nuclease domain) SIF-BLAST: ,,[terminase large subunit [Arthrobacter phage Isolde] ],,YP_010656093,100.0,6.02236E-150 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by GeneMark and Glimmer, and also has coding potential. Where is start site? 1647, because it includes the coding potential, and Glimmer gives a high score for the start site at 1647. Also, the majority of times it is present it has been called the start site by manual annotations (Starterator). Although the overlap is larger than what we want, I'm not sure if that matters in this case. It has a good Z-score and final score.
/note=Function: Terminase, large subunit (nuclease domain). Multiple HHPred calls of similar genes were for the nuclease domain of large subunits of terminase. Synteny in AY phages: the two large-subunit domain genes of the terminase are adjacent in the genome. Similar proteins in PhagesDB and NCBI BLAST were called the same. /note=Reviewed: KJ, TH
CDS 2274 - 3611 /gene="4" /product="gp4" /function="portal protein" /locus tag="Windest_4" /note=Original Glimmer call @bp 2274 has strength 12.31; Genemark calls start at 2274 /note=SSC: 2274-3611 CP: no SCS: both ST: NI BLAST-Start: [portal protein [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 0.0 GAP: 15 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.619, -3.3275678682521845, no F: portal protein SIF-BLAST: ,,[portal protein [Arthrobacter phage Isolde] ],,YP_010656094,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, according to Glimmer and GeneMark. It has coding potential on GeneMarkS. /note=Where is the start site? 2274. Glimmer and GeneMark both predict it is at 2274, and it has also been called by manual annotations most of the time, according to Starterator. It has a better Z-score and final score. Function: portal protein (structural protein), 100 percent probability on HHPred. Also 99.7% identical to another portal protein in NCBI. PhagesDB only had 20% probability that it was a portal protein. /note=Reviewed: KJ, TH
CDS 3614 - 4828 /gene="5" /product="gp5" /function="capsid maturation protease" /locus tag="Windest_5" /note=Original Glimmer call @bp 3614 has strength 4.02; Genemark calls start at 3614 /note=SSC: 3614-4828 CP: no SCS: both ST: NI BLAST-Start: [head maturation protease [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 0.0 GAP: 2 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.312, -4.033958934919761, no F: capsid maturation protease SIF-BLAST: ,,[head maturation protease [Arthrobacter phage Isolde] ],,YP_010656095,99.2574,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?
Yes, Glimmer and GeneMark called it a gene. Coding potential in GeneMarkS. Where is the start site? 3614, because it includes all the coding potential, and when it is present it is called the start site the majority of the time. It does have a low Glimmer score, but has a good Z-score and final score. /note=Function: Capsid maturation protease. HHpred, PhagesDB and NCBI BLAST all provided evidence for this function. I also looked on Phamerator and compared this protein to phage Isolde, which has a similar gene. /note=CM /note=Reviewed: KJ, TH
tRNA 4859 - 4930 /gene="6" /product="tRNA-Trp(cca)" /locus tag="WINDEST_6" /note=tRNA-Trp(cca)
CDS 4984 - 5517 /gene="7" /product="gp7" /function="scaffolding protein" /locus tag="Windest_7" /note=Original Glimmer call @bp 4984 has strength 8.73; Genemark calls start at 4984 /note=SSC: 4984-5517 CP: no SCS: both ST: NI BLAST-Start: [scaffolding protein [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 4.50908E-114 GAP: 155 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.208, -2.0895162839948163, yes F: scaffolding protein SIF-BLAST: ,,[scaffolding protein [Arthrobacter phage Isolde] ],,YP_010656096,98.3051,4.50908E-114 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by GeneMark and Glimmer, and also has coding potential. Where is start site? 4984, because it includes the coding potential, and Glimmer gives a high score for the start site at 4984. Also, the majority of times it is present it has been called the start site by manual annotations (Starterator). Although the gap is larger than what we want, I'm not sure if that matters in this case. It has a good Z-score and final score. /note=Function: Scaffolding protein. I based this on NCBI BLAST, where phages Isolde and Phrank had similar proteins that were called scaffolding proteins. PhagesDB and HHPred also pointed to this same function.
/note=CM /note=Reviewed: KJ, TH
CDS 5547 - 5942 /gene="8" /product="gp8" /function="capsid decoration protein" /locus tag="Windest_8" /note=Original Glimmer call @bp 5547 has strength 11.22; Genemark calls start at 5547 /note=SSC: 5547-5942 CP: no SCS: both ST: NI BLAST-Start: [head decoration [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 5.33546E-90 GAP: 29 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.04, -2.5052746077145835, yes F: capsid decoration protein SIF-BLAST: ,,[head decoration [Arthrobacter phage Isolde] ],,YP_010656097,99.2366,5.33546E-90 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes. Start Site?: 5547. It has coding potential, Glimmer and GeneMark agree on the start site, and there are no visible start codons before the one chosen. It has a relatively high Glimmer score and a good Z-score alongside a good final score. This gene appears 85.7% of the time in the pham and is called 100% of the time alongside 114/135 MA. (C.A.R) /note=Function?: Most likely a capsid decoration protein. HHPred gave a 99.55% probability and an E-value of 3E-13 for the highest hit, with two other capsid decoration proteins close behind at 99.5% and 99.4%. It also matches up with Isolde on Phamerator and has a conserved domain that is a "lambda head decoration protein". The official functions list asks that capsid decoration protein be used instead of head decoration protein. It is unclear whether it is just a capsid decoration protein or a LamD-like capsid decoration protein, as the notes in the official functions list for LamD-like capsid decoration protein are slightly confusing.
(C.A.R) /note=Reviewed: KJ, TH
CDS 5959 - 6999 /gene="9" /product="gp9" /function="major capsid protein" /locus tag="Windest_9" /note=Original Glimmer call @bp 5959 has strength 15.4; Genemark calls start at 5959 /note=SSC: 5959-6999 CP: no SCS: both ST: NI BLAST-Start: [major head protein [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 0.0 GAP: 16 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.873, -3.8954117747686845, yes F: major capsid protein SIF-BLAST: ,,[major head protein [Arthrobacter phage Richie] ],,YP_010655729,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes. Start Site?: 5959. It has coding potential and the gene is called by both Glimmer and GeneMark with a high Glimmer score. It has the best final score and Z-score of all the candidates, and there were no visible start codons before the point picked. It is found in 85.4% of the pham and is called 88.6% of the time alongside 106/143 MA. The score for the spacer is not the best but does not appear to affect anything significantly. (C.A.R) /note=Function?: Major capsid protein. This is backed up by HHPred, with all the top probabilities being major capsid proteins at 100% and having E-values in the -30s. Other members of the AY cluster and every single member of the pham have it as a major capsid protein as well. There is also a conserved domain in Phamerator for "phage major capsid protein".
(C.A.R) /note=Reviewed: KJ, TH
CDS 7012 - 7335 /gene="10" /product="gp10" /function="Hypothetical Protein" /locus tag="Windest_10" /note=Original Glimmer call @bp 7012 has strength 15.78; Genemark calls start at 7012 /note=SSC: 7012-7335 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP635_gp09 [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 6.34728E-69 GAP: 12 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.284, -2.0720764396375664, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP635_gp09 [Arthrobacter phage Auxilium] ],,YP_010655828,100.0,6.34728E-69 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes. Start Site?: 7012. It has coding potential. Glimmer and GeneMark both agree and there are no visible start codons before the one chosen. It has the best Z-score and final score alongside a high Glimmer score. This gene is found in 100% of the pham and is called 100% of the time alongside 17/17 MA, and it has only been found in AY cluster phages. (C.A.R) /note=Function?: Hypothetical protein. It is not labeled on any other AY phages in Phamerator or any phages in the pham aside from Hestia. HHPred brings up a probability of 98% for something; however, it is unclear what exactly it is. HHPred also has a probability of 97% for a hypothetical protein. (C.A.R) /note=Reviewed: KJ, TH
CDS 7352 - 7708 /gene="11" /product="gp11" /function="head-to-tail adaptor" /locus tag="Windest_11" /note=Original Glimmer call @bp 7352 has strength 6.84; Genemark calls start at 7352 /note=SSC: 7352-7708 CP: no SCS: both ST: NI BLAST-Start: [head-tail adaptor Ad1 [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 8.17008E-77 GAP: 16 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.629, -3.3861058366776207, yes F: head-to-tail adaptor SIF-BLAST: ,,[head-tail adaptor Ad1 [Arthrobacter phage Auxilium] ],,YP_010655829,100.0,8.17008E-77 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark call the same gene. It also has coding potential on GeneMarkS.
/note=What is the start site? 7352; it includes all of the coding potential. Starterator said that 770 out of the 826 non-drafts were start site 14, and the Z-score and final score were good. /note=What is the function? I checked HHPred for its probability and coverage. Both looked good. Then I went to Phamerator and compared Windest against other AY phages. I concluded that the function is a hypothetical protein. /note=Reviewed: KJ - changed to Head to Tail Adaptor, TH
CDS 7705 - 8103 /gene="12" /product="gp12" /function="head-to-tail stopper" /locus tag="Windest_12" /note=Original Glimmer call @bp 7705 has strength 7.96; Genemark calls start at 7705 /note=SSC: 7705-8103 CP: no SCS: both ST: NI BLAST-Start: [head-to-tail stopper [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 4.41887E-90 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.56, -5.6924825190935575, no F: head-to-tail stopper SIF-BLAST: ,,[head-to-tail stopper [Arthrobacter phage Auxilium] ],,YP_010655830,100.0,4.41887E-90 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark call the same gene. It has coding potential on GeneMarkS. /note=What is the start site? 7705; it includes all of the coding potential. Starterator said that 27 out of 29 of the non-drafts were start site 14, and the Z-score and final score were decent. Also, -4 overlap. /note=What is the function? I checked HHPred for its probability and coverage. Both looked good. Then I went to Phamerator and compared Windest against other AY phages. I concluded that the function is an unknown function.
/note=Reviewed: TH, JL
CDS 8100 - 8483 /gene="13" /product="gp13" /function="minor capsid protein" /locus tag="Windest_13" /note=Original Glimmer call @bp 8100 has strength 7.41; Genemark calls start at 8100 /note=SSC: 8100-8483 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP635_gp12 [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 1.8997E-85 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.04, -2.583959800616441, yes F: minor capsid protein SIF-BLAST: ,,[hypothetical protein PP635_gp12 [Arthrobacter phage Auxilium] ],,YP_010655831,100.0,1.8997E-85 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark call the same gene. It has coding potential on GeneMarkS. /note=What is the start site? 8100; it includes all of the coding potential. Starterator said that 28 out of 28 of the non-drafts were start site 4, and the Z-score and final score were good. The gap was also -4, which almost always indicates that the corresponding start site is correct. /note=What is the function? I checked HHPred for its probability and coverage. Both looked good; however, there were some hits with similar percentage and coverage. Then I went to Phamerator and compared Windest against other AY phages. I chose minor capsid protein; however, on Phamerator the majority of the similar phages were head to tail stop.
/note=Checked: TH, JL
CDS 8480 - 8878 /gene="14" /product="gp14" /function="tail terminator" /locus tag="Windest_14" /note=Original Glimmer call @bp 8465 has strength 5.83; Genemark calls start at 8480 /note=SSC: 8480-8878 CP: no SCS: both-gm ST: NI BLAST-Start: [tail terminator [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 1.558E-91 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.728, -5.550379362024007, no F: tail terminator SIF-BLAST: ,,[tail terminator [Arthrobacter phage Auxilium] ],,YP_010655832,100.0,1.558E-91 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, has coding potential, called by both Glimmer and GeneMark /note=Start: 8480; -4 overlap, includes all coding potential /note=Function: Multiple hits in both HHpred and BLAST to tail terminator /note=Checked by: JL
CDS 8925 - 9110 /gene="15" /product="gp15" /function="Hypothetical Protein" /locus tag="Windest_15" /note=Original Glimmer call @bp 8925 has strength 10.76; Genemark calls start at 8925 /note=SSC: 8925-9110 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp14 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 6.17697E-34 GAP: 46 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.965, -2.6013996449736907, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp14 [Arthrobacter phage Richie] ],,YP_010655735,100.0,6.17697E-34 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it has coding potential and start and stop codons. It is called by Glimmer and GeneMark. /note=Start: 8925. This site is called by both Glimmer and GeneMark, it has the best raw and final score, it includes all coding potential, and it is called in all genes in the pham, both manual and auto annotated (Starterator). /note=Function: Hypothetical protein. There are no significant hits on HHpred, all BLAST hits are to hypothetical proteins, and there are no transmembrane domains (DeepTMHMM). /note=Checked by: JL
CDS 9103 - 9618 /gene="16" /product="gp16" /function="major tail protein" /locus tag="Windest_16" /note=Original Glimmer
call @bp 9103 has strength 10.9; Genemark calls start at 9103 /note=SSC: 9103-9618 CP: no SCS: both ST: NI BLAST-Start: [major tail protein [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 1.22873E-120 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.592, -3.4463834536695686, no F: major tail protein SIF-BLAST: ,,[major tail protein [Arthrobacter phage Richie] ],,YP_010655736,100.0,1.22873E-120 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene with good coding potential. In GeneMark, the selected start site includes nearly all of the recognized coding potential. Both Glimmer and GeneMark called the same start site, which has a very good score, which is why I selected it. -KO What is the function? HHPred showed a hypothetical protein with an 81% probability and 17% coverage, which was not good. NCBI shows major tail protein, which was called 100% of the time and had 100% coverage, which is why I chose it. -KO /note=Checked by: JL
CDS 9708 - 10244 /gene="17" /product="gp17" /function="tail assembly chaperone" /locus tag="Windest_17" /note=Original Glimmer call @bp 9708 has strength 10.01; Genemark calls start at 9708 /note=SSC: 9708-10244 CP: no SCS: both ST: NI BLAST-Start: [tail assembly chaperone [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 1.78719E-128 GAP: 89 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.377, -3.8366550365551095, no F: tail assembly chaperone SIF-BLAST: ,,[tail assembly chaperone [Arthrobacter phage Richie] ],,YP_010655738,100.0,1.78719E-128 SIF-HHPRED: SIF-Syn: /note=9708. Both Glimmer and GeneMark agree on it, with the Glimmer score being high. Starterator supports this reading as well. /note=It is a gene because it includes all coding potential within the range. /note=Similar (if not identical) sequences on NCBI, HHpred, and Phamerator all yielded confident results that this is a tail assembly chaperone.
/note=Checked by: JL
CDS 10388 - 10591 /gene="18" /product="gp18" /function="tail assembly chaperone" /locus tag="Windest_18" /note=Original Glimmer call @bp 10388 has strength 6.76; Genemark calls start at 10388 /note=SSC: 10388-10591 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_GORPY_17 [Arthrobacter phage Gorpy]],,NCBI, q1:s1 100.0% 1.31942E-40 GAP: 143 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.865, -4.910886061811229, no F: tail assembly chaperone SIF-BLAST: ,,[hypothetical protein SEA_GORPY_17 [Arthrobacter phage Gorpy]],,UVF60979,98.5075,1.31942E-40 SIF-HHPRED: SIF-Syn: /note=10388. Glimmer and GeneMark agree, with the Glimmer score being high. Starterator supports this reading. The Z-score and final score are the best out of the options available. High coding potential also supports it. /note=The function was determined from results found on NCBI and HHpred. The phage 'SeaHorse' matched Windest relatively well, so it was used as a small reference. /note=This gene includes a programmed translational shift. {9708:10195;10195:10591} -1 frameshift /note=Checked by: JL
CDS complement (10588 - 10827) /gene="19" /product="gp19" /function="membrane protein" /locus tag="Windest_19" /note=Original Glimmer call @bp 10827 has strength 4.0; Genemark calls start at 10827 /note=SSC: 10827-10588 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp18 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 1.14367E-46 GAP: 71 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.779, -5.172539920119419, no F: membrane protein SIF-BLAST: ,,[hypothetical protein PP634_gp18 [Arthrobacter phage Richie] ],,YP_010655739,100.0,1.14367E-46 SIF-HHPRED: SIF-Syn: /note=10827. There's coding potential shown through Starterator, and both Glimmer and GeneMark agree with the call. Though the gap is 71, this gene is reversed (sandwiched by two forward genes). The Z-score and final score aren't as high as we'd like, but it is the best one with a gap of 50+.
/note=Based on findings in PhagesDB, HHpred, and NCBI, this is a Hypothetical Protein. /note=Checked by: JL /note=There are two transmembrane domains. Changed function to membrane protein
CDS 10899 - 14840 /gene="20" /product="gp20" /function="tape measure protein" /locus tag="Windest_20" /note=Original Glimmer call @bp 10899 has strength 7.64; Genemark calls start at 10899 /note=SSC: 10899-14840 CP: no SCS: both ST: NI BLAST-Start: [tail length tape measure protein [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 0.0 GAP: 71 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.144, -4.325675514574479, no F: tape measure protein SIF-BLAST: ,,[tail length tape measure protein [Arthrobacter phage Auxilium] ],,YP_010655838,99.9238,0.0 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. The start codon occurs at 10899. This start codon includes all of the coding potential, and it has been called with a high degree of certainty by GeneMark and Glimmer, with Glimmer reporting a score of 7.64. There is also a gap of 71, a Z-score of 2.144, and a final score of -4.326. All other start codons exclude much of the coding potential, have a much larger gap, and do not score as well as the start codon at 10899.
/note=Function: based on HHpred and many NCBI hits, this is the tape measure protein /note=Checked by: JL
CDS 14840 - 15667 /gene="21" /product="gp21" /function="minor tail protein" /locus tag="Windest_21" /note=Original Glimmer call @bp 14840 has strength 13.2; Genemark calls start at 14840 /note=SSC: 14840-15667 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Gorpy] ],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.198, -2.17469248771465, yes F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Gorpy] ],,UVF60982,99.2727,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes. Glimmer and GeneMark both call the gene, and it has coding potential in GeneMarkS as well. /note=What is the start site?: 14840. It has the better Z-score and final score, it includes all coding potential, and the manual annotations in Starterator chose this start site the majority of the time when present; -1 overlap as well. /note=What is the function?: minor tail protein; BLAST searches return many minor tail proteins; no significant hits on HHpred /note=Checked by: JL
CDS 15679 - 16980 /gene="22" /product="gp22" /function="minor tail protein" /locus tag="Windest_22" /note=Original Glimmer call @bp 15679 has strength 11.88; Genemark calls start at 15679 /note=SSC: 15679-16980 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage BenchScraper]],,NCBI, q1:s1 100.0% 0.0 GAP: 11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.144, -4.466674028236667, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage BenchScraper]],,XIJ69239,99.7691,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes. Both GeneMark and Glimmer have called it a gene. It has a high Glimmer score of 11.88. It has a start codon at 15679; this codon contains all the coding potential, and it is the last start codon in the series.
Also has a gap of 11, a high Z-score, and an acceptable final score; start site is annotated the majority of the time when present in the pham /note=Function? HHpred and BLAST searches have several hits to minor tail proteins
CDS 16989 - 17948 /gene="23" /product="gp23" /function="minor tail protein" /locus tag="Windest_23" /note=Original Glimmer call @bp 16989 has strength 6.67; Genemark calls start at 16998 /note=SSC: 16989-17948 CP: no SCS: both-gl ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 0.0 GAP: 8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.388, -3.893834241316366, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Richie] ],,YP_010655743,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential, called by Glimmer Star (16989) and GeneMark (16998) /note=What is the start site? 16989; it includes most coding potential, has a smaller gap and spacer compared to the next best option (16998), and has a final score closer to zero. Confirmed by 42 MAs in the pham /note=What is the function? There are no outstanding E-values, yet there are some minor tail proteins that have decent overall scores listed on PhagesDB. NCBI BLAST presents the strongest argument for minor tail protein, with excellent coverage, E-value, and identity, pointing to the function as a minor tail protein. No significant hits on HHpred
CDS 17948 - 18607 /gene="24" /product="gp24" /function="minor tail protein" /locus tag="Windest_24" /note=Original Glimmer call @bp 17948 has strength 8.07; Genemark calls start at 17948 /note=SSC: 17948-18607 CP: yes SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage MidnightRain]],,NCBI, q1:s1 100.0% 2.40021E-157 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.296, -4.005585920014169, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage MidnightRain]],,WNM64510,100.0,2.40021E-157 SIF-HHPRED: SIF-Syn: /note=Is it a gene?
Yes, Glimmer and GeneMark both call the gene. It has coding potential on GeneMarkS /note=What is the start site? 17948; it includes most coding potential with a reasonable overlap, it is called by all of the manual annotations when the start site is present (Starterator), and it has the best Z-score and final score compared to the next best start site. -1 overlap as well /note=What is the function? NCBI and PhagesDB BLASTs indicate a minor tail protein. HHpred lists it as unknown function but has a very poor E-value and coverage. PhagesDB has many minor tail proteins that have excellent E-values and overall scores.
CDS 18619 - 19062 /gene="25" /product="gp25" /function="Hypothetical Protein" /locus tag="Windest_25" /note=Original Glimmer call @bp 18619 has strength 6.86; Genemark calls start at 18619 /note=SSC: 18619-19062 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_MIDNIGHTRAIN_24 [Arthrobacter phage MidnightRain]],,NCBI, q1:s1 100.0% 1.11262E-101 GAP: 11 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.873, -2.8564937087383147, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_MIDNIGHTRAIN_24 [Arthrobacter phage MidnightRain]],,WNM64511,100.0,1.11262E-101 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark both call the gene. It has coding potential on GeneMarkS /note=What is the start site? 18619; it includes most coding potential with a reasonable gap, it is called by most of the manual annotations when the start site is present (Starterator), and it has the best Z-score and final score compared to the next best start site. /note=What is the function? HHpred, NCBI, and PhagesDB BLASTs all indicate a hypothetical protein. There are no possible functions with acceptable E-values or coverage.
CDS 19059 - 19385 /gene="26" /product="gp26" /function="Hypothetical Protein" /locus tag="Windest_26" /note=Original Glimmer call @bp 19059 has strength 5.52; Genemark calls start at 19059 /note=SSC: 19059-19385 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp25 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 2.43925E-72 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.965, -2.6814417326944517, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp25 [Arthrobacter phage Richie] ],,YP_010655746,100.0,2.43925E-72 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is. The start has a -4 overlap, which makes it an automatic start site, and this is supported by the Glimmer and GeneMark readings. -KO What is the function? The function of the protein is hypothetical, which is supported by the findings of HHPred and the NCBI BLAST.
CDS 19397 - 19531 /gene="27" /product="gp27" /function="Hypothetical Protein" /locus tag="Windest_27" /note=Original Glimmer call @bp 19397 has strength 9.07; Genemark calls start at 19397 /note=SSC: 19397-19531 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp26 [Arthrobacter phage Richie] ],,NCBI, q6:s3 88.6364% 5.79446E-18 GAP: 11 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.144, -4.466674028236667, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp26 [Arthrobacter phage Richie] ],,YP_010655747,92.6829,5.79446E-18 SIF-HHPRED: SIF-Syn: /note=Is it a gene? There is a good amount of genetic information before the listed start site. However, the Glimmer score and GeneMark suggest the start site is pretty good. What is the function? The function is hypothetical protein. The only other suggestion was a mating protein, but that only had 50% probability and coverage, which are not good, while the hypothetical protein had 92% alignment and identity.
/note=No significant matches on HHpred; only hypothetical proteins on BLAST searches
CDS 19531 - 20493 /gene="28" /product="gp28" /function="endolysin" /locus tag="Windest_28" /note=Original Glimmer call @bp 19531 has strength 14.26; Genemark calls start at 19531 /note=SSC: 19531-20493 CP: no SCS: both ST: NI BLAST-Start: [endolysin [Arthrobacter phage RadFad] ],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.01, -5.6562426662814325, no F: endolysin SIF-BLAST: ,,[endolysin [Arthrobacter phage RadFad] ],,UYL86584,99.6875,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene that shows a good deal of coding potential. The Glimmer score was good. The GeneMark call was also really good. The Phamerator maps also show a remarkable similarity to our gene in question, and the selected start site makes our gene line up in size with several others. - KO; also -1 overlap /note=What is the function? The function is endolysin, which is supported by NCBI. Looked for lysin B; no lysin B was found, so call endolysin
CDS 20486 - 20791 /gene="29" /product="gp29" /function="membrane protein" /locus tag="Windest_29" /note=Original Glimmer call @bp 20486 has strength 7.46; Genemark calls start at 20486 /note=SSC: 20486-20791 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp28 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 7.52718E-63 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.965, -2.66371296573402, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein PP634_gp28 [Arthrobacter phage Richie] ],,YP_010655749,100.0,7.52718E-63 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential, called by Glimmer Star and GeneMark at 20486 /note=What is the start site? 20486; it includes most coding potential, has an acceptable overlap, and has a better Z-score and final score compared to the next best option (20537). /note=What is the function?
The majority of evidence is pointing toward hypothetical protein. PhagesDB BLAST revealed unknown function for every option with excellent E-values and overall scores. However, there is some evidence that it could be a membrane protein. The second-best option on NCBI has a coverage of 98% along with other values that are identical to the best option, which has 100% coverage. There is too little evidence pointing toward membrane protein; therefore function will be listed as hypothetical. Two transmembrane domains (Phamerator)
CDS 20810 - 21115 /gene="30" /product="gp30" /function="Hypothetical Protein" /locus tag="Windest_30" /note=Original Glimmer call @bp 20810 has strength 7.54; Genemark calls start at 20810 /note=SSC: 20810-21115 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_MIDNIGHTRAIN_29 [Arthrobacter phage MidnightRain] ],,NCBI, q1:s1 100.0% 8.34982E-65 GAP: 18 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.04, -2.442961286954254, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_MIDNIGHTRAIN_29 [Arthrobacter phage MidnightRain] ],,WNM64516,99.0099,8.34982E-65 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential, called by Glimmer Star and GeneMark (20810) /note=What is the start site? 20810; it includes most coding potential and has an acceptable gap and spacer. /note=What is the function? Most likely a hypothetical protein; all options with acceptable to excellent values point towards a hypothetical protein. NCBI and PhagesDB presented the most evidence.
CDS 21112 - 21486 /gene="31" /product="gp31" /function="Hypothetical Protein" /locus tag="Windest_31" /note=Original Glimmer call @bp 21112 has strength 6.05; Genemark calls start at 21112 /note=SSC: 21112-21486 CP: no SCS: both ST: NI BLAST-Start: [tail protein [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 1.54229E-79 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.208, -2.230514797657003, yes F: Hypothetical Protein SIF-BLAST: ,,[tail protein [Arthrobacter phage Isolde] ],,YP_010656119,99.1936,1.54229E-79 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential and is called by Glimmer Star and GeneMark. /note=What is the start site? 21112; it has a -4 overlap. Called 100% of the time when present on Pham. /note=What is the function? This sequence is very similar to ones found in a few different genomes. Isolde calls it minor tail protein, yet all the other genomes with the same coverage and values call it a hypothetical protein. There is no definitive evidence that this is a minor tail protein; therefore it is a hypothetical protein. CDS 21490 - 23250 /gene="32" /product="gp32" /function="hydrolase" /locus tag="Windest_32" /note=Original Glimmer call @bp 21490 has strength 10.61; Genemark calls start at 21511 /note=SSC: 21490-23250 CP: no SCS: both-gl ST: NI BLAST-Start: [hypothetical protein [Microbacterium sp.]],,NCBI, q101:s183 46.7577% 5.36377E-65 GAP: 3 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.956, -4.862260131515226, no F: hydrolase SIF-BLAST: ,,[hypothetical protein [Microbacterium sp.]],,MDR2294517,36.9469,5.36377E-65 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes. Both Glimmer and GeneMark have called it a gene. The start codon is at 21490. This start codon contains most of the coding potential, has the best Z-score and final score, and has the best gap. /note= /note=Checked: HHPred does not seem to suggest to me that this is a minor tail protein. It does, however, seem to suggest a thioesterase or just an esterase. 
A plain esterase may be most appropriate, as only about one hit suggests a thioesterase compared with the 5 or so that suggest an esterase. -C.A.R; TH: I agree on an esterase. /note= /note=There are also several matches on HHpred to lipases. It is probably better to go with the more general term hydrolase in this case CDS complement (23499 - 23627) /gene="33" /product="gp33" /function="membrane protein" /locus tag="Windest_33" /note=Original Glimmer call @bp 23624 has strength 2.26; Genemark calls start at 23627 /note=SSC: 23627-23499 CP: yes SCS: both-gm ST: NI BLAST-Start: GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.463, -5.756073525669775, no F: membrane protein SIF-BLAST: SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene. It has a start codon at 23627. This start includes all of the coding potential, has the best Z-score and final score, and has the best gap. Also, -1 overlap /note= /note=Checked: I am not seeing any HHPred or NCBI results for hydrolase above 90%. I do, however, see a very large transmembrane domain, so I believe this may be a membrane protein. -C.A.R CDS complement (23627 - 23851) /gene="34" /product="gp34" /function="Hypothetical Protein" /locus tag="Windest_34" /note=Original Glimmer call @bp 23851 has strength 7.11; Genemark calls start at 23851 /note=SSC: 23851-23627 CP: no SCS: both ST: NI BLAST-Start: [MULTISPECIES: hypothetical protein [Pseudarthrobacter] ],,NCBI, q1:s1 71.6216% 7.75088E-5 GAP: 127 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.415, -5.857463951082417, no F: Hypothetical Protein SIF-BLAST: ,,[MULTISPECIES: hypothetical protein [Pseudarthrobacter] ],,WP_175318623,44.6154,7.75088E-5 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene. Both Glimmer and GeneMark have called it a gene. It has a start codon at 23851. It has an adequate gap, adequate final scores, and it includes all of the coding potential. 
/note= /note=Checked: C.A.R, TH /note= /note=Function: No significant hits on HHpred; only hypothetical proteins are matched on BLAST searches CDS 23979 - 24125 /gene="35" /product="gp35" /function="Hypothetical Protein" /locus tag="Windest_35" /note=Original Glimmer call @bp 23979 has strength 0.86 /note=SSC: 23979-24125 CP: no SCS: glimmer ST: NI BLAST-Start: [hypothetical protein SEA_KHUMPHREY_28 [Arthrobacter phage KHumphrey]],,NCBI, q2:s7 85.4167% 2.72892E-13 GAP: 127 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.619, -3.678676728259483, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_KHUMPHREY_28 [Arthrobacter phage KHumphrey]],,XIJ70194,65.3846,2.72892E-13 SIF-HHPRED: SIF-Syn: /note=Start site is 23979 due to the high final score/z-score and coding potential shown on GeneMark. GeneMark shows that this range includes most of the coding potential, but not all of it. /note=As for function, this is a hypothetical protein because HHpred does not provide sufficient evidence for telethonin. NCBI and PhagesDB agree with hypothetical protein. /note= /note=Checked: C.A.R, TH CDS complement (24122 - 24265) /gene="36" /product="gp36" /function="Hypothetical Protein" /locus tag="Windest_36" /note=Original Glimmer call @bp 24265 has strength 11.11; Genemark calls start at 24265 /note=SSC: 24265-24122 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP640_gp36 [Arthrobacter phage Faja] ],,NCBI, q1:s1 100.0% 8.8394E-24 GAP: 75 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.714, -5.309117108752867, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP640_gp36 [Arthrobacter phage Faja] ],,YP_010656322,100.0,8.8394E-24 SIF-HHPRED: SIF-Syn: /note=Start site is at 24265 due to GeneMark's coding potential and the Glimmer/GeneMark calls. Phamerator also calls it at 24265 with 8/8 manual annotations and a 100% call rate. /note=Function: Hypothetical protein. 
HHPred suggested it as a Bacterial RNA polymerase inhibitor from a variety of similar genes, but none had a satisfactory e-value and other sources (NCBI, PhagesDB) suggested an unknown function/hypothetical protein. /note=-TM, TH /note= /note=Checked: C.A.R CDS 24341 - 24697 /gene="37" /product="gp37" /function="Hypothetical Protein" /locus tag="Windest_37" /note=Original Glimmer call @bp 24341 has strength 6.87; Genemark calls start at 24341 /note=SSC: 24341-24697 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp38 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 3.77858E-78 GAP: 75 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.488, -4.509923156117645, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp38 [Arthrobacter phage Richie] ],,YP_010655759,98.3051,3.77858E-78 SIF-HHPRED: SIF-Syn: /note=Start site is 24341 due to the fact there is high coding potential (GeneMark) and the Glimmer Score is a satisfactory 6.87. Phamerator report supports a Start Site of 24341 w/a start site with 96.9% annotation rate/87.3% called the start site manually on Pham. The Z-score/Final score are the best options as well. /note=As for function, HHpred suggested that this was a Bacterial RNA polymerase inhibitor. However, when checking PhagesDB and NCBI, they all suggested it was a Hypothetical protein. HHpred also failed to yield a satisfactory E-value. 
/note=-TM /note= /note=Checked: C.A.R, TH CDS complement (24686 - 24922) /gene="38" /product="gp38" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_38" /note=Original Glimmer call @bp 24898 has strength 6.13; Genemark calls start at 24922 /note=SSC: 24922-24686 CP: no SCS: both-gm ST: NI BLAST-Start: [helix-turn-helix DNA binding protein [Arthrobacter phage Phrank15]],,NCBI, q1:s1 100.0% 1.37554E-47 GAP: 109 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.805, -5.945489798766537, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA binding protein [Arthrobacter phage Phrank15]],,XIJ70526,100.0,1.37554E-47 SIF-HHPRED: SIF-Syn: /note=Gene: yes, it has coding potential and is called by both Glimmer and Genemark, /note=Start: 24922, Called by Genemark (not Glimmer) but includes all coding potential, It has the most annotations in the pham (starterator) /note=Function: Helix-turn-helix DNA binding protein, No significant hits on HHpred; many BLAST hits to a HTH DNA binding protein; AF3 predicts several alpha helices separated by small spacer regions /note= /note=Checked: C.A.R, TH CDS 25032 - 25535 /gene="39" /product="gp39" /function="HNH endonuclease" /locus tag="Windest_39" /note=Original Glimmer call @bp 25056 has strength 4.25; Genemark calls start at 25056 /note=SSC: 25032-25535 CP: no SCS: both-cs ST: NI BLAST-Start: [HNH endonuclease [Arthrobacter phage Phrank15]],,NCBI, q9:s1 95.2096% 1.69976E-114 GAP: 109 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.151, -5.359720151340713, no F: HNH endonuclease SIF-BLAST: ,,[HNH endonuclease [Arthrobacter phage Phrank15]],,XIJ70527,98.7421,1.69976E-114 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, there is coding potential, Glimmer and GeneMark both call it /note=Start: 25032, this start site is called with the same frequency in the pham as the next best choice. Both Glimmer and GeneMark call a different start site but this start will include all the coding potential. 
/note=Function: Shows multiple conserved domains that are HNH endonuclease domains; there are several hits to HNH endonuclease in HHpred and BLAST; the H-N-H motif is present within a 30 amino acid sequence. /note= /note=Checked: C.A.R, TH CDS complement (25497 - 26669) /gene="40" /product="gp40" /function="tyrosine integrase" /locus tag="Windest_40" /note=Original Glimmer call @bp 26669 has strength 11.47; Genemark calls start at 26669 /note=SSC: 26669-25497 CP: no SCS: both ST: NI BLAST-Start: [tyrosine integrase [Arthrobacter phage BillyTP]],,NCBI, q1:s1 100.0% 0.0 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.534, -4.0338999058337315, no F: tyrosine integrase SIF-BLAST: ,,[tyrosine integrase [Arthrobacter phage BillyTP]],,XEN18640,98.9744,0.0 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it has coding potential and is called by both Glimmer and GeneMark /note=Start: 26669, called by both Glimmer and GeneMark, has the best Z-score and final score, includes all the coding potential, and is the most commonly annotated start site that is present within this gene (starterator) /note=Function: multiple hits on HHpred and BLAST to tyrosine integrase; several conserved domain hits map to tyrosine integrases as well /note= /note=Checked: C.A.R, TH CDS complement (26662 - 26850) /gene="41" /product="gp41" /function="membrane protein" /locus tag="Windest_41" /note=Original Glimmer call @bp 26850 has strength 10.61; Genemark calls start at 26850 /note=SSC: 26850-26662 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_MIDNIGHTRAIN_43 [Arthrobacter phage MidnightRain]],,NCBI, q2:s1 98.3871% 5.50677E-34 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.965, -2.6814417326944517, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_MIDNIGHTRAIN_43 [Arthrobacter phage MidnightRain]],,WNM64530,93.8462,5.50677E-34 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark both call the gene. It has coding potential on GenemarkS. 
/note=What is the start site? 26,850 it includes all coding potential. It has a -4 gap. /note=What is the function? Hypothetical Protein because it has a 93% identity on NCBI blast. /note= /note=Checked: C.A.R /note= /note=Check For transmembrane domain /note= /note=Detected as membrane protein. -TH CDS complement (26847 - 27278) /gene="42" /product="gp42" /function="Hypothetical Protein" /locus tag="Windest_42" /note=Original Glimmer call @bp 27278 has strength 9.58; Genemark calls start at 27278 /note=SSC: 27278-26847 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp42 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 2.68822E-50 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.57, -3.4932654573164204, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp42 [Arthrobacter phage Richie] ],,YP_010655763,81.6667,2.68822E-50 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes Glimmer and GeneMark both call the gene. It also shows coding potential on GenemarkS. /note=What is the start site? 27278, it includes all coding potential. /note=What is the function? hypothetical protein and has a 99% coverage on Hhpred and has a good e-value /note= /note=Checked by: JP, TH, JL CDS complement (27271 - 27513) /gene="43" /product="gp43" /function="Hypothetical Protein" /locus tag="Windest_43" /note=Original Glimmer call @bp 27483 has strength 3.45; Genemark calls start at 27513 /note=SSC: 27513-27271 CP: no SCS: both-gm ST: NI BLAST-Start: [hypothetical protein SEA_BENCHSCRAPER_40 [Arthrobacter phage BenchScraper]],,NCBI, q1:s1 100.0% 3.44705E-46 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.456, -3.75071250087134, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_BENCHSCRAPER_40 [Arthrobacter phage BenchScraper]],,XIJ69258,95.0,3.44705E-46 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark both call the gene, and it has coding potential on GenemarkS. /note=What is the start site? 
27513, it includes all coding potential. It has a -4 overlap. /note=What is the function? Hypothetical protein, because HHpred did not have any hit with at least 90% probability and it was called hypothetical by others on NCBI BLAST at 90% identity and 95% alignment. /note= /note=Checked by: JP, JL CDS complement (27510 - 27860) /gene="44" /product="gp44" /function="Hypothetical Protein" /locus tag="Windest_44" /note=Original Glimmer call @bp 27860 has strength 11.64; Genemark calls start at 27866 /note=SSC: 27860-27510 CP: no SCS: both-gl ST: NI BLAST-Start: [hypothetical protein [Rhodoglobus sp.]],,NCBI, q6:s4 95.6897% 4.15057E-16 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.611, -5.586941349020283, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein [Rhodoglobus sp.]],,MEP6477834,46.4286,4.15057E-16 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. It also has a -4 start site that Glimmer picked out and a good Glimmer score, but it does not have the best z-score and final score available. Windest also appears to be the only phage in the pham for this gene, which gives me some doubts about it. Starterator gives a 404 error when I try to open the report; this may be related to Windest's single status in the pham. /note=Function: There are no NCBI or HHPred results with good %probability, %identity or %aligned. I believe the best option for function would be a hypothetical protein, as it has fairly good %coverage in NCBI. However, I do not want to call a function until absolutely certain it is a gene, because despite the good coding potential, the lack of other phages in the pham and the inability to see the starterator data make me suspicious. 
/note=Checked: JT, JL CDS complement (27857 - 28063) /gene="45" /product="gp45" /function="Hypothetical Protein" /locus tag="Windest_45" /note=Original Glimmer call @bp 28063 has strength 10.4; Genemark calls start at 28063 /note=SSC: 28063-27857 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP639_gp044 [Arthrobacter phage Seahorse] ],,NCBI, q1:s1 98.5294% 1.4132E-34 GAP: 454 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.061, -6.680630292081155, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP639_gp044 [Arthrobacter phage Seahorse] ],,YP_010656230,36.8421,1.4132E-34 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. The start site with the better z-score and final score does not appear to cover all of the coding potential, so I have decided to stick with the start site GeneMark and Glimmer have chosen. However, the DNA sequence looks quite short to me. In addition, in the starterator data, despite this gene being found in 4 of 6 members of the pham, it has never been manually annotated and only seems to be called by our phage Windest. /note=Function: most likely a hypothetical protein. There are no results with 90% or above probability in HHPred, and NCBI is not showing good %identity, only good %coverage for hypothetical proteins. I am hesitant to call any function until certain it is actually a gene, given the short DNA sequence. 
(C.A.R) /note= /note=Checked:JT, JL CDS 28518 - 28685 /gene="46" /product="gp46" /function="Hypothetical Protein" /locus tag="Windest_46" /note=Original Glimmer call @bp 28518 has strength 9.79; Genemark calls start at 28518 /note=SSC: 28518-28685 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_RADFAD_47 [Arthrobacter phage RadFad]],,NCBI, q1:s1 100.0% 2.04909E-25 GAP: 454 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.273, -2.0162541296952132, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_RADFAD_47 [Arthrobacter phage RadFad]],,UYL86604,90.9091,2.04909E-25 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential that appears to start just a bit after 28500 so I agree with the start site Glimmer and GeneMark chose. It has a good glimmer score and also has the best z-score and final score of all of the start sites. This gene is called 94% of the time and has 5/7 MA. /note=Function: The best matches on NCBI were for hypothetical proteins and HHPred did not give any results over 90%. (C.A.R) /note=Checked: JT, JL CDS 28786 - 29124 /gene="47" /product="gp47" /function="Hypothetical Protein" /locus tag="Windest_47" /note=Original Glimmer call @bp 28786 has strength 9.24; Genemark calls start at 28786 /note=SSC: 28786-29124 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp50 [Arthrobacter phage Richie] ],,NCBI, q1:s1 97.3214% 2.18941E-71 GAP: 100 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.906, -4.905314844093283, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp50 [Arthrobacter phage Richie] ],,YP_010655771,93.75,2.18941E-71 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, Glimmer and Genemark call it a gene. There is coding potential. /note=Where is the start site? 28,786. It has a good z score and final score. And it includes all the coding potential (GenemarkS). It was called by manual annotations 100% of the time when it was present (starterator). /note=Function? 
Hypothetical protein. No good hits on HHPred. NCBI BLAST and PhagesDB BLAST both give evidence for it to be a hypothetical protein. Everyone in PhagesDB has called this gene hypothetical when it is in this cluster. CM /note=Checked: JT, JL CDS 29121 - 29297 /gene="48" /product="gp48" /function="Hypothetical Protein" /locus tag="Windest_48" /note=Original Glimmer call @bp 29121 has strength 4.91; Genemark calls start at 29136 /note=SSC: 29121-29297 CP: no SCS: both-gl ST: NI BLAST-Start: [hypothetical protein PP639_gp051 [Arthrobacter phage Seahorse] ],,NCBI, q3:s9 96.5517% 1.26112E-24 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.609, -3.4892599424135016, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP639_gp051 [Arthrobacter phage Seahorse] ],,YP_010656237,76.5625,1.26112E-24 SIF-HHPRED: SIF-Syn: /note=Is it a gene? I don't know; I don't think it has coding potential. I was concerned when I looked at the coding potential on GeneMark. Where is the start site? The start site is at 29121 because it has a -4 gap. /note=Function? I ran the protein sequence through HHpred and got a 94% probability that it was a zinc finger protein. But on PECAAN, HHpred had a 90% probability that it was a transcriptional repressor protein (it didn't have good coverage). On PhagesDB BLAST, everyone called it hypothetical. NCBI BLAST had no good hits. CM /note=Checked: JT, JL CDS 29294 - 29485 /gene="49" /product="gp49" /function="Hypothetical Protein" /locus tag="Windest_49" /note=Original Glimmer call @bp 29294 has strength 8.37; Genemark calls start at 29294 /note=SSC: 29294-29485 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein QEO99_gp57 [Arthrobacter phage Bauer] ],,NCBI, q1:s1 100.0% 3.35427E-28 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.594, -5.62127637145861, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein QEO99_gp57 [Arthrobacter phage Bauer] ],,YP_010761350,95.1613,3.35427E-28 SIF-HHPRED: SIF-Syn: /note=Is it a gene? 
yes, called by Glimmer and Genemark. It also has coding potential (GeneMarkS). Start site? 29294 because of the -4 gap. /note=What is the function? Hypothetical protein. HHPred gave a 92% probability of hypothetical protein. NCBI BLAST showed good alignment, good coverage, and okay identity to hypothetical protein. Others on PhagesDB called it hypothetical protein (although not in the AY cluster). No phages in the same cluster have had this protein assigned yet. CM /note=Checked: JT, JL CDS 29482 - 29715 /gene="50" /product="gp50" /function="Hypothetical Protein" /locus tag="Windest_50" /note=Genemark calls start at 29482 /note=SSC: 29482-29715 CP: no SCS: genemark ST: NI BLAST-Start: [hypothetical protein SEA_ZUCKER_61 [Arthrobacter phage Zucker]],,NCBI, q33:s5 57.1429% 1.09021E-7 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.292, -6.195624325390232, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_ZUCKER_61 [Arthrobacter phage Zucker]],,UUG70020,70.0,1.09021E-7 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene with good coding potential. I chose the start site because it has a -4 overlap. GeneMark also called this start site. -KO What is the function? Hypothetical protein. HHpred gave a variety of options, but all of them had poor scores (no probability above 90%), while NCBI BLAST and Phamerator called it a hypothetical protein. There is really nothing else it could be. This is the only option with good coverage and good probability. It also had a very good e-value. 
- KO /note=Checked: JT, JL CDS 29708 - 29899 /gene="51" /product="gp51" /function="Hypothetical Protein" /locus tag="Windest_51" /note=Original Glimmer call @bp 29708 has strength 1.94 /note=SSC: 29708-29899 CP: no SCS: glimmer ST: NI BLAST-Start: [hypothetical protein SEA_RADFAD_53 [Arthrobacter phage RadFad] ],,NCBI, q3:s5 96.8254% 2.29891E-35 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.023, -2.5588120788329394, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_RADFAD_53 [Arthrobacter phage RadFad] ],,UYL86609,92.3077,2.29891E-35 SIF-HHPRED: SIF-Syn: /note=Is it a gene? I think that it is a gene. There is limited information on GeneMark, but Glimmer has a good score and a good start. This start site allowed for a good gap. What is the function? Hypothetical protein. There is 92% coverage and alignment. NCBI BLAST and PhagesDB called this with very good e-values, while HHPred called a chromatin remodeler with a probability of 11% and an e-value of 300, both of which are pretty poor. /note=Checked: I am not entirely sure that this is a gene. On Phamerator it is shown as a transmembrane protein, but it overlaps. The genes it overlaps with have better coding potential. JT /note=Checked by: JL CDS complement (29959 - 30333) /gene="52" /product="gp52" /function="IrrE-like protein" /locus tag="Windest_52" /note=Original Glimmer call @bp 30174 has strength 0.71; Genemark calls start at 30213 /note=SSC: 30333-29959 CP: no SCS: both-cs ST: NI BLAST-Start: [IrrE-like protein [Arthrobacter phage Anekin]],,NCBI, q1:s1 100.0% 2.20025E-82 GAP: 16 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.404, -7.485084544641892, no F: IrrE-like protein SIF-BLAST: ,,[IrrE-like protein [Arthrobacter phage Anekin]],,XIJ70747,98.3871,2.20025E-82 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential and is called by Glimmer Star and GeneMark. /note=What is the start site? 30213, was called 10 out of 10 times on pham. Includes most coding potential. 
/note=What is the function? Similar genes found on NCBI BLAST have been called DNA binding protein. Good coverage, good e-values. /note=Checked by CM /note= /note=AF3 predicts several alpha helices with a few amino acid spacers between them /note= /note=After looking at the cluster-specific annotation notes, it appears that we should choose an earlier start site. When I did that (choosing the best z-score and final score), there was a bit of an overlap, but the function changes to an IrrE-like protein CDS complement (30350 - 30856) /gene="53" /product="gp53" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_53" /note=Original Glimmer call @bp 30760 has strength 3.41; Genemark calls start at 30802 /note=SSC: 30856-30350 CP: no SCS: both-cs ST: NI BLAST-Start: [transcriptional regulator [Arthrobacter phage Richie] ],,NCBI, q19:s1 89.2857% 2.46772E-107 GAP: 307 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.77, -5.172915488065123, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[transcriptional regulator [Arthrobacter phage Richie] ],,YP_010655775,100.0,2.46772E-107 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential and is called by Glimmer Star and GeneMark. /note=What is the start site? 30802, called in 3 out of 4 manual annotations when present (pham). /note=What is the function? Similar genes on PhagesDB have been called HTH binding, with good e-values; the function is HTH binding domain. /note=Checked by CM /note= /note=There is coding potential that extends beyond the 30802 start site. I believe we should extend the start site to 30856 to include this coding potential. /note=Several links to DNA binding proteins and HTH DNA binding proteins. 
AF3 does predict several alpha helices separated by 3-4 amino acids CDS 31164 - 31391 /gene="54" /product="gp54" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_54" /note=Original Glimmer call @bp 31164 has strength 8.74; Genemark calls start at 31164 /note=SSC: 31164-31391 CP: no SCS: both ST: NI BLAST-Start: [helix-turn-helix DNA-binding domain protein [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 4.86019E-47 GAP: 307 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.208, -4.541358966900425, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA-binding domain protein [Arthrobacter phage Richie] ],,YP_010655776,100.0,4.86019E-47 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential and is called by Glimmer Star and GeneMark. /note=What is the start site? 31164; it has a large gap, which should be reviewed. Called 9 out of 17 times on pham; the other called start site is not present in Windest. Always called when present in the pham /note=What is the function? NCBI data points toward DNA binding domain; most of the similar genes are called DNA binding domain. /note=Checked by CM /note= /note=AF3 predicts alpha helices separated by 3-4 amino acids CDS 31396 - 31620 /gene="55" /product="gp55" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_55" /note=Original Glimmer call @bp 31411 has strength 6.8; Genemark calls start at 31396 /note=SSC: 31396-31620 CP: no SCS: both-gm ST: NI BLAST-Start: [excisionase [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 1.7484E-46 GAP: 4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.308, -7.131202241652635, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[excisionase [Arthrobacter phage Auxilium] ],,YP_010655869,100.0,1.7484E-46 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential and is called by Glimmer Star and GeneMark. /note=What is the start site? 
31396, even though it has lower values, it has a more reasonable gap and spacer and includes all the coding potential. Additionally, 31396 has more manual annotations than the only alternative start cite. /note=What is the function? Data from phages db points to DNA binding protein. /note=Checked by CM /note= /note=AF3 does predict a possible HTH DNA binding domain CDS 31661 - 31954 /gene="56" /product="gp56" /function="Hypothetical Protein" /locus tag="Windest_56" /note=Original Glimmer call @bp 31694 has strength 6.8; Genemark calls start at 31694 /note=SSC: 31661-31954 CP: no SCS: both-cs ST: NI BLAST-Start: [hypothetical protein PP636_gp48 [Arthrobacter phage Hestia] ],,NCBI, q14:s3 86.5979% 1.155E-40 GAP: 40 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.896, -5.753236859831856, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP636_gp48 [Arthrobacter phage Hestia] ],,YP_010655958,86.0465,1.155E-40 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. Both Glimmer and GeneMark have called it a gene. Both Glimmer and GeneMark report a start site of 31694, with Glimmer reporting a score of 6.8. This start site also includes all of the coding potential, it has the best Z-Score, and Final scores. /note=Checked by CM /note= /note=Extending this to the earlier start site will catch more coding potential. Since the scores are about the same, I would err on the side of including more coding potential. 
No significant hits on HHpred; only hypothetical proteins on BLAST CDS 31951 - 32187 /gene="57" /product="gp57" /function="Hypothetical Protein" /locus tag="Windest_57" /note=Original Glimmer call @bp 31951 has strength 12.75; Genemark calls start at 31951 /note=SSC: 31951-32187 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_ANEKIN_53 [Arthrobacter phage Anekin]],,NCBI, q5:s29 94.8718% 1.14263E-43 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.973, -5.785864821823211, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_ANEKIN_53 [Arthrobacter phage Anekin]],,XIJ70703,70.5882,1.14263E-43 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. It has been called by both Glimmer and GeneMark, with a high degree of confidence. The start site is at 31951. This start site has an acceptable gap, an acceptable Z- score and Final scores, and it includes all of the coding potential. -4 overlap as well /note=Checked by CM /note= /note=No significant HHpred hits; only match on NCBI BLAST is hypothetical protein CDS 32184 - 32462 /gene="58" /product="gp58" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_58" /note=Original Glimmer call @bp 32199 has strength 10.53; Genemark calls start at 32184 /note=SSC: 32184-32462 CP: no SCS: both-gm ST: NI BLAST-Start: [helix-turn-helix DNA binding domain protein [Arthrobacter phage Anekin]],,NCBI, q1:s1 100.0% 3.31952E-59 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.906, -4.905314844093283, yes F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA binding domain protein [Arthrobacter phage Anekin]],,XIJ70704,97.8261,3.31952E-59 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes it is a gene. Both Glimmer and GeneMark have called it a gene. It has a start site 32184. This start site has the best Z-Score and Final scores, the best gap, and it includes the most coding potential. 
-4 overlap as well /note=Changed function to Helix-Turn-Helix binding domain due to good evidence on NCBI and HHpred. /note=Checked by CM /note= /note=AF3 predicts a structure that would be compatible with HTH DNA binding domain CDS 32459 - 32608 /gene="59" /product="gp59" /function="Hypothetical Protein" /locus tag="Windest_59" /note=Original Glimmer call @bp 32459 has strength 9.18; Genemark calls start at 32459 /note=SSC: 32459-32608 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_MIDNIGHTRAIN_61 [Arthrobacter phage MidnightRain]],,NCBI, q1:s1 100.0% 8.70338E-27 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.134, -4.408951082954879, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_MIDNIGHTRAIN_61 [Arthrobacter phage MidnightRain]],,WNM64548,100.0,8.70338E-27 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. It has been called by both Glimmer and GeneMark. The start site is at 32459. This start codon has the best gap (-4 overlap), the best Z-score and final score, and the best coverage of the coding potential. /note=Checked by CM /note= /note=No significant hits on HHpred; almost all the hits on BLAST are hypothetical proteins. Only one was an oxidoreductase, but there is no other evidence of this function CDS 32605 - 32814 /gene="60" /product="gp60" /function="Hypothetical Protein" /locus tag="Windest_60" /note=Original Glimmer call @bp 32605 has strength 2.5; Genemark calls start at 32605 /note=SSC: 32605-32814 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_MIDNIGHTRAIN_62 [Arthrobacter phage MidnightRain]],,NCBI, q1:s1 100.0% 6.89961E-43 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.619, -4.234979229026771, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_MIDNIGHTRAIN_62 [Arthrobacter phage MidnightRain]],,WNM64549,100.0,6.89961E-43 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene with good coding potential as shown on GeneMark. 
The start site has a -4 overlap, which we have seen denote the start site 100% of the time. There is good coding potential as shown on GeneMark. What is the function? All the other genes in this pham that have been manually annotated show unknown function. Phages DB and HHPred both show some options, but these have poor E-values. NCBI BLAST shows a hypothetical protein with high coverage and probability. -KO /note=Checked by CM CDS 32811 - 32963 /gene="61" /product="gp61" /function="Hypothetical Protein" /locus tag="Windest_61" /note=Original Glimmer call @bp 32811 has strength 5.4; Genemark calls start at 32811 /note=SSC: 32811-32963 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHRANK15_60 [Arthrobacter phage Phrank15]],,NCBI, q1:s1 100.0% 3.65482E-27 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.264, -4.134629096801125, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHRANK15_60 [Arthrobacter phage Phrank15]],,XIJ70549,96.0,3.65482E-27 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene. It has a -4 overlap start site, which, when present, has always been shown to be the start site. There is good coding potential. What is the function? It is a hypothetical protein as shown on NCBI BLAST. 
When the information is then entered manually and searched in HHPred, it shows the same result, which has a really good E-value, coverage, and probability. -KO /note= /note=Checked by: JP, TH CDS 32960 - 33103 /gene="62" /product="gp62" /function="Hypothetical Protein" /locus tag="Windest_62" /note=Original Glimmer call @bp 32948 has strength 10.16; Genemark calls start at 32960 /note=SSC: 32960-33103 CP: no SCS: both-gm ST: NI BLAST-Start: [hypothetical protein SEA_PHRANK15_61 [Arthrobacter phage Phrank15]],,NCBI, q1:s1 100.0% 8.18637E-24 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.772, -5.935899670259746, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHRANK15_61 [Arthrobacter phage Phrank15]],,XIJ70550,97.8723,8.18637E-24 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. It has a -4 overlap start site and really good coding potential as shown on GeneMark. While Glimmer initially chose a different start site (32948), the -4 start is the correct one based on the data shown. What is the function? NCBI BLAST shows a hypothetical protein that is over 90% aligned, and there are very few other viable options shown by HHPred and Phages DB. HHPred shows a leucine zipper, which is poorly aligned and has a poor E-value, and Phages DB shows that others in the pham are also unknown. 
/note= /note=Checked by: JP, TH CDS 33100 - 33450 /gene="63" /product="gp63" /function="Hypothetical Protein" /locus tag="Windest_63" /note=Original Glimmer call @bp 33100 has strength 9.24; Genemark calls start at 33100 /note=SSC: 33100-33450 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein KNU08_gp63 [Gordonia phage Skysand] ],,NCBI, q51:s21 43.9655% 1.80516E-10 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.437, -3.7891620722695425, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein KNU08_gp63 [Gordonia phage Skysand] ],,YP_010098131,46.4286,1.80516E-10 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes it is a gene with good coding potential it has a negative four start site which has been shown as a guaranteed start site call. This is supported by the data found in other phages and this is supported by glimmer and genemark. What is the function? It is called an uncharacterized protein by HHPRed. Phages db function frequency shows that is 75% frequently called a DNA polymerase/primase but there is nothing shown in either HHPred, Phages DB or NCBI blast to support this call. NCBI blast shows hypothetical protein which has a great e-value score. /note= /note=Checked by: JP, TH CDS 33447 - 33593 /gene="64" /product="gp64" /function="Hypothetical Protein" /locus tag="Windest_64" /note=Original Glimmer call @bp 33447 has strength 9.9; Genemark calls start at 33447 /note=SSC: 33447-33593 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP639_gp064 [Arthrobacter phage Seahorse] ],,NCBI, q1:s1 93.75% 1.06817E-11 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.398, -5.972387497942108, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP639_gp064 [Arthrobacter phage Seahorse] ],,YP_010656250,75.0,1.06817E-11 SIF-HHPRED: SIF-Syn: /note=SS is 33447 due to the fact it is the only viable option and it has a gap of -4. GeneMark shows coding potential between 33447 and 33593. 
/note=This gene is a Hypothetical Protein due to the fact that HHpred yields no satisfactory results (E-value wise) and PhagesDB and NCBI suggest that this gene is a Hypothetical Protein. /note=HHPred has no results with close to a 90% probability. /note= /note=Checked by: JP, TH CDS 33590 - 33784 /gene="65" /product="gp65" /function="Hypothetical Protein" /locus tag="Windest_65" /note=Original Glimmer call @bp 33590 has strength 4.32; Genemark calls start at 33590 /note=SSC: 33590-33784 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_BENCHSCRAPER_62 [Arthrobacter phage BenchScraper]],,NCBI, q1:s1 100.0% 1.72817E-36 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.04, -2.583959800616441, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_BENCHSCRAPER_62 [Arthrobacter phage BenchScraper]],,XIJ69280,100.0,1.72817E-36 SIF-HHPRED: SIF-Syn: /note=SS is 33590 due primarily to the fact that its gap is -4, GeneMark shows high coding potential as well. /note=Its function is Hypothetical. NCBI and PhagesDB agree with this call, suggesting similar genes that have also been called Hypothetical. Specific functions have only been called by HHpred, but they have unsatisfactory E-value. 
/note=HHPred has no hits with a 90% probability /note= /note=Checked by: JP, TH CDS 33781 - 34005 /gene="66" /product="gp66" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_66" /note=Original Glimmer call @bp 33781 has strength 9.04; Genemark calls start at 33781 /note=SSC: 33781-34005 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp68 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 4.64645E-46 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.407, -4.602200557842397, yes F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[hypothetical protein PP634_gp68 [Arthrobacter phage Richie] ],,YP_010655789,100.0,4.64645E-46 SIF-HHPRED: SIF-Syn: /note=The SS is 33781, mainly because this SS has a gap of -4 and GeneMark shows high coding potential for this region. /note=As for function, this protein appears to be a helix-turn-helix DNA-binding domain protein. This finding is supported by both PhagesDB BLAST and NCBI BLAST. HHpred failed to yield any findings that could be used as evidence. 
/note=HHPred lists as possible matches with 90% probability that are transcriptional regulators which commonly have a helix turn helix domain (one listed a helix-turn-helix domain) /note=AlphaFold3 predicts several helices separated by a few amino acids /note= /note=Checked by: JP, TH CDS 33998 - 34495 /gene="67" /product="gp67" /function="Hypothetical Protein" /locus tag="Windest_67" /note=Original Glimmer call @bp 33998 has strength 2.02; Genemark calls start at 33998 /note=SSC: 33998-34495 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP640_gp65 [Arthrobacter phage Faja] ],,NCBI, q1:s1 100.0% 1.37956E-114 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.219, -4.2474293116543835, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP640_gp65 [Arthrobacter phage Faja] ],,YP_010656351,99.3939,1.37956E-114 SIF-HHPRED: SIF-Syn: /note=SS is 33998 due to the fact that the Z-score and final score are good and there is high coding potential between 33998-34495. While there is another option of 34469-34495 with a better z-score/final score, it has an unrealistic gap of 463 and does not include all coding potential. /note=As for function, this is a hypothetical protein due to the findings by PhagesDB and NCBI BLAST. Hhpred failed to yield a result with a satisfactory E-value or probabilities over 90%. /note= /note=Checked by: JP, TH CDS 34640 - 34810 /gene="68" /product="gp68" /function="Hypothetical Protein" /locus tag="Windest_68" /note=Original Glimmer call @bp 34640 has strength 6.46; Genemark calls start at 34640 /note=SSC: 34640-34810 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp70 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 1.8591E-32 GAP: 144 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.56, -3.5316035464369326, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp70 [Arthrobacter phage Richie] ],,YP_010655791,100.0,1.8591E-32 SIF-HHPRED: SIF-Syn: /note=Is it a gene? 
yes, both glimmer and genemark called it. /note=What is the start site? 34640, it included all the coding potential and the z score was good even though the gap was very large, but there is not another start site upstream. /note=What is the function? hypothetical protein, NCBI blast called 100% identity, alignment, and coverage. /note=HHPred did not have great coverage or probability for the non-hypothetical proteins /note= /note=Checked by: JP, TH CDS 34971 - 35789 /gene="69" /product="gp69" /function="exonuclease" /locus tag="Windest_69" /note=Original Glimmer call @bp 34971 has strength 10.44; Genemark calls start at 34971 /note=SSC: 34971-35789 CP: no SCS: both ST: NI BLAST-Start: [exonuclease [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 0.0 GAP: 160 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.629, -3.3861058366776207, yes F: exonuclease SIF-BLAST: ,,[exonuclease [Arthrobacter phage Richie] ],,YP_010655792,99.6324,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, both glimmer and genemark called it /note=What is the start site? 34971, it includes all of the coding potential and has a good z score and final score. /note=Is it a function? yes, exonuclease 99% on Hhpred and 91% coverage and 99% identity on NCBI Blast. /note= /note=Checked by: JP, TH CDS 35789 - 36343 /gene="70" /product="gp70" /function="Hypothetical Protein" /locus tag="Windest_70" /note=Original Glimmer call @bp 35789 has strength 9.29; Genemark calls start at 35789 /note=SSC: 35789-36343 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp72 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 3.65734E-131 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.372, -3.9881460756569185, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp72 [Arthrobacter phage Richie] ],,YP_010655793,100.0,3.65734E-131 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark both call the gene and it has coding potential on GenemarkS. /note=What is the start site? 
35789, it includes all coding potential. It also has a good z score and final score. There is a -1 overlap with the previous gene. /note=What is the function? hypothetical protein, did not have good coverage or probability on HHpred and had an identity of 99% and a coverage of 100% called it a hypothetical protein /note= /note=Checked by: JP, TH CDS 36336 - 36512 /gene="71" /product="gp71" /function="Hypothetical Protein" /locus tag="Windest_71" /note=Original Glimmer call @bp 36336 has strength 9.73; Genemark calls start at 36336 /note=SSC: 36336-36512 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_RAPHAELLA_70 [Arthrobacter phage Raphaella]],,NCBI, q1:s1 100.0% 2.12061E-33 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.965, -2.6013996449736907, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_RAPHAELLA_70 [Arthrobacter phage Raphaella]],,XEN13687,100.0,2.12061E-33 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, genemark and glimmer both called it and it has coding potential /note=What is the start site? 36336, it includes all of the coding potential and has a decent z score and final score /note=What is the function? hypothetical protein, Hhpred did not have good coverage or probability and NCBI blast was 100% identity as a hypothetical protein. /note= /note=Reviewed: TM, JL CDS 36509 - 37030 /gene="72" /product="gp72" /function="SSB protein" /locus tag="Windest_72" /note=Original Glimmer call @bp 36509 has strength 11.59; Genemark calls start at 36509 /note=SSC: 36509-37030 CP: no SCS: both ST: NI BLAST-Start: [SSB protein [Arthrobacter phage Raphaella]],,NCBI, q1:s1 100.0% 1.03878E-121 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.534, -3.5690131075310805, yes F: SSB protein SIF-BLAST: ,,[SSB protein [Arthrobacter phage Raphaella]],,XEN13688,98.8439,1.03878E-121 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes! It has very good coding potential and a high glimmer score. 
Glimmer and GeneMark agree on the start site which is a -4 start site, it is also called 98% of the time and has 310/487 MA. The z-score and final score are also good scores. /note=Function?: I believe it is a single-stranded DNA binding protein. HHPred gives multiple results for this that are 99% probability and the conserved domains in phamerator all suggest a single-stranded DNA binding protein. Other phages in the pham also seem to agree with this function. The official functions list requests that this be called an SSB protein. (C.A.R) /note= /note=Reviewed by: TM, JL CDS 37050 - 37451 /gene="73" /product="gp73" /function="Hypothetical Protein" /locus tag="Windest_73" /note=Original Glimmer call @bp 37050 has strength 8.81; Genemark calls start at 37050 /note=SSC: 37050-37451 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP636_gp30 [Arthrobacter phage Hestia] ],,NCBI, q1:s1 100.0% 3.25156E-89 GAP: 19 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.177, -2.156361006712914, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP636_gp30 [Arthrobacter phage Hestia] ],,YP_010655976,99.2481,3.25156E-89 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has good coding potential. However I am unable to locate the little dash that indicates the start site. However the nearest dash that can be seen is the one with the -200 or so gap which cannot be the start site. GeneMark and Glimmer agree on the start site and the start site picked has the best z-score and final score out of all of them. The start site is called 100% of the time when present but only has 3/21 MA which is not many. However Windest does not have the most annotated start which may contribute to that. /note=Function?: Most likely a transmembrane protein. HHPred shows results around 95% and 94% for some form of membrane protein and phamerator shows that it is a transmembrane protein. No phages in the pham list a particular function and NCBI also suggests that it is a membrane protein. 
A few phages in the Phagesdb BLAST suggest it could be a tape measure protein; however, it does not look long enough, and the evidence of it being a membrane protein far outweighs any evidence suggesting otherwise. (C.A.R) /note= /note=Reviewed by: TM, JL /note=- I think we should change this to a Hypothetical Protein because the membrane domain is located at the very beginning of the gene CDS 37460 - 37927 /gene="74" /product="gp74" /function="Hypothetical Protein" /locus tag="Windest_74" /note=Original Glimmer call @bp 37460 has strength 11.01; Genemark calls start at 37460 /note=SSC: 37460-37927 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP636_gp29 [Arthrobacter phage Hestia] ],,NCBI, q1:s1 100.0% 1.55763E-108 GAP: 8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.188, -2.66069824281639, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP636_gp29 [Arthrobacter phage Hestia] ],,YP_010655977,100.0,1.55763E-108 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has good coding potential. GeneMark and Glimmer agree on the start site, and it has a high Glimmer score alongside the best z-score and final score available. It also has a better gap than the earlier options. This start site does not have the most MA, with only 8/23, but it is called 100% of the time when it is found. /note=Function?: Most likely a membrane protein. HHPred shows a result with high probability for a membrane protein and two other results of high probability that do not specifically state they are membrane proteins. However, one shares 100% identity with various lipoproteins and membrane proteins. There is also an HHPred result with a lower probability, still above 90%, for a lysis protein. However, because the NCBI database also shows 100% identity, alignment, and coverage for a hypothetical protein, and given that phamerator shows it as a transmembrane protein, I am inclined to believe it is one. 
No other phages in the pham state a specific function on the phagesdb. (C.A.R) /note= /note=Reviewed: TM, JL CDS 38033 - 38356 /gene="75" /product="gp75" /function="NrdH-like glutaredoxin" /locus tag="Windest_75" /note=Original Glimmer call @bp 38033 has strength 11.47; Genemark calls start at 38033 /note=SSC: 38033-38356 CP: no SCS: both ST: NI BLAST-Start: [thioredoxin domain [Arthrobacter phage Hestia] ],,NCBI, q1:s1 100.0% 1.51395E-71 GAP: 105 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.038, -5.377288074699542, yes F: NrdH-like glutaredoxin SIF-BLAST: ,,[thioredoxin domain [Arthrobacter phage Hestia] ],,YP_010655978,99.0654,1.51395E-71 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has good coding potential. The start site chosen is the one with the best z-score and final score out of the two options, although neither option has very good scores. Glimmer and GeneMark agree, and the Glimmer score is quite high. This start site is found in a majority of the phages in the pham and is called 89.5% of the time with 16/22 MA. /note=Function?: Most likely an NrdH-like glutaredoxin. HHPred brings up multiple 99% probability results, most of which are some form of glutaredoxin or a circadian clock protein. Because a circadian rhythm sounds like an animal and insect thing and not a bacteriophage thing (I do not think bacteriophages sleep), I discounted the circadian clock protein hits. It also has conserved domains on Phamerator for NrdH-like glutaredoxin and other glutaredoxin-like proteins. NCBI also shows 99% identity for NrdH-like glutaredoxin, thioredoxin, and just regular glutaredoxin. However, aside from NCBI, I have no reason to believe it to be thioredoxin. Given the presence of NrdH in the conserved domains in phamerator, I am inclined to believe it is the NrdH-like glutaredoxin and not just glutaredoxin. 
(C.A.R) /note= /note=Reviewed: TM, JL CDS 38356 - 38586 /gene="76" /product="gp76" /function="Hypothetical Protein" /locus tag="Windest_76" /note=Original Glimmer call @bp 38356 has strength 3.29 /note=SSC: 38356-38586 CP: no SCS: glimmer ST: NI BLAST-Start: [hypothetical protein SEA_BENCHSCRAPER_74 [Arthrobacter phage BenchScraper]],,NCBI, q1:s1 100.0% 1.06558E-46 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.935, -4.9056006736197375, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_BENCHSCRAPER_74 [Arthrobacter phage BenchScraper]],,XIJ69292,100.0,1.06558E-46 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, because although GeneMark didn't call it, Glimmer did, and it does have some coding potential. Everyone else has also called it a gene. Start site? 38,356; it has been annotated most often, and it includes all the coding potential. It also has a gap of -1. Function? Hypothetical. It has been called hypothetical by almost everyone else, and there is no good evidence for anything different in HHpred. There is good evidence from NCBI that it is hypothetical. /note= /note=Reviewed: TM, JL CDS 38583 - 41111 /gene="77" /product="gp77" /function="DNA methyltransferase" /locus tag="Windest_77" /note=Original Glimmer call @bp 38583 has strength 15.93; Genemark calls start at 38583 /note=SSC: 38583-41111 CP: no SCS: both ST: NI BLAST-Start: [DNA helicase/methylase [Arthrobacter phage Hestia] ],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.273, -2.033982896655645, yes F: DNA methyltransferase SIF-BLAST: ,,[DNA helicase/methylase [Arthrobacter phage Hestia] ],,YP_010655980,99.7625,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark call it, and there is coding potential. Start site? 38,583. This is the start site because it has a -4 gap. Function? HHPred gave many 99% probability results for helicase, methyltransferase, and binding protein. 
Most others in PhagesDB have called it helicase/methylase and SeaPhages functional assignment list said methylase and transferase are equivalent terms. NCBI blast /note= /note=Reviewed: TM, JL CDS 41108 - 41497 /gene="78" /product="gp78" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_78" /note=Original Glimmer call @bp 41108 has strength 3.95; Genemark calls start at 41108 /note=SSC: 41108-41497 CP: no SCS: both ST: NI BLAST-Start: [helix-turn-helix DNA binding domain [Arthrobacter phage BillyTP]],,NCBI, q1:s1 100.0% 5.97689E-90 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.935, -4.844644247678312, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA binding domain [Arthrobacter phage BillyTP]],,XEN18673,100.0,5.97689E-90 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, Glimmer and Genemark both call it a gene, and there is coding potential (GenemarkS). Start site? 41,108 because there is a -4 gap. Functions? Helix-turn helix (check on the sea-phages assignment function list). NCBI blast has good evidence for Helix-turn helix DNA binding domain, and most everyone has called it the same on PhagesDB. In HHpred it had 99% probability for binding protein but didn`t specify helix-turn helix. No evidence that it isn`t a helix-turn helix so called it that. 
/note= /note=Reviewed: TM, JL /note= /note=AF3 predicts a structure that would fit an HTH DNA binding domain CDS 41494 - 42576 /gene="79" /product="gp79" /function="DNA polymerase III sliding clamp (Beta)" /locus tag="Windest_79" /note=Original Glimmer call @bp 41494 has strength 14.49; Genemark calls start at 41494 /note=SSC: 41494-42576 CP: no SCS: both ST: NI BLAST-Start: [DNA polymerase III sliding clamp beta [Arthrobacter phage BillyTP]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.198, -2.4634880269616195, yes F: DNA polymerase III sliding clamp (Beta) SIF-BLAST: ,,[DNA polymerase III sliding clamp beta [Arthrobacter phage BillyTP]],,XEN18674,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called a gene on Glimmer and Genemark and has coding potential (GenemarkS). Start site? 41,494 because it has a -4 gap. Function? DNA polymerase III sliding clamp (Beta). HHPred and NCBI blast both had really good evidence for this. Others on phagesDB called it this as well. /note= /note=Reviewed: TM, JL CDS 42573 - 43028 /gene="80" /product="gp80" /function="Hypothetical Protein" /locus tag="Windest_80" /note=Original Glimmer call @bp 42573 has strength 10.57; Genemark calls start at 42573 /note=SSC: 42573-43028 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_BILLYTP_79 [Arthrobacter phage BillyTP] ],,NCBI, q1:s1 100.0% 3.6819E-108 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.976, -4.677841823131603, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_BILLYTP_79 [Arthrobacter phage BillyTP] ],,XEN18675,100.0,3.6819E-108 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes it is a gene. It has been called by both glimmer and GeneMark, and it has a Glimmer score of 10.57. The start site is 42573. This has the best Z-Score and final score, and it also includes all of the coding potential. 
/note= /note=Reviewed: TM, JL /note= /note=No significant hits on HHpred; NCBI BLAST only has matches to hypothetical proteins CDS 43025 - 44491 /gene="81" /product="gp81" /function="DNA methyltransferase" /locus tag="Windest_81" /note=Original Glimmer call @bp 43025 has strength 13.69; Genemark calls start at 43025 /note=SSC: 43025-44491 CP: no SCS: both ST: NI BLAST-Start: [DNA methyltransferase [Arthrobacter phage BenchScraper]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.177, -2.2364030944336752, yes F: DNA methyltransferase SIF-BLAST: ,,[DNA methyltransferase [Arthrobacter phage BenchScraper]],,XIJ69297,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene. Both Glimmer and GeneMark have called it a gene. It has a Glimmer score of 13.69. It has a start codon at 43025. This has the best gap, best Z-score and final score, and it has the best coding potential. /note=Checked: RR /note= /note=Function: Several matches on HHpred to a DNA methyltransferase; many BLAST matches as well CDS 44509 - 44943 /gene="82" /product="gp82" /function="DNA binding protein" /locus tag="Windest_82" /note=Original Glimmer call @bp 44509 has strength 10.74; Genemark calls start at 44509 /note=SSC: 44509-44943 CP: no SCS: both ST: NI BLAST-Start: [replication initiation protein [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 2.70119E-99 GAP: 17 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.273, -2.4811409279978642, yes F: DNA binding protein SIF-BLAST: ,,[replication initiation protein [Arthrobacter phage Richie] ],,YP_010655804,100.0,2.70119E-99 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has been called a gene by both Glimmer and GeneMark, with a Glimmer score of 10.74. The start codon is at 44509. This has an acceptable z score, final score, and gap. It also includes all of the coding potential. 
/note=Checked: RR /note= /note=Only one hit above 90% on HHpred - replication initiator protein - This would be a type of DNA binding protein, which matches the majority of the hits on BLAST CDS 45177 - 45548 /gene="83" /product="gp83" /function="Hypothetical Protein" /locus tag="Windest_83" /note=Original Glimmer call @bp 45177 has strength 7.01; Genemark calls start at 45177 /note=SSC: 45177-45548 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp84 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 2.89217E-83 GAP: 233 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.944, -3.2975203404035645, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp84 [Arthrobacter phage Richie] ],,YP_010655805,100.0,2.89217E-83 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark have called it a gene; it also has a Glimmer score of 7.01. It has a start site of 45177. This site has an acceptable Z-Score and Final score. This site also includes all of the coding potential. /note=checked: RR /note= /note=No significant ORF in the gap before this gene /note=No significant hits on HHpred; matches on NCBI BLAST are to hypothetical proteins CDS 45545 - 45943 /gene="84" /product="gp84" /function="Hypothetical Protein" /locus tag="Windest_84" /note=Original Glimmer call @bp 45545 has strength 6.18; Genemark calls start at 45545 /note=SSC: 45545-45943 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp85 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 4.56787E-94 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.407, -3.8540125308361968, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp85 [Arthrobacter phage Richie] ],,YP_010655806,100.0,4.56787E-94 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by Glimmer and GeneMark. Also has coding potential. Start site? 45,545 because of the negative 4 gap. Function? Hypothetical protein. 
It had no good hits on HHPred, NCBI had really strong evidence for hypothetical protein, and everyone in the PhagesDB BLAST called it hypothetical. /note= /note=Checked: RR CDS 45943 - 46374 /gene="85" /product="gp85" /function="Hypothetical Protein" /locus tag="Windest_85" /note=Original Glimmer call @bp 45943 has strength 5.61; Genemark calls start at 45943 /note=SSC: 45943-46374 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp86 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 2.4589E-102 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.944, -2.996490344739583, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp86 [Arthrobacter phage Richie] ],,YP_010655807,100.0,2.4589E-102 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it was called by Glimmer and Genemark. Start site? At 45,943 because of the -1 gap and it includes all the coding potential. What is the function? Hypothetical protein. All others in the PhagesDB BLAST called it hypothetical, and NCBI had 100% probability and 100% coverage. HHPred doesn't have any evidence for another function. /note=Checked: RR CDS 46371 - 46865 /gene="86" /product="gp86" /function="MazG-like nucleotide pyrophosphohydrolase" /locus tag="Windest_86" /note=Original Glimmer call @bp 46371 has strength 12.95; Genemark calls start at 46371 /note=SSC: 46371-46865 CP: no SCS: both ST: NI BLAST-Start: [pyrophosphatase [Arthrobacter phage Hestia] ],,NCBI, q1:s1 100.0% 1.50419E-115 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.717, -3.201021426436999, yes F: MazG-like nucleotide pyrophosphohydrolase SIF-BLAST: ,,[pyrophosphatase [Arthrobacter phage Hestia] ],,YP_010655989,100.0,1.50419E-115 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it was called by Glimmer and Genemark and it has a really high score for Glimmer. It has coding potential (GenemarkS). Start site? 46,371 because it has a -4 gap. Function? MazG-like nucleotide pyrophosphohydrolase. 
It had 98 % probability on HHpred and it had good coverage. NCBI blast had 100% probability for this function. Many others have called it this on PhagesDB. /note=checked: RR CDS 46868 - 47047 /gene="87" /product="gp87" /function="Hypothetical Protein" /locus tag="Windest_87" /note=Original Glimmer call @bp 46868 has strength 8.25; Genemark calls start at 46868 /note=SSC: 46868-47047 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP634_gp88 [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 1.00873E-35 GAP: 2 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.901, -4.976656753876107, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP634_gp88 [Arthrobacter phage Richie] ],,YP_010655809,100.0,1.00873E-35 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, it was called by Glimmer and Genemark and it has good coding potential(GenemarkS). Start site? 46,868 because it was the start site that included all the coding potential, it had a good score on Glimmer and good z-score, final score, and it is called 100% of the time when it is present (starterator). Function? Hypothetical protein. It had no good hits on HHPred. NCBI blast had strong evidence for hypothetical protein, and others called it hypothetical. 
/note=Checked: RR CDS 47044 - 47469 /gene="88" /product="gp88" /function="RusA-like resolvase" /locus tag="Windest_88" /note=Original Glimmer call @bp 47044 has strength 9.16; Genemark calls start at 47044 /note=SSC: 47044-47469 CP: no SCS: both ST: NI BLAST-Start: [RusA-like Holliday junction resolvase [Arthrobacter phage Richie] ],,NCBI, q1:s1 100.0% 2.59913E-94 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.629, -3.368377069717189, yes F: RusA-like resolvase SIF-BLAST: ,,[RusA-like Holliday junction resolvase [Arthrobacter phage Richie] ],,YP_010655810,99.2908,2.59913E-94 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it has coding potential, Glimmer and GeneMark both call it /note=Start: 47044, Glimmer and GeneMark both call this start, It has the best Z and final Scores, it has a -4 overlap, most commonly annotated start site in starterator that is present in this gene /note=Function: several hits in both BLAST and HHpred to RusA resolvase; Conserved domain is present too /note=Checked: RR CDS 47466 - 48017 /gene="89" /product="gp89" /function="Hypothetical Protein" /locus tag="Windest_89" /note=Original Glimmer call @bp 47466 has strength 8.45; Genemark calls start at 47466 /note=SSC: 47466-48017 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_GORPY_84 [Arthrobacter phage Gorpy] ],,NCBI, q1:s1 100.0% 7.15984E-130 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.198, -2.17469248771465, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_GORPY_84 [Arthrobacter phage Gorpy] ],,UVF61046,99.4536,7.15984E-130 SIF-HHPRED: SIF-Syn: /note=Gene: yes, it has coding potential and is called by both Glimmer and GeneMark /note=Start: 47466, -4 overlap, Called by Glimmer and GeneMark, Called 100% of the time in this pham (starterator), best Z and final Scores /note=Function: hypothetical protein; no significant hits in HHpred, Only hypothetical proteins in BLAST hits; no conserved domains, no transmembrane domains /note=Checked: RR CDS 48030 
- 48695 /gene="90" /product="gp90" /function="Hypothetical Protein" /locus tag="Windest_90" /note=Original Glimmer call @bp 48030 has strength 3.96; Genemark calls start at 48042 /note=SSC: 48030-48695 CP: no SCS: both-gl ST: NI BLAST-Start: [hypothetical protein SEA_GORPY_85 [Arthrobacter phage Gorpy] ],,NCBI, q1:s1 100.0% 3.44915E-157 GAP: 12 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.284, -2.0720764396375664, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_GORPY_85 [Arthrobacter phage Gorpy] ],,UVF61047,97.7376,3.44915E-157 SIF-HHPRED: SIF-Syn: /note=Gene: yes, has coding potential and is called by Glimmer and GeneMark /note=Start: 48030, called by Glimmer but not GeneMark (GeneMark calls 48042), has the best final and Z score, includes all possible coding potential, called in all other manual annotations in this pham /note=Function: Hypothetical protein, no significant hits on HHpred, only hypothetical proteins on BLAST hits, No conserved domains, no transmembrane domains /note=Checked: RR CDS 48692 - 48928 /gene="91" /product="gp91" /function="Hypothetical Protein" /locus tag="Windest_91" /note=Original Glimmer call @bp 48692 has strength 10.66; Genemark calls start at 48692 /note=SSC: 48692-48928 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_ANEKIN_88 [Arthrobacter phage Anekin]],,NCBI, q1:s1 100.0% 1.39517E-50 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.359, -3.9531326986493704, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_ANEKIN_88 [Arthrobacter phage Anekin]],,XIJ70737,100.0,1.39517E-50 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, both glimmer and genemark called it /note=What is the start site? 48692, it includes all of the coding potential and has a good z score and final score, while also having a -4 start site. /note=Is it a function? 
On NCBI blast hypothetical protein was called 100% of the time and on Hhpred there were no good matches /note= /note=Concur, TH and KJ CDS 48912 - 49139 /gene="92" /product="gp92" /function="Hypothetical Protein" /locus tag="Windest_92" /note=Original Glimmer call @bp 48912 has strength 9.05; Genemark calls start at 48912 /note=SSC: 48912-49139 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PP638_gp13 [Arthrobacter phage Isolde] ],,NCBI, q1:s1 100.0% 2.46209E-44 GAP: -17 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.456, -3.7329837339109084, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PP638_gp13 [Arthrobacter phage Isolde] ],,YP_010656179,98.6667,2.46209E-44 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, both glimmer and genemark called it. /note=What is the start site? 48912, it includes all of the coding potential and has a good z score and final score, it also has the most manual annotations in starterator /note=Is it a function? Hypothetical protein, it was not a transmembrane protein and Hhpred did not call any of the functions. NCBI blast was 96% identity and 100% coverage for hypothetical. /note=(Concur, TH) and KJ CDS 49136 - 49510 /gene="93" /product="gp93" /function="Hypothetical Protein" /locus tag="Windest_93" /note=Original Glimmer call @bp 49136 has strength 9.77; Genemark calls start at 49136 /note=SSC: 49136-49510 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_BRAYBEAST_46 [Arthrobacter phage BrayBeast]],,NCBI, q1:s1 95.9677% 4.87501E-21 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.177, -2.6835611257758942, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_BRAYBEAST_46 [Arthrobacter phage BrayBeast]],,WXW93182,66.3717,4.87501E-21 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, both glimmer and genemark called it /note=What is the start site? 49136, it includes all of the coding potential and has a good z score and final score, while also having a -4 start site. 
/note=Is it a function? It has an 81% probability with a 95% coverage on Hhpred; so I ran it again on the actual website and it reran it as a 91% probability Trs0524 protein. When I looked on Phamerator it did not say transmembrane. /note=(Concur, TH) and KJ CDS 49507 - 50121 /gene="94" /product="gp94" /function="helix-turn-helix DNA binding domain" /locus tag="Windest_94" /note=Original Glimmer call @bp 49507 has strength 11.5; Genemark calls start at 49507 /note=SSC: 49507-50121 CP: no SCS: both ST: NI BLAST-Start: [helix-turn-helix DNA binding domain protein [Arthrobacter phage EvePickles]],,NCBI, q4:s2 98.5294% 2.41114E-129 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.407, -3.914968956777623, yes F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA binding domain protein [Arthrobacter phage EvePickles]],,UYL88379,91.133,2.41114E-129 SIF-HHPRED: SIF-Syn: /note=Is it a gene? yes, both glimmer and genemark called it /note=What is the start site? 49507, it includes all of the coding potential and has a good z score and final score, while also having a -4 start site. /note=Is it a function? It has a 96% probability with a 29% coverage on Hhpred; so I ran it again on the actual website and it wasn't any clearer. The final decision was between an excise and a helix-turn-helix because those had 96% probability. 
/note=The coverage is not great either; I don't lean one way or the other. -TH /note=KJ- appears to be helix-turn-helix from Phages DB and NCBI blast /note= /note=AF3 predicts a structure that would be compatible with a HTH DNA binding domain CDS 50675 - 50812 /gene="95" /product="gp95" /function="Hypothetical Protein" /locus tag="Windest_95" /note=Original Glimmer call @bp 50675 has strength 12.45; Genemark calls start at 50675 /note=SSC: 50675-50812 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_GORPY_90 [Arthrobacter phage Gorpy] ],,NCBI, q1:s1 100.0% 6.05801E-25 GAP: 553 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.491, -3.6776057178969577, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_GORPY_90 [Arthrobacter phage Gorpy] ],,UVF61052,100.0,6.05801E-25 SIF-HHPRED: SIF-Syn: /note=Is a gene; adequate coding potential. Coding potential lines up with the start site suggested by Glimmer and GeneMark. Final and Z scores are good enough--other listed start sites are terrible. Not able to call function. /note=Concur, TH, KJ /note=No significant hits on HHpred; only hypothetical proteins on NCBI BLAST CDS 50825 - 50986 /gene="96" /product="gp96" /function="Hypothetical Protein" /locus tag="Windest_96" /note=Original Glimmer call @bp 50825 has strength 6.7; Genemark calls start at 50825 /note=SSC: 50825-50986 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_GORPY_91 [Arthrobacter phage Gorpy] ],,NCBI, q1:s1 100.0% 1.35115E-31 GAP: 12 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.723, -3.170856472917156, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_GORPY_91 [Arthrobacter phage Gorpy] ],,UVF61053,100.0,1.35115E-31 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. Both Genemark and Glimmer have chosen the same start site, which has good coding potential. It also encompasses all coding potential available. There is an ATPase at 93% probability, but it is from a bacterial species. 
No other functions can be considered. /note= /note=(Concur, TH) and KJ CDS 50997 - 51383 /gene="97" /product="gp97" /function="metalloprotease" /locus tag="Windest_97" /note=Original Glimmer call @bp 50997 has strength 9.0; Genemark calls start at 50997 /note=SSC: 50997-51383 CP: no SCS: both ST: NI BLAST-Start: [metalloprotease [Arthrobacter phage Gorpy] ],,NCBI, q1:s1 100.0% 5.74677E-89 GAP: 10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.723, -3.635743271219807, yes F: metalloprotease SIF-BLAST: ,,[metalloprotease [Arthrobacter phage Gorpy] ],,UVF61054,100.0,5.74677E-89 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. All of the coding potential is encased in the section selected. Glimmer and Genemark both chose the same start site. There is a good z score and a good final score. What is the function? The function is metalloprotease. Phages DB showed multiple phages in our pham which had called the same function with a really good e-value. The sequence was long enough to further validate their findings. This call was also supported by NCBI blast which called the same function with 100% alignment, 100% coverage, and a very good e-value score. HHPred called a hypothetical protein which was 50% aligned with a mid e-value. 
It also looked at a smaller piece of genetic information than both NCBI and Phages DB, which is why I decided to go with the more overwhelming evidence offered by NCBI and Phages DB as well as the strong correlation between our gene and others in our pham.-KO, TH, KJ CDS 51370 - 51690 /gene="98" /product="gp98" /function="HNH endonuclease" /locus tag="Windest_98" /note= /note=SSC: 51370-51690 CP: no SCS: neither ST: NI BLAST-Start: [HNH endonuclease [Arthrobacter phage Auxilium] ],,NCBI, q1:s1 100.0% 3.44473E-62 GAP: -14 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.158, -7.446192415156777, no F: HNH endonuclease SIF-BLAST: ,,[HNH endonuclease [Arthrobacter phage Auxilium] ],,YP_010655912,91.5094,3.44473E-62 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, not much coding potential, Neither Glimmer nor GeneMark called it but this gene is present in many phages of the AY cluster and shows good hits to functional proteins /note=Start: 51370, a bit of an overlap with the previous gene but it is necessary to get the entirety of the protein as a good portion of the matches is in the first part of the protein /note=Function: HNH endonuclease, several matches in HHpred and BLAST to this function, Cluster annotation notes /note=Concur, TH, KJ