CDS 604 - 975 /gene="1" /product="gp1" /function="Hypothetical Protein" /locus tag="Bouchard_1" /note=Original Glimmer call @bp 604 has strength 5.91; Genemark calls start at 604 /note=SSC: 604-975 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_1 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 2.2223E-83 GAP: 0 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -2.0720764396375664, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_1 [Arthrobacter phage Tokki] ],,UGL63230,100.0,2.2223E-83 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. Although the Glimmer score is not the highest, it does have the best z-score and final score out of all of the options. Glimmer and GeneMark also agree on the start site. This gene is only present in 34% of the phages in the pham, but is called 100% of the time and has 48/165 MA. /note= /note=Function: Most likely a hypothetical protein. The HHPred results did not give anything above 90%; however, the NCBI results showed 100% identity, alignment, and coverage for a hypothetical protein. (C.A.R) /note= /note=Checked: TM /note=Checked gap before this gene for a reverse gene. It did not produce any matches to other genes. Not a gene. Checked gap in the forward direction. It did not produce any significant hits to another gene. /note=Checked by: JL CDS 968 - 1375 /gene="2" /product="gp2" /function="HNH endonuclease" /locus tag="Bouchard_2" /note=Original Glimmer call @bp 962 has strength 5.73; Genemark calls start at 968 /note=SSC: 968-1375 CP: no SCS: both-gm ST: NI BLAST-Start: [HNH endonuclease [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 7.49261E-97 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.276, -4.0662565164291395, no F: HNH endonuclease SIF-BLAST: ,,[HNH endonuclease [Arthrobacter phage Tokki]],,UGL63231,100.0,7.49261E-97 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. 
I ended up changing the gene start site to agree with the pick GeneMark had chosen, as it has 71 MA compared to Glimmer's choice, which only has 7 MA. However, I am wondering if the black upside-down triangle is indicating the start site GeneMark had chosen, and if that is the case Glimmer's choice may be better despite the larger gap, as it would cover the full coding potential. /note= /note=Function: Most likely an HNH endonuclease. There were multiple HHPred results above 90% that suggested this, and most members of the pham have also chosen this function. There is also an NCBI result that matches 100% on everything for this particular function. The official functions list does have specific requirements to be checked, requiring an HNH over a 30 aa span, which is present. (C.A.R) /note= /note=Checked: TM, JL CDS 1397 - 2491 /gene="3" /product="gp3" /function="endolysin" /locus tag="Bouchard_3" /note=Original Glimmer call @bp 1397 has strength 8.77; Genemark calls start at 1397 /note=SSC: 1397-2491 CP: no SCS: both ST: NI BLAST-Start: [endolysin [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 21 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.049, -2.5052746077145835, yes F: endolysin SIF-BLAST: ,,[endolysin [Arthrobacter phage Phaby]],,XJP08305,98.9011,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. GeneMark and Glimmer agree on the start site and the Glimmer score is good. It has the best z-score and final score. The gene is called 100% of the time and has 11/18 MA. /note= /note=Function: Most likely endolysin. In Phamerator there are conserved domains for this gene and HHPred brings up quite a few results for endolysin. It could also potentially be lysin A and should most likely be looked at a second time if a lysin B is found later in this genome. 
(C.A.R) /note=checked by TM, JL CDS 2488 - 2661 /gene="4" /product="gp4" /function="membrane protein" /locus tag="Bouchard_4" /note=Original Glimmer call @bp 2488 has strength 3.57; Genemark calls start at 2488 /note=SSC: 2488-2661 CP: no SCS: both ST: NI BLAST-Start: [membrane protein [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 2.47535E-30 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.112, -6.5721535979387395, no F: membrane protein SIF-BLAST: ,,[membrane protein [Arthrobacter phage Phaby]],,XJP08306,100.0,2.47535E-30 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Most likely as there is a transmembrane domain, and therefore it most likely has a function despite the strange looking coding potential. The glimmer score is low, but it is a -4 start site. The z-score and final score are also not the best out of all the options, but once again it is a -4 start site so it will not be changed unless someone with more knowledge says it would be best to do so. /note= /note=Function: The DNA sequence is quite short but I believe it is most likely still long enough. NCBI blast shows a 100% match in most everything for a membrane protein like I had suspected, however I will not make a final call until HHPred is finished running. There are no HHPred results above 90% and all phages in the pham from what I can tell have put it as function unknown. There is a transmembrane domain as well. 
(C.A.R), JL, TM CDS 2654 - 3025 /gene="5" /product="gp5" /function="Hypothetical Protein" /locus tag="Bouchard_5" /note=Original Glimmer call @bp 2654 has strength 5.63; Genemark calls start at 2654 /note=SSC: 2654-3025 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_5 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 1.13757E-83 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.424, -4.107110848202932, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_5 [Arthrobacter phage Phaby]],,XJP08307,99.187,1.13757E-83 SIF-HHPRED: SIF-Syn: /note=The SS for this gene is 2654 due to the fact that both Glimmer and GeneMark call this as the SS, and there is high coding potential shown on GeneMark. /note=As for the function, it is a Hypothetical Protein. This is mainly because both NCBI BLAST and PhagesDB BLAST provide many similar genes as such. HHpred failed to yield any satisfactory results. /note=TM /note= /note=Checked: C.A.R, JL CDS 2998 - 3504 /gene="6" /product="gp6" /function="Hypothetical Protein" /locus tag="Bouchard_6" /note=Original Glimmer call @bp 2998 has strength 10.74; Genemark calls start at 2998 /note=SSC: 2998-3504 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_LILHUDDY_6 [Arthrobacter phage LilHuddy] ],,NCBI, q1:s1 100.0% 2.35743E-119 GAP: -28 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.895, -3.6727816321281046, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_LILHUDDY_6 [Arthrobacter phage LilHuddy] ],,UYL88201,100.0,2.35743E-119 SIF-HHPRED: SIF-Syn: /note=The start site is at 2998 mainly due to the fact that GeneMark and Glimmer agree with a 10 Glimmer Score. There is high coding potential in this area as shown on GeneMark. Even though this includes significant overlap with the previous gene the start site seems likely due to the coding potential overlapping as well. /note=As for the function, it is a Hypothetical Protein based off of findings on PhagesDB and NCBI BLAST. 
Phamerator had no evidence for suspecting this was a transmembrane protein. /note=TM /note= /note=Checked: C.A.R, JL CDS 3521 - 5263 /gene="7" /product="gp7" /function="terminase" /locus tag="Bouchard_7" /note=Original Glimmer call @bp 3521 has strength 11.29; Genemark calls start at 3521 /note=SSC: 3521-5263 CP: no SCS: both ST: NI BLAST-Start: [terminase [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 0.0 GAP: 16 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.282, -2.0949393225970705, yes F: terminase SIF-BLAST: ,,[terminase [Arthrobacter phage Tokki]],,UGL63236,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=The SS for this gene is 3521 due to the fact that both Glimmer and GeneMark called this as the SS. The Glimmer score is 11.29, so Glimmer is quite confident in its call. GeneMark shows high coding potential for this region as well. /note=As for the function, this protein is a terminase. This is supported by PhagesDB, HHpred, and NCBI BLAST. /note=TM /note= /note=Checked: C.A.R, JL CDS 5260 - 6111 /gene="8" /product="gp8" /function="Hypothetical Protein" /locus tag="Bouchard_8" /note=Original Glimmer call @bp 5260 has strength 13.36; Genemark calls start at 5260 /note=SSC: 5260-6111 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_8 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -2.0720764396375664, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_8 [Arthrobacter phage Phaby]],,XJP08310,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=The SS for this gene is at 5260 due to the fact that both Glimmer and GeneMark agreed that this was the SS and the Glimmer score is 13.36. GeneMark shows high coding potential in this region. Phamerator agrees with this call, with 62 manual annotations for this SS. /note=As for function, this is a Hypothetical Protein. 
Besides PhagesDB, there were no other sources that advocated for a function for this protein (at least none with a satisfactory E-value). /note=TM /note= /note=Checked: C.A.R, JL CDS 6108 - 6686 /gene="9" /product="gp9" /function="Hypothetical Protein" /locus tag="Bouchard_9" /note=Original Glimmer call @bp 6108 has strength 9.67; Genemark calls start at 6108 /note=SSC: 6108-6686 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_9 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 2.00932E-133 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.12, -4.473132615753581, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_9 [Arthrobacter phage Phaby]],,XJP08311,100.0,2.00932E-133 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential called by Glimmer Star and GeneMark. /note=What is the start site? 6108, a -4 start site. 29 out of 29 manual annotations have called it (pham). /note=What is the function? Hypothetical protein; all aligned proteins from NCBI and PhagesDB with excellent to average E-values point to a hypothetical protein. /note= /note=Checked: C.A.R, JL CDS 6683 - 7234 /gene="10" /product="gp10" /function="acetyltransferase" /locus tag="Bouchard_10" /note=Original Glimmer call @bp 6683 has strength 11.51; Genemark calls start at 6683 /note=SSC: 6683-7234 CP: no SCS: both ST: NI BLAST-Start: [acetyltransferase [Arthrobacter phage Shepard]],,NCBI, q1:s1 100.0% 4.17344E-131 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.142, -4.426679849915311, no F: acetyltransferase SIF-BLAST: ,,[acetyltransferase [Arthrobacter phage Shepard]],,QFG13620,100.0,4.17344E-131 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential called by Glimmer Star and GeneMark. /note=What is the start site? 6683, a -4 start site, 12 of 17 MAs on Phamerator. /note=What is the function? 
Acetyltransferase; aligned proteins with excellent to average E-values from NCBI, PhagesDB, and HHPred all point to acetyltransferase. /note= /note=Checked: C.A.R, JL CDS 7258 - 7728 /gene="11" /product="gp11" /function="Hypothetical Protein" /locus tag="Bouchard_11" /note=Original Glimmer call @bp 7258 has strength 14.82; Genemark calls start at 7258 /note=SSC: 7258-7728 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_11 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 1.98093E-108 GAP: 23 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.91, -4.9157003279346805, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_11 [Arthrobacter phage Phaby]],,XJP08313,100.0,1.98093E-108 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential called by Glimmer Star and GeneMark. /note=What is the start site? 7258, called second most in the pham; the most called start is not present in Bouchard. Includes all coding potential. /note=What is the function? All aligned proteins with excellent to average E-values point to hypothetical protein, supported by PhagesDB and NCBI. /note= /note=Checked: C.A.R, TM CDS 7767 - 9098 /gene="12" /product="gp12" /function="portal protein" /locus tag="Bouchard_12" /note=Original Glimmer call @bp 7767 has strength 6.75; Genemark calls start at 7767 /note=SSC: 7767-9098 CP: no SCS: both ST: NI BLAST-Start: [portal protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 0.0 GAP: 38 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.128, -2.338663114094478, yes F: portal protein SIF-BLAST: ,,[portal protein [Arthrobacter phage Tokki]],,UGL63241,99.5485,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes. Called by Glimmer and GeneMark and has coding potential. /note=What is the start site? 7767, the only option with a reasonable gap. Called in 60 out of 167 MAs; the other most called start is not present in Bouchard. /note=What is the function? 
Portal protein, very strongly supported by NCBI BLAST, with multiple aligned proteins with E-values of 0. /note= /note=Checked: C.A.R, TM CDS 9100 - 9834 /gene="13" /product="gp13" /function="Hypothetical Protein" /locus tag="Bouchard_13" /note=Original Glimmer call @bp 9100 has strength 9.61; Genemark calls start at 9100 /note=SSC: 9100-9834 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_13 [Arthrobacter phage Phaby]],,NCBI, q2:s1 99.5902% 2.78002E-172 GAP: 1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.141, -4.700217548281525, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_13 [Arthrobacter phage Phaby]],,XJP08315,100.0,2.78002E-172 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by Glimmer and GeneMark, and it has really good coding potential. Start site? 9100. This start site has the most manual annotations and it includes all coding potential. Function? Hypothetical. It had no good hits on HHpred, it had strong evidence on NCBI BLAST that it was a hypothetical protein, and everyone else on PhagesDB called it a hypothetical protein as well. /note= /note=Checked: C.A.R, TM CDS 9856 - 12018 /gene="14" /product="gp14" /function="major capsid and protease fusion protein" /locus tag="Bouchard_14" /note=Original Glimmer call @bp 9856 has strength 8.9; Genemark calls start at 9856 /note=SSC: 9856-12018 CP: no SCS: both ST: NI BLAST-Start: [major capsid and protease fusion protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 0.0 GAP: 21 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.185, -3.7627423718235193, yes F: major capsid and protease fusion protein SIF-BLAST: ,,[major capsid and protease fusion protein [Arthrobacter phage Tokki]],,UGL63243,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark called it one, and it has coding potential. Start site? 9856, because it includes all the coding potential. It also has good z-scores and final scores. Function? 
Major Capsid and Protease Fusion protein. HHpred had good evidence for major capsid protein. It has great identity, alignment, and coverage with other genes that were called Major Capsid and Protease Fusion protein. /note= /note=Checked: C.A.R, TM CDS 12030 - 12362 /gene="15" /product="gp15" /function="Hypothetical Protein" /locus tag="Bouchard_15" /note=Original Glimmer call @bp 12030 has strength 5.12; Genemark calls start at 12030 /note=SSC: 12030-12362 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_15 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 4.72291E-73 GAP: 11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.765, -3.1011115863108, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_15 [Arthrobacter phage Shepard] ],,QFG13625,100.0,4.72291E-73 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark call it, and it has good coding potential. Start site? 12030, because it includes all the coding potential and it has good z-scores and final scores. It has been manually annotated most of the time it is called (Starterator). Function? Hypothetical Protein. It had no good hits on HHpred, everyone else on PhagesDB called it hypothetical, and NCBI BLAST had excellent coverage, alignment, and identity for a hypothetical protein. /note= /note=Checked: C.A.R, TM CDS 12340 - 12636 /gene="16" /product="gp16" /function="tail terminator" /locus tag="Bouchard_16" /note=Original Glimmer call @bp 12340 has strength 6.01 /note=SSC: 12340-12636 CP: no SCS: glimmer ST: NI BLAST-Start: [tail terminator [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 3.07384E-65 GAP: -23 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.876, -4.905249357754755, no F: tail terminator SIF-BLAST: ,,[tail terminator [Arthrobacter phage Phaby]],,XJP08318,100.0,3.07384E-65 SIF-HHPRED: SIF-Syn: /note=Is it a gene? It was called by Glimmer but not by GeneMark. But it does have some coding potential and everyone else has called it a gene. Start site? 12340. 
I think it includes the coding potential? Out of 131 manual annotations it has been called the start site every time but one. Function? Tail terminator. HHpred had good hits for tail terminator, and NCBI also had great identity, alignment, and coverage for it. Others have called it a tail terminator as well (PhagesDB). /note= /note=Checked: C.A.R, TM CDS 12648 - 13517 /gene="17" /product="gp17" /function="major tail protein" /locus tag="Bouchard_17" /note=Original Glimmer call @bp 12648 has strength 8.95; Genemark calls start at 12648 /note=SSC: 12648-13517 CP: no SCS: both ST: NI BLAST-Start: [major tail protein [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 0.0 GAP: 11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.185, -2.2364030944336752, yes F: major tail protein SIF-BLAST: ,,[major tail protein [Arthrobacter phage Tokki] ],,UGL63246,99.654,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark called it. /note=What is the start site? 12648, which includes all of the coding potential and has a good gap, z-score, and final score. /note=What is the function? Major tail protein. Coverage on HHpred was not great, but the probability was good both there and on NCBI BLAST. /note= /note=Checked: C.A.R, TM CDS 13543 - 14100 /gene="18" /product="gp18" /function="Hypothetical Protein" /locus tag="Bouchard_18" /note=Original Glimmer call @bp 13543 has strength 5.95; Genemark calls start at 13543 /note=SSC: 13543-14100 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_18 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 1.24727E-131 GAP: 25 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.282, -2.0949393225970705, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_18 [Arthrobacter phage Phaby]],,XJP08320,100.0,1.24727E-131 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark called it. /note=What is the start site? 13543, which includes all of the coding potential. /note=What is the function? 
Hypothetical protein; it has good alignment, identity, and coverage on NCBI. Nothing else has good matches. /note= /note=Checked: TM, C.A.R. CDS 14116 - 14916 /gene="19" /product="gp19" /function="major tail protein" /locus tag="Bouchard_19" /note=Original Glimmer call @bp 14116 has strength 10.65; Genemark calls start at 14116 /note=SSC: 14116-14916 CP: no SCS: both ST: NI BLAST-Start: [major tail protein [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 15 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.282, -1.953940808934884, yes F: major tail protein SIF-BLAST: ,,[major tail protein [Arthrobacter phage Phaby]],,XJP08321,99.6241,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark called it. /note=What is the start site? 14116; it includes all of the coding potential. /note=What is the function? Major tail protein had good results and it was not a membrane protein. /note= /note=Checked: C.A.R, TM CDS 14932 - 15852 /gene="20" /product="gp20" /function="Hypothetical Protein" /locus tag="Bouchard_20" /note=Original Glimmer call @bp 14932 has strength 12.33; Genemark calls start at 14932 /note=SSC: 14932-15852 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_20 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 15 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.871, -2.8793565916978188, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_20 [Arthrobacter phage Phaby]],,XJP08322,96.732,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, both Glimmer and GeneMark called it. /note=What is the start site? 14932, which includes all of the coding potential. /note=What is the function? Listed as membrane protein on Phamerator, but the domain is a signal domain, not a transmembrane domain, so that makes this a hypothetical protein. 
/note= /note=Checked: C.A.R, TM CDS 15852 - 16163 /gene="21" /product="gp21" /function="Hypothetical Protein" /locus tag="Bouchard_21" /note=Original Glimmer call @bp 15852 has strength 9.31; Genemark calls start at 15852 /note=SSC: 15852-16163 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_21 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 1.56727E-60 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.408, -4.696756665456976, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_21 [Arthrobacter phage Tokki]],,UGL63250,98.0583,1.56727E-60 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. It has a high Glimmer score, has 18/18 MA, and is called 100% of the time in the phages in this pham. It also has the best z-score and final score. /note= /note=Function: Most likely a hypothetical protein, as despite the HHPred results suggesting a tail assembly chaperone, there does not appear to be enough evidence to call this with certainty. Based on the NCBI results, other members of the pham all seem to call this gene a hypothetical protein. I have put the HHPred results below in case a reviewer with more expertise wants to look into it further. /note=Tail fiber chaperone; fiber, VIRUS, VIRAL PROTEIN;{uncultured cyanophage} /note=Probability 98.71% E-value: 9.5E-7 Score: 53.12 SS: 9.4 Aligned cols: 78 Target length: 162 /note=U1_BPMU Tail fiber assembly protein U OS=Escherichia phage Mu OX=10677 GN=U PE=1 SV=1 /note=Probability 92.57% E-value: 0.41 Score: 29.81 SS: 2.8 Aligned cols: 54 Target length: 175 /note=All others are under 90% but still mention a tail assembly protein of some form. /note=No significant NCBI or PhagesDB evidence though. Most either call this gene something else that I see no evidence for in Bouchard, or call it a hypothetical protein. 
/note= /note=The closest to a "slippery sequence" found was TTTGGGA starting at 15907. However, I have found better looking ones in other genes that I did not think were tail assembly chaperones. This was also not one of the sequences given on the list in the bioinformatics guide, but I am having a hard time ignoring the HHPred results. (C.A.R) /note= /note=I think this is the best call for now. The one HHpred hit is hard to ignore though. /note= /note=Checked by: JP, TM CDS 16178 - 17881 /gene="22" /product="gp22" /function="minor tail protein" /locus tag="Bouchard_22" /note=Original Glimmer call @bp 16178 has strength 4.9; Genemark calls start at 16178 /note=SSC: 16178-17881 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 14 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.196, -2.213540211474171, yes F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Phaby]],,XJP08324,78.3831,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. It has the best final score and z-score. Glimmer and GeneMark agree, but the Glimmer score is quite low. This gene, however, has 53/54 MA and is called 100% of the time when it is present. /note= /note=Function: Could very likely be either an esterase domain or a minor tail protein. The HHPred results outside of the PECAAN HHPred give a 99.11% probability with an E-value of 6.9E-8 for a hydrolase-like esterase domain. However, it does have a conserved domain for R1 and R2 pyocins, which appear to be related to minor tail proteins. Most others in the pham have also labeled this as a minor tail protein. I believe a second opinion may be beneficial before making a final decision. /note=The evidence seems to better support a minor tail protein, as the esterase does not have the best coverage on HHPred and there is only one hit for an esterase domain. 
(C.A.R) /note= /note=Checked by: JP, TM CDS 17892 - 18848 /gene="23" /product="gp23" /function="minor tail protein" /locus tag="Bouchard_23" /note=Original Glimmer call @bp 17892 has strength 7.22; Genemark calls start at 17892 /note=SSC: 17892-18848 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 1.08518E-105 GAP: 10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -1.9310779259753799, yes F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Shepard] ],,QFG13633,66.5625,1.08518E-105 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. It has the best final score and z-score out of all the options, and GeneMark and Glimmer agree. The Glimmer score is good. However, it only has 5/99 MA, and Bouchard is listed on Starterator twice, as calling the most annotated site and also simultaneously as not having the most annotated site. It appears as though the Starterator reports for genes 23 and 24 are combined, which explains why Bouchard is listed twice. It appears to be in mostly AU2 phages, which may somewhat explain the GeneMarkS output looking like the two genes should be connected in some way. /note=May be worth mentioning that both gene 23 and gene 24 contain a TTTGGGA sequence. Not entirely sure if that is important or not. /note= /note=Function: HHPred brings up good results for a major binding protein. However, that is not an option listed in the official functions list. It once again has a conserved domain for R1 and R2 pyocins, and based on fellow members of the pham I am inclined to believe it may be appropriate to call this a minor tail protein. 
(C.A.R) /note= /note=Checked by: JP, TM CDS 18859 - 19707 /gene="24" /product="gp24" /function="minor tail protein" /locus tag="Bouchard_24" /note=Original Glimmer call @bp 18859 has strength 7.79; Genemark calls start at 18859 /note=SSC: 18859-19707 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Zeina]],,NCBI, q1:s1 100.0% 1.14999E-97 GAP: 10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.396, -3.8761054743559344, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Zeina]],,UQS94694,67.474,1.14999E-97 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. GeneMark and Glimmer agree and the Glimmer score is good. It also has a good final score and z-score. It has 49/99 MA and is called 100% of the time. /note= /note=Function: Once again HHPred is bringing up results very similar to those of gene 23 for major structural proteins. It also has a conserved domain for R1 and R2 pyocins, and based on the other members of the pham and this conserved domain I am inclined to also call this a minor tail protein. (C.A.R) /note= /note=Minor tail protein appears the best call. This protein does seem like it may have some part in receptor binding, which should involve the tail. /note= /note=Checked by: JP, TM CDS 19716 - 19850 /gene="25" /product="gp25" /function="hypothetical protein" /locus tag="Bouchard_25" /note=Original Glimmer call @bp 19716 has strength 8.0 /note=SSC: 19716-19850 CP: no SCS: glimmer ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_25 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 1.32289E-22 GAP: 8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.152, -4.4057176022952405, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_25 [Arthrobacter phage Tokki]],,UGL63254,100.0,1.32289E-22 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes. Glimmer and genemark both called it. The Glimmer score is 8.0. The start site is 19716, as it is also the only option. 
There is coding potential being shown. /note=The protein is a hypothetical protein. NCBI BLAST and Phagesdb BLAST both call it a hypothetical protein and the E-values are satisfactory. /note=CH, TM /note= /note=Checked: C.A.R CDS 19873 - 20256 /gene="26" /product="gp26" /function="Hypothetical Protein" /locus tag="Bouchard_26" /note=Original Glimmer call @bp 19873 has strength 6.12; Genemark calls start at 19873 /note=SSC: 19873-20256 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 1.41243E-85 GAP: 22 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.385, -3.8366550365551095, yes F: Hypothetical Protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Tokki]],,UGL63255,99.2126,1.41243E-85 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes. Glimmer and Genemark both call it. The glimmer score is satisfactory. The start site is 19873. The function is hypothetical protein; there were two times it was considered a minor tail protein but not enough information to entertain it. /note=CH, TM /note= /note=Checked: I will agree with the function as HHPred seems to give a tie between major capsid, minor tail, and a hypothetical protein. 
-C.A.R CDS 20253 - 24839 /gene="27" /product="gp27" /function="tape measure protein" /locus tag="Bouchard_27" /note=Original Glimmer call @bp 20253 has strength 7.9; Genemark calls start at 20253 /note=SSC: 20253-24839 CP: no SCS: both ST: NI BLAST-Start: [tape measure protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.915, -4.966271270034708, no F: tape measure protein SIF-BLAST: ,,[tape measure protein [Arthrobacter phage Tokki]],,UGL63256,98.822,0.0 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it has coding potential, called by both Glimmer and GeneMark /note=Start: 20253, -4 overlap, called by Glimmer and GeneMark, best z and final scores, has the most manually annotated start site of those present in this gene (starterator) /note=Function: tape measure protein; Multiple hits on BLAST and HHpred /note= /note=Checked: C.A.R, TM CDS 24829 - 25647 /gene="28" /product="gp28" /function="minor tail protein" /locus tag="Bouchard_28" /note=Original Glimmer call @bp 24829 has strength 8.94; Genemark calls start at 24829 /note=SSC: 24829-25647 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: -11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.541, -4.813835316181514, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Phaby]],,XJP08330,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Gene: yes, it has coding potential and is called by both Glimmer and GeneMark /note=Start: 24829, Called by both Glimmer and GeneMark, has the best Z and Final score, includes all coding potential, Most commonly manual annotated start site in starterator /note=Function: minor tail protein, several hits in HHpred and BLAST, Conserved domain of a phage tail protein /note= /note=Checked: C.A.R, TM CDS 25647 - 26810 /gene="29" /product="gp29" /function="minor tail protein" /locus tag="Bouchard_29" /note=Original Glimmer call @bp 25647 has strength 8.63; Genemark 
calls start at 25647 /note=SSC: 25647-26810 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.603, -5.542591178556752, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Phaby]],,XJP08331,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=The Start Site of this gene is 25647 mainly due to the fact that it has an acceptable gap coupled with a satisfactory Z-score/Final score. There is High coding potential throughout this region as well. Glimmer and GeneMark both agree on this SS. /note=As for function, this protein is a minor tail protein. This is mainly supported by Phages DB BLAST and NCBI BLAST. HHpred provided results that correspond with this call. /note=TM /note= /note=Checked: C.A.R CDS 26832 - 27071 /gene="30" /product="gp30" /function="Hypothetical Protein" /locus tag="Bouchard_30" /note=Original Glimmer call @bp 26832 has strength 13.31; Genemark calls start at 26832 /note=SSC: 26832-27071 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_30 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 2.1381E-49 GAP: 21 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -2.0720764396375664, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_30 [Arthrobacter phage Tokki] ],,UGL63259,100.0,2.1381E-49 SIF-HHPRED: SIF-Syn: /note=The start site for this gene is at 26832. Both Glimmer and GeneMark called this as the SS w/a Glimmer Score of 13.31, and GeneMark shows high coding potential there. TM /note=As for functionality, this seems to be a Hypothetical protein. Phamerator/NCBI/Phage DB BLAST support this call. 
/note=TM /note= /note=Checked: C.A.R, JP CDS 27071 - 27862 /gene="31" /product="gp31" /function="minor tail protein" /locus tag="Bouchard_31" /note=Original Glimmer call @bp 27071 has strength 3.4; Genemark calls start at 27071 /note=SSC: 27071-27862 CP: no SCS: both ST: NI BLAST-Start: [minor tail protein [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.319, -4.056692446522387, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Arthrobacter phage Tokki] ],,UGL63260,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=SS for this gene is 27071 since the gap accompanied by the z-score/final score is the best option. Glimmer and GeneMark both agree on this call. GeneMark shows high coding potential in this region. /note=As for function, this is a Minor Tail Protein. This call is supported by Phamerator, Phages DB/NCBI BLAST. HHpred provided some different results but didn`t provide any solid conclusions. /note=TM /note=Checked: JT, JL CDS 27859 - 28302 /gene="32" /product="gp32" /function="membrane protein" /locus tag="Bouchard_32" /note=Original Glimmer call @bp 28147 has strength 2.43; Genemark calls start at 27859 /note=SSC: 27859-28302 CP: no SCS: both-gm ST: NI BLAST-Start: [membrane protein [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 7.75523E-98 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.975, -5.049507594900835, no F: membrane protein SIF-BLAST: ,,[membrane protein [Arthrobacter phage Tokki] ],,UGL63261,100.0,7.75523E-98 SIF-HHPRED: SIF-Syn: /note=The start site for this gene is 27859 due to the fact that the gap is -4. Glimmer disagrees with this call, but GeneMark agrees with this decision and shows high coding potential as well. /note=As for functionality, this gene is a membrane protein due to the fact that Phamerator shows two transmembrane domains within this gene. 
/note=TM /note=Checked: JT, JL CDS 28306 - 28629 /gene="33" /product="gp33" /function="membrane protein" /locus tag="Bouchard_33" /note=Original Glimmer call @bp 28306 has strength 7.87; Genemark calls start at 28306 /note=SSC: 28306-28629 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_33 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 1.94498E-68 GAP: 3 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.871, -2.8170432709374893, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_33 [Arthrobacter phage Tokki] ],,UGL63262,100.0,1.94498E-68 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it is a gene, called by glimmer and genemark and has coding potential. /note=Start Site: 28306, has the best z score and final score, and also has the most reasonable gap compared to other options. It has 35 MAs out of 154; the other most called start sites are not present in Bouchard. /note=Function: All similar genes show hypothetical protein, however according to phamerator, this gene does contain a transmembrane domain. /note=Checked: JT, TM, JL CDS 28632 - 29036 /gene="34" /product="gp34" /function="membrane protein" /locus tag="Bouchard_34" /note=Original Glimmer call @bp 28632 has strength 3.61; Genemark calls start at 28632 /note=SSC: 28632-29036 CP: no SCS: both ST: NI BLAST-Start: [holin [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 4.42852E-92 GAP: 2 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.217, -2.1695583717155773, yes F: membrane protein SIF-BLAST: ,,[holin [Arthrobacter phage Phaby]],,XJP08336,100.0,4.42852E-92 SIF-HHPRED: SIF-Syn: /note=Gene: Called by glimmer and genemark, and has coding potential. /note=Start Site: 28632, has the best z score and final score, as well as the most reasonable gap. 39 out of 105 MA`s; the other most called start site is not present in Bouchard. /note=Function: Holin. Phaby has the exact same gene and is in the same cluster as Bouchard; other similar genes have been called holin as well as membrane protein. 
I went with holin because of Phaby and its identical gene and because it is in the same cluster as Bouchard. On Phamerator it was revealed that transmembrane domains are present, so it may be a membrane protein. /note= /note=Checked: I believe it should be a membrane protein because the SEA-PHAGES functional assignments list says that if there are multiple possibilities for a holin gene, they should be called membrane proteins. JT /note=-Checked both original and checked version TM, JL CDS 29096 - 29392 /gene="35" /product="gp35" /function="Hypothetical Protein" /locus tag="Bouchard_35" /note=Original Glimmer call @bp 29096 has strength 7.54; Genemark calls start at 29096 /note=SSC: 29096-29392 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_35 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 1.37298E-63 GAP: 59 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.282, -1.953940808934884, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_35 [Arthrobacter phage Shepard] ],,QFG13645,100.0,1.37298E-63 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, called by glimmer and genemark and has coding potential. /note=Start Site: 29096, best z score and final score, has the most reasonable gap. 159 of 167 MA`s on Pham. /note=Function: Hypothetical Protein, all similar genes from phages db and NCBI are called hypothetical protein. 
/note= /note=Checked: JT, JL CDS 29563 - 29781 /gene="36" /product="gp36" /function="Hypothetical Protein" /locus tag="Bouchard_36" /note=Original Glimmer call @bp 29563 has strength 13.11; Genemark calls start at 29563 /note=SSC: 29563-29781 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_38 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 2.76841E-44 GAP: 170 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -1.9310779259753799, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_38 [Arthrobacter phage Shepard] ],,QFG13647,100.0,2.76841E-44 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, called by glimmer and genemark, and has coding potential. /note=Start Site: 29563, has best z score and overall score, and has the smallest gap. 6 out of 6 MA`s on Pham. This gap is large enough that it should be checked for possible genes. /note=Function: Hypothetical Protein, all similar genes on Phages db and NCBI blast have been called hypothetical. /note= /note=Checked: JL, JT /note= /note=Gap in front of this gene checked with no significant hits in HHpred and no hits at all in NCBI BLAST CDS 29839 - 30159 /gene="37" /product="gp37" /function="Hypothetical Protein" /locus tag="Bouchard_37" /note=Original Glimmer call @bp 29872 has strength 9.45; Genemark calls start at 29872 /note=SSC: 29839-30159 CP: no SCS: both-cs ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_39 [Arthrobacter phage Shepard] ],,NCBI, q12:s1 89.6226% 4.89496E-61 GAP: 57 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.315, -4.064123888439605, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_39 [Arthrobacter phage Shepard] ],,QFG13648,100.0,4.89496E-61 SIF-HHPRED: SIF-Syn: /note=Gene: Called by glimmer and genemark, and has coding potential. /note=Start Site: 29872, has best z score and final score, and best gap to z score/final score ratio. Called 7 times on Pham, 5 out of 7 MA`s; the other most called start site is not present in Bouchard. 
There is coding potential that extends beyond this start site, I think the earlier start site makes more sense /note=Function: All similar proteins with excellent to average e scores have been called hypothetical proteins, except for one outlier with decent coverage and e score which called it a membrane protein. I viewed this as an exception and went with hypothetical protein. /note= /note=Checked: JL, JT CDS 30248 - 30442 /gene="38" /product="gp38" /function="Hypothetical Protein" /locus tag="Bouchard_38" /note=Original Glimmer call @bp 30248 has strength 8.35; Genemark calls start at 30248 /note=SSC: 30248-30442 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_40 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 3.52099E-34 GAP: 88 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -1.993391246735709, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_40 [Arthrobacter phage Shepard] ],,QFG13649,96.875,3.52099E-34 SIF-HHPRED: SIF-Syn: /note=Gene: yes, called by glimmer and genemark, and has coding potential. /note=Start: 30248, smallest gap best z score. 26 out 26 MA`s /note=Function: Hypothetical Protein, all similar proteins with excellent to average e scores point to hypothetical proteins. /note=Checked:JL, JT, TM CDS 30556 - 30792 /gene="39" /product="gp39" /function="Hypothetical Protein" /locus tag="Bouchard_39" /note=Original Glimmer call @bp 30556 has strength 8.45; Genemark calls start at 30556 /note=SSC: 30556-30792 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_41 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 2.48941E-48 GAP: 113 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.293, -1.9310779259753799, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_41 [Arthrobacter phage Shepard] ],,QFG13650,100.0,2.48941E-48 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, called by glimmer and genemark, and has coding potential. 
/note=Start Site: 30556, relatively large gap but smaller compared to other start site options. 65 of 99 MA`s on pham. /note=Function: Most likely a hypothetical protein; all similar proteins have been called hypothetical except for Tokki, and I don`t think this is enough evidence to overturn. /note=Checked: JL, JT /note= /note=Gap in front of this gene checked. No significant hits on HHpred and no hits at all in NCBI BLAST CDS 31234 - 31995 /gene="40" /product="gp40" /function="membrane protein" /locus tag="Bouchard_40" /note=Original Glimmer call @bp 31234 has strength 8.8; Genemark calls start at 31234 /note=SSC: 31234-31995 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_41 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 441 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.282, -2.0162541296952132, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_41 [Arthrobacter phage Phaby]],,XJP08342,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, called by glimmer and genemark, and has coding potential. /note=Start Site: 31234, smallest gap compared to other options, most annotated start site on Pham. /note=Function: All data points to hypothetical protein, however phamerator has indicated a transmembrane domain, so I will call it a membrane protein. /note=Checked: JL, JT CDS 32020 - 32466 /gene="41" /product="gp41" /function="Hypothetical Protein" /locus tag="Bouchard_41" /note=Original Glimmer call @bp 32020 has strength 7.27; Genemark calls start at 32020 /note=SSC: 32020-32466 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_45 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 5.03957E-91 GAP: 24 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.982, -4.684633435032512, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_45 [Arthrobacter phage Shepard] ],,QFG13652,93.2432,5.03957E-91 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it has coding potential and was called by glimmer and genemark. 
/note=Start Site: 32020, smallest gap, best z score and overall score, and has 38 of 52 MA`s on pham. /note=Function: All similar proteins have been called hypothetical with little to no exception. /note=Checked by CM, C.A.R CDS 32501 - 32713 /gene="42" /product="gp42" /function="Hypothetical Protein" /locus tag="Bouchard_42" /note=Original Glimmer call @bp 32501 has strength 12.74; Genemark calls start at 32501 /note=SSC: 32501-32713 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_46 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 3.77775E-41 GAP: 34 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.185, -2.2364030944336752, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_46 [Arthrobacter phage Shepard] ],,QFG13653,100.0,3.77775E-41 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, has coding potential and is called by glimmer and genemark. /note=Start Site: 32501. The two best options are 32501 and 32486. While 32486 does have a smaller gap, its z score and final score are worse than those of 32501. Additionally, 32501 has a gap of 34, which is not terrible. /note=Function: Hypothetical protein, all similar proteins are called hypothetical with little to no exception. /note=Checked by CM, C.A.R CDS 32788 - 33081 /gene="43" /product="gp43" /function="membrane protein" /locus tag="Bouchard_43" /note=Original Glimmer call @bp 32788 has strength 4.92; Genemark calls start at 32860 /note=SSC: 32788-33081 CP: no SCS: both-gl ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_44 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 1.17272E-62 GAP: 74 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.396, -3.8137921535956054, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_44 [Arthrobacter phage Phaby]],,XJP08345,95.8763,1.17272E-62 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. It has been called by both Glimmer and GeneMark. 
It has a start codon at 32788. This codon has the best gap, includes the most coding potential, and also has the best z score and final score. /note=Checked by CM, C.A.R /note= /note=Function: It has a transmembrane domain. CDS 33214 - 33864 /gene="44" /product="gp44" /function="Hypothetical Protein" /locus tag="Bouchard_44" /note=Original Glimmer call @bp 33214 has strength 15.67; Genemark calls start at 33214 /note=SSC: 33214-33864 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_46 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 3.75116E-152 GAP: 132 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.206, -2.17469248771465, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_46 [Arthrobacter phage Phaby]],,XJP08347,100.0,3.75116E-152 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes. It is a gene. It has been called by both Glimmer and GeneMark. Glimmer reported a high score of 15.67. The start codon is at 33214. This start codon has the best gap, the best final score, the best z score, and it includes all of the coding potential. Checked by CM, C.A.R /note= /note=Function: No significant hits in HHpred. Only hypothetical proteins in BLAST. No membrane domain CDS 33902 - 34177 /gene="45" /product="gp45" /function="Hypothetical Protein" /locus tag="Bouchard_45" /note=Original Glimmer call @bp 33902 has strength 8.5; Genemark calls start at 33902 /note=SSC: 33902-34177 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_49 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 6.78086E-60 GAP: 37 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.217, -4.190250106893126, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_49 [Arthrobacter phage Tokki]],,UGL63274,100.0,6.78086E-60 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes it is a gene. It has been called by both Glimmer and GeneMark. It has a start codon at 33902. 
This has the best Z score, best final score, the best gap, and it includes all of the coding potential. /note=Checked by CM, C.A.R /note=Function: No significant hits on HHpred or BLAST other than to hypothetical proteins. No transmembrane domain CDS 34278 - 34577 /gene="46" /product="gp46" /function="Hypothetical Protein" /locus tag="Bouchard_46" /note=Original Glimmer call @bp 34278 has strength 6.06; Genemark calls start at 34278 /note=SSC: 34278-34577 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_50 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 3.99256E-61 GAP: 100 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.754, -3.2026596621721617, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_50 [Arthrobacter phage Tokki]],,UGL63275,100.0,3.99256E-61 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. It has been called by both Glimmer and GeneMark. It has a start codon at 34278. This start codon has the best gap, the best final and Z-score, and it includes all of the coding potential. /note=Checked by CM, C.A.R /note= /note=Function: No significant hits on HHpred; only hypothetical proteins on BLAST; no transmembrane domain. CDS 34638 - 34778 /gene="47" /product="gp47" /function="Hypothetical Protein" /locus tag="Bouchard_47" /note=Original Glimmer call @bp 34638 has strength 9.82; Genemark calls start at 34638 /note=SSC: 34638-34778 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_50 [Arthrobacter phage Shepard]],,NCBI, q1:s1 100.0% 1.80655E-23 GAP: 60 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.396, -3.893834241316366, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_50 [Arthrobacter phage Shepard]],,QFG13657,100.0,1.80655E-23 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes it is a gene. It has been called by both Glimmer and GeneMark. It has a start codon at 34638. This has the best gap, the best z score and final score, and covers the most coding potential. 
/note=Checked by CM, TM /note= /note=No 90% matches on HHpred. No matches to anything other than hypothetical proteins on NCBI BLAST. JP CDS 34781 - 34894 /gene="48" /product="gp48" /function="membrane protein" /locus tag="Bouchard_48" /note=Original Glimmer call @bp 34781 has strength 10.61; Genemark calls start at 34781 /note=SSC: 34781-34894 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_51 [Arthrobacter phage Shepard]],,NCBI, q1:s1 100.0% 6.00239E-16 GAP: 2 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.952, -2.707694805492614, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_51 [Arthrobacter phage Shepard]],,QFG13658,100.0,6.00239E-16 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes it is a gene. Both Genemark and Glimmer have chosen the same start site which has good genetic potential. It also encompasses all genetic potential available. What is the function? The function is hypothetical protein. This has a good e-value score and good coverage as shown on NCBI blast this is also supported by other Phages in our cluster.-ko /note=Checked by CM, C.A.R /note= /note=There is a transmembrane domain. changed to membrane protein. JP CDS 34919 - 35596 /gene="49" /product="gp49" /function="Hypothetical Protein" /locus tag="Bouchard_49" /note=Original Glimmer call @bp 34919 has strength 11.87; Genemark calls start at 34919 /note=SSC: 34919-35596 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_52 [Arthrobacter phage Shepard]],,NCBI, q1:s1 100.0% 1.3478E-162 GAP: 24 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.3, -5.117443738164335, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_52 [Arthrobacter phage Shepard]],,QFG13659,100.0,1.3478E-162 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes this a gene. Looking at the genemark gene map it shows that this start site has included almost all of the pertinent gene information. What is the function? 
The function is a hypothetical protein as shown on HHpred as well as phamerator and NCBI blast.-ko /note=Checked by CM, TM CDS 35605 - 35772 /gene="50" /product="gp50" /function="Hypothetical Protein" /locus tag="Bouchard_50" /note=Original Glimmer call @bp 35605 has strength 2.94 /note=SSC: 35605-35772 CP: no SCS: glimmer ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_55 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 3.68488E-31 GAP: 8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.152, -4.4057176022952405, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_55 [Arthrobacter phage Tokki] ],,UGL63280,100.0,3.68488E-31 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene with all the genetic coding information. There is also a really good z score. There is also a really good gap. I think that is why this is the best start site to be chosen after looking at the start sites and then genemark. what is the function? The function is hypothetical protein. This is supported by other phages in our cluster as well as HHpred. This is also supported by NCBI blast which also shows great coverage and a great e-value score.-ko /note=Checked by CM, TM CDS 35783 - 35917 /gene="51" /product="gp51" /function="Hypothetical Protein" /locus tag="Bouchard_51" /note=Original Glimmer call @bp 35783 has strength 7.65; Genemark calls start at 35783 /note=SSC: 35783-35917 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_54 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 3.49938E-23 GAP: 10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.152, -4.85287563363746, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_54 [Arthrobacter phage Shepard] ],,QFG13661,100.0,3.49938E-23 SIF-HHPRED: SIF-Syn: /note=Is it a gene? The gene has a really good z score. It also has a really good gap. 
Looking at the genemark map there is some genetic information that is not included in this start site but I think that this start site is the best because the majority of the genetic information was included in this. What is the function? Again Hypothetical protein because of the e-value score and also because of what was called by others in our pham. There was another option but it had poor coverage and a poor score.-ko /note= /note=This start site includes all coding potential. It is manually annotated 100% of the time when present (starterator). /note=No HHPred hit with 90% probability or greater. No transmembrane domain. All matches via BLAST are to hypothetical proteins. /note= /note=Checked by: JP, C.A.R CDS 35941 - 36846 /gene="52" /product="gp52" /function="Hypothetical Protein" /locus tag="Bouchard_52" /note=Original Glimmer call @bp 35941 has strength 13.29; Genemark calls start at 35941 /note=SSC: 35941-36846 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_55 [Arthrobacter phage Phaby]],,NCBI, q1:s7 100.0% 0.0 GAP: 23 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.049, -2.794070146961553, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_55 [Arthrobacter phage Phaby]],,XJP08356,97.3941,0.0 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes it is a gene. All the coding potential is included. There is a good z score and there is also a good score from both glimmer and genemark. What is the function? The function is hypothetical protein. There was limited data from NCBI blast. I ran the blasts manually but the data was still inconclusive and the best e-value score was for a hypothetical protein.-ko /note= /note=This start site includes the coding potential. The coding potential may extend a little further so it is possible that the start site could be further upstream, but this site has better Z and final scores. It is also the most frequently annotated start site (that is present in this gene). 
The only alignment with >90% match on HHPred was to a domain of unknown function. /note= /note=Checked by: JP, C.A.R CDS 37040 - 37258 /gene="53" /product="gp53" /function="Hypothetical Protein" /locus tag="Bouchard_53" /note=Original Glimmer call @bp 37040 has strength 10.82; Genemark calls start at 37040 /note=SSC: 37040-37258 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_56 [Arthrobacter phage Phaby]],,NCBI, q1:s23 100.0% 1.05661E-43 GAP: 193 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.128, -2.2763497933341483, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_56 [Arthrobacter phage Phaby]],,XJP08357,76.5957,1.05661E-43 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene. There was good coding potential which was included in the start site that was chosen. There was a good z-score and a good score from both genemark and glimmer. Looking at the Genemark map and at other phages in our pham it was the best start site. What is the function? The function is hypothetical protein. Looking at all the data there is some low coverage for this choice but it had the best e-value score overall.-ko /note= /note=No high probability matches on HHPred. Only good matches via BLAST were hypothetical proteins. /note= /note=Checked by: JP, TM CDS 37255 - 38007 /gene="54" /product="gp54" /function="hypothetical protein" /locus tag="Bouchard_54" /note=Original Glimmer call @bp 37255 has strength 13.4; Genemark calls start at 37255 /note=SSC: 37255-38007 CP: no SCS: both ST: NI BLAST-Start: [membrane protein [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 6.82437E-178 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.818, -5.554893073213532, no F: hypothetical protein SIF-BLAST: ,,[membrane protein [Arthrobacter phage Tokki] ],,UGL63284,99.6,6.82437E-178 SIF-HHPRED: SIF-Syn: /note=Is this a gene? Yes, it is a gene. The gap at this start site is -4, which strongly favors this start site. What is the function? 
There were several options listed such as an uncharacterized protein but this had a low e-value score. I looked through PhagesDB as well as HHpred, which showed either an uncharacterized protein or a hypothetical protein. NCBI blast said membrane protein.-ko /note= /note=There is coding potential, and this start site picks up almost all of the coding potential but there is not another available start site upstream. No hits with 90% probability or greater on HHPred (and best hits were to proteins of unknown function). /note= /note=Checked by: JP, C.A.R CDS 38011 - 38568 /gene="55" /product="gp55" /function="Hypothetical Protein" /locus tag="Bouchard_55" /note=Original Glimmer call @bp 38011 has strength 7.51; Genemark calls start at 38011 /note=SSC: 38011-38568 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_61 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 6.44358E-133 GAP: 3 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.185, -4.336903751127196, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_61 [Arthrobacter phage Tokki]],,UGL63285,99.4595,6.44358E-133 SIF-HHPRED: SIF-Syn: /note=The Start Site for this gene is at 38011. Both GeneMark and Glimmer agree with this call (Glimmer Score 7.51) and there is high coding potential for this region as shown by GeneMark. /note=As for function, this is a Hypothetical Protein. This call is supported by Phamerator, NCBI BLAST, and Phages db BLAST. /note= /note=Starterator data - this is the only start site with manual annotation present in this gene. No significant hit on HHPred. 
/note= /note=Checked by: JP, C.A.R CDS 38592 - 42455 /gene="56" /product="gp56" /function="DNA primase/polymerase" /locus tag="Bouchard_56" /note=Original Glimmer call @bp 38592 has strength 7.94; Genemark calls start at 38592 /note=SSC: 38592-42455 CP: no SCS: both ST: NI BLAST-Start: [DNA primase/polymerase [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 23 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.388, -3.8308929311341546, no F: DNA primase/polymerase SIF-BLAST: ,,[DNA primase/polymerase [Arthrobacter phage Phaby]],,XJP08360,99.8446,0.0 SIF-HHPRED: SIF-Syn: /note=The Start Site for this gene is 38592 due to Glimmer and GeneMark agreeing with this call (Glimmer Score 7.94) and there is high coding potential throughout this region. /note=As for function, it is a DNA Primase/Polymerase. This finding is supported by Phamerator, Phages DB BLAST, HHPred, and NCBI BLAST. Practically every database agreed on this function, though some E-values in some of the databases were unacceptable. The continuous trend with the other acceptable evidence made this function very likely. /note=TM /note= /note=Start site chosen was the most commonly manually annotated start site on starterator. 
/note= /note=Checked by: JP, C.A.R CDS 42421 - 42756 /gene="57" /product="gp57" /function="Hypothetical Protein" /locus tag="Bouchard_57" /note=Original Glimmer call @bp 42559 has strength 5.23; Genemark calls start at 42421 /note=SSC: 42421-42756 CP: no SCS: both-gm ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_60 [Arthrobacter phage Shepard]],,NCBI, q10:s3 36.036% 1.33027E-15 GAP: -35 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.977, -4.835643960141637, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_60 [Arthrobacter phage Shepard]],,QFG13667,32.7434,1.33027E-15 SIF-HHPRED: SIF-Syn: /note=The Start Site for this gene is at 42559 because the z-score/final score is the highest of the options and there is high coding potential shown for this region. /note=As for function, this is a Hypothetical protein since there is no evidence to support a call for a specific function. PhagesDB, NCBI, and HHpred didn`t yield any usable results. /note=TM /note=Discussed start site in class; we decided to move to the earlier start site (42421) /note=Checked by: JP CDS 42753 - 42968 /gene="58" /product="gp58" /function="Hypothetical Protein" /locus tag="Bouchard_58" /note=Genemark calls start at 42753 /note=SSC: 42753-42968 CP: no SCS: genemark ST: NI BLAST-Start: [hypothetical protein SEA_LILHUDDY_59 [Arthrobacter phage LilHuddy]],,NCBI, q2:s3 98.5916% 2.12447E-38 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.799, -3.7960205838585104, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_LILHUDDY_59 [Arthrobacter phage LilHuddy]],,UYL88252,93.1507,2.12447E-38 SIF-HHPRED: SIF-Syn: /note=The Start Site for this gene is at 42753. This is supported by GeneMark, has a high z-score/final score combo, and has a gap of -4. /note=As for function, this gene is a hypothetical protein. This is supported by NCBI BLAST, PhagesDB BLAST, and Phamerator. 
/note=TM /note=Start: includes all coding potential as well; Starterator shows this start has been called 100% of the time in the pham /note= /note=Checked by: JP, C.A.R CDS 42979 - 43143 /gene="59" /product="gp59" /function="Hypothetical Protein" /locus tag="Bouchard_59" /note=Original Glimmer call @bp 42979 has strength 7.68; Genemark calls start at 42979 /note=SSC: 42979-43143 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_61 [Arthrobacter phage Shepard] ],,NCBI, q1:s1 100.0% 9.31417E-32 GAP: 10 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.656, -6.1964086969435535, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_61 [Arthrobacter phage Shepard] ],,QFG13668,100.0,9.31417E-32 SIF-HHPRED: SIF-Syn: /note=The Start Site for this gene is 42979 due to the fact that both Glimmer and GeneMark agree on this call and GeneMark shows high coding potential for this region. /note=As for function, this is a Hypothetical Protein. NCBI, Phages DB, and Phamerator all support this call and no transmembrane domains were detected by Phamerator. /note=TM /note= /note=Checked by: JP, C.A.R CDS 43121 - 43276 /gene="60" /product="gp60" /function="Hypothetical Protein" /locus tag="Bouchard_60" /note=Original Glimmer call @bp 43121 has strength 9.29; Genemark calls start at 43121 /note=SSC: 43121-43276 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_65 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 1.15254E-25 GAP: -23 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.049, -2.5052746077145835, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_65 [Arthrobacter phage Tokki] ],,UGL63288,100.0,1.15254E-25 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Most likely yes as it has coding potential. I am going to agree with GeneMark and Glimmer on the start site as it has a better final score of the options and 7 MA and a high glimmer score. 
However, the earlier start site appears to cover the full coding potential in GeneMark, but it has no MA and there is not sufficient evidence that would lead me to believe that I should switch the start site. Also the gap is much worse on the earlier start site. /note= /note=Function: HHPred did not provide any results above 70%, but it is a 100% match on NCBI blast with other AU2 phages for a hypothetical protein. It has no conserved or transmembrane domains from what I could find. (C.A.R) /note= /note=Reviewed: TM CDS 43277 - 43501 /gene="61" /product="gp61" /function="Hypothetical Protein" /locus tag="Bouchard_61" /note=Original Glimmer call @bp 43277 has strength 7.46; Genemark calls start at 43277 /note=SSC: 43277-43501 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_66 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 6.11476E-46 GAP: 0 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.185, -2.6835611257758942, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_66 [Arthrobacter phage Tokki] ],,UGL63289,100.0,6.11476E-46 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential and it has the best final and z-score. It also has a good glimmer score and Glimmer and GeneMark agree. It also has 29/43 MA and is called 100% of the time in the pham. /note= /note=Function: HHPred did not provide any results that were at least 90%, however even the HHPred results suggest an "unknown function" which aligns with the NCBI results which have a 100% match to other AU2 phages Tokki and Phaby. 
(C.A.R) /note= /note=Reviewed: TM, JL CDS 43508 - 43810 /gene="62" /product="gp62" /function="Hypothetical Protein" /locus tag="Bouchard_62" /note=Original Glimmer call @bp 43508 has strength 9.59; Genemark calls start at 43508 /note=SSC: 43508-43810 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_67 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 2.10336E-58 GAP: 6 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.049, -3.544192673744953, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_67 [Arthrobacter phage Tokki]],,UGL63290,90.566,2.10336E-58 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. It has the best final and z-score of all the start sites and a good glimmer score. GeneMark and Glimmer also agree on the start site. It only has 7/43 MA but this start site is only found in 7 phages in the pham and Bouchard does not have the most called start site in the pham. It is also called 100% of the time when present. /note= /note=Function: Most likely a hypothetical protein. HHPred does bring up one result above 90% for a Prohead Protease, however it is nowhere to be found in the official functions list. In the NCBI database many of the phages that matched Bouchard most closely called this gene a hypothetical protein as well. 
(C.A.R) /note= /note=Reviewed: TM, JL CDS 43810 - 44001 /gene="63" /product="gp63" /function="Hypothetical Protein" /locus tag="Bouchard_63" /note=Original Glimmer call @bp 43828 has strength 4.0; Genemark calls start at 43828 /note=SSC: 43810-44001 CP: no SCS: both-cs ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_65 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 4.81488E-36 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.069, -6.600588204360394, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_65 [Arthrobacter phage Phaby]],,XJP08366,100.0,4.81488E-36 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: I believe so, it has coding potential, however there is a very dramatic dip in the coding potential that is a bit suspicious. The glimmer score is quite low and the original start site does not appear to cover all of the coding potential to me. The original start site was called by both GeneMark and Glimmer, but it only has 1 MA compared to the other start site which has 5 MA. this start site is also favored based on the -1 overlap /note= /note=Function: HHPred did not give any results above 60%. However the NCBI database shows a good match to another AU2 phage for a hypothetical protein. (C.A.R) /note= /note=Reviewed: TM, JL CDS 44008 - 44604 /gene="64" /product="gp64" /function="SSB protein" /locus tag="Bouchard_64" /note=Original Glimmer call @bp 44008 has strength 15.13; Genemark calls start at 44008 /note=SSC: 44008-44604 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_66 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 9.9278E-141 GAP: 6 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.562, -4.116905853494032, yes F: SSB protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_66 [Arthrobacter phage Phaby]],,XJP08367,100.0,9.9278E-141 SIF-HHPRED: SIF-Syn: /note=Is it a gene?: Yes, it has coding potential. It also has the best z-score and final score alongside a very good glimmer score. 
Glimmer and GeneMark also agree on the start site. It only has 5 MA; however, Bouchard does not have the most annotated site and this gene is only present in 22 phages. It is only called 36.4% of the time, but no other start site is a viable option and this gene certainly has a function. /note= /note=Function: Most likely an ssDNA binding protein. There were multiple HHPred results above 90% for this function. However, in the NCBI database Bouchard and Phaby match 100%, but Phaby has called this gene a hypothetical protein. The HHPred on PECAAN does not show nearly as many results for an ssDNA binding protein; I prefer, however, using the HHPred website outside of PECAAN, which shows around 6 results for an ssDNA binding protein with good E-values and all with a 99% probability. Therefore I would like to argue for this function. In the official functions list, it is requested that this be called SSB protein instead of an ssDNA binding protein. I did not see any evidence that it was a DprA-like ssDNA binding protein either, as I only saw evidence for a regular ssDNA binding protein. (C.A.R) /note= /note=Reviewed: TM, JL CDS 44672 - 44965 /gene="65" /product="gp65" /function="Hypothetical Protein" /locus tag="Bouchard_65" /note=Original Glimmer call @bp 44672 has strength 5.54; Genemark calls start at 44672 /note=SSC: 44672-44965 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_70 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 1.16494E-64 GAP: 67 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.396, -3.8137921535956054, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_70 [Arthrobacter phage Tokki] ],,UGL63293,100.0,1.16494E-64 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by Glimmer and Genemark, and it has coding potential. Start site? 44,672, it includes all the coding potential and it is the only option for a start site. Function? Hypothetical protein.
HHpred said it could be PDGYG protein, and when I tried to find out what that was, I found out that it really is still hypothetical. NCBI blast had great coverage and alignment for hypothetical protein. Everyone else has called it hypothetical (PhagesDB). /note=Checked: RR, JL CDS 44943 - 45233 /gene="66" /product="gp66" /function="Hypothetical Protein" /locus tag="Bouchard_66" /note=Original Glimmer call @bp 45006 has strength 2.65; Genemark calls start at 44943 /note=SSC: 44943-45233 CP: no SCS: both-gm ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_71 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 3.56387E-61 GAP: -23 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.206, -2.1924212546750814, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_71 [Arthrobacter phage Tokki]],,UGL63294,100.0,3.56387E-61 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by Glimmer and Genemark, and there is coding potential. Start site? 44,943. It includes all coding potential, it has been manually annotated many times, and it has a good z-score and final score. Function? Hypothetical protein. It had no good matches on HHpred, and nothing came up on NCBI blast. Everyone else called the similar genes hypothetical. /note=Checked: JL, RR CDS 45269 - 45454 /gene="67" /product="gp67" /function="Hypothetical Protein" /locus tag="Bouchard_67" /note=Original Glimmer call @bp 45269 has strength 3.7; Genemark calls start at 45269 /note=SSC: 45269-45454 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_72 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 1.6169E-34 GAP: 35 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.293, -2.0111200136961407, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_72 [Arthrobacter phage Tokki]],,UGL63295,100.0,1.6169E-34 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it was called by Glimmer and Genemark and it has coding potential. Start site?
45,269 because it includes all coding potential, and has a good z-score and final score. Function? Hypothetical protein. NCBI blast had great coverage and alignment for this, and others have called it hypothetical. HHpred had no good hits. /note=Checked: JL, RR CDS 45451 - 45738 /gene="68" /product="gp68" /function="Hypothetical Protein" /locus tag="Bouchard_68" /note=Original Glimmer call @bp 45490 has strength 4.11; Genemark calls start at 45451 /note=SSC: 45451-45738 CP: no SCS: both-gm ST: NI BLAST-Start: [membrane protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 2.82196E-63 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.13, -4.513234975742373, no F: Hypothetical Protein SIF-BLAST: ,,[membrane protein [Arthrobacter phage Tokki]],,UGL63296,100.0,2.82196E-63 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by Glimmer and Genemark, also has coding potential. Start site? 45,451, because of the -4 gap. Function? membrane protein. NCBI blast and HHpred both have evidence for membrane protein. Also on Phamerator it has transmembrane domains. /note=Checked: JL, RR /note= /note=Signal domain - not a membrane protein CDS 45738 - 46451 /gene="69" /product="gp69" /function="membrane protein" /locus tag="Bouchard_69" /note=Original Glimmer call @bp 45735 has strength 4.96; Genemark calls start at 45738 /note=SSC: 45738-46451 CP: no SCS: both-gm ST: NI BLAST-Start: [membrane protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 3.79862E-173 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.659, -3.4030855957987485, yes F: membrane protein SIF-BLAST: ,,[membrane protein [Arthrobacter phage Tokki]],,UGL63297,98.7342,3.79862E-173 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, called by Glimmer and Genemark and has coding potential. Start site? 45,738. The candidate start sites had a -4 and a -1 gap, so I picked the -1 gap start site. Function? membrane protein. It had multiple hits on HHPred for this. NCBI and Phamerator both gave evidence for membrane protein.
There were also hits on HHpred for pnuc-like nicotinamide riboside transporter. I didn`t have enough evidence to put that function, so I stuck with membrane protein. /note=Checked: JL, RR CDS 46435 - 46707 /gene="70" /product="gp70" /function="Hypothetical Protein" /locus tag="Bouchard_70" /note=Original Glimmer call @bp 46435 has strength 2.8; Genemark calls start at 46435 /note=SSC: 46435-46707 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_75 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 1.42117E-59 GAP: -17 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.86, -2.979545903895001, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_75 [Arthrobacter phage Tokki]],,UGL63298,100.0,1.42117E-59 SIF-HHPRED: SIF-Syn: /note=Gene? Yes, Glimmer and genemark called it, and it has coding potential. Start site? 46,435. It includes all the coding potential and has a good z-score and final score. Function? Hypothetical protein. It had no good hits on HHPred, others have called it hypothetical, and NCBI had great coverage, identity, and alignment. /note= /note=Checked by TM, C.A.R CDS 46714 - 46953 /gene="71" /product="gp71" /function="Hypothetical Protein" /locus tag="Bouchard_71" /note= /note=SSC: 46714-46953 CP: no SCS: neither ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_76 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 3.98489E-50 GAP: 6 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.601, -3.735178992916538, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_76 [Arthrobacter phage Tokki] ],,UGL63299,100.0,3.98489E-50 SIF-HHPRED: SIF-Syn: /note=Gene? Yes, It has coding potential and other genomes have this protein. /note=Start site? 46714, It has the best Z and Final score; Includes all coding potential /note=Function? 
hypothetical protein; matches in HHpred only hit on the first few amino acids of the protein, not enough to call a function; NCBI hits are to hypothetical proteins CDS 46989 - 47174 /gene="72" /product="gp72" /function="Hypothetical Protein" /locus tag="Bouchard_72" /note=Original Glimmer call @bp 46989 has strength 3.65 /note=SSC: 46989-47174 CP: no SCS: glimmer ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_77 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 3.792E-35 GAP: 35 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.185, -2.297359520375101, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_77 [Arthrobacter phage Tokki] ],,UGL63300,100.0,3.792E-35 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer called the gene. It has coding potential on GenemarkS. /note=What is the start site? 46989, it includes all coding potential. It is called by the manual annotations 19 out of 35 times when the start site is present (starterator); the next closest start site would eliminate, Z-score was acceptable as was the final score. /note=What is the function? Hypothetical protein, because on NCBI blast similar phages were named that with 100% coverage and 100% alignment. On HHpred it also has no proteins with good coverage or probability. /note= /note=Checked: There is not a 47174 start site, the start site is 46989. -C.A.R, TM CDS 47181 - 47318 /gene="73" /product="gp73" /function="Hypothetical Protein" /locus tag="Bouchard_73" /note= /note=SSC: 47181-47318 CP: no SCS: neither ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_78 [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 2.51494E-23 GAP: 6 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.049, -2.794070146961553, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_78 [Arthrobacter phage Tokki] ],,UGL63301,100.0,2.51494E-23 SIF-HHPRED: SIF-Syn: /note=Gene? Yes, called by others within the cluster; has some coding potential but not a lot /note=Start?
47181, the coding potential is almost all at the beginning of the protein, so the earliest start site is needed to ensure coding potential is included /note=Function? Hypothetical protein; no significant hits on HHpred; many BLAST hits to a hypothetical protein CDS 47398 - 47901 /gene="74" /product="gp74" /function="HNH endonuclease" /locus tag="Bouchard_74" /note=Original Glimmer call @bp 47398 has strength 2.09; Genemark calls start at 47428 /note=SSC: 47398-47901 CP: no SCS: both-gl ST: NI BLAST-Start: [HNH endonuclease [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 5.95515E-122 GAP: 79 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.829, -3.0457372370466467, yes F: HNH endonuclease SIF-BLAST: ,,[HNH endonuclease [Arthrobacter phage Tokki]],,UGL63302,100.0,5.95515E-122 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, Glimmer and GeneMark both call it a gene. It has coding potential on GenemarkS. /note=What is the start site? 47398, it includes all coding potential. It is called by the manual annotations 94.7% of the time when the start site is present (starterator), Z-score was acceptable as was the final score. /note=What is the function? The function is HNH endonuclease; NCBI Blast called it that for most of the similar phages, with 100% alignment and 100% coverage. HHpred also called it a type of endonuclease 99% of the time. /note= /note=Checked: C.A.R, TM CDS 47901 - 48038 /gene="75" /product="gp75" /function="Hypothetical Protein" /locus tag="Bouchard_75" /note= /note=SSC: 47901-48038 CP: no SCS: neither ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_76 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 1.38189E-24 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.457, -5.866498242384257, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_76 [Arthrobacter phage Phaby]],,XJP08389,100.0,1.38189E-24 SIF-HHPRED: SIF-Syn: /note=Gene? yes, it has coding potential; several other genomes in the cluster have this protein /note=Start?
47901, -1 overlap; includes all coding potential /note=Function? Hypothetical protein; all BLAST hits are to hypothetical proteins; several HHpred hits to transcription factors, might be able to call it a DNA binding protein but I don`t believe the evidence is strong enough CDS 48019 - 48327 /gene="76" /product="gp76" /function="Hypothetical Protein" /locus tag="Bouchard_76" /note=Genemark calls start at 48019 /note=SSC: 48019-48327 CP: no SCS: genemark ST: NI BLAST-Start: [hypothetical protein SEA_INKED_77 [Arthrobacter phage Inked]],,NCBI, q1:s1 100.0% 1.65747E-29 GAP: -20 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.185, -2.2186743274732437, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_INKED_77 [Arthrobacter phage Inked]],,WGH21760,57.1429,1.65747E-29 SIF-HHPRED: SIF-Syn: /note=Checked: Start site is at 48019 because on Genemark it captures all of the coding potential. Z-score and Final Score combination is the best of the options. As for functionality, this appears to be a hypothetical protein based on findings on Phages DB and NCBI BLAST. Phamerator showed no evidence of transmembrane domains. /note=TM, C.A.R CDS 48490 - 49722 /gene="77" /product="gp77" /function="DNA helicase" /locus tag="Bouchard_77" /note=Original Glimmer call @bp 48490 has strength 3.37; Genemark calls start at 48490 /note=SSC: 48490-49722 CP: no SCS: both ST: NI BLAST-Start: [DNA helicase [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: 162 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.697, -3.243963255891538, no F: DNA helicase SIF-BLAST: ,,[DNA helicase [Arthrobacter phage Phaby]],,XJP08374,98.7805,0.0 SIF-HHPRED: SIF-Syn: /note=Gene? has coding potential; called by Glimmer and Genemark /note=Start?
48490, called by Glimmer and Genemark; includes all coding potential; most commonly annotated start site available on this gene /note=Function: DNA helicase; Multiple hits on HHpred and on NCBI BLAST; conserved domain is present too /note= /note=Checked: C.A.R, TM /note= /note=Gap before this protein was investigated. No ORFs long enough to be a protein were present CDS 49712 - 50248 /gene="78" /product="gp78" /function="Hypothetical Protein" /locus tag="Bouchard_78" /note=Original Glimmer call @bp 49712 has strength 9.79; Genemark calls start at 49712 /note=SSC: 49712-50248 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_80 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 8.40128E-91 GAP: -11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.282, -2.606079664606164, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_80 [Arthrobacter phage Phaby]],,XJP08375,83.2402,8.40128E-91 SIF-HHPRED: SIF-Syn: /note=Start site: 49712 - called by both Glimmer and Genemark; includes all coding potential; has good Z and final scores /note= /note=Function: hypothetical protein, There are a couple of matches to nucleases in HHpred but in the crystal structure the match was not to the nuclease domain of the protein. 
All Blast matches are to hypothetical proteins /note= /note=Checked: C.A.R, TM CDS 50245 - 50535 /gene="79" /product="gp79" /function="Hypothetical Protein" /locus tag="Bouchard_79" /note=Original Glimmer call @bp 50245 has strength 5.75; Genemark calls start at 50245 /note=SSC: 50245-50535 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_80 [Arthrobacter phage Shepard] ],,NCBI, q4:s2 96.875% 4.71608E-30 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.989, -5.5764581073684, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_80 [Arthrobacter phage Shepard] ],,QFG13685,73.1183,4.71608E-30 SIF-HHPRED: SIF-Syn: /note=Gene: yes, there is coding potential and Glimmer and GeneMark both call it /note=Start: 50245, -4 overlap, called by Glimmer and GeneMark, includes all coding potential, this start is called 100% of the time when present (most commonly called start is not present in this gene - starterator) /note=Function: hypothetical protein, no significant hit on HHpred, the only hits in BLAST are to hypothetical proteins, no conserved domains, no transmembrane domains /note= /note=Checked: C.A.R, TM CDS 50547 - 51332 /gene="80" /product="gp80" /function="HNH endonuclease" /locus tag="Bouchard_80" /note=Original Glimmer call @bp 50547 has strength 9.15; Genemark calls start at 50547 /note=SSC: 50547-51332 CP: no SCS: both ST: NI BLAST-Start: [HNH endonuclease [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 0.0 GAP: 11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.551, -3.487629880782256, yes F: HNH endonuclease SIF-BLAST: ,,[HNH endonuclease [Arthrobacter phage Tokki] ],,UGL63307,99.6169,0.0 SIF-HHPRED: SIF-Syn: /note=Gene: yes, it has coding potential and is called by both Glimmer and GeneMark /note=Start: 50547, includes as much coding potential as possible, Glimmer and GeneMark both call this start, has the best Z and final scores, this start is called 100% of the time when present (starterator) /note=Function: has
several BLAST and HHpred hits to HNH endonuclease, sequence does show an HNH over a 30 amino acid span. /note= /note=Checked: C.A.R, TM CDS 51329 - 51529 /gene="81" /product="gp81" /function="nuclease" /locus tag="Bouchard_81" /note=Original Glimmer call @bp 51329 has strength 1.95; Genemark calls start at 51329 /note=SSC: 51329-51529 CP: no SCS: both ST: NI BLAST-Start: [HNH endonuclease [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 1.74859E-40 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.263, -4.7464192966305205, no F: nuclease SIF-BLAST: ,,[HNH endonuclease [Arthrobacter phage Tokki] ],,UGL63308,100.0,1.74859E-40 SIF-HHPRED: SIF-Syn: /note=Gene: yes, there is coding potential and it is called by Glimmer and GeneMark /note=Start: 51329, -4 overlap, includes all coding potential, Glimmer and GeneMark both call this start, most commonly called start - 100% of the time when present (starterator) /note=Function: several hits in HHpred to endonuclease and HNH endonucleases, a few BLAST hits to endonucleases, but there is not a conserved HNH in the sequence of the protein so it is called a nuclease /note= /note=Checked: C.A.R, TH CDS 51534 - 51761 /gene="82" /product="gp82" /function="Hypothetical Protein" /locus tag="Bouchard_82" /note=Original Glimmer call @bp 51534 has strength 9.88; Genemark calls start at 51534 /note=SSC: 51534-51761 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_BRUNSWICK_81 [Arthrobacter phage Brunswick]],,NCBI, q5:s6 94.6667% 6.44802E-30 GAP: 4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.049, -2.5052746077145835, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_BRUNSWICK_81 [Arthrobacter phage Brunswick]],,XEN20527,77.6316,6.44802E-30 SIF-HHPRED: SIF-Syn: /note=Gene: Yes, it has coding potential and is called by Glimmer and GeneMark /note=Start: 51534, has the best Z and Final scores, includes as much coding potential as possible, Glimmer and GeneMark both call this start site, Starterator data
show that no other genes in this pham have this start site and this gene does not have the other called start sites /note=Function: No significant hits on HHpred, only hypothetical proteins are hit on BLAST, no conserved domains, no transmembrane domains /note= /note=Checked: C.A.R, TH CDS 51758 - 52036 /gene="83" /product="gp83" /function="Hypothetical Protein" /locus tag="Bouchard_83" /note=Original Glimmer call @bp 51758 has strength 6.09; Genemark calls start at 51758 /note=SSC: 51758-52036 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_89 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 9.15161E-61 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.38, -4.754558922769318, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_89 [Arthrobacter phage Tokki]],,UGL63310,100.0,9.15161E-61 SIF-HHPRED: SIF-Syn: /note=Gene: yes, it has coding potential and is called by both Glimmer and GeneMark /note=Start: 51758, -4 overlap, best Z and Final score, includes all coding potential, Glimmer and GeneMark both call this start, starterator indicates this start is called 100% of the time when present /note=Function: Hypothetical protein, no significant hits on HHpred, only hypothetical proteins in BLAST hits, no conserved domains, no transmembrane domains /note= /note=Checked: C.A.R, TH CDS 52033 - 52383 /gene="84" /product="gp84" /function="Hypothetical Protein" /locus tag="Bouchard_84" /note=Original Glimmer call @bp 52033 has strength 8.61; Genemark calls start at 52033 /note=SSC: 52033-52383 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_PHABY_85 [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 6.54093E-76 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.707, -3.3019548882942655, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_PHABY_85 [Arthrobacter phage Phaby]],,XJP08380,99.1379,6.54093E-76 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it is a gene. It has a start site of 52033.
This gene has been called by both Glimmer and GeneMark. The start site includes the most coding potential; the start site has the best final score, Z-score, and gap (-4 overlap); NCBI Blast and Phagesdb both list hypothetical proteins that are the most similar to this, so this is likely a hypothetical protein. No significant hits on HHpred /note=Concur, TH /note=I agree, TM CDS 52380 - 52589 /gene="85" /product="gp85" /function="membrane protein" /locus tag="Bouchard_85" /note=Original Glimmer call @bp 52419 has strength 0.72; Genemark calls start at 52380 /note=SSC: 52380-52589 CP: no SCS: both-gm ST: NI BLAST-Start: GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.691, -5.37496011789745, no F: membrane protein SIF-BLAST: SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes, it has coding potential and Glimmer and Genemark called it a gene. What is the start site? Start site is 52380 because of the -4 gap. What is the function? Hypothetical Protein, HHpred had no good hits, nothing on NCBI or PhagesDB. /note=Membrane protein by DeepTMHMM--TH /note= /note=Checked: TM CDS 52589 - 52936 /gene="86" /product="gp86" /function="membrane protein" /locus tag="Bouchard_86" /note=Genemark calls start at 52589 /note=SSC: 52589-52936 CP: no SCS: genemark ST: NI BLAST-Start: [membrane protein [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 2.15821E-73 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.217, -4.2529836510151195, no F: membrane protein SIF-BLAST: ,,[membrane protein [Arthrobacter phage Tokki]],,UGL63314,97.3913,2.15821E-73 SIF-HHPRED: SIF-Syn: /note=Is it a gene? Yes it is. Genemark shows good coding potential. All the pertinent coding potential was also included. The z score and final score were also good. There was no Glimmer or Genemark score, but there is enough evidence that this is a gene that this is not an overwhelming concern. -1 overlap for the start site /note=What is the function?
The overwhelming evidence from both NCBI blast and Phages DB supported transmembrane or some kind of membrane protein. -KO and CH /note=Someone else should look at this, I see nothing on DeepTMHMM, I did see some on the previous gene though. CDS 52937 - 53302 /gene="87" /product="gp87" /function="Hypothetical Protein" /locus tag="Bouchard_87" /note=Genemark calls start at 52937 /note=SSC: 52937-53302 CP: no SCS: genemark ST: NI BLAST-Start: [hypothetical protein PBI_SHEPARD_88 [Arthrobacter phage Shepard]],,NCBI, q6:s3 95.8678% 1.34958E-74 GAP: 0 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 0.873, -8.062145158647677, no F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein PBI_SHEPARD_88 [Arthrobacter phage Shepard]],,QFG13702,95.7627,1.34958E-74 SIF-HHPRED: SIF-Syn: /note=Function: This gene is a Hypothetical Protein. HHpred did not yield any applicable results; PhagesDB/NCBI BLAST both recognized it as a Hypothetical Protein with strong (very low) E-values. /note=Start Site: The Start Site appears to be 52937 as it includes all the coding potential and was called by GeneMark. This start site was noted on Phamerator and while there were two other manually annotated start sites for similar genes, they were different. /note= /note=TM, CAR, TH CDS 53388 - 53726 /gene="88" /product="gp88" /function="VRR-Nuc domain protein" /locus tag="Bouchard_88" /note=Original Glimmer call @bp 53388 has strength 4.45; Genemark calls start at 53388 /note=SSC: 53388-53726 CP: no SCS: both ST: NI BLAST-Start: [VRR-Nuc domain protein [Arthrobacter phage Tokki] ],,NCBI, q1:s1 100.0% 1.26012E-76 GAP: 85 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.228, -5.21746608827694, yes F: VRR-Nuc domain protein SIF-BLAST: ,,[VRR-Nuc domain protein [Arthrobacter phage Tokki] ],,UGL63315,100.0,1.26012E-76 SIF-HHPRED: SIF-Syn: /note=The Glimmer and Genemark scores are pretty low, but they do agree. There is coding potential for this gene. The Z score is good but the Final score isn`t.
Has a 99.6% chance to be Hydrolase or VRR-Nuc domain protein. JL /note= /note=Could also be called a nuclease--TH /note= /note=Multiple hits to a vrr-nuc domain in HHpred and NCBI BLAST CDS 53689 - 54231 /gene="89" /product="gp89" /function="Cas4 exonuclease" /locus tag="Bouchard_89" /note=Original Glimmer call @bp 53689 has strength 10.11; Genemark calls start at 53671 /note=SSC: 53689-54231 CP: no SCS: both-gl ST: NI BLAST-Start: [Cas4 family exonuclease [Arthrobacter phage Shepard]],,NCBI, q1:s1 100.0% 1.21375E-129 GAP: -38 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.046, -5.8544093294192034, no F: Cas4 exonuclease SIF-BLAST: ,,[Cas4 family exonuclease [Arthrobacter phage Shepard]],,QFG13694,99.4444,1.21375E-129 SIF-HHPRED: SIF-Syn: /note=Both Glimmer and Genemark agree on the start site and the score is decent. The Z score is good but the final score isn`t the greatest. There is coding potential. The gap is the best for this start site. Has a 99.1% chance of being an exonuclease, and that hit has the most coverage at 98.9%. /note=TH /note=Glimmer and Genemark do not agree on the start site. There is a significant overlap, but the coding potential overlaps significantly with the previous gene. Choosing the next start site would cut off significant coding potential.
/note= /note=Shows alignment with the structures in the Cas4 family exonuclease notes on official function list CDS 54228 - 56966 /gene="90" /product="gp90" /function="helix-turn-helix DNA binding domain" /locus tag="Bouchard_90" /note=Original Glimmer call @bp 54228 has strength 8.89; Genemark calls start at 54228 /note=SSC: 54228-56966 CP: no SCS: both ST: NI BLAST-Start: [helix-turn-helix DNA binding domain [Arthrobacter phage Phaby]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.964, -4.862260131515226, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA binding domain [Arthrobacter phage Phaby]],,XJP08385,99.8904,0.0 SIF-HHPRED: SIF-Syn: /note=Glimmer and Genemark both agree on the start site and the score is decent. There is great coding potential for it. This one has the smallest gap, with a little overlap. The Z score is decent and the Final score is alright, not the greatest. All the AU clusters say helix-turn-helix as its function. Has a 97.5% probability of being a DNA binding domain. JL /note=Reviewed: KJ /note= /note=-4 overlap /note=Has a conserved HTH DNA binding domain (phamerator) CDS 56963 - 57238 /gene="91" /product="gp91" /function="Hypothetical Protein" /locus tag="Bouchard_91" /note=Original Glimmer call @bp 56969 has strength 5.87; Genemark calls start at 56963 /note=SSC: 56963-57238 CP: yes SCS: both-gm ST: NI BLAST-Start: [hypothetical protein SEA_TOKKI_98 [Arthrobacter phage Tokki]],,NCBI, q1:s1 100.0% 1.89985E-59 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.952, -2.7254235724530456, yes F: Hypothetical Protein SIF-BLAST: ,,[hypothetical protein SEA_TOKKI_98 [Arthrobacter phage Tokki]],,UGL63318,98.9011,1.89985E-59 SIF-HHPRED: SIF-Syn: /note=I changed the start site to 56963, because it has a better Z and final score. There is coding potential for the gene.
Glimmer and Genemark don`t agree on the start site; however, there is a -4 overlap at my start site instead of a gap. Hypothetical protein. JL /note=Reviewed: KJ /note= /note=No significant hit on HHPred or NCBI other than hypothetical proteins