CDS complement (153 - 359) /gene="1" /product="gp1" /function="hypothetical protein" /locus tag="alkhayr_1" /note=Original Glimmer call @bp 359 has strength 15.06; Genemark calls start at 329 /note=SSC: 359-153 CP: yes SCS: both-gl ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_1 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.39849E-39 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.891, -2.9185894779535753, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_1 [Mycobacterium phage Catdawg] ],,YP_008409170,100.0,1.39849E-39 SIF-HHPRED: SIF-Syn: Gene 1 is homologous with Gene 1 of Firecracker and Krilli, neither of which calls a function for Gene 1. /note=Glimmer and Starterator agree on start site 359, which offers both the longest ORF and the smallest overlap. Start 359 has been manually annotated 22 other times according to Starterator. Coding potential is not maintained throughout the entire gene, with coding potential dropping to 0 from 200 to 153 bp. There is no evidence in HHPRED to support any function, Phages DB Blast shows function unknown in other cluster members with e value of 2e-30. CDS complement (359 - 676) /gene="2" /product="gp2" /function="hypothetical protein" /locus tag="alkhayr_2" /note=Original Glimmer call @bp 589 has strength 8.0; Genemark calls start at 565 /note=SSC: 676-359 CP: yes SCS: both-cs ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_2 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.28287E-70 GAP: 2 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_2 [Mycobacterium phage Catdawg] ],,YP_008409171,100.0,1.28287E-70 SIF-HHPRED: SIF-Syn: Gene 2 is homologous with Gene 2 of Firecracker and Krilli, neither of which calls a function for Gene 2. /note=Start site 676 provides the greatest balance between the length of ORF, while minimizing gap. 
Site 676 is the most manually annotated start site in Starterator. Coding potential is strong from approximately 575 to 475bp. PhagesDB BLAST supports NKF, hit with other cluster O members with e-value of 1e-59. HHPred and NCBI BLAST provide no evidence for a known protein function. CDS complement (679 - 819) /gene="3" /product="gp3" /function="hypothetical protein" /locus tag="alkhayr_3" /note=Original Glimmer call @bp 819 has strength 12.02; Genemark calls start at 819 /note=SSC: 819-679 CP: yes SCS: both ST: SS BLAST-Start: [gp4 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 1.87491E-26 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.752, -5.330723697772664, yes F: hypothetical protein SIF-BLAST: ,,[gp4 [Mycobacterium phage Corndog] ],,NP_817855,100.0,1.87491E-26 SIF-HHPRED: SIF-Syn: Gene 3 is homologous with Gene 3 of Firecracker and Krilli, neither of which calls a function for Gene 3. /note=Genemark and Glimmer both agree on start site 819 and according to Starterator site 819 has been manually annotated in 7 other pham members. Coding potential is strong at a value of ~1.0 from bp 775-625. Phages Db BLAST supports no known function with strong e-value for other subcluster O phages, NCBI BLAST supports no known function with strong hits for hypothetical protein with 100% identity, coverage, alignment, and e-value of 1.87e-26. HHPred hits do not support a function. No TMB domain according to Deep TmHmm. 
CDS complement (819 - 1289) /gene="4" /product="gp4" /function="terminase, large subunit" /locus tag="alkhayr_4" /note=Original Glimmer call @bp 1289 has strength 5.76; Genemark calls start at 1289 /note=SSC: 1289-819 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_WINGET_4 [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 4.88809E-111 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.383, -6.015984560757171, no F: terminase, large subunit SIF-BLAST: ,,[hypothetical protein SEA_WINGET_4 [Mycobacterium phage Winget]],,QWY81489,100.0,4.88809E-111 SIF-HHPRED: Large subunit terminase; large terminase, VIRAL PROTEIN; 2.2A {Deep-sea thermophilic phage D6E},,,5OE8_A,75.641,99.4 SIF-Syn: No evidence from syntenic genes to support function call. No transmembrane domains. /note=Genemark and Glimmer agree on start site 1289, and according to Starterator it has been manually annotated in 23 other pham members. Start site 1289 provides the second-longest possible ORF and the shortest gap. Coding potential is strong from 1250 to 1100 bp, varies widely until 900 bp, and then remains 0 through the stop site at 819. PhagesDB has called terminase as the function in a number of phages in clusters other than O. NCBI and PhagesDB BLAST do not support that function call. HHPred matches the terminase large subunit of a thermophilic phage (E-value 10E-12, probability 99.45%). 
CDS complement (1286 - 1855) /gene="5" /product="gp5" /function="DNA methyltransferase" /locus tag="alkhayr_5" /note=Original Glimmer call @bp 1855 has strength 6.25; Genemark calls start at 1669 /note=SSC: 1855-1286 CP: yes SCS: both-gl ST: SS BLAST-Start: [DNA methyltransferase [Mycobacterium phage Dylan] ],,NCBI, q1:s1 100.0% 1.34133E-132 GAP: -209 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.076, -2.5969056218148614, yes F: DNA methyltransferase SIF-BLAST: ,,[DNA methyltransferase [Mycobacterium phage Dylan] ],,YP_008530570,98.9418,1.34133E-132 SIF-HHPRED: SIF-Syn: Synteny with Familton and Firecracker supports function. /note=Start site 1855 was predicted by Glimmer and has been manually annotated 21 times according to Starterator. It provides the longest ORF, but it has a very large overlap with the next gene. Function has been called as DNA methyltransferase in 25% of phages in subcluster O, and NCBI BLAST supports DNA methyltransferase with 100% coverage and an e-value of 1.34E-132. CDS complement (1647 - 2237) /gene="6" /product="gp6" /function="DNA methyltransferase" /locus tag="alkhayr_6" /note=Genemark calls start at 2237 /note=SSC: 2237-1647 CP: yes SCS: genemark ST: SS BLAST-Start: [DNA methylase [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 7.76349E-137 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.061, -5.4589402468268435, no F: DNA methyltransferase SIF-BLAST: ,,[DNA methylase [Mycobacterium phage Vorrps] ],,AYQ98843,100.0,7.76349E-137 SIF-HHPRED: Modification methylase HhaI; CG-SPECIFICITY, CPG SEQUENCE, C5-METHYLCYTOSINE, NUCLEOTIDE FLIPPING, S-ADENOSYL-L-HOMOCYSTEINE, COMPLEX (METHYLTRANSFERASE- DNA), transferase-DNA complex; HET: 3DR, SO4, SAH; 1.594A {Haemophilus parahaemolyticus},,,5CIY_A,98.4694,99.9 SIF-Syn: Synteny with JangDynasty and Firecracker supports function. /note=Start site 2237, which was selected by Genemark, provides the shortest gap and has been manually annotated in 21 pham members according to Starterator. 
HHPred supports function as DNA methyltransferase with 99.9% probability and e-value of 7.9e-25. NCBI BLAST supports function with 100% coverage and identity and an e-value of 7.76e-137 of other mycobacterium phages. CDS complement (2234 - 2326) /gene="7" /product="gp7" /function="membrane protein" /locus tag="alkhayr_7" /note=Genemark calls start at 2326 /note=SSC: 2326-2234 CP: yes SCS: genemark ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_7 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 1.30985E-10 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.217, -4.309437458116814, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_7 [Mycobacterium phage Vorrps] ],,AYQ98844,100.0,1.30985E-10 SIF-HHPRED: SIF-Syn: Gene 7 is homologous with Gene 7 of Krilli, which does not call a function for Gene 7. /note=Start 2326 has been selected by Genemark and provides the shortest gap and the largest ORF. According to Starterator it has been manually annotated in all non-draft pham members. PhagesDb BLAST supports no known function with strong hits with other cluster O members and e-value of 1e-11. HHPred hits do not support a known function. NCBI BLAST supports no known function with 100% alignment, identity, coverage, and an e-value of 1.3e-10. Although DeepTMHMM shows that the protein has a transmembrane domain, TOPCONS indicates no homologous TM proteins detected. SOSUI shows one TMD. 
CDS complement (2323 - 2517) /gene="8" /product="gp8" /function="hypothetical protein" /locus tag="alkhayr_8" /note=Original Glimmer call @bp 2517 has strength 8.49; Genemark calls start at 2517 /note=SSC: 2517-2323 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_DYLAN_8 [Mycobacterium phage Dylan] ],,NCBI, q1:s1 100.0% 5.87118E-40 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.683, -3.329294830936593, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_DYLAN_8 [Mycobacterium phage Dylan] ],,YP_008530572,100.0,5.87118E-40 SIF-HHPRED: SIF-Syn: Gene 8 is homologous with Gene 8 of Familton and Krilli, neither of which calls a function for Gene 8. /note=Start site 2517 has been selected by both Glimmer and Genemark, and it provides the second-longest ORF with the least overlap. According to Starterator, start 2 at 2517 bp has the most manual annotations for pham members. Coding potential is not maintained throughout the gene; it drops from approximately 2375 bp to the stop site at 2323 bp. PhagesDB BLAST does not support a function for this gene, with strong e-values for hits annotated as NKF. NCBI BLAST does not support a function, with 100% identity, alignment, and coverage for hits with an e-value of 5.87e-40. No transmembrane domain according to DeepTMHMM. CDS complement (2510 - 2839) /gene="9" /product="gp9" /function="hypothetical protein" /locus tag="alkhayr_9" /note=Original Glimmer call @bp 2839 has strength 7.96; Genemark calls start at 2839 /note=SSC: 2839-2510 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_9 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 4.38318E-73 GAP: 28 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.408, -5.963494220754885, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_9 [Mycobacterium phage Vorrps] ],,AYQ98846,100.0,4.38318E-73 SIF-HHPRED: SIF-Syn: Gene 9 is homologous with Gene 9 of Familton and Krilli, neither of which calls a function for Gene 9. 
/note=Both Glimmer and Genemark agree on start site 2839, which has been manually annotated in 6 other pham members. Start 2839 provides both the longest ORF and the shortest gap. Coding potential is not maintained throughout the entire gene: it drops to 0.25 near 2760 bp, reaches 1.0 at approximately 2700 bp, and holds there until dropping to 0 at 2550 bp. PhagesDB BLAST and NCBI BLAST do not support a known function; hits to other cluster O members are annotated NKF, with an e-value of e-62 for PhagesDB BLAST and 4.38e-73 for NCBI BLAST. HHPred does not contain any acceptable matches. DeepTMHMM shows no TMD. CDS complement (2868 - 3488) /gene="10" /product="gp10" /function="endonuclease VII" /locus tag="alkhayr_10" /note=Original Glimmer call @bp 3488 has strength 4.91; Genemark calls start at 3488 /note=SSC: 3488-2868 CP: yes SCS: both ST: SS BLAST-Start: [endonuclease VII [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 9.86475E-150 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.284, -4.155204357786937, no F: endonuclease VII SIF-BLAST: ,,[endonuclease VII [Mycobacterium phage Vorrps] ],,AYQ98847,99.5146,9.86475E-150 SIF-HHPRED: Restriction endonuclease Hpy99I; ENDONUCLEASE-DNA COMPLEX, RESTRICTION ENZYME, HPY99I, PSEUDOPALINDROME, HYDROLASE-DNA COMPLEX; HET: 1PE; 1.5A {Helicobacter pylori},,,3GOX_A,66.9903,99.7 SIF-Syn: Gene 10 is homologous with genes 10 of Mori, Firecracker, MadKillah, and Krilli, which all call the function endonuclease VII. /note=GeneMark and Glimmer both agree that the start site for this gene is at 3488; it gives the longest ORF and includes all the coding potential according to the Genemark report. According to Starterator, this start site has been called 100% of the time when present, with 21 manual annotations. The gene has a slight overlap of 4 bp. There is evidence of coding potential throughout the entire gene; however, coding potential drops at roughly 3300 bp. 
Phagesdb has called endonuclease VII the function in many other phages that are present in clusters other than O. NCBI blast also supports this function with 100% coverage and an E-value of 9.86475e-150. CDS complement (3485 - 4306) /gene="11" /product="gp11" /function="hypothetical protein" /locus tag="alkhayr_11" /note=Original Glimmer call @bp 4306 has strength 5.45; Genemark calls start at 4306 /note=SSC: 4306-3485 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_11 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 0.0 GAP: 488 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.342, -7.644728946791978, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_11 [Mycobacterium phage Vorrps] ],,AYQ98848,100.0,0.0 SIF-HHPRED: SIF-Syn: Gene 11 is homologous with genes 11 in both Firecracker and Finget, which provide no known evidence of a function for a protein. /note=Glimmer and Genemark both agree on start site 4306 for this gene, which was called the "Most Annotated" start site and is evident in 22 manual annotations, according to Starterator. The start site 4306 depicts the longest ORF, and the shortest gap. There is coding potential throughout the gene. Coding potential is fairly strong amongst some drops, as it drops to 0 at roughly 3745bp, spiking back to around 0.8, dropping again to 0.25 near 3830bp, and spiking again. It drops to 0 once again near 4095bp, then spiking back to 1.0 at around 4170bp. Phages DB blast and NCBI Blast, HHPred, and CDD provide no evidence for a known protein function. No evidence of TMB. 
CDS 4795 - 5082 /gene="12" /product="gp12" /function="hypothetical protein" /locus tag="alkhayr_12" /note=Original Glimmer call @bp 4795 has strength 15.47; Genemark calls start at 4795 /note=SSC: 4795-5082 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_12 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 3.78217E-62 GAP: 488 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.543, -5.684798849043998, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_12 [Mycobacterium phage Vorrps] ],,AYQ98849,100.0,3.78217E-62 SIF-HHPRED: Uncharacterized protein R354; MIMIVIRE, Cas4-like, nuclease, R354, NUCLEAR PROTEIN; 2.806A {Acanthamoeba polyphaga mimivirus},,,5YET_B,68.4211,95.1 SIF-Syn: Gene 12 is homologous with Gene 12 of Firecracker and MadKillah, neither of which calls a function. /note=Both Glimmer and Genemark agree on start site 4795, which has 25 manual annotations and was called 88.6% of the time when present, according to Starterator. Start site 4795 gives the longest ORF with the smallest gap (488 bp). There is evidence of coding potential throughout the entire gene: it spikes to 1.0 at approximately 4840 bp, drops slightly to about 0.75 at around 4865 bp, then increases again and holds until it drops to 0 at around 5070 bp. NCBI BLAST and PhagesDB BLAST do not support a known function, nor do HHPred and CDD. According to DeepTMHMM, there is no transmembrane domain. 
CDS 5075 - 5824 /gene="13" /product="gp13" /function="exonuclease" /locus tag="alkhayr_13" /note=Original Glimmer call @bp 5075 has strength 13.05; Genemark calls start at 5075 /note=SSC: 5075-5824 CP: yes SCS: both ST: NA BLAST-Start: [exonuclease [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 0.0 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.488, -4.197870532213559, yes F: exonuclease SIF-BLAST: ,,[exonuclease [Mycobacterium phage Vorrps] ],,AYQ98850,100.0,0.0 SIF-HHPRED: Cas4_I-A_I-B_I-C_I-D_II-B; CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA.,,,cd09637,79.1165,99.8 SIF-Syn: Gene 13 is homologous with genes 13 in NiebruSaylor, Krilli, and Vorrps, which all call exonuclease as the function. It is also homologous with Gene 14 in MadKillah, which calls Cas4 exonuclease. /note=Both Glimmer and Genemark agree that the start site for this gene is 5075, which gives the longest ORF, contains all the coding potential, and has a small overlap. The coding potential spikes to 1.0, drops to about 0.1 near 5210 bp, and spikes back up to 1.0 at around 5270 bp. It drops to about 0.02 at around 5550 bp, then spikes back to 1.0, where it is maintained until it drops to 0 at 5785 bp. According to Starterator, this phage does not have the suggested start, and 5075 is the most manually annotated start. PhagesDB BLAST supports the function call of exonuclease, which has been called in other pham members with a score of 524 and an e-value of 1e-149. NCBI BLAST supports this function with 100% coverage, alignment, and percent identity and an e-value of 0. 
CDS 5824 - 5979 /gene="14" /product="gp14" /function="hypothetical protein" /locus tag="alkhayr_14" /note=Original Glimmer call @bp 5824 has strength 12.85; Genemark calls start at 5827 /note=SSC: 5824-5979 CP: yes SCS: both-gl ST: NI BLAST-Start: [hypothetical protein SEA_VORRPS_14 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 2.51597E-27 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.293, -4.724454631712076, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_14 [Mycobacterium phage Vorrps] ],,AYQ98851,100.0,2.51597E-27 SIF-HHPRED: Toxin; ParE/RelE toxin, ParD antitoxin, RHH, DNA binding motif, Neutralization, ANTITOXIN, TOXIN-ANTITOXIN complex; 1.79A {Pseudoalteromonas rubra},,,7YCS_A,80.3922,82.5 SIF-Syn: There is no evidence from syntenic genes that supports a function call. /note=Genemark and Glimmer place the start site for this gene at approximately 5824-5827 bp; Glimmer provides the best evidence, for site 5824. This start gives the longest ORF and strong coding potential that extends throughout the entire ORF, spiking to 1.0 around 5860 bp and remaining consistent before dropping to 0 at 5979 bp. There is an overlap of only 1 bp (gap of -1). Starterator also agrees with this start site, which has been called "Most Annotated" for this gene and has 22 manual annotations. PhagesDB BLAST does not provide evidence for a known protein function. NCBI BLAST supports the hypothetical protein call, with a 1:1 alignment of query and subject and 100% query cover. The hit, gene 14 of phage Vorrps, has 100% identity and is annotated as a hypothetical protein, with an e-value of 3e-27. 
CDS 6062 - 6742 /gene="15" /product="gp15" /function="hypothetical protein" /locus tag="alkhayr_15" /note=Original Glimmer call @bp 6062 has strength 13.13; Genemark calls start at 6062 /note=SSC: 6062-6742 CP: no SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_15 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 1.95703E-154 GAP: 82 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.544, -5.681959630587693, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_15 [Mycobacterium phage Vorrps] ],,AYQ98852,100.0,1.95703E-154 SIF-HHPRED: SIF-Syn: Catdawg_14 is syntenic but not informative for function. /note=Both Glimmer and GeneMark agree on start site 6062 and stop site 6742. The ORF is 681 bp, with a gap of 82 bp. The RBS score is -5.682. According to Starterator, the most annotated start is 6005, which was annotated in 15 other pham members. CDS 6766 - 7140 /gene="16" /product="gp16" /function="hypothetical protein" /locus tag="alkhayr_16" /note=Original Glimmer call @bp 6826 has strength 3.26; Genemark calls start at 6826 /note=SSC: 6766-7140 CP: yes SCS: both-cs ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_15 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.68828E-85 GAP: 23 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.321, -7.182354764100016, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_15 [Mycobacterium phage Catdawg] ],,YP_008409184,100.0,1.68828E-85 SIF-HHPRED: SIF-Syn: Not informative. /note=Although Glimmer and GeneMark predict 6826 as the start site, according to Starterator 6766 is the most manually annotated start for this gene. In addition, the BLASTp alignment for 6826 is Q1:S15, which does not support 6826 as the start site and instead supports a start further upstream, in our case 6766. Start 6766 gives the longest ORF and the shortest gap, even though its RBS score is not the highest. /note=None of the function tools (BLASTp, HHPred, CDD) supports a function. 
It is not a membrane protein according to TmHmm. CDS 7133 - 7444 /gene="17" /product="gp17" /function="hypothetical protein" /locus tag="alkhayr_17" /note=Original Glimmer call @bp 7133 has strength 4.16; Genemark calls start at 7133 /note=SSC: 7133-7444 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_KRILI_17 [Mycobacterium phage Krili] ],,NCBI, q1:s1 100.0% 1.85829E-69 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 0.262, -9.317152772904528, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_KRILI_17 [Mycobacterium phage Krili] ],,QFP97065,100.0,1.85829E-69 SIF-HHPRED: Astro_p19 ; Astrovirus p19 protein,,,PF19414.2,68.932,82.9 SIF-Syn: Not informative. /note=Glimmer and Genemark predict the same start site, 7133, which is also supported by Starterator (14 MAs). /note=None of the sources of evidence for function are informative. It does not have a transmembrane domain. CDS 7441 - 7668 /gene="18" /product="gp18" /function="hypothetical protein" /locus tag="alkhayr_18" /note=Original Glimmer call @bp 7504 has strength 13.9; Genemark calls start at 7432 /note=SSC: 7441-7668 CP: yes SCS: both-cs ST: NI BLAST-Start: [gp20 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 1.9312E-45 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.546, -3.6916801401915316, no F: hypothetical protein SIF-BLAST: ,,[gp20 [Mycobacterium phage Corndog] ],,NP_817871,100.0,1.9312E-45 SIF-HHPRED: DUF3960 ; Domain of unknown function (DUF3960),,,PF13142.10,29.3333,69.2 SIF-Syn: Not informative. /note=Glimmer and Genemark disagree on the start site, and Starterator is not informative (no MAs for this gene). Start 7441 was chosen based on the shortest overlap/gap. Bora_18 is syntenic but is a shorter gene and has NKF. No TMB domain. 
CDS 7665 - 7895 /gene="19" /product="gp19" /function="hypothetical protein" /locus tag="alkhayr_19" /note=Original Glimmer call @bp 7665 has strength 13.79; Genemark calls start at 7665 /note=SSC: 7665-7895 CP: yes SCS: both ST: SS BLAST-Start: [gp21 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 7.04291E-46 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.41, -3.9109350188549152, yes F: hypothetical protein SIF-BLAST: ,,[gp21 [Mycobacterium phage Corndog] ],,NP_817872,100.0,7.04291E-46 SIF-HHPRED: GFRP ; GTP cyclohydrolase I feedback regulatory protein (GFRP),,,PF06399.16,17.1053,83.5 SIF-Syn: Bora_19 is syntenic, with similar length and similar genes upstream and downstream, but NKF. /note=Genemark and Glimmer agree on 7665; Starterator has 22 MAs calling it the most likely start. There is no supporting evidence to call a function, and the protein does not have any TMDs. CDS 7892 - 10255 /gene="20" /product="gp20" /function="DNA primase/polymerase" /locus tag="alkhayr_20" /note=Original Glimmer call @bp 7892 has strength 15.15; Genemark calls start at 7895 /note=SSC: 7892-10255 CP: yes SCS: both-gl ST: SS BLAST-Start: [DNA primase/polymerase [Mycobacterium phage Shida]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.926, -3.6727816321281046, yes F: DNA primase/polymerase SIF-BLAST: ,,[DNA primase/polymerase [Mycobacterium phage Shida]],,QOC58454,99.8729,0.0 SIF-HHPRED: ORF904; primase, polymerase, Replication; 1.85A {Sulfolobus islandicus},,,3M1M_A,20.7116,99.4 SIF-Syn: Does not have a transmembrane domain. /note=Although Glimmer and GeneMark disagree on the start site, Starterator agrees with Glimmer, making the start site 7892. According to Starterator, 7892 is the most manually annotated start site for this gene (17 MAs). In addition, the BLASTp alignment for 7892 is Q1:S1, which does not support 7895 as the start site and instead supports the start further upstream at 7892. 
7892 gives the smallest ORF and shortest gap, even though its RBS score is not the highest. HHPred and CDD support this function. CDS 10248 - 10448 /gene="21" /product="gp21" /function="hypothetical protein" /locus tag="alkhayr_21" /note=Original Glimmer call @bp 10248 has strength 2.13; Genemark calls start at 10248 /note=SSC: 10248-10448 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_20 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 2.1457E-41 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.06, -5.08247711351149, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_20 [Mycobacterium phage Catdawg] ],,YP_008409189,94.2857,2.1457E-41 SIF-HHPRED: SIF-Syn: Not informative. Does not have a transmembrane domain. /note=Glimmer and GeneMark agree on the predicted start site, 10248. According to Starterator, this site is the most manually annotated start for this gene. In addition, the BLASTp alignment for 10248 is Q1:S1, which supports 10248 as the start site. Start 10248 gives the longest ORF and shortest gap, even though its RBS score is not the highest. None of the function tools (BLASTp, HHPred, CDD) supports a function. CDS 10445 - 10651 /gene="22" /product="gp22" /function="hypothetical protein" /locus tag="alkhayr_22" /note= /note=SSC: 10445-10651 CP: no SCS: neither ST: NI BLAST-Start: [hypothetical protein KNU03_gp024 [Mycobacterium phage Ryadel] ],,NCBI, q5:s1 94.1176% 1.81321E-35 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.367, -4.572111470246742, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein KNU03_gp024 [Mycobacterium phage Ryadel] ],,YP_010097514,100.0,1.81321E-35 SIF-HHPRED: SIF-Syn: None of the sources of evidence for function are informative. There is synteny for gene 22, which is similar to gene 22 in Familton. Does not have a transmembrane domain. /note=Added missing gene. 
There is no coding potential on the Genemark report; the gene is present in similar cluster O phages with unknown function. NCBI BLAST: 97% similar unknown gene, 94% coverage, Q1:S1, e-value 1.81321e-35. 10445 has the smallest gap. The Pham map predicts a start site of 10445 and a stop site of 10651. CDS 10697 - 11071 /gene="23" /product="gp23" /function="mycobacteriophage mobile element 1 (MPME 1)" /locus tag="alkhayr_23" /note= /note=SSC: 10697-11071 CP: yes SCS: neither ST: NI BLAST-Start: [mobile element MPME [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 2.60892E-86 GAP: 45 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.178, -6.903978873041489, no F: mycobacteriophage mobile element 1 (MPME 1) SIF-BLAST: ,,[mobile element MPME [Mycobacterium phage Catdawg] ],,YP_008409191,100.0,2.60892E-86 SIF-HHPRED: SIF-Syn: Not informative. There is synteny for gene 23, which is similar to gene 23 in Familton. Does not have a transmembrane domain. /note=There is no coding potential on the Genemark report; the gene is present in similar cluster O phages with the function mobile element MPME. NCBI BLAST: 100% identity, 100% coverage, Q1:S1, e-value 2.60892e-86. 10697 has the smallest gap. The Pham map predicts a start site of 10697 and a stop site of 11071. 
CDS complement (11072 - 11371) /gene="24" /product="gp24" /function="WhiB family transcription factor" /locus tag="alkhayr_24" /note=Original Glimmer call @bp 11371 has strength 2.15; Genemark calls start at 11371 /note=SSC: 11371-11072 CP: yes SCS: both ST: SS BLAST-Start: [WhiB family transcription factor [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 1.13412E-63 GAP: 44 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.245, -6.588784052801323, no F: WhiB family transcription factor SIF-BLAST: ,,[WhiB family transcription factor [Mycobacterium phage Vorrps] ],,AYQ98861,100.0,1.13412E-63 SIF-HHPRED: Transcriptional regulator WhiB1; nitric oxide, sigmaA, iron-sulfur, tuberculosis, Wbl protein, SIGNALING PROTEIN; HET: SF4; NMR {Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)},,,5OAY_A,73.7374,99.7 SIF-Syn: /note=The current predicted start @ 11371 bp (start site #1) is the most annotated start in Starterator. Agrees with Glimmer and Genemark. Found in 28/28 genes in this pham. NCBI Blast supports the function WhiB family transcription factor with a 100% Query cover and an e-value of 1e-63. Also supported by HHPred. CDS 11416 - 11850 /gene="25" /product="gp25" /function="hypothetical protein" /locus tag="alkhayr_25" /note=Original Glimmer call @bp 11416 has strength 11.08; Genemark calls start at 11416 /note=SSC: 11416-11850 CP: no SCS: both ST: NA BLAST-Start: [hypothetical protein SEA_VORRPS_25 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 5.07646E-101 GAP: 44 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 0.827, -7.752332126922963, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_25 [Mycobacterium phage Vorrps] ],,AYQ98862,100.0,5.07646E-101 SIF-HHPRED: SIF-Syn: Not informative /note=Alkhayr doesn`t have the most annotated start for this pham in Starterator. The current predicted start @ 11416 bp (start site #57) is not the most annotated start in Starterator but has been manually annotated 20 times. 
However, this start site agrees with Glimmer and Genemark. Found in 28/303 genes in this pham. NCBI Blast and HHpred did not support any function. No TMD. CDS 11847 - 12083 /gene="26" /product="gp26" /function="hypothetical protein" /locus tag="alkhayr_26" /note=Original Glimmer call @bp 11847 has strength 11.99; Genemark calls start at 11835 /note=SSC: 11847-12083 CP: yes SCS: both-gl ST: NA BLAST-Start: [hypothetical protein PBI_CATDAWG_25 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.80758E-51 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.556, -3.529988066030215, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_25 [Mycobacterium phage Catdawg] ],,YP_008409194,100.0,1.80758E-51 SIF-HHPRED: SIF-Syn: Not informative. /note=Alkhayr does not have the most annotated start. The current predicted start @ 11847 bp (start site #9) is not the most annotated start in Starterator but has been called in 20 MAs. However, it agrees with Glimmer but not Genemark, which has a 11835 bp start. Found in 28/298 genes in this pham. NCBI Blast and HHpred did not support any function. No TMD. CDS 12076 - 12348 /gene="27" /product="gp27" /function="hypothetical protein" /locus tag="alkhayr_27" /note=Original Glimmer call @bp 12076 has strength 9.95; Genemark calls start at 12076 /note=SSC: 12076-12348 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_26 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.96726E-60 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.488, -3.7329837339109084, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_26 [Mycobacterium phage Catdawg] ],,YP_008409195,100.0,1.96726E-60 SIF-HHPRED: SIF-Syn: Not informative. /note=The current predicted start @ 12076 bp (start site #1) is the most annotated start in Starterator. Agrees with Glimmer and Genemark. Found in 16/16 genes in this pham. NCBI Blast and HHpred do not support a function call for this gene. No TMD. 
CDS 12345 - 12671 /gene="28" /product="gp28" /function="HNH endonuclease" /locus tag="alkhayr_28" /note=Original Glimmer call @bp 12411 has strength 9.48; Genemark calls start at 12345 /note=SSC: 12345-12671 CP: yes SCS: both-gm ST: NA BLAST-Start: [HNH endonuclease [Mycobacterium phage Dylan] ],,NCBI, q1:s1 100.0% 4.04407E-70 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.746, -3.2181355793933353, no F: HNH endonuclease SIF-BLAST: ,,[HNH endonuclease [Mycobacterium phage Dylan] ],,YP_008530591,99.0826,4.04407E-70 SIF-HHPRED: HNH endonuclease; Thermophilic bacteriophage, HNH Endonuclease, DNA nicking, HYDROLASE; 1.52A {Geobacillus virus E2},,,5H0M_A,60.1852,96.3 SIF-Syn: Other O cluster phages call this gene the same function. /note=The current predicted start @ 12411 bp (start site #2) is the most annotated start in Starterator, but alkhayr does not call it. However, it agrees with Glimmer but not Genemark, which has a 12345 bp start. Found in 28/28 genes in this pham. NCBI BLAST and HHPred support function call. CDS 12754 - 13179 /gene="29" /product="gp29" /function="hypothetical protein" /locus tag="alkhayr_29" /note=Original Glimmer call @bp 12754 has strength 6.32; Genemark calls start at 12691 /note=SSC: 12754-13179 CP: yes SCS: both-gl ST: NI BLAST-Start: [hypothetical protein PBI_YUNGJAMAL_31 [Mycobacterium phage YungJamal]],,NCBI, q1:s22 100.0% 2.25713E-75 GAP: 82 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.658, -4.624360939601904, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_YUNGJAMAL_31 [Mycobacterium phage YungJamal]],,AII28270,80.7692,2.25713E-75 SIF-HHPRED: SIF-Syn: /note=The current predicted start @ 12754 bp (start site #3) is the most annotated start in Starterator, but alkhayr does not call it. However, it agrees with Glimmer but not Genemark, which has a 12691 bp start. Found in 28/28 genes in this pham. NCBI Blast and HHPred do not support a known function. 
CDS 13190 - 13480 /gene="30" /product="gp30" /function="hypothetical protein" /locus tag="alkhayr_30" /note=Original Glimmer call @bp 13190 has strength 7.98; Genemark calls start at 13181 /note=SSC: 13190-13480 CP: no SCS: both-gl ST: NI BLAST-Start: [terminase small subunit [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.91939E-61 GAP: 10 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.596, -6.040876411038884, no F: hypothetical protein SIF-BLAST: ,,[terminase small subunit [Mycobacterium phage Catdawg] ],,YP_008409198,100.0,4.91939E-61 SIF-HHPRED: SIF-Syn: Similar genes in other O phages do not support a function call. /note=The current predicted start @ 13190 bp (start site #29) is not the most annotated start in Starterator. However, it agrees with Glimmer but not Genemark, which has a 13181 bp start. Found in 28/106 genes in this pham. NCBI Blast supports the function of terminase small subunit with a query cover of 100% and an e-value of 6e-61. HHpred does not point towards this function or any known function. CDS 13437 - 14951 /gene="31" /product="gp31" /function="terminase" /locus tag="alkhayr_31" /note=Original Glimmer call @bp 13437 has strength 12.59; Genemark calls start at 13440 /note=SSC: 13437-14951 CP: yes SCS: both-gl ST: SS BLAST-Start: [terminase [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 0.0 GAP: -44 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.5052746077145835, yes F: terminase SIF-BLAST: ,,[terminase [Mycobacterium phage Ryadel] ],,YP_010097523,99.8016,0.0 SIF-HHPRED: Terminase large subunit; genome packaging, bacteriophage, ATPase, nuclease, VIRAL PROTEIN; HET: BR; 2.2A {Enterobacteria phage HK97},,,6Z6D_A,84.9206,100.0 SIF-Syn: By looking at Alkhayr in Pham maps and comparing it with Krili, a similar location of gene 29 is present for gene 30 with Krili.
The function of gene 29 is also the same as that of gene 30 in Krili. Likewise, gene 30 in Alkhayr corresponds to gene 31 in Krili, and gene 31 in Alkhayr corresponds to gene 32 in Krili. The Alkhayr genes match their Krili counterparts in both placement and function, which is evidence of synteny. /note=Genemark and Glimmer disagree on the start site (Glimmer calls 13437, Genemark calls 13440), but Starterator and Glimmer agree that the correct start site is 13437 because it gives the longest ORF. According to Starterator, it is also the most called start site across all annotations. It includes all of the coding potential shown in the Genemark report. The gene overlaps its upstream neighbor by 44 bp (gap of -44 bp). The coding potential is strong along the length of the ORF. PhagesDB has called terminase as the function in a number of clusters other than cluster O. PhagesDB BLAST does not support the function that has been called. NCBI BLAST agrees that terminase is the right function for the gene. HHPRED matches a terminase large subunit in a thermophilic virus with an E-value of 2.9E-33 and a probability of 100%. No TMD in this protein. CDS 15001 - 16230 /gene="32" /product="gp32" /function="portal protein" /locus tag="alkhayr_32" /note=Original Glimmer call @bp 15001 has strength 17.17; Genemark calls start at 15001 /note=SSC: 15001-16230 CP: yes SCS: both ST: SS BLAST-Start: [portal protein [Mycobacterium phage YungJamal]],,NCBI, q1:s1 100.0% 0.0 GAP: 49 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.488, -3.670670413150579, no F: portal protein SIF-BLAST: ,,[portal protein [Mycobacterium phage YungJamal]],,AII28274,100.0,0.0 SIF-HHPRED: Phage portal protein, HK97 family; "neck", "portal", "capsid", "tail tube", VIRUS; 3.58A {Rhodobacter capsulatus},,,6TE9_A,83.3741,100.0 SIF-Syn: By looking at Alkhayr in Pham maps and comparing it with Krili, a similar location of gene 30 is present for gene 31 with Krili.
The function of gene 30 is also the same as that of gene 31 in Krili. The placement of the genes in Pham maps and their functions match, which is evidence of synteny. /note=Genemark, Glimmer, and Starterator all agree that the correct start site for gene 30 is 15001 because it gives the longest ORF with a gap of 49 bp, and it is also the most called start site across all annotations. PhagesDB has called portal protein as the function in a number of clusters other than cluster O. PhagesDB BLAST does not support the function that has been called. NCBI BLAST agrees that portal protein is the right function for the gene. HHPRED matches Phage portal protein, HK97 family; "neck", "portal", "capsid", "tail tube", VIRUS; 3.58A {Rhodobacter capsulatus}, with a probability of 100% and an E-value of 1.3E-38. TMHMM predicts 1 transmembrane domain in the gene. CDS 16227 - 16811 /gene="33" /product="gp33" /function="O-methyltransferase" /locus tag="alkhayr_33" /note=Original Glimmer call @bp 16227 has strength 12.6; Genemark calls start at 16227 /note=SSC: 16227-16811 CP: no SCS: both ST: SS BLAST-Start: [methyltransferase [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 8.54185E-139 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.8, -5.154811153158988, no F: O-methyltransferase SIF-BLAST: ,,[methyltransferase [Mycobacterium phage Catdawg] ],,YP_008409202,99.4845,8.54185E-139 SIF-HHPRED: CalS11; Methyltransferase, Calicheamicin, CalS11, Structural genomics, Protein Structure Initiative, PSI, NatPro, Enzyme Discovery for Natural Product Biosynthesis; HET: GLU, MSE, NA, SAH, EDO; 1.55A {Micromonospora echinospora} SCOP: c.66.1.61,,,3TOS_H,78.866,99.7 SIF-Syn: By looking at Alkhayr in Pham maps and comparing it with Krili, a similar location of gene 31 is present for gene 32 with Krili. The function of gene 31 is also the same as that of gene 32 in Krili. The placement of the genes in Pham maps and their functions match, which is evidence of synteny.
/note=Glimmer, Genemark, and Starterator all agree that the correct start site is 16227 because it gives the longest open reading frame and a gap of -4, which means the gene overlaps its upstream neighbor slightly. PhagesDB has called O-methyltransferase as the function in a number of clusters other than cluster O. PhagesDB BLAST does not support the function that has been called. NCBI BLAST agrees that O-methyltransferase is the right function for the gene. HHPRED matches CalS11; Methyltransferase, Calicheamicin, CalS11, Structural genomics, Protein Structure Initiative, PSI, NatPro, Enzyme with a probability of 99.67 and an E-value of 6.9E-14. TMHMM shows evidence of 1 transmembrane domain in the gene. CDS 16804 - 17673 /gene="34" /product="gp34" /function="hypothetical protein" /locus tag="alkhayr_34" /note=Original Glimmer call @bp 16855 has strength 17.01; Genemark calls start at 16804 /note=SSC: 16804-17673 CP: yes SCS: both-gm ST: NI BLAST-Start: [glycosyltransferase [Mycobacterium phage MiaZeal] ],,NCBI, q1:s1 95.5017% 2.08768E-78 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.647, -3.4219145408355454, no F: hypothetical protein SIF-BLAST: ,,[glycosyltransferase [Mycobacterium phage MiaZeal] ],,YP_009213233,60.6498,2.08768E-78 SIF-HHPRED: SIF-Syn: Gene 32 in Alkhayr possesses synteny to final called genomes, Krilli and CatDawg. The observed gene is syntenic to gene #33 in Krilli and gene #34 in CatDawg, where the gene is called as glycosyltransferase. The upstream and downstream genes surrounding gene #32 are conserved in location and function, so the region is syntenic. The function call is not consistent across syntenic genes in the O cluster. /note=Start is found in forward strand at 16804 bp. This start is supported/called by Genemark only. The suggested start from Starterator is not found in Alkhayr, but 16804 has 24 MAs. Glimmer's start could be a start, but Genemark's suggested start yields a longer reading frame with a smaller overlap/gap value.
The Genemark graph supports the start at 16804: coding capacity is strong across the full gene, and the reading frame is rather long at 870 bp with a small overlap of 8 bp. The RBS score of 2.647 is comparable to that of Glimmer's suggested start, which has an RBS score of 2.984. The observed gene is syntenic to gene #33 in Krilli from the same O cluster. The gene's function is that of glycosyltransferase, which was determined using cross genomics to Krilli, a fully sequenced genome. The NCBI BLAST alignment had an e-value of 2.08768e-78 and PhagesDB BLAST had an e-value of 1e-172. There is no high-quality HHPRED match; the closest is cd00761, which has an e-value of 0.024 with 48.8% coverage and 97.4% probability. CDS 17670 - 18362 /gene="35" /product="gp35" /function="glycosyltransferase" /locus tag="alkhayr_35" /note=Original Glimmer call @bp 17670 has strength 12.36; Genemark calls start at 17670 /note=SSC: 17670-18362 CP: yes SCS: both ST: NI BLAST-Start: [galactosyl transferase [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 2.76966E-165 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.577, -3.54831954703195, no F: glycosyltransferase SIF-BLAST: ,,[galactosyl transferase [Mycobacterium phage Corndog] ],,NP_817888,100.0,2.76966E-165 SIF-HHPRED: CESA_CelA_like; CESA_CelA_like are involved in the elongation of the glucan chain of cellulose. Family of proteins related to Agrobacterium tumefaciens CelA and Gluconacetobacter xylinus BscA.,,,cd06421,90.8696,99.9 SIF-Syn: Observed gene possesses synteny to both upstream and downstream genes observed in same O cluster, Krili and MadKillah. It is syntenic to gene #36 in MadKillah and gene #34 in Krili. /note=Start is found in forward strand at 17670 bp. Alkhayr does not have the most annotated start in Starterator, but 17670 has 64 MAs. This start is called by both the Glimmer and Genemark reports. This potential start site agrees with the annotating principles: it gives a long ORF of 693 bp and a small overlap (gap of -4).
Compared with the other potential starts, these values give the best ORF length for the smallest gap/overlap. The Genemark graph supports this conclusion, as strong coding capacity is present and maintained with the potential 17670 start. This gene start yields a fair RBS of 2.577. PhagesDB BLAST shows the function as glycosyltransferase, seen in Krili with an e-value of 1e-129. NCBI BLAST supports calling this function in multiple other phages, with one hit having an e-value of 2.76966e-165. HHPRED gives an e-value of 3e-25 with 99% probability and 90.8696% coverage. No transmembrane domain is predicted by DeepTMHMM. CDS 18359 - 18979 /gene="36" /product="gp36" /function="glycosyltransferase" /locus tag="alkhayr_36" /note=Original Glimmer call @bp 18359 has strength 8.34; Genemark calls start at 18359 /note=SSC: 18359-18979 CP: yes SCS: both ST: SS BLAST-Start: [glycosyltransferase [Mycobacterium phage JangDynasty] ],,NCBI, q1:s1 100.0% 7.99115E-145 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.722, -5.392936417653797, no F: glycosyltransferase SIF-BLAST: ,,[glycosyltransferase [Mycobacterium phage JangDynasty] ],,AVI04067,100.0,7.99115E-145 SIF-HHPRED: Glyco_transf_25; Glycosyltransferase family 25 [lipooligosaccharide (LOS) biosynthesis protein] is a family of glycosyltransferases involved in LOS biosynthesis.,,,cd06532,72.8155,99.5 SIF-Syn: This observed gene holds synteny to gene #35 in final called Krili and gene #36 in CatDawg, both belonging to the O cluster. The observed gene has upstream (glycosyltransferase) and downstream (capsid maturation) gene synteny to the genes found in Krili and CatDawg. /note=Gene is found in the forward strand. Genemark and Glimmer agree on the start site at 18359 bp, which gives the longest ORF of 621 bp with the shortest overlap at 4 bp. According to Starterator, it is also the most called start site in all the annotations.
This start, as seen on the Genemark report, reflects strong coding potential throughout the gene with slight, negligible staggering at around 18800 bp. PhagesDB BLAST finds this gene in final called Krili and CatDawg, both calling it a glycosyltransferase, with a Krili e-value of 1e-116 and a CatDawg e-value of 1e-117. This potential start site yields a fair RBS of 1.722. NCBI BLAST provides accession AVI04067, which calls a glycosyltransferase with 99.5146% identity over 100% coverage and an e-value of 7.99115e-145. HHPRED reports a hit with 99.5% probability and 72.8155% coverage, with an e-value of 5.9e-13. No transmembrane domain is predicted by DeepTMHMM. CDS 19048 - 19713 /gene="37" /product="gp37" /function="capsid maturation protease" /locus tag="alkhayr_37" /note=Original Glimmer call @bp 19048 has strength 15.57; Genemark calls start at 19048 /note=SSC: 19048-19713 CP: no SCS: both ST: SS BLAST-Start: [gp39 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 5.91405E-162 GAP: 68 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.567, -3.50848394673489, no F: capsid maturation protease SIF-BLAST: ,,[gp39 [Mycobacterium phage Corndog] ],,NP_817890,100.0,5.91405E-162 SIF-HHPRED: Peptidase_S78 ; Caudovirus prohead serine protease,,,PF04586.20,71.0407,99.8 SIF-Syn: This observed gene holds synteny to gene #36 in final called Krili and gene #37 in CatDawg, both belonging to the O cluster. The observed gene has an upstream gene (glycosyltransferase) and a small downstream gene (UKF) syntenic to the genes found in Krili and CatDawg. /note=The start for this gene is at 19048 bp, which was predicted by both Glimmer and Genemark. Using this as a start site gives the longest reading frame of 666 bp, while also producing the smallest gap seen in this region with its upstream gene, a 68 bp gap. The Genemark report reveals an interesting coding potential. The coding potential is strong but takes two severe dips, one at around 19110 bp and a second at around 19305 bp.
This potential start yields an RBS of 2.567. Starterator shows that this start is called in 28 of 29 (96.6%) annotated genes in this O cluster, which strongly supports it. PhagesDB BLAST calls this gene a capsid maturation protease in both Krili and CatDawg; Krili has an e-value of 1e-124 and CatDawg an e-value of 1e-125. NCBI BLAST shows this start is conserved, as seen in phage Corndog with 100% identity over 100% coverage and an e-value of 5.91405e-162. HHPRED matches a protease with 99.8% probability over 71.0407% coverage with an e-value of 5e-17. No transmembrane domain is predicted by DeepTMHMM. This start does not violate most of the guiding principles of annotation. However, there is a 68 bp gap between this gene and its upstream gene, so it should be noted that "The gene density in phage genomes is very high, so genes tend to be tightly packed. Thus, there are typically not large non-coding gaps between genes," (SeaPhages BioInformatics Site). This gap is worth keeping in mind. CDS 19710 - 19847 /gene="38" /product="gp38" /function="hypothetical protein" /locus tag="alkhayr_38" /note=Original Glimmer call @bp 19710 has strength 14.37; Genemark calls start at 19710 /note=SSC: 19710-19847 CP: yes SCS: both ST: SS BLAST-Start: [gp40 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 8.65016E-22 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.085, -4.582905859694018, no F: hypothetical protein SIF-BLAST: ,,[gp40 [Mycobacterium phage Corndog] ],,NP_817891,100.0,8.65016E-22 SIF-HHPRED: Glyco_hydro_65C ; Glycosyl hydrolase family 65, C-terminal domain,,,PF03633.18,31.1111,52.5 SIF-Syn: Syntenic to gene 37 in Krili (final), which has an unknown function. /note=The start site for this gene is agreed upon at 19710 bp. This start site produces a reading frame of 138 bp, which is not the longest, but it produces the smallest gap/overlap between genes of -4.
The coding potential/capacity of this gene is very strong. This start site is found 100% of the time for genes in this related Pham. No supporting evidence for a function call. No TMD. CDS 19999 - 21231 /gene="39" /product="gp39" /function="major capsid protein" /locus tag="alkhayr_39" /note=Original Glimmer call @bp 19999 has strength 17.94; Genemark calls start at 19999 /note=SSC: 19999-21231 CP: yes SCS: both ST: SS BLAST-Start: [major head protein [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 0.0 GAP: 151 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.006, -2.7423981586358774, yes F: major capsid protein SIF-BLAST: ,,[major head protein [Mycobacterium phage Catdawg] ],,YP_008409208,100.0,0.0 SIF-HHPRED: Major capsid protein; HK97-like fold, capsid size redirection, major capsid protein, VIRUS; 4.0A {Staphylococcus aureus},,,7RWZ_B,99.0244,100.0 SIF-Syn: Syntenic to gene 38 in Krili (final), which has a known function of major capsid protein. /note=The current predicted start @ 19999 bp (start site #1) is the most annotated start in Starterator. Agrees with Glimmer and Genemark. From the genemark graph we can infer there is strong coding potential with a long reading frame and small gap length. The genes function is Major Capsid Protein, which was determined using cross genomics to Krili, a fully sequenced genome. NCBI Blast supports the function major capsid protein with a 100% Query cover and an e-value of 0. HHPred and CDD support the function call too. 
CDS 21231 - 21479 /gene="40" /product="gp40" /function="hypothetical protein" /locus tag="alkhayr_40" /note=Original Glimmer call @bp 21231 has strength 9.52; Genemark calls start at 21231 /note=SSC: 21231-21479 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_40 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 2.97628E-47 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.835, -5.160693811781179, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_40 [Mycobacterium phage Catdawg] ],,YP_008409209,100.0,2.97628E-47 SIF-HHPRED: DsbC_N ; Disulfide bond isomerase protein N-terminus,,,PF10411.12,17.0732,37.1 SIF-Syn: Syntenic to gene 39 in Krili (final), which has an unknown function. /note=The start site for this gene is agreed upon at 21231bp length. This start site is called 100% of the time for genes in this related Pham, 22 MAs. From the genemark graph we can infer there is strong coding potential with a long reading frame and small overlap. The genes function is unknown, which was determined using cross genomics to Krili, a fully sequenced genome. No supporting evidence for a function. No TMD. CDS 21489 - 22094 /gene="41" /product="gp41" /function="head-to-tail adaptor" /locus tag="alkhayr_41" /note=Original Glimmer call @bp 21489 has strength 16.16; Genemark calls start at 21489 /note=SSC: 21489-22094 CP: yes SCS: both ST: SS BLAST-Start: [head-tail adaptor [Mycobacterium phage Dylan] ],,NCBI, q1:s1 100.0% 1.39864E-144 GAP: 9 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.912, -2.8742224756987462, yes F: head-to-tail adaptor SIF-BLAST: ,,[head-tail adaptor [Mycobacterium phage Dylan] ],,YP_008530604,100.0,1.39864E-144 SIF-HHPRED: Adaptor protein Rcc01688; "neck", "portal", "capsid", "tail tube", VIRUS; 3.58A {Rhodobacter capsulatus},,,6TE9_C,98.5075,99.9 SIF-Syn: Syntenic to gene 40 in Krili (final), which has a known function of head-to-tail adaptor. 
/note=The current predicted start @ 21489 bp is the most annotated start in Starterator. Agrees with Glimmer and Genemark. From the genemark graph we can infer there is strong coding potential with a long reading frame and small gap length. The genes function is Head-to-tail adaptor, which was determined using cross genomics to Krili, a fully sequenced genome. NCBI Blast supports the function head-to-tail adaptor with a 100% Query cover and an e-value of 1.3e-144. HHPred also supports this function call. CDS 22108 - 22245 /gene="42" /product="gp42" /function="hypothetical protein" /locus tag="alkhayr_42" /note=Original Glimmer call @bp 22108 has strength 15.62; Genemark calls start at 22108 /note=SSC: 22108-22245 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_42 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.8847E-22 GAP: 13 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.508, -3.708514821076885, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_42 [Mycobacterium phage Catdawg] ],,YP_008409211,100.0,4.8847E-22 SIF-HHPRED: d.168.1.1 (A:360-505) Flavocytochrome c3 (respiratory fumarate reductase) {Shewanella putrefaciens [TaxId: 24]} | CLASS: Alpha and beta proteins (a+b), FOLD: Succinate dehydrogenase/fumarate reductase flavoprotein, catalytic domain, SUPFAM: Succinate dehydrogenase/fumarate reductase flavoprotein, catalytic domain, FAM: Succinate dehydrogenase/fumarate reductase flavoprotein, catalytic domain,,,SCOP_d1d4da3,57.7778,81.5 SIF-Syn: Syntenic to gene 41 in Krili (final), which has an unknown function. /note=The start site for this gene is agreed upon at 22108bp length. This start site is called in 22 manual annotations. From the genemark graph we can infer there is strong coding potential with a long reading frame and small overlap. The genes function is unknown, which was determined using cross genomics to Krili, a fully sequenced genome. BLAST and HHPred do not support a function call. 
CDS 22248 - 22607 /gene="43" /product="gp43" /function="head-to-tail stopper" /locus tag="alkhayr_43" /note=Original Glimmer call @bp 22248 has strength 8.35; Genemark calls start at 22248 /note=SSC: 22248-22607 CP: yes SCS: both ST: SS BLAST-Start: [head closure [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 9.81334E-81 GAP: 2 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.39, -4.701320725655571, no F: head-to-tail stopper SIF-BLAST: ,,[head closure [Mycobacterium phage Catdawg] ],,YP_008409212,100.0,9.81334E-81 SIF-HHPRED: Phage-like element PBSX protein xkdH; NESG X-Ray SR362 P54328 Structure, Structural Genomics, PSI-2, Protein Structure Initiative, Northeast Structural Genomics Consortium, unknown; HET: PO4; 2.5A {Bacillus subtilis},,,3F3B_A,97.479,99.5 SIF-Syn: Syntenic to gene 42 in Krili (final), which has a known function as head-to-tail stopper. /note=The current predicted start @ 22248 bp is the most manually annotated start in Starterator. Agrees with Glimmer and Genemark. From the genemark graph we can infer there is strong coding potential with a long reading frame and small gap length. The genes function is Head-to-tail stopper, which was determined using cross genomics to Krili, a fully sequenced genome. NCBI Blast supports the function head-to-tail stopper with a 100% Query cover and an e-value of 9.8e-81. HHPred also supports this function. 
CDS 22604 - 22924 /gene="44" /product="gp44" /function="hypothetical protein" /locus tag="alkhayr_44" /note=Original Glimmer call @bp 22619 has strength 8.79; Genemark calls start at 22619 /note=SSC: 22604-22924 CP: yes SCS: both-cs ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_43 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 3.55845E-69 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.039, -4.597144958218091, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_43 [Mycobacterium phage Vorrps] ],,AYQ98880,100.0,3.55845E-69 SIF-HHPRED: SIF-Syn: Syntenic to gene 43 in Krili (final), which has an unknown function. CDS 22908 - 23294 /gene="45" /product="gp45" /function="hypothetical protein" /locus tag="alkhayr_45" /note=Original Glimmer call @bp 22887 has strength 7.39; Genemark calls start at 22887 /note=SSC: 22908-23294 CP: no SCS: both-cs ST: NI BLAST-Start: [minor tail protein [Mycobacterium phage Dylan] ],,NCBI, q1:s1 100.0% 1.93955E-88 GAP: -17 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.924, -4.835658240213919, no F: hypothetical protein SIF-BLAST: ,,[minor tail protein [Mycobacterium phage Dylan] ],,YP_008530608,100.0,1.93955E-88 SIF-HHPRED: HK97-gp10_like ; Bacteriophage HK97-gp10, putative tail-component,,,PF04883.15,57.0312,97.4 SIF-Syn: Gene 45 shows synteny with gene 44 of Krili, with similar upstream and downstream genes. /note=Glimmer and GeneMark agree on start site 22887. Most annotated site in Starterator is 22908 with 15 annotations. Stop site at 23294. Host-trained GeneMark supports start site 22908 with the lowest dips in prediction. Phagesdb BLAST supports NKF comparing other members within the pham with e-value 1e-77. HHPRED, NCBI BLAST, and CDD. DeepTMHMM predicts protein most likely outside membrane with high probability.
CDS 23287 - 23739 /gene="46" /product="gp46" /function="tail terminator" /locus tag="alkhayr_46" /note=Original Glimmer call @bp 23287 has strength 7.58; Genemark calls start at 23287 /note=SSC: 23287-23739 CP: yes SCS: both ST: SS BLAST-Start: [head-tail adaptor [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 7.81496E-105 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.851, -5.336604134363396, no F: tail terminator SIF-BLAST: ,,[head-tail adaptor [Mycobacterium phage Catdawg] ],,YP_008409215,100.0,7.81496E-105 SIF-HHPRED: TAIL-TO-HEAD JOINING PROTEIN GP17; VIRAL PROTEIN, VIRAL INFECTION, TAILED BACTERIOPHAGE, SIPHOVIRIDAE, SPP1, VIRAL ASSEMBLY, HEAD-TO-TAIL INTERFACE, DNA GATEKEEPER, ALLOSTERIC MECHANISM; 7.2A {BACILLUS PHAGE SPP1},,,5A21_G,79.3333,98.2 SIF-Syn: Synteny with Krili_45 with similar upstream and downstream genes. /note=Glimmer and GeneMark agree with start site @ 23287 with stop site @ 23739. Starterator provides most annotated site @ 23287 with 68 MAs. This provides the second longest ORF w/ second shortest gap. Phagesdb function frequency shows tail terminator through the same pham. Phagesdb BLAST calls tail terminator in phage Idergollasper with e-value 1e-84. HHPRED calls TAIL-TO-HEAD JOINING PROTEIN with 98.2% probability and 79.3333% coverage with e-value 6.7e-9. NCBI BLAST shows head-tail adaptor with 100% identity, coverage, and alignment. CDD unremarkable and DeepTMHMM shows protein most likely within membrane. 
CDS 23768 - 24583 /gene="47" /product="gp47" /function="major tail protein" /locus tag="alkhayr_47" /note=Original Glimmer call @bp 23768 has strength 15.94; Genemark calls start at 23768 /note=SSC: 23768-24583 CP: no SCS: both ST: NI BLAST-Start: [gp49 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 0.0 GAP: 28 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.831, -3.0248543014571307, yes F: major tail protein SIF-BLAST: ,,[gp49 [Mycobacterium phage Corndog] ],,NP_817900,100.0,0.0 SIF-HHPRED: YSD1_22 major tail protein; Bacteriophage tail, helical assembly, VIRAL PROTEIN; 3.5A {Bacteriophage sp.},,,6XGR_L,68.2657,97.1 SIF-Syn: Synteny with Krili_46 with similar upstream and downstream genes. /note=Glimmer and GeneMark call start site @ 23768 with stop site @ 24583. Starterator agrees with start site 23768 with most annotations of 221 MAs. Stop site at 24583 with high coding potential throughout. Longest ORF and shortest gap. Most commonly called as major tail protein within the same pham as per Phagesdb function frequency. Blessica called as major tail protein with e-value of 1e-162. HHPRED unremarkable. NCBI BLAST calls major tail protein with 100% identity, alignment, and coverage. CDD and DeepTMHMM unremarkable; DeepTMHMM does not predict a TMD. CDS complement (24680 - 24775) /gene="48" /product="gp48" /function="hypothetical protein" /locus tag="alkhayr_48" /note=Genemark calls start at 24775 /note=SSC: 24775-24680 CP: yes SCS: genemark ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_48 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 3.17267E-12 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.577, -4.013206345334601, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_48 [Mycobacterium phage Catdawg] ],,YP_008409217,100.0,3.17267E-12 SIF-HHPRED: SIF-Syn: Synteny with Blessica_49 with similar upstream and downstream genes. /note=GeneMark calls start site @ 24775 with high coding potential. Glimmer does not call any site.
Starterator report shows 24775 as the most annotated site with 18 MAs. This is not the longest ORF. NKF in same pham as per Phagesdb BLAST with e-value 1e-11. HHPRED uninformative. NCBI Blast calls a hypothetical protein with 100% identity, coverage, and alignment. DeepTMHMM prediction doesn`t predict a TMD. CDS complement (24772 - 24945) /gene="49" /product="gp49" /function="hypothetical protein" /locus tag="alkhayr_49" /note=Original Glimmer call @bp 24945 has strength 5.04; Genemark calls start at 24945 /note=SSC: 24945-24772 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_YUNGJAMAL_52 [Mycobacterium phage YungJamal] ],,NCBI, q1:s2 100.0% 3.09748E-34 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.016, -4.64618386363431, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_YUNGJAMAL_52 [Mycobacterium phage YungJamal] ],,AII28291,98.2759,3.09748E-34 SIF-HHPRED: SIF-Syn: Synteny with Krili_48 with similar upstream and downstream genes. /note=Glimmer and GeneMark agree with start site @ 24945 with high coding potential. Starterator agrees with 23 MAs. This provides the second longest ORF and second shortest gap. Function unknown as per Phagesdb BLAST with e-value of 1e-31 within same pham. CDD, HHPRED, NCBI BLAST, and DeepTMHMM uninformative. CDS complement (24945 - 25136) /gene="50" /product="gp50" /function="hypothetical protein" /locus tag="alkhayr_50" /note=Original Glimmer call @bp 25136 has strength 16.74; Genemark calls start at 25136 /note=SSC: 25136-24945 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_VORRPS_49 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 2.27693E-37 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_49 [Mycobacterium phage Vorrps] ],,AYQ98886,100.0,2.27693E-37 SIF-HHPRED: SIF-Syn: Synteny with Krili_49 with similar upstream and downstream genes. 
/note=Glimmer and GeneMark call start site @ 25136 and stop site @ 24945. Starterator agrees with 22 MAs for 25136. This provides the fourth longest ORF and fourth shortest gap. HT GeneMark shows strong coding potential. Phagesdb BLAST shows NKF within same pham with e-value 5e-29. HHPRED, NCBI BLAST, and CDD unremarkable. DeepTMHMM predicts the protein is most likely inside the membrane with high probability. CDS complement (25133 - 25588) /gene="51" /product="gp51" /function="hypothetical protein" /locus tag="alkhayr_51" /note=Original Glimmer call @bp 25588 has strength 10.02; Genemark calls start at 25588 /note=SSC: 25588-25133 CP: yes SCS: both ST: SS BLAST-Start: [helix-turn-helix DNA binding domain protein [Mycobacterium phage Shida]],,NCBI, q1:s1 100.0% 2.73738E-103 GAP: 70 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.435, -5.924300499696598, no F: hypothetical protein SIF-BLAST: ,,[helix-turn-helix DNA binding domain protein [Mycobacterium phage Shida]],,QOC58483,100.0,2.73738E-103 SIF-HHPRED: SIF-Syn: Synteny with MadKillah_52 with similar upstream and downstream genes. /note=GeneMark and Glimmer agree with start site @ 25588. Starterator shows this start site with the most MAs of 21. This start site provides the longest ORF with shortest gap. Gene called as helix-turn-helix DNA binding domain in same pham with MadKillah and Shida with e-value of 1e-84. NCBI BLAST calls this protein with 100% identity, coverage and alignment. Phagesdb function frequency shows this protein with 19% frequency. Nevertheless, HHPred does not have a strong match to a DNA binding domain in the protein (E = 0.000053).
CDS 25659 - 26213 /gene="52" /product="gp52" /function="tail assembly chaperone" /locus tag="alkhayr_52" /note=Original Glimmer call @bp 25659 has strength 9.71; Genemark calls start at 25671 /note=SSC: 25659-26213 CP: yes SCS: both-gl ST: SS BLAST-Start: [tail assembly chaperone [Mycobacterium phage YungJamal] ],,NCBI, q1:s1 100.0% 1.9296E-132 GAP: 70 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.289, -4.989254565642423, no F: tail assembly chaperone SIF-BLAST: ,,[tail assembly chaperone [Mycobacterium phage YungJamal] ],,AII28294,100.0,1.9296E-132 SIF-HHPRED: GP24_25 ; Mycobacteriophage tail assembly protein,,,PF17388.5,81.5217,97.1 SIF-Syn: Observed gene possesses synteny to both upstream and downstream genes observed in subcluster O phages Krili and Corndog. It is syntenic to gene #51 in Krili and #54 in Corndog. /note=Starterator and Glimmer agree on start site at 25659. GeneMark report shows strong coding potential throughout the length of the ORF. This gene shares function with related subcluster O phages such as Krili and YungJamal with e-value 1e-102 as "tail assembly chaperone". This gene does not violate any major guiding principles as it contains the longest open reading frame as well as the shortest gap at 70bp. The Z-score is 2.289 and final RBS score -4.989. The length of the ORF is 555bp. This start site is conserved in multiple other phages. HHPRED does not provide any insight into the genes specific function as the e-values are not credible. NCBI Blast provides information into the genes general function and it is a "tail assembly chaperone" with an e-value of 1.9296e-132, 100% identity, aligned and coverage, with a Subject:Query of S1:Q1 and are exact matches. There is no transmembrane prediction for this gene. 
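The ORF lengths and gap values quoted in these notes follow from the feature coordinates by simple arithmetic. The sketch below reproduces that arithmetic for the gene 52 record; the function names `orf_length` and `gap_to_neighbor` are hypothetical, and it assumes coordinates are 1-based and inclusive, which is consistent with the 555 bp length and 70 bp gap reported above.

```python
def orf_length(left, right):
    """Length in bp of a feature spanning left..right (1-based, inclusive)."""
    return right - left + 1


def gap_to_neighbor(left, neighbor_right):
    """Bases between a feature and the neighbor ending just before it.

    A positive value is a non-coding gap; a negative value is an overlap
    of that many bases (for example, -4 means a 4 bp overlap).
    """
    return left - neighbor_right - 1


# Gene 52 spans 25659-26213; gene 51 occupies 25133-25588 on the complement.
length = orf_length(25659, 26213)
gap = gap_to_neighbor(25659, 25588)
print(length, gap, length % 3 == 0)
```

The final check that the length is divisible by 3 is a quick sanity test that the start and stop are in the same reading frame.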
CDS 26266 - 26610 /gene="53" /product="gp53" /function="hypothetical protein" /locus tag="alkhayr_53" /note=Original Glimmer call @bp 26266 has strength 14.8; Genemark calls start at 26266 /note=SSC: 26266-26610 CP: yes SCS: both ST: SS BLAST-Start: [tail assembly chaperone [Mycobacterium phage Idergollasper]],,NCBI, q1:s34 100.0% 4.25769E-76 GAP: 52 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.543, -5.622485528283668, no F: hypothetical protein SIF-BLAST: ,,[tail assembly chaperone [Mycobacterium phage Idergollasper]],,URM87824,77.551,4.25769E-76 SIF-HHPRED: DUF5361 ; Family of unknown function (DUF5361),,,PF17318.5,35.0877,98.2 SIF-Syn: Many of the members of cluster O have annotated the PTFS incorrectly. Based on the forum in seaphages.org, the slippery sequence for most of the cluster O phages is not canonical. While the BLAST results support the PTFS call HHPred does not. The synteny in this case may not be a sound source of evidence as most of the annotations are incorrect. /note=Genemark, Glimmer and Starterator agree on start site at 26266. GeneMark report shows strong coding potential throughout the length of the ORF. This gene shares function with related subcluster O phages such as Corndog and MadKillah with e-value 1e-60 but function varies between NKF and "tail assembly chaperone". This gene is not the longest open reading frame available but does contain the shortest gap at 52bp. The Z-score is 1.543 and final RBS score -5.622. The length of the ORF is 345bp. HHPRED does not provide any insight into the genes specific function as the e-values are not credible. NCBI Blast provides information into the genes general function and it is labelled a "tail assembly chaperone" coinciding with phage Idergollasper with an e-value of 4.25769e-76, 77% Identity and Aligned with 100% coverage, Subject:Query of S34:Q1. There is no transmembrane prediction for this gene. There is insufficient evidence to call function of this gene. 
CDS 26610 - 31436 /gene="54" /product="gp54" /function="tape measure protein" /locus tag="alkhayr_54" /note=Original Glimmer call @bp 26610 has strength 11.65; Genemark calls start at 26610 /note=SSC: 26610-31436 CP: no SCS: both ST: SS BLAST-Start: [tape measure protein [Mycobacterium phage Shida]],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.438, -3.914968956777623, no F: tape measure protein SIF-BLAST: ,,[tape measure protein [Mycobacterium phage Shida]],,QOC58486,99.3159,0.0 SIF-HHPRED: SIF-Syn: Observed gene possesses synteny to both upstream and downstream genes observed in subcluster O phages Krili and MadKillah. It is syntenic to gene #53 in Krili and #55 in MadKillah. /note=Starterator, Glimmer and GeneMark agree on start site at 26610. GeneMark report shows strong coding potential throughout the length of the ORF. This gene shares function with related subcluster O phages such as Krili and MadKillah with e-value 0 as "tape measure protein". This gene does not violate any major guiding principles as it contains the longest open reading frame as well as the smallest gap, a 1 bp overlap (gap of -1 bp). The Z-score is 2.438 and final RBS score -3.915. The length of the ORF is 4827 bp. This start site is conserved in multiple other phages. HHPRED does not provide any insight into the gene's specific function as the e-values are not credible. NCBI Blast provides information on the gene's general function, a "tape measure protein", with an e-value of 0, 99.3% identity and 100% coverage, Subject:Query of S1:Q1. There is no transmembrane prediction for this gene. 
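The ORF lengths and gap values quoted in these notes follow from the CDS coordinates. Below is a minimal sketch of that arithmetic, assuming inclusive 1-based coordinates and that a negative gap means an overlap. The function names are hypothetical, and the amino-acid count assumes the stop codon is included in the CDS span.

```python
def orf_length(start, stop):
    """Length in bp of a feature with inclusive coordinates (either strand)."""
    return abs(stop - start) + 1


def gap_between(prev_end, next_start):
    """Bases strictly between two features; negative values mean overlap."""
    return next_start - prev_end - 1


def protein_length(start, stop):
    """Amino acids encoded, assuming the stop codon lies inside the span."""
    return orf_length(start, stop) // 3 - 1


# Gene 54 (26610 - 31436) follows gene 53, which ends at 26610.
print(orf_length(26610, 31436))   # 4827
print(gap_between(26610, 26610))  # -1, a 1 bp overlap
# Gene 53 (starts 26266) follows gene 52, which ends at 26213.
print(gap_between(26213, 26266))  # 52
```

Under this convention two genes that share a single base report a gap of -1 bp, which matches the `GAP:` field for gene 54.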
CDS 31441 - 33192 /gene="55" /product="gp55" /function="minor tail protein" /locus tag="alkhayr_55" /note=Original Glimmer call @bp 31474 has strength 13.66; Genemark calls start at 31441 /note=SSC: 31441-33192 CP: yes SCS: both-gm ST: SS BLAST-Start: [gp58 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 0.0 GAP: 4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.029, -5.924455819644863, no F: minor tail protein SIF-BLAST: ,,[gp58 [Mycobacterium phage Corndog] ],,NP_817909,99.8285,0.0 SIF-HHPRED: Dit_N ; Distal tail protein, N-terminal domain,,,PF16774.9,26.7581,99.2 SIF-Syn: Familton_56 is syntenic as well as the upstream and downstream genes. /note=The start site for this gene is 31441, as called by GeneMark and supported by Starterator; Glimmer calls 31474. The stop site for this gene as per Pham maps is 33192. The GeneMark report shows strong coding potential throughout the ORF. The function of this gene is minor tail protein, and it shares this function with cluster O phages including Dylan and Idergollasper. Alkhayr, cluster O, is 70791 bp. This gene has a total length of 583 amino acids and 1752 base pairs. CDS 33189 - 34925 /gene="56" /product="gp56" /function="minor tail protein" /locus tag="alkhayr_56" /note=Original Glimmer call @bp 33189 has strength 18.08; Genemark calls start at 33180 /note=SSC: 33189-34925 CP: yes SCS: both-gl ST: NI BLAST-Start: [minor tail protein [Mycobacterium phage Firecracker] ],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.732, -5.23097565637154, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Mycobacterium phage Firecracker] ],,YP_009014420,99.827,0.0 SIF-HHPRED: Sipho_Gp37 ; Siphovirus ReqiPepy6 Gp37-like protein,,,PF14594.9,83.91,99.9 SIF-Syn: Corndog 59. Similar flanking genes too. /note=Based on Glimmer and Starterator, the start site for this gene is 33189; GeneMark calls 33180. As per Pham maps the stop of this gene is 34925. 
The function of this gene is minor tail protein, and it shares this function with cluster O phages including Zakhe 101, Dylan, and Idergollasper. /note=alkhayr, cluster O is 70791 bp. The gene has a total of 578 amino acids and 1737 base pairs. CDS 34965 - 35813 /gene="57" /product="gp57" /function="minor tail protein" /locus tag="alkhayr_57" /note=Original Glimmer call @bp 34965 has strength 13.31; Genemark calls start at 34971 /note=SSC: 34965-35813 CP: yes SCS: both-gl ST: SS BLAST-Start: [minor tail protein [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 0.0 GAP: 39 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.567, -4.16062280240617, yes F: minor tail protein SIF-BLAST: ,,[minor tail protein [Mycobacterium phage Ryadel] ],,YP_010097551,99.6454,0.0 SIF-HHPRED: SIF-Syn: Idergollasper_57 is also a minor tail protein. Flanking genes are also syntenic. /note=The start site indicated by Starterator (34965) is the same as indicated by Glimmer and different from the one indicated by GeneMark (34971). Starterator reveals a consensus start site. There is evidence for coding potential throughout most of the gene. This gene can be seen in many annotated genomes such as “Shida”, “MadKillah”, “Ryadel”, etc. These are the same genes as the e-value is 0. The best match regarding gene number is “Krili” as this genome has the gene denoted as gene #57. The same gene can be seen in “Krili” as gene #57, cluster O, and pham 157471. It does not violate any major guiding principles because this gene has the longest ORF with the smallest gap. The Z-score is 2.567 and the final RBS score is -4.161. This is the best score because it correlates with the best start site. The length of the ORF is 849 bp and there is a 39 bp gap to the nearest upstream gene. This start site is conserved in multiple other phages. HHPRED does not provide any insight into the gene's specific function as the e-values are not credible. 
NCBI Blast provides information into the genes general function and it is a "minor tail protein" as the e-values provided from similar genes in a variety of phages are e=0. NCBI Blast confidently tells us that this is a minor tail protein as the subject:query for multiple phages is 1:1 and are exact matches. There is no transmembrane prediction for this gene which tells us that this gene is not coding for a transmembrane protein. Not every O cluster calls this is a minor tail protein. CDS 35810 - 38320 /gene="58" /product="gp58" /function="minor tail protein" /locus tag="alkhayr_58" /note=Original Glimmer call @bp 35810 has strength 12.73; Genemark calls start at 35810 /note=SSC: 35810-38320 CP: no SCS: both ST: SS BLAST-Start: [minor tail protein [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.893, -6.000812032440203, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Mycobacterium phage Vorrps] ],,AYQ98895,100.0,0.0 SIF-HHPRED: SIF-Syn: /note=The match site indicated by Starterator (35810) is the same as indicated by Glimmer and GeneMark. Starterator reveals a consensus start site. There is evidence for coding potential throughout most of the gene. This gene can be seen in many annotated genomes such as “Krili”, “Mori”, “Niebru Saylor”, etc. These are the same genes as the e-value is 0. The best match regarding gene number is “Schuy_draft” as both genomes have the gene as #58. The same gene can be seen as “Schuy_draft” as gene #58, cluster O, and pham 146743. It does not violate any major guiding principles because this gene has the longest ORF with the least gaps/spaces. The Z-score is 1.893 and the final RBS score is -6.001. This is the best score because it correlates with the best start site. The length of the ORF is 2511 and only has 4 overlaps with the nearest upstream gene. This start site is conserved in multiple other phages. 
HHPRED does not provide any insight into the gene's specific function as the e-values are not credible. NCBI Blast provides information on the gene's general function, a "minor tail protein", as the e-values provided from similar genes in a variety of phages are e=0. NCBI Blast confidently tells us that this is a minor tail protein as the subject:query for multiple phages is 1:1 and they are exact matches. There is no transmembrane prediction for this gene, which tells us that this gene is not coding for a transmembrane protein. CDS 38313 - 39803 /gene="59" /product="gp59" /function="minor tail protein, D-ala-D-ala carboxypeptidase" /locus tag="alkhayr_59" /note=Original Glimmer call @bp 38313 has strength 17.13; Genemark calls start at 38313 /note=SSC: 38313-39803 CP: yes SCS: both ST: SS BLAST-Start: [minor tail protein with lysin activity [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 0.0 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.217, -4.229395370396053, no F: minor tail protein, D-ala-D-ala carboxypeptidase SIF-BLAST: ,,[minor tail protein with lysin activity [Mycobacterium phage Catdawg] ],,YP_008409228,100.0,0.0 SIF-HHPRED: Beta-lactamase; colibactin peptidase, S12 peptidase, HYDROLASE, HYDROLASE-INHIBITOR complex; HET: 2PE, Z9A, Z9G, 97N, AV0; 2.3A {Escherichia coli CFT073},,,7MDF_A,73.7903,100.0 SIF-Syn: /note=Glimmer, GeneMark and Starterator all agree on start site 38313. According to Starterator it does not have the most annotated start. Coding potential is present through most of the gene but does not continue throughout. It has the longest open reading frame at 1491 bp and the shortest gap at -8 bp. Blastp and HHPred support the function minor tail protein. NCBI Blast confidently tells us that this is a minor tail protein as the subject:query for multiple phages is 1:1. 
CDS 39815 - 40216 /gene="60" /product="gp60" /function="hypothetical protein" /locus tag="alkhayr_60" /note=Original Glimmer call @bp 39815 has strength 18.62; Genemark calls start at 39815 /note=SSC: 39815-40216 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_60 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.55801E-84 GAP: 11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.442961286954254, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_60 [Mycobacterium phage Catdawg] ],,YP_008409229,100.0,1.55801E-84 SIF-HHPRED: SIF-Syn: /note=Glimmer, GeneMark and Starterator all agree on start site 39815. It has the most annotated start site and the longest open reading frame at 402 bp with the shortest gap at 11 bp. The gene has good coding potential. Blastp and HHPred support the function minor tail protein. CDS 40220 - 40621 /gene="61" /product="gp61" /function="hypothetical protein" /locus tag="alkhayr_61" /note=Original Glimmer call @bp 40220 has strength 6.93; Genemark calls start at 40220 /note=SSC: 40220-40621 CP: yes SCS: both ST: SS BLAST-Start: [minor tail protein [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.72993E-83 GAP: 3 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.329, -6.063367366410245, no F: hypothetical protein SIF-BLAST: ,,[minor tail protein [Mycobacterium phage Catdawg] ],,YP_008409230,100.0,4.72993E-83 SIF-HHPRED: SIF-Syn: Positive synteny in related subcluster O species such as Catdawg and Krili. /note=Glimmer, GeneMark and Starterator all agree on the start site. It contains the most annotated start site. It has the longest open reading frame at 402 bp and the shortest gap at 3 bp. Blastp and HHPred support the function minor tail protein. The gene has good coding potential but it does not continue throughout. 
CDS 40631 - 40855 /gene="62" /product="gp62" /function="hypothetical protein" /locus tag="alkhayr_62" /note=Original Glimmer call @bp 40631 has strength 12.02; Genemark calls start at 40631 /note=SSC: 40631-40855 CP: yes SCS: both ST: SS BLAST-Start: [gp65 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 6.64513E-47 GAP: 9 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.171, -4.325675514574479, no F: hypothetical protein SIF-BLAST: ,,[gp65 [Mycobacterium phage Corndog] ],,NP_817916,100.0,6.64513E-47 SIF-HHPRED: MYB PROTO-ONCOGENE PROTEIN; TRANSCRIPTION/DNA, PROTEIN-DNA COMPLEX, TRANSCRIPTION REGULATION, BZIP, PROTO-ONCOGENE, MYB, C-MYB, C/EBP, TRANSCRIPTION-DNA complex; 2.45A {HOMO SAPIENS} SCOP: a.4.1.3,,,1H89_C,24.3243,90.4 SIF-Syn: Upstream and downstream genes on Firecracker both exhibit NKF. /note=Both Genemark and Glimmer concur on the start site at 40631, which provides the longest ORF with the smallest 9 bp gap. This site is repeatedly identified in all 26 manual annotations, underscoring its genomic significance. The ORF shows robust coding potential throughout its length. However, the function remains annotated as NKF due to the absence of conclusive functional evidence from NCBI Blast and high e-values in HHPRED searches. Additionally, no transmembrane domains were predicted by DeepTMHMM, suggesting a non-membrane-associated role. 
CDS 40868 - 41056 /gene="63" /product="gp63" /function="hypothetical protein" /locus tag="alkhayr_63" /note=Original Glimmer call @bp 40868 has strength 16.54; Genemark calls start at 40868 /note=SSC: 40868-41056 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_63 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.17337E-35 GAP: 12 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.33, -2.0720764396375664, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_63 [Mycobacterium phage Catdawg] ],,YP_008409232,100.0,1.17337E-35 SIF-HHPRED: SIF-Syn: Upstream and downstream genes on Firecracker both exhibit NKF. /note=Both Genemark and Glimmer agree on the start site at 40868, which provides the longest open reading frame (ORF) with the smallest gap of 12 bp. This start site has been robustly validated in 212 out of 213 manual annotations, emphasizing its genomic significance. The ORF displays significant coding potential along its entire length, indicative of a potentially functional protein. However, despite its coding potential, it remains classified as NKF due to NCBI BLAST annotations labeling it as a hypothetical protein, indicating an absence of recognized functional signatures. Although HHpred identifies a low e-value, suggesting a potential structural or functional homology to known proteins, the lack of specific matched functions in databases keeps its functional classification as NKF. DeepTMHMM analysis indicates no transmembrane domains, supporting the interpretation that this protein does not span cellular membranes, which might suggest a cytoplasmic or extracellular role, depending on further experimental characterization. 
CDS 41056 - 41742 /gene="64" /product="gp64" /function="hypothetical protein" /locus tag="alkhayr_64" /note=Original Glimmer call @bp 41056 has strength 12.39; Genemark calls start at 41056 /note=SSC: 41056-41742 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_SHIDA_64 [Mycobacterium phage Shida] ],,NCBI, q1:s1 100.0% 1.26764E-159 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.308, -4.123435721190396, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_SHIDA_64 [Mycobacterium phage Shida] ],,QOC58496,100.0,1.26764E-159 SIF-HHPRED: NDUFB9; Electron transport chain, respiratory complex, membrane protein, Euglena gracilis, ELECTRON TRANSPORT; HET: NAI, ZMP, CDL, NDP, 3PE, 2MR, SF4, FMN, U10, PC1;{Euglena gracilis},,,8J9J_B9,11.8421,16.1 SIF-Syn: Synteny analysis reveals that adjacent genes upstream and downstream also exhibit NKF, suggesting a genomic region involved in specialized functions that are not yet determined. /note=Both Genemark and Glimmer concur on the start site at 41056, which offers the longest ORF with high coding potential and the best ribosome binding site score. This start site is further validated by Starterator, suggesting high confidence in its genomic significance. Despite the strong coding potential throughout the ORF`s length, the function remains annotated as NKF. NCBI BLAST indicates the protein as hypothetical. PhagesDB BLAST reports the function as unknown. High e-values from HHpred suggest low confidence in any functional predictions based on structural homology. Absence of transmembrane domains as indicated by DeepTMHMM, suggest the protein`s non-membrane-associated role. 
CDS 41742 - 41861 /gene="65" /product="gp65" /function="hypothetical protein" /locus tag="alkhayr_65" /note=Original Glimmer call @bp 41742 has strength 5.55; Genemark calls start at 41742 /note=SSC: 41742-41861 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein KNU03_gp069 [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 1.44307E-18 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.768, -5.986297339340146, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein KNU03_gp069 [Mycobacterium phage Ryadel] ],,YP_010097559,100.0,1.44307E-18 SIF-HHPRED: DUF3098 ; Protein of unknown function (DUF3098),,,PF11297.11,74.359,82.2 SIF-Syn: The gene exhibits synteny with genes in similar positions on Firecracker, where adjacent genes are also annotated as NKF. Notably, the upstream gene is annotated as lysin A, indicating a potential functional linkage or regulatory interaction in the phage lifecycle, although this gene itself remains uncharacterized. /note=Both Genemark and Glimmer agree on the start site at 41742, which yields the longest ORF with the shortest overlap of 1 bp. This site is also the most frequently identified start in all 27 manual annotations, underscoring its significance in the genome. The ORF exhibits strong coding potential across its length, indicative of a potentially significant biological role. Despite this, the function remains classified as NKF due to NCBI Blast results labeling the protein as hypothetical, suggesting an undefined biological role. High e-values in HHpred, indicating a lack of significant homology to functionally characterized proteins, despite the low e-value pointing to some structural similarities with proteins of unknown functions. Absence of transmembrane domains according to DeepTMHMM analysis, suggesting it does not function within or across cellular membranes. 
CDS 41911 - 43131 /gene="66" /product="gp66" /function="lysin A" /locus tag="alkhayr_66" /note=Original Glimmer call @bp 41911 has strength 16.87; Genemark calls start at 41911 /note=SSC: 41911-43131 CP: yes SCS: both ST: SS BLAST-Start: [lysin A [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 0.0 GAP: 49 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.209, -6.964808423506407, no F: lysin A SIF-BLAST: ,,[lysin A [Mycobacterium phage Winget]],,QWY81552,100.0,0.0 SIF-HHPRED: N-acetylmuramoyl-L-alanine amidase amiD; ZINC AMIDASE, PGRP, Peptidoglycan Recognizing Protein, AmpD, N-ACETYLMURAMYL-L-ALANINE AMIDASE, Cell wall biogenesis/degradation, Hydrolase, Lipoprotein, Membrane, Metal-binding; HET: GOL, AH0; 1.75A {Escherichia coli},,,3D2Y_A,54.1872,99.6 SIF-Syn: When looking at alkhayr and comparing it with Krili, a similar location of gene 66 is present for gene 65 with Krili. The functions are also and gene 65 in alkhayr compares with gene 64 in Krili and gene 67 in alkhayr compares with gene 66 with Krili /note=Glimmer, genemark, and starterator all predict that the start site is 41,911. In addition, BLASTp alignments for 41,911 is Q1:S1 so it does support 41,911 as the start site. 41,911 gives the longest ORF but not the shortest gap. 41,911 does not give the highest RBS score. In phagesdb function frequency supports the fact that the function is lysin A because the subcluster O present states that the function name is lysin A. NCBI blast supports that it is lysin A because it has a 100% identity, 100% aligned and has an e-value of 0. Phagesdb shows the cluster O has a function of lysin A with an e-value of 0. There is no information on TmHmm due to it being on the outside. Coding potential is strong throughout the gene. 
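Several notes argue that a Q1:S1 BLASTp alignment supports the chosen start, since position 1 of the query then lines up with position 1 of the database hit. The sketch below reads that pair out of a `BLAST-Start:` field. The function names are hypothetical, and treating anything other than 1:1 as "not supporting" is a simplifying assumption.

```python
import re

# Matches the "q<N>:s<M>" pair, e.g. "q1:s1" or "q1:s34".
ALIGNMENT = re.compile(r"\bq(?P<query>\d+):s(?P<subject>\d+)\b", re.IGNORECASE)


def alignment_starts(field):
    """Return (query_start, subject_start) or None if no pair is present."""
    match = ALIGNMENT.search(field)
    if match is None:
        return None
    return int(match.group("query")), int(match.group("subject"))


def supports_start(field):
    """True when the alignment begins at residue 1 of both sequences."""
    return alignment_starts(field) == (1, 1)


gene66 = "BLAST-Start: [lysin A [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 0.0"
gene53 = "BLAST-Start: [tail assembly chaperone [...]],,NCBI, q1:s34 100.0% 4.25769E-76"
print(supports_start(gene66), supports_start(gene53))
```

For gene 53 the hit begins at subject residue 34, so the alignment alone does not confirm that the two proteins share a start.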
CDS 43133 - 44173 /gene="67" /product="gp67" /function="lysin B" /locus tag="alkhayr_67" /note=Original Glimmer call @bp 43133 has strength 16.37; Genemark calls start at 43133 /note=SSC: 43133-44173 CP: yes SCS: both ST: SS BLAST-Start: [lysin B [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 0.0 GAP: 1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.970161406017234, yes F: lysin B SIF-BLAST: ,,[lysin B [Mycobacterium phage Vorrps] ],,AYQ98904,100.0,0.0 SIF-HHPRED: Gene 12 protein; alpha/beta sandwich, CELL ADHESION; 2.0A {Mycobacterium phage D29},,,3HC7_A,71.3873,100.0 SIF-Syn: When looking at alkhayr and comparing it with Krili, a similar location of gene 67 is present for gene 66 with Krili. The functions are also and gene 66 in alkhayr compares with gene 65 in Krili and gene 68 in alkhayr compares with gene 67 with Krili /note=Glimmer, GeneMark and Starterator all predict that the start site is 43,133. In addition, the BLASTp alignment for 43,133 is Q1:S1, so it does support 43,133 as the start site. 43,133 gives the longest ORF and the shortest gap. 43,133 also gives the highest RBS score. The PhagesDB function frequency supports lysin B because subcluster O lists the function name as lysin B. NCBI BLAST supports lysin B because it has 100% identity, 100% alignment and an e-value of 0. PhagesDB shows that cluster O has a function of lysin B with an e-value of 0. HHPred matches an acetylxylan esterase, which has also been called as a lysin in other clusters. There is no information from TmHmm because everything is predicted to be on the outside only. Coding potential is strong throughout the gene overall but has slight dips. 
CDS 44186 - 44482 /gene="68" /product="gp68" /function="holin" /locus tag="alkhayr_68" /note=Original Glimmer call @bp 44186 has strength 11.3; Genemark calls start at 44186 /note=SSC: 44186-44482 CP: yes SCS: both ST: SS BLAST-Start: [holin [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.79138E-62 GAP: 12 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.78, -5.133465050264298, no F: holin SIF-BLAST: ,,[holin [Mycobacterium phage Catdawg] ],,YP_008409237,98.9796,1.79138E-62 SIF-HHPRED: SIF-Syn: When looking at alkhayr and comparing it with Krili, a similar location of gene 68 is present for gene 67 with Krili. The functions are also and gene 67 in alkhayr compares with gene 66 in Krili and gene 69 in alkhayr compares with gene 68 with Krili /note=Glimmer, GeneMark and Starterator all predict that the start site is 44,186. In addition, the BLASTp alignment for 44,186 is Q1:S1, so it does support 44,186 as the start site. 44,186 gives the longest ORF and the shortest gap. 44,186 does not give the highest RBS score. The PhagesDB function frequency shows that for subcluster O the function name is holin. NCBI BLAST supports holin with 98.9% identity, 98.9% alignment and an e-value of 1.79138e-62. PhagesDB BLAST also shows holin for cluster O, with an e-value of 8e-50. There is no coding potential at the beginning, but coding potential is strong towards the middle and end of the gene, from around 44,200 to 44,482. HHPred is not supportive of holin as a function. DeepTmHmm shows that this protein has two transmembrane domains, which is a hallmark of holins. 
CDS 44479 - 44778 /gene="69" /product="gp69" /function="hypothetical protein" /locus tag="alkhayr_69" /note=Original Glimmer call @bp 44479 has strength 17.92; Genemark calls start at 44479 /note=SSC: 44479-44778 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_FAMILTON_70 [Mycobacterium phage Familton]],,NCBI, q1:s1 100.0% 4.18098E-60 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.159, -5.001247543945507, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_FAMILTON_70 [Mycobacterium phage Familton]],,ATW60552,100.0,4.18098E-60 SIF-HHPRED: SIF-Syn: When looking at alkhayr and comparing it with Krili, a similar location of gene 69 is present for gene 68 with Krili. The functions are also and gene 68 in alkhayr compares with gene 67 in Krili and gene 70 in alkhayr compares with gene 69 with Krili /note=Glimmer, GeneMark and Starterator all predict that the start site is 44,479. In addition, the BLASTp alignment for 44,479 is Q1:S1, so it does support 44,479 as the start site. 44,479 does not give the longest ORF but does have the shortest gap. 44,479 does not give the highest RBS score. In the PhagesDB function frequency there is no data available. NCBI BLAST supports a hypothetical protein, with 100% identity, 100% alignment and an e-value of 4.18098e-60. In PhagesDB BLAST with cluster O there is no function, with an e-value of 3e-48. There is no information from TmHmm, as the protein is predicted to be only on the inside. There is strong coding potential within this gene. No evidence from CDD. 
CDS 44789 - 45265 /gene="70" /product="gp70" /function="membrane protein" /locus tag="alkhayr_70" /note=Original Glimmer call @bp 44921 has strength 4.26; Genemark calls start at 44870 /note=SSC: 44789-45265 CP: yes SCS: both-cs ST: NI BLAST-Start: [hypothetical protein SEA_KRILI_69 [Mycobacterium phage Krili] ],,NCBI, q1:s1 100.0% 1.0473E-105 GAP: 10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.804, -4.067758173094699, no F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_KRILI_69 [Mycobacterium phage Krili] ],,QFP97115,100.0,1.0473E-105 SIF-HHPRED: SIF-Syn: When looking at alkhayr and comparing it with Krili, a similar location of gene 70 is present for gene 69 with Krili. The functions are also and gene 69 in alkhayr compares with gene 68 in Krili and gene 71 in alkhayr compares with gene 70 with Krili /note=Glimmer predicts that the start site is 44,921. GeneMark predicts a start site of 44,870. Starterator states that 44,789 is the most manually annotated site, so this is chosen as the start site. In addition, the BLASTp alignment for 44,789 is Q1:S1, so it does support 44,789. 44,789 has the smallest gap and longest ORF but does not give the highest RBS score. The PhagesDB function frequency has no entry for subcluster O, so it does not give any information on the function. NCBI BLAST supports a hypothetical protein with 100% identity, 100% alignment and an e-value of 1.0473e-105. PhagesDB BLAST shows that for cluster O the function is unknown with an e-value of 3e-62. Coding potential is present within this gene but it fluctuates. TmHmm shows four transmembrane domains, with no further specification, which supports the function membrane protein. 
CDS 45256 - 45753 /gene="71" /product="gp71" /function="hypothetical protein" /locus tag="alkhayr_71" /note=Original Glimmer call @bp 45256 has strength 4.15; Genemark calls start at 45256 /note=SSC: 45256-45753 CP: no SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_KRILI_70 [Mycobacterium phage Krili] ],,NCBI, q1:s1 100.0% 2.63901E-113 GAP: -10 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.814, -3.34809460825955, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_KRILI_70 [Mycobacterium phage Krili] ],,QFP97116,100.0,2.63901E-113 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 71 is most closely related to foulball ( draft) for gene 68. /note=Both Glimmer and GeneMark agree on the start site 45256. It doesn't have the longest ORF but it has the smallest gap. It has the highest RBS score. It does not show coding potential. Starterator agrees with this start site, which has been called "Most Annotated" for this gene and has 22 manual annotations. The function tools (BLASTp, CDD, and HHPRED) do not support a function. No transmembrane domain is predicted for this gene. CDS 45750 - 46190 /gene="72" /product="gp72" /function="membrane protein" /locus tag="alkhayr_72" /note=Original Glimmer call @bp 45759 has strength 10.86; Genemark calls start at 45750 /note=SSC: 45750-46190 CP: yes SCS: both-gm ST: SS BLAST-Start: [gp75 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 3.11439E-99 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.489, -6.641980767149036, no F: membrane protein SIF-BLAST: ,,[gp75 [Mycobacterium phage Corndog] ],,NP_817926,100.0,3.11439E-99 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 72 is most closely related to ashwin (draft) 73. But it is not informative about any gene function. /note=Glimmer and GeneMark call different start sites. Glimmer's start site is 45759 and GeneMark's start site is 45750. The gene selected on PECAAN does not have the longest ORF but has the shortest gap. 
The RBS score is the slightly the highest. According to Pham Starterator 45750 is the start site with the most manually annotated. With this new start site selected which is the GeneMark start site, it has the longest ORF and the smallest gap. Although the RBS score is not the highest. Coding potential is all inclusive. Deep TMHMM shows that this gene has a transmembrane domain. HHPRED confirms there is no function found with a score of 95.8% and a e value of 0.95. CDS complement (46159 - 46566) /gene="73" /product="gp73" /function="helix-turn-helix DNA binding domain" /locus tag="alkhayr_73" /note=Original Glimmer call @bp 46479 has strength 7.67; Genemark calls start at 46554 /note=SSC: 46566-46159 CP: no SCS: both-cs ST: NI BLAST-Start: [helix-turn-helix DNA binding domain protein [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 3.80504E-93 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.733, -5.309117108752867, no F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[helix-turn-helix DNA binding domain protein [Mycobacterium phage Vorrps] ],,AYQ98910,100.0,3.80504E-93 SIF-HHPRED: SIF-Syn: When comparing the alkhayr with Krili, gene 71 in alkhayr is gene 72 in Krili. They both have similar start sites. They are both in the reverse strand. /note=Glimmer and GeneMark both have different start sites. Glimmer start site is 46479 and GeneMark start site is 46554. According to PECAAN, the 46566 gene is in the reverse strand, has the longest ORF and has the smallest gap, although the RBS score isn`t the highest. According to Pham starterator, gene 46566 was the most manually annotated. There is a function and it has the higher number of frequency, which is the helix-turn-helix DNA binding domain. This function also has the lowest e value. There is no membrane present. In addition, BASTp alignments for 46566 is Q1:S1. 
CDS complement (46563 - 46697) /gene="74" /product="gp74" /function="hypothetical protein" /locus tag="alkhayr_74" /note= /note=SSC: 46697-46563 CP: no SCS: neither ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_74 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 3.17327E-24 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.419, -4.164901013602904, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_74 [Mycobacterium phage Vorrps] ],,AYQ98911,100.0,3.17327E-24 SIF-HHPRED: SIF-Syn: /note=Added missing gene. There is no coding potential on genemark report, gene is present in similar cluster O phages with unknown function. NCBI Blast: 100%, 100% coverage, Q1:S1, e value: 3.17327e-24. There is no membrane present. Gene 46697 is in the reverse strand, having the longest ORF, the shortest gap and the RBS isn`t the highest. CDS complement (46694 - 46927) /gene="75" /product="gp75" /function="hypothetical protein" /locus tag="alkhayr_75" /note=Original Glimmer call @bp 46927 has strength 8.14; Genemark calls start at 46927 /note=SSC: 46927-46694 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_VORRPS_75 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 5.51721E-46 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.06, -4.6353190821692705, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_75 [Mycobacterium phage Vorrps] ],,AYQ98912,100.0,5.51721E-46 SIF-HHPRED: DUF2642 ; Protein of unknown function (DUF2642),,,PF10842.11,51.9481,96.8 SIF-Syn: According to Pham Maps, gene 75 is the most closely related to wildflower (draft) for gene 73. /note=Start site 46927 does not have any coding potential in the Host-Trained GeneMark report. Glimmer reports that the start site is 46927 and GeneMark reports that the start site is 46927 as well, and according to starterator 46927 is the only manually annotated start site (22 MA) for this gene. In addition, BASTp alignments for 46927 is Q77:S1. 
46927 gives the shortest gap (-4); all the other start sites have larger gaps. 46927 does not give the longest ORF, though. The function tools BLASTp and CDD do not support a function. HHPRED shows function unknown with a probability of 96.8% and an e-value of 0.03. It is also not a membrane protein according to TmHmm. CDS complement (46924 - 47301) /gene="76" /product="gp76" /function="hypothetical protein" /locus tag="alkhayr_76" /note=Original Glimmer call @bp 47301 has strength 13.64; Genemark calls start at 47301 /note=SSC: 47301-46924 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_76 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 1.35728E-86 GAP: 48 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_76 [Mycobacterium phage Vorrps] ],,AYQ98913,100.0,1.35728E-86 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 76 is the most closely related to wildflower (draft) for gene 74. But not informative about any gene function. /note=Start site 47301 has coding potential on the complementary strand in the Host-Trained GeneMark report. Glimmer and GeneMark both report 47301 as the start site, and according to Starterator 47301 is the only manually annotated start (19 MA) for this gene. In addition, the BLASTp alignment for 47301 is Q1:S1. 47301 gives the shortest gap (48); all the other start sites have larger gaps. 47301 also gives the longest ORF, at 378 bp. According to PhagesDB, none of the cluster O hits has a good e-value (they are all too high), which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. 
CDS complement (47350 - 47781) /gene="77" /product="gp77" /function="hypothetical protein" /locus tag="alkhayr_77" /note=Original Glimmer call @bp 47712 has strength 10.62; Genemark calls start at 47781 /note=SSC: 47781-47350 CP: yes SCS: both-gm ST: SS BLAST-Start: [hypothetical protein KNU03_gp080 [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 1.14225E-98 GAP: 45 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.901, -5.711975353186392, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein KNU03_gp080 [Mycobacterium phage Ryadel] ],,YP_010097570,100.0,1.14225E-98 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 77 is the most closely related to Schuy (draft) for gene 75. But not informative about any gene function. /note=Start sites 47781 and 47712 both have coding potential on the complementary strand in the Host-Trained GeneMark report. Glimmer reports 47712 as the start site and GeneMark reports 47781, and according to Starterator 47781 is the most manually annotated start (22 MA) for this gene. In addition, the BLASTp alignment for 47712 is Q1:S1; it does not support 47712 as the start site. 47781 gives the shortest gap (45), compared to 47712, which has a gap of 114. 47781 also gives the longest ORF. According to PhagesDB, none of the cluster O hits has a good e-value (they are all too high), which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. 
CDS complement (47827 - 48258) /gene="78" /product="gp78" /function="hypothetical protein" /locus tag="alkhayr_78" /note=Original Glimmer call @bp 48252 has strength 12.32; Genemark calls start at 48258 /note=SSC: 48258-47827 CP: yes SCS: both-gm ST: NI BLAST-Start: [hypothetical protein PBI_DYLAN_80 [Mycobacterium phage Dylan] ],,NCBI, q1:s1 100.0% 1.29001E-102 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -3.3503726477288405, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_DYLAN_80 [Mycobacterium phage Dylan] ],,YP_008530644,100.0,1.29001E-102 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 78 is the most closely related to Madkillah (final) for gene 79. But not informative about any gene function. /note=Start sites 48252 and 48258 both have coding potential on the complementary strand in the Host-Trained GeneMark report. Glimmer reports 48252 as the start site and GeneMark reports 48258. According to Starterator, 48258 is the most manually annotated start (20 MA) for this gene, and start site 48252 has no manual annotations. In addition, the BLASTp alignment is Q1:S1, which does not support 48252 or 48258 as the start site. 48252 gives the shortest gap (2), compared to 48258, which has a gap of -4, but 48258 gives the longest ORF, at 432 bp. According to PhagesDB, none of the cluster O hits has a good e-value (they are all too high), which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. 
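The gap values weighed in the notes for genes 77 and 78 can be reproduced from the coordinates alone. The following is a minimal sketch, assuming that the gap for a reverse-strand candidate start is the number of bases between that start and the left end of the next reverse-strand gene (the one at higher coordinates), with negative values meaning overlap. The function name is hypothetical and is not part of PECAAN, DNA Master, or Starterator.

```python
def reverse_strand_gap(candidate_start: int, upstream_gene_left: int) -> int:
    """Bases between a reverse-strand start and the neighbouring gene's left end.

    Assumption: both genes are on the reverse strand, so the neighbour that is
    transcribed first sits at higher coordinates. Negative means overlap.
    """
    return upstream_gene_left - candidate_start - 1


# Gene 78 candidate starts, measured against gene 79 (left end 48255).
gaps_78 = {start: reverse_strand_gap(start, 48255) for start in (48252, 48258)}

# Gene 77 candidate starts, measured against gene 78 (left end 47827).
gaps_77 = {start: reverse_strand_gap(start, 47827) for start in (47712, 47781)}

print(gaps_78)
print(gaps_77)
```

Under this assumption the values agree with the GAP fields and with the alternative gaps quoted in the two notes.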
CDS complement (48255 - 48980) /gene="79" /product="gp79" /function="hypothetical protein" /locus tag="alkhayr_79" /note=Original Glimmer call @bp 48980 has strength 15.72; Genemark calls start at 48980 /note=SSC: 48980-48255 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein KNU03_gp082 [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 4.37906E-179 GAP: 44 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.163, -2.2763497933341483, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein KNU03_gp082 [Mycobacterium phage Ryadel] ],,YP_010097572,100.0,4.37906E-179 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 79 is the most closely related to Madkillah (final) for gene 80. But not informative about any gene function. /note=Start site 48980 has coding potential on the complementary strand in the Host-Trained GeneMark report. Glimmer and GeneMark both report 48980 as the start site, and according to Starterator 48980 is the most manually annotated start (22 MA) for this gene. In addition, the BLASTp alignment for 48980 is Q:159:S163. 48980 gives the shortest gap (44); all the other start sites have larger gaps. 48980 also gives the longest ORF, at 726 bp. According to PhagesDB, none of the cluster O hits has a good e-value (they are all too high), which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. 
CDS complement (49025 - 49657) /gene="80" /product="gp80" /function="hypothetical protein" /locus tag="alkhayr_80" /note=Original Glimmer call @bp 49657 has strength 15.15; Genemark calls start at 49657 /note=SSC: 49657-49025 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_VORRPS_80 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 2.21889E-152 GAP: 115 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.243, -2.1123791669543204, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_80 [Mycobacterium phage Vorrps] ],,AYQ98917,100.0,2.21889E-152 SIF-HHPRED: SIF-Syn: According to Pham Maps, gene 80 is the most closely related to Madkillah (final) for gene 81. But not informative about any gene function. /note=Start site 49657 has coding potential on the complementary strand in the Host-Trained GeneMark report. Glimmer and GeneMark both report 49657 as the start site, and according to Starterator 49657 is the most manually annotated start (21 MA) for this gene. In addition, the BLASTp alignment for 49657 is Q37:S272. 49657 gives a small gap of 115, but it is not the smallest. 49657 also gives an ORF length of 633 bp. According to PhagesDB, none of the cluster O hits has a good e-value (they are all too high), which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. 
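The ORF lengths quoted in the notes for genes 76, 78, 79, and 80 follow directly from the record coordinates. A minimal sketch, assuming both coordinates are 1-based and inclusive and that the stop codon is counted in the length; the helper name is hypothetical.

```python
def orf_length(left: int, right: int) -> int:
    """Length in bp of a feature with inclusive 1-based coordinates."""
    return right - left + 1


# (left, right) coordinates copied from the CDS records above.
coords = {
    76: (46924, 47301),
    78: (47827, 48258),
    79: (48255, 48980),
    80: (49025, 49657),
}

lengths = {gene: orf_length(left, right) for gene, (left, right) in coords.items()}

# A complete ORF should be a whole number of codons.
whole_codons = {gene: length % 3 == 0 for gene, length in lengths.items()}

print(lengths)
print(whole_codons)
```

The same check is a quick way to catch a mistyped coordinate, since an off-by-one error breaks the multiple-of-three test.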
CDS complement (49773 - 50951) /gene="81" /product="gp81" /function="DNA polymerase III sliding clamp (Beta)" /locus tag="alkhayr_81" /note=Original Glimmer call @bp 50816 has strength 13.62; Genemark calls start at 50816 /note=SSC: 50951-49773 CP: no SCS: both-cs ST: NI BLAST-Start: [DNA polymerase III sliding clamp beta [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 0.0 GAP: -11 bp gap LO: no RBS: Kibler 6, Karlin Medium, 0.904, -7.082057011547625, no F: DNA polymerase III sliding clamp (Beta) SIF-BLAST: ,,[DNA polymerase III sliding clamp beta [Mycobacterium phage Vorrps] ],,AYQ98918,100.0,0.0 SIF-HHPRED: DNA polymerase III, beta subunit; TM0262, DNA Polymerase III, beta subunit, Structural Genomics, Joint Center for Structural Genomics, JCSG, Protein Structure; 2.0A {Thermotoga maritima} SCOP: d.131.1.1, l.1.1.1,,,1VPK_A,89.0306,100.0 SIF-Syn: CDS complement (50941 - 51906) /gene="82" /product="gp82" /function="membrane protein, Band-7 -like" /locus tag="alkhayr_82" /note=Original Glimmer call @bp 51906 has strength 11.72; Genemark calls start at 51906 /note=SSC: 51906-50941 CP: yes SCS: both ST: NA BLAST-Start: [band-7-like membrane protein [Mycobacterium phage JangDynasty]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.984, -2.7863799983944713, yes F: membrane protein, Band-7 -like SIF-BLAST: ,,[band-7-like membrane protein [Mycobacterium phage JangDynasty]],,AVI04112,99.6885,0.0 SIF-HHPRED: SIF-Syn: Madkilla_82 and Murai_82 similar up and downstream genes. /note=Glimmer, GeneMark, and Starterator all agree that the start site of the gene is 51906, although Starterator shows that it is not the most called start site; 51906 does have the most manual annotations. Deep TmHmm doesn't predict a TMD, but HHPred has a strong match for a membrane microdomain. It is not always called in O cluster phages. 
It is also found in Madkillah, Krilli, and Murai. CDS complement (51903 - 52169) /gene="83" /product="gp83" /function="membrane protein" /locus tag="alkhayr_83" /note=Original Glimmer call @bp 52169 has strength 12.34; Genemark calls start at 52169 /note=SSC: 52169-51903 CP: yes SCS: both ST: SS BLAST-Start: [membrane protein [Mycobacterium phage Shida] ],,NCBI, q1:s1 100.0% 2.02035E-55 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.97, -4.8822772450371135, no F: membrane protein SIF-BLAST: ,,[membrane protein [Mycobacterium phage Shida] ],,QOC58514,100.0,2.02035E-55 SIF-HHPRED: RCR ; Chitin synthesis regulation, resistance to Congo red,,,PF12273.11,50.0,81.4 SIF-Syn: /note=Glimmer and GeneMark both agree that the start site of this gene is 52169, and according to Starterator it is the most annotated start site of the gene, but the start site does not call a function. It gives the longest ORF and the smallest gaps and overlap. PhagesDB does not show any evidence to support the gene function in this cluster or any other cluster, and PhagesDB BLAST doesn't support the function that has been called for the gene. NCBI BLAST shows evidence that membrane protein is the function of the gene. HHPRED matches RCR (chitin synthesis regulation, resistance to Congo red) with a probability of 81.4, which is low, and an e-value of 11. According to TMHMM, there is evidence of 1 transmembrane domain. 
CDS complement (52166 - 52342) /gene="84" /product="gp84" /function="hypothetical protein" /locus tag="alkhayr_84" /note=Original Glimmer call @bp 52342 has strength 9.33; Genemark calls start at 52342 /note=SSC: 52342-52166 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein KNU03_gp087 [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 8.84201E-33 GAP: 40 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.5052746077145835, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein KNU03_gp087 [Mycobacterium phage Ryadel] ],,YP_010097577,100.0,8.84201E-33 SIF-HHPRED: SIF-Syn: not informative. /note=Neither 52342 nor 52273 has any coding potential in the Host-Trained GeneMark report. Glimmer and GeneMark both call the start site at 52342. There is not much coding potential throughout the gene. Starterator supports 52342 as the most annotated start. PhagesDB BLAST, HHPred, and NCBI BLAST support this function for the gene. No TMB according to TmHmm. CDS complement (52383 - 52595) /gene="85" /product="gp85" /function="Ku-like dsDNA break-binding protein" /locus tag="alkhayr_85" /note=Original Glimmer call @bp 52595 has strength 16.89; Genemark calls start at 52595 /note=SSC: 52595-52383 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_VORRPS_85 [Mycobacterium phage Vorrps] ],,NCBI, q1:s23 100.0% 2.57891E-41 GAP: 118 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.592, -7.128060851526577, no F: Ku-like dsDNA break-binding protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_85 [Mycobacterium phage Vorrps] ],,AYQ98922,76.087,2.57891E-41 SIF-HHPRED: SIF-Syn: /note=The Glimmer start site is 52595, and the GeneMark start is 52595 as well. According to PhagesDB, none of the cluster O hits has a good e-value (they are all too high), which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. 
CDS complement (52714 - 53025) /gene="86" /product="gp86" /function="hypothetical protein" /locus tag="alkhayr_86" /note=Original Glimmer call @bp 53025 has strength 10.49; Genemark calls start at 53025 /note=SSC: 53025-52714 CP: no SCS: both ST: NI BLAST-Start: [HTH DNA binding protein [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 5.07918E-67 GAP: 15 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.163, -2.627458653341447, yes F: hypothetical protein SIF-BLAST: ,,[HTH DNA binding protein [Mycobacterium phage Ryadel] ],,YP_010097580,99.0291,5.07918E-67 SIF-HHPRED: SIF-Syn: /note=GeneMark and Glimmer both call a start site at 53025. According to PhagesDB, none of the cluster O hits has a good e-value (4e-53); they are all too high, which also indicates that the function is unknown. None of the function tools (BLASTp, HHPRED, or CDD) supports a function. It is also not a membrane protein according to TmHmm. CDS complement (53041 - 53895) /gene="87" /product="gp87" /function="Ku-like dsDNA break-binding protein" /locus tag="alkhayr_87" /note=Original Glimmer call @bp 53895 has strength 17.85; Genemark calls start at 53895 /note=SSC: 53895-53041 CP: yes SCS: both ST: SS BLAST-Start: [MULTISPECIES: Ku protein [unclassified Mycolicibacter] ],,NCBI, q1:s1 98.5916% 7.7538E-50 GAP: 53 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.523003374675015, yes F: Ku-like dsDNA break-binding protein SIF-BLAST: ,,[MULTISPECIES: Ku protein [unclassified Mycolicibacter] ],,WP_224864635,57.6512,7.7538E-50 SIF-HHPRED: KU_like; Ku-core domain, Ku-like subfamily; composed of prokaryotic homologs of the eukaryotic DNA binding protein Ku.,,,cd00789,94.0141,100.0 SIF-Syn: Gene 88 is very similar to gene 92 in Krilli. Genes that are upstream and downstream are also similar. Also very similar to gene 92 in Madkillah. They also share similar upstream and downstream genes. Krilli also has the Ku-like dsDNA break-binding protein function. Vorrps also is homologous at gene 88 with the same function. 
/note=Glimmer, GeneMark, and Starterator all agree that 53,895 is the start site. Pham reports show that 53,895 has 60 manual annotations, which also supports 53,895 as the correct start site. 53,895 has the second longest ORF. It also has the most favorable gap, 53. There is not much coding potential throughout the gene. PhagesDB BLAST, HHPred, and NCBI BLAST support this function for the gene, with the alignment being Q1:S1. PhagesDB Function Frequency shows a 15% frequency for Ku-like dsDNA break-binding protein in subcluster O. Strong HHPred match. CDS complement (53949 - 54176) /gene="88" /product="gp88" /function="hypothetical protein" /locus tag="alkhayr_88" /note=Original Glimmer call @bp 54176 has strength 2.22; Genemark calls start at 54176 /note=SSC: 54176-53949 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_DYLAN_88 [Mycobacterium phage Dylan] ],,NCBI, q1:s42 100.0% 1.02815E-45 GAP: 7 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.546, -3.6916801401915316, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_DYLAN_88 [Mycobacterium phage Dylan] ],,YP_008530652,64.6552,1.02815E-45 SIF-HHPRED: SIF-Syn: This gene is homologous to gene 88 in Dylan with similar upstream and downstream genes. This is also the same for gene 89 in Blessica, 88 in corndog, and 88 in catdawg. /note=Glimmer, GeneMark, and Starterator all conclude that 54,176 is the most likely start site. 54,176 has the most manual annotations, with 20 MAs. Though it does not have the longest ORF, it has a favorable gap of 7. PhagesDB BLAST, HHPred, and NCBI BLAST support this being a hypothetical protein. GeneMark shows no coding potential for this gene. HHPred shows no hits with a probability of 95% or higher, the level that is favorable for determining the correct gene function. NCBI BLAST supports 100% coverage and an e-value of 1.02815e-45 for hypothetical protein. HHPred doesn't show a strong match. 
CDS complement (54184 - 54459) /gene="89" /product="gp89" /function="hypothetical protein" /locus tag="alkhayr_89" /note=Genemark calls start at 54459 /note=SSC: 54459-54184 CP: yes SCS: genemark ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_89 [Mycobacterium phage Catdawg] ],,NCBI, q1:s34 100.0% 2.6224E-58 GAP: 176 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.831, -2.9625409806968013, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_89 [Mycobacterium phage Catdawg] ],,YP_008409258,73.3871,2.6224E-58 SIF-HHPRED: DNA gyrase inhibitor YacG; Isomerase, DUF329, ISOMERASE-ISOMERASE INHIBITOR complex; 3.3A {Escherichia coli},,,4TMA_L,32.967,97.1 SIF-Syn: This gene is very similar to gene 88 in school bus, gene 90 in YungJamal, gene 88 in JangDynasty, and gene 86 in TelAviv. They also all share similar downstream and upstream genes. Most of these genes have NKF/ are hypothetical proteins. /note=GeneMark and Starterator both agree that the most probable start site is 54,459. Starterator shows that this start site has the most manual annotations, with 11 in total. It has the second longest ORF and a gap of 176. PhagesDB BLAST, HHPred, and NCBI BLAST support this being a hypothetical protein. HHPred showed a few hits with a probability greater than 96% that were uncharacterized or hypothetical proteins. NCBI BLAST supports 100% coverage and an e-value of 2.6224e-58 for hypothetical protein. HHPred doesn't have a strong match; the e-value is high. TmHmm shows no TMD. 
CDS complement (54636 - 56144) /gene="90" /product="gp90" /function="ParB-like nuclease domain" /locus tag="alkhayr_90" /note=Original Glimmer call @bp 56144 has strength 14.03; Genemark calls start at 56144 /note=SSC: 56144-54636 CP: yes SCS: both ST: SS BLAST-Start: [ParB-like nuclease domain protein [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.5052746077145835, yes F: ParB-like nuclease domain SIF-BLAST: ,,[ParB-like nuclease domain protein [Mycobacterium phage Winget]],,QWY81576,100.0,0.0 SIF-HHPRED: ParB domain protein nuclease; ParB-N, pnob8, partition, HYDROLASE; HET: CIT, MSE; 2.45A {Sulfolobus solfataricus},,,5K5D_A,35.8566,99.7 SIF-Syn: Gene is very similar to gene 91 in Winget, 87 in FoulBall, 88 in Schuy, and 92 in MadKillah. Winget and Madkillah share the same function, while Schuy and FoulBall are NKF. All these genes have similar upstream and downstream genes. /note=Glimmer, GeneMark, and Starterator all agree that 56,144 is the most probable start site of this gene. 56,144 has the most manual annotations, with 22 in total. It has the second longest ORF, with the most favorable gap of -4. PhagesDB BLAST, HHPred, and NCBI BLAST support this being a ParB-like nuclease domain. HHPred showed a few hits with a probability greater than 96% that were ParB-like nuclease domains. NCBI BLAST supports 100% coverage and an e-value of 0.0 for this function. PhagesDB Function Frequency reveals a frequency of 26% for ParB-like nuclease domain protein in subcluster O. 
CDS complement (56141 - 56479) /gene="91" /product="gp91" /function="hypothetical protein" /locus tag="alkhayr_91" /note=Original Glimmer call @bp 56479 has strength 7.77; Genemark calls start at 56479 /note=SSC: 56479-56141 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_91 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.29336E-77 GAP: 43 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.221, -2.2186743274732437, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_91 [Mycobacterium phage Catdawg] ],,YP_008409260,100.0,4.29336E-77 SIF-HHPRED: DUF983 ; Protein of unknown function (DUF983),,,PF06170.15,16.9643,92.9 SIF-Syn: Similar to gene 91 in Krilli, 93 in MadKillah, 91 in Mori, 91 in Idergollasper, 91 in Corndog and 91 in Catdawg, all being downstream genes. /note=Glimmer, GeneMark, and Starterator all agree that 56,479 is the most probable start site of this gene. 56,479 has the most manual annotations, with 22 in total. It has the second longest ORF, with a gap of 43. According to GeneMark, there is high coding potential on the reverse strand around 56400. PhagesDB function frequency, HHPred, and NCBI BLAST support this being a hypothetical protein. HHPred shows no hits with a probability of 95% or higher. NCBI BLAST supports 100% coverage and an e-value of 4.29336e-77 for hypothetical protein. PhagesDB function frequency shows 50% frequency for hypothetical DNA binding protein. TmHmm doesn't predict any TMDs. 
CDS complement (56523 - 56972) /gene="92" /product="gp92" /function="hypothetical protein" /locus tag="alkhayr_92" /note=Original Glimmer call @bp 56972 has strength 7.09; Genemark calls start at 56972 /note=SSC: 56972-56523 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_WINGET_93 [Mycobacterium phage Winget] ],,NCBI, q1:s1 100.0% 6.79269E-105 GAP: 105 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.496, -6.548494936238812, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_WINGET_93 [Mycobacterium phage Winget] ],,QWY81578,100.0,6.79269E-105 SIF-HHPRED: CRISPR-associated protein, Cse2 family; CRISPR, Cascade, CasB, CRISPR-assoicated protein, Nucleic acid binding protein, DNA BINDING PROTEIN; 1.9A {Thermobifida fusca},,,4H79_A,61.745,39.7 SIF-Syn: Similar to gene 92 in Catdawg, 92 in Corndog, 96 in Firecracker, 92 in Idergollasper, 92 in Krilli, 94 in MadKillah, 92 in Mori, 92 in NiebruSaylor, 96 in Ryadel, 92 in Shida, 94 in Smooch, and 92 in Vorrps, all being upstream genes. /note=Glimmer, GeneMark, and Starterator all agree that 56,972 on the reverse strand is the most probable start site of this gene. 56,972 has the most manual annotations, with 22 in total. It has the longest ORF, with the lowest gap of 105. According to GeneMark, there is no coding potential on the reverse strand around 56,972. NCBI BLAST supports this being a hypothetical protein, while PhagesDB BLAST consistently shows unknown function. HHPred shows no hits with a probability of 95% or higher. NCBI BLAST supports 100% coverage and an e-value of 6.79269e-105 for hypothetical protein. No TMD according to TmHmm. 
CDS complement (57078 - 57419) /gene="93" /product="gp93" /function="hypothetical protein" /locus tag="alkhayr_93" /note=Original Glimmer call @bp 57419 has strength 9.69; Genemark calls start at 57419 /note=SSC: 57419-57078 CP: yes SCS: both ST: SS BLAST-Start: [gp93 [Mycobacterium phage Corndog] ],,NCBI, q1:s1 100.0% 1.71276E-75 GAP: -11 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.5052746077145835, yes F: hypothetical protein SIF-BLAST: ,,[gp93 [Mycobacterium phage Corndog] ],,NP_817944,100.0,1.71276E-75 SIF-HHPRED: SIF-Syn: Similar to other genes in several O cluster phages but all have no known function. /note=Glimmer, GeneMark, and Starterator all agree that 57,419 is the most probable start site of this gene. 57,419 has the most manual annotations, with 22 in total. It has the second longest ORF, with the most favorable gap of -11. PhagesDB BLAST mostly states function unknown; HHPred showed BNR/Asp-box repeat with the highest probability, with the second highest going to protein of unknown function; and NCBI BLAST shows 100% coverage and an e-value of 1.71276e-75 for hypothetical protein. Taken together, the function is not known. 
CDS complement (57409 - 57681) /gene="94" /product="gp94" /function="hypothetical protein" /locus tag="alkhayr_94" /note=Original Glimmer call @bp 57681 has strength 10.0; Genemark calls start at 57681 /note=SSC: 57681-57409 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_94 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.18105E-58 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.926, -2.8454123590742793, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_94 [Mycobacterium phage Catdawg] ],,YP_008409263,100.0,4.18105E-58 SIF-HHPRED: Size determination protein Sid; bacteriophage, phage, procapsid, satellite phage, size determination, capsid protein, molecular piracy, VIRUS;{Escherichia phage P2},,,7JW1_e,53.3333,62.3 SIF-Syn: not informative /note=Start site 57681 has the longest ORF and the least overlap. It also has the highest RBS scores. Starterator supports 57681 as the start site. It has been manually annotated 20 times. None of the function tools supports a function. No TMD predicted by Deep TmHmm. CDS complement (57678 - 58478) /gene="95" /product="gp95" /function="hypothetical protein" /locus tag="alkhayr_95" /note=Original Glimmer call @bp 58478 has strength 19.06; Genemark calls start at 58478 /note=SSC: 58478-57678 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_95 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 0.0 GAP: 51 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -3.095100142625534, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_95 [Mycobacterium phage Catdawg] ],,YP_008409264,100.0,0.0 SIF-HHPRED: DUF3150 ; Protein of unknown function (DUF3150),,,PF11348.11,67.6692,96.9 SIF-Syn: not informative /note=Start site 58478 has been selected by GeneMark and Glimmer. Even though it doesn't have the longest ORF (828) or the shortest gap (24), it has the strongest ribosome binding site. 
Starterator supports the start site, as it is the most manually annotated. None of the function tools supports a function. No TMD predicted by Deep TmHmm. CDS complement (58530 - 58787) /gene="96" /product="gp96" /function="hypothetical protein" /locus tag="alkhayr_96" /note=Original Glimmer call @bp 58787 has strength 5.22; Genemark calls start at 58787 /note=SSC: 58787-58530 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_VORRPS_96 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 2.00409E-52 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.39, -3.8730906109286094, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_96 [Mycobacterium phage Vorrps] ],,AYQ98933,100.0,2.00409E-52 SIF-HHPRED: SIF-Syn: not informative /note=Start site 58787 has the least overlap and the highest RBS score, even though it doesn't have the longest ORF. Coding potential in this case is not that strong. None of the function tools supports a function. CDS complement (58784 - 59044) /gene="97" /product="gp97" /function="hypothetical protein" /locus tag="alkhayr_97" /note=Original Glimmer call @bp 59044 has strength 8.63; Genemark calls start at 59044 /note=SSC: 59044-58784 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_96 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 7.78607E-55 GAP: 82 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.794070146961553, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_96 [Mycobacterium phage Catdawg] ],,YP_008409265,100.0,7.78607E-55 SIF-HHPRED: Translocating peptide; SecA, SecY, Translocation, Cryo-EM, PROTEIN TRANSPORT; HET: PGV, GYS, ADP; 3.45A {Bacillus subtilis (strain 168)},,,6ITC_B,62.7907,84.2 SIF-Syn: /note=Even though start site 59044 doesn't have the longest ORF or the least gaps or overlaps, it has the lowest RBS score. Phamerator supports the start site because there are 10 manual annotations on this start site. 
Coding potential isn't that strong. None of the function tools supports a function. CDS complement (59127 - 60827) /gene="98" /product="gp98" /function="AAA-ATPase" /locus tag="alkhayr_98" /note=Original Glimmer call @bp 60905 has strength 18.46; Genemark calls start at 60827 /note=SSC: 60827-59127 CP: yes SCS: both-gm ST: SS BLAST-Start: [AAA-ATPase [Mycobacterium phage NiebruSaylor] ],,NCBI, q1:s27 100.0% 0.0 GAP: 343 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.924, -4.9157003279346805, no F: AAA-ATPase SIF-BLAST: ,,[AAA-ATPase [Mycobacterium phage NiebruSaylor] ],,QOC59297,95.6081,0.0 SIF-HHPRED: Peroxisomal ATPase PEX1; AAA ATPase, peroxisomes, peroxisome biogenesis, peroxisome biogenesis disorders, Zellweger Syndrome, TRANSLOCASE; HET: ADP, MG, ATP;{Saccharomyces cerevisiae},,,8C0W_F,75.6184,99.9 SIF-Syn: A number of O cluster phages show similar genes in this region, Madkillah, Mori, and Murai, to name some. /note=The start site for this gene is 60827. Glimmer suggests 60905 as the start site, but it doesn't have the longest ORF or the shortest gap. Starterator suggests 60827 as the start site because it has been manually annotated 18 times. GeneMark shows high coding potential. PhagesDB function frequency shows the function within the same subcluster, with 22 hits. The HHpred match to an ATPase in S. cerevisiae is strong, with an e-value of 1E-19. CDD also has a strong match to that function. TmHmm doesn't predict a TMD. 
CDS complement (61171 - 61605) /gene="99" /product="gp99" /function="hypothetical protein" /locus tag="alkhayr_99" /note=Original Glimmer call @bp 61560 has strength 1.67; Genemark calls start at 61569 /note=SSC: 61605-61171 CP: yes SCS: both-cs ST: SS BLAST-Start: [hypothetical protein SEA_FAMILTON_101 [Mycobacterium phage Familton] ],,NCBI, q1:s1 100.0% 1.1169E-101 GAP: 75 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.665, -3.83326386801984, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_FAMILTON_101 [Mycobacterium phage Familton] ],,ATW60583,100.0,1.1169E-101 SIF-HHPRED: DUF1764 ; Eukaryotic protein of unknown function (DUF1764),,,PF08576.14,13.1944,20.5 SIF-Syn: Not informative. /note=Glimmer and GeneMark didn't agree on a start site. Glimmer chose 61560, which leaves a gap of over 120. The GeneMark start leaves a gap of over 111. According to Starterator, the start site that has 63 MA is 61605. The coding potential isn't that great; it goes past 0.5 at different places: 61250, 61300, and 61520. There is no known function. In NCBI BLAST, all the results only say hypothetical protein; in HHPRED, Mycobacterium phage Blessica had a 100% alignment and no function was found. DeepTMHMM states that it is inside the membrane with a probability of 1.0. According to Deep TmHmm, it doesn't have a TMD. 
CDS complement (61681 - 62112) /gene="100" /product="gp100" /function="hypothetical protein" /locus tag="alkhayr_100" /note=Original Glimmer call @bp 62112 has strength 11.42; Genemark calls start at 62112 /note=SSC: 62112-61681 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_WINGET_101 [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 1.87211E-101 GAP: 42 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.5052746077145835, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_WINGET_101 [Mycobacterium phage Winget]],,QWY81586,100.0,1.87211E-101 SIF-HHPRED: SIF-Syn: Gene 100 is homologous with Gene 103 of Krilli, neither of which has a function. /note=GeneMark and Glimmer agree on the start site at 62112. The start site also provides the longest ORF and the smallest gap compared to other start sites. According to Starterator, it is the most manually annotated start site, with 22 MAs. The function of the gene is categorized as NKF (no known function), as PhagesDB BLAST, HHPRED, and NCBI BLAST do not agree on any function. PhagesDB BLAST has no known function in the O subcluster, and HHPRED hits have high e-values and disagree on the function. NCBI BLAST does not give an appropriate function, as it only provides hypothetical proteins. DeepTMHMM states that it is inside the membrane with a probability of 1.0. 
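Because several records share one physical line in this export, it can help to pull the record headers out programmatically before checking coordinates. The following is a minimal sketch, assuming every record starts with `CDS`, an optional `complement`, a parenthesised `left - right` pair, and a `/gene="N"` qualifier, as in the records above. The forward-strand form (without `complement`) does not appear in this section and is an assumption, as are the function and field names.

```python
import re

# Matches e.g.: CDS complement (61171 - 61605) /gene="99"
CDS_HEADER = re.compile(
    r'CDS\s+(?P<comp>complement\s+)?'
    r'\((?P<left>\d+)\s*-\s*(?P<right>\d+)\)\s*'
    r'/gene="(?P<gene>\d+)"'
)


def parse_cds_headers(text: str) -> list[dict]:
    """Return gene number, coordinates, and strand for each CDS header found."""
    records = []
    for match in CDS_HEADER.finditer(text):
        records.append({
            "gene": int(match.group("gene")),
            "left": int(match.group("left")),
            "right": int(match.group("right")),
            "strand": "-" if match.group("comp") else "+",
        })
    return records


sample = (
    'CDS complement (61171 - 61605) /gene="99" /product="gp99" '
    'CDS complement (61681 - 62112) /gene="100" /product="gp100"'
)
records = parse_cds_headers(sample)
for record in records:
    print(record)
```

The parsed coordinates can then be fed into whatever length or gap checks are being applied to the annotation.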
CDS complement (62155 - 62445) /gene="101" /product="gp101" /function="hypothetical protein" /locus tag="alkhayr_101" /note=Original Glimmer call @bp 62445 has strength 14.63; Genemark calls start at 62445 /note=SSC: 62445-62155 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_103 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.42456E-63 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.159, -4.490107201936413, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_103 [Mycobacterium phage Catdawg] ],,YP_008409272,100.0,1.42456E-63 SIF-HHPRED: Uncharacterized protein; periplasmic, lipoprotein, UNKNOWN FUNCTION; HET: MSE; 2.104A {Paraburkholderia phytofirmans},,,5T11_H,40.625,83.8 SIF-Syn: Not informative /note=Genemark and Glimmer agree on the start site at 62,445. The start site also provides the longest ORF and the smallest gap compared to other start sites. According to Starterator, it is the most annotated start site, with 22 manual annotations. Coding potential is not maintained throughout the whole gene: there is a dip to 0.8 at 62,300. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST do not agree on any function. Phagesdb BLAST shows no known function in the O subcluster; the HHPRED hits have high e-values and poor probability percentages and disagree on the function. 
NCBI BLAST does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM states that it is inside the membrane with a probability of 1.0. CDS complement (62442 - 62621) /gene="102" /product="gp102" /function="hypothetical protein" /locus tag="alkhayr_102" /note=Original Glimmer call @bp 62621 has strength 16.38; Genemark calls start at 62621 /note=SSC: 62621-62442 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_FAMILTON_104 [Mycobacterium phage Familton] ],,NCBI, q1:s1 100.0% 2.77044E-35 GAP: -8 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.488, -4.197870532213559, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_FAMILTON_104 [Mycobacterium phage Familton] ],,ATW60586,100.0,2.77044E-35 SIF-HHPRED: DUF4416 ; Domain of unknown function (DUF4416),,,PF14385.10,84.7458,77.5 SIF-Syn: Several O cluster phages are syntenic but not informative of a function call. /note=Glimmer and Genemark agree that the gene starts at 62621. This start contains all of the coding potential, with everything at 1.0 and nothing below. Starterator shows 22 MAs for this start site. All three sources agree on the start site. /note=The start site also provides the longest ORF and the smallest gap compared to other start sites. According to Starterator, it is the most annotated start site, with 22 manual annotations. Coding potential is not maintained throughout the whole gene, as there are dips between 62621 and 62444 bp, but it never reaches zero. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST do not agree on any function. Phagesdb BLAST shows no known function in the O subcluster; the HHPRED hits have high e-values and disagree on the function. NCBI BLAST does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM places the protein inside with a probability of 1.0 and predicts no TMD. 
CDS complement (62614 - 62943) /gene="103" /product="gp103" /function="hypothetical protein" /locus tag="alkhayr_103" /note=Original Glimmer call @bp 62943 has strength 7.78; Genemark calls start at 62943 /note=SSC: 62943-62614 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_105 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.10767E-74 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.901, -2.958041784599676, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_105 [Mycobacterium phage Catdawg] ],,YP_008409274,100.0,1.10767E-74 SIF-HHPRED: PF14_0633 protein; AP2 domain, Plasmodium falciparum, specific transcription factor, PROTEIN-DNA COMPLEX, Transcription-DNA COMPLEX; 2.2A {Plasmodium falciparum},,,3IGM_A,60.5505,82.9 SIF-Syn: Gene 103 of Alkhayr is homologous to gene 103 of SchoolBus and gene 107 of Madkillah, even though the start and stop sites are not at the same locations. Genes upstream and downstream are also similar. No evidence from syntenic genes to support function call. /note=Genemark and Glimmer agree on the start site at 62,943. The start site also provides the longest ORF and the smallest overlap compared to other start sites. The z-score, 2.901, is not the best score. The final score, -2.958, is not the best (least negative) score either. However, according to Starterator, it is the most annotated start site, with 22 manual annotations. Genemark shows that coding potential is not maintained throughout the whole gene: it dips to 0.7-0.8 between 62,800 and 62,720. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST show no known function. Phagesdb BLAST and function frequency show known functions in other clusters rather than in the O cluster. 
HHPRED shows very high e-values, poor probability percentages (not over 90), and low coverage of the query to subject; therefore, no quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1); however, it does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM indicates that it is not a membrane protein. CDS complement (62940 - 63347) /gene="104" /product="gp104" /function="hypothetical protein" /locus tag="alkhayr_104" /note=Original Glimmer call @bp 63347 has strength 13.73; Genemark calls start at 63347 /note=SSC: 63347-62940 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_WINGET_105 [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 1.71928E-96 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.204, -4.3978601770686225, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_WINGET_105 [Mycobacterium phage Winget]],,QWY81590,100.0,1.71928E-96 SIF-HHPRED: d.50.3.2 (A:) Hypothetical protein TTHA1913 {Thermus thermophilus [TaxId: 274]} | CLASS: Alpha and beta proteins (a+b), FOLD: dsRBD-like, SUPFAM: YcfA/nrd intein domain, FAM: YcfA-like,,,SCOP_d1whza_,32.5926,71.2 SIF-Syn: Gene 104 of Alkhayr is homologous to gene 104 of SchoolBus and gene 108 of Madkillah, even though the start and stop sites are not at the same locations. Genes upstream and downstream are also similar. Neither gene has a function call. No evidence from syntenic genes to support function call. /note=Genemark and Glimmer agree on the start site at 63,347. The start site also provides the longest ORF, at 408 bp, and the smallest overlap compared to other start sites. The z-score, 2.204, is not the best score. The final score, -4.398, is not the best (least negative) score either. However, according to Starterator, it is the most annotated start site, with 29 manual annotations. 
Genemark shows that coding potential is strong throughout the whole ORF, with the coding capacity at 1. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST show no known function. Phagesdb BLAST shows good e-values, but there is no known function in the O cluster. HHPRED shows very high e-values, poor probability percentages (not over 90), and low coverage of the query to subject; therefore, no quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1) with good e-values; however, it does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM indicates that it is not a membrane protein. CDS complement (63347 - 63754) /gene="105" /product="gp105" /function="hypothetical protein" /locus tag="alkhayr_105" /note=Original Glimmer call @bp 63754 has strength 11.9; Genemark calls start at 63754 /note=SSC: 63754-63347 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_SCHOOLBUS_105 [Mycobacterium phage SchoolBus] ],,NCBI, q1:s1 100.0% 5.57576E-91 GAP: 39 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.065, -3.30700010583914, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_SCHOOLBUS_105 [Mycobacterium phage SchoolBus] ],,AVP42760,100.0,5.57576E-91 SIF-HHPRED: d.58.5.1 (A:1-112) automated matches {Synechococcus elongatus [TaxId: 32046]} | CLASS: Alpha and beta proteins (a+b), FOLD: Ferredoxin-like, SUPFAM: GlnB-like, FAM: Prokaryotic signal transducing protein,,,SCOP_d4affa1,64.4444,88.5 SIF-Syn: Gene 105 of Alkhayr is homologous to gene 109 of Madkillah and gene 105 of SchoolBus, even though the start and stop sites are not at the same locations. Genes upstream and downstream are also similar. Neither gene has a function call. No evidence from syntenic genes to support function call. /note=Genemark and Glimmer agree on the start site at 63,754. 
The start site also provides the longest ORF and the smallest gap compared to other start sites. The z-score, 3.065, is the best score. The final score, -3.307, is not the best (least negative) score. However, according to Starterator, it is the most annotated start site, with 17 manual annotations. Genemark shows that coding potential is not maintained throughout the whole gene: it dips to 0.3-0.8 between 63500 and 63575. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST show no known function. Phagesdb BLAST shows known functions in other clusters rather than in the O cluster. HHPRED shows very high e-values, poor probability percentages (not over 90), and low coverage of the query to subject; therefore, no quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1); however, it does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM indicates that it is not a membrane protein. 
CDS complement (63794 - 64105) /gene="106" /product="gp106" /function="helix-turn-helix DNA binding domain" /locus tag="alkhayr_106" /note=Original Glimmer call @bp 64105 has strength 13.64; Genemark calls start at 64105 /note=SSC: 64105-63794 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_108 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 3.48587E-68 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.577, -3.54831954703195, yes F: helix-turn-helix DNA binding domain SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_108 [Mycobacterium phage Catdawg] ],,YP_008409277,100.0,3.48587E-68 SIF-HHPRED: Transcriptional regulator, MarR family; MarR, HELIX-TURN-HLEIX, TRANSCRIPTIONAL REGULATOR, Structural Genomics, PSI-2, Protein Structure Initiative, Midwest Center for Structural Genomics, MCSG; HET: SO4, GOL, MSE; 2.69A {Silicibacter pomeroyi DSS-3},,,3CDH_A,99.0291,99.4 SIF-Syn: Gene 106 of Alkhayr is homologous to gene 110 of Madkillah and gene 106 of SchoolBus, even though the start and stop sites are not at the same locations. Genes upstream and downstream are also similar. Madkillah and SchoolBus both call helix-turn-helix DNA binding domain protein. /note=Genemark and Glimmer agree on the start site at 64,105. The start site provides the smallest ORF, at 312 bp, but the smallest overlap compared to other start sites. The z-score, 2.577, is the best score. The final score, -3.548, is not the best (least negative) score. However, according to Starterator, it is the most annotated start site, with 17 manual annotations. Genemark shows that coding potential is not maintained throughout the whole ORF, with a dip to about 0.7 at 64,000 and a slight dip at 63,720. The function of the gene is categorized as helix-turn-helix DNA binding domain because Phagesdb BLAST, HHPRED, and NCBI BLAST support the function. 
Phagesdb BLAST shows good e-values for similar phages in the O subcluster that contain the same gene. HHPRED shows very good e-values, great probability percentages (over 90), and high coverage of the query to subject; therefore, a quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1) with good e-values and gives the function of helix-turn-helix DNA binding protein. DeepTMHMM indicates that it is not a membrane protein. CDS complement (64098 - 64367) /gene="107" /product="gp107" /function="hypothetical protein" /locus tag="alkhayr_107" /note=Original Glimmer call @bp 64367 has strength 14.68; Genemark calls start at 64367 /note=SSC: 64367-64098 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_109 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 3.30975E-57 GAP: 88 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.319, -2.0162541296952132, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_109 [Mycobacterium phage Catdawg] ],,YP_008409278,100.0,3.30975E-57 SIF-HHPRED: A2L_zn_ribbon ; A2L zinc ribbon domain,,,PF08792.13,32.5843,97.3 SIF-Syn: Gene 107 of Alkhayr is homologous to gene 111 of Madkillah and gene 107 of SchoolBus, even though the start and stop sites are not at the same locations. Genes upstream and downstream are also similar. No evidence from syntenic genes to support function call. /note=Genemark and Glimmer agree on the start site at 64,367. The start site provides a small ORF, at 270 bp, and a gap of 88 bp compared to other start sites. The z-score, 3.319, is the best score. The final score is -2.016, which is not the best (least negative) score. However, according to Starterator, it is the most annotated start site, with 69 manual annotations. Genemark shows that coding potential is strong throughout the whole ORF, with the coding capacity at 1. 
The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST show no known function. Phagesdb BLAST shows good e-values, but there is no known function in the O cluster, and no data are available for Phagesdb function frequency. HHPRED shows high probability for the best match; however, it shows high e-values and low coverage of the query to subject, so no quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1) with good e-values; however, it does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM indicates that it is not a membrane protein. CDS complement (64456 - 64626) /gene="108" /product="gp108" /function="hypothetical protein" /locus tag="alkhayr_108" /note=Original Glimmer call @bp 64626 has strength 16.16; Genemark calls start at 64626 /note=SSC: 64626-64456 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_109 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 4.87971E-30 GAP: 109 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.005, -5.196246865656795, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_109 [Mycobacterium phage Vorrps] ],,AYQ98946,100.0,4.87971E-30 SIF-HHPRED: TRANSCRIPTION INITIATION FACTOR IIA SUBUNIT 2; TFIID, TFIIA, TRANSCRIPTION, RNA POLYMERASE II, GENERAL TRANSCRIPTION FACTORS, PREINITIATION COMPLEX, CORE PROMOTER, DNA BINDING; HET: SEP, TPO; 8.5A {HOMO SAPIENS},,,5FUR_D,66.0714,80.4 SIF-Syn: Gene 108 of Alkhayr is homologous to gene 112 of Madkillah and gene 108 of SchoolBus, even though the start and stop sites are not at the same locations. Genes upstream and downstream are also similar. No evidence from syntenic genes to support function call. /note=Genemark and Glimmer agree on the start site at 64,626. The start site provides a small ORF, at 171 bp, and a gap of 109 bp compared to other start sites. 
The z-score, 2.005, is not the best score in comparison to the others. The final score is -5.196, which is not the best (least negative) score either. However, according to Starterator, it is the most annotated start site, with 9 manual annotations. Genemark shows that coding potential is strong throughout the ORF, with the coding capacity at 1. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST show no known function. Phagesdb BLAST shows good e-values, but there is no known function in the O cluster, and no data are available for Phagesdb function frequency. HHPRED shows low probability, high e-values, and low coverage of the query to subject; therefore, no quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1) with good e-values; however, it does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM indicates that it is not a membrane protein. CDS complement (64736 - 64930) /gene="109" /product="gp109" /function="hypothetical protein" /locus tag="alkhayr_109" /note=Original Glimmer call @bp 64930 has strength 8.66; Genemark calls start at 64930 /note=SSC: 64930-64736 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_SCHOOLBUS_109 [Mycobacterium phage SchoolBus] ],,NCBI, q1:s1 100.0% 3.33109E-39 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_SCHOOLBUS_109 [Mycobacterium phage SchoolBus] ],,AVP42764,100.0,3.33109E-39 SIF-HHPRED: b.34.6.2 (A:) MazF protein {Escherichia coli [TaxId: 562]} | CLASS: All beta proteins, FOLD: SH3-like barrel, SUPFAM: Cell growth inhibitor/plasmid maintenance toxic component, FAM: Kid/PemK,,,SCOP_d1ub4a_,20.3125,59.5 SIF-Syn: Gene 109 in Alkhayr is homologous with gene 113 in Madkillah and gene 109 in SchoolBus, even though the start and stop sites are not in the same location. 
Genes that are upstream and downstream are also similar. No evidence from syntenic genes to support function call. /note=Genemark and Glimmer agree on the start site at 64,930. The start site also provides the longest ORF and a small overlap compared to other start sites. The z-score, 3.082, is the best score in comparison to the others. The final score is -2.584, which is not the best (least negative) score. However, according to Starterator, it is the most annotated start site, with 18 manual annotations. Genemark shows that coding potential is strong but not throughout the whole ORF; where present, the coding capacity is at 1. The function of the gene is categorized as NKF (no known function) because Phagesdb BLAST, HHPRED, and NCBI BLAST show no known function. Phagesdb BLAST shows good e-values, but there is no known function in the O cluster, and no data are available for Phagesdb function frequency. HHPRED shows low probability, high e-values, and low coverage of the query to subject; therefore, no quality match was found. NCBI BLAST shows full alignment of the unknown gene to a known gene (Q1:S1) with good e-values; however, it does not give an appropriate function, as it returns only hypothetical proteins. DeepTMHMM indicates that it is not a membrane protein. 
CDS complement (64927 - 65259) /gene="110" /product="gp110" /function="membrane protein" /locus tag="alkhayr_110" /note=Original Glimmer call @bp 65250 has strength 11.31; Genemark calls start at 65259 /note=SSC: 65259-64927 CP: yes SCS: both-gm ST: NA BLAST-Start: [hypothetical protein SEA_WINGET_111 [Mycobacterium phage Winget] ],,NCBI, q1:s1 100.0% 8.81387E-73 GAP: 19 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.738, -4.062450304550039, yes F: membrane protein SIF-BLAST: ,,[hypothetical protein SEA_WINGET_111 [Mycobacterium phage Winget] ],,QWY81596,100.0,8.81387E-73 SIF-HHPRED: AflR ; Aflatoxin regulatory protein,,,PF08493.13,29.0909,44.8 SIF-Syn: Gene 110 in Alkhayr is homologous with gene 110 in Firecracker, even though the start and stop sites are not the same. Neither gene has a function call. Genes that are upstream and downstream are also similar. /note=Glimmer and Genemark do not agree on a start site. According to Starterator, 65250 has 8 manual annotations and 65259 has 4, but the latter gives the longest ORF and shortest gap. NCBI BLAST also supports start site 65250 by showing Q1:S1. Start site 65250 provides the third longest ORF, the third smallest gap, and the smallest final score. There is coding potential, but it is not strong throughout the gene: between about 65130 bp and 65005 bp the coding potential drops and then increases. Phagesdb Function Frequency shows evidence for functions, but none in the same cluster as Alkhayr. Phagesdb BLAST shows function unknown in cluster O with an e-value of 2e-62. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 1.37469e-70. DeepTMHMM predicts two TMDs. 
CDS complement (65279 - 65536) /gene="111" /product="gp111" /function="hypothetical protein" /locus tag="alkhayr_111" /note=Original Glimmer call @bp 65536 has strength 11.7; Genemark calls start at 65536 /note=SSC: 65536-65279 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_113 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 7.45903E-53 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.082, -2.970161406017234, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_113 [Mycobacterium phage Catdawg] ],,YP_008409282,100.0,7.45903E-53 SIF-HHPRED: DUF6324 ; Family of unknown function (DUF6324),,,PF19849.2,35.2941,36.6 SIF-Syn: Gene 111 in Alkhayr is homologous with gene 112 in Firecracker, even though the start and stop sites are not the same. Neither gene has a function call. Genes that are upstream and downstream are also similar. /note=Genemark and Glimmer agree on start site 65536. According to Starterator, start site 65536 is the most annotated start site, with 22 manual annotations. NCBI BLAST also supports start site 65536 by showing Q1:S1. Start site 65536 provides the longest ORF, shortest gap, and the smallest final score. There is strong coding potential throughout the whole gene. Phagesdb Function Frequency shows evidence for a function, but not in the same cluster as Alkhayr. Phagesdb BLAST shows function unknown in cluster O with an e-value of 1e-40. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 7.45903e-53. Finally, DeepTMHMM provides no evidence for a function and predicts no TMD. 
CDS complement (65533 - 65754) /gene="112" /product="gp112" /function="hypothetical protein" /locus tag="alkhayr_112" /note=Original Glimmer call @bp 65754 has strength 7.33; Genemark calls start at 65757 /note=SSC: 65754-65533 CP: yes SCS: both-gl ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_114 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 1.1202E-46 GAP: -14 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.498, -3.712290173411778, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_114 [Mycobacterium phage Catdawg] ],,YP_008409283,100.0,1.1202E-46 SIF-HHPRED: Uncharacterized protein; ribosome, protein-protein interaction, Structural Genomics, Structural Genomics Consortium, SGC, UNKNOWN FUNCTION; NMR {Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)},,,5JN6_A,73.9726,90.1 SIF-Syn: Gene 112 in Alkhayr is homologous with gene 113 in Firecracker, even though the start and stop sites are not the same. Neither gene has a function call. Genes that are upstream and downstream are also similar. /note=Genemark and Glimmer do not agree on a start site. According to Starterator, start site 65754 is the most called start site, with 22 manual annotations. Glimmer and Starterator agree on start site 65754. NCBI BLAST also supports start site 65754 by showing Q1:S1. Start site 65754 provides the smallest ORF and the smallest gap, but it does not have the smallest final score. There is coding potential, but it is not strong throughout the gene. Phagesdb Function Frequency shows evidence for functions, but none in the same cluster as Alkhayr. Phagesdb BLAST shows function unknown in cluster O with an e-value of 3e-39. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 1.1202e-46. There is no evidence in DeepTMHMM that supports the presence of a TMD. 
CDS complement (65741 - 66256) /gene="113" /product="gp113" /function="hypothetical protein" /locus tag="alkhayr_113" /note=Original Glimmer call @bp 66256 has strength 10.82; Genemark calls start at 66256 /note=SSC: 66256-65741 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_115 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.77723E-121 GAP: -11 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.984, -2.7863799983944713, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_115 [Mycobacterium phage Catdawg] ],,YP_008409284,100.0,4.77723E-121 SIF-HHPRED: apCC-Tet*; coiled coil, 4-helix bundle, de novo protein design, peptide assembly, DE NOVO PROTEIN; 0.96A {N/A},,,6Q5Q_A,10.5263,27.8 SIF-Syn: Gene 113 in Alkhayr is homologous with gene 114 in Firecracker, even though the start and stop sites are not the same. Neither gene has a function call. Genes that are upstream and downstream are also similar. /note=Genemark and Glimmer agree on start site 66256. According to Starterator, start site 66256 is the most annotated start site, with 22 manual annotations. NCBI BLAST also supports start site 66256 by showing Q1:S1. Start site 66256 provides the longest ORF, the smallest gap, and the smallest final score. There is coding potential, but it is not strong throughout: between about 66050 bp and 65790 bp the coding potential drops and then increases. Phagesdb Function Frequency shows evidence for functions, but none in the same cluster as Alkhayr. Phagesdb BLAST shows function unknown in cluster O with an e-value of 1e-103. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 4.77723e-121. Finally, DeepTMHMM does not predict a TMD. 
CDS complement (66246 - 66503) /gene="114" /product="gp114" /function="hypothetical protein" /locus tag="alkhayr_114" /note= /note=SSC: 66503-66246 CP: yes SCS: neither ST: NI BLAST-Start: [hypothetical protein PBI_CATDAWG_116 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.9308E-54 GAP: 3 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.321, -7.386474746755941, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_116 [Mycobacterium phage Catdawg] ],,YP_008409285,100.0,4.9308E-54 SIF-HHPRED: DUF4503 ; Domain of unknown function (DUF4503),,,PF14951.10,48.2353,91.5 SIF-Syn: Added gene 114 in Alkhayr is not homologous with any gene in Firecracker. Gene 114 in Pham maps does not have a function call. Genes that are upstream and downstream are also similar. /note=Added missing gene. There is coding potential on the Genemark report, and the gene is present in similar cluster O phages. Start site 66503 provides the longest ORF, smallest gap, and the biggest final score. NCBI BLAST also supports start site 66503 by showing Q1:S1. Phagesdb Function Frequency shows evidence for functions, but none in the same cluster as Alkhayr. Phagesdb BLAST shows function unknown in cluster O with an e-value of 2e-49. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 4.9308e-54. Finally, there is no evidence in DeepTMHMM that supports a function. 
CDS complement (66507 - 66773) /gene="115" /product="gp115" /function="hypothetical protein" /locus tag="alkhayr_115" /note=Original Glimmer call @bp 66773 has strength 13.9; Genemark calls start at 66779 /note=SSC: 66773-66507 CP: yes SCS: both-gl ST: SS BLAST-Start: [hypothetical protein KNU03_gp121 [Mycobacterium phage Ryadel] ],,NCBI, q1:s1 100.0% 3.64306E-59 GAP: 2 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.594, -3.513874779476501, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein KNU03_gp121 [Mycobacterium phage Ryadel] ],,YP_010097611,100.0,3.64306E-59 SIF-HHPRED: YjdM_Zn_Ribbon ; PhnA Zinc-Ribbon,,,PF08274.15,22.7273,62.7 SIF-Syn: Gene 115 in Alkhayr is homologous with gene 116 in Firecracker, even though the start and stop sites are not the same. Neither gene has a function call. Genes that are upstream and downstream are also similar. /note=Genemark and Glimmer do not agree on a start site. According to Starterator, start site 66773 has 16 MAs and 66779 has 14 MAs. Start site 66773 has the smallest gap but not the longest ORF. NCBI BLAST also supports start site 66773 by showing Q1:S1. Start site 66773 provides the second longest ORF, the shortest gap, and the third smallest final score. There is strong coding potential, but not throughout the whole gene: between about 66570 bp and 66480 bp the coding potential drops significantly. There is no evidence in Phagesdb Function Frequency that supports a function. Phagesdb BLAST shows function unknown in cluster O with an e-value of 2e-50, which supports the function of NKF. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 3.64306e-59. DeepTMHMM does not predict a TMD. 
CDS complement (66776 - 66931) /gene="116" /product="gp116" /function="hypothetical protein" /locus tag="alkhayr_116" /note= /note=SSC: 66931-66776 CP: yes SCS: neither ST: NI BLAST-Start: [hypothetical protein SEA_KRILI_119 [Mycobacterium phage Krili] ],,NCBI, q1:s1 100.0% 1.07055E-30 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.36, -4.014494766727933, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_KRILI_119 [Mycobacterium phage Krili] ],,QFP97165,100.0,1.07055E-30 SIF-HHPRED: a.160.1.1 (A:215-364) Poly(A) polymerase, PAP, middle domain {Cow (Bos taurus) [TaxId: 9913]} | CLASS: All alpha proteins, FOLD: PAP/OAS1 substrate-binding domain, SUPFAM: PAP/OAS1 substrate-binding domain, FAM: Poly(A) polymerase, PAP, middle domain,,,SCOP_d1q79a1,72.549,85.5 SIF-Syn: Added gene 116 in Alkhayr is not homologous with any gene in Firecracker. Gene 116 in Pham maps does not have a function call. Genes that are upstream and downstream are also similar. /note=Added missing gene. There is coding potential on the Genemark report, and the gene is present in similar cluster O phages. Start site 66931 provides the longest ORF, the second shortest gap, and the smallest final score. Phagesdb Function Frequency does not give data that support any function. Phagesdb BLAST shows function unknown in cluster O with an e-value of 5e-28, which supports the function of NKF. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 1.07055e-30. DeepTMHMM does not predict a TMD. 
CDS complement (66928 - 67029) /gene="117" /product="gp117" /function="hypothetical protein" /locus tag="alkhayr_117" /note=Genemark calls start at 67029 /note=SSC: 67029-66928 CP: yes SCS: genemark ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_119 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.96671E-12 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.929, -4.905314844093283, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_119 [Mycobacterium phage Catdawg] ],,YP_008409288,100.0,4.96671E-12 SIF-HHPRED: Conotoxin ; Conotoxin,,,PF02950.20,30.303,50.3 SIF-Syn: Gene 117 in Alkhayr is homologous with gene 118 in Firecracker, even though the start and stop sites are not the same. Neither gene has a function call. Genes that are upstream and downstream are also similar. /note=Genemark and Starterator agree on start site 67029. According to Starterator, start site 67029 is the most annotated start site, with 22 manual annotations. NCBI BLAST also supports start site 67029 by showing Q1:S1. Start site 67029 provides the longest ORF, the smallest gap, and the smallest final score. Coding potential is strong throughout the whole gene. There is no evidence in Phagesdb Function Frequency that supports a function. Phagesdb BLAST shows function unknown in cluster O with an e-value of 5e-12, which supports the function of NKF. There is also no evidence in HHPRED to support any function. NCBI BLAST returns a hypothetical protein with 100% identity, alignment, and coverage and an e-value of 4.96671e-12. TMHMM does not predict a TMD. 
CDS complement (67026 - 67223) /gene="118" /product="gp118" /function="hypothetical protein" /locus tag="alkhayr_118" /note=Original Glimmer call @bp 67223 has strength 11.34; Genemark calls start at 67223 /note=SSC: 67223-67026 CP: yes SCS: both ST: NA BLAST-Start: [hypothetical protein SEA_SMOOCH_117 [Mycobacterium phage Smooch]],,NCBI, q1:s2 100.0% 2.11393E-37 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_SMOOCH_117 [Mycobacterium phage Smooch]],,QFP96608,98.4848,2.11393E-37 SIF-HHPRED: Nanovirus_C8 ; Nanovirus component 8 (C8) protein,,,PF05629.14,38.4615,51.4 SIF-Syn: not informative /note=67223 has 10 MAs in Starterator, coding potential all inclusive. No evidence supporting function call. Not a TMP. CDS complement (67223 - 67402) /gene="119" /product="gp119" /function="hypothetical protein" /locus tag="alkhayr_119" /note=Original Glimmer call @bp 67402 has strength 15.64; Genemark calls start at 67402 /note=SSC: 67402-67223 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_121 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 4.53379E-35 GAP: 45 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.489, -5.796882727134779, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_121 [Mycobacterium phage Catdawg] ],,YP_008409290,100.0,4.53379E-35 SIF-HHPRED: SIF-Syn: Blessica_119 with similar genes upstream and downstream /note=Glimmer and GeneMark agree on start site 67402 with stop site at 67223. 67402 is also the most manually annotated start site in Starterator with 22 MAs. This provides gene 119 with the longest ORF and shortest gap. Phagesdb BLAST provides NKF with e-value 2e-28 in O cluster. NCBI BLAST supports NKF with 100% identity, alignment, and coverage. DeepTMHMM supports protein is inside cell membrane. 
CDS complement (67448 - 67753) /gene="120" /product="gp120" /function="hypothetical protein" /locus tag="alkhayr_120" /note=Original Glimmer call @bp 67753 has strength 13.72; Genemark calls start at 67783 /note=SSC: 67753-67448 CP: yes SCS: both-gl ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_121 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 1.30467E-65 GAP: 103 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.005, -4.669046746593814, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_121 [Mycobacterium phage Vorrps] ],,AYQ98958,99.0099,1.30467E-65 SIF-HHPRED: SIF-Syn: not informative. /note=no evidence for function call. Not a TMP. CDS complement (67857 - 68144) /gene="121" /product="gp121" /function="hypothetical protein" /locus tag="alkhayr_121" /note=Original Glimmer call @bp 68144 has strength 11.7; Genemark calls start at 68144 /note=SSC: 68144-67857 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_123 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 5.80399E-62 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.665, -3.4470622626190464, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_123 [Mycobacterium phage Catdawg] ],,YP_008409292,100.0,5.80399E-62 SIF-HHPRED: SIF-Syn: not informative /note=None of the sources of evidence for function supports a function call. Not a TMP. 
CDS complement (68141 - 68284) /gene="122" /product="gp122" /function="hypothetical protein" /locus tag="alkhayr_122" /note=Original Glimmer call @bp 68284 has strength 10.79; Genemark calls start at 68317 /note=SSC: 68284-68141 CP: no SCS: both-gl ST: SS BLAST-Start: [hypothetical protein PBI_CATDAWG_124 [Mycobacterium phage Catdawg] ],,NCBI, q1:s1 100.0% 6.49099E-22 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.082, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein PBI_CATDAWG_124 [Mycobacterium phage Catdawg] ],,YP_008409293,97.8723,6.49099E-22 SIF-HHPRED: SIF-Syn: Not informative. /note=No evidence supporting a function call. Not a TMP. CDS complement (68281 - 68436) /gene="123" /product="gp123" /function="hypothetical protein" /locus tag="alkhayr_123" /note=Original Glimmer call @bp 68436 has strength 5.59 /note=SSC: 68436-68281 CP: yes SCS: glimmer ST: SS BLAST-Start: [hypothetical protein SEA_KRILI_126 [Mycobacterium phage Krili]],,NCBI, q1:s36 100.0% 7.98098E-29 GAP: 118 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.924, -4.835658240213919, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_KRILI_126 [Mycobacterium phage Krili]],,QFP97172,59.3023,7.98098E-29 SIF-HHPRED: SIF-Syn: not informative. /note=Glimmer and Starterator support start at 68436 with 16 MAs. Phagesdb BLAST supports NKF within O cluster. No evidence to support a function call. Not a TMP. 
CDS complement (68555 - 68791) /gene="124" /product="gp124" /function="hypothetical protein" /locus tag="alkhayr_124" /note=Original Glimmer call @bp 68791 has strength 16.2; Genemark calls start at 68731 /note=SSC: 68791-68555 CP: no SCS: both-gl ST: NI BLAST-Start: [hypothetical protein SEA_WINGET_125 [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 2.47011E-47 GAP: 124 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.161, -5.2540491229691355, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_WINGET_125 [Mycobacterium phage Winget]],,QWY81610,100.0,2.47011E-47 SIF-HHPRED: SIF-Syn: CDS complement (68916 - 69386) /gene="125" /product="gp125" /function="hypothetical protein" /locus tag="alkhayr_125" /note= /note=SSC: 69386-68916 CP: no SCS: neither ST: NA BLAST-Start: [hypothetical protein SEA_WINGET_126 [Mycobacterium phage Winget]],,NCBI, q1:s1 100.0% 3.64279E-103 GAP: 567 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.048, -6.786414265900529, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_WINGET_126 [Mycobacterium phage Winget]],,QWY81611,100.0,3.64279E-103 SIF-HHPRED: SIF-Syn: not informative. /note=Added missing gene. There is no coding potential on genemark report, gene is present in similar cluster O phages, MadKillah, Winget, with unknown function. NCBI Blast: 100%, 100% coverage, Q1:S1, e value: 3.64279e-103. No evidence to support a function call. Not a TMP. 
CDS 69954 - 70676 /gene="126" /product="gp126" /function="hypothetical protein" /locus tag="alkhayr_126" /note=Original Glimmer call @bp 69954 has strength 13.86; Genemark calls start at 69954 /note=SSC: 69954-70676 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_VORRPS_127 [Mycobacterium phage Vorrps] ],,NCBI, q1:s1 100.0% 7.74112E-170 GAP: 567 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.924, -4.976656753876107, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_VORRPS_127 [Mycobacterium phage Vorrps] ],,AYQ98964,100.0,7.74112E-170 SIF-HHPRED: SIF-Syn: not informative. /note=No supporting evidence for function call. Not a TMP.