CDS complement (1 - 399) /gene="1" /product="gp1" /function="hypothetical protein" /locus tag="Casino_1" /note=Original Glimmer call @bp 399 has strength 14.1; Genemark calls start at 399 /note=SSC: 399-1 CP: no SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_MORRILL_1 [Microbacterium phage Morrill]],,NCBI, q1:s51 92.4242% 1.45665E-28 GAP: 189 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.923, -4.64618386363431, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_MORRILL_1 [Microbacterium phage Morrill]],,UQT01686,47.0588,1.45665E-28 SIF-HHPRED: DSHCT ; DSHCT (NUC185) domain,,,PF08148.16,27.2727,92.0 SIF-Syn: Synteny with other EM1 phages /note=No function was supported by e-value, percentage, or coverage in HHpred, NCBI BLAST, or PhagesDB BLAST. CDS complement (589 - 1263) /gene="2" /product="gp2" /function="DNA primase/polymerase" /locus tag="Casino_2" /note=Original Glimmer call @bp 1263 has strength 14.11; Genemark calls start at 1338 /note=SSC: 1263-589 CP: no SCS: both-gl ST: NI BLAST-Start: [DNA primase/polymerase [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 1.18992E-164 GAP: 13 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.928, -5.465303019845003, no F: DNA primase/polymerase SIF-BLAST: ,,[DNA primase/polymerase [Microbacterium phage Hannabella]],,UVG34210,100.0,1.18992E-164 SIF-HHPRED: ORF904; primase, polymerase, Replication; 1.85A {Sulfolobus islandicus},,,3M1M_A,45.9821,98.4 SIF-Syn: There is synteny between Casino gene 2 and Gshelby gene 2. The two genes align and share the same function, DNA primase/polymerase. /note=The function DNA primase/polymerase is supported because PhagesDB BLAST matches Gshelby with an E-value of 1e-127, along with several other phages. This E-value is within the guiding principles. NCBI BLAST also shows a strong e-value of 1.18992e-164.
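Each record carries a short-code summary (SSC, CP, SCS, ST, GAP) inside its second /note. The following is a minimal sketch of how those fields could be pulled out of one record with Python's standard `re` module. The pattern names `SUMMARY` and `GAP` are hypothetical and only reflect the layout visible in the records here; they are not part of any annotation tool.

```python
import re

# Hypothetical patterns for the short-code summary in each record;
# the field labels (SSC, CP, SCS, ST, GAP) are copied from the record text.
SUMMARY = re.compile(
    r"SSC:\s*(?P<start>\d+)-(?P<stop>\d+)\s+"
    r"CP:\s*(?P<cp>\w+)\s+"
    r"SCS:\s*(?P<scs>[\w-]+)\s+"
    r"ST:\s*(?P<st>\w+)"
)
GAP = re.compile(r"GAP:\s*(?P<gap>-?\d+) bp gap")

# Excerpt of the gene 2 summary note.
note = (
    "SSC: 1263-589 CP: no SCS: both-gl ST: NI BLAST-Start: "
    "[DNA primase/polymerase [Microbacterium phage Hannabella]],,NCBI, "
    "q1:s1 100.0% 1.18992E-164 GAP: 13 bp gap LO: no"
)

fields = SUMMARY.search(note).groupdict()
fields["gap"] = int(GAP.search(note).group("gap"))
print(fields)
```

For a complement gene the SSC start is the larger coordinate, so `start` and `stop` are kept as written rather than sorted.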
CDS complement (1277 - 2821) /gene="3" /product="gp3" /function="DNA helicase" /locus tag="Casino_3" /note=Original Glimmer call @bp 2821 has strength 11.77; Genemark calls start at 2821 /note=SSC: 2821-1277 CP: yes SCS: both ST: SS BLAST-Start: [DNA helicase [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: 52 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.033, -2.356391881054909, yes F: DNA helicase SIF-BLAST: ,,[DNA helicase [Microbacterium phage Hannabella]],,UVG34211,100.0,0.0 SIF-HHPRED: Primase D5; DNA helicase, D5_N domain, DUF5906 domain, Pox_D5 domain, SF3 helicase, VIRAL PROTEIN;{Vaccinia virus Copenhagen},,,8APM_B,79.9611,100.0 SIF-Syn: There are relatively high levels of synteny among Casino and the other phages in its subcluster, ranging from 79% to 99%. /note=start: Glimmer and GeneMark agree on the start at 2821 bp, which captures all the coding potential; the spacer and gap fall in line with the guiding principles for a reverse gene, and the z-score and final score are decent compared with the other candidate starts. /note=function: PhagesDB shows this function in five of eight phages in the EM1 subcluster (the subcluster that Casino is in), HHPRED shows good e-values and probability scores, and NCBI BLAST also has good e-values. CDS complement (2874 - 3062) /gene="4" /product="gp4" /function="hypothetical protein" /locus tag="Casino_4" /note=Original Glimmer call @bp 3017 has strength 19.93; Genemark calls start at 3017 /note=SSC: 3062-2874 CP: yes SCS: both-cs ST: SS BLAST-Start: [hypothetical protein HWD16_gp04 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 1.72464E-36 GAP: 2 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.392, -4.747299625592391, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp04 [Microbacterium phage Arete] ],,YP_009857387,100.0,1.72464E-36 SIF-HHPRED: SIF-Syn: There are relatively high synteny scores among Casino and the other phages in its subcluster, ranging from 79% to 99%.
/note=3017 barely catches all of the coding potential; phages with a higher match to gene structure have the start at position 3062 /note=start: glimmer and genemark both suggest that the start is at 3017 bp, captures all the coding potential, gap and spacer match the guiding principles for a reverse gene, z-score and final scores are decent and align with the principles /note=function: PhagesDB shows phages relative to Casino that have no known function at this gene, NCBI BLAST shows good e-value scores for no known function CDS complement (3065 - 4297) /gene="5" /product="gp5" /function="DNA helicase" /locus tag="Casino_5" /note=Original Glimmer call @bp 4297 has strength 17.6; Genemark calls start at 4294 /note=SSC: 4297-3065 CP: yes SCS: both-gl ST: NI BLAST-Start: [DNA helicase [Microbacterium phage Hannabella]],,NCBI, q2:s1 99.7561% 0.0 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.994, -5.148613652008008, no F: DNA helicase SIF-BLAST: ,,[DNA helicase [Microbacterium phage Hannabella]],,UVG34213,100.0,0.0 SIF-HHPRED: ATP-dependent RNA helicase DbpA; DEAD-box helicase, RNA, ribosome biogenesis, ATPase, RNA BINDING PROTEIN; HET: BEF, POP, ADP; 2.5A {Escherichia coli (strain K12)},,,7PLI_F,91.7073,100.0 SIF-Syn: There was a decent amount of Synteny of 79% however it could better as it`s still in the 70`s range but closer to 80. /note=Start changed to site 4297 because longer ORF and slightly better scores. /note= /note=Start Determination: /note=Out of two potential start candidates, start at 4294 bp follows guiding principle stating to choose second candidate. This start has a small gap of -1, which is within the overlap bounds of 7 bp. It has a e value of 1.994, which is close to 2. The final score of -5.806 is not the highest, but is similar to other candidates. Encompasses all of gene mark coding potential spike. Starterator states it is only present in 12.3% of members in the Pham. /note= /note=Function: HHPRED’s top hits mention DNA binding. 
PhagesDB lists DNA helicase as the function for a majority of phages in the Pham, with e-values smaller than 10^-7. NCBI BLAST lists DNA helicase as the function for a majority of the top 10 hits, with small e-values and coverages over 80%. CDS complement (4294 - 4980) /gene="6" /product="gp6" /function="hypothetical protein" /locus tag="Casino_6" /note=Original Glimmer call @bp 4980 has strength 16.38; Genemark calls start at 4980 /note=SSC: 4980-4294 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_6 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 3.61905E-167 GAP: 93 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.955, -2.583959800616441, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_6 [Microbacterium phage Hannabella]],,UVG34214,100.0,3.61905E-167 SIF-HHPRED: DUF6378 ; Domain of unknown function (DUF6378),,,PF19905.3,34.6491,99.9 SIF-Syn: Synteny with other EM1 phages /note=High similarity to DUF6378 on HHPRED /note= /note=It has no known function because the corresponding gene 6 in the other phages also has no known function.
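The GAP values reported for these reverse-strand genes are consistent with a simple coordinate difference between a gene and its neighbor at the higher coordinate. Below is a minimal sketch in Python, assuming inclusive 1-based coordinates. The function name `gap_bp` is hypothetical and is not taken from any annotation tool.

```python
def gap_bp(left_gene_max: int, right_gene_min: int) -> int:
    """Bases between two neighboring genes; a negative value means overlap.

    left_gene_max:  largest coordinate of the gene on the left
    right_gene_min: smallest coordinate of the gene on the right
    """
    return right_gene_min - left_gene_max - 1


# Coordinates taken from the gene 4, 5, and 6 records.
gap_gene4 = gap_bp(3062, 3065)         # gene 4 (2874-3062) vs. gene 5 (3065-4297)
gap_gene5_called = gap_bp(4297, 4294)  # gene 5 start 4297 vs. gene 6 (4294-4980)
gap_gene5_alt = gap_bp(4294, 4294)     # candidate gene 5 start at 4294
print(gap_gene4, gap_gene5_called, gap_gene5_alt)
```

These values match the 2 bp gap listed for gene 4, the -4 bp gap listed for the called gene 5 start at 4297, and the gap of -1 quoted for the 4294 candidate in the start determination note.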
CDS complement (5074 - 5235) /gene="7" /product="gp7" /function="ribbon-helix-helix DNA binding domain" /locus tag="Casino_7" /note=Original Glimmer call @bp 5247 has strength 11.95; Genemark calls start at 5247 /note=SSC: 5235-5074 CP: yes SCS: both-cs ST: SS BLAST-Start: [ribbon-helix-helix DNA binding domain protein [Microbacterium phage Araxxi] ],,NCBI, q1:s1 100.0% 5.7983E-28 GAP: -4 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.031, -6.02261024484599, no F: ribbon-helix-helix DNA binding domain SIF-BLAST: ,,[ribbon-helix-helix DNA binding domain protein [Microbacterium phage Araxxi] ],,YP_009850665,100.0,5.7983E-28 SIF-HHPRED: Putative; Helicobacter pylori, repressor, transcriptional regulator, DNA-binding, ribbon-helix-helix, HP0564, JHP0511, UNKNOWN FUNCTION, GENE REGULATION; NMR {Helicobacter pylori},,,2K1O_A,81.1321,99.1 SIF-Syn: Synteny exists with gene position 8 in Araxxi, DoTi, Hannabella, and Burro, and gene position 7 in Gshelby23. Arete gene 9 also shares synteny with Casino gene 7. /note=Support for this function exists within PhagesDB, HHpred, and NCBI BLAST, all with e-values no higher than 1.3e-8.
CDS complement (5232 - 5711) /gene="8" /product="gp8" /function="RuvC-like resolvase" /locus tag="Casino_8" /note=Original Glimmer call @bp 5711 has strength 10.02; Genemark calls start at 5561 /note=SSC: 5711-5232 CP: yes SCS: both-gl ST: SS BLAST-Start: [RuvC-like resolvase [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 3.96039E-111 GAP: 64 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.184, -2.0949393225970705, yes F: RuvC-like resolvase SIF-BLAST: ,,[RuvC-like resolvase [Microbacterium phage Hannabella]],,UVG34216,100.0,3.96039E-111 SIF-HHPRED: c.55.3.6 (A:) automated matches {Pseudomonas aeruginosa [TaxId: 287]} | CLASS: Alpha and beta proteins (a/b), FOLD: Ribonuclease H-like motif, SUPFAM: Ribonuclease H-like, FAM: RuvC resolvase,,,SCOP_d6lw3a_,82.3899,98.9 SIF-Syn: Synteny with other EM1 phages /note=It was the only available option as the other options had no known function, so there was basically only one option with a function to choose. /note= /note=I agree with the called function and start site. I added evidence in NCBI and PhagesDB BLAST(Hannabella, Fullmetal, Xitlalli, Birdfeeder, LesNorah) for the called function, and I also added that the start site contains all GM capacity. Lastly, I added that Starterator contains the suggested start. 
CDS complement (5776 - 7728) /gene="9" /product="gp9" /function="DNA polymerase I" /locus tag="Casino_9" /note=Original Glimmer call @bp 7956 has strength 13.98; Genemark calls start at 7905 /note=SSC: 7728-5776 CP: no SCS: both-cs ST: SS BLAST-Start: [DNA polymerase I [Microbacterium phage Hannabella]],,NCBI, q1:s77 100.0% 0.0 GAP: 224 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.804, -3.8666016581442175, no F: DNA polymerase I SIF-BLAST: ,,[DNA polymerase I [Microbacterium phage Hannabella]],,UVG34217,87.5862,0.0 SIF-HHPRED: DNA polymerase nu; Pol Nu, Polymerase, error-prone DNA synthesis, TRANSFERASE-DNA complex; HET: MES; 2.95A {Homo sapiens},,,4XVK_A,99.5385,100.0 SIF-Syn: There is synteny for Casino gene 9 with both Gshelby23 and Hannabella gene 9. /note=PhageDB BLAST shows similar gene with the same function, supported by e-values of 0. /note=HHPRED supports this function with e-values of 0. /note= /note=NCBI BLAST also supports this with e-values of 0. CDS complement (7953 - 8891) /gene="10" /product="gp10" /function="Cas4 exonuclease" /locus tag="Casino_10" /note=Original Glimmer call @bp 8891 has strength 15.45; Genemark calls start at 8783 /note=SSC: 8891-7953 CP: yes SCS: both-gl ST: SS BLAST-Start: [Cas4 family exonuclease [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: -26 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.12, -4.367477548074901, no F: Cas4 exonuclease SIF-BLAST: ,,[Cas4 family exonuclease [Microbacterium phage Hannabella]],,UVG34218,96.8051,0.0 SIF-HHPRED: Cas4_I-A; CRISPR/Cas system-associated protein Cas4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA.,,,cd09659,93.2692,99.9 SIF-Syn: There is synteny in gene 10 of Casino with gene 10 Gshelby. 
/note=PhagesDB BLAST: similar genes show the same function supported with an e-value of 1e-177 /note=HHPRED: the function is supported but the e-value is 3.9e-19 /note= /note=NCBI BLAST supports Cas4 function with e-value of 0. CDS complement (8866 - 9600) /gene="11" /product="gp11" /function="hypothetical protein" /locus tag="Casino_11" /note=Original Glimmer call @bp 9600 has strength 15.79; Genemark calls start at 9600 /note=SSC: 9600-8866 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_GSHELBY23_11 [Microbacterium phage Gshelby23] ],,NCBI, q1:s1 100.0% 0.0 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.684, -5.238109312333945, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_GSHELBY23_11 [Microbacterium phage Gshelby23] ],,URM86408,99.5902,0.0 SIF-HHPRED: DNA-directed RNA polymerase I subunit RPA12; TRANSCRIPTION;{Schizosaccharomyces pombe (strain 972 / ATCC 24843)},,,7AOC_I,32.7869,92.7 SIF-Syn: Synteny with other EM1 phages /note=PhagesDB BLAST shows no known function, while NCBI BLAST and HHPRED list “hypothetical protein” and DNA-directed RNA polymerase, respectively. Because the information conflicts, we list the function as unknown. CDS complement (9597 - 9740) /gene="12" /product="gp12" /function="hypothetical protein" /locus tag="Casino_12" /note=Original Glimmer call @bp 9740 has strength 12.86; Genemark calls start at 9740 /note=SSC: 9740-9597 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWB89_gp13 [Microbacterium phage Burro] ],,NCBI, q1:s1 100.0% 8.04595E-21 GAP: 65 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.78, -2.8793565916978188, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWB89_gp13 [Microbacterium phage Burro] ],,YP_009841632,89.3617,8.04595E-21 SIF-HHPRED: MvaT_DBD; DNA-binding domain of the bacterial xenogeneic silencer MvaT.
MvaT is a xenogeneic silencer conserved in Pseudomonas which assists in distinguishing foreign from self DNA.,,,cd16170,68.0851,90.1 SIF-Syn: In Phamerator, there is synteny between Gshelby and Casino. /note=PhagesDB BLAST supports an unknown function with e-values of 2e-23. /note=HHPRED and NCBI BLAST do not provide conclusive information for function determination. CDS complement (9806 - 10495) /gene="13" /product="gp13" /function="hypothetical protein" /locus tag="Casino_13" /note=Original Glimmer call @bp 10495 has strength 19.86; Genemark calls start at 10495 /note=SSC: 10495-9806 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp15 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 1.15566E-163 GAP: 111 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.955, -2.523003374675015, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp15 [Microbacterium phage Arete] ],,YP_009857398,99.5633,1.15566E-163 SIF-HHPRED: Phage-related protein DUF2815; structural genomics, phage-related protein, DUF2815, pfam10991, PSI-Biology, Midwest Center for Structural Genomics, MCSG, UNKNOWN FUNCTION; HET: MSE; 1.93A {Enterococcus faecalis},,,4KLK_A,59.3886,93.8 SIF-Syn: Gene 13 of Casino shows synteny with gene 13 of Gshelby23 and gene 14 of Hannabella. The functions of the two genes showing synteny are unknown, which supports the decision made for function. /note=In PhagesDB BLAST, all but one of the genes in our cluster have an unknown function, with a best E-value of 1e-126. HHPRED gave a wider variety of results, but the E-values were far too high to provide strong evidence for any conclusion. NCBI BLAST only lists the function as Possible Function, which means that the data is inconclusive, and the Conserved Domain Database also lists unknown function as its result, though with a high E-value. TOPCONS has no data listed.
CDS complement (10607 - 11569) /gene="14" /product="gp14" /function="hypothetical protein" /locus tag="Casino_14" /note=Original Glimmer call @bp 11569 has strength 14.21; Genemark calls start at 11569 /note=SSC: 11569-10607 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_GSHELBY23_14 [Microbacterium phage Gshelby23]],,NCBI, q1:s1 100.0% 0.0 GAP: -20 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.955, -2.442961286954254, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_GSHELBY23_14 [Microbacterium phage Gshelby23]],,URM86411,100.0,0.0 SIF-HHPRED: DUF2828 ; Domain of unknown function (DUF2828),,,PF11443.12,55.9375,99.2 SIF-Syn: Gene 14 in Casino is Identical to Gene 14 in Gshelby23 and Gene 15 in Hannabella. Neither of these phages have a function for this gene /note=similarity to DUF2828 on HHPRED /note= /note=In HHPRED the only hit with an E-value less than 1e-7 is an unknown function, the E-value is 9.4e-10. In NCBI BLAST all hits in the EM1 cluster were hypothetical proteins with an E-value of 0. Most of the closely related phages did not have a function for this gene. 
CDS complement (11550 - 12506) /gene="15" /product="gp15" /function="AAA-ATPase" /locus tag="Casino_15" /note=Original Glimmer call @bp 12746 has strength 18.49; Genemark calls start at 12506 /note=SSC: 12506-11550 CP: yes SCS: both-gm ST: NI BLAST-Start: [AAA-ATPase [Microbacterium phage Arete] ],,NCBI, q1:s84 100.0% 0.0 GAP: 296 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.089, -2.156361006712914, no F: AAA-ATPase SIF-BLAST: ,,[AAA-ATPase [Microbacterium phage Arete] ],,YP_009857400,79.3017,0.0 SIF-HHPRED: c.37.1.0 (A:23-259) automated matches {Pseudomonas aeruginosa, PA01 [TaxId: 208964]} | CLASS: Alpha and beta proteins (a/b), FOLD: P-loop containing nucleoside triphosphate hydrolases, SUPFAM: P-loop containing nucleoside triphosphate hydrolases, FAM: automated matches,,,SCOP_d6blba1,76.7296,99.6 SIF-Syn: Gene 15 of Casino has synteny with gene 15 of Gshelby23 and gene 16 of Hannabella. The function of both genes is AAA-ATPase, which supports the chosen function. /note=Although function unknown had a good e-value of 1e-179 under Phagesdb BLAST, AAA-ATPase had the same e-value. This e-value is good because it is smaller than 10^-7, which is the largest e-value this class accepts when choosing a function. Under NCBI BLAST, AAA-ATPase had an e-value of 0, which is the best e-value. There were other functions, such as nitrogen assimilation regulatory protein, listed under HHPRED, but their e-values were not as good as those for AAA-ATPase because the best one was 5e-12.
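The gene 15 note compares e-values against the class cutoff of 10^-7, where smaller is better and 0 is best. Here is a minimal sketch of that comparison in Python. The names `EVALUE_CUTOFF` and `supports_function` are hypothetical, and the cutoff is the class rule quoted in the notes, not a universal standard.

```python
# Class cutoff quoted in the notes; hits at or below it count as support.
EVALUE_CUTOFF = 1e-7


def supports_function(evalue: float, cutoff: float = EVALUE_CUTOFF) -> bool:
    """Return True when an e-value is small enough to support a function call."""
    return evalue <= cutoff


# E-values quoted in the gene 15 note.
hits = {
    "PhagesDB BLAST, AAA-ATPase": 1e-179,
    "NCBI BLAST, AAA-ATPase": 0.0,
    "HHPRED, best alternative function": 5e-12,
}
supported = {name: supports_function(e) for name, e in hits.items()}

# Passing the cutoff is not the same as being the best hit:
# rank the candidates from smallest to largest e-value.
ranked = sorted(hits, key=hits.get)
print(supported)
print(ranked)
```

All three hits pass the cutoff, so the ranking is what separates them: the AAA-ATPase BLAST hits sit well ahead of the best alternative HHPRED hit at 5e-12.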
CDS complement (12803 - 13276) /gene="16" /product="gp16" /function="hypothetical protein" /locus tag="Casino_16" /note=Original Glimmer call @bp 13276 has strength 14.96; Genemark calls start at 13276 /note=SSC: 13276-12803 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWB89_gp17 [Microbacterium phage Burro] ],,NCBI, q3:s2 94.2675% 3.09622E-48 GAP: 288 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.86, -2.707694805492614, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWB89_gp17 [Microbacterium phage Burro] ],,YP_009841636,66.6667,3.09622E-48 SIF-HHPRED: De novo designed protein H3mb; de novo design, mini protein, HA stem binder, VIRAL PROTEIN, VIRAL PROTEIN-DE NOVO PROTEIN complex; HET: BMA, NAG; 2.75A {Influenza A virus (strain A/Hong Kong/1/1968 H3N2)},,,7RDH_G,14.6497,20.4 SIF-Syn: Gene 16 in Casino is identical to Gene 14 in Hannabella and DoTi. neither of these phages has a function for this gene. /note=In BLAST all of the Phages in the EM1 subcluster do not have a function with E-values that are 2e-41 or less. In HHPRED all the hits have extremely high E-values, the lowest is 130. In NCBI BLAST the only hits for the EM1 subcluster were hypothetical proteins with E-values below 3.09622e-48. All of the closely related phages do not have a known function for this gene. 
CDS complement (13565 - 13741) /gene="17" /product="gp17" /function="hypothetical protein" /locus tag="Casino_17" /note=Original Glimmer call @bp 13741 has strength 5.84; Genemark calls start at 13741 /note=SSC: 13741-13565 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp19 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 1.10937E-33 GAP: 33 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.184, -2.0949393225970705, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp19 [Microbacterium phage Arete] ],,YP_009857402,100.0,1.10937E-33 SIF-HHPRED: zf-CCHC_3 ; Zinc knuckle,,,PF13917.10,17.2414,31.4 SIF-Syn: Synteny with other EM1 phages /note=All other phages have no known function. They have large e-values making them unreliable, no other evidence supporting other function. CDS complement (13775 - 14425) /gene="18" /product="gp18" /function="hypothetical protein" /locus tag="Casino_18" /note=Original Glimmer call @bp 14425 has strength 12.85; Genemark calls start at 14425 /note=SSC: 14425-13775 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_19 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 5.49865E-153 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.541, -3.4685663819143713, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_19 [Microbacterium phage Hannabella]],,UVG34226,98.6111,5.49865E-153 SIF-HHPRED: DUF732 ; Protein of unknown function (DUF732),,,PF05305.18,32.8704,75.6 SIF-Syn: Synteny with other EM1 phages /note=All related phages in cluster EM1 have no determined function. This is supported by the large E-values, which range from 1e-10 to 9e-91. The only suggested function for this gene (endolysin) from PhageDB BLAST comes from another cluster (AO2) and has an E-value of 4.1 which is very poor. The guiding principles say that the highest accepted E-value is 1e-5 which makes this E-value too high. 
CDS complement (14422 - 15219) /gene="19" /product="gp19" /function="hypothetical protein" /locus tag="Casino_19" /note=Original Glimmer call @bp 15219 has strength 17.86; Genemark calls start at 15219 /note=SSC: 15219-14422 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_GSHELBY23_19 [Microbacterium phage Gshelby23] ],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.955, -4.049342652064859, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_GSHELBY23_19 [Microbacterium phage Gshelby23] ],,URM86415,99.6226,0.0 SIF-HHPRED: d.218.1.11 (A:2-157) Hypothetical protein TM1012 {Thermotoga maritima [TaxId: 2336]} | CLASS: Alpha and beta proteins (a+b), FOLD: Nucleotidyltransferase, SUPFAM: Nucleotidyltransferase, FAM: TM1012-like,,,SCOP_d2fcla1,38.1132,93.5 SIF-Syn: Synteny with other EM1 phages /note=Start is supported by data from Starterator, PhagesDB, GeneMark, and PECAAN. /note=Includes all GeneMark coding capacity. /note=There is not enough evidence to support any specific function for this gene. CDS complement (15219 - 15599) /gene="20" /product="gp20" /function="hypothetical protein" /locus tag="Casino_20" /note=Original Glimmer call @bp 15599 has strength 15.54; Genemark calls start at 15599 /note=SSC: 15599-15219 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_21 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 2.86218E-80 GAP: 698 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.955, -2.442961286954254, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_21 [Microbacterium phage Hannabella]],,UVG34228,100.0,2.86218E-80 SIF-HHPRED: Pre-mRNA-processing factor 19; Spliceosome, U2/U5/U6, Lariat, RNA BINDING PROTEIN-RNA complex; HET: GDP, ADP; 3.6A {Schizosaccharomyces pombe 972h-} SCOP: h.1.40.1, g.44.1.2,,,3JB9_T,53.1746,92.7 SIF-Syn: Synteny with other EM1 phages /note=All other phages have no known function. 
They have large e-values making them unreliable, no other evidence supporting other function. CDS complement (16298 - 16459) /gene="21" /product="gp21" /function="hypothetical protein" /locus tag="Casino_21" /note=Original Glimmer call @bp 16459 has strength 4.95; Genemark calls start at 16459 /note=SSC: 16459-16298 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_22 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 5.59984E-28 GAP: -14 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.686, -3.544879427814361, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_22 [Microbacterium phage Hannabella]],,UVG34257,100.0,5.59984E-28 SIF-HHPRED: Coiled-coil P6 peptide; coiled-coil, nanobody, antibody, protein design, DE NOVO PROTEIN; HET: SO4; 2.157A {Lama glama},,,7A4Y_D,37.7358,85.8 SIF-Syn: Synteny with Hannabella_22 /note=Determined with PECAAN, Starterator, GeneMark, NCBI BLAST, PhagesDB BLAST, and Phamerator. /note=All the available E-values for the function were not good hits in HHPRED or PhagesDB because they either indicated function is unknown or had unacceptably high e-values, for example, e-values of 2.7, 3.5, 6, 10, 18, or 23, a good e-value is 0 or anything extremely close to zero such as 0.02e-15. Only one entry on NCBI BLAST had a good e-value of 5.59e-28 and had 100% identity. It was a fairly recent submission, January 31, 2023. The function was listed as a “hypothetical protein” but this is not a function. 
When comparing gene 21 of Casino to gene 21 of Hannabella on Phamerator, I did not see matching functions in the alignment. CDS complement (16446 - 17249) /gene="22" /product="gp22" /function="DNA binding protein" /locus tag="Casino_22" /note=Original Glimmer call @bp 17249 has strength 12.22; Genemark calls start at 17249 /note=SSC: 17249-16446 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_HANNABELLA_23 [Microbacterium phage Hannabella]],,NCBI, q3:s1 99.2509% 0.0 GAP: -14 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.989, -5.032276239276967, no F: DNA binding protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_23 [Microbacterium phage Hannabella]],,UVG34229,97.7358,0.0 SIF-HHPRED: INSULINOMA-ASSOCIATED PROTEIN 1; OXIDOREDUCTASE-PEPTIDE COMPLEX, DEMETHYLASE, TRANSCRIPTION FACTOR, CHROMATIN; HET: FAD; 2.96A {HOMO SAPIENS},,,3ZMS_C,16.4794,98.2 SIF-Syn: Synteny with other EM1 phages /note=DNA binding protein is supported by phages "Arete" and "Gshelby23" with E-values smaller than 1e-7 /note= /note=I have reviewed and somewhat agree with the gene notes. /note=HHPRED suggests INSULINOMA-ASSOCIATED PROTEIN 1 as a possible function. It is an acceptable hit because the e-value, 0.000096, is close to zero and the probability, 97.8, is very close to 100. We do not see this function repeated in PhagesDB or NCBI. There is only one other hit for this function on HHPRED, 3ZMS_C, with an e-value of 0.00047, decently close to zero, and a high probability of 97.6. On NCBI BLAST, DNA binding protein is a frequent hit, occurring in 5 of the 10 entries, and it has trustworthy e-values, e.g., 6.52551e-78, 4.1338e-117, 5.47268e-136. On the other hand, the percent identities fall in the 20% to 70% range, which is not high enough to confidently determine it as the function.
CDS complement (17236 - 17628) /gene="23" /product="gp23" /function="hypothetical protein" /locus tag="Casino_23" /note=Original Glimmer call @bp 17628 has strength 9.68; Genemark calls start at 17628 /note=SSC: 17628-17236 CP: yes SCS: both ST: SS BLAST-Start: [membrane protein [Microbacterium phage Araxxi] ],,NCBI, q1:s1 97.6923% 2.31922E-56 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.762, -5.0535707005373816, no F: hypothetical protein SIF-BLAST: ,,[membrane protein [Microbacterium phage Araxxi] ],,YP_009850681,80.6202,2.31922E-56 SIF-HHPRED: CYYR1 ; Cysteine and tyrosine-rich protein 1,,,PF10873.12,83.0769,96.3 SIF-Syn: Synteny with other EM1 phages /note=On PhagesDB BLAST, the function was classified as unknown. On HHPRED, Cysteine and tyrosine-rich protein 1 was found to be a potential function with a probability of 96.3, which is good, but the e-value, 0.096, is not as close to zero as my peers and I believed would be reliable enough to claim it as the proper function for gene 23. For NCBI BLAST the e-values, e.g. 2.31922e-56 and 3.79468e-56, were very close to zero, which is exactly what we would want, but the % identity and % aligned were on the lower end, 68.2171 and 80.6202 respectively, which counts against choosing this gene function. Having weighed this evidence, our group ultimately decided that the function for gene 23 would remain unknown.
CDS complement (17625 - 17963) /gene="24" /product="gp24" /function="hypothetical protein" /locus tag="Casino_24" /note=Original Glimmer call @bp 17900 has strength 9.49; Genemark calls start at 17963 /note=SSC: 17963-17625 CP: yes SCS: both-gm ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_25 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 4.19618E-71 GAP: 928 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.685, -3.081776789475849, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_25 [Microbacterium phage Hannabella]],,UVG34231,96.4912,4.19618E-71 SIF-HHPRED: PHA_gran_rgn ; Putative polyhydroxyalkanoic acid system protein (PHA_gran_rgn),,,PF09650.14,40.1786,88.9 SIF-Syn: Synteny with other EM1 phages /note=On NCBI BLAST, the hit "hypothetical protein", with an e-value of 6.25263e-57 and a percent identity of 75.4386, is a strong candidate. However, the % identity is on the lower end. /note=The data from HHPRED for function is a negative hit because the E-value for the described chaperone protein is 10, far too high to be a reliable function for this gene, and the probability is only 87.4. /note=The PhagesDB function frequency provided only one possible function, minor tail protein, but without further information. It relates only to the SEA-PHAGES data, and there is only one entry, with no information on percent identity or e-values, which are key statistical measures for a reliable gene function.
CDS complement (18892 - 19128) /gene="25" /product="gp25" /function="hypothetical protein" /locus tag="Casino_25" /note=Original Glimmer call @bp 19128 has strength 5.15; Genemark calls start at 19128 /note=SSC: 19128-18892 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp26 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 3.93414E-49 GAP: 397 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.313, -3.8137921535956054, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp26 [Microbacterium phage Arete] ],,YP_009857409,100.0,3.93414E-49 SIF-HHPRED: RiPP precursor (SonA); alpha-N-methyltransferase, borosin, natural products, TRANSFERASE; HET: SAM; 1.55A {Shewanella oneidensis},,,8T1T_D,41.0256,84.5 SIF-Syn: Gene 25 of Casino shares synteny with gene 24 of Gshelby /note=PhagesDB BLAST, HHPRED, and NCBI BLAST don’t provide enough information for the function to be determined /note= /note=The suggested start is supported by Starterator, and all GM coding capacity is included. CDS 19526 - 33010 /gene="26" /product="gp26" /function="tape measure protein" /locus tag="Casino_26" /note=Original Glimmer call @bp 19526 has strength 12.14; Genemark calls start at 19526 /note=SSC: 19526-33010 CP: yes SCS: both ST: SS BLAST-Start: [tail length tape measure protein [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 0.0 GAP: 397 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.804, -2.8454123590742793, no F: tape measure protein SIF-BLAST: ,,[tail length tape measure protein [Microbacterium phage Arete] ],,YP_009857410,99.9777,0.0 SIF-HHPRED: Structural protein VP2; helical symmetry, archaeal pilus, STRUCTURAL PROTEIN, VIRUS; 3.4A {Pyrobaculum filamentous virus 1},,,6V7B_g,1.17935,92.2 SIF-Syn: Gene 27 of Casino has synteny with gene 25 of Gshelby23 and gene 27 of Hannabella. Gene 25 of Gshelby23 and gene 27 of Hannabella do not have functions.
/note=Although function unknown had an e-value of 0 under Phagesdb BLAST, tape measure protein also had an e-value of 0 under NCBI BLAST. There were different functions listed under HHPRED, such as structural protein VP2, but the best e-value was 0.35, which is too large. A hit needs an e-value of 10^-7 or smaller in order to be used as support for a function. Also, the gene length (13485 bp) is large, and tape measure proteins are usually long because they are directly related to the phage tail length. Looking at the genome map of Casino on Phamerator, there are no other large genes, which indicates that this is a tape measure protein because a phage only has one tape measure protein if it has a tail. /note=-the phage Baee gene 26 with an e-value of 1e-5 as supportive of function. In NCBI BLAST, I added Araxxi gene 27 as supportive of function (e-value of 0). CDS 33001 - 34983 /gene="27" /product="gp27" /function="portal protein" /locus tag="Casino_27" /note=Original Glimmer call @bp 33001 has strength 15.76; Genemark calls start at 33007 /note=SSC: 33001-34983 CP: yes SCS: both-gl ST: NI BLAST-Start: [portal protein [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 0.0 GAP: -10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.709, -3.619929324802829, no F: portal protein SIF-BLAST: ,,[portal protein [Microbacterium phage Arete] ],,YP_009857411,100.0,0.0 SIF-HHPRED: Portal protein; Portal protein, Dodecamer, VIRAL PROTEIN; 6.24A {Pseudomonas virus PaP3},,,7SZ6_h,82.7273,100.0 SIF-Syn: Gene 26 of Gshelby23 and gene 28 of Hannabella showed synteny with Gene 28 of Casino. Both genes were portal proteins, which supports the chosen function. /note=Although function unknown had an e-value of 0 under Phagesdb BLAST, portal protein also had an e-value of 0. 0 is the best e-value. Portal protein also had an e-value of 1.8e-39 under HHPRED. This is a good e-value because it is smaller than 10^-7, which is a requirement for this class.
Portal protein also had an e-value of 0 under NCBI BLAST. CDS 35069 - 36649 /gene="28" /product="gp28" /function="major capsid protein" /locus tag="Casino_28" /note=Original Glimmer call @bp 35069 has strength 13.12; Genemark calls start at 35069 /note=SSC: 35069-36649 CP: yes SCS: both ST: SS BLAST-Start: [major capsid protein [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: 85 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.184, -2.606079664606164, yes F: major capsid protein SIF-BLAST: ,,[major capsid protein [Microbacterium phage Hannabella]],,UVG34235,100.0,0.0 SIF-HHPRED: Short-tailed cyanophage tailspike receptor-binding domain; Tailspike receptor-binding domain, VIRAL PROTEIN; 2.671A {unidentified},,,7EEA_C,10.4563,97.2 SIF-Syn: Gene 29 in Hannabella is Identical to Gene 29 in Casino. Hannabella has major capsid protein as the function of this gene. /note=phage Hannabella, which is one of the most closely related phages to Casino, has major capsid protein listed as the function for this gene with an E-value of 0. Gene 29 in Hannabella is Identical to Gene 29 in Casino. In NCBI BLAST the Phage Hannabella has major capsid protein as the function for this gene with an E-value of 0. CDS 36708 - 36866 /gene="29" /product="gp29" /function="hypothetical protein" /locus tag="Casino_29" /note=Genemark calls start at 36708 /note=SSC: 36708-36866 CP: yes SCS: genemark ST: SS BLAST-Start: [hypothetical protein HWD16_gp30 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 2.43251E-23 GAP: 58 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.531, -3.4283035164720754, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp30 [Microbacterium phage Arete] ],,YP_009857413,100.0,2.43251E-23 SIF-HHPRED: Nucleoid-associated protein Rv3716c; MtbRv3716, DNA binding, metal ion, DNA BINDING PROTEIN; HET: EDO; 1.9A {Mycobacterium tuberculosis H37Rv},,,5YRX_A,96.1538,63.0 SIF-Syn: The Synteny was high for this gene as it was 81 percent. 
That is still fairly good, though not in the 90s. Synteny with other EM1 phages /note=The function is hypothetical protein because the similar genes in other phages are listed as no known function or hypothetical protein (which is another way of saying no function). CDS 36947 - 38017 /gene="30" /product="gp30" /function="minor tail protein" /locus tag="Casino_30" /note=Original Glimmer call @bp 36947 has strength 11.37; Genemark calls start at 36947 /note=SSC: 36947-38017 CP: yes SCS: both ST: SS BLAST-Start: [minor tail protein [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: 80 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.033, -2.2763497933341483, yes F: minor tail protein SIF-BLAST: ,,[minor tail protein [Microbacterium phage Hannabella]],,UVG34237,100.0,0.0 SIF-HHPRED: L-shaped tail fiber protein p132; Bacteriophage, Siphophage, T5, baseplate, VIRAL PROTEIN; 3.45A {Escherichia phage T5},,,7QG9_S,31.7416,99.6 SIF-Syn: Synteny is present for gene 31 because all three corresponding gene 31s have the exact same number when comparing phages Casino, Arete, and Burro on Phamerator /note=-Phage “Burro” has a function listed as minor tail protein with an E-value of 3e-85, which is well below the 1e-7 cutoff CDS 38028 - 39737 /gene="31" /product="gp31" /function="minor tail protein" /locus tag="Casino_31" /note=Original Glimmer call @bp 38067 has strength 7.51; Genemark calls start at 38028 /note=SSC: 38028-39737 CP: yes SCS: both-gm ST: SS BLAST-Start: [minor tail protein [Microbacterium phage Gshelby23]],,NCBI, q1:s1 100.0% 0.0 GAP: 10 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.84, -5.3524728754355015, no F: minor tail protein SIF-BLAST: ,,[minor tail protein [Microbacterium phage Gshelby23]],,URM86425,100.0,0.0 SIF-HHPRED: Receptor-type tyrosine-protein phosphatase F; Fibronectin type-III, adhesion protein, CELL ADHESION; HET: SO4; 2.9A {Homo sapiens},,,6TPW_A,67.4868,99.9 SIF-Syn: Yes, because between phages Burro and Casino gene 31 has identical 
positions and function /note=Minor tail protein is a supported function because the supporting E-values are 4e-41 and 2e-35, far below the 1e-7 cutoff. CDS 39737 - 41029 /gene="32" /product="gp32" /function="minor tail protein" /locus tag="Casino_32" /note=Original Glimmer call @bp 39737 has strength 9.74; Genemark calls start at 39737 /note=SSC: 39737-41029 CP: no SCS: both ST: SS BLAST-Start: [minor tail protein [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.709, -2.967790469131549, yes F: minor tail protein SIF-BLAST: ,,[minor tail protein [Microbacterium phage Hannabella]],,UVG34239,99.5349,0.0 SIF-HHPRED: L-shaped tail fiber protein p132; Bacteriophage, Siphophage, T5, baseplate, VIRAL PROTEIN; 3.45A {Escherichia phage T5},,,7QG9_S,25.5814,99.7 SIF-Syn: Synteny with other EM1 phages /note=In PhagesDB BLAST, 5 of the 7 genes in our cluster list the function as minor tail protein, all with the best possible E-value of 0. In HHPRED, most of the E-values are too high; the only reasonable hits have functions related to tail proteins. The lowest E-value is 7.4e-15, for L-shaped tail fiber protein. In NCBI BLAST, the top 5 results all have an E-value of 0; 3 of those 5 are for tail protein and the other 2 are for minor tail protein, though the best-ranked function is minor tail protein with 99.5349% identity, 99.5349% aligned, and 100% coverage. The Conserved Domain Database lists conflicting information, and there is no information listed on TMHMM or TOPCONS. 
CDS 41061 - 41807 /gene="33" /product="gp33" /function="endolysin" /locus tag="Casino_33" /note=Original Glimmer call @bp 41061 has strength 13.76; Genemark calls start at 41061 /note=SSC: 41061-41807 CP: yes SCS: both ST: SS BLAST-Start: [endolysin [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: 31 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.86, -2.6453814847322845, yes F: endolysin SIF-BLAST: ,,[endolysin [Microbacterium phage Hannabella]],,UVG34240,99.5968,0.0 SIF-HHPRED: D,D-dipeptidase/D,D-carboxypeptidase; CENTER FOR STRUCTURAL GENOMICS OF INFECTIOUS DISEASES, CSGID, NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES, NIAID; HET: 2D8, LY0, GOL, PE3; 1.364A {Enterococcus faecalis},,,4MUQ_A,41.129,98.6 SIF-Syn: There is synteny shown between gene 34 of Casino and other phages in the cluster EM1, an example being Burro. Although the gene 34 in Burro is in relatively the same position as Casino, the function of gene 34 in Burro was determined to be lysin A. /note=Looking at Phagesdb Function Frequency compared to Phagesdb BLAST, there are a few opposite suggestions for the function of gene 34. One of the suggestions by Phagesdb Function Frequency states Lysin A to be the function for this gene, with a 46% frequency. Although this is the number one candidate listed and it has the highest percentage number, the pham (in cluster EA1) that this is connected to is not in the same subcluster as Casino (EM1) which prompts for further investigation. I then moved onto NCBI BLAST which stated that Endolysin was the proper function with an E-value of 0. Lysin A was also listed, but had a worse E-value (3.84e-179) compared to the Endolysin function. 
CDS 41809 - 42294 /gene="34" /product="gp34" /function="membrane protein" /locus tag="Casino_34" /note=Original Glimmer call @bp 41809 has strength 8.85; Genemark calls start at 41809 /note=SSC: 41809-42294 CP: yes SCS: both ST: NI BLAST-Start: [membrane protein [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 6.75326E-112 GAP: 1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.062, -4.411422009034556, no F: membrane protein SIF-BLAST: ,,[membrane protein [Microbacterium phage Arete] ],,YP_009857418,100.0,6.75326E-112 SIF-HHPRED: Tetraspanin-15; Protease, Metalloprotease, Tetraspanin, Sheddase, Adhesion, MEMBRANE PROTEIN; HET: Y01, MAN, NAG, BAT; 3.3A {Homo sapiens},,,8ESV_B,69.5652,76.9 SIF-Syn: Synteny with other EM1 phages /note=PhagesDB BLAST shows function unknown. /note=HHPRED lists potential functions such as membrane protein (probability 76.9, e-value 88). The rest of the listed functions have extremely high e-values, well over 88, which suggests that these are not good hits for gene 35. /note=NCBI BLAST has strong hits for potential functions, derived from the national database. The best hit is membrane protein with 100% identity, 100% aligned, and 100% coverage. Those parameters cannot get any better. Further strongly supportive evidence is the e-value of 6.75326e-112 for the membrane protein function. It is important to note, however, that these hits refer to other phages in the cluster and not specifically to Casino. Therefore, I provisionally conclude that the function for Gene 35 is unknown, until further evidence can be found. 
CDS 42287 - 42712 /gene="35" /product="gp35" /function="hypothetical protein" /locus tag="Casino_35" /note=Original Glimmer call @bp 42287 has strength 11.37; Genemark calls start at 42287 /note=SSC: 42287-42712 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp36 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 6.4187E-97 GAP: -8 bp gap LO: no RBS: Kibler 6, Karlin Medium, 3.109, -2.17469248771465, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp36 [Microbacterium phage Arete] ],,YP_009857419,100.0,6.4187E-97 SIF-HHPRED: PspB ; Phage shock protein B,,,PF06667.16,57.4468,96.6 SIF-Syn: Synteny with other EM1 phages /note=In PhagesDB BLAST, all of the genes in our cluster state function unknown, with 4 of them having the lowest E-value of 1e-77. In HHPRED, all of the E-values were too high to be considered, although a majority of the listed functions stated unknown function. In NCBI BLAST, hypothetical protein was the majority of what was listed, which means that no proteins supporting a function were found. There is no information listed on TMHMM or TOPCONS. 
CDS 42715 - 42948 /gene="36" /product="gp36" /function="hypothetical protein" /locus tag="Casino_36" /note=Original Glimmer call @bp 42715 has strength 10.89; Genemark calls start at 42715 /note=SSC: 42715-42948 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp37 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 1.95346E-44 GAP: 2 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.955, -2.523003374675015, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp37 [Microbacterium phage Arete] ],,YP_009857420,100.0,1.95346E-44 SIF-HHPRED: Endoplasmic reticulum membrane protein complex subunit 10; ER membrane protein complex, EMC, membrane protein biogenesis, insertase, chaperone, endoplasmic, reticulum, MEMBRANE PROTEIN-IMMUNE SYSTEM complex; HET: X3P, NAG;{Saccharomyces cerevisiae},,,7KRA_H,32.4675,57.6 SIF-Syn: Synteny with other EM1 phages /note=There were no percentage, coverage, or e-values that supported a known function. CDS 42985 - 44265 /gene="37" /product="gp37" /function="hypothetical protein" /locus tag="Casino_37" /note=Original Glimmer call @bp 43117 has strength 11.92; Genemark calls start at 43003 /note=SSC: 42985-44265 CP: yes SCS: both-cs ST: SS BLAST-Start: [hypothetical protein HWD16_gp38 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 0.0 GAP: 36 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 3.033, -2.2763497933341483, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp38 [Microbacterium phage Arete] ],,YP_009857421,100.0,0.0 SIF-HHPRED: SIF-Syn: Synteny with other EM1 phages /note=No top matches were seen when using Phagesdb BLAST, NCBI BLAST, or HHpred, so no function could be determined. All results with acceptable e-values (better than 1e-7) showed “function unknown” or “hypothetical protein”, both meaning no known function. 
CDS 44267 - 45346 /gene="38" /product="gp38" /function="hypothetical protein" /locus tag="Casino_38" /note=Original Glimmer call @bp 44267 has strength 10.71; Genemark calls start at 44267 /note=SSC: 44267-45346 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_39 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: 1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.569, -3.618090370183562, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_39 [Microbacterium phage Hannabella]],,UVG34245,100.0,0.0 SIF-HHPRED: Colicin-E1; antibiotic efflux, bacteriocin, TRANSPORT PROTEIN, ANTIMICROBIAL PROTEIN; 3.09A {Escherichia coli (strain K12)},,,6WXH_D,37.6045,90.9 SIF-Syn: Synteny with other EM1 phages /note=No hits on anything except NCBI BLAST, but because this is a national database, it can be taken into account a bit more than Phagesdb. The matches on NCBI BLAST had e-values of e-56 and e-53, so I think this is enough evidence to call a membrane protein function. CDS 45348 - 46145 /gene="39" /product="gp39" /function="hypothetical protein" /locus tag="Casino_39" /note=Original Glimmer call @bp 45348 has strength 13.02; Genemark calls start at 45348 /note=SSC: 45348-46145 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_40 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: 1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.524, -3.504375086072296, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_40 [Microbacterium phage Hannabella]],,UVG34246,100.0,0.0 SIF-HHPRED: Virion associated protein; Complex, VIRAL PROTEIN; 3.8A {Ralstonia phage GP4},,,8JOV_2,84.5283,91.0 SIF-Syn: Synteny with Gshelby and Burro. /note=No top matches were seen when using Phagesdb BLAST, NCBI BLAST, or HHpred, so no function was able to be determined. 
All results with acceptable e-values (better than 1e-7) showed “function unknown” or “hypothetical protein”, both meaning no known function. CDS 46145 - 48529 /gene="40" /product="gp40" /function="hypothetical protein" /locus tag="Casino_40" /note=Original Glimmer call @bp 46145 has strength 13.16; Genemark calls start at 46145 /note=SSC: 46145-48529 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_HANNABELLA_41 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 0.0 GAP: -1 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 1.923, -4.787182377296497, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_41 [Microbacterium phage Hannabella]],,UVG34247,99.8741,0.0 SIF-HHPRED: Virion-associated phage protein; Complex, VIRAL PROTEIN; 3.8A {Ralstonia phage GP4},,,8JOV_D,81.6121,100.0 SIF-Syn: There is synteny in gene 40 of Casino and gene 38 of Gshelby23. /note=Start Determination: Gap of -1 is low enough to be considered and is the lowest of all candidates. Z-score of 1.923 is closer to 2 than a majority of other candidates. Final score of -4.787 is higher than most other candidates. Spacer is 12 bp, which is at the upper bound of the preferred spacer length (8-12 bp). Starterator states this start is found in only 17.5% of members of the pham, but is called 100% of the time when present. The start would encompass all of the coding potential spike in Genemark. /note= /note=Function: /note=Both PhagesDB and NCBI BLAST have no known function for all or a majority of entries. HHPRED states the potential function as “virion-associated phage protein”. No predicted TMHs. 
CDS 48529 - 48873 /gene="41" /product="gp41" /function="hypothetical protein" /locus tag="Casino_41" /note=Original Glimmer call @bp 48529 has strength 9.15; Genemark calls start at 48529 /note=SSC: 48529-48873 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_GSHELBY23_40 [Microbacterium phage Gshelby23] ],,NCBI, q1:s1 100.0% 7.24162E-73 GAP: -1 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.955, -2.442961286954254, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_GSHELBY23_40 [Microbacterium phage Gshelby23] ],,URM86435,100.0,7.24162E-73 SIF-HHPRED: PROTEIN 2; VIRUS, MEMBER OF PRD1-ADENO VIRAL LINEAGE, MEMBRANE-CONTAINING BACTERIOPHAGE, VIRUS VIRION, MEMBRANE, TRANSMEMBRANE, CAPSID PROTEIN; HET: CA; 7.0A {PSEUDOALTEROMONAS PHAGE PM2},,,2W0C_L,68.4211,98.1 SIF-Syn: Exhibits 98-99% synteny with Gshelby23 and Hannabella. /note=Although 43% of genomes have the function listed as terminase in PhagesDB, I decided on no known function based on no information given by the TMH program, no HHPRED hits, and no entries in the NCBI database, as well as no function given for the closest related genomes, Hannabella and Gshelby23. 
CDS 48938 - 49669 /gene="42" /product="gp42" /function="hypothetical protein" /locus tag="Casino_42" /note=Original Glimmer call @bp 48938 has strength 17.99; Genemark calls start at 48938 /note=SSC: 48938-49669 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein SEA_GSHELBY23_41 [Microbacterium phage Gshelby23] ],,NCBI, q1:s1 100.0% 8.92609E-179 GAP: 64 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.654, -3.1657223569180837, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_GSHELBY23_41 [Microbacterium phage Gshelby23] ],,URM86436,100.0,8.92609E-179 SIF-HHPRED: Uncharacterized protein; Immunoglobulin - like beta-sandwich, Structural Genomics, Joint Center for Structural Genomics, JCSG, Protein Structure Initiative, PSI-BIOLOGY; HET: MSE, GOL; 1.69A {Bacteroides thetaiotaomicron},,,4G5A_B,20.9877,67.3 SIF-Syn: Synteny with other EM1 phages /note=The HHPRED e-value was poor and the probability score was less than 80%. Neither NCBI BLAST nor Phagesdb BLAST returned matches with a known function. 
CDS 49666 - 50259 /gene="43" /product="gp43" /function="hypothetical protein" /locus tag="Casino_43" /note=Original Glimmer call @bp 49666 has strength 18.92; Genemark calls start at 49666 /note=SSC: 49666-50259 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp44 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 1.73763E-140 GAP: -4 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.382, -3.8079208184165134, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp44 [Microbacterium phage Arete] ],,YP_009857427,100.0,1.73763E-140 SIF-HHPRED: Fe(II)-binding effector; Fe(II)-binding effector, METAL BINDING PROTEIN; HET: MSE; 1.96A {Yersinia pseudotuberculosis serotype I (strain IP32953)},,,7DMS_A,19.797,51.6 SIF-Syn: Synteny with other EM1 phages /note=No known function: the HHPRED e-values are poor and the probability score is below 80%, and Phagesdb BLAST and NCBI BLAST have no matches. CDS 50260 - 50658 /gene="44" /product="gp44" /function="hypothetical protein" /locus tag="Casino_44" /note= /note=SSC: 50260-50658 CP: yes SCS: neither ST: NA BLAST-Start: [hypothetical protein HWC57_gp45 [Microbacterium phage Araxxi] ],,NCBI, q51:s1 59.8485% 8.92461E-25 GAP: 0 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.804, -2.7653702713535186, no F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWC57_gp45 [Microbacterium phage Araxxi] ],,YP_009850702,86.0465,8.92461E-25 SIF-HHPRED: SIF-Syn: Synteny with EM1 phage Araxxi_45 /note=No hits on HHPRED or NCBI CDS 50541 - 51353 /gene="45" /product="gp45" /function="hypothetical protein" /locus tag="Casino_45" /note=Original Glimmer call @bp 50541 has strength 9.28; Genemark calls start at 50541 /note=SSC: 50541-51353 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_HANNABELLA_45 [Microbacterium phage Hannabella]],,NCBI, q1:s96 100.0% 0.0 GAP: -118 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.169, -4.263391326015678, no F: hypothetical 
protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_45 [Microbacterium phage Hannabella]],,UVG34251,73.9726,0.0 SIF-HHPRED: SIF-Syn: Synteny with EM1 phage Araxxi_46 /note=No information on function given. CDS 51356 - 51487 /gene="46" /product="gp46" /function="hypothetical protein" /locus tag="Casino_46" /note=Original Glimmer call @bp 51356 has strength 2.22 /note=SSC: 51356-51487 CP: yes SCS: glimmer ST: SS BLAST-Start: [hypothetical protein HWD16_gp46 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 3.00499E-21 GAP: 2 bp gap LO: no RBS: Kibler 6, Karlin Medium, 1.862, -4.777807699064131, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp46 [Microbacterium phage Arete] ],,YP_009857429,100.0,3.00499E-21 SIF-HHPRED: DUF2852 ; Protein of unknown function (DUF2852),,,PF11014.12,69.7674,86.5 SIF-Syn: There are high levels of synteny among Casino and other EM1 phages and these scores range in the 90s. /note=start: suggested by Glimmer; captures all coding potential; z-score, final score, gap, and spacer align with the guiding principles /note=function: good e-value scores on PhagesDB BLAST and NCBI BLAST for no known function. CDS 51488 - 52948 /gene="47" /product="gp47" /function="terminase" /locus tag="Casino_47" /note=Original Glimmer call @bp 51488 has strength 15.73; Genemark calls start at 51488 /note=SSC: 51488-52948 CP: yes SCS: both ST: SS BLAST-Start: [terminase [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 0.0 GAP: 0 bp gap LO: yes RBS: Kibler 6, Karlin Medium, 2.654, -3.1657223569180837, yes F: terminase SIF-BLAST: ,,[terminase [Microbacterium phage Arete] ],,YP_009857430,100.0,0.0 SIF-HHPRED: Large subunit terminase; large terminase, VIRAL PROTEIN; 2.2A {Deep-sea thermophilic phage D6E},,,5OE8_B,92.3868,100.0 SIF-Syn: Synteny with other EM1 phages /note=Start supported by PhagesDB, GeneMark, and PECAAN. /note=Support from e-values and probabilities in HHPRED predictions. 
CDS 52936 - 53283 /gene="48" /product="gp48" /function="hypothetical protein" /locus tag="Casino_48" /note=Original Glimmer call @bp 52936 has strength 12.5; Genemark calls start at 52936 /note=SSC: 52936-53283 CP: yes SCS: both ST: NI BLAST-Start: [hypothetical protein SEA_HANNABELLA_48 [Microbacterium phage Hannabella]],,NCBI, q1:s1 100.0% 2.53717E-78 GAP: -13 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.695, -2.996985748252251, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein SEA_HANNABELLA_48 [Microbacterium phage Hannabella]],,UVG34254,100.0,2.53717E-78 SIF-HHPRED: a.61.1.3 (X:) Mason-Pfizer monkey virus matrix protein {Simian Mason-Pfizer virus [TaxId: 11855]} | CLASS: All alpha proteins, FOLD: Retroviral matrix proteins, SUPFAM: Retroviral matrix proteins, FAM: Mason-Pfizer monkey virus matrix protein,,,SCOP_d2f77x_,59.1304,92.1 SIF-Syn: Synteny with other EM1 phages /note=No information given for function. CDS 53273 - 53563 /gene="49" /product="gp49" /function="hypothetical protein" /locus tag="Casino_49" /note=Original Glimmer call @bp 53273 has strength 14.89; Genemark calls start at 53273 /note=SSC: 53273-53563 CP: yes SCS: both ST: SS BLAST-Start: [hypothetical protein HWD16_gp49 [Microbacterium phage Arete] ],,NCBI, q1:s1 100.0% 1.3685E-63 GAP: -11 bp gap LO: no RBS: Kibler 6, Karlin Medium, 2.382, -4.018031164761625, yes F: hypothetical protein SIF-BLAST: ,,[hypothetical protein HWD16_gp49 [Microbacterium phage Arete] ],,YP_009857432,100.0,1.3685E-63 SIF-HHPRED: DUF6307 ; Family of unknown function (DUF6307),,,PF19826.3,28.125,67.7 SIF-Syn: Synteny with other EM1 phages /note=There was no percentage, coverage, or e-value that supported a known function.