CDS 135 - 350 /note=There are 45 Manual annotations for the start site 135 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. NCBI Blast identifies this as a hypothetical protein with 100% coverage and identity on multiple hits. HHPred does not have any significant hits to identify this gene. So, we can assume that this is a hypothetical protein.
CDS 350 - 673 /note=There are 68 Manual annotations for the start site 350 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present and the lack of gaps in between genes. HHPred identifies this as a terminase, specifically a small subunit, with approximately a 94% probability and 78% coverage. NCBI Blast identifies this gene as a terminase small subunit as well, with a 95% identity and 98% coverage. Multiple hits on NCBI Blast and HHPred support this gene function. So, we can assume that this is a terminase small subunit.
CDS 670 - 1044 /note=There are 2 Manual annotations for the start site 670 on Starterator. Genemark and Glimmer agree with this. I cannot identify this on Genemark either. NCBI Blast does not have any significant hits. HHPred identifies this gene as a restriction endonuclease, which is a type of HNH endonuclease. This was identified with a high probability of 99% and a coverage of 98%.
CDS 1046 - 2674 /note=There are 70 Manual annotations for the start site 1046 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. NCBI Blast identifies this as a large subunit terminase with 99% coverage and identity on multiple hits. HHPred identifies this gene as a large subunit terminase with a 100% probability and a 91% coverage. So, we can be confident in the gene's function as a large subunit terminase.
CDS 2686 - 4710 /note=There are 69 Manual annotations for the start site 2686 on Starterator. 
Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this as a portal protein with a 100% probability and a 92% coverage. NCBI Blast identifies this gene as a portal protein with a 99% identity and a 100% coverage on multiple hits. So, we can be confident in the gene's function as a portal protein.
CDS 4710 - 5462 /note=There are 120 Manual annotations for the start site 4710 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this as a capsid maturation protease with a 99% probability and a lower coverage of 64%. NCBI Blast identifies this gene as a capsid maturation protease with a 100% identity and a 100% coverage on multiple hits with comparable numbers. So, we can be confident in the gene's function as a capsid maturation protease.
CDS 5499 - 6680 /note=There are 157 Manual annotations for the start site 5499 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this as a major capsid protein with a 100% probability and a coverage of 99%. NCBI Blast identifies this gene as a capsid maturation protease with a 100% identity and a 100% coverage on multiple hits with comparable numbers. So, we can be confident in the gene's function as a major capsid protein.
CDS 6775 - 7116 /note=There are 42 Manual annotations for the start site 6775 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers.
CDS 7151 - 7711 /note=There are 74 Manual annotations for the start site 7151 on Starterator. Genemark and Glimmer agree with this. 
This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a head-to-tail adaptor, with some hits listed under the name head completion protein, with a 100% probability and a 97% coverage. NCBI Blast identifies this gene as a head-to-tail adaptor with a 99% identity and a 100% coverage on multiple hits with comparable numbers.
CDS 7711 - 8040 /note=There are 68 Manual annotations for the start site 7711 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a head-to-tail stopper with a 99% probability and an 89% coverage. NCBI Blast identifies this gene as a head-to-tail stopper with a 99% identity and a 100% coverage on multiple hits with comparable numbers. So, we can assume that this is a head-to-tail stopper.
CDS 8040 - 8297 /note=There are 48 Manual annotations for the start site 8040 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a protein of unknown function with a 94% probability and an 81% coverage. This is the only significant hit. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers.
CDS 8297 - 8692 /note=There are 81 Manual annotations for the start site 8297 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a tail terminator protein with a 99% probability and a 95% coverage. This is the only significant hit. NCBI Blast identifies this gene as a tail terminator with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the large amount of evidence presented on both HHPred and NCBI Blast, we can assume this gene is a tail terminator.
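Several notes above justify a gene call by "the number of base pairs present" and by how tightly neighboring genes are packed. The following is a minimal sketch of that arithmetic, using coordinates copied from the forward-strand entries above. The function names are hypothetical and are not part of any annotation tool.

```python
def cds_length(start: int, stop: int) -> int:
    """Length in base pairs of a CDS with 1-based, inclusive coordinates."""
    return stop - start + 1


def gap_between(upstream_stop: int, downstream_start: int) -> int:
    """Bases separating two adjacent genes; a negative value means overlap."""
    return downstream_start - upstream_stop - 1


# Coordinates copied from four consecutive forward-strand entries above.
genes = [(7151, 7711), (7711, 8040), (8040, 8297), (8297, 8692)]

for start, stop in genes:
    length = cds_length(start, stop)
    # A complete open reading frame is a whole number of codons.
    print(f"CDS {start} - {stop}: {length} bp, whole codons: {length % 3 == 0}")

for (_, upstream_stop), (downstream_start, _) in zip(genes, genes[1:]):
    print(f"{upstream_stop} -> {downstream_start}: gap {gap_between(upstream_stop, downstream_start)} bp")
```

A gap of -1 here means that the last base of one gene is also the first base of the next, which is the tight packing these entries show.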
CDS 8706 - 9503 /note=There are 195 Manual annotations for the start site 8706 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a major tail protein with a 100% probability and a 98% coverage. There are other hits that show the function as hypothetical protein. NCBI Blast identifies this gene as a major tail protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. There is evidence for both a hypothetical protein and a major tail protein. Most hits that represent the function as being hypothetical also emphasize that the gene could be a major tail protein. Based on the large amount of evidence presented on both HHPred and NCBI Blast, we can assume this gene is a major tail protein.
CDS 9629 - 9937 /note=There are 88 Manual annotations for the start site 9629 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a tail assembly protein with a 99% probability and a 95% coverage. There are no other significant hits for this gene on HHPred. NCBI Blast identifies this gene as a tail assembly chaperone with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the large amount of evidence presented on both HHPred and NCBI Blast, we can assume this gene is a tail assembly chaperone.
CDS complement (10344 - 10559) /note=There are 7 Manual annotations for the start site 10559 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 98% identity and a 98% coverage on multiple hits with comparable numbers.
CDS 10774 - 13533 /note=There are 52 Manual annotations for the start site 10774 on Starterator. Genemark and Glimmer agree with this. 
This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to use as evidence. NCBI Blast identifies this gene as a tape measure protein with a 99% identity and a 100% coverage on multiple hits with comparable numbers. Based on the large amount of evidence presented on NCBI Blast, we can assume this gene is a tape measure protein.
CDS 13530 - 14384 /note=There are 65 Manual annotations for the start site 13530 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a distal tail protein with a 100% probability and an 89% coverage. It is a type of minor tail protein, which explains why multiple hits with strong numbers were presenting minor tail protein as opposed to distal tail protein. NCBI Blast identifies this gene as a minor tail protein with a 97% identity and a 98% coverage on multiple hits with comparable numbers. Based on the large amount of evidence presented on NCBI Blast, we can assume this gene is a minor tail protein.
CDS 14384 - 16222 /note=There are 37 Manual annotations for the start site 14384 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this as a minor tail protein. The probability is 100% while the coverage is rather low at 63%. NCBI Blast identifies this gene as a minor tail protein with a 76% identity and a 99% coverage on multiple hits with comparable numbers. The identity is lower than normal and the alignment is not where it should be. Based on the synteny compared to other phages, we can identify this gene as a minor tail protein.
CDS 16235 - 16495 /note=There are 15 Manual annotations for the start site 16235 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred and NCBI Blast do not have any significant hits to identify this gene. 
So, I ran it through DeepTMHMM. The results indicated that the protein has an inside topology and is of the globular type.
CDS 16495 - 17583 /note=There are 143 Manual annotations for the start site 16495 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a receptor binding protein, which falls under the category of a minor tail protein, with a 99% probability and a lower coverage of 42%. NCBI Blast identifies this gene as a minor tail protein with a 96% identity and a 98% coverage on multiple hits with comparable numbers. Based on the large amount of evidence presented on both HHPred and NCBI Blast, we can assume this gene is a minor tail protein.
CDS 17593 - 18396 /note=There are 94 Manual annotations for the start site 17593 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a receptor type tyrosine protein with a 97% probability and a lower coverage of 59%. NCBI Blast identifies this gene as a minor tail protein with a 96% identity and a 98% coverage on multiple hits with comparable numbers. Based on the large amount of evidence presented on NCBI Blast, we can assume this gene is a minor tail protein.
CDS 18408 - 19157 /note=There are 32 Manual annotations for the start site 18408 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a D dipeptidase and a D carboxypeptidase, with a 99% probability and a 53% coverage. There are other hits that show the function as carboxypeptidase, which is a type of endolysin. NCBI Blast identifies this gene as an endolysin with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is an endolysin.
CDS 19163 - 19645 /note=There are 39 Manual annotations for the start site 19163 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a membrane protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a membrane protein. Confirmed by DeepTMHMM.
CDS 19656 - 19997 /note=There are 53 Manual annotations for the start site 19656 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a membrane protein with a 95% probability and an 82% coverage. In multiple hits it is identified as a tail anchor protein, which is a type of membrane protein and supports the other evidence collected. There are other hits that show the function as hypothetical protein. NCBI Blast identifies this gene as a membrane protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a membrane protein. Membrane binding domain on the end.
CDS 20001 - 20228 /note=There are 43 Manual annotations for the start site 20001 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a membrane protein with a 97% identity and a 98% coverage on multiple hits with comparable numbers. There is evidence for both a hypothetical protein and a membrane protein, because HHPred did not have significant evidence and NCBI Blast had multiple hits indicating the gene as a hypothetical protein. Based on the evidence presented on NCBI Blast, we can assume this gene is a membrane protein. 
Confirmed by DeepTMHMM.
CDS 20471 - 21127 /note=There are 45 Manual annotations for the start site 20471 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to use as evidence for the function of this gene. NCBI Blast identifies this gene as a hypothetical protein with a 99% identity and a 99% coverage. There are multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. Inside topology based on DeepTMHMM.
CDS 21124 - 21270 /note=There are 35 Manual annotations for the start site 21124 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 21335 - 22213 /note=There are 60 Manual annotations for the start site 21335 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as an exonuclease with a 99% probability and an 85% coverage. NCBI Blast identifies this gene as an exonuclease with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is an exonuclease.
CDS 22213 - 22377 /note=There are 21 Manual annotations for the start site 22213 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred did not have any significant hits. 
NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 22374 - 22757 /note=There are 47 Manual annotations for the start site 22374 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits with high enough coverage. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 22871 - 23407 /note=There are 8 Manual annotations for the start site 22871 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a hydrolase with a 100% probability and a 72% coverage. There are other hits that show the function as metal binding protein. A hydrolase can be identified as a metal binding protein. NCBI Blast identifies this gene as a nucleoside pyrophosphohydrolase with a 98% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a hydrolase.
CDS 23404 - 23913 /note=There are 63 Manual annotations for the start site 23404 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a thymidylate kinase with a 99% probability and a 97% coverage. NCBI Blast identifies this gene as a thymidylate kinase with a 98% identity and a 98% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a thymidylate kinase.
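The function calls in these notes follow a recurring pattern: use the HHPred hit when it is strong and agrees with NCBI Blast, and fall back to NCBI Blast when HHPred has nothing significant. The sketch below illustrates that pattern only. The notes state no numeric cutoffs, so `MIN_PROBABILITY` and `MIN_COVERAGE` are hypothetical values, and `Hit` and `call_function` are illustrative names rather than part of any tool.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoffs for illustration only; the notes state no numeric thresholds.
MIN_PROBABILITY = 90.0
MIN_COVERAGE = 50.0


@dataclass
class Hit:
    function: str
    score: float     # HHPred probability or Blast identity, in percent
    coverage: float  # alignment coverage, in percent


def call_function(hhpred: Optional[Hit], blast: Optional[Hit]) -> str:
    """Combine the best HHPred and Blast hits into a single function call."""
    hhpred_ok = (
        hhpred is not None
        and hhpred.score >= MIN_PROBABILITY
        and hhpred.coverage >= MIN_COVERAGE
    )
    if blast is None:
        # No Blast data: rely on HHPred alone if it is usable.
        return hhpred.function if hhpred_ok else "hypothetical protein"
    if not hhpred_ok:
        return blast.function
    if hhpred.function == blast.function:
        return blast.function
    # The tools disagree: a person has to weigh the hits.
    return "needs manual review"


# Values copied from the entries above.
print(call_function(Hit("thymidylate kinase", 99, 97), Hit("thymidylate kinase", 98, 98)))
print(call_function(None, Hit("hypothetical protein", 100, 100)))
print(call_function(Hit("hydrolase", 100, 72), Hit("nucleoside pyrophosphohydrolase", 98, 100)))
```

Exact string comparison is deliberately crude. Related names such as hydrolase and nucleoside pyrophosphohydrolase are flagged for review instead of being merged automatically, which matches how the notes reason through such cases by hand.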
CDS 23929 - 24678 /note=There are 68 Manual annotations for the start site 23929 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a recombination directionality factor with a 99% probability and a 74% coverage. There is only one significant hit on HHPred. NCBI Blast identifies this gene as a recombination directionality factor with a 99% identity and a 99% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a recombination directionality factor.
CDS complement (24948 - 25217) /note=There are not any Manual annotations for the start site 25217 on Starterator. Genemark and Glimmer agree in identifying the start site as 25205. HHPred and NCBI Blast both lack significant hits to identify the function of this gene. Based on the lack of evidence presented on both HHPred and NCBI Blast, and the differing start sites, this might not be identifiable as a gene.
CDS 25291 - 25521 /note=There are 57 Manual annotations for the start site 25291 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a glutaredoxin with a 99% probability and a 93% coverage. There are other hits that show the function as metal binding protein. This supports the function as glutaredoxin, because glutaredoxin can function as a metal binding protein. NCBI Blast identifies this gene as a glutaredoxin with a 97% identity and a 100% coverage on multiple hits with comparable numbers. Multiple hits identified this gene as thioredoxin. This contributes to the evidence because both take part in cellular redox reactions. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a glutaredoxin.
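In the entries above, the start site of a forward-strand gene is the left coordinate, while for a complement entry it is the right one (for example, 25217 for complement (24948 - 25217)). The sketch below makes that convention explicit. The helper name `start_and_stop` is hypothetical.

```python
from typing import Tuple


def start_and_stop(left: int, right: int, complement: bool = False) -> Tuple[int, int]:
    """Return (start, stop) for a CDS written as 'left - right'.

    On the complement strand the gene is read right to left, so the
    start site is the larger coordinate.
    """
    return (right, left) if complement else (left, right)


# Coordinates copied from the entries above.
print(start_and_stop(25291, 25521))                   # forward strand
print(start_and_stop(24948, 25217, complement=True))  # complement strand
```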
CDS 25521 - 25847 /note=There are 62 Manual annotations for the start site 25521 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a Holliday junction resolvase with a 99% probability and a 95% coverage. NCBI Blast identifies this gene as a Holliday junction resolvase with a 99% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a Holliday junction resolvase.
CDS 25877 - 26170 /note=There are 48 Manual annotations for the start site 25877 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 26256 - 26525 /note=There is 1 Manual annotation for the start site 26256 on Starterator. Genemark and Glimmer agree with this, and the gene was easily identified in Genemark. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 98% identity and a 98% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 26611 - 26796 /note=There are no manual annotations for the start site 26611 on Starterator. Genemark and Glimmer agree in identifying the start site as 26796. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to identify this gene's function. 
NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 26916 - 29390 /note=There are 67 Manual annotations for the start site 26916 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a DNA binding protein with a 100% probability and a lower coverage of 54%. There are other hits that show the function as DNA primase and helicase. NCBI Blast identifies this gene as a DNA primase/helicase with a 99% identity and a 99% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a DNA primase/helicase.
CDS 29387 - 31261 /note=There are 58 Manual annotations for the start site 29387 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a DNA polymerase with a 100% probability and a 95% coverage. NCBI Blast identifies this gene as a DNA polymerase with a 99% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a DNA polymerase.
CDS 31261 - 31461 /note=There are 65 Manual annotations for the start site 31261 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. DeepTMHMM suggests inside topology.
CDS 31458 - 31760 /note=The start site with the most manual annotations on Starterator is 31458, with 84 manual annotations. The start site identified by Glimmer and Genemark is 31491, with 47 manual annotations. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with an 89% identity and an 89% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 31760 - 32479 /note=There are 53 Manual annotations for the start site 31760 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as an RNA polymerase with a 99% probability and a 99% coverage. There are other hits that show the function as RNA polymerase or DNA binding protein. NCBI Blast identifies this gene as a DNA binding protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. There is evidence for both RNA polymerase and DNA binding protein, which are compatible, since RNA polymerase binds DNA. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a DNA binding protein.
CDS complement (32476 - 32700) /note=There are 8 Manual annotations for the start site 32700 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a DNA binding protein and a nucleic acid binding protein with a 91% probability and a 90% coverage. These hits support each other because nucleic acid binding proteins can include DNA binding proteins and have similar function. NCBI Blast identifies this gene as a hypothetical protein with a 98% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on HHPred, which has a more specific conclusion, we can assume this gene is a DNA binding protein.
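The CDS 31458 - 31760 entry above lists two candidate starts with different manual annotation counts, and the coordinates kept are those of the more annotated start. A minimal sketch of that tie-break follows. It assumes the Starterator counts have already been collected by hand into a dictionary, and the function name is hypothetical.

```python
from typing import Dict


def most_annotated_start(manual_annotations: Dict[int, int]) -> int:
    """Return the candidate start site with the most manual annotations."""
    return max(manual_annotations, key=manual_annotations.get)


# Candidate start -> manual annotation count, copied from the entry above.
candidates = {31458: 84, 31491: 47}
print(most_annotated_start(candidates))
```

This only ranks candidates by count. It does not replace checking coding potential or gaps between neighboring genes, which the notes also weigh.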
CDS complement (32712 - 32876) /note=There are 5 Manual annotations for the start site 32876 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 98% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 32988 - 33146 /note=There are 38 Manual annotations for the start site 32988 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a hypothetical protein with an 86% probability and a 78% coverage. There are other significant hits that show the function as RNA polymerase, but the probability and coverage were higher for hypothetical protein than for other functions. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 33146 - 33547 /note=Starterator identifies 42 manual annotations for the start site 33146. Glimmer and Genemark agree on the start site of 33146 and it is easily identifiable in Genemark. HHPred identifies this gene as Sulfiredoxin with a 98% probability and an 86% coverage. This evidence supports the function as a ParB-like nuclease domain protein because Sulfiredoxin contains a ParB-like nuclease domain. There are other significant hits on HHPred that show the function as ParB-like nuclease domain protein. NCBI Blast identifies this gene as a ParB-like nuclease domain protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. 
Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a ParB-like nuclease domain protein.
CDS 33535 - 33978 /note=There are 4 Manual annotations for the start site 33535 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 97% identity and a 98% coverage on multiple hits with comparable numbers. Based on the evidence on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 33983 - 34471 /note=There are 8 Manual annotations for the start site 33983 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a T4SS, Secretion, H. pylori, TRANSLOCASE, with a 90% probability and an 88% coverage. There is one other hit that shows the function as a Large polyvalent protein associated domain. NCBI Blast identifies this gene as a hypothetical protein with a 98% identity and a 98% coverage on multiple hits with comparable numbers. Because the hits presented in NCBI Blast are more consistent and have higher identity and coverage than those on HHPred, we can assume that this gene is a hypothetical protein.
CDS 34653 - 34934 /note=There is 1 Manual annotation for the start site 34653 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a CRISPR associated endonuclease with a 97% probability and a coverage of 79%. There are other hits that show the function as RNA BINDING PROTEIN-RNA-DNA. NCBI Blast identifies this gene as an HNH endonuclease with a 69% identity and a 70% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is an HNH endonuclease.
CDS 35094 - 35411 /note=There are 44 Manual annotations for the start site 35094 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 99% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 35408 - 35566 /note=There are 61 Manual annotations for the start site 35408 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 35619 - 36119 /note=There are 32 Manual annotations for the start site 35619 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits. NCBI Blast identifies this gene as a hypothetical protein with a 97% identity and a 99% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 36183 - 36347 /note=Starterator does not identify any manual annotations for the start site 36183. Genemark and Glimmer both confirm this start. There is no PhagesDB function frequency. HHPred does not have any significant hits. NCBI Blast does not have any data available. No one has seen this gene before, so we lack the ability to compare this gene to other genes. The function can be identified as a hypothetical protein.
CDS 36355 - 36600 /note=There are 7 Manual annotations for the start site 36340 on Starterator. 
PhagesDB, Genemark, and Glimmer agree that the start site is 36355 on Starterator. So, we can confirm that 36355 is the start site. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as an uncharacterized protein with an 84% probability and a 61% coverage. NCBI Blast identifies this gene as a hypothetical protein with a 75% identity and a 75% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a hypothetical protein.
CDS 36590 - 37162 /note=There are 64 Manual annotations for the start site 36590 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a dihydrofolate reductase with a 99% probability and a 97% coverage. NCBI Blast identifies this gene as a dihydrofolate reductase with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a dihydrofolate reductase.
CDS 37159 - 37686 /note=There are 20 Manual annotations for the start site 37159 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a Lipoprotein confined to pathogenic Mycobacterium with an 81% probability and a 77% coverage. NCBI Blast identifies this gene as a hypothetical protein with a 96% identity and a 97% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast and the lack of evidence on HHPred, we can assume that this gene is a hypothetical protein.
CDS 37686 - 38069 /note=There are 23 Manual annotations for the start site 37686 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. 
HHPred does not have any significant hits with high enough coverage and probability to identify this gene. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 38066 - 38971 /note=There are 25 Manual annotations for the start site 38066 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a thymidylate synthase with a 100% probability and a 99% coverage. NCBI Blast identifies this gene as a thymidylate synthase with a 99% identity and a 99% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a thymidylate synthase. CDS 38986 - 39210 /note=There are 23 Manual annotations for the start site 38986 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any hits with high enough coverage and probability to identify this gene's function. NCBI Blast identifies this gene as a hypothetical protein with a 95% identity and a 97% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 39210 - 39404 /note=There are 40 Manual annotations for the start site 39210 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as a Putative Actinobacterial holin superfamily 3 with an 83% probability and a 96% coverage. This is described as a membrane protein, which supports the evidence presented by NCBI Blast. There are other hits that show the function as unknown.
NCBI Blast identifies this gene as a membrane protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is a membrane protein. Evidence on DeepTMHMM CDS 39548 - 39685 /note=There are 53 Manual annotations for the start site 39548 on Starterator. Glimmer agrees with this, while Genemark identifies the start site as 39482, but that site has only 8 manual annotations. So, we can assume that the start site is 39548. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits with high enough probability and coverage. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 39682 - 40074 /note=There are 14 Manual annotations for the start site 39682 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any hits with high enough probability and coverage to identify this gene. NCBI Blast identifies this gene as a hypothetical protein with an 83% identity and an 87% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 40067 - 40249 /note=There are 53 Manual annotations for the start site 40067 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to identify the gene function. NCBI Blast identifies this gene as a hypothetical protein with a 98% identity and a 100% coverage on multiple hits with comparable numbers.
Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 40262 - 40492 /note=There are 37 Manual annotations for the start site 40262 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to identify the gene function. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 40492 - 40662 /note=There is 1 Manual annotation for the start site 40492 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred identifies this gene as an oxidoreductase with an 82% probability and a 62% coverage. This is lower than ideal, but it is the highest hit. There are other hits that show the function as hypothetical protein, but they lack high enough probability and coverage. NCBI Blast identifies this gene as a hypothetical protein with a 96% identity and a 98% coverage on one hit. There are no other available hits. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 40659 - 41009 /note=There are 60 Manual annotations for the start site 40659 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to identify this gene. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 41122 - 41292 /note=There are 4 Manual annotations for the start site 41122 on Starterator. Glimmer agrees with this.
Genemark does not identify a start site. This is identified as a gene based on the number of base pairs present. HHPred does not have any significant hits to identify this gene's function. Regardless, the highest-probability hit on HHPred is 75% with a higher coverage of 96%, identifying the function as a transport protein. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on one hit, with one other hit at comparable numbers. Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein. CDS 41357 - 41632 /note=There are no Manual annotations for the start site 41357 on Starterator for gene 72. Genemark and Glimmer agree with this. Starterator only identifies the start site 41357 for gene 73. This is identified as a gene based on the number of base pairs present. Additionally, Phagesdb specifies that this gene codes for a tRNA, not a protein. HHPred identifies this gene as a CRISPR-associated endonuclease with a 98% probability and a 91% coverage. There are other hits with comparable numbers that agree with this function. NCBI Blast identifies this gene as an HNH endonuclease with a 100% identity and a 100% coverage on multiple hits with comparable numbers. The HHPred hits to CRISPR-associated endonucleases are consistent with the NCBI Blast result, because CRISPR-associated endonucleases such as Cas9 contain an HNH nuclease domain. Based on the evidence presented on both HHPred and NCBI Blast, we can assume this gene is an HNH endonuclease. CDS 41632 - 41760 /note=There are 40 Manual annotations for the start site 41632 on Starterator. Genemark and Glimmer agree with this. This is identified as a gene based on the number of base pairs present. HHPred does not have significant hits to identify this gene function. NCBI Blast identifies this gene as a hypothetical protein with a 100% identity and a 100% coverage on multiple hits with comparable numbers.
Based on the evidence presented on NCBI Blast, we can assume this gene is a hypothetical protein.
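Several notes above call a gene "based on the number of base pairs present" and on the spacing between neighboring genes. As a rough illustration of that arithmetic, the sketch below computes the length of each CDS and the gap (or overlap) to the next one, using the last four coordinate pairs listed above. It assumes 1-based inclusive coordinates and that all four genes lie on the forward strand; neither is stated in the notes. The names `CDS` and `gap_to_next` are hypothetical helpers, not part of any annotation tool.

```python
from typing import NamedTuple


class CDS(NamedTuple):
    """A coding sequence given by 1-based, inclusive coordinates (assumed)."""
    start: int
    stop: int

    @property
    def length(self) -> int:
        # Inclusive coordinates, so both end positions are counted.
        return self.stop - self.start + 1


def gap_to_next(upstream: CDS, downstream: CDS) -> int:
    """Bases between two genes; a negative value means they overlap."""
    return downstream.start - upstream.stop - 1


# Coordinates copied from the CDS entries above (forward strand assumed).
cds_list = [
    CDS(40659, 41009),
    CDS(41122, 41292),
    CDS(41357, 41632),
    CDS(41632, 41760),
]

lengths = [c.length for c in cds_list]
gaps = [gap_to_next(a, b) for a, b in zip(cds_list, cds_list[1:])]

for c in cds_list:
    # A complete open reading frame has a length divisible by 3.
    print(c.start, c.stop, c.length, c.length % 3 == 0)
print(gaps)
```

A negative gap, such as the last one here, means the stop of one gene shares bases with the start of the next, which fits the tight gene packing described in these notes.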