CDS 1 - 525 /note=No significant sequence or structural homology to proteins of known function identified by BLASTP and HHpred, respectively.
CDS 522 - 1922 /note=HHpred predicts substantial structural homology to well-characterized T-even and T-odd terminase proteins. BLASTP predicts substantial sequence homology to annotated terminases in related phages.
CDS 1953 - 3350 /note=HHpred predicts very strong structural homology to well-characterized portal proteins of phages of Gram-positive bacteria. BLAST finds significant sequence homology to its counterparts in syntenic phages.
CDS 3356 - 4024 /note=HHpred found high structural homology to three distinct phams, but only over a short stretch of protein (~35 aa). This is not compelling evidence of functional homology. BLASTP found no significant homology to any proteins that would allow assignment of capsid maturation protease. Calling hypothetical protein.
CDS 4108 - 4668 /note=HHpred did not predict structural homology to any phams with biochemically characterized functions. BLAST found consistently high sequence homology to annotated scaffolding proteins, but again, we could not with confidence find a BLAST hit with biochemical data supporting a functional call. We used AlphaFold to generate a predicted tertiary structure for Stop-4668. The AlphaFold Stop-4668 structure is similar to the published structure of a scaffolding protein (Guo et al., https://doi.org/10.1101/2024.11.01.621488), in that it has a long stretch of protein with no secondary structure, followed by a number of alpha-helices. To be consistent with the community, we are calling a scaffolding protein.
CDS 4708 - 5634 /note=HHpred suggests major capsid protein. NCBI BLAST also calls for a major capsid protein.
CDS 5631 - 5882 /note=HHpred hits were inconsistent, with low probability and coverage numbers. NCBI BLAST shows sequence homology to hypothetical proteins.
CDS 5952 - 6377 /note=No significant homology found by HHpred or NCBI BLASTP.
CDS 6404 - 6817 /note=No significant structural or sequence homology predicted by HHpred or BLAST.
CDS 6783 - 7199 /note=HHpred showed only one significant hit, and that hit was to a protein of unknown function. BLAST calls for hypothetical protein.
CDS 7196 - 7537 /note=We looked through HHpred results and found some results connecting to H97, but no compelling evidence was found. NCBI BLAST came up with hypothetical proteins. We'll settle with this for now, but further research may prove useful.
CDS 7540 - 7896 /note=The function list says that SPP1 confirms tail terminator.
CDS 7896 - 8132 /note=No significant homology found by HHpred or NCBI BLAST.
CDS 8145 - 8645 /note=HHpred found compelling structural homology to a "phage-tail like" protein from Listeria, but since this is a bacterial protein and not a phage protein, we declined to note it as evidence. No other significant structural homology predicted. BLAST finds significant sequence homology to other annotated major tail proteins.
CDS 8674 - 9243 /note=No significant structural homology predicted by HHpred. BLAST finds significant sequence homology to tail assembly chaperones of (mostly) Microbacterium phages, all likely annotated on the basis of synteny.
CDS 9267 - 9629 /note=No significant structural homology predicted by HHpred. BLAST finds sequence homology to annotated tail assembly chaperones (TACs), but no BLAST hits to biochemically characterized proteins. /note=We searched extensively for an unambiguous slippery sequence to tie this to the upstream pham, but did not find one.
CDS 9654 - 11921 /note=HHpred calls for a tape measure protein with a high probability, homologous to gp57 of Staphylococcus phage 80alpha and of Bxb1. NCBI BLAST also calls for tape measure protein on all pages.
CDS 11918 - 12691 /note=HHpred showed homology to minor tail proteins, more specifically distal tail proteins. BLAST showed homology to minor tail proteins as well.
CDS 12691 - 15117 /note=Two hits on HHpred, together with NCBI BLAST, predict homology to minor tail proteins, including gp59, the tail protein region, and sialic acid and sugar-binding proteins.
CDS 15117 - 15302 /note=HHpred hits were inconsistent, with numbers not high enough to pursue. NCBI BLAST finds homology to hypothetical proteins.
CDS 15299 - 15895 /note=HHpred numbers did not meet the 90% cutoff, and the hits did not contain any useful data. NCBI BLAST finds homology to hypothetical proteins on all pages.
CDS 15895 - 16374 /note=HHpred shows significant scores to the gp15 protein of a Listeria phage as well as receptor-binding proteins of Lactococcus phage. BLAST showed significant scores for hypothetical proteins.
CDS 16374 - 18509 /note=HHpred shows significant scores to sugar-binding proteins. Interesting scores to levanase hydrolase in Bacillus from query 381 to 556. BLAST shows significant homology to minor tail proteins. Significant scores to chitinase from query 2 to 346. Also good scores for glycoside hydrolase from query 438 to 556.
CDS 18540 - 19346 /note=HHpred finds compelling structural homology to carboxypeptidases, consistent with an endolysin function. We have not identified a lysin B pham, and therefore we are annotating this pham as an endolysin.
CDS 19381 - 19770 /note=DeepTMHMM predicts high likelihood of a membrane-spanning domain. HHpred predicts compelling structural homology to FtsX from Vibrio cholerae.
CDS 19775 - 20116 /note=HHpred shows evidence for holin. DeepTMHMM shows three passes through the membrane.
CDS complement (20172 - 20360) /note=No significant results from HHpred or other tools. Settling on hypothetical protein, no known function.
CDS complement (20424 - 20576) /note=HHpred and NCBI BLAST both call for a hypothetical protein. There was not much useful data.
CDS complement (20591 - 20749) /note=No significant structural or sequence homology to phams of known function predicted by HHpred or BLAST.
CDS complement (20746 - 21588) /note=Interesting HHpred results. From roughly amino acid 24 to 66, very high homology is predicted to an anti-termination protein Q (Shield et al. 2019 Nat. Comm.) and to an inhibitor of the tryptophan RNA-binding attenuator protein. Further along, from about 88 to 187, structural homology to nucleic acid-binding proteins is predicted. Compelling and interesting, but not a clear function. BLAST found little beyond hypothetical proteins other than two ssDNA-binding proteins. Hypothetical protein.
CDS complement (21633 - 21896) /note=No significant structural homology predicted by HHpred. BLAST found significant sequence homology to several phage phams annotated as membrane proteins. DeepTMHMM agrees, predicting a membrane-spanning protein.
CDS complement (21896 - 22105) /note=No significant structural homology predicted by HHpred. BLAST finds significant sequence homology to phage phams annotated as membrane proteins, and DeepTMHMM finds an extensive transmembrane region.
CDS complement (22102 - 23772) /note=HHpred did predict substantive structural homology to the RecA DNA recombinase of E. coli. However, the sequence did not align well to the E. coli RecA sequence. Moreover, the predicted protein for this pham is quite a bit larger than the E. coli RecA, and the structural homology to RecA predicted by HHpred only occurs across the C-terminal half of this pham. Across this same region of Stop 22102, HHpred also finds substantial structural homology to well-characterized DNA helicases. And in the preceding N-terminal region of Stop 22102, HHpred finds substantial structural homology to biochemically characterized primases.
CDS complement (23748 - 24041) /note=HHpred shows structural homology to the VRR-Nuc domain protein in Fan1 from query 1 to 93.
CDS complement (24045 - 24890) /note=HHpred shows structural homology to a DNA-binding protein with an OB fold from query 2 to 132. See Bredienstein et al. (2024) Life Sci Alliance 7.
This paper reports that this protein can also bind double-stranded DNA.
CDS complement (24923 - 25591) /note=HHpred predicts substantial structural homology to a number of proteins that fall under the category of ASCE ATPases. However, we cannot identify with confidence a sequence that is unambiguously either a Walker A or a Walker B motif, and we have looked extensively through several alignments to well-characterized ASCE ATPases. This pham clearly has a relationship to ASCE ATPases, but we are not comfortable calling it an ASCE ATPase. Hypothetical protein.
CDS complement (25588 - 26763) /note=HHpred predicts substantive structural homology between aa 33-365 and aa 791-1045 of the unbound form of the AdnA protein, a well-characterized helicase/exonuclease (see Jia et al. 2019 PNAS 116 p24507). Digging into the supplementary materials of Jia et al., we found that aa 791-1045 of AdnA correspond to the RecB-like nuclease domain, and exclude the preceding helicase domain. The current function list entry for Cas4 exonuclease states: "This family of exonucleases is similar to the exonuclease domain of RecB. The Cas4 label should be used if the gene includes only the exonuclease region." Therefore we are calling a Cas4 exonuclease.
CDS complement (26750 - 28615) /note=HHpred comes up with DNA polymerase I with great scores; what's not to love?
CDS complement (28815 - 29240) /note=HHpred gives no results with good scores. BLAST shows a hit to an ABC transporter, but we aren't totally on board with that; we see no indication that there is biochemical evidence for it.
CDS complement (29411 - 30799) /note=We predict a protein of about 463 amino acids. HHpred predicts substantial structural homology across most of this stretch (generally leaving off the N-terminal 40 aa or so) to... /note=The RecA-like domains 1A and 2A, linker 2 region and Swi2/Snf2-specific domains of E. coli RapA protein (Shaw et al. Structure 16 p1417), an RNA-polymerase recycling protein.
We aligned the predicted protein sequence of Stop-29,411 with an E. coli RecA and Spud gp205 and found no clear Walker A or B sequence motifs, so Stop-29,411 seems not to be a RecA protein or an ASCE ATPase. /note=Amino acids 577 to 1064 of the E. coli Zorb protein, the C-terminal domain of Zorb which has ATPase and nuclease functions (Hu et al., Nature Vol 639 p1093). /note=Amino acids 111 to 490 of the E. coli DEAD-box helicase DbpA, a dsRNA helicase involved in ribosome maturation. /note=So we appear to have a nucleotide-binding ATPase, but beyond that, the evidence is not clear. Stop-29,411 does not appear to be unambiguously a helicase. We are calling hypothetical protein.
CDS complement (30799 - 31392) /note=HHpred shows significant scores for a phosphoesterase. BLAST did as well. Looked at Myllykoski and Kursula's paper, "Structural aspects of nucleotide ligand binding by a bacterial 2H phosphoesterase" (2017).
CDS complement (31389 - 31697) /note=HHpred gives no high-scoring results. NCBI BLAST gives us all hypothetical proteins.
CDS complement (31697 - 31975) /note=HHpred gives us no results above our desired 90%. BLAST gives us all hypothetical proteins.
CDS complement (31975 - 32784) /note=HHpred had a very probable hit on MazG-like nucleotide pyrophosphohydrolase with the query from 99 to 264. HHpred had another hit on a PGDYG protein with the query from 1 to 81.
CDS complement (32781 - 33506) /note=HHpred did not have any solid hits. NCBI BLAST calls for hypothetical protein.
CDS complement (33484 - 34083) /note=HHpred shows hits from query 1 to 170 calling for a thymidylate kinase. No other hits were found.
CDS complement (34096 - 35028) /note=HHpred comes up with a hit for glycosyltransferase that has good scores and a query from 1 to 264.
CDS complement (35021 - 35254) /note=HHpred shows no significant scores. BLAST calls for hypothetical protein.
CDS complement (35317 - 35613) /note=HHpred shows no significant scores.
BLAST shows homology to hypothetical proteins and membrane proteins. DeepTMHMM finds a membrane hit. Calling this a membrane protein based on the DeepTMHMM hit.
CDS complement (35610 - 36377) /note=HHpred has a solid hit on a thymidylate synthase. BLAST also calls for a thymidylate synthase.
CDS complement (36497 - 36727) /note=HHpred predicts no structural homology to genes of known function. BLAST finds no significant sequence homology to genes of known function.
CDS complement (36788 - 37054) /note=HHpred predicts no significant structural homology to proteins of known function. BLAST finds no significant sequence homology to proteins of known function.
CDS complement (37077 - 37343) /note=HHpred shows no significant scores above 90%. BLAST shows no sequence homology to proteins of known function.
CDS complement (37415 - 37717) /note=Interesting top hits on HHpred, all of which seem to be connected to zinc binding. But none of these were listed on the approved function list, so we're going with hypothetical protein.
CDS complement (37809 - 38003) /note=No hits on HHpred. NCBI BLAST calls for a hypothetical protein.
CDS complement (38042 - 38347) /note=No significant HHpred hits, and BLAST calls for a hypothetical protein.
CDS complement (38344 - 38529) /note=HHpred did not have any significant hits, and BLAST calls for a hypothetical protein.
CDS complement (38531 - 39262) /note=No significant HHpred hits, and BLAST calls for a hypothetical protein.
CDS complement (39356 - 39742) /note=No significant HHpred hits, and BLAST calls for a hypothetical protein.
CDS complement (39739 - 40098) /note=HHpred did not show good probability or coverage. BLAST shows hypothetical protein.
CDS complement (40095 - 40277) /note=HHpred showed no significant scores. BLAST shows homology to hypothetical proteins. Calling a hypothetical protein.
CDS 40888 - 41331 /note=HHpred showed no significant scores. BLAST shows homology to hypothetical proteins. Calling this a hypothetical protein.