Looking for lipases and lipolytic organisms in low-temperature anaerobic reactors treating domestic wastewater

Poor lipid degradation limits low-temperature anaerobic treatment of domestic wastewater even when psychrophiles are used. We combined metagenomics and metaproteomics to find lipolytic bacteria and their potential, and actual, cold-adapted extracellular lipases in anaerobic membrane bioreactors treating domestic wastewater at 4°C and 15°C. Of the 40 recovered putative lipolytic metagenome-assembled genomes (MAGs), only three (Chlorobium, Desulfobacter, and Mycolicibacterium) were common and abundant (relative abundance ≥ 1%) in all reactors. Notably, some MAGs that represented aerobic autotrophs contained lipases. Therefore, we hypothesised that the lipases we found are not always associated with exogenous lipid degradation and can have other roles such as polyhydroxyalkanoate (PHA) accumulation/degradation and interference with the outer membranes of other bacteria. Metaproteomics did not provide sufficient proteome coverage for relatively low-abundance proteins such as lipases, although the expression of fadL genes, which encode long-chain fatty acid transporters, was confirmed for four genera (Dechloromonas, Azoarcus, Aeromonas and Sulfurimonas), none of which were recovered as putative lipolytic MAGs. Metaproteomics also confirmed the presence of 15 relatively abundant (≥ 1%) genera in all reactors, of which at least 6 can potentially accumulate lipids/polyhydroxyalkanoates. For most putative lipolytic MAGs, there was no statistically significant correlation between the read abundance and reactor conditions such as temperature, phase (biofilm and bulk liquid), and feed type (treated by ultraviolet light or not). Results obtained by metagenomics and metaproteomics did not confirm each other, and further work is required to identify the true lipid degraders in these systems.

redundancy and annotation errors using CD-HIT and Notepad++. The database search against the cleaned, metagenomics-constructed database was then performed using a two-round search strategy. The initial search allowed 50 ppm parent ion and 0.1 Da fragment mass error tolerance, with carbamidomethylation as a fixed modification. Protein matches of the initial search with a -10lgP protein score greater than or equal to 20 were collected, which resulted in a preliminary search output of 11,814 protein groups. The second-round search, using the refined database from the first-round search, allowed up to 3 missed cleavages, 50 ppm parent ion and 0.1 Da fragment mass error tolerance, carbamidomethylation as a fixed modification, and oxidation and deamidation as variable modifications, and employed a decoy fusion database for determining false discovery rates. Peptide spectrum matches were filtered against 1% or 5% false discovery rates (FDR), and protein identifications with 2 or more unique peptides across the group were considered significant matches. Processing of metadata was done using MATLAB 2017b.
Additional taxonomic and KEGG number annotations were performed using GhostKOALA v2.2. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD028388.
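The filtering logic described above (an FDR cutoff on peptide spectrum matches, followed by a requirement of at least two unique peptides per protein group) can be illustrated with a minimal Python sketch. This is a generic target-decoy estimate, not the decoy-fusion procedure of the search software used in the study; the `PSM` record, its fields, and both function names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (all fields are assumptions)."""
    peptide: str
    protein_group: str
    score: float      # a -10lgP-style score; higher is better
    is_decoy: bool    # True if the match is to a decoy sequence
    is_unique: bool   # True if the peptide maps to a single protein group


def score_threshold_at_fdr(psms: Iterable[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR (decoys / targets) is <= max_fdr.

    Simplified: score ties are not treated specially.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = None
    for psm in ranked:
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = psm.score
    return cutoff


def significant_protein_groups(psms: Iterable[PSM], max_fdr: float = 0.01,
                               min_unique_peptides: int = 2) -> Set[str]:
    """Keep target protein groups supported by enough unique peptides above the cutoff."""
    psms = list(psms)
    cutoff = score_threshold_at_fdr(psms, max_fdr)
    if cutoff is None:
        return set()
    unique_peptides = {}
    for psm in psms:
        if psm.is_decoy or psm.score < cutoff or not psm.is_unique:
            continue
        unique_peptides.setdefault(psm.protein_group, set()).add(psm.peptide)
    return {group for group, peptides in unique_peptides.items()
            if len(peptides) >= min_unique_peptides}
```

Calling `significant_protein_groups(psms, max_fdr=0.05)` instead of the default 1% relaxes the cutoff in the same way as the two FDR levels reported here; the unique-peptide requirement is applied identically at both levels.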

Reads, contigs, MAGs
The highest and lowest numbers of reads belonged to the liquid phase of the sterile feed at 4℃ (100 million) and 15℃ (64 million), respectively (Supplementary File 1, Table S2). There was no statistically significant difference in the number of reads between the biofilm and the liquid (Tukey pairwise comparison; P-value = 0.402). We found about 1 million (M) contigs with a total length of nearly 1.5 billion base pairs (bp). The largest contig was about 1 Mbp long (Supplementary File 1, Table S3). We recovered 1519 MAGs. However, only 40 MAGs had at least one putative lipase gene and met the accepted quality threshold to be selected as putative lipolytic MAGs (Supplementary File 1, Table S4).

Lipolytic potential: whole metagenome vs MAGs
Within the metagenomic data, a total of 903 sequences had putative lipolytic activity (EC number 3.1.1.3), of which only 78 were present in putative lipolytic MAGs. By contrast, there were numerous genes coding for the extracellular enzymes that degrade proteins, carbohydrates, short-chain lipids, and phosphates in both the whole metagenome and the putative lipolytic MAGs (Table 1).
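The comparison between lipase annotations in the whole metagenome and those falling inside selected MAGs amounts to a simple tally. The following Python sketch shows one way to compute it; the `Annotation` record and the function names are hypothetical, and the input format is an assumption rather than the format of the annotation output used in the study.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional

LIPASE_EC = "3.1.1.3"  # EC number for triacylglycerol lipase


class Annotation(NamedTuple):
    """Hypothetical gene annotation record (fields are assumptions)."""
    gene_id: str
    ec_number: str
    mag_id: Optional[str]  # None if the gene is not in a selected MAG


def lipase_counts(annotations: Iterable[Annotation]) -> Counter:
    """Count putative lipase genes overall and within selected MAGs."""
    counts = Counter()
    for ann in annotations:
        if ann.ec_number != LIPASE_EC:
            continue
        counts["whole_metagenome"] += 1
        if ann.mag_id is not None:
            counts["in_mags"] += 1
    return counts


def share_in_mags(counts: Counter) -> float:
    """Fraction of all putative lipase genes that fall inside selected MAGs."""
    total = counts["whole_metagenome"]
    return counts["in_mags"] / total if total else 0.0
```

With the counts reported here (78 of 903 sequences), `share_in_mags` gives roughly 0.086, i.e. fewer than one in ten putative lipase sequences were captured in the putative lipolytic MAGs.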
The three most frequently annotated genes for degrading sugars in the whole metagenome were β-galactosidase, β-glucosidase, and α-galactosidase. However, in the putative lipolytic MAGs, β-glucosidase, β-hexosaminidase, cellulase, α-amylase, α-galactosidase, and endo-β-xylanase were the most annotated. The large difference in gene numbers may indicate that cells have various alternative gene regulatory systems for expressing the genes involved in degrading sugars rather than lipases. Bacteria have a global regulatory mechanism known as carbon catabolite repression (CCR). In the presence of easily accessible carbon sources like sugars, CCR inhibits the expression of genes that allow cells to use a secondary carbon source (Görke and Stülke 2008). One of the key genes in this process is the catabolite repression resistance gene, known as the phosphotransferase system sugar-specific EII component (PTS-EII) or putative sugar kinases. These genes were present in all putative lipolytic MAGs (Supplementary File 1, Table S5). The CCR regulatory system for selecting the most suitable carbon source is aligned with economic theories (Allison and Vitousek 2005). In the presence of simple substrates, cells do not invest carbon and nitrogen in producing extracellular enzymes that decompose complex substrates. However, where carbon and nitrogen resources exist in complex form, producing the relevant enzymes becomes inexpensive (Allison and Vitousek 2005). In the case of lipases, where glucose is abundant, CCR represses lipase production (Boekema et al. 2007). In addition, the expression of proteases affects lipase production (Andersson 1980, Black and DiRusso 2003). In Bacillus subtilis, for example, the accumulation of amino acids induced the cells to produce more proteases and repress lipase expression.

Which are the lipolytic MAGs and why do they have lipases?
Putative lipolytic MAGs belonged to 14 distinct phyla (mostly from the Actinobacteria, Proteobacteria and Bacteroidota), with two unclassified at the phylum level (Figure 1). Some of the putative lipase genes were found in genera we did not expect to be lipolytic or  Table S6). The fourth possibility is that these are mis-assemblies or mis-annotations. Even high-quality MAGs can be subject to these misinterpretations.
We could not label any MAG with certainty as a possible or true lipid degrader due to both   Table S6). Therefore, we assumed that Bin 22 might use the lipase for degrading the PHA produced by other bacteria from these two phyla for denitrification.
Similarly, the other 10 MAGs from several phyla that only had denitrification and lipase genes (no PHA synthesising genes) might use the lipase for degrading the PHA produced by others.

Linking the putative lipolytic MAGs to the reactor conditions and lipases
For most putative lipolytic MAGs, the number of mapped reads did not vary significantly across reactor conditions. However, for a few MAGs, statistically significant differences were observed (Supplementary File 1, Table S8 and Figure S1-S3  Table S9).

Abundance of the putative lipolytic MAGs in the reactors
There were 32 common and abundant (relative abundance ≥ 1%) bacterial genera in at least one reactor condition (Figure 3).
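The abundance criterion used here (relative abundance ≥ 1% in at least one reactor condition, or in all conditions for the stricter case) can be expressed as a short filter. The sketch below is in Python and assumes a hypothetical nested mapping of genus to per-condition relative abundance in percent; the function name and data layout are illustrative only.

```python
from typing import Dict, List


def abundant_genera(rel_abundance: Dict[str, Dict[str, float]],
                    threshold: float = 1.0,
                    require_all: bool = False) -> List[str]:
    """Return genera whose relative abundance (%) meets the threshold.

    rel_abundance maps genus -> {reactor condition -> relative abundance in %}.
    With require_all=False a genus qualifies if any condition meets the
    threshold; with require_all=True every condition must meet it.
    """
    pick = all if require_all else any
    return sorted(
        genus
        for genus, by_condition in rel_abundance.items()
        if by_condition and pick(value >= threshold for value in by_condition.values())
    )
```

Setting `require_all=True` corresponds to the stricter "in all reactors" criterion, while the default corresponds to "in at least one reactor condition".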

Expressed proteins
A total of 93 and 117 distinct protein groups were found at false discovery rates (FDR) of 1% and 5%, respectively, as listed in Supplementary File 1, Table S10, using the complete metagenomics-constructed database. At FDR 5%, there were 24 new protein groups compared to FDR 1%, though neither the new nor the common hits were significantly different (P-value = 0.514, one-way ANOVA). None of the hits were lipases, and none were hydrolytic enzymes of any kind. About 75% of the identified proteins were involved in genetic information processing, signalling and cellular processes, environmental information processing, and energy metabolism. A further 4%, 2%, and 1% of the proteins were related to carbohydrate, amino acid, and lipid metabolism, respectively (Figure 5).
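The reported counts are consistent with the 1% FDR set being nested within the 5% FDR set (117 − 93 = 24 new protein groups). A minimal Python sketch of this comparison, and of the per-category percentages, is given below; the function names and the input shapes (sets of protein group identifiers, a list of category labels) are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def compare_fdr_sets(groups_fdr1: Set[str],
                     groups_fdr5: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (protein groups common to both FDR levels, groups new at 5% FDR)."""
    common = groups_fdr1 & groups_fdr5
    new_at_5 = groups_fdr5 - groups_fdr1
    return common, new_at_5


def category_percentages(categories: Iterable[str]) -> Dict[str, float]:
    """Percentage of identified proteins per functional category label."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}
```

Here `categories` would hold one functional label per identified protein, for example the top-level KEGG category assigned during annotation.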
In terms of class, outer membrane porin proteins (omp32) outnumbered the rest of the classes

Taxonomical distribution of identified proteins by metaproteomics
About 97% of the identified expressed genes (at FDR 5%) belonged to the bacterial domain and came from at least 19 distinct classes (Figure 6a), among which Betaproteobacteria had the greatest share (57%). The top-ranked genera with expressed proteins were all from the class Betaproteobacteria, namely Paucimonas, Dechloromonas, Acidovorax, Azoarcus and Thauera (Figure 6b). The full list of genera associated with the expressed proteins is presented in Supplementary File 1, Table S11. Among the top-ranked genera, all except Paucimonas had previously been identified by GOTTCHA2, and their relative abundance in each reactor was known (Figure 7). By comparison, Azoarcus was the only low-abundance genus, with zero abundance at Non-sterile-4℃.
Although Paucimonas was absent from the reactors based on GOTTCHA2, it had the highest number of protein matches (Supplementary File 1, Table S12), most of them being ribosomal proteins or proteins involved in energy metabolism. One porin and one outer membrane protein were present too.  Figure S5). Additionally, both Dechloromonas and Azoarcus had fadL and thus were potentially lipolytic. Expressed fadL was also found in two other genera (not among the top-ranked), Aeromonas and Sulfurimonas. Their relative abundance is presented in Figure 7. Azoarcus and Sulfurimonas were low-abundance in all conditions, whereas Aeromonas had a higher relative abundance (~1%) at Non-sterile-15℃.
The expression of fadL in Dechloromonas, Azoarcus, Aeromonas and Sulfurimonas might imply the presence of long-chain fatty acids in the system and can therefore be a proxy for lipolysis performed by these genera or others. However, none of these four genera were recovered as putative lipolytic MAGs by metagenomics. The absence of lipases along with the presence of fadL genes in a genome might be indicative of cheating mechanisms. Nonetheless, the complete genomes of these four genera in NCBI had both the fadL and lipase genes. While this might remove the "cheating label" from these genera, it does not necessarily make them true lipase producers either. We do not know whether fadL and lipases are coregulated, but we do know that both can be exported through extracellular vesicles in Gram-negative and Gram-positive bacteria (Galka et al. 2008, Lee et al. 2009, Lee et al. 2016). The presence of both fadL and lipases in the extracellular vesicles might have an entirely different reason than the lipolysis of exogenous lipid molecules. For instance, Galka et al. (2008) have shown that pathogens transport lipases as a virulence factor through extracellular vesicles to attack the lipidic membrane of the host cell and deliver lipids to them. The same scenario might apply to interactions between bacterial cells, but no study has shown this yet.

Identified proteins of abundant genera
Out of the 32 common bacterial genera with relative abundance of more than 1% (Figure 3), metaproteomics identified proteins expressed by 15 of them (Table 2). More than half (55%) of the proteins were outer membrane proteins and porins. Some of these genera accumulate lipids, e.g., polyhydroxyalkanoates (PHAs), or perform denitrification. Lipid accumulation is a barrier to lipid degradation in wastewater systems (Chipasa and Mędrzycka 2008), and cold temperature is a stimulator of PHA accumulation (Srivastava et al. 2020).
At least six of the identified genera (Acinetobacter, Cloacibacterium, Dechloromonas, Rhodopseudomonas, Thauera, and Thermomonas) are involved in PHA accumulation (Carlozzi and Sacchi 2001, Coats et al. 2016, Hauschild et al. 2017, Oshiki et al. 2008, Ram et al. 2018, Singleton et al. 2021). Employing high-resolution mass spectrometers, along with developing better tools for identifying the mass spectra and matching them to protein groups, can also enhance metaproteomics data analysis. Finally, the dependency of metaproteomics on metagenomics databases can be reduced through de novo approaches.