The limitations of the current protein classification tools in identifying lipolytic features in putative bacterial lipase sequences

NU author(s): Reihaneh Bashiri, Professor Thomas Curtis, Dr Dana Ofiteru



This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


© 2022 The Authors. Metagenomic sequencing has generated millions of new protein sequences, most of them with unknown functions. A relatively quick first step for function assignment is to use the existing public protein databases and their scanning tools. However, to date these tools cannot identify all sequence features, such as conserved motifs or patterns. In this study we evaluated the capability of several public protein databases (e.g., InterPro, PROSITE, ESTHER, Pfam, AlphaFold) and their scanning tools to identify lipolytic features in 78 putative cold-adapted bacterial lipase sequences. Novel lipases that can tolerate extreme conditions have great biotechnological importance. We obtained the putative cold-adapted lipolytic sequences from a metagenomic study of an anaerobic psychrophilic microbial community treating domestic wastewater at 4 and 15 ℃. Both newer and conventional protein classifiers failed to find lipolytic features for most of the putative lipases. InterProScan predicted lipase family membership for only 18 of the putative lipase sequences. For more than half of them (41 out of 78), InterProScan could not predict any protein family membership, let alone find lipolytic features. However, when the Lipase Engineering Database and AlphaFold were used, half of those sequences were classified. Conventional databases like PROSITE found lipolytic patterns in 9 of the putative lipolytic sequences, of which only one was identified by InterProScan as a lipase. Moreover, different scanning tools made different and inconsistent predictions for a given putative lipase sequence. Even InterProScan, which integrates predictions from 13 protein member databases, did not reach a consensus prediction for a given lipase sequence. Our study shows that there is a lack of information in public protein databases about bacterial lipase sequences, and this limits their lipolytic feature prediction and biotechnological application. The integration of AlphaFold within InterPro can significantly improve lipase identification and classification.

Publication metadata

Author(s): Bashiri R, Curtis TP, Ofiteru ID

Publication type: Article

Publication status: Published

Journal: Journal of Biotechnology

Year: 2022

Volume: 351

Pages: 30-37

Print publication date: 10/06/2022

Online publication date: 04/05/2022

Acceptance date: 26/04/2022

Date deposited: 23/05/2022

ISSN (print): 0168-1656

ISSN (electronic): 1873-4863

Publisher: Elsevier BV


DOI: 10.1016/j.jbiotec.2022.04.011

