Implementation of Fine-Tuned BERT for Enzyme Classification Based on Gene Ontology

EasyChair Preprint 14723

6 pages. Date: September 5, 2024

Abstract

Enzymes are biocatalysts with vital roles in biological functions and many industrial applications. The diversity of enzymes, which are classified using the Enzyme Commission (EC) nomenclature, makes differentiating them challenging. Another source of biological information, gene ontology (GO), describes the biological aspects of enzymes: the related biological processes (BP), molecular functions (MF), and locations within the cell (cellular component, CC). This study proposes a novel classification of enzymes into EC classes and subclasses within each ontology subclass, based on their GO semantics, using Bidirectional Encoder Representations from Transformers (BERT). The BERT model is first fine-tuned on the preprocessed GO term names and definitions. The enzymes in each ontology class (BP, MF, or CC) are further divided by how the GO terms were assigned: through manual annotation (NONIEA) or electronic inference (IEA). During fine-tuning, BERT obtained F1 scores of 0.93, 0.60, 0.99, 0.90, 0.40, and 0.35 for BP IEA, BP NONIEA, MF IEA, MF NONIEA, CC IEA, and CC NONIEA, respectively. On the test set, the fine-tuned BERT significantly outperformed GOntoSim in EC class classification across all metrics, with less inference time, in all ontology subclasses. Extended to the EC subclass level, BERT can classify enzymes in the BP IEA and MF IEA ontology subclasses, although more fine-tuning epochs are needed. These results show that the names and definitions of GO terms are distinguishing features for classifying enzymes and offer an alternative to the information content approach.
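
The data preparation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record layout, the toy GO names and definitions, the lowercasing step, and the helper names `subset_key` and `build_examples` are all assumptions made for the example. It groups each enzyme's GO term names and definitions into one text per ontology subclass (for example "MF IEA") and derives the EC class label from the first digit of the EC number.

```python
from collections import defaultdict

# Toy annotation records (hypothetical, for illustration only):
# (enzyme_id, ontology, evidence_code, go_name, go_definition, ec_number)
ANNOTATIONS = [
    ("enzyme_a", "MF", "IEA", "Toy oxidoreductase activity", "Toy definition one.", "1.1.1.1"),
    ("enzyme_a", "BP", "IDA", "Toy metabolic process", "Toy definition two.", "1.1.1.1"),
    ("enzyme_b", "MF", "IEA", "Toy hydrolase activity", "Toy definition three.", "3.4.21.4"),
]


def subset_key(ontology, evidence_code):
    """Map an annotation to an ontology subclass such as 'MF IEA' or 'BP NONIEA'.

    Assumption: every evidence code other than IEA counts as NONIEA.
    """
    evidence_group = "IEA" if evidence_code == "IEA" else "NONIEA"
    return f"{ontology} {evidence_group}"


def build_examples(annotations):
    """Return {subset: [(enzyme_id, text, ec_class), ...]}.

    The text joins 'name: definition' for every GO term of the enzyme in that
    subset. Lowercasing stands in for the unspecified preprocessing step.
    """
    texts = defaultdict(list)
    ec_class = {}
    for enzyme_id, ontology, evidence, name, definition, ec_number in annotations:
        key = (subset_key(ontology, evidence), enzyme_id)
        texts[key].append(f"{name}: {definition}".lower())
        # The EC class is the first field of the EC number (e.g. 3 in 3.4.21.4).
        ec_class[enzyme_id] = int(ec_number.split(".")[0])

    subsets = defaultdict(list)
    for (subset, enzyme_id), parts in texts.items():
        subsets[subset].append((enzyme_id, " ".join(parts), ec_class[enzyme_id]))
    return dict(subsets)


if __name__ == "__main__":
    for subset, examples in build_examples(ANNOTATIONS).items():
        print(subset, examples)
```

Each resulting list of (text, label) pairs could then be tokenized and passed to a sequence classification model, with one model fine-tuned per ontology subclass. For EC subclass prediction, the label would presumably use the first two fields of the EC number instead of the first.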

Keyphrases: BERT, GOntoSim, Gene Ontology, enzyme classification, fine-tuning

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:14723,
  author    = {Matthew Martianus Henry and Christian Kenneth and Bens Pardamean},
  title     = {Implementation of Fine-Tuned BERT for Enzyme Classification Based on Gene Ontology},
  howpublished = {EasyChair Preprint 14723},
  year      = {EasyChair, 2024}}