
Automatic Extension of Gene Ontology with Flexible Identification of Candidate Terms


The Gene Ontology (GO) has been manually developed to provide a controlled vocabulary for gene product attributes. It continues to evolve with new concepts, most of which are composed from existing concepts. Given the relatively slow growth of GO compared with the rapid accumulation of biological data, an automatic means of predicting new concepts from existing ones is highly desirable.

We present a novel method that predicts more detailed concepts by utilizing syntactic relations among the existing concepts. We propose a validation measure for the automatically predicted concepts that matches the concepts against biomedical articles. We also suggest how to find a suitable direction for extending a constantly growing ontology such as GO.
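As a rough illustration of the idea, the following Python sketch proposes more specific terms by substituting a child term for its parent wherever the parent occurs inside a longer existing term, and then checks how many proposals appear in a set of texts. It is a minimal sketch under stated assumptions, not the method described in the publications: the substitution rule, the function names (propose_candidates, support_ratio), the toy term set, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, Iterable, List, Set

# Toy is-a hierarchy (assumption): parent term -> more specific child terms.
IS_A_CHILDREN: Dict[str, List[str]] = {
    "cell growth": ["unidimensional cell growth"],
    "transport": ["ion transport"],
}

# Toy set of existing terms (assumption, not a real GO snapshot).
EXISTING_TERMS: Set[str] = {
    "cell growth",
    "unidimensional cell growth",
    "regulation of cell growth",
    "transport",
    "ion transport",
    "regulation of transport",
    "regulation of ion transport",
}


def propose_candidates(terms: Set[str], children: Dict[str, List[str]]) -> Set[str]:
    """Propose new terms by replacing an embedded parent term with a child term."""
    candidates: Set[str] = set()
    for term in sorted(terms):
        for parent, kids in children.items():
            pattern = r"\b" + re.escape(parent) + r"\b"
            # Only consider longer terms that embed the parent as whole words.
            if term == parent or not re.search(pattern, term):
                continue
            for child in kids:
                # Skip terms that already contain the more specific child.
                if child in term:
                    continue
                candidate = re.sub(pattern, lambda _m: child, term, count=1)
                if candidate not in terms:
                    candidates.add(candidate)
    return candidates


def support_ratio(candidates: Iterable[str], documents: Iterable[str]) -> float:
    """Fraction of candidates found verbatim (case-insensitive) in any document."""
    candidate_list = list(candidates)
    if not candidate_list:
        return 0.0
    lowered = [doc.lower() for doc in documents]
    found = sum(
        1 for cand in candidate_list if any(cand.lower() in doc for doc in lowered)
    )
    return found / len(candidate_list)


if __name__ == "__main__":
    proposed = propose_candidates(EXISTING_TERMS, IS_A_CHILDREN)
    # Invented sample sentence standing in for a biomedical abstract.
    texts = ["We studied the regulation of unidimensional cell growth in root hairs."]
    print(sorted(proposed))
    print(support_ratio(proposed, texts))
```

Exact substring matching is the simplest possible stand-in for matching concepts to articles; a realistic validation would likely need to tolerate word-order and morphological variation in the text.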

    Extension of GO
         Extended GO from the version of June 2004
    Validation of the extended GO
         Test 1 (evaluation for 55 concepts)
         Baseline test 1 (evaluation for 55 concepts)
         Test 2 (evaluation for 50 concepts)
         Baseline test 2 (evaluation for 50 concepts)


  1. Jin-Bok Lee, Jung-jae Kim, and Jong C. Park, Automatic extension of Gene Ontology with flexible identification of candidate terms, Bioinformatics, 22(6): 665-670, 2006. (abstract, pdf)
  2. Jin-Bok Lee, Jung-jae Kim, and Jong C. Park, Induced Extension of Gene Ontology from Biomedical Resources with Flexible Identification of Candidate Terms, Proc. of the First International Symposium on Semantic Mining in Biomedicine (SMBM), Hinxton, Cambridgeshire, UK, 10th-13th April, 2005. [Electronic CEUR proceedings]
  3. Jung-jae Kim, Jin-Bok Lee, Hye-Jin Min, Ji-yong Jung, and Jong C. Park, Logical Representation of Ontological Terminologies in Biomedical Domain, Proc. of the 2nd Annual Conference of The Korean Society for Bioinformatics (KSBI 2003), pp. 79-85, Daejeon, Korea, 2003. (In Korean) [pdf]
  4. Jin-Bok Lee and Jong C. Park, Automatic Gene Ontology Extension and Terminology Analysis, Proceedings of the KISS Conference, Suwon, Korea, October 2002. (In Korean) [pdf]
  5. Jin-Bok Lee and Jong C. Park, Text Data Mining for Automatic Gene Ontology Extension, Intelligent Systems for Molecular Biology (ISMB), Proceedings of the second meeting of the special interest group on Text Data Mining, pp. 22-25, Edmonton, Alberta, Canada, August 2002. [pdf]
  6. Jin-Bok Lee, Jung-jae Kim, and Jong C. Park, Semi-Automatic Extension of Gene Ontology, Human Computer Interaction (HCI) Workshop, Phoenix Park, Korea, January 2002. (In Korean) [pdf]


Page maintained by Jin-Bok Lee
Last modified: March 23, 2006