HESML V1R2 Java software library of ontology-based semantic similarity measures and information content models
2019-07-18T13:29:46Z (GMT)
HESML V1R2 is the second release of the Half-Edge Semantic Measures Library (HESML), a scalable and efficient Java software library of ontology-based semantic similarity measures and Information Content (IC) models based on WordNet. It implements most of the WordNet-based semantic similarity measures and IC models reported in the literature. In addition, it provides an XML-based input file format for specifying reproducible experiments on WordNet-based similarity without writing any code. The V1R2 release significantly improves on the performance of HESML V1R1. HESML is introduced and detailed in a companion reproducibility paper [1] covering the methods and experiments introduced in [2,3,4].

The main features of HESML are as follows: (1) it is based on PosetHERep, an efficient and linearly scalable representation for taxonomies introduced in [1]; (2) its performance scales linearly with the size of the taxonomy; and (3) it does not use any caching strategy for vertex sets.

HESML V1R2 is freely distributed for any non-commercial purpose under a CC BY-NC-SA 4.0 license, subject to citing the main HESML paper [1] as an attribution requirement. On the other hand, the commercial use of the similarity measures introduced in [2], as well as part of the intrinsic IC models introduced in [3] and [4], is protected by a patent application [5]. In addition, any user of HESML must fulfill the other licensing terms described in [1], which relate to other resources distributed with the library, such as WordNet and a dataset of corpus-based IC models, among others.

References:

[1] Lastra-Díaz, J. J., & García-Serrano, A. (2016). HESML: a scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. To appear in Information Systems Journal.

[2] Lastra-Díaz, J. J., & García-Serrano, A. (2015). A novel family of IC-based similarity measures with a detailed experimental survey on WordNet. Engineering Applications of Artificial Intelligence Journal, 46, 140–153.

[3] Lastra-Díaz, J. J., & García-Serrano, A. (2015). A new family of information content models with an experimental survey on WordNet. Knowledge-Based Systems, 89, 509–526.

[4] Lastra-Díaz, J. J., & García-Serrano, A. (2016). A refinement of the well-founded Information Content models with a very detailed experimental survey on WordNet. Universidad Nacional de Educación a Distancia (UNED).

[5] Lastra-Díaz, J. J., & García-Serrano, A. (2016). System and method for the indexing and retrieval of semantically annotated data using an ontology-based information retrieval model. United States Patent and Trademark Office (USPTO) Application, US2016/0179945 A1.