HESML V1R4 Java software library of ontology-based semantic similarity measures and information content models
2019-07-19T12:17:54Z (GMT) by
HESML V1R4 is the fourth release of the Half-Edge Semantic Measures Library (HESML) detailed in [1], which is a new, linearly scalable and efficient Java software library of ontology-based semantic similarity measures and Information Content (IC) models based on WordNet. HESML V1R4 implements most of the ontology-based semantic similarity measures and IC models based on WordNet reported in the literature, and it supports the evaluation of three pre-trained word embedding models. It also provides an XML-based input file format for specifying reproducible experiments on WordNet-based similarity without any software coding.

HESML V1R4 introduces the following novelties: (1) a software implementation for the evaluation of three pre-trained word embedding file formats, which support most state-of-the-art models reported in the literature; (2) a software implementation of an intrinsic IC model and two new IC-based semantic similarity measures introduced by Cai et al. (2017); (3) a software implementation of a fast approximation of the Wu & Palmer (1994) measure commonly used in the literature; (4) the integration of a very large set of word similarity benchmarks; and (5) the correction of an error in our software implementation of the Leacock & Chodorow (1998) measure in previous HESML versions.

The HESML library is freely distributed for any non-commercial purpose under a CC BY-NC-SA 4.0 license, subject to citing the main HESML paper [1] as an attribution requirement. On the other hand, the commercial use of the similarity measures introduced in [2], as well as part of the intrinsic IC models introduced in [3] and [4], is protected by a patent application [5]. In addition, any user of HESML must fulfill the other licensing terms described in [1], which relate to other resources distributed with the library.

References:
[1] Lastra-Díaz, J. J., García-Serrano, A., Batet, M., Fernández, M., & Chirigati, F. (2017). HESML: a scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. Information Systems, 66, 97–118.
[2] Lastra-Díaz, J. J., & García-Serrano, A. (2015). A novel family of IC-based similarity measures with a detailed experimental survey on WordNet. Engineering Applications of Artificial Intelligence Journal, 46, 140–153.
[3] Lastra-Díaz, J. J., & García-Serrano, A. (2015). A new family of information content models with an experimental survey on WordNet. Knowledge-Based Systems, 89, 509–526.
[4] Lastra-Díaz, J. J., & García-Serrano, A. (2016). A refinement of the well-founded Information Content models with a very detailed experimental survey on WordNet. Universidad Nacional de Educación a Distancia (UNED).
[5] Lastra Díaz, J. J., & García Serrano, A. (2016). System and method for the indexing and retrieval of semantically annotated data using an ontology-based information retrieval model. USPTO App, US2016/0179945 A1.
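
For readers unfamiliar with the two path-based measures named in the description, the following is a minimal Java sketch of the textbook definitions of the Wu & Palmer (1994) and Leacock & Chodorow (1998) measures on a tiny hand-made taxonomy. It does not use the HESML API and is not the fast approximation shipped with the library. The class name, method names and toy concepts are hypothetical, and the sketch assumes a tree-shaped taxonomy (single parent per concept), depths and path lengths counted in nodes, and the natural logarithm. WordNet itself allows multiple parents, so a real implementation needs a more general ancestor search.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of the HESML library.
class TaxonomySimilaritySketch {
    // child -> parent links of a tree-shaped taxonomy (the root has no entry).
    private final Map<String, String> parent = new HashMap<>();

    void addEdge(String child, String parentNode) {
        parent.put(child, parentNode);
    }

    // Depth counted in nodes, so the root has depth 1.
    int depth(String node) {
        int d = 1;
        for (String p = parent.get(node); p != null; p = parent.get(p)) {
            d++;
        }
        return d;
    }

    // Lowest common subsumer: the deepest concept that is an ancestor of both.
    String lcs(String a, String b) {
        Set<String> ancestorsOfA = new HashSet<>();
        for (String n = a; n != null; n = parent.get(n)) {
            ancestorsOfA.add(n);
        }
        for (String n = b; n != null; n = parent.get(n)) {
            if (ancestorsOfA.contains(n)) {
                return n;
            }
        }
        throw new IllegalArgumentException("Concepts share no ancestor");
    }

    // Number of nodes on the path from a to b through their common subsumer.
    int pathLengthInNodes(String a, String b) {
        return depth(a) + depth(b) - 2 * depth(lcs(a, b)) + 1;
    }

    // Wu & Palmer: 2 * depth(lcs) / (depth(a) + depth(b)).
    double wuPalmer(String a, String b) {
        return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b));
    }

    // Leacock & Chodorow: -ln(pathLength / (2 * maxDepth)).
    double leacockChodorow(String a, String b, int maxDepth) {
        return -Math.log(pathLengthInNodes(a, b) / (2.0 * maxDepth));
    }

    // Builds the toy taxonomy: entity -> {animal -> {dog, cat}, plant}.
    static TaxonomySimilaritySketch demo() {
        TaxonomySimilaritySketch t = new TaxonomySimilaritySketch();
        t.addEdge("animal", "entity");
        t.addEdge("plant", "entity");
        t.addEdge("dog", "animal");
        t.addEdge("cat", "animal");
        return t;
    }

    public static void main(String[] args) {
        TaxonomySimilaritySketch t = demo();
        int maxDepth = 3; // deepest node of the toy taxonomy
        System.out.println("WuPalmer(dog, cat)   = " + t.wuPalmer("dog", "cat"));
        System.out.println("WuPalmer(dog, plant) = " + t.wuPalmer("dog", "plant"));
        System.out.println("LeacockChodorow(dog, cat) = "
                + t.leacockChodorow("dog", "cat", maxDepth));
    }
}
```

In this toy taxonomy "dog" and "cat" share the subsumer "animal" at depth 2, so the Wu & Palmer score is 2·2/(3+3), while "dog" and "plant" only share the root and score 2·1/(3+2). Counting depth and path length in nodes rather than edges is one common convention; other conventions change the numeric values but not the ranking on a tree.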