International Journal of Metadata, Semantics and Ontologies (9 papers in press)
A framework for a semantic model based on ontology generation for E-government applications: case study of the Tunisian government
by Hatem Ben Sta
Abstract: A number of developed countries employ ontology in E-government projects. On the one hand, E-government is gaining more and more acceptance in Tunisia; on the other hand, the Web Ontology Language (OWL) standard is increasingly being used to build E-government service ontologies that are integrated and interoperable in an E-government environment. However, current works employing OWL ontologies in E-government are more directed to the Semantic Web audience than to the broader E-government community. Furthermore, only a few of these works provide detailed guidelines for constructing OWL ontologies from a business domain. This paper presents a framework for generating semantic model ontologies in OWL syntax from a government service domain. Firstly, the government service domain is analysed and a domain ontology is constructed to capture its semantic content. Thereafter, a semiformal representation of the domain ontology is created with the ontology knowledge-base editor Protégé. Finally, the OWL ontology model is imported. This study aims at providing E-government developers, particularly those from the developing world, with an easy-to-use framework for practising semantic knowledge representation in E-government processes, thus facilitating the design of E-government systems that can be easily integrated and maintained. The other challenge when integrating information systems in any domain such as e-government is the challenge of interoperability. One can distinguish between three aspects of interoperability: technical, semantic, and organisational. The technical aspect has been widely tackled, especially after the ubiquity of internet technologies. The semantic and organisational aspects deal with sharing the same understanding (semantics) of exchanged information among all applications and services, in addition to modelling and re-engineering governmental processes to facilitate process cooperation that provides seamless e-government services.
In this paper, we present our interoperability framework as a use case of ontology in e-government (i.e., data and process governance) to tackle the issues of semantic and organisational interoperability. The methodology followed resulted in a success story within a very short time and has produced a framework that is intuitive, elegant, and easy to understand and implement.
Keywords: e-government; ontology; OWL; interoperability; data integration; data governance; process governance; business process modelling; Protégé and software engineering.
Expert system for coffee rust detection based on supervised learning and graph pattern matching
by Emmanuel Lasso, Thiago Toshiyuki Thamada, Carlos Alberto Alves Meira, Juan Carlos Corrales
Abstract: Diseases in agricultural production systems represent one of the main reasons for losses and poor quality products. For coffee production, experts in this area suggest that weather conditions and crop physical properties are the main variables that determine the development of coffee rust. This paper proposes an extraction of rules to detect coffee rust from induction of decision trees and expert knowledge. In order to obtain a model with greater expressiveness and interpretability, a graph-based representation is proposed. The extracted rules are evaluated using an expert system supported by graph pattern matching.
Keywords: graph; graph matching; decision tree; rules; plant disease; hemileia vastatrix; agriculture.
Semantic association rule mining in text using domain ontology
by Ibukun Afolabi, Olaperi Sowunmi, Olawande Daramola
Abstract: Online news websites are now valuable archives for both current and old news regarding various issues, particularly those that relate to the political and historical contexts of a country. These news platforms have become an important medium for all forms of political activities, such as branding, campaigns, and communication. Online newspapers make large volumes of textual data available, which are rich in political and historical inferences that can be leveraged for national development. In this paper, we report a procedure for ontology-based association rule mining for knowledge extraction from text. Ordinarily, association rule mining algorithms have the limitations of generating many non-interesting rules, huge numbers of discovered rules, and low algorithm performance. This research demonstrates a procedure for improving the performance of association rule mining in text mining by using domain ontology. To do this, a study context of Nigerian politics based on information extracted from a Nigerian online newspaper was selected, and a methodology that combined natural language processing methods, ontology-based keyword extraction, and the modified Generating Association Rules based on Weighting scheme (GARW) was applied. The results obtained from the study revealed that, compared with non-ontology-based association rule mining approaches, our procedure provides a significant reduction in the number of generated rules and produces rules that are more semantically related to the problem context. The study validates the capability of domain ontology to improve the performance of association rule mining algorithms, particularly when dealing with unstructured textual data.
Keywords: domain ontology; text mining; political science; association rule mining; Nigeria.
A conceptualisation of narratives and its expression in the CRM
by Valentina Bartalesi, Carlo Meghini, Daniele Metilli
Abstract: Current Digital Libraries (DLs) are mostly built around large collections of scarcely related objects. We aim at enriching the information space of DLs by introducing narratives, consisting of two main components: networks of events related to one another and to the DL resources through semantic links, and narrations of those events in texts. In order to introduce narratives in DLs, we developed a conceptualisation based on narratology and we expressed it using the CIDOC CRM and CRMinf as reference ontologies. We used this expression to validate our conceptualisation, creating a narrative of the biography of Dante Alighieri as a realistic case study. To support this experiment, we developed a semi-automated tool that collects basic knowledge about objects and events from Wikidata. The developed ontology is general enough to support not only biographies but other types of narratives as well.
Keywords: ontology; semantic web; digital libraries; narratives; CIDOC CRM; OWL.
Annotating legal documents with GaiusT 2.0
by Nicola Zeni, Luisa Mich, John Mylopoulos
Abstract: We present the GaiusT 2.0 framework for annotating legal documents. GaiusT 2.0 has been designed and implemented as a web-based system to semi-automate the extraction of legal concepts from text. In requirements analysis, these concepts can be used to identify the requirements that a software system has to fulfil to comply with a law or regulation. The analysis and annotation of legal documents in prescriptive natural language is still an open problem for research in the field. In GaiusT 2.0, a multi-step process exploits a number of linguistic and technological resources to offer a comprehensive annotation environment. The modules of the system are presented as evolutions from corresponding modules of the original GaiusT framework, which in turn was based on a general-purpose annotation tool, Cerno. The application of GaiusT 2.0 is illustrated with two cases, to demonstrate the performance of the extraction process and its adaptability to different law models.
Keywords: semantic annotation; annotation schema; annotation engine; document structure analyser; legal documents; legal requirements; compliance requirements; natural language processing; prescriptive language; linguistic resources; user needs.
A novel ontology for 3D semantics: ontology-based 3D model indexing and content-based video retrieval applied to the medical domain
by Leslie Sikos
Abstract: Because of the growing popularity of 3D modelling, there is a great demand for efficient mechanisms to automatically process 3D contents. Owing to the lack of semantics, however, most 3D scenes cannot be interpreted by software agents. 3D ontologies can provide formal definitions for 3D objects; however, many of them are semistructured only, cover a narrow knowledge domain, do not provide comprehensive coverage for geometric primitives, and do not exploit the full expressivity of the implementation language. This paper presents the most comprehensive formally grounded 3D ontology to date that maps the entire XSD-based vocabulary of the industry standard X3D (ISO/IEC 19775–19777) to OWL 2, complemented by fundamental concepts and roles of the 3D modelling industry not covered by X3D. This upper ontology can be used for the representation, annotation, and efficient indexing of 3D models, and their retrieval by 3D characteristics rather than by associated category labels.
Keywords: 3D model semantics; multimedia ontology; medical 3D printing; feature-based 3D model retrieval; X3D; MPEG-7; medically accurate 3D models; content-based medical video retrieval.
SMONT: an ontology for crime solving through social media
by Edlira Kalemi, Sule Yildirim-Yayilgan, Elton Domnori, Ogerta Elezaj
Abstract: There are numerous social networks such as Facebook, LinkedIn, Google Plus and Twitter whose data sources are becoming larger every day holding an abundance of valuable information. Among these data, digital crime evidence can be collected from online social networks (OSNs) for crime detection and further analysis. This paper describes the SMONT ontology, which has been developed to give support to the process of crime investigation and prevention. The SMONT ontology covers specific data about the crime, digital evidence obtained from OSNs, information archived from police entities, and also details related to people or events that may bring the authorities closer to crime case solving. It is possible to benefit from the ontology in different ways, such as: intelligence gathering; reasoning over the data; smarter searches and comparisons; open data publication purposes; and the overall management of the crime solving and prevention process.
Keywords: ontology; online social networks; crime; digital evidence.
Psychological named entity recognition from psychological Arabic texts
by Lakel Kheira, Bendella Fatima
Abstract: One of the most important problems facing the Arabisation of modern science is the terminological inconsistency in translation. This problem becomes more complex in the medical field, specifically in psychological sciences, where the translation of English-Arabic medical terms poses real challenges for researchers eager to analyse and organise this information. NER (Named Entity Recognition) systems play a significant role in many areas of Natural Language Processing (NLP), such as question answering systems, translation and information retrieval. Unlike previous Arabic NER systems, which have been built to extract named entities from Arabic text, our task involves extracting named entities from psychological text for bilingual annotation. In this paper, the problem of NERPSY (Named Entity Recognition for PSYchological sciences) is tackled by integrating the machine learning based approach with the symbolic approach, which uses handcrafted rules, by means of an information extraction tool available within the GATE platform, to form a hybrid approach in an attempt to enhance the overall performance of psychological NER. The proposed hybrid system is capable of recognising eight different types of named entity, including mental disorders designated by the DSM-IV (Diagnostic and Statistical Manual of the American Psychiatric Association), psychoactive substances, symptoms, diseases and medicaments. Then, we use a translation module to translate named entities. Extracting psychological named entities provides basic information for psychological analysis and ontology building in the psychological domain.
Keywords: named entity recognition; NERA; psychological sciences; Arabic language; Jape; gazetteers; GATE.
Dictionary-based sentiment analysis of Hinglish text and comparison with machine learning algorithms
by Harpreet Kaur, Veenu Mangat, Nidhi Krail
Abstract: With the recent development of Web 2.0, there has been a large increase in social networking and online marketing sites. The data or reviews obtained from these sites are analysed for better human decision making. Sentiment analysis is a part of natural language processing which involves extraction of opinions or sentiments from reviews. Opinions can be classified into positive, negative or neutral. Most of the content on the internet is in the English language, but with the improved awareness of people, data in other languages is also increasing gradually. India is a country of many languages. Sentiment analysis of English is very popular but not much work has been done in Indian languages. These days, a great deal of communication on social networks uses Hinglish content, which is a mixture of two languages: Hindi and English. Hinglish is an informal language that is exceptionally popular in India as individuals feel more comfortable talking in their own language. In this paper, we present a dictionary-based approach for Hinglish text classification. We also implemented traditional machine learning classification algorithms such as SVM (Support Vector Machine), NB (Na
Keywords: sentiment classification; feature extraction; dictionary or lexicon development; Wordnet.