International Journal of Metadata, Semantics and Ontologies (6 papers in press)
SMONT: an ontology for crime solving through social media
by Edlira Kalemi, Sule Yildirim-Yayilgan, Elton Domnori, Ogerta Elezaj
Abstract: There are numerous social networks, such as Facebook, LinkedIn, Google Plus and Twitter, whose data sources grow larger every day and hold an abundance of valuable information. Among these data, digital crime evidence can be collected from online social networks (OSNs) for crime detection and further analysis. This paper describes the SMONT ontology, which has been developed to support the process of crime investigation and prevention. The SMONT ontology covers specific data about the crime, digital evidence obtained from OSNs, information archived from police entities, and details related to people or events that may bring the authorities closer to solving the crime case. The ontology can be used in different ways, such as: intelligence gathering; reasoning over the data; smarter searches and comparisons; open data publication purposes; and the overall management of the crime solving and prevention process.
Keywords: ontology; online social networks; crime; digital evidence.
Psychological named entity recognition from psychological Arabic texts
by Lakel Kheira, Bendella Fatima
Abstract: One of the most important problems facing the Arabisation of modern science is terminological inconsistency in translation. This problem becomes more complex in the medical field, specifically in the psychological sciences, where the translation of English-Arabic medical terms poses real challenges for researchers eager to analyse and organise this information. NER (Named Entity Recognition) systems play a significant role in many areas of Natural Language Processing (NLP), such as question answering systems, translation and information retrieval. Unlike previous Arabic NER systems, which have been built to extract named entities from Arabic text, our task involves extracting named entities from psychological text for bilingual annotation. In this paper, the problem of NERPSY (Named Entity Recognition for PSYchological sciences) is tackled by integrating a machine learning based approach with a symbolic approach that uses handcrafted rules, by means of an information extraction tool available within the GATE platform, to form a hybrid approach intended to enhance the overall performance of psychological NER. The proposed hybrid system is capable of recognising eight different types of named entity, including mental disorders designated by the DSM-IV (Diagnostic and Statistical Manual of the American Psychiatric Association), psychoactive substances, symptoms, diseases and medicaments. A translation module is then used to translate the named entities. Extracting psychological named entities provides basic information for psychological analysis and ontology building in the psychological domain.
Keywords: named entity recognition; NERA; psychological sciences; Arabic language; Jape; gazetteers; GATE.
Dictionary-based sentiment analysis of Hinglish text and comparison with machine learning algorithms
by Harpreet Kaur, Veenu Mangat, Nidhi Krail
Abstract: With the recent development of Web 2.0, the number of social networking and online marketing sites has grown considerably. The data or reviews obtained from these sites are analysed for better human decision making. Sentiment analysis is a part of natural language processing which involves the extraction of opinions or sentiments from reviews. Opinions can be classified as positive, negative or neutral. Most of the content on the internet is in the English language, but with the improved awareness of people, data in other languages is also increasing gradually. India is a country of many languages. Sentiment analysis of English is very popular, but not much work has been done on Indian languages. These days, a great deal of communication on social media uses Hinglish content, which is a mixture of two languages, Hindi and English. Hinglish is an informal language that is very popular in India, as people feel more comfortable communicating in their own language. In this paper, we present a dictionary-based approach for Hinglish text classification. We also implemented traditional machine learning classification algorithms such as SVM (Support Vector Machine), NB (Na
Keywords: sentiment classification; feature extraction; dictionary or lexicon development; Wordnet.
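As a rough illustration of what a dictionary-based classifier of this kind might look like, the following minimal Python sketch sums word polarities from a small lexicon and maps the total to positive, negative or neutral. The lexicon entries, scores and the function name `classify_sentiment` are hypothetical and chosen only for illustration; they are not taken from the paper, whose dictionary and scoring rules are not described in the abstract.

```python
# Hypothetical toy lexicon mixing romanised Hindi and English words.
# Entries and scores are illustrative assumptions, not the paper's dictionary.
LEXICON = {
    "accha": 1, "badhiya": 1, "good": 1, "mast": 1,
    "bura": -1, "bekar": -1, "bad": -1, "ganda": -1,
}


def classify_sentiment(text):
    """Return 'positive', 'negative' or 'neutral' from summed word polarities."""
    score = 0
    for token in text.lower().split():
        word = token.strip(".,!?")       # drop simple surrounding punctuation
        score += LEXICON.get(word, 0)    # unknown words contribute nothing
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ["movie bahut accha hai", "service bekar thi", "kal milte hain"]:
        print(review, "->", classify_sentiment(review))
```

This sketch ignores negation, spelling variants of romanised Hindi and intensifiers, all of which a practical Hinglish lexicon would likely need to handle.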
The development of a biological ontology based on a possible connection between EEG and breast cancer using bibliographic reasoning
by Marios Poulos
Abstract: This paper describes, using bibliographic reasoning, a biological ontology based on a possible connection between EEG and breast cancer. The paper uses a data mining approach applied to PubMed articles. The proposed system will be able to identify the appearance of each gene ID and compare the coexistence of two genes or proteins in PubMed articles/papers, using specific authorised terms as queries. The produced connections (genes and proteins) could be a useful tool for scientists and medical professionals searching for a possible relationship between disorders such as cancer and EEG.
Keywords: semantic web; data mining; proteins; genes; breast cancer; bibliometric; ontology.
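The idea of counting how often each term appears, and how often two terms coexist in the same article, can be sketched in Python as below. This is only an assumed simplification: the documents are plain strings rather than PubMed records, matching is a case-insensitive substring test, and the function name `cooccurrence_counts` and the placeholder terms are hypothetical.

```python
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """Count documents mentioning each term and each pair of terms.

    Matching is a case-insensitive substring test, which is an assumption
    made for brevity; real gene or protein identifiers need stricter matching.
    """
    # For every document, record which of the query terms it mentions.
    found_per_doc = [
        {term for term in terms if term.lower() in doc.lower()}
        for doc in documents
    ]
    single = {t: sum(t in found for found in found_per_doc) for t in terms}
    pairs = {
        (a, b): sum(a in found and b in found for found in found_per_doc)
        for a, b in combinations(terms, 2)
    }
    return single, pairs


if __name__ == "__main__":
    # Synthetic placeholder texts, not real article content.
    docs = [
        "GENE_A interacts with GENE_B",
        "GENE_A only",
        "GENE_B and GENE_C",
        "no query term here",
    ]
    single, pairs = cooccurrence_counts(docs, ["GENE_A", "GENE_B", "GENE_C"])
    print(single)
    print(pairs)
```

Pairs with a high shared count relative to their individual counts would be the candidates for a connection in an ontology built this way.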
What about virtual creative knowledge environments? Definition and modelling through case-based reasoning
by Fabio Sartori
Abstract: Support for virtual creative knowledge environments is a new and challenging research topic in the knowledge management field. Starting from the definitions of creative thinking, creative space and creative knowledge environments, this paper reflects on how such concepts can be properly modelled computationally. The adoption of the case-based reasoning paradigm has made it possible to build a conceptual and computational framework able to promote them from both the theoretical and practical points of view.
Keywords: case-based reasoning; creativity; analogical reasoning; distant analogy; virtual creative knowledge environment.
Linking science: approaches for linking scientific publications across different LOD Repositories
by Arben Hajra, Klaus Tochtermann
Abstract: Enriching the content of a digital library (DL) with additional information from other DLs and domains would facilitate scholarly communication, scientific findings, and knowledge distribution. The implementation of semantic technologies by interlinking resources results in a new vision for interoperability among different DLs. Therefore, this research explores bibliographic Linked Open Data (LOD) repositories by investigating alignments among them. Global unigram frequency is used to determine the importance of terms in the set of metadata. The semantic relatedness of the retrieved publications is measured by comparing two main approaches: a Vector Space Model through TF-IDF and Cosine Similarity, versus a Deep Learning approach through the Word2Vec implementation of Word Embeddings. In summary, the two approaches differ by 40.5% in the relevant publications they retrieve. In addition to the given metadata, Word Embeddings achieve better performance for short texts, such as publication titles.
Keywords: digital libraries; linked open data; LOD; semantic web; word embeddings; data mining; recommender systems.
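The Vector Space Model side of this comparison can be illustrated with a minimal, dependency-free Python sketch of TF-IDF weighting and cosine similarity over publication titles. The weighting variant (relative term frequency times log(N/df)), the whitespace tokenisation, the sample titles and the function names are assumptions made for illustration and are not the authors' implementation; the Word2Vec side is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (dicts) for a list of short texts."""
    tokenized = [text.lower().split() for text in texts]
    n_docs = len(tokenized)
    # Document frequency: number of texts in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is zero."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical titles used only to exercise the functions.
    titles = [
        "linked open data repositories",
        "linked data for digital libraries",
        "deep learning for images",
    ]
    vecs = tfidf_vectors(titles)
    print(cosine_similarity(vecs[0], vecs[1]))
    print(cosine_similarity(vecs[0], vecs[2]))
```

Titles that share no terms score exactly zero under this model, which is one reason an embedding-based measure may behave differently on short texts.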