International Journal of Web Engineering and Technology
These articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.
Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.
International Journal of Web Engineering and Technology (4 papers in press)
Semantic Lifting and Reasoning on the Personalised Activity Big Data Repository for Healthcare Research
by Hong Qing Yu
Abstract: The fast-growing markets for smart health monitoring devices and mobile applications give ordinary citizens the opportunity to understand and manage their own health. However, data engineering and knowledge discovery research still faces many challenges in efficiently extracting knowledge from data that is collected from heterogeneous devices and applications at high volume and velocity. This paper presents research that started with the EC MyHealthAvatar project and has been continually improved since the project's completion. The major contribution of the work is a comprehensive big data and semantic knowledge discovery framework which integrates data from varied data resources. The framework applies a hybrid database architecture of NoSQL and RDF repositories and introduces semantic-oriented data mining and knowledge lifting algorithms. The activity stream data is collected through Kafka's big data processing component. The motivation of the research is to enhance knowledge management, discovery capabilities and efficiency in order to support more accurate health risk analysis and lifestyle summarization.
Keywords: Big Data; Knowledge Discovery; Semantic Web; Ontology; Data Engineering; Data processing; Healthcare.
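As a rough illustration of what "semantic lifting" can mean in practice, the sketch below turns one raw activity record (of the kind a NoSQL store might hold) into RDF statements in N-Triples syntax. It is not the framework described in the abstract: the `http://example.org/activity#` vocabulary, the record fields (`id`, `user`, `steps`, `date`) and the function name `lift_activity` are all assumptions made for this example.

```python
# Hypothetical vocabulary; the abstract does not name the ontology it uses.
EX = "http://example.org/activity#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"


def lift_activity(record):
    """Lift one raw activity record (a dict) into a list of N-Triples lines."""
    subject = f"<{EX}activity/{record['id']}>"
    user = f"<{EX}user/{record['user']}>"
    steps = record["steps"]
    date = record["date"]
    return [
        # The record becomes a typed resource ...
        f"{subject} <{RDF_TYPE}> <{EX}Activity> .",
        # ... linked to the person who performed it ...
        f"{subject} <{EX}performedBy> {user} .",
        # ... with typed literals for the measured values.
        f'{subject} <{EX}steps> "{steps}"^^<{XSD}integer> .',
        f'{subject} <{EX}date> "{date}"^^<{XSD}date> .',
    ]


if __name__ == "__main__":
    raw = {"id": "a1", "user": "u42", "steps": 8500, "date": "2017-03-01"}
    for line in lift_activity(raw):
        print(line)
```

Once records are expressed as triples, they can be loaded into an RDF repository and queried or reasoned over alongside an ontology, which is the general idea behind pairing a NoSQL store with an RDF store.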
Anomaly detection in the web logs using user-behavior networks
by Xiaojuan Wang
Abstract: With the rapid growth of web attacks, anomaly detection has become a necessary part of the management of modern large-scale distributed web applications. As the record of user behavior, web logs are a natural object of study for anomaly detection. Many anomaly detection methods based on automated log analysis have been proposed. However, most research focuses on the content of individual logs, while ignoring the connection between the user and the path. To address this problem, we introduce graph theory into anomaly detection and establish a user behavior network model. Integrating the network structure and the characteristics of anomalous users, we propose five indicators to identify the anomalous users and the anomalous logs. Results show that the method achieves better performance on four real web application log datasets, with a total of about 4 million log messages and 1 million anomalous instances. In addition, this paper integrates and improves a state-of-the-art anomaly detection method to further analyze the composition of the anomalous logs. We believe that our work will bring a new angle to the research field of anomaly detection.
Keywords: graph theory; anomaly detection; user behavior.
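The connection between users and paths mentioned in the abstract can be pictured as a bipartite network. The sketch below builds such a network from `(user, path)` pairs and computes one simple score. The abstract does not list its five indicators, so `rare_path_ratio` is purely an illustrative stand-in, and the input format and function names are assumptions.

```python
from collections import defaultdict


def build_user_path_network(log_records):
    """Build a bipartite user-path network from (user, path) pairs.

    Returns two adjacency maps: user -> set of paths, path -> set of users.
    """
    user_to_paths = defaultdict(set)
    path_to_users = defaultdict(set)
    for user, path in log_records:
        user_to_paths[user].add(path)
        path_to_users[path].add(user)
    return user_to_paths, path_to_users


def rare_path_ratio(user, user_to_paths, path_to_users):
    """Fraction of a user's paths that no other user has requested.

    Illustrative indicator only; it is not one of the paper's indicators.
    """
    paths = user_to_paths.get(user, set())
    if not paths:
        return 0.0
    rare = sum(1 for p in paths if len(path_to_users[p]) == 1)
    return rare / len(paths)


if __name__ == "__main__":
    records = [
        ("u1", "/home"), ("u2", "/home"),
        ("u1", "/cart"), ("u2", "/cart"),
        ("u3", "/home"), ("u3", "/admin.php"), ("u3", "/etc/passwd"),
    ]
    u2p, p2u = build_user_path_network(records)
    for user in sorted(u2p):
        print(user, round(rare_path_ratio(user, u2p, p2u), 2))
```

A score like this looks at the structure around a user rather than the text of any single log line, which is the shift in viewpoint the abstract argues for.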
DWSpyder: A New Schema Extraction Method for a Deep Web Integration System
by Yasser Saissi
Abstract: The deep web is a huge part of the web that is not indexed by search engines. The deep web sources are accessible only through their associated access forms. We wish to use a web integration system to access the deep web sources and all of their information. To implement this web integration system, we need to know the schema description of each web source. The problem addressed in this paper is how to extract the schema describing an inaccessible deep web source. We propose our DWSpyder method, which is able to extract the schema describing a deep web source despite its inaccessibility. The DWSpyder method starts with a static analysis of the deep web source access forms in order to extract the first elements of the associated schema description. The second step of our method is a dynamic analysis of these access forms using queries to enrich our schema description. Our DWSpyder method also uses a clustering algorithm to identify the possible values of deep web form fields with undefined sets of values. All of the information extracted is used by DWSpyder to automatically generate deep web source schema descriptions.
Keywords: Web Integration; Schema Extraction; Deep Web; Clustering.
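The first step described above, a static analysis of an access form, might look roughly like the following sketch, which reads a form's HTML and collects field names together with any predefined values. This is not the DWSpyder implementation; the class name `FormSchemaParser`, the helper `extract_form_fields` and the output shape are assumptions for illustration, built only on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class FormSchemaParser(HTMLParser):
    """Collect form field names and, for <select> fields, their option values."""

    def __init__(self):
        super().__init__()
        # field name -> list of predefined values, or None if the set is undefined
        self.fields = {}
        self._current_select = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            # Free-text style field: its possible values are not declared.
            self.fields.setdefault(attrs["name"], None)
        elif tag == "select" and attrs.get("name"):
            self._current_select = attrs["name"]
            self.fields[self._current_select] = []
        elif tag == "option" and self._current_select is not None:
            value = attrs.get("value")
            if value:  # skip placeholder options such as value=""
                self.fields[self._current_select].append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self._current_select = None


def extract_form_fields(html):
    """Return a dict describing the fields found in the given HTML."""
    parser = FormSchemaParser()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    page = """
    <form action="/search">
      <input type="text" name="title">
      <select name="genre">
        <option value="">Any</option>
        <option value="sci-fi">Sci-Fi</option>
        <option value="drama">Drama</option>
      </select>
    </form>
    """
    print(extract_form_fields(page))
```

Fields that come back as `None` are the ones with undefined sets of values, which is where the abstract's dynamic querying and clustering steps would take over.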
Impact of Replica Placement-based Clustering on Fault Tolerance in Grid Computing
by Rahma Souli-Jbali
Abstract: Data grids appear to be a good solution to the growing demand for very high computing power and storage capacity. Indeed, these architectures make it possible to add hardware/software resources offering virtually infinite storage and computation capacity. However, the design of distributed applications for data grids remains complex. This task is even harder if one considers that faults and disconnections of machines, leading to data loss, are common in data grids. Therefore, it is necessary to take into account the dynamic nature of grids, since grid nodes may disappear at any time. This paper focuses on problems related to the impact of replica placement-based clustering on the fault tolerance of nodes in grids. The main idea is based on two well-known fault tolerance protocols used in distributed systems. We propose a novel protocol based on two levels: inter-cluster and intra-cluster. At the inter-cluster level, the message-logging protocol is used. At the intra-cluster level, the same message-logging protocol is used, coupled with the non-blocking coordinated checkpoint of Chandy-Lamport. This ensures that in case of failure, the impact of the fault remains confined to the nodes of the same cluster. The experimental results show the efficiency of the proposed protocol in terms of recovery time and the numbers of processes used and messages exchanged.
Keywords: Data Grids; Fault Tolerance; Replica placement; Clustering; Job scheduling.
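For readers unfamiliar with the Chandy-Lamport checkpoint named in the abstract, the sketch below simulates it on a toy example in which processes exchange amounts of money over FIFO channels. It illustrates only the generic snapshot algorithm, not the two-level protocol proposed in the paper; the bank-transfer scenario, the function `run_snapshot` and its parameters are assumptions chosen to make the consistency property easy to check.

```python
from collections import deque

MARKER = "MARKER"  # control message, distinct from the integer payloads


def run_snapshot(balances, transfers, initiator, snapshot_after):
    """Simulate transfers over FIFO channels and take one snapshot.

    balances: dict pid -> starting balance
    transfers: list of (src, dst, amount), sent in this order
    initiator: pid that starts the snapshot
    snapshot_after: number of transfers sent before the snapshot starts
    Returns (recorded_states, recorded_in_flight).
    """
    balance = dict(balances)
    pids = list(balances)
    channels = {(s, d): deque() for s in pids for d in pids if s != d}
    recorded = {}                        # pid -> checkpointed balance
    recording = {p: set() for p in pids}  # incoming channels still recorded
    in_flight = {}                       # (src, dst) -> amounts caught in transit

    def take_checkpoint(p):
        # Record local state, then send a marker on every outgoing channel.
        # The process is never blocked: it keeps sending and receiving.
        recorded[p] = balance[p]
        recording[p] = {s for s in pids if s != p}
        for s in recording[p]:
            in_flight[(s, p)] = []
        for d in pids:
            if d != p:
                channels[(p, d)].append(MARKER)

    def deliver(src, dst):
        msg = channels[(src, dst)].popleft()
        if msg is MARKER:
            if dst not in recorded:
                take_checkpoint(dst)
            recording[dst].discard(src)  # this channel's state is now final
        else:
            balance[dst] += msg
            if src in recording[dst]:
                in_flight[(src, dst)].append(msg)

    for i, (src, dst, amount) in enumerate(transfers):
        if i == snapshot_after:
            take_checkpoint(initiator)
        balance[src] -= amount
        channels[(src, dst)].append(amount)
    if initiator not in recorded:
        take_checkpoint(initiator)

    # Deliver everything still in the channels, one message per channel per round.
    while any(channels.values()):
        for (src, dst), queue in channels.items():
            if queue:
                deliver(src, dst)
    return recorded, in_flight


if __name__ == "__main__":
    start = {"A": 100, "B": 100, "C": 100}
    sends = [("A", "B", 10), ("B", "C", 20), ("C", "A", 5), ("A", "C", 7)]
    states, transit = run_snapshot(start, sends, initiator="A", snapshot_after=2)
    total = sum(states.values()) + sum(sum(v) for v in transit.values())
    print(states, transit, total)
```

The recorded balances plus the recorded in-transit amounts always add up to the starting total, which is what makes the checkpoint a consistent state to recover from even though no process paused while it was taken.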