Forthcoming articles


International Journal of Computational Vision and Robotics


These articles have been peer-reviewed and accepted for publication in IJCVR, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.


Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.




International Journal of Computational Vision and Robotics (36 papers in press)


Regular Issues


  • Automated Identification and Counting of Proliferating Mesenchymal Stem Cells in Bone Callus   Order a copy of this article
    by Samer Awad, Rula Abdallat, Othman Smadi, Thakir AlMomani 
    Abstract: Manual counting of proliferating cells can be time-consuming and subject to human error, as it depends on visual inspection. On the other hand, automated counting using software-based morphological analysis can eliminate or reduce these disadvantages and provide statistical reliability. In this study, we employ a software-based method for the automated counting of proliferating mesenchymal stem cells (MSCs) in the bone callus of Wistar rats to evaluate fracture healing. The proposed method starts by extracting the green component of the digital image acquired using a light microscope. The subsequent stages involve contrast enhancement, adaptive thresholding and false-detection reduction. The method was tested using 48 MSC images and the results were evaluated by a specialist. The averages of precision, recall and F-measure were found to be 87.14%, 88.04% and 87.50%, respectively.
    Keywords: automated cell counting; biological cell counting; image processing; image segmentation; pattern recognition; automated thresholding; light microscopic images; mesenchymal stem cells; MSC; false detection minimisation.
    DOI: 10.1504/IJCVR.2019.10016589
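For reference, the F-measure quoted in the abstract above is the harmonic mean of precision and recall; a minimal Python sketch (not the authors' code) applied to the reported averages:

```python
def f_measure(precision, recall):
    """F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Average precision and recall reported in the abstract, as fractions
p, r = 0.8714, 0.8804
print(round(100 * f_measure(p, r), 2))  # → 87.59
```

The harmonic mean of the reported averages (about 87.59%) differs slightly from the quoted 87.50%, which would be expected if the paper averages per-image F-measures rather than computing F from the averaged precision and recall.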
  • Content-Based Image Retrieval (CBIR): A deep look at features prospectus.   Order a copy of this article
    by Mohammed Suliman Haji, Amjad Rehman, Tanzila Saba 
    Abstract: With the current rapid growth of digital images on the internet, content-based image retrieval systems are in high demand. Content-Based Image Retrieval (CBIR) is an image search technique that does not depend on manually assigned annotations; rather, CBIR uses discriminative features to search for an image. By refining these features, an efficient retrieval mechanism can be achieved. The aim of this research is to review the feature extraction and selection methods that have an impact on CBIR and on information extraction from images using global and local features such as shape, texture, and color. In order to extract the most appropriate features for CBIR, several feature extraction and selection techniques are analyzed and their efficiency is compared. Additionally, shortcomings of current CBIR techniques are addressed and possible solutions are suggested to enhance accuracy.
    Keywords: CBIR; Discrete Wavelet Transform; Low-level Features and High-level Features.

  • Bio-inspired visual attention process using spiking neural networks controlling a camera   Order a copy of this article
    by André Cyr, Frédéric Thériault 
    Abstract: This study introduces virtual and physical implementations of a bottom-up visual attention mechanism using a spiking neural network (SNN) controlling a camera. The SNN is able to focus on simple stimuli of various lengths that appear randomly in the camera's view. This is accomplished with an overt process based on a competitive choice according to a stimulus's quadrant location. After bringing a selected stimulus toward the center of its view, the SNN scans it from one edge to the other. Since the spike train of dedicated neurons reflects the duration of each scan, it allows the extraction of the stimulus length. Upon completion of a scan, the SNN has the ability to switch to another stimulus. This preliminary work on spatial visual attention intends to be a step toward the study of the size-concept learning process in a robotic context.
    Keywords: Spiking neurons; Robotics; Overt process; Visual Attention.

  • Enhancing Proximity Measure Between Residual and Noise for Image Denoising   Order a copy of this article
    by Gulsher Baloch, Junaid Ahmed 
    Abstract: Sparse representation and dictionary-learning-based image denoising algorithms approximate a clean image patch by a linear combination of a few dictionary atoms. Clearly, the residue left after denoising must be similar to the contaminating noise; ideally, the clean image patch is perfectly recovered if the residue is exactly the contaminating noise. Hence, for better denoising, the residue must be enforced to possess characteristics similar to the contaminating noise. In this paper, we model the residue such that the proximity between the residue and the contaminating noise is increased. The proposed mathematical model ensures that the residue is as random in nature as the contaminating noise. This is achieved through unique sparse coding and dictionary update stages developed based on modeling the randomness in the residue. The proposed algorithm is tested on additive white Gaussian noise (AWGN), additive colored Gaussian noise (ACGN) and Laplacian noise. Since the performance of image denoising algorithms also depends on the effective image bandwidth, we have generated synthetic images with known effective bandwidths using the discrete cosine transform (DCT), and the proposed algorithm is tested on these images as well. Comparison with state-of-the-art algorithms on the basis of peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM) and feature similarity index measure (FSIM) indicates that the proposed algorithm often produces better and competitive results.
    Keywords: Additive Colored Noise; Laplacian Noise; Residual Correlation; Image Denoising.
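As a reminder of the main metric in the comparison above, PSNR is defined from the mean squared error between the clean and denoised images; a minimal plain-Python sketch (not the authors' code), treating images as flat pixel lists:

```python
import math

def psnr(clean, denoised, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel lists."""
    mse = sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

# A uniform error of 16 grey levels gives MSE = 256
print(round(psnr([100, 120, 140], [116, 136, 156]), 2))  # → 24.05
```

Higher PSNR means the residue is smaller relative to the signal peak; SSIM and FSIM complement it by measuring structural rather than pixel-wise agreement.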

  • Push Recovery System and Balancing of a Biped Robot on Steadily Increasing Slope of an Inclined Plane   Order a copy of this article
    by Pravat Kumar Behera, Ravi Kumar Mandava, Pandu R. Vundavilli 
    Abstract: The present research paper demonstrates push recovery and balancing on an inclined plane by a 20-degrees-of-freedom (DOF) small-sized biped robot. Here, the authors developed an algorithm to balance the posture of the biped robot while standing (that is, in stationary mode) and while balancing on an inclined plane with a steadily increasing slope. To maintain the stability of the biped robot, stability controllers are integrated into the walking controller. This enables the robot to sense any disturbance and perform the necessary action to maintain its stability. For measuring external disturbances and the orientation of the ground, an inertial measurement unit sensor is fitted inside the robot. Further, the robot is allowed to generate internal torque through the movement of its body parts to resist external disturbances. This principle is extended to test the balance of the biped robot on an inclined plane with an increasing inclination angle. The robot is seen to successfully perform the two tasks, namely push recovery and maintaining balance on the steadily increasing slope of a sloping surface, in real time.
    Keywords: Push recovery; external disturbances; balance on an inclined plane.

  • Ethiopian Maize Diseases Recognition and Classification using Support Vector Machine   Order a copy of this article
    by Enquhone Alehegn 
    Abstract: Currently, more than 72 maize diseases are found in Ethiopia, attacking different parts of the maize plant. There are different traditional mechanisms to identify and classify maize leaf diseases, such as chemical analysis or visual observation, but these traditional mechanisms have their own drawbacks: they take more time and require professional staff. Therefore, many researchers have done a great deal of work in identifying and classifying the different types of diseases that attack maize using image processing. However, to the best of the researchers' knowledge, no attempt has been made on an Ethiopian maize disease dataset. In this study an attempt has been made to develop maize leaf disease recognition and classification using a support vector machine model together with image processing. To evaluate recognition and classification accuracy, from the total dataset of 800 images, 80% was used for training and the remaining 20% for testing the model. Based on the experimental results, using combined (texture, colour and morphology) features with a support vector machine, an average accuracy of 95.63% was achieved.
    Keywords: maize disease; image pre-processing; features; feature extracted; image segmentation; image enhancement; noise removal; binarisation; support vector machine; SVM.
    DOI: 10.1504/IJCVR.2019.10017481
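The 80%/20% evaluation protocol described in the abstract above can be sketched in plain Python (an illustrative split helper, not the authors' code; the SVM training itself is omitted):

```python
import random

def train_test_split(samples, test_ratio=0.2, seed=0):
    """Shuffle samples and split them into training and test lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * (1 - test_ratio))
    return items[:cut], items[cut:]

# 800 maize images, as in the abstract: 640 for training, 160 for testing
train, test = train_test_split(range(800))
print(len(train), len(test))  # → 640 160
```

Fixing the seed makes the split reproducible, which matters when comparing feature sets (texture, colour, morphology) on the same partition.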
  • An Integrative Approach for Tracking of Mobile Robot with Vision Sensor   Order a copy of this article
    by Sangarm Keshari Das, Sabyasachi Dash, B.K. Rout 
    Abstract: The current work presents an experimental approach that incorporates feature-based object detection, KLT-algorithm-based tracking and Kalman-filter-based de-noising in a real-time environment. In the detection phase, the mobile robot is detected using the Viola-Jones algorithm, which extracts detectable features. The position of the mobile robot is then computed with homography constraints, and a region-of-interest window is set up to accommodate the mobile robot. In the tracking phase, the region-of-interest window is tracked using the KLT algorithm. The proposed method is of practical importance when the mobile robot is tracked while moving on a predetermined (specified) path, as the image of the mobile robot is small relative to the captured image of the environment. Analysis of the captured image of the environment thus becomes unnecessary for tracking, and the approach thereby reduces the computational load. The proposed approach accurately detects and tracks the mobile robot, with an error percentage ranging from 0.5% to 10% in different parts of the specified path.
    Keywords: mobile robot; Viola-Jones algorithm; KLT algorithm; Kalman filter; vision-based tracking.
    DOI: 10.1504/IJCVR.2019.10019086
  • Visual Cues based Deception Detection Using Two Class Neural Network   Order a copy of this article
    by Sabu George, Manohara Pai M.M, Radhika M. Pai, Samir Kumar Praharaj 
    Abstract: Deception detection techniques that analyse a person without his or her knowledge are more convenient and effective than other methods of deception detection. In this paper, a deception detection study based on facial visual cues is performed. In this study, an experiment was conducted with the participation of 62 subjects. Facial muscle variations during the subjects' lie and truth responses were recorded using a high-speed camera, and the corresponding Action Units (AUs) were used to train and then test two-class neural networks for truth and lie prediction. The prediction performance was analysed using five different sets, each having 10%, 20% and 30% test samples.
    Keywords: Lie face analysis; AU analysis; deception detection.

  • Images-to-Images Person ReID Without Temporal Linking   Order a copy of this article
    by Thuy-Binh Nguyen, Thi-Lan Le, Ngoc-Nam Pham 
    Abstract: This paper addresses images-to-images person re-identification, in which there are multiple images for each individual in both the gallery and the probe. Most existing approaches that try to extract/learn features require temporal linking between frames. This paper proposes a novel framework to overcome this requirement by formulating images-to-images person re-identification as a fusion of image-to-images matching. First, a ranked list of candidates corresponding to each query image is determined. Then, these lists are fused to determine the matched person. The contributions of the paper are two-fold: (1) an extra feature (Gaussian of Gaussian) is used for representing a person; (2) a new images-to-images scheme is proposed that does not require temporal linking yet retains the benefit of the image-to-images scheme. Extensive experiments on the CAVIAR4REID (cases A and B) and RAiD datasets prove the effectiveness of the framework. The proposed scheme obtains +20.88%, +10.23% and +10.39% improvements in rank-1 over the image-to-images scheme on these datasets.
    Keywords: Multi-shot; person re-identification; late fusion; images-to-images person re-identification.

  • A Technique to Validate Automatic Generation of B   Order a copy of this article
    by Nabil Messaoudi, Allaoua Chaoui, Bettaz Mohamed 
    Abstract: Several approaches have been proposed in the literature to transform UML models into formal methods for verification reasons. However, few of these approaches take into account the validation of such transformations. This paper is a proposal in this context. It has two parts: first, we propose a technique to control the output of a transformational tool in order to obtain safe transformational rules, and second, we propose a way to generate the formal model B
    Keywords: UML 2 Sequence diagrams; Semantics; Model Transformations Validation; Büchi automata; AGG.

  • A framework for automatically constructing a dataset for training a vehicle detector   Order a copy of this article
    by Changyon Kim, Jeonghwan Gwak, Moongu Jeon 
    Abstract: Object detection based on a trained detector has been widely applied to diverse tasks such as pedestrian, face, and vehicle detection. In such an approach, detectors are learned offline with an enormous number of training samples. However, the approach has a significant drawback: heavy human intervention and effort, as well as domain knowledge, are essentially required to construct a reliable training dataset. To remedy this drawback, we propose a framework to collect and label training samples automatically. By analyzing information from foreground blobs obtained from background subtraction results, a training dataset can be constructed without any human effort. Also, the condition of the scene is investigated periodically to check the suitability of sample candidates. As a result, the framework generates an accurate vehicle detector. With the proposed method, training samples are automatically collected only when vehicle blobs in the given scene provide suitable appearance information. The effectiveness of the proposed framework is demonstrated on vehicle detection tasks in real traffic environments.
    Keywords: Object detection; Optimal vehicle detector; Appearance model; Scene condition investigation; Automatic sample collection.

  • Effective scene change detection in complex environments   Order a copy of this article
    by Hui Fuang Ng, Chee Yang Chin 
    Abstract: One of the fundamental operations in computer vision applications is change detection, in which moving foreground objects are segmented from a static background. A common approach to change detection is the comparison of an image frame with a stored background model using a matching algorithm, a process known as background subtraction. However, such techniques fail in environments with dynamic backgrounds, illumination changes, shadows or camera jitter. This study focuses on effectively detecting scene changes in complex environments. To this end, we propose a new colour descriptor named Local Colour Difference Pattern (LCDP) that is insusceptible to shadows and is able to capture both colour and texture features at a pixel location. Furthermore, a scene change detection framework based on sample consensus is proposed that integrates LCDP with a novel spatial model fusion mechanism to handle dynamic scenes. Experiments using the CDnet benchmark dataset demonstrate the effectiveness of the proposed approach to change detection in complex environments.
    Keywords: change detection; background subtraction; moving object segmentation; foreground segmentation; local descriptor; video signal processing; CDnet.
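The background-subtraction step the abstract builds on can be illustrated with the simplest possible matching rule, a per-pixel threshold on the difference from the background model (a toy sketch only; LCDP replaces this naive test with a colour/texture descriptor robust to shadows and dynamic backgrounds):

```python
def foreground_mask(frame, background, threshold=25):
    """Mark a pixel as foreground when it differs from the background
    model by more than the threshold (naive background subtraction)."""
    return [[abs(f - b) > threshold for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]

background = [[10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 200, 12],
              [9, 10, 180]]
print(foreground_mask(frame, background))
# → [[False, True, False], [False, False, True]]
```

This per-pixel rule is exactly what fails under illumination change and camera jitter, which is the motivation for the descriptor-based matching the abstract describes.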

  • 3D Image Reconstruction from Different Image Formats Using Marching Cubes Technique   Order a copy of this article
    by Abdou Shalaby, Mohammed Elmogy, Ahmed AboElfetouh 
    Abstract: Structure from motion (SFM) is the problem of reconstructing a 3D image from 2D images. The main problem in 3D reconstruction is that the quality of the 3D image depends on the number of 2D slices input to the system, and a large number of 2D slices may lead to a high processing time. This paper introduces a new model to reconstruct a 3D image from any type of 2D image using the marching cubes algorithm. We use the LabVIEW program to build the system and use the Biomedical Toolkit to read and register the 2D images. Our main goal is to implement a 3D reconstruction system that produces a high-quality 3D image with a minimum number of 2D slices and decreases the execution time as much as possible. We apply our system to two datasets; the experimental results demonstrate the efficiency and effectiveness of the system in 3D image reconstruction from any 2D image type. As shown in the results, changing the iso_value, the image type and the number of images affects both the quality of the 3D reconstruction and the processing time.
    Keywords: 3D image reconstruction; Marching cubes; LabVIEW; 2D image registration; computed tomography (CT); magnetic resonance (MR); single-photon emission computed tomography (SPECT).

  • Use of Radial Basis Function Network with Discrete Wavelet Transform for Speech Enhancement   Order a copy of this article
    by Rashmirekha Ram, Mihir Narayan Mohanty 
    Abstract: Neural networks occupy a strong position in the fields of detection, recognition and classification. However, the use of these models for signal enhancement is a new direction of research. In this paper, a neural network is used to enhance the quality of speech. The efficient Radial Basis Function Network (RBFN) model is chosen for enhancement of the noisy signals, and the wavelet transform is used for decomposition of the signal. The decomposition serves two purposes: in the first stage, the noise in the input signal is reduced; next, the wavelet coefficients are used as the weights of the RBFN model, which makes processing faster compared with using random weights. The output of the proposed model is measured in terms of Signal to Noise Ratio (SNR), Segmental Signal to Noise Ratio (SegSNR) and Perceptual Evaluation of Speech Quality (PESQ). The performance of the proposed method is found to be excellent, as exhibited in the results section.
    Keywords: Speech Enhancement; Discrete Wavelet Transform; Radial Basis Function Network; Signal to Noise Ratio; Segmental Signal to Noise Ratio; Perceptual Evaluation of Speech Quality.
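The one-level wavelet decomposition the abstract relies on can be sketched with the Haar wavelet, the simplest member of the DWT family (an illustrative sketch, not the authors' implementation):

```python
def haar_dwt(signal):
    """One level of the Haar DWT: pairwise sums (approximation) and
    pairwise differences (detail), each scaled by 1/sqrt(2)."""
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

approx, detail = haar_dwt([4.0, 6.0, 10.0, 12.0])
print([round(a, 3) for a in approx], [round(d, 3) for d in detail])
# → [7.071, 15.556] [-1.414, -1.414]
```

Noise concentrates in the detail coefficients, which is why thresholding them (and, in the proposed model, reusing the coefficients as RBFN weights) helps the denoising stage.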

  • Crypto-compression Scheme based on the DWT for Medical Image Security   Order a copy of this article
    by Med Karim Abdmouleh 
    Abstract: Ensuring the confidentiality of exchanged data is always a great concern in any communication. The purpose of compression, meanwhile, is to reduce the amount of data while preserving important information; this reduction allows more information to be archived on the same storage medium and minimizes transfer times over telecommunication networks. Indeed, the combination of encryption and compression guarantees both the confidentiality and the authentication of information, while reducing processing and transmission time on public channels and increasing storage capacity. In this paper, we propose a new partial (selective) encryption approach for medical images based on Discrete Wavelet Transform (DWT) coefficients and compatible with the JPEG2000 standard. The obtained results prove that the proposed scheme provides a significant reduction in processing time during encryption and decryption, without compromising the high compression rate of the compression algorithm.
    Keywords: Crypto-compression; Encryption; Compression; Discrete Wavelet Transform; RSA; JPEG2000; Telemedicine.

  • Non-Invasive Technique of Diabetes Detection using Iris Images.   Order a copy of this article
    by Kesari Verma, Bikesh Kumar Singh, Neelam Agrawal 
    Abstract: Alternative medicine techniques are important in improving quality of life and preventing disease, and can be preferable to conventional invasive methods of disease detection. This paper presents a non-invasive approach to diabetes detection using iris images. The proposed technique evaluates the use of iridology to diagnose diabetes with modern digital image processing techniques that analyse the structural properties of the iris and classify the patterns accordingly. The system analyses the broken tissues of the iris by extracting significant textural features, using a Gabor filter bank and the Gray Level Co-occurrence Matrix (GLCM), from subsections of the iris. The extracted textural features help to categorise diabetic and non-diabetic irises using benchmark Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers. The promising results of extensive experiments demonstrate the effectiveness of the proposed method.
    Keywords: diabetes detection; image processing; iris images; support vector machine; artificial neural network; SVM; ANN; gabor features; gray level co-occurrence matrix; GLCM; Non-Invasive Technique.
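The GLCM feature mentioned in the abstract above counts how often pairs of grey levels co-occur at a fixed pixel offset; a minimal sketch for a single offset (illustrative only, not the authors' code):

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Grey-level co-occurrence matrix for offset (dx, dy):
    m[i][j] counts pixel pairs with grey levels i and j at that offset."""
    h, w = len(image), len(image[0])
    m = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h:
                m[image[y][x]][image[ny][nx]] += 1
    return m

# Tiny 2x3 image with grey levels 0..2, horizontal-neighbour offset
print(glcm([[0, 0, 1],
            [1, 2, 2]], levels=3))
# → [[1, 1, 0], [0, 0, 1], [0, 0, 1]]
```

Texture statistics such as contrast, energy and homogeneity are then computed from this matrix and fed to the ANN/SVM classifiers.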

  • Autonomous Void Detection and Characterisation in Point Clouds and Triangular Meshes   Order a copy of this article
    by Benjamin Bird, Barry Lennox, Simon Watson, Thomas Wright 
    Abstract: In this paper we propose and demonstrate a novel void characterisation algorithm which is able to distinguish between internal and external voids that are present in point clouds of both manifold and non-manifold objects and 3D scenes. We demonstrate the capabilities of our algorithm using several point clouds representing both scenes and objects. Our algorithm is shown in both a descriptive overview format as well as pseudocode. We also compare a variety of different void detection algorithms and then present a novel refinement to the best performing of these algorithms. Our refinement allows for voids in point clouds to be detected more efficiently, with fewer false positives and with over an order of magnitude improvement in terms of run time. We show our run time performance and compare it to results obtained using alternative algorithms, when tested using popular single board computers. This comparison is important as our work is intended for online robotics applications, where hardware is typically of low computational power. The target application for this work is 3D scene reconstruction to aid in the decommissioning of nuclear facilities.
    Keywords: Point Cloud; Void Detection; Meshing; Reconstruction; Computer Vision.

  • GCSAC: Geometrical Constraint SAmple Consensus for Primitive Shapes Estimation in 3-D Point Cloud   Order a copy of this article
    by Le Van Hung, Hai Vu, Thi-Thuy Nguyen, Thi-Lan Le, Thanh-Hai Tran 
    Abstract: Estimating the parameters of a primitive shape from 3-D point cloud data is a challenging problem due to noisy data and computational time demands. In this paper, we present a new robust estimator (named GCSAC, for Geometrical Constraint SAmple Consensus) aimed at solving these issues. The proposed algorithm takes geometrical constraints into account to construct qualified samples for the estimation. Instead of randomly drawing a minimal subset of samples, explicit geometrical properties of the primitive shapes of interest (e.g., cylinder, sphere and cone) are used to drive the sampling procedure. At each iteration of GCSAC, the minimal sample subset is selected based on two criteria: (1) it must be consistent with the estimated model according to a rough inlier-ratio evaluation; (2) the samples must satisfy the geometrical constraints of the objects of interest. Based on the good samples obtained, the model estimation and verification procedures of the robust estimator are deployed in GCSAC. Extensive experiments have been conducted on synthesised and real datasets for evaluation. Compared with the common robust estimators of the RANSAC family (RANSAC, PROSAC, MLESAC, MSAC, LO-RANSAC and NAPSAC), GCSAC outperforms them in terms of both the precision of the estimated model and computational time. The implementations of the proposed method and the datasets are made publicly available.
    Keywords: Robust Estimator; Primitive Shape Estimation; RANSAC and RANSAC Variations; Quality of Samples; Point Cloud data.
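For readers unfamiliar with the RANSAC family that GCSAC extends, the core sample-consensus loop can be sketched for the simplest model, a 2-D line (a generic RANSAC sketch, not GCSAC itself; GCSAC replaces the purely random minimal-sample draw with geometrically constrained sampling):

```python
import random

def ransac_line(points, iters=200, tol=0.1, seed=1):
    """Plain RANSAC for a 2-D line y = a*x + b: repeatedly fit a minimal
    sample (two points) and keep the model with the most inliers."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical pair: cannot fit y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = sum(abs(y - (a * x + b)) <= tol for x, y in points)
        if inliers > best_inliers:
            best, best_inliers = (a, b), inliers
    return best, best_inliers

# Synthetic data: 20 points on y = 2x + 1 plus three gross outliers
pts = [(x, 2 * x + 1) for x in range(20)] + [(3, 40), (7, -5), (11, 90)]
(a, b), n = ransac_line(pts)
print(round(a, 2), round(b, 2), n)  # → 2.0 1.0 20
```

Any minimal sample containing an outlier yields few inliers and is discarded, which is why constraining the sampling, as GCSAC does, directly reduces the number of wasted iterations.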

  • Exploring the Effects of Non-Local Blocks on Video Captioning Networks   Order a copy of this article
    by Jaeyoung Lee, Junmo Kim 
    Abstract: In addition to visual features, video also contains temporal information that contributes to semantic meaning regarding the relationships between objects and scenes. There have been many attempts to describe spatial and temporal relationships in video, but simple encoder-decoder models are not sufficient for capturing detailed relationships in video clips. A video clip often consists of several shots that seem to be unrelated, and simple recurrent models suffer from these changes in shots. In other fields, including visual question answering and action recognition, researchers have begun to take an interest in describing visual relations between objects. In this paper, we introduce a video captioning method that captures temporal relationships with a non-local block and a boundary-aware system. We evaluate our approach on the Microsoft Video Description Corpus (MSVD, YouTube2Text) and the Microsoft Research-Video to Text (MSR-VTT) dataset. The experimental results show that a non-local block applied along the temporal axis can improve video captioning performance on video captioning datasets.
    Keywords: Video captioning; Non-local mean; Self-attention; Video description.

  • A Novel Approach for Mitigating Atmospheric Turbulence using Weighted Average Sobolev Gradient and Laplacian   Order a copy of this article
    by Prifiyia Nunes, Dippal Israni, Karthick D, Arpita Shah 
    Abstract: Heat scintillation is a main cause of atmospheric turbulence, which distorts images as light propagates through the volatile environment; changes in refractive index due to variations in wind velocity also cause turbulence in the atmosphere. Traditional image registration approaches lag because they are computationally expensive and need a post-processing algorithm to sharpen the image. A non-registration-based Sobolev Gradient and Laplacian (SGL) algorithm removes turbulence but produces ghost artifacts on moving objects. This paper proposes a novel approach based on a weighted-average SGL. The proposed method mitigates atmospheric turbulence and also restores the moving objects in the scene. Performance metrics such as SSIM and MSE show that the proposed algorithm outperforms state-of-the-art algorithms in restoring both the geometric distortion and the object of interest.
    Keywords: Atmospheric Turbulence; Heat Scintillation; Restoration; Sobolev; Phase Shift; Weighted Average.

  • 3D Object Classification Based on Deep Belief Networks and Point Clouds   Order a copy of this article
    by Fatima Zahra OUADIAY, Nabila ZRIRA, Mohamed HANNAT, El Houssine BOUYAKHF, Mohammed Majid HIMMI 
    Abstract: Since the advent of 3D sensors such as the Kinect camera, 3D object models and point clouds have become frequently used in many areas, the most important being 3D object recognition and classification in robotic applications. This type of sensor, like human vision, allows generating an object model from a single field of view, or even a complete 3D object model by combining several individual Kinect frames. In this work, we propose a new feature-learning-based object classification approach using Point Cloud Library (PCL) detectors and descriptors and Deep Belief Networks (DBNs). Before developing the classification approach, we evaluate 3D descriptors by proposing a new pipeline that uses the L2 distance and a recognition threshold. The 3D descriptors are computed on different datasets in order to identify the best descriptors. Subsequently, these descriptors are used to learn robust features in the classification approach using DBNs. We evaluate the performance of these contributions on two datasets: Washington RGB-D and our own real 3D object dataset. The results show that the proposed approach outperforms advanced methods by approximately 5% in terms of accuracy.
    Keywords: Kinect; 3D Object classification; PCL; recognition threshold; DBNs; Washington RGB-D.

  • A novel fast fractal image compression based on reinforcement learning   Order a copy of this article
    by Bejoy Varghese, Krishnakumar S 
    Abstract: Digital image compression is of considerable interest in the transmission and storage of images. Recent research in this area explores the combination of different coding techniques to achieve a better compression ratio without compromising image quality. Fractal-based coding techniques attracted the attention of the research community from the earliest days of data compression; however, those methods were computationally intensive because of the exhaustive search involved in selecting a transformation sequence. In this paper, we propose a system that replaces the current domain-range comparison in fractal compression with a reinforcement learning technique, reducing the compression time and increasing the PSNR. The system learns from the output of the exhaustive algorithm in its initial state and discards the combinatorial search after being trained on a dataset. The recommended method shows a good improvement in compression ratio, PSNR and compression time.
    Keywords: Machine learning; Image compression; Reinforcement learning; Fractal coding.

  • Video Summarization based on Motion Estimation using Speeded up Robust Features   Order a copy of this article
    by Dipti Jadhav, Udhav Bhosle 
    Abstract: Video Summarization (VS) is a technique to extract keyframes from a video based on its contents, providing the user with a brief representation of the video contents so that the video can be understood semantically. This paper presents video summarization based on the motion between consecutive video frames, where the motion between frames is represented by affine and homography transformations. The video frames are represented by a set of Speeded Up Robust Features (SURF). The keyframes are extracted sequentially by successive comparison with the previously declared keyframe based on motion. The validity of the proposed algorithms is demonstrated on videos from the Internet, the YouTube dataset and the Open Video Project. The proposed work is evaluated by comparing it with different classical and state-of-the-art video summarization methods reported in the literature. The experimental results and performance analysis validate the effectiveness and efficiency of the proposed algorithms.
    Keywords: Video Summarization; motion estimation; key frames; SURF; affine transformation; homography.
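An illustrative sketch of the sequential keyframe-selection idea described above: each frame is compared with the last declared keyframe and promoted when the estimated motion exceeds a threshold. Real frames would be described by SURF keypoints with an affine/homography fit; here a plain feature vector and Euclidean distance stand in, and the threshold is an assumed value.

```python
# Sequential keyframe selection: a frame becomes a keyframe when its
# "motion" relative to the last keyframe exceeds a threshold.
import math

def select_keyframes(frame_features, threshold):
    keyframes = [0]                      # first frame is a keyframe
    last = frame_features[0]
    for i, feat in enumerate(frame_features[1:], start=1):
        motion = math.dist(feat, last)   # stand-in for the motion estimate
        if motion > threshold:
            keyframes.append(i)
            last = feat                  # compare later frames with this one
    return keyframes

# Five toy "frames": a static shot, then a scene change.
features = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 5.0)]
keys = select_keyframes(features, threshold=1.0)
```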

  • Electroencephalography based classification of human emotion: A hybrid strategy in machine learning paradigm   Order a copy of this article
    by Bikesh Kumar Singh, Ankur Khare 
    Abstract: The objective of this article is to develop a new, improved two-stage method for classifying human emotional states by fusing a back-propagation artificial neural network (BPANN) and k-nearest neighbours (k-NN). A publicly available electroencephalograph (EEG) signal database for emotion analysis using physiological signals is used in the experiments. The EEG signals are first pre-processed, followed by feature extraction in the time and frequency domains. The extracted features are then supplied to the proposed model for emotion recognition. The proposed machine learning framework attains a higher classification accuracy of 78.33% compared with conventional BPANN and k-NN classifiers, which achieve classification accuracies of 56.90% and 59.52%, respectively. Future work is required to evaluate the proposed model in a practical scenario wherein a proficient psychologist or medical professional can analyse the emotion recognised by the first stage, and the unsure test cases can be supplied to the secondary classifier (k-NN) for further assessment.
    Keywords: Brain computer interface; emotion; electroencephalogram; hybrid classifier.
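A hedged sketch of the two-stage routing idea: a first-stage classifier handles confident cases and "unsure" ones are deferred to k-NN. A nearest-centroid classifier stands in for the BPANN stage, and the confidence margin, labels and data are all toy assumptions.

```python
# Two-stage hybrid classifier: stage 1 (nearest centroid, a stand-in
# for BPANN) accepts confident predictions; unsure cases go to k-NN.
import math
from collections import Counter

def knn_predict(x, train, k=3):
    votes = sorted(train, key=lambda t: math.dist(x, t[0]))[:k]
    return Counter(label for _, label in votes).most_common(1)[0][0]

def hybrid_predict(x, centroids, train, margin=1.0):
    # Confidence = gap between the two closest centroid distances.
    dists = sorted((math.dist(x, c), lab) for lab, c in centroids.items())
    if dists[1][0] - dists[0][0] >= margin:
        return dists[0][1]           # confident: accept stage-1 label
    return knn_predict(x, train)     # unsure: defer to the k-NN stage

centroids = {"calm": (0.0, 0.0), "excited": (4.0, 0.0)}
train = [((0.2, 0.1), "calm"), ((3.9, 0.2), "excited"),
         ((2.1, 0.0), "excited"), ((1.8, 0.1), "excited")]
confident = hybrid_predict((0.1, 0.0), centroids, train)  # stage 1 decides
deferred = hybrid_predict((2.0, 0.0), centroids, train)   # k-NN decides
```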

  • Blocking of Operation of Unauthorized Software using MQTT   Order a copy of this article
    by Kitae Hwang 
    Abstract: This paper presents the design and implementation of the Meerkat system, a system that detects the operation of unauthorized software. The MQTT protocol is used for data communication in the Meerkat system, which is largely comprised of three components: the Meerkat client, the web application that operates as the admin, and the server software. The Meerkat client alerts the MQTT broker as soon as it detects the operation of unauthorized software on the user's PC, and the admin receives this information immediately via the MQTT broker. To evaluate the performance of the system, the transmission time of messages delivered between the user and admin PCs was measured. The measurements showed that a message took, on average, 8 to 50 milliseconds to be delivered. These results indicate that messages are delivered quickly enough for the Meerkat system to be put into actual use.
    Keywords: MQTT; publish-subscribe; unauthorized software; Mosquitto.
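A minimal sketch of the client-side check such a system might perform, assuming a simple process-name whitelist. The whitelist contents, PC identifier and topic name are made up, and the actual publish step (e.g. via a paho-mqtt client to a Mosquitto broker) is only described in a comment.

```python
# Meerkat-style client check: compare running process names against a
# whitelist and build alert payloads for any unauthorized ones.

WHITELIST = {"explorer.exe", "winword.exe", "chrome.exe"}

def find_unauthorized(running_processes, whitelist=WHITELIST):
    """Return process names that should trigger an alert message."""
    return sorted(p for p in running_processes
                  if p.lower() not in whitelist)

def make_alert(pc_id, process):
    # In the real system this payload would be published to the MQTT
    # broker, e.g. on a topic such as "meerkat/alerts" (name assumed).
    return {"pc": pc_id, "unauthorized": process}

hits = find_unauthorized({"chrome.exe", "keygen.exe", "Explorer.exe"})
alerts = [make_alert("pc-017", p) for p in hits]
```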

  • Development of Translation Telephone System by Using MQTT Protocol   Order a copy of this article
    by Jae Moon Lee 
    Abstract: This paper is a study on the development of a translation telephone system that enables two individuals speaking different languages to communicate through a phone call. The system is developed using the MQTT protocol, a push-service technology, together with voice and translation web services that have been improving rapidly with advances in artificial intelligence. The core technologies applied to the system are voice recognition, text translation, and speech synthesis. To guarantee that the system runs in real time, it is designed to utilize as many threads as possible so that these functions can operate simultaneously. To minimize communication traffic, the system converts a conversation into text and sends the translated text to the counterpart instead of sending voice data. Also, to ensure the accuracy of the translation, the system translates the given information on a sentence-by-sentence basis. The proposed system has been developed to operate on Android smartphones. Because sentences in a normal conversation tend to be short, our experiments indicate that the developed translation telephone system runs in real time.
    Keywords: Telephone System; Translation; Web Service; Push Service; Speech Recognizer; Speech Synthesizer; MQTT.
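A sketch of the sentence-by-sentence translation step, with a worker thread decoupling recognition from translation in the spirit of the multithreaded design described above. The dictionary lookup is a stand-in for the real translation web service, and all strings are illustrative.

```python
# Pipeline sketch: split recognized text into sentences, hand each
# sentence to a worker thread, and translate them one by one.
import queue
import threading

def split_sentences(text):
    return [s.strip() + "." for s in text.split(".") if s.strip()]

def translate(sentence, table):
    return table.get(sentence, sentence)  # stub for the web-service call

def worker(in_q, out, table):
    while True:
        sent = in_q.get()
        if sent is None:        # poison pill ends the thread
            break
        out.append(translate(sent, table))

table = {"Hello.": "Bonjour.", "How are you.": "Comment allez-vous."}
in_q, out = queue.Queue(), []
t = threading.Thread(target=worker, args=(in_q, out, table))
t.start()
for s in split_sentences("Hello. How are you."):
    in_q.put(s)                 # each sentence is translated separately
in_q.put(None)
t.join()
```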

  • An Iris Biometric-based Dual Encryption Technique for Medical Image in e-Healthcare Application   Order a copy of this article
    by Aparna P., P. V.V. Kishore 
    Abstract: Medical image watermarking has been broadly recognized as a relevant technique for improving data content verification, security, image fidelity, and authenticity in the current e-health environment, where medical images are stored, retrieved and transmitted over networks. Securing tele-radiology against issues such as malpractice liability and image retention is a challenging task. To address these security issues, this paper proposes a biometric-key-based medical image watermarking technique for e-healthcare applications. Two types of input are used: a patient MRI image and an electronic health record (EHR). Initially, we segment the region of interest (ROI) and secure its information using the SHA-256 algorithm. We then encrypt the EHR information using the ECC algorithm, in which the key is generated from an iris biometric, increasing the security level of the watermarking system. Next, we concatenate the image and EHR information. To further increase system security, we compress the bit stream using an arithmetic encoding algorithm. Finally, we embed the bit stream into the cover image; the same process is reversed for extraction. Experiments were carried out on different medical images with EHRs, and the effectiveness of the proposed algorithm was analysed using the peak signal-to-noise ratio (PSNR) and normalized correlation (NC). The proposed methodology suits many applications concerned with privacy protection, safety and management.
    Keywords: SHA-256; elliptical curve cryptography; biometric key; watermarking; Authentication; iris image; arithmetic encoding.
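A hedged sketch of two steps from the pipeline above: computing a SHA-256 digest of the segmented ROI bytes and concatenating it with the (already encrypted) EHR payload into one stream. The byte strings are toy placeholders; the ECC encryption, the iris-derived key, and the arithmetic-coding compression are omitted.

```python
# Build the payload to be embedded: a length-prefixed SHA-256 digest
# of the ROI followed by the encrypted EHR bytes.
import hashlib

def roi_digest(roi_bytes):
    return hashlib.sha256(roi_bytes).digest()   # 32-byte integrity tag

def build_payload(roi_bytes, encrypted_ehr):
    digest = roi_digest(roi_bytes)
    # Length-prefix the digest so the extractor can split the stream.
    return len(digest).to_bytes(2, "big") + digest + encrypted_ehr

payload = build_payload(b"toy-roi-pixels", b"toy-encrypted-ehr")
```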

  • An Improved Algorithm Based on √3 Subdivision for Subdivision Surface Modeling   Order a copy of this article
    by Lichun Gu, Jinjin Zheng, Chuangyin Dang, Zhengtian Wu 
    Abstract: In this paper, we present an improved uniform subdivision algorithm based on the √3 subdivision scheme. The original √3 subdivision scheme cannot accommodate sharp features, but the improved algorithm overcomes this difficulty without changing the topology of the subdivision surface. The algorithm uses an interpolation subdivision method and achieves pleasing results by preserving darts, creases, corners and cones. The subdivision surface is smooth everywhere except at infinitely sharp features. The experimental results demonstrate that the new rule outperforms existing ones.
    Keywords: Interpolation Subdivision; Sharp Features; Crease; Dart; Cone; Corner.

Special Issue on: MIWAI 2017 Computational Intelligence and Deep Learning for Computer Vision

  • A Real-time Aggressive Human Behavior Detection System in Cage Environment across Multiple Cameras   Order a copy of this article
    by Phooi Yee Lau, Hock Woon Hon, Zulaikha Kadim, Kim Meng Liang 
    Abstract: The sense of confinement inherent in a cage environment, such as a lock-up or an elevator, makes it a place conducive to criminal activities such as fighting. Monitoring activities in enclosed cage environments has therefore become a necessity. However, placing security guards can be inefficient and ineffective, as guards cannot watch a scene around the clock. A vision-based system employing real-time video analysis technology could be deployed to detect abnormalities such as aggressive behavior, which remains an emerging and challenging problem. In order to monitor suspicious activities in a cage environment, the system should be able (1) to track individuals, (2) to identify their actions, and (3) to keep a record of how often these aggressive behaviors occur at the scene. On top of that, the system should run in real time, taking the following limitations into consideration: (1) viewing angle (fish-eye), (2) low resolution, (3) number of people, (4) low lighting (normal), and (5) number of cameras. This paper proposes a vision-based system that monitors the aggressive activities of individuals in an enclosed cage environment using multiple cameras. This work focuses on analyzing the temporal features of aggressive movement, taking into consideration the limitations discussed previously. Experimental results show that the proposed system is easily realized and achieves impressive real-time performance, even on low-end computers.
    Keywords: surveillance system; behavior monitoring; perspective correction; background subtraction; real-time video processing.
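An illustrative sketch of a temporal motion feature of the kind such a system analyses: frame-differencing energy over consecutive frames, flagging activity when the energy stays high for several frames in a row. The tiny grey-level grids and both thresholds are assumed values, not the paper's method.

```python
# Frame-differencing motion energy with a sustained-motion flag.

def motion_energy(prev, curr):
    """Sum of absolute pixel differences between two frames."""
    return sum(abs(a - b) for row_p, row_c in zip(prev, curr)
               for a, b in zip(row_p, row_c))

def flag_aggression(frames, energy_thr, min_run):
    """Flag frame pairs where high motion persists for min_run steps."""
    run, flags = 0, []
    for prev, curr in zip(frames, frames[1:]):
        run = run + 1 if motion_energy(prev, curr) > energy_thr else 0
        flags.append(run >= min_run)
    return flags

calm = [[0, 0], [0, 0]]
agitated = [[9, 0], [0, 9]]
frames = [calm, calm, agitated, calm, agitated, calm]
flags = flag_aggression(frames, energy_thr=10, min_run=2)
```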

  • Attention-Based Argumentation Mining   Order a copy of this article
    by Derwin Suhartono, Aryo Pradipta Gema, Suhendro Winton, Theodorus David, Mohamad Ivan Fanany, Aniati Murni Arymurthy 
    Abstract: This paper is intended to make a breakthrough in the field of argumentation mining. Current trends in argumentation mining research use handcrafted features and traditional machine learning (e.g., support vector machines). We worked on two tasks: identifying argument components and recognising insufficiently supported arguments. We utilise a deep learning approach and implement an attention mechanism on top of it to obtain the best result. We also implement a Hierarchical Attention Network (HAN) for these tasks. HAN is a neural network that applies attention at two levels: word level and sentence level. Deep learning models with attention mechanisms achieve better results than other deep learning methods. This paper also shows that, on research tasks with hierarchically structured data, HAN performs remarkably well. We additionally present results using XGBoost instead of a regular non-ensemble classifier.
    Keywords: argumentation mining; hand-crafted features; deep learning; attention mechanism; hierarchical attention network; word-level; XGBoost; sentence-level.
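A minimal sketch of the attention computation underlying such models: scores are normalised with a softmax and used to form a weighted sum of input vectors (word vectors at the word level, sentence vectors at the sentence level in HAN). The toy 2-D vectors and scores are illustrative; no learned parameters are shown.

```python
# Attention as a softmax-weighted sum of input vectors.
import math

def softmax(scores):
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(vectors, scores):
    weights = softmax(scores)
    # Weighted sum across vectors, dimension by dimension.
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(vectors[0]))]

vectors = [(1.0, 0.0), (0.0, 1.0)]
context = attend(vectors, scores=[2.0, 2.0])   # equal scores: plain average
```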
    DOI: 10.1504/IJCVR.2019.10018917

    by Antony P.J., Savitha C.K. 
    Abstract: This paper proposes an efficient method for the segmentation and recognition of handwritten characters from Tulu palm leaf manuscript images. The proposed method uses an automated tool combining thresholding and edge detection techniques to binarize the image. A projection profile with connected component analysis is then used for line and character segmentation. A deep convolutional neural network (DCNN) model is used to extract features and recognize the segmented Tulu characters efficiently, with a recognition rate of 79.92%. The results are verified using a benchmark dataset, the AMADI_LontarSet, to generalize our model to the handwritten character recognition task. The results show that our method outperforms existing state-of-the-art models.
    Keywords: Handwritten Character Recognition; Palm Leaf; Segmentation; DCNN; Tulu.
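A sketch of the projection-profile step used for line segmentation: a binarised page (1 = ink) is summed along rows, and contiguous runs of non-empty rows become text lines. The tiny "page" below is synthetic; the real method additionally applies connected component analysis for character segmentation.

```python
# Horizontal projection profile: text lines are runs of rows whose
# ink count is non-zero.

def line_segments(binary_rows):
    profile = [sum(row) for row in binary_rows]   # ink per row
    lines, start = [], None
    for i, v in enumerate(profile):
        if v > 0 and start is None:
            start = i                     # a text line begins
        elif v == 0 and start is not None:
            lines.append((start, i - 1))  # a text line ends
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines

page = [[0, 0, 0],
        [1, 1, 0],   # text line 1
        [0, 1, 1],
        [0, 0, 0],   # gap
        [1, 0, 1]]   # text line 2
lines = line_segments(page)
```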

  • Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media   Order a copy of this article
    by Tho Quan 
    Abstract: Sentiment analysis has recently emerged as one of the major natural language processing (NLP) tasks in many applications. In particular, as social media channels (e.g. social networks or forums) have become significant sources for brands to observe users' opinions about their products, this task is increasingly crucial. However, when working with real data obtained from social media, we notice a high volume of short and informal messages posted by users on those channels. This kind of data causes much difficulty for existing approaches, especially those based on deep learning. In this paper, we propose an approach to handle this problem. This work extends our previous work, in which we proposed to combine the typical deep learning technique of the convolutional neural network (CNN) with domain knowledge. The combination is used for acquiring additional training data augmentation and a more reasonable loss function. In this work, we further improve our architecture with various substantial enhancements, including negation-based data augmentation, transfer learning for word embeddings, the combination of word-level and character-level embeddings, and a multi-task learning technique for attaching domain knowledge rules to the learning process. These enhancements, specifically aimed at handling short and informal messages, yield significant performance improvements in experiments on real datasets.
    Keywords: Sentiment analysis; deep learning; domain knowledge; recurrent neural network;transfer learning; multi-task learning; data augmentation.
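A hedged sketch of what negation-based data augmentation could look like: a synthetic training pair is created by negating a message and flipping its sentiment label. The naive negation rule and the label names are assumptions for illustration, not the paper's exact procedure.

```python
# Negation-based augmentation: negate the text, flip the label.

FLIP = {"positive": "negative", "negative": "positive"}

def negate(text):
    # Naive rule: drop an existing "not", otherwise insert "not" after
    # the first word (stand-in for a real negation transformer).
    if " not " in f" {text} ":
        return text.replace("not ", "", 1)
    words = text.split()
    return " ".join([words[0], "not"] + words[1:])

def augment(dataset):
    """Append a negated, label-flipped copy of every example."""
    return dataset + [(negate(t), FLIP[y]) for t, y in dataset]

data = [("i like this phone", "positive")]
augmented = augment(data)
```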

Special Issue on: Research in Virtual Reality

  • Crowd detection and counting using a static and dynamic platform: State of the Art   Order a copy of this article
    by Huma Chaudhry, Mohd Shafry Mohd Rahim, Tanzila Saba, Amjad Rehman 
    Abstract: Automated object detection and crowd density estimation are popular and important topics in visual surveillance research. The last decades have witnessed many significant publications in this field, and it has been, and still is, a challenging problem for automatic visual surveillance. The ever-increasing research on crowd dynamics and crowd motion necessitates a detailed and updated survey of the different techniques and trends in this field. This paper presents a survey on crowd detection and crowd density estimation from static and moving platforms and reviews the different methods employed for this purpose. The review categorises and delineates several detection and counting estimation methods that have been applied to the examination of scenes from static and moving platforms.
    Keywords: Crowd; Counting; Holistic and Local Motion Features; Estimation; Visual Surveillance; Moving Platform.

  • Real time vision-based hand gesture recognition using depth sensor and a stochastic context free grammar   Order a copy of this article
    by Jayesh Gangrade, Jyoti Bharti 
    Abstract: This paper presents a new computer vision algorithm for the recognition of hand gestures. In the proposed system, a Kinect sensor is used to track and segment the hand against a cluttered background, and features are extracted from the fingers and the angles between them. Hand postures are classified using a multi-class support vector machine, and hand gestures are recognised by a stochastic context-free grammar (SCFG). The SCFG uses syntactic structure analysis and recognises hand gestures through a set of production rules consisting of combinations of hand postures. The proposed algorithm is able to recognise various hand postures in real time with more than 97% accuracy.
    Keywords: hand gesture; stochastic context free grammar; SCFG; multi-class support vector machine; Kinect sensor.
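A toy sketch of the grammar idea above: each gesture is a production rule over hand postures with an attached probability, and an observed posture sequence is recognised by the highest-probability matching rule. The rule set, posture names and probabilities are illustrative.

```python
# SCFG-flavoured gesture recognition: gestures as probabilistic
# production rules over classified hand postures.

RULES = [
    ("wave",  ("open", "open", "open"), 0.6),
    ("grab",  ("open", "fist"),         0.8),
    ("point", ("fist", "index"),        0.7),
]

def recognise(postures, rules=RULES):
    """Return the most probable gesture whose rule matches, else None."""
    matches = [(p, name) for name, rhs, p in rules
               if tuple(postures) == rhs]
    return max(matches)[1] if matches else None

gesture = recognise(["open", "fist"])
unknown = recognise(["fist", "open"])   # no rule matches this order
```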

  • Real time sign language recognition using depth sensor   Order a copy of this article
    by Jayesh Gangrade, Jyoti Bharti 
    Abstract: Communication via gestures is a visual language utilized by the deaf and hard-of-hearing (HoH) community. This paper proposes a system for sign language recognition that utilizes the human skeleton data provided by Microsoft's Kinect sensor. The Kinect sensor generates a skeleton of the human body and distinguishes 20 joints in it. The proposed method utilizes 11 of these 20 joints and extracts 35 novel features per frame, based on distances, angles and velocities involving the upper-body joints. A multi-class support vector machine classifies the 35 Indian sign gestures in real time with an accuracy of 87.6%. The proposed method is robust to cluttered environments and viewpoint variation.
    Keywords: Kinect sensor; Indian sign gesture; Multi class support vector machine; Human computer interaction; Pattern recognition.
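A sketch of the kind of skeleton features described above: a pairwise joint distance and the angle at a joint, computed from toy 3-D coordinates (an elbow angle from shoulder, elbow and wrist). The joint values are made up, and the paper's full 35-feature set is not reproduced.

```python
# Distance and joint-angle features from 3-D skeleton coordinates.
import math

def distance(a, b):
    return math.dist(a, b)

def joint_angle(a, b, c):
    """Angle at joint b, in degrees, between segments b->a and b->c."""
    u = tuple(x - y for x, y in zip(a, b))
    v = tuple(x - y for x, y in zip(c, b))
    dot = sum(p * q for p, q in zip(u, v))
    cos = dot / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
angle = joint_angle(shoulder, elbow, wrist)   # a right angle here
reach = distance(shoulder, wrist)
```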

  • Adaptive Multi-Threshold Based De-noising Filter for Medical Image Applications   Order a copy of this article
    by Ramya A, Murugan D, Murugeswari G, Nisha Joseph 
    Abstract: Medical image processing is an emerging research area, and many researchers have contributed to it by proposing new techniques for medical image enhancement and abnormality detection. Interpretation of medical images is a challenging problem because of the unavoidable noise and interference produced by medical imaging devices. In this work, a new two-phase framework is proposed for noise detection and reduction. The first phase is noise detection, performed using the newly proposed adaptive multi-threshold (AMT) scheme. In the second phase, noisy pixels are modified using an edge-preserving median (EPM) filter, which conserves edge components and controls the blurring effect while preserving the fine details of interior regions. The proposed work is tested on benchmark images and a few medical images; it produces promising results, which are compared with existing two-stage noise reduction techniques. Popular performance metrics such as PSNR and SSIM are used for evaluation. Quantitative analysis and experimental results demonstrate that the proposed method is efficient and suitable for medical image pre-processing.
    Keywords: Noise removal; Noise Detection; Impulse noise; Multi-Threshold; Edge Preserving.
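A sketch of the two-phase idea on a 1-D toy signal: a pixel is flagged as impulse noise when it differs from its local median by more than a threshold (a single-threshold stand-in for the adaptive multi-threshold test), and only flagged pixels are replaced, so clean pixels and edges are left untouched. The threshold and signal are assumed values.

```python
# Two-phase detect-then-replace filtering with a sliding median.
from statistics import median

def denoise(signal, thr):
    out = list(signal)
    for i in range(1, len(signal) - 1):
        # Use already-corrected left neighbours (recursive filtering)
        # so an earlier impulse cannot pollute later windows.
        med = median(out[i - 1:i + 2])
        if abs(signal[i] - med) > thr:   # phase 1: detect
            out[i] = med                 # phase 2: replace
    return out

noisy = [10, 10, 255, 10, 80, 80, 80]    # one impulse, one real edge
clean = denoise(noisy, thr=50)
```

The impulse at index 2 is removed while the genuine 10-to-80 edge survives, illustrating the edge-preserving intent of the EPM stage.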