International Journal of Computational Vision and Robotics (29 papers in press)
Automated Identification and Counting of Proliferating Mesenchymal Stem Cells in Bone Callus
by Samer Awad, Rula Abdallat, Othman Smadi, Thakir AlMomani
Abstract: cell proliferation development. Manual counting can be time-consuming and is subject to human error, as it depends on visual inspection. Automated counting using software-based morphological analysis, on the other hand, can eliminate or reduce these disadvantages and provide statistical reliability. In this study, we employ a software-based method for the automated counting of proliferating mesenchymal stem cells (MSCs) in the bone callus of Wistar rats to evaluate fracture healing. The proposed method starts by extracting the green component of the digital image acquired using a light microscope. The subsequent stages involve contrast enhancement, adaptive thresholding and false detection reduction. The method was tested using 48 MSC images and the results were evaluated by a specialist. The average precision, recall and F-measure were found to be 87.14%, 88.04% and 87.50%, respectively.
Keywords: automated cell counting; biological cell counting; image processing; image segmentation; pattern recognition; automated thresholding; light microscopic images; mesenchymal stem cells; MSC; false detection minimisation.
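The precision, recall and F-measure figures quoted in the abstract above follow the standard detection definitions. The following is a minimal sketch, assuming per-image counts of true positives, false positives and false negatives are available from the specialist's evaluation. The function name and the example counts are hypothetical and are not taken from the paper.

```python
from typing import Tuple


def detection_scores(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for one image.

    tp: detected cells confirmed by the specialist
    fp: detections that are not cells (false detections)
    fn: cells that the method missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts for a single image, for illustration only.
p, r, f = detection_scores(tp=44, fp=6, fn=5)
print(f"precision={p:.2%} recall={r:.2%} F={f:.2%}")
```

Note that averaging per-image F-measures is not the same as taking the harmonic mean of the averaged precision and recall, which may be why the three reported averages do not satisfy the F-measure formula exactly.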
Content-Based Image Retrieval (CBIR): A deep look at features prospectus.
by Mohammed Suliman Haji, Amjad Rehman, Tanzila Saba
Abstract: The number of digital images on the internet is growing rapidly and, accordingly, content-based image retrieval systems are in high demand. Content-Based Image Retrieval (CBIR) is an image search technique that does not depend on manually assigned annotations; rather, CBIR uses discriminative features to search for an image. By refining features, an efficient retrieval mechanism can be achieved. The aim of this research is to review feature extraction and selection techniques that have an impact on CBIR and on information extraction from images using global and local features such as shape, texture, and color. In order to extract the most appropriate features for CBIR, several feature extraction and selection techniques are analyzed and their efficiency is compared. Additionally, shortcomings of current content-based image retrieval techniques are addressed and possible solutions are suggested to enhance accuracy.
Keywords: CBIR; Discrete Wavelet Transform; Low-level Features and High-level Features.
Bio-inspired visual attention process using spiking neural networks controlling a camera
by André Cyr, Frédéric Thériault
Abstract: This study introduces virtual and physical implementations of a bottom-up visual attention mechanism using a spiking neural network (SNN) controlling a camera. The SNN is able to focus on simple stimuli of various lengths that appear randomly in the camera's view. This is accomplished with an overt process based on a competitive choice according to a stimulus quadrant location. After focusing a selected stimulus toward the center of its view, the SNN scans it from one edge to the other. Since the spike train of dedicated neurons reflects the duration of each scan, it allows the extraction of the stimulus length. Upon the completion of a scan, the SNN has the ability to switch to another stimulus. This preliminary work on spatial visual attention is intended as a step toward the study of the concept size learning process in a robotic context.
Keywords: Spiking neurons; Robotics; Overt process; Visual Attention.
Fast Binary Shape Categorization for Intelligent Vehicles
by Insaf Setitra, Slimane Larabi
Abstract: In this paper we propose a novel method for shape categorization suitable for video surveillance and intelligent vehicle applications. The binary shape is convolved with a Gaussian filter at different scales, and curvatures are detected at each scale. The shape is then described using the arc length and radius of curvature. This descriptor allows differentiating between shapes, particularly for objects that may appear in front of an intelligent vehicle or in a monitored scene, such as a pedestrian, car, cyclist or animal (horse, cow, dog, cat). The experiments conducted show that our method is competitive with state-of-the-art methods in terms of accuracy and surpasses them in terms of processing time.
Keywords: Binary shape; Categorisation; Intelligent vehicles; Matching; Curvature; Scale space.
Enhancing Proximity Measure Between Residual and Noise for Image Denoising
by Gulsher Baloch, Junaid Ahmed
Abstract: Image denoising algorithms based on sparse representation and dictionary learning approximate the clean image patch by a linear combination of a few dictionary atoms. Clearly, the residue after completion of denoising must be similar to the contaminating noise. Ideally, the clean image patch is perfectly recovered if the residue is exactly the contaminating noise. Hence, for better denoising, the residue must be forced to possess characteristics similar to those of the contaminating noise. In this paper, we model the residue such that the proximity between the residue and the contaminating noise is increased. The proposed mathematical model ensures that the residue is as random in nature as the contaminating noise. This is achieved by unique sparse coding and dictionary update stages developed from a model of randomness in the residue. The proposed algorithm is tested on Additive White Gaussian Noise (AWGN), Additive Colored Gaussian Noise (ACGN) and Laplacian Noise. Since the performance of image denoising algorithms also depends on the effective image bandwidth, we generate synthetic images with known effective bandwidths. These images are generated using the Discrete Cosine Transform (DCT), and the proposed algorithm is also tested on them. The proposed algorithm is compared with state-of-the-art algorithms. The comparison on the basis of peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM) and feature similarity index measure (FSIM) indicates that the proposed algorithm produces results that are competitive and often better.
Keywords: Additive Colored Noise; Laplacian Noise; Residual Correlation; Image Denoising.
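PSNR, one of the comparison metrics named in the abstract above, follows directly from the mean squared error between the clean and the denoised image. The following is a minimal sketch, assuming 8-bit greyscale images stored as nested lists. The function names and pixel values are hypothetical, and the sketch illustrates only the metric, not the paper's denoising algorithm.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]


def mse(reference: Image, estimate: Image) -> float:
    """Mean squared error between two equally sized greyscale images."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            total += (ref_px - est_px) ** 2
            count += 1
    return total / count


def psnr(reference: Image, estimate: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(reference, estimate)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


# Hypothetical 2x2 images, for illustration only.
clean = [[100, 120], [140, 160]]
denoised = [[101, 118], [140, 163]]
print(f"MSE={mse(clean, denoised):.2f}  PSNR={psnr(clean, denoised):.2f} dB")
```

A higher PSNR means the denoised image is closer to the clean reference. SSIM and FSIM measure structural and feature similarity rather than pixel-wise error.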
Push Recovery System and Balancing of a Biped Robot on Steadily Increasing Slope of an Inclined Plane
by Pravat Kumar Behera, Ravi Kumar Mandava, Pandu R. Vundavilli
Abstract: The present research paper demonstrates a push recovery system and balancing on an inclined plane for a small-sized biped robot with 20 degrees of freedom (DOF). Here, the authors developed an algorithm to balance the posture of the biped robot while standing (that is, in stationary mode) and while balancing on an inclined plane with steadily increasing slope. To maintain the stability of the biped robot, stability controllers are integrated into the walking controller. This enables the robot to sense any disturbance and perform the necessary action to maintain its stability. For measuring the external disturbances and the orientation of the ground, an inertial measurement unit sensor is fitted inside the robot. Further, the robot is allowed to generate internal torque through the movement of its body parts to resist external disturbances. This principle is extended to test the balance of the biped robot on an inclined plane with increasing inclination angle. The robot is seen to perform both tasks successfully in real time: push recovery and maintaining balance on a steadily increasing slope.
Keywords: Push recovery; external disturbances; balance on an inclined plane.
Ethiopian Maize Diseases Recognition and Classification using: Support Vector Machine
by Enquhone Alehegn
Abstract: Currently, around 72 maize diseases are found in Ethiopia, attacking different parts of the maize plant. Of these, maize common rust, maize leaf blight and maize gray leaf spot are the most common diseases attacking maize leaves across Ethiopian farmland. Traditionally, maize leaf diseases are identified and classified by chemical analysis or visual observation. However, these traditional mechanisms have drawbacks: they are inconsistent, costly, time-consuming and prone to error, and they require professional staff. Therefore, many researchers across the world have worked on identifying and classifying the different types of diseases that attack maize using model-based image processing and computer vision to support experts. However, to the best of the researchers' knowledge, no attempt has been made for an Ethiopian maize disease data set. In this study, an attempt has been made to develop maize leaf disease recognition and classification using both a support vector machine model and image processing. To evaluate the recognition and classification accuracy, 80% of the total data set of 800 images was used for training and the remaining 20% for testing the model. Based on the experimental results, an average accuracy of 95.63% was achieved using combined (texture, colour and morphology) features with a support vector machine.
Keywords: Maize Disease; image pre-processing; features; feature extracted; image segmentation; image enhancement; noise removal; binarization; SVM.
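The 80%/20% partition of the 800 images described in the abstract above can be sketched as follows. This is an illustrative sketch only: the file names, the seed and the use of a simple shuffled split are assumptions, and the abstract does not say how the images were actually assigned to the two sets.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split(items: Sequence[str], train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle a copy of `items` and split it into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the 800 maize leaf images.
images = [f"leaf_{i:03d}.jpg" for i in range(800)]
train_set, test_set = train_test_split(images)
print(len(train_set), len(test_set))  # 640 160
```

With 800 images, this yields 640 training images and 160 held-out test images.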
An Integrative Approach for Tracking of Mobile Robot with Vision Sensor
by Sangarm Keshari Das, Sabyasachi Dash, B.K. Rout
Abstract: The current work presents an experimental approach that incorporates feature-based object detection, a KLT algorithm-based tracking method and a Kalman filter-based de-noising technique in a real-time environment. In the detection phase, the mobile robot is detected using the Viola-Jones algorithm, which extracts detectable features. The position of the mobile robot is then computed with homography constraints, and a region of interest window is set up to accommodate the mobile robot. In the tracking phase, the region of interest window is processed using the KLT algorithm. The proposed method is of practical importance when the mobile robot is tracked while moving on a predetermined (specified) path, as the size of the image of the mobile robot is small relative to the captured image of the environment. Thus, the analysis of the captured image of the environment becomes unnecessary for tracking, and the approach thereby reduces the computational load. The proposed approach accurately detects and tracks the mobile robot, with an error percentage ranging from 0.5% to 10% in different parts of the specified path.
Keywords: Mobile robot; Viola Jones algorithm; KLT algorithm; Kalman Filter; Vision based Tracking.
Visual Cues based Deception Detection Using Two Class Neural Network
by Sabu George, Manohara Pai M.M, Radhika M. Pai, Samir Kumar Praharaj
Abstract: A deception detection technique that helps to analyse a person without his or her knowledge is more convenient and effective than other methods of deception detection. In this paper, a deception detection study based on facial visual cues is performed. An experiment was conducted with the participation of 62 subjects. Facial muscle variations during the lie and truth responses of the subjects were recorded using a high-speed camera, and the corresponding Action Units (AUs) were used to train and then test two-class neural networks for truth and lie prediction. The prediction performance was analysed using 5 different sets each having 10%, 20% and 30% test samples.
Keywords: Lie face analysis; AU analysis; deception detection.
Images-to-Images Person ReID Without Temporal Linking
by Thuy-Binh Nguyen, Thi-Lan Le, Ngoc-Nam Pham
Abstract: This paper addresses images-to-images person re-identification, in which there are multiple images for each individual in both the gallery and the probe. Most existing approaches that try to extract or learn features require temporal linking between frames. This paper proposes a novel framework to overcome this requirement by formulating images-to-images person re-identification as a fusion function of image-to-images. First, a ranked list of candidates corresponding to each query image is determined. Then, these lists are fused to determine the matched person. The contributions of the paper are two-fold: (1) an extra feature (Gaussian of Gaussian) is used for person representation; (2) a new images-to-images scheme is proposed that does not require temporal linking and retains the benefits of the image-to-images scheme. Extensive experiments on the CAVIAR4REID (case A and B) and RAiD datasets demonstrate the effectiveness of the framework. The proposed scheme obtains +20.88%, +10.23% and +10.39% improvement in rank-1 over the image-to-images scheme on these datasets.
Keywords: Multi-shot; person re-identification; late fusion; images-to-images person re-identification.
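The two-step idea in the abstract above (one ranked list per query image, then fusion of the lists) can be illustrated with a simple late-fusion sketch. The sum rule, the function name and the similarity scores below are assumptions for illustration. The abstract does not specify which fusion function the authors use.

```python
from collections import defaultdict
from typing import Dict, List


def fuse_rankings(per_query_scores: List[Dict[str, float]]) -> List[str]:
    """Fuse per-query-image similarity scores into one ranked list of identities.

    Each dict maps a gallery identity to its similarity with one probe image
    (higher means more similar). The sum rule used here is an assumption.
    """
    totals: Dict[str, float] = defaultdict(float)
    for scores in per_query_scores:
        for identity, score in scores.items():
            totals[identity] += score
    # Highest fused score first; ties are broken by identity name.
    return sorted(totals, key=lambda identity: (-totals[identity], identity))


# Hypothetical scores for three probe images of the same person.
probe_scores = [
    {"id_A": 0.9, "id_B": 0.6, "id_C": 0.1},
    {"id_A": 0.4, "id_B": 0.7, "id_C": 0.2},
    {"id_A": 0.8, "id_B": 0.5, "id_C": 0.3},
]
print(fuse_rankings(probe_scores))
```

In this example the second probe image alone would rank `id_B` first, but fusing all three lists recovers `id_A`. No temporal ordering of the probe images is needed.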
A Technique to Validate Automatic Generation of B
by Nabil Messaoudi, Allaoua Chaoui, Bettaz Mohamed
Abstract: Several approaches have been proposed in the literature to transform UML models into formal methods for verification purposes. However, few of these approaches take into account the validation of such transformations. This paper is a proposal in this context. It has two parts: first, we propose a technique to control the output of a transformational tool, in order to obtain safe transformational rules; second, we propose a way to generate the formal model B
Keywords: UML 2 Sequence diagrams; Semantics; Model Transformations Validation; Büchi automata; AGG.
A framework for automatically constructing a dataset for training a vehicle detector
by Changyon Kim, Jeonghwan Gwak, Moongu Jeon
Abstract: Object detection based on a trained detector has been widely applied to diverse tasks such as pedestrian, face, and vehicle detection. In such an approach, detectors are learned offline with an enormous number of training samples. However, the approach has a significant drawback: heavy human intervention and effort, as well as domain knowledge, are required to construct a reliable training dataset. To remedy this drawback, we propose a framework to collect and label training samples automatically. By analyzing information from foreground blobs obtained from background subtraction results, a training dataset can be constructed without any human effort. In addition, the condition of the scene is investigated periodically to check the suitability of sample candidates. As a result, the framework generates an accurate vehicle detector. With the proposed method, training samples are automatically collected only when vehicle blobs in the given scene provide suitable appearance information. The effectiveness of the proposed framework is demonstrated on vehicle detection tasks in real traffic environments.
Keywords: Object detection; Optimal vehicle detector; Appearance model; Scene condition investigation; Automatic sample collection.
Effective scene change detection in complex environments
by Hui Fuang Ng, Chee Yang Chin
Abstract: One of the fundamental operations in computer vision applications is change detection, in which moving foreground objects are segmented from a static background. A common approach to change detection is the comparison of an image frame with the stored background model using a matching algorithm, a process known as background subtraction. However, such techniques fail in environments with dynamic backgrounds, illumination changes, shadows or camera jitter. This study focuses on effectively detecting scene changes in complex environments. To this end, we propose a new colour descriptor named Local Colour Difference Pattern (LCDP) that is insusceptible to shadow and is able to capture both colour and texture features at a pixel location. Furthermore, a scene change detection framework is proposed to handle dynamic scenes based on sample consensus that integrates LCDP and a novel spatial model fusion mechanism. Experiments using the CDnet benchmark dataset demonstrated the effectiveness of the proposed approach to change detection in complex environments.
Keywords: change detection; background subtraction; moving object segmentation; foreground segmentation; local descriptor; video signal processing; CDnet.
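The baseline that the abstract above calls background subtraction, comparing each frame with a stored background model, can be sketched as a per-pixel threshold test. This is a minimal illustration of that common baseline only, not of the proposed LCDP descriptor. The threshold value, the function name and the pixel data are assumptions.

```python
from typing import List

Image = List[List[int]]


def foreground_mask(frame: Image, background: Image, threshold: int = 30) -> Image:
    """Mark a pixel as foreground (1) when it differs from the stored
    background model by more than `threshold` grey levels, else 0."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical 2x3 greyscale background model and incoming frame.
background = [[50, 50, 50], [50, 50, 50]]
frame = [[52, 200, 49], [50, 190, 75]]
print(foreground_mask(frame, background))  # [[0, 1, 0], [0, 1, 0]]
```

A fixed per-pixel threshold like this is exactly what breaks under illumination changes, shadows or a dynamic background, which is the failure mode the abstract sets out to address.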
3D Image Reconstruction from Different Image Formats Using Marching Cubes Technique
by Abdou Shalaby, Mohammed Elmogy, Ahmed AboElfetouh
Abstract: Structure from motion (SFM) is the problem of reconstructing a 3D image from 2D images. The main problem of 3D reconstruction is that the quality of the 3D image depends on the number of 2D slices input to the system, while a large number of 2D slices may lead to high processing time. This paper introduces a new model to reconstruct a 3D image from any 2D image using the marching cubes algorithm. We use the LabVIEW program to build the system and the Biomedical Toolkit to read and register any 2D images. Our main goal is to implement a 3D reconstruction system that produces a high-quality 3D image with a minimum number of 2D slices and decreases the execution time as much as possible. We apply our system to two datasets; the experimental results demonstrate the efficiency and effectiveness of this system in 3D image reconstruction from any 2D image type. As shown in the results, changing the iso_value, the image type and the number of images affects the quality of the 3D image reconstruction and the processing time.
Keywords: 3D image reconstruction; Marching cubes; LabVIEW; 2D image registration; computed tomography (CT); Magnetic Resonance (MR); Single-photon emission computed tomography (SPECT).
Use of Radial Basis Function Network with Discrete Wavelet Transform for Speech Enhancement
by Rashmirekha Ram, Mihir Narayan Mohanty
Abstract: Neural networks occupy a prominent position in the field of detection, recognition and classification. However, the use of these models for signal enhancement is a new direction of research. In this paper, a neural network is used to enhance the quality of speech. The efficient Radial Basis Function Network (RBFN) model is chosen for enhancement of the noisy signals. The wavelet transform is used for decomposition of the signal, and it works in two ways. In the first stage, the noise in the input signal is reduced. Next, these coefficients are used as weights of the RBFN model, which makes processing faster compared with the use of random weights. The output of the proposed model is measured in terms of Signal to Noise Ratio (SNR), Segmental Signal to Noise Ratio (SegSNR) and Perceptual Evaluation of Speech Quality (PESQ). The performance of the proposed method was found to be excellent and is presented in the results section.
Keywords: Speech Enhancement; Discrete Wavelet Transform; Radial Basis Function Network; Signal to Noise Ratio; Segmental Signal to Noise Ratio; Perceptual Evaluation of Speech Quality.
Crypto-compression Scheme based on the DWT for Medical Image Security
by Med Karim Abdmouleh
Abstract: Ensuring the confidentiality of exchanged data is always a great concern for any communication. The purpose of compression, in turn, is to reduce the amount of data while preserving important information. This reduction allows more information to be archived on the same storage medium and minimizes transfer times over telecommunication networks. Indeed, the combination of encryption and compression guarantees both confidentiality and authentication of information. In addition, it reduces processing time and transmission time on public channels and increases storage capacity. In this paper, we propose a new approach to partial or selective encryption for medical images, based on the Discrete Wavelet Transform (DWT) coefficients and compatible with the JPEG2000 standard. The results obtained show that the proposed scheme provides a significant reduction in processing time during encryption and decryption, without compromising the high compression rate of the compression algorithm.
Keywords: Crypto-compression; Encryption; Compression; Discrete Wavelet Transform; RSA; JPEG2000; Telemedicine.
Non-Invasive Technique of Diabetes Detection using Iris Images.
by Kesari Verma, Bikesh Kumar Singh, Neelam Agrawal
Abstract: Alternative medicine techniques are important for improving quality of life and preventing disease, and can be preferable to conventional invasive methods of disease detection. This paper addresses a non-invasive approach to diabetes detection using iris images. The proposed technique evaluates the use of iridology to diagnose diabetes using modern digital image processing techniques that analyse structural properties of the iris and classify the patterns accordingly. The system analyses the broken tissues of the iris by extracting significant textural features from a subsection of the iris using a Gabor filter bank and the Gray Level Co-occurrence Matrix (GLCM). The extracted textural features help to categorize diabetic and non-diabetic irises using benchmark Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers. The promising results of extensive experiments demonstrate the effectiveness of the proposed method.
Keywords: diabetes detection; image processing; iris images; support vector machine; artificial neural network; SVM; ANN; gabor features; gray level co-occurrence matrix; GLCM; Non-Invasive Technique.
Autonomous Void Detection and Characterisation in Point Clouds and Triangular Meshes
by Benjamin Bird, Barry Lennox, Simon Watson, Thomas Wright
Abstract: In this paper we propose and demonstrate a novel void characterisation algorithm which is able to distinguish between internal and external voids that are present in point clouds of both manifold and non-manifold objects and 3D scenes. We demonstrate the capabilities of our algorithm using several point clouds representing both scenes and objects. Our algorithm is shown in both a descriptive overview format as well as pseudocode. We also compare a variety of different void detection algorithms and then present a novel refinement to the best performing of these algorithms. Our refinement allows for voids in point clouds to be detected more efficiently, with fewer false positives and with over an order of magnitude improvement in terms of run time. We show our run time performance and compare it to results obtained using alternative algorithms, when tested using popular single board computers. This comparison is important as our work is intended for online robotics applications, where hardware is typically of low computational power. The target application for this work is 3D scene reconstruction to aid in the decommissioning of nuclear facilities.
Keywords: Point Cloud; Void Detection; Meshing; Reconstruction; Computer
GCSAC: Geometrical Constraint SAmple Consensus for Primitive Shapes Estimation in 3-D Point Cloud
by Le Van Hung, Hai Vu, Thi-Thuy Nguyen, Thi-Lan Le, Thanh-Hai Tran
Abstract: Estimating the parameters of a primitive shape from 3-D point cloud data is a challenging problem because of noise in the data and the computational time required. In this paper, we present a new robust estimator (named GCSAC, Geometrical Constraint SAmple Consensus) aimed at solving such issues. The proposed algorithm takes into account geometrical constraints to construct qualified samples for the estimation. Instead of randomly drawing a minimal subset of samples, explicit geometrical properties of the primitive shapes of interest (e.g., cylinder, sphere and cone) are used to drive the sampling procedure. At each iteration of GCSAC, the minimal subset of samples is selected based on two criteria: (1) it must be consistent with the estimated model, as checked by a rough inlier ratio evaluation; (2) the samples must satisfy the geometrical constraints of the objects of interest. Based on the good samples obtained, the model estimation and verification procedures of the robust estimator are deployed in GCSAC. Extensive experiments have been conducted on synthesized and real datasets for evaluation. Compared with the common robust estimators of the RANSAC family (RANSAC, PROSAC, MLESAC, MSAC, LO-RANSAC and NAPSAC), GCSAC performs better in terms of both the precision of the estimated model and computational time. The implementations of the proposed method and the datasets are made publicly available.
Keywords: Robust Estimator; Primitive Shape Estimation; RANSAC and RANSAC Variations; Quality of Samples; Point Cloud data.
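For context, the plain RANSAC loop that GCSAC is compared against draws its minimal subset purely at random. The sketch below shows that baseline on a deliberately simple model, a 2-D line, rather than the cylinders, spheres and cones discussed in the abstract. It does not implement GCSAC's geometrical constraints. All names, the tolerance and the data are assumptions for illustration.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1


def line_through(p: Point, q: Point) -> Optional[Line]:
    """Unit-normal line through two points, or None if the points coincide."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None
    c = q[0] * p[1] - p[0] * q[1]
    return a / norm, b / norm, c / norm


def ransac_line(points: List[Point], iterations: int = 200,
                tolerance: float = 0.1, seed: int = 0) -> Tuple[Optional[Line], List[Point]]:
    """Plain RANSAC: random minimal samples, keep the model with most inliers."""
    rng = random.Random(seed)
    best_line: Optional[Line] = None
    best_inliers: List[Point] = []
    for _ in range(iterations):
        model = line_through(*rng.sample(points, 2))  # minimal subset: 2 points
        if model is None:
            continue
        a, b, c = model
        # With a unit normal, |a*x + b*y + c| is the point-to-line distance.
        inliers = [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = model, inliers
    return best_line, best_inliers


# Hypothetical data: 20 points on y = 2x + 1 plus 5 gross outliers.
on_line = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(3.0, 40.0), (7.0, -20.0), (12.0, 55.0), (15.0, -5.0), (1.0, 30.0)]
line, inliers = ransac_line(on_line + outliers)
slope = -line[0] / line[1]
print(f"slope={slope:.3f} inliers={len(inliers)}")
```

A constraint-driven sampler in the spirit of the abstract would replace the `rng.sample` call with a selection step that rejects minimal subsets violating the shape's geometry, so fewer iterations are spent on poor samples.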
Exploring the Effects of Non-Local Blocks on Video Captioning Networks
by Jaeyoung Lee, Junmo Kim
Abstract: In addition to visual features, video also contains temporal information that contributes to semantic meaning regarding the relationships between objects and scenes. There have been many attempts to describe spatial and temporal relationships in video, but simple encoder-decoder models are not sufficient for capturing detailed relationships in video clips. A video clip often consists of several shots that seem to be unrelated, and simple recurrent models suffer from these changes in shots. In other fields, including visual question answering and action recognition, researchers have begun to take an interest in describing visual relations between objects. In this paper, we introduce a video captioning method that captures temporal relationships with a non-local block and a boundary-aware system. We evaluate our approach on the Microsoft Video Description Corpus (MSVD, YouTube2Text) dataset and the Microsoft Research-Video to Text (MSR-VTT) dataset. The experimental results show that a non-local block applied along the temporal axis can improve performance on video captioning datasets.
Keywords: Video captioning; Non-local mean; Self-attention; Video description.
A Novel Approach for Mitigating Atmospheric Turbulence using Weighted Average Sobolev Gradient and Laplacian
by Prifiyia Nunes, Dippal Israni, Karthick D, Arpita Shah
Abstract: Heat scintillation is a main cause of atmospheric turbulence, which distorts images as light propagates through the volatile environment. Changes in the refractive index due to variation in wind velocity also cause turbulence in the atmosphere. The traditional image registration approach falls short because it is computationally expensive and needs a post-processing algorithm to sharpen the image. A non-registration-based Sobolev Gradient and Laplacian (SGL) algorithm removes turbulence but produces ghost artifacts in moving objects. This paper proposes a novel approach based on weighted average SGL. The proposed method mitigates atmospheric turbulence and also restores the moving object in the scene. Performance metrics such as SSIM and MSE show that the proposed algorithm outperforms state-of-the-art algorithms in terms of restoring the geometric distortion as well as the object of interest.
Keywords: Atmospheric Turbulence; Heat Scintillation; Restoration; Sobolev; Phase Shift; Weighted Average.
Special Issue on: Research in Virtual Reality
Real time sign language recognition using depth sensor
by Jayesh Gangrade, Jyoti Bharti
Abstract: Sign language is a visual language used by the deaf and hard-of-hearing (HoH) community. This paper proposes a system for sign language recognition that uses human skeleton data provided by Microsoft's Kinect sensor to recognize sign gestures. The Kinect sensor generates the skeleton of a human body and distinguishes 20 joints in it. The proposed method uses 11 of the 20 joints and extracts 35 novel features per frame, based on distances, angles and velocity involving upper body joints. A multi-class Support Vector Machine classified 35 Indian sign gestures in real time with an accuracy of 87.6%. The proposed method is robust to cluttered environments and viewpoint variation.
Keywords: Kinect sensor; Indian sign gesture; Multi class support vector machine; Human computer interaction; Pattern recognition.
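The abstract above describes per-frame features built from distances, angles and velocity of upper-body joints. The following is a minimal sketch of those three kinds of feature, assuming each joint is a 3-D point in metres and frames arrive at 30 frames per second. The joint names, coordinates and function names are hypothetical, and the paper's actual 35 features are not specified here.

```python
import math
from typing import Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position of one skeleton joint


def joint_distance(a: Joint, b: Joint) -> float:
    """Euclidean distance between two joints."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def joint_angle(a: Joint, vertex: Joint, b: Joint) -> float:
    """Angle in degrees at `vertex` between the segments vertex->a and vertex->b.

    Assumes neither `a` nor `b` coincides with `vertex`.
    """
    u = [p - q for p, q in zip(a, vertex)]
    v = [p - q for p, q in zip(b, vertex)]
    dot = sum(p * q for p, q in zip(u, v))
    norms = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(p * p for p in v))
    cosine = max(-1.0, min(1.0, dot / norms))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def joint_speed(previous: Joint, current: Joint, dt: float) -> float:
    """Speed of a joint between two consecutive frames taken `dt` seconds apart."""
    return joint_distance(previous, current) / dt


# Hypothetical joint positions for one frame.
shoulder, elbow, wrist = (0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.3, 0.4, 0.0)
wrist_next = (0.3, 0.4, 0.1)  # wrist position one frame later
features = [
    joint_distance(shoulder, wrist),
    joint_angle(shoulder, elbow, wrist),
    joint_speed(wrist, wrist_next, dt=1 / 30),
]
print([round(value, 3) for value in features])
```

Feature vectors of this kind, computed per frame, are what a multi-class classifier such as an SVM would then be trained on.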
Adaptive Multi-Threshold Based De-noising Filter for Medical Image Applications
by Ramya A, Murugan D, Murugeswari G, Nisha Joseph
Abstract: Medical image processing is an emerging research area, and many researchers have contributed to it by proposing new techniques for medical image enhancement and abnormality detection. Interpretation of medical images is a challenging problem because of the unavoidable noise produced by medical imaging devices and by interference. In this work, a new framework is proposed for noise detection and reduction. The framework comprises two phases. The first is the noise detection phase, which is performed using the newly proposed Adaptive Multi-Threshold (AMT) scheme. In the second phase, noisy pixels are modified using an Edge Preserving Median (EPM) filter, which conserves the edge component and controls the blurring effect while preserving the fine details of interior regions. The proposed work is tested with benchmark images and a few medical images. It produces promising results, which are compared with those of existing two-stage noise reduction techniques. Popular performance metrics such as PSNR and SSIM are used for evaluation. Quantitative analysis and experimental results demonstrate that the proposed method is more efficient and is suitable for medical image pre-processing.
Keywords: Noise removal; Noise Detection; Impulse noise; Multi-Threshold; Edge Preserving.
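The two-phase structure described in the abstract above (detect noisy pixels first, then modify only those) can be illustrated with a generic switching median filter. This is a stand-in sketch, not the AMT or EPM schemes themselves. It assumes salt-and-pepper impulses at the extreme grey levels 0 and 255 and a 3x3 window, and all names and pixel values are hypothetical.

```python
from statistics import median
from typing import List

Image = List[List[int]]


def is_impulse(value: int, low: int = 0, high: int = 255) -> bool:
    """Phase 1 (detection): flag pixels at the extreme grey levels as noisy.

    A fixed test is assumed here in place of adaptive thresholds.
    """
    return value <= low or value >= high


def switching_median(image: Image) -> Image:
    """Phase 2 (reduction): replace each flagged pixel with the median of the
    unflagged pixels in its 3x3 neighbourhood; other pixels are untouched."""
    rows, cols = len(image), len(image[0])
    output = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if not is_impulse(image[r][c]):
                continue  # uncorrupted pixel: keep as is, preserving detail
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and not is_impulse(image[rr][cc])
            ]
            if neighbours:  # leave the pixel unchanged if no clean neighbour exists
                output[r][c] = int(median(neighbours))
    return output


# Hypothetical 3x3 patch with one "salt" (255) and one "pepper" (0) pixel.
noisy = [[10, 12, 11], [13, 255, 12], [11, 0, 10]]
print(switching_median(noisy))  # [[10, 12, 11], [13, 11, 12], [11, 11, 10]]
```

Because only flagged pixels are modified, clean pixels and edges pass through unchanged, which is the motivation for separating detection from filtering. Genuine pixels that happen to equal 0 or 255 would be misclassified by this fixed test, a limitation that adaptive thresholds are meant to reduce.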
Crowd detection and counting using a static and dynamic platform: State of the Art
by Huma Chaudhry, Mohd Shafry Mohd Rahim, Tanzila Saba, Amjad Rehman
Abstract: Automated object detection and crowd density estimation are popular and important topics in the visual surveillance research area. The last decades have witnessed many significant publications in this field, and it has been and still is a challenging problem for automatic visual surveillance. The ever-increasing research in the field of crowd dynamics and crowd motion necessitates a detailed and updated survey of the different techniques and trends in this field. This paper presents a survey on crowd detection and crowd density estimation from moving platforms and surveys the different methods employed for this purpose. The review categorises and delineates several detection and counting estimation methods that have been applied to the examination of scenes from static and moving platforms.
Keywords: Crowd; Counting; Holistic and Local Motion Features; Estimation; Visual Surveillance; Moving Platform.
Real time vision-based hand gesture recognition using depth sensor and a stochastic context free grammar
by Jayesh Gangrade, Jyoti Bharti
Abstract: This paper presents a new computer vision algorithm for the recognition of hand gestures. In the proposed system, a Kinect sensor is used to track and segment the hand against a cluttered background, and features are extracted from the fingers and the angles between them. The hand posture is classified using a multi-class support vector machine. The hand gesture is then recognised by a stochastic context free grammar (SCFG). The SCFG uses syntactic structure analysis and recognises hand gestures by a set of production rules, each of which consists of a combination of hand postures. The proposed algorithm is able to recognise various hand postures in real time with more than 97% accuracy.
Keywords: hand gesture; stochastic context free grammar; SCFG; multi-class support vector machine; Kinect sensor.
Special Issue on: MIWAI 2017 Computational Intelligence and Deep Learning for Computer Vision
A Real-time Aggressive Human Behavior Detection System in Cage Environment across Multiple Cameras
by Phooi Yee Lau, Hock Woon Hon, Zulaikha Kadim, Kim Meng Liang
Abstract: The sense of confinement inherent in a cage environment, such as a lock-up or an elevator, makes it a place conducive to criminal activities such as fighting. The monitoring of activities in enclosed cage environments has, therefore, become a necessity. However, placing security guards could be inefficient and ineffective, as it is impossible for them to monitor the scene 24/7. A vision-based system employing real-time video analysis technology could be deployed to detect abnormalities such as aggressive behavior, although this remains an emerging and challenging problem. In order to monitor suspicious activities in a cage environment, the system should be able (1) to track individuals, (2) to identify their actions, and (3) to keep a record of how often aggressive behavior happens at the scene. On top of that, the system should be implemented in real time, whereby the following limitations should be taken into consideration: (1) viewing angle (fish-eye), (2) low resolution, (3) number of people, (4) low lighting (normal) and (5) number of cameras. This paper proposes a vision-based system that is able to monitor aggressive activities of individuals in an enclosed cage environment using multiple cameras. This work focuses on analyzing the temporal features of aggressive movement, taking into consideration the limitations discussed previously. Experimental results show that the proposed system is easily realized and achieves impressive real-time performance, even on low-end computers.
Keywords: surveillance system; behavior monitoring; perspective correction; background subtraction; real-time video processing.
Attention-Based Argumentation Mining
by Derwin Suhartono, Aryo Pradipta Gema, Suhendro Winton, Theodorus David, Mohamad Ivan Fanany, Aniati Murni Arymurthy
Abstract: This paper is intended to make a breakthrough in the field of argumentation mining. Current trends in argumentation mining research use handcrafted features and traditional machine learning (e.g., Support Vector Machine). We worked on two tasks: identifying argument components and recognizing insufficiently supported arguments. We utilize a deep learning approach and implement an attention mechanism on top of it to obtain the best result. We also implement a Hierarchical Attention Network (HAN) for this task. HAN is a neural network that applies attention at two levels: word-level and sentence-level. Deep learning models with an attention mechanism can achieve better results than other deep learning methods. This paper also shows that on research tasks with hierarchically structured data, HAN performs remarkably well. We also present our results on using XGBoost instead of a regular non-ensemble classifier.
Keywords: argumentation mining; hand-crafted features; deep learning; attention mechanism; hierarchical attention network; word-level; sentence-level; XGBoost.
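As a rough illustration of the attention mechanism described in the abstract above, the following sketch scores each word vector against a context vector, turns the scores into softmax weights, and pools the words into one sentence vector. The function names, the dot-product scoring, and the toy numbers are assumptions made for illustration; they are not taken from the paper and do not reproduce the authors' model.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    # Score each vector by its dot product with the (hypothetical) context vector.
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    # Weighted sum of the input vectors, one output dimension at a time.
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

# Toy word vectors for one sentence (illustrative values only).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [1.0, 0.0]
sentence_vec, word_weights = attention_pool(words, context)
```

In a hierarchical setting, the same pooling step could be applied a second time to the resulting sentence vectors to obtain a document vector, which is the two-level idea the abstract attributes to HAN.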
Segmentation and Recognition of Characters on Tulu Palm Leaf Manuscripts
by Antony P.J., Savitha C.K.
Abstract: This paper proposes an efficient method for the segmentation and recognition of handwritten characters from Tulu palm leaf manuscript images. The proposed method uses an automated tool that combines thresholding and edge detection techniques to binarize the image. Projection profiles with connected component analysis are then used for line and character segmentation. A deep convolutional neural network (DCNN) model is used to extract features and recognize the segmented Tulu characters, with a recognition rate of 79.92%. The results are verified on a benchmark dataset, AMADI_LontarSet, to assess how well the model generalizes to handwritten character recognition tasks. The results show that our method outperforms existing state-of-the-art models.
Keywords: Handwritten Character Recognition; Palm Leaf; Segmentation; DCNN; Tulu.
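The projection-profile step mentioned in the abstract above can be sketched as follows: sum the ink pixels in each row of a binarized image, then treat each run of non-empty rows as one text line. This is a minimal sketch under stated assumptions (ink is 1, background is 0, and the `min_ink` threshold and the toy image are hypothetical); it covers line segmentation only and leaves out the connected component analysis the authors combine it with.

```python
def horizontal_profile(binary):
    # Number of ink pixels (value 1) in each image row.
    return [sum(row) for row in binary]

def segment_lines(binary, min_ink=1):
    # Return (start_row, end_row) spans, end exclusive, for each run of
    # consecutive rows whose ink count reaches min_ink.
    profile = horizontal_profile(binary)
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # a text line ends at a blank row
            start = None
    if start is not None:
        lines.append((start, len(profile)))  # line runs to the bottom edge
    return lines

# Toy 5x4 binarized "page" with two text lines separated by a blank row.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
line_spans = segment_lines(page)  # [(1, 3), (4, 5)]
```

Applying the same idea to column sums within each detected line gives a first cut at character boundaries, though touching or overlapping characters generally need a further step.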
Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media
by Tho Quan
Abstract: Sentiment analysis has recently emerged as one of the major Natural Language Processing (NLP) tasks in many applications. As social media channels (e.g., social networks or forums) have become significant sources for brands to observe users' opinions about their products, this task is increasingly crucial. However, real data obtained from social media contains a high volume of short and informal messages posted by users on those channels. This kind of data is difficult for existing approaches to handle, especially those using deep learning. In this paper, we propose an approach to handle this problem. This work extends our previous work, in which we proposed combining the typical deep learning technique of the Convolutional Neural Network (CNN) with domain knowledge. The combination is used to obtain additional training data through augmentation and a more reasonable loss function. In this work, we further improve our architecture with several substantial enhancements, including negation-based data augmentation, transfer learning for word embeddings, a combination of word-level and character-level embeddings, and a multi-task learning technique for incorporating domain knowledge rules into the learning process. Those enhancements, which specifically aim to handle short and informal messages, yield a significant improvement in performance in experiments on real datasets.
Keywords: Sentiment analysis; deep learning; domain knowledge; recurrent neural network; transfer learning; multi-task learning; data augmentation.