International Journal of Computational Vision and Robotics (22 papers in press)
Plant leaf disease detection using deep learning on mobile devices
by Shaheera A. Rashwan, Marwa K. Elteir
Abstract: Conventional plant disease detection by human experts is subjective, prone to human error, and requires specialised training. Computer vision algorithms powered by deep convolutional neural network (DCNN) models can improve plant leaf disease detection. In this paper, we investigate the practicability of deploying a DCNN-based solution on mobile/embedded devices in terms of accuracy and performance. We use MobileNetV2, one of the DCNN models commonly used with embedded devices, and AlexNet, a heavier DCNN model not designed for embedded devices, to assess the performance loss against the accuracy gain. Our results on the PlantVillage benchmark dataset show an accuracy of 96.54% for MobileNetV2 and 97.87% for AlexNet. For inference, the best performance is mostly achieved when the embedded GPU is utilised: on average, it takes 26.3 and 27.5 milliseconds for MobileNetV2 and AlexNet, respectively, on a professional-class mobile device, and 155.07 and 80.67 milliseconds, respectively, on an average-class mobile device. We conclude that the advanced computational power of current mobile devices enables heavyweight DCNN models to be deployed efficiently, achieving high accuracy without sacrificing performance.
Keywords: plant leaf disease detection; convolutional neural network; mobile devices; embedded GPUs; TensorFlow; MobileNetV2; AlexNet.
Face recognition with Raspberry Pi using deep neural networks
by Xhevahir Bajrami, Blendi Gashi
Abstract: Identifying a person from a photograph is difficult under varying conditions such as lighting, color and image quality. Older facial recognition methods are not applicable because they perform poorly on large amounts of data and under varying conditions. With deep neural networks, however, we can create systems that achieve high facial recognition efficiency from digital photography. New deep learning methods have made the accuracy of person identification from digital photographs very high. Recognising faces through software systems nevertheless remains a separate problem. Facial recognition systems enable a person to be identified from digital photographs and can handle large amounts of digital imaging. This paper presents an implementation of deep neural networks for face recognition from digital photography on the Raspberry Pi. In addition to the efficiency of this system in different areas, the implementation is cost-effective.
Keywords: face recognition; raspberry pi; deep neural networks; knn.
The using of deep neural networks and natural mechanisms of acoustic wave propagation for extinguishing flames
by Jacek Wilk-Jakubowski, Pawel Stawczyk, Stefan Ivanov, Stanko Stankov
Abstract: The article presents an innovative method of flame extinguishing with a high-power acoustic extinguisher equipped with a deep neural network (DNN) flame detection module. Experimental results are presented for flame detection using DNNs and for subsequent extinguishing using sinusoidal waves modulated by a triangular waveform, as well as triangular waves without modulation. The article provides a justification for the approach taken, as well as information on the parameters of the signals used and the hardware components. The results are discussed taking into account the power supplied to the loudspeaker and the influence of sound pressure on flame extinguishing as a function of distance from the extinguisher output. The article concludes with a short summary indicating the benefits and potential applications of the technology.
Keywords: acoustic extinguisher; acoustic testing; acoustic waves fire suppression; amplitude modulation; deep neural networks; DNN; extinguishing effect; fire detection; firefighting; fire retardation; non-invasive extinguishing of the flames; TensorFlow; wave modulation.
An anti-phishing model based on similarity measurement
by Parvinder Singh, Bhawna Sharma, Jasvinder Kaur
Abstract: Phishing poses an increasingly serious threat to users. In the current work, the authors develop an anti-phishing technique based on a hybrid similarity approach combining cosine and soft cosine similarity, which measures the resemblance between a user query and a database. The proposed similarity hybrid is also evaluated against another hybrid comprising cosine and Jaccard similarity measures in order to validate the proposed work. Both hybrid similarities are separately fed to the validation layer of a feed-forward back-propagation neural network (FFBPNN) to predict phishing and legitimate websites. The performance of the proposed work is evaluated on a data set of 3,000 sample files in terms of positive predictive value (PPV), true positive rate (TPR) and F-measure. The comparative analysis shows that the anti-phishing model using the proposed similarity hybrid outperforms the cosine and Jaccard similarity hybrid, with 0.233%, 0.2833% and 0.258% higher PPV, TPR and F-measure, respectively.
Keywords: phishing; cosine similarity; soft-cosine similarity; similarity index; FFBPNN.
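The two baseline measures named in this abstract, cosine and Jaccard similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace-token representation and the sample strings are assumptions made for illustration, and the abstract does not say how the two measures are combined into a hybrid, so no combination is shown.

```python
import math
from collections import Counter


def cosine_similarity(a_tokens, b_tokens):
    """Cosine similarity between two token lists, using term counts as vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    # Dot product over the terms the two documents share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def jaccard_similarity(a_tokens, b_tokens):
    """Jaccard similarity: shared distinct tokens over all distinct tokens."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


# Hypothetical query and database entry, tokenised on whitespace.
query = "secure login bank account".split()
entry = "bank account login page".split()
print(cosine_similarity(query, entry), jaccard_similarity(query, entry))
```

Cosine similarity weighs repeated terms through their counts, whereas Jaccard similarity only considers whether a term is present, which is one reason the two can rank the same pair of documents differently.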
Performance Evaluation of Shannon and Non-Shannon Fuzzy 2-Partition Entropies for Image Segmentation using Teaching-Learning-Based Optimization
by Baljit Singh Khehra, Arjan Singh, Gurdeep Singh Hura
Abstract: Image segmentation is the most significant pre-processing phase of computer vision. Thresholding is one of the most suitable approaches for image segmentation and has been used extensively by researchers owing to its accuracy and precision. Fuzzy 2-partition entropy with various evolutionary algorithms has been used widely to determine the optimal threshold value for image segmentation. Teaching-Learning-Based Optimization (TLBO), which is also an evolutionary optimization algorithm, has been used to maximize the objective function based on the fuzzy 2-partition entropy and thereby find the optimal threshold value. Fuzzy 2-partition Shannon entropy is generally applied for thresholding. In this paper, fuzzy 2-partition non-Shannon measures of entropy, i.e. Havrda-Charvat fuzzy 2-partition entropy and Renyi fuzzy 2-partition entropy, using TLBO are proposed for selecting the optimal threshold value. The performance of fuzzy 2-partition Shannon and non-Shannon measures of entropy using TLBO is compared with other nature-based evolutionary algorithms, namely Genetic Algorithm (GA) and Biogeography-based Optimization (BBO), and with a recursive approach, which is non-evolutionary. Standard test images from benchmark datasets are used for the experiments. The experimental results are evaluated both qualitatively and quantitatively. The results show that TLBO-based Havrda-Charvat fuzzy 2-partition entropy performs better than all other approaches in terms of the quality of the segmented image and requires less computational time.
Keywords: Shannon entropy; Havrda-Charvat entropy; Renyi entropy; Kapur entropy; TLBO; GA; BBO.
Comparative analysis of wavelet-based copyright protection techniques
by Jasvinder Kaur, Parvinder Singh
Abstract: Copyright protection of digital content is the need of the hour. Wavelet-based watermarking of digital images is becoming popular. We have implemented and compared techniques based on different wavelet families for copyright protection. We have used Haar, Daubechies, biorthogonal and reverse biorthogonal wavelets to calculate the MSE and PSNR values for a sample cover image and watermark images with constant values of the approximation coefficient and intensity gain factor. We also propose choosing a local threshold value for the approximation coefficient of the cover image in the insertion algorithm. The local threshold value of the cover image can improve the PSNR value significantly.
Keywords: wavelets; information hiding; watermarking; copyright protection.
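The MSE and PSNR metrics used in this comparison follow standard definitions, sketched below for 8-bit greyscale images stored as nested lists. The function names and the tiny sample images are assumptions for illustration; the sketch does not perform any wavelet embedding and is not the authors' code.

```python
import math


def mse(cover, marked):
    """Mean squared error between two equally sized greyscale images."""
    total = 0.0
    count = 0
    for row_c, row_m in zip(cover, marked):
        for c, m in zip(row_c, row_m):
            total += (c - m) ** 2
            count += 1
    return total / count


def psnr(cover, marked, max_value=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    err = mse(cover, marked)
    if err == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / err)


# Hypothetical 2x2 cover image and its watermarked version (one pixel changed by 2).
cover = [[10, 20], [30, 40]]
marked = [[10, 20], [30, 42]]
print(mse(cover, marked), psnr(cover, marked))
```

A higher PSNR means the watermarked image is closer to the cover image, which is why the abstract treats an increase in PSNR as an improvement.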
A secure identity and access management system for decentralising user data using blockchain
by Tripti Rathee, Parvinder Singh
Abstract: The arrival of blockchain technology has revolutionised the field of cybersecurity. Since almost every interaction on the internet involves some digital identity, the need for stronger ways to protect digital identity has grown. In this paper, a blockchain-based identity and access management system, MedSecureChain, is implemented for a medical ecosystem. An OAuth-based authentication mechanism is used to provide delegated access, so as to protect user data and provide control over it. Further, a document verification system using the InterPlanetary File System (IPFS) and blockchain technology is proposed. IPFS is used to store the users' data in a decentralised manner, thus reducing the size of the data. The proposed system provides security and privacy for the identity of the user by using smart contracts. The use of blockchain helps in decentralising the system, thus eliminating the control of a single authority over the data.
Keywords: access control; security; blockchain; distributed ledger; identity management.
High-power acoustic fire extinguisher with artificial intelligence platform
by Jacek Wilk-Jakubowski, Pawel Stawczyk, Stefan Ivanov, Stanko Stankov
Abstract: Nowadays, innovative flame extinguishing techniques are still being sought that do not threaten the environment and do not damage the extinguished elements. One of them seems to be the acoustic method. Research is being conducted in Europe, America, and Asia into the possibility of using acoustic waves to extinguish flames. This article presents the original measurement results showing the extinguishing possibilities with a sinusoidal waveform and sinusoidal waveform modulated with a square waveform for the three analysed frequencies (15 Hz, 17 Hz and 20 Hz). A high-powered acoustic extinguisher was applied to extinguish the flames. Such a fire extinguisher may be equipped with an artificial intelligence module. Deep neural networks (DNN) were used for flame detection. Results from the training networks were described, as well as the hardware architecture employed. Such an intelligent module, which allows an acoustic extinguisher to be automatically activated when flames are detected, is particularly useful when traditional sensors are not available.
Keywords: acoustic extinguisher; acoustic wave fire suppression; deep neural network; DNN; fire detection; fire retardation; intelligent sensor.
Isolated spoken word recognition using packed-MFCC on padded-voice signal for unscripted languages
by Rajdev Tiwari, Vidha Sharma, Ramesh Chandra Sahoo
Abstract: Voice-based applications like Alexa, Siri and Google Assistant have become very common these days. These voice-operated devices are based on scripted languages, which have their own sets of alphabets, phonemes and grammar, whereas languages of oral tradition do not have all of these. Because of the fundamental difference between scripted and unscripted languages, techniques used for languages like English are found unfit for languages of oral tradition. In this paper, an isolated spoken word recognition system for unscripted languages is modelled. The model uses a packed-Mel frequency cepstral coefficients (MFCC) feature over padded voice with a support vector machine (SVM) as the classifier. The model is tested and compared against various other statistical features with different classifiers such as K-nearest neighbour (KNN) and stochastic gradient descent (SGD). SVM is found to be best in terms of recognition accuracy on a data set of 8,900 samples of the Kurukh language, spoken by the Oraon community.
Keywords: speech recognition; language translation; support vector machine; SVM; K-nearest neighbour; KNN; stochastic gradient descent; SGD; packed-MFCC; isolated word recognition; word error rate; WER; oral tradition languages; Oraon.
Retinal vessel segmentation using a strip wise classification approach with grid search-based parameter selection
by Mahua Nandy Pal, Minakshi Banerjee
Abstract: Blood vessel characteristics of retinal images can be utilised for early detection of diseases like diabetes, hypertension, glaucoma, etc. In case of abnormal retinal symptoms like neovascularisation and aneurysm, accurate extraction of local vessel is very significant. Challenges present in automatic vessel detection are varying vessel width, presence of optic disc, neovascularisation, exudates, aneurysm, haemorrhage and low contrast. This paper proposes an automated segmentation of retinal vasculature using Gabor filter bank, optimised on the basis of grid search over the whole parameter space, and a new strip wise classification approach. Tophat features and ridge information based on eigen values of Hessian matrix are also considered along with optimised Gabor features to capture vessels more precisely. Accuracy of about 95% in all the cases proves the efficiency of strip wise classification. Discriminative power of features increases when different sets of features are considered together.
Keywords: Gabor filter; Tophat transformation; ridge enhancement; strip based classification.
Intelligent Fuzzy Logic based Sliding Mode Control Methodologies for Pick & Drop Operation of Robotic Manipulator
by Mohd Salim Qureshi, Pushpendra Singh, Pankaj Swarnkar
Abstract: The work shows a vigorous adaptive control methodology in tracking control of robot manipulators based on amalgamation of fuzzy control with ostensible sliding mode control (SMC). For robotics, the impetus of adopting SMC relies on its substantial features. Nonetheless, demerits of classical SMC, such as the chattering effect and the need for prior knowledge of uncertainty bounds, can be extremely caustic. This article proposes different robust adaptive control techniques. Firstly, an Adaptive Fuzzy PI Sliding Mode Control (AF-PI-SMC) is proposed, where the fuzzy controller is the major tracking controller and the difference between the ideal computational controller and the fuzzy controller is compensated by the compensation controller. The uncertainty bound of the compensation controller is examined by an estimation mechanism. Secondly, an Auto-Tuned Adaptive Fuzzy Sliding Mode Controller (AT-AFSMC) is proposed, where the control gain is considered as an individual vector and is adjusted by an adaptive SISO fuzzy system. Here, the control gain is tuned online, which makes the controller adaptive. Mathematical analysis shows that the controllers, in tracking a robot manipulator in the presence of uncertainties, have global asymptotic stability in the Lyapunov sense. Finally, the proposed controllers are tested on a 2-Degree of Freedom (DOF) robot manipulator with the real-time digital simulator Opal-RT (OP-4500). The experimental results show the superiority of the proposed control techniques in the presence of structured and unstructured uncertainties.
Keywords: Robotic manipulator; Sliding mode control; Fuzzy sliding mode control; Pick & drop operation.
New descriptors combination for 3D mesh correspondence and retrieval
by Roaa Soloh, Abdallah El Chakik, Hassan Alabboud, Ahmad Shahin, Adnan Yassine
Abstract: 3D models, mostly represented by meshes or point clouds, are widely used nowadays and appear in many fields such as computer vision, informatics, engineering and medicine. This paper considers the problem of shape matching and retrieval between 3D models, where the target is to find a superior one-to-one correspondence between them. To do so, we detect feature points using the well-known 3D Harris detector and then propose a combination of local shape descriptors that forms a compact feature vector for the extracted keypoints, consisting of Gaussian curvature, curvature index and shape index. Lastly, we model the matching problem as a combinatorial problem solved using a brute-force approach and the Hungarian algorithm, and compare their efficiency. Our proposed combination of descriptors shows good performance and compromise numerical values, specifically using the Hungarian algorithm, whose results support our proposed approach. Moreover, cosine similarity between the features of each pair in the database underlies the retrieval system, which gives accurate retrieval for several models and acceptable percentages for others.
Keywords: 3D meshes; feature detection and extraction; matching problems; shape retrieval; Hungarian algorithm; brute-force algorithm.
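The brute-force side of the matching step described in this abstract can be sketched as an exhaustive search over one-to-one assignments. The three-component descriptor vectors (standing in for Gaussian curvature, curvature index and shape index), the function names and the use of cosine similarity as the matching score are assumptions for illustration, not the authors' implementation.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def brute_force_match(feats_a, feats_b):
    """Try every one-to-one assignment and keep the one with the highest total similarity.

    Returns (assignment, score), where assignment[i] is the index in feats_b
    matched to keypoint i of feats_a. The cost is O(n!), so this is only
    feasible for a handful of keypoints.
    """
    n = len(feats_a)
    if n != len(feats_b):
        raise ValueError("both models need the same number of keypoints")
    sim = [[cosine(a, b) for b in feats_b] for a in feats_a]
    best_perm, best_score = None, -math.inf
    for perm in permutations(range(n)):
        score = sum(sim[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score


# Hypothetical descriptors for three keypoints on each of two models.
model_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model_b = [[0.0, 0.0, 2.0], [3.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(brute_force_match(model_a, model_b))
```

The same assignment problem can be solved in polynomial time, which is the motivation for comparing against the Hungarian algorithm; in Python, `scipy.optimize.linear_sum_assignment` is one readily available solver for it.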
Quantum neural network application for exudate affected retinal image patch identification
by Mahua Nandy Pal, Minakshi Banerjee, Ankit Sarkar
Abstract: In the field of retinal disease identification, deep neural networks are extensively used, but the efficiency of quantum neural networks in this field has not yet been explored. Quantum neural networks have recently attracted the attention of researchers, as it remains to be explored whether quantum networks have any scope in the field in terms of resource utilisation and decision-making during network learning. In this paper, the efficiency of a simple quantum network model is tested experimentally. At present, quantum-classical models are unable to handle more than a few qubits. Experimentally, it is found that the quantum neural network is quite efficient in representing the features of exudate-affected retinal image patches. The accuracy of the quantum neural net model is 84.28%. The accuracies of comparable deep neural net and convolutional neural net models are 51.80% and 88%, respectively.
Keywords: quantum neural network model; deep neural network model; deep convolutional neural network model; retinal fundus image; exudates; classification; TensorFlow quantum; TFQ; parameterised quantum circuit; PQC.
A deep hybrid model for advertisements detection in broadcast TV and radio content
by Abdesalam Amrane, Abdelkrim Meziane, Abdelmounaam Rezgui, Abdelhamid Lebal
Abstract: Media monitoring is essential for measuring the influence of companies over their consumers. It consists of building, reporting, and providing a full view of media sources in near real-time, allowing the data to be synthesised. Advertisement detection and classification in electronic media (TV and radio) is an essential part of a media monitoring system and is very useful for companies that work in a competitive environment. Advertisement detection entails many difficulties including unbalanced data, misclassification caused by outliers, and variation in loudness levels between TV/radio channels. To overcome these challenges, we propose a deep hybrid model for advertisement detection (DHM-ADS). We conduct several experiments by combining different methods: deep neural network models (ANN, CNN, and RNN) with dynamic time warping and multi-level deep neural networks such as autoencoders. The evaluation shows that the ANN classifier combined with an autoencoder gives the best result for advertisement detection in TV/radio broadcast even compared to the conventional framework 'DejaVu'.
Keywords: advertisement detection; media monitoring; audio outliers removal; deep learning; autoencoder.
Kidney Image Classification Using Transfer Learning with Convolutional Neural Network
by Priyanka, Dharmender Kumar
Abstract: Ultrasound imaging is one of the most widely used diagnostic methods for abdominal studies. Several chronic kidney diseases (CKDs), such as kidney stones, cystic kidney and hydronephrosis, affect the human kidney. These CKDs can later lead to a number of severe diseases, particularly heart diseases, pulmonary attacks, cardiomyopathy, etc. Early detection of CKDs is therefore highly desirable in clinical practice, as it can save hundreds of lives. Nowadays, the main focus of researchers is to develop automatic disease detection methods that avoid the need for human interaction. Deep learning models are playing a critical role in various healthcare applications, not only because of their fast and accurate results but also because they require minimal manual intervention. In this paper, two approaches are proposed for the detection of CKDs in ultrasound kidney images. The first is a conventional approach that uses a GA Optimized Neural Network (GAONN) as the classifier, whereas the other uses a convolutional neural network (CNN) model, AlexNet, for automatic detection of diseases. AlexNet is trained using transfer learning. Experimental results show that the CNN performs better than the GA optimized neural network in classifying kidney images.
Keywords: Convolution Neural Network (CNN); GA optimized Neural Network; transfer learning; Accuracy; Principal Component Analysis; Grey Level Co-Occurrence Matrix.
Deep learning solution for machine vision problem of vehicle body damage classification
by Aaron Rasheed Rababaah
Abstract: The automation of vehicle damage classification into classes of interest has benefits over manual solutions, such as efficiency, accuracy, reliability and repeatability. Industries such as automotive dealerships, car rentals and car insurance are among those most likely to be interested in such a solution. In this paper, we present a machine vision and deep learning-based method for vehicle damage classification based on convolutional neural network (CNN) models. For training and validation, we used a publicly available dataset along with our own to increase the input data, as CNN models require significantly more data than classical machine learning models. Our best performing model demonstrated a classification accuracy of 98.7%. As future work, we intend to consider a wider range of damage classes and significantly extend the current dataset to further validate the current solution.
Keywords: vehicle damage classification; image processing; machine vision; deep learning; convolutional neural networks.
A novel restricted Boltzmann machine-based temporal-spatial correlation method for student behaviour recognition in depth video
by Fan Zhang
Abstract: Human behaviour recognition is an important research hotspot in the field of artificial intelligence. Current behaviour recognition methods have low recognition accuracy under different viewing angles. This paper therefore proposes a novel restricted Boltzmann machine (RBM)-based temporal-spatial correlation method for student behaviour recognition in depth video. The RBM is used to map human behaviour from different viewing angles to a high-dimensional space. A time-level pooling function is applied to the time series activated by each neuron to encode the video time sub-series. Finally, behaviour recognition and classification experiments are conducted on different public datasets and real classroom student behaviour datasets, and the results are compared with other methods. The results show that the proposed method improves the accuracy of depth video recognition under different viewing angles and has good generalisation performance. The data analysis of abnormal behaviour in class can play an auxiliary role in dynamic classroom management.
Keywords: restricted Boltzmann machine; RBM; student behaviour recognition; temporal-spatial correlation; Fourier time pyramid algorithm.
Plant leaf disease classification using deep neural network
by N. Kasthuri, T. Meera Devi, Arivazhagan T. Shangar, R. Yashwin, J.S. Shabhareesh
Abstract: Agriculture is the backbone of the Indian economy. Most of the people living in rural areas depend on agriculture for their livelihood. Nevertheless, farmers are facing many difficulties in crop production due to climate change. In addition, diseases in plants affect crop production drastically. Presently, farmers identify plant diseases by visual inspection, which requires an expert's help and is a time-consuming task. Hence, in this paper, deep learning networks are used to identify different types of diseases in the leaves of different plants. The model is trained with 45,562 images and validated with 8,049 images belonging to 17 categories of diseases. It is fine-tuned with hyperparameters such as learning rate, epochs, batch size and input image size, and then tested with 9,469 images, yielding a total classification accuracy of 96.8%.
Keywords: multiclass classification; convolutional neural network; CNN model; model parameters; activation function; support vector machine; SVM classifier; transfer learning; plant leaf diseases; hyperparameters; performance metric.
Convolutional neural networks for obstacle detection on the road and driving assistance
by Ramzi Mosbah, Larbi Guezouli
Abstract: Generally, a driver has moments of inattention that can cause considerable damage. To deal with this issue, we have to detect obstacles on the road automatically. Doing so raises several challenges. Firstly, we have to locate the region of interest, which is the road part in the frame; then we have to detect objects inside the region of interest. In this work, we propose an improved driver assistance system using a camera on the front of the car. Images acquired from this camera feed our system. In the frames to be processed, we reduce the region of interest to the area of the road. Obstacles on the road are sought in this region of interest. At the same time, we take care of the driver by detecting whether he is drowsy. Experimental results were evaluated using the KITTI Vision Benchmark Suite and short videos recorded on streets in Batna.
Keywords: obstacle detection; image edge detection; driving assistance; object recognition; convolutional neural networks.
Design of a solar-powered mobile manipulator using fuzzy logic controller of agriculture application
by Fradina Septiarini, Tresna Dewi, Rusdianasari
Abstract: This paper shows the feasibility of applying a mobile manipulator powered by solar energy as a harvesting robot in agriculture. The design starts with the robot's mechanics and the mobile manipulator control. The motion of the mobile base and the robot arm manipulator is controlled using a fuzzy logic controller (FLC), whose inputs are based on target detection using image segmentation. The FLC design is also intended to predict the robot's charging source based on the light sensor attached to the charging system. The robot can take power directly from solar rays on a sunny day or from the battery on a cloudy day. The robot motion is simulated using MobotSim to show how the robot moves from one spot to another, harvesting the agricultural product. The simulations conducted in this study show that the solar-powered mobile manipulator can be applied in agriculture as a harvesting robot.
Keywords: agriculture; fuzzy logic; mobile manipulator; robot vision; solar energy.
Special Issue on: The Role of Computer Vision for Smart Cities
Image enhancement based on skin-colour segmentation and smoothness
by Haitao Sang, Bo Chen, Shifeng Chen, Li Yan
Abstract: Image restoration tasks such as image denoising, super-resolution and image deblurring have a wide range of applications and have become a research hotspot in academia and business circles. A novel image enhancement algorithm based on skin texture preservation is proposed in this paper. The mask is obtained using Gaussian fitting, and a box blur can be applied to it multiple times for skin feathering. The denoised, smoothed image is fused with the original image mask to preserve the hair details of the original image and enhance the edge details of the contour, so as to provide more effective information for the extraction of edge features. Compared with other image smoothing algorithms, this algorithm is more effective in smoothing the skin edge contour and achieving better detection of images. Experimental results show that the proposed algorithm has strong adaptive capacity and a significant effect on the detection of most images. Specifically, it can moderately smooth the edges of areas with many details, leaving no traces of an artificial process. The proposed image enhancement algorithm has a wide range of practical applications.
Keywords: image enhancement; image restoration; image generation and synthesis; texture preserving smoother; skin-colour model.
Supervised learning software model for the diagnosis of diabetic retinopathy
by M. Padmapriya, S. Pasupathy
Abstract: Diabetic retinopathy (DR) is the leading cause of eye disease and vision loss among people with diabetes. Diabetic patients often suffer from DR owing to damage to the retinal blood vessels, so retinal blood vessel segmentation plays a crucial role in the diagnosis of DR. Vision loss or blindness can be prevented if the diagnosis happens during the early stages. Early diagnosis and initial investigation would help lower the risk of vision loss by 50%. This article exploits a supervised classification approach to detect blood vessels by applying features such as grey level and invariant moments. Image pre-processing and blood vessel segmentation are the two essential steps used in this study, along with the proposed classification framework using neural network models. Two publicly available retinal image datasets, DRIVE and STARE, are used to assess the proposed supervised classification framework. The supervised classification methodology suggested in this study attains an average retinal blood vessel segmentation accuracy of 93.94% on the DRIVE dataset and 95.00% on the STARE dataset.
Keywords: diabetic retinopathy; fundus imaging; grey level features; invariant moments; vessel segmentation.