Forthcoming and Online First Articles

International Journal of Computational Vision and Robotics (IJCVR)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Open Access: Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

International Journal of Computational Vision and Robotics (95 papers in press)

Regular Issues

  • Machine learning approaches for early detection and management of musculoskeletal conditions
    by Pawan Whig, Ebtesam Shadadi, Shama Kouser, Lathifah Alamer 
    Abstract: Musculoskeletal conditions have a significant impact on quality of life. This study explores the use of machine learning algorithms for early detection and management of such conditions. Different models were evaluated using a dataset of musculoskeletal images and clinical information. Results demonstrate accurate classification with high sensitivity and specificity. A neural network was developed for detecting chronic lower back pain, achieving an impressive validation F1 score of 89%-93%. This highlights the potential of artificial intelligence in improving early detection and management. Future research should address data outliers to enhance model performance. Overall, neural networks are a valuable tool for early detection and management of musculoskeletal conditions, leading to improved patient outcomes. These findings suggest promising avenues for future research and implications for early detection and management in this field.
    Keywords: musculoskeletal conditions; arthritis; fractures; spinal problems; machine learning; early detection.
    DOI: 10.1504/IJCVR.2023.10057385
     
  • Localisation and classification of surgical instruments in laparoscopy videos using deep learning techniques
    by Avanti Bhandarkar, Priyanka Verma 
    Abstract: Surgical trainees often use laparoscopic surgery videos to understand the appropriate use of instruments and visualise the surgical workflow better, but these videos may be difficult to interpret without proper annotations. In recent times, neural networks have emerged as an accurate and effective solution for instrument detection and classification in surgical video frames, which can subsequently be used to automate the annotation process. The proposed implementation uses faster-RCNNs and bidirectional LSTMs with (and without) time-distributed layers and attempts to solve some of the problems commonly faced while developing deep learning models for surgical image and video data: severe class imbalance, inaccuracies during multi-label classification and a lack of spatiotemporal context from adjacent video frames. The bidirectional LSTM with time-distributed layers achieved an average accuracy of 80.20% and an average F1 score of 0.7176 on the M2CAI16 tool dataset, while also achieving 63.49% average accuracy and an average F1 score of 0.522 on unseen data. Jaccard distance and Hamming distance have also been used as object detection-specific metrics; the same model registered the lowest values for both distances, implying accurate localisation and identification of surgical instruments.
    Keywords: deep learning; surgical instrument detection; surgical instrument classification; surgical instrument localisation; data augmentation; transfer learning; faster-RCNN; region-based convolutional neural networks; bidirectional LSTMs; long short-term memory networks; Jaccard distance; Hamming distance.
    DOI: 10.1504/IJCVR.2023.10057447
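
The two distance metrics named in the abstract above can be illustrated with a minimal sketch. The function names and the example instrument labels are hypothetical, and the Hamming distance is normalised by the number of classes here, which is an assumption; the abstract does not state the exact definitions the authors used.

```python
def jaccard_distance(true_labels, predicted_labels):
    """1 - |intersection| / |union| of two label sets (0.0 means identical)."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    union = true_set | pred_set
    if not union:
        # Both sets empty: treat as a perfect match.
        return 0.0
    return 1.0 - len(true_set & pred_set) / len(union)


def hamming_distance(true_vector, predicted_vector):
    """Fraction of positions where two equal-length binary label vectors differ."""
    if len(true_vector) != len(predicted_vector):
        raise ValueError("label vectors must have the same length")
    mismatches = sum(t != p for t, p in zip(true_vector, predicted_vector))
    return mismatches / len(true_vector)


# Hypothetical frame: two instruments present, one of them predicted correctly.
print(jaccard_distance({"grasper", "hook"}, {"grasper", "scissors"}))
print(hamming_distance([1, 1, 0, 0], [1, 0, 1, 0]))
```

Lower values of both distances indicate closer agreement between the predicted and annotated instrument sets for a frame.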
     
  • Holistic knuckle recognition through adept texture representation
    by Neeru Bala, Anil Kumar, Rashmi Gupta, Ritesh Vyas 
    Abstract: In recent years, authentication of individuals through their finger knuckle patterns has become a highly active area of research. Finger knuckle patterns are the distinctive creases on the posterior surface of the hand, which is more convenient to use than other hand-related modalities such as fingerprint and palmprint because the posterior surface of the hand is less abraded than the interior. This work presents an effective knuckle-based recognition framework that fuses the base, minor and major finger knuckle patterns of an individual's fingers for improved recognition. To this end, all the finger knuckle patterns are segmented and features are extracted using an efficient feature descriptor named curvature Gabor filter (CGF). To validate the proposed methodology, rigorous experiments were performed on a large, publicly accessible hand dorsal database, the PolyU-Hand Dorsal (HD) dataset. Knuckles are integrated in three different ways to investigate the effect of their fusion: fusion over knuckle, fusion over finger and fusion over hand. All three strategies performed better than the individual knuckle recognition framework, and fusion over hand performed best, with the lowest EER of 0.2009.
    Keywords: information security; multimodal biometrics; information fusion; knuckle recognition; score level fusion.
    DOI: 10.1504/IJCVR.2023.10057530
     
  • A multi-modal image encoding and self-attention-based transformer framework with sentiment analysis for financial time series prediction
    by Ravi Prakash Varshney, Dilip Kumar Sharma 
    Abstract: In this paper, we propose a novel approach for financial time series forecasting using feature selection, image encoding, and a self-attention-based CNN transformer. We use Markov transition field and candlestick chart encoding to extract features from historical stock data. Additionally, we incorporate sentiment analysis of financial news data in our model to improve the forecast accuracy. The proposed approach is compared to traditional time series forecasting methods, and the results show that our method outperforms the traditional methods in terms of forecasting accuracy. The proposed approach can be used to improve risk management and make more informed trading decisions. Our experiments demonstrate that the proposed framework achieved an improvement of approximately 17.8% in root mean squared error and 38.7% in mean absolute error for the securities lending dataset, and an improvement of approximately 71.5% in root mean squared error and 83.2% in mean absolute error for the pricing dataset.
    Keywords: candlestick image encoding; computer vision; convolutional neural network transformer; feature selection; Markov transition field image encoding; multivariate time series; MTS; pattern recognition; long-short term memory; LSTM; sentimental analysis.
    DOI: 10.1504/IJCVR.2023.10057531
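
As an illustration of the Markov transition field encoding mentioned in the abstract above, the following is a minimal sketch of the general technique, not the authors' implementation. The function name, the number of bins and the simplified quantile binning are assumptions made for the example.

```python
def markov_transition_field(series, n_bins=4):
    """Encode a 1D series as an n x n image of bin-to-bin transition probabilities."""
    n = len(series)
    ordered = sorted(series)
    # Simplified quantile cut points taken directly from the sorted values (assumption).
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    # Bin index of each sample: number of cut points it reaches or exceeds.
    bins = [sum(value >= edge for edge in edges) for value in series]

    # Count transitions between the bins of consecutive samples.
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for current_bin, next_bin in zip(bins[:-1], bins[1:]):
        counts[current_bin][next_bin] += 1.0

    # Row-normalise the counts into a first-order Markov transition matrix.
    transition = []
    for row in counts:
        total = sum(row)
        transition.append([c / total if total > 0 else 0.0 for c in row])

    # Pixel (i, j) holds the transition probability from the bin of sample i to the bin of sample j.
    return [[transition[bi][bj] for bj in bins] for bi in bins]


field = markov_transition_field([1, 2, 3, 4, 1, 2, 3, 4], n_bins=2)
print(len(field), len(field[0]))
```

The resulting square matrix can be treated as a single-channel image and passed to a convolutional model; how the paper combines it with candlestick encodings is not specified in the abstract.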
     
  • Improved classification of histopathological images with feature fusion of Thepade SBTC and Sauvola thresholding using machine learning
    by Sudeep D. Thepade, Mangesh S. Dudhgaonkar Patil 
    Abstract: Histopathological images play a significant role in selecting effective therapeutics and identifying disorders like cancer. Digital histopathology is a crucial advancement in contemporary medicine. The growth and spread of cancer cells within the body can be significantly controlled or stopped with early identification and therapy. Many machine learning (ML) algorithms are used to study the images in the dataset. Feature extraction is done using Sauvola thresholding and Thepade sorted block truncation code (TSBTC). This paper presents a fusion of the features computed using the TSBTC and Sauvola thresholding methods for improved classification of histopathological images. The experimental validation is done using 960 images from the KIMIA PATH 960 dataset with performance metrics such as sensitivity, specificity, and accuracy. The best performance is obtained by fusing TSBTC 9-ary and Sauvola thresholding features with a logistic model tree (LMT) classifier, which reaches 97.6% accuracy under ten-fold cross-validation.
    Keywords: classification; binarisation; histopathological; feature fusion; ensembles; KIMIA_PATH_960; Thepade SBTC; classifiers.
    DOI: 10.1504/IJCVR.2023.10058302
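
Sauvola thresholding, named in the abstract above, computes a local threshold T = m * (1 + k * (s / R - 1)) from the mean m and standard deviation s of a pixel's neighbourhood. The sketch below shows that formula for a single window; the function names and the values k = 0.2 and R = 128 are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def sauvola_threshold(window, k=0.2, dynamic_range=128.0):
    """Sauvola local threshold for a flat list of neighbourhood pixel values."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return mean * (1.0 + k * (std / dynamic_range - 1.0))


def binarise_pixel(value, window, k=0.2, dynamic_range=128.0):
    """Return 1 if the pixel is brighter than its local threshold, else 0."""
    return 1 if value > sauvola_threshold(window, k, dynamic_range) else 0


# A flat 3x3 neighbourhood has zero deviation, so the threshold is mean * (1 - k).
flat_window = [100] * 9
print(sauvola_threshold(flat_window))
print(binarise_pixel(90, flat_window), binarise_pixel(70, flat_window))
```

In low-contrast regions the threshold drops below the local mean, which is what makes the method tolerant of uneven staining or illumination.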
     
  • A primitive analysis of resonance frequency and stability simulation of a 2D SCARA drawing robot system for BCIs
    by Ellis Iver David, James Edward Rowe, Yeon-Mo Yang 
    Abstract: In recent years, selective compliance assembly robot arm (SCARA) manipulators related to brain-computer interfaces (BCIs) have been gaining in popularity in industrial applications owing to their significant adaptability. One popular application concerns commercially available drawing robots. For example, the tip ring sleeve drawbot by Hart and Ragan uses an audio output; WAV files with pulse width modulation are used to control the servomotors. After constructing a drawing robot prototype and analysing the impulses and responses, structural flaws were noticed in this particular design from the perspective of stability, limiting the quality of the final drawing. Indeed, the robot was designed to follow single-line paths, resulting in very sudden movements (e.g., stop-start motions). This caused vibrations in the arm that were more noticeable at high speeds. To counter or mitigate the shaking of the robot arm, in this study, a kinematic model and stability simulation for a two-dimensional (2D) SCARA drawing robot arm were constructed with the aim of improving the overall stability. The eventual aim was to find a model for describing the motions of all two-degree-of-freedom (DOF) rotational arm robots to allow for quick access or derivation of the optimal functional parameters of such robots.
    Keywords: brain-computer interface; BCI; SCARA; drawbot synthesiser; stipple gen; travelling salesman problem; TRS; statistical signal processing; stability; transfer function; impulse response; IR; step response; SR.
    DOI: 10.1504/IJCVR.2023.10058433
     
  • An efficient deep convolutional neural network-based safety monitoring system for construction sites
    by V. Ashwanth, Dhanya Sudarsan 
    Abstract: Worker safety and health are paramount concerns, especially in high-risk occupations such as construction work. Monitoring workers to ensure proper usage of personal protective equipment (PPE) at construction sites is essential. However, manual surveillance via CCTV footage is time-consuming. This paper proposes an automated approach for construction site monitoring without human intervention. Initially, YOLOv4 is employed for construction worker detection, with subsequent division of the bounding boxes into four sections. EfficientNet is then utilised to analyse these cropped sections and identify specific PPE components. Additionally, construction tools and equipment are recognised, and a safety score is assigned based on worker proximity to these objects. Unsafe workers are flagged as being in a danger zone in each frame, alongside the marking of workers. This approach streamlines safety monitoring processes while ensuring worker well-being.
    Keywords: computer vision; YOLOv4; construction safety; EfficientNet-B5; transfer learning; PPE; custom labelling; safety detection; object detection; image classification; machine learning.
    DOI: 10.1504/IJCVR.2023.10058756
     
  • An approach for speaker diarisation using whale-anti coronavirus optimisation integrated deep fuzzy clustering
    by K. Vijay Kumar, Ramisetty Rajeswara Rao 
    Abstract: In this paper, an anti-corona whale optimisation algorithm (ACWOA) is developed for speaker diarisation and is then used to train the deep fuzzy clustering (DFC) algorithm for final clustering. To extract relevant characteristics, such as Mel frequency cepstral coefficients (MFCCs), line spectral frequencies (LSF), and line prediction cepstral coefficients (LPCCs), the input audios are fed into a feature extraction procedure. Music and silence removal are used in the speech activity detection (SAD). After identifying speech activities, the speakers are segmented using a Bayesian inference criterion (BIC) score. The ACWOA-based DFC outperformed other methods with the best testing accuracy of 0.891 and the lowest diarisation error, false discovery rate (FDR), false negative rate (FNR) and false positive rate (FPR) of 0.618, 0.289, 0.148, and 0.130, respectively. The proposed approach outperforms the existing approaches active learning, DE+K-means, LSTM, MCGAN, and ANN-ABC-LA in terms of testing accuracy for test case 1 by 9.31%, 7.40%, 6.73%, 5.49%, and 3.59%, respectively.
    Keywords: speaker diarisation; deep fuzzy clustering; DFC; Bayesian inference criterion; BIC; speech activity detection; SAD; speaker segmentation; Mel frequency cepstral coefficients; MFCCs; line prediction cepstral coefficients; LPCCs.
    DOI: 10.1504/IJCVR.2023.10059523
     
  • A sine-cosine algorithm blended grey wolf optimisation algorithm for partitional clustering
    by Gyanaranjan Shial, Chita Ranjan Tripathy, Sabita Sahoo, Sibarama Panigrahi 
    Abstract: Over the last few decades, partitional clustering algorithms have emerged as one of the most promising families of clustering algorithms for finding groups among data items. Motivated by this, we propose a hybrid sine-cosine algorithm (SCA) blended grey wolf optimisation (GWO) algorithm for partitional data clustering. This algorithm selects near-optimal cluster centres using the leadership approach of GWO and the explorative strategy of SCA. Here, the sine and cosine functions are used to generate more diversified solutions around the mutant wolf of each search agent. Therefore, a tradeoff is maintained between exploration and exploitation, which draws on the benefits of both algorithms. Extensive simulation work is carried out for clustering 11 benchmark datasets using four performance measures. Additionally, a comparative statistical performance analysis is conducted against GWO, PSO, SCA, JAYA and K-means using Duncan's multiple range test and the Friedman and Nemenyi hypothesis tests. The tests confirm the superiority of the proposed algorithm.
    Keywords: grey wolf optimiser; JAYA algorithm; sine-cosine algorithm; SCA; particle swarm optimisation; PSO; partitional clustering; K-means algorithm.
    DOI: 10.1504/IJCVR.2023.10059975
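
The sine and cosine perturbation described in the abstract above can be sketched with the standard sine-cosine algorithm position update. This is a generic illustration under stated assumptions: the function names are hypothetical, the `destination` argument stands in for whatever reference solution the hybrid uses (the abstract mentions a mutant wolf but gives no update rule), and the linear amplitude schedule is the one commonly used for SCA.

```python
import math
import random


def amplitude_schedule(iteration, max_iterations, a=2.0):
    """Linearly shrink the step amplitude from a to 0 over the run."""
    return a - iteration * (a / max_iterations)


def sca_update(position, destination, amplitude, rng):
    """Move each coordinate of position around destination using a sine or cosine step."""
    new_position = []
    for x, p in zip(position, destination):
        r2 = rng.uniform(0.0, 2.0 * math.pi)  # phase of the oscillation
        r3 = rng.uniform(0.0, 2.0)            # random weight on the destination
        r4 = rng.random()                     # switches between sine and cosine
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_position.append(x + amplitude * wave * abs(r3 * p - x))
    return new_position


rng = random.Random(0)
step = amplitude_schedule(iteration=3, max_iterations=10)
print(sca_update([1.0, 2.0], [0.5, 0.5], step, rng))
```

Large amplitudes early in the run favour exploration, and the shrinking amplitude later favours exploitation near the reference solution, which is the tradeoff the abstract refers to.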
     
  • Machine learning-based iris liveness detection using fusion of Thepade SBTC and Niblack binarisation technique
    by Sudeep D. Thepade, Bhumika Patil, Smita Khade 
    Abstract: Liveness authentication is crucial in the observation environment, especially at border crossings and locations with a combat or buffer zone. This study examines how to assess the liveness of the iris template to avoid fraud. It uses a handcrafted method called TSBTC and additional binarisation techniques to survey the IIIT Delhi and Clarkson datasets and improve accuracy. A current requirement is to acquire an ILD dataset that covers all typical iris spoofing attempts. Three classes of eyes are included in the dataset: normal, coloured, and transparent. TSBTC and TSBTC + Niblack binarisation are applied to every image, and the two are then compared on the basis of their accuracies. Different classifiers are used for comparison, and Weka software has been used to compare their accuracies. The study has investigated the method for extracting the local and global features from iris images.
    Keywords: iris liveness detection; ILD; biometrics; Niblack binarisation; machine learning; feature fusion; Thepade SBTC; security.
    DOI: 10.1504/IJCVR.2024.10061526
     
  • An accurate and efficient multi-task brain tumour detection with segmented MRI images using auto-metric adolescent neural network
    by Amrapali Kishanrao Salve, Kalpana C. Jondhale 
    Abstract: Early diagnosis of a brain tumour (BT) boosts the chance that the patient will survive after medication. Several existing methods for detecting BTs are intrusive, cumbersome, and vulnerable to human errors. This manuscript introduces a novel hybrid method, auto-metric graph adolescent identity neural network (AGAINN), for accurate and efficient human BT segmentation and multi-task detection using magnetic resonance imaging (MRI) images. The input brain MRI images are given to a structural interval gradient filtering (SIGF) based preprocessing method for eliminating noise, resizing and increasing the quality of the brain images, and are then passed to adaptive transfer density peaks search (ATDPS) clustering-based segmentation to find the region of interest (RoI) of the preprocessed image. Then, three types of feature extraction are done using empirical wavelet transform (EWT) and grey-level co-occurrence matrix (GLCM). The extracted image features are passed to the proposed scheme to detect the tumour and its type. Performance is compared using several metrics and statistical tests, and an improved accuracy rate is reported.
    Keywords: benign; malignant; brain tumour; magnetic resonance imaging; MRI; clustering; statistical analysis.
    DOI: 10.1504/IJCVR.2024.10061584
     
  • Machine vision algorithm for MCQ automatic grading - MVAAG
    by Aaron Rasheed Rababaah 
    Abstract: Multiple-choice questions (MCQ) predated the first digital computer. MCQ was created as a response to demands for objective and standardised tests for large populations of test takers as in national tests in education, military assessments, surveys, etc. The technology used to automate MCQ grading has evolved to include optical mark recognition (OMR), optical character recognition (OCR), digital image processing (DIP), etc. In this article, we propose a robust solution for MCQ automatic grading using image processing techniques, MVAAG. Our approach uses an inexpensive digital camera or a scanner to scan the answer sheets, which are regular A4 papers. The scanned images are then put through a sequence of DIP operations including colour transformation stages, thresholding, morphology, connected components analysis, etc. MVAAG was validated using extensive experimental testing and found to be effective and efficient compared to manual methods as well as current modern technologies.
    Keywords: multiple-choice questions; MCQ; automating MCQ grading; image processing; bubble-based answer sheets processing; machine vision; robust MCQ auto-grading.
    DOI: 10.1504/IJCVR.2024.10061643
     
  • Synergising machine learning and blockchain for enhanced fraud detection
    by Pawan Whig, Rattan Sharma 
    Abstract: The convergence of blockchain technology and machine learning represents a powerful paradigm shift in revolutionising fraud detection within the financial sector. This abstract highlights the synergistic potential of combining these two cutting-edge technologies, emphasising their collective impact on bolstering fraud detection and prevention strategies. Through the utilisation of blockchain’s inherent features, such as transparency, immutability, and real-time monitoring, in conjunction with the predictive capabilities of machine learning, including exploratory data analysis (EDA), XGBoost, and random forest (RF), our research has achieved an outstanding accuracy rate of approximately 99.9% in fraud detection. This fusion empowers the identification of anomalies in real-time, the issuance of proactive alerts, and the development of adaptable models that continuously evolve to address emerging fraud patterns. Furthermore, the decentralised and collaborative nature of blockchain facilitates secure data sharing and leverages collective intelligence, further enhancing the precision of fraud detection. The profound implications of this integration empower financial institutions to significantly elevate transaction security, effectively combat fraudulent activities, and foster greater trust in the ever-evolving digital financial landscape.
    Keywords: blockchain; fraud detection; fraud prevention; financial security; digital transactions; decentralised ledger; data integrity; real-time monitoring; patterns; fraudulent activity.

  • Analysing the performance of Viola-Jones and multi-task convolution neural networks face detection algorithms using real-time video sequences
    by M. Mohana, P. Subashini 
    Abstract: In recent years, face detection has been a hot research area in computer vision, serving as the first step in both face recognition and facial expression detection. However, several challenges exist when detecting faces in real-time, including pose variations, varying lighting conditions, and partial occlusions on face and video images. Despite the existence of numerous face detection algorithms, this study focuses on evaluating the Viola-Jones and multi-task convolutional neural network (MTCNN) algorithms, which have been widely used for face detection in several research studies. The objective of this comparative study is to analyse these two widely used face detection algorithms in the context of the aforementioned challenges using real-time video sequences and benchmark datasets. For this study, video sequences were collected from the LIRIS children spontaneous facial expression video database, and a real-time video dataset was captured in the centre for machine learning and intelligence laboratory. The results show that MTCNN achieved an average true positive rate accuracy of 94.33%, whereas the Viola-Jones algorithm achieved 73.33% accuracy when conducting experiments with various face detection challenge scenarios.
    Keywords: face detection; Viola-Jones; MTCNN; computer vision; face detection challenges.
    DOI: 10.1504/IJCVR.2024.10061831
     
  • Generalised video anomaly detection: a systematic review
    by S. Anjali, S. Don 
    Abstract: The practice of identifying irregularities and outliers in data is known as anomaly detection. Due to the demand for prompt and precise anomaly detection, this is a growing research area in computer vision. The purpose of this paper is to provide a systematic literature review (SLR) on video anomaly detection by creating the pertinent research questions (RQs). We have considered 83 research articles from reputable databases published between 2012 and 2023. After reviewing these publications, we developed a taxonomy of different video anomaly detection strategies and found that deep learning-based algorithms performed better than traditional ones. The two most common applications of video anomaly detection are seen in the surveillance and healthcare domains. We have identified 16 benchmark datasets, including surveillance and medical datasets. Researchers can use this SLR to look into the most recent studies, applications, datasets, methodologies, challenges and future scope of video anomaly detection.
    Keywords: video anomaly detection; visual anomaly detection; computer vision; deep learning; systematic literature review; SLR.
    DOI: 10.1504/IJCVR.2024.10061832
     
  • Personalised video summarisation using video-text multi-modal fusion
    by Rakhi Akhare, Subhash K. Shinde 
    Abstract: Video summarisation techniques have evolved in recent years, mostly focusing on visual material and ignoring user preferences. In this work, the topic of query-focused video summarisation is addressed. Long videos are given as input, and the goal is to produce a query-focused video summary using the user's sentences rather than keywords. The two parts of the proposed personalised video summarisation (PVS) system are the query-relevance computation module and the feature encoding network. In order to provide a customised video summary, the suggested end-to-end approach combines encoded visual and textual information and assigns a query relevance score. The suggested PVS model is tested using the fast-text and Resnet embeddings on the video-query dataset. In comparison to various combinations of language and vision models, the suggested PVS model performs better and achieves an accuracy of 0.53%. This study assists the research community to work in the field of multimodal video summarisation.
    Keywords: personalised video summarisation; PVS; word embedding; feature fusion; multi-modal video summarisation; query based video summarisation.
    DOI: 10.1504/IJCVR.2024.10061911
     
  • Effect of layers on CNN model accuracy for facial emotion recognition
    by M.D. Rakshith, Harish H. Kenchannavar 
    Abstract: Facial expression recognition has become a very tedious task in the domain of image recognition. Image classification involves extensive usage of deep learning techniques. This has resulted in the increased usage of convolutional neural networks (CNNs) for recognising emotions through facial expressions. In deep learning, developing a compact network architecture that achieves high accuracy on the data of interest is a significant challenge. In this article, a novel optimised CNN (O-CNN) model consisting of five convolution layers is proposed and the effect of layers on the test accuracy is observed on the FER-2013 dataset. The hyperparameters of the CNN, such as kernel size, number of kernels, activation function, dropout, number of hidden units, batch size and epochs, are considered for experimentation. Keeping the kernel size constant, the convolution layers and kernels are varied for the model evaluation. The test accuracy obtained by the O-CNN model on the FER-2013 dataset without batch normalisation and for 50 epochs is 64.17%.
    Keywords: facial expression; convolutional neural network; CNN; hyperparameters; deep learning.
    DOI: 10.1504/IJCVR.2024.10061999
     
  • Deep learning-powered test case prioritisation in continuous integration: a comparative study and efficiency analysis
    by Sheetal Sharma, Swati V. Chande 
    Abstract: This empirical study introduces a deep learning-based approach for prioritising test cases in continuous integration (CI) environments, leveraging historical CI data to optimise resource allocation and reduce testing time. The model achieved a remarkable 100% accuracy in prioritisation, outperforming traditional methods. Compared to a decision tree, it achieved perfect accuracy with fewer test cases. Against a random forest, it had a higher fault detection rate while maintaining efficiency. When compared to a neural network, it struck a balance between fault detection and execution time. This research highlights deep learning’s potential in transforming CI/CD testing strategies and software development practices.
    Keywords: test case prioritisation; continuous integration; CI; deep learning; comparative analysis; efficiency; accuracy; fault detection; software development.
    DOI: 10.1504/IJCVR.2024.10062000
     
  • Prediction of fine-grained human activities in videos using pose-based and object-based features
    by Ashwini S. Gavali, S.N. Kakarwal 
    Abstract: Human activity prediction in videos deals with anticipating the intention of human activity before it is fully observed. Activity prediction becomes more challenging when fine-grained details are to be considered. This paper presents a deep learning-based approach for predicting complex, fine-grained, and long-duration human actions in videos. Along with prediction, our approach also localises human action spatially with bounding boxes. This approach works by considering the sequential nature of the activities in the video. Each high-level activity is represented as a sequence of local actions (low-level activities). Given a partially observed video, local actions are detected and tracked first, and then these local detections are used for predicting future high-level actions. Fine-grained activity involves interactions with different objects, so we used a combination of the human pose feature and the object feature to predict fine-grained activity more accurately. We evaluated results on the publicly available MPII cooking activity dataset.
    Keywords: activity prediction; fine-grained activity; local actions; ResNet-50; YOLO object detection; compact prediction tree; convolutional neural network.
    DOI: 10.1504/IJCVR.2024.10062050
     
  • A deep learning framework for disaster recognition and classification of the damaged regions
    by Jaychand Loknath Upadhyay, Himanshu Gharat, Reetik Gupta, Pallav Savla 
    Abstract: Natural disasters are rare, but when they occur, they generally cause colossal damage. Due to climate change, the number of disasters is increasing, which demands better disaster response to reduce the devastation they cause and to speed up recovery. A rapid assessment of the situation could facilitate an improved strategy for disaster management and recovery. However, these disasters often cause infrastructural destruction which makes the affected regions inaccessible. In such difficult conditions, aerial images captured by drones can be valuable for identifying the regions of damage. This study provides a methodology to classify disaster images using deep learning, which could help to identify the regions of damage. To perform this classification, a CNN model was trained on various disaster images through transfer learning. The model was trained on the AIDER dataset and achieved an F1-score of 96.8%. The performance of the proposed model is also verified on real-time videos of various disasters. The results obtained in the experiment highlight how the proposed model could support disaster response management and help deep learning expedite rescue operations.
    Keywords: disaster management; disaster recognition; damaged region classification; deep learning; convolution neural network; CNN; transfer learning.
    DOI: 10.1504/IJCVR.2024.10062177
     
  • Using fuzzy similarity measure in content-based video retrieval based on image query
    by Fatemeh Taheri, Kambiz Rahbar 
    Abstract: The primary challenge of video retrieval systems is to retrieve videos with the highest similarity to user queries. The process of feature extraction and similarity measurement plays a crucial role in the results of content-based video retrieval. This article introduces a fuzzy similarity metric for comparing and retrieving similar videos using image-queries to address the issue of uncertainty in the similarity between queries and video frames. To this end, features are extracted from both image-query and each video frame using a pre-trained VGG-16. Similarity metrics, including frequency and continuity in similar frames to the image-query, form the basis for calculating the similarity for retrieving videos. The proposed method compensates for uncertainty in image-query and dataset videos’ similarity measurements, leading to improved retrieval results. The best evaluation results with the mean accuracy metric on the UCF-11 dataset for retrieving one and ten top samples are reported as 0.862 and 0.689 respectively.
    Keywords: fuzzy similarity; content-based video retrieval; image query; VGG-16 neural network.
    DOI: 10.1504/IJCVR.2024.10062185
     
  • Orthogonal opponent colour local binary patterns: a new colour-texture descriptor for content based-image retrieval   Order a copy of this article
    by Rahima Boukerma, Bachir Boucheham, Salah Bougueroua 
    Abstract: Opponent colour local binary patterns (OCLBP) is one of the first extensions of greyscale LBP to colour images, which has been proven to be an effective descriptor for extracting colour texture features. In order to improve the OCLBP performance for image retrieval and increase its invariance to illumination change, we propose in this paper a new scheme for computing the OCLBP’s inter-channel features. Unlike OCLBP, where the inter-channel features are computed by considering the circular neighbouring of the centre pixel, our proposed descriptor named orthogonal OCLBP (O-OCLBP) is constructed by considering the orthogonal neighbouring of the centre pixel. Moreover, the proposed scheme is applied to the improved version of OCLBP (IOCLBP) to derive a new descriptor named orthogonal IOCLBP (O-IOCLBP). Experiments performed over eight databases demonstrate that the proposed descriptors significantly improve retrieval performance on almost all databases, and show generally better results compared to some of the state-of-the-art descriptors.
    Keywords: CBIR; IOCLBP; LBP; multichannel feature extraction; OCLBP; orthogonal-IOCLBP; orthogonal-OCLBP.
    DOI: 10.1504/IJCVR.2024.10062221
     
  • Recent security challenges and robust techniques in colour image watermarking   Order a copy of this article
    by Chandan Kumar, Dinesh Dinu 
    Abstract: Digital image watermarking is a widely used technique for ensuring the authenticity and security of digital images on the internet. While greyscale and colour image watermarking are both commonly used, this paper specifically reviews the most recent security issues related to colour image watermarking. Colour image watermarking poses unique security challenges, including vulnerability to attacks that can remove or alter the watermark, compromising the image’s authenticity and security. To address these challenges, researchers have developed robust watermarking techniques that embed the watermark in multiple colour channels of the image, making it difficult to remove or alter without affecting the image quality. Despite ongoing security challenges, the continued development of these techniques will help to enhance the security of digital images on the internet. This paper provides a comprehensive review of the latest developments in colour image watermarking, including security issues and robust techniques to address these challenges.
    Keywords: spatial and transform domain techniques; image watermarking; embedding and extraction; colour images.
    DOI: 10.1504/IJCVR.2024.10062255
     
  • Motion control of 3-DoF delta robot using adaptive neuro fuzzy inference system   Order a copy of this article
    by Riyadh A. Sarhan, Zaid H. Rashid, Mohammed S. Hassan 
    Abstract: Delta robots are widely used to achieve positioning tasks with high speed and accuracy, which requires a control model to move the platform of the delta robot along a specific coordinate. This paper presents a control system based on a fuzzy controller to achieve motion control and applies this system to a model of a delta robot capable of motion with three translational degrees of freedom. The proposed control system evaluates the angular position applied at the motor’s joint, depending on the output of the inverse kinematics and ANFIS, and then moves the end effector in the translation coordinates (X, Y and Z). Results from the inverse kinematics equations and from the delta robot after applying the proposed control system show a difference in the translation coordinates of around 5 cm in the X direction, 2 cm in the Y direction and 1 cm in the Z direction. This difference is due to the effect of friction in the joints of the delta robot, which is neglected in the inverse kinematics analysis. Finally, the proposed control system for a delta robot is validated with minimum errors.
    Keywords: delta robot; inverse kinematics; fuzzy control; adaptive neuro fuzzy inference system; ANFIS.

  • Investigating dementia: an analysis on machine learning strategies   Order a copy of this article
    by Tanvi Kapdi, Apurva Shah 
    Abstract: Dementia, a chronic and progressive decline in the cognitive function of the brain caused by impairment, is becoming more prevalent because of the ageing population. A significant challenge in dementia is achieving an accurate diagnosis. Recently, neuroimaging with computer-aided algorithms has made remarkable advances in addressing this challenge. The success of these techniques is largely attributed to the application of AI methods to neuroimaging. In this review paper, we present a detailed overview of automated diagnostic approaches for dementia using medical image analysis. From this survey of current efforts, it is observed that, while most studies have focused on common mental illness and recent research has shown reasonable performance, the identification of different types of dementia remains a significant challenge. Multimodal imaging analysis with deep learning approaches has shown promising results in the diagnosis of the types of dementia.
    Keywords: artificial intelligence; machine learning; deep learning; mental health; dementia.
    DOI: 10.1504/IJCVR.2024.10062322
     
  • Optimisation of weed management by image segmentation in precision agriculture   Order a copy of this article
    by Mohammed Habib, Salma Sekhra, Adil Tannouche, Youssef Ounejjar 
    Abstract: Accurate weed detection remains crucial for ultra-localised control in robotic solutions, simulating manual weeding in agriculture. Although many studies have been conducted in the field of weed detection using machine learning, most have focused mainly on direct detection, which can present challenges in the face of weed diversity. In this study, we propose an integrated approach based on vegetation/soil segmentation, followed by discrimination between crops and weeds using an object detector. Segmentation models such as UNet, FPN, and LinkNet have been thoroughly trained to discriminate efficiently between vegetation and soil. The results obtained are promising, with the trained models being able to generate binary images (masks) with an accuracy (Jaccard and Dice similarity indices) of over 89%. In addition, the execution speed reached 217 frames per second (Fps). The integration of the localisation results from the detection model with the segmented images provides a robust method for accurately determining the position of weeds in the agricultural context, opening up new prospects for automated, targeted weed control solutions.
    Keywords: image segmentation; computer vision; convolutional neural networks; agricultural images; weed detection; smart farming; precision farming; deep learning.
    DOI: 10.1504/IJCVR.2024.10062366
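
For readers unfamiliar with the Jaccard and Dice similarity indices mentioned in the abstract above, the following is a minimal sketch of how they are commonly computed for a predicted binary mask against a ground-truth mask. It assumes masks are flat sequences of 0/1 values; the function names and the toy masks are illustrative and are not taken from the article.

```python
def _overlap(pred, truth):
    """Count overlapping and individual foreground pixels of two 0/1 masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_count = sum(1 for p in pred if p)
    truth_count = sum(1 for t in truth if t)
    return intersection, pred_count, truth_count


def jaccard_index(pred, truth):
    """Intersection over union: |A and B| / |A or B|."""
    intersection, pred_count, truth_count = _overlap(pred, truth)
    union = pred_count + truth_count - intersection
    # Assumption: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


def dice_index(pred, truth):
    """Dice coefficient: 2 |A and B| / (|A| + |B|)."""
    intersection, pred_count, truth_count = _overlap(pred, truth)
    total = pred_count + truth_count
    # Assumption: two empty masks are treated as a perfect match.
    return 2 * intersection / total if total else 1.0


if __name__ == "__main__":
    # Toy masks (made up): 1 = vegetation, 0 = soil.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print("Jaccard:", jaccard_index(predicted, reference))
    print("Dice:", dice_index(predicted, reference))
```

The two indices are monotonically related (Dice = 2J / (1 + J)), so they rank models identically, although Dice is always at least as large as Jaccard.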
     
  • Enhanced licence plate detection using YOLO framework in challenging environments   Order a copy of this article
    by Sahil Khokhar, Deepak Kedia 
    Abstract: The need for monitoring and controlling traffic for applications such as toll collection, parking, and law enforcement has grown significantly in the last few years. Automatic license plate recognition (ALPR) systems are accomplishing the monitoring of vehicles on a massive scale. ALPR systems have been a research topic for many years, yet ground deployment has yet to catch up. The primary reason for this has been the systems’ poor efficiency in real-world scenarios compared to lab testing conditions. This paper focuses on the license plate detection part of the ALPR system. Deep learning-based YOLO frameworks are employed to detect license plates. The effect of using different datasets for training the network and the efficiency of various versions of the YOLO framework are tested in diverse conditions such as low-light, low-contrast environments and partial or obstructed plates. The YOLOv7 algorithm achieved an F-score of 98.62% on the AOLP dataset with an average processing time of 15.43 ms. The implemented techniques are accurate and fast enough for real-time applications such as toll collection and traffic monitoring.
    Keywords: automatic license plate recognition; ALPR; object detection; deep learning; machine learning; computer vision; intelligent transportation system.
    DOI: 10.1504/IJCVR.2024.10062468
     
  • The evolution of humanoid robots   Order a copy of this article
    by Tejas Deshpande, Bhumeshwar Patle, Virendra Bhojwani 
    Abstract: The recent development in the field of robotics has expanded the horizons for humans to make autonomous robots work with better accuracy and speed without human intervention in multiple industries. The introduction of bio-mechanics with Leonardo da Vinci’s model widened the scope of robotics leading to the concept of human-shaped robots or humanoids. Today, humanoid robots work alongside humans as well as work without any human intervention. Humanoids study their external environment using multiple sensors and accordingly use artificial intelligence which helps them analyse the situation and develop an appropriate response to external stimuli. From being extremely heavy and inefficient to becoming lightweight, sturdy, efficient, and possessing human-like intelligence, humanoids have been developed in previous years. In this review paper, we discuss the evolution and development of robotics which caused advancements leading to the current generation of humanoids while simultaneously classifying robots from each other using parameters like technological advancements and tasks performed which will help other researchers.
    Keywords: robotics; humanoids; artificial intelligence.
    DOI: 10.1504/IJCVR.2024.10062527
     
  • Node anomaly detection in social networks using cohesive non-local graph convolutional network   Order a copy of this article
    by Yallamanda Rajesh Babu, G. Karthick, V.V. Jaya Rama Krishnaiah 
    Abstract: Users connect with one another and develop relationships on social media platforms. These users have a collection of personal information about themselves on these platforms and communicate with one another. Social networks are becoming more prevalent all across the globe. With all of its advantages, criminality and fraudulent conduct in this medium are on the rise. As a result, there is an urgent need to detect abnormalities in these networks before they do substantial harm. Social network analysis uses graph data structures to represent and manage data. Graphs store data and capture the relationships that exist between the nodes. Graphs are a complicated kind of data representation in which each data entry contains attributes and is also connected to other data entries. Traditional non-deep learning approaches fail to perform effectively as the size and scope of real-world social networks grow.
    Keywords: anomaly detection; graph; node anomaly; graph convolutional network; GCN; auto-encoder; CNLGCN.
    DOI: 10.1504/IJCVR.2024.10062528
     
  • Real-time interpretation of American Sign Language using SSD-MobileNet   Order a copy of this article
    by Youssef Farhan, Zineb Haimer, Abdessalam Ait Madi 
    Abstract: Individuals with hearing impairments may struggle with integrating into society because the general population does not understand sign language. Consequently, this can lead to isolation and exclusion from social and professional opportunities. To address this issue, this paper proposes a system for the real-time interpretation of American Sign Language (ASL) using computer vision technology. This system uses a normal webcam to detect and interpret 26 letters of the English alphabet and three auxiliary signs. To achieve this goal, the pre-trained lightweight single-shot multibox detection network model, from the TensorFlow object detection application programming interface (API), SSD-MobileNet was used. After the training phase of the proposed model with a personally collected dataset, the obtained results in testing are promising, with a precision of 82.8% and a recall of 85%. The proposed system represents a forward step in sign language translation. Furthermore, it can be adapted to interpret other sign languages.
    Keywords: American Sign Language; ASL; SSD-MobileNet; TensorFlow object detection API; computer vision.
    DOI: 10.1504/IJCVR.2024.10062706
     
  • Spatial attributes-based segmentation and topological attributes-based recognition algorithm for Myanmar OCR   Order a copy of this article
    by Nwe Nwe Htay Win 
    Abstract: In this paper, we propose a novel segmentation and character recognition algorithm for printed offline Myanmar documents. The contribution of this paper is threefold: 1) it first presents a segmentation algorithm based on the spatial attributes of the characters; 2) it then extracts the most relevant features from the segmented images using the determinant and trace values of the Hessian feature matrix, and the feature vectors are fed into a fully connected self-organisation map (SOM) computational network for recognition of those segmented images; 3) the system finally assembles partially recognised characters into a complete compound character depending on their topological attributes. To assess the performance of the system, we conducted experiments on a dataset of 40,878 images and evaluated accuracy, error rate and computational time in comparison with the contemporary works CNeT and OCRMPD. Our system achieves an overall accuracy of 97.5%, outperforming the compared works.
    Keywords: character segmentation; recognition; topological attributes; spatial attributes; Hessian feature matrix; self-organisation map; SOM.
    DOI: 10.1504/IJCVR.2024.10062781
     
  • Classification of the sentiment using African vultures spider monkey optimisation based SqueezeNet technique   Order a copy of this article
    by Konda Adilakshmi, Malladi Srinivas, Anuradha Kodali, Srilakshmi Vellanki 
    Abstract: Sentiment classification is a specific task in text categorisation, which aims to categorise documents by their reviews. Sentiment analysis is the process of extracting emotional content from texts and is a fundamental task for understanding users. Therefore, an effective technique called AVSMO_SqueezeNet is proposed for sentiment classification. Firstly, an Amazon review document is taken as input and passed to the tokenisation phase, where BERT is used. After tokenisation, feature extraction is carried out to obtain appropriate features for sentiment classification. Lastly, sentiment classification is performed using SqueezeNet, which is tuned by the proposed AVSMO approach. The new AVSMO technique is devised by combining the AVOA and SMO techniques. The proposed technique achieved a maximum precision of 0.878, recall of 0.887, and F-measure of 0.883.
    Keywords: SqueezeNet; aquila optimiser; AO; African vultures optimisation algorithm; AVOA; SailFish optimiser; SFO; spider monkey optimisation; SMO.
    DOI: 10.1504/IJCVR.2024.10062782
     
  • Experimentative analysis of artificial immune system algorithms for intrusion detection in IoT networks   Order a copy of this article
    by Syed Ali Mehdi, Syed Zeeshan Hussain 
    Abstract: Intrusion detection systems (IDS) are the basic security line for any network. Internet of things (IoT) networks are increasingly popular and widely used, which raises security challenges and creates a requirement for IDS for IoT. There has been promising research on artificial immune system (AIS) algorithms for intrusion detection. In this paper, two AIS algorithms, namely the negative selection algorithm (NSA) and the clonal selection algorithm (CSA), are compared for intrusion detection in IoT. These algorithms are evaluated using Python on the popular public BoT-IoT-L01 dataset. Various performance metrics such as detection rate, classification accuracy, false-positive rate, and false-negative rate are used for comparison. The results show that CSA is better than NSA-based intrusion detection for IoT in terms of accuracy. CSA achieved an overall detection rate of 91%, compared with 79% for NSA. However, NSA was found to be more efficient than CSA in detecting rare kinds of attacks. The research findings indicate that AIS can be a powerful tool for IoT-based IDS. The selection of an appropriate AIS algorithm for IDS in IoT depends on the specific requirements and characteristics of the IoT network.
    Keywords: internet of things; IoT; artificial immune systems; AIS; intrusion detection system; IDS.
    DOI: 10.1504/IJCVR.2024.10065553
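
The metrics named in the abstract above (detection rate, classification accuracy, false-positive rate and false-negative rate) are usually derived from a binary confusion matrix. The following is a minimal sketch under that assumption; the article may define them differently, the function name `ids_metrics` is hypothetical, and the counts in the usage example are made up rather than taken from the article.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ids_metrics(tp, fn, fp, tn):
    """Standard confusion-matrix metrics for a binary intrusion detector.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal traffic flagged          tn: normal traffic passed
    """
    return {
        # Share of real attacks that were detected (recall / true-positive rate).
        "detection_rate": _ratio(tp, tp + fn),
        # Share of all samples that were classified correctly.
        "accuracy": _ratio(tp + tn, tp + fn + fp + tn),
        # Share of normal traffic wrongly flagged as an attack.
        "false_positive_rate": _ratio(fp, fp + tn),
        # Share of real attacks that were missed.
        "false_negative_rate": _ratio(fn, fn + tp),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in ids_metrics(tp=80, fn=20, fp=10, tn=90).items():
        print(f"{name}: {value:.2f}")
```

Detection rate and false-negative rate always sum to one, so a detector that scores well on one necessarily scores well on the other; accuracy, by contrast, can look high on imbalanced traffic even when rare attacks are missed.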
     
  • An image-based system for monitoring pregnant women's sleep posture   Order a copy of this article
    by S. Mohanram, J. Sathyamoorthy, A. Sargunal, P. Seran, M.S. Yogiramkumar, C. Elakshme Devi, M.R. Dharshini, S. Dharshan, G. Chandru 
    Abstract: Pregnancy is an important period for both mother and child, and the quality of sleep plays a vital role in ensuring their health. This paper presents a sleep position monitoring system designed to aid pregnant women in maintaining a healthy posture during sleep, which is crucial for maternal and fetal well-being. Leveraging computer vision and machine learning techniques, the system detects four sleep positions based on shoulder coordinates obtained from the MediaPipe pose model. Employing SVM and random forest algorithms, two models are developed to enhance accuracy, and their results are averaged for robust sleep position identification. Upon detecting prolonged undesired positions, the system triggers a call via a GSM modem for timely intervention. Offering a non-invasive, automated, and cost-effective solution, this system facilitates proactive monitoring of pregnant women's sleep posture, potentially preventing harm to the fetus. By promoting healthy sleep habits throughout pregnancy, it aims to improve maternal and fetal health outcomes.
    Keywords: GSM module; MediaPipe; OpenCV; random forest algorithm; support vector machine; SVM.
    DOI: 10.1504/IJCVR.2024.10062865
     
  • Enhanced faster R-CNN based subcutaneous and visceral adipose tissue segmentation from abdominal MRI   Order a copy of this article
    by B. Sudha Devi, D.S. Misbha 
    Abstract: Obesity has emerged as a significant global problem that puts both adults and children at risk of developing chronic diseases. Abdominal adipose tissue is commonly divided into two primary components, visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT), with the former being more directly linked to health concerns. Many computer-based techniques have been developed for segmenting VAT and SAT, but they are poor at feature extraction from MRI and CT images. In the proposed model, the collected MRI and CT images of datasets 1 and 2 are enhanced using pre-processing techniques consisting of image resizing, CLAHE, and median filtering. Finally, the pre-processed images are segmented and classified using the enhanced faster R-CNN based on ResNet-50 and ROI (grab cut). The performance of the proposed model is evaluated on datasets 1 and 2 using metrics including error, accuracy, precision and specificity. The enhanced faster R-CNN model performs better by accurately segmenting and classifying the VAT and SAT from the abdominal region.
    Keywords: visceral adipose tissue; VAT; subcutaneous adipose tissue; SAT; faster R-CNN; ResNet-50; grab cut; CLAHE; median filter.
    DOI: 10.1504/IJCVR.2024.10062951
     
  • Deep neural learned bipolar sigmoid association rule mining for discovering high frequent and utility itemsets   Order a copy of this article
    by R. Savitha, V. Baby Deepa 
    Abstract: This paper presents a new deep neural learned bipolar sigmoid association rule mining (DNLBSARM) algorithm to improve the mining performance of frequent and utility itemsets (FUI) when considering large transactional data as input. DNLBSARM combines deep neural learning and bipolar sigmoid association rule generation. Initially, the input layer in DNLBSARM takes a large number of items as input to the mining process. Subsequently, the hidden layers perform deep analysis in which the support and utility value of each itemset in the large database are measured. Finally, the output layer uses a bipolar sigmoid association rule with the aim of accurately discovering and mining the most frequent and most profitable itemsets in a massive dataset in minimal time. As a result, DNLBSARM obtains improved extraction performance in finding the itemsets of greatest user interest and profit compared to existing works. The experimental evaluation of DNLBSARM is conducted using parameters such as accuracy, time complexity, and false positive rate for various numbers of input items. The results demonstrate that the proposed DNLBSARM provides better performance, in terms of higher accuracy and lower complexity, for extracting the FI and HUT when compared to conventional research works.
    Keywords: association rule; bipolar sigmoid activation function; BSAF; deep learning; frequent itemsets; utility itemsets.
    DOI: 10.1504/IJCVR.2024.10062952
     
  • Design of low power low area SRAM cell at 180 nm, 90 nm and 45 nm technology nodes   Order a copy of this article
    by Seerapu Venkatesh, Krishna Veni Sahukara 
    Abstract: In this paper, a new SRAM cell topology is proposed with low power and low area at three technology nodes: 180 nm, 90 nm and 45 nm. The power consumption of the new SRAM topology is compared with that of the conventional 6T SRAM cell and an SRAM cell implemented with the LECTOR technique. The schematic designs of these SRAM cells are implemented in S-Edit, T-Spice is used to generate the SPICE code of the circuit, and the waveforms are observed in W-Edit of the Tanner tool software. The proposed SRAM cell at the 45 nm technology node has better power consumption than the remaining SRAM topologies.
    Keywords: LECTOR; SRAM; read; write; hold; power.
    DOI: 10.1504/IJCVR.2024.10063007
     
  • An internet of things-edge paradigm-enabled vision-based driving assistance for blind corners: a V2I application   Order a copy of this article
    by Goutam Kumar Sahoo, Rashmiranjan Nayak, K. L. Sanjeev Tudu, Umesh Chandra Pati, Santos Kumar Das, Poonam Singh 
    Abstract: The proposed work detects moving vehicles using unsupervised methods and estimates their speed and distance using surveillance cameras mounted on road infrastructure, for collision avoidance at sharp corners. The goal is to develop IoT-based, computer vision-assisted vehicle-to-infrastructure (V2I) communication for autonomous vehicles. Information such as the presence of vehicles in the blind zone and the speed and distance of the approaching vehicle can be shared with drivers beforehand for safety purposes. Lightweight computer vision algorithms using simple morphological operations are proposed to detect the incoming vehicle and estimate its speed and distance. Further, an IoT-edge paradigm-enabled computing platform is developed to facilitate efficient computation for latency-sensitive real-time applications. A fixed roadside unit near the turning point raises an auto-generated audio-visual alarm to guide the driver when the approaching vehicle crosses the threshold zone predefined for that turning point, thereby helping the vehicle to avoid a collision.
    Keywords: blind corners; camera; edge computing; internet of things; IoT; vehicle detection; warning generation; vehicle-to-infrastructure; V2I; advanced driver assistance systems; ADAS.
    DOI: 10.1504/IJCVR.2024.10063672
     
  • Predictive modelling of carbon nanotube structures using machine learning techniques   Order a copy of this article
    by Pawan Whig, Imran Ahmed Khan, Amrita Rai, Owais Ahmad Shah, M. Nasim Faruque, Jaishanker Prasad Keshari, Mudit Wadhwa 
    Abstract: This paper explores predictive modelling techniques applied to the structural analysis of carbon nanotubes (CNTs) using a dataset encompassing 10,721 initial and calculated atomic coordinates, alongside their intricate chiral networks. Derived from simulation software, BIOVIA materials studio CASTEP, this dataset serves as the foundation for employing advanced machine learning methodologies. Our research aims to decode the nuanced complexities inherent in CNT structures. By leveraging cutting-edge machine learning approaches, we seek to revolutionise the understanding and predictive capabilities regarding CNT architectures. The significance of our findings resonates deeply within materials science and nanotechnology, promising to streamline the comprehension and utilisation of CNTs across diverse domains, spanning electronics to materials engineering. This study marks a pivotal stride towards automating and expediting the development of nanomaterials, fostering innovation in a field where precision and efficiency are paramount. Our work showcases the potential for transformative advancements in harnessing CNTs for practical applications, propelling the integration of these nanostructures into real-world technologies.
    Keywords: carbon nanotubes; CNTs; predictive modelling; machine learning; structural analysis; materials studio CASTEP; nanomaterials; atomic coordinates; chiral networks; nanotechnology; materials science; simulation data.
    DOI: 10.1504/IJCVR.2024.10063137
     
  • Improving electrocardiography signal quality: introducing an efficient approach for noise removal   Order a copy of this article
    by V. Jagan Naveen, Gunta Nooka Raju, Sanapala Umamaheswara Rao, Marpu Chaitanya Kumar, Potnuru Narayanarao 
    Abstract: Electrocardiography (ECG) is commonly used to evaluate the heart’s electrical activity. However, power line interference, muscular artefacts, and baseline drift are only a few examples of noise that can affect the accuracy and reliability of ECG signals. An effective method for noise removal is introduced in this paper as a novel strategy for enhancing the quality of ECG signals. The suggested approach uses cutting-edge signal processing techniques and machine learning algorithms to isolate and eliminate unwanted noise without altering the original cardiac signal. Pre-processing, feature extraction, noise estimation, and adaptive filtering are the cornerstones of the methodology. Experimental results on various ECG recordings show that the proposed method is effective at drastically lowering noise interference and improving the overall quality of ECG signals. With higher signal quality, doctors may make more informed patient care decisions. The proposed method achieves the highest SNR of 4025 dB after filtering, indicating that it reduces noise and enhances signal quality by a significant margin compared to the other methods. The presented approach shows promising potential for inclusion in existing ECG devices and systems, offering a realistic option for noise reduction in clinical situations.
    Keywords: artefact; baseline-wander; electrocardiogram; ECG; denoising; filtering.
    DOI: 10.1504/IJCVR.2023.10063447
     
  • Intelligent serial cascade of hybrid deep learning model for plant leaf disease identification and classification with multi-scale dilation assisted 3D-CNN features   Order a copy of this article
    by P. Vinay, G. Santhosh Kumar 
    Abstract: A novel deep learning framework is explored for plant leaf disease detection to resolve the challenges of existing leaf disease detection models. The input images are pre-processed through optimal weighted threshold histogram equalisation. The parameters of the histogram equalisation approach are optimised via a hybrid heuristic algorithm, rat aquila swarm optimisation (RASO). Subsequently, deep features are acquired from the pre-processed image through a multi-scale dilation assisted 3D-CNN. The result is then classified using the serial cascade of autoencoder and gated recurrent unit (GRU) (SC-AGRU). RASO is also used to perform parameter tuning to increase the classification performance. Throughout the analysis, the accuracy and precision of the suggested method are 96% and 95%, respectively. The overall effectiveness of the proposed plant leaf disease classification technique is established through a comparative analysis against various plant leaf disease classification techniques on various evaluation measures.
    Keywords: plant leaf disease identification; optimal weighted threshold histogram equalisation; rat Aquila swarm optimisation; serial cascade of autoencoder and gated recurrent unit neural network; multi-scale dilation assisted convolution neural network.
    DOI: 10.1504/IJCVR.2024.10063504
     
  • Mammogram mass segmentation using evolutionary algorithm-based single layer neural network   Order a copy of this article
    by Sunita Sarangi, Harish Kumar Sahoo 
    Abstract: Mammography is the most reliable method for detecting breast cancer in its early stages. Breast region segmentation is a fundamental procedure for analysing mammograms. This paper presents an improved segmentation approach using a hybrid model that combines a functional link artificial neural network (FLANN) with particle swarm optimisation (PSO). The suggested segmentation technique makes use of a segmentation threshold that is adaptively adjusted according to the image attributes. A comparison is made between three expansion techniques used for the input to the FLANN: exponential FLANN (EFLANN), Chebyshev FLANN (CFLANN), and Legendre FLANN (LFLANN). A total of 110 images from the mini-MIAS and DDSM databases are used for comparison. The performance measures for CFLANN and LFLANN are found to be better than those for EFLANN.
    Keywords: mammogram; adaptive threshold; EFLANN; CFLANN; LFLANN; particle swarm optimisation; PSO.
    DOI: 10.1504/IJCVR.2024.10063552
     
  • Fingerprint template protection: cancellable biometrics   Order a copy of this article
    by Ayesha S. Shaikh, Vibha D. Patel 
    Abstract: Biometric authentication systems have become more popular nowadays because of mobile and other handheld devices since they eliminate the need for a password or pin to remember. If an intruder hacks biometric traits, there is no way to change the biometric traits of any person because they are permanently attached to the person. The biometric traits are not replaceable like passwords; hence, the key research area is privacy preservation. To stop such biometric traits from being stolen or used improperly, secure technology solutions must be developed. In order to provide a reliable and secure biometric authentication system, we present a cancellable biometrics technique. We proposed a highly secure method for cancellable biometrics using a speeded up robust feature approach for image feature extraction, which is followed by a fast Fourier transform with an index of max hashing and Hadamard product vector for the protection of the biometric template. On a standard dataset FVC2002-DB1 and DB2, we tested and assessed the suggested strategy, and we got reasonably decent results.
    Keywords: fingerprint biometrics; template protection; cancellable biometric; security and privacy preservation.
    DOI: 10.1504/IJCVR.2024.10063568
     
  • Design analysis of compliant 3D printed thermoplastic polyurethane micro-gripper with screw-gear actuation   Order a copy of this article
    by N. Sahay, S. Chattopadhyay 
    Abstract: In this work, the design and analysis of a compliant micro-gripper made of thermoplastic polyurethane (TPU) is presented. With the proposed design, the prototype of the gripper will be developed by 3D printing technology using TPU. The material is lightweight, low cost and very flexible, and it provides gripping through its deformation when force is applied at its actuation point. The gripper is designed and finite element analysis (FEA) has been done using Pro Release 5.0 software, where stress and displacement are evaluated at every point of interest. Pressure has been applied in the range of 0.01 to 1.0 MPa to obtain input characteristics in terms of stress generation of the structure, which is found to be linear in the range of interest. Output characteristics have been presented in the displacement curve with respect to the applied force. Actuating force has been calculated mathematically from the specified torque and other required parameters of the screw-gear actuation system.
    Keywords: compliant mechanism; displacement analysis; micro-gripper; Pro Release 5.0; screw-gear; stress analysis; thermoplastic polyurethane.
    DOI: 10.1504/IJCVR.2024.10063569
     
  • Ensemble learning and skip connection-based CNN framework for COVID-19 identification using CXR and CT images   Order a copy of this article
    by Muzammil Khan, Bhavana Singh, Pushpendra Kumar 
    Abstract: COVID-19 causes severe deterioration of the respiratory system by infecting the lungs, resulting in high fatality rates. Thus, in order to reduce the mortality rate, chest radiographs such as CT and CXR of the lungs can be utilised for early identification. The proposed work introduces a novel convolutional neural network architecture, TES-Net, for performing COVID-19 detection from CT and CXR. The model is based on transfer learning, ensemble learning and skip connections. Transfer learning circumvents the need for large amounts of new data to train a model, while ensemble learning uses a combination of different individual models to obtain a higher predictive accuracy. Moreover, skip connections are useful in tackling the problem of vanishing gradients. The experimental results are described in terms of different evaluation metrics and compared with several existing CNNs and machine learning classifiers. An ablation study is also conducted to show the significance of different components.
    Keywords: convolutional neural network; CNN; COVID-19; CT scan; CXR; ensemble learning; skip connection; transfer learning.
    DOI: 10.1504/IJCVR.2024.10063592
     
  • Particle swarm optimisation-based scalable controller placement with balancing constraints in software-defined wide area networks   Order a copy of this article
    by Sasibhushana Rao Pappu, Kalyana Chakravarthy Chilukuri 
    Abstract: Software defined networking (SDN) is a cutting-edge networking technology that enables a traditional switch’s control plane and data plane to be isolated. SDN improves network usage performance by centralising control plane management (SDN controller). However, a single controller will be unable to handle these networks due to the massive usage of resources in today’s wide area networks. Instead, the control plane can be managed by multiple controllers, with the switches distributed among them. We provide a method for determining the best controller position by balancing the load between switches and controllers. The proposed strategy is based on particle swarm optimisation to determine the placement of controllers in a software defined network, using the controller’s load factor as the fitness feature. It does not, however, affect the current controller placement solutions (latency between controller and switch). In this article, the network topologies OS3E and Intellifiber are used. The results show that the proposed method reduces the overall latency of the network when multiple controllers are used.
    Keywords: controller placement problem; CPP; software defined networking; SDN; particle swarm optimisation; PSO; load balance.
    DOI: 10.1504/IJCVR.2024.10063677
     
  • Facial action unit and its intensity detection using multi-network architecture   Order a copy of this article
    by Rohan Appasaheb Borgalli, Sunil Surve 
    Abstract: Facial expression recognition (FER) plays a significant role in applications like medicine, human-machine interface, e-education, video games, AI, distance psychotherapy, and security. In the literature, solving the FER problem based on single static images is preferred because datasets are available, processing requires less memory, and the algorithms are less complex than those for videos. In terms of techniques, deep learning, particularly convolution neural networks (CNNs), is favoured for its ability to learn high-level facial features. The proposed multi-network architecture uses a modified Xceptionnet architecture, slightly changing a few final fully connected layers to detect facial action unit (FAU) intensity accurately. Using this modified architecture, we designed a multi-network architecture for the DISFA+ Database, which consists of 12 networks, each trained separately on FAUs to detect action units and their intensity with reasonable accuracy of 89% and 64%, respectively, which are then in turn mapped to basic and compound facial emotions.
    Keywords: facial expression; facial action unit; convolution neural network; action unit intensity; deep learning.
    DOI: 10.1504/IJCVR.2024.10063731
     
  • A mobile-based deep learning technique for ECG beat classification   Order a copy of this article
    by Geetamma Tummalapalli, Sanapala Umamaheswara Rao, Marpu Chaitanya Kumar, Potnuru Narayanarao 
    Abstract: The electrocardiogram (ECG) is a valuable tool for diagnosing cardiovascular issues. However, manual analysis can be time-consuming and prone to error. This work presents a novel ECG classification system utilizing a convolutional neural network (CNN) to automatically categorize ECG signals into five classes: normal, left/right bundle branch block, atrial premature contraction, and ventricular premature contraction. Our method extracts nonlinear features directly from the signal, outperforming approaches reliant on hand-crafted features. We achieved 99.25% accuracy on the MIT-BIH database, with rapid classification time (0.0738 seconds per beat). Crucially, we integrated this model into an Android application, enabling convenient ECG signal classification and result display for potential clinical use.
    Keywords: Android application; convolutional neural network; CNN; electrocardiogram; ECG; MIT-BIH.
    DOI: 10.1504/IJCVR.2024.10063732
     
  • Ensemble CNN model with novel optimisation technique for video content detection   Order a copy of this article
    by Sita M. Yadav, Sandeep M. Chaware 
    Abstract: This research develops and implements a CNN-BiLSTM with chaser prairie wolf optimisation (CPW) model for video content analysis. Initially, the input is collected from the CAMVID and DAVIS datasets, and the video is read. The optimised YOLO-4 model is proposed for detecting the objects from the video. The hybrid optimisation algorithm is developed from the characteristics of Albus and Falcon, and the role of the optimiser is to train the YOLO model. Then, in order to achieve enhanced performance for the multiclass object classification from videos, the identified objects are subjected to classification using a deep learning model employing the suggested CNN-coupled LSTM model. Additionally, the chaser prairie wolf optimisation is used to enhance the deep learning classifier's training, which improves convergence rates. For the video content analysis model, at training percentage (TP) 90, the accuracy is 95.75%, sensitivity is 97.30%, and specificity is 96.88% for D1; similarly, for D2 the accuracy is 97.77%, sensitivity is 99.00%, and specificity is 98.90%.
    Keywords: hybrid optimisation algorithm; chaser priori optimisation; object detection; object classification; CNN-coupled LSTM.
    DOI: 10.1504/IJCVR.2024.10063874
     
  • An efficient hybrid model for localisation and grading of diabetic retinopathy using fundus images   Order a copy of this article
    by Pammi Kumari, Priyank Saxena 
    Abstract: Diabetic retinopathy (DR) is the leading factor affecting the visions of many. This study aims to develop a computationally efficient deep learning (DL) framework for DR grading (0 to 4) to overcome the limitations of computationally inefficient existing DL models. This prompted us to use a small-scale architecture (MobileNetV2) integrated with a support vector machine (SVM) for DR grading on the APTOS dataset. A computationally light MobileNetV2 has considerably fewer trainable parameters, making it suitable for edge devices. The integration of SVM provides flexibility in tuning the essential characteristics of the dataset and enhances the grading performance efficaciously. The gradient-weighted heatmap technique is incorporated for disease localisation to visualise the affected regions adequately. The investigation’s outcome substantiates the proposed architecture’s efficiency over the existing DL methods, achieving a test set accuracy of 80% for multilevel and 96% for binary classification with a minimum testing loss.
    Keywords: diabetic retinopathy; DR; support vector machine; SVM; Grad-CAM; deep learning; hybrid architecture; APTOS; MobileNetV2.
    DOI: 10.1504/IJCVR.2024.10063875
     
  • Underwater image enhancement using anisotropic diffusion and multiscale fusion strategy   Order a copy of this article
    by Rekha Chaturvedi, Vishnu Soni, Jitendra Rajpurohit, Abhay Sharma 
    Abstract: Since the transmission of light through water leads to scattering, underwater images are often afflicted by several types of degradation such as poor contrast, haziness, blurring, and colour distortions. In order to resolve these kinds of problems, we devise a novel technique that uses anisotropic diffusion to effectively split the LAB colour space’s L-channel into base and detail images, with the aim of reducing noise while simultaneously preserving the salient features of underwater images. The method further performs a fusion process to quantify three weight maps using a variety of strategies and yield normalised weight maps for each image. To achieve an enhanced final image, we consolidate the blended contributions of all levels after appropriate upsampling. Lastly, we restore the enhanced underwater image by converting the blended enhanced LAB image to the RGB colour space. Enhancement of image quality is measured in terms of Entropy, PCQL and UIQM. The UIEB dataset has been used to implement our proposed method, and experimental findings show that our method outperforms the LAFFNet, deep residual, and retinex-based methods. It also works well for underwater images with colour distortion, poor contrast and detail loss.
    Keywords: underwater image enhancement; multiscale fusion; anisotropic diffusion; weight maps; Laplacian pyramid.
    DOI: 10.1504/IJCVR.2024.10063999
     
  • Through deep learning, dynamic hand gesture recognition of sign language learning algorithm   Order a copy of this article
    by Tushar A. Champaneria, Harikrishna B. Jethva, S. Julia Faith, Neel Kumar Shrimali 
    Abstract: Speech is the most common way people talk to each other, but some people have trouble saying or hearing. In this study, a deep learning-based model is proposed that can figure out what words a person is trying to say from the way they move their hands. Deep learning models, like LSTM and GRU (feedback-based learning models), are used to figure out what signs are in Indian sign language (ISL) film clips that are not connected to each other. This study shows how machine learning methods can be used for real-time motion recognition in a wide range of human-computer interfaces. Experiments showed that the system could recognise hand postures with 99.4% accuracy and active gestures with an average accuracy of 93.72%. For datasets with easy backgrounds, the accuracy is almost 99%, for datasets with complicated backgrounds it is 92%, and for the video dataset it is 84%.
    Keywords: deaf-mute people; human-machine contact; key frame extraction; inception deep-convolution network; video analytics.
    DOI: 10.1504/IJCVR.2024.10064000
     
  • Sewer shad fly optimisation based efficient skin lesion detection using capsule neural network   Order a copy of this article
    by Vineet Kumar Dubey, Vandana Dixit Kaushik 
    Abstract: In this research, sewer shad fly optimisation (SSFO) is developed to detect skin lesions using a capsule neural network. The HAM10000 dataset is first accessed for input, after which pre-processing is carried out. The ROI is segmented using an optimised clustering-based segmentation method based on sewer shad fly optimisation, created as a result of mayfly and moth flame optimisation. The segmented region is sent for feature extraction, which is carried out using both grid-based statistical features and a hybrid ternary pattern. The recovered region is sent to the capsule neural network classifier, which uses the sewer shad fly optimisation algorithm to adjust the classifier's weights and bias to accurately detect the skin lesion. The proposed SSFO-CapsNet NN attains values of 96.45%, 98.00%, and 94.28% at TP 90, and 95.89%, 98.57%, and 95.76% at k-fold 10.
    Keywords: capsule neural network; skin lesions classification; sewer shad fly optimisation; SSFO; transfer learning; and resnet-101.
    DOI: 10.1504/IJCVR.2024.10064099
     
  • Communication efficiency enhancement in underwater wireless sensor networks   Order a copy of this article
    by M.N.V.S.S. Kumar, Sanapala Ramkishore, G. Sateesh Kumar, Sanapala Umamaheswararao 
    Abstract: Underwater wireless sensor networks (UWSNs) are very important in establishing underwater communication. In UWSNs, the main parameters to focus on are low latency and energy-efficient data transfer. However, problems arise as a result of the dynamic topologies of these networks. The acoustic medium’s location-dependent and time-communication characteristics make it challenging to collect data in UWSNs in a reliable and energy-efficient way using exceptionally stable links. This research employs a new hybrid optimisation method known as the fish swarm artificial algae algorithm (FS-AAA) for a routing scheme over internet monitoring events in UWSN applications. The developed routing method provides more durable and reliable routing paths for routing packets in UWSNs under shadow zones and connectivity voids. By balancing the traffic load data over the vast network scale, latency and energy consumption issues are reduced when sharing information. The designed protocol’s primary multi-objective function is to improve signal-to-noise ratio (SNR), packet delivery probability and quality of service (QoS). Experimental findings show that the developed method outperforms the present routing strategies for UWSNs.
    Keywords: underwater wireless sensor networks; UWSNs; fish swarm artificial algae algorithm; FS-AAA; optimised path routing.
    DOI: 10.1504/IJCVR.2024.10064408
     
  • Risk estimation of breast cancer patient with METABRIC clinical data: an elucidative study of machine learning algorithms with time sensitive information   Order a copy of this article
    by Rajan Prasad Tripathi, Sunil Kumar Khatri, Darelle Van Greunen, Danish Ather 
    Abstract: Breast cancer is a prevalent and life-altering disease that demands precise prognostic tools to guide treatment decisions. Machine learning (ML), with its data-driven capabilities, has emerged as a promising avenue for improving breast cancer prognosis. In this study, we harness the power of machine learning to predict breast cancer survival using clinical data sourced from the METABRIC dataset. Our research sheds light on the critical clinical factors that intimately influence patient outcomes. Among seven distinct algorithms evaluated, Logistic Regression stands out with the highest accuracy of 78%. Notably, our findings underscore the pivotal role of time-related data in enhancing predictive performance, advocating for its inclusion in future prognostic models. We identify positive correlations between survival and parameters such as tumour size and breast-conserving surgery, where the latter exhibits a correlation coefficient of 0.18. Conversely, a negative correlation emerges with breast mastectomy surgery, with a correlation coefficient of -0.18. This study not only points to robust machine learning models for prognosis but also highlights the intricate interplay between time-sensitive information and breast cancer prognosis. By doing so, it deepens our understanding of breast cancer prognosis and potentially informs more effective treatment strategies.
    Keywords: breast cancer; METABRIC; machine learning; patient survival; risk estimation.
    DOI: 10.1504/IJCVR.2024.10064463
     
  • Knee osteoarthritis using hybrid deep learning approach with SqueezeNet and ResNet   Order a copy of this article
    by A. Muthukumar, S. Singaravelan, R. Arun, V. Selvakumar, D. Arun Shunmugam, S. Balaganesh, P. Gopalsamy 
    Abstract: Osteoarthritis is a degenerative joint disease that affects a huge number of individuals around the world. Early detection and diagnosis of osteoarthritis is essential for effective treatment and management of the disease. Recently, thermal imaging has emerged as a promising non-invasive technique for detecting osteoarthritis. However, existing methods for osteoarthritis detection in thermal images suffer from several limitations, such as low accuracy, limited generalisability, and lack of interpretability. To address these difficulties, we propose a novel approach for osteoarthritis detection in thermal images using a hybrid ResNet-SqueezeNet deep learning architecture. The proposed approach involves pre-processing the thermal images to enhance their features, followed by segmentation to extract the region of interest.
    Keywords: ResNet; SqueezeNet; osteoarthritis; hyperparameters; nitty gritty investigation.
    DOI: 10.1504/IJCVR.2024.10064577
     
  • Fusion of periocular and forehead features for masked face recognition   Order a copy of this article
    by Diwakar Agarwal, Atual Bansal 
    Abstract: Most automated face recognition systems are not able to identify authorised users with masked faces. This paper proposes a face recognition system based on the fusion of information obtained from the mostly uncovered regions of the masked face, i.e., periocular and forehead. Deep features are extracted from the periocular region by using the pre-trained inception-v3 model, while handcrafted features such as local binary pattern (LBP) and histogram of oriented gradients (HOG) are extracted for the forehead region. The performance of the proposed method is evaluated on the Georgia Tech Face Database (GTDB), the Color FERET face image database, and a self-acquired masked face database. Experimental results show that the fusion of periocular with forehead LBP and HOG features achieved notable verification and recognition rates: 76.39% and 52%, respectively, on GTDB; 96.63% and 95.62%, respectively, on the Color FERET database; and 89.95% and 67%, respectively, on the self-acquired masked face database.
    Keywords: biometrics; forehead; fusion; masked face; multimodal; periocular.
    DOI: 10.1504/IJCVR.2024.10064605
     
  • A novel signature recognition system using a convolutional neural network and fuzzy classifier   Order a copy of this article
    by Ouafae El Melhaoui, Soukaina Benchaou, Redouan Zarrouk 
    Abstract: The present work provides a novel method for recognising signature images, based on two machine learning algorithms: a convolutional neural network (CNN) and a fuzzy min max classifier (FMMC). The new system goes through three phases: pre-processing, feature extraction and classification. First of all, a variety of pre-processing techniques are used to isolate the signature pixels from the background. The resulting images are scanned with multiple filters to perform the convolution and ReLU procedure. The pooling process is then applied. Finally, the resulting image pixels are flattened and used to feed the FMMC. Three systems containing the most used techniques, namely profile projection-FMMC, Loci-FMMC and CNN, have been compared to the proposed system. The first two models are used to prioritise the feature extraction method of our system, while the third model, CNN, is utilised to prioritise the FMMC as classifier. The experimental results show a good recognition rate of 97%, which confirms the effectiveness of the proposed structure.
    Keywords: convolutional neural network; CNN; fuzzy min max classification; FMMC; offline signature recognition.
    DOI: 10.1504/IJCVR.2024.10064681
     
  • Artificial intelligence and machine learning for eye-hand coordination test: a review   Order a copy of this article
    by Milind Shah, Avani Vasant 
    Abstract: Eye-hand coordination refers to the ability of an individual to visually perceive their surroundings and accurately react with their hands to interact with objects or perform tasks. This essential skill is crucial in various activities, including writing, driving a car, exercising, and participating in sports. It is central to understanding how the brain creates internal models of the action space and generates movement within it. Eye-hand coordination remains a very complex and elusive problem, further complicated by its distributed representation in the brain. Diseases and disorders such as autism spectrum disorders, cerebral palsy, developmental delays, or visual disorders occur due to poor eye-hand coordination. Significant advancements in technologies like artificial intelligence (AI), deep learning, and machine learning have revolutionised the enhancement of eye-hand coordination. The combination of these technologies has opened up new possibilities in understanding real-time scenarios and improving eye-hand coordination in human visual perception, computer vision, and robotic vision. This study reviews 27 articles and discusses the role of AI, deep learning, and machine learning for eye-hand coordination tests and its related challenges.
    Keywords: hand-eye coordination test; machine learning; deep learning; artificial intelligence; AI; robotic vision; eye-hand coordination test; skill recognition.

  • Skin cancer recognition with novel deep learning methodology on mobile platform   Order a copy of this article
    by Phillip Ly, Abhishek Verma, Doina Bein 
    Abstract: The goal of this research is to create mobile applications that can leverage the power of deep learning to detect skin cancer in the early phase and save lives. In this paper we present: 1) a novel deep learning-based methodology named feature extraction with data augmentation and fine-tuning (FEDAFT) to develop a compact, mobile-compatible model that performs effectively in both experimental and real-world situations; 2) our methodology is based on advanced data augmentation, transfer learning, and fine-tuning techniques and obtained top-1 accuracy of 88.35%, which is better than several other studies on the skin cancer dataset; 3) the model is successfully deployed on the iOS and Android mobile systems; 4) furthermore, we create a composite dataset from several existing datasets for improved recognition accuracy.
    Keywords: deep learning; skin cancer; melanoma; neural network; convolutional neural network; CNN; PHDB.
    DOI: 10.1504/IJCVR.2024.10064682
     
  • Semantic segmentation of agricultural aerial images using encoder-decoder models   Order a copy of this article
    by Rajagopal Rekha, M. Indhuja, S.K. Nivetha, R. Vaishnavi 
    Abstract: The agricultural sector is the backbone of almost all economies in the world. Agriculture supports 70% of the population and covers about 40% of the Earth's surface. A huge quantity of aerial images of agricultural lands is available due to the advent of agricultural drones (unmanned aerial vehicles). Deriving useful information from the captured images can help farmers in several ways to know the problems in their land. Semantic segmentation is the process of assigning a label to every pixel in the image and clustering together the parts of images that belong to the same object. Semantic image segmentation identifies where an object is located in the image, the shape of that object, and which pixel belongs to which object. This research work aims to segment six types of anomalies (cloud shadow, double plant, planter skip, standing water, water-way and weed cluster) in aerial farmland images that are most important to farmers. This is carried out on a subset of agriculture-vision, a large aerial image database for agricultural pattern analysis with nine types of anomalies. Encoder-decoder architectures are used, aiming to give a mean intersection over union (mIoU, the metric for model evaluation) greater than the existing work.
    Keywords: semantic segmentation; u-net; agriculture; aerial images; deep learning; Mean-IoU.

  • Enhanced feature extraction technique using multiple pre-trained convolutional neural networks for improved brain tumour detection in MRI images   Order a copy of this article
    by Michael Chi Seng Tang 
    Abstract: Accurate brain tumour detection in MRI images remains challenging due to the tumours' variability in appearance from image to image. This paper proposes an improved technique for identifying brain tumours in MRI images. To begin, features are extracted from MRI images using ResNet50 and ResNet101. The Chi-square test is then used to identify the five most significant features in each network. Concatenation of the features results in ten features per image. These features are used to train a K-nearest neighbour (KNN) classifier. The trained classifier is then evaluated on images from the testing set. The proposed method demonstrated accuracy, sensitivity, specificity, and precision values of 0.8492, 0.9351, 0.7143, and 0.8372, respectively. A comparison of performance demonstrates that the proposed method outperforms previously used methods for brain tumour detection on the same dataset in terms of accuracy, sensitivity, and precision.
    Keywords: brain tumour detection; feature extraction; image processing; machine learning; medical imaging; computer vision; convolutional neural network; CNN; MRI images; artificial intelligence; feature selection.
    DOI: 10.1504/IJCVR.2024.10065038
     
  • Automated kitchen waste segregation system via convolutional neural network   Order a copy of this article
    by Teh Boon Hong, Sarah Atifah Saruchi, Ain Atiqa Mustapha, Nor Aziyatul Izni, Wan Zailah Wan Said, Noor Idayu Mohd Tahir 
    Abstract: Composting is one of the efficient and practical methods to manage kitchen waste. The initial process of the composting system is the kitchen waste segregation between compostable and non-compostable categories. However, currently, the segregation process is carried out by human labour. Thus, to reduce the human labour burden, this study proposes an automated kitchen waste segregation system by deep learning method to classify kitchen waste into two groups: compostable and non-compostable. A convolutional neural network (CNN) model with different learning algorithms and several epochs is applied to perform the segregation. A prototype consisting of a camera, sensors, and motors is developed to validate the performance efficiency of the proposed model. Results show that the integration of CNN into the proposed kitchen waste segregation system manages to segregate the waste successfully without human involvement. This output is expected to contribute to supporting the waste management and composting campaign thus leading to a better environment.
    Keywords: kitchen waste; composting; convolutional neural network; CNN; internet of things; IoT; automation.
    DOI: 10.1504/IJCVR.2024.10065039
     
  • A novel spatial domain enhancement based image segmentation method with mean shift filtering and automated thresholding for leaf images on natural background   Order a copy of this article
    by Muhammed Shafi Koodakkal, Mohammed Ismail Bellary, Narayanan Sarala Sreekanth 
    Abstract: Identification of plants is essential for the production of fuel, healthy foods, medications, animal feed, and biodiversity conservation. It is difficult to recognize all of the plant species available in the world and identify some features that can be used to classify different plant species. Plant recognition relies on the shape and colour features of the leaves. The segmentation of a region from an image depends on background complexity. It is simpler if the leaf is on a light background, and more challenging on a natural background. We propose a method for the segmentation of leaf images on a natural background. One deep convolutional neural network has also been trained to segment leaves from four different datasets. Through experimental runs, the obtained results show that our method achieves an accuracy rate of 98%, with high PSNR and low RMSE scores, which is significantly better than the current state-of-the-art methods.
    Keywords: image segmentation; enhancement; mean shift filtering; mask RCNN.
    DOI: 10.1504/IJCVR.2024.10065087
     
  • A brief review on HCR works of Indian scripts   Order a copy of this article
    by Kalyanbrat Medhi, Diganta Kumar Pathak, Sanjib Kumar Kalita 
    Abstract: The aim of this paper is to describe the challenges faced by the researchers in the fields of optical character recognition (OCR) and handwritten character recognition (HCR) for Indian scripts. Several studies have been performed on OCR and HCR in Indian languages during the last few decades. The 11 Indian regional scripts are Devanagari, Bangla, Gujarati, Kannada, Malayalam, Oriya, Gurmukhi, Tamil, Assamese, Manipuri and Telugu. The aim of this paper is to present a review of the recent work done on OCR and the HCR of Indian scripts in the last few years, appearing in relevant conferences and journals. The survey is organised into five sections. Section 1 provides an introduction of various works, followed by components of OCR in Section 2. The properties of Indian scripts are presented in Section 3. Section 4 discusses different OCR methods and OCR research work done on Indian scripts. Section 5 concludes the paper.
    Keywords: Indian scripts OCR; Indian handwritten character recognition; OCR survey; document analysis.
    DOI: 10.1504/IJCVR.2024.10065452
     
  • Application of artificial bee colony algorithm as numerical solution for first order IVPs and industrial robot arm control problem   Order a copy of this article
    by V. Murugesh, G. Sanjiv Rao, Manoj Singhal, Rajnesh Singh, Sunil Gupta 
    Abstract: The current research article presents a novel analytical method that uses the artificial bee colony (ABC) algorithm to solve the industrial robot arm control problem and first order initial value-based ordinary differential equations (ODEs). The current study took ten problems in addition to the industrial robot arm control problem into consideration to establish the effective outcomes of the ABC algorithm. Measured against the actual solutions, the outcomes were compared with the results from the RK-Gill algorithm and the RK-Butcher algorithm and were inferred to be highly accurate. The results infer that the ABC algorithm is easy to implement and obtains a solution for any time period.
    Keywords: Runge-Kutta method: RK-Butcher algorithm; RK-Gill algorithm; ABC algorithm: ODEs; artificial bee colony; ordinary differential equations; first order IVPs.
    DOI: 10.1504/IJCVR.2024.10065106
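
For context on one of the baselines named in the abstract above, the following is a minimal sketch of the Runge-Kutta-Gill scheme applied to a first-order IVP. The test problem y' = -y with y(0) = 1, the step count, and the function names are illustrative assumptions and are not taken from the paper; the ABC-based solver itself is not reproduced here.

```python
import math

SQRT2 = math.sqrt(2.0)


def rk_gill_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h with the Runge-Kutta-Gill scheme."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h * ((-0.5 + 1 / SQRT2) * k1 + (1 - 1 / SQRT2) * k2))
    k4 = f(t + h, y + h * (-(1 / SQRT2) * k2 + (1 + 1 / SQRT2) * k3))
    return y + h / 6 * (k1 + (2 - SQRT2) * k2 + (2 + SQRT2) * k3 + k4)


def solve_ivp_gill(f, t0, y0, t_end, n_steps):
    """Integrate from t0 to t_end in n_steps equal steps and return y(t_end)."""
    h = (t_end - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = rk_gill_step(f, t, y, h)
        t += h
    return y


# Hypothetical test problem: y' = -y, y(0) = 1, exact solution exp(-t).
approx = solve_ivp_gill(lambda t, y: -y, 0.0, 1.0, 1.0, 10)
error = abs(approx - math.exp(-1.0))
```

Comparing `error` across solvers on problems with known exact solutions is one plausible way to carry out the kind of accuracy comparison the abstract describes.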
     
  • Colour-spin: a novel PCL descriptor for 3D object recognition and detection   Order a copy of this article
    by Mohamed Hannat, Nabila Zrira, Fatima Zahra Ouadiay, Mohammed Majid Himmi, El Houssine Bouyakhf 
    Abstract: This paper addresses the challenge of three-dimensional (3D) object recognition and detection in real-world scenes by introducing a new feature descriptor, the colour-spin (CSpin). CSpin combines spin image features and RGB colour information, leading to enhanced feature robustness. The performance evaluation shows that CSpin significantly outperforms the traditional spin descriptor by improving accuracy by 11%. Although the colour-signature of histograms of orientations (CSHOT) descriptor presented better accuracy, CSpin’s lower computational time makes it more suitable for real-time applications. Additionally, the 3D bag of words (3D BoW) model was optimised using a support vector machine (SVM) with a nonlinear radial basis function (RBF) kernel and a codebook size of 250. Our approach achieved an overall recognition accuracy of 96.42% on the publicly available RGBD Washington dataset, outperforming other state-of-the-art methods. The proposed solution shows considerable promise for practical applications in 3D object recognition and detection.
    Keywords: 3D object recognition; 3D object detection; colour-spin descriptor; CSpin; support vector machine; SVM; bag of words; BoW; point cloud library; PCL; RGBD Washington dataset.
    DOI: 10.1504/IJCVR.2024.10065156
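
As a rough illustration of the nonlinear RBF kernel mentioned in the abstract above, the sketch below evaluates the standard form K(x, y) = exp(-gamma * ||x - y||^2) on two bag-of-words histograms. The histogram values and the `gamma` setting are hypothetical; the abstract gives only the codebook size (250), not the kernel parameters.

```python
import math


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical, shortened BoW histograms (a real codebook would have 250 bins).
hist_a = [0.2, 0.5, 0.3]
hist_b = [0.1, 0.6, 0.3]
similarity = rbf_kernel(hist_a, hist_b, gamma=0.5)
```

The kernel equals 1 for identical histograms and decays towards 0 as they diverge, which is what lets an SVM separate classes that are not linearly separable in histogram space.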
     
  • Maximum power generation and conversion using hybrid technique for grid connected PV system   Order a copy of this article
    by A. Renjith, P. Selvam 
    Abstract: This paper presents an adaptive dingo optimiser (ADO) and a coyote optimisation algorithm (COA) for maximum power generation from PV panels, even under low irradiance and changing climatic conditions. ADO combines the arithmetic optimisation algorithm (AOA) and the dingo optimiser (DO), and is based on dingo behaviour and mathematical calculations. COA, which is based on the social behaviour of coyotes, is used to ensure power stability during the entire process. The novelty of this paper is that it produces maximum power under different conditions and maintains stable power for any type of load and demand by controlling the inverter and converter signals using the proposed algorithms. The maximum PV panel power is modelled on the MATLAB/Simulink platform, and the findings of the system efficiency study are also validated. The simulation results show that the generated output power is maximised and that stability is improved for any type of load when compared with existing technology.
    Keywords: irradiance; PV panel; maximum power generation; adaptive dingo optimiser; ADO; coyote optimisation algorithm; COA.
    DOI: 10.1504/IJCVR.2024.10065174
     
  • Analysis of COVID-19 symptoms using machine learning and robotic process automation   Order a copy of this article
    by Gireesh Kumar, Richa Sharma 
    Abstract: Coronavirus disease 2019 (COVID-19) is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a betacoronavirus strain. The disease quickly spread worldwide, resulting in the COVID-19 pandemic. The contagiousness of the virus resulted in widespread infections and deaths, since no medically proven treatment was available. The primary clinical symptoms are fever, cough, sore throat, shortness of breath and headache. This study aims to train a model using machine learning (ML) and robotics process automation (RPA) to predict infections, analyse symptoms and predict vulnerability. This study analyses the symptoms to determine clinical significance and ranks the symptoms based on their significance and gender. Additionally, the model studies the effect of age on vulnerability towards infection. By automating the symptom analysis process, we aim to improve the efficiency and accuracy of COVID-19 diagnosis, ultimately aiding healthcare professionals in making informed decisions. The integration of ML and RPA holds the potential to revolutionise how healthcare systems address not only the current COVID-19 crisis but also future challenges in the rapidly evolving landscape of infectious diseases.
    Keywords: clinical; symptom; pandemic; machine learning; ML; robotics process automation; RPA.
    DOI: 10.1504/IJCVR.2024.10065561
     
  • Quantum support vector machine: a novel approach for predicting students academic performance in higher education   Order a copy of this article
    by Iti Batra, Subhranil Som, Pawan Whig 
    Abstract: This research paper investigates the impact of non-intellectual parameters on students' academic growth, emphasising the transformative role of data mining in education. Through psychometric analyses of students' behaviour and learning patterns, we aim to enhance academic performance. Previous studies, employing various mining techniques like neural networks, decision trees, KNN, naive Bayes, and SVM, have yielded accuracy rates below 89%. In response, our study proposes a novel approach, leveraging the quantum support vector machine (QSVM) for data classification to predict learners' cumulative grade point index (CGPI). Results demonstrate a significant advancement with the radial basis function kernel in the QSVM model, achieving an impressive accuracy of approximately 98%. This breakthrough underscores the superior predictive capabilities of QSVM, marking a substantial improvement over existing methodologies. The findings suggest that QSVM holds great promise in accurately predicting academic performance based on psychological factors, paving the way for enhanced educational outcomes.
    Keywords: non-intellectual parameters; academic performance; study habits; data mining; psychometric analyses; educational outcomes; psychological factors; neural networks; decision trees; K-nearest neighbours; KNN; naïve Bayes; support vector machines; SVM; quantum support vector machine; QSVM.
    DOI: 10.1504/IJCVR.2024.10065562
     
  • ROI adaptive lossy compression in dermatology using deep learning and superpixels algorithm   Order a copy of this article
    by Saida Lemnadjlia, Melouah Ahlem, Amel Slim 
    Abstract: Medical images play a vital role in diagnosing and monitoring diseases, yet storage challenges arise due to their size. The necessity for compression becomes evident, though traditional methods compromise image quality. This study proposes a solution by selectively degrading non-critical image portions, adjusting the compression ratio based on content. The relevance of each segment is determined using a deep learning (DL) and super-pixels combination. Initially, a super-pixels method groups homogeneous pixels, and then the image is split into partitions, each classified by a DL model. A compression factor guides the ratio and degradation. Evaluation on skin images reveals superior results (CR: 42.07, PSNR: 63.87, MSE: 1.15, SS: 97.62) compared to existing techniques, affirming the method’s success.
    Keywords: deep learning; super pixel; image compression; jpeg; adaptive; loss of data; medical images; quality factor; semantic.
    DOI: 10.1504/IJCVR.2024.10065654
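
Two of the quality metrics reported in the abstract above, MSE and PSNR, can be sketched with their standard definitions as follows. This is a minimal illustration on flat lists of pixel intensities; the peak value of 255 (8-bit images) and the sample data are assumptions, since the abstract does not state how the metrics were scaled or normalised.

```python
import math


def mse(original, compressed):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)


def psnr(original, compressed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, compressed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical 4-pixel example: one pixel differs by 2 intensity levels.
reference = [0, 0, 0, 0]
degraded = [0, 0, 0, 2]
example_mse = mse(reference, degraded)
example_psnr = psnr(reference, degraded)
```

In a region-adaptive scheme, these metrics could be computed separately for critical and non-critical regions to check that degradation is confined to the latter.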
     
  • Robotics and autonomous systems: metaheuristic optimisation and deep learning   Order a copy of this article
    by Le Tuan Anh, Nguyen Van Duc, Trung Van Nguyen, Nguyen Thuy Dung, Nguyen Ho Quang, Pham Van Huy, Than Le 
    Abstract: Autonomous systems and mobile robots are increasingly developed and applied in many areas of life. This paper focuses on navigation and SLAM for autonomous systems, with the goal of creating stable systems that operate in dynamic environments. Firstly, a solution built on simulated annealing, a probabilistic technique for approximating the global optimum, addresses the metaheuristic optimisation problem: a search algorithm using a high-level procedure finds the optimal trajectory, which remains a challenge for autonomous systems and robotics in unstructured environments. Next, movement in the unknown environment is handled by the tangent bug algorithm, which helps the robot avoid collisions and move to the target. In addition, the system uses deep learning algorithms to identify users through the built-in camera. The developed robot can plan an optimal trajectory to its destination, which reduces the cost of the system, move from its current position to the required position in an unknown environment, and analyse and recognise the user's face.
    Keywords: motion planning; autonomous system; deep learning; local minimum; simulated annealing.
    DOI: 10.1504/IJCVR.2025.10065668
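
The simulated annealing idea mentioned in the abstract above can be sketched generically as follows. This is not the authors' trajectory planner: the one-dimensional cost function, the neighbour move, the geometric cooling schedule and all parameter values are illustrative assumptions.

```python
import math
import random


def simulated_annealing(cost, start, neighbour, t_start=1.0, cooling=0.995,
                        n_iter=5000, seed=0):
    """Minimise `cost` with the Metropolis acceptance rule and geometric cooling."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t_start
    for _ in range(n_iter):
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


# Hypothetical toy problem: minimise x^2 starting far from the optimum.
best_x, best_cost = simulated_annealing(
    cost=lambda x: x * x,
    start=10.0,
    neighbour=lambda x, rng: x + rng.uniform(-1.0, 1.0),
)
```

For trajectory planning, the state would be a candidate path and the cost a path length or travel cost; occasionally accepting worse candidates at high temperature is what allows the search to escape local minima.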
     
  • A novel approach of multi-classification of brain tumour with MRI images using transfer learning   Order a copy of this article
    by Rashmi Jolhe, Sudhir D. Sawarkar 
    Abstract: Brain tumour identification via MRI is vital for early diagnosis and treatment planning, yet manual detection is time-consuming and accuracy varies. Our framework employs transfer learning, using fine-tuned pre-trained networks to extract deep features from MR images. We fine-tuned and evaluated six pre-trained neural networks for a brain tumour classification challenge, aiming to enhance computation efficiency and mitigate overfitting risks. Standardised techniques were applied to tune model weights and extract relevant features, tailored for the 4-class tumour categorisation task. Both quantitative measures like classification accuracy and convergence speed, and qualitative aspects like required training resources, were analysed. Findings highlight the promise of properly fine-tuned pre-trained models to boost accuracy and efficiency for critical healthcare tasks. This evaluation establishes transfer learning as impactful for adapting computer vision models to real medical applications. Achieving 98.74% accuracy, our model addressed overfitting issues and yielded favourable computational results.
    Keywords: deep learning; magnetic resonance imaging; MRI; glioma; meningioma and pituitary; perfusion MRI.
    DOI: 10.1504/IJCVR.2024.10065707
     
  • Energy efficient and secure data aggregation based on rock hyrax swarm optimisation and deep recurrent auto encoder in WSNs   Order a copy of this article
    by K. Hemalatha, M. Amanullah 
    Abstract: In wireless sensor networks (WSNs), the sensor nodes typically have limited capabilities in terms of sensing, computation, and communication. To improve the overall lifetime and reliability of these networks, it is crucial to implement energy-efficient and secure data aggregation techniques. These approaches aim to optimise the usage of available resources while ensuring the integrity and confidentiality of the aggregated data. By employing such methods, WSNs can operate more efficiently and effectively, enabling a longer lifespan and improved reliability. This study presents a novel hybrid methodology to enhance data aggregation performance and security in WSNs. The methodology utilises rock hyrax swarm optimisation (RHSO) for feature extraction, inspired by the natural behaviours of rock hyrax swarms in search of food. Additionally, an attention recurrent autoencoder (ARAE) model, which combines autoencoder architecture with attention mechanisms and recurrent neural networks, is employed for classification.
    Keywords: data aggregation; rock hyrax swarm optimisation; RHSO model; optimisation; recurrent method; auto encoder model; attack detection; wireless sensor networks; WSNs.
    DOI: 10.1504/IJCVR.2024.10065783
     
  • Efficient high-speed and low-area hardware architectures of Lilliput cipher for resource-constrained RFID applications   Order a copy of this article
    by Pulkit Singh, Rachana Chandupatla, Archana Nagar, Bibhudendra Acharya 
    Abstract: The development of communication networks has made information security more important than ever for both transmission and storage. Secure communication suffers from resource limitations. Lightweight cryptographic algorithms are developed to work on constrained devices where conventional cryptography cannot be applied efficiently. In this paper, an 8-bit hardware architecture of the Lilliput cipher is proposed, which takes 8 bits of the 64-bit plaintext at a time to perform encryption and decryption operations. The proposed hardware architecture suits devices that operate on a low-bit input-output configuration, such as microcontrollers. Moreover, a loop unrolling technique is proposed to decrease latency and improve the throughput of the hardware architecture. The proposed 8-bit and unrolled architectures are implemented on Spartan and Virtex FPGA platforms for different unroll factors. The results demonstrate that the proposed designs have advantages over previous similar state-of-the-art papers in terms of throughput/area, throughput, and execution speed.
    Keywords: architecture; radio frequency identification; RFID; field programmable gate array; FPGA; resource-constrained; cipher.
    DOI: 10.1504/IJCVR.2024.10065889
     
  • Self-supervised learning - recent advancements, global trends and research directions   Order a copy of this article
    by Narinder Kaur Seera, Neha Gupta 
    Abstract: Self-supervised learning (SSL) has emerged as a novel machine learning paradigm that enables learning feature representations when annotated data is scarce. In SSL, a model is trained on pretext tasks which extract useful representations from the available data to perform various downstream tasks. This study conducts a comprehensive review of the different approaches to self-supervised learning (predictive, generative and contrastive) and of recent developments in this field. This paper also highlights the significant contributions of SSL in computer vision and natural language processing (NLP), with its various applications in similar domains. It has been found that generative and contrastive models have made significant contributions in the fields of medical imaging and video classification. Finally, the study presents a set of open research problems in SSL and future directions for the readers. This paper attempts to provide a complete study of self-supervised learning covering all major achievements, challenges and progress since its inception.
    Keywords: self-supervised learning; SSL; pretext task; generative learning; contrastive learning; computer vision; natural language processing; NLP.
    DOI: 10.1504/IJCVR.2024.10065969
     
  • A very effective approach for the generation of cryptographic keys based on safe EEG features   Order a copy of this article
    by T. Senthil Kumar, L. Mohana Sundari, M. Bharaneedharan, C. Saravanakumar 
    Abstract: The term biometric-based cryptographic key generation refers to a specific kind of data mining in which knowledge discovery methods are used to obtain biometric data in order to create cryptographic keys for the purpose of encrypting secure data. This method is considered to be a data mining strategy. The science of concealing information is known as cryptography, and when combined with cognitive science, the solutions of cryptography establish a new branch of the field: cognitive cryptography. In this article, we provide a novel approach that we believe will reduce the inaccuracy associated with the electroencephalogram-based key creation process. The new approach is based on the window segmentation protocol, and all of the characteristics are first transformed to binary form before being used as the input for the new method. We determined that the mean HTER for the single-channel system was 0.48%.
    Keywords: security; cryptography; electroencephalogram; EEG; biometric cryptosystem.
    DOI: 10.1504/IJCVR.2024.10066093
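
The HTER figure quoted in the abstract above is, under its usual definition, the mean of the false acceptance rate (FAR) and the false rejection rate (FRR). A minimal sketch follows; the attempt counts are hypothetical and the abstract does not state how its own rates were measured.

```python
def error_rate(errors, attempts):
    """Fraction of attempts that ended in an error."""
    return errors / attempts


def half_total_error_rate(far, frr):
    """HTER: the average of the false acceptance and false rejection rates."""
    return (far + frr) / 2.0


# Hypothetical counts: 4 false accepts in 1000 impostor attempts,
# 6 false rejects in 1000 genuine attempts.
far = error_rate(4, 1000)
frr = error_rate(6, 1000)
hter = half_total_error_rate(far, frr)
```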
     
  • Improving license plate detection with YOLO-LPD algorithm   Order a copy of this article
    by Sahil Khokhar, Deepak Kedia 
    Abstract: License plate detection is a pivotal task with implications across diverse domains, including traffic management, law enforcement, and parking administration. This research paper introduces YOLO-LPD, an innovative evolution of the YOLOv7 algorithm, engineered to excel in license plate detection. YOLO-LPD seamlessly integrates novel data augmentation techniques that significantly enhance the diversity and richness of the training dataset, alongside optimising the detection head of the YOLO algorithm to augment feature extraction and representation. Through systematic comparisons against the baseline YOLOv7 algorithm and contemporary state-of-the-art methodologies, YOLO-LPD emerges as the frontrunner, boasting a remarkable F-score of 98.97%. This outstanding performance resonates across the intricacies of license plate detection, effectively surmounting challenges posed by adverse lighting conditions, partial occlusion, and low visibility scenarios. The practical ramifications of YOLO-LPD's advancements are extensive, from enhancing traffic flow to empowering law enforcement and optimising parking management.
    Keywords: automatic license plate recognition; ALPR; object detection; deep learning; machine learning; computer vision; intelligent transportation system.
    DOI: 10.1504/IJCVR.2024.10066190
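
The F-score reported in the abstract above is conventionally the harmonic mean of precision and recall. The sketch below assumes the balanced F1 form and uses hypothetical detection counts; the abstract does not specify the matching criterion (for example, an IoU threshold) used to count true positives.

```python
def f_score(true_pos, false_pos, false_neg):
    """Balanced F-score (F1) from detection counts; 0.0 when nothing is detected correctly."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 plates found correctly, 10 false alarms, 10 plates missed.
example_f = f_score(90, 10, 10)
```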
     
  • A modified-two-fold-deep-learning-classifier paradigm for crop disease detection   Order a copy of this article
    by M. Chithambarathanu, M.K. Jeyakumar 
    Abstract: In this research work, a novel modified-two-fold-deep-learning-classifier paradigm is introduced for crop disease detection. The collected raw images are pre-processed via median filtering (for noise removal) and HE (for contrast enhancement). Then, from the pre-processed images, improved texture features (I-CLBP, GMCM, AACM, and EACM) and colour features [(RGB), (HSV) or (HSB)] are extracted. Among the extracted features, the optimal features are selected using the proposed THBOA, a hybrid optimisation model that conceptually enhances the standard HBA and TSO. The leaf disease detection phase is modelled with a modified-two-fold-deep-learning-classifiers approach. The first phase comprises the bidirectional LSTM (Bi-LSTM) and ARNN classifiers, both of which are trained using the optimally selected features. The outcome from Bi-LSTM and ARNN is fed as input to the proposed modified CNN (M-CNN), the ultimate decision maker. To further enhance the detection accuracy, the loss function of the CNN is modified: instead of the entropy-based loss function, the RMSE is computed. The final detected outcomes (presence/absence of crop disease) are acquired from the modified CNN. The proposed model is validated against existing models in terms of accuracy, precision, and sensitivity.
    Keywords: plant diseases and pests; digital image processing; I-CLBP; tuna honey badger optimisation algorithm; THBOA; modified-two-fold-deep-learning-classifiers.
    DOI: 10.1504/IJCVR.2024.10066290
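
The median-filtering pre-processing step named in the abstract above can be sketched as follows. This is a minimal 3x3 median filter on a greyscale image stored as a list of rows; the window size, the border handling (edge pixels are replicated) and the sample image are assumptions, as the abstract does not give these details.

```python
def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighbourhood (borders replicated)."""
    rows, cols = len(image), len(image[0])

    def clamp(index, upper):
        return max(0, min(index, upper))

    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = sorted(
                image[clamp(r + dr, rows - 1)][clamp(c + dc, cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
            out_row.append(window[4])  # middle of the 9 sorted values
        filtered.append(out_row)
    return filtered


# Hypothetical image with a single "salt" noise pixel in the centre.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
cleaned = median_filter_3x3(noisy)
```

Because the median ignores isolated extreme values, the outlier is removed without blurring the surrounding region the way a mean filter would.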
     
  • Detecting sleepiness of the driver using computer vision and deep learning techniques   Order a copy of this article
    by V. Pradeep, Nisha Tellis, P. Namratha, S. Shravya, Vshker Mayengbam 
    Abstract: The majority of auto accidents are caused by drowsy driving, and so we present a reliable and intelligent method for detecting it. The proposed method obtains the input by capturing the facial features of the driver by installing a camera inside the vehicle. The system applies the Viola-Jones object detection framework that uses Haar-like features and AdaBoost learning for feature selection for initially identifying and extracting the face from the image and then identifying the eye region on the face. The deep convolutional neural network model with Keras uses the input data, which will classify different levels of drowsiness. Eye closure techniques are used to determine whether the driver is feeling fatigued. The system evaluates the drivers’ level of fatigue and, if necessary, issues a warning message. Our experimental results validate the effectiveness of our approach.
    Keywords: drowsiness detection; computer vision; facial expression recognition; convolutional neural network; deep learning.
    DOI: 10.1504/IJCVR.2024.10066345
     
  • Weed identification from field crop images using deep learning techniques   Order a copy of this article
    by Sulakshana B. Mane, Kiran Shrimant Kakade, L. Sudha, Sankarsan Panda 
    Abstract: Weed identification is very important when constructing a weed control system based on deep learning. Deep learning techniques, trained on photos of both weeds and crops, assist in building a weed identification model. Only a limited number of studies have evaluated the potential impact that different photo backgrounds may have on deep learning algorithms used for weed identification. The models attained average F1-scores of 77.5% and 68.4%, respectively. Using both uniform and non-uniform background photographs when building the model improved the performance of both the VGG16 and ResNet50 models, with average F1-scores in the range of 92% to 99%.
    Keywords: learning techniques; convolutional neural; visual group geometry; moving cameras.
    DOI: 10.1504/IJCVR.2024.10066346
     
  • Multilevel classification of disease in plants with IoT using hybrid optimisation algorithm   Order a copy of this article
    by Monalisa Mishra, Bibudhendu Pati, Prasenjit Choudhury 
    Abstract: The internet of things (IoT) is a technology deployed in most of the world's infrastructure, built on the concept of connecting every device to collect, contribute, experience, and analyse information. Disease in plants is most commonly noticed by inspecting leaves, and hence IoT helps in the early detection of diseases. This research paper concentrates on multilevel classification using a DL-enabled model that is trained by a hybridised optimisation algorithm. The DL models used are LeNet for plant type classification and SqueezeNet for multiclass plant disease detection. The training of these DL models is done by the proposed NMBEO, which combines the Namib beetle optimisation (NBO) algorithm, the mayfly algorithm (MA) and the bald eagle search (BES) algorithm. Before plant types are classified, features are extracted from the segmented plant leaves. UNet, trained by NMBEO, is used to segment the plant leaves. Moreover, anisotropic filtering is applied to the input leaf images, which are obtained from simulated IoT data.
    Keywords: multilevel classification; plant leaf disease; internet of things; IoT; SqueezeNet; LeNet.
    DOI: 10.1504/IJCVR.2024.10066443
     
  • Parallel U-Net based virtual try-on system - VIRTUALTN   Order a copy of this article
    by Ved Patel, Yash Dalvi, Ameya Kulkarni, Anuj Neema, Dipti Jadhav 
    Abstract: Virtual try-on is a configurable person creation framework that includes virtual try-on and other fashion modification tasks. The authors propose a virtual try-on method, VIRTUALTN, that generates clothing effects that are not possible with existing work, such as varied garment interactions (for example, wearing a top tucked into or over a bottom) and layering of numerous clothes of the same type. VIRTUALTN specifically records each garment's shape and texture, allowing these elements to be changed independently. The authors present a virtual try-on feature, based on a parallel U-Net architecture and image processing, that shows customers how clothes look on actual models of various body shapes and sizes. This feature is achieved by a new generative AI model that generates life-like depictions of clothing. The objective evaluation of the proposed technique based on SSIM and FID shows acceptable results. The performance analysis shows that the proposed technique is comparable with many state-of-the-art techniques.
    Keywords: virtual try on; DeepFashion; multimodal dataset; image processing; structural similarity index; Fréchet Inception Distance; FID.
    DOI: 10.1504/IJCVR.2024.10066444
     
  • UAV-based detection of landmines using infrared thermography   Order a copy of this article
    by Muhammad Umair Akram Butt, Zaighum Naveed, Usama Javed 
    Abstract: Landmines remain a pervasive threat in conflict-affected regions worldwide, exacting a toll on innocent lives. Approximately every two hours, another individual becomes a victim of these lethal explosive devices (Landmines Monitor 2023, 2023), with a significant proportion being innocent civilians. Current methods for landmine detection suffer from inefficiency, high costs, and risks to the operator and system integrity. In this paper, we present a novel, efficient, safe, and cost-effective approach to unearth these hidden dangers. Our proposed method integrates an unmanned aerial vehicle (UAV) with a thermal camera to capture high-resolution images of minefields. These images are subsequently transmitted to a base computer, where a state-of-the-art image processing algorithm is applied to identify the presence of landmines. Notably, our solution performs exceptionally well, particularly during evening hours, achieving an impressive detection accuracy of nearly 88%. These results exhibit great promise when compared to existing methods constrained by their design limitations.
    Keywords: IR thermography; landmines detection; unmanned aerial vehicle; UAV; remote sensing; drone.
    DOI: 10.1504/IJCVR.2024.10066473
     
  • Preventing crop damage due to animals using deep learning   Order a copy of this article
    by Naeem Ahmad, Rizwan Alam, Shuchi Sethi 
    Abstract: Even though technology has matured considerably, farmers are still following the old methods of dealing with animals in crop fields. Alongside fences and repellents for crop protection, the best method is constant monitoring of crop fields, which is often used in farming practice. This constant monitoring makes sure that everything on the field goes as planned. Constant monitoring through traditional methods is very difficult for farmers, especially on rainy or foggy days and when the land is near forests. To deal with this perennial problem, a framework is proposed that utilises modern technology such as deep learning and agriculture internet of things (AIoT) devices. In this framework, a deep learning technique is applied to video segments generated by video surveillance systems to monitor animal gatherings on agricultural land. The prediction method applied in this framework, Faster R-CNN, has shown good accuracy. To justify our results, we evaluated the model on different performance metrics using a modified open image dataset of the animal class.
    Keywords: smart agriculture; deep learning; crop damage; animals; object detection.
    DOI: 10.1504/IJCVR.2024.10066616
     
  • Human activity recognition using binarised spiking neural network with remora optimisation algorithm   Order a copy of this article
    by T. Senthil Prakash, Francis H. Shajin, Rajesh Kumar Singh, T. Eswarlal 
    Abstract: Human activity recognition (HAR) has great advantages in many applications, such as healthcare, entertainment, and human activity monitoring. Several existing classification methods do not accurately identify human activities. To overcome this issue, human activity recognition using a binarised spiking neural network (BSNN) with the remora optimisation algorithm (ROA) is proposed to recognise human transition motion. Initially, data are taken from the human activities and postural transitions (HAPT) dataset. The input data are then pre-processed by a structural interval gradient filter to remove noise. The pre-processed data are fed to ternary pattern and discrete wavelet transforms to extract features from the data collected by the sensors. The BSNN model is used to improve the human activity identification rate. The weight parameters of the BSNN are optimised by ROA to accurately recognise transitions and their activities. The method is implemented in Python.
    Keywords: binarised spiking neural network; BSNN; structural interval gradient filter; human activity recognition; HAR; remora optimisation algorithm; ROA; ternary pattern; discrete wavelet transforms.
    DOI: 10.1504/IJCVR.2024.10066617
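
The discrete wavelet transform used for feature extraction in the abstract above can be illustrated with a single-level Haar decomposition. The choice of the Haar wavelet, the single decomposition level and the sample values are assumptions; the abstract does not name the wavelet family or the depth used.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


# Hypothetical accelerometer samples from one sensor axis.
samples = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_dwt(samples)
```

The approximation coefficients capture the slow trend of the signal and the detail coefficients capture rapid changes, which is why such coefficients are often used as features for short events like postural transitions.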
     
  • Artificial intelligence control learning for autonomous industrial robots   Order a copy of this article
    by Amit Yadav, Jyoti Vimal, Aseem Chandel, Trapti Jain, Dushyant Kumar Singh 
    Abstract: A recent technological shift towards artificial intelligence can be clearly observed in intelligent robots, which are based on control algorithms developed using artificial intelligence. The idea of an intelligent robot comes from human intelligence, speed of thinking, capacity for balance and real-time decision making in crowded environments. Such a robot can avoid obstacles and reach its destination safely. The research involves two main components. The first relates to mathematical modelling of the robot in terms of kinematic, dynamic and electric drive models. The second concerns filtering, optimisation, system identification and the development of a search algorithm for the robot. These methods should handle three cases: 1) the obstacle is static; 2) the obstacle is moving slowly; 3) the obstacle is moving fast. The search algorithm will provide human-like intelligence to the robot. The approach assumes external sensors and a control algorithm to complete the task.
    Keywords: control learning techniques; CLT; computer vision; expert system.
    DOI: 10.1504/IJCVR.2024.10066618
     
  • Performance evaluation of neural network models to classify day-5 human embryos   Order a copy of this article
    by B.R. Shobha Rani, S. Bharathi, Balachander Agoramurthy, Srivatsa Sanath 
    Abstract: The success rate of the most recent ART treatments for infertility has significantly increased over time, rising from 25%-30% to 65%-70%, while simultaneously minimising miscarriages and birth defects. This paper presents an in-depth comparison of neural network models and proposes a state-of-the-art model to automate the embryo detection process with a conventional microscopic image. Neural network models such as Faster R-CNN, YOLO and Mask R-CNN were implemented, and the assessment found Mask R-CNN to be suitable for embryo detection in the process of embryo grading. Detection is an important step in automating an embryo grading system, which helps embryologists to transfer viable embryos resulting in successful pregnancy. The approach implemented in this paper was compared with other neural network models such as Faster R-CNN and YOLO, and Mask R-CNN with ResNet101 as backbone outperformed the other models in precision and recall. Selecting an optimised detection algorithm helps further in the grading process, resulting in successful pregnancy.
    Keywords: in vitro fertilisation; IVF; inner cell mass; ICM; trophectoderm; TE; region proposal network; RPN; Mask R-CNN; Faster R-CNN; you only look once; YOLO; hyperparameters.
    DOI: 10.1504/IJCVR.2024.10066666
     
  • A combined approach of radon and Fourier transform for multimodal medical image registration   Order a copy of this article
    by Manjusha Deshmukh, Sheetal Bukkawar 
    Abstract: Image registration is an essential imaging technique used to align two or more images of the same scene. Different registration methods are required for specific situations, because geometric and radiometric distortions, noise, data characteristics, and required accuracy vary across tasks. This paper proposes a method to identify features from a given set of radon projections. The radon transform converts rotation into translation, and the magnitude of the DFT is invariant to translation, so features extracted by combining the two are invariant to rotation. The combined radon and Fourier transform (RTFT) approach is used to register images in different frequency bands for image fusion applications. The objective of this work is to address variations in scale, translation, and rotation. The algorithm is tested on PET and MRI image datasets of around 23 images. Across the datasets tested, the average computation time is on the order of seconds, depending on image size, with an accuracy of 98%.
    Keywords: image registration; image fusion; radon transform; Fourier transform; image matching.
    DOI: 10.1504/IJCVR.2024.10066771
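
The translation-invariance property stated in the abstract above can be checked numerically. The following is a minimal sketch, not the authors' RTFT method: it assumes that a rotation of the image appears as a circular shift along the angle axis of the radon projections, and it uses a made-up one-dimensional profile (`angular_profile`) in place of real projection data. The helper names are hypothetical.

```python
import cmath


def dft_magnitude(signal):
    """Return |DFT(signal)|, computed directly from the definition."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def circular_shift(signal, shift):
    """Shift a list to the right by `shift` positions, wrapping around."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift]


# Hypothetical stand-in for one row of radon projection data taken along
# the angle axis; a rotation is modelled here as a circular shift.
angular_profile = [0.0, 1.0, 3.0, 7.0, 3.0, 1.0, 0.0, 0.0]
shifted_profile = circular_shift(angular_profile, 3)

# The samples move, but the DFT magnitudes agree up to rounding error.
max_difference = max(
    abs(a - b)
    for a, b in zip(dft_magnitude(angular_profile),
                    dft_magnitude(shifted_profile))
)
print("largest magnitude difference:", max_difference)
```

The shift changes only the phase of each DFT coefficient, which is why the magnitudes match; a practical implementation would use an FFT library rather than this direct sum.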
     
  • Enhancing crop disease prediction in tropical regions through GAN-based transfer learning
    by Amol Bhilare, Niraj Patel, Debabrata Swain, Hargeet Kaur 
    Abstract: Agriculture is one of the areas critical to the survival of humanity. At the same time, digitisation has influenced many areas, making complex operations easier to execute, and using technology in farming benefits both the producer and the consumer. Agriculture has a significant impact on the economy of developing countries. In India, 40%-50% of people still depend on agriculture as an essential source of income. Sustaining crops becomes harder when they are afflicted by multiple diseases, and it is difficult for the average person to recognise disease patterns in plant leaves and find a remedy. Through technology and continuous vigilance, pathogens can be recognised early and treated, resulting in higher agricultural production. Here, a transfer learning-based framework is proposed to detect diseases in tropical-region crops such as potato and tomato. The framework uses EfficientNet-B5 as the transfer learning model and applies re-sampling techniques such as augmentation and generative adversarial networks (GANs). The EfficientNet-B5 model achieved its highest validation accuracy, 99.73%, on GAN-generated images.
    Keywords: tropical region crop; generative adversarial network; augmentation; EfficientNet; Softmax; Adam optimiser; transfer learning.
    DOI: 10.1504/IJCVR.2024.10066824
     
  • Electronic health information systems to improve the detection of disease
    by S.B. Mohan, Mukil Alagirisamy, Leo John Baptist Andrews, S. Allwin Devaraj 
    Abstract: An electronic health information system (EHIS) makes it easier for healthcare workers to give clients personalised care and for service providers to share information. EHIS are increasingly used to diagnose and treat both communicable and non-communicable diseases, driven by the growing amount of patient data and the need to provide ongoing care. The COVID-19 pandemic, which placed a heavy disease burden on low- and middle-income countries (LMICs), has also shown how important a strong EHIS is for tracking pandemics. The goal of this study is to assess how durable and resilient hospital information systems (HIS) are in light of the major changes under way in healthcare. The study conducts a comprehensive evaluation of existing research on HIS, with a particular focus on their implementation in healthcare settings. The accuracy rate provided by the proposed system is 92.5%.
    Keywords: electronic health information system; EHIS; healthcare apps; healthcare; HIS applications.
    DOI: 10.1504/IJCVR.2024.10066870
     
  • Automated lesion segmentation of COVID-19 chest CT scan images using magneto-static active counter model
    by S. Salini, B. Selvapriya 
    Abstract: The novel coronavirus disease, COVID-19, is a highly infectious airborne disease that has caused substantial harm across the globe. Recent years have seen the emergence of further COVID-19 variants, which has made the situation considerably more complex and potentially more harmful. Evaluating and measuring COVID-19 chest anomalies can assist in addressing these challenges. In this investigation, we use CT images to test the applicability of magneto-static active contour models (MACMs) for segmenting COVID-19 pneumonia. MACMs are an effective method for image segmentation. The proposed method for adding more data to the training set can raise the average DSC from 0.7599 to 0.781, lower the average MAE from 0.0074 to 0.0067, and raise the average SEN from 0.8104 to 0.8308.
    Keywords: chest CT scans; COVID-19 infection; active contour models; level set methods; region-based models; edge-based models.
    DOI: 10.1504/IJCVR.2024.10067000
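
The abstract above reports DSC, MAE and SEN without defining them. The sketch below assumes the usual readings (Dice similarity coefficient, mean absolute error, and sensitivity) and computes them for two flattened binary masks. The function name, the toy masks, and the convention of scoring two empty masks as a perfect match are illustrative assumptions, not details from the paper.

```python
def segmentation_scores(predicted, truth):
    """Compare two flat binary masks (lists of 0/1) of equal length.

    Returns a dict with the Dice similarity coefficient ("dsc"),
    the sensitivity ("sen") and the mean absolute error ("mae").
    """
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("masks must be non-empty and of equal length")

    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # lesion found
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false alarm
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # lesion missed

    # Assumed convention: two empty masks count as a perfect match.
    dsc = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sen = tp / (tp + fn) if (tp + fn) else 1.0
    mae = sum(abs(p - t) for p, t in pairs) / len(pairs)
    return {"dsc": dsc, "sen": sen, "mae": mae}


# Toy masks: 2 true positives, 1 false positive, 1 false negative.
predicted_mask = [1, 1, 0, 0, 1, 0]
truth_mask = [1, 0, 0, 1, 1, 0]
scores = segmentation_scores(predicted_mask, truth_mask)
print(scores)
```

Higher DSC and SEN and lower MAE indicate closer agreement with the reference mask, which matches the direction of the improvements reported in the abstract.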
     
  • Zigbee-based wireless sensor network in smart agriculture for crop production
    by K. Sathiya Priya, C. Rajabhushanam 
    Abstract: "Smart farming" is a new trend in networked, high-tech farm monitoring that uses tools, equipment, and monitors with a focus on information and communication technology. Cloud computing, IoT, and other smart technologies could accelerate advancements and increase the use of AI and robotics in agriculture. These drastic changes not only raise many issues but also disrupt current agricultural practices. This research investigates the kinds of tools and equipment used with portable devices in Internet of Things agriculture, as well as the issues expected to arise when technology is combined with conventional agricultural practices. Understanding this technology also helps farmers throughout the food cycle, from growing to harvesting. The ZigBee routing system reduces the number of route discoveries by 15.2%.
    Keywords: crop management; sustainable agriculture; smart farming; internet-of-things (IoT); advanced agriculture practices; issues and problems.
    DOI: 10.1504/IJCVR.2024.10067001
     
  • Transfer learning based ResNet50v2 model for classification of COVID-19 in chest X-ray Images
    by Farha Nausheen 
    Abstract: Coronavirus disease 2019 (COVID-19) is a respiratory disease that primarily affects the lungs. One of the techniques used to diagnose it is the chest X-ray (CXR). CXR images can be classified with a convolutional neural network (CNN), one of the most popular deep learning (DL) architectures. DL requires training a CNN for a task such as classification on a large dataset. The transfer learning (TL) approach is used because the available datasets are either too small or too heterogeneous to build efficient feature representations. The popular ResNet architecture, widely used for TL, is well known for its strong performance and generalisation on recognition tasks. In the proposed study, classification is carried out with the ResNet50v2 model on a dataset of 10,192 normal and 3,616 COVID-19 chest X-ray images. An experimental analysis evaluates the accuracy, specificity, sensitivity, and precision of the model. The testing accuracy is 98.746% and the testing loss is 0.04482 after 20 epochs. Data augmentation is included to expand the training set and enhance generalisation.
    Keywords: COVID-19 detection; deep learning; transfer learning; ResNet50v2; chest X-ray images.

  • Multi-resolution deepfake face swap image forgery detection using DCT and DWT
    by Bhavani Ranbida, Debabala Swain, Bijay Kumar Paikaray 
    Abstract: Pictures are everywhere in the digital age, yet sophisticated editing programmes can produce fake ones without our knowledge. Better methods for swiftly identifying these fakes are needed, since they cause problems that we might not even be aware of. This project aims to speed up verification and to develop more intelligent techniques for identifying fake photographs, making it easier to deal with the consequences of altered images. This research presents a novel approach to detecting fake images, particularly those with faces altered using deepfake methods. The discrete cosine transform (DCT) and discrete wavelet transform (DWT) are used to find the fake parts of images effectively. Breaking the images down into smaller blocks makes the fake regions easier to locate. The approach not only improves the detection of fake images but also saves time. Hence, it is a reliable way to find fake images and verify genuine ones, especially against new challenges such as deepfake face swaps.
    Keywords: deepfake; digital image forensic; multi-resolution; counterfeit detection; DCT; discrete wavelet transform; DWT.