Forthcoming and Online First Articles

International Journal of Biometrics (IJBM)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.


Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with the Open Access icon are Online First articles. They are freely available and openly accessible to all, without any restrictions other than those stated in their respective CC licences.

Register for our alerting service, which notifies you by email when new issues are published online.

International Journal of Biometrics (26 papers in press)

Regular Issues

  • Statistical analysis of lateral palm prints in conjunction with signatures for personal identification
    by Sonali Ashok Dagha, Kanica Chugh, Pooja Ahuja 
    Abstract: Personal identification on the basis of individual characteristics has always been an area of interest for forensic scientists. Certain behavioural biometrics, such as signatures, can help individualise a person, and when modelled together with lateral palm prints they can be a reliable source of individualisation. The hypothenar area (the lateral or ulnar side of the palm), encountered at crime scenes in the form of patent or latent prints in cases such as sexual assault, or below the signature on a document, provides a way to identify an individual. The paper discusses one such technique: just as fingerprints help identify an individual, the lateral prints left on paper in conjunction with signatures can help identify the author without analysing the signatures themselves. The present study was conducted on 200 samples obtained from individuals writing in the normal position, in order to identify lateral prints and determine their patterns for personal identification. The results supported positive identification and indicated whether a person is right-handed or left-handed. Statistical analysis helps establish a correlation between the genuine manner of signing and the lateral side of the palm.
    Keywords: signature authorship; biometrics; hypothenar area; personal identification; palm biometrics.
    DOI: 10.1504/IJBM.2026.10071812
     
  • A novel Boolean approach for cancellable biometric template generation
    by Onkar Singh, Ajay Jaiswal, Nitin Kumar, Naveen Kumar 
    Abstract: Cancellable biometrics addresses privacy and security concerns by transforming biometric data into a non-invertible template. This paper presents two novel and secure binary-domain transformations for generating cancellable templates from biometric images. Our approach involves converting pixel decimal values to their unsigned binary equivalents, significantly enhancing non-invertibility. Experimental results on eight datasets demonstrate superior performance, achieving an equal error rate (EER) below 0.001% and outperforming five leading cancellable biometric methods based on salting, XOR, and random permutations. Inversion attacks, attack via record multiplicity (ARM), and similarity metrics confirm the non-invertibility and robustness of the generated templates, while false-accept and brute-force attack analyses show that the methods are secure. Both methods adhere to the essential requirements of cancellable biometrics, allowing templates to be cancelled while demonstrating improved recognition accuracy, which makes them feasible for secure biometric authentication.
    Keywords: Boolean-XOR; biometric salting; biometric recognition; template protection; privacy; security.
    DOI: 10.1504/IJBM.2026.10071901
     
  • Facial micro expression classification and recognition method based on hidden Markov and support vector machine
    by Ruifang Xing, Jingjing Feng 
    Abstract: To reduce the misidentification rate and time consumption of facial micro expression classification and recognition, a method based on a hidden Markov model and support vector machine is proposed. Firstly, the micro expression sample images are preprocessed using bilinear interpolation and mean-variance normalisation. Secondly, the Gabor wavelet transform is used to extract local spatial features of micro expressions. Finally, a hidden Markov model is constructed to extract temporal features, and the optimal feature sequence is found via the Viterbi algorithm; this sequence is then input into a support vector machine to complete classification and recognition. The experimental results show that the false recognition rate of this method is always below 0.2%, the AUC is close to 1, and the recognition time is always below 15 ms, indicating that the proposed method significantly improves the accuracy and timeliness of micro expression recognition.
    Keywords: Gabor wavelet transform; feature extraction; hidden Markov; support vector machine; SVM; micro expression recognition.
    DOI: 10.1504/IJBM.2025.10071942
     
  • An optimised learning strategy for analysing mentally disordered children's activities based on facial expressions
    by Mayuri N. Panpaliya, Pritish A. Tijare
    Abstract: Facial expressions play a crucial role in non-verbal communication and are used to recognise a person's present state of mind. A great deal of research has been carried out on determining human emotions from facial expressions; however, far less work addresses children with mental illness. This article presents a novel intelligent convolution neural-based buffalo optimisation (CNbBO) model to detect and predict the activities of mentally ill children based on their facial expressions. Initially, machine learning (ML) algorithms are used for pre-processing and feature extraction. Facial emotions are classified from these features with the help of an optimisation fitness function. A second fitness function at layer two is then updated to track activities, enhance the detection rate, and improve classification accuracy. Results show that the performance of the developed algorithm is promising, with an accuracy of 94%, which is better than presently available techniques.
    Keywords: mental disorder; mental illness; facial expression; feature extraction; classification accuracy.
    DOI: 10.1504/IJBM.2026.10072292
     
  • Chinese speech emotion recognition based on improved convolutional neural network
    by Xiaoyan Wei, Xinhua Wang 
    Abstract: A Chinese speech emotion recognition method based on an improved convolutional neural network is proposed, with the goal of solving the problems of high false acceptance and false rejection rates and high recognition time in traditional Chinese speech emotion recognition methods. Chinese speech signals are collected, and pre-emphasis, framing, windowing, and the fast Fourier transform are applied to pre-process them, after which features of the pre-processed signals are extracted. Multi-level residuals are introduced to improve the convolutional neural network; the speech signal features are input into the improved network, which iteratively outputs the emotion recognition results. Experimental testing shows that the proposed method achieves an average false acceptance rate of 2.84% and an average false rejection rate of 4.63%, and the maximum time consumption for Chinese speech emotion recognition is 49.2 ms.
    Keywords: improved convolutional neural network; Chinese speech; emotion recognition; fast Fourier transform; FFT; multi-level residuals.
    DOI: 10.1504/IJBM.2025.10072293
     
  • Survival analysis: a comparative study of frequentist and Bayesian approaches
    by Zaheer Aslam, Abid Hussain, Nasir Ali, Muhammad Hanif, Roquia Aslam 
    Abstract: This study compares two categories of survival analysis, traditional (Kaplan-Meier, Cox PH, parametric models, Bayesian Cox) and deep learning (DeepSurv, DeepHit), on 299 heart failure patients. DeepHit performed best (C-index = 0.75, IBS = 0.16), surpassing DeepSurv (0.73) and Cox PH (0.71). Cox PH proved quicker and more easily interpretable, with age (HR = 1.05) and serum creatinine (HR = 1.37) emerging as key predictors. The Bayesian models performed well in the small-sample setting (DIC = 1272.30) while providing uncertainty quantification. Parametric models (e.g., Weibull, AIC = 1282.24) were effective where distributional assumptions were met. Important variables, such as age, ejection fraction, and renal biomarkers, were consistently significant. Model selection is need-based: Cox PH when speed and interpretability matter, Bayesian procedures for small sample sizes and informative priors, DeepHit when the patterns are complex, and parametric models when the data follow a particular distribution.
    Keywords: survival analysis; parametric models; nonparametric models; semiparametric methods; Bayesian parametric models; Bayesian semiparametric models.
    DOI: 10.1504/IJBM.2026.10072343
     
  • Optimised VGH algorithm-based deep CNN classifier for diabetic retinopathy
    by Bhagyashree Somnath Madan, Avinash Sharma 
    Abstract: Diabetic retinopathy (DR) is a highly consequential condition affecting diabetic patients, and inaccurate detection results in permanent vision loss. Fundus images are used to predict the severity of DR from lesion segmentation, but manual segmentation is a time-consuming and complex process. The proposed research therefore develops a vision guided horse optimiser (VGH algorithm) based deep convolution neural network (DCNN) classifier to identify DR. The significance of the present research lies in the vision guided horse-optimised deep convolution neural network (VGH-optimised DCNN) model for classifying DR. Image contrast enhancement and illumination correction are applied in the pre-processing stage, and the automatic Otsu approach and a contour-based threshold approach are used to segment the optic disc and blood vessels. Various methods are compared with the VGH-optimised DCNN to assess model performance. The accuracy, F1-score, sensitivity, and specificity of the developed model at epoch 25 are 97.59%, 96.76%, 96.64%, and 96.42% at a training percentage of 90, whereas at k-fold value 6 the VGH-optimised DCNN model attains 97.10%, 96.77%, 96.22%, and 97.07% on the IDRID dataset.
    Keywords: diabetic retinopathy; optic disc segmentation; blood vessel segmentation; deep CNN; vision guided horse optimiser.
    DOI: 10.1504/IJBM.2026.10072442
     
  • Design of an ensembled face recognition system using optimal local features for unconstrained and age-difference environments
    by Dipak Kumar, Ravi Kant Kumar, Jogendra Garain, Dakshina Ranjan Kisku, Jamuna Kanta Sing, Phalguni Gupta 
    Abstract: Biometric authentication, especially via face recognition, remains a challenging problem. The face conveys rich semantic information through its numerous expressions, and these dynamic expressions are among the main challenges for a biometric system. Researchers are therefore continuously trying to enhance the robustness of facial recognition. This work proposes a multi-classifier ensemble system for effective face recognition. Our proposed system has two modules: 1) an optimisation module that decreases computational cost by reducing the feature sets; 2) a fusion module with multiple classifiers to improve accuracy. The feature sets are extracted using the local descriptors LBP, DS-LBP, and LGS, and optimised using a genetic algorithm (GA). The optimised feature sets are classified separately, and the results are combined using decision-level fusion methods such as the AND-rule, OR-rule, and majority voting. Experiments on the LFW, BioID, and LAG datasets show that the proposed ensemble system is more efficient and robust.
    Keywords: face recognition system; local binary pattern; LBP; local graph structure; SLGS; densely sampled local binary pattern; DS-LBP; ensemble system; genetic algorithms; majority voting.
    DOI: 10.1504/IJBM.2026.10072443
     
  • Deep learning-driven multi-sample periocular recognition for biometric authentication
    by Nabil Hezil, Amir Benzaoui, Ghania Droua-Hamdani, Khadidja Belattar, Ahmed Bouridane 
    Abstract: The COVID-19 pandemic has accelerated the adoption of contactless biometric modalities such as face, iris, voice, and periocular recognition, which offer a safer alternative to traditional methods by reducing the risk of disease transmission in public and private spaces. While face recognition technologies have shown robust performance even with partial facial occlusions, their accuracy diminishes significantly when individuals wear medical masks, highlighting the importance of periocular biometrics for reliable personal identification. To enhance security and accuracy, multi-biometric systems, which combine multiple biometric traits, outperform single-modality approaches. We propose a multi-input convolutional neural network (MICNN) framework that fuses the left and right periocular traits from the same face image for enhanced biometric recognition. We evaluate our method on two challenging periocular datasets, achieving highly competitive correct recognition rates of 99.62% and 98.33%, respectively, and outperforming recent benchmarks. These results underscore the efficacy of multi-sample periocular recognition using deep learning for contactless biometric identification.
    Keywords: multi-sample biometrics; periocular recognition; fusion; deep learning; cascade object detector; multi-input convolutional neural network; MICNN; feed-forward neural networks; FFNNs.
    DOI: 10.1504/IJBM.2026.10072550
     
  • Gender-based analysis of ECG biometric identification under different physiological conditions
    by Siti Nurfarah Ain Mohd Azam, Khairul Azami Sidek 
    Abstract: This study investigates the reliability of ECG signals as a biometric feature for individuals in different physiological conditions. Previous research has shown that ECG biometric identification can be performed under normal conditions; the challenge, however, lies in performing biometric identification on moving subjects. This work therefore proposes robust biometric identification using ECG signals acquired under different physiological conditions, analysed by gender. A total of 15 male and 7 female subjects performing sitting, walking and running activities were involved in this work. The study found that ECG signals can be reliably used as a biometric across different physiological conditions, with the medium Gaussian SVM being the most effective at 98.7% accuracy. The accuracy of ECG signal classification can be affected by gender differences, with female subjects exhibiting accuracy as low as 82.9%, likely due to the size of their hearts.
    Keywords: ECG; biometric; person identification; pattern recognition; verification; signal processing; human authentication; support vector machine; biological signals; signal classification.
    DOI: 10.1504/IJBM.2025.10070475
     
  • Enhancing security and accuracy in biometric systems through the fusion of fingerprint and gait recognition technologies
    by Mayank Shekhar, Amit Kumar Trivedi, Ripon Patgiri 
    Abstract: In the evolving landscape of security technology, biometric systems are pivotal for unique identification through physiological or behavioural traits. This research focuses on enhancing biometric system security and accuracy by integrating fingerprint and gait recognition technologies. Fingerprint recognition is valued for its precision and ease of data acquisition, while gait recognition offers non-invasiveness and resistance to obfuscation. The study explores feature and score level fusions of these modalities, utilising advanced algorithms to optimise the integration and elevate recognition performance. Experimental evaluations demonstrate that the proposed multimodal system not only outperforms unimodal systems but also strengthens robustness against spoofing attacks. Key contributions include a novel gait feature extraction technique compatible with fingerprint features and an optimised score-level fusion algorithm, significantly enhancing accuracy and security. Biometric security systems have become integral to modern security architectures, leveraging unique physiological and behavioural characteristics to authenticate individuals.
    Keywords: multimodal biometrics; fingerprint recognition; gait recognition; biometric security; feature fusion; biometric authentication.
    DOI: 10.1504/IJBM.2025.10068285
     
  • Classification of human emotion using an EEG-based brain-machine interface: a machine learning approach
    by Abdul Cader Mohamed Nafrees, Sidath Ravindra Liyanage, Naomal G.J. Dias 
    Abstract: The main purpose of this work is to investigate the possibility of using electroencephalography (EEG) data to improve machine learning models' ability to accurately identify emotions. The work focuses on emotion classification using EMG data to improve data mining models. It investigates the use of individual and ensemble classification methods in processing windowed data obtained from four scalp sites; this information is then used to infer the emotions that participants felt at particular times. The results indicate that a low-resolution, readily available EEG device can be a useful tool for determining a human's emotional status. Applying an ensembling technique increases the accuracy of the model, highlighting the possibility of creating classification algorithms that may be used in practical decision support systems. Future studies in this field ought to concentrate on determining whether attribute creation, attribute selection, or their combination was responsible for this notable improvement.
    Keywords: electroencephalography; EEG; electromyography; EMG; facial expressions; human emotion; machine learning; ML.
    DOI: 10.1504/IJBM.2025.10070061
     
  • A survey of multimodal emotion recognition: fusion techniques, datasets, challenges and future directions
    by Kuei-Chung Chang, Sheng-Quan Chen 
    Abstract: Emotion recognition is crucial in enhancing the quality of human-computer interaction, education, healthcare, and transportation safety. By integrating data from different sources, multimodal emotion recognition can capture complex emotional signals more comprehensively and accurately than single-modal data sources. This paper comprehensively reviews AI-based multimodal emotion recognition systems, covering deep learning techniques, datasets, challenges, and future research directions. We also present current research techniques, including exploring fusion strategies for different modal data and diverse data fusion methods. Several challenges in multimodal emotion recognition are discussed in this paper, such as the incompleteness of modal data, inconsistency in signal quality, and insufficient model interpretability. The paper points out that future research needs to further explore how to effectively integrate data from different modalities and enhance the adaptability and interpretability of models in practical applications.
    Keywords: emotion recognition; multimodal learning; artificial intelligence; feature fusion.
    DOI: 10.1504/IJBM.2025.10070454
     
  • Facial thermograms - application of facial recognition in the medical sector
    by Swagata Sarkar, R. Muthuselvan, N. Ashokkumar, Rajesh Kumar Vishwakarma 
    Abstract: Millions of people around the world suffer recurrent migraines, a neurological disease that can be severely debilitating. This study finds a significant temperature difference in the frontal and temporal areas of the right side of the brain in women who had headaches on one side only. Notably, the temperature trends of people who had pain on both sides did not change, suggesting that the diagnostic process may be more complex. Further study with larger groups is still required, and facial thermography should be interpreted with care until more research and validation are available. Nevertheless, facial thermography has considerable potential to help doctors diagnose headaches better and to illuminate how they work at a neurophysiological level. The system was trained with 1980 images and tested with 576 images, achieving an accuracy of 96.66%.
    Keywords: female; headache; humans; migraine disorders; quality of life; pain; temperature; thermography.
    DOI: 10.1504/IJBM.2025.10069534
     

Special Issue on: Applications of Image Processing and Pattern Recognition in Biometrics

  • A method for classifying and recognising the emotional states of dancers based on the spatiotemporal features of facial expressions
    by Yaotian Li, Zhaoping Wang 
    Abstract: To address the issues of low recall and poor accuracy in classifying and recognising dance performers' emotional states from spatiotemporal features of facial expressions, a classification and recognition method based on these features is proposed. Firstly, face detection is performed using an integral image, and preprocessing is carried out using affine transformation and histogram equalisation. Secondly, the LBP and LPQ algorithms are combined to extract spatiotemporal features of facial expressions. Next, principal component analysis is applied for feature selection and dimensionality reduction to reduce noise and redundant information. Finally, a support vector machine (SVM) is used for emotional state classification, achieving automatic recognition and multi-class classification. Experiments demonstrate that the proposed method attains high accuracy and recall, with a recall rate consistently above 95%.
    Keywords: spatiotemporal features; principal component analysis; PCA; support vector machine; SVM; emotional state; classification recognition.
    DOI: 10.1504/IJBM.2025.10069414
     
  • A grading evaluation method for English oral pronunciation errors based on deep neural networks
    by Jian Sun, Li Zhang, Guanghui Shu 
    Abstract: In this paper, a deep neural network-based grading method for English oral pronunciation errors is proposed. English oral pronunciation signals are preprocessed and MFCC feature vectors are extracted. A hidden Markov model is used to construct an acoustic model, with a deep neural network predicting the state probability distribution of the acoustic feature vectors and replacing the observation probabilities of the acoustic model. A language model is constructed to obtain word-order probabilities and combined with the acoustic model to build a search network, which the Viterbi algorithm decodes to find the phoneme state sequence. Based on the reference phoneme sequence, the degree of pronunciation error is calculated and compared with a threshold to achieve graded evaluation. The results indicate that the AUC value of the proposed method is close to 1 and the F1-score is above 0.95, indicating high evaluation accuracy.
    Keywords: spoken English; pronunciation error; graded evaluation; hidden Markov model; deep neural network; DNN.
    DOI: 10.1504/IJBM.2025.10069415
     
  • Research on basketball emergency stop jump shot action recognition based on semantic guided neural network
    by Yong Wang 
    Abstract: To recognise basketball emergency-stop jump-shot movements accurately and quickly, a new recognition method based on a semantic guided neural network is proposed. Firstly, the quality of basketball action images is improved through colour vectorisation and filtering preprocessing. Secondly, image retrieval technology is used for edge contour feature extraction and fusion retrieval, and a sample set of pixel features of likely emergency-stop jump-shot actions is selected. Finally, semantic information is integrated into the neural network to improve recognition accuracy. The network architecture innovatively incorporates non-local feature extraction modules, ECA attention mechanism modules, and deformable convolution modules to extract feature information, and accurate recognition of basketball emergency-stop jump shots is achieved through fully connected layers. Test results show that the recognition accuracy of the proposed method is stable at around 95%, and the longest recognition time is only 0.93 s.
    Keywords: semantic guided neural network; basketball emergency stop jump shot; action recognition; edge contour features.
    DOI: 10.1504/IJBM.2025.10069416
     
  • Seeing the unseen: a novel approach to biometric recognition system
    by Kumari Deepika, Deepika Punj, Jyoti Verma
    Abstract: This paper introduces an innovative three-phase cascade framework designed for biometric recognition systems, particularly suited for small-scale applications. By integrating multiple biometric modalities - dorsal vein, wrist vein, and palm print - the framework aims to improve recognition accuracy and robustness. The first phase focuses on extracting unique features from each modality using a moment-based approach that is transformation-invariant and computationally efficient. In the second phase, an asymmetric aggregator operator is employed to merge these features into a unified representation. The final phase utilises spectral clustering to classify and match the fused feature vectors, effectively addressing unseen samples. Evaluated on 350 samples from the COEP and FYO benchmark databases, the framework achieved an impressive accuracy of around 98% for unseen samples, outperforming existing methods like Zernike moment and hierarchical clustering. This work not only enhances biometric authentication but also broadens its applicability across various domains, marking a significant advancement in the field.
    Keywords: moment; unseen samples; spectral clustering; hierarchical; Zernike; Hu; COEP palmprint; FYO DB.
    DOI: 10.1504/IJBM.2025.10070135
     
  • Interactive gesture recognition method based on spatiotemporal mask and variational mode decomposition
    by Yuee Yi 
    Abstract: To reduce the error rate and response time of interactive gesture recognition, this study proposes a new method based on spatiotemporal masking and variational mode decomposition. Firstly, the spatiotemporal features of interactive gestures are extracted using spatiotemporal masks, and the modelling of hand joint data is optimised by combining graph convolutional networks with self-attention mechanisms. Secondly, variational mode decomposition is used to extract time-frequency features from interactive gestures. Finally, principal component analysis reduces the dimensionality of the high-dimensional gesture features, and a support vector machine classifier recognises the different types of interactive gestures in the reduced feature space. The experimental results show that the proposed method performs well in interactive gesture recognition tasks, with a maximum error rate of no more than 1% and a response time of only 0.31 s.
    Keywords: spatiotemporal mask; variational mode decomposition; interactive gesture recognition; support vector machine; SVM.
    DOI: 10.1504/IJBM.2025.10070791
     
  • English oral pronunciation recognition based on improved deep neural networks
    by Lifang Cheng 
    Abstract: To solve the problems of poor performance, low F1-score, and high error rate in traditional English oral pronunciation recognition methods, a recognition method based on improved deep neural networks is proposed. Firstly, the spoken English pronunciation signals and videos are preprocessed to extract audio and lip features. Then, fusion processing is performed on the extracted multimodal features. Finally, the multimodal feature fusion result is used as the input vector and the English spoken pronunciation recognition result as the output vector; by adding an attention module before the first fully connected layer, a deep neural network model is built to obtain the recognition results. The experimental results show that the proposed method has good recognition performance, with an F1-score consistently above 0.95 and an error rate of no more than 1%, and it can be further promoted in related fields.
    Keywords: spoken English; pronunciation recognition; deep neural network; attention module; multimodal feature fusion.
    DOI: 10.1504/IJBM.2025.10070792
     
  • Multimodal emotion recognition based on combined deep learning network
    by Zhenzhen Wang, Yu Ji, Rui Sun, Qi Liu 
    Abstract: To address the issues of low accuracy, low F1-score, and long task completion time in traditional multimodal emotion recognition methods, a method based on a combined deep learning network is proposed. Firstly, EEG signals, eye movement data, and facial expression images are collected, and features are extracted from the collected data. Then, a multimodal feature fusion model is built using a modal attention module, a weighting operation, and a decision module, with the extracted features as model inputs, to obtain the multimodal feature fusion results. Finally, the fusion results are combined with a combined deep learning network to achieve multimodal emotion recognition. The experimental results show that the proposed method achieves a maximum accuracy of 99.1%, a minimum F1-score of 0.947, and a minimum task completion time of 56.8 ms, demonstrating high precision and efficiency.
    Keywords: combined deep learning network; multimodal; emotion recognition; EEG signals; eye movement data; facial expression images.
    DOI: 10.1504/IJBM.2025.10070793
     
  • Automatic speech recognition method for English translator based on improved transfer learning
    by Wei Zeng, Gaochao Huang 
    Abstract: To solve the high word error rate and long response time of current English translator speech recognition methods, an automatic speech recognition method for English translators based on improved transfer learning is proposed. The speech signal of the English translator is sampled, quantised and pre-processed, and features are extracted from the pre-processed signal. With the speech signal features as input vectors and the speech recognition results as output vectors, transfer learning is improved by removing the output layer of the base model, re-initialising a new output layer connected to the hidden layer of the base model, and fine-tuning the network; in this way an automatic speech recognition model for the English translator is built and the speech recognition results are obtained. Experimental results show that the word error rate of the proposed method is less than 3.5% and the maximum response time is only 7.3 ms.
    Keywords: improved transfer learning; English translator; speech recognition; pre-processed; signal features.
    DOI: 10.1504/IJBM.2025.10071397
     
  • Continuous pronunciation error recognition of English vocabulary based on dual modal fusion features   Order a copy of this article
    by Lin Wu 
    Abstract: To reduce the word error rate and character error rate in English pronunciation recognition, a continuous pronunciation error recognition method for English vocabulary based on dual-modal fusion features is proposed. First, the continuous speech data are pre-processed: the mouth region of interest (ROI) is screened from the visual information, followed by colour normalisation and horizontal flipping, while the short-time Fourier transform is used to extract audio features, which are normalised to ensure temporal consistency of the data. Second, for the pre-processed data, the dual-modal features of visual and auditory speech information are fused based on kernel principal component analysis. Finally, with the fused features as input, a continuous-speech pronunciation error recognition model for English vocabulary is constructed. Experimental results show that the proposed method achieves a word error rate of less than 7.9% and a character error rate of less than 6.4% for continuous pronunciation errors in English vocabulary.
    Keywords: dual-modal fusion features; English vocabulary; continuous speech; pronunciation error; intelligent recognition.
    DOI: 10.1504/IJBM.2025.10071398
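The kernel-PCA fusion step can be sketched in numpy: concatenate per-frame audio and visual features, build an RBF Gram matrix, centre it, and project onto the leading components. Frame counts, feature sizes and `gamma` are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-frame features: audio (STFT-derived) and visual (mouth-ROI)
# vectors, concatenated before kernel PCA fusion.
audio = rng.standard_normal((100, 12))
visual = rng.standard_normal((100, 8))
X = np.hstack([audio, visual])              # (n_frames, 20) joint representation

# Kernel PCA with an RBF kernel: build, centre, and eigendecompose the Gram matrix.
gamma = 0.05
sq = np.sum(X**2, axis=1)
K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
n = K.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
Kc = J @ K @ J                              # doubly centred kernel matrix

vals, vecs = np.linalg.eigh(Kc)
idx = np.argsort(vals)[::-1][:5]            # keep the 5 leading components
fused = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))  # fused dual-modal features
```

Scaling each eigenvector by the square root of its eigenvalue gives the samples' coordinates along the principal directions in feature space, which serve as the fused representation fed to the recognition model.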
     
  • Application of improved stacking ensemble learning in intelligent terminal fingerprint recognition   Order a copy of this article
    by Zhe Li 
    Abstract: To solve the high RMSE, high false acceptance rate and poor recognition performance of existing fingerprint recognition methods, the application of improved stacking ensemble learning to intelligent terminal fingerprint recognition is studied. First, image quality is improved through grayscale processing, Gaussian filter denoising and Gabor filtering. Second, the Sobel operator is used to calculate the gradient direction, and the image is divided into blocks to extract minutiae features, which are described by curvature to support fingerprint matching. Finally, the improved algorithm is used for fingerprint recognition: base learners are trained and make predictions, and their outputs undergo feature enhancement, weighted fusion and second-layer learner training to obtain the final fingerprint recognition result. The experimental results show that the proposed method has a low RMSE and a low false acceptance rate, and its recognition performance is good.
    Keywords: improved stacking ensemble learning; intelligent terminal; fingerprint recognition; decision tree algorithm.
    DOI: 10.1504/IJBM.2025.10071399
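The two-layer stacking idea — base learners whose predictions become the features of a second-layer learner — can be sketched with least-squares "learners" on toy data. The data, the split into feature views and the learners themselves are stand-ins, not the paper's pipeline:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy match/non-match data standing in for fingerprint matching scores.
X = rng.standard_normal((200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

def fit_ls(A, t):
    # Minimal least-squares "learner": returns a weight vector.
    return np.linalg.lstsq(A, t, rcond=None)[0]

# Layer 1: two simple base learners trained on different feature views.
w1 = fit_ls(X[:, :2], y)                   # base learner on features 0-1
w2 = fit_ls(X[:, 2:], y)                   # base learner on features 2-3

# Base-learner predictions become meta-features for the second layer.
meta = np.column_stack([X[:, :2] @ w1, X[:, 2:] @ w2, np.ones(len(y))])

# Layer 2: the meta-learner performs the weighted fusion of base outputs.
w_meta = fit_ls(meta, y)
pred = (meta @ w_meta > 0.5).astype(float)
accuracy = float((pred == y).mean())       # training accuracy of the stacked model
```

A practical stacking setup would generate the meta-features with out-of-fold predictions to avoid leaking the base learners' training labels into the second layer.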
     
  • Health status recognition method of human in sports videos based on deep learning   Order a copy of this article
    by Xunxing Liu 
    Abstract: In this paper, a deep-learning-based method for recognising human health status in sports videos is proposed. First, the global spatial information of the frame-difference features is determined, and the frame-difference features of human health status are enhanced for feature extraction. Second, frame differences of facial expression information features are extracted, completing the frame-difference segmentation of human health status in the sports video. Then, to reduce overfitting of the recognition results, the fully connected layer of the deep learning network outputs the recognition results, and a loss function is introduced to optimise them. Finally, experimental verification is conducted. The experimental results show that the proposed method has high confidence, low complexity and small error, indicating good recognition performance.
    Keywords: deep learning; sports videos; human health status; identification; frame difference segmentation.
    DOI: 10.1504/IJBM.2025.10071400
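Frame-difference feature extraction and segmentation, the first two steps of this abstract, can be sketched in numpy. The video tensor, frame size and thresholding rule are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical grayscale video: 10 frames of 32x32 pixels, values in [0, 1].
video = rng.random((10, 32, 32))

# Frame-difference features: absolute difference between consecutive frames.
diff = np.abs(np.diff(video, axis=0))      # (9, 32, 32)

# Enhance and segment: threshold the differences to obtain motion masks.
thresh = diff.mean() + diff.std()          # simple global threshold (assumed rule)
masks = (diff > thresh).astype(np.uint8)   # 1 where notable inter-frame change

# A simple global descriptor per frame pair: fraction of changed pixels.
motion_ratio = masks.reshape(9, -1).mean(axis=1)
```

Descriptors like `motion_ratio` (or the masks themselves) would then be fed to the fully connected layers of the recognition network.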
     
  • Static facial expression emotion recognition method using spatiotemporal graph convolution   Order a copy of this article
    by Yanmei Sun, Bo Cheng 
    Abstract: To improve the Matthews correlation coefficient and consistency index of facial expression features in emotion recognition, a static facial expression emotion recognition method using spatiotemporal graph convolution is proposed. First, facial images are standardised through eye localisation, tilted expressions are corrected by rotation, pixel values are estimated by bilinear interpolation, and histogram equalisation is combined to achieve grayscale processing of static facial expression images. Second, two-dimensional Gabor wavelet filtering is applied to the static facial images, exploiting the time-frequency localisation properties of Gabor wavelets to accurately extract texture details at different frequencies and orientations. Finally, the spatiotemporal graph convolution method is used to extract spatial features and achieve effective recognition of static facial expression emotions. In the static facial expression emotion recognition experiments, the Matthews correlation coefficient remained above 0.9 and the consistency index of expression features remained above 0.91.
    Keywords: spatiotemporal graph convolution; static facial expressions; emotion recognition; two-dimensional Gabor wavelet filtering.
    DOI: 10.1504/IJBM.2025.10071401
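The multi-orientation 2-D Gabor filtering described in this abstract can be sketched in numpy: build a Gaussian-windowed sinusoid kernel per orientation and correlate it with the image. Kernel size, `sigma`, wavelength and the four orientations are illustrative choices:

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0, psi=0.0):
    """Real part of a 2-D Gabor filter: a Gaussian-windowed sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / lam + psi))

# A small bank over four orientations, for multi-direction texture extraction.
bank = [gabor_kernel(theta=t) for t in (0, np.pi/4, np.pi/2, 3*np.pi/4)]

# Filter a hypothetical face-image patch by direct valid-mode 2-D correlation
# (explicit loops kept for clarity; real code would use an FFT or a library).
rng = np.random.default_rng(5)
img = rng.random((32, 32))
k = bank[0]
out = np.array([[np.sum(img[i:i + 15, j:j + 15] * k)
                 for j in range(32 - 14)] for i in range(32 - 14)])
```

The magnitudes of the responses across the bank form the orientation- and frequency-selective texture features that the graph convolution stage would then consume.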