Forthcoming and Online First Articles

International Journal of Biometrics

International Journal of Biometrics (IJBM)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licences.

Register for our alerting service, which notifies you by email when new issues are published online.

We also offer RSS feeds which provide timely updates of tables of contents, newly published articles and calls for papers.

International Journal of Biometrics (28 papers in press)

Regular Issues

  • Rapid Recognition of Athlete's Anxiety Emotion Based on Multimodal Fusion   Order a copy of this article
    by Li Wang  
    Abstract: The diversity of anxiety emotions and individual differences among different athletes have increased the difficulty of emotion recognition. To address this, a rapid recognition method of athlete’s anxiety emotion based on multimodal fusion is proposed. Wireless sensor networks are used to collect facial expression images of athletes, and wavelet transform is applied for denoising the collected images. Image features are extracted using grey-level co-occurrence matrix, and the athlete’s facial expression images are normalised. Features related to the athlete’s emotions, such as voice characteristics, facial expression features, and physiological indicators, are obtained. These features from different perceptual modalities are fused to achieve rapid recognition of athletes’ anxiety emotions. The test results demonstrate that this method not only improves the image denoising effect but also achieves high accuracy and efficiency in emotion recognition, enabling accurate and real-time recognition of athletes’ emotions.
    Keywords: multimodal fusion; rapid recognition; wireless sensor networks; wavelet transform.
    DOI: 10.1504/IJBM.2024.10060859
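The grey-level co-occurrence matrix (GLCM) step the abstract describes can be sketched in a few lines of numpy. This is an illustrative sketch, not the authors' implementation; `glcm` and `glcm_features` are hypothetical helper names, and only three classic Haralick-style statistics are shown.

```python
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Grey-level co-occurrence matrix for one pixel offset (dx, dy)."""
    img = np.asarray(img)
    m = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    total = m.sum()
    return m / total if total else m

def glcm_features(m):
    """Contrast, energy and homogeneity -- classic GLCM texture features."""
    i, j = np.indices(m.shape)
    contrast = float(((i - j) ** 2 * m).sum())
    energy = float((m ** 2).sum())
    homogeneity = float((m / (1.0 + np.abs(i - j))).sum())
    return contrast, energy, homogeneity

# a uniform patch has zero contrast and maximal energy/homogeneity
flat = np.zeros((4, 4), dtype=int)
c, e, h = glcm_features(glcm(flat))
```

In practice the quantised facial image would replace the toy patch, and several offsets would be pooled into one feature vector.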
     
  • Facial micro-expression recognition method based on CNN and Transformer mixed model   Order a copy of this article
    by Yi Tang, Jiaojun Yi, Feigang Tan 
    Abstract: Existing facial micro-expression recognition methods suffer from low efficiency and accuracy. Therefore, a facial micro-expression recognition method based on a hybrid CNN and transformer model is proposed. Facial hierarchical features are extracted with the hybrid model and used as inputs to a deep network. At the same time, the facial micro-expression image region is segmented and the image is smoothed by thresholding to obtain the facial micro-expression feature vectors. These feature vectors are input into the hybrid CNN and transformer model to recognise facial micro-expressions. The experimental results show that the proposed method can recognise facial micro-expressions in complete or incomplete images, with the recognition delay kept below 5 ms. In addition, compared with traditional methods, this method has a higher average recognition accuracy, up to 98%.
    Keywords: CNN; transformer mixed model; micro-expression of human face; recognition method.
    DOI: 10.1504/IJBM.2024.10060860
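The transformer branch of such a hybrid model is built around scaled dot-product attention, which can be sketched in numpy as follows. This follows the standard transformer definition, not the paper's code; `attention` is a hypothetical helper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))   # (n_q, n_k), each row sums to 1
    return weights @ v, weights

# three queries attending over three key/value vectors
q = np.eye(3)
out, w = attention(q, q, np.arange(9.0).reshape(3, 3))
```

A CNN front-end would supply the `q`, `k`, `v` matrices from local feature maps; the attention output then feeds the classification head.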
     
  • Identifying Illegal Actions of Basketball Players Based on an Improved Genetic Algorithm   Order a copy of this article
    by Zhenyu Zhu 
    Abstract: In order to reduce the time required for identifying athlete violations and improve the recognition rate, this paper proposes a basketball player violation recognition method based on an improved genetic algorithm. Firstly, the surface electromyographic (sEMG) signals of athletes are collected using a wireless sEMG acquisition device. Secondly, the signal acquisition locations are determined and the time-domain features of the signal are extracted. Then, a composite filter is used to denoise the signal. Finally, the genetic algorithm is improved by combining it with support vector machines to design an action recognition classifier, which outputs the illegal action recognition results. Experiments show that this method improves the recognition rate by 9.44% and recognises basketball players' illegal actions well within 0.5 minutes.
    Keywords: improved genetic algorithm; wavelet transform threshold denoising; action recognition classifier; surface electromyographic signal.
    DOI: 10.1504/IJBM.2024.10060862
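The genetic-algorithm core can be illustrated with a minimal elitist GA in numpy. This is a toy sketch, not the paper's improved method: the hidden target mask stands in for the SVM-based fitness the authors use, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.integers(0, 2, 16)              # stand-in for a "good" gene pattern

def fitness(pop):
    return (pop == TARGET).sum(axis=1)       # matches against the hidden target

def evolve(pop_size=40, genes=16, gens=60, p_mut=0.05):
    pop = rng.integers(0, 2, (pop_size, genes))
    best, best_f = pop[0], -1
    for _ in range(gens):
        f = fitness(pop)
        if f.max() > best_f:                 # elitism: remember the best ever seen
            best, best_f = pop[f.argmax()].copy(), int(f.max())
        a, b = rng.integers(0, pop_size, (2, pop_size))
        parents = np.where((f[a] >= f[b])[:, None], pop[a], pop[b])     # tournament
        cut = rng.integers(1, genes, pop_size)                          # one-point crossover
        mask = np.arange(genes)[None, :] < cut[:, None]
        children = np.where(mask, parents, parents[::-1])
        flip = rng.random(children.shape) < p_mut                       # mutation
        pop = np.where(flip, 1 - children, children)
    return best, best_f

best, score = evolve()
```

In the paper's setting, `fitness` would evaluate an SVM classifier on the sEMG features encoded by each chromosome.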
     
  • A Multistate Pedestrian Target Recognition and Tracking Algorithm in Public Places Based on Camshift Algorithm   Order a copy of this article
    by GaoFeng Han, Yuanquan Zhong 
    Abstract: In order to improve the accuracy of multistate pedestrian target recognition and tracking and to shorten tracking time, this paper proposes a multistate pedestrian target recognition and tracking algorithm for public places based on the Camshift algorithm. Firstly, the input image is converted to greyscale and HOG features are used to select multistate pedestrian targets in public places. Then, the probability density of the target area model is calculated and a pedestrian target recognition and tracking model is constructed. Finally, the colour features of the target are extracted, the Bhattacharyya coefficient is used to calculate the similarity between the target model and the candidate model, and the Camshift algorithm performs target recognition, tracking, and matching to obtain the final results. The experimental results show that the accuracy of the proposed method reaches 97.78% with a processing time of only 0.082 s per frame, indicating that the proposed method effectively improves target recognition and tracking performance.
    Keywords: gamma correction method; Camshift algorithm; HSV space; hue histogram; search window.
    DOI: 10.1504/IJBM.2024.10061224
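The Bhattacharyya similarity and the Camshift-style window update can both be sketched in numpy. An illustrative sketch only (in production one would typically use OpenCV's `cv2.CamShift`); the helper names here are hypothetical.

```python
import numpy as np

def bhattacharyya(p, q):
    """Similarity of two normalised histograms: 1 means identical."""
    return float(np.sum(np.sqrt(p * q)))

def mean_shift_step(backproj, cx, cy, half=5):
    """One Camshift-style update: move the search window to the centroid
    of the back-projection probabilities inside it."""
    h, w = backproj.shape
    x0, x1 = max(cx - half, 0), min(cx + half + 1, w)
    y0, y1 = max(cy - half, 0), min(cy + half + 1, h)
    win = backproj[y0:y1, x0:x1]
    total = win.sum()
    if total == 0:
        return cx, cy                        # nothing to track inside the window
    ys, xs = np.indices(win.shape)
    return (int(round((xs * win).sum() / total)) + x0,
            int(round((ys * win).sum() / total)) + y0)

# a bright 3x3 blob centred at (14, 13) pulls the window towards it
bp = np.zeros((20, 20))
bp[12:15, 13:16] = 1.0
```

Iterating `mean_shift_step` until the centre stops moving gives the converged track window for the frame.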
     
  • Multimodal Emotion Detection of Tennis Players Based on Deep Reinforcement Learning   Order a copy of this article
    by Wenjia Wu 
    Abstract: Research on the multimodal emotion detection of tennis players is of great significance for understanding their psychological state and improving technical performance. Traditional detection methods suffer from high detection error and low recall rate. Therefore, a multimodal emotion detection method for tennis players based on deep reinforcement learning has been designed. The facial expressions, speech emotion signals, and physical behaviour emotional feature parameters of tennis players are extracted, and the obtained emotional feature parameters are used as input vectors for a multimodal emotion detection model based on deep reinforcement learning. The high dimensionality of the multimodal emotion parameters is addressed through the value function of reinforcement learning, and the model outputs the multimodal emotion detection results. The experimental results demonstrate that the proposed method yields a low detection error and a high recall rate.
    Keywords: deep reinforcement learning; tennis players; multimodal emotion detection; facial expression; voice emotion signal; body behaviour emotion.
    DOI: 10.1504/IJBM.2024.10061499
     
  • Athlete facial micro-expression recognition method based on graph convolutional neural network   Order a copy of this article
    by HaoChen Xu, ZhiQiang Zhu 
    Abstract: The recognition accuracy of athlete facial micro-expressions is low because invalid data is not removed from the recognition data and micro-expression features are extracted inaccurately. To this end, a new method for athlete facial micro-expression recognition based on graph convolutional neural networks is studied. Firstly, the athlete's face data is preprocessed using facial alignment, frame unification, and optical flow extraction algorithms. Then, the graph convolutional neural network is used to extract athlete facial micro-expression features. Finally, to improve micro-expression recognition performance, a classification layer is added before the output layer of the network, and a support vector machine algorithm is introduced to optimise the graph convolutional neural network and adjust the discriminative boundaries between categories, achieving more accurate and effective micro-expression recognition. The experimental results show that the proposed method can accurately extract micro-expression features, with a recognition accuracy of 97.0% and fast convergence, effectively improving the recognition effect.
    Keywords: graph convolutional neural network; facial micro-expression; support vector machine; optical flow extraction algorithm; unified frame.
    DOI: 10.1504/IJBM.2024.10061500
     
  • GenVeins: An Artificially Generated Hand Vein Database   Order a copy of this article
    by Emile Beukes, Hanno Coetzer 
    Abstract: An artificially generated dorsal hand vein database called "GenVeins" (see Beukes (2023)) is developed in this study to acquire sets of fictitious training and validation individuals large enough to represent the entire population. The development of this database is motivated by experimental results indicating that system proficiency is severely impaired when training on an insufficient number of different individuals. Several dorsal hand vein-based authentication systems are proposed in order to determine whether the GenVeins database improves proficiency compared to training and validating on small sets of different individuals. The results clearly indicate that using GenVeins significantly increases system proficiency over the scenario in which an insufficient number of different individuals is used for training and validation.
    Keywords: biometric authentication; hand vein; deep learning; similarity measure networks; siamese networks; two-channel networks; segmentation; artificial data; convolutional neural networks.
    DOI: 10.1504/IJBM.2024.10062266
     
  • An Intelligent Approach to Detect Facial Retouching using Fine Tuned VGG16   Order a copy of this article
    by Kinal Sheth 
    Abstract: It is common practice to digitally edit or 'retouch' facial images for various purposes, such as enhancing one's appearance on social media, matrimonial sites, or even in documents used as authentic proof. When regulations are not strictly enforced, it becomes easy to manipulate digital data, as editing tools are readily available. In this paper, we apply a transfer learning approach by fine-tuning a pre-trained VGG16 model with ImageNet weights to classify the retouched face images of the standard ND-IIITD faces dataset. Furthermore, this study places a strong emphasis on the selection of optimisers employed during both the training and fine-tuning stages of the model to achieve quicker convergence and better overall performance. Our work achieves impressive results, with a training accuracy of 99.54% and a validation accuracy of 98.98% for the TL-VGG16 model with the RMSprop optimiser. Moreover, it attains an overall accuracy of 97.92% in the two-class (real vs. retouched) classification on the ND-IIITD dataset.
    Keywords: Adam; retouching; RMSprop; transfer learning; TL; VGG16.
    DOI: 10.1504/IJBM.2024.10062315
     
  • Multi-pose face recognition method based on improved depth residual network   Order a copy of this article
    by Feigang Tan, Yi Tang, Jiaojun Yi 
    Abstract: Multi-pose face recognition methods can reduce the interference of pose changes on facial characteristics by analysing the pose change. In order to improve the accuracy of multi-pose face recognition and shorten the recognition time, a multi-pose face recognition method based on an improved depth residual network is proposed. The multi-pose face image is transformed logarithmically, and the face image is enhanced by a homomorphic filtering algorithm. A spatial transformation network is introduced to improve the depth residual network model, and the enhanced face image is input into the improved model. Through the calculation of the loss function and the update of gradient parameters, multi-pose face image recognition is completed. The experimental results show that this method has a strong ability to enhance multi-pose face images, can effectively recognise them, and has high recognition accuracy. When the occlusion is 30%, the face recognition accuracy reaches 0.989.
    Keywords: improved depth residual network; multi-pose; face recognition; image enhancement; Softmax regression model.
    DOI: 10.1504/IJBM.2024.10063084
     
  • A method of badminton video motion recognition based on adaptive enhanced AdaBoost algorithm   Order a copy of this article
    by YunTao Chang 
    Abstract: To overcome the low recognition accuracy, poor recall, and long recognition time of traditional badminton video action recognition methods, a badminton video action recognition method based on an adaptive enhanced AdaBoost algorithm is proposed. Firstly, badminton actions are collected through inertial sensors and badminton action videos are captured to construct an action dataset. The data in this dataset is normalised, and the badminton video action features are then extracted and fused using the weighted fusion method. Finally, based on the fused action features, a badminton video action classifier is constructed using the adaptive enhanced AdaBoost algorithm, and the badminton video action recognition results are output through the classifier. The experimental results show that the proposed method performs well in recognising badminton video actions.
    Keywords: inertial sensor; weighted fusion method; AdaBoost algorithm; motion recognition; data standardisation.
    DOI: 10.1504/IJBM.2024.10063377
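The boosting stage can be illustrated with a minimal discrete AdaBoost over decision stumps in numpy. This is a generic AdaBoost sketch, not the paper's adaptive enhancement; the 1-D interval data is a toy stand-in for badminton action features.

```python
import numpy as np

def stump_predict(X, feat, thresh, sign):
    return sign * np.where(X[:, feat] <= thresh, 1, -1)

def adaboost_fit(X, y, rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start with uniform sample weights
    model = []
    for _ in range(rounds):
        best = None
        for feat in range(X.shape[1]):           # exhaustive search over stumps
            for thresh in np.unique(X[:, feat]):
                for sign in (1, -1):
                    err = w[stump_predict(X, feat, thresh, sign) != y].sum()
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, sign)
        err, feat, thresh, sign = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # weight of this weak learner
        w *= np.exp(-alpha * y * stump_predict(X, feat, thresh, sign))
        w /= w.sum()                             # up-weight the misclassified samples
        model.append((alpha, feat, thresh, sign))
    return model

def adaboost_predict(model, X):
    agg = sum(a * stump_predict(X, f, t, s) for a, f, t, s in model)
    return np.where(agg >= 0, 1, -1)

# an interval label pattern that no single stump can fit
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([-1, -1, 1, 1, -1, -1])
model = adaboost_fit(X, y)
```

The ensemble combines several weak thresholds into a classifier that fits the interval exactly, which is the essence of boosting.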
     
  • Motion recognition of football players based on deformable convolutional neural networks   Order a copy of this article
    by Lingqiang Xuan, Di Zhang 
    Abstract: In order to improve the accuracy of football player action recognition and the number of frames transmitted per second, a football player action recognition method based on deformable convolutional neural network is proposed. Firstly, the action images of football players are collected through binocular vision, and distortion correction and disparity calculation are performed on the images to improve their quality. Secondly, based on the collected athlete action images, the receptive field of the action images is calculated in two-dimensional convolution to extract football player action features. Finally, the extracted action features are input into the support vector machine to construct the optimal classification plane and complete the recognition of football player actions. The experimental results show that the action recognition accuracy of our method can reach up to 99.3%, and the transmission speed of our method is always stable at 24 frames per second or above.
    Keywords: deformable convolutional neural network; CNN; football players; action recognition; binocular vision.
    DOI: 10.1504/IJBM.2024.10063378
     
  • Basketball player action recognition based on improved LSTM neural network   Order a copy of this article
    by Xudong Yang  
    Abstract: In order to improve the IoU value and accuracy of basketball player action recognition methods, this paper proposes a basketball player action recognition method based on an improved LSTM neural network. Firstly, a coordinate system is established in the vision system and appropriate sequence transformations are performed on the collected basketball player action images to complete image acquisition. Next, a Kalman filter is used to filter the collected action images. Finally, the LSTM neural network unit is improved by introducing two sigmoid gating units. Using the filtered action images as input and the action recognition results as output, the improved LSTM neural network is used to construct an action recognition model and obtain the recognition results. The experimental results show that the proposed method achieves significant improvements in IoU value and accuracy, with the highest recognition accuracy reaching 98.26%.
    Keywords: improving LSTM neural network; basketball players; action recognition.
    DOI: 10.1504/IJBM.2024.10063379
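The gated update that such an improvement builds on is the standard LSTM cell, sketched here in numpy with input, forget and output gates. This is the textbook cell, not the authors' modified unit; the weights are random placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step: input (i), forget (f), output (o) gates plus candidate (g)."""
    z = W @ np.concatenate([x, h]) + b          # all four gates in one matmul
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c + i * np.tanh(g)                  # new cell state
    h = o * np.tanh(c)                          # new hidden state, |h| < 1
    return h, c

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4
W = rng.normal(0, 0.1, (4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h = c = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):            # run a length-5 feature sequence
    h, c = lstm_step(x, h, c, W, b)
```

In the paper's setting, `x` would be per-frame action features and the final `h` would feed the recognition head.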
     
  • Facial expression recognition method based on multi-level feature fusion of high-resolution images   Order a copy of this article
    by Li Wan, Wenzhi Cheng 
    Abstract: To improve the accuracy of facial expression recognition, this paper designs a facial expression recognition method based on multi-level feature fusion of high-resolution images. Firstly, the noise and texture in the facial image are smoothed and the image is enhanced. Secondly, multi-level features of the facial image are extracted and then fused through reverse solving. Next, the attributes of different facial regions are extracted and assigned to the corresponding representation data. Decoupled facial expression data is extracted based on the feature fusion results, and the decoupled representation is compared with the representation data to complete facial expression recognition. The experiments show that the geometric mean of the recognition results obtained by this method is between 0.963 and 0.989, and the similarity of the feature vectors is between 0.972 and 0.988, indicating that this method can accurately output facial expression recognition results.
    Keywords: facial images; expression recognition; high resolution images; multi-level feature fusion.
    DOI: 10.1504/IJBM.2024.10063380
     
  • A method for identifying foul actions of athletes based on multimodal perception   Order a copy of this article
    by Jiuying Hu 
    Abstract: In order to improve the recall rate and accuracy of foul action recognition for track and field athletes and to solve the problem of poor foul action classification, this study designs a foul action recognition method for track and field athletes based on multimodal perception. Firstly, a foul action dataset of track and field athletes is constructed. Then, the wavelet denoising method is used to remove noise from the athletes' movement images. Finally, a foul action recognition function is established by means of multimodal perception and trained with a bidirectional ranking loss, and the similarity between skeleton and video matching is calculated to obtain the final foul action recognition results. The experimental results show that the identification accuracy is 98.5%, the classification accuracy is 98.6%, the recall rate is 99.2%, the recognition sensitivity is high, and the application effect is good.
    Keywords: multimodal perception; athletes; identification of foul actions; bidirectional ranking loss.
    DOI: 10.1504/IJBM.2024.10063381
     
  • Character emotion recognition algorithm in small sample video based on multimodal feature fusion   Order a copy of this article
    by Jian Xie, Dan Chu 
    Abstract: To overcome the low accuracy and poor precision of traditional character emotion recognition algorithms, this paper proposes a small sample video character emotion recognition algorithm based on multimodal feature fusion. The algorithm extracts facial image scene features and expression features from small sample videos, uses GloVe technology to extract text features, and obtains character speech features through filter banks. Subsequently, a bidirectional LSTM model is used to fuse the multimodal features, and emotions are classified using fully connected layers and a softmax function. The experimental results show that the method achieves an emotion recognition accuracy of up to 98.6%, with a recognition rate of 64% for happy emotions and 62% for neutral emotions.
    Keywords: multimodal feature fusion; bidirectional LSTM model; attention mechanism; softmax function.
    DOI: 10.1504/IJBM.2024.10063382
     
  • Fine-grained emotional intelligent recognition method for athletes based on multi-physiological information fusion   Order a copy of this article
    by Dong Guo 
    Abstract: To solve the problems of low accuracy in collecting multiple physiological information, low recognition rate of fine-grained emotions, and long recognition time in traditional recognition methods, a fine-grained emotional intelligent recognition method for athletes based on multi-physiological information fusion is proposed. Various physiological information of athletes is collected using ECG, EMG, EDA, and airflow sensors to acquire signals such as electrocardiogram, electromyogram, skin conductance, and respiration. The collected information is denoised, and the denoised information is then fused using the Bayesian method. Fuzzy neural networks are used to extract fine-grained emotional characteristics of athletes, and the fine-grained emotion recognition results are obtained in combination with base classifiers. Experimental results show that the average accuracy of multi-physiological information collection with the proposed method is 97.2%, the average recognition rate is 97.5%, and the average recognition time is 1.41 s.
    Keywords: multi-physiological information fusion; athletes; fine-grained emotional intelligent recognition; Bayesian method; fuzzy neural networks; base classifiers.
    DOI: 10.1504/IJBM.2024.10063383
     

Special Issue on: Advanced Bio Inspired Algorithms for Biometrics

  • Identity authentication model from continuous keystroke pattern using CSO and LSTM network   Order a copy of this article
    by Anurag Tewari, Prabhat Verma 
    Abstract: Continuous verification of a user's authenticity is widely valued, since a one-time authentication system can be compromised after login. In this research work, an optimisation-based deep learning network model, namely cuckoo search optimisation-based long short-term memory (CSO-LSTM), is proposed to effectively learn the keystroke pattern of the user. The CSO algorithm is used to optimise the weight parameters of the long short-term memory (LSTM) network through an evolutionary process. As the network weight parameters are optimised, the learning mechanism acquires a better prediction rate than existing techniques. Two datasets, namely Clarkson II and Buffalo, were utilised to evaluate the proposed model. The model is evaluated with different numbers of neurons and varied keystroke lengths to assess its scalability as the dataset size increases.
    Keywords: authentication mechanism; continuous authentication; cuckoo search; keystroke recognition; optimisation.
    DOI: 10.1504/IJBM.2024.10061225
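The cuckoo search component can be sketched with Lévy flights in numpy. A minimal illustration only: the toy quadratic objective stands in for the LSTM weight fitness the paper optimises, and the parameter values are placeholders.

```python
import numpy as np
from math import gamma, sin, pi

rng = np.random.default_rng(0)

def levy(dim, beta=1.5):
    """Heavy-tailed Levy-flight step (Mantegna's algorithm)."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(f, dim=4, nests=15, iters=300, pa=0.25, alpha=0.1):
    pop = rng.uniform(-5, 5, (nests, dim))
    fit = np.array([f(x) for x in pop])
    for _ in range(iters):
        for i in range(nests):                    # generate cuckoos via Levy flights
            cand = pop[i] + alpha * levy(dim)
            j = rng.integers(nests)               # compare against a random nest
            if f(cand) < fit[j]:
                pop[j], fit[j] = cand, f(cand)
        worst = np.argsort(fit)[-max(1, int(pa * nests)):]
        pop[worst] = rng.uniform(-5, 5, (len(worst), dim))   # abandon worst nests
        fit[worst] = [f(x) for x in pop[worst]]
    i = int(fit.argmin())
    return pop[i], float(fit[i])

best, val = cuckoo_search(lambda x: float((x ** 2).sum()))   # toy stand-in objective
```

Replacing the toy objective with LSTM validation loss over candidate weight vectors gives the CSO-LSTM training loop described above.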
     
  • Offline handwritten signature recognition based on generative adversarial networks   Order a copy of this article
    by Xiaoguang Jiang 
    Abstract: In order to shorten the time for offline handwritten signature recognition and reduce the probability of false positives, an offline handwritten signature recognition method based on generative adversarial networks is proposed. Firstly, pen pressure, pen tilt angle, pen azimuth angle, and multi-level velocity moment are selected as the main dynamic features of offline handwritten signatures, and the Pearson correlation coefficients of these dynamic features are calculated. Secondly, multiple features are calculated and summed to complete the dynamic feature selection and fusion of offline handwritten signatures. Finally, using the fused dynamic feature data as input and the offline handwritten signature recognition results as output, a generative adversarial network model is constructed to recognise offline handwritten signatures. Experimental results show that this method can recognise 200 offline handwritten signatures in 0.66 seconds, with a false rejection rate and false acceptance rate of only 1%, and a recognition accuracy of 95%.
    Keywords: dynamic features; offline handwriting; signature recognition; Pearson coefficient; adversarial neural network; convolutional neural network; CNN.
    DOI: 10.1504/IJBM.2024.10059894
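The Pearson correlation step can be written directly in numpy. A minimal sketch; `pearson` is an illustrative helper and the two feature vectors are made-up stand-ins for the signature dynamics.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two feature vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# e.g. pen pressure vs. a linearly related velocity feature
pressure = np.array([0.2, 0.5, 0.9, 1.3, 1.8])
velocity = 2.0 * pressure + 1.0
```

Features whose pairwise coefficients are near ±1 are redundant, so one of each highly correlated pair can be dropped before fusion.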
     
  • A method for recognising wrong actions of martial arts athletes based on keyframe extraction   Order a copy of this article
    by Zhiqiang Li 
    Abstract: In order to improve the accuracy of incorrect action recognition and shorten the time required for action recognition, this paper proposes a method for recognising incorrect actions of martial arts athletes based on keyframe extraction. Firstly, the optical flow method is used to filter the key frames of actions, and the shot-adaptive K-means clustering algorithm is used to extract the texture features of image frames. Secondly, Euclidean distance is used to calculate the distances between cluster centres and complete the initial selection of keyframes. Finally, the sequence positions and video frame rate of the initially selected keyframes are optimised to obtain the final keyframe sequence numbers and output the incorrect action recognition results. The experimental results show that the incorrect action recognition accuracy of this method is 96.58%, the recognition error is 1.9%, and the recognition time is 11 seconds.
    Keywords: keyframe extraction; optical flow method; shot adaptation; K-means clustering algorithm; Euclidean distance.
    DOI: 10.1504/IJBM.2024.10059895
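The keyframe selection idea (cluster frame features with K-means, then keep the frame nearest each centre by Euclidean distance) can be sketched as follows. An illustrative sketch under these assumptions, not the paper's shot-adaptive variant; all names are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centres[None], axis=2)   # Euclidean distances
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centres[j] for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return centres, labels

def keyframes(features, k):
    """Pick, per cluster, the frame closest to its cluster centre."""
    centres, labels = kmeans(features, k)
    picks = []
    for j in range(len(centres)):
        idx = np.where(labels == j)[0]
        picks.append(int(idx[np.linalg.norm(features[idx] - centres[j], axis=1).argmin()]))
    return sorted(picks)

# two well-separated groups of frame features -> one keyframe from each
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
```

On the toy features, the method returns one representative frame index per visually distinct segment.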
     
  • Speech endpoint detection method based on logarithmic energy entropy product of adaptive sub-bands in low signal-to-noise ratio environments   Order a copy of this article
    by MingHui Zhu, Peng-Cheng Huang, JiaYong Zhang 
    Abstract: In this paper, a detection method based on the logarithmic energy entropy product of adaptive sub-bands is designed. After the speech signal is divided into frames and transformed by FFT, the probability that a speech signal is present is analysed according to the ratio of the minimum of the local energy spectrum to the short-term energy spectrum. After the noise is suppressed according to the normal distribution of Gaussian noise, the logarithmic energy entropy product of the adaptive sub-bands is calculated. Using the calculated result as a threshold, the logarithmic energy spectral ratio of the current speech frame is compared with the threshold, and Bayesian classification is used to detect speech endpoints. Experiments show that the detection accuracy of this method is always higher than 94.4%, with an accuracy variance between 0.055 and 0.072, effectively achieving the design expectations.
    Keywords: voice signal; signal-to-noise ratio; SNR; voice endpoint; short time energy spectrum value; denoising; sub-bands logarithmic energy entropy product; accuracy.
    DOI: 10.1504/IJBM.2024.10059893
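The framing and energy-entropy feature can be sketched in numpy. A simplified illustration of this style of feature, not the authors' exact adaptive sub-band formulation; the frame/hop sizes, four fixed sub-bands, and the noise-burst test signal are all placeholder assumptions.

```python
import numpy as np

def frame_signal(x, frame=160, hop=80):
    n = 1 + (len(x) - frame) // hop
    return np.stack([x[i * hop:i * hop + frame] for i in range(n)])

def log_energy_entropy_product(frames, bands=4):
    """Per frame: log energy times spectral entropy over sub-bands."""
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    sub = np.stack([s.sum(axis=1) for s in np.array_split(spec, bands, axis=1)], axis=1)
    p = sub / (sub.sum(axis=1, keepdims=True) + 1e-12)      # sub-band probabilities
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    log_energy = np.log1p(frames.var(axis=1))
    return log_energy * entropy

# silence | noise burst (stand-in for speech) | silence
rng = np.random.default_rng(0)
sig = np.concatenate([np.zeros(800), rng.normal(0, 1, 800), np.zeros(800)])
feat = log_energy_entropy_product(frame_signal(sig))
speech = feat > 0.5 * feat.max()       # simple threshold to mark active frames
```

Frames where both energy and spectral spread are high score above the threshold, giving candidate endpoints at the transitions of the `speech` mask.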
     
  • A sparse representation-based local occlusion recognition method for athlete expressions   Order a copy of this article
    by Shaowu Huang 
    Abstract: A sparse representation-based local occlusion recognition method for athlete expressions is proposed to address the problems of large mean square error, low recall rate, and poor recognition performance. We calculate the gradient direction and magnitude of image pixels, divide the image into blocks, compute the histogram of gradient directions for each block, combine all the small histograms into a feature vector, and obtain the facial feature extraction results. The LBP algorithm is used for local occlusion image segmentation, and a sparse representation model is established to extract expression features. By dividing the image into blocks and solving the sparse representation coefficients of each block, local occlusion expression recognition is achieved. Experimental results show that the maximum mean squared error of the proposed method for facial expression recognition is only 0.21 and the recall rate exceeds 80%, showing that it can effectively recognise occluded parts.
    Keywords: sparse representation; localised occlusion of facial expressions; HOG; LBP algorithm; feature extraction.
    DOI: 10.1504/IJBM.2024.10059891
     
  • Recognition of starting movement correction for long distance runners based on human key point detection   Order a copy of this article
    by Xia Zhu 
    Abstract: In order to improve the accuracy and effectiveness of starting movement correction recognition for long-distance runners, a method for recognising starting movement corrections based on human key point detection is proposed. A sparse sampling method is adopted to collect and process starting action data, and an STI module is introduced to extract and fuse data features. A human key point detection network based on collaborative spatiotemporal attention is constructed, dynamic gradient information of the input data is collected, and the collaborative spatiotemporal attention module is used to obtain all joint point information to recognise the starting movements of long-distance runners. The results show that the proposed method has a recognition accuracy of over 96%, a root mean square error held at 0.01, and a recognition time of 1.8 seconds, indicating that it can correct and recognise the starting movements of long-distance runners.
    Keywords: key points of the human body; starting movement; corrective identification; STI module; convolutional algorithm for central difference graph.
    DOI: 10.1504/IJBM.2024.10059892
     
  • Tennis players' hitting action recognition method based on multimodal data   Order a copy of this article
    by Song Liu 
    Abstract: In order to improve the recognition accuracy of hitting movements, a tennis player hitting movement recognition method based on multimodal data is proposed. First, we collect acceleration modal data of hitting movements and extract acceleration characteristics of hitting movements. Then, we collect deep modal data of hitting movements and extract deep optical flow features of hitting movements. Finally, we collect RGB modal images of hitting movements, and use recurrent neural networks to extract RGB features of hitting movements. The canonical correlation analysis method is selected to fuse the acceleration characteristics, depth optical flow characteristics and RGB characteristics of tennis players' hitting movements. The feature fusion result is taken as the input of the spatiotemporal convolutional neural network, and the spatiotemporal convolutional neural network is used to output the tennis player's stroke action recognition result. The experimental results show that this method effectively recognises tennis players' hitting movements, with an accuracy of over 99%.
    Keywords: multimodal data; tennis players; stroke action; recognition method; acceleration; depth optical flow features.
    DOI: 10.1504/IJBM.2024.10059890
     
  • Chinese named entity recognition method based on multiscale feature fusion   Order a copy of this article
    by Xiaoguang Jiang 
    Abstract: To address the low recognition accuracy and poor efficiency of traditional methods, this paper proposes a Chinese named entity recognition method based on multiscale feature fusion. Firstly, the similarity between words is calculated with a literal similarity algorithm to obtain synonyms of Chinese named entities. Then, the Chinese named entity features, including character features, character shape features, binary character features and word similarity features, are combined through multiscale feature fusion to obtain the Chinese named entity feature set. Finally, the target entity is obtained by matching vocabulary, compressing vocabulary vectors and integrating character vectors, and a CRF is used to perform the recognition. The experimental results show that the recognition time of this method is only 4.0 s, with a precision of up to 99.9% and a recall of up to 99.2%.
    Keywords: multiscale feature fusion; similarity; CRF; literal similarity algorithm.
    DOI: 10.1504/IJBM.2024.10060537
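    The CRF step in the abstract above boils down to decoding the most likely tag sequence. A minimal sketch of that decoding (Viterbi over BIO tags) is shown below; the emission and transition scores are hypothetical stand-ins for learned CRF potentials, not values from the paper.

    ```python
    # Viterbi decoding as used in CRF-based sequence labelling.
    # Scores here are illustrative, not learned parameters.

    def viterbi(emissions, transition, tags):
        """emissions: list of {tag: score} per token; transition: {(prev, cur): score}."""
        # Initialise with the first token's emission scores.
        best = {t: (emissions[0][t], [t]) for t in tags}
        for em in emissions[1:]:
            new_best = {}
            for cur in tags:
                # Pick the previous tag maximising path score + transition.
                prev, (score, path) = max(
                    best.items(),
                    key=lambda kv: kv[1][0] + transition.get((kv[0], cur), 0.0),
                )
                new_best[cur] = (score + transition.get((prev, cur), 0.0) + em[cur],
                                 path + [cur])
            best = new_best
        return max(best.values(), key=lambda sp: sp[0])[1]

    tags = ["B", "I", "O"]
    emissions = [{"B": 2.0, "I": 0.1, "O": 0.5},
                 {"B": 0.2, "I": 1.8, "O": 0.4},
                 {"B": 0.3, "I": 0.2, "O": 1.5}]
    transition = {("B", "I"): 1.0, ("O", "B"): 0.5, ("I", "O"): 0.5, ("O", "I"): -2.0}
    print(viterbi(emissions, transition, tags))  # prints ['B', 'I', 'O']
    ```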
     
  • An online learning behaviour recognition method based on tag set correlation learning   Order a copy of this article
    by Ruijing Ma 
    Abstract: To address the poor loss-function fit and low recognition confidence of existing online learning behaviour recognition methods, an online learning behaviour recognition method based on tag set correlation learning is proposed. Firstly, learners' online learning behaviour is analysed, and behaviour data are extracted through convolutional layer models. Then, a Gaussian mixture model is used to extract feature data, which are preprocessed with the EM algorithm. Finally, the label set correlation learning method assigns a label rating to each item of behaviour data; after normalisation, the correlation between the rating and the behaviour sample is judged to complete the recognition. The results show that the loss function value of the proposed method approaches 0, indicating a good fit, and that recognition confidence reaches 98%, demonstrating a better recognition effect.
    Keywords: online learning; learning behaviour recognition; Gaussian mixture model; EM algorithm; label set correlation learning.
    DOI: 10.1504/IJBM.2024.10060536
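    The "Gaussian mixture model plus EM preprocessing" step described above can be sketched in miniature. The following is a minimal EM loop for a two-component 1-D mixture on synthetic stand-in data; the initialisation and iteration count are illustrative assumptions.

    ```python
    import math
    import random

    # Minimal EM for a two-component 1-D Gaussian mixture,
    # sketching the GMM feature-modelling step on synthetic data.

    def em_gmm(data, iters=50):
        # Initialise means at the data extremes, equal weights, unit variances.
        mu = [min(data), max(data)]
        var = [1.0, 1.0]
        pi = [0.5, 0.5]
        for _ in range(iters):
            # E-step: responsibility of each component for each point.
            resp = []
            for x in data:
                p = [pi[k] / math.sqrt(2 * math.pi * var[k])
                     * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
                s = sum(p)
                resp.append([pk / s for pk in p])
            # M-step: re-estimate weights, means and variances.
            for k in range(2):
                nk = sum(r[k] for r in resp)
                pi[k] = nk / len(data)
                mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
                var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk + 1e-6
        return mu, var, pi

    random.seed(0)
    data = [random.gauss(0.0, 0.5) for _ in range(200)] + \
           [random.gauss(5.0, 0.5) for _ in range(200)]
    mu, var, pi = em_gmm(data)
    print(mu)  # means recovered near 0.0 and 5.0
    ```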
     
  • Accurate facial expression recognition method based on perceptual hash algorithm   Order a copy of this article
    by Yang Yang 
    Abstract: To improve recognition accuracy, a precise facial expression recognition method based on a perceptual hash algorithm is proposed. Firstly, the single-scale Retinex algorithm is used to enhance facial expression images: each image is divided into high-frequency and low-frequency parts through curvature change decomposition and enhanced after filtering. Secondly, a two-dimensional principal component analysis network is combined with a perceptual hash algorithm based on a simplified Watson model to extract image features. Finally, the features are added to a hash table, and the nearest neighbour of the expression to be recognised is determined from the distance between their hash features, achieving accurate facial expression recognition. The experimental results show that the recognition accuracy of this method reaches over 95%, indicating a good recognition effect.
    Keywords: perceptual hash algorithm; facial expression recognition; image enhancement; feature extraction; hash table.
    DOI: 10.1504/IJBM.2024.10060538
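    The hash-table lookup step in the abstract above can be illustrated with a toy example. The sketch below substitutes a simple average hash for the paper's Watson-model perceptual hash and uses Hamming distance for the nearest-neighbour test; the tiny pixel grids and labels are invented for illustration.

    ```python
    # Average hash (a stand-in for a perceptual hash) plus
    # Hamming-distance nearest-neighbour lookup in a hash table.

    def average_hash(pixels):
        """Bit is 1 where a pixel is above the mean intensity."""
        mean = sum(pixels) / len(pixels)
        return tuple(1 if p > mean else 0 for p in pixels)

    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    def nearest(query, table):
        """Return the label whose stored hash is closest to the query hash."""
        return min(table, key=lambda label: hamming(table[label], query))

    table = {
        "neutral": average_hash([10, 12, 11, 10, 200, 10, 11, 12, 10]),
        "smile":   average_hash([10, 200, 10, 200, 10, 200, 10, 200, 10]),
    }
    probe = average_hash([11, 13, 10, 12, 190, 12, 10, 13, 11])
    print(nearest(probe, table))  # prints "neutral"
    ```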
     
  • Multi-modal human motion recognition based on behaviour tree   Order a copy of this article
    by Qin Yang, Zhenhua Zhou 
    Abstract: Since existing methods are inefficient and inaccurate for complex multimodal human motion recognition, this paper studies a multimodal human motion recognition method based on a behaviour tree. Firstly, a Kinect sensor is used to collect multimodal human motion data, and a convolutional neural network is used to denoise the collected data. Wavelet packet decomposition is then applied to the denoised data to extract features. Finally, based on the extracted features, a behaviour tree model is constructed; the tree is traversed and the motion is recognised according to the degree of feature matching, achieving accurate and efficient multimodal human motion recognition. The experimental results show that the recognition accuracy of the proposed method reaches 98%, the highest recall is 96%, the highest F1 is 0.97, and the longest recognition time is only 4.65 seconds, indicating that the method is highly practical.
    Keywords: Kinect sensor; behaviour tree; convolutional neural network; multimodal human motion recognition.
    DOI: 10.1504/IJBM.2024.10060861
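    The traversal-and-matching step above can be sketched as a small behaviour tree: a selector ticks leaf nodes in order, and each leaf succeeds when the extracted features match its motion template closely enough. The node types, templates, threshold and similarity measure below are illustrative assumptions, not the paper's model.

    ```python
    # Behaviour-tree traversal where leaves test feature-match degree.

    def similarity(features, template):
        """Mean absolute difference mapped into (0, 1]; 1 means identical."""
        diff = sum(abs(f - t) for f, t in zip(features, template)) / len(template)
        return 1.0 / (1.0 + diff)

    class MatchLeaf:
        def __init__(self, label, template, threshold=0.8):
            self.label, self.template, self.threshold = label, template, threshold
        def tick(self, features):
            # "Success" when the match degree exceeds the threshold.
            if similarity(features, self.template) >= self.threshold:
                return self.label
            return None

    class Selector:
        """Ticks children in order; returns the first successful result."""
        def __init__(self, children):
            self.children = children
        def tick(self, features):
            for child in self.children:
                result = child.tick(features)
                if result is not None:
                    return result
            return "unknown"

    tree = Selector([
        MatchLeaf("walk", [0.2, 0.1, 0.9]),
        MatchLeaf("run",  [0.9, 0.8, 0.3]),
        MatchLeaf("jump", [0.5, 0.9, 0.1]),
    ])
    print(tree.tick([0.85, 0.82, 0.35]))  # closest to the "run" template
    ```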
     
  • Classification of visual attention by microsaccades using machine learning   Order a copy of this article
    by Soichiro Yokoo, Nobuyuki Nishiuchi, Kimihiro Yamanaka 
    Abstract: This paper proposes machine learning methods for classifying visual attention. Eye-tracking data contains a range of useful information related to human visual behaviour. In particular, many recent studies have shown a relationship between visual attention and microsaccades, a type of fixational eye movement. In this study, eye movement and pupil diameter were measured under three controlled experimental conditions requiring different visual attention levels. Microsaccades were extracted from eye-tracking data that included rapid saccades. Various machine learning methods were then used on parameters related to the extracted microsaccades to classify the level of visual attention. By cross-validating data from one participant (test data) with that from other participants (training data), we showed that the support vector machine method had the highest correct discrimination rate (77.1%). These results suggest that it is possible to classify visual attention based on microsaccades.
    Keywords: microsaccade; machine learning; visual attention; pupil diameter; eye-tracking.
    DOI: 10.1504/IJBM.2024.10060309
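    The microsaccade-extraction step described above is commonly done by thresholding eye velocity and grouping consecutive supra-threshold samples into events. A minimal sketch under assumed values (sampling rate, velocity threshold, minimum duration — all illustrative, not the study's parameters):

    ```python
    import math

    FS = 1000.0           # sampling rate in Hz (assumed)
    VEL_THRESHOLD = 30.0  # deg/s, illustrative detection threshold
    MIN_SAMPLES = 3       # minimum event duration in samples (assumed)

    def detect_microsaccades(x, y):
        """x, y: gaze position in degrees, one sample per 1/FS seconds."""
        # Two-point velocity estimate between consecutive samples.
        vel = [math.hypot((x[i + 1] - x[i]) * FS, (y[i + 1] - y[i]) * FS)
               for i in range(len(x) - 1)]
        events, start = [], None
        for i, v in enumerate(vel):
            if v > VEL_THRESHOLD and start is None:
                start = i
            elif v <= VEL_THRESHOLD and start is not None:
                if i - start >= MIN_SAMPLES:
                    events.append((start, i))
                start = None
        return events

    # 100 ms of near-stationary gaze with one small, fast excursion.
    x = [0.0] * 40 + [0.05 * k for k in range(1, 6)] + [0.25] * 55
    y = [0.0] * 100
    print(detect_microsaccades(x, y))  # one event spanning the excursion
    ```

    In the study itself, parameters of each extracted event (e.g. amplitude, peak velocity, rate) would then feed the machine-learning classifiers.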