Title: Research on facial dataset cleaning in mixed scenes based on spatiotemporal correlation

Authors: Siguang Dai

Addresses: School of Management, Hubei University of Education, Wuhan, 430205, China

Abstract: Cleaning facial datasets collected in mixed scenes can improve the performance and reliability of facial recognition algorithms for such scenes. This paper therefore proposes a facial dataset cleaning method for mixed scenes based on spatiotemporal correlation. The 2DPCA algorithm is used to reduce the dimensionality of the dataset, and composite multi-scale entropy is used to decompose, reconstruct and arrange the dimensionality-reduced image sequences. The autocorrelation coefficient and the number of interrelations between image sequences are determined, and anomaly detection on the dataset is realised by combining spatiotemporal correlation. Sparse representation is used to repair the abnormal images, and images with high similarity are deleted to clean the mixed scene facial dataset. Experimental results show that the proposed method achieves a minimum anomaly rate of 0.5%, a success rate between 94% and 96%, and a minimum time cost of 0.2 s.
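
Illustrative sketch (not from the paper): the dimensionality reduction step named in the abstract can be pictured with a minimal, textbook 2DPCA in NumPy. The function names fit_2dpca and transform_2dpca, the array shapes, and the choice of n_components are assumptions made here for illustration; the paper's actual implementation and parameters are not given in the abstract.

```python
import numpy as np

def fit_2dpca(images, n_components):
    """Fit a basic 2DPCA projection (hypothetical helper).

    images: array of shape (M, m, n), M grayscale images of size m x n.
    Returns the mean image (m x n) and a projection matrix (n x n_components).
    """
    images = np.asarray(images, dtype=float)
    mean_image = images.mean(axis=0)
    centered = images - mean_image
    # Image covariance matrix G = (1/M) * sum_k (A_k - mean)^T (A_k - mean), shape (n, n).
    G = np.einsum("kij,kil->jl", centered, centered) / images.shape[0]
    # G is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean_image, eigvecs[:, order]

def transform_2dpca(images, projection):
    """Project each m x n image onto the 2DPCA axes, giving m x n_components features."""
    return np.asarray(images, dtype=float) @ projection
```

Under these assumptions, a stack of M images of size m x n is reduced to M feature matrices of size m x n_components, which later steps could then treat as the reduced image sequence.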

Keywords: spatiotemporal correlation; mixed scenes; facial dataset; dataset cleaning; 2DPCA algorithm; composite multi-scale entropy; sparse representation.

DOI: 10.1504/IJDMB.2025.142970

International Journal of Data Mining and Bioinformatics, 2025 Vol.29 No.1/2, pp.51 - 66

Received: 17 May 2023
Accepted: 26 Oct 2023

Published online: 02 Dec 2024
