Forthcoming and Online First Articles

International Journal of Business Intelligence and Data Mining (IJBIDM)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with the Open Access icon are Online First articles. They are freely available and openly accessible to all, with no restrictions other than those stated in their respective CC licenses.

International Journal of Business Intelligence and Data Mining (22 papers in press)

Special Issue on: Deep Learning Technology and Big Data Method for Business Intelligence and Management

  • Learning from high-dimensional unlabelled data with outliers: a novel robust approach
    by Abdul Wahid 
    Abstract: This paper investigates the problem of feature selection and classification in the presence of multivariate outliers in high-dimensional unlabeled data. The research question is how to identify outliers and deal with them in unsupervised learning to improve the clustering accuracy compared with the state-of-the-art non-robust feature selection techniques. For this purpose, a robust method is proposed that utilizes the Mahalanobis distance for outlier identification based on the minimum regularized covariance determinant approach. Furthermore, a new weighting scheme based on Mahalanobis distance is developed for dealing with outlying data points. Finally, it is suggested to combine the proposed weight function and least squares loss function along with the graph and sparsity constraints to achieve robustness. This new procedure is named Robust Self-Representation Sparse Reconstruction and Manifold Regularization (RSSRMR). The novel method is compared with previously proposed unsupervised feature select
    Keywords: clustering; high-dimensional data; feature selection; Mahalanobis distance; multivariate outliers.
    DOI: 10.1504/IJBIDM.2025.10066878
     
  • Personalised recommendation method for smart library literature based on user behaviour feature perception
    by Yina Liu 
    Abstract: To solve the problem that existing library literature recommendation methods cannot achieve diversity and personalisation, this study proposes a personalised recommendation method for smart library literature based on user behaviour feature perception technology. Firstly, based on network coding technology, the collection of user behaviour data for smart libraries is completed, and the collected data is reduced in dimensionality through information entropy to remove redundant features. Then, a user behaviour feature model is constructed through Bayesian networks to perceive and analyse user behaviour, obtain user behaviour features, and finally, based on the feature perception results, a collaborative filtering algorithm is used to complete personalised recommendation of literature materials in the smart library. The experimental results show that this method can fully utilise the behavioural characteristics of users, accurately understand their interests and needs, and provide more accurate literature recommendation results.
    Keywords: perception of user behaviour characteristics; smart library; literature materials; personalised recommendation; network coding; information entropy.
    DOI: 10.1504/IJBIDM.2025.10066987
     
  • Method for mining students online English learning intention based on user portrait and big data
    by Yanli Li, Lili Wang, Haitao Gao, Bin Zhang 
    Abstract: To overcome the problems of low recall, low accuracy, and long time in traditional methods, a new method for mining students' online English learning intention based on user portrait and big data is proposed. With the support of big data technology, the maximum mean difference algorithm is used to determine the distance between student online English learning data sample points, and the K-means algorithm is used to implement student online English learning data collection. The collected data is used to construct user personas, and the attention mechanism is used to extract students' online English learning characteristics. A student online English learning willingness mining model based on extreme learning machine network is established to obtain relevant mining results. Experimental tests have shown that the recall rate of the proposed method is always above 97.3%, the maximum mining accuracy is 98.1%, and the average mining time is 79.15ms.
    Keywords: user portrait; big data; students; online English; learning intention; maximum mean difference algorithm; attention mechanism; extreme learning machine.
    DOI: 10.1504/IJBIDM.2025.10066988
     
  • Intelligent retrieval method for power grid dispatching information based on knowledge graph
    by Baoyu Hou, Qichao Wang, Zhiguo Zhou 
    Abstract: To improve the retrieval efficiency of power grid dispatch information, the paper proposes an intelligent retrieval method based on knowledge graph. Firstly, after mining the terminology of power grid dispatch information, the entities and relationships in the power grid dispatch information are extracted to obtain a string of entity names in the terminology dictionary, achieving the design of the knowledge graph pattern layer for power grid dispatch information; Finally, the power grid dispatch information is embedded into a discrete Hamming space, and the nearest neighbour retrieval method is used in the embedded space to achieve intelligent retrieval of power grid dispatch information. The experimental results show that the intelligent retrieval accuracy of our method can reach 98.51%, the recall rate of our method can reach 98.32%, and the time consumption of our method is only 6.6 seconds. The retrieval efficiency of power grid dispatch information is relatively high.
    Keywords: knowledge graph; Hamming space; nearest neighbour retrieval; term dictionary tree.
    DOI: 10.1504/IJBIDM.2025.10066989
     
  • A sentiment classification method for Weibo sensitive topic text based on multimodal features
    by Manlin Li 
    Abstract: Due to the problem of reduced classification accuracy in traditional text sentiment classification methods, this paper proposes a Weibo sensitive topic text sentiment classification method based on multimodal features. Firstly, the bidirectional loop structure is introduced to improve the GRU model, and a BiGRU model is constructed for multimodal feature extraction and fusion of sensitive topics on Weibo. Secondly, by combining seed features, similar features, and residual features, a multimodal feature cluster is constructed to improve the accuracy of classification. Finally, the constructed multimodal feature clusters are input into the support vector machine model to complete sentiment classification of Weibo sensitive topic text. The experimental results show that compared with traditional methods, our method achieves higher accuracy in all emotion categories.
    Keywords: multimodal features; Weibo sensitive topics; text sentiment classification; BiGRU model; multimodal feature clusters.
    DOI: 10.1504/IJBIDM.2025.10066994
     
  • An accurate and rapid pushing of marketing information based on multidimensional data mining
    by Zhisheng Zhou, Bin Li 
    Abstract: In order to address the accuracy and recall issues in marketing information push, this study proposes a strategy based on multidimensional data mining to achieve accurate and efficient marketing information push. First, marketing information is collected and a push probability index system is built; secondly, the analytic hierarchy process is used to calculate the weight of the marketing information push index; finally, considering the product life cycle, data mining technology is used to obtain the stable and random purchase interest of active and inactive users, and marketing information is accurately and rapidly pushed through the above four dimensions. The research results show that after adopting this method, the accuracy of marketing information push reached 98.1%, the recall rate reached 96.9%, and user satisfaction also increased to 98.5%, significantly improving the overall effect of marketing information push and user satisfaction.
    Keywords: analytic hierarchy process; purchase interest; data mining techniques; multidimensional data mining; indicator weight.
    DOI: 10.1504/IJBIDM.2025.10066996
     
  • A method for merging and classifying higher mathematics teaching resources based on density clustering algorithm
    by Hejie Chang, Xing Lv 
    Abstract: To enhance the recall and accuracy of resource merging classification, this study introduces a merging classification technique rooted in density clustering algorithms. Initially, we gather data pertaining to higher mathematics teaching resources. Subsequently, we convert textual sentences into word-level representations, eliminating stop words and unnecessary high-frequency vocabulary. Leveraging LDA, we extract mathematical resource features, transforming words into computer- and model-recognisable vectorised forms. Next, we calculate the density and distance between samples to categorise them into distinct groups, employing density clustering algorithms for merging and classifying teaching resources. Experimental findings reveal that our method achieves a classification recall rate of 99.6% and an accuracy of 99.9%, thereby enhancing the quality and efficacy of higher mathematics education.
    Keywords: density clustering; merge and classify; advanced mathematics; teaching resources; resource allocation.
    DOI: 10.1504/IJBIDM.2025.10066997
     
  • Research on engineering cost prediction based on GA-BP neural network
    by Yan Wu, Sha Lan, Tingting Liu 
    Abstract: To improve the accuracy of engineering cost prediction and reduce prediction errors, an engineering cost prediction method based on GA-BP neural network is proposed in this paper. A comprehensive index system for engineering cost prediction is constructed, and qualitative indicators are discretized using the equal interval method. The qualitative indicators are transformed into quantitative indicators through scale assignment. The BP neural network error is obtained through gradient descent, and the GA algorithm is used to adjust the weights from the output layer to the hidden layer. Using the discretized qualitative indicators as input vectors and engineering cost as the output vector, a prediction model for engineering cost based on GA-BP neural network is built to obtain prediction results. Experimental results show that the proposed method has a prediction range of 2.41%, a residual mean range of 0.005~0.219, a recall rate fluctuating between 96.9% and 99.7%, and high prediction accuracy.
    Keywords: GA algorithm; BP neural network; engineering cost prediction; gradient descent.
    DOI: 10.1504/IJBIDM.2025.10066998
     
  • Intelligent evaluation method for multimedia network public opinion decline period based on multi-divisional optimisation
    by Xuefang Zhou 
    Abstract: In order to overcome the long data collection time, low accuracy in extracting features of public opinion decline, and low precision rate associated with traditional methods, a new intelligent evaluation method for multimedia network public opinion decline period based on multi-divisional optimization is proposed. An evaluation index system for intelligent evaluation of public opinion decline period is constructed, and index data is collected and processed. The multiple fractal dimensions of the index data are determined, and multi-divisional optimization is performed in conjunction with nonlinear support vector machines to extract features of public opinion decline. Public opinion decline period intelligent evaluation is achieved based on these features and the BiLSTM model. The experimental results show that the average data collection time of the proposed method is 0.72s, the average accuracy of feature extraction of public opinion decline is 97.66%, and the precision rate is consistently above 95%.
    Keywords: multi-divisional optimisation; multimedia network; public opinion decline period; intelligent evaluation; nonlinear support vector machine; BiLSTM model.
    DOI: 10.1504/IJBIDM.2025.10066999
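
The first abstract in this special issue (Wahid) describes down-weighting outlying points by their Mahalanobis distance. The sketch below illustrates that general idea only and is not the paper's method: it assumes two-dimensional data, swaps in the plain sample covariance for the minimum regularized covariance determinant estimator, and uses a hypothetical weight function (`outlier_weights`) with a chi-square cutoff. All function names and the sample data are assumptions made for illustration.

```python
import math

def mean_and_covariance(points):
    """Sample mean and 2x2 sample covariance of (x, y) points.

    Assumption: the plain (non-robust) covariance is used here in place of
    the minimum regularized covariance determinant estimator.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis_squared(point, mean, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det

def outlier_weights(points, quantile=0.95):
    """Hypothetical weighting: 1 inside the cutoff, cutoff / d^2 outside it."""
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - q).
    cutoff = -2.0 * math.log(1.0 - quantile)
    mean, cov = mean_and_covariance(points)
    weights = []
    for p in points:
        d2 = mahalanobis_squared(p, mean, cov)
        weights.append(1.0 if d2 <= cutoff else cutoff / d2)
    return weights

# Made-up data: nine clustered points followed by one distant point.
data = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (0.8, 0.9),
        (1.0, 0.8), (1.2, 1.1), (0.9, 1.0), (1.1, 1.0), (5.0, 5.0)]
weights = outlier_weights(data)
print(weights)
```

The plain covariance is itself pulled towards outliers, which limits how large their distances can become; that masking effect is the usual motivation for robust covariance estimators such as the one named in the abstract.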
     

Special Issue on: Methods and Applications of Data Mining in Business Domains II

  • Online teaching data distribution method based on learning behaviour big data mining
    by Jing Chang 
    Abstract: To overcome the problems of low accuracy and recall of traditional online teaching data distribution methods, this paper proposes an online teaching data distribution method based on learning behaviour big data mining. Firstly, collect online teaching data and pre-process the distribution data; then, generate online teaching data distribution rules through triangular fuzzy clustering; finally, based on the learning behaviour big data mining method, the data is divided into fuzzy metrics, membership functions are established to update distribution rules, and big data mining is used to design data distribution schemes. The experimental results show that the distribution accuracy of our method can reach 99.89%, and the parameter recall rate can reach 97.89%. The actual results are in line with the expected results and have a good distribution effect.
    Keywords: learning behaviour; big data mining; online teaching; data distribution.
    DOI: 10.1504/IJBIDM.2025.10065173
     
  • Enterprise financial risk early warning method based on PCA and SVM algorithms
    by Yanyan Cao, Gechun Pei 
    Abstract: Aiming at the problems of low relevance and high false alarm rate of enterprise financial risk early warning, an enterprise financial risk early warning method based on PCA and SVM algorithm is proposed. Firstly, the sensitivity optimisation principal component analysis method is introduced, and the representative index is selected according to the threshold value to establish the index system. Then, support vector machine is introduced to store the data in the sample space, and the indicators are divided into positive and negative indicators. Finally, combined with FCM clustering algorithm, the early-warning decision function is constructed to realise the early-warning of enterprise financial risk. The experimental results show that the correlation of this method is higher than 0.915, the false alarm rate is lower than 2%, and the Matthews correlation coefficient is up to 1.00.
    Keywords: principal component analysis; PCA; support vector machine; SVM; corporate financial risks; risk warning; FCM clustering algorithm.
    DOI: 10.1504/IJBIDM.2025.10065576
     
  • An English learning behaviour data mining based on improved ensemble learning algorithm
    by Lin Fan, Pengqi Cao, Yunxia Du 
    Abstract: In order to enhance the learning effectiveness of English learners, this paper proposes an English learning behaviour data mining method based on improved ensemble learning algorithm. A web crawler is used to collect behavioural information of learners during the process of learning English, and learner profiles are constructed. The data is pre-processed, and collaborative filtering algorithms are employed to extract features of English learning behaviours. By treating English learning behaviour features as input vectors and data mining results as output vectors, an improved stacking ensemble learning model based on chain rules is constructed. This model is utilised to obtain data mining results for English learning behaviour. The experimental results show that the normalised difference accuracy of the proposed method is always above 90%, and the mAP value is always above 93%, indicating that the proposed method has high accuracy and good mining effect in English learning behaviour data mining.
    Keywords: ensemble learning; English learning; learning behaviour; data mining; chain rules; stacking ensemble learning model.
    DOI: 10.1504/IJBIDM.2025.10065187
     
  • Web server log data pre-processing for mining zakat user profile using association rules
    by Mohamad Farhan Mohamad Moshin, Wan Hussain Wan Ishak, Yuhanis Yusof, Jastini Mohd Jamil, Alwi Ahmad 
    Abstract: The internet’s transformative impact on businesses and marketing strategies underscores the pivotal role of websites in establishing credibility and disseminating information to customers. To measure website effectiveness, tracking visitor behaviour is essential. This study focuses on web log data from Lembaga Zakat Negeri Kedah (LZNK), a Malaysian government institution managing zakat which utilises web analytics and mining to gain insights into website usage. The objectives of this paper are two-fold: firstly, to detail the pre-processing of weblog data to ensure reliability for data mining; secondly, to employ association rule mining to extract user patterns from pre-processed weblog data. To achieve this, the web logs were obtained from the LZNK’s website spanning from 2016 to November 2020 with a focus on user access in 2020. The findings reveal critical aspects of user behaviour including the most visited pages, popular page combinations, user interests, relationships between pages, and the impact of the entry page. Implementing these insights can enhance the LZNK website’s usability and user satisfaction, highlighting the importance of adapting to evolving user preferences and technological advancements.
    Keywords: association rule; data pre-processing; user profile; web log; web mining.
    DOI: 10.1504/IJBIDM.2025.10065199
     
  • Assessing ensemble techniques for imbalanced classification
    by Eric P. Jiang 
    Abstract: Class imbalance represents a pervasive and challenging problem in machine learning and manifests in a wide range of real-world applications, where the distribution of data across different classes is highly skewed. Conventional machine learning algorithms tend to favour majority classes, often resulting in a failure to capture data patterns of minority classes. This bias can lead to undesirable outcomes in practice. This paper addresses the problem of class imbalance by conducting a comprehensive comparative study of various hybrid ensemble approaches that demonstrate promise in mitigating this learning issue. The study encompasses extensive experiments conducted on a diverse collection of datasets gathered from multiple application domains and characterised by a wide range of class imbalance ratios. To facilitate a comprehensive performance assessment of these methods in dealing with imbalanced data, we have deployed a combination of relevant and commonly used performance metrics and additionally, we have leveraged multiple non-parametric statistical tests to evaluate, analyse and compare the results obtained from the selected methods. By doing so, we aim to offer practical insights into which particular methods are better suited for specific contexts, thus aiding practitioners in selecting the appropriate approaches to address class imbalance in their machine learning tasks.
    Keywords: learning from imbalanced data; data rebalancing; ensemble learning; performance evaluation and comparison.
    DOI: 10.1504/IJBIDM.2025.10065924
     
  • Comprehensive evaluation method of enterprise financial risk based on fuzzy grey correlation analysis
    by Xuena Lin, Guijun Shang 
    Abstract: In this paper, a comprehensive evaluation method of enterprise financial risk based on fuzzy grey correlation analysis is proposed. Firstly, the comprehensive evaluation index system of enterprise financial risk is constructed, and the comprehensive evaluation index of risk is differentiated according to the judgment matrix. Then, based on fuzzy grey relational analysis, a judgment matrix is constructed to determine the weight of financial risk indicators. Finally, the comprehensive evaluation sub-node is optimised to realise the comprehensive evaluation of enterprise financial risk. The experimental results show that the area enclosed by the PR curve, the X-axis and the Y-axis is close to 1, the financial health index is above 80%, and the false alarm rate is below 15%, indicating good evaluation performance.
    Keywords: fuzzy grey correlation analysis; enterprise financial risk; comprehensive evaluation; evaluation index system; financial health index.
    DOI: 10.1504/IJBIDM.2025.10066018
     
  • Online allocation of network learning resources based on parallel cluster mining
    by Zhaofeng Li, Ping Hu, Pei Zhang, Liwei Zhang 
    Abstract: In order to solve the problem of low accuracy and consideration of online allocation of existing network learning resources, this paper proposes a network learning resource online allocation method based on parallel clustering mining. Firstly, analyse the development stages of online learning resources and collect data on educational resources; secondly, construct a network learning resource model and utilise parallel clustering to explore the clustering features of network learning resources; finally, using the mined resource features, design network learning resource labels to achieve online allocation of network learning resources. The experimental results show that the accuracy of network learning resource allocation in this method is 98.2%, the accuracy of network learning resource allocation is 98.1%, and the reliability of allocation reaches 96.2%.
    Keywords: parallel clustering mining; online allocation of learning resources; resource tags; education resource data.
    DOI: 10.1504/IJBIDM.2025.10066019
     
  • Study on complement of knowledge map of educational resources based on semi-supervised learning
    by Wei Liu 
    Abstract: In order to improve the effectiveness of completing educational resource knowledge graphs, a complement method of knowledge map of educational resources based on semi-supervised learning is studied. The relationship path features of the education resource knowledge graph are extracted using a path sorting algorithm. Within the interactive connection graph attention network of the semi-supervised deep learning algorithm, the embedding vectors of the knowledge graph are inputted to obtain the encoded representation of contextual features for the embedding vector entities, and the encoded feature matrix is constructed. The semantic matching model tensor decomposition is used to encode the feature matrix and calculate the scores for each triple. The triple with the highest score is selected as the completion result of the knowledge graph. The experimental results show that this method achieves high values in average reciprocal rank, Hits@10, Hits@3, and Hits@1, indicating a good completion effect of the knowledge graph.
    Keywords: semi-supervised learning; educational resources; knowledge map; complement method; relationship path; attention network.
    DOI: 10.1504/IJBIDM.2025.10066020
     
  • Adaptive recommendation method for teaching resources based on knowledge graph and user similarity
    by Meng Li 
    Abstract: To provide users with personalised and accurate teaching resource recommendation results, a new teaching resource adaptive recommendation method is proposed by effectively integrating knowledge graph with user similarity. This method first constructs a knowledge graph of teaching resources, representing the relationship between resources as a graph structure. Then, by analysing user learning history, ratings, and preferences, calculate user similarity and identify other users with higher similarity to the current user. Next, based on the resource ratings between similar users and current users, combined with the resource association relationship in the knowledge graph, the resource ratings are calculated using methods such as weighted summation. Finally, teaching resources are sorted based on resource ratings and recommended to current users. The experimental results show that the maximum root mean square error of this method is only 0.26, the highest recall rate is 95.6% and the MRR value is relatively high.
    Keywords: knowledge graph; user similarity; ICA algorithm; improve collaborative filtering algorithms; preference learning algorithm.
    DOI: 10.1504/IJBIDM.2025.10066021
     
  • Financial risk monitoring and warning method of listed enterprises based on data mining
    by Xinyan Zhang 
    Abstract: To address the issues of low accuracy in traditional methods for enterprise financial data mining, significant discrepancies between financial risk monitoring results and reality, and low accuracy in risk warning, a data mining-based financial risk monitoring and warning method for listed companies was designed. Firstly, grey relational clustering is used to mine financial data of listed companies. Then, factor analysis and fuzzy recognition matrix are combined to identify financial risks of listed companies. Finally, the XGBoost algorithm is used to divide financial risks of listed companies. Support vector machine is used to build a financial risk warning decision function for listed companies, achieving financial risk monitoring and warning for listed companies. The experimental results show that the financial risk monitoring results of this method are consistent with the true values, and the data mining accuracy can reach up to 98.23%, with a risk warning accuracy of over 95%. It has a good effect on enterprise financial risk monitoring and warning, and has high application value.
    Keywords: data mining; listed companies; financial risk; monitoring and warning; grey relational clustering; support vector machine.
    DOI: 10.1504/IJBIDM.2025.10066153
     
  • An online ideological and political courses recommended method in colleges and universities based on weighted collaborative filtering algorithm
    by Yun Peng, Xue Wang 
    Abstract: In order to reduce the error of course recommendation and increase the duration of online learning for students, this paper proposes a university ideological and political online course recommendation method based on weighted collaborative filtering. Firstly, the mean centred method is used to standardise user ratings; user similarity is calculated based on interest and trust. Secondly, constructing a user tag weight matrix and a course tag weight matrix helps better describe user needs. Finally, after calculating the label weights, based on the idea of collaborative filtering, the weighted generalised Mahalanobis distance is used to calculate the closeness of the recommendation scheme; the Top-N closest recommendation scheme is selected to recommend to learners. The experimental results show that this method significantly reduces course recommendation errors and improves the online learning duration of students, with a minimum recommendation error of only 0.09.
    Keywords: weighted collaborative filtering; ideological and political online courses; course recommendation; label weight.
    DOI: 10.1504/IJBIDM.2025.10066990
     
  • A prediction method of purchasing intention of e-commerce consumers in network marketing
    by Hongyan Da 
    Abstract: In order to improve the accuracy of prediction results, a prediction method of purchasing intention of e-commerce consumers in network marketing is proposed. Firstly, collect consumer related data, standardize and handle outliers to eliminate dimensional differences between data. Secondly, a logistic regression algorithm is used to extract and select consumer features related to purchase intention, revealing consumers’ purchasing preferences. Finally, based on the feature extraction results, a random forest algorithm is used to predict purchase intention. Gini impurity and information gain are used as splitting criteria to construct multiple decision trees, and the prediction results of all decision trees are synthesized by voting or averaging to obtain the final prediction result. The experimental results show that the root mean square error of the proposed method is relatively low, with the highest normalized information value reaching 0.89, indicating that it can accurately reflect consumers’ purchasing intention.
    Keywords: network marketing; e-commerce; purchase intention; logistic regression algorithm; random forest algorithm.
    DOI: 10.1504/IJBIDM.2025.10066993
     
  • A feature mining model for financial product purchase behaviour data based on consumer behaviour perspective
    by Huijun Wang 
    Abstract: To overcome the problems of low accuracy, poor precision, and poor recall caused by traditional purchasing behaviour data feature mining models, the paper proposes a financial product purchasing behaviour data feature mining model based on consumer behaviour perspective. Firstly, extract financial product purchase behaviour data, preprocess and clean it to obtain the dataset to be mined. Then, based on consumer purchasing behaviour, a consumer behaviour preference function is constructed, and the regression function is used to perform secondary processing on the analysis results. Finally, the TF-IDF algorithm is applied to extract consumer behaviour vectors, and the K-means clustering analysis method is combined to accurately mine the characteristics of purchasing behaviour data. The experimental results show that the recall rate of this method can reach 98.74%, the precision rate can reach 96.51%, and the mining accuracy rate can reach 98.65%. The mining results obtained have high reliability.
    Keywords: consumer behaviour analysis; data feature mining; TF-IDF algorithm; k-means cluster analysis.
    DOI: 10.1504/IJBIDM.2025.10066995
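
The last abstract above combines TF-IDF vectors with K-means clustering. As a rough illustration of that pairing only (not the authors' model, which also involves a preference function and a regression step), the sketch below builds unsmoothed TF-IDF vectors and clusters them with Lloyd's algorithm. The function names, the purchase-tag data and the choice of initial centroids are all hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenised documents."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency times unsmoothed inverse document frequency.
        vectors.append([
            (counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab
        ])
    return vocab, vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, init_indices, iterations=20):
    """Lloyd's algorithm with explicitly chosen initial centroids (deterministic)."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(v, centroids[k]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up purchase logs: each "document" is one consumer's list of product tags.
logs = [
    ["fund", "fund", "bond"],
    ["fund", "bond", "bond"],
    ["insurance", "annuity"],
    ["insurance", "insurance", "annuity"],
]
vocab, vectors = tfidf_vectors(logs)
labels = kmeans(vectors, init_indices=[0, 2])
print(vocab, labels)
```

Fixing the initial centroids keeps the run reproducible; a practical implementation would more likely use random restarts or k-means++ seeding.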