Progress and challenges in the application of soft sensing technology for nuclear power plant pump fault diagnosis

Yukun Zhang; Qian Huang; Wei Xu; Yifan Zhi; Xiuli Wang

doi:10.1051/rdne/2025011

Open Access

Review

Issue		Res. Des. Nucl. Eng. Volume 2, 2026


Article Number		2025011
Number of page(s)		18
DOI		https://doi.org/10.1051/rdne/2025011
Published online		10 March 2026

Research & Design of Nuclear Engineering, 1, 2025011 (2026)

Review Article


Yukun Zhang¹, Qian Huang², Wei Xu³, Yifan Zhi² and Xiuli Wang¹^*

¹ Research Center of Fluid Machinery Engineering and Technology, Jiangsu University, Jiangsu 212013, PR China
² China Nuclear Power Engineering Co., Ltd., Beijing 100840, PR China
³ School of Energy and Power Engineering, Jiangsu University, Zhenjiang 212013, PR China


Received: 26 June 2025
Accepted: 15 December 2025

Abstract

Nuclear power pumps are critical equipment for the safe operation of nuclear power plants, and their fault diagnosis technologies face challenges such as adaptability to extreme environments, interference from dynamic operating conditions, and the coupling of multiple faults. Soft sensing technology provides a new approach to overcoming the limitations of traditional diagnostic methods by constructing mathematical mapping models between observable variables and target states. This paper systematically reviews the research progress in this field: traditional machine learning methods demonstrate high efficiency in small-sample scenarios but exhibit limited generalization capability; deep learning models significantly improve the accuracy of complex fault identification through end-to-end feature learning; transfer learning and hybrid strategies effectively address the challenge of cross-condition adaptability. The study also reveals current technical bottlenecks, including insufficient dynamic response capability of models, high data dependency, and a lack of interpretability. Future research should focus on the innovation of intelligent algorithms, the construction of edge-cloud collaborative validation platforms, and the formulation of industry standards, in order to promote the comprehensive implementation of soft sensing technology from theory to engineering and provide core support for the safe operation and maintenance of nuclear power systems.

Key words: Nuclear power pump / Fault diagnosis / Soft sensing / Deep learning / Machine learning

© The Author(s) 2026. Published by EDP Sciences.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Research background and significance

In the complex coupled systems of nuclear power plants, pump equipment is required to operate with high continuity and reliability. The unobservability of its operating conditions and the concealment of fault manifestations make fault diagnosis a critical breakthrough for enhancing the system’s level of intelligence. Under the national strategic framework of the “dual carbon” goals, nuclear energy, due to its zero-carbon emissions and ultra-high energy density, has become a core pillar in the transformation of China’s energy structure. According to the International Energy Agency (IEA), by 2050, the global installed capacity of nuclear energy must be tripled compared to that of 2020 in order to achieve temperature control targets. As a strategic emerging high-tech industry in China, the nuclear energy sector is continuously improving the safety technology level of nuclear energy utilization [1]. However, during the operation of nuclear power plants, certain key pump equipment must serve for extended periods under extreme conditions such as high temperature, high pressure, and radiation. For example, the main loop environment where the primary pump is located can exceed 300 °C in temperature and reach pressures up to 15 MPa, accompanied by intense radiation fields. These special operating conditions significantly increase the fault risk and diagnostic difficulty of pump equipment [2], resulting in a failure rate markedly higher than that of conventional industrial pumps. Studies have indicated that among unplanned reactor shutdown events caused by equipment failures in nuclear power plants, a relatively high proportion is associated with pump-related anomalies. As shown in Figure 1, engineering experience demonstrates that latent faults such as bearing wear, seal failure, and impeller cavitation are the most frequent in nuclear power pump equipment. 
In some operational datasets, these damage types account for more than 80% of recorded faults, which complicates early state identification and diagnosis [3]. If latent defects are not detected in a timely manner, they may evolve into severe failures under certain extreme operating conditions, potentially compromising the cooling capacity of the primary system and leading to serious consequences such as a Loss-of-Coolant Accident (LOCA). A single such incident can result in economic losses of up to several hundred million USD [4], while also causing irreversible damage to public perception and the credibility of the nuclear industry.

Fig. 1

Proportion of fault types in nuclear power pumps.

At present, fault diagnosis of nuclear power pumps faces three major technical bottlenecks:

First, sensor deployment is severely constrained. Due to the enclosed space of the nuclear island, radiation protection requirements (as mandated by the ALARA principle), and electromagnetic interference, the coverage rate of key parameter measurement points is below 40%. For example, core failure indicators such as fluid stress within the pump chamber and microcracks in bearings cannot be directly measured [5], leaving inherent blind spots in traditional monitoring systems that rely on physical sensors. Second, dynamic operating condition interference is intense. Simulation studies have shown that load-following operation in nuclear power plants leads to frequent start-stop cycles of pump units, causing complex transient processes such as abrupt flow changes (±30%) and pressure pulsations (±15%) [6]. Under these conditions, traditional rule-based diagnostic approaches that depend on preset thresholds show a marked increase in false alarm rates during transient disturbances (exceeding 30% in some cases [7]), while manual expert judgment often lags by 6–8 hours or more, far from the minute-level response required of safety-critical equipment. A more efficient intelligent diagnostic mechanism is therefore urgently needed. Third, multi-fault coupling mechanisms are complex. Because the mechanical, thermal-hydraulic, and electrical systems of nuclear power pumps are strongly coupled, a single fault may trigger a cascade of failures. For example, under complex operating conditions, high-frequency vibrations caused by cavitation and hydraulic disturbances may interfere with the early feature extraction of mechanical failures such as bearing pitting. Reges et al. showed that axial vibrations are more sensitive to bearing faults under such compound conditions but are also more likely to be masked by other disturbance signals [8]. 
Furthermore, traditional signal processing methods experience a sharp decline in fault resolution when the signal-to-noise ratio drops below –10 dB [9].
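For orientation, a signal-to-noise ratio of –10 dB means that the disturbance carries ten times the power of the fault signal. The conversion can be sketched as follows; the function name and the synthetic tones are illustrative assumptions and are not taken from the cited study.

```python
import math

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from two equal-length sample lists."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)

# Synthetic example (assumed values): a unit-amplitude fault tone and a
# disturbance whose amplitude is sqrt(10) times larger, i.e. 10x the power.
n = 1000
tone = [math.sin(2 * math.pi * 50 * k / n) for k in range(n)]
disturbance = [math.sqrt(10) * math.sin(2 * math.pi * 120 * k / n) for k in range(n)]

ratio_db = snr_db(tone, disturbance)  # approximately -10 dB
```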

To overcome the aforementioned bottlenecks, soft sensing technology provides a revolutionary solution by constructing mathematical mapping models between observable variables (such as vibration, current, and temperature) and target states (such as wear level, efficiency loss, and remaining useful life). The evolution of this technology can be divided into three stages: First Stage (1990s–2010): Shallow modeling based on mechanistic equations and statistical regression. Sakthivel et al. compared the performance of Support Vector Machines (SVM) and Gene Expression Programming (GEP) in fault diagnosis of centrifugal pumps and found that the symbolic regression model constructed by GEP improved diagnostic accuracy to 91.2% under strong nonlinear interference conditions, outperforming SVM’s 86.5%, while also offering stronger model interpretability [10]. Second Stage (2010–2020): Feature engineering optimization driven by machine learning. Algorithms such as SVM and Random Forest enhanced the ability to fit nonlinear relationships. Liu improved classification accuracy of pump power maps to 92% based on LibSVM, but the method remained constrained by its reliance on manual feature design and domain expertise [7]. Third Stage (2020–present): End-to-end intelligent diagnosis empowered by deep learning. Architectures such as Convolutional Neural Networks (CNN) and Transformers have enabled automatic extraction and hierarchical representation of fault features. The multi-scale convolutional self-attention model constructed by Chen Li et al. achieved 99.5% accuracy in bearing crack diagnosis, representing an improvement of over 40% compared to traditional methods [11]. Cheng et al. further introduced physics-informed embedding (AFARN model), which maintained 98% accuracy in cross-domain transfer tasks while enhancing interpretability by a factor of 3.2 [12].
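In its simplest form, the mapping between an observable variable and a target state is a regression model fitted on calibration data. The sketch below uses ordinary least squares with a single input; the variable names and the calibration values are hypothetical and serve only to illustrate the idea of a soft sensor, not any model from the cited works.

```python
def fit_linear_soft_sensor(x, y):
    """Ordinary least squares for y ~ a*x + b (one observable, one target)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical calibration data: vibration RMS (mm/s) against measured wear (mm).
vib_rms = [1.0, 1.5, 2.0, 2.5, 3.0]
wear_mm = [0.10, 0.15, 0.20, 0.25, 0.30]

a, b = fit_linear_soft_sensor(vib_rms, wear_mm)

# Once fitted, the unobservable wear level is estimated from vibration alone.
estimated_wear = a * 2.2 + b
```

The models surveyed in Section 2 replace this linear map with SVMs, CNNs, or transfer-learning architectures, but the role of the model is the same.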

The irreplaceability of soft sensing technology in the diagnosis of nuclear power pumps is embodied in three core values: (1) Breaking the Limits of Physical Monitoring: By performing spectral analysis on measurable signals such as vibration, soft sensing enables effective estimation of variables that cannot be directly observed, such as pump chamber stress and wear trends. This significantly expands the monitoring boundaries of traditional physical sensor systems [13]. (2) Reconstructing the Fault Identification Paradigm: End-to-end models significantly simplify the traditional feature engineering process by integrating feature extraction and fault identification into a unified modeling framework. In typical diagnostic tasks, this reduces manual processing time from hours to seconds [14–16] and allows for the simultaneous identification of multiple faults. (3) Enabling Predictive Maintenance: Multiple studies have shown that soft sensing models based on transfer learning can maintain diagnostic accuracy above 90% even under unseen and unlabeled operating conditions [17, 18]. Some models demonstrate significantly superior early warning capabilities compared to traditional methods, showing strong potential to support predictive maintenance.

From a national strategic perspective, soft sensing – being one of the key technologies for ensuring the intelligence, safety, and maintainability of nuclear power equipment – is becoming an essential technical support for advancing China’s strategic goal of becoming a “nuclear power strong nation.” The China Nuclear Energy Development Report (2023) explicitly states: “Intelligent diagnosis and predictive maintenance are standard capabilities for next-generation nuclear power plants.” Therefore, overcoming the challenges of soft sensing technology in terms of generalization, interpretability, and engineering adaptability holds significant strategic importance for building an independent and controllable technological system for nuclear energy equipment.

2 Core methods and applications of soft sensing technology

To enhance the accuracy of operating condition identification and fault diagnosis for nuclear power pumps, researchers have developed multiple approaches centered on soft sensing modeling technologies. These approaches mainly include deep learning-based models, traditional machine learning methods, and hybrid strategies incorporating transfer learning. Each method has its own advantages and demonstrates differentiated adaptability under varying conditions of data scale, operating scenarios, and system complexity.

2.1 Traditional machine learning

In the field of nuclear power pump fault diagnosis, traditional machine learning methods still hold irreplaceable value in scenarios where sample sizes are limited or rapid deployment is required, due to their advantages of simple model structure, high computational efficiency, and strong interpretability. Compared with deep learning models, traditional methods significantly reduce dependence on training data (typically 100–500 samples are sufficient to build an effective model) and demonstrate unique applicability under conditions with limited feature dimensions, incomplete labels, or stringent real-time requirements [17, 19, 20].

(1) Support vector machines and their optimized variants

Support Vector Machines (SVM), as a representative method of statistical learning theory, are among the most widely applied techniques in nuclear power pump fault classification. Liu developed a pump power map feature classification model based on the LibSVM toolbox, using a Radial Basis Function (RBF) kernel to map features into a high-dimensional space. The model achieved an accuracy of 92% in identifying six types of faults, including bearing wear and impeller cavitation, and generalized significantly better than manual threshold-based methods [21]. To improve the identification of multiple concurrent faults, Rapur and Tiwari proposed an integrated framework combining Wavelet Packet Transform and Multi-class Support Vector Machine (WPT-MSVM). First, vibration signals were decomposed into 32 sub-bands using WPT, from which energy entropy and kurtosis features were extracted. Then, a one-vs-one strategy was used to construct an MSVM decision tree, raising classification accuracy under multi-fault coexistence conditions in centrifugal pumps to 96.8%, which is 11.5% higher than that of a single SVM model [15]. For scenarios with scarce labeled data, Liu Guangping et al. developed a One-Class Support Vector Machine (OCSVM) anomaly detection system that requires only normal-condition samples to construct a hyperspherical decision boundary. This approach achieved a recall rate of 87.3% in detecting unknown fault types in submersible electric pumps, demonstrating the practicality of unsupervised learning in nuclear power diagnostics [17].
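The WPT feature step described above can be sketched in a few lines. The example below assumes a Haar wavelet (the wavelet used in the cited work is not stated here) and a five-level full decomposition, which yields 2^5 = 32 sub-bands. All function names are illustrative, and the sub-bands are returned in natural rather than frequency order.

```python
import math

def haar_wpt(signal, levels):
    """Full Haar wavelet packet tree; returns the 2**levels terminal sub-bands.

    The signal length must be divisible by 2**levels.
    """
    bands = [list(signal)]
    s = math.sqrt(2.0)
    for _ in range(levels):
        next_bands = []
        for band in bands:
            pairs = list(zip(band[0::2], band[1::2]))
            next_bands.append([(p + q) / s for p, q in pairs])  # low-pass half
            next_bands.append([(p - q) / s for p, q in pairs])  # high-pass half
        bands = next_bands
    return bands

def energy_entropy(bands):
    """Shannon entropy of the normalized sub-band energy distribution."""
    energies = [sum(v * v for v in band) for band in bands]
    total = sum(energies)
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log(p) for p in probs)

def kurtosis(x):
    """Fourth standardized moment; sensitive to impulsive fault signatures."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

# Synthetic signals (assumed): a clean tone, and the same tone with periodic impacts.
sine = [math.sin(2 * math.pi * 8 * k / 1024) for k in range(1024)]
impulsive = [v + (3.0 if k % 128 == 0 else 0.0) for k, v in enumerate(sine)]

bands = haar_wpt(impulsive, levels=5)
features = [energy_entropy(bands)] + [kurtosis(band) for band in bands]
```

A feature vector of this kind would then be passed to the multi-class SVM; the classifier itself is omitted here.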

(2) Signal processing and feature engineering–driven methods

The effectiveness of traditional machine learning heavily relies on signal preprocessing and feature design. Researchers have employed time-frequency analysis techniques to enhance the separability of fault features: (1) Short-Time Fourier Transform (STFT): Du and Zhang applied STFT to analyze the vibration signals of a hydraulic pump casing, constructing a power spectral density feature set. Combined with a k-nearest neighbor (KNN) classifier, this approach enabled early diagnosis of bearing pitting, achieving an accuracy of 89.4% under steady-state conditions. However, under complex conditions such as sudden flow changes, the degradation of frequency resolution led to a significant increase in false alarm rates [5]. (2) Wavelet Packet Transform (WPT): Qing et al. addressed the issue of coupled interference in plunger pump current signals by applying WPT decomposition and extracting energy moment features. Coupled with a random forest model, the method achieved high-accuracy classification of different fault states and improved robustness under strong electromagnetic interference conditions [13]. (3) Coherent Averaging and Envelope Demodulation: Schmidt et al. designed a frequency band enhancement algorithm based on coherent averaging and envelope demodulation. Under conditions with a signal-to-noise ratio as low as –15 dB, the method could still identify early micro-damage in bearings with a recognition rate of 83.6%, significantly enhancing the detection capability of weak features under complex operating conditions [9]. As shown in Table 1, the commonly used feature extraction mechanisms in pump fault diagnosis can be categorized into time-domain, frequency-domain, and time–frequency representations.

Table 1

Summary of commonly used feature types in pump fault diagnosis.
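The time-frequency features discussed above rest on the Short-Time Fourier Transform. A minimal sketch with a Hann window and a naive DFT is given below; the sampling rate, frame length, and test tones are assumptions chosen for illustration. The frequency resolution is fs / frame_len, so shortening the frame to follow fast transients directly coarsens the spectrum, which is the trade-off behind the degraded performance under sudden flow changes.

```python
import cmath
import math

def stft_magnitude(signal, frame_len, hop):
    """Magnitude spectrogram via a Hann-windowed DFT (naive, for illustration)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

def dominant_bin(spectrum):
    """Index of the strongest frequency bin in one frame."""
    return max(range(len(spectrum)), key=spectrum.__getitem__)

# Synthetic non-stationary signal (assumed): 64 Hz, then 192 Hz, sampled at 1024 Hz.
fs = 1024
signal = [math.sin(2 * math.pi * (64 if k < 512 else 192) * k / fs)
          for k in range(fs)]

frames = stft_magnitude(signal, frame_len=128, hop=64)
resolution_hz = fs / 128  # 8 Hz per bin
first_peak_hz = dominant_bin(frames[0]) * resolution_hz
last_peak_hz = dominant_bin(frames[-1]) * resolution_hz
```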

(3) Innovative applications of other traditional algorithms

Beyond SVM, various traditional algorithms demonstrate unique advantages in specific scenarios: (1) Gene Expression Programming (GEP): Patton et al. compared the performance of GEP and SVM in diagnosing multi-fault coupling in centrifugal pumps. They found that the fault decision tree constructed through symbolic regression by GEP achieved a higher accuracy (91.2%) under strong nonlinear interference conditions compared to SVM (86.5%), and also offered stronger model interpretability [22]. (2) Fuzzy Inference Systems: Mwaura et al. combined an Adaptive Neuro-Fuzzy Inference System (ANFIS) with vibration spectral kurtosis indicators to quantify the severity of faults in steam generator feedwater pumps, achieving a mean absolute error of only 4.3% [3]. (3) Fusion with Physical Models: Chen et al. developed a hybrid model combining dynamic equations and Bayesian networks for diagnosing faults in reciprocating bilge pumps. By performing force source inversion, the system could identify valve group leakage faults, offering a mechanism-data hybrid modeling paradigm for nuclear power pump structures [23].

(4) Performance and engineering applicability comparison of traditional machine learning methods in nuclear power pump fault diagnosis

As shown in Table 2 and Figure 2, although traditional machine learning methods offer advantages in terms of real-time performance and effectiveness under small-sample conditions, their accuracy and generalization capabilities are noticeably inferior to those of deep learning models when dealing with complex operating conditions and multi-fault identification. This limitation restricts their applicability in high-demand scenarios [24].

Fig. 2

Engineering applicability of traditional machine learning methods in nuclear power pump fault diagnosis.

Table 2

Engineering applicability analysis of traditional machine learning methods in nuclear power pump fault diagnosis.

The main limitations of traditional machine learning are as follows: (1) Dependence on expert-driven feature engineering: Manual feature design incurs high costs; for instance, WPT requires the number of decomposition levels to be predefined. (2) Weak adaptability to dynamic operating conditions: When flow fluctuations exceed ±20%, the ability of SVM models to recognize fault features declines significantly. Experimental evidence indicates that under strong dynamic disturbances, model accuracy typically drops by double-digit percentages, highlighting insufficient adaptability to complex conditions [7]. (3) Limited capability in multi-fault coupling identification. Together, these shortcomings have prompted researchers to shift toward integrated approaches that combine deep learning and transfer learning [25].

2.2 Deep learning models

(1) Foundational architecture phase: overcoming dependence on feature engineering

Early studies focused on transferring foundational architectures such as Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM). In this context, end-to-end feature learning refers to enabling deep models to directly extract discriminative features from raw vibration or current signals without relying on manually crafted time–frequency features such as STFT or WPT, thereby reducing dependence on expert-designed feature engineering. Zheng et al. were the first to introduce Deep Neural Networks (DNN) into fault diagnosis of nuclear power pump power factor anomalies. They constructed a seven-layer fully connected network that mapped vibration signal spectra in an end-to-end manner, achieving an accuracy of 94.2% in identifying bearing wear – an improvement of 22% over SVM – and significantly reduced dependence on manual feature engineering [18, 26]. Zhang et al. further developed a Spatio-Temporal Convolutional Neural Network (ST-CNN) by parallelizing 1D-CNN and LSTM units to simultaneously extract frequency-domain features and temporal evolution patterns from vibration signals. Under varying load conditions in nuclear power plants, the model maintained a stable fault classification accuracy of 93.5%, effectively addressing the resolution limitations of traditional FFT methods in transient conditions [27]. Tan et al., focusing on the strongly non-stationary characteristics of cavitation faults in mixed-flow pumps, compared three architectures – Stacked Autoencoder (SAE), LSTM, and CNN. Their results demonstrated that CNN exhibited the strongest noise robustness, with only a 4.3% drop in accuracy under a signal-to-noise ratio of −8 dB, while maintaining a baseline accuracy of 87.2% [28].

(2) Advanced optimization phase: integrating structural and learning mechanism innovations

To enhance model robustness in complex scenarios, researchers have achieved breakthroughs in three key areas: (1) Multi-scale perception: Zaman et al. proposed a Dual-Scale Convolutional Autoencoder (DSCAE) that decomposes raw vibration signals into high-frequency detail components (0.5–5 kHz) and low-frequency trend components (<0.5 kHz), which are fed into parallel convolution branches. By fusing features and applying spatial pyramid pooling, the model achieved 100% classification accuracy in centrifugal pump bearing pitting diagnosis, with a generalization error of less than 0.5% under varying pressure conditions [29]. Li et al. developed a Multi-Scale Convolutional Self-Attention Network (MSCSAN), incorporating dilated convolutions (dilation rates = 1, 3, 5) to capture multi-receptive field features and applying channel attention mechanisms to weight key frequency bands. This enabled high-accuracy detection of rolling element cracks (accuracy of 99.5%), a 7.8% improvement over single-scale CNNs [11]. (2) Dynamic optimization mechanisms: Zhang et al. introduced the Sparrow Search Algorithm (SSA) to optimize CNN-LSTM hyperparameters. Through adaptive learning rate adjustment and convolution kernel size evolution, the model's convergence speed in nuclear plant accident classification tasks improved by 40%, with accuracy reaching 98.24% [16]. Liu et al. proposed a Transfer-Enhanced Spatiotemporal CNN (TE-STCNN), aligning source and target domain feature distributions using Maximum Mean Discrepancy (MMD) loss, which reduced accuracy fluctuations during condition shifts from ±18.7% to ±5.2% [18]. (3) Lightweight deployment: To address computational constraints on edge devices, Wang et al. designed a Multi-Task Attention CNN (MTA-CNN) in which a shared backbone network extracts common features and task-specific attention modules branch into fault diagnosis and operating condition recognition tasks. While maintaining an accuracy of 98.1%, the model reduced parameter count by 64%, meeting the real-time demands of embedded systems [30].
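The effect of the dilation rates mentioned above can be made concrete. With kernel size k and dilation d, one convolution layer covers (k − 1)·d + 1 input samples, so parallel branches with d = 1, 3, 5 see progressively wider windows at the same parameter count. The sketch below assumes a kernel size of 3, which is not stated in the source, and the function names are illustrative.

```python
def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-D convolution (cross-correlation form) with a dilated kernel."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * x[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(x) - span)
    ]

def receptive_field(kernel_size, dilation):
    """Number of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1

# Three parallel branches with the dilation rates quoted in the text.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 0.0, -1.0]  # assumed difference kernel, for illustration only
branches = {d: dilated_conv1d(x, kernel, d) for d in (1, 3)}
fields = [receptive_field(3, d) for d in (1, 3, 5)]
```

In a multi-scale network the branch outputs would be padded to a common length and concatenated before the attention stage; that step is omitted here.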

(3) Performance comparison of different deep learning models in nuclear power pump fault diagnosis

Table 3 and Figure 3 collectively present the performance differences among various deep learning models in nuclear power pump fault diagnosis. Overall, the Multi-Scale Convolutional Self-Attention Network (MSCSAN) significantly outperforms basic CNN and Spatio-Temporal CNN (ST-CNN) in both accuracy and robustness, demonstrating the superior adaptability of advanced architectures under complex operating conditions.

Fig. 3

Performance comparison of different deep learning models in nuclear power pump fault diagnosis (literature-based performance comparison).

Table 3

Performance comparison of different deep learning models in nuclear power pump fault diagnosis.

Current deep learning models exhibit two main limitations: (1) Data dependency: For example, the MSCSAN model experiences a sharp drop in accuracy from 99.5% to 81.3% when the number of training samples falls below 500 [11]. (2) Black-box decision risk: Although DSCAE achieves 100% classification accuracy, it fails to provide interpretability regarding the contribution of specific fault frequency bands [29]. These shortcomings highlight the need for further optimization through transfer learning and interpretability-focused research.

2.3 Transfer learning and hybrid strategies

In the field of nuclear power pump fault diagnosis, transfer learning and hybrid strategies have emerged as key technological approaches to enhance the engineering applicability of soft sensing models by addressing three core challenges: data scarcity, domain shift, and model interpretability. As the operational conditions of nuclear power plants become increasingly complex and variable, traditional single-model approaches face significant limitations in adapting to new scenarios and generalizing across different devices. Transfer learning enables model reuse from source domains (e.g., laboratory or simulation data) to target domains (e.g., actual nuclear power plant environments) through knowledge transfer mechanisms. In parallel, hybrid strategies integrate physical mechanisms, multimodal data, and heterogeneous model strengths to build diagnostic frameworks that offer both high accuracy and strong robustness [31].

(1) Domain adaptation and feature alignment techniques

Domain adaptation significantly enhances model stability under varying operating conditions by minimizing the distribution discrepancy between the source and target domains. The main techniques include: (1) Subspace mapping: Qu and Yan proposed the Maximum Mean Discrepancy-Regularized Transfer Subspace Model (MMD-TSM), which constructs a coupled projection matrix between the source and target domains to map high-dimensional features into a shared low-dimensional subspace. In a cross-device transfer experiment (from a wet ball mill to a nuclear power pump), this model improved the fault diagnosis accuracy from 68.5% to 89.2%, effectively mitigating performance degradation caused by distribution shift between source and target data [31]. (2) Adversarial training: Wang developed a transfer framework based on Variational Autoencoder–Wasserstein Generative Adversarial Network (VA-WGAN). By constraining the generator output distribution using Wasserstein distance, this method addressed the training instability issues of conventional GANs. On a multi-condition nuclear pump dataset, it accelerated convergence by 40% and reduced prediction error to 3.7%, significantly outperforming direct transfer approaches [32]. (3) Physics-driven alignment: Cheng et al. designed the Adaptive Fault Attention Residual Network (AFARN), which incorporates fluid dynamics equation constraints (e.g., simplified Navier–Stokes equations) into the feature extraction layer. A physics-consistency loss guides the alignment of features across domains. The model maintained 98% diagnostic accuracy in cross-domain diagnosis of nuclear plant circulating pumps, while improving explainability metrics (SHAP values) by a factor of 3.2. The reported accuracy was obtained under multi-condition operating scenarios of a nuclear power plant circulating water pump, including variable-speed operation and fluctuating flow conditions [12].
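Maximum Mean Discrepancy, the distance that MMD-based alignment minimizes, compares the mean kernel similarity within each domain against the similarity across domains. A minimal sketch of the biased estimator with an RBF kernel follows; the kernel width and the toy feature vectors are assumptions, and this is not the regularized subspace model of the cited work.

```python
import math

def rbf(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(x, y)))

def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        total = sum(rbf(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Toy one-dimensional features (assumed): a source domain and two target domains.
source = [[0.0], [1.0], [2.0]]
near_target = [[0.1], [1.1], [2.1]]  # small shift in operating condition
far_target = [[3.0], [4.0], [5.0]]   # large shift

small_gap = mmd2(source, near_target)
large_gap = mmd2(source, far_target)
```

Used as a training loss, this quantity is computed on intermediate network features and added to the classification loss, so that the learned features become harder to tell apart across domains.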

(2) Multimodal fusion and collaborative learning architectures

Hybrid strategies integrate multi-source data, heterogeneous models, and joint tasks to overcome the perceptual limitations of single data streams: (1) Spatiotemporal-physical joint modeling: Shi et al. proposed a Deep Learning Multi-Node Prediction Framework (DL-MNPF), which processes thermal parameters (e.g., temperature, pressure), mechanical vibrations, and current signals in parallel. Through a gated fusion module, it achieves multimodal feature interaction. In primary loop state prediction of nuclear power plants, its Root Mean Square Error (RMSE) was as low as 0.023, a 54% reduction compared to single-signal models [33]. (2) Multi-task Attention Mechanism: Wang et al. introduced a Multi-Task Attention CNN (MTA-CNN), which shares a backbone network to extract common features and uses task-specific attention modules to jointly optimize fault diagnosis and condition identification. On a rolling bearing dataset, it improved training efficiency by 2.3 times, with an average dual-task F1-score of 96.7% [30]. (3) Bayesian-deep model fusion: Li et al. combined EfficientNet with a Bayesian inference module – EfficientNet extracts high-dimensional spatial features, while the Bayesian module quantifies uncertainty in fault types. This hybrid model maintained 92.4% accuracy with fewer than 300 samples and kept the false alarm rate below 1.8% [34].
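A gated fusion step of the kind mentioned above can be illustrated with a generic element-wise gate: a sigmoid decides, per feature, how much weight each modality receives. The sketch below is a common textbook form under assumed names and parameters; the internal design of the cited framework's fusion module is not described in the source.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(feat_a, feat_b, w_a, w_b, bias):
    """Element-wise gate g = sigmoid(w_a*a + w_b*b + bias); output g*a + (1-g)*b."""
    fused = []
    for a, b, wa, wb, c in zip(feat_a, feat_b, w_a, w_b, bias):
        g = sigmoid(wa * a + wb * b + c)
        fused.append(g * a + (1.0 - g) * b)
    return fused

# Hypothetical feature vectors from a vibration branch and a thermal branch.
vibration = [0.8, 0.2, 0.5]
thermal = [0.4, 0.6, 0.1]
zeros = [0.0, 0.0, 0.0]

# With zero gate parameters the gate is 0.5 and the output is the plain average.
balanced = gated_fusion(vibration, thermal, zeros, zeros, zeros)
```

In a trained model the gate parameters are learned, so the network can lean on vibration features for mechanical faults and on thermal features for hydraulic ones.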

(3) Exploration of cutting-edge hybrid strategies

Researchers are further exploring advanced technologies such as meta-learning, federated learning, and digital twins to tackle increasingly complex engineering challenges: (1) Meta-transfer learning: Liu et al. proposed a Meta-Transfer Spatiotemporal CNN (MTST-CNN) based on the Model-Agnostic Meta-Learning (MAML) framework. This enables the model to quickly adapt to new working conditions with only a few samples. In nuclear power plant start-stop condition switching tests, the required number of fine-tuning samples was reduced by 80%, and diagnostic latency was kept under 50 ms [18]. (2) Federated knowledge distillation: To address data-sharing constraints between multiple nuclear power plants, some studies have begun exploring the combination of federated learning and knowledge distillation in fault diagnosis of pump equipment. This approach involves training lightweight models locally at each site and aggregating knowledge via distillation mechanisms, enabling cross-site intelligent diagnosis while preserving data privacy. Though still in early stages within this domain, prior transfer learning studies suggest promising generalization capability between different pump devices when supported by multi-signal fusion and hyperparameter optimization, achieving up to 98.6% accuracy [27]. (3) Digital twin integration: Recently, the integration of digital twins and soft sensing has emerged as a trend in predictive maintenance for pump equipment. For instance, Luo et al. proposed a hybrid method combining digital twin simulation with SAE–LSTM correction models for predicting the Remaining Useful Life (RUL) of centrifugal pump rolling bearings. By aligning simulation data with real measurements, they constructed Health Indicator (HI) curves across the lifecycle and employed Relevance Vector Machine (RVM) for prediction, achieving over 3% improvement in RUL estimation accuracy across multiple benchmark datasets. 
Although specific engineering applications in nuclear pumps are still lacking, this study demonstrates the potential of digital twin–driven soft sensing methods in scenarios involving small samples and high-reliability equipment [35]. Figure 4 systematically illustrates the typical architectures and key differences among major deep learning models – CNN, ST-CNN, MSCSAN, MTA-CNN, AFARN, and VA-WGAN – used in nuclear pump fault diagnosis.

Fig. 4

Comparison of different model architectures.
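The federated knowledge distillation idea described above can be sketched as follows: each site evaluates its local model on a shared probe sample, only the output logits are exchanged and averaged, and a lightweight student is trained to match the softened average. The temperature value, the function names, and the logit values below are assumptions for illustration, not details of any cited system.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def average_teacher_logits(site_logits):
    """Element-wise mean of per-site logits for one shared probe sample."""
    n_sites = len(site_logits)
    return [sum(col) / n_sites for col in zip(*site_logits)]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits from three plants for classes (normal, bearing, cavitation).
site_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [2.5, 0.0, -1.5]]
teacher = average_teacher_logits(site_logits)
loss = distillation_loss([0.0, 0.0, 0.0], teacher)
```

Raw measurements never leave the site in this scheme; only the logits on the probe sample are shared.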

(4) Performance comparison and challenge analysis of transfer learning and hybrid strategies

Table 4 and Figure 5 demonstrate that models based on transfer learning and hybrid strategies outperform others in terms of cross-domain accuracy retention, reduced sample requirements, and improved interpretability. Notably, approaches incorporating physical priors and multimodal fusion exhibit exceptional robustness and engineering applicability under complex operating conditions.

Fig. 5

Performance comparison and technical advantages of transfer learning and hybrid strategy models.

Table 4

Performance comparison and technical advantage analysis of transfer learning and hybrid strategy models.

Despite the significant progress achieved by transfer learning and hybrid strategies, three major challenges remain: (1) Transfer failure in complex fault coupling: When there are substantial differences in fault mechanisms between the source and target domains, model performance may degrade significantly. For instance, the accuracy of the AFARN model drops by 25–30% under such conditions [27]. (2) Insufficient depth in physics-data fusion: Most current soft sensing methods only incorporate physics through shallow mechanisms – such as physical loss functions or regularization terms (e.g., Navier–Stokes residuals, MMD alignment). These shallow fusion approaches have been repeatedly identified as bottlenecks in interpreting and stabilizing models in multi-physics and strongly nonlinear systems. The emerging paradigm of Physics-Informed Neural Networks (PINNs) aims to overcome this by embedding Partial Differential Equation (PDE) structures directly into the network architecture, enabling joint training and deeper integration [36]. (3) Edge-cloud coordination and real-time bottlenecks: While edge-cloud deployment enhances data privacy and supports dynamic model updates, it also introduces latency issues. Communication and synchronization delays often reach 100–300 ms, a serious limitation in environments with low bandwidth or heterogeneous hardware. Studies have shown that such delays can hinder real-time responses (on the order of minutes or faster), which are critical in nuclear safety-related applications [37].

3 Development of feature extraction mechanisms for fault signals in nuclear power pumps

In the field of fault diagnosis for nuclear power pumps, vibration signals and their spectral features have been widely recognized as the most sensitive indicators, with strong correlations observed between their variation patterns and specific fault types. This sensitivity arises from characteristic frequency responses triggered by mechanical failures. Du and Zhang, using the Short-Time Fourier Transform (STFT) to analyze the power spectrum of hydraulic pump casing vibrations, found that bearing pitting faults led to a significant energy increase of 40–60 dB in the 3–5 kHz frequency range. The fault monitoring system built on this basis achieved an accuracy of 89.4% under steady-state conditions [5]. In a study on condition monitoring and fault diagnosis of multistage centrifugal water-injection pumps, Wang summarized the vibration characteristics of typical fault modes such as bearing wear and rotor misalignment. He noted that excessive clearance in the pump shaft bearing was a major cause of abnormal vibration velocity due to rotor misalignment, providing practical evidence for subsequent quantitative diagnosis and maintenance optimization [38]. Reges et al. conducted a systematic study of the vibration responses of submersible electric pumps under 27 operating conditions, revealing that axial vibrations were 1.8 times more sensitive to bearing faults than radial vibrations – an important insight for optimizing sensor placement strategies [8].
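
A band-energy comparison of the kind reported for the 3–5 kHz range can be sketched as follows. The example assumes a single Hann-windowed frame (one column of an STFT), a hypothetical sampling rate of 20 kHz, and synthetic healthy and faulty signals; none of these values come from the cited studies, and the function names are illustrative only.

```python
import cmath
import math

def band_energy(frame, fs, f_lo, f_hi):
    """Energy of one Hann-windowed frame within [f_lo, f_hi] Hz, via a direct DFT."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    x = [s * w for s, w in zip(frame, window)]
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            energy += abs(coeff) ** 2
    return energy

def band_gain_db(healthy, faulty, fs, f_lo=3000.0, f_hi=5000.0):
    """Energy increase of the faulty frame over the healthy frame in the band, in dB."""
    return 10.0 * math.log10(
        band_energy(faulty, fs, f_lo, f_hi) / band_energy(healthy, fs, f_lo, f_hi)
    )

# Synthetic frames: a 50 Hz running-speed component plus a 4 kHz component
# whose amplitude is 100 times larger in the "faulty" case (assumed values).
fs = 20000.0
n = 256
healthy = [math.sin(2 * math.pi * 50 * i / fs) + 0.01 * math.sin(2 * math.pi * 4000 * i / fs)
           for i in range(n)]
faulty = [math.sin(2 * math.pi * 50 * i / fs) + 1.0 * math.sin(2 * math.pi * 4000 * i / fs)
          for i in range(n)]
gain_db = band_gain_db(healthy, faulty, fs)
```

Because the in-band amplitude differs by a factor of 100, the computed gain is roughly 40 dB, which corresponds to the lower end of the range quoted above.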

3.1 Limitations and breakthroughs of traditional signal processing methods

Traditional signal processing methods face severe challenges under dynamic operating conditions, specifically in the following three aspects: (1) Time–frequency resolution trade-off in STFT: When there are large flow fluctuations (e.g., ±20%), the resolution of the Short-Time Fourier Transform (STFT) decreases significantly, leading to a notable increase in false alarm rates. To address this, Schmidt et al. proposed a frequency band enhancement technique for time-varying conditions, which achieved an 83.6% micro-damage identification rate even at an SNR of −15 dB [9]. (2) Dependency on decomposition levels in WPT: While the Wavelet Packet Transform (WPT) can alleviate non-stationary issues, it requires manually preset decomposition levels (typically 6–8 layers), affecting real-time performance. To overcome this, Qing Lujun et al. developed an adaptive WPT algorithm that dynamically adjusts decomposition depth based on signal complexity, reducing fault localization error to within ±5% [13]. (3) Confusion of multiple fault features: Immovilli et al. confirmed that when bearing pitting and rotor eccentricity coexist, traditional spectrum analysis struggles to distinguish the resonance peaks at 1.2 kHz and 1.8 kHz. They proposed an externally induced vibration modeling method, which injects specific frequency excitation signals to enhance fault separability [39].

3.2 Innovations in multi-source information fusion strategies

To enhance feature robustness, researchers have developed multi-source information fusion strategies. As shown in Table 5 and Figure 6, multi-source signal fusion strategies can significantly improve the accuracy and robustness of fault diagnosis in nuclear power pumps. Especially in complex scenarios such as concurrent multiple faults and early-stage crack detection, information complementarity effectively strengthens feature discrimination.

Fig. 6

Accuracy improvement through multi-source signal fusion strategies in nuclear pump fault diagnosis.

Table 5

Comparison of multi-source signal fusion strategies in fault diagnosis of nuclear power pumps.

Rapur and Tiwari improved the diagnostic accuracy of centrifugal pumps under concurrent fault conditions to 96.8% by synchronously analyzing vibration signals and motor current, constructing a multidimensional input space based on WPT energy entropy and kurtosis features [15]. Du et al. proposed a multimodal diagnostic strategy combining Acoustic Emission (AE) and Infrared Thermography (IRT). By employing compressed sensing and deep feature extraction techniques, they significantly enhanced early crack detection in bearings, demonstrating strong potential for improving early warning sensitivity [40].
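
The WPT energy entropy and kurtosis features mentioned above can be sketched with a hand-written Haar wavelet packet decomposition. This is a minimal illustration under stated assumptions: the Haar wavelet, a depth of three levels, and a signal length divisible by 2 to the power of the depth are all choices made here, not details taken from Rapur and Tiwari; the leaves are returned in natural rather than frequency order.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_leaves(x, levels):
    """Full wavelet packet tree: both branches are split at every level."""
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

def energy_entropy(x, levels=3):
    """Shannon entropy of the energy distribution across the leaf nodes."""
    energies = [sum(v * v for v in leaf) for leaf in wavelet_packet_leaves(x, levels)]
    total = sum(energies)
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in probs)

def kurtosis(x):
    """Fourth standardized moment (equals 3 for a Gaussian, 1.5 for a pure sine)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

# Synthetic example: a smooth tone versus the same tone with periodic impacts.
tone = [math.sin(2.0 * math.pi * i / 16.0) for i in range(64)]
impacts = [v + (4.0 if i % 16 == 0 else 0.0) for i, v in enumerate(tone)]
features_tone = (energy_entropy(tone), kurtosis(tone))
features_impacts = (energy_entropy(impacts), kurtosis(impacts))
```

The two-element feature tuples stand in for the multidimensional input space described in the text; in practice they would be computed per channel (vibration and motor current) and concatenated before classification.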

Despite the benefits of multi-source information fusion, several practical challenges remain. First, sensor signals collected from different measurement channels often suffer from asynchrony due to inconsistent sampling rates or transmission delays, which may lead to misaligned features during fusion. Second, redundant or partially correlated signals can significantly increase computational cost, especially when deep fusion networks are employed. Third, the physical layout constraints of nuclear power plants restrict the installation of additional sensors, limiting the diversity and spatial coverage of available measurements. These issues must be addressed to fully realize the potential of multi-source fusion in nuclear pump diagnostics.
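
The asynchrony problem can be made concrete with a small resampling sketch that brings a slower, delayed channel onto the time base of a faster one before fusion. Linear interpolation, the sampling rates, and the 0.5 ms offset are assumptions chosen for illustration; real installations may require more careful delay estimation.

```python
def resample_linear(t_src, x_src, t_new):
    """Linearly interpolate the samples (t_src, x_src) at the times in t_new.

    Both t_src and t_new must be sorted in increasing order. Times outside
    the source range are clamped to the first or last source sample.
    """
    out = []
    j = 0
    for t in t_new:
        if t <= t_src[0]:
            out.append(x_src[0])
            continue
        if t >= t_src[-1]:
            out.append(x_src[-1])
            continue
        while t_src[j + 1] < t:
            j += 1
        t0, t1 = t_src[j], t_src[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(x_src[j] * (1.0 - w) + x_src[j + 1] * w)
    return out

# Hypothetical channels: motor current at 250 Hz with a 0.5 ms transmission
# delay, vibration timestamps at 1 kHz.
t_current = [0.0005 + i * 0.004 for i in range(5)]
x_current = [2.0 * t + 1.0 for t in t_current]   # simple linear trend for checking
t_vibration = [i * 0.001 for i in range(1, 16)]
current_on_vibration_grid = resample_linear(t_current, x_current, t_vibration)
```

After this step both channels share the same timestamps, so features extracted from matching windows refer to the same physical instant.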

3.3 Deep learning-enabled automatic feature extraction

Deep learning technology has overcome the limitations of traditional feature engineering in three key ways: (1) Time-Frequency Feature Learning with CNNs: Zaman et al. developed a Dual-Scale Convolutional Autoencoder (DSCAE) that directly learns fault-sensitive features from time–frequency representations of vibration signals, achieving 100% accuracy in diagnosing bearing pitting in centrifugal pumps [29]. (2) Feature Weighting via Attention Mechanisms: Li et al. incorporated a channel attention module into the Multi-Scale Convolutional Self-Attention Network (MSCSAN), automatically focusing on the fault-sensitive frequency band of 3.2–4.1 kHz, enabling 99.5% accuracy in detecting cracks in rolling elements [11]. (3) Feature Adaptation through Transfer Learning: Cheng et al. applied domain adaptation techniques in their AFARN model by aligning feature distributions across operating conditions using Maximum Mean Discrepancy (MMD) loss, maintaining 98% accuracy in cross-domain diagnostics [12].
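
The MMD term used for aligning feature distributions can be sketched as a standalone statistic. The example below computes the biased estimate of the squared MMD with a Gaussian kernel on small synthetic feature sets; the kernel bandwidth, sample sizes, and the shift between "operating conditions" are assumptions, and the code does not reproduce how AFARN incorporates the term into training.

```python
import math
import random

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma * sigma))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy between xs and ys."""
    def mean_kernel(p, q):
        return sum(gaussian_kernel(a, b, sigma) for a in p for b in q) / (len(p) * len(q))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

rng = random.Random(0)

def sample(center, n=50):
    """Draw n two-dimensional feature vectors around a common center value."""
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]

source = sample(0.0)           # features under the source operating condition
target_same = sample(0.0)      # new data, same condition
target_shifted = sample(3.0)   # new data, shifted condition

mmd_same = mmd2(source, target_same)
mmd_shifted = mmd2(source, target_shifted)
```

A domain adaptation loss would add a weighted version of this quantity to the classification loss, so that the feature extractor is penalized when source and target features drift apart.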

3.4 Comparison of feature extraction technique evolution

Table 6 and Figure 7 together demonstrate that deep learning – especially multi-scale attention models – offers the highest diagnostic accuracy and feature extraction capability under dynamic working conditions. In contrast, traditional methods, although low in computational complexity, show limitations in adaptability and accuracy. This highlights the dominant advantage of automatic feature learning techniques in complex environments.

Fig. 7

Performance comparison of feature extraction techniques in nuclear pump fault diagnosis.

Table 6

Performance comparison of feature extraction techniques in nuclear pump fault diagnosis evolution.

Current research trends indicate: (1) Multi-physical field collaborative sensing is key to enhancing feature robustness, such as combined monitoring of vibration, current, and temperature. (2) Physics-guided deep learning seeks a balance between interpretability and accuracy, as demonstrated by the AFARN model embedding fluid dynamics constraints. (3) Edge-intelligent feature extraction meets real-time requirements through lightweight design, such as the MTA-CNN model reducing computational load by 64% [30].

4 Current research status on the accuracy of fault diagnosis models for nuclear power pumps

In the field of fault diagnosis for nuclear power pumps, existing soft sensing methods show significant differences in key performance indicators such as diagnostic accuracy, robustness, and computational resource consumption. To systematically evaluate the engineering adaptability of various approaches, researchers have conducted comparative analyses on traditional machine learning models, deep neural network architectures, cross-domain models based on transfer learning, and lightweight networks designed for edge deployment. By constructing a unified experimental platform and utilizing multiple public and semi-physical datasets, they quantified the performance limits and generalization capabilities of each method under typical fault scenarios – specifically measuring diagnostic accuracy, noise resistance, inference latency, and model parameter size. This effort aims to provide both theoretical support and practical guidance for selecting and deploying soft sensing models in high-safety industrial systems [11, 18, 27].

4.1 Diagnostic accuracy of traditional machine learning models

Traditional machine learning methods remain widely used in the fault diagnosis of nuclear power pumps due to their computational efficiency and interpretability. These approaches perform well under stable operating conditions, offering satisfactory accuracy and fast response times. For instance, Support Vector Machine (SVM)-based classifiers are frequently employed to construct discriminative models for bearing fault detection, using features such as pump performance curves and vibration spectra. When operating conditions are known and sufficient labeled data are available, these models can achieve high diagnostic accuracy. However, their performance tends to degrade significantly under dynamic operating conditions – such as when flow or pressure fluctuations exceed certain thresholds – highlighting a lack of adaptability to working condition disturbances [7]. To enhance performance, researchers have developed hybrid methods that combine techniques like Wavelet Packet Transform (WPT) with multi-class SVM (MSVM), enabling multi-dimensional feature modeling to improve the discrimination of concurrent faults. Experimental results show that such methods outperform basic models in multi-fault diagnosis tasks. Nonetheless, they often rely heavily on complex manual feature engineering – for example, setting the number of WPT decomposition levels or calculating energy entropy – making the feature extraction process time-consuming and limiting the model’s real-time deployment potential [15]. In response to challenges such as label scarcity and resource constraints in edge computing environments, some studies have adopted One-Class SVM (OCSVM) to build lightweight anomaly detection frameworks. These models can be trained solely on normal condition data, offering strong real-time capabilities and computational efficiency. 
While this approach excels in early fault warning, its accuracy and robustness still depend heavily on consistent operating conditions and low noise levels, making it less suitable for the highly dynamic environments of nuclear power pump systems [17].

In summary, the strengths of traditional machine learning methods for nuclear pump fault diagnosis include: clear feature extraction logic, low data requirements for model training, and ease of deployment on low-power platforms. However, they face three major limitations: (1) Heavy reliance on manual feature design, lacking adaptive learning capability. (2) Poor generalization to dynamic operating conditions. (3) Limited ability to integrate multi-modal and multi-task diagnostics [7].

4.2 Accuracy of deep learning models

Deep learning has significantly advanced diagnostic accuracy in nuclear power pump fault detection through end-to-end feature learning. As illustrated in Table 7 and Figure 8, advanced deep learning architectures – such as multi-scale self-attention networks – demonstrate the highest levels of accuracy and robustness. However, this performance comes at the cost of significantly increased computational demands, highlighting a clear trade-off between model performance and resource consumption. In recent years, deep learning approaches have shown excellent performance in identifying fault patterns in centrifugal pumps under complex operating conditions. For example, Siddique et al. integrated Wavelet Coherence Analysis (WCA) with S-Transform, feeding the resulting time-frequency images into a CNN-KAN (Kolmogorov-Arnold Network) model. This approach achieved a diagnostic accuracy of 99.92%, demonstrating exceptional robustness and generalization across various pump fault scenarios [41, 42].

Fig. 8

Comparison of accuracy, robustness, and computational resource requirements of mainstream deep learning models in nuclear pump fault diagnosis.

Table 7

Comparison of accuracy, robustness, and computational resource requirements of mainstream deep learning models in nuclear power pump fault diagnosis.

Key findings: (1) Multiscale structures significantly enhance noise resistance: Multiscale convolutional networks, by extracting features under different receptive fields in parallel, effectively capture multi-frequency information in fault signals. Compared to traditional single-scale CNNs, these models exhibit stronger feature separation and stability under high-noise conditions [11]. (2) Swarm intelligence for hyperparameter optimization: To improve adaptability and training efficiency in complex diagnostic tasks, researchers have introduced swarm intelligence algorithms such as Particle Swarm Optimization (PSO) and the Sparrow Search Algorithm (SSA) to optimize deep network hyperparameters. These strategies accelerate model convergence and maintain high accuracy even under small sample conditions [16]. (3) Lightweight design for edge deployment: For embedded and edge computing scenarios, multi-task convolutional networks – such as attention-guided lightweight models – achieve real-time fault recognition under limited resources by leveraging parameter sharing and structure pruning strategies [30].

4.3 Accuracy of transfer learning and hybrid models

Transfer learning addresses operational condition shifts through domain adaptation techniques: (1) Physics-Embedded Model: The AFARN network by Cheng et al. integrates constraints from fluid dynamics equations. In cross-domain diagnostics of circulating water pumps at nuclear power plants, it maintains 98% accuracy, with only a 1.8% drop under operational disturbances, while SHAP-based interpretability indicators improve by 3.2 times [12, 43]. (2) Meta-Transfer Learning: Liu et al. proposed a meta-transfer spatiotemporal CNN model using a Model-Agnostic Meta-Learning (MAML) framework. The model can be fine-tuned with only a small number of new operating condition samples, achieving diagnostic latency under 50 ms [18]. (3) Bayesian–Deep Learning Fusion: Li et al. combined EfficientNet with Bayesian inference to quantify uncertainty in fault types. Under small sample conditions, the model maintains 92.4% accuracy, with a false alarm rate controlled within 1.8% [34].
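
One way to read "quantifying uncertainty in fault types" is as the entropy of a predictive distribution averaged over several stochastic forward passes. The sketch below assumes that such passes (for example, posterior samples or dropout passes) have already produced class logits; the logit values and the three-class setup are invented for illustration and are not taken from Li et al.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predictive_summary(logit_samples):
    """Average class probabilities over samples and return (label, mean, entropy)."""
    probs = [softmax(sample) for sample in logit_samples]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    label = max(range(n_classes), key=lambda c: mean[c])
    return label, mean, entropy

# Hypothetical classes: 0 = normal, 1 = bearing fault, 2 = cavitation.
consistent = [[5.0, 0.0, 0.0], [4.5, 0.2, 0.0], [5.2, 0.0, 0.1]]
conflicting = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]

label_c, mean_c, entropy_c = predictive_summary(consistent)
label_x, mean_x, entropy_x = predictive_summary(conflicting)
```

A high entropy, as in the second case, would be a natural trigger for deferring the decision to an operator instead of raising an alarm automatically.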

Hybrid models demonstrate excellent robustness in complex scenarios: Shi et al.’s deep learning multi-node prediction framework (DL-MNPF) processes thermal, mechanical, and electrical signals in parallel. In primary loop state prediction at nuclear power plants, it achieves a root mean square error (RMSE) as low as 0.023, reducing error by 54% compared to single-signal models [33].

4.4 Model accuracy comparison and engineering selection recommendations

Based on the comprehensive performance comparison in Table 8, Table 9 and Figure 9, the following engineering selection recommendations are proposed: (1) High-accuracy priority scenarios (e.g., reactor coolant pumps): adopt multi-scale self-attention models (MSCSAN). (2) Dynamic operating condition scenarios (e.g., load-following operation): prioritize transfer learning models (AFARN/MTST-CNN). (3) Edge computing scenarios (e.g., sensor nodes): utilize lightweight models such as MTA-CNN or OCSVM. (4) Few-sample new equipment: recommend Bayesian–deep learning fusion models [12, 30, 34].

Fig. 9

Radar chart of key performance dimensions of soft sensing models.

Table 8

Unified benchmark comparison of representative soft sensing models for pump fault diagnosis.

Table 9

Comprehensive performance analysis of soft sensing models in terms of accuracy, robustness, real-time capability, and applicability.

Common Challenges to Note: (1) Lab-to-Field Gap: Most models achieve over 95% accuracy in simulated environments, but experience an average drop of 12.7% in high-noise nuclear power plant settings [34]. (2) Computation–Accuracy Trade-off: The MSCSAN model requires 12.8 × 10⁶ FLOPs, making it difficult to deploy directly in legacy nuclear power plant control systems [11]. (3) Multi-Fault Coupling Bottleneck: When bearing pitting and impeller cavitation occur simultaneously, model accuracy generally drops by 15–25% [28].

5 Research limitations and challenges of soft sensing technologies

5.1 Insufficient generalization and dynamic adaptability of models

Current soft sensing models face significant challenges in their general applicability within the complex and variable environments of nuclear power systems, primarily in the following three areas:

(1) Weak generalization across operating conditions: Most models are only validated under specific fault types or steady-state conditions. Their performance significantly degrades when applied to the highly variable real-world environments of nuclear power plants. For example, the transfer subspace model developed by Qu Wu and Yan Gaowei achieved 92.3% accuracy in wet ball mill applications but dropped to 68.5% when applied to nuclear power pumps. The key reason lies in the higher structural coupling and more complex failure mechanisms in nuclear systems [31]. Similarly, the multi-scale convolutional self-attention model proposed by Li et al. achieved 99.5% accuracy on simulation data. However, it was only tested on standard failures such as outer ring fractures and pitting, raising concerns about its reliability under multiple simultaneous interferences (e.g., electromagnetic interference combined with mechanical vibration) [11].

(2) Lack of dynamic response mechanisms: Existing models lack online adaptability in the face of frequent condition switching in nuclear power plants. Liu et al. reported that when operating parameters change dynamically, the accuracy of deep models trained on static data can plummet by 18.7%, with total failure under some parameter combinations [18]. Although the Bayesian-EfficientNet model by Li et al. attempted to address this issue, it was tested only in closed environments that did not consider real-world sensor drift or communication delays in nuclear settings, leaving its post-transfer stability in doubt [34].

(3) Insufficient modeling of multi-fault coupling: Current approaches struggle to handle multi-fault concurrent scenarios. Mwaur et al. found that when bearing wear and seal failure occur simultaneously, traditional classifiers exhibit a misclassification rate as high as 32.4%, mainly due to overlapping fault features in the time-frequency domain (e.g., confusion between 1.2 kHz and 1.8 kHz resonance peaks) [3]. Although Zogg et al. proposed a parameter clustering method to enhance interpretability, its effectiveness in the highly coupled systems of nuclear power plants has yet to be validated [44].

Beyond model-level limitations, the harsh operating environment of nuclear power systems further exacerbates the failure of existing soft sensing models. Domain shift occurs when models trained on laboratory or single-condition datasets encounter real-world plant conditions involving fluctuating loads, fluid–structure interactions, and stochastic noise sources. Under these shifts, feature distributions deviate from training-time assumptions, causing deep models to misinterpret previously unseen operating patterns. Additionally, sensor drift induced by thermal cycling, electromagnetic interference, and mechanical vibration alters baseline measurements over time, progressively degrading decision boundaries learned by data-driven models. In multi-fault coupling scenarios, overlapping signatures and cross-frequency interference introduce ambiguity that exceeds the representational capacity of conventional architectures. These factors reveal a fundamental robustness gap between controlled experimental validations and deployment-grade model performance in nuclear environments.

5.2 Data dependency and lack of engineering validation

The reliability of soft sensing models is highly dependent on data quality and the depth of engineering validation. At present, there are three major limitations. As shown in Table 10, typical cases highlight the following issues: Tan et al.’s cavitation diagnosis model for mixed-flow pumps performed well under controlled experimental setups, but failed to consider key real-world nuclear plant conditions such as radiation noise (>85 dB) and coolant impurity interference. As a result, its accuracy dropped by 23.5% upon field deployment [28]. Bu and Guo developed a horizontal pump monitoring platform that achieved 90% accuracy, but the system was only validated on a single pump type under static operating conditions, making it unsuitable for the diverse equipment found in nuclear power plants [2]. More critically, the long-term prediction model by Li et al. relies heavily on extensive historical datasets and high-performance computing resources. This approach falls short of meeting nuclear safety regulations that require fault response within a matter of minutes [46]. Furthermore, Xiao et al. pointed out the issue of evidence conflict when integrating signals from vibration, temperature, and current sensors. In cases where the source data are mutually contradictory (e.g., vibration indicates failure while current appears normal), the model’s misjudgment rate surged to 31.2% [47].
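
Evidence conflict between sensors can be quantified, for instance, with the conflict coefficient of Dempster's rule of combination. The source does not state which fusion rule Xiao et al. used, so the choice of Dempster–Shafer theory here is an assumption, as are the mass values assigned to the two sensors.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Masses are dicts mapping frozenset hypotheses to belief mass. Returns the
    normalized combined masses and the conflict coefficient K (the total mass
    assigned to pairs of hypotheses with an empty intersection).
    """
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    fused = {h: w / (1.0 - conflict) for h, w in combined.items()}
    return fused, conflict

FAULT = frozenset({"fault"})
NORMAL = frozenset({"normal"})
EITHER = FAULT | NORMAL   # mass left unassigned (ignorance)

# Hypothetical case from the text: vibration indicates failure, current looks normal.
vibration = {FAULT: 0.8, NORMAL: 0.1, EITHER: 0.1}
current = {FAULT: 0.1, NORMAL: 0.8, EITHER: 0.1}

fused, conflict = dempster_combine(vibration, current)
```

With these assumed masses most of the joint mass falls on contradictory pairs, and the fused result no longer favours either hypothesis, which illustrates why strongly conflicting sources can drive up the misjudgment rate.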

Table 10

Analysis of data challenges and mitigation strategies in nuclear pump fault diagnosis.

Beyond these case-specific failures, the harsh operating conditions of nuclear power plants introduce several fundamental mechanisms that further degrade the performance of existing soft sensing models. First, domain shift arises when models trained on laboratory or single-condition datasets encounter real plant environments featuring fluctuating coolant flow, multi-source vibration, and radiation-induced stochastic disturbances. These shifts alter the statistical distribution of features and cause deep models to misinterpret unseen patterns. Second, long-term radiation exposure, thermal cycling, and mechanical vibration can induce sensor drift, shifting baseline measurements in ways that invalidate previously learned decision boundaries. Third, intermittent communication delays and electromagnetic interference frequently cause missing or partial sensor data, breaking temporal continuity and triggering unstable behavior in sequence models. These failure mechanisms demonstrate that data dependency is not merely a matter of dataset scale, but a deeper structural mismatch between real nuclear operating environments and the assumptions embedded in current model architectures.

Taken together, these cases illustrate that the lack of cross-condition data diversity, incomplete engineering validation, and inconsistencies among multi-source signals substantially constrain the generalization capability of soft sensing models. Moreover, data sensitivity, strict regulatory policies, and heterogeneous site-specific constraints further hinder dataset aggregation across nuclear facilities, limiting opportunities for large-scale model training. These challenges underscore the need for privacy-preserving data strategies such as anonymization protocols, hierarchical access control, and federated or knowledge-distillation–based learning frameworks, which offer feasible pathways to strengthen model robustness under practical nuclear engineering conditions.

In addition to enabling collaborative model training without data sharing, federated learning frameworks in nuclear power plants must also address the stringent data confidentiality and cybersecurity requirements. One key challenge is ensuring data privacy during the learning process, which can be mitigated through advanced encryption techniques such as homomorphic encryption and secure multi-party computation. These methods allow computations to be performed on encrypted data without revealing sensitive information. Moreover, regulatory restrictions such as data localization laws and compliance with nuclear safety standards (e.g., IEC 61508) can affect the scalability of soft sensing models across different nuclear sites. Therefore, integrating these privacy-preserving strategies into federated learning frameworks is essential to ensure both model performance and compliance with the complex regulatory landscape of nuclear power facilities.
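
The principle that an aggregator can compute a joint model update without seeing any individual site's contribution can be sketched with pairwise additive masking, a simple building block related to secure multi-party computation. This is a toy simulation, not a secure protocol: a single seeded random generator stands in for the pairwise shared secrets that real sites would negotiate, and the update vectors are invented.

```python
import random

def masked_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum: site i adds, site j subtracts."""
    rng = random.Random(seed)   # stand-in for per-pair shared secrets (assumption)
    n = len(updates)
    dim = len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for d in range(dim):
                masked[i][d] += mask[d]
                masked[j][d] -= mask[d]
    return masked

def aggregate(vectors):
    """Element-wise mean, as a federated averaging server would compute."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical local model updates from three plants.
updates = [
    [0.10, -0.20, 0.05],
    [0.30, 0.10, -0.15],
    [-0.05, 0.25, 0.20],
]
masked = masked_updates(updates)
global_update = aggregate(masked)
plain_mean = aggregate(updates)
```

The server only ever receives the masked vectors, yet the aggregated result matches the plain average up to floating-point rounding because every mask is added once and subtracted once.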

5.3 Interpretability and safety compliance bottlenecks

As shown in Figure 10, intelligent fault diagnosis faces three major challenges: data scarcity and heterogeneity, difficulty in model generalization and transfer across scenarios, and the lack of interpretability and safety compliance in decision-making processes. In high-safety-critical domains such as nuclear power, insufficient model interpretability has become a core barrier to real-world deployment.

Fig. 10

Key limitations and challenges in intelligent fault diagnosis.

Black-box Decision Risk: Soft sensing models dominated by deep learning generally lack transparency in their decision-making processes. For instance, although Zaman et al.’s dual-scale convolutional autoencoder achieved 100% classification accuracy, it failed to clarify the contribution weight of each frequency band to the fault decision, making it difficult for maintenance personnel to verify reliability [29]. Similarly, Du et al. pointed out that while compressed sensing improves efficiency in feature extraction, it breaks the causal chain between fault physical mechanisms and signal evolution pathways, undermining diagnostic traceability [40].

A critical yet often overlooked challenge lies in the mismatch between modern deep learning architectures and the stringent safety-assurance requirements of nuclear power systems [48]. Under real plant disturbances – such as coolant flow fluctuations, radiation-induced signal distortion, and rapidly evolving multiphysics interactions – black-box models tend to produce unstable or non-monotonic decision paths that cannot be traced back to interpretable physical mechanisms. This instability directly violates nuclear diagnostic standards that demand causal transparency, verifiable decision logic, and bounded risk behavior under abnormal conditions. Moreover, existing interpretability tools, including attention visualization and feature attribution, often fail to maintain consistency across operating conditions, making them insufficient for regulatory certification. These limitations highlight the urgent need for hybrid interpretability frameworks that integrate physical priors, uncertainty quantification, and formal verification, both to ensure compliance with nuclear-grade safety requirements and to enhance model reliability in safety-critical decisions.

In addition to interpretability challenges, current soft sensing technologies do not yet satisfy the evaluation standards and safety-integrity requirements mandated in nuclear diagnostics. Nuclear regulatory guidelines and IEC 61508 require diagnostic systems to provide traceable decision processes, bounded misjudgment risks, and deterministic fail-safe behaviors. Furthermore, system evaluation must incorporate uncertainty quantification, allowable fault-tolerance thresholds, maximum false-alarm probabilities, and strict response-time requirements for safety-critical faults. Most existing deep learning–based soft sensors lack these verification mechanisms, revealing a significant gap between academic model development and nuclear-grade deployment requirements.
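
The evaluation quantities listed above can be collected into a simple acceptance check. The sketch assumes binary labels (1 = fault, 0 = normal) and per-decision latencies in milliseconds; the limit values are placeholders chosen for illustration and are not taken from IEC 61508 or any regulatory guideline.

```python
def acceptance_report(y_true, y_pred, latencies_ms,
                      max_false_alarm, max_missed, max_latency_ms):
    """Compare false-alarm rate, missed-detection rate, and worst-case latency
    against caller-supplied limits."""
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    false_alarm_rate = false_pos / negatives
    missed_rate = false_neg / positives
    worst_latency = max(latencies_ms)
    checks = {
        "false_alarm_ok": false_alarm_rate <= max_false_alarm,
        "missed_detection_ok": missed_rate <= max_missed,
        "latency_ok": worst_latency <= max_latency_ms,
    }
    return {
        "false_alarm_rate": false_alarm_rate,
        "missed_detection_rate": missed_rate,
        "worst_latency_ms": worst_latency,
        "checks": checks,
        "compliant": all(checks.values()),
    }

# Hypothetical evaluation run with placeholder limits.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0]
latencies_ms = [12.0, 18.5, 9.3]
report = acceptance_report(y_true, y_pred, latencies_ms,
                           max_false_alarm=0.2, max_missed=0.1, max_latency_ms=50.0)
```

Reporting each check separately, instead of a single accuracy figure, makes it visible which requirement a candidate model fails; in this example the missed-detection limit is violated even though false alarms and latency are within bounds.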

In safety-critical fields like nuclear power, fault diagnosis systems must not only achieve high recognition accuracy but also meet stringent safety compliance requirements. International standards explicitly mandate that diagnostic systems possess traceability of false positives and false negatives, auditability of decision processes, and fail-safe mechanisms under system malfunctions or anomalies. However, current soft sensing technologies still have clear limitations in these aspects. On the one hand, many deep learning models lack interpretability, making it difficult to clearly establish causal links between input signals and model outputs. This is particularly problematic in complex conditions or multi-physics coupling scenarios, where the decision logic often becomes unstable or uncontrollable, limiting the model’s trustworthiness in high-security settings. Even when physical priors or attention mechanisms are introduced, they are mostly constrained to idealized assumptions and struggle to handle real-world disturbances and fluid dynamic variations [12]. On the other hand, although some studies attempt to enhance model reliability via uncertainty quantification or Bayesian inference, existing models generally lack formal alignment with nuclear safety integrity levels. There are no standardized mechanisms to guarantee compliance for risk tolerance, response latency, or misjudgment thresholds. This gap – technological advancement outpacing regulatory support – is a critical bottleneck preventing soft sensing models from moving beyond laboratory environments and into practical nuclear applications [34].

Real-time vs. Reliability Trade-off: The demand for edge deployment intensifies the challenge of model interpretability. For example, Liu et al.’s OCSVM-based edge node achieves a rapid 8.2 ms response time, but only provides binary anomaly flags – lacking any diagnosis of fault type or severity [17]. Similarly, Wang et al.’s lightweight MTA-CNN model reduces computational cost to 1.8 × 10⁶ FLOPs, but this optimization comes at the expense of attention weight visualization granularity, limiting its diagnostic transparency [30].

6 Future research directions

With the increasing intelligence and informatization of nuclear power equipment, the application of soft sensing technology in nuclear pump fault diagnosis faces higher demands and more complex challenges. To achieve more accurate, efficient, and adaptive fault prediction systems, future research must continue to make breakthroughs across multiple dimensions, including theoretical refinement, model evolution, system integration, and safety assurance. This section provides a systematic outlook on the key research directions.

6.1 Innovation in intelligent algorithms and model architectures

To overcome current limitations in adaptability to dynamic conditions, few-shot learning, and interpretability, a new generation of intelligent algorithms is urgently needed: (1) Meta-Learning and Domain Generalization: Establish meta-transfer learning frameworks (e.g., MAML++) that enable models to rapidly adapt to new operating conditions using task-aware meta-optimizers. The aim is to reduce adaptation time during operational transitions from hours to minutes. (2) Physics-Neural Hybrid Architectures: Develop Neural Ordinary Differential Equation (Neural ODE) models that embed physical constraints, such as the Navier-Stokes equations, into the network architecture. This facilitates a two-way coupling of first-principles physics and data-driven inference. (3) Generative Adversarial Enhancement: Employ generative models, such as diffusion-based models, to create synthetic samples under extreme operating conditions, addressing the scarcity of real-world fault data.
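One common way to express the coupling of physics and data described in item (2) is a composite training objective. The form below is a minimal sketch of the soft-penalty (physics-informed) variant, not the hard architectural embedding the text envisions, and every symbol in it is an assumption introduced here for illustration: $f_\theta$ is the soft sensor model, $(x_i, y_i)$ are $N$ labelled measurements, $z_j$ are $M$ collocation points, $\mathcal{R}$ is the residual operator of the governing equations (for example, the Navier-Stokes equations), and $\lambda$ is a weighting factor.

```latex
\mathcal{L}(\theta)
  = \underbrace{\frac{1}{N}\sum_{i=1}^{N}\bigl\lVert y_i - f_\theta(x_i)\bigr\rVert^{2}}_{\text{data misfit}}
  \;+\; \lambda\,
    \underbrace{\frac{1}{M}\sum_{j=1}^{M}\bigl\lVert \mathcal{R}\!\left[f_\theta\right](z_j)\bigr\rVert^{2}}_{\text{physics residual}}
```

With $\lambda = 0$ the objective reduces to purely data-driven training; increasing $\lambda$ penalizes predictions that violate the governing equations, which is one route to better behavior under operating conditions absent from the training data.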

As shown in Table 11, cutting-edge intelligent technologies such as quantum convolutional neural networks, neuro-symbolic systems, and spiking neural networks hold great potential in improving training efficiency, enhancing interpretability, and reducing energy consumption. These innovations offer diverse pathways and solid theoretical support for the intelligent evolution of soft sensing models.

Table 11

Prospects and theoretical foundations of cutting-edge intelligent technologies in soft sensing models.

6.2 System integration and engineering validation platform development

To promote the transition of soft sensing technologies from laboratory research to engineering applications, a three-tier validation system must be established:

(1) Edge-cloud collaborative deployment architecture: As shown in Figure 11, a closed-loop system combining lightweight diagnostics at the edge with deep optimization in the cloud should be developed.
(2) High-fidelity validation platform: Develop a fault simulation testbed capable of replicating multiple fault types in nuclear power pumps.
(3) Safety system integration: To ensure the stable operation of soft sensing models in high-safety scenarios, intelligent diagnostic systems must be deeply integrated with existing nuclear power plant control architectures. Safety system integration requires not only high sensitivity in detecting critical faults but also consistent response behavior and fault tolerance under complex operating conditions. Future system designs should focus on establishing a closed-loop mechanism that links state perception, anomaly detection, and safety response, ultimately enabling coordinated interaction between fault diagnosis and safety control logic.

Fig. 11

Architecture of a digital twin and edge-cloud collaborative system.

Based on the above components, a conceptual system-level integration pathway can be summarized to illustrate how soft sensing technologies may be embedded into future nuclear power plant operation architectures. In such a pathway, soft sensors function as upstream perception modules that continuously extract health indicators and transmit them to both automatic control loops and supervisory safety logic. The diagnostic outputs interface with a safety-assurance layer that enforces nuclear-grade requirements for traceability, bounded risk behavior, and deterministic responses under abnormal conditions. Meanwhile, a hierarchical edge-cloud architecture enables real-time inference at edge nodes, while the cloud layer supports large-scale model optimization, digital-twin-based simulation, and cross-site knowledge transfer. This conceptual integration pathway reflects the direction suggested by recent literature and outlines how soft sensing models may evolve toward deployable, safety-compliant components within intelligent nuclear power systems.
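The interface between diagnostic outputs and the safety-assurance layer can be illustrated with a small decision gate. The following is a minimal sketch under stated assumptions, not a design taken from the reviewed literature: the function name `safety_gate`, the use of an ensemble of fault probabilities as the uncertainty estimate, and both threshold values are hypothetical placeholders rather than regulatory figures.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass(frozen=True)
class GateDecision:
    action: str        # "normal", "alarm", or "fail_safe"
    fault_prob: float  # ensemble-mean fault probability
    spread: float      # ensemble disagreement (population std. dev.)
    reason: str        # human-readable trace for audit logs


def safety_gate(member_probs: List[float],
                alarm_threshold: float = 0.5,   # hypothetical value
                max_spread: float = 0.15        # hypothetical value
                ) -> GateDecision:
    """Map ensemble fault probabilities to one deterministic action."""
    # Missing, NaN, or out-of-range outputs are treated as a malfunction
    # of the soft sensor itself, so the gate falls back to the safe state.
    if not member_probs or any(not (0.0 <= p <= 1.0) for p in member_probs):
        return GateDecision("fail_safe", float("nan"), float("nan"),
                            "invalid model output")
    p = mean(member_probs)
    s = pstdev(member_probs)
    # High disagreement means the diagnosis is not trustworthy enough
    # to act on automatically.
    if s > max_spread:
        return GateDecision("fail_safe", p, s,
                            "ensemble disagreement too high")
    if p >= alarm_threshold:
        return GateDecision("alarm", p, s,
                            "fault probability above threshold")
    return GateDecision("normal", p, s,
                        "fault probability below threshold")
```

Every input maps to exactly one action, and the `reason` field records why, which is the kind of deterministic and traceable behavior the safety-assurance layer is meant to enforce. In a real plant the thresholds would have to be derived from the applicable safety requirements.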

7 Conclusion

The application of soft sensing technology in fault diagnosis of nuclear power pumps has gradually become a crucial technical pathway for achieving key state awareness and intelligent maintenance under extreme conditions such as high temperature and high pressure. Its development has achieved breakthroughs in the following three aspects: First, by establishing mapping models between observable variables and target states, soft sensing methods have effectively alleviated monitoring blind spots caused by limited sensor layouts, significantly enhancing the perception of unmeasurable parameters (such as pump cavity stress and bearing wear). Second, the introduction of deep learning and transfer learning models has enabled adaptive feature extraction and cross-condition generalization. These models maintain high diagnostic accuracy and robustness in complex environments with concurrent faults and high noise disturbances, effectively reducing reliance on manual feature engineering and expert experience. Third, by constructing a closed-loop system of “state awareness – intelligent diagnosis – predictive maintenance”, soft sensing technology provides feasible support for the lifecycle management of nuclear pump equipment and offers a methodological foundation for the future development of intelligent nuclear power systems.

Despite notable progress, several key challenges remain for the deployment of soft sensing technologies in real nuclear engineering scenarios. These include limited model generalizability due to variations in operational conditions, underdeveloped cross-site data fusion and privacy-preserving mechanisms, and the lack of system-level validation platforms covering the entire equipment lifecycle. To address these challenges, future research should expand in three critical directions: (1) Theoretical level: Enhance the fusion modeling of multi-source heterogeneous information, and develop physically constrained, interpretable neural network architectures to improve model reliability and transparency in non-ideal data environments. (2) Standardization level: Promote the establishment of a unified performance evaluation framework for soft sensing, clearly defining its engineering applicability boundaries in terms of diagnostic accuracy, response delay, and false alarm tolerance. (3) Engineering level: Leverage demonstration reactor platforms to carry out high-fidelity, multi-physics validation of soft sensing systems, and explore integration pathways with digital twins, federated learning, and other advanced technologies to improve real-time deployment capabilities and cross-scenario transferability.
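The unified evaluation framework proposed in item (2) could report diagnostic accuracy, response delay, and false alarm behavior together. The sketch below illustrates one possible shape for such a report, assuming binary labels (1 = fault, 0 = healthy); the function name `evaluate` and all three limit values are hypothetical and are not drawn from any standard.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EvalReport:
    accuracy: float
    false_alarm_rate: float    # FP / (FP + TN)
    missed_fault_rate: float   # FN / (FN + TP)
    worst_latency_ms: float
    within_limits: bool


def evaluate(y_true: Sequence[int],
             y_pred: Sequence[int],
             latencies_ms: Sequence[float],
             max_false_alarm_rate: float = 0.05,   # hypothetical limit
             max_missed_fault_rate: float = 0.01,  # hypothetical limit
             max_latency_ms: float = 50.0          # hypothetical limit
             ) -> EvalReport:
    """Summarize accuracy, error rates, and latency against stated limits."""
    n = len(y_true)
    if n == 0 or len(y_pred) != n or len(latencies_ms) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against division by zero when one class is absent.
    far = fp / (fp + tn) if fp + tn else 0.0
    mfr = fn / (fn + tp) if fn + tp else 0.0
    worst = max(latencies_ms)
    ok = (far <= max_false_alarm_rate
          and mfr <= max_missed_fault_rate
          and worst <= max_latency_ms)
    return EvalReport((tp + tn) / n, far, mfr, worst, ok)
```

Reporting the false alarm rate and the missed fault rate separately, rather than accuracy alone, matters because the two error types carry very different consequences in a safety-critical setting.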

In addition, several actionable implementation pathways can further support future development: (1) establish federated learning frameworks that enable multi-site collaborative model training without sharing raw data, thereby ensuring privacy protection and regulatory compliance; (2) integrate hydraulic, thermodynamic, and electromagnetic physical principles into deep models through physics-informed constraints to enhance interpretability and robustness under unseen operating conditions; (3) incorporate uncertainty quantification modules to provide confidence-aware diagnostic outputs suitable for safety-critical nuclear equipment; and (4) develop edge-cloud collaborative architectures that combine rapid on-site inference with cloud-based optimization, improving real-time deployment and cross-scenario adaptability.
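The aggregation step behind pathway (1) can be sketched as a sample-size-weighted average of per-site model parameters, in the style of federated averaging. This is a minimal illustration, assuming each site sends only its locally trained parameters and its sample count; the function name `fed_avg`, the dictionary layout of the weights, and the site names in the usage example are hypothetical.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, List[float]]  # parameter name -> flat list of values


def fed_avg(site_updates: List[Tuple[Weights, int]]) -> Weights:
    """Average per-site parameters, weighted by local sample count.

    Only parameters and counts reach the aggregator; raw plant data
    never leave the site.
    """
    if not site_updates:
        raise ValueError("no site updates received")
    total = sum(n for _, n in site_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    reference = site_updates[0][0]
    global_weights: Weights = {}
    for name, ref_values in reference.items():
        acc = [0.0] * len(ref_values)
        for weights, n in site_updates:
            values = weights[name]
            if len(values) != len(ref_values):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, v in enumerate(values):
                acc[i] += v * n / total
        global_weights[name] = acc
    return global_weights


# Usage: two hypothetical plants with different amounts of local data.
plant_a = ({"w": [1.0, 2.0], "b": [0.0]}, 100)
plant_b = ({"w": [3.0, 4.0], "b": [1.0]}, 300)
merged = fed_avg([plant_a, plant_b])
```

A practical deployment would add secure aggregation, update validation, and repeated communication rounds; the sketch shows only why raw data need not be shared.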

In summary, soft sensing technology holds significant potential for enhancing the intelligent operation and maintenance of nuclear power pumps. Its future development should focus on transitioning from methodological innovation to engineering-grade system integration and validation, supported by practical pathways such as federated learning for cross-site data privacy, physics-informed hybrid modeling, uncertainty-aware diagnostic strategies, and edge-cloud collaborative deployment. By advancing these implementation-oriented research directions, soft sensing can move decisively from theoretical exploration to critical application, ultimately establishing a new paradigm of interpretable, verifiable, and transferable intelligent fault diagnosis tailored to high-safety-level energy systems.

Funding

This work was funded by the China Postdoctoral Science Foundation [2025M780568].

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability statement

Data will be made available on request.

Author contribution statement

Yukun Zhang: Writing – original draft, Writing – review & editing. Qian Huang: Software, Data curation. Wei Xu: Data curation, Funding acquisition. Yifan Zhi: Software, Visualization. Xiuli Wang: Writing – original draft, Conceptualization.

References

Q. Ye, Development of modern nuclear energy in China, Sci. Technol. Rev. 2024, 1–2 (2024).
J. Bu, G. Guo, Condition monitoring and fault diagnosis of horizontal pump units in nuclear power plants, Power Equip. 2019, 108–111 (2019).
A.M. Mwaura, Y. Liu, E. Zio, Prediction of the loss of feed water fault signatures using machine learning techniques, Sci. Technol. Nucl. Instal. 2021, 1–13 (2021).
United States Nuclear Regulatory Commission (U.S. NRC), Eighth National Report for the Convention on Nuclear Safety, Washington, DC, 2019.
Q. Du, K. Zhang, Condition monitoring and fault diagnosis of hydraulic pumps based on self-generated vibration signals, Trans. Chin. Soc. Agri. Eng. 2007, 120–123 (2007).
H. Sun, S. Yuan, Y. Luo, Unsteady internal flow analysis of centrifugal pumps under fluid-machine-electromagnetic coupling effects, Ann. Nucl. Energy 89, 224–231 (2016).
W. Liu, Operating condition recognition of pump indicator diagrams based on support vector machines, Value Eng. 2010, 156–157 (2010).
G. Reges, M. Fontana, M. Ribeiro, T. Silva, O. Abreu, R. Reis, L. Schnitman, Electric submersible pump vibration analysis under several operational conditions for vibration fault differential diagnosis, Ocean Eng. 219 (2021).
S. Schmidt, P.S. Heyns, K.C. Gryllias, An informative frequency band identification framework for gearbox fault diagnosis under time-varying operating conditions, Mech. Syst. Signal Proc. 158 (2021).
N.R. Sakthivel, B.B. Nair, V. Sugumaran, Soft computing approach to fault diagnosis of centrifugal pump, Appl. Soft Comput. 12, 1574–1581 (2012).
C. Li, X. Liu, H. Wang, M. Peng, Fault diagnosis method for centrifugal pumps in nuclear power plants based on a multi-scale convolutional self-attention network, Sensors (Basel) 25 (2025).
W. Cheng, X. Liu, J. Xing, X. Chen, B. Ding, R. Zhang, K. Zhou, Q. Huang, AFARN: Domain Adaptation for intelligent cross-domain bearing fault diagnosis in nuclear circulating water pump, IEEE Trans. Indus. Inform. 19, 3229–3239 (2023).
L. Qing, L. Gu, Y. Sun, Application of current spectrum recognition in plunger pump fault diagnosis, Mach. Tool Hydraulics 2015, 195–198 (2015).
S. Choi, M.S. Haque, M.T.B. Tarek, V. Mulpuri, Y. Duan, S. Das, V. Garg, D.M. Ionel, M.A. Masrur, B. Mirafzal, H.A. Toliyat, Fault diagnosis techniques for permanent magnet AC machine and drives – A review of current state of the art, IEEE Trans. Transp. Electrif. 4, 444–463 (2018).
J.S. Rapur, R. Tiwari, Experimental fault diagnosis for known and unseen operating conditions of centrifugal pumps using MSVM and WPT based analyses, Measurement 147 (2019).
C. Zhang, P. Chen, F. Jiang, J. Xie, T. Yu, Fault diagnosis of nuclear power plant based on sparrow search algorithm optimized CNN-LSTM Neural network, Energies 16 (2023).
G. Liu, Y. Du, L. Guo, E. Shi, Z. Wang, Z. Yan, Working condition and fault diagnosis of submersible electric pumps based on one-class support vector machine, J. China Univ. Petrol. (Edition of Natural Science) 2021, 162–168 (2021).
J. Liu, X. Yang, R. Macián-Juan, N. Kosuch, A novel transfer CNN with spatiotemporal input for accurate nuclear power fault diagnosis under different operating conditions, Ann. Nucl. Energy 194 (2023).
C. Wang, L. Luo, H. Xu, Soft sensing technology and its application in tool fault diagnosis, Tool Technol. 2007, 69–71 (2007).
G. Han, X. Wu, Q. Zhang, F. Mao, Application of indicator diagram recognition technology in working condition diagnosis of beam pumping units, Oil Drill. Prod. Technol. 2003, 70–74, 96–97 (2003).
H. Tang, D. Chengyan, Intelligent diagnosis method of multi-operating axial piston pump based on adaptive fusion of virtual and real data features, Eng. Res. Exp. 7, 045589 (2025).
R.J. Patton, F.J. Uppal, C.J. Lopez-Toribio, Soft computing approaches to fault diagnosis for dynamic systems: A survey, IFAC Proc. 33, 303–315 (2000).
Q. Chen, S. Wang, X. Zhang, S. Zhang, Z. Chai, W. Li, Force source identification and fault diagnosis of reciprocating bilge pumps, J. Harbin Eng. Univ. 2013, 471–476 (2013).
Y. Huang, J. Gao, Reliability analysis of rocket shooting density under small sample conditions, Comput. Eng. Appl. 2013, 255–257, 262 (2013).
S. Nandi, T.C. Ilamparithi, S.B. Lee, D. Hyun, Detection of eccentricity faults in induction machines based on nameplate parameters, IEEE Trans. Indus. Electron. 58, 1673–1683 (2011).
J. Zheng, L. Ma, Y. Wu, L. Ye, F. Shen, Nonlinear dynamic soft sensor development with a supervised hybrid CNN-LSTM network for industrial processes, ACS Omega 7, 16653–16664 (2022).
Z. Zhang, A. Tang, T. Zhang, A transfer-based convolutional neural network model with multi-signal fusion and hyperparameter optimization for pump fault diagnosis, Sensors (Basel) 23 (2023).
Y. Tan, G. Wu, Y. Qiu, H. Fan, J. Wan, Fault diagnosis of a mixed-flow pump under cavitation condition based on deep learning techniques, Front. Energy Res. 10 (2023).
W. Zaman, Z. Ahmad, J.M. Kim, Fault Diagnosis in centrifugal pumps: A dual-scalogram approach with convolution autoencoder and artificial neural network, Sensors (Basel) 24 (2024).
H. Wang, Z. Liu, D. Peng, M. Yang, Y. Qin, Feature-level attention-guided multitask CNN for fault diagnosis and working conditions identification of rolling bearing, IEEE Trans. Neural Netw. Learn. Syst. 33, 4757–4769 (2022).
W. Qu, G. Yan, Soft sensor modeling based on transfer subspace with integrated maximum mean discrepancy regularization, J. Chongqing Univ. Technol. (Natural Science Edition) 2020, 108–114 (2020).
X. Wang, Theory and method of soft sensing for complex industrial processes based on deep learning, PhD Thesis, Xi’an University of Technology, 2020.
T. Shi, J. She, P. Li, J. Jiang, W. Chen, A deep learning-based framework for the operation prediction of primary heat transfer loop in nuclear power plants, Front. Energy Res. 11 (2023).
S. Li, J. Chen, H. Lin, W. Wang, A composite fault diagnosis model for NPPs based on Bayesian-EfficientNet module, 2024 (2024).
Y. Luo, Y. Chen, X. Qin, Y. Chen, Fault prediction model for centrifugal pumps based on external magnetic field, J. Drainage Irrigation Mach. Eng. 2023, 649–654, 662 (2023).
S. Yang, H. Kim, Y. Hong, K. Yee, R. Maulik, N. Kang, Data-driven physics-informed neural networks: A digital twin perspective, Comput. Methods Appl. Mech. Eng. 428 (2024).
L. Albshaier, S. Almarri, A. Albuali, Federated learning for cloud and edge security: A systematic review of challenges and AI opportunities, Electronics 14 (2025).
J. Wang, Condition monitoring and fault diagnosis analysis of centrifugal water injection pump units, China Equipm. Eng. 2022, 20–22 (2022).
F. Immovilli, C. Bianchini, M. Cocconcelli, A. Bellini, R. Rubini, Bearing fault model for induction motor with externally induced vibration, IEEE Trans. Indus. Electron. 60, 3408–3418 (2013).
Z. Du, X. Chen, H. Zhang, H. Miao, Y. Guo, B. Yang, Feature identification with compressive measurements for machine fault diagnosis, IEEE Trans. Instrum. Measure. 65, 977–987 (2016).
M.F. Siddique, S. Ullah, J. Kim, A deep learning approach for fault diagnosis in centrifugal pumps through wavelet coherent analysis and s-transform scalograms with CNN-KAN, Comput. Mater. Continua 1–10 (2025).
M.F. Siddique, S. Ullah, J.-M. Kim, A deep learning approach for fault diagnosis in centrifugal pumps through wavelet coherent analysis and S-Transform scalograms with CNN-KAN, Comput. Mater. Continua 84 (2025).
M. Umar, Z. Ahmad, S. Ullah, F. Saleem, M.F. Siddique, J.M. Kim, Advanced fault diagnosis in milling machines using acoustic emission and transfer learning, IEEE Access 13, 100776–100790 (2025).
D. Zogg, E. Shafai, H.P. Geering, Fault diagnosis for heat pumps with parameter identification and clustering, Control Eng. Pract. 14, 1435–1444 (2006).
S. Zhou, L. Zhang, X. Yang, R. Luo, B. Du, W. Zeng, Remaining useful life prediction method of centrifugal pump rolling bearings based on digital twins, Sci. Rep. 15, 19513 (2025).
S. Li, J. Fang, Y. Wu, W. Wang, C. Li, J. Chen, A fuzzy reinforcement LSTM-based long-term prediction model for fault conditions in nuclear power plants (2024) https://doi.org/10.48550/arXiv.2411.08370.
F. Xiao, Z. Cao, A. Jolfaei, A novel conflict measurement in decision-making and its application in fault diagnosis, IEEE Trans. Fuzzy Syst. 29, 186–197 (2021).
M.F. Siddique, W. Zaman, M. Umar, J.-Y. Kim, J.-M. Kim, A hybrid deep learning framework for fault diagnosis in milling machines, Sensors 5866 (2025).
I. Cong, S. Choi, M.D. Lukin, Quantum convolutional neural networks, Nature Phys. 15, 1273–1278 (2019).
M. Setzu, R. Guidotti, A. Monreale, F. Turini, D. Pedreschi, F. Giannotti, GLocalX – From local to global explanations of black box AI Models, Artificial Intelligence 294 (2021).
K. Roy, A. Jaiswal, P. Panda, Towards spike-based machine intelligence with neuromorphic computing, Nature 575, 607–617 (2019).

Cite this article as: Zhang Y, Huang Q, Xu W, Zhi Y & Wang X. Progress and challenges in the application of soft sensing technology for nuclear power plant pump fault diagnosis, Res. Des. Nucl. Eng. 2, 2025011 (2026), https://doi.org/10.1051/rdne/2025011.

All Tables

Table 1 Summary of commonly used feature types in pump fault diagnosis.
Table 2 Engineering applicability analysis of traditional machine learning methods in nuclear power pump fault diagnosis.
Table 3 Performance comparison of different deep learning models in nuclear power pump fault diagnosis.
Table 4 Performance comparison and technical advantage analysis of transfer learning and hybrid strategy models.
Table 5 Comparison of multi-source signal fusion strategies in fault diagnosis of nuclear power pumps.
Table 6 Performance comparison of feature extraction techniques in nuclear pump fault diagnosis evolution.
Table 7 Comparison of accuracy, robustness, and computational resource requirements of mainstream deep learning models in nuclear power pump fault diagnosis.
Table 8 Unified benchmark comparison of representative soft sensing models for pump fault diagnosis.
Table 9 Comprehensive performance analysis of soft sensing models in terms of accuracy, robustness, real-time capability, and applicability.
Table 10 Analysis of data challenges and mitigation strategies in nuclear pump fault diagnosis.
Table 11 Prospects and theoretical foundations of cutting-edge intelligent technologies in soft sensing models.

All Figures

Fig. 1 Proportion of fault types in nuclear power pumps.
Fig. 2 Engineering applicability of traditional machine learning methods in nuclear power pump fault diagnosis.
Fig. 3 Performance comparison of different deep learning models in nuclear power pump fault diagnosis (literature-based performance comparison).
Fig. 4 Comparison of different model architectures.
Fig. 5 Performance comparison and technical advantages of transfer learning and hybrid strategy models.
Fig. 6 Accuracy improvement through multi-source signal fusion strategies in nuclear pump fault diagnosis.
Fig. 7 Performance comparison of feature extraction techniques in nuclear pump fault diagnosis.
Fig. 8 Comparison of accuracy, robustness, and computational resource requirements of mainstream deep learning models in nuclear pump fault diagnosis.
Fig. 9 Radar chart of key performance dimensions of soft sensing models.
Fig. 10 Key limitations and challenges in intelligent fault diagnosis.
Fig. 11 Architecture of a digital twin and edge-cloud collaborative system.
