      Global Energy Interconnection

      Volume 8, Issue 1, Feb 2025, Pages 143-159

      A transient stability assessment method for power systems incorporating residual networks and BiGRU-attention☆

Shan Cheng a,*, Qiping Xu a, Haidong Wang a, Zihao Yu b, Rui Wang a, Tao Ran a
(a College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443000, PR China; b State Grid Shandong Institute of Economics and Technology, Jinan 200030, PR China)

      Abstract

The traditional transient stability assessment (TSA) model for power systems has three disadvantages: capturing critical information during faults is difficult, aperiodic and oscillatory unstable conditions are not distinguished, and poor generalizability is exhibited in systems with high renewable energy penetration. To address these issues, a novel ResGRU architecture for TSA is proposed in this study. First, a residual neural network (ResNet) is used for deep feature extraction of transient information. Second, a bidirectional gated recurrent unit combined with a multi-head attention mechanism (BiGRU-Attention) is used to establish temporal feature dependencies. Their combination constitutes a TSA framework based on the ResGRU architecture. This method predicts three transient conditions: oscillatory instability, aperiodic instability, and stability. The model was trained offline using the stochastic gradient descent with warm restarts (SGDR) optimization algorithm, which significantly improves the generalizability of the model. Finally, simulation tests on the IEEE 145-bus and 39-bus systems confirmed that the proposed method has higher adaptability, accuracy, scalability, and rapidity than conventional TSA approaches. The proposed model is also more robust to incomplete PMU configurations, noisy PMU data, and packet loss. © 2025 Global Energy Interconnection Group Co. Ltd. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

      0 Introduction

As power systems evolve and large amounts of renewable energy become available [1], existing power systems have become more vulnerable to failure [2]. In power systems, the capacity of synchronous machines to maintain synchronization for brief intervals after a large perturbation (e.g., a three-phase fault) is known as transient stability. The ability to predict whether a fault will lead to the destabilization of a power system and to take effective and timely preventive measures is crucial for power system security and stable operation [3]. Therefore, appropriate transient stability assessment (TSA) methods are required. Time-domain simulation (TDS) [4], the transient energy function (TEF) [5], the extended equal-area criterion (EEAC) [6], and the single-machine equivalent method (SIME) [7] are currently some of the traditional TSA approaches. TDS is accurate but slow in calculation; TEF simplifies the model and increases calculation speed, but exhibits poor model adaptation and is difficult to use in complex power systems; EEAC has not been extensively studied and is relevant only for the classic generator model; SIME combines the benefits of EEAC and TDS and is more efficient than TDS, although at the expense of calculation precision [8].

Artificial intelligence has advanced quickly in recent years and, coupled with the fast evolution of data storage and computing devices, has brought us into the “era of big data.” In a power system, phasor measurement units (PMUs) and wide-area measurement systems (WAMSs) can gather primary data quickly and in real time for the entire power grid [9] to dynamically monitor the grid [10]. Consequently, data-driven machine learning (ML) techniques, such as decision trees (DTs) [11], support vector machines (SVMs) [12], K-nearest neighbors [13], extreme learning machines [14], and XGBoost [15], have been successfully used for TSA of power systems. However, the choice of initial features significantly affects the performance of ML algorithms, and ML has insufficient feature extraction capability for multidimensional data. Consequently, ML must rely on a priori artificial knowledge for feature extraction, which can lead to underfitting.

In recent years, deep learning (DL) technology has advanced notably, and breakthroughs have been achieved in many applications [16-18]. In contrast to ML, DL can autonomously extract input features from data, thereby avoiding the subjectivity associated with human intervention. DL exhibits a strong learning ability, a high data-driven performance ceiling, and strong generalizability, and it has been widely employed in TSA problems over the past few years owing to its strong data-learning ability. In [19], the capacity of a model for extracting features was enhanced by fusing a multi-branch stacked denoising autoencoder (MSDAE) with a logistic regression (LR) layer. In [20], the topology of the power system was considered when constructing the deep belief network (DBN) loss function. A particle swarm optimization approach was used in [21] to determine the number of units in each layer of the DBN model, which increased TSA accuracy. Nevertheless, the inputs of the DBN and MSDAE are restricted by the data dimensions, their training is slow, their parameter selection is challenging, and they easily fall into local optimal solutions, which significantly limits their application to TSA. In [22], gated recurrent units (GRUs) were introduced to learn time-series data for TSA. However, accelerating the training of a single GRU model using parallel computing is challenging, which reduces the model update efficiency. In [23], an intelligent time-adaptive TSA system based on LSTM was proposed. This model balances computational speed and efficiency for TSA, but it has a slow prediction speed and cannot extract local short-term time-series features along with spatial data features.

By contrast, convolutional neural networks (CNNs) can accommodate data of various dimensions [24] and can be used to extract higher-order features contained in localized short-term time series. In [8], dynamic time-series data were converted into RGB images, and a CNN-based TSA system was built. However, as the number of network layers increases, CNNs suffer from overfitting, vanishing weights, and vanishing gradients, among other limitations. To address these problems, a deep residual network (ResNet) was proposed in [25]. However, a single ResNet cannot adequately extract abstract features from macro long-term time-series data in power systems. From the perspective of model fusion, using a single model limits the performance of TSA.

However, when facing complex power system data during faults, traditional models struggle to capture key information, which limits the performance of TSA. In recent years, related studies have introduced the attention mechanism into TSA and achieved good results. Reference [26] designed a TSA model based on a two-stage attention mechanism and a GRU that significantly improved the ability of the model to recognize fault information. The attention mechanism enables the model to dynamically adjust its attention to different features according to the importance of the data. Thus, the model can focus further on the key features that contribute the most to fault classification. This ability to focus precisely helps the model capture key information during faults.

In power systems featuring high renewable energy penetration, the operational state and load distribution of the system become more complex and dynamic. This complexity and dynamism make it possible for traditional optimization algorithms to fall into local optimal solutions, leading to poor generalization of the model in this environment. Therefore, relevant measures must be implemented to improve the generalizability of the model. Currently, variants of the stochastic gradient descent (SGD) technique are used to train the majority of effective deep neural networks, and SGD-like algorithms remain competitive compared with advanced second-order optimization training algorithms [27]. A loss function that converges to a “flat” and “wide” local minimum indicates better generalizability [28]. In [29], it was found that Adam has limitations when converging to a relatively flat minimum and that its generalizability is worse than that of SGD. In [30], the superior generalizability of SGD over Adam was demonstrated using Lévy-driven stochastic differential equations (SDEs). Reference [31] reported that Adam easily finds sharp minima and is the most popular optimizer in NLP tasks, whereas SGD is better than adaptive optimizers at finding flat minima and is the most frequently employed optimizer in computer vision. In this study, the input data are image data, so the SGD optimizer is more appropriate. Although SGD generalizes better than Adam, the momentum of Adam enables the model to effectively escape saddle points. In [32], stochastic gradient descent with warm restarts (SGDR) was proposed, which avoids sharp and narrow local minima and accelerates the escape from saddle points by periodically adjusting the learning rate, combining the advantages of the SGD and Adam optimizers. In [33], SGDR was applied to TSA to enhance the generalization ability of the model, and the experiments showed that the model was robust to PMU noise and data loss.

In summary, current DL-based TSA has the following problems. First, owing to the massive volume of high-dimensional transient data, traditional models find it difficult to extract useful key information during failures. Second, current methods are binary classification models that consider only the aperiodic unstable mode caused by insufficient synchronizing torque and ignore the oscillatory unstable mode caused by insufficient damping torque, which inevitably endangers the stability and safety of power system operation. In addition, the traditional Adam optimization algorithm may cause the model to fall into a local minimum, whereas the SGD algorithm is slow to escape from saddle points and has a slow training speed. In systems with a high penetration rate of new energy, problems such as insufficient generalization ability and decreased model evaluation performance emerge.

To address the aforementioned issues, the following three contributions are made in this study:

1) To extract fault information when a transient fault occurs in a power system, ResNet is fused with a BiGRU model containing a multi-head attention mechanism to create a new TSA framework (ResGRU). ResNet is used to extract deep features from the transient information; simultaneously, the BiGRU with a multi-head attention mechanism is used to establish temporal dependencies and extract deep features from the time series. In addition to extracting deep transient features, the multi-head attention mechanism mitigates the gradient vanishing and gradient explosion that occur when BiGRU processes long time series, further improving the recognition performance of the model.

      2) The proposed model can accurately classify the three modes of stability, oscillatory instability, and aperiodic instability, which significantly reduces the safety risks associated with the power system.

3) Using SGDR with warm restarts, the model can jump out of local optima, improving its generalization ability and its robustness to data loss and noise. Thus, the model can be applied to power systems with a high penetration of new energy.

The remainder of this paper is organized as follows. The concept and categorization of power-system transient stability are briefly introduced in Section 1. Section 2 explains the basic structure of ResGRU. Section 3 describes the use of ResGRU for TSA. Section 4 details simulation experiments on the IEEE 145-bus and 39-bus systems. The paper is summarized in Section 5.

      1 Transient stability assessment and basic categories

      1.1 Transient stability assessment

      The capability of a power system to return to operational equilibrium following a physical disruption under specific starting operating circumstances is known as transient stability.The causes of power-system transient destabilization include short-circuit faults and sudden disconnection of lines or generators.Therefore, TSA is necessary for predicting the system stability after clearing system faults.The safety of a power system depends on the realization of rapid and precise evaluation techniques of transient stability.

      1.2 Synchronizing torque, damping torque, and transient instability model

      The electromagnetic torque of a synchronous motor can be divided into two components: the damping torque component, which is in phase with the speed deviation, and the synchronous torque component, which is in phase with the rotor angular deviation.

The two components of the electromagnetic torque of a synchronous machine are essential for the transient stability of the system [34]. Insufficient synchronizing torque can result in aperiodic instability. Case 2 in Fig. 1 illustrates an instance of aperiodic instability, i.e., instability of the rotor angle dynamics. This instability can be identified by the loss of synchronism that occurs during the first or second oscillation. In this case, prompt use of targeted emergency control measures is necessary to preserve stability and reduce the damage caused by system failure. These measures include generator tripping, reduction of the mechanical power of the accelerating generator, and forced excitation.

Fig. 1 Examples of oscillatory instability, aperiodic instability, and stability.

The lack of damping torque is the source of oscillatory instability of increasing amplitude. During the first two oscillations, the generator remains synchronized; however, the increasing amplitude of the oscillations may eventually lead to instability of the power system. Oscillatory instability is identified by the absence of damping in the rotor angle dynamics. An oscillatory unstable case is illustrated as Case 3 in Fig. 1. Urgent control measures must be implemented to address this oscillatory instability. These measures include the addition of a power system stabilizer (PSS) [35] or a resilient wide-area damping controller to flexible AC transmission system (FACTS) equipment [36], as well as the use of a voltage-source converter high-voltage DC (VSC-HVDC) link [37] to suppress oscillations.

Power systems exhibit different destabilization modes under distinct disturbances. The factors leading to these two destabilization modes are different; therefore, the required control measures also differ. For this reason, in this study, DL techniques were employed for TSA to build a model that not only predicts stability in advance but also evaluates the specific destabilization mode of the system, acting as a foundation for emergency control measures. Following this approach, the instability of power systems can be better addressed, and the impact of potential failures can be reduced.

      2 Structures of ResGRU

      2.1 Overview of BiGRU structure

With its reset gate and distinct update gate, the GRU realizes forgetting and memory functions for the long-term properties of the input data. In contrast to LSTM networks and RNNs, a GRU can alleviate the problem of gradient explosion, has a more straightforward structure, and is faster and more capable of learning. Fig. 2 depicts the structure of a GRU.

Fig. 2 shows that the inputs of the GRU model at time step t are the hidden state ht-1 from the preceding time step and the current feature vector xt. The calculation process is expressed as follows:

      where Wu, Wr, and Wh are the weight matrices of the GRU.

Fig. 2 Structure of a GRU.

A BiGRU [38] constructs a bidirectional recurrent neural network by stacking two GRU layers that process the input sequence in forward and reverse order. The output is a combination of the outputs of the two GRU layers. It considers not only the information up to the current moment but also the information after the current moment, so it has a stronger memory capacity and handles transient time-series signals with long-range temporal correlations better.

Equations (1)-(4) can be used to compute the new hidden state ht. Given that the same set of parameters is used at each time step, only one set of parameters, Wu, Wr, and Wh, must be optimized. This reduces the complexity of GRU training because the number of parameters is independent of the sequence length.
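For concreteness, the following minimal NumPy sketch implements one GRU step in the standard form implied by the description above (update gate, reset gate, candidate state, and hidden-state blending). The bias-free weight shapes and the exact gating convention are assumptions, since Eqs. (1)-(4) are not reproduced here:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_u, W_r, W_h):
    """One GRU time step: update gate, reset gate, candidate state, new state."""
    z = np.concatenate([h_prev, x_t])                              # [h_{t-1}, x_t]
    u_t = sigmoid(W_u @ z)                                         # update gate
    r_t = sigmoid(W_r @ z)                                         # reset gate
    h_cand = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))    # candidate state
    return (1.0 - u_t) * h_prev + u_t * h_cand                     # new hidden state

# Toy dimensions (illustrative only): 4 input features, 8 hidden units
rng = np.random.default_rng(0)
n_in, n_h = 4, 8
W_u, W_r, W_h = (rng.normal(size=(n_h, n_h + n_in)) for _ in range(3))

h = np.zeros(n_h)
for x_t in rng.normal(size=(10, n_in)):   # short forward pass over a sequence
    h = gru_step(x_t, h, W_u, W_r, W_h)

A BiGRU simply runs this recursion once over the sequence in forward order and once over the reversed sequence, then concatenates the two hidden-state streams.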

      2.2 Overview of ResNet structure

      Deep residual networks (ResNet) are an evolved form of CNNs.Unlike traditional convolutional networks, ResNet establishes a residual block (Resblock) by integrating parallel short-circuit paths alongside the convolutional layers.This allows ResNet to converge quickly and eliminates gradient explosion and vanishing issues in deep networks.

The ResNet branch of ResGRU uses the ResNet18 structure with eight residual blocks. The 1st, 2nd, 4th, 6th, and 8th residual blocks are constant-mapping (identity) structures, in which the identity of the input is summed through the short path with the batch-normalized output of the convolutional layers to obtain the output feature map of the residual module, as shown in Fig. 3(a). The 3rd, 5th, and 7th residual blocks are downsampling structures, in which the residual module applies a 1×1 convolution on the short path for downsampling so that the feature maps of the residual path and the short path have the same dimensions, as shown in Fig. 3(b).
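As a sketch, the two block types might be written as follows in Keras (the framework used in Section 4); the filter counts, strides, and batch-normalization placement follow common ResNet18 practice and are assumptions rather than the authors' exact configuration:

import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters, downsample=False):
    """Identity-mapping block (Fig. 3(a)) or downsampling block (Fig. 3(b)).
    The downsampling variant uses a strided 1x1 convolution on the shortcut so
    that both paths keep the same feature-map dimensions."""
    stride = 2 if downsample else 1
    shortcut = x
    y = layers.Conv2D(filters, 3, strides=stride, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, strides=1, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if downsample:
        shortcut = layers.Conv2D(filters, 1, strides=stride, padding="same")(x)
        shortcut = layers.BatchNormalization()(shortcut)
    return layers.ReLU()(layers.Add()([y, shortcut]))

inputs = tf.keras.Input(shape=(128, 128, 3))             # 39-bus heatmap size (Table A1)
x = layers.Conv2D(64, 7, strides=2, padding="same", activation="relu")(inputs)
x = residual_block(x, 64)                                 # constant-mapping block
x = residual_block(x, 128, downsample=True)               # downsampling block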

      2.3 Multiple attention mechanisms

Fig. 3 Residual block structure.

The attention mechanism simulates how humans attend to and select information. When processing time-series data, the model assigns different weights to different parts of the input and can pay more attention to the parts that matter most for the task. Integrating attention into the BiGRU layer effectively alleviates the gradient vanishing and gradient explosion caused by the difficulty of learning dependencies over long time series.

The soft attention mechanism is often selected in DL to obtain a more purposeful representation by computing attention weights and averaging the inputs in a weighted manner. Assume a set of transient fault data d1, d2, …, dk and a query vector q. The correlation between each input di and q is computed by the scoring function s(di, q). The correlation scores are then normalized by the Softmax function to obtain the attention distribution G = [g1, g2, g3, …, gk] corresponding to the transient fault data. Finally, the output A is obtained by the weighted summation of the input data with the attention distribution:

gi = exp(s(di, q)) / Σj exp(s(dj, q)),  A = Σi gi di

Fig. 4 Multi-head attention structure.

The multi-head attention mechanism [39] is an extended form of the attention mechanism that captures features at different levels and from different perspectives simultaneously by introducing multiple independent attention heads, improving the expressive and generalization abilities of the model. Its structure is shown in Fig. 4.
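The sketch below shows how such a layer can be applied to the BiGRU outputs using the built-in Keras multi-head attention layer; the head count and key dimension are assumptions, as the paper does not report these hyperparameters:

import tensorflow as tf
from tensorflow.keras import layers

# Toy batch: 8 samples, 122 time steps, 512 BiGRU features (2 x 256 units)
seq = tf.random.normal((8, 122, 512))

# Self-attention over the BiGRU outputs
mha = layers.MultiHeadAttention(num_heads=4, key_dim=64)
attended = mha(query=seq, value=seq, key=seq)    # shape (8, 122, 512)

# A pooled temporal summary can then be passed to the classification layers
pooled = tf.reduce_mean(attended, axis=1)        # shape (8, 512)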

      2.4 Overview of ResGRU structure

The ResGRU structure, shown in Fig. 5, is an innovative two-branch network consisting of two GRU layers and eight residual blocks. The characteristics of the raw input data can be comprehensively captured from both macro long-term and local short-term perspectives. After the input layer receives the data, the raw input is divided into two groups. One group of raw time-series data flows into the GRU branch, whereas the other group is transformed into a square heatmap after data processing and then enters the ResNet branch.

Notably, ResNet is sensitive to higher-order information, including voltage magnitude and phase angle. In contrast, the GRU network possesses an outstanding macro long-term memory function. By recalling previous inputs and transferring information, a GRU can capture hidden abstract features in active and reactive power. Integrating the GRU with the ResNet network enables their functionalities to complement each other effectively. Finally, the representative information from the two branches is fused using a fully connected layer. The SGDR optimizer is employed to train the model, and the cross-entropy function is selected as the loss function. More accurate classification results are obtained by continuously adjusting the model parameters:

Fig. 5 Construction of ResGRU.

L = -(1/N) Σ_(i=1)^N [ y_i,1 ln(ŷ_i,1) + y_i,2 ln(ŷ_i,2) + y_i,3 ln(ŷ_i,3) ]

where N stands for the number of samples; y_i,1, y_i,2, and y_i,3 represent the true labels of sample i; and ŷ_i,1, ŷ_i,2, and ŷ_i,3 represent the predicted probabilities of the oscillatory unstable, aperiodic unstable, and stable modes, respectively.
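A compact Keras sketch of this two-branch arrangement is given below; the residual blocks are abbreviated to a single convolution plus pooling, and the head count, momentum, and pooling choices are assumptions made only to keep the sketch self-contained. The input shapes follow the 39-bus configuration listed in Table A1 (122 time steps, 262 features, 128×128 heatmaps):

import tensorflow as tf
from tensorflow.keras import layers, Model

# ResNet branch input: the square heatmap image
img_in = tf.keras.Input(shape=(128, 128, 3), name="heatmap")
x = layers.Conv2D(64, 7, strides=2, padding="same", activation="relu")(img_in)
x = layers.GlobalAveragePooling2D()(x)            # stands in for the 8 residual blocks

# BiGRU-Attention branch input: the raw time-series measurements
seq_in = tf.keras.Input(shape=(122, 262), name="timeseries")
y = layers.Bidirectional(layers.GRU(256, return_sequences=True))(seq_in)
y = layers.MultiHeadAttention(num_heads=4, key_dim=64)(y, y)
y = layers.GlobalAveragePooling1D()(y)

# Fuse the two branches and classify the three transient states
z = layers.Concatenate()([x, y])
z = layers.Dense(128, activation="relu")(z)
out = layers.Dense(3, activation="softmax")(z)

model = Model([img_in, seq_in], out)
model.compile(optimizer=tf.keras.optimizers.SGD(0.05, momentum=0.9),
              loss="categorical_crossentropy", metrics=["accuracy"])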

      3 TSA based on ResGRU

      3.1 Model overview

Fig. 6 depicts the flowchart of the ResGRU-based TSA model, which is divided into dataset generation, offline training, and online application stages. The dataset required to train the model is obtained during the dataset generation stage by time-domain simulation of the power system in a specific environment. During the offline training stage, the gathered dataset is separated into training and test sets. The model parameters are fitted and adjusted using the training dataset, and the test dataset is used to assess the TSA performance of the model and make additional parameter adjustments. In the online application phase, the trained model evaluates the collected measurement data.

      3.2 Model input and preprocessing

The evaluation performance of the TSA model is directly affected by the choice and design of the input features. To minimize delay and error generation, input features with high timeliness and clear physical meaning should be selected, avoiding physical quantities that must be obtained through indirect calculation. In this study, the model input comprises four features: the reactive and active power of the transmission lines, the bus voltage magnitude, and the bus phase angle, expressed as follows:

      where m is the network node count, t is the number of sample points, and n is the transmission line count.

      3.3 Sample annotation

      The transient steady state of the system, which can be categorized as oscillatory, aperiodic, or stable, is represented by the output of the model.According to the definition of the joint IEEE/CIGRE working group[40], transient stability is a state in which the rotor angle oscillations are suppressed, the relative motions of the generators are gradually reduced, and the system moves to a new steady state without losing synchronization.Aperiodic instability is characterized by an increase in the rotor angle without oscillations and a loss of synchronization during the first two oscillations.The system is stable throughout the first two cycles of oscillatory instability prior to becoming unstable owing to undamped oscillations that steadily increase in amplitude.Each sample was labeled during the offline simulation phase based on the pattern of the rotor angle profile observed throughout the simulation.The labels for the samples were as follows: 0 for stable samples, 1 for aperiodic unstable samples, and 2 for oscillatory unstable samples.
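The labeling rule itself is not spelled out in the paper, so the following sketch only illustrates one plausible heuristic consistent with the descriptions above; the 180° separation threshold and the early/late split are assumptions:

import numpy as np

def label_sample(rel_angles, threshold_deg=180.0, early_window=0.3):
    """Heuristic labelling of one simulation run from relative rotor angles.

    rel_angles: array (T, n_pairs) of generator rotor angles relative to a
    reference machine, in degrees. Returns 0 (stable), 1 (aperiodic unstable),
    or 2 (oscillatory unstable)."""
    max_sep = np.max(np.abs(rel_angles), axis=1)       # largest angular separation per step
    if np.all(max_sep < threshold_deg):
        return 0                                        # stable: separation stays bounded
    first_cross = np.argmax(max_sep >= threshold_deg)   # first loss of synchronism
    if first_cross < early_window * len(max_sep):
        return 1                                        # early, monotonic divergence -> aperiodic
    return 2                                            # late, growing swings -> oscillatory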

Fig. 6 Flowchart of the TSA method using ResGRU.

      3.4 SGDR optimizer

      The learning rate of SGDR is related to the training epoch as follows:

lr(t) = f(mod(t, T/M)),  with f(x) = (lr0/2)·(cos(πMx/T) + 1)

where t is the current training epoch, T is the total number of training epochs, and M is the number of cycles in the training process; f is the shifted cosine function proposed in [32], and lr0 denotes the initial learning rate. Within each cycle, the training starts at a high learning rate and is annealed to a lower learning rate. The high learning rate lr = f(0) endows the model with sufficient energy to escape a critical point; conversely, the low learning rate lr = f(T/M) drives the model into a suitable local minimum [41]. The blue line in Fig. 7 shows an example of this learning-rate schedule, whereas the red line represents ordinary Adam, which does not restart.

Fig. 7 shows that, within one cycle, the learning rate is annealed from the initial value lr0 to f(T/M) ≈ 0 by this function. Fig. 8 illustrates how Adam and SGDR evolve throughout the training phase.
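In code, this warm-restart cosine schedule can be expressed either directly or with the built-in Keras schedule, as sketched below; the number of cycles (5) and the steps per epoch are assumptions, while the initial learning rate of 0.05 and the 200 epochs match the training setup reported in Section 4:

import numpy as np
import tensorflow as tf

# Warm-restart cosine schedule: lr anneals from lr_0 toward 0 inside each of the
# M cycles, then jumps back to lr_0 at the next restart (cf. Fig. 7, blue curve).
def sgdr_lr(epoch, lr_0=0.05, total_epochs=200, cycles=5):
    cycle_len = total_epochs // cycles
    t_cur = epoch % cycle_len
    return 0.5 * lr_0 * (1.0 + np.cos(np.pi * t_cur / cycle_len))

# Equivalent built-in schedule (per training step rather than per epoch);
# steps_per_epoch is an illustrative assumption.
steps_per_epoch = 100
schedule = tf.keras.optimizers.schedules.CosineDecayRestarts(
    initial_learning_rate=0.05,
    first_decay_steps=40 * steps_per_epoch,   # one cycle of 40 epochs
    t_mul=1.0, m_mul=1.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)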

As shown in Fig. 8(a), Adam converges to the nearest local minimum B, at which the loss function of the training set is Ltrain-B; point B is a sharp minimum. Even small parameter variations lead to large loss variations, so the loss on the test set is greater than the loss on the training set, i.e., Ltest-B > Ltrain-B, which in turn leads to poorer generalizability of the model. Therefore, the sharp point B is not a suitable local minimum.

Fig. 7 Learning rate schedules of SGDR and Adam.

Fig. 8 Comparison of the generalization ability of SGD with diverse learning rate schedules.

As shown in Fig. 8(b), after rapidly converging to local minimum B, the model jumps from B to position C; this process is called a “restart.” The learning rate then decreases in the next cycle in accordance with the cosine function until the model converges to the local optimum D, which is a flat minimum. The next “restart” is unlikely to cause the model to leave this stable local minimum because the loss function at D is stable and located in a “wider” and “flatter” region. At point D, even if the dataset changes, data are missing, or the noise is high, the loss remains relatively stable, and Ltest-D is approximately equal to Ltrain-D. Therefore, this flatter region indicates that the network generalizes better to new data. This improved generalizability makes the model more adaptable to different scenarios, with excellent accuracy and robustness against noise and missing input information.

      3.5 Model evaluation index

In this section, the evaluation metrics for the multi-class model are discussed. In this paper, the stable mode is denoted as State I, aperiodic instability as State II, and oscillatory instability as State III. The confusion matrix is the primary tool used to assess classification errors. The confusion matrix for the multi-class problem, as defined in [42], is shown in Table 1.

Based on the information provided in Table 1, TP, TN, FP, and FN are defined in Table 2.

      Table 1 Confusion matrix.

Real label    Predicted label
              State I    State II    State III
State I       Pss        Psa         Pso
State II      Pas        Paa         Pao
State III     Pos        Poa         Poo

      Table 2 Definitions of TP, TN, FP, and FN.

Label        TP         FN
State I      Pss        Psa+Pso
State II     Paa        Pas+Pao
State III    Poo        Pos+Poa

Label        FP         TN
State I      Pas+Pos    Paa+Pao+Poa+Poo
State II     Psa+Poa    Pss+Pso+Pos+Poo
State III    Pso+Pao    Pss+Psa+Pas+Paa

      In this study, we considered four performance indices to evaluate TSA.The corresponding equations for these indices are as follows:

Equation (11) defines the accuracy (Acc), which is the percentage of correctly predicted samples out of all samples and indicates the ability of the model to categorize the data. A higher accuracy allows the model to better avoid misclassifications.

Equation (12) defines the missed alarm rate (MAR), which is the proportion of actually unstable samples that the model incorrectly judges as stable, i.e., an instability caused by a fault for which no alarm is raised because it is predicted to be stable. The higher the MAR, the weaker the ability of the model to recognize faults and the more unstable samples are recognized as stable, which can cause serious accidents in the power system. Therefore, a lower missed alarm rate helps maintain power grid stability.

Equation (13) defines the false alarm rate (FAR), which represents the proportion of truly stable samples that the model identifies as unstable. The higher the FAR, the more the model is biased toward predicting truly stable samples as faults, which can lead to frequent false alarms and to unnecessary outages or maintenance measures by grid staff, harming the economic efficiency of the grid. Therefore, a lower false alarm rate reduces unnecessary interventions and wasted resources.

Equation (14) defines F1, which reflects the balance of the model between false and missed alarms and is a comprehensive assessment of its overall performance; the higher the F1 value, the better the model performs in identifying fault samples, which is crucial for stable operation and efficient management of the power grid.
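Since Eqs. (11)-(14) are not reproduced above, the sketch below computes the four indices per class in their common one-vs-rest form from a 3×3 confusion matrix laid out as in Table 1; it follows the verbal definitions of MAR and FAR given above and the TP/FN/FP/TN assignments of Table 2, and the numerical matrix is purely illustrative:

import numpy as np

def per_class_metrics(cm):
    """Acc, MAR, FAR, and F1 for each state from a 3x3 confusion matrix
    (rows = real labels, columns = predicted labels, as in Table 1)."""
    total = cm.sum()
    results = {}
    for k, name in enumerate(["State I", "State II", "State III"]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp          # target class predicted as something else
        fp = cm[:, k].sum() - tp          # other classes predicted as the target
        tn = total - tp - fn - fp
        acc = (tp + tn) / total
        mar = fn / (tp + fn) if (tp + fn) else 0.0   # missed alarms
        far = fp / (fp + tn) if (fp + tn) else 0.0   # false alarms
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        results[name] = dict(Acc=acc, MAR=mar, FAR=far, F1=f1)
    return results

cm = np.array([[1130, 8, 7],
               [6, 684, 4],
               [1, 2, 406]])              # illustrative counts only
print(per_class_metrics(cm))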

      4 Simulation verification

In this study, the IEEE 145-bus and IEEE 39-bus systems were employed to assess the effectiveness of the proposed TSA model. TensorFlow 2.6.0 was used to develop the DL models on a machine with an Intel(R) Core(TM) i7-12700H 2.30 GHz CPU and 16 GB RAM.

Four commonly used TSA models were selected as controls for comparative analysis with ResGRU: SVM, GRU, ResNet, and Transformer. The ResGRU model is based on a 3×3 convolution kernel. The SVM was built on the Scikit-learn platform using a radial basis function kernel and a penalty coefficient of 110. The BiGRU comprises a sigmoid output layer, two fully connected layers, and two BiGRU layers; each layer has 256 neurons, and the number of iterations was set to 200. ResNet uses the classic ResNet18 structure. The Transformer contains two Transformer submodules. The batch size of all models was 64, and the SGDR optimization algorithm was used to adjust the learning rate. The deep models used a maximum of 200 epochs, an initial learning rate of 0.05, and the cross-entropy loss function. The ResGRU structure is presented in Table A1 of the Appendix.

      4.1 Dataset acquisition

A large batch of automatic transient simulations was executed on the IEEE 145-bus and 39-bus systems using the Python API supplied by PSS/E. The TDS parameter settings for the IEEE 145-bus and 39-bus systems in PSS/E are presented in Table A2 of the Appendix. The IEEE 39-bus system contains 10 generators and 34 non-transformer branches; the IEEE 145-bus system contains 50 generators and 403 non-transformer branches. In the simulations, faults were set on non-transformer branches. Except for the different faulted-line settings, the load level, fault location, fault duration, and fault type of the IEEE 145-bus system were the same as those of the IEEE 39-bus system.

Three scenarios were established for the different simulation experiments. Dataset A, obtained from the failure scenarios in Table A2 of the Appendix, contains two parts, A1 and A2, which are the simulated fault data of the IEEE 39-bus and IEEE 145-bus systems, respectively. The fault occurred 1 s after the simulation started, and each simulation lasted 10 s. Dataset A1 contains 10,200 samples, of which 5727, 3450, and 2048 are stable, aperiodic unstable, and oscillatory unstable samples, respectively, at a ratio of 2.80:1.68:1. Similarly, Dataset A2 contains 120,900 samples at a ratio of 7.24:1.46:1. In this study, the training, validation, and test sets were divided at a ratio of 6:2:2. Dataset A was used for the results presented in Sections 4.2, 4.3, and 4.8.

Dataset B results from modifying Dataset A1 by replacing generators 30, 32, 33, 34, 35, 36, and 37 of the IEEE 39-bus system with wind turbines; the remaining settings are the same as those of Dataset A1. Dataset B was used for the results presented in Section 4.5.

Dataset C was divided into four groups obtained by adding 0, 20, 30, and 40 dB of Gaussian white noise to Dataset A1. Dataset C was used for the results presented in Sections 4.7 and 4.8.
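As an illustration of how such noisy variants can be produced, the sketch below adds Gaussian white noise at a prescribed signal-to-noise ratio; whether the paper injects the noise per channel or over the whole sample is not stated, so the global-power version here is an assumption:

import numpy as np

def add_white_noise(signal, snr_db, rng=None):
    """Add Gaussian white noise at a given signal-to-noise ratio in dB
    (used here to mimic the 20/30/40 dB variants of Dataset C)."""
    rng = rng or np.random.default_rng()
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise

# Example: corrupt one sample (122 time steps x 262 features) at 20 dB
clean = np.random.default_rng(1).normal(size=(122, 262))
noisy = add_white_noise(clean, snr_db=20)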

Fig. 9 Accuracy and training loss curves for both considered systems.

Fig. 10 Confusion matrices of the test results for the two considered systems.

      4.2 Model training and testing results

Figs. 9(a) and 9(b) depict the accuracies of the model for the IEEE 39-bus and 145-bus systems, respectively. Figs. 9(c) and 9(d) depict the training losses of the model for the IEEE 39-bus and 145-bus systems, respectively.

Subsequently, using the test data, we predicted the stability states online to obtain the confusion matrices shown in Fig. 10. The columns of the confusion matrix represent the categories predicted by the model, whereas the rows represent the real stability categories. The numbers on the main diagonal represent the numbers of correctly predicted samples in the test set. These figures can be used to determine the prediction accuracy.

      4.3 Classification performance comparison

This section reports the tests conducted to determine the classification ability of the five models. The training and testing datasets of each control model were derived from Dataset A. Tables 3 and 4 present the TSA outcomes for both systems. For the IEEE 39-bus system, the Acc values of ResGRU for the three types of samples were 99.26%, 99.12%, and 99.85%, and the MAR values were 1.71%, 0.22%, and 0%, respectively. For the IEEE 145-bus system, the ResGRU prediction Acc values for the three states were all greater than 99%, at 99.47%, 99.25%, and 99.55%, respectively, and the MAR values were 0.76%, 0.47%, and 0.21%, respectively. Additionally, the FAR indicator was the lowest. In a power system, a missed fault alarm is dangerous and can cause serious system failures, whereas false fault alarms waste intervention resources; therefore, the MAR and FAR values of the model should be as low as possible. The results show that ResGRU can detect faults in a timely manner, has the highest accuracy and the lowest error rates, and exhibits excellent TSA performance.

      Table 3 TSA properties on IEEE 39-bus systems.

      Table 4 TSA performance on IEEE 145-bus systems.

Models         Status    Acc/%    MAR/%    FAR/%    F1/%
SVM            I         94.31    4.89     6.51     94.29
               II        94.92    3.66     3.68     96.33
               III       95.75    3.15     2.1      97.37
GRU            I         96.18    2.92     4.72     96.17
               II        96.66    2.42     2.4      97.59
               III       97.06    2.37     1.26     98.18
ResNet         I         98.63    1.19     1.58     98.61
               II        98.25    1.45     1.09     98.73
               III       99.03    0.56     0.63     99.40
Transformer    I         99.10    1.08     0.75     99.09
               II        98.64    1        0.97     99.02
               III       99.19    0.38     0.61     99.51
ResGRU         I         99.47    0.76     0.32     99.46
               II        99.25    0.47     0.62     99.38
               III       99.55    0.21     0.35     99.72

Fig. 11 Accuracy of different models on the two considered systems.

Fig. 11 shows the evaluation results of each model for the IEEE 145-bus and 39-bus systems. ResGRU achieves the highest recognition accuracy for all classified states. The Acc value of the Transformer is close to that of ResGRU, but its accuracy in recognizing oscillatory unstable states in the IEEE 39-bus system is much lower than the 99.85% accuracy exhibited by ResGRU. The Acc values of ResGRU for the first and third states are the highest among all the models.

These findings demonstrate that ResGRU can accurately identify most transient instability events. Among all the evaluated models, the proposed one exhibits the best evaluation metrics, and its advantages are more evident in larger systems. This is because ResGRU can recognize subtle differences between potential failure modes and normal modes from deeper features using the deep residual network structure, thus reducing false alarms and the false alarm rate. Meanwhile, BiGRU-Attention can comprehensively capture the contextual information of the time-series data, effectively reducing missed alarms caused by insufficient information at critical time points.

      4.4 Failure scenario analysis

In this section, the fault scenarios identified by ResGRU are analyzed, and three typical scenarios are selected in which the fault is applied at 1 s and the fault duration is set to 25/60 s. The individual physical and image features are shown in Fig. 12.

Fig. 12(a) shows the physical characteristic curves and images of a stable sample. The voltage phase angle, voltage magnitude, active power, and reactive power of this sample recover to relatively stable values shortly after the fault; in particular, the voltage magnitudes all stabilize at 1.0 p.u. after the fault. The images of the stable samples have clear green characteristics, and the color scheme of the picture is relatively stable. Table 3 shows that the recognition accuracy of ResGRU for stable samples is 99.26%.

Fig. 12(b) shows the physical characteristic curves and images of the aperiodic instability samples. Note that the indices do not return to stable values after the fault and fluctuate sharply; in particular, the voltage phase angle spreads after the fault. The per-unit voltage fluctuates around 0.75 p.u., which is below the nominal value. The active and reactive power also fluctuate violently, with small amplitudes and high frequencies. The image of the sample exhibits a distinctive red and stray color. Table 3 shows that ResGRU recognizes such samples with 99.12% accuracy.

Fig. 12(c) shows the physical characteristic curves and images of the oscillatory instability samples. Note that the indices do not return to relatively stable values even after the fault and slowly fluctuate within a relatively large range. The voltage phase angle changes the most after the fault; the voltage magnitude fluctuates around 0.75 p.u.; and the active and reactive power also fluctuate with large amplitude and frequency. The sample image has a distinct orange color. Table 3 shows that the recognition accuracy of ResGRU for such samples is 99.85%.

ResGRU combines the advantages of ResNet, which excels at extracting spatial features from images, and Attention-BiGRU, which has excellent temporal data-processing capabilities. By combining the two, ResGRU can capture both spatially and temporally complex features from the multi-dimensional data of fault samples. In particular, when dealing with time-series features such as the voltage phase angle and voltage magnitude, the memory capability of BiGRU can identify the dynamic changes of the sample after the fault while simultaneously considering significant fluctuations and abnormal behaviors of individual physical quantities. ResNet further enhances the ability of the model to recognize complex fault features by analyzing the images transformed from the data. The residual design of ResNet makes the model more sensitive to small changes and fluctuations when dealing with complex instability scenarios and effectively captures the features of the unstable samples.

Other models, such as Transformer and ResNet, lack temporal feature-processing ability, and GRU lacks spatial feature-extraction ability, which limits their feature extraction under complex instability scenarios.

      4.5 Model generalizability test

To verify the effect of SGDR on the generalization ability of the model, the ResGRU model was trained and tested using the Adam and SGDR optimization algorithms on Datasets A1 and B. The results are presented in Fig. 13. Comparing Figs. 13(a) and 13(b), the accuracy and F1 values of the model trained on Dataset B are smaller than those of the model trained on Dataset A. Moreover, the assessment capability of the model trained with Adam decreases more than that of the model trained with SGDR, with the largest decrease, more than 4%, in the F1 metric for Adam. Meanwhile, as shown in Figs. 13(c) and 13(d), MAR and FAR also increase, but to a much smaller extent for SGDR. In summary, the performance of the model in evaluating transient stability samples decreases as the penetration of new energy increases. However, the metrics of SGDR are still better than those of Adam, and its performance on the new dataset is more stable, indicating better generalization ability.

Fig. 12 Characteristics of the three types of samples selected.

      4.6 Visual analysis of the classification ability of ResGRU

For visualization, t-distributed stochastic neighbor embedding (t-SNE), a non-linear dimensionality reduction approach, embeds high-dimensional data in a low-dimensional space of two or three dimensions. It is based on the premise that conditional probabilities can express the similarity between two samples by transforming the Euclidean distances of the high-dimensional data. Using t-SNE [43] as a visualization tool demonstrates how the sample processing enhances the interpretability of the TSA performance of the model. In this study, t-SNE was employed to embed the high-dimensional data in a 2D or 3D low-dimensional space. After applying t-SNE to the TSA problem, the gap between samples of the same class decreases, and the gap between samples of different classes increases. To showcase the feature extraction capability of ResGRU, the output of each layer of the network was visualized using t-SNE. The results are shown in Fig. 14.
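A typical way to produce such an embedding is the scikit-learn implementation sketched below; the perplexity and initialization are assumptions, as the paper does not report its t-SNE settings:

import numpy as np
from sklearn.manifold import TSNE

# features: layer activations for the test samples; labels would be the 0/1/2
# stability classes used to color the scatter plot.
features = np.random.default_rng(2).normal(size=(500, 256))   # placeholder activations
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(features)                        # (500, 2) for plotting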

Fig. 13 Comparison of results for the four indicators and two optimization strategies.

Fig. 14 Visual representation of different layer features.

Fig. 14 shows the results of feature visualization after the dimensionality of the original data was reduced using t-SNE. Fig. 14(a) shows the three types of samples mixed in the original feature space. After ResGRU processing, however, the samples of different categories tend to aggregate in the feature space and can be separated almost linearly, as shown in Figs. 14(b) and 14(c). Finally, the features learned by ResNet and BiGRU were combined. Fig. 14(d) demonstrates that ResGRU can effectively distinguish the three classes of samples, has a strong feature extraction ability, and enables effective TSA.

      4.7 Model robustness to noise

This study is based on the premise that PMUs in power systems can perform precise high-frequency sampling of variables including the phase angle, voltage magnitude, reactive power, and active power. However, noise disturbances and sampling errors may affect PMUs and WAMS. In this stage, the model was trained using Dataset C, which contains additional noise; the remaining settings were the same as in the earlier tests. The results of the tests in noisy environments are shown in Fig. 15.

Fig. 15 Trends of the four measurement indices under different noise levels.

As shown in Fig. 15, the ResGRU-based model (trained using SGDR) achieves over 97.5% Acc and 96.5% F1-score, with no more than 4% MAR and FAR, under 20 dB noise, which is better than the other methods. Compared with the other approaches and with the raw data, the decline in each index is minor. This is because, in ResGRU, the attention mechanism can dynamically assign different weights to the input features, thus highlighting important information and suppressing irrelevant or noisy information in the time-series data. By selectively focusing on the critical parts of the time series, the attention mechanism effectively reduces the interference of noise in the model prediction. Meanwhile, BiGRU extracts features from past and future information separately when processing time-series data; this bidirectional processing enables the model to comprehensively recognize fluctuations in the time series, further enhancing its anti-interference ability against noise. Therefore, ResGRU-based TSA exhibits excellent robustness against PMU noise.

      4.8 Model robustness to PMU loss

Owing to the high cost of PMUs, real power systems are typically configured with PMUs only at grid hub nodes or electrically weak nodes. To ensure that the proposed ResGRU performs well with a realistic PMU setup, two typical scenarios were considered for comparison. The first scenario is an incomplete PMU configuration, in which 10% of the feature channels are randomly zeroed in the simulation. The second scenario is data packet loss, in which every feature loses its measurement data with a 10% probability at any moment. To reduce the volatility of the randomly generated datasets, the experiment was repeated 200 times for each scenario, and the expected value was obtained by averaging. Table 5 presents the experimental results; the values listed are the averages of the three states for each metric.
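The two corruption scenarios can be simulated on a batch of samples as sketched below; representing lost measurements by zeros follows the zeroing described above, while the array shapes are assumptions for illustration:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 122, 262))   # batch of samples: time steps x feature channels

# Scenario 1: incomplete PMU configuration - zero out a random 10% of feature
# channels for every sample (channel choice fixed per sample).
def drop_channels(X, ratio=0.1, rng=rng):
    Xc = X.copy()
    n_drop = int(ratio * X.shape[2])
    for i in range(X.shape[0]):
        idx = rng.choice(X.shape[2], size=n_drop, replace=False)
        Xc[i, :, idx] = 0.0
    return Xc

# Scenario 2: packet loss - each measurement is lost (zeroed) independently
# with 10% probability at any time step.
def packet_loss(X, p=0.1, rng=rng):
    mask = rng.random(X.shape) >= p
    return X * mask

X_incomplete = drop_channels(X)
X_lossy = packet_loss(X)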

The comparison reveals that the shallow ML method does not perform as well as the DL methods, and the models show different degrees of adaptation to the two scenarios. The incomplete PMU configuration scenario had the greatest impact on the SVM, with an accuracy drop of 5.98%, whereas the impact on the Transformer and ResGRU was the smallest. This is because the attention mechanism can flexibly adjust its focus on different features in the case of data loss, prioritizing the still available and complete input data. Simultaneously, BiGRU processes the time-series data in both directions, enabling the model to use pre- and post-contextual information to compensate for parts that may be lost in the middle. In addition, ResNet was originally designed to retain key information through residual connections, which better preserves the existing features in the face of missing data and mitigates the impact of data loss on model performance through deep feature extraction.

Table 5 TSA results for the IEEE 39-bus system for several models under different measurement scenarios.

Scenario                        Model          Acc/%    MAR/%    FAR/%    F1/%
Result of raw data averaging    SVM            95.13    4.25     4.07     95.88
                                GRU            96.83    2.64     2.68     97.34
                                ResNet         98.86    0.95     0.91     99.07
                                Transformer    99.18    0.87     0.59     99.27
                                ResGRU         99.41    0.64     0.56     99.47
PMU incomplete configuration    SVM            89.15    11.55    11.37    88.54
                                GRU            92.45    9.68     8.74     90.79
                                ResNet         95.63    5.87     5.02     94.55
                                Transformer    96.73    3.08     2.3      96.92
                                ResGRU         98.54    1.62     1.59     98.35
Packet loss                     SVM            90.23    10.45    9.96     89.79
                                GRU            95.78    4.9      5.79     94.65
                                ResNet         96.77    3.74     4.02     96.12
                                Transformer    97.53    2.84     2.73     99.24
                                ResGRU         98.9     1.51     1.77     98.38

      4.9 Assessment time and network structure sensitivity analysis

Power systems exhibit strong time variability, and the transient process changes rapidly; therefore, the online TSA time of each model must be compared. Simultaneously, to verify the impact of different model structures on the evaluation performance, a sensitivity analysis of different ResGRU structures is necessary. Therefore, based on Dataset A1 for the IEEE 39-bus system, serial ResNet-GRU and parallel ResNet-GRU were selected for comparison with ResGRU, SVM, GRU, ResNet, and Transformer. The experiments were repeated 20 times, and the results were statistically averaged; they are presented in Table 6.

Table 6 shows that all models meet the time-scale requirements of online TSA. SVM has the lowest calculation speed and the lowest Acc, whereas the Transformer has high computational complexity, which means that its calculation speed is slow but its Acc is high. GRU and ResNet are faster than serial ResNet-GRU and parallel ResNet-GRU, but their Acc values are lower. Owing to its complex serial structure, the calculation speed of serial ResNet-GRU is much slower than that of parallel ResNet-GRU, although their accuracies are similar. ResGRU is also a parallel structure that integrates a multi-head attention mechanism; its calculation speed lies between those of serial ResNet-GRU and parallel ResNet-GRU, but its accuracy is the highest. In general, ResGRU exhibits better TSA performance on the IEEE 39-bus system at the expense of a slightly lower calculation speed.

      Table 6 Comparison of assessment times.

Model                   Time/ms    Accuracy/%
SVM                     78         95.03
GRU                     23         96.59
ResNet                  17         98.65
Transformer             56         98.74
Serial ResNet-GRU       49         99.16
Parallel ResNet-GRU     25         99.23
ResGRU                  30         99.46

      5 Conclusions

This study proposes a TSA approach based on BiGRU-Attention and ResNet that also accounts for the oscillatory instability caused by inadequate damping torque. The three states of oscillatory instability, aperiodic instability, and transient stability can be predicted by the proposed model with high speed and accuracy. Using its predictions, the operator can immediately implement appropriate emergency control measures, thereby ensuring the safety and stability of the system. The SGDR algorithm was used for training to increase the generalizability of the model. The following results were obtained from simulations of the IEEE 145-bus and 39-bus systems.

First, ResGRU can fully capture key information during failures and establish a mapping between the original input and the system stability label, with excellent TSA performance. Second, experiments in a scenario where the system is configured with a large number of wind turbines show that, for power systems with a high proportion of renewable energy sources, the proposed model retains high TSA accuracy. Meanwhile, the t-SNE visualization of three different layers indicates that ResGRU has excellent feature extraction capability. Furthermore, noise experiments demonstrate that the model is robust to PMU noise and PMU data loss. Finally, evaluation-speed experiments demonstrate that the model enables fast evaluation with the required accuracy.

Subsequent studies will investigate the effects of renewable energy generation and topology changes on TSA, aiming to enhance its stability under complex grid disturbances and to offer a more dependable basis for emergency control measures following faults.

      CRediT authorship contribution statement

Shan Cheng: Supervision, Software, Resources, Project administration, Investigation, Funding acquisition, Formal analysis, Conceptualization. Qiping Xu: Writing - review & editing, Writing - original draft, Visualization, Validation, Supervision, Software, Methodology, Data curation. Haidong Wang: Validation, Supervision, Software, Resources, Data curation. Zihao Yu: Supervision, Software, Resources, Project administration, Methodology. Rui Wang: Validation, Supervision, Software, Formal analysis, Data curation. Tao Ran: Resources, Project administration, Methodology, Investigation, Formal analysis.

      Declaration of competing interest

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

      Acknowledgements

This study was financially supported by the State Key Laboratory of HVDC (No. SKLHVDC-2023-KF-03).

      Appendix A

      Table A1 Parameterization of ResGRU.

                                          IEEE 39-bus system       IEEE 145-bus system
ResNet branch                             Quantity/function        Quantity/function
Input image size                          128×128                  512×512
External convolution layer                1/ReLU                   1/ReLU
External convolution kernel size          7×7                      7×7
External convolution kernel quantity      64                       64
ResNet block quantity                     8                        8
Residual block convolution layers         8×2/ReLU                 8×2/ReLU
Residual block convolution kernel size    3×3                      3×3
Constant-mapping residual module          Residual block 1,2,6,4   Residual block 1,2,6,4
Pooling method                            Residual block 3,5,7     Residual block 3,5,7
BiGRU branch                              Quantity/function        Quantity/function
Time dimension                            122                      122
Input feature dimension                   262                      1096
Number of neurons                         2×256                    2×256
Dropout rate                              0.2                      0.2
Fully connected layers                    Quantity/function        Quantity/function
Layers                                    2/ReLU + 1/softmax       2/ReLU + 1/softmax
Units in the first layer                  128                      128


      Table A2 Parameter settings for PSS/E bulk transient simulation.


      References

[1] E. O'Shaughnessy, J. Heeter, C. Shah, et al., Corporate acceleration of the renewable energy transition and implications for electric grids, Renew. Sustain. Energy Rev. 146 (2021) 111160.

[2] Erdiwansyah, Mahidin, H. Husin, et al., A critical review of the integration of renewable energy sources with various technologies, Protect. Control Modern Power Syst. 6 (1) (2021) 3.

[3] M. Chen, Q. Liu, J. Zhang, et al., XGBoost-based algorithm for post-fault transient stability status prediction, Power Syst. Technol. 44 (2020) 1026-1034.

[4] T. L. Vu, K. Turitsyn, Lyapunov functions family approach to transient stability assessment, IEEE Trans. Power Syst. 31 (2) (2016) 1269-1277.

[5] P. Varaiya, F. F. Wu, R. L. Chen, Direct methods for transient stability analysis of power systems: Recent results, Proc. IEEE 73 (12) (1985) 1703-1715.

[6] T. Huang, J. Wang, A practical method of transient stability analysis of stochastic power systems based on EEAC, Inter. J. Elect. Power Energy Syst. 107 (2019) 167-176.

[7] S. Y. Wang, J. L. Yu, W. Zhang, Transient stability assessment using individual machine equal area criterion PART I: Unity principle, IEEE Access 6 (2018) 77065-77076.

[8] A. Gupta, G. Gurrala, P. S. Sastry, An online power system stability monitoring system using convolutional neural networks, IEEE Trans. Power Syst. 34 (2) (2019) 864-872.

[9] W. Wu, Y. Tang, H. D. Sun, et al., A survey on research of power system transient stability based on wide-area measurement information, Power Syst. Technol. 36 (9) (2012) 81-87.

[10] D. Liu, D. Song, H. Wang, et al., Voltage stability online evaluation system based on WAMS and EMS, Power Syst. Technol. 38 (2014) 1934-1938.

[11] L. Wehenkel, M. Pavella, E. Euxibie, et al., Decision tree based transient stability method a case study, IEEE Trans. Power Syst. 9 (1) (1994) 459-469.

[12] L. S. Moulin, A. P. A. DaSilva, M. A. El-Sharkawi, et al., Support vector machines for transient stability analysis of large-scale power systems, IEEE Trans. Power Syst. 19 (2) (2004) 818-825.

[13] W. Tongwen, G. Lin, A data mining technique based on pattern discovery and k-nearest neighbor classifier for transient stability assessment, in: 2007 International Power Engineering Conference (IPEC 2007), 2007, pp. 118-123.

[14] L. L. Zhang, X. W. Hu, P. Li, et al., ELM model for power system transient stability assessment, in: Proceedings of 2017 Chinese Automation Congress (CAC), Jinan, IEEE, 2017, pp. 5740-5744.

[15] M. H. Chen, Q. Y. Liu, S. H. Chen, et al., XGBoost-based algorithm interpretation and application on post-fault transient stability status prediction of power system, IEEE Access 7 (2019) 13149-13158.

[16] S. Grigorescu, B. Trasnea, T. Cocias, et al., A survey of deep learning techniques for autonomous driving, J. Field Rob. 37 (3) (2020) 362-386.

[17] E. Moen, D. Bannon, T. Kudo, et al., Deep learning for cellular image analysis, Nature Methods 16 (12) (2019) 1233-1246.

[18] R. A. Khalil, N. Saeed, M. Masood, et al., Deep learning in the industrial Internet of Things: Potentials, challenges, and emerging applications, IEEE Internet Things J. 8 (14) (2021) 11016-11040.

[19] Q. M. Zhu, J. F. Chen, L. Zhu, et al., A deep end-to-end model for transient stability assessment with PMU data, IEEE Access 6 (2018) 65474-65487.

[20] S. Wu, L. Zheng, W. Hu, et al., Improved deep belief network and model interpretation method for power system transient stability assessment, J. Modern Power Syst. Clean Energy 8 (1) (2020) 27-37.

[21] W. X. Liu, D. Hao, S. Zhang, et al., Power system transient stability assessment based on PSO-DBN, in: Proceedings of 2021 6th International Conference on Power and Renewable Energy (ICPRE), Shanghai, China, IEEE, 2021, pp. 333-337.

[22] Q. F. Chen, H. Y. Wang, Time-adaptive transient stability assessment based on gated recurrent unit, Inter. J. Elect. Power Energy Syst. 133 (2021) 107156.

[23] J. J. Q. Yu, D. J. Hill, A. Y. S. Lam, et al., Intelligent time-adaptive transient stability assessment system, IEEE Trans. Power Syst. 33 (1) (2018) 1049-1058.

[24] S. Cheng, Z. H. Yu, Y. Liu, et al., Power system transient stability assessment based on the multiple paralleled convolutional neural network and gated recurrent unit, Protect. Control Modern Power Syst. 7 (2022) 39.

[25] K. He, X. Zhang, S. Ren, et al., Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.

[26] Q. F. Chen, N. Lin, S. Q. Bu, et al., Interpretable time-adaptive transient stability assessment based on dual-stage attention mechanism, IEEE Trans. Power Syst. 38 (3) (2023) 2776-2790.

[27] M. Zhang, J. Lucas, J. Ba, et al., Lookahead Optimizer: k steps forward, 1 step back, Adv. Neural Infor. Proc. Syst. 32 (2019).

[28] H. He, G. Huang, Y. Yuan, Asymmetric valleys: Beyond sharp and flat local minima, in: Advances in Neural Information Processing Systems 32, 2019.

[29] Z. Xie, X. Wang, H. Zhang, et al., Adaptive inertia: Disentangling the effects of adaptive learning rate and momentum, in: Proceedings of the 39th International Conference on Machine Learning, PMLR, 2022, pp. 24430-24459.

[30] P. Zhou, J. Feng, C. Ma, et al., Towards theoretically understanding why SGD generalizes better than ADAM in deep learning, Adv. Neural Inf. Proces. Syst. 33 (2020).

[31] J. Z. Zhang, T. X. He, S. Sra, et al., Why gradient clipping accelerates training: A theoretical justification for adaptivity, arXiv:1905.11881 (2019).

[32] I. Loshchilov, F. Hutter, SGDR: Stochastic gradient descent with warm restarts, arXiv:1608.03983 (2016).

[33] Z. T. Shi, W. Yao, L. K. Zeng, et al., Convolutional neural network-based power system transient stability assessment and instability mode prediction, Appl. Energy 263 (2020) 114586.

[34] P. Kundur, Power system stability, Power Syst. Stabil. Control 10 (2007) 7-11.

[35] E. Larsen, D. Swann, Applying power system stabilizers part I: General concepts, IEEE Trans. Power Apparatus Syst. PAS-100 (6) (1981) 3017-3024.

[36] F. Liu, R. Yokoyama, Y. C. Zhou, et al., Design of H∞ robust damping controllers of FACTS devices considering time delay of wide-area signals, in: Proceedings of 2011 IEEE Trondheim PowerTech, Trondheim, IEEE, 2011, pp. 1-6.

[37] D. Roberson, J. F. O'Brien, Multivariable loop-shaping control design for stability augmentation and oscillation rejection in wide-area damping using HVDC, Electr. Pow. Syst. Res. 157 (2018) 238-250.

[38] Y. Du, Z. Hu, B. Li, et al., Transient stability assessment of power system based on bi-directional gated recurrent unit, Auto. Elect. Power Syst. 45 (20) (2021) 103-112.

[39] H. Wang, S. Cheng, Q. Xu, et al., Noise-containing power quality disturbance identification method based on deep learning fusion network, Power Syst. Protect. Control 52 (10) (2024) 11-20.

[40] P. Kundur, J. Paserba, V. Ajjarapu, et al., Definition and classification of power system stability IEEE/CIGRE joint task force on stability terms and definitions, IEEE Trans. Power Syst. 19 (3) (2004) 1387-1401.

[41] G. Huang, Y. X. Li, G. Pleiss, et al., Snapshot ensembles: Train 1, get M for free, arXiv:1704.00109 (2017).

[42] D. M. Ibrahim, N. M. Elshennawy, A. M. Sarhan, Deep-chest: Multi-classification deep learning model for diagnosing COVID-19, pneumonia, and lung cancer chest diseases, Comput. Biol. Med. 132 (2021) 104348.

[43] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res. 9 (2008) 2579-2605.


      Author

      • Shan Cheng

Shan Cheng received the B.S. degree in electronic information engineering from Henan Polytechnic University, China, in 2005, and the Ph.D. degree in electrical engineering from Chongqing University, China, in 2013. He is currently a Professor with the College of Electrical Engineering and New Energy, China Three Gorges University. His research interests include the integration of new energy into the grid, vehicle-to-grid technology, and integrated energy systems.

      • Qiping Xu

Qiping Xu was born in Sichuan, China, in 2000. He received the B.S. degree from the Sichuan University of Science & Engineering in 2022. He is currently pursuing the master’s degree with China Three Gorges University. His research focuses on the application of deep learning to power systems.

      • Haidong Wang

Haidong Wang was born in Anhui, China, in 2000. He received the B.S. degree from the Chaohu University of Science & Engineering in 2022. He is currently pursuing the master’s degree with China Three Gorges University. His research focuses on the application of deep learning to power systems.

      • Zihao Yu

Zihao Yu was born in Shandong, China, in 1998. He received the B.S. degree from Shandong University of Technology in 2020 and the M.S. degree from China Three Gorges University in 2023. His research focuses on the application of deep learning to power systems.

      • Rui Wang

Rui Wang was born in Chongqing, China, in 2002. He received the B.S. degree from the Chongqing University of Science & Technology in 2023. He is currently pursuing the master’s degree with China Three Gorges University. His research interests include electric vehicle and grid interaction technology and power system operation and control.

      • Tao Ran

Tao Ran was born in Sichuan, China, in 1999. He received the B.S. degree from Suihua University in 2022. He is currently pursuing the master’s degree with China Three Gorges University. His research interests include power system operation and control, distributionally robust optimization in power systems, and resilience enhancement in distribution networks.

      Publish Info


Published: 2025-02-25

Reference: Shan Cheng, Qiping Xu, Haidong Wang, et al. (2025) A transient stability assessment method for power systems incorporating residual networks and BiGRU-attention. Global Energy Interconnection, 8(1): 143-159.

      (Editor Zedong Zhang)