The advantages and disadvantages of various machine learning methods in the diagnosis of atrial fibrillation and the future direction of improvement

Atrial fibrillation (AF) is an arrhythmia that can cause a variety of serious complications. Paroxysmal atrial fibrillation is episodic and often asymptomatic, so it is difficult to diagnose. Long-term ECG big data can improve the detection rate of paroxysmal atrial fibrillation, but interpreting such data has become a burden for primary medical institutions. To address this problem, a variety of shallow learning models based on ECG features continue to appear. These models rely heavily on manual feature extraction, and all have limitations. Deep learning is a data-driven approach that learns features automatically, which makes up for the shortcomings of shallow learning.

The Lorenz scatter plot is an emerging method for rapid analysis of ECG big data, and the two-dimensional graphics it outputs are high-quality material for deep learning. Studies have shown that, when diagnosing common arrhythmias, existing shallow diagnostic models are more accurate than general practitioners, but their error rate is still too high to serve as a basis for clinical diagnosis and treatment. For atrial fibrillation in particular, existing diagnostic models do not significantly improve diagnostic accuracy, and the risk of misdiagnosis is greater in the elderly population. There is an urgent need to build a deep learning model to help general practitioners in primary care diagnose atrial fibrillation.

This article reviews the advantages and disadvantages of computer-aided diagnosis models for atrial fibrillation and the current status of machine learning in the diagnosis of atrial fibrillation. It aims to provide new ideas for constructing auxiliary diagnosis models and a new perspective for primary medical staff on the problem of interpreting ECG big data.

1. Advantages and disadvantages of computer-aided diagnosis models for atrial fibrillation

1.1 Models based on the electrocardiographic characteristics of atrial waves

Computer-aided diagnosis models for atrial fibrillation are constructed based on atrial waves and RR intervals. In atrial fibrillation, the atrial waves show the disappearance of P waves and the appearance of f waves. Because the f wave has a low amplitude, it is susceptible to baseline drift, motion artifacts, power-line interference, and EMG interference, which lowers the specificity and sensitivity of diagnostic models based on atrial waves. The advantage is that the waveform feature extraction window is narrow, so no long ECG segment is required to construct a model, which is conducive to the detection of paroxysmal atrial fibrillation.

1.2 Models based on the ECG characteristics of the RR interval

Diagnostic models based on the RR interval lack f wave information and cannot reliably distinguish atrial fibrillation from other arrhythmias. When atrial fibrillation is accompanied by atrioventricular block or atrioventricular junctional tachycardia, the RR interval is regular. Conversely, the RR interval in atrial flutter or multifocal atrial tachycardia may be irregular. The RR interval feature extraction window is wide, requiring 50 to 500 heartbeats, so paroxysmal atrial fibrillation with a short episode duration is inevitably missed.
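The window-size limitation can be illustrated with a minimal sketch. It assumes RR intervals are given in milliseconds and uses the coefficient of variation as a stand-in irregularity measure; the function name `rr_irregularity` and the choice of measure are hypothetical and are not taken from any of the reviewed models.

```python
from statistics import mean, pstdev

def rr_irregularity(rr_ms, window=50):
    """Coefficient of variation of RR intervals for every full window.

    Assumption: `window` is the number of beats needed per score,
    mirroring the 50 to 500 beat windows mentioned in the text.
    """
    if window < 2:
        raise ValueError("window must contain at least 2 beats")
    scores = []
    for start in range(len(rr_ms) - window + 1):
        segment = rr_ms[start:start + window]
        scores.append(pstdev(segment) / mean(segment))
    return scores

# A perfectly regular rhythm scores 0 in every window.
regular = [800] * 60
# A synthetic irregular rhythm cycling through 600..1000 ms.
irregular = [800 + 100 * ((i * 7) % 5 - 2) for i in range(60)]
# An episode shorter than the window yields no score at all.
short_episode = irregular[:30]

print(rr_irregularity(regular)[:3])
print(rr_irregularity(irregular)[:3])
print(rr_irregularity(short_episode))
```

The last call returns an empty list, which is the sketch's version of the problem described above: an episode shorter than the feature window never produces a feature value.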

1.3 Models based on combined atrial wave and RR interval characteristics

Models that combine atrial wave and RR interval characteristics are slightly better at diagnosing atrial fibrillation than models based only on the RR interval. However, the performance of the combined model depends on peak detection. Long-term ECG recordings are susceptible to interference from daily activities, which produces a large number of spurious peaks and seriously degrades performance.

1.4 Models based on nonlinear ECG characteristics

Kumar et al. obtained sub-band signals using the flexible analytic wavelet transform (FAWT) and calculated the log energy entropy (LEE) and permutation entropy (PEn) from them. This model, based on nonlinear features, can distinguish small differences between sinus rhythm and atrial fibrillation, and does not rely on the detection of f waves or R waves.
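The two entropy measures can be sketched as follows. This is an illustration under stated assumptions, not the cited authors' implementation: the wavelet decomposition itself is omitted, so each function takes an already-extracted sub-band as a plain list of samples. Log energy entropy is written in one common form (the sum of log of squared samples), and permutation entropy uses the standard ordinal-pattern definition. The `order`, `delay`, and `eps` parameters are illustrative choices.

```python
import math
from collections import Counter

def log_energy_entropy(signal, eps=1e-12):
    """Sum of log(x^2) over the samples; eps guards against log(0)."""
    return sum(math.log(x * x + eps) for x in signal)

def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns of length `order`."""
    n = len(signal) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    patterns = Counter()
    for i in range(n):
        window = [signal[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    entropy = -sum((c / n) * math.log(c / n) for c in patterns.values())
    if normalize:
        # Divide by the maximum possible entropy, log(order!).
        entropy /= math.log(math.factorial(order))
    return entropy

# A monotone series has a single ordinal pattern, so its entropy is 0.
print(permutation_entropy(list(range(20))))
# A less regular series mixes rising and falling patterns.
print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2))
```

In a pipeline of this kind, the two values computed per sub-band would be collected into a feature vector for a classifier; that step is left out here.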

1.5 Models based on the ECG characteristics of ECG scatter plots

1.5.1 Lorenz scatter plot

In a Lorenz scatter plot, each pair of adjacent RR intervals determines one point, and the plot is built up by iterating over the whole recording. The Lorenz scatter plot does not rely on detection of the ECG waveform, and it performs well in diagnosing atrial fibrillation and in differentiating it from other complex arrhythmias.

The Lorenz scatter plot is a nonlinear analysis method that examines ECG big data at a macro level. ECG scatter plots in the broader sense include difference scatter plots, RDR scatter plots, three-dimensional scatter plots, and others. By changing the observation section and increasing dimensionality, they can provide clinicians with more detailed ECG characteristics of atrial fibrillation.
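The construction of these plots can be sketched in a few lines. The sketch assumes a list of RR intervals in milliseconds and takes the point convention to be (RR_n, RR_n+1) for the Lorenz plot and (dRR_n, dRR_n+1) for the difference scatter plot; the axis order and the function names are illustrative assumptions.

```python
def lorenz_points(rr):
    """Pair each RR interval with the one that follows it."""
    return list(zip(rr[:-1], rr[1:]))

def difference_points(rr):
    """Pair each successive RR difference with the next difference."""
    diffs = [b - a for a, b in zip(rr[:-1], rr[1:])]
    return list(zip(diffs[:-1], diffs[1:]))

rr = [800, 810, 790, 1200, 600]
print(lorenz_points(rr))
# [(800, 810), (810, 790), (790, 1200), (1200, 600)]
print(difference_points(rr))
# [(10, -20), (-20, 410), (410, -600)]
```

A regular rhythm clusters tightly along the diagonal of the Lorenz plot, whereas irregular RR intervals scatter the points widely. The resulting point set is what a plotting library would render as the two-dimensional image.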

1.5.2 Models based on the ECG characteristics of the Lorenz scatter plot

After analyzing the shape of the Lorenz scatter plot in atrial fibrillation, researchers extracted three indicators: the number of clusters, the dispersion of points around the diagonal line, and the mean stepping increment of the inter-beat intervals. The k-means clustering method and a support vector machine were then used to distinguish atrial fibrillation rhythm from non-atrial fibrillation rhythm. The results show that the average sensitivity and average specificity of the model are 91.4% and 92.9%, respectively. Other researchers have proposed two indicators, the frequency distribution of points in different areas of the Lorenz scatter plot and the complex correlation measure (CCM), and fed both into a neural network; the accuracy of the resulting diagnostic model is as high as 94%.

Building on the Lorenz scatter plot, Lown et al. proposed a difference scatter plot containing 60 RR intervals. Using the characteristics of the difference scatter plot, they constructed a computer-aided diagnosis model for atrial fibrillation with a support vector machine. On the training set, the model's sensitivity is 99.2% and its specificity is 99.5%; on the test set, its sensitivity is 100.0% and its specificity is 97.6%. However, the data used by the above models come from international standard ECG databases, so the source is homogeneous and generalization ability is limited. The diagnostic performance of these models in the real world still needs further verification.
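For reference, the sensitivity and specificity figures quoted above follow from a confusion matrix as sketched below. The labels are made-up toy data (1 for atrial fibrillation, 0 for any other rhythm) and do not reproduce any of the reported results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = AF."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 3 AF segments and 5 non-AF segments.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(round(sens, 3), round(spec, 3))  # 0.667 0.8
```

Because both metrics depend on the case mix of the evaluation set, values measured on a single standard database may not carry over to real-world recordings, which is the generalization concern raised above.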

2. Progress in the application of machine learning in the diagnosis of atrial fibrillation

2.1 Auxiliary diagnosis models based on shallow machine learning

Shallow machine learning algorithms include random forest, support vector machine, LASSO regression, decision tree, naive Bayes, k-means clustering, and others. Models of this type require manual extraction of feature indicators, a process that is subject to subjective influence. They also cannot exploit the information carried by high-dimensional features, which limits their adoption for ECG big data.

2.2 Auxiliary diagnosis models based on deep learning

Deep learning is good at image recognition and learning. It has been studied most extensively in medical imaging and is gradually being applied in electrocardiology, face recognition, diabetic retinopathy, and other fields. In 2019, The Lancet published an article on constructing an atrial fibrillation diagnostic model based on deep learning. The article proposed that deep learning can detect ECG signals that cannot be observed by the human eye, which is beneficial to the diagnosis of paroxysmal atrial fibrillation and of atrial fibrillation with other complex arrhythmias.

3. Current status and shortcomings of the application of computer-aided diagnosis models

Most models draw their training and test data from a single source, so they show low accuracy and weak generalization in clinical application.

The 12-lead ECG machines currently used in primary medical institutions generally have auxiliary diagnostic functions, but because of their low accuracy, the conclusions they draw are unreliable. Few of the models constructed so far use real-world ECG data, which limits their application in primary medical institutions.

Research shows that combining general practitioners in primary medical institutions with computer-aided diagnosis models can improve the accuracy of atrial fibrillation diagnosis, but this is not enough for atrial fibrillation screening, and it is still necessary to strengthen ECG training for primary medical staff. As a "slow solution" to improving the accuracy of atrial fibrillation diagnosis, training cannot solve the "emergency problems" faced by primary medical institutions. Therefore, it is particularly important to build a computer-aided diagnosis model with good performance.

4. Summary and prospects

Computer-aided diagnosis models based on ECG features can help general practitioners make ECG diagnoses quickly. Deep learning, as an emerging artificial intelligence technology, has significant advantages in image recognition, high-dimensional data, and nonlinear feature processing. Whether an organic combination of the two will produce a better-performing model is worthy of further study. In short, deep learning is bound to become the mainstream of medical image recognition in the future, and it will be applied more and more widely in various fields.

Edited by: Yang Saili