Chaotic signal denoising algorithm based on sparse decomposition
Huang Jin-Wang1, Lv Shan-Xiang2, Zhang Zu-Sheng1, Yuan Hua-Qiang1, †
1 School of Cyberspace Science, Dongguan University of Technology, Dongguan 523808, China
2 College of Cyber Security, Jinan University, Guangzhou 510632, China

 

† Corresponding author. E-mail: yuanhq@dgut.edu.cn

Project supported by the National Natural Science Foundation of China (Grant No. 61872083) and the Natural Science Foundation of Guangdong Province, China (Grant Nos. 2017A030310659 and 2019A1515011123).

Abstract

Denoising of chaotic signals is a challenging task owing to their wide-band, noise-like characteristics. A denoising algorithm should both give the denoised signal a high signal-to-noise ratio and retain its chaotic characteristics. We propose a denoising method for chaotic signals based on sparse decomposition and K-singular value decomposition (K-SVD) optimization. The observed signal is divided into segments and decomposed sparsely. The over-complete atomic dictionary is constructed according to the differential equation of the chaotic signal, and the orthogonal matching pursuit algorithm is used to search for the best-matching atoms. The atoms and coefficients are further refined by K-SVD to obtain globally optimal atoms and coefficients. Simulation results show that the denoised signals have a higher signal-to-noise ratio and better preserve the chaotic characteristics.

PACS: 05.45.-a; 05.40.Ca
1. Introduction

Chaotic phenomena widely exist in astronomy, meteorology, stock markets, and many other fields.[1,2] Some chaotic systems can be described by deterministic equations, yet their behavior resembles a stochastic process and exhibits broadband, noise-like characteristics.[3,4] Because of limited measurement accuracy and external influences, an actually measured chaotic signal is inevitably mixed with noise. The characteristics of the chaotic signal may therefore be concealed, degrading the accuracy of parameter estimation and prediction, so research on chaotic signal denoising algorithms is very important. Moreover, the non-linearity and broadband nature of chaotic signals make the denoising problem more challenging.

The current denoising algorithms for chaotic signals can be classified into three categories. In the first, the system model and parameters are known. The recursive expressions of the chaotic signal are used as state equations, and an adaptive filter, such as the unscented Kalman filter or the cubature Kalman filter, is adopted to estimate the signal.[5–7] These methods can achieve online denoising, but the system model and parameters need to be known a priori. In the second, the system model is known but the parameters are unknown. The phase-space denoising method proposed in Ref. [8] belongs to this category: it explicitly uses the source expression of the chaotic signal, but the recursive parameters are estimated by a least-squares method within the algorithm, and the global error of the estimated signal is finally minimized. In the third, both the system model and the parameters are unknown, so the algorithms are completely blind. The wavelet threshold (WT) algorithm,[9,10] the local curve fitting (LCF) algorithm,[11,12] and the empirical mode decomposition (EMD) algorithm[13,14] are representative all-blind methods. Essentially, they exploit the low-frequency property of chaotic flow signals and carry out low-pass filtering while preserving the non-periodic property as far as possible. Their validity depends on the determination of the filtering boundary: the WT algorithm sets the high-frequency coefficients below a certain threshold to zero (hard threshold), the LCF algorithm performs hard projection under an optimal window length, and the key of the EMD algorithm is to select the optimal number of intrinsic mode functions to reconstruct the source signal.

Based on these existing algorithms, we propose a noise suppression algorithm using sparse decomposition[15,16] and K-SVD.[17] The decomposition coefficients of noise in the dictionary are not sparse: the non-zero coefficients cover the whole transform domain and are generally small compared with those of the signal. The sparse decomposition of the signal in the dictionary, by contrast, has only K non-zero coefficients, so the decomposition of the noisy signal has only K large coefficients. By decomposing the noisy signal in the dictionary A and keeping only the K largest coefficients, the linear combination of the atoms corresponding to these K coefficients approximates the clean signal. The approximation retains most of the information of the useful signal while discarding most of the noise, which achieves the purpose of noise elimination.
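To make this principle concrete, the following toy sketch (our illustration, not the paper's algorithm) decomposes a noisy signal in an orthonormal DCT basis built with NumPy, keeps only the K largest coefficients, and reconstructs; the test signal, the basis, and the value of K are arbitrary choices.

```python
# Illustration: noise energy spreads over all transform coefficients, while the
# signal concentrates in a few, so keeping only the K largest coefficients
# suppresses noise. This is a generic example, not the paper's dictionary.
import numpy as np

rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
clean = np.cos(2 * np.pi * 3 * t / n) + 0.5 * np.cos(2 * np.pi * 7 * t / n)
noisy = clean + 0.3 * rng.standard_normal(n)

# Orthonormal DCT-II basis built explicitly so only NumPy is required.
k_idx = np.arange(n)
basis = np.cos(np.pi * (t[:, None] + 0.5) * k_idx[None, :] / n)
basis[:, 0] *= np.sqrt(1.0 / n)
basis[:, 1:] *= np.sqrt(2.0 / n)

coeffs = basis.T @ noisy                 # analysis: project onto the basis
K = 8
keep = np.argsort(np.abs(coeffs))[-K:]   # indices of the K largest coefficients
sparse_coeffs = np.zeros_like(coeffs)
sparse_coeffs[keep] = coeffs[keep]
denoised = basis @ sparse_coeffs         # synthesis from only K atoms

mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((denoised - clean) ** 2)
print(f"MSE before: {mse_before:.4f}, after top-{K} reconstruction: {mse_after:.4f}")
```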

The innovation of this algorithm lies in using the noise intensity to dynamically determine the sparsity of the reconstructed signal, giving a low-complexity dictionary design scheme, and optimizing the result with the K-SVD algorithm. The rest of this paper is organized as follows. Section 2 introduces the sparse decomposition denoising model, Section 3 presents the algorithm and describes each step in detail, and Section 4 analyzes the algorithm. Section 5 compares the algorithm with other classical algorithms through numerical simulations, and the final section concludes the paper.

2. Signal model

The observed noisy signal y(n) obtained from a noise-free signal x(n) can be modeled as

y(n) = x(n) + v(n),

where v(n) is Gaussian noise with zero mean and variance σ^2.

As the signal is divided into segments and processed segment by segment, we use a window function F(y(n), M′) with window length M′ = 2M + 1 for signal segmentation: y(n) is segmented into frames of length M′ with a moving step of M samples each time. The signal model can then be rewritten in vector form as y_t = x_t + v_t, where y_t = [y(t), y(t + 1), …, y(t + M′ − 1)]^T ∈ ℝ^{M′}, and x_t and v_t have similar expressions. All the vectors y_t form a matrix Y = [y_1, y_2, …, y_L], where L = N/M and Y ∈ ℝ^{M′ × L}.
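A minimal sketch of this segmentation, assuming a hop of M samples and window length M′ = 2M + 1 as stated above; the stand-in signal is arbitrary.

```python
# Split an observed signal into overlapping columns of length 2M+1 with step M,
# forming the segment matrix Y described in the text.
import numpy as np

def segment_signal(y, M):
    """Return a (2M+1, L) matrix whose columns are overlapping segments of y."""
    Mp = 2 * M + 1
    starts = range(0, len(y) - Mp + 1, M)
    return np.stack([y[s:s + Mp] for s in starts], axis=1)

y = np.arange(100, dtype=float)   # stand-in for the observed signal y(n)
Y = segment_signal(y, M=20)
print(Y.shape)                    # (41, L): each column is one segment y_t
```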

For the noisy observation y_t, the sparsest representation is the solution to the optimization problem[15]

(P_{0,δ}):  min_α ‖α‖_0   subject to  ‖y_t − Aα‖_2 ≤ δ,

where A is the dictionary and α is the sparse decomposition coefficient vector. Note that (P_{0,0}) coincides with (P_0), the optimization problem of noiseless sparse decomposition. If we apply (P_{0,δ}) with δ > ε = ‖y_t − x_t‖_2, the problem has a sparse solution: since the noiseless solution α_t is then feasible for (P_{0,δ}), the solution obeys val(P_{0,δ}) ≤ val(P_0), where α_t is the solution of the noiseless problem (P_0) and val(⋅) denotes the optimal value of an optimization problem.

The problem (P_{0,δ}) generally demands exorbitant computational effort and need not have a unique sparse solution; when it does admit a sparsest solution, the corresponding representation can be computed. In the presence of noise we can pursue a strategy of convexification, replacing the ℓ_0 norm in (P_{0,δ}) by an ℓ_1 norm:

(P_{1,δ}):  min_α ‖α‖_1   subject to  ‖y_t − Aα‖_2 ≤ δ.
For the (P_{1,δ}) problem, there are many algorithms for solving the resulting convex program, such as iteratively-reweighted least squares (IRLS),[18] interior-point algorithms,[19] and active-set methods. The solution of (P_{1,δ}) is closely related to the basis pursuit denoising (BPDN) algorithm.[19] Both of these approaches amount to solving a corresponding convex optimization problem in Lagrangian form,

(Q_{1,λ}):  min_α  λ‖α‖_1 + (1/2)‖y_t − Aα‖_2^2.

For a properly chosen λ = λ(y_t, δ), the problems (P_{1,δ}) and (Q_{1,λ}) have the same solution.
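As an illustration only, the Lagrangian problem (Q_{1,λ}) is a standard LASSO/BPDN program and can be handled by off-the-shelf solvers. The sketch below uses scikit-learn's Lasso, whose objective (1/(2n))‖y − Aα‖_2^2 + λ‖α‖_1 matches (Q_{1,λ}) up to a rescaling of λ; the random dictionary, sparsity pattern, and regularization value are arbitrary, and this is not the solver used in the paper.

```python
# Solve the Lagrangian (LASSO/BPDN) form with scikit-learn's coordinate-descent
# Lasso solver; this is a generic demonstration, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
Mp, Np = 41, 61                        # segment length and dictionary size (cf. Sec. 5)
A = rng.standard_normal((Mp, Np))
A /= np.linalg.norm(A, axis=0)         # unit-norm atoms
alpha_true = np.zeros(Np)
alpha_true[[3, 17, 40]] = [2.0, -1.5, 1.0]
y = A @ alpha_true + 0.05 * rng.standard_normal(Mp)

lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
lasso.fit(A, y)
# Indices of coefficients with non-negligible magnitude.
print(np.flatnonzero(np.abs(lasso.coef_) > 0.1))
```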

3. The proposed denoising algorithm

By decomposing the contaminated signal and retaining only the K largest coefficients, the linear combination of these coefficients with the corresponding atoms approximates the clean signal, achieving noise cancellation. Selecting the sparsity of each frame, constructing a suitable dictionary A, and optimizing the atoms and coefficients are the key parts of this denoising algorithm. We therefore determine the sparsity dynamically from the noise intensity, build the dictionary from the Taylor expansion of the signal's iterative equation, and refine the atoms and coefficients with the K-SVD algorithm. Figure 1 shows the flow chart of the sparse decomposition based denoising algorithm.

Fig. 1. Block diagram of sparse decomposition based denoising algorithm.
3.1. Sparsity determination

In blind processing, the noise energy can be roughly estimated by means of the least total variation or the least-squares principle,[22,23] and the sparsity in the denoising model is then determined from this noise-intensity estimate. Define the problem

where the sparsity is a function of the input signal energy and the estimated noise energy, ω is a constant, and 2M + 1 is the length of the vector y_t. The sparsity of the estimated signal decreases as the noise increases, because the larger the noise, the greater the probability that an atom (column) of the dictionary A obtained from ℓ_0-norm pursuit is misjudged. Therefore, reconstructing x only from atoms with a high confidence probability achieves the maximum noise suppression.
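The paper's exact sparsity formula is not reproduced above, so the sketch below only encodes the stated trend: the chosen sparsity shrinks as the estimated noise energy grows relative to the segment energy. The logarithmic mapping, the clipping bounds, and the way ω enters are hypothetical illustration choices.

```python
# Hypothetical sparsity rule: more estimated noise -> fewer atoms kept.
# The specific mapping below is an assumption, not the paper's Eq. for K_t.
import numpy as np

def choose_sparsity(y_t, noise_energy_est, omega=2.5, k_max=None):
    Mp = len(y_t)                                 # segment length 2M + 1
    if k_max is None:
        k_max = Mp
    signal_energy = float(np.dot(y_t, y_t))
    ratio = max(signal_energy - noise_energy_est, 0.0) / max(noise_energy_est, 1e-12)
    # Monotone mapping (assumed): sparsity decreases as the noise share grows.
    return int(np.clip(np.round(omega * np.log1p(ratio)), 1, k_max))

y_t = np.random.default_rng(2).standard_normal(41)
for noise_e in (1.0, 10.0, 40.0):
    print(noise_e, choose_sparsity(y_t, noise_e))   # sparsity shrinks with noise
```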

3.2. Dictionary construction

The chaotic flow signal can be described by a differential equation and expressed recursively, so the signal x(θ) generated by the system is (k + 1)-times differentiable. The Taylor expansion of x(n) at θ is

x(n) = Σ_{j=0}^{k} [x^{(j)}(θ)/j!] (n − θ)^j + o((n − θ)^k).

Each segment of the signal is regarded as centered on the origin, so the above expression can be rewritten as

x(n) ≈ Σ_{j=0}^{k} [x^{(j)}(0)/j!] n^j.

Based on this representation, each atom of the matrix A can be set as a sampled power function whose weight is the corresponding sparse coefficient, i.e., the j-th atom is built from the powers n^{j−1} over the segment, where 1 ≤ j ≤ N′. Every atom a_j is normalized as a_j ← a_j/‖a_j‖_2; unit-norm atoms simplify the subsequent steps of the denoising algorithm.
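A sketch of this dictionary construction under the assumption, suggested by the Taylor argument, that the j-th atom samples the power function n^{j−1} on the centred grid n = −M, …, M and is then normalized to unit norm; with M = 20 and N′ = 61 this reproduces the 41 × 61 dictionary size used in Section 5.

```python
# Build a polynomial (power-function) dictionary on a segment centred at the
# origin; the exact atom definition is our assumption based on the text.
import numpy as np

def build_polynomial_dictionary(M=20, n_atoms=61):
    n = np.arange(-M, M + 1, dtype=float)         # centred grid of length 2M+1
    A = np.vander(n, N=n_atoms, increasing=True)  # columns 1, n, n^2, ...
    A /= np.linalg.norm(A, axis=0)                # unit-norm atoms a_j
    return A

A = build_polynomial_dictionary()
print(A.shape)                                    # (41, 61), as used in Sec. 5
```

Note that the high-order columns are nearly collinear in double precision, which is one reason the algorithm keeps only a few atoms with high confidence.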

3.3. Signal sparse decomposition

For this problem, an ℓ_1-norm constraint generally gives better recovery accuracy than an ℓ_0-norm constraint, but the convex relaxation algorithms associated with the ℓ_1 norm are slower than the greedy algorithms associated with the ℓ_0 norm by roughly a factor of Iter/K, where Iter is the minimum number of iterations for the convex relaxation algorithm to converge. In addition, ℓ_1-norm pursuit tends to introduce more atoms, which degrades the performance of the denoising algorithm. The idea of our algorithm is to retain only the atoms with a high confidence probability under noise pollution, and a greedy algorithm is better suited to this task. We use the orthogonal matching pursuit (OMP) algorithm[20,21] to reconstruct the data in each frame: because the length of the data in each segment is small, the simple "single in, no out" atom-selection criterion of OMP is sufficient. In contrast, the stage-wise orthogonal matching pursuit (StOMP)[24] and regularized orthogonal matching pursuit (ROMP)[25] algorithms use a "multi-in, multi-out" criterion, whose advantage lies in large-scale data recovery.

The initial residual is R_1 = y_t, i.e., the original signal segment. The orthogonal set of selected atoms is D = {d_p}, p = 1, 2, …, k, and the atoms are selected in the iterative process according to the following rules so as to minimize the residual between the sparse representation and the original signal.

(1) Select the atom having the maximum absolute inner product with the current residual,

d_m = argmax_{a_i} |⟨R_m, a_i⟩|,
where m = 1,2,…,n and i = 1,2,…,n.

(2) Orthogonalize the selected atom against the previously chosen atoms by the Gram–Schmidt method,

d_m ← d_m − Σ_{p=1}^{m−1} ⟨d_m, d_p⟩ d_p.
(3) Normalize d_m,

d_m ← d_m / ‖d_m‖_2.
(4) Project the residual component R_m onto d_m and update the residual,

R_{m+1} = R_m − ⟨R_m, d_m⟩ d_m.
The residual component continues to be decomposed until its norm falls below a predetermined threshold. All the selected atoms constitute the atomic set D, and a preliminary sparse representation of the signal is obtained from this atomic set. The signal is expressed as

x̂_t = D s = Σ_{p=1}^{k} s_p d_p,

where s is the vector of representation coefficients.
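A compact OMP sketch for reference. It uses the same atom-selection rule as steps (1)–(4) above but, for brevity, replaces the explicit Gram–Schmidt bookkeeping with a least-squares solve on the selected atoms, which produces the same residual; the test dictionary and sparse vector are arbitrary.

```python
# Minimal orthogonal matching pursuit: greedily select the atom most correlated
# with the residual, then re-fit the coefficients of all selected atoms.
import numpy as np

def omp(A, y, K, tol=1e-8):
    """Return a K-sparse coefficient vector s such that A @ s approximates y."""
    residual = y.copy()
    support = []
    s = np.zeros(A.shape[1])
    for _ in range(K):
        correlations = A.T @ residual
        j = int(np.argmax(np.abs(correlations)))   # atom most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    s[support] = coef
    return s

rng = np.random.default_rng(3)
A = rng.standard_normal((41, 61))
A /= np.linalg.norm(A, axis=0)
s_true = np.zeros(61)
s_true[[2, 10, 30]] = [1.0, -2.0, 0.5]
y = A @ s_true + 0.01 * rng.standard_normal(41)
print(np.flatnonzero(omp(A, y, K=3)))              # indices of the selected atoms
```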

3.4. Dictionary update

All the selected atoms are updated one by one using the K-SVD algorithm so as to reach a global optimum; each atom and its corresponding coefficients are updated simultaneously.

Choose a column d_k (i.e., the k-th atom) of the dictionary and the k-th row of the coefficient matrix S corresponding to d_k, denoted s_T^k (not the k-th column vector s_k of S). The representation error can then be written as

‖Y − DS‖_F^2 = ‖Y − Σ_{j≠k} d_j s_T^j − d_k s_T^k‖_F^2 = ‖E_k − d_k s_T^k‖_F^2.

Singular value decomposition is used to solve for d_k and s_T^k in the above equation.

Define ω_k as the index set of the samples y_t that use the atom d_k, i.e., the positions of the non-zero entries of the row s_T^k. Define a matrix Ω_k of size L × |ω_k| whose entries (ω_k(i), i) are 1 and whose other entries are 0. Multiplying by Ω_k shrinks the row vector s_T^k by discarding its zero entries, giving s_R^k = s_T^k Ω_k. Similarly, Y Ω_k denotes the subset of the examples that currently employ the atom d_k.

The error matrix E_k = Y − Σ_{j≠k} d_j s_T^j collects the representation errors of all examples when the atom d_k is removed. Restricting it to the examples that employ d_k, the minimization problem mentioned above becomes

min_{d_k, s_R^k} ‖E_k Ω_k − d_k s_R^k‖_F^2.

The restricted error matrix E_k^R = E_k Ω_k is decomposed as E_k^R = UΔV^T by the SVD algorithm, and both the atom d_k = u_1 and the coefficient row s_R^k = Δ(1,1) v_1^T are updated.
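The following sketch applies this update to every atom in turn (one K-SVD sweep), assuming the segments form the columns of Y and their sparse codes the columns of S, so that Y ≈ DS; each atom and its coefficient row are replaced by the leading singular pair of the restricted error matrix. The function name and the column-wise storage convention are our own.

```python
# One K-SVD dictionary-update sweep over all atoms (Y ~ D @ S).
import numpy as np

def ksvd_update(D, S, Y):
    D, S = D.copy(), S.copy()
    for k in range(D.shape[1]):
        omega = np.flatnonzero(S[k, :])          # segments that actually use atom k
        if omega.size == 0:
            continue
        # Restricted error matrix: remove atom k's contribution for those segments.
        E_k = Y[:, omega] - D @ S[:, omega] + np.outer(D[:, k], S[k, omega])
        U, sigma, Vt = np.linalg.svd(E_k, full_matrices=False)
        D[:, k] = U[:, 0]                        # updated (unit-norm) atom
        S[k, omega] = sigma[0] * Vt[0, :]        # updated coefficient row
    return D, S
```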

We update all the atoms in the atomic set D sequentially from the first column to the last, which yields the optimized dictionary and sparse coefficients. They are globally optimal and represent the original signal better. The resulting sparse representation is

X̂ = D̂ Ŝ,

where D̂ is the optimized dictionary and Ŝ is the optimized coefficient matrix.
3.5. Signal recovery

The sparse coefficients and the optimized dictionary are obtained by the sparse decomposition of Y = [y_1, y_2, …, y_L] after steps 3 and 4, and from them we obtain the reconstructed signal matrix X̂. Each column of X̂ is a signal segment; all segments are merged, and averaging is performed over the overlapping regions. The resulting signal is the sparse denoised signal.
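A minimal sketch of this merging step, assuming the segmentation of Section 2 (hop M, window length 2M + 1): reconstructed segments are accumulated into a full-length buffer and overlapping samples are averaged.

```python
# Merge overlapping reconstructed segments back into one signal, averaging the
# samples covered by more than one segment.
import numpy as np

def merge_segments(X_hat, M, total_length):
    accum = np.zeros(total_length)
    counts = np.zeros(total_length)
    Mp = X_hat.shape[0]                           # 2M + 1
    for l in range(X_hat.shape[1]):
        start = l * M
        accum[start:start + Mp] += X_hat[:, l]
        counts[start:start + Mp] += 1
    counts[counts == 0] = 1                       # samples never covered stay zero
    return accum / counts
```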

4. Algorithm analysis

Based on the dictionary A, the received signal can be rewritten as y = As + v when the subscript t is omitted. We now discuss why the number of selected atoms in A should be reduced when the noise increases. In the OMP algorithm, the residual at the k-th step can be decomposed as[20]

r_k = q_k + v_k,

where q_k is the residual component originating from the clean signal and v_k is the component originating from the noise.
According to the analysis of the OMP algorithm in Ref. [20], if the real support set of the pure data projected onto A is denoted A_Γ, the maximum inner-product coefficients can be defined from q_k and v_k above. Let Q_{t,1} be the absolute value of the maximum inner product between q_k and the atoms of the real support set, Q_{t,2} the maximum inner product between q_k and the atoms outside the real support set (the interference), and J_t the maximum inner product between the noise term v_k and the support-set atoms. Substituting r_k = q_k + v_k and using the absolute-value inequality,[20] a sufficient condition for choosing a correct supporting atom is

Q_{t,1} − Q_{t,2} > 2J_t.

Hence an atom can be correctly selected only if Q_{t,1} − Q_{t,2} > 2J_t. Without loss of generality, order the moduli of the N′ coefficients in s as |s_1| ≥ |s_2| ≥ ⋯ ≥ |s_{N′}|, and suppose only |s_1|, …, |s_l| satisfy Q_{t,1} − Q_{t,2} > 2J_t. Since J_t grows with the noise variance σ^2, increasing σ^2 reduces the number of atoms that can be selected correctly, and introducing incorrect atoms leads to performance deterioration. We therefore reduce the number of selected atoms when the noise intensity increases, so as to ensure that all the selected atoms are "correct".

Regarding complexity, the algorithm iterates over three main stages: sparse coding (the OMP process), dictionary update, and the final averaging process. As shown in Ref. [29], these stages can be performed efficiently in O(J M′ k L_nz) operations per segment, where J is the number of iterations in the dictionary update, M′ is the segment length, k is the number of atoms in the dictionary, and L_nz is the maximal number of non-zero elements in each coefficient vector.

5. Simulation

The denoising of chaotic signals should be reflected not only in the improvement of the SNR but also in the maximal restoration of the characteristic parameters of the chaotic signal. In this section, the sparse decomposition based denoising (SDD) algorithm is compared with the WT, LCF, and EMD algorithms in terms of SNR improvement, root mean square error (RMSE), and phase-diagram recovery. The nonlinear time series data are generated from the following Lorenz system:[26]

dx/dt = α(y − x),   dy/dt = βx − y − xz,   dz/dt = xy − γz,
where α = 10, β = 28, and γ = 8/3. The equations are solved by the fourth-order Runge–Kutta method with a step size of 0.01. In each experiment, the iteration starts from a random position in the chaotic region, the state variable x is used as the chaotic signal, and the sample length is N = 8000. The SNR and RMSE are defined as

SNR = 10 log_10 [Σ_n x^2(n) / Σ_n (x(n) − x̂(n))^2],   RMSE = √{(1/N) Σ_n [x(n) − x̂(n)]^2},

where x̂(n) denotes the denoised (or noisy) signal being evaluated.
We add noise to x(n) so that the input SNR ranges from −5 dB to 25 dB and compare all the algorithms in terms of SNR improvement. The parameter settings are as follows: the moving length of the window is M = 20, the constant is ω = 2.5, and the dimension of the matrix A is 41 × 61; the WT algorithm uses J = 4 decomposition levels and a hard threshold; the LCF algorithm uses a polynomial of order l = 3 (const. + x + x² + x³) with block length 2Δ + 1 = 17; and the EMD algorithm uses 9 iterations and 10 IMF layers.
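A self-contained sketch (not the paper's code) of this data-generation and evaluation setup: it produces the Lorenz x component with a fourth-order Runge–Kutta integrator at step size 0.01 and evaluates the SNR and RMSE defined above. The initial condition, random seed, and the 15 dB noise level are arbitrary illustration choices.

```python
# Generate the Lorenz x-component with RK4 and compute SNR / RMSE between a
# clean series x and an estimate x_hat (here, a noisy copy at roughly 15 dB).
import numpy as np

def lorenz_deriv(state, a=10.0, b=28.0, c=8.0 / 3.0):
    x, y, z = state
    return np.array([a * (y - x), b * x - y - x * z, x * y - c * z])

def simulate_lorenz(n_samples=8000, h=0.01, x0=(1.0, 1.0, 1.0)):
    state = np.array(x0, dtype=float)
    xs = np.empty(n_samples)
    for i in range(n_samples):
        k1 = lorenz_deriv(state)
        k2 = lorenz_deriv(state + 0.5 * h * k1)
        k3 = lorenz_deriv(state + 0.5 * h * k2)
        k4 = lorenz_deriv(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        xs[i] = state[0]
    return xs

def snr_db(x, x_hat):
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum((x - x_hat) ** 2))

def rmse(x, x_hat):
    return np.sqrt(np.mean((x - x_hat) ** 2))

x = simulate_lorenz()
noise = np.std(x) * 10 ** (-15 / 20) * np.random.default_rng(4).standard_normal(len(x))
x_noisy = x + noise
print(f"input SNR ~ {snr_db(x, x_noisy):.1f} dB, RMSE = {rmse(x, x_noisy):.3f}")
```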

Figure 2(a) shows the output SNR obtained by the different algorithms when the input SNR is between −5 dB and 25 dB. The SDD algorithm is about 2.35 dB higher than the other three on average, especially at low SNR, because it selects fewer atoms to avoid noise when the noise intensity increases. In essence, the SDD algorithm is an extension of the local curve fitting method, but it can use different atoms in different segments of the signal; therefore, the consistency of the SDD algorithm is better than that of the LCF algorithm. Figure 2(b) shows the RMSE of these algorithms; the SDD algorithm again performs better than the other three, especially at low SNR, which is consistent with the SNR performance in Fig. 2(a).

Fig. 2. (a) SNR improvement and (b) RMSE of different denoising algorithms under different noise intensities.

To show these differences more intuitively, figure 3 presents the attractor flow patterns reconstructed by the various methods when the input SNR is 15 dB. The flow pattern reconstructed by SDD is smoother around the attractor, whereas the other three methods exhibit rough transitions in some regions.

Fig. 3. Comparison of the reconstructed attractors by different denoising algorithms: (a) the original pure data, (b) the signal after noise pollution, and (c)–(f) the signal recovered by SDD, WT, LCF, and EMD, respectively.

The proliferation exponent (PE)[27] is used here to analyze the chaotic characteristics of the reconstructed signal. Although the Lyapunov exponent (LE)[28] reflects the chaotic properties of a signal more directly from the definition of chaos, its calculation for continuous chaotic signals is based on the Jacobian matrix of the defining equations, which is not suitable for large amounts of sample data. The proliferation exponent largely reflects the chaotic characteristics of the original signal when the input sample is large enough. The proliferation exponent is defined as

where Vs(t) is the phase space distance function,[28] and α is the control factor.

Here we use a Lorenz signal with length 8000 and sampling interval 0.05, with the same simulation parameter settings as above. Regarding the control factor α, note that its selection takes the sampling step into account; that is, the optimal α is set according to the sampling step so as to best reflect the chaotic characteristics, and the simulation results may differ slightly for different α values. Here we set α = 6 and calculate the PE of the signals restored by the different algorithms. As shown in Fig. 4, the PE value of the signal restored by the SDD algorithm is about 0.56 over the whole interval, while the PE of the original data is 0.47; the PE values obtained by the WT, LCF, and EMD algorithms deviate more, being 0.64, 0.62, and 0.65, respectively.

Fig. 4. Comparison of proliferation exponents of the restored signal by different denoising algorithms.
6. Conclusion

Based on the theory of sparse decomposition and K-SVD, we propose a chaotic signal denoising algorithm with dynamic sparsity determination, a dictionary constructed from the signal's iterative equation, and K-SVD optimization. The special feature of this algorithm is that the constructed over-complete dictionary has sufficient frequency-domain components, so it can better represent each segment of the signal, and the K-SVD optimization further improves the noise reduction. Simulation results show that the algorithm achieves better SNR improvement and RMSE than the other three denoising algorithms, especially at low SNR, and it also performs better in the calculation of the PE value that characterizes the chaotic signal.

References
[1] Alamodi A O A Sun K H Ai W Chen C Peng D 2019 Chin. Phys. 28 020503
[2] Bi H Y Qi G Y Hu J B Wu Q L 2020 Chin. Phys. 29 020502
[3] Gao F Li W Q Tong H Q Li Q L 2019 Chin. Phys. 28 090501
[4] Wang W B Zhang X D Chang Y C Wang X L Wang Z Chen X Zheng L 2015 Chin. Phys. 25 010202
[5] Yang H Li Y A Li G H 2015 Acta Armamentarii 36 2330 in Chinese
[6] Liu K Li H Dai X C Xu P X 2008 JEIT 30 1849 in Chinese
[7] Wang M J Zhou Z Q Li Z J Zeng Y C 2018 Acta Phys. Sin. 67 060501 in Chinese
[8] Lv S X Feng J C 2013 Acta Phys. Sin. 62 230503 in Chinese
[9] Wang W B Jin Y Y Wang B Li W G Wang X L 2018 Acta Electron. Sin. 46 1652 in Chinese
[10] Hai S Gao H W Ruan X J 2016 Int. J. Signal Process., Image Process. and Pattern Recognit. 9 219
[11] Gao J B Sultan H Hu J Tung W W 2010 IEEE Signal Process. Lett. 17 237
[12] Tung W W Gao J B Hu J Yang L 2011 Phys. Rev. 83 046210
[13] Wu Z H Huang N E 2010 Adv. Adap. Data Anal. 2 397
[14] Wang M J Zhou Z Q Li Z J Zeng Y C 2019 Circ. Syst. Signal Proces. 38 2471
[15] Donoho D L Elad M Temlyakov V N 2006 IEEE Trans. Inf. Theory 52 6
[16] Tseng P 2009 IEEE Trans. Inf. Theory 55 888
[17] Dumitrescu B Irofti P 2017 IEEE Signal Proces. Lett. 24 309
[18] Karlovitz L A 1970 J. Appr. Theory 3 123
[19] Chen S S Donoho D L Saunders M A 2001 SIAM Rev. 43 129
[20] Cai T T Wang L 2011 IEEE Trans. Inf. Theory 57 4680
[21] Karahanoglu N B Erdogan H 2012 Digital Signal Proces.: A Rev. J. 22 555
[22] Tian J Chen L 2012 IEEE Signal Proces. Lett. 19 395
[23] Chen P Rong Y Nordholm S He Z Q 2017 IEEE Trans. Vehic. Technol. 66 10567
[24] Lee D J 2016 IEEE Commun. Lett. 20 2115
[25] Needell D Vershynin R 2010 IEEE J. Sel. Top. Signal Proces. 4 310
[26] Chowdhury M S H Hashim I Momani S 2009 Chaos Soliton. Fract. 40 1929
[27] Lv S X Wang Z S Hu Z H Feng J C 2014 Chin. Phys. 23 010506
[28] Holger K Thomas S 2004 Nonlinear Time Series Analysis Cambridge Cambridge University Press 65 74
[29] Elad M Aharon M 2006 IEEE Trans. Image Proces. 15 3736