CN102592593A - Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech - Google Patents


Info

Publication number
CN102592593A
CN102592593A CN2012100915251A CN201210091525A
Authority
CN
China
Prior art keywords
rank
characteristic
matrix
tensor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100915251A
Other languages
Chinese (zh)
Other versions
CN102592593B (en)
Inventor
吴强 (Wu Qiang)
刘琚 (Liu Ju)
孙建德 (Sun Jiande)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shandong University
Priority to CN201210091525.1A (granted as CN102592593B)
Publication of CN102592593A
Application granted
Publication of CN102592593B
Expired - Fee Related


Landscapes

  • Machine Translation (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an emotional feature extraction method that considers the sparsity of multilinear groups in speech. The method comprises the following steps: considering the multiple factors contained in a speech signal, such as time, frequency, scale, and direction information; performing feature extraction by a sparse decomposition method for multilinear groups; constructing a multilinear representation of the energy spectrum of the speech signal through Gabor functions of different scales and directions; solving for the feature projection matrices by a group sparse tensor decomposition method; computing the feature projection on the frequency mode; decorrelating the features through the discrete cosine transform; and finally computing the first-order and second-order difference coefficients to obtain the emotional features of the speech. According to the invention, factors such as time, frequency, scale, and direction in a speech signal are all considered for extracting emotional features, and the feature projection is performed by a group sparse tensor decomposition method, thereby improving the accuracy of multi-class speech emotion recognition.

Description

A speech emotional feature extraction method considering multilinear group sparsity in speech
Technical field
The present invention relates to a speech emotional feature extraction method for improving speech emotion recognition performance, and belongs to the field of speech processing technology.
Background art
Speech is one of the most convenient means of daily human communication, which has led researchers to explore how to use speech as a tool for interaction between people and machines. Beyond traditional interaction modes such as speech recognition, the speaker's emotion also carries important interactive information, and the ability of a machine to automatically recognize and understand the speaker's emotion is one of the hallmarks of intelligent human-computer interaction.
Speech emotion recognition has significant value in signal processing and intelligent human-machine interaction, with many potential applications. In human-computer interaction, recognizing the speaker's emotion by computer can improve the friendliness and accuracy of a system: a distance-education system, for example, can adjust its course in time by recognizing a student's emotion, thereby improving teaching effectiveness; in telephone call centers and mobile communications, a user's emotional information can be obtained promptly to improve the quality of service; an in-vehicle system can detect through emotion recognition whether the driver is concentrating, and issue an appropriate auxiliary warning. In medicine, speech-based emotion recognition can serve as a tool to help doctors diagnose a patient's condition.
For speech emotion recognition, an important problem is how to extract effective features to represent different emotions. In traditional feature extraction, a speech signal is usually divided into frames in order to obtain approximately stationary segments. Features obtained from each frame, such as pitch and energy, are called local features; their advantage is that existing classifiers can use them to estimate the parameters of different emotional states fairly accurately, while their disadvantage is that the feature dimensionality and sample count are large, which affects the speed of feature extraction and classification. Features obtained by computing statistics over a whole utterance are called global features; they yield better classification accuracy and speed, but lose the temporal information of the speech signal and easily run into a shortage of training samples. In general, commonly used features for speech emotion recognition fall into several types: continuous acoustic features, spectral features, features based on the Teager energy operator, and so on.
According to research in psychology and prosody, the most intuitive cues to a speaker's emotion in speech are the continuous prosodic features, such as pitch, energy, and speaking rate. Corresponding global features include the mean, median, standard deviation, maximum, and minimum of pitch or energy, as well as the first and second formants.
Spectral features provide the useful frequency information in the speech signal and are another important mode of feature extraction in speech emotion recognition. Commonly used spectral features include linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and perceptual linear prediction (PLP).
Speech is produced by nonlinear airflow in the vocal system. The Teager energy operator (TEO), proposed by Teager et al., is an operation that can rapidly track changes in signal energy within a glottal cycle, and is used to analyze the fine structure of speech. Under different emotional states, muscle tension affects the airflow in the vocal system; according to the findings of Bou-Ghazale et al., TEO-based features can be used to detect stress in speech.
Numerous experimental evaluations show that, for speech emotion recognition, suitable features should be chosen for each classification task: Teager-energy-based features are suited to detecting stress in the speech signal; continuous acoustic features are suited to distinguishing high-arousal emotions from low-arousal emotions; and for multi-class emotion classification tasks, spectral features are the most suitable speech representation. Combining spectral features with continuous acoustic features, or jointly analyzing multiple factors, can also improve classification accuracy.
After feature extraction and selection are complete, the other important stage is classification. Various classifiers from pattern recognition are currently used to classify speech emotional features, including hidden Markov models (HMM), Gaussian mixture models (GMM), support vector machines (SVM), linear discriminant analysis (LDA), and ensemble classifiers. The HMM is one of the most widely used recognizers in speech emotion recognition, benefiting from its broad application to speech signals; it is particularly suited to data with sequential structure, and current studies show that HMM-based emotion recognition systems can provide high classification accuracy. A Gaussian mixture model can be regarded as an HMM with a single state and is well suited to modeling multivariate distributions; Breazeal et al. applied GMM classifiers to the KISMET speech database to classify five emotion categories. Support vector machines are widely used in pattern recognition; their basic principle is to project features into a high-dimensional space via a kernel function so that the features become linearly separable. Compared with HMM and GMM, SVMs have the advantages of globally optimal training and generalization bounds that depend on the data, and many studies have obtained good classification results using SVMs as the classifier for speech emotion recognition.
As shown in Fig. 1, a traditional speech emotion recognition method based on spectral features usually adopts the following steps (a minimal baseline sketch in Python follows the list):
1) preprocess the input speech signal, including windowing, filtering, and pre-emphasis;
2) apply the short-time Fourier transform to the signal, filter with a Mel filterbank, and take the logarithm to obtain the log spectrum;
3) compute the cepstrum with the discrete cosine transform, apply weighting and cepstral mean subtraction, and compute the difference coefficients;
4) train Gaussian mixture models (GMM) to obtain a model for each emotion;
5) use the trained emotion models to recognize the test data and obtain the recognition accuracy.
For two-class emotion classification, such as negative versus neutral emotion, relatively good accuracy has already been achieved. For multi-class emotion classification, however, the imbalance of the data and the fact that only a single factor (frequency or time) is considered make the extracted features poorly discriminative, so the classification accuracy is relatively low, which limits the application of speech-based emotion recognition systems.
Summary of the invention
In traditional speech emotion recognition, feature extraction considers only a single factor, such as frequency or time, which leaves the features poorly discriminative. To address this problem, the present invention proposes a speech emotional feature extraction method that considers multilinear group sparsity in speech and, when used for speech emotion recognition, improves the accuracy of multi-class emotion recognition.
The emotional feature extraction method of the present invention, which considers multilinear group sparsity in speech, is as follows:
The multiple factors contained in a speech signal, namely time, frequency, scale, and direction information, are considered, and feature extraction is performed by a multilinear group sparse decomposition method: a multilinear representation of the speech energy spectrum is constructed through Gabor functions of different scales and directions; the feature projection matrices are solved by a group sparse tensor decomposition method; the feature projection on the frequency mode is computed; the features are decorrelated through the discrete cosine transform; and the first-order and second-order difference coefficients of the features are obtained. The method specifically comprises the following steps:
(1) Acquire a speech signal s(t) (through a device such as a microphone), transform s(t) to the time-frequency domain by the short-time Fourier transform, and obtain the time-frequency representation S(f, t) and the energy spectrum P(f, t);
(2) Convolve the energy spectrum with two-dimensional Gabor functions of different scales and directions. The Gabor function is defined as:

$g_{\bar{k}}(\bar{x}) = \frac{\|\bar{k}\|^2}{\sigma^2} e^{-\|\bar{k}\|^2 \|\bar{x}\|^2 / (2\sigma^2)} \left[ e^{j \bar{k} \cdot \bar{x}} - e^{-\sigma^2/2} \right]$,

where $\bar{x} = (t, f)$ denotes the position at frame t and frequency f in the energy spectrum P(f, t); $\bar{k} = (k_v \cos\varphi, k_v \sin\varphi)$ is the vector controlling the scale and direction of the function; j denotes the imaginary unit; $k_v = 2^{-(v+2)/2}\pi$ and $\varphi = u\pi/K$, with u indexing the direction of the function, v its scale, and K the total number of directions; $\sigma$ is a constant determining the function envelope, set to $2\pi$.
The result of the convolution of the Gabor functions with the energy spectrum P(f, t) is the multilinear representation $\bar{\mathcal{P}}_G$ of the speech signal, a 5th-order tensor whose modes represent time, frequency, direction, scale, and category, respectively; filtering the frequency mode of $\bar{\mathcal{P}}_G$ with a Mel filterbank then yields a new 5th-order tensor $\bar{\mathcal{P}}$ of size $N_1 \times N_2 \times N_3 \times N_4 \times N_5$, the length of mode i being $N_i$, $i = 1, \ldots, 5$;
(3) Apply group sparse tensor decomposition to the multilinear representation $\bar{\mathcal{P}}$ and compute the projection matrices $U^{(i)}$, $i = 1, \ldots, 5$, on the different factors for the subsequent feature projection. The following decomposition model is established:

$\bar{\mathcal{P}} \approx \bar{\Lambda} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)} \times_5 U^{(5)}$

where $U^{(i)}$ is the projection matrix of size $N_i \times K$ obtained from the decomposition; $\bar{\Lambda}$ is a 5th-order tensor of size $K \times K \times K \times K \times K$ whose diagonal elements are 1; and $\times_i$ denotes the mode-i product of a tensor and a matrix, defined as:

$(\bar{\mathcal{X}} \times_i A)_{n_1, \ldots, n_{i-1}, k, n_{i+1}, \ldots, n_M} = \sum_{n_i} \bar{\mathcal{X}}_{n_1, \ldots, n_M} A_{k, n_i}$

where $\bar{\mathcal{X}}$ denotes an M-th order tensor of size $N_1 \times \cdots \times N_M$, A is a matrix of size $K \times N_i$, $\bar{\mathcal{X}}_{n_1, \ldots, n_M}$ is an element of the tensor $\bar{\mathcal{X}}$, and $A_{k, n_i}$ is an element of the matrix A (a numpy sketch of this mode product follows);
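For illustration, a minimal numpy sketch of the mode-i product defined above; the helper name mode_product and the example sizes are assumptions introduced here, not part of the patent.

import numpy as np

def mode_product(X, A, i):
    """Compute X x_i A: contract mode i of tensor X (length N_i) with a
    matrix A of shape (K, N_i), so that mode i of the result has length K."""
    Xi = np.moveaxis(X, i, 0)                    # bring mode i to the front
    out = np.tensordot(A, Xi, axes=([1], [0]))   # sum over n_i
    return np.moveaxis(out, 0, i)                # restore the mode order

# Example: project the frequency mode (i = 1) of a random 5th-order tensor.
X = np.random.rand(10, 36, 4, 4, 2)   # time x frequency x direction x scale x category
A = np.random.rand(12, 36)            # K = 12 projection directions
print(mode_product(X, A, 1).shape)    # (10, 12, 4, 4, 2)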
The projection matrices $U^{(i)}$, $i = 1, \ldots, I$, are computed by the following concrete decomposition procedure, where i indexes the modes (corresponding to the different factors) and I = 5 (a numpy sketch of this update loop follows the steps):

1. Initialize $U^{(i)} \geq 0$, $i = 1, \ldots, I$, by alternating least squares or at random;

2. Normalize each column vector $u_k^{(i)}$, $i = 1, \ldots, I$, $k = 1, \ldots, K$, of the projection matrices;

3. While the error objective function

$\bar{E} = \frac{1}{2} \left\| \bar{\mathcal{P}} - \sum_{k=1}^{K} u_k^{(1)} \circ u_k^{(2)} \circ \cdots \circ u_k^{(I)} \right\|_F^2$

is greater than a given threshold, repeat the following operations:

● for i = 1 to I, update in turn:

$u_k^{(i)} \leftarrow \frac{\| u_k^{(i)} \|_F}{\gamma_k^{(i)} \| u_k^{(i)} \|_F + \lambda_k q_i} \left[ P_{(i)}^{(k)} \{u_k\}_{\odot}^{-i} \right]_+$,

where $\| \cdot \|_F$ denotes the Frobenius norm; $P_{(i)}^{(k)}$ is the mode-i matrix unfolding of the tensor $\bar{\mathcal{P}}^{(k)} = \bar{\mathcal{P}} - \sum_{j=1, j \neq k}^{K} u_j^{(1)} \circ u_j^{(2)} \circ \cdots \circ u_j^{(I)}$; $\{u_k\}_{\odot}^{-i} = u_k^{(I)} \odot \cdots \odot u_k^{(i+1)} \odot u_k^{(i-1)} \odot \cdots \odot u_k^{(1)}$, with $\odot$ the Khatri-Rao product of matrices; and $\lambda_k$ and $q_i$ are weight coefficients between 0 and 1 that regulate the sparsity of the objective function terms;

● if $i \neq 5$, $\gamma_k^{(i)} = u_k^{(I)T} u_k^{(I)}$; if $i = 5$, $\gamma_k^{(i)}$ is given by the corresponding inner product over the remaining factor;

4. When the objective function $\bar{E}$ falls below the threshold, end the loop; the projection matrices $U^{(i)}$, $i = 1, \ldots, I$, are obtained;
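The following is a minimal numpy sketch of this decomposition loop as reconstructed above; the random nonnegative initialization, the fixed lambda and q sparsity weights, the stopping threshold, and the use of the inner product of the contracted factors for gamma are illustrative assumptions rather than the patent's exact settings.

import numpy as np
from functools import reduce

def unfold(T, mode):
    """Mode-i matrix unfolding of a tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def rank1(vectors):
    """Outer product u^(1) o u^(2) o ... o u^(I)."""
    return reduce(np.multiply.outer, vectors)

def group_sparse_cp(P, K, lam=0.1, q=0.1, tol=1e-3, max_iter=200):
    I = P.ndim
    U = [np.abs(np.random.rand(n, K)) for n in P.shape]   # step 1: nonnegative init
    for _ in range(max_iter):
        for k in range(K):
            # Residual tensor P^(k): all rank-1 terms except the k-th removed.
            Pk = P - sum(rank1([U[j][:, r] for j in range(I)])
                         for r in range(K) if r != k)
            for i in range(I):
                # Contraction {u_k}^(-i): product over all modes except i.
                others = rank1([U[j][:, k] for j in range(I) if j != i]).ravel()
                num = np.maximum(unfold(Pk, i) @ others, 0.0)   # [ . ]_+
                u = U[i][:, k]
                gamma = others @ others
                U[i][:, k] = (np.linalg.norm(u) * num /
                              (gamma * np.linalg.norm(u) + lam * q + 1e-12))
        E = 0.5 * np.linalg.norm(
            P - sum(rank1([U[j][:, k] for j in range(I)]) for k in range(K))) ** 2
        if E < tol:          # step 4: stop once the objective is small enough
            break
    return U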
(4) Use the projection matrix $U^{(2)}$ corresponding to the frequency domain to perform the feature projection on the multilinear representation $\bar{\mathcal{P}}$ of the speech signal:

$\bar{\mathcal{S}} = \bar{\mathcal{P}} \times_2 U_+^{(2)}$

where $U_+^{(2)}$ is the matrix formed by the nonzero elements of the pseudoinverse of the projection matrix $U^{(2)}$, and $\times_2$ denotes the mode-2 tensor-matrix product of $U_+^{(2)}$ with $\bar{\mathcal{P}}$;
(5) Fix the time mode and apply the tensor unfolding operation to the multilinear sparse representation $\bar{\mathcal{S}}$, obtaining a feature matrix $S_{(f)}$ of size $N_1 \times \hat{N}_1$, where $\hat{N}_1 = N_2 \cdot N_3 \cdot N_4 \cdot N_5$;
(6) Apply the discrete cosine transform to $S_{(f)}$ for decorrelation to obtain the speech emotional feature F, and compute the first-order and second-order difference coefficients of the feature to obtain the final emotional features (a sketch of steps (4) to (6) follows).
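A minimal numpy/scipy sketch of steps (4) to (6), reusing the mode_product helper from the sketch above; taking the positive part of the pseudoinverse as the matrix of its nonzero elements and using np.gradient for the difference coefficients are illustrative assumptions.

import numpy as np
from scipy.fftpack import dct

def extract_emotional_features(P, U2):
    """P: 5th-order tensor (time, freq, direction, scale, category);
    U2: frequency projection matrix of shape (N2, K)."""
    U2_pinv = np.linalg.pinv(U2)                   # pseudoinverse, shape (K, N2)
    U2_plus = np.where(U2_pinv > 0, U2_pinv, 0.0)  # keep its nonzero (positive) part
    S = mode_product(P, U2_plus, 1)                # step (4): frequency-mode projection
    S_f = S.reshape(S.shape[0], -1)                # step (5): unfold with time mode fixed
    F = dct(S_f, type=2, norm='ortho', axis=1)     # step (6): DCT decorrelation
    d1 = np.gradient(F, axis=0)                    # first-order difference coefficients
    d2 = np.gradient(d1, axis=0)                   # second-order difference coefficients
    return np.hstack([F, d1, d2])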
The present invention considers the time, frequency, scale, and direction factors in the speech signal for emotional feature extraction and performs the feature projection by a group sparse tensor decomposition method, thereby improving the accuracy of multi-class speech emotion recognition.
Description of drawings
Fig. 1 is a schematic block diagram of a traditional speech emotion recognition process;
Fig. 2 is a schematic diagram of the feature extraction method of the present invention;
Fig. 3 is a schematic block diagram of a speech emotion recognition process employing the present invention;
Fig. 4 is a comparison chart of experimental results for four-class speech emotion recognition.
Embodiment
As shown in Fig. 2, the speech emotion feature extraction method of the present invention based on multilinear group sparsity specifically comprises the following steps:
(1) Collect a speech signal s(t) through a device such as a microphone, transform s(t) to the time-frequency domain by the short-time Fourier transform, and obtain the time-frequency representation S(f, t) and the energy spectrum P(f, t) (a minimal sketch follows);
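A minimal scipy sketch of this step; the 8 kHz sampling rate, Hamming window, and 23 ms window with 10 ms shift mirror the experiment described below, and the helper name energy_spectrum is an assumption.

import numpy as np
from scipy.signal import stft

def energy_spectrum(s, fs=8000):
    """Return the time-frequency representation S(f, t) and the
    energy spectrum P(f, t) = |S(f, t)|^2 of the signal s(t)."""
    win = int(0.023 * fs)                         # 23 ms window
    hop = int(0.010 * fs)                         # 10 ms shift
    f, t, S = stft(s, fs=fs, window='hamming', nperseg=win, noverlap=win - hop)
    return S, np.abs(S) ** 2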
(2) Convolve the energy spectrum with two-dimensional Gabor functions of different scales and directions to obtain the multilinear representation $\bar{\mathcal{P}}_G$ of the speech signal, and then filter the frequency mode of $\bar{\mathcal{P}}_G$ with a Mel filterbank to obtain the representation $\bar{\mathcal{P}}$ (a sketch of the Gabor filterbank follows this step).

The Gabor function is defined as:

$g_{\bar{k}}(\bar{x}) = \frac{\|\bar{k}\|^2}{\sigma^2} e^{-\|\bar{k}\|^2 \|\bar{x}\|^2 / (2\sigma^2)} \left[ e^{j \bar{k} \cdot \bar{x}} - e^{-\sigma^2/2} \right]$,

where $\bar{x} = (t, f)$ denotes the position at frame t and frequency f in the energy spectrum P(f, t); $\bar{k} = (k_v \cos\varphi, k_v \sin\varphi)$ is the vector controlling the scale and direction of the function; j denotes the imaginary unit; $k_v = 2^{-(v+2)/2}\pi$ and $\varphi = u\pi/K$, with u indexing the direction of the function, v its scale, and K the total number of directions; $\sigma$ is a constant determining the function envelope, set to $2\pi$.

The result of the convolution of the Gabor functions with the energy spectrum P(f, t) is the multilinear representation $\bar{\mathcal{P}}_G$ of the speech signal, a 5th-order tensor whose modes represent time, frequency, direction, scale, and category, respectively; filtering the frequency mode of $\bar{\mathcal{P}}_G$ with the Mel filterbank then yields a new 5th-order tensor $\bar{\mathcal{P}}$ of size $N_1 \times N_2 \times N_3 \times N_4 \times N_5$, the length of mode i being $N_i$, $i = 1, \ldots, 5$;
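A minimal numpy sketch of the Gabor filterbank in this step, following the definition above with sigma = 2π and with 4 scales and 4 directions as in the experiment below; the kernel support size and the use of scipy.signal.fftconvolve are illustrative assumptions. Stacking the responses of all utterances by emotion category then supplies the fifth (category) mode of the tensor.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(u, v, K=4, size=15, sigma=2 * np.pi):
    """Gabor function g_k(x) for direction index u and scale index v."""
    k_v, phi = 2.0 ** (-(v + 2) / 2.0) * np.pi, u * np.pi / K
    kx, ky = k_v * np.cos(phi), k_v * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2, x2 = kx ** 2 + ky ** 2, x ** 2 + y ** 2
    return (k2 / sigma ** 2) * np.exp(-k2 * x2 / (2 * sigma ** 2)) * \
           (np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma ** 2 / 2))

def gabor_representation(P, n_dirs=4, n_scales=4):
    """Convolve the energy spectrum P(f, t) with every Gabor kernel and stack
    the magnitudes into a (time, frequency, direction, scale) array."""
    out = np.empty(P.shape[::-1] + (n_dirs, n_scales))
    for u in range(n_dirs):
        for v in range(n_scales):
            G = fftconvolve(P, gabor_kernel(u, v, K=n_dirs), mode='same')
            out[:, :, u, v] = np.abs(G).T        # transpose to (time, frequency)
    return out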
(3) Apply group sparse tensor decomposition to the representation $\bar{\mathcal{P}}$ and compute the projection matrices $U^{(i)}$, $i = 1, \ldots, 5$, on the different factors for the feature projection. The following decomposition model is established:

$\bar{\mathcal{P}} \approx \bar{\Lambda} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)} \times_5 U^{(5)}$

where $U^{(i)}$ is the projection matrix of size $N_i \times K$ obtained from the decomposition; $\bar{\Lambda}$ is a 5th-order tensor of size $K \times K \times K \times K \times K$ whose diagonal elements are 1; and $\times_i$ denotes the mode-i product of a tensor and a matrix, defined as:

$(\bar{\mathcal{X}} \times_i A)_{n_1, \ldots, n_{i-1}, k, n_{i+1}, \ldots, n_M} = \sum_{n_i} \bar{\mathcal{X}}_{n_1, \ldots, n_M} A_{k, n_i}$

where $\bar{\mathcal{X}}$ denotes an M-th order tensor of size $N_1 \times \cdots \times N_M$, A is a matrix of size $K \times N_i$, $\bar{\mathcal{X}}_{n_1, \ldots, n_M}$ is an element of the tensor $\bar{\mathcal{X}}$, and $A_{k, n_i}$ is an element of the matrix A.

To compute the projection matrices $U^{(i)}$, $i = 1, \ldots, I$, with I = 5, the concrete decomposition procedure is as follows:

A) initialize $U^{(i)} \geq 0$, $i = 1, \ldots, I$, by alternating least squares or at random;

B) normalize each column vector $u_k^{(i)}$, $i = 1, \ldots, I$, $k = 1, \ldots, K$, of the projection matrices;

C) while the error objective function

$\bar{E} = \frac{1}{2} \left\| \bar{\mathcal{P}} - \sum_{k=1}^{K} u_k^{(1)} \circ u_k^{(2)} \circ \cdots \circ u_k^{(I)} \right\|_F^2$

is greater than a given threshold, repeat the following operations:

● for i = 1 to I, update in turn

$u_k^{(i)} \leftarrow \frac{\| u_k^{(i)} \|_F}{\gamma_k^{(i)} \| u_k^{(i)} \|_F + \lambda_k q_i} \left[ P_{(i)}^{(k)} \{u_k\}_{\odot}^{-i} \right]_+$,

where $\| \cdot \|_F$ denotes the Frobenius norm; $P_{(i)}^{(k)}$ is the mode-i matrix unfolding of the tensor $\bar{\mathcal{P}}^{(k)} = \bar{\mathcal{P}} - \sum_{j=1, j \neq k}^{K} u_j^{(1)} \circ u_j^{(2)} \circ \cdots \circ u_j^{(I)}$; $\{u_k\}_{\odot}^{-i} = u_k^{(I)} \odot \cdots \odot u_k^{(i+1)} \odot u_k^{(i-1)} \odot \cdots \odot u_k^{(1)}$, with $\odot$ the Khatri-Rao product of matrices; and $\lambda_k$ and $q_i$ are weight coefficients between 0 and 1 that regulate the sparsity of the objective function terms;

● if $i \neq 5$, $\gamma_k^{(i)} = u_k^{(I)T} u_k^{(I)}$; if $i = 5$, $\gamma_k^{(i)}$ is given by the corresponding inner product over the remaining factor;

D) when the objective function $\bar{E}$ falls below the threshold, end the loop; the projection matrices $U^{(i)}$, $i = 1, \ldots, I$, are obtained;
(4) Use the projection matrix $U^{(2)}$ corresponding to the frequency domain to perform the feature projection on the multilinear representation $\bar{\mathcal{P}}$ of the speech signal:

$\bar{\mathcal{S}} = \bar{\mathcal{P}} \times_2 U_+^{(2)}$

where $U_+^{(2)}$ is the matrix formed by the nonzero elements of the pseudoinverse of the projection matrix $U^{(2)}$, and $\times_2$ denotes the mode-2 tensor-matrix product of $U_+^{(2)}$ with $\bar{\mathcal{P}}$;
(5) Fix the time mode and apply the tensor unfolding operation to the multilinear sparse representation $\bar{\mathcal{S}}$, obtaining a feature matrix $S_{(f)}$ of size $N_1 \times \hat{N}_1$, where $\hat{N}_1 = N_2 \cdot N_3 \cdot N_4 \cdot N_5$;
(6) Apply the discrete cosine transform to $S_{(f)}$ for decorrelation to obtain the speech emotional feature F, and compute the first-order and second-order difference coefficients of the feature to obtain the final emotional features.
As shown in Fig. 3, the process of speech emotion recognition using the above feature extraction method comprises the following steps (a GMM modeling sketch follows the list):
1) Obtain speech signal data $s_l(t)$, $l = 1, \ldots, L$, carrying different emotion labels, with J emotion classes in total;
2) Extract the features $F_l$ of the different emotions using the feature extraction method shown in Fig. 2;
3) Model the different emotional features with Gaussian mixture models (GMM); through learning and training, obtain the emotion model $M_j$ corresponding to the j-th emotion class, $j = 1, \ldots, J$;
4) When a speech signal of unknown emotion type is to be tested, use the emotion models $M_j$, $j = 1, \ldots, J$, established with the GMM to compute the maximum a posteriori probability of each class in turn; the class with the maximum probability is the emotion recognition result of this speech signal.
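A minimal scikit-learn sketch of this training and testing procedure, with one GMM fit per emotion class on pooled feature frames and the class of highest log-likelihood returned at test time; the helper names and the 8 mixture components are illustrative assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

def train_emotion_models(features_by_class, n_components=8):
    """Step 3): fit one GMM M_j on the pooled features of each emotion class."""
    return {label: GaussianMixture(n_components=n_components).fit(np.vstack(feats))
            for label, feats in features_by_class.items()}

def classify(feature_frames, models):
    """Step 4): score the utterance under every model; return the best class."""
    return max(models, key=lambda label: models[label].score(feature_frames))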
The effect of the present invention can be further illustrated by experiment.
The experiment tested the recognition performance of the proposed feature extraction method on the FAU Aibo dataset, recognizing 4 emotion classes (Anger, Emphatic, Neutral, Rest). The sampling rate of the speech signals in this experiment was 8 kHz; Hamming windowing was applied with a 23 ms window length and a 10 ms window shift; the energy spectrum of the signal was computed by the short-time Fourier transform; Gabor functions with 4 different scales and 4 different directions performed time-frequency convolutional filtering of the energy spectrum; a Mel filterbank of size 36 was used to compute the Mel power spectrum; the projection matrix performed the feature projection on the frequency mode; and the DCT decorrelated the features.
Fig. 4 compares the recognition performance of the proposed method with that of existing feature extraction techniques (MFCC and LFPC features). The final recognition accuracies show that the present invention effectively improves the accuracy of multi-class speech emotion recognition: by 6.1% compared with the traditional MFCC method and by 5.8% compared with the LFPC method.

Claims (2)

1. A speech emotional feature extraction method considering multilinear group sparsity in speech, characterized in that:
the multiple factors contained in a speech signal, namely time, frequency, scale, and direction information, are considered; feature extraction is performed by a multilinear group sparse decomposition method; a multilinear representation of the speech energy spectrum is constructed through Gabor functions of different scales and directions; the feature projection matrices are solved by a group sparse tensor decomposition method; the feature projection on the frequency mode is computed; the features are decorrelated through the discrete cosine transform; and the first-order and second-order difference coefficients of the features are computed; the method specifically comprises the following steps:
(1) acquiring a speech signal s(t), transforming s(t) to the time-frequency domain by the short-time Fourier transform, and obtaining the time-frequency representation S(f, t) and the energy spectrum P(f, t);
(2) convolving the energy spectrum with two-dimensional Gabor functions of different scales and directions, the Gabor function being defined as:

$g_{\bar{k}}(\bar{x}) = \frac{\|\bar{k}\|^2}{\sigma^2} e^{-\|\bar{k}\|^2 \|\bar{x}\|^2 / (2\sigma^2)} \left[ e^{j \bar{k} \cdot \bar{x}} - e^{-\sigma^2/2} \right]$,

where $\bar{x} = (t, f)$ denotes the position at frame t and frequency f in the energy spectrum P(f, t); $\bar{k} = (k_v \cos\varphi, k_v \sin\varphi)$ is the vector controlling the scale and direction of the function; j denotes the imaginary unit; $k_v = 2^{-(v+2)/2}\pi$ and $\varphi = u\pi/K$, with u indexing the direction of the function, v its scale, and K the total number of directions; $\sigma$ is a constant determining the function envelope, set to $2\pi$;
the result of the convolution of the Gabor functions with the energy spectrum P(f, t) is the multilinear representation $\bar{\mathcal{P}}_G$ of the speech signal, a 5th-order tensor whose modes represent time, frequency, direction, scale, and category, respectively; filtering the frequency mode of $\bar{\mathcal{P}}_G$ with a Mel filterbank then yields a new 5th-order tensor $\bar{\mathcal{P}}$ of size $N_1 \times N_2 \times N_3 \times N_4 \times N_5$, the length of mode i being $N_i$, $i = 1, \ldots, 5$;
(3) applying group sparse tensor decomposition to the multilinear representation $\bar{\mathcal{P}}$ and computing the projection matrices $U^{(i)}$, $i = 1, \ldots, 5$, on the different factors for the feature projection, by establishing the following decomposition model:

$\bar{\mathcal{P}} \approx \bar{\Lambda} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)} \times_5 U^{(5)}$

where $U^{(i)}$ is the projection matrix of size $N_i \times K$ obtained from the decomposition, $\bar{\Lambda}$ is a 5th-order tensor of size $K \times K \times K \times K \times K$ whose diagonal elements are 1, and $\times_i$ denotes the mode-i product of a tensor and a matrix, defined as:

$(\bar{\mathcal{X}} \times_i A)_{n_1, \ldots, n_{i-1}, k, n_{i+1}, \ldots, n_M} = \sum_{n_i} \bar{\mathcal{X}}_{n_1, \ldots, n_M} A_{k, n_i}$

where $\bar{\mathcal{X}}$ denotes an M-th order tensor of size $N_1 \times \cdots \times N_M$, A is a matrix of size $K \times N_i$, $\bar{\mathcal{X}}_{n_1, \ldots, n_M}$ is an element of the tensor $\bar{\mathcal{X}}$, and $A_{k, n_i}$ is an element of the matrix A;
(4) using the projection matrix $U^{(2)}$ corresponding to the frequency domain to perform the feature projection on the multilinear representation $\bar{\mathcal{P}}$ of the speech signal:

$\bar{\mathcal{S}} = \bar{\mathcal{P}} \times_2 U_+^{(2)}$

where $U_+^{(2)}$ is the matrix formed by the nonzero elements of the pseudoinverse of the projection matrix $U^{(2)}$, and $\times_2$ denotes the mode-2 tensor-matrix product of $U_+^{(2)}$ with $\bar{\mathcal{P}}$;
(5) fixing the time mode and applying the tensor unfolding operation to the multilinear sparse representation $\bar{\mathcal{S}}$, obtaining a feature matrix $S_{(f)}$ of size $N_1 \times \hat{N}_1$, where $\hat{N}_1 = N_2 \cdot N_3 \cdot N_4 \cdot N_5$;
(6) applying the discrete cosine transform to $S_{(f)}$ for decorrelation to obtain the speech emotional feature F, and computing the first-order and second-order difference coefficients of the feature to obtain the final emotional features.
2. The speech emotional feature extraction method based on multilinear group sparsity according to claim 1, characterized in that the concrete decomposition procedure for computing the projection matrices $U^{(i)}$, $i = 1, \ldots, I$, where i indexes the modes (corresponding to the different factors) and I = 5, is as follows:

1. initialize $U^{(i)} \geq 0$, $i = 1, \ldots, I$, by alternating least squares or at random;

2. normalize each column vector $u_k^{(i)}$, $i = 1, \ldots, I$, $k = 1, \ldots, K$, of the projection matrices;

3. while the error objective function

$\bar{E} = \frac{1}{2} \left\| \bar{\mathcal{P}} - \sum_{k=1}^{K} u_k^{(1)} \circ u_k^{(2)} \circ \cdots \circ u_k^{(I)} \right\|_F^2$

is greater than a given threshold, repeat the following operations:

● for i = 1 to I, update in turn:

$u_k^{(i)} \leftarrow \frac{\| u_k^{(i)} \|_F}{\gamma_k^{(i)} \| u_k^{(i)} \|_F + \lambda_k q_i} \left[ P_{(i)}^{(k)} \{u_k\}_{\odot}^{-i} \right]_+$,

where $\| \cdot \|_F$ denotes the Frobenius norm; $P_{(i)}^{(k)}$ is the mode-i matrix unfolding of the tensor $\bar{\mathcal{P}}^{(k)} = \bar{\mathcal{P}} - \sum_{j=1, j \neq k}^{K} u_j^{(1)} \circ u_j^{(2)} \circ \cdots \circ u_j^{(I)}$; $\{u_k\}_{\odot}^{-i} = u_k^{(I)} \odot \cdots \odot u_k^{(i+1)} \odot u_k^{(i-1)} \odot \cdots \odot u_k^{(1)}$, with $\odot$ the Khatri-Rao product of matrices; and $\lambda_k$ and $q_i$ are weight coefficients between 0 and 1 that regulate the sparsity of the objective function terms;

● if $i \neq 5$, $\gamma_k^{(i)} = u_k^{(I)T} u_k^{(I)}$; if $i = 5$, $\gamma_k^{(i)}$ is given by the corresponding inner product over the remaining factor;

4. when the objective function $\bar{E}$ falls below the threshold, end the loop; the projection matrices $U^{(i)}$, $i = 1, \ldots, I$, are obtained.
CN201210091525.1A 2012-03-31 2012-03-31 Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech Expired - Fee Related CN102592593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210091525.1A CN102592593B (en) 2012-03-31 2012-03-31 Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210091525.1A CN102592593B (en) 2012-03-31 2012-03-31 Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech

Publications (2)

Publication Number Publication Date
CN102592593A true CN102592593A (en) 2012-07-18
CN102592593B CN102592593B (en) 2014-01-01

Family

ID=46481134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210091525.1A Expired - Fee Related CN102592593B (en) 2012-03-31 2012-03-31 Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech

Country Status (1)

Country Link
CN (1) CN102592593B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833918A (en) * 2012-08-30 2012-12-19 四川长虹电器股份有限公司 Emotional recognition-based intelligent illumination interactive method
CN103245376A (en) * 2013-04-10 2013-08-14 中国科学院上海微系统与信息技术研究所 Weak signal target detection method
CN103531199A (en) * 2013-10-11 2014-01-22 福州大学 Ecological sound identification method on basis of rapid sparse decomposition and deep learning
CN103531206A (en) * 2013-09-30 2014-01-22 华南理工大学 Voice affective characteristic extraction method capable of combining local information and global information
CN103825678A (en) * 2014-03-06 2014-05-28 重庆邮电大学 Three-dimensional multi-user multi-input and multi-output (3D MU-MIMO) precoding method based on Khatri-Rao product
CN105047194A (en) * 2015-07-28 2015-11-11 东南大学 Self-learning spectrogram feature extraction method for speech emotion recognition
CN107886942A * 2017-10-31 2018-04-06 东南大学 Voice signal emotion recognition method based on local punishment random spectral regression
CN109060371A * 2018-07-04 2018-12-21 深圳万发创新进出口贸易有限公司 Abnormal sound detection device for auto parts and components

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020135618A1 (en) * 2001-02-05 2002-09-26 International Business Machines Corporation System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input
CN101030316A (en) * 2007-04-17 2007-09-05 北京中星微电子有限公司 Safety driving monitoring system and method for vehicle
CN101404060A (en) * 2008-11-10 2009-04-08 北京航空航天大学 Human face recognition method based on visible light and near-infrared Gabor information amalgamation
US20110034176A1 (en) * 2009-05-01 2011-02-10 Lord John D Methods and Systems for Content Processing

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020135618A1 (en) * 2001-02-05 2002-09-26 International Business Machines Corporation System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input
CN101030316A (en) * 2007-04-17 2007-09-05 北京中星微电子有限公司 Safety driving monitoring system and method for vehicle
CN101404060A (en) * 2008-11-10 2009-04-08 北京航空航天大学 Human face recognition method based on visible light and near-infrared Gabor information amalgamation
US20110034176A1 (en) * 2009-05-01 2011-02-10 Lord John D Methods and Systems for Content Processing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DAHMANE, MOHAMED; MEUNIER, JEAN: "Continuous Emotion Recognition Using Gabor Energy Filters", 《4TH BI-ANNUAL INTERNATIONAL CONFERENCE OF THE HUMAINE ASSOCIATION ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION》, 31 December 2011 (2011-12-31) *
MORALES-PEREZ,M. ET AL: "Feature extraction of speech signals in emotion identification", 《30TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE-ENGINEERING-IN-MEDICINE-AND-BIOLOGY-SOCIETY》, 31 December 2008 (2008-12-31) *
TU, BINBIN; YU, FENGQIN: "Bimodal Emotion Recognition Based on Speech Signals and Facial Expression", 《6TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING》, 31 December 2011 (2011-12-31) *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833918B (en) * 2012-08-30 2015-07-15 四川长虹电器股份有限公司 Emotional recognition-based intelligent illumination interactive method
CN102833918A (en) * 2012-08-30 2012-12-19 四川长虹电器股份有限公司 Emotional recognition-based intelligent illumination interactive method
CN103245376A (en) * 2013-04-10 2013-08-14 中国科学院上海微系统与信息技术研究所 Weak signal target detection method
CN103245376B * 2013-04-10 2016-01-20 中国科学院上海微系统与信息技术研究所 Weak signal target detection method
CN103531206A (en) * 2013-09-30 2014-01-22 华南理工大学 Voice affective characteristic extraction method capable of combining local information and global information
CN103531206B * 2013-09-30 2017-09-29 华南理工大学 Voice affective characteristic extraction method combining local and global information
CN103531199B * 2013-10-11 2016-03-09 福州大学 Ecological sound identification method based on rapid sparse decomposition and deep learning
CN103531199A (en) * 2013-10-11 2014-01-22 福州大学 Ecological sound identification method on basis of rapid sparse decomposition and deep learning
CN103825678B * 2014-03-06 2017-03-08 重庆邮电大学 Three-dimensional multi-user multi-input multi-output (3D MU-MIMO) precoding method based on the Khatri-Rao product
CN103825678A (en) * 2014-03-06 2014-05-28 重庆邮电大学 Three-dimensional multi-user multi-input and multi-output (3D MU-MIMO) precoding method based on Khatri-Rao product
CN105047194A (en) * 2015-07-28 2015-11-11 东南大学 Self-learning spectrogram feature extraction method for speech emotion recognition
CN105047194B * 2015-07-28 2018-08-28 东南大学 Self-learning spectrogram feature extraction method for speech emotion recognition
CN107886942A * 2017-10-31 2018-04-06 东南大学 Voice signal emotion recognition method based on local punishment random spectral regression
CN107886942B (en) * 2017-10-31 2021-09-28 东南大学 Voice signal emotion recognition method based on local punishment random spectral regression
CN109060371A * 2018-07-04 2018-12-21 深圳万发创新进出口贸易有限公司 Abnormal sound detection device for auto parts and components

Also Published As

Publication number Publication date
CN102592593B (en) 2014-01-01

Similar Documents

Publication Publication Date Title
CN102592593B (en) Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech
An et al. Deep CNNs with self-attention for speaker identification
CN106057212B (en) Driving fatigue detection method based on voice personal characteristics and model adaptation
Zhang et al. Robust sound event recognition using convolutional neural networks
CN110457432B (en) Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium
CN102142253B (en) Voice emotion identification equipment and method
CN101923855A Text-independent voiceprint identification system
Jancovic et al. Bird species recognition using unsupervised modeling of individual vocalization elements
CN103985381B Audio indexing method based on parameter fusion and optimal decision-making
CN112259106A (en) Voiceprint recognition method and device, storage medium and computer equipment
CN105895078A Speech recognition method and device for dynamically selecting speech model
CN101930735A (en) Speech emotion recognition equipment and speech emotion recognition method
CN102723079B (en) Music and chord automatic identification method based on sparse representation
CN110222841A (en) Neural network training method and device based on spacing loss function
CN104978507A (en) Intelligent well logging evaluation expert system identity authentication method based on voiceprint recognition
CN103456302A (en) Emotion speaker recognition method based on emotion GMM model weight synthesis
CN105702251A (en) Speech emotion identifying method based on Top-k enhanced audio bag-of-word model
CN101419799A Speaker identification method based on mixed t model
CN103578480B Speech emotion recognition method based on context correction in negative emotion detection
Ranjard et al. Integration over song classification replicates: Song variant analysis in the hihi
Praksah et al. Analysis of emotion recognition system through speech signal using KNN, GMM & SVM classifier
CN106448660A (en) Natural language fuzzy boundary determining method with introduction of big data analysis
CN105006231A (en) Distributed large population speaker recognition method based on fuzzy clustering decision tree
Pan et al. Robust Speech Recognition by DHMM with A Codebook Trained by Genetic Algorithm.
Li et al. Feature extraction with convolutional restricted boltzmann machine for audio classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140101

Termination date: 20170331