当前位置: 代码迷 >> 综合 >> 【线性代数】1.8SVD分解的证明
  详细解决方案

【线性代数】1.8SVD分解的证明

热度:158   发布时间:2023-09-22 07:22:56.0

矩阵的奇异值分解(SVD分解)

  • 0.准备
  • 1.奇异值
  • 2.SVD分解
  • 3.SVD分解的证明
  • 4.例题

0.准备

为了论述矩阵的奇异值与奇异值分解,需要下面的结论:
(1)设A∈Crm×n(r>0)A\in C_r^{m\times n}(r>0)ACrm×n?(r>0),则AHAA^HAAHA是Hermite矩阵,(如果矩阵AAA不包含复数,那么AH=ATA^H=A^TAH=AT)且其特征值均是非负实数;
(2)rank(AHA)=rank(A)rank\left(A^HA\right)=rank\left(A\right)rank(AHA)=rank(A)
(3)设A∈Cm×nA\in C^{m\times n}ACm×n,则A=OA=OA=O的充要条件是AHA=OA^HA=OAHA=O

证明:
(1)设:ATAx=λx(x≠0)A^TAx=\lambda x(x\neq0)ATAx=λx(x??=0),则xTATAx=λxTxx^TA^TAx=\lambda x^TxxTATAx=λxTx,即∥Ax∥2=λ∥x∥2,\left\|Ax\right\|^2=\lambda\left\|x\right\|^2,Ax2=λx2,λ?0\lambda\geqslant0λ?0,同理ATAA^TAATA的特征值也全是非负实数.
(2)证明方程组AHAx=0(1)A^HAx=0(1)AHAx=0(1)Ax=0(2)Ax=0(2)Ax=0(2)同解
???如果α\alphaα是(2)的解,则Aα=0A\alpha=0Aα=0,显然有ATAx=0A^TAx=0ATAx=0,即α\alphaα是(1)的解,故(2)的解全是(1)的解.
???若α\alphaα是(1)的解,即ATAα=0A^TA\alpha=0ATAα=0,那么αTATAα=0\alpha^TA^TA\alpha=0αTATAα=0,即(Aα)T(Aα)=0\left(A\alpha\right)^T\left(A\alpha\right)=0(Aα)T(Aα)=0.从而∥Aα∥2=0\left\|A\alpha\right\|^2=0Aα2=0,故Aα=0A\alpha=0Aα=0.所以α\alphaα必是(2)的解,即 (1)的解全是(2)的解.
???综上所述,方程组(1)与(2)同解,即rank(AHA)=rank(A)rank\left(A^HA\right)=rank\left(A\right)rank(AHA)=rank(A).
(3)充分性:A=O?ATA=OA=O\Rightarrow A^TA=OA=O?ATA=O,成立.
???必要性:将AAA写成列向量形式,A=(α1,α2,α3,?,αn)A=(\alpha_1,\alpha_2,\alpha_3,\cdots,\alpha_n)A=(α1?,α2?,α3?,?,αn?),则
???ATA=[α1Tα2Tα3T?αnT](α1,α2,α3,?,αn)=O?αiTαi=0?αiA^TA=\begin{bmatrix}\alpha_1^T\\\alpha_2^T\\\alpha_3^T\\\vdots\\\alpha_n^T\end{bmatrix}(\alpha_1,\alpha_2,\alpha_3,\cdots,\alpha_n)=O\Rightarrow\alpha_i^T\alpha_i=0\Rightarrow\alpha_iATA=????????α1T?α2T?α3T??αnT??????????(α1?,α2?,α3?,?,αn?)=O?αiT?αi?=0?αi?为零向量,因此矩阵AAA为零矩阵.

1.奇异值

??设A∈Crm×n(r>0)A\in C_r^{m\times n}(r>0)ACrm×n?(r>0)AHAA^HAAHA的特征值为λ1?λ2???λr>λr+1=?=λn=0\lambda_1\geqslant\lambda_2\geqslant\cdots\geqslant\lambda_{r>}\lambda_{r+1}=\cdots=\lambda_n=0λ1??λ2????λr>?λr+1?=?=λn?=0则称σi=λi(i=1,2,?,n)\sigma_i=\sqrt{\lambda_i}(i=1,2,\cdots,n)σi?=λi? ?(i=1,2,?,n)AAA的奇异值;当AAA为零矩阵时,它的奇异值都是0.

2.SVD分解

A∈Crm×n(r>0)A\in C_r^{m\times n}(r>0)ACrm×n?(r>0),则存在mmm阶正交矩阵UUUnnn阶正交矩阵VVV,满足A=U[σ1?σrO]VT=UΣVT=σ1u1v1T+?+σrurvrT\boldsymbol A=\boldsymbol U\begin{bmatrix}\sigma_1&&&\\&\ddots&&\\&&\sigma_r&\\&&&O\end{bmatrix}\boldsymbol V^{\mathbf T}=\boldsymbol U\boldsymbol\Sigma\boldsymbol V^{\mathbf T}=\sigma_1u_1v_1^T+\cdots+\sigma_ru_rv_r^TA=U?????σ1????σr??O??????VT=UΣVT=σ1?u1?v1T?+?+σr?ur?vrT?
UTAV=[ΣOOO]\boldsymbol U^{\mathbf T}\boldsymbol A\boldsymbol V=\begin{bmatrix}\mathbf\Sigma&\mathbf O\\\mathbf O&\mathbf O\end{bmatrix}UTAV=[ΣO?OO?]
其中r=rank(A)r=rank(A)r=rank(A)Σ=diag(σ1,σ1,?,σr)\boldsymbol\Sigma=diag(\sigma_1,\sigma_1,\cdots,\sigma_r)Σ=diag(σ1?,σ1?,?,σr?)U=(u1,?,ur,ur+1,?,um)\boldsymbol U=\left(u_1,\cdots,u_r,u_{r+1},\cdots,u_m\right)U=(u1?,?,ur?,ur+1?,?,um?)V=(v1,?,vr,vr+1,?,vn)\boldsymbol V=\left(v_1,\cdots,v_r,v_{r+1},\cdots,v_n\right)V=(v1?,?,vr?,vr+1?,?,vn?)。习惯上,称σ1?σ2???σr>0\sigma_1\geqslant\sigma_2\geqslant\cdots\geqslant\sigma_r>0σ1??σ2????σr?>0为矩阵AAA的全部非零奇异值.这个分解为奇异值分解,简称SVD.
?
矩阵ATAA^TAATAAATAA^TAAT的特征值λ\lambdaλσi\sigma_iσi?的关系
A=UΣVT?AV=UΣ,写成列向量A=U\Sigma V^T\Rightarrow AV=U\Sigma,\mathrm{写成列向量}A=UΣVT?AV=UΣ
A(v1?vrvr+1?vn)=(u1?urur+1?um)[σ1?σrO]\mathrm A{({\mathrm v}_1\cdots{\mathrm v}_{\mathrm r}{\mathrm v}_{\mathrm r+1}\cdots{\mathrm v}_{\mathrm n})}={({\mathrm u}_1\cdots{\mathrm u}_{\mathrm r}{\mathrm u}_{\mathrm r+1}\cdots{\mathrm u}_{\mathrm m})\begin{bmatrix}{\mathrm\sigma}_1&&&\\&\ddots&&\\&&{\mathrm\sigma}_{\mathrm r}&\\&&&\mathrm O\end{bmatrix}}A(v1??vr?vr+1??vn?)=(u1??ur?ur+1??um?)?????σ1????σr??O??????
同理,
AT(u1?urur+1?um)=(v1?vrvr+1?vn)[σ1?σrO]\mathrm A^{\mathrm T}{({\mathrm u}_1\cdots{\mathrm u}_{\mathrm r}{\mathrm u}_{\mathrm r+1}\cdots{\mathrm u}_{\mathrm m})}={({\mathrm v}_1\cdots{\mathrm v}_{\mathrm r}{\mathrm v}_{\mathrm r+1}\cdots{\mathrm v}_{\mathrm n})\begin{bmatrix}{\mathrm\sigma}_1&&&\\&\ddots&&\\&&{\mathrm\sigma}_{\mathrm r}&\\&&&\mathrm O\end{bmatrix}}AT(u1??ur?ur+1??um?)=(v1??vr?vr+1??vn?)?????σ1????σr??O??????
{Avi=σiuiATui=σivi(i=1,?,r),{Avi=0(i=r+1,?,n)ATui=0(i=r+1,?,m)\left\{\begin{array}{l}Av_i=\sigma_iu_i\\A^Tu_i=\sigma_iv_i\end{array}\right.(i=1,\cdots,r),\left\{\begin{array}{l}Av_i=0(i=r+1,\cdots,n)\\A^Tu_i=0(i=r+1,\cdots,m)\end{array}\right.{ Avi?=σi?ui?ATui?=σi?vi??(i=1,?,r),{ Avi?=0(i=r+1,?,n)ATui?=0(i=r+1,?,m)?
所以
ATAvi=σi2vi?vi是ATA关于σi2的特征向量AATui=σi2ui?ui是AAT关于σi2的特征向量A^TAv_i=\sigma_i^2v_i\Rightarrow v_i是A^TA\mathrm{关于}\sigma_i^2\mathrm{的特征向量}\\AA^Tu_i=\sigma_i^2u_i\Rightarrow u_i是AA^T\mathrm{关于}\sigma_i^2\mathrm{的特征向量}ATAvi?=σi2?vi??vi?ATAσi2?AATui?=σi2?ui??ui?AATσi2?

3.SVD分解的证明

??记Hermite矩阵AHAA^HAAHA的特征值为λ1?λ2???λr>λr+1=?=λn=0\lambda_1\geqslant\lambda_2\geqslant\cdots\geqslant\lambda_{r>}\lambda_{r+1}=\cdots=\lambda_n=0λ1??λ2????λr>?λr+1?=?=λn?=0存在mmm阶正交矩阵VVV,使得
VH(AHA)V=[λ1?λn]=[Σ2OOO]\boldsymbol V^{\mathbf H}\mathbf{(A^HA)}\boldsymbol V=\begin{bmatrix}\lambda_1&&\\&\ddots&\\&&\lambda_n\end{bmatrix}=\begin{bmatrix}\mathbf\Sigma^{\mathbf2}&\mathbf O\\\mathbf O&\mathbf O\end{bmatrix}VH(AHA)V=???λ1????λn?????=[Σ2O?OO?]
V\boldsymbol VV分块为
V=[V1?V2],V1∈Crn×r,V2∈Cn?rn×(n?r)\boldsymbol V=\left[{\boldsymbol V}_{\mathbf1}\vdots{\boldsymbol V}_{\mathbf2}\right],{\boldsymbol V}_{\mathbf1}\in C_r^{n\times r},{\boldsymbol V}_{\mathbf2}\in C_{n-r}^{n\times(n-r)}V=[V1??V2?]V1?Crn×r?V2?Cn?rn×(n?r)?
改写上述式子VH(AHA)V\boldsymbol V^{\mathbf H}\mathbf{(A^HA)}\boldsymbol VVH(AHA)VAHAV=V[Σ2OOO]\boldsymbol A^{\mathbf H}\boldsymbol A\boldsymbol V=\boldsymbol V\begin{bmatrix}\mathbf\Sigma^{\mathbf2}&\mathbf O\\\mathbf O&\mathbf O\end{bmatrix}AHAV=V[Σ2O?OO?]
则有
AHAV1=VΣ2,AHAV2=O\boldsymbol A^{\mathbf H}\boldsymbol A{\boldsymbol V}_{\mathbf1}=\boldsymbol V\boldsymbol\Sigma^{\mathbf2}\boldsymbol,\boldsymbol A^{\mathbf H}\boldsymbol A{\boldsymbol V}_{\mathbf2}\boldsymbol=\boldsymbol OAHAV1?=VΣ2AHAV2?=OAHAV1=VΣ2\boldsymbol A^{\mathbf H}\boldsymbol A{\boldsymbol V}_{\mathbf1}=\boldsymbol V\boldsymbol\Sigma^{\mathbf2}AHAV1?=VΣ2可得,V1HAHAV1=Σ2或者(AV1Σ?1)H(AV1Σ?1)=Ir\boldsymbol V_1^{\mathrm H}\boldsymbol A^{\mathrm H}\boldsymbol A{\boldsymbol V}_1=\boldsymbol\Sigma^2\;\mathrm{或者}\;{({\mathbf{AV}}_1\mathbf\Sigma^{-1})}^{\mathrm H}(\boldsymbol A{\boldsymbol V}_1\mathbf\Sigma^{-1})={\mathbf I}_{\mathrm r}V1H?AHAV1?=Σ2(AV1?Σ?1)H(AV1?Σ?1)=Ir?

AHAV2=O\boldsymbol A^{\mathbf H}\boldsymbol A{\boldsymbol V}_{\mathbf2}\boldsymbol=\boldsymbol OAHAV2?=O可得,(AV2)H(AV2)=O或AV2=O{(\boldsymbol A{\boldsymbol V}_2)}^H\left(\boldsymbol A{\boldsymbol V}_2\right)=\boldsymbol O或\boldsymbol A{\boldsymbol V}_2=\mathbf O(AV2?)H(AV2?)=OAV2?=O
??令U1=AV1Σ?1{\mathbf U}_{\mathbf1}\boldsymbol=\boldsymbol A{\boldsymbol V}_{\mathbf1}\mathbf\Sigma^{\boldsymbol-\mathbf1}U1?=AV1?Σ?1,则U1HU1=Ir\boldsymbol U_{\mathbf1}^{\mathbf H}{\boldsymbol U}_{\mathbf1}\boldsymbol={\boldsymbol I}_{\mathbf r}U1H?U1?=Ir?,即U1U_1U1?rrr个列是两两正交的单位向量,记作U1=(u1,u2,?,ur)\boldsymbol U_1=\left(u_1,u_2,\cdots,u_r\right)U1?=(u1?,u2?,?,ur?).将u1,?,uru_1,\cdots,u_ru1?,?,ur?扩充为CmC^mCm的标准正交基,记增添的向量为ur+1,?,umu_{r+1},\cdots,u_mur+1?,?,um?,并构造矩阵U2=(ur+1,?,um)\boldsymbol U_2=\left(u_{r+1},\cdots,u_m\right)U2?=(ur+1?,?,um?),则
U=[U1?U2]=(u1,u2,?,ur,ur+1,?,um)\boldsymbol U=\left[{\boldsymbol U}_{\mathbf1}\vdots{\boldsymbol U}_{\mathbf2}\right]=\left(u_1,u_2,\cdots,u_r,u_{r+1},\cdots,u_m\right) U=[U1??U2?]=(u1?,u2?,?,ur?,ur+1?,?,um?)
mmm阶正交矩阵,且有
U1HU1=Ir,U2HU1=O\boldsymbol U_1^H{\boldsymbol U}_1={\boldsymbol I}_r,\boldsymbol U_2^H{\boldsymbol U}_1=\boldsymbol O U1H?U1?=Ir?U2H?U1?=O于是可得
UHAV=UH[AV1?AV2]=[U1HU2H][U1Σ?O]=[U1HU1ΣOU2HU1ΣO]=[ΣOOO]证毕\boldsymbol U^H\boldsymbol A\boldsymbol V=\boldsymbol U^H\left[\boldsymbol A{\boldsymbol V}_1\vdots\boldsymbol A{\boldsymbol V}_2\right]=\begin{bmatrix}\boldsymbol U_1^H\\\boldsymbol U_2^H\end{bmatrix}\left[{\boldsymbol U}_1\boldsymbol\Sigma\vdots\boldsymbol O\right]=\begin{bmatrix}\boldsymbol U_1^H{\boldsymbol U}_1\boldsymbol\Sigma&O\\\boldsymbol U_2^H{\boldsymbol U}_1\boldsymbol\Sigma&O\end{bmatrix}=\begin{bmatrix}\mathbf\Sigma&\mathbf O\\\mathbf O&\mathbf O\end{bmatrix}\mathrm{证毕} UHAV=UH[AV1??AV2?]=[U1H?U2H??][U1?Σ?O]=[U1H?U1?ΣU2H?U1?Σ?OO?]=[ΣO?OO?]
改写UTAV=[ΣOOO]\boldsymbol U^{\mathbf T}\boldsymbol A\boldsymbol V=\begin{bmatrix}\mathbf\Sigma&\mathbf O\\\mathbf O&\mathbf O\end{bmatrix}UTAV=[ΣO?OO?]
A=U[ΣOOO]VH\boldsymbol A=\boldsymbol U\begin{bmatrix}\mathbf\Sigma&\mathbf O\\\mathbf O&\mathbf O\end{bmatrix}\boldsymbol V^H A=U[ΣO?OO?]VH称上式为矩阵A的奇异值分解.

4.例题

求矩阵A=[101011000]A=\begin{bmatrix}1&0&1\\0&1&1\\0&0&0\end{bmatrix}A=???100?010?110????的奇异值分解.

方法1

ATA=[101011112]A^TA=\begin{bmatrix}1&0&1\\0&1&1\\1&1&2\end{bmatrix}ATA=???101?011?112????的特征值是λ1=3,λ2=1,λ3=0\lambda_1=3,\lambda_2=1,\lambda_3=0λ1?=3λ2?=1λ3?=0,对应的特征向量依次为
ξ1=[112],ξ2=[1?10],ξ3=[11?1]\xi_1=\begin{bmatrix}1\\1\\2\end{bmatrix},\xi_2=\begin{bmatrix}1\\-1\\0\end{bmatrix},\xi_3=\begin{bmatrix}1\\1\\-1\end{bmatrix}ξ1?=???112????ξ2?=???1?10????ξ3?=???11?1????于是可得
rank(A)=2,Σ=[3001]rank(A)=2,\Sigma=\begin{bmatrix}\sqrt3&0\\0&1\end{bmatrix}rank(A)=2Σ=[3 ?0?01?]存在矩阵VVV,使得矩阵ATAA^TAATA可以相似对角化VT(ATA)V=diag(λ1,λ2,λ3)V^T(A^TA)V=diag(\lambda_1,\lambda_2,\lambda_3)VT(ATA)V=diag(λ1?,λ2?,λ3?)
即矩阵VVVλ1,λ2,λ3\lambda_1,\lambda_2,\lambda_3λ1?,λ2?,λ3?对应的单位特征向量组成,所以V=[ξ1∥ξ1∥ξ2∥ξ2∥ξ3∥ξ3∥]=[16121316?1213260?13]V=\begin{bmatrix}\frac{\xi_1}{\left\|\xi_1\right\|}&\frac{\xi_2}{\left\|\xi_2\right\|}&\frac{\xi_3}{\left\|\xi3\right\|}\end{bmatrix}=\begin{bmatrix}\frac1{\sqrt6}&\frac1{\sqrt2}&\frac1{\sqrt3}\\\frac1{\sqrt6}&-\frac1{\sqrt2}&\frac1{\sqrt3}\\\frac2{\sqrt6}&0&-\frac1{\sqrt3}\end{bmatrix} V=[ξ1?ξ1???ξ2?ξ2???ξ3ξ3???]=????6 ?1?6 ?1?6 ?2??2 ?1??2 ?1?0?3 ?1?3 ?1??3 ?1??????计算
U1=AV1Σ?1=[121212?1200]U_1=AV_1\Sigma^{-1}=\begin{bmatrix}\frac1{\sqrt2}&\frac1{\sqrt2}\\\frac1{\sqrt2}&-\frac1{\sqrt2}\\0&0\end{bmatrix}U1?=AV1?Σ?1=???2 ?1?2 ?1?0?2 ?1??2 ?1?0????由于矩阵UUU是正交矩阵,向量两两正交,构造
U2=[001],U=[U1?U2]=[1212012?120001]U_2=\begin{bmatrix}0\\0\\1\end{bmatrix},U=\left[U_1\vdots U_2\right]=\begin{bmatrix}\frac1{\sqrt2}&\frac1{\sqrt2}&0\\\frac1{\sqrt2}&-\frac1{\sqrt2}&0\\0&0&1\end{bmatrix}U2?=???001????U=[U1??U2?]=???2 ?1?2 ?1?0?2 ?1??2 ?1?0?001????AAA的奇异值分解为
A=U[300010000]VTA=U\begin{bmatrix}\sqrt3&0&0\\0&1&0\\0&0&0\end{bmatrix}V^TA=U???3 ?00?010?000????VT

方法2:

矩阵AAA可分解为A=UΣVT\boldsymbol A=\boldsymbol U\boldsymbol\Sigma\boldsymbol V^{\mathbf T}A=UΣVT,所以只需求出UΣVT\boldsymbol U\boldsymbol\Sigma\boldsymbol V^{\mathbf T}UΣVT
(1)分别求出矩阵ATA\boldsymbol A^{\mathbf T}\boldsymbol AATAAAT\boldsymbol A\boldsymbol A^{\mathbf T}AAT的特征值及对应的特征向量;
(2)矩阵ATA\boldsymbol A^{\mathbf T}\boldsymbol AATAAAT\boldsymbol A\boldsymbol A^{\mathbf T}AAT的特征值为对角矩阵Σ\boldsymbol\SigmaΣ的元素(其实两个矩阵特征值相同),ATA\boldsymbol A^{\mathbf T}\boldsymbol AATAAAT\boldsymbol A\boldsymbol A^{\mathbf T}AAT的特征值对应单位的特征向量为矩阵U\boldsymbol UUV\boldsymbol VV
(3)写SVD分解表达式

??
证明:矩阵ATA\boldsymbol A^{\mathbf T}\boldsymbol AATA(mmm阶)与AAT\boldsymbol A\boldsymbol A^{\mathbf T}AAT(nnn阶)的非零特征值集合相同.

证:r(AAT)=r(AT),r(ATA)=r(A)=r,r(A)=r(AT)r(AA^T)=r(A^T),r(A^TA)=r(A)=r,r(A)=r(A^T)r(AAT)=r(AT)r(ATA)=r(A)=rr(A)=r(AT),故
#{ATA的非零特征值}=r=#{AAT的非零特征值}\#\{A^TA\mathrm{的非零特征值}\}=r=\#\{AA^T\mathrm{的非零特征值}\}#{ ATA}=r=#{ AAT}
λ\lambdaλATAA^TAATA的非零特征值,即?x,使得ATAx=λx\exists x,\mathrm{使得}A^TAx=\lambda x?x使ATAx=λx
则有AATAx=λAx(Ax≠0)AA^TAx=\lambda Ax(Ax\neq0)AATAx=λAx(Ax??=0),故λ\lambdaλ也是AATAA^TAAT的非零特征值,反之亦然
因此,AATAA^TAATATAA^TAATA具有相同非零特征值.

??

解:
方法1中,已经求出矩阵Σ\boldsymbol\SigmaΣV\boldsymbol VV。那么只需求出矩阵U\boldsymbol UU,矩阵AAT\boldsymbol A\boldsymbol A^{\mathbf T}AAT
AAT=[210120000]AA^T=\begin{bmatrix}2&1&0\\1&2&0\\0&0&0\end{bmatrix}AAT=???210?120?000????的特征值为λ1=3,λ2=1,λ3=0\lambda_1=3,\lambda_2=1,\lambda_3=0λ1?=3λ2?=1λ3?=0,对应的特征向量依次为
ξ1=[110],ξ2=[1?10],ξ3=[001]\xi_1=\begin{bmatrix}1\\1\\0\end{bmatrix},\xi_2=\begin{bmatrix}1\\-1\\0\end{bmatrix},\xi_3=\begin{bmatrix}0\\0\\1\end{bmatrix}ξ1?=???110????ξ2?=???1?10????ξ3?=???001????
单位化后,得
U=[1212012?120001]U=\begin{bmatrix}\frac1{\sqrt2}&\frac1{\sqrt2}&0\\\frac1{\sqrt2}&-\frac1{\sqrt2}&0\\0&0&1\end{bmatrix}U=???2 ?1?2 ?1?0?2 ?1??2 ?1?0?001????
AAA的奇异值分解为
A=U[300010000]VTA=U\begin{bmatrix}\sqrt3&0&0\\0&1&0\\0&0&0\end{bmatrix}V^TA=U???3 ?00?010?000????VT

  相关解决方案