UCAS Pattern Recognition (Liu Chenglin): Homework Answers


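... the value of $p(x|\omega_i)p(\omega_i)$ does not depend on $\omega_i$, so $P(\text{error}) = (c-1)/c$.

(d) $P(\omega_i|x) = \dfrac{p(x|\omega_i)\,p(\omega_i)}{p(x)}$

1Q2

In many pattern classification problems one has the option either to assign the pattern to one of $c$ classes, or to reject it as being unrecognizable. If the cost for rejects is not too high, rejection may be a desirable action. Let

$$\lambda(\alpha_i|\omega_j) = \begin{cases} 0 & i = j, \quad i, j = 1, \dots, c \\ \lambda_r & i = c + 1 \\ \lambda_s & \text{otherwise} \end{cases}$$

where $\lambda_r$ is the loss incurred for choosing the $(c+1)$th action, rejection, and $\lambda_s$ is the loss incurred for making a substitution error. Show that the minimum risk is obtained if we decide $\omega_i$ if $P(\omega_i|x) \ge P(\omega_j|x)$ for all $j$ and if $P(\omega_i|x) \ge 1 - \lambda_r/\lambda_s$, and reject otherwise. What happens if $\lambda_r = 0$? What happens if $\lambda_r > \lambda_s$?

Answer: Consider the risk of assigning sample $x$ to class $\omega_i$:

$$R_i = \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j) P(\omega_j|x) = 0 \times P(\omega_i|x) + \sum_{j \ne i} \lambda_s P(\omega_j|x) = \lambda_s [1 - P(\omega_i|x)]$$

Among the $c$ classes, $R_i$ is smallest for the class with the largest posterior, which gives the first condition. If the sample is rejected instead, the risk is

$$R_r = \lambda_r$$

When $P(\omega_i|x) \ge 1 - \lambda_r/\lambda_s$, compare $R_i$ and $R_r$:

$$R_i - R_r = \lambda_s [1 - P(\omega_i|x)] - \lambda_r \le \lambda_s \left(1 + \frac{\lambda_r}{\lambda_s} - 1\right) - \lambda_r = 0$$

That is, $R_i - R_r \le 0$: the risk of deciding $\omega_i$ does not exceed the risk of rejecting, so we decide $\omega_i$. Otherwise rejecting has the lower risk.

When $\lambda_r = 0$, rejection carries no risk while classification carries a non-negative risk, so every sample is rejected (unless some posterior equals 1).

When $\lambda_r > \lambda_s$, we have $R_r = \lambda_r > \lambda_s \ge R_i = \lambda_s[1 - P(\omega_i|x)]$, that is, $R_r > R_i$. The risk of rejecting always exceeds the risk of classifying, so the reject option is never used.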
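The decision rule derived in 1Q2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decide_with_reject`, the use of `-1` as the reject label, and the example posteriors are hypothetical and not part of the original answer.

```python
from typing import Sequence


def decide_with_reject(posteriors: Sequence[float], lam_r: float, lam_s: float) -> int:
    """Return the index of the chosen class, or -1 to reject (hypothetical label).

    posteriors: P(omega_j | x) for j = 0..c-1, assumed to sum to 1.
    lam_r: loss for rejecting; lam_s: loss for a substitution error.
    """
    # The class with the largest posterior has the smallest classification risk.
    best = max(range(len(posteriors)), key=lambda j: posteriors[j])
    risk_best = lam_s * (1.0 - posteriors[best])  # R_i = lam_s * [1 - P(omega_i|x)]
    risk_reject = lam_r                           # R_r = lam_r
    # Decide omega_i when R_i <= R_r, i.e. P(omega_i|x) >= 1 - lam_r/lam_s.
    return best if risk_best <= risk_reject else -1


if __name__ == "__main__":
    print(decide_with_reject([0.9, 0.05, 0.05], lam_r=0.2, lam_s=1.0))  # confident: class 0
    print(decide_with_reject([0.7, 0.2, 0.1], lam_r=0.2, lam_s=1.0))    # below 0.8: reject
    print(decide_with_reject([0.4, 0.3, 0.3], lam_r=1.5, lam_s=1.0))    # lam_r > lam_s: never reject
```

Comparing the two risks directly, rather than testing the posterior against $1 - \lambda_r/\lambda_s$, avoids a division and behaves sensibly in the two limiting cases discussed above.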

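1Q3

Now we have $N$ samples, and each sample $x_i$, $i = 1, \dots, N$, has $d$ dimensions. Please provide the proofs and the pseudo-code of the PCA algorithm.

Answer: The goal of PCA is dimensionality reduction: project the samples into a lower-dimensional space so that the variance of the projected samples is as large as possible.

Pseudo-code:

1. Center the data and form the data matrix $X = (x_1^c, x_2^c, \dots, x_N^c)$, where

$$x_i^c = x_i - \mu, \quad x_i \in \mathbb{R}^{d \times 1}, \quad i = 1, \dots, N, \qquad \mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

2. Compute the covariance (scatter) matrix

$$\Sigma = X X^T$$

3. Compute the eigenvalues of $\Sigma$, sort them from largest to smallest as $\lambda_1, \lambda_2, \dots, \lambda_d$, and collect them in a diagonal matrix; the corresponding eigenvectors form a $d \times d$ matrix:

$$\Lambda = \mathrm{Diag}(\lambda_1, \lambda_2, \dots, \lambda_d), \qquad \Phi = (\varphi_1, \varphi_2, \dots, \varphi_d)$$

4. Select the eigenvectors belonging to the $m$ largest eigenvalues to form the projection matrix

$$W = (\varphi_1, \varphi_2, \dots, \varphi_m)$$

5. Project all the data down to $m$ dimensions:

$$Z = W^T X$$

Proof: PCA looks for projection directions along which the data have large variance; the direction with the largest variance is the first principal component, the next largest the second, and so on. Suppose we choose a projection vector $\varphi_j$. The projection of $x_i$ onto $\varphi_j$ is $\varphi_j^T x_i$, and the variance along $\varphi_j$ is

$$E_j = \varphi_j^T X X^T \varphi_j = \varphi_j^T \Sigma \varphi_j$$

The optimization goal is to find the $\varphi_j$ that maximizes $E_j$ subject to the constraint $\varphi_j^T \varphi_j - 1 = 0$. Using a Lagrange multiplier, the objective becomes

$$J = \varphi_j^T \Sigma \varphi_j - \lambda_j (\varphi_j^T \varphi_j - 1)$$

Setting the derivative of $J$ with respect to $\varphi_j$ to zero gives

$$\Sigma \varphi_j - \lambda_j \varphi_j = 0$$

so $\varphi_j$ satisfies the eigenvector condition for $\Sigma$. Substituting back gives $E_j = \lambda_j$, so the variance is maximized by the eigenvectors with the largest eigenvalues.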
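The PCA steps above can be sketched with NumPy as follows. This is a minimal illustration rather than the course's reference solution: the function name `pca`, the column-per-sample layout, and the synthetic data are assumptions, and the scatter matrix is left unnormalized to match $\Sigma = XX^T$ in the text.

```python
import numpy as np


def pca(X: np.ndarray, m: int):
    """PCA on a d x N array whose columns are samples (assumed layout).

    Returns the projection matrix W (d x m), the projected data Z (m x N),
    and all eigenvalues sorted from largest to smallest.
    """
    mu = X.mean(axis=1, keepdims=True)     # sample mean, shape (d, 1)
    Xc = X - mu                            # step 1: center the data
    scatter = Xc @ Xc.T                    # step 2: Sigma = X X^T (no 1/N factor)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(scatter)
    order = np.argsort(eigvals)[::-1]      # step 3: sort from largest to smallest
    eigvals = eigvals[order]
    W = eigvecs[:, order[:m]]              # step 4: top-m eigenvectors
    Z = W.T @ Xc                           # step 5: project to m dimensions
    return W, Z, eigvals


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 3-D data with very different spreads along each axis.
    X = rng.normal(size=(3, 50)) * np.array([[3.0], [1.0], [0.1]])
    W, Z, eigvals = pca(X, m=2)
    print(W.shape, Z.shape)
    # The scatter captured along each kept direction equals its eigenvalue.
    print(np.allclose(np.diag(Z @ Z.T), eigvals[:2]))
```

The final check mirrors the result $E_j = \lambda_j$ from the proof: the scatter of the projected data along $\varphi_j$ equals the corresponding eigenvalue.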

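1Q4

Consider the following decision rule for a two-category one-dimensional problem: Decide $\omega_1$ if $x > \theta$; otherwise decide $\omega_2$.

(a) Show the probability of error for this rule is given by

$$P(\text{error}) = P(\omega_1) \int_{-\infty}^{\theta} p(x|\omega_1)\,dx + P(\omega_2) \int_{\theta}^{\infty} p(x|\omega_2)\,dx$$

(b) By differentiating, show that a necessary condition to minimize $P(\text{error})$ is that $\theta$ satisfy $p(\theta|\omega_1)P(\omega_1) = p(\theta|\omega_2)P(\omega_2)$.

(c) Does this equation define $\theta$ uniquely?

(d) Give an example where a value of $\theta$ satisfying the equation actually maximizes the probability of error.

Answer:

(a) When $x \le \theta$ we decide $\omega_2$, so the error probability on this side is the probability that $x$ lies in $(-\infty, \theta]$ and belongs to $\omega_1$:

$$P_1(\text{error}) = \int_{-\infty}^{\theta} P(\omega_1|x)\,p(x)\,dx = \int_{-\infty}^{\theta} \frac{p(x, \omega_1)}{p(x)}\,p(x)\,dx = \int_{-\infty}^{\theta} \frac{p(x|\omega_1)P(\omega_1)}{p(x)}\,p(x)\,dx = P(\omega_1) \int_{-\infty}^{\theta} p(x|\omega_1)\,dx$$

Similarly, $P_2(\text{error}) = P(\omega_2) \int_{\theta}^{\infty} p(x|\omega_2)\,dx$, so

$$P(\text{error}) = P(\omega_1) \int_{-\infty}^{\theta} p(x|\omega_1)\,dx + P(\omega_2) \int_{\theta}^{\infty} p(x|\omega_2)\,dx$$

(b) Differentiating with respect to $\theta$:

$$\frac{dP(\text{error})}{d\theta} = P(\omega_1)\,p(\theta|\omega_1) - P(\omega_2)\,p(\theta|\omega_2)$$

Setting the derivative to zero gives the stated necessary condition.

(c) No, $\theta$ is not determined uniquely. The condition only requires

$$\frac{p(\theta|\omega_1)}{p(\theta|\omega_2)} = \frac{P(\omega_2)}{P(\omega_1)}$$

and more than one value of $\theta$ may satisfy it.

(d) The rule decides $\omega_2$ when $x \le \theta$ and $\omega_1$ otherwise. If most of the probability mass of $\omega_2$ lies in $(\theta, \infty)$ and most of the probability mass of $\omega_1$ lies in $(-\infty, \theta)$, then a $\theta$ satisfying the equation gives the largest error rate for this rule.

1Q5

Consider the multivariate normal density for which $\sigma_{ij} = 0$ and $\sigma_{ii} = \sigma_i^2$, i.e., $\Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_d^2)$.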

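(a) Show that the evidence is

$$p(x) = \frac{1}{\prod_{i=1}^{d} \sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{1}{2} \sum_{i=1}^{d} \left(\frac{x_i - \mu_i}{\sigma_i}\right)^2\right]$$

(b) Plot and describe the contours of constant density.

(c) Write an expression for the Mahalanobis distance from $x$ to $\mu$.

Answer:

(a) By definition,

$$p(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right]$$

Substituting the conditions of the problem gives

$$p(x) = \frac{1}{(2\pi)^{d/2} \prod_{i=1}^{d} \sigma_i} \exp\left[-\frac{1}{2} \sum_{i=1}^{d} \left(\frac{x_i - \mu_i}{\sigma_i}\right)^2\right]$$

(b) Take two vectors $x^1$ and $x^2$ with $p(x^1) = p(x^2)$. Then

$$\sum_{i=1}^{d} \left(\frac{x_i^1}{\sigma_i} - \frac{\mu_i}{\sigma_i}\right)^2 = \sum_{i=1}^{d} \left(\frac{x_i^2}{\sigma_i} - \frac{\mu_i}{\sigma_i}\right)^2$$

That is, after each coordinate is divided by $\sigma_i$, the points $[x_1^1, x_2^1, \dots, x_d^1]$ and $[x_1^2, x_2^2, \dots, x_d^2]$ are equally far from the point $[\mu_1/\sigma_1, \dots, \mu_d/\sigma_d]$. The contours are therefore ellipses in two dimensions and ellipsoids in three dimensions, centered at $\mu$, with axes parallel to the coordinate axes and lengths proportional to $\sigma_i$.

(c) The squared Mahalanobis distance is

$$r^2 = \sum_{i=1}^{d} \left(\frac{x_i - \mu_i}{\sigma_i}\right)^2$$

so the distance from $x$ to $\mu$ is $r = \sqrt{\sum_{i=1}^{d} \left((x_i - \mu_i)/\sigma_i\right)^2}$.

1Q6

Let $p(x|\omega_i) \sim N(\mu_i, \sigma^2 I)$ for a two-category $d$-dimensional problem with $P(\omega_1) = P(\omega_2) = \frac{1}{2}$.

(a) Show that the minimum probability of error is given by

$$P_e = \frac{1}{\sqrt{2\pi}} \int_{a}^{\infty} e^{-u^2/2}\,du, \quad \text{where } a = \frac{\|\mu_2 - \mu_1\|}{2\sigma}$$

(b) Let $\mu_1 = 0$ and $\mu_2 = (\mu_1, \dots, \mu_d)^t$. Use the inequality from [Pattern Classification, Chapter 2, Problem 31] to show that $P_e$ approaches zero as the dimension $d$ approaches infinity.

(c) Express the meaning of this result in words.

Answer:

(a)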

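Let $R_1$ be the region where we decide $\omega_1$ and $R_2$ the region where we decide $\omega_2$. Then

$$P(\text{error}) = \int_{R_1} P(\omega_2|x)\,p(x)\,dx + \int_{R_2} P(\omega_1|x)\,p(x)\,dx = \int_{R_1} \frac{p(\omega_2, x)}{p(x)}\,p(x)\,dx + \int_{R_2} \frac{p(\omega_1, x)}{p(x)}\,p(x)\,dx$$

$$= \int_{R_1} p(x|\omega_2)P(\omega_2)\,dx + \int_{R_2} p(x|\omega_1)P(\omega_1)\,dx = \frac{1}{2} \int_{R_1} p(x|\omega_2)\,dx + \frac{1}{2} \int_{R_2} p(x|\omega_1)\,dx = \int_{R_1} p(x|\omega_2)\,dx$$

where the last step uses the symmetry of the two classes. Under the conditions of the problem, the decision rule is: decide $\omega_1$ if $x$ is closer to $\mu_1$, and decide $\omega_2$ if $x$ is closer to $\mu_2$. Consider the case where we decide $\omega_1$:

$$\sum_i \left[(x_i - \mu_i^1)^2 - (x_i - \mu_i^2)^2\right] = \sum_i \left[2(\mu_i^2 - \mu_i^1)x_i - (\mu_i^2 - \mu_i^1)(\mu_i^2 + \mu_i^1)\right] < 0$$

which simplifies to

$$y = (\mu_2 - \mu_1)^T x < \frac{1}{2}\left(\mu_2^T \mu_2 - \mu_1^T \mu_1\right)$$

The scalar $y$ is normally distributed. Under the two classes its means are $(\mu_2 - \mu_1)^T \mu_1$ and $(\mu_2 - \mu_1)^T \mu_2$ respectively, and its variance is $\sigma^2 \|\mu_2 - \mu_1\|^2$. The problem can therefore be treated in one dimension:

$$P(\text{error}) = \int_{R_1} p(x|\omega_2)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma \|\mu_2 - \mu_1\|} \int_{-\infty}^{\frac{(\mu_2 - \mu_1)^T \mu_1 + (\mu_2 - \mu_1)^T \mu_2}{2}} \exp\left(-\frac{(y - (\mu_2 - \mu_1)^T \mu_2)^2}{2\sigma^2 \|\mu_2 - \mu_1\|^2}\right) dy$$

Substituting $u = \dfrac{y - (\mu_2 - \mu_1)^T \mu_2}{\sigma \|\mu_2 - \mu_1\|}$ for $y$ turns the upper limit into $-\|\mu_2 - \mu_1\|/(2\sigma)$, and by the symmetry of the standard normal density

$$P(\text{error}) = \frac{1}{\sqrt{2\pi}} \int_{\frac{\|\mu_2 - \mu_1\|}{2\sigma}}^{\infty} \exp\left(-\frac{u^2}{2}\right) du$$

(b) Combining this with the inequality from the problem statement,

$$P_e \le \frac{1}{\sqrt{2\pi}\,\frac{\|\mu_2 - \mu_1\|}{2\sigma}} \exp\left(-\frac{1}{2}\left(\frac{\|\mu_2 - \mu_1\|}{2\sigma}\right)^2\right)$$

and with $\mu_1 = 0$,

$$P_e \le \frac{2\sigma}{\sqrt{2\pi}\,\|\mu_2\|} \exp\left(-\frac{\|\mu_2\|^2}{8\sigma^2}\right)$$

As the dimension goes to infinity, the norm of the vector $\mu_2$ goes to infinity, and $P_e$ goes to zero.

(c) In words: the higher the dimension of the data, the easier the classes are to separate.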
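The behaviour in 1Q6 (b) can be checked numerically. The sketch below uses only Python's standard library and evaluates both the exact error $P_e$ and the upper bound for growing $d$. It assumes, purely for illustration, that every component of $\mu_2$ equals the same hypothetical value `mu` and that $\sigma = 1$, so that $\|\mu_2\| = \texttt{mu}\sqrt{d}$; the function names are likewise made up.

```python
import math


def bayes_error(a: float) -> float:
    """P_e = (1/sqrt(2*pi)) * integral from a to infinity of exp(-u^2/2) du."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


def tail_bound(a: float) -> float:
    """Upper bound exp(-a^2/2) / (sqrt(2*pi) * a), valid for a > 0."""
    return math.exp(-a * a / 2.0) / (math.sqrt(2.0 * math.pi) * a)


def half_distance(d: int, mu: float = 1.0, sigma: float = 1.0) -> float:
    """a = ||mu_2 - mu_1|| / (2*sigma) with mu_1 = 0 and mu_2 = (mu, ..., mu).

    Equal components are an assumption made only for this illustration.
    """
    return mu * math.sqrt(d) / (2.0 * sigma)


if __name__ == "__main__":
    for d in (1, 4, 16, 64, 256):
        a = half_distance(d)
        print(d, bayes_error(a), tail_bound(a))
```

Under these assumptions both columns shrink rapidly as $d$ grows, and the exact error always stays below the bound.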

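2Q1

Let the sample mean $\hat{\mu}_n$ and the sample covariance matrix $C_n$ for a set of $n$ samples $x_1, \dots, x_n$ (each of which is $d$-dimensional) be defined by

$$\hat{\mu}_n = \frac{1}{n} \sum_{k=1}^{n} x_k \quad \text{and} \quad C_n = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \hat{\mu}_n)(x_k - \hat{\mu}_n)^t$$

We call these the 'non-recursive' formulas.

(a) What is the computational complexity of calculating $\hat{\mu}_n$ and $C_n$ by these formulas?

(b) Show that alternative, 'recursive' techniques for calculating $\hat{\mu}_n$ and $C_n$ based on the successive addition of new samples $x_{n+1}$ can be derived using the recursion relations:

$$\hat{\mu}_{n+1} = \hat{\mu}_n + \frac{1}{n+1}(x_{n+1} - \hat{\mu}_n) \quad \text{and} \quad C_{n+1} = \frac{n-1}{n} C_n + \frac{1}{n+1}(x_{n+1} - \hat{\mu}_n)(x_{n+1} - \hat{\mu}_n)^t$$

(c) What is the computational complexity of finding $\hat{\mu}_n$ and $C_n$ by these recursive methods?

Answer:

(a) Computing the mean takes $n$ additions over $n$ samples, each over $d$ dimensions, so the time complexity is $O(dn)$. Computing the covariance matrix $C_n$ takes $d^2$ operations per sample over $n$ samples, so the time complexity is $O(nd^2)$.

(b) Derivation of the recursion for $\hat{\mu}_{n+1}$:

$$\hat{\mu}_{n+1} = \frac{1}{n+1} \sum_{i=1}^{n+1} x_i = \frac{1}{n+1}\left(x_{n+1} + \sum_{i=1}^{n} x_i\right) = \frac{1}{n+1}\left(x_{n+1} + n \times \frac{1}{n} \sum_{i=1}^{n} x_i\right)$$

$$= \frac{1}{n+1} x_{n+1} + \frac{n}{n+1} \hat{\mu}_n = \hat{\mu}_n + \frac{1}{n+1}(x_{n+1} - \hat{\mu}_n)$$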
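The two recursion relations stated in part (b) can be checked against the non-recursive formulas with a short NumPy sketch. The function names `update` and `running_stats` and the random test data are illustrative assumptions. Starting the recursion at $n = 1$ with a zero matrix in place of $C_1$ is also an assumption of this sketch: $C_1$ itself is undefined, but its coefficient $(n-1)/n$ is zero at that step.

```python
import numpy as np


def update(mean: np.ndarray, cov: np.ndarray, n: int, x_new: np.ndarray):
    """Fold sample n+1 into the mean and covariance of the first n samples."""
    diff = x_new - mean                                        # x_{n+1} - mu_n
    new_mean = mean + diff / (n + 1)                           # O(d) work
    new_cov = (n - 1) / n * cov + np.outer(diff, diff) / (n + 1)  # O(d^2) work
    return new_mean, new_cov


def running_stats(samples: np.ndarray):
    """Recursive mean and covariance of an N x d array (rows are samples)."""
    samples = np.asarray(samples, dtype=float)
    d = samples.shape[1]
    mean = samples[0].copy()
    cov = np.zeros((d, d))       # placeholder for C_1; multiplied by 0 at n = 1
    for n in range(1, len(samples)):
        mean, cov = update(mean, cov, n, samples[n])
    return mean, cov


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(20, 3))
    mean, cov = running_stats(X)
    # Compare with the non-recursive formulas (np.cov divides by n - 1 by default).
    print(np.allclose(mean, X.mean(axis=0)))
    print(np.allclose(cov, np.cov(X, rowvar=False)))
```

Each call to `update` touches only the new sample, which is where the per-sample costs $O(d)$ and $O(d^2)$ in part (c) come from.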

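Derivation of the recursion $C_{n+1} = \frac{n-1}{n} C_n + \frac{1}{n+1}(x_{n+1} - \hat{\mu}_n)(x_{n+1} - \hat{\mu}_n)^t$ (writing $\mu_n$ for $\hat{\mu}_n$):

$$C_{n+1} = \frac{1}{n} \sum_{k=1}^{n+1} (x_k - \mu_{n+1})(x_k - \mu_{n+1})^T$$

$$= \frac{1}{n} \sum_{k=1}^{n} (x_k - \mu_{n+1})(x_k - \mu_{n+1})^T + \frac{1}{n}(x_{n+1} - \mu_{n+1})(x_{n+1} - \mu_{n+1})^T$$

$$= \frac{1}{n} \sum_{k=1}^{n} \left[x_k - \mu_n - \frac{1}{n+1}(x_{n+1} - \mu_n)\right]\left[x_k - \mu_n - \frac{1}{n+1}(x_{n+1} - \mu_n)\right]^T + \frac{1}{n}\left[x_{n+1} - \mu_n - \frac{1}{n+1}(x_{n+1} - \mu_n)\right]\left[x_{n+1} - \mu_n - \frac{1}{n+1}(x_{n+1} - \mu_n)\right]^T$$

$$= \frac{1}{n} \sum_{k=1}^{n} (x_k - \mu_n)(x_k - \mu_n)^T - \frac{1}{n(n+1)}(x_{n+1} - \mu_n) \sum_{k=1}^{n} (x_k - \mu_n)^T - \frac{1}{n(n+1)} \sum_{k=1}^{n} (x_k - \mu_n)(x_{n+1} - \mu_n)^T$$

$$\quad + \frac{1}{(n+1)^2}(x_{n+1} - \mu_n)(x_{n+1} - \mu_n)^T + \frac{n}{(n+1)^2}(x_{n+1} - \mu_n)(x_{n+1} - \mu_n)^T$$

Here $\sum_{k=1}^{n} (x_k - \mu_n)$ and $\sum_{k=1}^{n} (x_k - \mu_n)^T$ are zero vectors, so the second and third terms vanish, leaving

$$C_{n+1} = \frac{n-1}{n} C_n + \frac{1}{n+1}(x_{n+1} - \mu_n)(x_{n+1} - \mu_n)^T$$

(c) Updating $\mu$ with a single sample takes $d$ additions, so the time complexity is $O(d)$. Updating $C$ with a single sample takes $d^2$ operations, so the time complexity is $O(d^2)$.

2Q2

In Pattern Classification, Chapter 3, Page 77, we have

$$f(\sigma, \sigma_n) = \int \exp\left[-\frac{1}{2} \frac{\sigma^2 + \sigma_n^2}{\sigma^2 \sigma_n^2} \left(\mu - \frac{\sigma_n^2 x + \sigma^2 \mu_n}{\sigma^2 + \sigma_n^2}\right)^2\right] d\mu$$

Now please calculate $f(\sigma, \sigma_n)$, give the final result of $f$, and explain that $f$ has nothing to do with $x$.

Answer:

$$f(\sigma, \sigma_n) = \int \exp\left[-\frac{1}{2} \frac{\sigma^2 + \sigma_n^2}{\sigma^2 \sigma_n^2} \left(\mu - \frac{\sigma_n^2 x + \sigma^2 \mu_n}{\sigma^2 + \sigma_n^2}\right)^2\right] d\mu$$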
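The extracted text stops at this point. As a hedged sketch of how the calculation can be finished, assume the integral runs over the whole real line, $\mu \in (-\infty, \infty)$ (the limits are not visible in the extracted text). The shorthand symbols $A$, $m$ and $t$ below are introduced only for this illustration. The integrand is then an unnormalized Gaussian in $\mu$, and a shift of variable removes $x$:

```latex
% Shorthand (assumed notation, not from the original):
%   A = (\sigma^2 + \sigma_n^2) / (\sigma^2 \sigma_n^2),
%   m = (\sigma_n^2 x + \sigma^2 \mu_n) / (\sigma^2 + \sigma_n^2)
\begin{aligned}
f(\sigma, \sigma_n)
  &= \int_{-\infty}^{\infty} \exp\!\left[-\tfrac{1}{2} A (\mu - m)^2\right] d\mu
   && \text{(assumed limits)} \\
  &= \int_{-\infty}^{\infty} \exp\!\left[-\tfrac{1}{2} A t^2\right] dt
   && (t = \mu - m) \\
  &= \sqrt{\frac{2\pi}{A}}
   = \sqrt{2\pi}\, \frac{\sigma \sigma_n}{\sqrt{\sigma^2 + \sigma_n^2}}
\end{aligned}
```

Under that assumption, $x$ enters only through the shift $m$, and shifting the integration variable does not change an integral over the whole real line, so the result depends on $\sigma$ and $\sigma_n$ alone.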
