NoteJ++: Theory of Generalization-6 Machine Learning Foundations

Monday, July 24, 2017

Theory of Generalization-6 Machine Learning Foundations

Review

break point: minimum data could be shattered
shattered: could represent all dichotomies
dichotomies: number of set could bi-separate the data
m(N): growth function, represent maximum of dichotomies given N

Term

Bounding Function B(N, k): N個點中，任k個不能shattered
combinatorial quantity:
Maximum number of length N vectors with (o,x), while no shatter in any length k subvectors

Bounding Function: The Theorem

從B(4, 3)舉例來看，可以發現分成兩組，（橘色）成對，（紫色）單支
（α部份）
可以直覺的判斷α中(略去x4不看)必須滿足小於B(3,2)的性質。（x1, x2, x3任2個不能被shattered，否則必無法滿足B(3, 3) ）
（α + β部份）
可以直覺的判斷α + β中，必須滿足小於B(3,3)的性質。（x1, x2, x3任3個不能被shattered，否則必無法滿足B(4, 3) ）
下圖舉三個例子：

於是我們找到了一個generalization of upper bound formula，最高的數值大小也不會超過N^(k-1)，和原本的 2^N 相比減少非常多！

VC Bound:

目前已知mH(N)成長函數，但還是無法估計出upper bound。已知E-in是有限的，但是E-out卻是無限的，於是想辦法把E-out轉換成有限，於是推導出VC Bound。
H: Hypothesis Set (可能描述資料的所有函數）
三步驟概略推導VC由來

replace E-out by E-in'
E-in'是D'(validation set)的評估，找有限的資料來作評估依據。這邊因為某些數學的理由，把限制變得更加嚴格 ε -> ε/2。
decompose H by kind
透過成長函數mH(N)來算出總共會有幾種Dichotomies -> mH(2N)
use Hoeffding without replacement
smaller bin, smaller ε

舉例2D perceptrons:

break point = 4
mH(N) < N^(k-1) = N^3
證明了只要N足夠，是可以學習的

NEXT: the-vc-dimension

REF

Theory of Generalization @ Machine Learning Foundations

No comments:

Post a Comment

Subscribe to: Post Comments (Atom)