NoteJ++: July 2017

Sunday, July 30, 2017

Noise and Error-8 Machine Learning Foundations

Why?

延伸之前推論出的VC Dimension，嘗試把適用範圍拓展到不同的問題上。
雜訊問題：可能答案標錯，可能資料本身有錯，都是雜訊
機器學習就像從瓶子裡面拿彈珠，以前彈珠好壞比例固定（deterministic），雜訊存在時彈珠好壞則是隨機的（stochastic）有可能同一個判斷（hypothesis）有時好有時壞，但是依然可以推論VC Dimension是通用的，有雜訊時也是可以學習的。
換一種角度，直覺的思考：
如果今天好的資料機率0.7，壞的資料0.3（stochastic），那我們可以找出一個mini-target，根據大部分資料的特性來當作學習目標，此時壞的資料就當作雜訊。

Error Measure:

g: hypothesis function
f: target function
先前討論學的好不好是透過評估E-out: g在沒看過得資料發生bad event的數量
比較常用的是pointwise評估方法
譬如說分類和逼近兩種的評估方法就不同，並且定義了mini-target f function。
classification: f = argmax(P(y|x))
regression: f = E[P(y|x)]

Weighted Classification

很多時候error會根據不同目標而不同，在選擇學習演算法時，通常我們會選擇一個可行的或者友善的來學習。可行的是指我們對那個data domain的知識人為判斷，友善的是指針對資料特性找出能快速降低錯誤的方法。(err-head: 是我們設計的error measures)
上圖在討論一個直覺的想法，先前以證明過PLA是可以學習的，但現在學習演算法改變，出現了權重來影響學習，此時可以透過virtual copy來達到reduction效果（把一個問題推論另外一個問題，此時就可以證明在條件下可通用，這是很常看到的方法）
另外值得一提的是，weighted classification也可以用來解決data unbalance的問題（譬如說罹患癌症的比例在資料中非常少）

REF

Noise and Error @ Machine Learning Foundations

Saturday, July 29, 2017

[Interview] @ Synology, Software Engineer

20170729 @ Synology, Software Engineer

面試個人優劣自省：

優：

有一年半學生新創經驗
專案管理帶開發團隊經驗
App實務開發專案經驗
研究相關完整專案經驗
113血統

劣：

code寫太少，敏感度不夠
興趣領域背景知識不足（接觸機器學習不到一年）
大學基礎科目學習狀況差
以上導致自信心不足

第一關：兩位面試官（10:15 ~ 11:45）

簡單自我介紹（用自備電腦投影片）
詢問自我介紹中的專案細節，管理經驗。

處理不配合的成員經驗
專案的特點和貢獻點
習慣的語言

口頭問答（統計在最下方）
白板題：

把兩個排序好的array按照大小順序merge起來[1,3,5,8,19]
[4,5,7,9,18]
我寫了一個不太妙的O(N^2)解法，過程中有些bug，完成答題。
如何降低複雜度
經考官友善提示，完成了O(N)的解法...。

ＱＡ時間

第二關：一位面試官（12:00 ~ 13:30）

5分鐘自我介紹
詢問相關經驗的最大困難
口頭問答（統計在最下方）

白板題：

實作一個Matrix, 有add, get, transpose function.

add: O(n^2)
get: O(1)
transpose: O(n^2)

各種提升效率的方法

transpose如何變成O(1)，當然其他function也會被其影響。
（在小提示後完成作答）
如果今天維護一個private member，維護著整個matrix有值的部分（sparse representation），請把add和transpose優化成O(N)。這部分有點慘整個卡死，一直卡在最近處理DL資料議題的sparse特性，還有linked list & array特性之類的想法上，最後考官公佈答案後結束...

ＱＡ時間

第三關：兩位面試官（14:00 ~ 15:30）

3分鐘自我介紹
過去專案經驗負責的內容
口頭問答（統計在最下方）
白板題：

給一維度兩條線，請判斷是否重疊？
給一維度N條線，請彈斷是否重疊？

我發想了一個O(N)的解法，但有點過度複雜化問題...
經過重重提示還是卡住，考官最後公佈答案...
（離答案只差一毫米那種，白板就是有壓力）

技術專長詢問

從原始資料到完成機器學習目標的整個過程
如何評估資料量是否充足
如何處理遺失的資料
如何選擇深度學習模型

ＱＡ時間

這時有聊到我的興趣和公司目前之缺內容有出入的問題...，這讓我意識到其實我履歷上的確太執著機器學習相關領域了，多接觸不同領域應該是很正常的選擇。工作和生活本來就都是這樣才正常～～。
期待公司有什麼樣的職缺

第四關：HR時間（15:40 ~ 16:00）

簡單自我介紹
為什麼選研替
履歷和自傳的一些經歷分享討論
期待公司有什麼樣的職缺
覺得自己個性的缺點
ＱＡ時間

口頭問答統計：

process和thread差別
race conditions怎麼解
static的用法
virtual是什麼
auto變數的優缺點
UDP和TCP之間的差別
Synchronous call跟Asynchronous call差別
Stack和Queue差別
linked list和array差別
c++, java, python 編譯上的差異
heap, stack, code, text這些section的差別
atomic用法
習慣的語言有哪些
static 的用法
什麼是register
...

面試結果：

三關白板面試狀況：第三關>第一關>第二關（順利程度）
等待二面，說太早來面試了，所以要等約一兩個月內會通知！！！這跟網路上的分享是不同的，再三和人資確認過應該不是安慰的話，只知道這次研發替代役安排會比較其他面試者後再決定，目前判斷起來就是備取吧，希望有機會囉～！

感想：

群暉面試真的好長，但真的很用心，可以看得出來他們對人才的重視，考官們除了技術底子好以外，我發現更多的是優秀的人格特質，絕不是那種超級黑客脫離現實的傳說人物，更多的是尊重。
這是研替的第二間面試公司，連續一開始就挑戰兩間超高難度的公司真的不是很好的策略，不得不說這兩間的人資或是人才策略就是比較積極，難怪很強～
但目前而言這次面試體驗超好，即使我的專長與面試官完全不同，每個關卡的面試主管，還是會試著了解你在不同領域的努力，即使和技術無關的話題也會重視，我覺得上一間的體驗就差了一些，可能面試就是運氣也佔成份吧。
原來白板提其實沒這麼可怕，並不是以要考倒你為目的的測驗，反而更重視考試的思考過程，你展現的特質，包含技術，表達，思考，抗壓等等的整體反應。

面試技巧：

不同考官考題可能相同，中間休息時間可以把握把剛剛卡住的問題查詢了解一下。
不會可以直接承認，但盡可能提出想法，讓考官參考你的思考過程應該有幫助。
上白板功力會大減30％，瞬間變白癡，準備可能要更充足，最好找人練習，把leetcode題目寫在白板且找人模擬當考官。

Tuesday, July 25, 2017

The VC Dimension-7 Machine Learning Foundations

Review

Lecture 6: 總結只要Growth Function出現break point，且N夠大，機器就能有可能學會資料，使的E-out很接近E-in
Lecture 7: 本節目標是討論VC Dimension
統整截至模前的機器學習三步驟：

找到一個好的Hypothesis Set （有break point k)
資料正確且充足
一個好的演算法從訓練資料中找出g function
（加上一點點運氣）

Terms

VC Dimension:

其實就是k-1（break point - 1），能夠shatter的最大N值（超過N就無法shatter）

Example on 2D PLA

左邊紅色部份經過證明，一定會收斂（T:學習夠多次）
右邊藍色部份經過證明，E-out接近E-in（N:資料夠多，H:成長函數有upper bound）
所以2D PLA可行！！！！！一定可以學習成功（E-out接近E-in接近0）

How about Multi-Dimensions

D-vc == D(Dimensions) + 1
eg:

1D PLA: D-vc = 2
2D PLA: D-vc = 3

proof:

D-vc >= D + 1
也就是說只要找到一組可以shatter D+1維度即證明
D-vc <= D + 1
也就是說必須證明所有D+2維度都不能被Shatter

D-vc 物理意義

D-vc (VC Dimension): 直覺的想法其實就是DOF（Degree of Freedom），可以操控的維度數量。回顧之前的課程，機器學習不外乎要作兩件事：

確保E-in和E-out差不多（可套用學習結果到沒看過得資料上）
確保E-in夠小（成功從training data學習）

於是D-vc也和之前Hypothesis Set數量M有相當的狀況：

太大：學習能力強但是可能會失去一般性
太小：學習能力弱但是可容易保有一般性

D-vc 哲學意義

透過一些簡單的換算，可以把原本的Hoffding改寫成E-out的upper bound關係式。

根據上面的公式，可以推論出一個核心的哲學：

下圖舉例，如何根據老闆的需求，概略算出需要的資料量。

但實務上需要的資料量和評估出來的資料量會有非常大的落差，通常只需要10倍於VC-Dimension的資料量就很足夠了。

實務和理論評估出來的資料量需求會有落差，主要是在推導的過程皆採worse case，找到的upper bound是好幾種upper bound的upper bound...

簡言之推出的公式，其實沒什麼實務上的用途，但卻可以大大的了解運作的哲學，提供我們後續機器學習的參考。

小複習：

定義VC Dimension (其實就是DOF）
D-vc背後的衍生意義

NEXT: noise-and-error

REF

The VC Dimension @ Machine Learning Foundations

Fully Connected vs 1*1 Convolution

Fully Connected vs 1*1 Convolution, 兩個是一樣的嗎

完全不一樣！！！！！！

參考2015年Yann LeCun在臉書上的分享，Fully Connected 和 1*1 Convolution 只有在某種特殊情況下才會相當！

training時，最後一層ConV輸出1*1的feature map，此時flatten後接上Fully Connect
testing時，輸入是原本的數倍放大，導致在最後一層的feature map會產生空間上的擴展，此時的Fully Connect（原本的連接方式），可相當於1*1的ConV，（因為擴展的部份是共用同一個1*1 ConV Kernel！）

常搞混的點

ConV過程如何計算使用到的weights數量？

(last channels amount) * (kernel size) * (kernel amount)
每個channel都會有一個獨立的filter進行Convolution

ConV只能輸入固定大小的input？

Nooo, 沒有人規定這樣，只是不同的input size會產生不同大小的feature map，如此而已。

REF:

https://www.facebook.com/yann.lecun/posts/10152820758292143

Monday, July 24, 2017

Theory of Generalization-6 Machine Learning Foundations

Review

break point: minimum data could be shattered
shattered: could represent all dichotomies
dichotomies: number of set could bi-separate the data
m(N): growth function, represent maximum of dichotomies given N

Term

Bounding Function B(N, k): N個點中，任k個不能shattered
combinatorial quantity:
Maximum number of length N vectors with (o,x), while no shatter in any length k subvectors

Bounding Function: The Theorem

從B(4, 3)舉例來看，可以發現分成兩組，（橘色）成對，（紫色）單支
（α部份）
可以直覺的判斷α中(略去x4不看)必須滿足小於B(3,2)的性質。（x1, x2, x3任2個不能被shattered，否則必無法滿足B(3, 3) ）
（α + β部份）
可以直覺的判斷α + β中，必須滿足小於B(3,3)的性質。（x1, x2, x3任3個不能被shattered，否則必無法滿足B(4, 3) ）
下圖舉三個例子：

於是我們找到了一個generalization of upper bound formula，最高的數值大小也不會超過N^(k-1)，和原本的 2^N 相比減少非常多！

VC Bound:

目前已知mH(N)成長函數，但還是無法估計出upper bound。已知E-in是有限的，但是E-out卻是無限的，於是想辦法把E-out轉換成有限，於是推導出VC Bound。
H: Hypothesis Set (可能描述資料的所有函數）
三步驟概略推導VC由來

replace E-out by E-in'
E-in'是D'(validation set)的評估，找有限的資料來作評估依據。這邊因為某些數學的理由，把限制變得更加嚴格 ε -> ε/2。
decompose H by kind
透過成長函數mH(N)來算出總共會有幾種Dichotomies -> mH(2N)
use Hoeffding without replacement
smaller bin, smaller ε

舉例2D perceptrons:

break point = 4
mH(N) < N^(k-1) = N^3
證明了只要N足夠，是可以學習的

NEXT: the-vc-dimension

REF

Theory of Generalization @ Machine Learning Foundations

Tuesday, July 18, 2017

Machine Learning-5 Training versus Testing

Two Central Questions

Qa: Can we make sure Eout(g) is close enough to Ein(g) ?
(in: training data, out: testing data, g:hypothesis function)
Qb: Can we make Ein(g) small enough ?

Trade-off on M (Hypothesis Set)

Small M
Qa: Yes, Qb: No (too few choices/ algorithms)
Large M
Qa: No, Qb: Yes
找到合適的M，且了解為什麼無限的M是否可以學習（就像前幾章PLA的問題），是接下來討討論的目標。

Review

就是把所有BAD events聯集起來。但當今天M太大，upper bound很可能會變成一個無限大的值，而失去意義。

How can we find finite M?

其實不難發現很多的bad events是重疊的。下面舉例：

（可能會出現線性不可分的狀況）
（四筆資料的時候，最多只有14種線性可分狀況）
（從個無限大的M中分群成較少的集合，直覺是可行的）

Terms

Dichotomy: 二分，把資料二分後的一種集合。
（Growth Function: 用來描述利用不同的Hypothesis找出最多有幾種可能的Dichotomy）
m(N): growth function, x: data, H: hypothesis function
Shattered: 粉碎擊敗的，Growth Function = 2^n 可以表達所有Dichotomy.
Break Point: 對多可以達到Shattered的資料數量，超過以後就無法Shattered

Conclusion

如果今天m(N)是指數，那N變大右邊變小的速度會慢非常多。（也就是說，更容易發生bad event），反之如果是polynomial，那右邊就會隨著資料變大而快速減少。

NEXT: theory-of-generalization

REF

Training versus Testing @ Machine Learning Foundations

Machine Learning- HW1

HW1-15

Very simple implementation of PLA algorithm. Really worth to try by anyone interesting to Machine Leaning.
Though the input x has only 4 features in this problem, we need to add one more feature as bias during learning, meaning that the weight vector should be 5 dimensions too.
hw1-16, hw1-17 : 調參數實驗

HW1-18

如果資料確定是線性可分，使用vanilla PLA，反之使用pocket PLA但會慢很多。
pocket PLA：weight更新發法同vanilla PLA，但會有一個最佳的res_weight對整個training dataset的 error rate最低，所以每次更新完weight都要檢查現在的是否比之前保存的更好，這時候必須遍歷全部data統計錯誤率，也就是比較慢的原因。
hw1-19, hw1-20 : 調參數實驗

Ref:

Monday, July 17, 2017

Sampling Methods

Why Sampling?

Approximate Expectation

estimate statistics
poster inference

Visualization

Pros and Cons

Pros

Easy
Could figure out very complicated model

Cons

Too easy - used inappropriately
Slow
insufficient to get "good" samples (which could represent well to the whole data set)
difficult to assess (you hard to know your approximation is good or not)

Monte Carlo Approximation [History]

Credits: Con Nevmann, Fermi, Ulam
Ulam's gambling uncle likes to go to Monte Carlo Casino in Monaco

The intuition is try to evaluate the expectation of intractable event.
直接的作法莫過於把所有的資料取平均（獲取一個unbias的期望值)，但實務上卻往往不可能這樣作的。
核心：sample的資料越多，估計出來的期望值就會越接近真實期望值。

Example

如何知道黑色點在上圖的分佈期望值？（Mandelbrot）
m(A)/m(S) = p(X->A) = EI(X->A)
A表示在上圖中的黑色點點集合，S表示在上途中的全部點點集合，X表示黑色點點出現的隨機變數，相當於I(indicator)該點是黑色為1否為0的期望值。
m^(A) = R*1/n sum(I(X->A))
假設今天上圖範圍面積有R個點點，隨機採樣n個點點，可以來估計分佈～！

Importance Sampling

有沒有可能利用不同的Distribution q來作Samping，更好的還原目標的Distribution p？

YES!!!

可以寫成期望值PDF，會發現可以用不同的採樣方式表達同一個分佈期望值
（原本是從p(x)採樣估計E[f(x)]，現在則是從q(x)來採樣估計E[f(x)*p/q]）
importance weight = p/q
理論上可以找出最佳的q，但現實問題中常常很難作到。如果找的好可以大幅降低variance~!
兩種狀況：

如果今天不知道p，會盡可能找出一個最接近p的q
已知p，但是想用更好的q來降低 Monte Carlo Approx's Variance 。

（左圖）找到不好的q
（右圖）找到較好的q

Ref:

Sampling methods

Saturday, July 15, 2017

[Interview] @Appier, Machine Learning Scientist

20170714@Appier, Machine Learning Scientist

Interviewer: one RD in ML team

Interview Procedure:

Self introduction & QA

my slides

student entrepreneurship experience:

fin-tech domain knowledge
(no any response)
android development
(no any response again)
scrum management experience
(no any response again)

project experience related to deep learning

online handwriting recognition application in VR
(start to discuss!, QA!)
traffic flow prediction using LSTM
(discuss)
SSD algorithm fine-tune project experience/ Deep Q-learing
(...I should not put unfamiliar projects here...)

Leadership

kendo sport team leader/ backpacker experience/ international volunteer/ some small competitions/
(no any response again)

Machine Learning Basics QA

given you some data, users, favorite songs, historical data...features, how do you recommend new songs to them. what's model/ method will you use to fit/ predict data?
(I really suck at this section, because I have few experiences on basic ML field such as HMM, SVM... etc, maybe it's time to plan learning those basics, hope that I could prepare well next time...)

Whiteboard Coding

Given n classes please implement one function that return random selected classes id with distribution w as array .
(very simple question to verify your programming skills)

Conclusion:

Basically, the interviewer only focus on the technical part of my experience, which depress me a lot. seems like the company only appreciate people who is best in specific related field, and don't accept differences.
The whole interview experience: HR is so kind, interviewer also not bad but the questions he asked are felt like free style, ask any questions came out to him (or maybe just I really suck at ML basics concept), which make me very nervous during interview....
Appier is an appealing company, not sure if I have the opportunity to join them. waiting their rely.

Sunday, July 9, 2017

[c++] Leetcode

[Type]

[useful tools]

[example url]

Hash

map (stl): map<string, string> map;
map.insert(pair<string, string>("yeah", "wow")): key"yeah" is mapped to value"wow"
map["yeah"] = "wow"
map.find("yeah"): return the pointer pointing to slot with key "yeah"
map.find("yeah")->second: value of the slot with key "yeah"
REF: http://mropengate.blogspot.tw/2015/12/cc-map-stl.html

set (stl): set<string> set; // only key only, no key-value pair

  std::set<int> myset;
  std::set<int>::iterator itlow,itup;

  for (int i=1; i<10; i++) myset.insert(i*10); // 10 20 30 40 50 60 70 80 90

  itlow=myset.lower_bound (30);                //       ^
  itup=myset.upper_bound (60);                 //                   ^

  myset.erase(itlow,itup);                     // 10 20 70 80 90

task-scheduler
binary-tree-zigzag-level-order-traversal
group-anagrams // use prime to solve might be faster!
contains-duplicate-iii
word-pattern

String Operating

min(a, b): return smaller one

max(a, b): return larger one

reverse(s.begin(), s.end): void, reverse string s

a.compare(b): return 0 if a==b, 1 if a!=b

a.substr(begin, length): return a[begin:begin+length]

a.erase(a.begin(),a.end()) == a.clear() : clear all content

istringstream fin: fin>>word, or get line(fin, word, delimiter)

reverse-string-ii

repeated-dna-sequences

reverse-words-in-a-string-iii

is-subsequence

decode-string

find-the-difference: (very easy)

Vector Operating

std::vector<int> v { 34,23 };
// or
// std::vector<int> v = { 34,23 };

push_back(a): void
pop_back(a): void
vector<T>::iterator: the pointer points to the vector
vector.begin(): return the pointer points to the first element
vector.end(): return the pointer points to last+1 position
vector.front(): return the value of first element
vector.back(): return the value of last element
vector.empty(): return boolean
vector.insert(a.begin(), value): insert value at a.begin(), return void
vector.insert(a.end(), b.begin(), b.end()): append b after a, return void
vector.erase(a.begin()+pos): erase the item of the pos
reverse(s.begin(), s.end): void, reverse vector s
std::random_shuffle(s.begin(), s.end()): void, inlace shuffling

DFS

unordered_set.find(a): return a's reference

unordered_set.find(a)==unordered_set.end() -> a doesn't exist in the hash

diameter-of-binary-tree

Increasing Subsequences

path-sum-ii

convert-bst-to-greater-tree: good exercise!!

maximum-depth-of-binary-tree: (easy)

combination_sum: good

BFS

queue<template> q
q.push(): push back
q.pop(): pop front
q.back(): return last value
q.front(): return first value

find-largest-value-in-each-tree-row

Quick Sort

sort-colors

Binary Sort

search-a-2d-matrix

Useful Tips

for(auto e: <iterable class>) //<iterable class>-> list, vector, map, set...etc
traverse through the whole class

sort(<iterable class>.begin(), <iterable class>.end()) : void, directly sort inline vector<string> test;
test.push_back("ab");
test.push_back("ac");
test.push_back("aa");
sort(test.begin(), test.end());
test-> aa, ab, ac (sorted)

INT: 4 bytes/2 = 16 bits

INT_MAX: 2^15 - 1 = 32767

INT_MIN: -2^15 + 1 = -32767

maximum-product-of-three-numbers (easy)

Math Libs

pow(a, b): return a to the power of a

Algorithm by self-thinking

elimination-game: thinking of mod 4

minimum-moves-to-equal-array-elements-ii: find median

permutations: recursive

remove-duplicates-from-sorted-list: Linked List

merge-two-sorted-lists: linked list, basic

Dynamic Programming

climbing-stairs: fibonacci

maximum-subarray: easy level but good practicing example

ones-and-zeroes:

perfect-squares

longest-increasing-subsequence

2-keys-keyboard: factor decomposition

Algorithm

range-sum-query-mutable: segment tree (log n to sum over given range)

valid-perfect-square: binary search O(logN)

search-for-a-range: binary search O(logN)

Conversion

isdigit(int c): check whether character c is decimal digit or not.

REF

https://github.com/chychen/leetcode_cpp

Sunday, July 2, 2017

Machine Learning-4 Feasibility of Learning

TOPIC

機器學習有可能媽？

No Free Lunch

Training Data訓練出來的Hypothesis g並無法準確地表達沒看過的Testing Data資料

Inference

Statistics: Sampling（採樣）

抽樣樣本數N越大，樣本的機率 v 和真實的機率 u 誤差越小。

推論：找出一個差不多正確的機率分佈
想法：透過Hoeffding's Inequality公式中發現，真實的機率u 並不影響樣本的機率v 是否多接近 u，換句話說我們是不需要事前知道答案的（知道也就不用算了），可以無限接近的推論出答案。（PAC -> Probably Approximately Correct）
Verification: 即使在Training Data中模型v 已經非常接近真實u 了，還是無法確定 v 是否可以準確運作在真實資料u中，可以透過這個公式來驗證，P(|v-u|>0.001)<=0, N-> infinite。

Data

E-in:

Evaluation on training data

E-out:

Evaluation on un-seen data

Good Data:

E-in 很接近 E-out

Bad Data:

E-in 和 E-out 很不同，容易造成從Hypothesis set照出錯誤的 g。

Hoeffding's Inequality:

評估抓出一個Bad Data的機率

Hypothesis set

定義：一個集合，包含各種可能的g，可以表達X->Y之間的關係。
課程中利用了Hoeffding's Inequality證明了如果Hypothesis set集合在"有限"數目，且資料量夠大的條件下，可以透過機器學習找出一個最好的g。

(評估找到一個Bad Data的機率＝該資料對M個h是否為Bad Data的聯集)

Conclusion

如果資料分佈是有統計模型特性的，且有限可能的Hypothesis -> 可以學習的

REF

Feasibility of Learning @ Machine Learning Foundations

Sunday, July 30, 2017

Why?

Error Measure:

Weighted Classification

REF

Saturday, July 29, 2017

20170729 @ Synology, Software Engineer

面試個人優劣自省：

第一關：兩位面試官（10:15 ~ 11:45）

第二關：一位面試官（12:00 ~ 13:30）

第三關：兩位面試官（14:00 ~ 15:30）

第四關：HR時間（15:40 ~ 16:00）

面試結果：

感想：

面試技巧：

Tuesday, July 25, 2017

Review

Terms

Example on 2D PLA

How about Multi-Dimensions

D-vc 物理意義

D-vc 哲學意義

REF

Fully Connected vs 1*1 Convolution, 兩個是一樣的嗎

常搞混的點

REF:

Monday, July 24, 2017

Review

Term

Bounding Function: The Theorem

VC Bound:

REF

Tuesday, July 18, 2017

Two Central Questions

Trade-off on M (Hypothesis Set)

Review

How can we find finite M?

Terms

Conclusion

REF

HW1-15

HW1-18

Ref:

Monday, July 17, 2017

Why Sampling?

Pros and Cons

Monte Carlo Approximation [History]

Example

Importance Sampling

Ref:

Saturday, July 15, 2017

20170714@Appier, Machine Learning Scientist

Interviewer: one RD in ML team

Interview Procedure:

Conclusion:

Sunday, July 9, 2017

[Type]

[useful tools] [example url]

Hash

String Operating

Vector Operating

DFS

unordered_set.find(a): return a's reference unordered_set.find(a)==unordered_set.end() -> a doesn't exist in the hash diameter-of-binary-tree Increasing Subsequences path-sum-ii convert-bst-to-greater-tree: good exercise!! maximum-depth-of-binary-tree: (easy) combination_sum: good

BFS

Quick Sort

Binary Sort

Useful Tips

Math Libs

Algorithm by self-thinking

Dynamic Programming

Algorithm

Conversion

REF

Sunday, July 2, 2017

TOPIC

No Free Lunch

Inference

Data

Hypothesis set

Conclusion

[useful tools]

[example url]

unordered_set.find(a): return a's reference

unordered_set.find(a)==unordered_set.end() -> a doesn't exist in the hash

diameter-of-binary-tree

Increasing Subsequences

path-sum-ii

convert-bst-to-greater-tree: good exercise!!

maximum-depth-of-binary-tree: (easy)

combination_sum: good