NTU-Exam board



Course name: Probability
Course type: Required
Instructor: 林守德
College: College of Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (Y/M/D): 100.06/23
Time limit (minutes): 14:30~17:30
Reward requested: Yes
(no reward is issued unless this is stated explicitly)

Exam questions:

Total Points: 120
You can answer in either Chinese or English.

1. Please briefly describe the following concepts (16 pts):
   (a) Law of large numbers
   (b) The difference between MI and PMI
   (c) Prior, Likelihood function, Posterior
   (d) The usage and bound for KL-divergence

2. Somebody repeatedly rolls a fair die. If it comes up 6, he instantly wins
   (and stops playing); if it comes up k, for any k between 1 and 5, he waits
   k minutes and then rolls again. What is the expected elapsed time from when
   he starts rolling until he wins? (8 pts)

3. Describe at least four connections (or relationships) between the following
   5 distributions: Poisson, Exponential, Gamma, Chi-square, Normal
   (4*3=12 pts)

4. Given that discrete random variables X and Y are independent, are X^2 and
   Y^2 also independent? Please explain your answer. (7 pts)

5. Suppose that X and Y are independent exponential random variables with
   parameter λ, and let Z = X/(X+Y). Show that Z is uniformly distributed
   over (0,1). (8 pts)

6. Let X be a discrete random variable taking on positive and negative integer
   values, and let Y be a function of X. For each of the following cases,
   please answer whether H(X) or H(Y) is larger.
   (a) Y = (X-2)^2 + 3 (4 pts)
   (b) Y = tanX (4 pts)
   (c) Prove that H(f(X)) ≦ H(X)
       hint: H(X,Y)=H(X)+H(Y|X), H(X,Y)=H(Y)+H(X|Y) (8 pts)

7. Consider the table of term frequencies for 3 documents denoted Doc1, Doc2,
   Doc3 in Table 1.
   a) List how to compute the tf-idf weights for the terms car, auto,
      insurance and best for each document, using the idf values from Table 2.
      There is no need to normalize the tf values, and no need to generate the
      final value; listing the equation is enough. (6 pts)
   b) Which document (Doc2 or Doc3) is more similar to Doc1?
   (4 pts)

             │Doc1│Doc2│Doc3
   ──────────┼────┼────┼────
   car       │  27│   4│  24
   auto      │   3│  33│   0
   insurance │   0│  33│  29
   best      │  14│   0│  17
   Table 1: term frequency

             │  df_t│idf_t
   ──────────┼──────┼─────
   car       │18,165│ 1.65
   auto      │ 6,723│ 2.08
   insurance │19,241│ 1.62
   best      │25,235│ 1.5
   Table 2: inverse document frequency in the total collection of 806,791
   documents

8. The classical PageRank algorithm assumes that a page chooses the next hop
   uniformly at random from the available outgoing links. However, in practice
   Google does not treat all links equally and uses several proprietary
   indications to determine the importance of a link. Assume that the weight
   (a positive real number) of each link is known and a page chooses the next
   hop with probability proportional to the outgoing link's weight. What
   modifications need to be made to the classical PageRank algorithm to take
   the link weights into account? Please write down the new equation. (8 pts)

9. In the vector space model with the Tf-Idf schema, words in different tenses
   (e.g. "talk" and "talked", "go" and "gone") are treated as completely
   different words. Do you feel such treatment can affect the search results?
   If yes, how do you propose to fix it? (8 pts)

10. Let X, Y, and Z be jointly distributed random variables, where Z=f(X,Y).
    For each of the following inequalities, give an example of f(X,Y) that
    satisfies it and justify your solution. (6*2=12 pts)
    (i)  I(X,Y|Z) < I(X,Y)
    (ii) I(X,Y|Z) > I(X,Y)

11. Let {X1, X2, ..., Xn} be integers generated by random round-off sampling
    from a uniform distribution U[0, b]. We want to estimate the parameter b;
    this is known as the German tank problem in WWII. (The samples are German
    tank serial numbers spotted by the Allies, and we want to estimate the
    total number of tanks Germany has.) (5*3=15 pts)
    a. Find the estimate of the parameter b using Maximum Likelihood
       Estimation.
    b. Find the estimate of the parameter b using the Method of Moments.
    c. What are the potential concerns for each of the estimates?
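
For question 7, here is a minimal sketch of how the weights could be computed. It assumes the weight of a term is its raw count from Table 1 times its idf from Table 2, since the question says the tf values need not be normalized. For part (b) it assumes cosine similarity as the similarity measure; the question itself does not name one. The names TF, IDF, tfidf_vector and cosine are made up for this illustration.

```python
import math

# Term order used in every vector below: car, auto, insurance, best.
TERMS = ["car", "auto", "insurance", "best"]

# Raw term frequencies copied from Table 1.
TF = {
    "Doc1": [27, 3, 0, 14],
    "Doc2": [4, 33, 33, 0],
    "Doc3": [24, 0, 29, 17],
}

# idf values copied from Table 2 (same term order).
IDF = [1.65, 2.08, 1.62, 1.5]


def tfidf_vector(tf_counts):
    """Weight each raw count by the idf of its term (no tf normalization)."""
    return [tf * idf for tf, idf in zip(tf_counts, IDF)]


def cosine(u, v):
    """Cosine similarity between two vectors (an assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


WEIGHTS = {doc: tfidf_vector(counts) for doc, counts in TF.items()}
SIM_12 = cosine(WEIGHTS["Doc1"], WEIGHTS["Doc2"])
SIM_13 = cosine(WEIGHTS["Doc1"], WEIGHTS["Doc3"])

if __name__ == "__main__":
    for doc, vec in WEIGHTS.items():
        print(doc, dict(zip(TERMS, vec)))
    print("cos(Doc1, Doc2) =", round(SIM_12, 3))
    print("cos(Doc1, Doc3) =", round(SIM_13, 3))
```

Under these assumptions Doc3 comes out closer to Doc1 than Doc2 does, mainly because Doc1 and Doc3 share large weights on car and best. A different similarity measure or a normalized tf could give different numbers.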
Mutual Information
   I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

TF-IDF
   tf_ij = n_ij / max_k n_kj
   idf_i = log(N/df_i + 1)

Exponential Distribution
   f(x)=λe^(-λx), let θ=1/λ, μ=θ, σ^2=θ^2
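
The mutual information identity above can be checked numerically. The sketch below is an illustration rather than part of the exam. It takes a joint distribution of X and Y as a table, obtains the conditional entropies from the chain rule H(X,Y) = H(Y) + H(X|Y) quoted in the hint to question 6, and evaluates both forms of I(X,Y). The function names and the example table in the usage section are assumptions chosen for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """Return (H(X) - H(X|Y), H(Y) - H(Y|X)) for joint[i][j] = P(X=i, Y=j)."""
    px = [sum(row) for row in joint]        # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]  # marginal distribution of Y
    h_x = entropy(px)
    h_y = entropy(py)
    h_xy = entropy([p for row in joint for p in row])
    # Chain rule: H(X,Y) = H(Y) + H(X|Y) = H(X) + H(Y|X).
    h_x_given_y = h_xy - h_y
    h_y_given_x = h_xy - h_x
    return h_x - h_x_given_y, h_y - h_y_given_x


if __name__ == "__main__":
    # Hypothetical joint distribution in which X and Y tend to agree.
    example = [[0.4, 0.1],
               [0.1, 0.4]]
    via_x, via_y = mutual_information(example)
    print("H(X) - H(X|Y) =", via_x)
    print("H(Y) - H(Y|X) =", via_y)
```

Both returned values agree up to floating-point rounding, and for an independent pair (every cell equal to the product of its marginals) both are zero.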



※ Origin: 批踢踢實業坊 (ptt.cc)
◆ From: 140.112.4.192
※ Edited: s864372002 from: 140.112.4.192 (06/23 20:46)
1F: push jimwjma: Upvote for 鋼琴 06/23 20:47
2F: → andy74139: Archived under the CSIE department!! 06/23 21:06






