NTU-Exam board



Course name: Probability
Course type: Required
Instructor: 林守德 (Shou-De Lin)
College: College of Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (Y.M/D): 100.06/23 (ROC year 100, i.e. 2011)
Time limit (minutes): 14:30~17:30
Reward requested: Yes
(If not explicitly indicated, no reward is issued)

Exam questions:

Total Points: 120
You can answer in either Chinese or English.

1. Please briefly describe the following concepts (16 pts):
   (a) Law of large numbers
   (b) The difference between MI and PMI
   (c) Prior, likelihood function, posterior
   (d) The usage of and bound for KL-divergence

2. Somebody repeatedly rolls a fair die. If it comes up 6, he instantly wins
   (and stops playing); if it comes up k, for any k between 1 and 5, he waits
   k minutes and then rolls again. What is the expected elapsed time from when
   he starts rolling until he wins? (8 pts)

3. Describe at least four connections (or relationships) between the following
   5 distributions: Poisson, Exponential, Gamma, Chi-square, Normal
   (4*3=12 pts)

4. Given that discrete random variables X and Y are independent, are X^2 and
   Y^2 also independent? Please explain your answer. (7 pts)

5. Suppose that X and Y are independent exponential random variables with
   parameter λ, and let Z = X/(X+Y). Show that Z is uniformly distributed
   over (0,1). (8 pts)

6. Let X be a discrete random variable taking on positive and negative integer
   values, and let Y be a function of X. For each of the following cases,
   please answer whether H(X) or H(Y) is larger.
   (a) Y = (X-2)^2 + 3 (4 pts)
   (b) Y = tan X (4 pts)
   (c) Prove that H(f(X)) ≤ H(X).
       Hint: H(X,Y) = H(X) + H(Y|X), H(X,Y) = H(Y) + H(X|Y) (8 pts)

7. Consider the table of term frequencies for 3 documents denoted Doc1, Doc2,
   Doc3 in Table 1.
   a) Show how to compute the tf-idf weights for the terms car, auto,
      insurance, and best for each document, using the idf values from
      Table 2. The tf values need not be normalized. There is no need to
      compute the final values; writing out the equations is sufficient.
      (6 pts)
   b) Which document (Doc2 or Doc3) is more similar to Doc1? (4 pts)

              │ Doc1 │ Doc2 │ Doc3
   ───────────┼──────┼──────┼──────
   car        │   27 │    4 │   24
   auto       │    3 │   33 │    0
   insurance  │    0 │   33 │   29
   best       │   14 │    0 │   17
   Table 1: Term frequency

              │   df_t │ idf_t
   ───────────┼────────┼───────
   car        │ 18,165 │  1.65
   auto       │  6,723 │  2.08
   insurance  │ 19,241 │  1.62
   best       │ 25,235 │  1.5
   Table 2: Inverse document frequency in the total collection of 806,791
   documents

8. The classical PageRank algorithm assumes that a page chooses the next hop
   uniformly at random from the available outgoing links. However, in practice
   Google does not treat all links equally and uses several proprietary
   indications to determine the importance of a link. Assume that the weight
   (a positive real number) of each link is known and that a page chooses the
   next hop with probability proportional to the outgoing link's weight. What
   modifications need to be made to the classical PageRank algorithm to take
   the link weights into account? Please write down the new equation. (8 pts)

9. In the vector space model with the tf-idf scheme, words in different tenses
   (e.g. "talk" and "talked", "go" and "gone") are treated as completely
   different words. Do you think such treatment can affect the search results?
   If yes, how do you propose to fix it? (8 pts)

10. Let X, Y, and Z be jointly distributed random variables with Z = f(X,Y).
    For each of the following inequalities, give an example of f(X,Y) that
    satisfies it and justify your solution. (6*2=12 pts)
    (i)  I(X,Y|Z) < I(X,Y)
    (ii) I(X,Y|Z) > I(X,Y)

11. Let {X1, X2, ..., Xn} be integers generated by random round-off sampling
    from a uniform distribution U[0, b]. We want to estimate the parameter b;
    this is known as the German tank problem from WWII. (The samples are the
    German tank serial numbers spotted by the Allies, and we want to estimate
    the total number of tanks Germany has.) (5*3=15 pts)
    a. Find the estimate of the parameter b using Maximum Likelihood
       Estimation.
    b. Find the estimate of the parameter b using the Method of Moments.
    c. What are the potential concerns with each of the estimators?
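For Problem 11, a minimal sketch of the two estimators may help make the comparison in part (c) concrete. It assumes the continuous model U[0, b] and ignores the round-off step, so it is an illustration rather than a model answer. The function names and the serial numbers are hypothetical and were invented for this example.

```python
from statistics import mean


def mle_estimate(sample):
    """MLE of b under the continuous model U[0, b]: the sample maximum."""
    return max(sample)


def mom_estimate(sample):
    """Method of moments under U[0, b]: E[X] = b/2, so b_hat = 2 * sample mean."""
    return 2 * mean(sample)


# Hypothetical serial numbers, invented for illustration only.
serials = [1, 1, 1, 20]

print("MLE:", mle_estimate(serials))               # sample maximum
print("Method of moments:", mom_estimate(serials)) # twice the sample mean
```

With this sample the method-of-moments estimate falls below the largest observed serial number, which is one known weakness of that estimator. Under the continuous model the MLE can never exceed b, so it tends to underestimate it.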
Mutual Information
    I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

TF-IDF
    tf_ij = n_ij / max_k n_kj
    idf_i = log(N/df_i + 1)

Exponential Distribution
    f(x) = λe^(-λx); let θ = 1/λ, then μ = θ, σ^2 = θ^2
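For Problem 7, the computation can be sketched as follows, using the raw term frequencies from Table 1 and the idf values from Table 2 (weight = tf × idf, with no tf normalization, as the problem states). The exam does not name a similarity measure; cosine similarity is assumed here, and the helper names are hypothetical.

```python
import math

# Term frequencies from Table 1 and idf values from Table 2.
tf = {
    "Doc1": {"car": 27, "auto": 3, "insurance": 0, "best": 14},
    "Doc2": {"car": 4, "auto": 33, "insurance": 33, "best": 0},
    "Doc3": {"car": 24, "auto": 0, "insurance": 29, "best": 17},
}
idf = {"car": 1.65, "auto": 2.08, "insurance": 1.62, "best": 1.5}
terms = ["car", "auto", "insurance", "best"]


def tfidf_vector(doc):
    """tf-idf weights for one document, using unnormalized tf."""
    return [tf[doc][t] * idf[t] for t in terms]


def cosine(u, v):
    """Cosine similarity (an assumed choice of similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


d1, d2, d3 = (tfidf_vector(d) for d in ("Doc1", "Doc2", "Doc3"))
print("sim(Doc1, Doc2) =", round(cosine(d1, d2), 3))
print("sim(Doc1, Doc3) =", round(cosine(d1, d3), 3))
```

Under these assumptions Doc3 comes out closer to Doc1 than Doc2 does; a different similarity measure or tf normalization could change the numbers.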



※ Origin: 批踢踢实业坊 (ptt.cc)
◆ From: 140.112.4.192
※ Edited: s864372002 from 140.112.4.192 (06/23 20:46)
1F: push jimwjma: Pushing the piano 06/23 20:47
2F: → andy74139: Archived to the CSIE department collection!! 06/23 21:06






