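Author: pinkyenyen (彦彦)
Board: NTU-Exam
Title: [Exam] 96-2 (second semester of academic year 96) 林守德 Probability Final Exam
Time: Sat Jun 21 00:50:26 2008
Course name: Probability
Course type: Department required course
Instructor: 林守德
College: College of Electrical Engineering and Computer Science
Department: Department of Computer Science and Information Engineering
Exam date (Y/M/D): 2008/06/19
Time limit (minutes): 180
Reward requested: Yes
(If not explicitly stated, no reward is issued)
Exam questions: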
1.Let X have the following probability density function:
f(x) = \sigma^{-1} (2\pi)^{-0.5} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
What is the probability density function of Y = exp(X)?
2.Person A throws an unbiased die n times and B throws the same die n+1
times. We care about how many '6's they throw. If you are told that
P(B has more '6's than A) = 5/12
then what is the probability that A and B have equally many '6's after
each has thrown the die n times?
Hint: condition on which player has more '6's after each has thrown n
times.
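One way to sanity-check an answer to Question 2: B ends up ahead exactly when B is already ahead after n throws each, or when the two are tied and B's extra throw is a '6'. By symmetry this suggests P(B more) = 1/2 - P(equal)/3. The sketch below checks that identity with exact fractions for small n and then inverts it for the given 5/12. The function names are illustrative only and are not part of the exam.

```python
from fractions import Fraction
from math import comb

P_SIX = Fraction(1, 6)  # chance of a '6' on one throw of a fair die


def sixes_pmf(n):
    """Exact distribution of the number of '6's in n throws."""
    return [comb(n, k) * P_SIX**k * (1 - P_SIX)**(n - k) for k in range(n + 1)]


def prob_equal(n):
    """P(A and B have equally many '6's after n throws each)."""
    return sum(q * q for q in sixes_pmf(n))


def prob_b_more(n):
    """P(B has more '6's than A) when A throws n times and B throws n + 1."""
    a, b = sixes_pmf(n), sixes_pmf(n + 1)
    return sum(a[i] * b[j] for i in range(n + 1) for j in range(i + 1, n + 2))


def equal_from_b_more(q):
    """Invert q = 1/2 - e/3 to recover e = P(equal)."""
    return 3 * (Fraction(1, 2) - q)


# The identity holds exactly for every n tried here.
for n in range(1, 8):
    assert prob_b_more(n) == Fraction(1, 2) - prob_equal(n) / 3

print(equal_from_b_more(Fraction(5, 12)))  # 1/4
```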
3.Your company must make a sealed bid for a construction project. Your
company will win if your bid is lower than the other companies' bids. If you
win the bid, then you plan to pay another firm 100 thousand dollars to do the
work. You are competing with two other companies, and you believe their bids
are two independent random variables uniformly distributed on [70,250] and
[140,300], respectively.
(a) Suppose your bid is x, what is the probability that you win?
(b) Suppose your bid is x, what is the expected profit?
(c) Determine the x that maximizes your expected profit.
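A minimal numerical sketch for Question 3, under these assumptions: all bids are in thousands of dollars, ties have probability zero, and profit is (x - 100) when you win and 0 otherwise. It uses a plain grid search rather than calculus, so it only approximates the maximizer. The helper names are made up for illustration.

```python
def win_probability(x):
    """P(both rival bids exceed x), with rivals U[70, 250] and U[140, 300]."""
    def tail(lo, hi):
        # P(U > x) for U uniform on [lo, hi], clipped to [0, 1]
        return min(1.0, max(0.0, (hi - x) / (hi - lo)))
    return tail(70, 250) * tail(140, 300)


def expected_profit(x):
    """Expected profit in thousands: margin (x - 100) times P(win)."""
    return (x - 100) * win_probability(x)


# Coarse numerical search over candidate bids in steps of 0.01.
candidates = [70 + 0.01 * i for i in range(18001)]
best_bid = max(candidates, key=expected_profit)
print(round(best_bid, 2), round(expected_profit(best_bid), 2))
```

Setting the derivative of (x - 100)(250 - x)(300 - x) to zero on [140, 250] gives the same maximizer analytically, which is a useful cross-check on the grid result.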
4.X, Y, and Z are three random variables. Can you propose a real-world example
of them that satisfies the following conditions:
(a) X and Y are independent.
(b) X and Y become dependent given Z.
5.T1 and T2 are two positive continuous random variables that satisfy:
˙T1 > T2
˙T1 + T2 < 2
Their joint density function is uniform in the above region, and is zero
elsewhere. What is P(T1 + T2 > 1)?
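Because the joint density in Question 5 is uniform, the probability is a ratio of areas. The sketch below assumes the support is the triangle with vertices (0,0), (2,0), (1,1) in (T1, T2) coordinates, and that the part with T1 + T2 <= 1 is the triangle (0,0), (1,0), (1/2,1/2). Both are my reading of the constraints, so check them against a sketch of the region.

```python
from fractions import Fraction


def triangle_area(p, q, r):
    """Area of a triangle from its vertices (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2


zero, one, two, half = Fraction(0), Fraction(1), Fraction(2), Fraction(1, 2)

# Support: T2 > 0, T1 > T2, T1 + T2 < 2
support = triangle_area((zero, zero), (two, zero), (one, one))
# Part of the support where T1 + T2 <= 1
below = triangle_area((zero, zero), (one, zero), (half, half))

prob = 1 - below / support
print(prob)  # 3/4
```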
6.Let X and Y be random variables of the continuous type having the joint
p.d.f.: f(x,y) = 2, 0 <= y <= x <= 1.
(a) What are the means of X and Y?
(b) What is the covariance of X and Y?
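For Question 6, every moment needed is an integral of a monomial over the triangle 0 <= y <= x <= 1. A small sketch with exact fractions follows. The closed form in `moment` comes from integrating 2 x^a y^b first over y and then over x, and the function name is only illustrative.

```python
from fractions import Fraction


def moment(a, b):
    """E[X^a Y^b] for density f(x, y) = 2 on 0 <= y <= x <= 1.

    Integrating 2 * x^a * y^b over y in [0, x] and then x in [0, 1]
    gives 2 / ((b + 1) * (a + b + 2)).
    """
    return Fraction(2, (b + 1) * (a + b + 2))


mean_x = moment(1, 0)
mean_y = moment(0, 1)
cov_xy = moment(1, 1) - mean_x * mean_y
print(mean_x, mean_y, cov_xy)  # 2/3 1/3 1/36
```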
7.A public poll was taken to determine whether we should allow tourists from
Mainland China. Let p equal the proportion of people who favor this
decision. We shall test H0: p = 0.65 against H1: p > 0.65.
(a) Given α = 0.025, what is the critical region?
(b) Given that 414 out of a sample of 600 favor this proposal, find the
p-value.
(c) Should we reject or accept H0?
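A rough check for Question 7 using the usual large-sample z-test for a proportion. It assumes the normal approximation without a continuity correction, which the exam does not state explicitly, and it relies on `statistics.NormalDist` from the Python standard library (3.8+).

```python
from math import sqrt
from statistics import NormalDist

p0, alpha = 0.65, 0.025
n, favorable = 600, 414

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)   # upper-tail critical value
std_err = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (favorable / n - p0) / std_err       # observed test statistic
p_value = 1 - std_normal.cdf(z)          # one-sided p-value
reject_h0 = p_value < alpha

print(round(z_crit, 2), round(z, 2), round(p_value, 3), reject_h0)
```

The critical region is z >= z_crit, which for n = 600 corresponds to a sample proportion of at least p0 + z_crit * std_err.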
8.The teacher claims that 1/4 of the students will receive an A, 1/4 will
receive a B, and 1/2 will receive a C. Among the 40 students, 6 receive
an A, 7 receive a B, and 27 receive a C. Would the claim be rejected at the
α = 0.05 significance level?
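Question 8 reads like a chi-square goodness-of-fit test, so here is a sketch under that assumption. With three categories there are 2 degrees of freedom, and for that case the chi-square tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
from math import exp, log

observed = [6, 7, 27]            # A, B, C counts among 40 students
claimed = [0.25, 0.25, 0.5]      # teacher's claimed proportions
n = sum(observed)

expected = [n * p for p in claimed]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 2 degrees of freedom, P(chi-square > x) = exp(-x / 2).
alpha = 0.05
critical = -2 * log(alpha)       # value whose upper tail equals alpha
p_value = exp(-chi2 / 2)
reject = chi2 > critical

print(round(chi2, 2), round(critical, 3), round(p_value, 4), reject)
```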
9.A six-sided fair die is rolled. What is the mutual information between the
topside and the front side (the side most facing you)?
Hint: The sum of two opposite sides is always 7.
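A small enumeration for Question 9, assuming every orientation of the die is equally likely. Given the top face, the front face can be any of the four faces that are neither the top nor its opposite, which gives 24 equally likely (top, front) pairs.

```python
from itertools import product
from math import log2

# All (top, front) pairs: the front face is neither the top nor its opposite.
pairs = [(t, f) for t, f in product(range(1, 7), repeat=2)
         if f != t and f != 7 - t]

p_joint = 1 / len(pairs)   # 24 equally likely orientations (assumed uniform)
p_top = p_front = 1 / 6    # each face is equally likely on top or in front

# I(top; front) = sum over pairs of p(t, f) * log2(p(t, f) / (p(t) * p(f)))
mutual_info = sum(p_joint * log2(p_joint / (p_top * p_front)) for _ in pairs)
print(round(mutual_info, 4))  # 0.585
```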
10.Half of the Taiwanese students in the class get a high score, and 2/3 of the
students in the class are Taiwanese. Only 1/10 of the non-Taiwanese
students get a high score.
(a) Define the random variables and draw the Bayesian network (with
conditional probability tables) for this statement.
(b) What is the probability that a randomly chosen student is a Taiwanese
student who gets a high score?
(c) Given an association rule that says "Japanese = true" → "score = high"
please provide a pair of "reasonable" min-support and min-confidence
that make this rule true.
11.Given the following Bayesian network,
┌─┐
│H│ P(H) = 1/2
↙└─┘↘
P(S|H) = 1/10 ┌─┐ ┌─┐ P(F|H) = 0
P(S|~H) = 1/2 │S│ │F│ P(F|~H) = 1/2
↙└─┘ └─┘
┌─┐ P(W|S) = 1/2
│W│ P(W|~S) = 1
└─┘
please calculate P(W,F).
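For Question 11, P(W,F) can be checked by summing the joint distribution over the hidden variables H and S. The sketch assumes the network structure H → S, H → F, S → W as I read it from the diagram, so P(h, s, f, w) = P(h) P(s|h) P(f|h) P(w|s).

```python
from fractions import Fraction
from itertools import product

p_h = Fraction(1, 2)
p_s_given_h = {True: Fraction(1, 10), False: Fraction(1, 2)}  # P(S|H), P(S|~H)
p_f_given_h = {True: Fraction(0), False: Fraction(1, 2)}      # P(F|H), P(F|~H)
p_w_given_s = {True: Fraction(1, 2), False: Fraction(1)}      # P(W|S), P(W|~S)


def bern(p, value):
    """Probability that a binary variable with P(true) = p takes `value`."""
    return p if value else 1 - p


# Marginalize over H and S with W = true and F = true held fixed.
p_w_and_f = sum(
    bern(p_h, h) * bern(p_s_given_h[h], s) * p_f_given_h[h] * p_w_given_s[s]
    for h, s in product([True, False], repeat=2)
)
print(p_w_and_f)  # 3/16
```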
12.Corpus C consists of only three documents:
D1: "new york times"
D2: "new york post"
D3: "los angeles times"
(a) Please use the vector-space model to represent these three documents,
assuming the weights are all binary and the words in the vector are
ordered alphabetically.
(b) Please use the vector-space model to represent these three documents,
assuming the weights are TFIDF values and assuming that term
frequencies are normalized by the maximum frequency in a document.
Note: Please use the base-10 logarithm with the following table:
x       │ 2      3      4      5      6      7      8      9
────────┼───────────────────────────────────────────────────────
log10 x │ 0.3010 0.4771 0.6021 0.6990 0.7782 0.8451 0.9031 0.9542
(c) Given the following query: "new new times," calculate the corresponding
TFIDF-based vector, and compute its similarity to D1 using the cosine
similarity measure. Assume that term frequencies are normalized by the
maximum frequency in a given query.
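A sketch of the vector-space computations in Question 12. The exam does not say which IDF formula to use, so this assumes idf = log10(N / df) with N = 3 documents. Other conventions (smoothed or natural-log IDF) would give different numbers. All function names are illustrative.

```python
from math import log10, sqrt

docs = {
    "D1": "new york times".split(),
    "D2": "new york post".split(),
    "D3": "los angeles times".split(),
}
vocab = sorted({w for words in docs.values() for w in words})

# Assumed IDF convention: log10(N / df); the exam does not fix one.
idf = {w: log10(len(docs) / sum(w in words for words in docs.values()))
       for w in vocab}


def binary_vector(words):
    """1 if the term occurs in the text, else 0, in alphabetical term order."""
    return [int(w in words) for w in vocab]


def tfidf_vector(words):
    """TF normalized by the maximum term frequency, times IDF."""
    max_tf = max(words.count(w) for w in set(words))
    return [words.count(w) / max_tf * idf[w] for w in vocab]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


query = "new new times".split()
similarity = cosine(tfidf_vector(query), tfidf_vector(docs["D1"]))
print(vocab)
print(round(similarity, 4))
```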
13.Given a social network:
┌─┐ calls
│N1│→→→→→→→→→→↘
└─┘ ↓
↑calls ↓
↑ ↓
┌──┐ emails ┌──┐
│Sue │←←←←←←← │Jean│
└──┘ ↙└──┘
↓emails ↙ ↓calls
↓ ↙ ↓
┌─┐ ↙ ┌─┐
│P1│←←← ↙ │C1│
└─┘ calls └─┘
There are six paths of length 2:
calls calls
Sue →→→ N1 →→→ Jean
emails emails
Jean →→→ Sue →→→ P1
emails calls
Jean →→→ Sue →→→ N1
calls calls
N1 →→→ Jean →→→ C1
calls calls
N1 →→→ Jean →→→ P1
calls emails
N1 →→→ Jean →→→ Sue
If we perform a random experiment to pick a length-2 path at random, and
define two random variables S and P:
S: the starting node of the path (e.g. "Sue")
P: the link-combination of the path (e.g. {calls, emails}, {calls, calls})
(a) What is the size of P's outcome space?
(b) What is the mutual information I(S;P)?
(c) Assume that min-support is 0.3 and min-confidence is 0.7. Can we
conclude the association rule N1 → {calls, calls}? Why?
(d) Assume the initial PageRank value of each node is 0.2. Which node(s)
have the highest PageRank value after two iterations?
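Parts (b) and (c) of Question 13 can be checked by tabulating the six listed paths. Two assumptions go into this sketch: each path is equally likely, and a link-combination is an unordered pair, so (emails, calls) and (calls, emails) are the same outcome. Treating the pair as ordered would change the mutual information. Part (d) is not covered here because it depends on the PageRank variant used.

```python
from collections import Counter
from fractions import Fraction
from math import log2

# The six length-2 paths listed above: (start node, first link, second link).
paths = [
    ("Sue", "calls", "calls"),
    ("Jean", "emails", "emails"),
    ("Jean", "emails", "calls"),
    ("N1", "calls", "calls"),
    ("N1", "calls", "calls"),
    ("N1", "calls", "emails"),
]

# Assumption: link-combinations are unordered, so sort each pair of links.
samples = [(s, tuple(sorted((l1, l2)))) for s, l1, l2 in paths]
n = len(samples)

joint = Counter(samples)
start = Counter(s for s, _ in samples)
combo = Counter(c for _, c in samples)

# I(S;P) in bits, summed over the observed (start, combination) pairs.
mutual_info = sum(
    (k / n) * log2((k / n) / ((start[s] / n) * (combo[c] / n)))
    for (s, c), k in joint.items()
)

rule = ("N1", ("calls", "calls"))
support = Fraction(joint[rule], n)               # P(S = N1, P = {calls, calls})
confidence = Fraction(joint[rule], start["N1"])  # P({calls, calls} | S = N1)
rule_holds = support >= Fraction(3, 10) and confidence >= Fraction(7, 10)

print(len(combo), round(mutual_info, 4), support, confidence, rule_holds)
```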
14.The problem of "Chinese poetry segmentation" aims at breaking a line of
Chinese poetry into a sequence of terms, for example,
"夜半钟声到客船" → "夜半 钟声 到 客船"
Can you carefully describe a way to use an n-gram language model (LM) to do
this job?
Hint: You need to determine not only where to put the breaks but also how
many breaks there are.
--
※ Origin: 批踢踢实业坊 (ptt.cc)
◆ From: 61.229.232.184