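Author: fanfanfanfan ()
Board: NTU-Exam
Title: [Exam] Academic year 100, first semester, 唐牧群, Information Retrieval, final exam
Time: Tue Jan 10 15:58:27 2012
Course name: Information Retrieval
Course type: Required course, third year, Department of Library and Information Science
Instructor: 唐牧群
College: College of Liberal Arts
Department: Department of Library and Information Science
Exam date (year/month/day): 101/01/10
Exam time limit (minutes): 1.20~4.20
Reward requested: Yes!
(If not explicitly stated, no reward will be given)
Exam questions: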
1. With an imaginary database that contains only the following 5 documents:
(20 points)
D1:"a dog barks at a cat and a dog in a tree"
D2:"a dog watches ants eat the bark of a tree"
D3:"a dog watches another dog by a tree"
D4:"a dog barks at a cat on a tree"
D5:"the bark fell from the tree as a cat watches"
(Terms in the stop word list have been marked with lighter hue).
Please
1. Create an inverted file for the database where each cell contains the TF
(Term Frequency) weight of each term in each document.
2. Calculate the document frequency (DF) and IDF weight for each index term
(simply use N/n without the logarithm).
3. Give the ranking after the user submits the query "dog barks cat".
4. After the first iteration, the user examines the results and marks D1 and D4
as relevant, and D2 and D5 as non-relevant. Produce the new ranking using
Rocchio's method where α=1.0, β=1.0, γ=1.0.
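
The four steps of question 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not an official answer key. The stop word list is a guess, since the lighter-hue marking did not survive in this post. No stemming is applied, so "bark" and "barks" are treated as different terms. Ranking uses the inner product of TF x IDF vectors, and Rocchio averages over the marked documents and keeps negative weights. The names STOP_WORDS, doc_vector, query_vector, rank and rocchio are made up for this sketch.

```python
from collections import Counter

# Assumption: hypothetical stop list; the exam's own marking is not recoverable.
STOP_WORDS = {"a", "at", "and", "in", "the", "of", "by", "on", "from", "as", "another"}

DOCS = {
    "D1": "a dog barks at a cat and a dog in a tree",
    "D2": "a dog watches ants eat the bark of a tree",
    "D3": "a dog watches another dog by a tree",
    "D4": "a dog barks at a cat on a tree",
    "D5": "the bark fell from the tree as a cat watches",
}

def tokenize(text):
    # Lowercase, split on whitespace, drop stop words (no stemming).
    return [w for w in text.lower().split() if w not in STOP_WORDS]

# Step 1: inverted file, term -> {doc_id: raw term frequency}.
tf = {}
for doc_id, text in DOCS.items():
    for term, count in Counter(tokenize(text)).items():
        tf.setdefault(term, {})[doc_id] = count

# Step 2: DF is the number of documents containing the term; IDF = N / n.
N = len(DOCS)
df = {term: len(postings) for term, postings in tf.items()}
idf = {term: N / n for term, n in df.items()}

def doc_vector(doc_id):
    # TF x IDF weights for the terms that occur in one document.
    return {t: tf[t][doc_id] * idf[t] for t in tf if doc_id in tf[t]}

def query_vector(query):
    # Same weighting for the query; terms outside the index are ignored.
    return {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}

def rank(q):
    # Step 3: score by inner product, highest first, ties broken by document id.
    scores = {d: sum(w * doc_vector(d).get(t, 0.0) for t, w in q.items()) for d in DOCS}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

def rocchio(q, relevant, non_relevant, alpha=1.0, beta=1.0, gamma=1.0):
    # Step 4: q' = alpha*q + beta*mean(relevant) - gamma*mean(non_relevant).
    new_q = {t: alpha * w for t, w in q.items()}
    for doc_ids, weight in ((relevant, beta), (non_relevant, -gamma)):
        for d in doc_ids:
            for t, w in doc_vector(d).items():
                new_q[t] = new_q.get(t, 0.0) + weight * w / len(doc_ids)
    return new_q

q0 = query_vector("dog barks cat")
print(rank(q0))
q1 = rocchio(q0, ["D1", "D4"], ["D2", "D5"])
print(rank(q1))
```

With a different stop list, stemming, cosine normalization, or clipping of negative weights, the numbers (and possibly the order) would change, so treat the output as one possible reading of the question.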
Answer 4 out of the following 5 questions; each will account for 20 points.
2. Unlike data retrieval, where perfect precision and recall are guaranteed,
information retrieval is more of a probabilistic process where information
conveyed in the retrieved documents might or might not answer users'
information needs. What are the possible causes behind the uncertainty of IR?
3. Define the following concepts and explain how they are related to one
another: "specificity", "precision" and "IDF (Inverse Document Frequency)";
"exhaustivity", "recall" and "TF (Term Frequency)". There is often a trade-off
between precision and recall; is there also a trade-off between specificity
and exhaustivity?
4. Explain the three basic models in information retrieval: Boolean, Vector
space, and Probabilistic.
5. Explain the rationale behind eliciting users' relevance feedback and how it
can improve search results. What are two mechanisms with which relevant terms
can be identified and extracted (hint: IQE and AQE)?
6. How does retrieval on the Web differ from retrieval with traditional
bibliographic databases (e.g., the nature of Web documents and the Web
environment, the "structuredness" of indexing, and the use of link data)? Give
the formula of Google's PageRank and explain its rationale.
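
For the PageRank part of question 6, the formula as published by Brin and Page is PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1..Tn are the pages linking to A, C(T) is the number of outgoing links on T, and d is the damping factor. A minimal Python sketch of iterating that formula follows. The three-page graph, the function name pagerank, d=0.85 and the iteration count are assumptions chosen for illustration, and the sketch assumes every page has at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over pages T linking to A.

    links maps each page to the list of pages it links to.
    Assumption: every page has at least one outgoing link.
    """
    pages = list(links)
    pr = {p: 1.0 for p in pages}  # start every page at 1.0
    for _ in range(iterations):
        new_pr = {}
        for p in pages:
            # Each page q that links to p passes on an equal share of its rank.
            incoming = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            new_pr[p] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Hypothetical toy web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
print(ranks)
```

The sketch shows the rationale in miniature: a page ranks higher when it is linked from pages that themselves rank high and that spread their rank over few outgoing links.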
--
※ Posted from: 批踢踢实业坊 (ptt.cc)
◆ From: 140.112.4.195
1F: push yoyo8089: >< 01/10 16:08
2F: → yoyo8089: Archived under Library and Information Science 01/10 16:09
3F: push abacada: 囧 (a comforting pat for the sub-board moderator in 1F?) 01/11 08:07