[Exam] 110-1 Hsin-Hsi Chen (陈信希), Information Retrieval and Extraction, Midterm

Time: Tue Dec 30 09:34:28 2025

Course name: Information Retrieval and Extraction
Course type: Elective, Department of Computer Science and Information Engineering
Instructor: Hsin-Hsi Chen (陈信希)
College: College of Electrical Engineering and Computer Science
Department: Department of Computer Science and Information Engineering
Exam date (Y/M/D): 2021/11/11
Time limit (minutes): 180

Questions:

1. Term frequency and inverse document frequency are commonly used to measure the importance of a term in a document and a query. We aim to select terms with discriminative power within a document and between documents to represent a document. How do term frequency and inverse document frequency achieve this goal? (10 points)

2. A long document is usually composed of passages describing several topics. On the one hand, it is relatively easier to retrieve long documents than short documents with a keyword-based approach. On the other hand, the representation of long documents tends to be vague when an average word (term) embedding approach is used for aggregation. Do you have any ideas for dealing with these issues in the keyword-based approach and the term embedding-based approach? (10 points)

3. In language modeling, each individual document can be considered as a document model for retrieval. Besides, a document collection can also be used to learn a collection model for smoothing in retrieval. Please describe the idea of integrating the document model and the collection model for IR. (10 points)

4. Modeling term-term relationships is important in information retrieval. Various methods, from the conventional counting-based approach to the current prediction-based approach, have been proposed. Please show one method from each approach to compute inter-term relationships. (10 points)

5. (a) What are typical similarities and topical similarities? (5 points)
   (b) Term representations learned from models based on different sizes of contexts (e.g., document, short window size, or short context) may capture different similarities (typical similarities or topical similarities). Please explain this statement. (5 points)
   (c) Exact matching and embedding-space-based matching have different effects on retrieval. Please discuss this point. (5 points)

6. An IR model is a quadruple $[D, Q, F, R(q_i, d_j)]$ where $D$ is a set of logical views for the documents in the collection, $Q$ is a set of logical views for the user queries, $F$ is a framework for modeling documents and queries, and $R(q_i, d_j)$ is a ranking function. Please specify the framework $F$ and the ranking function $R$ for each of the following models. (15 points)
   (a) BM25 Model
   (b) Translation Model
   (c) Term Embedding Model

7. Query expansion aims to introduce new query terms to the original query. Please specify how query expansion is introduced to each of the following models. (15 points)
   (a) Vector Space Model
   (b) Language Model
   (c) Term Embedding Model

8. In SIGIR 2016, two tutorial speakers classified "Question Answering from Documents" as an "easy" problem in IR. In contrast, they regarded "Question Answering from Knowledge Base" as a "hard" problem in IR. Do you agree with such a classification? Please show your thoughts. (10 points)

9. Neural information retrieval systems typically use a chaining pipeline. Are there any practical considerations? Please suggest a cascade pipeline to explain your idea. (10 points)

10. We often encounter mis-conception, mis-translation, and mis-formulation problems when transforming an information need into a query in ad hoc retrieval. You have learned the fundamentals of information retrieval during the first half of the semester. Please show the lessons for dealing with these problems. (10 points)

--
Episode 01: I feel like I heard this in class
Episode 02: That really is too despairing
Episode 03: There's nothing left to hope for
Episode 04: Failing and the 21 rule are both real
Episode 05: There's no way I'd all pass
Episode 06: Something is definitely wrong with this exam paper
Episode 07: Can you face your true score?
Episode 08: I really am an idiot
Episode 09: With grades like this, the professor will never let me pass
Episode 10: I won't rely on past exams anymore
Episode 11: The make-up exam that remains at the end
Episode 12: My dearest credits
--

※ Posted from: 批踢踢实业坊 (ptt.cc), from: 111.249.65.236 (Taiwan)
※ Article URL: https://webptt.com/cn.aspx?n=bbs/NTU-Exam/M.1767058471.A.939.html

1F → rod24574575: Archived under the CSIE department! 12/30 22:55
