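Author: wanquan (X-Y軸的世界)
Title: [Exam] Academic year 96, first semester, 趙坤茂, Biological Sequence Analysis Algorithms
Time: Sun May 22 12:38:19 2011
Course name: Biological Sequence Analysis Algorithms
Course type: Algorithms
Instructor: 趙坤茂
College: Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (Y/M/D): 2005.11.09
Time limit (minutes): 3H
Reward requested: yes
Exam questions: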
Note: Both the correctness and efficiency of your algorithm will be evaluated.
You are required to justify your answer.
Problem 1
Given a real number sequence A, please describe an algorithm for computing
the maximum-average segment ending at each index of A. You can use the following
sequence to describe your algorithm.
3 5 6 8 3 6 7 3
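
One possible approach to Problem 1, offered as a sketch rather than the expected answer: the average of A[j+1..i] equals the slope between the prefix-sum points (j, P[j]) and (i, P[i]), so the best start for index i is the vertex just before point i on the lower convex hull of the points 0..i. Each point is pushed and popped at most once, which gives linear time. The function name, the 0-based indexing, and the choice of the longer segment on ties are illustrative assumptions.

```python
def max_avg_segments_ending_at(a):
    """For each index i, return (start, average) of a maximum-average
    segment a[start..i] (0-based, inclusive). Runs in O(n) overall."""
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)

    def cross(p, q, r):
        # > 0 when prefix points p -> q -> r make a left (convex) turn
        return ((q - p) * (prefix[r] - prefix[p])
                - (prefix[q] - prefix[p]) * (r - p))

    hull = [0]      # indices of prefix points on the lower convex hull
    result = []
    for i in range(1, len(a) + 1):
        # Drop hull points that lie on or above the line to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], i) <= 0:
            hull.pop()
        j = hull[-1]                      # maximizes the slope to point i
        result.append((j, (prefix[i] - prefix[j]) / (i - j)))
        hull.append(i)
    return result


example = max_avg_segments_ending_at([3, 5, 6, 8, 3, 6, 7, 3])
```

For the sequence above, the resulting averages are 3, 5, 6, 8, 17/3, 6, 7, and 5.5; for instance, the best segment ending at the last element is 6 8 3 6 7 3.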
Problem 2
You are given a real number sequence A. Design a linear time algorithm
for computing the nearest larger elements for all elements of A. You can use the
following sequence to illustrate your algorithm.
A                       3 5 6 8 3 6 7 3
nearest larger element  5 6 8 x 8 7 8 7
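
A minimal sketch of one linear-time approach to Problem 2, using a stack of indices with decreasing values, scanned once from each side. The example row suggests that "nearest" looks in both directions and that a distance tie goes to the left element (the fifth entry is 8, not 6); both readings are assumptions, as is the function name.

```python
def nearest_larger(a):
    """Value of the nearest strictly larger element for each position,
    or None if no larger element exists. Distance ties go to the left
    (an assumption chosen to match the example in the problem)."""
    n = len(a)

    def scan(order):
        # Index of the nearest strictly larger element seen earlier in `order`.
        res = [None] * n
        stack = []                      # indices with strictly decreasing values
        for i in order:
            while stack and a[stack[-1]] <= a[i]:
                stack.pop()
            if stack:
                res[i] = stack[-1]
            stack.append(i)
        return res

    left = scan(range(n))
    right = scan(range(n - 1, -1, -1))
    out = []
    for i in range(n):
        l, r = left[i], right[i]
        if l is None and r is None:
            out.append(None)
        elif r is None or (l is not None and i - l <= r - i):
            out.append(a[l])
        else:
            out.append(a[r])
    return out


print(nearest_larger([3, 5, 6, 8, 3, 6, 7, 3]))
# [5, 6, 8, None, 8, 7, 8, 7]
```

Each index is pushed and popped at most once per scan, so the total work is O(n).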
Problem 3
What is a "better partner" of each index defined in RMSQ?
Explain why it is better.
Problem 4
Given two sequences A and B, we say that a common substring-subsequence
is a substring, defined as a contiguous subsequence, of A which is also
a subsequence of B.
Give an efficient algorithm for finding the longest common substring-subsequence
between A and B.
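
A sketch of one possible O(|A||B|) dynamic program for Problem 4, not necessarily the intended answer. Let L[i][j] be the length of the longest suffix of A[1..i] that is a subsequence of B[1..j]. If A[i] = B[j], then L[i][j] = L[i-1][j-1] + 1; otherwise L[i][j] = L[i][j-1]. The answer is the maximum of L[i][|B|] over all i. The function name is hypothetical.

```python
def longest_common_substring_subsequence(a, b):
    """Longest substring of `a` that is also a subsequence of `b`."""
    n = len(b)
    prev = [0] * (n + 1)               # row L[i-1][*]
    best_len, best_end = 0, 0
    for i in range(1, len(a) + 1):
        cur = [0] * (n + 1)            # row L[i][*], with L[i][0] = 0
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1   # match a[i] with b[j]
            else:
                cur[j] = cur[j - 1]        # b[j] cannot end the suffix
        if cur[n] > best_len:
            best_len, best_end = cur[n], i
        prev = cur
    return a[best_end - best_len:best_end]


print(longest_common_substring_subsequence("abcde", "xaxbxcx"))  # abc
```

Only two rows are kept at a time, so the working space is O(|B|).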
Problem 5
Assume the following scoring scheme:
A match is given a bonus +8
A mismatch is penalized by -5
A gap of length k, where k is between 1 and 50, is given a penalty 6+3k
A gap of length more than 50 is given a constant penalty 156.
Give the recurrences for computing the score of an optimal global alignment. Explain
why they work.
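
One way to read the scheme in Problem 5 is that a gap of length k costs min(6+3k, 156), since 6+3*50 = 156. Under that reading, a possible set of recurrences keeps two gap states per direction: one priced as an affine gap (open 6, extend 3) and one priced as a flat gap (open 156, extend 0). The maximization then picks whichever is cheaper for each gap. The sketch below is an illustration under that assumption; the function and table names are made up here.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=8, mismatch=-5,
                           gap_open=6, gap_extend=3, gap_cap=156):
    """Optimal global alignment score when a gap of length k costs
    min(gap_open + gap_extend * k, gap_cap)."""
    m, n = len(a), len(b)

    def table():
        return [[NEG_INF] * (n + 1) for _ in range(m + 1)]

    best = table()              # best[i][j]: best score for a[:i] vs b[:j]
    xa, xc = table(), table()   # last column is a[i-1] over a gap (affine / flat)
    ya, yc = table(), table()   # last column is a gap over b[j-1] (affine / flat)
    best[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            cands = []
            if i > 0 and j > 0:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                cands.append(best[i - 1][j - 1] + sub)
            if i > 0:
                xa[i][j] = max(best[i - 1][j] - gap_open - gap_extend,
                               xa[i - 1][j] - gap_extend)
                xc[i][j] = max(best[i - 1][j] - gap_cap, xc[i - 1][j])
                cands += [xa[i][j], xc[i][j]]
            if j > 0:
                ya[i][j] = max(best[i][j - 1] - gap_open - gap_extend,
                               ya[i][j - 1] - gap_extend)
                yc[i][j] = max(best[i][j - 1] - gap_cap, yc[i][j - 1])
                cands += [ya[i][j], yc[i][j]]
            if cands:
                best[i][j] = max(cands)
    return best[m][n]


print(global_alignment_score("ACGT", "AGT"))   # 15: three matches, one gap of length 1
```

Splitting one gap into several pieces never helps, because min(6+3k, 156) is subadditive in k, so allowing a new gap to open from any state does not change the optimum.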
Problem 6
a. Describe Hirschberg's linear space idea for delivering an optimal alignment.
Explain why the time complexity remains the same as that of merely computing the score
of an optimal alignment. Show that directly applying this approach to a band might
cause an additional log n factor in time, where n is the sequence length.
b. Describe a linear space alignment method for aligning two sequences in a band.
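
For part a of Problem 6, the divide-and-conquer idea can be sketched as follows. This is a simplified illustration that assumes a linear gap penalty (each gap column costs the same) and match >= mismatch, not the scoring scheme of Problem 5; the function names and default scores are assumptions. It does not cover the banded case in part b.

```python
def last_row(a, b, match, mismatch, gap):
    """Scores of aligning all of `a` with every prefix of `b`, in O(len(b)) space."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev


def hirschberg(a, b, match=8, mismatch=-5, gap=-9):
    """Return an optimal global alignment (two gapped strings) in linear space."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        j = max(b.find(a), 0)          # prefer a matching position if one exists
        sub = match if a == b[j] else mismatch
        if sub >= 2 * gap:             # pairing beats two extra gap columns
            return "-" * j + a + "-" * (len(b) - j - 1), b
        return a + "-" * len(b), "-" + b
    mid = len(a) // 2
    left = last_row(a[:mid], b, match, mismatch, gap)
    right = last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    # Best column at which an optimal path crosses the middle row.
    k = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    top = hirschberg(a[:mid], b[:k], match, mismatch, gap)
    bottom = hirschberg(a[mid:], b[k:], match, mismatch, gap)
    return top[0] + bottom[0], top[1] + bottom[1]


def score_alignment(x, y, match=8, mismatch=-5, gap=-9):
    """Score of a gapped alignment under the same linear gap penalty."""
    return sum(gap if "-" in (p, q) else (match if p == q else mismatch)
               for p, q in zip(x, y))


x, y = hirschberg("AGTACGCA", "TATGC")
print(x)
print(y)
```

Each level of recursion works on subproblems whose total area is about half that of the level above, so the total work is bounded by roughly twice the cost of one full score computation.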
Problem 7
a. What is haplotype inference?
b. Give an Integer Quadratic Programming function for haplotype inference.
c. What are tag SNPs?
d. What is an LD bin?
e. Explain why the problem of finding a minimum set of LD bins is related to
the minimum clique cover problem.
Problem 8
Consider the problem of computing all delta points of two sequences of lengths M and N,
where M < N.
a. Describe a method that works in O(MN) time and O(N^1/4N+N) working space
b. Describe a method that works in O(MN) time and O(M^(1+1/4)+N) working space.
