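Author: wanquan (X-Y軸的世界)
Board: NTU-Exam
Title: [Exam] Academic year 99, first semester, Kun-Mao Chao (趙坤茂), Algorithms for Biological Sequence Analysis
Time: Sun May 15 20:02:06 2011
Course name: Algorithms for Biological Sequence Analysis
Course type: Algorithms
Instructor: Kun-Mao Chao (趙坤茂)
College: Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (YYYY.MM.DD): 2010.11.04
Time limit (minutes): 3H
Reward requested: yes
Exam questions: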
Problem 1 (15%):
Suppose we are given a very long DNA sequence where the occurrence probabilities of
nucleotides A (adenine), C (cytosine), G (guanine), T (thymine) are 0.1, 0.3, 0.4, and 0.2, respectively.
(a) (10%): Construct a Huffman code for them. You should work out the binary tree construction
as well as the code assignment.
(b) (5%): By the above Huffman coding scheme, what is the binary string for a 10-nucleotide DNA
sequence "GGGCTTCACG"?
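For readers who want to check a hand-built tree, here is a minimal Python sketch of the standard greedy Huffman construction (repeatedly merge the two lightest subtrees). The function names `huffman_code` and `encode` and the 0-for-left, 1-for-right labeling are assumptions made for illustration; any consistent labeling gives an equally valid code, so only the code lengths are fixed by the probabilities.

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for a dict {symbol: probability}."""
    # Heap entries are (weight, tie_breaker, tree); a tree is either a
    # symbol (leaf) or a pair (left, right). The unique tie_breaker keeps
    # Python from ever comparing two trees directly.
    heap = [(w, k, sym) for k, (sym, w) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # lightest subtree
        w2, _, t2 = heapq.heappop(heap)   # second lightest subtree
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))
        counter += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node
            walk(tree[0], prefix + "0")   # assumed convention: left = 0
            walk(tree[1], prefix + "1")   # assumed convention: right = 1
        else:                             # leaf
            codes[tree] = prefix or "0"

    walk(heap[0][2], "")
    return codes

def encode(text, codes):
    """Concatenate the codewords of the symbols in text."""
    return "".join(codes[ch] for ch in text)

codes = huffman_code({"A": 0.1, "C": 0.3, "G": 0.4, "T": 0.2})
bits = encode("GGGCTTCACG", codes)
```

With these probabilities the merges are A+T, then (A,T)+C, then G with the rest, so G gets a 1-bit codeword, C a 2-bit codeword, and A and T 3-bit codewords; the 10-nucleotide string therefore encodes to 19 bits under any labeling.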
Problem 2 (15%): In class, we introduced an O(n log n)-time algorithm for finding a longest increasing
subsequence. Use $\langle 8, 2, 6, 4, 5, 7, 3, 1, 12, 9, 10 \rangle$ to explain how the algorithm works.
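The following Python sketch shows one common O(n log n) approach, which may or may not match the in-class presentation exactly: keep, for each length k, the smallest possible last element of an increasing subsequence of length k, and place each new element by binary search. The name `longest_increasing_subsequence` and the predecessor-array reconstruction are assumptions for illustration.

```python
from bisect import bisect_left

def longest_increasing_subsequence(seq):
    """Return one longest strictly increasing subsequence of seq."""
    tail_vals = []   # tail_vals[k]: smallest last value of an increasing subsequence of length k+1
    tail_idx = []    # index in seq of that last value
    prev = [-1] * len(seq)   # predecessor links for reconstruction
    for i, x in enumerate(seq):
        k = bisect_left(tail_vals, x)          # first position with value >= x
        if k > 0:
            prev[i] = tail_idx[k - 1]          # extend the best subsequence of length k
        if k == len(tail_vals):
            tail_vals.append(x)                # x extends the longest subsequence so far
            tail_idx.append(i)
        else:
            tail_vals[k] = x                   # x is a smaller tail for length k+1
            tail_idx[k] = i
    # Walk the predecessor links back from the tail of the longest subsequence.
    result = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        result.append(seq[i])
        i = prev[i]
    return result[::-1]

lis = longest_increasing_subsequence([8, 2, 6, 4, 5, 7, 3, 1, 12, 9, 10])
```

On the exam sequence the tail array evolves as [8], [2], [2, 6], [2, 4], [2, 4, 5], [2, 4, 5, 7], [2, 3, 5, 7], [1, 3, 5, 7], [1, 3, 5, 7, 12], [1, 3, 5, 7, 9], [1, 3, 5, 7, 9, 10], giving length 6; this sketch reconstructs 2, 4, 5, 7, 9, 10.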
Problem 3 (10%): Given a sequence of real numbers $A = \langle a_1, a_2, \ldots, a_n \rangle$, the maximum-sum segment
problem is to find a consecutive subsequence, i.e., a substring or segment, in A with the maximum
sum. Let prefix sum $P[i] = \sum_{j=1}^{i} a_j$ be the sum of the first i elements. Explain how to use the prefix
sum to deliver the maximum-sum segment in O(n) time.
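One way to read the hint: the sum of the segment from position i to j equals P[j] - P[i-1], so for each j it suffices to remember the smallest prefix sum seen so far. The Python sketch below follows that idea; the function name `max_sum_segment`, the 1-based return indices, and the restriction to non-empty segments are assumptions made for illustration.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end) for a non-empty list a.

    start and end are 1-based inclusive positions, so the segment sum
    equals P[end] - P[start - 1] with P[0] = 0.
    """
    prefix = 0                    # running prefix sum P[j]
    min_prefix, min_idx = 0, 0    # smallest P[i] seen so far, with i < j
    best = None
    for j, x in enumerate(a, start=1):
        prefix += x
        candidate = prefix - min_prefix          # best segment ending at j
        if best is None or candidate > best[0]:
            best = (candidate, min_idx + 1, j)
        if prefix < min_prefix:                  # update after use, so i < j holds
            min_prefix, min_idx = prefix, j
    return best

result = max_sum_segment([2, -3, 4, -1, 2, -5, 3])
```

Each element is visited once and only a constant number of values are kept, so the scan runs in O(n) time.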
In the following, we are given two sequences $A = \langle a_1, a_2, \ldots, a_m \rangle$ and $B = \langle b_1, b_2, \ldots, b_n \rangle$. An alignment
of A and B is obtained by introducing dashes into the two sequences such that the lengths of
the two resulting sequences are identical and no column contains two dashes. Let $\Sigma$ denote the input
symbol alphabet. A score $\sigma(a, b)$ is defined for each $(a, b) \in \Sigma \times \Sigma$. The score of an alignment is the
sum of $\sigma$ scores of all columns with no dashes minus the penalties of the gaps.
Problem 4 (25%): In this problem, we employ a simple scoring scheme where each gap symbol is penalized
by a nonnegative constant $\beta$. Let $S[i, j]$ denote the score of an optimal alignment between
$\langle a_1, a_2, \ldots, a_i \rangle$ and $\langle b_1, b_2, \ldots, b_j \rangle$. With proper initializations, $S[i, j]$ can be computed by the following
recurrences:
$$
S[i, j] = \max
\begin{cases}
S[i-1, j] - \beta \\
S[i, j-1] - \beta \\
S[i-1, j-1] + \sigma(a_i, b_j)
\end{cases}
$$
(a) (15%): Write down a complete pseudo-code for computing S[m; n] in O(mn) time and O(m+n)
space. All initializations should be included in the pseudo-code.
(b) (10%): Assume that we allow at most three gaps in an alignment. Give a method (as efficient as
possible) for computing the score of an optimal alignment.
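As an illustration of the space bound in part (a), the Python sketch below fills the table row by row and keeps only the previous and current rows, so working space is proportional to the length of B. The function name `alignment_score` and the callable `sigma` are assumptions for illustration; boundary cells are initialized to the cost of aligning a prefix entirely against dashes. It addresses part (a) only.

```python
def alignment_score(a, b, sigma, beta):
    """Score of an optimal global alignment with per-symbol gap penalty beta.

    sigma(x, y) is the column score for aligning symbols x and y.
    Only two rows of the table are kept at any time.
    """
    n = len(b)
    # Row 0: the empty prefix of a against b[0:j] costs j gap symbols.
    prev = [-beta * j for j in range(n + 1)]
    for i in range(1, len(a) + 1):
        curr = [-beta * i] + [0] * n      # column 0: a[0:i] against the empty prefix
        for j in range(1, n + 1):
            curr[j] = max(
                prev[j] - beta,                            # a[i-1] aligned with a dash
                curr[j - 1] - beta,                        # b[j-1] aligned with a dash
                prev[j - 1] + sigma(a[i - 1], b[j - 1]),   # a[i-1] aligned with b[j-1]
            )
        prev = curr
    return prev[n]

score = alignment_score("ACGT", "AGT", lambda x, y: 2 if x == y else -1, beta=2)
```

In the usage line, three matching columns score 6 and the single dash costs 2, so the optimal score is 4.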
Problem 5 (20%): In affine gap penalties, a gap of length k is penalized by $\alpha + k \times \beta$, where $\alpha$ and $\beta$ are
both nonnegative constants.
(a) (10%): Give the recurrence relations for computing the score of an optimal (global) alignment
between A and B. Justify your recurrence relations and include all initializations.
(b) (10%): Give the recurrence relations for computing the score of an optimal local alignment between
A and B. Explain your recurrence relations and include all initializations.
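One widely used formulation for affine gaps keeps three tables: S for the best score overall, D for alignments ending with a symbol of A against a dash, and I for alignments ending with a dash against a symbol of B. The Python sketch below is written under that assumption and is not necessarily the formulation expected on the exam; the names `affine_align_score`, `D`, `I`, and the `local` flag are illustrative. Opening a gap costs alpha + beta for its first symbol and beta for each further symbol, matching the penalty alpha + k * beta.

```python
NEG_INF = float("-inf")

def affine_align_score(a, b, sigma, alpha, beta, local=False):
    """Optimal alignment score with gap penalty alpha + k * beta.

    local=False: global alignment of all of a with all of b.
    local=True:  best-scoring pair of substrings (scores floored at 0).
    """
    m, n = len(a), len(b)
    S = [[0] * (n + 1) for _ in range(m + 1)]
    D = [[NEG_INF] * (n + 1) for _ in range(m + 1)]   # ends with a[i-1] over a dash
    I = [[NEG_INF] * (n + 1) for _ in range(m + 1)]   # ends with a dash over b[j-1]
    if not local:
        # Global boundaries: a prefix aligned against one gap of that length.
        for i in range(1, m + 1):
            D[i][0] = S[i][0] = -(alpha + i * beta)
        for j in range(1, n + 1):
            I[0][j] = S[0][j] = -(alpha + j * beta)
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Extend an existing gap (beta) or open a new one (alpha + beta).
            D[i][j] = max(D[i - 1][j] - beta, S[i - 1][j] - alpha - beta)
            I[i][j] = max(I[i][j - 1] - beta, S[i][j - 1] - alpha - beta)
            S[i][j] = max(S[i - 1][j - 1] + sigma(a[i - 1], b[j - 1]),
                          D[i][j], I[i][j])
            if local:
                S[i][j] = max(S[i][j], 0)   # a local alignment may start anywhere
                best = max(best, S[i][j])   # and may end anywhere
    return best if local else S[m][n]

unit = lambda x, y: 1 if x == y else -1
global_score = affine_align_score("AAACCC", "AAA", unit, alpha=2, beta=1)
local_score = affine_align_score("TTACGTT", "GGACGGG", unit, alpha=2, beta=1, local=True)
```

In the global example, three matches score 3 and one gap of length 3 costs 2 + 3 = 5, for a total of -2; in the local example the shared substring ACG scores 3. The sketch stores full tables for clarity; a row-by-row version would reduce the space.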
Problem 6 (15%): Consider the problem of computing all $\Delta$-points of two sequences of lengths m and
n, where $m \ll n$. Describe a method for computing all $\Delta$-points that works in O(mn) time and
$O(m^{11/10} + n)$ working space.
--
Nothing is Impossible
--
※ Posted from: PTT (批踢踢實業坊, ptt.cc)
◆ From: 140.112.30.46
1F:→ andy74139 : Archived under the CSIE department!! 05/15 20:26
2F:→ andy74139 : Is the exam date perhaps wrong?? 05/15 20:27
※ Edited: wanquan from: 140.112.30.46 (05/15 23:14)
3F:→ wanquan : Corrected.. thanks 05/15 23:14