中国科学院大学《计算机算法设计与分析》第一次作业

分治

国科大(UCAS)卜东波老师算法课作业答案.

1 Divide and Conquer

You are interested in analyzing some hard-to-obtain data from two separate databases. Each database contains n numerical values, so there are 2n values total and you may assume that no two values are the same. You’d like to determine the median of this set of 2n values, which we will define here to be the n th smallest value.

However, the only way you can access these values is through queries to the databases. In
a single query, you can specify a value k to one of the two databases, and the chosen database will return the k th smallest value that it contains. Since queries are expensive, you would like to compute the median using as few queries as possible.

Give an algorithm that finds the median value using at most O(log n) queries.

  • The desciption of algorithm and the pseudo-code

算法的思路是在第一个数据库中找出一个元素$a_i$,在第二个数据库中找出一个元素$b_j$,并保证小于等于$a_i$和$b_j$的元素总数为$N$个,然后在所有小于$a_i$的元素中找出最大的元素$a_m$,在所有小于$b_j$的元素中找出最大的元素$b_n$,两个数据库所有的元素中第$n$小的元素就是$max(a_m,b_n)$.

伪代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
FindMedianValue(a[1...n], b[1...n])
i = 1, j = n;
if a[n] <= b[1] then
return a[n];
end if
if a[1] >= b[n] then
return b[n];
end if
while j - 1 >= 2 do
k = (i + j) / 2
flag = a[k] - b[n - k + 1];
if flag > 0 then
j = k;
else if flag < 0 then
i = k;
else
return a[k];
end if
end while
return max(a[k], b[n - k]);
  • Subproblem reduction graph

img

  • Prove the correctness

设第一个数据库中第k小的元素为$a_k$,第二个数据库中第k小的元素为$b_k$,$f(k)$是关于$k$的单调递增函数,定义如下:

特别地,当$f(1) \geq 0$时,$b_n$就是第n小的元素;当$f(n) \leq 0$时,$a_n$就是第n小的元素. 若能找到一个$k$,使得:

这表明:

设集合$S=\{a_1,a_2,…,a_k,b_1,b_2,…,b_{n-k}\}$,S的元素个数为$n$,由上可知,$max(a_k,b_{n-k})$就是第$n$小的元素.

  • The complexity of the algorithm

2 Divide and Conquer

Find the $k^{th}$ largest element in an unsorted array. Note that it is the $k^{th}$ largest element in the sorted order, not the $k^{th}$ distinct element.

INPUT: An unsorted array A and k.

OUTPUT: The $k^{th}$ largest element in the unsorted array A.

  • The desciption of algorithm and the pseudo-code

算法的思路是利用快速排序的思想,每次从当前序列中找出一个pivot,把比pivot大的元素都放左边,比pivot小的元素放右边,然后将pivot所在位置posk-1比较,若相等,则pivot就是第k大的元素;若pos大于k-1,则第k大的元素必然在pos的左边;若pos小于k-1,则第k大的元素必然在pos的右边.

伪代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
FindKthLargest(A, k)
left = 0, right = A.length - 1;
while true do
pos = partion(A, left, right)
if pos == k - 1 then
return A[pos];
else if pos > k - 1 then
right = pos - 1;
else
left = pos + 1;
end if
end while
  • Subproblem reduction graph

img

  • Prove the correctness

设A的元素个数为n(下标从0开始),第k大的元素为t,第一次选择的pivot的位置为p,则大于pivot的元素个数为p,小于pivot的元素个数为n - p - 1;若$p = k-1$,则pivot就是第k大的元素;若$p > k-1$,则$pivot \leq t$,t必然在pivot的左边;若$p < k - 1$,则$pivot \geq t$,t必然在pivot的右边. 因此,递归搜索可以找到第k大的元素t.

  • The complexity of the algorithm

3 Divide and Conquer

Consider an n-node complete binary tree $T$ , where $n = 2^d − 1$ for some $d$. Each node $v$ of T is labeled with a real number $x_v$ . You may assume that the real numbers labeling the nodes are all distinct. A node $v$ of $T$ is a local minimum if the label $x_v$ is less than the label $x_w$ for all nodes $w$ that are joined to $v$ by an edge.

You are given such a complete binary tree $T$ , but the labeling is only specified in the following implicit way: for each node $v$, you can determine the value $x_v$ by probing the node $v$. Show how to find a local minimum of $T$ using only $O(\log n)$ probes to the nodes of $T$ .

  • The desciption of algorithm and the pseudo-code

算法的思路是,先比较根节点和它的两个孩子节点,若根节点最小,则直接返回根节点,否则,递归查找较小的孩子节点对应的子树.

伪代码如下:

1
2
3
4
5
6
7
8
9
10
11
FindLocalMin(T)
v == probe(T.root());
v_l = probe(T.leftChild());
v_r = probe(T.rightChild());
if v < v_l and v < v_r then
return T.root()
else if v_l < v_r then
FindLocalMin(T.leftChild());
else then
FindLocalMin(T.rightChild());
end if
  • Subproblem reduction graph

img

  • Prove the correctness

设当前二叉树为T,其根节点为T.root,根节点的左右孩子节点分别为T.leftT.right,若T.root < T.leftT.root < T.right,则T.root就为局部最小节点,否则,选择左右孩子节点中较小的那个对应的子树,作为新的二叉树T,重复上述操作,必能找到局部最小节点.

  • The complexity of the algorithm

8 Divide and Conquer

The attached file Q8.txt contains 100,000 integers between 1 and 100,000 (each row has a
single integer), the order of these integers is random and no integer is repeated.

  1. Write a program to implement the Sort-and-Count algorithms in your favorite language, find the number of inversions in the given file.
  2. In the lecture, we count the number of inversions in O(n log n) time, using the Merge-Sort idea. Is it possible to use the Quick-Sort idea instead ? If possible, implement the algorithm in your favourite language, run it over the given file, and compare its running time with the one above. If not, give a explanation.

1.Java源码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
/**
* Created by szh on 17-10-5.
*/
public class Main {
final static int LEN = 100000;
private static long count = 0;
private static int[] array = new int[LEN];
private static int[] tempArr = new int[LEN];
public static void main(String[] args) {
// 读取Q8.txt中的数据存入array数组中,这里使用了文件的相对路径
try {
String pathName = "Q8.txt";
File fileName = new File(pathName);
InputStreamReader reader = new InputStreamReader(new FileInputStream(fileName));
BufferedReader buf = new BufferedReader(reader);
String line;
int i = 0;
while ((line = buf.readLine()) != null) {
// System.out.println(line);
array[i++] = Integer.parseInt(line);
}
count = 0;
mergeSort(0, array.length - 1);
System.out.println(count);
} catch (Exception e) {
e.printStackTrace();
}
}
// 归并排序的递归调用
public static void mergeSort(int low, int high) {
if (low >= high) {
return;
}
int mid = low + (high - low) / 2;
mergeSort(low, mid);
mergeSort(mid + 1, high);
mergeAndSort(low, mid, high);
}
// 将两个升序排列的数组合并成一个升序排列的数组,并计算两个数组直接首尾拼接的情况下包含多少个逆序对
public static void mergeAndSort(int low, int mid, int high) {
int temp = low;
int idx = low;
int center = mid + 1;
while (low <= mid && center <= high) {
if (array[low] <= array[center])
tempArr[idx++] = array[low++];
else {
tempArr[idx++] = array[center++];
// 如果array[low]大于array[center],则[low,mid]这个区间内的元素都大于array[center]
count += mid - low + 1;
}
}
while (low <= mid) {
tempArr[idx++] = array[low++];
}
while (center <= high) {
tempArr[idx++] = array[center++];
}
while (temp <= high) {
array[temp] = tempArr[temp++];
}
}
}

2.不能用快排,因为在快排的过程中,每次选择的pivot两侧的数组都是乱序的,如果计算这两个数组包含的逆序对个数,时间复杂度为$O(n^2)$.

9 Divide and Conquer

Implement the algorithm for the closest pair problem in your favourite language.

INPUT: n points in a plane.

OUTPUT: The pair with the least Euclidean distance.

Java源码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.StreamTokenizer;
import java.util.Arrays;
import java.util.Comparator;
/**
* User: szh
* Date: 2017-10-09
* Time: 下午10:09
*/
class point {
private double x;
private double y;
public double getX() {
return x;
}
public void setX(double x) {
this.x = x;
}
public double getY() {
return y;
}
public void setY(double y) {
this.y = y;
}
}
class XComparator implements Comparator<point> {
@Override
public int compare(point o1, point o2) {
return (int) (o1.getX() - o2.getX());
}
}
class YComparator implements Comparator<point> {
@Override
public int compare(point o1, point o2) {
return (int) (o1.getY() - o2.getY());
}
}
public class Main {
private static point[] points;
private static int MAX = Integer.MAX_VALUE;
private static int n;
public static void main(String[] args) throws Exception {
StreamTokenizer st = new StreamTokenizer(new BufferedReader(new InputStreamReader(System.in)));
while (st.nextToken() != StreamTokenizer.TT_EOF) {
n = (int) st.nval;
// n表示要输入n个点的坐标
if (0 == n)
break;
// 这句代码只是创建了一个长度为n的对象数组,里面的元素都还是空的!!!
points = new point[n];
for (int i = 0; i < n; i++) {
point p = new point();
st.nextToken();
p.setX(st.nval);
st.nextToken();
p.setY(st.nval);
points[i] = p;
}
// 将所有点按照x坐标从小到大排序
Arrays.sort(points, new XComparator());
System.out.println(String.format("%.2f", closestPairDis(0, n-1)).toString());
}
}
// 计算两点之间的距离
public static double dis(point p1, point p2) {
return Math.sqrt(Math.pow((p1.getX() - p2.getX()), 2) + Math.pow((p1.getY() - p2.getY()), 2));
}
public static double closestPairDis(int left, int right) {
if (left == right)
return MAX;
if (left + 1 == right) {
return dis(points[left], points[right]);
}
int mid;
double left_dis, right_dis, min_dis;
mid = left + (right - left) / 2;
left_dis = closestPairDis(left, mid); // 左边一半点对的最小距离
right_dis = closestPairDis(mid + 1, right); // 右边一半点对的最小距离
min_dis = Math.min(left_dis, right_dis);
point[] temp = new point[n];
int j = 0;
for (int i = left; i <= right; i++) {
if (Math.abs(points[i].getX() - points[mid].getX()) <= min_dis)
temp[j++] = points[i];
}
// 将所有点按照y坐标从小到大排序
Arrays.sort(temp, 0, j, new YComparator());
for (int i = 0; i < j; i++) {
for (int k = i + 1, m = 0; k < j && m < 11; k++) {
min_dis = Math.min(min_dis, dis(temp[i], temp[k]));
m++;
}
}
return min_dis;
}
}

版权声明

作者:萝卜姓胡
许可协议:Creative Commons Attribution-ShareAlike 4.0 International License
本文永久链接:http://hw2007.com/2018/10/04/中国科学院大学《计算机算法设计与分析》第一次作业/