11 Approximation¶

近似比

Suppose ALG is an \(\alpha\)-approximation algorithm for an optimization problem \(\Pi\) whose approximation ratio is tight. Then for every \(\varepsilon>0\) there is no \((\alpha-\varepsilon)\)-approximation algorithm for \(\Pi\) unless \(P = NP\).

答案

单个算法的紧近似比不等于问题的近似难度下界，可能存在更优算法，答案为 \(\textnormal{F}\)

As we know there is a 2-approximation algorithm for the Vertex Cover problem. Then we must be able to obtain a 2-approximation algorithm for the Clique problem, since the Clique problem can be polynomially reduced to the Vertex Cover problem.

答案

Clique 可多项式归约到 Vertex Cover，仅表示能用顶点覆盖的解判断团的解，并不意味着顶点覆盖的近似算法能直接转化为团的近似算法，故答案为 \(\textnormal{F}\)

那么团问题的近似比具体是多少呢？

已知 \(\rho_{\text{Vertex}} = \frac{C_{\text{Vertex}}}{C^*_{\text{Vertex}}} = \frac{K}{K^*} = 2\)

由团与顶点覆盖的互补关系得：

\(C_{\text{Clique}} = |V| - C_{\text{Vertex}} = |V| - K\)，\(C^*_{\text{Clique}} = |V| - C^*_{\text{Vertex}} = |V| - K^*\)

故 \(\rho_{\text{Clique}} = \frac{C^*_{\text{Clique}}}{C_{\text{Clique}}} = \frac{|V| - K^*}{|V| - K} = \frac{|V| - K^*}{|V| - 2K^*} = 1 + \frac{1}{\frac{|V|}{K^*} - 2} \neq 2\)

For an approximation algorithm for a minimization problem, given that the algorithm does not guarantee to find the optimal solution, the best approximation ratio possible to achieve is a constant \(α > 1\).

答案

部分 NP 难的最小化问题（如集合覆盖问题），其近似算法的最优近似比是对数级别（而非常数），且在 \(P≠NP\) 的假设下不存在常数近似比的算法，故答案为 \(\textnormal{F}\)

时间复杂度

PTAS 和 FPTAS

PTAS（多项式时间近似方案）：只要求 \(n\) 多项式，对 \(ε\) 的依赖可以是指数级
FPTAS（完全多项式时间近似方案）：额外要求 \(1/ε\) 多项式，对 \(ε\) 的依赖不能是指数级或对数级

An approximation scheme that runs in \(O(n^2/ϵ)\) for any fixed \(ϵ>0\) is a fully polynomial-time approximation scheme.

答案

\(\textnormal{T}\)

An approximation scheme that runs in \(O(n^2/3^\epsilon)\) for any fixed \(ϵ>0\) is a fully polynomial-time approximation scheme.

答案

\(\textnormal{F}\)

An approximation scheme that runs in \(O(n^23^ϵ)\) for any fixed \(ϵ>0\) is a polynomial-time approximation scheme.

答案

\(\textnormal{T}\)

An (\(1+ϵ\))-approximation scheme of time complexity \((n+1/ϵ)^{3}\) is a PTAS but not an FPTAS.

答案

\(\textnormal{F}\)

Bin Packing

算法对比

	策略	近似比	时间复杂度
Next Fit（NF）	按输入顺序处理物品，若当前物品能与上个物品放入同一桶则放入，否则创建新桶	2	O(N)
First Fit（FF）	按输入顺序处理物品，扫描并放入第一个能容纳该物品的现存桶，无则创建新桶	1.7	O(NlogN)
Best Fit（BF）	按输入顺序处理物品，放入现存能容纳该物品且剩余空间最小的桶，无则创建新桶	1.7	O(NlogN)

Next-Fit

假设最大物体的大小 \(0 < \alpha < 1\)，令 \(NF(L)\) 和 \(OPT(L)\) 分别表示通过 Next Fit 得到的桶数和最优解的桶数。那么对于任意的物品组 \(L\)，我们有结论：\(NF(L) < \rho \cdot OPT(L) + 1\)，并且：

\(\rho = \frac{1}{1-\alpha} \left( 0 \leq \alpha \leq \frac{1}{2} \right)\)
\(\rho = 2 \left( \alpha > \frac{1}{2} \right)\)

There exists an on-line algorithm for the bin packing problem that uses at most 3/2 the optimal number of bins for any instance.

答案

\(\textnormal{F}\)

In the bin packing problem, we are asked to pack a list of items \(L\) to the minimum number of bins of capacity 1. For the instance \(L\), let \(FF(L)\) denote the number of bins used by the algorithm First Fit. The instance \(L'\) is derived from \(L\) by deleting one item from \(L\). Then \(FF(L')\) is at most of \(FF(L)\).

答案

\(\textnormal{F}\)，据说如果改为NF算法就是T，待证明吧

We are given \(n\) items that are arranged in a row. From left to right, the size of items are \(s_1, s_2, ..., s_n\), where \(0 < s_i \leq 1\). In addition, we require that each bin can only contain contiguous block of items with total size at most 1. That is, if items \(i\) and \(j\) are in the same bin, then all items between \(i\) and \(j\) should also be in this bin. Our goal is to pack all these items in the fewest number of bins. We adopt the classical bin packing algorithms to solve this problem, which of the following statement is true?

\(\textnormal{A}\). Next Fit algorithm is an optimal algorithm.

\(\textnormal{B}\). First Fit algorithm is an optimal algorithm.

\(\textnormal{C}\). Best Fit algorithm is an optimal algorithm.

\(\textnormal{D}\). None of the other three options is correct.

答案

\(\textnormal{A}\)

Knapsack Problems

For the 0-1 version of the Knapsack problem, if we are greedy on taking the maximum profit or profit density, then the resulting profit must be bounded below by the optimal solution minus the maximum profit.

答案

由 \(P_{opt} <P_{frac} < P_{gre} + p_{max}\) 得 \(P_{gre} > P_{opt} - p_{max}\)，故答案为 \(\textnormal{T}\)

Consider the following variant of the knapsack problem. You are given \(k\) identical knapsacks and \(n\) items of different sizes. The profit of each item is equal to its size. Your goal is to select items and fill the knapsacks as much as possible subject to the capacity constraints. A greedy approach always packs the largest remaining item as long as there is enough room. As bin packing, if more than one knapsacks have room for the current item, one may use any packing rule (Next Fit, First Fit, and Best Fit). No matter what rule you apply, for any \(k≥1\), the approximation ratio of the greedy algorithm is always 2.

答案

\(\textnormal{T}\)

Consider the 0-1 knapsack problem with object weights W, profits V, and total weight limit B (means that W of any object is no larger than B.). In the class, we have learned that combining the greedy algorithm on maximum profit V and maximum profit-weight ratio V/W leads to an approximation algorithm which always produces a solution no less than 1/2 of the optimal solution. Now let us consider the following simplified greedy algorithm. The algorithm first conducts the following sorting w.r.t. profit-weight ratio:

Sort all objects according to the profit-weight ratio \(r_i = v_i/w_i\) so that \(r_1 \geq r_2 \geq \cdots \geq r_n\).

Let the sorted order of objects be \(a_1, ... , a_n\). The next step is to find the lowest k such that the total weight of the first k objects exceeds B. Finally, we pick the more valuable of \(\{a_1, ... , a_{k-1}\}\) and \(\{a_k\}\) as the final solution. Then which of the following arguments is correct:

\(\textnormal{A}\). The algorithm can generate a solution which is arbitrarily worse than the optimal solution.

\(\textnormal{B}\). The algorithm always returns a solution no less than \(\alpha\) of the optimal solution, while \(\alpha < 1/2\).

\(\textnormal{C}\). The algorithm always returns a solution no less than 1/2 of the optimal solution.

\(\textnormal{D}\). The algorithm always returns the optimal solution.

答案

采用反证法：假设 \(\text{ALG} < \frac{1}{2}\text{OPT}\)，即前 \(k-1\) 个的总价值 \(V_{k-1} < \frac{1}{2}\text{OPT}\) 且第 \(k\) 个的价值 \(v_k < \frac{1}{2}\text{OPT}\)，则有 \(V_{k-1} + v_k < \text{OPT}\)

而最优解的总重量 \(\leq B <\) 前 \(k\) 个的总重量，又物品按照单位重量利润降序排列，所以 \(\text{OPT} < V_k = V_{k-1} + v_k\)，矛盾！

假设不成立，即 \(\text{ALG} = \max(V_{k-1}, v_k) \geq \frac{1}{2}\text{OPT}\)，\(\alpha > \frac{1}{2}\)，答案为 \(\textnormal{C}\)

期望与极端情况的关系

Suppose that a randomized algorithm \(A\) has expected running time \(\Theta(n \log n)\) on any input of size \(n\). Then it is possible for some execution of algorithm \(A\) to take \(\Omega(2^n)\) time.

答案

\(\textnormal{T}\)

与 P/NP 问题结合考察

Which one of the following statements is FALSE?

\(\textnormal{A}\). SAT, Vertex Cover, Hamiltonian Cycle, Clique, Knapsack, Bin Packing, and Domination Set problems are all NP-completeness problems.

\(\textnormal{B}\). If there is a polynomial time \((1+\frac{1}{2n})\)-approximation algorithm for Vertex Cover with \(n\) being the total number of vertices in the graph, then P=NP.

\(\textnormal{C}\). If there is a polynomial time 3/2-approximation algorithm for K-Center, then P=NP.

\(\textnormal{D}\). Given a weighted directed acyclic graph (DAG) \(G\) and a source vertex \(s\) in \(G\), it is NP-hard to find the longest distances from \(s\) to all other vertices in \(G\).

答案

SAT、Vertex Cover、Hamiltonian Cycle、Clique、Knapsack、Bin Packing、Domination Set都是经典的 NP 完全问题，故 \(\textnormal{A}\) 表述正确。

Vertex Cover 若存在近似比小于 2 的多项式时间近似算法，则 \(\text{P=NP}\)，故 \(\textnormal{B}\) 表述正确。

K-Center 存在 2 \(-\) 近似的多项式时间算法，且若存在近似比小于 2 的多项式时间近似算法，则 \(\text{P=NP}\)，故 \(\textnormal{C}\) 表述正确。

带权有向无环图（DAG）的最长路径问题可以在多项式时间内解决：对 DAG 进行拓扑排序后，按拓扑序依次松弛边（更新最长距离），整个过程的时间复杂度是 \(O(V+E)\)（多项式时间），不是 NP 难问题，故 \(\textnormal{D}\) 表述错误。

综上所述，答案为 \(\textnormal{D}\)

Activity Selection Problem

In the Activity Selection Problem:

we are given a set of activities \(S = \{a_1, a_2, ..., a_n\}\) that wish to use a resource. Each \(a_i\) takes place during a time interval \([s_i, f_i)\). Activities \(a_i\) and \(a_j\) are compatible if \(s_i \geq f_j\) or \(s_j \geq f_i\) (i.e. their time intervals do not overlap). Our goal is to select a maximum-size subset of mutually compatible activities.

We propose a greedy rule "Shortest-First" that select the interval which is the shortest (but not overlapping the already chosen intervals). What is the approximation ratio of the greedy algorithm for the activity selection problem?

\(\textnormal{A}\). \(3/2\) \(\qquad\) \(\textnormal{B}\). \(2\) \(\qquad\) \(\textnormal{C}\). \(3\) \(\qquad\) \(\textnormal{D}\). no constant approximation ratio.

答案

\(\textnormal{B}\)

Revisit the activity selection problem. Given a set of activities \(S = \{a_1, a_2, \cdots, a_n\}\) that wish to use a resource, each \(a_i\) takes place during a time interval. The goal is to arrange as many compatible activities as possible. Recall that several greedy approaches are introduced in the class, among which the one selecting an activity with the shortest length, denoted by \(SF\), is not always optimal. However, we claim that \(SF\) accepts at least \(|OPT|/2\) activities, given that the optimal value is \(|OPT|\), where \(OPT\) is an optimal solution. Check if the following is a correct proof.

We use a technique, called the charging scheme, similarly as the amortized analysis. Suppose each accepted activity of \(OPT\) holds one dollar, which will be given to the activities accepted by \(SF\) in the following way. For any activity \(a_i\) of \(OPT\), if \(a_i\) is also accepted by \(SF\), give the dollar to itself. Otherwise, there must be some activity \(a_j\), accepted by \(SF\), is not compatible with \(a_i\). Then \(a_j\) receives one dollar from \(a_i\). Along this line, each activity of \(OPT\) sends out one dollar to an activity in \(SF\), while each activity of \(SF\) receives at most two dollars. It implies that \(SF\) accepts as least \(|OPT|/2\) activities.

答案

\(\textnormal{T}\)

Set Cover Problem

The set cover problem is one of Karp's 21 NP-complete problems. Given a set of elements \(U = \{1,2,...,n\}\) (called the universe) and a collection \(S\) of \(k\) sets whose union equals the universe, and and a cost function \(cost: S \to Q^+\), the weighted set cover problem is to identify the smallest cost of sub-collection of \(S\) whose union equals the universe. Consider an approximation algorithm as follows:

C = null
while ( C!=U ) {
    for each S[i], calculate the cost-effectiveness a[i] = cost(S[i]) / |S[i]-C|
    find the best cost-effective set, say S[j]
    pick S[j]
    C = C union S[j]
}
output the picked sets

If \(P \neq NP\), we can't find a constant \(\rho \in N^+\), so that the approximation ratio of the above algorithm is \(\rho\).

答案

集合覆盖的贪心近似算法每次选择成本效益最高的集合，即集合成本 ÷ 该集合未被覆盖的元素数最小的集合，直到覆盖全集 \(U\)，其近似比是第 \(n\) 个调和数 \(H_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}\)，随全集大小 \(n\) 对数增长，不是固定常数，故答案为 \(\textnormal{T}\)

近似比的具体计算过程

设 \(\text{OPT}\) 是最优集合覆盖的成本，即最小成本，贪心算法选择的集合序列为 \(S_1, S_2, \dots, S_t\)

贪心算法的总成本是 \(\sum_{i=1}^t \text{cost}(S_i)\)，可将 \(S_i\) 的成本均匀分摊给它覆盖的新元素 \(A_i\)，贪心总成本等价于 \(\sum_{e \in U}\)（元素 \(e\) 被覆盖时的分摊成本）

取最优解中的任意一个集合 \(T \in \text{OPT}\)，当 \(T\) 中还有 \(k\) 个元素未被覆盖时，贪心算法选择的集合 \(S_i\) 的成本效益不会比 \(T\) 此时的成本效益差，即 \(\frac{\text{cost}(S_i)}{|A_i|} \leq \frac{\text{cost}(T)}{k}\)，此时 \(S_i\) 分摊给每个新元素的成本 \(\leq \frac{\text{cost}(T)}{k}\)

对于 \(T\) 中的元素，被覆盖时对应的 \(k\) 会从 \(|T|\) 逐步降到 \(1\)，\(T\) 中所有元素的分摊成本之和 \(\leq \text{cost}(T) \times \left( \frac{1}{|T|} + \frac{1}{|T|-1} + \dots + 1 \right) = \text{cost}(T) \times H_{|T|} \leq \text{cost}(T) \times H_n = \text{OPT} \times H_n\)

故贪心算法的近似比是 \(H_n\)（随 \(n\) 对数增长的调和数）

Traveling Salesman Problem

Without any assumptions on the distances, if \(\text{P} \neq \text{NP}\), there is no \(ρ-\)approximation for TSP (Travelling Salesman Problem) for any \(ρ ≥ 1\).

答案

\(\textnormal{T}\)

综合

Which of the following statement is true?

\(\textnormal{A}\). Let A and B be optimization problems where it is known that A reduces to B in polynomial time. Additionally, suppose that there exists a polynomial-time 2-approximation for B. Then there must exist a polynomial time 2-approximation for A.

\(\textnormal{B}\). There exists a polynomial-time 2-approximation algorithm for the general Traveling Salesman Problem.

\(\textnormal{C}\). Suppose that you have two deterministic online algorithms, A1 and A2, with a competitive ratios c1 and c2 respectively. Consider the randomized algorithm A that flips a fair coin once at the beginning; if the coin comes up heads, it runs A1 from then on; if the coin comes up tails, it runs A2 from then on. Then the expected competitive ratio of A is at least min{c1,c2}.

\(\textnormal{D}\). A randomized algorithm for a decision problem with one-sided-error and correctness probability 1/3 (that is, if the answer is YES, it will always output YES, while if the answer is NO, it will output NO with probability 1/3) can always be amplified（放大） to a correctness probability of 99%.

答案

一般的多项式时间归约（如用于 NP 完全性证明的归约）通常不保持近似比，需要特定的归约（如 L-归约或 AP-归约）才能保证近似性，故 \(\textnormal{A}\) 表述错误。

一般 TSP 不满足三角不等式，不存在任何常数因子近似算法（除非 P=NP），只有度量 TSP（满足三角不等式）才有 2-近似算法，故 \(\textnormal{B}\) 表述错误。

随机算法的期望竞争比可能低于两个确定性算法的最小竞争比，因为随机化可能平均化最坏情况表现，从而得到更优的期望竞争比，故 \(\textnormal{C}\) 表述错误。

对于 NO 实例，通过多次独立运行算法，只要有一次输出 NO 即可正确判定，错误概率随运行次数指数下降，可通过多项式次运行将正确概率提升至 99%，，故 \(\textnormal{D}\) 表述正确。

综上，答案为 \(\textnormal{D}\)

K-center Problem

K-center problem: Given \(N\) cities with specified distances, one wants to build \(K\) warehouses in different cities and minimize the maximum distance of a city to a warehouse.

Which of the following is false?

\(\textnormal{A}\). Given any constant \(\alpha > 1\), unless \(\text{P} = \text{NP}\), otherwise the \(K\)-center problem cannot be approximated within the factor \(\alpha\) if the graph \(G\) admits an arbitrary distance function.

\(\textnormal{B}\). If the graph \(G\) obeys metric distance, then there is a 2-approximation algorithm for the \(K\)-center problem.

\(\textnormal{C}\). The \(K\)-center problem can be solved optimally in polynomial time if \(K\) is a given constant.

\(\textnormal{D}\). If the graph \(G\) obeys Euclidean distance, then there exists a PTAS for the \(K\)-center problem.

答案

在欧几里得距离下，K-center 问题是 NP 难问题，且已知不存在多项式时间近似方案（PTAS），除非 P=NP。虽然存在 2-近似算法，但对于任意 ε > 0 的 (1+ε)-近似算法（即 PTAS）并不存在，故答案为 \(\textnormal{D}\)

Schedule Problem

You have 20 identical cores on which you want to schedule some \(n\) jobs. Each job \(j \in \{1,2,\dots,n\}\) has a processing time \(p_j > 0\). If \(S_i\) is the set of jobs assigned to core \(i\), let the load be \(\sum_{j \in S_i} p_j\). Now, you want to partition the jobs among the cores to minimize the load of the most-loaded core.

We design a greedy algorithm that picks any unassigned job, and assign it to the core with the least current load.

What is the approximation ratio of the greedy algorithm? (Choose the smallest bound that applies.)

答案

近似比 \(= 2- \frac{1}{m} = 1.95\)