Multi-document summarization is the task of generating a single summary from multiple documents; the summary is expected to retain as much of the information in the original documents as possible. Previous work approaches this by (i) formulating the task as a combinatorial optimization problem that simultaneously maximizes relevance and minimizes redundancy, or (ii) formulating the task as a graph-cut problem. This paper improves summary quality by combining these two approaches into a single optimization problem formulated as an Integer Linear Program (ILP). Although an ILP can be solved with an ILP solver, the problem is NP-hard, and it is difficult to obtain the exact solution in situations where immediate responses are needed. Our solution is to propose optimization heuristics that exploit Lagrangian relaxation to obtain good approximate solutions within feasible computation times. Experiments on the Document Understanding Conference 2004 (DUC 04) dataset show that our Lagrangian relaxation based heuristics complete in feasible computation time and achieve higher ROUGE scores than state-of-the-art approximate methods.
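To illustrate the general idea of Lagrangian relaxation in sentence selection, the following is a minimal sketch under strong simplifying assumptions; it is not the formulation proposed in the paper. It assumes a toy objective that only maximizes total relevance under a summary length budget, with no redundancy or graph-cut terms. The function name `lagrangian_select`, its parameters, the step-size schedule, and the greedy repair step are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def lagrangian_select(
    relevance: List[float],
    lengths: List[int],
    budget: int,
    iterations: int = 100,
    step: float = 0.1,
) -> Tuple[List[int], float]:
    """Approximately maximize sum(relevance[i]) subject to sum(lengths[i]) <= budget.

    Toy model (an assumption of this sketch): every length is positive and
    the only constraint is the length budget, which is relaxed with a
    single multiplier `lam`.
    """
    lam = 0.0  # Lagrange multiplier for the length constraint
    best: List[int] = []
    best_score = 0.0
    for t in range(1, iterations + 1):
        # With the budget relaxed, the problem separates per sentence:
        # keep sentence i exactly when its reduced gain is positive.
        chosen = [
            i for i in range(len(relevance))
            if relevance[i] - lam * lengths[i] > 0
        ]
        used = sum(lengths[i] for i in chosen)

        # Repair step: drop the sentences with the lowest relevance per
        # unit length until the selection fits within the budget.
        feasible = sorted(
            chosen, key=lambda i: relevance[i] / lengths[i], reverse=True
        )
        while sum(lengths[i] for i in feasible) > budget:
            feasible.pop()

        score = sum(relevance[i] for i in feasible)
        if score > best_score:
            best, best_score = sorted(feasible), score

        # Subgradient update with a diminishing step size: raise lam when
        # the relaxed selection is too long, lower it when there is slack.
        lam = max(0.0, lam + (step / t) * (used - budget))
    return best, best_score


if __name__ == "__main__":
    # Made-up relevance scores and sentence lengths, for illustration only.
    selected, total = lagrangian_select(
        relevance=[3.0, 2.0, 1.5, 1.0],
        lengths=[30, 10, 10, 25],
        budget=50,
    )
    print(selected, total)
```

The sketch keeps the best feasible selection seen across iterations, since the relaxed solution at any single multiplier value may violate the budget. A heuristic of this kind returns an approximate answer without calling an ILP solver, which is the trade-off the abstract describes.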
|Number of pages|9|
|Journal|Transactions of the Japanese Society for Artificial Intelligence|
|Publication status|Published - 2013 Jul 10|
- Combinatorial optimization
- Lagrangian relaxation
- Multi-document summarization