Saturday, October 18, 2025

Submodularity and Diminishing Returns — A Clean, Self-Contained Proof

Submodularity and Diminishing Returns — A Clean, Self-Contained Proof

Submodularity ⇔ Diminishing Returns — A Clean, Self-Contained Proof

Section 1 — Why this matters

Submodular functions sit at the heart of modern discrete optimization, powering algorithms for data summarization, influence maximization, active learning, cut/flow, and beyond. Their magic is twofold:

  1. Algorithmic tractability: submodular minimization is polynomial-time solvable; submodular maximization admits strong approximation guarantees (often via greedy).
  2. Modeling clarity: they capture the ubiquitous “diminishing returns” (DR) effect: the more you already have, the less benefit you gain from adding one more item.

This post proves the classic equivalence:

Theorem. For a set function F:2VRF: 2^V \to \mathbb{R} on a finite ground set VV, the following are equivalent:

  1. (Submodularity) For all subsets S,TVS,T \subseteq V,
    F(S)+F(T)F(ST)+F(ST). F(S) + F(T) \ge F(S \cup T) + F(S \cap T).
  2. (Diminishing Returns) For all ABVA \subseteq B \subseteq V and all xVBx \in V \setminus B,
    F(A{x})F(A)F(B{x})F(B). F(A \cup \{x\}) - F(A) \ge F(B \cup \{x\}) - F(B).

We’ll prove both directions.


Section 2 — The proof

Notation

  • 2V2^V is the Boolean lattice of all subsets of VV.
  • We write marginal gains as
    ΔF(xS):=F(S{x})F(S). \Delta_F(x \mid S) := F(S \cup \{x\}) - F(S).

Direction 1: Submodular ⇒ Diminishing Returns

Goal. If FF is submodular, then for all ABA \subseteq B and xBx \notin B,
ΔF(xA)ΔF(xB). \Delta_F(x \mid A) \ge \Delta_F(x \mid B).

Proof. Apply submodularity to the pair (S,T)=(A{x},B)(S,T)=(A \cup \{x\},\, B):
F(A{x})+F(B)F((A{x})B)+F((A{x})B). F(A \cup \{x\}) + F(B) \ge F\big((A \cup \{x\}) \cup B\big) + F\big((A \cup \{x\}) \cap B\big).
Because ABA \subseteq B and xBx \notin B, we have
(A{x})B=B{x},(A{x})B=A. (A \cup \{x\}) \cup B = B \cup \{x\}, \quad (A \cup \{x\}) \cap B = A.
Hence
F(A{x})+F(B)F(B{x})+F(A), F(A \cup \{x\}) + F(B) \ge F(B \cup \{x\}) + F(A),
which rearranges to
ΔF(xA)ΔF(xB). \Delta_F(x \mid A) \ge \Delta_F(x \mid B).
\square


Direction 2: Diminishing Returns ⇒ Submodular

Assumption (DR). For all ABA \subseteq B and xBx \notin B,
ΔF(xA)ΔF(xB). \Delta_F(x \mid A) \ge \Delta_F(x \mid B).

Goal. Show that for all S,TVS,T \subseteq V,
F(S)+F(T)F(ST)+F(ST). F(S) + F(T) \ge F(S \cup T) + F(S \cap T).
It is equivalent to
F(T)F(ST)F(ST)F(S). F(T) - F(S \cap T) \ge F(S \cup T) - F(S).
We should notice that TSTT \subset S\cup T and
R:=T(ST)=(ST)S R := T\setminus (S\cap T) = (S \cup T) \setminus S
To prove above inequality, we will compare the following extensions

  • From STS \cap T, to TT with elements in RR
  • From SS, to STS\cup T with elements in RR

Step 1 — Decompose the symmetric difference

Let
U:=ST,W:=ST,R:=WS=TU. U := S \cap T, \qquad W := S \cup T, \qquad R := W \setminus S = T \setminus U.
which is
F(T)F(U)F(W)F(S). F(T) - F(U) \geq F(W) - F(S).


Thus RR consists exactly of the elements that are in TT but not in SS. List RR in an arbitrary order:
R={r1,r2,,rk}. R = \{r_1, r_2, \dots, r_k\}.

Step 2 — Telescope from UU to WW and from SS to WW

Consider the telescoping expansions
F(T)F(U)  =  j=1k[F(U{r1,,rj})F(U{r1,,rj1})]  =  j=1kΔF(rjU{r1,,rj1}), F(T) - F(U) \;=\; \sum_{j=1}^k \big[F(U \cup \{r_1,\dots,r_j\}) - F(U \cup \{r_1,\dots,r_{j-1}\})\big]\\ \;=\; \sum_{j=1}^k \Delta_F(r_j \mid U \cup \{r_1,\dots,r_{j-1}\}),
and
F(W)F(S)  =  j=1k[F(S{r1,,rj})F(S{r1,,rj1})]  =  j=1kΔF(rjS{r1,,rj1}). F(W) - F(S) \;=\; \sum_{j=1}^k \big[F(S \cup \{r_1,\dots,r_j\}) - F(S \cup \{r_1,\dots,r_{j-1}\})\big]\\ \;=\; \sum_{j=1}^k \Delta_F(r_j \mid S \cup \{r_1,\dots,r_{j-1}\}).

Step 3 — Compare term-by-term using DR

For each jj, we have
U{r1,,rj1}    S{r1,,rj1}, U \cup \{r_1,\dots,r_{j-1}\} \;\subseteq\; S \cup \{r_1,\dots,r_{j-1}\},
and rjS{r1,,rj1}r_j \notin S \cup \{r_1,\dots,r_{j-1}\} by construction. The DR assumption therefore gives
ΔF(rjU{r1,,rj1})    ΔF(rjS{r1,,rj1}). \Delta_F(r_j \mid U \cup \{r_1,\dots,r_{j-1}\}) \;\ge\; \Delta_F(r_j \mid S \cup \{r_1,\dots,r_{j-1}\}).
Summing over j=1,,kj=1,\dots,k yields
F(T)F(U)    F(W)F(S). F(T) - F(U) \;\ge\; F(W) - F(S).

This is exactly the submodularity inequality. \square


Takeaways

  • The DR form says marginal gains shrink as the context grows. It is the discrete analog of concavity.
  • Submodularity on the lattice 2V2^V is equivalent to DR. Either form can be used depending on whether you want clean algebraic inequalities (submodularity) or stepwise algorithmic reasoning (DR).
  • This equivalence underlies the correctness and guarantees of greedy and related algorithms, and it is the bridge to convex analysis via the Lovász extension.

Popular Posts