Solving the BLASSO problem using the gradient descent method
Let us consider a very simple setup of the BLASSO problem:

$$ y = \sum_{i=1}^{N} a_i \, \varphi_{x_i} + w, $$

where $a_i \in \mathbb{R}$ are coefficients, $x_i$ are positions, and $\varphi_{x_i}$ is the Gaussian function centered at $x_i$ with fixed $\sigma$. Here $w$ is some noise.
In the BLASSO problem, we aim to recover the positions $x_i$ and the coefficients $a_i$, assuming that the noisy observation $y$ is known.
Note that we can rewrite the observation as follows:

$$ y = \Phi \mu_0 + w, \qquad \mu_0 = \sum_{i=1}^{N} a_i \, \delta_{x_i}, $$

where $\mu_0$ is a discrete Radon measure, i.e. a combination of Dirac masses, and $\Phi$ is a weak-* continuous operator from the space of Radon measures to the Hilbert space $\mathcal{H}$, defined by $\Phi\mu = \int \varphi_x \, \mathrm{d}\mu(x)$.
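To make this concrete, here is a minimal sketch (not the notebook's actual code) of how $\Phi$ could act on a discrete measure when the observation is sampled on a grid; the grid `t` and the helper names `gaussian_kernel` and `forward_operator` are assumptions made for illustration.

```python
import torch

def gaussian_kernel(t, x, sigma):
    # Evaluate the Gaussian phi_{x_j} centered at each x_j on the sample grid t.
    # Returns a matrix of shape (len(t), len(x)).
    return torch.exp(-(t[:, None] - x[None, :]) ** 2 / (2 * sigma ** 2))

def forward_operator(t, x, a, sigma):
    # Phi mu for the discrete measure mu = sum_j a_j * delta_{x_j}:
    # a linear combination of Gaussians sampled on the grid t.
    return gaussian_kernel(t, x, sigma) @ a
```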
To do that, we solve the Tikhonov regularization problem

$$ \min_{\mu} \; \frac{1}{2} \, \| \Phi\mu - y \|_{\mathcal{H}}^2 + \lambda \, \|\mu\|_{TV}, $$

where $\lambda > 0$ is a regularization parameter and $\|\cdot\|_{TV}$ is the total variation norm over the Radon measure space; e.g. the total variation norm of a discrete measure is the $\ell^1$-norm of its coefficients.
Under some conditions (see Theorem 2 in Duval and Peyré), the optimal measure is discrete. This allows us to parametrize the candidate measure by a finite number of positions and coefficients and to apply gradient descent to the resulting non-convex, finite-dimensional version of the Tikhonov regularization problem,

$$ \min_{x, a} \; \frac{1}{2} \Big\| \sum_{i} a_i \, \varphi_{x_i} - y \Big\|_{\mathcal{H}}^2 + \lambda \, \|a\|_1. $$
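Assuming the grid discretization sketched above, this finite-dimensional objective could be written as follows; stacking the positions and coefficients into a single tensor (positions first, then coefficients) is a choice made for this sketch so that it fits the optimizer below.

```python
def blasso_objective(t, y, sigma, lam):
    # Build the discretized BLASSO objective as a function of a single tensor
    # point = (x_1, ..., x_n, a_1, ..., a_n): positions first, then coefficients.
    def func(point):
        n = point.shape[0] // 2
        x, a = point[:n], point[n:]
        residual = forward_operator(t, x, a, sigma) - y
        # data-fit term + lambda * l1-norm of the coefficients
        return 0.5 * (residual ** 2).sum() + lam * a.abs().sum()
    return func
```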
Our notebook, which implements a naive method for solving the problem, can be found on GitHub. In our experiment, we assume that the positions $x_i$ are uniformly distributed; the remaining parameters, as well as the resulting reconstruction, can be found in the notebook. A sketch of how such synthetic data might be generated is given below.
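Continuing the sketch above, the synthetic data might be generated as follows; every numeric value here (number of spikes, grid size, $\sigma$, noise level) is a placeholder, the actual values being those used in the notebook.

```python
torch.manual_seed(0)
n_spikes, sigma, noise_level = 5, 0.05, 0.01   # placeholder values
t = torch.linspace(0.0, 1.0, 200)              # sample grid for the observation
x_true = torch.rand(n_spikes)                  # positions drawn uniformly at random
a_true = torch.rand(n_spikes) + 0.5            # coefficients (assumed positive here)
y = forward_operator(t, x_true, a_true, sigma) + noise_level * torch.randn_like(t)
```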
To compute the gradient of the objective function, which is quite complicated when viewed as a function of the positions $x_i$ and the coefficients $a_i$, we rely purely on PyTorch's backward pass, without using the torch.optim tools (gradient descent implemented from scratch). The details are as follows:
```python
import torch

def opt_min(func, point, lr=1e-2, max_iters=500, epsilon=1e-2):
    # Plain gradient descent on func, written from scratch (no torch.optim).
    point = point.clone()
    data = {
        "val": [],
        "grad_max": [],
    }
    for i in range(max_iters):
        # compute the gradient of func at the current point by backpropagation
        point.requires_grad = True
        point.grad = torch.zeros_like(point)
        loss = func(point)
        loss.backward()
        grad = point.grad
        # gradient descent step
        point = point - lr * grad
        # detach everything from the computational graph
        loss = loss.detach()
        grad = grad.detach()
        point = point.detach()
        # save the history
        grad_max = grad.abs().max()
        data["val"] += [loss]
        data["grad_max"] += [grad_max]
        # stop when the gradient is small enough
        if grad_max < epsilon:
            break
    return point, data
```
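Putting the pieces together, a call to opt_min could look like the sketch below; the initialization (a fixed grid of positions and zero coefficients), the model size n_model, and the value of $\lambda$ are assumptions for illustration.

```python
n_model = 10                                   # number of spikes in the model (assumption)
lam = 0.1                                      # regularization parameter (assumption)
x0 = torch.linspace(0.05, 0.95, n_model)       # initial positions spread over the domain
a0 = torch.zeros(n_model)                      # initial coefficients
point0 = torch.cat([x0, a0])

func = blasso_objective(t, y, sigma, lam)
point_opt, history = opt_min(func, point0)
x_opt, a_opt = point_opt[:n_model], point_opt[n_model:]
print("recovered positions:", x_opt)
print("recovered coefficients:", a_opt)
```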