Paper Reading on Compressed Sensing with LDPC codes

2020.06.06
ludics

1. LDPC Codes for Compressed Sensing

link

1.1 Introduction

We present a mathematical connection between channel coding and compressed sensing. In particular, we link, on the one hand, channel coding linear programming decoding (CC-LPD), which is a well-known relaxation of maximum-likelihood channel decoding for binary linear codes, and, on the other hand, compressed sensing linear programming decoding (CS-LPD), also known as basis pursuit, which is a widely used linear programming relaxation for the problem of finding the sparsest solution of an under-determined system of linear equations. More specifically, we establish a tight connection between CS-LPD based on a zero-one measurement matrix over the reals and CC-LPD of the binary linear channel code that is obtained by viewing this measurement matrix as a binary parity-check matrix. This connection allows the translation of performance guarantees from one setup to the other. The main message of this paper is that parity-check matrices of “good” channel codes can be used as provably “good” measurement matrices under basis pursuit. In particular, we provide the first deterministic construction of compressed sensing measurement matrices with an order-optimal number of rows using high-girth low-density parity-check (LDPC) codes constructed by Gallager.

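To make the CS-LPD side concrete, here is a minimal basis pursuit sketch: a sparse signal is measured with a 0-1 matrix over the reals and recovered by $\ell_1$ minimization, posed as a linear program via `scipy.optimize.linprog`. The random column-regular 0-1 matrix, the sizes, and the sparsity level are illustrative assumptions only; the paper's guarantees concern parity-check matrices of high-girth LDPC codes, not this random stand-in.

```python
# Minimal basis pursuit (CS-LPD) sketch -- all sizes and the random 0-1 matrix
# below are illustrative assumptions, standing in for an LDPC parity-check matrix.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, k = 60, 120, 5                      # measurements, signal length, sparsity (assumed)

# Sparse 0-1 measurement matrix: 3 ones per column (LDPC-like, but random here).
H = np.zeros((m, n))
for j in range(n):
    H[rng.choice(m, size=3, replace=False), j] = 1.0

x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = H @ x_true                            # real-valued measurements

# Basis pursuit as an LP over z = [x, u]: minimize sum(u) s.t. H x = y, -u <= x <= u.
c = np.concatenate([np.zeros(n), np.ones(n)])
A_eq = np.hstack([H, np.zeros((m, n))])
A_ub = np.block([[np.eye(n), -np.eye(n)],      #  x - u <= 0
                 [-np.eye(n), -np.eye(n)]])    # -x - u <= 0
b_ub = np.zeros(2 * n)
bounds = [(None, None)] * n + [(0, None)] * n  # x free, u >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y, bounds=bounds)
x_hat = res.x[:n]
print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```

The auxiliary variables $u$ implement the standard reformulation of the $\ell_1$ objective: minimizing $\sum_i u_i$ subject to $-u_i \le x_i \le u_i$ is equivalent to minimizing $\|x\|_1$.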

2. Improving Distributed Gradient Descent Using Reed–Solomon Codes

Today’s massively-sized datasets have made it necessary to often perform computations on them in a distributed manner. In principle, a computational task is divided into subtasks that are distributed over a cluster operated by a taskmaster. One issue faced in practice is the delay incurred due to the presence of slow machines, known as stragglers. Several schemes, including those based on replication, have been proposed in the literature to mitigate the effects of stragglers, and more recently, those inspired by coding theory have begun to gain traction. In this work, we consider a distributed gradient descent setting suitable for a wide class of machine learning problems. We adapt the framework of Tandon et al. [18] and present a deterministic scheme that, for a prescribed per-machine computational effort, recovers the gradient from the least number of machines $f$ theoretically permissible, via an $O(f^2)$ decoding algorithm. We also provide a theoretical delay model that can be used to minimize the expected waiting time per computation by optimally choosing the parameters of the scheme. Finally, we supplement our theoretical findings with numerical results that demonstrate the efficacy of the method and its advantages over competing schemes.

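To illustrate the recovery step in the abstract, here is a toy linear-algebra sketch: each worker transmits one coded combination of the partial gradients, and the master recovers the full gradient from the responding workers by finding coefficients whose combination of the corresponding encoding rows equals the all-ones vector. The random Gaussian encoding matrix, the sizes, and the assumption that exactly $k$ workers respond are placeholders for illustration; the paper's actual contribution is a deterministic Reed-Solomon-based construction that limits per-machine work and decodes in $O(f^2)$, which is not reproduced here.

```python
# Toy coded-gradient recovery -- the random Gaussian encoding matrix B is an
# assumption for illustration; the paper uses a deterministic Reed-Solomon
# construction instead, with sparser rows and an O(f^2) decoder.
import numpy as np

rng = np.random.default_rng(1)
n, k, d = 8, 6, 4                  # workers, gradient partitions, model dimension (assumed)
f = k                              # responders needed in this toy setup

g = rng.standard_normal((k, d))    # partial gradients g_1, ..., g_k (one per partition)
B = rng.standard_normal((n, k))    # encoding matrix: worker i transmits B[i] @ g

sent = B @ g                       # messages every worker would send

# Only f workers respond; the rest are stragglers.
S = sorted(rng.choice(n, size=f, replace=False))
# Decode: find a with a^T B[S] = all-ones, so that a^T sent[S] = sum_j g_j.
a = np.linalg.solve(B[S].T, np.ones(k))
full_gradient = a @ sent[S]
print("decoding error:", np.linalg.norm(full_gradient - g.sum(axis=0)))
```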

Preliminaries

Code Construction

3. Gradient Coding

A new coding-theoretic framework is proposed for mitigating stragglers in distributed learning: by carefully replicating data blocks and coding the gradients, fault tolerance can be provided for synchronous gradient descent.
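As a concrete illustration of the replication idea (the fractional repetition scheme listed in the outline below), here is a toy sketch under assumed sizes: the $n$ workers form $s+1$ groups, each group holds a full copy of the data split disjointly among its members, every worker sends the sum of its partial gradients, and since at most $s$ workers straggle, at least one group stays intact and its messages add up to the full gradient.

```python
# Toy fractional repetition sketch -- sizes and variable names are assumptions.
import numpy as np

rng = np.random.default_rng(2)
n, s, d = 6, 2, 4                  # workers, stragglers tolerated, model dimension; (s+1) | n
per_group = n // (s + 1)           # workers per group; there are s+1 groups

g = rng.standard_normal((n, d))    # one partial gradient per data partition (k = n)

def worker_message(i):
    """Worker i holds s+1 partitions (determined by its position in its group)
    and sends the sum of their partial gradients."""
    j = i % per_group
    return g[j * (s + 1):(j + 1) * (s + 1)].sum(axis=0)

# Any s workers may straggle; by pigeonhole at least one of the s+1 groups is intact.
stragglers = set(rng.choice(n, size=s, replace=False))
intact = next(t for t in range(s + 1)
              if not any(t * per_group + j in stragglers for j in range(per_group)))
# Summing the intact group's messages yields the full gradient.
full_gradient = sum(worker_message(intact * per_group + j) for j in range(per_group))
print("decoding error:", np.linalg.norm(full_gradient - g.sum(axis=0)))
```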

Preliminaries

Full stragglers

Fractional Repetition Scheme

Cyclic Repetition Scheme