Approximating the geometric edit distance

Joint work with Xinyi Li

Algorithmica, 84(9), 2395–2413, 2022

Abstract

Edit distance is a measurement of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis, bioinformatics, etc., need to be solved in a geometric space. Therefore, the geometric edit distance (GED) has been studied.
In this paper, we describe the first strictly sublinear approximate near-linear time algorithm for computing the GED of two point sequences in constant dimensional Euclidean space. Specifically, we present a randomized \(O(n\log^2n)\) time \(O(\sqrt n)\)-approximation algorithm. Then, we generalize our result to give a randomized \(\alpha\)-approximation algorithm for any \(\alpha\in [\sqrt{\log n}, \sqrt{n/\log n}]\), running in time \(O(n^2/\alpha^2 \log n)\). Both algorithms are Monte Carlo and return approximately optimal solutions with high probability.

Extended abstract in Proceedings of the 30th International Symposium on Algorithms and Computation (ISAAC), 26:1–26:16, 2019.