Skip to content

Longest Common Subseq

Problem

Given two strings, return the length of their longest common subsequence. A subsequence is a sequence derived by deleting some (or no) characters without changing the relative order.

Approach

2D DP. If characters match, extend the diagonal; otherwise take the max of skipping one character from either string. Space-optimized to a single row.

When to Use

Diff / alignment — "longest common subsequence", "diff two files", DNA sequence alignment. Foundation for unified-diff algorithms. See also: edit_distance for minimum-cost transformation.

Complexity

Time O(m * n)
Space O(min(m, n))

Implementation

longest_common_subseq

Longest Common Subsequence — length of LCS of two strings.

Problem

Given two strings, return the length of their longest common subsequence. A subsequence is a sequence derived by deleting some (or no) characters without changing the relative order.

Approach

2D DP. If characters match, extend the diagonal; otherwise take the max of skipping one character from either string. Space-optimized to a single row.

When to use

Diff / alignment — "longest common subsequence", "diff two files", DNA sequence alignment. Foundation for unified-diff algorithms. See also: edit_distance for minimum-cost transformation.

Complexity

Time: O(m * n) Space: O(min(m, n))

longest_common_subsequence

longest_common_subsequence(text1: str, text2: str) -> int

Return the length of the LCS of text1 and text2.

longest_common_subsequence("abcde", "ace") 3 longest_common_subsequence("abc", "def") 0

Source code in src/algo/dp/longest_common_subseq.py
def longest_common_subsequence(text1: str, text2: str) -> int:
    """Return the length of the LCS of *text1* and *text2*.

    >>> longest_common_subsequence("abcde", "ace")
    3
    >>> longest_common_subsequence("abc", "def")
    0
    """
    # Ensure text2 is the shorter string for O(min(m,n)) space
    if len(text1) < len(text2):
        text1, text2 = text2, text1

    m, n = len(text1), len(text2)
    dp = [0] * (n + 1)

    for i in range(1, m + 1):
        prev = 0
        for j in range(1, n + 1):
            temp = dp[j]
            if text1[i - 1] == text2[j - 1]:
                dp[j] = prev + 1
            else:
                dp[j] = max(dp[j], dp[j - 1])
            prev = temp

    return dp[n]
tests/dp/test_longest_common_subseq.py
"""Tests for longest common subsequence."""

from hypothesis import given
from hypothesis import strategies as st

from algo.dp.longest_common_subseq import longest_common_subsequence


class TestLongestCommonSubsequence:
    def test_basic(self) -> None:
        assert longest_common_subsequence("abcde", "ace") == 3

    def test_no_common(self) -> None:
        assert longest_common_subsequence("abc", "def") == 0

    def test_identical(self) -> None:
        assert longest_common_subsequence("abc", "abc") == 3

    def test_one_empty(self) -> None:
        assert longest_common_subsequence("abc", "") == 0

    def test_subsequence_at_end(self) -> None:
        assert longest_common_subsequence("oxcpqrsvwf", "shmtulqrypy") == 2

    @given(s=st.text(alphabet="ab", min_size=0, max_size=20))
    def test_self_lcs_is_length(self, s: str) -> None:
        assert longest_common_subsequence(s, s) == len(s)

Implement it yourself

Run: just challenge dp longest_common_subseq

Then implement the functions to make all tests pass. Use just study dp for watch mode.

Reveal Solution

longest_common_subseq

Longest Common Subsequence — length of LCS of two strings.

Problem

Given two strings, return the length of their longest common subsequence. A subsequence is a sequence derived by deleting some (or no) characters without changing the relative order.

Approach

2D DP. If characters match, extend the diagonal; otherwise take the max of skipping one character from either string. Space-optimized to a single row.

When to use

Diff / alignment — "longest common subsequence", "diff two files", DNA sequence alignment. Foundation for unified-diff algorithms. See also: edit_distance for minimum-cost transformation.

Complexity

Time: O(m * n) Space: O(min(m, n))

longest_common_subsequence

longest_common_subsequence(text1: str, text2: str) -> int

Return the length of the LCS of text1 and text2.

longest_common_subsequence("abcde", "ace") 3 longest_common_subsequence("abc", "def") 0

Source code in src/algo/dp/longest_common_subseq.py
def longest_common_subsequence(text1: str, text2: str) -> int:
    """Return the length of the LCS of *text1* and *text2*.

    >>> longest_common_subsequence("abcde", "ace")
    3
    >>> longest_common_subsequence("abc", "def")
    0
    """
    # Ensure text2 is the shorter string for O(min(m,n)) space
    if len(text1) < len(text2):
        text1, text2 = text2, text1

    m, n = len(text1), len(text2)
    dp = [0] * (n + 1)

    for i in range(1, m + 1):
        prev = 0
        for j in range(1, n + 1):
            temp = dp[j]
            if text1[i - 1] == text2[j - 1]:
                dp[j] = prev + 1
            else:
                dp[j] = max(dp[j], dp[j - 1])
            prev = temp

    return dp[n]