Skip to content

Group Anagrams

Problem

Given a list of strings, group the anagrams together. An anagram is a word formed by rearranging the letters of another.

Approach

Use the sorted characters of each string as a hash key. All anagrams produce the same sorted tuple, so they land in the same bucket in a defaultdict.

When to Use

Grouping items by equivalence class — anagrams, isomorphic strings, etc. Keywords: "group by", "categorize", "bucket by canonical form".

Complexity

Time O(n * k log k) where n = number of strings, k = max string length
Space O(n * k)

Implementation

group_anagrams

Group Anagrams — group strings that are anagrams of each other.

Problem

Given a list of strings, group the anagrams together. An anagram is a word formed by rearranging the letters of another.

Approach

Use the sorted characters of each string as a hash key. All anagrams produce the same sorted tuple, so they land in the same bucket in a defaultdict.

When to use

Grouping items by equivalence class — anagrams, isomorphic strings, etc. Keywords: "group by", "categorize", "bucket by canonical form".

Complexity

Time: O(n * k log k) where n = number of strings, k = max string length Space: O(n * k)

group_anagrams

group_anagrams(strs: list[str]) -> list[list[str]]

Return groups of anagram strings.

sorted( ... sorted(g) ... for g in group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]) ... ) [['ate', 'eat', 'tea'], ['bat'], ['nat', 'tan']]

Source code in src/algo/arrays/group_anagrams.py
def group_anagrams(strs: list[str]) -> list[list[str]]:
    """Return groups of anagram strings.

    >>> sorted(
    ...     sorted(g)
    ...     for g in group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"])
    ... )
    [['ate', 'eat', 'tea'], ['bat'], ['nat', 'tan']]
    """
    groups: defaultdict[tuple[str, ...], list[str]] = defaultdict(list)
    for s in strs:
        key = tuple(sorted(s))
        groups[key].append(s)
    return list(groups.values())
tests/arrays/test_group_anagrams.py
"""Tests for the group-anagrams problem."""

from algo.arrays.group_anagrams import group_anagrams


def _normalize(groups: list[list[str]]) -> list[list[str]]:
    """Sort each group and sort groups for order-independent comparison."""
    return sorted(sorted(g) for g in groups)


class TestGroupAnagrams:
    def test_basic(self) -> None:
        result = group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"])
        assert _normalize(result) == [
            ["ate", "eat", "tea"],
            ["bat"],
            ["nat", "tan"],
        ]

    def test_empty_input(self) -> None:
        assert group_anagrams([]) == []

    def test_single_string(self) -> None:
        assert group_anagrams(["a"]) == [["a"]]

    def test_all_same_anagram(self) -> None:
        result = group_anagrams(["abc", "bca", "cab"])
        assert _normalize(result) == [["abc", "bca", "cab"]]

    def test_no_anagrams(self) -> None:
        result = group_anagrams(["a", "b", "c"])
        assert _normalize(result) == [["a"], ["b"], ["c"]]

    def test_empty_strings(self) -> None:
        result = group_anagrams(["", ""])
        assert _normalize(result) == [["", ""]]

Implement it yourself

Run: just challenge arrays group_anagrams

Then implement the functions to make all tests pass. Use just study arrays for watch mode.

Reveal Solution

group_anagrams

Group Anagrams — group strings that are anagrams of each other.

Problem

Given a list of strings, group the anagrams together. An anagram is a word formed by rearranging the letters of another.

Approach

Use the sorted characters of each string as a hash key. All anagrams produce the same sorted tuple, so they land in the same bucket in a defaultdict.

When to use

Grouping items by equivalence class — anagrams, isomorphic strings, etc. Keywords: "group by", "categorize", "bucket by canonical form".

Complexity

Time: O(n * k log k) where n = number of strings, k = max string length Space: O(n * k)

group_anagrams

group_anagrams(strs: list[str]) -> list[list[str]]

Return groups of anagram strings.

sorted( ... sorted(g) ... for g in group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]) ... ) [['ate', 'eat', 'tea'], ['bat'], ['nat', 'tan']]

Source code in src/algo/arrays/group_anagrams.py
def group_anagrams(strs: list[str]) -> list[list[str]]:
    """Return groups of anagram strings.

    >>> sorted(
    ...     sorted(g)
    ...     for g in group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"])
    ... )
    [['ate', 'eat', 'tea'], ['bat'], ['nat', 'tan']]
    """
    groups: defaultdict[tuple[str, ...], list[str]] = defaultdict(list)
    for s in strs:
        key = tuple(sorted(s))
        groups[key].append(s)
    return list(groups.values())