
uniq -c: Count Adjacent Duplicates

Medium +3 pts

🎯 uniq -c collapses consecutive duplicate lines into one, prefixed with how many times it repeated.

The "consecutive" part is the catch: uniq only looks at neighbors, so the lines a, a, b, a become 2 a, 1 b, 1 a (two separate runs of a).
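To make the neighbors-only behavior concrete, here is a minimal Rust sketch of plain uniq (without the counts). It relies on the standard Vec::dedup, which removes only consecutive repeats; the helper name collapse_adjacent is made up for illustration and is not part of the exercise.

```rust
/// Collapse runs of equal neighboring lines, like plain `uniq` without `-c`.
/// `collapse_adjacent` is a hypothetical helper name, not part of the exercise.
fn collapse_adjacent(lines: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = lines.iter().map(|s| s.to_string()).collect();
    // Vec::dedup removes only *consecutive* repeated elements,
    // so a later, separate run of the same value survives.
    out.dedup();
    out
}
```

Called on the lines a, a, b, a, this keeps a, b, a: the trailing a is not merged with the first run because b sits between them.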

In Unix, sorting first makes every duplicate adjacent, so the counts become totals per distinct line:

$ echo -e "tag1\ntag2\ntag1\ntag4\ntag2\ntag1" > tags.txt
$ sort tags.txt | uniq -c
   3 tag1
   2 tag2
   1 tag4

And in Python:

from itertools import groupby


def uniq_c(text: str) -> list[tuple[int, str]]:
    # Sort first so equal lines become adjacent, mirroring `sort | uniq -c`.
    sorted_lines = sorted(text.splitlines())
    # groupby yields one (key, group) pair per run of equal lines; with each
    # group materialized as a list, the intermediate results are:
    # ('tag1', ['tag1', 'tag1', 'tag1'])
    # ('tag2', ['tag2', 'tag2'])
    # ('tag4', ['tag4'])
    return [(len(list(g)), key) for key, g in groupby(sorted_lines)]


assert uniq_c("tag1\ntag2\ntag1\ntag4\ntag2\ntag1") == [
    (3, "tag1"),
    (2, "tag2"),
    (1, "tag4"),
]

Vec::last_mut — edit the final element in place

Walk the lines and look at what you last pushed. If the current line continues …
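The hint is cut off here, so the following is only one possible reading of it: a rough sketch that assumes "continues" means "equals the line of the run pushed last", in which case that run's count is bumped in place through Vec::last_mut. The function name uniq_c and the (usize, String) return type are assumptions borrowed from the Python version, not the exercise's actual signature.

```rust
/// Count runs of consecutive equal lines, like `uniq -c` (no sorting).
/// The name and return type are assumptions made for this sketch.
fn uniq_c(text: &str) -> Vec<(usize, String)> {
    let mut runs: Vec<(usize, String)> = Vec::new();
    for line in text.lines() {
        // Look at the run we pushed last, if there is one.
        if let Some((count, prev)) = runs.last_mut() {
            if prev.as_str() == line {
                // Same line as the current run: edit its count in place.
                *count += 1;
                continue;
            }
        }
        // First line, or a line that differs from the last run: start a new run.
        runs.push((1, line.to_string()));
    }
    runs
}
```

Unlike the sorted Python version, this keeps separate runs separate, so the input a, a, b, a yields (2, a), (1, b), (1, a).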
