
Fabian N. T.
@fabian@floss.social

aprxc: a command-line tool to estimate the number of distinct elements in a file/stream using the Chakraborty/Vinodchandran/Meel approximation algorithm¹.

▸ Easier to remember & faster than sort | uniq | wc -l
▸ Bounded memory usage
▸ Exact up to 80k unique values, 0.4–1% deviation beyond that
▸ v2 with less BS in the README
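
For the curious, here is a rough idea of how the algorithm from the paper¹ works. This is a minimal sketch, not aprxc's actual code: the function name `cvm_estimate` and the `capacity` parameter are made up for illustration, and aprxc's real defaults and internals may differ. It shows why memory stays bounded (the set never grows past `capacity`) and why the count is exact as long as the number of distinct values stays below that limit (the sampling probability is still 1).

```python
import random


def cvm_estimate(stream, capacity=1000, rng=None):
    """Estimate the number of distinct items in `stream`.

    Hypothetical sketch of the Chakraborty/Vinodchandran/Meel idea;
    `capacity` is an illustrative parameter, not an aprxc option.
    """
    rng = rng or random.Random()
    p = 1.0          # current sampling probability
    kept = set()     # sampled items, at most `capacity` of them

    for item in stream:
        # Forget any earlier decision about this item, then re-sample it.
        kept.discard(item)
        if rng.random() < p:
            kept.add(item)

        if len(kept) >= capacity:
            # Buffer is full: drop each kept item with probability 1/2
            # and halve the sampling probability for future items.
            kept = {x for x in kept if rng.random() < 0.5}
            p /= 2
            if len(kept) >= capacity:
                # Astronomically unlikely; the paper reports failure here.
                raise RuntimeError("sampling round removed nothing")

    # Each kept item stands in for 1/p distinct items.
    return len(kept) / p
```

While fewer than `capacity` distinct values have been seen, `p` stays at 1 and the result is simply the size of the set, so it is exact. Beyond that it becomes an estimate whose error shrinks as `capacity` grows.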

Useful? You decide! Try out:
▸ uvx aprxc -h
▸ pipx run aprxc -h

https://codeberg.org/fa81/aprxc
https://github.com/hellp/aprxc

¹ https://arxiv.org/pdf/2301.10191#section.2

#math #ComputerScience #Python #CLI