@fabian@floss.social
aprxc: a command-line tool to estimate the number of distinct elements in a file/stream using the Chakraborty/Vinodchandran/Meel approximation algorithm¹.
▸ Easier to remember & faster than sort | uniq | wc -l.
▸ Bounded memory usage
▸ Exact up to 80k unique values, 0.4–1% deviation beyond that
▸ v2 with less BS in the README
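For the curious, here is a minimal Python sketch of the idea behind the algorithm in ¹. It is not aprxc's actual code: the function name, the `max_size` parameter and its default are made up for illustration. It keeps a bounded sample of items, and whenever the sample fills up it drops about half of it and halves the keep probability.

```python
import random


def estimate_distinct(stream, max_size=1000, rng=None):
    """Estimate the number of distinct items in `stream`.

    Illustrative sketch only; `max_size` (hypothetical name) caps how many
    items are held in memory at once.
    """
    rng = rng or random.Random()
    p = 1.0        # current probability of keeping an item
    kept = set()   # bounded sample of the distinct items seen so far

    for item in stream:
        # Forget any earlier decision about this item, then decide afresh.
        kept.discard(item)
        if rng.random() < p:
            kept.add(item)

        if len(kept) >= max_size:
            # Sample is full: keep each element with probability 1/2
            # and halve the keep probability for future items.
            kept = {x for x in kept if rng.random() < 0.5}
            p /= 2
            if len(kept) >= max_size:
                # Astronomically unlikely unless max_size is tiny.
                raise RuntimeError("sample did not shrink; try again")

    # Each surviving item stands in for 1/p distinct items.
    return len(kept) / p


if __name__ == "__main__":
    data = [i % 50_000 for i in range(200_000)]  # 50,000 distinct values
    print(round(estimate_distinct(data)))
```

While the number of distinct items stays below `max_size`, `p` never drops below 1 and the count is exact. Past that point the result becomes an estimate, which is the same exact-then-approximate behaviour described in the bullets above.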
Useful? You decide! Try out:
▸ uvx aprxc -h
▸ pipx run aprxc -h
https://codeberg.org/fa81/aprxc
https://github.com/hellp/aprxc
¹ https://arxiv.org/pdf/2301.10191#section.2
#math #ComputerScience #Python #CLI