Performance comparison: counting words in Python, Go, C++, C, Awk, Forth, Rust

Summary: I describe a simple interview problem (counting frequencies of unique words), solve it in various languages, and compare performance across them. For each language, I’ve included a simple, idiomatic solution as well as a more optimized approach via profiling…
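
For readers who have not seen the problem, here is a minimal Python sketch of the "simple" variant. It assumes words are whitespace-separated and compared case-insensitively, and that output is ordered by descending count. The function name `count_words` and the sample input are illustrative and are not taken from the article.

```python
from collections import Counter


def count_words(lines):
    """Return a Counter mapping each lowercased word to its frequency."""
    counts = Counter()
    for line in lines:
        # str.split() with no arguments splits on any run of whitespace.
        counts.update(line.lower().split())
    return counts


if __name__ == "__main__":
    # Illustrative sample; any iterable of lines works here.
    sample = ["The quick brown fox", "jumps over the lazy dog"]
    # most_common() yields (word, count) pairs, highest count first.
    for word, count in count_words(sample).most_common():
        print(word, count)
```

To count a real file, pass `sys.stdin` (or an open file object) to `count_words` in place of the sample list.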

| Language | Simple | Optimized | Notes |
| --- | --- | --- | --- |
| `grep` | 0.04 | 0.04 | `grep` baseline; optimized sets `LC_ALL=C` |
| `wc -w` | 0.29 | 0.20 | `wc` baseline; optimized sets `LC_ALL=C` |
| Zig | 0.54 | | by ifreund and matu3ba |
| Nim | 0.76 | 0.58 | by csterritt and euantorano |
| C | 0.97 | 0.23 | |
| Go | 1.14 | 0.39 | |
| Crystal | 1.29 | | by Andrea Manzini |
| PHP | 1.36 | | by Max Semenik |
| Rust | 1.43 | 0.38 | by Andrew Gallant |
| C# | 1.51 | 0.82 | by J Taylor, Y Ostapenko, O Turan |
| OCaml | 1.72 | | by Nate Dobbins and Pavlo Khrystenko |
| C++ | 1.73 | 0.42 | optimized by Jussi Pakkanen |
| Perl | 1.81 | | by Charles Randall |
| F# | 1.82 | 1.59 | by Yuriy Ostapenko |
| Kotlin | 1.86 | | by Kazik Pogoda |
| Python | 2.07 | 1.30 | |
| Lua | 2.50 | 1.97 | by themadsens; runs under luajit |
| JavaScript | 2.52 | 1.90 | by Dani Biro and Flo Hinze |
| Ruby | 3.13 | 2.43 | by Bill Mill |
| AWK | 3.55 | 1.13 | optimized uses `mawk` |
| D | 4.16 | 1.01 | by Ross Lonstein |
| Swift | 4.23 | | by Daniel Muellenborn |
| Forth | 4.26 | 1.46 | |
| Shell | 14.60 | 1.85 | optimized does `LC_ALL=C sort -S 2G` |

https://benhoyt.com/writings/count-words/

This thread was posted by one of our members via one of our news source trackers.

Thanks for the post

It’s missing Elixir… just saying

PHP faster than Rust and C++?!

Nice joke.

I like the Crystal syntax considering its speed

```crystal
counts = Hash(String, Int32).new(0)

STDIN.each_line do |line|
  line.downcase.split.each do |word|
    counts[word] += 1
  end
end

entries = counts.to_a.sort_by! &.[1]
entries.reverse_each do |(word, count)|
  puts "#{word} #{count}"
end
```

PHP is not the joke it used to be

Ruby is faster than Swift?

Ruby has come a long way too… just like PHP

Nowadays almost every language is fast enough. Often you have to choose a language for a job on criteria other than performance.

Yeah, right, as if. They just use bridges to C libraries underneath.

It’s tempting to think your favourite language is fast. But very often it’s just the stdlib shelling out to native implementations. Erlang does it too.
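
One way to see this effect for yourself, sketched in Python: in CPython, `collections.Counter` does its counting through a C-accelerated helper when one is available, while an explicit loop runs entirely in the interpreter. The function names and the sample text below are made up for illustration, and the timings will vary by machine and interpreter version, so treat this as a rough probe and not as a benchmark.

```python
import timeit
from collections import Counter


def count_pure_python(words):
    """Count words with an explicit loop executed by the interpreter."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_with_counter(words):
    """Count words with collections.Counter (C-accelerated in CPython)."""
    return Counter(words)


if __name__ == "__main__":
    # Illustrative input: a short sentence repeated many times.
    words = ("the quick brown fox jumps over the lazy dog " * 10_000).split()
    # Both approaches must agree before their timings mean anything.
    assert count_pure_python(words) == dict(count_with_counter(words))
    for fn in (count_pure_python, count_with_counter):
        seconds = timeit.timeit(lambda: fn(words), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Both functions return the same counts; only the place where the inner loop runs differs.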
