What’s 1000 words worth?
information
words
Let’s randomly select 1,000 lines from the dictionary and append the number of bytes in that sample to a file, repeating the whole thing 500 times.
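# Write the header line that read.table(..., header=TRUE) expects.
echo bytes > sizes.dat
for i in {1..500}; do
# Seed explicitly: a bare srand() uses the clock in seconds, so fast iterations would repeat a sample.
# dd prints "bytes transferred" on BSD/macOS; the count includes the trailing space printf adds per line.
awk -v seed="$((RANDOM * 1000 + i))" 'BEGIN {srand(seed)} {printf "%05.0f %s \n",rand()*99999, $0; }' /usr/share/dict/words | sort -n |
head -1000 | sed 's/^[0-9]* //' | dd 2>&1 | grep "bytes transferred" | awk '{print $1}' >>sizes.dat
done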
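If your dd doesn't say "bytes transferred" (GNU dd words it differently), here is a rough alternative sketch. It assumes GNU coreutils' shuf is installed, and sample_bytes is a name I made up for illustration. Note that it counts only the words and their newlines, so its totals should come out about 1,000 bytes lower than the pipeline above, which leaves one trailing space on each line.

```shell
# Sketch only: sample_bytes is a hypothetical helper; shuf is assumed to be GNU coreutils.
# Usage: sample_bytes FILE [N]   (N defaults to 1000)
sample_bytes() {
  # shuf -n picks N distinct lines at random; wc -c counts their bytes, newlines included.
  # tr strips the padding that some wc implementations put in front of the number.
  shuf -n "${2:-1000}" "$1" | wc -c | tr -d ' '
}

# Example (not run here):
#   echo bytes > sizes.dat
#   for i in $(seq 1 500); do sample_bytes /usr/share/dict/words; done >> sizes.dat
```

Sampling without replacement is the same idea as the sort-by-random-key trick, just with less plumbing.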
Then, in R:
> sizes <- read.table("~/sizes.dat", header=TRUE)
> colMeans(sizes)
bytes
11581.83
> sapply(sizes, sd)
bytes
90.32316
> qqnorm(sizes$bytes)
> plot(density(sizes$bytes))
> hist(sizes$bytes, col=rainbow(15, start=.4))
> mean(sizes$bytes) / 1024
[1] 11.31038
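For a quick cross-check without R, the same three numbers (mean, sample standard deviation, and the mean divided by 1024) can be sketched in awk. This assumes sizes.dat has a one-line header followed by one byte count per line; summarize_sizes is a hypothetical name, not anything standard.

```shell
# Sketch only: summarize_sizes is a hypothetical helper.
# Assumes the file has one header line, then one number per line.
summarize_sizes() {
  awk 'NR > 1 { n++; sum += $1; sumsq += $1 * $1 }   # skip the header, accumulate totals
       END {
         if (n < 2) { print "need at least two samples" > "/dev/stderr"; exit 1 }
         mean = sum / n
         var = (sumsq - n * mean * mean) / (n - 1)    # sample variance, the same n - 1 divisor R uses in sd()
         if (var < 0) var = 0                         # guard against tiny negative rounding error
         printf "mean=%.2f sd=%.2f kib=%.2f\n", mean, sqrt(var), mean / 1024
       }' "$1"
}

# Example (not run here):
#   summarize_sizes sizes.dat
```

The sum-of-squares shortcut is fine for a few hundred values of this size; for much larger or wilder data a running (Welford-style) update would be the safer choice.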
11.31k is not a very large picture. Each of the exploratory plots (quantile x normal, density, histogram) is larger! Even these pictures of me and my daughter and the fat giraffes are 13k, 13k, and 15k respectively:
Still, here are all the many, many Google image hits for ‘entropy’ pictures that are 128 x 128 pixels. Many of these are in the roughly 11k range.
Happy Bastille Day.