Counting RSS subscriptions

The reason for this entry is more to show how interesting working with a command line interface can be than to actually count RSS subscriptions. The counting is based entirely on my own assumptions.

A friend asked me how many RSS subscriptions this blog has, and I got hooked on the question of how to count them. I googled a bit and could only find references to FeedBurner, but that seemed like too much work to go through. So here’s my simple estimate (which might be very, very wrong).

First I wanted to see how many times the RSS link was accessed in the last month in my Apache logs:

$ grep index.rss access.log | wc -l
6479
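
The grep above matches index.rss anywhere in the line, so a hit whose referrer happens to mention the feed would be counted too. Here is a minimal sketch of a stricter count, assuming the log uses Apache's common or combined format, where the seventh whitespace-separated field is the request path. The function name count_feed_hits is made up for illustration, and I have not checked whether it changes the 6479 figure.

```shell
# Count requests whose path ends in index.rss, ignoring any query string.
# Assumption: common/combined log format, so field 7 is the request path.
count_feed_hits() {
    awk '{ path = $7; sub(/\?.*/, "", path) }
         path ~ /index\.rss$/ { n++ }
         END { print n + 0 }' "$1"
}

# Usage: count_feed_hits access.log
```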

Then I wanted to group these entries by IP address:

$ grep index.rss access.log | awk '{IP[$1]++;} END {for(s in IP) print IP[s], s;}'
1 207.46.204.229
1 66.249.67.227
1 207.46.195.238
7 72.30.142.219
1201 72.14.199.79
3 188.92.75.82
2 207.46.204.236
.....
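
awk visits the array in no particular order, which is why the list above looks shuffled. Here is a small sketch that sorts it so the busiest addresses come first (hits_per_ip is just an illustrative name):

```shell
# Print "count ip" pairs for feed requests, highest count first.
hits_per_ip() {
    grep 'index\.rss' "$1" \
        | awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' \
        | sort -rn
}

# Usage: hits_per_ip access.log | head
```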

Count the distinct IPs:

$ grep index.rss access.log | awk '{IP[$1]++;} END {for(s in IP) print IP[s], s;}' | wc -l
266

Count those with only one entry:

$ grep index.rss access.log | awk '{IP[$1]++;} END {for(s in IP) print IP[s], s;}' \
      | awk '{if ($1==1) print $1,$2}' | wc -l
137

If these people check my blog once a day and a month has 30 days, this would give me about 4 people (137/30) with dynamic IPs, since dynamic DSL IPs change once a day. Mobile IPs change with every connection, and some people access the blog more than once a day. So all IPs that have accessed the RSS more than 2 but fewer than 10 times (80% of them only 3 times) should be counted the same way.

$ grep index.rss access.log | awk '{IP[$1]++;} END {for(s in IP) print IP[s], s;}' \
       | awk '{if ($1>2 && $1<10) print $1,$2}' | wc -l
56

And there are another 17 IPs with counts between 9 and 99, which I’d add here (if, for example, their RSS reader requests the feed once an hour or more). This gives (137+56+17)/30 = 7 users with dynamic IPs.

Some IPs have a count of more than 100 (some even more than 500), which might be RSS readers checking the feed many times a day from a fixed IP. So every IP with a count of more than 100 counts as one user.

$ grep index.rss access.log | awk '{IP[$1]++;} END { for (s in IP) print IP[s], s; }' \
       | awk '{if ($1>100) print $1,$2}' | wc -l
16
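
Each of the filters above rescans the whole log. Here is a sketch of a single pass that reports all the buckets at once. The boundaries for "once", "3-9" and "over100" follow the commands above; the 10 to 99 range is an assumption about where the middle bucket's edges lie, and anything that fits none of them (counts of 2 or exactly 100) lands in "other". The name bucket_feed_ips is made up for this example.

```shell
# Bucket IPs by how often they requested the feed, in one pass over the log.
# The 10-99 range is an assumed boundary; "other" catches counts of 2 and 100.
bucket_feed_ips() {
    grep 'index\.rss' "$1" | awk '
        { hits[$1]++ }
        END {
            for (ip in hits) {
                n = hits[ip]
                if (n == 1)                  b["once"]++
                else if (n > 2 && n < 10)    b["3-9"]++
                else if (n >= 10 && n < 100) b["10-99"]++
                else if (n > 100)            b["over100"]++
                else                         b["other"]++
            }
            printf "once=%d 3-9=%d 10-99=%d over100=%d other=%d\n",
                   b["once"], b["3-9"], b["10-99"], b["over100"], b["other"]
        }'
}

# Usage: bucket_feed_ips access.log
```

If "other" turns out to be large, the buckets are leaving out a meaningful share of the IPs and the boundaries deserve a second look.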

That now makes 7+16=23. Because this sounds like too much to me, I’d divide it by 2 and get approximately 11 users who read my blog through the RSS feed (me included).
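
The whole estimate can be spelled out with shell integer arithmetic, using the four counts found above. This is only a sketch of my guesswork, and the function name estimate_readers is invented for the example.

```shell
# Estimate feed readers from the bucket sizes found above.
# Arguments: IPs seen once, IPs seen 3-9 times, IPs in the middle bucket,
# IPs seen more than 100 times.
estimate_readers() {
    once=$1 few=$2 mid=$3 heavy=$4
    # Assumption: one dynamic-IP reader shows up as about 30 addresses a month.
    dynamic=$(( (once + few + mid) / 30 ))
    total=$(( dynamic + heavy ))
    echo "dynamic=$dynamic fixed=$heavy total=$total halved=$(( total / 2 ))"
}

estimate_readers 137 56 17 16
# prints: dynamic=7 fixed=16 total=23 halved=11
```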

But I might be entirely wrong with my guessing. Let’s see what the world has to say about it in the comments.

PS: Google Analytics counted 400 unique visitors in the last month.