This probably isn’t the best way to track things, but it’s the best I came up
with. Tracking the total number of users that log in per day or month is trivial,
as I have discussed in a previous post. Tracking the peak number of users online
at a given time requires a bit more thought.
The major issue for me is that I don’t necessarily want to be slamming Redis on
every page view, so I end up throttling the writes to every couple of minutes.
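For illustration, the throttling could look something like the sketch below. The `WriteThrottle` class and the 120-second default are my own hypothetical choices, not something Redis provides, and the check is per process, so several app servers would each write once per interval.

```python
import time


class WriteThrottle:
    """Allow at most one write per `interval` seconds (hypothetical helper)."""

    def __init__(self, interval=120.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock  # injectable so the throttle can be tested
        self._last_write = None  # clock reading at the last allowed write

    def should_write(self):
        """Return True if enough time has passed since the last allowed write."""
        now = self.clock()
        if self._last_write is None or now - self._last_write >= self.interval:
            self._last_write = now
            return True
        return False
```

In a page handler you would call `should_write()` and only touch Redis when it returns `True`.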
Delay aside, I have come up with two ways to track the peak number of users. One
method results in more data being stored while the other ends up with a small
bit of computation. YMMV, and you should use whichever method makes the most
sense for your current scale.
The first method is to write to a sorted set periodically with the number of
users online. You can then pull the peak number of users by grabbing the member
with the largest score:
ZADD :key :score :member
:key is your key (I’d name it something like ns:users:online:peak:20151224).
:score is how many users were online at that moment and :member would be the
current timestamp (CCYYMMDDHHMMSS).
To grab the highest value for the day (or whatever period you’re using) you
would run:
ZREVRANGE :key 0 0 WITHSCORES
This returns the :member and :score for the member with the highest score. You
could break this up by hour, day, week, year, whatever. You could even write to
multiple keys at once to create something a bit more like a time series.
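Here is a rough sketch of the sorted set method in Python. It assumes `client` is a redis-py style client (something like `redis.Redis()`); the function names and the `ns` namespace default are placeholders of mine, so adapt them to your setup.

```python
from datetime import datetime


def record_online_count(client, users_online, now=None, namespace="ns"):
    """ZADD the current count (score) under a timestamp member in the day's key."""
    now = now or datetime.now()
    key = f"{namespace}:users:online:peak:{now:%Y%m%d}"
    member = now.strftime("%Y%m%d%H%M%S")  # CCYYMMDDHHMMSS
    # redis-py's zadd takes a {member: score} mapping
    client.zadd(key, {member: users_online})
    return key


def peak_for_key(client, key):
    """ZREVRANGE key 0 0 WITHSCORES; returns (member, score) or None if empty."""
    rows = client.zrevrange(key, 0, 0, withscores=True)
    return rows[0] if rows else None
```

Note that redis-py hands the score back as a float and, unless the client was created with `decode_responses=True`, the member as bytes.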
The next method, and the one I am currently using on one of my sites, is to
keep a single value per period, check whether the new value is larger and, if
so, overwrite it. I took this approach because I wanted to store the values in a
hash to avoid key bloat.
As mentioned, this method requires a bit more logic, as you have to retrieve the
stored value, compare it to the current value and, if the current value is
greater, go ahead and store that. Because the read and the write are separate
commands, two requests can race and a smaller value can overwrite a larger one,
so this additional logic adds a small margin of error. Like I said, though, it
reduces some of the key bloat and, in my opinion, makes it a ton easier to pull
multiple days’ peak values without pipelining.
First, you need to get the stored value:
HGET :key :period
As before, :period can be whatever period you want it to be, I tend to use the
current date as I like having peak user data broken out by day. The :key can
be something like ns:users:online:peak.
Once you have this value, you’ll need to compare it to the current value (this
is a code-agnostic post, I trust that you know how to compare variables ;). If
nothing has been stored for the period yet, or the current value is larger than
the value we retrieved, you will need to write the current value back out:
HSET :key :period :users
Where :users is the current number of users online.
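Put together, the hash method might look like the sketch below. Again, `client` is assumed to be a redis-py style client, and `record_peak` and `peaks_for_periods` are hypothetical names of mine. The second function shows the "multiple days without pipelining" idea using a single HMGET.

```python
def record_peak(client, users_online, period, key="ns:users:online:peak"):
    """Store users_online under `period` only if it beats the stored peak.

    Returns True when a write happened. Not atomic: HGET and HSET are
    separate round trips, so concurrent writers can race.
    """
    stored = client.hget(key, period)  # None when the field does not exist
    if stored is None or users_online > int(stored):
        client.hset(key, period, users_online)
        return True
    return False


def peaks_for_periods(client, periods, key="ns:users:online:peak"):
    """Fetch several periods in one HMGET; missing periods map to None."""
    values = client.hmget(key, periods)
    return {p: (int(v) if v is not None else None) for p, v in zip(periods, values)}
```

For example, `peaks_for_periods(client, ["20151223", "20151224"])` would read two days' peaks in one round trip.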
That’s pretty much it. Both methods allow you to track the peak number of users
per day. If so desired, you could run either one on every single page load, but
I find that doing so introduces quite a bit of unnecessary overhead.
Assuming you don’t prune the keys for the first method, you could end up with a
better picture of your site’s daily usage whereas the second method only shows
you the peak.
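If you do keep the sorted set around, pulling the whole day back out could look like this sketch. It assumes a redis-py style `client` and the CCYYMMDDHHMMSS members described earlier; `samples_for_key` is a made-up name.

```python
def samples_for_key(client, key):
    """Return [(timestamp, users_online), ...] sorted by timestamp."""
    # ZRANGE key 0 -1 WITHSCORES returns every member, ordered by score
    rows = client.zrange(key, 0, -1, withscores=True)
    decoded = [
        (m.decode() if isinstance(m, bytes) else m, int(score))
        for m, score in rows
    ]
    # CCYYMMDDHHMMSS strings sort chronologically as plain text
    return sorted(decoded)
```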
As mentioned, you have to choose the method that best fits your use case. If
you’ve come up with a better method of tracking, I’d love to hear about it, hit
me up in the comments below!
Oh! and by the way, Merry Christmas 🙂