Archive for April, 2011
Selecting large random integers in Perl
Monday, April 11th, 2011
We had a very odd bug in a simulation we were writing recently. We were supposed to be sampling from a large pool of possible data, but were getting a very weird distribution of values. After much debugging we found a most unusual cause.
Here is the pop quiz – read through the short script below and take your best guess at what the output will be. Being correct within 5% is good enough.
#!/usr/bin/perl
use warnings;
use strict;

my %seen;
for (1..10000000) {
    my $rand = int(rand(1000000));
    ++$seen{$rand};
}
print "I saw ".(scalar keys %seen)." different values\n";
I should point out that the random number generation here is done according to the perl documentation, which simply says:
“Apply “int()” to the value returned by “rand()” if you want random integers instead of random fractional numbers. For example,
int(rand(10))
returns a random integer between 0 and 9, inclusive.”
OK, have you guessed? The answer we got was:
I saw 32768 different values
Yes, that’s right: after selecting 10 million values from a range of 0-999,999 we saw only 32,768 different values. This was the reason our distribution looked odd: we were seeing only around 3% of the values we could have seen.
It turns out that the cause of this oddity is platform specific. Perl doesn’t itself include code to generate random numbers; it simply makes a call to the random number library supplied by the underlying operating system. In our case this code was being run on 64-bit ActivePerl under Windows 7, and the standard Windows random number library is only capable of generating 32768 different values (15 bits of randomness).
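To see why the count comes out at exactly 32768, here is a minimal sketch that models a generator with a fixed number of bits and enumerates every integer that int(rand($range)) could then return. The sub name distinct_outputs is my own, and the model (rand() returning k / 2**bits for whole k) is an assumption about how such a generator behaves, not a copy of the Windows library. The script also prints the randbits value that this perl build reports through the core Config module.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use Config;

# Assumed model: a generator with $bits bits of randomness can only return
# k / 2**$bits for k in 0 .. 2**$bits - 1. Count how many distinct integers
# int(rand($range)) could produce under that model.
sub distinct_outputs {
    my ($bits, $range) = @_;
    my $steps = 2 ** $bits;
    my %seen;
    for my $k (0 .. $steps - 1) {
        ++$seen{ int($k / $steps * $range) };
    }
    return scalar keys %seen;
}

# What this particular perl build was configured with.
my $bits = defined $Config{randbits} ? $Config{randbits} : 'unknown';
print "This perl reports randbits = $bits\n";

print "15 bits, range 1000000: ", distinct_outputs(15, 1_000_000), " distinct values\n";
print "15 bits, range 10: ", distinct_outputs(15, 10), " distinct values\n";
```

Under this model a range of 10 is fully covered, which is why the example in the documentation works, while a range of a million can never yield more than 32768 different integers however many times you sample.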
If we take this exact code and run it under Linux we get:
I saw 999950 different values
...and our simulation returns sensible numbers.
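The Linux figure is about what uniform sampling predicts. As a quick check, assuming every one of the million values is equally likely on each draw, the expected number of distinct values after M draws from N values is N * (1 - (1 - 1/N)**M). The sub name expected_distinct below is just for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Expected number of distinct values after $draws independent uniform draws
# from $range equally likely values: range * (1 - (1 - 1/range) ** draws).
sub expected_distinct {
    my ($range, $draws) = @_;
    return $range * (1 - (1 - 1 / $range) ** $draws);
}

printf "Expected distinct values: %.1f\n",
    expected_distinct(1_000_000, 10_000_000);
```

This comes out at roughly 999,955, so seeing 999950 on one run is entirely consistent with a generator that can reach the whole range.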
This appears to be pretty poor. I’m sure the Perl people will just blame the Microsoft library, but this could be worked around in the Perl implementation. At the very least a note should be added to the rand() documentation to warn specifically that rand has a precision limit, and that this limit might be very low on some platforms.
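One way such a workaround might look, using nothing but the built-in rand(), is to stitch two 15-bit draws together into a 30-bit integer and then reject values that would bias the result. This is only a sketch: the sub names rand30 and big_rand_int are hypothetical, it assumes rand() supplies at least 15 usable bits, and it does nothing to improve the statistical quality of the underlying generator.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Build a 30-bit integer from two separate 15-bit draws.
sub rand30 {
    my $high = int(rand(32768));    # 15 bits
    my $low  = int(rand(32768));    # another 15 bits
    return $high * 32768 + $low;    # 0 .. 2**30 - 1
}

# Uniform integer in 0 .. $range - 1 for ranges up to 2**30.
# Values at or above the largest multiple of $range are rejected and redrawn,
# so the final modulo does not favour the low end of the range.
sub big_rand_int {
    my ($range) = @_;
    die "range must be between 1 and 2**30\n"
        if $range < 1 || $range > 2 ** 30;
    my $limit = 2 ** 30 - (2 ** 30 % $range);
    my $value;
    do { $value = rand30() } while ($value >= $limit);
    return $value % $range;
}

print big_rand_int(1_000_000), "\n" for 1 .. 5;
```

With 30 bits available, every integer in a range of a million is reachable, which is the property the original script was missing.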
Fortunately there are some proper workarounds for this problem. If you need to reliably generate random numbers from a large range in Perl, there are a few modules which provide more fully featured random number generators than the default rand() function. Two of the most popular are Math::Random and Math::Random::MT, either of which will work reliably and consistently on all platforms.
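As a rough sketch of how the quiz script's random draw could be swapped over to Math::Random::MT, the helper below returns a closure that yields integers in the requested range. The name make_int_generator and the seed value are my own choices. So that the script still runs where the module isn't installed, it falls back to the built-in rand() with a warning, which of course brings the original limitation back on affected platforms.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Returns a code ref that yields integers in 0 .. $range - 1.
# Uses Math::Random::MT (a CPAN module) when it can be loaded.
sub make_int_generator {
    my ($range, $seed) = @_;
    if (eval { require Math::Random::MT; 1 }) {
        my $mt = Math::Random::MT->new($seed);
        return sub { int($mt->rand($range)) };
    }
    warn "Math::Random::MT not installed; falling back to built-in rand()\n";
    srand($seed);
    return sub { int(rand($range)) };
}

my $next = make_int_generator(1_000_000, 42);
print $next->(), "\n" for 1 .. 5;
```

In the original script, the line assigning $rand would then become a call to $next->() and nothing else needs to change.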
Tags: integer, Perl, random
Posted in Computing
Recovered from defacing
Sunday, April 10th, 2011
Anyone visiting the site in the last couple of days may have found that I appeared to have simultaneously developed a fanatical interest in Middle Eastern politics and a very poor taste in music. Whilst it’s very convenient to have a simple-to-use blogging engine such as WordPress, I guess the downside is that you occasionally make yourself the target of automated attacks.
It’s not completely clear how the site was compromised. There’s only one admin account on the site, with a pretty unusual password (now changed) which wasn’t reused elsewhere, so that seems an unlikely avenue of attack. Some kind of compromise within WordPress seems more likely. The site was running the most recent version of WordPress offered by my web host, but a further point update had been released and not applied. I guess I need to learn my lesson and make sure it’s always running the latest release, even if that means doing the update myself.
Fortunately I had a full backup of the site to fall back on (whoever defaced the site also kindly deleted all of the existing content) so I was able to put everything back. I did however find that even without a backup I could have recovered every post I’d ever written from Google’s cache, so it’s very generous of them to provide such an efficient backup service for the whole internet.
Hopefully, now that I’m completely up to date with my WordPress version, there shouldn’t be any more problems on the site. Apologies for any inconvenience caused whilst the site was a mess.
Tags: backup, defaced, hack, recovered, wordpress
Posted in Computing