Archive - Jan 2007 - Blog entry


New Upgrade on BCroot, speed boost

A week ago, I released a workaround for bcpow() not accepting fractional exponents. It works, like all my scripts, but it is not fast. It took 12.152509 seconds to run the following statement on my PC.

bcroot(2,3072);

The main reason for the slowness is the algorithm. It calls bcpow() with an exponent taken from the second number passed to bcroot(), so for the statement above it raises a number to a power of over 3000!
PHP's bcsqrt() function is very fast, and with a little math you can see that 3072 = 1024 * 3 = 2^10 * 3. The new version of the function takes advantage of bcsqrt()'s speed: taking the 3072nd root is the same as taking the cube root and then taking the square root 10 times. So now Newton's method only has to find a cube root, which needs bcpow($x, 2), and bcsqrt() runs 10 times on the result. The result? Check this out:
0.000928 seconds.
You might want to know how I split a number into a power of 2 and another factor. My first attempt was to run bcmod() on the number repeatedly (halving it each time) until it returned 1. But then I noticed that in binary, a number with a power of 2 as a factor has trailing zeros, and the number of trailing zeros is the exponent of 2. The other factor is what remains in front. It helps to know the functions decbin() and bindec(). For example:
decbin(10) = 1010. 1 trailing zero, so 10 = 2^1 * bindec(101)
decbin(12) = 1100. 2 trailing zeros, so 12 = 2^2 * bindec(11)
decbin(16) = 10000. 4 trailing zeros, so 16 = 2^4 * bindec(1)
Neat... I spent about 10 minutes of my free time trying this out.
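The trailing-zero trick can be sketched on its own like this. The function name split_power_of_two() is made up for illustration and is not part of bcroot(); it assumes a positive integer small enough for decbin().

```php
<?php
// Hypothetical helper (not from the original post): split a positive
// integer $n into [$pow, $odd] so that $n == 2^$pow * $odd.
function split_power_of_two(int $n): array
{
    $bin = decbin($n);            // e.g. 12 -> "1100"
    $front = rtrim($bin, '0');    // what remains in front, e.g. "11"
    $pow = strlen($bin) - strlen($front); // number of trailing zeros
    return [$pow, bindec($front)];
}

list($pow, $odd) = split_power_of_two(3072);
echo "3072 = 2^$pow * $odd\n";
```

Using rtrim() instead of a loop is only a shortcut for counting the zeros; the loop in bcroot() does the same job one character at a time.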
Here is the new version of the function. It still needs the bcgetscale() function from my last post about bcroot().

//Version 0.2 of BCRoot
//Change log: uses bcsqrt() as much as possible for speed
//Fixed a small decimal bug where the last decimal could be wrong
 
function bcroot($a, $n, $scale='default'){
    $default = bcgetscale();//Get the scale
    if($scale === 'default'){//strict check, so a scale of 0 is not mistaken for 'default'
        $scale = $default;//use default scale
    }
    if($n & ($n-1)){//nonzero when $n is NOT a power of 2
        //decbin() is the reason $n can't be
        //larger than 2^31
        $bin = decbin($n);
        //count the trailing zeros of $n in binary
        $i = strlen($bin)-1;
        $pow = 0;
        while($i){
            if($bin[$i]==='0'){
                ++$pow; --$i;
            }else{
                break;
            }
        }
        $n = bcdiv($n, bcpow(2, $pow),0);//keep only the odd factor
        //now use Newton's method to find the $n-th root of $a
        bcscale($scale+15);
        $x = 1;
        $k = 0;
        $limit = ceil(log($scale+15)/log(2))+1;
        while($k<$limit){
            $t1 = bcdiv(1,$n);
            $t21 = bcmul(bcsub($n,1),$x);
            $t22 = bcdiv($a,bcpow($x, $n-1));
            $t2 = bcadd($t21,$t22);
            $x = bcmul($t1, $t2);
            ++$k;
        }
        $i = 0;
        while($i < $pow){
            $x = bcsqrt($x,$scale+3);
            ++$i;
        }
        bcscale($default);
        return bcadd($x,0,$scale);
    }else{
        //$n is a power of 2: just repeat bcsqrt(), because this is FAST
        $i = 0;
        $pow = round(log($n)/log(2));//round() guards against float error
        while($i < $pow){
            $a = bcsqrt($a,$scale+3);
            ++$i;
        }
        return bcadd($a,0,$scale);
    }
}

Failed to make a large random number generator

I tried to build a function that generates a large random number, and I ran into problems. Here is code that seems to work when you first see it, but it has a serious flaw. I will walk through it step by step:

function bcrand($min, $max, $rand='rand'){
    //first, subtract $min from $max, because in the end
    //we are going to add $min back to the generated number
    $rand_max = bcsub($max, $min, 0);
    $max_len = strlen($rand_max);

    //generate the digits one by one, from left to right;
    //$ismax stays true while every digit so far equals the matching
    //digit of $rand_max, so the next digit has to be capped as well
    $rand_num = array();
    $ismax = true;
    $i = 0;
    do{
        if($ismax){
            $rand_num[$i] = $rand(0, (int)$rand_max[$i]);
            $ismax = ($rand_num[$i] == $rand_max[$i]);
        }else{
            $rand_num[$i] = $rand(0, 9);
        }
        ++$i;
        usleep(mt_rand(0,10));//short random pause between digits
    }while($i < $max_len);

    //make the $rand_num array into a string
    $return_num = implode('', $rand_num);
    //add the min number back to the returning number;
    //this also takes out the leading zeros
    return bcadd($return_num, $min, 0);
}

The concept is pretty simple: generate each digit of the number, then put them together.
The first problem I met was that a digit could come out larger than the matching digit of the max number I specified. To solve that, I made it generate a first digit that is less than or equal to the max number's first digit. If the first digit comes out as the largest one possible, the next digit is generated the same way.
This worked out fine, but if you really think about it, it is not right.
Suppose it picks a number between 0 and 1000. When it picks the first digit (0 or 1), there is a 50% chance of getting 1, which forces all the remaining digits to be 0. So half of the time you get 1000, and the other 1000 numbers (0 through 999) share the remaining half.
So I still have to work on that a bit more.
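One common way around this kind of bias is rejection sampling: generate every digit freely from 0 to 9, and throw the whole number away and retry if it ends up above the limit. The sketch below is only one possible fix, not a finished bcrand(). The name uniform_digits_up_to() is made up, the limit is assumed to be a non-negative integer passed as a string, and random_int() needs PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original post): return a decimal
// string picked uniformly from 0..$limit, where $limit is a
// non-negative integer given as a string of digits.
function uniform_digits_up_to(string $limit): string
{
    $limit = ltrim($limit, '0');
    if ($limit === '') {
        return '0'; // the limit is zero, so zero is the only choice
    }
    $len = strlen($limit);
    do {
        // Each digit is uniform on 0..9, so every $len-digit string
        // is equally likely.
        $candidate = '';
        for ($i = 0; $i < $len; ++$i) {
            $candidate .= random_int(0, 9);
        }
        // For digit strings of equal length, string order is the same
        // as numeric order. Retry while the candidate is too large.
    } while (strcmp($candidate, $limit) > 0);

    // Take out the leading zeros, keeping a single "0" for zero.
    $trimmed = ltrim($candidate, '0');
    return $trimmed === '' ? '0' : $trimmed;
}

echo uniform_digits_up_to('1000'), "\n";

// With bcmath loaded, a number between $min and $max could then be:
// bcadd(uniform_digits_up_to(bcsub($max, $min, 0)), $min, 0)
```

Because rejected candidates are simply discarded, each number from 0 to the limit keeps the same chance. The cost is some wasted attempts, but since the limit has no leading zeros, more than one attempt in ten is accepted.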

OpenDomains

OpenDomains is an open source PHP script that lets you become a free domain provider.
I don't know how it works but I posted it here because it SOUNDS COOL.

OpenDomains

Bad Behavior

Bad Behavior just WORKS! The Bad Behavior block count in the footer of the page shows 231. That is pretty powerful, considering I only installed it 2 days ago. There is no spam in the comments, and only one spam message was found. I have just added Bad Behavior to my forum as well, so let us see how it does.

Bad Behavior works by checking the HTTP user agent of each request against a database of known bad bots and, if it matches, blocking the request entirely. Because this happens before the bot can even get into your site, it saves time by not loading the entire page. I think every CMS should incorporate this system, because for the basic functionality only one line of code needs to be added.
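To illustrate the idea only, here is a rough sketch of a user agent check. This is not Bad Behavior's actual code: the function name is_blocked_agent() and the blocklist entries are made-up placeholders.

```php
<?php
// Hypothetical blocklist; these names are placeholders, not real
// entries from Bad Behavior's database.
const BLOCKED_AGENT_FRAGMENTS = ['ExampleBadBot', 'EvilScraper'];

// Return true if the user agent contains any blocked fragment
// (case-insensitive).
function is_blocked_agent(string $userAgent, array $fragments = BLOCKED_AGENT_FRAGMENTS): bool
{
    foreach ($fragments as $fragment) {
        if (stripos($userAgent, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

// At the very top of a page script, before any expensive work:
// if (is_blocked_agent($_SERVER['HTTP_USER_AGENT'] ?? '')) {
//     http_response_code(403);
//     exit;
// }
```

Running the check before anything else is what keeps the rest of the page from being loaded for a blocked bot.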

How to stop Googlebot from scanning my site?

Someone in my forum asked this question. If you ask a question in my forum, you have a great chance of getting an answer from me. Some of the questions will be featured on my blog.

How to stop Googlebot from scanning my site?

Good day all.

On my server, I'm getting this logged by Apache:

66.249.65.109 - - [26/Jan/2007:10:49:46 -0500] "GET /tmp/logs/etc/?C=D;O=D HTTP/1.1" 200 965 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; http://www.google.com/bot.html)"

Which is Googlebot (as seen above). I would like to know how to stop it from scanning some folders (i.e. /var/www/tmp/ and /var/www/opt) and other ones. I don't know if this means editing robots.txt or something like that.

Thanks for your time

-kuzew.
2007-01-26

This is how to do the job:

Create a robots.txt file at your website root, so that it can be accessed at
www.yourwebsite.com/robots.txt
and put this in it:

User-Agent: Googlebot
Disallow: /

Googlebot sees this and will stop scanning anything on your site. I mean ANYTHING.
But if you only want Googlebot to stop scanning some items, you can try this:

User-Agent: Googlebot
Disallow: /dirname/
Disallow: /dirname2/somesubdir/

Also, if you replace Googlebot with *, all bots will stop crawling your site, except bad bots that do not follow robots.txt. Bad bots should be stopped using bot traps.
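
For the folders in the question, a version for all bots might look like the following. This assumes /var/www is the web root, so that the two folders are served at /tmp/ and /opt/; adjust the paths if your setup is different.

```
User-agent: *
Disallow: /tmp/
Disallow: /opt/
```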

Honey Pot that kills bots