Archive - Dec 7, 2006

Date
  • All
  • Jan
  • Feb
  • Mar
  • Apr
  • May
  • Jun
  • Jul
  • Aug
  • Sep
  • Oct
  • Nov
  • Dec

gotAPI

in

gotAPI, a faster way to search mostly used documentations in one place. HTML,Cascading Style Sheets (CSS),JavaScript,XML, PHP, MySQL and ever increasing documentations' result show by Ajax. No more point going to different sites to check functions.
gotAPI
Editor Comment:

If you are a developer who can't remember functions, this can be your homepage (else, Google)

Pingback

Pingback is one of the three types of Linkback system. A system is referred as Pingback when Site A calls(ping) Site B and tell them there is a link made by Site A to one of the Site B's document by XMLRPC. Webmasters, especially bloggers find this is a useful feature for their site. Most times people see Pingback in Wordpress Blogs (go to view source), but it can also appear in contents other than HTML, like images. For contents other than HTML, a necessary HTTP header modification is required to make Pingback work.
For more information about Pingback, check Wikipedia's Pingback page.
Pingback

Editor Comment:

I'm new to wordpress and blogging, and always trying to know more about it. Because I like to View Source webdevlogs, I found that pingback thing. In my entire HTML life, that's the first time I see it. I just have to find out what it's for.

The Forge of the best Weighted Random function

just normal random? that's 1999? what time it is now?
Weighted Random allows one thing have more chance to pop up than the other! and you define the weight(chance)!

When I first met PHP, I tried to make a lot of scripts, weighted random is one of them. I soon forgot about it after I build it. Now, after a long time of reading other people's code, I think I can improve my first version a lot.
My first version

function rand_string_pro($array){
	foreach($array as $var){
		$i = 0;
		while($i < $var['w']){
			$string[] = $var['s'];
			$i++;
		}
	}
return $string[array_rand($string)];
}

The input is a array like this(where $i can be any interger)

$array[$i]['s'] = 'string';
$array[$i]['w'] = 30;//only interger

I thought that code is perfect, until weeks ago I saw the flaws and start writing improvements. The list of flaws:
*huge weight can slow the script down, even produce memory overflow
*huge string with reasonable amount of weight can produce memory overflow too
*it does not seems FAST...

This list does not include error handling, because I don't think I need to program something to notice kind experienced programmers. Yes Yes, I know it is a bad act.

I did some improvement and released my 1.0 version

/*
        Weighted Random Ver 1.0 by Mgccl(mgcclx@gmail.com)
        Usage: input an numbered array
        $array[$i][?s?] is the string you want to return
        $array[$i][?w'] is the weight of the string
        you can allow the function to find the GCD
        sometimes it speeds up the script
        To use GCD, use the function like
        rand_string_pro($array, true);
 
        Note: You need the math functions I wrote in
        order to use GCD, you can find them here:
        /2006/11/21/some-math-functions/
        */
function rand_string_pro($seed, $gcd = false){
        $count = count($seed);
        if($gcd == true){
                foreach($seed as $var){
                        $gcd_array[] = $var[?w?];
                }
                $gcd = math_gcd_array($gcd_array);
                if($gcd != 1){
                        $i = 0;
                        while($i < $count){
                                $seed[$i][?w?] /= $gcd;
                                ++$i;
                        }
                }
                unset($gcd, $gcd_array);
        }
 
        $i = 0;
        while($i < $count){
                $n = 0;
                while($n < $seed[$i][?w?]){
                        $key[] = $i;
                        ++$n;
                }
                ++$i;
        }
        return $seed[$key[mt_rand(0, count($key)-1)]][?s?];
}

The new version have these improvements:
* Weight divided by the GCD so we have smaller weights
* Produce a new array to store the key, and the key refer to the "key name" of the variable seed. No more huge array in the size of value
* Used ++$i instead of $i++, small speed up

But there is still a flaw in the system. It still creates a huge array if the GCD is small compare to the weight. The performance could still be slow.

I released the version 1.1, more like a usability improvement version

/*
        Weighted Random Ver 1.1 by Mgccl(mgcclx@gmail.com)
        Update: Nov/25/06 allow non-weighted random, separate the
        string storing array and the weight storing array
        Usage: input $array[$i] = 'string' format(where $i is a number)
        $array[$i]is the string you want to return
        $weight[$i]is the weight of the string
        if one of the $array[$i] does not have a
        $weight[$i] as a match, it automatically
        set $weight[$i] as 1
        To allow use weighted function, call the function like this
        rand_string_pro($array, $weight);
        you can allow the function to find the GCD
        sometimes it speeds up the script
        To use GCD, use the function like
        rand_string_pro($array, $weight, true);
 
        Note: You need the math functions I wrote in
        order to use GCD, you can find them here:
        /2006/11/21/some-math-functions/
        */
 
function rand_string_pro($seed, $weighted = false, $gcd = false){
        $count = count($seed);
   if($weighted === false){
      return $seed[mt_rand(0, $count - 1)];
   }else{
       if($gcd === true){
            if(count($weighted) != $count){
            }else{
               foreach($weighted as $var){
                  $gcd_array[] = $var;
               }
               $gcd = math_gcd_array($gcd_array);
               if($gcd != 1){
                  $i = 0;
                   while($i < count($gcd_array)){
                       $weighted[$i] /= $gcd;
                      ++$i;
                   }
               }
                   unset($gcd, $gcd_array);
           }
       }
           $i = 0;
           while($i < $count){
              if(empty($weighted[$i])){
                 $key[] = $i;
              }else{
                   $n = 0;
                   while($n < $weighted[$i]){
                           $key[] = $i;
                           ++$n;
                   }
               }
                   ++$i;
           }
           return $seed[$key[mt_rand(0, count($key)-1)]];
      }
}

this version separated the input of the string array and the weight array
it also made the function usable when there is no weight present.

And finally, Version 1.2 is where I made the speed to the max.

/*
        Weighted Random Ver 1.2 by Mgccl(mgcclx@gmail.com)
        http://mgccl.com
        Update: Nov/29/06
        Faster Speed
        allow non-weighted random, separate the
        string storing array and the weight storing array
        Usage: input $array[$i] = 'string' format(where $i is a number)
        $array[$i]is the string you want to return
        $weight[$i]is the weight of the string
        if one of the $array[$i] does not have a
        $weight[$i] as a match, it automatically
        set $weight[$i] as 1
        To allow use weighted function, call the function like this
        rand_string_pro($array, $weight);
        */
 
function rand_string_pro($seed, $weighted = false){
        $count = count($seed);
        if($weighted === false){
                return $seed[mt_rand(0, $count - 1)];
        }else{
                $i = 0; $n = 0;
                $num = mt_rand(0, array_sum($weighted) + ($count-count($weighted)));
                while($i < $count){
                        if(empty($weighted[$i])){
                                ++$n;
                        }else{
                        $n += $weighted[$i];
                    }
                    if($n >= $num){
                        break;
                    }
                    ++$i;
                }
                return $seed[$i];
                }
}

This code is way more faster than the old version because:
1. The loop will break at the selected string
2. It use one numeric variable instead of a huge array

Basically, this one will be the fastest code possible until anyone find anything faster. And I have t o post my final copy of the code. Version 1.3 Alpha, it have some more functions:

		/*
        Weighted Random Ver 1.3 by Mgccl(mgcclx@gmail.com)
        Update: Dec/4/06
        Allow user to input the amount of the array to be chose
        Allow user to chose only unique items
        Update: Nov/29/06 
        Faster speed
        allow non-weighted random, separate the
        string storing array and the weight storing array
 
        Usage: input $array[$i] = 'string' format(where $i is a number)
        $array[$i]is the string you want to return
        $weight[$i]is the weight of the string
        if one of the $array[$i] does not have a
        $weight[$i] as a match, it automatically 
        set $weight[$i] as 1
        To allow use weighted function, call the function like this
        rand_string_pro($array, $weight);
        chose the amount you want to pick
        rand_string_pro($array, $weight, 3);
        chose only the unique
        rand_string_pro($array, $weight, 3, 1);
        */
 
 
 
	function rand_string_pro($array, $weight = 0, $amount = 1, $unique = 0){
		$count = count($array);
		if($amount == 1){
			if($weight == 0){
					return $array[mt_rand(0, $count - 1)];
 
			}else{
	    		$weighted_sum = array_sum($weight) + ($count-count($weight));
	        	$i = 0; $n = 0;
	        	$num = mt_rand(0, $weighted_sum);
	        	while($i < $count){
	        		if(empty($weight[$i])){
	        			++$n;
	        		}else{
	            	    $n += $weight[$i];
	           		}
	            	if($n >= $num){
	            		break;
	            	}
	            	++$i;
	        	}
	        	return $array[$i];
			}
		}else{
			if($weight == 0){
					$rand = array_rand($array, $amount);
					foreach ($rand as $var){
						$result[] = $array[$var];
					}
					return $result;
			}else{
				if($unique == 0){
					$max = array_sum($weight)+ $count - count($weight);
					$i = 0;
					while($i<$amount){
						$random[] = mt_rand(0, $max);
						++$i;
 
					}
					sort($random);
					$i = 0; $n = 0; $m = 0;
					$random_count = count($random);
					while(1){
						while($m >= $random[$i]){
							$key[] = $array[$n-1];
							++$i;
							if($i == $random_count){
								break 2;
							}
						}
						if($weight[$n]){
							$m += $weight[$n];
						}else{
							++$m;
						}
						++$n;
					}
				}else{
					$i = 0;
					if($amount >= $count){
						return $array;
					}else{
						$weighted_sum = array_sum($weight) + ($count-count($weight)) - 1;
						$sub = 0;
						while($i<$amount){
				    		$weighted_sum -= $sub;
				        	$n = 0; $m = 0;
				        	$num = mt_rand(0, $weighted_sum);
				        	while ($m <= $num){
								if($weight[$n]){
									$m += $weight[$n];
									$sub = $weight[$n];
									++$n;
								}else{
									++$m;
									$sub = 1;
									++$n;
								}
							}
							if($n != 0){
								$n = $n - 1;
							}
							$key[] = $array[$n];
							unset($weight[$n]);
							array_merge($weight);
							++$i;
						}
					}
				}
			return $key;
			}
		}
	}

Because I want to know if there is any other way to make it faster, I join forums and started a Contest for the fastest weighted Random script. Here is mine result of the benchmark:
test1: 0.116928815842 s
test2: 0.145362615585 s
test3: 0.141038894653 s

score: 1.38378739357

and here are the guys who send their work.

Submission from pterodactyl

function weightedRandom($array, $weight = array())
{
    $count = count($array);
    $sum = array_sum($weight) + $count - count($weight);
    $key = mt_rand(0, $sum);
    $n = 0; $i = -1;
    do {
        $n += isset($weight[++$i]) ? $weight[$i] : 1;
    } while ($key > $n);
    return $array[$i];
}

Result:

test1: 0.140782833099 s
test2: 0.145872592926 s
test3: 0.14186835289 s

score: 1.4365336895

Submission from krt

function weightedRandom($array, $weight=null) {
    if ($weight) {
        $rand = mt_rand(1, array_sum($weight));
        $cummulativeWeight = 0;
        foreach ($array as $k=>$v) {
            $cummulativeWeight += $weight[$k];
            if ($cummulativeWeight >= $rand) { $v; break; }
        }
    } else {
        return $array[array_rand($array)];
    }
}

Result:

test1: 0.106962442398 s
test2: 0.139887094498 s
test3: 0.133788108826 s

score: 1.31472468376

krt did not make the script accept a weight array with a condition to check if the $weight[$i] is set or not.

Submission from Daniel15

function rand_weighted($values, $weights) {
    // Loop through all values passed. If a weight isn't set for a value, assume 1
    foreach ($values as $key => $value)    if (!isset($weights[$key])) $weights[$key] = 1;
    // Now, get the total of the weights
    $total = array_sum($weights);
    // Stuff for percentiles
    $percentiles = array();
    $percentileTemp = 0;
    $percentileLast = 0;
 
    // Loop through each weight
    foreach ($weights as $key => $weight) {
        // Get the percentage chance for this
        $percent = ($weight / $total) * 100;
        // Add this to the running total, for the percentile
        $percentileTemp += $percent;
        // The last percentile value used. Used to find the minimum
        $percentileLast = $percent;
 
        // Now, set the percentile
        $percentiles[$key]['min'] = $percentileTemp - $percentileLast;
        $percentiles[$key]['max'] = $percentileTemp;
    }
 
    // Get a random value (0 to 100). Get 4 decimal places (not sure if there's a better way to do this)
    $random = mt_rand(0, 1000000) / 10000;
 
    //echo $random, '<br />';
    //echo '<pre>'; print_r($percentiles); echo '</pre>';
 
    // Loop through all percentile values
    foreach ($percentiles as $key => $percentile) {
        // If value falls in this percentile...
        if ($percentile['min'] < $random && $random < $percentile['max'])
            // We've found it. Return this string
            return $values[$key];
    }
 
}

Result:
test1: 0.238054037094 s
test2: 0.225346565246 s
test3: 0.227444648742 s

score: 2.28517484665
Daniel15 got a detailed description, but getting the percentile is not necessary.

codestryke

  function get_random_item($content, $weight, $value) {
 
  	// Add all the weights together.
  	$total_weight = 0;
	  foreach ($content as $entry) {
		  $total_weight += $entry[$weight];
	  }
 
 
  	// Generate random selection between 0 and totalweight - 1
	  $pick = mt_rand(0, $total_weight - 1);
 
 
  	// Find the content entry that matches the random number.
  	$cumulative_weight = 0;
  	foreach ($content as $entry) {
  		$cumulative_weight += $entry[$weight];
  		if ($pick < $cumulative_weight) {
  			return $entry[$value];
  		}
  	}
 
    return false;
 
  }

Result:
test1: 0.17635011673 s(input get_random_item($array, $weight=array())
test2: 0.193833112716 s
test3: 0.168747425079 s

score: 1.82810807228

I have changed part of the code (take out the $value part) in order for the testing to work. codestryke include the function to specify the name of the array's output string.

Honey Pot that kill bots