PHP

Chat log database, pidginlog

I built a simple chat logging database on top of the Pidgin HTML log parser.
It's one of the things I keep in mgccl's Google Code repository.
I call it pidginlog for now.
Although it is called a chat log, it is meant for logging IMs, not IRC or group chats.

You can find the download here, or check the SVN.

If you can't figure out how to use it, read the code; the logic should be pretty clear. It requires Pidgin and a PHP server running on the same machine.
You also need a cron job or some other scheduling service to run the script once in a while, otherwise this thing is not actually useful.
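
Because the script is run again and again over the same log files, whatever stores the messages has to cope with duplicates. The sketch below is not the pidginlog code from the repository; it is only one way a storage callback could look, assuming PDO with SQLite. The table name `chat_log`, its columns and the helper names `create_log_table()` and `store_message()` are all made up for illustration.

```php
<?php
// Hypothetical schema: one row per message. The UNIQUE constraint lets
// repeated runs over the same log file skip rows that are already stored.
function create_log_table(PDO $db) {
    $db->exec('CREATE TABLE IF NOT EXISTS chat_log (
        service TEXT, account TEXT, buddy TEXT,
        from_me INTEGER, sent_at TEXT, speaker TEXT, content TEXT,
        UNIQUE (service, account, buddy, sent_at, content)
    )');
}

// Store one message; returns 1 if a row was added, 0 if it was a duplicate.
function store_message(PDO $db, $service, $user, $other, $from_me, $time, $speaker, $content) {
    // INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    $stmt = $db->prepare('INSERT OR IGNORE INTO chat_log VALUES (?, ?, ?, ?, ?, ?, ?)');
    $stmt->execute(array($service, $user, $other, (int)$from_me, $time, $speaker, $content));
    return $stmt->rowCount();
}
```

The argument order follows the parser's logging callback, so a callback could simply forward its arguments to `store_message()` together with an open PDO handle.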

If you don't understand the code or have no idea what I just talked about, choose a user-friendly alternative like IM-history, Dexrex or Web Pidgin.

In the future I will make a web front end for this program... or not. phpMyAdmin is powerful enough as a front end.

Pidgin html log parser

I want to put my chat logs in a database so I can do a lot of fun operations on them. I wrote this PHP script to read Pidgin's HTML logs. I will write something separate to put them in a database.
I use the HTML logs instead of the text logs because they are unambiguous (for example, when someone copies a chat log and sends it to you) and they carry all the information (which nickname is the user? The one that's colored blue.).
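
To show why the color matters, here is a minimal sketch that parses two message lines with the same pattern the script further down uses. The two sample lines are made up to fit that pattern, and the buddy color A82F2F is an assumption; real Pidgin output may differ between versions.

```php
<?php
// Made-up sample lines in the shape the parser's pattern expects.
$lines = array(
    '<font color="#16569E"><font size="2">(10:15:02 PM)</font> <b>mgcclx:</b></font> hello there<br/>',
    '<font color="#A82F2F"><font size="2">(10:15:09 PM)</font> <b>buddy:</b></font> hi!<br/>',
);
$pattern = '@<font color="#(.*?)"><font size="2">\((.*?)\)</font> <b>(.*?):</b></font> (.*)<br/>@u';

$parsed = array();
foreach ($lines as $line) {
    if (preg_match($pattern, $line, $m) === 1) {
        $parsed[] = array(
            'from_user' => ($m[1] === '16569E'), // blue marks the account owner
            'time'      => $m[2],
            'speaker'   => $m[3],
            'content'   => $m[4],
        );
    }
}
```

Even if the buddy pasted text that looks like a log line, only the font color written by Pidgin decides who spoke.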

To achieve what I really want, it would be much better to write a plugin for Pidgin. I heard there was a Remote Logging plugin (which is exactly what I need), but I never saw anything come out of it. I hope someone wants to revive the project.

Some notes:
1. Pidgin can change the log format at any time and render this script unusable.
2. This is only useful for 1-on-1 IMs. In group conversations, it can only record whether you said something or someone else did.
3. A log will not be parsed if it is incomplete (which means it doesn't end with the closing HTML tags).

//this script reads a pidgin log html file.
//Configuration!!!!
//Logs to check
//The directory to the log file, without the last slash
$f = 'C:\Documents and Settings\UserXP\Application Data\.purple\logs';
//$s is an array of service names
//$u is an array of usernames; $u[$i] is the account on service $s[$i]
 
$s[]  = 'aim';
$u[] = 'mgcclx';
 
//use html or not
$html = 1;
//if use html, which html tags are allowed
//(strip_tags ignores self-closing forms such as <br/> in this list, so use <br>)
$html_allow = '<br><span><font><p><a>';
 
 
//Write your own logging function
//It is called once for every message
//$user_or_other is 1 when the account owner spoke, 0 when the other side did
function logging_function($service,$user,$other,$user_or_other,$time,$speaker,$content){
	return true;
}
 
for($i=count($s)-1;$i>-1;$i--){
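	//drop '.' and '..' explicitly; they are not guaranteed to be the first two entries
	$o = array_values(array_diff(scandir($f.'/'.$s[$i].'/'.$u[$i]),array('.','..')));
	for($j=count($o)-1;$j>-1;$j--){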
		$d[$j] = $f.'/'.$s[$i].'/'.$u[$i].'/'.$o[$j].'/';
		$files = file_list($d[$j],'html');
		for($k=0;$k<count($files);$k++){
			$log = parse_log($d[$j].$files[$k],$html,$html_allow);
			if($log===FALSE){
				continue;
			}
			for($l=0;$l<count($log);$l++){
				logging_function($s[$i],$u[$i],$o[$j],$log[$l][0],$log[$l][1],$log[$l][2],$log[$l][3]);
			}
		}
	}
}
 
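function parse_log($file_name,$html=0,$html_allow = '<br><span><font><p><a>'){
	$line = file($file_name);
	//unreadable or empty file
	if($line===FALSE||count($line)==0){
		return FALSE;
	}
	$c = count($line);
	//skip logs that are not complete yet (no closing tags)
	if(rtrim($line[$c-1])!='</body></html>'){
		return FALSE;
	}
	//the first line names the buddy and the date the conversation started
	if(preg_match("@Conversation with (.*?) at (.*?) (.*?) on (.*?) \((.*?)\)@u", $line[0], $match)!=1){
		return FALSE;
	}
	$date=$match[2];
	$prev = 'AM';
	$log = array();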
 
	for($i=1;$i<$c;$i++){
		if(preg_match('@<font color="#(.*?)"><font size="2">\((.*?)\)</font> <b>(.*?):</b></font> (.*)<br/>@u', $line[$i], $match)==1){
			if($match[1]=="16569E"){
				$match[1]=1;
			}else{
				$match[1]=0;
			}
			//an AM timestamp right after a PM one means midnight has passed
			if(substr($match[2],-2)=='AM'&&$prev=='PM'){
				$t = explode('/',$date);
				//86400 seconds = one day
				$date = gmdate("n/j/Y", gmmktime(0,0,0,$t[0],$t[1],$t[2])+86400);
			}
			$prev = substr($match[2],-2);
			$match[2] = $date.' '.$match[2];
			if(strpos($match[3],' &lt;AUTO-REPLY&gt;')!== FALSE){
				$match[3] = str_replace(' &lt;AUTO-REPLY&gt;','',$match[3]);
				$match[4] = '&lt;AUTO-REPLY&gt; '.$match[4];
			}
			if($html){
				//keep only the tags listed in $html_allow
				$match[4] = strip_tags($match[4],$html_allow);
			}else{
				$match[4] = str_replace('<br/>',"\n",$match[4]);
				$match[4] = strip_tags($match[4]);
			}
			$log[] = array($match[1],$match[2],$match[3],$match[4]);
		}
	}
	return $log;
}
 
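//this function is adapted from http://us3.php.net/manual/en/function.scandir.php
//by phpdotnet at lavavortex dot com
//ereg() was removed in PHP 7, so the extension test uses preg_match()
function file_list($d,$x){
	$l = array();
	foreach(array_diff(scandir($d),array('.','..')) as $f){
		//keep regular files whose name ends with $x (all files when $x is empty)
		if(is_file($d.'/'.$f)&&(($x)?preg_match('/'.preg_quote($x,'/').'$/',$f):1)){
			$l[]=$f;
		}
	}
	return $l;
}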
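
The midnight handling in parse_log() is the least obvious part, so here it is in isolation as a small sketch. The helper names `next_day()` and `date_times()` are hypothetical, and the month/day/year format is an assumption taken from how the script splits the date.

```php
<?php
// Return the day after $date, where $date is "month/day/year".
function next_day($date) {
    list($month, $day, $year) = explode('/', $date);
    // GMT has no daylight saving, so adding 86400 seconds is exactly one day.
    return gmdate('n/j/Y', gmmktime(0, 0, 0, (int)$month, (int)$day, (int)$year) + 86400);
}

// Attach a date to each 12-hour timestamp, moving to the next day
// whenever an AM time follows a PM time.
function date_times($start_date, array $times) {
    $date = $start_date;
    $prev = 'AM';
    $out = array();
    foreach ($times as $time) {
        $half = substr($time, -2);
        if ($half === 'AM' && $prev === 'PM') {
            $date = next_day($date);
        }
        $prev = $half;
        $out[] = $date . ' ' . $time;
    }
    return $out;
}
```

For a conversation that starts on 12/31/2008, the times 11:59:10 PM and 12:00:05 AM come out dated 12/31/2008 and 1/1/2009. The trick only sees a PM to AM switch, so a silence that spans a whole day without one would go unnoticed.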

Join names

Recently I started joining couples' names.

Larry and Eva.
After the join, we have
Larva, Lava(hot...) and Vary

Peter and Alex
After the join, we have
Telex, Petal and Alter

Matt and Angela
mage, lama, mange, mangel

It works with other strings too... like
Chao and Evil
...Hail!!!

Sweet and Ass
We have...
Asset
ok... that is a lame one...
I would go with Swass. But it's not a word, so my script will not show it.

Yeah, I wrote a naive script to suggest possible names from a dictionary. It's slow...

//$s1 = first person's name, $s2 = second person's name
//$a = an array of lowercase words, just google dictionary.txt
//$m = minimal length of the substring used from each name
//$x = minimal length of the output word
function joinname($s1,$s2,$a,$m=2,$x=4){
	//start with an empty result so the function always returns an array
	$r = array();
	$s1 = strtolower($s1);
	$s2 = strtolower($s2);
	$l1 = strlen($s1)+1;
	$l2 = strlen($s2)+1;
	if($m<1){
		$m=1;
	}
	for($i=$m;$i<$l1;$i++){
		for($i1=0;$i1<$l1-$i;$i1++){
			$t = substr($s1,$i1,$i);
			for($j=max($m,$x-$i);$j<$l2;$j++){
				for($j1=0;$j1<$l2-$j;$j1++){
					$t2 = substr($s2,$j1,$j);
					if(in_array($t.$t2,$a)){
						$r[] = $t.$t2;
					}
					if(in_array($t2.$t,$a)){
						$r[] = $t2.$t;
					}
				}
			}
		}
	}
	return $r;
}
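
Most of the slowness comes from in_array(), which scans the whole dictionary for every candidate. A possible speedup, sketched below under the name `joinname_fast()` (a made-up name, not part of the original script), is to flip the word list into array keys once so every lookup is a hash lookup. It also drops duplicate results.

```php
<?php
// Same substring loops as joinname(), but dictionary lookups use
// array keys (hash lookup) instead of in_array() (linear scan).
function joinname_fast($s1, $s2, array $words, $m = 2, $x = 4) {
    $dict = array_flip($words); // word => index, so isset() is cheap
    $s1 = strtolower($s1);
    $s2 = strtolower($s2);
    $l1 = strlen($s1);
    $l2 = strlen($s2);
    $m = max(1, $m);
    $found = array();
    for ($i = $m; $i <= $l1; $i++) {                 // substring length from $s1
        for ($i1 = 0; $i1 + $i <= $l1; $i1++) {      // substring start in $s1
            $t = substr($s1, $i1, $i);
            for ($j = max($m, $x - $i); $j <= $l2; $j++) {
                for ($j1 = 0; $j1 + $j <= $l2; $j1++) {
                    $t2 = substr($s2, $j1, $j);
                    // try both orders, as the original does
                    foreach (array($t . $t2, $t2 . $t) as $candidate) {
                        if (isset($dict[$candidate])) {
                            $found[$candidate] = true; // keys remove duplicates
                        }
                    }
                }
            }
        }
    }
    return array_keys($found);
}
```

Calling `joinname_fast('Larry', 'Eva', $words)` with a word list that contains larva, lava and vary finds all three. When joining many pairs of names, the flipped dictionary could be built once and reused.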

imagetrim() function

This PHP function trims a given color off the outer edges of an image and leaves a smaller image.

Example:
From this:
Circle
To this:
Trimmed circle
By using this:

$file = "circle.png";
$img = imagecreatefrompng($file);
imagepng(imagetrim($img,imagecolorat($img,0,0)),$file);

Source of the function

//Input an image resource and an integer representing the color
function imagetrim($img,$color){
	$mx=imagesx($img);
	$my=imagesy($img);
	//stays null until a pixel that differs from $color is found
	$minx = null;
	for($x=0;$x<$mx;++$x){
		for($y=0;$y<$my;++$y){
			if(imagecolorat($img,$x,$y)!=$color){
				$minx = $x;
				break 2;
			}
		}
	}
	//The image is filled with $color
	//(testing for null, because 0 is a valid left edge)
	if($minx===null){
		return null;
	}
	//default for content that is only one column wide
	$maxx = $minx+1;
	for($x=$mx-1;$x>$minx;--$x){
		for($y=0;$y<$my;++$y){
			if(imagecolorat($img,$x,$y)!=$color){
				$maxx = $x+1;
				break 2;
			}
		}
	}
	for($y=0;$y<$my;++$y){
		for($x=$minx;$x<$maxx;++$x){
			if(imagecolorat($img,$x,$y)!=$color){
				$miny = $y;
				break 2;
			}
		}
	}
	//default for content that is only one row high
	$maxy = $miny+1;
	for($y=$my-1;$y>$miny;--$y){
		for($x=$minx;$x<$maxx;++$x){
			if(imagecolorat($img,$x,$y)!=$color){
				$maxy = $y+1;
				break 2;
			}
		}
	}
	$img2=imagecreatetruecolor($maxx-$minx,$maxy-$miny);
	imagecopy($img2,$img,0,0,$minx,$miny,$maxx-$minx,$maxy-$miny);
	return $img2;
}
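
The function is really just a bounding-box search followed by a copy. The sketch below shows the search on a plain 2D array of color values, so it runs without GD. It uses a single pass over all pixels, which is simpler but does more comparisons than the four edge scans in imagetrim(); `trim_bounds()` is a made-up name.

```php
<?php
// Find the bounding box of all cells that differ from $color.
// $grid[$y][$x] holds the color at column $x, row $y.
// Returns array(minx, miny, maxx, maxy) with exclusive max values,
// or null when every cell equals $color.
function trim_bounds(array $grid, $color) {
    $minx = $miny = PHP_INT_MAX;
    $maxx = $maxy = -1;
    foreach ($grid as $y => $row) {
        foreach ($row as $x => $pixel) {
            if ($pixel !== $color) {
                $minx = min($minx, $x);
                $miny = min($miny, $y);
                $maxx = max($maxx, $x);
                $maxy = max($maxy, $y);
            }
        }
    }
    if ($maxx < 0) {
        return null; // nothing but background
    }
    return array($minx, $miny, $maxx + 1, $maxy + 1);
}
```

The width and height of the trimmed image are then maxx - minx and maxy - miny, the same values imagetrim() passes to imagecreatetruecolor() and imagecopy().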

Small update for BCext. Notes on PHPRPC and lifestream

I did a small update to BCext to improve its factorial function. The algorithm is described in my old post. I also found a flaw in the 4th formula.
Coming back to BCext reminded me of the bcrand() function I wrote a long time ago, which led me to PHPRPC. The author of PHPRPC created a not-so-fast large random number generator on top of bcmath, and I wanted to see if he had a newer version (no, it's still the old generator).

Opening the page surprised me. PHPRPC has come a long way; it's at version 3.0 now. Looking at the improvements and the benchmarks, I think I have found the future of my lifestream. PHPRPC is like XML-RPC... but faster and easier.

Speaking of lifestreams, here is some really lifestream-worthy stuff. WhatPulse. The problem with WhatPulse... security. Sometimes all I need is a reliable server that records how many keystrokes I have made. WhatPulse fails at that by having so much anti-cheat security that it made me lose many keystrokes. Maybe modify pykeylogger and send the results to your own server? People seeing how many keys you have stroked, in real time, on your profile page? AWESOME!

OK, maybe some of these are not lifestream-worthy, just life statistics.
Other possibilities: a spending log (I'm using GnuCash :), playlists and chat logs.

You will say there is stuff like this online already. RescueTime replaces ManicTime + a self-written script, WhatPulse replaces pykeylogger + a self-written script (well, this one is so easy I don't see a reason to use WhatPulse...), last.fm or whatever replaces foobar2000 + a self-written script that requires learning foobar2000's API, Mint replaces GnuCash + a self-written script that turns the GnuCash export into a more usable format, and IM-history replaces Pidgin + a self-written script that analyzes chat logs.

Nah. When you use those services, the data is not your data; it is their data. They don't give you the freedom to access your data (export, API), unless you pay a fee in some cases, like RescueTime.

Honey Pot that kill bots