Algorithm
Fastest integer multiplication algorithm not on wikipedia
Posted June 27th, 2008 by MgcclUpdate: It is, just not featured on multiplication algorithm page.
I just found it out today.
Someone should read it and understand it and wikipeida it. [That person is certainly, not me]
Faster Integer Mulitplication
- Mgccl's blog
- Add new comment
- 375 reads
Factorial Algorithms Part 1
Posted September 14th, 2007 by MgcclFactorial algorithms are the main topic I'm working on from 2 months ago. A lot interesting founds.
The naive algorithm was discussed in a previous post.
This post will introduce a new way to find the factorials. The algorithm used in GMP was described briefly in the manual.
The first step is to find out all the 2 factors. I don't know how exactly GMP would do that. Divide all the number have the potential of having 2s? find how many binary zeros in the end of the number and remove them? Could be any of those.
Second step is to collect up the terms. Which will result some possible powers like (3*5)^3.
Third step is to multiple everything inside each term, using binary split method that is nice in chain multiplication.
Some of my analysis result.
1. All odd number except 1 will be a factor.
2. There will be around floor(base 2 log(n)) terms when finding the factorial of n
Considering binary splitting multiplication will take the most of the time because it contains n/2 multiplications. The process suppose to run at O(n log(n)^3) when used with FFT based multiplication according to mathematical constants and computations site's description on binary splitting. I guess the GMP one could run faster because factor 2 is out and exponentiation by squaring further increase the speed.
- Mgccl's blog
- Add new comment
- 703 reads
Calculate the factorial from the bottom up
Posted August 30th, 2007 by MgcclAccording to my previous post about chain multiplication of 3 numbers, it's certain the most naive top to bottom recursive factorial algorithm that times itself by n*(n-1)*(n-2)...*2 is slower than the bottom to top 2*3*4....*n.
This post is about the proof of the concept.
T(n) as the time complexity of top to bottom multiplication for n!
B(n) as the time complexity of bottom to top multiplication for n!
The relationship can be presented like this

How do we compare these two values? calculate those 2 functions from 1 to 10000 does show B(n) is faster and grow slower. It is good evidence, still it's not a proof. Let's work on the proof now.
Take off those floor function and the + 1, these have to little significance on the growth. Change log to general log instead of base 2 log. Easier on writing and the properties of each express will hold. Because all log functions have the same logarithmic growth rate. And in this general log, it's not possible to have a number lesser than one since there is a + 1 in the expression we can't see.

Suddenly, the entire expression became easier. It doesn't take a genius to see that:
in the B(n),
log k logarithmically increases
log((k-1)!) = increases at lesser than linearlogarithmic rate but faster than linear rate.
e^n < n! < n^n
n < log(n!) < log(n^n)
n < log(n!) < n log n
In T(n), log(n!/(n-k)!) is increasing and log(n-k) is decreasing
log(n!/(n-k)!) increases like log((k-1)!)
log(n-k) decreases logarithmically
T(n) might actually smaller than B(n)!
Let's put everything inside one log.

Now, we can take off this log sign. Why? because log is a continuous increasing function. so the relationship between both summations is still the same without the log. This same technique can be applied to the log in the exponents. Notice I have removed the B(n) and T(n). B(n) and T(n) are not approximate to the 2 expression right now, but each expression still corresponds to the same greater or lesser relationship between B(n) and T(n).

Now, because I don't have the most amazing brain in the world, I will do some cheat...
I removed the bases, only leave the exponents!
The growth of the exponents is WAY faster than the base(factorial growth VS linear growth), it's ok to ignore the base.
Note I have also changed the first expression's k's initial value to 1.
So everything come down to this comparison

n! is equal to n*(n-1)!, because the 2nd expression got a n!, it is greater than the 1st expression. so calculate factorial by multiply from 2 to n is faster than n to 2.
More rigor proof is required if I want to publish a paper...
- Mgccl's blog
- Add new comment
- 1132 reads
Ideal chain multiplication for 3 positive integers
Posted August 30th, 2007 by MgcclSkim to the conclusion is fine.
Let us not assume the schoolbook multiplication's time complexity is O(1), but O(ab). Where a is the first number's binary digit count and b is the second number's binary digit count. Put another way, the complexity of multiplication is the product of the binary digit count of each multiple. of course this could be faster than O(ab), it would accord to how many zeros in the binary format of the number.
for example. 2×1024(10×10000000000), some optimized CPU only did one bit multiplication and one addition. But we can forget that kind of special case since it depend on CPU architecture most of the time.
If one of the multiple is 0, the product is always 0. Every equation after this point is assuming there is only positive integers.
Many people does not know. that chain multiplication's time complexity have something to do with the order of the multiplication. For example.
5×7×3 is in theory, slower than 3×5×7
3×5 = 15, complexity= 2×3 = 6
15×7 = 105, complexity= 4×3 = 12
Total complexity = 6+12 = 18
For 5×7×3
5×7 = 35, complexity= 3×3 = 9
35×3 = 105, complexity= 6×2 = 12
Total complexity = 9+12 = 21
Now, it's easy to formulate some mathematical model.
Before that, we have to define a function that count the binary digit of a number.D(x)

We know that ab and ba have the same time complexity.(O(ab) = O(ba)), so there are only 3 orders to calculate abc:
(ab)c
(ac)b
(bc)a
The complexity formula for each one of them



Expand the formula by replace all D(x) with equals? No, let's try expand D(ab) in the form of D(a) + D(b) + c first.
Look this though and carefully.


ahh, so D(ab) can be either D(a)+D(b)-1 or D(a)+D(b). Assume that D(ab), D(ac) and D(bc) all follow the later equation.
D(a)×D(b)+(D(a)+D(b))×D(c)=D(a)×D(b)+D(a)×D(c)+D(b)×D(c)
D(a)×D(c)+(D(a)+D(c))×D(b)=D(a)×D(b)+D(a)×D(c)+D(b)×D(c)
D(b)×D(c)+(D(b)+D(c))×D(a)=D(a)×D(b)+D(a)×D(c)+D(b)×D(c)
Wow, then all 3 equations is completely the same! Let's call this equation MAX.
To calculates most efficiently, we have to find one or more D(xy) = D(x)+D(y)-1
Suppose all 3 equations can have D(ab) = D(a)+D(b)-1
we will get these 3 results
D(a)×D(b)+D(a)×D(c)+D(b)×D(c)-D(c)=MAX-D(c)
D(a)×D(b)+D(a)×D(c)+D(b)×D(c)-D(b)=MAX-D(b)
D(a)×D(b)+D(a)×D(c)+D(b)×D(c)-D(a)=MAX-D(a)
Above equations states it's better to have the largest number multiply in the end.
Conclusion:
3 numbers, a,b and c.
x and y and z are 3 variables does not equal to each other(unless there is equality among a, b and c) and it's domain are {a, b, c}.
If it's not possible to find a D(xy) = D(x)+D(y)-1. then the order of the multiplication does not have effect on the time complexity.
If it's it is possible to find a D(xy) = D(x)+D(y)-1. then when z is the largest while the previous relation still holds, then it reduce the most on the time complexity.
After Note:
The chain multiplication introduced here is only for 3 numbers, but it can be applied to larger chains by calling this recursively.
I have googled around and I can't find any article about this subject, most likely it's because the time complexity it saves are extremely low(D(n) is approximately log(n)) and to find the perfect order for the chain multiplication would cost a lot more than the multiplication itself.
If I made any error(likely.) please leave a comment so I can admire your skill publicly.
- Mgccl's blog
- Add new comment
- 531 reads
1D segment intersection detection part 1
Posted July 4th, 2007 by MgcclIn a 1D space, find if 2 segments intersect or not is frequenly implemented in simple collision detection. 1D segment intersection detection can be extended into 2D rectangle intersection detection and 3D box intersection detection. Many games have to use these to detect if 2 object intersects. Sometimes, details like how much space is intersected are also essential for many game.
1D segment detection is super easy.
suppose there is a segment a, it have 2 points, one is x1, the other is x2. Then there is segment b, also with 2 points, y1 and y2. x1<=x2 and y1<=y2. Then, if x1<=y2 and y1<=x2, then the 2 segment intersects with each other.

The function in both C++ and PHP
function intersectsegment($a,$b){ if($a[0]<=$b[1]&&$a[1]>=$b[0]){ return 1; } return 0; }
struct segment{ int x1, x2; }; int intersectsegment(segment a, segment b){ if(a.x1<=b.x2&&a.x2>=b.x1){ return 1; } return 0; }
In theory, this is the fastest algorithm possible for finding the intersection between 2 segments.
Ahh, what a weak opponent... I don't even have the time to use my powerful algorithms... 
Time to tackle what I was here for:
Boss-- Find all the overlaps between set A, a few segments and set B, a huge amount of segments.
The naive idea: loop though each one of the set A segment with each one in set B. Suppose set B have n segments, because set B is a lot larger than A. The naive algorithm runs in Θ(n) time constantly.
The source code for the naive algorithm is on the attachment. For simplicity, I only counted how many intersections there are without returning the intersected segments.
If you test the source code, you will find if you take out
std::sort(b, b+size);
intersection detection will run slower. If anyone know the reason, please comment, thx in advance.
I have a lot more better algorithm. Meet one of them now :)
First sort the array of segments by their smaller value(x1).
Then, create a larger segment outside 2 adjacent smaller segments and create even larger segment covers the 2 large segments. Do the same operations until there is only one large segment left.
The large segments can be created with a function like this:
segment largegment(segment a, segment b){ segment tmp; tmp.x1 = min(a.x1,b.x1); tmp.x2 = max(a.x2,b.x2); return tmp; }
Start comparing segments in set A with the largest segments in segment B. if overlaps, compare with the 2nd level, then 3rd level until a point where both smaller segments does not overlap or loop till the smallest segments. If a large segment does not overlap, every smaller segments under the large segment is no longer useful.
The running time for best case(no overlap) will be Ω(1), worst case O(n) and most cases, it's Θ(log n).
In worst case, it actually slower than the naive algorithm, since it preform all the calculation in the naive algorithm and to calculate intersect for all the larger segments.
Oh, it's not over, this is only part 1. Some of my algorithmic predator are still waiting to kill its prey...
- Mgccl's blog
- 2 comments
- 611 reads







Recent comments
2 days 22 hours ago
3 days 7 hours ago
3 days 15 hours ago
3 days 15 hours ago
3 days 17 hours ago
3 days 18 hours ago
3 days 18 hours ago
4 days 36 min ago
5 days 10 hours ago
5 days 17 hours ago