Information

What I'm always looking for is Ontology

Bye taxonomy...
jk. There is not enough development in ontology yet to completely discard taxonomy. Even if ontology had more development, taxonomy would still dominate because of its simplicity.
Below I'm going to use "taxonomy" to mean taxonomy schemes unless I specifically say otherwise.

I'm interested in classifying data because I need to create a system that can find a particular math problem fast when all math problems come with descriptions. Mathematics has concrete definitions for each of its individual elements. It's the ideal model to test any classification system.

My previous views on classification were in fact restricted to the taxonomy system. It's simple, but it can get really complicated. Some things are extremely difficult to address, for instance, how exactly to describe a term.
My confusion with taxonomy is: are terms intended to be the only objects that describe terms?
In the most common taxonomies, only terms can be used to describe terms, because they only have terms and parent-child relationships. There is nothing else, so it's easy to implement and gets the job done.
I tried to divide terms into quality and quantity, which is like creating attributes to extend the taxonomy. I'm not a genius, and there are shortcomings in the model, including using taxonomy to describe attributes, which I think is a problem in all models ever created by man.

An example:
Triangle is a child term of Polygon.
A Triangle has 3 sides, but a Polygon can have [any integer from 3 upward] sides.
So sides should be an attribute. The different quantities of the attribute create the difference between terms.
In fact, even without manually setting Polygon as the parent of Triangle, after examining all the possible attributes of the two terms, the computer can see that Triangle is a child of Polygon.
So side is an attribute. But a side (synonym of edge) is just a 1-face. So side can have the attribute n-face set to 1. So an attribute is getting described by another attribute. Replace the word "attribute" with "term" in my last sentence, and it would be the standard model of taxonomy. The attribute+term model of taxonomy seems like the normal taxonomy system forced to create an unnatural border between completely identical concepts. I get confused, and I really want to understand whether attribute and term can be used interchangeably or not.
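
Here is a rough sketch of how a computer could see that Triangle is a child of Polygon from the attributes alone. It is only my illustration, not an established algorithm: the `TERMS` table, the interval encoding of "sides", the extra Quadrilateral term and the name `is_child_of` are all assumptions made up for this example.

```python
import math

# Each term maps an attribute name to a closed interval (low, high) of
# allowed values. math.inf stands in for "no upper bound".
# These terms and their encoding are hypothetical, for illustration only.
TERMS = {
    "Polygon": {"sides": (3, math.inf)},
    "Triangle": {"sides": (3, 3)},
    "Quadrilateral": {"sides": (4, 4)},
}


def is_child_of(child, parent, terms=TERMS):
    """Return True if `child` satisfies every attribute constraint of `parent`.

    A child must define each of the parent's attributes, and its allowed
    interval must lie inside the parent's interval.
    """
    if child == parent:
        return False
    for attr, (p_low, p_high) in terms[parent].items():
        if attr not in terms[child]:
            return False
        c_low, c_high = terms[child][attr]
        if c_low < p_low or c_high > p_high:
            return False
    return True


if __name__ == "__main__":
    print(is_child_of("Triangle", "Polygon"))
    print(is_child_of("Polygon", "Triangle"))
```

No parent-child link is stored anywhere in this sketch. The relationship falls out of comparing the allowed values of "sides".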

But worry no more... I have started to focus my attention on ontology.
All the quotes are from Wikipedia.

Definition of Ontology:

An ontology is a representation of a set of concepts within a domain and the relationships between those concepts.

A review of what taxonomy means:

Taxonomy is the practice and science of classification.

We can't compare ontology with taxonomy. The definitions show that comparing them is like comparing water with kittens [1].
We can compare ontology models and taxonomic schemes. Both can do the essential thing I want: show the relationships between items and their properties.

With unlimited system resources, a taxonomy really could classify everything by exhaustively converting every attribute and its associated values into terms. Like the following one:

With unlimited computational and storage resources, I would tell you right now that the current multi-inheritance taxonomy scheme is perfect. There would be no need to develop a specific ontology model for anything. We could all sleep at night knowing another great challenge is defeated.

But obviously no one is going to represent every single number as a separate term in a taxonomy. You can, if you have relatively few attribute and value combinations. Each combination requires 3 slots of storage: one refers to the attribute, one refers to the value, and one refers to the term that is the combination of those two. It constructs a huge web and needs huge computational power.
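
To get a feel for the cost, here is a small sketch that turns attribute and value combinations into terms and counts the 3 slots per combination described above. The function names `materialize` and `slots_needed` and the `attribute=value` naming of the combined term are my own assumptions for illustration.

```python
def materialize(attributes):
    """Turn every (attribute, value) combination into its own term.

    `attributes` maps an attribute name to an iterable of values.
    Each row holds the 3 slots: the attribute, the value, and the
    combined term (named "attribute=value" here, a made-up convention).
    """
    rows = []
    for attr, values in attributes.items():
        for value in values:
            rows.append((attr, value, f"{attr}={value}"))
    return rows


def slots_needed(attributes):
    """3 storage slots per attribute/value combination."""
    return 3 * len(materialize(attributes))


if __name__ == "__main__":
    # Only polygons with 3 to 102 sides: already 100 combinations.
    print(slots_needed({"sides": range(3, 103)}))
```

The count grows with every attribute and every value added, which is why this only works when there are relatively few combinations.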

An ontology model basically solves that problem. Ontology is almost like the math we see every day! It has three significant advancements compared to normal taxonomy: attributes, axioms and restrictions. Refer to Wikipedia on these, because I'm not as expert as Wikipedia in this particular topic. Wikipedia is like living proof of a very loose ontology.

I'm looking forward to OWL becoming part of Drupal one day. Then, when it starts to have service APIs for distributed Drupal and data, Drupal will be the perfect CMS for everything.

What is beyond the current ontology?
Humans start to talk in a constructed language that's syntactically unambiguous (the only one known to me is Lojban) so machines can, seriously, understand what we are talking about. Then all the ontology and taxonomy structures are built by machines automatically.

[1] No, they are not both edible.

The taxonomy systems and their problems

In this informational world, it becomes more and more difficult to classify data. So far many ways have been suggested, and each one of them has some flaws.

Trees
The tree is a classic taxonomy system. It's widely used in classifying living organisms, in file systems and in the organization of most papers. Each parent term can have a few or no child terms. Each child term can have only one direct parent term.
Pros
1. easy to navigate
2. easy to manage
Problems
The importance of each term can be understood differently. For example, should equilateral triangle be equilateral polygon->triangle, or should it be triangle->equilateral?
Easy navigation requires some knowledge of how items are categorized.

Multiple inheritance
It's the same as the tree system, but each term can have more than one direct parent term. The equilateral triangle can be classified as a child term of both equilateral polygon and triangle.
Pros
1. easy to navigate
2. easy to manage, though more difficult than a tree
Problems
Since equilateral triangle belongs to both equilateral polygon and triangle, and both of those belong to polygon, should equilateral polygon and triangle be classified as the same kind of item? Surely triangle, pentagon and hexagon are more closely related to each other than equilateral polygon is to triangle. Should a new parent be created just to tell these two apart? Like "relative length of the polygon sides" and "number of sides".
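
A minimal sketch of a multiple inheritance scheme, to make the shape of the problem concrete. The `PARENTS` table and the `ancestors` function are assumptions of mine for illustration, not any particular system's data model.

```python
# Each term lists its direct parents. A tree would allow at most one
# parent per term; here "equilateral triangle" has two.
PARENTS = {
    "polygon": [],
    "triangle": ["polygon"],
    "equilateral polygon": ["polygon"],
    "equilateral triangle": ["triangle", "equilateral polygon"],
}


def ancestors(term, parents=PARENTS):
    """Collect every direct and indirect parent of `term`."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


if __name__ == "__main__":
    print(sorted(ancestors("equilateral triangle")))
```

In this structure "triangle" and "equilateral polygon" look like the same kind of thing: both are plain children of "polygon". Nothing records that one classifies by number of sides and the other by relative side length, which is exactly the question above.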

Naive Tagging
Pros
1. Easy to tag
2. Easy to navigate when there is a smart system
3. Social tagging power
Problems
There is no sense of parent and child. Someone can tag equilateral triangle "math", "geometry" and "polygon". But we know polygon is part of geometry, which is part of math. If someone searches "math" and an item was only tagged "geometry", it will not show up. To make an item fully accessible, it has to be tagged with a tag and all of its parent tags.
Several words can have the same meaning. Different users would tag the same thing differently, and as a result a search for one tag does not show items carrying the other. For example, "Triangle" and "Polygon with 3 sides" are identical by the definition of triangle, but they are different tags.
A word can mean different things. "Python" can mean a programming language or a particular group of snakes. Search for python the language, and a lot of unintended snakes turn up.
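
The parent and child problem can be sketched like this: if the system knows which tag sits under which, it can expand tags at search time instead of forcing people to add every parent tag by hand. The `PARENT` map, the `ITEMS` data and the function names are hypothetical, made up for this example, and this is exactly the knowledge that naive tagging does not have.

```python
# Hypothetical single-parent hierarchy between tags.
PARENT = {"polygon": "geometry", "geometry": "math", "math": None}

# An item that was only tagged "geometry".
ITEMS = {"equilateral triangle": ["geometry"]}


def expand(tag):
    """Return the tag followed by all of its parent tags."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = PARENT.get(tag)
    return chain


def search(items, query):
    """Find items whose tags, or the parents of those tags, match `query`."""
    return [
        name
        for name, tags in items.items()
        if any(query in expand(tag) for tag in tags)
    ]


if __name__ == "__main__":
    print(search(ITEMS, "math"))
    print(search(ITEMS, "polygon"))
```

Searching "math" now finds the item tagged only "geometry". Searching "polygon" still does not, since expansion only goes upward.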

Finally, after Drupal's Taxonomy system came out, I saw the best categorizing system yet.
Tagging with multiple inheritance and synonym support + more than one vocabulary
The most powerful taxonomy system Drupal can offer is tagging with multiple inheritance and synonym support. And as an extra, a "related tags" field. Using related tags is not formal, because there is no standard that defines what the relation between the tags is and how to act on different relations.
Vocabularies are completely different sets of terms; they have no relationship with one another. So they can contain the same term ("python") but use it differently (a vocabulary of programming and a vocabulary of snakes). Basically, it's the same as categorizing each set of terms under one single parent, except that the sets are defined not to have any relation with each other.
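
To show the idea, here is a tiny sketch of synonyms and vocabularies. It is not Drupal's actual API or database schema; `TERMS`, `SYNONYMS` and `resolve` are names I made up, and the "geometry" vocabulary is an assumption for the example.

```python
# A term is identified by (vocabulary, name), so the same name can
# live in two vocabularies without clashing.
TERMS = {
    ("geometry", "triangle"),
    ("programming", "python"),
    ("snakes", "python"),
}

# (vocabulary, synonym) -> canonical term name in that vocabulary.
SYNONYMS = {
    ("geometry", "polygon with 3 sides"): "triangle",
}


def resolve(vocabulary, label):
    """Map a user-typed label to its canonical (vocabulary, term) pair."""
    name = label.strip().lower()
    canonical = SYNONYMS.get((vocabulary, name), name)
    if (vocabulary, canonical) not in TERMS:
        raise KeyError(f"unknown term {label!r} in vocabulary {vocabulary!r}")
    return (vocabulary, canonical)


if __name__ == "__main__":
    print(resolve("geometry", "Polygon with 3 sides"))
    print(resolve("programming", "Python") == resolve("snakes", "Python"))
```

Both spellings of triangle end up at the same term, while the two pythons stay apart because the vocabulary is part of the key.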
Pros
1. Really powerful at classifying data
Problems
Besides the huge difficulty of constantly adding synonyms for a tag, formatting all the possible inheritance, and a lot of database queries, this system still can't fix the multi-meaning word problem. If the taxonomy managers are careful, though, it will not happen. The basic rule: never let a term be ambiguous. For example, instead of using "Python" for both the snake and the programming language, use "Python (programming language)" and "Python (snake)". This problem does not exist in a tree-based system. If anyone can think of a better suggestion, please let me know :)
Also, a few categorizing problems seem impossible unless one knows how to define vocabularies perfectly. For a personal example, let's say there are two vocabularies, time and space. An item appeared in China in 2001, and also appeared in the US in 2003. So this item will be tagged:
Space: US, China
Time: 2001, 2003
But the places are not directly associated with the times, so someone with no knowledge about the item can infer that it also appeared in China in 2003, a wrong statement. When I first had the Space and Time vocabularies, I found they made perfect sense: users can browse by time or by space. Then I finally met a problem like this and had to consider redefining the vocabularies. The only solution is to give up the two separate vocabularies and combine them: create terms like "China 2001" that inherit from "2001" and "China". If we think about it, in many cases only one term directly inherits from a given pair of terms, and then I like to think of it as a synonym of the two terms combined. The disadvantage is huge database usage. Say there are 200+ countries and humans have recorded history for more than 5000 years. If the original database contains about 5000 terms now, it would grow to over a million terms.
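
The problem and the cost of the fix can both be sketched in a few lines. This is only an illustration with made-up names (`APPEARANCES`, `false_inferences`, `max_combined_terms`), and it takes 200 countries and 5000 years as round numbers.

```python
from itertools import product

# What really happened: (place, year) pairs.
APPEARANCES = {("China", 2001), ("US", 2003)}


def false_inferences(appearances):
    """Pairs a reader could infer from two separate vocabularies
    (Space and Time) that never actually happened."""
    space = {place for place, _ in appearances}
    time = {year for _, year in appearances}
    return set(product(space, time)) - set(appearances)


def max_combined_terms(n_places, n_years):
    """Upper bound on combined terms such as "China 2001"."""
    return n_places * n_years


if __name__ == "__main__":
    print(sorted(false_inferences(APPEARANCES)))
    print(max_combined_terms(200, 5000))
```

With separate vocabularies the item carries two wrong pairs, ("China", 2003) and ("US", 2001). With combined terms nothing wrong can be inferred, but 200 places times 5000 years gives up to 1,000,000 terms.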

Future...
I want to hear suggestions on how these problems can be solved, because I have no idea.

Designline - A Design Timeline

Designline - A Design Timeline shows how a site goes from scratch to the final product. Here is a copy that is stored on this site.

designline openair


Editor Comment:

Wow, this is cool. Going to make one myself.

Eboy's New FooBar Poster

The Eboy group released its new poster, which features over 20 popular web 2.0 sites. Check it out! Here are the 2 posters related to the web:

Foobar

Communication city
Editor Comment:

Very Pixel... Very Very Very Pixel.

Google Adsense Sparklines Visualization


Google Adsense Sparklines Visualization shows Sparklines in action. It shows how to pull in Google's Adsense information and then turn it into a Sparkline graph.

Editor Comment:

I tested it, and here is my version: Adsense Sparkline For RSS Feeds of Google Adsense earnings. Try Adsense Earnings RSS Feed.
