The Open GRiD Project

Maxim L. Lifantsev

    


Why should I
read this?

What does it
stand for?




What and Why
is this?



    A new architecture for Internet searching, categorization, and ranking is being implemented: it is to be contributed to, maintained, and controlled collectively and consistently by all the people who surf and create Web pages.

Why do we
need this
new thing?

    Potential global implications of implementing the architecture seem to be surprisingly substantial for many areas other than just Internet searching.

What are those
implications?

    The basic idea is to capture the opinions of a great many people about each Web page in order to find out their weighted collective opinion about the page, and then to use that opinion as the main part of the page description (together with its title, keywords, text, etc.) when making a Web search. That is, we will build a collaborative information filtering, classification, and ranking system.

What exactly
is proposed?

Existing
analogs?


    Intuitively, the proposed method gives everyone the ability to easily express his/her opinion about anything; and to easily find out the collective expressed opinions of all the other people on the Web about anything.

What are the
main benefits?

    The proposed idea can also be viewed as a neat way to bring the benefits of peer review and peer recognition (which are widely adopted, e.g., in science and in the Linux/free (open source) software developer community) to every part of the Internet.

Tell me more
about this!

    Technically, the idea builds on the methods used by the Google search engine (it can also be seen as the next step after the Open Directory). The proposed extensions, however, should allow much more flexible searches and much more accurate search results and categorizations compared to the current search engines and directories (including Google and the Open Directory).

Tell me more
about this!

    The implementation of the Open GRiD project has started, but it requires the collective effort of many different people (both developers and opinion content creators), that is, your effort, Dear Reader.

What
can I do?


-> To the Open GRiD Project Research Papers
These papers provide a more recent and more rigorous view of the project than what is presented and referenced below, but the latter is a less technical and sometimes more detailed presentation of the ideas behind the papers.


Notes on Reading The OpenGRiD Project Description Document


Table of Contents

-> To the Frequently Asked Questions

-> To The Open GRiD Project Architecture Proposal

-> To the Software Download Page

<- Back to the main page of the Open GRiD project.

This project is constantly evolving.
Check regularly for news and updates or simply subscribe to the news mailing list.


The Problem

What is one of the main problems of the Internet today?
The problem (which I think is the main problem) is as follows:
It is unreasonably difficult to efficiently find all the relevant information one is looking for that is available somewhere on the Internet. And one never knows how good what he/she has found is, or what he/she may have missed.
That is, the currently available search engines and directories are not adequate for the ever-growing complexity and size of the Web.

The problem with most current Internet search engines is that the search results are based *only* (or mostly) on the contents of the page the search engine gives you as the result of your query. (Categorizing directories like the Open Directory, formerly NewHoo (http://dmoz.org/), and Yahoo! (http://www.yahoo.com/) are discussed in the next section.)

The above searching principle works only as well as the honesty and, more importantly, the knowledge of each person who set up a particular Web page allow. If the author describes his/her Web page (by its title, keywords, text, etc.) correctly and accurately (with respect to the methods used by the search engines), you can find his/her Web page using current search engines.

But his/her page will not get lost among the hundreds of other results a search engine gives you only if *all* (or a sufficient majority of) Web authors describe their pages accurately.

What makes things even worse is that HTML has almost no facilities designed to help search for information on the Web (it was designed simply as a language instructing a browser how to display the information).

Obviously, no search engine (or any single firm/organization/country) can ensure that all Web authors describe their pages accurately. This is simply impossible because of the limited knowledge of each Web author, even if each of them is absolutely honest and does his/her best.

For more information on how search engines work and select the pages to display first see http://searchenginewatch.com/webmasters/work.html and http://searchenginewatch.com/webmasters/rank.html.

Also, it is impossible in principle for any search engine to extract accurate information about a Web page by looking solely at what is written on the page itself (which is what most search engines do): to do that, the search engine would have to know *everything* and be able to apply this knowledge to judge the relevance/quality/etc. of the Web page.

Just recall the frustration you frequently feel when trying to find something nontrivial using current search engines: you get hundreds or thousands of search results, most of which are completely unrelated to what you were looking for.

Now imagine a much better situation: one in which you can draw on all the brain power and time of thousands of previous surfers to tell you which pages are (more) relevant and provide the (better) information/services/etc. you are looking for. Read on to find out how we can make this a reality.

Another point is that current search engines can be fooled pretty easily: remember those XXX porn sites popping up in the first 50-100 search results when you search for something vague and popular? Such sites simply put a bunch of popular search words and phrases into the fields that describe the page to a search engine.
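
To make this concrete, here is a toy sketch in Python. It is not how any real search engine scores pages; the scoring function, the page texts, and the site names are all made up for illustration. It only shows why a score that depends solely on what a page says about itself is easy to game.

```python
from collections import Counter


def content_score(page_text: str, query: str) -> int:
    """Toy content-only score: how often the query words occur on the page."""
    words = Counter(page_text.lower().split())
    return sum(words[term] for term in query.lower().split())


# Hypothetical pages: an honest one and one stuffed with popular search words.
pages = {
    "honest-news.foo": "daily computer news and reviews",
    "stuffed.foo": "computer news computer news computer news free free free",
}

query = "computer news"
# Sort pages by their self-described content only, best score first.
ranking = sorted(pages, key=lambda url: content_score(pages[url], query), reverse=True)
print(ranking)
```

In this sketch the stuffed page comes out first simply because it repeats the query words more often; nothing in the score reflects what anybody else thinks of the page.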


Towards a Solution

In this section we discuss several existing approaches to radically improving the quality of search results.
(A similar review-style analysis of most of the approaches discussed below is given on the Search Engine Watch site: see http://searchenginewatch.com/sereport/9808-clicks.html.)

Part of the subsection "How the Proposed Solution is Related to What We Know" later in this document describes how the proposed solution relates to the approaches described next.


Google Linking Ranks


Google Link Text Consideration


IBM's "Clever" Project


Human-Maintained Directories


Link Click Counting by Direct Hit


The Proposed Solution

The proposed solution is based on a small but crucial extension of the HTML standard (together with corresponding modifications to a search engine similar to Google).

The modification of the HTML standard is that we add two optional fields to a hyperlink:

In this case a link from a site www.newsexperts.foo of the following form

would mean that the authors of www.newsexperts.foo consider the information on www.somenews.foo regarding computer news to have a goodness/value/quality rank of 80% out of 100%.

A link of the form

means that the authors of www.newsexperts.foo consider international news on www.somenews.foo to have a rank of 10%.

And a link of the form

means that www.newsexperts.foo considers business news on www.somenews.foo to have a rank of -30%. That is, they think that trusting or using that news might be harmful to some extent; in other words, www.newsexperts.foo recommends against relying on business news from www.somenews.foo, with the strength of the recommendation being 30%.

Links without keywords and values would mean that there is no statement about the contents of the referred page made by the referring site.

Links with these new fields will be called "voting links" or "opinion links" further in this document.
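
As a rough illustration of how software might read such voting links, here is a minimal Python sketch. The attribute names "keywords" and "value", the percent notation, and the class names VotingLink and VotingLinkParser are assumptions made for this example only; they are not a definition of the proposed HTML syntax.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class VotingLink:
    target: str               # URL the link points to
    keywords: Optional[str]   # category the opinion is about, e.g. "computer news"
    value: Optional[float]    # opinion from -1.0 to 1.0 (that is, -100% to 100%)

    @property
    def is_vote(self) -> bool:
        # A link lacking either field makes no statement about the target.
        return self.keywords is not None and self.value is not None


class VotingLinkParser(HTMLParser):
    """Collects <a> tags; 'keywords' and 'value' are ASSUMED attribute names."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[VotingLink] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        fields = dict(attrs)
        if "href" not in fields:
            return
        raw = fields.get("value")
        # Convert a percentage such as "80%" or "-30%" into a fraction.
        value = float(raw.rstrip("%")) / 100.0 if raw else None
        self.links.append(VotingLink(fields["href"], fields.get("keywords"), value))


# Hypothetical markup as it might appear on www.newsexperts.foo.
page = (
    '<a href="http://www.somenews.foo/" keywords="computer news" value="80%">A</a>'
    '<a href="http://www.somenews.foo/" keywords="business news" value="-30%">B</a>'
    '<a href="http://www.somenews.foo/">C</a>'
)
parser = VotingLinkParser()
parser.feed(page)
votes = [link for link in parser.links if link.is_vote]
```

Here the first two links are picked up as votes, while the third, which has neither field, is kept as an ordinary link that states no opinion.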

You can say that these are all very subjective opinions of www.newsexperts.foo about www.somenews.foo. That is absolutely true. But when one computes, for example, the "computer news" rank of www.somenews.foo based on *all* such opinion-loaded links from all sites worldwide, taking into account the "computer expert", "computer news expert", etc. ranks of all these sites, one gets a very precise picture of the quality of computer news on www.somenews.foo as judged by *everybody* on the Web who has such voting links and whose vote has either directly or indirectly influenced the computation of this particular www.somenews.foo rank.

This rank computation scheme again has a recursive (cyclic) nature very similar to the one used in Google. But one difference is that for each page there might be many ranks reflecting its value in different categories.

I do not claim that some particular rank computation scheme is *the* best. The point is that once we have all this voting data, one can implement many different strategies to compute ranks reflecting many different things.
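
To illustrate what one such strategy could look like, here is a deliberately simple Python sketch. It is only one of many possible schemes and is neither Google's algorithm nor a definitive part of this proposal. As a simplifying assumption, each vote is weighted by the voter's own rank in the same category (rather than in related "expert" categories), plus a small base weight; the function name, parameters, and site names are all hypothetical.

```python
from typing import Dict, List, Tuple

# (voter, target, category, value) with value in the range -1.0 .. 1.0
Vote = Tuple[str, str, str, float]
Ranks = Dict[Tuple[str, str], float]


def compute_ranks(votes: List[Vote], rounds: int = 20,
                  base_weight: float = 0.1) -> Ranks:
    """Iteratively compute one rank per (site, category) pair.

    Each vote counts in proportion to the voter's current rank in the same
    category (negative ranks are treated as zero) plus a small base weight,
    so that sites nobody has rated yet still have some say.
    """
    ranks: Ranks = {}
    for _ in range(rounds):
        totals: Dict[Tuple[str, str], float] = {}
        weights: Dict[Tuple[str, str], float] = {}
        for voter, target, category, value in votes:
            w = base_weight + max(ranks.get((voter, category), 0.0), 0.0)
            key = (target, category)
            totals[key] = totals.get(key, 0.0) + w * value
            weights[key] = weights.get(key, 0.0) + w
        # The new rank is the weighted average of all votes received.
        ranks = {key: totals[key] / weights[key] for key in totals}
    return ranks


# Hypothetical votes: a respected expert site and an unknown site disagree.
votes = [
    ("reader.foo", "newsexperts.foo", "computer news", 0.9),
    ("newsexperts.foo", "somenews.foo", "computer news", 0.8),
    ("random.foo", "somenews.foo", "computer news", -0.3),
]
ranks = compute_ranks(votes)
```

With these made-up votes, a plain average of the two opinions about www.somenews.foo would be 0.25, but because newsexperts.foo is itself highly ranked for computer news, its vote dominates and the rank settles at about 0.7. A site can have a different rank in every category, which is the difference from a single per-page score noted above.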

As I see it, once the Open GRiD Project is widely implemented, anybody will be able to run searches like the following examples:

The important thing to note is that the results one gets are the best available reflection of the opinion expressed by everybody who was willing to express it and could physically do so (using a particular method of a particular search engine to aggregate the opinions).

Another point is that everyone is free to make any kind of search and interpret the search results the way he/she wants:

The point is that you determine what you want to find and how to use the results; the Open GRiD Project, once implemented, will simply give everyone the ability to make those searches and get those results easily.
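
As a small sketch of what "you determine what you want to find" could mean in practice, the Python function below filters and orders sites by their rank in a category chosen by the searcher. The function name, the rank table, and the threshold parameter are hypothetical and are assumed only for this example.

```python
from typing import Dict, List, Tuple

Ranks = Dict[Tuple[str, str], float]


def search(ranks: Ranks, category: str,
           min_rank: float = 0.0) -> List[Tuple[str, float]]:
    """Return (site, rank) pairs for one category, best rank first."""
    hits = [(site, rank) for (site, cat), rank in ranks.items()
            if cat == category and rank >= min_rank]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical, already aggregated ranks per (site, category).
ranks = {
    ("somenews.foo", "computer news"): 0.7,
    ("othernews.foo", "computer news"): 0.4,
    ("somenews.foo", "business news"): -0.3,
}

# One reader looks for the best computer news sources ...
best = search(ranks, "computer news")
# ... while another wants to know which business news sources to avoid.
avoid = [hit for hit in search(ranks, "business news", min_rank=-1.0) if hit[1] < 0]
```

The same rank data serves both queries; what differs is only the question asked and how the searcher chooses to read the answer.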

Note also that for example just the fact that an expert expresses a positive or negative opinion about a Web site with high or low rank will not influence the rank of that expert directly and immediately. But the rank of that expert will eventually change if his/her act of expressing this opinion influences the opinions of others about that expert.


How the Proposed Solution is Related to What We Know


The Main Innovations of the Proposed Solution


Easy Vote Creation

\/ Quick link to "What *You* Can Do Now" section.


What People have Said About the Open GRiD Project

Here is an incomplete list of responses I have gotten so far. (I will be updating it from time to time.)

Note that from Dec. 22, 1998 to Feb. 12, 1999 this document was tentatively titled "The Christmas Document" according to the time it was first posted.

You might also wish to look at the mailing list archives.

/\ Back to the beginning of the document.



Existing Partial Implementations of the Proposed Solution

Here is a short list of examples that already implement some parts of the proposed solution (possibly in a restricted area):


Possible Extensions and Topics to Discuss and Settle On

Many of the following items are already discussed in detail and/or their solutions are proposed in The Open GRiD Project Architecture Proposal (which is a rather technical document):


Discussion on Possible Problems with Implementing the Open GRiD Project and the Features of the Project

The Possibility of Changing the HTML Standards in the Proposed Way


Availability of the Technology to Implement a Search Engine for the Open GRiD Project


Willingness of People to Use Such Voting Links


The Speed of Spread of the Open GRiD Project


The Degree to Which Such Opinion Forming System can be Biased


The Problem of How an Unknown New Idea Gets Attention


Using Your Reputation to Increase the Importance of Your Opinion Versus Preserving Your Anonymity


Possible Misuses of the Open GRiD Project


Inevitability, Possible Future Scenarios


Possible Financial Resources For and Around the Open GRiD Project



Other Potential Global Implications and Predictions

Only Fair Behavior Wins (Survives)


Reputation of a Person and His/Her Achievements


Ultimately Fair Competition in Industry


Ultimate Democracy


Progress Acceleration


Copyright Notice

Copyright (C) 1998-1999 Maxim L. Lifantsev

This refers to the copyright on the ideas of the Open GRiD Project expressed in this document.

The initial solution was developed by the author, Maxim L. Lifantsev, on December 16-18, 1998.

Since the idea of the Open GRiD Project is, I think, very simple once you know it, I am not certain that I am the first one to come up with exactly this idea or a very similar one; however, I arrived at it independently, based on what is mentioned in the Acknowledgments section.

The license guiding the usage of this document describing the Open GRiD Project is almost identical to the GNU General Public License (GNU GPL) (see http://www.gnu.org/copyleft/gpl.html).

The main implications of that are as follows (the differences from the GNU GPL are also indicated below):


What *You* Can Do Now

If you would like to see the project implemented, it is important that you do your (small) part of the job to make it happen. The point is that the Open GRiD Project cannot be realized by one person or by a small group of people without the support of many others. Active participation of many different people in the adoption and use of the project is a necessary precondition for it to come true.
There is no real reason why you cannot (or should not) be among the people helping the project become a reality!

The easiest thing to do is to vote in favor of the ideas of the Open GRiD Project to let other readers know your opinion. Just click on the button below:

The current results are as follows:

Certainly these voting results are not very useful because (1) this voting does not change much; (2) many people just do not vote at all; (3) you do not know who voted and why he/she voted in a particular way.

The next section lists more interesting and helpful ways one can support the project.


Concrete Things to be Done


What has been Done


How to Make Your Opinion About the Open GRiD Project Heard


Means to Find Out About Current Degree of Acceptance of the Open GRiD Project


Acknowledgments

I would like to mention the people who have contributed in some way to the creation of this document:

I guess I will have to expand this section and make it more detailed as the popularity of the Open GRiD Project grows and more people contribute to it. Here are some of these "post-creation" acknowledgments:


On Sending E-mails to the Author of This Document

Anybody is welcome to send me (maxim@cs.sunysb.edu) an e-mail with information you think I must know or that must be incorporated into this site (points that need to be changed, missing points, wrong points, etc.).

If you have some message appropriate for one of the mailing lists dedicated to this project, submit it there.

BUT, please be thoughtful: do not abuse my e-mail address or the mailing lists.




The first version (0.1) of this document was first posted on Dec. 22, 1998
This is version 1.0 possibly with some modifications that do not yet qualify for a new version number.
It was first posted on Feb. 13, 1999
and last updated on Dec. 17, 2000 at 04:29 PM EST by Maxim Lifantsev
Here is the revisions and changes information.
For disclosures about the properties of the pages on this site look here.