Project “New Search Engine”
Business Plan


Initial Position - My Insertion - Finance - History - Procedure - Estimate of the Price - Features of the Project
Idea, New Principle of Searching - Algorithm, Short Description - Mini Search Engine - Method of Evaluating of the Search Results
Microsoft and Bing - Short CV - Contacts

1. Initial Position
Google is the biggest company on the Internet and one of the richest companies in the world. Yet about 30 percent of WWW links are ranked incorrectly in Google (my earlier estimate, now confirmed by a study of the City Group). Relevant WWW pages (qualitatively and quantitatively excellent ones) are not at the top, while less important WWW pages are.
It is therefore possible to create a better search engine, and I believe I know the way that leads to this goal.

2. My Insertion
I contribute the following to the project:
The idea: a new principle of searching (ready).
The search algorithm (the system of criteria and the initial weights between the criteria are ready).
The algorithm of the mini search engine (ready).
The mini search engine (will be contributed once it is written; it still has to be developed, and the investment is intended mainly for this purpose).
The method of evaluating the search results (ready).
More detailed information on each contribution is given below.

3. Finance
I already have an initial investment of 60 thousand USD. This investment is sufficient to build the prototype (the mini search engine).
I am seeking an investment of 2 million USD. The conditions of the investment are defined here: Investment into the New Czech Search Engine.
With this investment, a real search engine will be built and operated in one country (the Czech Republic).
See also Budget (Financial consideration).

4. History
The present status of the project is the result of 4 years of my work.
Four years ago I found (like most people who search) that the search engines, including the best one, Google, do not give the search results I expect. I did a small study, which I later refined using 100 keywords, and found that roughly 30 percent of the WWW pages found are ranked incorrectly. For about 3 years I experimented with the search engines, changing the properties of my WWW pages and observing how the search engines reacted to these changes. I also, of course, followed other keywords and WWW pages. For a long time I could not answer the question why the sequence of found WWW pages is not optimal and why WWW pages that are unimportant from the point of view of searching are placed at the top. I found the answer after 3 years, when I managed to look at searching from a different angle than the present search engines do. I discovered a principle for correctly placing the relevant WWW pages at the top, a principle which, in my opinion, the present search engines do not use.
In the following year I concentrated on the construction of the search algorithm, into which I projected this principle (criteria, sub-criteria, weights between the criteria), on the design of the algorithm of the mini search engine (so that it can be built in a relatively short time), on the method of comparing the search results of various search engines, and on the exact wording of the text of the project.
I now offer you this project for your kind consideration and, if you like it, for investment.
Remark:
There is some analogy between the history of “my search engine” and the history of other search engines.
The authors of Google invented their search algorithm as early as 1995. They claimed that it was better than the algorithms of that time and wanted to sell it. They failed in this attempt for about 3 years. Only then did they decide to develop a whole search engine, and they got their first bigger investment (100 thousand USD from a co-founder of Sun Microsystems).
Two years ago, experts looked at the inventors of real time searching (searching in social networks and news, e.g. on Facebook and Twitter) as if they “were crazy”; now it is nearly the number 1 sensation of the world Internet. Note that this is specialized, not general, searching.

5. Procedure
- the design of the search algorithm and its theoretical verification on my system of 21 WWW servers are ready
- the mini search engine (restricted to the testing keywords) will be programmed for practical verification of the algorithm
- the weights of the criteria of the algorithm will be optimized
- using the testing keywords, the search results of the mini search engine will be compared with Google and Bing
-- non-success (the algorithm is worse than Google and/or Bing):
--- the project will be finished
--- the project will be sold to another interested party
--- the optimization of the weights of the criteria will continue
--- the “New Czech Search Engine” will be developed (it is very probable that my search algorithm is better than the searching on the Czech search engine Seznam)
-- success (the algorithm is equal to or better than Google and/or Bing):
--- the algorithm will be sold to Microsoft, or possibly to another computer or Internet company
--- complete software for the search engine will be developed (further investment is necessary); this software will be sold
--- a complete search engine will be built, including hardware and network (further investment is necessary); this search engine will be sold or operated

6. Estimate of the Price
Microsoft attempted to buy Yahoo, at first as a whole (for about 44 billion USD), then only the “search part” (for about 19 billion USD). Estimated division of the Yahoo price: 10 billion brand, 10 billion portal, 5 billion search engine (hardware), 5 billion search engine (network), 5 billion search engine (software), 5 billion search engine (algorithm).
In case of success, the selling price of my search algorithm could thus be around 2 billion USD (the financial purpose of this project is to sell the search algorithm, so it is appropriate to set the price somewhat lower than the real value). The price of 2 billion USD would be divided into two halves, i.e. 1 billion USD for me and 1 billion USD for the investor(s). For the investor(s) this would mean a total income of about 17 thousand times the investment:
income 1 billion USD / investment 60 thousand USD = about 17 thousand (more precisely, about 16,667).
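
The division above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the figures come from the text, while the variable and function names are illustrative assumptions.

```python
# Figures quoted in this section; the names are illustrative only.
SELLING_PRICE_USD = 2_000_000_000   # assumed selling price of the algorithm
INVESTOR_SHARE = 0.5                # one half goes to the investor(s)
INITIAL_INVESTMENT_USD = 60_000     # initial investment

def return_multiple(price, share, investment):
    """Investor income expressed as a multiple of the investment."""
    return price * share / investment

multiple = return_multiple(SELLING_PRICE_USD, INVESTOR_SHARE, INITIAL_INVESTMENT_USD)
print(f"{multiple:,.0f}")  # 16,667
```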

7. Features of the Project
The probability of success is 70 percent, the probability of non-success is 30 percent. The first results will be available in about 6 months; the final result (the quality of the search algorithm compared to Google and/or Bing, business negotiations) will be available in 12 months. In case of full success the investor will receive about 17 thousand times the investment (individual investors would receive portions proportional to their shares of the investment). In case of non-success there are alternatives that decrease the risk, see point 5 - Procedure.


8. Idea – New Principle of Searching
The principle of searching that I have invented is characterized here.
8.1.
My principle of searching rests on two points:
- I evaluate all components of the Net: WWW pages, scripts, images (drawings, photos, maps), audio, video, documents ...
- From these components I construct other objects = thematically connected sets.
The fundamental difference between Google (and the like) and me is that the existing search engines evaluate WWW pages, i.e. basically individual elements, while I evaluate sets of thematically connected Net components.
Most WWW pages are very similar to each other in terms of size, occurrence of keywords and, if we consider home WWW pages, also the number of WWW links. The differences are minimal and the distinctive space (the range over which pages can be told apart) is very narrow; the sequence is often decided by one occurrence of a keyword, by 1-2 WWW links or even by chance (the location of keywords).
Not even the best search engine (and Google is certainly good) can do anything about it; this follows from the very principle of evaluating WWW pages.
In contrast, I evaluate sets of thematically connected components of the Net. As a rule, these sets can be distinguished quite well (in terms of size, occurrence of keywords, WWW links etc.). Put simply, my distinctive space is comfortably stretched, so my algorithm based on this principle works better than the existing search algorithms. Only from the sequence of such sets do I derive the sequence of the WWW pages contained in them.
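
The last step (ranking sets first, then deriving the page sequence from the set sequence) can be pictured with a toy sketch. The text does not disclose how sets are formed or scored, so everything specific here is an assumption made for illustration: the `set_id` label, the `keyword_hits` measure, and the use of a plain sum as the set score.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    url: str
    kind: str          # "page", "image", "video", "document", ...
    set_id: str        # hypothetical label of the thematically connected set
    keyword_hits: int  # hypothetical stand-in for a real relevance measure

def rank_pages_via_sets(components):
    # Step 1: score every set by summing over all of its components,
    # not only over its WWW pages (the sum is an assumption).
    set_score = defaultdict(int)
    for c in components:
        set_score[c.set_id] += c.keyword_hits
    # Step 2: order the pages by the score of their set first,
    # then by their own hits, then by URL to keep the order stable.
    pages = [c for c in components if c.kind == "page"]
    return sorted(pages, key=lambda c: (-set_score[c.set_id], -c.keyword_hits, c.url))

components = [
    Component("a.example/index", "page", "A", 5),
    Component("a.example/photo1", "image", "A", 4),
    Component("a.example/video", "video", "A", 6),
    Component("b.example/index", "page", "B", 6),
    Component("c.example/index", "page", "C", 5),
    Component("c.example/manual", "document", "C", 1),
]
ranked = [c.url for c in rank_pages_via_sets(components)]
print(ranked)
```

In this made-up data, a page-only comparison would put `b.example/index` first (6 hits against 5), while the set-based ordering puts `a.example/index` first because its set as a whole scores 15 against 6.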

This can be nicely demonstrated on the example of the Rank of WWW pages, which in the existing search engines (PageRank, SRank) is static (it does not depend on the keywords), while in my algorithm it is dynamic (it depends on the keywords in the thematically connected sets), which is again much more accurate.
In other words:
On one side there is the searching person and the searched keyword; on the other side there is the set of all information on the Net concerning this keyword. WWW pages bite only very small pieces from this set, roughly one percent or less. One WWW page about cars contains, let's say, 1 percent of all the information concerning cars on the Net, another 0.9 percent, a third 0.8 percent... The differences between WWW pages are relatively small.
In contrast, sets of Net components bite much bigger portions from the total information concerning the searched keyword. One set of thematically connected Net components may contain 10 percent of all the information concerning cars on the Net, a second set 9 percent, a third set 8 percent; after that the share usually drops sharply.
Differences between sets are relatively large.
The differences between sets of components of the Net are much bigger than the differences between single WWW pages. One can estimate that my distinctive space is 10 times bigger than the distinctive space of Google (of the existing search engines). Because of this principle, my algorithm is more robust and the sequence of WWW pages it computes is better.
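
The factor of 10 can be reproduced from the example shares quoted above. The sketch below reads “distinctive space” as the absolute gap between neighbouring shares; that reading is an assumption, since the text gives no formal definition.

```python
from fractions import Fraction

# Shares of all information on a keyword, in percent, as quoted in the text.
page_shares = [Fraction("1"), Fraction("0.9"), Fraction("0.8")]
set_shares = [Fraction("10"), Fraction("9"), Fraction("8")]

def gaps(shares):
    """Absolute differences between neighbouring shares, in percentage points."""
    return [a - b for a, b in zip(shares, shares[1:])]

page_gaps = gaps(page_shares)  # one tenth of a percentage point each
set_gaps = gaps(set_shares)    # one percentage point each
ratio = set_gaps[0] / page_gaps[0]
print(ratio)  # 10
```

Exact fractions are used instead of floats so that 1 - 0.9 comes out as exactly one tenth. Note that the relative gaps are the same in both rows of this example; the factor of 10 concerns the absolute gaps only.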
8.2.
The basic property of my principle is that, while searching, it correctly places at the top the relevant WWW pages, i.e. those that are qualitatively and quantitatively good for the searched keywords. At the top are those WWW pages that match the searched keyword(s), not WWW pages that are more general, less general or on another topic.
8.3.
The idea is oriented towards basic (common, classical) searching, not towards specialized branches of searching such as Internet shops or real time searching (searching for persons on Facebook, news searching in microblogs such as Twitter).
8.4.
My principle is not expressed in the search algorithm by a single criterion. On the contrary, it is projected into practically all the criteria of the search algorithm, and it influences the main criteria in a fundamental way. It runs through the algorithm like a “red thread”. It can be said that in the search algorithm the concept of “WWW pages” is replaced by my concept of “other objects” (in addition, some criteria are extended or changed in other ways).
8.5.
My concept is no “abstract noun”; on the contrary, it is a well-known computer term with a fixed meaning, which I use in a different way in searching. I simply look at searching in another way, from another angle.
8.6.
It is not “artificial (computer) intelligence”, as is e.g. the search engine WolframAlpha. Such artificial intelligence has its chance only in the remote future, not now.
8.7.
My idea is not used by the existing search engines. This follows from my study of publicly accessible algorithms as well as from practical verification. If the present search engines used this idea, their behavior would be different and the sequence of WWW pages found while searching would be different.
8.8.
In my algorithm, I do not have any special criterion against “SEO spamming”, i.e. against artificially (formally) pushing some WWW pages up in the search results, which is a big problem for the present search engines. But the magic of my idea and algorithm lies, among other things, in the fact that it eliminates this “SEO spamming” naturally; the elimination simply follows from my search algorithm. It is a side effect of my algorithm, but it is so.
8.9.
My principle and/or algorithm is (in my opinion) patentable (I am also engaged in intellectual property protection, in patenting and in designations of origin in the EU).
But I do not want to patent it, for the following reasons:
- it is something like “family silver”, like the recipe for Becherovka liqueur or a whisky, which is likewise neither revealed nor patented
- a truly worldwide patent (search engines are worldwide) costs about half a million USD; such a cost is unthinkable, especially in the initial stage of the project
- if something is patented, the patent application (patent text) is published, i.e. publicly accessible; if my idea were then used by another party, it would be difficult to prove (the cost of court proceedings; how to find one's way in hundreds of thousands of lines of the source code of someone else's search engine; the misusing person or company could submit to the court a different program than the one actually in use, and to prove this directly it would be necessary to build practically a whole duplicate search engine elsewhere…).

9. Algorithm – Short Description
The principle of searching that I have invented is projected into the criteria that define the sequence of WWW pages while searching. My search algorithm consists of 30 criteria, and my principle of searching is projected into all of them. Moreover, some criteria are new or modified. Finding the correct weights between the criteria is also important. I have the initial settings of the weights; the weights will be optimized using the mini search engine.
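
To make the role of criteria and weights concrete, here is a minimal sketch of one common way to combine them: a weighted sum. The text does not say how the 30 criteria are combined, so the linear form, the three criterion names and all the numbers below are assumptions chosen only for illustration.

```python
# Hypothetical criteria and weights; the real algorithm has 30 undisclosed criteria.
WEIGHTS = {"keyword_occurrence": 0.5, "incoming_links": 0.3, "size": 0.2}

def score(criteria_values, weights):
    """Weighted sum of criterion values, each assumed to be normalized to 0..1."""
    return sum(weights[name] * criteria_values[name] for name in weights)

def rank(candidates, weights):
    """Return candidate names ordered from the best score to the worst."""
    return sorted(candidates, key=lambda name: score(candidates[name], weights), reverse=True)

# Made-up candidate objects with made-up criterion values.
candidates = {
    "set_A": {"keyword_occurrence": 0.9, "incoming_links": 0.2, "size": 0.5},
    "set_B": {"keyword_occurrence": 0.4, "incoming_links": 0.9, "size": 0.6},
}
order = rank(candidates, WEIGHTS)
print(order)
```

With these weights the two scores are close (about 0.61 against 0.59), which shows why the ratio of the weights matters: a modest change in one weight is enough to swap the order.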


10. Mini Search Engine
Normally, to verify a search algorithm it is necessary to construct a whole search engine (work for several, or many, people for several years). I have found a way to reduce this to the work of about 2 people for about 1 year.
10-100 keywords will be chosen. For every keyword the mini search engine will retrieve the 100-1000 top-ranked WWW pages. The mini search engine will compute the sequence of these WWW pages according to my search algorithm. After that, I will optimize the weights of the search criteria: I will change the weights and observe the effect on the sequence of the WWW pages. At the end of this procedure I will choose what is, in my opinion, the best ratio of the weights. The results of this optimized search algorithm will be compared to the search results of Google and Bing.
Remark:
The mini search engine will be written as a “universal tool for the development and testing of search algorithms”. The search criteria as well as the properties of the found objects will be described parametrically. It will be simple to change individual criteria and weights, or to insert a completely new algorithm. Such a mini search engine will probably also be salable “as such”.
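
The weight optimization described in this section (change a weight, observe the effect on the sequence) could look roughly like the sketch below. It is not the mini search engine itself: the weighted-sum scoring, the two criterion names, the page data and the function names `sequence` and `sweep` are all hypothetical.

```python
def score(values, weights):
    """Weighted sum over the criteria named in `weights` (assumed scoring form)."""
    return sum(weights[name] * values[name] for name in weights)

def sequence(pages, weights):
    """Page identifiers ordered from the best score to the worst."""
    return sorted(pages, key=lambda url: score(pages[url], weights), reverse=True)

def sweep(pages, base_weights, criterion, trial_values):
    """Recompute the sequence for each trial value of a single weight."""
    results = {}
    for value in trial_values:
        weights = {**base_weights, criterion: value}
        results[value] = sequence(pages, weights)
    return results

# Made-up criterion values for three pages found for one testing keyword.
pages = {
    "p1": {"keywords": 0.8, "links": 0.1},
    "p2": {"keywords": 0.6, "links": 0.5},
    "p3": {"keywords": 0.2, "links": 0.9},
}
base = {"keywords": 1.0, "links": 0.0}
results = sweep(pages, base, "links", [0.0, 0.6, 2.0])
for weight, order in results.items():
    print(weight, order)
```

Because criteria and weights are plain dictionaries, a criterion can be added or a weight changed without touching the scoring code, which is one simple way to read “described parametrically”.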

11. Method of Evaluating of the Search Results
How to evaluate the search results, i.e. how to compare the search results of two search engines for the chosen keyword(s).
On the left side of the display are the search results of one search engine, on the right side the search results of another search engine, found for the given keyword(s). Which results are better?
Here is the method that I have devised and that I suggest:
For every WWW link (WWW page) found, it is necessary to estimate what percentage of the searching users will click on this WWW link and will find the WWW page relevant (corresponding to what the user wanted to find). The estimate can be made from the point of view of the interests and the geography of the users. In this way most of the WWW links (WWW pages) found can be evaluated.
Examples:
Searching on google.com for the keyword “cars”: WWW pages about Australian cars will probably be clicked by 2.6 percent of the users (simply the number of Australians / the number of English speaking people = 21 000 000 / 813 000 000 = 0.026).
Searching on google.cz for the keyword “Morava”: the WWW pages of the rock band “Morava” will probably be clicked by 1 percent of the users; when searching for “Morava”, about 20 percent of the users are interested in Moravian music, about one half of them are interested in rock, and about 10 percent of those are interested specifically in the rock band “Morava” (0.2 x 0.5 x 0.1 = 0.01).
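
Both examples follow the same pattern: the estimated share of users is a product of successive fractions. A small sketch of that arithmetic, using only the numbers given above (the function name `click_share` is an illustrative assumption):

```python
import math

def click_share(*factors):
    """Multiply a chain of successive fractions into one estimated share of users."""
    return math.prod(factors)

# Geography-based estimate: Australians among English speaking people.
australian_cars = click_share(21_000_000 / 813_000_000)

# Interest-based estimate: Moravian music -> rock -> the band "Morava".
morava_band = click_share(0.2, 0.5, 0.1)

print(f"{australian_cars:.1%}")  # 2.6%
print(f"{morava_band:.1%}")      # 1.0%
```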
There are at least two other methods of evaluating search results. One is used by the City Group, another by Microsoft (according to the comments of Steve Ballmer).
If the buyer of my algorithm (Microsoft for Bing) so wishes, they may choose their own testing keywords; I will process these keywords for them and generate the sequence of found links.
After that, they will be able to compare my results with the search results of Google, Bing or another search engine, according to their own method.

12. Microsoft - Bing
Microsoft has been trying to penetrate Internet searching for about 10 years already (Inktomi, Netscape, MSN Search, Live Search, Yahoo, Bing). Microsoft introduced the new search engine Bing in May 2009, so far without substantial success. The visit statistics of Bing rose for about 2 months (thanks to an advertising campaign costing around 100 million USD); now they are more or less decreasing. The ratio between the use of Google and Bing is around 30:1 worldwide and around 10:1 in the USA (according to www.statcounter.com). Most analysts expect the present ratio between Google and Bing to persist; changes (up or down) are not very probable.
After the agreement between Microsoft and Yahoo was signed, the Chief Executive Officer of Microsoft, Steven A. Ballmer, said he believes in the “future of searching”. This step, after which Yahoo will use the search engine Bing for searching, surprised most analysts (and also the employees of Microsoft). But from Microsoft's point of view it has a certain logic: Microsoft now has its own search engine; this search engine will (among others) serve three heavily visited WWW sites (microsoft.com, bing.com and yahoo.com); and Microsoft wants to gradually develop Bing to be as good as Google.
Mr. Steven A. Ballmer has now had his salary reduced, probably also as a consequence of the non-success of Bing (according to Internet news). According to the analysts, the problem is in the quality of searching, i.e. in the present algorithm of Bing (users cannot be retained in the long term by an advertising campaign, only by the quality of searching).
According to the latest news, Microsoft wants to invest about 8 billion USD in searching during the next 5 years.
That is why it makes sense to develop a new search algorithm and to offer it to Microsoft.

13. Short CV
I studied at the Czech Technical University in Prague, in the field of computers. I hold the title CSc. (equivalent to Master of Science) for work concerning structured programming. I was a programmer for about 15 years in Brno, Czech Republic, where I built a large laboratory information system for several hospitals, medical centers and many doctors. For about 16 years I have been an independent expert running a small business in programming and the Internet. I invented and implemented the programming language Visual Pascal (a superstructure of Pascal). Microsoft was interested in buying it. However, the owner of Pascal, the Borland Company, refused to sell the company to Microsoft, so Microsoft logically preferred the language Visual C++. I have been engaged in lists and search engines for about 10 years, 6 years theoretically and 4 years practically. The Netscape Company was seriously interested in buying my patent-protected algorithms for lists (construction of categories, merging of lists); there were negotiations at the headquarters of Netscape. But Netscape refused to sell their company to Microsoft, so Microsoft destroyed them as well. I worked for 5 years for a Canadian telecommunications company, where I built connections between computers and mobile phones. Now I operate 21 WWW servers and I am engaged in Internet searching, in improving the position of WWW pages in search engines, and in presentations.

Brno, Czech Republic, February 10th, 2011.

Ing. Petr Hejl, CSc.
Ondrouskova 15, 63500 Brno, Czech Republic
tel.: (+420) 608 374 535
email: phejl@lednice.org