Search Engine History – Web Search Before Google

Did Google always dominate the web search market? In the second of three posts on the history of search engines, I look at the pioneers of the early search market, including the very first web crawler, the WWW Wanderer. Did you know that Disney used to be one of the biggest players in the business? Or that Altavista was more technically advanced, in many ways, in 1998 than Google is now? Read on!

The pioneering Web Search Engines

Modern search engines really began to appear after the development and popularisation of the Mosaic browser in 1993. In 1994, Internet Magazine was launched, together with a review of the top 100 websites billed as the ‘most extensive’ list ever to appear in a magazine. A 28.8Kbps modem was priced at $399 and brought the internet within the reach of the masses (albeit slowly)!

At this point and for the next 4-5 years, it was just about possible to produce printed and web-based directories of the best sites and for this to be useful information for consumers. However, the rapid growth in the number of www sites (from 130 in 1993 to over 600,000 in 1996) began to make this endeavour seem as futile as producing a printed yellow pages of all the businesses, media and libraries in the world!

Whilst WAIS was not a lasting success, it did highlight the value of being able to search – and click through to – the full text of documents on multiple internet hosts. The nascent internet magazines and web directories further highlighted the challenge of being able to keep up with an internet which was growing faster than the ability of any human being to catalogue it.

In June 1993, Matthew Gray at MIT developed the Perl-based web crawler, WWW Wanderer. Initially, this was simply devised as a tool to measure the growth of the world wide web by “collecting sites”. Later, however, Gray (who now works for Google) used the crawled results to build an index called “Wandex” and added a search front-end. In this way, Gray developed the world’s first web search engine and the first autonomous web crawler (an essential feature of all modern search engines).

Whilst Wanderer was the first to send a robot to crawl web sites, it did not index the full text of documents (as WAIS had). The first search engine to combine these two essential ingredients was WebCrawler, developed in 1994 by Brian Pinkerton at the University of Washington. WebCrawler was the search engine on which many of us early pioneers first scoured the web and will be remembered with affection for its (at the time) attractive graphical interface and the incredible speed with which it returned results. 1994 also saw the launch of Infoseek and Lycos.

However, the scale of growth of the web was beginning to put indexing beyond the reach of the average university IT department. The next big step required capital investment. Enter, stage right, the (then huge) Digital Equipment Corporation (DEC) and its super-fast Alpha 8400 TurboLaser processor. DEC was an early adopter of web technologies and the first Fortune 500 company to establish a web site. Its search engine, AltaVista, was launched in 1995.

Founded in 1957, DEC had during the 1970s and 1980s led the mini-computer market. In fact, most of the machines on which the earliest ARPANET hosts ran were DEC PDP-10s and PDP-11s. However, by the early 1990s, DEC was a business in trouble. In 1977, their then CEO, Ken Olsen, famously said that “there is no reason for any individual to have a computer in his home”. Whilst somewhat taken out of context at the time, this quote was in part symptomatic of DEC’s slow response to the emergence of personal computing and the client-server revolution of the 1980s.

By the time Altavista was being developed, the company was besieged on all sides by HP, Compaq, Dell, SUN and IBM and was losing money like it was going out of fashion. Louis Monier and his research team at DEC were “discovered” internally as the ultimate PR coup; the entire web captured – and searchable – on a single computer. What better way to showcase the company as an innovator and demonstrate the lightning fast speed and 64-bit storage of their new baby?

During 1995, Monier unleashed a thousand web crawlers onto the young web (at that time an unprecedented achievement). By December (site launch) Altavista had indexed more than 16 million documents comprising several billion words. In essence, Altavista was the first commercial-strength, web-based search engine system. AltaVista enjoyed nearly 300,000 visits on its first day alone and, within nine months, was serving 19 million requests a day.

Altavista was, indeed, well ahead of its time technically. The search engine pioneered many technologies that Google and others later took years to catch up with. The site supported natural language queries, Boolean operators, automatic translation services (Babelfish) and image, video and audio search. It was also lightning fast (at least in the beginning) and (unlike other engines) coped well with indexing legacy internet resources (particularly the then still popular UseNet newsgroups).

After Altavista, Magellan and Excite (all launched in 1995), a multitude of other search engine companies made their debut, including Inktomi & Ask Jeeves (1996) and Northern Light & Snap (1997). Google itself launched in 1998.

Of these early engines, each enjoyed its own enthusiastic following and a share of the then nascent search market. Each also had its own relative strengths and weaknesses. Northern Light, for example, organized its search results in specific folders labeled by subject (something arguably still to be bettered today) and acquired a small but enthusiastic following as a result. Snap pioneered search results ranked, in part, by what people clicked on (something Yahoo! and Google are only toying with now!)

In January 1999 (at the beginning of the dotcom boom), the biggest sites (in terms of market share) were Yahoo!, Excite, Altavista and Disney, with 88% of all search engine referrals. Market share was not closely related to the number of pages indexed (where Northern Light, Altavista and a then relatively unknown Google led the pack):

Search engine share of search referrals (Dec 1999):

- Yahoo! – 55.81%
- Excite Properties (Excite, Magellan & WebCrawler) – 11.81%
- Altavista – 11.18%
- Disney Search Properties (Infoseek & Go Network) – 8.91%
- Lycos – 5.05%
- Go To (now Overture) – 2.76%
- Snap / NBCi – 1.58%
- MSN – 1.25%
- Northern Light

Behind the Form – Google, The Deep-Web Crawl, and Impact on Search Engine Visibility

Crazy Things That Really Rich Companies Do

Kind of like that weird guy at the party with an acoustic guitar and the Pink Floyd shirt, Google is getting DEEP. Some would say…uncomfortably deep. After an already busy year, wherein Google released an open source mobile OS and a browser that’s rapidly gaining market share, they recently announced that they had mapped the sea floor, including the Mariana Trench. And hey, why not found a school featuring some of the greatest scientific minds out there and see what happens?

So Google’s been more visible than ever lately, and there’s no doubt that this’ll continue as they get their hands into more and more projects – but let’s drop down a few floors and look at something that should dramatically affect the way Google’s indexing programs (“spiders” or “crawlers”) collect data, analyze websites and present the results. As much work as the BEM Interactive search engine marketing team puts into making sites appeal to spiders (and there’s a lot we can do to make those spiders love it), the spider programs themselves are pretty straightforward: hit a site’s page index, check out the structure and content, and compare that to what Google has determined to be “relevant” or “popular.”
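That crawl loop – fetch a page, record its content, follow its hyperlinks – can be sketched in a few lines. This is a toy illustration of the idea, not Google’s actual implementation; the page-fetching function is passed in as a parameter so the sketch stays self-contained:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Collects the href targets of <a> tags as a page is parsed.
class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl: `fetch(url)` returns the page's HTML."""
    seen, queue, index = set(), [start_url], {}
    while queue and len(index) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        index[url] = html  # a real engine would tokenize and rank here
        parser = LinkParser()
        parser.feed(html)
        queue.extend(urljoin(url, link) for link in parser.links)
    return index
```

The key limitation is visible in the code itself: the crawler only ever discovers pages reachable through hyperlinks, which is exactly why form-gated content stays invisible to it.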

But because of the way these programs are written, there are certain areas that they simply can’t reach…namely pages that require human information, input, or action. As a basic example, there’s usually a confirmation page after a user submits a “Contact Us” or “Newsletter Sign-up” form – this could contain a promotional code or some other kind of unique data. This dynamically generated content (this could also be a search results page, calculations or conversions, even the results of a symptom tool on a medical site) simply doesn’t exist until the user creates it! Depending on the form you filled out, the resulting page is yours and yours alone – so try to ignore that tingle of omnipotence next time you Google something.

But search engine spiders can’t understand what the form is asking for or the info being delivered to the user – and even if they could, how would they figure out what to insert in order to generate any relevant content? Drop-down boxes, category selection, zip code input – any of these forms can prevent data from being indexed. Collectively, this blocked data is referred to as the “Deep Web.” By some estimates, the Deep Web contains an astounding amount of data – several orders of magnitude more than what’s currently searchable. Since they chiefly rely on site maps and hyperlinks, search engine crawlers just can’t find a way to access the information.

So can Google really expect to find, log and interpret this data? Well, between mapping the ocean and opening a school that will probably discover the meaning of life before lunch, Google did just that. Working with scientists from Cornell and UCSD, Google researchers (whom I can only hope will not become supervillains at some point) have devised a method for their spiders to complete and submit HTML forms populated with intelligent content. The resulting pages are then indexed like regular pages and displayed in search results – in fact, at this moment, content gathered from behind an HTML form is displayed on the first page of Google search queries 1,000 times a second. The methods the bots are using are pretty cool, but I’m Nerd McNerdleson about that kind of thing. So we won’t dive into the technical stuff here, but check out the article if you’re into it.
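The gist of “surfacing” a form can be sketched simply: enumerate candidate values for each input (say, the options of a drop-down box), generate the result-page URLs those submissions would produce, and hand them to the crawler. This is a minimal illustration of the idea only – the form URL and field values below are made-up examples, and Google’s real system is far more selective about which combinations it tries:

```python
from itertools import product
from urllib.parse import urlencode

def surface_form(action_url, fields):
    """Yield the GET URLs produced by every combination of candidate
    values. `fields` maps each input name to a list of values."""
    names = list(fields)
    for combo in product(*(fields[n] for n in names)):
        query = urlencode(dict(zip(names, combo)))
        yield f"{action_url}?{query}"

# Hypothetical used-car search form with two inputs.
urls = list(surface_form(
    "http://example.com/cars/search",
    {"color": ["red", "blue"], "max_miles": ["30000"]},
))
# Each generated URL points at a results page that would otherwise
# only exist after a human submitted the form.
```

Note how quickly the combinations multiply – two drop-downs with ten options each already yield a hundred candidate pages, which is why intelligently pruning the inputs is the hard part of the real technique.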

That’s cool…NERD. But what does it mean?

Everyone knows Google loves relevance – their entire business model is built upon it. This technology is about pulling exactly what the user is searching for and immediately providing it without even requiring them to visit any page outside of the Google results page! Spooky.

Say that you’re feeling under the weather. Rather than type in “symptom checker” and find a WebMD-type page, you type “coughing, runny nose, strange bubonic plague-like swelling” directly into the search engine. Google – who has already had their spiders hit every medical symptom form out there, query them in endless varieties and combinations, and determine the relevance & popularity of the results – immediately comes back with “You’ve got the Black Death” and you’re set (or…maybe not).

From a retailing standpoint, many sites have functions to generate product lists based on user input. As it stands now, a shopper looking for a red, American-made minivan with under 30K miles would find the appropriate website, input his or her criteria, whereupon the website would query the database and return the results. If Google continues to move forward with their deep web crawls, this information could be displayed directly through their outlet of choice without the user ever accessing any site other than Google (if the user makes a purchase, does Google get a cut? Hmm…)

Obviously, this is a massive step forward in search technology and, in an industry that seems to change every hour, represents a new method of obtaining and presenting information. As web marketers, this is another variable, another challenge to consider in our work – how can we optimize pages that can be generated in a seemingly limitless number of ways? With search engines becoming increasingly powerful and their data mining capabilities getting deeper, will there come a time when all data is presented through one aggregate portal? This may be years down the line, but the technology and the foundations are here now; forward-thinking businesses and web marketers need to be there as well.

Website and Search Engine Marketing Simplified

The advent of the Internet has made competition for businesses global, spanning countries and continents. The Internet moves at a rapid pace, forcing businesses to adapt several types of online marketing alongside conventional offline marketing and advertising.

The overall objective of online marketing is to attract more visitors to your website, converting them into buyers for your business and enhancing your company brand. Using the most cost-effective channels, website marketing targets specific traffic / potential buyers to your website depending on your marketplace and needs.

The leading advantage of Internet and search engine marketing over traditional marketing is the ability to measure your results in real time.

A successful online marketing campaign needs conversion tracking to measure results. Conversion metrics include cost per click (CPC), cost per thousand impressions (CPM), cost per action (CPA) and return on investment (ROI). Tracking and valuing these metrics is vital to measuring and optimizing your campaigns so you can arrive at your final conversion metric: ROI. Google Analytics is a free conversion-tracking tool used to track the performance of your online marketing campaigns from click to conversion.
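The arithmetic behind these metrics is straightforward. Here is a short worked example for a hypothetical campaign – the spend, click, impression and conversion figures are made-up numbers for illustration only:

```python
# Hypothetical campaign figures (illustrative, not real data).
spend = 500.00        # total ad spend in dollars
clicks = 1000
impressions = 50000
conversions = 25
revenue = 2000.00     # revenue attributed to the campaign

cpc = spend / clicks              # cost per click
cpm = spend / impressions * 1000  # cost per thousand impressions
cpa = spend / conversions         # cost per action (conversion)
roi = (revenue - spend) / spend   # return on investment

print(f"CPC ${cpc:.2f}, CPM ${cpm:.2f}, CPA ${cpa:.2f}, ROI {roi:.0%}")
```

For these example figures the campaign costs $0.50 per click and $20 per conversion, and returns 300% on spend – the kind of end-to-end view that click-to-conversion tracking makes possible.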

Website marketing Channels:

1. Display advertising

Display advertising is graphical advertising on the web that appears alongside articles on third-party sites, such as web banners or banner adverts. Display advertising appears on web pages in numerous forms, containing text, images, logos and maps, and is an affordable way to grow your advertising reach. Search marketing and display advertising together can increase online revenue.

Google Display and Bing / Yahoo! Search Marketing offer substantial display advertising solutions. YouTube offers a video display advert alongside relevant YouTube videos, and/or on web-sites on the Google Display Network that fit your target market.

· Google Display Network – The Google Display Ad Builder software within AdWords lets you easily construct compelling image and video ads to make your organization stand out.

· Yahoo! My Display Ads – My Display Ads displays campaigns throughout Yahoo! Network Plus. With My Display Ads, it is possible to set up your own ads, target the campaign, and monitor your campaign overall performance.

2. Video marketing

Video marketing is presenting information to a potential buyer in video form and guiding them to a product. Online video is progressively becoming more popular, and companies are seeing it as a viable method for attracting customers.

Marketers are realizing the importance of creating outstanding video content as digital marketing and advertising continues to evolve. Video is one of the more exciting means by which companies can develop curiosity and attract inbound clients.

3. Search engine marketing (SEM)

Search engine marketing is the promotion of websites by increasing their visibility in search engine result pages (SERPs) using Pay Per Click (PPC) advertising, contextual / content advertising, paid inclusion and search engine optimization (SEO) procedures.

4. Pay Per Click (PPC)

Your advert will appear next to or above the organic search results for the search phrase you entered in Google or Bing. Pay-per-click (PPC) adverts are labelled “sponsored ads” on the Google results page and “ads” on the Bing results page. Advertisers place bids on specific search phrases, and when a searcher clicks on an advert, the advertiser is charged based on the amount bid for that search phrase.

5. Contextual / Content advertising

Advertisers place adverts on other sites that carry information relevant to their products, and the ads are displayed to people who are searching for information on those websites. The search portal analyzes the content of the web page to determine its meaning and matches relevant keyword-targeted content ads to display on the page.

Google AdSense was the first important contextual advertising network.

The Yahoo! Bing network comprises 30% of the search advertising market.

6. Search engine optimization (SEO)

Search engine optimization is the process of increasing the visibility of your website in a search engine’s organic search results by considering how individual search engines function, what people search for, and the keywords and phrases customers use.

Your website development and popularity-building efforts must be engineered for search engines, with optimized code, content flow and design, SEO copywriting and link building.

7. Social media marketing (SMM)

Social media marketing is the process of gaining targeted visitors through social media internet sites which include blogs, social networking web sites, social bookmarking sites, and forums. Social media marketing promotes your site by sending direct visitors, producing links for your website and creating awareness.

· Social networking sites include Facebook, LinkedIn, and Twitter where you could possibly place a link to your web site on your own profile. The largest gain comes when other people mention, link to or bookmark your site.

· Social bookmarking sites are networks in which users share details about websites, articles, or news items that they like. These sites include Digg, Delicious, StumbleUpon, and Google Bookmarks.

8. Email marketing

Email marketing is the delivery of a commercial message to a group of people by email. Creating a monthly email newsletter is one of the key website promotion strategies. It could be a newsletter, a list of tips, industry updates, or new product information.
