How Search Engines Operate
Search engines have a short list of critical operations that allows them to provide relevant web results when searchers use their system to find information.
Although a search engine's operations are not particularly lengthy, systems like Google, Yahoo!, AskJeeves, and MSN are among the most complex, processing-intensive computers in the world, managing millions of calculations each second and funneling demands for information to an enormous group of users.
Speed Bumps & Walls
Certain types of navigation may hinder or entirely prevent search engines from reaching your website's content. As search engine spiders crawl the web, they rely on the architecture of hyperlinks to find new documents and revisit those that may have changed. In the analogy of speed bumps and walls, complex links and deep site structures with little unique content may serve as "bumps." Data that cannot be accessed by spiderable links qualify as "walls."
Possible "Speed Bumps" for SE Spiders:
Possible "Walls" for SE Spiders:
The key to ensuring that a site's contents are fully crawlable is to provide direct, HTML links to each page you want the search engine spiders to index. Remember that if a page cannot be accessed from the home page (where most spiders are likely to start their crawl), it is likely that it will not be indexed by the search engines. A sitemap can be of tremendous help for this purpose.
Measuring Relevance and Popularity
Modern commercial search engines rely on the science of information retrieval (IR). That science has existed since the middle of the 20th century, when retrieval systems powered computers in libraries, research facilities, and government labs. Early in the development of search systems, IR scientists realized that two critical components made up the majority of search functionality:
These two items were translated to web search 40 years later and manifest themselves in the form of document analysis and link analysis. In document analysis, search engines look at whether the search terms are found in important areas of the document - the title, the metadata, the heading tags, and the body of text content. They also attempt to automatically measure the quality of the document (through complex systems beyond the scope of this guide).
In link analysis, search engines measure not only who is linking to a site or page, but what they are saying about that page/site. They also have a good grasp on who is affiliated with whom (through historical link data, the site's registration records, and other sources), who is worthy of being trusted (links from .edu and .gov pages are generally more valuable for this reason), and contextual data about the site the page is hosted on (who links to that site, what they say about the site, etc.).
Link and document analysis combine and overlap hundreds of factors that can be individually measured and filtered through the search engine algorithms (the set of instructions that tells the engines what importance to assign to each factor). The algorithm then determines scoring for the documents and (ideally) lists results in decreasing order of importance (rankings).
Information Search Engines Can Trust
As search engines index the web's link structure and page contents, they find two distinct kinds of information about a given site or page - attributes of the page/site itself and descriptive information about that site/page from other pages. Since the web is such a commercial place, with so many parties interested in ranking well for particular searches, the engines have learned that they cannot always rely on websites to be honest about their importance.
Thus, the days when artificially stuffed meta tags and keyword-rich pages dominated search results (pre-1998) have vanished and given way to search engines that measure trust via links and content. The theory goes that if hundreds or thousands of other websites link to you, your site must be popular, and thus, have value. If those links come from very popular and important (and thus, trustworthy) websites, their power is multiplied to even greater degrees. Links from sites like NYTimes.com, Yale.edu, Whitehouse.gov, and others carry with them inherent trust that search engines then use to boost your ranking position. If, on the other hand, the links that point to you are from low-quality, interlinked sites or automated garbage domains (aka link farms), search engines have systems in place to discount the value of those links. The most well-known system for ranking sites based on link data is the simplistic formula developed by Google's founders - PageRank. PageRank, which relies on a mathematical formula (based around finding a given document in a random pattern of clicking on links), is described by Google in their technology section:
Google uses a PageRank “proxy” value, which logarithmically translates the actual PageRank of a document to a value between 1 and 10, to rank Web sites listed in its directory (which offers a PageRank order or an Alphabetical order for listings) and in its toolbar (below).
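The "random pattern of clicking on links" mentioned above is often called the random surfer model. The following is a minimal Python sketch of that idea, not Google's production system: it assumes a tiny made-up link graph and the commonly cited damping factor of 0.85. The `toolbar_proxy` function is purely hypothetical. Google has not published how the toolbar value is scaled, so the logarithm base and the clamping used here are illustrative assumptions only.

```python
import math


def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by power iteration over {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page splits its current rank evenly among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


def toolbar_proxy(rank, max_rank, base=8.0, top=10):
    """Hypothetical log-scale bucket from 1 to `top` (base is an assumption)."""
    value = top + math.log(rank / max_rank, base)
    return max(1, min(top, round(value)))


# Made-up example graph: every page links back to "home".
example_links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home"],
    "orphan": ["home"],  # nothing links to this page
}

ranks = pagerank(example_links)
best = max(ranks.values())
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4), toolbar_proxy(ranks[page], best))
```

In this toy graph the page that receives the most links ends up with the highest score, while the page nobody links to keeps only the small baseline share that every page receives.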
PageRank is, in essence, a rough system for estimating the value of a given link based on the links that point to the host page. Since PageRank's inception in the late '90s, more subtle and sophisticated link analysis systems have taken the place of PageRank. Thus, in the modern era of SEO, the PageRank measurement in Google's toolbar, directory, or through sites that query the service is of limited value. Pages with PR8 can be found ranked 20-30 positions below pages with a PR3 or PR4. In addition, the toolbar numbers are updated only every 3-6 months by Google, making the values even less useful. Rather than focusing on PageRank, it's important to think holistically about a link's worth. Here's a small list of the most important factors search engines look at when attempting to value a link:
These are only a few of the many factors search engines measure and weigh when evaluating links. For a more complete list, see SEOmoz's search engine ranking factors article. Link metrics are in place so that search engines can find information to trust. In the academic world, greater citation meant greater importance, but in a commercial environment, manipulation and conflicting interests interfere with the purity of citation-based measurements. Thus, on the modern WWW, the source, style, and context of those citations are vital to ensuring high quality results.
The Anatomy of a HyperLink
A standard hyperlink in HTML code looks like this:
A more complex piece of HTML code for a link may include additional attributes such as:
Other types of links may also be used on the web, many of which pass no ranking or spidering value due to their use of redirects, JavaScript, or other technologies. A link that does not have the classic <a href="URL">text</a> format, be it image or text, should generally be considered not to pass link value via the search engines (although in rare instances, engines may attempt to follow these more complex links).
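To make the distinction concrete, here is a small Python sketch that pulls the parts of each link out of a page using the standard library's `html.parser` module and flags whether the link uses the classic `<a href="URL">text</a>` form. It is an illustration only: the `LinkExtractor` class, the `plain_href` flag, and the choice to record the `rel` and `title` attributes are assumptions made for this example, and real search engine spiders apply far more elaborate rules.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect <a> tags and note which ones use a plain href URL."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            href = (attributes.get("href") or "").strip()
            self._current = {
                "href": href,
                "rel": attributes.get("rel"),
                "title": attributes.get("title"),
                "text": "",
                # Assumption: empty or javascript: hrefs are treated as not plain.
                "plain_href": bool(href) and not href.lower().startswith("javascript:"),
            }

    def handle_data(self, data):
        # Text between <a> and </a> is the link's anchor text.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


sample_html = (
    '<p><a href="http://www.example.com/" title="Example">Example site</a> '
    '<a href="javascript:openPage()">Menu</a></p>'
)
extractor = LinkExtractor()
extractor.feed(sample_html)
for link in extractor.links:
    print(link)
```

Running the sketch on the sample markup yields two link records: the first has a plain URL and anchor text a spider could use, while the second depends on a script and is flagged accordingly.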
It's important to understand that, based on a link's anatomy, search engines can (or cannot) interpret and use the data therein. Whereas the right sort of links can provide great value, the wrong sort will be virtually useless (for search ranking purposes). More detailed information on links is available at this resource - anatomy and deployment of links.
Keywords and Queries
Search engines rely on the terms queried by users to determine which results to put through their algorithms, order, and return to the user. But, rather than simply recognizing and retrieving exact matches for query terms, search engines use their knowledge of semantics (the science of language) to construct intelligent matching for queries. An example might be a search for loan providers that also returned results that did not contain that specific phrase, but instead had the term lenders.
The engines collect data based on the frequency of use of terms and the co-occurrence of words and phrases throughout the web. If certain terms or phrases are often found together on pages or sites, search engines can construct intelligent theories about their relationships. Mining semantic data through the incredible corpus that is the Internet has given search engines some of the most accurate data about word ontologies and the connections between words ever assembled artificially. This immense knowledge of language and its usage gives them the ability to determine which pages in a site are topically related, what the topic of a page or site is, how the link structure of the web divides into topical communities, and much, much more.
Search engines' growing artificial intelligence on the subject of language means that queries will increasingly return more intelligent, evolved results. This heavy investment in the field of natural language processing (NLP) will help to achieve greater understanding of the meaning and intent behind their users' queries.
Over the long term, users can expect the results of this work to produce increased relevancy in the SERPs (Search Engine Results Pages) and more accurate guesses from the engines as to the intent of a user's queries.
Sorting the Wheat from the Chaff
In the classic world of Information Retrieval, when no commercial interests existed in the databases, very simplistic algorithms could be used to return high quality results. On the World Wide Web, however, the opposite is true. Commercial interests in the SERPs are a constant issue for modern search engines. With every new focus on quality control and growth in relevance metrics, there are thousands of individuals (many in the field of SEO) dedicated to manipulating these metrics in order to control the SERPs, typically by aiming to list their sites/pages first.
The worst kind of results are what the industry refers to as "search spam" - pages and sites with little real value that contain primarily redirects to other pages, lists of links, scraped (copied) content, etc. These pages are so irrelevant and useless that search engines are highly focused on removing them from the index. Naturally, the monetary incentives are similar to email spam - although few visit and fewer click on the links (which are what provide the spam publisher with revenue), the sheer quantity is the decisive factor in producing income.
Other "spam" results range from sites that are of low quality or affiliate status that search engines would prefer not to list, to high quality sites and businesses that are using the link structure of the web to manipulate the results in their favor. Search engines are focused on clearing out all types of manipulation and hope to eventually achieve fully relevant and organic algorithms to determine ranking order. So-called "search engine spammers" engage in a constant battle against these tactics, seeking new loopholes and methods for manipulation, resulting in a never-ending struggle.
This guide is NOT about how to manipulate the search engines to achieve rankings, but rather how to create a website that search engines and users will be happy to have ranking permanently in the top positions, thanks to its relevance, quality, and user friendliness.
Paid Placement and Secondary Sources in the Results
The search engine results pages contain not only listings of documents found to be relevant to the user's query, but other content, including paid advertisements and secondary source results. Google, for example, serves up ads from its well-known AdWords program (which currently fuels more than 99% of Google's revenues), as well as secondary content from its local search, product search (called Froogle), and image search results. Below is a screenshot of Google's search engine results page. Hover on any of the areas of the image to reveal the source of the content:
The sites/pages ranking in the "organic" search results receive the lion's share of searcher eyeballs and clicks - between 60% and 70%, depending on factors such as the prominence of ads, relevance of secondary content, etc. The practice of optimization for the paid search results is called SEM, or Search Engine Marketing, while optimizing to rank in the secondary results requires unique, advanced methods of targeting specific searches in arenas such as local search, product search, image search, and others. While all of these practices are a valuable part of any online marketing campaign, they are beyond the scope of this guide. Our sole focus remains on the "organic" results, although links at the bottom of this paper can help direct you to resources on other subjects.