
Search Engine Processes and Components

 Modern search engines perform the following processes:
• Web crawling 
• Indexing 
• Searching 
 This section presents an overview of each of these before you move on to 
understanding how a search engine operates. 

 Web Crawling 

Web crawlers, or web spiders, are internet bots that help search engines keep their
index of web content up to date. They visit websites on a list of URLs (also called
seeds) and copy all the hyperlinks on those sites. Due to the vast amount of content
available on the Web, crawlers do not usually scan everything on a web page; rather,
they download portions of web pages and usually target pages that are popular,
relevant, and have quality links. Some spiders normalize the URLs and store them in a
predefined format to avoid duplicate content. Because search engines prioritize content
that is fresh and updated frequently, some crawlers revisit pages whose content is
updated on a regular basis. Other crawlers are designed to revisit all pages regardless
of changes in content; the behavior depends on how the crawling algorithms are written.
If a crawler is archiving websites, it preserves web pages as snapshots or cached copies.
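The URL normalization step mentioned above can be sketched as follows. This is a minimal illustration, not any particular crawler's rules: the function names normalize_url and dedupe are hypothetical, and the chosen normalizations (lowercase scheme and host, drop default ports and fragments) are only an assumed subset of what a real crawler might apply.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of url so equivalent links compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def dedupe(urls):
    """Keep the first occurrence of each normalized URL, in order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

With these assumptions, "HTTP://EXAMPLE.com:80/a#top" and "http://example.com/a" collapse to one entry, so the page is stored only once.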
 Well-behaved crawlers identify themselves to web servers, typically through the
User-Agent header of each request. Website administrators can provide complete or
limited access by defining a robots.txt file that tells crawlers which pages can be
crawled and indexed as well as which pages should not be accessed. For example, the
home page of a website may be accessible for indexing, but pages involved in
transactions, such as payment gateway pages, are not, because they contain sensitive
information. Checkout pages also are not indexed, because they do not contain relevant
keyword or phrase content, compared to category/product pages.
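As a rough illustration of how a crawler might honor such rules, the sketch below uses Python's standard urllib.robotparser module. The robots.txt contents, the bot name ExampleBot, and the shop.example URLs are all made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everything is open except transaction pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /payment/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before fetching each URL.
home_allowed = parser.can_fetch("ExampleBot", "https://shop.example/")
checkout_allowed = parser.can_fetch("ExampleBot", "https://shop.example/checkout/cart")
payment_allowed = parser.can_fetch("ExampleBot", "https://shop.example/payment/card")
```

Here the home page is allowed while the checkout and payment paths are refused. Note that robots.txt is advisory: it only works for crawlers that choose to check it.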
 A crawler that keeps following endlessly generated links can get caught in a spider
trap, and the server then receives continuous requests. In that case, the
administrators can contact the crawler's operators to stop the loops. Administrators
can also estimate which web pages are being indexed and streamline the SEO properties
of those web pages.
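On the crawler side, one common safeguard against such loops is to cap crawl depth and total page count. The sketch below is a simplified breadth-first crawl over an in-memory link function, not a real HTTP crawler; crawl, calendar_trap, and the trap.example URLs are hypothetical names used only for illustration.

```python
from collections import deque


def crawl(seeds, get_links, max_depth=3, max_pages=100):
    """Breadth-first crawl that stops expanding links beyond max_depth."""
    visited = []
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links any deeper
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


def calendar_trap(url):
    """Hypothetical trap: every page links to a 'next day' page forever."""
    day = int(url.rsplit("/", 1)[1])
    return [f"trap.example/day/{day + 1}"]


pages = crawl(["trap.example/day/0"], calendar_trap, max_depth=3)
```

Without the depth cap this crawl would never finish; with it, only the seed and the three pages below it are visited.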
 Googlebot (used by Google), BingBot (used by Bing and Yahoo!), and Sphinx (an
open source, free search crawler written in C++) are some of the popular crawlers
indexing the web for their respective search engines.


Indexing 

 Indexing methodologies vary from engine to engine. Search-engine owners do not
disclose which algorithms they use to facilitate information retrieval through
indexing. Usually, lookup relies on forward and inverted indexes. A forward index
stores a list of words for each document and can be built asynchronously as documents
are parsed; that is, a forward index is a list of web pages and the words that appear
on each of those pages. An inverted index, on the other hand, is a list of words and
the web pages on which each word appears, which makes it possible to locate the
documents that contain the words in a user query. Forward and inverted indexing are
used for different purposes. For example, in forward indexing, search-engine spiders
crawl the Web and build a list of web pages and the words that appear on each page. In
inverted indexing, a user enters a query, and the search engine identifies the web
pages linked to the words in the query.
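The two structures can be pictured with a toy example. The sketch below is only an illustration under simple assumptions: the three pages and their text are invented, tokenization is a plain lowercase split, and a query matches a page only if the page contains every query word.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text.
PAGES = {
    "a.example/home": "fresh coffee beans",
    "b.example/shop": "coffee grinders and beans",
    "c.example/blog": "tea brewing guide",
}


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_indexes(pages):
    """Build a forward index (page -> words) and an inverted index (word -> pages)."""
    forward = {}
    inverted = defaultdict(set)
    for url, text in pages.items():
        words = tokenize(text)
        forward[url] = words
        for word in words:
            inverted[word].add(url)
    return forward, dict(inverted)


def search(inverted, query):
    """Return the pages that contain every word in the query."""
    matches = [inverted.get(word, set()) for word in tokenize(query)]
    return set.intersection(*matches) if matches else set()


forward, inverted = build_indexes(PAGES)
```

Answering "coffee beans" only requires two dictionary lookups and one set intersection; no page text is scanned at query time.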
 During indexing, search engines find web pages and collect, parse, and store data 
so that users can retrieve information quickly and effectively. Imagine a search engine 
searching the complete content of every web page without indexing—given the huge 
volume of data on the Web, even a simple search would take hours. Indexes help reduce 
the time significantly; you can retrieve information in milliseconds. 
 Forward indexing and inverted indexing are also used in conjunction. During
forward indexing, you store the words of each document as soon as it is parsed. Because
documents can be processed independently, this allows asynchronous processing and
avoids the update bottleneck that arises when every document writes directly to the
inverted index. Then you can create an inverted index by sorting the words in the
forward index, to streamline the full-text search process.
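The sort-based conversion described above might look like the following sketch. The function name invert and the two sample documents are assumptions made for illustration; real engines work with far larger, disk-based structures.

```python
from itertools import groupby


def invert(forward):
    """Turn a forward index (doc -> words) into an inverted index (word -> docs)."""
    # Emit one (word, doc) pair per distinct word, then sort by word.
    pairs = sorted(
        (word, doc) for doc, words in forward.items() for word in set(words)
    )
    # After sorting, all pairs for the same word are adjacent and can be grouped.
    return {
        word: [doc for _, doc in group]
        for word, group in groupby(pairs, key=lambda pair: pair[0])
    }


# Hypothetical forward index built earlier, one entry per parsed document.
forward_index = {
    "doc1": ["search", "engine", "index"],
    "doc2": ["index", "crawler"],
}
inverted_index = invert(forward_index)
```

Sorting brings every occurrence of a word together, so each word's posting list can be written out in a single pass.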
 Information such as tags, attributes, and image alt attributes is stored during
indexing. Even different media types such as graphics and video can be made searchable,
depending on the algorithms written for indexing purposes.


Search Queries 
 A user enters a relevant word or a string of words to get information. You can use plain
text to start the retrieval process. What the user enters in the search box is called a
search query. This section examines the common types of search queries: navigational,
informational, and transactional.
 Navigational Search Queries 
 These types of queries have predetermined results, because users already know the 
website they want to access.

Informational Search Queries 
 Informational search queries involve finding information about a broad topic and are
more generic in nature. Users generally type in everyday words to research or expand
their knowledge about a topic.




