The User-Agent Directive
The first line of each directive block is the "user-agent" line, which identifies the crawler that the given rules are addressed to.
So, if you want to tell Googlebot not to crawl your WordPress admin page, for example, your directive would start with:
User-agent: Googlebot
Keep in mind that most search engines have multiple crawlers (for the regular index, images, videos, and so on).
Search engines always choose the most specific block of directives they can find.
Let's say we have three sets of directives: one for *, one for Googlebot, and one for Googlebot-Image.
If the Googlebot-News user agent crawls your site, it will follow Googlebot's directives, since that is the most specific block that matches it.
On the other hand, the Googlebot-Image user-agent will follow the more specific directives of Googlebot-Image.
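To make this concrete, those three blocks might look like the sketch below (the disallowed paths are just placeholders):
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /private/

User-agent: Googlebot-Image
Disallow: /photo/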
Here you can find a detailed list of web crawlers and their different user agents.
The Disallow Directive
The next line of a directive block is typically the "Disallow" line.
You can have multiple "Disallow" directives that specify which parts of your site the crawler cannot access.
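For instance, a block with multiple "Disallow" directives might look like this (the paths are placeholders, borrowing the WordPress admin example from above):
User-agent: *
Disallow: /wp-admin/
Disallow: /private/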
An empty "Disallow" line means you are not disallowing anything, so a crawler can access all sections of your site.
For example, if you wanted to allow all search engines to crawl your entire site, your block would look like this:
User-agent: *
Allow: /
If you instead wanted to block all search engines from crawling your site, the block would look like this:
User-agent: *
Disallow: /
Directives like "Allow" and "Disallow" are not case sensitive, so it's up to you whether or not to capitalize them.
However, the values within each directive are case sensitive.
For example, /photo/ is not the same as /Photo/.
That said, the "Allow" and "Disallow" directives are usually capitalized because it makes the file easier for people to read.
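You can check this case-sensitive matching yourself with Python's standard urllib.robotparser module. This is just a quick sketch; example.com and the /photo/ path are placeholders:

```python
import urllib.robotparser

# Placeholder robots.txt rules: block everything under /photo/
rules = """\
User-agent: *
Disallow: /photo/
""".splitlines()

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)

# Path matching is case sensitive: /photo/ is blocked, /Photo/ is not
print(parser.can_fetch("*", "https://example.com/photo/cat.jpg"))  # prints False
print(parser.can_fetch("*", "https://example.com/Photo/cat.jpg"))  # prints True
```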