What is Log Analysis? Tips for Log Analysis with Screaming Frog
Posted: Mon Dec 09, 2024 6:47 am
Log analysis is one of the more advanced tasks in technical SEO, and it matters a great deal both for spotting opportunities for our website and for improving its current structure. Just as important is how we interpret the outputs of a log analysis, because not every piece of log data will yield the desired, action-oriented insight.
What is a Log File?
A log file is the record that keeps the actions of every resource sending requests to the website. Because this data is typically stored day by day, it allows analysis over a chosen period of time. The log file kept on the server may not always be enabled by the hosting provider, so as the website owner it is useful to ask your provider about the status of your log files.
A log file must contain certain information: the IP address, the date and time, the request type, the response code, the user agent (source) that made the request, the requested page, and the referring address.
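For reference, a single entry in the widely used Apache/Nginx "combined" log format looks roughly like this; the IP address, URL, and user agent below are purely illustrative:

```
66.249.66.1 - - [09/Dec/2024:06:47:12 +0000] "GET /blog/log-analysis/ HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
```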
What is Log Analysis?
Log analysis provides information about the requests that search engine spiders send to your website: how often the site is crawled, which bots visit it most, and which response codes the pages return to those requests. With this information we can take action and analyse based on data instead of assumptions. You can access the comprehensive information and video content on using log analysis for SEO that we presented in past months here.
While performing log analysis, you can carry out detailed examinations by analysing the files day by day. These examinations give us outputs such as the following (a minimal parsing sketch follows the list).
Which pages or file structures are crawled most frequently.
Which response or error codes are returned for requests to each page.
Crawlability checks.
Crawl budget optimization.
Which pages crawlers request most often, and therefore appear to treat as most important.
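As a starting point, here is a minimal Python sketch for pulling these outputs from a raw access log. It assumes the common Apache/Nginx "combined" log format and a file named access.log; both are assumptions, so adjust the regex and file name to your own server configuration.

```python
import re
from collections import Counter

# Assumed: Apache/Nginx "combined" log format; adapt the pattern to your server.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

status_counts = Counter()   # requests per response code
agent_counts = Counter()    # requests per user agent
path_counts = Counter()     # requests per URL path

with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the expected format
        status_counts[match.group("status")] += 1
        agent_counts[match.group("agent")] += 1
        path_counts[match.group("path")] += 1

print("Requests by status code:", status_counts.most_common())
print("Top requested paths:", path_counts.most_common(20))
print("Top user agents:", agent_counts.most_common(10))
```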
What is Crawl Budget?
Crawl budget refers to the time search engine bots spend when they crawl the site and re-crawl the pages they have already added to the index. The crawl budget of each site is therefore not the same.
In short, it is the effort bots allocate to the number of URLs they can crawl on the website. Reducing wasted effort here is what we call crawl budget optimization: keeping bots away from pages we do not target ensures that the crawl budget allocated to the website is used efficiently.
10 Tips for Log Analysis with Screaming Frog Log File Analyser
The Screaming Frog Log File Analyser tool lets you examine many parts of the data from different angles. In the overview section in particular, we can see the number of requests made by crawlers, how many URLs they requested, which status codes were returned, and the number of unique URLs. Being able to analyse the data beyond this overview is just as valuable.
1. Review of User Agent Requests
A website receives requests from many crawlers, such as Googlebot, YandexBot, Bingbot, and Baiduspider. Most of these requests should come from the search engines our website's users actually prefer. For example, if few or no users reach our website from Bing, yet the current log review shows Bingbot sending a very large number of requests, we can conclude that action needs to be taken here.
We can increase Bingbot's Crawl-delay value in robots.txt so it comes less often, or disable its crawling entirely with a Disallow rule (a sketch of both options follows). In this way we prevent a user agent that does not contribute to our website from spending the crawl budget on unnecessary requests.
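A minimal robots.txt sketch, assuming Bingbot is the crawler we want to slow down or block; Bing honours the Crawl-delay directive, and the 10-second value here is purely illustrative:

```
# Option 1: ask Bingbot to wait 10 seconds between requests
User-agent: Bingbot
Crawl-delay: 10

# Option 2 (instead of option 1): block Bingbot from crawling at all
# User-agent: Bingbot
# Disallow: /
```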
2. Response Code Checks
Each request to a URL on the website returns a response code, and these codes carry different meanings: a 500 response code indicates a server problem, while a 200 response code means the page was served without any issue. We can draw inferences from the response codes in log analysis. For example, suppose we recently redirected our AMP pages to their main versions with 301s; in that case, the number of requests to URLs containing /amp may still be quite high.
If these URLs are no longer in the index, we can reduce the unnecessary requests by disallowing their crawling in robots.txt, as in the sketch below. Different situations will emerge from the log outputs, and specific actions must be taken for each of them.
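A short robots.txt sketch for that AMP scenario; the /amp/ path pattern is an assumption, so match it to your own URL structure before applying anything like this:

```
User-agent: *
# Stop crawlers from requesting the retired AMP URLs
Disallow: /amp/
Disallow: /*/amp/
```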
3. Review of Request Numbers
We can identify the pages that crawlers request most and least often. With source filtering in particular, we can look only at HTML pages or only at images and see the total number of requests made within the relevant date range; a rough sketch of that kind of grouping is shown below.
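Continuing the parsing sketch above, here is a rough way to group requests by resource type outside the tool; the extension-to-type mapping is an assumption and can be extended as needed:

```python
from collections import Counter
from urllib.parse import urlparse

def content_type(path: str) -> str:
    # Classify a requested path by its file extension (rough heuristic)
    clean = urlparse(path).path.lower()  # drop query strings such as ?page=2
    if clean.endswith((".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg")):
        return "image"
    if clean.endswith(".css"):
        return "css"
    if clean.endswith(".js"):
        return "javascript"
    return "html"  # treat extensionless and .html paths as pages

# "requested_paths" stands in for the paths collected while parsing the log
requested_paths = ["/blog/", "/assets/logo.png", "/amp/post-1/", "/style.css"]
print(Counter(content_type(p) for p in requested_paths))
```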