Fix “Crawled – currently not indexed” Error
What is “Crawled – Currently Not Indexed”?
“Crawled – Currently Not Indexed” means that Googlebot has crawled the page but hasn’t yet indexed it, so it doesn’t appear in search results.
Reasons for the “Crawled – Currently Not Indexed” Error in GSC
Crawling and indexing are two of the most critical steps in SEO. It is essential to understand why search engines have crawled a URL but not indexed it, and what you can do to get the page indexed. This blog post explores possible reasons for the “Crawled – Currently Not Indexed” error and how to fix them.
Duplicate Content
Duplicate content can be one possible reason for a “Crawled – Currently Not Indexed” error in Google Search Console. When Google crawls the web, it tries to understand the content of a page and determine its relevance and value to users. If the content of a page is identical or very similar to other pages on the web, it may be considered duplicate content. Google may decide not to index the page to avoid showing the same content multiple times in search results. This can cause the page to be marked as “Crawled – Currently Not Indexed” in Google Search Console.
You can tackle duplicate content in the following ways:
- Identify and remove duplicate content. If you have multiple pages with similar content, consider consolidating the content onto one page or deleting the duplicate pages.
- Use rel=canonical tags to point to the original version of the content. This helps Google understand which page is the original and which are duplicates (see the snippet after this list).
- Use the “noindex” tag to tell Google not to index specific pages. This is helpful if you want to keep duplicate pages on your site but don’t want them to be indexed.
- Use unique and relevant titles and descriptions for each page on your site. This can help Google understand the page’s content and give it a better chance of being indexed.
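To make the list above concrete, here is roughly what the canonical and noindex options look like in a page’s `<head>` (the URL is a placeholder):

```html
<!-- On a duplicate page: declare the original (canonical) version -->
<link rel="canonical" href="https://example.com/original-page/" />

<!-- Alternatively, keep the duplicate live but out of Google's index -->
<meta name="robots" content="noindex" />
```

Use one approach per page: a canonical tag consolidates ranking signals onto the original, while noindex simply keeps the page out of search results.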
Thin and Low-Quality Content
Thin and low-quality content is another common reason for a “Crawled – Currently Not Indexed” error in Google Search Console. Google wants to provide its users with high-quality and relevant search results, and it may decide not to index pages with thin or low-quality content.
Thin content refers to pages that contain very little valuable information or are very short; Google may not consider such pages high quality.
Low-quality content, on the other hand, is content that is poorly written, unoriginal, or misleading. Such content is of little use to readers, and Google may decide not to index it to avoid showing it in search results.
Below are the things you can do to fix the low-quality content issue:
- Improve the quality and depth of the content on your site. Make sure that each page on your site has a clear purpose and provides value to users.
- Remove thin or low-quality content from your site. If you have pages with very little content or content that is not high quality, consider deleting or consolidating these pages.
- Use unique and relevant titles and descriptions for each page on your site. This can help Google understand the page’s content and give it a better chance of being indexed (a minimal example follows this list).
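For reference, the title and meta description live in each page’s `<head>`. A minimal sketch with placeholder values:

```html
<head>
  <!-- A descriptive, page-specific title -->
  <title>Women's Running Shoes | Example Store</title>
  <!-- A unique summary that Google may show as the search snippet -->
  <meta name="description" content="Shop lightweight women's running shoes for road and trail, with free shipping and returns." />
</head>
```

Every indexable page should get its own values here; reusing the same title and description across pages feeds the duplicate-content problem described above.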
Keyword Cannibalization
Keyword cannibalization can occur when a website has multiple pages targeting the same keyword or closely related keywords or when a website has a single page targeting various keywords that are closely related. In either case, the result is that the search engines may struggle to understand which page or pages are most relevant for a particular keyword and may not rank any of the pages as high as they would if the keyword were used more selectively.
It is essential to carefully consider the keywords you are targeting on each website page and to ensure that each page targets a unique and specific keyword or set of keywords. This will help search engines understand each page’s content and rank it appropriately for the keywords it is targeting.
An Example:
Imagine that you have a website that sells running shoes. You have two pages on your website:
One sells women’s running shoes, and the other sells men’s running shoes.
If both pages are optimized for the generic keyword “running shoes,” you get keyword cannibalization: the search engines may be unsure which page to rank for “running shoes,” and as a result, neither page may rank as highly as it could if each targeted a more specific keyword.
To avoid this situation, consider using more specific keywords on each page.
For example, the women’s running shoes page could target the keyword “women’s running shoes,” while the men’s running shoes page could target the keyword “men’s running shoes.” This could help the search engines better understand which page is most suitable for each keyword and allow each page to rank more effectively.
How to Fix the “Crawled – Currently Not Indexed” Issue
Analyze Your Internal Linking
Internal linking increases the chances of crawlers finding a page: when you link to a page from elsewhere on your site, spiders can follow that link to the content. It helps to distinguish the two link types. External links lead outside your domain name; internal links lead to another part of your own site (see the snippet below). The more internal links point to a particular page, the stronger the signal that the page matters, which improves its chances of being crawled and indexed. So if you have several pages on the same topic, link them to each other. You don’t have to do this manually; many tools can generate internal links automatically.
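Here is the difference in plain HTML (the URLs are placeholders):

```html
<!-- Internal link: points to another page on the same domain -->
<a href="/guides/running-shoe-sizing/">Running shoe sizing guide</a>

<!-- External link: points to a different domain -->
<a href="https://example.org/independent-review/">Independent review</a>
```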
Make sure that all critical pages on your site are linked internally. If you have a few pages that aren’t very useful, you can still add them to your sitemap.xml file. However, if you have many pages that are just duplicates of others, consider deleting them.
False Positive Alerts in GSC
False positives occur when Google Search Console marks a page as excluded, but URL analysis tools or live URL tests show that your page has been indexed. This is considered a false positive in Google Search Console reporting.
To run a live URL test:
Go to Google Search and enter the page’s URL as a query, for example: example.com/your-page.
If the page appears in the search results even though Google Search Console shows it as excluded, the page has been indexed. This is a false positive.
This is just a reporting error in Search Console, so you don’t need to do anything in this case.
Broken Links
Broken links occur when visitors follow a link to a page that no longer works. In Google Webmaster Tools, a broken link usually indicates that the page no longer exists, though it could also mean that the page has been redirected elsewhere.
Robots.txt File Issues
A robots.txt file tells search engines which pages they are not allowed to crawl. Robots.txt issues often arise when a website owner tries to block access to certain pages and inadvertently blocks pages that should be indexed. You can check your file with Google’s robots.txt testing tool: https://support.google.com/webmasters/answer/6062598?hl=en
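As an illustration, a minimal robots.txt might look like this (the paths are placeholders; an overly broad Disallow rule is a common way pages get blocked by mistake):

```
# robots.txt served at https://example.com/robots.txt
User-agent: *          # applies to all crawlers
Disallow: /private/    # nothing under /private/ may be crawled
Allow: /private/faq/   # exception: this path may still be crawled
```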
Your Website Is Newly Launched
When you create a new website, it can take time for Google to discover and crawl its pages. If you have just launched your website and it is not yet appearing in search results, it may be marked as “Crawled – Currently Not Indexed” in Google Search Console.
Actions to help Google discover and index your new website:
- Create an XML sitemap and submit it to Google Search Console. An XML sitemap is a file that lists all the pages on your website and helps Google understand the structure of your site (a minimal example follows this list).
- Create high-quality and unique content for your website. Google wants to provide its users with relevant and valuable search results, so it will be more likely to index pages with high-quality content.
- Use relevant and descriptive titles and descriptions for each page on your site. This can help Google understand the content of your pages and give them a better chance of being indexed.
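A minimal sitemap.xml sketch, with placeholder URLs; once the file is live at example.com/sitemap.xml, you can submit it under Sitemaps in Google Search Console:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/womens-running-shoes/</loc>
  </url>
</urlset>
```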
RSS Feed URLs
Feed URLs are outdated and should no longer be used: they can allow spam bots to access your website, which could lead to malware attacks. If you still use RSS feeds, make sure to keep them updated.