What is a zombie page?
Is it harmful for SEO to have zombie pages on your site?
First, if you have a lot of zombie pages, it means you have a lot of bad pages (which neither users nor Google's algorithm appreciate). Remember that Google judges the overall quality of your site: it takes into account all the pages you ask it to index. This is stated on Google's official blog, in the article giving advice on Panda; here is my translation:
If some parts of a site contain low-quality content, this can degrade the rankings of the entire site. Consequently, the rankings of your high-quality content may improve if you delete the low-quality pages, merge or improve the content of shallow pages to make them more useful, or move the low-quality pages to another domain.
You must be “proud” of 100% of the pages you decide to have indexed.
To summarize: if you have too many zombie pages 🧟♂️ 🧟♀️, then just as in fantasy fiction, they will attack the good pages 😱
The description is tongue-in-cheek, but that really is how Google's algorithm behaves.
Then there is another reason, much simpler and more obvious, which requires no assumptions about Google's algorithm.
All these pages bring you nothing (few or no visits, no conversions) and degrade the image of your site (and even of the associated company). Every time you improve a zombie page, it can start to generate traffic and conversions, and improve your brand image. Isn't that the goal?
Does Google index zombie pages?
Some are indexed but perform poorly; for others, Google's algorithm decides not to index them at all. Either way, they won't bring you much, and you really need to fix the problem.
Worst of all are the pages you want Google to index (they are in your sitemap, in addition to receiving internal links) but that Google decides not even to crawl.
John Mueller (from Google) explained that, in this last case, “it may be that the content of your site is not considered absolutely essential for our search results.”
Here's how to tell whether this applies to your site:
- go to Google Search Console (new interface)
- click Index > Coverage
- if you have an exhaustive sitemap (which I recommend), instead of leaving “All known pages” (the default), click “All submitted pages” to filter on only the URLs you actually want indexed:
- click the “Excluded” tab, then click “Error” to deselect the pages in error
- check whether the table rows contain one or both of the following statuses:
- “Crawled – currently not indexed”: Google crawled your page but decided it did not deserve to be indexed
- “Discovered – currently not indexed”: Google found your URL but decided it did not even deserve to be crawled
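If you export the Coverage report to CSV, you can quickly count how many URLs are affected. Here is a minimal sketch; the column names (`URL`, `Reason`) are assumptions, so adjust them to match your actual export.

```python
import csv
from io import StringIO

# Coverage statuses that flag potential zombie pages
ZOMBIE_STATUSES = {
    "Crawled - currently not indexed",
    "Discovered - currently not indexed",
}

def count_zombie_candidates(csv_text):
    """Count excluded URLs per status from a Coverage CSV export.

    Assumes columns named 'URL' and 'Reason' (hypothetical; adjust
    to the headers of your own export).
    """
    counts = {}
    for row in csv.DictReader(StringIO(csv_text)):
        reason = row["Reason"]
        if reason in ZOMBIE_STATUSES:
            counts[reason] = counts.get(reason, 0) + 1
    return counts

# Illustrative data, not a real export
sample = """URL,Reason
https://example.com/a,Crawled - currently not indexed
https://example.com/b,Discovered - currently not indexed
https://example.com/c,Crawled - currently not indexed
"""
print(count_zombie_candidates(sample))
```

A count per status gives you a first idea of the scale of the problem before any page-by-page analysis.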
Here is an example of a site with many affected pages (and yet they are pages the site owner wants indexed):
If you have pages with this type of problem, they fall into the category of zombie pages … There are 2 cases:
- URLs that should never have existed: I call that the black mass; it's a technical problem, and a crawler like RM Tech helps you fix it
- URLs that accumulate too many problems: these are typical zombie pages, and the RM Tech audit helps you deal with them
I continue with the advice of John Mueller (Google) already mentioned above. Here is his procedure:
- “First, check if there are any technical problems”
- “Then, check that everything is OK with your internal linking.” “Take a crawl tool and see which URLs it can find.”
- “Finally, if this crawl worked [and was able to find these pages], focus strongly on the quality of these pages.”
Here is another quote that I find very interesting:
Maybe it makes sense to say: “OK, what if I cut the number of pages in half?” Or even reduce the number of pages to 10% of the current count. By doing this, you can make the pages you keep much stronger. You can usually improve content quality a little by putting more content on those pages.
How to find zombie pages?
Take many factors into account
To find zombie pages, you need to get as much information as possible on all your pages:
- identify the indexable pages (to study only these pages)
- evaluate thin content (in the main zone of the page) and obvious sub-optimizations (a badly written title tag, etc.)
- evaluate the page's interest to users
- evaluate SEO performance over 1 year, taking care to avoid any sampling: how many visits does the page generate through SEO, how does it rank in Google and for which queries, etc.?
- evaluate the page's performance outside SEO, to avoid labeling as “zombie” a page that is successful on social networks or via Google Ads, or that generates revenue, or that contributes to conversions…
Once this data is collected (not always simple, it's true), that is only the beginning! You must combine it all to evaluate whether a page is so bad that it deserves the zombie label.
And then you have to compute a metric to sort all those pages from worst to least bad.
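The kind of combination described above can be sketched as a simple weighted score. To be clear, this is only an illustration, not RM Tech's actual formula: the metric names, thresholds and weights are all arbitrary assumptions.

```python
def zombie_index(page):
    """Toy zombie score from 0 (healthy) to 100 (worst case).

    `page` is a dict of yearly metrics; every threshold and weight
    below is illustrative, not RM Tech's real calculation.
    """
    score = 0
    if page.get("word_count", 0) < 300:          # thin main content
        score += 30
    if page.get("seo_visits_year", 0) < 10:      # almost no organic traffic
        score += 40
    if page.get("conversions_year", 0) == 0:     # no business value
        score += 20
    if page.get("social_visits_year", 0) < 5:    # no success elsewhere either
        score += 10
    return min(score, 100)

# Hypothetical pages for the example
pages = {
    "/old-news": {"word_count": 120, "seo_visits_year": 2,
                  "conversions_year": 0, "social_visits_year": 0},
    "/top-guide": {"word_count": 2500, "seo_visits_year": 8000,
                   "conversions_year": 40, "social_visits_year": 300},
}
# Sort from worst to least bad
ranked = sorted(pages, key=lambda url: zombie_index(pages[url]), reverse=True)
print(ranked)
```

Sorting by such a score lets you review the most problematic pages first, which matters on sites with thousands of URLs.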
The zombie page index from RM Tech
I spent a lot of time developing all of this to arrive at the calculation of a zombie index for each indexable page, available in RM Tech, my SEO audit tool on the My Ranking Metrics platform.
I relied on my SEO experience, in particular on identifying poor-quality pages (see my QualityRisk index) and insufficiently active pages.
I was able to do huge tests thanks to my platform (long live big data) and exchanges with other experts.
In the end, each indexable page is assigned a zombie index between 0 (no problem) and 100 (maximum problem). This index is provided in the annex and in the conclusion, right next to the QualityRisk index:
To report very simply the presence of zombie pages on the site you are analyzing and the importance of the problem, the report displays a histogram like this:
This histogram shows whether you have many pages with a high zombie index. It is a more sophisticated version of the QualityRisk histogram (which does not include Analytics or Search Console data).
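The idea behind such a histogram can be sketched by bucketing pages into zombie-index ranges. The bucket boundaries below are my assumption, not RM Tech's actual presentation.

```python
def zombie_histogram(indexes, bucket_size=20):
    """Bucket zombie indexes (0-100) into ranges 0-19, 20-39, ...

    Returns {bucket_start: page_count}; an index of exactly 100 is
    folded into the top bucket.
    """
    counts = {}
    for z in indexes:
        bucket = min(z // bucket_size * bucket_size, 100 - bucket_size)
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts

# Illustrative zombie indexes for six pages
print(zombie_histogram([0, 5, 22, 87, 95, 100]))
```

A heavy top bucket is the signal to worry about: it means many pages accumulate problems at once.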
Locating zombie pages and sorting them by an index is good, but you still have to deal with those pages and decide what to do with them…
Before that, let me restate an important point: having a zombie index for each page is extremely useful for saving time and working efficiently, but it certainly does not replace human analysis. Use the time freed up by this (automatic) zombie index calculation to think about what decision to make.
What to do with zombie pages?
The higher the zombie index of a page, the more likely it is “unrecoverable”.
The more zombie pages your site has, the more risky and ineffective it is:
- “risky” because, to judge your site, Google looks at all the pages you ask it to index. Allowing “zombie” pages to be indexed can penalize your good pages
- “ineffective” because these pages generate few or no visits and are rarely viewed by your visitors
It is recommended to perform a human analysis following the method below. Warning :
The purpose of zombie page analysis is to try to save as many pages as possible, and to delete or de-index only as a last resort. I should also point out that doing nothing about zombie pages and hoping for a miracle is not the right solution at all…
For each identified zombie page, try the solutions in this order (move on to the next solution only if the previous one is not applicable).
1 🔎 Deal with special cases
Is the page recent? If it has been indexed for only a few months, it may be normal that it has not yet had much success, especially compared with the other pages, which are studied over 1 year.
Is the page about a subject that no longer interests anyone? If it is a news article, check that its publication date is visible. Human analysis is needed here to decide whether the page should be kept. Sometimes an update is possible (without changing the URL). It's up to you to decide whether you want to keep your “archives”.
Does the page target a very small niche? If very few people are interested in the subject, even over 1 year, but the content is excellent and the (rare) conversions are very profitable, then you can decide to keep it.
Some pages are exceptions on the site, which you can ignore here: legal notices, terms of service/sale, contact pages, but also pagination URLs (if pagination is handled correctly) and other special URLs. RM Tech ignores them automatically 😎
2 📝 Improve the page
This is mainly about improving the content, but sometimes also technical aspects (speed, depth, internal linking…). Some approaches (all of which I have used successfully…):
- complete and/or update the information
- check if you’re targeting the right keywords
- make sure that the semantic richness of the text is superior to that of your competitors in Google
- improve the UX
- add links to other internal pages
- add internal incoming links (varying the anchor text of these links)
- cite your sources with outgoing external links (very important for a Your Money Your Life site)
- highlight the author
- illustrate with images ( well compressed ) or videos
- promote it on social networks or in your mailings
- (re) check spelling and grammar, make sure your texts are easy to read
- add structured data (schema.org vocabulary)
3 📎 Group pages
If you have several pages with too little content on the same subject, group them together. Choose the best page of the group (cluster), merge the content into it, and rework the final result. Don't forget to delete the other pages in the cluster and set up 301 redirects to the URL that consolidates them.
Sometimes, no existing URL is suitable for grouping: you can create a new one and redirect all others to it.
Never mass-redirect deleted pages to a single URL that has no equivalent content (the classic mistake is redirecting everything to the home page).
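The grouping step can be sketched as building a 301 redirect map per cluster. The helper below simply picks a canonical URL and lists the redirects to set up; the selection rule (keep the page with the most yearly organic visits) is my assumption for the example, not a rule from the article.

```python
def redirect_map(cluster, visits):
    """Pick a canonical URL for a cluster of thin pages on the same
    subject, and return (canonical, {old_url: canonical}) describing
    the 301 redirects to set up.

    Keeps the URL with the most yearly organic visits (an
    illustrative rule; a human should confirm the choice).
    """
    canonical = max(cluster, key=lambda url: visits.get(url, 0))
    return canonical, {url: canonical for url in cluster if url != canonical}

# Hypothetical cluster of three pages on the same subject
cluster = ["/tips-2017", "/tips-2018", "/tips-complete"]
visits = {"/tips-2017": 12, "/tips-2018": 30, "/tips-complete": 410}
canonical, redirects = redirect_map(cluster, visits)
print(canonical, redirects)
```

The output map is exactly what you then translate into server-side 301 rules, after merging the content into the canonical page.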
4 ⛔️ Unindex the page
If you think the content may still interest users or generate revenue, de-index the page but leave it online. If needed, I have a tutorial on de-indexing pages.
Except in special cases, this means you do not touch the internal linking: the links to the page stay in place.
It's up to you to evaluate whether, once the page has been de-indexed, you should also block it from crawling (robots.txt file).
5 🗑 Delete (and deindex) the page
This is an unrecoverable page, or a URL that should never have been indexed (“black mass”). The page must be deleted (and de-indexed).
- If the URL has good external backlinks, set up a 301 redirect to the best page with semantically close content. You can find backlinks in Search Console (or in specialized SEO tools).
- Otherwise, your server should return a 410 status code. If you cannot return a 410, return a 404. It doesn't change much, but it avoids polluting your 404 error report in Search Console, among other things.
Check that no internal links still point to it.
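The five-step decision order above can be sketched as a single function. The attribute names and the way each condition is tested are my assumptions for the sake of illustration; real decisions still require human review, as stressed earlier.

```python
def zombie_action(page):
    """Suggest an action for a zombie page, following the order:
    special case -> improve -> group -> noindex -> delete.

    `page` is a dict of (hypothetical) boolean/list attributes; this
    sketch only encodes the priority of the five solutions.
    """
    if page.get("is_recent") or page.get("is_special"):      # step 1
        return "ignore (special case)"
    if page.get("improvable"):                               # step 2
        return "improve content"
    if page.get("cluster_mates"):                            # step 3
        return "merge with cluster + 301"
    if page.get("still_useful_to_users"):                    # step 4
        return "noindex but keep online"
    return "delete + 410 (or 301 if good backlinks)"         # step 5

print(zombie_action({"improvable": True}))
print(zombie_action({}))
```

Encoding the order this way makes the key point explicit: deletion is only the fallback when every other solution fails.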
Sites that have progressed by managing zombie pages with RM Tech
I found several examples online of positive feedback after a big zombie page cleanup. I don't have a complete list; see for example Ahrefs (in 2016), Beetle SEO and Raphael Doucet. If you have also done this work, contact me to talk about it!
I would also like to mention Brian Dean (of the Backlinko site), who also used the term “zombie pages”. This well-known US SEO recommends locating and removing these pages. It is even step #1 of his guide!
And Mary Bowling (a well-known SEO since 2003), who advised back in 2015 to “destroy the zombie pages that crawl around your site.”
But I have also worked on the subject personally, especially since around 2016, for my clients and also for WebRankInfo. Here are a few very encouraging examples…
Client 1: a strong, sustainable rise in SEO traffic
This example concerns a client I have been working with for several years. Its organic traffic was growing steadily, thanks to high-quality content and a good reputation. In July 2017, its Google SEO traffic started to drop… We let July and August pass (quiet months) and got to work in September.
At the time I did not yet have the zombie index in RM Tech, but I calculated something similar. It was complicated to develop and painful to redo each time, which is why I have since added it to RM Tech 🙂
I retrieved the crawl data and combined it in Excel with other information:
- date of publication or last update of the article
- number of times the page was viewed, regardless of traffic source
- success via social networks
- page performance in terms of ranking / keywords (via Search Console)
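The Excel work described above is essentially a join of several data sources keyed by URL. Here is a minimal pure-Python stand-in; the field names and figures are illustrative, not the client's real data.

```python
# Join crawl data with analytics and Search Console data by URL
# (field names and values are hypothetical examples).
crawl = {
    "/article-a": {"published": "2014-03-01", "word_count": 150},
    "/article-b": {"published": "2017-06-10", "word_count": 1800},
}
analytics = {
    "/article-a": {"pageviews": 3, "social_visits": 0},
    "/article-b": {"pageviews": 950, "social_visits": 40},
}
search_console = {
    "/article-a": {"clicks": 1, "queries": 1},
    "/article-b": {"clicks": 700, "queries": 85},
}

# One record per crawled URL, merging whatever metrics are available
merged = {
    url: {**crawl[url],
          **analytics.get(url, {}),
          **search_console.get(url, {})}
    for url in crawl
}
print(merged["/article-a"])
```

Once every page has all its metrics in one record, spotting the weak ones (old, thin, unvisited, unclicked) becomes straightforward.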
Almost all the bad pages identified were improved; some were grouped together and others deleted entirely.
Nothing else was done during this period. The result:
Two months later, organic Google traffic had risen strongly, and that was just the beginning of a good progression. One year later (October 2018), the site's SEO traffic was up 19% from the 2017 high (before the decline) and 66% from the low point.
Client 2: gains at each core update
For this other client, traffic was not good in early 2018 (or in the preceding months). In February I set about locating insufficiently active pages and studying the causes. This allowed me to find new metrics to use in my future “zombie” algorithm, now included in RM Tech.
In April 2018 Google rolled out a major update (“core update”) and traffic rose immediately. By the summer of 2018 we had finished the work on the other weak pages, and with each update traffic rose further:
Client 3: better SEO results
For this one, no page was removed from the site; however, following the zombie page analysis, a lot of work was done:
- pages with archival value but no SEO value were set to noindex
- others were updated, expanded and, where possible, grouped together
Client 4: big success of the method
For this client, I scrupulously followed the method described above. The result: a (very) happy client, and the Search Console curve speaks for itself:
Examples of zombie pages found by RM Tech
I end this guide with a selection of example “zombie” pages found by RM Tech.
To keep things simple (I have plenty of sites to show, with their context), I did it on video. A few details:
- I tried to choose varied examples, but I could have provided 3x more!
- I also included examples of good pages (taken from the same sites) to help you compare. It is impressive to see how this analysis helps identify the types of content that will have a lasting impact.
- you can download the complete audit report for each example shown here (see the links under the video)
- I got permission from each of these sites to make this video
Here are the kinds of things that RM Tech's zombie algorithm usually spots very well, even among tens of thousands of pages:
- outdated or out-of-stock product pages
- product pages with no description (or a single line)
- empty or nearly empty categories
- otherwise correct pages with a completely botched title tag
- ineffective and risky spam (including doorway pages)
- articles whose content is totally out of date
- error pages that nevertheless return a 200 status code
- (thin) content saturated with ads
- major UX (user experience) problems
- the technical black mass (indexable URLs that should never even have been crawlable)
- on a multilingual site: untranslated or only partially translated pages
Could you spot all of this very quickly for each site you study (without using RM Tech)?
Find this video on YouTube: SEO zombie pages found by RM Tech
Here are the links to download free RM Tech audit reports in PDF format (only the annexes are not included):
- elle.be (women’s press on WordPress): PDF report
- jardinetmaison.fr (media on Drupal): PDF report
- billard-toulet.com (product presentation and blog on WordPress): PDF report
- sweetpartyday.com (ecommerce and blog on WordPress): PDF report
- jet-lag-trips.com (blog on WordPress): PDF report
- cyberpieces.com (ecommerce on Prestashop): PDF report
- feezia.com (ecommerce on Magento): PDF report
- regivia.com (editorial and ecommerce on WordPress + WooCommerce): PDF report
And you? How do you detect and deal with zombie pages in SEO?