Example: De-Indexing PDFs and Gated Content
Common inbound marketing channels include gated content – ebooks, videos, PDFs, or PowerPoints – that can only be consumed after submitting contact information (an email, company name, phone number, etc.).
The goal of having such premium content is to market your product, and the goal of gating it is to acquire contact information for your sales team. (You can follow the calls-to-action in this blog post to experience what it’s like to be inbound marketed by great content and landing pages.)
A client of mine has a problem: Google has indexed their gated content, and it ranks on page one for one of their common industry search phrases. That search phrase is essentially the title of the ebook.
Rather than referring the searcher to the landing page, where they would give up their contact info, Google is sending them straight to the PDF.
In all, the landing page and sales funnel this client built to gather, support, and leverage their great content are now seriously devalued. Don’t let this happen to you!
How to: De-Indexing PDFs and Gated Content
Note that your content management system assigns your PDF its own URL once you’ve uploaded it to your website. That URL is the one you submit for removal in the steps below.
- On the Webmaster Tools home page, click the site you want.
- On the Dashboard, click Google Index on the left-hand menu.
- Click Remove URLs.
- Click New removal request.
- Type the URL of the page you want removed from search results (not the Google search results URL or cached page URL), and then click Continue. The URL is case-sensitive, so use exactly the same characters and capitalization that the site uses.
- Click Yes, remove this page.
- Click Submit Request.
Learn More and How Google Reads PDF Content
This blog post leaves several questions unanswered, such as how Google reads an entire PDF document. Does it index links? Can it read metadata? (You can customize a PDF file’s metadata in Adobe.) This article from the Google blog provides more insight on PDFs and search engines. Below is the answer it gives regarding PDFs and duplicate content.
“Q: Is it considered duplicate content if I have a copy of my pages in both HTML and PDF?
A: Whenever possible, we recommend serving a single copy of your content. If this isn’t possible, make sure you indicate your preferred version by, for example, including the preferred URL in your Sitemap or by specifying the canonical version in the HTML or in the HTTP headers of the PDF resource. For more tips, read our Help Center article about canonicalization.”
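To make "specifying the canonical version in the HTTP headers of the PDF resource" more concrete, here is a minimal Python sketch of what those headers could look like. It assumes your web server or CMS lets you set response headers for PDF files. The function name pdf_response_headers and the example.com URL are hypothetical and used only for illustration. Note that this covers the duplicate-content case in the quote (the same content served as both HTML and PDF); it does not remove a PDF from the index on its own.

```python
def pdf_response_headers(canonical_url: str) -> dict[str, str]:
    """Build HTTP headers for a PDF response that point to a preferred (canonical) URL.

    Hypothetical helper for illustration: how you actually attach these headers
    depends on your web server or CMS.
    """
    if not canonical_url.startswith(("http://", "https://")):
        raise ValueError("canonical_url must be an absolute http(s) URL")
    return {
        "Content-Type": "application/pdf",
        # A Link header with rel="canonical" names the preferred version of this content.
        "Link": f'<{canonical_url}>; rel="canonical"',
    }


# Example: the PDF copy of a guide points to its HTML version (assumed URL).
headers = pdf_response_headers("https://www.example.com/guide.html")
for name, value in headers.items():
    print(f"{name}: {value}")
```

The helper only builds the header values, so the same strings can be copied into whatever server configuration you actually use.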