How to specify a Canonical URL to Google

Canonical - duplicate content and URLs

Observing the behaviour of the reports generated by Google Search Console (GSC), it seems clear that there are several ways in which a web spider may see a page as non-canonical. For example, when I set up a profile for this website in GSC I did so for the domain name: tempusfugit.me.uk.

Subsequently I added the profiles http://tempusfugit.me.uk/ and https://tempusfugit.me.uk. This makes 3 profiles and, along with the www. versions of them, they could all be seen as the same page and therefore as duplicate URLs.
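To make the overlap concrete, here is a minimal sketch that lists the address variants a spider could reach for one page. The function name url_variants is my own, and it assumes only the http/https and www/non-www combinations matter; a real site may have more (for example trailing index.html).

```python
from itertools import product


def url_variants(domain: str, path: str = "/") -> list[str]:
    """Return the scheme and www/non-www variants of one page address."""
    schemes = ("http", "https")
    hosts = (domain, "www." + domain)
    # Every combination of scheme and host points at what is meant to be one page.
    return [f"{scheme}://{host}{path}" for scheme, host in product(schemes, hosts)]


variants = url_variants("tempusfugit.me.uk")
for url in variants:
    print(url)
```

Only one of these addresses should be named in the canonical declaration; the others are the candidates for being reported as duplicates.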

My main access to GSC is through the profile for the Domain Property, tempusfugit.me.uk, not the http:// or the https:// properties.

I resubmitted the index page for reindexing as it was still reporting as:
"Duplicate, Google chose different canonical than user".

I did have two canonical declarations on the page, but I changed this back to a single declaration, the same as the one selected by Google: https://tempusfugit.me.uk/index.html

I am leaving the other domain property reports alone for now as I want to clear the primary profile.


What Google say:

" When Googlebot indexes a site, it tries to determine the primary content of each page. If Googlebot finds multiple pages on the same site that seem to be the same, it chooses the page that it thinks is the most complete and useful, and marks it as canonical. The canonical page will be crawled most regularly; duplicates are crawled less frequently in order to reduce Google crawling load on your site. "

Declaration tag

I have added the tag to my template, but commented out, so that an incorrect declaration is not found by a web spider should the page be uploaded by mistake. When the page is "ready" I have to make sure that the tag is uncommented and that it is correct.

On older pages, written before I was aware of the need, I add a tag taken from either the template or an existing file that I have open in my editor. To avoid the wrong canonical being picked up I try to copy and paste a tag with the actual page name replaced by the placeholder page_name:

" <link rel="canonical" href="https://tempusfugit.me.uk/page_name.html" /> "

This is so that, if I upload the page without filling in the tag correctly, the web spider does not conclude that I am declaring the page to be a duplicate of the file I copied the tag from.
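A check like this could be automated before upload. The following is only a sketch using Python's standard html.parser; the names CanonicalFinder and check_canonical, the PLACEHOLDER constant and the file name canonical.html in the usage lines are my own assumptions, not part of any tool mentioned here. Because the parser reports HTML comments separately, a commented-out tag is not counted as a declaration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import posixpath

PLACEHOLDER = "page_name"  # assumed placeholder text left in a copied tag


class CanonicalFinder(HTMLParser):
    """Collect the href of every live <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels:
            self.hrefs.append(attr.get("href") or "")


def check_canonical(html: str, file_name: str) -> list[str]:
    """Return a list of problems found; an empty list means the tag looks right."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    problems = []
    if len(finder.hrefs) != 1:
        problems.append(f"expected 1 canonical tag, found {len(finder.hrefs)}")
    for href in finder.hrefs:
        if PLACEHOLDER in href:
            problems.append(f"placeholder still present: {href}")
        elif posixpath.basename(urlparse(href).path) != file_name:
            problems.append(f"canonical points at another file: {href}")
    return problems


# Usage: a page uploaded with the placeholder tag still in place.
page = '<head><link rel="canonical" href="https://tempusfugit.me.uk/page_name.html" /></head>'
print(check_canonical(page, "canonical.html"))
```

The check compares only the file name at the end of the href, so it would not notice a wrong scheme or host; that would need a stricter comparison against the full intended URL.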

The file name is then checked and added to the sitemap at the same time.

You also need to check that you have the correct spelling, in both the declaration and the sitemap.
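The spelling check against the sitemap can be sketched in the same way. This assumes the sitemap is an XML file in the standard sitemaps.org format; the function names and the SAMPLE sitemap are made up for illustration and are not the real sitemap of this site.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical one-entry sitemap used only for this example.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://tempusfugit.me.uk/index.html</loc></url>
</urlset>"""


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Return every <loc> address listed in the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def in_sitemap(canonical_url: str, sitemap_xml: str) -> bool:
    """True only if the canonical address appears in the sitemap, character for character."""
    return canonical_url in sitemap_urls(sitemap_xml)


print(in_sitemap("https://tempusfugit.me.uk/index.html", SAMPLE))
print(in_sitemap("https://tempusfugit.me.uk/indx.html", SAMPLE))
```

The comparison is deliberately exact, so a misspelt file name or a different scheme in either place shows up as a mismatch.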

What if Google chooses a different page

If Googlebot does this it is probably for a reason. In the eyes of Google you probably do have duplicate pages.

The duplicate content could have been detected because you have unintentionally confused Google. Check that you have made the correct declaration and that the same URL is in your sitemap. If you do have duplication and the wrong page is being selected over the one that you wanted, delete that page from the server and submit a URL removal for it.

I understand what Google mean by Canonical, but how the rel=canonical tag should be added still seems a little unclear. At the time of writing, https://tempusfugit.me.uk/ and https://tempusfugit.me.uk/index.html were being reported by Google Search Console as not having a user-specified canonical declaration.

I added a rel=canonical link tag to my index page to see if that satisfies Google. The article on Yoast (and a comment on it) seems to suggest that this could be done.


Links