Anyone who submits websites to search engines for indexing will regularly see error messages in the resulting reports. Some of them concern page content that is flagged as a "duplicate" because the search engine could not determine a "canonical" page. This article explains how to reduce these error messages.

Meta-Tags

Every web page has an area, separate from its visible content, that describes the page. This so-called meta section lets additional information be embedded in the page for search engines and other software. It is not displayed to the visitor. The information sits in the head of the individual page and can cover a wide range of details. An HTML page with meta information but no visible content might look like this:

<html><head>

<title>Website by Peter Example</title>
The title element (strictly speaking not a meta tag) specifies the title of the website

<meta name="description" content="This is my private website">
This meta tag provides a description of the website

<meta name="keywords" content="Peter, Example, private, homepage">
This meta tag lists short keywords that describe the content in single words

<link href="https :// www. myhomepage . com/mycontent" rel="canonical"/>
The relationship attribute "rel" classifies the above link as canonical. The URL in the "href" attribute points to the original page.

</head><body></body></html>

Within the metadata, each tag is a unit that carries one piece of information. The canonical tag is a link element whose "rel" attribute has the value "canonical"; it marks a URL as the canonical address for the page content. If a page is reachable via multiple URLs, search engines treat the specified address as the main or original page.
(The spaces in the URLs are for clarity and should not exist in practice)
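
To check which canonical URL a page actually declares, the head can be parsed with a few lines of code. The following is only a minimal sketch using Python's standard library; the names CanonicalFinder and find_canonical are made up for this illustration, and the example URL is written without the spaces used above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may contain several space-separated tokens
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical_urls.append(attr_map["href"])


def find_canonical(html_text):
    """Return the first declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical_urls[0] if finder.canonical_urls else None


# Example page, modeled on the listing above (URL without spaces)
page = """<html><head>
<title>Website by Peter Example</title>
<link href="https://www.myhomepage.com/mycontent" rel="canonical"/>
</head><body></body></html>"""

print(find_canonical(page))
```

A page without a canonical link element simply yields None here, which makes missing declarations easy to spot.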

The problem

Especially with so-called Web Content Management Systems (WCMS), individual pages are often reachable via several URL formats. These systems usually offer settings for search-engine-friendly URLs and also use their own addressing schemes for internal structure management. As a result, search engines can reach the same page under different URLs:
(The spaces in the URLs are for clarity and should not exist in practice)

https :// www. myhomepage . com/mycontent

https :// www. myhomepage . com/index.php/mycontent

https :// www. myhomepage . com/?com_content&id=3

Search engines would then see three different page addresses with identical content, that is, one page and at least two duplicates. To prevent this incorrect indexing, the URL of the canonical page can be specified in the metadata. The search engine can then index the canonical page and attribute the other URLs to it.
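
The effect of the canonical declaration can be illustrated with a small sketch. It assumes a hypothetical crawl result in which each of the three URLs from above (written without spaces) declares the same canonical URL; the dictionary and the function group_by_canonical are invented for this example and do not describe how any particular search engine works internally.

```python
from collections import defaultdict

# Hypothetical crawl result: fetched URL -> canonical URL declared by that page
crawled = {
    "https://www.myhomepage.com/mycontent": "https://www.myhomepage.com/mycontent",
    "https://www.myhomepage.com/index.php/mycontent": "https://www.myhomepage.com/mycontent",
    "https://www.myhomepage.com/?com_content&id=3": "https://www.myhomepage.com/mycontent",
}


def group_by_canonical(pages):
    """Group fetched URLs under the canonical URL they declare."""
    groups = defaultdict(list)
    for url, canonical in pages.items():
        # A page without a declaration stands for itself
        groups[canonical or url].append(url)
    return dict(groups)


index = group_by_canonical(crawled)
for canonical, variants in index.items():
    print(canonical, "<-", len(variants), "URL variants")
```

With the declaration in place, the three addresses collapse into a single entry. Without it (a value of None in the dictionary), each URL would form its own group, which corresponds to the duplicates described above.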

Generally, all major search engines can evaluate these canonical tags. If a Web Content Management System (WCMS) is used for the website, check whether it offers an option to designate a single page as canonical. Some systems (e.g., WordPress) also offer extensions (so-called "plugins") that add this function afterwards and can be used to manage canonical pages.
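
If the system in use offers no such option, the tag can also be generated by a small helper in the page template. The following is a rough sketch, not the code of any actual plugin; the function name canonical_link_tag is an assumption made for this example, and the URL is again written without spaces.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Return a <link rel="canonical"> element for the given absolute URL."""
    # Escape characters such as & and " so the attribute value stays valid HTML
    return '<link rel="canonical" href="{}"/>'.format(escape(canonical_url, quote=True))


print(canonical_link_tag("https://www.myhomepage.com/mycontent"))
```

The resulting line belongs in the head of every URL variant of the page, so that all variants point to the same canonical address.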