Hi,
I have a strange problem that is causing internal 404s that I do not understand. It started at the end of 2017. For some reason the /nl language code gets inserted into URLs where it does not belong, causing the 404s. Often this concerns CSS and JavaScript files. Today I noticed it also happens with external links on the site. In this case the language code is prefixed before the protocol.
I have no clue where it is coming from. Although many of these internal 404s have GoogleBot or BingBot as user agent, I cannot find these errors in Google's Search Console or Bing's Webmaster Tools.
When I go to the frontend of the site everything seems to work fine. The links are fully functional and the language code is not added anywhere. Everything works as expected.
I checked my site with Screaming Frog and it doesn't find any 404s.
I know sh404SEF just records what is happening and does not add language codes. So the problem is likely not being caused by your software. However, I really have no idea where to look anymore.
Do you have a clue what is going on here? Are they really internal 404s? Or should they be marked as external?
Best regards,
Joep
P.S. For some reason I can't attach an image to this ticket. Instead I ran an export, the contents of which you will find below:
"Nbr","Sef url","Non sef url","Hits","Rank","Date added","Page title","Page description","Page keywords","Page language","Robots tag","Referrer type"
"1","nl/https:/maps.google.com","","1","0","2018-09-16","","","","","","2"
"2","nl/https:/plus.google.com/ xxxx","","2","0","2018-09-16","","","","","","2"
"3","nl/https:/www.xxxx.com","","1","0","2018-09-16","","","","","","2"
"4","nl/https:/www.xxxx.com/de","","1","0","2018-09-16","","","","","","2"
"5","nl/https:/www.xxxx.com/en","","1","0","2018-09-16","","","","","","2"
"6","nl/https:/www.xxxx.com/media/com_acymailing/css/module_default.css","","1","0","2018-09-16","","","","","","2"
"7","nl/https:/www.xxxx.com/nl","","1","0","2018-09-16","","","","","","2"
"8","nl/https:/www.facebook.com/xxxx","","1","0","2018-09-16","","","","","","2"
"9","nl/https:/www.instagram.com/xxxx","","1","0","2018-09-16","","","","","","2"
"10","nl/https:/www.xxxx.nl","","1","0","2018-09-16","","","","","","2"
"11","nl/https:/www.linkedin.com/company/xxxx","","1","0","2018-09-16","","","","","","2"
"12","nl/https:/www.twitter.com/xxxx","","1","0","2018-09-16","","","","","","2"
"13","nl/tel: xxxx","","1","0","2018-09-16","","","","","","2"
"14","nl/tel: xxxx","","1","0","2018-09-16","","","","","","2"
All of these come from the same IP / user agent:
Date Internal Referrer IP
1 2018-09-16 07:45:40 /https://maps.google.com/?q=xxxx
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.13+ (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2
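To separate the entries where the language code sits in front of a protocol from ordinary 404s, the export can be filtered with a short script. This is only a minimal sketch: it assumes the export is available as CSV text with a "Sef url" column as shown above, and the function name `find_prefixed_external_urls` and the regular expression are my own illustration, not part of sh404SEF.

```python
import csv
import io
import re

# Assumption: a two-letter language code, a slash, then a URI scheme
# such as "https:" or "tel:" marks an affected entry.
LANG_BEFORE_SCHEME = re.compile(
    r"^(?P<lang>[a-z]{2})/(?P<scheme>[a-z][a-z0-9+.-]*):", re.IGNORECASE
)


def find_prefixed_external_urls(export_text):
    """Return (lang, scheme, sef_url) for each row whose SEF URL
    starts with a language code directly followed by a URI scheme."""
    reader = csv.DictReader(io.StringIO(export_text))
    matches = []
    for row in reader:
        sef_url = row["Sef url"]
        m = LANG_BEFORE_SCHEME.match(sef_url)
        if m:
            matches.append((m.group("lang"), m.group("scheme").lower(), sef_url))
    return matches


if __name__ == "__main__":
    # Hypothetical, shortened sample in the same shape as the export above.
    sample = (
        '"Nbr","Sef url","Hits"\n'
        '"1","nl/https:/maps.google.com","1"\n'
        '"2","nl/tel: xxxx","1"\n'
        '"3","nl/contact","1"\n'
    )
    for lang, scheme, url in find_prefixed_external_urls(sample):
        print(lang, scheme, url)
```

Run on the sample, this lists the first two rows and skips `nl/contact`, which has no scheme after the language code. Grouping the results by scheme could show whether only absolute links (`https:`, `tel:`) are affected.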