#1465 – strange 404 urls and w3c validator

Posted in ‘sh404SEF’
Monday, 30 November 2015 10:53 UTC
syrah
 Hello,

Since I started using sh404SEF, which lets me see the 404 URLs for my website, I have found lots of URLs ending with "undefined", sitemap.xml or feed/atom.html,
for example:
www.domaine/category/product/undefined
www.domaine/category/product/sitemap.xml
and www.domaine/category/productfeed/atom.html

Do you know what that means and why I get those?

I also have a problem with the W3C validator.
I get this error and warning:
Error: Attribute xmlns:og not allowed here.
From line 1, column 16; to line 1, column 128
TYPE html><html xmlns:og="http://ogp.me/ns#"  xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-fr" lang="fr-fr" dir="ltr"><head>

Warning: Attribute with the local name xmlns:og is not serializable as XML 1.0.
From line 1, column 16; to line 1, column 128
TYPE html><html xmlns:og="http://ogp.me/ns#"  xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-fr" lang="fr-fr" dir="ltr"><head>


I have been getting this since I activated the social SEO option.
Is it normal to get that error? Do you have any idea how I can solve this, other than turning that option off?

Thanks a lot for your help
Monday, 30 November 2015 14:37 UTC
wb_weeblr
Hi

1 - 404: are those 404 marked as Internal or External by sh404SEF?

If external, they are just requests made to your site at random by bots and maybe other sites. There's not much you can do about it.
If internal, then you have to find where on your site those errors are coming from. sh404SEF will help you by displaying the referrer (the page the request is coming from).

2 - Facebook meta data do not validate. They use invalid markup.

Rgds
 
Monday, 30 November 2015 15:07 UTC
syrah
Hi,

Thanks for your answer.

1- How do I know whether they are external or internal? On the sh404SEF control panel, the last line says: 88 total 404, 0 internals, 10 externals

but when I click on "10 externals" I am redirected to the full list of 404 errors and I cannot tell internal from external.

2- What do you mean by "Facebook meta data do not validate"?
If I turn off social SEO in sh404SEF, this line
<html xmlns:og="http://ogp.me/ns#"  xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-fr" lang="fr-fr" dir="ltr">
will be replaced by
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr-fr" lang="fr-fr" dir="ltr">

As you can see, the
xmlns:og="http://ogp.me/ns#"
part has disappeared, which is good for the W3C validator, but then I no longer have the Open Graph tags, i.e.:
<meta property="og:locale" content="fr_FR" />
  <meta property="og:title" content="titre" />
  <meta property="og:description" content="la descritption" />
  <meta property="og:type" content="article" />
  <meta property="og:url" content="http://www.mondomaine.fr/page.html" />
  <meta property="og:site_name" content="nom du site" />


So is that normal, and what should I do about it? Keep Open Graph and the error, or remove Open Graph and have no error?

Thanks
Monday, 30 November 2015 15:12 UTC
wb_weeblr
Hi

1 - You already know:
0 internals, 10 externals
You don't have internal 404.

but when I click on "10 externals" I am redirected to the all 404 error and I can not make the difference between internal and external
Well, if you don't have internal errors, then all 404s are externals, which is why when you click on that link you get the full list.

I assume if you click on the "0 Internal" link, then you'll get the 404 page with no 404 listed.

Note that when you are on the 404 manager page, the last filter, on the left sidebar, lets you select to display internal, external or both.

2
part has deseapeared which is good for W3C validator but then I do have anymore the opengraph ie
Yes, as I said earlier, the open graph markup used by Facebook does not validate. It's one or the other.

Note I'm not aware of anyone picking the validator over being publishable on Facebook ;)

Rgds
 
Monday, 30 November 2015 15:21 UTC
syrah
Hi again,

1- I am a bit confused, because
it says 88 404 errors, 0 internals and 10 externals.
So if 10 of the 88 404 errors are external, what are the 78 missing ones?

Besides, on the 404 manager page, when I select external only I get nothing, and when I select internal only I get all 88 errors.

So I am completely lost. Are my 88 errors internal or external?

2- Thanks for your answer, I will keep Open Graph on.

Thanks
Monday, 30 November 2015 16:46 UTC
wb_weeblr
Hi

so if among the 88 errors 404 I have 10 externals what are the 78 missing ?
The 78 most likely happened before we started recording 404 origins, which happened in version 4.7.0, about a couple of weeks ago.

Beside when I go on the 404 manager page when I selected external only I have nothing and when I selected internal only I have all the 88 error.
That's not normal.

Could you please provide us with superadmin credentials for your website? You can create a temporary account and delete it afterwards, but it must have superuser level.

Thanks and regards
 
Monday, 30 November 2015 17:00 UTC
syrah
Hi,

Here is a special temporary account so that you can log in to my website.
The website is www.xxxx.com (www.xxxx.com/administrator)
and the user is: xxxx and the password: xxxx

I hope you will find out why it reacts like that.

Thanks,
Monday, 30 November 2015 17:20 UTC
wb_weeblr
Hi

I'm not sure why it reacts like that, but I have a small idea.
I think you may have some invalid records in the database, which was a problem on some multilingual websites with a previous version of sh404SEF (4.7.0).
Your site is not multilingual, but you do have 2 content languages set up (English and French) and that may have caused this problem.

One thing is extremely weird: all those links you have are 404, but they have a custom page title entered:

In addition, the title is the same for all 404s. That shouldn't happen.

To validate this, I'd simply suggest purging the 404 URLs log with the "Purge 404" button in the 404 manager. That way, you'll start with a clean slate and we can see which 404s come back (or not), etc.

Rgds
 
Monday, 30 November 2015 18:23 UTC
syrah
Hi,

Thanks for your reply.

I am going to purge the 404s right now and see what happens in a few hours.

For the multilanguage website, do you think I should remove the English language?

Thanks,
Monday, 30 November 2015 21:11 UTC
syrah
Hi again,

It has now been a couple of hours (it is 22:00) since I purged the 404 errors, and I now have 34 errors, but 0 internal and 0 external according to the control panel.

But if I go to the 404 manager, I have 34 internals and 0 externals.

And as you noticed earlier, all the new 404s still have the title tag filled in with the title of my home page, i.e.: <title>Conseils sur le sexe et la sexualité : comprendre, s'informer</title>

All the 404 errors look to me like someone is trying to hack my website with some kind of script. So I checked my server logs and found this:
Line 56537: 199.33.122.202 www.xxxxxxx.com - [30/Nov/2015:20:47:56 +0100] "GET /siteadmin/FCKeditor/editor/ HTTP/1.1" 404 737 "-" "-"
	


That IP address matches 33 of the errors.

I understand that I get 404s from bad scripts trying to get into my website, but what I don't get is why all of them have the same title.

And also why I get internal errors in the 404 manager. Shouldn't all those errors be external?

Thanks
Tuesday, 01 December 2015 08:09 UTC
wb_weeblr
Hi

All the 404error seems to me that someone is trying to hack my website with some kind of script. So I checked my server logs and found this:
Yes, this is very common and presents no risk. You'll see this for the whole life of your site, and there will also be WordPress, Drupal, Magento, ... requests thrown at your Joomla site. It's fine. If you see a 404, it means that the attack failed.

If you want to save a bit of processing power, you can block the IP address of this visitor, either in sh404SEF, under the security tab, or much more efficiently, using your .htaccess file or your hosting company control panel, if they have something like this.
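
Before blocking an address, it can help to confirm how many 404s it is actually responsible for. The following is only a minimal sketch, assuming access-log lines shaped like the excerpt quoted above (IP, host, dash, date in brackets, quoted request, status, size). The function name count404ByIp and the extra sample lines are made up for illustration; they are not part of sh404SEF or of the real log.

```typescript
// Matches lines shaped like: ip host - [date] "METHOD path PROTOCOL" status size ...
// This layout is an assumption based on the single excerpt in the ticket.
const LOG_LINE = /^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) [^"]*" (\d{3}) /;

// Hypothetical helper: count how many 404 responses each client IP received.
function count404ByIp(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const match = LOG_LINE.exec(line);
    if (match === null) continue; // skip lines in another format
    const ip = match[1] ?? "";
    const status = match[3] ?? "";
    if (status !== "404") continue;
    counts.set(ip, (counts.get(ip) ?? 0) + 1);
  }
  return counts;
}

// First line is from the ticket; the others are invented sample data.
const sampleLines: string[] = [
  '199.33.122.202 www.xxxxxxx.com - [30/Nov/2015:20:47:56 +0100] "GET /siteadmin/FCKeditor/editor/ HTTP/1.1" 404 737 "-" "-"',
  '199.33.122.202 www.xxxxxxx.com - [30/Nov/2015:20:47:57 +0100] "GET /admin/FCKeditor/editor/ HTTP/1.1" 404 737 "-" "-"',
  '203.0.113.5 www.xxxxxxx.com - [30/Nov/2015:20:48:10 +0100] "GET /index.php HTTP/1.1" 200 5120 "-" "-"',
];

for (const [ip, count] of count404ByIp(sampleLines)) {
  console.log(`${ip}: ${count} x 404`);
}
```

An address that accounts for most of the 404s, as in the case above, is a reasonable candidate for a block.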

For the multilanguage website, do you think I should remove the English language?
No, you only have to unpublish the English language in the Content tab of the Joomla language manager (not the "Site" or "Administrator" tab of the language manager).

As for the 404 external/internals count, I'm a bit puzzled. Can you do an export of the content of all the xxxx_sh404sef_xxxxx tables in your database? I'd like to look at their content really.

Rgds
 
Tuesday, 01 December 2015 09:39 UTC
syrah
Hello,

1- Attached is an export of all the sh404SEF tables in my database.

2- The number of 404s has now increased a bit, and I again had a URL with /undefined.
I attached a screen capture of it. It gives the referrer, but I checked that page and don't understand how it created a URL with undefined. Do you have any idea what I should do about it?

3 and last - among all the 404s there is one old URL that doesn't exist anymore on my website. How can I find out where Bingbot found it, in order to correct it? I know I can just make a 301 redirect, but I already have so many of those in sh404SEF or in my .htaccess (my website has existed since 2000 and has been rebuilt about 3 times) and I would prefer to avoid adding more if I can.

Thanks a lot for your precious help
Tuesday, 01 December 2015 09:56 UTC
wb_weeblr
Hi

2/ An /undefined URL is created when some JavaScript on your page tries to make an Ajax request or similar, or maybe load an image, and the URL is held in a variable. If there's a bug in the JavaScript and the variable is not properly computed, its content ends up being "undefined" and the script requests the site.com/undefined URL, which of course is a 404.

You have to go to the referrer page and check anything that uses JavaScript (the Facebook buttons, Twitter, etc.). Use them and see if you get a 404 error in the developer console when you do. That'll tell you what's causing this.
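
To illustrate the mechanism, here is a small sketch of how an unset variable ends up as a literal "undefined" path segment. The WidgetConfig shape, the shareUrl field and both function names are hypothetical; they only stand in for whatever script on the referrer page builds the URL.

```typescript
// Hypothetical widget configuration: `shareUrl` is optional and was never set.
interface WidgetConfig {
  shareUrl?: string;
}

// Buggy version: interpolating a missing value yields the text "undefined".
function buildRequestUrlBuggy(base: string, config: WidgetConfig): string {
  return `${base}/${config.shareUrl}`;
}

// Safer version: refuse to build a URL when the value is missing or empty.
function buildRequestUrl(base: string, config: WidgetConfig): string | null {
  if (config.shareUrl === undefined || config.shareUrl === "") {
    return null; // the caller should skip the request entirely
  }
  return `${base}/${config.shareUrl}`;
}

const emptyConfig: WidgetConfig = {};
console.log(buildRequestUrlBuggy("http://www.example.com/category/product", emptyConfig));
// prints "http://www.example.com/category/product/undefined"
console.log(buildRequestUrl("http://www.example.com/category/product", emptyConfig));
// prints null
```

A request to the first URL is exactly the kind of 404 that shows up in the log with the product page as referrer.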

3 -
3 and last - among all the 404 there is one old url that doesn't exist anymore on my website. How can I know from where the bingbot find it in order to correct it?
So what you say is that Bingbot is trying to fetch an old URL that has now disappeared from your site?

Well, in such cases, a 404 is exactly what should happen. You've got only 2 options:
- if you have a very similar page that you want users to go to, then do a 301. I insist on very similar. If it is not similar, you will gain nothing in terms of SEO: ranking will not be transferred through the 301 to the new page.
- if you don't have a very similar page, just let the 404 happen. 404s are perfectly fine. It's normal that they happen, and you don't have to "fix" all of them.

Search engines (Google will do the same) will come back on a regular basis and check whether a page returning a 404 has really been removed. They want to be sure you removed the page from your site. So when they see a 404, they remove the page from their index, but they keep it somewhere. And they come back over several months to see if the page is still a 404, or if this was just a temporary issue.

1 - I'll check on this data today.

Rgds
 
Tuesday, 01 December 2015 10:13 UTC
syrah
Hi,

Thanks a lot for your answer, which is very clear and very useful for me.

I am looking forward to seeing what your analysis of the sh404SEF tables will say.

Thanks again for your help
Tuesday, 01 December 2015 10:54 UTC
wb_weeblr
Hi

I can't explain what's happening with the external/internal display. The data in the table I could import looks good and the 404s such as admin/editor should show only when displaying Externals or Both.

Can you do a site backup with Akeeba Backup (maybe exclude all images, otherwise it'll be too big)?

Please also note that the backup you sent me actually failed. Most of the data is there, but at the end of the file, the mysql server stopped responding in some way:


Rgds


 
Tuesday, 01 December 2015 11:21 UTC
syrah
Hi,

Oh, I am sorry, I forgot that I get this problem when I do a database export from OVH's phpMyAdmin.

Here is a backup of my whole database made with Akeeba, but it is too big. If I select all the sh404SEF tables and copy and paste them into a new file, will that be OK for you to work on?

Thanks
Tuesday, 01 December 2015 11:38 UTC
syrah
Hi again,

Here is the SQL file I made with only the sh404SEF tables. It is a bit heavy (2.04 MB zipped); I hope you will be able to get it.

If you need anything else...
Thanks

Edit: no, it is too big. Maybe I will try to copy it into a txt file.
Tuesday, 01 December 2015 12:18 UTC
wb_weeblr
Hi

I don't need the database. I need the site. The database is fine, there's something on the site causing this.
You can simply use Akeeba Backup to do the complete backup, just make sure to exclude the /images folder or any folders where you have stored your images, so that the backup is not too big (up to 300 - 500 MB is fine; I have had some up to 1 GB, but downloading them is less reliable and takes a lot of time).

When the backup is ready, I can just login to the admin and download it from akeeba backup control panel

Rgds
 
Tuesday, 01 December 2015 12:45 UTC
syrah
Hi,

I just made two backups without images, one in .jpa (#5. backup sans image jpa) and one in .zip (#6. backup sans image zip)

Hope it helps

Thanks
Tuesday, 01 December 2015 18:52 UTC
wb_weeblr
Hi

OK, the problem was not that complicated after all:

1 - the French translation is wrong: the "Internal" and "External" labels on the selector have been inverted

2 - When displaying the External URLs, I have chosen to display not only the Externals, but also the "Unknown".
Unknown are the 404s registered before this feature was added to sh404SEF. So by default I am assuming they are external.

I agree this is confusing and will change that. Most likely, I will update the counter on the control panel so that it reflects the actual number.
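
The mismatch between the counter and the list can be reproduced with a small sketch. It assumes each recorded 404 carries an origin of internal, external or unknown, that the control panel counter matches the origin strictly, and that the "External" list also includes the unknowns, as described above. The record shape and function names are hypothetical and do not reflect sh404SEF's actual tables or code.

```typescript
// Hypothetical record shape; sh404SEF's real tables and field names may differ.
type Origin = "internal" | "external" | "unknown";

interface NotFoundRecord {
  url: string;
  origin: Origin;
}

// Control panel style counter: strict match on the recorded origin.
function countOrigin(records: NotFoundRecord[], origin: Origin): number {
  return records.filter((r) => r.origin === origin).length;
}

// "External" list filter as described in the post: externals plus unknowns.
function externalListing(records: NotFoundRecord[]): NotFoundRecord[] {
  return records.filter((r) => r.origin === "external" || r.origin === "unknown");
}

// Rebuild the numbers from the ticket: 88 errors, 10 external, 78 unknown.
const records: NotFoundRecord[] = [];
for (let i = 0; i < 10; i++) records.push({ url: `/bot-probe-${i}`, origin: "external" });
for (let i = 0; i < 78; i++) records.push({ url: `/old-url-${i}`, origin: "unknown" });

console.log(countOrigin(records, "internal")); // 0
console.log(countOrigin(records, "external")); // 10
console.log(externalListing(records).length);  // 88
```

With these assumptions the counter reports 10 externals while the list shows all 88, which matches what was observed on the site once the inverted French labels are taken into account.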

Rgds
 
Tuesday, 01 December 2015 19:09 UTC
wb_weeblr
Hi

I have installed a modified version that:
a/ has the correct language strings in the 404 manager selector
b/ displays the correct number of external URLs in the URL manager

Rgds

 
Wednesday, 02 December 2015 08:00 UTC
syrah
Good morning,

Thank you very much for the time you spent helping me solve this, and for all the advice and answers you gave me.

Wednesday, 02 December 2015 09:38 UTC
wb_weeblr
Hi

You're welcome! Closing this ticket now, feel free to open a new one as needed. If you do so, please mention this ticket number in the new one.

If you created any superadmin account for us, be sure to delete or block it now to avoid unnecessary risk in the future.

And please feel free to post a review on the JED, it helps and we appreciate it!


Rgds

 