#3543 – 404 requests - blocking directories

Posted in ‘sh404SEF’
Thursday, 20 April 2017 15:00 UTC
SportsVision
Is it possible to have certain directories not display in the 404 requests page? I ask because it appears some robots are not adhering to our robots.txt file and scanning our forum (and other) directories. Now I have TONS of forum links in 404 requests.

It would also be nice to be able to redirect to non-Joomla (index.php) pages. We have links on our site that are not part of Joomla (i.e. our Support site) that are generating 404 requests.

One of the reasons we're using sh404SEF (among all the great features) is that Joomla is creating some errant urls. It creates urls that duplicate our subdirectory (site3) that Joomla is installed in when creating header links. For example, it will create a link site3/site3/site3/site3/site3/index.php, which is really www.xxxx.com/site3 (the front page). Is there any way to automate fixing these 404 request issues? Or fix it so Joomla does not create these bogus links?
 
Thursday, 20 April 2017 15:12 UTC
wb_weeblr
Hi

Is it possible to have certain directories not display in the 404 requests page?
That's not possible. What you can do is maybe simply search for them and delete them, but that's about it.
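If you export or copy the list of 404 URLs, the "search and delete" step can be prepared outside the extension. The following is only a sketch under assumptions: it does not use any sh404SEF API, the directory prefixes are hypothetical placeholders, and the input is assumed to be a plain list of URL strings.

```python
from urllib.parse import urlsplit

# Hypothetical directory prefixes to ignore; replace with your own.
IGNORED_PREFIXES = ("/forum/", "/support/")


def is_ignored(url, prefixes=IGNORED_PREFIXES):
    """Return True if the URL path is inside one of the ignored directories."""
    path = urlsplit(url).path
    if not path.startswith("/"):
        path = "/" + path  # treat "forum/x" the same as "/forum/x"
    # Match anything under the directory, or the directory itself without a trailing slash.
    return any(path.startswith(p) or path + "/" == p for p in prefixes)


def split_404s(urls, prefixes=IGNORED_PREFIXES):
    """Split logged 404 URLs into (keep, ignore) lists."""
    keep, ignore = [], []
    for url in urls:
        (ignore if is_ignored(url, prefixes) else keep).append(url)
    return keep, ignore
```

The `ignore` list then holds the entries you would search for and delete by hand, while `keep` holds the 404s that are still worth reviewing.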
I ask because it appears some robots are not adhering to our robots.txt file and scanning our forum (and other) directories. Now I have TONS of forum links in 404 requests.
Restrictions in your robots.txt do not prevent robots, even those that adhere to the standard (Google, Bing, etc.), from crawling those directories and reading all those pages. They will first crawl all the pages they can find, and then apply the rules in your robots.txt to decide which ones they can keep and index.

BUT

Robots crawling your site are not going to find/create 404s. By definition, they can only load pages that exist, not missing ones (how could they find them if they don't exist?).

So I'm guessing you are only seeing the usual random attacks that all sites get, which do end up being 404s. For instance, this Joomla site receives many requests for Joomla components that we don't use, or even for WordPress pages.

It would also be nice to be able to redirect to non-Joomla (index.php) pages.
We cannot do that, at least for now, as the target redirect must be stored in our database.

We have links on our site that are not part of Joomla (i.e. our Support site) that are generating 404 requests.
I don't understand. Why are those links generating 404s? If they are links to your support site, did you change those links in some way, which is now causing 404s?

It creates urls that duplicate our subdirectory (site3) that Joomla is installed in when creating header links. For example, it will create a link site3/site3/site3/site3/site3/index.php
I don't see how Joomla could create such links. Where exactly do you see those links? How did you find them? Are they displayed in a page?

Or fix it so Joomla does not create these bogus links?
It can only be fixed once we know what the original link is and where it comes from.
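To help track down where such links come from, you could scan a list of requested paths for repeated segments. This is a rough sketch, not something built into sh404SEF or Joomla: the function names are made up for illustration, and it assumes that a directory repeated back to back (as in the site3 example) is always a mistake, which may not hold for every site.

```python
def collapse_repeated_segments(path):
    """Collapse consecutive duplicate path segments, e.g. a/a/a/b -> a/b."""
    result = []
    for segment in path.split("/"):
        # Skip a non-empty segment that repeats the one just kept.
        if segment and result and result[-1] == segment:
            continue
        result.append(segment)
    return "/".join(result)


def find_duplicated_paths(paths):
    """Return (original, collapsed) pairs for paths that contain repeated segments."""
    pairs = []
    for path in paths:
        collapsed = collapse_repeated_segments(path)
        if collapsed != path:
            pairs.append((path, collapsed))
    return pairs
```

The collapsed value only suggests what the intended URL probably was. The page that emits the bad link still has to be found and corrected.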

Is there any way to automate fixing these 404 request issues?
Usually, 404s do not need to be fixed. The only, very important, exception is when the link originates from your own site.
For instance, if a link in your menu generates a 404 when clicked, that's a major problem that you have to fix.
Then there is the case of a page that you modified, changing its URL but keeping the same content. All of that happens on your own site, and in that case you do want to set up redirects.

But if you removed a page completely, or removed a product from a shop, then you should not redirect it. Serving a 404 for those pages is exactly what needs to be done, both for users and for SEO.
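The rules above can be summed up as a small decision helper. This is just one reading of them, written as a sketch: the names `Action` and `recommend` and the two boolean inputs are illustrative assumptions, not part of sh404SEF.

```python
from enum import Enum


class Action(Enum):
    REDIRECT = "redirect the old URL to the new one"
    FIX_LINK = "fix or remove the link on your own site"
    LEAVE_404 = "leave it as a 404"


def recommend(content_moved, linked_from_own_site):
    """Suggest how to handle a 404, following the rules described above."""
    if content_moved:
        # Same content, new URL: a redirect is appropriate.
        return Action.REDIRECT
    if linked_from_own_site:
        # Your own menu or page points to a missing URL: fix the source link.
        return Action.FIX_LINK
    # Removed content or random external requests: a 404 is the right answer.
    return Action.LEAVE_404
```

For example, a request for a WordPress login page on a Joomla site would fall through to `Action.LEAVE_404`.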

Rgds
 
Friday, 05 May 2017 05:34 UTC
system
This ticket has been automatically closed. All tickets which have been inactive for a long time are automatically closed. If you believe that this ticket was closed in error, please contact us.