#10858 – Large Pages Pending Number

Posted in ‘4SEO’
Tuesday, 27 February 2024 13:22 UTC
davesouthey

One of our sites has an obscene number of pages pending site analysis, and it keeps growing. We're currently at 230k+. Initially we just ignored it, but we've realised that the site isn't generating an updated sitemap because it hasn't been able to work through its pending list.

Is there a setting that we've got wrong that is causing this? We're using URL parameters across the site, so could that be causing it? Does it have to analyse and index every combination of URL parameters that the site loads?

 

 
Tuesday, 27 February 2024 14:34 UTC
wb_weeblr

Hi

Is there a setting that we've got wrong that is causing this?

Nope, that's just a result of how the site is set up: how many pages it has, how often those pages change, and how much traffic it gets (traffic directly influences how often 4SEO works through the backlog of pages to analyze).

There are 2 ways this can be acted upon:

1 - Make 4SEO analyze pages more often: if you slowed down 4SEO's analysis, let it run faster again.

How often 4SEO analyzes pages is set under Configuration | System | Background processing | Pages between background processing.

That number is 1 by default, meaning that for each visit on the frontend, one page will be analyzed.

If for some reason you increased that number, say to 10, it would mean 4SEO would only analyze one page every 10 frontend visits.

Lower that number, or set it back to 1, to increase the analysis speed.
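To get a feel for what that setting means for a large backlog, here is a rough back-of-the-envelope sketch. It assumes exactly one page is analyzed every N frontend visits, as described above. The function name and the 2,000 visits per day figure are hypothetical, not taken from 4SEO or from this site.

```python
import math


def days_to_clear_backlog(pending, visits_per_day,
                          visits_per_analysis=1, new_pending_per_day=0):
    """Estimate how many days it takes to drain the pending list.

    Assumes one page is analyzed every `visits_per_analysis` frontend
    visits. Returns math.inf when pages are added to the list at least
    as fast as they are analyzed.
    """
    if visits_per_analysis < 1:
        raise ValueError("visits_per_analysis must be >= 1")
    analyzed_per_day = visits_per_day / visits_per_analysis
    net_per_day = analyzed_per_day - new_pending_per_day
    if net_per_day <= 0:
        return math.inf
    return pending / net_per_day


# Hypothetical traffic: 2,000 frontend visits per day.
print(days_to_clear_backlog(230_000, 2_000))      # 115.0
print(days_to_clear_backlog(230_000, 2_000, 10))  # 1150.0
```

With the default value of 1 the only remaining levers are more traffic or fewer pages to analyze, which is why the second option is usually the one that helps.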

2 - Make 4SEO analyze fewer pages

Likely the best option. You can exclude groups of pages from the analysis entirely under Pages | Settings | Excluded URLs.

There you should exclude all pages that have no SEO value. That depends entirely on your site, so you need to look at its content to decide what to exclude.

Exclusion is done with expressions such as /users/{*}, for instance. User profiles, content edit pages, and page variations based on price, colors, display, or sorting variables are all good candidates for exclusion.
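To try out which URLs an expression like /users/{*} would catch before saving it, a small matcher can help. This is only a sketch: it assumes {*} stands for "any run of characters", which may not match 4SEO's exact rules, and the helper names and the second rule are made up for illustration.

```python
import re


def rule_to_regex(rule):
    """Compile an exclusion rule, treating {*} as 'any run of characters'.

    That wildcard meaning is an assumption for illustration; check the
    4SEO documentation for the exact matching rules.
    """
    parts = [re.escape(part) for part in rule.split("{*}")]
    return re.compile("^" + ".*".join(parts) + "$")


def is_excluded(path, rules):
    """Return True if any rule matches the whole path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


# Hypothetical rules: user profiles and sorted job listings.
rules = ["/users/{*}", "/jobs/{*}?sort={*}"]

print(is_excluded("/users/42/profile", rules))       # True
print(is_excluded("/jobs/list?sort=salary", rules))  # True
print(is_excluded("/jobs/senior-developer", rules))  # False
```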

You can also exclude pages using robots.txt (4SEO crawler complies with robots.txt rules).

This would have the added benefit of also excluding these pages from Google, which would help focus Google's attention on your actual valuable content.
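A quick way to check robots.txt rules before deploying them is Python's standard urllib.robotparser module. The paths below are hypothetical examples for a recruitment site, not rules taken from this ticket, and the sketch uses the generic "*" user agent rather than any specific crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block user profiles and application forms.
ROBOTS_TXT = """\
User-agent: *
Disallow: /users/
Disallow: /apply/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() returns False for paths that a compliant crawler must skip.
for path in ("/users/42", "/apply/123", "/jobs/senior-developer"):
    print(path, parser.can_fetch("*", path))
```

Note that urllib.robotparser only does simple prefix matching, so rules that rely on wildcards inside the path need to be checked another way.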

Best regards

Yannick Gaultier

weeblr.com / @weeblr

 

 
Tuesday, 27 February 2024 14:43 UTC
davesouthey

Hmm, okay. Our Pages between background processing was still 1, so I'm not sure why it backs up as bad as it does as we've got about 3,000 pages. It's a recruitment site so we have lots of job positions being filled and those are the bulk of the pages. I'll look into it some more and see if I can find something on the site to exclude. Thanks.

 
Tuesday, 27 February 2024 15:39 UTC
wb_weeblr

Hi

It's a recruitment site so we have lots of job positions being filled and those are the bulk of the pages

I'd suggest you always post a full, real link to the site here. I can't really check anything or make suggestions without it.

so I'm not sure why it backs up as bad as it does as we've got about 3,000 pages.

Not sure what counts as "pages", but I'm pretty sure such a site would have many more links, such as "respond to job offers", "star this offer", etc.

If all these links are public (i.e. they are shown to users who are not logged in, even if following them only leads to a log-in page), then both 4SEO and Google see them and crawl them.

In addition, that number also includes all images and external links.

Finally, any change in content, links, images, metadata, etc will cause a page to be put back on the pending analysis list.
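To illustrate how a few thousand pages can turn into a much larger number, here is a rough estimate. Every figure in it is hypothetical, it treats each URL as unique to its page (so it is an upper bound), and it does not count pages that are re-queued after a change.

```python
def estimate_tracked_urls(pages, action_links_per_page, images_per_page,
                          external_links_per_page, variations_per_page=1):
    """Rough upper bound on distinct URLs, assuming none are shared between pages."""
    page_urls = pages * variations_per_page  # e.g. sorting or filter parameters
    action_urls = pages * action_links_per_page
    image_urls = pages * images_per_page
    external_urls = pages * external_links_per_page
    return page_urls + action_urls + image_urls + external_urls


# Hypothetical: 3,000 job pages, each with 2 action links, 5 images
# and 3 external links.
print(estimate_tracked_urls(3_000, 2, 5, 3))                          # 33000
# Same site if URL parameters create 20 variations of every page.
print(estimate_tracked_urls(3_000, 2, 5, 3, variations_per_page=20))  # 90000
```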

Best regards

Yannick Gaultier

weeblr.com / @weeblr

 

 
Friday, 29 March 2024 05:34 UTC
system
This ticket has been automatically closed. All tickets which have been inactive for a long time are automatically closed. If you believe that this ticket was closed in error, please contact us.