Proxy users and Google bot blocked (hidden SEO damage)

verkoop Member Posts: 1
The LightSpeed server has an issue with handling proxy requests and Google bot requests.

While in the Netherlands few people use a proxy server and most would use a VPN, in some other countries (Eastern Europe) proxies are used by many people. Blocking those users has an impact on revenue.

LightSpeed servers have been blocking these users with a plain text "Banned" message.

From an SEO perspective, at least an SEO-friendly HTTP 503 - Service Temporarily Unavailable response would be required. A plain text "Banned" message served to Google bot has a direct effect on Google rankings.
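To illustrate the difference, here is a minimal Python sketch of a throttling response that stays crawler-friendly. The function name `throttle_response` and the 120-second default are illustrative assumptions, not anything LightSpeed exposes. `Retry-After` is a standard HTTP header that tells the client how long to wait before trying again.

```python
def throttle_response(retry_after_seconds=120):
    """Build a (status, headers, body) triple for a temporarily blocked client.

    Hypothetical helper: the 120-second default is an arbitrary example value.
    """
    status = "503 Service Unavailable"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        # Standard HTTP header: seconds the client should wait before retrying.
        ("Retry-After", str(int(retry_after_seconds))),
    ]
    body = b"Service temporarily unavailable, please retry later.\n"
    return status, headers, body
```

A crawler that receives this response can treat the block as temporary and come back later, instead of seeing an unexplained "Banned" page.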

As an example, Google published the following a week ago, showing the importance of giving Google bot good quality access.

Google: The Faster We Can Crawl, The More We Can Crawl

https://www.seroundtable.com/google-fast-crawl-25737.html

When we contacted support in Amsterdam it became clear that Google bot is also blocked by the LightSpeed server. The official advice that we received was to reduce the Google bot crawl rate in Google Webmaster Tools. From an SEO perspective this is absurd.

Translated:

Dear Barry,

We just talked on the phone. The problem is most likely caused by a crawl delay on our server, which causes Google to be unable to crawl as much as Google may want to. This will cause a temporary block of the Google bot.

Regrettably, we cannot change the crawl delay on our server. You may try to reduce the crawl rate at Google so that the ban can be prevented.


At www.verkoop.com we invested a lot of money to optimize the website to meet the new quality standards set by Google. The website is optimized to achieve Google Lighthouse 100 performance scores and validation as a Progressive Web App (Service Worker based optimization).



A Content Delivery Network (CDN) such as Amazon CloudFront or Google Cloud CDN is essentially a proxy server and sends the same proxy headers with each request. Because of the blocking issue, it is therefore not possible to use a CDN in front of a LightSpeed shop.
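As a rough illustration of what "proxy headers" means here, the Python sketch below checks a request for the headers that forward proxies and CDNs typically add. Which of these headers the LightSpeed server actually inspects is not known to us. The header list, the function name `proxy_headers_present`, and the sample values are assumptions for the example.

```python
# Headers commonly added by forward proxies and CDNs. Which of these the
# LightSpeed server actually inspects is not stated anywhere in this thread.
PROXY_HEADERS = ("via", "forwarded", "x-forwarded-for")


def proxy_headers_present(request_headers):
    """Return the sorted proxy-related header names found in a request."""
    names = {name.lower() for name in request_headers}
    return sorted(h for h in PROXY_HEADERS if h in names)


# Hypothetical requests: one direct, one relayed through a CDN or proxy.
direct = {"Host": "shop.example", "User-Agent": "Mozilla/5.0"}
via_cdn = {**direct, "Via": "1.1 example-cdn", "X-Forwarded-For": "203.0.113.7"}

print(proxy_headers_present(direct))   # []
print(proxy_headers_present(via_cdn))  # ['via', 'x-forwarded-for']
```

A request relayed by a CDN and one relayed by a shopper's proxy look alike at this level, which is why both can be affected by the same block.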

www.verkoop.com is hurt by the occasional IP block. The shopping cart becomes blocked, hindering sales.

We converted the IP block to a 503 response at CDN level, but the 503 errors have been piling up in Google Webmaster Tools, which effectively undoes the advantage of Google Cloud CDN for the site's long-term performance and quality reputation.

We hereby want to request support via the forum to find a solution for the issue.

While we require a solution for cutting-edge optimization and SEO technologies, many proxy users and (on larger shops) Google bot are being blocked without the shop owner knowing about it, which causes hidden damage to SEO and revenue.

The best solution would be for LightSpeed to support the proxy standard so that shoppers who use a proxy are not blocked. Second, it is essential that Google bot is never blocked; ideally, it would even be made easier for it to crawl pages.

For more information we can be reached at [email protected]

The optimization company can be reached at [email protected]


1 comment

  • JaivyDaam Lightspeed Staff Posts: 29
    Hi @verkoop,

    I've been researching your claims, and some of your statements don't match my own experience.

    I've downloaded and installed TunnelBear, a VPN provider for macOS. I connected to random VPNs and could visit every Lightspeed customer shop that I tried; yours was also reachable.
    Are there specific VPN providers that you are talking about? It could help in my investigation.

    About the Google Bot:
    Since we are a SaaS platform, we are unable to raise the crawl-delay, as it would overload our platform and could potentially harm the visitors of our customers. This is probably why Google bot is shown "Banned". The specific reason for the ban can be found in the HTTP headers sent with the message: there is an X-Defender header that contains a short value such as "xss_injection" or "req_per_ip", among others.
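For anyone who wants to check this on their own shop, here is a minimal Python sketch that reads the X-Defender header from a response. Only the header name and the example values come from this reply. The function names `ban_reason` and `fetch_ban_reason` are made up for illustration, and the status code that accompanies the "Banned" page is not assumed.

```python
import urllib.error
import urllib.request


def ban_reason(headers):
    """Return the X-Defender value from a header mapping, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "x-defender":
            return value.strip()
    return None


def fetch_ban_reason(url):
    """Request a URL and return (HTTP status, X-Defender value or None)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status, ban_reason(response.headers)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the headers are still available.
        return error.code, ban_reason(error.headers)
```

Calling `fetch_ban_reason` with a shop URL from an affected IP should then show whether the ban was triggered by "req_per_ip" or by something else.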

    Looking at your tickets, I would guess it contains "req_per_ip", as the Google bot would crawl too fast and our countermeasures would ban the bot for the sake of stability and response time. You could also see this as a small DDoS. The Crawl-Delay in our robots.txt is there to prevent such a ban from popping up, as that would hurt more than a simple crawl-delay. Under no circumstances shall we raise the crawl-delay or whitelist any IPs that are not under our control. We attach great importance to guaranteeing the stability of our service, hence the several countermeasures.
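As a small sketch of how a Crawl-delay limits crawling, the Python snippet below parses a robots.txt with the standard library and works out the resulting ceiling on fetches per day. The robots.txt content and the 5-second value are assumptions for the example, not the actual values on the platform, and whether a given crawler honors the directive is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the 5-second delay is an assumed value.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /checkout/
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to a user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def max_fetches_per_day(delay_seconds):
    """Upper bound on sequential fetches per day for a given delay."""
    return 86400 // delay_seconds


delay = crawl_delay_for(ROBOTS_TXT, "Googlebot")
print(delay, max_fetches_per_day(delay))  # 5 17280
```

With these assumed numbers, a crawler that respects the delay tops out at 17,280 pages per day, which matters for shops with large catalogues.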

    At the moment we are working on the CDN side of things. If you upload new images, you will see that the URL of the images has changed from static.webshopapp.com to cdn.webshopapp.com. Doesn't that solve your CDN issue? If any of the images still carry static.webshopapp.com, please re-upload them to have them pushed to our CDN.

    I know this reply doesn't answer your question; however, there is no answer. The only thing that can be done is to respect the crawl-delay and let Google crawl the pages within that time-frame, so that Google never runs into a "Banned" message.

    I will look in our Idea board to see if we can separate the Google IPs so that Google bot receives a 503 status whenever it hits the limit while crawling. Still, I cannot give a timeframe for when, or if ever, we will release such a feature.

    If you have any other questions, please let us know!

