For your store to be discoverable by search engines such as Google or Bing, their crawlers need access to your webpages. However, there are some webpages you don't want to show up in search results, such as login pages, the search results page, and the cart and checkout pages. The robots.txt file is a tool that discourages search engine crawlers (robots) from crawling these pages.

 

Requirements

  • To manage the robots.txt file, the Manage Settings permission must be enabled on your user account.
  • To manage the robots.txt file individually for a Multi-Storefront store, the Manage Channels and Edit Channels permissions must be enabled on your user account.
 

Default Search Engine Robots File

To view or edit the robots.txt file, go to Settings › Website and scroll down to the Search Engine Robots section. If you are using Multi-Storefront, you can use storefront-specific settings to manage each storefront’s robots.txt file separately.

Below is the default HTTPS robots.txt file. If you need to revert to the original file for any reason, you can copy it from here.

User-agent: *
Disallow: /account.php
Disallow: /cart.php
Disallow: /checkout.php
Disallow: /checkout
Disallow: /finishorder.php
Disallow: /login.php
Disallow: /orderstatus.php
Disallow: /postreview.php
Disallow: /productimage.php
Disallow: /productupdates.php
Disallow: /remote.php
Disallow: /search.php
Disallow: /viewfile.php
Disallow: /wishlist.php
Disallow: /admin/
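
As a rough illustration of how a crawler might interpret the default file, the sketch below loads the rules with Python's standard `urllib.robotparser` module and checks a few paths. The `ExampleBot` user agent and the `/products/blue-shirt/` path are hypothetical, chosen only for this example, and real search engines may apply their own matching logic.

```python
from urllib.robotparser import RobotFileParser

# The default rules listed above, copied verbatim.
DEFAULT_ROBOTS = """\
User-agent: *
Disallow: /account.php
Disallow: /cart.php
Disallow: /checkout.php
Disallow: /checkout
Disallow: /finishorder.php
Disallow: /login.php
Disallow: /orderstatus.php
Disallow: /postreview.php
Disallow: /productimage.php
Disallow: /productupdates.php
Disallow: /remote.php
Disallow: /search.php
Disallow: /viewfile.php
Disallow: /wishlist.php
Disallow: /admin/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt content from a string (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules allow the (hypothetical) agent to fetch the path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    default_parser = build_parser(DEFAULT_ROBOTS)
    # "/products/blue-shirt/" is a made-up storefront path.
    for path in ("/cart.php", "/admin/settings", "/products/blue-shirt/"):
        print(path, "crawlable:", is_crawlable(default_parser, path))
```

Rules in this parser match by path prefix, so `Disallow: /admin/` also covers anything beneath `/admin/`.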
 

Editing the Robots File

You can edit your robots file to change which webpages are crawled. However, we strongly recommend against this unless you are familiar with robots.txt files and understand the potential impact on SEO.

 

Be careful! We do not recommend changing your robots.txt file unless you are familiar and comfortable with how it works. Changes directly affect which pages search engines crawl, so a mistake can harm your store's SEO.

Before making any changes, check the list of Files Disallowed by Default.

If you want to request that search engine robots not crawl a particular page or subdirectory, add "Disallow:" followed by the URL path of that page or subdirectory. For example:

  • Disallow: /shipping-return-policy.html
  • Disallow: /content/
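
To show the effect of the two example rules, here is a minimal sketch that appends them to a robots file and checks the result with Python's standard `urllib.robotparser`. The `add_disallow_rules` helper, the shortened `BASE_RULES` text, and the test paths are assumptions made for illustration; the sketch also assumes the file ends with the `User-agent: *` group, so appended lines join that group.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for an existing robots file (assumption for this example).
BASE_RULES = "User-agent: *\nDisallow: /cart.php\n"


def add_disallow_rules(robots_text: str, paths: list[str]) -> str:
    """Append one Disallow line per path, skipping rules already present."""
    lines = robots_text.rstrip("\n").splitlines()
    for path in paths:
        if not path.startswith("/"):
            # Rules are written as paths relative to the store's domain.
            raise ValueError(f"path must start with '/': {path!r}")
        rule = f"Disallow: {path}"
        if rule not in lines:
            lines.append(rule)
    return "\n".join(lines) + "\n"


def can_crawl(robots_text: str, path: str) -> bool:
    """Check a path against the given robots text for any crawler ('*')."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch("*", path)


updated = add_disallow_rules(
    BASE_RULES, ["/shipping-return-policy.html", "/content/"]
)
```

With these rules, `/content/` covers every page inside that subdirectory, while a similarly named path such as `/contact-us/` is unaffected.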

After saving your changes, it may take several days or weeks for search engines to recrawl and reindex your site. You can also resubmit your sitemap, though this does not guarantee a faster site crawl.

 

Files Disallowed by Default

User-agent: * — The wildcard (*) value means that the Disallow rules that follow apply to all crawlers.

Disallow: /account.php — This prevents the crawling of the storefront account pages that customers access when logged into their account. Crawlers do not have accounts and have no access to these pages.

Disallow: /cart.php — This prevents the crawling of the cart page. Since this page depends on the visitor adding items, it has no value as a search result.

Disallow: /checkout.php — This prevents the crawling of the checkout page. Like the cart page, this page is dependent on user input and would be of no value as a search result. Additionally, the checkout page could contain sensitive data such as name, email, addresses, and credit card info. By preventing this page from appearing in search engines, BigCommerce protects the personal data of consumers purchasing from any store and maintains PCI compliance.

Disallow: /checkout — Similar to checkout.php, this also prevents crawling of the checkout page.

Disallow: /finishorder.php — Finishorder.php typically contains a lot of personal data. By preventing search engines from crawling this page, BigCommerce protects consumer data and maintains PCI compliance.

Disallow: /login.php — This prevents the crawling of the store customer login and registration pages. Since these pages have very little content and no value to new visitors of the store, they are blocked from search engines.

Disallow: /orderstatus.php — The order status page requires a user to log in before being able to see the content of this page. Since search engines do not have store accounts and cannot input data into text fields, this page is blocked.

Disallow: /postreview.php — Since crawlers cannot input data into text fields, this page is blocked.

Disallow: /productimage.php — Productimage.php is used to create a jQuery lightbox window on product pages, which typically opens when a user clicks on a product image. The pop-up window is not a dedicated page with its own URL and duplicates some text on the product page, so it is blocked to prevent duplicate content, missing title tag and description warnings in Search Console, and thin content penalties.

Disallow: /productupdates.php — No longer used.

Disallow: /remote.php — This is used for the store's AJAX calls and does not produce a page that is usable by a human.

Disallow: /search.php — This page handles searches from the search box on a store. Google has previously stated that search results pages are not something they want in their index. Going from one search results page to another, instead of directly to the result, creates a poor user experience.

Disallow: /viewfile.php — This is used to attach files to an order. This typically happens with digital transactions such as digital downloads and PDFs. Crawlers do not have store accounts with eligible orders to download content.

Disallow: /wishlist.php — Wishlist.php is user-dependent and would provide little to no value to searchers. Additionally, depending on how many products a user adds to a wishlist, the pages could be considered thin or duplicate content.

Disallow: /admin/ — The store login path is blocked for security reasons. Making the login page hard to find reduces the likelihood of a direct attack from hackers. Additionally, this page would be of no value to a searcher.
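
If you have edited your robots file, one way to confirm that it still carries every default rule described above is to compare its Disallow lines against the default list. The sketch below is a simplified check: the `missing_default_rules` helper is hypothetical, and it ignores User-agent groups, so it only reports whether each default path appears somewhere in the file.

```python
# Paths disallowed by the default robots file, in the order listed above.
DEFAULT_DISALLOWED = [
    "/account.php", "/cart.php", "/checkout.php", "/checkout",
    "/finishorder.php", "/login.php", "/orderstatus.php",
    "/postreview.php", "/productimage.php", "/productupdates.php",
    "/remote.php", "/search.php", "/viewfile.php", "/wishlist.php",
    "/admin/",
]


def disallowed_paths(robots_text: str) -> set[str]:
    """Collect the path of every non-empty Disallow line in the text."""
    paths = set()
    for raw_line in robots_text.splitlines():
        # Drop trailing comments, then split "Field: value".
        line = raw_line.split("#", 1)[0].strip()
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "disallow" and value.strip():
            paths.add(value.strip())
    return paths


def missing_default_rules(robots_text: str) -> list[str]:
    """Return the default paths that no Disallow line in the text covers exactly."""
    present = disallowed_paths(robots_text)
    return [path for path in DEFAULT_DISALLOWED if path not in present]
```

An empty list means every default path is still present; any returned path is a rule that was removed or altered and can be restored from the default file.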

 

FAQ

Will my trial store be indexed by search engines?

All BigCommerce trials are set to private when they are created. This keeps search engines from indexing your store while it is still under development and prevents the public from browsing your store until it's ready for launch.

Google Search Console is showing the warning "Indexed, though blocked by robots.txt" for some of my URLs. What should I do?

By default, the robots.txt file blocks URLs that pertain to customer checkout and accounts. These should be blocked for security reasons. If you have not altered the robots.txt file, you can disregard the warnings. They are only meant to notify you that some URLs are blocked so you can confirm that the block is intentional.

Why are my disallowed webpages appearing in search results?

While adding a webpage URL to your robots.txt file can prevent search engines from crawling the page, it may still be indexed if it is linked from other places on the web.

To prevent most search engine crawlers from indexing pages in your store, you can customize the headers of those pages to include a noindex meta tag. Keep in mind that this can critically impact your store's SEO if done incorrectly, so you may want to reach out to a Partner for help in making this change.
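
To verify that a page actually carries a noindex directive after such a change, a minimal sketch using Python's standard `html.parser` module is shown below. The `has_noindex` helper and the sample markup are assumptions for illustration; the check only looks at `<meta name="robots">` tags in HTML you supply and does not fetch pages or inspect HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noindex, follow".
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return "noindex" in parser.directives


# Hypothetical page markup used only to exercise the check.
sample_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
page_is_noindexed = has_noindex(sample_page)
```

Keep in mind that a crawler has to be able to fetch a page to see its noindex tag, so a page that is also disallowed in robots.txt may never have the tag read.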