Poster: bmuramatsu Date: Aug 7, 2011 4:05pm
Forum: web Subject: Re: robots.txt

Hi, is IA actually following the Allow directive?

I've had my robots.txt file configured as below, but IA still doesn't seem to index my site.

Thanks in advance.


User-agent: *
Disallow: /

User-agent: ia_archiver
Allow: /

User-agent: archive.org_bot
Allow: /

Reply

Poster: jonc Date: Aug 7, 2011 6:39pm
Forum: web Subject: Re: robots.txt

You'll be better off starting a new thread instead of replying to one that's two years old. The old threads get buried and won't be seen by anybody casually browsing the forums.

It takes a few months to crawl the entire Web, so it might be a while before your site is archived. You'll just have to be patient.
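
In the meantime, one way to rule out a mistake in the file itself is to run the rules through a parser locally. Below is a minimal sketch using Python's standard `urllib.robotparser`. The `is_allowed` helper, the `example.com` URL, and the `SomeOtherBot` name are placeholders for illustration. It only shows how that one parser reads the rules; it says nothing about how the Internet Archive's crawler actually interprets them.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: ia_archiver
Allow: /

User-agent: archive.org_bot
Allow: /
"""


def is_allowed(agent, url="http://example.com/page.html"):
    """Return True if `agent` may fetch `url` under ROBOTS_TXT.

    Hypothetical helper; the URL is a placeholder, not the poster's site.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "archive.org_bot", "SomeOtherBot"):
        print(agent, is_allowed(agent))
```

If the two Internet Archive agents come back allowed and everything else comes back blocked, the file says what you meant, and the remaining question is crawl timing or how the crawler treats Allow.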

Reply

Poster: Nemo_bis Date: Nov 7, 2013 1:12pm
Forum: web Subject: Re: robots.txt

Hello bmuramatsu, did you manage to whitelist the Wayback Machine with the Allow directive in the end? What's your website? Edit: see also Google's robots.txt, an example of one that is not respected as intended.
This post was modified by Nemo_bis on 2013-11-07 21:12:19