Hi John,
Thanks for the encouraging words regarding the standardization of this protocol.
Crawl-delay is a largely useless signal from a crawler's perspective, and most major crawlers moved to a visual setting (i.e. a slider) in tools such as Yandex.Webmaster and Google Search Console a long time ago. That's why we decided not to mention it in the draft at all. We do explicitly allow implementors to define their own custom rules, and we used the Sitemap rule as an example. We can add Crawl-delay as a second example of a custom rule if you think that would be useful.
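For illustration, a robots.txt file mixing the standard rules with the custom ones discussed here might look like the following (the domain and paths are made up):

```
User-agent: *
Disallow: /private/
Allow: /private/press/

# Custom rules: parsers that don't recognize these simply skip them.
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
```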
The Sitemap rule is very useful in general; however, it cannot be a directive the way the explicitly defined "disallow" and "allow" rules are. This is because different implementors approach sitemap ingestion very differently to protect their own systems: for example, one search engine might not ingest "anonymous" sitemap submissions (such as those coming from robots.txt) at all, or might refuse to ingest from spam-ridden domains. In short, it is merely a hint, so we decided to give a nod to sitemaps, since it is one of the most prominent rules in existing robots.txt files, and let implementors decide how they implement and use this custom rule.
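The directive-versus-hint distinction above can be sketched in code. This is a minimal, hypothetical parser (the function and variable names are my own, not from the draft): the standard allow/disallow rules are collected as binding directives for the URL matcher, while unrecognized rules such as Sitemap or Crawl-delay are recorded as hints that each implementor is free to act on or ignore.

```python
def parse_robots(text):
    """Split robots.txt lines into binding directives and custom hints."""
    directives = []  # (rule, value) pairs the crawler's matcher must honor
    hints = {}       # custom rules; each implementor decides what to do
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        rule, _, value = line.partition(":")
        rule, value = rule.strip().lower(), value.strip()
        if rule in ("user-agent", "allow", "disallow"):
            directives.append((rule, value))
        else:
            # Sitemap, Crawl-delay, etc.: record, but make no promises.
            hints.setdefault(rule, []).append(value)
    return directives, hints

directives, hints = parse_robots(
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Crawl-delay: 10\n"
    "Sitemap: https://example.com/sitemap.xml\n"
)
```

Here `directives` holds the user-agent and disallow lines, while `hints` maps `"crawl-delay"` and `"sitemap"` to their values; a real crawler would then decide per custom rule whether, for instance, the sitemap URL is trustworthy enough to ingest.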
Happy to chat more about this; however, each of these decisions was made after months of discussion and agreed on by large consumers of robots.txt files.
On Thu, May 5, 2022 at 10:35 PM John Levine <johnl@xxxxxxxxx> wrote:
It appears that The IESG <last-call@xxxxxxxx> said:
>
>The IESG has received a request from an individual submitter to consider the
>following document: - 'Robots Exclusion Protocol'
> <draft-koster-rep-08.txt> as Proposed Standard
>
>The IESG plans to make a decision in the next few weeks, and solicits final
>comments on this action. Please send substantive comments to the
>last-call@xxxxxxxx mailing lists by 2022-06-02. Exceptionally, comments may
>be sent to iesg@xxxxxxxx instead. In either case, please retain the beginning
>of the Subject line to allow automated sorting.
I'm glad to see that the authors agreed to move this to standards
track where it belongs, and I encourage you to publish it.
Nits: it would be nice if it explicitly described the Crawl-delay: and
Sitemap: lines which are widely implemented even if not by the
authors' employer.
R's,
John
--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call