On 2021-07-05 10:30 p.m., Thomas Stephen Lee wrote:
On Mon, Jul 5, 2021 at 12:26 PM Samuel Sieb <samuel@xxxxxxxx> wrote:
On 2021-07-03 8:02 p.m., dwoody5654@xxxxxxxxx wrote:
The URL I am trying to download does not have an extension, i.e. no '.htm', such
as:
https://my.acbl.org/club-results/details/338288
wget does not download the correct web page.
I tried it and it worked, sort of. The problem is that you want to
download everything to view it offline, but the site my.acbl.org has a
robots.txt that says "no robots allowed". So wget respects that and
will not download any required files from that site other than the
initial page. curl probably has the same issue.
for wget
https://gist.github.com/u0d7i/87aa962311f2a7c739aa
Ok, that solves it. I was able to download everything, and opening the
resulting file in Firefox didn't trigger any network access. I was able to
see the entire page and even interact with it somewhat.
wget -e robots=off -EHkp https://my.acbl.org/club-results/details/338288
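
For anyone reading this in the archive, here is the same command spelled out with long options, as a minimal sketch. The function name mirror_page and the DRY_RUN variable are hypothetical names used only for illustration and are not part of wget. The options are the long forms of -E, -H, -k and -p from the command above.

```shell
# Sketch: long-option equivalent of "wget -e robots=off -EHkp URL".
# mirror_page and DRY_RUN are illustrative names, not wget features.
mirror_page() {
    url=$1
    set -- -e robots=off               # run the wgetrc command "robots=off": ignore robots.txt
    set -- "$@" --adjust-extension     # -E: add a suffix such as .html to extension-less pages
    set -- "$@" --span-hosts           # -H: allow fetching files hosted on other domains
    set -- "$@" --convert-links        # -k: rewrite links so they point at the local copies
    set -- "$@" --page-requisites      # -p: also fetch images, CSS and scripts the page needs

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of touching the network.
        echo "wget $* $url"
    else
        wget "$@" "$url"
    fi
}
```

Running `DRY_RUN=1 mirror_page https://my.acbl.org/club-results/details/338288` prints the full command without downloading anything. Without DRY_RUN it performs the download, and wget will typically place the files in directories named after each host it fetched from.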
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure