Re: downloading a complete web page without using a browser...

On Sat, 3 Jul 2021 20:25:04 -0700
users@xxxxxxxxxxxxxxxxxxxxxxx wrote:

> On 7/3/21 8:02 PM, dwoody5654@xxxxxxxxx wrote:
> > I have been using a shell script called save-page-as.sh to download a
> > complete web page. This has been working as expected.
> > The relevant line in the script is:
> > "${browser}" "-new-window" "${url}" &>/dev/null
> > 
> > I now need the ability to run this program or another program via email to
> > my computer from other locations. I do not have the option to login
> > remotely.
> > 
> > The save-page-as.sh program runs firefox. I have not been able to get this
> > to work using email. env shows DISPLAY=:0.0. I have added each of the
> > Display commands as below:
> > 
> > export DISPLAY:0
> > export DISPLAY:0.0
> > export DISPLAY:0.1
> > 
> > None of those have worked.
> > 
> > The URL I am trying to download does not have an extension, i.e. no '.htm',
> > such as:
> > https://my.acbl.org/club-results/details/338288
> > 
> > wget does not download the correct web page.
> > 
> > Appreciate any pointers to get the save-page-as.sh working using a browser
> > or a different command line program.
> > 
> > David
> 
> Hi David,
> 
> Try this
> 
> 
> $ curl https://my.acbl.org/club-results/details/338288 --output eraseme.html
> 
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  463k    0  463k    0     0   193k      0 --:--:--  0:00:02 --:--:--  193k
> 
> 
> I opened eraseme.html and the 338288 web page right
> next to each other in Firefox and they look exactly
> the same to me.
> 
There are spacing and alignment differences between the two, and apparently
other differences as well. Also, if you then run:

html2txt eraseme.html

or

html2text eraseme.html

neither displays any of the page's text content.
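
One possible explanation, not verified for this page, is that the results are filled in by JavaScript after the page loads, so the file curl saves holds mostly markup and scripts. A rough way to check is to count the characters left once script blocks, style blocks and tags are removed. The sketch below assumes GNU sed and lowercase tag names; visible_text_chars is a hypothetical helper name, not part of curl or html2text.

```shell
#!/bin/sh
# visible_text_chars FILE
# Print a rough count of the non-markup, non-whitespace characters in FILE.
# Hypothetical helper for illustration only. Assumes GNU sed (it relies on
# "\n" in the replacement text) and lowercase <script>/<style> tags.
#
# Steps:
#   1. join the file into one line
#   2. put every <script>...</script> and <style>...</style> block on a
#      line of its own
#   3. delete those lines, then strip the remaining tags
#   4. delete all whitespace and count what is left
#
# Usage: visible_text_chars eraseme.html
visible_text_chars() {
    tr '\n' ' ' < "$1" |
        sed -e 's/<script/\n<script/g' -e 's/<\/script>/<\/script>\n/g' \
            -e 's/<style/\n<style/g' -e 's/<\/style>/<\/style>\n/g' |
        sed -e '/^<script/d' -e '/^<style/d' -e 's/<[^>]*>//g' |
        tr -d ' \t\n\r' |
        wc -c
}
```

A count that is tiny compared with the 463k download would be consistent with the content being generated in the browser, in which case a tool that executes JavaScript is needed instead of curl or wget. A large count would point to some other cause.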

> I use curl almost exclusively for downloading web sites.
> wget has its issues.
> 
> HTH,
> -T
> _______________________________________________
> users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure