I need some help with 'wget'
Hi,
I am trying to download the Hindu sacred text "Rig Veda" from this website: http://www.sacred-texts.com/hin/rigveda/index.htm. The directory structure is simple; the parts of the text are stored like this:
http://www.sacred-texts.com/hin/rigveda/index.htm (main index)
http://www.sacred-texts.com/hin/rigveda/rvi01.htm (book 1)
http://www.sacred-texts.com/hin/rigveda/rvi02.htm (book 2)
It goes on for a total of 10 books.
Alas, at the top of each page there are links to other parts of the website, including the home page (http://www.sacred-texts.com/index.htm). So when I use 'wget -r -l 2 http://www.sacred-texts.com/hin/rigveda/index.htm' to get everything two levels down from http://www.sacred-texts.com/hin/rigveda/index.htm, I ALSO get two levels down from the home page and the rest of the links.
I tried 'wget -l 2 -I http://www.sacred-texts.com/hin/rigveda/*' but that did not work, and I only got a "missing URL" message.
How do I 'pump' only the pages that I need, i.e. the full Rig Veda and not the rest of the world's spirituality?
Many thanks for any pointers,
VS
relative and no-parent
Hi there
In addition to your -r -l 2 options, try --relative --no-parent.
For more info on what that means, try the man page.
Good luck,
Georg
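A rough sketch of what Georg's suggestion could look like on the command line, not tested against the site. The function name fetch_rigveda is made up for illustration; the options and the URL are the ones from this thread. As far as I can tell, the "missing URL" message in the question comes from -I (--include-directories) expecting a list of directories such as /hin/rigveda as its argument, so the URL was taken as that list and wget was left with no URL to fetch.

```shell
# Sketch only: fetch_rigveda is a hypothetical wrapper name.
# -r -l 2      recurse, at most two levels deep (as in the question)
# --no-parent  never ascend above /hin/rigveda/, so the home page is skipped
# --relative   follow relative links only
fetch_rigveda() {
    wget -r -l 2 --no-parent --relative \
        "http://www.sacred-texts.com/hin/rigveda/index.htm"
}

# Call fetch_rigveda to start the download.
```

With --no-parent the links at the top of each page that lead up to the home page should be ignored, because they point outside the starting directory.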
wget
If you use Firefox, you could get the extension "DownThemAll", which downloads all links from a page. You can uncheck the links you do not want. This is probably less hassle than writing a script to do it.
Alternatively, you could download the index pages, extract the links from them, make a list of all the URLs to be downloaded, and pass that list to wget.
I recommend my first suggestion.
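For the second suggestion, a minimal sketch of the "extract the links and pass the list to wget" idea. It assumes the index page links to the books with plain double-quoted file names such as href="rvi01.htm", which I have not checked against the real page; extract_links and urls.txt are names made up for this example.

```shell
# Base directory taken from the question.
BASE="http://www.sacred-texts.com/hin/rigveda"

# Hypothetical helper: read HTML on stdin and print absolute URLs for links
# to .htm files in the same directory (plain file names without slashes).
# Links such as ../../index.htm or http://... contain a slash and are skipped.
extract_links() {
    grep -o -i 'href="[^"/]*\.htm"' |
        sed -e 's/^[^"]*"//' -e 's/"$//' |
        sort -u |
        sed -e "s|^|$BASE/|"
}

# Usage (needs network access):
#   wget -q -O - "$BASE/index.htm" | extract_links > urls.txt
#   wget -i urls.txt
```

This only collects the pages linked directly from the index. To go one level deeper, the same helper would have to be run over each downloaded book page as well.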
Cheap trick
If gromit comes up with such a cheap trick, I will give you two too: install Gwget or KGet.
I will try them all ;-)
Thanks for the pointers guys!
Cheers,
VS
Motto: chown -R linux:GNU world
Distros: Debian, Kanotix, Frenzy, Damn Small Linux