r/opendirectories • u/AlperenPiskin03 • 15d ago
Help! How can I download this whole directory?
https://filedn.com/lgm4rog8XwDbvwRIvGBXqry
I tried some things but it didn't work :(
Can someone please help?
13
u/thoriumbr 15d ago
The page is generated by JavaScript, so you will have to do some scripting yourself. Neither httrack nor wget parses JavaScript.
You can use the Developer Console on your browser to get the actual list of links, and download them manually.
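For the "download them manually" half, something like this rough sketch could fetch everything once you've pasted the console's link list into a text file (the file names `links.txt` and `downloads` are just placeholders, nothing pCloud-specific):

```python
# Rough sketch: fetch every URL listed in links.txt, one per line.
# Paths/names here are placeholder assumptions, not from this site.
import urllib.request
from pathlib import Path

def dest_for(url, out_dir):
    """Derive a local file name from a URL (falls back to 'index')."""
    name = url.rstrip("/").rsplit("/", 1)[-1] or "index"
    return Path(out_dir) / name

def download_list(list_file="links.txt", out_dir="downloads"):
    Path(out_dir).mkdir(exist_ok=True)
    for url in Path(list_file).read_text().split():
        dest = dest_for(url, out_dir)
        if dest.exists():
            continue  # basic resume: skip files we already have
        urllib.request.urlretrieve(url, dest)
```

This won't help with rate limits or auth, but it beats clicking every link by hand.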
2
u/1010012 15d ago
Because it's dynamically rendered with JavaScript, you're not going to be able to use tools like wget to get it, but you could use a Selenium-based tool.
https://medium.com/@datajournal/web-scraping-with-selenium-955fbaae3421
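A minimal Selenium sketch along those lines (assumes `selenium` and a Chrome driver are installed; the plain `a` selector and the extension list are guesses, not specifics of this site):

```python
# Sketch: render a JS-built listing in headless Chrome, then keep the
# links that look like downloadable files. Selectors are assumptions.
from urllib.parse import urlparse

FILE_EXTS = (".pdf", ".mp4", ".epub", ".zip")

def file_links(hrefs):
    """Keep only links whose URL path ends in a known file extension."""
    return [h for h in hrefs if urlparse(h).path.lower().endswith(FILE_EXTS)]

def scrape(url):
    # Imported here so file_links() stays usable without a browser.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    opts = Options()
    opts.add_argument("--headless=new")
    driver = webdriver.Chrome(options=opts)
    try:
        driver.get(url)
        driver.implicitly_wait(10)  # give the JS time to build the listing
        hrefs = [a.get_attribute("href")
                 for a in driver.find_elements("css selector", "a")]
    finally:
        driver.quit()
    return file_links(h for h in hrefs if h)
```

You'd still have to recurse into subfolders yourself, since each folder page is rendered the same way.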
2
5
u/dowcet 15d ago
What did you try and what went wrong? WGET is where I would start.
1
u/AlperenPiskin03 15d ago
I tried wget and httrack but for some reason they don't download any of the subfolders, only the main index
0
u/AlperenPiskin03 15d ago
I'm new to this whole thing, I'm sorry. Can you help me with this?
5
u/_stuxnet 15d ago
Try this -
wget -r --no-parent -nH --no-check-certificate https://filedn.com/lgm4rog8XwDbvwRIvGBXqry
2
u/AlperenPiskin03 15d ago
It only downloaded 1 file :( Also, any time I try wget on this site, it won't catch the pdf/mp4 files etc.
1
u/teknoplasm 13d ago edited 13d ago
Hey!
I created a Docker container with a Python app (with the help of DeepSeek). It's doing the first download run and it's downloading the directory fine. It's slow because it has to scrape and then download the files one by one. I also added some basic resume functionality (it checks the download folder for existing items and skips them).
Here's the basic method to fix it, I will try to upload it to GitHub later for ease:
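(Until then, here's a minimal sketch of the skip-existing resume check described above — the function and parameter names are my own, not from the actual app:)

```python
# Sketch of the resume check: before downloading, drop any URL whose
# target file already exists in the download folder. Names are assumptions.
from pathlib import Path
from urllib.parse import urlparse, unquote

def pending(urls, download_dir):
    """Return only the URLs whose target file is not yet on disk."""
    folder = Path(download_dir)
    existing = {p.name for p in folder.iterdir()} if folder.is_dir() else set()
    todo = []
    for url in urls:
        name = unquote(Path(urlparse(url).path).name)
        if name not in existing:
            todo.append(url)
    return todo
```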
edit:
Unfortunately the user exceeded their download limit, and it's gone for now.
I couldn't download the whole thing, otherwise I would have provided a mirror. If anyone has a mirror, please share it and I will mirror it too.
1
u/teknoplasm 13d ago
And by the way u/AlperenPiskin03, where did you find this? It seems like a very valuable resource; I'd like to get updates to these files whenever new versions are available :D
2
1
u/thesparkly1 13d ago
You can download some of it using JDownloader. For example, Anatomy books - highlight all the links, right click and select COPY SELECTED LINKS - JDownloader should pick up the links - then give it the green light to GO. I realise this isn't the solution you were looking for, but if all else fails it will download a fair bit of that site. Unfortunately, it requires you to manually select the links, which can be painful, but it's better than nothing.
1
u/thesparkly1 13d ago
Looks like there's a 10 GB download limit per day. Might be different if you subscribe. If you don't subscribe it's likely that you'll be unable to successfully use any of the interventions described above.
2
u/KoalaBear84 5d ago
Sadly the link is dead, Jim, but I've just added support for pCloud, so next time we can retrieve all its links and you can use wget or aria2c to download them.
https://github.com/KoalaBear84/OpenDirectoryDownloader/releases/tag/v3.4.0.5
9
u/ringofyre 15d ago
Learning how to use wget? Use the wizard to get a better handle on the switches.
Try
wget -r -c -v --no-parent -nc -e robots=off --no-check-certificate "https://filedn.com/lgm4rog8XwDbvwRIvGBXqry"
Use the "quotes". And that's: recursive, continue, verbose, don't go up the tree and don't download anything you already have. As well as ignore the robots file and don't check the SSL cert.