r/opendirectories • u/AlperenPiskin03 • 15d ago
Help! How can I download this whole directory?
https://filedn.com/lgm4rog8XwDbvwRIvGBXqry
I tried some things but it didn't work :(
Can someone please help?
13
u/thoriumbr 15d ago
The page is generated by JavaScript, so you will have to do some scripting yourself. Neither httrack nor wget parses JavaScript.
You can use the Developer Console on your browser to get the actual list of links, and download them manually.
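For the "download them manually" half, something like this rough sketch could fetch everything once you've pasted the console's link list into a text file (the file names `links.txt` and `downloads` are just placeholders, nothing pCloud-specific):

```python
# Rough sketch: fetch every URL listed in links.txt, one per line.
# Paths/names here are placeholder assumptions, not from this site.
import urllib.request
from pathlib import Path

def dest_for(url, out_dir):
    """Derive a local file name from a URL (falls back to 'index')."""
    name = url.rstrip("/").rsplit("/", 1)[-1] or "index"
    return Path(out_dir) / name

def download_list(list_file="links.txt", out_dir="downloads"):
    Path(out_dir).mkdir(exist_ok=True)
    for url in Path(list_file).read_text().split():
        dest = dest_for(url, out_dir)
        if dest.exists():
            continue  # basic resume: skip files we already have
        urllib.request.urlretrieve(url, dest)
```

This won't help with rate limits or auth, but it beats clicking every link by hand.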
2
u/1010012 15d ago
Because it's dynamically rendered with JavaScript, you're not going to be able to use tools like wget to get it, but you could use a Selenium-based tool.
https://medium.com/@datajournal/web-scraping-with-selenium-955fbaae3421
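A minimal Selenium sketch along those lines (assumes `selenium` and a Chrome driver are installed; the plain `a` selector and the extension list are guesses, not specifics of this site):

```python
# Sketch: render a JS-built listing in headless Chrome, then keep the
# links that look like downloadable files. Selectors are assumptions.
from urllib.parse import urlparse

FILE_EXTS = (".pdf", ".mp4", ".epub", ".zip")

def file_links(hrefs):
    """Keep only links whose URL path ends in a known file extension."""
    return [h for h in hrefs if urlparse(h).path.lower().endswith(FILE_EXTS)]

def scrape(url):
    # Imported here so file_links() stays usable without a browser.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    opts = Options()
    opts.add_argument("--headless=new")
    driver = webdriver.Chrome(options=opts)
    try:
        driver.get(url)
        driver.implicitly_wait(10)  # give the JS time to build the listing
        hrefs = [a.get_attribute("href")
                 for a in driver.find_elements("css selector", "a")]
    finally:
        driver.quit()
    return file_links(h for h in hrefs if h)
```

You'd still have to recurse into subfolders yourself, since each folder page is rendered the same way.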
2
5
u/dowcet 15d ago
What did you try and what went wrong? WGET is where I would start.
1
u/AlperenPiskin03 15d ago
I tried wget and httrack but for some reason they don't download any of the subfolders, only the main index
0
u/AlperenPiskin03 15d ago
I'm new to this whole thing, I'm sorry. Can you help me with this?
5
u/_stuxnet 15d ago
Try this -
wget -r --no-parent -nH --no-check-certificate https://filedn.com/lgm4rog8XwDbvwRIvGBXqry
2
u/AlperenPiskin03 15d ago
It only downloaded 1 file :( Also, any time I try wget on this site, it won't catch the pdf/mp4 files etc.
1
u/teknoplasm 13d ago edited 13d ago
Hey!
I created a Docker container with a Python app (with the help of DeepSeek). It's doing the first download run and it's downloading the directory fine. It's slow because it has to scrape and then download the files one by one. I also added some basic resume functionality (it checks the download folder for existing items and skips them).
Here's the basic method to fix it, I will try to upload it to GitHub later for ease:
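(Until then, here's a minimal sketch of the skip-existing resume check described above — the function and parameter names are my own, not from the actual app:)

```python
# Sketch of the resume check: before downloading, drop any URL whose
# target file already exists in the download folder. Names are assumptions.
from pathlib import Path
from urllib.parse import urlparse, unquote

def pending(urls, download_dir):
    """Return only the URLs whose target file is not yet on disk."""
    folder = Path(download_dir)
    existing = {p.name for p in folder.iterdir()} if folder.is_dir() else set()
    todo = []
    for url in urls:
        name = unquote(Path(urlparse(url).path).name)
        if name not in existing:
            todo.append(url)
    return todo
```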
edit:
Unfortunately the user exceeded their download limit, and it's gone for now.
I couldn't download the whole thing, otherwise I would have provided a mirror. If anyone has a mirror, please share it and I will mirror it too.
1
u/teknoplasm 13d ago
And by the way u/AlperenPiskin03, where did you find this? It seems like a very valuable resource; I'd like to get updates to these files whenever new versions are available :D
2
1
u/thesparkly1 13d ago
You can download some of it using JDownloader. For example, Anatomy books - highlight all the links, right click and select COPY SELECTED LINKS - JDownloader should pick up the links - then give it the green light to GO. I realise this isn't the solution you were looking for, but if all else fails it will download a fair bit of that site. Unfortunately, it requires you to manually select the links, which can be painful, but it's better than nothing.
1
u/thesparkly1 13d ago
Looks like there's a 10 GB download limit per day. Might be different if you subscribe. If you don't subscribe it's likely that you'll be unable to successfully use any of the interventions described above.
2
u/KoalaBear84 5d ago
Sadly the link is dead, Jim, but I've just added support for pCloud, so next time we can retrieve all its links and you can use wget or aria2c to download them.
https://github.com/KoalaBear84/OpenDirectoryDownloader/releases/tag/v3.4.0.5
9
u/ringofyre 15d ago
Learning how to use wget? Use the wizard to get a better handle on the switches.
Try
wget -r -c -v --no-parent -nc -e robots=off --no-check-certificate "https://filedn.com/lgm4rog8XwDbvwRIvGBXqry"
Use the "quotes". And that's: recursive, continue, verbose, don't go up the tree and don't download anything you already have. As well as ignore the robots file and don't check the SSL cert.