Conversation

Notices

  1. Judeau (EatTheRich) (judeau@mas.to)'s status on Friday, 07-Nov-2025 03:45:52 JST

    There is an old website that I really want to preserve and archive a section of. Basically like how the Wayback Machine does.

    I would really like to be able to essentially browse the website offline on my computer.

    Also, if it does not increase complexity, I would like to retain all of the downloadable links and files as well.

    Is there a program or a simple way to go about this? Any help would be appreciated.

    #AskFedi #AskMastodon #archive


    Attachments

    1. sapusmidjan.is (Home)
    • Peter Krefting (nafmo@social.vivaldi.net)'s status on Friday, 07-Nov-2025 03:45:52 JST

      @Judeau If it is a simple static site, GNU Wget has a recursive download mode which can also convert links for offline browsing.
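
      A minimal sketch of such a command, assuming the section lives somewhere like https://example.com/oldsite/ (a placeholder URL, not from this thread):

          # Recursively mirror one section of a static site for offline browsing
          wget --recursive --no-parent \
               --convert-links --page-requisites --adjust-extension \
               https://example.com/oldsite/

      Here --no-parent keeps the crawl inside the starting directory, --convert-links rewrites links so they work locally, --page-requisites also fetches the images, CSS, and scripts each page needs, and --adjust-extension saves pages with an .html suffix so they open cleanly offline.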

    • Peter Krefting (nafmo@social.vivaldi.net)'s status on Friday, 07-Nov-2025 21:22:57 JST, in reply to Nazo

      @nazokiyoubinbou @Judeau There is a --span-hosts option that seems to be there to download things from other servers as well; I haven't used Wget in this mode in several years, so I don't know how well it works, though.
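
      As a sketch, --span-hosts is usually paired with --domains so the crawl only follows links to hosts you list instead of wandering across the web (both domain names below are placeholders):

          # Allow fetching from other hosts, but only the listed domains
          wget --recursive --no-parent --convert-links --page-requisites \
               --span-hosts --domains=example.com,cdn.example.net \
               https://example.com/oldsite/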

    • Nazo (nazokiyoubinbou@urusai.social)'s status on Friday, 07-Nov-2025 21:22:58 JST, in reply to Peter Krefting

      @Judeau @nafmo Will this grab external resources though?

      A lot of sites may outright block recursive activity (determining it, falsely or not depending on how you look at it, to be bot activity). You'll want to, at the very least, add --random-wait on the command line so it looks less obviously bot-like (and hits the server less hard anyway).
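
      For example, the politeness-related flags can be combined like this (the URL and the delay and rate values are placeholders; pick whatever the server tolerates):

          # Pause between requests, randomize the delay, and cap bandwidth
          wget --recursive --no-parent --convert-links --page-requisites \
               --wait=2 --random-wait --limit-rate=500k \
               https://example.com/oldsite/

      --random-wait varies the actual delay around the base --wait value, so requests arrive at less regular intervals.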

      Are there a lot of pages? One thing I'm enjoying is the "SingleFile HTML" plugin to save a page into a single .html file instead of a file + directory with broken resources. However, this would give you individual pages, not the equivalent of a functioning site.

      I think there are actually tools specifically designed for archiving sites though. Stuff more directly designed to handle external resources and all. Sadly I can't remember them...

    • Judeau (EatTheRich) (judeau@mas.to)'s status on Friday, 07-Nov-2025 21:23:11 JST, in reply to Peter Krefting

      @nafmo Thanks for the tip. I have only messed around with Wget a few times, and that was many years ago.

      I used it for grabbing files off an FTP-like site. I imagine a website might be a bit more complicated for me, but I will definitely take a look at it.

      Thanks again!

