Downloading Web Content with WGet and YouTube-DL

In this post I am going to cover the process behind using WGet and YouTube-DL to obtain media from hosted websites.

But Michael, isn’t this illegal?

Depends, did you read this? I’m simply showing you the methodology behind something – it’s your choice how to use this.

Basic Install…for Linux.

YouTube-DL and WGet are both available through the package managers on Linux, so on a Debian-based system you can simply run the following:

sudo apt-get install wget
and:
sudo apt-get install youtube-dl
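To confirm both installs worked, a quick check like the one below can help. This is only a minimal sketch: check_tools is a hypothetical helper name I made up for illustration, and it does nothing more than ask the shell whether each command is on the PATH.

```shell
#!/bin/sh
# check_tools is a hypothetical helper (not part of wget or youtube-dl).
# It reports whether each named command can be found on the PATH and
# returns non-zero if at least one of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
      missing=1
    fi
  done
  return "$missing"
}

check_tools wget youtube-dl || echo "Install whatever is missing with apt-get."
```

If either line says "missing", re-run the matching apt-get command above.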

Installing this on a Windows Client.

But for all of us unfortunate users stuck on Windows, how do we achieve this? There are two main methods, and I will demonstrate both.

Enabling Bash for Windows 10

If you’re using Windows 10, you can enable the “Linux Subsystem” for Windows. It’s a real hard process: paste the following into an administrative PowerShell console:

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

…and reboot.

Once you load bash (literally, bash.exe) you can install both programs the same way as mentioned above.

Getting the stand alone programs to run on Windows

Not on Windows 10? Cannot really blame you. So, let’s go and manually get these packages.

First, download wget, and put it in a working directory.

Same process for YouTube-DL.

“Using the awesomeness that is, these things”

It’s 12am, and I’ve had 2 hours sleep. The titles don’t matter right now.

Let’s open an administrative PowerShell, and change to the directory.

Directory: C:\bin

Mode          LastWriteTime              Length     Name
----          -------------              ------     ----
-a----        27/09/2017 12:27 AM        3481920    wget.exe
-a----        27/09/2017 12:28 AM        7803406    youtube-dl.exe

…and now you look at the help file and figure out how to use the programs yourselves…? No? Okay.

Let’s start with YouTube-DL, and their documentation. This will give you all the switches you can use in conjunction with the program.

For those who are just lazy, save the following as a PowerShell script, and execute it to perform a basic download. I did not add any error checking or improvements to this:

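# Working directory for the downloaded tools and the video output
$YP = "Enter your working directory"
# URL of the video (or playlist) to download
$Vid = "Enter URL to vid"

# Fetch wget.exe and youtube-dl.exe into the working directory
function getfiles {
 $URL1 = "https://eternallybored.org/misc/wget/current/wget.exe"
 $URL2 = "https://yt-dl.org/downloads/2017.09.24/youtube-dl.exe"
 $output = "$YP"
 Start-BitsTransfer -Source $URL1 -Destination $output
 Start-BitsTransfer -Source $URL2 -Destination $output
}

# Run youtube-dl from the working directory.
# PowerShell needs the .\ prefix to run a program from the current directory,
# and the backtick continues the command onto the next line.
function downloadmystuff {
 cd $YP
 .\youtube-dl.exe $Vid --ignore-errors --geo-bypass --yes-playlist --write-description `
  --write-all-thumbnails --console-title --print-traffic --all-formats
}

getfiles
downloadmystuff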

Basically, this will download the video referenced by your “$Vid” variable, with the following parameters:

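  • --ignore-errors: carry on when a download fails, e.g. skip unavailable videos in a playlist
  • --geo-bypass: try to bypass geographic restrictions by faking the X-Forwarded-For HTTP header
  • --yes-playlist: download the whole playlist if the URL refers to both a video and a playlist
  • --write-description: write the video description to a .description file
  • --write-all-thumbnails: write every thumbnail image format to disk
  • --console-title: show progress in the console title bar
  • --print-traffic: display the HTTP traffic sent and received
  • --all-formats: download every available format of the video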
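If you went the bash route instead of the standalone executables, the same download can be sketched as a small shell function. Treat this as a rough sketch under a few assumptions: download_video and DRY_RUN are names I made up for illustration, the example.com URL is a placeholder, and youtube-dl is assumed to be on your PATH. DRY_RUN defaults to 1 here, so the command is only printed until you set it to 0.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch: 1 prints the command, 0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

# download_video is a hypothetical wrapper around youtube-dl.
# $1 is the video (or playlist) URL; the flags are the same ones
# used in the PowerShell script.
download_video() {
  url="$1"
  set -- --ignore-errors --geo-bypass --yes-playlist --write-description \
         --write-all-thumbnails --console-title --print-traffic --all-formats
  if [ "$DRY_RUN" = "1" ]; then
    echo youtube-dl "$url" "$@"
  else
    youtube-dl "$url" "$@"
  fi
}

# Placeholder URL; swap in the video you actually want.
download_video "https://example.com/video"
```

Printing the command first is handy with --all-formats in particular, since that flag pulls down every format on offer and can take a while.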

Pretty straightforward, easy to understand. Oh, and did you know they support Instagram?


Cool. So let’s use this in conjunction with WGet. I want to download my home page:

nanky@DESKTOP-3O8E0L8:~$ cd /mnt/c/bin/ && wget www.michaelnancarrow.com

I want to download just the images from here:

nanky@DESKTOP-3O8E0L8:/mnt/c/bin$ wget -nd -E -H -k -K -p -A jpeg,png,jpg https://imgur.com/gallery/PATH

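  • -nd: no directories; save every file into the current directory
  • -E: adjust extensions, e.g. append .html to HTML pages saved without one
  • -H: span hosts, so files served from other domains are fetched too
  • -k and -K: convert links for local viewing, and keep a .orig backup of each file before converting it
  • -p: fetch page requisites such as images and stylesheets
  • -A: accept only files matching the comma-separated list (here jpeg, png and jpg)

The full details for each can be found on the manual page.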
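If you find yourself grabbing images often, the wget command above can be wrapped in a small function. Again, only a sketch under assumptions: fetch_images and DRY_RUN are hypothetical names, the example.com URL is a placeholder, and the extension list simply defaults to the one used above. With DRY_RUN left at 1 the command is printed rather than executed.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch: 1 prints the command, 0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

# fetch_images is a hypothetical wrapper around wget.
# $1 is the page URL, $2 an optional comma-separated extension list for -A.
fetch_images() {
  page_url="$1"
  extensions="${2:-jpeg,png,jpg}"
  set -- -nd -E -H -k -K -p -A "$extensions" "$page_url"
  if [ "$DRY_RUN" = "1" ]; then
    echo wget "$@"
  else
    wget "$@"
  fi
}

# Placeholder URL; the second argument is left out, so the default list applies.
fetch_images "https://example.com/gallery/"
```

Calling it as fetch_images URL gif,webp would narrow the download to those extensions instead.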

So there you go, another very basic “how to” document that could have been answered more succinctly by spending 5 minutes on Google. Literally.
