Alex W.'s Blog

Linux Terminal Batch Download One Liner

More than once I’ve come across a situation where I’ve needed to download a set of files from a website, all on the same page, and all with a similar HTML markup pattern. Additionally, I’ve come to fall in love with the Linux Terminal and all of its easy-to-use commands. So, here’s a mashup of the two!

I wanted to download all the (free, legal) mp3s from www.itsmetrognome.com/music/. Fortunately, the download links all followed the same general style:

<a class='btn_left  has-counter' rel='36' title='iPhone (MetroGnome Remix)' href='http://www.itsmetrognome.com/?wpdmact=process&did=MzYuaG90bGluaw==' >Download</a>

Therefore it was relatively easy to come up with a regex pattern that would match them:

href='(.+?)'.*?>Download

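This had to be adapted slightly for the final version to use a lookbehind and a lookahead, because grep cannot print individual regex groups (as far as I am aware); with -o it prints only the whole match, so everything other than the URL has to sit outside the match. Additionally, xargs was used to pass the grepped URLs to wget. The --content-disposition parameter tells wget to use the filename the server supplies in its Content-Disposition header, so the downloaded files get their normal, human-readable names.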
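To see the lookaround trick in isolation, here is a minimal sketch that runs the same pattern against a single made-up line. The sample link and the extract_url helper name are hypothetical, and it assumes GNU grep built with PCRE support (the -P flag):

```shell
# Hypothetical sample line, single-quoted attributes as the pattern expects.
sample="<a class='btn_left' href='http://example.com/?did=abc' >Download</a>"

# The lookbehind (?<=href=') and lookahead (?='.*?>Download) must match,
# but they are not part of the text that -o prints.
extract_url() {
  grep -Po "(?<=href=')(.+?)(?='.*?>Download)"
}

# Prints only the URL, without href=' or the trailing quote.
printf '%s\n' "$sample" | extract_url
```

Because grep works line by line, this only matches when the href and the Download text are on the same line.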

Here’s the final one-line command that will download all of MetroGnome’s music for you!

curl -s www.itsmetrognome.com/music/ | grep -Po "(?<=href=')(.+?)(?='.*?>Download)" | xargs -n 1 -P 16 wget --content-disposition
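Before downloading for real, it can help to preview what xargs would actually run. The following is a rough sketch of a dry run, not part of the original command: the URLs and the dry_run helper name are made up for illustration, and echo stands in for the download so nothing touches the network.

```shell
# Hypothetical URLs standing in for the grep output.
urls='http://example.com/a
http://example.com/b
http://example.com/c'

# Dry run: echo prints each wget command line instead of executing it.
# -n 1 hands each invocation a single URL; -P 4 allows up to four at once.
dry_run() {
  xargs -n 1 -P 4 echo wget --content-disposition
}

# One line per URL; the order may vary because the jobs run in parallel.
printf '%s\n' "$urls" | dry_run
```

Without -n 1, xargs packs as many URLs as fit into one command line, so a short list ends up in a single wget process and -P has nothing to parallelize.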