Downloading images with nodejs

Crysfel Villa Programming

A couple of weeks ago Facebook sent me an email informing me that profile images will no longer be accessible unless every request includes a token.

I no longer support Facebook login, but many of my users still use their Facebook profile image, so I decided to copy all those images to my own server before Facebook enforces the new requirement.

After a few hours of googling how to download images/binaries in Node.js, and only finding old/deprecated implementations all over the place, I stumbled upon the got library!

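const stream = require('stream');
const { promisify } = require('util');
const fs = require('fs');
// Note: got v12 and later are ESM-only; with those versions
// use `import got from 'got'` instead of require().
const got = require('got');

// Promise-based version of stream.pipeline so it can be awaited
const pipeline = promisify(stream.pipeline);

// Streams the response body from `url` straight into the file `name`
async function downloadImage(url, name) {
  await pipeline(
    got.stream(url),
    fs.createWriteStream(name)
  );
}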

Before this, I tried http.request, but for some reason the file was created with 0 KB of data. I also tried the request library, but it's already deprecated.

The got library supports streams, which lets us pipe the response directly into the file system. I needed this for my use case because I had to download approximately 25K images.
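For comparison, the same streaming idea can be sketched without a third-party library. This assumes Node 18 or newer, where `fetch` is built in and `Readable.fromWeb` is available; `downloadWithFetch` is a hypothetical name, not part of the script used in this post. Built-in `fetch` follows redirects by default, while a bare `http.request` does not, which is one possible (unconfirmed) explanation for an empty file.

```javascript
const fs = require('fs');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical helper: streams the response body straight to disk.
// Assumes Node 18+ (global fetch, Readable.fromWeb).
async function downloadWithFetch(url, name) {
  const response = await fetch(url);
  if (!response.ok || !response.body) {
    throw new Error(`Unexpected response ${response.status} for ${url}`);
  }
  // pipeline rejects if either side fails, so errors are not silently lost.
  await pipeline(Readable.fromWeb(response.body), fs.createWriteStream(name));
}
```

It would be called the same way as `downloadImage`, with a URL and a target file name.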

Here's how to use it:

(async () => {
  await downloadImage('https://example.com/test.jpg', 'test.jpg');
})();

I decided to download 10 images at a time to avoid getting banned by Facebook.

// `users` holds the current batch (10 users at a time in my case)
const downloads = [];
for (let i = 0; i < users.length; i++) {
  const user = users[i];
  downloads.push(
    downloadImage(user.image, `${user.id}.jpg`)
  );
}

await Promise.all(downloads);

By using Promise.all, I was able to trigger 10 requests at the same time and wait until all of them completed.
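The snippet above does not show how the full list is split into groups of 10, and `Promise.all` rejects as soon as any single download fails. A minimal sketch of one way to handle both, under stated assumptions: `chunk`, `downloadInBatches`, and `allUsers` are hypothetical names, and `Promise.allSettled` is swapped in for `Promise.all` so that one broken URL does not abort the whole run.

```javascript
// Hypothetical helper: splits an array into groups of at most `size` items.
function chunk(items, size) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Runs `download(user)` for every user, `batchSize` at a time.
// Returns the users whose download failed, with the error for each.
async function downloadInBatches(allUsers, batchSize, download) {
  const failures = [];
  for (const batch of chunk(allUsers, batchSize)) {
    // allSettled waits for every promise in the batch, failed or not.
    const results = await Promise.allSettled(batch.map((user) => download(user)));
    results.forEach((result, index) => {
      if (result.status === 'rejected') {
        failures.push({ user: batch[index], reason: result.reason });
      }
    });
  }
  return failures;
}
```

With the function from this post, the `download` argument could be something like `(user) => downloadImage(user.image, `${user.id}.jpg`)`, and the returned list can be retried or logged afterwards.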

I also added a random wait before requesting the next 10 images to avoid any possible IP ban.

// Wait between 1 and 2 seconds before the next batch
await sleep(1000 + Math.floor(Math.random() * 1000));

function sleep(ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, ms);
  });
}
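To show where the pause sits relative to the batches, here is a rough sketch. `jitteredDelay` and `forEachBatchWithPause` are hypothetical names, and skipping the pause after the last batch is a design choice of this sketch, not something the original script is known to do.

```javascript
// Base delay plus a random extra in [0, jitterMs). The `random` parameter
// exists only so the function can be tested with a fixed value.
function jitteredDelay(baseMs, jitterMs, random = Math.random) {
  return baseMs + Math.floor(random() * jitterMs);
}

function sleep(ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, ms);
  });
}

// Processes batches one after another, pausing between them
// (but not after the last one).
async function forEachBatchWithPause(
  batches,
  handleBatch,
  pause = () => sleep(jitteredDelay(1000, 1000))
) {
  for (let i = 0; i < batches.length; i++) {
    await handleBatch(batches[i], i);
    if (i < batches.length - 1) {
      await pause();
    }
  }
}
```

Randomizing the delay makes the request pattern less regular than a fixed one-second interval, which is the point of adding `Math.random()` to the wait.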

I honestly thought I would be banned at 5,000 or maybe 10K requests, but to my surprise, after 1 hour and 40 minutes the script had successfully downloaded 25K images to my server.

So there you have it! Downloading binaries using Node.js!
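As a back-of-the-envelope check of those numbers, here is a small sketch. It assumes exactly 25,000 images, batches of 10, a total run of 100 minutes, and an average pause of 1.5 seconds (the midpoint of the 1 to 2 second random wait); `estimateRun` is a hypothetical helper.

```javascript
// Rough estimate of how the total run time splits between
// waiting and actually downloading. All inputs are assumptions.
function estimateRun({ totalImages, batchSize, totalMinutes, averagePauseSeconds }) {
  const batches = Math.ceil(totalImages / batchSize);
  const secondsPerBatch = (totalMinutes * 60) / batches;
  const pauseMinutes = (batches * averagePauseSeconds) / 60;
  const downloadMinutes = totalMinutes - pauseMinutes;
  return { batches, secondsPerBatch, pauseMinutes, downloadMinutes };
}

const estimate = estimateRun({
  totalImages: 25000,
  batchSize: 10,
  totalMinutes: 100,
  averagePauseSeconds: 1.5,
});

console.log(estimate);
// { batches: 2500, secondsPerBatch: 2.4, pauseMinutes: 62.5, downloadMinutes: 37.5 }
```

Under these assumptions, roughly 62 of the 100 minutes went to deliberate waiting, and each batch of 10 images took a bit under a second to download.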

Happy Coding!

Article by Crysfel Villa

I'm a Sr Software Engineer who enjoys crafting software. I've been working remotely for the last 10 years, leading projects, writing books, training teams, and mentoring new devs. I'm the tech lead of @codigcoach