After spending several hours updating the Puppeteer library and testing it in the Heroku environment, I want to share my findings. I got Puppeteer running on Heroku again, and I believe this information can help others facing similar challenges.
First, for those who don’t know the Puppeteer library: Puppeteer is a Node.js library developed by Google that lets you control a headless Chrome or Chromium browser programmatically through a high-level API.
And let’s not forget about Heroku.
Heroku is a cloud-based platform that simplifies the deployment, management, and scaling of applications without requiring developers to manage infrastructure or servers.
Situation
I’ve been using Puppeteer to automate various web actions, such as generating high-quality screenshots and PDFs of web pages.
As we at Screenful utilize Heroku, I needed to set up Puppeteer within the Heroku environment.
After some research, it became clear that I needed a Heroku buildpack to make it work.
To run Puppeteer’s headless browser on Heroku, I used the buildpack written by Jon Tewksbury.
Thanks to Jon Tewksbury.
Problem
The system had been functioning flawlessly for nearly two years, but issues arose when I tried to upgrade Puppeteer.
The library stopped working because the executable file (the browser) was missing.
After some research, I discovered that puppeteer@19 includes significant changes related to the executable file, which caused compatibility issues.
Previously, installing puppeteer downloaded the executable file (the browser) into the node_modules directory. Now it is downloaded into the ~/.cache/puppeteer directory instead.
During the build process, Heroku didn’t copy the necessary executable file from the build environment into the finalised dyno environment, resulting in its absence. As a consequence, the app crashed when attempting to use a non-existent executable file.
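To see this for yourself, a small check like the one below can help. It is only a sketch: it assumes Puppeteer 19+ uses .cache/puppeteer under the home directory unless the PUPPETEER_CACHE_DIR environment variable is set, and the helper names puppeteerCacheDir and describeCache are made up for this example.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: where Puppeteer 19+ is assumed to store its browser.
// PUPPETEER_CACHE_DIR wins when set; otherwise <home>/.cache/puppeteer.
function puppeteerCacheDir(env = process.env, home = os.homedir()) {
  return env.PUPPETEER_CACHE_DIR || path.join(home, '.cache', 'puppeteer');
}

// Hypothetical helper: report whether a directory exists on this machine.
function describeCache(dir = puppeteerCacheDir()) {
  return { dir, exists: fs.existsSync(dir) };
}

// Print the assumed cache location and whether it is present.
const report = describeCache();
console.log(`${report.dir}: ${report.exists ? 'found' : 'missing'}`);
```

Running it once during the build and once on the dyno should show whether the directory made it into the final environment.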
I found some answers on Google and tried clearing the Heroku build cache, but had no luck.
Solution
After spending several hours debugging and attempting to identify the root cause of the issue, I was able to find a viable solution.
I created a postbuild.bash file:
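#!/usr/bin/env bash
# Move the browser that npm install downloaded into the build directory,
# so that it is included in the final dyno environment.
echo -e "Store puppeteer executable in cache\n"
mkdir -p ./.cache
mv /app/.cache/puppeteer ./.cache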
Then I configured it in package.json to run after Heroku’s build process:
...
"scripts": {
...
"heroku-postbuild": "bash postbuild.bash"
},
...
Now, whenever npm install downloads the executable into the ~/.cache/puppeteer directory, the script moves it into the app’s root directory, so the final build no longer misses the executable.
If you add pwd and ls commands to postbuild.bash and watch Heroku’s build output, you will get the full picture of what is going on. You can also use heroku run bash to open a shell inside the dyno and locate the executable file.
***
Thank you, feel free to ask any questions or tweet me @nairihar
Also follow my “JavaScript Universe” newsletter on Telegram: @javascript