Pepatung
About
Pepatung is a tool for automated PDF generation from web pages using Puppeteer with flexible configuration options.
Requirements
- Node.js 22.2.0
Features
- Automated PDF Generation: Automatically captures PDFs from specified web pages.
- URL Monitoring: Waits for the specified URL to be up and available before capturing.
- Flexible Configuration: Allows specifying output directory and file name for the generated PDF as well as the browser executable.
- Multi-Arch Containerisation Support: Containerised for seamless deployments across multiple architectures.
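The URL monitoring behaviour can be pictured with a minimal polling sketch. This is not the project's actual implementation: the names waitForSite and defaultProbe are hypothetical, and the availability check (an HTTP request that returns a successful status) is an assumption. It relies only on the global fetch that ships with recent Node.js versions.

```javascript
// Hypothetical default check: the site counts as "up" when a request
// succeeds with a 2xx status. Network errors are treated as "not up yet".
async function defaultProbe(url) {
  try {
    const response = await fetch(url);
    return response.ok;
  } catch {
    return false; // connection refused, DNS failure, etc.
  }
}

// Hypothetical helper: polls `probe(url)` every `intervalMs` milliseconds
// and resolves with the number of attempts once the site is available.
async function waitForSite(url, intervalMs = 5000, probe = defaultProbe) {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    if (await probe(url)) return attempts;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller might use it as waitForSite(process.env.SITE_URL, Number(process.env.CHECK_INTERVAL) || 5000) before starting the capture. Passing the probe as a parameter keeps the loop easy to exercise without a running site.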
Installation
Local
- Ensure that you have met all of the project requirements.
- Clone the repository:
    git clone https://gitlab.com/irfanhakim/pepatung.git ~/pepatung
- Change into the local repository:
    cd ~/pepatung
- Run the install command:
    npm install
- Set any environment variables you need through an .env file.
- Run the service locally and monitor the logs to check its progress:
    npm start
Docker
A Docker container image for Pepatung is provided, but it has not been tested outside of a Kubernetes environment.
For the best experience, please deploy Pepatung using the official Portfolio Helm chart.
Configuration
Environment variables
Environment variables are used to configure certain core options related to the project. For consistency, environment variables used throughout the project are consolidated in the settings/env.js module.
For local testing purposes, please supply your environment variables through an .env file in the root of this repository.
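As an illustration of what consolidating environment variables in one place can look like, here is a minimal sketch. It is not the contents of settings/env.js: the function name readEnv and the property names are hypothetical, and the fallback values are the non-Docker defaults documented for each option.

```javascript
// Hypothetical consolidation of environment variables into a single object.
// Fallbacks mirror the documented non-Docker default values; options with
// no documented default are left undefined when unset.
function readEnv(env = process.env) {
  return {
    executablePath: env.PUPPETEER_EXECUTABLE_PATH,
    siteDomain: env.SITE_DOMAIN,
    siteUrl: env.SITE_URL,
    checkInterval: Number(env.CHECK_INTERVAL || 5000), // milliseconds
    outputDir: env.OUTPUT_DIR || "output",
    outputFile: env.OUTPUT_FILE || "output.pdf",
    pdfFormat: env.PDF_FORMAT || "A4",
    pdfScale: Number(env.PDF_SCALE || 1),
  };
}
```

Reading every variable through one function like this means the rest of the code never touches process.env directly, which keeps defaults and type conversions in a single spot.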
| Option | Description | Sample Value | Default Value |
|---|---|---|---|
| PUPPETEER_EXECUTABLE_PATH | The full path to the browser executable used by Puppeteer, e.g. Chromium or Firefox. | /usr/bin/google-chrome | Docker: /usr/bin/chromium-browser |
| PUPPETEER_SKIP_CHROMIUM_DOWNLOAD | Specifies whether to skip downloading Chromium during installation. | false | Docker: true |
| SITE_DOMAIN | The base domain used to resolve relative URLs in the target site into absolute links. | https://example.com | - |
| SITE_URL | The URL of the site to watch and capture as PDF. | http://localhost:80 | - |
| CHECK_INTERVAL | The interval in milliseconds between checks until the site is available. | 10000 | 5000 |
| OUTPUT_DIR | The directory to output the captured PDF. | /dist/assets/docs | output |
| OUTPUT_FILE | The name of the output PDF file. | myfile.pdf | output.pdf |
| PAGEBREAK_ELEMENT_SELECTOR | The query selector identifying the element before which a page break should be added. | h2 | - |
| PAGEBREAK_ELEMENT_TEXT | The text within the element before which a page break should be added. | my header | - |
| PDF_FORMAT | The paper size format of the output PDF file. | letter | A4 |
| PDF_MARGIN | Space-separated values for PDF margins in the order: top right bottom left. | 10 0 10 0 | - |
| PDF_SCALE | The scaling factor for rendering the web page in the PDF. | 0.85 | 1 |
| SUM_ELEMENT_SELECTOR | The query selector identifying the element where a checksum should be added in place. | h2 | - |
| SUM_ELEMENT_TEXT | The text within the element where a checksum should be added in place. | my header | - |
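To make the PDF-related options more concrete, the sketch below shows one way they could be turned into an options object of the shape accepted by Puppeteer's page.pdf() (path, format, scale, margin). The helper names parseMargin and buildPdfOptions are hypothetical and this is not the project's own code. In particular, treating PDF_MARGIN values as plain numbers is an assumption, since the unit is not documented here.

```javascript
// Hypothetical helper: turns "top right bottom left" into a margin object.
// Returns undefined when the variable is unset or blank.
function parseMargin(value) {
  if (!value || !value.trim()) return undefined;
  const parts = value.trim().split(/\s+/).map(Number);
  if (parts.length !== 4 || parts.some(Number.isNaN)) {
    throw new Error(`PDF_MARGIN expects four numbers, got "${value}"`);
  }
  const [top, right, bottom, left] = parts;
  return { top, right, bottom, left }; // assumption: numeric values, unit unspecified
}

// Hypothetical helper: builds an options object for page.pdf() from the
// environment, falling back to the documented non-Docker defaults.
function buildPdfOptions(env = process.env) {
  const dir = (env.OUTPUT_DIR || "output").replace(/\/+$/, "");
  const file = env.OUTPUT_FILE || "output.pdf";
  return {
    path: `${dir}/${file}`,
    format: env.PDF_FORMAT || "A4",
    scale: Number(env.PDF_SCALE || 1),
    margin: parseMargin(env.PDF_MARGIN),
  };
}
```

With the sample values from the table, buildPdfOptions would produce a path of /dist/assets/docs/myfile.pdf, the letter format, a scale of 0.85, and a margin of 10 at the top and bottom with 0 on the sides.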
License
This project is licensed under the AGPL-3.0-only license. Please refer to the LICENSE file for more information.