How to Batch Process PDF Files: Automate Repetitive PDF Tasks
Learn how to batch process PDF files — compress, convert, rename, watermark, and more — using desktop tools, command-line utilities, and scripts.
Why Batch Processing PDF Files Matters
Processing 50 PDFs one at a time is tedious and error-prone. Batch processing automates repetitive operations — compress, watermark, convert, rename, merge — so you can run them on hundreds of files while you do something else.
Whether you're a business professional dealing with monthly report archives or a developer building a document pipeline, batch PDF processing is a core productivity skill.
What Can Be Batch Processed?
Almost any operation you can do on one PDF can be batched:
- Compression — reduce file size for all PDFs in a folder
- Conversion — convert all PDFs to Word, Excel, image formats, or vice versa
- Merging — combine multiple PDFs into one
- Splitting — split every PDF at page 1 (extract first pages) or into individual pages
- Watermarking — add a text or image watermark to every page of every PDF
- Rotating — fix orientation on a folder of scanned PDFs
- Password protection — encrypt a batch of PDFs with the same password
- OCR — make all scanned PDFs searchable
- Metadata editing — update title, author, keywords across multiple PDFs
- Renaming — rename files based on metadata or content
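As an illustration of the last item, renaming by metadata might be sketched as follows. This is a minimal sketch, assuming the third-party pypdf library is installed and that the PDFs carry a usable Title field; `safe_stem` and `rename_by_title` are hypothetical helper names, not part of any library.

```python
import re
from pathlib import Path


def safe_stem(title: str, max_len: int = 80) -> str:
    """Turn a document title into a filesystem-friendly file stem."""
    # Collapse every run of characters outside A-Z, a-z, 0-9, '.', '_', '-'
    # into one underscore (non-ASCII letters are replaced too).
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("._")
    return stem[:max_len]


def rename_by_title(folder: str) -> None:
    """Rename each PDF in `folder` to its metadata title, when it has one."""
    from pypdf import PdfReader  # third-party; assumed installed

    for path in sorted(Path(folder).glob("*.pdf")):
        meta = PdfReader(path).metadata
        title = meta.title if meta else None
        stem = safe_stem(title) if title else ""
        if not stem:
            continue  # no usable title: leave the file alone
        target = path.with_name(stem + ".pdf")
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
```

Files without a title, and files whose new name is already taken, are skipped rather than overwritten. Encrypted or damaged PDFs may raise an exception here, so in a real batch this call would sit inside per-file error handling.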
Method 1: Adobe Acrobat Pro Action Wizard
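Acrobat Pro's Action Wizard is a point-and-click batch tool that needs no scripting, which makes it the most approachable option here if you already have an Acrobat Pro licence.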
Open Action Wizard: Tools → Action Wizard
Create a new action:
- Click "New Action"
- Add steps from the left panel (e.g., "Save As" → "Reduce File Size", "Add Watermark", "Export to Word")
- Set a source folder (process all PDFs in that folder)
- Set an output folder
- Save the action with a name
Run an action: Select the saved action → click "Start" → Acrobat processes all files in the source folder automatically.
Pre-built actions: Acrobat includes built-in actions for Archive to PDF/A, Prepare for Distribution, and more. These can be customised.
Method 2: iLovePDF (Online Batch Processing)
iLovePDF's online tools support batch operations on multiple files simultaneously.
Upload multiple files: Drag and drop multiple PDFs onto any tool (Compress, Merge, Watermark, etc.).
Compress all: Upload all PDFs → select compression level → download a ZIP containing all compressed files.
Convert all: Upload multiple PDFs → convert to Word/JPG/Excel → download all converted files.
Limitations: Free tier limits file size and number of operations per session. Pro tier removes limits and adds API access.
Method 3: Ghostscript (Command Line — Most Powerful Free Option)
Ghostscript is free, open-source, and scriptable for advanced batch operations.
Compress all PDFs in a folder (Linux/Mac shell):
for f in *.pdf; do
  gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook \
     -dBATCH -dNOPAUSE -dQUIET \
     -sOutputFile="compressed_${f}" "$f"
done
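Windows PowerShell (the Ghostscript console executable on Windows is typically gswin64c rather than gs; adjust the name to match your install):
Get-ChildItem *.pdf | ForEach-Object {
    & gswin64c -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook `
        -dBATCH -dNOPAUSE -dQUIET `
        -sOutputFile="compressed_$($_.Name)" $_.FullName
}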
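The shell and PowerShell loops above differ only in syntax, so the same job can be driven from one cross-platform script. The following is a rough sketch, assuming Ghostscript is installed and on PATH; `find_ghostscript`, `build_command`, and `compress_folder` are illustrative names. It writes results to a separate folder, so a second run does not pick up its own output.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def find_ghostscript() -> Optional[str]:
    """Return the first Ghostscript executable found on PATH, if any."""
    # "gs" on Linux/macOS; "gswin64c"/"gswin32c" are the usual Windows console names.
    for name in ("gs", "gswin64c", "gswin32c"):
        found = shutil.which(name)
        if found:
            return found
    return None


def build_command(gs: str, src: Path, dst: Path, preset: str = "/ebook") -> List[str]:
    """Build the same pdfwrite command line used in the shell loop."""
    return [
        gs, "-sDEVICE=pdfwrite", f"-dPDFSETTINGS={preset}",
        "-dBATCH", "-dNOPAUSE", "-dQUIET",
        f"-sOutputFile={dst}", str(src),
    ]


def compress_folder(input_dir: str, output_dir: str) -> None:
    """Compress every *.pdf in input_dir into output_dir, keeping file names."""
    gs = find_ghostscript()
    if gs is None:
        raise RuntimeError("Ghostscript not found on PATH")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(input_dir).glob("*.pdf")):
        result = subprocess.run(build_command(gs, src, out / src.name))
        if result.returncode != 0:
            print(f"Failed: {src.name}")  # report and move on to the next file
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.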
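Convert all PDFs to PNG images (one PNG per page; -dBATCH and -dNOPAUSE stop Ghostscript from pausing after each page and waiting at its prompt):
for f in *.pdf; do
  mkdir -p "${f%.pdf}"
  gs -sDEVICE=png16m -r150 \
     -dBATCH -dNOPAUSE -dQUIET \
     -sOutputFile="${f%.pdf}/page_%03d.png" "$f"
done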
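Convert to PDF/A for archiving (strict compliance also requires a PDFA_def.ps definition file that references an ICC profile, so validate the results with a PDF/A checker):
for f in *.pdf; do
  gs -dPDFA=2 -dBATCH -dNOPAUSE \
     -sColorConversionStrategy=RGB \
     -dPDFACompatibilityPolicy=1 \
     -sDEVICE=pdfwrite \
     -sOutputFile="archive_${f}" "$f"
done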
Method 4: pdftk (Command Line)
pdftk excels at structural batch operations — merging, splitting, rotating, watermarking.
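Merge all PDFs in a folder into one (Linux/Mac). If merged.pdf already exists in the folder, move or delete it first, because *.pdf would match it as an input: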
pdftk *.pdf cat output merged.pdf
Split every PDF into individual pages:
for f in *.pdf; do
  pdftk "$f" burst output "${f%.pdf}_page_%04d.pdf"
done
Rotate all pages in all PDFs by 90° clockwise:
for f in *.pdf; do
  pdftk "$f" rotate 1-endright output "rotated_${f}"
done
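Add a watermark to all PDFs (stamp overlays the first page of watermark.pdf on every page; use background instead to place it underneath the content):
for f in *.pdf; do
  [ "$f" = "watermark.pdf" ] && continue  # don't stamp the watermark file itself
  pdftk "$f" stamp watermark.pdf output "watermarked_${f}"
done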
Method 5: Python (Most Flexible for Custom Workflows)
Python gives you complete control over batch PDF operations. Key libraries:
- pypdf (the maintained successor to the deprecated PyPDF2) — pure Python, good for merging, splitting, rotating, metadata
- PyMuPDF (fitz) — fast, feature-rich, handles compression and rendering
- pdfplumber — excellent for text and table extraction
- Pillow — image processing for rasterized PDF pages
Batch compress into a separate folder (file names are kept):
import fitz  # PyMuPDF
import os
input_dir = "input_pdfs"
output_dir = "compressed_pdfs"
os.makedirs(output_dir, exist_ok=True)
for filename in os.listdir(input_dir):
    if filename.lower().endswith(".pdf"):  # also matches .PDF
        input_path = os.path.join(input_dir, filename)
        output_path = os.path.join(output_dir, filename)
        doc = fitz.open(input_path)
        # garbage=4 removes unused and duplicate objects; deflate compresses streams
        doc.save(output_path, garbage=4, deflate=True, clean=True)
        doc.close()
        print(f"Compressed: {filename}")
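Re-saving does not always shrink a file: a PDF that is already well optimised can come out slightly larger. One way to guard against that is to compare sizes after each save. The sketch below uses only the standard library; `keep_smaller` is a hypothetical helper, not part of PyMuPDF.

```python
import os
import shutil


def keep_smaller(original: str, candidate: str) -> int:
    """Make sure `candidate` is smaller than `original`; return bytes saved.

    If the re-saved file is not smaller, it is overwritten with a copy
    of the original and 0 is returned.
    """
    saved = os.path.getsize(original) - os.path.getsize(candidate)
    if saved <= 0:
        shutil.copyfile(original, candidate)
        return 0
    return saved
```

In the loop above it would be called right after `doc.close()`, for example `keep_smaller(input_path, output_path)`, so the output folder never holds a file bigger than its source.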
Batch extract text from all PDFs (scanned pages yield no text until they have been OCRed):
import fitz  # PyMuPDF
import os
for filename in os.listdir("."):
    if filename.lower().endswith(".pdf"):
        doc = fitz.open(filename)
        text = "\n".join(page.get_text() for page in doc)
        doc.close()
        # Swap the extension safely, whatever its case
        txt_filename = os.path.splitext(filename)[0] + ".txt"
        with open(txt_filename, "w", encoding="utf-8") as f:
            f.write(text)
        print(f"Extracted: {txt_filename}")
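Batch merge all PDFs in a folder:
from pypdf import PdfWriter
import os
output_name = "merged.pdf"
writer = PdfWriter()
for filename in sorted(os.listdir(".")):
    # Skip the output of a previous run so it is not merged into itself
    if filename.lower().endswith(".pdf") and filename != output_name:
        writer.append(filename)
with open(output_name, "wb") as f:
    writer.write(f)
writer.close()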
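File discovery is the part of these scripts that is easiest to get subtly wrong: extensions can be upper-case, and an output file written into the input folder is picked up again on the next run. A small shared helper can cover both cases. This is a sketch using only the standard library; `find_pdfs` and its parameters are illustrative names.

```python
from pathlib import Path
from typing import Iterable, List


def find_pdfs(folder: str, skip_names: Iterable[str] = (),
              recursive: bool = False) -> List[Path]:
    """Return the PDF files in `folder`, sorted for a predictable order."""
    root = Path(folder)
    candidates = root.rglob("*") if recursive else root.iterdir()
    skip = {name.lower() for name in skip_names}
    return sorted(
        p for p in candidates
        if p.is_file()
        and p.suffix.lower() == ".pdf"   # matches .pdf, .PDF, .Pdf
        and p.name.lower() not in skip   # e.g. a previous merged.pdf
    )
```

For example, `find_pdfs(".", skip_names=["merged.pdf"])` could feed the merge loop, and `recursive=True` walks subfolders as well.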
Method 6: PDFsam (Desktop GUI, Free)
PDFsam Basic is free and supports batch splitting and merging via a GUI.
Batch split: Add multiple PDFs → set split point → run. Each PDF is split.
Batch merge: Add all PDFs → set order → merge into one output.
PDFsam Basic also includes rotate, extract, and mix modules. The paid editions, PDFsam Enhanced and PDFsam Visual, add further features such as PDF editing and a visual composer.
Organising Batch Output
When batch processing dozens of files, keep outputs organised:
Naming conventions:
compressed_YYYY-MM-DD_originalname.pdf
Folder structure:
/batch_output/
    /compressed/
    /converted/
    /watermarked/
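The naming convention and folder layout above can be combined in one helper, so every script in a pipeline places its output the same way. A minimal sketch using only the standard library follows; `output_path` is a hypothetical name, and the operation name is assumed to double as the subfolder name.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def output_path(base: str, operation: str, source: str,
                day: Optional[date] = None) -> Path:
    """Return base/<operation>/<operation>_YYYY-MM-DD_<originalname>.pdf.

    The subfolder is created if it does not exist yet.
    """
    day = day or date.today()
    folder = Path(base) / operation
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"{operation}_{day.isoformat()}_{Path(source).stem}.pdf"
```

Calling `output_path("batch_output", "compressed", "reports/q3.pdf")` yields a path inside batch_output/compressed/ with today's date in the file name.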
Logging: For scripted batch jobs, write a log file:
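import logging
# Timestamp each entry so the log shows when every file was handled
logging.basicConfig(filename="batch.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
filename = "example.pdf"  # placeholder for the loop variable of your batch
logging.info("Processed: %s", filename)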
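For longer jobs, a dedicated logger is often easier to manage than the root logger, because other libraries cannot reconfigure it and it can be set up from a single function. The following is one possible sketch using the standard logging module; `make_batch_logger` and the logger name "pdf_batch" are assumptions made for illustration.

```python
import logging


def make_batch_logger(log_path: str = "batch.log") -> logging.Logger:
    """Return a logger that appends timestamped lines to `log_path`."""
    logger = logging.getLogger("pdf_batch")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate lines if called twice
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Typical use inside a batch loop:
#   log = make_batch_logger()
#   log.info("Processed: %s", filename)
#   log.error("Failed: %s (%s)", filename, error)
```

Recording failures at ERROR level makes it possible to search the log for just the files that need a second look.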
Error Handling in Batch Jobs
Some PDFs in a batch may be corrupted, password-protected, or have unusual structures that cause tools to fail.
In Python, wrap each file in a try/except:
for filename in files:
    try:
        process(filename)
    except Exception as e:
        print(f"Failed: {filename} ({e})")
        # Log and continue; don't stop the entire batch
In shell scripts, continue on error:
for f in *.pdf; do
  gs ... -sOutputFile="out_${f}" "$f" || echo "Failed: $f"
done
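Printing each failure is enough for small jobs, but with hundreds of files it helps to collect the failures and report them together at the end. The sketch below shows that pattern with the standard library only; `run_batch` is a hypothetical helper, and `process` stands for whatever per-file function the batch uses.

```python
from typing import Callable, Iterable, List, Tuple


def run_batch(
    files: Iterable[str],
    process: Callable[[str], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Apply `process` to every file and return (succeeded, failed).

    `failed` holds (file name, error description) pairs.
    """
    succeeded: List[str] = []
    failed: List[Tuple[str, str]] = []
    for name in files:
        try:
            process(name)
        except Exception as exc:  # one bad file must not end the batch
            failed.append((name, f"{type(exc).__name__}: {exc}"))
        else:
            succeeded.append(name)
    return succeeded, failed
```

A scheduled job could exit with a non-zero status whenever `failed` is not empty, so that cron or a CI runner flags the run for attention.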
Summary
Batch PDF processing transforms tedious manual work into automated pipelines. For GUI-based batch work, Acrobat Pro's Action Wizard handles most professional needs. For free batch operations, Ghostscript and pdftk cover compression, conversion, merging, and watermarking from the command line. Python with PyMuPDF provides the most flexibility for custom workflows. The key is choosing the right tool for your skill level and building repeatable scripts or actions rather than repeating manual steps every time.