Tools of PDF

Why Batch Processing PDF Files Matters

Processing 50 PDFs one at a time is tedious and error-prone. Batch processing automates repetitive operations — compress, watermark, convert, rename, merge — so you can run them on hundreds of files while you do something else.

Whether you're a business professional dealing with monthly report archives or a developer building a document pipeline, batch PDF processing is a core productivity skill.

What Can Be Batch Processed?

Almost any operation you can do on one PDF can be batched:

Compression — reduce file size for all PDFs in a folder
Conversion — convert all PDFs to Word, Excel, image formats, or vice versa
Merging — combine multiple PDFs into one
Splitting — split every PDF at page 1 (extract first pages) or into individual pages
Watermarking — add a text or image watermark to every page of every PDF
Rotating — fix orientation on a folder of scanned PDFs
Password protection — encrypt a batch of PDFs with the same password
OCR — make all scanned PDFs searchable
Metadata editing — update title, author, keywords across multiple PDFs
Renaming — rename files based on metadata or content
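
Whatever the operation, a scripted batch job tends to follow the same shape: find every PDF in a folder, apply one operation to each, and write the result somewhere else. The sketch below illustrates that shape in Python using only the standard library. The names batch_apply and copy_unchanged are hypothetical, and copy_unchanged is only a placeholder for a real operation such as compression or watermarking.

```python
from pathlib import Path
import shutil
from typing import Callable


def batch_apply(input_dir: str, output_dir: str,
                operation: Callable[[Path, Path], None]) -> list:
    """Run operation(src, dst) on every PDF in input_dir; return the output paths."""
    src_root = Path(input_dir)
    dst_root = Path(output_dir)
    dst_root.mkdir(parents=True, exist_ok=True)
    outputs = []
    # Sort for a stable order; match .pdf and .PDF alike.
    for src in sorted(src_root.iterdir()):
        if src.is_file() and src.suffix.lower() == ".pdf":
            dst = dst_root / src.name
            operation(src, dst)
            outputs.append(dst)
    return outputs


def copy_unchanged(src: Path, dst: Path) -> None:
    """Placeholder operation (hypothetical): copies the file without modifying it."""
    shutil.copyfile(src, dst)
```

Swapping copy_unchanged for a function that calls one of the tools described in the methods that follow would turn this into a working batch job.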

Method 1: Adobe Acrobat Pro Action Wizard

Acrobat Pro's Action Wizard is the most user-friendly batch processing tool covered here: you build a sequence of steps once in a GUI, then rerun it on any folder.

Open Action Wizard: Tools → Action Wizard

Create a new action:

Click "New Action"
Add steps from the left panel (e.g., "Save As" → "Reduce File Size", "Add Watermark", "Export to Word")
Set a source folder (process all PDFs in that folder)
Set an output folder
Save the action with a name

Run an action: Select the saved action → click "Start" → Acrobat processes all files in the source folder automatically.

Pre-built actions: Acrobat includes built-in actions for Archive to PDF/A, Prepare for Distribution, and more. These can be customised.

Method 2: ILovePDF (Online Batch Processing)

ILovePDF's online tools support batch operations on multiple files simultaneously.

Upload multiple files: Drag and drop multiple PDFs onto any tool (Compress, Merge, Watermark, etc.).

Compress all: Upload all PDFs → select compression level → download a ZIP containing all compressed files.

Convert all: Upload multiple PDFs → convert to Word/JPG/Excel → download all converted files.

Limitations: Free tier limits file size and number of operations per session. Pro tier removes limits and adds API access.

Method 3: Ghostscript (Command Line — Most Powerful Free Option)

Ghostscript is free, open-source, and scriptable for advanced batch operations.

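Compress all PDFs in a folder (Linux/Mac shell). The /ebook preset downsamples images to roughly 150 dpi; /screen (72 dpi) gives smaller files and /printer (300 dpi) gives higher quality: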

for f in *.pdf; do
  gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook \
    -dBATCH -dNOPAUSE -dQUIET \
    -sOutputFile="compressed_${f}" "$f"
done

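Windows PowerShell (the Windows installer names the console executable gswin64c, or gswin32c for 32-bit builds, rather than gs):

Get-ChildItem *.pdf | ForEach-Object {
    & gswin64c -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook `
      -dBATCH -dNOPAUSE -dQUIET `
      "-sOutputFile=compressed_$($_.Name)" $_.FullName
}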
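
Because the executable name differs between platforms, a small Python wrapper can be one way to run the same compression command everywhere. The following is a minimal sketch under that assumption: find_ghostscript, build_compress_command, and compress are hypothetical helper names, and the Ghostscript flags are the ones used in the loops above.

```python
import shutil
import subprocess
from typing import Optional


def find_ghostscript() -> Optional[str]:
    """Return the first Ghostscript executable found on PATH, or None."""
    for name in ("gs", "gswin64c", "gswin32c"):
        path = shutil.which(name)
        if path:
            return path
    return None


def build_compress_command(gs_path: str, src: str, dst: str,
                           preset: str = "/ebook") -> list:
    """Build the same argument list as the shell loop, as a list of strings."""
    return [
        gs_path, "-sDEVICE=pdfwrite", f"-dPDFSETTINGS={preset}",
        "-dBATCH", "-dNOPAUSE", "-dQUIET",
        f"-sOutputFile={dst}", src,
    ]


def compress(src: str, dst: str) -> None:
    """Compress one PDF; raises if Ghostscript is missing or exits with an error."""
    gs_path = find_ghostscript()
    if gs_path is None:
        raise FileNotFoundError("Ghostscript not found on PATH")
    subprocess.run(build_compress_command(gs_path, src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.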

Convert all PDFs to PNG images (one PNG per page):

for f in *.pdf; do
  mkdir -p "${f%.pdf}"
  gs -sDEVICE=png16m -r150 \
    -dBATCH -dNOPAUSE -dQUIET \
    -sOutputFile="${f%.pdf}/page_%03d.png" "$f"
done

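Convert to PDF/A for archiving (-dPDFA alone is not enough: Ghostscript also needs a colour conversion strategy, and strict compliance additionally requires a PDFA_def.ps file that points at an ICC profile):

for f in *.pdf; do
  gs -dPDFA=2 -dBATCH -dNOPAUSE \
    -sColorConversionStrategy=RGB \
    -dPDFACompatibilityPolicy=1 \
    -sDEVICE=pdfwrite \
    -sOutputFile="archive_${f}" "$f"
done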

Method 4: pdftk (Command Line)

pdftk excels at structural batch operations — merging, splitting, rotating, watermarking.

Merge all PDFs in a folder into one (Linux/Mac):

pdftk *.pdf cat output merged.pdf

Split every PDF into individual pages:

for f in *.pdf; do
  pdftk "$f" burst output "${f%.pdf}_page_%04d.pdf"
done

Rotate every page of every PDF 90° clockwise:

for f in *.pdf; do
  pdftk "$f" rotate 1-endright output "rotated_${f}"
done

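Add a watermark to all PDFs (stamp overlays the first page of watermark.pdf on top of each page; use background instead to place it behind the content):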

for f in *.pdf; do
  pdftk "$f" stamp watermark.pdf output "watermarked_${f}"
done

Method 5: Python (Most Flexible for Custom Workflows)

Python gives you complete control over batch PDF operations. Key libraries:

pypdf (the maintained successor to PyPDF2) — pure Python, good for merging, splitting, rotating, metadata
PyMuPDF (fitz) — fast, feature-rich, handles compression and rendering
pdfplumber — excellent for text and table extraction
Pillow — image processing for rasterized PDF pages

Batch compress into a separate folder:

import fitz  # PyMuPDF
import os

input_dir = "input_pdfs"
output_dir = "compressed_pdfs"
os.makedirs(output_dir, exist_ok=True)

for filename in os.listdir(input_dir):
    # Match .pdf and .PDF alike
    if filename.lower().endswith(".pdf"):
        input_path = os.path.join(input_dir, filename)
        output_path = os.path.join(output_dir, filename)

        # The context manager closes the file even if save() raises
        with fitz.open(input_path) as doc:
            # garbage=4 drops unused and duplicate objects, deflate
            # compresses streams, clean tidies content streams
            doc.save(output_path, garbage=4, deflate=True, clean=True)
        print(f"Compressed: {filename}")
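
Lossless clean-up like this does not always shrink a file, so it can help to measure the result rather than assume it. The sketch below, which uses only the standard library, compares the size of an original and its processed copy. The function name size_report is hypothetical; a negative percentage would mean the output grew.

```python
import os


def size_report(original_path: str, compressed_path: str) -> str:
    """Summarise how much smaller (or larger) the processed file is."""
    before = os.path.getsize(original_path)
    after = os.path.getsize(compressed_path)
    # Guard against division by zero for empty input files.
    saved = 0.0 if before == 0 else (before - after) / before * 100
    name = os.path.basename(original_path)
    return f"{name}: {before} -> {after} bytes ({saved:.1f}% saved)"
```

Calling it with the input and output paths of each file after saving gives a per-file record of what the batch achieved.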

Batch extract text from all PDFs:

import fitz
import os

for filename in os.listdir("."):
    if filename.lower().endswith(".pdf"):
        with fitz.open(filename) as doc:
            text = "\n".join(page.get_text() for page in doc)
        # splitext swaps only the extension, even if ".pdf" appears elsewhere in the name
        txt_filename = os.path.splitext(filename)[0] + ".txt"
        with open(txt_filename, "w", encoding="utf-8") as f:
            f.write(text)
        print(f"Extracted: {txt_filename}")

Batch merge all PDFs in a folder:

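from pypdf import PdfWriter
import os

output_name = "merged.pdf"
writer = PdfWriter()
for filename in sorted(os.listdir(".")):
    # Skip the output of a previous run so it is not merged into itself
    if filename.lower().endswith(".pdf") and filename != output_name:
        writer.append(filename)

with open(output_name, "wb") as f:
    writer.write(f)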
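
One caveat with sorted() is that it orders names character by character, so report10.pdf would be merged before report2.pdf. If the files are numbered without zero padding, a natural sort key is one possible fix. The helper below is a minimal sketch with a hypothetical name, natural_key, and could be passed as the key argument to sorted().

```python
import re


def natural_key(name: str) -> list:
    """Split a name into text and number chunks so 'report2' sorts before 'report10'."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capture group alternates text (even index) and digits (odd index).
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]


names = ["report10.pdf", "report2.pdf", "report1.pdf"]
ordered = sorted(names, key=natural_key)
```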

Method 6: PDFsam (Desktop GUI, Free)

PDFsam Basic is free and supports batch splitting and merging via a GUI.

Batch split: Add multiple PDFs → set split point → run. Each PDF is split.

Batch merge: Add all PDFs → set order → merge into one output.

PDFsam Enhanced (paid) adds more batch operations including rotate, extract, and visual composer.

Organising Batch Output

When batch processing dozens of files, keep outputs organised:

Naming conventions:

compressed_YYYY-MM-DD_originalname.pdf
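
A naming convention like this is easy to generate in a script. The sketch below builds such a name with the standard library; dated_name and its parameters are hypothetical names chosen for illustration.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def dated_name(original: str, prefix: str = "compressed",
               when: Optional[date] = None) -> str:
    """Return prefix_YYYY-MM-DD_originalname for the given file path."""
    when = when or date.today()
    # Path(...).name drops any leading folders from the original path.
    return f"{prefix}_{when.isoformat()}_{Path(original).name}"
```

Passing an explicit date keeps every file in one batch on the same date, even if the job runs past midnight.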

Folder structure:

/batch_output/
  /compressed/
  /converted/
  /watermarked/
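
A script can create this layout up front so that later steps never fail on a missing folder. This is a minimal sketch; make_output_tree is a hypothetical helper, and the folder names are the ones shown above.

```python
from pathlib import Path


def make_output_tree(root: str,
                     operations=("compressed", "converted", "watermarked")) -> dict:
    """Create one subfolder per operation under root; return name -> Path."""
    tree = {}
    for op in operations:
        folder = Path(root) / op
        # exist_ok=True makes the call safe to repeat on every run.
        folder.mkdir(parents=True, exist_ok=True)
        tree[op] = folder
    return tree
```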

Logging: For scripted batch jobs, write a log file:

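import logging
# Timestamps show when each file was handled
logging.basicConfig(filename="batch.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
filename = "example.pdf"  # placeholder; in practice, the loop variable
logging.info("Processed: %s", filename)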

Error Handling in Batch Jobs

Some PDFs in a batch may be corrupted, password-protected, or have unusual structures that cause tools to fail.

In Python, wrap each file in a try/except:

for filename in files:  # files: your list of PDF paths
    try:
        process(filename)  # process: your per-file operation
    except Exception as e:
        # Log and continue; don't stop the entire batch
        print(f"Failed: {filename}: {e}")
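
For larger batches it may be more useful to collect failures and report them once at the end than to print them as they occur. The following sketch shows one way to do that; run_batch is a hypothetical name, and process stands for whatever per-file operation the job performs.

```python
from typing import Callable, Iterable, List, Tuple


def run_batch(files: Iterable[str],
              process: Callable[[str], None]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Apply process to each file; return (succeeded, failed) without stopping early."""
    succeeded: List[str] = []
    failed: List[Tuple[str, str]] = []
    for filename in files:
        try:
            process(filename)
            succeeded.append(filename)
        except Exception as exc:
            # Record the reason and carry on with the next file.
            failed.append((filename, str(exc)))
    return succeeded, failed
```

The failed list pairs each filename with its error message, which makes it straightforward to retry only those files or write them to the log.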

In shell scripts, continue on error:

for f in *.pdf; do
  gs ... -sOutputFile="out_${f}" "$f" || echo "Failed: $f"
done

Summary

Batch PDF processing transforms tedious manual work into automated pipelines. For GUI-based batch work, Acrobat Pro's Action Wizard handles most professional needs. For free batch operations, Ghostscript and pdftk cover compression, conversion, merging, and watermarking from the command line. Python with PyMuPDF provides the most flexibility for custom workflows. The key is choosing the right tool for your skill level and building repeatable scripts or actions rather than repeating manual steps every time.

How to Batch Process PDF Files: Automate Repetitive PDF Tasks