How to Extract and Merge All Images from a PDF into a Single Image Using Python

Have you ever needed to pull out all the images from a PDF and combine them into one long image for easy viewing or sharing? Whether it’s for study notes, infographics, or just organizing your resources, this task can be handled beautifully with a bit of Python magic!

In this blog post, I’ll show you how to extract all images from a PDF and merge them vertically into one single image. Let’s dive in!

Why Would You Need This?

Quick review: Scroll through all images from a PDF at once.
Presentation: Share notes or diagrams as a single, long image.
Social media: Post study materials or infographics without splitting them into multiple files.

Tools You’ll Need

To complete this project, you’ll need:

Python (3.x)
PyMuPDF (for extracting images from PDFs)
Pillow (for handling and merging images)

You can install these with:

pip install pymupdf pillow

Step 1: Extract Images from PDF

First, we use PyMuPDF to extract all the images from each page of your PDF.

import fitz # PyMuPDF
import io
from PIL import Image

pdf_path = "yourfile.pdf"
pdf_file = fitz.open(pdf_path)

images = []

for page_index in range(len(pdf_file)):
    page = pdf_file[page_index]
    # Each entry describes one image on the page; its first item is the xref
    image_list = page.get_images(full=True)
    for img in image_list:
        xref = img[0]
        # extract_image() returns a dict; the raw file bytes are under "image"
        base_image = pdf_file.extract_image(xref)
        image_bytes = base_image["image"]
        image = Image.open(io.BytesIO(image_bytes))
        images.append(image)
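
One thing to watch: a PDF can reference the same image on several pages (a logo in a header, for example), and the loop above would then extract it once per page. Here is a small sketch of how you might skip repeats by remembering the xref numbers you have already seen. The helper name unique_xrefs is my own, not part of PyMuPDF, and it only assumes what the code above already relies on: the first item of each entry from get_images() is the xref.

```python
def unique_xrefs(image_lists):
    """Return xrefs in first-seen order, skipping repeats.

    image_lists: one list per page, shaped like the result of
    page.get_images(full=True), where item[0] is the xref.
    (unique_xrefs is a hypothetical helper, not a PyMuPDF function.)
    """
    seen = set()
    ordered = []
    for image_list in image_lists:
        for img in image_list:
            xref = img[0]
            if xref not in seen:
                seen.add(xref)
                ordered.append(xref)
    return ordered
```

You could build the input with [pdf_file[i].get_images(full=True) for i in range(len(pdf_file))] and then call extract_image() once for each xref that comes back.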

Step 2: Merge Images Vertically

Now, let’s stack the images vertically—each image below the previous one.

# Calculate the total width and height
width = max(img.width for img in images)
total_height = sum(img.height for img in images)

# Create a new blank image with the combined height
merged_image = Image.new("RGB", (width, total_height), (255, 255, 255))

current_y = 0
for img in images:
    merged_image.paste(img, (0, current_y))
    current_y += img.height

# Save the final merged image
merged_image.save("merged_images.jpg")
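
If you would rather reuse this step, the same logic can be wrapped in a function. This is only a sketch: merge_vertically is a name I made up, and the white background simply mirrors the code above. It also refuses an empty list, since max() would otherwise raise an error when no images were found.

```python
from PIL import Image


def merge_vertically(images, background=(255, 255, 255)):
    """Stack PIL images top to bottom on one RGB canvas.

    merge_vertically is a hypothetical helper name, not a Pillow function.
    """
    if not images:
        raise ValueError("no images to merge")

    # The canvas is as wide as the widest image and as tall as all of them.
    width = max(img.width for img in images)
    total_height = sum(img.height for img in images)
    merged = Image.new("RGB", (width, total_height), background)

    current_y = 0
    for img in images:
        # Narrower images are left-aligned; the rest stays background colour.
        merged.paste(img, (0, current_y))
        current_y += img.height
    return merged
```

With that in place, the saving step becomes merge_vertically(images).save("merged_images.jpg").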

Step 3: Run Your Script

Just run your script! After execution, you’ll find merged_images.jpg in the folder you ran the script from, containing all images from your PDF, stacked one below the other.

A Few Tips & Tricks

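Handling different widths: The canvas is as wide as the widest image, so narrower images sit at the left edge with white space to their right. For a uniform look, resize all images to the same width before merging.
Large PDFs: Every extracted image stays in memory until the merge is done, so memory usage grows with very large PDFs. You can process the pages in batches if needed.
Other formats: Pillow picks the output format from the file extension, so you can save as PNG or another format by changing the filename passed to save(). JPEG files are limited to roughly 65,000 pixels per side, so PNG is the safer choice for very tall results.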
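
For the first tip, here is one way the resizing could look. Treat it as a sketch rather than the only approach: resize_to_width is a helper name I invented, and LANCZOS is just one of the resampling filters Pillow offers.

```python
from PIL import Image


def resize_to_width(img, target_width):
    """Return a copy of img scaled to target_width, keeping its aspect ratio.

    resize_to_width is a hypothetical helper name, not a Pillow function.
    """
    if img.width == target_width:
        return img.copy()
    # Scale the height by the same factor as the width (at least 1 pixel).
    new_height = max(1, round(img.height * target_width / img.width))
    return img.resize((target_width, new_height), Image.LANCZOS)
```

For example, images = [resize_to_width(img, 800) for img in images] before the merge step would make every image 800 pixels wide (800 is an arbitrary choice).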

Full Script Example

Here’s the complete code:

import fitz # PyMuPDF
import io
from PIL import Image

pdf_path = "yourfile.pdf"
pdf_file = fitz.open(pdf_path)

images = []
for page_index in range(len(pdf_file)):
    page = pdf_file[page_index]
    image_list = page.get_images(full=True)
    for img in image_list:
        xref = img[0]
        base_image = pdf_file.extract_image(xref)
        image_bytes = base_image["image"]
        image = Image.open(io.BytesIO(image_bytes))
        images.append(image)

if images:
    width = max(img.width for img in images)
    total_height = sum(img.height for img in images)
    merged_image = Image.new("RGB", (width, total_height), (255, 255, 255))
    current_y = 0
    for img in images:
        merged_image.paste(img, (0, current_y))
        current_y += img.height
    merged_image.save("merged_images.jpg")
    print("All images merged successfully!")
else:
    print("No images found in the PDF.")
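
If a PDF holds a lot of images, the single merged file from the script above can get extremely tall. One option is to split the images into several output files, each capped at some height. The sketch below only does the grouping arithmetic on image heights; group_by_height is a placeholder name of mine, and the height cap is whatever suits your use case.

```python
def group_by_height(heights, max_height):
    """Split image heights into consecutive groups whose summed
    height stays within max_height.

    An image taller than max_height gets a group of its own.
    Returns lists of indexes into `heights`.
    (group_by_height is a hypothetical helper.)
    """
    groups = []
    current = []
    current_height = 0
    for index, height in enumerate(heights):
        # Start a new group when adding this image would exceed the cap.
        if current and current_height + height > max_height:
            groups.append(current)
            current = []
            current_height = 0
        current.append(index)
        current_height += height
    if current:
        groups.append(current)
    return groups
```

Calling group_by_height([img.height for img in images], 20000) gives you index groups that could each be merged and saved separately, for instance as merged_1.jpg, merged_2.jpg, and so on (20000 is only an example cap).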

Conclusion

With just a few lines of Python code, you can extract and merge all images from a PDF. This method is especially useful for students, educators, designers, and anyone who works with PDFs regularly.

Try it out, and let me know in the comments how it worked for you! If you have any questions, feel free to ask. Happy coding!

If you found this helpful, don’t forget to subscribe for more Python tricks and automation guides!

Path Walla
