Ever wondered how to quickly collect all image links from a website—maybe for inspiration, research, or to automate a repetitive task? With Python, this is easier than you think! In this blog post, I’ll show you how to create a script that fetches all image URLs from any website, prints the first 200 links, and saves every link into a handy CSV file.
Prerequisites
Before we dive in, make sure you have these Python libraries installed:
- requests
- BeautifulSoup (from bs4)
- csv (bundled with Python's standard library, so no installation needed)

You can install the first two with:
pip install requests beautifulsoup4
Step-by-Step Guide
1. Import the Necessary Libraries
import requests
from bs4 import BeautifulSoup
import csv
from urllib.parse import urljoin
2. Fetch and Parse the Website
def get_image_links(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    image_tags = soup.find_all('img')
    image_links = []
    for img in image_tags:
        img_url = img.get('src')
        if img_url:
            # Handle relative URLs
            img_url = urljoin(url, img_url)
            image_links.append(img_url)
    return image_links
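Some sites return an error page or block requests that don't look like a browser, and lazily loaded images often keep their real URL in a data-src attribute instead of src. Here's a rough sketch of a more defensive fetch; the User-Agent string, the timeout value, and the data-src fallback are my own assumptions and may need adjusting per site:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def get_image_links_safe(url):
    # Browser-like header and timeout are assumptions; tune them per site
    headers = {'User-Agent': 'Mozilla/5.0 (compatible; image-link-collector)'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    soup = BeautifulSoup(response.text, 'html.parser')
    image_links = []
    for img in soup.find_all('img'):
        # Fall back to data-src for lazily loaded images
        img_url = img.get('src') or img.get('data-src')
        if img_url:
            image_links.append(urljoin(url, img_url))
    return image_links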
3. Print the First 200 Image Links
def print_first_200(image_links):
    print("First 200 Image Links:")
    for idx, link in enumerate(image_links[:200], 1):
        print(f"{idx}: {link}")
4. Save All Image Links to a CSV File
def save_links_to_csv(image_links, filename='image_links.csv'):
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Image URL'])
        for link in image_links:
            writer.writerow([link])
    print(f"\nAll image links saved to {filename}")
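If you want to double-check the output, you can read the file back with the same csv module. This small sketch just assumes the default image_links.csv filename used above:

import csv

def preview_csv(filename='image_links.csv', limit=5):
    # Print the header row plus the first few saved links
    with open(filename, newline='', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        for idx, row in enumerate(reader):
            print(row)
            if idx >= limit:
                break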
5. Putting It All Together
if __name__ == "__main__":
    url = input("Enter the website URL: ")
    image_links = get_image_links(url)
    print(f"\nTotal images found: {len(image_links)}\n")
    print_first_200(image_links)
    save_links_to_csv(image_links)
The Complete Script
Here is the whole script in one place, ready to copy and run:

import requests
from bs4 import BeautifulSoup
import csv
from urllib.parse import urljoin

def get_image_links(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    image_tags = soup.find_all('img')
    image_links = []
    for img in image_tags:
        img_url = img.get('src')
        if img_url:
            img_url = urljoin(url, img_url)
            image_links.append(img_url)
    return image_links

def print_first_200(image_links):
    print("First 200 Image Links:")
    for idx, link in enumerate(image_links[:200], 1):
        print(f"{idx}: {link}")

def save_links_to_csv(image_links, filename='image_links.csv'):
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Image URL'])
        for link in image_links:
            writer.writerow([link])
    print(f"\nAll image links saved to {filename}")

if __name__ == "__main__":
    url = input("Enter the website URL: ")
    image_links = get_image_links(url)
    print(f"\nTotal images found: {len(image_links)}\n")
    print_first_200(image_links)
    save_links_to_csv(image_links)
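If you'd rather pass the URL on the command line than type it in each time, you can swap the main block for this small variation. This is an optional tweak of my own, not part of the original script:

import sys

if __name__ == "__main__":
    # Accept the URL as a command-line argument, falling back to a prompt
    url = sys.argv[1] if len(sys.argv) > 1 else input("Enter the website URL: ")
    image_links = get_image_links(url)
    print(f"\nTotal images found: {len(image_links)}\n")
    print_first_200(image_links)
    save_links_to_csv(image_links)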
What You Can Do Next
- Download images: Extend the script to download each image (see the sketch after this list).
- Filter by image type: Only grab .jpg or .png images.
- Handle pagination: Scrape multiple pages.
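As a starting point for the first two ideas, here's a rough sketch. The download_images name, the folder layout, and the extension filter are my own assumptions, not part of the original script:

import os
import requests

def download_images(image_links, folder='images', allowed_exts=('.jpg', '.jpeg', '.png')):
    # Keep only .jpg/.png links and save each file locally
    os.makedirs(folder, exist_ok=True)
    for idx, link in enumerate(image_links, 1):
        if not link.lower().split('?')[0].endswith(allowed_exts):
            continue  # filter by image type
        try:
            response = requests.get(link, timeout=10)
            response.raise_for_status()
        except requests.RequestException as err:
            print(f"Skipping {link}: {err}")
            continue
        ext = os.path.splitext(link.split('?')[0])[1] or '.jpg'
        path = os.path.join(folder, f"image_{idx}{ext}")
        with open(path, 'wb') as f:
            f.write(response.content)
        print(f"Saved {path}")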
Conclusion
This Python project demonstrates how easy and powerful web scraping can be for extracting image data from any website. Whether you’re a beginner or a seasoned coder, such projects are great for learning and automation. Happy coding!