Ever wondered how to quickly collect all image links from a website—maybe for inspiration, research, or to automate a repetitive task? With Python, this is easier than you think! In this blog post, I’ll show you how to create a script that fetches all image URLs from any website, prints the first 200 links, and saves every link into a handy CSV file.
Prerequisites
Before we dive in, make sure you have these Python libraries installed:
- requests
- BeautifulSoup (from bs4)
- csv (bundled with Python's standard library, so no installation needed)

You can install the first two with:
pip install requests beautifulsoup4
Step-by-Step Guide
1. Import the Necessary Libraries
import requests
from bs4 import BeautifulSoup
import csv
from urllib.parse import urljoin
2. Fetch and Parse the Website
def get_image_links(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    image_tags = soup.find_all('img')
    image_links = []
    for img in image_tags:
        img_url = img.get('src')
        if img_url:
            # Handle relative URLs
            img_url = urljoin(url, img_url)
            image_links.append(img_url)
    return image_links
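Some sites return an error page or block requests that don't look like a browser, and lazily loaded images often keep their real URL in a data-src attribute instead of src. Here's a rough sketch of a more defensive fetch; the User-Agent string, the timeout value, and the data-src fallback are my own assumptions and may need adjusting per site:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def get_image_links_safe(url):
    # Browser-like header and timeout are assumptions; tune them per site
    headers = {'User-Agent': 'Mozilla/5.0 (compatible; image-link-collector)'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    soup = BeautifulSoup(response.text, 'html.parser')
    image_links = []
    for img in soup.find_all('img'):
        # Fall back to data-src for lazily loaded images
        img_url = img.get('src') or img.get('data-src')
        if img_url:
            image_links.append(urljoin(url, img_url))
    return image_links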
3. Print the First 200 Image Links
def print_first_200(image_links):
    print("First 200 Image Links:")
    for idx, link in enumerate(image_links[:200], 1):
        print(f"{idx}: {link}")
4. Save All Image Links to a CSV File
def save_links_to_csv(image_links, filename='image_links.csv'):
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Image URL'])
        for link in image_links:
            writer.writerow([link])
    print(f"\nAll image links saved to {filename}")
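If you want to double-check the output, you can read the file back with the same csv module. This small sketch just assumes the default image_links.csv filename used above:

import csv

def preview_csv(filename='image_links.csv', limit=5):
    # Print the header row plus the first few saved links
    with open(filename, newline='', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        for idx, row in enumerate(reader):
            print(row)
            if idx >= limit:
                break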
5. Putting It All Together
if __name__ == "__main__":
    url = input("Enter the website URL: ")
    image_links = get_image_links(url)
    print(f"\nTotal images found: {len(image_links)}\n")
    print_first_200(image_links)
    save_links_to_csv(image_links)
The Complete Script
Here is the whole script in one place, ready to copy and run:

import requests
from bs4 import BeautifulSoup
import csv
from urllib.parse import urljoin

def get_image_links(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    image_tags = soup.find_all('img')
    image_links = []
    for img in image_tags:
        img_url = img.get('src')
        if img_url:
            img_url = urljoin(url, img_url)
            image_links.append(img_url)
    return image_links

def print_first_200(image_links):
    print("First 200 Image Links:")
    for idx, link in enumerate(image_links[:200], 1):
        print(f"{idx}: {link}")

def save_links_to_csv(image_links, filename='image_links.csv'):
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Image URL'])
        for link in image_links:
            writer.writerow([link])
    print(f"\nAll image links saved to {filename}")

if __name__ == "__main__":
    url = input("Enter the website URL: ")
    image_links = get_image_links(url)
    print(f"\nTotal images found: {len(image_links)}\n")
    print_first_200(image_links)
    save_links_to_csv(image_links)
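If you'd rather pass the URL on the command line than type it in each time, you can swap the main block for this small variation. This is an optional tweak of my own, not part of the original script:

import sys

if __name__ == "__main__":
    # Accept the URL as a command-line argument, falling back to a prompt
    url = sys.argv[1] if len(sys.argv) > 1 else input("Enter the website URL: ")
    image_links = get_image_links(url)
    print(f"\nTotal images found: {len(image_links)}\n")
    print_first_200(image_links)
    save_links_to_csv(image_links)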
What You Can Do Next
- Download images: Extend the script to download each image (see the sketch after this list).
- Filter by image type: Only grab .jpg or .png images.
- Handle pagination: Scrape multiple pages.
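As a starting point for the first two ideas, here's a rough sketch. The download_images name, the folder layout, and the extension filter are my own assumptions, not part of the original script:

import os
import requests

def download_images(image_links, folder='images', allowed_exts=('.jpg', '.jpeg', '.png')):
    # Keep only .jpg/.png links and save each file locally
    os.makedirs(folder, exist_ok=True)
    for idx, link in enumerate(image_links, 1):
        if not link.lower().split('?')[0].endswith(allowed_exts):
            continue  # filter by image type
        try:
            response = requests.get(link, timeout=10)
            response.raise_for_status()
        except requests.RequestException as err:
            print(f"Skipping {link}: {err}")
            continue
        ext = os.path.splitext(link.split('?')[0])[1] or '.jpg'
        path = os.path.join(folder, f"image_{idx}{ext}")
        with open(path, 'wb') as f:
            f.write(response.content)
        print(f"Saved {path}")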
Conclusion
This Python project demonstrates how easy and powerful web scraping can be for extracting image data from any website. Whether you’re a beginner or a seasoned coder, such projects are great for learning and automation. Happy coding!