Week Three Highlights
We had a long weekend, so Monday was a change from the usual routine, and I got to sleep in!
Tuesday and Wednesday - I was out of the office due to health reasons.
Thursday - Worked on collecting data from the Iowa Grocers Excel file for prices on eggs, bacon, and heirloom tomatoes. Finished the Iowa cities starting with the letter C, then moved on to the cities starting with I, and completed the cities from O through R as well as W (a quick pandas sketch of that kind of letter-by-letter filtering is below).
I also finished the DataCamp Web Scraping course.
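
Since the grocer list is being worked through city by city, a small pandas helper makes it easy to pull out one letter's batch at a time. This is only an illustrative sketch: the file name Iowa_Grocers.xlsx and the City column are assumptions, not the real spreadsheet layout.

import pandas as pd

# Hypothetical sketch: load the grocer list and pull the cities for one
# letter at a time ("Iowa_Grocers.xlsx" and the "City" column are assumed
# names, not the actual spreadsheet layout)
grocers = pd.read_excel("Iowa_Grocers.xlsx")

def cities_starting_with(letter):
    # Unique city names beginning with the given letter, e.g. the "C" batch
    mask = grocers["City"].str.upper().str.startswith(letter.upper())
    return sorted(grocers.loc[mask, "City"].unique())

print(cities_starting_with("C"))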

Friday - Started developing a web scraping script in Python using BeautifulSoup, requests, and pandas.
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Read the input Excel file
input_file = "grocery_websites.xlsx"
df = pd.read_excel(input_file)

# Collect results in a plain list; DataFrame.append was deprecated and then
# removed in pandas 2.0, so the DataFrame is built once at the end instead
results = []

# Scrape the price for each website
for index, row in df.iterrows():
    website = row["Website"]
    product = row["Product"]
    url = row["URL"]
    try:
        # Send a GET request to the website (the timeout keeps one dead
        # site from hanging the whole run)
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.content, "html.parser")

        # Find the price element on the page
        price_element = soup.find("span", class_="price-amount")
        if price_element:
            price = price_element.text.strip()
        else:
            price = "Not found"
    except requests.exceptions.RequestException as e:
        print(f"Error scraping {website}: {e}")
        price = "Error"
    results.append({"Website": website, "Product": product, "Price": price})

# Write the results to a new Excel file
result_df = pd.DataFrame(results, columns=["Website", "Product", "Price"])
output_file = "grocery_prices.xlsx"
result_df.to_excel(output_file, index=False)

This is still a work in progress, so it isn't finished yet, but I'm working on making the web scraping as autonomous as possible.
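
One idea for that autonomy goal is to stop hard-coding a single span.price-amount selector, since every grocer's site marks up prices differently. The sketch below stores a per-site CSS selector in the input spreadsheet instead; the Selector column, the User-Agent string, and the one-second delay are all assumptions for illustration, not part of the current script.

import time

import pandas as pd
import requests
from bs4 import BeautifulSoup

df = pd.read_excel("grocery_websites.xlsx")

results = []
for _, row in df.iterrows():
    # Hypothetical "Selector" column: a per-site CSS selector, falling back
    # to the span.price-amount guess when the cell is blank or missing
    selector = row.get("Selector")
    if not isinstance(selector, str) or not selector.strip():
        selector = "span.price-amount"
    try:
        response = requests.get(row["URL"], timeout=10,
                                headers={"User-Agent": "grocery-price-survey"})
        response.raise_for_status()
        soup = BeautifulSoup(response.content, "html.parser")
        # select_one takes any CSS selector, so each site can differ
        element = soup.select_one(selector)
        price = element.text.strip() if element else "Not found"
    except requests.exceptions.RequestException as e:
        print(f"Error scraping {row['Website']}: {e}")
        price = "Error"
    results.append({"Website": row["Website"], "Product": row["Product"],
                    "Price": price})
    time.sleep(1)  # pause between requests to be polite to the sites

pd.DataFrame(results).to_excel("grocery_prices.xlsx", index=False)

With the selectors living in the spreadsheet, adding a new grocer becomes a data change rather than a code change, which is what I mean by making the scraper more autonomous.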