Beautiful Soup Selector Returns An Empty List

October 23, 2024

So I'm doing the Automate the Boring Stuff course and I'm trying to scrape the Amazon price for the Automate the Boring Stuff book, but it returns an empty string no matter wha…

Solution 1:

You need to change the parser to lxml and send a browser-like User-Agent header, for example headers = {'user-agent': 'Mozilla/5.0'}.

import bs4
import requests


def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0'}  # present the request as coming from a web browser rather than a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()  # raise an exception on 4xx/5xx responses

    soup = bs4.BeautifulSoup(res.text, 'lxml')
    # select_one returns the first matching element, or None if nothing matches
    elems = soup.select_one('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems.text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

If you want to use select instead, note that it returns a list, so take the first element:

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0'}  # present the request as coming from a web browser rather than a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

Alternatively, try it with a full browser User-Agent string:

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # present the request as coming from a web browser rather than a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)
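
The versions above assume the selector always matches. When the page comes back without that element, select returns an empty list (so elems[0] raises IndexError) and select_one returns None (so elems.text raises AttributeError). Here is a minimal sketch of a guard, assuming a hypothetical helper name select_text and a made-up HTML sample. It uses the built-in 'html.parser' only so it runs without lxml installed.

```python
import bs4


def select_text(html, selector):
    """Return the stripped text of the first match, or None if nothing matches.

    select_text is a hypothetical helper for illustration, not part of bs4.
    """
    soup = bs4.BeautifulSoup(html, 'html.parser')
    elems = soup.select(selector)  # always a list; empty when there is no match
    if not elems:
        return None
    return elems[0].get_text(strip=True)


# Made-up markup and price, only to exercise both branches.
sample = '<div id="price"><span class="a-color-price"> $23.99 </span></div>'

print(select_text(sample, '#price > span.a-color-price'))  # matching selector
print(select_text(sample, '#mediaNoAccordion'))            # selector absent from the sample
```

Returning None instead of indexing blindly lets the caller tell "the selector did not match" apart from a real price. That is usually the first thing to check when the list comes back empty.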

Solution 2:

Your request can trigger a 503 (Service Unavailable) response from Amazon, probably because of Amazon's anti-scraping measures, so you may want to consider some other means of getting the price.

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # present the request as coming from a web browser rather than a bot

productUrl = 'https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1'

res = requests.get(productUrl, headers=headers)

print(res)

Output:

<Response [503]>
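
A response like the one above can be detected before any parsing is attempted. The sketch below is one possible check, not a documented Amazon behaviour: the function name looks_blocked, the choice of status codes, and the marker phrases are all assumptions made for illustration. It takes plain values, so with requests you would pass res.status_code and res.text.

```python
def looks_blocked(status_code, body):
    """Heuristically decide whether a response is a block page rather than the product page.

    Hypothetical helper: the status codes and marker phrases are assumptions.
    """
    if status_code in (429, 503):  # rate limited / service unavailable
        return True
    markers = ('captcha', 'robot check')  # assumed phrases, adjust to what you actually see
    lowered = body.lower()
    return any(marker in lowered for marker in markers)


print(looks_blocked(503, ''))                                # the case shown above
print(looks_blocked(200, '<html><body>ok</body></html>'))    # ordinary page
```

Checking this first makes it clear whether an empty selector result comes from a wrong selector or from a page that never contained the product markup.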
