For the following piece of HTML code, I used BeautifulSoup to capture the table information:
Code
Solution 1:
Sorry, I hadn't read your question carefully enough. You're right, the problem is the empty <td/> tags. Just adjust your generator to only include cells with text:
comments = [td.get_text() for td in table.find_all('td') if td.text]
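As a quick check, here is a minimal sketch of that filter on a made-up table. The HTML is hypothetical, not the markup from the question. It also uses `get_text(strip=True)` in the condition so that whitespace-only cells are skipped too, which is a small variation on the line above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the question's HTML is not reproduced here.
html = """
<table>
  <tr><td>Name</td><td>Comment</td></tr>
  <tr><td>Alice</td><td></td></tr>
  <tr><td>Bob</td><td> </td></tr>
  <tr><td>Carol</td><td>Looks good</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Keep only cells whose stripped text is non-empty
comments = [
    td.get_text(strip=True)
    for td in table.find_all("td")
    if td.get_text(strip=True)
]
print(comments)
```

Note that the flattened list no longer records which rows had a missing comment, so this only fits cases where column alignment does not matter.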
EDIT: I doubt this is the most efficient way to do it, but this will only include tds that have either text or a corresponding td in the first row.
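# Cells of the first row, treated as the column headers
ths = table.tr.find_all('td')
# Cell count of the second row; find_next_sibling('tr') skips whitespace text nodes
tds_in_row = len(table.tr.find_next_sibling('tr').find_all('td'))
tds = [
    td.get_text()
    for i, td in enumerate(table.find_all('td'))
    if len(ths) > (i + 1) % tds_in_row or td.text
]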
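If the goal is to keep columns aligned, walking the table row by row may be easier to reason about than index arithmetic. The sketch below is a row-wise alternative rather than the approach above, and it runs on hypothetical markup. It assumes the first row holds the headers, and it keeps empty cells as `None` so each record still has one entry per column.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the question's HTML is not reproduced here.
html = """
<table>
  <tr><td>Name</td><td>Comment</td></tr>
  <tr><td>Alice</td><td></td></tr>
  <tr><td>Bob</td><td>Looks good</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Assumption: the first <tr> is the header row, the rest are data rows
header_row, *body_rows = table.find_all("tr")
headers = [td.get_text(strip=True) for td in header_row.find_all("td")]

records = []
for tr in body_rows:
    # An empty cell becomes None instead of being dropped, so positions stay aligned
    cells = [td.get_text(strip=True) or None for td in tr.find_all("td")]
    records.append(dict(zip(headers, cells)))

print(records)
```

Because each row is handled on its own, a missing value shows up as `None` under the right header instead of shifting the cells that follow it.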