How to scrape an object by CSS class with Python?

Question:

I'm trying to get a number from a website by its CSS class. The code below prints None.

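import requests
from bs4 import BeautifulSoup

subreddit = "python"  # example value; any subreddit name works here

url = "https://www.reddit.com/r/" + subreddit
content = requests.get(url)
soup = BeautifulSoup(content.text, 'html.parser')

# find() returns None when no matching element exists in the fetched HTML
active_users = soup.find("div", {"class":"_3XFx6CfPlg-4Usgxm0gK8R"})
print(active_users)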

The class I'm looking for belongs to the element that shows the number of currently active users. How do I make this work?


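Answer 1:

You can use Reddit's JSON API to get the active user count, subscribers, etc. This avoids depending on generated class names such as _3XFx6CfPlg-4Usgxm0gK8R, which may change or may be absent from the HTML that requests receives.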

For example:

import json
import requests


subreddit = 'python'

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
data = requests.get('https://www.reddit.com/r/{}/about.json'.format(subreddit), headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

print('Subscribers       :', data['data']['subscribers'])
print('Active user count :', data['data']['active_user_count'])

Prints:

Subscribers       : 604566
Active user count : 2719
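
The same request can be wrapped with a timeout and a status check, so that a blocked or failed request raises an error instead of producing a confusing KeyError. This is only a minimal sketch: the helper names extract_counts and fetch_about are hypothetical, and the field names are the ones used in the answer above.

```python
def extract_counts(payload):
    """Return (subscribers, active_user_count) from an about.json payload.

    Missing keys yield None instead of raising KeyError.
    """
    data = payload.get("data", {})
    return data.get("subscribers"), data.get("active_user_count")


def fetch_about(subreddit, timeout=10):
    """Fetch /r/<subreddit>/about.json and return the decoded JSON (hypothetical helper)."""
    # Imported here so extract_counts() can be used without requests installed.
    import requests

    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) "
                      "Gecko/20100101 Firefox/77.0"
    }
    url = "https://www.reddit.com/r/{}/about.json".format(subreddit)
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.json()
```

Usage would look like subscribers, active = extract_counts(fetch_about('python')).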

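Answer 2:

Try soup.select() instead. It takes a CSS selector string, not an attribute dictionary. For example:

import requests
import bs4

content = requests.get('https://getbootstrap.com/')
soup = bs4.BeautifulSoup(content.text, 'html.parser')

# "div.row" matches every <div> whose class list contains "row"
rows = soup.select("div.row")

for elem in rows:
    print(elem)

Note that this only finds elements that are present in the HTML returned by requests.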

I hope it helps!
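
To compare find(), find_all() and select() without a network request, here is a small sketch on an inline HTML snippet. The markup is made up for illustration; only the class name from the question is taken from the original post.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only.
html = """
<div class="row">first</div>
<div class="row extra">second</div>
<div class="col">third</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, or None when nothing matches.
by_find = soup.find("div", {"class": "row"})

# find_all() with class_ matches any element whose class list contains "row".
by_find_all = soup.find_all("div", class_="row")

# select() takes a CSS selector string and returns a list of matches.
by_select = soup.select("div.row")

# A class that is not in the parsed HTML gives None, as in the question.
missing = soup.find("div", {"class": "_3XFx6CfPlg-4Usgxm0gK8R"})

row_texts = [el.get_text() for el in by_select]
print(by_find.get_text(), row_texts, missing)
```

If find() returns None on a real page, it can help to check whether the class name appears anywhere in content.text before changing the selector.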

  • Posted on 2020-06-27 21:47