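Python on the Command Line: Looking Up the 7-Day Weather Forecast for Any City in China
Why scrape weather data? 1. It is good practice. 2. Once the itchat library is used to implement auto-reply, the weather lookup can be integrated with it to provide self-service weather queries over WeChat!
First, the usual approach: check whether we can capture the site's XHR traffic to find a general-purpose API, and then simply query that API. Searching Baidu for the keyword "weather" (天气) or "Nanjing weather" (南京天气) brings up the corresponding page: http://www.weather.com.cn/weather/101190101.shtml. Open it and you can see that city's weather for the coming week.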

Switching to another city, Shanghai, the browser address changes to http://www.weather.com.cn/weather/101020100.shtml. So the number 101020100 is the code for that city. Next, let's look at the page's XHR requests to see whether the data can be captured directly.
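The URL pattern observed above can be captured in a small helper. This is only a sketch: the names `WEATHER_URL` and `build_weather_url` are made up for illustration, and the two city codes are the ones quoted in the text.

```python
# URL template inferred from the two page addresses quoted in the text.
WEATHER_URL = 'http://www.weather.com.cn/weather/%s.shtml'


def build_weather_url(city_code):
    """Return the 7-day forecast page URL for a city code such as '101020100'."""
    return WEATHER_URL % city_code


print(build_weather_url('101190101'))  # Nanjing
print(build_weather_url('101020100'))  # Shanghai
```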
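In Chrome, open Inspect > Network > XHR and refresh the page: there are no XHR requests at all. It seems the weather content and city codes we need are probably embedded in the page after being processed by JS and the server... So this attempt failed!
Looking at the JS requests instead, there are far too many to go through one by one! Fortunately, someone online has already recorded the city code for every city. I copied the list and stored it in a local MySQL database. The data is on Baidu Cloud for anyone who needs it; download it and run the SQL to create the table and load the data in one step. https://pan.baidu.com/s/1kXaN2Aj password: 8y6n.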
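To make the lookup concrete, here is a minimal sketch of the city-name-to-code query. It uses Python's built-in sqlite3 with an in-memory table instead of MySQL, purely so it runs without a database server. The table and column names (`cityWeather`, `cityName`, `cityCode`) are taken from the query in this article's code; the actual schema of the downloadable SQL file may differ. The helper name `lookup_city_code` is hypothetical.

```python
import sqlite3

# In-memory stand-in for the MySQL table (an assumption made for portability).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE cityWeather (cityName TEXT PRIMARY KEY, cityCode TEXT)')
conn.executemany(
    'INSERT INTO cityWeather (cityName, cityCode) VALUES (?, ?)',
    [('南京', '101190101'), ('上海', '101020100')],  # Nanjing, Shanghai
)


def lookup_city_code(connection, city_name):
    """Return the code for city_name, or None if the city is not in the table."""
    # The ? placeholder lets the driver escape the value (no string formatting).
    row = connection.execute(
        'SELECT cityCode FROM cityWeather WHERE cityName = ?', (city_name,)
    ).fetchone()
    return row[0] if row else None


print(lookup_city_code(conn, '上海'))
print(lookup_city_code(conn, 'unknown city'))
```

With PyMySQL the same idea applies, except that the placeholder is `%s` rather than `?`.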
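With the preparation done, the plan is clear. We now have every city and its code, so to look up a city's weather we only need to enter the city name, fetch the matching city code from MySQL (for example 101020100), and build the corresponding URL, http://www.weather.com.cn/weather/101020100.shtml, to see that city's 7-day forecast. However, the page offers neither XHR requests nor ready-to-use JSON data, so we have to do the work ourselves: analyze the page content and write regular expressions, BeautifulSoup, or XPath code to extract the information. I won't go into the details here; see the code: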

    import pymysql
    import requests
    from bs4 import BeautifulSoup


    class SearchWeather():
        def __init__(self):
            self.HEADERS = {
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                              '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
            self.CONNECTION = pymysql.connect(host='localhost', user='root', password='xxx',
                                              db='xxx', charset='utf8',
                                              cursorclass=pymysql.cursors.DictCursor)

        def getcityCode(self, cityName):
            # Parameterized query: the driver escapes cityName, avoiding SQL injection.
            SQL = "SELECT cityCode FROM cityWeather WHERE cityName=%s"
            try:
                with self.CONNECTION.cursor() as cursor:
                    cursor.execute(SQL, (cityName,))
                    result = cursor.fetchone()
                    # fetchone() returns None when the city is not in the table.
                    return result['cityCode'] if result else None
            except Exception as e:
                print(repr(e))
                return None

        def getWeather(self, cityCode, cityname):
            url = 'http://www.weather.com.cn/weather/%s.shtml' % cityCode
            html = requests.get(url, headers=self.HEADERS, timeout=10)
            html.encoding = 'utf-8'
            soup = BeautifulSoup(html.text, 'lxml')
            weather = "Date  Weather  【Temperature】  Wind direction and force\n"
            # Each <li> under <div id="7d"> holds one day's forecast.
            for item in soup.find("div", {'id': '7d'}).find('ul').find_all('li'):
                date, detail = item.find('h1').string, item.find_all('p')
                title = detail[0].string
                templow = detail[1].find("i").string
                # The <span> with the high temperature may be absent; that entry is treated as today's.
                temphigh = detail[1].find('span').string if detail[1].find('span') else ''
                wind, direction = detail[2].find('span')['title'], detail[2].find('i').string
                if temphigh == '':
                    weather += 'Hello, 【%s】 today (daytime): 【%s】, temperature: 【%s】, %s: 【%s】\n' % (
                        cityname, title, templow, wind, direction)
                else:
                    weather += (date + title + "【" + templow + "~" + temphigh + '°C】' + wind + direction + "\n")
            return weather

        def main(self, city):
            cityCode = self.getcityCode(city)
            if cityCode is None:
                print('No city code found for: %s' % city)
                return
            detail = self.getWeather(cityCode, city)
            print(detail)


    if __name__ == "__main__":
        weather = SearchWeather()
        weather.main(city=input('Please enter the city name: '))
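
The listing above extracts the fields with BeautifulSoup. Since regular expressions were mentioned as an alternative, here is a rough sketch of the same extraction using only the standard `re` module. The HTML fragment is a hypothetical example modeled on the elements the listing reads (an `h1` plus three `p` tags). The class names `wea`, `tem`, and `win`, and the names `DAY_RE` and `parse_day`, are assumptions for illustration, and the real page markup may differ.

```python
import re

# Hypothetical fragment for one day; class names and values are assumptions.
SAMPLE_LI = """
<li>
<h1>21 (Tomorrow)</h1>
<p title="Cloudy" class="wea">Cloudy</p>
<p class="tem"><span>30</span>/<i>22℃</i></p>
<p class="win"><em><span title="Southeast wind" class="SE"></span></em><i>Level 3-4</i></p>
</li>
"""

DAY_RE = re.compile(
    r'<h1>(?P<date>.*?)</h1>.*?'
    r'<p[^>]*class="wea"[^>]*>(?P<title>.*?)</p>.*?'
    # The <span> holding the high temperature is optional.
    r'<p[^>]*class="tem"[^>]*>(?:<span>(?P<high>.*?)</span>/)?<i>(?P<low>.*?)</i>.*?'
    r'<span[^>]*title="(?P<wind>[^"]*)".*?<i>(?P<level>.*?)</i>',
    re.S,  # let "." match newlines so the pattern can span several lines
)


def parse_day(li_html):
    """Return a dict of forecast fields for one <li>, or None if it does not match."""
    match = DAY_RE.search(li_html)
    return match.groupdict() if match else None


print(parse_day(SAMPLE_LI))
```

A pattern like this is brittle against markup changes, which is one reason a parser such as BeautifulSoup is usually the more comfortable choice.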
 
Running the code produces the following result:

[Screenshot: command-line output of the program]