Author: ntasop (kuli)
Board: C_Sharp
Title: [Question] Web crawler problem
Time: Sat Nov 11 12:28:29 2017
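I'm fetching web page data with a POST request. With the same approach (request
headers plus a JSON payload), the Python version retrieves the data correctly,
but the C# version does not. Is there anything else I need to watch out for?
Thank you very much.
C# version: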
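// Required namespaces (JObject comes from the Newtonsoft.Json package)
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using Newtonsoft.Json.Linq;

HttpClient httpClient = new HttpClient();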
httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
httpClient.DefaultRequestHeaders.Add("referer", "https://shopee.tw/jeanch911");
httpClient.DefaultRequestHeaders.Add("user-agent", "Chrome/61.0.3163.100 Safari/537.36");
httpClient.DefaultRequestHeaders.Add("x-csrftoken", "iGwDsBPnVwMlr68yxp23vxYB3N53gMRH");
httpClient.DefaultRequestHeaders.Add("cookie", "(omitted, too long)....");
httpClient.BaseAddress = new Uri("https://shopee.tw/");
string str = "{\"shop_ids\":[6550848]}";
JObject jsonObject = JObject.Parse(str);
var content = new StringContent(jsonObject.ToString(), Encoding.UTF8, "application/json");
var response = await httpClient.PostAsync("api/v1/shops", content);
System.Console.WriteLine(await response.Content.ReadAsStringAsync());
Python version:
import json
import requests

headers = {
'Cookie':'.......(omitted, too long)',
'Referer':'https://shopee.tw/trc.tw',
'User-Agent':'Chrome/61.0.3163.100 Safari/537.36',
'X-CSRFToken':'3EwwiZKHrh74ZjKB1oHycwOgS6COQNon'}
jd = json.loads('{"shop_ids":[24550283]}')
response = requests.post('https://shopee.tw/api/v1/shops/', json=jd, headers=headers)
print(response.text.encode('utf-8').decode('unicode_escape'))
--
※ Origin: PTT (ptt.cc), From: 175.180.200.197
※ Article URL: https://webptt.com/cn.aspx?n=bbs/C_Sharp/M.1510374513.A.15E.html
1F: Push vi000246: Capture the packets and see how the Python and C# requests differ 11/11 13:22
2F: → ntasop: vi000246, I haven't learned this area yet. How do I capture packets? 11/11 16:52
3F: → james732: Probably with a tool like Wireshark? 11/11 18:15
4F: Push name2name2: Try httpClient.PostAsJsonAsync 11/12 10:49
5F: → vi000246: Chrome's F12 DevTools, Fiddler, Wireshark 11/12 20:08
6F: Push groovy2016: Firebug, Postman 11/13 01:44
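
Once both requests have been captured with one of the tools mentioned in the replies, the comparison itself can be sketched roughly as follows. This is only a minimal illustration: the helper name `describe_differences` and the dictionary layout are made up for this example, and the two sample requests are hand-transcribed from the snippets in the post rather than taken from a real capture. The cookie header is left out because the post elides it, and headers that the HTTP libraries add on their own are not shown.

```python
import json


def describe_differences(a, b):
    """Return a list of readable differences between two request descriptions.

    Each request is a dict with 'method', 'url', 'headers' (dict) and 'body' (str).
    This layout is a hypothetical one chosen for the example.
    """
    diffs = []
    for field in ("method", "url"):
        if a[field] != b[field]:
            diffs.append(f"{field}: {a[field]!r} != {b[field]!r}")

    # HTTP header names are case-insensitive, so lower-case them before comparing.
    ha = {k.lower(): v for k, v in a["headers"].items()}
    hb = {k.lower(): v for k, v in b["headers"].items()}
    for name in sorted(set(ha) | set(hb)):
        if name not in ha:
            diffs.append(f"header {name}: only in second request")
        elif name not in hb:
            diffs.append(f"header {name}: only in first request")
        elif ha[name] != hb[name]:
            diffs.append(f"header {name}: {ha[name]!r} != {hb[name]!r}")

    # Compare JSON bodies by parsed value so whitespace differences are ignored.
    if json.loads(a["body"]) != json.loads(b["body"]):
        diffs.append(f"body: {a['body']!r} != {b['body']!r}")
    return diffs


# Transcribed by hand from the C# snippet in the post (not a real capture).
csharp_request = {
    "method": "POST",
    "url": "https://shopee.tw/api/v1/shops",
    "headers": {
        "accept": "application/json",
        "referer": "https://shopee.tw/jeanch911",
        "user-agent": "Chrome/61.0.3163.100 Safari/537.36",
        "x-csrftoken": "iGwDsBPnVwMlr68yxp23vxYB3N53gMRH",
    },
    "body": '{"shop_ids":[6550848]}',
}

# Transcribed by hand from the Python snippet in the post (not a real capture).
python_request = {
    "method": "POST",
    "url": "https://shopee.tw/api/v1/shops/",
    "headers": {
        "Referer": "https://shopee.tw/trc.tw",
        "User-Agent": "Chrome/61.0.3163.100 Safari/537.36",
        "X-CSRFToken": "3EwwiZKHrh74ZjKB1oHycwOgS6COQNon",
    },
    "body": '{"shop_ids":[24550283]}',
}

for line in describe_differences(csharp_request, python_request):
    print(line)
```

Even on the code as posted, this flags the trailing slash in the URL, the extra Accept header on the C# side, and the different referer, CSRF token, and shop id. With real captures, the cookie and any library-added headers such as Content-Type would be compared the same way.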