python - Retrieving and scraping data with R SOAP (SSOAP)
Published: 2020-12-20 13:32:34 | Category: Python | Source: compiled from the web
On the B-cycle page (www.bcycle.com/whowantsitmore.aspx), I am trying to scrape the locations and values of the votes.
The URL http://mapservices.bcycle.com/bcycleservice.asmx is a SOAP service. Based on the documentation, I believe I am calling it correctly, but I get an error while the input parameters are being parsed. Even calling a function that takes no parameters produces an error.

    # working with SOAP
    #install.packages("SSOAP", repos="http://www.omegahat.org/R", dependencies = T, type = "source")
    library(SSOAP)

    # Process the Web Service Definition Language (WSDL) file
    bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")

    # Generate functions based on definitions to access the different data sets
    bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]], def = bcycle.asmx, bcycle.asmx@name, verbose=T)

    # Get the data by requesting the number of cities, username and password (yes it is public)
    bcycle.interface@functions$getCities("10", "bcycle", "c@rbont0ns")
    # receive error: Error in as(parameters, "limit.userName.pw") :
    #   no method or default for coercing "character" to "limit.userName.pw"

The error comes from the following code inside the generated function:

    function(parameters = list(...), ... etc) {
      ...
      as(parameters, "limit.userName.pw")
      ...
    }

So I tried using the .SOAP function directly:

    # Using RCurl library
    library(RCurl)

    # set up curl options
    curl.opts <- curlOptions(
      verbose=T,
      header=T,
      cookie="ASP.NET_SessionId=dv25ws551nquoezqwq3iu545;__utma=27517231.373920809.1357910914.1357910914.1357912862.2;__utmc=27517231;__utmz=27517231.1357910914.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);__utmb=27517231.13.10.1357912862",
      httpheader = c('Content-Type' = 'text/xml; charset=utf-8', Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
      followLocation = TRUE,
      useragent = "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0")

    # Define header and submit request
    bcycle.server <- SOAPServer("http://mapservices.bcycle.com/bcycleservice.asmx")
    .SOAP(bcycle.server, "getCities", limit=250, userName="bCycle", pw="c@rbont0ns",
          action="http://bcycle.com/getCities", xmlns="http://bcycle.com/",
          .opts=curl.opts, .literal=T, nameSpaces = "1.2", elementFormQualified = T,
          .returnNodeName = 'getCitiesResponse', .soapHeader = NULL)

I managed to connect to their server, but received this error:
    System.Web.Services.Protocols.SoapException: Server did not recognize the value of HTTP Header SOAPAction: http://bcycle.com/getCities#getCities

These are the options I have tried so far, without success.

Using Python I was able to issue the getCities request, but received no response:

    import suds
    client = suds.client.Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
    print(client)  # prints WSDL info
    print(client.service.getCities(10, 'bcycle', 'c@rbont0ns'))  # prints nothing

I would really like to keep this focused on R, but Python may make it easier to see what the problem is.

Any ideas?

Solution
Try correcting the username (it is case-sensitive: "bCycle", not "bcycle") and naming the arguments explicitly:
    library(SSOAP)
    bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")
    bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]], def = bcycle.asmx, bcycle.asmx@name, verbose=T)

    # Name every argument so SSOAP can coerce them to the "limit.userName.pw" class
    out <- bcycle.interface@functions$getCities(limit="10", userName="bCycle", pw="c@rbont0ns")

    #> out[[1]]@
    #out[[1]]@zip               out[[1]]@state_name
    #out[[1]]@pop               out[[1]]@latitude
    #out[[1]]@ambassador_count  out[[1]]@longitude
    #out[[1]]@city_name

    out[[1]]@city_name
    #[1] "toledo"

The Python call also works with the corrected username:

    import suds
    client = suds.client.Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
    client.service.getCities(10, 'bCycle', 'c@rbont0ns')

    (ArrayOfCities){
       Cities[] =
          (Cities){
             zip = "43606"
             pop = 337362
             ambassador_count = 455261
             city_name = "toledo"
             state_name = "oh"
             latitude = 41.6743
             longitude = -83.6029
          },............
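The SoapException in the question shows the server rejecting a SOAPAction value of `http://bcycle.com/getCities#getCities`, which suggests it expects the plain action URI. To see what such a request looks like on the wire, here is a minimal sketch that builds the headers and envelope by hand with the Python standard library and does not send anything. It assumes the service speaks SOAP 1.1 and that the namespace `http://bcycle.com/` and action `http://bcycle.com/getCities` from the question's `.SOAP` call are correct. The helper name `build_soap_request` is hypothetical, not part of suds or SSOAP.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://bcycle.com/"  # assumption: taken from xmlns= in the question


def build_soap_request(operation, params):
    """Return (headers, payload) for a SOAP 1.1 call without sending it.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (SERVICE_NS, operation))
    # Each parameter becomes a named, namespace-qualified child element,
    # which is why the argument names matter and not just their order.
    for name, value in params.items():
        ET.SubElement(call, "{%s}%s" % (SERVICE_NS, name)).text = str(value)
    payload = ET.tostring(envelope, encoding="utf-8")
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # Assumption: action is namespace + operation, with no "#operation" suffix.
        "SOAPAction": '"%s%s"' % (SERVICE_NS, operation),
    }
    return headers, payload


if __name__ == "__main__":
    headers, payload = build_soap_request(
        "getCities", {"limit": 10, "userName": "bCycle", "pw": "c@rbont0ns"}
    )
    print(headers["SOAPAction"])
    print(payload.decode("utf-8"))
```

Comparing this output with the verbose output of RCurl or suds is one way to check whether a client library is altering the SOAPAction header or dropping a parameter name before the request leaves the machine.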