python – Retrieving/scraping data with R SOAP (SSOAP)

Published: 2020-12-20 13:32:34 | Category: Python | Source: compiled from the web
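On the B-cycle page (www.bcycle.com/whowantsitmore.aspx) I am trying to scrape the locations and values of the votes.

The URL http://mapservices.bcycle.com/bcycleservice.asmx is a SOAP service.

Based on the documentation I believe I am doing this correctly, but I get an error caused by the parsing of the input parameters. Even calling a function that takes no parameters produces an error.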

# working with SOAP
#install.packages("SSOAP",repos="http://www.omegahat.org/R",dependencies = T,type =  "source")
library(SSOAP)

# Process the Web Service Definition Language (WSDL) file
bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")

# Generate functions based on definitions to access the different data sets
bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]],def = bcycle.asmx,bcycle.asmx@name,verbose=T)

# Get the data by requesting the number of cities,username and password (yes it is public)
bcycle.interface@functions$getCities("10","bcycle","c@rbont0ns")
# receive error: Error in as(parameters,"limit.userName.pw") :
# no method or default for coercing "character" to "limit.userName.pw"

This is caused by the following code in the function:

function(parameters = list(...),... etc) {
    ...
    as(parameters,"limit.userName.pw")
    ...
}

So I tried using the .SOAP function directly:

# Using RCurl library
library(RCurl)

# set up curl options
curl.opts <- curlOptions(
    verbose=T,header=T,cookie="ASP.NET_SessionId=dv25ws551nquoezqwq3iu545;__utma=27517231.373920809.1357910914.1357910914.1357912862.2;__utmc=27517231;__utmz=27517231.1357910914.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);__utmb=27517231.13.10.1357912862",httpheader = c('Content-Type' = 'text/xml; charset=utf-8',Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),followLocation = TRUE,useragent = "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0")

# Define header and submit request
bcycle.server <- SOAPServer("http://mapservices.bcycle.com/bcycleservice.asmx")
.SOAP(bcycle.server,"getCities",limit=250,userName="bCycle",pw="c@rbont0ns",action="http://bcycle.com/getCities",xmlns="http://bcycle.com/",.opts=curl.opts,.literal=T,nameSpaces = "1.2",elementFormQualified = T,.returnNodeName = 'getCitiesResponse',.soapHeader = NULL)

I managed to connect to their server but received an error:

System.Web.Services.Protocols.SoapException:
  Server did not recognize the value of HTTP Header SOAPAction:
  http://bcycle.com/getCities#getCities
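The message above shows that the server received a SOAPAction of http://bcycle.com/getCities#getCities, that is, the action URI with the method name appended after a "#". As a rough illustration of what a SOAP 1.1 request for this operation could look like with the header set to the bare action URI, here is a minimal Python sketch that only builds the headers and envelope and sends nothing. The namespace and action URI are taken from the .SOAP call above; the helper name build_get_cities_request is hypothetical, and the exact envelope layout the service expects is an assumption.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://bcycle.com/"                # xmlns used in the .SOAP call above
SOAP_ACTION = "http://bcycle.com/getCities"      # action used in the .SOAP call above


def build_get_cities_request(limit, user_name, pw):
    """Hypothetical helper: return (headers, body) for a getCities call.

    Nothing is sent over the network; the caller decides how to POST it.
    """
    ET.register_namespace("soap", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}getCities")
    # Parameter names follow the limit/userName/pw signature from the question.
    for name, value in (("limit", limit), ("userName", user_name), ("pw", pw)):
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # Bare action URI in quotes, with no "#getCities" suffix.
        "SOAPAction": f'"{SOAP_ACTION}"',
    }
    return headers, ET.tostring(envelope, encoding="utf-8")
```

The returned headers and body could then be posted to the .asmx URL with any HTTP client; whether the service accepts this exact envelope has not been verified here.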

None of the options I have tried so far has worked.

Using Python I was able to issue the getCities request, but received nothing in response.

import suds

client = suds.client.Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')

print client # prints WSDL info
print client.service.getCities(10,'bcycle','c@rbont0ns') #prints nothing

I would really like to keep this R-focused, but using Python may make it easier to see what the problem is.

Any ideas?

Solution

Try correcting the user name and naming the arguments explicitly:

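library(SSOAP)
bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")
bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]], def = bcycle.asmx, bcycle.asmx@name, verbose = TRUE)
# Name every argument and use the case-sensitive user name "bCycle"
out <- bcycle.interface@functions$getCities(
                     limit = "10", userName = "bCycle", pw = "c@rbont0ns")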

#> out[[1]]@
#out[[1]]@zip               out[[1]]@state_name
#out[[1]]@pop               out[[1]]@latitude
#out[[1]]@ambassador_count  out[[1]]@longitude
#out[[1]]@city_name         

out[[1]]@city_name
#[1] "toledo"

The Python call also works with the corrected user name:

import suds

client = suds.client.Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
client.service.getCities(10,'bCycle','c@rbont0ns')
(ArrayOfCities){
   Cities[] = 
      (Cities){
         zip = "43606"
         pop = 337362
         ambassador_count = 455261
         city_name = "toledo"
         state_name = "oh"
         latitude = 41.6743
         longitude = -83.6029
      },............
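If the raw XML is fetched without suds, the city records can be pulled out with the standard library alone. The sketch below is only an illustration: the field names come from the output above, but the wrapper elements (getCitiesResponse, getCitiesResult) and the sample document are assumptions about the response layout, and parse_cities is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SERVICE_NS = "http://bcycle.com/"

# Assumed response layout; only the field names and values are taken
# from the suds output shown above.
SAMPLE_RESPONSE = """<getCitiesResponse xmlns="http://bcycle.com/">
  <getCitiesResult>
    <Cities>
      <zip>43606</zip>
      <pop>337362</pop>
      <ambassador_count>455261</ambassador_count>
      <city_name>toledo</city_name>
      <state_name>oh</state_name>
      <latitude>41.6743</latitude>
      <longitude>-83.6029</longitude>
    </Cities>
  </getCitiesResult>
</getCitiesResponse>"""


def parse_cities(xml_text):
    """Return one dict per <Cities> element, with numeric fields converted."""
    root = ET.fromstring(xml_text)
    cities = []
    for node in root.iter(f"{{{SERVICE_NS}}}Cities"):
        # Strip the "{namespace}" prefix from each child tag.
        record = {child.tag.split("}", 1)[-1]: child.text for child in node}
        for key in ("pop", "ambassador_count"):
            if record.get(key) is not None:
                record[key] = int(record[key])
        for key in ("latitude", "longitude"):
            if record.get(key) is not None:
                record[key] = float(record[key])
        cities.append(record)
    return cities


if __name__ == "__main__":
    for city in parse_cities(SAMPLE_RESPONSE):
        print(city["city_name"], city["latitude"], city["longitude"])
```

The zip field is deliberately left as a string so that leading zeros are preserved.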
