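I am trying to use RSelenium to submit the form at http://www.censusindia.gov.in/Census_Data_2001/Village_Directory/View_data/Village_Profile.aspx
I can collect all the values from the drop-down menus, but I cannot work out how to submit the form to retrieve the tabular data.
I have tried a few commands: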
library(RSelenium)
# Download the Selenium Server binary if it is not already present
RSelenium::checkForServer()
# Open the remote driver
remDr <- remoteDriver$new()
remDr$open()
remDr$setImplicitWaitTimeout(3000)
remDr$navigate("http://www.censusindia.gov.in/Census_Data_2001/Village_Directory/View_data/Village_Profile.aspx")
# Locate the state drop-down by its name attribute
stateElem <- remDr$findElement(using = "name", "ctl00$Body_Content$drpState")
stateElem$setElementAttribute(value = "18") # for state "Assam"
and
remDr$executeScript("document.forms[0].submit();", list(value = "18"))
and variants such as
document.forms["aspnetForm"].elements["stateElem"].value = "18"
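I would like to be able to run this in a loop over the districts, sub-districts and villages of a selected state.
I know I am not contributing much to the community, but I am new to RSelenium and Java.
Thanks in advance.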
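Best answer
A friend and I managed (painfully) to work out a solution to this problem using Python. I would still love to learn a way to do a similar task in R. The code is below: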
import mechanize
from bs4 import BeautifulSoup  # not used below; useful for parsing the responses

URL = "http://www.censusindia.gov.in/Census_Data_2001/Village_Directory/View_data/Village_Profile.aspx"


def select_element(br, form, value):
    """Set one drop-down (`form` is the control's name) and submit the form."""
    br.select_form(nr=0)
    br[form] = [value]
    return br.submit()


def get_page(state, district, sub_district, village):
    """Get village data."""
    # I have no idea why exactly this works, the form uses javascript callbacks and
    # it seems that you need to submit the form for each selection or you get an error.
    br = mechanize.Browser()
    br.open(URL)
    # Could probably parse the responses at each stage to get valid entries for the next sub-unit.
    select_element(br, 'ctl00$Body_Content$drpState', state)
    # read html and pull out districts
    select_element(br, 'ctl00$Body_Content$drpDistrict', district)
    select_element(br, 'ctl00$Body_Content$drpSubDistrict', sub_district)
    r2 = select_element(br, 'ctl00$Body_Content$drpVillage', village)
    return r2.read()
state = '35'
district = '01'
sub_district = '0004'
village = '00026500'
print(get_page(state, district, sub_district, village))
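On the comment in get_page about not knowing why this works: the .aspx URL, the aspnetForm form name and the ctl00$... control names suggest an ASP.NET WebForms page. On such pages, changing a drop-down in a browser normally triggers a postback that sends the page's hidden state fields back to the server, and the server answers with the next drop-down filled in. That would explain why each selection needs its own submit. The sketch below only illustrates what such a postback carries. It uses the standard library's html.parser; HiddenFieldParser, build_postback_payload and SAMPLE_HTML are made-up names and data, not taken from the real page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_payload(html, control_name, value):
    """Return the form fields a postback for one drop-down change would send."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)            # hidden state fields such as __VIEWSTATE
    payload["__EVENTTARGET"] = control_name  # the control that triggered the postback
    payload["__EVENTARGUMENT"] = ""
    payload[control_name] = value            # the newly selected option
    return payload


# Made-up page fragment for illustration; the real page's hidden values differ.
SAMPLE_HTML = """
<form name="aspnetForm" method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <select name="ctl00$Body_Content$drpState"><option value="18">Assam</option></select>
</form>
"""

payload = build_postback_payload(SAMPLE_HTML, "ctl00$Body_Content$drpState", "18")
for name in sorted(payload):
    print(name, "=", payload[name])
```

mechanize does the equivalent bookkeeping itself: br.select_form(nr=0) picks up the hidden fields from the last response, so each br.submit() sends them back along with the new selection.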
parameters = {'35':
                  {'01':
                       {'0004': ['00026500']}}}

foo = []
for state, districts in parameters.items():
    for district, subdistricts in districts.items():
        for subdistrict, villages in subdistricts.items():
            for village in villages:
                # Fetch each page once and keep the result
                foo.append(get_page(state, district, subdistrict, village))

# Write every fetched page to the file 'foo' (binary mode, because the response body is raw bytes)
with open('foo', 'wb') as f:
    for page in foo:
        f.write(page)
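The comment in get_page about parsing the responses to get valid entries for the next sub-unit could be approached roughly as follows. This is a minimal sketch using only the standard library's html.parser (BeautifulSoup would work as well). SelectOptionParser, option_values and SAMPLE_HTML are hypothetical names and data, and skipping options with an empty value is an assumption about how the page marks its placeholder entry.

```python
from html.parser import HTMLParser


class SelectOptionParser(HTMLParser):
    """Collect the option values of the <select> with a given name."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.in_target = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "select":
            self.in_target = attr.get("name") == self.select_name
        elif tag == "option" and self.in_target:
            value = attr.get("value")
            if value:  # assumption: placeholder entries have an empty value
                self.values.append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_target = False


def option_values(html, select_name):
    """Return the non-empty option values of one drop-down, in page order."""
    parser = SelectOptionParser(select_name)
    parser.feed(html)
    return parser.values


# Made-up fragment for illustration; labels other than "Assam" are placeholders.
SAMPLE_HTML = """
<select name="ctl00$Body_Content$drpState">
  <option value="">Select State</option>
  <option value="18">Assam</option>
  <option value="35">State 35</option>
</select>
<select name="ctl00$Body_Content$drpDistrict">
  <option value="01">District 01</option>
</select>
"""

print(option_values(SAMPLE_HTML, "ctl00$Body_Content$drpState"))  # ['18', '35']
```

In the script above, the response returned by select_element could be read, decoded to text and passed to option_values to fill the nested parameters dictionary instead of typing the codes by hand.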
I am new to Python and would prefer to use R for this task. Is that possible?
Regarding "java - Submitting a java FORM in R using RSelenium", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22440864/