I have an encoded URI component, "http://www.yelp.com/biz/carriage-house-caf%25C3%25A9-houston-2". I can convert it to "http://www.yelp.com/biz/carriage-house-café-houston-2" by applying decodeURIComponent recursively, like this:
function recursiveDecodeURIComponent(uriComponent){
    try{
        var decodedURIComponent = decodeURIComponent(uriComponent);
        if(decodedURIComponent == uriComponent){
            return decodedURIComponent;
        }
        return recursiveDecodeURIComponent(decodedURIComponent);
    }catch(e){
        return uriComponent;
    }
}
console.log(recursiveDecodeURIComponent("http://www.yelp.com/biz/carriage-house-caf%25C3%25A9-houston-2"))
Output: "http://www.yelp.com/biz/carriage-house-café-houston-2"
I want to get the same result in Python. I tried the following:
print urllib2.unquote(urllib2.unquote(urllib2.unquote("http://www.yelp.com/biz/carriage-house-caf%25C3%25A9-houston-2").decode("utf-8")))
But I got http://www.yelp.com/biz/carriage-house-cafÃ©-houston-2. Instead of the expected character é, I get 'Ã©', no matter how many times I call urllib2.unquote.
I'm using Python 2.7.3. Can anyone help?
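For context, this symptom is most likely a byte-versus-character mix-up: the attempt above calls .decode("utf-8") while the string still contains %C3%A9, so the last unquote runs on a unicode string and appears to turn each %XX into its own character instead of treating the two bytes as one UTF-8 sequence. The sketch below reproduces the same effect in Python 3 (an assumption made so the snippet runs on a current interpreter; the question itself uses Python 2), using the standard urllib.parse.unquote_to_bytes.

```python
from urllib.parse import unquote_to_bytes

# "%C3%A9" is the percent-encoded form of the two UTF-8 bytes for "é".
raw = unquote_to_bytes("caf%C3%A9")

# Interpreting the two bytes together as UTF-8 yields the intended character.
as_utf8 = raw.decode("utf-8")

# Interpreting each byte as its own character (Latin-1) yields two characters.
as_latin1 = raw.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

The practical consequence is that all unquoting should finish on the byte string before a single UTF-8 decode is applied.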
Best answer
I think a simple loop should do the trick:
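import urllib2

uri = "http://www.yelp.com/biz/carriage-house-caf%25C3%25A9-houston-2"
# Keep unquoting the byte string until it stops changing.
while True:
    dec = urllib2.unquote(uri)
    if dec == uri:
        break
    uri = dec
# Decode the UTF-8 bytes to unicode only once, after the last unquote.
uri = uri.decode('utf8')
print '%r' % uri
# u'http://www.yelp.com/biz/carriage-house-caf\xe9-houston-2'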
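The same loop can be sketched for Python 3, where urllib.parse.unquote takes a str and decodes percent-escapes as UTF-8 by default, so no separate .decode() step is needed. The function name recursive_unquote and the max_rounds cap are illustrative choices, not part of the original answer; the cap is only a guard against an unexpectedly long chain of nested encodings.

```python
from urllib.parse import unquote


def recursive_unquote(uri, max_rounds=10):
    """Percent-decode `uri` repeatedly until it stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(uri)  # %XX sequences are decoded as UTF-8
        if decoded == uri:
            break  # nothing left to decode
        uri = decoded
    return uri


result = recursive_unquote(
    "http://www.yelp.com/biz/carriage-house-caf%25C3%25A9-houston-2"
)
print(result)
```

Unlike decodeURIComponent, unquote does not raise on a stray "%", so this version has no counterpart to the try/catch in the JavaScript function; a string such as "100%" simply comes back unchanged.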
This question and answer, on recursively decoding a URI component in Python the way JavaScript does, come from Stack Overflow: https://stackoverflow.com/questions/14702231/