ruby - Nokogiri XPath can't find certain nodes

Tags: ruby xml nokogiri

I'm using Nokogiri to modify an existing XML document, but I'm having trouble selecting certain nodes.

Here is the relevant fragment of the XML:

<ProductCatalog>
  <ProductLineItem>
    <updi:ProductIdentification>
      <updi:ProductName>800-22283-03</updi:ProductName>

I can find the two lower nodes like this:

doc.xpath("//updi:ProductIdentification") => #<Nokogiri::XML...
doc.xpath("//updi:ProductName") => #<Nokogiri::XML...

However, if I try to select one of the nodes above them:

doc.xpath("//ProductLineItem") => []

I get an empty array. It seems to be related to the prefix: I can find any element that has a prefix, but none of the elements without one.

Update: here are the (rather lengthy) namespace declarations:

xsi:schemaLocation="urn:rosettanet:specification:interchange:ProductCatalogInformationDistribution:xsd:schema:01.00 ..\..\XML\Interchange\ProductCatalogInformationDistribution_01_00.xsd"
xmlns:dplcs="urn:rosettanet:specification:domain:Design:ProductLifeCycleStatusCode:xsd:codelist:01.03"
xmlns:rrt="urn:rosettanet:specification:domain:Shared:RateType:xsd:codelist:01.01" 
xmlns:dl="urn:rosettanet:specification:domain:Logistics:xsd:schema:02.15" 
xmlns:ictc="urn:rosettanet:specification:domain:Design:CatalogType:xsd:codelist:01.00" 
xmlns:updi="urn:rosettanet:specification:universal:ProductIdentification:xsd:schema:01.04" 
xmlns:dddt="urn:rosettanet:specification:domain:Design:DateType:xsd:codelist:01.00" 
xmlns:dsdc="urn:rosettanet:specification:domain:Logistics:ShipDateCode:xsd:codelist:01.03" 
xmlns:ucr="urn:rosettanet:specification:universal:Currency:xsd:codelist:01.02" 
xmlns:dpiac="urn:rosettanet:specification:domain:Logistics:PortIdentifierAuthorityCode:xsd:codelist:01.03" 
xmlns:rptc="urn:rosettanet:specification:domain:Shared:PricingTypeCode:xsd:codelist:01.03" 
xmlns:dit="urn:rosettanet:specification:domain:Procurement:InventoryType:xsd:codelist:01.03" 
xmlns:dtt="urn:rosettanet:specification:domain:Procurement:TransactionType:xsd:codelist:01.04" 
xmlns:upd="urn:rosettanet:specification:universal:PhysicalDimension:xsd:schema:01.05" 
xmlns:dcst="urn:rosettanet:specification:domain:Logistics:CustomsType:xsd:codelist:01.03" 
xmlns:dsd="urn:rosettanet:specification:domain:Logistics:ShippingDocument:xsd:codelist:01.02" 
xmlns:uci="urn:rosettanet:specification:universal:ContactInformation:xsd:schema:01.03" 
xmlns:dpcm="urn:rosettanet:specification:domain:Procurement:PurchaseMethod:xsd:codelist:01.03" 
xmlns:rpsc="urn:rosettanet:specification:domain:Shared:ProductStatusCode:xsd:codelist:01.01" 
xmlns:dgrc="urn:rosettanet:specification:domain:Marketing:GeographicRegionCode:xsd:codelist:01.02" 
xmlns:dtrt="urn:rosettanet:specification:domain:Logistics:TrackingReferenceType:xsd:codelist:01.06" 
xmlns:umtq="urn:rosettanet:specification:universal:MimeTypeQualifier:xsd:codelist:01.02" 
xmlns:dcrt="urn:rosettanet:specification:domain:Procurement:CustomerType:xsd:codelist:01.03" 
xmlns:dscd="urn:rosettanet:specification:domain:Logistics:ShipmentChangeDisposition:xsd:codelist:01.03" 
xmlns:uc="urn:rosettanet:specification:universal:Country:xsd:codelist:01.02" 
xmlns="urn:rosettanet:specification:interchange:ProductCatalogInformationDistribution:xsd:schema:01.00" 
xmlns:dpc="urn:rosettanet:specification:domain:Procurement:PaymentCondition:xsd:codelist:01.03" 
xmlns:rpmt="urn:rosettanet:specification:domain:Shared:PaymentType:xsd:codelist:01.01" 
xmlns:dft="urn:rosettanet:specification:domain:Procurement:FinanceTerms:xsd:codelist:01.03" 
xmlns:dtq="urn:rosettanet:specification:domain:Procurement:TotalQualifier:xsd:codelist:01.03" 
xmlns:ume="urn:rosettanet:specification:universal:MonetaryExpression:xsd:schema:01.04" 
xmlns:dcp="urn:rosettanet:specification:domain:Design:Compliant:xsd:codelist:01.02" 
xmlns:drsc="urn:rosettanet:specification:domain:Marketing:RegistrationStatusCode:xsd:codelist:01.03" 
xmlns:uat="urn:rosettanet:specification:universal:AbstractType:xsd:schema:01.02" 
xmlns:dp="urn:rosettanet:specification:domain:Procurement:xsd:schema:02.17" 
xmlns:rpm="urn:rosettanet:specification:domain:Shared:PaymentMethod:xsd:codelist:01.02" 
xmlns:dfrt="urn:rosettanet:specification:domain:Procurement:ForecastReferenceType:xsd:codelist:01.03" 
xmlns:dtec="urn:rosettanet:specification:domain:Procurement:TaxExemptionCode:xsd:codelist:01.03" 
xmlns:ulc="urn:rosettanet:specification:universal:Locations:xsd:schema:01.04" 
xmlns:dccc="urn:rosettanet:specification:domain:Procurement:CreditCardClassification:xsd:codelist:01.03" 
xmlns:drlc="urn:rosettanet:specification:domain:Logistics:ReturnLabelCode:xsd:codelist:01.03" 
xmlns:st="http://www.ascc.net/xml/schematron" 
xmlns:dnecc="urn:rosettanet:specification:domain:Logistics:NationalExportControlClassification:xsd:codelist:01.03" 
xmlns:rpktc="urn:rosettanet:specification:domain:Shared:PackageTypeCode:xsd:codelist:01.01" 
xmlns:uwt="urn:rosettanet:specification:universal:WeightType:xsd:codelist:01.01" 
xmlns:dfpt="urn:rosettanet:specification:domain:Logistics:FreightPaymentTerms:xsd:codelist:01.03" 
xmlns:dte="urn:rosettanet:specification:domain:Procurement:TransportEvent:xsd:codelist:01.03" 
xmlns:ul="urn:rosettanet:specification:universal:Language:xsd:codelist:01.02" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:dbpq="urn:rosettanet:specification:domain:Procurement:BookPriceQualifier:xsd:codelist:01.04" 
xmlns:drl="urn:rosettanet:specification:domain:Logistics:RouteLocation:xsd:codelist:01.03" 
xmlns:ssdh="urn:rosettanet:specification:system:StandardDocumentHeader:xsd:schema:01.16" 
xmlns:dmk="urn:rosettanet:specification:domain:Marketing:xsd:schema:02.12" 
xmlns:rmat="urn:rosettanet:specification:domain:Shared:MonetaryAmountType:xsd:codelist:01.01" 
xmlns:uuom="urn:rosettanet:specification:universal:UnitOfMeasure:xsd:codelist:01.03" 
xmlns:dfe="urn:rosettanet:specification:domain:Procurement:ForecastEvent:xsd:codelist:01.03" 
xmlns:dst="urn:rosettanet:specification:domain:Procurement:ShipmentTerms:xsd:codelist:01.03" 
xmlns:udt="urn:rosettanet:specification:universal:DataType:xsd:schema:01.04" 
xmlns:dacc="urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03" 
xmlns:dptt="urn:rosettanet:specification:domain:Logistics:PortType:xsd:codelist:01.03" 
xmlns:sha="urn:rosettanet:specification:domain:Shared:xsd:schema:01.10" 
xmlns:dlv="urn:rosettanet:specification:domain:Design:Level:xsd:codelist:01.02" 
xmlns:rict="urn:rosettanet:specification:domain:Shared:InvoiceChargeType:xsd:codelist:01.02" 
xmlns:utt="urn:rosettanet:specification:universal:TaxType:xsd:codelist:01.02" 
xmlns:ddwsr="urn:rosettanet:specification:domain:Marketing:DesignWinStatusReason:xsd:codelist:01.03" 
xmlns:dsm="urn:rosettanet:specification:domain:Logistics:ShipmentMode:xsd:codelist:01.05" 
xmlns:udct="urn:rosettanet:specification:universal:DocumentType:xsd:codelist:01.09" 
xmlns:dac="urn:rosettanet:specification:domain:Design:ActionCode:xsd:codelist:01.03" 
xmlns:dpsr="urn:rosettanet:specification:domain:Procurement:ProductSubstitutionReason:xsd:codelist:01.03" 
xmlns:sft="urn:rosettanet:specification:system:TPIRFileType:xsd:codelist:01.01" 
xmlns:dltcc="urn:rosettanet:specification:domain:Procurement:LeadTimeClassificationCode:xsd:codelist:01.03" 
xmlns:ri="urn:rosettanet:specification:domain:Shared:Interval:xsd:codelist:01.01" 
xmlns:urss="urn:rosettanet:specification:system:xml:1.0" 
xmlns:dds="urn:rosettanet:specification:domain:Design:xsd:schema:02.15" 
xmlns:dslt="urn:rosettanet:specification:domain:Procurement:SaleType:xsd:codelist:01.04" 
xmlns:udc="urn:rosettanet:specification:universal:Document:xsd:schema:01.08" 
xmlns:dabcc="urn:rosettanet:specification:domain:Design:ABCCode:xsd:codelist:01.02" 
xmlns:dppt="urn:rosettanet:specification:domain:Procurement:ProductProcurementType:xsd:codelist:01.03" 
xmlns:rwtc="urn:rosettanet:specification:domain:Shared:WarrantyType:xsd:codelist:01.01" 
xmlns:dlit="urn:rosettanet:specification:domain:Logistics:InstructionType:xsd:codelist:01.00" 
xmlns:rfob="urn:rosettanet:specification:domain:Shared:FreeOnBoard:xsd:codelist:01.01" 
xmlns:upri="urn:rosettanet:specification:universal:ProcessRoleIdentifier:xsd:codelist:01.08" 
xmlns:ddrn="urn:rosettanet:specification:domain:Marketing:DesignRegistrationNotification:xsd:codelist:01.02" 
xmlns:dsh="urn:rosettanet:specification:domain:Procurement:SpecialHandling:xsd:codelist:01.04" 
xmlns:ud="urn:rosettanet:specification:universal:Dates:xsd:schema:01.03" 
xmlns:dpms="urn:rosettanet:specification:domain:Marketing:ProjectMarketSegment:xsd:codelist:01.02" 
xmlns:rssl="urn:rosettanet:specification:domain:Shared:ShippingServiceLevel:xsd:codelist:01.01" 
xmlns:dldr="urn:rosettanet:specification:domain:Logistics:LotDiscrepancyReason:xsd:codelist:01.03" 
xmlns:rat="urn:rosettanet:specification:domain:Shared:AmountType:xsd:codelist:01.02" 
xmlns:upi="urn:rosettanet:specification:universal:PartnerIdentification:xsd:schema:01.12" 
xmlns:ddp="urn:rosettanet:specification:domain:Marketing:Disposition:xsd:codelist:01.02" 
xmlns:dsfr="urn:rosettanet:specification:domain:Procurement:SpecialFulfillmentRequest:xsd:codelist:01.03" 
xmlns:ucs="urn:rosettanet:specification:universal:CountrySubdivision:xsd:codelist:01.02"

Best answer

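The unprefixed elements are not namespace-free: they inherit the default namespace declared by xmlns="...", whereas an unprefixed name in an XPath expression matches only elements that are in no namespace. The simplest quick fix is to remove the namespaces from the document entirely: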

require 'nokogiri'
xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'><a/><bar:b /></root>"

p xml.xpath('//b').length     #=> 0
p xml.xpath('//bar:b').length #=> 1
p xml.xpath('//a').length     #=> 0
xml.remove_namespaces!
p xml.xpath('//a').length     #=> 1
p xml.xpath('//b').length     #=> 1

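However, if you need to preserve the namespaces (for example, because you modify the document and save it again, or because element or attribute names collide across namespaces), remove_namespaces! is not a viable solution. If you can't strip the namespaces, you can instead make up a prefix and tell Nokogiri which namespace it stands for...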

xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'><a/><bar:b /></root>"
p xml.xpath('//x:a','x'=>'foo').length  #=> 1

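…where the string foo is the URI of the document's default namespace (usually declared on the root element), which is the namespace the unprefixed elements belong to, and the string x is any prefix you like, as long as it does not conflict with another namespace prefix already declared in the document. Or, more simply, you can use xmlns as the prefix for the default namespace: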

p xml.xpath('//xmlns:a').length  #=> 1

Alternatively, if you need to leave the namespaces in place and can construct a reasonable CSS-style selector for the nodes you need, you can use the css method:

require 'nokogiri'
xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'>
  <a/>
  <bar:b />
  <c xmlns='jim'><d/></c>
</root>"

p xml.css('a').length, #=> 1
  xml.css('b').length, #=> 0
  xml.css('c').length, #=> 0
  xml.css('d').length  #=> 0

As shown above, note that a bare element name only works for nodes in the same namespace as the root element.

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/13670541/