python - How do I parse this XML response in Python?

Tags: python xml parsing xpath lxml

This is my XML file:

<?xml version="1.0" ?>
<Items>
    <Item>
        <ASIN>3570102769</ASIN>
        <DetailPageURL>http://www.amazon.de/Inside-IS-Tage-Islamischen-Staat/dp/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3570102769</DetailPageURL>
        <ItemLinks>
            <ItemLink>
                <Description>Add To Wishlist</Description>
                <URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3570102769%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
            <ItemLink>
                <Description>Tell A Friend</Description>
                <URL>http://www.amazon.de/gp/pdp/taf/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Customer Reviews</Description>
                <URL>http://www.amazon.de/review/product/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Offers</Description>
                <URL>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
        </ItemLinks>
        <ItemAttributes>
            <Author>Jürgen Todenhöfer</Author>
            <Binding>Gebundene Ausgabe</Binding>
            <EAN>9783570102763</EAN>
            <EANList>
                <EANListElement>9783570102763</EANListElement>
            </EANList>
            <ISBN>3570102769</ISBN>
            <IsEligibleForTradeIn>1</IsEligibleForTradeIn>
            <ItemDimensions>
                <Height Units="hundredths-inches">874</Height>
                <Length Units="hundredths-inches">575</Length>
                <Width Units="hundredths-inches">126</Width>
            </ItemDimensions>
            <Label>C. Bertelsmann Verlag</Label>
            <Languages>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Published</Type>
                </Language>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Original</Type>
                </Language>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Unbekannt</Type>
                </Language>
            </Languages>
            <ListPrice>
                <Amount>1799</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 17,99</FormattedPrice>
            </ListPrice>
            <Manufacturer>C. Bertelsmann Verlag</Manufacturer>
            <ManufacturerMinimumAge Units="months">192</ManufacturerMinimumAge>
            <NumberOfPages>288</NumberOfPages>
            <PackageDimensions>
                <Height Units="hundredths-inches">118</Height>
                <Length Units="hundredths-inches">567</Length>
                <Weight Units="hundredths-pounds">93</Weight>
                <Width Units="hundredths-inches">252</Width>
            </PackageDimensions>
            <PackageQuantity>1</PackageQuantity>
            <ProductGroup>Book</ProductGroup>
            <ProductTypeName>ABIS_BOOK</ProductTypeName>
            <PublicationDate>2015-04-27</PublicationDate>
            <Publisher>C. Bertelsmann Verlag</Publisher>
            <Studio>C. Bertelsmann Verlag</Studio>
            <Title>Inside IS - 10 Tage im 'Islamischen Staat'</Title>
            <TradeInValue>
                <Amount>930</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 9,30</FormattedPrice>
            </TradeInValue>
        </ItemAttributes>
        <OfferSummary>
            <LowestNewPrice>
                <Amount>1799</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 17,99</FormattedPrice>
            </LowestNewPrice>
            <LowestUsedPrice>
                <Amount>1390</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 13,90</FormattedPrice>
            </LowestUsedPrice>
            <LowestCollectiblePrice>
                <Amount>4999</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 49,99</FormattedPrice>
            </LowestCollectiblePrice>
            <TotalNew>56</TotalNew>
            <TotalUsed>8</TotalUsed>
            <TotalCollectible>1</TotalCollectible>
            <TotalRefurbished>0</TotalRefurbished>
        </OfferSummary>
        <Offers>
            <TotalOffers>1</TotalOffers>
            <TotalOfferPages>1</TotalOfferPages>
            <MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</MoreOffersUrl>
            <Offer>
                <OfferAttributes>
                    <Condition>New</Condition>
                </OfferAttributes>
                <OfferListing>
                    <OfferListingId>9KHCZj9qtL6ucVBPASfXaryQjU8tWbc0n%2F3F4F7GraOKW6Csji2OxpD93%2FkoHwgIGQctlnrtx4RWIeJULAcvvsFhiopFi08JdsZ%2FeO3u6g0%3D</OfferListingId>
                    <Price>
                        <Amount>1799</Amount>
                        <CurrencyCode>EUR</CurrencyCode>
                        <FormattedPrice>EUR 17,99</FormattedPrice>
                    </Price>
                    <Availability>Gewöhnlich versandfertig in 24 Stunden</Availability>
                    <AvailabilityAttributes>
                        <AvailabilityType>now</AvailabilityType>
                        <MinimumHours>0</MinimumHours>
                        <MaximumHours>0</MaximumHours>
                    </AvailabilityAttributes>
                    <IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping>
                </OfferListing>
            </Offer>
        </Offers>
    </Item>
    <Item>
        <ASIN>3813506479</ASIN>
        <DetailPageURL>http://www.amazon.de/Altes-Land-Roman-D%C3%B6rte-Hansen/dp/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3813506479</DetailPageURL>
        <ItemLinks>
            <ItemLink>
                <Description>Add To Wishlist</Description>
                <URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3813506479%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
            <ItemLink>
                <Description>Tell A Friend</Description>
                <URL>http://www.amazon.de/gp/pdp/taf/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Customer Reviews</Description>
                <URL>http://www.amazon.de/review/product/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Offers</Description>
                <URL>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
        </ItemLinks>
        <ItemAttributes>
            <Author>Dörte Hansen</Author>
            <Binding>Gebundene Ausgabe</Binding>
            <EAN>9783813506471</EAN>
            <EANList>
                <EANListElement>9783813506471</EANListElement>
            </EANList>
            <ISBN>3813506479</ISBN>
            <IsEligibleForTradeIn>1</IsEligibleForTradeIn>
            <ItemDimensions>
                <Height Units="hundredths-inches">870</Height>
                <Length Units="hundredths-inches">567</Length>
                <Width Units="hundredths-inches">114</Width>
            </ItemDimensions>
            <Label>Albrecht Knaus Verlag</Label>
            <Languages>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Published</Type>
                </Language>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Original</Type>
                </Language>
            </Languages>
            <ListPrice>
                <Amount>1999</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 19,99</FormattedPrice>
            </ListPrice>
            <Manufacturer>Albrecht Knaus Verlag</Manufacturer>
            <NumberOfPages>288</NumberOfPages>
            <PackageDimensions>
                <Height Units="hundredths-inches">118</Height>
                <Length Units="hundredths-inches">858</Length>
                <Weight Units="hundredths-pounds">101</Weight>
                <Width Units="hundredths-inches">559</Width>
            </PackageDimensions>
            <ProductGroup>Book</ProductGroup>
            <ProductTypeName>ABIS_BOOK</ProductTypeName>
            <PublicationDate>2015-02-16</PublicationDate>
            <Publisher>Albrecht Knaus Verlag</Publisher>
            <Studio>Albrecht Knaus Verlag</Studio>
            <Title>Altes Land: Roman</Title>
            <TradeInValue>
                <Amount>965</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 9,65</FormattedPrice>
            </TradeInValue>
        </ItemAttributes>
        <OfferSummary>
            <LowestNewPrice>
                <Amount>1999</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 19,99</FormattedPrice>
            </LowestNewPrice>
            <LowestUsedPrice>
                <Amount>1599</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 15,99</FormattedPrice>
            </LowestUsedPrice>
            <TotalNew>72</TotalNew>
            <TotalUsed>8</TotalUsed>
            <TotalCollectible>0</TotalCollectible>
            <TotalRefurbished>0</TotalRefurbished>
        </OfferSummary>
        <Offers>
            <TotalOffers>1</TotalOffers>
            <TotalOfferPages>1</TotalOfferPages>
            <MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</MoreOffersUrl>
            <Offer>
                <OfferAttributes>
                    <Condition>New</Condition>
                </OfferAttributes>
                <OfferListing>
                    <OfferListingId>aeRv5KPt26T8S0hLrgV8Bv9UPYABYOMijGRxffbNJXUZSN4XfeeOZZpCZ28EURzmgMLlcYEBSRlMXS%2F8Z0pN1JbYerndME%2B2VK3RosfdQJA%3D</OfferListingId>
                    <Price>
                        <Amount>1999</Amount>
                        <CurrencyCode>EUR</CurrencyCode>
                        <FormattedPrice>EUR 19,99</FormattedPrice>
                    </Price>
                    <Availability>Gewöhnlich versandfertig in 24 Stunden</Availability>
                    <AvailabilityAttributes>
                        <AvailabilityType>now</AvailabilityType>
                        <MinimumHours>0</MinimumHours>
                        <MaximumHours>0</MaximumHours>
                    </AvailabilityAttributes>
                    <IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping>
                </OfferListing>
            </Offer>
        </Offers>
    </Item>
</Items>

I want to get the ASIN element of each item, so I tried this:

from lxml import etree
doc = etree.fromstring(xmlstring)
items = doc.xpath('//Items/Item')
for a in items:
    asin = a.xpath('//ASIN/text()')
    print(asin)

What I get is this:

['3570102769', '3813506479']
['3570102769', '3813506479']

But I want this:

['3570102769']
['3813506479']

I don't understand what the problem is. I thought I was iterating over each Item element, and that each Item contains exactly one ASIN. Why does it return both ASINs twice?

Best answer

When you evaluate a.xpath('//ASIN/text()'), you search the complete document tree again. Quoting the XML Path Language (XPath) specification:

//para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

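So what your loop does is iterate over the matched Item nodes and, for each one, ask "give me all ASIN nodes in this document". The context (the Item node) is ignored.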
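A minimal sketch of this behaviour, assuming a trimmed-down two-item document (the `xmlstring` below is an illustrative stand-in for the full response, keeping only the ASIN elements): an absolute path evaluated on any Item returns the same result as when it is evaluated on the root.

```python
from lxml import etree

# Trimmed-down stand-in for the full response (assumption: only ASIN is kept).
xmlstring = """<Items>
    <Item><ASIN>3570102769</ASIN></Item>
    <Item><ASIN>3813506479</ASIN></Item>
</Items>"""

doc = etree.fromstring(xmlstring)

# Evaluated on the root element.
from_root = doc.xpath('//ASIN/text()')

# Evaluated on each Item: the leading '//' still starts at the document root.
per_item = [item.xpath('//ASIN/text()') for item in doc.xpath('//Items/Item')]

print(from_root)
print(per_item)
```

Every entry of `per_item` is identical to `from_root`, which is why the loop in the question prints the full list once per Item.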

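What you should do instead is select the ASIN child nodes directly, relative to each Item. Keeping your original implementation, that could look like this: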

from lxml import etree

doc = etree.fromstring(xmlstring)
items = doc.xpath('//Items/Item')
for a in items:
    # Relative path: only ASIN children of the current Item are matched.
    asin = a.xpath('ASIN/text()')
    print(asin)

This gives the output you want:

['3570102769']
['3813506479']

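Alternatively, if you are not sure where ASIN appears inside the Item node, you can use .//ASIN/text(), which searches all descendants of the current Item rather than only its direct children.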
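To illustrate the difference between the two relative forms, here is a small sketch with a made-up document in which one Item carries its ASIN directly and the other nests it inside a hypothetical `Wrapper` element (invented for this example, not part of the real response):

```python
from lxml import etree

# Hypothetical document: <Wrapper> and the ASIN values are invented for illustration.
xmlstring = """<Items>
    <Item><ASIN>111</ASIN></Item>
    <Item><Wrapper><ASIN>222</ASIN></Wrapper></Item>
</Items>"""

doc = etree.fromstring(xmlstring)
items = doc.xpath('/Items/Item')

# 'ASIN' matches direct children of the Item only.
children = [item.xpath('ASIN/text()') for item in items]

# './/ASIN' matches ASIN at any depth below the Item, but never outside it.
descendants = [item.xpath('.//ASIN/text()') for item in items]

print(children)     # [['111'], []]
print(descendants)  # [['111'], ['222']]
```

In both forms the result stays scoped to the current Item; the leading `.` is what keeps the `//` step anchored to the context node instead of the document root.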

Regarding "python - How do I parse this XML response in Python?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30163243/
