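xml - How do I delete the rest of a string starting from the last instance of a specific word?

Tags: xml go web-scraping substring byte

I'm trying to scrape some data from an RSS link. I've only just started this project; later there will be something with a GUI. I can't get rid of some content that I don't want shown on a particular line. In this case, I want everything from the last "at" onward to go away so that only the job title is shown.

I tried replacing instances of the string "at" with an empty string, but that also removes every "a" followed by "t" anywhere in the string. I think I need to build a collection of the space-separated words (maybe with strings.Fields()?) and then use a for loop to drop everything from a given word onward.

Code: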

package main

import (
    "encoding/xml"
    "fmt"
    "log"
    "net/http"
    "strings"
)

type JobInfo struct {
    Title       string `xml:"title"`
    Location    string `xml:"location"`
    Company     string `xml:"a10:author"`
    PostDate    string `xml:"pubDate"`
    Description string `xml:"description"`
}

type Channel struct {
    Title string    `xml:"title"`
    Link  string    `xml:"link"`
    Desc  string    `xml:"description"`
    Items []JobInfo `xml:"item"`
}

type Rss struct {
    Channel Channel `xml:"channel"`
}

func main() {
    // Fetch the RSS feed.
    resp, err := http.Get("https://stackoverflow.com/jobs/feed?l=Bridgewater%2c+MA%2c+USA&u=Miles&d=100")
    if err != nil {
        log.Fatal(err) // log.Fatal exits, so no return is needed
    }
    defer resp.Body.Close()

    rss := Rss{}

    // Decode the XML response body into the Rss struct.
    decoder := xml.NewDecoder(resp.Body)
    err = decoder.Decode(&rss)
    if err != nil {
        log.Fatal(err)
    }

    // Print every job item in the channel.
    fmt.Printf("%v\n", rss.Channel.Title)
    for i, item := range rss.Channel.Items {
        fmt.Printf("%v. Job Information:\n", i+1)
        fmt.Printf("Title: %v\n", item.Title)
        fmt.Printf("Location: %v\n", item.Location)
        fmt.Printf("Company: %v\n", item.Company)
        postdate := strings.Replace(item.PostDate, "Z", "", -1)
        fmt.Printf("Post Date: %v\n", postdate)
        fmt.Printf("Description: %v\n", item.Description)
    }
}

Output (one example):

1. Job Information:
Title: Senior Web Engineer at Maark (Boston, MA)
Location: Boston, MA
Company: 
Post Date: Fri, 07 Dec 2018 20:21:34 
Description: <p>At MAARK, we are passionate about bringing innovation and advanced concepts to life and sweating every detail in order to create the best possible experience for our clients and their customers. We believe that every interaction and transaction a user has with a product should be designed and built to work for that individual.</p><p>The clients we work with span a host of verticals  ranging from large financial institutions to telecoms to hospitality to bio-engineering. And the problems we solve for them range from highly animated, interactive frontends, to complex business rules on the backend, to IoT solutions that bridge digital and real worlds. We love diving into new subjects and researching our clients' businesses and look to work with people who are just as naturally curious.<br></p><p>In this role, you will work as part of our growing development team to craft and build frontend side of apps and web sites. We work on highly creative projects, utilize a wide variety of fullstack technologies, and empower each of our developers to create innovative solutions and explore and master emerging web technologies.</p><p>We are looking for local (Boston metro area) candidates only. No remote opportunities for this position. </p><br><p><strong>Benefits</strong></p><br><p><strong>About Maark</strong><br></p><p>Maark is a strategic marketing and innovation agency for global companies - headquartered in Boston, MA. We help our clients define and articulate their vision, design new connected customer experiences, and develop applications at the intersection of where whats possible meets whats relevant. </p><p>We are proud to foster a workplace free from discrimination. We strongly believe that diversity of experience, perspectives, and background will lead to a better environment for our employees and a better experience for our clients.</p><p><em>Maark is an Equal Employment Opportunity (EEO) employer. 
It is the policy of Maark to prohibit discrimination and harassment of any type and to afford equal employment opportunities to all persons without regard to race, color, religion, sex, national origin, age, gender, physical or mental disability, veteran-status, or any other characteristic protected by applicable federal, state or local law.</em></p>

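The Title field should read just: Senior Web Engineer

I'll work out the company name and the byte string in the description later, but if anyone has input on those, it would be appreciated!

Best answer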

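Hope this works for you. Slicing the title up to the last " at " (with the surrounding spaces) keeps only the job title and leaves any "at" inside another word alone:

   fmt.Println(item.Title[:strings.LastIndex(item.Title, " at ")])

Note that this panics if a title contains no " at ", because strings.LastIndex then returns -1.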

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/53676936/
