wikipedia - 获取维基百科文章的第一段(仅文本)返回不期望的结果

标签 wikipedia wikipedia-api wikitext

我正在尝试检索维基百科文章的第一段文本,在本例中为UNIX,但它返回了我不需要的输出。

对于我在 Wikipedia api 和 StackOverflow 上阅读的内容,这是进行调用的请求 URL:

http://en.wikipedia.org/w/api.php?format=php&action=query&titles=unix&redirects=1&prop=revisions&rvprop=content&rvsection=0&rvlimit=1

我的预期输出是:

Unix (officially trademarked as UNIX, sometimes also written as Unix in small caps) is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, Michael Lesk and Joe Ossanna.[1] The Unix operating system was first developed in assembly language, but by 1973 had been almost entirely recoded in C, greatly facilitating its further development and porting to other hardware. Today's Unix system evolution is split into various branches, developed over time by AT&T as well as various commercial vendors, universities (such as University of California, Berkeley's BSD), and non-profit organizations.

我目前的结果:

{{Use dmy dates|date=August 2012}}
{{Infobox OS
|name               = Unix
|logo               = 
|screenshot         = [[File:Unix history-simple.svg|250px]]
|caption            = Evolution of Unix and Unix-like systems
|website            = [http://www.unix.org unix.org]
|developer          = [[Ken Thompson (computer programmer)|Ken Thompson]], [[Dennis Ritchie]], [[Brian Kernighan]], [[Douglas McIlroy]], and [[Joe Ossanna]] at [[Bell Labs]]
|source_model       = Historically [[Closed source software|closed source]], now some Unix projects ([[Berkeley Software Distribution|BSD]] family and [[Illumos]]) are [[open source]]d.
|frequently_updated = yes <!-- Release version update? Don't edit this page, just click on the version number! -->
|programmed_in      = [[C (programming language)|C]] 
|kernel_type        = [[Monolithic Kernel|Monolithic]]
|ui                 = [[Command-line interface]] & [[Graphical user interface|Graphical]] ([[X Window System]])
|language           = English 
|family             = Unix
|released           = {{start date and age|df=yes|1969}}
|license            = [[Proprietary software|Proprietary]]
|working_state      = Current 
}}

'''Unix''' (officially trademarked as '''UNIX''', sometimes also written as '''<span style="font-variant: small-caps;">Unix</span>''' in small caps) is a [[Computer multitasking|multitasking]], [[multi-user]] computer [[operating system]] originally developed in 1969 by a group of [[American Telephone & Telegraph|AT&T]] employees at [[Bell Labs]], including [[Ken Thompson]], [[Dennis Ritchie]], [[Brian Kernighan]], [[Douglas McIlroy]], [[Michael Lesk]] and [[Joe Ossanna]].<ref name=" Ritchie">{{cite journal
  | last = Ritchie
  | first = D.M.
  | authorlink = 
  | coauthors = Thompson, K.
  | title = The UNIX Time-Sharing System
  | journal = Bell System Tech. J.
  | volume = 57
  | issue = 6
  | pages = 1905-1929
  | publisher = American Tel. & Tel.
  | location = USA
  | date = July 1978
  | url = http://www.alcatel-lucent.com/bstj/vol57-1978/articles/bstj57-6-1905.pdf
  | issn = 
  | doi = 
  | id = 
  | accessdate = December 9, 2012}}</ref>  The Unix operating system was first developed in [[assembly language]], but by 1973 had been almost entirely recoded in [[C (programming language)|C]], greatly facilitating its further development and [[Software portability|porting]] to other hardware. Today's Unix system evolution is split into various branches, developed over time by AT&T as well as various commercial vendors, universities (such as [[University of California, Berkeley]]'s [[BSD]]), and [[non-profit]] organizations.

[[The Open Group]], an industry standards consortium, owns the UNIX trademark. Only systems fully compliant with and certified according to the [[Single UNIX Specification]] are qualified to use the trademark; others might be called ''Unix system-like'' or ''[[Unix-like]]'', although the Open Group disapproves<ref>[http://www.unix.org/questions_answers/faq.html#7a  What is a "Unix-like" operating system?] Unix.org FAQ</ref> of this term.  However, the term ''Unix'' is often used informally to denote any operating system that closely resembles the trademarked system.

During the late 1970s and early 1980s, the influence of Unix in academic circles led to large-scale adoption of Unix (particularly of the [[Berkeley Software Distribution|BSD]] variant, originating from the [[University of California, Berkeley]]) by commercial startups, the most notable of which are [[Solaris (operating system)|Solaris]], [[HP-UX]], [[Sequent Computer Systems|Sequent]], and [[AIX operating system|AIX]], as well as [[Darwin (operating system)|Darwin]], which forms the core set of components upon which [[Apple Inc.|Apple]]'s [[OS X]], [[Apple TV]], and [[IOS (Apple)|iOS]] are based.<ref>{{cite web|url=http://marketshare.hitslink.com/operating-system-market-share.aspx?qprid=8&qpcustomd=0 |title=Operating system market share |publisher=Marketshare.hitslink.com |date= |accessdate=2012-08-22}}</ref><ref>{{cite web|url=http://developer.apple.com/library/mac/#documentation/MacOSX/Conceptual/OSX_Technology_Overview/SystemTechnology/SystemTechnology.html#//apple_ref/doc/uid/TP40001067-CH207-BCICAIFJ |title=Loading |publisher=Developer.apple.com |date= |accessdate=2012-08-22}}</ref> Today, in addition to certified Unix systems such as those already mentioned, [[Unix-like]] operating systems such as [[MINIX]], [[Linux]], and [[BSD]] descendants ([[FreeBSD]], [[NetBSD]], [[OpenBSD]], and [[DragonFly BSD]]) are commonly encountered. The term ''traditional Unix'' may be used to describe an operating system that has the characteristics of either [[Version 7 Unix]] or [[UNIX System V]]."

检索文章的正确方法是什么?

提前致谢!

最佳答案

如果您只需要纯文本,请使用 TextExtracts : http://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext=1&titles=Unix

这将产生:Unix 是一种多任务、多用户计算机操作系统,存在多种变体。最初的 Unix 是由 Ken Thompson、Dennis Ritchie 等人在 AT&T 贝尔实验室研究中心开发的。从高级用户或程序员的角度来看,Unix 系统的特点是模块化设计,有时被称为“Unix 哲学”,这意味着操作系统提供了一组简单的工具,每个工具都执行有限的、定义明确的功能,并具有统一的文件系统作为主要的通信手段以及 shell 脚本和命令语言,以结合工具来执行复杂的工作流程。

关于wikipedia - 获取维基百科文章的第一段(仅文本)返回不期望的结果,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/13807137/

相关文章:

python - 属性错误: module 'wikipedia' has no attribute 'summary'

xml - 用 R 抓取维基百科来制作列表和数据框

javascript - JS里面触发JS?

.Net WikiText 到 HTML 解析器

api - 维基百科 API "preflight is invalid"

java - 维基百科:跨多种语言的页面

api - 在本地安装维基百科及其 API

c# - 如何通过 API 从维基百科获取地标性地点的标题?

wikipedia - 如何从 Wikipedia API 获取标题和摘要列表?