r - Installing coreNLP in R

Tags: r nlp stanford-nlp devtools r-package

I followed the instructions at this link to use coreNLP: https://github.com/statsmaths/coreNLP

However, loading the package fails with this error:

> library(coreNLP)

Error in get(method, envir = home) : 
lazy-load database '/Users/apple/Library/R/3.2/library/coreNLP/R/coreNLP.rdb' is corrupt
In addition: Warning messages:
 1: In .registerS3method(fin[i, 1], fin[i, 2], fin[i, 3], fin[i, 4],  :
 restarting interrupted promise evaluation
 2: In get(method, envir = home) :
 restarting interrupted promise evaluation
 3: In get(method, envir = home) : internal error -3 in R_decompress1
 Error: package or namespace load failed for ‘coreNLP’

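Best answer

I got this working after running into the error message java.lang.UnsupportedClassVersionError: edu/stanford/nlp/pipeline/StanfordCoreNLP : Unsupported major.minor version 52.0 (class-file version 52.0 corresponds to Java 8, so an older JVM cannot load the class).

You need to:

  • install Java 8 (as the superuser),
  • change the operating system's default JVM to this one (* see below),
  • run R CMD javareconf on the command line, then
  • set the environment variable LD_LIBRARY_PATH to the directory that contains libjvm.so,
  • restart R/RStudio.

  • Also make sure your machine has a swap file (or swap partition). Run free and check that the output contains a line starting with Swap whose values are not zero.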
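Two of the checks above can be sketched as small shell helpers. This is only an illustration: the function names java_release_for_major and swap_total_from_free are hypothetical, and the second one assumes the usual free layout in which the swap line begins with "Swap:" and its second column is the total.

```shell
# Map a class-file major version to the Java release that produces it.
# Holds for Java 5 and later (49 -> 5, 50 -> 6, 51 -> 7, 52 -> 8).
java_release_for_major() {
    echo $(( $1 - 44 ))
}

# Read `free` output on stdin and print the total swap size, i.e. the
# second column of the line whose first field is "Swap:" (any letter case).
# Prints nothing if there is no such line.
swap_total_from_free() {
    awk 'tolower($1) == "swap:" { print $2 }'
}

# Usage on the affected machine:
#   java_release_for_major 52      # release required by the failing class
#   free | swap_total_from_free    # 0 or empty output: no swap configured
```

If the first helper reports a release newer than what java -version shows, the default JVM is too old for the coreNLP jars.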

I use Ubuntu, and my Java 8 libjvm.so is located here: /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
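If your layout differs, something like the following may help locate the directory. It is a rough sketch: find_libjvm_dirs is a hypothetical helper name, and the default search root /usr/lib/jvm is assumed from the path above; other distributions may keep their JVMs elsewhere.

```shell
# Print every directory under the given root (default: /usr/lib/jvm)
# that contains a file named libjvm.so, one directory per line.
find_libjvm_dirs() {
    root="${1:-/usr/lib/jvm}"
    find "$root" -name libjvm.so 2>/dev/null | while IFS= read -r f; do
        dirname "$f"
    done | sort -u
}

# Usage:
#   find_libjvm_dirs              # search the default root
#   find_libjvm_dirs /opt/java    # search another (hypothetical) root
```

When several JVMs are installed, the output lists one directory per JVM; pick the one that belongs to Java 8.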

You can set the variable in your .Rprofile file. Add this line, perhaps at the bottom of the file:

Sys.setenv(LD_LIBRARY_PATH=paste0(Sys.getenv("LD_LIBRARY_PATH"), ":", "/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/"))
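In the session shown further down, the server directory ends up in LD_LIBRARY_PATH twice, once with and once without a trailing slash. That is harmless, but if you would rather set the variable in the shell before starting R, a duplicate-free append could look roughly like this. The helper name append_path_once is hypothetical, and starting R from a shell is an assumption about your setup, not something the steps above require.

```shell
# Append a directory to a colon-separated path list unless it is already
# present. A trailing slash on the new directory is dropped first so that
# "/x/server" and "/x/server/" count as the same entry.
append_path_once() {
    list="$1"
    dir="${2%/}"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;          # already listed
        *)
            if [ -n "$list" ]; then
                printf '%s\n' "$list:$dir"
            else
                printf '%s\n' "$dir"                  # list was empty
            fi
            ;;
    esac
}

# Usage (path taken from the answer above):
#   LD_LIBRARY_PATH=$(append_path_once "$LD_LIBRARY_PATH" \
#       /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/)
#   export LD_LIBRARY_PATH
```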

When I then run this in R:

R> Sys.getenv("LD_LIBRARY_PATH")
[1] "/usr/local/lib64/R/lib:/usr/local/lib64:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/"
R> library(coreNLP)
R> initCoreNLP()

I get this output:

Searching for resource: config.properties
Adding annotator tokenize
TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [1.1 sec].
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [5.6 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [2.1 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [3.8 sec].
Initializing JollyDayHoliday for SUTime from classpath: edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
Adding annotator parse
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.6 sec].
Adding annotator dcoref
Adding annotator sentiment

R> example(getSentiment)

gtSntmR> getSentiment(annoEtranger) # first Sentence of L'Etranger by A.Camus
  id sentimentValue sentiment
1  1              1  Negative
2  2              2   Neutral

gtSntmR> getSentiment(annoHp) # first Sentence of Harry Potter V1
  id sentimentValue    sentiment
1  1              4 Verypositive

(*) How to check the default JVM on Linux:

update-alternatives --display java

Result:

java - auto mode
  link currently points to /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

To list all available alternatives, use

update-alternatives --list java

Result (on my machine):

/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

To change the alternative:

sudo update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

Experiment a little with the update-alternatives options.
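To avoid copying the path by hand, the listing can be filtered for the Java 8 entry. This is a minimal sketch: pick_java8_alternative is a hypothetical helper, and it assumes the path contains "java-8" as in the listing above, which may not hold for other vendors' packages.

```shell
# Read `update-alternatives --list java` output on stdin and print the
# first entry whose path contains "java-8". Prints nothing if none match.
pick_java8_alternative() {
    awk '/java-8/ { print; exit }'
}

# Usage:
#   target=$(update-alternatives --list java | pick_java8_alternative)
#   [ -n "$target" ] && sudo update-alternatives --set java "$target"
```

The guard on an empty result matters: without it, --set would be called with no path when Java 8 is not installed.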

Source: a similar question about installing coreNLP in R on Stack Overflow: https://stackoverflow.com/questions/34716207/
