java - java.lang.NullPointerException in OpenNLP using RJB (Ruby Java Bridge)

Tags: java ruby jar opennlp rjb

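I am trying to use the open-nlp Ruby gem to access the Java OpenNLP processor through RJB (Ruby Java Bridge). I am not a Java programmer, so I don't know how to solve this. Any suggestions on resolving it, debugging it, or collecting more information would be appreciated.

The environment is Windows 8, Ruby 1.9.3p448, Rails 4.0.0, JDK 1.7.0-40 x586. The gems are rjb 1.4.8 and louismullie/open-nlp 0.1.4. For the record, this file runs in JRuby, but I ran into other problems in that environment, so I am staying with native Ruby for now.

In brief, the open-nlp gem fails with a java.lang.NullPointerException and a Ruby method_missing error. I hesitate to say why this happens, because I don't know, but it appears to me that the dynamically loaded opennlp.tools.postag.POSTaggerME@1b5080a from the Jar file cannot be accessed, perhaps because OpenNLP::Bindings::Utils.tagWithArrayList isn't set up correctly. OpenNLP::Bindings is Ruby. Utils and its methods are Java. Utils is supposedly among the "default" Jars and class files, which may be important.

What am I doing wrong here? Thanks!

The code I am running is copied directly from github/open-nlp. My copy of the code is: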

class OpennlpTryer

  $DEBUG=false

  # From https://github.com/louismullie/open-nlp
  # Hints: Dir.pwd; File.expand_path('../../Gemfile', __FILE__);
  # Load the module
  require 'open-nlp'
  #require 'jruby-jars'

=begin
  # Alias "write" to "print" to monkeypatch the NoMethod write error
  java_import java.io.PrintStream
  class PrintStream
    java_alias(:write, :print, [java.lang.String])
  end
=end

=begin
  # Display path of jruby-jars jars...
  puts JRubyJars.core_jar_path # => path to jruby-core-VERSION.jar
  puts JRubyJars.stdlib_jar_path # => path to jruby-stdlib-VERSION.jar
=end
  puts ENV['CLASSPATH']

  # Set an alternative path to look for the JAR files.
  # Default is gem's bin folder.
  # OpenNLP.jar_path = '/path_to_jars/'

  OpenNLP.jar_path = File.join(ENV["GEM_HOME"],"gems/open-nlp-0.1.4/bin/")
  puts OpenNLP.jar_path
  # Set an alternative path to look for the model files.
  # Default is gem's bin folder.
  # OpenNLP.model_path = '/path_to_models/'

  OpenNLP.model_path = File.join(ENV["GEM_HOME"],"gems/open-nlp-0.1.4/bin/")
  puts OpenNLP.model_path
  # Pass some alternative arguments to the Java VM.
  # Default is ['-Xms512M', '-Xmx1024M'].
  # OpenNLP.jvm_args = ['-option1', '-option2']
  OpenNLP.jvm_args = ['-Xms512M', '-Xmx1024M']
  # Redirect VM output to log.txt
  OpenNLP.log_file = 'log.txt'
  # Set default models for a language.
  # OpenNLP.use :language
  OpenNLP.use :english          # Make sure this is lower case!!!!

# Simple tokenizer

  OpenNLP.load

  sent = "The death of the poet was kept from his poems."
  tokenizer = OpenNLP::SimpleTokenizer.new

  tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]
  puts "Tokenize #{tokens}"

# Maximum entropy tokenizer, chunker and POS tagger

  OpenNLP.load

  chunker = OpenNLP::ChunkerME.new
  tokenizer = OpenNLP::TokenizerME.new
  tagger = OpenNLP::POSTaggerME.new

  sent = "The death of the poet was kept from his poems."

  tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]
  puts "Tokenize #{tokens}"

  tags = tagger.tag(tokens).to_a
# => %w[DT NN IN DT NN VBD VBN IN PRP$ NNS .]
  puts "Tags #{tags}"

  chunks = chunker.chunk(tokens, tags).to_a
# => %w[B-NP I-NP B-PP B-NP I-NP B-VP I-VP B-PP B-NP I-NP O]
  puts "Chunks #{chunks}"


# Abstract Bottom-Up Parser

  OpenNLP.load

  sent = "The death of the poet was kept from his poems."
  parser = OpenNLP::Parser.new
  parse = parser.parse(sent)

=begin
  parse.get_text.should eql sent

  parse.get_span.get_start.should eql 0
  parse.get_span.get_end.should eql 46
  parse.get_child_count.should eql 1
=end

  child = parse.get_children[0]

  child.text # => "The death of the poet was kept from his poems."
  child.get_child_count # => 3
  child.get_head_index #=> 5
  child.get_type # => "S"

  puts "Child: #{child}"

# Maximum Entropy Name Finder*

  OpenNLP.load

  # puts File.expand_path('.', __FILE__)
  text = File.read('./spec/sample.txt').gsub("\n", "")  # gsub! would return nil if the file had no newlines

  tokenizer = OpenNLP::TokenizerME.new
  segmenter = OpenNLP::SentenceDetectorME.new
  puts "Tokenizer: #{tokenizer}"
  puts "Segmenter: #{segmenter}"

  ner_models = ['person', 'time', 'money']
  ner_finders = ner_models.map do |model|
    OpenNLP::NameFinderME.new("en-ner-#{model}.bin")
  end
  puts "NER Finders: #{ner_finders}"

  sentences = segmenter.sent_detect(text)
  puts "Sentences: #{sentences}"

  named_entities = []

  sentences.each do |sentence|
    tokens = tokenizer.tokenize(sentence)
    ner_models.each_with_index do |model, i|
      finder = ner_finders[i]
      name_spans = finder.find(tokens)
      name_spans.each do |name_span|
        start = name_span.get_start
        stop = name_span.get_end-1
        slice = tokens[start..stop].to_a
        named_entities << [slice, model]
      end
    end
  end
  puts "Named Entities: #{named_entities}"

# Loading specific models
# Just pass the name of the model file to the constructor. The gem will search for the file in the OpenNLP.model_path folder.

  OpenNLP.load

  tokenizer = OpenNLP::TokenizerME.new('en-token.bin')
  tagger = OpenNLP::POSTaggerME.new('en-pos-perceptron.bin')
  name_finder = OpenNLP::NameFinderME.new('en-ner-person.bin')
# etc.
  puts "Tokenizer: #{tokenizer}"
  puts "Tagger: #{tagger}"
  puts "Name Finder: #{name_finder}"

# Loading specific classes
# You may want to load specific classes from the OpenNLP library that are not loaded by default. The gem provides an API to do this:

# Default base class is opennlp.tools.
  OpenNLP.load_class('SomeClassName')
# => OpenNLP::SomeClassName

# Here, we specify another base class.
  OpenNLP.load_class('SomeOtherClass', 'opennlp.tools.namefind')
  # => OpenNLP::SomeOtherClass

end

The failing line is line 73 (tokens == the sentence being processed):
  tags = tagger.tag(tokens).to_a
# => %w[DT NN IN DT NN VBD VBN IN PRP$ NNS .]

tagger.tag calls open-nlp/classes.rb line 13, which is where the error is thrown. The code there is:
class OpenNLP::POSTaggerME < OpenNLP::Base

  unless RUBY_PLATFORM =~ /java/
    def tag(*args)
      OpenNLP::Bindings::Utils.tagWithArrayList(@proxy_inst, args[0])  # <== Line 13
    end
  end

end

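The Ruby error thrown at this point is: `method_missing': unknown exception (NullPointerException). Debugging this shows the underlying error is java.lang.NullPointerException. args[0] is the sentence being processed. @proxy_inst is opennlp.tools.postag.POSTaggerME@1b5080a.

OpenNLP::Bindings sets up the Java environment. For example, it sets the Jars to be loaded and the classes within those Jars. At line 54, it sets up the defaults for RJB, which should set up OpenNLP::Bindings::Utils and its methods as follows: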
  # Add in Rjb workarounds.
  unless RUBY_PLATFORM =~ /java/
    self.default_jars << 'utils.jar'
    self.default_classes << ['Utils', '']
  end

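utils.jar and Utils.java are on the CLASSPATH, along with the other Jars being loaded. They are being accessed; I verified this because error messages are thrown if the other Jars are not present. The CLASSPATH is: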
.;C:\Program Files (x86)\Java\jdk1.7.0_40\lib;C:\Program Files (x86)\Java\jre7\lib;D:\BitNami\rubystack-1.9.3-12\ruby\lib\ruby\gems\1.9.1\gems\open-nlp-0.1.4\bin
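
One quick sanity check is to split the CLASSPATH on the platform separator and list any entry that does not exist on disk. The following is a minimal sketch in plain Ruby; the helper name missing_classpath_entries is hypothetical and is not part of rjb or open-nlp. It only confirms that the paths exist, not that the JVM started by RJB actually uses them.

```ruby
# Hypothetical helper (not part of rjb or open-nlp): returns the CLASSPATH
# entries that do not exist on disk. "." and empty entries are skipped.
def missing_classpath_entries(classpath, separator = File::PATH_SEPARATOR)
  classpath.to_s.split(separator)
           .map(&:strip)
           .reject { |entry| entry.empty? || entry == '.' }
           .reject { |entry| File.exist?(entry) }
end

missing = missing_classpath_entries(ENV['CLASSPATH'])
if missing.empty?
  puts 'All CLASSPATH entries exist.'
else
  missing.each { |entry| puts "Missing CLASSPATH entry: #{entry}" }
end
```

A mistyped directory in the variable would show up in the output as a missing entry.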

The Jars are located in D:\BitNami\rubystack-1.9.3-12\ruby\lib\ruby\gems\1.9.1\gems\open-nlp-0.1.4\bin and, again, if they are not there I get error messages about the other Jars. The Jars and Java files in ...\bin include:
jwnl-1.3.3.jar
opennlp-maxent-3.0.2-incubating.jar
opennlp-tools-1.5.2-incubating.jar
opennlp-uima-1.5.2-incubating.jar
utils.jar
Utils.java

Utils.java is as follows:
import java.util.Arrays;
import java.util.ArrayList;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, ArrayList[] objectArray) {
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, ArrayList[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, ArrayList[] tokens, ArrayList[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(ArrayList[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

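So tagWithArrayList should be defined, and opennlp.tools.postag.POSTagger is imported. (OBTW, just to try it, I changed the occurrences of POSTagger in this file to POSTaggerME. It changed nothing...)

The tools Jar file, opennlp-tools-1.5.2-incubating.jar, contains the postag/POSTagger and POSTaggerME class files.

The error message is: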
D:\BitNami\rubystack-1.9.3-12\ruby\bin\ruby.exe -e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb
.;C:\Program Files (x86)\Java\jdk1.7.0_40\lib;C:\Program Files (x86)\Java\jre7\lib;D:\BitNami\rubystack-1.9.3-12\ruby\lib\ruby\gems\1.9.1\gems\open-nlp-0.1.4\bin
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/bin/
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/bin/
Tokenize ["The", "death", "of", "the", "poet", "was", "kept", "from", "his", "poems", "."]
Tokenize ["The", "death", "of", "the", "poet", "was", "kept", "from", "his", "poems", "."]
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:13:in `method_missing': unknown exception (NullPointerException)
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:13:in `tag'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:73:in `<class:OpennlpTryer>'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

Revised Utils.java:
import java.util.Arrays;
import java.util.Object;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, Object[] objectArray) {
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, Object[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, Object[] tokens, Object[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(Object[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

Revised error message:
Uncaught exception: uninitialized constant OpennlpTryer::ArrayStoreException
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:81:in `rescue in <class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:77:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

Revised error after changing Utils.java to use "import java.lang.Object;":
Uncaught exception: uninitialized constant OpennlpTryer::ArrayStoreException
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:81:in `rescue in <class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:77:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

With the rescue removed from OpennlpTryer, the error is shown being caught in classes.rb:
Uncaught exception: uninitialized constant OpenNLP::POSTaggerME::ArrayStoreException
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:16:in `rescue in tag'
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:13:in `tag'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:78:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

The same error, but with all rescues removed, so it is "native Ruby":
Uncaught exception: unknown exception
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:15:in `method_missing'
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:15:in `tag'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:78:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'
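
The "uninitialized constant ...::ArrayStoreException" messages above appear consistent with a general Ruby behavior rather than with anything specific to RJB: the class named in a rescue clause is looked up only once an exception is in flight, and if that constant is not defined in Ruby, a NameError replaces the original error. The sketch below reproduces this in plain Ruby with no Java involved; risky_call and call_with_bad_rescue are hypothetical names used only for illustration.

```ruby
# Plain-Ruby illustration; no Java or rjb involved. The constant
# ArrayStoreException is deliberately left undefined here.
def risky_call
  raise RuntimeError, 'stand-in for the error raised across the bridge'
end

def call_with_bad_rescue
  risky_call
rescue ArrayStoreException => e   # looked up only once an exception is in flight
  "handled #{e.class}"
end

outcome =
  begin
    call_with_bad_rescue
  rescue NameError => e
    # The original RuntimeError has been replaced by a NameError
    # complaining about the undefined constant.
    e.message
  end

puts outcome
```

This would explain why the reported error changed once the rescue clauses were removed: the NameError was hiding the exception that the Java call actually raised.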

Revised Utils.java:
import java.util.Arrays;
import java.util.ArrayList;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, ArrayList[] objectArray) {
      // Log the runtime type of the incoming tokens before converting them.
      System.out.println("Tokens: ("+objectArray.getClass().getSimpleName()+"): \n"+objectArray);
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, ArrayList[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, ArrayList[] tokens, ArrayList[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(ArrayList[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

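I ran cavaj on the Utils.class that I unzipped from utils.jar, and this is what I found. It differs from Utils.java in a number of ways. Both come installed with the open-nlp 0.1.4 gem. I don't know if this is the root cause of the problem, but this file is at the core of where things break, and there is a major discrepancy between the two. Which one should we use?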
import java.util.ArrayList;
import java.util.Arrays;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.postag.POSTagger;

public class Utils
{

    public Utils()
    {
    }

    public static String[] tagWithArrayList(POSTagger postagger, ArrayList aarraylist[])
    {
        return postagger.tag(getStringArray(aarraylist));
    }

    public static Object[] findWithArrayList(NameFinderME namefinderme, ArrayList aarraylist[])
    {
        return namefinderme.find(getStringArray(aarraylist));
    }

    public static Object[] chunkWithArrays(ChunkerME chunkerme, ArrayList aarraylist[], ArrayList aarraylist1[])
    {
        return chunkerme.chunk(getStringArray(aarraylist), getStringArray(aarraylist1));
    }

    public static String[] getStringArray(ArrayList aarraylist[])
    {
        String as[] = (String[])Arrays.copyOf(aarraylist, aarraylist.length, [Ljava/lang/String;);
        return as;
    }
}

Utils.java in use as of 10/07, compiled and packaged into utils.jar:
import java.util.Arrays;
import java.util.ArrayList;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, ArrayList[] objectArray) {
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, ArrayList[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, ArrayList[] tokens, ArrayList[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(ArrayList[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

The failure occurs in BindIt::Binding::load_klass at line 110:
# Private function to load classes.
# Doesn't check if initialized.
def load_klass(klass, base, name=nil)
  base += '.' unless base == ''
  fqcn = "#{base}#{klass}"
  name ||= klass
  if RUBY_PLATFORM =~ /java/
    rb_class = java_import(fqcn)
    if name != klass
      if rb_class.is_a?(Array)
        rb_class = rb_class.first
      end
      const_set(name.intern, rb_class)
    end
  else
    rb_class = Rjb::import(fqcn)             # <== This is line 110
    const_set(name.intern, rb_class)
  end
end

The messages are as follows, but they are inconsistent about which class is identified. Each run may name a different one: any of POSTagger, ChunkerME, or NameFinderME.
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:110:in `import': opennlp/tools/namefind/NameFinderME (NoClassDefFoundError)
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:110:in `load_klass'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:89:in `block in load_default_classes'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:87:in `each'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:87:in `load_default_classes'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:56:in `bind'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp.rb:14:in `load'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:54:in `<class:OpennlpTryer>'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

What is interesting about these errors is that they originate at OpennlpTryer line 54:
  OpenNLP.load

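At this point, OpenNLP fires up RJB, which uses BindIt to load the jars and classes. This happens well before the errors I was seeing at the beginning of this question. However, I can't help but think it is all related. I really don't understand the inconsistency of these errors at all.

I was able to add logging to Utils.java, compile it after adding "import java.io.*", and package it. However, I pulled it out because of these errors, since I didn't know whether it was involved. I don't think it was. In any case, because these errors occur during load, the method is never called, so logging there won't help...

For each of the other jars, the jar is loaded and then each class is imported using RJB. Utils is handled differently and is specified as the "default". From what I can tell, Utils.class is executed to load its own classes?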
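
To see which names load_klass hands to Rjb::import, the name-building lines of that method can be run on their own. This is a sketch that mirrors only those lines; the method name fully_qualified_name is hypothetical. It suggests that the empty base given for Utils simply means the class has no package prefix, which matches Utils.java having no package declaration, rather than Utils loading anything itself.

```ruby
# Mirrors only the name-building lines of load_klass shown earlier; the
# method name fully_qualified_name is hypothetical.
def fully_qualified_name(klass, base)
  prefix = base.empty? ? '' : "#{base}."
  "#{prefix}#{klass}"
end

[
  ['NameFinderME', 'opennlp.tools.namefind'],
  ['POSTaggerME',  'opennlp.tools.postag'],
  ['Utils',        '']   # empty base: no package prefix is added
].each do |klass, base|
  puts fully_qualified_name(klass, base)
end
```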

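Later update on 10/07:

Here is where I think I am. First, as I described earlier today, I had some problems replacing Utils.java. That will probably need to be solved before a fix can be installed.

Second, I now understand the difference between POSTagger and POSTaggerME: the ME stands for Maximum Entropy. The test code is trying to call POSTaggerME, but it looks to me like Utils.java, as implemented, supports POSTagger. I tried changing the test code to call POSTagger, but it said it couldn't find an initializer. Looking at the source for each, my guess is that POSTagger exists solely to support POSTaggerME, which implements it.

The source is in the opennlp-tools file opennlp-tools-1.5.2-incubating-sources.jar.

What I don't get is the whole reason for Utils in the first place. Why aren't the jars/classes provided in bindings.rb enough? This feels like a bad monkeypatch. I mean, look at what bindings.rb does in the first place: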
  # Default JARs to load.
  self.default_jars = [
    'jwnl-1.3.3.jar',
    'opennlp-tools-1.5.2-incubating.jar',
    'opennlp-maxent-3.0.2-incubating.jar',
    'opennlp-uima-1.5.2-incubating.jar'
  ]

  # Default namespace.
  self.default_namespace = 'opennlp.tools'

  # Default classes.
  self.default_classes = [
    # OpenNLP classes.
    ['AbstractBottomUpParser', 'opennlp.tools.parser'],
    ['DocumentCategorizerME', 'opennlp.tools.doccat'],
    ['ChunkerME', 'opennlp.tools.chunker'],
    ['DictionaryDetokenizer', 'opennlp.tools.tokenize'],
    ['NameFinderME', 'opennlp.tools.namefind'],
    ['Parser', 'opennlp.tools.parser.chunking'],
    ['Parse', 'opennlp.tools.parser'],
    ['ParserFactory', 'opennlp.tools.parser'],
    ['POSTaggerME', 'opennlp.tools.postag'],
    ['SentenceDetectorME', 'opennlp.tools.sentdetect'],
    ['SimpleTokenizer', 'opennlp.tools.tokenize'],
    ['Span', 'opennlp.tools.util'],
    ['TokenizerME', 'opennlp.tools.tokenize'],

    # Generic Java classes.
    ['FileInputStream', 'java.io'],
    ['String', 'java.lang'],
    ['ArrayList', 'java.util']
  ]

  # Add in Rjb workarounds.
  unless RUBY_PLATFORM =~ /java/
    self.default_jars << 'utils.jar'
    self.default_classes << ['Utils', '']
  end

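Best answer

See the full code at the end for the complete classes.rb module.

I ran into the same problem today. I didn't quite understand why the Utils class was being used, so I modified the classes.rb file in the following way: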

unless RUBY_PLATFORM =~ /java/
  def tag(*args)
    @proxy_inst.tag(args[0])
    #OpenNLP::Bindings::Utils.tagWithArrayList(@proxy_inst, args[0])
  end
end

With that change, I can make the following test pass:
sent   = "The death of the poet was kept from his poems."
tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]
tags   = tagger.tag(tokens).to_a
# => ["prop", "prp", "n", "v-fin", "n", "adj", "prop", "v-fin", "n", "adj", "punc"]

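R_G edit:
I tested that change, and it eliminated the error. I will have to do more testing to make sure the results are what is expected. However, following the same pattern, I also made the following changes in classes.rb: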
def chunk(tokens, tags)
  chunks = @proxy_inst.chunk(tokens, tags)
  # chunks = OpenNLP::Bindings::Utils.chunkWithArrays(@proxy_inst, tokens,tags)
  chunks.map { |c| c.to_s }
end

...
class OpenNLP::NameFinderME < OpenNLP::Base
  unless RUBY_PLATFORM =~ /java/
    def find(*args)
      @proxy_inst.find(args[0])
      # OpenNLP::Bindings::Utils.findWithArrayList(@proxy_inst, args[0])
    end
  end
end

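This allowed the entire sample test to execute without failure. I will provide a later update regarding verification of the results.

Final edit and updated classes.rb, per Space Pope and R_G:

As it turned out, this answer was key to the desired solution. However, the results were inconsistent once it was corrected. We continued to drill down into it and implemented strong typing in the calls, as specified by RJB. This converts each call to use the _invoke method, where the parameters are the desired method name, the strong type signature, and then the actual arguments. Andre's recommendation was key to the solution, so kudos to him. Here is the complete module. It eliminates the need for Utils.class, which was trying to make these calls but failing. We plan to issue a GitHub pull request against the open-nlp gem to update this module: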
require 'open-nlp/base'

class OpenNLP::SentenceDetectorME < OpenNLP::Base; end

class OpenNLP::SimpleTokenizer < OpenNLP::Base; end

class OpenNLP::TokenizerME < OpenNLP::Base; end

class OpenNLP::POSTaggerME < OpenNLP::Base

  unless RUBY_PLATFORM =~ /java/
    def tag(*args)
        @proxy_inst._invoke("tag", "[Ljava.lang.String;", args[0])
    end

  end
end


class OpenNLP::ChunkerME < OpenNLP::Base

  if RUBY_PLATFORM =~ /java/

    def chunk(tokens, tags)
      if !tokens.is_a?(Array)
        tokens = tokens.to_a
        tags = tags.to_a
      end
      tokens = tokens.to_java(:String)
      tags = tags.to_java(:String)
      @proxy_inst.chunk(tokens,tags).to_a
    end

  else

    def chunk(tokens, tags)
      chunks = @proxy_inst._invoke("chunk", "[Ljava.lang.String;[Ljava.lang.String;", tokens, tags)
      chunks.map { |c| c.to_s }
    end

  end

end

class OpenNLP::Parser < OpenNLP::Base

  def parse(text)

    tokenizer = OpenNLP::TokenizerME.new
    full_span = OpenNLP::Bindings::Span.new(0, text.size)

    parse_obj = OpenNLP::Bindings::Parse.new(
    text, full_span, "INC", 1, 0)

    tokens = tokenizer.tokenize_pos(text)

    tokens.each_with_index do |tok,i|
      start, stop = tok.get_start, tok.get_end
      token = text[start..stop-1]
      span = OpenNLP::Bindings::Span.new(start, stop)
      parse = OpenNLP::Bindings::Parse.new(text, span, "TK", 0, i)
      parse_obj.insert(parse)
    end

    @proxy_inst.parse(parse_obj)

  end

end

class OpenNLP::NameFinderME < OpenNLP::Base
  unless RUBY_PLATFORM =~ /java/
    def find(*args)
      @proxy_inst._invoke("find", "[Ljava.lang.String;", args[0])
    end
  end
end
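
The signature strings passed to _invoke in the module above follow a regular pattern: one "[Lpackage.Class;" group per array parameter, concatenated in order. As a small sketch, a helper like the following could build them; array_signature is a hypothetical name and is not part of rjb or open-nlp, and it covers only one-dimensional object arrays, which is all the module needs.

```ruby
# Hypothetical helper: builds the signature string for a method whose
# parameters are all one-dimensional arrays of the given Java classes,
# in the "[Lpackage.Class;" form used in the module above.
def array_signature(*java_classes)
  java_classes.map { |name| "[L#{name};" }.join
end

puts array_signature('java.lang.String')
puts array_signature('java.lang.String', 'java.lang.String')
```

The first call yields the string used for tag and find, and the second yields the one used for chunk.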

The original question, "java.lang.NullPointerException in OpenNLP using RJB (Ruby Java Bridge)", is on Stack Overflow: https://stackoverflow.com/questions/19018014/
