java - listFiles() and isDirectory() cannot read Unicode names in Java 1.4

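I am using Java 1.4, as my client requires, together with lucene-core-2.9.2.jar and lucene-demos-2.9.2.jar, and I build with Ant. Indexing works for every directory except those whose names contain Unicode or Scandinavian characters.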

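When I list a directory with listFiles(), every entry is returned, but the names that contain Unicode characters are displayed as blocks. When the code then calls isDirectory() on those entries, it fails to recognize the folders to be indexed whose names are in other languages (that is, names containing Unicode or Scandinavian characters).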

How can I solve this for names that contain Unicode and Scandinavian characters?

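Everything works fine on Java 6 or 7. However, the client requires Java 1.4, so please do not tell me to use Java 5, 6, or 7; I am looking for other useful answers. To make the problem easier to understand, I have added my code below.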

public void addIntoIndex(File dir, IndexWriter indexWriter) {
    try {
        System.out.println("Now in addIntoIndex");

        /* The "Release_Notes" folder is excluded from indexing. */
        if (dir.getName().equals("Release_Notes") && this.searchOption.equals("systemHelp")) {
            System.out.println("'Release_Notes' folder will be excluded for indexing.");
            return;
        }

        File[] htmls = dir.listFiles();
        if (htmls == null) {
            // listFiles() returns null when dir is not a directory or cannot be read
            System.out.println("Cannot list: " + dir.getAbsolutePath());
            return;
        }

        for (int i = 0; i < htmls.length; i++) {
            String htmlPath = htmls[i].getAbsolutePath();

            if (htmls[i].isDirectory()) {
                // Recurse into subdirectories
                addIntoIndex(htmls[i], indexWriter);
            } else if (htmlPath.endsWith(".html") || htmlPath.endsWith(".htm")) {
                addDocument(htmlPath, indexWriter);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
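
One way to narrow down the symptom described above is to check whether the names returned by listFiles() are really damaged or whether the console simply cannot render them. The sketch below is only a diagnostic, not a fix. It prints each entry with non-ASCII characters written as backslash-u escapes, together with the results of exists() and isDirectory(). The class name FileNameDump and the helper escapeNonAscii are hypothetical names chosen for illustration, and the code deliberately sticks to Java 1.4 syntax.

```java
import java.io.File;

/** Diagnostic sketch: shows file names as the JVM sees them, independent of the console font. */
final class FileNameDump {

    /** Returns the name with every char outside printable ASCII written as a backslash-u escape. */
    static String escapeNonAscii(String name) {
        StringBuffer out = new StringBuffer(); // StringBuffer keeps this Java 1.4 compatible
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                out.append(c);
            } else {
                String hex = Integer.toHexString(c);
                out.append("\\u");
                // Left-pad the hex value to four digits
                for (int pad = hex.length(); pad < 4; pad++) {
                    out.append('0');
                }
                out.append(hex);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] entries = dir.listFiles();
        if (entries == null) {
            System.out.println("Not a readable directory: " + dir);
            return;
        }
        // The default encoding is a common suspect when names come back damaged
        System.out.println("file.encoding = " + System.getProperty("file.encoding"));
        for (int i = 0; i < entries.length; i++) {
            System.out.println(escapeNonAscii(entries[i].getName())
                    + "  exists=" + entries[i].exists()
                    + "  isDirectory=" + entries[i].isDirectory());
        }
    }
}
```

If the escaped names show the expected code points, the blocks are most likely a display issue. If they show a question mark or another replacement character and exists() returns false, that would suggest the JVM could not map the file name into its default encoding.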

Best answer

My problem is finally solved. I was indexing all of my HTML files, which have the following format:

<html>
<head>..</head>
<body>...</body>
</html>

After I added the following two lines to the head section, the problem was resolved on my Java 1.4.02 installation.

<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta http-equiv="content-script-type" content="text/javascript; charset=UTF-8"/>
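
The answer does not say why the meta tags help. One plausible explanation, offered here only as an assumption, is that the HTML content was being decoded with the wrong charset during indexing. The sketch below illustrates that general effect with calls available in Java 1.4: the same bytes are decoded once as UTF-8 and once as ISO-8859-1. The class ExplicitCharsetRead and its methods are hypothetical names for illustration and are not part of Lucene.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

/** Sketch: decode bytes with a named charset instead of the platform default. */
final class ExplicitCharsetRead {

    /** Reads the whole stream, decoding it with the given charset name. */
    static String readAll(InputStream in, String charsetName) throws IOException {
        Reader reader = new InputStreamReader(in, charsetName);
        StringBuffer text = new StringBuffer();
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            text.append(buf, 0, n);
        }
        return text.toString();
    }

    /** Convenience wrapper for in-memory bytes; reports failures as unchecked exceptions. */
    static String decode(byte[] bytes, String charsetName) {
        try {
            return readAll(new ByteArrayInputStream(bytes), charsetName);
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot decode as " + charsetName + ": " + e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes for "Hj", a-with-diaeresis, "lp" (a made-up sample word)
        byte[] utf8 = { 'H', 'j', (byte) 0xC3, (byte) 0xA4, 'l', 'p' };
        String right = decode(utf8, "UTF-8");
        String wrong = decode(utf8, "ISO-8859-1");
        System.out.println(right.length()); // 5: the two bytes became one letter
        System.out.println(wrong.length()); // 6: each byte became its own char
    }
}
```

For a real file, a FileInputStream could be passed to readAll in place of the in-memory stream.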

Special thanks to my project manager, Peter Lawrey, and txtechhelp.

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20630273/
