我正在使用以下函数来计算文件的校验和:
public static void generateChecksums(String strInputFile, String strCSVFile) {
ArrayList<String[]> outputList = new ArrayList<String[]>();
try {
MessageDigest m = MessageDigest.getInstance("MD5");
File aFile = new File(strInputFile);
InputStream is = new FileInputStream(aFile);
System.out.println(Calendar.getInstance().getTime().toString() +
" Processing Checksum: " + strInputFile);
double dLength = aFile.length();
try {
is = new DigestInputStream(is, m);
// read stream to EOF as normal...
int nTmp;
double dCount = 0;
String returned_content="";
while ((nTmp = is.read()) != -1) {
dCount++;
if (dCount % 600000000 == 0) {
System.out.println(". ");
} else if (dCount % 20000000 == 0) {
System.out.print(". ");
}
}
System.out.println();
} finally {
is.close();
}
byte[] digest = m.digest();
m.reset();
BigInteger bigInt = new BigInteger(1,digest);
String hashtext = bigInt.toString(16);
// Now we need to zero pad it if you actually / want the full 32 chars.
while(hashtext.length() < 32 ){
hashtext = "0" + hashtext;
}
String[] arrayTmp = new String[2];
arrayTmp[0] = aFile.getName();
arrayTmp[1] = hashtext;
outputList.add(arrayTmp);
System.out.println("Hash Code: " + hashtext);
UtilityFunctions.createCSV(outputList, strCSVFile, true);
} catch (NoSuchAlgorithmException nsae) {
System.out.println(nsae.getMessage());
} catch (FileNotFoundException fnfe) {
System.out.println(fnfe.getMessage());
} catch (IOException ioe) {
System.out.println(ioe.getMessage());
}
}
问题是读取文件的循环真的很慢:
while ((nTmp = is.read()) != -1) {
dCount++;
if (dCount % 600000000 == 0) {
System.out.println(". ");
} else if (dCount % 20000000 == 0) {
System.out.print(". ");
}
}
一个 3 GB 的文件从一个位置复制到另一个位置只需不到一分钟,计算则需要一个多小时。我可以做些什么来加快速度,还是应该尝试朝不同的方向前进,比如使用 shell 命令?
更新:感谢 ratchet freak 的建议,我将代码更改为速度快得离谱的代码(我猜快了 2048 倍...):
byte[] buff = new byte[2048];
while ((nTmp = is.read(buff)) != -1) {
dCount += 2048;
if (dCount % 614400000 == 0) {
System.out.println(". ");
} else if (dCount % 20480000 == 0) {
System.out.print(". ");
}
}
最佳答案
使用缓冲区
byte[] buff = new byte[2048];
while ((nTmp = is.read(buff)) != -1)
{
dCount+=ntmp;
//this logic won't work anymore though
/*
if (dCount % 600000000 == 0)
{
System.out.println(". ");
}
else if (dCount % 20000000 == 0)
{
System.out.print(". ");
}
*/
}
编辑:或者如果您不需要这些值,请执行
while(is.read(buff)!=-1)is.skip(600000000);
nvm 显然 DigestInputStream
的实现者很愚蠢,在发布之前没有正确测试所有内容
关于java:需要提高校验和计算的性能,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/6091544/