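I have a large list of data in Excel (.xlsx), about 60,000 rows, and I am inserting all of it into MySQL, using the Spout library to read the Excel data. After reading the data from the Excel file, some of the fields are not in Unicode and are not in English (the data is currently in Hindi), so I first use a Python script to convert this non-standard text to Unicode, and then translate the Unicode Hindi text into English with the Google Translate API. Everything works, but it takes a lot of time. With my current script, going through all of these steps inserts 200 rows into MySQL in about 20 minutes. My question is: how can I improve my current script to get better performance? Here is my current script: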
// Convert local hindi font to unicode
function KridevToUnicode($k) {
$myfile = fopen("input.txt", "w");
fwrite($myfile, $k);
fclose($myfile);
shell_exec('krutidev2unicode.py -i input.txt -o output.txt');
$file = fopen("output.txt","r");
$line = fgets($file);
// Close the file before returning; an fclose() placed after return never runs
fclose($file);
return hindiToEnglish($line);
}
// Convert hindi to english
function hindiToEnglish($text) {
$apiKey2 = 'XXXXXXXXXXXXXXXXXXXXX';
$url = 'https://www.googleapis.com/language/translate/v2?key=' . $apiKey2 . '&q=' . rawurlencode($text) . '&source=hi&target=en';
$handle = curl_init($url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($handle);
$responseDecoded = json_decode($response, true);
curl_close($handle);
return $responseDecoded['data']['translations'][0]['translatedText'];
}
//Read data from Excel and insert in MySQL
require_once 'Spout/Autoloader/autoload.php';
use Box\Spout\Reader\ReaderFactory;
use Box\Spout\Common\Type;
$reader = ReaderFactory::create(Type::XLSX);
$reader->open('a.xlsx');
foreach($reader->getSheetIterator() as $sheet) {
foreach($sheet->getRowIterator() as $row) {
ob_start();
$i++;
$membershipNumber = $row[1];
$memberName = KridevToUnicode($row[3]);
$fatherHusbandName = KridevToUnicode($row[4]);
if(is_string($row[5])) {
$dob = '';
}else {
$dob = $row[5]->format('d-m-Y');
$dob = date('Y-m-d',strtotime($dob));
}
$permanentAddress = KridevToUnicode($row[6]);
$currentAddress = KridevToUnicode($row[7]);
$district = KridevToUnicode($row[8]);
$state = KridevToUnicode($row[10]);
$pincode = $row[11];
$mobile = $row[13];
$email = $row[14];
$Shasan = KridevToUnicode($row[15]);
$Occupation = KridevToUnicode($row[16]);
$education = KridevToUnicode($row[17]);
$Inspiration = KridevToUnicode($row[18]);
$ReceiptNo = $row[19];
if(is_string($row[20])) {
$DateofReceipt = '';
}else {
$DateofReceipt = $row[20]->format('d-m-Y');
$DateofReceipt = date('Y-m-d',strtotime($DateofReceipt));
}
$sql = "INSERT INTO users (membershipNumber, name, father_husband_name, dob, permanent_address, current_address, district, state, pin_code, phone, email, shasan, profession, education, inspiration, receiptNo, DateofReceipt)
VALUES ('$membershipNumber', '$memberName', '$fatherHusbandName', '$dob', '$permanentAddress', '$currentAddress', '$district', '$state', '$pincode', '$mobile', '$email', '$Shasan', '$Occupation', '$education', '$Inspiration', '$ReceiptNo', '$DateofReceipt')";
if ($conn->query($sql) === TRUE) {
} else {
echo "Error: " . $sql . "<br>" . $conn->error;
}
echo ob_get_contents();
ob_end_flush();
}
}
$reader->close();
$conn->close();
Best answer
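Prepare a bulk insert query and execute it once you have finished building it. A single INSERT statement can insert multiple rows, so build the query that way and execute it when you are done.
Here is how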
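As a rough illustration of the bulk insert idea, the sketch below builds one multi-row INSERT with placeholders instead of issuing one query per row. The helper name buildBulkInsert() is hypothetical and not from the original answer, the sample values are made up, and the commented-out execution step assumes a PDO connection named $pdo rather than the $conn object used in the question.

```php
<?php
// Sketch only: buildBulkInsert() is a hypothetical helper for illustration.

/**
 * Build a single multi-row INSERT statement with "?" placeholders.
 * Returns an array: [SQL string, flat list of parameters].
 * Table and column names are trusted input here; only values are parameterised.
 */
function buildBulkInsert(string $table, array $columns, array $rows): array
{
    // One "(?, ?, ...)" group per row, with one "?" per column.
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = 'INSERT INTO ' . $table . ' (' . implode(', ', $columns) . ') VALUES '
         . implode(', ', array_fill(0, count($rows), $group));

    // Flatten the rows into one parameter list, keeping column order.
    $params = [];
    foreach ($rows as $row) {
        foreach ($row as $value) {
            $params[] = $value;
        }
    }
    return [$sql, $params];
}

// Example using two of the columns from the question's users table.
$rows = [
    ['M001', 'Asha'],
    ['M002', 'Ravi'],
];
[$sql, $params] = buildBulkInsert('users', ['membershipNumber', 'name'], $rows);
echo $sql, PHP_EOL;

// With an open PDO connection (an assumption, named $pdo here):
// $pdo->prepare($sql)->execute($params);
```

In the question's loop, this would mean appending each converted row to an array and flushing it with one statement every few hundred rows (and once more at the end for any remainder), skipping the flush when the array is empty. Placeholders also avoid interpolating the translated text directly into the SQL string. The batch size is a guess to tune. Since every row also triggers several shell_exec() calls and HTTP requests, those steps probably account for much of the 20 minutes as well.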
On improving PHP loop performance with large data, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51302491/