php - How to fix badly formatted JSON in PHP?

Tags: php mysql json

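I'm getting a data feed in JSON format, which is the only format available. In PHP I use json_decode to decode the JSON, but decoding kept failing, and I found that in some places the JSON is generated with double quotes inside a person's nickname. I verified this using: http://jsonformatter.curiousconcept.com

I have no control over how the data is created, but I have to handle this broken format when it occurs. The parsed data will be put into a MySQL table.

For example: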

"contact1": "David "Dave" Letterman",

json_decode returns NULL. If I save the file manually and change the quotes around the nickname Dave to single quotes, everything works fine.

$json_string = file_get_contents($json_download);
$json_array = json_decode($json_string, true);

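How do I fix the broken JSON format in json_string before it is processed by json_decode? How should the file be pre-processed: backslash-escape the double quotes around the nickname, or change them to single quotes? Is it even a good idea to store double quotes like this in MySQL?

I don't know when this will occur in each data feed, so I don't want to check only contact1 for inner double quotes. Is there a way in PHP to take a line like the example above and backslash-escape everything after the colon except the outer double quotes? Thanks!

Here is the correct code provided by tftd: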

<?php
// This:
// "contact1": "David "Dave" Letterman",
// Needs to look like this to be decoded by JSON:
// "contact1": "David \"Dave\" Letterman",

$data ='"contact1": "David "Dave" Letterman",';
function replace($match){
    $key = trim($match[1]);
    $val = trim($match[2]);

    if($val[0] == '"')
        $val = '"'.addslashes(substr($val, 1, -1)).'"';
    else if($val[0] == "'")
        $val = "'".addslashes(substr($val, 1, -1))."'";

    return $key.": ".$val;
}
$preg = preg_replace_callback("#([^{:]*):([^,}]*)#i",'replace',$data);
var_dump($preg);
$json_array = json_decode($preg);
var_dump($json_array);
echo $json_array . "\n";
echo $preg . "\n";
?>

Here is the output:

string(39) ""contact1": "David \"Dave\" Letterman","
NULL

"contact1": "David \"Dave\" Letterman",
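The NULL in the output above is expected for this particular test input: even with the inner quotes escaped, a bare key/value pair with a trailing comma is not a complete JSON document, so json_decode rejects it. A minimal sketch, assuming the fragment is wrapped in braces and the trailing comma is dropped before decoding (the wrapping step is an illustration, not part of tftd's code):

```php
<?php
// Escaped fragment, as produced by the callback above.
// In a single-quoted PHP string, \" stays as backslash + quote.
$fragment = '"contact1": "David \"Dave\" Letterman",';

// A bare key/value pair is not a JSON document, so decoding fails.
var_dump(json_decode($fragment));      // NULL
echo json_last_error_msg(), "\n";

// Assumption for illustration: wrap in braces and strip the trailing comma.
$document = '{' . rtrim(trim($fragment), ',') . '}';
$decoded  = json_decode($document, true);

echo $decoded['contact1'], "\n";       // David "Dave" Letterman
```

On a real feed the enclosing braces are already present, so only the escaping step should be needed.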

Best answer

I have my own jsonFixer() function. It works in two steps: garbage removal (to even out inconsistent formatting) and reformatting.

<?php
  function jsonFixer($json){
    $patterns     = [];
    /** garbage removal */
    $patterns[0]  = "/([\s:,\{}\[\]])\s*'([^:,\{}\[\]]*)'\s*([\s:,\{}\[\]])/"; //Find any character except colons, commas, curly and square brackets surrounded or not by spaces preceded and followed by spaces, colons, commas, curly or square brackets...
    $patterns[1]  = '/([^\s:,\{}\[\]]*)\{([^\s:,\{}\[\]]*)/'; //Find any left curly brackets surrounded or not by one or more of any character except spaces, colons, commas, curly and square brackets...
    $patterns[2]  =  "/([^\s:,\{}\[\]]+)}/"; //Find any right curly brackets preceded by one or more of any character except spaces, colons, commas, curly and square brackets...
    $patterns[3]  = "/(}),\s*/"; //JSON.parse() doesn't allow trailing commas
    /** reformatting */
    $patterns[4]  = '/([^\s:,\{}\[\]]+\s*)*[^\s:,\{}\[\]]+/'; //Find or not one or more of any character except spaces, colons, commas, curly and square brackets followed by one or more of any character except spaces, colons, commas, curly and square brackets...
    $patterns[5]  = '/["\']+([^"\':,\{}\[\]]*)["\']+/'; //Find one or more of quotation marks or/and apostrophes surrounding any character except colons, commas, curly and square brackets...
    $patterns[6]  = '/(")([^\s:,\{}\[\]]+)(")(\s+([^\s:,\{}\[\]]+))/'; //Find or not one or more of any character except spaces, colons, commas, curly and square brackets surrounded by quotation marks followed by one or more spaces and  one or more of any character except spaces, colons, commas, curly and square brackets...
    $patterns[7]  = "/(')([^\s:,\{}\[\]]+)(')(\s+([^\s:,\{}\[\]]+))/"; //Find or not one or more of any character except spaces, colons, commas, curly and square brackets surrounded by apostrophes followed by one or more spaces and  one or more of any character except spaces, colons, commas, curly and square brackets...
    $patterns[8]  = '/(})(")/'; //Find any right curly brackets followed by quotation marks...
    $patterns[9]  = '/,\s+(})/'; //Find any comma followed by one or more spaces and a right curly bracket...
    $patterns[10] = '/\s+/'; //Find one or more spaces...
    $patterns[11] = '/^\s+/'; //Find one or more spaces at start of string...

    $replacements     = [];
    /** garbage removal */
    $replacements[0]  = '$1 "$2" $3'; //...and put quotation marks surrounded by spaces between them;
    $replacements[1]  = '$1 { $2'; //...and put spaces between them;
    $replacements[2]  = '$1 }'; //...and put a space between them;
    $replacements[3]  = '$1'; //...so, remove trailing commas of any right curly brackets;
    /** reformatting */
    $replacements[4]  = '"$0"'; //...and put quotation marks surrounding them;
    $replacements[5]  = '"$1"'; //...and replace by single quotation marks;
    $replacements[6]  = '\\$1$2\\$3$4'; //...and add back slashes to its quotation marks;
    $replacements[7]  = '\\$1$2\\$3$4'; //...and add back slashes to its apostrophes;
    $replacements[8]  = '$1, $2'; //...and put a comma followed by a space character between them;
    $replacements[9]  = ' $1'; //...and replace by a space followed by a right curly bracket;
    $replacements[10] = ' '; //...and replace by one space;
    $replacements[11] = ''; //...and remove it.

    $result = preg_replace($patterns, $replacements, $json);

    return $result;
  }
?>

Usage example:

<?php
  // Received badly formatted json:
  // {"contact1": "David "Dave" Letterman", price : 30.00, 'details' : "Greatest 'Hits' Album"}
  $json_string = '{"contact1": "David "Dave" Letterman", price : 30.00, \'details\' : "Greatest \'Hits\' Album"}';
  echo jsonFixer($json_string); // print the repaired string; the return value was previously discarded
?>

Result:

{"contact1": "David \"Dave\" Letterman", "price" : "30.00", "details" : "Greatest \'Hits\' Album"}
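One caveat about the result string shown above: JSON has no \' escape, so json_decode still rejects it as printed. A minimal sketch, assuming it is acceptable to turn every \' back into a plain apostrophe before decoding (this extra step is not part of jsonFixer() itself, and it would mangle input where an escaped backslash is directly followed by an apostrophe):

```php
<?php
// The result string from above, copied verbatim (a nowdoc keeps backslashes as-is).
$fixed = <<<'JSON'
{"contact1": "David \"Dave\" Letterman", "price" : "30.00", "details" : "Greatest \'Hits\' Album"}
JSON;

// \' is not a valid JSON escape sequence, so this is a syntax error.
var_dump(json_decode($fixed, true));   // NULL

// Assumption: apostrophes need no escaping inside double-quoted JSON strings,
// so the backslash in front of them can simply be removed.
$clean = str_replace("\\'", "'", $fixed);
$data  = json_decode($clean, true);

echo $data['details'], "\n";           // Greatest 'Hits' Album
```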

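Note: this has not been tested on all possible badly formatted JSON strings, but I have used it on complex multi-level JSON strings and it has worked well so far.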
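Because a regex-based repair cannot cover every malformed input, it may be worth decoding strictly first and only falling back to the repair when that fails, then checking the result again. A rough sketch of that pattern follows; decodeWithRepair() and the stand-in repair closure are hypothetical names for illustration, and in practice jsonFixer would be passed as the repair callable:

```php
<?php
// Hypothetical helper: try a plain decode first, repair only on failure,
// and refuse to return anything that still does not decode to an array.
function decodeWithRepair(string $json, callable $repair): array
{
    $decoded = json_decode($json, true);
    if (is_array($decoded)) {
        return $decoded;               // input was already valid
    }

    $decoded = json_decode($repair($json), true);
    if (!is_array($decoded)) {
        throw new RuntimeException('Unrepairable JSON: ' . json_last_error_msg());
    }
    return $decoded;
}

// Stand-in repair for this example only: swap apostrophes for double quotes.
$standInRepair = function (string $s): string {
    return str_replace("'", '"', $s);
};

$data = decodeWithRepair("{'price': 30.00}", $standInRepair);
var_dump($data);
```

Throwing on a failed repair keeps rows that could not be parsed out of the MySQL table instead of silently inserting NULLs.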

Regarding "php - How to fix badly formatted JSON in PHP?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13236819/
