javascript - PHP implementation of JavaScript escape() and unescape()

Tags: javascript php encoding escaping

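First of all, I know that JS escape() and unescape() are both deprecated. Basically, we have an ancient system that runs JS escape() on data before storing it in the database, so every time we have to unescape() the data on the client side before the actual data can be displayed. (I know this is silly, but it was done years ago to support Unicode characters on a database that was not Unicode-compatible.)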

Is there any existing PHP implementation that emulates the JavaScript escape() and unescape() functions?

Best answer

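After some searching, I was able to put together two PHP functions that do what I want. The code is not pretty, but it works 100% on the data we currently have, so I thought I would share it here.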

/**
 *  Simulate javascript escape() function
 */
function escapejs($source) {
    $map = array(
       '~'        => '%7E'
      ,'!'        => '%21'
      ,'\''       => '%27'       // single quote
      ,'('        => '%28'
      ,')'        => '%29'
      ,'#'        => '%23'
      ,'$'        => '%24'
      ,'&'        => '%26'
      ,','        => '%2C'
      ,':'        => '%3A'
      ,';'        => '%3B'
      ,'='        => '%3D'
      ,'?'        => '%3F'
      ,' '       => '%20'       // space
      ,'"'        => '%22'       // double quote
      ,'%'        => '%25'
      ,'<'        => '%3C'
      ,'>'        => '%3E'
      ,'['        => '%5B'
      ,'\\'       => '%5C'       // backslash \
      ,']'        => '%5D'
      ,'^'        => '%5E'
      ,'{'        => '%7B'
      ,'|'        => '%7C'
      ,'}'        => '%7D'
      ,'`'        => '%60'
      ,chr(9)     => '%09'
      ,chr(10)    => '%0A'
      ,chr(13)    => '%0D'
      ,'¡'       => '%A1'
      ,'¢'       => '%A2'
      ,'£'       => '%A3'
      ,'¤'       => '%A4'
      ,'¥'       => '%A5'
      ,'¦'       => '%A6'
      ,'§'       => '%A7'
      ,'¨'       => '%A8'
      ,'©'       => '%A9'
      ,'ª'       => '%AA'
      ,'«'       => '%AB'
      ,'¬'       => '%AC'
      ,"\xC2\xAD" => '%AD'       // soft hyphen (U+00AD), written as UTF-8 bytes
      ,'®'       => '%AE'
      ,'¯'       => '%AF'
      ,'°'       => '%B0'
      ,'±'       => '%B1'
      ,'²'       => '%B2'
      ,'³'       => '%B3'
      ,'´'       => '%B4'
      ,'µ'       => '%B5'
      ,'¶'       => '%B6'
      ,'·'       => '%B7'
      ,'¸'       => '%B8'
      ,'¹'       => '%B9'
      ,'º'       => '%BA'
      ,'»'       => '%BB'
      ,'¼'       => '%BC'
      ,'½'       => '%BD'
      ,'¾'       => '%BE'
      ,'¿'       => '%BF'
      ,'À'       => '%C0'
      ,'Á'       => '%C1'
      ,'Â'       => '%C2'
      ,'Ã'       => '%C3'
      ,'Ä'       => '%C4'
      ,'Å'       => '%C5'
      ,'Æ'       => '%C6'
      ,'Ç'       => '%C7'
      ,'È'       => '%C8'
      ,'É'       => '%C9'
      ,'Ê'       => '%CA'
      ,'Ë'       => '%CB'
      ,'Ì'       => '%CC'
      ,'Í'       => '%CD'
      ,'Î'       => '%CE'
      ,'Ï'       => '%CF'
      ,'Ð'       => '%D0'
      ,'Ñ'       => '%D1'
      ,'Ò'       => '%D2'
      ,'Ó'       => '%D3'
      ,'Ô'       => '%D4'
      ,'Õ'       => '%D5'
      ,'Ö'       => '%D6'
      ,'×'       => '%D7'
      ,'Ø'       => '%D8'
      ,'Ù'       => '%D9'
      ,'Ú'       => '%DA'
      ,'Û'       => '%DB'
      ,'Ü'       => '%DC'
      ,'Ý'       => '%DD'
      ,'Þ'       => '%DE'
      ,'ß'       => '%DF'
      ,'à'       => '%E0'
      ,'á'       => '%E1'
      ,'â'       => '%E2'
      ,'ã'       => '%E3'
      ,'ä'       => '%E4'
      ,'å'       => '%E5'
      ,'æ'       => '%E6'
      ,'ç'       => '%E7'
      ,'è'       => '%E8'
      ,'é'       => '%E9'
      ,'ê'       => '%EA'
      ,'ë'       => '%EB'
      ,'ì'       => '%EC'
      ,'í'       => '%ED'
      ,'î'       => '%EE'
      ,'ï'       => '%EF'
      ,'ð'       => '%F0'
      ,'ñ'       => '%F1'
      ,'ò'       => '%F2'
      ,'ó'       => '%F3'
      ,'ô'       => '%F4'
      ,'õ'       => '%F5'
      ,'ö'       => '%F6'
      ,'÷'       => '%F7'
      ,'ø'       => '%F8'
      ,'ù'       => '%F9'
      ,'ú'       => '%FA'
      ,'û'       => '%FB'
      ,'ü'       => '%FC'
      ,'ý'       => '%FD'
      ,'þ'       => '%FE'
      ,'ÿ'       => '%FF'
    );

    $convmap = array(0x80, 0x10ffff, 0, 0xffffff);

    $org = $source;

    // make sure string is UTF8
    if (false === mb_check_encoding($source, 'UTF-8')) {
        if (false === ($source = iconv(mb_detect_encoding($source, mb_detect_order(), true), "UTF-8", $source))) {
          $source = $org;
        }
    }

    $chrArray = preg_split('//u', $source, -1, PREG_SPLIT_NO_EMPTY);  // split up the UTF8 string into chars
    $oChrArray = array();

    foreach ($chrArray as $index => $chr) {

      if (isset($map[$chr])) {
        $chr = $map[$chr];
      }
      // if char doesn't fall within ASCII then assume unicode, get the hex html entities
      //elseif (mb_detect_encoding($chr, 'ASCII', true) !== 'ASCII') {
      else {
        $chr = mb_encode_numericentity($chr, $convmap, "UTF-8", true);

        // since we will be converting the &#x notation to the non-standard %u for backward compatibility, make sure the code is 4 digits long by prepending 0
        if (substr($chr, 0, 3) == '&#x' && substr($chr, -1) == ';' && strlen($chr) == 7)
          $chr = '&#x0'.substr($chr, 3);
      }

      $oChrArray[] = $chr;
    }
    $decodedStr = implode('', $oChrArray);
    $decodedStr = preg_replace('/&#x([0-9A-F]{4});/', '%u$1', $decodedStr);   // we need to use the %uXXXX format to simulate results generated with js escape()
    return $decodedStr;
}

/**
 *  Simulate javascript unescape() function
 */
function unescapejs($source) {
    $source = str_replace(array('%0B'), array(''), $source);    // strip out vertical tab
    // only treat valid hex sequences as escapes; anything else is left untouched
    $s = preg_replace('/%u([0-9A-F]{4})/i', '&#x$1;', $source);
    $s = preg_replace('/%([0-9A-F]{2})/i', '&#x$1;', $s);
    return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
}
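
For reference, the rules that the two PHP functions above are approximating can be sketched in JavaScript itself. This is only a minimal sketch based on how escape() and unescape() are specified, not code from the original answer, and the names escapeSketch and unescapeSketch are hypothetical. It leaves the characters A-Z a-z 0-9 @ * _ + - . / untouched, encodes any other UTF-16 code unit below 256 as %XX, and encodes everything else as %uXXXX. Because it works on UTF-16 code units, a character outside the Basic Multilingual Plane comes out as two %uXXXX sequences (a surrogate pair), which is worth keeping in mind when comparing against the PHP output.

```javascript
// Characters that escape() leaves as-is.
const UNRESERVED = /[A-Za-z0-9@*_+\-.\/]/;

// Hypothetical helper: mirrors the documented behavior of escape().
function escapeSketch(str) {
  let out = '';
  for (let i = 0; i < str.length; i++) {   // walk UTF-16 code units
    const ch = str[i];
    const code = str.charCodeAt(i);
    if (UNRESERVED.test(ch)) {
      out += ch;                            // unreserved: copy through
    } else if (code < 256) {
      out += '%' + code.toString(16).toUpperCase().padStart(2, '0');
    } else {
      out += '%u' + code.toString(16).toUpperCase().padStart(4, '0');
    }
  }
  return out;
}

// Hypothetical helper: mirrors the documented behavior of unescape().
// Only well-formed %uXXXX and %XX sequences are decoded, in a single pass.
function unescapeSketch(str) {
  return str.replace(
    /%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/g,
    (match, u, x) => String.fromCharCode(parseInt(u || x, 16))
  );
}
```

Running a handful of stored values through both a sketch like this and the PHP functions is one way to check that the port agrees on your own data.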

Regarding javascript - PHP implementation of JavaScript escape() and unescape(), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28977523/
