unicode - What Unicode properties will a Perl 6 character match?

Tags: unicode raku

.uniprop returns a single property:

put join ', ', 'A'.uniprop;

I get back one property (the general category):
Lu

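Looking around, I didn't find a way to get all the other properties (including derived ones such as ID_Start). What am I missing? I know I could look at the data files, but I'd rather have a single method that returns a list.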

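I'm mostly interested in this because regexes understand properties and match the right ones. I'd like to take any character and show which properties it will match.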

Best answer

"A".uniprop("Alphabetic") will get the Alphabetic property. Are you asking which other properties are possible?
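
As a minimal sketch of querying individual properties by name: the property names here are taken from the table later in this answer, and the exact values returned may depend on the Rakudo version and the Unicode data it ships with.

```raku
my $char = 'A';

# With no argument, .uniprop returns the General_Category
say $char.uniprop;

# A boolean property comes back as a Bool
say $char.uniprop('Alphabetic');

# String-valued properties come back as strings
say $char.uniprop('Script');
say $char.uniprop('Block');
```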

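All of the properties that have a checkmark next to them will likely work. This issue just shows the status of their roast tests: https://github.com/perl6/roast/issues/195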

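This may be more useful to you: https://github.com/rakudo/rakudo/blob/master/src/core/Cool.pm6#L396-L483
The first hash just maps aliases of the property names to their full names. The second hash specifies whether the property is B (boolean), S (string), I (integer), nv (numeric value), na (Unicode name), or one of a few other special cases.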
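
The type code matters because it decides how a property is written inside a regex. A rough sketch, assuming the same assertion forms that the script in the update further down generates (the chosen character and properties are only illustrative):

```raku
my $char = 'a';

# 'B' (boolean): the bare property name, or negated with '!'
say so $char ~~ /<:Alphabetic>/;
say so $char ~~ /<:!Uppercase>/;

# 'S' (string): the property name followed by the expected value
say so $char ~~ /<:Script<Latin>>/;
```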

If I have misunderstood your question, please let me know and I will revise this answer.

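Update: It seems you want to find out all the properties that will match. What you will want to do is iterate over everything in https://github.com/rakudo/rakudo/blob/master/src/core/Cool.pm6#L396-L483 and look only at the string, integer and boolean properties. Here is the full thing: https://gist.github.com/samcv/ae09060a781bb4c36ae6cac80ea9325f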

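sub MAIN {
    use Test;
    # Rakudo normally refuses to compile EVAL on an interpolated string
    # unless this pragma is in effect.
    use MONKEY-SEE-NO-EVAL;
    my $char = 'a';
    my @result = what-matches($char);
    # Check that every generated property assertion really matches $char.
    for @result {
        ok EVAL("'$char' ~~ /$_/"), "$char ~~ /$_/";
    }
    done-testing;
}

# Returns the regex property assertions that $chr is expected to match.
sub what-matches (Str:D $chr) {
    my @result;
    my %prefs = prefs();
    for %prefs.keys -> $key {
        given %prefs{$key} {
            # String-valued properties
            when 'S' {
                my $propval = $chr.uniprop($key);
                if $key eq 'Block' {
                    # Blocks are written as "In" plus the block name without spaces
                    @result.push: "<:In" ~ $propval.trans(' ' => '') ~ ">";
                }
                elsif $propval {
                    @result.push: "<:" ~ $key ~ "<" ~ $propval ~ ">>";
                }
            }
            # Integer-valued properties
            when 'I' {
                @result.push: "<:" ~ $key ~ "<" ~ $chr.uniprop($key) ~ ">>";
            }
            # Boolean properties: either the property or its negation matches
            when 'B' {
                @result.push: ($chr.uniprop($key) ?? "<:$key>" !! "<:!$key>");
            }
        }
    }
    @result;
}

# Property name => type code, copied from the table in Rakudo's src/core/Cool.pm6.
# A plain Raku hash is used here, so `use nqp` is not needed.
sub prefs {
    my %prefs = (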
          'Other_Grapheme_Extend','B','Titlecase_Mapping','tc','Dash','B',
          'Emoji_Modifier_Base','B','Emoji_Modifier','B','Pattern_Syntax','B',
          'IDS_Trinary_Operator','B','ID_Continue','B','Diacritic','B','Cased','B',
          'Hangul_Syllable_Type','S','Quotation_Mark','B','Radical','B',
          'NFD_Quick_Check','S','Joining_Type','S','Case_Folding','S','Script','S',
          'Soft_Dotted','B','Changes_When_Casemapped','B','Simple_Case_Folding','S',
          'ISO_Comment','S','Lowercase','B','Join_Control','B','Bidi_Class','S',
          'Joining_Group','S','Decomposition_Mapping','S','Lowercase_Mapping','lc',
          'NFKC_Casefold','S','Simple_Lowercase_Mapping','S',
          'Indic_Syllabic_Category','S','Expands_On_NFC','B','Expands_On_NFD','B',
          'Uppercase','B','White_Space','B','Sentence_Terminal','B',
          'NFKD_Quick_Check','S','Changes_When_Titlecased','B','Math','B',
          'Uppercase_Mapping','uc','NFKC_Quick_Check','S','Sentence_Break','S',
          'Simple_Titlecase_Mapping','S','Alphabetic','B','Composition_Exclusion','B',
          'Noncharacter_Code_Point','B','Other_Alphabetic','B','XID_Continue','B',
          'Age','S','Other_ID_Start','B','Unified_Ideograph','B','FC_NFKC_Closure','S',
          'Case_Ignorable','B','Hyphen','B','Numeric_Value','nv',
          'Changes_When_NFKC_Casefolded','B','Expands_On_NFKD','B',
          'Indic_Positional_Category','S','Decomposition_Type','S','Bidi_Mirrored','B',
          'Changes_When_Uppercased','B','ID_Start','B','Grapheme_Extend','B',
          'XID_Start','B','Expands_On_NFKC','B','Other_Uppercase','B','Other_Math','B',
          'Grapheme_Link','B','Bidi_Control','B','Default_Ignorable_Code_Point','B',
          'Changes_When_Casefolded','B','Word_Break','S','NFC_Quick_Check','S',
          'Other_Default_Ignorable_Code_Point','B','Logical_Order_Exception','B',
          'Prepended_Concatenation_Mark','B','Other_Lowercase','B',
          'Other_ID_Continue','B','Variation_Selector','B','Extender','B',
          'Full_Composition_Exclusion','B','IDS_Binary_Operator','B','Numeric_Type','S',
          'kCompatibilityVariant','S','Simple_Uppercase_Mapping','S',
          'Terminal_Punctuation','B','Line_Break','S','East_Asian_Width','S',
          'ASCII_Hex_Digit','B','Pattern_White_Space','B','Hex_Digit','B',
          'Bidi_Paired_Bracket_Type','S','General_Category','S',
          'Grapheme_Cluster_Break','S','Grapheme_Base','B','Name','na','Ideographic','B',
          'Block','S','Emoji_Presentation','B','Emoji','B','Deprecated','B',
          'Changes_When_Lowercased','B','Bidi_Mirroring_Glyph','bmg',
          'Canonical_Combining_Class','S',
    );
}

Regarding "unicode - What Unicode properties will a Perl 6 character match?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49245522/