How do I split a multibyte string into words in PHP?

This is what I have so far, but I would like to improve the code:

   mb_internal_encoding('UTF-8');
   mb_regex_encoding('UTF-8');
   // The hyphen goes last in the class so it is literal; ":-_" would be a range from ":" to "_".
   $arr = mb_split('[\s\[\]().,;:_-]', $str);

Is there a way to say that a word is a sequence of "alpha" characters (without spelling out a-z, because I want to include non-Latin characters)?

Solution:

Try this:

preg_match_all('/[\p{L}\p{M}]+/u', $subject, $result, PREG_PATTERN_ORDER);
foreach ($result[0] as $word) {
    // $word holds one matched word
}

This matches any run of letters, together with their combining marks (such as accents), as a word:

[\p{L}\p{M}]       # Match a single character from the set below:
                   #   \p{L}: a character with the Unicode property "letter" (any kind of letter from any language)
                   #   \p{M}: a character with the Unicode property "mark" (a character intended to be combined with another character, e.g. accents, umlauts, enclosing boxes)
+                  # Between one and unlimited times, as many times as possible, giving back as needed (greedy)

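As a rough sketch of how the pattern could be wrapped for reuse, the snippet below puts the same preg_match_all call into a small helper. The function name split_words and the sample strings are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper (the name split_words is an assumption, not from the answer).
// Returns every run of Unicode letters and combining marks found in $text.
function split_words(string $text): array
{
    // The /u modifier makes PCRE treat both pattern and subject as UTF-8.
    $count = preg_match_all('/[\p{L}\p{M}]+/u', $text, $matches);
    if ($count === false) {
        // preg_match_all() returns false on failure, for example on invalid UTF-8.
        throw new RuntimeException('Regex matching failed, error code ' . preg_last_error());
    }
    return $matches[0];
}

// Punctuation, brackets and whitespace all act as separators.
print_r(split_words('Привет, мир! Grüße [aus] Köln; 日本語'));

// A decomposed accent (e followed by U+0301) stays attached to its word
// because \p{M} is part of the character class.
print_r(split_words("cafe\u{0301} au lait"));
```

The first call should yield the six words Привет, мир, Grüße, aus, Köln and 日本語. Digits and underscores are not letters, so under this definition they also split words; if they should count, \p{N} or _ would need to be added to the class.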
