When splitting a string with split, the result was:
[a,a,|,b,|,c,c]
First, look at how split is documented:
Splits this string around matches of the given regular expression. This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
The string "boo:and:foo", for example, yields the following results with these expressions:
Regex    Result
:        { "boo", "and", "foo" }
o        { "b", "", ":and:f" }
Parameters:
regex the delimiting regular expression
Returns:
the array of strings computed by splitting this string around matches of the given regular expression
Throws:
PatternSyntaxException - if the regular expression's syntax is invalid
Since:
1.4
See Also:
java.util.regex.Pattern
@spec
JSR-51
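The documented behaviour can be checked with a minimal sketch. The class name SplitDocExample and the helper splitToString are made up for illustration; only String.split and Arrays.toString come from the JDK. The last call passes a negative limit to the two-argument split, which keeps the trailing empty strings that the one-argument form drops.

```java
import java.util.Arrays;

public class SplitDocExample {
    // Hypothetical helper: split with the default limit of zero and render the array.
    public static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // The two rows of the table in the documentation.
        System.out.println(splitToString("boo:and:foo", ":")); // [boo, and, foo]
        System.out.println(splitToString("boo:and:foo", "o")); // [b, , :and:f]

        // A negative limit keeps the trailing empty strings.
        System.out.println(Arrays.toString("boo:and:foo".split("o", -1))); // [b, , :and:f, , ]
    }
}
```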
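As the documentation shows, the argument to split is a regular expression. Regular expressions contain some special characters to watch out for, each with its own meaning: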
http://www.fon.hum.uva.nl/praat/manual/Regular_expressions_1__Special_characters.html
\ the backslash escape character. The backslash gives special meaning to the character following it. For example, the combination "\n" stands for the newline, one of the control characters. The combination "\w" stands for a "word" character, one of the convenience escape sequences, while "\1" is one of the substitution special characters.
Example: The regex "aa\n" tries to match two consecutive "a"s at the end of a line, including the newline character itself.
Example: "a\+" matches "a+" and not a series of one or more "a"s.
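In Java source code the backslash is also the escape character of string literals, so a regex backslash has to be written twice. A small sketch of the difference, with BackslashEscapeExample and fullMatch as illustrative names (Pattern.matches is the real JDK call and requires the whole input to match):

```java
import java.util.regex.Pattern;

public class BackslashEscapeExample {
    // Hypothetical helper: true only if the regex matches the entire input.
    public static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unescaped: "+" is a quantifier, so "a+" means one or more "a"s.
        System.out.println(fullMatch("a+", "aaa"));   // true
        System.out.println(fullMatch("a+", "a+"));    // false

        // Escaped: the Java literal "a\\+" is the regex a\+, a literal plus sign.
        System.out.println(fullMatch("a\\+", "a+"));  // true
        System.out.println(fullMatch("a\\+", "aaa")); // false
    }
}
```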
^ the caret is the start of line anchor or the negate symbol.
Example: "^a" matches "a" at the start of a line.
Example: "[^0-9]" matches any non digit.
$ the dollar is the end of line anchor.
Example: "b$" matches a "b" at the end of a line.
Example: "^$" matches the empty line.
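A short sketch of how the anchors behave in Java. The class and method names are illustrative. One Java-specific point: by default "^" and "$" anchor to the whole input, and only with the Pattern.MULTILINE flag do they also match at line breaks inside the input.

```java
import java.util.regex.Pattern;

public class AnchorExample {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Same search, but with MULTILINE so ^ and $ also match around line breaks.
    public static boolean foundMultiline(String regex, String input) {
        return Pattern.compile(regex, Pattern.MULTILINE).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^a", "abc"));          // true
        System.out.println(found("^a", "bca"));          // false
        System.out.println(found("b$", "ab"));           // true
        System.out.println(found("^$", ""));             // true
        System.out.println(found("[^0-9]", "123"));      // false: digits only
        System.out.println(found("[^0-9]", "12x"));      // true: "x" is a non digit

        // "a" starts the second line, not the input.
        System.out.println(found("^a", "x\na"));          // false
        System.out.println(foundMultiline("^a", "x\na")); // true
    }
}
```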
{ } the open and close curly brackets are used as range quantifiers.
Example: "a{2,3}" matches "aa" or "aaa".
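The range quantifier can be tried out both as a full match and as a split delimiter. This is only a sketch; RangeQuantifierExample and the sample strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class RangeQuantifierExample {
    // Hypothetical helper: split and render the result for printing.
    public static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // "a{2,3}" accepts exactly two or three "a"s when the whole input must match.
        System.out.println(Pattern.matches("a{2,3}", "a"));    // false
        System.out.println(Pattern.matches("a{2,3}", "aa"));   // true
        System.out.println(Pattern.matches("a{2,3}", "aaa"));  // true
        System.out.println(Pattern.matches("a{2,3}", "aaaa")); // false

        // As a delimiter: runs of two or three dashes split, a single dash does not.
        System.out.println(splitToString("x--y---z-w", "-{2,3}")); // [x, y, z-w]
    }
}
```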
[ ] the open and close square brackets define a character class to match a single character.
The "^" as the first character following the "[" negates, and the match is for the characters not listed. The "-" denotes a range of characters. Inside a "[ ]" character class construction most special characters are interpreted as ordinary characters.
Example: "[d-f]" is the same as "[def]" and matches "d", "e" or "f".
Example: "[a-z]" matches any lowercase character in the alphabet.
Example: "[^0-9]" matches any character that is not a digit.
Example: A search for "[][()?<>$^.*?^]" in the string "[]()?<>$^.*?^" followed by a replace string "r" has the result "rrrrrrrrrrrrr". Here the search string is one character class and all the meta characters are interpreted as ordinary characters without the need to escape them.
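Because most special characters lose their meaning inside a character class, wrapping a delimiter in "[ ]" is one way to split on it literally in Java. A minimal sketch, with illustrative names and sample strings. Note that the rules above describe Praat's regex dialect: Java treats an unescaped "[" inside a class as the start of a nested class, so the long Praat example may not carry over verbatim.

```java
import java.util.Arrays;

public class CharClassExample {
    // Hypothetical helper: split and render the result for printing.
    public static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // "|" and "." are ordinary characters inside a character class.
        System.out.println(splitToString("aa|b|cc", "[|]")); // [aa, b, cc]
        System.out.println(splitToString("a.b.c", "[.]"));   // [a, b, c]

        // A negated class as delimiter: split on anything that is not a lowercase letter.
        System.out.println(splitToString("a1b2c", "[^a-z]")); // [a, b, c]

        // A range as delimiter: "[d-f]" splits on "d", "e" or "f".
        System.out.println(splitToString("xdyez", "[d-f]"));  // [x, y, z]
    }
}
```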
( ) the open and close parentheses are used for grouping characters (or other regex).
The groups can be referenced in both the search and the substitution phase. There also exist some special constructs with parentheses.
Example: "(ab)\1" matches "abab".
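A sketch of both uses of groups in Java, with an illustrative class name and sample data. In the search phase Java uses the same "\1" backreference (written "\\1" in a string literal); in the substitution phase Java's replaceAll refers to groups as "$1", "$2" rather than with a backslash.

```java
import java.util.regex.Pattern;

public class GroupExample {
    // Hypothetical helper: swap two dash-separated numeric fields using group references.
    public static String swapFields(String input) {
        return input.replaceAll("(\\d+)-(\\d+)", "$2/$1");
    }

    public static void main(String[] args) {
        // Search phase: "\\1" must repeat exactly what group 1 matched.
        System.out.println(Pattern.matches("(ab)\\1", "abab")); // true
        System.out.println(Pattern.matches("(ab)\\1", "abba")); // false

        // Substitution phase: groups are referenced as $1 and $2.
        System.out.println(swapFields("2024-01")); // 01/2024
    }
}
```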
. the dot matches any character except the newline.
Example: ".a" matches two consecutive characters where the last one is "a".
Example: ".*\.txt$" matches all strings that end in ".txt".
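The dot is a common trap with split: used unescaped as a delimiter, it matches every character, so every piece is empty and all of them are dropped as trailing empty strings. A minimal sketch, assuming a made-up class name and sample strings:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class DotExample {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Unescaped dot: every character is a delimiter, nothing is left.
        System.out.println("a.b.c".split(".").length);             // 0

        // Escaped dot: split on the literal "." only.
        System.out.println(Arrays.toString("a.b.c".split("\\."))); // [a, b, c]

        // ".a" needs some character followed by "a".
        System.out.println(Pattern.matches(".a", "ba"));           // true

        // The file-extension pattern needs the dot escaped to mean a literal ".".
        System.out.println(found(".*\\.txt$", "notes.txt"));       // true
        System.out.println(found(".*\\.txt$", "notes_txt"));       // false
    }
}
```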
* the star is the match-zero-or-more quantifier.
Example: "^.*$" matches an entire line.
+ the plus is the match-one-or-more quantifier.
? the question mark is the match-zero-or-one quantifier. The question mark is also used in special constructs with parentheses and in changing match behaviour.
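The three quantifiers side by side, as a rough sketch with illustrative names and sample strings:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuantifierExample {
    // Hypothetical helper: split and render the result for printing.
    public static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // "*": zero or more "b"s after the "a".
        System.out.println(Pattern.matches("ab*", "a"));          // true
        // "+": at least one "b" is required.
        System.out.println(Pattern.matches("ab+", "a"));          // false
        // "?": the "u" is optional, but at most one is allowed.
        System.out.println(Pattern.matches("colou?r", "color"));   // true
        System.out.println(Pattern.matches("colou?r", "colour"));  // true
        System.out.println(Pattern.matches("colou?r", "colouur")); // false

        // ",+" treats a run of commas as a single delimiter.
        System.out.println(splitToString("a,,b,,,c", ",+"));       // [a, b, c]
    }
}
```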
| the vertical pipe separates a series of alternatives.
Example: "(a|b|c)a" matches "aa" or "ba" or "ca".
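The pipe is what produces a character-by-character result like the one at the top of this post. A bare "|" is an alternation between two empty alternatives, so it matches the empty string at every position. The sketch below assumes the input was "aa|b|cc" (the post does not state it) and assumes Java 8 or later, where a zero-width match at the start no longer yields a leading empty string. Class and helper names are illustrative.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class AlternationExample {
    // Hypothetical helper: split and render the result for printing.
    public static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // Unescaped pipe: matches the empty string between every two characters.
        System.out.println(splitToString("aa|b|cc", "|"));   // [a, a, |, b, |, c, c]

        // Escaped pipe: split on the literal "|" character.
        System.out.println(splitToString("aa|b|cc", "\\|")); // [aa, b, cc]

        // Used as intended, the pipe separates alternatives.
        System.out.println(Pattern.matches("(a|b|c)a", "ba")); // true
        System.out.println(Pattern.matches("(a|b|c)a", "da")); // false
    }
}
```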
< > the smaller and greater signs are anchors that specify a left or right word boundary.
- the minus indicates a range in a character class (when it is not at the first position after the "[" opening bracket or the last position before the "]" closing bracket).
Example: "[A-Z]" matches any uppercase character.
Example: "[A-Z-]" or "[-A-Z]" match any uppercase character or "-".
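A quick sketch of the minus in and out of a character class, using made-up names and sample strings. Outside a class the minus has no special meaning, so it can be used as a split delimiter without escaping.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class MinusExample {
    // Hypothetical helper: split and render the result for printing.
    public static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // In the middle of a class the minus forms a range; "-" itself is not included.
        System.out.println(Pattern.matches("[A-Z]+", "A-B"));  // false

        // At the end or the start of the class it is a literal "-".
        System.out.println(Pattern.matches("[A-Z-]+", "A-B")); // true
        System.out.println(Pattern.matches("[-A-Z]+", "A-B")); // true

        // Outside a class it is an ordinary character.
        System.out.println(splitToString("2024-01-15", "-"));  // [2024, 01, 15]
    }
}
```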
& the and (ampersand) is the "substitute complete match" symbol.
Summary:
When applying regex operations to strings, take care to escape special characters.
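When the delimiter is meant literally, one way to avoid escaping by hand is Pattern.quote, which wraps a string so that the regex engine treats every character in it as ordinary. The helper splitLiteral below is a hypothetical convenience wrapper, not part of the JDK; the sample inputs are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralSplitExample {
    // Hypothetical helper: split on the delimiter taken literally,
    // whatever special characters it contains.
    public static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("aa|b|cc", "|"))); // [aa, b, cc]
        System.out.println(Arrays.toString(splitLiteral("a.b.c", ".")));   // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("1+2+3", "+")));   // [1, 2, 3]
        System.out.println(Arrays.toString(splitLiteral("x$$y", "$$")));   // [x, y]
    }
}
```

Escaping by hand ("\\|", "\\.") works just as well for a fixed delimiter; quoting is mainly useful when the delimiter comes from a variable or from user input.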