Basic Grammar
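Grouping and Backreferences
(regex): capturing group. In (abc){3}\1 there is one capturing group, (abc); the group is repeated 3 times, and \1 then refers back to capturing group 1.
(?:regex): non-capturing group. In (?:abc){3} the group (abc) is repeated 3 times but is not captured, so it cannot be backreferenced.
(?<name>regex): named capturing group, supported since Java 7.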
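A minimal Java sketch of the three group types above, assuming java.util.regex. The class name, method names and the year/month pattern are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupingDemo {
    // Capturing group repeated 3 times, then a backreference to group 1,
    // so the whole input must be "abc" four times in a row.
    static boolean repeatedThenBackref(String input) {
        return Pattern.matches("(abc){3}\\1", input);
    }

    // A non-capturing group does not create a numbered group.
    static int nonCapturingGroupCount() {
        return Pattern.compile("(?:abc){3}").matcher("").groupCount();
    }

    // Named capturing group: the captured text is read back by name.
    static String yearOf(String date) {
        Matcher m = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})").matcher(date);
        return m.find() ? m.group("year") : null;
    }

    public static void main(String[] args) {
        System.out.println(repeatedThenBackref("abcabcabcabc")); // true
        System.out.println(repeatedThenBackref("abcabcabc"));    // false
        System.out.println(nonCapturingGroupCount());            // 0
        System.out.println(yearOf("2014-07"));                   // 2014
    }
}
```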
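Modifiers
(?i)regex(?-i): turns case-insensitive matching on and then off again.
(?s)regex(?-s): turns "dot matches newline" on and then off again.
(?m)regex: ^ matches at the start of every line and $ at the end of every line.
(?-m)regex: ^ matches at the start of the whole input and $ at the end of the whole input.
(?i-sm)regex: combines several switches (here i on, s and m off).
(?i-sm:reg1)reg2: combined switches that apply to reg1 only.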
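The switches above can be tried out with a short Java sketch like the following, assuming java.util.regex. The class, the helper methods and the sample patterns are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ModifierDemo {
    // True when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Number of (non-overlapping) matches of regex found in input.
    static int count(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // (?i) ... (?-i): only "abc" is case-insensitive.
        System.out.println(matches("(?i)abc(?-i)def", "ABCdef")); // true
        System.out.println(matches("(?i)abc(?-i)def", "ABCDEF")); // false
        // (?s): the dot also matches a newline.
        System.out.println(matches("(?s)a.b", "a\nb"));           // true
        System.out.println(matches("a.b", "a\nb"));               // false
        // (?m): ^ matches at the start of every line.
        System.out.println(count("(?m)^x", "x1\ny2\nx3"));        // 2
        System.out.println(count("^x", "x1\ny2\nx3"));            // 1
        // (?i:reg1)reg2: the switch applies to reg1 only.
        System.out.println(matches("(?i:abc)def", "abcDEF"));     // false
    }
}
```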
Atomic Grouping and Possessive Quantifiers
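(?>regex): atomic group. Once the group has matched, the engine does not backtrack into it.
?+, *+, ...: possessive quantifiers. They match as much as possible and never give anything back.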
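A small Java sketch of how atomic groups and possessive quantifiers differ from their backtracking counterparts, assuming java.util.regex. The patterns a(?>bc|b)c and a*+a are illustrative choices, not taken from the notes above.

```java
import java.util.regex.Pattern;

class AtomicDemo {
    // Atomic group: after "bc" has matched inside the group, the engine
    // will not go back and try the alternative "b".
    static boolean atomic(String input) {
        return Pattern.matches("a(?>bc|b)c", input);
    }

    // Same pattern with an ordinary non-capturing group: backtracking is allowed.
    static boolean backtracking(String input) {
        return Pattern.matches("a(?:bc|b)c", input);
    }

    // Possessive a*+ consumes every 'a' and never gives one back,
    // so the trailing 'a' can never match.
    static boolean possessive(String input) {
        return Pattern.matches("a*+a", input);
    }

    // Greedy a* gives back one 'a' so that the trailing 'a' can match.
    static boolean greedy(String input) {
        return Pattern.matches("a*a", input);
    }

    public static void main(String[] args) {
        System.out.println(backtracking("abc")); // true
        System.out.println(atomic("abc"));       // false
        System.out.println(atomic("abcc"));      // true
        System.out.println(greedy("aaa"));       // true
        System.out.println(possessive("aaa"));   // false
    }
}
```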
Lookaround
(?=regex): Zero-width positive lookahead.
- Zero-width: does not consume any input, so 1(?=2)3 can never match
- positive: 'equals' (the lookaround pattern must match)
- lookahead: 'looks forward'; in 'streets', t(?=s) matches the second 't'
(?!regex): Zero-width negative lookahead.
- negative: 'does not equal' (the lookaround pattern must not match)
(?<=regex): Zero-width positive lookbehind.
- lookbehind: 'looks backward'; in 'streets', (?<=s)t matches the first 't'
(?<!regex): Zero-width negative lookbehind.
- negative: 'does not equal' (the lookaround pattern must not match)
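The 'streets' examples can be checked with a short Java sketch, assuming java.util.regex. The helper firstMatchStart is a made-up name; it returns the index where the first match starts, or -1 when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Index of the first match of regex in input, or -1 if there is none.
    static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // "streets": s=0 t=1 r=2 e=3 e=4 t=5 s=6
        System.out.println(firstMatchStart("t(?=s)", "streets"));  // 5, the second 't'
        System.out.println(firstMatchStart("t(?!s)", "streets"));  // 1, the first 't'
        System.out.println(firstMatchStart("(?<=s)t", "streets")); // 1, the first 't'
        System.out.println(firstMatchStart("(?<!s)t", "streets")); // 5, the second 't'
        // Zero-width: after '1' the next character must be both '2' and '3'.
        System.out.println(firstMatchStart("1(?=2)3", "123"));     // -1
    }
}
```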
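Continuing from the Previous Match
\Gregex: regex must match starting exactly where the previous match ended (written "\\G" in a Java string literal).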
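A minimal Java sketch of \G, assuming java.util.regex. The digit-pair pattern and the method names are illustrative; the point is that with \G the matches must be contiguous, starting at the beginning of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContinueDemo {
    // Collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // With \G each match has to begin where the previous one ended,
    // so scanning stops at the first character that breaks the chain.
    static List<String> contiguousPairs(String input) {
        return findAll("\\G\\d\\d", input);
    }

    // Without \G the matcher may skip over non-matching text.
    static List<String> allPairs(String input) {
        return findAll("\\d\\d", input);
    }

    public static void main(String[] args) {
        System.out.println(contiguousPairs("1234ab56")); // [12, 34]
        System.out.println(allPairs("1234ab56"));        // [12, 34, 56]
    }
}
```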
Conditionals
Comments
(?#comment): inline comment; the text inside is ignored by the regex engine.
Common Usage
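- Match Chinese characters
  [\u4e00-\u9fa5]
- Match leading and trailing whitespace
  ^\s*|\s*$
- Match an email address
  \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
- Match a string that contains no consecutive hyphens
  (-(?!-)|[a-z0-9])*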
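The four patterns can be wrapped in small Java helpers, sketched here assuming java.util.regex. The class and method names are hypothetical, and the email pattern is only a rough check, not a full address validator.

```java
import java.util.regex.Pattern;

class CommonPatterns {
    // True if the input contains at least one character in U+4E00..U+9FA5.
    static boolean containsChinese(String s) {
        return Pattern.compile("[\\u4e00-\\u9fa5]").matcher(s).find();
    }

    // Removes leading and trailing whitespace.
    static String trim(String s) {
        return s.replaceAll("^\\s*|\\s*$", "");
    }

    // Rough email shape check; the whole input has to match.
    static boolean looksLikeEmail(String s) {
        return Pattern.matches("\\w+([-+.]\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*", s);
    }

    // Lowercase letters, digits and hyphens, where a hyphen is never
    // directly followed by another hyphen.
    static boolean noDoubleHyphen(String s) {
        return Pattern.matches("(-(?!-)|[a-z0-9])*", s);
    }

    public static void main(String[] args) {
        System.out.println(containsChinese("hello \u4e16\u754c"));        // true
        System.out.println("[" + trim("  hi  ") + "]");                   // [hi]
        System.out.println(looksLikeEmail("john.doe+tag@example.co.uk")); // true
        System.out.println(noDoubleHyphen("a-b-c"));                      // true
        System.out.println(noDoubleHyphen("a--b"));                       // false
    }
}
```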