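Python Regular Expression Flags

Regex flags change how a pattern is matched, for example by ignoring case or by letting ^ and $ match at every line.

IGNORECASE: Ignore Case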

Python

import re

text = "Hello World"
pattern = r"hello"

# No flag: no match, because the case differs
match = re.search(pattern, text)
print(match)  # None

# With the IGNORECASE flag
match = re.search(pattern, text, flags=re.IGNORECASE)
print(match.group())  # Hello

# Inline form: (?i)
match = re.search(r"(?i)hello", text)
print(match.group())  # Hello
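
The same keyword works for the other module-level functions. The sketch below is a minimal illustration with made-up sample strings; it passes flags by keyword because in re.sub and re.split the positional slot after the string is count or maxsplit, not flags.

```python
import re

text = "Hello hello HELLO"

# Pass flags by keyword: the fourth positional parameter of re.sub is count.
replaced = re.sub(r"hello", "Hi", text, flags=re.IGNORECASE)
print(replaced)  # Hi Hi Hi

# re.split works the same way (its third positional parameter is maxsplit).
parts = re.split(r"\s*and\s*", "tea AND coffee and milk", flags=re.IGNORECASE)
print(parts)  # ['tea', 'coffee', 'milk']
```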

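MULTILINE: Multiline Mode

Python

import re

text = "line one\nline two\nline three"

# By default, ^ and $ match only at the start and end of the whole string
matches = re.findall(r"^line", text)
print(matches)  # ['line'] (first line only)

# MULTILINE makes ^ and $ match at the start and end of every line
matches = re.findall(r"^line", text, flags=re.MULTILINE)
print(matches)  # ['line', 'line', 'line']

# Inline form: (?m)
matches = re.findall(r"(?m)^line", text)
print(matches)  # ['line', 'line', 'line']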
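
The example above only exercises ^. As a small complementary sketch (the sample text is invented for illustration), the same flag changes $ as well, while \A stays pinned to the start of the whole string.

```python
import re

text = "item 1\nitem 2\nitem 3"

# Without MULTILINE, $ matches only at the end of the string.
last_only = re.findall(r"\d+$", text)
print(last_only)  # ['3']

# With MULTILINE, $ also matches just before each newline.
per_line = re.findall(r"\d+$", text, flags=re.MULTILINE)
print(per_line)  # ['1', '2', '3']

# \A is not affected by MULTILINE: it still matches only at the very start.
anchored = re.findall(r"\Aitem", text, flags=re.MULTILINE)
print(anchored)  # ['item']
```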

DOTALL: Dot Matches Newline

Python

import re

text = "Hello\nWorld"

# By default, . does not match a newline character
match = re.search(r"Hello.*World", text)
print(match)  # None

# DOTALL makes . match every character, including newlines
match = re.search(r"Hello.*World", text, flags=re.DOTALL)
print(repr(match.group()))  # 'Hello\nWorld'

# Inline form: (?s)
match = re.search(r"(?s)Hello.*World", text)
print(repr(match.group()))  # 'Hello\nWorld'
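
A common use of DOTALL is extracting delimited blocks that may span lines. The sketch below assumes C-style /* ... */ comments in an invented source string and pairs the flag with a non-greedy .*? so that each block ends at its own closing delimiter.

```python
import re

source = "a = 1 /* first\ncomment */ b = 2 /* second */"

# With DOTALL, the non-greedy .*? can cross the newline inside the first comment.
comments = re.findall(r"/\*.*?\*/", source, flags=re.DOTALL)
print(comments)  # ['/* first\ncomment */', '/* second */']

# Without DOTALL, only the comment that sits on a single line is found.
single_line = re.findall(r"/\*.*?\*/", source)
print(single_line)  # ['/* second */']
```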

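VERBOSE: Verbose Mode

Python

import re

# Whitespace and comments inside the pattern are ignored, which improves readability.
# A multi-line pattern needs a triple-quoted string.
pattern = r"""
    \d{4}      # year
    -          # separator
    \d{2}      # month
    -          # separator
    \d{2}      # day
"""

text = "2024-05-19"
match = re.search(pattern, text, flags=re.VERBOSE)
print(match.group())  # 2024-05-19

# Inline form: (?x); the spaces in the pattern are ignored
match = re.search(r"(?x) \d{4} - \d{2} - \d{2}", text)
print(match.group())  # 2024-05-19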
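
Because verbose mode discards whitespace and treats # as the start of a comment, a literal space or hash sign has to be escaped or placed in a character class. The following is a minimal sketch; the "key = value#" input format and the group names are assumptions made up for illustration.

```python
import re

pattern = re.compile(
    r"""
    (?P<key>\w+)     # setting name
    [ ]=[ ]          # literal space, equals sign, literal space
    (?P<value>\d+)   # numeric value
    \#               # literal hash sign, escaped so it does not start a comment
    """,
    flags=re.VERBOSE,
)

match = pattern.search("timeout = 30#seconds")
print(match.group("key"), match.group("value"))  # timeout 30
```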

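ASCII: ASCII-Only Matching

Python

import re

text = "Hello 你好"

# By default, \w matches Unicode word characters in str patterns
matches = re.findall(r"\w+", text)
print(matches)  # ['Hello', '你好']

# The ASCII flag restricts \w to ASCII letters, digits, and the underscore
matches = re.findall(r"\w+", text, flags=re.ASCII)
print(matches)  # ['Hello']

# Inline form: (?a)
matches = re.findall(r"(?a)\w+", text)
print(matches)  # ['Hello']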
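
The flag affects more than \w. As a small sketch with an invented sample string, \d also matches non-ASCII decimal digits by default and is limited to 0-9 under re.ASCII.

```python
import re

# ASCII digits, a space, then the fullwidth digits U+FF14 and U+FF12
text = "42 \uff14\uff12"

# By default, \d matches any Unicode decimal digit, so both runs are found.
unicode_digits = re.findall(r"\d+", text)
print(len(unicode_digits))  # 2

# With re.ASCII, \d matches only the characters 0-9.
ascii_digits = re.findall(r"\d+", text, flags=re.ASCII)
print(ascii_digits)  # ['42']
```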

Combining Multiple Flags

Python

import re

text = "HELLO\nworld"

# Combine flags with the bitwise OR operator
matches = re.findall(r"^hello", text, flags=re.IGNORECASE | re.MULTILINE)
print(matches)  # ['HELLO']

# Combined inline form: (?im)
matches = re.findall(r"(?im)^hello", text)
print(matches)  # ['HELLO']

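Flag Comparison

Flag	Short alias	Inline form	Effect
IGNORECASE	re.I	(?i)	Ignore case
MULTILINE	re.M	(?m)	^ and $ match at every line
DOTALL	re.S	(?s)	. matches newlines
VERBOSE	re.X	(?x)	Allow whitespace and comments
ASCII	re.A	(?a)	\w, \d, \s, \b match ASCII only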
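
The short names in the table are aliases for the long constants, so either spelling can be used. A brief sketch that checks this, using illustrative variable names:

```python
import re

# Each short alias compares equal to its long-form constant.
print(re.I == re.IGNORECASE)  # True
print(re.S == re.DOTALL)      # True

# Patterns compiled with either spelling carry the same flags.
short_form = re.compile(r"^hello", re.I | re.M)
long_form = re.compile(r"^hello", re.IGNORECASE | re.MULTILINE)
print(short_form.flags == long_form.flags)  # True

found = short_form.findall("HELLO\nhello")
print(found)  # ['HELLO', 'hello']
```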

Specifying Flags at Compile Time

Python

import re

# Pass the flags once when compiling the pattern
pattern = re.compile(r"hello", flags=re.IGNORECASE)

text = "Hello World"
match = pattern.search(text)
print(match.group())  # Hello

# Later calls on the compiled pattern reuse the same flags
match2 = pattern.findall("hello there")
print(match2)  # ['hello']
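
A compiled pattern records its flags in the flags attribute, which can be tested with a bitwise AND. The sketch below is one way to inspect them; the variable names are illustrative.

```python
import re

pattern = re.compile(r"hello", flags=re.IGNORECASE | re.MULTILINE)

has_ignorecase = bool(pattern.flags & re.IGNORECASE)
has_dotall = bool(pattern.flags & re.DOTALL)
print(has_ignorecase, has_dotall)  # True False

# Inline flags in the pattern text show up in the same attribute.
inline = re.compile(r"(?s)a.b")
inline_has_dotall = bool(inline.flags & re.DOTALL)
print(inline_has_dotall)  # True

# str patterns also carry re.UNICODE implicitly.
has_unicode = bool(pattern.flags & re.UNICODE)
print(has_unicode)  # True
```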

Scoped Inline Flags

Python

import re

text = "Hello WORLD"

# The flag applies only to the group it is written in
pattern = r"(?i:hello) world"  # hello ignores case, world does not
match = re.search(pattern, text)
print(match)  # None (world must match exactly, but the text has WORLD)

pattern = r"(?i:hello) (?i:world)"
match = re.search(pattern, text)
print(match.group())  # Hello WORLD

Turning a Flag Off

Python

import re

# (?-i:...) turns the flag off inside the group
text = "Hello hello HELLO"
pattern = r"(?i)hello (?-i:hello)"

matches = re.findall(pattern, text)
print(matches)  # ['Hello hello'] (the second hello must be lowercase)

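Key Points

re.IGNORECASE matches without regard to case
re.MULTILINE makes ^ and $ match at every line
re.DOTALL makes . match newline characters
re.VERBOSE allows comments and whitespace in the pattern
re.ASCII restricts \w, \d, \s, and \b to ASCII characters
Combine multiple flags with |
Inline flags such as (?i) are embedded in the pattern; global ones must appear at its start
Specify flags at compile time to avoid passing them on every call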
