Python正则re模块使用步骤及原理解析

这篇文章主要介绍了Python正则re模块使用步骤及原理解析,文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值,需要的朋友可以参考下

python中使用正则表达式的步骤：

1.导入re模块：import re

2.初始化一个Regex对象：re.compile()

3.刚刚创建的Regex对象调用search方法进行匹配，返回要给march对象

4.刚刚的march对象调用group方法，展示匹配到的字符串

下面例子的知识点：

对正则表达式分组用：()，正则里的分组计数从1开始，不是从0，切记~~

group(数字)：去对应的分组的值

groups():返回所有分组的元组形式

d表示一个数字

regex_obj = re.compile(r'(ddd)-(ddd)-(dddd)')

match_obj = regex_obj.search('我司电话：035-411-1234')

result1 = match_obj.group(1)

result2 = match_obj.group(2)

result3 = match_obj.group(3)

print(result1)

print(result2)

print(result3)

result4 = match_obj.group()

print(result4)

result5 = match_obj.groups()

print(result5)

执行结果：

035

411

1234

035-411-1234

('035', '411', '1234')

补充知识点：w表示一个单词，s表示一个空格

regex_obj = re.compile(r'(dwd)-(ddd)-(dddd)')

match_obj = regex_obj.search('我司电话：0a5-411-1234')

result = match_obj.group(1)

print(result)

regex_obj = re.compile(r'(dwd)-(ddd)-(dddd)')

match_obj = regex_obj.search('我司电话：0哈5-411-1234')

result = match_obj.group(1)

print(result)

regex_obj = re.compile(r'(dsd)-(ddd)-(dddd)')

match_obj = regex_obj.search('我司电话：0 5-411-1234')

result = match_obj.group(1)

print(result)

执行结果：

0a5

0哈5

0 5

| 或：

regex_obj = re.compile(r'200|ok|successfully')

match_obj1 = regex_obj.search('vom get request and stored successfully')

result1 = match_obj1.group()

print(result1)

match_obj2 = regex_obj.search('vom get request,response 200 ok')

result2 = match_obj2.group()

print(result2)

match_obj3 = regex_obj.search('vom get request,response ok 200')

result3 = match_obj3.group()

print(result3)

执行结果：

successfully

200

注意：如果search返回的march对象只有一个结果值的话，不能用groups，只能用group

regex_obj = re.compile(r'200|ok|successfully')

match_obj1 = regex_obj.search('vom get request and stored successfully')

result2 = match_obj1.groups()

print(result2)

result1 = match_obj1.group()

print(result1)

执行结果：

()

successfully

? ：可选匹配项

+ ：1次或 n次匹配

* ：*前面的字符或者字符串匹配 0次、n次

注意：*前面必须要有内容

regex_obj = re.compile(r'(haha)*,welcome to vom_admin system') 指haha这个字符串匹配0次或者多次

regex_obj = re.compile(r'(ha*),welcome to vom_admin system') 指ha这个字符串匹配0次或者多次

. : 通配符，匹配任意一个字符

所以常常用的组合是：.*

regex_obj = re.compile(r'(.*),welcome to vom_admin system')

match_obj1 = regex_obj.search('Peter,welcome to vom_admin system')

name = match_obj1.group(1)

print(name)

执行结果：

Peter

{} ：匹配特定的次数

里面只写一个数字：匹配等于数字的次数

里面写{3,5}这样两个数字的，匹配3次或 4次或 5次，按贪心匹配法，能满足5次的就输出5次的，没有5次就4次，4次也没有才是3次

regex_obj = re.compile(r'((ha){3}),this is very funny')

match_obj1 = regex_obj.search('hahahaha,this is very funny')

print("{3}结果",match_obj1.group(1))

regex_obj = re.compile(r'((ha){3,5}),this is very funny')

match_obj1 = regex_obj.search('hahahaha,this is very funny')

print("{3,5}结果",match_obj1.group(1))

执行结果：

{3}结果 hahaha

{3,5}结果 hahahaha

findall()：返回所有匹配到的字串的列表

regex_obj = re.compile(r'ddd')

match_obj = regex_obj.findall('我是101班的，小李是103班的')

print(match_obj)

regex_obj = re.compile(r'(ddd)-(ddd)-(dddd)')

match_obj = regex_obj.findall('我家电话是123-123-1234，我公司电话是890-890-7890')

print(match_obj)

打印结果：

['101', '103']

[('123', '123', '1234'), ('890', '890', '7890')]

[]：创建自己的字符集：

[abc]：包括[]内的字符

[^abc]：不包括[]内的所有字符

也可以使用:[a-zA-Z0-9]这样简写

regex_obj = re.compile(r'[!@#$%^&*()]') name = input("请输入昵称，不含特殊字符：") match_obj = regex_obj.search(name) if match_obj: print("昵称输入不合法，包含了特殊字符：", match_obj.group()) else: print("昵称有效")

执行结果：

请输入昵称，不含特殊字符：*h

昵称输入不合法，包含了特殊字符： *

^：开头

$：结尾

regex_obj = re.compile(r'(^[A-Z])(.*)')

name = input("请输入昵称，开头必须大写字母：")

match_obj = regex_obj.search(name)

print(match_obj.group())

执行结果：

请输入昵称，开头必须大写字母：A1234

A1234

sub()：第一个参数为要替换成的，第二个参数传被替换的，返回替换成功后的字符串

regex_obj = re.compile(r'[!@#$%^&*()]')

match_obj = regex_obj.sub('嘿','haha,$%^,hahah')

print(match_obj)

执行结果：

haha,嘿嘿嘿,hahah

补充一下正则表达式的表，正则太复杂了，要常看常用才能熟练

Python正则re模块使用步骤及原理解析

相关文章