
Chapter 7: Pattern Matching with Regular Expressions

article 2025/2/23 2:44:31

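1. Finding Patterns of Text Without Regular Expressions
2. Finding Patterns of Text with Regular Expressions
- 2.1 Creating Regex Objects
- 2.2 Matching Regex Objects
3. More Pattern Matching with Regular Expressions
- 3.1 Grouping with Parentheses
- 3.2 Matching Multiple Groups with the Pipe
- 3.3 Optional Matching with the Question Mark
- 3.4 Matching Zero or More with the Star
- 3.5 Matching One or More with the Plus
- 3.6 Matching Specific Repetitions with Braces
4. Greedy and Non-Greedy Matching
5. The findall() Method
6. Character Classes
7. Making Your Own Character Classes
8. The Caret and Dollar Sign Characters
9. The Wildcard Character
- 9.1 Matching Everything with Dot-Star
- 9.2 Matching Newlines with the Dot Character
10. Case-Insensitive Matching
11. Substituting Strings with the sub() Method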

1. Finding Patterns of Text Without Regular Expressions

def isPhoneNumber(text):
    if len(text) != 11:
        return False

    for i in range(0, len(text)):
        if (i == 3 or i == 7) and text[i] != "-":
            return False
        elif i != 3 and i != 7 and not text[i].isdecimal():
            return False

    return True


text = "123-456-789"
print(text)
# 123-456-789
print(isPhoneNumber(text))
# True
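
A function like isPhoneNumber() only validates a whole string. To locate the pattern inside a longer text, one option is to slide an 11-character window across it. The sketch below assumes the same ddd-ddd-ddd format used above; the helper name find_phone_numbers and the sample message are made up for illustration.

```python
def is_phone_number(text):
    """Return True if text looks like ddd-ddd-ddd (same check as isPhoneNumber above)."""
    if len(text) != 11:
        return False
    for i, ch in enumerate(text):
        if i in (3, 7):
            if ch != "-":
                return False
        elif not ch.isdecimal():
            return False
    return True


def find_phone_numbers(message):
    """Test every 11-character slice of message (hypothetical helper)."""
    found = []
    for start in range(len(message) - 10):
        chunk = message[start:start + 11]
        if is_phone_number(chunk):
            found.append(chunk)
    return found


message = "Call 123-456-789 today, or 111-222-333 tomorrow."
print(find_phone_numbers(message))
# ['123-456-789', '111-222-333']
```

This works, but every new format needs another hand-written checker, which is the motivation for the regular expressions in the next section.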

2. Finding Patterns of Text with Regular Expressions

2.1 Creating Regex Objects

import re

text = re.compile(r'\d\d\d-\d\d\d-\d\d\d')

2.2 Matching Regex Objects

import re

text = re.compile(r'\d\d\d-\d\d\d-\d\d\d')
match = text.search("~~~123-456-789~~~")
print(match.group())
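
One detail the example above does not show: search() returns None when the pattern is not found, so calling group() directly on the result raises AttributeError in that case. A minimal sketch of guarding against this follows; the function name first_phone_number is hypothetical.

```python
import re

phone_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d')


def first_phone_number(message):
    """Return the first ddd-ddd-ddd match in message, or None if there is none."""
    match = phone_regex.search(message)
    if match is None:
        return None
    return match.group()


print(first_phone_number("~~~123-456-789~~~"))
# 123-456-789
print(first_phone_number("no numbers here"))
# None
```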

3. More Pattern Matching with Regular Expressions

3.1 Grouping with Parentheses

import re

text = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d)')
match = text.search("~~~123-456-789~~~")
print(match.group(1))
# 123
print(match.group(2))
# 456-789
print(match.groups())
# ('123', '456-789')

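3.2 Matching Multiple Groups with the Pipe

| : the pipe character; the pattern matches any one of the alternatives it separates. When several alternatives occur in the searched text, search() returns the one that occurs first in the text.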

import re

text = re.compile(r'456|123')
match = text.search("123-456-789")
print(match.group())
# 123
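
The pipe can also be placed inside parentheses so that the alternatives share a common prefix. The sketch below uses a made-up pattern and sentence to show that group() returns the full match while group(1) returns only the alternative that was chosen.

```python
import re

# The prefix "Bat" is shared; only the part in parentheses varies.
bat_regex = re.compile(r'Bat(man|mobile|copter)')
match = bat_regex.search("The Batmobile lost a wheel")

print(match.group())
# Batmobile
print(match.group(1))
# mobile
```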

3.3 Optional Matching with the Question Mark

import re

text = re.compile(r'\d\d\d(~)?\d\d\d')
match = text.search("123123")
print(match.group())
# 123123

match = text.search("123~123")
print(match.group())
# 123~123

3.4 Matching Zero or More with the Star

import re

text = re.compile(r'\d\d\d(~)*\d\d\d')
match = text.search("123123")
print(match.group())
# 123123

match = text.search("123~~~123")
print(match.group())
# 123~~~123

3.5 Matching One or More with the Plus

import re

text = re.compile(r'\d\d\d(~)+\d\d\d')
match = text.search("123~123")
print(match.group())
# 123~123

match = text.search("123~~~123")
print(match.group())
# 123~~~123

3.6 Matching Specific Repetitions with Braces

import re

text = re.compile(r'\d\d\d(~){3,5}\d\d\d')
match = text.search("123~~~123")
print(match.group())
# 123~~~123

match = text.search("123~~~~~123")
print(match.group())
# 123~~~~~123
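
Besides a {min,max} range, braces accept an exact count and open-ended ranges. The following sketch compares the three forms on a sample string with four tildes; the variable names are chosen only for this illustration.

```python
import re

sample = "123~~~~123"  # four tildes between the digit groups

exactly_three = re.compile(r'\d\d\d~{3}\d\d\d')    # exactly 3 tildes
at_least_three = re.compile(r'\d\d\d~{3,}\d\d\d')  # 3 or more tildes
at_most_five = re.compile(r'\d\d\d~{,5}\d\d\d')    # 0 to 5 tildes

print(exactly_three.search(sample))
# None
print(at_least_three.search(sample).group())
# 123~~~~123
print(at_most_five.search(sample).group())
# 123~~~~123
```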

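4. Greedy and Non-Greedy Matching

Greedy matching (the default): match the longest string possible
Non-greedy matching (a ? placed after the quantifier): match the shortest string possible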

import re

text = re.compile(r'(123 ){2,4}')
match = text.search("123 123 123 123 123 ")
print(match.group())
# 123 123 123 123

text = re.compile(r'(123 ){2,4}?')
match = text.search("123 123 123 123 123 ")
print(match.group())
# 123 123

5. The findall() Method

import re

text = re.compile(r'\d\d\d-\d\d\d-\d\d\d')
match = text.search("~~~123-456-789~~~111-222-333~~~")
print(match.group())
# 123-456-789

match = text.findall("~~~123-456-789~~~111-222-333~~~")
print(match)
# ['123-456-789', '111-222-333']
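
The return value of findall() changes when the pattern contains groups: instead of a list of strings it returns a list of tuples, one item per group. A short sketch comparing the two cases on the same input:

```python
import re

message = "~~~123-456-789~~~111-222-333~~~"

no_groups = re.compile(r'\d\d\d-\d\d\d-\d\d\d')
with_groups = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d)')

print(no_groups.findall(message))
# ['123-456-789', '111-222-333']
print(with_groups.findall(message))
# [('123', '456', '789'), ('111', '222', '333')]
```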

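6. Character Classes

Shorthand character class	Represents
\d	Any numeric digit from 0 to 9
\D	Any character that is not a numeric digit from 0 to 9
\w	Any letter, numeric digit, or the underscore character
\W	Any character that is not a letter, numeric digit, or the underscore character
\s	Any space, tab, or newline character
\S	Any character that is not a space, tab, or newline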
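
The shorthand classes in the table can be tried out as follows. The sample strings are invented for illustration. Note that in Python 3, with ordinary str patterns, these shorthands are Unicode-aware (for example, \d also matches digits from other scripts and \s also covers other whitespace such as carriage returns) unless the re.ASCII flag is passed.

```python
import re

line = "12 drummers, 11 pipers, 10 lords"

# One or more digits, one whitespace character, one or more word characters.
print(re.findall(r'\d+\s\w+', line))
# ['12 drummers', '11 pipers', '10 lords']

# Runs of non-digit characters.
print(re.findall(r'\D+', "a1b22c"))
# ['a', 'b', 'c']

# Runs of non-whitespace characters.
print(re.findall(r'\S+', "tab\tand space"))
# ['tab', 'and', 'space']
```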

7. Making Your Own Character Classes

import re

text = re.compile(r'[0-5]')
match = text.findall("1a2b3c4d")
print(match)
# ['1', '2', '3', '4']

text = re.compile(r'[abc]')
match = text.findall("1a2b3c4d")
print(match)
# ['a', 'b', 'c']

text = re.compile(r'[^abc]')
match = text.findall("1a2b3c4d")
print(match)
# ['1', '2', '3', '4', 'd']

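8. The Caret and Dollar Sign Characters

^ : the match must occur at the beginning of the searched text
$ : the match must occur at the end of the searched text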

import re

text = re.compile(r'^\d\d\d')
match = text.search("123abc456")
print(match)
# <re.Match object; span=(0, 3), match='123'>

text = re.compile(r'\d\d\d$')
match = text.search("123abc456")
print(match)
# <re.Match object; span=(6, 9), match='456'>
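
Used together, ^ and $ require the pattern to span the text from beginning to end rather than match a part of it. A minimal sketch with an invented digits-only check (one caveat: $ also matches just before a trailing newline):

```python
import re

whole_number = re.compile(r'^\d+$')

print(whole_number.search("1234567890") is not None)
# True
print(whole_number.search("12345xyz67890") is not None)
# False
print(whole_number.search("12 34567890") is not None)
# False
```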

9. The Wildcard Character

. : matches any single character except a newline

import re

text = re.compile(r'..23')
match = text.findall("123abc23")
print(match)
# ['bc23']

9.1 Matching Everything with Dot-Star

import re

text = re.compile(r'123(.*)456(.*)')
match = text.findall("123abc456def")
print(match)
# [('abc', 'def')]
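
Dot-star is greedy by default, so it grabs as much text as it can. Adding a question mark (.*?) makes it non-greedy. The sketch below uses a made-up snippet of markup to show the difference.

```python
import re

html = "<b>bold</b> text"

greedy = re.compile(r'<.*>')       # runs to the last '>' in the text
non_greedy = re.compile(r'<.*?>')  # stops at the first '>'

print(greedy.search(html).group())
# <b>bold</b>
print(non_greedy.search(html).group())
# <b>
```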

9.2 Matching Newlines with the Dot Character

re.DOTALL : makes the dot character match every character, including the newline

import re

text = re.compile(r'.*')
match = text.search("123abc\n456def")
print(match.group())
# 123abc

text = re.compile(r'.*', re.DOTALL)
match = text.search("123abc\n456def")
print(match.group())
# 123abc
# 456def

10. Case-Insensitive Matching

re.I (short for re.IGNORECASE) : makes the match case-insensitive

import re

text = re.compile(r'abc', re.I)
match = text.findall("abcABC")
print(match)
# ['abc', 'ABC']
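
Flags can be combined with the bitwise OR operator when more than one is needed. The following sketch, with an invented pattern and input, combines re.I with re.DOTALL so that the match ignores case and also crosses a line break.

```python
import re

pattern = re.compile(r'start.*end', re.I | re.DOTALL)
match = pattern.search("START of text\nthe END")

# repr() is used so the embedded newline is visible in the output.
print(repr(match.group()))
# 'START of text\nthe END'
```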

11. Substituting Strings with the sub() Method

import re

text = re.compile(r'ABC\w*')
match = text.sub("abc", "ABC : 123")
print(match)
# abc : 123
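
The replacement string passed to sub() can refer back to groups in the pattern with \1, \2, and so on. As a sketch, the example below keeps the first three digits of each number and masks the rest; the masking format and sample text are assumptions for illustration.

```python
import re

# Group 1 captures the first three digits; the rest is matched but not kept.
phone_regex = re.compile(r'(\d\d\d)-\d\d\d-\d\d\d')
masked = phone_regex.sub(r'\1-***-***', "Call 123-456-789 or 111-222-333")

print(masked)
# Call 123-***-*** or 111-***-***
```

The replacement is written as a raw string so that \1 reaches the regex engine as a group reference rather than being interpreted as a string escape.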
