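Python Match.string attribute

Last modified April 20, 2025

Introduction to Match.string

Match.string is an attribute of Python match objects. It returns the original string that was passed to the matching function.

The attribute is read-only and gives access to the complete input string, not just the matched portion. It is useful for operations that need the context around a match.

Match objects are returned by re.match, re.search, and other regular expression operations. The string attribute holds a reference to the input.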

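Basic syntax

Accessing Match.string is straightforward:

match.string

Here match is a match object returned by a regular expression operation. No arguments are needed because it is a plain attribute access.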

Basic usage of Match.string

Let's start with a simple example that demonstrates the basic usage of Match.string.

basic_string.py

#!/usr/bin/python

import re

text = "The quick brown fox jumps over the lazy dog"
match = re.search(r'fox', text)

if match:
    print("Matched text:", match.group())
    print("Original string:", match.string)
    print("Is same object?", match.string is text)

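This example shows how to access the original string from a match object. The output confirms that it is the same string object that was passed to search.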

match = re.search(r'fox', text)

We perform a basic regular expression search for the word "fox" in the text. If the search succeeds, it returns a match object.

print("Original string:", match.string)

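The string attribute returns the complete original input string, not just the matched portion.

Using Match.string with multiple matches

The string attribute stays the same across multiple matches.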
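The introduction notes that the attribute is read-only. The following minimal sketch illustrates this: reading works without call syntax, while an assignment is rejected with an AttributeError. The variable names are chosen only for illustration.

```python
import re

match = re.search(r"fox", "The quick brown fox")

# Reading the attribute needs no parentheses or arguments.
original = match.string

# Assigning to the attribute is rejected because it is read-only.
try:
    match.string = "something else"
    assignment_failed = False
except AttributeError:
    assignment_failed = True

print("Original string:", original)
print("Assignment rejected:", assignment_failed)
```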

multiple_matches.py

#!/usr/bin/python

import re

text = "apple orange apple banana apple"
pattern = re.compile(r'apple')

for match in pattern.finditer(text):
    print(f"Found '{match.group()}' at {match.start()}-{match.end()}")
    print("Full string:", match.string)
    print("---")

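This shows that every match object refers to the same original string. The string attribute does not change between matches.

Extracting context with Match.string

We can use Match.string to extract the context around a match.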
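To check this claim programmatically rather than by reading the printed output, a short sketch can collect the matches and compare their string attributes. The identity check with is reflects CPython behavior; the equality check is the portable part.

```python
import re

text = "apple orange apple banana apple"
matches = list(re.finditer(r"apple", text))

# Each match has its own span but reports the same input string.
spans = [m.span() for m in matches]
same_content = all(m.string == text for m in matches)
same_object = all(m.string is text for m in matches)  # identity, as observed in CPython

print("Spans:", spans)
print("Same content:", same_content)
print("Same object:", same_object)
```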

context_extraction.py

#!/usr/bin/python

import re

text = "The event will occur on 2023-12-25 at 14:30"
match = re.search(r'\d{4}-\d{2}-\d{2}', text)

if match:
    start, end = match.start(), match.end()
    context = match.string[max(0, start-10):end+10]
    print(f"Found date: {match.group()}")
    print(f"Context: ...{context}...")

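This shows how to use the original string to get the text surrounding a match. The string attribute provides access to the complete input.

Match.string with compiled patterns

The string attribute works the same way with compiled patterns.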
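The same idea can be wrapped in a small reusable helper. The sketch below is one possible shape, not an established API: the function name split_context and its width parameter are hypothetical, introduced here only for illustration.

```python
import re

def split_context(match, width=10):
    """Return (before, matched, after) with up to `width` characters of context."""
    source = match.string
    start, end = match.span()
    before = source[max(0, start - width):start]  # clamp at the string start
    after = source[end:end + width]               # slicing past the end is safe
    return before, match.group(), after

text = "The event will occur on 2023-12-25 at 14:30"
match = re.search(r"\d{4}-\d{2}-\d{2}", text)

if match:
    before, matched, after = split_context(match)
    print(f"...{before}[{matched}]{after}...")
```

Returning the three parts separately lets the caller decide how to highlight the match instead of baking a format into the helper.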

compiled_pattern.py

#!/usr/bin/python

import re

text = "User: john_doe, Email: john@example.com"
pattern = re.compile(r'(\w+)@(\w+\.\w+)')

match = pattern.search(text)
if match:
    print("Full match:", match.group())
    print("Original string:", match.string)
    print("Username:", match.group(1))
    print("Domain:", match.group(2))

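Compiled patterns produce match objects with the same string attribute. The behavior is identical to matching without a precompiled pattern.

String verification with Match.string

We can use the is operator to verify that a match came from a specific string object.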
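Compiled patterns also accept an optional starting position in search. As a minimal sketch, assuming we only want to scan from the "Email" field onward, the example below shows that match.string is still the complete input while match.pos records where the scan began. The offset variable is illustrative.

```python
import re

text = "User: john_doe, Email: john@example.com"
pattern = re.compile(r"\w+")

# Begin scanning at the "Email" field instead of the start of the text.
offset = text.index("Email")
match = pattern.search(text, offset)

if match:
    print("Matched:", match.group())
    print("Scan started at:", match.pos)
    print("Scan ended at:", match.endpos)
    print("Still the full string:", match.string == text)
```

Because match.string is the whole input, the values from start and end remain indices into the full string, not into the scanned region.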

string_verification.py

#!/usr/bin/python

import re

original = "Important message: SECRET123"
# original[:] returns the same object in CPython, so build a distinct copy
copy_text = "".join(list(original))

match = re.search(r'SECRET\d+', original)

if match:
    print("Is original string?", match.string is original)
    print("Is copy string?", match.string is copy_text)
    print("Content equal?", match.string == copy_text)

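This shows that Match.string preserves object identity, not just value equality. It refers to the exact string object that was passed to the matcher.

Match.string in replacement functions

The string attribute can be used in replacement callbacks.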

replacement_callback.py

#!/usr/bin/python

import re

text = "Prices: $10, $20, $30"

def add_tax(match):
    price = int(match.group(1))
    taxed = price * 1.2
    return f"${taxed:.2f} (original: {match.string[match.start():match.end()]})"

result = re.sub(r'\$(\d+)', add_tax, text)
print(result)

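The replacement function uses match.string to refer to the originally matched text. This provides context during substitution.

Match.string with multiline strings

The attribute holds the complete string, including newline characters.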
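A callback can also look outside the matched span. The sketch below is a hypothetical example, not part of the original tutorial: it capitalizes a word only when the text before it in match.string is empty or ends with sentence punctuation. Inside the callback, match.string is the original input, not the partially substituted result.

```python
import re

def capitalize_sentence_start(match):
    # Inspect the text before the match in the original input.
    before = match.string[:match.start()].rstrip()
    if not before or before[-1] in ".!?":
        return match.group().capitalize()
    return match.group()

text = "hello world. this is fine! ok"
result = re.sub(r"[a-z]+", capitalize_sentence_start, text)
print(result)
```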

multiline_string.py

#!/usr/bin/python

import re

text = """First line
Second line with IMPORTANT data
Third line"""

match = re.search(r'IMPORTANT', text)

if match:
    print("Matched:", match.group())
    print("Complete string:", repr(match.string))
    line_start = match.string.rfind('\n', 0, match.start()) + 1
    line_end = match.string.find('\n', match.end())
    if line_end == -1:  # the match is on the last line
        line_end = len(match.string)
    print("Full line:", match.string[line_start:line_end])

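This shows how to work with multiline input using match.string. We can extract the complete line that contains the match.

Best practices

When using Match.string, consider the following best practices:

Use it when you need the context around a match
Remember that it is the complete original string, not just the match
For identity checks, prefer is over ==
Combine it with start and end for precise slicing
Be mindful of memory with very large strings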
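A related use is reporting where a match sits in multiline input. The following sketch derives a 1-based line and column from match.string by counting newlines before the match. The helper name line_and_column is hypothetical.

```python
import re

def line_and_column(match):
    """Return the 1-based (line, column) of the match start."""
    source = match.string
    line = source.count("\n", 0, match.start()) + 1
    line_start = source.rfind("\n", 0, match.start()) + 1  # 0 when on the first line
    column = match.start() - line_start + 1
    return line, column

text = """First line
Second line with IMPORTANT data
Third line"""

match = re.search(r"IMPORTANT", text)
if match:
    line, column = line_and_column(match)
    print(f"Found at line {line}, column {column}")
```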

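Performance considerations

The Match.string attribute is just a reference to the original string. It does not create a copy, so the memory overhead is minimal.

However, keeping a match object alive keeps the original string in memory. For large strings, consider extracting the parts you need and discarding the match object.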
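One way to follow this advice is to return only the small pieces you need from a function, so the match object goes out of scope when the function returns. This is a minimal sketch; the function name find_token and the TOKEN pattern are illustrative assumptions.

```python
import re

def find_token(big_text):
    """Return (token, span) without keeping the match object alive."""
    match = re.search(r"TOKEN-\d+", big_text)
    if match is None:
        return None
    # group() yields a new, small string; span() yields plain integers.
    return match.group(), match.span()

big = "x" * 1_000_000 + " TOKEN-42 " + "y" * 1_000_000
result = find_token(big)

# No match object remains, so dropping `big` lets the large string be reclaimed.
del big
print(result)
```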

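Source

Python Match object documentation

This tutorial covered the essential aspects of the Python Match.string attribute. Understanding this feature helps with advanced text processing tasks.

Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have written over 1,400 articles and 8 e-books. I have more than ten years of experience teaching programming.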
