Perl pos function

Last modified April 4, 2025

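Perl's pos function reports or sets the offset in a string at which the next regular expression match will start. It is essential for advanced string parsing and iterative matching.

pos works together with the /g regex modifier to enable sequential matching. It can both read and set the current match position. Positions are zero-based offsets counted from the start of the string, and pos returns undef when no /g match has been run on the string yet.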

Basic usage of pos

The simplest way to use pos is to check the current position.

basic.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;

my $text = "Hello world";
$text =~ /Hello/g;

print "Current position: ", pos($text), "\n";

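We match "Hello" in the string and then check the position. After a successful match, the position is the offset just past the matched substring.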

$ ./basic.pl
Current position: 5
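
As a complement to the example above, the following minimal sketch shows the three states pos can be in: undefined before any match, set after a successful /g match, and undefined again after a failed one. The show_pos helper is a hypothetical name introduced only for this illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: render pos() as text, since it may be undef.
sub show_pos {
    my ($p) = @_;
    return defined $p ? $p : 'undef';
}

my $text = "Hello world";
my @seen;

push @seen, show_pos(pos($text));   # no match attempted yet

$text =~ /Hello/g;
push @seen, show_pos(pos($text));   # just past "Hello"

$text =~ /Hello/g;                  # fails: no second "Hello" after offset 5
push @seen, show_pos(pos($text));   # a failed /g match clears pos

print join(", ", @seen), "\n";      # prints: undef, 5, undef
```

Because 0 is a valid offset, testing pos with defined is safer than testing it for truth.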

Iterative matching with pos

pos is commonly used with /g for iterative matching.

iterate.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;

my $text = "a1 b2 c3 d4";
while ($text =~ /(\w)(\d)/g) {
    print "Found $1$2 at position ", pos($text), "\n";
}

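This script finds every word character followed by a digit in the string. After each iteration, pos reports the offset at which the next match will start.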

$ ./iterate.pl
Found a1 at position 2
Found b2 at position 5
Found c3 at position 8
Found d4 at position 11
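
pos gives the end of the last match, not its start. If the start is needed too, one option is Perl's built-in @- and @+ arrays; the sketch below reuses the sample string from the script above and assumes nothing beyond core Perl.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = "a1 b2 c3 d4";
my @spans;

while ($text =~ /(\w)(\d)/g) {
    # $-[0] is the offset where the whole match starts, $+[0] where it ends.
    # After a successful /g match, pos equals $+[0].
    push @spans, sprintf("%s%s: start %d, end %d, pos %d",
                         $1, $2, $-[0], $+[0], pos($text));
}

print "$_\n" for @spans;
# a1: start 0, end 2, pos 2
# b2: start 3, end 5, pos 5
# c3: start 6, end 8, pos 8
# d4: start 9, end 11, pos 11
```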

Setting pos manually

You can set the position manually to control where matching starts.

setpos.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;

my $text = "abc123xyz789";
pos($text) = 3;  # Skip first 3 characters

if ($text =~ /\d+/g) {
    print "First number found: $& at position ", pos($text), "\n";
}

We set the position to 3 to skip the first three letters. The regex then searches from that offset and matches the first run of digits.

$ ./setpos.pl
First number found: 123 at position 6
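
Setting pos only chooses where the search begins; the engine is still free to scan forward from there. To require that a match begin exactly at pos, the pattern can be anchored with \G. The following sketch contrasts the two behaviors on the same sample string; the variable names are illustrative.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = "abc123xyz789";

pos($text) = 1;
# Without \G the engine scans forward from offset 1 until digits appear.
my $scanned = $text =~ /(\d+)/g ? $1 : 'no match';

pos($text) = 1;
# With \G the match must begin exactly at offset 1, where "b" is not a digit.
my $anchored = $text =~ /\G(\d+)/g ? $1 : 'no match';

pos($text) = 3;
# At offset 3 the digits start right away, so the anchored match succeeds.
my $at_three = $text =~ /\G(\d+)/g ? $1 : 'no match';

print "$scanned, $anchored, $at_three\n";   # prints: 123, no match, 123
```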

pos with capture groups

pos can be combined with capture groups to track more complex matches.

capture.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;

my $text = "Name: John, Age: 30, City: NY";
while ($text =~ /(\w+):\s*([^,]+)/g) {
    print "$1 = $2 (ends at ", pos($text), ")\n";
}

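This script extracts colon-separated key-value pairs. The position after each match is the offset just past the captured value, which is where the next search will start.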

$ ./capture.pl
Name = John (ends at 10)
Age = 30 (ends at 19)
City = NY (ends at 29)

Resetting pos

You can reset the position to start matching from the beginning again.

reset.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;

my $text = "apple banana cherry";
$text =~ /banana/g;

print "After first match: ", pos($text), "\n";

pos($text) = 0;  # Reset position
$text =~ /apple/g;

print "After reset and second match: ", pos($text), "\n";

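We match "banana", then reset the position to 0 and match "apple" from the start. Without the reset, the search for "apple" would begin at offset 12 and fail. Resetting pos allows a fresh search of the same string.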

$ ./reset.pl
After first match: 12
After reset and second match: 5
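
A related detail is what happens to pos when a /g match fails: by default it is cleared, while the /c modifier keeps it in place. The sketch below illustrates both cases using the same sample string; it is an illustration of the default and /gc behavior, not part of the original script.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = "apple banana cherry";

$text =~ /banana/g;                 # pos is now 12
$text =~ /apple/g;                  # fails from offset 12, so pos is cleared
my $after_plain = defined(pos($text)) ? pos($text) : 'undef';

pos($text) = 0;
$text =~ /banana/g;                 # pos is 12 again
$text =~ /apple/gc;                 # fails, but /c keeps pos where it was
my $after_c = defined(pos($text)) ? pos($text) : 'undef';

print "without /c: $after_plain, with /c: $after_c\n";
# prints: without /c: undef, with /c: 12
```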

pos with Unicode strings

With the right settings, pos handles Unicode strings correctly.

unicode.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;
use utf8;
binmode(STDOUT, ':encoding(UTF-8)');  # emit UTF-8 so the é in the label prints correctly

my $text = "café ☕ museum";
$text =~ /é/g;

print "Position after café: ", pos($text), "\n";

$text =~ /☕/g;
print "Position after coffee: ", pos($text), "\n";

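This shows pos working with Unicode characters. Because use utf8 makes Perl treat the string literal as decoded characters, pos counts characters rather than bytes, so each multi-byte character advances the position by one.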

$ ./unicode.pl
Position after café: 4
Position after coffee: 6
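
To make the character-versus-byte distinction concrete, the sketch below matches the same text twice: once as a decoded character string and once as raw UTF-8 octets produced with the core Encode module. The variable names are illustrative, and the source file is assumed to be saved as UTF-8.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use utf8;
use Encode qw(encode);

my $chars = "café ☕ museum";            # decoded character string
my $bytes = encode('UTF-8', $chars);    # the same text as raw UTF-8 octets

$chars =~ /museum/g;
$bytes =~ /museum/g;

my $char_pos = pos($chars);             # counts characters
my $byte_pos = pos($bytes);             # counts octets: é is 2, ☕ is 3

print "characters: $char_pos, bytes: $byte_pos\n";
# prints: characters: 13, bytes: 16
```

If input arrives as undecoded bytes, pos values will therefore not line up with character offsets until the data is decoded.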

Advanced pos operations

pos can be used creatively for complex parsing tasks.

advanced.pl

#!/usr/bin/perl

use strict;
use warnings;
use v5.34.0;

my $text = "data:123-456-789;info:abc-def-ghi;";
my %data;

while ($text =~ /(\w+):([^;]+)/g) {
    my $key = $1;
    my $values = $2;
    my @values = split /-/, $values;
    $data{$key} = \@values;
    pos($text)++;  # Skip semicolon
}

use Data::Dumper;
print Dumper \%data;

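This advanced example extracts structured data from a string. We manually advance pos by one to skip the semicolon between key-value pairs. The order of hash keys in the Data::Dumper output may vary between runs.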

$ ./advanced.pl
$VAR1 = {
          'info' => [
                      'abc',
                      'def',
                      'ghi'
                    ],
          'data' => [
                      '123',
                      '456',
                      '789'
                    ]
        };
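
Another common way to apply pos to parsing is a small tokenizer built on \G and /gc, where each alternative is tried at the current position and a failed attempt leaves pos untouched. The following is a minimal sketch under assumed token rules (WORD and PUNCT are labels invented for this illustration), not a replacement for the script above.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = "data:123-456;info:abc;";
my @tokens;

pos($text) = 0;   # start explicitly at 0 so pos is never undef in the loop
while (pos($text) < length($text)) {
    if ($text =~ /\G(\w+)/gc) {
        push @tokens, "WORD($1)";       # run of word characters
    }
    elsif ($text =~ /\G([;:\-])/gc) {
        push @tokens, "PUNCT($1)";      # one separator character
    }
    else {
        die "Unexpected character at offset " . pos($text) . "\n";
    }
}

print join(" ", @tokens), "\n";
# WORD(data) PUNCT(:) WORD(123) PUNCT(-) WORD(456) PUNCT(;) WORD(info) PUNCT(:) WORD(abc) PUNCT(;)
```

The /c modifier matters here: without it, the first failed alternative would clear pos and the next branch would start over from the beginning of the string.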

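Best practices

Use with the /g modifier: pos is meaningful mainly with global regex matches.
Check definedness: verify that pos is defined before using its value.
Reset when needed: assign undef to pos to restart matching from the beginning.
Combine with substr: use substr with pos for precise extraction.
Mind Unicode: make sure strings are correctly decoded for accurate position tracking.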
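
Several of these practices can be sketched together: guarding on definedness, combining pos with substr, and clearing pos with undef. The sample string and variable names below are assumptions made for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = "key=value; trailing notes";
my $rest = '';

if ($text =~ /\w+=\w+;\s*/g) {
    my $p = pos($text);
    # Guard against undef before using the offset with substr.
    $rest = substr($text, $p) if defined $p;
}
print "Remainder: $rest\n";               # prints: Remainder: trailing notes

pos($text) = undef;                       # clear the stored position
my $cleared = defined(pos($text)) ? 0 : 1;
print "pos cleared: ", ($cleared ? "yes" : "no"), "\n";   # prints: pos cleared: yes
```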

Source

Perl pos documentation

This tutorial covered Perl's pos function, with practical examples of its use in string manipulation and parsing.

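Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I have more than ten years of experience in teaching programming.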
