ZetCode

Perl 字符串 II

最后修改于 2023 年 8 月 24 日

Perl 字符串 II 教程是 Perl 字符串教程的第二部分。我们继续介绍 Perl 中的字符串数据类型。请访问 Perl 字符串 查看教程的第一部分。

Perl 多行字符串

Perl 中创建多行字符串有几种方法。

multiline.pl
#!/usr/bin/perl

use 5.30.0; use warnings;

my $sonnet = "Not marble, nor the gilded monuments Of princes, shall outlive
this powerful rhyme; But you shall shine more bright in these contents Than
unswept stone, besmear'd with sluttish time. When wasteful war shall statues
overturn, And broils root out the work of masonry, Nor Mars his sword nor war's
quick fire shall burn The living record of your memory. 'Gainst death and
all-oblivious enmity Shall you pace forth; your praise shall still find room
Even in the eyes of all posterity That wear this world out to the ending doom.
So, till the judgment that yourself arise, You live in this, and dwell in
lovers' eyes.";

say $sonnet;

与其他语言不同,Perl 允许在引号对中使用多行字符串。

$ ./multiline.pl Not marble, nor the gilded monuments Of princes, shall outlive
this powerful rhyme; But you shall shine more bright in these contents Than
unswept stone, besmear'd with sluttish time. When wasteful war shall statues
overturn, And broils root out the work of masonry, Nor Mars his sword nor war's
quick fire shall burn The living record of your memory. 'Gainst death and
all-oblivious enmity Shall you pace forth; your praise shall still find room
Even in the eyes of all posterity That wear this world out to the ending doom.
So, till the judgment that yourself arise, You live in this, and dwell in
lovers' eyes.

另一种方法是使用 qqq 操作符。

multiline2.pl
#!/usr/bin/perl

use 5.30.0; use warnings;

my $sonnet = qq/Not marble, nor the gilded monuments Of princes, shall outlive
this powerful rhyme; But you shall shine more bright in these contents Than
unswept stone, besmear'd with sluttish time. When wasteful war shall statues
overturn, And broils root out the work of masonry, Nor Mars his sword nor war's
quick fire shall burn The living record of your memory. 'Gainst death and
all-oblivious enmity Shall you pace forth; your praise shall still find room
Even in the eyes of all posterity That wear this world out to the ending doom.
So, till the judgment that yourself arise, You live in this, and dwell in
lovers' eyes./;

say $sonnet;

该示例使用 qq 操作符创建多行文本;该字符串使用双引号。

另一种方法是使用 here document,这是一种从 Unix shell 继承的功能。

multiline3.pl
#!/usr/bin/perl

use 5.30.0; use warnings;

my $sonnet = << "TEXT"; Not marble, nor the gilded monuments Of princes,
shall outlive this powerful rhyme; But you shall shine more bright in these
contents Than unswept stone, besmear'd with sluttish time. When wasteful war
shall statues overturn, And broils root out the work of masonry, Nor Mars his
sword nor war's quick fire shall burn The living record of your memory. 'Gainst
death and all-oblivious enmity Shall you pace forth; your praise shall still
find room Even in the eyes of all posterity That wear this world out to the
ending doom. So, till the judgment that yourself arise, You live in this, and
dwell in lovers' eyes. TEXT

say $sonnet;

here document 的开始和结束由分隔符指定:在本例中为 "TEXT"。双引号分隔符创建双引号字符串,而单引号分隔符创建单引号字符串。

Perl 字符串 substr

substr 从字符串中提取子字符串并返回它。第一个字符的偏移量为零。如果偏移量参数为负数,则从字符串末尾倒数该数量的位置开始。如果省略长度参数,则返回到字符串末尾的所有内容。

substrings.pl
#!/usr/bin/perl

use 5.30.0; use warnings;

my $word = "bookcase";

say substr $word, 0, 4; say substr $word, 4, 8; say substr $word, 4; say substr
$word, -8, 4; say substr $word, -4;

在示例中,我们使用 substr 函数从 "bookcase" 中提取单词。

$ ./substrings.pl book case case book case

Perl 字符串 startswith 和 endswith

String::Util 扩展包含一些字符串实用函数。

starts_ends.pl
#!/usr/bin/perl

use 5.30.0; use warnings; use String::Util qw/startswith endswith/;

my @words = qw/sky cup coin owl falcon war nice thorn/;

foreach (@words) {

    if (startswith 'c') {

        say;}}

say '-------------------------';

foreach (@words) {

    if (endswith 'n' or endswith 'r') {

        say;}}

我们定义了一个单词列表。我们查找所有以 'c' 开头并以 'n' 或 'r' 结尾的单词。

$ ./starts_ends.pl cup coin ------------------------- coin falcon war thorn

Perl 字符串正则表达式运算符

=~ 运算符将字符串与正则表达式匹配,如果正则表达式匹配则返回真值,如果不匹配则返回假值。

regex_oper.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my @words = qw/book bookshelf bookworm bookcase bookish bookkeeper
    booklet bookmark/;

my $ptr = "^book(worm|mark|keeper)?\$";

foreach (@words) {

    if ($_ =~ $ptr) {

        say "$_ matches";
    }
}

我们定义了一个单词列表。我们遍历列表并检查列表是否包含以下任何单词:book, bookworm, bookmark, bookkeper。我们使用交替字符 | 来指定多个可能的匹配。

$ ./regex_oper.pl
book matches
bookworm matches
bookkeeper matches
bookmark matches

在下一个示例中,我们使用正则表达式删除单词中的特定字符。

removing.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $text = qq/Did you go there? We did, but we had a "great" service there./;
my @words = split ' ', $text;

foreach my $word (@words) {

    $word =~ s/[?".!,]//g;
    say $word;
}

say '----------------------------';

foreach (@words) {

    $_ =~ s/[?".!,]//g;
    say;
}

我们定义一个句子并将其分解为单词。

my $text = qq/Did you go there? We did, but we had a "great" service there./;

使用 qq 操作符定义了一个句子。

my @words = split ' ', $text;

使用 split 函数将句子分解为单词。我们还没有完成,因为我们有像 there? 和 did 这样的单词,它们包含标点符号。这些必须被删除。

foreach my $word (@words) {

    $word =~ s/[?".!,]//g;
    say $word;
}

使用正则表达式删除标点符号。s 前缀表示我们将给定字符集替换为空字符,这意味着我们删除了它们。

foreach (@words) {

    $_ =~ s/[?".!,]//g;
    say;
}

这是使用特殊 $_ 变量的替代解决方案。

$ ./removing.pl
Did
you
go
there
We
did
but
we
had
a
great
service
there
----------------------------
Did
you
go
there
We
did
but
we
had
a
great
service
there

在下一个示例中,我们从字符串中删除空格。

remove_space.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $send = 1;
my $name = 'John Doe';

if ($send) {
    (my $msg = qq{
        Dear $name,

        this is a message for you.

        Regards,
            Roger Roe
        }) =~ s/^ {8}//mg;

    say $msg;
}

我们使用 qq 操作符定义了一个多行字符串。通过应用 s/^ {8}//mg 正则表达式,删除最多八个字符的前导空格。

$ ./remove_space.pl

Dear John Doe,

this is a message for you.

Regards,
    Roger Roe

Perl 字符串空/空白

空字符串不包含任何字符,而空白字符串仅包含空格字符。

blank_empty.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my @words = ("sky", "\n\n", "  ", " cup", "\t\t", "", "\r", "\r\n", "sky");

my $i = 0;

foreach (@words) {

    if ($_ eq '') {

        say "$i -> empty";
    } else {

        say "$i -> not empty";
    }

    $i++;
}

say '-------------------------';

$i = 0;

foreach (@words) {

    if ($_ =~ /^\s*$/) {

        say "$i -> blank";
    } else {

        say "$i -> not blank";
    }

    $i++;
}

该示例使用 eq 操作符检查空字符串。要检查空白字符串,我们使用 /^\s*$/ 正则表达式。

$ ./blank_empty.pl
0 -> not empty
1 -> not empty
2 -> not empty
3 -> not empty
4 -> not empty
5 -> empty
6 -> not empty
7 -> not empty
8 -> not empty
-------------------------
0 -> not blank
1 -> blank
2 -> blank
3 -> not blank
4 -> blank
5 -> blank
6 -> blank
7 -> blank
8 -> not blank

Perl 循环字符串

接下来的示例展示了如何在 Perl 中迭代字符串字符。

loop_chars.pl
#!/usr/bin/perl

use 5.30.0; use warnings;

my $msg = 'an old falcon';
my $n = length $msg;
my $i = 0;

for (my $i=0; $i < $n; $i++) {

    say substr $msg, $i, 1;
}

在第一个示例中,我们使用 length 确定字符串的长度,然后使用经典的 for 循环。要获取字符,我们使用 substr 函数。

$ ./loop_chars.pl
a
n

o
l
d

f
a
l
c
o
n

第二个示例使用 split 函数和 foreach 循环。

loop_chars2.pl
#!/usr/bin/perl

use 5.30.0; use warnings;

my $msg = 'an old falcon';
my @chars = split '', $msg;

foreach (@chars) {

     say;
}

split 函数用于将字符串分解为字符列表。我们使用 foreach 迭代列表。

第三个示例使用正则表达式和 foreach 循环。

loop_chars3.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $msg = 'an old falcon';
my @chars = $msg =~ /./sg;

foreach (@chars) {

     say;
}

使用正则表达式将字符串分解为字符列表。我们使用 foreach 迭代列表。

Perl 字符串格式化

在 Perl 中,我们可以使用 printfsprintf 函数来格式化字符串。

这些函数将格式字符串和参数列表作为参数。

%[flags][width][.precision]specifier

格式说明符具有此语法。方括号 [] 中的选项是可选的。

末尾的说明符字符定义了其对应参数的类型和解释。

标志 是一组修改输出格式的字符。有效标志集取决于说明符字符。宽度 是一个非负十进制整数,表示要写入输出的最小字符数。如果要打印的值短于宽度,则结果会用空格填充。即使结果较大,也不会截断该值。

对于整数说明符,精度 设置要写入的最小位数。如果要写入的值短于此数字,则结果会用前导零填充。对于字符串,它是要打印的最大字符数。对于 e、E、f 和 F 说明符,它是小数点后要打印的位数。对于 g 和 G 说明符,它是要打印的有效数字的最大数量。

format_funs.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $name = "Jane";
my $age = 17;

printf "%s is %d years old\n", $name, $age;

my $res = sprintf "%s is %d years old", $name, $age;
say $res;

printf "%d%%\n", 23

格式化函数根据指定的格式说明符构建字符串。

printf "%s is %d years old\n", $name, $age;

printf 函数将格式化后的字符串打印到控制台。%s 期望一个字符串变量,%d 期望一个整数变量。

my $res = sprintf "%s is %d years old", $name, $age;

sprintf 函数将格式化后的字符串放入一个变量。

printf "%d%%\n", 23

由于 % 是格式字符串中的特殊字符,因此需要将其加倍才能将其打印为百分号。

$ ./format_funs.pl
Jane is 17 years old
Jane is 17 years old
23%

格式化函数按给定参数的顺序应用格式说明符。下一个示例显示如何更改它们的顺序。

format_index.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $oranges = 2;
my $apples = 4;
my $bananas = 3;

my $res = sprintf 'There are %d oranges, %d apples and %d bananas',
    $oranges, $apples, $bananas;
say $res;

my $res2 = sprintf 'There are %2$d oranges, %d apples and %1$d bananas',
    $oranges, $apples, $bananas;
say $res2;

我们格式化了两个字符串。在第一种情况下,变量按其指定顺序应用。在第二种情况下,我们使用 2$1$ 字符改变它们的顺序,它们分别取第三个和第二个参数。

$ ./format_index.pl
There are 2 oranges, 4 apples and 3 bananas
There are 4 oranges, 2 apples and 2 bananas

格式说明符字符定义了其对应参数的类型和解释。

format_specs.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

printf "%d\n", 1671;
printf "%i\n", 1671;
printf "%u\n", 1671;
printf "%o\n", 1671;
printf "%x\n", 1671;
printf "%X\n", 1671;
printf "%#b\n", 1671;
printf "%#B\n", 1671;
printf "%a\n", 1671.678;
printf "%A\n", 1671.678;
printf "%f\n", 1671.678;
printf "%F\n", 1671.678;
printf "%e\n", 1671.678;
printf "%E\n", 1671.678;
printf "%g\n", 1671.678;
printf "%G\n", 1671.678;
printf "%s\n", 'Zetcode';
printf "%c %c %c %c %c %c %c\n", ord('Z'), ord('e'), ord('t'),
    ord('C'), ord('o'), ord('d'), ord('e');
printf "%p\n", 1671;
printf "%d%%\n", 1671;

该示例显示了 Perl 的字符串格式说明符字符。

$ ./format_specs.pl
1671
1671
1671
3207
687
687
0b11010000111
0B11010000111
0x1.a1eb645a1cac1p+10
0X1.A1EB645A1CAC1P+10
1671.678000
1671.678000
1.671678e+03
1.671678E+03
1671.68
1671.68
Zetcode
Z e t C o d e
556e7c3c97b8
1671%

Perl 字符串格式化数字表示法

下面的示例展示了如何使用各种数字表示法。

notations.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $val = 1234;

printf "%d\n", $val;
printf "%x\n", $val;
printf "%b\n", $val;
printf "%o\n", $val;

数字 1234 以四种不同的表示法打印:十进制、十六进制、二进制和八进制。

$ ./notations.pl
1234
4d2
10011010010
2322

Perl 字符串格式化精度

下一个示例格式化浮点值的精度。

format_precision.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

printf "%0.f\n", 16.540;
printf "%0.2f\n", 16.540;
printf "%0.3f\n", 16.540;
printf "%0.5f\n", 16.540;

对于浮点值,精度 是小数点后要打印的位数。

$ ./format_precision.pl
17
16.54
16.540
16.54000

Perl 字符串格式化科学计数法

e、E、g 和 G 说明符用于以科学计数法格式化数字。

format_scientific.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $val = 1273.78888769000;

printf "%f\n", $val;
printf "%e\n", $val;
printf "%g\n", $val;
printf "%E\n", $val;
printf "%G\n", $val;

say '-------------------------';

printf "%.10f\n", $val;
printf "%.10e\n", $val;
printf "%.10g\n", $val;
printf "%.10E\n", $val;
printf "%.10G\n", $val;

say '-------------------------';

my $val2 = 66_000_000_000;

printf "%f\n", $val2;
printf "%e\n", $val2;
printf "%g\n", $val2;
printf "%E\n", $val2;
printf "%G\n", $val2;

该示例以普通十进制和科学计数法格式化浮点值。

$ ./format_scientific.pl
1273.788888
1.273789e+03
1273.79
1.273789E+03
1273.79
-------------------------
1273.7888876900
1.2737888877e+03
1273.788888
1.2737888877E+03
1273.788888
-------------------------
66000000000.000000
6.600000e+10
6.6e+10
6.600000E+10
6.6E+10

Perl 字符串格式化标志

标志 是一组修改输出格式的字符。有效标志集取决于说明符字符。Perl 识别以下标志

format_flags.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

printf "%+d\n", 1691;

say '----------------------';

printf "%#x\n", 1691;
printf "%#X\n", 1691;
printf "%#b\n", 1691;
printf "%#B\n", 1691;

say '----------------------';

printf "%10d\n", 1691;
printf "%-10d\n", 1691;
printf "%010d\n", 1691;

该示例在字符串格式说明符中使用标志。

$ ./format_flags.pl
+1691
----------------------
0x69b
0X69B
0b11010011011
0B11010011011
----------------------
      1691
1691
0000001691

Perl 字符串格式化宽度

宽度 是要输出的最小字符数。它由紧跟在动词前面的可选十进制数字指定。如果省略,则宽度为表示该值所需的任意长度。

如果宽度大于该值,则用空格填充。

width.pl
#!/usr/bin/perl

use 5.30.0;
use warnings;

my $w = "falcon";
my $n = 122;
my $h = 455.67;

printf "%s\n", $w;
printf "%10s\n", $w;

say '---------------------';

printf "%d\n", $n;
printf "%7d\n", $n;
printf "%07d\n", $n;

say '---------------------';

printf "%10f\n", $h;
printf "%11f\n", $h;
printf "%12f\n", $h;

示例将宽度用于字符串、整数和浮点数。

printf "%07d\n", $n;

如果前面有 0,则数字不是用空格填充,而是用 0 填充。

$ ./width.pl 
falcon
    falcon
---------------------
122
    122
0000122
---------------------
455.670000
 455.670000
  455.670000

在本文中,我们继续使用 Perl 中的字符串数据类型。

访问 Perl 字符串 或查看 所有 Perl 教程