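Python read file

Last modified January 29, 2024

This Python read file tutorial shows how to read files in Python. It covers several examples that read text and binary files.

To read a file, we must first open it. Python provides the built-in open function for this.

Python open function

The open function opens a file in Python and returns a file object.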

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None)

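file is the name of the file to open. mode indicates how the file is opened: for reading, writing, or appending. buffering is an optional integer that sets the buffering policy. encoding is the name of the encoding used to decode or encode the file. errors is an optional string that specifies how encoding and decoding errors are handled. newline controls how line endings are handled.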

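The file modes are:

Mode	Meaning
'r'	open for reading (default)
'w'	open for writing, truncating the file first
'a'	open for writing, appending to the end of the file if it exists
'b'	binary mode
't'	text mode (default)
'+'	updating (reading and writing)
'x'	exclusive creation, failing if the file already exists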
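The modes in the table can be tried out on a scratch file. The following is a minimal sketch; the file name notes.txt and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

# notes.txt is a hypothetical scratch file inside a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'notes.txt')

    # 'x' creates the file and fails if it already exists
    with open(path, 'x', encoding='utf-8') as f:
        f.write('first\n')

    try:
        open(path, 'x')
        created_twice = True
    except FileExistsError:
        created_twice = False

    # 'a' appends to the end of the existing file
    with open(path, 'a', encoding='utf-8') as f:
        f.write('second\n')

    # 'r+' opens for reading and writing without truncating
    with open(path, 'r+', encoding='utf-8') as f:
        after_append = f.read()

    # 'w' truncates the file before writing
    with open(path, 'w', encoding='utf-8') as f:
        f.write('third\n')

    with open(path, 'r', encoding='utf-8') as f:
        after_truncate = f.read()

print(created_twice)
print(repr(after_append))
print(repr(after_truncate))
```

The second 'x' call raises FileExistsError, the 'a' write lands after the existing text, and the 'w' write discards everything that was there before.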

First we work with text files. At the end of the tutorial, we read a binary file.

works.txt

Lost Illusions
Beatrix
Honorine
The firm of Nucingen
Old Goriot
Colonel Chabert
Cousin Bette
Gobseck
César Birotteau
The Chouans

We use this file for reading text.

Python read

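The read function reads at most size characters and returns them as a single string. If size is negative or omitted, it reads until the end of the file.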

read_all.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    contents = f.read()
    print(contents)

The example reads the whole file and prints its contents.

with open('works.txt', 'r') as f:

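We open the works.txt file in read mode. Since we did not specify binary mode, the file is opened in the default text mode. The call returns the file object f. The with statement simplifies exception handling by encapsulating common preparation and cleanup tasks. It also closes the opened file automatically.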

contents = f.read()

We call the read method of the file object. Since we did not pass any argument, it reads the whole file.

$ ./read_all.py 
Lost Illusions
Beatrix
Honorine
The firm of Nucingen
Old Goriot
Colonel Chabert
Cousin Bette
Gobseck
César Birotteau
The Chouans
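The sample file contains a non-ASCII character (the é in César), so the result of read depends on the encoding used to decode the file. When no encoding is given, open falls back to a platform-dependent default. The sketch below assumes the file is stored as UTF-8 and passes the encoding explicitly; the two-line stand-in file is created only for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'works.txt')

    # Write a short stand-in for works.txt as raw UTF-8 bytes
    with open(path, 'wb') as f:
        f.write('Gobseck\nCésar Birotteau\n'.encode('utf-8'))

    # Text mode: bytes are decoded into characters
    with open(path, 'r', encoding='utf-8') as f:
        text = f.read()

    # A negative size behaves like read() without an argument
    with open(path, 'r', encoding='utf-8') as f:
        same = f.read(-1)

    # Binary mode: the undecoded bytes are returned
    with open(path, 'rb') as f:
        raw = f.read()

print(len(text))     # number of characters
print(len(raw))      # number of bytes; é takes two bytes in UTF-8
print(text == same)
```

The text has 24 characters while the file holds 25 bytes, which shows that in text mode read counts decoded characters rather than bytes.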

Python read characters

By passing the size argument to the read function, we can specify how many characters to read.

read_characters.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    data1 = f.read(4)
    print(data1)

    data2 = f.read(20)
    print(data2)

    data3 = f.read(10)
    print(data3)

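In the example, we read 4, 20, and 10 characters from the file. Each call continues where the previous one stopped, and newline characters count toward the total.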

$ ./read_characters.py 
Lost
 Illusions
Beatrix
H
onorine
Th
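A common use of the size argument is to process a file in fixed-size pieces until read returns an empty string, which signals the end of the file. The following is a minimal sketch; the chunk size of 10 and the three-line stand-in file are arbitrary choices for illustration.

```python
import os
import tempfile

text = 'Lost Illusions\nBeatrix\nHonorine\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'works.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    chunks = []
    with open(path, 'r', encoding='utf-8') as f:
        while True:
            chunk = f.read(10)   # at most 10 characters per call
            if not chunk:        # an empty string means end of file
                break
            chunks.append(chunk)

print(chunks)
print([len(c) for c in chunks])
```

Only the last chunk is shorter than the requested size, because fewer than 10 characters remain at that point.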

Python readline

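The readline function reads until a newline or the end of the file and returns a single string. If the stream is already at the end of the file, it returns an empty string. If the size argument is given, at most size characters are read.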

read_line.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    line = f.readline()
    print(line.rstrip())

    line2 = f.readline()
    print(line2.rstrip())

    line3 = f.readline()
    print(line3.rstrip())

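In the example, we read three lines from the file. The rstrip function removes trailing whitespace, including the newline character, from the end of the string.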

$ ./read_line.py 
Lost Illusions
Beatrix
Honorine
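Because readline returns an empty string only at the end of the file, it can drive a loop over a file of unknown length. A blank line in the file is returned as '\n', so it does not end the loop early. The sketch below uses a small stand-in file with one blank line; the file contents are an assumption for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'works.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('Lost Illusions\n\nBeatrix\n')

    titles = []
    with open(path, 'r', encoding='utf-8') as f:
        while True:
            line = f.readline()
            if line == '':       # empty string: end of file reached
                break
            # Strip only the newline, keeping any other whitespace
            titles.append(line.rstrip('\n'))

print(titles)
```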

Python readlines

The readlines function reads the stream and returns its lines as a list.

read_lines.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    lines = f.readlines()

    print(lines)

    for line in lines:

        print(line.strip())

In the example, we read the contents of the file with readlines. We print the list of lines and then loop over the list with a for statement.

$ ./read_lines.py 
['Lost Illusions\n', 'Beatrix\n', 'Honorine\n', 'The firm of Nucingen\n', 
'Old Goriot\n', 'Colonel Chabert\n', 'Cousin Bette\n', 'Gobseck\n', 
'César Birotteau\n', 'The Chouans\n']
Lost Illusions
Beatrix
Honorine
The firm of Nucingen
Old Goriot
Colonel Chabert
Cousin Bette
Gobseck
César Birotteau
The Chouans
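As the printed list shows, readlines keeps the trailing newline of each line. When the newlines are not wanted, one alternative is to read the whole file and call the string method splitlines. The following sketch compares the two on a stand-in file whose last line has no trailing newline.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'works.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('Old Goriot\nGobseck\nThe Chouans')

    # readlines keeps the line endings
    with open(path, 'r', encoding='utf-8') as f:
        with_newlines = f.readlines()

    # splitlines drops the line endings
    with open(path, 'r', encoding='utf-8') as f:
        without_newlines = f.read().splitlines()

print(with_newlines)
print(without_newlines)
```

Both approaches load the entire file into memory, so they suit small files best.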

Python read file

Since the file object returned by the open function is iterable, we can pass it directly to the for statement.

read_file.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    for line in f:

        print(line.rstrip())

The example iterates over the file object to print the contents of the text file.

$ ./read_file.py 
Lost Illusions
Beatrix
Honorine
The firm of Nucingen
Old Goriot
Colonel Chabert
Cousin Bette
Gobseck
César Birotteau
The Chouans
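Iterating over the file object yields one line at a time, so the whole file does not have to be held in memory. It also combines naturally with built-ins such as enumerate. The sketch below collects the line numbers of titles that start with a given prefix; the prefix 'Co' and the stand-in file are assumptions chosen for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'works.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('Old Goriot\nColonel Chabert\nCousin Bette\nGobseck\n')

    matches = []
    with open(path, 'r', encoding='utf-8') as f:
        # enumerate numbers the lines, starting from 1
        for number, line in enumerate(f, start=1):
            if line.startswith('Co'):
                matches.append((number, line.rstrip('\n')))

print(matches)
```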

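The seek function

The seek function moves the stream position to the given offset. In binary mode the offset is counted in bytes. In text mode the only offsets guaranteed to work are zero and values previously returned by tell.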

seek(offset, whence=SEEK_SET)

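The offset value is interpreted relative to the position indicated by whence. The default value for whence is SEEK_SET.

The values for whence are:

SEEK_SET or 0 - start of the stream (the default); offset should be zero or positive
SEEK_CUR or 1 - current stream position; offset may be negative
SEEK_END or 2 - end of the stream; offset is usually negative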

seeking.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    data1 = f.read(22)
    print(data1)

    f.seek(0, 0)

    data2 = f.read(22)
    print(data2)

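In the example, we read 22 characters from a text stream. Then we set the stream position back to the beginning and read the same 22 characters again.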

$ ./seeking.py 
Lost Illusions
Beatrix
Lost Illusions
Beatrix
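Seeking relative to the current position or to the end of the stream with a nonzero offset is meant for files opened in binary mode. The following is a minimal sketch of the three whence values, using the constants from the os module on a hypothetical ten-byte file named data.bin.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.bin')
    with open(path, 'wb') as f:
        f.write(b'0123456789')

    with open(path, 'rb') as f:
        f.seek(3, os.SEEK_SET)    # 3 bytes from the start
        from_start = f.read(2)    # reads bytes 3 and 4, position is now 5

        f.seek(2, os.SEEK_CUR)    # skip 2 bytes forward, to position 7
        from_current = f.read(1)

        f.seek(-3, os.SEEK_END)   # 3 bytes before the end
        from_end = f.read()

print(from_start, from_current, from_end)
```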

The tell function

The tell function returns the current stream position.

telling.py

#!/usr/bin/python

with open('works.txt', 'r') as f:

    print(f'The current file position is {f.tell()}')

    f.read(22)
    print(f'The current file position is {f.tell()}')

    f.seek(0, 0)
    print(f'The current file position is {f.tell()}')

We move the stream position with read and seek and print it with tell.

$ ./telling.py 
The current file position is 0
The current file position is 22
The current file position is 0
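A value returned by tell can be stored and later passed to seek to return to the same place in the file. The sketch below marks the start of the second line and reads that line twice; the stand-in file and the variable names are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'works.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('Lost Illusions\nBeatrix\nHonorine\n')

    with open(path, 'r', encoding='utf-8') as f:
        f.readline()               # skip the first line
        mark = f.tell()            # remember where the second line starts

        first_pass = f.readline()
        f.seek(mark)               # jump back to the remembered position
        second_pass = f.readline()

print(repr(first_pass))
print(repr(second_pass))
```

The sketch uses readline rather than a for loop, since tell is not available while a text file is being iterated with for.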

Python read file with try/except/finally

The with statement simplifies our work when we read files. Without with, we need to handle exceptions and close resources manually.

try_except_finally.py

#!/usr/bin/python

f = None

try:

    f = open('works.txt', 'r')

    for line in f:
        print(line.rstrip())

except IOError as e:

    print(e)

finally:

    if f:
        f.close()

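In the example, we use the try, except, and finally keywords to handle exceptions and release the resource. The finally block closes the file whether or not an error occurred. Since Python 3.3, IOError is an alias of OSError.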
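The with statement and exception handling can also be combined: with takes care of closing the file, while try/except reports problems such as a missing file. The following is one possible sketch; the helper name read_titles and its behavior of returning an empty list on failure are assumptions, not part of the original example.

```python
import os
import tempfile


def read_titles(path):
    """Return the lines of path, or an empty list if it cannot be read."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return [line.rstrip('\n') for line in f]
    except FileNotFoundError:
        # Raised by open when the path does not exist
        print(f'{path} does not exist')
        return []
    except OSError as e:
        # Any other I/O problem, such as missing permissions
        print(f'cannot read {path}: {e}')
        return []


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, 'works.txt')
    with open(present, 'w', encoding='utf-8') as f:
        f.write('Gobseck\nThe Chouans\n')

    found = read_titles(present)
    missing = read_titles(os.path.join(tmp, 'missing.txt'))

print(found)
print(missing)
```

FileNotFoundError is a subclass of OSError, so it must be listed before the more general handler to be reported separately.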

Python read binary file

In the following example, we read a binary file.

read_binary.py

#!/usr/bin/python

with open('web.png', 'rb') as f:

    hexdata = f.read().hex()

    n = 2
    data = [hexdata[i:i+n] for i in range(0, len(hexdata), n)]

    i = 0
    for e in data:

        print(e, end=' ')
        i += 1

        if i % 20 == 0:
            print()

    print()

The example reads a PNG file. It outputs the data as hexadecimal values.

with open('web.png', 'rb') as f:

We open the PNG file in read and binary modes.

hexdata = f.read().hex()

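We read all the data with read, which returns a bytes object in binary mode, and convert it to a string of hexadecimal digits with the hex method.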

n = 2
data = [hexdata[i:i+n] for i in range(0, len(hexdata), n)]

We split the string into a list of two-character chunks, one for each byte.

i = 0
for e in data:

    print(e, end=' ')
    i += 1

    if i % 20 == 0:
        print()

We print the data in columns separated by a single space. Each output line has 20 columns.

$ ./read_binary.py 
89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 00 00 00 10 
00 00 00 10 08 06 00 00 00 1f f3 ff 61 00 00 03 2a 49 44 41 
54 78 01 75 53 43 98 23 69 18 ae eb 7a af eb 4a c6 b6 ed 38 
19 bb b5 6c db 66 55 45 dd 71 66 9f a4 ad 8a 9d b1 6d e3 fe 
ac 77 2f 63 bf 5b 55 a7 e1 e1 c7 a7 f7 33 01 e0 b5 43 6a 1a 
3e 27 e5 6d a9 62 b9 d6 39 4a a5 bd 3e 4a ad bb 22 56 d2 76 
...
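The same kind of dump can be produced without loading the whole file at once, by reading 20 bytes per iteration. The sketch below assumes Python 3.8 or later, where bytes.hex accepts a separator argument. Since web.png is not available here, it writes a hypothetical sample.bin that starts with the eight-byte PNG signature followed by filler bytes; it is not a valid image.

```python
import os
import tempfile

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

with tempfile.TemporaryDirectory() as tmp:
    # sample.bin is a stand-in: PNG signature plus 30 filler bytes
    path = os.path.join(tmp, 'sample.bin')
    with open(path, 'wb') as f:
        f.write(PNG_SIGNATURE + bytes(range(30)))

    rows = []
    with open(path, 'rb') as f:
        # Compare the first 8 bytes with the PNG signature
        is_png = f.read(8) == PNG_SIGNATURE
        f.seek(0)

        while True:
            chunk = f.read(20)          # at most 20 bytes per row
            if not chunk:               # empty bytes object: end of file
                break
            rows.append(chunk.hex(' '))  # separator needs Python 3.8+

print(is_png)
for row in rows:
    print(row)
```

The first eight values of the first row match the start of the dump above, because every PNG file begins with the same signature.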

Source

Python language reference

In this article we have shown how to read files in Python.

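Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I have more than ten years of experience in teaching programming.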
