Python 中的文件

最后修改于 2023 年 10 月 18 日

在本 Python 编程教程中，我们将使用文件和标准输入/输出。我们将展示如何从文件读取和写入文件。我们将简要介绍 pickle 模块。

Python 中的一切都是对象。UNIX 中的一切都是文件。

Python open 函数

open 函数用于在 Python 中打开文件。

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None)

file 是要打开的文件的名称。 mode 表示将如何打开文件：用于读取、写入或追加。 buffering 是一个可选的整数，用于设置缓冲策略。 encoding 是用于解码或编码文件的编码的名称。 errors 是一个可选的字符串，用于指定如何处理编码和解码错误。 newline 控制换行符的行为。

文件模式为

模式	含义
'r'	读取 (默认)
'w'	写入
'a'	追加
'b'	二进制数据
'+'	更新（读取和写入）
'x'	独占创建，如果文件存在则失败

在下面的示例中，我们使用此文本文件。

works.txt

Lost Illusions
Beatrix
Honorine
The firm of Nucingen
Old Goriot
Colonel Chabert
Cousin Bette

Python with 语句

使用文件常常会导致错误；因此，我们必须管理可能的异常。此外，当不再需要文件对象时，必须关闭文件对象。 with 语句通过封装常见的准备和清理任务来简化异常处理。它还会自动关闭已打开的文件。

read_width.py

#!/usr/bin/env python

# read_width.py

with open('works.txt', 'r') as f:

    contents = f.read()
    print(contents)

该示例读取 works.txt 文件的内容。

with open('works.txt', 'r') as f:

使用 open 函数，我们打开 works.txt 文件进行读取。文件对象被别名为 f。 with 语句负责处理异常和关闭已打开的文件。

contents = f.read()

read 方法读取所有内容直到 EOF。它将数据作为一个大字符串返回。

Python read 函数

read 函数从文件中读取指定数量的字节。如果未指定字节数，则读取整个文件。

read_data.py

#!/usr/bin/env python

# read_data.py

import sys

with open('works.txt', 'r') as f:

    print(f.read(5))
    print(f.read(9))

代码示例从文件中读取五个字母并将它们打印到控制台。然后它读取并打印另外九个字母。第二个操作从第一个操作结束的位置继续读取。

$ ./read_data.py
Lost
Illusions

Python 文件位置

文件位置是我们在文件中读取数据的位置。 tell 方法给出文件中的当前位置，而 seek 方法移动文件中的位置。

file_positions.py

#!/usr/bin/env python

# file_positions.py

with open('works.txt', 'r') as f:

    print(f.read(14))
    print(f"The current file position is {f.tell()}")

    f.seek(0, 0)

    print(f.read(30))

在该示例中，我们读取 14 个字节，然后确定并打印当前文件位置。我们使用 seek 将当前文件位置移动到开头，然后读取另外 30 个字节。

$ ./file_positions.py
Lost Illusions
The current file position is 14
Lost Illusions
Beatrix
Honorin

Python readline 方法

readline 方法从文件中读取一行。字符串中保留了一个尾随换行符。当函数到达文件末尾时，它将返回一个空字符串。

readbyline.py

#!/usr/bin/env python

# readbyline.py

with open('works.txt', 'r') as f:

    while True:

        line = f.readline()

        if not line:
            break

        else:
            print(line.rstrip())

使用 readline 方法和 while 循环，我们从文件中读取所有行。

while True:

我们开始一个无限循环；因此，我们必须稍后使用 break 语句跳出循环。

line = f.readline()

从文件中读取一行。

if not line:
    break

else:
    print(line.rstrip())

如果到达 EOF，我们调用 break 语句；否则，我们打印该行到控制台，同时去除行尾的换行符。

在下面的示例中，我们使用一种更方便的方式来遍历文件的行。

read_file.py

#!/usr/bin/env python

# read_file.py

with open('works.txt', 'r') as f:

    for line in f:

        print(line.rstrip())

在内部，文件对象是一个迭代器。我们可以将文件对象传递给 for 循环来遍历它。

Python readlines 方法

readlines 方法读取数据直到文件末尾并返回一个行列表。

readlines.py

#!/usr/bin/env python

# readlines.py

with open('works.txt', 'r') as f:

    contents = f.readlines()

    for i in contents:

        print(i.strip())

readlines 方法将文件的所有内容读入内存。然而，对于非常大的文件，这可能会有问题。

Python write 方法

write 方法将一个字符串写入文件。

strophe.py

#!/usr/bin/env python

# strophe.py

text = '''Incompatible, it don't matter though
'cos someone's bound to hear my cry
Speak out if you do
You're not easy to find\n'''

with open('strophe.txt', 'w') as f:

    f.write(text)

这次我们以“w”模式打开文件，以便我们可以向其中写入。如果文件不存在，则创建它。如果它存在，则覆盖它。

$ cat strophe.txt
Incompatible, it don't matter though
'cos someone's bound to hear my cry
Speak out if you do
You're not easy to find

Python 标准 I/O

有三个基本的 I/O 连接：标准输入、标准输出和标准错误。标准输入是进入程序的数据。标准输入来自键盘。标准输出是我们使用 print 关键字打印数据的地方。除非重定向，否则它是终端控制台。标准错误是程序写入错误消息的流。它通常是文本终端。

Python 中的标准输入和输出是位于 sys 模块中的对象。

对象	描述
sys.stdin	标准输入
sys.stdout	标准输出
sys.stderr	标准错误

根据 UNIX 哲学，标准 I/O 流是文件对象。

Python 标准输入

标准输入是进入程序的数据。

read_name.py

#!/usr/bin/env python

# read_name.py

import sys

print('Enter your name: ', end='')

name = ''

sys.stdout.flush()

while True:

    c = sys.stdin.read(1)

    if c == '\n':
        break

    name = name + c

print('Your name is:', name)

read 方法从标准输入中读取一个字符。在我们的示例中，我们得到一个提示，上面写着“输入你的名字”。我们输入我们的名字并按下 Enter 键。 Enter 键生成 换行 符： \n。

$ ./read_name.py
Enter your name: Peter
Your name is: Peter

为了获取输入，我们可以使用更高级别的函数：input 和 raw_input。

如果给定了提示，input 函数会打印提示并读取输入。

input_example.py

#!/usr/bin/env python

# input_example.py

data = input('Enter value: ')

print('You have entered:', data)

$ ./input_example.py
Enter value: Hello there
You have entered: Hello there

Python 标准输出

标准输出是我们打印数据的地方。

std_output.py

#!/usr/bin/env python

# std_output.py

import sys

sys.stdout.write('Honore de Balzac, Father Goriot\n')
sys.stdout.write('Honore de Balzac, Lost Illusions\n')

在该示例中，我们向标准输出写入一些文本。在我们的例子中，这是终端控制台。我们使用 write 方法。

$ ./stdout.py
Honore de Balzac, Father Goriot
Honore de Balzac, Lost Illusions

print 函数默认将一些文本放入 sys.stdout 中。

print_fun.py

#!/usr/bin/env python

# print_fun.py

print('Honore de Balzac')
print('The Splendors and Miseries of Courtesans', 'Gobseck', 'Father Goriot', sep=":")

vals = [1, 2, 3, 4, 5]

for e in vals:
    print(e, end=' ')

print()

在此示例中，我们使用 sep 和 end 参数。 sep 分隔打印的对象，而 end 定义在末尾打印的内容。

$ ./print_fun.py
Honore de Balzac
The Splendors and Miseries of Courtesans:Gobseck:Father Goriot
1 2 3 4 5

可以使用 print 函数写入文件。 print 函数包含一个 file 参数，它告诉我们打印数据的位置。

print2file.py

#!/usr/bin/env python

# print2file.py

with open('works.txt', 'w') as f:

    print('Beatrix', file=f)
    print('Honorine', file=f)
    print('The firm of Nucingen', file=f)

我们打开一个文件，并将巴尔扎克的三本书的标题写入其中。文件对象被赋予 file 参数。

$ cat works.txt
Beatrix
Honorine
The firm of Nucingen

Python 重定向

标准输出可以被重定向。在下面的示例中，我们将标准输出重定向到一个常规文件。

redirect.py

#!/usr/bin/env python

# redirect.py

import sys

with open('output.txt', 'w') as f:

    sys.stdout = f

    print('Lucien')
    sys.stdout.write('Rastignac\n')
    sys.stdout.writelines(['Camusot\n', 'Collin\n'])

    sys.stdout = sys.__stdout__

    print('Bianchon')
    sys.stdout.write('Lambert\n')

在 redirect.py 脚本中，我们将标准输出重定向到常规文件 output.txt。然后我们恢复原始的标准输出。 std.output 的原始值保存在一个特殊的 sys.__stdout__ 变量中。

$ ./redirect.py
Bianchon
Lambert
$ cat output.txt
Lucien
Rastignac
Camusot
Collin

Python pickle 模块

到目前为止，我们一直在处理简单的文本数据。如果我们处理的是对象而不是简单的文本呢？对于这种情况，我们可以使用 pickle 模块。这个模块序列化 Python 对象。 Python 对象被转换为字节流并写入文本文件。此过程称为 pickling。逆操作，从文件读取和重建对象称为反序列化或 unpickling。

pickle_ex.py

#!/usr/bin/env python

# pickle_ex.py

import pickle


class Person:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def get_name(self):
        return self.name

    def get_age(self):
        return self.age


person = Person('Monica', 15)
print(person.get_name())
print(person.get_age())

with open('monica', 'wb') as f:
    pickle.dump(person, f)

with open('monica', 'rb') as f2:
    monica = pickle.load(f2)

print(monica.get_name())
print(monica.get_age())

在我们的脚本中，我们定义了一个 Person 类。我们创建一个人。我们使用 dump 方法对对象进行 pickling。我们关闭文件，再次打开它进行读取，并使用 load 方法对对象进行 unpickling。

$ ./pickle_ex.py
Monica
15
Monica
15

在本 Python 教程中，我们在 Python 中使用文件和标准输入/输出。

电子书