Best way to convert a string to bytes in Python 3?

As the answers to "TypeError: 'str' does not support the buffer interface" show, there are two different ways to convert a string to bytes.

Which of the following is better or more Pythonic? Or is it just a matter of personal preference?

b = bytes(mystring, 'utf-8')

b = mystring.encode('utf-8')

If you look at the docs for bytes, they point you to bytearray:

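bytearray([source[, encoding[, errors]]])
Return a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has, see Bytes and Byte Array Methods.
The optional source parameter can be used to initialize the array in a few different ways:
If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().
If it is an integer, the array will have that size and will be initialized with null bytes.
If it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.
If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.
Without an argument, an array of size 0 is created.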
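The cases listed in the excerpt can be sketched roughly as follows. This is only an illustration: the variable names are made up for the example, and it assumes a Python 3 interpreter, where bytes accepts the same kinds of source argument as bytearray.

```python
# One call per kind of source argument described in the excerpt.
from_text = bytearray("héllo", "utf-8")   # str: an encoding must be given
from_size = bytearray(3)                  # int: that many null bytes
from_buffer = bytearray(b"abc")           # object supporting the buffer interface
from_iterable = bytearray([104, 105])     # iterable of ints in range(256)
empty = bytearray()                       # no argument: size 0

# bytes takes the same kinds of source but returns an immutable object.
immutable = bytes("héllo", "utf-8")

# The optional errors argument is passed through to the encoder.
lossy = bytes("héllo", "ascii", "replace")

for value in (from_text, from_size, from_buffer, from_iterable, empty,
              immutable, lossy):
    print(type(value).__name__, value)
```

Calling bytearray("héllo") without an encoding raises a TypeError, which is the situation the first case in the excerpt describes.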

So bytes can do much more than just encode a string. It is Pythonic that it lets you call the constructor with any type of source parameter that makes sense.

For encoding a string, I think some_string.encode(encoding) is more Pythonic than using the constructor, because it is the most self-documenting: "take this string and encode it with this encoding" is clearer than bytes(some_string, encoding), where there is no explicit verb.

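Edit: I checked the Python source. If you pass a unicode string to bytes in CPython, it calls PyUnicode_AsEncodedString, which is the implementation of encode; so you are just skipping a level of indirection if you call encode yourself.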

Also, see Serdalis' comment: unicode_string.encode(encoding) is also more Pythonic because its inverse is byte_string.decode(encoding), and symmetry is nice.
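A small sketch of the equivalence and the symmetry mentioned above, assuming Python 3; the sample text and variable names are only illustrative:

```python
text = "naïve café"
encoding = "utf-8"

# Both spellings from the question produce the same bytes object.
via_constructor = bytes(text, encoding)
via_method = text.encode(encoding)
print(via_constructor == via_method)

# decode() is the inverse of encode() when the same encoding is used.
round_tripped = via_method.decode(encoding)
print(round_tripped == text)
```

Both comparisons should print True, since the two calls end up in the same encoder.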

It's easier than you might think:

my_str = "hello world"
my_str_as_bytes = str.encode(my_str)
type(my_str_as_bytes) # ensure it is byte representation
my_decoded_str = my_str_as_bytes.decode()
type(my_decoded_str) # ensure it is string representation

The absolute best way is neither of the two, but a third. The first parameter of encode has defaulted to 'utf-8' ever since Python 3.0. Thus the best way is

b = mystring.encode()

This will also be faster, because the default argument results not in the string "utf-8" in the C code, but in NULL, which is much faster to check!

Here are some timings:

In [1]: %timeit -r 10 'abc'.encode('utf-8')
The slowest run took 38.07 times longer than the fastest. 
This could mean that an intermediate result is being cached.
10000000 loops, best of 10: 183 ns per loop

In [2]: %timeit -r 10 'abc'.encode()
The slowest run took 27.34 times longer than the fastest. 
This could mean that an intermediate result is being cached.
10000000 loops, best of 10: 137 ns per loop

Despite the warning, the times were very stable across repeated runs; the deviation was only about 2 percent.
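Outside IPython, a comparable measurement can be sketched with the standard timeit module. The helper name best_time and the repeat and number values are choices made for this example, not part of the original measurement, and the absolute figures will differ between machines and Python versions.

```python
import timeit


def best_time(statement, repeat=5, number=200_000):
    """Return the best observed time per call, in nanoseconds."""
    runs = timeit.repeat(statement, repeat=repeat, number=number)
    return min(runs) / number * 1e9


explicit_ns = best_time("'abc'.encode('utf-8')")
default_ns = best_time("'abc'.encode()")

print("encode('utf-8'): %.0f ns per call" % explicit_ns)
print("encode():        %.0f ns per call" % default_ns)
```

Taking the minimum over several repeats mirrors the "best of 10" figure that %timeit reports.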

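Using encode() without an argument is not Python 2 compatible, because in Python 2 the default character encoding is ASCII. There, 'äöä' is a byte string, so encode() first decodes it implicitly as ASCII, which is why the traceback reports a UnicodeDecodeError.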

>>> 'äöä'.encode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
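For contrast, a sketch of the same call on Python 3, where str is always Unicode text and encode() defaults to UTF-8. The variable names are illustrative only; naming the encoding explicitly is the spelling that does not depend on any default.

```python
text = u"äöä"  # the u prefix is optional in Python 3 (accepted since 3.3)

default_encoded = text.encode()          # relies on the Python 3 default, UTF-8
explicit_encoded = text.encode("utf-8")  # names the encoding explicitly

print(default_encoded == explicit_encoded)
print(explicit_encoded)
```

On Python 3 both calls return the same six bytes, two for each of the three characters.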