Python object serialization and deserialization

Published 2023-11-17



Compiled from: "Python Pickle Tutorial: Object Serialization" (DataCamp) and "Python开发之序列化与反序列化:pickle、json模块使用详解" (奥辰, 博客园).

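Serializing an object means saving data from memory to disk; deserializing means loading that data back from disk.

The problem to solve: the data structure must not change when it is saved. The original class's methods and attributes, its data structures, and so on need to be saved together and be usable again after loading.

Solution: use pickle.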

1. Mock up some data

students = {
    'Student 1': {
        'Name': 'Alice', 'Age': 10, 'Grade': 4
    },
    'Student 2': {
        'Name': 'Bob', 'Age': 11, 'Grade': 5
    },
    'Student 3': {
        'Name': 'Elena', 'Age': 14, 'Grade': 8
    },
    '张三': {
        'Name': '张三', 'Age': 14, 'Grade': 8
    },
}
type(students)
dict

2. Serialization and deserialization

Saving as a text file (.txt)

Pass encoding='utf-8' to open() to handle the Chinese characters.

# The with block closes the file on exit, so no explicit close() is needed
with open('student_info.txt', 'w', encoding='utf-8') as data:
    data.write(str(students))

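The data is stored as a txt file with the following content:

{'Student 1': {'Name': 'Alice', 'Age': 10, 'Grade': 4}, 'Student 2': {'Name': 'Bob', 'Age': 11, 'Grade': 5}, 'Student 3': {'Name': 'Elena', 'Age': 14, 'Grade': 8}, '张三': {'Name': '张三', 'Age': 14, 'Grade': 8}}

Reload the data (again passing encoding='utf-8' to handle the Chinese characters):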

with open('student_info.txt', 'r', encoding='utf-8') as f:
    # Read into a new name so the original students dict is not overwritten
    loaded_text = f.read()

print(loaded_text)
type(loaded_text)
{'Student 1': {'Name': 'Alice', 'Age': 10, 'Grade': 4}, 'Student 2': {'Name': 'Bob', 'Age': 11, 'Grade': 5}, 'Student 3': {'Name': 'Elena', 'Age': 14, 'Grade': 8}, '张三': {'Name': '张三', 'Age': 14, 'Grade': 8}}
str

The data has become a string. The type has changed, so this does not meet the requirement.
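The type loss can be checked in memory, without touching the disk. This is a minimal sketch, assuming a shortened stand-in for the students dict. `ast.literal_eval` is a standard-library function that the original does not use; it only parses Python literals, so it is not a general solution for arbitrary objects.

```python
import ast

# Shortened stand-in for the students dict above
students = {
    "Student 1": {"Name": "Alice", "Age": 10, "Grade": 4},
    "张三": {"Name": "张三", "Age": 14, "Grade": 8},
}

text = str(students)              # what data.write(str(students)) stores
print(type(text))                 # the structure is now a plain string

# literal_eval can rebuild containers of literals (dict, list, str, int, ...)
restored = ast.literal_eval(text)
print(type(restored), restored == students)
```

This only works because every value here is a literal; an instance of a user-defined class has no such text form.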

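Using JSON

JSON is also a text format, but unlike the plain text file above it can be parsed back into a dict.

import json

# students must be the original dict here, not the string read back from the .txt file
with open('student_info.json', 'w') as data:
    json.dump(students, data)

Running the code above should produce a JSON file with the following content:

{"Student 1": {"Name": "Alice", "Age": 10, "Grade": 4}, "Student 2": {"Name": "Bob", "Age": 11, "Grade": 5}, "Student 3": {"Name": "Elena", "Age": 14, "Grade": 8}, "\u5f20\u4e09": {"Name": "\u5f20\u4e09", "Age": 14, "Grade": 8}}

The Chinese characters are stored as \uXXXX escapes, but the test below shows that they are restored correctly:

# Load the json file
with open('student_info.json', 'r') as data:
    data_json = json.load(data)

print(data_json)
print(type(data_json))
{'Student 1': {'Name': 'Alice', 'Age': 10, 'Grade': 4}, 'Student 2': {'Name': 'Bob', 'Age': 11, 'Grade': 5}, 'Student 3': {'Name': 'Elena', 'Age': 14, 'Grade': 8}, '张三': {'Name': '张三', 'Age': 14, 'Grade': 8}}
<class 'dict'>

The dict is restored here, but JSON only supports a few basic types (objects, arrays, strings, numbers, booleans, and null), so an instance of an arbitrary class cannot be saved this way. The requirement is still not met.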
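If the \uXXXX escapes in the file are unwanted, `json.dump` accepts `ensure_ascii=False`. This is a minimal sketch, assuming a shortened dict and a temporary directory so that it does not overwrite student_info.json.

```python
import json
import os
import tempfile

students = {"张三": {"Name": "张三", "Age": 14, "Grade": 8}}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "student_info.json")

    # ensure_ascii=False writes the characters themselves, so the file
    # must be opened with an encoding that can represent them
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(students, f, ensure_ascii=False)

    # Raw file content, to see how the characters were stored
    with open(json_path, "r", encoding="utf-8") as f:
        raw_text = f.read()

    # Parsed content
    with open(json_path, "r", encoding="utf-8") as f:
        loaded = json.load(f)

print(raw_text)
print(type(loaded), loaded == students)
```

Both forms load back to the same data; the flag only changes how readable the file is in a text editor.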

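The json module also provides json.dumps() and json.loads().

dumps converts an object to a str, and loads converts that str back into the corresponding object.

dump and load do the same thing combined with file operations: they take an open file object (not a file path) as an additional argument, as shown above.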

stud = json.dumps(students)
print(stud)
print(type(stud))

# loads
stud = json.loads(stud)
print(stud)
print(type(stud))
{"Student 1": {"Name": "Alice", "Age": 10, "Grade": 4}, "Student 2": {"Name": "Bob", "Age": 11, "Grade": 5}, "Student 3": {"Name": "Elena", "Age": 14, "Grade": 8}, "\u5f20\u4e09": {"Name": "\u5f20\u4e09", "Age": 14, "Grade": 8}}
<class 'str'>
{'Student 1': {'Name': 'Alice', 'Age': 10, 'Grade': 4}, 'Student 2': {'Name': 'Bob', 'Age': 11, 'Grade': 5}, 'Student 3': {'Name': 'Elena', 'Age': 14, 'Grade': 8}, '张三': {'Name': '张三', 'Age': 14, 'Grade': 8}}
<class 'dict'>

No file operations are needed.

Using pickle

Python Pickle Tutorial: Object Serialization | DataCamp

Advantages of using Pickle to serialize objects

  • Unlike serialization formats like JSON, which cannot handle tuples and datetime objects, Pickle can serialize almost every commonly used built-in Python data type. It also retains the exact state of the object which JSON cannot do.
  • Pickle is also a good choice when storing recursive structures since it only writes an object once.
  • Pickle allows for flexibility when deserializing objects. You can easily save different variables into a Pickle file and load them back in a different Python session, recovering your data exactly the way it was without having to edit your code.
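The first point in the list can be checked in memory. This is a minimal sketch, assuming a hypothetical `record` dict: `json` turns the tuple into a list and raises `TypeError` for the `datetime`, while `pickle` returns an equal object.

```python
import datetime
import json
import pickle

# Hypothetical record mixing a tuple and a datetime
record = {
    "scores": (90, 85),
    "enrolled": datetime.datetime(2023, 11, 17, 9, 30),
}

# JSON has no tuple type: the tuple comes back as a list
json_scores = json.loads(json.dumps({"scores": record["scores"]}))["scores"]
print(type(json_scores))

# JSON has no datetime type: the default encoder raises TypeError
try:
    json.dumps(record)
    json_error = None
except TypeError as exc:
    json_error = exc
print("json.dumps failed:", json_error)

# pickle restores both types unchanged
restored = pickle.loads(pickle.dumps(record))
print(type(restored["scores"]), type(restored["enrolled"]))
print(restored == record)
```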

Disadvantages of using Pickle

  • Pickle is unsafe because it can execute malicious Python callables to construct objects. When deserializing an object, Pickle cannot tell the difference between a malicious callable and a non-malicious one. Due to this, users can end up executing arbitrary code during deserialization.
  • As mentioned previously, Pickle is a Python-specific module, and you may struggle to deserialize pickled objects when using a different language.
  • According to multiple benchmarks, Pickle appears to be slower and produces larger serialized values than formats such as JSON and Apache Thrift.
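The first point in the list can be illustrated with a harmless callable. This is a minimal sketch, assuming a hypothetical class `Surprise` whose `__reduce__` tells pickle to call the built-in `sorted` on load. A hostile payload could name any importable callable instead, which is why data from untrusted sources should not be unpickled.

```python
import pickle


class Surprise:
    """Hypothetical class: __reduce__ tells pickle how to rebuild it."""

    def __reduce__(self):
        # On load, pickle calls sorted([3, 1, 2]) instead of
        # rebuilding a Surprise instance
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Surprise())

# Loading runs the callable chosen by whoever created the payload
result = pickle.loads(payload)
print(type(result), result)
```

What comes back is the return value of the callable, not a `Surprise` object: the payload, not the loading code, decided what ran.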

pickle.dump(obj, file, protocol=None, *, fix_imports=True, buffer_callback=None)

pickle.dumps(obj, protocol=None, *, fix_imports=True, buffer_callback=None)

pickle.load(file, *, fix_imports=True, encoding='ASCII', errors='strict', buffers=None)

pickle.loads(data, /, *, fix_imports=True, encoding='ASCII', errors='strict', buffers=None)

The difference between dump and dumps is the same as in json: dump writes to an open file object, and dumps returns the result (here a bytes object).
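The optional `protocol` argument in the signatures above selects the pickle format version. This is a minimal sketch, assuming a shortened dict; `pickle.DEFAULT_PROTOCOL` and `pickle.HIGHEST_PROTOCOL` are module constants whose values depend on the Python version.

```python
import pickle

students = {"张三": {"Name": "张三", "Age": 14, "Grade": 8}}

# dumps returns bytes instead of writing to a file object
default_bytes = pickle.dumps(students)
newest_bytes = pickle.dumps(students, protocol=pickle.HIGHEST_PROTOCOL)

print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
print(type(default_bytes), len(default_bytes), len(newest_bytes))

# loads detects the protocol from the data itself
print(pickle.loads(default_bytes) == pickle.loads(newest_bytes) == students)
```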

import pickle

# Pickle files are binary, so open them in 'wb' / 'rb' mode
with open('student_info.pickle', 'wb') as data:
    pickle.dump(students, data)

# Load the pickle file
with open('student_info.pickle', 'rb') as data:
    data_pickle = pickle.load(data)

print(data_pickle)
print(type(data_pickle))
{'Student 1': {'Name': 'Alice', 'Age': 10, 'Grade': 4}, 'Student 2': {'Name': 'Bob', 'Age': 11, 'Grade': 5}, 'Student 3': {'Name': 'Elena', 'Age': 14, 'Grade': 8}, '张三': {'Name': '张三', 'Age': 14, 'Grade': 8}}
<class 'dict'>

An arbitrary class

import os
import pickle


class SaveData:
    def __init__(self, path):
        self.path = path
        self.file = None
        self.file_name = None
        self.file_path = None

    def open(self, file_name):
        self.file_name = file_name
        self.file_path = os.path.join(self.path, self.file_name)
        self.file = open(self.file_path, "w")

    def write(self, data):
        self.file.write(data)

    def close(self):
        self.file.close()


path = "data"
file_name = "test.txt"
data = "test"
save_data = SaveData(path)

# Serialize the instance to bytes and write them to a file
# (avoid naming the variable str, which would shadow the built-in)
with open("class_test.tow", 'wb') as output_hal:
    output_hal.write(pickle.dumps(save_data))

# Load class_test.tow with pickle.loads()
with open("class_test.tow", 'rb') as input_hal:
    restored = pickle.loads(input_hal.read())

print(restored)
print(type(restored))
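Pickling `save_data` works here because `self.file` is still `None`; an open file object cannot be pickled. One common way around this is to drop the handle in `__getstate__`. This is a minimal sketch, assuming a hypothetical variant `PicklableSaveData` that is not part of the original code, and a temporary directory in place of the "data" path.

```python
import os
import pickle
import tempfile


class PicklableSaveData:
    """Hypothetical variant of SaveData that can be pickled while a file is open."""

    def __init__(self, path):
        self.path = path
        self.file = None
        self.file_path = None

    def open(self, file_name):
        self.file_path = os.path.join(self.path, file_name)
        self.file = open(self.file_path, "w")

    def close(self):
        self.file.close()

    def __getstate__(self):
        # Copy the instance attributes and leave out the open file handle
        state = self.__dict__.copy()
        state["file"] = None
        return state


with tempfile.TemporaryDirectory() as tmp_dir:
    saver = PicklableSaveData(tmp_dir)
    saver.open("test.txt")
    # Round trip while the file is still open
    restored = pickle.loads(pickle.dumps(saver))
    saver.close()

print(type(restored).__name__, restored.file, restored.file_path)
```

Note that pickle stores the instance's attributes plus a reference to the class by name, not the class's code, so the class definition must be importable wherever the data is loaded.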
Last updated 2023-11-17