
zlib decompress() only decodes the first line

Posted: 2023-07-16 10:55:45

I have a large dictionary named idf (more than 1000 entries), and I want to store all of its values() in a compressed .txt file. Here is my code:

for key in idf:
    data = str(idf[key])
    compressed_index = zlib.compress(data.encode('ISO-8859-1'))
    with open(current_inverted_index, "ab") as my_file:
        my_file.write(compressed_index)

After compression finishes, my new .txt file is 443MB, and its first few lines look like this:

xú??Vw∞∞4422–5000T∑R?6¨≠’Q@76??a?ò?d?
1?ò[?d??±?X?m16?≠??5%xú??;í$7D?2?úUI?+O7ê?–I∫?*??? e ?????H$????/?ˇ1W.%??R ????w???W??’r??’≥>??|Sè??3ü9???z?=_}ü??j~Fw[ˇ?????…?&/??/?3$??   ?m<O?–RüwVüOYs??ü?t?dl”‘??≥??a??ü+??T?]}???o???≈*?S”j5??'zπ,??ú}uΩy??g??UM;KM?k?2?b?…?S6z??°C?—≤Cf??‘?¨ ????zvΩ÷÷ü??–??@≈J?±?
?ê5??i3ü??≤áu-?a1?id???é(??5t?G?≈pY?>/ ???±-?π≠?pgùXBF?8≤Z?2∏??r?‘?M ?C3wY.??≤??%??I√≥?cJπ0∑?'?ê7??òM??$EP.Cèì?v^\?"h?.§O???m?cTN?A>??X??????áf??eú<R?#-)?6?%?≤??∏_‰?v?U&hM?l??5·I?4?F?`7???z???&??l{ à??–ê?5C9—ì ?<??“ó?x?&_??Qv?j?????og???4N?d&SZùwêf^5§**M???≤≥?;V"?-?g]ü??Z?]ú∏R??r ???‰ ????3>?’?X?:?v??CK??F????4:?ò?≠,?<?9'r?àπ1ê?i|∑π??∞?;

I am trying to test my encoding, but I only get back the value for the first key in the dictionary, b"[{'AP891220-0001': {1}}, {'AP891220-0034': {512}}, {'AP891220-0073': {311}}, {'AP891220-0078': {231}}, {'AP891220-0079': {137}}]". This is my decoding code:

f = open('inverted_indexes/id_1.txt', 'rb')
decompressed_data = zlib.decompress(f.read())
print(decompressed_data)

I'm not sure what the problem is, or why only a small part of the .txt file is decoded instead of the whole thing.

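Each call to zlib.compress() writes a separate, complete zlib stream to the file, and zlib.decompress() stops at the end of the first stream, so only the first value comes back. Instead, use a serialization library such as pickle (not safe for untrusted data) or json and compress the whole dictionary at once: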

import zlib
import pickle

index = 'index.txt'

idf = dict(zip('abcdefghijklmnop',range(16)))

compressed_index = zlib.compress(pickle.dumps(idf))
with open(index, 'wb') as my_file:
    my_file.write(compressed_index)

with open(index, 'rb') as f:
    decompressed_data = zlib.decompress(f.read())
print(pickle.loads(decompressed_data))

The same approach with json:

import zlib
import json

index = 'index.txt'

idf = dict(zip('abcdefghijklmnop',range(16)))
compressed_index = zlib.compress(json.dumps(idf).encode())
with open(index, 'wb') as my_file:
    my_file.write(compressed_index)

with open(index, 'rb') as f:
    decompressed_data = zlib.decompress(f.read()).decode()
print(json.loads(decompressed_data))
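
If the file has already been written the way the question does it (one zlib.compress() result appended after another), the data may still be recoverable without re-creating it. The sketch below is one possible way to do that, not part of the original answer: it relies on zlib.decompressobj(), whose unused_data attribute holds whatever bytes follow the end of the stream it just decoded. The helper name decompress_all and the sample values are hypothetical, chosen only for illustration.

```python
import zlib

def decompress_all(blob):
    """Decode every zlib stream in a blob of back-to-back streams.

    Hypothetical helper: returns one bytes object per stream.
    """
    chunks = []
    while blob:
        d = zlib.decompressobj()
        # Decode one complete stream; bytes after it land in unused_data.
        chunks.append(d.decompress(blob) + d.flush())
        blob = d.unused_data
    return chunks

# Illustrative values, compressed one by one as in the question.
values = ["[{'AP891220-0001': {1}}]", "[{'AP891220-0034': {512}}]"]
blob = b"".join(zlib.compress(v.encode("ISO-8859-1")) for v in values)

# A single decompress() call stops after the first stream.
print(zlib.decompress(blob))

# Walking the streams one at a time recovers every value.
for chunk in decompress_all(blob):
    print(chunk.decode("ISO-8859-1"))
```

This reads the whole file into memory, which may matter for a 443MB file. For new data, compressing the serialized dictionary once, as in the answer above, is simpler.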