
Youth Programming and Mathematics 02-016 Python Data Structures and Algorithms, Topic 30: Data Compression Algorithms


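  • I. Lossless Compression Algorithms
    • 1. Huffman Coding
    • 2. Lempel-Ziv-Welch (LZW) Coding
    • 3. Run-Length Encoding (RLE)
  • II. Lossy Compression and General-Purpose Compression Formats
    • 1. JPEG Compression (Lossy)
    • 2. DEFLATE (ZIP Compression)
    • 3. Brotli
    • 4. LZMA
    • 5. Zstandard (Zstd)
  • Summary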

Topic summary:
This topic introduces several common data compression algorithms and provides detailed Python implementations of each.


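I. Lossless Compression Algorithms

1. Huffman Coding

Huffman coding is an encoding method based on character frequency. It builds a Huffman tree from the frequencies and reads a unique, prefix-free code for each character off the tree, so that frequent characters receive shorter codes than rare ones.

Detailed code example (Python)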

import heapq
from collections import Counter


class Node:
    def __init__(self, char, freq):
        self.char = char      # None for internal nodes
        self.freq = freq
        self.left = None
        self.right = None

    def __lt__(self, other):
        # Lets heapq order nodes by frequency.
        return self.freq < other.freq


def build_huffman_tree(frequency):
    heap = [Node(char, freq) for char, freq in frequency.items()]
    heapq.heapify(heap)
    # Repeatedly merge the two least frequent nodes until one tree remains.
    while len(heap) > 1:
        left = heapq.heappop(heap)
        right = heapq.heappop(heap)
        merged = Node(None, left.freq + right.freq)
        merged.left = left
        merged.right = right
        heapq.heappush(heap, merged)
    return heap[0]


def generate_codes(node, prefix="", code_dict=None):
    if code_dict is None:
        code_dict = {}
    if node is not None:
        if node.char is not None:
            # A tree with a single leaf would otherwise get an empty code.
            code_dict[node.char] = prefix or "0"
        generate_codes(node.left, prefix + "0", code_dict)
        generate_codes(node.right, prefix + "1", code_dict)
    return code_dict


def huffman_encode(s):
    if not s:
        return "", {}
    frequency = Counter(s)
    huffman_tree = build_huffman_tree(frequency)
    huffman_codes = generate_codes(huffman_tree)
    encoded_string = ''.join(huffman_codes[char] for char in s)
    return encoded_string, huffman_codes


def huffman_decode(encoded_string, huffman_codes):
    reverse_dict = {code: char for char, code in huffman_codes.items()}
    current_code = ""
    decoded_string = ""
    # Codes are prefix-free, so the first match is always the right one.
    for bit in encoded_string:
        current_code += bit
        if current_code in reverse_dict:
            decoded_string += reverse_dict[current_code]
            current_code = ""
    return decoded_string


# Example
s = "this is an example for huffman encoding"
encoded_string, huffman_codes = huffman_encode(s)
print("Encoded string:", encoded_string)
print("Huffman dictionary:", huffman_codes)
decoded_string = huffman_decode(encoded_string, huffman_codes)
print("Decoded string:", decoded_string)
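
The encoder above returns a Python string of '0' and '1' characters, which spends a whole character on every bit, so the result is not yet smaller than the input. The following is a minimal sketch of how such a bit string could be packed into real bytes. The helper names `pack_bits` and `unpack_bits` are hypothetical and not part of the original code, and the sample bit string is made up for illustration.

```python
def pack_bits(bitstring):
    """Pack a string of '0'/'1' characters into bytes.

    Returns the packed bytes and the number of padding bits that were
    appended to fill the last byte.
    """
    padding = (8 - len(bitstring) % 8) % 8
    padded = bitstring + "0" * padding
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, padding


def unpack_bits(data, padding):
    """Reverse pack_bits: rebuild the bit string and drop the padding."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits[:len(bits) - padding] if padding else bits


# Hypothetical 13-bit sample standing in for Huffman output.
sample_bits = "1011001110001"
packed, pad = pack_bits(sample_bits)
print(len(sample_bits), "bits ->", len(packed), "bytes, padding:", pad)
print("Round trip OK:", unpack_bits(packed, pad) == sample_bits)
```

The padding count has to be stored with the packed data, because otherwise the decoder could mistake the trailing zero bits for a valid code.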

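2. Lempel-Ziv-Welch (LZW) Coding

LZW is a dictionary-based compression algorithm. It builds its dictionary dynamically while reading the input and replaces repeated strings with dictionary codes. The decoder rebuilds the same dictionary from the code stream, so the dictionary itself never has to be transmitted.

Detailed code example (Python)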

def lzw_encode(s):
    # Start with one entry per single character (code points 0-255 only).
    dictionary = {chr(i): i for i in range(256)}
    w = ""
    result = []
    for c in s:
        wc = w + c
        if wc in dictionary:
            w = wc
        else:
            result.append(dictionary[w])
            dictionary[wc] = len(dictionary)
            w = c
    if w:
        result.append(dictionary[w])
    return result


def lzw_decode(encoded):
    if not encoded:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    codes = list(encoded)  # work on a copy so the caller's list is not modified
    w = chr(codes[0])
    result = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == len(dictionary):
            # The code refers to the entry that is about to be created.
            entry = w + w[0]
        else:
            raise ValueError(f"Invalid LZW code: {k}")
        result.append(entry)
        dictionary[len(dictionary)] = w + entry[0]
        w = entry
    return ''.join(result)


# Example
s = "TOBEORNOTTOBEORTOBEORNOT"
encoded = lzw_encode(s)
print("Encoded:", encoded)
decoded = lzw_decode(encoded)
print("Decoded:", decoded)
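
The encoder above produces a list of integers, and how much space they take depends on how they are written out. As a rough sketch, each code can be stored with a fixed number of bits. The helpers `pack_codes` and `unpack_codes` are hypothetical names introduced here, and the fixed width is an assumption; practical LZW implementations usually grow the code width as the dictionary fills.

```python
def pack_codes(codes, width=12):
    """Pack integer codes into bytes, using `width` bits per code."""
    buffer = 0   # bits waiting to be written, held as an integer
    nbits = 0    # number of valid bits currently in the buffer
    out = bytearray()
    for code in codes:
        if code >= (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        buffer = (buffer << width) | code
        nbits += width
        while nbits >= 8:
            nbits -= 8
            out.append((buffer >> nbits) & 0xFF)
        buffer &= (1 << nbits) - 1
    if nbits:
        # Pad the final partial byte with zero bits on the right.
        out.append((buffer << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack_codes(data, count, width=12):
    """Read back `count` codes of `width` bits each."""
    buffer = 0
    nbits = 0
    codes = []
    for byte in data:
        buffer = (buffer << 8) | byte
        nbits += 8
        while nbits >= width and len(codes) < count:
            nbits -= width
            codes.append((buffer >> nbits) & ((1 << width) - 1))
        buffer &= (1 << nbits) - 1
    return codes


# Codes for "TOBEORNOTTOBEORTOBEORNOT" with a 256-entry starting dictionary.
codes = [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
packed = pack_codes(codes, width=9)
print(len(codes), "codes ->", len(packed), "bytes at 9 bits per code")
print("Round trip OK:", unpack_codes(packed, len(codes), width=9) == codes)
```

The number of codes must be known to the decoder (or an end marker reserved), since the last byte may contain padding bits.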

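3. Run-Length Encoding (RLE)

RLE is a simple lossless compression algorithm that replaces each run of consecutive identical characters with the character followed by its repeat count. It works well only when the data contains long runs.

Detailed code example (Python)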

def rle_encode(s):
    if not s:
        return ""
    result = []
    prev_char = s[0]
    count = 1
    for char in s[1:]:
        if char == prev_char:
            count += 1
        else:
            result.append((prev_char, count))
            prev_char = char
            count = 1
    result.append((prev_char, count))
    return ''.join([f"{char}{count}" for char, count in result])


def rle_decode(encoded):
    # Note: this text format is ambiguous if the original string contains digits.
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        # Read every digit of the count so that runs of 10 or more decode correctly.
        start = i
        while i < len(encoded) and encoded[i].isdigit():
            i += 1
        count = int(encoded[start:i])
        result.append(char * count)
    return ''.join(result)


# Example
s = "AAAABBBCCDAA"
encoded = rle_encode(s)
print("Encoded:", encoded)
decoded = rle_decode(encoded)
print("Decoded:", decoded)
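
The character-plus-count text format above cannot represent input that itself contains digits. One common way around this, sketched here under the assumption that the data is handled as bytes, is to store every run as a fixed two-byte pair (count, value) with the count capped at 255. The function names `rle_encode_bytes` and `rle_decode_bytes` are hypothetical.

```python
def rle_encode_bytes(data):
    """Encode bytes as (count, value) pairs; each count is at most 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the next byte matches and the count still fits in one byte.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode_bytes(encoded):
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        out += bytes([value]) * count
    return bytes(out)


data = b"A" * 300 + b"12"
encoded = rle_encode_bytes(data)
print(len(data), "bytes ->", len(encoded), "bytes")
print("Round trip OK:", rle_decode_bytes(encoded) == data)
```

Runs longer than 255 are simply split into several pairs, and data without runs doubles in size, which is why RLE is normally applied only to data known to be repetitive.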

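II. Lossy Compression and General-Purpose Compression Formats

1. JPEG Compression (Lossy)

JPEG is a widely used image compression standard that is normally used in its lossy mode: it discards some image detail in exchange for a much smaller file. Implementing JPEG from scratch is fairly complex, but Python's Pillow library can read and write JPEG images.

Detailed code example (Python)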

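# Requires the third-party Pillow package (pip install Pillow).
from PIL import Image


# Compress an image
def compress_image(input_path, output_path, quality=85):
    with Image.open(input_path) as image:
        # JPEG cannot store an alpha channel or a palette, so convert such images first.
        if image.mode not in ("RGB", "L"):
            image = image.convert("RGB")
        # Lower quality values give smaller files with more visible artifacts.
        image.save(output_path, "JPEG", quality=quality)


# Example
compress_image("input.jpg", "output.jpg", quality=50)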
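
Most of the information JPEG discards is lost in a quantization step, and the `quality` setting controls how coarse that step is. The block below is not JPEG; it is only a simplified sketch of uniform quantization applied to a made-up list of sample values, meant to show why a larger step leaves fewer distinct symbols to store but makes exact recovery impossible. The names `quantize` and `dequantize` are hypothetical.

```python
def quantize(values, step):
    """Replace each value by the index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Map indices back to values; the rounding error is not recoverable."""
    return [i * step for i in indices]


# Hypothetical sample values, e.g. brightness levels of eight pixels.
samples = [52, 55, 61, 66, 70, 61, 64, 73]

for step in (1, 10, 40):
    indices = quantize(samples, step)
    restored = dequantize(indices, step)
    max_error = max(abs(a - b) for a, b in zip(samples, restored))
    print(f"step={step:2d}  distinct symbols={len(set(indices))}  max error={max_error}")
```

With a step of 1 the values survive unchanged, which corresponds to lossless storage. Larger steps shrink the symbol alphabet, which helps a later entropy coder such as Huffman coding, at the cost of an error of up to half a step per value.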

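2. DEFLATE (ZIP Compression)

DEFLATE combines the LZ77 algorithm with Huffman coding and is widely used in the ZIP file format. Unlike JPEG, DEFLATE is lossless: decompression restores the original data exactly. In Python it is available through the standard zlib module.

Detailed code example (Python)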

import zlib


def deflate_compress(data):
    # zlib works on bytes, so the text is encoded (UTF-8 by default) first.
    compressed_data = zlib.compress(data.encode())
    return compressed_data


def deflate_decompress(compressed_data):
    decompressed_data = zlib.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for deflate compression"
compressed_data = deflate_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = deflate_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
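
Note that `zlib.compress` does not emit a bare DEFLATE stream: it wraps it in the zlib container, which adds a short header and an Adler-32 checksum. The sketch below, which uses repeated sample text chosen only for illustration, compares that wrapped output with a raw DEFLATE stream obtained by passing a negative `wbits` value.

```python
import zlib

# Repetitive sample input so that compression has something to work with.
text = ("this is an example for deflate compression " * 50).encode("utf-8")

# zlib container: header + DEFLATE data + Adler-32 checksum.
wrapped = zlib.compress(text, 9)

# Raw DEFLATE stream: a negative wbits value suppresses header and checksum.
raw_compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = raw_compressor.compress(text) + raw_compressor.flush()

print("original:   ", len(text), "bytes")
print("zlib format:", len(wrapped), "bytes")
print("raw deflate:", len(raw), "bytes")

# Each stream must be decompressed with the matching wbits setting.
print("zlib round trip OK:", zlib.decompress(wrapped) == text)
print("raw round trip OK: ", zlib.decompress(raw, -15) == text)
```

ZIP archives store raw DEFLATE data inside their own container, which is why a stream produced by `zlib.compress` is not by itself a valid ZIP file.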

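3. Brotli

Brotli is a modern lossless compression algorithm that combines several compression techniques and often achieves a better compression ratio than DEFLATE, particularly on text.

Detailed code example (Python)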

# Requires the third-party Brotli package (pip install Brotli).
import brotli


def brotli_compress(data):
    compressed_data = brotli.compress(data.encode())
    return compressed_data


def brotli_decompress(compressed_data):
    decompressed_data = brotli.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for brotli compression"
compressed_data = brotli_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = brotli_decompress(compressed_data)
print("Decompressed data:", decompressed_data)

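4. LZMA

LZMA is a lossless compression algorithm known for its high compression ratio. It is widely used in the 7z file format, and Python supports it through the standard lzma module.

Detailed code example (Python)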

import lzma


def lzma_compress(data):
    # lzma.compress writes the .xz container format by default.
    compressed_data = lzma.compress(data.encode())
    return compressed_data


def lzma_decompress(compressed_data):
    decompressed_data = lzma.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for lzma compression"
compressed_data = lzma_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = lzma_decompress(compressed_data)
print("Decompressed data:", decompressed_data)
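
The lzma module also exposes a `preset` parameter (0 to 9, optionally combined with `lzma.PRESET_EXTREME`) that trades speed for compression ratio, and an `LZMACompressor` class for compressing data piece by piece. The sketch below tries both on repeated sample text; the payload and the 4096-byte chunk size are arbitrary choices for illustration.

```python
import lzma

# Repetitive sample payload, chosen only for illustration.
payload = ("this is an example for lzma compression " * 200).encode("utf-8")

# One-shot compression with the fastest and the most thorough preset.
fast = lzma.compress(payload, preset=0)
best = lzma.compress(payload, preset=9 | lzma.PRESET_EXTREME)
print("original:", len(payload), "bytes")
print("preset 0:", len(fast), "bytes")
print("preset 9 extreme:", len(best), "bytes")

# Incremental compression: feed the data in chunks, then flush.
compressor = lzma.LZMACompressor()
chunks = [compressor.compress(payload[i:i + 4096])
          for i in range(0, len(payload), 4096)]
chunks.append(compressor.flush())
streamed = b"".join(chunks)

print("one-shot round trip OK:", lzma.decompress(best) == payload)
print("streamed round trip OK:", lzma.decompress(streamed) == payload)
```

Incremental compression is useful when the input is too large to hold in memory at once, for example when reading a file in blocks.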

5. Zstandard (Zstd)

Zstd is a modern lossless compression algorithm that combines a high compression ratio with fast decompression.

Detailed code example (Python)

# Requires the third-party zstandard package (pip install zstandard).
import zstandard


def zstd_compress(data):
    compressed_data = zstandard.compress(data.encode())
    return compressed_data


def zstd_decompress(compressed_data):
    decompressed_data = zstandard.decompress(compressed_data)
    return decompressed_data.decode()


# Example
data = "this is an example for zstd compression"
compressed_data = zstd_compress(data)
print("Compressed data:", compressed_data)
decompressed_data = zstd_decompress(compressed_data)
print("Decompressed data:", decompressed_data)

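Summary

Each of these compression algorithms has its own strengths and suits different scenarios. Lossless algorithms, including Huffman coding, LZW, and RLE as well as DEFLATE, Brotli, LZMA, and Zstd, are the right choice when the original data must be recovered exactly. Lossy algorithms such as JPEG suit cases where some loss of quality is acceptable in exchange for a much smaller size. Choosing an algorithm that matches the requirements saves both storage space and transmission bandwidth.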
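
One practical way to choose is simply to measure. The sketch below compares the two standard-library compressors discussed above on a very short input and on a long repetitive one; the helper name `compressed_sizes` and both sample inputs are hypothetical.

```python
import lzma
import zlib


def compressed_sizes(data):
    """Return the size in bytes of `data` raw and after each compressor."""
    return {
        "original": len(data),
        "zlib": len(zlib.compress(data)),
        "lzma": len(lzma.compress(data)),
    }


short_input = b"abcdefghij"
long_input = b"this is an example " * 1000

print("short input:", compressed_sizes(short_input))
print("long input: ", compressed_sizes(long_input))
```

On very short inputs the container overhead of each format outweighs any savings, so the "compressed" result is larger than the original. The benefit only appears once the input is long enough to contain redundancy.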
