“Unifying JSON, Bincode, MessagePack, and Protobuf under a single type definition is what zero-cost polymorphism really means.”


0 Background: why switch formats flexibly?

  • Front ends prefer JSON (readability)
  • Mobile clients prefer MessagePack (size)
  • Internal microservices prefer Protobuf (IDL)
  • Cache layers prefer Bincode (speed)

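Writing a separate struct for every format quickly turns into a maintenance nightmare.
This article will:

  1. Walk through serde's data model and format abstraction, layer by layer
  2. Implement one type definition with four formats, plus batch transcoding that writes straight into the output buffer
  3. Present a benchmark over 1 million records (1 GB of data)
  4. Provide a reusable template repository, serde-flex-showcase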

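1 Overview: four layers of abstraction

Layer | Component | Zero-cost | Notes
Type | #[derive(Serialize, Deserialize)] | | Single definition
Format | Serializer / Deserializer | | Format selected at run time
Encoding | &[u8] / BytesMut | | Zero-copy
Protocol | Content-Type negotiation | | HTTP layer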
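
The split between the type layer and the format layer can be illustrated without serde itself. The following is a minimal sketch, not serde's actual API: the trait `Sink` and the types `TextSink`, `BinarySink`, and `User` are hypothetical stand-ins for serde's `Serializer`, two concrete formats, and a derived type. The type describes itself once, and each format decides how the primitives are written.

```rust
// Toy stand-in for serde's Serializer trait (hypothetical, not serde's API).
trait Sink {
    fn put_u64(&mut self, v: u64);
    fn put_str(&mut self, v: &str);
}

// A human-readable format: every value is followed by ';'.
#[derive(Default)]
struct TextSink(String);

impl Sink for TextSink {
    fn put_u64(&mut self, v: u64) {
        self.0.push_str(&v.to_string());
        self.0.push(';');
    }
    fn put_str(&mut self, v: &str) {
        self.0.push_str(v);
        self.0.push(';');
    }
}

// A binary format: little-endian integers, length-prefixed strings.
#[derive(Default)]
struct BinarySink(Vec<u8>);

impl Sink for BinarySink {
    fn put_u64(&mut self, v: u64) {
        self.0.extend_from_slice(&v.to_le_bytes());
    }
    fn put_str(&mut self, v: &str) {
        self.put_u64(v.len() as u64);
        self.0.extend_from_slice(v.as_bytes());
    }
}

// The type describes itself once. The generic parameter is resolved at
// compile time, so there is no dynamic dispatch per field.
struct User {
    id: u64,
    name: String,
}

impl User {
    fn encode<S: Sink>(&self, sink: &mut S) {
        sink.put_u64(self.id);
        sink.put_str(&self.name);
    }
}
```

Adding a third format in this sketch means adding one more `Sink` implementation; `User::encode` stays untouched, which is the property the table attributes to the type layer.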

2 Minimal runnable baseline

2.1 Dependencies

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
bincode = "1.3"
rmp-serde = "1.1"
prost = "0.12"
bytes = "1"
axum = "0.7"
tokio = { version = "1", features = ["full"] }
lazy_static = "1"

[build-dependencies]
prost-build = "0.12"

2.2 Shared data structure

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct UserEvent {
    pub id: u64,
    pub name: String,
    pub score: f64,
    pub tags: Vec<String>,
}

3 Format implementations: one definition, four encodings

3.1 JSON

pub fn to_json(value: &UserEvent) -> Result<String, serde_json::Error> {
    serde_json::to_string(value)
}

pub fn from_json(s: &str) -> Result<UserEvent, serde_json::Error> {
    serde_json::from_str(s)
}

3.2 Bincode (binary, small footprint)

pub fn to_bincode(value: &UserEvent) -> Result<Vec<u8>, bincode::Error> {
    bincode::serialize(value)
}

pub fn from_bincode(bytes: &[u8]) -> Result<UserEvent, bincode::Error> {
    bincode::deserialize(bytes)
}
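
To see where Bincode's size comes from, the sketch below hand-encodes a record the way `bincode::serialize` is assumed to lay it out in bincode 1.x with default options: fields in declaration order with no field names, fixed-width little-endian integers and floats, and a `u64` length prefix before every string and sequence. The `Event` type and the functions `put_str` and `encode_like_bincode` are illustrative names and are not part of bincode.

```rust
// Illustrative copy of the article's record (hypothetical name `Event`).
struct Event {
    id: u64,
    name: String,
    score: f64,
    tags: Vec<String>,
}

// Length-prefixed UTF-8: a u64 little-endian length, then the raw bytes.
fn put_str(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

// Assumed bincode 1.x default layout; this is not bincode's own code.
fn encode_like_bincode(e: &Event) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&e.id.to_le_bytes());
    put_str(&mut out, &e.name);
    out.extend_from_slice(&e.score.to_le_bytes());
    // A sequence is its element count followed by each element.
    out.extend_from_slice(&(e.tags.len() as u64).to_le_bytes());
    for tag in &e.tags {
        put_str(&mut out, tag);
    }
    out
}
```

Under that assumption every string costs 8 bytes of length prefix on top of its content, and no field names are written at all, which is the main difference from JSON.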

3.3 MessagePack (compact, cross-language)

pub fn to_msgpack(value: &UserEvent) -> Result<Vec<u8>, rmp_serde::encode::Error> {
    rmp_serde::to_vec(value)
}

pub fn from_msgpack(bytes: &[u8]) -> Result<UserEvent, rmp_serde::decode::Error> {
    rmp_serde::from_slice(bytes)
}

3.4 Protobuf (IDL, schema)

3.4.1 user_event.proto
syntax = "proto3";

package demo;

message UserEvent {
    uint64 id = 1;
    string name = 2;
    double score = 3;
    repeated string tags = 4;
}
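3.4.2 Generating the Rust code
// build.rs
fn main() -> std::io::Result<()> { prost_build::compile_protos(&["src/user_event.proto"], &["src/"]) }
3.4.3 Usage
use prost::Message;

// Pull in the module that prost-build generates for `package demo`.
pub mod demo { include!(concat!(env!("OUT_DIR"), "/demo.rs")); }

pub fn to_protobuf(value: &UserEvent) -> Result<Vec<u8>, prost::EncodeError> {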
    let pb = demo::UserEvent {
        id: value.id,
        name: value.name.clone(),
        score: value.score,
        tags: value.tags.clone(),
    };
    let mut buf = Vec::new();
    pb.encode(&mut buf)?;
    Ok(buf)
}

pub fn from_protobuf(bytes: &[u8]) -> Result<UserEvent, prost::DecodeError> {
    let pb = demo::UserEvent::decode(bytes)?;
    Ok(UserEvent {
        id: pb.id,
        name: pb.name,
        score: pb.score,
        tags: pb.tags,
    })
}
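
Part of Protobuf's compactness comes from how integers such as `uint64 id = 1` are written: as base-128 varints, where small values take a single byte. The sketch below implements that encoding by hand for illustration; `encode_varint` and `decode_varint` are hypothetical helper names, and prost performs this internally.

```rust
// Base-128 varint: 7 payload bits per byte, least-significant group first,
// with the high bit set on every byte except the last.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7F) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

// Returns the decoded value and the number of bytes consumed, or None if
// the input is truncated or longer than the 10 bytes a u64 can need.
fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate().take(10) {
        value |= u64::from(b & 0x7F) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For example, 300 encodes to the two bytes 0xAC 0x02, whereas a fixed-width `u64` always takes eight.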

4 HTTP content negotiation

4.1 Axum extractors and handler

use axum::{
    extract::{Path, Query, State},
    http::{header, StatusCode},
    response::Response,
    routing::get,
    Router,
};
use serde::Deserialize;

#[derive(Deserialize)]
struct FormatQuery {
    format: Option<String>,
}

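// AppState and AppError are assumed to be defined elsewhere. AppError needs
// From impls for each encoder's error type and must implement IntoResponse.
async fn get_user(
    Path(id): Path<u64>,
    Query(q): Query<FormatQuery>,
    State(state): State<AppState>,
) -> Result<Response, AppError> {
    let event = state.get_user(id).await?;
    // The format comes from the `?format=` query parameter and defaults to JSON.
    let accept = q.format.as_deref().unwrap_or("json");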

    let resp = match accept {
        "json" => {
            let body = serde_json::to_vec(&event)?;
            Response::builder()
                .header(header::CONTENT_TYPE, "application/json")
                .body(body.into())
                .unwrap()
        }
        "bincode" => {
            let body = bincode::serialize(&event)?;
            Response::builder()
                .header(header::CONTENT_TYPE, "application/octet-stream")
                .body(body.into())
                .unwrap()
        }
        "msgpack" => {
            let body = rmp_serde::to_vec(&event)?;
            Response::builder()
                .header(header::CONTENT_TYPE, "application/msgpack")
                .body(body.into())
                .unwrap()
        }
        "protobuf" => {
            let body = to_protobuf(&event)?;
            Response::builder()
                .header(header::CONTENT_TYPE, "application/x-protobuf")
                .body(body.into())
                .unwrap()
        }
        _ => return Err(AppError::Validation("unknown format".into())),
    };
    Ok(resp)
}
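
The handler above selects the format from a query parameter. Negotiating on the HTTP `Accept` header instead could look like the following sketch. It uses only the standard library; the `Format` enum and the `negotiate` function are hypothetical names, the media types mirror the Content-Type values used above, and the q-value handling is simplified compared with the full HTTP rules.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Format {
    Json,
    Bincode,
    MsgPack,
    Protobuf,
}

impl Format {
    fn from_mime(mime: &str) -> Option<Format> {
        match mime {
            "application/json" => Some(Format::Json),
            "application/octet-stream" => Some(Format::Bincode),
            "application/msgpack" => Some(Format::MsgPack),
            "application/x-protobuf" => Some(Format::Protobuf),
            _ => None,
        }
    }
}

// Picks the supported media type with the highest q-value. A missing or
// empty header and wildcards fall back to JSON; None means nothing the
// client accepts is supported.
fn negotiate(accept: Option<&str>) -> Option<Format> {
    let header = match accept {
        Some(h) if !h.trim().is_empty() => h,
        _ => return Some(Format::Json),
    };
    let mut best: Option<(f32, Format)> = None;
    for part in header.split(',') {
        let mut pieces = part.split(';');
        let mime = pieces.next().unwrap_or("").trim().to_ascii_lowercase();
        // q defaults to 1.0 when absent or unparsable.
        let q = pieces
            .filter_map(|p| p.trim().strip_prefix("q="))
            .filter_map(|v| v.parse::<f32>().ok())
            .next()
            .unwrap_or(1.0);
        let format = if mime == "*/*" || mime == "application/*" {
            Some(Format::Json)
        } else {
            Format::from_mime(&mime)
        };
        if let Some(f) = format {
            if q > 0.0 && best.map_or(true, |(best_q, _)| q > best_q) {
                best = Some((q, f));
            }
        }
    }
    best.map(|(_, f)| f)
}
```

In a handler, a `None` result would typically be mapped to a 406 Not Acceptable response rather than a validation error.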

5 Batch transcoding with minimal copying

5.1 Batch structure

#[derive(Serialize, Deserialize)]
pub struct Batch<T> {
    items: Vec<T>,
}

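5.2 Transcoder

use bytes::{BufMut, BytesMut};

pub struct Transcoder;

impl Transcoder {
    // Decodes the JSON batch into owned values, then encodes MessagePack
    // directly into a BytesMut without an intermediate Vec<u8>.
    pub fn transcode_json_to_msgpack(json: &[u8]) -> Result<BytesMut, Box<dyn std::error::Error>> {
        let batch: Batch<UserEvent> = serde_json::from_slice(json)?;
        // writer() takes the buffer by value, so recover it with into_inner().
        let mut writer = BytesMut::with_capacity(json.len()).writer();
        rmp_serde::encode::write(&mut writer, &batch)?;
        Ok(writer.into_inner())
    }
}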

6 Benchmark: 1 million records

6.1 Environment

  • CPU: AMD EPYC 7713, 64 cores
  • Memory: 256 GB
  • Data: 1 GB Vec<UserEvent> (1 million records)

6.2 Benchmark code

fn bench_all() {
    let events = (0..1_000_000)
        .map(|i| UserEvent {
            id: i,
            name: format!("user_{}", i),
            score: i as f64 * 0.1,
            tags: vec!["rust".into(), "serde".into()],
        })
        .collect::<Vec<_>>();

    // JSON
    let json = serde_json::to_vec(&events).unwrap();
    let json_size = json.len();
    let _ = serde_json::from_slice::<Vec<UserEvent>>(&json).unwrap();

    // Bincode
    let bin = bincode::serialize(&events).unwrap();
    let bin_size = bin.len();
    let _ = bincode::deserialize::<Vec<UserEvent>>(&bin).unwrap();

    // MessagePack
    let msg = rmp_serde::to_vec(&events).unwrap();
    let msg_size = msg.len();
    let _ = rmp_serde::from_slice::<Vec<UserEvent>>(&msg).unwrap();

    // Protobuf
    let pb = events.iter().map(to_protobuf).collect::<Result<Vec<_>, _>>().unwrap();
    let pb_size: usize = pb.iter().map(|v| v.len()).sum();
    let _ = pb.iter().map(|v| from_protobuf(v)).collect::<Result<Vec<_>, _>>().unwrap();

    println!(
        "JSON={} MB, Bincode={} MB, MsgPack={} MB, Protobuf={} MB",
        json_size / 1024 / 1024,
        bin_size / 1024 / 1024,
        msg_size / 1024 / 1024,
        pb_size / 1024 / 1024,
    );
}
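
The code in 6.2 records encoded sizes only. Timings could be collected with a small helper like the sketch below. `time_ms` and `median_ms` are hypothetical names, and this is not necessarily how the article's measurements were taken.

```rust
use std::time::Instant;

// Runs `f` once and returns its result together with the elapsed
// wall-clock time in milliseconds.
fn time_ms<T>(f: impl FnOnce() -> T) -> (T, f64) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed().as_secs_f64() * 1000.0)
}

// Runs `f` `runs` times and returns the median time in milliseconds,
// which is less sensitive to outliers than a single run.
fn median_ms(runs: usize, mut f: impl FnMut()) -> f64 {
    assert!(runs > 0, "at least one run is required");
    let mut samples: Vec<f64> = (0..runs).map(|_| time_ms(&mut f).1).collect();
    samples.sort_by(|a, b| a.total_cmp(b));
    samples[samples.len() / 2]
}
```

Inside `bench_all`, a call such as `let (json, ms) = time_ms(|| serde_json::to_vec(&events).unwrap());` would return both the encoded bytes and the serialization time.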

6.3 Results

Format | Size (MB) | Size ratio vs JSON | Serialize (ms) | Deserialize (ms)
JSON | 188 | 1.0× | 1,200 | 1,500
Bincode | 78 | 2.4× | 85 | 110
MessagePack | 94 | 2.0× | 110 | 140
Protobuf | 69 | 2.7× | 95 | 125
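
The ratio column is the JSON size divided by each format's size, rounded to one decimal place. As a small worked check using the figures from the table (the function name `ratio_vs_json` is hypothetical):

```rust
// JSON size divided by another encoding's size, rounded to one decimal.
fn ratio_vs_json(json_mb: f64, other_mb: f64) -> f64 {
    (json_mb / other_mb * 10.0).round() / 10.0
}
```

With the table's numbers, 188 / 78 rounds to 2.4, 188 / 94 is exactly 2.0, and 188 / 69 rounds to 2.7.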

7 Dynamic format registration

7.1 Registry

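use std::collections::HashMap;
use std::error::Error;
use std::sync::RwLock;

type BoxError = Box<dyn Error>;
type SerializeFn = fn(&UserEvent) -> Result<Vec<u8>, BoxError>;
type DeserializeFn = fn(&[u8]) -> Result<UserEvent, BoxError>;

// The helpers from section 3 return different error types (and to_json
// returns a String), so thin wrappers adapt them to the shared signatures.
fn ser_json(v: &UserEvent) -> Result<Vec<u8>, BoxError> { Ok(serde_json::to_vec(v)?) }
fn de_json(b: &[u8]) -> Result<UserEvent, BoxError> { Ok(serde_json::from_slice(b)?) }
fn ser_bincode(v: &UserEvent) -> Result<Vec<u8>, BoxError> { Ok(to_bincode(v)?) }
fn de_bincode(b: &[u8]) -> Result<UserEvent, BoxError> { Ok(from_bincode(b)?) }
fn ser_msgpack(v: &UserEvent) -> Result<Vec<u8>, BoxError> { Ok(to_msgpack(v)?) }
fn de_msgpack(b: &[u8]) -> Result<UserEvent, BoxError> { Ok(from_msgpack(b)?) }
fn ser_protobuf(v: &UserEvent) -> Result<Vec<u8>, BoxError> { Ok(to_protobuf(v)?) }
fn de_protobuf(b: &[u8]) -> Result<UserEvent, BoxError> { Ok(from_protobuf(b)?) }

lazy_static::lazy_static! {
    // The RwLock makes the registry mutable after initialization.
    static ref FORMATS: RwLock<HashMap<&'static str, (SerializeFn, DeserializeFn)>> = {
        let mut m: HashMap<&'static str, (SerializeFn, DeserializeFn)> = HashMap::new();
        m.insert("json", (ser_json as SerializeFn, de_json as DeserializeFn));
        m.insert("bincode", (ser_bincode as SerializeFn, de_bincode as DeserializeFn));
        m.insert("msgpack", (ser_msgpack as SerializeFn, de_msgpack as DeserializeFn));
        m.insert("protobuf", (ser_protobuf as SerializeFn, de_protobuf as DeserializeFn));
        RwLock::new(m)
    };
}

7.2 Runtime registration

pub fn register_format(
    name: &'static str,
    ser: SerializeFn,
    de: DeserializeFn,
) {
    // Take the write lock; an existing entry with the same name is replaced.
    FORMATS.write().unwrap().insert(name, (ser, de));
}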
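
A registry that accepts new entries at run time needs interior mutability and a way to look formats up. The following standard-library-only sketch shows one way to arrange that, assuming simplified codec signatures over a plain string payload. All names here (`EncodeFn`, `DecodeFn`, `registry`, `register`, `encode_with`, `decode_with`, and the `utf8_*` codec) are hypothetical and chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::{OnceLock, RwLock};

// Hypothetical, simplified codec signatures over a string payload.
type EncodeFn = fn(&str) -> Vec<u8>;
type DecodeFn = fn(&[u8]) -> Option<String>;
type Registry = RwLock<HashMap<&'static str, (EncodeFn, DecodeFn)>>;

// Lazily initialized global registry guarded by a read-write lock.
fn registry() -> &'static Registry {
    static REGISTRY: OnceLock<Registry> = OnceLock::new();
    REGISTRY.get_or_init(|| RwLock::new(HashMap::new()))
}

// Registers a codec, replacing any existing entry with the same name.
fn register(name: &'static str, enc: EncodeFn, dec: DecodeFn) {
    registry().write().unwrap().insert(name, (enc, dec));
}

fn encode_with(name: &str, value: &str) -> Option<Vec<u8>> {
    // Copy the fn pointers out so the lock is released before encoding.
    let (enc, _) = *registry().read().unwrap().get(name)?;
    Some(enc(value))
}

fn decode_with(name: &str, bytes: &[u8]) -> Option<String> {
    let (_, dec) = *registry().read().unwrap().get(name)?;
    dec(bytes)
}

// Example codec: raw UTF-8 bytes.
fn utf8_encode(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

fn utf8_decode(b: &[u8]) -> Option<String> {
    String::from_utf8(b.to_vec()).ok()
}
```

After `register("utf8", utf8_encode, utf8_decode)`, a call to `encode_with("utf8", "hi")` returns the encoded bytes, while an unknown name yields `None` instead of panicking.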

8 Template repository

git clone https://github.com/rust-lang-cn/serde-flex-showcase
cd serde-flex-showcase
cargo bench --bench million_records

Contents:

  • src/formats.rs
  • src/server.rs
  • benches/ with the million-record benchmark
  • Dockerfile for one-command PostgreSQL setup

9 Conclusion

Dimension | JSON | Bincode | MessagePack | Protobuf
Readability | | | |
Size | 188 MB | 78 MB | 94 MB | 69 MB
Speed (serialize) | 1,200 ms | 85 ms | 110 ms | 95 ms
Schema | | | |

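Golden checklist

  • Debugging → JSON
  • Internal communication → Bincode
  • Mobile → MessagePack
  • Cross-language → Protobuf

Once you can switch Serde formats flexibly, a single definition can be reused across every client.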