Serialisation with Msgspec

python
TIL
Another Serializer in Python
Author

Mahamadi NIKIEMA

Published

December 22, 2024

While Pydantic is well-known for serialization and validation in Python, I recently discovered msgspec, a lightning-fast library that supports encoding and decoding various formats, including JSON, YAML, TOML, and MessagePack.

Encoding

You can encode Python objects into JSON or MessagePack.

import msgspec

# Encoding as JSON
json_data = msgspec.json.encode({"name": "awesome name"})
print(json_data)

# Encode as msgpack
msgpack_data = msgspec.msgpack.encode({"name": "awesome name"})
print(msgpack_data)
b'{"name":"awesome name"}'
b'\x81\xa4name\xacawesome name'

The Core: msgspec.Struct

The core component is the module msgspec.Struct.

At the heart of msgspec is the Struct class, which provides structure and type safety for your data models.

Defining a Structured Mode

import msgspec
from typing import Set


class ConfigStrategy(msgspec.Struct):
    name: str
    language: str
    stop_words: Set[str] = set()


spacy_cfg = ConfigStrategy(name="spacy", language="french")
print(spacy_cfg)
ConfigStrategy(name='spacy', language='french', stop_words=set())

Encoding the data

You can encode the structured object directly into JSON:

msgspec.json.encode(spacy_cfg)
b'{"name":"spacy","language":"french","stop_words":[]}'

Decoding the data

JSON Decoding

By default, msgspec does not perform type validation during the decoding:

msgspec.json.decode(b'{"name":"spacy","language":"french","stop_words":[]}')
{'name': 'spacy', 'language': 'french', 'stop_words': []}

Type Validation

msgspec makes it easy to decode serialized data into a structured object, complete with type validation:

msgspec.json.decode(b'{"name":"spacy","language":"french","stop_words":[]}', type=ConfigStrategy)
ConfigStrategy(name='spacy', language='french', stop_words=set())

If you’re looking for a high-performance alternative to libraries like Pydantic, give msgspec a try!