Home
lodum is a high-performance framework for loading and dumping Python data structures efficiently and ergonomically.
Think of it as serde for Python.
⚡ Why lodum?
| Feature | Description |
|---|---|
| 🚀 Fast | ~64% faster dumps than standard introspection using AST bytecode generation. |
| 🛡️ Safe | Secure-by-default design. Blocks arbitrary code execution in pickle. |
| 📦 Universal | One API for JSON, YAML, TOML, MsgPack, CBOR, BSON, and Pickle. |
| 🧩 Extensible | Native support for numpy, pandas, and polars without extra glue code. |
| ✅ Validated | Built-in validators (Range, Length) and schema generation. |
Installation
pip install lodum
# Or with all optional dependencies (YAML, TOML, binary formats, Pandas, etc.)
pip install "lodum[all]"
Core Concepts
The architecture of lodum is built on a clear separation of concerns, just like serde:
- lodum-enabled Data Structures: You define the data you want to encode by decorating your classes with @lodum. This decorator introspects your class to understand its structure.
- Data Formats (Loaders/Dumpers): The logic for converting data into a specific format (like JSON) is handled by Loader and Dumper implementations. This makes the core library format-agnostic.
This means you can define how your data is structured once, and then easily encode it to multiple formats (JSON, YAML, etc.) by simply using a different module.
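The split between data structures and formats can be illustrated with a small standard-library sketch. This is not lodum's implementation: `Dumper`, `JsonDumper`, `KeyValueDumper`, and `to_primitives` are hypothetical names, and a plain dataclass stands in for an @lodum-decorated class.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Protocol


class Dumper(Protocol):
    """Hypothetical format backend: turns plain primitives into text."""

    def dumps(self, data: Any) -> str: ...


class JsonDumper:
    def dumps(self, data: Any) -> str:
        return json.dumps(data)


class KeyValueDumper:
    """Toy line-based format that only handles flat mappings."""

    def dumps(self, data: Any) -> str:
        return "\n".join(f"{key}={value}" for key, value in data.items())


def to_primitives(obj: Any) -> Any:
    """Format-agnostic step: walk the object without knowing the output format."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: to_primitives(getattr(obj, f.name)) for f in fields(obj)}
    if isinstance(obj, (list, tuple)):
        return [to_primitives(item) for item in obj]
    if isinstance(obj, dict):
        return {key: to_primitives(value) for key, value in obj.items()}
    return obj


def dump(obj: Any, dumper: Dumper) -> str:
    """Combine the structure walk with whichever format backend is passed in."""
    return dumper.dumps(to_primitives(obj))


@dataclass
class Point:
    x: int
    y: int


as_json = dump(Point(1, 2), JsonDumper())
as_kv = dump(Point(1, 2), KeyValueDumper())
```

The structure walk is written once; supporting another format in this sketch only means supplying another object with a `dumps` method.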
Getting Started
Here is a quick example of how to encode a simple Python object to JSON and decode it back.
1. Define your data structure
Use the @lodum decorator on your class. You can use standard __init__ methods or dataclasses. Make sure to include type hints, as lodum uses them to understand your data.
from lodum import lodum
from dataclasses import dataclass
@lodum
@dataclass
class User:
    name: str
    age: int
    is_active: bool
2. Encode to JSON
Use lodum's json.dumps function (from the lodum.json module, not the standard library's json) to convert an instance of your class into a JSON string.
from lodum import json
user = User(name="Alex", age=30, is_active=True)
# Encode the object to a JSON string
json_string = json.dumps(user)
print(json_string)
# Output: {"name": "Alex", "age": 30, "is_active": true}
3. Decode and Encode with Multiple Formats
You can easily switch between formats. For example, you can decode from JSON and then encode to YAML using the json.loads and yaml.dumps functions.
from lodum import json, yaml
# You can also encode to YAML
yaml_string = yaml.dumps(user)
print(yaml_string)
# -> name: Alex
# -> age: 30
# -> is_active: true
json_data = '{"name": "Barbara", "age": 25, "is_active": false}'
# Decode the JSON string back to a User object
barbara = json.loads(User, json_data)
print(f"Name: {barbara.name}, Age: {barbara.age}, Active: {barbara.is_active}")
# Output: Name: Barbara, Age: 25, Active: False
This simple example demonstrates the core functionality.
Round-Trip Example
lodum ensures that your data can be reliably converted between formats. Here's an example of a full round-trip conversion, starting with JSON, converting to YAML, and then back to JSON, verifying that the data remains consistent.
import json as std_json
from lodum import lodum, json, yaml
@lodum
class ServerConfig:
    def __init__(self, host: str, port: int, services: list[str]):
        self.host = host
        self.port = port
        self.services = services
# 1. Start with a JSON string
original_json = '{"host": "127.0.0.1", "port": 8080, "services": ["users", "products", "inventory"]}'
# 2. Decode the JSON to a Python object
config_from_json = json.loads(ServerConfig, original_json)
# 3. Encode the object to YAML
yaml_output = yaml.dumps(config_from_json)
# 4. Decode the YAML back to a Python object
config_from_yaml = yaml.loads(ServerConfig, yaml_output)
# 5. Encode the final object back to JSON
final_json = json.dumps(config_from_yaml)
# 6. Verify that the final JSON matches the original
# We load them into dictionaries to ignore any formatting differences
assert std_json.loads(original_json) == std_json.loads(final_json)
print("Round-trip conversion successful!")
Error Reporting
lodum provides detailed path information when deserialization fails, making it easy to identify the exact field that caused the error.
from lodum import lodum, json
from lodum.exception import DeserializationError
@lodum
class User:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
json_data = '{"name": "Alex", "age": "not_an_int"}'
try:
    json.loads(User, json_data)
except DeserializationError as e:
    print(e)
# Output: Error at age: Expected int, got str
The path tracking works through nested objects, lists, and dictionaries (e.g., root.users[2].id).
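One way to picture this kind of path tracking is a recursive walk that extends a path string at every dict key and list index. The sketch below uses only the standard library and is not lodum's code; `PathError` and `check_ints` are hypothetical names, and the rule "every leaf must be an int" is chosen purely for illustration.

```python
from typing import Any


class PathError(Exception):
    """Hypothetical error type that remembers where in the input it occurred."""

    def __init__(self, path: str, message: str):
        super().__init__(f"Error at {path}: {message}")
        self.path = path


def check_ints(value: Any, path: str = "root") -> None:
    """Walk dicts and lists, requiring every leaf value to be an int."""
    if isinstance(value, dict):
        for key, item in value.items():
            check_ints(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            check_ints(item, f"{path}[{index}]")
    elif isinstance(value, bool) or not isinstance(value, int):
        # bool is a subclass of int, so it is rejected explicitly
        raise PathError(path, f"Expected int, got {type(value).__name__}")


data = {"users": [{"id": 1}, {"id": 2}, {"id": "three"}]}
try:
    check_ints(data)
    message = ""
except PathError as exc:
    message = str(exc)

print(message)
# Error at root.users[2].id: Expected int, got str
```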
Field Customization
You can customize the behavior of individual fields by using the field() function as a default value in your __init__ method.
from lodum import lodum, field, json
@lodum
class User:
    def __init__(
        self,
        # This field is required, so it must come before the parameters with defaults
        email: str,
        # Rename 'user_id' to 'id' in the output
        user_id: int = field(rename="id", default=0),
        # This field will not be included in the output
        password_hash: str = field(skip_serializing=True, default=""),
        # If 'prefs' is missing on decoding, it will default to an empty dict
        prefs: dict = field(default_factory=dict),
        # Add validation to a field
        age: int = field(validate=lambda x: x >= 0, default=0),
    ):
        self.user_id = user_id
        self.email = email
        self.password_hash = password_hash
        self.prefs = prefs
        self.age = age
# Encode a user
user = User(email="name@example.com", user_id=123, password_hash="secret")
print(json.dumps(user))
# -> a JSON object with "email", "id" (renamed from user_id), "prefs", and "age"; "password_hash" is omitted
# Decode a user
user_data = '{"id": 456, "email": "test@example.com"}'
user = json.loads(User, user_data)
# user.user_id -> 456
# user.prefs -> {}
Supported field() options
- rename="new_name": Use a different name for the field in the output.
- skip_serializing=True: Exclude the field from the output.
- default=value: Provide a default value if the field is missing during decoding.
- default_factory=callable: Provide a zero-argument function to call for a default value.
- serializer=callable: A function to call to encode the field's value.
- deserializer=callable: A function to call to decode the field's value.
- validate=callable: A function or list of functions to validate the field's value during decoding.
Validation
lodum includes a set of built-in validators in the lodum.validators module. You can use them to ensure your data meets specific criteria.
from lodum import lodum, field, json
from lodum.validators import Range, Length, Match, OneOf
@lodum
class Product:
    def __init__(
        self,
        name: str = field(validate=Length(min=3, max=50)),
        price: float = field(validate=Range(min=0)),
        category: str = field(validate=OneOf(["electronics", "books", "clothing"])),
        code: str = field(validate=Match(r"^[A-Z]{2}-\d{4}$"))
    ):
        self.name = name
        self.price = price
        self.category = category
        self.code = code
# This will raise a DeserializationError
try:
    json.loads(Product, '{"name": "A", "price": -10, "category": "food", "code": "abc"}')
except Exception as e:
    print(e)
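Since validate accepts a function or a list of functions, a validator can be thought of as any callable that inspects a value. The sketch below shows how a Range-like validator could be written as a plain callable class. It uses only the standard library; `MinMax`, `not_blank`, and `run_validators` are hypothetical names, and signalling failure by raising ValueError is an assumption, not lodum's documented contract.

```python
from typing import Any, Callable, List, Optional, Union

Validator = Callable[[Any], None]


class MinMax:
    """Hypothetical Range-like validator: a callable that raises on bad input."""

    def __init__(self, min: Optional[float] = None, max: Optional[float] = None):
        self.min = min
        self.max = max

    def __call__(self, value: Any) -> None:
        if self.min is not None and value < self.min:
            raise ValueError(f"{value!r} is less than minimum {self.min!r}")
        if self.max is not None and value > self.max:
            raise ValueError(f"{value!r} is greater than maximum {self.max!r}")


def not_blank(value: Any) -> None:
    """A plain function works as a validator too."""
    if not str(value).strip():
        raise ValueError("value must not be blank")


def run_validators(value: Any, validate: Union[Validator, List[Validator]]) -> Any:
    """Accept a single callable or a list of callables and apply each in order."""
    validators = validate if isinstance(validate, list) else [validate]
    for validator in validators:
        validator(value)
    return value


accepted = run_validators(5, MinMax(min=0, max=10))
try:
    run_validators(-1, [MinMax(min=0)])
    rejection = ""
except ValueError as exc:
    rejection = str(exc)
```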
JSON Schema
You can generate a standard JSON Schema for any @lodum-decorated class using lodum.schema(). This is particularly useful for documenting your data models or for use with LLM tool definitions.
import json
import lodum
# `import lodum` binds the module, so the decorator is referenced as lodum.lodum
@lodum.lodum
class User:
    def __init__(self, id: int, name: str):
        self.id = id
        self.name = name
# Generate the schema
schema = lodum.schema(User)
print(json.dumps(schema, indent=2))
# {
# "type": "object",
# "properties": {
# "id": { "type": "integer" },
# "name": { "type": "string" }
# },
# "required": ["id", "name"]
# }
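To give a feel for how a schema like this can be derived from type hints, here is a minimal standard-library sketch. It is not how lodum.schema() is implemented; `sketch_schema` and the `_JSON_TYPES` table are hypothetical, and only four primitive types are handled.

```python
import inspect
from typing import Any, Dict, List, get_type_hints

# Hypothetical mapping from Python primitives to JSON Schema type names
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def sketch_schema(cls: type) -> Dict[str, Any]:
    """Build an object schema from the annotated parameters of cls.__init__."""
    hints = get_type_hints(cls.__init__)
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        properties[name] = {"type": _JSON_TYPES[hints[name]]}
        # Parameters without a default are treated as required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


class Account:
    def __init__(self, id: int, name: str, nickname: str = ""):
        self.id = id
        self.name = name
        self.nickname = nickname


account_schema = sketch_schema(Account)
```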
Converting to/from Dictionaries
While lodum is primarily used for external wire formats, it also provides ergonomic helpers for converting objects to and from plain Python primitives (dictionaries and lists) without any string encoding.
lodum.asdict(obj)
Recursively converts a lodum-enabled object into standard Python primitives. This is a "Deep Normalization" that handles renaming, skipping fields, and converting complex types like Enums or Datetimes into plain values.
import lodum
# `import lodum` binds the module, so the decorator is referenced as lodum.lodum
@lodum.lodum
class User:
    def __init__(self, user_id: int = lodum.field(rename="id"), name: str = ""):
        self.user_id = user_id
        self.name = name
user = User(user_id=1, name="Alex")
data = lodum.asdict(user)
print(data)
# Output: {'id': 1, 'name': 'Alex'}
lodum.fromdict(cls, data)
Hydrates a lodum-enabled class from a dictionary. Unlike standard dictionary assignment, this performs full type validation and automatically instantiates nested objects.
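The difference from plain dictionary assignment can be sketched with the standard library alone. The `hydrate` function below is a hypothetical stand-in, not lodum.fromdict: it checks each value against the constructor's type hints and recursively builds nested objects. It assumes every hint is a plain class (no generics such as list[str]).

```python
import inspect
from typing import Any, Dict, Type, TypeVar, get_type_hints

T = TypeVar("T")


def hydrate(cls: Type[T], data: Dict[str, Any], path: str = "root") -> T:
    """Build cls from a dict, validating types and instantiating nested classes."""
    hints = get_type_hints(cls.__init__)
    kwargs: Dict[str, Any] = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if name not in data:
            if param.default is inspect.Parameter.empty:
                raise TypeError(f"{path}.{name}: missing required field")
            continue  # let the constructor apply its own default
        expected, value = hints[name], data[name]
        if isinstance(value, dict) and expected is not dict:
            # A dict where a class is expected: hydrate the nested object
            value = hydrate(expected, value, f"{path}.{name}")
        elif not isinstance(value, expected):
            raise TypeError(
                f"{path}.{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
        kwargs[name] = value
    return cls(**kwargs)


class Address:
    def __init__(self, city: str):
        self.city = city


class Person:
    def __init__(self, name: str, address: Address):
        self.name = name
        self.address = address


person = hydrate(Person, {"name": "Alex", "address": {"city": "Oslo"}})
try:
    hydrate(Person, {"name": "Alex", "address": {"city": 42}})
    hydrate_error = ""
except TypeError as exc:
    hydrate_error = str(exc)
```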
Supported Collection Wrappers
lodum automatically normalizes and hydrates various standard library collection wrappers, converting them to/from standard list and dict during serialization:
- collections.deque
- collections.UserList
- collections.UserDict
- collections.Counter
- collections.defaultdict
- collections.OrderedDict
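As a rough illustration of what normalizing these wrappers means, the standard-library sketch below maps mapping-like wrappers to dict and sequence-like wrappers to list, then rebuilds the wrapper from the plain value. The `normalize` helper is hypothetical and does not reflect lodum's internal code.

```python
from collections import Counter, OrderedDict, UserDict, UserList, defaultdict, deque
from collections.abc import Mapping
from typing import Any


def normalize(value: Any) -> Any:
    """Convert collection wrappers into plain dict and list values, recursively."""
    if isinstance(value, Mapping):
        # Covers dict, Counter, defaultdict, OrderedDict, and UserDict
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple, deque, UserList)):
        return [normalize(item) for item in value]
    return value


queue_plain = normalize(deque([1, 2, 3]))
counts_plain = normalize(Counter("aab"))
groups_plain = normalize(defaultdict(list, {"a": deque([1])}))

# Hydration in the other direction rebuilds the wrapper from the plain value
queue_again = deque(queue_plain)
counts_again = Counter(counts_plain)
```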
Performance
lodum is designed for high performance. When you first use a @lodum-enabled class, the library analyzes its structure and generates specialized Python bytecode for serialization and deserialization using an internal Abstract Syntax Tree (AST) compiler.
This approach eliminates the overhead of generic introspection and getattr calls at runtime, resulting in:
- ~64% faster dumping (serialization) than the generic introspection baseline.
- ~35% faster loading (deserialization) than the same baseline.
See PERFORMANCE.md for detailed benchmark results and comparisons with other libraries.
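The general idea of generating a specialized function per class can be sketched with Python's built-in ast and compile machinery. This is an illustration under assumptions, not lodum's compiler: `make_dumper` is a hypothetical helper that generates source text and parses it into an AST, whereas a real implementation might construct AST nodes directly and handle nested types.

```python
import ast
import keyword
from typing import Any, Callable, Dict, List


def make_dumper(cls_name: str, field_names: List[str]) -> Callable[[Any], Dict[str, Any]]:
    """Build a dump function specialized for a fixed list of attribute names."""
    for name in field_names:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"invalid field name: {name!r}")
    # Generated source looks like: def dump(obj): return {'x': obj.x, 'y': obj.y}
    items = ", ".join(f"{name!r}: obj.{name}" for name in field_names)
    source = f"def dump(obj):\n    return {{{items}}}\n"
    tree = ast.parse(source)  # source text -> AST
    code = compile(tree, filename=f"<generated dump for {cls_name}>", mode="exec")  # AST -> bytecode
    namespace: Dict[str, Any] = {}
    exec(code, namespace)
    return namespace["dump"]


def generic_dump(obj: Any, field_names: List[str]) -> Dict[str, Any]:
    """Baseline for comparison: loops over names and calls getattr on every dump."""
    return {name: getattr(obj, name) for name in field_names}


class Point:
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y


dump_point = make_dumper("Point", ["x", "y"])
specialized = dump_point(Point(1, 2))
baseline = generic_dump(Point(1, 2), ["x", "y"])
```

Both functions return the same dictionary; the generated one does the field lookup work once, when the function is built, instead of on every call.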
Binary Data
lodum handles binary data (bytes and bytearray) differently depending on the format:
- Text-based formats (JSON, TOML) encode binary data as Base64-encoded strings.
- Binary formats (MsgPack, CBOR, BSON, Pickle) and YAML use their native binary representation where possible, ensuring efficient storage and transmission.
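The Base64 approach for text formats can be reproduced with the standard library. The sketch below shows the general technique for carrying bytes inside JSON; the "blob" key is an arbitrary name chosen for this example, and lodum's exact output layout is not shown in this document.

```python
import base64
import json

payload = b"\x00\xffhello"

# JSON has no bytes type, so the payload travels as a Base64 ASCII string
encoded = base64.b64encode(payload).decode("ascii")
text = json.dumps({"blob": encoded})

# Decoding reverses both steps
decoded = base64.b64decode(json.loads(text)["blob"])
```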
Supported Formats
lodum is designed to be format-agnostic, and new formats can be added by implementing the Dumper and Loader protocols. The following formats are currently supported:
- JSON: lodum.json
- YAML: lodum.yaml
- Pickle: lodum.pickle (Warning: pickle is insecure. Only deserialize data from trusted sources.) lodum implements a SafeUnpickler that restricts deserialization to a small set of safe types:
  - Standard Python builtins (like int, str, list, etc.)
  - Custom classes decorated with @lodum
  - Explicitly forbids modules known to be dangerous (like os, sys, subprocess)

  Additionally, lodum.pickle.dumps performs structural validation to ensure only lodum-enabled data is serialized.
- TOML: lodum.toml
- MessagePack: lodum.msgpack
- CBOR: lodum.cbor (e.g., cbor.dumps(obj))
- BSON: lodum.bson (e.g., bson.dumps(obj))
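The restriction described for Pickle follows a pattern documented for the standard library: subclass pickle.Unpickler and override find_class with an allow-list. The sketch below shows that pattern only. `RestrictedUnpickler`, `restricted_loads`, and the `SAFE_BUILTINS` set are illustrative names and are not lodum's SafeUnpickler; an allow-list of decorated classes would be added in the same method.

```python
import builtins
import io
import pickle
from typing import Any

# Illustrative allow-list; the real set of permitted types is lodum's decision
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module: str, name: str) -> Any:
        # Only allow-listed builtins may be resolved; everything else is refused
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes) -> Any:
    """Like pickle.loads, but resolving globals through the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


safe_value = restricted_loads(pickle.dumps([1, "a", {"k": 2.5}, range(3)]))

# A hand-written pickle that would call os.system if loaded with pickle.loads
malicious = b"cos\nsystem\n(S'echo hello world'\ntR."
try:
    restricted_loads(malicious)
    blocked_reason = ""
except pickle.UnpicklingError as exc:
    blocked_reason = str(exc)
```

The malicious payload is rejected when the unpickler tries to resolve os.system, before any call is made.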
Supported Types
lodum currently supports the following types for serialization:
- Primitives: int, str, float, bool, None
- Collections: list, dict, tuple, set, bytes, bytearray, array.array, collections.defaultdict, collections.OrderedDict, collections.Counter
- Typing: Optional, Union, Any, TypeVar (The @lodum decorator preserves the type identity of the decorated class using TypeVar, ensuring excellent IDE support and static type checking.)
- Standard Library: datetime.datetime (encoded as ISO 8601 strings), enum.Enum (encoded by value), uuid.UUID, decimal.Decimal, pathlib.Path
- Third-Party Libraries: numpy.ndarray, pandas.DataFrame, pandas.Series, polars.DataFrame, polars.Series
- Custom Objects: Any class decorated with @lodum.
The library is designed to be extended with support for more formats and more complex data types in the future.
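The two encodings stated in the list above (datetime as ISO 8601 strings, Enum by value) correspond to standard-library operations, sketched below. The `encode_scalar` helper is hypothetical, and the encodings of the other listed types are not specified in this document, so they are left out.

```python
from datetime import datetime
from enum import Enum
from typing import Any


class Color(Enum):
    RED = "red"
    GREEN = "green"


def encode_scalar(value: Any) -> Any:
    """Encode the two standard-library types whose wire form is stated above."""
    if isinstance(value, datetime):
        return value.isoformat()  # ISO 8601 string
    if isinstance(value, Enum):
        return value.value  # the member's value, not its name
    return value


stamp = encode_scalar(datetime(2024, 1, 2, 3, 4, 5))
color = encode_scalar(Color.RED)

# Decoding reverses each step
stamp_back = datetime.fromisoformat(stamp)
color_back = Color(color)
```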
Contributing
Contributions are welcome! Please see the Contributing Guidelines for more information.
Internals & Roadmap
- Looking for the API Reference?
- Migrating from another library? See our Migration Guide.
- Interested in how lodum works under the hood? Check out ARCHITECTURE.
- Adding support for a new data format? See Implementing New Formats.
- See how lodum performs in our PERFORMANCE report.
- Want to see where we are going? Read our ROADMAP.
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.