Implementing New Formats
Lodum is designed to be format-agnostic. You can add support for a new data format (e.g., XML, Protobuf, or a custom text format) by implementing two core protocols: Dumper and Loader.
The core engine uses these protocols to bridge the gap between Python objects and the specific data format. The engine itself handles the complex logic, such as recursion, type validation, and circular reference detection.
The Dumper Protocol
A Dumper is responsible for taking primitive Python values and converting them into an intermediate representation (IR) or directly into the target format.
Protocol Interface
from typing import Any, Dict, List, Protocol, Type

class Dumper(Protocol):
    def dump_int(self, value: int) -> Any: ...
    def dump_str(self, value: str) -> Any: ...
    def dump_float(self, value: float) -> Any: ...
    def dump_bool(self, value: bool) -> Any: ...
    def dump_bytes(self, value: bytes) -> Any: ...
    def dump_list(self, value: List[Any]) -> Any: ...
    def dump_dict(self, value: Dict[str, Any]) -> Any: ...
    def begin_struct(self, cls: Type) -> Any: ...
    def end_struct(self) -> Any: ...
Implementing a Dumper
Most formats can inherit from BaseDumper, which provides default implementations that return the values as-is. This is useful for formats that work with standard Python collections (like JSON or YAML libraries).
from typing import Any

from lodum.core import BaseDumper

class MyFormatDumper(BaseDumper):
    def dump_bytes(self, value: bytes) -> Any:
        # Example: encode bytes to hex for a text format
        return value.hex()
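To illustrate how an engine might drive these methods, here is a self-contained sketch that does not import lodum. PassthroughDumper is a hypothetical stand-in for BaseDumper (assumed, per the description above, to return values as-is), and walk is a hypothetical, greatly simplified driver. lodum's real engine is not shown in this document and will differ; begin_struct and end_struct are omitted here.

```python
from typing import Any, Dict, List


class PassthroughDumper:
    """Hypothetical stand-in for BaseDumper: returns every value as-is."""

    def dump_int(self, value: int) -> Any:
        return value

    def dump_str(self, value: str) -> Any:
        return value

    def dump_float(self, value: float) -> Any:
        return value

    def dump_bool(self, value: bool) -> Any:
        return value

    def dump_bytes(self, value: bytes) -> Any:
        return value

    def dump_list(self, value: List[Any]) -> Any:
        return value

    def dump_dict(self, value: Dict[str, Any]) -> Any:
        return value


class HexDumper(PassthroughDumper):
    """Text-oriented format: only bytes need special handling."""

    def dump_bytes(self, value: bytes) -> Any:
        return value.hex()


def walk(value: Any, dumper: PassthroughDumper) -> Any:
    """Hypothetical driver: convert children first, then call the dumper."""
    if isinstance(value, bool):  # bool is a subclass of int, so test it first
        return dumper.dump_bool(value)
    if isinstance(value, int):
        return dumper.dump_int(value)
    if isinstance(value, float):
        return dumper.dump_float(value)
    if isinstance(value, str):
        return dumper.dump_str(value)
    if isinstance(value, bytes):
        return dumper.dump_bytes(value)
    if isinstance(value, list):
        return dumper.dump_list([walk(item, dumper) for item in value])
    if isinstance(value, dict):
        return dumper.dump_dict({k: walk(v, dumper) for k, v in value.items()})
    raise TypeError(f"Unsupported type: {type(value).__name__}")


result = walk({"id": 7, "blob": b"\x00\xff", "tags": ["a", "b"]}, HexDumper())
```

Only dump_bytes is overridden, so everything except the bytes value passes through unchanged.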
The Loader Protocol
A Loader is responsible for reading primitive values and collections from the source data.
Protocol Interface
from typing import Any, Dict, Iterator, List, Optional, Protocol, Union

class Loader(Protocol):
    def load_int(self) -> int: ...
    def load_str(self) -> str: ...
    def load_float(self) -> float: ...
    def load_bool(self) -> bool: ...
    def load_bytes(self) -> bytes: ...
    def load_list(self) -> Iterator["Loader"]: ...
    def load_dict(self) -> Iterator[tuple[str, "Loader"]]: ...
    def load_any(self) -> Any: ...
    def mark(self) -> Any: ...
    def rewind(self, marker: Any) -> None: ...
    def get_dict(self) -> Optional[Union[Dict[str, Any], List[Any]]]: ...
Key Methods
- load_list/load_dict: These should return an iterator over the nested elements: load_list yields new Loader instances, and load_dict yields (key, Loader) pairs, each Loader wrapping one nested element.
- mark/rewind: Required for supporting Union types. mark() should return the current state of the loader, and rewind(marker) should restore it. This allows lodum to try decoding data into multiple different types.
- get_dict: An optimization. If the current data is already a raw Python dict or list, returning it here allows the compiler to bypass creating multiple Loader wrappers.
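The mark/rewind contract can be sketched with a toy loader that reads from a flat list of already-parsed tokens. TokenLoader and load_int_or_str are hypothetical names invented for this example, and raising TypeError on a mismatch is an assumption; the exception type lodum actually uses is not stated here.

```python
from typing import Any, Dict, Iterator, List, Optional, Union


class TokenLoader:
    """Hypothetical loader over a flat list of already-parsed tokens."""

    def __init__(self, tokens: List[Any]) -> None:
        self._tokens = tokens
        self._pos = 0

    def _next(self) -> Any:
        token = self._tokens[self._pos]
        self._pos += 1  # reading consumes the token
        return token

    def load_int(self) -> int:
        token = self._next()
        if isinstance(token, bool) or not isinstance(token, int):
            raise TypeError(f"Expected int, got {type(token).__name__}")
        return token

    def load_str(self) -> str:
        token = self._next()
        if not isinstance(token, str):
            raise TypeError(f"Expected str, got {type(token).__name__}")
        return token

    def load_list(self) -> Iterator["TokenLoader"]:
        token = self._next()
        if not isinstance(token, list):
            raise TypeError(f"Expected list, got {type(token).__name__}")
        # Each nested element gets its own loader.
        return (TokenLoader([item]) for item in token)

    def mark(self) -> Any:
        return self._pos  # the cursor is the entire state of this loader

    def rewind(self, marker: Any) -> None:
        self._pos = marker

    def get_dict(self) -> Optional[Union[Dict[str, Any], List[Any]]]:
        # Fast path: expose the current token if it is a raw dict or list.
        if self._pos < len(self._tokens):
            token = self._tokens[self._pos]
            if isinstance(token, (dict, list)):
                return token
        return None


def load_int_or_str(loader: TokenLoader) -> Union[int, str]:
    """Hypothetical Union[int, str] decoding: try each type from one position."""
    marker = loader.mark()
    try:
        return loader.load_int()
    except TypeError:
        loader.rewind(marker)  # the failed attempt consumed a token; undo it
        return loader.load_str()


loader = TokenLoader([1, "two", [3, 4]])
first = load_int_or_str(loader)
second = load_int_or_str(loader)
raw = loader.get_dict()
items = [child.load_int() for child in loader.load_list()]
```

Without the rewind call, the second attempt would start one token too late, because the failed load_int has already advanced the cursor.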
Implementing a Loader
Inheriting from BaseLoader is highly recommended. It provides standardized type checking and error messages (e.g., "Expected int, got str").
from lodum.core import BaseLoader

class MyFormatLoader(BaseLoader):
    # BaseLoader handles load_int, load_str, etc. using load_any()
    # You only need to override specific behavior.
    pass
Creating the Public API
Once you have your Dumper and Loader, you typically expose dumps and loads functions that wrap the lodum.internal calls.
from typing import Any, Type, TypeVar

from lodum.internal import dump, load

T = TypeVar("T")

def dumps(obj: Any) -> str:
    dumper = MyFormatDumper()
    data = dump(obj, dumper)
    return str(data)  # Or your format's encoding logic

def loads(cls: Type[T], data_string: str) -> T:
    # Your format's parsing logic to get a Python dict/list.
    # parse_my_format is a placeholder for your own parser.
    raw_data = parse_my_format(data_string)
    loader = MyFormatLoader(raw_data)
    return load(cls, loader)
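parse_my_format is left undefined in the example above. Purely as an illustration, here is a sketch assuming a hypothetical line-based key=value text format with string values only. Both the format and encode_my_format are invented for this example and are not part of lodum.

```python
from typing import Dict


def parse_my_format(data_string: str) -> Dict[str, str]:
    """Parse 'key=value' lines into a dict, skipping blanks and '#' comments."""
    result: Dict[str, str] = {}
    for line_number, line in enumerate(data_string.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(
                f"line {line_number}: expected key=value, got {line!r}"
            )
        result[key.strip()] = value.strip()
    return result


def encode_my_format(data: Dict[str, str]) -> str:
    """Inverse of parse_my_format for plain string values."""
    return "\n".join(f"{key}={value}" for key, value in data.items())


text = "# user record\nname = Ada\nrole=admin\n"
parsed = parse_my_format(text)
encoded = encode_my_format(parsed)
```

Because every value in this toy format is a string, a loader built on it would have to convert text to int, float, or bool itself rather than rely on the parsed types.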
Best Practices
- Use BaseLoader: It ensures your format provides the same high-quality error messages as the built-in formats.
- Handle bytes: If your format doesn't support raw binary data, implement dump_bytes and load_bytes_value to handle Base64 or Hex encoding.
- Recursive Safety: You don't need to worry about recursion limits or circular references; the lodum.internal.dump and load functions handle this automatically.
- Performance: If your format returns standard Python dicts/lists, ensure get_dict() returns them to enable the compiler's fast-path optimizations.
- Thoroughly Test Your Implementation: After implementing a new format, it's crucial to ensure its correctness and robustness. Run the project's comprehensive test suite and add new tests specifically for your format. Refer to the Contributing Guide for detailed instructions on running tests and maintaining code quality.
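For the bytes recommendation above, a text-only format needs a reversible encoding. The helpers below are a minimal sketch using the standard library's base64 module. encode_bytes and decode_bytes are hypothetical names; the idea is that a dump_bytes override would call the first and the loader's bytes hook would call the second, but how those hooks are wired in lodum is not shown here.

```python
import base64


def encode_bytes(value: bytes) -> str:
    """Encode raw bytes as ASCII-safe Base64 text."""
    return base64.b64encode(value).decode("ascii")


def decode_bytes(text: str) -> bytes:
    """Decode Base64 text back to bytes, rejecting non-Base64 characters."""
    # validate=True raises binascii.Error (a ValueError subclass) on bad input
    return base64.b64decode(text, validate=True)


payload = b"\x00\xffhello"
encoded = encode_bytes(payload)
decoded = decode_bytes(encoded)
```

A round-trip check like this one is also a natural first test to add for a new format.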