convtools — write transformations as expressions, run them as Python¶
convtools lets you declare data transformations in plain Python, then compiles them into tiny, optimized Python functions at runtime. You keep your data in native iterables (lists, dicts, generators, CSV streams)—no heavy container required.
Why pick convtools?¶
- Stay in Python. Compose transformations as expressions: pipes, filters, joins, group‑bys, reducers, window functions, and more. Then call .gen_converter() to get a real Python function.
- Stream‑friendly. Works directly on iterators and files; the Table helper processes CSV‑like data without loading everything into memory.
- Powerful aggregations. Rich reducers (Sum, CountDistinct, MaxRow, ArraySorted, Dict*, TopK…) with per‑reducer where filters and defaults. Nested aggregations are first‑class.
- Debuggable & inspectable. Print the generated code with debug=True or set global options via c.OptionsCtx. Works with pdb/pydevd.
- Plays nicely with Pandas/Polars. It’s not a DataFrame; it’s a code‑generation layer. Use it when you want lean, composable transforms over native Python data.
Installation¶
pip install convtools
60-second tour¶
1) Build & run a converter¶
from convtools import conversion as c
# Title‑case a name in an incoming dict
to_title = c.item("name").pipe(str.title).gen_converter()
assert to_title({"name": "jane doe"}) == "Jane Doe"
Under the hood gen_converter() compiles your expression into an ad‑hoc Python
function. Want a one‑off call? Use .execute(data) instead.
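To make the idea of compilation concrete, the converter above behaves roughly like the hand-written function below. This is only an illustrative equivalent in plain Python: `to_title_by_hand` is a hypothetical name, and the code convtools actually generates will look different.

```python
def to_title_by_hand(data):
    # Same two steps as c.item("name").pipe(str.title):
    # look up the "name" key, then pass the value to str.title
    return str.title(data["name"])


assert to_title_by_hand({"name": "jane doe"}) == "Jane Doe"
```

The point of the expression form is that you describe these steps once and let the library produce the function, instead of writing and maintaining it yourself.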
2) Transform a collection¶
from convtools import conversion as c
rows = [
    {"name": "ada", "score": 10},
    {"name": "grace", "score": 12},
    {"name": "linus", "score": 9},
]
to_names = (
    c.iter(c.item("name").pipe(str.title))
    .as_type(list)  # return a list, not a generator
    .gen_converter()
)
assert to_names(rows) == ["Ada", "Grace", "Linus"]
This uses c.iter to express a per‑row transform and .as_type(list) to collect the results into a list.
3) Group & aggregate¶
from convtools import conversion as c
orders = [
    {"user": "a", "amount": 20, "status": "paid"},
    {"user": "a", "amount": 30, "status": "refunded"},
    {"user": "b", "amount": 15, "status": "paid"},
    {"user": "b", "amount": 10, "status": "paid"},
]
group_and_sum_paid = (
    c.group_by(c.item("user"))
    .aggregate(
        {
            "user": c.item("user"),
            "paid_total": c.ReduceFuncs.Sum(
                c.item("amount"),
                where=c.item("status") == "paid",
            ),
        }
    )
    .sort(key=c.item("paid_total").desc())
    .gen_converter()
)
assert group_and_sum_paid(orders) == [
    {"user": "b", "paid_total": 25},
    {"user": "a", "paid_total": 20},
]
Reducers support where filters and sensible defaults.
c.group_by(...).aggregate(...) returns a list you can sort, filter, or map
further.
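The semantics of a per‑reducer where filter can be sketched in plain Python. The function below is a hand-written illustration, not the generated code: `group_and_sum_paid_by_hand` is a hypothetical name, and starting each total at 0 assumes that is the default the Sum reducer falls back to when no row passes the filter.

```python
def group_and_sum_paid_by_hand(orders):
    totals = {}
    for order in orders:
        # Every user gets a group, even if none of their rows pass the filter
        # (assumption: the sum then stays at a default of 0).
        totals.setdefault(order["user"], 0)
        # The `where` condition only gates what this one reducer accumulates.
        if order["status"] == "paid":
            totals[order["user"]] += order["amount"]
    result = [{"user": user, "paid_total": total} for user, total in totals.items()]
    # Largest paid_total first, matching the descending sort above.
    result.sort(key=lambda row: row["paid_total"], reverse=True)
    return result


orders = [
    {"user": "a", "amount": 20, "status": "paid"},
    {"user": "a", "amount": 30, "status": "refunded"},
    {"user": "b", "amount": 15, "status": "paid"},
    {"user": "b", "amount": 10, "status": "paid"},
]
assert group_and_sum_paid_by_hand(orders) == [
    {"user": "b", "paid_total": 25},
    {"user": "a", "paid_total": 20},
]
```

Note that the filter does not drop rows from the group: user "a" still appears, and only the refunded amount is excluded from the sum.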
4) Join two sequences¶
from convtools import conversion as c
collection_1 = [
    {"id": 1, "name": "Nick"},
    {"id": 2, "name": "Joash"},
    {"id": 3, "name": "Bob"},
]
collection_2 = [
    {"ID": "3", "age": 17, "country": "GB"},
    {"ID": "2", "age": 21, "country": "US"},
    {"ID": "1", "age": 18, "country": "CA"},
]
input_data = (collection_1, collection_2)
conv = (
    c.join(
        c.item(0),
        c.item(1),
        c.and_(
            c.LEFT.item("id") == c.RIGHT.item("ID").as_type(int),
            c.RIGHT.item("age") >= 18,
        ),
        how="left",
    )
    .pipe(
        c.list_comp(
            {
                "id": c.item(0, "id"),
                "name": c.item(0, "name"),
                "age": c.item(1, "age", default=None),
                "country": c.item(1, "country", default=None),
            }
        )
    )
    .gen_converter()
)
assert conv(input_data) == [
    {"id": 1, "name": "Nick", "age": 18, "country": "CA"},
    {"id": 2, "name": "Joash", "age": 21, "country": "US"},
    {"id": 3, "name": "Bob", "age": None, "country": None},
]
c.join returns (left, right) tuples; c.LEFT/c.RIGHT let you express join
conditions. Hash‑join optimization kicks in on equi‑joins.
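As a rough picture of what a hash join does for the equi‑join above, here is a plain-Python sketch. It is an assumption about the general technique, not the code convtools generates, and `left_hash_join_by_hand` is a hypothetical name: the right side is indexed once by the join key, so each left row is matched with a dictionary lookup rather than a scan.

```python
def left_hash_join_by_hand(left_rows, right_rows):
    # Build the hash index over the right side once. The extra, non-equality
    # condition (age >= 18) is applied while indexing.
    index = {}
    for right in right_rows:
        if right["age"] >= 18:
            index.setdefault(int(right["ID"]), []).append(right)

    pairs = []
    for left in left_rows:
        matches = index.get(left["id"])
        if matches:
            pairs.extend((left, right) for right in matches)
        else:
            # A left join keeps unmatched left rows, paired with None.
            pairs.append((left, None))
    return pairs


people = [
    {"id": 1, "name": "Nick"},
    {"id": 2, "name": "Joash"},
    {"id": 3, "name": "Bob"},
]
details = [
    {"ID": "3", "age": 17, "country": "GB"},
    {"ID": "2", "age": 21, "country": "US"},
    {"ID": "1", "age": 18, "country": "CA"},
]
pairs = left_hash_join_by_hand(people, details)
assert pairs[0] == (people[0], details[2])
assert pairs[1] == (people[1], details[1])
assert pairs[2] == (people[2], None)  # Bob's only candidate is under 18
```

Indexing makes the work roughly proportional to the combined size of both inputs instead of their product, which is why the equality part of the condition matters.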
Streaming CSVs with Table¶
from convtools import conversion as c
from convtools.contrib.tables import Table
from decimal import Decimal
# tests/csvs/orders.csv
"""
order_id,price,qty,status
a,20,2,paid
a,30,3,refunded
b,15,4,paid
b,10,5,paid
"""
# Read a CSV, infer header, and stream out a subset
pipe = (
    Table.from_csv("tests/csvs/orders.csv", header=True)  # stream in
    .filter(c.col("status") == "paid")  # row-wise filter
    .update(total=c.col("price").as_type(Decimal) * c.col("qty").as_type(int))
    .take("order_id", "total")
    .into_iter_rows(dict)  # stream out or into_csv("output.csv")
)
assert list(pipe) == [
    {"order_id": "a", "total": Decimal("40")},
    {"order_id": "b", "total": Decimal("60")},
    {"order_id": "b", "total": Decimal("50")},
]
Table is optimized for streaming transformations: rename/take/drop/update
columns, joins, explode, pivot, and more. Note: Table consumes its input once
(it’s an iterator).
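The single-pass behavior is the same as that of any Python iterator. The sketch below reproduces the pipeline above with only the standard library to show what consuming the input once means in practice. It does not use Table; `paid_totals` and `CSV_TEXT` are hypothetical names, and the CSV is inlined so the example is self-contained.

```python
import csv
import io
from decimal import Decimal

CSV_TEXT = """order_id,price,qty,status
a,20,2,paid
a,30,3,refunded
b,15,4,paid
b,10,5,paid
"""


def paid_totals(lines):
    # csv.DictReader pulls rows lazily, so only one row is held at a time.
    for row in csv.DictReader(lines):
        if row["status"] == "paid":
            yield {
                "order_id": row["order_id"],
                "total": Decimal(row["price"]) * int(row["qty"]),
            }


stream = paid_totals(io.StringIO(CSV_TEXT))
first_pass = list(stream)
second_pass = list(stream)  # the generator is already exhausted

assert first_pass == [
    {"order_id": "a", "total": Decimal("40")},
    {"order_id": "b", "total": Decimal("60")},
    {"order_id": "b", "total": Decimal("50")},
]
assert second_pass == []
```

If the rows are needed more than once, materialize them (for example with list(...)) or rebuild the pipeline from the source.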
Debugging & generated code¶
Pass debug=True to .gen_converter(...) or .execute(...) to print the
compiled function for inspection. You can also set global debug options with
c.OptionsCtx(). Installing black prettifies the printed code automatically.
from convtools import conversion as c
with c.OptionsCtx() as opts:
    opts.debug = True
    c.item(1).gen_converter()
When should I reach for convtools?¶
- You need composable transforms over native Python data (lists/dicts/generators/CSV), not a DataFrame.
- You want to express business rules declaratively and generate fast, readable Python functions.
- You need aggregations/joins/pipes that you can reuse across scripts and services.
Info
Looking for benchmarks and deeper rationale? See Benefits in the docs and the linked benchmark sources.
Install & use in 3 steps¶
- pip install convtools
- from convtools import conversion as c
- Build an expression → gen_converter() → call it wherever you need.
Links¶
- 📚 Docs: https://convtools.readthedocs.io/ <-- You are here
- 📦 PyPI: https://pypi.org/project/convtools/
- 🧪 Examples: see "Basics", "Collections", "Aggregations", "Joins" and "Contrib / Tables" in the docs.
- 🤖 LLM-friendly docs: llms.txt — a concise, machine-readable overview of convtools for AI assistants and code generators
Contributing¶
- Star the repo and share use‑cases in Discussions. It really helps.
- To report a bug or suggest an enhancement, open an issue and/or submit a pull request.
- To report a security vulnerability, see the security policy.
What’s included in the box?¶
- from convtools import conversion as c: the main interface.
- from convtools.contrib.tables import Table: stream processing of CSV‑like/tabular data.
- from convtools.contrib import fs: tiny helpers for splitting buffers with custom newlines.
License¶
MIT License (see LICENSE.txt).