Basics¶
The idea behind this library is to let you dynamically build data transforms, which can then be compiled into ad-hoc Python functions.
This means that we need to introduce convtools primitives for the most basic Python operations first, before we can get to more complex things like aggregations and joins.
c.this¶
To start with, here is a function which increments an input by one:
def f(data):
    return data + 1
If we refer to data as c.this, then this data transform looks like c.this + 1, which is a valid convtools conversion.
So the steps for getting a converter function are:
- you define data transforms using the library's building blocks (conversions)
- you call the gen_converter conversion method to generate ad-hoc code and compile a function, which implements the transform you just defined
- you use the resulting function as many times as needed
from convtools import conversion as c
conversion = c.this + 1
converter = conversion.gen_converter(debug=True)
assert converter(1) == 2
def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
Many examples contain debug=True just so the generated code is visible for those who are curious, not because it's required :)
If we need to run a converter function only once, we can shorten this to:
from convtools import conversion as c
assert (c.this + 1).execute(1, debug=True) == 2
def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
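To make the "generate ad-hoc code and compile a function" step more concrete, here is a rough plain-Python sketch of the general technique. It is not convtools' actual implementation: the compile_function helper and its source template are hypothetical and only mimic the shape of the generated code shown above.

```python
def compile_function(expression: str):
    """Build source code for a one-argument function and compile it.

    Hypothetical helper for illustration only, not part of convtools.
    """
    source = f"def converter(data_):\n    return {expression}\n"
    namespace = {}
    # exec runs the generated source; the new function lands in namespace
    exec(source, namespace)
    return namespace["converter"]


# build once, then reuse as many times as needed
increment = compile_function("data_ + 1")
assert increment(1) == 2
assert increment(41) == 42
```

The point of the sketch is the split between a one-time build step and a cheap, reusable function, which is the same split as between gen_converter and the converter it returns.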
c.item¶
Of course the above is not enough to work with data. Let's define conversions to perform key/index lookups.
No default¶
def f(data):
    return data[1]
There are two ways to do it:
- c.this.item(1) - to build on top of the previous example
- c.item(1) - same, but shorter
from convtools import conversion as c
converter = c.item(1).gen_converter(debug=True)
assert converter([10, 20]) == 20
def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
With default¶
Should you need to suppress KeyError (or IndexError, as in the list example below) to achieve dict.get(key, default) behavior:
from convtools import conversion as c
converter = c.item(1, default=-1).gen_converter(debug=True)
assert converter([10]) == -1
def converter(data_, *, __get_1_or_default=__naive_values__["__get_1_or_default"]):
    try:
        return __get_1_or_default(data_, 1, -1)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
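The generated code above delegates to a helper whose body is not shown. Conceptually it behaves like the plain-Python function below. This is only a sketch: the name get_or_default is hypothetical and the set of suppressed exceptions is an assumption, so it should not be read as the library's actual helper.

```python
def get_or_default(data, key, default):
    """Return data[key], falling back to default when the lookup fails.

    Hypothetical stand-in; the exception list is an assumption.
    """
    try:
        return data[key]
    except (KeyError, IndexError):
        return default


assert get_or_default([10], 1, -1) == -1        # index out of range
assert get_or_default({"a": 1}, "b", -1) == -1  # missing key
assert get_or_default([10, 20], 1, -1) == 20    # successful lookup
```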
Multiple indexes / keys¶
Sometimes you may need to perform multiple subsequent index/key lookups:
from convtools import conversion as c
converter = c.item(1, "value").gen_converter(debug=True)
assert converter([{"value": 10}, {"value": 20}]) == 20
def converter(data_):
    try:
        return data_[1]["value"]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
c.attr¶
Just like the c.item conversion takes care of index/key lookups, c.attr does attribute lookups. So to define the following conversion:
def f(data):
    return data.value
just use c.attr("value").
Here is an all-in-one example:
from convtools import conversion as c
class Obj:
    a = 1

class Container:
    obj = Obj

assert c.attr("obj", "a").execute(Container, debug=True) == 1
assert c.attr("b", default=None).execute(Obj, debug=True) is None

def converter(data_):
    try:
        return data_.obj.a
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def attr_or_default(default_, data_):
    try:
        return data_.b
    except AttributeError:
        return default_

def converter(data_):
    try:
        return attr_or_default(None, data_)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
c.naive¶
In fact, we implicitly used c.naive when we implemented the increment conversion: the literal 1 was wrapped for us. c.naive is used to make objects, functions, or anything else available inside conversions.
A good example is when we need to achieve something like the following:
VALUE_TO_VERBOSE = {
    1: "ACTIVE",
    2: "INACTIVE",
}
def f(data):
    return VALUE_TO_VERBOSE[data]
Here f relies on VALUE_TO_VERBOSE being available to the function. To build an equivalent conversion, wrap the object to be exposed in c.naive:
from convtools import conversion as c
VALUE_TO_VERBOSE = {
    1: "ACTIVE",
    2: "INACTIVE",
}
converter = c.naive(VALUE_TO_VERBOSE).item(c.this).gen_converter(debug=True)
assert converter(2) == "INACTIVE"

def converter(data_, *, __v=__naive_values__["__v"]):
    try:
        return __v[data_]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
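Notice how the generated converter receives the dictionary through a keyword-only parameter with a default value. The same trick can be reproduced in plain Python. The sketch below is an illustration with hypothetical names (naive_values, source), not a description of convtools internals.

```python
VALUE_TO_VERBOSE = {1: "ACTIVE", 2: "INACTIVE"}

# objects to expose, keyed by the name used in the generated code
naive_values = {"v": VALUE_TO_VERBOSE}

source = (
    'def converter(data_, *, v=naive_values["v"]):\n'
    "    return v[data_]\n"
)
namespace = {"naive_values": naive_values}
# the default value is evaluated once, while the function is being defined
exec(source, namespace)
converter = namespace["converter"]

assert converter(2) == "INACTIVE"
# the function keeps working after the naive_values mapping is emptied,
# because the dictionary object itself was bound as a default value
naive_values.clear()
assert converter(1) == "ACTIVE"
```

Binding the object as a default argument turns it into a local name inside the function, so no global or attribute lookup is needed at call time.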
And yes, you can pass conversions as arguments to other conversions (notice the .item(c.this) part).
c.input_arg¶
Given that we are generating functions, it is useful to be able to add parameters to them. Let's update our "increment" function to have a keyword-only increment parameter:
def f(data, *, increment):
    return data + increment
To build a conversion like this, use c.input_arg("increment") to reference the keyword argument to be passed:
from convtools import conversion as c
converter = (c.this + c.input_arg("increment")).gen_converter(debug=True)
assert converter(10, increment=5) == 15
def converter(data_, *, increment):
    try:
        return data_ + increment
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
Calling functions¶
One of the most important primitive conversions is the one which calls functions. Let's build a conversion which does the following:
from datetime import datetime
def f(data):
    return datetime.strptime(data, "%m/%d/%Y")
We can either:
- use c.call_func on datetime.strptime
- use call_method on datetime
- expose datetime.strptime via c.naive and then call it
from datetime import datetime
from convtools import conversion as c
# No. 1
converter = c.call_func(datetime.strptime, c.this, "%m/%d/%Y").gen_converter(
    debug=True
)
assert converter("12/31/2000") == datetime(2000, 12, 31)
# No. 2
converter = (
    c.naive(datetime)
    .call_method("strptime", c.this, "%m/%d/%Y")
    .gen_converter(debug=True)
)
assert converter("12/31/2000") == datetime(2000, 12, 31)
# No. 3
assert (
    c.naive(datetime.strptime)
    .call(c.this, "%m/%d/%Y")
    .execute("12/31/2000", debug=True)
) == datetime(2000, 12, 31)

def converter(data_, *, __strptime=__naive_values__["__strptime"], __v=__naive_values__["__v"]):
    try:
        return __strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __datetime=__naive_values__["__datetime"], __v=__naive_values__["__v"]):
    try:
        return __datetime.strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __strptime=__naive_values__["__strptime"], __v=__naive_values__["__v"]):
    try:
        return __strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
Tip
To see which variant is faster, have a look at the generated code.
The extra .strptime attribute lookup in the 2nd variant happens on every call, which makes it slower, while the other two variants perform this lookup only once, at the conversion-building stage, so it is not repeated when the converter is stored for further reuse.
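The difference described in the tip can be reproduced without convtools. In the sketch below (function names are hypothetical), the first function looks up .strptime on every call, while the second receives the already-resolved function as a default argument, mirroring the shape of variant No. 2 versus variants No. 1 and No. 3.

```python
from datetime import datetime


def parse_with_lookup(data_, *, _datetime=datetime):
    # attribute lookup happens on every call
    return _datetime.strptime(data_, "%m/%d/%Y")


def parse_prebound(data_, *, _strptime=datetime.strptime):
    # lookup already happened once, when the function was defined
    return _strptime(data_, "%m/%d/%Y")


assert parse_with_lookup("12/31/2000") == datetime(2000, 12, 31)
assert parse_prebound("12/31/2000") == datetime(2000, 12, 31)
```

Both return the same result; timing them with timeit on your own machine is the most reliable way to see how much the extra lookup actually costs.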
Calling with *args, **kwargs¶
Of course this is slower, because args and kwargs get rebuilt on every call, but sometimes calls such as f(*args, **kwargs) cannot be avoided. The options are:
- c.apply_func
- apply_method
- apply
from datetime import datetime
from convtools import conversion as c
data = {"obj": {"args": (1, 2), "kwargs": {"mode": "verbose"}}}
class A:
    @classmethod
    def f(cls, *args, **kwargs):
        return len(args) + len(kwargs)

# No. 1
converter = c.apply_func(
    A.f, c.item("obj", "args"), c.item("obj", "kwargs")
).gen_converter(debug=True)
assert converter(data) == 3
# No. 2
converter = (
    c.naive(A)
    .apply_method("f", c.item("obj", "args"), c.item("obj", "kwargs"))
    .gen_converter(debug=True)
)
assert converter(data) == 3
# No. 3
converter = (
    c.naive(A.f)
    .apply(c.item("obj", "args"), c.item("obj", "kwargs"))
    .gen_converter(debug=True)
)
assert converter(data) == 3

def converter(data_, *, __f=__naive_values__["__f"]):
    try:
        return __f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __A=__naive_values__["__A"]):
    try:
        return __A.f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __f=__naive_values__["__f"]):
    try:
        return __f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
Operators¶
from convtools import conversion as c
args = ()
c(
    {
        "-a": -c.item(0),
        "a + b": c.item(0) + c.item(1),
        "a - b": c.item(0) - c.item(1),
        "a * b": c.item(0) * c.item(1),
        "a ** b": c.item(0) ** c.item(1),
        "a / b": c.item(0) / c.item(1),
        "a // b": c.item(0) // c.item(1),
        "a % b": c.item(0) % c.item(1),
        "a == b": c.item(0) == c.item(1),
        "a >= b": c.item(0) >= c.item(1),
        "a <= b": c.item(0) <= c.item(1),
        "a < b": c.item(0) < c.item(1),
        "a > b": c.item(0) > c.item(1),
        "a or b": c.or_(c.item(0), c.item(1), *args),
        # "a or b": c.item(0).or_(c.item(1)),
        # "a or b": c.item(0) | c.item(1),
        "a and b": c.and_(c.item(0), c.item(1), *args),
        # "a and b": c.item(0).and_(c.item(1)),
        # "a and b": c.item(0) & c.item(1),
        "not a": ~c.item(0),
        "a is b": c.item(0).is_(c.item(1)),
        "a is not b": c.item(0).is_not(c.item(1)),
        "a in b": c.item(0).in_(c.item(1)),
        "a not in b": c.item(0).not_in(c.item(1)),
    }
).gen_converter(debug=True)

def converter(data_, *, __v=__naive_values__["__v"]):
    try:
        return {
            "-a": (-data_[0]),
            "a + b": (data_[0] + data_[1]),
            "a - b": (data_[0] - data_[1]),
            "a * b": (data_[0] * data_[1]),
            "a ** b": (data_[0] ** data_[1]),
            "a / b": (data_[0] / data_[1]),
            "a // b": (data_[0] // data_[1]),
            __v: (data_[0] % data_[1]),
            "a == b": (data_[0] == data_[1]),
            "a >= b": (data_[0] >= data_[1]),
            "a <= b": (data_[0] <= data_[1]),
            "a < b": (data_[0] < data_[1]),
            "a > b": (data_[0] > data_[1]),
            "a or b": (data_[0] or data_[1]),
            "a and b": (data_[0] and data_[1]),
            "not a": (not data_[0]),
            "a is b": (data_[0] is data_[1]),
            "a is not b": (data_[0] is not data_[1]),
            "a in b": (data_[0] in data_[1]),
            "a not in b": (data_[0] not in data_[1]),
        }
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
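Python does not allow overloading and, or and not, which is presumably why the listing above relies on c.and_, c.or_, &, | and ~ instead. The sketch below shows the general operator-overloading technique with a hypothetical Expr class that collects source code as operators are applied. It is a toy illustration, not how convtools is implemented.

```python
class Expr:
    """Toy expression builder: every operator returns a new Expr."""

    def __init__(self, code):
        self.code = code

    @staticmethod
    def wrap(value):
        # plain values are turned into their literal representation
        return value if isinstance(value, Expr) else Expr(repr(value))

    def __add__(self, other):
        return Expr(f"({self.code} + {Expr.wrap(other).code})")

    def __or__(self, other):
        # `|` stands in for `or`, which cannot be overloaded
        return Expr(f"({self.code} or {Expr.wrap(other).code})")

    def __invert__(self):
        # `~` stands in for `not`
        return Expr(f"(not {self.code})")


this = Expr("data_")
assert (this + 1).code == "(data_ + 1)"
assert (~(this | 0)).code == "(not (data_ or 0))"
```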
Converter signature¶
Sometimes it's required to adjust the automatically generated converter signature. There are three gen_converter parameters to achieve that:
- method - results in a signature like def converter(self, data_)
- class_method - results in a signature like def converter(cls, data_)
- signature - uses the provided signature
Just make sure to include data_ in case your conversion uses the input.
from convtools import conversion as c
class A:
    get_one = c.naive(1).gen_converter(class_method=True, debug=True)
    get_two = c.naive(2).gen_converter(method=True, debug=True)
    get_self = c.input_arg("self").gen_converter(signature="self", debug=True)

a = A()
assert A.get_one(None) == 1 and a.get_two(None) == 2 and a.get_self() is a

def converter(cls, data_):
    try:
        return 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(self, data_):
    try:
        return 2
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(self):
    try:
        return self
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
Debug¶
When you need to debug a conversion, the very first thing to do is enable debug mode. There are two ways:
- pass debug=True to either the gen_converter or the execute method
- set debug options using the c.OptionsCtx context manager
In both cases it makes sense to install the black code formatter (pip install black); it will be used automatically once installed.
from convtools import conversion as c
c.item(1).gen_converter(debug=True)
with c.OptionsCtx() as options:
    options.debug = True
    c.item(1).gen_converter()

def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
Another way to debug is to use the breakpoint method:
c({"a": c.breakpoint()}).gen_converter(debug=True)
# same
c({"a": c.item(0).breakpoint()}).gen_converter(debug=True)
Inline expressions¶
Warning
convtools cannot guard you here and doesn't infer any insights from the attributes of unknown pieces of code. Avoid using inline expressions if possible.
There are two ways to pass a custom code expression as a string:
- c.escaped_string
- c.inline_expr
from convtools import conversion as c
assert c.escaped_string("1 + 1").execute(None, debug=True) == 2
assert c.inline_expr("1 + 1").execute(None, debug=True) == 2
assert (
    c.inline_expr("{} + {}").pass_args(c.this, 1).execute(10, debug=True) == 11
)
assert (
    c.inline_expr("{a} + {b}").pass_args(a=c.this, b=1).execute(10, debug=True)
    == 11
)

def converter(data_):
    try:
        return 1 + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return 1 + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
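The {} and {a} placeholders follow the familiar str.format style. As a rough mental model only (an assumption about the general idea, not a description of the library's internals), passing arguments can be pictured as substituting pieces of generated code into the template:

```python
# code fragments standing in for c.this and the literal 1
this_code = "data_"
one_code = "1"

positional = "{} + {}".format(this_code, one_code)
named = "{a} + {b}".format(a=this_code, b=one_code)

assert positional == "data_ + 1"
assert named == "data_ + 1"

# evaluating the resulting expression with data_ = 10
assert eval(positional, {"data_": 10}) == 11
```

Because the string ends up in the generated source verbatim, it should never be built from untrusted input, which is one more reason to heed the warning above.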
Now that we know the basics and how the thing works, we are ready to go over more complex conversions in a more cheatsheet-like narrative.