Basics¶

The idea behind this library is to let you dynamically build data transforms that can be compiled into ad-hoc Python functions.

This means we first need to introduce convtools primitives for the most basic Python operations before we can get to more complex things like aggregations and joins.

c.this¶

To start with, here is a function which increments an input by one:

def f(data):
    return data + 1

If we refer to data as c.this, this data transform becomes c.this + 1, which is a valid convtools conversion.

So the steps of getting a converter function are:

you define a data transform using its building blocks (conversions)
you call the conversion's gen_converter method to generate ad-hoc code and compile a function that implements the transform you just defined
you use the resulting function as many times as needed
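
The three steps above can be illustrated without convtools at all. What follows is a minimal sketch of the general "generate source, compile, reuse" technique in plain Python. The names build_converter and expression are hypothetical, chosen for illustration only; the library's real code generation is more involved.

```python
def build_converter(expression):
    """Generate and compile a one-argument function from an expression string.

    Hypothetical helper for illustration only; not convtools' implementation.
    """
    # Step 1: describe the transform as source code.
    source = f"def converter(data_):\n    return {expression}\n"
    # Step 2: compile the source and run it to define the function.
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["converter"]


# Step 3: reuse the compiled function as many times as needed.
increment = build_converter("data_ + 1")
results = [increment(value) for value in (1, 2, 3)]
```

The cost of generating and compiling is paid once, so it only pays off when the resulting function is kept and called repeatedly.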

convtools / debug stdout

from convtools import conversion as c

conversion = c.this + 1
converter = conversion.gen_converter(debug=True)

assert converter(1) == 2

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Many examples will contain debug=True just so the generated code is visible for those who are curious, not because it's required :)

If the converter only needs to run once, we can shorten this to:

convtools / debug stdout

from convtools import conversion as c

assert (c.this + 1).execute(1, debug=True) == 2

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

c.item¶

Of course the above is not enough to work with data. Let's define conversions to perform key/index lookups.

No default¶

def f(data):
    return data[1]

There are two ways to do it:

c.this.item(1) - to build on top of the previous example
c.item(1) - same, but shorter

convtools / debug stdout

from convtools import conversion as c

converter = c.item(1).gen_converter(debug=True)

assert converter([10, 20]) == 20

def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

With default¶

Should you need to suppress a failed lookup (a KeyError, or an IndexError as in the list example below) to achieve dict.get(key, default) behavior:

convtools / debug stdout

from convtools import conversion as c

converter = c.item(1, default=-1).gen_converter(debug=True)

assert converter([10]) == -1

def converter(data_, *, __get_1_or_default=__naive_values__["__get_1_or_default"]):
    try:
        return __get_1_or_default(data_, 1, -1)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
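
For intuition, the helper referenced in the generated code behaves roughly like the plain-Python function below. This is only a sketch: get_or_default is a hypothetical name, and the exact set of exceptions the library suppresses is an assumption (the example above only confirms that an out-of-range list index falls back to the default).

```python
def get_or_default(data, key, default):
    """Return data[key], or default when the lookup fails.

    Hypothetical stand-in; the caught exceptions are an assumption.
    """
    try:
        return data[key]
    except (KeyError, IndexError):
        return default


from_list = get_or_default([10], 1, -1)        # index 1 is out of range
from_dict = get_or_default({"a": 1}, "b", -1)  # key "b" is missing
found = get_or_default([10, 20], 1, -1)        # lookup succeeds
```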

Multiple indexes / keys¶

Sometimes you may need to perform several consecutive index/key lookups:

convtools / debug stdout

from convtools import conversion as c

converter = c.item(1, "value").gen_converter(debug=True)

assert converter([{"value": 10}, {"value": 20}]) == 20

def converter(data_):
    try:
        return data_[1]["value"]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
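
To see what the generated data_[1]["value"] saves, compare it with a generic runtime equivalent. The sketch below uses only the standard library; get_nested is a hypothetical helper and is not something convtools provides.

```python
from functools import reduce
from operator import getitem


def get_nested(data, *keys):
    """Apply each key/index in turn, i.e. data[k1][k2]..."""
    return reduce(getitem, keys, data)


rows = [{"value": 10}, {"value": 20}]
nested = get_nested(rows, 1, "value")
```

The generic helper has to loop over the keys on every call, whereas the generated converter has the lookups written out inline.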

c.attr¶

Just as c.item handles index/key lookups, c.attr handles attribute lookups. So to define the following conversion:

def f(data):
    return data.value

just use c.attr("value").

Here is an all-in-one example:

convtools / debug stdout

from convtools import conversion as c

class Obj:
    a = 1

class Container:
    obj = Obj

assert c.attr("obj", "a").execute(Container, debug=True) == 1
assert c.attr("b", default=None).execute(Obj, debug=True) is None

def converter(data_):
    try:
        return data_.obj.a
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def attr_or_default(default_, data_):
    try:
        return data_.b
    except AttributeError:
        return default_

def converter(data_):
    try:
        return attr_or_default(None, data_)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

c.naive¶

In fact, we already used c.naive implicitly when we implemented the increment conversion (the literal 1 was wrapped for us). c.naive makes objects, functions, or anything else available inside conversions.

A good example is when we need to achieve something like the following:

VALUE_TO_VERBOSE = {
    1: "ACTIVE",
    2: "INACTIVE",
}
def f(data):
    return VALUE_TO_VERBOSE[data]

Here we made VALUE_TO_VERBOSE available to the function. To build an equivalent conversion, wrap the object to be exposed in c.naive:

convtools / debug stdout

from convtools import conversion as c

VALUE_TO_VERBOSE = {
    1: "ACTIVE",
    2: "INACTIVE",
}
converter = c.naive(VALUE_TO_VERBOSE).item(c.this).gen_converter(debug=True)

assert converter(2) == "INACTIVE"

def converter(data_, *, __v=__naive_values__["__v"]):
    try:
        return __v[data_]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

And yes, you can pass conversions as arguments to other conversions (notice the .item(c.this) part).
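
The generated signature above binds the exposed object through a keyword-only parameter default. A plain-Python sketch of that pattern follows; make_lookup and _mapping are hypothetical names, and the code only mirrors the shape of the debug output, not the library's internals.

```python
VALUE_TO_VERBOSE = {1: "ACTIVE", 2: "INACTIVE"}


def make_lookup(mapping):
    """Return a function with `mapping` bound as a keyword-only default."""

    def converter(data_, *, _mapping=mapping):
        # The default is evaluated once, when `converter` is defined, and
        # `_mapping` is then an ordinary local name inside the function.
        return _mapping[data_]

    return converter


lookup = make_lookup(VALUE_TO_VERBOSE)
verbose = lookup(2)
```

Binding through a default keeps the object reachable from the function without relying on a module-level name.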

c.input_arg¶

Given that we are generating functions, it is useful to be able to add parameters to them. Let's update our "increment" function to have a keyword-only increment parameter:

def f(data, *, increment):
    return data + increment

To build a conversion like this, use c.input_arg("increment") to reference the keyword argument to be passed:

convtools / debug stdout

from convtools import conversion as c

converter = (c.this + c.input_arg("increment")).gen_converter(debug=True)

assert converter(10, increment=5) == 15

def converter(data_, *, increment):
    try:
        return data_ + increment
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Calling functions¶

One of the most important primitive conversions is the one which calls functions. Let's build a conversion which does the following:

from datetime import datetime

def f(data):
    return datetime.strptime(data, "%m/%d/%Y")

We can either:

use c.call_func on datetime.strptime
use call_method on datetime
expose datetime.strptime via c.naive and then call it

convtools / debug stdout

from datetime import datetime
from convtools import conversion as c

# No. 1
converter = c.call_func(datetime.strptime, c.this, "%m/%d/%Y").gen_converter(
    debug=True
)

assert converter("12/31/2000") == datetime(2000, 12, 31)

# No. 2
converter = (
    c.naive(datetime)
    .call_method("strptime", c.this, "%m/%d/%Y")
    .gen_converter(debug=True)
)

assert converter("12/31/2000") == datetime(2000, 12, 31)

# No. 3
assert (
    c.naive(datetime.strptime)
    .call(c.this, "%m/%d/%Y")
    .execute("12/31/2000", debug=True)
) == datetime(2000, 12, 31)

def converter(data_, *, __strptime=__naive_values__["__strptime"], __v=__naive_values__["__v"]):
    try:
        return __strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __datetime=__naive_values__["__datetime"], __v=__naive_values__["__v"]):
    try:
        return __datetime.strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __strptime=__naive_values__["__strptime"], __v=__naive_values__["__v"]):
    try:
        return __strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Tip

To see which one is faster, look at the generated code. The extra .strptime attribute lookup in the 2nd variant happens on every call, which makes it slower. The other two variants perform this lookup only once, at the conversion building stage, so a converter that is stored and reused never repeats it.
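
If you want to check this on your own machine, here is a rough sketch using the standard timeit module. The two functions are hand-written stand-ins for the generated converters (their names are hypothetical), and the difference is likely to be small next to the cost of the parsing itself.

```python
import timeit
from datetime import datetime

FORMAT = "%m/%d/%Y"


def lookup_each_call(data_, _datetime=datetime):
    # Resolves the .strptime attribute on every call (like variant No. 2).
    return _datetime.strptime(data_, FORMAT)


def prebound(data_, _strptime=datetime.strptime):
    # The attribute lookup happened once, when the function was defined.
    return _strptime(data_, FORMAT)


same_result = lookup_each_call("12/31/2000") == prebound("12/31/2000")

# Timings vary by machine and Python version, so no numbers are claimed here.
for func in (lookup_each_call, prebound):
    elapsed = timeit.timeit(lambda: func("12/31/2000"), number=10_000)
    print(f"{func.__name__}: {elapsed:.4f}s")
```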

calling with `*args`, `**kwargs`¶

Sometimes calls like f(*args, **kwargs) cannot be avoided. They are slower, because args and kwargs get rebuilt on every call. The options are:

c.apply_func
apply_method
apply

convtools / debug stdout

from datetime import datetime
from convtools import conversion as c

data = {"obj": {"args": (1, 2), "kwargs": {"mode": "verbose"}}}

class A:
    @classmethod
    def f(cls, *args, **kwargs):
        return len(args) + len(kwargs)

# No. 1
converter = c.apply_func(
    A.f, c.item("obj", "args"), c.item("obj", "kwargs")
).gen_converter(debug=True)
assert converter(data) == 3

# No. 2
converter = (
    c.naive(A)
    .apply_method("f", c.item("obj", "args"), c.item("obj", "kwargs"))
    .gen_converter(debug=True)
)
assert converter(data) == 3

# No. 3
converter = (
    c.naive(A.f)
    .apply(c.item("obj", "args"), c.item("obj", "kwargs"))
    .gen_converter(debug=True)
)
assert converter(data) == 3

def converter(data_, *, __f=__naive_values__["__f"]):
    try:
        return __f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __A=__naive_values__["__A"]):
    try:
        return __A.f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __f=__naive_values__["__f"]):
    try:
        return __f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Operators¶

convtools / debug stdout

from convtools import conversion as c

args = ()

c(
    {
        "-a": -c.item(0),
        "a + b": c.item(0) + c.item(1),
        "a - b": c.item(0) - c.item(1),
        "a * b": c.item(0) * c.item(1),
        "a ** b": c.item(0) ** c.item(1),
        "a / b": c.item(0) / c.item(1),
        "a // b": c.item(0) // c.item(1),
        "a % b": c.item(0) % c.item(1),
        "a == b": c.item(0) == c.item(1),
        "a >= b": c.item(0) >= c.item(1),
        "a <= b": c.item(0) <= c.item(1),
        "a < b": c.item(0) < c.item(1),
        "a > b": c.item(0) > c.item(1),
        "a or b": c.or_(c.item(0), c.item(1), *args),
          # "a or b": c.item(0).or_(c.item(1)),
          # "a or b": c.item(0) | c.item(1),
        "a and b": c.and_(c.item(0), c.item(1), *args),
          # "a and b": c.item(0).and_(c.item(1)),
          # "a and b": c.item(0) & c.item(1),
        "not a": ~c.item(0),
        "a is b": c.item(0).is_(c.item(1)),
        "a is not b": c.item(0).is_not(c.item(1)),
        "a in b": c.item(0).in_(c.item(1)),
        "a not in b": c.item(0).not_in(c.item(1)),
    }
).gen_converter(debug=True)

def converter(data_, *, __v=__naive_values__["__v"]):
    try:
        return {
            "-a": (-data_[0]),
            "a + b": (data_[0] + data_[1]),
            "a - b": (data_[0] - data_[1]),
            "a * b": (data_[0] * data_[1]),
            "a ** b": (data_[0] ** data_[1]),
            "a / b": (data_[0] / data_[1]),
            "a // b": (data_[0] // data_[1]),
            __v: (data_[0] % data_[1]),
            "a == b": (data_[0] == data_[1]),
            "a >= b": (data_[0] >= data_[1]),
            "a <= b": (data_[0] <= data_[1]),
            "a < b": (data_[0] < data_[1]),
            "a > b": (data_[0] > data_[1]),
            "a or b": (data_[0] or data_[1]),
            "a and b": (data_[0] and data_[1]),
            "not a": (not data_[0]),
            "a is b": (data_[0] is data_[1]),
            "a is not b": (data_[0] is not data_[1]),
            "a in b": (data_[0] in data_[1]),
            "a not in b": (data_[0] not in data_[1]),
        }
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
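
Writing conversions with ordinary operators works because Python objects can overload them. Below is a minimal sketch of that idea: a tiny expression builder whose operators return new objects carrying source code. The Expr class is hypothetical and far simpler than anything in convtools; it only illustrates the technique.

```python
class Expr:
    """Tiny expression builder; a hypothetical stand-in, not a convtools class."""

    def __init__(self, code):
        self.code = code

    @staticmethod
    def _wrap(value):
        # Plain Python values become literals in the generated source.
        return value if isinstance(value, Expr) else Expr(repr(value))

    def __add__(self, other):
        return Expr(f"({self.code} + {self._wrap(other).code})")

    def __mul__(self, other):
        return Expr(f"({self.code} * {self._wrap(other).code})")

    def __neg__(self):
        return Expr(f"(-{self.code})")

    def item(self, key):
        return Expr(f"{self.code}[{key!r}]")


this = Expr("data_")
expr = -this.item(0) + this.item(1) * 2
generated_code = expr.code
```

This also suggests why the table above uses c.or_, c.and_, ~, .is_ and .in_: Python does not let objects overload and, or, not, or is, and it coerces the result of in to a bool, so those operations need a method or a different operator.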

Converter signature¶

Sometimes the automatically generated converter signature needs adjusting. There are three parameters of gen_converter to achieve that:

method - results in signature like def converter(self, data_)
class_method - results in signature like def converter(cls, data_)
signature - uses the provided signature

Just make sure to include data_ in the signature if your conversion uses the input.

convtools / debug stdout

from convtools import conversion as c


class A:
    get_one = c.naive(1).gen_converter(class_method=True, debug=True)

    get_two = c.naive(2).gen_converter(method=True, debug=True)

    get_self = c.input_arg("self").gen_converter(signature="self", debug=True)


a = A()

assert A.get_one(None) == 1 and a.get_two(None) == 2 and a.get_self() is a

def converter(cls, data_):
    try:
        return 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(self, data_):
    try:
        return 2
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(self):
    try:
        return self
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
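
The reason a generated function can serve as a method is ordinary Python behavior: a plain function stored on a class is bound to the instance on access. The sketch below shows this with hand-generated functions; build_with_signature is a hypothetical helper, and wrapping in classmethod is an assumption about how class-level access could be achieved, not a statement about the library's internals.

```python
def build_with_signature(signature, expression):
    """Compile `def converter(<signature>): return <expression>`.

    Hypothetical helper for illustration only.
    """
    source = f"def converter({signature}):\n    return {expression}\n"
    namespace = {}
    exec(source, namespace)
    return namespace["converter"]


class A:
    # A plain function stored on the class becomes a bound method on instances.
    get_self = build_with_signature("self", "self")
    double = build_with_signature("self, data_", "data_ * 2")
    # Class-level access needs an explicit classmethod wrapper.
    name_of = classmethod(build_with_signature("cls, data_", "cls.__name__"))


a = A()
```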

Debug¶

When you need to debug a conversion, the first step is to enable debug mode. There are two ways:

pass debug=True to either the gen_converter or the execute method
set debug options using the c.OptionsCtx context manager

In both cases it makes sense to install the black code formatter (pip install black); it is used automatically once installed.

convtools / debug stdout

from convtools import conversion as c

c.item(1).gen_converter(debug=True)

with c.OptionsCtx() as options:
    options.debug = True
    c.item(1).gen_converter()

def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Another way to debug is to use the breakpoint method:

c({"a": c.breakpoint()}).gen_converter(debug=True)
# same
c({"a": c.item(0).breakpoint()}).gen_converter(debug=True)

Inline expressions¶

Warning

convtools cannot guard you here and cannot infer anything from the attributes of unknown pieces of code. Avoid inline expressions if possible.

There are two ways to pass a custom code expression as a string:

c.escaped_string
c.inline_expr

convtools / debug stdout

from convtools import conversion as c

assert c.escaped_string("1 + 1").execute(None, debug=True) == 2
assert c.inline_expr("1 + 1").execute(None, debug=True) == 2

assert (
    c.inline_expr("{} + {}").pass_args(c.this, 1).execute(10, debug=True) == 11
)
assert (
    c.inline_expr("{a} + {b}").pass_args(a=c.this, b=1).execute(10, debug=True)
    == 11
)

def converter(data_):
    try:
        return 1 + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return 1 + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise
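
The placeholders in the last two examples follow str.format conventions, which the debug output (data_ + 1) is consistent with. The sketch below reproduces that substitution in plain Python; fill_template is a hypothetical name, and whether the library uses str.format internally is an assumption.

```python
def fill_template(template, *args, **kwargs):
    """Substitute code fragments into an expression template with str.format."""
    return template.format(*args, **kwargs)


positional = fill_template("{} + {}", "data_", "1")
named = fill_template("{a} + {b}", a="data_", b="1")

# Evaluating the resulting source with data_ = 10:
value = eval(positional, {"data_": 10})
```

Since the string ends up as source code, anything placed in it is executed as written, which is one more reason to keep inline expressions rare and never build them from untrusted input.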

Now that we know the basics and how the thing works, we are ready to go over more complex conversions in a more cheatsheet-like narrative.
