Basics

The idea behind this library is to let you dynamically build data transforms, which can then be compiled to ad-hoc Python functions.

This means that we need to introduce convtools primitives for the most basic Python operations first, before we can get to more complex things like aggregations and joins.

c.this

To start with, here is a function which increments an input by one:

def f(data):
    return data + 1

If we refer to the input data as c.this, this data transform becomes c.this + 1, which is a valid convtools conversion.

So the steps of getting a converter function are:

  1. you define a data transform using its building blocks (conversions)
  2. you call the gen_converter conversion method to generate ad-hoc code and compile a function that implements the transform you just defined
  3. you use the resulting function as many times as needed

from convtools import conversion as c

conversion = c.this + 1
converter = conversion.gen_converter(debug=True)

assert converter(1) == 2
def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Many examples will contain debug=True just so the generated code is visible for those who are curious, not because it's required :)

If we need to run a conversion only once, we can shorten this to:

from convtools import conversion as c

assert (c.this + 1).execute(1, debug=True) == 2
def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

c.item

Of course the above is not enough to work with data. Let's define conversions to perform key/index lookups.

No default

def f(data):
    return data[1]

There are two ways to do it:

  1. c.this.item(1) - to build on top of the previous example
  2. c.item(1) - same, but shorter

from convtools import conversion as c

converter = c.item(1).gen_converter(debug=True)

assert converter([10, 20]) == 20
def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

With default

Should you need to suppress KeyError (or IndexError for sequences, as in the example here) to achieve dict.get(key, default) behavior, pass default:

from convtools import conversion as c

converter = c.item(1, default=-1).gen_converter(debug=True)

assert converter([10]) == -1
def converter(data_, *, __get_1_or_default=__naive_values__["__get_1_or_default"]):
    try:
        return __get_1_or_default(data_, 1, -1)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Multiple indexes / keys

Sometimes you may need to perform multiple subsequent index/key lookups:

from convtools import conversion as c

converter = c.item(1, "value").gen_converter(debug=True)

assert converter([{"value": 10}, {"value": 20}]) == 20
def converter(data_):
    try:
        return data_[1]["value"]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

c.attr

Just as c.item takes care of index/key lookups, c.attr performs attribute lookups. So to define the following conversion:

def f(data):
    return data.value

just use c.attr("value").

Here is an all-in-one example:

from convtools import conversion as c

class Obj:
    a = 1

class Container:
    obj = Obj

assert c.attr("obj", "a").execute(Container, debug=True) == 1
assert c.attr("b", default=None).execute(Obj, debug=True) is None
def converter(data_):
    try:
        return data_.obj.a
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def attr_or_default(default_, data_):
    try:
        return data_.b
    except AttributeError:
        return default_

def converter(data_):
    try:
        return attr_or_default(None, data_)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

c.naive

In fact, we already used c.naive implicitly when we implemented the increment conversion (for the constant 1). It is used to make objects/functions/whatever available inside conversions.

A good example is when we need to achieve something like the following:

VALUE_TO_VERBOSE = {
    1: "ACTIVE",
    2: "INACTIVE",
}
def f(data):
    return VALUE_TO_VERBOSE[data]

Here we made VALUE_TO_VERBOSE available to the function. To build an equivalent conversion, wrap the object to be exposed in c.naive:

from convtools import conversion as c

VALUE_TO_VERBOSE = {
    1: "ACTIVE",
    2: "INACTIVE",
}
converter = c.naive(VALUE_TO_VERBOSE).item(c.this).gen_converter(debug=True)

assert converter(2) == "INACTIVE"
def converter(data_, *, __v=__naive_values__["__v"]):
    try:
        return __v[data_]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

And yes, you can pass conversions as arguments to other conversions (notice .item(c.this) part).

c.input_arg

Given that we are generating functions, it is useful to be able to add parameters to them. Let's update our "increment" function to have a keyword-only increment parameter:

def f(data, *, increment):
    return data + increment

To build a conversion like this use c.input_arg("increment") to reference the keyword argument to be passed:

from convtools import conversion as c

converter = (c.this + c.input_arg("increment")).gen_converter(debug=True)

assert converter(10, increment=5) == 15
def converter(data_, *, increment):
    try:
        return data_ + increment
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Calling functions

One of the most important primitive conversions is the one which calls functions. Let's build a conversion which does the following:

from datetime import datetime

def f(data):
    return datetime.strptime(data, "%m/%d/%Y")

We can either:

  1. use c.call_func on datetime.strptime
  2. use call_method on datetime
  3. expose datetime.strptime via c.naive and then call it

from datetime import datetime
from convtools import conversion as c

# No. 1
converter = c.call_func(datetime.strptime, c.this, "%m/%d/%Y").gen_converter(
    debug=True
)

assert converter("12/31/2000") == datetime(2000, 12, 31)

# No. 2
converter = (
    c.naive(datetime)
    .call_method("strptime", c.this, "%m/%d/%Y")
    .gen_converter(debug=True)
)

assert converter("12/31/2000") == datetime(2000, 12, 31)

# No. 3
assert (
    c.naive(datetime.strptime)
    .call(c.this, "%m/%d/%Y")
    .execute("12/31/2000", debug=True)
) == datetime(2000, 12, 31)
def converter(data_, *, __strptime=__naive_values__["__strptime"], __v=__naive_values__["__v"]):
    try:
        return __strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __datetime=__naive_values__["__datetime"], __v=__naive_values__["__v"]):
    try:
        return __datetime.strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __strptime=__naive_values__["__strptime"], __v=__naive_values__["__v"]):
    try:
        return __strptime(data_, __v)
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Tip

To see which variant is faster, have a look at the generated code. The extra .strptime attribute lookup in the 2nd variant runs on every call, which makes it slower. The other two variants perform this lookup only once, at the conversion building stage, so a converter stored for further reuse never repeats it.

Calling with *args, **kwargs

Calls like f(*args, **kwargs) are slower because args and kwargs get rebuilt on every call, but sometimes they cannot be avoided. The options are:

  1. c.apply_func
  2. apply_method
  3. apply

from convtools import conversion as c

data = {"obj": {"args": (1, 2), "kwargs": {"mode": "verbose"}}}

class A:
    @classmethod
    def f(cls, *args, **kwargs):
        return len(args) + len(kwargs)

# No. 1
converter = c.apply_func(
    A.f, c.item("obj", "args"), c.item("obj", "kwargs")
).gen_converter(debug=True)
assert converter(data) == 3

# No. 2
converter = (
    c.naive(A)
    .apply_method("f", c.item("obj", "args"), c.item("obj", "kwargs"))
    .gen_converter(debug=True)
)
assert converter(data) == 3

# No. 3
converter = (
    c.naive(A.f)
    .apply(c.item("obj", "args"), c.item("obj", "kwargs"))
    .gen_converter(debug=True)
)
assert converter(data) == 3
def converter(data_, *, __f=__naive_values__["__f"]):
    try:
        return __f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __A=__naive_values__["__A"]):
    try:
        return __A.f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_, *, __f=__naive_values__["__f"]):
    try:
        return __f(*data_["obj"]["args"], **data_["obj"]["kwargs"])
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Operators

from convtools import conversion as c

args = ()

c(
    {
        "-a": -c.item(0),
        "a + b": c.item(0) + c.item(1),
        "a - b": c.item(0) - c.item(1),
        "a * b": c.item(0) * c.item(1),
        "a ** b": c.item(0) ** c.item(1),
        "a / b": c.item(0) / c.item(1),
        "a // b": c.item(0) // c.item(1),
        "a % b": c.item(0) % c.item(1),
        "a == b": c.item(0) == c.item(1),
        "a >= b": c.item(0) >= c.item(1),
        "a <= b": c.item(0) <= c.item(1),
        "a < b": c.item(0) < c.item(1),
        "a > b": c.item(0) > c.item(1),
        "a or b": c.or_(c.item(0), c.item(1), *args),
          # "a or b": c.item(0).or_(c.item(1)),
          # "a or b": c.item(0) | c.item(1),
        "a and b": c.and_(c.item(0), c.item(1), *args),
          # "a and b": c.item(0).and_(c.item(1)),
          # "a and b": c.item(0) & c.item(1),
        "not a": ~c.item(0),
        "a is b": c.item(0).is_(c.item(1)),
        "a is not b": c.item(0).is_not(c.item(1)),
        "a in b": c.item(0).in_(c.item(1)),
        "a not in b": c.item(0).not_in(c.item(1)),
    }
).gen_converter(debug=True)
def converter(data_, *, __v=__naive_values__["__v"]):
    try:
        return {
            "-a": (-data_[0]),
            "a + b": (data_[0] + data_[1]),
            "a - b": (data_[0] - data_[1]),
            "a * b": (data_[0] * data_[1]),
            "a ** b": (data_[0] ** data_[1]),
            "a / b": (data_[0] / data_[1]),
            "a // b": (data_[0] // data_[1]),
            __v: (data_[0] % data_[1]),
            "a == b": (data_[0] == data_[1]),
            "a >= b": (data_[0] >= data_[1]),
            "a <= b": (data_[0] <= data_[1]),
            "a < b": (data_[0] < data_[1]),
            "a > b": (data_[0] > data_[1]),
            "a or b": (data_[0] or data_[1]),
            "a and b": (data_[0] and data_[1]),
            "not a": (not data_[0]),
            "a is b": (data_[0] is data_[1]),
            "a is not b": (data_[0] is not data_[1]),
            "a in b": (data_[0] in data_[1]),
            "a not in b": (data_[0] not in data_[1]),
        }
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Converter signature

Sometimes the automatically generated converter signature needs to be adjusted. gen_converter has three parameters for this:

  1. method - results in signature like def converter(self, data_)
  2. class_method - results in signature like def converter(cls, data_)
  3. signature - uses the provided signature

Just make sure to include data_ if your conversion uses the input.

from convtools import conversion as c


class A:
    get_one = c.naive(1).gen_converter(class_method=True, debug=True)

    get_two = c.naive(2).gen_converter(method=True, debug=True)

    get_self = c.input_arg("self").gen_converter(signature="self", debug=True)


a = A()

assert A.get_one(None) == 1 and a.get_two(None) == 2 and a.get_self() is a
def converter(cls, data_):
    try:
        return 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(self, data_):
    try:
        return 2
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(self):
    try:
        return self
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Debug

When you need to debug a conversion, the very first thing is to enable debug mode. There are two ways:

  1. pass debug=True to either the gen_converter or the execute method
  2. set debug options using c.OptionsCtx context manager

In both cases it makes sense to install the black code formatter (pip install black); once installed, it will be used automatically.

from convtools import conversion as c

c.item(1).gen_converter(debug=True)

with c.OptionsCtx() as options:
    options.debug = True
    c.item(1).gen_converter()
def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_[1]
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Another way to debug is to use the breakpoint method:

c({"a": c.breakpoint()}).gen_converter(debug=True)
# same
c({"a": c.item(0).breakpoint()}).gen_converter(debug=True)

Inline expressions

Warning

convtools cannot guard you here and doesn't infer any insights from the attributes of unknown pieces of code. Avoid using them if possible.

There are two ways to pass a custom code expression as a string:

  1. c.escaped_string
  2. c.inline_expr

from convtools import conversion as c

assert c.escaped_string("1 + 1").execute(None, debug=True) == 2
assert c.inline_expr("1 + 1").execute(None, debug=True) == 2

assert (
    c.inline_expr("{} + {}").pass_args(c.this, 1).execute(10, debug=True) == 11
)
assert (
    c.inline_expr("{a} + {b}").pass_args(a=c.this, b=1).execute(10, debug=True)
    == 11
)
def converter(data_):
    try:
        return 1 + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return 1 + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

def converter(data_):
    try:
        return data_ + 1
    except __exceptions_to_dump_sources:
        __convtools__code_storage.dump_sources()
        raise

Now that we know the basics and how the thing works, we are ready to go over more complex conversions in a more cheatsheet-like narrative.