Contrib / Fs

split_buffer

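Python's built-in open function accepts only the standard line endings ("\n", "\r", "\r\n") as newlines in text mode and has no newline (delimiter) option at all in binary mode. The split_buffer helper fills this gap by splitting a buffer on an arbitrary delimiter while reading it in chunks of chunk_size: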

import io
from convtools.contrib.fs import split_buffer
from convtools.contrib.tables import Table

buffer = io.StringIO("a,b;;;1,2;;;3,4")
lines_generator = split_buffer(buffer, delimiter=";;;", chunk_size=32768)

# e.g. convenient for
assert list(
    Table.from_csv(lines_generator, header=True).into_iter_rows(dict)
) == [{"a": "1", "b": "2"}, {"a": "3", "b": "4"}]
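To make the chunked splitting idea concrete, here is a minimal pure-Python sketch. It is not the convtools implementation: split_text_buffer is a hypothetical name, and its treatment of edge cases (for example, dropping an empty trailing piece) is an assumption made for illustration. The key point is that the unfinished tail of each chunk is carried over, so a delimiter that straddles two chunks is still detected.

```python
import io
from typing import Iterator, TextIO


def split_text_buffer(
    buffer: TextIO, delimiter: str, chunk_size: int = 32768
) -> Iterator[str]:
    """Hypothetical helper: yield delimiter-separated pieces of a text buffer."""
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    tail = ""  # unfinished piece carried over from the previous chunk
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        pieces = (tail + chunk).split(delimiter)
        # The last piece may be incomplete or end with a partial delimiter,
        # so keep it back until more data arrives.
        tail = pieces.pop()
        yield from pieces
    if tail:  # assumption: an empty trailing piece is not emitted
        yield tail


# A tiny chunk_size forces the delimiter to straddle chunk boundaries.
lines = list(
    split_text_buffer(io.StringIO("a,b;;;1,2;;;3,4"), ";;;", chunk_size=4)
)
```

Re-splitting the carried-over tail keeps the sketch short, but it can be slow when the delimiter is rare; a production version would likely search only the newly read part.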

split_buffer_n_decode

There is also a sibling function, split_buffer_n_decode, which reads from a bytes buffer, splits on a bytes delimiter, and decodes each resulting element:

import io
from convtools.contrib.fs import split_buffer_n_decode

buffer = io.BytesIO(b"a,b;;;1,2;;;3,4")
lines = list(
    split_buffer_n_decode(
        buffer, delimiter=b";;;", chunk_size=32768, encoding="utf-8"
    )
)
assert lines == ["a,b", "1,2", "3,4"]
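One plausible benefit of splitting raw bytes first and decoding afterwards is that a chunk boundary can cut a multi-byte character in half, so decoding chunk by chunk may fail. The sketch below illustrates this with a hypothetical split_bytes_then_decode helper; it is an illustration under that assumption, not the library's code.

```python
import io
from typing import BinaryIO, Iterator


def split_bytes_then_decode(
    buffer: BinaryIO,
    delimiter: bytes,
    chunk_size: int = 32768,
    encoding: str = "utf-8",
) -> Iterator[str]:
    """Hypothetical helper: split bytes on a delimiter, then decode each piece."""
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    tail = b""  # undecoded bytes carried over from the previous chunk
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        pieces = (tail + chunk).split(delimiter)
        tail = pieces.pop()  # possibly incomplete, keep it as raw bytes
        for piece in pieces:
            yield piece.decode(encoding)
    if tail:
        yield tail.decode(encoding)


data = "é,b;;;1,2".encode("utf-8")  # "é" occupies two bytes in UTF-8

# Decoding a raw 1-byte chunk fails because it contains only half of "é".
try:
    data[:1].decode("utf-8")
    first_chunk_decodes = True
except UnicodeDecodeError:
    first_chunk_decodes = False

# Splitting on bytes first and decoding whole pieces works with any chunk size.
decoded = list(split_bytes_then_decode(io.BytesIO(data), b";;;", chunk_size=1))
```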