Pipes And Generators¶
We already saw how to generate a sample sequence and chain pipes to it. We will now detail how you can extend the built-in types with your own generators and pipes.
Custom Generators¶
If you want to provide access to your own data or create a synthetic dataset, consider using SamplesSequence.from_callable(): all you have to do is provide a function (idx: int) -> Sample.
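To make the idea concrete without depending on the library, the following is a minimal pure-Python sketch of what a callable-backed sequence amounts to: a length plus a function mapping an index to a sample. The names CallableSequence, FakeSample and make_sample are hypothetical and exist only for illustration, and a plain dict stands in for Sample. This is not pipelime's implementation.

```python
from typing import Any, Callable, Dict

# A plain dict stands in for pipelime's Sample in this sketch.
FakeSample = Dict[str, Any]


class CallableSequence:
    """Hypothetical stand-in: a lazy sequence backed by an (idx) -> sample function."""

    def __init__(self, fn: Callable[[int], FakeSample], length: int) -> None:
        self._fn = fn
        self._length = length

    def __len__(self) -> int:
        return self._length

    def __getitem__(self, idx: int) -> FakeSample:
        # Support negative indices and reject out-of-range ones, like a list.
        if idx < 0:
            idx += self._length
        if not 0 <= idx < self._length:
            raise IndexError(idx)
        # The sample is built on demand; nothing is stored up front.
        return self._fn(idx)


def make_sample(idx: int) -> FakeSample:
    """Synthetic data: each sample holds its index and the index squared."""
    return {"index": idx, "square": idx * idx}


dataset = CallableSequence(make_sample, length=5)
```

Since samples are computed only when indexed, such a sequence can describe a very large synthetic dataset without holding it in memory.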
However, writing a new generator class is not too difficult. First, derive from SamplesSequence, then:
- apply the decorator @source_sequence to your class
- set a title: this will be the name of the associated method (see the example below)
- provide a class help: it will be used for automatic help generation (see CLI)
- define your parameters as pydantic.Field: the field's description will be used for automatic help generation
- implement def get_sample(self, idx: int) -> Sample and def size(self) -> int
from typing import List

from pydantic import Field, DirectoryPath, PrivateAttr

from pipelime.sequences import SamplesSequence, Sample, source_sequence
from pipelime.items.base import ItemFactory


@source_sequence
class SequenceFromImageList(SamplesSequence, title="from_image_list"):
    """A SamplesSequence loading images in folder as Samples."""

    folder: DirectoryPath = Field(..., description="The folder to read.")
    ext: str = Field(".png", description="The image file extension.")

    # Internal state, not exposed as a pydantic field.
    _samples: List[Sample] = PrivateAttr()

    def __init__(self, **data):
        super().__init__(**data)

        # Sort the paths so that sample indices are stable across runs.
        self._samples = [
            Sample({"image": ItemFactory.get_instance(p)})
            for p in sorted(self.folder.glob("*" + self.ext))
        ]

    def size(self) -> int:
        return len(self._samples)

    def get_sample(self, idx: int) -> Sample:
        return self._samples[idx]
In the above example notice that:
- we use PrivateAttr to define an internal variable (see pydantic for details)
- we delegate the actual creation of the item to ItemFactory.get_instance: this way we support any possible extension as well as the .remote files
Once the module is imported, the generator is automatically registered into SamplesSequence as from_image_list:
from pipelime.sequences import SamplesSequence
dataset = SamplesSequence.from_image_list(folder="path/to/folder")
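The registration step can look a bit magical, so here is a minimal sketch of one way a class decorator combined with a title keyword could attach a factory method to a base class at import time. BaseSequence, source and FromRange are hypothetical names; the sketch only illustrates the general Python mechanism and makes no claim about how pipelime implements it internally.

```python
from typing import Type


class BaseSequence:
    """Hypothetical base class; stands in for SamplesSequence in this sketch."""

    _title: str = ""

    def __init_subclass__(cls, title: str = "", **kwargs) -> None:
        # Capture the `title=` keyword given in the class statement.
        super().__init_subclass__(**kwargs)
        cls._title = title

    def __init__(self, **params) -> None:
        self.params = params


def source(cls: Type[BaseSequence]) -> Type[BaseSequence]:
    """Hypothetical decorator: expose `cls` on the base class under its title."""

    def factory(**params) -> BaseSequence:
        return cls(**params)

    # Runs at import time, when the decorated class statement is executed.
    setattr(BaseSequence, cls._title, staticmethod(factory))
    return cls


@source
class FromRange(BaseSequence, title="from_range"):
    """Toy generator holding only its parameters."""


# The new class is now reachable through the base class, by title.
seq = BaseSequence.from_range(stop=3)
```

In this sketch the method appears only after the module defining FromRange has been executed, which mirrors why the module must be imported before the generator can be used.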
Do you want a preview of the auto-generated help?
from pipelime.cli import pl_print
pl_print("from_image_list")
>>>
from_image_list
(*, folder: pydantic.types.DirectoryPath, ext: str = '.png')
A SamplesSequence loading images in folder as Samples.
Fields    Description                  Type           Default
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
folder    ▶ The folder to read.        DirectoryPath  ✗
ext       ▶ The image file extension.  str            .png
Custom Pipes¶
To create your own piped operation, derive from PipedSequenceBase, then:
- apply the decorator @piped_sequence to your class
- set a title: this will be the name of the associated method (see the example below)
- provide a class help: it will be used for automatic help generation (see CLI)
- define your parameters as pydantic.Field: the field's description will be used for automatic help generation
- implement def get_sample(self, idx: int) -> Sample and, possibly, def size(self) -> int
from pydantic import Field

from pipelime.sequences import Sample, piped_sequence
from pipelime.sequences.pipes import PipedSequenceBase


@piped_sequence
class ReverseSequence(PipedSequenceBase, title="reversed"):
    """Reverses the order of the first `num` samples."""

    num: int = Field(..., description="The number of samples to reverse.")

    def get_sample(self, idx: int) -> Sample:
        # Indices below `num` are mirrored within the first `num` samples.
        if idx < self.num:
            return self.source[self.num - idx - 1]
        # All other samples pass through unchanged.
        return self.source[idx]
In the above example notice that:
- we do not implement size since it does not change
- we refer to the source sequence as self.source; alternatively, we could have accessed the source by calling super().get_sample()
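When a pipe does change the number of samples, both methods need overriding. The following pure-Python sketch shows the idea with a pipe that keeps every other sample. ToyPipe and EveryOther are hypothetical classes written for this illustration, with a list of integers standing in for a sequence of samples; they are not part of pipelime.

```python
from typing import List


class ToyPipe:
    """Hypothetical base pipe: forwards length and items to its source."""

    def __init__(self, source: List[int]) -> None:
        self.source = source

    def size(self) -> int:
        return len(self.source)

    def get_sample(self, idx: int) -> int:
        return self.source[idx]


class EveryOther(ToyPipe):
    """Keeps samples 0, 2, 4, ... so both methods must be overridden."""

    def size(self) -> int:
        # Ceiling division: 5 source items yield 3 outputs.
        return (len(self.source) + 1) // 2

    def get_sample(self, idx: int) -> int:
        # Output index `idx` maps to source index `2 * idx`.
        return self.source[2 * idx]


pipe = EveryOther([10, 11, 12, 13, 14])
values = [pipe.get_sample(i) for i in range(pipe.size())]
```

If size were left untouched here, a consumer iterating up to the source length would request indices that map past the end of the source.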
As with the generator before, once the module is imported, the pipe registers itself on SamplesSequence with the given title, i.e., reversed, so that you can simply do, e.g., dataset.reversed(num=20).
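The index arithmetic used by get_sample can be checked in isolation. The sketch below applies the same mapping to a plain list; reversed_prefix_index is a hypothetical helper written for this illustration, and the list of strings stands in for a sequence of samples.

```python
from typing import List, Sequence


def reversed_prefix_index(idx: int, num: int) -> int:
    """Map an output index to the source index, mirroring get_sample above."""
    if idx < num:
        return num - idx - 1
    return idx


source: Sequence[str] = ["a", "b", "c", "d", "e"]
num = 3

# Apply the mapping to every position; the length is unchanged.
result: List[str] = [source[reversed_prefix_index(i, num)] for i in range(len(source))]
```

Here the first three items come out as c, b, a and the remaining two are untouched. In this sketch, a num larger than the source length would map small indices past the end of the list, so the caller is assumed to pass a num no greater than the number of samples.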
Do you want a preview of the auto-generated help?
from pipelime.cli import pl_print
pl_print("reversed")
>>>
reversed
(*, num: int)
Reverses the order of the first num samples.
Fields  Description                          Type  Default
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
num     ▶ The number of samples to reverse.  int   ✗