text Package
Modules
| function_extension | |
| text_chunker | A text splitter. Splits text into chunks, attempting to leave meaning intact. For plain text, it splits on new lines first, then on periods, and so on. For markdown, it splits on punctuation first, and so on. |
Functions
aggregate_chunked_results
Aggregate the chunked results into a single string.
async aggregate_chunked_results(func: KernelFunction, chunked_results: list[str], kernel: Kernel, arguments: KernelArguments) -> str
Parameters
| Name | Description |
|---|---|
| func | Required |
| chunked_results | Required |
| kernel | Required |
| arguments | Required |
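As a rough illustration of the call shape only, the sketch below mimics the pattern with plain asyncio: an async callable is applied to each chunk in order and the outputs are joined into one string. The stand-in `summarize` coroutine, the `aggregate` helper, and the newline join are assumptions made for this example. They are not the library's implementation, and the real function takes a KernelFunction, a Kernel, and KernelArguments.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def summarize(chunk: str) -> str:
    """Hypothetical stand-in for invoking a KernelFunction on one chunk."""
    await asyncio.sleep(0)  # yield control, as a real model call would
    return chunk.upper()


async def aggregate(func: Callable[[str], Awaitable[str]], chunks: list[str]) -> str:
    """Run func on each chunk in order and join the outputs (assumed behaviour)."""
    results = []
    for chunk in chunks:
        results.append(await func(chunk))
    return "\n".join(results)


combined = asyncio.run(aggregate(summarize, ["first chunk", "second chunk"]))
print(combined)
```

Processing the chunks sequentially keeps the output in input order; whether the library does the same is not stated on this page.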
split_markdown_lines
Split markdown into lines.
It splits on punctuation first, then on spaces and new lines.
split_markdown_lines(text: str, max_token_per_line: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]
Parameters
| Name | Description |
|---|---|
| text | Required |
| max_token_per_line | Required |
| token_counter | Default value: _token_counter |
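The token_counter parameter accepts a callable. A minimal sketch of a custom counter follows, assuming the callable receives a string and returns an integer token count. The signature above only says Callable, so this contract is an assumption, and the name `whitespace_token_counter` is hypothetical.

```python
def whitespace_token_counter(text: str) -> int:
    """Count whitespace-separated words as a crude proxy for tokens."""
    return len(text.split())


sample = "## Heading\n\nSome *markdown* text."
print(whitespace_token_counter(sample))
```

Under that assumption, such a counter would be passed as `token_counter=whitespace_token_counter`. Word counts only approximate model tokens, so a tokenizer-backed counter is the safer choice when limits are tight.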
split_markdown_paragraph
Split markdown into paragraphs.
split_markdown_paragraph(text: list[str], max_tokens: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]
Parameters
| Name | Description |
|---|---|
| text | Required |
| max_tokens | Required |
| token_counter | Default value: _token_counter |
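One way to picture paragraph splitting is greedy packing: consecutive lines are appended to the current paragraph until adding another would exceed max_tokens. The sketch below illustrates that idea only. The helper `pack_lines`, the word-based counter `count_words`, and the newline join are assumptions and may differ from what split_markdown_paragraph actually does.

```python
from collections.abc import Callable


def count_words(text: str) -> int:
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(text.split())


def pack_lines(lines: list[str], max_tokens: int,
               token_counter: Callable[[str], int] = count_words) -> list[str]:
    """Greedily group consecutive lines into paragraphs within a token budget.

    A single line that exceeds the budget on its own is kept whole.
    """
    paragraphs: list[str] = []
    current: list[str] = []
    used = 0
    for line in lines:
        cost = token_counter(line)
        # Start a new paragraph when this line would overflow the budget.
        if current and used + cost > max_tokens:
            paragraphs.append("\n".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


lines = ["# Title", "First sentence here.", "Second one.",
         "A third, longer sentence follows."]
paragraphs = pack_lines(lines, max_tokens=6)
print(paragraphs)
```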
split_plaintext_lines
Split plain text into lines.
It splits on new lines first, then on punctuation.
split_plaintext_lines(text: str, max_token_per_line: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]
Parameters
| Name | Description |
|---|---|
| text | Required |
| max_token_per_line | Required |
| token_counter | Default value: _token_counter |
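The order described for plain text (new lines first, then punctuation) can be sketched as a recursive splitter that falls back to the next separator only when a piece is still over the limit. The helper `split_by_separators`, the regex separator list, and the word-based counter are assumptions for illustration, not the library's code.

```python
import re
from collections.abc import Callable


def count_words(text: str) -> int:
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(text.split())


# Coarse to fine: line breaks, then sentence ends, then clause punctuation.
# The lookbehinds keep the punctuation attached to the preceding piece.
SEPARATORS = (r"\n+", r"(?<=[.!?])\s+", r"(?<=[,;:])\s+")


def split_by_separators(text: str, max_tokens: int,
                        separators: tuple[str, ...] = SEPARATORS,
                        token_counter: Callable[[str], int] = count_words) -> list[str]:
    """Split text on the first separator, recursing with finer ones as needed."""
    text = text.strip()
    if not text:
        return []
    # Stop when the piece fits, or when no finer separator is left to try.
    if token_counter(text) <= max_tokens or not separators:
        return [text]
    pieces: list[str] = []
    for part in re.split(separators[0], text):
        pieces.extend(split_by_separators(part, max_tokens, separators[1:], token_counter))
    return pieces


chunks = split_by_separators(
    "Short line.\nThis line is longer. It has two sentences.", max_tokens=5
)
print(chunks)
```

Trying coarse separators first means a short line is never broken at a period, which is the sense in which the split tries to leave meaning intact.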
split_plaintext_paragraph
Split plain text into paragraphs.
split_plaintext_paragraph(text: list[str], max_tokens: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]
Parameters
| Name | Description |
|---|---|
| text | Required |
| max_tokens | Required |
| token_counter | Default value: _token_counter |