text Package

Modules

function_extension
text_chunker

A text splitter.

Splits text into chunks while attempting to keep the meaning intact. Plain text is split on new lines first, then on periods, and so on through finer separators. Markdown is split on punctuation first, then on spaces and new lines.
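The coarse-to-fine idea described above can be illustrated with a small standalone sketch. This is not the library's implementation: `count_tokens`, `split_recursive`, and the separator list are hypothetical names chosen for illustration, and the word-based token count is an assumption.

```python
import re

def count_tokens(text: str) -> int:
    """Hypothetical counter: one token per whitespace-separated word."""
    return len(text.split())

# Separators ordered from coarsest to finest: new lines, sentence ends, spaces.
SEPARATORS = (r"\n+", r"(?<=[.!?])\s+", r"\s+")

def split_recursive(text: str, max_tokens: int, separators=SEPARATORS) -> list[str]:
    """Split on the coarsest separator first; recurse with finer ones if a piece is still too long."""
    text = text.strip()
    if not text:
        return []
    # Stop when the piece fits, or when no finer separator is left to try.
    if count_tokens(text) <= max_tokens or not separators:
        return [text]
    head, *rest = separators
    chunks: list[str] = []
    for piece in re.split(head, text):
        chunks.extend(split_recursive(piece, max_tokens, tuple(rest)))
    return chunks

chunks = split_recursive("First line. Second sentence here.\nShort.", max_tokens=3)
```

The sketch only splits; it does not merge small neighbouring pieces back together, which a production chunker would normally do.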

Functions

aggregate_chunked_results

Aggregate the results produced for the individual chunks into a single string.

async aggregate_chunked_results(func: KernelFunction, chunked_results: list[str], kernel: Kernel, arguments: KernelArguments) -> str

Parameters

Name Description
func
Required
chunked_results
Required
kernel
Required
arguments
Required
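The page does not say how the per-chunk results are combined. A minimal sketch of one plausible pattern, assuming each chunk is processed in order and the outputs are joined with new lines, is shown here. A plain async callable stands in for `KernelFunction`, and `aggregate_chunks` and `shout` are hypothetical names, not part of the package.

```python
import asyncio
from collections.abc import Awaitable, Callable

async def aggregate_chunks(
    process: Callable[[str], Awaitable[str]],
    chunks: list[str],
) -> str:
    """Run `process` on each chunk in order and join the outputs (assumed: newline-joined)."""
    results: list[str] = []
    for chunk in chunks:
        results.append(await process(chunk))
    return "\n".join(results)

async def shout(chunk: str) -> str:
    # Stand-in for a model-backed function; just upper-cases the chunk.
    return chunk.upper()

combined = asyncio.run(aggregate_chunks(shout, ["first chunk", "second chunk"]))
```

Processing chunks sequentially keeps the output order stable; whether the real function does the same is an assumption.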

split_markdown_lines

Split markdown into lines.

It will split on punctuation first, and then on space and new lines.

split_markdown_lines(text: str, max_token_per_line: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]

Parameters

Name Description
text
Required
max_token_per_line
Required
token_counter
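As a hedged usage sketch, a custom counter can be passed through the optional `token_counter` parameter. The signature above only says `Callable`; this example assumes the counter receives a string and returns an integer token count. `word_token_counter` and `split_markdown_demo` are hypothetical helpers.

```python
def word_token_counter(text: str) -> int:
    """Hypothetical counter: treats each whitespace-separated word as one token."""
    return len(text.split())

def split_markdown_demo(markdown: str, max_token_per_line: int = 20) -> list[str]:
    """Split markdown into lines using the custom counter."""
    # Imported lazily so the counter can be defined without the package installed.
    from semantic_kernel.text import split_markdown_lines

    return split_markdown_lines(
        markdown,
        max_token_per_line,
        token_counter=word_token_counter,
    )
```

Calling `split_markdown_demo("# Title\n\nSome text.")` requires the `semantic_kernel` package. In practice a tokenizer that matches the target model would give more accurate limits than a word count.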

split_markdown_paragraph

Split markdown into paragraphs.

split_markdown_paragraph(text: list[str], max_tokens: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]

Parameters

Name Description
text
Required
max_tokens
Required
token_counter

split_plaintext_lines

Split plain text into lines.

It will split on new lines first, and then on punctuation.

split_plaintext_lines(text: str, max_token_per_line: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]

Parameters

Name Description
text
Required
max_token_per_line
Required
token_counter

split_plaintext_paragraph

Split plain text into paragraphs.

split_plaintext_paragraph(text: list[str], max_tokens: int, token_counter: collections.abc.Callable = _token_counter) -> list[str]

Parameters

Name Description
text
Required
max_tokens
Required
token_counter
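The paragraph functions take a list of lines rather than a single string, which suggests that lines are grouped into paragraphs under a token budget. A minimal standalone sketch of such grouping follows. It is an illustration only, not the library's algorithm: `pack_lines`, the word-based counter, and joining with a single space are all assumptions.

```python
from collections.abc import Callable

def pack_lines(
    lines: list[str],
    max_tokens: int,
    count: Callable[[str], int] = lambda s: len(s.split()),
) -> list[str]:
    """Greedily group consecutive lines so each paragraph stays within max_tokens."""
    paragraphs: list[str] = []
    current: list[str] = []
    used = 0
    for line in lines:
        n = count(line)
        # Start a new paragraph when adding this line would exceed the budget.
        if current and used + n > max_tokens:
            paragraphs.append(" ".join(current))
            current, used = [], 0
        current.append(line)
        used += n
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs

paragraphs = pack_lines(["a b", "c d", "e"], max_tokens=4)
```

A single line that is longer than `max_tokens` is kept whole in this sketch, so lines should be split to a smaller limit first.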