∞ Setence Splitter¶
Documentation¶
- Class name:
LLMSentenceSplitter
- Category:
SALT/Language Toolkit/Parsing
- Output node:
False
The LLMSentenceSplitter node is designed to segment text into smaller, manageable chunks based on specified size and overlap criteria. This functionality is crucial for processing large documents or texts in a way that makes them more amenable to detailed analysis or further processing steps.
Input types¶
Required¶
chunk_size
- Specifies the maximum size of each text chunk. It determines how large each segment of text will be, directly impacting the granularity of the splitting process.
- Comfy dtype:
INT
- Python dtype:
int
chunk_overlap
- Defines the number of characters that will overlap between consecutive text chunks. This overlap ensures continuity and context preservation across the segmented text.
- Comfy dtype:
INT
- Python dtype:
int
Output types¶
llm_sentence_splitter
- Comfy dtype:
LLM_SENTENCE_SPLITTER
- Produces an instance of a SentenceSplitter, configured with the provided chunk size and overlap, ready to be used for text segmentation.
- Python dtype:
SentenceSplitter
- Comfy dtype:
Usage tips¶
- Infra type:
CPU
- Common nodes: unknown
Source code¶
class LLMSentenceSplitter:
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"chunk_size": ("INT", {"min": 8, "max": 2048, "step": 1, "default": 1024}),
"chunk_overlap": ("INT", {"min": 0, "max": 2048, "step": 1, "default": 20})
},
}
RETURN_TYPES = ("LLM_SENTENCE_SPLITTER",)
RETURN_NAMES = ("llm_sentence_splitter",)
FUNCTION = "semantic_nodes"
CATEGORY = f"{MENU_NAME}/{SUB_MENU_NAME}/Parsing"
def semantic_nodes(self, chunk_size, chunk_overlap):
splitter = SentenceSplitter(
chunk_size=chunk_size,
chunk_overlap=chunk_overlap,
)
return (splitter, )