Processing components in Langflow

This Langflow feature is currently in public preview. Development is ongoing, and the features and functionality are subject to change. Langflow, and the use of such, is subject to the DataStax Preview Terms.

Processing components process and transform data within a flow.

Use a processing component in a flow

The Split Text processing component in this flow splits the incoming Data into chunks to be embedded into the vector store component.

The component offers control over chunk size, overlap, and separator, which affect context and granularity in vector store retrieval results.
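
As a rough illustration of how chunk size, overlap, and separator interact, the following is a minimal Python sketch of a greedy character splitter. It is not Langflow's implementation: the function name `split_text` and the packing strategy are assumptions made for this example. Only the parameter names and default values mirror the component's documented inputs.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200, separator="\n"):
    """Split on `separator`, then greedily pack pieces into overlapping chunks.

    Hypothetical helper for illustration; not the Langflow implementation.
    A single piece longer than `chunk_size` is kept as one oversized chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], []
    for piece in pieces:
        candidate = separator.join(current + [piece])
        if current and len(candidate) > chunk_size:
            # The current chunk is full: emit it.
            chunks.append(separator.join(current))
            # Keep only trailing pieces that fit the overlap budget and
            # still leave room for the incoming piece.
            while current and (
                len(separator.join(current)) > chunk_overlap
                or len(separator.join(current + [piece])) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks
```

With `chunk_size=9` and `chunk_overlap=4`, the four-line text `"aaaa\nbbbb\ncccc\ndddd"` produces three chunks, and each chunk shares one line with its neighbor. A larger overlap preserves more context across chunk boundaries at the cost of more stored text.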

Alter Metadata

This component is in Legacy. Legacy components can be used in flows, but may not work due to Langflow core updates.

This component modifies metadata of input objects. It can add new metadata, update existing metadata, and remove specified metadata fields. The component works with both Message and Data objects, and can also create a new Data object from user-provided text.

Name	Display Name	Info
input_value	Input	Objects to which Metadata should be added.
text_in	User Text	Text input; the value will be in the 'text' attribute of the Data object. Empty text entries are ignored.
metadata	Metadata	Metadata to add to each object.
remove_fields	Fields to Remove	Metadata fields to remove.
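
The add, update, and remove behavior described above can be sketched in plain Python. This is an illustration only: it models each input object as a dictionary, the function name `alter_metadata` is hypothetical, and applying additions before removals is an assumption rather than documented behavior.

```python
def alter_metadata(items, text_in="", metadata=None, remove_fields=None):
    """Return copies of `items` with metadata added or updated, then pruned.

    Hypothetical sketch: objects are modeled as plain dicts.
    """
    metadata = metadata or {}
    remove_fields = remove_fields or []

    # Non-empty user text becomes a new object with a 'text' attribute;
    # empty text is ignored.
    if text_in:
        items = list(items) + [{"text": text_in}]

    result = []
    for item in items:
        updated = {**item, **metadata}  # add new keys, overwrite existing ones
        for field in remove_fields:
            updated.pop(field, None)    # silently skip fields that are absent
        result.append(updated)
    return result
```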

Name	Display Name	Info
model	Language Model	Connect the 'Language Model' output from your LLM component here.
system_message	System Message	Multi-line system instruction for all rows in the DataFrame.
df	DataFrame	The DataFrame whose column, specified by 'column_name', will be treated as text messages.
column_name	Column Name	The name of the DataFrame column to treat as text messages.

Name	Display Name	Info
first_text	First Text	The first text input to concatenate.
second_text	Second Text	The second text input to concatenate.
delimiter	Delimiter	A string used to separate the two text inputs. Defaults to a space.
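
In plain Python terms, the concatenation these inputs describe amounts to a join. The function name `combine_text` is hypothetical and used only for this sketch.

```python
def combine_text(first_text, second_text, delimiter=" "):
    """Join two strings with a delimiter (a single space by default)."""
    return delimiter.join([first_text, second_text])
```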

Name	Display Name	Info
number_of_fields	Number of Fields	The number of fields to be added to the record.
text_key	Text Key	Key that identifies the field to be used as the text content.
text_key_validator	Text Key Validator	If enabled, checks if the given 'Text Key' is present in the given 'Data'.

Operation	Description	Required Inputs
Add Column	Adds a new column with a constant value	new_column_name, new_column_value
Drop Column	Removes a specified column	column_name
Filter	Filters rows based on column value	column_name, filter_value
Head	Returns first n rows	num_rows
Rename Column	Renames an existing column	column_name, new_column_name
Replace Value	Replaces values in a column	column_name, replace_value, replacement_value
Select Columns	Selects specific columns	columns_to_select
Sort	Sorts DataFrame by column	column_name, ascending
Tail	Returns last n rows	num_rows

Name	Display Name	Info
df	DataFrame	The input DataFrame to operate on.
operation	Operation	Select the DataFrame operation to perform. Options: Add Column, Drop Column, Filter, Head, Rename Column, Replace Value, Select Columns, Sort, Tail
column_name	Column Name	The column name to use for the operation.
filter_value	Filter Value	The value to filter rows by.
ascending	Sort Ascending	Whether to sort in ascending order.
new_column_name	New Column Name	The new column name when renaming or adding a column.
new_column_value	New Column Value	The value to populate the new column with.
columns_to_select	Columns to Select	List of column names to select.
num_rows	Number of Rows	Number of rows to return (for head/tail). Default: 5
replace_value	Value to Replace	The value to replace in the column.
replacement_value	Replacement Value	The value to replace with.
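
To make the operation table concrete, here is a dependency-free Python sketch that applies the same operations to a list of row dictionaries instead of a real DataFrame. The function `apply_operation` is hypothetical, and treating Filter and Replace Value as exact equality matches is an assumption; the operation names and input names come from the tables above.

```python
def apply_operation(rows, operation, **params):
    """Apply one table operation to a list of row dicts (illustrative only)."""
    if operation == "Add Column":
        name, value = params["new_column_name"], params["new_column_value"]
        return [{**row, name: value} for row in rows]
    if operation == "Drop Column":
        col = params["column_name"]
        return [{k: v for k, v in row.items() if k != col} for row in rows]
    if operation == "Filter":
        col, wanted = params["column_name"], params["filter_value"]
        return [row for row in rows if row[col] == wanted]
    if operation == "Head":
        return rows[: params.get("num_rows", 5)]
    if operation == "Tail":
        n = params.get("num_rows", 5)
        return rows[-n:] if n > 0 else []
    if operation == "Rename Column":
        old, new = params["column_name"], params["new_column_name"]
        return [{(new if k == old else k): v for k, v in row.items()} for row in rows]
    if operation == "Replace Value":
        col = params["column_name"]
        old, new = params["replace_value"], params["replacement_value"]
        return [{**row, col: new if row[col] == old else row[col]} for row in rows]
    if operation == "Select Columns":
        return [{c: row[c] for c in params["columns_to_select"]} for row in rows]
    if operation == "Sort":
        col = params["column_name"]
        descending = not params.get("ascending", True)
        return sorted(rows, key=lambda row: row[col], reverse=descending)
    raise ValueError(f"Unknown operation: {operation}")
```

Each call returns a new list and leaves the input rows unchanged, so operations can be chained by passing one result into the next call.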

Name	Display Name	Info
data	data	The data to convert to text.
template	Template	The template to use for formatting the data. It can contain the keys {text}, {data} or any other key in the data.
sep	Separator	The separator to use between multiple data items.
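
The template and separator inputs can be pictured with Python string formatting. This is a sketch under assumptions: data items are modeled as dictionaries, the function name `data_to_text` is hypothetical, and `{data}` is taken to mean the whole item.

```python
def data_to_text(items, template="{text}", sep="\n"):
    """Format each item with `template`, then join the results with `sep`."""
    lines = []
    for item in items:
        # Expose every key of the item, plus the whole item as {data}.
        # A key literally named "data" in the item takes precedence.
        lines.append(template.format_map({"data": item, **item}))
    return sep.join(lines)
```

For example, the template `"{author}: {text}"` with separator `" | "` turns two items into one string of the form `ann: hi | bob: yo`.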

Name	Display Name	Info
data	data	The Data object to filter.
filter_criteria	Filter Criteria	List of keys to filter by.

Name	Display Name	Info
input_data	Input data	The list of data items to filter.
filter_key	Filter Key	The key to filter on (for example, 'route').
filter_value	Filter Value	The value to filter by (for example, 'CMIP').
operator	Comparison Operator	The operator to apply for comparing the values.
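
A key, value, and operator filter of this kind could look like the following Python sketch. The operator names in `OPERATORS` are hypothetical (the table above does not list the available operators), as is the choice to compare values as strings and to skip items that lack the key.

```python
# Hypothetical operator names; the real component may offer a different set.
OPERATORS = {
    "equals": lambda value, target: value == target,
    "not equals": lambda value, target: value != target,
    "contains": lambda value, target: target in value,
    "starts with": lambda value, target: value.startswith(target),
    "ends with": lambda value, target: value.endswith(target),
}


def filter_values(input_data, filter_key, filter_value, operator="equals"):
    """Keep the items whose `filter_key` value satisfies the comparison."""
    compare = OPERATORS[operator]
    return [
        item
        for item in input_data
        if filter_key in item and compare(str(item[filter_key]), filter_value)
    ]
```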

Name	Display Name	Info
json_str	JSON String	The JSON string to be cleaned. This can be a raw, potentially malformed JSON string produced by language models or other sources that may not fully comply with JSON specifications.
remove_control_chars	Remove Control Characters	If set to True, this option removes control characters (ASCII characters 0-31 and 127) from the JSON string. This can help eliminate invisible characters that might cause parsing issues or make the JSON invalid.
normalize_unicode	Normalize Unicode	When enabled, this option normalizes Unicode characters in the JSON string to their canonical composition form (NFC). This ensures consistent representation of Unicode characters across different systems and prevents potential issues with character encoding.
validate_json	Validate JSON	If set to True, this option attempts to parse the JSON string to ensure it is well-formed before applying the final repair operation. It raises a ValueError if the JSON is invalid, allowing for early detection of major structural issues in the JSON.
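
The three options can be approximated with the Python standard library. This sketch covers only the documented cleaning and validation steps and omits the final repair operation; the function name `clean_json_string` is hypothetical.

```python
import json
import re
import unicodedata


def clean_json_string(json_str, remove_control_chars=False,
                      normalize_unicode=False, validate_json=False):
    """Apply the optional cleaning steps; the repair step is not shown."""
    if remove_control_chars:
        # ASCII 0-31 and 127. This range includes tabs and newlines.
        json_str = re.sub(r"[\x00-\x1f\x7f]", "", json_str)
    if normalize_unicode:
        # NFC composes characters such as "e" + combining accent into one code point.
        json_str = unicodedata.normalize("NFC", json_str)
    if validate_json:
        # json.JSONDecodeError is a subclass of ValueError.
        json.loads(json_str)
    return json_str
```

Because control-character removal also strips newlines and tabs, pretty-printed JSON collapses onto one line, which does not change its meaning.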

Inputs
Name	Display Name	Info
data	Data	The structured data to filter or transform using a Lambda function.
llm	Language Model	The connection port for a Model component.
filter_instruction	Instructions	Natural language instructions for how to filter or transform the data using a Lambda function, such as `Filter the data to only include items where the 'status' is 'active'`.
sample_size	Sample Size	For large datasets, the number of characters to sample from the dataset head and tail.
max_size	Max Size	The number of characters for the data to be considered "large", which triggers sampling by the `sample_size` value.
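
The relationship between `max_size` and `sample_size` can be sketched as follows. The function name `sample_for_prompt`, the default values, and the `...` marker between the two samples are assumptions for illustration, not documented behavior.

```python
def sample_for_prompt(text, max_size=30000, sample_size=1000):
    """Return `text` unchanged, or its head and tail if it is 'large'.

    Default values here are illustrative assumptions.
    """
    if len(text) <= max_size:
        return text
    head = text[:sample_size]
    tail = text[max(0, len(text) - sample_size):]
    return head + "\n...\n" + tail
```

Sampling both ends gives the language model a view of the data's structure without sending the full dataset.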

Inputs
Name	Display Name	Info
models	Language Models	List of LLMs to route between.
input_value	Input	The input message to be routed.
judge_llm	Judge LLM	LLM that will evaluate and select the most appropriate model.
optimization	Optimization	Optimization preference (quality/speed/cost/balanced).

Outputs
Name	Display Name	Info
filtered_data	Filtered Data	The filtered or transformed data object.
dataframe	DataFrame	The filtered data returned as a `DataFrame`.

Outputs
Name	Display Name	Info
output	Output	The response from the selected model.
selected_model	Selected Model	Name of the chosen model.

Name	Display Name	Info
df	DataFrame	The DataFrame to convert to text rows.
template	Template	The template for formatting each row. Use placeholders matching column names in the DataFrame, for example, `{col1}`, `{col2}`.
sep	Separator	String that joins all row texts when building the single Text output.

Name	Display Name	Info
input_value	Input	The data object to filter. It can be a message or data object.
query	JQ Query	JQ Query to filter the data. The input is always a JSON list.

Name	Display Name	Info
stringify	Stringify	Enable to convert input to a string instead of using a template.
template	Template	Template for formatting using variables in curly brackets. For DataFrames, use column names, such as `Name: {Name}`. For Data objects, use `{text}`.
input_data	Data or DataFrame	The input to parse. Accepts either a DataFrame or a Data object.
sep	Separator	String used to separate rows/items. Default: newline.
clean_data	Clean Data	When stringify is enabled, cleans data by removing empty rows and lines.

Name	Display Name	Info
input_text	Input Text	The text to analyze. Type: MessageTextInput, Required: true
pattern	Regex Pattern	The regular expression pattern to match. Type: MessageTextInput, Required: true

Name	Display Name	Info
data	Data	List of extracted matches as Data objects. Method: extract_matches
text	Message	Formatted text of all matches. Method: get_matches_text
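
The two outputs correspond roughly to the following use of Python's `re` module. The method names `extract_matches` and `get_matches_text` come from the table above; writing them as standalone functions, the `"match"` key, and the newline join are assumptions made for this sketch.

```python
import re


def extract_matches(input_text, pattern):
    """Return each full regex match as a small dict (stand-in for a Data object)."""
    return [{"match": m.group(0)} for m in re.finditer(pattern, input_text)]


def get_matches_text(input_text, pattern):
    """Return all matches as one newline-separated string."""
    return "\n".join(d["match"] for d in extract_matches(input_text, pattern))
```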

Name	Display Name	Info
input_type	Input Type	Select the type of input to save. Options: ["DataFrame", "Data", "Message"]. Type: DropdownInput
df	DataFrame	The DataFrame to save. Type: DataFrameInput
data	Data	The Data object to save. Type: DataInput
message	Message	The Message to save. Type: MessageInput
file_format	File Format	Select the file format to save the input. Options for DataFrame/Data: ["csv", "excel", "json", "markdown"]. Options for Message: ["txt", "json", "markdown"]. Type: DropdownInput
file_path	File Path (including filename)	The full file path (including filename and extension). Type: StrInput

Name	Display Name	Info
data_list	Data List	List of data to select from.
data_index	Data Index	Index of the data to select.

Name	Display Name	Info
data_inputs	Input Documents	The data to split. The component accepts `Data` or `DataFrame` objects.
chunk_overlap	Chunk Overlap	The number of characters to overlap between chunks. Default: `200`.
chunk_size	Chunk Size	The maximum number of characters in each chunk. Default: `1000`.
separator	Separator	The character to split on. Default: `newline`.
text_key	Text Key	The key to use for the text column (advanced). Default: `text`.

Name	Display Name	Info
chunks	Chunks	List of split text chunks as `Data` objects.
dataframe	DataFrame	The split text chunks as a `DataFrame`.

Name	Display Name	Info
old_data	data	The records to update. It can be a single data object or a list of data objects.
number_of_fields	Number of Fields	Number of fields to be added to the record (range: 1-15).
text_key	Text Key	Key that identifies the field to be used as the text content.
text_key_validator	Text Key Validator	If enabled, checks if the given 'Text Key' is present in the given 'data' object.