Processing components in Langflow

This Langflow feature is currently in public preview. Development is ongoing, and the features and functionality are subject to change. Langflow, and the use of such, is subject to the DataStax Preview Terms.

Processing components process and transform data within a flow.

Use a processing component in a flow

The Split Text processing component in this flow splits the incoming Data into chunks to be embedded into the vector store component.

The component offers control over chunk size, overlap, and separator, which affect context and granularity in vector store retrieval results.

Figure: vector store document ingestion flow.

Alter Metadata

This component is in Legacy. Legacy components can be used in flows, but may not work due to Langflow core updates.

This component modifies metadata of input objects. It can add new metadata, update existing metadata, and remove specified metadata fields. The component works with both Message and Data objects, and can also create a new Data object from user-provided text.

Parameters

Inputs
Name | Display Name | Info
input_value | Input | Objects to which metadata should be added.
text_in | User Text | Text input; the value is stored in the 'text' attribute of the Data object. Empty text entries are ignored.
metadata | Metadata | Metadata to add to each object.
remove_fields | Fields to Remove | Metadata fields to remove.

Outputs
Name | Display Name | Info
data | Data | List of input objects, each with added metadata.
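
The add, update, and remove behavior described above can be sketched in plain Python. This is an illustration only, not the component's implementation: it assumes each object is a plain dictionary, and the function name alter_metadata and the add-then-remove order are assumptions.

```python
from copy import deepcopy


def alter_metadata(objects, metadata=None, remove_fields=None):
    """Return copies of `objects` with `metadata` merged in and `remove_fields` dropped.

    Hypothetical helper: dictionaries stand in for Message and Data objects.
    """
    metadata = metadata or {}
    remove_fields = set(remove_fields or [])
    result = []
    for obj in objects:
        updated = deepcopy(obj)       # leave the caller's objects untouched
        updated.update(metadata)      # add new keys, overwrite existing ones
        for field in remove_fields:
            updated.pop(field, None)  # skip fields that are not present
        result.append(updated)
    return result


docs = [{"text": "hello", "source": "a.txt", "draft": True}]
print(alter_metadata(docs, {"lang": "en", "source": "b.txt"}, ["draft"]))
```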

Batch run

The batch run component runs a language model over each row of a DataFrame text column, and returns a new DataFrame with the original text and the model’s response.

Parameters

Inputs
Name | Display Name | Info
model | Language Model | Connect the 'Language Model' output from your LLM component here.
system_message | System Message | Multi-line system instruction for all rows in the DataFrame.
df | DataFrame | The DataFrame whose column, specified by 'column_name', will be treated as text messages.
column_name | Column Name | The name of the DataFrame column to treat as text messages.

Outputs
Name | Display Name | Info
batch_results | Batch Results | A DataFrame with two columns: 'text_input' and 'model_response'.
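
As a rough sketch of the row-by-row behavior, the following uses a list of dictionaries in place of a DataFrame and a plain callable in place of a language model. The names batch_run and echo_model are hypothetical; only the 'text_input' and 'model_response' column names come from the description above.

```python
def batch_run(rows, column_name, model, system_message=""):
    """Call `model` once per row and pair each input text with its response."""
    results = []
    for row in rows:
        text = str(row[column_name])             # treat the column value as text
        response = model(system_message, text)   # one model call per row
        results.append({"text_input": text, "model_response": response})
    return results


def echo_model(system_message, text):
    """Hypothetical stand-in for a real language model call."""
    return f"[{system_message}] {text.upper()}"


reviews = [{"review": "great product"}, {"review": "too slow"}]
for result in batch_run(reviews, "review", echo_model, "classify"):
    print(result)
```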

Combine data

Prior to Langflow version 1.1.3, this component was named Merge Data.

This component combines multiple data objects into a unified list of data objects.

Parameters

Inputs
Name | Display Name | Info
data_inputs | Data Inputs | A list of data input objects to be merged.

Outputs
Name | Display Name | Info
merged_data | Merged Data | The resulting list of merged data objects with consistent keys.
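
One way to read "consistent keys" is that every object in the output carries the same set of keys. The sketch below illustrates that reading with plain dictionaries; the function name combine_data and the empty-string fill value are assumptions, not confirmed component behavior.

```python
def combine_data(data_inputs, fill_value=""):
    """Give every dictionary the union of all keys, filling gaps with `fill_value`."""
    all_keys = []
    for item in data_inputs:
        for key in item:
            if key not in all_keys:   # keep first-seen key order
                all_keys.append(key)
    return [{key: item.get(key, fill_value) for key in all_keys} for item in data_inputs]


print(combine_data([{"name": "Ada"}, {"email": "linus@example.com"}]))
```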

Combine Text

This component concatenates two text sources into a single text chunk using a specified delimiter.

Parameters

Inputs
Name | Display Name | Info
first_text | First Text | The first text input to concatenate.
second_text | Second Text | The second text input to concatenate.
delimiter | Delimiter | A string used to separate the two text inputs. Defaults to a space.

Outputs
Name | Display Name | Info
message | Message | A Message object containing the combined text.

Create data

This component is in Legacy as of Langflow version 1.1.3. Legacy components can be used in flows, but may not work due to Langflow core updates.

This component dynamically creates a Data object with a specified number of fields.

Parameters

Inputs
Name | Display Name | Info
number_of_fields | Number of Fields | The number of fields to be added to the record.
text_key | Text Key | Key that identifies the field to be used as the text content.
text_key_validator | Text Key Validator | If enabled, checks if the given 'Text Key' is present in the given 'Data'.

Outputs
Name | Display Name | Info
data | Data | A Data object created with the specified fields and text key.

DataFrame operations

This component performs various operations on Pandas DataFrames. Here’s an overview of the available operations and their inputs:

Operations
Operation | Description | Required Inputs
Add Column | Adds a new column with a constant value | new_column_name, new_column_value
Drop Column | Removes a specified column | column_name
Filter | Filters rows based on column value | column_name, filter_value
Head | Returns first n rows | num_rows
Rename Column | Renames an existing column | column_name, new_column_name
Replace Value | Replaces values in a column | column_name, replace_value, replacement_value
Select Columns | Selects specific columns | columns_to_select
Sort | Sorts DataFrame by column | column_name, ascending
Tail | Returns last n rows | num_rows

Parameters

Inputs
Name | Display Name | Info
df | DataFrame | The input DataFrame to operate on.
operation | Operation | Select the DataFrame operation to perform. Options: Add Column, Drop Column, Filter, Head, Rename Column, Replace Value, Select Columns, Sort, Tail.
column_name | Column Name | The column name to use for the operation.
filter_value | Filter Value | The value to filter rows by.
ascending | Sort Ascending | Whether to sort in ascending order.
new_column_name | New Column Name | The new column name when renaming or adding a column.
new_column_value | New Column Value | The value to populate the new column with.
columns_to_select | Columns to Select | List of column names to select.
num_rows | Number of Rows | Number of rows to return (for head/tail). Default: 5.
replace_value | Value to Replace | The value to replace in the column.
replacement_value | Replacement Value | The value to replace with.

Outputs
Name | Display Name | Info
output | DataFrame | The resulting DataFrame after the operation.
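
To make the required inputs concrete, here is a dependency-free sketch of four of the operations, using a list of row dictionaries as a stand-in for a DataFrame. The function names are hypothetical; the comments show the pandas call each one roughly corresponds to, which is an assumption about how such an operation is typically written rather than a description of the component's code.

```python
def add_column(rows, new_column_name, new_column_value):
    # pandas: df.assign(**{new_column_name: new_column_value})
    return [{**row, new_column_name: new_column_value} for row in rows]


def filter_rows(rows, column_name, filter_value):
    # pandas: df[df[column_name] == filter_value]
    return [row for row in rows if row[column_name] == filter_value]


def sort_rows(rows, column_name, ascending=True):
    # pandas: df.sort_values(by=column_name, ascending=ascending)
    return sorted(rows, key=lambda row: row[column_name], reverse=not ascending)


def head(rows, num_rows=5):
    # pandas: df.head(num_rows)
    return rows[:num_rows]


table = [
    {"name": "Ada", "team": "core"},
    {"name": "Linus", "team": "kernel"},
    {"name": "Grace", "team": "core"},
]
print(head(sort_rows(filter_rows(table, "team", "core"), "name"), 1))
```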

Data to message

This component is in Legacy, which means it is no longer in active development as of Langflow version 1.3. Instead, use the Parser component.

This component converts data objects into plain text using a specified template.

Parameters

Inputs
Name | Display Name | Info
data | Data | The data to convert to text.
template | Template | The template to use for formatting the data. It can contain the keys {text}, {data}, or any other key in the data.
sep | Separator | The separator to use between multiple data items.

Outputs
Name | Display Name | Info
text | Text | The resulting formatted text string as a message object.

Filter data

This component is in Beta as of Langflow version 1.1.3, and is not yet fully supported.

This component filters a data object based on a list of specified keys. This component allows for selective extraction of data from a data object, retaining only the key-value pairs that match the provided filter criteria.

Parameters

Inputs
Name | Display Name | Info
data | Data | The data object to filter.
filter_criteria | Filter Criteria | List of keys to filter by.

Outputs
Name | Display Name | Info
filtered_data | Filtered Data | The resulting filtered data object.
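
Retaining only the listed keys amounts to a dictionary comprehension. The sketch below assumes the data object behaves like a plain dictionary; filter_data is a hypothetical name.

```python
def filter_data(data, filter_criteria):
    """Keep only the key-value pairs whose key appears in `filter_criteria`."""
    wanted = set(filter_criteria)
    return {key: value for key, value in data.items() if key in wanted}


user = {"id": 7, "name": "Ada", "email": "ada@example.com"}
print(filter_data(user, ["name", "email"]))
```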

Filter Values

This component is in Beta as of Langflow version 1.1.3, and is not yet fully supported.

This component filters a list of data items based on a specified key, filter value, and comparison operator.

Parameters

Inputs
Name | Display Name | Info
input_data | Input Data | The list of data items to filter.
filter_key | Filter Key | The key to filter on (for example, 'route').
filter_value | Filter Value | The value to filter by (for example, 'CMIP').
operator | Comparison Operator | The operator to apply for comparing the values.

Outputs
Name | Display Name | Info
filtered_data | Filtered Data | The resulting list of filtered data items.
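
A minimal sketch of key, value, and operator filtering follows. The operator names in OPERATORS are assumptions chosen for illustration, since the available operators are not listed here, and items are modeled as plain dictionaries.

```python
# Hypothetical operator names; the component's actual options may differ.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "not equals": lambda actual, expected: actual != expected,
    "contains": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
    "ends with": lambda actual, expected: actual.endswith(expected),
}


def filter_values(input_data, filter_key, filter_value, operator="equals"):
    """Keep items that have `filter_key` and whose value passes the comparison."""
    compare = OPERATORS[operator]
    return [
        item
        for item in input_data
        if filter_key in item and compare(str(item[filter_key]), filter_value)
    ]


items = [{"route": "CMIP"}, {"route": "CMIP6"}, {"name": "no route"}]
print(filter_values(items, "route", "CMIP", "starts with"))
```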

JSON Cleaner

This component is in Legacy. Legacy components can be used in flows, but may not work due to Langflow core updates.

This component cleans JSON strings to ensure they are fully compliant with the JSON specification.

Parameters

Inputs
Name | Display Name | Info
json_str | JSON String | The JSON string to be cleaned. This can be a raw, potentially malformed JSON string produced by language models or other sources that may not fully comply with JSON specifications.
remove_control_chars | Remove Control Characters | If set to True, this option removes control characters (ASCII characters 0-31 and 127) from the JSON string. This can help eliminate invisible characters that might cause parsing issues or make the JSON invalid.
normalize_unicode | Normalize Unicode | When enabled, this option normalizes Unicode characters in the JSON string to their canonical composition form (NFC). This ensures consistent representation of Unicode characters across different systems and prevents potential issues with character encoding.
validate_json | Validate JSON | If set to True, this option attempts to parse the JSON string to ensure it is well-formed before applying the final repair operation. It raises a ValueError if the JSON is invalid, allowing for early detection of major structural issues in the JSON.

Outputs
Name | Display Name | Info
output | Cleaned JSON String | The resulting cleaned, repaired, and validated JSON string that fully complies with the JSON specification.
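
The three options can be approximated with the Python standard library. This sketch covers only control-character removal, NFC normalization, and validation; it does not attempt the final repair step, and the function name clean_json is hypothetical.

```python
import json
import unicodedata


def clean_json(json_str, remove_control_chars=False, normalize_unicode=False, validate_json=False):
    """Apply the optional cleaning steps in order and return the cleaned string."""
    if remove_control_chars:
        # Drop ASCII 0-31 and 127. This also removes tabs and newlines.
        json_str = "".join(ch for ch in json_str if ord(ch) >= 32 and ord(ch) != 127)
    if normalize_unicode:
        # NFC composes sequences such as "e" + combining accent into one code point.
        json_str = unicodedata.normalize("NFC", json_str)
    if validate_json:
        try:
            json.loads(json_str)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Invalid JSON string: {exc}") from exc
    return json_str


print(clean_json('{"name":\x00 "Ada"}', remove_control_chars=True, validate_json=True))
```

Because the 0-31 range includes tab, line feed, and carriage return, enabling control-character removal also strips formatting whitespace from pretty-printed JSON.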

Lambda filter

This component uses an LLM to generate a Lambda function for filtering or transforming structured data.

To use the Lambda filter component, you must connect it to a Language Model component, which the component uses to generate a function based on the natural language instructions in the Instructions field.

This example gets JSON data from the https://jsonplaceholder.typicode.com/users API endpoint. The Instructions field in the Lambda filter component specifies the task "extract emails". The connected LLM creates a filter based on the instructions and extracts a list of email addresses from the JSON data.

Figure: Lambda filter component.

Parameters

Inputs
Name | Display Name | Info
data | Data | The structured data to filter or transform using a Lambda function.
llm | Language Model | The connection port for a Model component.
filter_instruction | Instructions | Natural language instructions for how to filter or transform the data using a Lambda function, such as "Filter the data to only include items where the 'status' is 'active'".
sample_size | Sample Size | For large datasets, the number of characters to sample from the dataset head and tail.
max_size | Max Size | The number of characters above which the data is considered "large", which triggers sampling by the sample_size value.

Outputs
Name | Display Name | Info
filtered_data | Filtered Data | The filtered or transformed data object.
dataframe | DataFrame | The filtered data returned as a DataFrame.
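
The interaction between max_size and sample_size can be pictured as follows. This is a sketch of the sampling idea only: the function name sample_for_prompt, the default values, and the "..." marker between head and tail are assumptions, and no LLM call is made.

```python
import json


def sample_for_prompt(data, max_size=30000, sample_size=1000):
    """Return the serialized data, or its head and tail if it exceeds `max_size` characters.

    The defaults are placeholder values chosen for this sketch.
    """
    text = json.dumps(data)
    if len(text) <= max_size:
        return text                   # small enough to pass along whole
    head = text[:sample_size]         # first sample_size characters
    tail = text[-sample_size:]        # last sample_size characters
    return head + "\n...\n" + tail


print(sample_for_prompt(list(range(5))))
print(sample_for_prompt("x" * 50, max_size=10, sample_size=3))
```

Sampling keeps the text given to the model bounded while still exposing the structure at both ends of the data.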

LLM Router

This component routes requests to the most appropriate LLM based on the OpenRouter model specifications.

Parameters

Inputs
Name | Display Name | Info
models | Language Models | List of LLMs to route between.
input_value | Input | The input message to be routed.
judge_llm | Judge LLM | LLM that will evaluate and select the most appropriate model.
optimization | Optimization | Optimization preference (quality/speed/cost/balanced).

Outputs
Name | Display Name | Info
output | Output | The response from the selected model.
selected_model | Selected Model | Name of the chosen model.

Message to data

This component converts a message object to a data object.

Parameters

Inputs
Name | Display Name | Info
message | Message | The message object to convert to a data object.

Outputs
Name | Display Name | Info
data | Data | The resulting data object converted from the input message.

Parse DataFrame

This component is in Legacy, which means it is no longer in active development as of Langflow version 1.3. Instead, use the Parser component.

This component converts a DataFrame into plain text following a specified template. Each column in the DataFrame is treated as a possible template key, for example, {col_name}.

Parameters

Inputs
Name | Display Name | Info
df | DataFrame | The DataFrame to convert to text rows.
template | Template | The template for formatting each row. Use placeholders matching column names in the DataFrame, for example, {col1}, {col2}.
sep | Separator | String that joins all row texts when building the single Text output.

Outputs
Name | Display Name | Info
text | Text | All rows combined into a single text, each row formatted by the template and separated by the separator value defined in sep.

Parse JSON

This component is in Legacy as of Langflow version 1.1.3. Legacy components can be used in flows, but may not work due to Langflow core updates.

This component converts and extracts JSON fields using JQ queries.

Parameters

Inputs
Name | Display Name | Info
input_value | Input | The data object to filter. It can be a message or data object.
query | JQ Query | JQ query to filter the data. The input is always a JSON list.

Outputs
Name | Display Name | Info
filtered_data | Filtered Data | Filtered data as a list of data objects.

Parser

This component formats DataFrame or Data objects into text using templates, with an option to convert inputs directly to strings using stringify.

To use this component, create variables for values in the template the same way you would in a Prompt component. For DataFrames, use column names, for example Name: {Name}. For Data objects, use {text}.

Parameters

Inputs
Name | Display Name | Info
stringify | Stringify | Enable to convert input to a string instead of using a template.
template | Template | Template for formatting using variables in curly brackets. For DataFrames, use column names, such as Name: {Name}. For Data objects, use {text}.
input_data | Data or DataFrame | The input to parse. Accepts either a DataFrame or Data object.
sep | Separator | String used to separate rows/items. Default: newline.
clean_data | Clean Data | When stringify is enabled, cleans data by removing empty rows and lines.

Outputs
Name | Display Name | Info
parsed_text | Parsed Text | The resulting formatted text as a message object.
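
Template formatting with curly-bracket variables works much like Python's str.format. The sketch below assumes rows and Data objects are plain dictionaries and leaves out the clean_data option; the function name parse is hypothetical.

```python
def parse(input_data, template="{text}", sep="\n", stringify=False):
    """Format one dictionary or a list of dictionaries into a single string."""
    items = input_data if isinstance(input_data, list) else [input_data]
    if stringify:
        # Skip the template and convert each item directly to a string.
        return sep.join(str(item) for item in items)
    # Each {variable} in the template is filled from the matching key.
    return sep.join(template.format(**item) for item in items)


rows = [{"Name": "Ada", "Role": "Engineer"}, {"Name": "Linus", "Role": "Maintainer"}]
print(parse(rows, template="Name: {Name} ({Role})"))
print(parse({"text": "A single Data object"}))
```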

Regex Extractor

This component extracts patterns from text using regular expressions.

Parameters

Inputs
Name | Display Name | Info
input_text | Input Text | The text to analyze. Type: MessageTextInput, Required: true
pattern | Regex Pattern | The regular expression pattern to match. Type: MessageTextInput, Required: true

Outputs
Name | Display Name | Info
data | Data | List of extracted matches as Data objects. Method: extract_matches
text | Message | Formatted text of all matches. Method: get_matches_text
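
The two outputs can be sketched with Python's re module. The function names mirror the method names listed above, but the bodies are illustrative: the "match" key and the newline-joined text format are assumptions.

```python
import re


def extract_matches(input_text, pattern):
    """Return every non-overlapping match as a small dictionary."""
    return [{"match": found.group(0)} for found in re.finditer(pattern, input_text)]


def get_matches_text(input_text, pattern):
    """Return all matches as one newline-separated string."""
    return "\n".join(item["match"] for item in extract_matches(input_text, pattern))


text = "Order 66 shipped on day 12 of month 5."
print(extract_matches(text, r"\d+"))
print(get_matches_text(text, r"\d+"))
```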

Save to File

This component saves DataFrames, Data, or Messages to various file formats.

Parameters

Inputs
Name | Display Name | Info
input_type | Input Type | Select the type of input to save. Options: ["DataFrame", "Data", "Message"]. Type: DropdownInput
df | DataFrame | The DataFrame to save. Type: DataFrameInput
data | Data | The Data object to save. Type: DataInput
message | Message | The Message to save. Type: MessageInput
file_format | File Format | Select the file format to save the input. Options for DataFrame/Data: ["csv", "excel", "json", "markdown"]. Options for Message: ["txt", "json", "markdown"]. Type: DropdownInput
file_path | File Path (including filename) | The full file path (including filename and extension). Type: StrInput

Outputs
Name | Display Name | Info
confirmation | Confirmation | Confirmation message after saving the file. Method: save_to_file
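
As an illustration of format-dependent saving, this sketch writes tabular rows as CSV or JSON using only the standard library. It covers two of the listed formats, models rows as dictionaries, and uses a hypothetical function name and confirmation wording.

```python
import csv
import json
from pathlib import Path


def save_rows(rows, file_path, file_format="csv"):
    """Write a non-empty list of row dictionaries to `file_path` and return a confirmation."""
    if not rows:
        raise ValueError("Nothing to save: rows is empty")
    path = Path(file_path)
    if file_format == "csv":
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()      # column names come from the first row
            writer.writerows(rows)
    elif file_format == "json":
        path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    else:
        raise ValueError(f"Unsupported format in this sketch: {file_format}")
    return f"Saved {len(rows)} rows to {path}"
```

Calling save_rows([{"name": "Ada"}], "people.csv") would write a header line followed by one data line and return the confirmation string.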

Select Data

This component is in Legacy as of Langflow version 1.1.3. Legacy components can be used in flows, but may not work due to Langflow core updates.

This component selects a single data item from a list.

Parameters

Inputs
Name | Display Name | Info
data_list | Data List | List of data to select from.
data_index | Data Index | Index of the data to select.

Outputs
Name | Display Name | Info
selected_data | Selected Data | The selected Data object.

Split text

This component splits text into chunks based on specified criteria.

Parameters

Inputs
Name | Display Name | Info
data_inputs | Input Documents | The data to split. The component accepts Data or DataFrame objects.
chunk_overlap | Chunk Overlap | The number of characters to overlap between chunks. Default: 200.
chunk_size | Chunk Size | The maximum number of characters in each chunk. Default: 1000.
separator | Separator | The character to split on. Default: newline.
text_key | Text Key | The key to use for the text column (advanced). Default: text.

Outputs
Name | Display Name | Info
chunks | Chunks | List of split text chunks as Data objects.
dataframe | DataFrame | The split text chunks as a DataFrame.
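
To show how chunk_size, chunk_overlap, and separator interact, here is a simplified greedy splitter. It is a sketch under stated assumptions, not the component's algorithm: it splits on the separator, packs pieces into chunks of at most chunk_size characters, and repeats trailing pieces of up to chunk_overlap characters at the start of the next chunk. The function name split_text is hypothetical.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200, separator="\n"):
    """Split `text` on `separator` and pack the pieces into overlapping chunks.

    A single piece longer than chunk_size is kept whole in this sketch.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    pieces = [piece for piece in text.split(separator) if piece]
    chunks, current = [], []
    for piece in pieces:
        candidate = separator.join(current + [piece])
        if current and len(candidate) > chunk_size:
            chunks.append(separator.join(current))
            # Carry trailing pieces forward, up to chunk_overlap characters.
            overlap = []
            for prev in reversed(current):
                if len(separator.join([prev] + overlap)) > chunk_overlap:
                    break
                overlap.insert(0, prev)
            # Drop carried pieces from the front until the new piece fits.
            while overlap and len(separator.join(overlap + [piece])) > chunk_size:
                overlap.pop(0)
            current = overlap
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


print(split_text("alpha\nbeta\ngamma\ndelta", chunk_size=11, chunk_overlap=5))
```

A larger overlap repeats more context across neighboring chunks, which can help retrieval at the cost of storing more duplicated text.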

Update data

The Update data component dynamically updates or appends data with specified fields.

Parameters

Inputs
Name | Display Name | Info
old_data | Data | The records to update. It can be a single data object or a list of data objects.
number_of_fields | Number of Fields | Number of fields to be added to the record (range: 1-15).
text_key | Text Key | Key that identifies the field to be used as the text content.
text_key_validator | Text Key Validator | If enabled, checks if the given 'Text Key' is present in the given 'data' object.

Outputs
Name | Display Name | Info
data | Data | The resulting updated data objects.
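
The update-or-append behavior can be sketched as a dictionary merge applied to one record or a list of records. The function name update_data and the new_fields argument are hypothetical; the 1-15 limit and the text key check follow the parameter descriptions above.

```python
def update_data(old_data, new_fields, text_key="text", text_key_validator=False):
    """Merge `new_fields` into each record, overwriting keys that already exist."""
    if not 1 <= len(new_fields) <= 15:
        raise ValueError("new_fields must contain between 1 and 15 fields")
    records = old_data if isinstance(old_data, list) else [old_data]
    updated = [{**record, **new_fields} for record in records]
    if text_key_validator:
        for record in updated:
            if text_key not in record:
                raise ValueError(f"Text key '{text_key}' not found in record")
    return updated


print(update_data({"text": "draft", "version": 1}, {"version": 2, "author": "Ada"}))
```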
