
CSV Stream | Process Large CSVs in your Function Stack

Summary

Handling large CSV files in your Xano function stacks can be a memory-intensive task, leading to potential failures or performance issues. Thankfully, Xano has introduced a new feature called CSV Stream, which allows you to process CSV files in chunks, improving memory efficiency and enabling seamless handling of large datasets. In this guide, we'll walk you through the process of using CSV Stream in your function stacks.

The Traditional Approach: Limitations and Inefficiencies

Before we dive into CSV Stream, let's revisit the traditional approach to processing CSVs in Xano function stacks:

  1. Start with a File Resource Input: Your CSV file serves as the input for your function stack.
  2. Get File Resource Data: Use the `getFileResourceData` step to retrieve the raw data from the file resource.
  3. Create a New Variable: Since the `getFileResourceData` step includes metadata, create a new variable to isolate the raw file data.
  4. Apply the CSV Decode Filter: Use the CSV decode filter to parse the raw text into structured rows.
  5. Remove Header Row: Remove the header row from the CSV data.
  6. Process Records with a forEach Loop: Iterate through the remaining rows and add records one by one to your table.

While this approach works, it has a significant drawback: you need to hold the entire CSV data in memory multiple times, leading to high memory usage. As the size of the CSV file increases, the function stack becomes more prone to failures, particularly on lower-tier plans or instances with high concurrent usage.
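
To make the memory cost concrete, here is a rough Python analogy of the whole-file pattern. Xano function stacks are no-code, so this is only an illustration of the data flow, and `add_record` is a hypothetical stand-in for the database insert step:

```python
import csv

def add_record(row):
    # Hypothetical stand-in for the database insert step; not a real Xano API.
    pass

def import_csv_all_at_once(path):
    # 1) Read the entire file into memory as one string (first full copy).
    with open(path, newline="") as f:
        raw = f.read()

    # 2) Decode the whole string into a list of rows (second full copy).
    #    (splitlines() is a simplification; it would mishandle quoted
    #    fields that contain embedded newlines.)
    rows = list(csv.reader(raw.splitlines()))

    # 3) Drop the header row.
    rows = rows[1:]

    # 4) Insert records one by one.
    for row in rows:
        add_record(row)
```

Notice that the full file contents exist in memory at least twice (the raw string and the decoded list of rows) before a single record is inserted.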

Introducing CSV Stream: Efficient CSV Processing

CSV Stream is designed to solve the memory inefficiencies associated with processing large CSV files. Here's how you can implement it in your function stacks:

  1. Start with a File Resource Input: Like the traditional approach, your CSV file serves as the input for your function stack.
  2. Use the CSV Stream Function: Instead of multiple steps, you can now use the new `CSVStream` function. Provide a variable name and the file resource value as input.
  3. Process Records with a forEach Loop: CSV Stream is designed to work seamlessly with the `forEach` loop, providing rows of your CSV in chunks for efficient processing.
  4. Add Records or Perform Additional Operations: Within the `forEach` loop, you can add records to your table or perform any other desired operations on the CSV data.

CSV Stream is specifically designed to work with the `forEach` loop, similar to the `stream` return type introduced for the `Query All Records` functionality. This approach ensures that you don't need to hold the entire CSV data in memory at once, significantly reducing memory usage and enabling efficient processing of large datasets.
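
In Python terms, the streaming pattern looks roughly like the sketch below. Again, this is an analogy rather than Xano's actual implementation, and `add_record` is a hypothetical stand-in for the insert step:

```python
import csv

def add_record(row):
    # Hypothetical stand-in for the per-row database insert.
    pass

def import_csv_streamed(path):
    with open(path, newline="") as f:
        reader = csv.reader(f)   # lazy: rows are decoded on demand
        next(reader, None)       # skip the header row
        for row in reader:       # analogous to forEach over CSV Stream
            add_record(row)      # only one row is held at a time
```

Because both the file object and the CSV reader are lazy, memory usage stays roughly flat no matter how many rows the file contains.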

Comparing CSV Stream with the Traditional Approach

To illustrate the efficiency gains of CSV Stream, let's compare the two approaches using a CSV file with over 200,000 records on a Launch Plan instance.

Traditional Approach:

  1. Run the function stack using the traditional approach with the large CSV file.
  2. Observe the result: An "unknown error" is returned, likely due to memory issues caused by the CSV file's size.

CSV Stream Approach:

  1. Run the function stack using CSV Stream with the same large CSV file.
  2. Observe the result: The function stack completes successfully, and the CSV data is processed without any memory-related issues.

While using CSV Stream, you won't be able to view the entire contents of the CSV in the function stack, since the goal is to avoid holding it all in memory. You can, however, inspect individual items within the `forEach` loop using a Stop & Debug statement.
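
For intuition about what "rows in chunks" means, here is one way chunked reading can be sketched in Python. The helper, the file name, and the chunk size are all made up for illustration and say nothing about Xano's internal buffer sizes:

```python
import csv
from itertools import islice

def stream_in_chunks(path, chunk_size=1000):
    # Hypothetical helper: yields the file a chunk of rows at a time,
    # so the full file is never materialized in memory.
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        while chunk := list(islice(reader, chunk_size)):
            yield chunk

for chunk in stream_in_chunks("records.csv"):  # "records.csv" is a made-up path
    # Only this chunk exists in memory; inspecting a single row here is
    # the analogue of a Stop & Debug on one item inside the forEach loop.
    print(len(chunk), chunk[0])
```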

After running the CSV Stream function stack, you can navigate to your database and verify that all records (216,930 in this example) have been successfully added to the table.

Conclusion

CSV Stream is a game-changer for processing large CSV files on Xano. By processing the file in chunks, it keeps memory usage low and lets you handle far larger datasets reliably, even on lower-tier plans or instances with high concurrent usage. Implementing CSV Stream in your function stacks is straightforward and takes just a few steps. Give it a try and experience the power of efficient CSV processing on Xano!

If you have any questions or need further assistance, reach out to Xano Support or engage with the Xano community.

