Import Data into Dataset
Action ID: import_dataset
Description
Import a file into a dataset.
Input Parameters

| Parameter  | Type   | Required | Default | Description |
| ---------- | ------ | -------- | ------- | ----------- |
| file_url   | string | ✓        | -       | The URL of the file to import. Supports CSV, XLSX, and JSON Lines. |
| dataset_id | string | ✓        | -       | The ID of the dataset to import into. |
Output Parameters

| Parameter        | Type    | Description |
| ---------------- | ------- | ----------- |
| records_imported | integer | The number of records successfully imported. |
| records_failed   | integer | The number of records that failed to import. |
How It Works
This node fetches a file from the specified URL, parses it based on its format (CSV, XLSX, or JSON lines), and imports the records into the target dataset. Each row or JSON object is processed as an individual record. The node validates data format, handles parsing errors, and provides a summary of successfully imported and failed records. The import operation is transactional, ensuring data integrity throughout the process.
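The parse-validate-tally flow described above can be sketched in Python. This is an illustrative approximation, not the node's actual implementation: the `required_fields` validation rule is an assumed example of a schema check, and XLSX handling is omitted since it requires a third-party library.

```python
import csv
import io
import json

def parse_records(content: str, file_url: str) -> list[dict]:
    """Parse file content into records based on the URL's extension.

    Mirrors the node's per-format handling for CSV and JSON Lines.
    XLSX is omitted here (it needs a third-party parser such as openpyxl).
    """
    if file_url.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(content)))
    if file_url.endswith(".jsonl"):
        return [json.loads(line) for line in content.splitlines() if line.strip()]
    raise ValueError("Unsupported format: expected CSV, XLSX, or JSON Lines")

def import_records(records: list[dict], required_fields: set[str]) -> dict:
    """Validate each record and tally the outcome, like the node's summary.

    required_fields is a stand-in for whatever schema the target dataset
    actually enforces.
    """
    imported = failed = 0
    for record in records:
        has_fields = required_fields <= record.keys()
        non_empty = has_fields and all(record[f] not in (None, "") for f in required_fields)
        if non_empty:
            imported += 1
        else:
            failed += 1
    return {"records_imported": imported, "records_failed": failed}
```

A row missing a required value counts toward `records_failed`, matching how the output parameters are reported.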
Usage Examples
Example 1: Import CSV File
Input:
file_url: "https://example.com/users.csv"
dataset_id: "dataset_abc123"
Output:
records_imported: 1523
records_failed: 7
Example 2: Import Excel File
Input:
file_url: "https://storage.example.com/sales_data_2024.xlsx"
dataset_id: "dataset_sales_001"
Output:
records_imported: 8450
records_failed: 0
Example 3: Import JSON Lines File
Input:
file_url: "https://api.example.com/exports/events.jsonl"
dataset_id: "dataset_events_456"
Output:
records_imported: 12350
records_failed: 23
Common Use Cases
Bulk Data Loading: Import large volumes of data from external sources into your datasets
Data Migration: Transfer data from legacy systems or external platforms into AgenticFlow
Periodic Data Updates: Schedule regular imports to keep datasets synchronized with external sources
ETL Pipelines: Part of extract-transform-load workflows for data integration
Data Consolidation: Combine data from multiple file sources into a centralized dataset
Initial Setup: Populate new datasets with historical or reference data
Third-Party Integration: Import data exported from CRM, analytics, or other business tools
Error Handling

| Error | Cause | Resolution |
| ----- | ----- | ---------- |
| Invalid URL | File URL is malformed or inaccessible | Verify the URL is correct and publicly accessible |
| Unsupported Format | File format is not CSV, XLSX, or JSON Lines | Convert the file to a supported format before importing |
| Dataset Not Found | Dataset ID doesn't exist | Verify the dataset_id is correct and the dataset exists |
| File Download Failed | Cannot access the file at the URL | Check URL accessibility, authentication, or network connectivity |
| Parsing Error | File content is corrupted or invalid | Validate the file structure and ensure proper formatting |
| Schema Mismatch | File columns don't match the dataset schema | Ensure file headers match the expected dataset fields |
| Size Limit Exceeded | File is too large to process | Split large files into smaller chunks for import |
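The "Size Limit Exceeded" remedy, splitting a large file into smaller chunks, can be done with a short helper. This is a sketch for CSV input only; the chunk size of 2 rows in the usage below is just for illustration, and each chunk repeats the header row so it remains a valid standalone import file.

```python
import csv
import io

def split_csv(content: str, max_rows: int) -> list[str]:
    """Split one CSV string into several smaller CSV strings.

    The header row is repeated at the top of every chunk so each chunk
    can be uploaded and imported independently via file_url.
    """
    rows = list(csv.reader(io.StringIO(content)))
    header, data = rows[0], rows[1:]
    chunks = []
    for start in range(0, len(data), max_rows):
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(header)
        writer.writerows(data[start:start + max_rows])
        chunks.append(buf.getvalue())
    return chunks
```

Each returned string would then be uploaded to an accessible URL and imported with a separate run of this action.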
Notes
Supported Formats: The node accepts CSV, XLSX (Excel), and JSON Lines (JSONL) formats. Ensure your file matches one of these formats.
File Accessibility: The file_url must be publicly accessible or use pre-signed URLs if using cloud storage like S3 or Google Cloud Storage.
Column Mapping: CSV and XLSX files should have header rows that match the dataset field names for proper mapping.
JSON Lines Format: Each line must contain a valid JSON object representing one record.
Error Tracking: Check the records_failed count to identify if any records didn't import. Review logs for specific error details.
Performance: Large files may take time to process. Consider splitting extremely large datasets into multiple imports.
Data Validation: Records that fail validation checks (e.g., missing required fields, invalid data types) will be counted in records_failed.
Idempotency: Re-importing the same file may create duplicate records unless your dataset has unique constraints configured.
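Per the idempotency note, one way to guard against duplicates when re-importing is to filter out records whose key already exists before uploading. This is a client-side sketch, not a built-in feature of the action; the `key_field` of "email" is a hypothetical unique key.

```python
def dedupe_records(records: list[dict], existing_keys: set, key_field: str = "email") -> list[dict]:
    """Drop records whose key already exists in the dataset, so re-running
    the same import does not create duplicate records.

    existing_keys would come from a prior query of the dataset;
    key_field is whichever field your dataset treats as unique.
    """
    fresh = []
    seen = set(existing_keys)
    for record in records:
        key = record.get(key_field)
        if key not in seen:
            seen.add(key)   # also dedupes within the file itself
            fresh.append(record)
    return fresh
```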