We have a service that imports a CSV file from the webshop and reads each line to update the corresponding item in our database. It works for almost all products, but it fails for products where some cells in the row, such as the product title or summary, contain commas.
For example, a product summary field may contain commas, like “Main computer for managing, The card takes care of retrieving and sending data up in the Cloud, In addition, the external slave controls cards such as general output cards with relays or transistors.”
The field gets split at each comma, which throws off the column index, so we read the wrong value. The CSV file is provided by the server, so we can’t edit it. Let’s discuss how to handle this in our code so that reading the CSV file does not split quoted fields on their commas.
CSV File:
title,cost_price,summary
Chip controller,2500.0,"System is a flexible system, that makes it easy to manage all the technical installations in your home or building."
Master controller,5500.0,"Master controller for larger installations with integrated: MODBUS, M-BUS, CANBUS and wireless communication interfaces. "
CANBUS,6000.0,"various technical installations in the home or office can easily be controlled"
The CSV file has 3 columns, and one of the columns contains text with commas in it, for example “System is a flexible system, that makes it easy to manage all the technical installations in your home or building.”
When the file is read with a plain split on commas, that column overflows into extra fields and the data that follows ends up in the wrong place.
Reading large CSV files with strings containing commas in some fields
I searched for ways to read CSV files where values contain a comma, and I’m going to discuss some methods. We want output like below.
Output:
Name:"Chip controller",cost_price:"2500.0",summary:"System is a flexible system, that makes it easy to manage all the technical installations in your home or building."
Name:"Master controller",cost_price:"5500.0",summary:"Master controller for larger installations with integrated: MODBUS, M-BUS, CANBUS and wireless communication interfaces."
Name:"CANBUS",cost_price:"6000.0",summary:"various technical installations in the home or office can easily be controlled"
Model class for parsing csv file
public class Product
{
    public string title { get; set; }
    public string cost_price { get; set; }
    public string summary { get; set; }
}
Method 1: Microsoft.VisualBasic.FileIO.TextFieldParser
Using Microsoft.VisualBasic.FileIO.TextFieldParser we can easily parse our CSV file. TextFieldParser also handles delimited files whose values are enclosed in quotes.
TextFieldParser reads delimited files: we specify a delimiter string, such as a comma or any other character, and then read the fields of every line. Using this class, I read a large .csv file with 50000 rows and the results looked fine.
// To add Microsoft.VisualBasic.FileIO, right-click on your project and select Add Reference...
// In the Reference Manager, expand Assemblies and select Framework. Then check the box for Microsoft.VisualBasic and click OK.
// Required usings: System.Collections.Generic, System.Diagnostics, Microsoft.VisualBasic.FileIO
public void ParseCSVFileUsingTextFieldParser()
{
string csv = @"E:\MyJson\cs.csv";
TextFieldParser csvParser = new TextFieldParser(csv);
csvParser.HasFieldsEnclosedInQuotes = true;
csvParser.SetDelimiters(",");
string[] values;
List<Product> products = new List<Product>();
int i = 0;
while (!csvParser.EndOfData)
{
values = csvParser.ReadFields();
if (i > 0) //to avoid reading header
{
Product product = new Product();
product.title = values[0];
product.cost_price = values[1];
product.summary = values[2];
products.Add(product);
}
i++;
}
csvParser.Close();
foreach (var d in products)
{
Debug.Print(($@"Name:""{d.title}"",cost_price:""{d.cost_price}"",summary:""{d.summary}"""));
}
}
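To try the parser without a file on disk, the same idea can be sketched against an in-memory string. This is a minimal sketch, not the author's code: the class name `TextFieldParserDemo` and the helper `ParseRows` are hypothetical names introduced for illustration, and it assumes the project can reference Microsoft.VisualBasic. It relies on the `TextFieldParser(TextReader)` constructor and wraps the parser in a `using` block so it is disposed even if parsing throws.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TextFieldParserDemo
{
    // Hypothetical helper: parses CSV text whose first row is a header
    // and returns the remaining rows as arrays of fields.
    public static List<string[]> ParseRows(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Commas inside quoted values stay part of the field.
            parser.HasFieldsEnclosedInQuotes = true;

            bool isHeader = true;
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (isHeader)
                {
                    isHeader = false; // skip the header row
                    continue;
                }
                rows.Add(fields);
            }
        }
        return rows;
    }

    public static void Main()
    {
        // Shortened sample data in the same shape as the CSV file above.
        string csvText = string.Join("\n", new[]
        {
            "title,cost_price,summary",
            "Chip controller,2500.0,\"System is a flexible system, that makes it easy\"",
            "CANBUS,6000.0,\"various technical installations\""
        });

        foreach (string[] row in ParseRows(csvText))
        {
            Console.WriteLine($"{row.Length} fields; summary = {row[2]}");
        }
    }
}
```

Each data row should come back with three fields, with the comma inside the quoted summary left intact and the surrounding quotes removed.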
Method 2: LINQtoCSV package
Install the LINQtoCSV package in your project and write the code like below. LINQtoCSV handles data fields that contain commas and line breaks, and in addition to commas, most delimiting characters can be used as the separator.
// Requires the LINQtoCSV NuGet package (using LINQtoCSV;)
public void ParseCSVUsingLINQtoCSV()
{
    string csv = @"E:\MyJson\cs.csv";
    CsvContext cc = new CsvContext();
    CsvFileDescription inputFileDescription = new CsvFileDescription
    {
        SeparatorChar = ',',
        FirstLineHasColumnNames = true,
        IgnoreUnknownColumns = true
    };
    IEnumerable<Product> products = cc.Read<Product>(csv, inputFileDescription);
    foreach (var d in products)
    {
        Debug.Print($@"Name:""{d.title}"",cost_price:""{d.cost_price}"",summary:""{d.summary}""");
    }
}
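Method 3: Using Regex and JsonConvert
We can also use a regex to read cells whose values contain commas. The regex splits only on commas that are followed by an even number of double quotes, that is, commas outside a quoted value. Look at the code below.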
// Requires the Newtonsoft.Json package (using Newtonsoft.Json;) and using System.Text.RegularExpressions;
public void ParseCsvUsingRegex()
{
    string csv = @"E:\MyJson\cs.csv";
    List<Product> products = new List<Product>();
    // Matches a comma only when an even number of quotes follows it
    Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
    using (var streamReader = File.OpenText(csv))
    {
        int i = 0;
        while (!streamReader.EndOfStream)
        {
            var line = streamReader.ReadLine();
            if (!string.IsNullOrEmpty(line))
            {
                string[] values = CSVParser.Split(line);
                if (i > 0) // skip the header row
                {
                    Product product = new Product();
                    product.title = values[0];
                    product.cost_price = values[1];
                    // The summary is still wrapped in double quotes; JsonConvert strips them
                    product.summary = JsonConvert.DeserializeObject<string>(values[2]);
                    products.Add(product);
                }
            }
            i++;
        }
    }
}
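The regex split can also be checked on a single line without a file or Json.NET. The following is a minimal sketch under stated assumptions: `QuotedCsvSplitDemo`, `SplitLine` and `Unquote` are hypothetical names introduced here, and the quote stripping is a hand-written stand-in for the `JsonConvert` call above that follows the common CSV convention of writing a literal quote as two double quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedCsvSplitDemo
{
    // Same pattern as in the method above: a comma followed by an even
    // number of double quotes, i.e. a comma outside a quoted value.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    // Hypothetical helper: splits one CSV line and strips the wrapping quotes.
    public static string[] SplitLine(string line)
    {
        string[] fields = CommaOutsideQuotes.Split(line);
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = Unquote(fields[i]);
        }
        return fields;
    }

    // Removes one pair of surrounding quotes and turns "" back into ".
    private static string Unquote(string field)
    {
        if (field.Length >= 2 && field[0] == '"' && field[field.Length - 1] == '"')
        {
            return field.Substring(1, field.Length - 2).Replace("\"\"", "\"");
        }
        return field;
    }

    public static void Main()
    {
        string line = "Chip controller,2500.0,\"System is a flexible system, that makes it easy\"";
        string[] fields = SplitLine(line);

        Console.WriteLine(fields.Length); // prints 3
        foreach (string field in fields)
        {
            Console.WriteLine(field);
        }
    }
}
```

Because this approach reads the file line by line, it cannot handle a quoted value that contains a line break; for such files a real CSV parser like the ones in Method 1 or Method 2 is the safer choice.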
A CSV is a comma-separated values file, which allows data to be saved in a plain tabular format. CSVs look like an ordinary spreadsheet but with a .csv extension.
CSV files can be used with almost any spreadsheet program, like Microsoft Excel or Google Spreadsheets. They differ from other spreadsheet file types in that a file can hold only one sheet and cannot store cell, column, or row formatting. Likewise, you can’t save formulas in this format.
These files serve a wide range of business needs. For instance, they help organizations export large amounts of data to a more focused database.
CSV files are plain-text files, making them simple for a website developer to create.
Since they’re plain text, it’s easy to import them into a spreadsheet or another storage database, no matter which software you’re using.
Saving CSV files is fairly simple; you just have to know where to change the file type.
Under the “File name” field in the “Save As” dialog, you can choose “Save as type” and change it to “CSV (comma delimited) (*.csv)”. Once that option is selected, you are on your way to quicker and simpler data organization. It should be no different on Apple and Microsoft operating systems.
In the world of online commerce, one of your main goals is to reach a large number of customers. Since CSV files are easy to organize, e-commerce business owners can work with these files in more than one way. CSV files are mostly used for importing and exporting important information, like customer or order data, to and from your database.