Questions tagged [csv-simple]

Text files containing simple comma-separated values. Use this tag for data that is a subset of real CSV, with no newlines inside (quoted) field content, so that it can be handled with line-oriented tools without the need for a real parser.

CSV files come in many varieties and are not always comma-separated. A full-fledged CSV file cannot be handled with a simplistic line-oriented approach, because rows can span multiple lines.

If the data is simplified CSV, at a minimum no newlines may appear inside row content, and it might not contain any quoted data at all (e.g. if the separator character never appears in any cell content).

Describe the exact content (separator character, presence of quotes, possible cell content), since a few example lines say nothing about the next line in the file.

Common tools for simplified CSV are grep, sed and awk.
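For instance, when no field can contain the separator or a quote, awk alone can extract a column from such data; the sample data and column choice here are illustrative:

```shell
# Extract the second column of a simple CSV (no quoted fields, no embedded commas).
printf 'alice,30\nbob,45\n' | awk -F, '{print $2}'
```

This is exactly the kind of task that needs a real parser the moment a field may contain a quoted comma.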

For real, non-subset CSV data, a parser is needed. Questions relating to that (and only those) should use the tag. See Is there a robust command line tool for processing csv files?

106 questions
9
votes
4 answers

How to insert CSV data into an SQLite table via a shell pipe?

I have written a program that outputs the result to the standard output in strict pure CSV form (every line represents a single record and contains the same set of comma-separated fields, fields only contain lowercase English letters, numbers and…
Ivan
  • 17,368
  • 35
  • 93
  • 118
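One common approach (a sketch, not necessarily the accepted answer): the sqlite3 shell can import CSV from standard input via `/dev/stdin`. The database, table and column names here are assumptions:

```shell
# Create the target table, then pipe CSV rows into it with the
# sqlite3 shell's .import command, reading from /dev/stdin.
sqlite3 people.db 'CREATE TABLE people(name TEXT, age INT);'
printf 'alice,30\nbob,45\n' |
  sqlite3 -csv people.db '.import /dev/stdin people'
sqlite3 people.db 'SELECT COUNT(*) FROM people;'
```

Creating the table first matters: if it does not exist, `.import` would treat the first data line as column names.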
8
votes
1 answer

Print each field of CSV on newline without knowing number of fields

I was playing with IFS today and created a quick text file with a list of numbers separated by commas on 1 line. 1,2,3,4,5 I then tried to write a script to print each number on a newline. I was able to make it work, but I had to know how many…
Keith Shannon
  • 83
  • 1
  • 1
  • 6
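When no field can contain the separator, this needs no knowledge of the field count at all; a sketch with `tr` turns each comma into a newline:

```shell
# Replace every comma with a newline; works for any number of fields.
printf '1,2,3,4,5\n' | tr ',' '\n'
```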
7
votes
5 answers

Split a CSV file based on second column value

I am using Ubuntu and I want to split my CSV file into two CSV files based on the value in the second column (age). The first file for patients under 60 (<60) and the second for patients 60 and over (>=60). For example, if I have the following…
Solomon123
  • 123
  • 5
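A sketch in awk, assuming a header line and that age is the second comma-separated field; the file names are made up:

```shell
printf 'name,age\nann,45\nbob,72\n' > patients.csv
# Route each data row to one of two output files based on the age column.
awk -F, 'NR == 1 {next}                        # skip the header line
         $2 < 60 {print > "under60.csv"; next}
         {print > "over60.csv"}' patients.csv
cat under60.csv   # prints: ann,45
```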
6
votes
7 answers

How to remove duplicate value in a tab-delimited text file

I have a tab delimited column text like below A B1 B1 C1 B B2 D2 C C12 C13 C13 D D3 D5 D9 G F2 F2 how could I convert the above table like below A B1 C1 B B2 D2 C …
desu
  • 119
  • 6
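One way to drop repeated values within each tab-delimited row, sketched in awk (`split("", seen)` is the portable idiom for clearing the array on every line):

```shell
printf 'A\tB1\tB1\tC1\nB\tB2\tD2\n' |
  awk -F'\t' -v OFS='\t' '{
    split("", seen); out = ""
    for (i = 1; i <= NF; i++)
      if (!seen[$i]++) out = out (out ? OFS : "") $i
    print out
  }'
```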
6
votes
4 answers

Remove entries from one CSV file that are already present in another

I have two files: 'file1' has employee ID numbers, 'file2' has the complete database of the employees. Here is what they look like: file1 123123 222333 file2 111222 Jones Sally 111333 Johnson Roger 123123 Doe John 444555 Richardson George 222333…
pgrason
  • 61
  • 1
  • 1
  • 5
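This is the classic two-pass awk idiom, assuming the ID is the first whitespace-separated field in both files:

```shell
printf '123123\n222333\n' > file1
printf '111222 Jones Sally\n123123 Doe John\n' > file2
# First pass (NR==FNR) stores the IDs from file1;
# second pass prints only file2 lines whose ID was not stored.
awk 'NR == FNR { ids[$1]; next } !($1 in ids)' file1 file2
```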
5
votes
3 answers

Determine maximum column length for every column in a simplified csv-file (one line per row)

To determine the maximum length of each column in a comma-separated csv-file I hacked together a bash-script. When I ran it on a linux system it produced the correct output, but I need it to run on OS X and it relies on the GNU version of wc that…
jpw
  • 153
  • 1
  • 2
  • 5
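A portable awk sketch that sidesteps the GNU `wc` dependency entirely; it assumes one row per line and a constant number of columns:

```shell
printf 'id,name\n1,alice\n22,bo\n' |
  awk -F, '{
    for (i = 1; i <= NF; i++)
      if (length($i) > max[i]) max[i] = length($i)
  } END { for (i = 1; i <= NF; i++) print "column " i ": " max[i] }'
```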
4
votes
3 answers

How to do calculation for each row

I had a csv file of data such like that when read into shell: name,income,reward,payment Jackson,10000,2000,1000 Paul,2500,700,200 Louis,5000,100,1800 and I want to find the net earning for each person, use formula: "net =…
Polsop
  • 43
  • 4
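The formula is truncated in the excerpt, so assume for illustration that net = income + reward - payment; an awk sketch that appends the computed column:

```shell
printf 'name,income,reward,payment\nJackson,10000,2000,1000\n' |
  awk -F, 'NR == 1 { print $0 ",net"; next }
           { print $0 "," $2 + $3 - $4 }'
```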
4
votes
3 answers

How to add column when the number of columns in file is 2

I am trying to write awk for a file I have. Example of the dataset is S,CV0110,1235 S,1234 D,CQ120,3245 P,7894 Desired outcome is as follows (added empty field when the number of fields in a row is 2) S,CV0110,1235 S,,1234 D,CQ120,3245 P,,7894 I…
mike
  • 69
  • 2
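Since the desired output inserts the empty field right after the first one, one awk sketch simply doubles the first comma on two-field rows:

```shell
printf 'S,CV0110,1235\nS,1234\n' |
  awk -F, 'NF == 2 { sub(/,/, ",,") } 1'
```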
4
votes
4 answers

Double quote value assignments stored in a CSV?

I have a file that contains text as follows: dt=2016-06-30,path=path1,site=US,mobile=1 dt=2016-06-21,path=path2,site=UK,mobile=0 I want to convert it to text with double-quoted values in the key-value pairs, like…
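A sed sketch, assuming values never contain commas or equals signs:

```shell
# Wrap everything between '=' and the next ',' (or end of line) in double quotes.
printf 'dt=2016-06-30,path=path1,site=US,mobile=1\n' |
  sed 's/=\([^,]*\)/="\1"/g'
```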
4
votes
1 answer

AWK, Sum of category

I have tons of CSV files with similar content. The values are usually comma separated and they look like this. product_a, domestic, 500 product_a, abroad, 15 product_b, domestic, 313 product_b, abroad, 35 product_c, domestic, …
Je.dno
  • 43
  • 5
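A per-key running sum in awk; the field separator `, *` also swallows the spaces after each comma shown in the sample:

```shell
# Sum the third field per category; note that "for (k in sum)"
# iterates in an unspecified order.
printf 'product_a, domestic, 500\nproduct_a, abroad, 15\nproduct_b, domestic, 313\n' |
  awk -F', *' '{ sum[$2] += $3 } END { for (k in sum) print k, sum[k] }'
```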
4
votes
6 answers

Count unique associated values in awk (or perl)

I've already found "How to print incremental count of occurrences of unique values in column 1", which is similar to my question, but the answer isn't sufficient for my purposes. First let me just illustrate what I want to do: # Example input apple …
Wildcard
  • 35,316
  • 26
  • 130
  • 258
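A two-array awk idiom sketches this: one array deduplicates (key, value) pairs, the other counts distinct values per key:

```shell
printf 'apple red\napple red\napple green\npear green\n' |
  awk '!seen[$1 SUBSEP $2]++ { count[$1]++ }
       END { for (k in count) print k, count[k] }'
```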
3
votes
1 answer

Why is column adding a newline in the middle of my row where one is not present in the original data?

I'm analyzing some packetfilter logs and wanted to make a nice table of some output, which normally works fine when I use column -t. I can't use a tab as my output field separator (OFS) in this case because it jacks up the multi-word string fields…
Dan
  • 396
  • 4
  • 14
3
votes
3 answers

Count the number of occurrences of a column value in a TSV file with AWK

I have a TSV tab-separated file with 3 cols: ID\tTEXT\tTYPE To print the TYPE column I do cat /dataset.csv | awk -F $'\t' '{print $3}' Those values are an enumeration of values like {CLASS_A,CLASS_B,CLASS_C}, etc. I need an inline way with AWK to…
loretoparisi
  • 287
  • 6
  • 14
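Counting occurrences of each third-column value can be sketched in a single awk pass, without the `cat`:

```shell
printf 'i1\ttext\tCLASS_A\ni2\ttext\tCLASS_B\ni3\ttext\tCLASS_A\n' |
  awk -F'\t' '{ count[$3]++ } END { for (v in count) print v, count[v] }'
```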
3
votes
1 answer

Command to "fill down" columns in text file, a la the Excel fill down function

I have a text file with rows, and columns within those rows. I want to, essentially, replicate the Excel "fill down" function. In other words, if there is a blank "cell" on a line, it will look to the line above it and fill down the value in the…
Simonmdr
  • 141
  • 1
  • 1
  • 6
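A fill-down sketch in awk for tab-separated columns: remember the last non-empty value per column and substitute it whenever a cell is empty:

```shell
printf 'a\tx\n\ty\n' |
  awk -F'\t' -v OFS='\t' '{
    for (i = 1; i <= NF; i++) {
      if ($i == "") $i = prev[i]   # empty cell: copy the value from above
      prev[i] = $i
    }
  } 1'
```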
3
votes
1 answer

Create table with frequency of unique names retrieved from multiple .csv files

I have 32 CSV files containing fetched information from a database. I need to make a frequency table in TSV/CSV format, where the names of the rows are the name of each file, and the names of the columns are the unique names found throughout the…
Lucia O
  • 407
  • 1
  • 4
  • 7
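A sketch of one possible shape for this, assuming the names sit in the first column of each file; `FILENAME` labels the rows, and awk's `for (k in arr)` iterates in an unspecified order, so the column order would need a separate sort step:

```shell
dir=$(mktemp -d)
printf 'apple\napple\npear\n' > "$dir/f1.csv"
printf 'pear\n'               > "$dir/f2.csv"
# Rows = input files, columns = unique names, cells = per-file counts.
awk -F, '{ count[FILENAME SUBSEP $1]++; names[$1]; files[FILENAME] }
  END {
    printf "file"
    for (n in names) printf "\t%s", n
    print ""
    for (f in files) {
      printf "%s", f
      for (n in names) printf "\t%d", count[f SUBSEP n]
      print ""
    }
  }' "$dir"/f1.csv "$dir"/f2.csv
```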