Questions tagged [csvkit]

8 questions
6
votes
2 answers

Deduplicate CSV rows based on a specific column, with a CSV parser

I searched for this task and found the following older questions: "Removing Duplicates from a CSV based on specified columns" and "Identify unique records on CSV based on specific columns". But I can't use awk because my data is a complex CSV file with…
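For readers who want a parser-based approach to the task in this excerpt, here is a minimal sketch using Python's standard csv module rather than csvkit itself. The function name `dedupe_by_column`, the sample data, and the choice to keep the first row seen for each key are assumptions made for illustration; the question's actual data is not shown.

```python
import csv
import io

def dedupe_by_column(src, dst, column):
    """Copy CSV rows from src to dst, keeping only the first row per value of `column`."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    seen = set()
    for row in reader:
        key = row[column]
        if key not in seen:
            seen.add(key)
            writer.writerow(row)

# Hypothetical sample: quoted fields containing a comma and an embedded newline,
# which is the kind of input that line-oriented tools such as awk mishandle.
sample = 'id,note\n1,"a, b"\n2,"line one\nline two"\n1,"dup"\n'
out = io.StringIO()
dedupe_by_column(io.StringIO(sample), out, "id")
result = out.getvalue()
```

Because the csv module parses quoting, the embedded newline stays inside its field and the third data row is dropped as a duplicate of `id` 1.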
6
votes
3 answers

Truncate a CSV column using csvkit

How can I truncate the length of a column using csvkit? The definition looks like this: Column 1: no length restriction Column 2: This should properly handle escaped (quoted) columns and new lines. For example: First…
patstuart
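The truncation described in this excerpt can be sketched with Python's csv module, which handles quoted fields and embedded newlines. This is not a csvkit command; `truncate_column`, the zero-based `column_index`, and `max_len` are hypothetical names, and the length limit is left as a parameter because the excerpt does not state it. The sketch assumes the first row is a header that should pass through unchanged.

```python
import csv
import io

def truncate_column(src, dst, column_index, max_len):
    """Truncate one column (zero-based index) to max_len characters, leaving the header as is."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to write
    writer.writerow(header)
    for row in reader:
        if column_index < len(row):
            row[column_index] = row[column_index][:max_len]
        writer.writerow(row)

# Hypothetical sample with a quoted comma and a quoted newline.
sample = 'name,comment\nFirst,"hello, world"\nSecond,"two\nlines here"\n'
out = io.StringIO()
truncate_column(io.StringIO(sample), out, column_index=1, max_len=5)
truncated = out.getvalue()
```

Truncation counts characters of the parsed field value, so quotes added for escaping do not count toward the limit.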
2
votes
2 answers

How to install csvkit in bash

Kusalananda nicely recommends using csvformat from csvkit to format jq @csv output as CSV without double quotes ("), in an answer about how to parse JSON with jq. That answer does not seem to involve the use of Python. But the csvkit installation tutorial…
Johan
1
vote
1 answer

How can I separate these two columns in this csv file in Linux/Bash?

I am looking to separate these two columns, each into their own separate text files. This data is from a csv file on Kaggle that contains Titanic passenger data. The first column is the number of passengers, and the second column is the age of those…
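One way to approach the split described here, sketched in Python with the standard csv module instead of shell tools. The function name `split_two_columns`, the `has_header` flag, and the sample values are assumptions; the excerpt does not show the real file or say whether it has a header row.

```python
import csv
import io

def split_two_columns(src, first_out, second_out, has_header=True):
    """Write the first column to first_out and the second to second_out, one value per line."""
    reader = csv.reader(src)
    if has_header:
        next(reader, None)  # skip the header row if present
    for row in reader:
        if len(row) >= 2:  # ignore blank or short rows
            first_out.write(row[0] + "\n")
            second_out.write(row[1] + "\n")

# Hypothetical sample data standing in for the passenger-count and age columns.
sample = "count,age\n3,22\n5,38\n"
counts, ages = io.StringIO(), io.StringIO()
split_two_columns(io.StringIO(sample), counts, ages)
```

For real files, the three arguments would be file handles from `open(...)`, with `newline=""` on the input as the csv module documentation recommends.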
0
votes
1 answer

Syntactical error with csvsql query?

I have a CSV file attributes.csv from which I want to retrieve all records to a new file attributes_withoutPIDate.csv, excluding records for which the Name column has "PI Date" as the value. Running csvsql in this manner: csvsql -d ',' -I --query…
ptrcao
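The filter this question describes can be illustrated without csvsql, using Python's csv and sqlite3 modules. This is only a sketch of the intended query, not a fix for the truncated command above. The table name `attributes` mirrors the file name, `rows_excluding` is a hypothetical helper, the sample rows are invented, and every row is assumed to have the same number of fields as the header.

```python
import csv
import io
import sqlite3

def quote_ident(name):
    """Quote an SQL identifier so headers with spaces are safe to use."""
    return '"{}"'.format(name.replace('"', '""'))

def rows_excluding(src, column, value):
    """Return the header plus all rows whose `column` differs from `value`."""
    reader = csv.reader(src)
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE attributes ({})".format(", ".join(quote_ident(h) for h in header)))
    placeholders = ", ".join("?" for _ in header)
    conn.executemany("INSERT INTO attributes VALUES ({})".format(placeholders), reader)
    cur = conn.execute(
        "SELECT * FROM attributes WHERE {} <> ? ORDER BY rowid".format(quote_ident(column)),
        (value,),
    )
    kept = [header] + [list(r) for r in cur]
    conn.close()
    return kept

# Hypothetical sample rows; the real attributes.csv is not shown in the excerpt.
sample = "Name,Value\nPI Date,2020-01-01\nColor,red\n"
filtered = rows_excluding(io.StringIO(sample), "Name", "PI Date")
```

Passing the excluded value as a bound parameter avoids the shell and SQL quoting problems that a literal containing a space can cause.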
0
votes
2 answers

Concatenating columns of the same csv file to create a new column with a new heading

What I have is a CSV file to this effect:

+------------+--------------+
| Category I | Sub-Category |
+------------+--------------+
| 1144       | 128          |
| 1144       | 128          |
| 1000       | 100          |
| 1001       | 100…
ptrcao
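A minimal sketch of adding a concatenated column, using Python's csv module. The new heading `Combined`, the empty separator, and the function name `add_concatenated_column` are assumptions, since the excerpt is cut off before it states the desired heading or format. The sample rows reuse values visible in the table above.

```python
import csv
import io

def add_concatenated_column(src, dst, columns, new_heading, sep=""):
    """Append a column whose value joins the named columns of each row."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or []) + [new_heading]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[new_heading] = sep.join(row[c] for c in columns)
        writer.writerow(row)

sample = "Category I,Sub-Category\n1144,128\n1000,100\n"
out = io.StringIO()
# "Combined" is a placeholder heading chosen for this example.
add_concatenated_column(io.StringIO(sample), out, ["Category I", "Sub-Category"], "Combined")
combined = out.getvalue()
```

Selecting columns by header name rather than position keeps the sketch working if the column order in the file changes.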
0
votes
1 answer

How to write a csvcut script to cut column by header with multiple files?

Since csvcut (from csvkit) does not take more than a single file at a time, I need to write a script to process multiple files using it. The first parameter should be the delimiter, the second parameter should be the header of the column to extract,…
amV
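The script described in this excerpt could be sketched in Python rather than as a shell wrapper around csvcut. The argument order follows the excerpt (delimiter first, then the column header); treating all remaining arguments as input file paths is an assumption, because the excerpt is truncated at that point. `extract_column` and `main` are hypothetical names.

```python
import csv
import io
import sys

def extract_column(src, delimiter, header):
    """Return the values of the column named `header` from one CSV stream."""
    reader = csv.DictReader(src, delimiter=delimiter)
    if header not in (reader.fieldnames or []):
        raise KeyError("column {!r} not found".format(header))
    return [row[header] for row in reader]

def main(argv):
    """argv: delimiter, header, then one or more file paths (assumed layout)."""
    delimiter, header, *paths = argv
    writer = csv.writer(sys.stdout, lineterminator="\n")
    writer.writerow([header])
    for path in paths:
        with open(path, newline="") as src:
            for value in extract_column(src, delimiter, header):
                writer.writerow([value])

# Hypothetical inputs: the same header sits in a different position in each file.
file_a = "id;name\n1;Ann\n"
file_b = "name;id\nBob;2\n"
names = extract_column(io.StringIO(file_a), ";", "name") + \
        extract_column(io.StringIO(file_b), ";", "name")
```

To use it as a command-line script, call `main(sys.argv[1:])` at the bottom of the file; the output header is written once, followed by the values from every input file.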
0
votes
1 answer

CSV fields max length error and setting quoting=csv.QUOTE_NONE

After running csvcut on a comma-delimited .csv file: [root@server files]# csvcut -c title,mpn,overview,techspecs2,image_carousel_elargesrc syn_multi-image.csv > syn_scraped_cut.csv I get the error: CSV contains fields longer than maximum length of…
ptrcao
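The error in this excerpt resembles the field size limit enforced by Python's csv module, on which csvkit is built. Assuming that is the cause here, the sketch below shows how the limit can be raised in plain Python with `csv.field_size_limit`. The helper name `read_with_large_fields`, the 10,000,000 character limit, and the sample data are illustrative choices, not values taken from the question.

```python
import csv
import io

def read_with_large_fields(src, limit=10_000_000):
    """Parse CSV from src with a temporarily raised per-field size limit."""
    previous = csv.field_size_limit(limit)  # returns the old limit
    try:
        return list(csv.reader(src))
    finally:
        csv.field_size_limit(previous)  # restore the earlier setting

# Hypothetical row whose second field is longer than the module's default limit.
big = "x" * 200_000
rows = read_with_large_fields(io.StringIO("title,overview\nwidget," + big + "\n"))
```

A field that is unexpectedly huge is often a symptom of an unbalanced quote swallowing many lines, so it is worth checking the input before simply raising the limit.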