Questions tagged [miller]

Questions about the `mlr` ("Miller") name-indexed data processing tool.

Miller (mlr) is a command-line utility for processing name-indexed data such as CSV files, with functions equivalent to sed, awk, cut, join, sort, etc. It can also convert between formats (e.g., CSV to JSON). It preserves headers when sorting or reversing, and streams data where possible so its memory requirements stay small.

Homepage: https://miller.readthedocs.io/
Code: https://github.com/johnkerl/miller

9 questions
6 votes, 2 answers

Deduplicate CSV rows based on a specific column, with a CSV parser

I searched for this task and found the following older questions: "Removing Duplicates from a CSV based on specified columns" and "Identify unique records on CSV based on specific columns". But I can't use awk because my data is a complex CSV file with…
6 votes, 1 answer

How can I call an external command from within Miller (mlr)’s DSL?

Suppose I have the following CSV:
    $ cat test.csv
    id,domain
    1,foo.com
    2,bar.com
Using `mlr put`, I can easily map any function over a field in the CSV, as long as I can define it in the Miller DSL. So, for example, `mlr --csv put '$id = $id + 1'` will…
sjy
5 votes, 3 answers

Output a header label in data field in miller

Given file.csv:
    a,b,c
    1,2,3
How can mlr be made to output:
    a,b,c
    1,2,c
using the label name of $c, without knowing in advance that $c contains the letter "c"? Note: a correct answer must use mlr only.
agc
2 votes, 3 answers

Extracting domains from a CSV URL column using Miller

Having CSV content similar to…
T145
1 vote, 2 answers

How could I (painlessly) split or reverse "Last, First" within a record in Miller?

I have a tab-delimited file where one of the columns is in the format "LastName, FirstName". What I want to do is split that record out into two separate columns, last, and first, use cut or some other verb(s) on that, and output the result to…
TheDudeAbides
0 votes, 1 answer

Forcing miller to read data as string in conversion to JSON

In the following MWE, `echo x="1e2" | mlr --ojson cat`, my intention is for miller to generate a one-element JSON array containing the object `{"x": "1e2"}`. The object actually returned (within the array) is instead `{"x": 1e2}`, where the value is taken…
Marcos
0 votes, 1 answer

Finding whether a string is a substring of another with Miller/mlr's DSL

How do I find whether a column of a CSV contains another using mlr's DSL? In other words, I have a CSV
    a,b
    test and,test and more
and want to find out whether 'test and' (a) is included in 'test and more' (b).
E Lisse
-1 votes, 1 answer

Adding an empty column to a CSV file with Miller

I have a CSV file that looks like this:
    0
    1
    2
    3
I'd like to use Miller to append an empty column x to every row so that the output file looks like this:
    0,x
    1,
    2,
    3,
How do I do that?
Mateusz Piotrowski
-2 votes, 1 answer

Separating CSV (one column) into many columns on delimiter (comma)

I have a CSV with ~50 comma-separated values in one column that I want to separate into separate columns. The header is line 1. This should be really simple, and I've tried a lot surrounding awk and mlr but haven't been able to adapt anything I've…