
I have a file in the following format, with a leading space before each non-empty line:

 "Western Overseas",
 "Western Overseas",
 "^",
 "--",
 "^",
 "--",
 "--",
 null,
 24995,
 9977,
 "CR",

 "Western Refrigeration Private Limited",
 "Western Refrigeration Private Limited",
 "[ICRA]A",
 "--",
 "[ICRA]A1",
 "--",
 "Stable",
 null,
 14951,
 2346,
 "CR",

I would like to convert it to a CSV file with format:

 "Western Overseas","Western Overseas","^","--","^","--","--",null,24995,9977,"CR"
 "Western Refrigeration Private Limited","Western Refrigeration Private Limited","[ICRA]A","--","[ICRA]A1","--","Stable",null,14951,2346,"CR"

I'm trying to use tr but am having trouble, since it either prints all output on one line or seems to replace newlines with a double newline. Any help is appreciated.

Kamil Maciorowski
user362513
  • (1) Edited to reflect two lines instead of 3. (4) There are no leading spaces on the empty line, only at the beginning of lines with actual text. – user362513 Jul 16 '19 at 07:47
  • No space. I.e. `foo,bar` is required for the output. – user362513 Jul 16 '19 at 07:52
  • Please provide the first few lines of a hexdump from the actual file. Something like `xxd test.txt | head -n 12`. – Kamil Maciorowski Jul 16 '19 at 08:41
  • `00000000: 2022 5765 7374 6572 6e20 4f76 6572 7365 "Western Overse 00000010: 6173 222c 0a20 2257 6573 7465 726e 204f as",. "Western O 00000020: 7665 7273 6561 7322 2c0a 2022 5e22 2c0a verseas",. "^",.` – user362513 Jul 16 '19 at 08:48
  • `awk '$1=$1' RS=',\n\n' infile` if you don't mind last comma for last line. – αғsнιη Jul 16 '19 at 13:17

2 Answers


An awk solution is

awk '{if(NF){gsub(/^ |,$/,""); printf "%s%s", c, $0; c=","}else{printf "\n"; c=""}};END{printf "\n"}'

expanded with comments:

{
    if(NF) { # if the line isn't empty
        gsub(/^ |,$/,""); # remove the leading space and the trailing comma
        printf "%s%s", c, $0; # print the separator and the line (without a newline); the explicit format keeps any % in the data literal
        c="," # set c to add a comma before the next field
    } else {
        printf "\n"; # empty line, output a newline
        c="" # don't print a comma before the next entry
    }
}
END {
    printf "\n" # finish off with a newline
}
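
To try the idea on a small input, here is a minimal sketch of the same approach wrapped in a shell function. The name `join_records` and the sample data are made up for illustration. It also differs from the command above in two ways that are assumptions about what is wanted: a run of several blank lines ends a record only once, and the final newline is printed only if the last record is still open.

```shell
#!/bin/sh
# join_records: hypothetical wrapper name, not part of the answer above.
join_records() {
    awk '
        NF {                        # non-empty line: one field of the current record
            gsub(/^ |,$/, "")       # strip the leading space and the trailing comma
            printf "%s%s", sep, $0  # explicit format, so "%" in the data is printed literally
            sep = ","
            next
        }
        sep != "" { print ""; sep = "" }  # blank line: close the open record, once
        END { if (sep != "") print "" }   # close the last record if no blank line followed it
    ' "$@"
}

# Made-up sample: two records separated by one empty line.
printf '%s\n' ' "A & Co",' ' "100%",' ' null,' ' 42,' '' ' "B Ltd",' ' "--",' ' 7,' | join_records
```

With this sample, each record should come out as one comma-separated line, without the leading spaces or the trailing commas. With a real file, pass its name as an argument, e.g. `join_records infile > outfile.csv`.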
Dabombber
<file sed '
   :start
   s/\n$//
   t
   s/\n //
   N
   b start
  ' | sed 's/,$//'

The first sed loops (:start, b start) and appends lines to its pattern space (N) until a newline at the very end is found and deleted (s/\n$//). A trailing newline means an empty line was just read, so sed leaves the loop at that point (t) and prints the joined record. On every other iteration, the remaining newline and the space that follows it are removed (s/\n //), which concatenates the lines.

The second sed removes trailing commas.
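
The loop can be checked on a small input. The sketch below wraps the same pipeline in a function; the name `join_records_sed` and the sample data are made up for illustration, and it assumes GNU sed (other implementations may treat `N` at end of input, or the state tested by `t`, differently).

```shell
#!/bin/sh
# join_records_sed: hypothetical wrapper name around the pipeline above (GNU sed assumed).
join_records_sed() {
    sed '
        :start
        s/\n$//
        t
        s/\n //
        N
        b start
    ' "$@" | sed 's/,$//'
}

# Made-up sample; the final empty argument supplies the blank line that ends the last record.
printf '%s\n' ' "A",' ' "--",' ' 1,' '' ' "B",' ' null,' ' 2,' '' | join_records_sed
```

Because only a newline followed by a space is removed, the space at the very start of each record is kept, so every output line still begins with one space. If that is unwanted, adding `s/^ //` to the second sed should remove it.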

Kamil Maciorowski