Replace characters on specific lines and specific columns

Question

Consider this sample.txt:

ATO   N   X B               
AT    H1  X BT              
ATOM H25  X BAA              
ATOM  H3  X BUTZ              
ATOM  CA  X BAT

I want to replace the X's on lines 2-4 with "A", using awk or any other tool, so the output should be:

ATO   N   X B               
AT    H1  A BT              
ATOM H25  A BAA              
ATOM  H3  A BUTZ              
ATOM  CA  X BAT

I emphasize that X (or its substitute A) is the 11th "entity" (counting characters and spaces) in the line and should stay the 11th "entity" in the output, and every other "entity" should stay in its place as in the original file.

How can I do this? Thanks

"I emphasize that X (or its substitute A) is 11th character in line, and should stay 11th character in output, and every other character should stay on its place" > Then please improve your example to show lines where X is not the 11th character so people can easily check if their answers are adequate. Also, in the title you talk about specific columns, you have to decide which one you want. — Quasímodo, Feb 20 '21 at 18:28
@Quasímodo Thanks for the comment, but the title describes exactly what I needed: replace the 11th character on specific lines. Maybe sample.txt could be more general, but mondo-cane gives a solution which works for that case. — multipole, Feb 20 '21 at 18:53
Well, I realize that column number and character number are probably not the same thing; I treated them as if they were. I still don't know which terminology I should use if I only want to access the 11th "something" (counting characters and spaces). — multipole, Feb 20 '21 at 19:17
You could straighten (?) the file first by replacing one or more blanks with tabs and then treat it as columns, depending on the tools. In AWK you can specify the field separator as a regex, like so: `awk -F'[ \t]+' '{print $2}' foo.dat`, to handle whitespace of different or mixed kinds, but you can't use it if values are missing at defined positions in the file. — user unknown, Feb 20 '21 at 19:44
@userunknown But my problem is that I am not interested in whitespace delimiters. Suppose that line 2 is: ABCDXFGHIJKL. How should I phrase the question if I want to replace the object in 5th place (X) with Y, so that the output is ABCDYFGHIJKL? A general example should, however, include a random number of spaces (represented here as underscores): "A__BX__DFG". How do I ask to replace the 5th object (X) so that the output is "A__BY__DFG"? — multipole, Feb 20 '21 at 20:06
Well, you should write that the position is measured in fixed character steps and present examples which don't look like a table with 4 columns. However, mondo-cane's sed solution looks as if it addresses this setting, doesn't it? — user unknown, Feb 21 '21 at 04:46
All but one of the answers so far, including the one you accepted, will fail if X was a regexp metacharacter (e.g. `*`), or A was a backreference character (`&`), and most will replace X with A regardless of where it occurs rather than replacing the 11th character. It's also not clear what you'd want to do if the 11th character was not X so different answers make different assumptions. — Ed Morton, Feb 21 '21 at 14:25
If you want to replace the 11th character regardless of what character that is then say THAT and show various characters in that position being replaced. If you only want to replace the 11th character if it is `X` then say THAT and again show multiple characters in the 11th position but only the `X`s being replaced. — Ed Morton, Feb 21 '21 at 14:29
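
The two readings that the comments above distinguish can be sketched side by side. This is only an illustration and is not taken from any answer; the function names `replace_nth_any` and `replace_nth_if` are hypothetical, and positions are assumed to be counted in characters starting from 1.

```shell
#!/bin/sh
# Hypothetical helpers: both read stdin, act on lines 2-4, write to stdout.
# Note: awk -v interprets backslash escapes in the values passed in.

# Reading 1: replace the character at position n, whatever it is.
replace_nth_any() {  # usage: replace_nth_any N NEW
    awk -v n="$1" -v new="$2" '
        NR >= 2 && NR <= 4 { $0 = substr($0, 1, n-1) new substr($0, n+1) }
        { print }'
}

# Reading 2: replace the character at position n only when it equals OLD.
replace_nth_if() {   # usage: replace_nth_if N OLD NEW
    awk -v n="$1" -v old="$2" -v new="$3" '
        NR >= 2 && NR <= 4 && substr($0, n, 1) == old {
            $0 = substr($0, 1, n-1) new substr($0, n+1)
        }
        { print }'
}

# The example from the comments, with real spaces instead of underscores.
# Line 1 is outside the range, line 2 has X in 5th place, line 3 has Q there.
printf 'A  BX  DFG\nA  BX  DFG\nA  BQ  DFG\n' | replace_nth_if 5 X Y
```

Because neither helper uses a regular expression or `sub()`, characters such as `*` or `&` need no escaping in either the old or the new value.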

score 3 · Answer 1 · answered Feb 20 '21 at 18:14

With awk:

awk 'BEGIN{ OFS=FS="" } FNR>=2 && FNR<=4 && $11=="X"{ $11="A" }1' sample.txt

Use a null string as input and output field separator and replace the 11th field for the defined record numbers of the input file if the field contains X. Then print the record.

Freddy

Nice answer, just a side note: Empty field separator is unspecified behavior, but works with modern Awks (tested with gawk and mawk). – Quasímodo Feb 20 '21 at 18:17
Thanks for the hint @Quasímodo! Didn't know that this is indeed undefined behaviour according to POSIX. Solution tested with gawk, mawk, busybox awk and macOS awk. – Freddy Feb 20 '21 at 18:31
I'm actually surprised that the awk on MacOS supports splitting into characters given a null FS since it's a BSD awk but anyway - see https://unix.stackexchange.com/a/417122/133219 for a discussion on the subject. – Ed Morton Feb 21 '21 at 14:44
@EdMorton I can confirm that `awk -F ''` issues a warning `awk: field separator FS is empty` and doesn't split the input (`FS` is still set to the default value, process exits with zero status) while setting `FS` e.g. `awk -v FS=` works without warning. Version is `awk version 20070501`. – Freddy Feb 21 '21 at 15:55
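
Since splitting on an empty FS is not specified by POSIX, it may be worth probing the local awk before relying on it. A small sketch, where the function name `awk_splits_chars` is made up for this example:

```shell
#!/bin/sh
# Hypothetical probe: does this awk split a record into single characters
# when FS is the empty string? POSIX leaves the behaviour unspecified.
awk_splits_chars() {
    # "abc" yields NF == 3 when every character becomes its own field.
    nf=$(printf 'abc\n' | awk -v FS= '{ print NF }' 2>/dev/null)
    [ "$nf" = 3 ]
}

if awk_splits_chars; then
    echo "empty FS splits into characters"
else
    echo "empty FS does not split into characters; use substr() instead"
fi
```

The probe sets FS with `-v FS=` rather than `-F ''`, following the observation above that the two forms can behave differently.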

DanieleGrassini · Accepted Answer · 2021-02-21T17:37:19.187

A simple sed command should do the job:

sed '2,4s/ X / A /' your_file


cat foo.txt
ATO   N   X B               
AT    H1  X BT              
ATOM H25  X BAA              
ATOM  H3  X BUTZ              
ATOM  CA  X BAT 

sed '2,4s/ X / A /' foo.txt

ATO   N   X B               
AT    H1  A BT              
ATOM H25  A BAA              
ATOM  H3  A BUTZ              
ATOM  CA  X BAT

As @Quasimodo pointed out, the sed command above will fail if it encounters another sequence like ` X `. Here is a GNU Awk solution instead:

awk 'NR >= 2 && NR <= 4 && $3~/X/ { sub(/X/, "A") } { print }' foo.txt

UPDATE

Many thanks to @Quasimodo for this command:

sed '2,4s/^\(.\{10\}\)X/\1A/'

This ensures that only an X in the 11th character position is replaced.
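
To see why the anchored form matters, here is a quick sketch on a line that contains an earlier ` X `. The sample line is invented for illustration and does not come from the question.

```shell
#!/bin/sh
# Invented test line: " X " occurs at columns 3-5, and another X sits in column 11.
line='AB X  CD  X EF'

# Unanchored: rewrites the first " X " it finds, which is the wrong one here.
printf '%s\n' "$line" | sed 's/ X / A /'
# -> AB A  CD  X EF

# Anchored: skip exactly 10 characters, then require X as the 11th.
printf '%s\n' "$line" | sed 's/^\(.\{10\}\)X/\1A/'
# -> AB X  CD  A EF
```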

guest_7 · Answer 3 · 2021-02-20T23:55:52.933

1

Using awk, not necessarily GNU, we do as shown. First select lines based upon the range and then further refine them by trying a substitution at the 11th character position.

awk '(NR==2),(NR==4) {
    sub(/^.{10}X/, substr($0,1,10) "A")
}1' file

The same thing in perl:

perl -lpe 'substr($_,10,1) =~ s/X/A/ if 2..4' file

And in sed:

sed -e '
  2,4s/./&\n/11
  s/X\n/A/;s/\n//
' file

Input:

cat - <<\! > file
ATO   N   X B               
AT    H1  Q BT              
ATOM H25  X BA
ATOM  H3  X
ATOM  CA  X BAT 
!

Result:

ATO   N   X B               
AT    H1  Q BT              
ATOM H25  A BA
ATOM  H3  A
ATOM  CA  X BAT

edited Feb 20 '21 at 23:55

answered Feb 20 '21 at 23:48

I like your sed script, but why did you add the second command (s/\n//)? – DanieleGrassini Feb 21 '21 at 20:38
@mondo-cane That is to strip away the marker in case the 11th character is not an X. – guest_7 Feb 22 '21 at 04:40
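
The marker technique discussed in these comments can be tried one line at a time. This sketch assumes GNU sed, where `\n` in the replacement text inserts a newline; other sed implementations may treat it differently. The function name `mark_and_replace` is hypothetical, and the `2,4` address is left out so that single lines can be tested.

```shell
#!/bin/sh
# Assumes GNU sed: "\n" in the replacement text inserts a newline.
# Step 1 puts a newline marker after the 11th character.
# Step 2 turns "X<marker>" into "A", consuming the marker.
# Step 3 removes the marker when step 2 did not match.
mark_and_replace() {
    sed -e 's/./&\n/11' -e 's/X\n/A/' -e 's/\n//'
}

printf 'ATOM  H3  X BUTZ\n' | mark_and_replace   # 11th character is X
# -> ATOM  H3  A BUTZ
printf 'AT    H1  Q BT\n'   | mark_and_replace   # 11th character is Q
# -> AT    H1  Q BT
```

Without step 3, the second line would come out split in two at the marker.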

Ed Morton · Answer 4 · 2021-02-21T14:47:09.880

Your question isn't clear but it sounds like this is what you want:

$ awk -v n=11 -v c='A' '(2<=NR) && (NR<=4){$0=substr($0,1,n-1) c substr($0,n+1)} 1' file
ATO   N   X B
AT    H1  A BT
ATOM H25  A BAA
ATOM  H3  A BUTZ
ATOM  CA  X BAT

The above will work using any awk in any shell on every Unix box and replaces the 11th character on lines 2-4 regardless of what the original character is and whether or not it also appears elsewhere in the input, and regardless of what the replacement character is, e.g.:

$ cat file
****************************
****************************
*****************************
******************************
****************

$ awk -v n=11 -v c='&' '(2<=NR) && (NR<=4){$0=substr($0,1,n-1) c substr($0,n+1)} 1' file
****************************
**********&*****************
**********&******************
**********&*******************
****************

No other currently posted answer would do that.

@guest_7 if that can happen then provide requirements for how that should be handled (skip the line, add blanks, exit with an error, whatever...) and tweak the code to handle it however is desired. — Ed Morton, Feb 22 '21 at 14:13
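
One possible policy for lines shorter than n characters, offered only as a sketch because the thread does not settle on one, is to leave such lines untouched. The `length($0) >= n` guard is the only addition to the command above; the function name `replace_nth_guarded` is hypothetical.

```shell
#!/bin/sh
# Sketch of one policy (an assumption, not a stated requirement):
# lines 2-4 that are shorter than n characters pass through unchanged.
replace_nth_guarded() {  # usage: replace_nth_guarded N NEW < input
    awk -v n="$1" -v c="$2" '
        NR >= 2 && NR <= 4 && length($0) >= n {
            $0 = substr($0, 1, n-1) c substr($0, n+1)
        }
        { print }'
}

# Line 2 has only 4 characters, so it is skipped; line 3 is changed.
printf 'ATO   N   X B\nATOM\nATOM H25  X BAA\n' | replace_nth_guarded 11 A
```

Padding short lines with blanks or exiting with an error would be equally defensible choices, depending on the requirements.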

score 0 · Answer 5 · answered Feb 21 '21 at 18:32

awk 'NR >1 && NR <5 {gsub("X","A",$3)}1' filename

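output (modifying $3 makes awk rebuild the record with single-space separators, so the changed lines lose their column alignment):

ATO   N   X B               
AT H1 A BT
ATOM H25 A BAA
ATOM H3 A BUTZ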
ATOM  CA  X BAT
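
One thing to check with any approach that modifies a field: awk then rebuilds the record from its fields joined by OFS (a single space by default), so runs of blanks collapse. A minimal sketch on one of the sample lines, with trailing blanks omitted:

```shell
#!/bin/sh
# Modifying $3 forces awk to rebuild $0 with OFS between the fields.
printf 'AT    H1  X BT\n' | awk '{ gsub("X", "A", $3) } 1'
# -> AT H1 A BT

# Editing $0 by character position instead leaves the spacing untouched.
printf 'AT    H1  X BT\n' | awk '{ $0 = substr($0, 1, 10) "A" substr($0, 12) } 1'
# -> AT    H1  A BT
```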

Praveen Kumar BS
