I'm pretty new to bash scripting, so apologies if this is obvious!

I'm trying to create a bash script to traverse a bunch of files I have of the format ID1.1.fq.stuff, ID1.2.fq.stuff, ID2.1.fq.stuff, ID2.2.fq.stuff .... etc. The script is meant to find files that are paired (both files for ID1, ID2, and so on) and then submit them both together to a program called STAR for downstream processing.

I made the following bash script:

#/!/bin/sh
module load STAR
current_id = ""
current_file = ""
 for fqfile in `ls path/*`; do
  filename = ${fqfile%%.fq*}
  id = ${filename%.*}
  if $id == $current_id; then
   STAR --readFilesIn $current_file $fqfile --outFileNamePrefix ./$id.bam
  else
   current_id = $id
   current_file = $fqfile
  fi
done

When I run it, I get these errors:

[path to $id, without file extensions]: No such file or directory
current_id: command not found
current_file: command not found

What am I doing wrong?

Thank you!

  • Welcome to SE. In bash there must be no spaces between the variable name, the `=`, and the value; they have to be glued together, for example `current_id="$id"`. Also, `ls` is not meant to be used in scripts. So there are quite a few issues with your code at the moment. – Valentin Bajrami Jul 05 '23 at 19:12
  • [Bash pitfall number one](https://mywiki.wooledge.org/BashPitfalls#for_f_in_.24.28ls_.2A.mp3.29). There are other issues, some already noted, others not yet. E.g. `if …; then` expects a command in place of `…`; the command may be `[` with arguments. Your shebang is not a shebang. – Kamil Maciorowski Jul 05 '23 at 19:33
  • Possible duplicate of [Spaces in variable assignments in shell scripts](https://unix.stackexchange.com/q/258727) – Stéphane Chazelas Jul 05 '23 at 19:56
  • Drop the script into https://www.shellcheck.net, read the feedback it gives, then go back and test all parts of the script one line at a time. – ilkkachu Jul 05 '23 at 20:36
  • The first line: `#/!/bin/sh` should be `#!/bin/sh` and if you want to make use of extra features in Bash, it should probably be `#!/bin/bash` or `#!/usr/bin/env bash`. – Sotto Voce Jul 05 '23 at 20:42

1 Answer

I used bash syntax because the question is tagged with bash.

Problems with the original script

  • Iterating over ls output
  • Ignoring non-matching globs
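  • Incorrect shebang (`#/!/bin/sh` instead of `#!/bin/sh` or a bash shebang)
  • Spaces around `=` in variable assignments
  • `if` given a bare string comparison instead of a test command such as `[[ … ]]`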
  • Did not use double quotes to prevent globbing and word splitting
  • Did not quote the right-hand side of == to prevent glob matching
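
The error messages in the question can be reproduced in isolation. The following is a minimal sketch, not the asker's actual script: each faulty line runs in a child `bash -c` so the outer script keeps going, and `path/ID1` is a placeholder value assumed not to exist as a file in the current directory. The exact wording of the messages varies between bash versions.

```bash
#!/usr/bin/env bash
# Keep error messages in English regardless of the user's locale.
export LC_ALL=C

# 1. With spaces around '=', bash treats "current_id" as a command name
#    and passes "=" and "" to it as arguments.
err_assign=$(bash -c 'current_id = ""' 2>&1)
echo "$err_assign"

# 2. `if` runs whatever follows it as a command. Here that is the value
#    of $id, so bash tries to execute path/ID1.
err_if=$(bash -c 'id="path/ID1"; if $id == ""; then :; fi' 2>&1)
echo "$err_if"

# Working forms: no spaces around '=', and [[ ... ]] as the command
# whose exit status `if` checks.
current_id=""
id="path/ID1"
if [[ $id == "$current_id" ]]; then
  result="same"
else
  result="different"
fi
echo "$result"
```

The first message should contain `command not found` and the second `No such file or directory`, which matches the kinds of error reported in the question.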
#!/usr/bin/env bash
 
# Instructs bash to exit immediately if a command returns a non-zero exit status
# (with some exceptions, e.g. commands tested by `if`)
set -e
# Allows patterns which match no files to expand to a null string, rather than themselves
shopt -s nullglob
 
module load STAR
 
# Don't put spaces around '=' when assigning variables in bash.
current_id=""
current_file=""
 
# Iterating over ls output is fragile. Use globs
for fqfile in path/*.fq*; do
  filename="${fqfile%%.fq*}"
  id="${filename%.*}"
  if [[ $id == "$current_id" ]]; then
    STAR --readFilesIn "$current_file" "$fqfile" --outFileNamePrefix "./$id.bam"
  else
    current_id="$id"
    current_file="$fqfile"
  fi
done
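
The loop above depends on the glob returning both mates of a pair one after the other, which works because glob results are sorted. A more explicit alternative is to loop over the first mate only and derive the second filename from it. The following is a hedged dry-run sketch of that idea, not a tested STAR pipeline: it assumes names of the form `ID.1.fq.stuff` / `ID.2.fq.stuff`, creates empty placeholder files in a temporary directory, and prints the `STAR` command with `echo` instead of running it. The variable names (`workdir`, `read1`, `read2`, `pairs`) and the `ID3` file without a mate are made up for illustration.

```bash
#!/usr/bin/env bash
set -e
shopt -s nullglob

# Hypothetical test data: empty placeholder files in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
touch "$workdir"/ID1.1.fq.stuff "$workdir"/ID1.2.fq.stuff \
      "$workdir"/ID2.1.fq.stuff "$workdir"/ID2.2.fq.stuff \
      "$workdir"/ID3.1.fq.stuff   # ID3 deliberately has no mate

pairs=()
for read1 in "$workdir"/*.1.fq*; do
  base="${read1##*/}"        # strip the directory:   ID1.1.fq.stuff
  stem="${base%%.fq*}"       # drop ".fq" and beyond: ID1.1
  id="${stem%.*}"            # drop the mate number:  ID1
  suffix="${base#"$stem"}"   # what follows the stem: .fq.stuff
  read2="$workdir/${id}.2${suffix}"

  if [[ -f $read2 ]]; then
    pairs+=("$id")
    # Dry run: remove `echo` to actually invoke STAR.
    echo STAR --readFilesIn "$read1" "$read2" --outFileNamePrefix "./$id.bam"
  else
    echo "no mate found for $read1, skipping" >&2
  fi
done
```

Because `id` is computed from the basename here, the output prefix does not include the input directory. In the loop above, `id` keeps the `path/` prefix, so the output prefix becomes `./path/ID1.bam`.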
memchr
  • If I remove the first line (`set -e`) to see error messages, I still see the same errors as before: `current_id: command not found` and `current_file: command not found`. The first error did go away. – nerd_inthecorner Jul 05 '23 at 20:25
  • @nerd_inthecorner that's hard to believe. To check whether `STAR` (whatever it is) is causing your problems: what happens if you put `echo` in front of `module load STAR` and `STAR --readFilesIn`? – Ed Morton Jul 06 '23 at 10:20