I need to filter a very large CSV file as efficiently as possible. I already tried csvgrep, but it is too slow, so I'm now trying AWK for this task.
The CSV file uses ; as the separator, and I need to keep all rows whose 48th column starts with a pattern that I will pass in through a script. So it will be something along the lines of:
pattern='59350'
awk -F ";" '$48 ~ /^59350/ input.csv > output.csv # This works
However, I need to pass $pattern into the regex, rather than hard-coding the pattern.
I have tried several combinations, but all of them give me an empty output.csv file.
Here are some of my failed attempts:
awk -F ";" -v var="$pattern" '$48 ~ /^var/' input.csv > output.csv
awk -F ";" -v var="$pattern" '$48 ~ /^$var/ {print}' input.csv > output.csv
awk -F ";" -v var="$pattern" '$48 ~ /^${var}/ {print}' input.csv > output.csv
How do I do this?
Also, I would welcome any more efficient approach that avoids loading the whole CSV file into memory, or that is simply faster (I was thinking of grep, but I'm not sure whether it is suitable here or how to apply it).
Thank you in advance