[GRASS-dev] Pipeline efficiency in bash shell scripts

I'll keep this short and provide more details if people need them. In a for loop within bash scripts, where several commands are performing text manipulation, is it more efficient to pipe the commands together in one long pipeline (case 1), or instead dump the output from one program into a text file and use a redirection to input the results into a second command (case 2)?

Case 1

for FILES in *.extension ; do
    mbnavlist -Iinputfile -OJXY -N0 | awk '{lots of awk text manipulation goes here}' | v.in.ascii

done

Case 2

for FILES in *.extension ; do

    mbnavlist -Iinputfile -OJXY -N0 > TMP.txt
    awk '{slicing and dicing commands go here}' < TMP.txt | v.in.ascii
OR
    awk '{slicing and dicing commands go here}' < TMP.txt > TMP2.txt
followed by
    v.in.ascii < TMP2.txt

done

In both cases, the for loop is calling the mbnavlist program (from the free and open source bathymetry processing software MBTools) and awk together thousands of times. I wasn't sure if there are any general benefits to piping vs. writing out to a file, then redirecting input.

Any suggestions?

~ Eric.

Patton, Eric wrote:

I'll keep this short and provide more details if people need them. In a
for loop within bash scripts, where several commands are performing
text manipulation, is it more efficient to pipe the commands together
in one long pipeline (case 1), or instead dump the output from one
program into a text file and use a redirection to input the results
into a second command (case 2)?

Case 1

for FILES in *.extension ; do
    mbnavlist -Iinputfile -OJXY -N0 | awk '{lots of awk text manipulation goes here}' | v.in.ascii

done

Case 2

for FILES in *.extension ; do

    mbnavlist -Iinputfile -OJXY -N0 > TMP.txt
    awk '{slicing and dicing commands go here}' < TMP.txt | v.in.ascii
OR
    awk '{slicing and dicing commands go here}' < TMP.txt > TMP2.txt
followed by
    v.in.ascii < TMP2.txt

done

In both cases, the for loop is calling the mbnavlist program (from
free and open source bathymetry processing software MBTools) and awk
together thousands of times. I wasn't sure if there are any general
benefits to piping vs. writing out to a file, then redirecting input.

Any suggestions?

Any difference is likely to be so small that there's no reason to
prefer one to the other based upon efficiency concerns.

Exactly which will be more efficient depends upon more factors than
can reasonably be discussed here.

--
Glynn Clements <glynn@gclements.plus.com>

Thanks for the feedback. I didn't really suspect a difference, but good to know.

~ Eric.
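
Since the answer depends on the machine and the data, one way to settle it for a particular setup is simply to time both variants. The following is only a rough sketch, not anything from the scripts discussed above: produce, transform and consume are hypothetical stand-ins (seq, a trivial awk program and wc -l) for mbnavlist, the real awk step and v.in.ascii, and N is an arbitrary repeat count. It relies on bash's "time" keyword and uses mktemp rather than fixed names like TMP.txt, so that parallel runs do not overwrite each other's files.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins so the sketch runs without MB-System or GRASS:
produce()   { seq 1 5000; }                   # plays the role of mbnavlist
transform() { awk '{ print $1, $1 * 2 }'; }   # plays the role of the awk step
consume()   { wc -l; }                        # plays the role of v.in.ascii

# Case 1: a single pipeline, no intermediate files.
case1() { produce | transform | consume; }

# Case 2: intermediate files, created with mktemp and removed afterwards.
case2() {
    local tmp1 tmp2
    tmp1=$(mktemp)
    tmp2=$(mktemp)
    produce > "$tmp1"
    transform < "$tmp1" > "$tmp2"
    consume < "$tmp2"
    rm -f "$tmp1" "$tmp2"
}

# Run a command N times, discarding its standard output.
repeat() {
    local n=$1 i=0
    shift
    while [ "$i" -lt "$n" ]; do
        "$@" > /dev/null
        i=$((i + 1))
    done
}

N=20   # arbitrary; raise it to get closer to "thousands of times"
echo "case 1 (pipeline):"
time repeat "$N" case1
echo "case 2 (temp files):"
time repeat "$N" case2
```

To try it on real data, replace the three stand-in functions with the actual commands and compare the "real" figures that bash prints for each case. Apart from speed, the stages of a pipeline run concurrently and leave nothing to clean up, while the temporary files in case 2 can be inspected when a step misbehaves.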