3

So, I bought this book called Primes and Programming, and it's pretty tough going. Today I wrote this (simple) program from chapter 1:

#!/usr/bin/env python

import math

def find_gcd(a,b):
    while b > 0:
        r = a - b * math.floor(a/b)
        a = b
        b = r
    return int(a)

if __name__ == "__main__":
    import random, sys
    while True:
        print(find_gcd(random.randrange(int(sys.argv[1])), random.randrange(int(sys.argv[2]))))

...and just now I called it like so:

./gcd-rand.py 10000 10000 > concievablyreallyhugefile

...and now I'm dreaming of a bash one-liner that breaks when concievablyreallyhugefile has reached a certain size. I guess it would look something like:

while $(du -h f) < 32M; do ./gcd-rand.py 10000 10000 > $f; done

...but I have never written a while loop in bash before and I don't really know how the syntax works.

ixtmixilix
  • To answer your question you don't need to buy a book, actually; just issue `man bash`. – poige Mar 07 '13 at 11:43
  • I presume the exercise you're working through asks you to roll your own, but for the sake of reference, the [`fractions.gcd` method](http://docs.python.org/3.3/library/fractions.html#fractions.gcd) is useful. – Ricardo Altamirano Mar 07 '13 at 13:49
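
For reference, here is a minimal Python 3 sketch (not from the thread) that checks a hand-rolled loop like the one in the question against the standard library. It assumes Python 3.5 or later, where `math.gcd` is available; `fractions.gcd` was removed in Python 3.9. Using `a // b` in place of `math.floor(a/b)` is a substitution made here so that large integers never pass through floating point.

```python
import math
import random

def find_gcd(a, b):
    """Euclid's algorithm, following the structure of the question's loop."""
    while b > 0:
        # a - b * (a // b) is the remainder of a divided by b
        r = a - b * (a // b)
        a, b = b, r
    return a

if __name__ == "__main__":
    # Spot-check against the standard library on random inputs.
    for _ in range(1000):
        x, y = random.randrange(10000), random.randrange(10000)
        assert find_gcd(x, y) == math.gcd(x, y)
    print("all checks passed")
```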

4 Answers

3

The trick is to use the `test` command or its equivalent, `[ ... ]`:

 while [ "$(du -m "$f" | cut -f1)" -lt 32 ]
 do
  ./gcd-rand.py 10000 10000 > "$f"
 done

See `help test` for more information.

Note

The `test` (or `[`) command is a bash builtin. Its help text is available inside bash via `help test` or `help [`. `man test` documents the standalone `test` program, which is used when a shell has no such builtin or when it is invoked explicitly as `/usr/bin/test`.

Stéphane Chazelas
H.-Dirk Schmitt
3
./gcd-rand.py 10000 10000 | head -c 32M > concievablyreallyhugefile

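head stops reading after 32 MiB (GNU head treats the M suffix as 1024×1024 bytes) and exits. Soon after that, gcd-rand.py's next write to the pipe fails with SIGPIPE (Python reports it as a broken-pipe error) and the script exits.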

To avoid storing a truncated last line, as Michael Kjörling noticed:

./gcd-rand.py 10000 10000 | head -c 32M | sed '$d' > concievablyreallyhugefile
Gilles 'SO- stop being evil'
  • This. Piping is The Unix Way (tm), *and* it will give you exactly as much data as you want. Of course, it *might* break the resulting file in the middle of a number, so in the general case the last line of the file will be meaningless. If you want to guard against that, it'd probably be better to implement output size limiting in the script itself (look up `len()`, and remember to account for the newline). – user Mar 09 '13 at 23:21
  • @MichaelKjörling Good point about the last truncated line. Again piping saves the day. – Gilles 'SO- stop being evil' Mar 10 '13 at 16:34
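
The suggestion above to limit output size inside the script could look roughly like the following. This is only a sketch: the names `write_gcds`, `limit_bytes` and `upper` are hypothetical, and `math.gcd` (Python 3.5+) stands in for the question's own `find_gcd`. It measures each line with `len()`, newline included, and stops before a line that would push the total past the limit, so the file never ends in a partial number.

```python
import io
import math
import random

def write_gcds(out, limit_bytes, upper=10000, rng=random):
    """Write one GCD per line to `out`, never exceeding limit_bytes in total.

    Returns the number of bytes written. Only whole lines are emitted.
    """
    written = 0
    while True:
        line = "%d\n" % math.gcd(rng.randrange(upper), rng.randrange(upper))
        size = len(line.encode("ascii"))  # byte length, newline included
        if written + size > limit_bytes:
            return written
        out.write(line)
        written += size
```

Called as `write_gcds(open("concievablyreallyhugefile", "w"), 32 * 1024 * 1024)`, it would stop just short of 32 MiB. The byte count assumes newlines are written as a single `\n`, which holds on Unix-like systems.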
1

Your Python code loops forever, so you might want to run it in the background and kill it once the file reaches the size limit. As a one-liner:

{ ./gcd-rand.py 10000 10000 > f & }; p=$!; while (( $(stat -c %s f) < 33554432 )); do sleep .1; done; kill $p

Note: choose the sleep time as appropriate. Instead of stat, you can also use du, as suggested by Dirk.

Jurij
  • This is good, but you should use `wc -c` instead of `stat`, which will allow it to work outside of Linux. – jordanm Mar 07 '13 at 15:23
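
The same run-in-the-background-then-kill idea can be sketched in Python with the standard `subprocess` module, which avoids depending on `stat` or `wc`. The function name `run_until_size` and its parameters are made up for illustration. As with the shell version, the file can overshoot the limit by whatever the child writes between two polls.

```python
import os
import subprocess
import time

def run_until_size(cmd, path, limit_bytes, poll_seconds=0.1):
    """Run cmd with stdout redirected to path; stop it once the file reaches limit_bytes.

    Returns the final file size in bytes.
    """
    with open(path, "wb") as out:
        proc = subprocess.Popen(cmd, stdout=out)
        try:
            # Poll the file size, like the stat loop in the one-liner.
            while proc.poll() is None and os.path.getsize(path) < limit_bytes:
                time.sleep(poll_seconds)
        finally:
            if proc.poll() is None:
                proc.terminate()  # sends SIGTERM on POSIX, like a plain `kill`
            proc.wait()
    return os.path.getsize(path)
```

A call such as `run_until_size(["./gcd-rand.py", "10000", "10000"], "f", 32 * 1024 * 1024)` would correspond to the one-liner above.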
0

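You can use the `ulimit` command to restrict how large a file the shell (or its children) can create. In bash the `-f` value is counted in 1024-byte blocks (512-byte blocks in POSIX mode), so 32768 corresponds to 32 MiB: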

ulimit -f 32768
chepner
  • I think this qualifies as an excellent example of what [Raymond Chen calls "using global state to manage a local problem"](http://blogs.msdn.com/b/oldnewthing/archive/2008/12/11/9193695.aspx). – user Mar 11 '13 at 08:19
  • Well, it's limited to the current shell, so `(ulimit -f 32768; cmd)` is a possibility. – chepner Mar 11 '13 at 12:33
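
A per-command limit like the subshell form in the last comment can also be set from Python on Unix-like systems, using the standard `resource` module. The sketch below is an illustration only; `run_with_file_limit` is a hypothetical name. The limit is applied in the child just before it executes the command, so the calling process is unaffected. A child that exceeds the limit is either killed by SIGXFSZ or, if it ignores that signal (as CPython does), sees its write fail.

```python
import resource
import subprocess

def run_with_file_limit(cmd, path, limit_bytes):
    """Run cmd with stdout going to path, under a file-size limit of limit_bytes.

    Returns the child's exit status. Unix only.
    """
    def set_limit():
        # Runs in the child between fork and exec; the parent keeps its own limits.
        resource.setrlimit(resource.RLIMIT_FSIZE, (limit_bytes, limit_bytes))

    with open(path, "wb") as out:
        return subprocess.call(
            cmd,
            stdout=out,
            stderr=subprocess.DEVNULL,  # hide the child's complaint when the limit is hit
            preexec_fn=set_limit,
        )
```

Unlike `ulimit -f`, `RLIMIT_FSIZE` is given in bytes here, so 32 MiB would be passed as `32 * 1024 * 1024`.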