In the beginning there was cat …

… and I’m not talking about Ceiling Cat. I’m talking about the Linux command cat.

Apropos the comment that one should not use cat to produce a stream of text for a command that takes a filename as an argument … I say “Balderdash!”

If your plan is to process the text from a file with a single command and that’s that … no modifications will be needed … then by all means, you will save time and computing resources (though immeasurably little of either) by using the filename as an argument to said command.

But if you are not quite sure how you are going to get to your final results, you might find that starting with cat will give you a paws up when you need to start fiddling. The cat command will even take arguments that can help you.

In fact, I think the widespread conception that cat is ‘merely a filter’ that does nothing but pass along the contents of the file it is fed contributes to the belief that cat is always useless. In reality, cat is a very powerful command.

Cat counts as a filter. A file goes in and comes out. When cat is issued with nothing other than a filename as an argument, it does nothing but stream out the contents of the file. This is convenient if you want to construct a complex command that starts with the contents of a file running into standard input. For instance:

cat mouse.txt

will simply ‘print’ to the terminal the contents of mouse.txt. But that’s OK. You can do this and verify that mouse.txt exists and contains roughly what you thought it contained. Then you can add something to this such as

cat mouse.txt | grep mice

which gets me:

Three blind mice. Three blind mice.

As three blind mice.

I could have gone

grep mice mouse.txt

and gotten the same thing, but that would interfere with the poetry of

cat mouse.txt | grep mice | wc

which gets me: 2 10 66

which is, of course, either Daryl Johnston’s or Rodney Anoa’i's birthday.
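The build-it-up-one-step-at-a-time approach above can be sketched as a small script. The sample text here is a hypothetical three-line stand-in, not the real mouse.txt, so the counts will differ from the ones quoted above. The point is only that the cat version and the filename-argument version produce the same result.

```shell
#!/bin/sh
# Hypothetical sample file; the article's actual mouse.txt is not reproduced.
tmp=$(mktemp)
printf '%s\n' \
  'Three blind mice. Three blind mice.' \
  'See how they run. See how they run.' \
  'As three blind mice.' > "$tmp"

# Step 1: look at the file and confirm it holds what you expect.
cat "$tmp"

# Step 2: bolt a filter onto the end of the same command.
cat "$tmp" | grep mice

# Step 3: bolt on a counter. Both forms feed grep the same bytes,
# so wc reports the same line, word and byte counts either way.
with_cat=$(cat "$tmp" | grep mice | wc)
without_cat=$(grep mice "$tmp" | wc)
echo "with cat:    $with_cat"
echo "without cat: $without_cat"

rm -f "$tmp"
```

Each step is the previous command line recalled from history with one more stage appended, which is the editing convenience being argued for.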

Here’s another good one. A common question given to programmers looking for work, as part of their application, is this:

“Write a perl one liner that adds numbers to the lines in a file.”

Answer:

Who needs perl? … cat -n mouse.txt

which gets me this:

1 Three Blinde Mice,
2 three Blinde Mice,
3 Dame Iulian,
4 Dame Iulian,
5 The Miller and his merry olde Wife,
6 shee scrapte her tripe licke thou the knife.
7
8 Three blind mice. Three blind mice.
9 See how they run. See how they run.
10 They all ran after the farmer’s wife
11 She cut off their tails with a carving knife.
12 Did you ever see such a thing in your life
13 As three blind mice.

or, cat -b mouse.txt

which is similar but only numbers non-blank lines.

or the -s option, one of the coolest, which does not allow more than one blank line at a time through the filter!

or -T which shows tabs, otherwise invisible.
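A quick way to see these options side by side is to run them against a tiny throwaway file. This is a minimal sketch assuming GNU coreutils cat; the sample content is made up for illustration, and on BSD or macOS the tab-showing option is spelled -t rather than -T.

```shell
#!/bin/sh
# Hypothetical sample: one line, two blank lines, then a line starting with a tab.
tmp=$(mktemp)
printf 'one\n\n\n\ttwo\n' > "$tmp"

echo '--- cat -n: number every line, blank or not'
cat -n "$tmp"

echo '--- cat -b: number only the non-blank lines'
cat -b "$tmp"

echo '--- cat -s: squeeze each run of blank lines down to one'
cat -s "$tmp"

echo '--- cat -T: show each tab as ^I (falls back to -t where -T is unknown)'
cat -T "$tmp" 2>/dev/null || cat -t "$tmp"

rm -f "$tmp"
```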

Man cat. Try it, you’ll like it.

Comments

  1. #1 blf
    August 5, 2008
    cat foo | cmd

    has the very distinct and real problem that if file foo cannot be opened, then the “exit status” of the pipe ($?) will be 0 (zero)–success. This is probably not what you want, especially in a Makefile, or in constructs such as

    if cat foo | cmd; then … fi

    In contrast, when a command has a filename argument, it usually fails (exits non-0) if it cannot open a required/named input file. Even if the command just reads stdin (the standard input), crunch (<file) behaves better than using cat.

    cat has its place and its purpose. But the Useless Use of Cat™ award exists for multiple reasons.

    Also look up Pike’s paper, cat -v Considered Harmful.

  2. #2 Larry
    August 5, 2008

    When I read the previous post that started this, I almost joined the “don’t cat” responders. But on consideration, I think this is more like off-topic grammar correction.

    For those on-the-fly, build-up-a-command-line-by-experimentation situations, the performance difference is insignificant, and starting with cat probably allows quicker command editing. So Greg’s original examples are fine.

    Lifelong habit will probably limit how much *I* use cat, however, lol.

    If I’m going to the trouble to write a script, I take more care in crafting the commands and handling edge / error conditions.

    So, it may be beneficial for casual readers to be alerted to potential problems, but not in a dogmatic way. Google the recursive “‘considered harmful’ considered harmful”.

    In the spirit of TMTOWTDI, how about a contest to see how many ways to number lines of a file? I’ll start with “nl foo.txt”. (j/k)

  3. #3 Greg Laden
    August 5, 2008

    Yes, this is all about constructing commands. Between each iteration of developing the command, you are using the history buttons to go back to older versions and then modifying.

    cat mouse.txt issued by itself tells you that mouse.txt is in the working directory, reminds you what the contents of the file look like, etc.

    However, one might seriously leave it this way when making a script because later modifications may involve copying the script to command line and playing around further. It depends on what you are trying to do.

    Brian, the name of the paper is “Program design in the UNIX environment.” “Cat -v….” is a related talk Kernighan (did not invent C) and Rob Pike gave on it (and, I think, the name of an organization). This is not really about the ‘-v’ switch on cat. These are radical writings and are part of the kernel wars, not good programming practices. Pike, in particular, is to *nix, say, what PZ Myers is to the Catholic Church.

    How to deal with the exit status: Again, in a command line building scenario, we do not see the exit status, so go figure. If I have three commands chained as shown above, the exit status of the second command is hidden. If one really wants exit stati out of a sequence of commands, I suppose one could tee the output to standard error to some place (not sure if that would work).

    I’m a little uncomfortable with non zero exit status being used to make a decision about doing something other than exiting with an error message anyway.

  4. #4 Virgil Samms
    August 5, 2008

    There doesn’t seem to be a “dog” command in Linux. This is obviously bigotry.

  5. #5 llewelly
    August 5, 2008

    I’m a little uncomfortable with non zero exit status being used to make a decision about doing something other than exiting with an error message anyway.

    Consider a configure script, used to configure makefiles before building. It needs to test for the existence and behavior of many commands and libraries. Instead of exiting with an error message when it sees a non-zero exit status, it goes on to test alternatives and configure the make accordingly.

  6. #6 Ben Zvan
    August 5, 2008

    If you’re really concerned about cat trying to cat a nonexistent file, just use this instead:

    [ -f mouse ] && cat mouse | cmd || echo "no mouse found"

    -or-

    if [ -f mouse ]; then
       cat mouse | cmd
    else
       echo "no mouse found"
    fi

    Error management is an important part of any script.

  7. #7 Greg Laden
    August 5, 2008

    llewelly: See, Ben’s solution specifically checks for a particular condition. I know that a file operation is likely to fail because of a missing file, but a) it may fail for other reasons and b) what seems like an obvious OK alternative early in development may become a window into disaster later in development.

    Virgil: Right, but did you ever notice what “dog” spelled backwards is? Not that this is relevant, but did you ever notice?

  8. #8 Stephanie Z
    August 5, 2008

    So, Greg, you’re suggesting that use of “dog” should be reserved for daemons?

  9. #9 Greg Laden
    August 5, 2008

    It depends on the Linux distro.
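For anyone who wants to see the exit-status question from comments #1 and #3 in action, here is a minimal sketch. It assumes bash is installed for the last two steps, since pipefail and PIPESTATUS are bash features rather than something every sh provides. The path /nonexistent/foo is a made-up stand-in for a file that cannot be opened.

```shell
#!/bin/sh
missing=/nonexistent/foo   # hypothetical path, assumed not to exist

# 1. Plain pipeline: the reported status is that of the last command (wc),
#    so the failure of cat goes unnoticed.
cat "$missing" 2>/dev/null | wc -l >/dev/null
plain=$?

# 2. Redirection: the shell cannot open the file, so the command fails
#    before wc ever runs.
redirect=0
{ wc -l < "$missing"; } >/dev/null 2>&1 || redirect=$?

# 3. bash with pipefail: the pipeline fails if any stage fails.
with_pipefail=0
bash -c 'set -o pipefail; cat "$1" 2>/dev/null | wc -l >/dev/null' _ "$missing" \
  || with_pipefail=$?

# 4. bash PIPESTATUS: one exit status per stage of the most recent pipeline,
#    which exposes the otherwise hidden status of the middle command too.
stages=$(bash -c 'cat "$1" 2>/dev/null | grep mice | wc -l >/dev/null
                  echo "${PIPESTATUS[*]}"' _ "$missing")

echo "plain pipeline:    $plain"
echo "redirection:       $redirect"
echo "with pipefail:     $with_pipefail"
echo "per-stage status:  $stages"
```

In the last line, the first number belongs to cat, the second to grep (which also reports non-zero when it finds no match), and the third to wc. Whether any of this matters at an interactive prompt is exactly what the thread above is arguing about; in a script or Makefile it is cheap insurance.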