Greg Laden's Blog

Evolution, Life Sciences, Science Education, Human Evolution, and Stuff


The contents of Greg Laden's Blog are copyrighted by Greg Laden.


Today's Linux Command Line Lesson: counting files

Category: Computer Tricks, Linux, one-liners
Posted on: August 17, 2009 4:02 PM, by Greg Laden

Type:

ls | wc -l

The output is an approximation of how many files are in the current directory.

The command ls gives a list of files (ls is short for "list").

The vertical line is a pipe. This means the standard output of the command on the left side of the pipe is sent (like in a pipe) to the standard input of the command on the right side.

wc means "word count" ... the default output is the number of lines, number of words, and number of bytes for a file or for standard input that was sent to the command. The -l option puts out only the number of lines. Because ls writes one file name per line when its output goes to a pipe, that, then, is the number of files.
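As a rough illustration of the pipeline described above, here is a minimal sketch that builds a scratch directory and compares the full wc output with wc -l. The directory and the three file names are made up for the example.

```shell
#!/bin/sh
# Scratch directory with three hypothetical files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt c.txt

# Full wc output: lines, words, and bytes of the listing that ls produced.
ls | wc

# Only the line count: one line per file name.
count=$(ls | wc -l)
echo "files: $count"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Here the count comes out as 3, one for each file created.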

The last time I mentioned this, commenter Colin M added this:

And for recursive (list the number of files under the current directory and all subdirectories, etc):

find | wc -l

If you want to count just normal files (i.e. not directories and other weird things):

find -type f | wc -l
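A small sketch of the difference between the two find commands, run in a throwaway directory tree with hypothetical names. It passes the starting directory (.) explicitly, since some non-GNU versions of find may insist on one.

```shell
#!/bin/sh
# Throwaway tree: two nested directories and three regular files.
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir -p sub/deeper
touch top.txt sub/mid.txt sub/deeper/low.txt

# Everything find visits, including "." itself and both subdirectories.
everything=$(find . | wc -l)

# Regular files only.
files_only=$(find . -type f | wc -l)

echo "everything: $everything"
echo "regular files: $files_only"

cd / && rm -rf "$dir"
```

The first count is 6 (the current directory, two subdirectories, three files) and the second is 3, so the plain find version overstates the number of files by counting directories, the starting point included.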

I mention above that this is an approximation. It is approximate because there are all kinds of screwy things that can happen to make this the incorrect number. A better command that will probably give you the exact number of files in the current directory is this:

find . -maxdepth 1 -type f -printf "%i\n" | sort | uniq | wc -l

Gaze at this bit of code for a while and try to figure out why this works.

The first part requires that find finds files in the present directory (the dot makes that happen) that really are files (type f) and ignores anything in any subdirectories (maxdepth). After that, it gets obscure. To find out why this works the way it does, examine the comment provided by Master Basher Winter Toad.

Once there, have a look at Virgil Samms' version as well.
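For the impatient, here is one way to see two of the screwy cases in action. This is only a sketch, assuming GNU find (for -printf) and GNU ls; the file names are invented. It creates a hard link (two names, one inode) and a file whose name contains a newline, then compares the two counting methods.

```shell
#!/bin/sh
# Throwaway directory; every name below is hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1

touch plain.txt
ln plain.txt hardlink.txt               # second name for the same inode
touch "$(printf 'two\nlines.txt')"      # file name containing a newline
mkdir subdir                            # a directory, not a regular file

# Naive count: the newline in the name adds a line, and subdir is counted too.
naive=$(ls | wc -l)

# Inode-based count: one inode number per line, duplicates collapsed.
exact=$(find . -maxdepth 1 -type f -printf '%i\n' | sort | uniq | wc -l)

echo "ls | wc -l:     $naive"
echo "inode counting: $exact"

cd / && rm -rf "$dir"
```

With GNU tools the naive count is 5, while the inode-based count is 2: the hard-linked pair is one file on disk, the newline name prints as a single inode number, and the directory is skipped. Whether hard links should count once or twice depends on what you mean by "file".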


Comments

1

windows equiv:
dir /b | find /c /v ""

You can also specify only directories or (and I've found this useful when hunting a virus) only hidden files by adding "/a:d" or "/a:h" before the pipe. Or include subdirectories with /s. It doesn't seem to have the /same/ problem with erroneous counts, but occasionally gets mixed up on other things instead.

And of course, under both you can pipe the output to a file or executable for monitoring/datalogging purposes.

Posted by: Spiv | August 17, 2009 4:37 PM

2

I've tried from time to time to get a handle on Bash, but tcsh remains my favorite shell. Here's a line from my tcshrc file:

alias list "ls -lA; echo 'total items this dir:'; ls -lA | wc -l"

(That's all one line.)
Bash just spits this back at me with "not found". I think it wants me to do something different with the quotes. Most likely, something similar could be offered to Bash that would be accepted, and you'd get a directory listing and the number of entries at the bottom.

Posted by: John Swindle | August 17, 2009 5:27 PM

3

John, your alias is the same under bash. The problem you were having is that the syntax for the alias command is slightly different. Under tcsh, it's:

alias name command

Under bash, it is:

alias name=command

Here's my output:

$ alias list="ls -lA; echo -n 'total items this dir:'; ls -lA | wc -l"
$ list
total 100
-rw------- 1 james james 9624 Aug 2 15:31 .bash_history
-rw-r--r-- 1 james james 24 Nov 2 2006 .bash_logout
-rw-r--r-- 1 james james 191 Nov 2 2006 .bash_profile
-rw-r--r-- 1 james james 124 Nov 2 2006 .bashrc
drwxrwxr-x 2 james james 4096 Jul 9 17:10 bin
drwx------ 2 james james 4096 Jun 12 14:54 .elinks
-rw-r--r-- 1 james james 383 Nov 2 2006 .emacs
drwxrwxr-x 3 james james 4096 Jun 4 17:06 .emacs.d
-rw-r--r-- 1 james james 120 Nov 2 2006 .gtkrc
drwxr-xr-x 3 james james 4096 Nov 2 2006 .kde
drwx------ 2 james james 4096 May 26 13:32 .ssh
drwxr-xr-x 3 james james 4096 Jul 31 14:03 tmp
-rw------- 1 james james 5832 Jul 9 17:10 .viminfo
-rw------- 1 james james 61 Jan 14 2009 .Xauthority
-rw-r--r-- 1 james james 658 Nov 2 2006 .zshrc

total items this dir:16
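A side note on the number itself: the long listing starts with a "total" line, so counting the lines of ls -lA gives one more than the number of entries. A minimal sketch, using a scratch directory and made-up file names:

```shell
#!/bin/sh
# Scratch directory with three hypothetical entries.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch one two three

# Long format: a leading "total" line plus one line per entry.
long=$(ls -lA | wc -l)

# Short format: one line per entry only.
short=$(ls -A | wc -l)

echo "ls -lA | wc -l: $long"
echo "ls -A  | wc -l: $short"

cd / && rm -rf "$dir"
```

Here the long-format count is 4 and the short-format count is 3, so swapping ls -lA for ls -A in the counting half of the alias would report the entries alone.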

Posted by: James | August 17, 2009 6:13 PM

4

John, the reason your "alias" command gave you a "not found" message is that the built-in alias command (in bash) has this basic usage:

alias [name[=command] ...]

So if you give it alias, you'll get a list of all aliases already defined.

If you give it alias ls you'll see the current alias for the ls command.

If you give it alias ls vi you'll see the aliases for both the ls and vi commands. For example, on my system:

$ alias ls vi
alias ls='ls --color=tty'
alias vi='vim'

So if I type:

$ alias ls "ls -lA"

..then it tries to give me the definitions of the ls alias, and (standard UNIX quote rules apply) the ls -lA alias. But I don't have an alias called ls -lA so it barfs with a "not found" for that alias:

$ alias ls "ls -a"
alias ls='ls --color=tty'
-bash: alias: ls -a: not found
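The same behaviour can be checked from a script. This is a sketch only; "countfiles" is a hypothetical alias name chosen so it does not clash with anything already defined.

```shell
#!/bin/sh
# Define an alias the sh/bash way: name=value, with no spaces around "=".
alias countfiles='ls -A | wc -l'

# Given a bare name, alias just displays the existing definition.
alias countfiles

# tcsh-style syntax: the shell treats both words as alias names to look up.
# The second "name" does not exist, so the command fails.
if alias countfiles 'ls -A' >/dev/null 2>&1; then
    tcsh_style=accepted
else
    tcsh_style=rejected
fi
echo "tcsh-style definition: $tcsh_style"
```

The script only defines and queries the alias. Note that bash does not expand aliases in non-interactive scripts by default, so for scripted use a shell function is usually the safer choice.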

Posted by: James | August 17, 2009 6:32 PM

5

Spiv:

No, this is the windows equivalent:

click. Click click. Click.

"Your computer requires a security update before that function can be performed."

OK (click).

Download...download...download....

"Your computer must now be rebooted"

OK (click)

Kerkluck, kerlkunk, wizzzzz.... zvooooormmm... clunk clunk

http://www.youtube.com/watch?v=tkhY0HbhyN8

"You may now perform the function"

OK. click click click click...

... freeze ....

(What the hell was I trying to do, I can't remember?!?!?#$%##??)

Posted by: Greg Laden | August 17, 2009 6:55 PM

6

If this alias is supposed to give the number of items per directory after the phrase with no linefeed then it needs to be adjusted slightly.

Posted by: Greg Laden | August 17, 2009 7:12 PM

7

Greg - your "Windows equivalent" reminds me why I have such warm feelings for all things Microsoft.

By "with no linefeed", is this what you meant?
yada... echo -n ...yada

Posted by: John Swindle | August 17, 2009 9:15 PM

8

John... yeah, maybe.

Posted by: Greg Laden | August 17, 2009 9:28 PM

9

I just look at the bottom of the window where it says "12 items, 1.2 TB available."

Posted by: Ben Zvan | August 17, 2009 10:07 PM

10

Spoil sport.

Posted by: Greg Laden | August 17, 2009 10:12 PM

11

Yeah... well... that's the Mac equivalent. No right-clicking or anything.

Posted by: Ben Zvan | August 17, 2009 10:19 PM

12

#6 and #7:

You mean, the -n I included in my example? (#3)

:-)

Posted by: James | August 17, 2009 10:45 PM

13

ROFL (after testing -n)

Yeah... well... that's the Mac equivalent. No right-clicking or anything.

Well, of course, in Linux with gnome, I can see that too. However, I don't think you see that on Windows.

But, of course, one needs to have the number as output for piping it to the next script. For some reason. Which, by the way, is why the longer listing example above is not really what we are looking for. We just want the number.

Posted by: Greg Laden | August 17, 2009 10:58 PM

14

John: Try zsh. csh and its variants are a bad idea. The bugs of csh may have been fixed but the design flaws remain.

I'm wary of aliases especially on machines I'm not familiar with (when using shared accounts, etc.) so I prefer using the full path in certain cases. It may not be as important now, but older Solaris machines would keep a separate copy of BSD-flavoured commands under /usr/ucb so depending on how weirdly paths were set up, you might get the BSD version of ps instead of the expected AT&T version.

And finally, some commands behave differently if the output is sent somewhere besides the terminal. For example, 'ls --color=auto' will send ANSI color escape sequences to the terminal but will omit them when the output is sent to a file or piped into a command.

In this case, on a Linux system I'd use '/bin/ls -A1 | wc -l'. The -A option tells it to show all the hidden files and directories - anything with a leading dot except . and .. (the current and parent directories.) The -1 option puts one entry per line so wc is counting what you think it's counting.

The upshot is that sometimes you need to be very precise in what you tell the machine to do and the only way to know how precise to be is to make mistakes. The unix world is a rich tapestry of obscure, archaic, and powerful tools and it takes equal parts patience, experimentation, and RTFMing to develop proficiency. It's a lot better now than when I started with it; disk is cheap enough that documentation is now stored online instead of that foot-thick binder with tattered pages bolted to the desk in the computer lab. RTFMing is a lot easier now. :)

Posted by: Bob | August 17, 2009 11:34 PM

15

Greg: are you running windows on a 1950's industrial punch press or something? Your computer probably shouldn't be making those noises if not. If so, well, I knew windows supported some pretty antiquated hardware, but dang...

(at any rate piping a directory listing to 'find' works great. So neener.)

Posted by: Spiv | August 18, 2009 9:06 AM

16

This is too funny. I just got an IM from a coworker asking "Do you know how to get a count of files in a particular directory?" I sent back "ls -1a | wc -l". The next thing she wanted was a count of all the files in all the sub-directories. Hah!

Posted by: Ben Zvan | August 18, 2009 1:14 PM

17

You don't need the -1 option, Ben. ls knows that it is writing to a pipe rather than a (pseudo)terminal so it automatically switches to that format (one file per line, no funny escape codes to colour code the files).

-a also lists files that begin with '.' (those are normally hidden in Unix -- we don't have a separate file flag for that the way they do in DOS/Windows). But '.' and '..' (current and parent directories) also begin with a '.'. Do you want to include them in the count?
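To make the difference concrete, here is a small sketch in a scratch directory holding one visible and one hidden file (both names hypothetical). It compares -a with -A, and checks that adding -1 changes nothing when the output goes to a pipe.

```shell
#!/bin/sh
# Scratch directory with one visible and one hidden file.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch visible.txt .hidden

# -a: includes ".", "..", and the hidden file.
with_a=$(ls -a | wc -l)

# -A: includes the hidden file but leaves out "." and "..".
with_A=$(ls -A | wc -l)

# -1 is redundant in a pipe: ls already prints one entry per line there.
with_A1=$(ls -A1 | wc -l)

echo "ls -a:  $with_a"
echo "ls -A:  $with_A"
echo "ls -A1: $with_A1"

cd / && rm -rf "$dir"
```

The -a count is 4 and both -A counts are 2, so -a always overstates the number of entries by two.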

Posted by: Peter Lund | August 19, 2009 2:55 AM

18

Counting files on a Mac ...

http://codesnippets.joyent.com/posts/show/2099

Posted by: mokk | September 27, 2009 5:50 AM

19

mokk: Cool. Now, let's write a procedure for counting the number of lines of C code it takes to count the files on a Mac!

Posted by: Greg Laden | September 27, 2009 10:05 AM

© 2006-2011 ScienceBlogs LLC. ScienceBlogs is a registered trademark of ScienceBlogs LLC. All rights reserved.