Linux Phrasebook
By Scott Granneman
...............................................
Publisher: Sams
Pub Date: June 12, 2006
Print ISBN-10: 0-672-32838-0
Print ISBN-13: 978-0-672-32838-1
Pages: 400
Linux Phrasebook is sure to become the pocket guide that you keep within reach at all times. This
concise, handy reference can be used "in the street," just like a language phrasebook. Skipping the
usual tutorial on Linux, Linux Phrasebook goes straight to practical Linux uses, providing immediately
applicable solutions for day-to-day tasks. It includes code phrases that allow Linux users to employ the
command line to complete onerous and repetitive tasks, as well as flexible code and commands that can
be customized to meet the needs of any Linux user. The concise information combined with random
accessibility makes Linux Phrasebook a robust, yet agile, reference guide that no Linux user should
be without.
Table of Contents
Copyright
About the Author
Acknowledgments
We Want to Hear from You!
Reader Services
Introduction
Part I: Getting Started
Chapter 1. Things to Know About Your Command Line
Everything Is a File
Maximum Filename Lengths
Names Are Case-Sensitive
Special Characters to Avoid in Names
Wildcards and What They Mean
Conclusion
Chapter 2. The Basics
List Files and Folders
List the Contents of Other Folders
List Folder Contents Using Wildcards
View a List of Files in Subfolders
View a List of Contents in a Single Column
View Contents As a Comma-Separated List
View Hidden Files and Folders
Visually Display a File's Type
Display Contents in Color
List Permissions, Ownership, and More
Reverse the Order Contents are Listed
Sort Contents by File Extension
Sort Contents by Date and Time
Sort Contents by Size
Express File Sizes in Terms of K, M, and G
Display the Path of Your Current Directory
Change to a Different Directory
Change to Your Home Directory
Change to Your Previous Directory
Change a File to the Current Time
Change a File to Any Desired Time
Create a New, Empty File
Create a New Directory
Create a New Directory and Any Necessary Subdirectories
Find Out What mkdir Is Doing As It Acts
Copy Files
Copy Files Using Wildcards
Copy Files Verbosely
Stop Yourself from Copying over Important Files
Copy Directories
Copy Files As Perfect Backups in Another Directory
Move and Rename Files
Rename Files and Folders
Delete Files
Remove Several Files At Once with Wildcards
Remove Files Verbosely
Stop Yourself from Deleting Key Files
Delete an Empty Directory
Remove Files and Directories That Aren't Empty
Delete Troublesome Files
Become Another User
Become Another User, with His Environment Variables
Become root
Become root, with Its Environment Variables
Conclusion
Chapter 3. Learning About Commands
Find Out About Commands with man
Search for a Command Based on What It Does
Quickly Find Out What a Command Does Based on Its Name
Rebuild man's Database of Commands
Read a Command's Specific Man Page
Print Man Pages
Learn About Commands with info
Navigate Within info
Locate the Paths for a Command's Executable, Source Files, and Man Pages
Read Descriptions of Commands
Find a Command Based on What It Does
Find Out Which Version of a Command Will Run
Conclusion
Chapter 4. Building Blocks
Run Several Commands Sequentially
Run Commands Only If the Previous Ones Succeed
Run a Command Only If the Previous One Fails
Plug the Output of a Command into Another Command
Understand Input/Output Streams
Use the Output of One Command As Input for Another
Redirect a Command's Output to a File
Prevent Overwriting Files When Using Redirection
Append a Command's Output to a File
Use a File As Input for a Command
Conclusion
Part II: Working with Files
Chapter 5. Viewing Files
View Files on stdout
Concatenate Files to stdout
Concatenate Files to Another File
Concatenate Files and Number the Lines
View Text Files a Screen at a Time
Search Within Your Pager
Edit Files Viewed with a Pager
View the First 10 Lines of a File
View the First 10 Lines of Several Files
View the First Several Lines of a File or Files
View the First Several Bytes, Kilobytes, or Megabytes of a File
View the Last 10 Lines of a File
View the Last 10 Lines of Several Files
View the Last Several Lines of a File or Files
View the Constantly Updated Last Lines of a File or Files
Conclusion
Chapter 6. Printing and Managing Print Jobs
List All Available Printers
Determine Your Default Printer
Find Out How Your Printers Are Connected
Get All the Information About Your Printers at Once
Print Files to the Default Printer
Print Files to Any Printer
Print More Than One Copy of a File
List Print Jobs
List Print Jobs by Printer
Cancel the Current Print Job Sent to the Default Printer
Cancel a Print Job Sent to Any Printer
Cancel All Print Jobs
Conclusion
Chapter 7. Ownerships and Permissions
Change the Group Owning Files and Directories
Recursively Change the Group Owning a Directory
Keep Track of Changes Made to a File's Group with chgrp
Change the Owner of Files and Directories
Change the Owner and Group of Files and Directories
Understand the Basics of Permissions
Change Permissions on Files and Directories Using Alphabetic Notation
Change Permissions on Files and Directories Using Numeric Permissions
Change Permissions Recursively
Set and Then Clear suid
Set and Then Clear sgid
Set and Then Clear the Sticky Bit
Conclusion
Chapter 8. Archiving and Compression
Archive and Compress Files Using zip
Get the Best Compression Possible with zip
Password-Protect Compressed Zip Archives
Unzip Files
List Files That Will Be Unzipped
Test Files That Will Be Unzipped
Archive and Compress Files Using gzip
Archive and Compress Files Recursively Using gzip
Get the Best Compression Possible with gzip
Uncompress Files Compressed with gzip
Test Files That Will Be Unzipped with gunzip
Archive and Compress Files Using bzip2
Get the Best Compression Possible with bzip2
Uncompress Files Compressed with bzip2
Test Files That Will Be Unzipped with bunzip2
Archive Files with tar
Archive and Compress Files with tar and gzip
Test Files That Will Be Untarred and Uncompressed
Untar and Uncompress Files
Conclusion
Part III: Finding Stuff
Chapter 9. Finding Stuff: Easy
Search a Database of Filenames
Search a Database of Filenames Without Worrying About Case
Manage Results Received When Searching a Database of Filenames
Update the Database Used by locate
Searching Inside Text Files for Patterns
The Basics of Searching Inside Text Files for Patterns
Search Recursively for Text in Files
Search for Text in Files, Ignoring Case
Search for Whole Words Only in Files
Show Line Numbers Where Words Appear in Files
Search the Output of Other Commands for Specific Words
See Context for Words Appearing in Files
Show Lines Where Words Do Not Appear in Files
List Files Containing Searched-for Words
Search for Words Inside Search Results
Conclusion
Chapter 10. The find Command
Find Files by Name
Find Files by Ownership
Find Files by Group Ownership
Find Files by File Size
Find Files by File Type
Show Results If the Expressions Are True (AND)
Show Results If Either Expression Is True (OR)
Show Results If the Expression Is Not True (NOT)
Execute a Command on Every Found File
Print Find Results into a File
Conclusion
Part IV: Environment
Chapter 11. Your Shell
View Your Command-Line History
Run the Last Command Again
Run a Previous Command Using Numbers
Run a Previous Command Using a String
Display All Command Aliases
View a Specific Command Alias
Create a New Temporary Alias
Create a New Permanent Alias
Remove an Alias
Conclusion
Chapter 12. Monitoring System Resources
View All Currently Running Processes
View a Process Tree
View Processes Owned by a Particular User
End a Running Process
View a Dynamically Updated List of Running Processes
List Open Files
List a User's Open Files
List Users for a Particular File
List Processes for a Particular Program
Display Information About System RAM
Show File System Disk Usage
Report File Space Used by a Directory
Report Just the Total Space Used for a Directory
Conclusion
Chapter 13. Installing Software
Install Software Packages for RPM-Based Distributions
Remove Software Packages for RPM-Based Distributions
Install Software Packages and Dependencies for RPM-Based Distributions
Remove Software Packages and Dependencies for RPM-Based Distributions
Upgrade Software Packages and Dependencies for RPM-Based Distributions
Find Packages Available for Download for RPM-Based Distributions
Install Software Packages for Debian
Remove Software Packages for Debian
Install Software Packages and Dependencies for Debian
Remove Software Packages and Dependencies for Debian
Upgrade Software Packages and Dependencies for Debian
Find Packages Available for Download for Debian
Clean Up Unneeded Installation Packages for Debian
Troubleshoot Problems with apt
Conclusion
Part V: Networking
Chapter 14. Connectivity
View the Status of Your Network Interfaces
Verify That a Computer Is Running and Accepting Requests
Trace the Route Packets Take Between Two Hosts
Perform DNS Lookups
Configure a Network Interface
View the Status of Your Wireless Network Interfaces
Configure a Wireless Network Interface
Grab a New Address Using DHCP
Make a Network Connection Active
Bring a Network Connection Down
Display Your IP Routing Table
Change Your IP Routing Table
Troubleshooting Network Problems
Conclusion
Chapter 15. Working on the Network
Securely Log In to Another Computer
Securely Log In to Another Machine Without a Password
Securely Transfer Files Between Machines
Securely Copy Files Between Hosts
Securely Transfer and Back Up Files
Download Files Non-interactively
Download Websites Non-interactively
Download Sequential Files and Internet Resources
Conclusion
Chapter 16. Windows Networking
Discover the Workgroup's Master Browsers
Query and Map NetBIOS Names and IP Addresses
List a Machine's Samba Shares
Access Samba Resources with an FTP-Like Client
Mount a Samba Filesystem
Conclusion
Index
Copyright
Linux Phrasebook
Copyright © 2006 by Sams Publishing
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted
by any means, electronic, mechanical, photocopying, recording, or otherwise, without written
permission from the publisher. No patent liability is assumed with respect to the use of the information
contained herein. Although every precaution has been taken in the preparation of this book, the
publisher and author assume no responsibility for errors or omissions. Nor is any liability assumed for
damages resulting from the use of the information contained herein.
International Standard Book Number: 0-672-32838-0
Library of Congress Catalog Card Number: 2005904028
Printed in the United States of America
First Printing: June 2006
09 08 07 06 4 3 2 1
Trademarks
All terms mentioned in this book that are known to be trademarks or service marks have been
appropriately capitalized. Sams Publishing cannot attest to the accuracy of this information. Use of a
term in this book should not be regarded as affecting the validity of any trademark or service mark.
Warning and Disclaimer
Every effort has been made to make this book as complete and as accurate as possible, but no warranty
or fitness is implied. The information provided is on an "as is" basis.
Bulk Sales
Sams Publishing offers excellent discounts on this book when ordered in quantity for bulk purchases or
special sales. For more information, please contact
U.S. Corporate and Government Sales
1-800-382-3419
[email protected]
For sales outside of the U.S., please contact
International Sales
[email protected]
Acquisitions Editor
Jenny Watson
Development Editors
Songlin Qiu
Scott Meyers
Managing Editor
Gina Kanouse
Project Editor
George E. Nedeff
Copy Editor
Heather Wilkins
Indexer
John Bickelhaupt
Proofreader
Jessica McCarty
Technical Editor
Timothy Boronczyk
Publishing Coordinator
Vanessa Evans
Multimedia Developer
Dan Scherf
Book Designer
Gary Adair
Page Layout
Bronkella Publishing LLC
Dedication
The book is dedicated to Linux users, both old and new. Welcome!
About the Author
Scott Granneman is an author, educator, and consultant. As a writer, he focuses on open source
software, as shown by his first two books, Don't Click on the Blue E!: Switching to Firefox and Hacking
Knoppix, and his contributions to Ubuntu Hacks. In addition, he is a monthly columnist for
SecurityFocus, with op/ed pieces that focus on general security topics, and for Linux Magazine, in a
column focusing on new and interesting Linux software, and he blogs professionally on The Open Source
Weblog.
An Adjunct Professor at Washington University in St. Louis, Scott teaches a variety of courses about
technology and the Internet. As an educator, he has taught thousands of people of all ages, from
preteens to senior citizens, on a wide variety of topics, including literature, education, and technology.
With the shift in focus over the last decade to Linux and other open-source technologies, Scott has
worked to bring knowledge of these powerful new directions in software to people at all technical skill
levels.
As a Principal of WebSanity, a website creation and hosting firm, he works with businesses and
nonprofits to leverage the Internet's communication, sales, and service opportunities. He manages the
firm's Unix-based server environment, thereby putting what he writes and teaches into practical use,
and works closely with other partners and developers on the underlying WebSanity Content
Management System (CMS) software and its extensions.
Acknowledgments
No one learns about the Linux shell in a vacuum, and I have hundreds of writers, working over the last
several decades, to thank for educating me about the awesome possibilities provided by the Linux
command line. Books, websites, blogs, handouts at LUG meetings: All helped me learn about bash, and
they continue to teach me today. If I can give back just a fraction of what I have absorbed, I'll be
satisfied.
In addition to that general pool of knowledgeable individuals, I'd like to thank the people (and animals)
who gave me help and support during the writing of Linux Phrasebook.
My agent, Laura Lewin, who has been helpful in too many ways for me to recount.
My editors at Pearson, who gave me the opportunity to write this book in the first place and have
encouraged me whenever I needed prodding.
Robert Citek provided invaluable help with RPM, and was always there if I had a question. That man
knows Linux.
My business partner and lifelong buddy Jans Carton helped me focus when I needed it, and was a
(mostly) willing guinea pig for many new commands and options. Looking back at that day in fifth
grade when we met, who'da thunk it?
Jerry Bryan looked over everything I wrote and fixed all the little grammatical mistakes and typos I
made. I promise, Jerry: One day I'll learn the difference between "may" and "might!"
My wife, Denise Lieberman, patiently listened to me babble excitedly whenever I figured out something
cool, even though she had absolutely no idea what I was talking about. That's true love. Thanks,
Denise!
Finally, I must point to my cute lil' Shih Tzu Libby, who always knew exactly the right time to put her
front paws on my leg and demand ear scratches and belly rubs.
We Want to Hear from You!
As the reader of this book, you are our most important critic and commentator. We value your opinion
and want to know what we're doing right, what we could do better, what areas you'd like to see us
publish in, and any other words of wisdom you're willing to pass our way.
You can email or write me directly to let me know what you did or didn't like about this book, as well as
what we can do to make our books stronger.
Please note that I cannot help you with technical problems related to the topic of this book, and that due
to the high volume of mail I receive, I might not be able to reply to every message.
When you write, please be sure to include this book's title and author as well as your name and phone
or email address. I will carefully review your comments and share them with the author and editors who
worked on the book.
Email: [email protected]
Mail:
Mark Taber
Associate Publisher
Sams Publishing
800 East 96th Street
Indianapolis, IN 46240 USA
Reader Services
Visit our website and register this book at www.samspublishing.com/register for convenient access to
any updates, downloads, or errata that might be available for this book.
Introduction
Among key Linux features, the command-line shell is one of the most important. If you run a Linux
server, your main interface is more than likely going to be the shell. If you're a power user running
Linux on the desktop, you probably have a terminal open at all times. If you're a Linux newbie, you may
think that you'll never open up the command line, but you will sometime ... and the more you use
Linux, the more you're going to want to use that shell.
The shell in many ways is the key to Linux's power and elegance. You can do things with the command
line that you simply can't do with whatever GUI you favor. No matter how powerful KDE or GNOME may
be (or IceWM or XFCE or any of the other kajillion windowing environments out there), you will always
be able to do many things faster and more efficiently with a terminal. If you want to master Linux, you
need to begin by mastering the Linux command line.
The traditional method has been to use the Linux man pages. While man pages are useful, they are often
not enough, for one simple reason: They lack examples. Oh, a few man pages here and there have a few
examples, but by and large, examples are hard to come by. This presents a real problem for users at all
experience levels: It's one thing to see options listed and explained, but it's another thing entirely to see
those options used in real world situations.
This book is all about those missing examples. I've been using Linux for over a decade and I consider
myself pretty knowledgeable about my favorite operating system. On top of that, I'm so addicted to the
command line that I have KDE set up to automatically start Konsole, the KDE terminal, when I log in.
But I'm always lamenting with other Linux users the dearth of examples found in man pages. When I
was asked to write Linux Phrasebook, and told that it was to consist of hundreds of examples illustrating
the most important Linux commands, I replied, "I can't wait! That's a book I'd buy in a heartbeat!"
You're holding the result in your hands: a book about the Linux commands you just have to know, with
examples illustrating how to use each and every one. This is a reference book that will be useful now
and for years to come, but I also hope you find it enjoyable as well, and even a little fun.
Audience for This Book
I've written this book to be useful both to beginners (the folks that show up to meetings of our Linux
Users Group seeking guidance and a helping hand as they begin the adventure of using Linux) and to
experienced users who use the shell for everything from systems administration to games to
programming. If you've just started using Linux, this book will help teach you about the shell and its
power; if you've been using Linux for years and years, Linux Phrasebook will still teach you some new
tricks and remind you of some features you'd long ago forgotten.
There are many shells out there (csh, tcsh, zsh, to name but a few), but I use the default shell for virtually
every Linux distro: bash, the Bourne Again Shell. The bash shell is not only ubiquitous, but also
amazingly powerful and flexible. After you get comfortable with bash, you may choose to explore other
options, but knowledge of bash is required in the world of Linux.
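Not sure which shell you're running? A quick (if rough) check is to print the SHELL environment
variable, which reports your login shell; on virtually every Linux distro it points to bash:

$ echo $SHELL
/bin/bash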
I wrote this book using K/Ubuntu, but the commands discussed should work on your distro as well. The
only major difference comes when you run a command as root. Instead of logging in as root, K/Ubuntu
encourages the use of the sudo command; in other words, instead of running lsof firefox as root, a
K/Ubuntu user would run sudo lsof firefox.
In order to appeal to the widest number of readers out there, I showed the commands as though you
have to run them as root, without sudo. If you see a # in front of a command, that's the shell indicating
that root is logged in, which means you need to be root to run that command, or utilize sudo if you're
using K/Ubuntu or a similar distro.
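For example, a command printed in this book as

# lsof firefox

would be run by a K/Ubuntu user as the same command with sudo prepended, at an ordinary $ prompt:

$ sudo lsof firefox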
Conventions Used in This Book
This book uses the following conventions.
Monospace type is used to differentiate between code/programming-related terms and regular
English, and to indicate text that should appear on your screen. For example:
The df command shows results in kilobytes by default, but it's usually easier to comprehend if you
instead use the -h (or --human-readable) option.
It will look like this to mimic the way text looks on your screen.
An arrow (→) at the beginning of a line of code means that a single line of code is too long to fit
on the printed page. It signifies to readers that the author meant for the continued code to appear
on the same line. Many of the code examples in this book have been truncated due to length.
In addition to this, the following elements are used to introduce other pertinent information used
in this book.
A Note presents interesting pieces of information related to the surrounding discussion.
A Tip offers advice or teaches an easier way to do something.
A Caution advises you about potential problems and helps you steer clear of disaster.
Part I: Getting Started
Chapter 1. Things to Know About Your Command Line
Chapter 2. The Basics
Chapter 3. Learning About Commands
Chapter 4. Building Blocks
Chapter 1. Things to Know About Your Command Line
Before you really dig in to your bash shell, you first need to understand a few things that will help you
as you proceed throughout this book. These are some absolutes that you just gotta know, and, trickily,
some of them are not obvious at all. But after you understand them, some of the ways in which your
shell behaves will start making much more sense.
Everything Is a File
On a Linux system, everything is a file (everything!), which may seem obvious at first. Of course a text
document is a file, and so is an OpenOffice.org document, and don't forget a picture, an MP3, and a
video. Of course!
But what about a directory? It's a file, too: a special kind of file that contains information about other
files. Disk drives are really big files. Network connections are files. Even running processes are files. It's
all files.
To Linux, a file is just a stream of bits and bytes. Linux doesn't care what those bits and bytes form;
instead, the programs running on Linux care. To Linux, a text document and a network connection are
both files; it's your text editor that knows how to work with the text document, and your Internet
applications that recognize the network connection.
Throughout this book I'm going to refer to files. If it's appropriate, feel free to read that as "files and
directories and subdirectories and everything else on the system." In particular, many of the commands
I'll cover work equally well on documents and directories, so feel free to try out the examples on both.
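If you'd like to see this principle for yourself, the file command (found on virtually every distro, though
its exact output varies a bit between versions) describes a directory and a device file using the same
vocabulary it uses for ordinary files:

$ file /etc /dev/null
/etc:      directory
/dev/null: character special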
Maximum Filename Lengths
People who can look back to using MS-DOS (shudder!) remember that filenames could be no longer
than eight characters, plus a three-letter extension, giving you incredibly descriptive names such as
MSRSUME1.DOC. Pre-OS X Macs, on the other hand, extended that limit to 31 characters, which might
sound long but could still produce some odd-looking names.
Linux (and Unix) filenames can be up to 255 characters in length. That's an outrageous length for a
filename, and if you're getting anywhere even close to that, your picture should appear next to verbose
in the dictionary. You're given up to 255 characters, so feel free to be descriptive and accurate, but
don't go nuts.
In fact, it's a good idea to keep filenames below 80 characters because that's the width of your average
terminal and your filenames will appear on one line without wrapping. But that's just advice, not a
requirement. The freedom to describe a file in 200+ characters is yours; just use it wisely.
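Curious what limit your own system actually enforces? The getconf command can report it; on typical
Linux filesystems you'll see the 255-character maximum mentioned previously:

$ getconf NAME_MAX /home
255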
Names Are Case-Sensitive
Unlike Windows and Mac OS machines, Linux boxes are case-sensitive when it comes to filenames. You
could find the three following files in the same directory on a computer running Linux:
bookstobuy.txt
BooksToBuy.txt
BoOkStObUy.txt
To the Linux filesystem, those are three completely different files. If you were on Windows or Mac OS,
however, you would be asked to rename or cancel your attempt to add BooksToBuy.txt to a directory
that already contained bookstobuy.txt.
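You can prove this to yourself with the touch command, which you'll meet in Chapter 2 (the exact sort
order of the output may vary with your locale settings):

$ touch bookstobuy.txt BooksToBuy.txt BoOkStObUy.txt
$ ls
BoOkStObUy.txt  BooksToBuy.txt  bookstobuy.txt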
Case-sensitivity also means that commands and filenames must be entered exactly to match their real
command names or filenames. If you want to delete files by running rm, you can't type in RM or Rm or rM.
rm is it. And if you want to delete bookstobuy.txt and you instead enter rm BooksToBuy.txt, you just
removed the wrong file or no file at all.
The lesson is twofold: Linux forces you to be precise, but precision is a good thing. At the same time,
you're given a degree of flexibility that you won't find in other operating systems. That combination of
required precision and flexibility is one of the things that makes Linux fun to use, yet understandably a
bit confusing for new users.
Special Characters to Avoid in Names
Every operating system has certain no-no's when it comes to the characters you can use when naming
files and directories. If you use Mac OS, the colon (:) isn't allowed; Windows users, on the other hand,
can't use the backslash (\). Linux has its verboten characters as well. Before looking at those, however,
here are the characters that are always safe: numbers, letters (either uppercase or lowercase), dots (.),
and underscores (_). Other items on your keyboard might work perfectly, others might work but present
complications because your shell will try to interpret them in various ways, and some won't
work at all.
/ is never an option because that particular character is used to separate directories and files. Let's say
you want to keep a file listing books you want to buy. You somehow manage to name the file
books/to_buy.txt (with a forward slash) to distinguish it from books/on_loan.txt and books/lost.txt.
Now when you try to refer to your file at /home/scott/documents/books/to_buy.txt , your command isn't
going to work because your shell thinks that a books directory is inside the documents directory, but it
doesn't exist.
Instead of a forward slash, use an underscore (as I did for the to_buy part of the filename), or cram the
words together (as in booksToBuy.txt or BooksToBuy.txt).
You could use a dash, forming books-to-buy.txt, but I find that underscores work nicely as word
separators while remaining more unobtrusive than dashes. If you do use a dash, though, do not place it
at the beginning of a filename, as in -books_to_buy.txt, or after a space, as in books - to buy. As
you're going to see later, if you're using a command and you want to invoke special options for that
command, you preface the options with dashes. As you're going to see in Chapter 2, "The Basics," the
rm command deletes files, but if you tried typing rm -books_to_buy.txt, your shell would complain with
the following error message:
rm: invalid option -- b
You can use spaces if you'd like, forming books to buy.txt, but you have to let your shell know that
those spaces are part of the filename. Your shell usually sees a space as a separator between
arguments. Attempting to delete books to buy.txt confuses the shell: it would try to delete a file
named books, then one named to, and finally one named buy.txt. Ultimately, you won't delete books
to buy.txt, and you might accidentally delete files you didn't want to remove.
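To actually delete that file, you have to tell the shell that the spaces belong to the filename, for
instance by escaping each one with a backslash (a technique covered next):

$ rm books\ to\ buy.txt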
So how do you work with spaces in filenames? Or the * and ? characters, which you'll learn more about
in the next section? Or the ' and " characters, which also have special meanings in your shell? You have
several choices. Avoid using them, if at all possible. Or escape them by placing a \ in front of the
characters, which tells the shell that it should ignore their special usage and treat them as simple
characters. It can grow tiresome, however, making sure that \ is in its proper place all the time:
$ rm Why\ don\'t\ I\ name\ files\ with\ \*\?.txt
Yuck. A simpler method that's a bit less onerous is to surround the filename with quotation marks,
which function similarly to the \:
$ rm "Why don't I name files with *?.txt"
This will work, but it's still a pain to have to use quotation marks all the time. A better solution is just
not to use these characters in the first place. Table 1.1 has some characters and what to do about them.
Table 1.1. How to Use Special Characters in Filenames

Character  Advice
/          Never use. Cannot be escaped.
\          Must be escaped. Avoid.
-          Never use at beginning of file or directory name.
[ ]        Must be escaped. Avoid.
{ }        Must be escaped. Avoid.
*          Must be escaped. Avoid.
?          Must be escaped. Avoid.
'          Must be escaped. Avoid.
"          Must be escaped. Avoid.
Wildcards and What They Mean
Imagine that you have the following files (12 pictures and a text file) in a directory on your computer:
libby1.jpg
libby2.jpg
libby3.jpg
libby4.jpg
libby5.jpg
libby6.jpg
libby7.jpg
libby8.jpg
libby9.jpg
libby10.jpg
libby11.jpg
libby12.jpg
libby1.txt
You want to delete these files using the rm command (covered in Chapter 2) on your command line.
Removing them one at a time would be tedious and kind of silly. After all, one of the reasons to use
computers is to automate and simplify boring tasks. This is a job for wildcards, which allow you to
specify more than one file at a time by matching characters.
There are three wildcards: * (asterisk), ? (question mark), and [ ] (square brackets). Let's take each
one in order.
The * matches any character zero or more times. Table 1.2 has some uses of the * and what they would
match.
Table 1.2. The Wildcard * and What It Matches

Command         Matches
rm libby1*.jpg  libby1.jpg and libby10.jpg through libby12.jpg, but not libby1.txt
rm libby*.jpg   libby1.jpg through libby12.jpg, but not libby1.txt
rm *txt         libby1.txt, but not libby1.jpg through libby12.jpg
rm libby*       libby1.jpg through libby12.jpg, and libby1.txt
rm *            All files in the directory
The ? matches a single character. Table 1.3 has some uses of the ? and what they would match.
Table 1.3. The Wildcard ? and What It Matches

Command         Matches
rm libby1?.jpg  libby10.jpg through libby12.jpg, but not libby1.txt
rm libby?.jpg   libby1.jpg through libby9.jpg, but not libby10.jpg
rm libby?.*     libby1.jpg through libby9.jpg, as well as libby1.txt
The [ ] wildcard matches either a set of single characters ([12], for instance) or a range of characters
separated by a hyphen (such as [1-3]). Table 1.4 has some uses of the [ ] and what they would match.
Table 1.4. The Wildcard [ ] and What It Matches

Command             Matches
rm libby1[12].jpg   libby11.jpg and libby12.jpg, but not libby10.jpg
rm libby1[0-2].jpg  libby10.jpg through libby12.jpg, but not libby1.jpg
rm libby[6-8].jpg   libby6.jpg through libby8.jpg, but nothing else
You'll be using wildcards all through this book, so it's good to introduce them now. They make dealing
with files on the command line that much easier, and you're going to find them extremely helpful.
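One habit worth forming now (a suggestion, not a requirement): because rm is unforgiving, test a
wildcard with ls first, and if the listing shows exactly the files you expect, run the same pattern
with rm:

$ ls libby1[0-2].jpg
libby10.jpg  libby11.jpg  libby12.jpg
$ rm libby1[0-2].jpg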
Conclusion
You've learned the stuff about Linux that might not be obvious to the new user, but that will come in
handy as you begin using your shell and its commands. The details in this chapter will save you
headaches as you start applying the materials in later chapters. Wondering why you can't copy a
directory with a space in it, how to delete 1,000 files at one time, or why you can't run RM
bookstobuy.txt? Those are not fun questions. With a little up-front knowledge, you can avoid the
common mistakes that have plagued so many others.
With that out of the way, it's time to jump in and start learning commands. Turn the page, and let's go!
Chapter 2. The Basics
This chapter introduces the basic commands you'll find yourself using several times every day. Think of
these as the hammer, screwdriver, and pliers that a carpenter keeps in the top of his toolbox. After you
learn these commands, you can start controlling your shell and finding out all sorts of interesting things
about your files, folders, data, and environment.
List Files and Folders
ls
The ls command is probably the one that people find themselves using the most. After all, before you
can manipulate and use files in a directory, you first have to know what files are available. That's where
ls comes in, as it lists the files and subdirectories found in a directory.
Note
The ls command might sound simple (just show me the files!), but there are a surprising number
of permutations to this amazingly pliable command, as you'll see.
Typing ls lists the contents of the directory in which you're currently located. When you first log in to
your shell, you'll find yourself in your home directory. Enter ls, and you might see something like the
following:
$ ls
alias  Desktop    iso    pictures  program_files  todo
bin    documents  music  podcasts  src            videos
List the Contents of Other Folders
ls music
You don't have to be in a directory to find out what's in it. Let's say that you're in your home directory,
but you want to find out what's in the music directory. Simply type the ls command, followed by the
folder whose contents you want to view, and you'll see this:
$ ls music
Buddy_Holly  Clash  Donald_Fagen  new
In the previous example, a relative path is used, but absolute paths work just as well.
$ ls /home/scott/music
Buddy_Holly  Clash  Donald_Fagen  new
The ability to specify relative or absolute paths can be incredibly handy when you don't feel like moving
all over your file system every time you want to view a list of the contents of a directory. Not sure if you
still have that video of Tiger Woods sinking that incredible putt? Try this (~ is like an alias meaning your
home directory):
$ ls ~/videos
Ubuntu_Talk.mpeg
airhorn_surprise.wmv
apple_navigator.mov
b-ball-e-mail.mov
carwreck.mpg
nerdtv_1_andy_hertzfeld
nerdtv_2_max_levchin
nerdtv_3_bill_joy
RPG_Nerds.mpeg
tiger_woods_just_did_it.wmv
Yes, there it is: tiger_woods_just_did_it.wmv.
List Folder Contents Using Wildcards
ls ~/videos/*.wmv
You just learned how to find a file in a directory full of files, but there's a faster method. If you knew
that the video of Tiger Woods you're looking for was in Windows Media format (Boo! Hiss!) and
therefore ended with .wmv, you could use a wildcard to show just the files that end with that particular
extension.
$ ls ~/videos
Ubuntu_Talk.mpeg
airhorn_surprise.wmv
apple_navigator.mov
b-ball-e-mail.mov
carwreck.mpg
nerdtv_1_andy_hertzfeld
nerdtv_2_max_levchin
nerdtv_3_bill_joy
RPG_Nerds.mpeg
tiger_woods_just_did_it.wmv
$ ls ~/videos/*.wmv
airhorn_surprise.wmv
tiger_woods_just_did_it.wmv
There's another faster method, also involving wildcards: Look just for files that contain the word tiger.
$ ls ~/videos/*tiger*
tiger_woods_just_did_it.wmv
View a List of Files in Subfolders
ls -R
You can also view the contents of several subdirectories with one command. Say you're at a Linux Users
Group (LUG) meeting and installations are occurring around you fast and furious. "Hey," someone
hollers out, "does anyone have an ISO image of the new Kubuntu that I can use?" You think you
downloaded that a few days ago, so to be sure, you run the following command (instead of ls -R, you
could also have used ls --recursive):
$ ls -R ~/iso
iso:
debian-31r0a-i386-netinst.iso knoppix ubuntu
iso/knoppix:
KNOPPIX_V4.0.2CD.iso
KNOPPIX_V4.0.2DVD.iso
iso/ubuntu:
kubuntu-5.10-install.iso
kubuntu-5.10-live.iso
ubuntu-5.10-install.iso
ubuntu-5.10-live.iso
There it is, in ~/iso/ubuntu: kubuntu-5.10-install.iso. The -R option traverses the iso directory
recursively, showing you the contents of the main iso directory and every subdirectory as well. Each
folder is introduced with its path (relative to the directory in which you started) followed by a colon, and
then the items in that folder are listed. Keep in mind that the recursive option becomes less useful when
you have many items in many subdirectories, as the listing goes on for screen after screen, making it
hard to find the particular item for which you're looking. Of course, if all you want to do is verify that
there are many files and folders in a directory, it's useful just to see everything stream by, but that
won't happen very often.
View a List of Contents in a Single Column
ls -1
So far, you've just been working with the default outputs of ls. Notice that ls prints the contents of the
directory in alphabetical columns, with a minimum of two spaces between each column for readability.
But what if you want to see the contents in a different manner?
If multiple columns aren't your thing, you can instead view the results of the ls command as a single
column using, logically enough, ls -1 (or ls --format=single-column).
$ ls -1 ~/
bin
Desktop
documents
iso
music
pictures
src
videos
This listing can get out of hand if you have an enormous number of items in a directory, and more so if
you use the recursive option as well, as in ls -1R ~/. Be prepared to press Ctrl+c to cancel the
command if a list is streaming down your terminal with no end in sight.
View Contents As a Comma-Separated List
ls -m
Another option for those who can't stand columns of any form, whether it's one or many, is the -m
option (or --format=commas).
$ ls -m ~/
bin, Desktop, docs, iso, music, pix, src, videos
Think of the m in -m as a mnemonic for comma, and the option becomes easier to remember. Of course, this
option is also useful if you're writing a script and need the contents of a directory in a comma-separated
list, but that's a more advanced use of a valuable option.
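As a minimal sketch of that scripting use (the variable name here is invented for the example):

$ contents=$(ls -m ~/)
$ echo "Home directory contains: $contents"
Home directory contains: bin, Desktop, docs, iso, music, pix, src, videos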
View Hidden Files and Folders
ls -a
Up to this point, you've been viewing the visible files in directories, but don't forget that many
directories contain hidden files in them as well. Your home directory, for instance, is just bursting with
hidden files and folders, all made invisible by the placement of a . at the beginning of their names. If
you want to view these hidden elements, just use the -a option (or --all).
$ ls -a ~/
.
..
.3ddesktop
.abbrev_defs
.adobe
.gimp-2.2
.gksu.lock
.glade2
.gnome
.gnome2_private
.openoffice.org1.9
.openoffice.org1.9.95
.openoffice.org2
.opera
pictures
You should know several things about this listing. First, ls -a (the a stands for all) displays both hidden
and unhidden items, so you see both .gnome and pictures . Second, you'll always see the . and ..
because . refers to the current directory, while .. points to the directory above this one, the parent
directory. These two hidden files exist in every single folder on your system, and you can't get rid of
them. Expect to see them every time you use the -a option. Finally, depending on the directory, the -a
option can reveal a great number of hidden items of which you weren't aware.
Visually Display a File's Type
ls -F
The ls command doesn't tell you much about an item in a directory besides its name. By itself, it's hard
to tell if an item is a file, a directory, or something else. An easy way to solve this problem and make ls
really informative is to use the -F option (or --classify).
$ ls -F ~/bin
adblock_filters.txt
addext*
address_book.csv
address_book.sxc
address_book.xls
fixm3u*
flash.xml*
getip*
homesize*
html2text.py*
pix2tn.pl*
pop_login*
procmail/
programs_kill_artsd*
programs_usual*
This tells you quite a bit. An * or asterisk after a file means that it is executable, while a / or forward
slash indicates a directory. If the filename lacks any sort of appendage at all, it's just a regular ol' file.
Other possible endings are shown in Table 2.1.
Table 2.1. Symbols and File Types

Character  Meaning
*          Executable
/          Directory
@          Symbolic link
|          FIFO
=          Socket
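If you don't have a symbolic link handy to see the @ marker in action, you can make one with ln -s
(the link name here is made up for the example):

$ ln -s adblock_filters.txt filters
$ ls -F filters
filters@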
Display Contents in Color
ls --color
In addition to the symbols that are appended to files and folders when you use the -F option, you can
also ask your shell to display things in color, which gives an additional way to classify items and tell
them apart. Many Linux installs come with colors already enabled for shells, but if yours does not, just
use the --color option.
$ ls --color
adblock_filters.txt
addext
address_book.csv
fixm3u
flash.xml
getip
pix2tn.pl
pop_login
procmail
In this setup, executable files are green, folders are blue, and normal files are black (which is the
default color for text in my shell). Table 2.2 gives you the full list of common color associations (but
keep in mind that these colors may vary on your particular distro).
Table 2.2. Colors and File Types

Color                     Meaning
Default shell text color  Regular file
Green                     Executable
Blue                      Directory
Magenta                   Symbolic link
Yellow                    FIFO
Magenta                   Socket
Red                       Archive (.tar, .zip, .deb, .rpm)
Magenta                   Images (.jpg, .gif, .png, .tiff)
Magenta                   Audio (.mp3, .ogg, .wav)
Tip
Want to see what colors are mapped to the various kinds of files on your system? Enter
dircolors --print-database, and then read the results carefully. You can also use the
dircolors command to change those colors as well.
With the combination of --color and -F, you can see at a glance what kinds of files you're working with
in a directory. Now we're cookin' with gas!
$ ls -F --color
adblock_filters.txt
addext*
address_book.csv
fixm3u*
flash.xml*
getip*
pix2tn.pl*
pop_login*
procmail/
List Permissions, Ownership, and More
ls -l
You've now learned how to format the results of ls to tell you more about the contents of directories,
but what about the actual contents themselves? How can you learn more about the files and folders,
such as their size, their owners, and who can do what with them? For that information, you need to use
the -l option (or --format=long ).
$ ls -l ~/bin
total 2951
-rw-r--r--  1 scott scott  15058 2005-10-03 18:49 adblock_filters.txt
-rwxr-xr--  1 scott root      33 2005-04-19 09:45 addext
-rwxr--r--  1 scott scott    245 2005-10-15 22:38 backup
drwxr-xr-x  9 scott scott   1080 2005-09-22 14:42 bin_on_bacon
-rw-r--r--  1 scott scott 237641 2005-10-14 13:50 calendar.ics
-rwxr-xr--  1 scott root     190 2005-04-19 09:45 convertsize
drwxr-xr-x  2 scott scott     48 2005-04-19 09:45 credentials
The -l option stands for long, and as you can see, it provides a wealth of data about the files found in a
directory. Let's move from right to left and discuss what you see.
On the farthest right is the easiest item: The name of the listed item. Want ls to display more about it?
Then add the -F option to -l, like this: ls -lF. Color is easily available as well, with ls -lF --color.
Moving left, you next see a date and time. This is the time that the file was last modified, including the
date (in year-month-day format) and then the time (in 24-hour military time).
Farther left is a number that indicates the size of the item, in bytes. This is a bit tricky with folders;
for instance, the previous readout says that bin_on_bacon is 1080 bytes, or just a little more than one
kilobyte, yet it contains 887KB of content inside it. The credentials directory, according to ls -l, is 48
bytes, but contains nothing inside it whatsoever! What is happening?
Remember in Chapter 1, "Things to Know About Your Command Line," when you learned that
directories are just special files that contain a list of their contents? In this case, the contents of
credentials consists of nothing more than the .. that all directories have to refer to their parent, so it's
a paltry 48 bytes, while bin_on_bacon contains information about more than 30 items, bringing its size
up to 1080 bytes.
The next two columns to the left indicate, respectively, the file's owner and its group. As you can see in
the previous listing, almost every file is owned by the user scott and the group scott, except for addext
and convertsize, which are owned by the user scott and the group root.
Note
Those permissions need to be changed, which you'll learn how to do in Chapter 7,
"Ownerships and Permissions" (hint: the commands are chown and chgrp).
The next to last column as you move left contains a number. If you're examining a file, this number tells
you how many hard links exist for that file; if it's a directory, it refers to the number of items it
contains.
Tip
For more information about hard (and soft) links, see
www.granneman.com/techinfo/linux/thelinuxenvironment/softandhardlinks.htm, or search
Google for linux hard links.
And now you reach the final item on the left: The actual permissions for each file and directory. This
might seem like some arcane code, but it's actually very understandable with just a little knowledge.
There are 10 items, divided (although it doesn't look that way) into 4 groups. The first group consists of
the first character; the second group contains characters 2 through 4; the third consists of characters 5
through 7; and the fourth and final group is made up of characters 8 through 10. For instance, here's
how the permissions for the credentials directory would be split up: d|rwx|r-x|r-x .
That first group tells you what kind of item it is. You've already seen that -F and --color do this in
different ways, but so does -l. A d indicates that credentials is a directory, while a - in that first
position indicates a file. (Even if the file is executable, ls -l still uses just a -, which means that -F and
--color here give you more information.) There are, of course, other options that you might see in that
first position, as detailed in Table 2.3.
Table 2.3. Permission Characters and File Types

Character  Meaning
-          Regular file
-          Executable
d          Directory
l          Symbolic link
s          Socket
b          Block device
c          Character device
p          Named pipe
Tip
To view a list of files that shows at least one of almost everything listed in this table, try ls -l
/dev.
The next nine characters, making up groups two, three, and four, stand for, respectively, the permissions
given to the file's owner, the file's group, and all the other users on the system. In the case of addext,
shown previously, its permissions are rwxr-xr--, which means that the owner scott has rwx , the group
(in this case, also scott) has r-x , and the other users on the box have r-- . What's that mean?
In each case, r means "yes, read is allowed"; w means "yes, write is allowed" (with "write" meaning
both changing and deleting); and x means "yes, execute is allowed." A - means "no, do not allow this
action." If that - is located where an r would otherwise show itself, that means "no, read is not
allowed." The same holds true for both w and x.
Looking at addext and its permissions of rwxr-xr--, it's suddenly clear that the owner (scott) can read,
write, and execute the file; the members of the group (root) can read and execute the file, but not write
to it; and everyone else on the machine (often called the "world") can read the file but cannot write to it
or run it as a program.
Now that you understand what permissions mean, you'll start to notice that certain combinations seem
to appear constantly. For instance, it's common to see rw-r--r-- for many files, which means that the
owner can both read and write to the file, but both the group and world can only read the file. For
programs, you'll often see rwxr-xr-x, which allows everyone on the computer to read and run the
program, but restricts changing the file to its owner.
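If you ever want to double-check how a permission string reads, GNU stat can print the permissions in
both alphabetic and numeric form (numeric permissions are covered in Chapter 7); here it's run against
the addext file from the earlier listing:

$ stat -c '%A %a %n' addext
-rwxr-xr-- 754 addext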
Directories, however, are a bit different. The permissions of r, w, and x are pretty clear for a file: You
can read the file, write (or change) it, or execute it. But how do you execute a directory?
Let's start with the easy one: r. In the case of a directory, r means that the user can list the contents of
the directory with the ls command. A w indicates that users can add more files into the directory,
rename files that are already there, or delete files that are no longer needed. That brings us to x, which
corresponds to the capability to access a directory in order to run commands that access and use files in
that directory, or to access subdirectories inside that directory.
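A quick demonstration, borrowing the chmod command from Chapter 7 (the directory name is invented
for the example): take away a directory's x permission and you can no longer move into it, even as its
owner.

$ mkdir locked
$ chmod u-x locked
$ cd locked
bash: cd: locked: Permission denied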
As you can see, -l is incredibly powerful all by itself, but it becomes even more useful when combined
with other options. You've already learned about -a, which shows all files in a directory, so now it
should be obvious what -la would do (or --format=long --all).
$ ls -la ~/
drwxr-xr-x   2 scott scott   200 2005-07-28 01:31 alias
drwx------   2 root  root     72 2005-09-16 19:14 .aptitude
-rw-r--r--   1 scott scott  1026 2005-09-25 00:11 .audacity
drwxr-xr-x  10 scott scott   592 2005-10-18 11:22 .Azureus
-rw-------   1 scott scott  8800 2005-10-18 19:55 .bash_history
Note
From this point forward in this chapter, if scott is the owner and group for a file, I will remove
that data from the listing.
Reverse the Order Contents are Listed
ls -r
If you don't like the default alphabetical order that -l uses, you can reverse it by adding -r (or --reverse).
$ ls -lar ~/
-rw-------  8800 2005-10-18 19:55 .bash_history
drwxr-xr-x   592 2005-10-18 11:22 .Azureus
-rw-r--r--  1026 2005-09-25 00:11 .audacity
drwx------    72 2005-09-16 19:14 .aptitude
drwxr-xr-x   200 2005-07-28 01:31 alias
Note
Keep in mind that this is -r, not -R. -r means reverse, but -R means recursive.
When you use -l, the output is sorted alphabetically based on the name of the files and folders; the
addition of -r reverses the output, but it is still based on the filename. Keep in mind that you can add
-r virtually any time you use ls if you want to reverse the default output of the command and options
you're inputting.
Sort Contents by File Extension
ls -X
The name of a file is not the only thing you can use for alphabetical sorting. You can also sort
alphabetically by the file extension. In other words, you can tell ls to group all the files ending with
.doc together, followed by files ending with .jpg, and finally finishing with files ending with .txt. Use
the -X option (or --sort=extension); if you want to reverse the sort, add the -r option (or --reverse).
$ ls -lX ~/src
drwxr-xr-x      320 2005-10-06 22:35 backups
drwxr-xr-x     1336 2005-09-18 15:01 fonts
-rw-r--r--  2983001 2005-06-20 02:15 install.tar.gz
-rw-r--r--  6683923 2005-09-24 22:41 DuckDoom.zip
Folders go first in the list (after all, they have no file extension), followed by the files that do possess an
extension. Pay particular attention to install.tar.gz: it has two extensions, but the final one, the
.gz, is what is used by ls.
Sort Contents by Date and Time
ls -t
Letters are great, but sometimes you need to sort a directory's contents by date and time. To do so, use
-t (or --sort=time) along with -l; to reverse the sort, use -tr (or --sort=time --reverse) along with -l.
$ ls -latr ~/
-rw-------  8800 2005-10-18 19:55 .bash_history
drwx------   368 2005-10-18 23:12 .gnupg
drwxr-xr-x  2760 2005-10-18 23:14 bin
drwx------   168 2005-10-19 00:13 .Skype
All of these items except the last one were modified on the same day; the last one would have been first
if you weren't using the -r option and thereby reversing the results.
Note
Notice that you're using four options at one time in the previous command: -latr. You could
have instead used -l -a -t -r, but who wants to type all of those hyphens? It's quicker and
easier to just combine them all into one giant option. The long version of the options (those
that start with two hyphens and consist of a word or two), however, cannot be combined and
have to be entered separately, as in -la --sort=time --reverse.
Sort Contents by Size
ls -S
You can also sort by size instead of alphabetically by filename or extension, or by date and time. To sort
by size, use -S (or --sort=size).
$ ls -laS ~/
-rw-r--r--  109587 2005-10-19 11:53 .xsession-errors
-rw-------   40122 2005-04-20 11:00 .nessusrc
-rwxr--r--   15465 2005-10-12 15:45 .vimrc
-rw-------    8757 2005-10-19 08:43 .bash_history
When you sort by size, the largest items come first. To sort in reverse, with the smallest at the top, just
use -r.
Express File Sizes in Terms of K, M, and G
ls -h
In the previous section, the 15465 on .vimrc's line means that the file is about 15KB, but it's not always
convenient to mentally translate bytes into the equivalent kilobytes, megabytes, or gigabytes. Most of
the time, it's more convenient to use the -h option (or --human-readable), which makes things easier to
understand.
$ ls -laSh ~/
-rw-r--r--  100K 2005-10-19 11:44 .xsession-errors
-rw-------   40K 2005-04-20 11:00 .nessusrc
-rwxr--r--   16K 2005-10-12 15:45 .vimrc
-rw-------  8.6K 2005-10-19 08:43 .bash_history
In this example, you see K for kilobytes; if the files were big enough, you'd see M for megabytes or even
G for gigabytes. Some of you might be wondering how 40122 bytes for .nessusrc became 40K when you
used -h. Remember that 1024 bytes make up a kilobyte, so when you divide 40122 by 1024, you get
39.1816406 kilobytes, which ls -h rounds up to 40K. A megabyte is actually 1,048,576 bytes, and a
gigabyte is 1,073,741,824 bytes, so a similar sort of rounding takes place with those as well.
Note
In my ~/.bashrc file, I have the following aliases defined, which have served me well for
years. Use what you've learned in this section to extend these examples and create aliases
that exactly meet your needs (for more on aliases, see Chapter 11, "Your Shell").
alias l='ls -F'
alias l1='ls -1F'
alias la='ls -aF'
alias ll='ls -laFh'
alias ls='ls -F'
Display the Path of Your Current Directory
pwd
Of course, while you're listing the contents of directories hither and yon, you might find yourself
confused about just where you are in the file system. How can you tell in which directory you're
currently working? The answer is the pwd command, which stands for print working directory.
Note
The word print in print working directory means "print to the screen," not "send to printer."
The pwd command displays the full, absolute path of the current, or working, directory. It's not
something you'll use all the time, but it can be incredibly handy when you get a bit discombobulated.
$ pwd
/home/scott/music/new
Change to a Different Directory
cd
It's possible to list the contents of any directory simply by specifying its path, but often you actually
want to move into a new directory. That's where the cd command comes in, another one that is almost
constantly used by shell aficionados.
The cd command is simple to use: Just enter cd, followed by the directory into which you want to move.
You can use a relative path based on where you are currently, such as cd src or cd ../../, or you can
use an absolute path, such as cd /tmp or cd /home/scott/bin.
Change to Your Home Directory
cd ~
You should know about a few nice shortcuts for cd. No matter where you are, just enter a simple cd,
and you'll immediately be whisked back to your home directory. This is a fantastic timesaver, one that
you'll use all the time. Or, if you'd like, you can use cd ~ because the ~ is like a shortcut meaning "my
home directory."
$ pwd
/home/scott/music
$ cd ~
$ pwd
/home/scott
Change to Your Previous Directory
cd -
Another interesting possibility is cd -, which takes you back to your previous directory and then runs
the pwd command for you, printing your new (or is it old?) location. You can see it in action in this
example.
$ pwd
/home/scott
$ cd music/new
$ pwd
/home/scott/music/new
$ cd -
/home/scott
Using cd - can be useful when you want to jump into a directory, perform some action there, and then
jump back to your original directory. The additional printing to the screen of the information provided
by pwd is just icing on the cake to make sure you're where you want to be.
Change a File to the Current Time
touch
The touch command isn't one you'll find yourself using constantly, but you'll need it as you proceed
through this book, so it's a good one to cover now. Interestingly, the main reason for the existence of
touch (to update the access and modification times of a file) isn't the main reason you'll be using the
command. Instead, you're going to rely on its secondary purpose that undoubtedly gets more use than
the primary purpose!
Note
You can only use the touch command on a file and change the times if you have write
permission for that file. Otherwise, touch fails.
To simultaneously update both the access and modification times for a file (or folder), just run the basic
touch command.
$ ls -l ~/
drwxr-xr-x   848 2005-10-19 11:36 src
drwxr-xr-x  1664 2005-10-18 12:07 todo
drwxr-xr-x   632 2005-10-18 12:25 videos
-rw-r--r--   239 2005-09-10 23:12 wireless.log
$ touch wireless.log
$ ls -l ~/
drwxr-xr-x   848 2005-10-19 11:36 src
drwxr-xr-x  1664 2005-10-18 12:07 todo
drwxr-xr-x   632 2005-10-18 12:25 videos
-rw-r--r--   239 2005-10-19 14:00 wireless.log
Thanks to touch, both the modification time and the access time for the wireless.log file have changed,
although ls -l only shows the modification time. The file hadn't been used in more than a month, but
touch now updates it, making it look like it was just ... touched.
You can be more specific, if you'd like. If you want to change just the access time, use the -a option (or --time=access); to alter the modification time only, use -m (or --time=modify).
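To actually see the access time that -a changes, GNU ls helps out: Adding -u to ls -l displays access times instead of modification times. A quick sketch, with illustrative times:
$ touch -a wireless.log
$ ls -lu wireless.log
-rw-r--r-- 239 2005-10-19 14:05 wireless.log
$ touch -m wireless.log
$ ls -l wireless.log
-rw-r--r-- 239 2005-10-19 14:06 wireless.log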
Change a File to Any Desired Time
touch -t
Keep in mind that you aren't constrained to the current date and time. Instead, you can pick whatever
date and time you'd like, as long as you use this option and pattern: -t [[CC]YY]MMDDhhmm[.ss]. The
pattern is explained in Table 2.4.
Table 2.4. Patterns for Changing a File's Times

Characters  Meaning
CC          First two characters of a four-digit year
YY          Two-digit year:
            If 00-68, assumes that the first two digits are 20
            If 69-99, assumes that the first two digits are 19
            If nothing, assumes the current year
MM          Month (01-12)
DD          Day (01-31)
hh          Hour (00-23)
mm          Minute (00-59)
ss          Second (00-59)
It's very important that you include the zeroes if the number you want to use isn't normally two digits, or your pattern won't work. Here are a few examples of touch with the -t option in action to help get you started.
$ ls -l
-rw-r--r-- 239 2005-10-19 14:00 wireless.log
$ touch -t 197002160701 wireless.log
$ ls -l
-rw-r--r-- 239 1970-02-16 07:01 wireless.log
$ touch -t 9212310000 wireless.log
$ ls -l
-rw-r--r-- 239 1992-12-31 00:00 wireless.log
$ touch -t 3405170234 wireless.log
$ ls -l
-rw-r--r-- 239 2034-05-17 02:34 wireless.log
$ touch -t 10191703 wireless.log
$ ls -l
-rw-r--r-- 239 2005-10-19 17:03 wireless.log
First you establish that the current date and time for wireless.log is 2005-10-19 14:00. Then you go back in time some 35 years, to 1970-02-16 07:01, then forward a little more than 20 years to 1992-12-31 00:00, and then leap way into the future to 2034-05-17 02:34, when Linux computers will rule the world and humans will live in peace and open-source prosperity, and then finish back in our day and time.
You should draw a couple of lessons from this demonstration. You go back more than three decades by
specifying the complete four-digit year (1970), the month (02), the day (16), the hour (07), and the
minute (01). You don't need to specify seconds. After that, you never specify a four-digit year again. 92 in 9212310000 is within the range of 69-99, so touch assumes you mean 19 as the base century, while 34 in 3405170234 lies between 00 and 68, so 20 is used as the base. The last time touch is used, a year
isn't specified at all, just a month (10), a day (19), an hour (17), and minutes (03), so touch knows you
mean the current year, 2005. By understanding how to manipulate touch, you can change the date
stamps of files when necessary.
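One related convenience, not shown in the examples above: GNU touch can copy its timestamps from another file with the -r (or --reference) option, sparing you the pattern entirely. After this, wireless.log carries the same access and modification times as libby.jpg:
$ touch -r libby.jpg wireless.log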
Create a New, Empty File
touch
But that's not the main reason many people use touch. The command has an interesting effect if you try
to use the touch command on a file that doesn't exist: It creates an empty file using the name you
specified.
$ ls -l ~/
drwxr-xr-x 848 2005-10-19 11:36 src
drwxr-xr-x 632 2005-10-18 12:25 videos
$ touch test.txt
$ ls -l ~/
drwxr-xr-x 848 2005-10-19 11:36 src
-rw-r--r--   0 2005-10-19 23:41 test.txt
drwxr-xr-x 632 2005-10-18 12:25 videos
Why would you use touch in this way? Let's say you want to create a file now, and then fill it with
content later. Or you need to create several files to perform tests on them as you're playing with a new
command you've discovered. Both of those are great reasons, and you'll find more as you start learning
more about your shell.
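As an aside, bash's brace expansion (a shell feature, not part of touch itself) makes the several-test-files scenario a one-liner. This sketch creates three hypothetical test files at once:
$ touch test_{1,2,3}.txt
$ ls
test_1.txt  test_2.txt  test_3.txt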
Create a New Directory
mkdir
The touch command creates empty files, but how do you bring a new folder into existence? With the
mkdir command, that's how.
$ ls -l
drwxr-xr-x 848 2005-10-19 11:36 src
drwxr-xr-x 632 2005-10-18 12:25 videos
$ mkdir test
$ ls -l
drwxr-xr-x 848 2005-10-19 11:36 src
drwxr-xr-x  48 2005-10-19 23:50 test
drwxr-xr-x 632 2005-10-18 12:25 videos
Note
On most systems, new directories created by mkdir give the owner read, write, and execute
permissions, while giving groups and the world read and execute permissions. Want to change
those? Look in Chapter 7 at the chmod command.
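If you'd rather not wait for Chapter 7, mkdir can also set permissions as it creates the directory, via the -m (or --mode) option. A minimal sketch, with an illustrative listing:
$ mkdir -m 700 private
$ ls -l
drwx------ 48 2005-10-19 23:55 private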
It should make you happy to know that your shell takes care of you: Attempts to create a directory that
already exists fail and result in a warning message.
$ mkdir test
mkdir: cannot create directory 'test' : File exists
Create a New Directory and Any Necessary Subdirectories
mkdir -p
If you want to create a new subdirectory inside a new subdirectory inside a new subdirectory, this
seems like a rather tedious task at first: create the first subdirectory, cd into that one, create the
second subdirectory, cd into that one, and finally create the third subdirectory. Yuck. Fortunately, mkdir
has a wonderful option that makes this whole process much more streamlined: -p (or --parents).
$ ls -l
drwxr-xr-x 848 2005-10-19 11:36 src
$ mkdir -p pictures/personal/family
$ ls -l
drwxr-xr-x  72 2005-10-20 00:12 pictures
drwxr-xr-x 848 2005-10-19 11:36 src
$ cd pictures
$ ls -l
drwxr-xr-x 72 2005-10-20 00:12 personal
$ cd personal
$ ls -l
drwxr-xr-x 48 2005-10-20 00:12 family
Find Out What mkdir Is Doing As It Acts
mkdir -v
Now isn't that much easier? Even easier still would be using the -v option (or --verbose), which tells
you what mkdir is doing every step of the way, so you don't need to actually check to make sure mkdir
has done its job.
$ mkdir -pv pictures/personal/family
mkdir: created directory 'pictures'
mkdir: created directory 'pictures/personal'
mkdir: created directory 'pictures/personal/family'
One of the best things about being a Linux user is that the OS goes out of its way to reward lazy users (the lazier the better), and this is a great way to see that in action.
Copy Files
cp
Making copies of files is one of those things that computer users, no matter the OS, find themselves
doing all the time. One of the most venerable commands used by the Linux shell is cp, which copies files
and directories. The easiest way to use cp is simply to type in the command, followed by the file you want to copy, and then the copied file's new name; think of the command's structure as "cp file-you're-copying-from file-you're-copying-to." Another common way to express that relationship is "cp source target."
$ pwd
/home/scott/libby
$ ls
libby.jpg
$ cp libby.jpg libby_bak.jpg
$ ls
libby_bak.jpg libby.jpg
This example is pretty simple: The picture is copied into the same directory as the original file. You can
also copy files to another directory, or even copy files from a directory in which you're not currently
located to another directory located somewhere else on your file system.
$ pwd
/home/scott
$ ls ~/libby
libby_bak.jpg libby.jpg
$ cp pix/libby_arrowrock.jpg libby/arrowrock.jpg
$ ls ~/libby
arrowrock.jpg libby_bak.jpg libby.jpg
In this example, the file picks up a new name, arrowrock.jpg, as it's copied, although you could have kept the original name because the file is being copied into another directory entirely. In the first example for cp, however, you had to use a new name, libby_bak.jpg instead of libby.jpg, because you were copying the file into the same directory.
If you want to copy a file from another directory into your working directory (the one in which you currently find yourself), simply use . (Remember how you learned earlier in this chapter that . means "the current directory?" Now you see how that information can come in handy.) Of course, you can't change the name when you use . because the copy keeps the original filename.
$ pwd
/home/scott/libby
$ ls
libby_bak.jpg libby.jpg
$ cp ~/pix/arrowrock.jpg .
$ ls
arrowrock.jpg libby_bak.jpg libby.jpg
You don't need to specify a filename for the target if that target file is going to reside in a specific
directory; instead, you can just provide the directory's name.
$ ls -l
drwxr-xr-x  224 2005-10-20 12:34 libby
drwxr-xr-x  216 2005-09-29 23:17 music
drwxr-xr-x 1.6K 2005-10-16 12:34 pix
$ ls libby
arrowrock.jpg libby.jpg
$ cp pix/libby_on_couch.jpg libby
$ ls libby
arrowrock.jpg libby.jpg libby_on_couch.jpg
In the previous example, you need to be certain that a directory named libby already exists for
libby_on_couch.jpg to be copied into, or you would have ended up with a file named libby in your
home directory.
Copy Files Using Wildcards
cp *
It's time for more laziness, namely, the capability to copy several files at one time into a directory using
wildcards. If you've been careful naming your files, this can be a really handy timesaver because you
can exactly specify a group of files.
$ pwd
/home/scott/libby
$ ls ~/pix
arrowrock.jpg   by_pool_02.jpg  on_floor_01.jpg  on_floor_03.jpg
by_pool_01.jpg  by_pool_03.jpg  on_floor_02.jpg  on_floor_04.jpg
$ ls
arrowrock.jpg  libby.jpg  libby_on_couch.jpg
$ cp ~/pix/by_pool*.jpg .
$ ls
arrowrock.jpg   by_pool_02.jpg  libby.jpg
by_pool_01.jpg  by_pool_03.jpg  libby_on_couch.jpg
You aren't limited to the * wildcard; instead, you can more precisely identify which files you want to copy using square brackets, which match any one of the characters listed between [ and ]. If you want to copy the first three on_floor pictures but not the fourth, that's easily done with cp.
$ pwd
/home/scott/libby
$ ls ~/pix
arrowrock.jpg   by_pool_02.jpg  on_floor_01.jpg  on_floor_03.jpg
by_pool_01.jpg  by_pool_03.jpg  on_floor_02.jpg  on_floor_04.jpg
$ ls
arrowrock.jpg  libby.jpg  libby_on_couch.jpg
$ cp ~/pix/on_floor_0[1-3].jpg .
$ ls
arrowrock.jpg  libby_on_couch.jpg  on_floor_02.jpg
libby.jpg      on_floor_01.jpg     on_floor_03.jpg
Copy Files Verbosely
cp -v
Adding the -v option (or --verbose) shows you the progress of cp as it completes its work.
$ pwd
/home/scott/libby
$ ls ~/pix
arrowrock.jpg   by_pool_02.jpg  on_floor_01.jpg  on_floor_03.jpg
by_pool_01.jpg  by_pool_03.jpg  on_floor_02.jpg  on_floor_04.jpg
$ ls
arrowrock.jpg  libby.jpg  libby_on_couch.jpg
$ cp -v ~/pix/on_floor_0[1-3].jpg .
'/home/scott/pix/on_floor_01.jpg' -> './on_floor_01.jpg'
'/home/scott/pix/on_floor_02.jpg' -> './on_floor_02.jpg'
'/home/scott/pix/on_floor_03.jpg' -> './on_floor_03.jpg'
$ ls
arrowrock.jpg  libby_on_couch.jpg  on_floor_02.jpg
libby.jpg      on_floor_01.jpg     on_floor_03.jpg
The -v option does a nice job of keeping you abreast of the cp command's progress. It does such a good
job that you really don't need to run that last ls command because the -v option assures you that the
files you wanted were in fact copied.
Stop Yourself from Copying over Important Files
cp -i
The previous example demonstrates something important about cp that you need to know. In the "Copy Files Using Wildcards" example, you copied three on_floor pictures into the libby directory; in the previous example, you copied the same three on_floor images into the libby directory again. You copied over files that already existed, but cp didn't warn you, which is how Linux works: It assumes you know what you're doing, so it doesn't warn you about things like overwriting your files ... unless you ask it to do so. If you want to be forewarned before overwriting a file using the cp command, use the -i option (or --interactive). If you tried once again to copy the same files but used the -i option this time, you'd get a different result.
$ pwd
/home/scott/libby
$ ls ~/pix
arrowrock.jpg   by_pool_02.jpg  on_floor_01.jpg  on_floor_03.jpg
by_pool_01.jpg  by_pool_03.jpg  on_floor_02.jpg  on_floor_04.jpg
$ ls
arrowrock.jpg  libby_on_couch.jpg  on_floor_02.jpg
libby.jpg      on_floor_01.jpg     on_floor_03.jpg
$ cp -i ~/pix/on_floor_0[1-3].jpg .
cp: overwrite './on_floor_01.jpg'?
Bam! The cp command stops in its tracks to ask if you want to overwrite the first file it's trying to copy, on_floor_01.jpg. If you want to go ahead and copy the file, enter y; otherwise, enter n. If you do choose n, that doesn't mean cp stops completely; instead, you're asked about the next file, and the next, and so on. The only way to give an n to the whole process is to cancel it by pressing Ctrl+c. Similarly, there's no way to say yes to every question ahead of time, so if you want to copy 1,000 files over 1,000 other files with the same names and also intend to use the -i option, make sure you have plenty of time to sit and interact with your shell, because you're going to get asked 1,000 times if you really want to overwrite your files.
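If cp has been aliased to cp -i (as the Caution below suggests for root) and you truly do want to overwrite everything, two common workarounds exist, sketched here: Piping the yes command into cp answers y to every prompt, and prefixing the command with a backslash bypasses the alias entirely.
$ yes | cp -i ~/pix/on_floor_0[1-3].jpg .
$ \cp ~/pix/on_floor_0[1-3].jpg .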
Caution
For normal users, -i usually isn't necessary. For root users, however, it's darn near essential,
as a root user can errantly copy over a key system file, causing a disaster. For that reason,
it's a good idea to create an alias in the root user's .bashrc file, making sure that cp is really
cp -i instead.
alias cp='cp -i'
Copy Directories
cp -R
So far you've looked at copying files, but there are times you'll want to copy directories as well. You can't just enter cp source-directory target-directory, though, because that won't work as you might expect: cp skips the directory entirely, as you can see here.
$ pwd
/home/scott
$ cp libby libby_bak
cp: omitting directory 'libby'
If you want to copy directories, you need to include the -R option (or --recursive), which you should
recall from the ls command. The addition of -R means that the directory, as well as its contents, are
copied.
$ pwd
/home/scott
$ ls -l
drwxr-xr-x 328 2005-10-17 14:42 documents
drwxr-xr-x 240 2005-10-20 17:16 libby
$ cp -R libby libby_bak
$ ls -l
drwxr-xr-x 328 2005-10-17 14:42 documents
drwxr-xr-x 240 2005-10-20 17:16 libby
drwxr-xr-x 240 2005-10-20 17:17 libby_bak
Copy Files As Perfect Backups in Another Directory
cp -a
You might be thinking right now that cp would be useful for backing up files, and that is certainly true (although better programs exist, and we'll take a look at one of them, rsync, in Chapter 15, "Working on the Network"). With a few lines in a bash shell script, however, cp can be an effective way to back up various files and directories. The most useful option in this case would be the -a option (or --archive), which is equivalent to combining several options: -dpR (or --no-dereference --preserve --recursive). Another way of thinking about it is that -a ensures that cp doesn't follow symbolic links (which could grossly balloon your copy), preserves key file attributes such as owner and timestamp, and recursively follows subdirectories.
$ pwd
/home/scott
$ ls -l
drwxr-xr-x 216 2005-10-21 11:31 libby
drwxr-xr-x 216 2005-09-29 23:17 music
$ ls -lR libby
libby:
total 312
-rw-r--r--  73786 2005-10-20 12:12 arrowrock.jpg
-rw-r--r--  18034 2005-04-19 00:57 libby.jpg
-rw-r--r-- 198557 2005-04-19 00:57 on_couch.jpg
drwxr-xr-x    168 2005-10-21 11:31 on_floor

libby/on_floor:
total 764
-rw-r--r-- 218849 2005-10-20 16:11 on_floor_01.jpg
-rw-r--r-- 200024 2005-10-20 16:11 on_floor_02.jpg
-rw-r--r-- 358986 2005-10-20 16:11 on_floor_03.jpg
$ cp -a libby libby_bak
$ ls -l
drwxr-xr-x 216 2005-10-21 11:31 libby
drwxr-xr-x 216 2005-10-21 11:31 libby_bak
drwxr-xr-x 216 2005-09-29 23:17 music
$ ls -lR libby_bak
libby_bak:
total 312
-rw-r--r--  73786 2005-10-20 12:12 arrowrock.jpg
-rw-r--r--  18034 2005-04-19 00:57 libby.jpg
-rw-r--r-- 198557 2005-04-19 00:57 on_couch.jpg
drwxr-xr-x    168 2005-10-21 11:31 on_floor

libby_bak/on_floor:
total 764
-rw-r--r-- 218849 2005-10-20 16:11 on_floor_01.jpg
-rw-r--r-- 200024 2005-10-20 16:11 on_floor_02.jpg
-rw-r--r-- 358986 2005-10-20 16:11 on_floor_03.jpg
Note
Yes, you've probably figured it out already, but let me confirm: Libby is my dog, a cute lil'
shih-tzu who eventually ends up in some way in almost everything I write. To see her, check
out http://www.granneman.com/go/libby.
Move and Rename Files
mv
So cp copies files. That seems simple enough, but what about moving files? In the similar vein of
removing unnecessary vowels in commands, we have the mv command, short for move.
You're going to notice quickly that most of the options you learned for cp are quite similar to those used
by mv. This shouldn't surprise you; after all, mv in reality performs a cp -a, and then removes the file
after it's been successfully copied.
At its simplest, mv moves a file from one location to another on your file system.
$ pwd
/home/scott/libby
$ ls
libby_arrowrock.jpg libby_bak.jpg libby.jpg libby_on_couch.jpg on_floor
$ ls ~/pictures/dogs
libby_on_floor_01.jpg libby_on_floor_03.jpg
libby_on_floor_02.jpg libby_on_floor_04.jpg
$ mv ~/pictures/dogs/libby_on_floor_04.jpg libby_on_floor_04.jpg
$ ls
libby_arrowrock.jpg  libby.jpg           libby_on_floor_04.jpg
libby_bak.jpg        libby_on_couch.jpg  on_floor
$ ls ~/pictures/dogs
libby_on_floor_01.jpg libby_on_floor_02.jpg
libby_on_floor_03.jpg
Just as you did with cp, you can use a dot to represent the current directory if you don't feel like typing
out the filename again.
$ pwd
/home/scott/libby
$ ls
arrowrock.jpg libby.jpg on_couch.jpg on_floor
$ ls ~/pictures/dogs
on_floor_01.jpg on_floor_03.jpg
on_floor_02.jpg on_floor_04.jpg
$ mv ~/pictures/dogs/on_floor_04.jpg .
$ ls
arrowrock.jpg  libby.jpg  on_couch.jpg  on_floor  on_floor_04.jpg
$ ls ~/pictures/dogs
on_floor_01.jpg on_floor_02.jpg on_floor_03.jpg
If you're moving a file into a directory and you want to keep the same filename, you just need to specify
the directory. The filename stays the same.
$ pwd
/home/scott/libby
$ ls
arrowrock.jpg  libby.jpg  on_couch.jpg  on_floor  on_floor_04.jpg
$ ls on_floor
on_floor_01.jpg  on_floor_02.jpg  on_floor_03.jpg
$ mv on_floor_04.jpg on_floor
$ ls
arrowrock.jpg  libby.jpg  on_couch.jpg  on_floor
$ ls on_floor
on_floor_01.jpg  on_floor_02.jpg  on_floor_03.jpg  on_floor_04.jpg
In order to visually communicate that on_floor is in fact a directory, it's a good idea to use a / at the end of the directory into which you're moving files, like this: mv on_floor_04.jpg on_floor/. If on_floor is not a directory, your mv won't work, thus preventing you from accidentally writing over a file.
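Here's a quick illustration of that safety net, using a hypothetical misspelled target; the exact error text varies a bit from version to version of mv:
$ mv on_floor_04.jpg on_flor/
mv: cannot move 'on_floor_04.jpg' to 'on_flor/': No such file or directory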
Note
The cp and mv commands use many of the same options, which work the same way
for either command. For instance, -v copies and moves verbosely, and -i copies and
moves interactively.
Rename Files and Folders
mv
As you'll soon see, mv does something more than move, something that might seem a bit counterintuitive but makes perfect sense after you think about it a moment. Yes, mv moves files (that's its name, after all), but it also renames files. If you move a file, you have to give it a target name, and there's no rule that the target name has to be the same as the source name, so shell users since time immemorial have relied on the mv command to rename files and directories.
$ pwd
/home/scott/libby/by_pool
$ ls -F
libby_by_pool_02.jpg liebermans/
$ mv liebermans/ lieberman_pool/
$ ls -F
libby_by_pool_02.jpg lieberman_pool/
When copying a directory using cp, you had to specify the -R (or --recursive) option in order to copy the actual directory itself. Not so with mv, which, as you can see in the previous example, happily moves or renames directories without the need for any extra option at all, a nice change from cp.
Caution
You need to know a very important, and unfortunately easy to overlook, detail about mv. If
you are moving a soft link that points to a directory, you need to be extremely careful about
what you type. Let's say you have a soft link named dogs in your home directory that points to
/home/scott/pictures/dogs, and you want to move the link into the /home/scott/libby
subdirectory. This command moves just the soft link:
$ mv dogs ~/libby
This command, however, moves the directory to which the soft link points:
$ mv dogs/ ~/libby
What was the difference? A simple forward slash at the end of the soft link. No forward slash
and you're moving the soft link itself, and just the link; include a forward slash and you're
moving the directory to which the soft link is designed to point, not the soft link. Be careful!
Delete Files
rm
The rm command (short for remove) deletes files. It really, really removes them. They're gone. There's
no trash can or recycle bin on the Linux command line. You're walking a tightrope, baby, and if you fall
you go splat!
Okay, that's a little extreme. It's true that the shell lacks an undelete command, but you're not
completely hosed if you delete a file. If you stop working on your machine the second you realize your
mistake, if the operating system hasn't overwritten the erased file's sectors, and if you can successfully
use some rather complicated file recovery software, yes, it's possible to recover files. But it's no fun,
and you'll be cursing the whole time. Better to just be careful in the first place.
Tip
Many folks have tried to provide some sort of safety net for rm, ranging from remapping or
replacing the rm command to a temporary trash can
(www.comwt.com/open_source/projects/trashcan/); to adding a trash can onto the shell
(http://pages.stern.nyu.edu/~marriaga/software/libtrash/); to creating a new command,
trash, to replace rm (www.ms.unimelb.edu.au/~jrlooker/shell.html#trash).
If, on the other hand, you want to make positively sure that no one can possibly recover your deleted files, even men in black working for shadowy U.S. government agencies, use the shred command instead of rm. The shred command overwrites the file 25 times, making it effectively impossible to re-create. Before using shred, however, read its man page, as its success rate is highly dependent on the type of filesystem you're using.
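A minimal sketch of shred in action on a hypothetical file; the -u option tells shred to truncate and remove the file after overwriting it (the ls error message is approximate):
$ shred -u secret.log
$ ls secret.log
ls: cannot access secret.log: No such file or directory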
Using rm is easy. Some would say almost too easy.
$ pwd
/home/scott/libby/by_pool/lieberman_pool
$ ls
pool_01.jpg  pool_01.jpg_bak  pool_03.jpg  pool_03.jpg_bak
$ rm pool_01.jpg_bak
$ ls
pool_01.jpg  pool_03.jpg  pool_03.jpg_bak
Remove Several Files At Once with Wildcards
rm *
Wildcards such as * help to delete several files with one keystroke.
$ pwd
/home/scott/libby/by_pool/lieberman_pool
$ ls
pool_01.jpg  pool_01.jpg_bak  pool_03.jpg  pool_03.jpg_bak
$ rm *_bak
$ ls
pool_01.jpg  pool_03.jpg
Caution
Be very, very, very careful when removing files using wildcards, or you may delete far more than you intended! A classic example is typing rm * txt instead of rm *txt (see the errant space?). Instead of deleting all filenames ending in txt, the lone * means that all files in the directory are deleted, and then rm tries to delete a file named txt. Oops!
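One simple habit guards against this: Let the shell expand the wildcard harmlessly first with echo (or ls), inspect the result, and only then hand the same pattern to rm.
$ echo *_bak
pool_01.jpg_bak pool_03.jpg_bak
$ rm *_bak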
Remove Files Verbosely
rm -v
If you want to see what rm is doing as it does it, use the -v (or --verbose) option.
$ pwd
/home/scott/libby/by_pool/lieberman_pool
$ ls
pool_01.jpg  pool_01.jpg_bak  pool_03.jpg  pool_03.jpg_bak
$ rm -v *_bak
removed 'pool_01.jpg_bak'
removed 'pool_03.jpg_bak'
$ ls
pool_01.jpg pool_03.jpg
Stop Yourself from Deleting Key Files
rm -i
The -i option (or --interactive ) provides a kind of safety net. It asks you before deleting every file.
This is a good thing to use while you're running as root!
$ pwd
/home/scott/libby/by_pool/lieberman_pool
$ ls
pool_01.jpg  pool_01.jpg_bak  pool_03.jpg  pool_03.jpg_bak
$ rm -i *_bak
rm: remove regular file 'pool_01.jpg_bak'? y
rm: remove regular file 'pool_03.jpg_bak'? y
$ ls
pool_01.jpg  pool_03.jpg
When rm asks you what to do, a y is an agreement to nuke the file, and an n is an order to spare the file
and continue onward to the next file.
Delete an Empty Directory
rmdir
Removing files isn't hard at all, but what about directories?
$ pwd
/home/scott/libby/by_pool
$ ls
pool_02.jpg  lieberman_pool  lieberman_pool_bak
$ ls lieberman_pool_bak
pool_01.jpg  pool_01.jpg_bak  pool_03.jpg  pool_03.jpg_bak
$ rm lieberman_pool_bak
rm: cannot remove 'lieberman_pool_bak/': Is a directory
After a few moments of looking around, you might find the rmdir command, which is specifically
designed for deleting directories. So you try it.
$ rmdir lieberman_pool_bak
rmdir: 'lieberman_pool_bak/': Directory not empty
Dang it! That doesn't work either. The rmdir command only deletes empty directories. In this case, the
lieberman_pool_bak folder only contains four items, so it wouldn't be too hard to empty it and then use
rmdir. But what if you want to delete a directory that contains 10 subdirectories that each contains 10
more subdirectories and every single subdirectory contains 25 files? You're going to be deleting forever.
There has to be an easier way! For that, see the next section.
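For what it's worth, rmdir does offer a -p (or --parents) option, the mirror image of mkdir -p: It removes a directory and then each parent directory that the removal leaves empty. It still refuses to touch anything containing files, though, so it won't help with lieberman_pool_bak. A sketch, assuming the whole chain of directories is empty:
$ rmdir -p pictures/personal/family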
Remove Files and Directories That Aren't Empty
rm -Rf
There is an easier way to remove directories and the files within them: Use the combination of the -R (or --recursive) and -f (or --force) options. The -R tells rm to go down into every subdirectory it finds and delete everything, while the -f tells rm to just do it without bothering you with the niceties, such as warnings about folders that aren't empty.
$ pwd
/home/scott/libby/by_pool
$ ls
pool_02.jpg lieberman_pool lieberman_pool_bak
$ ls lieberman_pool_bak
pool_01.jpg  pool_01.jpg_bak  pool_03.jpg  pool_03.jpg_bak
$ rm -Rf lieberman_pool_bak
$ ls
pool_02.jpg lieberman_pool
Pow! That's a sure-fire way to get rid of a directory and all the files and subdirectories inside it.
Caution
The rm -Rf command can destroy your important files and your system.
The classic Linux warning is not to type rm -Rf /* as root. Yes, you will erase your system.
No, it will not be pretty. Yes, you will feel stupid.
In general, be careful when using wildcards with rm -Rf. There's a huge difference between rm
-Rf libby* and rm -Rf libby *. The former deletes everything in the working directory that
begins with libby; the latter deletes any file or folder named exactly libby, and then deletes
everything else in the directory.
You can also inadvertently create a disaster if you mean to enter rm -Rf ~/libby/* and
instead fat finger your command and tell the shell rm -Rf ~/libby /*. First the ~/libby
directory leaves, and then your file system begins its rapid journey toward nonexistence.
Here's one that's bitten a few folks trying to be clever: Never type rm -Rf .*/* to delete a directory that begins with . because you'll also match .. and end up deleting everything in the directory above your current working folder as well. Oops!
Once again: Be careful using rm -Rf when you're a normal user. Be hyper vigilant and
paranoid when using rm -Rf as root!
Delete Troublesome Files
Before leaving rm, you should know a couple of things about its relationship to certain files on your
system. First, no matter how hard you try, you will not be able to remove the . or .. directories
because they are required to keep your file system hierarchy in place. Besides, why would you want to remove them? Leave 'em alone!
How do you remove a file with a space in it? The normal way of invoking rm (the command, followed by the filename) won't work because rm thinks you're talking about two different files. Actually, removing Cousin Harold's picture isn't too hard: Just put the name of the file in quotation marks.
$ ls
cousin harold.jpg -cousin_roy.jpg cousin_beth.jpg
$ rm cousin harold.jpg
rm: cannot remove 'cousin': No such file or directory
rm: cannot remove 'harold.jpg': No such file or directory
$ rm "cousin harold.jpg"
$ ls
-cousin_roy.jpg cousin_beth.jpg
Here's more of a head-scratcher: How do you remove a file whose name starts with -?
$ ls
-cousin_roy.jpg cousin_beth.jpg
$ rm -cousin_roy.jpg
rm: invalid option -- c
Try 'rm --help' for more information.
D'oh! The rm command sees the - and thinks it's the start of an option, but it doesn't recognize an
option that starts with c. It continues with ousin_roy.jpg and doesn't know what to do.
You have two solutions. You can preface the problematic filename with --, which indicates to the
command that anything coming afterward should not be taken as an option, and is instead a file or
folder.
$ ls
-cousin_roy.jpg cousin_beth.jpg
$ rm -- -cousin_roy.jpg
$ ls
cousin_beth.jpg
Otherwise, you can use the . as part of your pathname, thus hiding the leading - that confuses the rm command and leads it into thinking the filename is actually an option.
$ ls
-cousin_roy.jpg cousin_beth.jpg
$ rm ./-cousin_roy.jpg
$ ls
cousin_beth.jpg
It just goes to show that the ingenuity of Linux users runs deep. That, and try not to put a hyphen at
the beginning of your filename!
Become Another User
su username
The su command (which stands for switch user, not, as is popularly imagined, super user) can allow one
user to temporarily act as another, but it is most often used when you want to quickly become root in
your shell, run a command or two, and then go back to being your normal, non-root user. Think of it as Clark Kent changing into his Superman duds, righting a few wrongs, and then going back to his non-super self.
Invoking su isn't difficult. Just enter su, followed by the user whose identity you want to assume.
$ pwd
/home/scott/libby
$ whoami
scott
$ su gromit
Password:
$ whoami
gromit
$ pwd
/home/scott/libby
There's a new command in that example, one that's not exactly the most widely used: whoami. It just
tells you who you are, as far as the shell is concerned. Here, you're using it to verify that su is working
as you'd expected.
Become Another User, with His Environment Variables
su -l
The su command only works if you know the password of the user. No password, no transformation. If it
does work, you switch to the shell that the user has specified in the /etc/passwd file: sh, tcsh, or bash,
for instance. Most Linux users just use the default bash shell, so you probably won't see any differences
there. Notice also in the previous example that you didn't change directories when you changed users.
In essence, you've become gromit, but you're still using scott's environment variables. It's as if you
found Superman's suit and put it on. You might look like Superman (yeah, right!), but you wouldn't
have any of his powers.
The way to fix that is to use the -l option (or --login).
$ pwd
/home/scott/libby
$ whoami
scott
$ su -l gromit
Password:
$ whoami
gromit
$ pwd
/home/gromit
Things look mostly the same as in the "Become Another User" example, but much is different behind the scenes. The fact that you're now in gromit's home directory should demonstrate that something has changed.
something has changed. The -l option tells su to use a login shell, as though gromit actually logged in
to the machine. You're now gromit in name, but you're using gromit's environment variables, and
you're in gromit's home directory (where gromit would find himself when he first logged in to this
machine). It's as though putting on Superman's skivvies also gave you the ability to actually leap tall
buildings in a single bound!
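One more trick worth knowing: su can run a single command as the other user and drop you straight back into your own shell, via the -c option. The output here is illustrative, assuming the gromit account from the earlier examples:
$ su -l gromit -c whoami
Password:
gromit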
Become root
su
Earlier in this chapter, the point was made that su is most often used to become root. You could use su root, or better still, su -l root, but there is a quicker way.
$ whoami
scott
$ su
Password:
$ whoami
root
Become root, with Its Environment Variables
su -
Entering su all by its lonesome is equivalent to typing su root: You're root in name and power, but that's all. Behind the scenes, your non-root environment variables are still in place, as shown here:
$ pwd
/home/scott/libby
$ whoami
scott
$ su
Password:
$ whoami
root
$ pwd
/home/scott/libby
When you use su -, you not only become root, you also use root's environment variables.
$ pwd
/home/scott/libby
$ whoami
scott
$ su -
Password:
$ whoami
root
$ pwd
/root
Now that's better! Appending - after su is the same as su -l root, but requires less typing. You're root
in name, power, and environment, which means you're fully root. To the computer, anything that root
can do, you can do. Have fun with your superpowers, but remember that with great power comes great ... aw, you know how it ends.
Conclusion
If this was law school, you would have learned in this chapter what misdemeanors, torts, and felonies
are. If this was a hardware repair class, it would have been RAM, hard drives, and motherboards. Since
this is a book about the Linux shell, you've examined the most basic commands that a Linux user needs
to know to effectively use the command line: ls, pwd , cd, touch, mkdir, cp, mv, rm, and su. rmdir and
whoami were thrown in as extra goodies. With these vital commands under your belt, you're ready to
move ahead to learning about how to learn about commands. Twisty? You ain't seen nothin' yet.
Chapter 3. Learning About Commands
In Chapter 2, "The Basics," you started learning about some basic system commands. You covered a lot,
but even so, much was left out. The ls command is an incredibly rich, powerful tool, with far more
options than were provided in Chapter 2. So how can you learn more about that command or others
that pique your interest? And how can you discover commands if you don't even know their names?
That's where this chapter comes in. Here you will find out how to learn more about the commands you
already know, those you know you don't know, and even those that you don't know that you don't
know!
Let's start with the two 800-pound gorillas, man and info, and move from there to some smaller, more precise commands that actually use much of the data collected by man. By the time you're finished,
you'll be ready to start learning about the huge variety of tools available to you in your shell
environment.
Find Out About Commands with man
man ls
Want to find out about a Linux command? Why, it's easy! Let's say you want to find out more about the
ls command. Enter man ls, and the man (short for manual) page appears, chock full of info about the various facets of ls. Try the same thing for some of the other commands you've examined in this book. You'll find man pages for (almost) all of them.
Still, as useful as man pages are, they have their problems. You have to know the name of a command
to really use them (although there are ways around that particular issue), and they're sometimes out of
date and missing the latest features of a command. They don't always exist for every command, which
can be annoying. But worst of all, even if you find one that describes a command in which you're
interested and it's up to date, you might still have a big problem: It might be next to useless.
The same developers who write the programs usually (but not always) write the man pages. Most of the
developers who write applications included with a Linux distribution are excellent programmers, but not
always effective writers or explainers of their own work. They know how things work, but they too often
forget that users don't know the things that the developers find obvious and intuitive.
With all of those problems, however, man pages are still a good resource for Linux users at all levels of
experience. If you're going to use Linux on the command line, you need to learn how to use and read
man pages.
As stated before, using this command isn't hard. Just enter man , followed by the command about which
you want to learn more.
$ man ls
LS(1)                          User Commands                          LS(1)
NAME
ls - list directory contents
SYNOPSIS
ls [OPTION]... [FILE]...
DESCRIPTION
List information about the FILEs (the current
directory by default).
Sort entries alphabetically if none of -cftuSUX
nor --sort.
Mandatory arguments to long options are mandatory
for short options too.
-a, --all
do not hide entries starting with .
-A, --almost-all
do not list implied . and ..
[Listing condensed due to length]
The list of data man provides in this case is quite extensive (over 200 lines' worth, in fact). Of course, not all
commands provide that much, and some provide far more. Your job is to read the various sections
provided in a man page, which usually (but not always) consist of the following:
NAME: The name of the command and a brief description
SYNOPSIS: The basic format of the command
DESCRIPTION: A longer overview of the command's purpose
OPTIONS: The real meat and potatoes of man: the various options for the command, along with short explanations of each one
FILES: Other files used by the command
AUTHOR: Who wrote the command, along with contact information
BUGS: Known bugs and how to report new ones
COPYRIGHT: This one's obvious: information about copyright
SEE ALSO: Other, related commands
Moving around in a man page isn't that hard. To move down a line at a time, use the down arrow; to
move up a line at a time, use the up arrow. To jump down a page, press the spacebar or f (for
forward); to jump up a page, press b (for backward). When you reach the end of a man page, man
might quit itself, depositing you back on the shell; it might, however, simply stop at the end without
quitting, in which case you should press q to quit the program. In fact, you can press q at any time to
exit man if you're not finding the information you want.
It can be hard to find a particular item in a man page, so sometimes you need to do a little searching.
To search on a man page after it's open, type /, followed by your search term, and then press Enter. If
your term exists, you'll jump to it; to jump to the next occurrence of the term, press Enter again (or n),
and keep pressing Enter (or n) to jump down the screen to each occurrence; to go backward, press
Shift+n.
Search for a Command Based on What It Does
man -k
With just a little training, you find that you can zoom around man pages and find exactly what you
need...assuming you know which man page to read. What if you know a little about what a command
does, but you don't know the actual name of the command? Try using the -k option (or --apropos) and
search for a word or phrase that describes the kind of command you want to discover. You'll get back a
list of all commands whose name or synopsis matches your search term.
$ man list
No manual entry for list
$ man -k list
last (1) - show listing of last logged in users
ls (1) - list directory contents
lshal (1) - List devices and their properties
lshw (1) - list hardware
lsof (8) - list open files
[Listing condensed due to length]
Be careful with the -k option, as it can produce a long list of results, and you might miss what you were
looking for. Don't be afraid to try a different search term if you think it might help you find the
command you need.
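One way to tame a long -k listing is to filter it with grep, using the pipe you'll meet in Chapter 4, "Building Blocks." The results shown are illustrative and will vary by system:
$ man -k list | grep -i devices
lshal (1) - List devices and their properties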
Tip
The -k option, also represented by --apropos, is exactly the same as the apropos command.
Quickly Find Out What a Command Does Based on Its Name
man -f
If you know a command's name but don't know what it does, there's a quick and dirty way to find out without requiring you to actually open the man page for that command. Use the -f option (or --whatis), and the command's synopsis appears.
$ man -f ls
ls (1) - list directory contents
Tip
Yes, the -f option, also known as --whatis, is the spitting image of the whatis command, which is examined in more detail later in this chapter.
Rebuild man's Database of Commands
man -u
Occasionally you'll try to use man to find out information about a command and man reports that there is
no page for that command. Before giving up, try again with the -u option (or --update ), which forces
man to rebuild the database of commands and man pages it uses. It's often a good first step if you think
things aren't quite as they should be.
$ man ls
No manual entry for ls
$ man -u ls
LS(1)                          User Commands                          LS(1)
NAME
       ls - list directory contents
SYNOPSIS
       ls [OPTION]... [FILE]...
[Listing condensed due to length]
Read a Command's Specific Man Page
man [1-8]
You might notice in the previous listing that the first line of man's page on ls references LS(1), while
earlier, when you used the -k option, all the names of commands were also followed by numbers in
parentheses. Most of them are 1, but one, lsof, is 8. So what's up with all of these numbers?
The answer is that man pages are categorized into various sections, numbered from 1 to 8, which break
down as follows (and don't worry if you don't recognize some of the examples, as many of them are pretty arcane and specialized):
1. General commands. Examples are cd, chmod, lp, mkdir, and passwd.
2. Low-level system calls provided by the kernel. Examples are intro and chmod.
3. C library functions. Examples are beep, HTML::Parser, and Mail::Internet.
4. Special files, such as devices found in /dev. Examples are console, lp, and mouse.
5. File formats and conventions. Examples are apt.conf , dpkg.cfg , hosts, and passwd.
6. Games. Examples are atlantik , bouncingcow, kmahjongg, and rubik.
7. Miscellanea, including macro packages. Examples are ascii, samba, and utf-8.
8. System administration commands used by root. Examples are mount and shutdown .
Almost every command we've looked at so far in this book falls into section 1, which isn't surprising
because we're focused on general use of your Linux system. But notice how some commands fall into
more than one section: chmod, for instance, is in both 1 and 2, while passwd can be found in 1 and 5. By
default, if you enter man passwd in your shell, man defaults to the lower number, so you'll get the section
1 man page for passwd, which isn't very helpful if you want to learn about the file passwd. To see the
man page for the file passwd, follow man with the section number for the data you want to examine.
$ man passwd
PASSWD(1)                                                      PASSWD(1)
NAME
passwd - change user password
SYNOPSIS
passwd [-f|-s] [name]
passwd [-g] [-r|-R] group
passwd [-x max] [-n min] [-w warn] [-i inact] login
passwd {-l|-u|-d|-S|-e} login
DESCRIPTION
passwd changes passwords for user and group accounts. A normal user...
[Listing condensed due to length]
$ man 5 passwd
PASSWD(5)                                                      PASSWD(5)
NAME
passwd - The password file
DESCRIPTION
passwd contains various pieces of information for each user account.
[Listing condensed due to length]
Print Man Pages
man -t
As easy as it is to view man pages using a terminal program, sometimes it's necessary to print out a
man page for easier reading and cogitation. Printing a man page isn't a one-step process, however, and
the commands that round out this particular section are actually using principles that will be covered
extensively later. But if you want to print out a man page, this is the way to do it. Just have faith that
you'll understand what these commands mean more fully after reading upcoming chapters.
Let's say you have a printer that you have identified with the name hp_laserjet connected to your
system. You want to print a man page about the ls command directly to that printer, so use the -t
option (or --troff) and then pipe the output to the lpr command, identifying your printer with the -P
option.
$ man -t ls | lpr -P hp_laserjet
Note
You'll learn about the pipe symbol (|) in Chapter 4, "Building Blocks," and lpr in Chapter 6,
"Printing and Managing Print Jobs."
In just a moment or two, depending upon the speed of your computer and your printer, hp_laserjet
should begin spitting out the man pages for ls. Maybe you don't want to actually print the pages,
however. Maybe just creating PDFs of the man page for the ls command would be enough. Once again,
the commands to do so might seem a bit arcane now, but they will be far more comprehensible very
soon.
Again, you use the -t option, but this time you send the output to a PostScript file named for the ls
command. If that process completes successfully, you convert the PostScript file to a PDF using the
ps2pdf command, and then after that finishes correctly, you delete the original PostScript file because
it's no longer needed.
$ man -t ls > ls.ps && ps2pdf ls.ps && rm ls.ps
Note
You'll learn about the > and && symbols in Chapter 4 and ps2pdf in Chapter 6.
If you want to make a printed library of man pages covering your favorite commands, or if you want
that library in PDF format (which can then be printed to hard copies, if necessary), now you know how
to do it. For something so simple, man is powerful and flexible, which makes it even more useful than it
already is.
Learn About Commands with info
info
The man command, and the resulting man pages, are simple to work with, even if the content isn't
always as user friendly as you might like. In response to that and other perceived shortcomings with
man pages, the GNU Project, which is responsible in many ways for many of the commands you're
reading about in this book, created its own format: Info pages, which use the info command for
viewing.
Info pages tend to be better written, more comprehensive, and more user friendly in terms of content
than man pages, but man pages are far easier to use. A man page is just one page, while Info pages
almost always organize their contents into multiple sections, called nodes, which might also contain
subsections, called subnodes. The trick is learning how to navigate not just around an individual page,
but also between nodes and subnodes. It can be a bit overwhelming at first to just move around and
find what you're seeking in Info pages, which is rather ironic: Something that was supposed to be nicer
for newbies than man is actually far harder for the novice to learn and use.
There are many facets to info, and one of the best commands to use info against is, in fact, info. To
learn how to use info, as well as read about info, enter the following command:
$ info info
This opens the Info page for the info command. Now you need to learn how to get around in this new
Info world.
Navigate Within info
Within a particular section's screen, to move down, or forward, one line at a time, use the down arrow;
to move up, or back, one line at a time, use the up arrow. When you reach the bottom, or end, of a
particular section, your cursor stops and you are not able to proceed.
If you instead want to jump down a screen at a time, use your keyboard's PageDown button; to jump
up a screen at a time, use the PageUp key instead. You are not able to leave the particular section
you're in, however.
If you reach the end of the section and you want to jump back to the top, just press b, which stands for beginning. Likewise, e takes you to the end.
If, at any time, as you're jumping around from place to place, you notice that things look a little
strange, such as the letters or words being distorted, press Ctrl+l to redraw the screen, and all should
be well.
Now that you know how to navigate within a particular section or node, let's move on to navigating
between nodes. If you don't want to use PageDown and PageUp to move forward and backward within a
section, you can instead use the spacebar to move down and the Backspace or Delete keys to move up.
These keys offer one big advantage over PageDown and PageUp, besides being easier to reach: When
you hit the end of a node, you automatically proceed onward to the next node, through subnodes if they
exist. Likewise, going up moves you back to the previous node, through any subnodes. Using the
spacebar, or the Backspace or Delete buttons, you can quickly run through an entire set of Info pages
about a particular command.
If you want to engage in fewer key presses, you can use n (for next) to move on to the next node at the same level. If you are reading a node that has subnodes and you press n, you skip those subnodes and move to the next node that's a peer of the one you're currently reading. If you're reading a subnode and press n, however, you jump to the next subnode. If n moves you to the next node at the current level, then p moves you to the previous one (p for previous, get it?), again at the same level.
If you instead want to move forward to a node or subnode, use the ], or right square brace, key. If you're reading a node and you press ], you'll jump to that node's first subnode, if one exists. Otherwise, you'll move to that node's next peer node. To move backward in the same fashion, use the [, or left square brace, key.
If you want to move up a node, to the parent of the node you're currently reading, use the u (for up) key. Be careful, though: it's easy to jump up past the home page of the command you're reading about in Info to what is termed the Directory node, the root node that leads to all other Info nodes (another way to reach the Directory node is to type d, for directory, at any time).
The Directory node is a particularly large example of a type of page you find all throughout Info: a
Menu page, which lists all subnodes or nodes. If you ever find yourself on a Menu page, you can quickly
navigate to one of the subnodes listed in that menu in one of two ways. First, type an m (for menu) and then start typing the name of the subnode to which you want to jump. For instance, here's the first page
you see when you enter info info on the command line:
File: info.info, Node: Top, Next: Getting Started, Up: (dir)
Info: An Introduction
*********************
The GNU Project distributes most of its on-line manuals in the "Info format", which you read using an "Info reader". You are probably using an Info reader to read this now.
[content condensed due to length]
* Menu:

* Getting Started::          Getting started using an Info reader.
* Expert Info::              Info commands for experts.
* Creating an Info File::    How to make your own Info file.
* Index::                    An index of topics, commands, and variables.
To jump to Expert Info, you could type m, followed by Exp. At this point, you could finish typing ert Info, or you could just press the Tab key and Info fills in the rest of the menu name that matches the
characters you've already entered. If Info complains, you have entered a typo, or more than one menu
choice matches the characters you've typed. Fix your typo, or enter more characters until it is obvious
to Info which menu choice is the one in which you're interested. If you realize that you don't want to go
to a Menu choice at this time, press Ctrl+g to cancel your command, and go back to reading the node
on which you find yourself.
Alternatively, you could just use your up or down arrow key to position the cursor over the menu choice
you want, and then press Enter. Either method works.
If you don't want to navigate Info pages and instead want to search, you can do that as well, in two
ways: By searching just the titles of all nodes for the Info pages about a particular command, or by
searching in the actual text of all nodes associated with a particular command. To search the titles,
enter i (for index, because this search uses the node index created by Info), followed by your search term, and press Enter. If your term exists somewhere in a node's title, you'll jump to it. If you want to repeat your search and go to the next result, press the comma key.
If you want to search text instead of titles, enter s (for search), followed by your search term or phrase, and then press Enter. To repeat that search, enter s, followed immediately by Enter. That's not as easy as just pressing the comma key like you do when you're searching titles, but it does work.
If at any time you get lost inside Info and need help, just press the ? key and the bottom half of your
window displays all of the various options for Info. Move up and down in that section using the keys
you've already learned. When you want to get out of help, press l.
Finally, and perhaps most importantly, to get out of Info altogether, just press q, for quit, which dumps you back in your shell. Whew!
Locate the Paths for a Command's Executable, Source Files, and Man Pages
whereis
The whereis command performs an incredibly useful function: It tells you the paths for a command's
executable program, its source files (if they exist), and its man pages. For instance, here's what you
might get for KWord, the word processor in the KOffice set of programs (assuming, of course, that the
binary, source, and man files are all installed):
$ whereis kword
kword: /usr/src/koffice-1.4.1/kword /usr/bin/kword /usr/bin/X11/kword /usr/share/man/man1/kword.1.gz
The whereis command first reports where the source files are: /usr/src/koffice-1.4.1/kword . Then it
informs you as to the location of any binary executables: /usr/bin/kword and /usr/bin/X11/kword .
KWord is found in two places on this machine, which is a bit unusual but not bizarre. Finally, you find
out where the man pages are: /usr/share/man/man1/kword.1.gz . Armed with this information, you can
now verify that the program is in fact installed on this computer, and you know now how to run it.
If you want to search only for binaries, use the -b option.
$ whereis -b kword
kword: /usr/bin/kword /usr/bin/X11/kword
If you want to search only for man pages, the -m option is your ticket.
$ whereis -m kword
kword: /usr/share/man/man1/kword.1.gz
Finally, if you want to limit your search only to sources, try the -s option.
$ whereis -s kword
kword: /usr/src/koffice-1.4.1/kword
The whereis command is a good, quick way to find vital information about programs on the computer
you're using. You'll find yourself using it more than you think.
Read Descriptions of Commands
whatis
Earlier in this chapter you found out about the -f option for man, which prints onscreen the description
of a command found in a man page. If you can remember that man -f presents that information, bully
for you. It might be easier, however, to remember the whatis command, which does exactly the same
thing: Displays the man page description for a command.
$ man -f ls
ls (1) - list directory contents
$ whatis ls
ls (1) - list directory contents
The whatis command also supports regular expressions and wildcards. To search the man database
using wildcards, use the -w option (or --wildcard).
$ whatis -w ls*
ls (1) - list directory contents
lsb (8) - Linux Standard Base support for Debian
lshal (1) - List devices and their properties
lshw (1) - list hardware
lskat (6) - Lieutnant Skat card game for KDE
[Listing condensed due to length]
Using wildcards might result in a slightly slower search than whatis without options, but it's pretty
negligible on today's fast machines, so you probably don't have to worry about it.
Regular expressions can be used with the -r (or --regex) option.
$ whatis -r ^rm.*
rm (1) - remove files or directories
rmail (8) - handle remote mail received via uucp
rmdir (1) - remove empty directories
rmt (8) - remote magtape protocol module
Tip
There's not enough room in this book to cover regular expressions, but you can read more in
Sams Teach Yourself Regular Expressions in 10 Minutes (ISBN: 0672325667) by Ben Forta.
Again, regular expressions are supposed to slow down the response you get with whatis, but you'll
probably never notice it.
The whatis command is easy to remember (easier than man -f, some would say) and it quickly returns
some important information, so memorize it.
Find a Command Based on What It Does
apropos
The whatis command is similar to man -f, and the apropos command is likewise similar to man -k. Both
commands search man pages for the names and descriptions of commands, helping you if you know
what a command does but can't remember its name.
Note
The word apropos isn't exactly in common usage, but it is a real word, and it means,
essentially, "relevant" or "pertinent." The word appropriate has a somewhat similar meaning,
but it's based on a different Latin root than apropos. Check the words out at
www.answers.com for yourself if you're a fellow linguaphile.
Using apropos is easy: Just follow the command with a word or phrase describing the feature of the
command in which you're interested.
$ man list
No manual entry for list
$ man -k list
last (1) - show listing of last logged in users
ls (1) - list directory contents
lshw (1) - list hardware
lsof (8) - list open files
[Listing condensed due to length]
$ apropos list
last (1) - show listing of last logged in users
ls (1) - list directory contents
lshw (1) - list hardware
lsof (8) - list open files
[Listing condensed due to length]
Just like whatis, you can use the -w (or --wildcard) or the -r (or --regex) options for searches. More
interestingly, though, you can use the -e option (or --exact) when you want to tightly focus on a word
or phrase, without any exception. For instance, in the previous listing, searching for list turned up the
last command because it had the word listing in its description. Let's try that same search, but with the
-e option.
$ apropos -e list
ls (1) - list directory contents
lshw (1) - list hardware
lsof (8) - list open files
[Listing condensed due to length]
This time, last doesn't show up because you wanted only results with the exact word list, not listing. In
fact, the list of results for list went from 80 results without the -e option to 55 results with the -e
option, which makes it far easier to precisely target your command searches and find exactly the
command you want.
Find Out Which Version of a Command Will Run
which
Think back to the whereis command and what happened when you ran it against KWord using the -b option to show binaries only.
$ whereis -b kword
kword: /usr/bin/kword /usr/bin/X11/kword
The executable for KWord is in two places. But which one will run first? You can tell that by running the
which command.
$ which kword
/usr/bin/kword
The which command tells you which version of a command will run if you just type its name. In other
words, if you type in kword and then press Enter, your shell executes the one found inside /usr/bin . If
you want to run the version found in /usr/bin/X11, you have to change directories using the cd
command and then enter ./kword, or use the absolute path for the command and type out
/usr/bin/X11/kword .
The which command is also a speedy way to tell if a command is on your system. If the command is on
your system and in your PATH, you'll be told where to find it; if the command doesn't exist, you're back
on the command line with nothing.
$ which arglebargle
$
If you want to find all the locations of a command (just like you would if you used whereis -b), try the -a (for all) option.
$ which -a kword
/usr/bin/kword
/usr/bin/X11/kword
Conclusion
The title of this chapter is "Learning About Commands," and that's what we've covered. By now, you've
seen that there are a variety of ways to find out more about your options on the command line. The two
big dogs are man and info, with their volumes of data and descriptions about virtually all the commands
found on your Linux computer. Remember that whereis, whatis, apropos, and which all have their
places as well, especially if your goal is to avoid having to wade through the sometimes overwhelming
verbiage of man and info, an understandable but often impossible goal. Sometimes you just have to roll
up your sleeves and start reading a man page. Think of it like broccoli: You might not enjoy it, but it's
good for you.
It's true that many of the commands in this chapter overlap to a degree. For instance, man -k is the
same as apropos and man -f is just like whatis, while whereis -b is functionally equivalent to which -a .
The choice of which one to use in a given situation is up to you. It's still a good idea, however, to know
the various similarities, so you'll be able to read shell scripts or instructions given by others and
understand exactly what's happening. Linux is all about variety and choice, even in such seemingly
small matters as commands run in the shell.
Chapter 4. Building Blocks
When you're young, you learn numbers, and then later you learn how to combine and work with those
numbers using symbols such as +, -, x, and =. So far in this book, you've learned several commands,
but each one has been run one at a time. Commands can actually be combined in more complex and
more interesting ways, however, using various symbols such as |, >, >>, and <. This chapter takes a
look at those building blocks that enable you to do some useful things with the commands you've
learned and the commands you'll be examining in greater detail in subsequent chapters.
Run Several Commands Sequentially
;
What if you have several commands you need to run consecutively, but some of them are going to take
a long time, and you don't feel like babysitting your computer? For instance, what if you have a huge
number of John Coltrane MP3s in a zipped archive file, and you want to unzip them, place them in a
new subdirectory, and then delete the archive file? Normally you'd have to run those commands one at
a time, like this:
Note
In order to save space, I removed the owner and group from the long listing.
$ ls -l /home/scott/music
-rw-r--r-- 1437931 2005-11-07 17:19 JohnColtrane.zip
$ unzip /home/scott/music/JohnColtrane.zip
$ mkdir -p /home/scott/music/coltrane
$ mv /home/scott/music/JohnColtrane*.mp3 /home/scott/music/coltrane/
$ rm /home/scott/music/JohnColtrane.zip
JohnColtrane.zip is a big archive, and even on a fast machine, unzipping that monster is going to take
some time, and you probably have better things to do than sit there and wait. Command stacking to the
rescue!
Command stacking puts all the commands you want to run on one line in your shell, with each specific
command separated by a semicolon (;). Each command is then executed in sequential order, and each
must terminate (successfully or unsuccessfully) before the next one runs. It's easy to do, and it can really
save you some time.
With command stacking, the previous series of commands now looks like this:
$ ls -l /home/scott/music
-rw-r--r-- 1437931 2005-11-07 17:19 JohnColtrane.zip
$ unzip /home/scott/music/JohnColtrane.zip ; mkdir -p /home/scott/music/coltrane ; mv /home/scott/music/JohnColtrane*.mp3 /home/scott/music/coltrane/ ; rm /home/scott/music/JohnColtrane.zip
Of course, you can also use this method to introduce short delays as commands run. If you want to take
a screenshot of everything you see in your monitor, just run the following command (this assumes you
have the ImageMagick package installed, which virtually all Linux distributions do):
$ sleep 3 ; import -frame window.tif
The sleep command in this case waits three seconds, and then the screenshot is taken using import .
The delay gives you time to minimize your terminal application and bring to the foreground any
windows you want to appear in the screenshot. The ; makes it easy to separate the commands logically
so you get maximum use out of them.
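If you'd rather capture the entire screen, ImageMagick's import command also accepts the -window
root option, which grabs the root window, meaning everything currently on your display. Here's a
slight variation on the previous command (the filename is just an example):
$ sleep 3 ; import -window root entire_screen.tif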
Caution
Be very careful when command stacking, especially when deleting or moving files! Make sure
what you typed is what you want because the commands will run, one right after another, and
you might end up with unexpected surprises.
Run Commands Only If the Previous Ones Succeed
&&
In the previous section, you saw that ; separates commands, as in this example:
$ unzip /home/scott/music/JohnColtrane.zip ; mkdir -p /home/scott/music/coltrane ; mv /home/scott/music/JohnColtrane*.mp3 /home/scott/music/coltrane/ ; rm /home/scott/music/JohnColtrane.zip
What if you fat-finger your command, and instead type this:
$ unzip /home/scott/JohnColtrane.zip ; mkdir -p /home/scott/music/coltrane ; mv /home/scott/music/JohnColtrane*.mp3 /home/scott/music/coltrane/ ; rm /home/scott/music/JohnColtrane.zip
Instead of unzip /home/scott/music/JohnColtrane.zip, you accidentally enter unzip
/home/scott/JohnColtrane.zip. You fail to notice this, so you go ahead and press Enter, and then get
up and walk away. Your computer can't unzip /home/scott/JohnColtrane.zip because that file doesn't
exist, so it blithely continues onward to the next command (mkdir ), which it performs without a
problem. However, the third command can't be performed ( mv ) because there aren't any MP3 files to
move because unzip didn't work. Finally, the fourth command runs, deleting the zip file (notice that you
provided the correct path this time) and leaving you with no way to recover and start over. Oops!
Note
Don't believe that this chain of events can happen? I did something very similar to it just a
few days ago. Yes, I felt like an idiot.
That's the problem with using ;: commands run in sequence, regardless of whether each one completes successfully. A
better method is to separate the commands with && , which also runs each command one after the
other, but only if the previous one completes successfully (technically, each command must return an
exit status of 0 for the next one to run). If a command fails, the entire chain of commands stops.
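If you want to see an exit status for yourself, bash stores the exit status of the most recent command
in the special variable $?, with 0 meaning success and anything else meaning failure. A quick sketch
(the directory names are just examples):
$ cd /home/scott/music ; echo $?
0
$ cd /home/scott/nosuchfolder ; echo $?
bash: cd: /home/scott/nosuchfolder: No such file or directory
1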
If you'd used && instead of ; in the sequence of previous commands, it would have looked like this:
$ unzip /home/scott/JohnColtrane.zip && mkdir -p /home/scott/music/coltrane && mv /home/scott/music/JohnColtrane*.mp3 /home/scott/music/coltrane/ && rm /home/scott/music/JohnColtrane.zip
Because the first unzip command couldn't complete successfully, the entire process stops. You walk
back later to find that your series of commands failed, but JohnColtrane.zip still exists, so you can try
once again. Much better!
Here are two more examples that show you just how useful && can be. In Chapter 13 , "Installing
Software ," you're going to learn about apt , a fantastic way to upgrade your Debian-based Linux box.
When you use apt , you first update the list of available software, and then find out if there are any
upgrades available. If the list of software can't be updated, you obviously don't want to bother looking
for upgrades. To make sure the second process doesn't (uselessly) occur, separate the commands with
&& :
# apt-get update && apt-get upgrade
Example two: You want to convert a PostScript file to a PDF using the ps2pdf command, print the PDF,
and then delete the PostScript file. The best way to set up these commands is with && :
$ ps2pdf foobar.ps && lpr foobar.pdf && rm foobar.ps
If you had instead used ; and ps2pdf failed, the rm would still run, sending the PostScript file to
nowheresville and leaving you without a way to start over.
Now are you convinced that && is often the better way to go? If there's no danger that you might delete
a file, ; might be just fine, but if one of your commands involves rm or something else from which there
is no recovery, you'd better use && and be safe.
Run a Command Only If the Previous One Fails
||
The && runs each command in sequence only if the previous one completes successfully. The || does the
opposite: If the first command fails (technically, it returns an exit status that is not 0), only then does
the second one run. Think of it like the words either/or: either run the first command, or run the second one.
The || is often used to send an alert to an administrator when a process stops. For instance, to ensure
that a particular computer is up and running, an administrator might constantly query it with the ping
command (you'll find out more about ping in Chapter 14, "Connectivity"); if ping fails, an email is sent
to the administrator to let him know.
ping -c 1 -w 15 -n 72.14.203.104 ||
{
echo "Server down" | mail -s 'Server down' [email protected]
}
Note
Wondering what the | is? Look ahead in this chapter to the "Use the Output of One Command
As Input for Another" section to find out what it is and how to use it.
With just a bit of thought, you'll start to find many places where || can help you. It's a powerful tool
that can really prove useful.
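You can even combine && and || on one line to handle both outcomes, a common shell idiom: the
command after && runs on success, and the command after || runs on failure. A minimal sketch (the
directory name is just an example):
$ mkdir /home/scott/music/coltrane && echo "Directory created" || echo "Could not create directory"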
Plug the Output of a Command into Another Command
$()
Command substitution takes the output of a command and plugs it in to another command as though
you had typed that output in directly. Surround the initial command that's run (the one that's going to
produce the output that's plugged in) with $(). An example makes this much clearer.
Let's say you just arrived home from a family dinner, connected your digital camera to your Linux box,
pulled the new photos off of it, and now you want to put them in a folder named after today's date.
$ pwd
/home/scott/photos/family
$ ls -1F
2005-11-01/
2005-11-09/
2005-11-15/
$ date "+%Y-%m-%d"
2005-11-24
$ mkdir $(date "+%Y-%m-%d")
$ ls -1F
2005-11-01/
2005-11-09/
2005-11-15/
2005-11-24/
In this example, date "+%Y-%m-%d" is run first, and then the output of that command, 2005-11-24, is
used by mkdir as the name of the new directory. This is powerful stuff, and as you look at the shell
scripts written by others (which you can easily find all over the Web) you'll find that command
substitution is used all over the place.
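Command substitution isn't limited to mkdir; it works anywhere on a command line. For instance, you
can combine it with the which command from Chapter 3 to get a long listing for a command's
executable without typing out its full path (the output here is just an example; the ellipsis stands in for
the usual ls -l details):
$ ls -l $(which kword)
-rwxr-xr-x 1 root root ... /usr/bin/kword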
Note
In the past, you were supposed to surround the initial command with backticks, the `
character at the upper left of your keyboard. Now, however, you're better advised to use the
characters used in this section: $().
Understand Input/Output Streams
To take advantage of the information in this chapter, you need to understand that there are three
input/output streams for a Linux shell: standard input, standard output, and standard error. Each of
these streams has a file descriptor (or numeric identifier), a common abbreviation, and a usual default.
For instance, when you're typing on your keyboard, you're sending input to standard input, abbreviated
as stdin and identified as 0. When your computer presents output on the terminal, that's standard
output, abbreviated as stdout and identified as 1. Finally, if your machine needs to let you know about
an error and displays that error on the terminal, that's standard error, abbreviated as stderr and
identified as 2.
Let's look at these three streams using a common command, ls. When you enter ls on your keyboard,
that's using stdin. After typing ls and pressing Enter, the list of files and folders in a directory appears
as stdout. If you try to run ls against a folder that doesn't exist, the error message that appears on
your terminal is courtesy of stderr.
Table 4.1 can help you keep these three streams straight.
Table 4.1. The Three Input/Output Streams

File Descriptor (Identifier)   Name              Common Abbreviation   Typical Default
0                              Standard input    stdin                 Keyboard
1                              Standard output   stdout                Terminal
2                              Standard error    stderr                Terminal
In this chapter, we're going to learn how to redirect input and output. Instead of having output appear
on the terminal, for instance, you can redirect it to another program. Or instead of acquiring input from
your typing, a program can get it from a file. After you understand the tricks you can play with stdin
and stdout, there are many powerful things you can do.
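Here's a quick way to convince yourself that stdout and stderr really are separate streams, using the >
redirection covered later in this chapter. Redirecting stdout to a file captures the directory listing, but
an error message similar to the following still lands on your terminal, because > touches only stdout
(file descriptor 1) and not stderr (2):
$ ls /usr/bin /nosuchdirectory > listing.txt
ls: /nosuchdirectory: No such file or directory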
Use the Output of One Command As Input for Another
|
It's a maxim that Unix is made up of small pieces, loosely joined. Nothing embodies that principle more
than the concept of pipes. A pipe is the | symbol on your keyboard, and when placed between two
commands, it takes the output from the first and uses it as input for the second. In other words, |
redirects stdout so it is sent to be stdin for the next command.
Here's a simple example that helps to make this concept clear. You already know about ls , and you're
going to find out about the less command in Chapter 5 , "Viewing Files ." For now, know that less
allows you to page through text files one screen at a time. If you run ls on a directory that has many
files, such as /usr/bin , things just zoom by too fast to read. If you pipe the output of ls to less ,
however, you can page through the output one screen at a time.
$ pwd
/usr/bin
$ ls -1
zipinfo
zipnote
zipsplit
zsoelim
zxpdf
[Listing truncated due to length - 2318 lines!]
$ ls -1 | less
411toppm
7z
7za
822-date
a2p
You see one screen of results at a time when you pipe the results of ls -1 through to less , which
makes it much easier to work with.
Here's a more advanced example that uses two commands discussed later: ps and grep . You'll learn in
Chapter 12 , "Monitoring System Resources," that ps lists running processes and in Chapter 9 ,
"Finding Stuff: Easy," that grep helps find lines in files that match a pattern. Let's say that Firefox is
acting strange, and you suspect that multiple copies are still running in the background. The ps
command lists every process running on your computer, but the output tends to be lengthy and flashes
by in an instant. If you pipe the output of ps to grep and search for firefox , you'll be able to tell
immediately if Firefox in fact is still running.
Note
In order to save space in this listing, I removed the owner, which in every instance was the
same person.
$ ps ux
 1504  0.8  4.4 75164 46124 ?     S    Nov20  1:19 kontact
19003  0.0  0.1  3376  1812 pts/4 S+   00:02  0:00 ssh [email protected]
21176  0.0  0.0     0     0 ?     Z    00:14  0:00 [wine-preloader] <defunct>
24953  0.4  3.3 51856 34140 ?     S    00:33  0:08 kdeinit: kword /home/scott/documents/clientele/current
[Listing truncated for length]
$ ps ux | grep firefox
scott 8272 4.7 10.9 184072 112704 ? Sl Nov19 76:45 /opt/firefox/firefox-bin
From 58 lines of output to one: now that's much easier to read!
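Pipes aren't limited to a single |, either; you can chain together as many commands as you like. For
instance, the wc command's -l option counts lines, so a sketch like this counts the matching Firefox
processes instead of listing them (the exact number depends on what's running, and the grep
command itself can show up in the count as well):
$ ps ux | grep firefox | wc -l
1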
Note
Keep in mind that many programs can work with pipes, but not all. The text editor vim (or
pico, nano, or emacs), for instance, takes over the entire shell so that all input from the
keyboard is assumed to be directed at vim, while all output is displayed somewhere in the
program. Because vim has total control of the shell, you can't pipe output using the program.
You'll learn to recognize non-pipable programs as you use the shell over time.
Redirect a Command's Output to a File
>
Normally, output goes to your screen, otherwise known as stdout. If you don't want output to go to the
screen and instead want it to be placed into a file, use the > (greater than) character.
$ pwd
/home/scott/music
$ ls -1F
Hank_Mobley/
Horace_Silver/
John_Coltrane/
$ ls -1F Hank_Mobley/* > hank_mobley.txt
$ cat hank_mobley.txt
1958_Peckin'_Time/
1960_Roll_Call/
1960_Soul_Station/
1961_Workout/
1963_No_Room_For_Squares/
$ ls -1F
Hank_Mobley/
hank_mobley.txt
Horace_Silver/
John_Coltrane/
Notice that before you used the >, the file hank_mobley.txt didn't exist. When you use > and redirect to
a file that doesn't already exist, that file is created. Here's the big warning: If hank_mobley.txt had
already existed, it would have been completely overwritten.
Caution
Once again: Be careful when using redirection, as you could potentially destroy the contents of
a file that contains important stuff!
Prevent Overwriting Files When Using Redirection
There's a way to prevent overwriting files when redirecting, however: the noclobber option. If you set
noclobber to on, bash won't allow redirection to overwrite existing files without your explicit permission.
To turn on noclobber, use this command:
$ set -o noclobber
At that point, if you want to use redirection and overwrite a file, use >| instead of just >, like this:
$ pwd
/home/scott/music
$ ls -1F
Hank_Mobley/
hank_mobley.txt
Horace_Silver/
John_Coltrane/
$ ls -1F Hank_Mobley/* > hank_mobley.txt
bash: hank_mobley.txt: cannot overwrite existing file
$ ls -1F Hank_Mobley/* >| hank_mobley.txt
$ cat hank_mobley.txt
1958_Peckin'_Time/
1960_Roll_Call/
1960_Soul_Station/
1961_Workout/
1963_No_Room_For_Squares/
If you decide you don't like or need noclobber, you can turn it off again:
$ set +o noclobber
To permanently turn on noclobber, you need to add set -o noclobber to your .bashrc file.
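One way to do that from the shell itself is with the >> append operator covered in the next section. Be
sure to use >> and not >, or you'll wipe out the rest of your .bashrc!
$ echo "set -o noclobber" >> ~/.bashrc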
Append a Command's Output to a File
>>
As you learned previously, the > character redirects output from stdout to a file. For instance, you can
redirect the output of the date command to a file easily enough:
$ date
Mon Nov 21 21:33:58 CST 2005
$ date > hank_mobley.txt
$ cat hank_mobley.txt
Mon Nov 21 21:33:58 CST 2005
Remember that > creates a new file if it doesn't already exist and overwrites a file that already exists. If
you use >> instead of >, however, your output is appended to the bottom of the named file (and yes, if
the file doesn't exist, it's created).
$ cat hank_mobley.txt
Mon Nov 21 21:33:58 CST 2005
$ ls -1F Hank_Mobley/* >> hank_mobley.txt
$ cat hank_mobley.txt
Mon Nov 21 21:33:58 CST 2005
1958_Peckin'_Time/
1960_Roll_Call/
1960_Soul_Station/
1961_Workout/
1963_No_Room_For_Squares/
Caution
Be careful with >>. If you accidentally type > instead, you won't append, you'll overwrite!
Use a File As Input for a Command
<
Normally your keyboard provides input to commands, so it is termed stdin. In the same way that you
can redirect stdout to a file, you can redirect stdin so it comes from a file instead of a keyboard. Why is
this handy? Some commands can't open files directly, and in those cases, the < (less than) is just
what you need.
For instance, the sort command, which orders lines of text, normally reads from stdin, meaning it
sorts whatever you type at the keyboard until you press Ctrl+d. You can use the < to redirect input,
however, so the sort command uses the contents of a file instead of stdin. In this case, let's use the
hank_mobley.txt file created in the previous section.
$ sort < hank_mobley.txt
1958_Peckin'_Time/
1960_Roll_Call/
1960_Soul_Station/
1961_Workout/
1963_No_Room_For_Squares/
Mon Nov 21 21:33:58 CST 2005
You're not going to use the < all the time, but it will be exactly what you need in a number of situations,
so keep it in mind.
Conclusion
This book started with some simple commands, but now you've learned the building blocks that allow
you to combine those commands in new and interesting ways. Going forward, it's going to get more
complicated, but it's going to get more powerful as well. The things covered in this chapter are a key
factor in gaining that power. Now, onward!
Part II: Working with Files
Chapter 5. Viewing Files
Chapter 6. Printing and Managing Print Jobs
Chapter 7. Ownerships and Permissions
Chapter 8. Archiving and Compression
Chapter 5. Viewing Files
One of the great things about most Linux boxes is that virtually all important system configurations,
logs, and information files are in ASCII text format. Because ASCII is a well-supported, ancient (in
computer terms) standard, there's a wealth of software commands you can use to view the contents of
those text files. This chapter looks at four commands you'll use constantly when reading ASCII text
files.
View Files on stdout
cat
DOS users have the type command that displays the contents of text files on a screen. Linux users can
use cat , which does the same thing. Want to view a file in the shell? Try cat .
$ cat Hopkins_-_The_Windhover.txt
I caught this morning morning's minion, kingdom
of daylight's dauphin, dapple-dawn-drawn Falcon, in his riding
Of the rolling level underneath him steady air, and striding
High there, how he rung upon the rein of a wimpling wing
In his ecstasy! then off, off forth on swing,
As a skate's heel sweeps smooth on a bow-bend: the hurl and gliding
Rebuffed the big wind. My heart in hiding
Stirred for a bird, -- the achieve of, the mastery of the thing!
Brute beauty and valour and act, oh, air, pride, plume, here
Buckle! AND the fire that breaks from thee then, a billion
Times told lovelier, more dangerous, o my chevalier!
No wonder of it: sheer plod makes plough down
sillion
Shine, and blue-bleak embers, ah my dear,
Fall, gall themselves, and gash gold-vermilion.
$
The cat command prints the file to the screen, and then deposits you back at the command prompt. If
the file is longer than your screen, you need to scroll up to see what flashed by.
That's the big problem with cat : If the document you're viewing is long, it zooms by, maybe for quite a
while, making it hard to read (can you imagine what cat Melville_-_Moby_Dick.txt would produce?). The
solution to that is the less command, discussed in "View Text Files a Screen at a Time" later in this chapter.
Concatenate Files to stdout
cat file1 file2
cat is short for concatenate, which means "to join together." The original purpose of cat is to take two
or more files and concatenate them into one file (the fact that you can use the cat command on just one
file and print it on the screen is a bonus). For instance, let's say you have a short poem by A. E.
Housman and one by Francis Quarles, and you want to view them at the same time.
$ cat housman_-_rue.txt quarles_-_the_world.txt
WITH rue my heart is laden
For golden friends I had,
For many a rose-lipt maiden
And many a lightfoot lad.
By brooks too broad for leaping
The lightfoot boys are laid;
The rose-lipt girls are sleeping
In fields where roses fade.
The world's an Inn; and I her guest.
I eat; I drink; I take my rest.
My hostess, nature, does deny me
Nothing, wherewith she can supply me;
Where, having stayed a while, I pay
Her lavish bills, and go my way.
Notice that cat doesn't separate the two files with a horizontal rule, a dash, or anything else. Instead, cat
mashes the two files together and spits them out. If you want more of a separation (having the last line
of "With rue my heart is laden" jammed up right next to the first line of "The world's an Inn" makes them
hard to read, for instance), make sure there's a blank line at the end of each file you're going to
concatenate together.
Concatenate Files to Another File
cat file1 file2 > file3
In the previous example, you concatenated two files and printed them on the screen to stdout. This
might not be what you want, though. If you're concatenating two files, it might be nice to save the
newly joined creation as another file you can use. To do this, redirect your output from stdout to a file,
as you learned in Chapter 4, "Building Blocks."
$ ls
housman_-_rue.txt quarles_-_the_world.txt
$ cat housman_-_rue.txt quarles_-_the_world.txt > poems.txt
$ ls
housman_-_rue.txt poems.txt quarles_-_the_world.txt
Now you can do whatever you want with poems.txt. If you want to add more poems to it, that's easy
enough to do:
$ cat housman_-_one-and-twenty.txt >> poems.txt
Notice that the >> was used to append the new poem to poems.txt this time. The following command
would not have worked:
$ cat housman_-_one-and-twenty.txt poems.txt > poems.txt
If you tried concatenating a file to itself, you'd see this error message, helpfully explaining that it is
impossible to fulfill your request:
cat: poems.txt: input file is output file
Concatenate Files and Number the Lines
cat -n file1 file2
When working with poems and source code, it's really nice to have numbered lines so that references
are clear. If you want to generate line numbers when you use cat , add the -n option (or --number ).
$ cat -n housman_-_rue.txt quarles_-_the_world.txt
 1  WITH rue my heart is laden
 2  For golden friends I had,
 3  For many a rose-lipt maiden
 4  And many a lightfoot lad.
 5  By brooks too broad for leaping
 6  The lightfoot boys are laid;
 7  The rose-lipt girls are sleeping
 8  In fields where roses fade.
 9  The world's an Inn; and I her guest.
10  I eat; I drink; I take my rest.
11  My hostess, nature, does deny me
12  Nothing, wherewith she can supply me;
13  Where, having stayed a while, I pay
14  Her lavish bills, and go my way.
Line numbers can be incredibly useful, and cat provides a quick and dirty way to add them to a file.
Note
For a vastly better cat , check out dog (more information is available at
http://opensource.weblogsinc.com/2005/02/17/why-dogs-are-betters-than-cats). Instead of
local files, you can use dog to view the HTML source of web pages on stdout, or just a list of
images or links on the specified web pages. The dog command converts all characters to
lowercase or uppercase; converts line endings to Mac OS, DOS, or Unix; and even allows you
to specify a range of lines to output (lines 5-25, for instance). Not to mention, the man
page for dog is one of the funniest ever. This is one dog that knows a lot of new tricks!
And here's another program that was created in response to cat : tac . Yes, that's cat
backward. And that's what tac does: It concatenates files backward. That's not something
you'll use all the time, but when you do need that functionality, it's nice to know that it's easily
available.
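For instance, running tac on the Housman poem used earlier in this chapter prints its lines in
reverse order, last line first:
$ tac housman_-_rue.txt
In fields where roses fade.
The rose-lipt girls are sleeping
The lightfoot boys are laid;
By brooks too broad for leaping
[Listing condensed due to length]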
View Text Files a Screen at a Time
less file1
The cat command is useful, but if you're trying to read a long file, it's not useful at all because text just
flows past in an unending, unreadable stream. If you're interested in viewing a long file on the
command line (and by "long," think more than a page or two), you don't want cat ; you want less.
The less command is an example of a pager, a program that displays text files one page at a time.
Others are more, pg, and most; in fact, less was released back in 1985 as an improved more, proving
once again that less is more!
Opening a file with less (even an enormous one like Milton's Paradise Lost) couldn't be easier:
$ less Paradise_Lost.txt
The less command takes over your entire screen, so you have to navigate within less using your
keyboard, and you have to quit less to get back to the command line. To navigate inside less, use the
keys in Table 5.1:
Table 5.1. Key Commands for less

Key Command                    Action
PageDn, e, or spacebar         Forward one page
PageUp or b                    Back one page
Return, e, j, or down arrow    Forward one line
y, k, or up arrow              Back one line
G or p                         Forward to end of file
1G                             Back to beginning of file
Esc-) or right arrow           Scroll right
Esc-( or left arrow            Scroll left
q                              Quit less
As you can see, you have many options for most commands. Probably the two you'll use most often are
those used to move down a page at a time and quit the program.
To view information about the file while you're in less, press =, which displays some data at the very
bottom of your screen similar to the following:
Paradise_Lost.txt lines 7521-7560/10762 byte 166743/237306 70% (press RETURN)
As you can see, you're helpfully told to press Enter to get rid of the data and go back to using less.
In the same way that you could tell cat to stick in line numbers for a file, you can also order less to
display line numbers. Those numbers only appear, of course, while you're using less. After you press q,
the numbers are gone. To view the file, but with numbers at the start of each line, start less with the -N
(or --LINE-NUMBERS) option, and yes, you must use all caps:
$ less -N Paradise_Lost.txt
Search Within Your Pager
If you're using less to view a large file or an especially dense one, it can be difficult to find the text in
which you're particularly interested. For instance, what if you want to know if Milton uses the word
apple in Paradise Lost to describe the fruit that Adam and Eve ate? While in less, press / and then type
in the pattern for which you'd like to search (you can even use regular expressions if you'd like). After
your pattern is in place, press Enter and less jumps to the first instance of your search pattern, if it
exists. If your pattern isn't in the file, less tells you that:
Pattern not found (press RETURN)
Repeating your search is also easy and can be done either forward or backward in your file. Table 5.2
covers the main commands you need for searching within less.
Table 5.2. Search Commands for less

Key Command    Action
/pattern       Search forward for pattern using regex
n              Repeat search forward
N              Repeat search backward
Note
No, Milton never explicitly refers to the fruit as an apple; instead, it's just a "fruit." Yes, I
worked on a doctorate in Seventeenth Century British Literature for a number of years. No, I
didn't finish it, which explains why I'm writing this book and not one solely devoted to John
Milton and Restoration literature.
Edit Files Viewed with a Pager
It's true that less itself is not an editor, just a viewer, but you can pass the file you're viewing with less
to a text editor such as vim or nano for editing by pressing v. Try it. View a file with less, and then press
v. Within a second or two, less disappears and a full-screen text editor takes its place. Make your edits,
quit your editor, and you're back in less with your new changes in place.
If you aren't happy with the editor that you find yourself in when you press v, you can change it to one
of your choice. For instance, if you want to use vim , run the following command before using less:
$ export EDITOR=vim
You only need to run that command once per session, and every time you open less after that, vim will
be your editor. If you end the session, however, you'll need to enter the export command again, which
can quickly grow tedious. Better to add the following line to your .bashrc file, so that it's automatically
applied:
export EDITOR=vim
View the First 10 Lines of a File
head
If you just need to see the first 10 lines of a file, such as Chaucer's Canterbury Tales, there's no need to
use either cat or less. Instead, use the head command, which prints out exactly the first 10 lines of a
file and then deposits you back on the command line.
$ head Canterbury_Tales.txt
Here bygynneth the Book of the Tales of Caunterbury
General Prologue
Whan that Aprill, with his shoures soote
The droghte of March hath perced to the roote
And bathed every veyne in swich licour,
Of which vertu engendred is the flour;
Whan Zephirus eek with his sweete breeth
Inspired hath in every holt and heeth
$
The head command is great for quick glances at a text file, even if that file is enormous. In just a
second, you see enough to know if it's the file you need or not.
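The head command also works nicely at the end of a pipe (pipes are covered in Chapter 4, "Building
Blocks"). For instance, if you just want a quick peek at the first few lines that ps produces instead of its
entire lengthy output, you might try something like this:
$ ps ux | head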
View the First 10 Lines of Several Files
head file1 file2
You can use head to view the first 10 lines of several files at one time. This sounds a bit similar to cat ,
and it is, except that head helpfully provides a header that separates the files so it's clear which one is
which. Here are the first 10 lines of Chaucer's Canterbury Tales and Milton's Paradise Lost.
$ head Canterbury_Tales.txt Paradise_Lost.txt
==> Canterbury_Tales.txt <==
Here bygynneth the Book of the Tales of Caunterbury
General Prologue
Whan that Aprill, with his shoures soote
The droghte of March hath perced to the roote
And bathed every veyne in swich licour,
Of which vertu engendred is the flour;
Whan Zephirus eek with his sweete breeth
Inspired hath in every holt and heeth
==> Paradise_Lost.txt <==
Book I
Of Man's first disobedience, and the fruit
Of that forbidden tree whose mortal taste
Brought death into the World, and all our woe,
With loss of Eden, till one greater Man
Restore us, and regain the blissful seat,
Sing, Heavenly Muse, that, on the secret top
Of Oreb, or of Sinai, didst inspire
That shepherd who first taught the chosen seed
The head command automatically separates the two excerpts with a space and then a header, which
really makes it easy to see the different files clearly.
View the First Several Lines of a File or Files
head -n
If you don't want to see the first 10 lines of a file, you can tell head to show you a different number of
lines by using the -n option followed by a number such as 5 (or --lines=5). If you specify two or more
files, such as Chaucer's Canterbury Tales and Milton's Paradise Lost, the results show all the files.
$ head -n 5 Canterbury_Tales.txt Paradise_Lost.txt
==> Canterbury_Tales.txt <==
Here bygynneth the Book of the Tales of Caunterbury
General Prologue
Whan that Aprill, with his shoures soote
==> Paradise_Lost.txt <==
Book I
Of Man's first disobedience, and the fruit
Of that forbidden tree whose mortal taste
Brought death into the World, and all our woe,
Notice that the five lines include blank lines as well as those with text. Five lines are five lines, no
matter what those lines contain.
View the First Several Bytes, Kilobytes, or Megabytes of a
File
head -c
The -n option allows you to specify the number of lines you view at the top of a file, but what if you
want to see a certain number of bytes? Or kilobytes? Or even (and this is a bit silly because it would
scroll on forever) megabytes? Then use the -c (or --bytes= ) option.
To see the first 100 bytes of Chaucer's Canterbury Tales, you'd use this:
$ head -c 100 Canterbury_Tales.txt
Here bygynneth the Book of the Tales of Caunterbury
General Prologue
Whan that Aprill, with his sh
100 bytes means 100 bytes, and if that means the display is cut off in the middle of a word, so be it.
To view the first 100KB of Chaucer's Canterbury Tales, you'd use this:
$ head -c 100k Canterbury_Tales.txt
Here bygynneth the Book of the Tales of Caunterbury
General Prologue
Whan that Aprill, with his shoures soote
The droghte of March hath perced to the roote
And bathed every veyne in swich licour,
Of which vertu engendred is the flour;
Whan Zephirus eek with his sweete breeth
Inspired hath in every holt and heeth
And to view the first 100MB of Chaucer's Canterbury Tales, you'd use...no, that would take up the rest
of the book! But if you wanted to do it, try this:
$ head -c 100m Canterbury_Tales.txt
The m here means 1048576 bytes, or 1024 x 1024.
View the Last 10 Lines of a File
tail
The head command allows you to view the first 10 lines of a file, and in typical whimsical Unix fashion,
the tail command allows you to view the last 10 lines of a file. From head to tail, get it?
$ tail Paradise_Lost.txt
To the subjected plain - then disappeared
They, looking back, all the eastern side beheld
Of Paradise, so late their happy seat,
Waved over by that flaming brand; the gate
With dreadful faces thronged and fiery arms.
Some natural tears they dropped, but wiped them soon;
The world was all before them, where to choose
Their place of rest, and Providence their guide.
They, hand in hand, with wandering steps and slow,
Through Eden took their solitary way.
Why use tail? Most often, to view the end of a log file to see what's going on with an application or
your system. Of course, there's an important option you want to use in that case, as you'll learn in the
upcoming "View the Constantly Updated Last Lines of a File or Files" section.
View the Last 10 Lines of Several Files
tail file1 file2
You can view the first 10 lines of several files at one time using head; unsurprisingly, you can do the
same thing with tail.
$ tail Paradise_Lost.txt Miller's_Tale.txt
==> Paradise_Lost.txt <==
To the subjected plain - then disappeared
They, looking back, all the eastern side beheld
Of Paradise, so late their happy seat,
Waved over by that flaming brand; the gate
With dreadful faces thronged and fiery arms.
Some natural tears they dropped, but wiped them soon;
The world was all before them, where to choose
Their place of rest, and Providence their guide.
They, hand in hand, with wandering steps and slow,
Through Eden took their solitary way.
==> Miller's_Tale.txt <==
With othes grete he was so sworn adoun
That he was holde wood in al the toun;
For every clerk anonright heeld with oother.
They seyde, "The man is wood, my leeve brother";
And every wight gan laughen at this stryf.
Thus swyved was this carpenteris wyf,
For al his kepyng and his jalousye;
And Absolon hath kist hir nether ye;
And Nicholas is scalded in the towte.
This tale is doon, and God save al the rowte!
Also like head, tail includes a header/separator between the entries, which is helpful.
View the Last Several Lines of a File or Files
tail -n
Continuing the similarities between head and tail, you can specify the number of a file's lines you want
to see instead of accepting the default of 10 by utilizing the -n (or --lines= ) option. Want to see more
than one file? Just add it to your command.
$ tail -n 4 Paradise_Lost.txt Miller's_Tale.txt
==> Paradise_Lost.txt <==
The world was all before them, where to choose
Their place of rest, and Providence their guide.
They, hand in hand, with wandering steps and slow,
Through Eden took their solitary way.
==> Miller's_Tale.txt <==
For al his kepyng and his jalousye;
And Absolon hath kist hir nether ye;
And Nicholas is scalded in the towte.
This tale is doon, and God save al the rowte!
Do you have more than one log file that you want to view? Then this is a useful command. But it's still
not perfect. For the perfect command, read the next section.
View the Constantly Updated Last Lines of a File or Files
tail -f
tail -f --pid=PID (terminates after process PID dies)
The great thing about log files is that they constantly change as things happen on your system. The
tail command shows you a snapshot of a file, and then deposits you back on the command line. Want
to see the log file again? Then run tail again...and again...and again. Blech!
With the -f (or --follow ) option, tail doesn't close. Instead, it shows you the last 10 lines of the file
(or a different number if you add -n to the mix) as the file changes, giving you a way to watch all the
changes to a log file as they happen. This is wonderfully useful if you're trying to figure out just what is
happening to a system or program.
For instance, a web server's logs might look like this:
Note
In order to save space, I've removed the IP address, date, and time of the access.
$ tail -f /var/log/httpd/d20srd_org_log_20051201
"GET /srd/skills/bluff.htm HTTP/1.1"...
"GET /srd/skills/senseMotive.htm HTTP/1.1"...
"GET /srd/skills/concentration.htm HTTP/1.1"...
"GET /srd/classes/monk.htm HTTP/1.1"...
"GET /srd/skills/escapeArtist.htm HTTP/1.1"...
It's hard to represent in a book, but this file doesn't close. Instead, tail keeps it open, and makes sure
that any new changes are shown. The file continues to scroll up, apparently forever or until you press
Ctrl+c, which cancels the command and deposits you back on the command line.
Try it with one of your log files, such as /var/log/syslog. Add in the -n option to see only a certain
number of lines to begin with, and then try it with two files, such as /var/log/syslog and
/var/log/daemon.log, and see what happens (hint: It's remarkably similar to what you saw in "View the
First 10 Lines of Several Files," but with constant updates).
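As for the --pid option mentioned at the start of this section: pair it with -f and tail follows the file
only as long as the process with that PID is alive, exiting on its own when the process dies. That's
handy if you want to watch a log just for the duration of one program's run. A sketch (the PID 8272 is
just an example; use ps to find a real one):
$ tail -f --pid=8272 /var/log/syslog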
Conclusion
You've looked at four commands in this chapter: cat , less, head, and tail. They all show you text files
in read-only mode, but they do so in different ways. The cat command shows you the whole file all at
once, while less allows you to page through a file one screen at a time. The head and tail commands
are two sides of the same coin (or, rather, the same file), as the former enables you to view the beginning
of a file and the latter displays the end. Together, these four commands make it easy to view just about
any part of a text file that you might need to see.
Chapter 6. Printing and Managing Print Jobs
Over the years, Linux has had several printing systems, including the venerable Line Printer Daemon
(LPD) and LPR Next Generation (LPRng) that are still found in vestigial form on modern Linux
distributions. In the past few years, however, most distributions have settled on the Common Unix
Printing System (CUPS) as their backend of choice. CUPS is well supported, easy to use, modern, and a
perfect drop-in replacement for LPD and LPRng. The same commands used with LPD and LPRng still
work, but now they call functions in CUPS.
This chapter focuses on CUPS because it is the printing system with which most Linux users work. This
chapter does not cover how to set up and configure a printer. Most distributions now provide easy-to-use GUI configuration tools to do just that, so you're going to focus on actually querying and using the
printer via the command line.
Note
Linux Journal's "Overview of Linux Printing Systems," available at
www.linuxjournal.com/article/6729, provides an excellent look at the various options Linux
users have today, with special focus on the current favorite, CUPS. For more on CUPS, see
Linux Journal's "The CUPS Printing System" at www.linuxjournal.com/article/8618, a very
good look at this essential technology. The best place to go for information about CUPS is,
unsurprisingly, the CUPS Software Users Manual, which you can find at www.cups.org/doc-1.1/sum.html. It's long and sometimes obtuse, but full of valuable advice and help, and it's an
essential resource.
List All Available Printers
lpstat -p
Before you can begin working with your printers, you need to know what makes up "your printers." To
find the printers configured on your system, use the lpstat command (short for line printer status)
along with the -p option.
$ lpstat -p
printer bro is idle. enabled since Jan 01 00:00
printer bro_wk is idle. enabled since Jan 01 00:00
printer wu is idle. enabled since Jan 01 00:00
As you can see, this system has three printers (bro, bro_wk, and wu) and none of them are
printing anything at the moment.
Determine Your Default Printer
lpstat -d
You know all of your printers, thanks to running lpstat -p, but which one is the default? As you're
going to see soon, you can send a print job to a specific printer, or you can quickly send it to your
default printer. To find out which printer is the default, use the lpstat command with the -d (for
default) option.
$ lpstat -d
system default destination: bro
Of course, if you only have one printer connected to your computer, you probably don't need to run this
command. But for laptop users who move around to different locations and who print to different
printers, this command is essential.
Find Out How Your Printers Are Connected
lpstat -s
Laptop users find this next command particularly helpful, as it tells them how they access the printers
available to them. When you first set up a printer, you must specify how you connect to it. You have
several choices:
Local (parallel, serial, or USB)
Remote LPD queue
SMB shared printer (Windows)
Network printer (TCP)
Remote CUPS server (IPP/HTTP)
Network printer with IPP (IPP/HTTP)
To find out what printers are configured for your computer and how you connect to those printers, use
lpstat with the -s option.
$ lpstat -s
system default destination: bro
device for bro: socket://192.168.0.160:9100
device for bro_wk: socket://192.168.1.10:9100
device for wu: socket://128.252.93.10:9100
In this case, every printer is a network printer, so it uses socket://, followed by the printer's IP address
and its port (9100 is standard for most networked printers, although you might see port 35 used as
well). That's pretty easy, but it can quickly get much more complicated.
Although CUPS is user-friendly in many areas, it's famously obtuse when it comes to the Uniform
Resource Identifiers (URIs) used to indicate the locations of printers vis-à-vis your Linux box. Table 6.1
lists each connection method and the type of URI you might see, which should help you understand the
list of printers and URIs you see when you run lpstat -s.
Note
Assume that the printer in the following examples is named bro and located on the network at
192.168.0.160. That isn't relevant in every situation, of course. If the printer is connected via
a parallel cable, its IP address doesn't matter.
Table 6.1. Printer Connections and CUPS URIs

Connection Method                     Sample URI (Printer bro at 192.168.0.160)
Parallel                              parallel:/dev/lp0
Serial                                serial:/dev/ttyS1?baud=115200
USB                                   usb:/dev/usb/lp0
Remote LPD queue                      lpd://192.168.0.160/LPT1
SMB shared printer (Windows)          smb://username:password@192.168.0.160/bro
Network printer (TCP)                 socket://192.168.0.160:9100
Remote CUPS server (IPP/HTTP)         ipp://192.168.0.160:631/printers/bro or
                                      http://192.168.0.160/printers/bro
Network printer with IPP (IPP/HTTP)   ipp://192.168.0.160:631/printers/bro or
                                      http://192.168.0.160/printers/bro
Thanks to the rise of network printing in the past several years, it's getting simpler to connect to
printers via socket, ipp , or http. Even so, you're still going to run into legacy printers that require the
older, more complicated connection methods, so it's good to familiarize yourself with them.
Tip
A bonus is that lpstat -s in essence duplicates the functionality of lpstat -p -d, as it lists all
printers known by your system, as well as the default printer. If you want to know all that
information quickly, this is a good command to use.
Get All the Information About Your Printers at Once
lpstat -t
Now for the mighty command. The nice thing about using lpstat -p, lpstat -d, or lpstat -s is that
you only get the precise bit of information you want. If you want everything all at once, however, use
lpstat with the -t option, which dumps everything lpstat knows about your printers onto your shell.
$ lpstat -t
scheduler is running
system default destination: bro
device for bro: socket://192.168.0.160:9100
device for bro_wk: socket://192.168.1.10:9100
device for wu: socket://128.252.93.10:9100
bro accepting requests since Jan 01 00:00
bro_wk accepting requests since Jan 01 00:00
wu accepting requests since Jan 01 00:00
printer bro is idle. enabled since Jan 01 00:00
printer bro_wk is idle. enabled since Jan 01 00:00
printer wu is idle. enabled since Jan 01 00:00
You get it all: your default printer, a list of all printers known to your system, the connection methods
and locations of those printers, and the status of all printers. The more printers you have configured on
your computer, the longer this listing. For some of you, it will be overwhelming, so you might find that
lpstat with one of the other options you've examined is a better choice.
Print Files to the Default Printer
lpr
Now that you know what printers are on your system, it's time to actually use them to print something.
Printing to your default printer (determined with lpstat -d) is easy.
$ lpr Lovecraft_-_Call_of_Cthulhu.txt
That's it: Just lpr and the name of the text file. Pretty simple.
Note
You probably expect that you can print ASCII text files on the command line using CUPS, but
it might surprise you to learn that you can also print PDFs or PostScript files. But that's it;
don't try to print Word, OpenOffice.org, or any other kind of non-text-based or non-PostScript-based documents, or your printer will spew out pages of garbage!
Print Files to Any Printer
lpr -P
Printing to the default printer, as shown in the previous section, is pretty easy. If you have more than
one printer, however, and you want to print to one that is not the default, simply use the -P option,
followed by the name of the printer.
$ lpr -P bro_wk Lovecraft_-_Call_of_Cthulhu.txt
If you don't know the names of your printers, use lpstat -p, as discussed previously in the "List All
Available Printers" section.
Note
You'll notice that the filename has underscores instead of spaces, which makes it much easier
to deal with on the command line. If the filename has spaces in it, you have to use one of the
following methods for referencing it:
$ lpr -P bro_wk "Lovecraft - Call of Cthulhu.txt"
or
$ lpr -P bro_wk Lovecraft\ -\ Call\ of\ Cthulhu.txt
If the filename does have spaces, it's probably easiest to use tab completion so you don't have
to type out the name yourself. For instance, you'd enter lpr -P bro_wk Love and then press
the Tab key, allowing bash to finish the filename. For more on tab completion, see
www.slackbook.org/html/shell-bash.html#SHELL-BASH-TAB.
Print More Than One Copy of a File
lpr -#
If you want to print more than one copy of a document, use the -# (the pound sign) option, followed by
the number of copies you want:
$ lpr -# 2 -P bro Lovecraft_-_Call_of_Cthulhu.txt
The number can range from 1 to 100. Want more than 100? Repeat the command. Or write a script. Or
hire a professional printer!
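If you do decide to script your way past the 100-copy limit, a simple bash loop is one way to go. Here's
a minimal sketch that sends three batches of 100 copies each (300 in all) to the bro printer:
$ for i in 1 2 3; do lpr -# 100 -P bro Lovecraft_-_Call_of_Cthulhu.txt; done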
List Print Jobs
lpq
If you have several jobs queued up to print, you might want to see that list. Perhaps you want to cancel
one or more of the jobs (more on how to do that ahead, in the next few sections), you want to find out
why a job is taking so long to print, or maybe you just want to know how many print jobs are in line.
The lpq command (as in "lp queue") lists all jobs currently printing on the default printer.
$ lpq
bro is ready and printing
Rank    Owner  Job  File(s)                Total Size
active  scott  489  Lovecraft_-_Call_of_C  108544 bytes
If you want to find out the status of print queues for all your printers, not just your default, simply
append -a (as in all) after lpq :
$ lpq -a
Rank    Owner  Job  File(s)                Total Size
active  scott  489  Lovecraft_-_Call_of_C  108544 bytes
1st     scott  490  ERB_-_A Princess_of_M  524288 bytes
Keep in mind two things. First, the listing provided by lpq -a is cut off, so while the filenames are
actually Lovecraft_-_Call_of_Cthulhu.txt and ERB_-_A Princess_of_Mars.txt , you don't see the entire
name because lpq only shows a certain number of characters.
Second, and this is very important, lpq is not showing all the print jobs the printer knows about, only
those that your machine knows about. From the printer's perspective, the print queue might actually
look like this:
Lovecraft_-_Call_of_Cthulhu.txt
Doyle_-_The_Lost_World.txt
ERB_-_A Princess_of_Mars.txt
To find out the actual queue on the printer, you have to use whatever management utilities come with
that device, which is far beyond the scope of this book.
List Print Jobs by Printer
lpstat
The lpq command shows you the files that are queued to print, but it doesn't tell you to which printer
the files are being sent. To learn that, you need to use the lpstat command, covered earlier in "List All
Available Printers." This time, however, simply use lpstat without any options at all:
$ lpstat
bro-489     rsgranne  108544  Tue 10 Dec 2005
bro_wk-490  rsgranne  524288  Tue 10 Dec 2005
The result is a list of print jobs, with the name of each printer handling the job at the beginning of each
line. Did you accidentally send a job to a printer to which you're not currently connected? Find out with
lpstat, and then delete the job, as you'll learn in the next section.
Cancel the Current Print Job Sent to the Default Printer
lprm
Do you want to cancel the current print job being sent to your default printer? Just use lprm (as in "lp
remove"):
$ lprm
Be sure to issue this command with haste. Many printers now are so fast and have so much memory
that the job might have already left your machine and be running on the printer. In that case, find the
Cancel button on your printer and press it quickly!
Cancel a Print Job Sent to Any Printer
lprm job ID
In the previous section, you learned how to cancel the current print job going to the default printer. But
what if the job is queued and won't start for another few minutes? Or what if the job is heading for a
printer that's not the default? In those cases, you still want to use lprm, but you want to tell the
command which job to cancel by referencing the job's ID number.
Look back at the examples in "List Print Jobs." The third column is labeled "Job" and has a number in it.
Now look back at "List Print Jobs by Printer." After the name of each printer is a hyphen and then a
number (the same number seen in "List Print Jobs"). That number is the job ID number. Specify that
number to lprm, and that exact job will be removed.
$ lpstat
bro-489     rsgranne  108544  Tue 10 Dec 2005
bro_wk-490  rsgranne  524288  Tue 10 Dec 2005
$ lprm 490
$ lpstat
bro-489     rsgranne  108544  Tue 10 Dec 2005
You can really save yourself from a needless waste of paper, toner, and ink using this particular
command. Next time you realize that you just sent a 500-page document full of large pictures to your
printer and stop it using lprm and a job ID number, send an email my way and thank me.
Cancel All Print Jobs
lprm -
What if you have several print jobs you want to cancel? You could delete them all by specifying all of
their job ID numbers, like this:
$ lprm 489 490 491 492 493
That's too much typing, however. If you want to get rid of all print jobs you've queued on every printer,
just follow lprm with a hyphen:
$ lprm -
It's quick, it's painless, and it's lazy in the best Linux tradition. Nothing to be ashamed of there!
Conclusion
Printing is a central activity people perform on their computers, so it's important to learn how to print
effectively. Part of printing is knowing how to query printers and jobs as well as send those jobs to
printers in the first place. Of course, not all jobs need to be printed, so it's important to know how to
cancel unnecessary jobs as well. This chapter covers the major tasks you'll need to work with your
printers from within Linux, but let me leave you with a final bit of advice: Automatic duplex printers that
work with Linux have now fallen in price drastically, and they'll help you save a small fortune in paper.
Check one out when you're next in the market for a printer. You'll be glad you did.
Chapter 7. Ownerships and Permissions
From the beginning, Linux was designed to be a multiuser system (unlike Windows, which was designed
as a single-user OS, the source of many of its security problems even to this day). This meant that
different users would be on the system, creating files, deleting directories, and reading various items.
To keep everyone from stepping on each other's toes and damaging the underlying operating system
itself, a system of permissions was created early on. Mastering the art of Linux permissions will aid you
as you use your Linux box, whether it's a workstation used by one person or a server accessed by
hundreds. The tools are simple, but the power they give you is complex. Time to jump in!
Change the Group Owning Files and Directories
chgrp
By default on virtually every Linux system, when you create a new file (or directory), you are that file's
owner and group. For instance, let's say you're going to write a new script that can be run on your
system.
Note
To save space, I've replaced information you'd normally see with ls -l with an ellipsis.
$ touch new_script.sh
$ ls -l
-rw-r--r-- 1 scott scott ... new_script.sh
Note
The user name and group name happen to both be "scott" on this machine, but that's not
necessarily the case on every machine. When a file is created, the user's UID (her User ID
number) becomes the owner of the file, while the user's GID (her Group ID number) becomes
the group for the file.
But what if you are part of an admins group on your machine and you want your script to be available to
other members of that group so they can run it? In that case, you need to change the file's group from
scott to admins using the chgrp command.
$ chgrp admins new_script.sh
$ ls -l
-rw-r--r-- 1 scott admins ... new_script.sh
Note
Yes, the script still won't run because it's not executable. That process is covered later in this
chapter with chmod.
You should know a couple of things about chgrp. When you run chgrp, you can use a group's name or
the group's numeric ID. How do you find the number associated with a group? The easiest way is just to
use cat on /etc/group, the file that keeps track of groups on your machine, and then take a look at the
results.
$ cat /etc/group
bind:x:118:
scott:x:1001:
admins:x:1002:scott,alice,bob
[list truncated for length]
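If you just want to know which groups you belong to, without wading through all of /etc/group, the
groups command prints them (your results will differ, of course):
$ groups
scott admins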
The other point to make about chgrp involves security: You can only change a file's group to a group to
which you belong. In other words, Scott, Alice, or Bob can use chgrp to make admins the
group for a file or directory, but Carol cannot because she's not a member of admins.
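Given the /etc/group listing shown previously, here's a sketch that uses the numeric ID instead of the
name; 1002 is the GID listed for admins on this particular machine:
$ chgrp 1002 new_script.sh
$ ls -l
-rw-r--r-- 1 scott admins ... new_script.sh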
Recursively Change the Group Owning a Directory
chgrp -R
Of course, you might not want to change the group of just one file or directory. If you want to change
the groups of several files in a directory, you can use a wildcard. If you want to change the contents of
a directory and everything below it, use the -R (or --recursive) option.
$ pwd
/home/scott/pictures/libby
$ ls -F
by_pool/  libby_arrowrock.jpg  libby.jpg  on_floor/
$ ls -lF *
-rw-r--r-- 1 scott scott ... libby_arrowrock.jpg
-rw-r--r-- 1 scott scott ... libby.jpg

by_pool/:
-rw-r--r-- 1 scott scott ... libby_by_pool_02.jpg
drwxr-xr-x 2 scott scott ... lieberman_pool

on_floor/:
-rw-r--r-- 1 scott scott ... libby_on_floor_01.jpg
-rw-r--r-- 1 scott scott ... libby_on_floor_02.jpg
$ chgrp -R family *
$ ls -l *
-rw-r--r-- 1 scott family ... libby_arrowrock.jpg
-rw-r--r-- 1 scott family ... libby.jpg

by_pool:
-rw-r--r-- 1 scott family ... libby_by_pool_02.jpg
drwxr-xr-x 2 scott family ... lieberman_pool

on_floor:
-rw-r--r-- 1 scott family ... libby_on_floor_01.jpg
-rw-r--r-- 1 scott family ... libby_on_floor_02.jpg
Caution
If you used chgrp -R family *, you wouldn't change any of the dot files in the
/home/scott/pictures/libby directory. However, chgrp -R family .* should not be used. It
changes all the dot files in the current directory, but .* also matches .., so all the files in the
parent directory are also changed, which is probably not what you want!
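If you do need to change hidden files too, one common workaround is a glob that matches dot files
without matching . or .. themselves. A sketch (note that it misses names that begin with two dots):
$ chgrp -R family .[^.]*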
Keep Track of Changes Made to a File's Group with chgrp
chgrp -v
chgrp -c
You've probably noticed that chgrp, like all well-behaved Linux applications, only gives you feedback if
there's a problem. If a program works and does its job correctly, it doesn't bother you with "Hey! I did
it, and it worked!" Instead, Linux command-line applications are only noisy when an issue needs to be
resolved.
If you want to know what chgrp is doing while it's doing it, first try the -v (or --verbose) option. This
tells you, every step of the way, just what tasks chgrp performs.
$ ls -lF
drwxr-xr-x 4 scott scott ... by_pool/
-rw-r--r-- 1 scott scott ... libby_arrowrock.jpg
-rw-r--r-- 1 scott family ... libby.jpg
-rw-r--r-- 1 scott scott ... libby_on_couch.jpg
drwxr-xr-x 2 scott scott ... on_floor/
$ chgrp -v family *
changed group of 'by_pool' to family
changed group of 'libby_arrowrock.jpg' to family
group of 'libby.jpg' retained as family
changed group of 'libby_on_couch.jpg' to family
changed group of 'on_floor' to family
$ ls -lF
drwxr-xr-x 4 scott family ... by_pool/
-rw-r--r-- 1 scott family ... libby_arrowrock.jpg
-rw-r--r-- 1 scott family ... libby.jpg
-rw-r--r-- 1 scott family ... libby_on_couch.jpg
drwxr-xr-x 2 scott family ... on_floor/
Notice what happened in this example. The libby.jpg file was already a member of the family group,
but -v went ahead and reported on it anyway, making sure that you knew that libby.jpg's group was
kept as family. If you're changing groups, you probably don't need to know about items that already
belong to the new group. In cases like that, you want to use the -c (or --changes) option, which
(unsurprisingly) only reports changes made.
$ ls -lF
drwxr-xr-x 4 scott scott  ... by_pool/
-rw-r--r-- 1 scott scott  ... libby_arrowrock.jpg
-rw-r--r-- 1 scott family ... libby.jpg
-rw-r--r-- 1 scott scott  ... libby_on_couch.jpg
drwxr-xr-x 2 scott scott  ... on_floor/
$ chgrp -c family *
changed group of 'by_pool' to family
changed group of 'libby_arrowrock.jpg' to family
changed group of 'libby_on_couch.jpg' to family
changed group of 'on_floor' to family
$ ls -lF
drwxr-xr-x 4 scott family ... by_pool/
-rw-r--r-- 1 scott family ... libby_arrowrock.jpg
-rw-r--r-- 1 scott family ... libby.jpg
-rw-r--r-- 1 scott family ... libby_on_couch.jpg
drwxr-xr-x 2 scott family ... on_floor/
This time, nothing was reported to you about libby.jpg because it was already in the family group. So
if you want a full report, even for files that won't be changed, use -v; for a more concise listing, use -c.
Change the Owner of Files and Directories
chown
Changing a file's group is important, but it's far more likely that you'll change owners. To change
groups, use chgrp; to change owners, use chown.
$ ls -l
-rw-r--r-- 1 scott scott ... libby_arrowrock.jpg
-rw-r--r-- 1 scott family ... libby.jpg
-rw-r--r-- 1 scott scott ... libby_on_couch.jpg
$ chown denise libby.jpg
$ ls -l
-rw-r--r-- 1 scott scott ... libby_arrowrock.jpg
-rw-r--r-- 1 denise family ... libby.jpg
-rw-r--r-- 1 scott scott ... libby_on_couch.jpg
Some of the points previously made about chgrp in "Change the Group Owning Files and Directories"
apply to chown as well. The chown command accepts either a user's name or her numeric ID. The numeric
ID for users can be seen by running cat /etc/passwd, which gives you something like this:
bind:x:110:118::/var/cache/bind:/bin/false
scott:x:1001:1001:Scott,,,:/home/scott:/bin/bash
ntop:x:120:120::/var/lib/ntop:/bin/false
The first number you see is the numeric ID for that user (the second number is the numeric ID for the
main group associated with the user).
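That means you can hand a file to a user by UID just as easily as by name; a quick sketch using the
IDs shown above:
$ chown 1001 libby.jpg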
Also, on Linux changing a file's owner is a privileged operation: Only root can do it, and even a file's
current owner can't give her files away to another user. Keep that in mind when trying the examples
in this section.
Caution
If you use chown -R scott *, you do not change any of the dot files in the directory. However,
chown -R scott .* should not be used. It changes all the dot files in the current directory, but
.* also matches .., so all the files in the parent directory are also changed, which is probably
not what you want.
Change the Owner and Group of Files and Directories
chown owner:group
You've seen that you can use chgrp to change groups and chown to change owners, but it's possible to
use chown to kill two birds with one stone. After chown, specify the user and then the group, separated
by a colon, and finally the file or directory (this is one reason why you should avoid using colons in user
or group names).
$ ls -l
-rw-r--r-- 1 scott scott ... libby.jpg
$ chown denise:family libby.jpg
$ ls -l
-rw-r--r-- 1 denise family ... libby.jpg
You can even use chown to change only a group by leaving off the user in front of the colon.
$ ls -l
-rw-r--r-- 1 scott scott ... libby.jpg
$ chown :family libby.jpg
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
Tip
What if a user or group does have a colon in its name? Just type a backslash in front of that
colon, which "escapes" the character and tells the system that it's just a colon, and not a
separator between a user and group name:
$ chown denise:family\:parents libby.jpg
This works, but it's better to disallow colons in user and group names in the first place.
Because chown does everything chgrp does, there's very little reason to use chgrp, unless you feel like it.
Note
When you separate the user and group, you can actually use either a . or : character. New
recommendations are to stick with the :, however, because the . is deprecated.
Understand the Basics of Permissions
Before moving on to the chmod command, which allows you to change the permissions associated with a
file or directory, let's review how Linux understands those permissions.
Note
Linux systems are beginning to use a more granular and powerful permission system known
as Access Control Lists (ACLs). At this time, however, ACLs are still not widely used, so they're
not covered here. For more info about ACLs, see "Access Control Lists" at Linux Magazine
(www.linux-mag.com/2004-11/guru_01.html) and "An ACL GUI for Linux" at The Open Source
Weblog (http://opensource.weblogsinc.com/2005/12/06/an-acl-gui-for-linux/).
Linux understands that three sets of users can work with a file or directory: the actual owner (also
known as the file's user), a group, and everyone else on the system. Each of these sets is represented
by a different letter, as shown in Table 7.1.
Table 7.1. Users and Their Abbreviations

User Group    Abbreviation
User (owner)  u
Group         g
Others        o
In the "List Permissions, Ownership, and More" section in Chapter 2, "The Basics," you learned about
long permissions, which indicate what users can do with files and directories. In that section, you looked
at three attributes: read, write, and execute, represented by r, w, and x, respectively. Additional
possibilities are suid, sgid, and the sticky bit, represented by s (or S), s (or S), and t (or T); you'll
see later in this chapter what the capitalization means. Keep in mind, however, that all of these can
have different meanings depending on whether the
item with the attribute is a file or a directory. Table 7.2 summarizes each attribute, its abbreviation,
and what it means.
Table 7.2. Permission Letters and Their Meanings

File Attribute  Abbreviation  Meaning for File                  Meaning for Directory
Readable        r             Can view.                         Can list with ls.
Writable        w             Can edit.                         Can delete, rename, or add files.
Executable      x             Can run as program.               Can access to read files and
                                                                subdirectories or to run files.
suid            s             Any user can execute the file     Not applicable.
                              with owner's permissions.
sgid            s             Any user can execute the file     All newly created files in a
                              with group's permissions.         directory belong to the group
                                                                owning the directory.
Sticky bit      t             Tells OS that the file will be    User cannot delete or rename
                              frequently executed, so it's      files, unless he is the file's or
                              constantly kept in swap space     the containing directory's owner.
                              for fast access (only for older
                              Unix systems; Linux ignores it).
Note
The root user can always do anything to any file or directory, so the previous table doesn't
apply to root.
Each of these file attributes is covered in more detail in the following sections. Now that you understand
the basics, let's look at using the chmod command to change the permissions of files and directories.
Change Permissions on Files and Directories Using
Alphabetic Notation
chmod [ugo][+-=][rwx]
You can use two notations with chmod: alphabetic or numeric. Both have their advantages, but it's
sometimes easier for users to learn the alphabetic system first. Basically, the alphabetic method uses a
simple formula: the user group you want to affect (u, g, o); followed by a plus sign (+) to grant
permission, a minus sign (-) to remove permission, or an equal sign (=) to set exact permission;
followed by the letters (r, w, x, s, t) representing the permission you want to alter. For instance, let's
say you want to allow members of the family group to be able to change a picture.
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
$ chmod g+w libby.jpg
$ ls -l
-rw-rw-r-- 1 scott family ... libby.jpg
Easy enough. But what if you had wanted to give both the family group and all other users write
permission on the file?
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
$ chmod go+w libby.jpg
$ ls -l
-rw-rw-rw- 1 scott family ... libby.jpg
Of course, because you're really giving all users (the owner, the group, and the world) read and write
access, you could have just done it like this:
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
$ chmod a=rw libby.jpg
$ ls -l
-rw-rw-rw- 1 scott family ... libby.jpg
You realize you made a mistake, and need to remove the capability of the family group and the world
to alter that picture, and also ensure that the world can't even see the picture.
$ ls -l
-rw-rw-rw- 1 scott family ... libby.jpg
$ chmod go-w libby.jpg
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
$ chmod o-r libby.jpg
$ ls -l
-rw-r----- 1 scott family ... libby.jpg
Instead of the -, you could have used the =:
$ ls -l
-rw-rw-rw- 1 scott family ... libby.jpg
$ chmod g=r libby.jpg
$ ls -l
-rw-r--rw- 1 scott family ... libby.jpg
$ chmod o= libby.jpg
$ ls -l
-rw-r----- 1 scott family ... libby.jpg
Notice on the last chmod that o equaled nothing, effectively removing all permissions for all other users
on the system. Now that's fast and efficient!
The advantage to the alphabetic system is that it's often fast, but you can also see its main
disadvantage in the last example: If you want to make changes to two or more user groups and those
changes are different for each user group, you end up running chmod at least two times. The next
section shows how numeric permissions get around that problem.
Change Permissions on Files and Directories Using
Numeric Permissions
chmod [0-7][0-7][0-7]
Numeric permissions (also known as octal permissions) are built around the binary numeric system.
We're going to skip the complicated reasons why the permissions have certain numbers and focus on
the end meaning: read (r) has a value of 4, write (w) is 2, and execute (x) is 1. Remember that Linux
permissions recognize three user groups (the owner, the group, and the world), and each user group can
read, write, and execute. (See Table 7.3.)
Table 7.3. Permissions and Numeric Representations

                        Owner    Group    World
Permissions             r; w; x  r; w; x  r; w; x
Numeric representation  4; 2; 1  4; 2; 1  4; 2; 1
Permissions under this schema become a matter of simple addition. Here are a few examples:
A user has read and write permissions for a file or directory. Read is 4, write is 2, and execute is 0
(because it's not granted). 4 + 2 + 0 = 6.
A user has read and execute permissions for a file. Read is 4, write is 0 (because that permission
hasn't been granted), and execute is 1. 4 + 0 + 1 = 5.
A user has read, write, and execute permissions for a directory. Read is 4, write is 2, and execute
is 1. 4 + 2 + 1 = 7.
Under this method, the most a user group can have is 7 (read, write, and execute), and the least is 0
(cannot read, write, or execute). Because there are three user groups, you have three numbers, each
between 0 and 7, and each representing what permissions that user group has associated with it. Table
7.4 shows the possible numbers and what they mean.
Table 7.4. Numeric Permissions Represented with ls -l

Number  ls -l Representation
0       ---
1       --x
2       -w-
3       -wx
4       r--
5       r-x
6       rw-
7       rwx
Although a wide variety of permissions can be set, a few tend to reappear constantly. Table 7.5 shows
the common permissions and what they mean.
Table 7.5. Common Permissions Represented with ls -l

chmod Command  ls -l Representation  Meaning
chmod 400      -r--------            Owner can read; no one else can do anything.
chmod 644      -rw-r--r--            Everyone can read; only owner can edit.
chmod 660      -rw-rw----            Owner and group can read and edit; world can do nothing.
chmod 664      -rw-rw-r--            Everyone can read; owner and group can edit.
chmod 700      -rwx------            Owner can read, write, and execute; no one else can do anything.
chmod 744      -rwxr--r--            Everyone can read; only owner can edit and execute.
chmod 755      -rwxr-xr-x            Everyone can read and execute; only owner can edit.
chmod 777      -rwxrwxrwx            Everyone can read, edit, and execute (not usually a good idea).
Caution
Yes, you can set chmod 000 on a file or directory, but then no one can read or change it; only
root (or the file's owner, who can always run chmod on her own files) can restore sensible
permissions.
Octal permissions should now make a bit of sense. They require a bit more cogitation to understand
than alphabetic permissions, but they also allow you to set changes in one fell swoop. Let's revisit the
examples from "Change Permissions on Files and Directories Using Alphabetic Notation" but with
numbers instead.
Let's say you want to allow members of the family group to be able to change a picture.
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
$ chmod 664 libby.jpg
$ ls -l
-rw-rw-r-- 1 scott family ... libby.jpg
What if you had wanted to give both the family group and all other users write permission on the file?
$ ls -l
-rw-r--r-- 1 scott family ... libby.jpg
$ chmod 666 libby.jpg
$ ls -l
-rw-rw-rw- 1 scott family ... libby.jpg
You realize you made a mistake, and you need to remove the capability of the family group and the
world to alter that picture and also ensure that the world can't even see the picture.
$ ls -l
-rw-rw-rw- 1 scott family ... libby.jpg
$ chmod 640 libby.jpg
$ ls -l
-rw-r----- 1 scott family ... libby.jpg
This example shows the key advantage of this method. What took two steps with alphabetic
notation (first chmod go-w and then chmod o-r, or chmod g=r and then chmod o=) takes only a single
command with numeric notation. For this reason, you often see advanced Linux gurus use octal notation:
It's quicker and more precise.
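To be fair, the alphabetic notation can also combine several user groups in one command by separating
the clauses with commas, as in this sketch:
$ chmod g-w,o-rw libby.jpg
Still, once you've memorized the numbers, the octal form is usually faster to type.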
Change Permissions Recursively
chmod -R
You've probably noticed by now that many Linux commands allow you to apply them recursively to files
and directories, and chmod is no different. With the -R (or --recursive) option, you can change the
permissions of hundreds of file system objects in seconds; just be sure that's what you want to do.
$ pwd
/home/scott/pictures/libby
$ ls -lF
drwxrw---- 2 scott scott ... by_pool/
-rw-r--r-- 1 scott scott ... libby_arrowrock.jpg
-rw-r--r-- 1 scott scott ... libby.jpg
drwxrw---- 2 scott scott ... on_floor/
$ ls -l *
-rw-r--r-- 1 scott scott ... libby_arrowrock.jpg
-rw-r--r-- 1 scott scott ... libby.jpg

by_pool:
-rw-r--r-- 1 scott scott ... libby_by_pool_02.jpg
-rwxr-xr-x 2 scott scott ... lieberman_pool.jpg

on_floor:
-rw-r--r-- 1 scott scott ... libby_on_floor_01.jpg
-rw-r--r-- 1 scott scott ... libby_on_floor_02.jpg
$ chgrp -R family *
$ chmod -R 660 *
chmod: 'by_pool': Permission denied
chmod: 'on_floor': Permission denied
"Permission denied?" What happened? Take a look at Table 7.2. If a file is executable, it can be run as a
program, but a directory must be executable to allow users access inside it to read its files and
subdirectories. Running chmod -R 660 * removed the x permission from everything, files and directories
alike. Once chmod stripped the execute bit from the directories, it could no longer descend into them
to change the files inside, so it reported Permission denied.
So what should you do? There really isn't a simple answer. You could run chmod using a wildcard that
only affects files of a certain type, like this:
$ chmod -R 660 *.jpg
That would only affect images and not directories, so you wouldn't have any issues. If you have files of
more than one type, however, it can quickly grow tedious, as you'll have to run chmod once for every file
type.
If you have many subdirectories within subdirectories, or too many file types to deal with, you can be
really clever and use the find command to look for all files that are not directories and then change
their permissions. You'll learn more about that in Chapter 10, "The find Command."
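As a preview of Chapter 10, a sketch of that approach might look like this, assuming you want files
set to 660 and directories to 770:
$ find . -type f -exec chmod 660 {} \;
$ find . -type d -exec chmod 770 {} \;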
The big takeaway here: When changing permissions recursively, be careful. You might not get what you
expected and end up preventing access to files and subdirectories accidentally.
Set and Then Clear suid
chmod u[+-]s
In the "Understand the Basics of Permissions" section, you looked at several possible permissions.
You've focused on r, w, and x because those are the most common, but others can come in handy at
times. Let's take a look at suid, which only applies to executable files, never directories.
After suid is set on a program, a user who runs that file executes it with the owner's permissions, as
though he were the file's owner. You can see a common example of suid in action by looking
at the permissions for the passwd command, which allows users to set and change their passwords.
$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root ... /usr/bin/passwd
You can see that passwd is set as suid because it has an s where the user's x should be. The root user
owns passwd, but it's necessary that ordinary users be allowed to run the command, or they wouldn't be
able to change their passwords on their own. To make the passwd command executable for everyone, x
is set for the user, the group, and all other users on the system. That's not enough, however, because
changing a password means writing to files that only root may touch, such as /etc/shadow. The answer
is to set passwd as suid root, so anyone can run it with root's permissions for that command.
Note
You might see both an s and an S to indicate that suid is set. You see an s if the owner already
had execute permissions (x) before you set suid, and an S if the owner didn't have execute set
before suid was put in place. The end result is the same, but the capitalization tells you what
was in place originally.
You can set and unset suid in two ways: using the alphabet or using numbers. The alphabet method
would look like this:
$ pwd
/home/scott/bin
$ ls -l
-rwxr-xr-- 1 scott admins ... backup_data
$ chmod u+s backup_data
$ ls -l
-rwsr-xr-- 1 scott admins ... backup_data
Now anyone in the admins group can run the backup_data script as though they were the user scott.
But note that anyone not in the admins group is shut out, because the world has only read permission
for the program. If it were necessary for everyone on the system to be able to run backup_data as
scott, the permissions would be -rwsr-xr-x.
Removing suid is a matter of using u- instead of u+.
$ ls -l
-rwsr-xr-- 1 scott admins ... backup_data
$ chmod u-s backup_data
$ ls -l
-rwxr-xr-- 1 scott admins ... backup_data
Setting suid via octal permissions is a bit more complicated, only because it introduces a new facet to
the numeric permissions you've been using. You'll recall that numeric permissions use three digits, with
the first representing what is allowed for the owner, the second for the group, and the third for all other
users. It turns out that there's actually a fourth digit that appears to the left of the owner's number.
That digit is a 0 the vast majority of the time, however, so it's not necessary to display or use it. In
other words, chmod 644 libby.jpg and chmod 0644 libby.jpg are exactly the same thing. You only need
that fourth digit when you want to change suid (or sgid or the sticky bit, as you'll see in the following
sections).
The number for setting suid is 4, so you'd change backup_data using numbers like this:
$ pwd
/home/scott/bin
$ ls -l
-rwxr-xr-- 1 scott admins ... backup_data
$ chmod 4754 backup_data
$ ls -l
-rwsr-xr-- 1 scott admins ... backup_data
Removing suid is a matter of purposely invoking the 0 because that sets things back to the default
state, without suid in place.
$ ls -l
-rwsr-xr-- 1 scott admins ... backup_data
$ chmod 0754 backup_data
$ ls -l
-rwxr-xr-- 1 scott admins ... backup_data
Note
As an ordinary user, it's not very likely that you'll need to change programs to suid. Most
often it's associated with programs owned by root, but it's still good to know about it for that
every-so-often case in which you need to use it.
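Incidentally, if you're ever curious which programs on your system are suid root, the find command
(covered in Chapter 10) can track them down. A sketch (the redirection just hides errors about
unreadable directories):
$ find / -user root -perm -4000 -type f 2>/dev/null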
Set and Then Clear sgid
chmod g[+-]s
Closely related to suid is sgid. sgid can apply to both files and directories. For files, sgid is just like
suid, except that a user can now execute a file with the group's permissions instead of an owner's
permissions. For example, on your system the crontab command is probably set as sgid, so that users
can ask cron to run programs for them, but as the much more restricted crontab group rather than the
all-powerful root user.
$ ls -l /usr/bin/crontab
-rwxr-sr-x 1 root crontab ... /usr/bin/crontab
When applied to directories, sgid does something interesting: Any subsequent files created in that
directory belong to the group assigned to the directory. An example helps make this clearer.
Let's say you have three users (Alice, Bob, and Carol) who are all members of the admins group. Alice's
username is alice, and her primary group is also alice, an extremely common occurrence on most
Linux systems. Bob and Carol follow the same pattern, with their usernames and primary groups being,
respectively, bob and carol. If Alice creates a file in a directory shared by the admins group, the owner
and group for that file is alice, which means that the other members of the admins group are unable to
write to that file. Sure, Alice could run chgrp admins document (or chown :admins document) after she
creates a new file, but that quickly grows incredibly tedious.
If the shared directory is set to sgid, however, any new file created in that directory is still owned by
the user who created the file, but it's also automatically assigned to the directory's group, in this case,
admins. The result: Alice, Bob, and Carol can all read and edit any files created in that shared directory,
with a minimum of tedium.
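Setting up such a shared directory takes only a few commands. Here's a sketch using a hypothetical
shared_docs directory (the leading 2 in 2775 is the numeric way to set sgid, explained later in this
section):
$ mkdir shared_docs
$ chgrp admins shared_docs
$ chmod 2775 shared_docs
$ ls -lF
drwxrwsr-x 2 alice admins ... shared_docs/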
Unsurprisingly, you can set sgid with either letters or numbers. Using letters, sgid is just like suid,
except that a g instead of a u is used. Let's look at sgid applied to a directory, but keep in mind that the
same process is used on a file.
$ ls -lF
drwxr-xr-x 11 scott admins ... bin/
$ chmod g+s bin
$ ls -lF
drwxr-sr-x 11 scott admins ... bin/
Note
You might see both an s and an S to indicate that sgid is set. You see an s if the group already
had execute permissions (x) before you set sgid, and an S if the group didn't have execute set
before sgid was put in place. The end result is the same, but the capitalization tells you what
was in place originally.
Removing sgid is pretty much the opposite of adding it.
$ ls -lF
drwxr-sr-x 11 scott admins ... bin/
$ chmod g-s bin
$ ls -lF
drwxr-xr-x 11 scott admins ... bin/
If you haven't already read the previous section, "Set and Then Clear suid," go back and do so, as it
explains the otherwise mysterious fourth digit that appears just before the number representing the
owner's permissions. In the case of suid, that number is 4; for sgid, it's 2.
$ ls -lF
drwxr-xr-x 11 scott admins ... bin/
$ chmod 2755 bin
$ ls -lF
drwxr-sr-x 11 scott admins ... bin/
You remove sgid the same way you remove suid: with a 0 at the beginning, which takes sgid out of the
picture.
$ ls -lF
drwxr-sr-x 11 scott admins ... bin/
$ chmod 0755 bin
$ ls -lF
drwxr-xr-x 11 scott admins ... bin/
Note
You know what creating a new file in a sgid directory will do, but be aware that other file
system processes can also be affected by sgid. If you copy a file with cp into the sgid
directory, it acquires the group of that directory. If you move a file with mv into the sgid
directory, however, it keeps its current group ownership and does not acquire that of the
directory's group. Finally, if you create a new directory inside the sgid directory using mkdir, it
not only inherits the group that owns the sgid directory, but also becomes sgid itself.
Set and Then Clear the Sticky Bit
chmod [+-]t
Besides being a fun phrase that rolls off the tongue, what's the sticky bit? In the old days of Unix, if the
sticky bit was set for an executable file, the OS knew that the file was going to be run constantly, so it
was kept in swap space so it could be quickly and efficiently accessed. Linux is a more modern system,
so it ignores the sticky bit when it's set on files.
That means that on Linux the sticky bit matters only on directories. Normally, if a folder is writable
for a set of users, those users can delete and rename any files in that folder, no matter who owns
them. After the sticky bit is set on the folder, users cannot delete or rename files there unless they
own the file or own the directory itself. The most common place you'll see it is your /tmp directory,
which is world-writable by design; the individual files and folders within /tmp are protected from
other users by the sticky bit.
$ ls -l /
drwxrwxrwt 12 root root ... tmp
[Results truncated for length]
Note
You may see both a t and a T to indicate that the sticky bit is set. You see a t if the world
already had execute permissions (x) before you set the sticky bit, and a T if the world didn't
have execute set before the sticky bit was put in place. The end result is the same, but the
capitalization tells you what was in place originally.
Like so many other examples using chmod in this chapter, it's possible to set the sticky bit with either
letters or numbers.
$ ls -lF
drwxrwxr-x 2 scott family ... libby_pix/
$ chmod +t libby_pix
$ ls -lF
drwxrwxr-t 2 scott family ... libby_pix/
Two things might be a bit confusing here. First, although previous uses of the alphabetic method for
setting permissions required you to specify who was affected by typing in a u, g, or o, for instance,
that's not necessary with the sticky bit. A simple +t is all that is required.
Second, note that the t appears in the world's execute position, but even though the directory isn't
world-writable, it still allows members of the family group to write to the directory, while preventing
those members from deleting files unless they own them.
Removing the sticky bit is about as straightforward as you could hope.
$ ls -lF
drwxrwxr-t 2 scott family ... libby_pix/
$ chmod -t libby_pix
$ ls -lF
drwxrwxr-x 2 scott family ... libby_pix/
Setting the sticky bit using octal permissions involves the fourth digit already covered in "Set and
Then Clear suid" and "Set and Then Clear sgid." Where suid uses 4 and sgid uses 2, the sticky bit
uses 1 (see a pattern?).
$ ls -lF
drwxrwxr-x 2 scott family ... libby_pix/
$ chmod 1775 libby_pix
$ ls -lF
drwxrwxr-t 2 scott family ... libby_pix/
Once again, a 0 cancels out the sticky bit.
$ ls -lF
drwxrwxr-t 2 scott family ... libby_pix/
$ chmod 0775 libby_pix
$ ls -lF
drwxrwxr-x 2 scott family ... libby_pix/
The sticky bit isn't something you'll be using on many directories on your workstation, but on a server it
can be incredibly handy. Keep it in mind, and you'll find that it solves some otherwise thorny permission
problems.
Tip
In the interest of speeding up your time on the command line, it's possible to set combinations
of suid, sgid, and the sticky bit at the same time. In the same way that you add 4 (read), 2
(write), and 1 (execute) together to get the numeric permissions for users, you can do the
same for suid, sgid, and the sticky bit.
Number  Meaning
0       Removes sticky bit, sgid, and suid
1       Sets sticky bit
2       Sets sgid
3       Sets sticky bit and sgid
4       Sets suid
5       Sets sticky bit and suid
6       Sets sgid and suid
7       Sets sticky bit, sgid, and suid
Be sure to note that using a 0 removes suid, sgid, and the sticky bit all at the same time. If
you use 0 to remove suid but you still want the sticky bit set, you need to go back and reset
the sticky bit.
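For instance, to set both sgid and the sticky bit (2 + 1 = 3) on the libby_pix directory from the
earlier examples in a single command, a quick sketch:
$ chmod 3775 libby_pix
$ ls -lF
drwxrwsr-t 2 scott family ... libby_pix/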
Conclusion
Permissions are vitally important for the security and even sanity of a Linux system, and they can seem
overwhelming at first. With a bit of thought and learning, however, it's possible to get a good handle on
Linux permissions and use them to your advantage. The combination of chgrp to change group
ownership, chown to change user ownership (and group ownership as well), and the powerful chmod
gives a Linux user a wealth of tools at her disposal that enable her to set permissions in a powerful,
effective way.
Chapter 8. Archiving and Compression
Although the differences are sometimes blurred in casual conversation, there is in fact a real
difference between archiving files and compressing them. Archiving means that you take 10 files and
combine them into one file, with no appreciable difference in size. If you start with 10 100KB files and
archive them, the resulting single file is roughly 1000KB. On the other hand, if you compress those 10
files, you might find that the resulting files range from only a few kilobytes to close to the original
size of 100KB, depending upon the original file type.
Note
In fact, you might end up with a bigger file during compression! If the file is already
compressed, compressing it again adds extra overhead, resulting in a slightly bigger file.
All of the archive and compression formats in this chapter (zip, gzip, bzip2, and tar) are popular, but
zip is probably the world's most widely used format. That's because of its almost universal use on
Windows, but zip and unzip are well supported among all major (and most minor) operating systems, so
things compressed using zip also work on Linux and Mac OS. If you're sending archives out to users and
you don't know which operating systems they're using, zip is a safe choice to make.
gzip was designed as an open-source replacement for an older Unix program, compress . It's found on
virtually every Unix-based system in the world, including Linux and Mac OS X, but it is much less
common on Windows. If you're sending files back and forth to users of Unix-based machines, gzip is a
safe choice.
The bzip2 command is the new kid on the block. Designed to supersede gzip, bzip2 creates smaller
files, but at the cost of speed. That said, computers are so fast nowadays that most users won't notice
much of a difference between the times it takes gzip or bzip2 to compress a group of files.
Note
Linux Magazine published a good article comparing several different compression formats,
which you can find at http://www.linux-mag.com/content/view/1678/43/.
zip, gzip, and bzip2 are focused on compression (although zip also archives). The tar command does
one thing (archive) and it has been doing it for a long time. It's found almost solely on Unix-based
machines. You'll definitely run into tar files (also called tarballs) if you download source code, and
almost every Linux user can expect to encounter a tarball some time in his career.
Archive and Compress Files Using zip
zip
zip both archives and compresses files, thus making it great for sending multiple files as email
attachments, backing up items, or for saving disk space. Using it is simple. Let's say you want to send a
TIFF to someone via email. A TIFF image is uncompressed, so it tends to be pretty large. Zipping it up
should help make the email attachment a bit smaller.
Note
When using ls -l, I'm only showing the information needed for each example.
$ ls -lh
-rw-r--r-- scott scott 1006K young_edgar_scott.tif
$ zip grandpa.zip young_edgar_scott.tif
adding: young_edgar_scott.tif (deflated 19%)
$ ls -lh
-rw-r--r-- scott scott 1006K young_edgar_scott.tif
-rw-r--r-- scott scott  819K grandpa.zip
In this case, you shaved off about 200KB on the resulting zip file, or 19%, as zip helpfully informs you.
Not bad. You can do the same thing for several images.
$ ls -l
-rw-r--r-- scott scott  251980 edgar_intl_shoe.tif
-rw-r--r-- scott scott 1130922 edgar_baby.tif
-rw-r--r-- scott scott 1029224 young_edgar_scott.tif
$ zip grandpa.zip edgar_intl_shoe.tif edgar_baby.tif young_edgar_scott.tif
adding: edgar_intl_shoe.tif (deflated 4%)
adding: edgar_baby.tif (deflated 12%)
adding: young_edgar_scott.tif (deflated 19%)
$ ls -l
-rw-r--r-- scott scott  251980 edgar_intl_shoe.tif
-rw-r--r-- scott scott 1130922 edgar_baby.tif
-rw-r--r-- scott scott 2074296 grandpa.zip
-rw-r--r-- scott scott 1029224 young_edgar_scott.tif
It's not too polite, however, to zip up individual files this way. For three files, it's not so bad. The
recipient will unzip grandpa.zip and end up with three individual files. If the payload was 50 files,
however, the user would end up with files strewn everywhere. Better to zip up a directory containing
those 50 files so when the user unzips it, he's left with a tidy directory instead.
$ ls -lF
drwxr-xr-x scott scott edgar_scott/
$ zip -r grandpa.zip edgar_scott
adding: edgar_scott/ (stored 0%)
adding: edgar_scott/edgar_baby.tif (deflated 12%)
adding: edgar_scott/young_edgar_scott.tif (deflated 19%)
adding: edgar_scott/edgar_intl_shoe.tif (deflated 4%)
$ ls -lF
drwxr-xr-x scott scott     160 edgar_scott/
-rw-r--r-- scott scott 2074502 grandpa.zip
Whether you're zipping up a file, several files, or a directory, the pattern is the same: the zip
command, followed by the name of the Zip file you're creating, and finished with the item(s) you're
adding to the Zip file (with -r thrown in when one of those items is a directory whose contents you
want included).
Get the Best Compression Possible with zip
-[0-9]
It's possible to adjust the level of compression that zip uses when it does its job. The zip command
uses a scale from 0 to 9, in which 0 means "no compression at all" (which is like tar , as you'll see later),
1 means "do the job quickly, but don't bother compressing very much," and 9 means "compress the
heck out of the files, and I don't mind waiting a bit longer to get the job done." The default is 6, but
modern computers are fast enough that it's probably just fine to use 9 all the time.
Say you're interested in researching Herman Melville's Moby-Dick, so you want to collect key texts to
help you understand the book: Moby-Dick itself, Milton's Paradise Lost, and the Bible's book of Job. Let's
compare the results of different compression rates.
$ ls -l
-rw-r--r-- scott scott  102519 job.txt
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  508925 paradise_lost.txt
$ zip -0 moby.zip *.txt
adding: job.txt (stored 0%)
adding: moby-dick.txt (stored 0%)
adding: paradise_lost.txt (stored 0%)
$ ls -l
-rw-r--r-- scott scott  102519 job.txt
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott 1848444 moby.zip
-rw-r--r-- scott scott  508925 paradise_lost.txt
$ zip -1 moby.zip *.txt
updating: job.txt (deflated 58%)
updating: moby-dick.txt (deflated 54%)
updating: paradise_lost.txt (deflated 50%)
$ ls -l
-rw-r--r-- scott scott  102519 job.txt
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  869946 moby.zip
-rw-r--r-- scott scott  508925 paradise_lost.txt
$ zip -9 moby.zip *.txt
updating: job.txt (deflated 65%)
updating: moby-dick.txt (deflated 61%)
updating: paradise_lost.txt (deflated 56%)
$ ls -l
-rw-r--r-- scott scott  102519 job.txt
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  747730 moby.zip
-rw-r--r-- scott scott  508925 paradise_lost.txt
In tabular format, the results look like this:

Book              zip -0   zip -1   zip -9
Moby-Dick         0%       54%      61%
Paradise Lost     0%       50%      56%
Job               0%       58%      65%
Total (in bytes)  1848444  869946   747730
The results you see here would vary depending on the file types (text files typically compress well) and
the sizes of the original files, but this gives you a good idea of what you can expect. Unless you have a
really slow machine or you're just naturally impatient, you should just use -9 all the time to get the
maximum compression.
Note
If you want to be clever, define an alias in your .bashrc file that looks like this:
alias zip='zip -9'
That way you'll always use -9 and won't have to think about it.
Password-Protect Compressed Zip Archives
-P -e
The Zip program allows you to password-protect your Zip archives using the -P option. You shouldn't
use this option. It's completely insecure, as you can see in the following example (the actual password
is 12345678 ):
$ zip -P 12345678 moby.zip *.txt
Because you had to specify the password on the command line, anyone viewing your shell's history (and
you might be surprised how easy it is for other users to do so) can see your password in all its glory.
Don't use the -P option!
Instead, just use the -e option, which encrypts the contents of your Zip file and also uses a password.
The difference, however, is that you're prompted to type the password in, so it won't be saved in the
history of your shell events.
$ zip -e moby.zip *.txt
Enter password:
Verify password:
adding: job.txt (deflated 65%)
adding: moby-dick.txt (deflated 61%)
adding: paradise_lost.txt (deflated 56%)
The only part of this that's saved in the shell is zip -e moby.zip *.txt. The actual password you type
disappears into the ether, unavailable to anyone viewing your shell history.
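On the other end, whoever expands an encrypted archive is prompted for the password, something like
this sketch (the exact prompt wording may vary with your version of unzip):
$ unzip moby.zip
Archive:  moby.zip
[moby.zip] job.txt password: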
Caution
The security offered by the Zip program's password protection isn't that great. In fact, it's
pretty easy to find a multitude of tools floating around the Internet that can quickly crack a
password-protected Zip archive. Think of password-protecting a Zip file as the difference
between writing a message on a postcard and sealing it in an envelope: It's good enough for
ordinary folks, but it won't stop a determined attacker.
Also, the version of zip included with some Linux distros may not support encryption, in which
case you'll see a zip error: "encryption not supported." The only solution: recompile zip from
source. Ugh.
Unzip Files
unzip
Expanding a Zip archive isn't hard at all. To create a zipped archive, use the zip command; to expand
that archive, use the unzip command.
$ unzip moby.zip
Archive:  moby.zip
  inflating: job.txt
  inflating: moby-dick.txt
  inflating: paradise_lost.txt
The unzip command helpfully tells you what it's doing as it works. To get even more information, add
the -v option (which stands, of course, for verbose).
$ unzip -v moby.zip
Archive:  moby.zip
 Length   Method    Size  Ratio   CRC-32   Name
--------  ------  ------- -----   ------   ----
  102519  Defl:X    35747  65%  fabf86c9   job.txt
 1236574  Defl:X   487553  61%  34a8cc3a   moby-dick.txt
  508925  Defl:X   224004  56%  6abe1d0f   paradise_lost.txt
--------          ------- -----            -------
 1848018           747304  60%             3 files
There's quite a bit of useful data here, including the method used to compress the files, the ratio of
original to compressed file size, and the cyclic redundancy check (CRC) used to detect errors.
List Files That Will Be Unzipped
-l
Sometimes you might find yourself looking at a Zip file and not remembering what's in that file. Or
perhaps you want to make sure that a file you need is contained within that Zip file. To list the contents
of a zip file without unzipping it, use the -l option (which stands for "list").
$ unzip -l moby.zip
Archive:  moby.zip
  Length     Date    Time    Name
 --------    ----    ----    ----
        0  01-26-06  18:40   bible/
   207254  01-26-06  18:40   bible/genesis.txt
   102519  01-26-06  18:19   bible/job.txt
  1236574  01-26-06  18:19   moby-dick.txt
   508925  01-26-06  18:19   paradise_lost.txt
 --------                    -------
  2055272                    5 files
From these results, you can see that moby.zip contains two files (moby-dick.txt and
paradise_lost.txt) and a directory (bible), which itself contains two files, genesis.txt and job.txt.
Now you know exactly what will happen when you expand moby.zip. Using the -l option helps
prevent inadvertently unzipping a file that spews out 100 files instead of unzipping a directory that
contains 100 files. The first leaves you with files strewn pell-mell, while the second is far easier to
handle.
Test Files That Will Be Unzipped
-t
Sometimes zipped archives become corrupted. The worst time to discover this is after you've unzipped
the archive and deleted it, only to discover that some or even all of the unzipped contents are damaged
and won't open. Better to test the archive first before you actually unzip it by using the -t (for test)
option.
$ unzip -t moby.zip
Archive:  moby.zip
    testing: bible/                   OK
    testing: bible/genesis.txt        OK
    testing: bible/job.txt            OK
    testing: moby-dick.txt            OK
    testing: paradise_lost.txt        OK
No errors detected in compressed data of moby.zip.
You really should use -t every time you work with a zipped file. It's the smart thing to do, and although
it might take some extra time, it's worth it in the end.
Archive and Compress Files Using gzip
gzip
Using gzip is a bit easier than zip in some ways. With zip , you need to specify the name of the newly
created Zip file or zip won't work; with gzip, though, you can just type the command and the name of
the file you want to compress.
$ ls -l
-rw-r--r-- scott scott 508925 paradise_lost.txt
$ gzip paradise_lost.txt
$ ls -l
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
You should be aware of a very big difference between zip and gzip: When you zip a file, zip leaves the
original behind so you have both the original and the newly zipped file, but when you gzip a file, you're
left with only the new gzipped file. The original is gone.
If you want gzip to leave behind the original file, you need to use the -c (or --stdout or --to-stdout)
option, which writes the results of gzip to standard output; you then redirect that output to a file. If
you use -c and forget to redirect the output, the compressed data sprays across your terminal as binary
gibberish. Not good. Instead, redirect the output to a file.
$ ls -l
-rw-r--r-- 1 scott scott 508925 paradise_lost.txt
$ gzip -c paradise_lost.txt > paradise_lost.txt.gz
$ ls -l
-rw-r--r-- 1 scott scott 508925 paradise_lost.txt
-rw-r--r-- 1 scott scott 224425 paradise_lost.txt.gz
Much better! Now you have both your original file and the zipped version.
Tip
If you accidentally use the -c option without specifying an output file, just start pressing
Ctrl+C several times until gzip stops.
Archive and Compress Files Recursively Using gzip
-r
If you want to use gzip on several files in a directory, just use a wildcard. You might not end up
gzipping everything you think you will, however, as this example shows.
$ ls -F
bible/  moby-dick.txt  paradise_lost.txt
$ ls -l *
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  508925 paradise_lost.txt

bible:
-rw-r--r-- scott scott  207254 genesis.txt
-rw-r--r-- scott scott  102519 job.txt
$ gzip *
gzip: bible is a directory -- ignored
$ ls -l *
-rw-r--r-- scott scott  489609 moby-dick.txt.gz
-rw-r--r-- scott scott  224425 paradise_lost.txt.gz

bible:
-rw-r--r-- scott scott  207254 genesis.txt
-rw-r--r-- scott scott  102519 job.txt
Notice that the wildcard didn't do anything for the files inside the bible directory because gzip by
default doesn't walk down into subdirectories. To get that behavior, you need to use the -r (or
--recursive) option along with your wildcard.
$ ls -F
bible/  moby-dick.txt  paradise_lost.txt
$ ls -l *
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  508925 paradise_lost.txt

bible:
-rw-r--r-- scott scott  207254 genesis.txt
-rw-r--r-- scott scott  102519 job.txt
$ gzip -r *
$ ls -l *
-rw-r--r-- scott scott  489609 moby-dick.txt.gz
-rw-r--r-- scott scott  224425 paradise_lost.txt.gz

bible:
-rw-r--r-- scott scott   62114 genesis.txt.gz
-rw-r--r-- scott scott   35984 job.txt.gz
This time, every file, even those in subdirectories, was gzipped. However, note that each file is
individually gzipped. The gzip command cannot combine all the files into one big file, as you can with
the zip command. To do that, you need to incorporate tar, as you'll see in "Archive and Compress Files
with tar and gzip."
Get the Best Compression Possible with gzip
-[0-9]
Just as with zip , it's possible to adjust the level of compression that gzip uses when it does its job. The
gzip command uses a scale from 0 to 9, in which 0 means "no compression at all" (which is like tar , as
you'll see later), 1 means "do the job quickly, but don't bother compressing very much," and 9 means
"compress the heck out of the files, and I don't mind waiting a bit longer to get the job done." The
default is 6, but modern computers are fast enough that it's probably just fine to use 9 all the time.
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
$ gzip -c -1 moby-dick.txt > moby-dick.txt.gz
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  571005 moby-dick.txt.gz
$ gzip -c -9 moby-dick.txt > moby-dick.txt.gz
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  487585 moby-dick.txt.gz
Remember to use the -c option and redirect the output into the actual .gz file due to the way gzip
works, as discussed in "Archive and Compress Files Using gzip."
Note
If you want to be clever, define an alias in your .bashrc file that looks like this:
alias gzip='gzip -9'
That way, you'll always use -9 and won't have to think about it.
Uncompress Files Compressed with gzip
gunzip
Getting files out of a gzipped archive is easy with the gunzip command.
$ ls -l
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
$ gunzip paradise_lost.txt.gz
$ ls -l
-rw-r--r-- scott scott 508925 paradise_lost.txt
In the same way that gzip removes the original file, leaving you solely with the gzipped result, gunzip
removes the .gz file, leaving you with the final gunzipped result. If you want to ensure that you have
both, you need to use the -c option (or --stdout or --to-stdout) and redirect the results to the file
you want to create.
$ ls -l
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
$ gunzip -c paradise_lost.txt.gz > paradise_lost.txt
$ ls -l
-rw-r--r-- scott scott 508925 paradise_lost.txt
-rw-r--r-- scott scott 224425 paradise_lost.txt.gz
It's probably a good idea to use -c, especially if you plan to keep the .gz file around or pass it
along to someone else. Sure, you could use gzip and create a fresh archive later, but why go to the
extra work?
Note
If you don't like the gunzip command, you can also use gzip -d (or --decompress or -uncompress).
Test Files That Will Be Unzipped with gunzip
-t
Before gunzipping a file (or files) with gunzip, you might want to verify that they're going to gunzip
correctly without any file corruption. To do this, use the -t (or --test) option.
$ gzip -t paradise_lost.txt.gz
$
That's right: If nothing is wrong with the archive, gzip reports nothing back to you. If there's a
problem, you'll know, but if there's not a problem, gzip is silent. That can be a bit disconcerting, but
that's how Unix-based systems work. They're generally only noisy if there's an issue you should know
about, not if everything is working as it should.
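If the silence makes you nervous, you can always ask the shell for the command's exit status, where 0
means the archive tested clean; a quick sketch:
$ gzip -t paradise_lost.txt.gz
$ echo $?
0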
Archive and Compress Files Using bzip2
bzip2
Working with bzip2 is pretty easy if you're comfortable with gzip, as the creators of bzip2 deliberately
made the options and behavior of the new command as similar to its progenitor as possible.
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
$ bzip2 moby-dick.txt
$ ls -l
-rw-r--r-- scott scott 367248 moby-dick.txt.bz2
Just like gzip, bzip2 leaves you with just the .bz2 file. The original moby-dick.txt is gone. To keep the
original file, use the -c (or --stdout ) option and pipe the output to a filename that ends with .bz2.
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
$ bzip2 -c moby-dick.txt > moby-dick.txt.bz2
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  367248 moby-dick.txt.bz2
If you look back at "Archive and Compress Files Using gzip," you'll see that gzip and bzip2 are
incredibly similar, which is by design.
Get the Best Compression Possible with bzip2
-[0-9]
Just as with zip and gzip, it's possible to adjust the level of compression that bzip2 uses when it does
its job. The bzip2 command uses a scale from 1 to 9, in which 1 means "do the job quickly, but don't
bother compressing very much," and 9 means "compress the heck out of the files, and I don't mind
waiting a bit longer to get the job done." Unlike gzip, bzip2 already defaults to 9, but spelling it out
never hurts.
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
$ bzip2 -c -1 moby-dick.txt > moby-dick.txt.bz2
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  424084 moby-dick.txt.bz2
$ bzip2 -c -9 moby-dick.txt > moby-dick.txt.bz2
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  367248 moby-dick.txt.bz2
From 424KB with -1 to 367KB with -9, that's quite a difference! Also notice the difference in ultimate
file size between gzip and bzip2. At -9, gzip compressed moby-dick.txt down to 488KB, while bzip2
mashed it even further to 367KB. The bzip2 command is noticeably slower than the gzip command, but
on a fast machine that means bzip2 takes two or three seconds longer than gzip, which frankly
isn't much to worry about.
Note
If you want to be clever, define an alias in your .bashrc file that looks like this:
alias bzip2='bzip2 -9'
That way, you'll always use -9 and won't have to think about it.
Uncompress Files Compressed with bzip2
bunzip2
In the same way that bzip2 was purposely designed to emulate gzip as closely as possible, the way
bunzip2 works is very close to that of gunzip.
$ ls -l
-rw-r--r-- scott scott  367248 moby-dick.txt.bz2
$ bunzip2 moby-dick.txt.bz2
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
You'll notice that bunzip2 is similar to gunzip in another way: Both commands remove the original
compressed file, leaving you with the final uncompressed result. If you want to ensure that you have
both the compressed and uncompressed files, you need to use the -c option (or --stdout or --to-stdout)
and redirect the results to the file you want to create.
$ ls -l
-rw-r--r-- scott scott  367248 moby-dick.txt.bz2
$ bunzip2 -c moby-dick.txt.bz2 > moby-dick.txt
$ ls -l
-rw-r--r-- scott scott 1236574 moby-dick.txt
-rw-r--r-- scott scott  367248 moby-dick.txt.bz2
It's a good thing when commands copy each other's options and behavior, as it makes them easier to
learn. In this, the creators of bzip2 and bunzip2 showed remarkable foresight.
Note
If you're not feeling favorable toward bunzip2, you can also use bzip2 -d (or --decompress or
--uncompress).
Test Files That Will Be Unzipped with bunzip2
-t
Before bunzipping a file (or files) with bunzip2, you might want to verify that they're going to bunzip
correctly without any file corruption. To do this, use the -t (or --test) option.
$ bunzip2 -t moby-dick.txt.bz2
$
Just as with gunzip, if there's nothing wrong with the archive, bunzip2 doesn't report anything back to
you. If there's a problem, you'll know, but if there's not a problem, bunzip2 is silent.
Archive Files with tar
-cf
Remember, tar doesn't compress; it merely archives (the resulting archives are known as tarballs, by
the way). Instead, tar uses other programs, such as gzip or bzip2, to compress the archives that tar
creates. Even if you're not going to compress the tarball, you still create it the same way with the same
basic options: -c (or --create ), which tells tar that you're making a tarball, and -f (or --file), which
is the specified filename for the tarball.
$ ls -l
scott scott 102519 job.txt
scott scott 1236574 moby-dick.txt
scott scott 508925 paradise_lost.txt
$ tar -cf moby.tar *.txt
$ ls -l
scott scott 102519 job.txt
scott scott 1236574 moby-dick.txt
scott scott 1853440 moby.tar
scott scott 508925 paradise_lost.txt
Pay attention to two things here. First, add up the file sizes of job.txt, moby-dick.txt , and
paradise_lost.txt, and you get 1848018 bytes. Compare that to the size of moby.tar , and you see that
the tarball is only 5422 bytes bigger. Remember that tar is an archive tool, not a compression tool, so
the result is at least the same size as the individual files put together, plus a little bit for overhead to
keep track of what's in the tarball. Second, notice that tar , unlike gzip and bzip2, leaves the original
files behind. This isn't a surprise, considering the tar command's background as a backup tool.
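If you want to peek inside the new tarball without extracting anything, the -t option lists its contents
(you'll see more of it shortly); a quick sketch:
$ tar -tf moby.tar
job.txt
moby-dick.txt
paradise_lost.txt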
What's really cool about tar is that it's designed to archive entire directory structures, so you can
roll up a large number of files and subdirectories in one fell swoop.
$ ls -lF
drwxr-xr-x scott scott 168 moby-dick/
$ ls -l moby-dick/*
scott scott  102519 moby-dick/job.txt
scott scott 1236574 moby-dick/moby-dick.txt
scott scott  508925 moby-dick/paradise_lost.txt

moby-dick/bible:
scott scott 207254 genesis.txt
scott scott 102519 job.txt
$ tar -cf moby.tar moby-dick/
$ ls -lF
scott scott     168 moby-dick/
scott scott 2170880 moby.tar
The tar command has been around forever, and it's obvious why: It's so darn useful! But it gets even
more useful when you start factoring in compression tools, as you'll see in the next section.
Archive and Compress Files with tar and gzip
-zcvf
If you look back at "Archive and Compress Files Using gzip" and "Archive and Compress Files Using
bzip2" and think about what was discussed there, you'll probably start to figure out a problem. What if
you want to compress a directory that contains 100 files, contained in various subdirectories? If you use
gzip or bzip2 with the -r (for recursive) option, you'll end up with 100 individually compressed files,
each stored neatly in its original subdirectory. This is undoubtedly not what you want. How would you
like to attach 100 .gz or .bz2 files to an email? Yikes!
That's where tar comes in. First you'd use tar to archive the directory and its contents (those 100 files
inside various subdirectories) and then you'd use gzip or bzip2 to compress the resulting tarball.
Because gzip is the most common compression program used in concert with tar , we'll focus on that.
You could do it this way (note the -f -, which tells tar to write the archive to stdout so that gzip
has something to compress):
$ ls -l moby-dick/*
scott scott  102519 moby-dick/job.txt
scott scott 1236574 moby-dick/moby-dick.txt
scott scott  508925 moby-dick/paradise_lost.txt

moby-dick/bible:
scott scott 207254 genesis.txt
scott scott 102519 job.txt
$ tar -cf - moby-dick/ | gzip -c > moby.tar.gz
$ ls -l
scott scott    168 moby-dick/
scott scott 846049 moby.tar.gz
That method works, but it's just too much typing! There's a much easier way that should be your
default. It involves two new options for tar : -z (or --gzip), which invokes gzip from within tar so you
don't have to do so manually, and -v (or --verbose), which isn't required here but is always useful, as it
keeps you notified as to what tar is doing as it runs.
$ ls -l moby-dick/*
scott scott  102519 moby-dick/job.txt
scott scott 1236574 moby-dick/moby-dick.txt
scott scott  508925 moby-dick/paradise_lost.txt

moby-dick/bible:
scott scott 207254 genesis.txt
scott scott 102519 job.txt
$ tar -zcvf moby.tar.gz moby-dick/
moby-dick/
moby-dick/job.txt
moby-dick/bible/
moby-dick/bible/genesis.txt
moby-dick/bible/job.txt
moby-dick/moby-dick.txt
moby-dick/paradise_lost.txt
$ ls -l
scott scott    168 moby-dick
scott scott 846049 moby.tar.gz
The usual extension for a file that has had the tar and then the gzip commands used on it is .tar.gz;
however, you could use .tgz or .tar.gzip if you like.
Note
It's entirely possible to use bzip2 with tar instead of gzip. Your command would look like this
(note the -j option, which is where bzip2 comes in):
$ tar -jcvf moby.tar.bz2 moby-dick/
In that case, the extension should be .tar.bz2, although you may also see .tar.bzip2, .tbz2,
or .tbz. Yes, it's confusing that .tbz has been used over the years for tarballs compressed with
more than one program, which is a strong argument for avoiding that particular extension to keep
confusion to a minimum.
Test Files That Will Be Untarred and Uncompressed
-zvtf
Before you take apart a tarball (whether or not it was also compressed using gzip), it's a really good
idea to test it. First, you'll know if the tarball is corrupted, saving yourself hair pulling when files don't
seem to work. Second, you'll know if the person who created the tarball thoughtfully tarred up a
directory containing 100 files, or instead thoughtlessly tarred up 100 individual files, which you're just
about to spew all over your desktop.
To test your tarball (once again assuming it was also zipped using gzip), use the -t (or --list) option.
$ tar -zvtf moby.tar.gz
scott/scott       0 moby-dick/
scott/scott  102519 moby-dick/job.txt
scott/scott       0 moby-dick/bible/
scott/scott  207254 moby-dick/bible/genesis.txt
scott/scott  102519 moby-dick/bible/job.txt
scott/scott 1236574 moby-dick/moby-dick.txt
scott/scott  508925 moby-dick/paradise_lost.txt
This output shows you the ownership and file size for each item (the full output also includes
permissions and a timestamp, trimmed here to save space). In addition, because every line begins with
moby-dick/, you can see that you're going to end up with a single directory that contains within it all
the files and subdirectories in the tarball, which is a relief.
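If all you want to verify is the integrity of the gzip layer itself, note that gzip has a test option of its own. This is a minimal sketch; the command prints nothing when the file is sound:

$ gzip -t moby.tar.gz
$

Testing with tar -zvtf is still more informative, though, since it also lists the archive's contents.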
Be sure that the -f is the last option because after that you're going to specify the name of the .tar.gz
file. If you don't, tar complains:
$ tar -zvft moby.tar.gz
tar: You must specify one of the `-Acdtrux' options
Try `tar --help' or `tar --usage' for more information.
Now that you've ensured that your .tar.gz file isn't corrupted, it's time to actually open it up, as you'll
see in the following section.
Note
If you're testing a tarball that was compressed using bzip2, just use this command instead:
$ tar -jvtf moby.tar.bz2
Untar and Uncompress Files
-zxvf
To create a .tar.gz file, you used a set of options: -zcvf. To untar and uncompress the resulting file,
you only make one substitution: -x (or --extract) for -c (or --create).
$ ls -l
rsgranne rsgranne 846049 moby.tar.gz
$ tar -zxvf moby.tar.gz
moby-dick/
moby-dick/job.txt
moby-dick/bible/
moby-dick/bible/genesis.txt
moby-dick/bible/job.txt
moby-dick/moby-dick.txt
moby-dick/paradise_lost.txt
$ ls -l
rsgranne rsgranne     168 moby-dick
rsgranne rsgranne  846049 moby.tar.gz
Make sure you always test the file before you open it, as covered in the previous section, "Test Files
That Will Be Untarred and Uncompressed." That means the order of commands you should run will look
like this:
$ tar -zvtf moby.tar.gz
$ tar -zxvf moby.tar.gz
Note
If you're opening a tarball that was compressed using bzip2, just use this command instead:
$ tar -jxvf moby.tar.bz2
Conclusion
Back in the days of slow modems and tiny hard drives, archiving and compression were necessities.
These days, they're more of a convenience, but you'll still find yourself using them all the time. For
instance, if you ever download source code to compile it, more than likely you'll find yourself face-to-face
with a file such as sourcecode.tar.gz. In the future, you'll probably see more and more of those
files ending with .tar.bz2. And if you exchange files with Windows users, you're going to run into files
that end with .zip. Learn how to use your archival and compression tools because you're going to be
using them far more than you think.
Part III: Finding Stuff
Chapter 9. Finding Stuff: Easy
Chapter 10. The find Command
Chapter 9. Finding Stuff: Easy
Every year, hard drives get bigger and cheaper, a nice combination. With all the technical toys we have
in our lives now (digital cameras, video cameras, and MP3 players, as well as movies and music we find on
the Net), we certainly have plenty of stuff to fill those hard drives. Every digital pack rat has to pay the
price, however, and that too often tends to be an inability to find the stuff you want. It can be hard to
find that one photo of your brother in the midst of 10,000 other pictures, or that paper you wrote when
you have to dig through 600 other documents. Fortunately, Linux has powerful tools at your disposal
that can help make retrieving a necessary file a quick and efficient activity.
Search a Database of Filenames
locate
Know the name of a file, or even part of the name, but don't know where it resides on your system?
That's what locate is for. The locate command looks for files, programs, and directories matching your
search term. Any matching results are printed to your terminal, one after the other.
Note
In order to save space, I've replaced the first part of the path (/home/scott) with an ellipsis.
$ locate haggard
.../txt/rider_haggard
.../txt/rider_haggard/Queen_of_the_Dawn.txt
.../txt/rider_haggard/Allan_and_the_Ice-Gods.txt
.../txt/rider_haggard/Heu-Heu_or_The_Monster.txt
Your search results show up quickly because locate isn't searching your system in real time. Instead,
locate searches a database of filenames that is automatically updated daily (more about that in
"Update the Database Used by locate" later in this chapter). Because locate searches a precreated
database, its results appear almost instantaneously.
On your computer, though, you're probably using slocate instead of locate; you're just not aware of it.
The slocate command (which stands for secure locate) is a more recent version that won't search
directories that the user running slocate doesn't have permission to view (for instance, if you're not
root, results from /root shouldn't show up when you search with locate). Before slocate, locate would
spit back many errors complaining about permission problems; with slocate, those errors are a thing of
the past.
To verify how slocate works, try the following. Note that the first search, done when you're a normal
user and not root, fails. Use su to become root, run locate again, and bingo! Search results appear
(slocate.db is the database file used by slocate, by the way).
$ locate slocate.db
$ su
# locate slocate.db
/var/lib/slocate/slocate.db.tmp
/var/lib/slocate/slocate.db
To make things easy on users, however, most systems create a soft link for /usr/bin/locate that points
to /usr/bin/slocate. To verify that your Linux distribution does this, try the following (these particular
results are from a box running K/Ubuntu 5.10, a Debian-based distribution, and I've removed some
data so I can focus on the important information):
$ ls -l /usr/bin/locate
root root /usr/bin/locate -> slocate
Because it's transparent to the user that she's running slocate, and because locate uses fewer letters
and is therefore quicker to type, we're going to refer to locate in this book, even though slocate is the
actual command that's being run.
Search a Database of Filenames Without Worrying About Case
locate -i
In the previous section, you tested locate by searching for any files or directories with the word
haggard in the name, so you could find your collection of public domain H. Rider Haggard novels. The
results looked like this:
$ locate haggard
.../txt/rider_haggard
.../txt/rider_haggard/Queen_of_the_Dawn.txt
.../txt/rider_haggard/Allan_and_the_Ice-Gods.txt
.../txt/rider_haggard/Heu-Heu_or_The_Monster.txt
This worked because the directory containing the novels had the word haggard in it. But if that directory
had instead been named H_Rider_Haggard, the search would have failed due to Linux's case sensitivity
(discussed in Chapter 1, "Things to Know About Your Command Line"). Sure enough, when you use the
-i option, a case-insensitive search is performed, finding files with both haggard and Haggard (and, in
fact, HAGGARD, HaGgArD, and so on) in the path.
$ locate -i haggard
.../txt/rider_haggard
.../txt/rider_haggard/Queen_of_the_Dawn.txt
.../txt/rider_haggard/Allan_and_the_Ice-Gods.txt
.../txt/rider_haggard/Heu-Heu_or_The_Monster.txt
.../txt/Rider_Haggard
.../txt/Rider_Haggard/King_Solomons_Mines.txt
.../txt/Rider_Haggard/Allan_Quatermain.txt
It turns out that there were more Haggard novels available than it first seemed. Remember to use -i
when you want to maximize your locate results, as you can otherwise miss important files and folders
that you wanted to find.
Note
For more on H. Rider Haggard, see http://en.wikipedia.org/wiki/Rider_Haggard. He's a fun, if
dated, read.
Manage Results Received When Searching a Database of Filenames
-n
If you use locate very much, eventually you're going to run into something like this:
$ locate pdf
/etc/cups/pdftops.conf
/etc/xpdf
/etc/xpdf/xpdfrc-latin2
[Results truncated for length]
On this computer, we get 2,373 results. Far too many! Better to use this construction instead:
$ locate pdf | less
Pipe the output of your locate search to the pager less (which was covered in "View Text Files a Screen
at a Time," in Chapter 5, "Viewing Files"), and you get your 2,373 results one manageable screen at a
time.
If you just want to see the first x results, where x is an integer of your choice, use the -n option,
followed by the number of results you want to see.
$ locate -n 3 pdf
/etc/cups/pdftops.conf
/etc/xpdf
/etc/xpdf/xpdfrc-latin2
This is far more manageable, and might be all you need. Don't let locate just spew a flood of results at
you; instead, take control of the locate command's output and use it to your favor.
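The -i and -n options combine nicely, of course. To grab just the first five matches without worrying about case, you might run something like the following; the results shown here are from the earlier Haggard example, so yours will differ:

$ locate -i -n 5 haggard
.../txt/rider_haggard
.../txt/rider_haggard/Queen_of_the_Dawn.txt
.../txt/rider_haggard/Allan_and_the_Ice-Gods.txt
.../txt/rider_haggard/Heu-Heu_or_The_Monster.txt
.../txt/Rider_Haggard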
Update the Database Used by locate
updatedb
The first section of this chapter that introduced locate, "Search a Database of Filenames," mentioned
that the reason the command is so fast is because it is actually searching a database containing your
machine's file and directory names. When locate is installed, it automatically sets itself up to scan your
hard drive and update that database, usually in the middle of the night. That's great for convenience,
but not so great if you need to find a file you just placed on your computer.
For instance, what if you install Rootkit Hunter, a program that looks for rootkits (used by bad guys to
take control of your Linux box), and then you want to look at the files the program has installed? The
locate command won't be able to help you because it doesn't know about those files and won't know
about them until its database is updated at a later time. You can, however, manually update the
database used by locate at any time by running updatedb . Because that command indexes virtually
every file and folder on your computer, you need to be root to run it (or use sudo on distributions like
K/Ubuntu that discourage root use).
# apt-get install rkhunter
# exit
$ locate rkhunter
$ su
# updatedb
# exit
$ locate rkhunter
/usr/local/rkhunter
/usr/local/rkhunter/bin
/usr/local/rkhunter/etc
In the preceding commands, you first install rkhunter , the package name for Rootkit Hunter, and then
exit root. You search for rkhunter , but it's nowhere to be seen. You become root again, run updatedb to
scan your hard drive and let the locate database know about any changes, and then exit root. Finally,
you search for rkhunter with locate again, and this time you're successful.
One thing you should be aware of, however: The speed with which updatedb works is directly
proportional to the amount of stuff on your hard drive and the speed of your computer. Got a fast
processor, a fast hard drive, and few files? Then updatedb will work quickly. Do you have a slow CPU,
a 5,400 RPM drive, and a million files? Expect updatedb to take quite a while. If you're interested in
knowing just how long it takes to run, preface updatedb with the time command, like this:
# time updatedb
When updatedb finishes, time tells you how long it took to get the locate database squared away. That
is useful information to have in your head in case you ever need to use updatedb and you're in a hurry.
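The output of time looks something like the following (the numbers here are purely illustrative); the real line, the wall-clock time, is the one you care about:

# time updatedb

real    3m28.224s
user    0m11.805s
sys     0m31.772s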
Note
The updatedb command is exactly the same as running slocate -u, and updatedb is actually
just a link to slocate, as you can easily see for yourself.
$ ls -l /usr/bin/updatedb
root root /usr/bin/updatedb -> slocate
Search Inside Text Files for Patterns
grep
The locate command searches the names of files and directories, but it can't search inside those files.
To do that, you use grep. Essentially, you give grep a pattern for which you want to search, point it at a
file or a group of files (or even a whole hard drive) that you want to search, and then grep outputs a list
of lines that match your pattern.
$ grep pain three_no_more_forever.txt
all alone and in pain
In this case you used grep to see if the word pain was in a file containing a poem by Peter Von Zer
Muehlen titled "Three No More Forever." Sure enough, the word pain is in the file, so grep prints the line
containing your search term on the terminal. But what if you want to look in several of Peter's poems at
once? Wildcards to the rescue!
$ grep pain *
fiery inferno in space.txt:watch the paint peel,
three_no_more_forever.txt:all alone and in pain
the speed of morning.txt:of a Chinese painting.
8 hour a day.txt:nice paint job too
ghost pain.txt:Subject: ghost pain
Notice that grep finds all uses of the string pain, including paint and painting . Also pay attention to
how grep shows you the filename for each file that contains the search term, as well as the line
containing that term. So far, it's been pretty easy to search inside files with grep. So it's a perfect time
to complicate matters, as you'll discover in the following sections.
The Basics of Searching Inside Text Files for Patterns
In the previous section, you learned that grep works by looking for the existence of a pattern in a group
of files. Your first use of grep was extremely basic, but now you need to get a bit more complex, and to
do that you need to understand the patterns for which grep searches. Those patterns are built using one
of the most powerful tools in the Linux toolbox: regular expressions, or regex. To take full advantage of
grep, you really need to grok regex; however, regex is a book all in itself, so we're only going to cover
the basics here.
Tip
Want to learn more about regular expressions? You can search the Internet, and you'll find
quite a bit of great stuff there, but Sams Teach Yourself Regular Expressions in 10 Minutes (by
Ben Forta; ISBN: 0672325667) is a great book that'll really help you as you explore and
learn regex.
One thing that confuses new users when they start playing with grep is that the command has several
versions, as shown in Table 9.1.
Table 9.1. Different Versions of grep

Interpret Pattern As                                 grep Command Option              Separate Command
Basic regular expression                             grep -G (or --basic-regexp)      grep
Extended regular expression                          grep -E (or --extended-regexp)   egrep
List of fixed strings, any of which can be matched   grep -F (or --fixed-strings)     fgrep
Perl regular expression                              grep -P (or --perl-regexp)       Not applicable
To summarize this table, grep all by itself works with basic regex. If you use the -E (or --extended-regexp) option or the egrep command, you can use extended regex. Much of the time, this is probably
what you'll want to do, unless you're performing a very simple search. Two more complicated choices
are grep with the -F (or --fixed-strings) option or the fgrep command, which allows you to use
multiple search terms that could be matched, and grep with the -P (or --perl-regexp) option, which
allows Perl programming mavens to use that language's sometimes unique approach to regex.
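For example, alternation with | is an extended regex feature. Here's a quick sketch (todo.txt is a hypothetical shopping-list file) that matches lines containing either milk or dog; egrep 'milk|dog' todo.txt would give identical results:

$ grep -E 'milk|dog' todo.txt
Buy milk
Buy dog food

With GNU grep, plain grep 'milk\|dog' todo.txt also works, because GNU basic regex supports alternation when the | is escaped.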
Note
In this book, unless otherwise stated, we're using just plain grep for basic regex.
A few possible points of confusion need to be covered before you continue. If you're unclear about any
of these, use the listed resources as a jumping-off point to learn more.
Wildcards are not equivalent to regex. Yes, both wildcards and regular expressions use the * character,
for instance, but they have completely different meanings. Where certain characters (? and *, for
example) are used as wildcards to indicate substitution, the same characters in regex are used to
indicate the number of times a preceding item is to be matched. For instance, with wildcards, the ? in
c?t replaces one and only one character, matching cat, cot, and cut, for instance, but not ct. With regex,
the ? in c[a-z]?t indicates that a letter from a through z is to be matched zero or one times, thereby
corresponding to cat, cot, and cut, and also ct.
Tip
To learn more about differences between wildcards and regular expressions, see "What Is a
Regular Expression"
(http://docs.kde.org/stable/en/kdeutils/KRegExpEditor/whatIsARegExp.html), "Regular
Expressions Explained" (www.castaglia.org/proftpd/doc/contrib/regexp.html), and "Wildcards
Gone Wild" (www.linux-mag.com/2003-12/power_01.html).
Another potentially confusing thing about grep is that you need to be aware of special characters in your
grep regex. For instance, in regular expressions, the string [a-e] indicates a regex range, and means
any one character matching a, b, c, d, or e. When using [ or ] with grep, you need to make it clear to
your shell whether the [ and ] are there to delimit a regex range or are part of the words for which
you're searching. Special characters of which you need to keep aware include the following:
. ? [ ] ^ $ | \
Finally, there is a big difference between the use of single quotes and double quotes in regex. Single
quotes (' and ') tell the shell that you are searching for a string of characters, while double quotes ("
and ") let your shell know that you want to use shell variables. For instance, using grep and regex in the
following way to look for all usages of the phrase "hey you!" in a friend's poetry wouldn't work:
$ grep hey you! *
grep: you!: No such file or directory
txt/pvzm/8 hours a day.txt:hey you! let's run!
txt/pvzm/friends & family.txt:in patience they wait
txt/pvzm/speed of morning.txt:they say the force
Because you simply wrote out "hey you!" with nothing around it, grep was confused. It first looked for
the search term hey in a file called "you!" but it was unsuccessful, as that isn't the actual name of a file.
Then it searched for hey in every file contained in the current working directory, as indicated by the *
wildcard, with three good results. It's true that the first of those three contained the phrase you were
searching for, so in that sense your search worked, but not really. This search was crude and does not
always deliver the results you'd like. Let's try again.
This time you'll use double quotes around your search term. That should fix the problem you had when
you didn't use anything at all.
$ grep " hey you! " *
bash: ! " *: event not found
Even worse! Actually, the quotation marks cause a big problem of their own and give even worse results
than you just saw. What happened? The ! is a shell character that references your command history.
Normally you'd use the ! by following it with a number that identifies an earlier command you ran, like
!264 (you'll learn more about this in Chapter 11, "Your Shell").
Here, though, bash sees the !, looks for a history reference after it, and then complains that it can't find
an earlier command named " * (a double quote, a space, and an asterisk), which would be a very weird
command indeed.
It turns out that quotation marks indicate that you are using shell variables in your search term, which
is in fact not what you wanted at all. So double quotes don't work. Let's try single quotes.
$ grep 'hey you!' *
txt/pvzm/8 hours a day.txt:hey you! let's run!
Much better results! The single quotes told grep that your search term didn't contain any shell variables,
and was just a string of characters that you wanted to match. Lo and behold, there was a single result,
the exact one you wanted.
The lesson? Know when to use single quotes, when to use double quotes, and when to use nothing. If
you're searching for an exact match, use single quotes, but if you want to incorporate shell variables
into your search term (which will be rare indeed), use double quotes. If you're searching for a single
word that contains just numbers and letters, though, it's safe to leave off all quotes entirely. If you want
to be safe, go ahead and use single quotes, even around a single wordit can't hurt.
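As an aside, here's a case where double quotes are exactly what you want: letting the shell expand a variable inside the search term. The standard $USER variable holds your username, so something like this (the output line is illustrative) finds your entry in /etc/passwd:

$ grep "$USER" /etc/passwd
scott:x:1000:1000:Scott,,,:/home/scott:/bin/bash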
Search Recursively for Text in Files
-R
The * wildcard allows you to search several files in the same directory, but to search in several
subdirectories at once, you need the -R (or --recursive) option. Let's look for the word hideous, a
favorite of horror writers of the nineteenth and early twentieth centuries, amongst a collection of old-fashioned (but still wonderful!) tales.
$ grep -R hideous *
machen/great_god_pan.txt:know, not in your most fantastic, hideous dreams can you have
machen/great_god_pan.txt:hideously contorted in the entire course of my practice, and I
machen/great_god_pan.txt:death was horrible. The blackened face, the hideous form upon
lovecraft/Beyond the Wall of Sleep.txt:some hideous but unnamed wrong, which
lovecraft/Beyond the Wall of Sleep.txt:blanket over the hideous face, and awakened the nurse.
lovecraft/Call of Cthulhu.txt:hideous a chain. I think that the professor, too, intended to
lovecraft/Call of Cthulhu.txt:voodoo meeting; and so singular and hideous were the rites
lovecraft/Call of Cthulhu.txt:stated, a very crude bas-relief of stone, comprising a hideous...
Tip
Of course, if you get too many results, you should pipe the results to less, as you did
previously with locate in "Manage Results Received When Searching a Database of
Filenames":
$ grep -R hideous * | less
Another tactic would be to send the output of the command into a text file, and then open that
file in whatever text editor you prefer:
$ grep -R hideous * > hideous_in_horror.txt
That's a great way to search and store your results in case you need them later.
Search for Text in Files, Ignoring Case
-i
By default, searches performed with grep are case-sensitive. In the previous section, you searched
amongst H.P. Lovecraft stories for the word hideous (a favorite of that author). But what about
Hideous?
$ grep Hideous h_p_lovecraft/*
h_p_lovecraft/the_whisperer_in_darkness.txt: them. Hideous though the idea was, I knew...
Your earlier search for hideous found 463 results (wow!) and Hideous returned one. Is there any way to
combine them? Yes, with the -i (or --ignore-case) option, which searches for both, and also searches
for HiDeOuS, HIDEOUS, and all other possible combinations.
$ grep -i hideous h_p_lovecraft/*
h_p_lovecraft/Call of Cthulhu.txt:voodoo meeting; and so singular and hideous were the rites
h_p_lovecraft/Call of Cthulhu.txt:stated, a very crude bas-relief of stone, comprising a hideous
h_p_lovecraft/the_whisperer_in_darkness.txt: them. Hideous though the idea was, I knew...
[Results truncated for length]
Keep in mind that you're probably increasing the number of results you're going to get, perhaps by an
order of magnitude. If that's a problem, check the tip at the end of the previous section for some advice
about dealing with it.
Search for Whole Words Only in Files
-w
Think back to the earlier section "The Basics of Searching Inside Text Files for Patterns," in which you
first learned about grep. You searched for the word pain, and grep obediently returned a list showing
you where pain had been used.
$ grep pain *
fiery inferno in space.txt:watch the paint peel,
three_no_more_forever.txt:all alone and in pain
the speed of morning.txt:of a Chinese painting.
8 hour a day.txt:nice paint job too
ghost pain.txt:Subject: ghost pain
By default, grep searches for all occurrences of the string pain, showing you lines that contain pain, but
also paint and painting. If painless, Spain, or painstaking had been in one of the files that were
searched, those lines would have shown up as well. But what if you only wanted lines in which the exact
word pain appeared? For that, use the -w (or --word-regexp) option.
$ grep -w pain *
three_no_more_forever.txt:all alone and in pain
ghost pain.txt:Subject: ghost pain
This option can really help narrow your search results when you receive too many to easily sort
through.
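The -w option combines with others as you'd expect. A case-insensitive, whole-word search looks like this; in these particular files the results happen to match the plain -w output, since nobody capitalized pain:

$ grep -iw pain *
three_no_more_forever.txt:all alone and in pain
ghost pain.txt:Subject: ghost pain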
Show Line Numbers Where Words Appear in Files
-n
The grep command shows you the line containing the term for which you're searching, but it doesn't
really tell you where in the file you can find that line. To find out the line number, utilize the -n (or --line-number) option.
$ grep -n pain *
fiery inferno in space.txt:56:watch the paint peel,
three_no_more_forever.txt:19:all alone and in pain
the speed of morning.txt:66:of a Chinese painting.
8 hour a day.txt:78:nice paint job too
ghost pain.txt:32:Subject: ghost pain
Now that you know the line numbers for each instance of the string pain, it is a simple matter to go
directly to those lines in virtually any text editor. Nice!
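Most editors can jump straight to a numbered line. For instance, vim (and nano, which uses the same syntax) accepts +N before the filename; a quick sketch using the result above:

$ vim +19 three_no_more_forever.txt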
Search the Output of Other Commands for Specific Words
$ ls -1 | grep 1960
The grep command is powerful when used by itself, but it really comes alive when you use it as a filter
for the output of other programs. For instance, let's say you have all your John Coltrane MP3s organized
in separate subfolders for each album (66 in all ... yes, Coltrane is that good), with the year at the
beginning. A partial listing might look like this (the -1 option is used with ls so there is one result on
each line):
$ ls -1
1956_Coltrane_For_Lovers
1957_Blue_Train
1957_Coltrane_[Prestige]
1957_Lush_Life
1957_Thelonious_Monk_With_John_Coltrane
Now, what if you just wanted to see a list of the albums you own that Coltrane released in 1960? Pipe
the results of ls -1 to grep, and you'll get your answer in seconds.
$ ls -1 | grep 1960
1960_Coltrane_Plays_The_Blues
1960_Coltrane's_Sound
1960_Giant_Steps
1960_My_Favorite_Things
After you start thinking about it, you'll find literally hundreds of uses for grep in this way. Here's another
powerful one. The ps command lists running processes, while the -f option tells ps to give the full
listing, with lots of information about each process, and the -U option, followed by a username, restricts
the output to processes owned by that user. Normally ps -fU scott would result in a long list, too long
if you're looking for information about a specific process. With grep, however, you can easily restrict the
output.
Note
To save space, some of the information you'd normally see with ps has been removed.
$ ps -fU scott | grep firefox
scott 17623 /bin/sh /opt/firefox/firefox
scott 17634 /opt/firefox/firefox-bin
scott 1601 grep firefox
The ps command lists all processes owned by the scott user (64, in fact), but pipes that output to
grep, which lists only those lines that contain the word firefox in them. Unfortunately, the last line of
output is extraneous: You only care about the actual Firefox program, not the search for firefox using
grep. To hide the search for firefox in the grep results, try this instead:
$ ps -fU scott | grep [f]irefox
scott 17623 /bin/sh /opt/firefox/firefox
scott 17634 /opt/firefox/firefox-bin
Now your grep search term used a regex range, from f to f, so it still matches the literal string firefox,
and the lines output by ps for the running Firefox programs match just as before. The line for the grep
process itself, however, now contains the literal text [f]irefox rather than firefox, so it doesn't match the
pattern and is filtered out. This is a bit confusing, but if you try it yourself and think about it a bit, it'll
make some sense. At any rate, it works. Give it a try!
See Context for Words Appearing in Files
-A, -B, -C
When dealing with data, context is everything. As you've learned, grep outputs the actual line
containing the search term, but you can also tell grep to include lines before and after the match. In the
last section, "Search the Output of Other Commands for Specific Words," you used grep to work with a
list of John Coltrane albums. One of his best was A Love Supreme. What three albums came out before
that one? To get the answer, use the -B (or --before-context=#) option.
$ ls -1 | grep -B 3 A_Love_Supreme
1963_Impressions
1963_John_Coltrane_&_Johnny_Hartman
1963_Live_At_Birdland
1964_A_Love_Supreme
If you want to find out what came after A Love Supreme, use the -A (or --after-context=#) option
instead.
$ ls -1 | grep -A 3 'A_Love_Supreme'
1964_A_Love_Supreme
1964_Coltrane's_Sound
1964_Crescent
1965_Ascension
To get the full historical context for A Love Supreme, try the -C (or --context=#) option, which
combines before and after.
$ ls -1 | grep -C 2 'A_Love_Supreme'
1963_John_Coltrane_&_Johnny_Hartman
1963_Live_At_Birdland
1964_A_Love_Supreme
1964_Coltrane's_Sound
1964_Crescent
This can be a bit confusing when you have more than one match in a file or group of files. For instance,
Coltrane released several live albums, and if you want to see the albums just before and after those,
you're going to get more complex results.
$ ls -1 | grep -C 1 Live
1963_John_Coltrane_&_Johnny_Hartman
1963_Live_At_Birdland
1964_A_Love_Supreme
--
1965_Last_Trane
1965_Live_in_Seattle
1965_Major_Works_of_John_Coltrane
--
1965_Transition
1966_Live_at_the_Village_Vanguard_Again!
1966_Live_in_Japan
1967_Expression
1967_Olatunji_Concert_Last_Live_Recording
1967_Stellar_Regions
The -- characters separate each matched group. The first two groups of results are obvious: an album
with Live in the title, preceded and followed by another album. The last group is a bit more
complicated. Several albums containing the word Live in the title are right next to each other, so the
results are bunched together. It might look a bit weird, but if you look at each instance of Live, you'll
notice that the album before and after it is in fact listed.
The results are even more informative if you incorporate the -n option, which lists line numbers
(because you're using ls -1, it's the line number of that ls listing).
$ ls -1 | grep -n -C 1 Live
37-1963_John_Coltrane_&_Johnny_Hartman
38:1963_Live_At_Birdland
39-1964_A_Love_Supreme
--
48-1965_Last_Trane
49:1965_Live_in_Seattle
50-1965_Major_Works_of_John_Coltrane
--
52-1965_Transition
53:1966_Live_at_the_Village_Vanguard_Again!
54:1966_Live_in_Japan
55-1967_Expression
56:1967_Olatunji_Concert_Last_Live_Recording
57-1967_Stellar_Regions
Now -C gives you even more information about each line, as indicated by the character after the line
number. A : indicates that the line matches, while a - means that it's a line before or after a match.
Line 54, 1966_Live_in_Japan , does double duty. It comes after
1966_Live_at_the_Village_Vanguard_Again!, which should mean it has a -, but it is itself a match,
which necessitates a :. Because a match is more important, that wins, and a : is ultimately used.
Show Lines Where Words Do Not Appear in Files
-v
In the jazz world, John Coltrane still rules nearly 40 years after his death; likewise, Led Zeppelin is
recognized as one of the great all-time rock 'n' roll bands. While they were together, Led Zeppelin
released nine albums; however, many of them had the band's name in the album title (yes, the fourth
release didn't really have a title, but most critics still recognize it as Led Zeppelin IV, so humor me).
What if you want to see a list of MP3 folders containing Led Zeppelin's albums, but exclude those that
actually have the words Led Zeppelin in the title? With the -v (or --invert-match) option, you can show
only results that do not match the given pattern.
$ ls -1
1969_Led_Zeppelin
1969_Led_Zeppelin_II
1970_Led_Zeppelin_III
1971_Led_Zeppelin_IV
1973_Houses_Of_The_Holy
1975_Physical_Graffiti
1976_Presence
1979_In_Through_The_Out_Door
1982_Coda
$ ls -1 | grep -v Led_Zeppelin
1973_Houses_Of_The_Holy
1975_Physical_Graffiti
1976_Presence
1979_In_Through_The_Out_Door
1982_Coda
With -v, you can really start to funnel your results to show only the exact items you need. You won't
use -v all the time, but when you need it, you'll be glad it's available.
List Files Containing Searched-for Words
-l
The grep command lists the lines containing the term for which you searched, but there might be times
when you don't want to know the lines; instead, you want to know the names of the files that contain
those matched lines. Previously in "Search for Text in Files, Ignoring Case," you looked for lines in H.P.
Lovecraft stories that contained the word hideous. With the -l (or --files-with-matches) option, you
can instead produce a list of those files (the -i is for case-insensitive searches, remember).
$ grep -il hideous h_p_lovecraft/*
h_p_lovecraft/Call of Cthulhu.txt
h_p_lovecraft/From Beyond.txt
h_p_lovecraft/The Case of Charles Dexter Ward.txt
This type of result is particularly useful when combined with other commands. For example, if you
wanted to print a list of Lovecraft's stories containing the word hideous, you could combine grep with
the lpr command as follows:
$ grep -il hideous h_p_lovecraft/* | lpr
Keep in mind that this command would print out the list of stories, not the stories themselves (there is a
way to do that, though, and here's a hint: It involves cat ).
Search for Words Inside Search Results
grep | grep
What if you want a list of albums released by John Coltrane in the last two years of his career? Simple
enough.
$ ls -1 | grep 196[6-7]
1966_Live_at_the_Village_Vanguard_Again!
1966_Live_in_Japan
1967_Expression
1967_Olatunji_Concert_Last_Live_Recording
1967_Stellar_Regions
The range [6-7] limits what would otherwise be a much longer list to the years 1966 and 1967. So far, so
good, but what if you don't want to include any of his live albums (which would normally be a horrible
mistake, but let's pretend here)? Here's how to do it:
$ ls -1 | grep 196[6-7] | grep -v Live
1967_Expression
1967_Stellar_Regions
The -v option (which you learned about previously in "Show Lines Where Words Do Not Appear in
Files") worked to strip out lines containing Live, but the really interesting thing here is how you took
the output of ls -1, piped that to grep 196[6-7] , and then piped the output from that filter to a second
instance of grep, this one with -v Live. The final results are exactly what you wanted: a list of all John
Coltrane's albums, released in 1966 and 1967, that do not contain Live in the title. And that, my
friends, shows you the power of the Linux command line in a nutshell!
Conclusion
This chapter focused on two commands that you'll be using often: locate and grep. Though they're
related (both assist the Linux user in finding files and information on his computer), they go about it in
different ways. The locate command searches the names of files using a database of filenames to speed
up its work, while the grep command looks in real time within the contents of files to pull out the search
terms.
As cool as both locate and grep are, they're just the beginning when it comes to searching your file
system. The next chapter is about one of the most powerful and versatile commands on a Linux system,
a command that perfectly complements, and in fact can work in tandem with, locate and grep. That
command? The mighty find. Turn that page and let's get going!
Chapter 10. The find Command
In the last chapter, we covered commands that let you search for files (locate) and data within files
(grep). The third command in the powerful triumvirate is find. While locate searches a database for
files, which makes it fast but dependent upon a constantly updated database, find searches for files on
the fly using criteria that you specify. Since find has to parse through your file structure, it's much
slower than locate, but you can do things with find that aren't possible with locate.
Throughout this chapter, we look for files on an external music drive mounted at /media/music. You'll
see that find allows us to slice and dice the files in a variety of ways.
Find Files by Name
find -name
find is basically used to look for files by name, or part of a name (hence the -name option). By default,
find is automatically recursive and searches down through a directory structure. Let's look for all MP3
files sung by the unique group the Shaggs on the music drive:
$ cd /media/music
$ find . -name Shaggs
./Outsider/Shaggs
What? This can't be correct! The find command found the folder, but not the songs. Why? Because we
didn't use any wildcards, so find looked for files specifically named "Shaggs." There is only one item with
that precise name: the folder that contains the songs. (Since a folder is a special kind of file, it's
counted!)
We need to use wildcards, but in order to prevent the shell from interpreting the wildcards in ways we
don't intend, we need to surround what we're searching for with quotation marks. Let's try the search
again with our new improvements:
$ find . -name "*Shaggs*"
./Outsider/Shaggs
./Outsider/Shaggs/Gimme_Dat_Ting_(Live).mp3
./Outsider/Shaggs/My_Pal_Foot_Foot.ogg
./Outsider/Shaggs/I_Love.mp3
./Outsider/Shaggs/Things_I_Wonder.ogg
We surrounded the wildcards with quotation marks; lo and behold, we found the folder and the files.
Note
Another option to find that you've been using without realizing it is -print. The -print option
tells find to list the results of its search on the terminal. The -print option is on by default, so
you don't need to include it when you run find.
Another important aspect of find is that the format of your results is dependent upon the path
searched. Previously, we used a relative path, so our results were given to us as relative paths. What
would happen if we used an absolute path, one that begins with a /, instead?
$ find / -name "*Shaggs*"
/media/music/Outsider/Shaggs
/media/music/Outsider/Shaggs/Gimme_Dat_Ting_(Live).mp3
/media/music/Outsider/Shaggs/My_Pal_Foot_Foot.ogg
/media/music/Outsider/Shaggs/I_Love.mp3
/media/music/Outsider/Shaggs/Things_I_Wonder.ogg
If you search using a relative path, your results use a relative path; if you search using an absolute
path, your results use an absolute path. We'll see other uses of this principle later in the chapter. For
now, just keep this important idea in mind.
Note
To find out more about the Shaggs, see www.allmusic.com/cg/amg.dll?
p=amg&sql=11:qyk9kett7q7q, or just search www.allmusic.com for "Shaggs." You haven't
lived until you've played "My Pal Foot Foot" at your next party!
Find Files by Ownership
find -user
In addition to searching for files by name, you can also search for files by owner. Do you want to find
the files on the music drive owned by scott? Use find with the -user option, followed by the user name
(or the user number, which you can find in /etc/passwd):
$ find . -user scott
[Results truncated for length]
Whoa! There are way too many results! It might be easier to look for files that are not owned by scott.
To do so, put a ! in front of the option you wish to reverse:
Note
In order to conserve space, some of the data you'd normally see with ls -l has been
removed.
$ find . ! -user scott
./Outsider/Wing/01_-_Dancing_Queen.mp3
$ ls -l ./Outsider/Wing/01_-_Dancing_Queen.mp3
gus music ./Outsider/Wing/01_-_Dancing_Queen.mp3
Ah... one song by Wing is owned by gus instead of scott. To fix this problem, we need to use chown
(covered in Chapter 7, "Ownerships and Permissions"). Keep in mind that you can always use the ! as a
NOT operator (which is saying, for example, "find files where the user is not scott").
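Since only root can change a file's owner, the fix might look something like this (a sketch; the path comes from the find output above):

$ su
# chown scott ./Outsider/Wing/01_-_Dancing_Queen.mp3
# exit
$ find . ! -user scott
$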
Note
Ah, Wing. First introduced to the world at large in an episode of South Park, her website can
be found at www.wingtunes.com. Wing's version of "Dancing Queen" is highly recommended,
as is her rendition of "I Want to Hold Your Hand."
Find Files by Group Ownership
find -group
If an option that allows you to work with users is available for a command, you know that an option for
groups is also probably present. The find command is no different: If you want to look for files owned
by a particular group, just use -group, followed by the group's name or number. On the music drive,
scott should be the owner and music should be the group. Let's see if there are any files that aren't in
the music group:
$ find . ! -group music
./Disco/Brides_of_Funkenstein_-_Disco_to_Go.mp3
./Disco/Sister_Sledge_-_He's_The_Greatest_Dancer.mp3
./Disco/Wild_Cherry_-_Play_That_Funky_Music.mp3
./Electronica/New_Order/Bizarre_Love_Triangle.mp3
There are only four files out of a large number of files that aren't in the music group. Now we need to
run chgrp (see Chapter 7) on these files to make everything on the drive exactly the same.
Notice that, once again, we used ! to say, "Find the files that are not owned by the music group."
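Jumping ahead a bit to the -exec option (covered in "Execute a Command on Every Found File" later in this chapter), one sketch of the fix looks like this; it assumes your user owns the files and belongs to the music group, so chgrp is permitted:

$ find . ! -group music -exec chgrp music {} \;
$ find . ! -group music
$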
Find Files by File Size
find -size
Sometimes you'll want to find files based on their size. The find command can assist here as well. To
specify a size, use the -size option, followed by a letter representing the size scheme you wish to use.
If you don't provide a letter, the default is used; however, you should understand that you probably
won't find what you want. If you don't append a letter after the number, find measures sizes in
512-byte blocks (the file's size in bytes is divided by 512 and rounded up to the next integer). That's
too much math. It's easier to use a
suffix after the number that represents the size in more common terms, as shown in Table 10.1.
Table 10.1. Find Files by Size

Suffix   Meaning
b        512-byte blocks (the default)
c        Bytes
k        Kilobytes (KB)
M        Megabytes (MB)
G        Gigabytes (GB)
Let's say we want to find every Clash song from their immortal album, London Calling, that is 10MB.
(Yes, these were encoded at a super-high rate.) This job is easy enough with find:
$ cd Punk/Clash/1979_London_Calling
$ find . -size 10M
./07_-_The_Right_Profile.ogg
./08_-_Lost_In_The_Supermarket.ogg
./09_-_Clampdown.ogg
./12_-_Death_Or_Glory.ogg
That's weird. Only four songs? Here's where you need to understand a "gotcha" associated with using
find: If you say 10M , then find looks for files that are exactly 10MB in size (rounded to 10MB, of
course). If you want files larger than 10MB, you need to place a plus sign (+) in front of the given size;
if you want files smaller than 10MB, use a minus sign (-) in front of the size:
$ find . -size +10M
./03_-_Jimmy_Jazz.ogg
./15_-_Lover's_Rock.ogg
./18_-_Revolution_Rock.ogg
Now we have a problem. Specifying 10M gives us files that are exactly 10MB, excluding those that are
bigger, while specifying +10M gives us files that are larger than 10MB, excluding those that are exactly
10MB. How do we get both? If you want to learn how to obtain files that are 10MB and larger, see the
"Show Results If Either Expression Is True (OR)" section later in this chapter.
Tip
If you want to find large text files, use c after your number. As Table 10.1 shows, c changes
the search size to bytes. Every character in a text file is a byte, so an easy way to remember c
is to associate it with the "characters" in a text file.
For instance, to find enormous text files, you can use this code:
$ find /home/scott/documents -size +500000c
Find Files by File Type
find -type
One of the most useful options for find is -type, which allows you to specify the type of object you wish
to look for. Remember that everything on a UNIX system is a file (covered back in Chapter 1, "Things to
Know About Your Command Line," in the "Everything Is a File" section), so what you're actually
indicating is the type of file you want find to ferret out for you. Table 10.2 lists the file types you can
use with find.
Table 10.2. Finding Files by Type

File Type Letter   Meaning
f                  Regular file
d                  Directory
l                  Symbolic (soft) link
b                  Block special file
c                  Character special file
p                  FIFO (First In First Out)
s                  Socket
Let's say we want a quick list of the Steely Dan albums we have on the music drive. Since all songs are
placed into directories by album, look for folders with -type d:
$ cd /media/music/Rock
$ find Steely_Dan/ -type d
Steely_Dan/
Steely_Dan/1980_Gaucho
Steely_Dan/1975_Katy_Lied
Steely_Dan/1974_Pretzel_Logic
Steely_Dan/1976_The_Royal_Scam
Steely_Dan/2000_Two_Against_Nature
Steely_Dan/1972_Can't_Buy_A_Thrill
Steely_Dan/1973_Countdown_To_Ecstasy
Steely_Dan/1977_Aja
This list is helpful, but since the name of every album directory begins with the year the album was
released, we can sort by year for more precise information:
$ cd /media/music/Rock
$ find Steely_Dan/ -type d | sort
Steely_Dan/
Steely_Dan/1972_Can't_Buy_A_Thrill
Steely_Dan/1973_Countdown_To_Ecstasy
Steely_Dan/1974_Pretzel_Logic
Steely_Dan/1975_Katy_Lied
Steely_Dan/1976_The_Royal_Scam
Steely_Dan/1977_Aja
Steely_Dan/1980_Gaucho
Steely_Dan/2000_Two_Against_Nature
It can be incredibly beneficial to use a pipe to filter the list you generate with find; as you get better
with find, you'll see that you do this more and more.
Show Results If the Expressions Are True (AND)
find -a
A key feature of find is the ability to join several options to more tightly focus your searches. You can
link together as many options as you'd like with -a (or -and). For example, if you want to list every
song performed by the world's greatest rock 'n' roll band, the Rolling Stones, you might use just -name
"Rolling_Stones*" at first, but that wouldn't necessarily work. Some folders have Rolling_Stones in
them, and there may be a soft link or two with the band's name in it as well. We, therefore, also need to
use -type f. Combine them as follows:
$ find . -name "Rolling_Stones*" -a -type f
1968_Beggars_Banquet/03_-_Dear_Doctor.mp3
1968_Beggars_Banquet/01_-_Sympathy_For_The_Devil.mp3
1968_Beggars_Banquet/02_-_No_Expectations.mp3
1968_Beggars_Banquet/04_-_Parachute_Woman.mp3
1968_Beggars_Banquet/05_-_Jig-Saw_Puzzle.mp3
1968_Beggars_Banquet/06_-_Street_Fighting_Man.mp3
That's cool, but how many Stones tunes do we have? Pipe the results of find to wc (which stands for
"word count"), but also use the -l option, which gives you the number of lines instead of the word
count:
$ find . -name " Rolling_Stones* " -a -type f | wc -l
317
Three hundred and seventeen Rolling Stones songs. Sweet. In this case, you can always get what you
want.
Note
If you have no idea what that last sentence means, get yourself a copy of the album Let It
Bleed now. Don't forget to play it loud.
Show Results If Either Expression Is True (OR)
find -o
Earlier, in the section "Find Files by File Size," we saw that we could use find to list every Clash song on
London Calling that was exactly 10MB, and we could use find to list songs on London Calling that were
more than 10MB, but we couldn't do both at the same time with -size. In the previous section, we saw
that -a combines options using AND ; however, we can also utilize -o (or -or ) to combine options using
OR.
So, in order to find songs from London Calling that are 10MB or larger, we use the following command:
$ cd London_Calling
$ find . -size +10M -o -size 10M
03_-_Jimmy_Jazz.ogg
07_-_The_Right_Profile.ogg
08_-_Lost_In_The_Supermarket.ogg
09_-_Clampdown.ogg
12_-_Death_Or_Glory.ogg
15_-_Lover's_Rock.ogg
18_-_Revolution_Rock.ogg
(25th_Anniversary)_-_18_-_Revolution_Rock.mp3
(25th_Anniversary)_-_37_-_Heart_And_Mind.mp3
(25th_Anniversary)_-_39_-_London_Calling_(Demo).mp3
Oops...we also got results from the 25th Anniversary release of London Calling, which we didn't
want. We need to do two things: exclude results from the 25th Anniversary edition and make sure that
our OR statement works correctly.
To exclude the 25th Anniversary songs, we add ! -name "*25*" at the end of our command. To make
sure that OR works, we need to surround it with parentheses, which combines the statements. However,
you need to escape the parentheses with backslashes so the shell doesn't misinterpret them, and you
also need to put spaces before and after your statement. The combination gives us this command:
$ cd London_Calling
$ find . \( -size +10M -o -size 10M \) ! -name
"*25*"
03_-_Jimmy_Jazz.ogg
07_-_The_Right_Profile.ogg
08_-_Lost_In_The_Supermarket.ogg
09_-_Clampdown.ogg
12_-_Death_Or_Glory.ogg
15_-_Lover's_Rock.ogg
18_-_Revolution_Rock.ogg
Perfect. Seven songs are 10MB or greater in size.
Note
Read about London Calling at Allmusic.com (www.allmusic.com/cg/amg.dll?
p=amg&sql=10:aeu1z82ajyvn) or Pitchfork (www.pitchforkmedia.com/recordreviews/c/clash/london-calling.shtml).
We can also use -o to find out how many songs we have on the music drive. We might start with this
command, which we run from the root of /media/music, and which uses -a (be patient; we'll get to -o):
$ find . -name "*mp3*" -a -type f | wc -l
23407
Twenty-three thousand? That's not right. Ah, now it's obvious. We were only searching for mp3 files,
and an enormous number of songs were encoded in the superior Ogg Vorbis format, a patent-free, open
source alternative to mp3. Let's look for mp3 or ogg files and count the results with wc -l:
$ find . \( -name "*mp3*" -o -name "*.ogg*" \) -a -type f | wc -l
41631
That sounds much better, but there are also some FLAC files on there. Let's add another -o to the mix:
$ find . \( -name "*mp3*" -o -name "*.ogg*" -o -name "*.flac*" \) -a -type f | wc -l
42187
Now that's more like it: 42,000 songs and growing!
Show Results If the Expression Is Not True (NOT)
find !
We have already used ! to negate expressions in earlier sections in this chapter (see "Find Files By
Ownership" and "Show Results If Either Expression Is True (OR)"), but let's take another, longer look at
this operator. In the previous section, we used find to determine how many mp3, ogg, and flac files we
have on the music drive. But how many total files do we have on the drive?
$ find . -name "* " | wc -l
52111
Hmm...52,000 total files, but only 42,000 are mp3, ogg, or flac. What are the others? Let's construct a
find command that excludes files that end in .mp3, .ogg, or .flac while also excluding directories. Wrap
these four conditions in parentheses, and then place ! before the command to indicate our wish to
negate these items and instead look for things that do not match these specifications:
$ find . ! \( -name "*mp*" -o -name "*ogg" -o -name "*flac" -o -type d \)
./Folk/Joan_Baez/Joan_Baez_-_Imagine.m3u
./500_Greatest_Singles/singles.txt
./Blues/Muddy_Waters/Best_Of.m3u
./Blues/Robert_Johnson/Hellhound_On_My_Trail.MP3
./Blues/Johnny_Winter/Johnny_Winter.m3u
[Results truncated for length]
Now we have the answer, or at least we're starting to get an answer. We have m3u (playlist) files, text
files, and files that end in MP3 instead of mp3 (we also have JPEGs, GIFs, and even music videos). The
find command, like all of Linux, is case-sensitive (as discussed in Chapter 1), so a search for "*mp3*"
will not turn up songs that have the .MP3 suffix. Is there any quick and easy way to change these files
so that they have the correct, lowercase extension? Sure. And we'll use find to do it.
Execute a Command on Every Found File
find -exec
Now we enter the area in which find really shows off its power: the ability to execute commands on the
files that it, uh, finds. After listing the options that help narrow down your search (such as -name,
-type, or -user), append -exec along with the command you want to run on each individual file. You use
the symbols {} to represent each file, and end your command with \; (the backslash escapes the
semicolon) so your shell
doesn't interpret it as an indication of command stacking (which we discussed in Chapter 4, "Building
Blocks").
In the previous section, for example, we discovered that some of the files on the music drive ended with
MP3 . Since we prefer lowercase extensions, we need to convert all instances of MP3 to mp3 to make the
files consistent. We can do this with the -exec option for find. First, let's verify that there are files that
end with MP3 :
$ find . -name "Robert_Johnson*MP3"
./Blues/Robert_Johnson/Judgment_Day.MP3
./Blues/Robert_Johnson/Dust_My_Broom.MP3
./Blues/Robert_Johnson/Hellhound_On_My_Trail.MP3
Let's use find with -exec to change the file extension. The program we can use with -exec is rename,
which changes parts of filenames:
$ find . -name " *MP3 " -exec rename's/MP3/mp3/g' {} \;
The rename command is followed with instructions for the name change in this format: s/old/new/g.
(The s stands for "substitute," while the g stands for "global.") Now let's see if our command works:
$ find . -name "Robert_Johnson*MP3"
$ ls -1 Blues/Robert_Johnson/
Dust_My_Broom.mp3
Hellhound_On_My_Trail.mp3
Judgment_Day.mp3
Love_in_Vain.mp3
Me_and_the_Devil_Blues.mp3
The command appears to work. Let's try a similar process with another situation. In the previous
section, we noticed that many of the results discovered were m3u , or playlist, files. Unfortunately, many
of these files had spaces in the filenames, which we try to avoid. Let's first generate a list of m3u files
that have spaces. We can search for * *m3u, which uses wildcards around a space to discover files that
include spaces in their name:
$ find . -name "* *m3u"
./Christmas/Christmas With The Rat Pack.m3u
./Christmas/Holiday_Boots_4_Your_Stockings.m3u
./Classical_Baroque/Handel/Chamber Music.m3u
./Classical_Opera/Famous Arias.m3u
./Doo_Wop/Doo Wop Box.m3u
./Electronica/Aphex_Twin/I Care Because You Do.m3u
Now let's find m3u files with spaces in their names; when one is found, we can run rename against it.
We're substituting "\ " (we escape the space so that find understands what we're looking for and the
shell doesn't get confused) with "_" to fix the problem:
$ find . -name "* *m3u" -exec rename 's/\ /_/g' {}
\;
$ find . -name "* *m3u"
$
The command worked as planned.
Note
Note that before running commands on the files, we must always first figure out which files
are going to be changed. That's just good prudence. You don't want to change the wrong files!
Print Find Results into a File
find -fprint
Every time we've used find , we have assumed that the -print option isn't necessary because it's on by
default. If you want to print to a file instead of the terminal, you use the -fprint option, followed by the
name of the file you want to create.
$ find . ! \( -name "*mp*" -o -name "*ogg" -o -name "*flac" -o -type d \) -fprint non_music_files.txt
$ cat non_music_files.txt
./Folk/Joan_Baez/Joan_Baez_-_Imagine.m3u
./500_Greatest_Singles/singles.txt
./Blues/Muddy_Waters/Best_Of.m3u
./Blues/Robert_Johnson/Hellhound_On_My_Trail.MP3
./Blues/Johnny_Winter/Johnny_Winter.m3u
You can then use these results in scripts, or send them to a printer if that would be helpful.
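Incidentally, you can get much the same effect with ordinary shell redirection, which you saw used with grep earlier in the book; -fprint simply keeps the output file inside the find expression itself. A sketch (playlists.txt is just an example name):

$ find . -name "*m3u" > playlists.txt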
Conclusion
As you can tell, you can do a lot with find. There are still plenty of things that find can do, so you really
should look at man find or read several of the excellent tutorials available on the World Wide Web that
cover the more arcaneyet usefulaspects of the command. You can use find to search for and list files
and folders using an amazing variety of options. It really becomes invaluable, however, when you use
the -exec option to run commands on the discovered files, or when you pipe the output to other
commands. This is one of the greatest commands available to you on your Linux box, so start using it!
Part IV: Environment
Chapter 11. Your Shell
Chapter 12. Monitoring System Resources
Chapter 13. Installing Software
Chapter 11. Your Shell
So far in this book you've been running commands in your bash shell, but you haven't focused on the
shell itself. In this chapter, you look at two commands that affect your use of the shell: history, which
lists everything you've entered on the command line, and alias, which allows you to create shortcuts
for commands. Both are useful, and both can save you lots of time when you're using the command
line. Laziness is a good thing when it comes to computer users, and these are definitely two commands
that will help you be as lazy as possible when using your Linux box.
View Your Command-Line History
history
Every time you type a command in your shell, that command is saved in a file named .bash_history in
your home directory (the dot in front of the filename means that it's hidden unless you use ls -a). By
default, that file holds the last 500 lines entered on the command line. If you want to view that list of
commands, just enter the history command.
$ history
  496  ls
  497  cd rsync_ssh
  498  ls
  499  cat linux
  500  exit
Because you're looking at 500 results, they're going to stream by so fast that you can't see any until
you get to the end. Want to step through the results one screen at a time? Turn to your old friend less:
$ history | less
Now you can jump through your results much more easily.
Caution
Now you understand why you need to be careful typing passwords and other sensitive
information on the command line: Anyone who can view your .bash_history file is able to see
those passwords. Be careful and think about what you enter directly on the command line!
Run the Last Command Again
!!
If you want to run the last command you used a second time, enter two exclamation points. That looks
in the history file and runs the last command in the list.
$ pwd
/home/scott
$ !!
pwd
/home/scott
Notice that you first see the actual command that's going to run, and then the results of that command.
This is a truly useful way to have your computer do tedious work for you.
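One handy variation, on sudo-based distributions such as K/Ubuntu (sudo is covered in Chapter 13, "Installing Software"): if a command fails only because you weren't root, you can rerun it with sudo in front of the two exclamation points. The session below is illustrative:
$ apt-get update
E: Could not open lock file /var/lib/dpkg/lock - open (13 Permission denied)
$ sudo !!
sudo apt-get update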
Run a Previous Command Using Numbers
![##]
When you run history, it automatically places a number in front of every previous command. If you'd
like to run a previous command and you know the number that history has assigned to it, just enter an
exclamation point immediately followed by that command's history number, and it will run again.
$ pwd
/home/scott
$ whoami
scott
$ !499
pwd
/home/scott
If you're unsure about the number, run history again to find out. Be aware that the pwd command in
this example was number 499 the first time, but after I ran it again using !499, it became 498 because
it was pushed down on the list by my new command.
Run a Previous Command Using a String
![string]
The capability to run a command again by referencing its number is nice, but it requires that you know the command's number in history, which can be tedious to discover. Piping the output of history to grep, as in the sketch that follows, helps you hunt down that number, but it's still not optimal.
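For instance, if you remember using cat but not when, something like this narrows the list down (the history numbers and entries shown here are just illustrative):
$ history | grep cat
  412  cat /etc/apt/sources.list
  496  cat /home/scott/todo
  501  history | grep cat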
Often a better way to reference a previously entered command is by the command's name itself. If you follow the exclamation point with the first few letters of that command, your shell runs the first command it finds when looking backward through .bash_history.
$ cat /home/scott/todo
Buy milk
Buy dog food
Renew Linux Magazine subscription
$ cd /home/scott/pictures
$ !cat
cat /home/scott/todo
Buy milk
Buy dog food
Renew Linux Magazine subscription
If the cat command is found three times in your history, at 35 (cat /home/scott/todo), 412 (cat /etc/apt/sources.list), and 496 (cat /home/scott/todo), and you enter !cat, the one found at 496 is the one that runs. If you want to run the cat found at 412 instead, you can either run !412 or use the substring form of this shortcut: an exclamation point followed by a string between question marks runs the most recent command containing that string anywhere in it, not just at its start.
$ !?sources?
cat /etc/apt/sources.list
deb http://us.archive.ubuntu.com/ubuntu breezy main restricted
deb-src http://us.archive.ubuntu.com/ubuntu breezy main restricted
Because humans have a far easier time remembering words instead of numbers, you'll probably end up
using this method for invoking past commands. If you're ever unsure, run history and take a look.
Display All Command Aliases
alias
If you use a command all the time, or if a command is just particularly long and onerous to type, it
behooves you to make that command an alias. After you've created an alias, you type its name, and the
command referenced by the alias runs. Of course, if a command is particularly complicated or involves
several lines, you should instead turn it into a script or function. But for small things, aliases are
perfect.
Aliases are stored in a file in your home directory. You might find them in .bashrc, but more likely (or
more properly, rather) they're in .bash_aliases . Most Linux distributions come with several aliases
already defined for you; to see that list, just enter alias on the command line.
$ alias
alias la='ls -a'
alias ll='ls -l'
Most distributions purposely keep the default aliases to a minimum. It's up to you to add new ones, as
you'll shortly see.
View a Specific Command Alias
alias [alias name]
After you've defined several aliases, it can be hard to find a specific one if you type in the alias
command. If you want to review what a specific alias does, just follow alias by the name of the specific
alias about which you want to learn more.
$ alias wgetpage
alias wgetpage='wget --html-extension --recursive --convert-links --page-requisites --no-parent $1'
Now you know exactly what the wgetpage alias does, quickly and easily.
Note
You'll learn more about wget in Chapter 15, "Working on the Network."
Create a New Temporary Alias
alias [alias]='[command]'
If you find yourself typing a command over and over again, maybe it's time to make it an alias. For
instance, to see just the subdirectories in your current working directory, you'd use ls -d */ . To create
a temporary alias for that command, use the following:
$ ls -d */
by_pool/ libby_pix/ on_floor/
$ alias lsd='ls -d */'
$ lsd
by_pool/ libby_pix/ on_floor/
You should realize a couple of things about using alias in this way. Your alias name can't have an = in
it, which makes sense, because it's immediately followed by an = when you're defining it. You can,
however, have an = in the actual alias itself. Also, an alias created in this fashion only lasts as long as
this shell session is active. Log out, and your alias is gone.
Want to create an alias that lasts past when you log out? Then read the next section, "Create a New
Permanent Alias."
Create a New Permanent Alias
alias [alias name]='[command]'
If you want your aliases to stick around between shell sessions, you need to add them to the file your
shell uses to store aliases. Most of the time that file is either .bashrc or .bash_aliases ; in this case,
you're going to use .bash_aliases . No matter which file it is, be careful editing it, as you can cause
yourself problems later when logging in. If you want to be really careful, create a backup of the file
before editing it. Better safe than sorry.
Note
How can you find out which file you should use? Simple: Type ls -a ~. If you see
.bash_aliases , use that; otherwise, look in .bashrc and see if other aliases are defined in
there. If you don't see any aliases in there, take a look at .profile , which is occasionally
used.
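On many Debian-based systems, .bashrc loads .bash_aliases with a stanza like the following; a quick grep (-A 2 shows two lines of trailing context) confirms whether yours does:
$ grep -A 2 bash_aliases ~/.bashrc
if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi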
To add an alias to .bash_aliases , open it with your favorite text editor and add a line like the following:
alias lsd='ls -d */'
The same rule discussed in "Create a New Temporary Alias" applies here as well: Your alias name can't
have an = in it. After adding your alias to .bash_aliases , save the file and close it. But the alias doesn't
work yet. The .bash_aliases file (or .bashrc if that's what you used) needs to be reloaded for the new
alias to work. You can do this in two ways. You can either log out and then log in again, which is a pain
and not recommended, or you can just reload the file by running this command:
$ . .bash_aliases
That's a dot, followed by a space, followed by the name of the file, which begins with a dot. Now your
new alias will work. Because you have to reload the file every time you add an alias, it's a good idea to
add several at a time to lessen the hassle.
Remove an Alias
unalias
All good things must come to an end, and sometimes an alias outlives its usefulness. To remove an
alias, use the unalias command.
$ ls -d */
by_pool/ libby_pix/ on_floor/
$ alias lsd='ls -d */'
$ lsd
by_pool/ libby_pix/ on_floor/
$ unalias lsd
$ lsd
$
Note, though, that this command only works permanently for temporary shell aliases, discussed
previously in "Create a New Temporary Alias." The lsd alias in the previous example is gone for good. If
you use the unalias command on an alias found in .bash_aliases , it too will be gone, but only as long
as you're logged in. When you log out and log back in, or reload .bash_aliases , the alias is back.
To remove aliases from .bash_aliases , you need to edit the file and manually remove the line
containing the alias. If you think there's a chance you might want to reuse the alias again sometime,
just put a pound sign (which comments it out) in front of the alias, like this:
# alias lsd='ls -d */'
Save .bash_aliases , reload it with the . .bash_aliases command, and the alias won't work any longer.
But if you ever need it again, open .bash_aliases , remove the pound sign, save and reload the file, and
it's good to go again.
Conclusion
One of your goals as a Linux user should be reducing the number of characters you have to type to accomplish your goals. The original developers of Unix were strong believers in that concept; that's why you enter ls instead of list and mkdir instead of makedirectory.
The commands you've learned about in this chapter, history and alias, both aid in that process. Tired of typing a long and complicated command? Reference it with history, or create an alias for it. Either way, you'll save keystrokes, and your keyboard (and, more importantly, your hands and wrists) will thank you for it.
Chapter 12. Monitoring System Resources
A good computer user has to be a bit of a systems administrator, always checking the health of her
machine to make sure that everything is running smoothly. To help achieve that goal, Linux has several
commands to monitor system resources. In this chapter, you're going to look at several of those
commands. Many of them have multiple uses, but all, in the best Unix tradition of small tools that solve a single task well, are especially good at one particular aspect of system monitoring. Their tasks range
from keeping track of the programs running on your computer (ps and top ) to ending errant processes
(kill) to listing all open files (lsof) to reporting your system's usage of RAM (free) and hard disk space
(df and du). Learn these tools well. You'll find yourself using them in all sorts of situations.
View All Currently Running Processes
ps aux
Every once in a while a program you're running locks up and stops responding, requiring that you close
it. Or you might want to know what programs a certain user is running. Or you might just want to know
about the processes currently running on your machine. In any of those cases and many others, you
want to use the ps command that lists open processes on your computer.
Unfortunately, ps has many versions, and they accept different types of options. Options can even have different meanings depending on whether they are preceded by a hyphen, so that u and -u mean completely different things. Up to now, this book has been pretty strict about preceding all options with a hyphen, but for ps it won't be, as leaving off the hyphen makes things a bit easier and more uniform.
To see all the processes that any user is running on your system, follow ps with these options: a (which means all users), u (user-oriented, or show the user who owns each process), and x (processes without controlling ttys, or terminal screens, another way of saying "show every process"). Be forewarned that a fairly long list is going to zoom by on your screen: 132 lines on this computer.
$ ps aux
USER    PID %CPU %MEM     VSZ   RSS TTY   STAT START  TIME COMMAND
scott 24224  4.1  4.2 1150440 44036 ?     R    11:02 12:03 /home/scott/.cxoffice/bin/wine-preloader
scott  5594  0.1  0.0    3432  1820 pts/6 S+   12:14  0:00 ssh scott@humbug.machine.com
scott 14957  0.3  7.5  171144 78628 ?     Sl   13:01  0:35 /usr/lib/openoffice2/program/soffice.bin -writer
scott 12369  0.0  0.0       0     0 ?     Z    15:43  0:00 [wine-preloader] <defunct>
scott 14680  0.0  0.1    3860  1044 pts/5 R+   15:55  0:00 ps aux
The ps command gives you a lot of information, including the user who owns the process, the unique
process ID number (PID) that identifies the process, the percentage of the CPU (%CPU ) and memory
(%MEM ) that the process is using, the current status (STAT ) of the process, and the name of the process
itself.
The STAT column can have different letters in it, with the following being the most important:
STAT Letter   Meaning
R             Running
S             Sleeping
T             Stopped
Z             Zombie
A Z is bad news because it means that the process has basically hung and cannot be stopped
(fortunately, it does not mean that it will try to eat your brain). If you're having a problem with a
program and ps indicates that it has a status of Z , you're probably going to have to reboot the machine
to completely kill it.
Because ps aux provides you with a lot of data, it can be difficult to find the program for which you're
searching.
Piping the output of ps aux to grep can be an easy way to limit the results to a particular command.
$ ps aux | grep [f]irefox
scott 25205  0.0  0.0   4336      8 ?  S  Feb08  0:00 /bin/sh /opt/firefox/firefox
scott 25213  1.1 10.9 189092 113272 ?  Rl Feb08 29:42 /opt/firefox/firefox-bin
Now you know just the instances of Firefox running on this machine, including who's running the program, how much of a load that program is putting on the machine, and how long that program has been running. Useful!
Tip
Why did you search for [f]irefox instead of firefox? To find the answer, take a look in Chapter 9, "Finding Stuff: Easy," in the "Search the Output of Other Commands for Specific Words" section.
View a Process Tree
ps axjf
In the Linux world, processes don't just appear out of thin air. Often, starting one program starts other
programs. All processes on a Linux system, for instance, ultimately come from init, the mother of all
processes, which always has a PID of 1. The ps command can provide you with a text-based representation of this process tree so you can visually see which processes have spawned others. To see the process tree, use the a and x options from the previous section, along with j (the job-control output format, which adds the parent process ID column) and f (the evocatively named ASCII art forest) option.
Note
Normally you'd see the following columns:
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
In the interests of making the command tree easier to understand, most of the columns you'd actually see with ps axjf have been removed in the following code listing.
$ ps axjf
 PPID   PID  COMMAND
    1  7558  /usr/sbin/gdm
 7558  7561   \_ /usr/sbin/gdm
 7561  7604       \_ /usr/X11R6/bin/X :0
 7561  8225       \_ /bin/sh /usr/bin/startkde
 8225  8279           \_ /usr/bin/ssh-agent
 8225  8341           \_ kwrapper ksmserver
    1  8316  kdeinit Running...
 8316 10842   \_ konqueror [kdeinit] --silent
 8316 29663   \_ quanta
 8316 30906   \_ /usr/bin/kwrite /home/scott/analysis
 8316 17893   \_ /usr/lib/opera/9.0-20060206.1/opera
17893 17960   |   \_ /usr/lib/opera/pluginwrapper
17893 17961   |   \_ /usr/lib/opera/plugincleaner
Note that ps axjf introduces a key new column, PPID. PPID, the Parent Process ID number, is the
number of the process that spawned the PID. Armed with the PID or PPID, you can end runaway
processes, as you'll see soon in "End a Running Process."
View Processes Owned by a Particular User
ps U [username]
Up to now you've been looking at how ps gives you a list of all processes on your system. If you want to
limit the results to those owned by a single user, simply use the U option, followed by the username or
number.
$ ps U scott
  PID TTY   STAT  TIME COMMAND
14928 ?     S     0:00 /opt/ooo2/program/soffice -writer
14957 ?     Sl    0:42 /opt/ooo2/program/soffice.bin -writer
 4688 pts/4 S+    0:00 ssh scott@humbug.machine.com
26751 ?     Z     0:00 [wine-preloader] <defunct>
27955 pts/5 R+    0:00 ps U scott
Of course, ps doesn't include the username in the listing because it was already part of the command.
Tip
Remember, if you don't know the user's number, just look in /etc/passwd. Find the username,
and the user number is in the third column.
End a Running Process
kill
Sometimes a program goes rogue, and it won't react to normal attempts to close it. In the GUI, you
click repeatedly on the Close button, but nothing happens, or, in the shell, you press Ctrl+C but it fails
to stop a running command. When this happens, it's time for the kill command.
You can use several signals with the kill command, ranging from "Please shut yourself down, cleaning
up your mess as you do so" to "Shut yourself down as soon as you can" to "Die now!" You're going to
focus on the three most important. When you use kill, you can specify the intensity of the process's
death with a number or a word, as Table 12.1 shows.
Table 12.1. Common Signals Associated with kill

Signal Number  Signal Word      Meaning
-1             -HUP (hang up)   Controlling process has died. Shut down (if applied to a system service, reload configuration files and restart).
-15            -TERM            Terminate gracefully, cleaning up stray processes and files.
-9             -KILL            Stop whatever you're doing and die now!
Normally, you should first try -15 (in fact, it's the default if you run kill without any option). That way,
you give the program a chance to shut down any other programs depending upon it, close files opened
by it, and so on. If you've been patient and waited a while ("a while" is completely subjective, of
course), and the process is still running completely out of control or still failing to respond, you can
bring out the big guns and use -9. The -9 option is equivalent to yanking the rug out from under a
running program, and while it does the job, it could leave stray files and processes littering your
system, and that's never a good idea.
As for -1, or -HUP, that's primarily for services such as Samba or wireless connectivity. You won't use it
much, but you should know what it means.
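Keep in mind that the signal number and the signal word are interchangeable, so the following three commands all politely ask the same (purely illustrative) PID to terminate:
$ kill 27921
$ kill -15 27921
$ kill -TERM 27921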
Here's what you would do if gvim (which normally behaves just fine; this is just for demonstration
purposes) seemed to freeze on your system.
$ ps U scott
  PID TTY   STAT  TIME COMMAND
14928 ?     S     0:00 /opt/ooo2/program/soffice -writer
14957 ?     Sl    0:42 /opt/ooo2/program/soffice.bin -writer
 4688 pts/4 S+    0:00 ssh scott@humbug.machine.com
26751 ?     Z     0:00 [wine-preloader] <defunct>
27921 ?     Ss    0:00 /usr/bin/gvim
27955 pts/5 R+    0:00 ps U scott
$ kill 27921
$ ps U scott
  PID TTY   STAT  TIME COMMAND
14928 ?     S     0:00 /opt/ooo2/program/soffice -writer
14957 ?     Sl    0:42 /opt/ooo2/program/soffice.bin -writer
 4688 pts/4 S+    0:00 ssh scott@humbug.machine.com
26751 ?     Z     0:00 [wine-preloader] <defunct>
27955 pts/5 R+    0:00 ps U scott
To kill gvim, you use ps to find gvim's PID, which in this case is 27921. Next you kill that PID (and
remember that kill uses the TERM signal by default) and then check again with ps. Yes, gvim is no more.
Note
Why didn't you kill PID 26751, which has a STAT of Z, indicating that it's a zombie? Because
even -9 doesn't work on a zombieit's already dead, hence the kill command won't work. A
reboot is the only way to fix that (usually unimportant) problem.
View a Dynamically Updated List of Running Processes
top
On occasion you'll find that your Linux machine suddenly slows down, apparently for no reason.
Something is running on your computer at full tilt, and it's using up the processor. Or perhaps you'll
start running a command, only to find that it's using far more of your CPU's cycles than you thought it
would. To find out what's causing the problem, or if you just want to find out what's running on your
system, you could use ps, except that ps doesn't update itself. Instead, it provides a snapshot of your
system's processes, and that's it.
The top command, on the other hand, provides a dynamically updated view of what's running on your
system and how many system resources each process is using. It's a bit hard to see in a book because
you can't see top update its display every second, but here's what you would see at one moment while
top is running.
$ top
top - 18:05:03 up 4 days, 8:03, 1 user, load average: 0.83, 0.82, 0.97
Tasks: 135 total, 3 running, 126 sleeping, 2 stopped, 4 zombie
Cpu(s): 22.9% us, 7.7% sy, 3.1% ni, 62.1% id, 3.5% wa, 0.1% hi, 0.7% si
Mem:   1036136k total,  987996k used,   48140k free,   27884k buffers
Swap:  1012084k total,  479284k used,  532800k free,  179580k cached

  PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
25213 scott 15  0  230m 150m  15m S 11.1 14.9  33:39.34 firefox-bin
 7604 root  15  0  409m 285m 2896 S 10.8 28.2 749:55.75 Xorg
 8378 scott 15  0 37084  10m 7456 S  1.0  1.1  13:53.99 kicker
 8523 scott 15  0 69416  13m 3324 S  0.7  1.3  63:35.14 skype
29663 scott 15  0 76896  13m 4048 S  0.7  1.3  13:48.20 quanta
The top command gives you a great deal of information about your system in the first five lines of its
output, and then it focuses on listing each running process. Note that top automatically sorts the output
by the numbers in the %CPU column, so as programs use more and then less of your processor, their
position in top changes as well.
If you want to kill programs from within top , just press k. At the top of the listings, just after the line
that begins with Swap:, you'll see the following:
PID to kill:
Enter the PID for the process you want to end (let's say 8026), press Enter, and you're asked what
signal number (discussed in the previous "End a Running Process" section) you want to use:
Kill PID 8026 with signal [15]:
By default, top wants to use 15. If you're happy with that, just press Enter; if not, enter the signal
number you want to use and then press Enter. A second or so later, that process will disappear from
top .
To exit top , press q.
The top command is incredibly useful, and you'll find yourself turning to it often, just to find out what's
happening on your Linux machine. When you have questions about your computer's activity, top often
provides the answers.
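One more trick worth knowing: the procps version of top found on most distributions also has a batch mode, which prints a single snapshot and exits, making it usable in scripts or logs. A minimal sketch, where -b means batch, -n 1 means one iteration, and head just trims the output:
$ top -b -n 1 | head -n 12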
List Open Files
lsof
Chapter 1, "Things to Know About Your Command Line," discussed the fact that everything is a file to a
Linux machine, including directories, network connections, and devices. That means that even though
from your perspective you only have one file open (a letter to your Mom), your system actually has thousands of files open at any one time. That's right: thousands. To see the complete list of those open files, use the lsof command (short for list open files).
Actually, don't. If you just run lsof by itself (to get the full list of results, you need to be root), you'll
receive as output that list of thousands of files. On this system, there were 5,497 files open and being
used. Still, lsof can be a useful way to get an idea of just how much is going on with your computer at
any one time.
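Incidentally, counting the open files is easier than paging through them: pipe the output of lsof to wc -l. Subtract one for lsof's header line, and you get the 5,497 files mentioned previously.
# lsof | wc -l
5498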
Piping the output of lsof to less gives you the results one screen at a time.
# lsof | less
COMMAND PID USER  FD  TYPE DEVICE  SIZE NODE NAME
init      1 root  cwd  DIR    3,1   656    2 /
init      1 root  rtd  DIR    3,1   656    2 /
init      1 root  txt  REG    3,1 31608 2072 /sbin/init
Still, with 5,497 results, that's many screens to page through. You can also pipe the output of lsof to
grep. As you'll see in the next few sections, however, lsof contains within itself ways to filter out the
data you don't want to see so you can focus only on a particular subset of open files that interest you.
List a User's Open Files
lsof -u
If you want to look at the files a particular user has open (and remember that those include network
connections and devices, among many others), add the -u option to lsof, followed by the username
(remember that lsof must be run as root).
Note
In order to save space, some of the data you'd normally see when you run lsof has been
removed in this and further examples.
# lsof -u scott
COMMAND     PID USER  NAME
evolution  8603 scott /home/scott/.evolution/addressbook/local/system/addressbook.db
opera     11638 scott /usr/lib/opera/9.0/opera
opera     11638 scott /usr/share/fonts/truetype/msttcorefonts/Arial_Bold.ttf
Even filtering out all users except one, you're left with 3,039 lines in this list. Still, some interesting
items are here. For one, it appears that Evolution (an email and personal information manager
program) is running all the time, without your knowledge or intention. Also, the Opera web browser is
running, which is expected, and one of the web pages it's on is requiring the use of the Arial Bold font,
among other files.
If you administer a box used by more than one person, try out lsof -u for your users. You might find that they are running programs that they shouldn't. If you're the sole user on your Linux machine, try lsof -u on yourself; you might find that you're running programs of which you were unaware!
List Users for a Particular File
lsof [file]
In the previous section, you saw what files a particular user had open. Let's reverse that, and see who's
using a particular file. To do so, simply follow lsof with the path to a file on your system. For instance,
let's take a look at who's using the SSH daemon, used to connect remotely to this computer (remember
that lsof must be run as root).
# lsof /usr/sbin/sshd
COMMAND   PID USER  TYPE NAME
sshd     7814 root  REG  /usr/sbin/sshd
sshd    10542 root  REG  /usr/sbin/sshd
sshd    10548 scott REG  /usr/sbin/sshd
That's the result you wanted: two users, root and scott. If an unexpected user had shown up here (say, 4ackordood), you'd know that you've been compromised.
Note
Yes, sshd is a program, but remember, that's from your human perspective. To Linux,
/usr/sbin/sshd is just another file.
List Processes for a Particular Program
lsof -c [program]
In the previous section, you saw who was using /usr/sbin/sshd. The results you received, however, don't tell the whole story. Any particular program actually consists of calls to several (perhaps many) other processes, programs, sockets, and devices, all of which appear to Linux as more files. To
find out the full universe of other files associated with a particular running program, follow lsof with the
-c option, and then the name of a running (and therefore "open") program. For example, lsof
/usr/sbin/sshd said that there were only two users and three open files associated with that exact file.
But what about the entire sshd command?
# lsof -c sshd
COMMAND   PID USER  NAME
sshd    10542 root  /lib/ld-2.3.5.so
sshd    10542 root  /dev/null
sshd    10542 root  192.168.0.170:ssh->192.168.0.100:4577 (ESTABLISHED)
sshd    10548 scott /usr/sbin/sshd
sshd    10548 scott 192.168.0.170:ssh->192.168.0.100:4577 (ESTABLISHED)
In the preceding code, you can see a few of the 94 lines (representing 94 open files) somehow
connected with the sshd program. One is a .so file (shared object, akin to a DLL on Windows), and a
few even show you that a network connection has occurred between this and another machine (actually,
that another machine on this network has connected via SSH to this machine; for more on that process,
see "Securely Log In to Another Computer" in Chapter 15, "Working on the Network").
You can find out a tremendous amount by applying lsof to various commands on your computer; at the
least, you'll learn just how complicated modern programs are. Try it out with the software you use
every day, and you might gain a fresh appreciation for the programmers who worked hard to make it
available to you.
Note
The lsof command has an amazing number of options, and you're only going to look at a tiny
subset. The source code for lsof includes a file named 00QUICKSTART (that's two zeroes at the
beginning) that is a tutorial for some of the command's more powerful features. Search
Google for that filename and start reading.
Display Information About System RAM
free
Nowadays most computers come with hundreds of megabytes or even gigabytes of RAM, but it's still
possible to find your machine slowing down due to a lot of memory usage or virtual memory swapping.
To see the current state of your system's memory, use the free command.
$ free
             total    used    free  shared buffers  cached
Mem:       1036136  995852   40284       0   80816  332264
-/+ buffers/cache:  582772  453364
Swap:      1012084  495584  516500
By default, free shows the results in kilobytes, the same as if you'd used the -k option. You can change
that, however. The -b option shows you memory usage in bytes, and -m (which you should probably use
most of the time) uses megabytes.
$ free -m
             total  used  free  shared buffers cached
Mem:          1011   963    48       0      78    316
-/+ buffers/cache:   569   442
Swap:          988   483   504
From free's output, you can see that this machine has 1011MB of available physical RAM (actually
1024MB, but free displays it as 1011MB because the kernel takes up the remaining 13MB, so it could
never be made available for other uses). The first line makes it appear that 963MB of that is in use,
leaving only 48MB of RAM free. There's nearly a gigabyte of swap, or virtual memory, in place, too, and
about half of that, or 483MB, has already been used. That's how it appears, anyway.
It's not that simple, however. The important row is the middle one, labeled -/+ buffers/cache. Linux uses buffers and cache to speed up access to your hard drive, and if a program needs that memory, it can be quickly freed up for that program. From the point of view of the applications running on this Linux machine, only 569MB of memory is really in use, leaving 442MB free right now, because the memory devoted to buffers and cache can be handed back whenever it's needed. And that's in addition to the swap space you have available. Linux is nothing if not efficient when it comes to managing memory.
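If you'd rather watch those numbers change over time than take a single snapshot, the watch command reruns free every two seconds until you press Ctrl+C:
$ watch free -m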
Tip
An excellent page with more information on free and Linux memory management can be
found on the Gentoo Wiki, at http://gentoo-wiki.com/FAQ_Linux_Memory_Management. Even
though it's purportedly for Gentoo (a unique Linux distribution), it still applies to all versions
of Linux.
Show File System Disk Usage
df
The free command deals with the RAM you have on your system; likewise, the df command (think disk
free) deals with the amount of hard drive space you have. Run df, and you get a list of the disk space
that is available to, and used by, every mounted filesystem.
$ df
Filesystem  1K-blocks      Used  Available Use% Mounted on
/dev/hda1     7678736   5170204    2508532  68% /
tmpfs          518068         0     518068   0% /dev/shm
tmpfs          518068     12588     505480   3% /lib/modules/2.6.12-10-386/volatile
/dev/hda2    30369948  24792784    5577164  82% /home
Before looking at these results in more detail, let's make them a little easier to read. The df command shows results in kilobytes by default, but it's usually easier to comprehend if you instead use the -h (or --human-readable) option.
$ df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/hda1   7.4G  5.0G  2.4G  68% /
tmpfs       506M     0  506M   0% /dev/shm
tmpfs       506M   13M  494M   3% /lib/modules/2.6.12-10-386/volatile
/dev/hda2    29G   24G  5.4G  82% /home
"Human-readable" (which is a wonderful term, by the way) means that kilobytes are indicated by K,
megabytes by M, and gigabytes by G. You can see those last two in the preceding listing.
So what do the results mean? You have two partitions on the hard drive: /dev/hda1, mounted at /, and /dev/hda2, mounted at /home. The /home partition has 29GB allocated to it, of which 82% is currently used, leaving about 5.4GB free. It's not panic time yet when it comes to disk space for that partition, but a few more CDs' worth of MP3s and it might be time to delete some unnecessary files.
The root partition, or /, has less space available: only 2.4GB out of 7.4GB. That partition isn't as likely
to grow, however, because it contains programs and other relatively static files. Granted, /var is found
there, and that's where software installers (as you'll discover in Chapter 13, "Installing Software") and
other files that change size and contents, such as logs, are located. But overall, there's still plenty of
space left on that partition, especially if you're not planning to install enormous applications or start
running a web or database server on this machine.
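You can also ask df about a single filesystem by handing it any path that lives on it; here's the root partition from the preceding listing:
$ df -h /
Filesystem  Size  Used Avail Use% Mounted on
/dev/hda1   7.4G  5.0G  2.4G  68% /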
The other two partitions are both labeled as tmpfs, which means they are temporary file systems that live in your computer's virtual memory. When you shut down your computer, the contents of those partitions disappear.
Tip
For more on tmpfs, see the Wikipedia article at http://en.wikipedia.org/wiki/TMPFS.
Report File Space Used by a Directory
du
The df command tells you about your entire hard drive, but what if you just want to know how much
space a directory and its contents are using? The du command (short for disk usage) answers that
question. First you use the cd command to change directories to the directory in question, and then run
du.
$ cd music
$ du
36582   ./Donald_Fagen
593985  ./Clash
145962  ./Hank_Mobley/1958_Peckin'_Time
128200  ./Hank_Mobley/1963_No_Room_For_Squares
108445  ./Hank_Mobley/1961_Workout
2662185 .
As with df, results for du are given in kilobytes, but also like df, you can view them in a more
comprehensible fashion with the -h (or --human-readable) option.
$ cd music
$ du -h
36M   ./Donald_Fagen
581M  ./Clash
143M  ./Hank_Mobley/1958_Peckin'_Time
126M  ./Hank_Mobley/1963_No_Room_For_Squares
106M  ./Hank_Mobley/1961_Workout
374M  ./Hank_Mobley
2.6G  .
This output shows you the space used by each subdirectory, but directories that contain other
directories also show the total for those contents. Look at the Hank_Mobley directory, which takes up
374MB on this filesystem. That total comes from the three subdirectories contained within Hank_Mobley
(since the actual amount in kilobytes was rounded to display megabytes, the totals are just a bit off); if
the total for Hank_Mobley was substantially bigger than the total of its three subdirectories, that would
mean that the Hank_Mobley directory contained within it several loose files that were outside of a
subdirectory.
Finally, you get a total for the entire music/ directory at the end: 2.6GB.
Report Just the Total Space Used for a Directory
du -s
If you don't want information about all of a directory's subdirectories, it's possible to ask du to simply
report back the grand total by using the -s option.
$ cd music
$ du -hs
2.6G
.
Short and sweet.
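Relatedly, if you want one total per subdirectory without the full recursive listing, hand du -sh a wildcard; the output below reuses the music directory from the earlier examples:
$ du -sh */
36M   Donald_Fagen/
581M  Clash/
374M  Hank_Mobley/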
Conclusion
Virtually every command you've looked at in this chapter has GUI equivalents. But if your system is non-responsive, if you're connecting to a machine via SSH (covered in Chapter 15) and can't use a GUI, or if you just want to use the speediest tool, you'll need to use the command line. The commands in this chapter are easy to remember, thanks to well-chosen names:
ps (view running processes)
kill (kill processes)
top (top listing of running processes)
lsof (list [ls] open files)
free (free, or available, memory)
df (disk free space)
du (disk usage)
Practice with them on your computer, and don't forget to read the man pages for each one because this chapter could only cover a fraction of their capabilities. There are many options to several of them, in particular ps, top, and lsof, that allow you to do some truly amazing detective work on the computers under your watch.
Chapter 13. Installing Software
Linux distributions by default contain thousands of great software packages. Immediately after
completing a basic install, you can surf the Net, write reports, create spreadsheets, view and
manipulate images, and listen to music. Even with that plethora of software, you're still going to want
to add new tools that you run across. Fortunately, Linux makes it easy to install new software onto your
computer.
Conventional wisdom about Linux said you had to compile any software you wanted to install. You can
still do that if you'd like (and many people do), but that's rarely necessary in this day and age. Instead,
you can install any of thousands of packages easily and quickly using some simple tools.
You need to understand one important thing before proceeding, however. In the Linux world, software
packages come in a variety of formats, with two in particular dominating: RPM and DEB (not surprisingly, those are the two we're going to cover in this chapter). RPM is used by distributions such as Red Hat (in fact, RPM stands for Red Hat Package Manager), Fedora Core, SUSE, and other RPM-based distributions. DEB is used by Debian-based distributions such as Debian itself, K/Ubuntu, Linspire, Xandros, and many others. You should know how both work, but focus on the system that matches your distribution.
Tip
For a great breakdown of the various package management systems and the distributions that
use them, see DistroWatch's "Linux Distributions - Facts and Figures: What Is Your Distribution's
Package Management?," at http://distrowatch.com/stats.php?section=packagemanagement.
Note that DEB leads the pack (note also that I'm biased because this is being written using
K/Ubuntu, also known as "Debian Done Right").
Install Software Packages for RPM-Based Distributions
rpm -ihv [package]
rpm -Uhv [package]
The rpm command installs software packages that end in .rpm, which seems entirely logical. To install an
RPM package, you need to download it first. Let's use the industry-standard open-source network port
scanner nmap as the example. You can download the RPM package for nmap from
www.insecure.org/nmap/download.html; after it's on your system, you simply run rpm along with three
options: -i (for install), -h (to show hash marks so you can watch the progress of the install), and -v
(be verbose and tell me what you're doing). The command, which must be run as root, would look like
this:
# rpm -ihv nmap-4.01-1.i386.rpm
This is actually not the command you should run, however. A better choice is -Uhv, where -U stands for
upgrade. Why is -U better than -i? Because -i only installs, while -U upgrades and installs. If a
package already exists on your system and you're trying to put a newer one on your machine, -U
performs an upgrade; if a package doesn't already exist on your system, -U notices that and instead
installs it. Therefore, just use -U all the time, and it won't matter if you're upgrading or installing: -U
does what needs to be done, and you won't need to worry about it.
# rpm -Uhv nmap-4.01-1.i386.rpm
Preparing...    ############################## [100%]
   1:nmap       ############################## [100%]
If you want to install more than one RPM, just list them separated by spaces, one after the other:
# rpm -Uhv nmap-4.01-1.i386.rpm nmap-frontend-4.01-1.i386.rpm
You can also use wildcards if you have many RPMs to install. For instance, if you have 20 .rpm files in a
subdirectory named software , just run this command to install them all:
# rpm -Uhv software/*.rpm
Caution
The -U option is always better, except when installing a kernel. Then you want to use -i
instead. If you upgrade with -U and the new kernel doesn't work, you're in a world of hurt. On
the other hand, if you install a new kernel with -i, the old one is still on your machine as a
backup if the new one blows up.
Remove Software Packages for RPM-Based Distributions
rpm -e [package]
Getting rid of installed RPMs is even easier than installing them. Instead of -Uhv to place RPMs on your
computer, use -e (for erase) to remove them.
# rpm -e nmap
And that's it. The -e option is silent, so there's no output. Notice, however, that when you install
software using rpm -Uhv , you need to specify the filename, otherwise rpm has no way to know what you
want to install. When you remove software with rpm -e, however, you specify the package name
because the files used for installation are now long gone, and rpm knows software by the package name.
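If you're not sure of the exact package name, the RPM database can remind you: rpm -qa lists every installed package, and piping that to grep narrows the list (the pattern here is just an example):
# rpm -qa | grep nmap
nmap-4.01-1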
Install Software Packages and Dependencies for RPM-Based Distributions
yum install [package]
The rpm command is powerful, but after you use it for a while you'll quickly run into a problem when you
try to install a package that has dependencies (otherwise known as "dependency hell"). In order to
install package A, you also need to download and install packages B and C, but in order to install C you
also need to download and install packages D and E, but in order to install E...aaaaaggh!
Debian-based distributions (that you'll read about later in this chapter) solved this problem years ago
with the powerful and useful apt ; RPM-based distributions can use apt , but more commonly they use
the relatively new and still immature yum . Originally developed for the RPM-based Yellow Dog Linux
distribution (hence the name, which stands for Yellow Dog Updater, Modified), yum is now widely used
but still lags behind apt in features and usability. Still, that's what's available, so that's what we're
going to cover.
The yum command installs, upgrades, and uninstalls software packages by acting as a wrapper around
rpm . In addition, yum automatically handles dependencies for you. For example, if you're trying to install
package A from the example in the introductory paragraph, yum downloads and installs A, B, and C for
you. Later, if you decide that you no longer want A on your system, yum can uninstall it for you, along
with B and C, as long as other software packages don't require them.
Installing software with yum is pretty easy. Let's say you want to install XMMS, a media player (not
surprising, since the name stands for X Multimedia System). To install XMMS, you also need to install
several other dependencies. With yum , this process becomes much less painful than it would be if you
were attempting to install manually using rpm . To start with, you don't need to find and download the
xmms package yourself; instead, yum downloads XMMS and any other necessary dependencies for you.
Unfortunately, yum is incredibly verbose as it goes about its work. The following output has been cut
drastically, yet it's still lengthy. Nevertheless, this gives you a rough approximation of what you'd see if
you were using yum .
# yum install xmms
Setting up Install Process
Setting up repositories
update  100% |======================|  951 B 00:00
base    100% |======================| 1.1 kB 00:00
Resolving Dependencies
--> Populating transaction set with selected packages. Please wait.
---> Downloading header for xmms to pack into transaction set.
xmms-1.2.10-9.i386.rpm    100% |==========| 24 kB 00:00
--> Restarting Dependency Resolution with new changes.
---> Downloading header for gtk+ to pack into transaction set.
gtk%2B-1.2.10-33.i386.rpm 100% |==========| 23 kB 00:00
Dependencies Resolved

 Package     Arch  Version     Repository  Size
Installing:
 xmms        i386  1:1.2.10-9  base        1.9 M
Installing for dependencies:
 libogg      i386  2:1.1.2-1   base         16 k
 libvorbis   i386  1:1.1.0-1   base        185 k

Total download size: 3.3 M
Is this ok [y/N]:
After you enter y , yum downloads and installs the packages, continuing to inform you about what it's
doing, every step of the way.
Downloading Packages:
...
(5/6): libogg-1.1.2-1.i38 100% |==========|  16 kB 00:00
(6/6): libvorbis-1.1.0-1. 100% |==========| 185 kB 00:00
Running Transaction Test
Running Transaction
Installing: libogg     ########## [1/6]
Installing: libvorbis  ########## [2/6]
...
Installed: xmms.i386 1:1.2.10-9
Dependency Installed: gdk-pixbuf.i386 1:0.22.0-17.el4.3 gtk+.i386 1:1.2.10-33 libogg.i386 2:1.1.2-1 libvorbis.i386 1:1.1.0-1 mikmod.i386 0:3.1.6-32.EL4
Whew! XMMS is finally installed and available for use. Now find out how to get rid of XMMS if you decide
you don't like it.
Remove Software Packages and Dependencies for RPM-Based Distributions
yum remove [package]
One thing in yum 's favor: Its command syntax is very user friendly. Want to install a package? yum
install. Want to remove? yum remove. So if you're tired of XMMS, just run this command:
# yum remove xmms
Setting up Remove Process
Resolving Dependencies
---> Package xmms.i386 1:1.2.10-9 set to be erased
Dependencies Resolved

 Package  Arch  Version     Repository  Size
Removing:
 xmms     i386  1:1.2.10-9  installed   5.2 M

Is this ok [y/N]:
Even for something as simple as removing a software package, yum continues its habit of grabbing you
by the lapels and telling you at length about its day. Press y to approve the uninstall, and you get a bit
more data thrown at you:
Running Transaction Test
Running Transaction
Removing  : xmms    #################### [1/1]
Removed: xmms.i386 1:1.2.10-9
Complete!
And now XMMS is gone. Notice that the dependencies that were installed by yum (detailed in the
previous section, "Install Software Packages and Dependencies for RPM-Based Distributions") are not
removed along with the xmms package. XMMS needed those dependencies to run, but they can still
work on your computer with different programs, so yum allows them to remain (this is the default
behavior of apt as well, as you'll see soon in "Remove Software Packages and Dependencies for
Debian").
Upgrade Software Packages and Dependencies for RPM-Based Distributions
yum update
Your Linux system contains hundreds, if not thousands, of software packages, and upgrades come out
constantly for one package or another. It would be a full-time job for you to manually keep track of
every new version of your software and install updates, but yum makes the process easy. A simple yum
update command tells yum to check for any upgrades to the software it's tracking. If new packages are
available, yum shows you what's available and asks your permission to proceed with an install.
# yum update
Setting up Update Process
Setting up repositories
update  100% |====================|  951 B 00:00
base    100% |====================| 1.1 kB 00:00
Resolving Dependencies
--> Populating transaction set with selected packages. Please wait.
---> Downloading header for cups-libs to pack into transaction set.
cups-libs-1.1.22-0.rc1.9. 100% |==========| 22 kB 00:00
---> Package cups-libs.i386 1:1.1.22-0.rc1.9.10 set to be updated
--> Running transaction check
Dependencies Resolved

 Package  Arch  Version           Repository  Size
Installing:
 openssl  i686  0.9.7a-43.4       update      1.1 M
 pam      i386  0.77-66.13        update      1.8 M
 perl     i386  3:5.8.5-24.RHEL4  update       11 M
 udev     i386  039-10.10.EL4.3   update      830 k
 wget     i386  1.10.2-0.40E      update      567 k

Transaction Summary
Install    1 Package(s)
Update    11 Package(s)
Remove     0 Package(s)
Total download size: 30 M
Is this ok [y/N]:
If you press y at this point, you're giving yum approval to download and install 12 packages. After lots of
output, yum completes its job, and your computer is now up to date. Want to live on the bleeding edge?
Run yum update daily. If you're not desirous of always using the latest and greatest, run yum update at a
longer interval, but be sure to run it regularly. Security updates for software come out all the time, and
it's a good idea to keep up-to-date.
Find Packages Available for Download for RPM-Based
Distributions
yum search [string]
yum list available
So you know how to install and remove software using yum , but how do you find software in the first
place? Let's say you're interested in the GIMP, the GNU Image Manipulation Program. You want to know
if any packages related to the GIMP are available to install using yum . You could try using yum search
gimp, but it wouldn't be a good idea. That command looks for matches to your search term in all
package names, descriptions, summaries, and even the list of packagers' names. You'll end up with a
list about as long as Bill Gates's bank statement.
A better choice is to query the list of packages available through yum (which would normally produce
another crazily long list), but then pipe the results through a grep search for your term.
$ yum list available | grep gimp
gimp.i386         1:2.0.5-5  base
gimp-devel.i386   1:2.0.5-5  base
gimp-help.noarch  2-0.1.0.3  base
gimp-print.i386   4.2.7-2    base
Eleven results show up; now that's workable. If you really want to perform the complete search, use yum search; otherwise, use yum list available and grep. Most of the time, the latter choice is what you'll really want, and it'll be far easier to work with to boot.
Install Software Packages for Debian
dpkg -i [package]
Installing new software is one of the most fun things you can do with a Linux machine. As you'll learn in
upcoming sections, Debian has apt , the most powerful and easiest-to-use system for installing software
of any Linux distribution. As powerful as apt is, much of it is a wrapper around the dpkg program (in
the same way yum is a wrapper around rpm ), which does the grunt work of installing and removing
software on a Debian-based machine. Before learning how to use apt , you should learn how to use dpkg
because apt can't be used to install everything.
Here's a case in point: One of the most popular Voice over IP (VoIP) programs out now is Skype. Due to
licensing issues, however, Skype isn't included in the default installs of most distributions. If you want
Skype, you have to first download it from the company's site, and then manually install it. To get
Skype, head over to the Linux download page, at www.skype.com/products/skype/linux, and find the
package for your distribution. In this case, you're going to use the Debian package, which at this time is
named skype_1.2.0.18-1_i386.deb.
After the .deb file is downloaded onto your system, it's time to install it. First use cd to change
directories to the directory that contains the .deb file, and then use dpkg to install it.
Note
On most Debian-based distributions, this and all other dpkg commands are run as root. The
wildly popular K/Ubuntu distribution, however, doesn't use root. Instead, commands normally
run as root are prefaced with sudo. In other words, Debian would use this:
# dpkg -i skype_1.2.0.18-1_i386.deb
K/Ubuntu and other sudo-based distributions would instead use this:
$ sudo dpkg -i skype_1.2.0.18-1_i386.deb
This book was written using several machines running K/Ubuntu, so if you see sudo instead of
root, now you know why.
# ls
skype_1.2.0.18-1_i386.deb
# dpkg -i skype_1.2.0.18-1_i386.deb
Selecting previously deselected package skype.
(Reading database ... 97963 files and directories currently installed.)
Unpacking skype (from skype_1.2.0.18-1_i386.deb) ...
Setting up skype (1.2.0.18-1) ...
That's it. The dpkg command is a model of brevity, telling you the important information and nothing
more.
Remove Software Packages for Debian
dpkg -r [package]
The -i option, used to install software on Debian-based machines, stands for install; in a similar bit of
happy obviousness, the -r option, used to uninstall software on Debian-based machines, stands for
remove. If you get tired of Skype, it's a simple matter to get it off your computer.
# dpkg -r skype
(Reading database ... 98004 files and directories currently installed.)
Removing skype ...
When you install software using dpkg -i, you need to specify the filename, otherwise dpkg has no way to know what you want to install. When you remove software with dpkg -r, however, you specify the package name because the files used for installation are now long gone, and dpkg knows software by the package name.
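If you've forgotten the exact package name, dpkg -l lists every installed package, and a grep narrows it down (the pattern, and the description column in the output below, are illustrative):
$ dpkg -l | grep skype
ii  skype  1.2.0.18-1  Skype VoIP client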
Install Software Packages and Dependencies for Debian
apt-get install [package]
The dpkg command is powerful, but after you use it for a while you'll quickly run into a problem when
you try to install a package that has dependencies (otherwise known as "dependency hell"). In order to
install package A, you also need to download and install packages B and C, but in order to install C you
also need to download and install packages D and E, but in order to install E...aaaaaggh! You need apt !
The apt command and its accessories install, upgrade, and uninstall software packages. Best of all, apt
automatically handles dependencies for you. For example, if you're trying to install package A from the
example in the previous paragraph, apt downloads and installs A, B, and C for you. Later, if you decide
that you no longer want A on your system, apt can uninstall it for you, along with B and C, as long as
other software packages don't require them.
The apt command was originally developed for use on the Debian distribution as a front end to dpkg. It's now found on every Debian-based distribution (which includes Debian itself, K/Ubuntu, Linspire, Xandros, and a whole host of others), and it's one of the features that make Debian so easy to use and powerful. Other, non-Debian distributions realized how great apt was, and eventually Conectiva ported apt to work with RPMs. (This chapter focuses on apt as it is used with Debian.)
Tip
For a good overview of apt as it is used with RPM-based distros, see my "A Very Apropos apt" in the October 2003 issue of Linux Magazine, available at www.linux-mag.com/2003-10/apt_01.html. In addition to that article, however, you should also check out http://apt.freshrpms.net for the latest in RPM repositories. Note that free registration is required to access Linux Magazine's content.
Let's say you want to install the awesome tool sshfs using apt . To do it, you go through the following
process (and remember that these commands must be run as root):
# apt-get update
Get:1 http://us.archive.ubuntu.com breezy Release.gpg [189B]
Get:2 http://archive.ubuntu.com breezy Release.gpg [189B]
Hit ftp://ftp.free.fr breezy/free Sources
Hit ftp://ftp.free.fr breezy/non-free Sources
Fetched 140kB in 1m4s (2176B/s)
Reading package lists... Done
[Results truncated for length]
# apt-get install sshfs
Reading package lists... Done
Building dependency tree... Done
The following extra packages will be installed:
fuse-utils libfuse2
The following NEW packages will be installed:
fuse-utils libfuse2 sshfs
Need to get 96.9kB of archives.
After unpacking 344kB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://us.archive.ubuntu.com breezy/universe sshfs 1.1-1 [19.3kB]
...
Fetched 96.9kB in 10s (9615B/s)
Reading package fields... Done
Reading package status... Done
Preconfiguring packages ...
...
Selecting previously deselected package sshfs.
Unpacking sshfs (from .../archives/sshfs_1.1-1_i386.deb) ...
Setting up sshfs (1.1-1) ...
Let's go over what you just did because you actually ran two commands. apt-get update downloads a list of current software packages from apt servers (known as repositories) that are listed in your apt configuration file, found at /etc/apt/sources.list (if you want to see where your repositories are, run cat /etc/apt/sources.list). If you saw Get at the beginning of a line after running apt-get update, it means that apt saw that the list on the repository is newer, so it's downloading that particular list; Ign, on the other hand, means that the list on the repository and on your computer are in sync, so nothing is downloaded. By running apt-get update before you do anything else, you ensure that your list of packages is correct and up-to-date.
The command apt-get install sshfs retrieves the specified package, as well as any necessary
dependencies (in this case, fuse-utils and libfuse2 ). After they're on your computer, apt (really dpkg
acting at the behest of apt ) installs all of the software for you. Keep in mind that you always use the
package name, not the filename. In other words, use apt-get install sshfs, not apt-get install
sshfs_1.1-1_i386.deb. As you saw, if apt does discover additional dependencies for the requested
package, as it did for sshfs, you'll have to confirm that you want them installed before apt grabs them.
If you want to install more than one package at the same time, just list them all on the command line.
For example, if you wanted to install sshfs and shfs-utils, you'd do the following:
# apt-get install sshfs shfs-utils
Any dependencies for either sshfs or shfs-utils are discovered, and you are asked if you want to
install them as well. Yes, it's that simple.
Tip
Don't know what sshfs is? Oh, but you should! Read my blog posting "Mount Remote Drives
via SSH with SSHFS," which you can find on The Open Source Weblog at
http://opensource.weblogsinc.com/2005/11/03/mount-remote-drives-via-ssh-with-sshfs/.
Remove Software Packages and Dependencies for Debian
apt-get remove [package]
If you no longer want a package on your system, apt makes it easy to uninstall it: Instead of apt-get
install, you use apt-get remove. This command works exactly contrary to apt-get install: It
uninstalls the packages you specify, along with any dependencies. Once again, reference the package
name, not the filename, so run apt-get remove sshfs, not apt-get remove sshfs_1.1-1_i386.deb.
# apt-get remove sshfs
Password:
The following packages will be REMOVED:
sshfs
After unpacking 98.3kB disk space will be freed.
Do you want to continue [Y/n]?
Removing a package actually doesn't remove every vestige of a package, however, because
configuration files for the removed package stick around on your computer. If you're sure that you want
to remove everything, use the --purge option.
# apt-get --purge remove sshfs
Password:
The following packages will be REMOVED:
sshfs*
After unpacking 98.3kB disk space will be freed.
Do you want to continue [Y/n]?
Using --purge, the packages that apt is about to remove are marked with asterisks, indicating that
associated configuration files are going to be removed as well.
Upgrade Software Packages and Dependencies for Debian
apt-get upgrade
A modern Linux system has several thousand software packages on it, and it's a sure bet that at least
one of those is upgraded by its maintainer every day. With apt , it's quite simple to keep your system
up-to-date. The process looks like this (and remember, this command must be run as root):
# apt-get update
Get:1 http://us.archive.ubuntu.com breezy Release.gpg [189B]
Get:2 http://archive.ubuntu.com breezy Release.gpg [189B]
Hit ftp://ftp.free.fr breezy/free Sources
Hit ftp://ftp.free.fr breezy/non-free Sources
Fetched 140kB in 1m4s (2176B/s)
Reading package lists... Done
[Results truncated for length]
# apt-get upgrade
Reading package lists... Done
Building dependency tree... Done
The following packages have been kept back:
koffice
The following packages will be upgraded:
kalzium kamera kanagram karbon kbruch kchart kcoloredit kdegraphics
kdegraphics-kfile-plugins
...
53 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
Need to get 58.3MB of archives.
After unpacking 28.7kB of additional disk space will be used.
Do you want to continue [Y/n]?
Let's figure out what was going on here. Once again, you want to run apt-get update first so your
computer is in sync with your apt repositories. Then apt-get upgrade looks for any differences between
what you have installed and what's available on the repositories. If differences exist, apt shows you a
list of all of the packages that it would like to download and install on your computer. Of course, the
actual list of packages varies depending on how up-to-date your system is. In this case, enough time
had passed that there were 53 packages to upgrade.
If you type y, apt downloads the 53 software packages to /var/cache/apt/archives and then installs
them after they're all on your machine. If you don't want to go through with the upgrade, just type n
instead.
That's easy enough, but the most efficient way to use apt to update your Linux box is to just join the
commands together:
# apt-get update && apt-get upgrade
The && makes sure that apt-get upgrade doesn't run unless apt-get update finishes without errors.
Better yet, make an alias for that command in your .bash_aliases file, as was discussed in the "Create
a New Permanent Alias" section of Chapter 11, "Your Shell."
alias upgrade='apt-get update && apt-get upgrade'
Reload your .bash_aliases file, and now you just need to type in upgrade, press Enter, press Y to
accept any new packages, and you're done. Windows Update, eat your heart out!
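If you're unsure how to reload the file, sourcing it by hand works (a quick sketch; this assumes the
alias lives in root's ~/.bash_aliases, since apt-get needs root):
# source ~/.bash_aliases
# upgrade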
Find Packages Available for Download for Debian
apt-cache search
We've been talking a lot about installing software using apt , but how do you know what software
packages are available in the first place? There's another tool in the apt toolbox that helps you do that:
apt-cache search, which searches the lists of packages available in your apt repositories. In a nice
change of pace, you don't need to be root at all to use apt-cache search.
$ apt-cache search dvdcss
libdvdread3 - Simple foundation for reading DVDs
ogle - DVD player with support for DVD menus
libdvdcss2 - portable library for DVD decryption
libdvdcss2-dev - development files for libdvdcss2
Keep in mind several things about how this command searches. It looks for matches to your string (in
this case, dvdcss), not for exact words. Also, your search pattern might appear in the package name
or description. Finally, apt-cache search looks through the entire package list of both installed and
uninstalled packages, so you might already have installed a package that shows up.
Tip
Synaptic is a GUI for apt that allows you to do virtually everything discussed here, but via
point-and-click instead of typing. In particular, its search tools are very nice, so much so that
I usually use Synaptic solely for searching, and do everything else via the command line. That
seems to maximize the effectiveness of each tool.
Clean Up Unneeded Installation Packages for Debian
apt-get clean
When packages are downloaded and installed using apt , the .deb files are left behind in
/var/cache/apt/archives/. Over time, those unnecessary installers can take up a lot of disk space.
To remove all of the unneeded .deb files, use apt-get clean (once again, this command is
run as root):
$ ls -1 /var/cache/apt/archives/
fuse-utils_2.3.0-1ubuntu1.1_i386.deb
libfuse2_2.3.0-1ubuntu1.1_i386.deb
lock
partial/
sshfs_1.1-1_i386.deb
# apt-get clean
$ ls -1 /var/cache/apt/archives/
lock
partial/
If a download is interrupted for whatever reason, you might find a partial .deb file in
/var/cache/apt/archives/partial/. If you know that all of the updates and upgrades are complete and
installed, it's safe to delete the contents of that directory, should you find anything in it.
Troubleshoot Problems with apt
Of course, as great as apt is, you might run into problems. Here are some problems and their
workarounds.
One simple problem that will result in a smack to your own forehead is the "Could not open lock file"
error. You'll try to run apt-get, but instead of working, you get this error message:
E: Could not open lock file /var/lib/dpkg/lock - open (13 Permission denied)
E: Unable to lock the administration directory (/var/lib/dpkg/), are you root?
The solution to this issue is right there in the second line: You're not logged in as root! Simply log in as
root and try again, and everything should work.
Note
If you're running K/Ubuntu, or any other distribution that uses sudo instead of root, the error
means that you didn't preface the command with sudo. In other words, you ran this:
$ apt-get upgrade
You're seeing the error message because you should have run this instead:
$ sudo apt-get upgrade
The next common issue occurs when apt complains about broken dependencies. You'll know this one's
happening when apt encourages you to run apt-get -f install . This suggestion is the program's way
of telling you that your system has some broken dependencies that prevent apt from finishing its job.
There are a couple of possible solutions. You can follow apt 's advice, and run apt-get -f install ,
which tries to fix the problem by downloading and installing the necessary packages. Normally, this
solves the problem, and you can move on.
If you don't want to do that, you can instead try running apt-get -f remove, which tries to fix the
problem by removing packages that apt deems troublesome. These might sound like potentially
dangerous steps to take (and they could be if you don't pay attention), but each option gives you the
chance to review any proposed changes and give your assent. Just be sure to examine apt's proposals
before saying yes.
Finally, apt might warn you that some packages "have been kept back." This warning tells you that apt
has found a conflict between the requested package or one of its dependencies and another package
already installed on your system. To resolve the issue, try to install the package that was kept back with
the -u option, which gives you information about exactly what needs to be upgraded.
Conclusion
RPM-based and Debian-based distributions, despite their differences, share some similarities when it
comes to package management, in that both are designed to simplify the installation, removal, and
management of software. RPM-based distributions use the rpm command to install, upgrade, and
remove software, while Debian-based distributions use dpkg for the same purposes. To solve the
headaches of dependency hell, RPM-based distributions have yum , a wrapper around rpm , while
Debian-based distributions have apt , a wrapper around dpkg. Still, there are differences, principally in maturity
and ease of use. In those areas, apt is definitely superior to yum , but yum is improving constantly.
In the end, both are still better than what Windows users have to put up with. Windows Update only
updates Microsoft's own software and a few third-party hardware drivers, while both apt and yum handle
virtually every piece of software on a Linux system. That's a huge advantage that Linux users have over
their Microsoft cousins, and it's one that should make us proud.
Part V: Networking
Chapter 14. Connectivity
Chapter 15. Working on the Network
Chapter 16. Windows Networking
Chapter 14. Connectivity
Networking has been part of Linux since the OS's beginning as a humble kernel hacked on by a small
group of programmers, and it's completely central to what makes Linux great. Linux networking most of
the time just works, providing you with a rock-solid connection that can be endlessly tweaked and tuned
to meet your exact needs.
This chapter is all about testing, measuring, and managing your networking devices and connections.
When things are working correctly (and they will the vast majority of the time), you can use the tools in
this chapter to monitor your system's network connections; should you experience problems, this
chapter will help you resolve some of the most irksome.
Tip
The information in this chapter assumes that you're using IPv4 addressing, in the format of
xxx.xxx.xxx.xxx. IPv6 will eventually replace IPv4, but that's still quite a while in the future.
At that time, route and many of the other commands you'll be examining will change. For
now, though, the information in this chapter is what you need. For more on IPv6, see
Wikipedia's "IPv6" (http://en.wikipedia.org/wiki/Ipv6).
View the Status of Your Network Interfaces
ifconfig
Everything in this chapter depends on your network connection. At the end of this chapter, in
"Troubleshooting Network Problems," you're going to learn ways to fix problems with your network
connection. Here at the beginning, let's find out what network connections you have in place and their
status.
To get a quick look at all of your network devices, whether they're running or not, use ifconfig (which
stands for interface configuration) with the -a (for all) option. Here's what you might see on a laptop
(note that some distributions require you to log on as root to use ifconfig ):
$ ifconfig -a
ath0      Link encap:Ethernet  HWaddr 00:14:6C:06:6B:FD
          inet addr:192.168.0.101  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::214:6cff:fe06:6bfd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1257 errors:7557 dropped:0 overruns:0 frame:7557
          TX packets:549 errors:2 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:200
          RX bytes:195869 (191.2 KiB)  TX bytes:95727 (93.4 KiB)
          Interrupt:11 Memory:f8da0000-f8db0000

eth0      Link encap:Ethernet  HWaddr 00:02:8A:36:48:8A
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:11092 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11092 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:982629 (959.5 KiB)  TX bytes:982629 (959.5 KiB)
Three interfaces are listed here: ath0 (a wireless card), eth0 (an Ethernet card), and lo (the loopback
interface; more on that in a moment). For each of those, you're told the type of connection, the Media
Access Control (MAC) or hardware address, the IP address, the broadcast and subnet mask addresses,
and information about received and transmitted packets, among other data. If a connection is
disconnected, much of that information is missing. In fact, that's one way you can see that ath0 and lo
are up, and eth0 is down: eth0 is missing an IP address, among other important details. Of course, an
easier way to tell is that the fourth line of the ath0 and lo interfaces begins with UP, while eth0 does
not.
Let's take the three interfaces in reverse order. lo is the loopback address, which enables a machine to
refer to itself. The loopback address is always represented by the IP address of 127.0.0.1. Basically,
your system needs it to work correctly. If you have it, don't worry about it; if you don't have it, you'll
know it because you will find yourself in a world of hurt.
Tip
For more on the loopback interface and address, see Wikipedia's "Loopback" at
http://en.wikipedia.org/wiki/Loopback.
eth0 is an Ethernet card, into which you actually plug cables. No cables are plugged into the Ethernet
card's port currently, so it's not activated, hence the lack of any addresses: IP, broadcast, and subnet
mask. It's possible to have both a wired and wireless interface running at the same time, although it's
usually not necessary.
Finally there is ath0, a wireless PCMCIA card. You might also see a wireless card with a name like eth0
if it's the primary network interface, or eth1 if it's secondary. When the wireless card was inserted,
K/Ubuntu automatically recognized it and configured the system to work with it, giving it the ath0
identifier. Because wireless interfaces are just Ethernet interfaces with some extra wireless goodies, the
information you get with ifconfig is similar to what you'd see for eth0 if it was up.
Note
You may see other names for your network devices, such as wlan0 for wireless cards.
ifconfig -a shows all interfaces, even those that are down; ifconfig by itself shows just those
connections that are up. It's a quick way to check the status of your network interfaces, especially if you
need to find your IP address quickly.
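You can also hand ifconfig the name of a single interface if that's all you care about (a quick sketch,
reusing the ath0 interface from the earlier listing):
$ ifconfig ath0
ath0      Link encap:Ethernet  HWaddr 00:14:6C:06:6B:FD
          inet addr:192.168.0.101  Bcast:192.168.0.255  Mask:255.255.255.0
[Results truncated for length]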
Note
You can also configure your network interfaces using ifconfig , a process described later in
"Configure a Network Interface."
Verify That a Computer Is Running and Accepting Requests
ping
ping -c
The ping command sends a special kind of packet, an ICMP ECHO_REQUEST message, to the specified
address. If a machine at that address is listening for ICMP messages, it responds with an ICMP
ECHO_REPLY packet. (It's true that firewalls can block ICMP messages, rendering ping useless, but
most of the time it's not a problem.) A successful ping means that network connectivity is occurring
between the two machines.
$ ping www.google.com
PING www.l.google.com (72.14.203.99) 56(84) bytes of data.
64 bytes from 72.14.203.99: icmp_seq=1 ttl=245 time=17.1 ms
64 bytes from 72.14.203.99: icmp_seq=2 ttl=245 time=18.1 ms
[Results truncated for length]
--- www.l.google.com ping statistics ---
6 packets transmitted, 5 received, 16% packet loss, time 5051ms
rtt min/avg/max/mdev = 16.939/17.560/18.136/0.460 ms
The ping command won't stop until you press Ctrl+C. This can cause problems if you forget that you're
using ping because it will continue forever until it is stopped or your machine's network connection
stops. I once forgot and left a ping session running for 18 days, sending nearly 1.4 million pings to one
of my servers. Oops!
If you want to give ping a limit, you can set the number of packets that ping is to send with the -c
option, followed by a number. After ping sends out that number of packets, it stops, reporting its
results.
$ ping -c 3 www.granneman.com
PING granneman.com (216.23.180.5) 56(84) bytes of data.
64 bytes from 216.23.180.5: icmp_seq=1 ttl=44 time=65.4 ms
64 bytes from 216.23.180.5: icmp_seq=2 ttl=44 time=64.5 ms
64 bytes from 216.23.180.5: icmp_seq=3 ttl=44 time=65.7 ms
--- granneman.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 4006ms
rtt min/avg/max/mdev = 64.515/65.248/65.700/0.510 ms
The ping command is a standard and speedy way to determine basic network connectivity; even better,
if you include the -c option, you'll never forget and leave ping running accidentally for 18 days.
For more about using ping to diagnose network connectivity issues, see "Troubleshooting Network
Problems."
Trace the Route Packets Take Between Two Hosts
traceroute
The traceroute command shows every step taken on the route from your machine to a specified host.
Let's say you want to know why you can't get to www.granneman.com . You were able to load it just
fine yesterday, but today's attempts to load the web page are timing out. Where's the problem?
$ traceroute www.granneman.com
traceroute to granneman.com (216.23.180.5), 30 hops max, 38 byte packets
 1  192.168.0.1 (192.168.0.1)  1.245 ms  0.827 ms  0.839 ms
 2  10.29.64.1 (10.29.64.1)  8.582 ms  19.930 ms  7.083 ms
 3  24.217.2.165 (24.217.2.165)  10.152 ms  25.476 ms  36.617 ms
 4  12.124.129.97 (12.124.129.97)  9.203 ms  8.003 ms  11.307 ms
 5  12.122.82.241 (12.122.82.241)  52.901 ms  53.619 ms  51.215 ms
 6  tbr2-p013501.sl9mo.ip.att.net (12.122.11.121)  51.625 ms  52.166 ms  50.156 ms
 7  tbr2-cl21.la2ca.ip.att.net (12.122.10.14)  50.669 ms  54.049 ms  69.334 ms
 8  gar1-p3100.lsnca.ip.att.net (12.123.199.229)  50.167 ms  48.703 ms  49.636 ms
 9  * * *
10  border20.po2-bbnet2.lax.pnap.net (216.52.255.101)  59.414 ms  62.148 ms  51.337 ms
11  intelenet-3.border20.lax.pnap.net (216.52.253.234)  51.930 ms  53.054 ms  50.748 ms
12  v8.core2.irv.intelenet.net (216.23.160.66)  50.611 ms  51.947 ms  60.694 ms
13  * * *
14  * * *
15  * * *
What do those * * * mean? Each one indicates a five-second timeout at that hop. Sometimes that could
indicate that the machine simply doesn't understand how to cope with that traceroute packet due to a
bug, but a consistent set of * indicates that there's a problem somewhere with the router to which
v8.core2.irv.intelenet.net hands off packets. If the problem persists, you need to notify the
administrator of v8.core2.irv.intelenet.net and let him know there's a problem. (Of course, it might not
hurt to let the administrator of gar1-p3100.lsnca.ip.att.net know that his router is having a problem
getting to border20.po2-bbnet2.lax.pnap.net as well, but it's not nearly the problem that intelenet.net
is having.)
One other way to get around a problematic traceroute is to increase the number of hops that the
command will try. By default, the maximum number of hops is 30, although you can change that with
the -m option, as in traceroute -m 40 www.bbc.co.uk .
Tip
Actually, a better traceroute is mtr , which stands for Matt's traceroute. Think of it as a
combination of ping and traceroute . If mtr is available for your Linux distribution, download
and try it out. For more information, head over to www.bitwizard.nl/mtr .
Perform DNS Lookups
host
The Domain Name System (DNS) was created to make it easier for humans to access resources on the
Internet. Computers work beautifully with numbers (after all, everything a computer does is really a
number), but humans can remember and process words much more efficiently. A website might be
located at 72.14.203.99, but it would be hard for most people to remember that. Instead, it's much
easier to keep in memory that you want to go to www.google.com. DNS is basically a giant database
that keeps track of the relationship between 72.14.203.99 and www.google.com, and millions of other
IP addresses and domain names as well.
Tip
DNS is a large, complicated, and fascinating topic. For more details, see Wikipedia's "Domain
Name System" (http://en.wikipedia.org/wiki/Dns) to start, and then jump into Paul Albitz and
Cricket Liu's seminal DNS and BIND.
To quickly find the IP address associated with a domain name, use the host command:
$ host www.granneman.com
www.granneman.com is an alias for granneman.com.
granneman.com has address 216.23.180.5
www.granneman.com is an alias for granneman.com.
www.granneman.com is an alias for granneman.com.
granneman.com mail is handled by 30 bhoth.pair.com.
There are five responses because host performs several types of DNS lookups. It's easy, however, to
see what you wanted: www.granneman.com can be found at 216.23.180.5.
You can also reverse the process, and find out a domain name associated with an IP.
$ host 65.214.39.152
152.39.214.65.in-addr.arpa domain name pointer
web.bloglines.com.
Note
Many other commands will reveal a host's IP address, but host is the most efficient way to
perform this task. Not to mention, you can do reverse lookups with host, which is not always
possible with other commands.
Find out more about how host can help you at the end of this chapter in "Troubleshooting Network
Problems."
Configure a Network Interface
ifconfig
In the first section of this chapter, "View the Status of Your Network Interfaces," you saw how you could
use ifconfig to get details about the status of your network interfaces. The ifconfig command is more
powerful than that, however, as you can also use it to configure your network interfaces.
Note
You can make quite a few changes with ifconfig , but you're only going to look at a few (for
more details, see man ifconfig).
To change the IP address for the Ethernet card found at eth0 to 192.168.0.125, run this command
(virtually all commands associated with ifconfig need to be run as root):
# ifconfig eth0 192.168.0.125
In order to run certain types of network-sniffing tools, such as the awesome Ethereal, you first need to
set your network card to promiscuous mode. By default, eth0 only listens for packets sent specifically to
it, but in order to sniff all the packets flowing by on a network, you need to tell your card to listen to
everything, which is promiscuous mode.
# ifconfig eth0 promisc
After you've done that, running ifconfig shows you that your card is now looking at every packet it
can. See the PROMISC in the fourth line?
# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:02:8A:36:48:8A
          inet addr:192.168.0.143  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::202:8aff:fe36:488a/64 Scope:Link
          UP BROADCAST PROMISC MULTICAST  MTU:1500  Metric:1
[Results truncated for length]
When you're done using Ethereal, don't forget to turn off promiscuous mode.
# ifconfig eth0 -promisc
# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:02:8A:36:48:8A
          inet addr:192.168.0.143  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::202:8aff:fe36:488a/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
You can even change (or "spoof") the hardware MAC address for your network device. This is usually
only necessary to get around some ISPs' attempts to link Internet service to a specific machine. Be
careful with spoofing your MAC address because a mistake can conflict with other network devices,
causing problems. If you do decide to spoof your MAC, make sure you use ifconfig by itself to first
acquire the default MAC address so you can roll back to that later (by the way, the MAC address shown
in this command is completely bogus, so don't try to use it).
# ifconfig eth0 hw ether 00:14:CC:00:1A:00
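To capture the current hardware address before you spoof it, you might grab just the relevant line of
ifconfig's output (a sketch; the address shown is the one from the earlier eth0 listing):
$ ifconfig eth0 | grep HWaddr
eth0      Link encap:Ethernet  HWaddr 00:02:8A:36:48:8A
Save that address somewhere safe, and you can restore it later with ifconfig eth0 hw ether
00:02:8A:36:48:8A.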
The ifconfig command is a cornerstone of working with network interfaces. Make sure you understand
how it works so you can take maximum advantage of all it has to offer.
View the Status of Your Wireless Network Interfaces
iwconfig
The ifconfig command shows the status of your network interfaces, even wireless ones. However, it
can't show all the data associated with a wireless interface because ifconfig simply doesn't know about
it. To get the maximum data associated with a wireless card, you want to use iwconfig instead of
ifconfig .
$ iwconfig
lo        no wireless extensions.

eth0      no wireless extensions.

ath0      IEEE 802.11g  ESSID:"einstein"
          Mode:Managed  Frequency:2.437 GHz  Access Point: 00:12:17:31:4F:C6
          Bit Rate:48 Mb/s   Tx-Power:18 dBm   Sensitivity=0/3
          Retry:off   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=41/94  Signal level=-54 dBm  Noise level=-95 dBm
          Rx invalid nwid:1047  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:73  Invalid misc:73   Missed beacon:21
You can see the data unique to wireless interfaces that iwconfig provides, including the type of card
(802.11g in this case), the ESSID or network name (this network's ESSID is einstein ), the mode or
kind of network to which you're connected, the MAC address of the wireless access point (here
00:12:17:31:4F:C6), and various details about the quality of the wireless connection.
The combination of ifconfig and iwconfig tells you everything you need to know about your wireless
network interface. And, just as you can also use ifconfig to configure your wired cards, you can also
use iwconfig to configure your wireless cards, as you'll see in the next section.
Configure a Wireless Network Interface
iwconfig
In the previous section, "View the Status of Your Wireless Network Interfaces," you used iwconfig to
see important details about your wireless card and its connection. You can also use iwconfig , however,
to configure that wireless card and its connections. If this sounds like ifconfig , it should, as iwconfig
was based on ifconfig and its behaviors.
Note
You can make several changes with iwconfig , but you're only going to look at a few (for more
details, see man iwconfig).
The network topologies associated with wired networks, such as star, bus, and ring, to name a few,
have been known and understood for quite some time. Wireless networks introduce some new
topologies to the mix, including the following:
Managed (an access point creates a network to which wireless devices can connect; the most
common topology for wireless networking)
Ad-Hoc (two or more wireless devices form a network to work with each other)
Master (the wireless device acts as an access point)
Repeater (the wireless device forwards packets to other wireless devices)
There are others, but those are the main ones. Using iwconfig , you can tell your wireless card that you
want it to operate differently, in accordance with a new topology.
Tip
For more on stars, busses, rings, and the like, see Wikipedia's "Network Topology" at
http://en.wikipedia.org/wiki/Network_topology.
# iwconfig ath0 mode ad-hoc
After specifying the interface, simply use the mode option, followed by the name of the mode you want
to use.
Note
Remember that the card you're using in these examples has an interface name of ath0; yours
might be eth1, wlan0, or something else entirely. To find out your interface's name, use
iwconfig by itself, as discussed in the previous section.
The Extended Service Set Identifier (ESSID) is the name of the wireless network to which you're joined
or you want to join. Most of the time an ESSID name of any will work just fine, assuming that you can
meet the network's other needs, such as encryption, if that's necessary. Some networks, however,
require that you specify the exact ESSID.
# iwconfig ath0 essid lincoln
Here you are joining a wireless network with an ESSID of lincoln. Simply use the essid option,
followed by the name of the ESSID, and you're good.
More and more networks are using encryption to protect users' communications from sniffers that
capture all the traffic and then look through it for useful information. The simplest form of network
encryption for wireless networks is Wired Equivalent Privacy (WEP). Although this provides a small
measure of security, it's just that: small. WEP is easily cracked by a knowledgeable attacker, and it has
now been superseded by the much more robust Wi-Fi Protected Access (WPA). Unfortunately, getting
WPA to work with wireless cards on Linux can be a real bear, and is beyond the scope of this book.
Besides, WEP, despite its flaws, is still far more common, and it is better than nothing. Just don't expect
complete and total security using it.
Tip
For more on WEP and WPA, see Wikipedia's "Wired Equivalent Privacy"
(http://en.wikipedia.org/wiki/WEP) and "Wi-Fi Protected Access"
(http://en.wikipedia.org/wiki/Wi-Fi_Protected_Access). You can find information about getting
WPA to work with your Linux distribution at "Linux WPA/WPA2/IEEE 802.1X Supplicant"
(http://hostap.epitest.fi/wpa_supplicant). If you're using Windows drivers via ndiswrapper,
also be sure to check out "How to Use WPA with ndiswrapper"
(http://ndiswrapper.sourceforge.net/mediawiki/index.php/WPA).
WEP works with a shared encryption key, a password that exists on both the wireless access point and
your machine. The password can come in two forms: hex digits or plain text. It doesn't really matter
which because iwconfig can handle both. If you've been given hex digits, simply follow the enc option
with the key.
# iwconfig ath0 enc 646c64586278742a6229742f4c
If you've instead been given plain text to use, you still use the enc option, but you must preface the key
with s: to indicate that what follows is a text string.
# iwconfig ath0 enc s:dldXbxt*b)t/L
Tip
I created those WEP keys using the very nice WEP Key Generator found at
www.andrewscompanies.com/tools/wep.asp.
If you have several options to change at one time, you probably would like to perform all of them with
one command. To do so, follow iwconfig with your device name, and then place any changes you want
to make one after the other.
# iwconfig ath0 essid lincoln enc 646c64586278742a6229742f4c
The preceding listing changes the ESSID and sets WEP encryption using hex digits for the wireless
device ath0. You can set as many things at one time as you'd like.
Grab a New Address Using DHCP
dhclient
Most home networks and many business networks use the Dynamic Host Configuration Protocol (DHCP) to
parcel out IP addresses and other key information about the network to new machines joining it.
Without DHCP, any new machine must have all of its networking information hard-coded in; with DHCP,
a new machine simply plugs in to the network, asks the DHCP server to provide it with an IP address
and other necessary items, and then automatically incorporates the DHCP server's reply into its
networking configurations.
Note
The following discussion assumes that you've already configured your network device to use
DHCP instead of hard-coded settings. Various Linux distributions expect that information to be
found in different configuration files. Debian-based distributions look for the line iface
[interface] inet dhcp in /etc/network/interfaces . Red Hat-derived distributions instead
want to see BOOTPROTO=dhcp in /etc/sysconfig/network-scripts/ifcfg-[interface] . In these
examples, substitute [interface] with the name of your interface. For more information,
search Google for "dhcp your-distro."
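For instance, on a Debian-based system, a minimal /etc/network/interfaces stanza for a
DHCP-configured Ethernet card might look like this (a sketch; your interface name may differ):
auto eth0
iface eth0 inet dhcp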
Sometimes your machine can't connect at boot to the DHCP server, so you need to manually initiate the
DHCP request. Or you might have networking problems that require a new IP address. No matter the
reason, the dhclient command attempts to query any available DHCP server for the necessary data
(dhclient must be run as root).
# dhclient eth0
Listening on LPF/eth0/00:0b:cd:3b:20:e2
Sending on   LPF/eth0/00:0b:cd:3b:20:e2
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
DHCPOFFER from 192.168.0.1
DHCPREQUEST on eth0 to 255.255.255.255 port 67
DHCPACK from 192.168.0.1
bound to 192.168.0.104 -- renewal in 37250 seconds.
# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:0B:CD:3B:20:E2
          inet addr:192.168.0.104  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20b:cdff:fe3b:20e2/64 Scope:Link
To release, or give up, the IP address that the DHCP server has assigned you, use the -r (for release)
option.
# dhclient -r eth0
sit0: unknown hardware address type 776
sit0: unknown hardware address type 776
Listening on LPF/eth0/00:0b:cd:3b:20:e2
Sending on   LPF/eth0/00:0b:cd:3b:20:e2
Sending on   Socket/fallback
Ideally, the dhclient command should run automatically when you boot your computer, plug in a
wireless PCMCIA card, or connect an Ethernet cable to your wired jack, but sometimes it doesn't. When
DHCP doesn't work like it's supposed to, turn to dhclient . It's especially nice how dhclient is
automatically verbose, so you can see what's going on and diagnose as necessary.
Note
Some Linux distributions are still using an older program to perform DHCPpumpinstead of
dhclient . For info on pump, take a look at man pump , or read a brief HOWTO for Red Hat and
Mandrake at www.faqs.org/docs/Linux-mini/DHCP.html#REDHAT6.
Make a Network Connection Active
ifup
You actually use the ifup command all the time without realizing it. When you boot your computer to
find that you're successfully connected to the Internet, you can thank ifup. If you plug in an Ethernet
cable to the port on the back of your Linux box, and a few seconds later you're able to get email again,
it's ifup that did the heavy lifting. In essence, ifup runs when it detects a network event, such as a
reboot or a cable plugged in, and then executes the instructions found in your network interface
configuration files (if you're curious, a note in the preceding section discussed the names and locations
of those files).
Sometimes, though, you might experience networking problems and need to run ifup manually. It's
incredibly easy to do so: Log on as root, and then follow ifup with the name of the network interface
you want to activate.
# ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
...
# ifup eth0
# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0B:CD:3B:20:E2
          inet addr:192.168.0.14  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20b:cdff:fe3b:20e2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
...
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
Notice that ifup doesn't tell you that it was successful. Indeed, like most Unix apps, ifup is silent upon
success, and only noisy if it experiences failure or error. To see what ifup has accomplished, use
ifconfig , as shown in the preceding listing.
Note
You can also use ifconfig [interface] up or iwconfig [interface] up to make wired or
wireless connections active.
Bring a Network Connection Down
ifdown
The ifup command makes network connections active, and the ifdown command brings them down.
Why would you need to bring down your network connection? Most often, it's because you're trying to
bring it up, and ifconfig reports that it's already up, but erroneously configured. So you first bring it
down, and then bring it back up.
# ifup eth0
ifup: interface eth0 already configured
# ifdown eth0
# ifup eth0
Notice that ifdown, like ifup, is silent upon success. If you don't see anything after entering ifdown, the
command was successful, and that network interface is no longer going to work.
Note
You can also use ifconfig eth0 down or iwconfig ath0 down to bring wired or wireless
connections down.
Display Your IP Routing Table
route
When you try to use Secure Shell (SSH) to connect to another computer on your LAN (something you'll
learn about in Chapter 15, "Working on the Network"), how does your computer know that it should
confine the packets to your LAN and not send them to your router to be sent out on the Internet? And if
you point your web browser to www.ubuntu.com, how does your Linux box know to send that request to
your router and not to another machine next to you?
The answer is that your Linux kernel has a routing table that keeps track of those things. To view your
current routing table, simply enter route in your shell (no, you don't have to be root to view the routing
table, but you do have to be root to change it, as you'll see in the next section).
$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
Note
There's only one network interface on this machine, so this is a pretty simple routing table. On
a laptop that has both an Ethernet port and a wireless card, you'll see additional entries.
An IP address is composed of four octets, giving it the appearance of xxx.xxx.xxx.xxx, as in
192.168.0.124. When you send a packet out of your machine, the IP address that is the destination of
that packet is compared to the Destination column in the routing table. The Genmask column works
with the Destination column to indicate which of the four octets should be examined to determine the
packet's destination.
For example, let's say you enter ping 192.168.0.124 in your shell. A Genmask of 255.255.255.0
indicates that the first three octets (those covered by the 255s) must match the Destination, while the
last octet (the one covered by the 0) can be anything. In other words, when looking at 192.168.0.124,
the 192.168.0 part determines how packets to that address are routed. Any packets intended for
192.168.0.1 through 192.168.0.255 (the limits of that network) match the Genmask and the
Destination, so they stay on the Local Area Network and avoid the router. That's why there's an * in the
Gateway column next to 192.168.0.0: No Gateway is needed because that traffic is local.
On the other hand, everything else is by default intended for the router, which in this instance is at
192.168.0.1 in the Gateway column. The Genmask in this row is 0.0.0.0, indicating that any IP address
not matching 192.168.0.1 through 192.168.0.255 should be sent through 192.168.0.1 (because
192.168.0.1 is the Gateway, it's a special case). 72.14.203.99, 82.211.81.166, and 216.23.180.5 all
match with 0.0.0.0, so they must all go through the Gateway for the Net.
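To make the masking concrete, here's the arithmetic the kernel effectively performs (an illustration,
not command output): each octet of the destination IP address is ANDed with the corresponding octet
of the Genmask, and the result is compared against the Destination column.
192.168.0.124 AND 255.255.255.0 = 192.168.0.0  (matches 192.168.0.0, so the packet stays on the LAN)
72.14.203.99  AND 255.255.255.0 = 72.14.203.0  (no match for 192.168.0.0)
72.14.203.99  AND 0.0.0.0       = 0.0.0.0      (matches the default route, so off to the Gateway it goes)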
The other interesting thing about the routing table exposed by route is the Flags column, which gives
information about the route. There are several possible flags, but the most common are U (the route is
up) and G (use the gateway). In the preceding table, you can see that both routes are up, but only the
second uses the Gateway.
Change Your IP Routing Table
route
The route command can be used not only to view your routing table, but to alter it as well. You need to
be careful here, however, as you can break the network and effectively landlock your computer.
Let's say your machine keeps dropping the Gateway, effectively making it impossible for any packets to
leave your LAN for the Internet (this really happened to me once with a box). Run the route command,
verify that the Gateway is missing, and then add it back in with route (although viewing the routing
table can be done as a normal user, changing it requires root access).
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
# route add -net default gw 192.168.0.1 dev eth0
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
Let's break down that command. add indicates that you're adding a new route (to remove one, use del ).
The -net option tells the kernel that the target you're adding is a network, in this case the default
destination. gw indicates that you want to route packets matching the destination (here the default,
therefore utilizing a Genmask of 0.0.0.0) using a gateway at 192.168.0.1. Finally, dev eth0 specifies the
device to use, in this case the Ethernet card at eth0.
Let's say that in addition to your Ethernet card at eth0, you also have a wireless card at ath0. You want
that wireless card to access resources on a LAN that uses 10.1.xxx.xxx as its base. You don't want the
wireless card to be able to access the Internet at all. To add a route matching those criteria, you'd use
these commands:
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
# route add -net 10.1.0.0 netmask 255.255.0.0 dev ath0
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
10.1.0.0        *               255.255.0.0     U     0      0        0 ath0
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
Here you indicated the wireless card with dev ath0 , and then specified the netmask as 255.255.0.0 so
routing would occur correctly. If you later want to remove that route, you'd use the following:
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
10.1.0.0        *               255.255.0.0     U     0      0        0 ath0
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
# route del -net 10.1.0.0 netmask 255.255.0.0 dev ath0
# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
default         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
Everything is the same, except you use del instead of add . Now that's easy!
Troubleshooting Network Problems
Linux distributions nowadays usually "just work" when it comes to networking, but you might still
experience an issue. Following are some basic tips for troubleshooting network problems.
If your network interface appears to be up and running, but you can't get on the Internet, first try
pinging your localhost device, at 127.0.0.1. If that doesn't work, stop and go no further, because you
have a seriously damaged system. If that works, ping your machine's external IP address. If that
doesn't work, make sure networking is enabled on your machine. If it does work, now try pinging other
machines on your network, assuming you have any. If you're not successful, it's your interface
(assuming your router is okay). Make sure that your cables are plugged in (seriously). Use ifconfig (or
iwconfig if it's wireless) to verify the status of your interface and use ifup to turn the interface on, if
necessary. Then try ping again.
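Assuming the addresses from the earlier ifconfig listing, that first sequence of tests might look like
this (a sketch; 127.0.0.1 is always the loopback, while the other two addresses are stand-ins for your
machine and a LAN neighbor):
$ ping -c 3 127.0.0.1
$ ping -c 3 192.168.0.101
$ ping -c 3 192.168.0.102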
If your attempt to ping another local computer was successful, next try pinging your router. If you can
get to other machines on your network, but you can't get to your router, time to check your routing
tables with route (refer to "Display Your IP Routing Table"). If you're missing items from your routing
table, add them, as detailed in "Change Your IP Routing Table."
Note
It's much easier to diagnose and fix problems if you have a baseline from which to work. After
you know a machine's networking is correct, run route and save the results, so you have a
stored blueprint if the routing table goes bonkers and you need to restore something later.
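Capturing that baseline is a one-liner; just redirect route's output to a file you can consult later (the
filename here is only an example):
$ route > ~/routing-table-baseline.txt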
If you can get to your router, try pinging a machine you know will be up and running out on the
Internet, like www.google.com or www.apple.com. If that doesn't work, try pinging that same
machine's IP address. Yes, that means that you need to have some IP addresses on a sticky note or in a
text file on your computer for just such an occasion as this. Here are a few that are good right now; of
course, they could change, so you really should look these up yourself.
Site                IP Address
www.google.com      72.14.203.99
www.apple.com       17.254.0.91
www.ubuntu.com      82.211.81.166
www.ibm.com         129.42.16.99
www.granneman.com   216.23.180.5
Note
How do you get those IP addresses? You can ping the machine using its domain name, and
ping gives you the IP address, or you can get the same info with traceroute. A quicker
method is with the host command, covered earlier in "Perform DNS Lookups."
If you can get to the IP address, but can't get to the domain name, you have a DNS problem. If you're
using DHCP, it's time to run dhclient (refer to "Grab a New Address Using DHCP") to try to renew the
DNS information provided by your DHCP server. If you're not using DHCP, find the DNS information you
need by looking on your router or asking your administrator or ISP, and then add it manually as root to
/etc/resolv.conf so that it looks like this, for example:
nameserver 24.217.0.5
nameserver 24.217.0.55
That's nameserver (a required word), followed by an IP address you're supposed to use for the DNS. If
your router can handle it, and you know its IP address (192.168.0.1, let's say), you can always try this
first:
nameserver 192.168.0.1
Try ifdown and then ifup and see if you're good to go. If you're still having problems, time to begin
again, starting always with hardware. Is everything seated correctly? Is everything plugged in? After
you're sure of that, start checking your software. The worst-case scenario is that your hardware just
doesn't have drivers to work with Linux. It's rare, and growing rarer all the time, but it still happens.
Wireless cards, however, can be wildly incompatible with Linux thanks to close-mouthed manufacturers
who don't want to help Linux developers make their hardware work. To prevent headaches, it's a good
idea to check online to make sure a wireless card you're thinking about purchasing will be copasetic
with Linux. Good sites to review include Hardware Supported by Madwifi
(http://madwifi.org/wiki/Compatibility); Linux Wireless LAN Support (http://linux-wless.passys.nl/),
which is a bit outdated but still useful; and my constantly updated list of bookmarks relating to Linux
and wireless connectivity (http://del.icio.us/rsgranne/wireless). When it comes to specific hardware, I
can wholeheartedly recommend the Netgear WG511T wireless PCMCIA card. I simply inserted it into my
laptop running the latest K/Ubuntu, and it worked immediately.
Oh, and to finish our troubleshooting: If you can successfully ping both the IP address and the domain
name, stop reading this; you're online! Go have fun!
Conclusion
You've covered a wide selection of networking tools in this chapter. Many of them, such as ifconfig ,
iwconfig , and route, can perform double duty, both informing you about the status of your connectivity
and allowing you to change the parameters of your connections. Others are used more for diagnostic
purposes, such as ping, traceroute, and host. Finally, some govern your ability to connect to a network
at all: dhclient , ifup, and ifdown. If you're serious about Linux, you're going to need to learn all of
them. There's nothing more frustrating than a network connection that won't work, but often the right
program can fix that problem in a matter of seconds. Know your tools well, and you can fix issues
quickly and efficiently as they arise.
Chapter 15. Working on the Network
Many of the commands in this chapter are so rich in features and power that they deserve books of their
own. This book can't begin to cover all of the awesome stuff you can do with ssh , rsync, wget, and curl,
for instance, but it can get you started. The best thing to do with these commands is to try them, get
comfortable with the basics, and then begin exploring the areas that interest you. There's enough to
keep you busy for a long time, and after you start using your imagination, you'll discover a world of cool
applications for the programs you're going to read about in this chapter.
Securely Log In to Another Computer
ssh
Because Unix was built with networking in mind, it's no surprise that early developers created programs
that would allow users to connect to other machines so they could run programs, view files, and access
resources. For a long time, telnet was the program to use, but it had a huge problem: It was
completely insecure. Everything you send using telnet (your username, password, and all commands
and data) is sent without any encryption at all. Anyone listening in can see everything, and that's just not
good.
To combat this problem, ssh (secure shell) was developed. It can do everything telnet can, and then a
lot more. Even better, all ssh traffic is encrypted, making it even more powerful and useful. If you need
to connect to another machine, whether that computer is on the other side of the globe or in the next
room, use ssh .
Let's say you want to use the ssh command from your laptop (named pound , and found at
192.168.0.15) to your desktop (named eliot , and located at 192.168.0.25) so you can look at a file.
Your username on the laptop is ezra , but on the desktop it's tom . To SSH to eliot , you'd enter the
following (you could also use domain names such as hoohah.granneman.com if one existed):
$ ssh tom@192.168.0.25
tom@192.168.0.25's password:
Linux eliot 2.6.12-10-386 #1 Mon Jan 16 17:18:08 UTC 2006 i686 GNU/Linux
Last login: Mon Feb 6 22:40:31 2006 from 192.168.0.15
[Listing truncated for length]
You're prompted for a password after you connect. Type it in (you won't see what you're typing, in
order to prevent someone from "shoulder surfing" and discovering your password), press Enter, and if
it's accepted, you see some information about the machine to which you just connected, including its
name, kernel, date and time, and the last time you logged in. You can now run any command you're
authorized to run on that machine as though you were sitting right in front of it. From the perspective of
ssh and eliot , it doesn't matter where on earth you are; you're logged in and ready to go.
If this were the first time you'd ever connected to eliot , however, you would have seen a different
message:
$ ssh tom@192.168.0.25
The authenticity of host '192.168.0.25 (192.168.0.25)' can't be established.
RSA key fingerprint is 54:53:c3:1c:9a:07:22:0c:82:7b:38:53:21:23:ce:53.
Are you sure you want to continue connecting (yes/no)?
Basically, ssh is telling you that it doesn't recognize this machine, and it's asking you to verify the
machine's identity. Type in yes , press Enter, and you get another message, along with the password
prompt:
Warning: Permanently added '192.168.0.25' (RSA) to the list of known hosts.
tom@192.168.0.25's password:
From here things proceed normally. You only see this message the first time you connect to eliot
because ssh stores that RSA key fingerprint it mentioned in a file on pound located at
~/.ssh/known_hosts . Take a look at that file, and you see a line has been added to it. Depending on
whether the HashKnownHosts option has been enabled in your /etc/ssh/ssh_config file, that line
appears in one of two formats. If HashKnownHosts is set to no, it looks something like this:
192.168.0.25 ssh-rsa SkxPUQLYqXSzknsstN6Bh2MHK5AmC6Epg4psdNL69R5pHbQi3kRWNNNNO3AmnP1lp2RNNNNOVjNN9mu5FZel6zK0iKfJBbLh/Mh9KOhBNtrX6prfcxO9vBEAHYITeLTMmYZLQHBxSr6ehj/9xFxkCHDYLdKFmxaffgA6Ou2ZUX5NzP6Rct4cfqAY69E5cUoDv3xEJ/gj2zv0bh630zehrGc=
You can clearly see the IP address as well as the encryption hash, but that's nothing compared to what
you'd see if HashKnownHosts is set to yes:
NNNNO3AmnP1lp2RNNNNOVjNNNVRNgaJdxOt3GIrh00lPD6KBIU1kaT6nQoJUMVTx2tWb5KiF/LL/0czCZIQNPwDUf6YiKUFFC6eagqpLDDB4T9qsOajOPLNinRZpcQoPlXf1u6j1agfJzqUJUYE+Lwv8yzmPidCvOuCZ0LQH4qfkVNXEQxmyy6iz6b2wp
Everything is hashed, even the machine's IP address or domain name. This is good from a security
standpoint, but it's a problem if the OS on eliot should ever change. In other words, if you ever need
to reinstall Linux on eliot , the next time you log in from pound you'll see this dire warning:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
19:85:59:5c:6a:24:85:53:07:7a:dc:34:37:c6:72:1b.
Please contact your system administrator.
Add correct host key in /home/pound/.ssh/known_hosts to get rid of this message.
Offending key in /home/pound/.ssh/known_hosts:8
RSA host key for 192.168.0.25 has changed and you have requested strict checking.
Host key verification failed.
The problem is that the ssh key on eliot has changed since the OS has been reinstalled, and when ssh
checks the key for eliot it has stored in pound 's known_hosts file with the new key, it sees that there's a
mismatch and warns you. To fix this, simply delete the line in pound 's known_hosts that corresponds to
eliot , save the file, and reconnect. To ssh , this is the first time you've connected, so it asks if you
want to accept the key. Say yes, and things are good again.
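One way to delete exactly the right line, assuming a warning like the one above that points at line 8 of
known_hosts, is to let sed do the deleting (a sketch; it assumes GNU sed's -i in-place option, and you
should double-check the line number in your own warning first):
$ sed -i '8d' ~/.ssh/known_hosts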
If eliot is the only machine you ever SSH to, finding the correct line to delete in known_hosts is easy
because it's the only one. But if you connect to several machines, you're going to have a hard time.
Quick, which one of these represents hoohah.granneman.com?
AAAAB3NzaC1yc2EAAAABIwAAAIEAtnWqkBg3TVeu00yCQ6XOVH1xnG6aDbWHZIGk2gJo5XvS/YYQ4Mjoi2M/w/0pmPMVDACj
NNNNO3AmnP1lp2RNNNNOVjNNNVRNgaJdxOt3GIrh00lPD6KBIU1kaT6nQoJUMVTx2tWb5KiF/LLD4Zwbv2Z/j/0czCZIQNPw
AAAAB3NzaC1yc2EAAAABIwAAAIEAtnWqkBg3TVeu00yCQ6XOVH1xnG6aDbWHZIGk2gJo5XvS/YYQ4Mjoi2M/w/0pmPMVDACj
Hard to tell, eh? Because you can't tell, you really have no choice but to delete every line, which means
that you have to re-accept the key every time you log in to a machine. In order to avoid this, it might
be easier just to edit /etc/ssh/ssh_config as root and set HashKnownHosts to no.
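The relevant line in /etc/ssh/ssh_config would then simply read:
HashKnownHosts no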
Securely Log In to Another Machine Without a Password
ssh
The name of this section might appear to be a misnomer, but it's entirely possible to log in to a machine
via ssh , but without providing a password. If you log in every day to a particular computer (and there
are some boxes that I might log in to several times a day), the techniques in this section will make you
very happy.
Let's say you want to make it possible to log in to eliot (username: tom ) from pound (username: ezra)
without requiring that you type a password. To start with, create an ssh authentication key on pound
using the following command:
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/ezra/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ezra/.ssh/id_dsa.
Your public key has been saved in /home/ezra/.ssh/id_dsa.pub.
The key fingerprint is:
30:a4:a7:31:27:d1:61:82:e7:66:ae:ed:6b:96:3c:24 ezra@pound
Accept the default location in which to save the key by pressing Enter, and leave the passphrase field
blank as well by pressing Enter twice when asked. You just created a private key at ~/.ssh/id_dsa and
a public key at ~/.ssh/id_dsa.pub.
Now you need to transfer the public key (not the private key!) from pound to eliot. The developers behind
ssh are way ahead of you, and have created a program that makes this as easy as falling off a log. To
automatically copy your public key from pound to eliot, just enter the following on pound:
$ ssh-copy-id -i ~/.ssh/id_dsa.pub tom@192.168.0.25
Now try logging into the machine, with ssh 'tom@192.168.0.25', and check in .ssh/authorized_keys to
make sure you haven't added extra keys that you weren't expecting.
You're done (although if you want to follow the advice given by ssh-copy-id, go right ahead). Watch
what happens when you use the ssh command from pound to eliot now:
$ ssh tom@192.168.0.25
Linux eliot 2.6.12-10-386 #1 Mon Jan 16 17:18:08 UTC 2006 i686 GNU/Linux
Last login: Mon Feb 6 22:40:31 2006 from 192.168.0.15
Notice that you weren't asked to enter a password, which is exactly what you wanted.
Some of you are wondering about the security of this trick. No passwords? Freely exchanging keys? It's
true, but think about it for a moment. True, if someone gets on pound, he can now connect to eliot
without a password. But that simply means that you need to practice good security on pound. If pound is
compromised, you have enormous problems whether or not the attacker realizes that he can also get to
eliot. On top of that, you shoot passwords around the Internet all the time. If an attacker acquires
your password, he can do major damage as well. Isn't your private key as important as a password?
And aren't you going to back it up and safeguard it? When you think about it in those terms, exchanging
keys via ssh is at least as secure as passwords, and in most ways much more secure.
Nonetheless, if you prefer to keep using passwords, that's certainly your prerogative. That's what open
source and Linux is all about: choice.
Securely Transfer Files Between Machines
sftp
In the same way that ssh is a far better choice than telnet, SFTP is far better than FTP. Like telnet,
FTP sends your name and password, as well as all the data being transferred, in the clear so that
anyone sniffing packets can listen in. SFTP, on the other hand, uses ssh to encrypt everything:
username, passwords, and traffic. Other than the fact it's about a million times more secure, it's
remarkably similar to FTP in its commands, which should make it easy to learn and use.
If you can access a machine via ssh , you can also SFTP to it. To use the sftp command from pound
(192.168.0.15; username ezra) to eliot (192.168.0.25; username tom ), just use this command:
$ sftp tom@192.168.0.25
Connecting to 192.168.0.25...
tom@192.168.0.25's password:
sftp>
If you've read the previous section, "Securely Log In to Another Machine Without a Password," you
might be wondering why you were prompted for a password. You're correct: The preceding example
was captured before passwordless logins had been set up. After they have been, you instead see a
login that looks like this:
$ sftp tom@192.168.0.25
Connecting to 192.168.0.25...
sftp>
After you're logged in via sftp, the commands you can run are pretty standard. Table 15.1 lists some of
the more common commands; for the full list, look at man sftp .
Table 15.1. Useful SFTP Commands
Command   Meaning
cd        Change directory
exit      Close the connection to the remote SSH server
get       Copy the specified file to the local machine
help      Get help on commands
lcd       Change the directory on the local machine
lls       List files on the local machine
ls        List files in the working directory on the remote SSH server
put       Copy the specified file to the remote SSH server
rm        Remove the specified file from the remote SSH server
Securely Copy Files Between Hosts
scp
If you're in a hurry and you need to copy a file securely from one machine to another, scp (secure copy)
is what you want. In essence, here's the basic pattern for using scp :
scp user@host1:file1 user@host2:file2
This is basically the same syntax as good ol' cp, but now extended to the network. An example will
make things clearer. Let's say you want to copy backup.sh from pound (192.168.0.15; username ezra)
to /home/tom/bin on eliot (192.168.0.25; username tom ) using scp :
$ pwd
/home/ezra
$ ls ~/bin
backup.sh
$ scp ~/bin/backup.sh tom@192.168.0.25:/home/tom/bin
backup.sh                      100% 8806     8.6KB/s   00:00
$
You weren't prompted for a password because you set things up earlier in "Securely Log In to Another
Machine Without a Password" so ssh doesn't require passwords to connect from pound to eliot, and
because scp relies on ssh , you don't need a password here, either. If you hadn't done that, you would
have been asked to enter tom 's password before continuing.
Let's say you have several JPEGs you want to transfer from pound to eliot. No problem; just use a
wildcard:
$ ls -1 ~/covers
earth_wind_&_fire.jpg
handel_-_chamber_music.jpg
smiths_best_1.jpg
strokes_-_is_this_it.jpg
u2_pop.jpg
$ scp *.jpg tom@192.168.0.25:/home/tom/album_covers
earth_wind_&_fire.jpg          100%   44KB  43.8KB/s
handel_-_chamber_music.jpg     100%   12KB  12.3KB/s
smiths_best_1.jpg              100%   47KB  47.5KB/s
strokes_-_is_this_it.jpg       100%   38KB  38.3KB/s
u2_pop.jpg                     100% 9222     9.0KB/s
Now let's say you want to go the other direction. You're still on pound, and you want to copy several
pictures of Libby from eliot to pound, into a different directory from the one you're currently in:
$ scp tom@192.168.0.25:/home/tom/pictures/dog/libby* ~/pix/libby
libby_in_window_1.20020611.jpg 100%  172KB 172.4KB/s
libby_in_window_2.20020611.jpg 100%  181KB 180.8KB/s
libby_in_window_3.20020611.jpg 100%  197KB 196.7KB/s
libby_in_window_4.20020611.jpg 100%  188KB 187.9KB/s
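One related trick, not shown in the examples above but worth knowing: like cp, scp accepts the -r option to copy entire directories. For instance, to pull the whole (hypothetical) dog pictures directory over in one shot, something like this should work:
$ scp -r tom@192.168.0.25:/home/tom/pictures/dog ~/pix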
The scp command is really useful when you need to securely copy files between machines. If you have
many files to copy, however, you'll find that scp can quickly grow tiresome. In cases like that, you might
want to look at SFTP or a mounted Samba share (which is covered in "Mount a Samba Filesystem,"
found in Chapter 16, "Windows Networking").
Securely Transfer and Back Up Files
rsync -v
rsync is one of the coolest, most useful programs ever invented, and many people rely on it every day
(like me!). What does it do? Its uses are myriad (here we go again into "you could write a book about
this command!"), but let's focus on one very powerful, necessary feature: its capability to back up files
effectively and securely, with a minimum of network traffic.
Let's say you intend to back up 2GB of files every night from a machine named coleridge (username:
sam ) to another computer named wordsworth (username: will ). Without rsync , you're looking at a
transfer of 2GB every single night, a substantial amount of traffic, even on a fast network connection.
With rsync , however, you might be looking at a transfer that will take a few moments at most. Why?
Because when rsync backs up those 2GB, it transfers only the differences between all the files that
make up those 2GB of data. If only a few hundred kilobytes changed in the past 24 hours, that's all that
rsync transfers. If instead it was 100MB, that's what rsync copies over. Either way, it's much less than
2GB.
Here's a command that, run from coleridge , transfers the entire content of the documents directory to a
backup drive on wordsworth . Look at the command, look at the results, and then you can walk through
what those options mean (the command is given first with long options instead of single letters for
readability, and then with single letters, if available, for comparison).
$ rsync --verbose --progress --stats --recursive --times --perms --links --compress --rsh=ssh --delete /home/sam/documents/ will@wordsworth:/media/backup/documents
$ rsync -v --progress --stats -r -t -p -l -z -e ssh --delete /home/sam/documents/ will@wordsworth:/media/backup/documents
Of course, you could also run the command this way, if you wanted to combine all the options:
$ rsync -vrtplze ssh --progress --stats --delete /home/sam/documents/ will@wordsworth:/media/backup/documents
Upon running rsync using any of the methods listed, you'd see something like this:
building file list ...
107805 files to consider
deleting clientele/Linux_Magazine/do_it_yourself/13/gantt_chart.txt~
deleting Security/diebold_voting/black_box_voting/bbv_chapter-9.pdf
deleting E-commerce/Books/20050811 eBay LIL ABNER DAILIES 6 1940.txt
Security/electronic_voting/diebold/black_box_voting/bbv_chapter-9.pdf
legal_issues/free_speech/Timeline A history of free speech.txt
E-commerce/2005/Books/20050811 eBay LIL ABNER DAILIES 6 1940.txt
connectivity/connectivity_info.txt
[Results greatly truncated for length]
Number of files: 107805
Number of files transferred: 120
Total file size: 6702042249 bytes
Total transferred file size: 15337159 bytes
File list size: 2344115
Total bytes sent: 2345101
Total bytes received: 986
sent 2345101 bytes received 986 bytes 7507.48 bytes/sec
total size is 6702042249 speedup is 2856.69
Take a look at those results. rsync first builds a list of all files that it must consider (107,805 in this
case) and then deletes any files on the target (wordsworth) that no longer exist on the source
(coleridge). In this example, three files are deleted: a backup file (the ~ is a giveaway on that one)
from an article for Linux Magazine, a PDF on electronic voting, and then a text receipt for a purchased book.
After deleting files, rsync copies over any that have changed, or if it's the same file, just the changes to
the file, which is part of what makes rsync so slick. In this case, four files are copied over. It turns out
that the PDF was actually moved to a new subdirectory, but to rsync it's a new file, so it's copied over in
its entirety.
The same is true for the text receipt. The A history of free speech.txt file is an entirely new file, so
it's copied over to wordsworth as well.
After listing the changes it made, rsync gives you some information about the transfer as a whole: 120
files were transferred, totaling 15,337,159 bytes (about 14MB) out of 6,702,042,249 bytes (around 6.2GB).
Other data points are contained in the summation, but those are the key ones.
Now let's look at what you asked your computer to do. The head and tail of the command are easy to
understand: the command rsync at the start, then options, and then the source directory you're copying
from (/home/sam/documents/ , found on coleridge ), followed by the target directory you're copying to
(/media/backup/documents , found on wordsworth ). Before going on to examine the options, you need to
focus on the way the source and target directories are designated because there's a catch in there that
will really hurt if you don't watch it.
You want to copy the contents of the documents directory found on coleridge , but not the directory
itself, and that's why you use documents/ and not documents . The slash after documents in
/home/sam/documents/ tells rsync that you want to copy the contents of that directory into the documents
directory found on wordsworth ; if you instead used documents , you'd copy the directory and its
contents, resulting in /media/backup/documents/documents on wordsworth .
Note
The slash is only important on the source directory; it doesn't matter whether you use a slash
on the target directory.
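To see the difference at a glance, here's a minimal sketch of the two forms, stripped down to just -r for brevity:
# Trailing slash: the contents of documents land directly in the target
$ rsync -r /home/sam/documents/ will@wordsworth:/media/backup/documents
# No trailing slash: the directory itself is copied, creating
# /media/backup/documents/documents on wordsworth
$ rsync -r /home/sam/documents will@wordsworth:/media/backup/documents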
There's one option that wasn't listed previously but is still a good idea to include when you're figuring
out how your rsync command will be structured: -n (or --dry-run ). If you include that option, rsync
runs, but doesn't actually delete or copy anything. This can be a lifesaver if your choices would have
resulted in the deletion of important files. Before committing to your rsync command, especially if
you're including the --delete option, do yourself a favor and perform a dry run first!
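For example, prepending -n to the backup command used in this section previews every deletion and transfer without performing any of them; once the list looks right, drop the -n and run it for real:
$ rsync -n -v --progress --stats -r -t -p -l -z -e ssh --delete /home/sam/documents/ will@wordsworth:/media/backup/documents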
Now on to the options you used. The -v (or --verbose ) option, coupled with --progress , orders rsync
to tell you in detail what it's doing at all times. You saw that in the results shown earlier in this section,
in which rsync tells you what it's deleting and what it's copying. If you're running rsync via an
automated script, you don't need this option, although it doesn't hurt; if you're running rsync
interactively, this is an incredibly useful display of information to have in front of you because you can
see what's happening.
The metadata you saw at the end of rsync's results (the information about the number and size of files
transferred, as well as other interesting data) appeared because you included the --stats option. Again,
if you're scripting rsync, this isn't needed, but it's sure nice to see if you're running the program
manually.
You've seen the -r (or --recursive ) option many other times with other commands, and it does here
what it does everywhere else. Instead of stopping in the current directory, it tunnels down through all
subdirectories, affecting everything in its path. Because you want to copy the entire documents
directory and all of its contents, you want to use -r .
The -t (or --times ) option makes rsync transfer the files' modification times along with the files. If you
don't include this option, rsync cannot tell what it has previously transferred, and the next time you run
the command, all files are copied over again. This is probably not the behavior you want, as it
completely obviates the features that make rsync so useful, so be sure to include -t .
Permissions were discussed in Chapter 7, "Ownerships and Permissions," and here they reappear.
The -p (or --perms) option tells rsync to update permissions on any files found on the target so
they match what's on the source. It's part of making the backup as accurate as possible, so it's a good
idea.
When a soft link is found on the source, the -l (or --links ) option re-creates the link on the target.
Instead of copying the actual file, which is obviously not what the creator of a soft link intended, the link
to the file is copied over, again preserving the original state of the source.
Even over a fast connection, it's a good idea to use the -z (or --compress ) option, as rsync then uses
gzip compression while transferring files. On a slow connection, this option is mandatory; on a fast
connection, it saves you that much more time.
In the name of security, you're using the -e (or --rsh=ssh ) option, which tells rsync to tunnel all of its
traffic using ssh . Easy file transfers, and secure as well? Sign me up!
Note
If you're using ssh , why didn't you have to provide a password? Because you used the
technique displayed in "Securely Log In to Another Machine Without a Password " to remove
the need to do so.
We've saved the most dangerous for last: --delete. If you're creating a mirror of your files, you
obviously want that mirror to be as accurate as possible. That means deleted files on the source need to
be deleted on the target as well. But that also means you can accidentally blow away stuff you wanted
to keep. If you're going to use the --delete option (and you probably will), be sure to use the -n (or --dry-run) option that was discussed at the beginning of this look at rsync's options.
rsync has many other options and plenty of other ways to use the command (the man page identifies
eight ways it can be used, which is impressive), but the setup discussed in this section will definitely get
you started. Open up man rsync on your terminal, or search Google for "rsync tutorial," and you'll find a
wealth of great information. Get to know rsync : When you yell out "Oh no!" upon deleting a file, but
then follow it with "Whew! It's backed up using rsync !" you'll be glad you took the time to learn this
incredibly versatile and useful command.
Tip
If you want to be really safe with your data, set up rsync to run with a regular cron job. For
instance, create a file titled backup.sh (~/bin is a good place for it) and type in the command
you've been using:
rsync --verbose --progress --stats --recursive --times --perms --links --compress --rsh=ssh --delete /home/sam/documents/ will@wordsworth:/media/backup/documents
Use chmod to make the file executable:
$ chmod 744 /home/scott/bin/backup.sh
Then add the following lines to a file named cronfile (I put mine in my ~/bin directory as well):
# backup documents every morning at 3:05 am
05 03 * * * /home/scott/bin/backup.sh
The first line is a comment explaining the purpose of the job, and the second line tells cron to
automatically run /home/scott/bin/backup.sh every night at 3:05 a.m.
Now add the job to cron:
$ crontab /home/scott/bin/cronfile
Now you don't need to worry about your backups ever again. It's all automated for you; just make sure
you leave your computers on overnight!
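As a quick sanity check (not part of the original steps, but standard crontab usage), you can list your installed jobs to confirm that the entry took:
$ crontab -l
# backup documents every morning at 3:05 am
05 03 * * * /home/scott/bin/backup.sh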
(For more on cron , see man cron or "Newbie: Intro to cron" at
www.unixgeeks.org/security/newbie/unix/cron-1.html .)
Download Files Non-interactively
wget
The Net is a treasure trove of pictures, movies, and music that are available for downloading. The
problem is that manually downloading every file from a collection of 200 MP3s quickly grows tedious,
leading to mind rot and uncontrollable drooling. The wget command downloads files and websites
without any further interaction from you: You set the command in motion, and it happily downloads
whatever you specified, for hours on end.
The tricky part, of course, is setting up the command. wget is another super-powerful program that
really deserves a book all to itself, so we don't have space to show you everything it can do. Instead,
you're going to focus on doing two things with wget: downloading a whole mess of files, which is looked
at here, and downloading entire websites, which is covered in the next section.
Here's the premise: You find a wonderful website called "The Old Time Radio Archives." On this website
are a large number of vintage radio shows, available for download in MP3 format (365 MP3s, to be exact,
one for every day of the year). It would sure be nice to grab those MP3s, but the prospect of right-clicking
on every MP3 hyperlink, choosing Save Link As, and then clicking OK to start the download isn't
very appealing.
Examining the directory structure a bit more, you notice that the MP3s are organized in a directory
structure like this:
http://www.oldtimeradioarchives.com/mp3/
season_10/
season_11/
...
season_20/
...
season_3/
season_4/
...
season_9/
Note
The directories are not sorted in numerical order, as humans would do it, but in alphabetical
order, which is how computers sort numbers unless told otherwise. After all, "ten" comes
before "three" alphabetically.
Inside each directory sit the MP3s. Some directories have just a few files in them, and some have close
to 20. If you click on the link to a directory, you get a web page that lists the files in that directory like
this:
[BACK] Parent Directory            19-May-2002 01:03
[SND]  1944-12-24_532.mp3          06-Jul-2002 13:54  6.0M
[SND]  1944-12-31_533.mp3          06-Jul-2002 14:28  6.5M
[SND]  1945-01-07_534.mp3          06-Jul-2002 20:05  6.8M
[SND] is a GIF image of musical notes that shows up in front of every file listing.
So the question is, how do you download all of these MP3s that have different filenames and exist in
different directories? The answer is wget!
Start by creating a directory on your computer into which you'll download the MP3 files.
$ mkdir radio_mp3s
Now use the cd command to get into that directory, and then run wget:
$ cd radio_mp3s
$ wget -r -l2 -np -w 5 -A.mp3 -R.html,.gif http://www.oldtimeradioarchives.com/mp3/
Let's walk through this command and its options.
wget is the command you're running, of course, and at the far end is the URL that you want wget to use:
http://www.oldtimeradioarchives.com/mp3. The important stuff, though, lies in between the command
and the URL.
The -r (or --recursive) option for wget follows links and goes down through directories in search of
files. By telling wget that it is to act recursively, you ensure that wget will go through every season's
directory, grabbing all the MP3s it finds.
The -l2 (or --level=[#]) option is important yet tricky. It tells wget how deep it should go in retrieving
files recursively. The lowercase l stands for level and the number is the depth to which wget should
descend. If you specified -l1 for level one, wget would look in the /mp3 directory only. That would result
in a download of...nothing. Remember, the /mp3 directory contains other subdirectories: season_10,
season_11, and so on, and those are directories that contain the MP3s you want. By specifying -l2 ,
you're asking wget to first enter /mp3 (which would be level one), and then go into each season_#
directory in turn and grab anything in it. You need to be very careful with the level you specify. If you
aren't careful, you can easily fill your hard drive in very little time.
One of the ways to avoid downloading more than you expected is to use the -np (or --no-parent)
option, which prevents wget from recursing into the parent directory. If you look back at the preceding
list of files, you'll note that the very first link is the parent directory. In other words, when in
/season_10, the parent is /mp3. The same is true for /season_11, /season_12, and so on. You don't want
wget to go up, however; you want it to go down. And you certainly don't need to waste time by going
back up into the same directory (/mp3) every time you're in a season's directory.
This next option isn't required, but it would sure be polite of you to use it. The -w (or --wait=[#]) option
introduces a short wait between each file download. This helps prevent overloading the server as you
hammer it continuously for files. By default, the number is interpreted by wget as seconds; if you want,
you can also specify minutes by appending m after the number, or hours with h, or even days with d.
Now it gets very interesting. The -A (or --accept ) option emphasizes to wget that you only want to
download files of a certain type and nothing else. The A stands for accept, and it's followed by the file
suffixes that you want, separated by commas. You only want one kind of file type, MP3, so that's all you
specify: -A.mp3.
On the flip side, the -R (or --reject ) option tells wget what you don't want: HTML and GIF files. By
refusing those, you don't get those little musical notes represented by [SND] shown previously. Separate
your list of suffixes with a comma, giving you -R.html,.gif.
Running wget with those options results in a download of 365 MP3s to your computer. If, for some
reason, the transfer was interrupted (your router dies, someone trips over your Ethernet cable and yanks
it out of your box, a backhoe rips up the fiber coming in to your business), just repeat the command, but
add the -c (or --continue) option. This tells wget to take over from where it was forced to stop. That
way you don't download everything all over again.
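In other words, the resumed command is simply the original one with -c added:
$ wget -c -r -l2 -np -w 5 -A.mp3 -R.html,.gif http://www.oldtimeradioarchives.com/mp3/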
Here's another example that uses wget to download files. A London DJ released two albums' worth of
MP3s consisting of mash-ups of The Beatles and The Beastie Boys, giving us The Beastles, of course.
The MP3s are listed, one after the other, on www.djbc.net/beastles.
The following command pulls the links out of that web page, writes them to a file, and then starts
downloading those links using wget:
$ dog --links http://www.djbc.net/beastles/ | grep mp3 > beastles ; wget -i beastles
--12:58:12--  http://www.djbc.net/beastles/webcontent/djbc-holdittogethernow.mp3
           => 'djbc-holdittogethernow.mp3'
Resolving www.djbc.net... 216.227.209.173
Connecting to www.djbc.net|216.227.209.173|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4,533,083 (4.3M) [audio/mpeg]
100%[=========>] 4,533,083    203.20K/s    ETA 00:00
12:58:39 (166.88 KB/s) - 'djbc-holdittogethernow.mp3' saved [4533083/4533083]
In Chapter 5, "Viewing Files," you learned about cat , which outputs files to STDOUT. The "Concatenate
Files and Number the Lines" section mentioned a better cat , known as dog (which is true because dogs
are better than cats). If you invoke dog with the --links option and point it at a URL, the links are
pulled out of the page and displayed on STDOUT. You pipe those links to grep, asking grep to filter out
all but lines containing mp3 , and then redirect the resulting MP3 links to a text file named beastles
(piping and redirecting are covered in Chapter 4, "Building Blocks," and the grep command is covered in
Chapter 9, "Finding Stuff: Easy").
The semicolon (covered in the "Run Several Commands Sequentially" section in Chapter 4) ends that
command and starts a new one: wget. The -i (or --input-file) option tells wget to look in a file for the
URLs to download, instead of STDIN. If you have many links, put them all in a file and use the -i option
with wget. In this case, you point wget to the beastles file you just created via dog and grep, and the
MP3s begin to download, one after the other.
Now really, what could be easier? Ah, the power of the Linux command line!
Download Websites Non-interactively
wget
If you want to back up your website or download another website, look to wget. In the previous section
you used wget to grab individual files, but you can use it to obtain entire sites as well.
Note
Please be reasonable with wget. Don't download enormous sites, and keep in mind that
someone creates and owns the sites that you're copying. Don't copy a site because you want
to "steal" it.
Let's say you're buzzing around a site at www.neato.com and you find yourself at
www.neato.com/articles/index.htm. You'd like to copy everything in the /articles section, but you don't
want anything else on the site. The following command does what you'd like:
$ wget -E -r -k -p -w 5 -np http://www.neato.com/articles/index.htm
You could have combined the options this way, as well:
$ wget -Erkp -w 5 -np http://www.neato.com/articles/index.htm
As in the previous section, the command begins with wget and ends with the URL you want to use. You
looked at the -w (or --wait=[#]) option before, and that's the same, as well as -np (or --no-parent) and
-r (or --recursive). Let's examine the new options that have been introduced in this example.
When you download a site, some of the pages might not end with .htm or .html; instead, they might
end with .asp, .php, .cfm, or something else. The problem comes in if you try to view the downloaded
site on your computer. If you're running a web server on your desktop, things might look just fine, but
more than likely you're not running Apache on your desktop. Even without a web server, however,
pages ending in .htm or .html will work on your box if you open them with a web browser. If you use
the -E (or --html-extension) option, wget converts every page so that it ends with .html, thus enabling
you to view them on your computer without any special software.
Downloading a site might introduce other issues, however, which you can fortunately get around with
the right wget options. Links on the pages you download with wget might not work after you open the
pages on your computer, making it impossible to navigate from page to page. By specifying the -k (or
--convert-links) option, you order wget to rewrite links in the pages so they work on your computer.
You'll be glad you used it.
Speaking of Cascading Style Sheets (CSS) and images, they're why you want to use the -p (or --page-requisites) option. In order for the web page to display correctly, the web developer might have
specified images, CSS, and JavaScript files to be used along with the page's HTML. The -p option
requires that wget download any files needed to display the web pages that you're grabbing. With it,
looking at the page after it's on your machine duplicates what you saw on the Web; without it, you
might end up with an unreadable file.
The man page for wget is enormously long and detailed, and it's where you will ultimately end up if you
want to use wget in a more sophisticated way. If you think wget sounds interesting to you, start reading.
You'll learn a lot.
Download Sequential Files and Internet Resources
curl
At first blush, wget and curl seem similar: Both download files non-interactively. Among the many
smaller differences, however, one large one distinguishes them: curl supports sequences and
sets in specifying what to download, which wget does not, while wget supports recursion, a feature
missing from curl.
Note
The programs have plenty of other differences. The full list of curl's features can be seen at
"FeaturesWhat Can curl Do" (http://curl.haxx.se/docs/features.html), while some of wget's
are listed at "Overview"
(www.gnu.org/software/wget/manual/html_node/Overview.html#Overview). The cURL site
has a chart comparing curl to other, similar programs at "Compare cURL Features with Other
FTP+HTTP Tools" (http://curl.haxx.se/docs/comparison-table.html); while informative, the
chart is (unsurprisingly) a bit biased toward curl.
Here's an example that uses curl's capability to support sequences in specifying what to download. The
excellent National Public Radio show This American Life makes archives of all of its shows available for
download on its parent website in Real Audio format (why they chose Real and not a more open format
is a mystery). If you want to download 10 of these Real Audio files, just use the following:
$ curl -O http://www.wbez.org/ta/[1-10].rm
[1/10]: http://www.wbez.org/ta/1.rm --> 1.rm
--_curl_--http://www.wbez.org/ta/1.rm
Notice how you used [1-10].rm to specify that you wanted to download 1.rm, 2.rm, 3.rm, and so on. If
WBEZ had instead named the files one.rm, two.rm, and three.rm , for example, you could have used a
part set instead:
$ curl -O http://www.wbez.org/ta/{one,two,three}.rm
The -O (or --remote-name) option is absolutely required. If you don't use it, curl writes the output of the
download to STDOUT, which means that your terminal will quickly fill with unusable gobbledygook. The
-O option asks curl to write out what it downloads to a file, and to use the name of the file being downloaded as
the local filename as well.
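If you'd rather not keep the remote filenames, curl can also reference each sequence element in the output name: with the -o option, #1 stands in for the current value of the first bracketed range. The prefix used here is just an example:
$ curl "http://www.wbez.org/ta/[1-10].rm" -o "tal_#1.rm"
That saves 1.rm as tal_1.rm, 2.rm as tal_2.rm, and so on.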
We've barely scratched the surface of curl. Its man page, while not as long as wget's, is also full of useful
information that you need to read if you're going to maximize your use of curl. Consider it required
reading.
Conclusion
The commands in this chapter are some of my personal favorites, as they allow me to accomplish tasks
that would otherwise be overly tedious, unsafe, or difficult. All of them make the network just another
avenue for your data, your programs, and your imagination, and that's really where Linux shines. By
making tools as powerful as ssh , rsync, wget, and curl freely available to users, Linux encourages
innovation and safety in a way that's just not possible on another operating system that many people
are forced to use. Jump in and learn ssh so you can securely connect to other computers, master rsync
for safe and automated backups, and rely on wget and curl to specifically download needed content in
as efficient a manner as possible. You'll be glad you did!
Chapter 16. Windows Networking
Samba is one of the most important open-source projects in the world because it makes it possible for
Linux (and other Unix machines, such as Mac OS X) to use Server Message Block (SMB), the networking
protocol used on all Microsoft Windows machines. With Samba, Unix machines can connect to and
mount shares on Windows machines, and print to shared printers connected to Windows machines. Unix
machines can also set up Samba-based printer and file shares that Windows machines can connect to
and use. In fact, Samba is so useful that you don't even need Windows machines in the equation. You
might choose to implement Samba for file and printer sharing on a network of Linux machines because
it works so well.
Many good books cover how to set up Samba on a server, so we're not going to cover administrative
commands such as smbd, smbcacls , or smbpasswd (which, it is true, can be run by normal users to
change their own passwords, but in practice is almost always run by the Samba server admin), or how
to configure smb.conf . Instead, this chapter assumes you already have SMB shares set up on a
Windows, Linux, or Mac OS X machine. You're instead going to focus on the client end of things: how to
find out where those shares are, how to connect to those shares, and how to mount shares on your hard
drive.
Tip
Some good books on setting up and administering Samba include
Sams Samba Unleashed, ISBN: 0672318628; by Steve Litt
Sams Teach Yourself Samba in 24 Hours, 2nd Edition, ISBN: 0672322692; by Gerald Carter
The Official Samba-3 HOWTO and Reference Guide at
http://samba.org/samba/docs/man/Samba-HOWTO-Collection/ is excellent.
Discover the Workgroup's Master Browsers
nmblookup -M [Master Browser]
nmblookup -S [NetBIOS name]
nmblookup -A [IP address]
A Samba server actually uses two daemons: smbd, which makes the shares available, and nmbd, which
maps the NetBIOS names that identify machines using SMB to their IP addresses, thus making it
possible to find and browse SMB shares. You're going to focus for now on commands that communicate
with nmbd, used to gain some overall information about the Windows workgroup you're querying.
Note
The following examples assume you're using a Windows workgroup, and not a domain. A
workgroup is essentially a small group of machines that choose to identify themselves as
belonging together by self-identifying as members of the workgroup. A domain uses a central
server (or several servers in a large network) to authenticate computers and users who want to
join the network. Domains are large, hairy, complicated beasts, and you're best referred to
the books referenced in this chapter's introduction if you want to learn more about them.
In a Windows workgroup, you need a machine that keeps track of the other members of the
workgroup (what their SMB names and IP addresses are, for instance). That machine is known as the
Master Browser. But which computer in a workgroup is the Master? The one that is elected, based on
the operating system it's running. The latest and greatest OS always wins, so XP will always beat 2000,
which will always beat 98.
When setting up a Samba server, however, it's possible to configure things so the server always stays
out of any such election, letting other machines duke it out, or set things up so that the server always
wins any election. You could go ahead and start connecting to Samba shares if you knew about them,
but it helps to know where your Master Browsers are in case you have any issues.
To query your network for a Master Browser, run nmblookup with the -M (or --master-browser) option,
followed by a - at the end, which basically means "find me a Master Browser." The problem is that a
bare - on the command line looks like the start of an option. So you need to preface it with --, which
signals that the - that follows is an argument, not an option.
$ nmblookup -M -- -
querying __MSBROWSE__ on 192.168.1.255
192.168.1.151 __MSBROWSE__<01>
192.168.1.104 __MSBROWSE__<01>
This isn't good. In your case, you appear to have two Master Browsers on one network, which can be a
real problem because different Masters might know about different machines at different times, causing
mass confusion to users. One minute a user can get to a machine, and the next it's gone. Why? Because
one Master knows about the machine, but the other doesn't. Try explaining that to Bob in Accounting.
Yipes.
To get more information about the Master, use nmblookup with the -S (or --status ) option, which
returns the SMB names the host uses.
$ nmblookup -S 192.168.1.151
querying 192.168.1.151 on 192.168.1.255
name_query failed to find name 192.168.1.151
That didn't work because -S expects a NetBIOS name instead of an IP address. Unfortunately, you don't
know the machine's NetBIOS name, only its IP address. Actually, that's not a problem. You simply add
the -A (or --lookup-by-ip) option, which tells nmblookup that you're giving it an IP address instead of a
NetBIOS name.
$ nmblookup -SA 192.168.1.151
Looking up status of 192.168.1.151
        JANSMAC         <00> -         B <ACTIVE>
        JANSMAC         <03> -         B <ACTIVE>
        JANSMAC         <20> -         B <ACTIVE>
        ..__MSBROWSE__. <01> - <GROUP> B <ACTIVE>
        MILTON          <00> - <GROUP> B <ACTIVE>
        MILTON          <1d> -         B <ACTIVE>
        MILTON          <1e> - <GROUP> B <ACTIVE>

        MAC Address = 00-00-00-00-00-00
Now you know that the machine at 192.168.1.151 identifies itself as JANSMAC (it must be a Mac OS box)
and is the Master for the MILTON workgroup. What about the other IP address?
$ nmblookup -SA 192.168.1.104
Looking up status of 192.168.1.104
        ELIOT           <00> -         B <ACTIVE>
        ELIOT           <03> -         B <ACTIVE>
        ELIOT           <20> -         B <ACTIVE>
        ..__MSBROWSE__. <01> - <GROUP> B <ACTIVE>
        TURING          <00> - <GROUP> B <ACTIVE>
        TURING          <1d> -         B <ACTIVE>
        TURING          <1e> - <GROUP> B <ACTIVE>

        MAC Address = 00-00-00-00-00-00
The computer found at 192.168.1.104 has a NetBIOS name of ELIOT, and is the Master for the TURING
workgroup. So actually you have nothing to worry about, as the two machines are each a Master for a
separate Workgroup. Machines that self-identify as members of MILTON look to JANSMAC for information,
while those that consider themselves part of TURING use ELIOT for the same purpose.
Note
For more information about what the output in these examples means, see
http://support.microsoft.com/kb/q163409/ in Microsoft's Knowledge Base.
Query and Map NetBIOS Names and IP Addresses
nmblookup -T
You can use nmblookup as a quick way to find any machines that are sharing files and printers via
Samba. Append the -T option to the command, followed by " * " (and yes, you must include the
quotation marks to tell the shell that you're not using the * as a wildcard representing the files in your
current working directory).
$ nmblookup -T " * "
querying * on 192.168.1.255
192.168.1.151 *<00>
192.168.1.104 *<00>
192.168.1.10 *<00>
Three machines are on this LAN with Samba shares. To find out which of those are Master Browsers, see
the previous section; to produce a list of the shares offered by each computer, proceed to the next
section.
List a Machine's Samba Shares
smbclient
So you know that a machine has Samba shares on it, thanks to the commands outlined in the previous
section, but you don't know what those shares are. The smbclient command is a versatile tool that can
be used to connect to shares and work with them; at a simpler level, however, it can also list the
shares available on a computer. Just use the -L (or --list) option, followed by the NetBIOS name or IP
address. When prompted for a password, simply press Enter.
$ smbclient -L ELIOT
Password:
Anonymous login successful
Domain=[TURING] OS=[Unix] Server=[Samba 3.0.14a-Ubuntu]

        Sharename       Type      Comment
        print$          Disk      Printer Drivers
        documents       Disk      Shared presentations and other files
        IPC$            IPC       IPC Service (eliot server (Samba, Ubuntu))
        ADMIN$          IPC       IPC Service (eliot server (Samba, Ubuntu))

Anonymous login successful
Domain=[TURING] OS=[Unix] Server=[Samba 3.0.14a-Ubuntu]
In this case, you can see all of the shares that are available to an anonymous logon or that have been
marked as browseable in the server's smb.conf file. To see the shares available to you if
you were to log on, add the -U (or --user) option, followed by your Samba username as found on the
Samba server.
Note
Your Samba username might or might not be the same as your Linux (or Windows or Mac OS
X) username on the Samba server. You need to look on the Samba server to make sure.
$ smbclient -L ELIOT -U scott
Password:
Domain=[ELIOT] OS=[Unix] Server=[Samba 3.0.14a-Ubuntu]

        Sharename       Type      Comment
        print$          Disk      Printer Drivers
        documents       Disk      Shared presentations and other files
        IPC$            IPC       IPC Service (eliot server (Samba, Ubuntu))
        ADMIN$          IPC       IPC Service (eliot server (Samba, Ubuntu))
        scott           Disk      Home Directories

Domain=[ELIOT] OS=[Unix] Server=[Samba 3.0.14a-Ubuntu]
When you log in, you now see a new share, scott: the home directory of this user. This also tells you that
you can log in, which brings you to the next section, in which you log in and actually use the stuff you
find on the share.
Tip
If you want to test the shares you just created on a Samba server, an excellent method is to
open a shell on that box, and then enter the following:
$ smbclient -L localhost
When prompted for a password, press Enter. This way you can quickly see if the new share
you added is available. Of course, if you didn't make the share open to browsing, you'll need
to log on as a user who can view that share using the -U option.
Access Samba Resources with an FTP-Like Client
smbclient
After you know the shares available on a Samba server, you can log in and start using them. Use the
smbclient command, but in this format:
smbclient //server/share -U username
You can't just log in to a server; instead, you need to log in to a share on that server. To access
password-protected items (and it's certainly advisable to password-protect your shares), you need to
specify a user. To access the documents share on the server ELIOT, you'd use the following:
$ smbclient //eliot/documents -U scott
Password:
Domain=[ELIOT] OS=[Unix] Server=[Samba 3.0.14a-Ubuntu]
smb: \>
You'll know you were successful because you'll be at the Samba prompt, which looks like smb: \>.
Notice that you're prompted for a password. Assuming that the password for the scott user was 123456
(a very bad password, but this is just hypothetical), you could have entered this command instead, and
you wouldn't be prompted for a password:
$ smbclient //eliot/documents -U scott%123456
This is a terrible idea, however, because anyone looking at either your .bash_history file (discussed in
the "View Your Command-Line History" section in Chapter 11, "Your Shell") or the list of running
processes with ps (see "View All Currently Running Processes" in Chapter 12, "Monitoring System
Resources") could see your password. Never append your password onto your username. It's just good
security practice to enter it when prompted.
Tip
If you're writing a script and you need to log in without interaction, you still shouldn't append
the password onto the username. Instead, use the -A (or --authentication-file=[filename])
option, which references a credentials file. Using the scott user, that file would contain the
following:
username = scott
password = 123456
Make sure you use chmod (see Chapter 7, "Ownerships and Permissions") to set the
permissions on that file so it isn't readable by the world.
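For example, if the credentials file were saved as ~/.smb_credentials (the name is arbitrary), this locks it down so only its owner can read or change it:
$ chmod 600 ~/.smb_credentials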
After you're connected to a Samba share, you can use many of the commands familiar to those who
have ever used FTP on the command line (see Table 16.1).
Table 16.1. Important smbclient Commands

Command   Meaning
cd        Change directory
exit      Close the connection to the Samba server
get       Copy the specified file to the local machine
help      Get help on commands
lcd       Change the directory on the local machine
ls        List files in the working directory on the Samba server
mget      Copy all files matching a pattern to the local machine
mkdir     Create a new directory on the Samba server
mput      Copy all files matching a pattern to the Samba server
put       Copy the specified file to the Samba server
rm        Remove the specified file from the Samba server
You can view other commands by simply typing help after you've logged in to the Samba server, but
Table 16.1 contains the basics. When you're finished, simply type exit, and you log out of the server
and are back on your own machine.
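Here's a brief hypothetical session using a few of those commands; the directory and file names are invented for illustration:
smb: \> ls
smb: \> cd presentations
smb: \presentations\> get q3_budget.odp
smb: \presentations\> lcd /home/scott/downloads
smb: \presentations\> get speaker_notes.txt
smb: \presentations\> exit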
Mount a Samba Filesystem
smbmount
smbumount
It's cool that you can use smbclient to access Samba shares, but it's also a pain if you want to use a
GUI, or if you want to do a lot of work on the share. In those cases, you should use smbmount, which
mounts the Samba share to your local filesystem as though it were part of your hard drive. After it's
mounted, you can open your favorite file manager and access files on the share easily, and you can
even launch programs such as word processors and open files on the share directly, with a minimum of fuss.
To mount a Samba share on your filesystem, you must create a mount point onto which Linux will graft
the share. For instance, if you were going to mount the documents share from ELIOT , you might create
a directory like this:
$ mkdir /home/scott/eliot_documents
Now you run the smbmount command, along with a whole host of options.
$ smbmount //eliot/documents /home/scott/eliot_documents -o credentials=/home/scott/credentials_eliot,fmask=644,dmask=755,uid=1001,gid=1001,workgroup=TURING
smbmnt must be installed suid root for direct user mounts (1000,1000)
smbmnt failed: 1
Before looking at the options, let's talk about why smbmount failed. The second line of output tells you why:
smbmnt, which smbmount calls, can only be run by root. This can be a problem if you want your non-root users
to be able to mount Samba shares. The way around that is to set smbmnt as suid root (you learned
about suid in Chapter 7's "Set and Then Clear suid").
Note
Yes, you're giving ordinary users the ability to use smbmnt as though they were root, but it's
not really a big problem. It's certainly better than giving them the root password, or requiring
your participation every time they want to mount a Samba share.
Here's how to set smbmnt as suid root.
Note
In order to save space, some of the details you'd normally see with ls -l have been removed.
# ls -l /usr/bin/smbmnt
-rwxr-xr-x 1 root root /usr/bin/smbmnt
# chmod u+s /usr/bin/smbmnt
# ls -l /usr/bin/smbmnt
-rwsr-xr-x 1 root root /usr/bin/smbmnt
Now let's try smbmount again.
$ smbmount //eliot/documents /home/scott/eliot_documents -o credentials=/home/scott/credentials_eliot,fmask=644,dmask=755,uid=1001,gid=1001,workgroup=TURING
$ ls -F /home/scott/eliot_documents
presentations/  to_print/
Let's walk through the command you used. After smbmount , //eliot/documents specifies the Samba
server and Samba share to which you're connecting. Then comes the path to your mount point,
/home/scott/eliot_documents . The -o option indicates that options are coming next.
Instead of credentials=/home/scott/credentials_eliot , you could have used any of the following:
username=scott,password=123456
username=scott%123456
username=scott
The first and second choices are completely unsafe because the password would now show up in
.bash_history and ps . Don't do that! The final choice prompts you for a password, which would be safe
because the password wouldn't appear in either .bash_history or ps . But you're trying to automate the
process of mounting the Samba share, so the last choice wouldn't work, either.
That leaves you with credentials=/home/scott/credentials_eliot . This tells smbmount that the
username and password are stored in a file, in this format:
username = scott
password = 123456
This is just like the credentials file discussed previously in "Access Samba Resources with an FTP-Like
Client ," and as in that example, use chmod after you create a credentials file to tightly limit who can
view that file. If you don't care about automating the logon process, by all means, use username=scott
and get prompted to enter the password, which is certainly the safest option.
The fmask and dmask options respectively control the default permissions for any new files and
directories that you create in the mounted Samba share. 644 would produce rw-r--r-- for files, while
755 would end up with rwxr-xr-x for directories.
Note
Don't recall what we're talking about? Take a look back at "Understand the Basics of
Permissions " in Chapter 7 .
Remember that all usernames and group names are really references to numbers, which you can see in
/etc/passwd and /etc/group , respectively. Your user ID number on your current machine might be
1000, but on the Samba server, 1000 could be a completely different user, with the same problem for
your group ID. With uid=1001 and gid=1001 , you tell the Samba server who you are on the Samba
server . In other words, you need to look those numbers up in /etc/passwd and /etc/group on the
Samba server, not on your local machine. If you don't, you may create files and directories, only to find
that you don't really own them and can't use them the way you want.
Finally, workgroup=TURING specifies the workgroup, which is obvious enough.
After the share is mounted, you can open your file manager and start browsing the contents of
eliot_documents , start OpenOffice.org and directly open a file inside eliot_documents/presentations/ ,
or open a PDF in eliot_documents/to_print and send it to the printer. If you place the following line in
your machine's /etc/fstab file, the share is mounted automatically.
//eliot/documents /home/scott/eliot_documents smbfs credentials=/home/scott/credentials_eliot,fmask=644,dmask=755,uid=1001,gid=1001,workgroup=TURING 0 0
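With that line in place, you can test the entry immediately (as root) by giving mount just the mount point; this is standard fstab behavior, not anything specific to Samba:
# mount /home/scott/eliot_documents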
Caution
Be very careful changing your /etc/fstab file because mistakes can render your system
unbootable. Sure, you can fix that problem (see my book Hacking Knoppix for some tips) but
it's still a pain that would be better to avoid with caution and prudence. For more on the fstab
file, see man fstab on your system.
If you decide that you don't want to access the Samba share at eliot_documents any longer, you can
unmount it with smbumount (notice that's umount , not unmount ).
$ smbumount eliot_documents
smbumount must be installed suid root
How frustrating! You need to set smbumount as suid root, too.
$ ls -l /usr/bin/smbumount
-rwxr-xr-x 1 root root /usr/bin/smbumount
$ sudo chmod u+s /usr/bin/smbumount
$ ls -l /usr/bin/smbumount
-rwsr-xr-x 1 root root /usr/bin/smbumount
Now the user that mounted eliot_documents (and only that user) can unmount eliot_documents. Root
can unmount it, too, because root can do whatever it wants.
$ smbumount eliot_documents
And that's it. If you want to verify that smbumount worked, try ls eliot_documents , and there shouldn't
be anything in there. You're disconnected from that Samba share.
Conclusion
Much as Linux users would like to pretend that Microsoft doesn't exist, you have to realize that (for
now) you live in a Windows world. Samba makes coexistence that much easier because it allows Linux
users to access resources on Windows machines, and even makes resources available on a Linux box to
users of other operating systems. Microsoft might not like it very much, but Samba is a mature, stable,
powerful technology, and thanks to its open-source nature, it has helped bring the entire world of
computing more closely together. Samba helps you reduce the need for Windows servers in homes and
businesses, and that, as a certain doyenne might say, is a good thing.
Index
! (bang) 2nd
$( ) (command substitution)
&& (double ampersand)
* (asterisk) 2nd 3rd
- (dash)
-R (dash R) option to commands
. (period)
.so (shared object) file
/ (forward slash)
" (quotation marks)
in regular expressions
single quotes (')
: (colon)
; (semicolon)
< > (angle brackets)
double right angle brackets
right angle brackets
? (question mark)
[ ] (square brackets)
\ (backslash)
_ (underscore)
` (backtick)
{} (curly braces)
| (pipe)
double pipe
text editors and
~ (tilde)
alias
permanent aliases
removing aliases
temporary aliases
viewing specific command aliases
angle brackets (< >)
double right angle brackets
right angle brackets
apropos command
apt utility
apt configuration file
apt-cache search command
apt-get clean command
apt-get install command
apt-get remove command
apt-get upgrade command
Get and Ign
sudo and
troubleshooting
archiving 2nd 3rd 4th
bzip2 command
gzip command
tar command [See tar command.]
zip command
asterisk (*) 2nd 3rd
ath0
backslash (\)
backtick (`)
backups
and secure transfer between hosts
cp command, using
bang (!) 2nd
bash shell
bunzip command
bunzip2 command
bzip2 command 2nd
case sensitivity, filenames
cat command
cd command
chgrp command
chmod command
-R option
alphabetic notation
numeric notation
sgid, setting and clearing
sticky bit , setting and clearing
suid, setting and clearing
chown command
colon (:)
color-coded output, ls command
columnar output, ls command
comma-separated listing, ls command
command documentation [See also info command, man command.]
apropos command
command defaults, identifying
manpages [See default printer, manpages.]
path, source files, and manpages, finding
searching
synopses
commands
-R options to
appending output to a file
canceling
command aliases [See alias.]
command stacking
command substitution
files, using for command input
output redirection 2nd
output, searching with grep
repeating commands
Common Unix Printing System [See default printer, CUPS, canceling print jobs to.]
compress command
compressed files, testing for corruption 2nd
concatenate
cp command
creating backups using
directories, copying
interactive option
verbose option
wildcards, using
Ctrl+c command
CUPS (Common Unix Printing System) [See also files, printing.]
CUPS Software Users Manual
Linux Journal article
URIs and
curl command
curly braces ({})
dashes (-)
DEB packages
default printer
CUPS
canceling print jobs to
determining
printing to
job ID numbers
Linux Journal overview
lpstat, collecting system information with
manpages
print jobs
canceling all
listing
listing corresponding printers
multiple copies, printing
to any printer
to any printer, canceling
to default, canceling
printer connections
printer status, determining
deleting directories
deleting files
accidental deletes, preventing
verbose option
with special characters in the filename
with wildcards
df command
dhclient command
DHCP (Dynamic Host Control Protocol)
directories
changing
contents of, listing [See ls command.]
copying
creating new
creating with included subdirectories
current directory, determining
deleting with contents
disk space usage, totals only
empty, removing
file space usage, monitoring
moving and renaming
ownerships [See ownership of files.]
permissions [See permissions.]
Directory node, info command
disk usage, monitoring
DNS (Domain Name System)
dog command
domains
double ampersand (&&)
double bang (!!)
double pipe (||)
double quotes in regular expressions
double right angle brackets
dpkg command
du command
Dynamic Host Control Protocol (DHCP)
echo command
encryption keys
environment variables
ESSID (Extended Service Set Identifier)
eth0
Ethernet cards 2nd
executables, finding
Expert Info, info command
Extended Service Set Identifier (ESSID)
files
archiving [See archiving.]
command input, using for
compressing
copying
avoiding overwrites
backups
cat command and output redirection, using
wildcards, using
copying between hosts
deleting
accidental deletes, preventing
special characters, handling
wildcards, using
empty files, creating
file attributes
file descriptors
file size, listing
file type, displaying
filenames
finding [See find command; ; locate command.]
hidden files, viewing
listing [See ls command.]
moving
open files, listing
ownerships [See ownership of files.]
permission and type characters
permissions [See permissions.]
printing
renaming
searching in [See grep command.]
secure network transfers
secure transfer and backup between hosts
time, changing on
users of, listing
viewing
beginning bytes
beginning lines
cat command, using
last ten lines
one screen at a time
repeated viewing, appended updates
find command
-a (AND) option
-exec option
-fprint option
-n (NOT) option
-o (OR) option
file type
name
ownership
size
Flags
forward slash (/)
free command
Gateways
Genmasks
GID (Group ID)
grep command
-R (recursive) option
case-insensitive option
limiting results to matched files
line numbers of results, showing
outputting lines near matched terms
outputting names of files containing matches
regular expressions
searching command output
searching for negative matches
searching text files for patterns
searching within results
versions of
whole words, limiting searches to
Group ID (GID)
gunzip command
gzip command 2nd
head command
bytes option
multiple files, using with
number option
hidden files, viewing
history command
.bash_history file, security concerns
repeating commands
repeating the last command
home directory 2nd
host command
ICMP messages
ifconfig command 2nd
MAC address spoofing
network configuration using
network inspection using
ifdown command
ifup command
import command
info command [See also searching, command documentation.]
Directory node
Expert Info
navigation
input redirection
input/output streams
installing software [See package management.]
IP addresses 2nd
IPv4 and IPv6
iwconfig command
connection encryption schemes
ESSID
WEP
wireless interface configuration
wireless interface status, checking
wireless network topologies
job ID numbers, printing
kill command
left angle bracket (<)
less command
links, online resources concerning
Linux distributions
lo
locate command
case-insensitive option
piping output to pagers
slocate and
updating the database
log files, repeated viewing of appended updates
loopback address
lpq command
lpr command
lprm command
lpstat command 2nd
ls command
color-coded output
columnar output
comma-separated listing
directory contents, listing
file size output
file type, displaying
files, listing
hidden files and folders, viewing
listing ownership, permissions and size
long option
piping output to less
reverse-order output
sorting
lsof command
files, listing users of
piping to less
specific program processes, listing
users' open files
MAC (Media Access Control) addresses 2nd
man command [See also searching, command documentation.]
apropos option
displaying command synopses
manpage classification
manpage sections
manpage updates
printing output
man pages [See also man command.]
command classifications
databases, rebuilds and updates
finding
printing out
whatis, accessing with
Master Browsers
memory usage, monitoring
mkdir command
mtr command
mv command
nameserver
navigation, info command
networking
activating a connection
deactivating a connection
DHCP
DNS
encryption
file downloads
IP addresses
IP routing tables
network configuration [See ifconfig command.]
network topologies
networking devices
non-interactive website downloads
ping, checking connections with
remote logins [See ssh command.]
routing tables, modifying
secure file copying between hosts
secure file transfer and backup
secure file transfers
sequential file downloads
troubleshooting
wireless interfaces [See iwconfig command.]
with Windows machines [See Samba.]
nmbd
nmblookup command
-M option
-S option
-T option
finding file and print shares
Master Browsers, querying for
noclobber option
nodes
numeric permissions
octal permissions
output redirection
ownership of files
changing
group owners, changing
listing
package management
Debian distributions
apt [See apt utility.]
downloadable packages, finding
handling dependencies [See apt utility.]
RPM-based distributions
handling dependencies [See yum (Yellow Dog Updater, Modified), yum install command.]
installing packages
removing packages
pagers
Parent Process ID (PPID) number
paths
current directory path, displaying
finding
period (.)
permission characters
permissions 2nd
alphabetic notation, modifying with
file attributes, abbreviations for
listing
numeric notation
recursive application to files and directories
sticky bit, setting and clearing
suid, setting and clearing
PID (process ID) number
ping command
pipe (|)
PPID (Parent Process ID) number
printing
command-line compatible formats
CUPS [See default printer, CUPS, canceling print jobs to.]
process ID (PID) number
process trees
programs, non-pipable
ps command
grep, using with
piping output to grep
process trees
STAT letters
user-owned processes
ps2pdf command
pump command
pwd command
question mark (?)
quotation marks (") 2nd
RAM usage, monitoring
redirection of output
regular expressions (regexp)
remote logins [See ssh command.]
renaming files and folders
reverse-order output, ls command
right angle bracket (>)
rm command
interactive option
recursive force option
verbose option
wildcards and
rmdir command
root
route command
rpm command
RPM packages
rsync command
S
Samba
domains
finding file and print shares
logging in to shares
Master Browsers, discovering
mounting a filesystem
Samba prompt
Samba shares, listing
server daemons
unmounting a filesystem
user and group names on the Samba server
Windows workgroups
scp command
screenshots
searching
command documentation
for files [See also locate command.]
in files [See grep command.]
security of remote logins [See ssh command.]
semicolon (;)
sftp command
sgid
shared object (.so) file
single quotes
sleep command
slocate command
SMB (Server Message Block) protocol
smbclient command
-A option
-U option
listing Samba shares
logging in to shares
smbd
smbmnt command, setting as suid root
smbmount command
smbumount command
soft links, moving
software installation [See package management.]
sorting ls command output
date and time
file extension
size
source files, finding
spaces
special characters
in filenames
in regular expressions
spoofing
square brackets ([ ])
ssh command
HashKnownHosts option
known hosts file
remote logins
OS reinstallations and
passwordless logins
standard input, output, and error
STAT letters
stdin, stdout, and stderr
sticky bit
su command
login option
root, becoming
subfolders, viewing contents of
subnodes
sudo command and apt utility
suid
Synaptic
system monitoring
directory, usage of file space
totals only
file system disk usage
files, listing users of
killing processes
listing open files
listing users' open files
monitoring processes dynamically
process trees
program processes, listing
RAM usage, monitoring
user-owned processes
viewing running processes
T
tail command
multiple files, using with
number option
repeated operation on changing files
tar command
testing tarballs prior to untarring
untarring and uncompressing tarballs
using with compression utilities
text editors
tilde (~)
time, changing on files
top command
touch command
empty files, creating with
files, changing to arbitrary time
files, updating to current time
traceroute command
U
UID (User ID)
unalias command
underscores (_)
unzip command
updatedb command
URIs (Uniform Resource Identifiers) and CUPS
User ID (UID)
users
listing open files of
switching
user abbreviations
W
WEP (Wired Equivalent Privacy)
WEP Key Generator
wget command
non-interactive file downloads
non-interactive website downloads
whatis command
whereis command
which command
Wi-Fi Protected Access (WPA)
wildcards
copying files
deleting files
directory contents, listing
regular expressions, compared to
Windows networking [See Samba.]
Wired Equivalent Privacy (WEP)
wireless cards
wireless networking
workgroups
WPA (Wi-Fi Protected Access)
Y
yum (Yellow Dog Updater, Modified) [See also package management.]
yum install command
yum list available command
yum remove command
yum search command
yum update command
Z
zip command
zombie processes