

GNU Programming Utilities

Michael Benedict

November 1, 2001

Documentation

One of the most important aids in programming is always documentation. Whether it covers the utilities you program with, such as a compiler, or library calls, programming requires a lot of knowledge that should be as easy to look up as possible. While many specific types of documentation exist for specific tools, there are generally three places to find information.

man

If you have used Unix at all, you have probably used man pages. Man pages are divided into sections:

  1. Executable programs or shell commands
  2. System calls (functions provided by the kernel)
  3. Library calls (functions within system libraries)
  4. Special files (usually found in /dev)
  5. File formats and conventions, e.g. /etc/passwd
  6. Games
  7. Macro packages and conventions, e.g. man(7), groff(7)
  8. System administration commands (usually only for root)
  9. Kernel routines (Linux Specific)

Man is invoked by typing man <name> on the command line, where <name> is the name of a program, call, etc. that is documented in the man system. You may also search man pages by using the apropos or man -k commands.

info

Info is the standard format for GNU documentation. It is similar to man, but has some advantages.

It is also searchable, by using info --apropos=SUBJECT. If a program is part of the GNU system, expect its primary documentation to be in info format.

WWW

Well, we live in the information age, and you knew it was coming. There is a virtually limitless amount of information available online. Online documentation is particularly good for libraries and languages, less so for utilities themselves. For a more targeted search, I have found http://www.google.com/linux to be particularly good for technical searches.

The Open Source Software Club at The Ohio State University

Yes, us! Feel free to join our mailing list or hang out on our irc channel. We are happy to answer questions on both. You can find instructions for joining the mailing list at http://opensource.cis.ohio-state.edu/ and the irc channel is #osuoss on irc.openprojects.net.

GCC

GCC, or the GNU Compiler Collection, is an impressive collection of compilers freely available for a great many platforms. To understand how to use gcc more effectively, we need to understand the compilation process.

Compiling a program

Preprocessing

This is the step in which a program's macros are expanded and defined names are replaced with their values. This is also where conditional parts of the program that should not be compiled are removed. Here is a quick example before preprocessing:
#define AUTHOR "Michael Benedict"

#ifdef AUTHOR
	printf(AUTHOR" wrote this program\n");
#else
	printf("This program has no author\n");
#endif

After preprocessing, the above code would look like this:

	printf("Michael Benedict" " wrote this program\n");

Preprocessing extends well beyond definitions. This is why C and C++ source files almost always begin with #include lines, which usually pull in function prototypes and type definitions.
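One way to watch this step happen is to ask GCC to stop after preprocessing. The sketch below assumes gcc is installed; the file names author.c and author.i are made up for illustration, and the exact spacing of the output may differ between GCC versions.

```shell
# Run in an empty scratch directory.
# author.c is a hypothetical file holding the fragment shown above.
cat > author.c <<'EOF'
#define AUTHOR "Michael Benedict"

#ifdef AUTHOR
	printf(AUTHOR" wrote this program\n");
#else
	printf("This program has no author\n");
#endif
EOF

# -E stops after preprocessing; -P leaves out the line markers
# GCC would otherwise add to the output.
gcc -E -P author.c > author.i
cat author.i
```

Only the first printf line should survive, with AUTHOR replaced by the string it was defined as.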

Compiling

In this step, source code is translated from a high level language, like C or C++, to assembler code.

Assembling

This is the step where human-readable assembler code is changed into machine code (the binary language computers understand).

Linking

Programs often use libraries so their authors don't have to write everything from scratch. Linking is the process of combining those libraries with the assembled program so that it can run as an executable.


Using GCC

GCC uses a file's suffix to decide how to treat that file, and command-line flags to decide at which stage to stop. You can use the -o flag to specify the output file. A quick example would be gcc -c -o main.o main.c. Since GCC understands the .c suffix as C program source, and -c tells it to stop before linking, GCC would preprocess, compile, and assemble main.c into the unlinked object file main.o. Without -c, GCC would go on to link and write an executable, whatever the output file is called. You can also let GCC decide the names of the files (usually it changes the suffix, but the default executable name is a.out). Here is a table showing the suffixes, stage of compilation, and GCC flag to compile to that stage.

Compilation Stage   Suffix / Name        GCC flag
Source code         .c, .C, .cpp, ...    NA
Preprocessed        .i, .ii              -E
Compiled            .s                   -S
Assembled           .o                   -c
Linked              a.out / NA           NA (default)

Here is a list of additional flags that I have found helpful:

It is important to remember that this is just the tip of the iceberg as far as GCC options. Check the info page or man page for a complete listing of options.

GDB

The GDB environment

The most important part of obtaining useful information with gdb is to make sure your program has debugging symbols compiled into it, which GCC does when given the -g flag. Please see the Using GCC section.

GDB features emacs-like line editing and tab-autocompletion of commands. Moreover, you only need to specify the unique portion of a command. For example, p will cause gdb to use the print command.

Before using gdb, it is also important to understand the effects of compiler optimization on code: an optimizing compiler may reorder statements or remove variables entirely, so what gdb shows may not match the source line by line. Therefore, if you are unsure or seeing anything odd while using gdb, make sure your program is not compiled with optimization.

Invoking gdb

Launch gdb as gdb progname. From here you will have a text session where you can run and examine your program. Alternatively, you may launch gdb as gdb progname COREFILE, where COREFILE is the name of the core file left by your program. This way, you are left with the exact status of the program when it dumped core. The final way to run gdb is to invoke it as gdb progname PID, where PID is the process ID of a currently running process you wish to debug. Of course, for the last two options the program executable must match the process ID or core file you are calling gdb with.

Now what?

Now you can run your program! To do so, just type 'run' to start your program. Arguments passed to run will be directly passed to the program. However, before you run your program, you will probably want to become familiar with some basic GDB commands. I have divided them into sections.

It is important to note that GDB offers online help by typing 'help' at any time. Moreover, GDB also offers command completion and shortcuts. Utilizing these makes your gdb session much more efficient. Lastly, just hitting enter at the gdb prompt will repeat the last command.

Flow of control

  1. run - As previously mentioned, you can begin executing your program by typing 'run' or 'r'. The arguments you pass after the run command will be passed to your program, as interpreted by your $SHELL. Thus, to debug the usage routine of a program you might run something like this: (gdb) run -help. Other information, such as environment variables and the working directory, is inherited from the gdb environment. To change the environment, use gdb's set environment and unset environment commands. You can also use gdb's cd command to change the working directory before executing the program. Lastly, you can use basic shell redirection in gdb, just not pipes from the program you are debugging to another program. One important note about run is that it remembers the command line arguments. If you have called run with arguments previously in your gdb session, and you want to clear them, use set args.

  2. break - set a breakpoint. The syntax is very flexible. You may set a breakpoint as break main, to break when the function main() is entered. You may also break at a specific file and line number, for example break main.c:52. More interestingly, you can break on certain conditions. A trivial example is break main if argc == 1. The program pauses execution if and only if the condition is true. Thus, calling run -help would not stop the program, but run would. Two other interesting break variations are tbreak, which sets a temporary breakpoint that is removed the first time it is hit, and rbreak, which takes a regular expression and sets a breakpoint on every function whose name matches it.

  3. watch - set a watchpoint. Watchpoints are very similar to breakpoints, but rather than breaking because the program flow has reached a certain point, watchpoints pause the program when data is accessed or changed. Here are the three main watch commands: watch ARG stops when ARG is written, rwatch ARG stops when it is read, and awatch ARG stops when it is either read or written. The format for ARG will be expressed later, but for now specifying a pointer or memory address will work.

  4. step and next - Now that your program has stopped, we need something to do. The step and next commands allow you to single-step through your program. The difference is that next will not step into functions your program calls, while step will. Variations on these commands are stepi and nexti, which single-step by machine instruction rather than by line of code.

  5. continue - If you want to continue executing your program until it completes or the next breakpoint or watchpoint is encountered, use the command continue.

  6. delete - Is that breakpoint no longer useful? Sick of that watchpoint? Use delete to get rid of them. Without an argument, delete removes all breakpoints. To remove one, use that breakpoint's number as the argument to delete. You can list all your breakpoints and watchpoints, with their numbers, by using info breakpoints.

Examining Data

Of course, GDB would not be very useful unless you were able to see the data that is most important to your program. We will look at a few commands that do just that.

  1. print - This is the most basic command to view data. Saying print variable will print the variable if it is in scope. The @ operator lets you treat a region of memory as an array: its left side is the first element and its right side is the length, so print *array@10 will print the first ten elements that array points to. You can also specify the format by using a forward slash. Thus print /x foo will print foo in hex. Another interesting feature is the $ variable, which holds the most recent value in the history. This is useful because, if foo points to a linked list node, you can tell gdb print *foo, and then print *$.next. Just repeatedly hitting enter at this point will cause gdb to traverse and print the list.

  2. x - similar to the print command, x lets you examine larger regions of memory. This could be useful when you have a pointer arithmetic problem, for example. To examine the 64 bytes surrounding ptr, use x/64xb ((char *)ptr - 32); the b asks for single bytes, since x initially counts in four-byte words. Note that I used arithmetic directly in the expression I entered for gdb. This is another powerful feature of gdb.

  3. display - It may get tedious to continually call print to see a value that changes frequently. Instead, display EXPRESSION causes the value to be printed every time the program stops, for example after each step or next.

Miscellaneous Commands

  1. list - The list command prints a code listing of the area around the current line of execution. It is useful to see exactly where you are in the code.

  2. call - You may also, at any part of the program execution, call a function or procedure.

  3. bt - Lastly, you may see the call stack of the executing program by using the bt (backtrace) command.

Front-ends

GDB is a console application, but it has an emacs front end, and at least three projects provide graphical front ends.

GMAKE

Targets, Rules, and Commands

Make is based on these three simple concepts. We will look at each to understand how Make works, using this simple example:

myprog : main.c
	gcc -o myprog -Wall main.c

Targets

This is the name of a file that make should try to update or create. In the example, the target is myprog. In a makefile, the target always appears at the left, followed by a colon. An important note is that the first target is the default target.

Rules

The rules tell Make when to make the target. In this case, the rule requires a file named main.c (Make's own documentation calls such files prerequisites). Moreover, if myprog is newer than main.c, Make knows it has nothing to do for this target.

Commands

These tell make how to actually create or update the target. Usually this is the compiler and the options needed to create the program, but it can vary greatly.
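The three concepts can be seen in action with the example makefile. This is a small sketch, assuming GNU make and gcc are installed; the log file names are made up, and the makefile is written with printf only so that the tab that must start every command line is explicit.

```shell
# Run in an empty scratch directory.
printf 'int main(void) { return 0; }\n' > main.c

# The command line of a rule must begin with a tab, written here as \t.
printf 'myprog : main.c\n\tgcc -o myprog -Wall main.c\n' > Makefile

make                     # myprog does not exist yet, so the command runs
make > second.log        # nothing changed, so make has nothing to do
cat second.log

sleep 1; touch main.c    # make main.c newer than myprog
make > third.log         # the command runs again
cat third.log
```

The second run should only report that myprog is up to date, while the third should show the gcc command being run again.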

Practical Make

One of the really nice features of make is that the rules give a notion of up-to-date. Therefore, working on a large project with many source files, you would probably have a rule that links them all, and then a rule to compile each one. In this way, if you modify just one file, you only need to recompile that file and link it against the rest, which is a very nice time saving feature in comparison with rebuilding the entire project from scratch.

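Another nice feature of make is that it allows for variables, mostly to save on typing. An example would be creating a variable CFLAGS like so

CFLAGS = -g -Wall -Werror -c

Do not put quotes around the value: Make does not strip them, so they would reach the shell and the whole string would be passed to the compiler as a single argument. An important note about Make is that variables should be referenced like so: $(VARIABLENAME).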

A simple example

Below is an example used to produce this talk:

NAME=GNU-utils

all: $(NAME).ps

$(NAME).ps: $(NAME).dvi
	/usr/bin/dvips -o $(NAME).ps $(NAME).dvi

$(NAME).dvi: $(NAME).tex
	/usr/bin/latex $(NAME).tex

rtf: $(NAME).rtf

pdf: $(NAME).pdf

ps: $(NAME).ps

html:
	/usr/bin/latex2html $(NAME).tex

$(NAME).rtf: $(NAME).tex
	/usr/bin/latex2rtf $(NAME).tex -o $(NAME).rtf

$(NAME).pdf: $(NAME).tex
	/usr/bin/pdflatex $(NAME).tex

ispell: $(NAME).tex
	/usr/bin/ispell -t $(NAME).tex

clean:
	/bin/rm -f $(NAME).aux $(NAME).log $(NAME).dvi $(NAME).tex.bak

cleanall: clean
	/bin/rm -f $(NAME).ps $(NAME).pdf $(NAME).rtf

More Advanced Commands

While the sections before this should get you used to some basic concepts of Make, there are many more things you can and should do to improve your makefiles. I will show you some to get you started. Again, there is also much more information not included, please refer to the documentation for complete information.

Implicit Rules

Notice that often you make one object file the same way you make another. This can be generalized in make using implicit rules. To make a .o file, for example, you need a corresponding .c file. You then call the C compiler on it with your CFLAGS and the -c flag. You can use the % sign to pattern match for these rules. The $< variable takes the name of the first file the target depends on (here the .c file) and the $@ variable takes the name of the target. Here is an example (remember that the command line must start with a tab):

%.o : %.c
	gcc $(CFLAGS) -c $< -o $@

Advanced Variables

Not all variables must be stated explicitly.

One last type of variable is a substitution variable. Suppose you have a variable SRCS that contains a list of all .c files needed to build your program. Rather than manually list the object files to be created, you could create the variable OBJS by merely assigning it $(SRCS:.c=.o), which is the value of SRCS with every .c suffix replaced by .o.

Telling Make things

You may also give make extra information through special commands. Here are some.

A More Advanced Example

#set because some systems inherit shell from the environment
SHELL = /bin/sh

# what to build? 
SRCS = $(wildcard *.c)

# useful variables
CFLAGS = -g -O -Wall -Werror -I. -Ilibdb
LIBS = libdb/libglodb.a
TARGET = trace
VERSION = 0.0.1
DEST = /usr/local/
TAGFILES = GPATH GRTAGS GSYMS GTAGS

# useful for other make programs
.SUFFIXES:
.SUFFIXES: .c .o .d

# not real targets!
.PHONY : all clean dep distclean dist doc dvi error info install \
install-strip maintainer-clean TAGS touch uninstall

all : $(SRCS:.c=.o)
	$(CC) $(LDFLAGS) -o $(TARGET) $(SRCS:.c=.o) $(LIBS)

clean: 
	$(RM) $(SRCS:.c=.o) $(SRCS:.c=.d)

debug : CFLAGS += -DDEBUG
debug : all

dep: $(SRCS:.c=.d)

distclean : clean
	$(RM) $(TARGET) $(TARGET)-$(VERSION).tar.gz

dist : maintainer-clean $(TARGET)-$(VERSION).tar.gz

doc :
	@echo ""
	@echo "I am too lazy to document my program, figure it out yourself"
	@echo ""

dvi : doc

error : override CFLAGS += -DERROR
error : all


info : doc

install : all
	@echo ""
	@echo "according to make documentation, the install target does not"
	@echo "strip the target. please run make install-strip if you want a"
	@echo "stripped executable"
	@echo ""
	cp $(TARGET) $(addprefix $(DEST), bin/)

install-strip: install
	strip $(DEST)bin/$(TARGET)

maintainer-clean : distclean
	$(RM) $(TAGFILES) 

TAGS : $(TAGFILES)

touch :
	touch $(SRCS)

uninstall :
	$(RM) $(DEST)bin/$(TARGET)
	

%.d : %.c
	$(CC) -MD $(CFLAGS) -c $<

%.o : %.c
	$(CC) -MD $(CFLAGS) -c $<

%.gz : %
	gzip $<

$(TARGET)-$(VERSION).tar :
	tar -C .. -c -f $@ $(TARGET)-$(VERSION)

$(TAGFILES) : $(SRCS) $(wildcard *.h)
	gtags

inc = 
include $(wildcard *.d)

$(if $(wildcard *.d), $(call $(inc)))

CVS

CVS is free versioning software. While not a product of the GNU project, it has become integral in all fields of software development. CVS is loosely based on an older versioning system, called RCS. CVS allows for multiple users to work on projects simultaneously, as well as being network transparent. RCS lacked both of these features.

Almost all CVS commands are arguments to the cvs(1) command.

The Repository

The CVS repository may either be on the local file system or on a remote server. If it is remote, the recommended way to connect is through ssh(1). To do this, set the CVS_RSH environment variable to ''ssh'' and export it. To specify which CVS repository to use, the cvs command takes a -d argument, which must come before the command name. The format of the repository argument is as follows: :<method>:<path>. A common way of setting up local CVS is to keep it in /usr/local/cvsroot. Therefore, cvs commands would look something like cvs -d :local:/usr/local/cvsroot <command>. Fortunately, rather than specify the repository every time you run cvs, you can export the CVSROOT variable, set to the repository you wish to use. For remote repositories, the method will usually be ext, pserver, or kserver, and the path is prefixed with <user>@<host>:. Setting up remote repositories is beyond the scope of this talk, but not terribly complicated. Suffice it to say, if you are using someone else's repository, they will be able to tell you what CVSROOT you should use.

Creating a Repository

Creating a repository is quite easy. All you need to do is make a directory to store project information in. In our example, we will use /usr/local/cvsroot. You then must run cvs -d :local:/usr/local/cvsroot init so that cvs can initialize the archive. Of course, if you have already set your CVSROOT, you can just run cvs init.

Adding a Project

Once again, creating a new cvs project is quite easy. First, begin the project with some initial source files you will want in the project. Next, you import the sources using the following command: cvs import <directory> <vendor> <release>. The directory you run the import from does matter. If you have the directory src/myproj in your home directory, you will want to cd to the src/myproj directory. Then, using cvs import myproject <vendor> <release> will create myproject as a top-level project in the CVS repository. The <vendor> refers to a tag that identifies the project owner. For individuals, this doesn't matter too much, but if you are doing something for the open source club, you might want the vendor tag to be ''OPENSOURCE_OSU''. The release tag simply creates a tag for the initial import. Something like ''MYPROJ_0_1'' works well here.

After using this command, your editor (specified by either your CVSEDITOR, VISUAL, or EDITOR environmental variables) will be launched for you to edit a commit message. If you wish to instead specify the commit message on the command line, you may use the -m argument, followed immediately by the message. The -m argument is available on all cvs commands that require log messages.

Working With the Project

Since there might be multiple people working on the project with you, from time to time you will want to know how your copy of the files corresponds to the server's repository. To do this you can use the cvs status command.

If you discover that your files are not the same as the files in the archive, or if you have been working on a file and are ready to update the repository, the first step is to bring your project directory up to date. To do this, use the cvs update command. This is where you will do any necessary hand merging of files. If you have made changes to a local file that you want to throw away, you can simply remove the file and a cvs update will bring back the repository's version.

Once you are updated, you may then actually commit your changes to the repository and make them public. To do this, use the cvs commit command. If you are up to date, any files that are ready to be updated in the repository are updated. This is the other situation where you are expected to provide a message.

To add a new file to the repository, use cvs add on the new file or directory. To remove a file or directory, use cvs remove. In both cases, the repository won't be updated until you run cvs commit.

Another useful command is cvs log. This will show you log messages for a given file.

One of the most useful commands is cvs diff. Given a filename, it will do a diff between your current copy of the file and the version in the repository that you started from. To view the differences between your version of a file and a specific revision, use the -r <revision> argument. You can also view the differences between two separate revisions by using two -r arguments.

cvs tag <NAME> filename will tag the current revision of the file filename with the tag NAME.
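The whole cycle can be tried against a throwaway local repository. This is only a sketch, assuming cvs is installed and that you are not running as root (many CVS builds refuse commits from root); the cvsroot-demo directory, the project contents, and the MYPROJ_0_2 tag are made up. It also uses cvs checkout, which is not covered above, to create the working copy.

```shell
# Run in an empty scratch directory.
TOP=$PWD
mkdir -p "$TOP/cvsroot-demo"
export CVSROOT=":local:$TOP/cvsroot-demo"
cvs init

# Import an initial source tree as the top-level project "myproject".
mkdir -p src/myproj
cd src/myproj
printf 'int main(void) { return 0; }\n' > main.c
cvs import -m "Initial import" myproject OPENSOURCE_OSU MYPROJ_0_1
cd "$TOP"

# Create a working copy in ./work (this -d is an option of checkout,
# not the repository option that goes before the command name).
cvs checkout -d work myproject
cd work

printf '/* a change */\n' >> main.c
cvs status main.c                 # how does my copy compare?
cvs diff main.c                   # what did I change?
cvs update                        # bring the working copy up to date
cvs commit -m "Add a comment" main.c
cvs log main.c                    # read the log messages
cvs diff -r 1.1 -r 1.2 main.c     # compare two revisions
cvs tag MYPROJ_0_2 main.c
cd "$TOP"
```

With a freshly imported file, the imported revision is normally 1.1 and the first commit 1.2, which is why those two numbers appear in the last diff.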

Putting it all together

Suppose you change a file and update it in the repository with cvs commit. Then you realize you just introduced a bug into the system. What do you do? Well, you can run cvs diff on the file with two -r arguments, the buggy revision first and the previous good revision second, and redirect the output to a patch file. You can then apply that patch to your working copy using the patch(1) command. All you now have to do is a cvs update and cvs commit and you have a new version of the file that no longer has the bug in it!

Summary

Parting Notes

It is no accident this talk began with how to obtain documentation. For many of these tools, what was presented here was the tip of the iceberg. However, as you learn the tools better, your productivity with them will increase. The last note is that choosing an editor and learning it well is extremely important. All decent unix editors integrate with these tools to some extent, and can make your experience using them even better.

Other Interesting Software

There is a family of other interesting pieces of software deriving from ctags(1). Ctags (and its derivatives) let you easily find the locations of function and variable definitions. More advanced ones allow you to find where functions are called from. Here is a list of these programs, all of which integrate with at least some editors.

Name               Where you can find it
exuberant-ctags    vim distribution
etags              emacs distribution
global             http://www.tamacom.com/global
cscope             http://cscope.sourceforge.net/

URLs

About this document ...

GNU Programming Utilities

This document was generated using the LaTeX2HTML translator Version 2K.1beta (1.48)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -split 1 GNU-utils.tex

The translation was initiated by Isaac Jones on 2001-11-02


Isaac Jones 2001-11-02