Programming: July 2008

Simple stuff for Linux

Man(ual) pages

To get help on a particular command:
$ man (command)
eg. man printf
Man stands for manual. It is split into sections.

Section # Topic
1 Commands available to users
2 Unix and C system calls
3 C library routines for C programs
4 Special file names
5 File formats and conventions for files used by Unix
6 Games
7 Word processing packages
8 System administration commands and procedures

The example above pulls up the man page for the printf shell command (section 1), which is most likely not the one wanted.
So run:
$ man 3 printf
to get the C library printf routine
To quit from a man page, press "q"
To search within a man page, type "/" followed by the text to find, then press Enter

To search the man pages by keyword:
$ apropos (keyword)

Shell shortcuts

^D - logout
cd - - return to the previous directory you were in
!$ - use the last argument from the previous command
(ctrl)c - cancel a command
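
A short sketch of cd - in action, using two scratch directories made with mktemp (they only stand in for real working directories). !$ is left out because history expansion only happens at an interactive prompt.

```shell
# Two scratch directories standing in for real working directories.
first=$(mktemp -d)
second=$(mktemp -d)

cd "$first"
cd "$second"

# cd - jumps back to $first. It also prints where it landed, hence the redirect.
cd - > /dev/null
back_in=$(pwd -P)

# A second cd - toggles back to $second.
cd - > /dev/null
toggled=$(pwd -P)
```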

Unix utilities

To search through all the .h files in the projects directory and directories below for the definition of "Cont_"
find ~/projects -name "*.h" -exec grep -H "typedef struct Cont_" {} \; -print
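
To see what that command does without touching a real source tree, here is a small sketch that builds a throwaway directory and runs the same find over it. The file names and the struct are made up for illustration, and the trailing -print is dropped because grep -H already prefixes each match with its file name.

```shell
# Build a throwaway tree; file names and contents are invented for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/projects/lib/net"
printf 'typedef struct Cont_ {\n    int key;\n} Cont;\n' > "$demo/projects/lib/net/cont.h"
printf 'int unrelated(void);\n' > "$demo/projects/lib/other.h"
# This one matches the text but is not a .h file, so find never hands it to grep.
printf 'typedef struct Cont_ copy;\n' > "$demo/projects/lib/notes.txt"

# Same search as above, pointed at the demo tree.
matches=$(find "$demo/projects" -name "*.h" -exec grep -H "typedef struct Cont_" {} \;)
echo "$matches"

rm -rf "$demo"
```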

Use awk to print one column from a text file.
eg: Given a file, ctr.log, with lines like
ctr key: 1983224, iso: 2210, ctr no: TTNU1519344 , etd: 1215752400
ctr key: 1983225, iso: 22G1, ctr no: FSCU3621588 , etd: 1215752400

$ grep "iso: 22G1" ctr.log | awk ' { print $8 } ' | sort | uniq
This will find all of the lines with the iso 22G1 in the ctr.log file, print the 8th whitespace-separated column (the container number), then sort the results and remove any duplicates.

$ grep "iso: 22G1" ctr.log | wc -l
This will count how many 22G1 containers there are by counting the number of lines.
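
The two pipelines can be tried on a scratch copy of the log. In this sketch the first two lines are the sample lines from above and the last two are invented to give the sort and duplicate removal something to do. It also shows two optional shortcuts: awk can do the matching itself, and grep -c counts matching lines without wc.

```shell
# Scratch log file; the last two lines are invented sample data.
log=$(mktemp)
cat > "$log" <<'EOF'
ctr key: 1983224, iso: 2210, ctr no: TTNU1519344 , etd: 1215752400
ctr key: 1983225, iso: 22G1, ctr no: FSCU3621588 , etd: 1215752400
ctr key: 1983226, iso: 22G1, ctr no: ABCU0000001 , etd: 1215752400
ctr key: 1983227, iso: 22G1, ctr no: FSCU3621588 , etd: 1215756000
EOF

# awk matches the lines itself, and sort -u does the job of sort | uniq.
containers=$(awk '/iso: 22G1/ { print $8 }' "$log" | sort -u)

# grep -c counts matching lines directly, like grep | wc -l.
count=$(grep -c "iso: 22G1" "$log")

echo "$containers"
echo "$count"
rm -f "$log"
```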

CVS

Usage

cvs -d (repository) login
cvs -d (repository) checkout (module)
cvs update
cvs diff
cvs commit

NB: you can also shorten usage, eg: cvs up works just like cvs update.

I recommend always doing a cvs diff before any commits. This shows you exactly what you have changed and will be committing, and can avoid accidentally introduced typos.

Avoid mixing formatting changes with logic changes. Doing so makes it very difficult to figure out exactly what changed. Formatting changes typically affect large portions of the file but have no impact on the meaning of the file. Logic changes typically affect only small portions of the file but greatly affect the behaviour of the program you are editing. If you mix them, cvs tools won't be able to help you discover what really changed to cause some new behaviour. So avoid making formatting changes alongside logic changes; instead batch them up into one big commit which changes nothing but the format. This works well with indentation tools such as astyle.

Checking changes

cvs annotate (file) > tmp
Creates a copy of the file in tmp, with each line marked with the person who last changed it and the date they made the change. This is useful for finding out who is responsible for some part of the code.

cvs log (file)
Gives you the list of revisions, who made them and when.

cvs diff -r (revision) (file)
Shows you the differences between the current file, and what it was as of the specified revision. This is useful when trying to figure out what has been introduced into a new version if you are looking for an obscure bug.

Debugging

GDB can really be a life saver, even if used for only the most basic functions.

When programs crash they create a core file (sometimes you need to do some configuration to ensure this happens and what the file is called).
gdb myprog core
(This loads the information dumped when the program crashed)

bt (or backtrace)
provides the call stack at the point where the error occurred. It shows you exactly on which line of which file the problem occurred (if you have debug symbols).

info locals
prints out information about local variables. Usually at this point you will go "oh damn, that should not be NULL" and realize what needs to be fixed.

print myvariable
if you want to check a particular variable.

up
down
move up or down the stack to look at variables in a different scope.

If you are writing a service, you can create a script that loads a core, does a bt and info locals, and emails you the result. For example:
# See if there was a crash
# (PROCESS, pid, bindir and basedir are assumed to be set elsewhere in the script)
corefile=/var/crash/corefiles/core.${PROCESS}.$UID.$pid
if [ -f "$corefile" ]; then
    gdb "$bindir/${PROCESS}" "$corefile" < "$basedir/sh/gdb_script" > mail.out
    mail -s "${PROCESS} crashed" me@email.com < mail.out
    rm mail.out
fi

Where gdb_script just contains:
backtrace
info locals
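
A possible variation, sketched here rather than taken from the script above: gdb's -batch and -ex options run the same two commands without a separate gdb_script file and exit when done. The function name crash_report is made up for illustration, and the commented usage reuses the variables from the script above.

```shell
# crash_report BINARY COREFILE (hypothetical helper name)
# Prints a backtrace and the local variables for a core file.
# Returns 1 without running gdb if the core file does not exist.
crash_report() {
    binary=$1
    corefile=$2
    [ -f "$corefile" ] || return 1
    # -batch: run the given commands, then exit. -ex: one gdb command each.
    gdb -batch -ex backtrace -ex "info locals" "$binary" "$corefile" 2>&1
}

# Example use, with the same variables as the script above:
# if report=$(crash_report "$bindir/$PROCESS" "$corefile"); then
#     printf '%s\n' "$report" | mail -s "$PROCESS crashed" me@email.com
# fi
```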

Sometimes (especially with C++) the stack gets corrupted and provides no information. Now you have to either resort to printfs in the code, interactive GDB or use electric-fence/valgrind.

Logging

If you are writing a non-trivial program, chances are you have a log file that you write to. The more useful information you have in the log, the less hit and miss you have to do to narrow down the problem. A big part of this usefulness is simply choosing unique and searchable strings for your output, and formatting the information in ways that make it relatively easy to grep for. Some people like to put file/function/line information in the log. This can be useful, but my preference is to stick with a timestamp and a concise, unique message which includes not only a description of the event, but any id information associated with the event. I never have the same log string twice in a program if I can help it.

Two things that will save you hours of pain are:
1) Always expect the worst: always check for bad inputs or undefined cases, and at the very least write an error message to your log file. Always check that pointers passed into a function are not NULL before dereferencing them, always check for indices outside the array bounds, and always include the final else in if/else chains and the default in case statements. In large projects the unexpected always happens, and the sooner you can localise where, the better. Even if your program doesn't handle it, at least you KNOW about it and can fix it.
2) Write to the log file regularly. A good rule of thumb for me is to log whenever processing something external, changing state, or responding to some event. The more information you have about what happened leading up to the problem the better. What you don't want is repetitive or overly verbose spam, as it can bury the real information. However, so long as you use unique strings, you can usually use text processing techniques to quickly dig up what you want.
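
As a rough illustration of the "timestamp plus unique message plus id" style, here is a minimal shell sketch. The log_event helper, the message texts and the ids are all made up for the example; a real program would do the same thing in its own language.

```shell
logfile=$(mktemp)

# log_event MESSAGE: append one timestamped line to $logfile.
log_event() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$logfile"
}

# Each message string is unique in the program and carries the id it relates to.
log_event "ctr load started, ctr key: 1983224"
log_event "ERROR ctr lookup failed, ctr key: 1983225"
log_event "ctr load finished, ctr key: 1983224"

# Because of the ids, the whole history of one container is a single grep.
events=$(grep -c "ctr key: 1983224" "$logfile")

# Because the message is unique, every hit for it comes from one place in the code.
errors=$(grep "ERROR ctr lookup failed" "$logfile")
```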

Less is better than VIM for log files

A final note on using log files: Use less to read them.
less is a command just like more, but is ironically much more powerful.
It has all the searching functionality of VIM, but is a better choice than VIM because if your log file is large (which it will be) VIM uses vast system resources (memory, temporary files) and is slow to load.

Some common shortcuts you should know [work in less and VIM]:
(shift)G - goto end of file
?(string) - search backward for (string)
/(string) - search forward for (string)
n - repeat last search (in last direction)
If you press shift+F, it will start acting like "tail -f". You can stop auto refreshing by Ctrl+C

Keep a separate editor open that you can copy and paste important lines into. This allows you to build up a timeline of important events which will allow you to understand why the impossible has happened.

Building with Makefiles

Always compile with warnings on!
gcc -Wall foo.c
The -Wall means Warn all (despite the name it enables most of the commonly useful warnings, not literally all of them; -Wextra adds more). Without it you will have subtle flaws in your program that are very hard to debug: behavior that just does not make sense because of some trivial syntax slip or loss of precision in a cast. It does not take a lot of effort to produce warning-free code, but it does take a lot of effort to find a bug that the compiler could have told you about. You will also learn quite a lot about the subtleties of C/C++ in a controlled environment, instead of the lost in the wilderness with wild animals way.

Build debug symbols (-g); you won't regret it when you get a phone call at 2am.

Use compiler optimisation (-O1, -O2 or -O3); it often improves performance dramatically.

When in doubt, make clean.
Often you are working with other people's Makefiles, maybe edited by multiple people afterward. They or you might not have set the dependencies up in a foolproof manner, resulting in a broken build. If you have library dependencies, make clean;make on those too! Spend some time reading yosefk if you are depressed about this and need cheering up. Quite often, if you are having some weird problem, it is a broken build. Make clean will either fix it or at least give a compile error as to what is wrong. (eg: a function might have been redefined with different arguments, but if the .o is not rebuilt you will link to it incorrectly and crash!)

Makefiles can be complicated things, but really we want them to do something very simple: If anything changes, recompile it! Dependencies can be generated for you to do this, so you don't need to sit there figuring them out.
gcc -MMD
creates a .d file next to each object file it compiles, listing the headers that source file depends on.
-include $(OBJS:.o=.d) will include these dependency files (if they exist, and the .o can't exist if they don't) so this will always keep you up to date and avoid make clean paralysis.

You can make rules to automatically compile your stuff for you. Check this out, it even puts the .o files and .d files into an obj directory so it won't clutter up your filesystem (the command lines must start with a tab):
$(OBJDIR)/%.o: %.cpp Makefile
	@$(RM) -f $@
	@mkdir -p $(DIRS)
	$(CPP) -MMD $(CFLAGS) $(INCLUDE) -c $< -o $@ 2>&1 | more

If you are lazy it can even find all your source files to compile:
SRCS = $(shell find -name 'tests' -prune -o -name '*.cpp' -printf "%P ")
DIRS = $(shell find \( -name 'CVS' -o -name '$(OBJDIR)' -o -name 'tests' \) -prune -o -type d -printf "$(OBJDIR)/%P ")
OBJS = $(patsubst %.cpp,$(OBJDIR)/%.o,$(SRCS))
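
To check what those variables would expand to, the same find can be run by hand. The sketch below does this on a scratch tree with made-up file names, assumes GNU find (for -printf) and assumes OBJDIR is obj. It prints one path per line and sorts them only to make the result easy to read, and the sed line imitates the patsubst.

```shell
# Scratch source tree; the file names are invented for the demo.
tree=$(mktemp -d)
mkdir -p "$tree/net" "$tree/tests" "$tree/CVS"
touch "$tree/main.cpp" "$tree/net/socket.cpp" "$tree/tests/test_socket.cpp" "$tree/README"

# Same find as SRCS: skip the tests directory, list every other .cpp file.
# %P is the path with the starting point (here ".") stripped off.
srcs=$(cd "$tree" && find . -name 'tests' -prune -o -name '*.cpp' -printf "%P\n" | sort)

# What the patsubst does: foo.cpp becomes obj/foo.o (OBJDIR assumed to be obj).
objs=$(echo "$srcs" | sed 's|\(.*\)\.cpp$|obj/\1.o|')

echo "$srcs"
echo "$objs"
rm -rf "$tree"
```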

However sometimes it is preferable to be able to specify the source manually if you have multiple executables to build.

VIM

VI(M) is a very powerful editor which requires you to survive a learning period as it is different from almost any interface you have used before. If you survive, you will love it, if not you will hate it.

So let's vi a file right now!
vi test.txt

The first thing that whacks you in the face is that you cannot type text into your document! Why? VI has two modes:
1) command mode
2) edit mode
And you initially start in command mode. You can enter edit mode by pressing 'i'. Now you can type into the document. You can always get back to command mode by pressing escape. Back in command mode you can type :x to exit and save. That is the major hurdle that most people don't jump. There are plenty of good VIM documents on the web explaining basic usage to build on from there.

Here are some development specific tips once you are comfortable with VIM as an editor which make it far more productive than an IDE:

Use ctags

If you were recently editing but navigated away and want to go back to where you were, press u for undo then (ctrl)r for redo.

Use .
This is 'repeat last command'
Suppose you called a variable 'notvalid' and then later decided you wanted it to be 'isvalid':
/notvalid
(searches forward for an occurrence of notvalid)
cw!isvalid
(change word - cw deletes the word, then you type !isvalid in its place)
(esc)
/
(search again)
.
(repeat last command - this will replace the notvalid we just found with !isvalid)
/
.
(and again)

Of course you could also use a global search and replace:
:%s/notvalid/!isvalid/g
Which is less typing, but I prefer to check each replace as I go, so the search and repeat is good for that. You can do very complex editing very quickly using this technique, just by putting a little thought into how to modify text in a repeatable way.

:make
will build your project from within VIM. The advantage of this is that it will take you instantly to the line and file of any error that might occur in the build.
:cn and :cp can be used to move forward and backward through the error messages.

Customise your ~/.vimrc
This is what will transform VIM into a customised IDE.

syntax on
Will do syntax highlighting

set viminfo='10,\"100,:20,%,n~/.viminfo
au BufReadPost * if line("'\"") > 0|if line("'\"") <= line("$")|exe("norm '\"")|else|exe "norm $"|endif|endif
Will remember the line you were on when you last edited a file.

set expandtab
Inserts spaces instead of tab characters, which is usually the easiest way to ensure tab size is irrelevant.

set smartcase
Will allow you to search for strings case-insensitively unless you use capitals (it only takes effect together with set ignorecase). There are many more handy ones, look up the documentation and go nuts!

set autoindent
set smartindent
set softtabstop=4
set shiftwidth=4
set showmatch
set ruler
set incsearch
set ignorecase
set number
set cindent
set backspace=2
set laststatus=2

%
If you press this while over a brace, it will take you to the matching brace. This is a life saver if the code isn't formatted how you like it and you can't be sure what nest you are in.

(shift)G
Go to end of file

?
search backward

/(up) or :(up) or ?(up)
each time you press (up) it goes back to the previous search string or command so you can modify it. (down) goes forward through the list.

I
insert at start of line

A
append at end of line,
just press (esc) after if you don't want to put any text in

8yy
(navigate)
p
Will copy 8 lines and paste them where you want them

8dd
Will delete 8 lines of code

8(down)
Will move down 8 lines
I use this to measure how much I want to cut, copy or delete.
Alternatively you can use visual select mode
(shift)v to select whole lines, or (ctrl)v to select a rectangular block

ctags

ctags allows you to quickly navigate source code by jumping to the definition of a tag (eg: a function or variable declaration). This is vital on large projects, as without it you are stuck with grep or find.

ctags -R
will recursively build a tags file which is recognised by VIM (and other editors). Now if you edit a source file, you will be able to press (ctrl)] to jump to the definition of a tag, and (ctrl)t to return to where you were.

You need to keep the tags up to date, so in your Makefile add the following rule:
tags: $(SRCS) Makefile
	-ctags -R
This will rebuild the tags if any of your source files change.

Large projects typically have nested directories. By default VIM does not search up for a central tags file, but you can make it do this by adding to your ~/.vimrc
set tags=tags;/

What doesn't work too well is if your project depends on an external library. If the library is part of your source tree you can have a top level tags file, however how to keep this up to date is unknown to me. Also you will run into duplicate tags which your editor may or may not deal with nicely.
