GObject Refcount Debugger (RefDbg)

- README ChangeLog
- What's new in refdbg 1.2
- What's new in refdbg 1.1
- What is it?
- What is "reference counting"?
- So what's the problem?
- Why would I use it?
- How does it work?
- Getting started
- Controlling RefDbg via GDB
- RefDbg Interactive Shell
- Event types (Updated)
- Event rules
- Event rule format
- Event display format (Updated)
- Rule examples
- Refcount timer
- Commands (Updated)
- Quirks
- glib notes
- Why am I getting "LD_PRELOAD function override not working.
  Need to build glib with --disable-visibility?"


README ChangeLog
----------------

v1.2
* Added "What's new in refdbg 1.2"
* Updated: "Event types" with a note on the reason for PreNew and New events
* Updated: "Event display format" with backtrace display details
* Updated: "Commands" section with new/changed commands

v1.1
* Updated: "Refcount timer"
* Updated: "Commands" - "notimer" command added
* Updated: "Commands" - Added information on getting detailed help
* Updated: "Quirks"
* Added: "glib notes"
* Added: 'Why am I getting "LD_PRELOAD function override not working.
  Need to build glib with --disable-visibility?"'


What's new in refdbg 1.2
------------------------

- Improved backtrace info (source file:line number) using libbfd and dladdr()
- Added 'btpaths=0|1' command to enable display of full paths in backtraces
- Performance improvements - backtrace info is resolved only when needed
- Memory optimizations - constant strings from libbfd and dladdr() not copied
- Added logging of alive objects to log - disabled with new 'logobjs' command
- The 'savelog' variable was renamed to 'logexit'
- The 'savelog' command now saves the event/obj log immediately
- Size of saved log file displayed
- Renamed 'object' command to 'objects'


What's new in refdbg 1.1
------------------------

- Improved refcount timer expiration
- Negative offsets to the display command "limit=OFS,MAX" for offsets from
  the end of the log (example: -10 is the 10th event from the most recent
  event)
- "object" command now displays destroyed objects implicitly when an include
  object instance address is given
- Runtime checks for GObject function overrides (problems with glib 2.6.x)
- "notimer" command added for disabling glib timer callback installation


What is it?
-----------

RefDbg is a GObject reference count debugger. GObject is part of the glib
library (not to be confused with glibc), which forms the basis of GTK+ (the
GIMP Toolkit), GNOME, GStreamer and many other projects. The GObject library
adds object-oriented features to C, which lacks them natively.


What is "reference counting"?
-----------------------------

All GObjects have a reference count field which holds the current number of
external references (pointers) to that object. When an object is created (a
GtkButton for example) it will have a reference count of 1, which the calling
function is responsible for. Whenever a pointer to an object is persistently
stored for later use, the object's reference count should be incremented.
When this pointer no longer points to the object, the reference count should
be decremented. When the object's reference count reaches 0, it is destroyed
and its allocated memory is freed.


So what's the problem?
----------------------

The problem is that it's easy to forget a ref or unref call, leaving the
object with one too many or one too few references. With one too many
references the object will never be freed, causing a memory leak. With too
few references the object may get destroyed even though there are still
references to it, leading to program crashes (segfaults) when attempts are
made to access the destroyed object.

Another type of bug is when no refs or unrefs are done, but an object is used
persistently and outside the scope of any guaranteed references. This kind of
bug only surfaces when all other references are removed, and leads to a
program crash when an attempt is made to access the destroyed object.

These types of bugs can be extremely hard to track down. RefDbg aims to be a
utility to assist developers in fixing these types of problems. Currently
only the first two bug scenarios described above (positive or negative
refcount leaks) are within the scope of refdbg. The latter bug scenario (code
using an object without ref/unrefs) is very hard to detect (anyone have any
ideas?)


Why would I use it?
-------------------

If you are a software developer working with GObjects and would like some
assistance in tracking down refcount related bugs, you may find RefDbg to be
quite handy. It doesn't automate the task for you, but it will help you to
see under the hood when things go wrong. RefDbg can also be useful in
monitoring where and when GObjects are being created in programs.


How does it work?
-----------------

RefDbg is essentially a shared library that is loaded using the LD_PRELOAD
feature of the Linux dynamic library loader.
This allows key functions in the GObject library to be intercepted (such as
g_object_ref, g_object_unref, g_object_newv, etc). Note that this means that
a program does not need to be recompiled or changed in any way (although
having debugging symbols built in is really helpful).

When GObject functions are intercepted, RefDbg gathers information such as: a
stack backtrace of the calling functions, current time, type of object, its
current refcount, etc. This information is then processed according to a set
of user definable rules which determine whether the data is printed to
stderr, a software break point is executed (causing a debugger such as gdb to
take over) and/or the event is logged to the event log.

RefDbg can be controlled by:

- Passing startup commands via the REFDBG_OPTIONS environment variable
- The `refdbg' shell script -c switch (which sets the environment variable)
- The RefDbg shell

The RefDbg shell is the most useful method to interact with RefDbg and
contains nice things like command history and a help system. See the section
entitled "RefDbg Interactive Shell" for more information.


Getting started
---------------

To start a RefDbg session use the installed `refdbg' shell script to execute
the program you would like to debug. The syntax of this shell script looks
like this:

  refdbg [options] [program arguments..]

If you would like to debug a program called "swami" you could run:

  refdbg swami

You can pass command line arguments to the program directly:

  refdbg swami ~/sbks/Vintage.sf2

By default refdbg will output the event log to a file called `refdbg.log' at
the end of program execution.

If you want to interact with RefDbg at run time, then execute gdb on your
program, like:

  refdbg gdb swami

To pass command line arguments to the program via GDB (a GDB specific
feature) do this:

  refdbg gdb --args swami ~/sbks/Vintage.sf2

A useful option to the `refdbg' script is the -c switch, which allows initial
RefDbg commands to be passed to RefDbg (via the REFDBG_OPTIONS environment
variable). See the section titled "Commands" for more details on commands.
An example usage would be:

  refdbg -c "btnum=8 ; r0=D:All" gdb --args swami ~/sbks/Vintage.sf2

This would set the max stack backtrace caller count to 8 (this variable can
only be set at refdbg initialization time) and set the default rule to
display all events to stderr (can be a lot of output!).


Controlling RefDbg via GDB
--------------------------

RefDbg can be controlled from within GDB (provided GDB was started using the
`refdbg' script described in the section titled "Getting started"). This is
done via the GDB `call' command, which is used to call functions from within
a GDB session. Here are some useful functions to call in RefDbg (see the
refdbg.h header file for a more complete list).

  /* The RefDbg shell */
  void refdbg (void)

  /* run a single command */
  gboolean refdbg_cmd (const char *cmds)

  /* print statistics */
  void refdbg_stats (void)

  /* save the event log to 'refdbg.log' */
  void refdbg_save_log (void)

Some examples (typed at the GDB prompt, comments after '#'):

  call refdbg ()              # Start the RefDbg shell
  call refdbg_stats ()        # Print RefDbg statistics
  call refdbg_cmd ("rules")   # Print out the current event rules
  call refdbg_save_log ()     # Save the event log to 'refdbg.log'


RefDbg Interactive Shell
------------------------

The RefDbg shell is the easiest way to control RefDbg. In order to start it
you need to run RefDbg on GDB and the program you want to debug (see the
section entitled "Getting started" for more information).

Example of starting the `refdbg' shell script from a terminal:

  refdbg gdb swami

Once you are in GDB you will need to start the program before you can access
RefDbg (it has not yet been loaded for your program). You can do this by
setting a break point on "main" or hitting CTRL-C while your program is
running to stop it. Example GDB commands (and output):

  (gdb) break main
  Breakpoint 1 at 0x8056633: file main.c, line 68.
  (gdb) run
  Breakpoint 1, main (argc=1, argv=0xbffff734) at main.c:68

Now you might want to start up the RefDbg shell, so do:

  (gdb) call refdbg ()

Type 'help' to get a list of help topics and a command reference. Typing
'exit', 'quit' or 'q' will exit out of the shell (it can be started again at
any time in the same debug session).


Event types
-----------

When a GObject refcount related function call is intercepted a RefDbg event
is generated. There are different types of events depending on the operation,
and different types of errors that can be detected during these operations.
Here is a list of event types followed by a brief description:

  PreNew       - Occurs just prior to a GObject being created
  New          - Occurs after a GObject has been created
  Ref          - Occurs before g_object_ref() is called on an object
  Unref        - Occurs before g_object_unref() is called on an object
  PreFinalize  - Occurs just before an object is destroyed (last unref)
  Finalize     - Occurs after an object has been destroyed

Note: There are separate PreNew and New events since reference activity can
occur during the object construction stage; using 2 separate events helps to
indicate this.

Here is a list of errors that can occur:

  EUnknownObj   - (Ref/Unref events) Object address has never been seen before
  EDestroyedObj - (Ref/Unref events) Object has already been destroyed
  ENotObj       - (Ref/Unref events) Address points to a non-object
                  (corrupted memory?)
  EInitCount    - (New events) Initial reference count not equal to 1
  EBadCount     - (Ref/Unref) Unexpected reference count (corrupted memory?)
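
The events and errors listed above can be illustrated with a small model. The
following Python sketch is not RefDbg code (RefDbg itself is a preloaded C
library). The ToyTracker class, its bookkeeping and the order in which it
records events are assumptions made purely for illustration; only the event
and error names are taken from the lists above, and ENotObj and EBadCount are
not modeled.

```python
"""Toy model of how refcount calls could map to RefDbg-style events.

Hypothetical illustration only: nothing here is taken from RefDbg's sources.
"""


class ToyTracker:
    def __init__(self):
        self.alive = {}         # address -> current refcount of live objects
        self.destroyed = set()  # addresses of objects already finalized
        self.log = []           # (event, error, address, refcount) tuples

    def _emit(self, event, addr, error=None):
        # Record the refcount as it is at the moment the event is generated.
        self.log.append((event, error, addr, self.alive.get(addr)))

    def _check(self, addr):
        # Classify an address the way the error list above describes.
        if addr in self.alive:
            return None
        if addr in self.destroyed:
            return "EDestroyedObj"
        return "EUnknownObj"

    def new(self, addr, initial_count=1):
        self._emit("PreNew", None)    # the address is not known yet
        self.destroyed.discard(addr)  # an allocator may reuse an address
        self.alive[addr] = initial_count
        error = "EInitCount" if initial_count != 1 else None
        self._emit("New", addr, error)

    def ref(self, addr):
        error = self._check(addr)
        self._emit("Ref", addr, error)  # event precedes the count change
        if error is None:
            self.alive[addr] += 1

    def unref(self, addr):
        error = self._check(addr)
        self._emit("Unref", addr, error)
        if error is not None:
            return
        if self.alive[addr] == 1:       # last unref destroys the object
            self._emit("PreFinalize", addr)
            del self.alive[addr]
            self.destroyed.add(addr)
            self._emit("Finalize", addr)
        else:
            self.alive[addr] -= 1


if __name__ == "__main__":
    tracker = ToyTracker()
    tracker.new(0x1000)
    tracker.ref(0x1000)
    tracker.unref(0x1000)
    tracker.unref(0x1000)  # last unref
    tracker.unref(0x1000)  # one unref too many
    tracker.ref(0x2000)    # address that was never created
    for event, error, addr, count in tracker.log:
        where = hex(addr) if addr is not None else "-"
        print(f"{event:<12} {(error or '-'):<14} {where:<8} {count}")
```

In this model the extra unref on the already destroyed object is recorded with
EDestroyedObj and the ref on the never-seen address with EUnknownObj, which
corresponds to the negative refcount leak scenario described earlier.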

Event rules
-----------

When an event is generated it is compared to a list of rules which set the
criteria for what should be done with each event. We'll call the list of
rules the "rule stack". Each rule is given an ID of the form "r[n]" where [n]
is a number from 0 to the max allowed rules (currently set at compile time to
10, so r0 - r10). The first rule "r0" is a special rule called the "default
rule" which always exists (at the top of the rule stack); this is the only
thing that makes it special, otherwise it behaves exactly like the other
rules. Each rule defines criteria and the actions to be performed when an
event match occurs. Rules are always consecutive (if a rule is removed the
rules "below" it will be shifted upwards in the stack).

Events are compared to each rule starting from the highest numbered existing
rule until a match occurs, at which point the defined actions are performed
(event is displayed, break point instruction executed and/or event is
logged). If no matches occur then nothing is done.

The default rule stack is defined as such:

  r0 = D:Error B:None L:All

This defines only one rule (the default rule), which displays only error
events, does not execute any break points, and logs everything.


Event rule format
-----------------

An event rule has the following format:

  0x76543210 !0xFEDCBA98 D: B: L:

Each of these fields is optional and has a default if not specified. All
characters in the above format string should be used as is, except for the
placeholder fields, which are described below. The description of these
fields follows:

- A GType to match. If this field is set then only events operating on
  objects of the given GType or derived thereof will be matched. Defaults to
  a 0 type which matches any type.

- A GType to exclude. If this field is set then any events operating on
  objects of the given GType or derived thereof will NOT match the rule.
  Defaults to a 0 type which doesn't exclude any types.

0x76543210 - An address of an object instance to match.
  Can be specified in hex, decimal, or binary. Only events operating on this
  specific object instance will match the rule. Defaults to NULL which
  matches any object.

!0xFEDCBA98 - An address of an object instance to NOT match. Only events that
  do NOT operate on this specific object will match the rule. Defaults to
  NULL which doesn't exclude any object.

D: - Display flags. Event and error types that will cause the event to be
  displayed to stderr.

B: - Break point flags. Event and error types that will cause a break point
  to be executed, thereby causing a debugger such as GDB to interrupt
  execution of the program.

L: - Log flags. Event and error types that will cause the event to be logged
  to the event log.

The flags value of the D:, B: and L: fields above should be replaced with one
or more of the following flags separated by the pipe '|' character:

  PreNew | New | Ref | Unref | PreFinalize | Finalize
  EUnknownObj | EDestroyedObj | ENotObj | EInitCount | EBadCount
  Timer
  Paranoid

The following values may also be specified and have the described effect:

  All   - This means all events and errors (does not include Paranoid flag)
  None  - All flags are cleared (a 0 value)
  Event - All event types (first group above)
  Error - All errors (second group above)

The "Timer" flag is a special flag which is only used with the display (D:)
flags. It enables the refcount timer for objects matching the rule for which
this flag is set. When setting this flag "L:New|Ref|Unref|Finalize" is
implied, since logging of these events is required for timer operation. The
refcount timer tracks object refcount changes over time. See the section
entitled "Refcount timer" for more information.

"Paranoid" is a special flag which causes EUnknownObj/EDestroyedObj errors to
be processed immediately even during object construction.
The reason this flag exists is the inability of RefDbg to obtain a new
object's address prior to ref/unref possibly being called on the new object
during its construction (and thus being unrecognized by RefDbg). By default,
while there are objects being constructed, RefDbg will log
EUnknownObj/EDestroyedObj errors but will not break or display them. After an
object has been created the log is then fixed up using the now known object
address and any genuinely invalid object refs/unrefs will then be detected
and displayed (but break points will not be honored because the event has
already occurred). Unfortunately, if a Ref/Unref of an invalid object does
occur while any objects are being constructed, it may cause the faulty
program to crash before RefDbg reports anything or executes a break point.
When trying to track down this type of problem the Paranoid flag can be
enabled, which will cause all unknown ref/unrefs to be processed immediately,
despite the fact that they may be valid calls generated by object
initialization functions.


Event display format
--------------------

When events are displayed they will be of the following format:

  [!]EVENT {ERROR} [0x12345678] |REFCOUNT| mm:ss.iiiiii
    0xb74753eb: libgobject-2.0.so.0+0xd3eb [g_object_new+0x6b] (gobject.c:797)
    ...

Not all of these fields are necessarily present. Here is what the different
fields mean:

  [!]           An event line is preceded by a bang '!' if it's an error
  EVENT         The event type (PRE_NEW/NEW/REF/UNREF/PRE_FINALIZE/FINALIZE)
  {ERROR}       The error type
                (UNKNOWN/DESTROYED/INVALID/INITREF/BADREF/FAILED)
                The GType of the object
  [0x12345678]  The address of the object
  |REFCOUNT|    The object's refcount (example: |1| or |1 != 2| if refcount
                error)
  mm:ss.iiiiii  The time stamp

The information in a backtrace line is described below; some fields may not
be present depending on how much debugging information is available.
The fields are from the example backtrace line above:

  0xb74753eb:          The return program counter address of the function call
  libgobject-2.0.so.0  The object file that the address is in (lib or program)
  +0xd3eb              Offset in object file of the backtrace address
  g_object_new         The name of the function that the address is part of
  +0x6b                Offset in function to backtrace address
  gobject.c            Source code file where the backtrace address resides
  :797                 Line number in source file of the instruction at address


Rule examples
-------------

Here you will find examples of some commands dealing with rules.

  addrule D:Error B:None L:All

Add a rule to match GtkObject derived types except GtkAdjustment. Errors are
displayed, no break points are executed and all events are logged.

  r0= D:All B:All L:All

Set the default rule to match only GtkWindow derived types and display, break
on and log all events.

  addrule 0x12345678 D:Ref|Unref|Error B:Ref|Unref|Error L:All

Add a rule to match a specific object instance, display and break on Ref,
Unref and Error events, and log all events.

  addrule 0x12345678 D:Error|Timer ; timer = 1000

Add a rule to match a specific object instance, display errors or refcount
timer events, no breakpoints (default) and log all (default). The next
command "timer = 1000" sets the timer expire time to 1 second.

  delrule r1

Delete rule `r1'.


Refcount timer
--------------

The refcount timer detects object refcount changes over time. Since a lot of
transient refcount activity can occur in a short period of time, it can be
useful to only display long term refcount changes. This feature can help
discover refcount leaks (positive or negative). To enable the refcount timer,
set the "timer" variable (using the command interface) to the desired expire
time in milliseconds. You then need to add "D:Timer" to an event rule to
enable display of refcount changes for objects matching that rule
(L:New|Ref|Unref|Finalize is implied).
If an object's refcount has not changed for the specified expire time, then
the refcount is considered "stable" and is printed if it differs from the
previous stable refcount value. Example RefDbg commands (comments after '#'):

  timer=1000            # Set timer to expire changes after 1 second
  r0=D:Error|Timer      # Enable display of errors or timer events
  rules                 # Display current rules
  r0: D:Error|Timer B:None L:All
  quit

  (gdb) continue
  Continuing.
  TIMER [0x811d9f0] |86 (+1)| 19:10.040099
  TIMER [0x81b37f8] |5 (+1)| 19:10.386663
  TIMER [0x8290ed0] |1 (-1)| 19:10.710706

Timer event output description:

  TIMER         - Indicates that it's a TIMER event
                - The GType of the object
  [0x12345678]  - The address of the object
  |N (+/-V)|    - N is the current refcount, V is the change amount
  mm:ss.iiiiii  - The time stamp of the event

UPDATE (refdbg 1.1)

Refdbg 1.0 did not have a way of timing out refcount activity if no
additional refcount activity occurred. So if a bunch of object refcount
changes occurred, followed by no more refcount activity, those changes
wouldn't be displayed till the next new/ref/unref. In refdbg 1.1 a glib
timeout callback is used to handle refcount timer expiry in the case
mentioned above. This mechanism should work as long as the glib main loop is
running. If the main loop is not running, then refcount timer events may not
be displayed till additional object activity occurs (the previous behavior).
If you experience any problems with refdbg installing a glib timeout callback
you can disable it with the 'notimer' option. This option is only valid
during initialization (must be passed via the refdbg command line).


Commands
--------

Here is a list of commands that can be used with RefDbg (passed via the
REFDBG_OPTIONS environment variable, the 'refdbg' shell script, or the RefDbg
shell). You can get this list at any time by typing "help" in the RefDbg
shell.
Separate multiple commands with a semicolon ';'. White space is irrelevant
and numbers can be specified in decimal (10), hex (0xAB), binary (0110101b)
or octal (047). To display the current value of a variable listed below as a
"variable=value" assignment just use the "variable" name as the command.

  help [topic]        Get help
  exit/quit/q         Quit the RefDbg shell (ignored if not in the shell)
  display [DispRule]  Display event log with optional criteria
  objects [ObjRule]   Display object information with optional criteria
  clear               Clear event log and refcount timer
  stats               Get RefDbg statistics (event log size, mem usage, etc)
  time                Get current time (offset from RefDbg start time)
  timer=              Refcount timer expire in milliseconds (0 = disable)
  rules               Display current rule stack
  addrule             Add an event rule to the event stack
  insrule             Insert a rule at the given position (POS >= 0)
  delrule             Delete a rule by rule number
  r0=                 Set the default event rule
  r=                  Set a specific rule
  savelog             Saves the current event log and objects (if logobjs=1)
  logobjs=0|1*        Log active objects to log file on save
  logexit=0|1*        Enable or disable saving of event log on exit
  btnum=              Set max callers in stack backtraces to N (init only)
  btpaths=0*|1        Enable or disable display of full paths in backtraces
  notimer             Disable glib timer callback (init only)
  dispmax=            Set max items/events to display for display commands

More detailed help can be found for the following topics by typing "help"
followed by the topic name:

  commands  Command reference
  display   Additional information on `display' command
  examples  Examples of command usage
  object    Additional information on `object' command
  rules     Rule format reference
  timer     Information on the refcount timer


Quirks
------

A re-hash of what was written in other sections of this manual concerning
quirks and other weirdness related to how RefDbg does its dirty work.

The Paranoid Display/Break flag:

By default, while there are objects being constructed, RefDbg will log
EUnknownObj/EDestroyedObj errors but will not break or display them. After an
object has been created the log is then fixed up using the now known object
address and any genuinely invalid object refs/unrefs will then be detected
and displayed (but break points will not be honored because the event has
already occurred). Unfortunately, if a Ref/Unref of an invalid object does
occur while any objects are being constructed, it may cause the faulty
program to crash before RefDbg reports anything or executes a break point.
When trying to track down this type of problem the Paranoid flag can be
enabled, which will cause all unknown ref/unrefs to be processed immediately,
despite the fact that they may be valid calls generated by object
initialization functions.

RefDbg Timer:

UPDATE (refdbg 1.1)

Refdbg 1.0 did not have a way of timing out refcount activity if no
additional refcount activity occurred. So if a bunch of object refcount
changes occurred, followed by no more refcount activity, those changes
wouldn't be displayed till the next new/ref/unref. In refdbg 1.1 a glib
timeout callback is used to handle refcount timer expiry in the case
mentioned above. This mechanism should work as long as the glib main loop is
running. If the main loop is not running, then refcount timer events may not
be displayed till additional object activity occurs (the previous behavior).
If you experience any problems with refdbg installing a glib timeout callback
you can disable it with the 'notimer' option. This option is only valid
during initialization (must be passed via the refdbg command line).


glib notes
----------

Beginning with glib 2.6 the default build does not allow functions to be
overridden using LD_PRELOAD (for performance reasons).
This is the method that refdbg uses to intercept calls to the GObject
reference count related functions, and therefore refdbg will not work with
such a build. The solution is to build glib 2.6.x with the
'--disable-visibility' configure flag or use an older version of glib
(<= 2.4.x). It's often helpful to build a debug version of glib anyway to get
more useful stack backtraces. Here is an example of the steps to take in
building a debug version of glib that will be installed in your home
directory (~/local) and therefore not change the system wide glib.

-----
cd glib-2.6.x
# $HOME is used instead of ~ since a shell may not expand ~ after --prefix=
./configure --prefix="$HOME/local" --enable-debug=yes --disable-visibility
make
make install
# Make the dynamic loader pick up the newly installed glib first
export LD_LIBRARY_PATH="$HOME/local/lib:$LD_LIBRARY_PATH"
refdbg my_program
-----

The LD_LIBRARY_PATH variable will likely need to be set to the lib directory
of the prefix you installed into before running refdbg.


Why am I getting "LD_PRELOAD function override not working.
Need to build glib with --disable-visibility?"
-------------------------------------------------------------------

When refdbg initializes, a test is performed to ensure that it is able to
override the GObject reference count related functions. If this test fails
then this message is displayed and refdbg will terminate, since it will not
be able to operate correctly. See the "glib notes" section above for more
information on resolving this issue.