C and Unix programming. Entry: NULL terminated arrays of pointers Date: Thu Oct 23 14:43:21 CEST 2008 == a cdr coded lisp list. These work better than arrays + length. The only problem is that allocating temporary data structures on the run-time stack requires object size. But.. maybe alloca works here? Entry: High level programming in C: Data Structures Date: Wed Oct 29 10:14:30 CET 2008 [ This should go in a separate CS.txt log. ] What I miss most in C are closures and directed acyclic datatypes. This made me think a bit about different classes of data structures in a C program. In a dynamic language with GC these all tend to blend. 1. Linear types (pure trees): every object has a single reference, which could be implemented as embedded structs (C struct inheritance) or embedded struct pointers (C struct delegation). 2. Directed acyclic graphs: objects can have multiple references, but there are no loops. Reference counting works here. 3. General graphs: If there are circular references, reference counts won't work: you need to build "intelligent" constructors / destructors for the whole graph. Entry: C Date: Mon Nov 10 12:11:11 CET 2008 As long as everything is static, C isn't so bad. But anything that involves complicated datastructures is a pain.. The first thing to standardize on is a list implementation. Looks like this is done the right way in the Linux kernel. Entry: RPC and ad-hoc protocols Date: Mon Nov 10 17:34:00 CET 2008 This is about different ways of looking at procedure calls, mostly from an object-oriented pov (modify object in-place) vs. functional programming view (share + copy). * In the OBJECT model, the client passes a data object to the server, which is modified in place and returned back to the client. (I.e. Unix IOCTL). The advantage here is that server is relieved from memory management. In the setting of FP, linear memory management can be used. * In the FUNCTION model, there is no mutation of shared data between client and server. The client passes a copy of the necessary data to the server over a message channel, and the server returns a reply over a different channel. The problem I ran into is having to implement an RPC interface over a unidirectional message protocol. This happens when what you want is really a shared library of data objects and function calls, but it needs to provide control operations behind the scenes (i.e. it has an event loop that's not triggered by the clients). An interesting road is to investigate what is necessary to write an RPC generator: eliminate the red tape by hiding the message protocol from the user, and expose only procedure calls on both client and server side, where client calls, and server gets called. The problem with automating RPC generation is data-sharing: often in C, a function will allocate or mutate an object, and thus creating a side-channel. So it looks like this abstraction is going to be somewhat leaky, since it needs to be purely functional (copy instead of mutation). Entry: Protocol definition Date: Tue Nov 11 13:26:19 CET 2008 1. Make sure code never depends on the protocol: all encoding should be done automatically. 2. Provide automatic documentation for wire debugging. (Or better still, a sniffer that produces human-readable data.) Entry: packet filters vs. aspect oriented programming Date: Wed Nov 12 12:29:53 CET 2008 I'm working on an RPC mechanism over unidirectional Xenomai message queues, and it struck me that a packet filter is actually a pointcut, a set of join points matching a specification: given a stream of RPC messages, let some through but modify others. Entry: CSP style channels on Xenomai Date: Wed Nov 12 16:13:23 CET 2008 I'm trying to use queues for something they are probably not useful for. What I really want is CSP style channels. There are two problems: * A select() is necessary to wait on multiple events. * Tasks might be too expensive for fine task granularity. This describes the CT library from UTwente: http://www.ce.utwente.nl/rtweb/publications/2007/pdf-files/102CE2007_WMC07_pdf.pdf Q: What is the difference between waiting on a set of messages from a single "socket", and waiting on a set of ports? Can such descriptions be automatically transcribed? Anyways, the event flag group services provide the necessary machinery: An event flag group is a synchronization object represented by a long-word structure; every available bit in such word can be used to map a user-defined event flag. When a flag is set, the associated event is said to have occurred. Xenomai tasks and interrupt handlers can use event flags to signal the occurrence of events to other tasks; those tasks can either wait for the events to occur in a conjunctive manner (all awaited events must have occurred to wake up), or in a disjunctive way (at least one of the awaited events must have occurred to wake up). Entry: Daemon vs. Shared library Date: Tue Nov 18 11:16:06 CET 2008 Since a shared object can be used to share static data, as long as processes need to communicate on the same machine and need RPC semantics (clients are single thread, server uses synchronized access), this is probably enough. However, this does needs shared memory for the shared data, and semaphores to protect them. The API can abstract this. Entry: Implementing CSP in a single thread. Date: Tue Nov 18 16:17:20 CET 2008 Apparently, "CSP semantics" for conditions are inefficient to implement. One usually chooses for "Mesa semantics", where setting a condition does not mean a yield, which means that the condition can change before signalled tasks are effectively woken up. The remedy is to always check the condition after wake-up (loop before you leap!) http://www.cs.duke.edu/courses/spring01/cps110/slides/sem/sld005.htm But as far as I can see CSP semantics are only inefficient if task switches are expensive. The idea is probably that task switches can be limited if a single thread sets more than one condition in a row. So, when task switches are cheap (stack machines), this problem goes away? I'd like to implement this in Staapl. Entry: Iterators vs. Fold vs. Comprehensions Date: Fri Nov 28 12:11:49 CET 2008 http://lambda-the-ultimate.org/node/1224 What you want in C however, is a comprehension instead of a fold: this allows to use lexical variables without creating explicit iteratee context objects, or using gcc's downward closures. A comprehension is really just the macro form of a fold: the iteratee is inlined. The problem with comprehension macros is the maintenance of state across the loop body. Such a macro looks like: type1 var1; type2 var2; ... FOR_COLLECTION(var1, var2, ...) { // code that uses var1, var2, ... } Without access to the scope of the FOR and an intermediate scope of the loop body, this becomes quite difficult to hack.. Something like this might work better: type1 var1; type2 var2; ... FOR_COLLECTION(var1, var2, ...) // code that uses var1, var2, ... END_FOR In short, it's too complicated to do in general, so let's stick to generic folds. EDIT: Actually it is not necessary to store iteration state variables inside the context. Using a macro like this: #define DE_COLLECTION_FOR_(_list, _de, _del) \ struct list *_del; \ for(_del = (*_list), _de = _del ? _del->de : NULL; \ _de; \ _del = _del->next, _de = _del ? _del->de : NULL) #define DE_COLLECTION_FOR(list, de) DE_COLLECTION_FOR_(list, de, __del__) The iteration state is stored outside the context. The compiler will probably optimize the use of the variable, so we need not worry about memory usage. The only remaining problem is the "__del__" symbol: it is generated inside the macro and might cause name clashes. How to implement gensym in C?[1] [1] entry://20100825-142132 Entry: C++ local objects + exceptions. Date: Mon Dec 8 10:46:40 CET 2008 Local objects are destroyed on exceptions. This is actually quite useful as a code generation target.. But isn't this the same as alloca() ? Entry: GET/SET refactoring Date: Fri Dec 19 11:05:07 CET 2008 I've found an interesting test case for automatic refactoring: implementing endianness abstraction by exctracting GET/SET accessors from direct struct access in a low-level driver that uses little endian packed data structures. Entry: datastructures in C Date: Wed Jan 21 17:14:43 CET 2009 In lisp, a composite datastructure is always a bunch of pointers to other data structures. In C, some data might be "inlined". This inlining is done behind the scenes in Lisp (i.e. tagged ints), while in C you have: struct->field or struct.field And it's always _your_ responsability to know which of both it is. Is it possible to do this automatically, but still retain manual inlining at the definition site? I.e. to write something like struct=>field, and have the compiler expand it to struct->field or struct.field (analogous to Macros in Staapl)? Entry: memory management + RT Date: Thu Jan 29 09:27:42 CET 2009 - RT datastructures (directed acyclic graphs) This means eliminating malloc(). How far can we get with DAGs? With downward closures and read-only structures all data is stored in the tasks's activation frame. This means DAG memory management without reference counting. - traversal and partial continuations In the face of lightweight tasks (i.e. stack machines), partial contuations (tasks) become a feasible abstraction. Figure out if there are "free rides". I.e. if this can lead to datastructures that would otherwise require special memory management (refcounts) but can be built on top of stack-based allocators. One remark though: this requires one (arbitrarily large) stack per task, so just moves the problem. The problem being that _real_ memory comes from a shared pool - a machine architecture problem. So.. Can we solve this problem with virtual memory? Or is VM exactly the thing that makes context switches expensive? To summarize the idea: ASSUMPTION: Downward closures are 'powerful enough' for building data structures in simple RT programming. ADVANTAGE: Use fast stack-based memory management. PROBLEM: One stack per task. How to allocate this from a single memory pool? It is a non-issue for non-shared memory, but that might require inefficient pre-allocation. Where is the trade-off exactly? Entry: logic & constraint based programming Date: Sat Jan 31 12:35:51 CET 2009 More specifically: reversible dataflow. From a collection of equations (or inequalities), construct a C program or interpreter + program representation that computes a solution (or computes feasability). Entry: lists Date: Thu Feb 5 17:29:21 CET 2009 For a simple static C program with a bit of dynamic data, this is a simple list structure based only on concatenation, with "next" pointers embedded in the objects. This works for pure trees (flat trees). Conceptually, only these operations are necessary: * a PUSH operation of a singleton to a stack. * a REVERSE operation to finalize construction preserving order * a FOR operation for traversal If objects have built-in list pointers, there is conceptually no "element". All things are lists, and the primitive operations are "overlay_1" and "split_1". overlay_1 (abcd... , 1234...) = a1234... split_1 (abcd...) = bcd... I'm not sure if this way of looking at it is so useful, but there _is_ a difference between objects that contain a next pointer, and obects that are contained in a separate container structure (i.e. cons cells). Using embedded next is commonplace, and is easier to use with manual memory management, so maybe it's better to stick to it. Another way of looking at it: if datastructure construction is not time-critical, using quadratic list insertion is not really a problem.. That way append and iteration might be enough. The problem with embedded "next" pointers is that if you want to put the objects in an alternative container, you get two different containement mechanism (one built-in, one explicit). Entry: Data structure traversal in C Date: Fri Feb 6 15:31:43 CET 2009 In a block-scoped language without proper lexical closures, comprehensions (iteration macros) are easier to use than iteratatee objects (higher order functions) -- the operation passed to an iterator. The reason is that the body of the comprehension has access to the lexical environment outside of that body. In this scenario, factor out the data structure's FIRST and NEXT operations into procedures, and construct a trivial FOR macro using those based on C's for( _ ; _ ; _) construct. They are more limited: the comprehension loop block cannot be reused, while (stateless) iteration objects can be. EDIT: it is all about the difference between ``let'' and ``lambda''. The latter ``forks'' the stack while the former does not. Entry: (*x)[i] Date: Mon Feb 9 12:00:00 CET 2009 Given int **x, what's the difference between (*x)[i] and *x[i] ? This is clear to see when we write (*x)[i] and *(x[i]); In the first x is a pointer to an array of ints, in the second x is an array of pointers to int. Be careful, the compiler can't see the difference! Entry: C macros and name capture Date: Thu Feb 19 17:41:02 CET 2009 Today I spent an hour tracking down a classic bug: hygiene problems with non-hygienic macros (C preprocessor). See the following definition, call and expansion: // definition #define LST_ADD(head_lvalue, tail) {\ typeof(head_lvalue) *p = &(head_lvalue); \ while(*p) { p = &((*p)->next); } \ *p = tail; } // call LST_ADD(m->param_head, p); // expansion { typeof(m->param_head) *p = &(m->param_head); while(*p) { p = &((*p)->next); } *p = p; }; The introduced symbol "p" shadows the variable "p" in the calling context. This can be avoided by always binding all macro input names to local variable names before introducing macro-local names, but it's sufficient to bind those that might be shadowed by the introduced binding, which depends on how they are used. // new definition #define LST_ADD(head_lvalue, tail) {\ typeof(tail) _tail = tail; {\ typeof(head_lvalue) *p = &(head_lvalue); \ while(*p) { p = &((*p)->next); } \ *p = _tail; }} Entry: Automatic binary network protocol definition Date: Thu Feb 26 15:09:23 CET 2009 The idea: given a collection of (somehow restricted) functions, create server and client protocol pack/unpack routines. The reason: I loose too much time doing this by hand. The restrictions: obviously, the functionality should fit in a simple RPC mechanism that sends one packet to a server, and expects a return packet (see above). The main difficulty is memory management, and mapping C function calls to structs and back. Entry: classes vs. prototypes Date: Sun Mar 29 14:30:19 CEST 2009 Writing plain vanilla C code, one often wants some form of OOP indirection. Instead of storing references to classes in objects, it is often simpler to get rid of this level of indirection and store methods directly in objects. This is called ``Prototype based OOP''. When it is clear that some kind of class structure emerges, "behaviour" objects can be constructed: objects which contain only a collection of methods and are referenced by other objects. http://en.wikipedia.org/wiki/Prototype-based_programming Entry: Concurrency oriented programming in C + Scheme Date: Sat Aug 1 21:58:50 CEST 2009 The context of this article is the use of Scheme as an extension language in a predominantly C-based application. It is a way to make C programming easier by sticking to a subset that performs only `hierarchical' data allocation on the C-stack (i.e. doesn't use malloc() and free(), or any kind of explicit alloc/dealloc). Let's call this subset stack-C. From my experience(1) I make the assumption that stack-C is enough to solve many practical subproblems, given the external scheduling and memory management can be solved. The main idea is that while mixing ordinary C with (scripting) languages that have a different memory and execution model can be problematic (i.e. Scheme with GC'd CONS cells and a continuation-based control mechanism), the stack-C approach provides a clean separation between the C side and any time and space resource management provided by the scripting language runtime. The reason is that stack-C tasks can be: - trivially suspended to (contiguous!) memory - resumed multiple times (trivially forked) - garbage collected without finalization - separated completely from memory and control management - trivially parallelized This is essentially declarative concurrency. You shift the object-oriented model (state machines) to tasks. The idea is related to deforestation, where intermediate data structures (which require knowledge of the memory model) are eliminated by (static) scheduling of communicating tasks. Concretely: the stack-C code needs to only consume and produce atomic data, which is easily wrapped, and never builds any (recursive) data collections. When the need arises to build datastructures (intermediate representations) this can be done in the scripting language, without complicating the C code. This allows core algorithms to be written in `functional C' while all data management and code scheduling can be performed by the scripting language, be it a GCd Scheme, or a linear concatenative language. Memory and time management can then be `library provided'. I wonder if this is the key to a practical linear language. The fact that such `suspendable functional C' primitives could be used in two completely different memory models is a powerful hint. Footnotes. (1) The idea comes from an implementation of an EtherCat master driver written in C, in a style that does not use malloc() for data structure allocation, but instead allocates all memory on the activation stack. This made me wonder if this programming style can't be generalized. The idea popped up again trying to extend TinyScheme with `inverted enumerators' aimed at replacing list allocation with coroutine / generator style parameter linkage. (2) When stack-C is extended to include open/close or malloc/free style external resource management, the main challenge in this approach is finalization of tasks[2]. What we do is creating cursors by enumerator inversion, which doesn't guarantee the traversal is run to the end. This might however be elegantly solved by always re-inverting such inverted enumerators back to enumerators in the scripting language. Once could however argue that managed external resources are an implementation artifact and could be designed out. (3) Alternatively, a linear approach (no cyclic data, refcount based management) makes this easier to manage. However, it does require a finalize() method to the task. Linearly managing the execution contexts themselves (making fork() expensive) might be not such a bad idea. [1] http://okmij.org/ftp/Haskell/misc.html#fp-arrays-assembly [2] http://okmij.org/ftp/Scheme/enumerators-callcc.html#Finalization [3] http://okmij.org/ftp/papers/LL3-collections-enumerators.txt Entry: Generating C code Date: Sun Aug 9 12:52:37 CEST 2009 When generating boilerplate C code for a project, it seems better to generate either full C files, or header files with defines and inline functions. Including bits and pieces of C code from files seems error-prone. Entry: Unlikely Date: Sat Aug 15 11:36:41 CEST 2009 I've started to use the `unlikely' annotation. #define unlikely(x) __builtin_expect((x),0) if (unlikely(NULL == foo)) { ... } Apart from catering to that insatiable hunger for fast code it is also makes code more readable: on the first reading pass you can ignore these checks to see the default control path. Entry: CONS using alloca() Date: Mon Aug 17 19:28:18 CEST 2009 I'd like to write a tokenizer / parser in C without using malloc(). Is it ok to use alloca() for this (i.e. one per character)? I did this before actually... Hmm.. Apparently alloca() is no longer standard? Something to do with ANSI C's specification of compile-time knowable stack frame size. Entry: mmap() tricks Date: Mon Aug 17 19:34:43 CEST 2009 How to use mmap() for memory allocation tricks? Say I want to allocate a single 4K page into my address space, and free it later. Can this be done using mmap? It's possible to place a page at a fixed address using MAP_FIXED. Entry: Dynamic scope and thread-local variables. Date: Tue Aug 18 12:42:11 CEST 2009 Dynamic scoping is easy to implement using global variables and save/restore combined with some `dynamic-wind' guard for longjmp() in case that's used. However, in the presence of pre-emptive threads, this isn't an option. So how to use thread-local storage? [1] http://en.wikipedia.org/wiki/Thread-local_storage [2] http://publib.boulder.ibm.com/infocenter/iseries/v6r1m0/index.jsp?topic=/rzahw/rzahwex1.htm Entry: Will use mmap() instead of malloc() for allocation of major heap chunks. Date: Mon Aug 24 09:54:03 CEST 2009 That's what shows up in the output of the ./configure of (Meta)OCaml. Entry: Hiding circular dependencies using "temporary binding". Date: Mon Jul 26 14:07:25 CEST 2010 If you stick to proper trees as datastructures for C programs, you can avoid memory allocation problems. In such case each datastructure has exactly one owner/parent/... an can in principle be _embedded_ in that larger structure. If you need directed, acyclic graphs, you need reference counting. If you need generic cyclic object reference graphs, you need asynchronous GC. However, in many cases it is possible to go back to DAG / tree rep by providing a "temporary binding" object that associates 2 objects that need to know about each other. This requires a rule that methods cannot store pointers to other objects, but they might be "temporarily bound" during the execution of one method. The binding object (the one that actually stores the two pointers) is then responsible for the management of the two pointers. This is actually dependency injection[1] combined with a `short-lived clause': object only live during method call. In the DAG case, another trick is to impose an order on the data structure. The simplest one is the call stack: if a DAG is built only from local variables, simply returning functions will properly cleanup. FIXME: de-ramble [1] http://en.wikipedia.org/wiki/Dependency_injection Entry: Translating error codes Date: Wed Jul 28 08:01:22 CEST 2010 I notices an interesting pattern: because error codes are not real exceptions, they do not have a clear identity. I.e. an error code obtained from a different framework probably needs to be translated before it is returned up the chain. In C, it seems to be best to keep a single error namespace for all errors in a single module to avoid multiple translations. Entry: Error handling and "smart code" Date: Thu Jul 29 10:46:03 CEST 2010 For very stateful object (i.e. file systems) that have many possible local errors that could in principle be solved locally, it might sometimes be better to separate the code into two layers: * A dumb layer that gives up at the first sign of trouble, cancelling the transaction (not modifying the state). * An error recovery layer at the API entry point that localizes retry/resolve to all conditions that are recoverable, restarting transactions from the start. The basic idea is: don't _ever_ leave your object in an inconsistent state between method calls. This can sometimes conflict with "use small method calls" and pushes you towards thinking very hard about what a decently sized transaction does. If a state transition consists of a lot of small increments with intermediate inconsistencies, make sure it can be aborted at all times. In C this can be done by keeping the updated state in local variables, and only committing when everything went right. Entry: Error handlers as goto labels Date: Fri Jul 30 07:23:28 CEST 2010 What you see a lot in C code is a "goto error" sprinkled around code, where the "error" points to code that performs some error recovery before function return. The problem with this is that you don't know where the error condition came from when you set a debugger breakpoint on the "error" label. The solution is to place the handling code in a separate function. Instead of "goto error" one would use: return handle_error(err); This way a breakpoint can be set on "handle_error" and from the backtrace it is clear which branch of the code generated it. Entry: An argument for NULL-terminated lists Date: Wed Aug 11 19:15:31 CEST 2010 It's funny how a seemingly arbitrary choice keeps roaring its head. If you represent an array in C or any other low-level language that allows pointers, you have a choice between: - size + contiguous vector (SV) i.e. Pascal string - contiguous vector + sentinel (V0) i.e. C string Up to now I've found that all other things being equal, the SV is better because 1) you don't need out-of-band values and 2) you know the size without traversal. Granted, NULL usually isn't a problem as an out-of-band value, but it can be problematic in other cases where the elements are not simply memory addresses. Knowledge of the total size can come in handy when the vector is being accessed sequentially (i.e. during serial communication), and you need to allocate temporary storage. In the V0 case you need to traverse twice: once to get the size, and once to copy the data. This can be expensive and in some cases even impossible, in which case you need to start guessing the size and handle cases where you guessed to low. However, today I've found a case where V0 is actually better: when you want a compact representation of traversal stacks for tree data structures(1). In the V0 case you just need to store a stack of element pointers, and pop the stack whenever the sentinel is reached (i.e. the "return" instruction). In the SV case you need to store _two_ words per recursion level: one current pointer, and one counter or pointer to the end of the data structure(3). (1) Sometimes you do want explicit traversal stacks: recursion in C can be memory-inefficient due to presence of arguments or local variables that are "global" to the data structure descent. (2) Actually implementing a traversal on a SV-style tree it struck me that I could still use a single-word stack by _copying_ the elements to the stack all at once. Of course this is not a real solution as it less efficient overall wrt. stack space: whole lists are stored instead of pointers into lists. (3) The V0 vector case are CDR-coded linked lists, meaning you can easily "pop" such a data structure[1]. [1] entry://20081023-144321 Entry: Programming without malloc() Date: Sun Aug 15 21:05:41 CEST 2010 ( Related to PF and linear memory management. ) How come that I have not much trouble writing linear C code (not using malloc) for embedded systems work? Is it that embedded systems code is inherently "simple", or does it have to do with component-oriented approach (usually embedded software is really about the underlying hardware, which is necessarily finite). [1] entry://../libprim/20100815-204639 Entry: Explicit recursion stack Date: Mon Aug 16 07:45:09 CEST 2010 To eliminate recursive calls in C and replace them with an explicit stack, use the following approach: 1. Make the code self-recursive by unrolling mutually recursive calls. This is necessary to provide a single "context" to the loop. The stack will be in that context. 2. Replace the recursive call with a CALL macro that pushes the current data structure pointer to the stack and jumps to the entry point. 3. Replace the entry point to examine the top of the stack. Entry: Stacks and Queues Date: Tue Aug 17 09:19:20 CEST 2010 Another non-obvious obvious thing (NOOS). Stacks and queues are "dual" as they represent two ways of traversing a tree: depth-first and breadth-first. Now, is a breadth-first search necessarily less efficient than a depth-first search? It seems it needs more bookkeeping information. Or is this a consequence of how I usually implement? Entry: Explicit recursion and post-ops. Date: Tue Aug 17 09:46:52 CEST 2010 To write a tree descent algorithm which performs an operation _after_ recursion in the tree, the recursion stack needs two types of "continuations". One for the child traversal, and one for the post-op. This is exactly the same problem as the representation of continuations in a language interpreter. The solution is to build a stack which has elements that are unions instead of structs, and write a dispatcher (pattern matcher) for the different component types. Entry: Struct assigments Date: Tue Aug 17 11:08:05 CEST 2010 Apparently memcpy() isn't really necessary for structs. [1] http://stackoverflow.com/questions/324011/memcpy-vs-assignment-in-c [2] http://groups.google.com/group/comp.lang.c.moderated/browse_thread/thread/17f9d2a82b1ee88f/31d43c3c27a478af?pli=1 Entry: Static allocation: stacks and queues. Date: Wed Aug 18 18:34:27 CEST 2010 * In an embedded application, if a static structure is built at run-time, built up from many small fragments of memory, it might be better to allocate from a stack than to pollute the heap. If the amount of memory needed is predictable, the stack size can even be fixed at compile time. This requires a two-pass algorithm though. * Instead of writing pointers to a work queue, it is also possible to write self-contained objects. This avoids malloc() for the payload data. Both are really the same pattern: ``catch'' memory allocations and implement them differently based on knowledge of the memory lifetime. Entry: Non-cyclic objects Date: Thu Aug 19 18:47:17 CEST 2010 The free() operation is trivial for "linear" objects that do not contain any links to dynamic resources: simply discard. This is the: ``finalizers are evil'' anti-pattern. Is this actually useful? Entry: Constant data in Flash Date: Thu Aug 19 22:43:47 CEST 2010 How do you make sure constant data goes into Flash memory? Is it possible to generate constant data structures with cross references? From [1]: const volatile int *data __attribute__((section("FLASH")) = /* whatever */; [1] http://stackoverflow.com/questions/1284619/gcc-c-arm-and-const-pointer-to-struct-field [2] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474a/BABHIIEF.html Entry: Constant Recursive Data Structures Date: Mon Aug 23 18:36:41 CEST 2010 How to define constant, cyclic data structures in C? Same as for functions! The syntax for variable definitions and declarations is the same, it is the order that matters. Only the last occurance is a definition, the rest are declarations. //------------------------------- #include struct node; struct node { const struct node **parents; const struct node **children; }; const struct node a; const struct node b; const struct node c; const struct node d; const struct node e; const struct node *a_p[] = {NULL}; const struct node *a_c[] = {&c, NULL}; const struct node a = {a_p, a_c}; const struct node *b_p[] = {NULL}; const struct node *b_c[] = {&c, NULL}; const struct node b = {b_p, b_c}; const struct node *c_p[] = {&a, &b, NULL}; const struct node *c_c[] = {&d, &e, NULL}; const struct node c = {c_p, c_c}; const struct node *d_p[] = {&c, NULL}; const struct node *d_c[] = {NULL}; const struct node d = {d_p, d_c}; const struct node *e_p[] = {&c, NULL}; const struct node *e_c[] = {NULL}; const struct node e = {e_p, e_c}; int main() { return 0; } Entry: Book about embedded C Date: Mon Aug 23 21:05:07 CEST 2010 I wonder if it would make sense to start writing abook about embedded C. Maybe it's best to start with cataloging the tricks I already know. TOPICS: * Memory allocation patterns - static memory - stacks and queues - pure trees and inline structs - DAGs and refcounting - garbage collection and Scheme - hiding circular deps (the "environment" pattern / S combinator) * Data structure traversal - callbacks with context - generic fold w. callback - generators - for MACROS to avoid context objects * Debugging - gdb as an application console Entry: Gensym in C Date: Wed Aug 25 14:21:32 CEST 2010 To make iteration macros for abstract data structures on top of the `for' keyword, it helps to be able to define unique symbols to store the iteration state object. I.e. #define X_COLLECTION_FOR_(_list, _x, _xl) \ struct x_list *_xl; \ for(_xl = (*_list), _x = _xl ? _xl->x : NULL; \ _x; \ _xl = _xl->next, _x = _xl ? _xl->x : NULL) #define X_COLLECTION_FOR(list, x) X_COLLECTION_FOR_(list, x, __xl__) The `__xl__' symbol above is inserted into the macro invocation name context as a local variable and might clash. Is there a way to properly solve this problem? The following seems to work: #define __GENSYM2(x,y) x##y #define __GENSYM1(x,y) __GENSYM2(x,y) #define GENSYM(x) __GENSYM1(x,__COUNTER__) The indirection is about prescan[3]: "If an argument is stringified or concatenated, the prescan does not occur. If you want to expand a macro, then stringify or concatenate its expansion, you can do that by causing one macro to call another macro that does the stringification or concatenation." [1] http://stackoverflow.com/questions/1132751/how-can-i-generate-unique-values-in-the-c-preprocessor [2] http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html#Common-Predefined-Macros [3] http://gcc.gnu.org/onlinedocs/cpp/Argument-Prescan.html#Argument-Prescan Entry: Resource Acquisition Is Initialization (RAII) Date: Thu Aug 26 11:57:21 CEST 2010 Mixing resources and exceptions, or how to model external resources as short-lived local (automatic) variables. This tickles the idea again about ``resource-free programming''. In general, ``garbage collection is good but finalizers are bad'' [Steele]. [1] http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization [2] http://calculist.blogspot.com/ Entry: Constant multidimensional arrays Date: Mon Aug 30 10:48:25 CEST 2010 One of those things that has always puzzled me is how multidimensional arrays work in C. I've always avoided this and embedded multidim arrays inside of 1-dim arrays using explicit coordinate mapping. int temp[0x10][0x20]; In the definition of `temp`, is there an array of pointers involved? K&R: "In C, a two-dimentional array is really a one-dimensional array, each of whose elements is an array.". So my confusion is mostly about the difference between an array and a pointer. These are not the same! See the sizeof() macro. The confuses arrises because an array is automatically converted to a pointer whenever it is used in a pointer context. Entry: Order of evaluation in CPP macros Date: Tue Aug 31 11:19:00 CEST 2010 Is it possible to play with order of expansion in CPP? Judging from the GENSYM[1] macro it is. How to do this systematically? Should evaluation order matter? For a purely functional language, evaluation order only matters for termination issues. So it's no surprise that we run into evaluation order when an impure features is used (the __COUNTER__ variable). [1] entry://20100825-142132 Entry: Defining constant linked lists using CPP Date: Tue Aug 31 13:37:45 CEST 2010 Is it possible to define constant linked lists from isolated macro expansions? I'd think it's possible using name concatenation, but it would still require manual input of symbols, so it's no worse than giving an explicit previous list item to link.. Entry: git diff Date: Tue Aug 31 17:12:23 CEST 2010 Not really C related, but I only use git for C projects. Problem: make a diff that can be applied by `patch' on a subtree of a git repository. It's probably simplest to use two braches and diff between the branches. Entry: C99 function names: __func__ Date: Wed Sep 1 13:59:19 CEST 2010 See GCC manual[1]. Apparently __func__ is part of the C99 standard, and behaves as: static const char __func__[] = "function-name"; __FUNCTION__ is a GCC extension and not portable. [1] http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Function-Names.html Entry: Non-invasive C code tracing Date: Wed Sep 1 14:56:44 CEST 2010 Is it possible to attach code to C function entry and exit points in a non-invasive way? One (invasive) way is to wrap each function in an instrumentation function. Entry: The meaning of `const' (.rodata section) Date: Tue Sep 7 09:10:01 CEST 2010 One often hears that ``const is broken'' which means that it is possible in C to get an unrestricted pointer to const data. However, on a more pragmatic side, the `const' qualifier does tell the compiler that the data is _intended_ to be constant, and as such it can be put in a separate linker section `.rodata'. This section can then be put in (p)ROM. I.e. for ARM uC it can go into program Flash. Entry: Linker sections on AT91SAM7X512 (ARM7TDMI, 128k RAM, 512k Flash) Date: Tue Sep 7 09:25:35 CEST 2010 The output of objdump (reformatted): tom@one:~/$ objdump -h app.elf app.elf: file format elf32-little Sections: Idx Name Size VMA LMA File off Algn 0 .debug_aranges 000021f8 00000000 00000000 00030e30 2**3 CONTENTS, READONLY, DEBUGGING 1 .debug_pubnames 00006976 00000000 00000000 00033028 2**0 CONTENTS, READONLY, DEBUGGING 2 .debug_info 0007fbd5 00000000 00000000 0003999e 2**0 CONTENTS, READONLY, DEBUGGING 3 .debug_abbrev 00014342 00000000 00000000 000b9573 2**0 CONTENTS, READONLY, DEBUGGING 4 .debug_line 00017cc3 00000000 00000000 000cd8b5 2**0 CONTENTS, READONLY, DEBUGGING 5 .debug_frame 000062f4 00000000 00000000 000e5578 2**2 CONTENTS, READONLY, DEBUGGING 6 .debug_str 0000f462 00000000 00000000 000eb86c 2**0 CONTENTS, READONLY, DEBUGGING 7 .debug_loc 0001b6ee 00000000 00000000 000facce 2**0 CONTENTS, READONLY, DEBUGGING 8 .rom_vectors 00000040 00100000 00100000 00008000 2**0 CONTENTS, ALLOC, LOAD, READONLY, CODE 9 .text 0001e790 00100040 00100040 00008040 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 10 .rodata 00002580 0011e7d0 0011e7d0 000267d0 2**3 CONTENTS, ALLOC, LOAD, READONLY, DATA 11 .got 00000228 00120d50 00120d50 00028d50 2**2 CONTENTS, ALLOC, LOAD, DATA 12 .fixed_vectors 00000140 00200040 00200040 001163c0 2**5 CONTENTS, READONLY 13 .data 00000cac 00200180 00120f78 00030180 2**3 CONTENTS, ALLOC, LOAD, CODE 14 .bss 0001140c 00200e2c 00121c24 00030e2c 2**4 ALLOC 15 .ARM.attributes 0000002c 00000000 00000000 00116500 2**0 CONTENTS, READONLY 16 .debug_ranges 000073e0 00000000 00000000 00116530 2**3 CONTENTS, READONLY, DEBUGGING 17 .comment 00001784 00000000 00000000 0011d910 2**0 CONTENTS, READONLY VMA = virtual memory area LMA = ??? 00100000 Flash 00200000 RAM 00000000 Remappable Flash/RAM Some sections start at 0 but this seems to be a dummy value, i.e. for DEBUGGING. Entry: Reinventing linked lists Date: Tue Sep 14 09:35:46 CEST 2010 Yes, I know i deserve to be slapped when implementing (sorted) linked lists in C, but really, sometimes it's just easier to just write it down. Trouble is, this is error prone. I _think_ I can just write it down but there's always some bug in the code that's not caught by the type system. Double pointers are notorious for this. See the following: struct timeout_list **tol = &a->timeouts; while ((*tol) && ((*tol)->time < at)) { (*tol) = (*tol)->next; } What I really wanted to write is: struct timeout_list **tol = &a->timeouts; while ((*tol) && ((*tol)->time < at)) { tol = &((*tol)->next); } The latter iterates, while the former removes elements! Entry: Emacs and parsing C Date: Tue Sep 14 11:59:08 CEST 2010 There are some tools available, but they don't make the click to me (i.e. CEDET). The one that cought my attention is [1]. The problem is quite simple: * build a database from your source code. * query the database * allow "replace", taking part of a syntax tree and replacing it with something else. * integrate this with emacs The most complex part seems to be the C parser. I.e. to go from code to (modifiable!) datastructure back to code. [1] http://mike.struct.cn/blogs/entry/15/ Entry: Embedding arbitrary data in ELF using objcopy Date: Sat Sep 18 15:13:33 CEST 2010 Good hint from [1]: objcopy \ -I binary -O elf32-i386 -B i386 \ --rename-section .data=.rodata file.bin file.o Default secion is .data which is here renamed to .rodata for embedding of const data that goes into program Flash. This creates 3 symbols: SYMBOL TABLE: 00000000 l d .rodata 00000000 .rodata 00000000 g .rodata 00000000 _binary_file_bin_start 00000007 g .rodata 00000000 _binary_file_bin_end 00000007 g *ABS* 00000000 _binary_file_bin_size To use different names, use something like: -–redefine-sym _binary_file_bin_start=_my_bin_data [1] http://www.doof.me.uk/2010/05/07/cute-objcopy-hack/ Entry: Makefile error: multiple target patterns Date: Fri Sep 17 15:18:20 CEST 2010 Check for multiple colons[1]. [1] http://stackoverflow.com/questions/2100448/multiple-target-patterns-makefile-error Entry: Const correctness Date: Thu Sep 23 10:13:38 CEST 2010 What we already know is that global variables declared "const" will go into the .rodata section. What about functions? I'd like to hash the __FUNCTION__ variable into an integer to get method tags. Does the compiler inline this if the hashing function is declared const? [1] http://en.wikipedia.org/wiki/Const-correctness [2] http://stackoverflow.com/questions/212237/constants-and-compiler-optimization-in-c Entry: Makefile patterns Date: Fri Feb 25 16:21:30 EST 2011 Pattern targets. The pitfall is to not make them too general. How are ambiguities handled? What I like to know is how to insert a phony target into a real one. I.e. some targets do not have a full dependency tree, only a recursive make... What I want to do is to build a better build system for ecos. - If the config file or any of the cdl files change, regenerate the whole tree. - For other changes, just perform recursive make. Maybe it's not really worth it : keep current as is (only rebuild if config changes) and do manual rebuild when working on eCos source files. Entry: Environment or makefile variables? Date: Sat Mar 5 14:27:01 EST 2011 Basic idea: stick to one; don't use both. Entry: Datastructures in Flash Date: Sat Mar 5 23:35:58 EST 2011 Maybe I misunderstood the code, but I saw a segment in eCos where a datastructure is defined and attributed to a certain linker segment. The linker then collects all of these and places them next to each other. This seems like a nice way to build contiguous memory areas from dispersed source code locations when using C as a compiler target language. [1] entry://20100918-151333 [2] http://../ecos/20110529-123810 Entry: GNU make : target specific variables Date: Wed Mar 9 12:42:34 EST 2011 [1] http://www.gnu.org/software/make/manual/make.html#Target_002dspecific Entry: Generators Date: Thu Mar 17 11:53:12 EDT 2011 When you have list structures moving between layers of code, say A->B->C, there is a choice to be made about the form of the data that flows between the functions. The simplest approach is to put a bunch of data in a (large) buffer and pass it on to the next function. An alternative way is to pass only a single element at a time, and place the A->B->C components inside a loop. Taking some ideas from [1], and taking into account the limitations of C, the optimal seems to be to provide a "fold" for each data structure. In C code I prefer to use the word "for". The main advantage of such an approach is that it avoids intermediate data structure storage such as large buffers to pass lists of elements from one function to the next. Using the "for" loop aproach instead of the iterator object approach -- i.e. methods that implement first, next, last -- has the added advantage that locking and other resource management can be abstracted away from the callback function. [1] http://okmij.org/ftp/Streams.html#enumerator-stream Entry: Abstract data types vs. static RAM allocation Date: Fri Apr 1 14:28:39 EDT 2011 There is a clear conflict of interest between 1. C structures with static allocation 2. Data abstraction It would be handy to be able to know the size of a data structure at compile time, but hide its layout such that data accessors can be used. I'm not so much interested in hiding reads, but writes really do need special attention, as they can unknowingly introduce the worst kind of (single-threaded) bug the data structure invariant violation. I was thinking about a hack that would use the C "const" to protect a struct from direct write access, but cast the const in the implementation. That might not interfere well with optimization though.. However it might be so that the C compiler doesn't use the "const" for optimization simply because const pointers can be recast. I'd like to pose this question on Stack Overflow. How to formulate? Q: How to get at the size of a forwared declared struct? A good practice when writing C code is to always use forward structure declarations (incomplete types) in a header file, and keep the implementation in a code file. // in header struct abc; // in code struct { int a; int b; int c; } When writing C code in an embedded development setting with a lot of memory perssure, it is often a good idea to statically allocate data structures whenever possible. Doing so requires complete types. i.e. a struct definition which has all fields accessible by the user. Is there an elegant way to resolve the conflict of these two constraints? Is it possible to provide struct size information at compile time to make it possible to perform static allocation, but at the same time hide the layout of the struct to prevent arbitrary access? Entry: C and packed bitfields Date: Sun Apr 3 13:49:15 EDT 2011 Is this actually useful? Entry: Producer/consumer in C in an embedded context Date: Mon Apr 4 14:11:08 EDT 2011 Main problem in embedded is to limit RAM memory usage. This makes algorithms that use intermediate data structures prohibitive. Usually there is plenty of ROM in comparison to RAM, so it is usually possible to use use manual "deforestation". What this means is to turn code into producers/consumers (threads and streams). In languages that support partial contiuations this is straightforward. How to do it in C? Entry: "for" functions or "FOR" macros? Date: Tue Apr 5 21:15:28 EDT 2011 The idea is to always use abstract collection traversal interfaces. There are essentially two forms: - the STREAM: open, close, next, eof - the early-abort FOR loop, either as a macro or as a higher order function (callback + state). Most people are familiar with the STREAM option, but the FOR function has some advantages, such as centralizing resource allocation (open/close are abstract). In C, a "FOR" macro is more convenient since it can use its surrounding lexical context without using tricks, but a "for" function is more powerful though a bit clumsy to use for its need of specific "context" structures (closures). Entry: static inline: (pseudo-) separate interface from inplementation Date: Thu Apr 28 12:28:44 EDT 2011 I tend to use "static inline" functions a lot for simple accessors to allow the compiler to perform more aggressive optimization. The problem is that this always needs to combine interface and implementation for two reasons: - data structures need to be fully specified. - function bodies are visible. The former you can't do anything about: compiler needs to know the data layout so it's technically possible for the API user to abuse this. The latter makes it hard to read headers. It seems a good idea to use the following: use forward declarations for data structures and inline functions and stash the implementation of both in a _priv.h file that is not supposed to be looked at by human eyes, but is there for the compiler: // foo.h struct foo_data; static inlien int foo_data_size(struct foo_data *x); #include "foo_priv.h" // foo_priv.h struct foo_data { int a, b, c; }; static inline int foo_data_size(struct foo_data *x) { return sizeof(foo_data); } Another issue whether one allows struct members as: struct highlevel { struct lowlevel_foo foo; struct lowlevel_bar bar; } or needs to abstract using pointers and possible constructors such as lowlevel_foo_new() and lowlevel_bar_new() to instantiate the objects. struct lowlevel { struct lowlevel_foo *foo; struct lowlevel_bar *bar; } Entry: GNU cflow Date: Thu May 19 17:31:40 CEST 2011 Print a control flow graph (as an indented text file) from a bunch of C files. Another tool is egypt[4]. Apparently doxygen can also use Graphviz to generate call graphs[3]. cflow2dot[5] to connect cflow ang Graphviz. [1] http://www.gnu.org/software/cflow/ [2] http://en.wikipedia.org/wiki/Call_graph [3] http://en.wikipedia.org/wiki/Doxygen [4] http://www.gson.org/egypt/ [5] https://code.google.com/p/cflow2dot/ [6] http://sourceforge.net/projects/cflow2vcg/files/cflow2vcg/0.5/cflow2vcg-0.5.tar.gz/download Entry: Link-Time Optimization (LTO) vs. static inline Date: Sun May 22 18:27:51 CEST 2011 Does GCC perform LTO? Seems the short answer is yes[1]. How to enable it? From [2]: To enable LTO, simply add the flag '-flto' to both compile and link commands. [1] http://gcc.gnu.org/wiki/LinkTimeOptimization [2] http://gcc.gnu.org/ml/gcc/2009-10/msg00060.html Entry: Automatic casting and overflow Date: Mon May 23 16:59:12 CEST 2011 Something I always do explicitly because I can't remember the rules: char i = 255; int j = i + 1; Does i get cast from char -> int before the addition, or after? Where is this made explicit? Section 2.7 Type Conversions in K&R is quite explicit: When an operator has operands of different types, they are converted to a common type according to a small number of rules. In general, the only automatic conversions are those that convert a ``narrower'' operand into a ``wider'' one without losing information [...] Entry: Compile time assert Date: Mon Jun 6 11:30:01 CEST 2011 I need a compile-time assert macro that can go into proprietary code, so can't use the one in Linux. Good news: I did not look at any code yet, so can I reinvent it? The trick alledgedly is to use the macro arguments to construct data of negative size. So let's analyze this: CT_ASSERT(x) - If the argument of the assert macro is zero, a compilation error is triggered. - To trigger a compilation error, construct a data structure with negative size. - A name is necessary for the data structure, so let's use a name generation macro (GENSYM[1]). It turns out to be quite straightforward: using result values of boolean expressions (after a comparison of the input to 0), a simple negation can be used to map the failed assert to a negative array size. #define CTASSERT(x) static const char GENSYM(ctassert_)[-((x)==0)]; EDIT: There is even a more elegant way. Instead of using a code construct that has a run-time run-time remnant as a symbol part of the binary if it's not optimized out, it's possible to use just a completely ephemeral construct: i.e. a type. #define CTASSERT(x) struct GENSYM(ctassert_) {char a[-((x)==0)]; } EDIT: It's also possible to avoid size 0 arrays and use size -1 for false and size +1 for true: #define CT_ASSERT(x) struct GENSYM(_ctassert_){ char a[1 - 2*((x)==0)]; } [1] entry://20100825-142132 Entry: Know your linker: how to avoid unused functions to end up in a binary. Date: Sat Jun 11 18:01:52 CEST 2011 Use the CFLAGS -ffunction-sections -fdata-sections which will put functions in their own sections. Together with the linker option -Wl,--gc-sections unused sections (functions or data) will be removed. The only downside is that using separate sections requires more space due to alignment. Entry: binutils RTFM Date: Sun Jun 12 15:59:20 CEST 2011 From [1]: You can see the symbols in an object file by using the nm program, or by using the objdump program with the `-t' option. [1] http://www.delorie.com/gnu/docs/binutils/ld_8.html Entry: What are __restore and __restore_rt ? Date: Sat Jun 4 15:39:51 CEST 2011 I ran into these missing symbols in an eCos synthetic target build[1]. I have no idea where these functions should be defined, and what the meaning is of the construct in the eCos code: .align 16 .global cyg_hal_sys_restore_rt cyg_hal_sys_restore_rt: movl $SYS_rt_sigreturn, %eax int $0x80 1: .type __restore_rt,@function .size __restore_rt,1b - __restore_rt .align 8 .global cyg_hal_sys_restore cyg_hal_sys_restore: popl %eax movl $SYS_sigreturn, %eax int $0x80 1: .type __restore,@function .size __restore,1b - __restore The build fails like this: make[1]: Entering directory `/opt/xc/ecos/build/cvs/linux/hal/synth/i386linux/current' gcc -c -I/opt/xc/ecos/build/cvs/linux/install/include -I/opt/xc/ecos/src/cvs//packages/hal/synth/i386linux/current -I/opt/xc/ecos/src/cvs//packages/hal/synth/i386linux/current/src -I/opt/xc/ecos/src/cvs//packages/hal/synth/i386linux/current/tests -I. -I/opt/xc/ecos/src/cvs//packages/hal/synth/i386linux/current/src/ -finline-limit=7000 -Wall -Wpointer-arith -Wstrict-prototypes -Wundef -Wno-write-strings -g -O2 -ffunction-sections -fdata-sections -fno-exceptions -Wp,-MD,src/syscall-i386-linux-1.0.tmp -o src/hal_synth_i386linux_syscall-i386-linux-1.0.o /opt/xc/ecos/src/cvs//packages/hal/synth/i386linux/current/src/syscall-i386-linux-1.0.S /tmp/ccIPQc1w.s: Assembler messages: /tmp/ccIPQc1w.s: Error: .size expression for __restore_rt does not evaluate to a constant /tmp/ccIPQc1w.s: Error: .size expression for __restore does not evaluate to a constant make[1]: *** [src/syscall-i386-linux-1.0.o.d] Error 1 make[1]: Leaving directory `/opt/xc/ecos/build/cvs/linux/hal/synth/i386linux/current' make: *** [build] Error 2 make: Leaving directory `/opt/xc/ecos/build/cvs/linux' According to something I found in google code search[2], these functions are trampolines. It's a part of gdb code[3]. It says: ... as of version 2.1.2, the GNU C Library uses signal trampolines (named __restore and __restore_rt) that are identical to the ones used by the kernel. Maybe that is something that changed? I used objdump -T to find these symbols in any of the libraries in /lib and /usr/lib and I didn't find anything. Maybe it's in libgcc or so? So let's see what that code actually means. The .type[4] directive records the symbol table type for the associated symbol. So it seems that ".type" in .type __restore,@function .size __restore,1b - __restore does not much more thanannotate the symbol "__restore": it does not define it. Same for ".size": it records the size associated to the symbol. Maybe what is meant here is to just refer to the "cyg_hal_sys_restore" symbol, and not the "__restore" symbol? I.e. the code probably has changed at some point to incorportate a name change, but somebody forgot to update that directive. It's not necessary for the code to work so it was only detected when binutils got stricter. See fix next post. [1] http://ecos.sourceware.org/ml/ecos-discuss/2011-06/msg00013.html [2] http://www.google.com/codesearch?as_q=__restore_rt&btnG=Search+Code&hl=en&as_package=&as_lang=&as_filename=&as_class=&as_function=&as_license=&as_case= [3] http://www.google.com/codesearch/p?hl=en#pFm0LxzAWvs/darwinsource/tarballs/other/gdb-203.tar.gz%7CYXbJTYT1R-s/gdb-203/src/gdb/i386-linux-tdep.c&q=__restore_rt [4] http://tigcc.ticalc.org/doc/gnuasm.html#SEC133 Entry: Synthetic target fix Date: Sat Jun 4 16:41:20 CEST 2011 After checking what ".type" and ".size" actually mean (they annotate a symbol with type and size), it seems likely that what is meant here is to just refer to the "cyg_hal_sys_restore" symbol, and not the "__restore" symbol? I.e. the code probably has changed at some point to incorporate a name change, but somebody forgot to update that directive. It's not necessary for the code to work so it is only detected now that binutils got stricter about it's input. If this is the case, see the fix below Cheers, Tom ? synth_fix.patch Index: syscall-i386-linux-1.0.S =================================================================== RCS file: /cvs/ecos/ecos/packages/hal/synth/i386linux/current/src/syscall-i386-linux-1.0.S,v retrieving revision 1.13 diff -u -8 -p -r1.13 syscall-i386-linux-1.0.S --- syscall-i386-linux-1.0.S 23 Aug 2009 11:34:45 -0000 1.13 +++ syscall-i386-linux-1.0.S 4 Jun 2011 14:40:09 -0000 @@ -439,20 +439,20 @@ SYSCALL5(ipc) // via another system call. .align 16 .global cyg_hal_sys_restore_rt cyg_hal_sys_restore_rt: movl $SYS_rt_sigreturn, %eax int $0x80 1: - .type __restore_rt,@function - .size __restore_rt,1b - __restore_rt + .type cyg_hal_sys_restore_rt,@function + .size cyg_hal_sys_restore_rt,1b - cyg_hal_sys_restore_rt .align 8 .global cyg_hal_sys_restore cyg_hal_sys_restore: popl %eax movl $SYS_sigreturn, %eax int $0x80 1: - .type __restore,@function - .size __restore,1b - __restore + .type cyg_hal_sys_restore,@function + .size cyg_hal_sys_restore,1b - cyg_hal_sys_restore Entry: Are structs bad? Date: Thu Jun 23 11:07:57 CEST 2011 The problem with C is not that structs are a bad interface, but that it is hard to enforce structs to be immutable. I.e. in essence there is not such a big difference between: struct foo { int a,b,c; }; and void foo (int a, int b, int c); If any of the two interfaces changes, the code that uses it needs to be changed anyway. Of course, exposing _internal_ data structures that contain implementation details that are irrelevant to the user is a a bad idea. I'm thinking mostly of structs as configuration files. Entry: Be careful with typeof() for casts in macros Date: Sat Jun 25 11:48:18 CEST 2011 Excerpt of a simple stack implementation. #define STACK_PUSH_SAFE(stack, thing, overflow) { \ void *room = stack_allot(stack); \ if (!room) goto overflow; \ *((typeof(thing)*)room) = thing; \ } The trouble with this is that the input argument of the macro is used to determine the access. So if the stack represents short ints (16 bits) and the type of the parameter at the call site is a 32bit int, there is a problem: - data abort on ARM due to alignment problems - incorrect access (writes 2 values) on archs that support non-algined access Entry: Locking and callbacks Date: Sat Jun 25 13:19:14 CEST 2011 Is it possible (and sound?) to do the following: When a mutex is locked by the current thread, do not lock (causing deadlock) but throw an error. This shows up in a case where I'm using callbacks inside a lock to avoid manually having to manage resources. However, the hierarchy of the locks of the application is such that it isn't completely impossible to avoid deadlocks: it's possible to call functions that ultimately depend on a low level lock the callback has already reserved, in this particular case a disk lock. Probably this is just a sign that the cause of the inner deadlock needs to be decoupled, i.e. by using a thread. In this case, the inner cause is a logger which might aquire a disk lock, however it is possible to put the logger in a separate thread and use a buffer to capture logging data until the disk log is released. Is there a (practical) way to statically verify lock hierarchy violations? I.e. this[1] mentions a mechanism where lock hierarchy is constructed (probably through control flow graph) and it triggers an error when this graph contains loops, i.e. joins on the same lock. It seems[2] that it is possible to do it at run-time. That drdobbs article[2] also mentions layer violation through callbacks, which is exactly my problem. I'm using callbacks to abstract locking, but this then has the potential to violate hierarchies. [1] http://www.osronline.com/ddkx/ddtools/dv_8pkj.htm [2] http://drdobbs.com/high-performance-computing/204801163?pgno=1 Entry: emacs + cscope Date: Mon Jul 11 20:43:11 CEST 2011 Minimal setup: apt-get install cscope cscope-el cscope -bR -I ... # Index directory, set include path. Emacs: (require 'xcscope) M-x cscope-set-initial-directory # Set it to the file that contains cscope.out M-x cscope-find-this-text-string # C-c s s Entry: GCC not catching undefined variable error Date: Tue Jul 12 11:08:46 CEST 2011 With -Wall enabled, this undefined variable error is not caught: void foo(int arg) { int err; if (arg < 0) goto exit; err = 0; exit: return err; } Might it be because of -O0, i.e. no data flow analysis performed? Yep, absence of opti is the problem: cc1: warning: -Wuninitialized is not supported without -O Entry: cflow & entry point Date: Mon Jul 18 22:03:39 CEST 2011 Just passing a random file to cflow made me think first that all functions are sorted, meaning that if a function is called by another one, it is not taken as one of the main entry points. However, it does seem to miss some things: my code contains static functions that go into a struct which are completely ignored by cflow. I tried "--include=s" to include static functions, but that doesn't seem to work. Hmm.. Something's wrong. Removing all the "static" annotations does make the functions show up. Entry: Why buffer? Date: Tue Jul 19 16:43:23 CEST 2011 The 7 whies down the rabbit hole answer: to avoid loosing time in context swapping. From this simplified view it seems straightforward to conclude that to optimize for small memory usage, optimize context swapping. Entry: CPP macro for counting the number of bits in a word. Date: Sun Oct 30 15:00:38 EDT 2011 I don't know a way to write recursive macros; don't know how to terminate the recursion and I don't know if recursion is supported in the first place. Though this is a simple manual unrolling: #define NB_BITS_32(x) (NB_BITS_16(x) + NB_BITS_16(x >> 16) #define NB_BITS_16(x) (NB_BITS_8(x) + NB_BITS_8(x >> 8) #define NB_BITS_8(x) (NB_BITS_4(x) + NB_BITS_4(x >> 4) #define NB_BITS_4(x) (NB_BITS_2(x) + NB_BITS_2(x >> 2) #define NB_BITS_2(x) (NB_BITS_1(x) + NB_BITS_1(x >> 1) #define NB_BUTS_1(x) (x & 1) Entry: Counting in the preprocessor Date: Mon Nov 14 11:08:52 EST 2011 It's probably possible to count in the preprocessor by allocating an array, and using sizeof. Then somehow get rid of the thing later.. Entry: MMX on AMD64 Date: Tue Nov 29 12:15:22 EST 2011 Does it not work? Entry: Assigning pointers to zero Date: Sat Dec 17 14:58:52 EST 2011 I ran into a bug today that's similar to void foo(int *a) { a = 0; /// a = 123; } when I meant: void foo(int *a) { *a = 0; /// *a = 123; } How to make this into an error without using leaky const arguments? Entry: Getting over C++ hate (fear?) Date: Sun Dec 18 09:00:38 EST 2011 In discussion with someone that doesn't have this problem, it's come to my attention lately that I have an irrational attitude towards C++. Sure, C is simple and straightforward, but in my style I rely a lot on the CPP. Maybe it's time to start collecting a couple of C++ features that would be interesting to explore, with an eye on embedded development, to avoid having to resort to text macros too much.. - Resource management through local objects. RAAI[1]. Even without exception support (like eCos) it's still useful to be able to return and have destructors called. - Templates: building complicated const data structures at compile time to be able to store them in read-only Flash memory. [1] http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization Entry: Typed CPP Date: Sun Dec 18 09:07:42 EST 2011 Is it possible to build some kind of type system on top of CPP? Limit use to construct that have some structure? I.e. build a system of CPP tools that can be compiled by standard CPP but also compiled by a preprocessor with a type system, i.e. something that would fit C syntax so I can feed it into Language.C ? This decoupling gets you: * Compatibility with standard CPP, meaning a source tree can be compiled without the static analyzer and its dependencies present on the system, which is *really* important. * Full freedom in tool dependency. Since analysis is only necessary when you're actually *changing* the code, the static analyzer can have a whole host of dependencies and special purpose features. Is this just C++'s template system? Entry: Obtaining pointer to current function Date: Thu Dec 29 11:05:45 EST 2011 Is there a way to obtain a pointer to the current function's address without knowing its name? I.e. void foo(void) { void *me = &this_function; } From [1] it seems that this isn't possible, except for doing it with a function call like this: #include void *get_addr(void) { return __builtin_return_address(0); } int main(void) { printf("%p\n", get_addr()); return 0; } This might be enough though. The thing is that gdb knows how to map this to a source location, given the binary compiled with debugging symbols: # gdb (gdb) list *
[1] http://stackoverflow.com/questions/2154852/get-a-pointer-to-the-current-function-in-c-gcc Entry: Detecting re-entrancy Date: Thu Dec 29 19:44:49 EST 2011 How to detect re-entrancy in a function, while allowing normal multi-threaded access? I don't see a way to do this without thread-local variables.