(Repost) Linus Teaches You How to Write a Linked List

A few weeks ago Linus Torvalds answered some questions on Slashdot. All his responses make good reading, but one in particular caught my eye. Asked to describe his favourite kernel hack, Torvalds grumbles that he rarely looks at code these days, unless it’s to sort out someone else’s mess. He then pauses to admit he’s proud of the kernel’s fiendishly cunning filename lookup cache before continuing to moan about incompetence:

At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I’ve seen too many people who delete a singly-linked list entry by keeping track of the prev entry, and then to delete the entry, doing something like

    if (prev)
        prev->next = entry->next;
    else
        list_head = entry->next;

and whenever I see code like that, I just go “This person doesn’t understand pointers”. And it’s sadly quite common.

People who understand pointers just use a “pointer to the entry pointer”, and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a *pp = entry->next.
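To make the quoted idea concrete, here is a minimal sketch of removing one known entry through a pointer to the entry pointer. The `struct entry` layout and the name `remove_entry` are assumptions for illustration; the quote only gives the single statement `*pp = entry->next`.

```c
#include <stddef.h>

/* Hypothetical entry type; the quote does not define one. */
struct entry {
    int value;
    struct entry *next;
};

/* Unlink `target` from the list whose head pointer is stored at *list_head.
 * pp always holds the address of the pointer that leads to the current
 * entry: at first that is the head pointer itself, later some entry's
 * next field. The entry is only unlinked; the caller still owns it. */
void remove_entry(struct entry **list_head, struct entry *target)
{
    struct entry **pp = list_head;

    /* Walk until pp points at the pointer that refers to target. */
    while (*pp != NULL && *pp != target)
        pp = &(*pp)->next;

    /* This check only guards against a target that is not in the list.
     * There is no separate case for removing the head. */
    if (*pp != NULL)
        *pp = target->next;
}
```

Because `pp` starts out as the address of the head pointer, removing the first entry and removing a middle entry are the same assignment.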

Below is an explanation from a reader, along with a few notes of my own.

Well I thought I understood pointers but, sad to say, if asked to implement a list removal function I too would have kept track of the previous list node. Here’s a sketch of the code:

The version by someone who doesn’t understand pointers

    #include <stdbool.h>  // bool
    #include <stdlib.h>   // free, NULL

    typedef struct node
    {
        struct node * next;
        // .... (payload fields elided)
    } node;

    typedef bool (* remove_fn)(node const * v);
    // remove_fn returns whether the node is to be deleted.

    // Remove all nodes from the supplied list for which the
    // supplied remove function returns true.
    // Returns the new head of the list.
    node * remove_if(node * head, remove_fn rm)
    {
        for (node * prev = NULL, * curr = head; curr != NULL; )
        {
            // Save the successor first: curr may be freed below.
            node * const next = curr->next;
            if (rm(curr))
            {
                if (prev)
                    prev->next = next;
                else
                    head = next;  // special case: removing the head
                free(curr);
            }
            else
                prev = curr;
            curr = next;
        }
        return head;
    }

The linked list is a simple but perfectly-formed structure built from nothing more than a pointer-per-node and a sentinel value, but the code to modify such lists can be subtle. No wonder linked lists feature in so many interview questions!

The subtlety in the implementation shown above is the conditional required to handle any nodes removed from the head of the list.

Now let’s look at the implementation Linus Torvalds had in mind. In this case we pass in a pointer to the list head, and the list traversal and modification are done using a pointer to the next pointers.

The version by someone who understands pointers

    void remove_if(node ** head, remove_fn rm)
    {
        for (node ** curr = head; *curr; )
        {
            node * entry = *curr;
            if (rm(entry))
            {
                *curr = entry->next;
                free(entry);
            }
            else
                curr = &entry->next;
                // curr stores address of the previous node->next.
        }
    }

Much better! The key insight is that the links in a linked list are pointers and so pointers to pointers are the prime candidates for modifying such a list.

The improved version of remove_if() is an example of two star programming: the doubled-up asterisks indicate two levels of indirection. A third star would be one too many.
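To try the two-star version end to end, here is a small self-contained sketch. The `int value` payload and the helpers `push_front`, `is_odd`, `always` and `length` are assumptions added for illustration; the article leaves the node payload elided and defines none of these helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    struct node *next;
    int value;              /* assumed payload, elided in the article */
} node;

typedef bool (*remove_fn)(node const *v);

/* The two-star traversal: curr holds the address of the pointer that
 * leads to the node under inspection. */
void remove_if(node **head, remove_fn rm)
{
    for (node **curr = head; *curr; ) {
        node *entry = *curr;
        if (rm(entry)) {
            *curr = entry->next;   /* bypass the node, head or not */
            free(entry);
        } else {
            curr = &entry->next;   /* advance to the next link */
        }
    }
}

/* Hypothetical helper: prepend a heap-allocated node. */
bool push_front(node **head, int value)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->value = value;
    n->next = *head;
    *head = n;
    return true;
}

/* Hypothetical predicates for remove_if. */
bool is_odd(node const *v) { return v->value % 2 != 0; }
bool always(node const *v) { (void)v; return true; }

/* Hypothetical helper: count the nodes in the list. */
size_t length(node const *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        ++n;
    return n;
}
```

Building the list 1, 2, 3, 4, 5 with `push_front` and calling `remove_if(&head, is_odd)` removes the head node along with 3 and 5, with no head-specific branch. Calling `remove_if(&head, always)` afterwards frees whatever remains and leaves `head` as `NULL`.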

Author: Lsc2001
Link: http://yoursite.com/2020/07/20/%EF%BC%88%E6%90%AC%E8%BF%90%EF%BC%89Linus%E6%95%99%E4%BD%A0%E5%86%99%E9%93%BE%E8%A1%A8/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stated otherwise.