My big question: "Write a C function taking two strings and returning a string containing one copy of the characters that occur in both." Hint: Spend some time looking at one of the strings and use an array of flags.
Given that hint, I guess we can assume the strings contain only 8-bit single-byte characters and the sort of thing the Google interviewers might have been looking for was something like this:
|
Yes, I wrote it in C++. (I'm the kind of guy Linus Torvalds loves to hate and Peter Seibel doesn't understand.) So how would we get the result back to the caller in C? I can think of about a dozen ways, all of which are also available to me in C++, and all of which are more tedious and error-prone both for the implementation and for the callers of the function. Of course, if it made sense to use one of those methods in the context in which this function was to operate, then I would use it.
Going back to the whiteboard: if I was standing in front of a hyper-intelligent Googler I would probably wet my pants and I suspect I would have written the function differently: I wouldn’t have commented it; I probably would have written 256 instead of UCHAR_MAX + 1; more seriously, I may have forgotten to cast away the (possible) sign of the char when used as an array index. All of these things would have occurred to me only after I'd left the building. (And, since the requirement was for a C function, I guess I would have had to fiddle around with buffers and buffer sizes and null terminators and shit.)
In the real world I would unit test this function with something like this:
|
I've been writing code for a long time - long enough to know that I make mistakes. I have found unit tests to be an excellent way to pick up some of those mistakes as well as being useful for other reasons I'll write about another time.
Of course, passing a bunch of unit tests does not mean the code doesn't have horrible bugs in it. For example, I tested this code under a compiler where
char
s are signed. If you remove the static_cast<unsigned char>()
casts the unit tests still all pass. What this means is that when *a
is '\x80'
the expression occurs_in_a[*a++] = 1
is similar to occurs_in_a[-128] = 1
and this sets a byte to 1 on the stack 128 bytes before the occurs_in_a array. But that's alright (NOT!) because if (occurs_in_a[*b])
will look for it in the same place, and find it there, which is why the tests still pass. But the stack is now corrupted, the consequences of which may be nil or catastrophic, immediate or postponed.