Xilinx customer service 1, environment 0

We had a nice Xilinx FPGA deveopment board in work which had some damage:

The important parts were okay, but some of the microswitches were mangled. I was struggling to source the part (I think Omron would only sell me them in batches of a million or so). There might have been an alternative, but I contacted Xilinx who were amazingly helpful: they were happy to send some spares. Although this level of customer service is a rare treat, I was equally amazed when this turned up:

The Register has a (semi-)ongoing competition to find the “excessive packaging world record” which has some mind boggling entries; this might not a world record exactly, but good effort none the less. I’ve never really worked out if it was some sort of comment on me being a total pain in their ass, or whether they were really worried the new microswitches would be damaged in transit.

The pursuit of beautiful modular arithmetic in Python

Imagine you (and by you I mean me) are a bit OCD and really worry about the cosmetic excellence of your Python code (or any code for that matter). When you need to evaluate a lengthy expression modulo some fixed modulus N, say N=11 for example, the result can be ugly:

( 9 * ( ( 3 - 5 ) % 11 ) ) % 11

There are a couple of things which might irk you … they are driving me mad just looking at them:

  • you need to include the "evaluate this subexpression modulo  N" part lots of time, even though you know the whole expression should be evaluated modulo  N,
  • the extra parentheses are really needed because the modulo operator has a higher precedence than the arithmetic operators (or at least addition and subtraction).

Close, but no cigar

  • We could attempt to overload the built-in Python int operators, but apparently this isn’t possible.
  • We could write a class, say mod, which is a subclass of int and which applies similar operators but applies a modular reduction automatically. The problem with this is that if we write the expression 3 + 5 then our modular operators aren’t invoked: 3 and 5 are of type int so we’d have to write the significantly more ugly alternative mod( 3 ) + mod( 5 ) which sort of defeats the point.
  • We could write the expression as a string, i.e., "3 + 5", parse it into an AST then replace all the operators with calls to appropriate functions which apply a modular reduction automatically; Python has all the machinery to do this, but it seems sort of heavy-weight, plus not so nice that we need to wrap everything up as a string.

Toward a beautiful solution

Languages like Magma have a coercion operator which looks like

Integers( 11 ) ! 9 * ( 3 - 5 )

but even that only seems to bind to a single subexpression: Magma only applies the coercion to the result of multiplication but not the subtraction (which still gives the correct result, but might fail if one of the subexpressions fails, e.g., is an exponentiation which should be modular but exhausts the maximum integer size). The long and the short of it is that I’d like to be able to write

Integers( 11 ) ! 9 * ( 3 - 5 )

in Python and have each subexpression evaluated modulo  N.

In the end I failed, but I at least got close: basically, start with a neat (but slightly mind-bending) recipe for inline operators, i.e., something like

class Infix( object ):
  def __init__( self, f ):
    self.f = f

  def __or__( self, x ):
    return self.f( x )

  def __ror__( self, y ):
    return infix( lambda x, self = self, y = y : self.f( y, x ) )

  def __call__( self, x, y ):
    return self.f( x, y )

then write a function to build you some arithmetic functions modulo  N

def Integers( N ) :
  return Infix( lambda x, y : ( x + y ) % N  ), \
         Infix( lambda x, y : ( x - y ) % N  ), \
         Infix( lambda x, y : ( x * y ) % N  )

then actually use it, e.g.,

add, sub, mul = Integers( 11 )

then what you finally get is the ability to write expressions such as

9 |mul| ( 3 |sub| 5 )

and have the evaluated modulo  N initially selected, in this cases N=11.

On one hand this seems close to a nice solution; on the other hand, I’m never really sure whether I should be delighted or terrified that such a thing is possible in Python. The real question is whether there is a better way to do this ?

Cryptographers buy vanity license plate; attach it to ice cream van

Spotted outside the K.U. Leuven ESAT building at an ECRYPT II event

… just three off a NIST standard license plate; I guess they didn’t have enough letters to fit Rijndael on it. I found another one, just outside work on The Triangle in Bristol:

When compilers attack: musing about the legality of program transformation

This is quite long, and I could be talking total crap, so bare with me: maybe it is interesting either way.

In a written version of his Turing award lecture, UNIX coinventor Ken Thompson poses a question about how much one can trust a compiler. Thompson describes an ingenious bootstrapping mechanism for embedding a Trojan horse in C compilers: the malign compiler produces target programs which are less secure than the associated source program suggests. From a security perspective, the question, and therefore the topic more generally as well, is already fascinating.

I was involved in CACE, an EU-funded research project looking at "smart" compilers for cryptography. One of the review meetings threw up an interesting question: if a "smart" compiler produces a target program which is say more efficient or more secure than a "dumb" compiler would produce, can you be sure it hasn’t violated a patent by doing so ?

Describing the problem using an analogy

Consider the following two scenarios; they are deliberately selected to be somewhat analogous within the respective areas of patents on software and more traditional machinery:

  1. A compiler vendor develops a compiler} which can translate general purpose source programs into corresponding target programs; the compilation process involves iterated application of one or more program transformations. The compiler works with "general" source programs in the sense that any syntactically valid program is accommodated. An application vendor acquires said compiler and uses it to translate a particular source program into a particular target program that is distributed to a customer. After distributing the target program, analysis shows that it infringes a patent held by some IP vendor.
  2. A machine vendor develops a machine which refines crude oil into gasoline; the refining process involves iterated application of one or more chemical reactions. The machine works with "general" crude oil in the sense that variations in the chemical composition are accommodated. An oil refinery acquires said machine and uses it to refine a particular type of crude oil into a particular type of gasoline that is distributed to customer. After distributing the gasoline, analysis shows that it infringes a patent held by some IP vendor.

The compiler example is certainly related to the concept of automated invention, as eloquently described by Robert Plotkin: using similar terminology as him, the compiler is a "genie" capable of granting a "wish" by the application vendor to create an efficient target program from the source program. The machine example is not so easily placed in the same setting: it seems difficult to describe what has happened as invention rather than malfunction, even though the end result is potentially useful.

Although these are asides that may need to be resolved, assume for the moment that:

  • We are not concerned with how the patent was granted, only that it exists and definitively covers the scenario in question.
  • We are not concerned with jurisdiction (i.e., we are more concerned with what happens in general, not in the UK or EU specifically); if it comes down to a choice, we are most interested in the US.
  • The program transformations (resp. chemical reactions) used within the compiler (resp. machine) are themselves not patented, or at least pre-date the patent so they could be argued to be prior art of some sort.
  • The IP vendor can always detect an instance of patent infringement, and is always interested in pursuing such an instance regardless of the cost and potential reward.

It seems there are at least three cases that describe why the target program (resp. gasoline) might infringe the patent:

  • Case 1: It could be that the source program (resp. crude oil) infringes the patent outright, and the program transformations (resp. chemical reactions) simply caused this infringement to be retained in the target program (resp. gasoline).
  • Case 2: It could be that the compilation process (resp. refining process) specifically tries to identify special cases in the source program (resp. crude oil) that could be replaced by a patented alternative in the target program (resp. gasoline). For example, for a given class of source program (resp. crude oil) the compiler (resp. machine) "knows" it can be more effective somehow. By making this choice, the compiler (resp. machine) causes the target program (resp. gasoline) to infringe the patent "on purpose".
  • Case 3: It could be that the compiler (resp. machine) applies a series of completely general purpose program transformations (resp. chemical reactions); each one performs a useful standalone task which is totally unrelated to the patent in question. As such, the nature of the input (i.e., source program or crude oil) and composition of the program transformations (resp. chemical reactions) has caused the target program (resp. gasoline) to infringe the patent "by accident".

In relation to the machine, it seems near impossible that case 3 could ever occur. The central question is, given that in the case of the compiler it at least might occur, what happens next ? That is

  • which party (or parties) can the IP vendor sue,
  • what is the likely outcome in each such lawsuit,
  • are there any robust ways to either avoid this situation, or defend oneself from such lawsuits.

Plotkin looks at automated invention from a constructive view point, e.g., creation of new law and business models as the result of the use of genetic algorithms to invent new products. In a sense, the marked difference is that we are being destructive, i.e., looking at what happens when things go wrong within existing law and business models.

A concrete example

A patent exists (assigned to NeXT Computer, Inc.) relating to use of a particular technique for modular reduction. Claim 1 of the patent outlines the context; claim 2 and 3 of the patent relate to modular reduction within this context:

  1. The method of claim 1 wherein p is a Mersenne prime given by 2^q -1.
  2. The method of claim 2 wherein said mod p arithmetic is performed by the steps of: shifting q LSB’s of a binary number and adding the shifted q LSB’s to the remaining q LSB’s to generate a sum; and repeating the previous step on the sum until a sum is generated of q or fewer bits.

So we can have a concrete example, imagine the case where q=31, i.e., the case where p=2^{31} - 1. The idea is that reduction of some value x modulo p can be very efficient. To see this, notice that since 2^{31} - 1 \equiv 0 \pmod{p}, it is also true that 2^{31} \equiv 1 \pmod{p}. Writing x = 2^{31} \cdot x_{hi} + x_{lo}, one can therefore reduce x modulo p by first computing t = x_{hi} + x_{lo} and then conditionally subtracting p from t if required.

The source programs

Consider two source programs, written in C, which include two functions called f1 and f2:

uint32_t f1( uint64_t x ) {
  uint32_t lo = ( uint32_t )( x       ) & 0x7FFFFFFF;
  uint32_t hi = ( uint32_t )( x >> 31 ) & 0x7FFFFFFF;

  uint32_t t  = lo + hi;

  if( t >= p ) {
    t = t - p;
  }

  return t;
}
uint32_t f2( uint64_t x ) {
  {
    uint32_t lo = ( uint32_t )( x       ) & 0x7FFFFFFF;
    uint32_t hi = ( uint32_t )( x >> 31 ) & 0x7FFFFFFF;
  }
  {
    uint32_t y;
    uint32_t z;

    uint32_t t  = y + z;

    if( t >= p ) {
      t = t - p;
    }

    return t;
  }
}
  • The function f1 first sets lo and hi equal to the least-significant and most-significant 31-bit halves of the input x. Next it adds lo and hi to compute t, which is conditionally reduced by the modulus p before being returned as the result. The function clearly implements the behaviour described in claim 3 of the patent.
  • The function f2 is very similar to f1, but there are two major differences:
    1. the function is split (rather oddly) into two sub-scopes,
    2. the computation of t now results from the addition of y and z.

    If one draws a data-flow graph of f2, based on the semantics of C it does not seem reasonable to argue this function implements the behaviour in claim 3 of the patent (although perhaps testing any "doctrine of equivalents"). That is, it cannot be the case that the function is equivalent to "adding the shifted q LSB’s to the remaining q LSB’s to generate a sum": neither y nor z are initialised with a value, so the value of t is essentially undefined.

The target programs

In order to compile the source programs into corresponding target programs, our experimental platform comprised:

  • a commodity workstation including a $1.6$ GHz Intel Core2 processor,
  • Linux kernel revision $2.6.18$,
  • GCC C compiler version $3.4.6$.

The target programs, written in x86-32 assembly language, demonstrate a potentially surprising result:

f1:
.LFB5:
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rax
        andl    $2147483647, %eax
        movl    %eax, -12(%rbp)
        movq    -8(%rbp), %rax
        shrq    $31, %rax
        andl    $2147483647, %eax
        movl    %eax, -16(%rbp)
        movl    -16(%rbp), %eax
        addl    -12(%rbp), %eax
        movl    %eax, -20(%rbp)
        mov     -20(%rbp), %eax
        cmpq    P(%rip), %rax
        jb      .L2
        movq    P(%rip), %rdx
        leaq    -20(%rbp), %rax
        subl    %edx, (%rax)
.L2:
        movl    -20(%rbp), %eax
        leave
        ret
f2:
.LFB5:
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rax
        andl    $2147483647, %eax
        movl    %eax, -12(%rbp)
        movq    -8(%rbp), %rax
        shrq    $31, %rax
        andl    $2147483647, %eax
        movl    %eax, -16(%rbp)
        movl    -12(%rbp), %eax
        addl    -16(%rbp), %eax
        movl    %eax, -20(%rbp)
        mov     -20(%rbp), %eax
        cmpq    P(%rip), %rax
        jb      .L2
        movq    P(%rip), %rdx
        leaq    -20(%rbp), %rax
        subl    %edx, (%rax)
.L2:
        movl    -20(%rbp), %eax
        leave
        ret

The two functions compute the same result, implying that both target programs infringe the patent. The reason for this is somewhat mundane: in f2, GCC notices that the space allocated to hi and lo is not needed after the end of the scope in which they are computed. It therefore re-allocates the same space to y and z, effectively meaning they become aliases for the results computed into hi and lo. As a result, the expression in the source program that adds together y and z actually adds together hi and lo in the target program and hence computes the same result.

One way to look at this is that Plotkin (Page 205) says

… with most traditional computer programming languages, once a program is written describing what the program does, it is guaranteed that the program will run and perform the function it describes, so long as the programmer followed the rules of the programming language.

which is certainly a reasonably (if optimistic) way to look at things. The example above shows that this is not a limit: we can also write programs that, by virtue of features in the compiler, do something different (yet useful) to what the rules of the programming language suggest. However, note that for at least two technical reasons, i.e., rather than legal reasons in respect to the patent, this particular example is highly contrived:

  • GCC 3.4.6 is far from modern (at the time of writing, the latest version of GCC is 4.3.3). In fact ISO/IEC 9899 (Section 6.7.8), the current C standard, now stipulates

    If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:

    • if it has pointer type, it is initialized to a null pointer;
    • if it has arithmetic type, it is initialized to (positive or unsigned) zero;
    • if it is an aggregate, every member is initialized (recursively) according to these rules;
    • if it is a union, the first named member is initialized (recursively) according to these rules.

    As such, under this specification (which GCC 3.4.6 clearly does not implement) y and z would be initialised to zero and hence f2 would compute zero as a result.

  • During compilation, GCC was instructed not to perform certain types of optimisation. For example, use of dead-code elimination would clearly cause removal of machine instructions in the target program related to computation of x and y: these values are not used if one considers f2.

So what ?!

The million dollar question is whether any of this makes sense. On one hand, I certainly don’t know enough about law to say; on the other hand, if you suspend disbelief and think about the three cases then some interesting things seem to pop out.

Case 1

The source program already infringes the patent before the compiler translates it into a target program. As such, it seems clear that the application developer is entirely to blame. The role of the compiler is essentially moot: in a sense, the application developer has made a patented "wish".

Case 2

The compiler vendor deliberately implemented a program transformation to specifically instrument a patented technique in the target program. It seems reasonable that use of the transformation was due to either

  1. automatic (i.e., by default) with no input from the application developer, or
  2. via some "opt-in" mechanism, and thus enabled explicitly by the application developer.

In either situation it seems possible that the compiler vendor could have enforced an End-User License Agreement (EULA) whereby legal responsibility is passed to the application developer. At very least, it is common practice to limit liability for any damages to the original cost of purchasing the compiler.

Ben Klemens (Page 100) alludes to this not being so in relation to linking (rather than compilation):

Even a single function could lead to a patent suit, so lawsuit-averse programmers had best purchase function libraries from vendors instead of writing their own and glue them together using a purchased copy of Microsoft’s Visual Studio instead of a freely downloaded copy of the GNU Compiler Collection.

That is, if you use Visual Studio and the target program has a patented library component linked into it, then Microsoft is likely at fault not the programmer.

Finally, Plotkin (Page 82) argues reasonably that with use of automated invention being an inevitable trend,

… the process as a whole, performed by human and machine in combination, constitutes "automated inventing".

So, in both situations one can presumably blame the application developer. Even if there is no EULA, in the latter situation it seems unreasonable for the application developer to claim stupidity as a defense. In the former situation, the combination of human and machine has resulted in patent infringement, not the machine alone.

The crucial problem is that this represents a highly unattractive outcome in terms of economics. The compiler is supposed to reduce the workload of the application developer: if they have to manually check the output for patent infringement (assuming they are even capable of doing so) and there is even a chance they will be liable, they probably avoid using the compiler in the first place. This makes the business model of the "smart" compiler vendor tenuous: why would they invest in development of such compiler technology if the application developer is too scared to use it ?
Bummer: this seems like a kick in the nuts for the full employment theorem for compiler writers !

Case 3

This seems to represent the dual of what Plotkin (Page 76) calls multiple realisability, i.e., the potential for a single source program to be compiled into multiple target programs. Roughly speaking, here we have a situation where multiple source programs compile into a single target program.

The concrete example matches this situation, but there are some valid reasons it is not a particularly good example:

  • One might reasonably argue that the example is unrealistic because more general occurrences of such "accidents" are very unlikely: the low-level nature of C and assembly language do not usually match the high-level concepts in a patent. The counterargument is that programming languages and compilers will necessarily become more high-level and "smarter". Since their modus operandi is compilation of high-level mathematical descriptions, the program transformations they use will necessarily be at a similar level to potential patents. As such it seems at least possible, if not totally probable, that the scenario will become more plausible in time. In line with the so-called infinite monkey theorem, if there are enough "genies" and "wishers" then eventually two different "wishes" will produce the same result.
  • One might reasonably argue that the example is unrealistic because it is clear that the application developer has deliberately tried to avoid the patent, i.e., "trick" the compiler, in this case. That is, the application developer has deliberately tried to place the blame on the compiler (and hence compiler vendor): the two source programs are not arbitrary, they are basically one source program with deliberate alterations to make them different yet invoke the right compiler behaviour to end up with the same target program. The metaphor here is that there is a single fixed "genie" that basically grants one type of "wish" the game is to ask it rather odd "wishes", of this one type, in a way that the "genie" still does what you want.