# Circuit Minimization Work

We plan to post on this page results obtained by the "circuit minimization team" (CMT). This is an informal group of friends and colleagues that are working on the problem of finding "good" circuits over GF2 (alternatively, circuits using only AND, XOR, and XNOR gates). XNOR gates are represented by the operation t = a # b , meaning t = (a + b + 1) modulo 2. The definition of "good" varies: small, low-depth, few AND gates, and so on.

1. Multiplication in GF(2^6) using irreducible polynomial x^6 + x^3 + 1 . Here is the straight-line program for a circuit with 57 gates and depth 5. is a pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.
The Ai's and Bi's are the coefficients of polynomials A, B. The Ci's are the coefficients of C = A*B.
The best circuit we can find in the literature for this problem, here, has depth 5, size 61.

2. Multiplication in GF(2^7).
1. Using irreducible polynomial x^7 + x^4 + 1 .
The best published circuit we can find for this has depth 7 and 100 gates. is the straight-line program for a circuit with 84 gates and depth 5. is a pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.
2. Using irreducible polynomial x^7 + x^3 + 1 .
The best published circuit we can find for this has depth 7 and 100 gates. is the straight-line program for a circuit with 85 gates and depth 5. is a pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.

3. Multiplication in GF(2^8) using the AES polynomial x^8 + x^4 + x^3 + x + 1 . The best published circuit we can find for this has 135 gates and depth 7 ( link ). is the straight-line program for a circuit with 117 gates and depth 6.
The Ai's and Bi's are the coefficients of polynomials A, B. The Ci's are the coefficients of C = A*B.
4. Multiplication in GF(2^8) : custom-built field representation. The next circuit was built using a tower field representation. It has 106 gates and is of depth 6. It probably beats any previous (as of Sept. 20, 2017) result for GF(2^8) multiplication. Here is the straight-line program.
Send me email for details of the construction and representation. The inputs to the circuit are the Ai's and Bi's. The outputs are the Ci's.
5. A 16-bit Sbox. The paper "Customizable Sponge-Based Authenticated Encryption Using 16-bit S-boxes" (K3LR)-(Kelly, Kaminsky, Kurdziel, Lukowiak, Radziszowski in Military Communications Conference, MILCOM 2015 - 2015 IEEE) presents a 16-bit Sbox based on GF(2^16) inversion. It quotes a size of 1382 gates (1238 XOR gates and 144 AND gates, no depth is given). A previously posted circuit here was incorrect (thanks to an anonymous referee for pointing this out). So we want to post something that is not too hard to verify. The K3LR Sbox is inversion in GF(2^16) followed by a linear transform that appears in Christopher Wood's thesis . I do not know if the particular representation of GF(2^16) is important for cryptographic purposes. The following circuit for inversion in GF(2^16) uses a tower field representation and has depth 31 and size 387. The circuit for the linear transform has depth 4 and size 59. Combining the two circuits yields a circuit with depth 35 and size is 446 (113 ANDs, 324 XORs, 9 XNORs). The Sbox circuit (using the tower field representation) is here. A change of basis yields the K3LR Sbox. It has 537 gates (we have not attempted to build it without going through the tower field representation).
6. Here is what we did to verify the inversion circuit: the algebra used to construct the various Galois fields is here. The following c++ program generates a multiplication circuit for GF(2^16). One can verify the program by matching it against the algebra. The multiplication circuit, here ,is not very good, but it is correct provided you believe the algebra. The following circuit uses our inversion circuit to produce xinverse from x, then it uses the multiplication circuit to multiply x and xinverse. If you believe the multiplication circuit is correct, then correctness of the inversion circuit follows from the joint circuit calculating the function f(x) = GaloisFieldIdentity (x not 0). The Galois Field identity is the vector of all 1's. This has been verified. When x is 0 the xinverse circuit outputs 0 since it has no negations.
7. Binary Multiplication.
This is an old and much-studied problem. It can also be viewed as that of multiplication of polynomials of degree n over GF(2).
Dan Bernstein's results at High-speed cryptography in characteristic 2 have been recently improved by Murat Cenk and M. Anwar Hasan (see Some New Results on Binary Polynomial Multiplication.) Further improvements in size, depth, or both are shown below.

Here, M(n) is the minimum number of bit operations (ANDs and XORs) needed to multiply two polynomials of degree n-1 .
It is not hard to show that M(n) \leq M(n-1) + 4(n-1).
Below are a few circuits synthesized in the last few years.
1. M(10) \leq 154. is the straight-line program for a circuit with 154 gates and depth 7.
Here is the straight-line program for a circuit with 155 gates and depth 6.
2. M(11) \leq 186. is the straight-line program for a circuit with 186 gates and depth 7.
Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.
3. M(12) \leq 207. is the straight-line program for a circuit with 207 gates and depth 7.
Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.
4. M(15) \leq 312.
This illustrates why I think we don't really know the values of M(n).
Bernstein's 2009 circuit has 329 gates. Cenk and Hasan improved this to 317 gates in 2015. is an optimization of the latter circuit with 312 gates and depth 9 (send me email for lower-depth circuits).
5. M(n) for n \leq 100.
contains circuits for values of n up to 100. Most of these are the smallest circuits known.

8. The S-Box of AES.
Here is a circuit for the S-Box of AES. The circuit is described as a straight-line program using ANDs, XORs, XNORs (denoted by #). It contains 113 gates (32 AND gates and 81 XOR/XNOR gates) and has depth 27. This is an improvement, by Cagdas Calik, of a previous circuit in this page.
Here is a circuit for the reverse direction of the S-Box. The circuit has size 121 and depth 21. We haven't worked much on this.
The S-Box can be reduced to depth 16 and size 125 in the forward direction and 127 in the reverse direction.
Here is a circuit for the forward directions of the S-Box. Here is a circuit for the reverse direction of the S-Box.

9. All predicates on 4 bits.
The multiplicative complexity of a function is the number of AND gates needed to implement it over the basis AND,XOR,1 (the 1 is for negation).
For example, it is not easy to build a circuit with minimum number of AND gates for a function such as

g(a,b,c,d) = abcd + abc + abd + acb + acd + ab + ac + ad + bc + bd + cd.

The following facts have been verified:
1) Every predicate on four bits can be implemented with at most 3 AND gates and at most 7 XOR gates; (for functions that have the constant 1 in their algebraic normal form, the last gate would have to be negated; there are only two functions - modulo permutation of inputs - which appear to require 7 XOR gates)
2) Every predicate on four bits, but of degree 3, can be implemented with 2 AND gates.

The following (large) file contains straight-line programs (SLPs) for all predicates f(x1,x2,x3,x4) with f(0,0,0,0) = 0. The SLPs are indexed by fn, where n is the decimal number corresponding to the truth table of the function. For example the function x1*x2*x3*x4 has truth table 0000 0000 0000 0001 = 1. So you can find it under "f1". The circuits were independently verified by Chris Wood (many thanks!).
If you denote by C_n[A,X] the set of functions with multiplicative complexity A for which an AND-optimal circuit requires exactly X XOR gates, then C_n[A,X] is closed under permutation of inputs (but not under arbitrary affine transformations). This property allows for detection of sub-optimal circuits. I have not coded this yet, so many of the circuits below are likely to be suboptimal.
There are the two functions (up to permutation of inputs) that use 7 XOR gates and 3 AND gates. These functions do not appear to be special in any way. Magnus Find used a SAT solver to look into this. The SAT solver "proved" that neither function has a circuit with 6 XOR gates. Given other functions, the SAT does find circuits with 6 XOR gates. So it would appear that these two functions are special in some way.

Predicates on four bits.

10. All predicates on 5 bits. Meltem Turan, Cagdas Calik, and I have shown that all predicates on 5 bits have multiplicative complexity at most 4. I lost a bet about this with Cagdas (I thought it would be 5 or more). Multiplicative complexity grows exponentially with the number of input bits, so one of these days I will win a bet.

Currently, CMT is Joan Boyar, Morris Dworkin, Rene Peralta, Meltem Turan, Cagdas Calik, Luis Brandao. Past collaborators include: Michael Bartock, Ramon Collazo, Magnus Find, Michael Fischer, Murat Cenk, Christopher Wood, Andrea Visconti, Chiara Schiavo, Holman Gao, Bruce Strackbein, Larry Bassham .