I made one crypto challenge each for CODEGATE CTF 2022 quals and finals. In this article, I’m gonna explain how the cube attack works, and show the way to solve my challenge “Happy S-box” from the quals.

Cube Attack

Consider a multivariate equation over $\text{GF}(2)$, $f(x_1, \dots, x_n)$. There will be multiple monomials in $f$, and what we want to do is to pick one monomial $g = x_{i_1}x_{i_2} \cdots x_{i_k}$ and divide $f$ by $g$. Then we will get: $f = qg + r$. If the degree of $q$ is 1, then we can do an interesting attack here.

First, let’s fix the value of $x_i$ which is not in $g$. Then, we will try all the possible combinations of $(x_{i_1}, x_{i_2}, \dots, x_{i_k})$ to calculate $f$, and sum up the results: $s = \sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{f}$. Here’s an interesting property, the core of the cube attack: $s = q$.

Let’s think of one monomial $h$ in $r$. Then we can write $h = h_g h’$: $h_g$ consists of variables which are also in $g$, meanwhile $h’$ consists of variables which are not in $g$. We know that when we calculate $\sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{h}$ as above, the value of $h’$ is fixed as $h’$ does not have any variable from $g$. Also, we know that $h = h_g h’ = h’$ when the variables in $h_g$ is all $1$. See, how many times that $h_g$ will be $1$ while doing $\sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}$?

If the number of variables in $h_g$ is $l$, then the answer is $2^{k-l}$ times. We know $l < k$, as $h$ is not dividable by $g$. Therefore, $\sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{h} = \sum_{2^{k-l} \text{ times}}{h’} = 0$. From this, we know that $\sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{r}$ is also $0$. Finally, $\sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{f} = \sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{qg + r} = \sum_{x_{i_1}, x_{i_2}, \dots, x_{i_k}}{qg} = q$, as $g$ becomes $1$ only once when $x_{i_1}, x_{i_2}, \dots, x_{i_k}$ are all $1$.

So, we can get an linear equation $q$, by calculating $f$ $2^{d - 1}$ times where $d$ is the degree of $f$. Wow, we know how to solve a linear system over $\text{GF}(2)$, right?

Happy S-box

Here’s the code of Happy S-box.

#!/usr/bin/python3
import os
import random

class LFSR:
    rand = random.Random("Codegate2022")
    Sbox = [ [0] * 64 + [1] * 64 for _ in range(512 - 6) ]
    for i in range(len(Sbox)):
        rand.shuffle(Sbox[i])

    def __init__(self, seed):
        self.state = seed
        for _ in range(1024):
            self.next()
    
    def next(self):
        v = self.state
        # x^512 + x^8 + x^5 + x^2 + 1
        n = ((v >> 0) ^ (v >> 504) ^ (v >> 507) ^ (v >> 509)) & 1
        self.state = (v >> 1) | (n << 511)
    
    def output(self):
        out = 0
        for i in range(512 - 6):
            out ^= self.Sbox[i][(self.state >> i) & 127]
        return out


guess_this = os.urandom((512 - 8) // 8)
guess_this = guess_this[:-1] + bytes([guess_this[-1] & 0xf0])
seed = int.from_bytes(guess_this, 'big') << 8
print("Seed generated. It'll take some time to generate a hint.")

v = 0
for i in range(2 ** 12):
    lfsr = LFSR(seed | i)
    v <<= 1
    v |= lfsr.output()

print("Here's the hint. It should be enough.")
print(v.to_bytes(2 ** 12 // 8, 'big').hex())
ans = input("Guess the seed > ")

if guess_this == bytes.fromhex(ans):
    with open('flag', 'r') as f:
        print(f.read())
else:
    print("WRONG")

There are too many S-boxes in LFSR.output(). Can we use the cube attack here?

Let’s see the degree of these S-boxes:

from sage.crypto.boolean_function import BooleanFunction
import random
rand = random.Random("Codegate2022")
Sbox = [ [0] * 64 + [1] * 64 for _ in range(512 - 6) ]
for i in range(len(Sbox)):
    rand.shuffle(Sbox[i])

mx = 0
for i in range(len(Sbox)):
    v = BooleanFunction(Sbox[i]).algebraic_degree()
    mx = max(v, mx)

print(mx)

The result is 6. Oh, so we can perform the cube attack by picking 5 variables for $g$! As the challenge gives LFSR(seed | i).output() for for i in range(2 ** 12), we can solve this challenge by:

  1. Preprocess linear equations which can be acquired from for i in range(2 ** 12)
  2. Solve the linear system after getting outputs from the server

Preprocess

#!/usr/bin/python3
import os
import random
import itertools
from tqdm import tqdm

class LFSR:
    rand = random.Random("Codegate2022")
    Sbox = [ [0] * 64 + [1] * 64 for _ in range(512 - 6) ]
    for i in range(len(Sbox)):
        rand.shuffle(Sbox[i])

    def __init__(self, seed):
        self.state = seed
        for _ in range(1024):
            self.next()
    
    def next(self):
        v = self.state
        # x^512 + x^8 + x^5 + x^2 + 1
        n = ((v >> 0) ^ (v >> 504) ^ (v >> 507) ^ (v >> 509)) & 1
        self.state = (v >> 1) | (n << 511)
    
    def output(self):
        out = 0
        for i in range(512 - 6):
            out ^= self.Sbox[i][(self.state >> i) & 127]
        return out

outfile = open('eq.txt', 'w')

for c in tqdm(itertools.combinations(list(range(12)), 5), total=792):
    base = 0
    for x in itertools.product(range(2), repeat=5):
        inp = 0
        for i in range(5):
            if x[i]:
                inp |= 1 << c[i]
        
        base ^= LFSR(inp).output()

    arr = []
    for idx in range(512 - 12):
        res = base
        for x in itertools.product(range(2), repeat=5):
            inp = 1 << (idx + 12)
            for i in range(5):
                if x[i]:
                    inp |= 1 << c[i]
            
            res ^= LFSR(inp).output()
        
        arr.append(res)
    arr.append(base)

    outfile.write(str(arr) + '\n')

outfile.close()

Solve

#!/usr/bin/sage
import os
import random
import itertools
from pwn import *
import sys

if len(sys.argv) != 3:
    print("How to use: ./solver.sage <IP> <PORT>")
    exit(0)

r = remote(sys.argv[1], sys.argv[2])
print("CONNECTED")

r.recvuntil(b'enough.\n')
output = int(r.recvline().strip().decode(), 16)
print("RECEIVED")

# Solver ======================================

mat = []
vec = []
with open('eq.txt', 'r') as f:
    for _ in range(792):
        arr = eval(f.readline().strip())
        mat.append(arr[:-1])
        vec.append(arr[-1])

mat = Matrix(GF(2), mat)

for idx, c in enumerate(itertools.combinations(list(range(12)), 5)):
    base = 0
    for x in itertools.product(range(2), repeat=5):
        inp = 0
        for i in range(5):
            if x[i]:
                inp |= 1 << c[i]
        
        base ^^= (output >> (2 ** 12 - 1 - inp)) & 1
    vec[idx] ^^= base

vec = vector(GF(2), vec)
ans = mat.solve_right(vec)

guess = int(0)
for i in range(512 - 12):
    guess |= int(ans[i]) << int(i)

r.recvuntil('> ')
r.sendline((hex(guess)[2:] + '0').encode())

r.interactive()

Conclusion

The attack itself is easy to understand, but it may seem a bit hard to use in real world, as most of PRNGs are much more complicated than this nowadays. Who uses S-boxes over $\text{GF}(2)$? However, it’s still useful in attacking symmetric ciphers, including an existing scheme called Trivium! If you are interested in, please check the section 5, 6 and 7 of the original paper.