Pycoin - Hack.lu 2021
This is what we know:
1
2
3
4
5
6
7
| PYCOIN
Sold: 92 times
Type: rev
Risk: Low
Seller: tunn3l
A friend gave me this and he says he can not reverse this... but this is just python?
|
and we get a .pyc file and a hint flag[5] == "5"
.
I unfortunately did not solve this challenge on time for the CTF but found it interesting (I got stuck trying to disassemble with dis
and did not know xdis
did the work).
Thanks to tunn3l, TheBadGod, crabbers and crazyman for their help elaborating this write-up.
Solution
There is another write-up here, but I didn’t find it detailed enough to understand how to solve the challenge. Hopefully a better attempt below!
Uncompyle
I used uncompyle6 to transform the .pyc
to .py
: uncompyle6 pycoin.pyc > uncompiled.py
. The resulting source code is still obscure:
1
2
3
| import marshal
marshalled = b'\xe3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00.... '
exec(marshal.loads(marshalled))
|
Marshal
Marshal is an internal Python object serialization module.
If we have a look at the marshalled string, we see it is actually Python byte code from a file named disassembly
.
1
2
3
| >>> redata = marshal.loads(marshalled)
>>> print(redata)
<code object <module> at 0x7efedf31e9d0, file "<disassembly>", line 1>
|
By scanning the marshalled
string more in details, we spot strings the program asks:
- please supply a valid key:
- valid key!
- invalid key :
and obviously there is MD5 hashing involved (spot md5, digest etc):
1
2
3
| xda\x03md5z
\xda\x06digestZ\
...
|
Disassembling Python bytecode
Unfortunately, the regular dis
disassembler does not work:
1
2
3
4
5
6
7
8
9
10
11
12
13
| >>> dis.disassemble(redata)
1 0 JUMP_FORWARD 2 (to 4)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/dis.py", line 369, in disassemble
_disassemble_bytes(co.co_code, lasti, co.co_varnames, co.co_names,
File "/usr/lib/python3.8/dis.py", line 401, in _disassemble_bytes
for instr in _get_instructions_bytes(code, varnames, names,
File "/usr/lib/python3.8/dis.py", line 340, in _get_instructions_bytes
argval, argrepr = _get_name_info(arg, names)
File "/usr/lib/python3.8/dis.py", line 304, in _get_name_info
argval = name_list[name_index]
IndexError: tuple index out of range
|
The solution consists in using xdis which handles python bytecode from multiple versions.
1
2
3
4
5
6
7
8
| >>> import xdis.std as dis
>>> dis.dis(redata)
1: 0 JUMP_FORWARD (to 4)
2 LOAD_GLOBAL (99)
>> 4 LOAD_CONST (0)
6 LOAD_CONST (('md5',))
8 IMPORT_NAME (hashlib)
10 IMPORT_FROM (md5)
|
Understanding the byte code
We read the byte code step by step to understand what the program does.
It reads the user input, and stores it in a variable named k
. Precisely the following bytecode would correspond to k = str(input('please supply a valid key:')).encode()
.
1
2
3
4
5
6
7
8
9
| 14 POP_TOP
16 LOAD_NAME (str)
18 LOAD_NAME (input)
20 LOAD_CONST ('please supply a valid key:')
22 CALL_FUNCTION 1
24 CALL_FUNCTION 1
26 LOAD_METHOD (encode)
6: 28 CALL_METHOD 0
30 STORE_NAME (k)
|
Check the input length (k
) is 16:
1
2
3
4
5
6
|
32 LOAD_NAME (len)
34 LOAD_NAME (k)
36 CALL_FUNCTION 1
38 LOAD_CONST (16)
40 COMPARE_OP (==)
|
Check that k[0] = 102
. This corresponds to character f
.
1
2
3
4
5
6
| 46 LOAD_NAME (k)
48 LOAD_CONST (0)
50 BINARY_SUBSCR
6: 52 LOAD_CONST (102)
54 COMPARE_OP (==)
|
Each time, if the comparison fails, we will fail with invalid key:
1
2
| 7: 42 EXTENDED_ARG (256)
44 JUMP_IF_FALSE_OR_POP (to 462)
|
Check that k[1] = k[0] + 6
.
1
2
3
4
5
6
7
8
9
10
| 60 LOAD_NAME (k)
62 LOAD_CONST (1)
64 BINARY_SUBSCR
66 LOAD_NAME (k)
68 LOAD_CONST (0)
70 BINARY_SUBSCR
72 LOAD_CONST (6)
6: 74 BINARY_ADD
76 COMPARE_OP (==)
|
As k[0] is f
, we get k[1] = l
. The format of Hack.Lu flags is flag{....}
, so it makes sense.
The next bytecode checks that k[2] = k[1] - k[0] + 91
. So, k[2]=‘a’.
Then, k[3] = 103 = ‘g’.
The following bytecode checks that k[4] = k[11] * 3 - 42.
1
2
3
4
5
6
7
8
9
10
11
12
13
| 138 LOAD_NAME (k)
140 LOAD_CONST (4)
142 BINARY_SUBSCR
6: 144 LOAD_NAME (k)
146 LOAD_CONST (11)
12: 148 BINARY_SUBSCR
150 LOAD_CONST (3)
152 BINARY_MULTIPLY
154 LOAD_CONST (42)
156 BINARY_SUBTRACT
158 COMPARE_OP (==)
|
As we know that k[4]=’{’, this means k[11] = (k[4] + 42)/3 = ‘7’
The next check is: k[5] = sum(k)-1322
1
2
3
4
5
6
7
8
9
10
11
| 164 LOAD_NAME (k)
6: 166 LOAD_CONST (5)
168 BINARY_SUBSCR
13: 170 LOAD_NAME (sum)
172 LOAD_NAME (k)
174 CALL_FUNCTION 1
176 LOAD_CONST (1322)
178 BINARY_SUBTRACT
180 COMPARE_OP (==)
|
The next check is k[6] + k[7] + k[10] == 260
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
| 186 LOAD_NAME (k)
188 LOAD_CONST (6)
190 BINARY_SUBSCR
192 LOAD_NAME (k)
194 LOAD_CONST (7)
6: 196 BINARY_SUBSCR
198 BINARY_ADD
14: 200 LOAD_NAME (k)
202 LOAD_CONST (10)
204 BINARY_SUBSCR
206 BINARY_ADD
208 LOAD_CONST (260)
210 COMPARE_OP (==)
|
Then, we have: int(chr(k[7]) * 2) + 1 = k[9]
. In this formula, beware: chr(k[7]) * 2
is the Python way to create a string of 2 characters chr(k[7])
.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
| 216 LOAD_NAME (int)
218 LOAD_NAME (chr)
220 LOAD_NAME (k)
222 LOAD_CONST (7)
224 BINARY_SUBSCR
226 CALL_FUNCTION 1
228 LOAD_CONST (2)
6: 230 BINARY_MULTIPLY
232 CALL_FUNCTION 1
15: 234 LOAD_CONST (1)
236 BINARY_ADD
238 LOAD_NAME (k)
240 LOAD_CONST (9)
242 BINARY_SUBSCR
244 COMPARE_OP (==)
|
We won’t detail the other checks, but there are:
- int(chr(k[7]) * 2) + 1 = k[9]
- k[8] % 17 = 16
- k[9] = k[8] * 2
- md5(k[10] * b’a’).digest()[0] - 1 = k[3]
- k[11] = 55
- k[12] = k[14] / 2 - 2
- k[13] = k[10] * k[8] % 32 * 2 - 1
- k[14] = (k[12] ^ k[9] ^ k[15]) * 3 - 23
- k[15] = 125 = ‘}’
Solving the conditions and flag
First, we solve all conditions which directly provide characters.
This gives us flag{....}
and k[5]='5'
and k[11]='7'
.
Then, we try to find possible values for k[8], knowing that k[8] % 17 == 16
, and that k[9] depends on k[8] is probably an alphanumeric character. We are lucky, there is only one possible value for k[8]: ‘2’.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
| def possible_k8():
'''
computes possible values for k[8] which provide alphanumeric value for k[9]
given formulas:
k[8] % 17 == 16
and k[9] = k[8] * 2
'''
possible = []
for k8 in range(ord('0'), ord('z')+1):
if k8 % 17 == 16:
k9 = k8 * 2
if k9 <= ord('z'):
possible.append(chr(k8))
else:
logger.debug("k8={} k9={} not possible".format(chr(k8), chr(k9)))
logger.debug(possible)
return possible
|
We immediately deduce k[9] = d
from formula k[9] = k[8] * 2
, and k[7] = 9
from int(chr(k[7]) * 2) + 1 = k[9]
. In the last formula, note that the Python operation chr(k[7])*2
returns 9
*2, i.e 99
.
Same, we try to guess possible values for k[10] knowing that md5(k10 * b'a').digest()[0] - 1 = k[3]
. We already have k[3]. Assuming k[10] will be alphanumeric, we brute force each value and keep those that match the md5 formula. There is only 1 result. We get k[10] = ’e’. From k[10], we automatically compute k[13] and k[6].
Then, there are 2 remaining characters to find: k[12] and k[14] which depend on each other.
Assuming k[12] will be alphanumeric, we compute k[14] with (k[12] ^ k[9] ^ k[15]) * 3 - 23
and keep possible values.
In the end, there is only one possible solution: flag{5f92de703d}
Full source code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
| #!/usr/bin/env python3
import logging
from hashlib import md5
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
handler.setLevel(logging.DEBUG)
logger.addHandler(handler)
def init():
'''
Sets the known characters for the flag
'''
k = list('flag{'+'x'*10+'}')
k[11] = chr(55)
k[5] = '5'
logger.debug('len={} k={}'.format(len(k),''.join(k)))
return k
def possible_k8():
'''
computes possible values for k[8] which provide alphanumeric value for k[9]
given formulas:
k[8] % 17 == 16
and k[9] = k[8] * 2
'''
possible = []
for k8 in range(ord('0'), ord('z')+1):
if k8 % 17 == 16:
k9 = k8 * 2
if k9 <= ord('z'):
possible.append(chr(k8))
logger.debug("Possible k8={}".format(possible))
return possible
def possible_k10(k3):
'''
computes possible values for k[10], knowing correct k[3]
'''
possible = []
for k10 in range(ord('0'), ord('z')+1):
computed_k3 = md5(k10 * b'a').digest()[0] - 1
if k3 == computed_k3:
possible.append(chr(k10))
logger.debug("Possible k10={}".format(possible))
return possible
def possible_k12(k9, k15):
'''
computes possible values for k[12], knowing correct k[9] and k[15]
which give alphanumeric result for k[14]
knowing formulas:
k[14] = (k[12] ^ k[9] ^ k[15]) * 3 - 23
k[12] = k[14] / 2 - 2
'''
possible = []
for k12 in range(ord('0'), ord('z')+1):
k14 = (k12 ^ k9 ^ k15) * 3 - 23
if k14 >= ord('0') and k14 <= ord('z'):
if k14 == (k12 + 2) *2:
possible.append(chr(k12))
logger.debug("Possible k12={}".format(possible))
return possible
if __name__ == '__main__':
k = init()
possible = possible_k8()
assert(len(possible) == 1), "we have several possible k8"
k[8] = possible[0]
logger.debug("k[8]={}".format(k[8]))
# k[9] = k[8] * 2
k[9] = chr(ord(k[8]) * 2)
logger.debug("k[9]={}".format(k[9]))
# int(chr(k[7]) * 2) + 1 = k[9]
k[7] = str(ord(k[9])-1)[0]
logger.debug("k[7]={}".format(k[7]))
possible = possible_k10(ord(k[3]))
assert(len(possible) == 1), "we have several possible k10"
k[10] = possible[0]
k[13] = chr( (((ord(k[10]) * ord(k[8])) % 32) * 2) - 1)
logger.debug("k[13]={}".format(k[13]))
#k[6] + k[7] + k[10] = 260
k[6] = chr(260 - ord(k[7]) - ord(k[10]))
logger.debug("k[6]={}".format(k[6]))
possible = possible_k12(ord(k[9]), ord(k[15]))
assert(len(possible) == 1), "we have several possible k12"
k[12] = possible[0]
k[14] = chr((ord(k[12]) + 2) * 2)
logger.debug("k[14]={}".format(k[14]))
logger.info("Solution: {}".format(''.join(k)))
|