YACST2
Captcha is a modern simple Turing test for everyday use, for human it's simple, but for bot or a simple neural network captcha can become a hard nut to crack.
You can try to solve it with your AI too, but it definitely can be solved with several lines of code, isn’t it?
Hints
OK, so that's yet another captcha. We navigate to the site and see a form where we're asked to solve the captcha 3000 times.
This is YACST2, so there must have been a first YACST. Google turns up some writeups on YACST (which was at VolgaCTF 2015). The task is quite similar: we get an audio captcha (a WAV file with five spoken digits), and the sound is quite noisy. As I don't know shit about WAV files (and don't have any of those fancy Python modules for working with them), I wanted to do something with these files so I could solve the task some other way.
At first I wanted to use something like Audacity's "sound search" feature, which marks where each sound begins and ends. As there are only ten different samples, I thought that simply measuring a sound's duration might be enough to tell which digit it is.
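The duration idea can be sketched without Audacity. This is only an illustration of that abandoned approach, not what the final script does. It assumes Python 3, a 16-bit PCM WAV file and a hand-picked loudness threshold; `read_samples`, `find_sounds`, `threshold` and `min_gap` are names and values made up for this sketch. With audio as noisy as this captcha's, the threshold would need tuning and might not work at all.

```python
import struct
import wave


def read_samples(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = w.readframes(w.getnframes())
        count = len(frames) // 2
        samples = struct.unpack("<{0}h".format(count), frames)
        # Keep only the first channel if the file has several.
        return samples[::w.getnchannels()], w.getframerate()


def find_sounds(samples, rate, threshold=2000, min_gap=0.1):
    """Return (start, end) times in seconds of the loud stretches.

    A sample counts as loud when its absolute value reaches `threshold`.
    Loud samples separated by at most `min_gap` seconds of quiet are
    treated as one sound.
    """
    gap = int(min_gap * rate)
    segments = []
    start = last_loud = None
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            continue
        if start is None:
            start = i
        elif i - last_loud > gap:
            # Long silence since the last loud sample: close the sound.
            segments.append((start / float(rate), (last_loud + 1) / float(rate)))
            start = i
        last_loud = i
    if start is not None:
        segments.append((start / float(rate), (last_loud + 1) / float(rate)))
    return segments
```

Calling `find_sounds(*read_samples("captcha.wav"))` would give one `(start, end)` pair per detected sound, and `end - start` is the duration that could be compared against the ten known samples.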
Audacity cannot be controlled with command line arguments, so I needed another solution. I found sox, which is a 100% command-line utility for working with sound. The only drawback was that I couldn't find anything like the "sound search" feature in it.
That's when I bumped into the spectrogram:
 
sox can generate a spectrogram image of any sound. We have only 10 samples to recognize, and they are well separated from each other in the recording. So I decided to take OpenCV and just find the samples in the image. That was easy:
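For reference, here is a minimal sketch of driving the spectrogram step from Python with `subprocess` instead of `os.system`. The function names, the `raw` switch and the file paths are illustrative and not part of the original script. The `-o` (output file name) and `-r` (raw image without axes and legend) options belong to sox's `spectrogram` effect; whether raw images match better against templates is an assumption I have not tested here.

```python
import subprocess


def spectrogram_command(wav_path, png_path, raw=False, sox="sox"):
    """Build the sox command line that renders wav_path as a PNG spectrogram."""
    # "-n" is sox's null output: no audio is written, only the image.
    cmd = [sox, wav_path, "-n", "spectrogram"]
    if raw:
        # "-r" drops the axes and legend, leaving only the spectrogram itself.
        cmd.append("-r")
    # "-o" names the PNG; without it sox writes spectrogram.png.
    cmd += ["-o", png_path]
    return cmd


def make_spectrogram(wav_path, png_path, raw=False, sox="sox"):
    """Run sox; raises subprocess.CalledProcessError if sox fails."""
    subprocess.check_call(spectrogram_command(wav_path, png_path, raw, sox))
```

Passing the arguments as a list avoids shell quoting problems, and `check_call` makes a failed sox run visible instead of silently leaving a stale image behind.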
 
So, with someone's old YACST script to send the captchas to the server, curl to download the captcha, sox to generate the spectrogram and OpenCV to find the digits, I started solving these captchas. Recognition wasn't 100% accurate, because the sounds varied in duration, but it was good enough. 3000 captchas were solved in a few hours:
 
So,
FLAG: VolgaCTF{Sound IS L1ke M@th if A+B=C THEN C-B=A}
Script:
# Python 2 script (it uses urllib2)
import cv2
import numpy as np
import os
import urllib2
# Template file suffixes per digit; 5 and 6 each have a second template.
specs = [["0"], ["1"], ["2"], ["3"], ["4"], ["5", "5_2"], ["6", "6_2"], ["7"], ["8"], ["9"]]
# templates[d] is the list of grayscale spectrogram templates for digit d.
templates = []
for names in specs:
    templates.append([cv2.imread('templates/spectrogram{0}.png'.format(name), 0) for name in names])
def detect(img):
    # Collect (x, digit) for every position where a template matches well enough.
    out = []
    threshold = 0.85
    for i in range(0, 10):
        for template in templates[i]:
            res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
            loc = np.where(res >= threshold)
            for pt in zip(*loc[::-1]):
                out += [(pt[0], i)]

    if len(out) == 0:
        return []
    # Sort the hits left to right and merge neighbouring hits of the same digit.
    out = sorted(out, key=lambda a_entry: a_entry[0])
    newout = []
    prev_x, prev_v = out[0]
    for x, v in out[1:]:
        if x - prev_x < 10 and v == prev_v:
            # same digit, just a slightly shifted match
            continue
        newout += [prev_v]
        prev_x = x
        prev_v = v
    # The last group is still pending here, so always emit it
    # (previously it was dropped when every hit fell into a single group).
    newout += [prev_v]
    return newout
def find():
    # make spectrogram (sox writes it to spectrogram.png by default)
    os.system("D:/Programs/sox-14-4-2/sox captcha.wav -n spectrogram")
    img = cv2.imread('spectrogram.png', 0)
    # detect the digits and join them into the answer string
    return "".join(str(x) for x in detect(img))
while True:
    # session cookie for the captcha site
    cookee = "JSESSIONID=kjavTFD2FYNe1_oOdQNcan6CbSprbCXUi87z-TTn"

    # download
    os.system("D:/Programs/cygwin64/bin/curl -s --header \"Cookie: " + cookee + "\" http://yacst2.2016.volgactf.ru:8090/captcha --output captcha.wav")
    c = find()
    print(c)
    # answer
    url = "http://yacst2.2016.volgactf.ru:8090/captcha"
    data = "captcha=" + c
    cookie = {"Cookie": cookee}
    req = urllib2.Request(url, data, cookie)

    try:
        f = urllib2.urlopen(req)
        print(f.read())
    except urllib2.HTTPError:
        print("error")
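The script above targets Python 2, where `urllib2` exists. On Python 3 the answer step could look roughly like the sketch below. The URL and the `captcha` form field come from the script; `build_answer_request`, `submit` and the UTF-8 decoding of the reply are my own assumptions, and I have not run this against the (long gone) challenge server.

```python
import urllib.parse
import urllib.request

CAPTCHA_URL = "http://yacst2.2016.volgactf.ru:8090/captcha"


def build_answer_request(answer, session_cookie, url=CAPTCHA_URL):
    """Build the POST request that submits one recognized captcha."""
    body = urllib.parse.urlencode({"captcha": answer}).encode("ascii")
    # Supplying data makes urllib send a POST; the cookie ties the
    # answer to the session the captcha was downloaded with.
    return urllib.request.Request(url, data=body, headers={"Cookie": session_cookie})


def submit(answer, session_cookie):
    """Send the answer and return the server's reply as text."""
    request = build_answer_request(answer, session_cookie)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", "replace")
```

Keeping request construction separate from sending makes it possible to inspect the request (method, body, headers) without touching the network.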
 
 