Sebastian Schmieg

Search by Image (2011, ongoing)


search by image, recursively
starting with a transparent png (400×225px), 2951 images, 12fps
december 9th, 2011, netherlands

 

Esper | Blade Runner

hal-9000

 


similar images, associatively
starting with a transparent "spacer" gif (50×50px), 677 images, 12fps
december 1st, 2011, netherlands

 


search by image, frame by frame
source video: most popular video on youporn (excerpt), 1858 different images, 2880 frames, 24fps
december 28th, 2011, germany

 


search by image, recursively
starting with a photo of myself, 351 images, 12fps
december 28th, 2011, germany

 


search by image, recursively
starting with a search result for “earth”, 391 images, 12fps
december 11th, 2011, netherlands

 


search by image, frame by frame
movie: once upon a time (excerpt), 1810 different images, 2160 frames, 12fps
december 12th, 2011, netherlands

 

wunderkammer

Google Search by Image - Interface

 

archive

 

import re, subprocess, time

class GoogleSearchByImage :

    GOOGLE_URL = "http://www.google.com"

    GOOGLE_SBI_URL = "/searchbyimage?image_url="
    
    AGENT_ID = "Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"

    MIN_SECONDS_BETWEEN_REQUESTS = 2

    _myLastRequestTimestamp = 0
    
    _myCurrentHtml = ""

    def scrape(self, theReference) :
        # Throttle: wait until MIN_SECONDS_BETWEEN_REQUESTS have passed since the last request.
        # The elapsed time is measured once, so the sleep duration can never become negative.
        myElapsed = time.time() - self._myLastRequestTimestamp
        if myElapsed < self.MIN_SECONDS_BETWEEN_REQUESTS :
            time.sleep(self.MIN_SECONDS_BETWEEN_REQUESTS - myElapsed)
        # theReference is the URL of the image to search for.
        self._myCurrentHtml = self.getHtml(self.GOOGLE_URL + self.GOOGLE_SBI_URL + theReference)
        self._myLastRequestTimestamp = time.time()
 
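    def getHtml(self, theUrl) :
        # Fetch the page with curl: -s hides the progress meter, -L follows redirects,
        # -A sends a browser user agent. Retries in a loop until curl succeeds.
        while True :
            try :
                myHtml = subprocess.check_output(["curl", "-s", "-L", "-A", self.AGENT_ID, theUrl])
                # check_output returns bytes on Python 3; decode so the regexes below get text.
                return myHtml.decode("utf-8", "replace")
            except (subprocess.CalledProcessError, OSError) :
                print("Curl error. Will sleep for 10 seconds")
                time.sleep(10)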
         
    
    def getSimilarImages(self) :
        # Collect the image URLs from the "/imgres?imgurl=..." result links.
        # Each URL ends at the first "&amp" or at an encoded "?" ("%3F").
        myPattern = re.compile(r'" href="/imgres\?imgurl=(.*?)(?:&amp|%3F)')
        return myPattern.findall(self._myCurrentHtml)
    
    def getLinkToSimilarImagesPage(self) :
        # Find the href of the "Visually similar images" link on the current results page.
        myPattern = re.compile(r'<a href="([^"]+)">Visually similar images</a>')
        myMatch = myPattern.search(self._myCurrentHtml)
        if myMatch is None :
            return None # the current page has no such link
        myPageUrl = myMatch.group(1).replace("&amp;", "&")
        myPageUrl += "&biw=1600&bih=825" # always keep this
        return self.GOOGLE_URL + myPageUrl
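
The "recursively" in the captions above can be read as a loop: search for an image, pick one of the results, search again with that result. The following is only a minimal sketch of such a loop, not the script used for the videos. It assumes that the first result not yet visited becomes the next query. `recursive_search`, `fake_search` and `FAKE_RESULTS` are hypothetical names, and the hard-coded table stands in for a real search so the example runs offline.

```python
def recursive_search(start_url, search, max_steps=10):
    """Follow search results from image to image.

    `search` is any callable that maps an image URL to a list of result URLs.
    Returns the visited URLs in order, starting with `start_url`.
    """
    visited = [start_url]
    current = start_url
    for _ in range(max_steps):
        # Assumption: take the first result that has not been visited yet.
        next_url = None
        for candidate in search(current):
            if candidate not in visited:
                next_url = candidate
                break
        if next_url is None:
            break  # every result has been seen already, so the chain ends
        visited.append(next_url)
        current = next_url
    return visited


# Hypothetical stand-in for a real search: a fixed table of results.
FAKE_RESULTS = {
    "a.png": ["a.png", "b.png"],
    "b.png": ["a.png", "c.png"],
    "c.png": ["a.png", "b.png"],
}


def fake_search(url):
    return FAKE_RESULTS.get(url, [])


print(recursive_search("a.png", fake_search))
# prints ['a.png', 'b.png', 'c.png']
```

With the class above, `search` could be a small wrapper that calls `scrape` with the URL and then returns `getSimilarImages()`. Each visited URL would then correspond to one frame of the resulting video.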