Detourning the Web

Week 3: The List (Part 2)

Finally, it's here: all the answers. Two hundred and ninety-seven of them in a tidy paperback. Thanks to the Village Copier I now have my book of answers, scraped from DuckDuckGo's autosuggest. For a prototype I'm very pleased. Next time I might increase the gutter a tad more (and somehow the last page ended up on the back cover), but overall it's very gratifying to hold. I'm especially fond of the answer pairings across the page spreads considering that I printed the list in order. 

"the answers" is a quarter of the size of my list "sometimes." That list is too many pages to publish with the Village Copier or online at Blurb, Lulu, or even with Issuu (shown here) without breaking it into volumes. Of all the lists I scraped, I'm most moved by the range of human experience represented in these saved search submissions. Note for future Ellen: code your own site for an uninterrupted page-turning experience.

The future is on stickers! Excited to explore an alternative physical list manifestation, I printed autocompletions from the phrase "the future is" onto 568 round, clear stickers to share with others and deposit around the city. 

Week 2: The List (Part 1)

We conducted our first scraping exercises this week. After reviewing some command line and Python basics, we installed pip, a Python package manager, and virtualenv to create isolated Python environments—useful if projects require different libraries and/or versions of Python.

For our assignment to generate a looong list from scraped web text, I thought about search engines as both oracles and confessionals. Initially, I hoped to scrape the headlines from the returns of search queries, specifically the results to “the answer is.” I mean, aren’t ALL the answers online? Seriously, where do you turn if you have a question? Your phone, a person, the card catalog? More importantly, what might you ask if you thought you were anonymous or perhaps didn't realize that your query was being logged for future publication by an algorithm or someone like me?

Scraping from Google’s search page proved a different animal from the comparatively straightforward examples in class with Craigslist and my experiments with the NYTimes and Reddit. With my many repeated attempts to solve the puzzle, it didn’t take long for Google to block my IP. Sam suggested I try Bing or DuckDuckGo, and in the process of exploring those options, we couldn’t help but notice the search engines’ autocomplete search suggestions for my query. Though I could not locate the specifics of DuckDuckGo's auto-suggest algorithm, Google's autocomplete predictions are "based on several factors, like how often others have searched for a term" and trending, popular topics.


With his help using the browser’s Developer Tools, we figured out that on DuckDuckGo this information was formatted as JSON. Fortunately, there’s a JSON parser built directly into Python (so no need to use the beautifulsoup library required for parsing HTML), and together we walked through writing the initial lines of code for scraping with this condition.
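
For future reference, here is a minimal sketch of that parsing step using Python's built-in json module. The response shape is an assumption on my part (a JSON array of objects with a "phrase" key), the sample text is made up for illustration, and extract_suggestions is a hypothetical helper name, not necessarily what my program uses.

```python
import json

# ASSUMPTION: the autosuggest response is a JSON array of objects, each
# holding one suggestion under a "phrase" key. This sample is made up.
sample_response = '[{"phrase": "the answer is 42"}, {"phrase": "the answer is no"}]'


def extract_suggestions(raw_json):
    """Return the suggestion strings found in a raw JSON response."""
    data = json.loads(raw_json)  # parse the JSON text into Python lists and dicts
    # Keep only well-formed entries so one odd item does not crash the run.
    return [item["phrase"] for item in data
            if isinstance(item, dict) and "phrase" in item]


suggestions = extract_suggestions(sample_response)
for suggestion in suggestions:
    print(suggestion)
```

In a real run the raw text would come from the HTTP response rather than a hard-coded string.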

My program passes any phrase into the URL request parameters along with each letter from the alphabet, so “the answer is a…” followed by “the answer is b…”. Each individual pass generates a list of auto-suggestions. After it exhausts all 26 letters, it then passes the phrase plus double letters to increase the number of possible returns. So again, “the answer is aa…” followed by “the answer is ab…”. Once I figured out the code and created a working template, I could request results with any phrase I wished. SO MUCH FUN! Thank you, Sam!!
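
The letter-by-letter passes can be sketched roughly like this. It is an illustration under assumptions, not a copy of my program: build_queries, build_url, and collect_suggestions are hypothetical names, the endpoint is a placeholder, and fake_fetch stands in for the real network request so the sketch runs offline.

```python
from itertools import product
from string import ascii_lowercase
from urllib.parse import urlencode

# ASSUMPTION: placeholder endpoint, not the real autosuggest URL.
BASE_URL = "https://example.com/suggest"


def build_queries(phrase):
    """Phrase plus each single letter, then phrase plus each letter pair."""
    singles = [f"{phrase} {letter}" for letter in ascii_lowercase]
    doubles = [f"{phrase} {a}{b}" for a, b in product(ascii_lowercase, repeat=2)]
    return singles + doubles


def build_url(query):
    """Put the query into the URL request parameters."""
    return BASE_URL + "?" + urlencode({"q": query})


def collect_suggestions(phrase, fetch):
    """Call fetch(url) for every query and gather all returned suggestions."""
    results = []
    for query in build_queries(phrase):
        results.extend(fetch(build_url(query)))
    return results


def fake_fetch(url):
    """Stand-in for the real request: returns one made-up suggestion."""
    return ["suggestion for " + url]


queries = build_queries("the answer is")
print(len(queries), queries[0], queries[26])
all_results = collect_suggestions("the answer is", fake_fetch)
```

Swapping fake_fetch for a function that requests the URL and parses the JSON would turn this into a working scraper. A short pause between requests seems wise, given how quickly Google blocked me.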

Here are my steps for this process:
1. Create a project directory
2. Within that directory, create a virtual environment and activate it
3. Install the requests library
4. Run my Python program and save the results to a new file:
    python > rawtext.txt
5. Sort the results, remove duplicate lines, and save to a new file:
    sort rawtext.txt | uniq > sorted_noduplicates.txt
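
The sort-and-dedupe step could also live inside Python instead of the shell. A small sketch, with sort_and_dedupe as a hypothetical helper and made-up sample lines; note that Python orders strings by character code, so the result may differ slightly from what the sort command produces under some locale settings.

```python
def sort_and_dedupe(lines):
    """Return the unique lines in sorted order, like `sort | uniq`."""
    unique = {line.rstrip("\n") for line in lines}  # a set drops duplicates
    return sorted(unique)


# Made-up sample standing in for the contents of rawtext.txt.
raw_lines = ["the future is now\n", "the future is female\n", "the future is now\n"]
cleaned = sort_and_dedupe(raw_lines)
print("\n".join(cleaned))
```

To work on the actual files, read rawtext.txt with open(...).readlines(), pass the result to the function, and write the returned lines to sorted_noduplicates.txt.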

The code is on GitHub, along with my favorite and the most poignant auto-suggest lists so far:

the answer is…
why is my…
the future is…
sometimes i…

Week 1: Altered Websites

This week introduced browser developer tools (Chrome > View > Developer > View Source, Developer Tools, and JavaScript Console) to reveal the underlying structure of a website and temporarily manipulate its content on your local computer. Class discussion and the readings offered multiple strategies for repurposing material, including reconfiguring, replacing, removing, repeating, and/or juxtaposing with content from elsewhere. Ideally the transformation shifts the context of the presentation and generates new meanings. 

Self Search Portrait
Currently a reserved user of social media and contributor to public web forums, I'm always curious to see what turns up in a Google image search of my name. It's always changing, and the results usually have nothing to do with me. Individual clicks on images reveal that each picture is accompanied by descriptive text (visible upon inspection in the Developer Tools) that often, but not always, contains my first name, my last name, or both. Sometimes my name appears somewhere on the page linked to the photo. By replacing the image results with the associated text and displaying it all at once, I hope to better understand how the Google algorithm "sees" me at this moment in time. And right now, it sees some of my LinkedIn activity, at least two other people with the same name, my father, my grandfather, and a few others who share either my first or last name. Of note, I've never been on a dairy farm tour, but among the commenters in the linked article I found an "Ellen" and a "Nickles"--why such a top return for matching information that is buried so much deeper? I also noticed the mention of Ellen Sabin and her book, The Nickels, Dime, and Dollars Book. My last name is spelled "les" not "els". Folks commonly make this mistake, and I'm used to it. But c'mon Google, I had higher expectations for you!

Baby Donut Heads
Last year I noticed a couple of funny photos in the Yelp reviews of dessert restaurants. At first I thought they were accidental: a sweet treat photographed in front of a blurred-out child such that the two combined into a hybrid creature. (Picture a baby with an ice cream cone for a body.) As I started this exercise I discovered that I am not the only one with this observation. It appears to be a recent trend, with images of the sort only appearing in the first pages of Yelp galleries, and once you notice them, you can't not see them. Here I've gathered a bunch together for a chain of local donut shops. I love how this public sharing site has become a playground for silly adults to play with their food and their unassuming children. On a broader level it speaks to how behavior and content organically replicate themselves across the internet.

Strictly Platonic
[WARNING] The following screenshots contain language that some may find offensive.
Looking for material, I started perusing Craigslist and stumbled into the "strictly platonic" category of the personals section. Let's just say that participants treat this category as a loose suggestion, and after curating a selection to show the range, I paired listings with quotes about friends and friendship from a self-help site to amplify this disconnect. But there are also displays of extreme vulnerability. Folks are clearly looking for connections but under a veil of anonymity. I'm struck by the paradox of revealing a profound human need all the while under the protection of a cloaked identity.