-
It comes in voters’ own words, often registered onto the clipboards of canvassers, during a call-center phone conversation, in an online signup sequence or a stunt like “share your story.” As part of the Dreamcatcher project, Obama campaign officials have already set out to redesign the “notes” field on individual records in the database they use to track voters so that it sits visibly at the top of the screen—encouraging volunteers to gather and enter that information. And they’ve made the field large enough to include the “stories” submitted online. (One story was 60,000 text characters long.)
What can the campaign do with this blizzard of text snippets? Theoretically, Ghani could isolate keywords and context, then use statistical patterns gleaned from the examples of millions of voters to discern meaning. Say someone prattles on about “the auto bailout” to a volunteer canvasser: Is he lauding a signature domestic-policy achievement or is he a Tea Party sympathizer who should be excluded from Obama’s future outreach efforts? An algorithm able to interpret that voter’s actual words and sort them into categories might be able to make an educated guess.
This is a crazy-ambitious data-mining project from the Obama reelection campaign, which they call “microlistening.” Can algorithms read voters’ stories and find the puppet strings that draw the flock one way or the other?
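To make the "sort their words into categories" idea concrete, here is a minimal sketch of a bag-of-words naive Bayes classifier. This is a stand-in technique I picked for illustration, not the campaign's actual method, which the article does not describe. The labels (`supportive`, `critical`), the training notes, and every function name are invented.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase a note and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-score."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Start from the log prior for this label
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace smoothing so an unseen word does not zero out a label
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented canvasser notes, for illustration only (hypothetical data)
EXAMPLES = [
    ("the auto bailout saved my job at the plant", "supportive"),
    ("glad the bailout kept the factory open", "supportive"),
    ("the auto bailout was a waste of taxpayer money", "critical"),
    ("government should never have funded that bailout", "critical"),
]

word_counts, label_counts = train(EXAMPLES)
print(classify("the bailout saved the plant", word_counts, label_counts))
```

With four toy notes this only shows the mechanics: the word "bailout" alone tells the classifier nothing, and it is the surrounding words that tip the guess. A real system would need far more labeled examples and would still only be making an educated guess.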
-
In other words, person for person:
Asians are the most desired racial group in the country.
Then Latinos. Then whites, sort of. As we’re seeing it now, the data is being distorted: a huge part of the country is white, and white people mostly like to talk amongst themselves. Intentionally or not, minorities are left out in the cold.
Nonetheless, people prefer their own race
Given equal choice, every race strongly prefers itself: And white people actually prefer themselves the least, but right now there’s just so many of them.
What If There Weren’t So Many White People? « OkTrends
Booyah! Asians come out as the most desired race when OkCupid crunches its messaging data by race. Whatever you feel about online dating, the OkTrends posts always have something interesting. This one reminds me that when you finish interpreting and analyzing a set of data in one context, you should always examine it from a different angle. Here, they do a straight-up analysis of which race sends messages to which race, then rescale the numbers by each race’s percentage of the population. It’s common sense, but it’s just so magical how the same numbers can tell totally different stories.
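Here is a rough sketch of how that kind of rescaling can flip the story. It assumes the simplest possible adjustment, dividing each group's received messages by its share of users; OkTrends' exact method may well differ. The groups `A`, `B`, `C`, the counts, and the shares are all invented, not OkCupid's data.

```python
# Invented counts: MESSAGES[sender][recipient] (hypothetical, not OkCupid's data)
MESSAGES = {
    "A": {"A": 700, "B": 60, "C": 40},
    "B": {"A": 50, "B": 30, "C": 20},
    "C": {"A": 40, "B": 20, "C": 40},
}

# Invented share of the user base in each group (hypothetical)
USER_SHARE = {"A": 0.8, "B": 0.1, "C": 0.1}

def received_totals(messages):
    """Sum the messages each group receives across all senders."""
    totals = {}
    for row in messages.values():
        for recipient, count in row.items():
            totals[recipient] = totals.get(recipient, 0) + count
    return totals

def rescale_by_share(totals, share):
    """Divide each group's received messages by its share of users."""
    return {group: totals[group] / share[group] for group in totals}

raw = received_totals(MESSAGES)
scaled = rescale_by_share(raw, USER_SHARE)

# Group with the most messages before and after rescaling
print(max(raw, key=raw.get))
print(max(scaled, key=scaled.get))
```

In the raw totals the largest group wins simply because there are more of its members to receive messages; after dividing by share, a smaller group comes out on top. Same numbers, different story.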
-
One request I got was someone hoping to study how social connectedness and social networking relate to finding jobs. If you can correlate high social connection and employment rates, you can draw a conclusion about what actually works in job searches. You can then design computer programs that help people do that. I could see that making a difference in day-to-day life.
Mining and its risks—not just coal mining, but data mining. A scholar used a web crawler to collect data from public Facebook profiles and was threatened with a lawsuit by Facebook’s legal department.
One of the studies requesting use of his data was on unemployment and social media connectedness, which is relevant here. But I’m also interested in the larger issues: is scholarly data mining of public data unethical? Was Facebook protecting its users, or protecting itself (since the technical team was aware of the data mining)? Do I want this guy to have on file the fact that I listed “sandwiches” as an interest on Facebook?
