Apr 08

Reported by Alexis Madrigal in the Atlantic, 18 March 2011.

Given the awesome correlating powers of today’s stock trading computers, the idea may not be as far-fetched as you think.

Images: 1. Anne Hathaway. Flickr/zigzaglens; 2. Warren Buffett. Flickr/trackrecord.

A couple of weeks ago, Huffington Post blogger Dan Mirvish noted a funny trend: when Anne Hathaway was in the news, shares of Warren Buffett’s Berkshire Hathaway went up. He pointed to six dates going back to 2008 to show the correlation. Mirvish then suggested a mechanism to explain the trend: “automated, robotic trading programming are picking up the same chatter on the Internet about ‘Hathaway’ as the IMDb’s StarMeter, and they’re applying it to the stock market.”

The idea seems ridiculous. But the more I thought about the strange behavior of algorithmic trading systems, the news that Twitter sentiment analysis could be used by stock market analysts, and the fact that many computer programs are simply looking for tradeable correlations, the more I wondered whether Mirvish’s theory was plausible.

I called up John Bates, a former Cambridge computer scientist whose company Progress Software works with hedge funds and others to help them find new algorithmic strategies. I asked, “Is this at all possible?” And I was surprised that he answered, roughly, “Maybe?”

“We come across all sorts of strange things in our line of business, strange correlations,” Bates told me. “And I’ve had a lot of interest in this for a long time because it’s really often the secret source for certain hedge funds.”

Companies are trying to “correlate everything against everything,” he explained, and if they find something that they think will work time and again, they’ll try it out. The interesting thing, though, is that it’s all statistics, removed from the real world. It’s not as if a hedge fund’s computers would spit out the trading strategy as a sentence: “When Hathaway news increases, buy Berkshire Hathaway.” In fact, traders won’t always know why their algorithms are doing what they’re doing. They just see that the algorithm has found some correlation and is betting on Buffett’s company.

Now, generally the correlations are between some statistical indicator and a stock or industry. “Let’s say a new instrument comes to an exchange, you might suddenly notice that an instrument moves in conjunction with the insurance sector,” Bates posited. But it’s thought that some hedge funds are testing strategies that mine news and social media datasets for other types of correlations.

Does it happen a lot? Bates doesn’t think so, but it’s not out of the question. And, in any case, we’re going to see a lot of strange trading strategies as hedge fund managers’ computing resources grow ever more powerful and they are actually able to “correlate everything against everything.” Oh, it’s raining in Kazakhstan? Buy pork bellies in Brazil! And sell wheat in Kansas! Dump Apple stock! Why? Because the computer says that 193 out of the last 240 times it rained in Kazakhstan, pork bellies in Brazil went up, and wheat prices and Apple shares went down.
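
To make the arithmetic behind that hypothetical concrete, here is a minimal Python sketch of the "hit rate" such a system might compute: the share of signal days on which the outcome also occurred. The function name `conditional_hit_rate` and the two series are invented for illustration; only the 193-out-of-240 figure comes from the text above, and nothing here describes how any real trading system works.

```python
def conditional_hit_rate(signal, outcome):
    """Fraction of days with the signal on which the outcome also occurred."""
    # Keep only the outcomes observed on days when the signal fired.
    outcomes_on_signal_days = [o for s, o in zip(signal, outcome) if s]
    if not outcomes_on_signal_days:
        raise ValueError("signal never fired, so the rate is undefined")
    return sum(outcomes_on_signal_days) / len(outcomes_on_signal_days)


# Hypothetical data matching the example: 240 rainy days in Kazakhstan,
# with pork bellies in Brazil rising on 193 of them.
rained = [True] * 240
pork_bellies_up = [True] * 193 + [False] * 47

rate = conditional_hit_rate(rained, pork_bellies_up)
print(f"{rate:.1%}")  # 80.4%
```

A rate of roughly 80% is the kind of number that would be reported without any sentence-level explanation of why the two series move together.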

It sounds crazy, sure, but they’re the ones making 10 figures.

Feb 21

Reported by Nicola Jones, in Nature News, 15 February 2011, updated 17 February 2011.

TV star Watson is a step towards a new kind of search engine.

A superstar supercomputer can carry out powerful searches that may one day be of help to scientists. AP/Press Association Images

IBM’s supercomputer Watson is going up against top players of the US television quiz programme Jeopardy! this week, stirring up excitement in the artificial-intelligence community and prompting computer science departments across the country to gather and watch.

“It is, in my mind, a historic moment,” says Oren Etzioni, director of the Turing Center at the University of Washington, Seattle. “I watched Garry Kasparov playing Deep Blue. This absolutely ranks up there with that.”

Jeopardy! contestants are given clues in the form of answers and must try to work up the right questions. Watson, with its 16-terabyte memory, is capable of tackling normal Jeopardy! clues — including all the puns, quips and ambiguities they typically contain. It dissects the clue, compares it against a ream of facts and rules that it has gleaned from reading a raft of books (from encyclopaedias to the complete works of Shakespeare), and assigns probabilities to its answers before coming up with a response.

In the time it takes host Alex Trebek to read the clue to the human contestants, Watson comes up with an answer and decides whether it is confident enough to ring in. The match, which was filmed in advance in January, airs 14–16 February (see the programme’s Watson website).
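
The last step described above, deciding whether the best answer is trusted enough to ring in, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: IBM has not published these details, and the function name `should_ring_in`, the 0.5 threshold and the candidate responses are all hypothetical.

```python
def should_ring_in(candidates, threshold=0.5):
    """Return the most probable response if it clears the threshold, else None.

    `candidates` maps each candidate response to an estimated probability.
    The threshold value is an assumption made for this sketch.
    """
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= threshold else None


# Hypothetical candidates for a single clue.
confident = {"What is Hamlet?": 0.91, "What is Macbeth?": 0.06}
unsure = {"What is Hamlet?": 0.31, "What is Macbeth?": 0.27}

print(should_ring_in(confident))  # What is Hamlet?
print(should_ring_in(unsure))     # None
```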

Although it might sound like nothing more than a stunt, computer scientists say that Watson is an important advance in artificial intelligence, marking a shift that will create much better search engines and help scientists to keep up in their fields.

What Watson doesn’t do is attempt to mimic the human ability to use common sense, make leaps of logic or imagine the future, notes Patrick Winston of Massachusetts Institute of Technology in Cambridge. As a result, it has failed to capture his interest. “I’m planning to go to bed early. I’ll watch the re-runs,” he says.

Deep thought

The computer system is based on IBM’s DeepQA project, which aims to answer ‘natural-language’ questions in standard English, such as ‘Which nanotechnology companies are hiring on the West Coast?’ The trick is for it to both understand that type of query and provide a meaningful answer. “Good luck getting that from Google,” says Etzioni.

That goal is at the core of many computer-science endeavours, including the long-running artificial-intelligence project Cyc, started in 1984 and now run by Cycorp in Austin, Texas. One trouble with Cyc, says Etzioni, is that its database relies on human beings typing coded facts and knowledge into its system. The alternative is to train computers to learn by reading. There are several big projects in the works on this front, including the Never-Ending Language Learning system (NELL) at Carnegie Mellon University in Pittsburgh, Pennsylvania, and Etzioni’s KnowItAll system, most of which are part-funded by the US Defense Advanced Research Projects Agency.

What IBM has done that’s different, says Etzioni, is to focus on a very specific situation (the game of Jeopardy!), spend a lot of time on how to interpret cunning clues, create a database that is the equivalent of about a million books, and find some way to get the system’s performance to shoot up — it comes up with answers in seconds. IBM hasn’t released all the details of how Watson works, so how they have done this is not clear. But Etzioni guesses that the way it collates facts from its reading is similar to how his KnowItAll system approaches the problem. It’s impressive, notes Etzioni, how DeepQA has managed to basically achieve what Cyc set out to do decades ago, but in just a few years.

“Just like with Deep Blue, it’s really bringing together the state-of-the-art in hardware and software,” says Henry Kautz, a computer scientist at the University of Rochester, New York, and president of the Association for the Advancement of Artificial Intelligence in Menlo Park, California, who is also impressed by Watson.

Etzioni says he expects natural-language software to make a big dent in search applications over the next five years, although at the moment systems such as Watson aren’t ready for ‘prime time’: he notes that Microsoft bought a natural-language processing company called Powerset in 2008 for US$100 million, “but you don’t see Microsoft using it in any visible way”. Kautz agrees that systems as broad and powerful as Watson could be available for general use “surprisingly soon… Let’s say three to four years.”

Crying for help

Etzioni argues that a search engine that can deal with natural-language queries is necessary for scientists trying to keep up with the mass of knowledge now being generated in their field, so they can ask, say, “What are the top ten genes currently being studied in cancer research?”, rather than having to trawl through the literature to find out.

Others disagree. Canadian writer Malcolm Gladwell said in a recent discussion about the future of search technologies that current projects “are solving lots of problems that aren’t really problems… You cannot point to any area of intellectual activity or innovation or what have you that is today being compromised or hamstrung by some failure in their search technology. Can we honestly go to some scientist and say the reason we can’t cure cancer is you don’t have access to information about cancer research? No!”

Etzioni laughs at that. “To me, that’s as short-sighted as the famous statement that there’s only a world market for five computers,” he says — a statement that, ironically, is attributed to IBM founder Thomas Watson, after whom Watson is named.

“There’s massive production of knowledge, particularly in the biological community, and researchers can’t keep up with it,” says Etzioni. “Applying these tools specifically for medical researchers to keep track of what’s relevant in what they’re interested in is a huge area of my field. It’s true we don’t yet have a killer app, but you talk to anybody and they’re crying out for help.”

Updated:

Watson won against the human contestants. IBM plans to donate their US$1 million winnings to charity. The final scores were: Watson $77,147, Ken Jennings $24,000 and Brad Rutter $21,600.

Read more in Nature | doi:10.1038/news.2011.95 as well as in Designing a computer that can process and understand natural language.

Jan 18

Reported by Bob Brown, in Network World, January 11, 2011.

Google Science Fair looks to bring glory to science talent.

Google (NSDQ: GOOG) is urging youths ages 13 to 18 to take part in a worldwide science fair that will be hosted by the search giant online.

Google, in a blog posting titled “Google Science Fair seeks budding Einsteins and Curies,” invokes the story of its founders, onetime computer science students Larry Page and Sergey Brin, to encourage young people to take part in the event.

“Larry and Sergey were fortunate to be able to get their idea in front of lots of people. But how many ideas are lost because people don’t have the right forum for their talents to be discovered? We believe that science can change the world—and one way to encourage that is to celebrate and champion young scientific talent as we do athletes and pop idols,” Google writes.

The Google Science Fair is being conducted in partnership with CERN, The LEGO Group, National Geographic and Scientific American.  Details on how to enter are here, but the basics are that students can enter by themselves or in groups of three by April 4.  Finalists will be invited to participate in a live event at Google headquarters in Silicon Valley. Prizes include everything from a trip to the Galapagos Islands to scholarships, and entrants are free to double dip by submitting projects they are doing for local competitions into the Google Science Fair.

The Google Science Fair isn’t the first time Google has sought to inspire creativity via the contest route. It used to hold an Android Developer Challenge to entice programmers to create apps for Android smartphones. (That effort seems to have worked out pretty well, given the growing popularity of Android devices.)

Nov 22

Reported by Joel’s Blog on November 10, 2010.

“I’ve been playing around with the Text CAPTCHA demo page and wondered how good WolframAlpha is at logic questions. As it turns out, Wolfram is pretty smart! Since a CAPTCHA requires an exact answer, though, some of the results from WolframAlpha are logically correct but not exactly correct. If someone wanted to use WolframAlpha to crack the text CAPTCHA technology, they could build in filters and such to narrow down answers to what the CAPTCHA is likely looking for.

Out of 10 demo questions, 3 failed and 7 were correct (although 4 of those 7 had the correct answer but would fail a CAPTCHA unless the exact answer was parsed out). Here are the results:

Text CAPTCHA: What is seven hundred and forty four as a number?
WolframAlpha: NumberQ[744]
Result: ALMOST

Text CAPTCHA: The 7th letter in the word “central” is?
WolframAlpha: the word
Result: FAILED

Text CAPTCHA: Which word in this sentence is all IN capitals?
WolframAlpha: capitals IN
Result: ALMOST

Text CAPTCHA: Which word contains “z” from the list: zoologist, midwifery, spiderweb, crimps?
WolframAlpha: zoologist
Result: SUCCESS!

Text CAPTCHA: The 2nd colour in purple, yellow, arm, white and blue is?
WolframAlpha: yellow
Result: SUCCESS!

Text CAPTCHA: Of the numbers seventy six, 2, 50 or forty, which is the lowest?
WolframAlpha: or
Result: FAILED

Text CAPTCHA: What is the 7th digit in 9686561?
WolframAlpha: 1
Result: SUCCESS!

Text CAPTCHA: Which of these is a colour: monkey, bank or purple?
WolframAlpha: colour purple
Result: ALMOST

Text CAPTCHA: The day of the week in chips, house, bank, mouse, trousers or Friday is?
WolframAlpha: mouse
Result: FAILED

Text CAPTCHA: If a person is called Mary, what is their name?
WolframAlpha: called Mary
Result: ALMOST

Wolfram, you’re close… keep up the good work!  Text CAPTCHA, the demo page was easy.  Are the other questions harder?”

Update: There’s a discussion going on over at Hacker News, if you want to check it out!

Update 2: WolframAlpha can generate a CAPTCHA image of each of these text questions, so as to make it harder for a bot to decode AND answer the question! Check it out: http://www.wolframalpha.com/input/?i=CAPTCHA+What+is+seven+hundred+and+forty+four+as+a+number%3F

Update 3: There is more discussion going on over at Reddit for you guys looking for more insights….

Update 4: Looks like someone put together a script that knows the format of the Text CAPTCHA questions.  It was posted on Hacker News.
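
To give a sense of what a format-aware script could look like, here is a minimal Python sketch that handles four of the question shapes listed above with regular expressions. It is not the script posted on Hacker News; the function name `solve` and the patterns are assumptions made for illustration, and any question shape it does not recognise returns `None`.

```python
import re

# Matches ordinals such as "1st", "2nd", "7th" and captures the number.
ORDINAL = r"(\d+)(?:st|nd|rd|th)"


def solve(question):
    """Return an exact answer for a few known question shapes, else None."""
    question = question.strip()

    # Shape: "What is the 7th digit in 9686561?"
    m = re.search(ORDINAL + r" digit in (\d+)", question)
    if m:
        pos, digits = int(m.group(1)), m.group(2)
        return digits[pos - 1] if 1 <= pos <= len(digits) else None

    # Shape: 'The 7th letter in the word "central" is?' (any quote style)
    m = re.search(ORDINAL + r" letter in the word \W?(\w+)", question)
    if m:
        pos, word = int(m.group(1)), m.group(2)
        return word[pos - 1] if 1 <= pos <= len(word) else None

    # Shape: "Which word in this sentence is all IN capitals?"
    if "capitals" in question.lower():
        caps = re.findall(r"\b[A-Z]{2,}\b", question)
        return caps[0] if caps else None

    # Shape: 'Which word contains "z" from the list: a, b, c?'
    m = re.search(r"contains \W?(\w)\W? from the list: (.+?)\??$", question)
    if m:
        letter, items = m.group(1), re.split(r",\s*", m.group(2))
        return next((item for item in items if letter in item), None)

    return None


print(solve("What is the 7th digit in 9686561?"))                # 1
print(solve("Which word in this sentence is all IN capitals?"))  # IN
```

Because each rule returns only the bare answer, this approach sidesteps the "ALMOST" cases above, where the right answer was present but wrapped in extra text.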
