New York Tech Journal
Tech news from the Big Apple

Data Driven NYC: #DataScience, #Artificial Intelligence and #ImageRecognition

Posted on November 18th, 2014


11/18/2014 @ Bloomberg, 58th and Lexington, NY

Matt Zeiler @Clarifai talked about using #NeuralNetworks to do image recognition (see the Clarifai web demo to try it out). The theoretical foundations for deep learning were established in the 1980s, but the successful application of these algorithms had to wait for faster hardware (GPUs) and larger data sets for algorithm training. Recently, both have become available, and neural net models have become more accurate than competing algorithms.

Matt mentioned the many uses of image recognition, including classifying consumer photos, shopping for specific items, searching stock photos, placing ads appropriately (e.g., away from negative news images), ad targeting, and analyzing satellite and medical imagery.

For a more technical description of the image recognition methods (and an example of the current limits of the Clarifai web demo), see my notes on a presentation by Rob Fergus.


Next, Mark & Rob @bitly talked about the business model and technology they employ. Mark Josephson said that bitly works with more than 45k brands to improve the reach of their web sites. To most users, however, bitly is known as a service that converts long URLs into shorter links. By shortening a link, bitly can reroute clicks through its servers and monitor who accesses the sites, and in this way it learns how content moves around the world. For instance, they have been able to track the growth in Facebook usage as compared to Twitter’s slower growth.
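The mechanics described above can be sketched in a few lines. This is a toy illustration, not bitly's actual implementation: the class name, the 7-character hash-based codes, and the in-memory click log are all assumptions made for the example.

```python
import hashlib

class Shortener:
    """Toy sketch of a shortener that also records clicks: resolving a
    short code both returns the destination and logs who accessed it,
    which is what makes traffic patterns observable."""

    def __init__(self):
        self.codes = {}    # short code -> long URL
        self.clicks = []   # log of (code, visitor) events

    def shorten(self, long_url):
        # Derive a short, stable code from a hash of the URL.
        code = hashlib.sha1(long_url.encode()).hexdigest()[:7]
        self.codes[code] = long_url
        return code

    def resolve(self, code, visitor):
        # Record who clicked before handing back the destination --
        # this rerouting step is what lets the service monitor access.
        self.clicks.append((code, visitor))
        return self.codes[code]
```

Because every click passes through `resolve`, the accumulated `clicks` log is the raw material for the kind of cross-site usage analysis described above.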

Rob Platzer talked about how they built a scalable data architecture. Their goals are a flexible system that can handle very large volumes while retaining the ability to add information as it becomes available, run multiple parallel analyses of the data, and make it easy to create customized reports. To do this they use a pipeline architecture in which information is added to items in a queue (or the data are analyzed) and the output is passed to the next queue.
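A queue-per-stage pipeline like the one described can be sketched as follows. This is a minimal single-threaded illustration, not bitly's system; the stage names and the placeholder enrichment are assumptions for the example.

```python
from queue import Queue

def enrich(item):
    # Stage 1: attach extra information as it becomes available.
    item["country"] = "US"   # placeholder enrichment value
    return item

def analyze(item):
    # Stage 2: one of possibly several parallel analyses.
    item["url_len"] = len(item["url"])
    return item

def run_pipeline(items, stages):
    """Push items through a chain of queues, applying one stage
    function between each queue and the next."""
    q = Queue()
    for it in items:
        q.put(it)
    for stage in stages:
        nxt = Queue()
        while not q.empty():
            nxt.put(stage(q.get()))
        q = nxt
    return [q.get() for _ in range(q.qsize())]
```

Because each stage only reads from one queue and writes to the next, stages can be added, swapped, or run in parallel without touching the rest of the pipeline, which matches the flexibility goals above.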

Undoubtedly, some of their inspiration must have come from the queue-based architecture of operating systems.

@xdotai is an application that acts as a personal assistant for scheduling your meetings. Dennis Mortensen talked about how he wants to recreate the experience of having a personal assistant. So this is not an app, but “someone” who can accept an email requesting a meeting and will handle all the negotiations for setting it up. It is designed to handle human dialogs as if it were a call center, even though there are no humans in the process.

Marcos Jiminez Berlenguer talked about how “Amy” is a set of modules (currently using humans as trainers, but eventually dispensing with humans). Marcos spoke about the challenges of understanding responses such as ‘1-2 Monday’ and how the system is designed to parse responses, negotiate times, and understand time-of-day preferences, the history of the negotiations, etc. He also talked about the longer-term challenges of understanding social dynamics (such as status or preferences) that might be incorporated into these negotiations.
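To make the parsing challenge concrete, here is a hedged sketch of turning a terse reply like ‘1-2 Monday’ into a structured time window. This is not x.ai's parser; the regex, the weekday table, and the "small bare hours mean pm" heuristic are all assumptions invented for illustration.

```python
import re

WEEKDAYS = {"monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"}

def parse_window(text):
    """Parse a reply like '1-2 Monday' into (weekday, start_hour, end_hour),
    or return None if the text doesn't match the expected shape."""
    m = re.search(r"(\d{1,2})\s*-\s*(\d{1,2})\s+(\w+)", text.lower())
    if not m:
        return None
    start, end, day = int(m.group(1)), int(m.group(2)), m.group(3)
    if day not in WEEKDAYS:
        return None
    # Time-of-day heuristic: bare hours 1-6 more likely mean afternoon,
    # so shift them to 24-hour pm times.
    if start <= 6:
        start, end = start + 12, end + 12
    return (day, start, end)
```

Even this tiny example shows why the problem is hard: ‘1-2 Monday’ is ambiguous between 1-2 am and 1-2 pm, and a real system would also need negotiation history and user preferences to resolve it.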


Finally, Michael Rubenstein @Appnexus described how AppNexus programmatically matches ad buyers and ad sellers.
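A common mechanism for this kind of programmatic matching is a second-price auction, sketched below. This is a generic real-time-bidding model, not a description of AppNexus's actual mechanics, which are far richer.

```python
def second_price_auction(bids):
    """Match one ad impression to bidders: the highest bidder wins
    but pays the second-highest bid, a standard rule that encourages
    bidders to bid their true value.

    `bids` maps bidder name -> bid amount; returns (winner, price),
    or None if there are too few bids to price the impression."""
    if len(bids) < 2:
        return None
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]   # winner pays the runner-up's bid
    return winner, price
```

For example, with bids of 3.50, 2.00, and 1.00, the top bidder wins the impression but pays 2.00.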

Catherine Williams talked about the technology challenges the company has faced as it moved from simple databases to large databases that can be mined to assist both buyers and sellers. These included making sure that real-time ads were evenly spread throughout the day, eliminating porn sites from their service by keyword detection (while not excluding legitimate sites whose names merely happen to contain the letters ‘slut’), and improving the marketplace design to help both ad buyers and ad sellers.
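The keyword-detection pitfall mentioned above is the classic substring false positive. Here is an illustrative sketch (not AppNexus's filter) contrasting naive substring matching with word-boundary matching; the domain `sluter.example` is a hypothetical site name that contains the flagged letters without being a flagged word.

```python
import re

BLOCKLIST = {"slut"}

def naive_flag(domain):
    # Substring matching: flags any domain containing a blocked string,
    # including innocent names that merely contain the letters.
    return any(word in domain for word in BLOCKLIST)

def word_flag(domain):
    # Word-boundary matching: flags the blocked string only when it
    # appears as a standalone token, sparing incidental matches.
    return any(re.search(rf"\b{re.escape(word)}\b", domain)
               for word in BLOCKLIST)
```

The naive filter would wrongly block `sluter.example`, while the word-boundary version blocks only domains where the flagged string stands alone; a production system would of course combine many more signals than keywords.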
