New York Tech Journal
Tech news from the Big Apple

DataDrivenNYC: computer #security, #DeepLearning, distributing #tweets to web sites, customizing user experience

Posted on October 13th, 2015

#DataDrivenNYC

10/12/2015 @Bloomberg, 731 Lexington Ave, NY


This month’s speakers were

Liz Crawford, CTO of Birchbox (discovery commerce platform for beauty products)

Richard Socher, Founder and CEO of MetaMind (artificial intelligence enterprise platform for automating image recognition and language understanding)

Ramana Rao, CTO of Livefyre (real-time content marketing and engagement platform)

Oren Falkowitz, Founder and CEO of Area 1 Security (provides visibility into the next generation of unknown, sophisticated targeted attacks)

 

Oren Falkowitz @Area1Security talked about his company’s approach to computer security. The traditional approach to network/computer security uses layers of defense including firewalls, passwords, etc.

Area 1 Security starts from the premise that 97% of all attacks originate through phishing and that these attacks can circumvent most traditional defenses. In addition, these attacks can mine your data for months before they are discovered. In contrast to the traditional approach, the company strives to better understand the attacker’s behavior and looks for the telltale network and external usage patterns that indicate attempts to probe your network.

Oren talked about how intrusions are often periodic and come from specific parts of the web. By looking both within the network and at usage patterns across the entire web, they hope to detect intrusions quickly using:

  1. Visibility (big data)
  2. Detection (deep learning)

He closed by displaying a map showing a snapshot of all activity world-wide on the web.

Next, Richard Socher @MetaMind spoke about how MetaMind brings deep learning tools to companies. MetaMind applies deep learning to vision and language. He demonstrated how the tool classifies images based on a standard vocabulary, but can also be taught new classifiers using a drag-and-drop web interface.

For instance, to create a classifier for BMW vs Audi vs Tesla, he dropped three sets of images into slots on the interface and the classifier (powered by GPUs) used these images to evaluate new images.

Other uses of this technology include:

  1. Examine each frame of a game video and find your company logo.
  2. Find your company logo on social media
  3. For diabetic retinopathy, classify images without needing people who have spent years learning how to read the scans

They also have tools that do natural language understanding using deep learning algorithms. These can predict the sentiment of a sentence and extract the main actors and themes in that sentence.

Richard addressed the question of why deep learning has only recently gained traction after existing for decades with little commercial interest:

  1. Enough large data sets are now available
  2. Larger models can be created due to faster machines – GPUs, multi-core CPUs
  3. Lots of small algorithmic advances over the years

Ramana Rao @Livefyre talked about his company, which delivers real-time tweets (and other internet updates) to web sites. They capture social media, organize it, and publish it.

They create walls showing the tweets that are being sent during a conference. They insert tweets to actively change web pages. These live updates encourage viewers to stay longer on the site which increases overall engagement and develops greater affinity for the brand.

Ramana illustrated the uses of these updates to create an “earnings wall” for Wall Street pages, show social media comments during a TV show, or display tweets during presidential debates.

Analytics tools show the total amount of time spent on a site. They also show types of behavior as users interact in different ways by noting likes, or posting comments.

The large volume of data (currently 350 million global uniques each month) requires a large system which they run on Amazon Web Services.

The large volume of communications also requires them to employ various filters. Their spam/abuse filters use word lists and regular expressions to detect specific patterns. They also detect spam using bulk detection, which looks for the same message repeated many times, and they filter out nudity.
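The three text checks described above (word lists, regular expressions, bulk detection) can be sketched in a few lines of Python. This is only an illustration, not Livefyre's actual filter: the word list, the two patterns, the threshold, and every function name are hypothetical and chosen for the example.

```python
import re
from collections import Counter

# Hypothetical word list and patterns, chosen only for illustration.
BLOCKED_WORDS = {"viagra", "lottery"}
SPAM_PATTERNS = [
    re.compile(r"https?://\S+\s+https?://\S+"),  # two links back to back
    re.compile(r"(.)\1{5,}"),                    # one character repeated 6+ times
]
BULK_THRESHOLD = 3  # assumed: the same text seen this often counts as bulk


def normalize(text):
    """Lowercase and collapse whitespace so near-identical copies match."""
    return " ".join(text.lower().split())


def matches_word_list(text):
    """True if any blocked word appears as a whole word."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return bool(words & BLOCKED_WORDS)


def matches_pattern(text):
    """True if any regular expression matches somewhere in the text."""
    return any(p.search(text) for p in SPAM_PATTERNS)


def filter_messages(messages):
    """Return the messages that pass all three checks, in original order."""
    counts = Counter(normalize(m) for m in messages)
    kept = []
    for m in messages:
        if matches_word_list(m) or matches_pattern(m):
            continue
        if counts[normalize(m)] >= BULK_THRESHOLD:
            continue  # bulk detection: too many copies of the same text
        kept.append(m)
    return kept
```

Bulk detection here counts copies within a single batch. A live stream would more likely count copies inside a sliding time window, which is a design choice this sketch leaves out.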

The fourth speaker, Liz Crawford @Birchbox, spoke about Birchbox’s use of analytics. Birchbox is a beauty retailer that has both a retail and an online presence, but specializes in a personalized package of items sent to subscribers every month.

To create the personalized package and better market to their customers, they have a staff of data scientists and statistical analysts:

  1. Data scientist = Ph.D. who can write production code, embed into product development teams
  2. Statistical analysts = focus on analytics. Specialized into business concerns.
  3. Analysts = junior statistical analysts
  4. Data engineering (warehouse) embedded with platform engineering.

In this structure, data scientists work on methods to better use the user profiles to personalize the box subscribers receive each month. The problem becomes an integer programming optimization problem, since subscribers never receive the same thing twice and there is a limited number of each sample available each month. To further complicate the optimization, customers can request one item each month. The data scientists use the Gurobi solver.
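To make the constraints above concrete, here is a toy version of the box assignment in plain Python. It is a sketch under stated assumptions, not Birchbox's model, and it does not use Gurobi: it simply tries every combination, which only works for a handful of subscribers. The preference scores, the function name, and all the data structures are hypothetical.

```python
from itertools import combinations, product


def best_assignment(scores, stock, history, requests, box_size):
    """Exhaustively search for the highest-scoring feasible set of boxes.

    scores:   {subscriber: {item: preference score}}  (hypothetical input)
    stock:    {item: number of samples available this month}
    history:  {subscriber: set of items already received}
    requests: {subscriber: item the subscriber asked for}
    """
    subscribers = sorted(scores)
    options = []
    for s in subscribers:
        # Constraint: a subscriber never receives the same item twice.
        allowed = sorted(i for i in stock if i not in history.get(s, set()))
        # Constraint: a requested item must be in the box.
        boxes = [
            box for box in combinations(allowed, box_size)
            if requests.get(s) is None or requests[s] in box
        ]
        options.append(boxes)

    best, best_score = None, float("-inf")
    for choice in product(*options):
        used = {}
        for box in choice:
            for item in box:
                used[item] = used.get(item, 0) + 1
        # Constraint: do not hand out more samples than are in stock.
        if any(used[i] > stock[i] for i in used):
            continue
        total = sum(
            scores[s].get(i, 0)
            for s, box in zip(subscribers, choice)
            for i in box
        )
        if total > best_score:
            best, best_score = dict(zip(subscribers, choice)), total
    return best, best_score
```

With two subscribers and a single unit of the most popular sample, the search gives that sample to whoever values it most, unless the other subscriber requested it. In an integer program the same three constraints would be written as linear inequalities over 0/1 decision variables and handed to a solver, which is what makes the approach workable at real scale.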

They also make customized recommendations online.

They have outlets on web, offline, brick & mortar, apps, social media. They track each customer’s touches on all outlets to better understand customer preferences to get the right message to the right customer.

Takeaways

  1. You can build up your data science practice over time
  2. Data science and product development work well when organically partnered
  3. Don’t be limited by the data you have today – get the data you need

They use other standard market analytics, but they built their internal analytics so they can focus on their core competency. Early in their growth process they taught all their staff to use SQL to better understand their data. She noted that the U.S. rollout of SQL went well, but the European rollout had its challenges.

posted in: applications, data, data analysis, Data Driven NYC, Media, startup