New York Tech Journal
Tech news from the Big Apple

#DataDrivenNYC: #FaultTolerant #Web sites, #Finance, Predicting #B2B buying behavior, training #DeepLearning

Posted on May 18th, 2016

#DataDrivenNYC

05/18/2016 @AXA auditorium, 787 7th  Avenue, NY


Four speakers presented:

First, Nicolas Dessaigne @Algolia (a subscription service providing access to a search API) talked about the challenges of building a highly fault-tolerant, world-wide service. The steps they took followed from their understanding of the points of failure within their own systems and in the infrastructure those systems depend on.

Initially, they concentrated on their software development process, including failed updates. To overcome these problems, they update one server at a time (within a rack of servers), do partial updates, and use Chef to automate deployment.
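As an illustration of the one-server-at-a-time idea, here is a minimal sketch. It is not Algolia's deployment code: the `deploy` and `is_healthy` callables and the host names are hypothetical stand-ins for whatever the automation tool (Chef, in their case) actually runs.

```python
from typing import Callable, Iterable, List


def rolling_update(
    servers: Iterable[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Update servers one at a time and stop at the first failed health check.

    Returns the servers that were updated and passed the check.
    """
    updated: List[str] = []
    for server in servers:
        deploy(server)              # hypothetical: push the new build to one host
        if not is_healthy(server):  # hypothetical: probe the host after the update
            # Halt so a bad build never reaches the remaining servers.
            break
        updated.append(server)
    return updated


# Usage with stand-in callables: the second host fails its health check.
rack = ["host-a", "host-b", "host-c"]
deployed: List[str] = []
done = rolling_update(rack, deployed.append, lambda s: s != "host-b")
print(done)      # ['host-a']
print(deployed)  # ['host-a', 'host-b']
```

Because the loop halts on the first unhealthy host, a failed update touches at most one server while the rest keep serving the previous version.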

Then they moved their domain from the .io TLD to the .net TLD to avoid the slow DNS response times they had seen intermittently in Asia. This was followed by a series of upgrades:

Feb 2015. Set up clusters of servers world-wide, so users have a server in their region (lower latency)

March 2015. Physically separate the server clusters within a region across different providers

May 2015. Create fallback DNS servers

July 2015. Put a third data center online to make indexing robust

April 2016. Implement 1-second granularity for their system monitoring

Next, Matt Turck interviewed Louis DiModugno @AXA. In the US, AXA’s main focus is on predictive underwriting in the insurance process. They also have projects to incorporate sensors into products and to route queries to the right call center based on the demographics of the customer. World-wide they have three analysis hubs: France, the US, and Singapore (coming online).

Louis oversees both data and analytics in the U.S., and both he and the CTO report to the CIO. They are interested in expanding their capabilities in areas such as creating unstructured databases from life insurance data that is currently on microfiche.

In the third presentation, Amanda Kahlow @6Sense talked about their business model of providing information to customers in B2B commerce. They analyze business searches, customer web sites, and visits to publishers' web sites (e.g. Forbes). Their goal is to determine the timing of customer purchases.

B2B purchases are different from B2C purchases because:

  1. Businesses research their purchases online before they buy
  2. The research takes time (long sales cycle)
  3. The decision to buy involves multiple people within the company

So there are few impulse buys, and buyer behavior signals when a purchase is imminent.

The main CMO question is when (not who).

6sense ties together data across searches (anonymous data). The goal is to identify when companies are in a specific part of the buying cycle, so sales can approach them at that moment. (Example: show click-to-chat when the analytics say that the customer is ready to buy.)

Lastly, Peter Brodsky @HyperScience spoke about tools they are developing to speed up machine learning. The tools and the problems they address include:

  1. Making it easier to add new data sets
  2. Matching fields, such as dates, that may be in different formats
  3. Deciding what to do with missing data
  4. Obtaining labeled data (lots of examples are needed)
  5. Speeding up training time
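The field-matching and missing-data items can be illustrated with a small sketch. This is not HyperScience's tooling: the list of accepted formats and the choice to map blank or unparseable values to `None` are assumptions made for the example.

```python
from datetime import date, datetime
from typing import Optional

# Assumed set of formats; a real data set would need its own list.
# Ambiguous inputs such as 05/06/2016 are read month-first here by assumption.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_date(raw: Optional[str]) -> Optional[date]:
    """Return a date for any known format, or None for missing or unparseable values."""
    if raw is None or not raw.strip():
        return None                      # missing value
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue                     # try the next known format
    return None                          # unrecognized format, treated as missing


rows = ["2016-05-18", "05/18/2016", "20160518", "", None, "sometime in May"]
parsed = [parse_date(r) for r in rows]
print(parsed)
```

The first three inputs normalize to the same date and the last three come back as `None`, which leaves the decision about how to fill or drop missing values to a later step.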

The speed-up comes from identifying subnets within the larger neural network, each of which performs a distinct function. To determine whether two subnets (in different networks) are equivalent, take a subnet from one network, use it to replace a subnet in the other network, and see if the function is unchanged: freeze the weights within the subnet and in the rest of the network, then retrain only the interface between the network and the subnet.

This creates building blocks which can be combined into larger blocks. These blocks can be applied to jump start the training process.
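The swap test described above can be sketched on a toy problem. This is a made-up illustration, not the method shown in the talk: the three functions stand in for frozen layers, the interface is assumed to be a single affine map `z = a*h + b`, and equivalence is judged by comparing outputs at the slot where the subnet was replaced.

```python
import math


def host_front(x):
    # Frozen host layers before the slot (toy stand-in).
    return 2.0 * x


def original_subnet(h):
    # Subnet currently in the host network.
    return math.tanh(h)


def candidate_subnet(z):
    # Frozen subnet taken from another network; its input uses a different scale.
    return math.tanh(2.0 * z - 1.0)


def train_interface(xs, steps=5000, lr=0.02):
    """Fit only the interface weights (a, b); every other weight stays frozen."""
    a, b = 1.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x in xs:
            h = host_front(x)
            out = candidate_subnet(a * h + b)
            err = out - original_subnet(h)
            # Gradient of the squared error with respect to z;
            # d(out)/dz for tanh(2z - 1) is 2 * (1 - out**2).
            dz = 2.0 * err * 2.0 * (1.0 - out * out)
            grad_a += dz * h / n
            grad_b += dz / n
        a -= lr * grad_a
        b -= lr * grad_b
    loss = sum(
        (candidate_subnet(a * host_front(x) + b) - original_subnet(host_front(x))) ** 2
        for x in xs
    ) / n
    return a, b, loss


xs = [i / 4.0 - 1.0 for i in range(9)]  # inputs from -1 to 1
a, b, loss = train_interface(xs)
print(a, b, loss)
```

In this toy setup an exact match exists at `a = 0.5`, `b = 0.5`, so the residual loss shrinks toward zero, which is the signal that the two subnets compute the same function. A residual that stays large after retraining the interface would suggest they are not interchangeable.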

 

posted in: AI, applications, Big data, data analysis, Data Driven NYC, startup