New York Tech Journal
Tech news from the Big Apple

Listening to Customers as you develop, assembling a #genome, delivering food boxes

Posted on September 21st, 2016

#CodeDrivenNYC

09/21/2016 @FirstMark, 100 Fifth Ave, NY, 3rd floor


JJ Fliegelman @WayUp (formerly CampusJob) spoke about the development process behind their application, which is the largest marketplace for college students to find jobs.

He emphasized the importance of speccing out ideas for what they should be building and of talking to users.

They use tools to stay in touch with their customers:

  1. HelpScout – see all support tickets. Get the vibe
  2. FullStory – DVR software – plays back video recordings of how users are using the software

They also put ideas in a repository using Trello.

To illustrate their process, he examined how they worked to improve job search relevance.

They measure the value of a feature as impact per unit of effort, and they track this across new features over time. This lets them prioritize, and they collect multiple estimates for each feature. It's a probabilistic measure.
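As a rough illustration of ranking by impact per unit of effort, here is a minimal sketch. The feature names, the scores, and the choice to average several guesses are all assumptions made for this example, not WayUp's actual data or formula.

```python
from statistics import mean

# Hypothetical backlog: every name and number below is made up for illustration.
# Each feature carries several independent guesses for impact and for effort.
backlog = {
    "synonym-aware search": {"impact": [8, 6, 7], "effort": [5, 8, 5]},
    "saved searches": {"impact": [3, 4, 2], "effort": [2, 1, 3]},
    "location filter": {"impact": [5, 5, 6], "effort": [1, 2, 1]},
}


def impact_per_effort(impact_estimates, effort_estimates):
    """Average the guesses, then divide expected impact by expected effort."""
    return mean(impact_estimates) / mean(effort_estimates)


def rank(features):
    """Return feature names ordered from best to worst impact per effort."""
    scores = {
        name: impact_per_effort(est["impact"], est["effort"])
        for name, est in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank(backlog)
for name in ranking:
    print(name)
```

Averaging several guesses is only one way to treat the estimate as probabilistic; a team could just as well keep the spread of the guesses and use it as a signal of uncertainty.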

Assessing impact – are people dropping off? Do people click on it? What are the complaints? They talk to experts, reaching them through cold emails. They also cultivate a culture of educated guesses.

Assessing effort – they often get it wrong, but get better over time

They prioritize impact/effort with the least technical debt

They Spec & Build – (product, architecture, kickoff) to get organized

Clubhouse is their project tracker: readable by humans

Architecture spec – solve today's problem, but look ahead. E.g., the initial architecture used WordNet and Elasticsearch, but they found that Elasticsearch was too slow, so they moved to a graph database.

Build – build as little as possible; prototype; adjust your plan

Deploy – they will deploy anything that does not make things worse (e.g. a button that doesn't work yet)

They do code reviews to avoid deploying bad code

Paul Fisher @Phosphorus (spun out of Recombine, which focused on the fertility space, specifically carrier screening; the emphasis is now on diagnostic DNA sequencing) talked about the processes they use to analyze DNA sequences. With the rapid development of laboratory techniques, it's a computer science question now. They use Scala, Ruby, and Java.

Sequencers produce hundreds of short reads of 50 to 150 base pairs. They use a reference genome to align the reads. They want multiple reads covering each position (read depth) to create a consensus sequence.

To lower cost and speed up their analysis, they focus on particular regions to maximize their read depth.
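The idea of building a consensus from read depth can be sketched in a few lines. This is a toy version under strong assumptions: the reads are already aligned, each is given as a hypothetical (start position, sequence) pair, there are no insertions or deletions, and the `min_depth` threshold is invented for the example. Real pipelines work from alignment files and use base qualities.

```python
from collections import Counter


def consensus(reference, aligned_reads, min_depth=2):
    """Call the most common base at each reference position.

    aligned_reads is a list of (start, sequence) pairs, where start is the
    0-based reference position of the read's first base. Positions covered
    by fewer than min_depth reads are reported as 'N' (unknown).
    """
    piles = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                piles[pos][base] += 1

    called = []
    for pile in piles:
        depth = sum(pile.values())
        if depth < min_depth:
            called.append("N")
        else:
            called.append(pile.most_common(1)[0][0])
    return "".join(called)


reference = "ACGTACGT"
reads = [(0, "ACGA"), (0, "ACG"), (2, "GAAC"), (3, "AACG"), (4, "ACGT")]
result = consensus(reference, reads)
print(result)  # ACGAACGN
```

In this example three reads agree on an A at position 3 where the reference has a T, so the consensus differs from the reference there, while the last position is covered by only one read and is left uncalled.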

They use a variant viewer to understand variants between the person’s and the reference genome:

  1. SNPs – one base is changed – degree of pathogenicity varies
  2. Indels – insertions & deletions
  3. CNVs – copy number variations
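For the first two categories, the distinction can be read straight off the reference and alternate alleles. The sketch below is a simplified, hypothetical classifier written for this post; it does not handle CNVs, which are typically inferred from read depth rather than from a pair of alleles.

```python
def classify_variant(ref, alt):
    """Classify a variant from its reference and alternate allele strings."""
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"  # a single base is changed
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "multi-base substitution"  # same length, more than one base


examples = [("A", "G"), ("A", "AT"), ("ATG", "A"), ("AC", "GT")]
labels = [classify_variant(ref, alt) for ref, alt in examples]
for (ref, alt), label in zip(examples, labels):
    print(ref, alt, label)
```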

They use several different file formats: FASTQ, BAM/SAM, VCF
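Of these formats, FASTQ is the simplest to show: each read is four lines (an `@` header, the bases, a `+` separator, and a quality string). The sketch below is a minimal reader that assumes well-formed, unwrapped records and the common Phred+33 quality encoding; the sample records are made up.

```python
import io


def parse_fastq(handle):
    """Yield (read_id, sequence, quality_scores) from a FASTQ text stream.

    Assumes four lines per record and Phred+33 encoded qualities.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line carries no data we need
        quality = handle.readline().rstrip("\n")
        scores = [ord(char) - 33 for char in quality]
        yield header[1:], sequence, scores


# Two made-up records, held in memory instead of a file.
sample = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n")
records = list(parse_fastq(sample))
for read_id, sequence, scores in records:
    print(read_id, sequence, scores)
```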

Current methods have evolved to use Spark, Parquet (a columnar storage format), and ADAM (which uses the Avro framework for nested collections)

They use Zeppelin to share documentation: documentation that you can run.

Finally, Andrew Hogue @BlueApron spoke about the challenges he faces as the CTO. These include:

Demand forecasting – they use machine learning (random forest) to predict, per user, what they will order. Holidays are hard to predict. People order less lamb and avoid catfish. There was also a dip in orders overall, and in orders with meat, during Lent.

Fulfillment – more than just inventory management since recipes change, food safety, weather, …

Subscription mechanics – weekly engagement with users creates opportunities to deepen engagement. Frequent communications can drive engagement or churn. A/B experiments need more time to run

Blue Apron runs 3 fulfillment centers for their weekly food deliveries (New Jersey, Texas, and California), shipping 8 million boxes per month.

posted in: applications, Big data, Code Driven NYC, data, data analysis, startup