New York Tech Journal
Tech news from the Big Apple

The Rise of the #DataArtist

Posted on March 9th, 2016


03/09/2016 @ PivotalLabs, 625 6th ave, NY

Olivier Meyer & Ryan Haber @Zoomdata talked about the advantages of interactive #DataAnalysis. They showed how a single picture can convey the ruin of an army through cold and casualties, as Charles Minard did in his graphic of Napoleon’s 1812 invasion of Russia, which displays six time series to great effect.

Next, they talked about the complexity of displaying facts buried in large data sets. This complexity creates a new role: the data artist, who sits between the business analyst and the data scientist.

They demonstrated how their program facilitates the interactive search for patterns by retrieving only the relevant subset of the data when it is needed for display. They call this microservices & data sharpening: initially a rough picture is presented, and the results are refined as you watch.
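The data-sharpening idea – a rough answer first, refined as more data is scanned – can be sketched in a few lines of Python. This is an illustration of the concept, not Zoomdata’s implementation; all names here are invented:

```python
import random

def sharpened_mean(data, batch_size=1000):
    """Yield successively refined estimates of the mean, scanning the
    data in batches so a rough answer is available almost immediately."""
    random.shuffle(data)              # each batch becomes a representative sample
    total, count = 0.0, 0
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        total += sum(batch)
        count += len(batch)
        yield total / count           # the estimate sharpens as count grows

# A UI would redraw the chart each time a new estimate is yielded.
estimates = list(sharpened_mean([float(x) for x in range(10_000)]))
```

Early estimates are noisy but arrive fast; the final estimate equals the exact mean once all batches have been scanned.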

Many interesting points were brought up in the discussion.

  1. Before diving into the data, one needs hypotheses of what is relevant to decision making
  2. Care must be taken, since interactive graphics (as in all graphics – see Darrell Huff “How to Lie with Statistics”) can inspire misleading or unfounded conclusions
  3. The data artist is obligated to present graphics that are truthful
  4. Generic templates may not be the best data presentation
  5. One needs to balance the customization of the data presentation with the time & effort expended to create an improved graphic
  6. Graphically inspired conclusions need to be supported by relevant statistics
  7. Frequently, statistics (alone) are not the best way to present findings
  8. The best way to communicate is dependent on the audience.
  9. The tools for data exploration may or may not be different from those for presenting conclusions.

posted in:  UI, UX, UX+Data

Massively Collaborative Problem Solving

Posted on November 11th, 2015

UX + Data

11/11/2015 @Pivotal Labs, 625 6th Ave, NY


Matt Weber @Zoomdata started by describing how simple rules can create complex, interesting systems:

  1. #Conway’s game of life – simple rule
  2. The #Delphi Method (Rand Corporation) – collaboration
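Conway’s Game of Life is a good illustration of a simple rule producing complex behavior: a live cell survives with 2 or 3 live neighbours, and a dead cell becomes live with exactly 3. A minimal sketch:

```python
from collections import Counter

def life_step(live):
    """One generation of Conway's Game of Life on an unbounded grid.
    `live` is a set of (x, y) cells; survival with 2-3 live neighbours,
    birth with exactly 3."""
    counts = Counter((x + dx, y + dy)
                     for x, y in live
                     for dx in (-1, 0, 1)
                     for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# The three-cell "blinker" oscillates with period 2.
blinker = {(0, 0), (1, 0), (2, 0)}
```

Two applications of `life_step` return the blinker to its original shape, the simplest example of non-trivial dynamics emerging from the rule.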

He next described his use of #Amazon Mechanical Turk in 2009 to obtain interesting answers to complex problems. His example asked for ways to make the U.S. energy self-sufficient. His formula:

Simple rules + iterative collaboration = massively #collaborative #ProblemSolving

Answers were selected using three simple tasks:

  1. Create – each worker creates a list of 7 proposals; repeated by 50 workers
  2. Rate – each proposal is rated on a 1-10 scale by 20 workers
  3. Atomize – take the 7 proposals with the highest aggregate scores and ask 50 workers which of them need more detail

(The limit of 7 echoes George A. Miller’s paper “The Magical Number Seven, Plus or Minus Two,” which proposed seven as the maximum number of items that can be kept in working memory.)

End of round one.

  1. Take the top proposals and ask another set of workers to make a plan of action for each
  2. 20 workers rate the sub-proposals on a 1-10 scale
  3. Select the top sub-proposals

Repeat for each of the top proposals.
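The core of each round – aggregate the worker ratings per proposal and keep the best – is a simple data-flow. A sketch, with the crowdsourcing tasks stubbed out (the proposals and scores below are invented for illustration):

```python
def run_round(proposals, ratings, top_n=7):
    """Keep the top_n proposals by aggregate worker rating.
    `ratings` maps proposal -> list of 1-10 scores collected from
    workers on a platform such as Mechanical Turk."""
    scored = sorted(proposals,
                    key=lambda p: sum(ratings.get(p, [])),
                    reverse=True)
    return scored[:top_n]

proposals = ["solar farms", "wind power", "home insulation"]
ratings = {"solar farms": [9, 8],
           "wind power": [7, 6],
           "home insulation": [10, 9]}
best = run_round(proposals, ratings, top_n=2)
```

Each surviving proposal would then be fed back as the seed of the next create–rate–select round.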

Matt then displayed the answers and commented on how many of the proposals were reasonable and well thought out.

He next talked about design considerations when determining what problems could be successfully addressed by this method. The main consideration is to pick a general topic and let the crowd guide the process. The problem should be of general interest and be framed so it is

  1. Human-readable – but also processable by computers
  2. Short text – can be written and consumed quickly
  3. Hierarchical
  4. Relevant and engaging – people need to be invested

The problem needs to be encapsulated so each piece is bite-sized and needs no surrounding context.

posted in:  UX, UX+Data

The #UX of Events Data: helping event organizers understand their audience

Posted on October 14th, 2015

UX + Data

10/14/2015 @Pivotal Labs, 625 6th Ave, NY


Chett Rubenstein @InsightXM spoke about InsightXM’s work on understanding attendance and registration of events such as trade shows, conferences, and festivals.

Chett described how insightXM can analyze the data organizers already collect, helping them achieve goals such as increasing attendance or reaching a target market.

He then talked about how insightXM improved their process to help the clients solve their problems. They proceeded in three iterations:

  1. First iteration – build a platform to upload data with some basic analytics
  2. Second iteration – build tools to help clients visualize files with large numbers of fields. Build mouse-overs so you can see the contents of the data fields. One of their interactive graphs shows the cumulative registrations over time, a map of the geographic distribution of registrations, and a slider and filters to slice the data by time and customer characteristics.
  3. Third iteration – make the data upload and categorization easy. The deliverables are bullet points summarizing any graphics presented to the client. InsightXM does the analysis behind the scenes.
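The cumulative-registrations view from the second iteration boils down to a running count over sorted registration dates. A minimal sketch (the sample data is invented; a charting library would plot the two returned series):

```python
from datetime import date
from itertools import accumulate
from collections import Counter

def cumulative_registrations(reg_dates):
    """Return (sorted dates, cumulative counts), ready to plot as a
    registrations-over-time curve."""
    per_day = Counter(reg_dates)
    days = sorted(per_day)
    totals = list(accumulate(per_day[d] for d in days))
    return days, totals

regs = [date(2015, 10, 1), date(2015, 10, 1), date(2015, 10, 3)]
days, totals = cumulative_registrations(regs)
```

The slider and filters described above would simply subset `reg_dates` (by time window or customer attribute) before this computation runs.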

Chett talked about current and future directions of insightXM and marketing in general.

  1. Increased use of behavioral analytics to better understand the customer
  2. Linguistic analysis of marketing materials
  3. Real time demographic and behavioral prediction of customer preferences. For example, once a badge is scanned at a booth, you will know the individual’s behavioral preferences.
  4. Demographic lead scoring within CRM systems
  5. Referral engines at conferences suggesting sessions to attend based on individual preferences and behavior patterns of other attendees


posted in:  UI, UX, UX+Data

Beaker Notebook: the #UX of Iterative Data Exploration

Posted on August 12th, 2015

UX + Data

08/12/2015 @ Pivotal Labs, 625 6th Ave, NY


Jeff Hendy and Scott Draves @TwoSigma presented @Beaker, a lab #notebook on the #web. The notebook allows researchers to collect data, code, graphs & tables for analyses done using one or more programming languages. The tool provides a seamless method to transfer data across a variety of languages making it easy to use the tools from each. Languages currently include: #Python, #R, #Julia, #JavaScript, #Scala, #Ruby, #Node.js, #D3, #Latex, #HTML, …

Beaker was developed by Two Sigma, an investment manager, to give their researchers a tool to analyze markets and document their findings. It is now an open-source product.

The notebook is divided into sections and sections can be grouped hierarchically into larger sections. Within a section, an analysis can be performed in Python, for instance, and the output is saved to Beaker variables. These variables can be analyzed using R, Python or any of the supported languages. Beaker can also produce interactive graphics using its own native charting package. The notebook with code, data, and graphs can be saved for further analysis.

Jeff and Scott next talked about the design challenges when creating Beaker. These include:

  1. Full support for all languages
  2. Open source
  3. Environment independent

To create an expandable library of supported languages, they use an intermediate Beaker language with a plug-in for each programming language. To ensure Beaker can run on different operating systems, on and off the cloud, the user interface is text-based with little formatting.
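The plug-in idea – a common intermediate layer with one adapter per language – follows a familiar registry pattern. This is a generic sketch of that pattern, not Beaker’s actual code; all names are invented:

```python
class LanguagePlugin:
    """Adapter interface: each supported language implements run()."""
    def run(self, code: str) -> str:
        raise NotImplementedError

PLUGINS = {}

def register(name):
    """Class decorator adding a plugin instance to the registry."""
    def wrap(cls):
        PLUGINS[name] = cls()
        return cls
    return wrap

@register("python")
class PythonPlugin(LanguagePlugin):
    def run(self, code):
        # A real plugin would hand the cell to a Python kernel.
        return f"python ran: {code}"

def evaluate(language, code):
    """The notebook core dispatches each cell to its language's plugin."""
    return PLUGINS[language].run(code)
```

Adding a language then means writing one adapter class; the notebook core never changes.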

To accommodate the wide range of programming and data-analysis experience across users, they developed several interfaces, from verbose (shows the language employed, etc.) to terse. To help all levels of users, they adapted the web interface to provide key features available on local desktops but frequently missing in browsers:

  1. Menus in the upper margins
  2. Windows that can be repositioned on the desktop
  3. File dialogs

To give the web app these functions, they used Electron, a framework built on Chromium that incorporates tools from Node.js.

Data and data structures are passed across languages using #JSON. This offers generality, but with some loss of accuracy for floating-point numbers (in the future they plan to pass values in binary). They are currently working on methods to share notebook sections (and possibly forked versions).

The audience was invited to try out the system.

posted in:  applications, data, data analysis, Programming, UI, UX, UX+Data

The #UX of #StatisticalSoftware for #MobileDevices

Posted on December 10th, 2014

UX + Data

12/10/2014 @Pivotal, 625 6th Ave, NY


Sungjoon Nam @NumberAnalytics talked about the software he has developed for the analysis of business data without the clutter of standard statistical interfaces.

When he started teaching at Rutgers Business School in Newark, he realized how hard it was for the students to navigate the interface of SPSS and decipher the statistical tables output by the package.

He found similar issues with SAS and R, so he developed a web interface (using R as the statistical engine) that takes in data and produces output that directly addresses the business decision. In addition, the analyses are clearly labeled by the business question addressed, so users can go directly to the needed analysis without having to decide whether it is a regression, clustering, or another statistical technique.

One technique used to guide the user to the important factors is color: instead of tables, graphs show which variables are statistically significant, and the software also provides a text description of what is significant.
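Flagging significance by color is a small mapping from p-values to styles. A generic sketch (the threshold, palette, and variable names are assumptions, not NumberAnalytics’ actual choices):

```python
def significance_color(p_value, alpha=0.05):
    """Map a p-value to a plotting color: significant variables
    stand out, the rest fade to grey."""
    return "steelblue" if p_value < alpha else "lightgrey"

# One bar per coefficient; a charting library would apply these colors.
p_values = {"price": 0.003, "ad_spend": 0.04, "region": 0.62}
colors = {var: significance_color(p) for var, p in p_values.items()}
```

The same mapping can drive the accompanying text description, listing only the variables that received the highlight color.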

Sungjoon then talked about the challenges of moving the application to the iPad. One challenge was that local storage may not be sufficient for some data sets, so alternatives such as Dropbox must be available. Screen space is also limited, so they adopted a rule that user interactions move left to right on the screen and cover only one topic per page.

He closed by listing some lessons learned when presenting the software at a training class in China:

  1. Google does not work there – avoid Google graphs.
  2. Make sure it runs on Windows XP with IE 6.0 – Chrome is also unavailable since it’s from Google.
  3. Internet speed varies widely from provider to provider – make sure the site works in all environments.
  4. Internet server speeds may vary over time.
  5. Use a local contact.

One interesting design decision was not to include data-cleaning facilities in the software. This greatly simplifies the interface and reduces the technical demands on the user. The assumption is that users will analyze clean data from sources such as Salesforce and Alibaba.

posted in:  data analysis, UX, UX+Data