Critical Approaches to #DataScience & #MachineLearning
Posted on March 18th, 2017
3/17/2017 @Hunter College, 68th & Lexington Ave, New York, Lang Theater
Geetu Ambwani @HuffingtonPost @geetuji spoke about how the Huffington Post is looking at data as a way around the filter bubble, which separates individuals from views that are contrary to their previously held beliefs. Filter bubbles are believed to be a major reason for the current levels of polarization in society.
She talked about ways that the media can respond to this confirmation bias:
- Show opposing point of view
- Show people their bias
- Show source credibility
For instance, Chrome and Buzzfeed have tools that will insert opposing points of view in your news feed. Flipfeed enables you to easily load another feed. AlephPost clusters articles and color codes them to indicate the source’s vantage point. However, showing people opposing views can backfire.
Second, Readacross the spectrum will show you your biases. Politico will show you how blue or red you are by indicating the color of your information sources.
Third, one can show source credibility and where it lies on the political spectrum.
However, there is still a large gap between what is produced by the media and what consumers want. Also, this does not remove the problem that ad dollars reward “engagement,” which means that portals have an incentive to keep delivering what the reader wants.
Next, Justin Hendrix @NYC Media Lab (consortium of universities started by the city of NY) talked about emerging media technologies. Examples were
- Vidrovr – teach computers how to watch video – produce searchable tags.
- Data Selfie project – from The New School. A Chrome extension that shows the data which Facebook has on us. 100k downloads in the first week.
- Braiq – connect the mind with the on-board self-driving software on cars. Build software which is more reactive to the needs and wants of the passenger. Technology in the headrest and other inputs that will talk to the self-driving AI.
The follow up discussion covered a wide range of topics including
- Adtech fraud is known, but no one has the incentive to address it. Fake audience – bots clicking sites
- Data sources are readily available, led by the Twitter and Facebook APIs. Look on GitHub for open source code for downloading data
- Was the 20th century an aberration as to how information was disseminated? We might just be going back to a world with pools of information.
- What are the limits on what points of view any media company is willing to explore?
- What is the future of work and the social contract as jobs disappear?
A Panel Discussion on The Future of #DigitalPerformance
Posted on November 21st, 2015
11/20/2015 @Westbeth Gallery, 155 Bank St, NY
The Westbeth Gallery, which displays art incorporating digital technology, hosted a discussion on the future of images and other digital media in performing arts. The panelists had a range of backgrounds and current uses of digital presentations.
Mark Coniglio – Creator of Isadora Software, Co-Founder, Troika Ranch
Wendall K. Harrington – theatre and production design, Assistant Professor, Yale School of Drama
Jared Mezzocchi – Award winning multi-media theatre director and designer, Assistant Professor, University of Maryland
Maya Ciarrocchi – Interdisciplinary Artist
Kevin Cunningham – Director 3 Legged Dog Theatre
Moderator: Andrew Scoville – Brooklyn based theater director focusing on developing new work that merges science and performance
Several themes were explored by the panelists.
- The inclusion of alternative or digital content in performances should push the main ideas forward and needs to be consistent with the other parts of the program, as if it were just another performer/actor/musician in the ensemble.
- One needs to express the idea before thinking of technology. Make sure that technology is not attached at the end. It should be given time within rehearsals to grow. If possible avoid the word “digital” as it artificially divides you and the rest of the creative team.
- The artist needs to intuit the director’s vision and present what is needed not what is requested.
- Flash may be easier with digital media, but the goal is still to give the audience a new experience so they have a chance to grow. The goal is still to tell a story and give a memorable experience.
- Creative tension is important as it is in all artistic ventures.
The panelists also mentioned digital works that they considered unusually immersive or interesting.
- Daito Manabe (Rhizomatiks) puts his experimental videos online. He has a series in which he electrically stimulates his face. He also has videos with dancers interacting with lights and drones and robots.
- Tod Machover and the MIT Media Lab worked on a glove and other interactive instruments.
- Luke DuBois drew a map of the U.S. with cities identified by the most used words on the dating sites for people in those cities.
- Audience participation at “danger parties”.
- On the commercial side there is a call to make it immersive, which can only be done with digital technology: “Charlie Victor Romeo”.
DataDrivenNYC: computer #security, #DeepLearning, distributing #tweets to web sites, customizing user experience
Posted on October 13th, 2015
10/12/2015 @Bloomberg, 731 Lexington Ave, NY
This month’s speakers were
• Liz Crawford, CTO of Birchbox (discovery commerce platform for beauty products)
• Richard Socher, Founder and CEO of MetaMind (artificial intelligence enterprise platform for automating image recognition and language understanding)
• Ramana Rao, CTO of Livefyre (real-time content marketing and engagement platform)
• Oren Falkowitz, Founder and CEO of Area 1 Security (provides visibility into the next generation of unknown, sophisticated targeted attacks)
Oren Falkowitz @Area1Security talked about his company’s approach to computer security. The traditional approach to network/computer security uses layers of defense including firewalls, passwords, etc.
Area 1 Security starts from the premise that 97% of all attacks originate through phishing and that these attacks can circumvent most traditional defenses. In addition, these attacks can mine your data for months until they are discovered. In contrast to the traditional approach, the company strives to better understand the attacker’s behaviors and look for the telltale network and external usage patterns that indicate attempts to probe your network.
Oren talked about how intrusions often are periodic and come from specific parts of the World Wide Web. By looking within the network and for usage patterns across the entire web, they hope to detect intrusions quickly using
- Visibility (big data)
- Detection (deep learning)
He closed by displaying a map showing a snapshot of all activity world-wide on the web.
Next, Richard Socher @MetaMind spoke about how MetaMind brings deep learning tools to companies. MetaMind applies deep learning to vision and language. He demonstrated how the tool classifies images based on a standard vocabulary, but can also be taught new classifiers using a drag and drop web interface.
For instance, to create a classifier for BMW vs Audi vs Tesla, he dropped three sets of images into slots on the interface and the classifier (powered by GPUs) used these images to evaluate new images.
Other uses of this technology include:
- Examine each frame of a game video and find your company logo.
- Find your company logo on social media
- In diabetic retinopathy, classify images without needing people who have spent years learning how to read the scans
They also have tools that do natural language understanding using deep learning algorithms. These can predict the sentiment of a sentence and extract the main actors and themes in that sentence.
Richard addressed the question of why deep learning has only recently gained traction after existing for decades with little commercial interest:
- Enough large data sets are now available
- Larger models can be created due to faster machines – GPUs, multi-core CPUs
- Lots of small algorithmic advances over the years
Ramana Rao @Livefyre talked about his company, which provides real-time tweets (and other internet updates) to web sites. They capture social media, organize it, and publish it.
They create walls showing the tweets that are being sent during a conference. They insert tweets to actively change web pages. These live updates encourage viewers to stay longer on the site which increases overall engagement and develops greater affinity for the brand.
Ramana illustrated the uses of these updates to create an “earnings wall” for Wall Street pages, show social media comments during a TV show, or display tweets during presidential debates.
Analytics tools show the total amount of time spent on a site. They also show types of behavior as users interact in different ways by noting likes, or posting comments.
The large volume of data (currently 350 million global uniques each month) requires a large system which they run on Amazon Web Services.
The large volume of communications also requires them to employ various filters. Their spam/abuse filters use word lists and regular expressions to detect specific patterns. They also detect spam using bulk detection to look for multiple repeated messages. They also filter out nudity.
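The three text filters described above (word lists, regular expressions, and bulk detection of repeated messages) can be sketched roughly as follows. This is only an illustration, not Livefyre's implementation: the word list, the patterns, the threshold, and the names `BLOCKED_WORDS`, `SPAM_PATTERNS`, `BULK_THRESHOLD`, and `flag_messages` are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical word list and patterns, chosen only for illustration.
BLOCKED_WORDS = {"viagra", "jackpot"}
SPAM_PATTERNS = [
    re.compile(r"https?://\S+\s+https?://\S+", re.IGNORECASE),  # two links in a row
    re.compile(r"(.)\1{5,}"),  # the same character repeated six or more times
]
# Assumed cutoff: identical normalized text seen this many times counts as bulk.
BULK_THRESHOLD = 3


def normalize(text):
    """Lowercase and collapse whitespace so near-identical copies match."""
    return " ".join(text.lower().split())


def flag_messages(messages):
    """Return (message, reasons) for every message that trips at least one filter."""
    counts = Counter(normalize(m) for m in messages)
    flagged = []
    for message in messages:
        reasons = []
        words = set(re.findall(r"[a-z']+", message.lower()))
        if words & BLOCKED_WORDS:
            reasons.append("word_list")
        if any(p.search(message) for p in SPAM_PATTERNS):
            reasons.append("pattern")
        if counts[normalize(message)] >= BULK_THRESHOLD:
            reasons.append("bulk")
        if reasons:
            flagged.append((message, reasons))
    return flagged
```

Calling `flag_messages` on a batch of messages returns only the ones that tripped a filter, each with the list of filters it tripped, so a downstream step can decide whether to drop or review them.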
The fourth speaker, Liz Crawford @Birchbox, spoke about Birchbox’s use of analytics. Birchbox is a beauty retailer which has a retail and online presence, but specializes in a personalized package of items they send to subscribers every month.
To create the personalized package and better market to their customers, they have a staff of data scientists and statistical analysts:
- Data scientist = Ph.D. who can write production code, embed into product development teams
- Statistical analysts = focus on analytics. Specialized into business concerns.
- Analysts = junior statistical analysts
- Data engineering (warehouse) embedded with platform engineering.
In this structure, data scientists work on methods to better use the user profiles to personalize the box subscribers receive each month. The problem becomes an integer programming optimization problem since subscribers never receive the same thing twice and there are a limited number of specific samples available each month. To further complicate the optimization, customers can request one item each month. The data scientists use the Gurobi solver.
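To make the shape of that optimization concrete, here is a toy version of the box assignment problem. It is a sketch under stated assumptions, not Birchbox's model: it uses exhaustive search in plain Python instead of Gurobi, it ignores the customer request feature, and the preference scores, the fixed box size, and the function name `best_assignment` are invented for illustration. It keeps the two constraints mentioned in the talk: no subscriber receives a sample they already had, and no sample is used beyond its stock.

```python
from itertools import combinations, product


def best_assignment(scores, history, stock, box_size):
    """Pick one box per subscriber to maximize total preference score.

    scores:   {subscriber: {sample: score}}  (hypothetical preference scores)
    history:  {subscriber: set of samples already received}
    stock:    {sample: units available this month}
    box_size: number of samples in each box
    Exhaustive search, so only suitable for tiny examples.
    """
    subscribers = sorted(scores)
    # Candidate boxes per subscriber: combinations of samples never received before.
    options = []
    for s in subscribers:
        eligible = [item for item in sorted(stock) if item not in history.get(s, set())]
        options.append(list(combinations(eligible, box_size)))

    best_value, best_plan = None, None
    for plan in product(*options):
        # Count how many units of each sample this plan would consume.
        used = {}
        for box in plan:
            for item in box:
                used[item] = used.get(item, 0) + 1
        if any(used[item] > stock[item] for item in used):
            continue  # violates the limited-stock constraint
        value = sum(scores[s][item] for s, box in zip(subscribers, plan) for item in box)
        if best_value is None or value > best_value:
            best_value, best_plan = value, dict(zip(subscribers, plan))
    return best_value, best_plan
```

A real solver explores this space far more efficiently. The point here is only that each subscriber/sample pairing is a yes/no decision, which is what makes the problem an integer program.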
They also make customized recommendations online.
They have outlets on web, offline, brick & mortar, apps, social media. They track each customer’s touches on all outlets to better understand customer preferences to get the right message to the right customer.
- You can build up your data science practice over time
- Data science and product development work well when organically partnered
- Don’t be limited by the data you have today – get the data you need
They use other standard market analytics, but they built their internal analytics so they can focus on their core competency. Early in their growth process they taught all their staff to use SQL to better understand their data. She noted that the U.S. rollout of SQL went well, but the European rollout had its challenges.
Investments In Media: Figuring Out The Endgame
Posted on July 1st, 2015
06/30/2015 @WNYC Greene Space 44 Charlton St New York, NY 10006 sponsored by @Parsely
Four panelists talked about trends and the business outlook for media companies. Panelists were
- Cyna Alderman, Managing Director, @DailyNews Innovation Lab
- Joey Marburger, Director of Digital Products and Design, @TheWashingtonPost
- Erin Griffith, Writer, @FortuneMagazine
- Andrew Cleland, Managing Director, @ComcastVentures
The discussion, led by Cyna, started with views on the current state of media.
- Joey spoke about how Jeff Bezos has brought energy to the Washington Post, creating an experimentation culture which feels like a startup.
- Erin spoke about how companies are exploring new ways to make money on the internet as banner ads to a large audience become less effective. She mentioned Buzzfeed as one company trying to create multiple different revenue sources: video,…
- Andrew said he looks for content and monetization when considering ventures
- Proprietary access to high value content
- Alternatively, can they scale free (contributed) content?
- Do they have a platform to allow grading of content.
- Monetization (Ads)
- Smart at aggregating data from different sources
- Quantitative decision making based on data
Next the panelists talked about trends they see
- Joey spoke about how the Post launched a freelance network last week. The goal of the network is to facilitate contact with high quality content providers.
- Andrew spoke about the difficulty in selling better tools to media companies. Issues include the inability to directly attribute revenues to these tools within financially stressed companies. The one exception he noted was support tools for video.
- Erin extended these comments by noting that publishers are generating video as a money maker. This, however, has resulted in a proliferation of autoplay videos which distract from content and degrade the user experience.
- Joey said that micropayments to read articles might become increasingly important as revenue generators as content owners experiment with different distribution channels.
- Andrew talked about the long term importance of highly curated, personalized media. To accomplish successful delivery, companies must become smart in their calls to action, with A/B testing an important part of the development process. This testing will become increasingly important as video assumes a larger role in selling product and the company story.
- Erin emphasized the importance of content management systems as the variety of different offerings increased as well as their personalization to individuals.
The panelists talked about the future of content providers: the top people will become increasingly well rewarded relative to the rest. Brand and quality are key, and being associated with a big brand can provide some protection from falling into the long lower tail.
They also expressed uncertainty on how long trends will last and whether some advertising wisdom, such as marketing to mobile millennials, accurately segments the audience. All agreed that a more global view of user experience is needed to better understand customers and reward the content providers as well as the sales engines. New technologies such as AR/VR may eventually be part of the media offerings, but these will take time to engineer and integrate into the delivery of information.