2018-08 | Data Science at Home

Episodes

Tuesday Aug 28, 2018

Episode 45: why do machine learning models fail?

The success of a machine learning model depends on several factors and events. True generalization to data that the model has never seen before is more a chimera than a reality. But under specific conditions, a well-trained machine learning model can generalize well and reach a testing accuracy similar to the one achieved during training.
In this episode I explain when and why machine learning models fail when moving from training to testing datasets.
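
As a rough illustration of the gap between training and testing performance, here is a minimal sketch of one possible failure mode, overfitting. It is not taken from the episode: the sine-shaped target, the noise level, the polynomial degrees and the helper names (make_data, mse) are all assumptions chosen for the example.

```python
import numpy as np

# Hypothetical setup: noisy samples of a sine curve (an assumption for illustration).
rng = np.random.default_rng(0)

def make_data(n):
    """Return n evenly spaced inputs in [0, 1] and noisy sine targets."""
    x = np.linspace(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=n)
    return x, y

x_train, y_train = make_data(10)    # small training set
x_test, y_test = make_data(200)     # larger held-out set, fresh noise

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (3, 9):
    # Least-squares polynomial fit on the training points only.
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train),
                       mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train MSE = {results[degree][0]:.4g}, "
          f"test MSE = {results[degree][1]:.4g}")
```

The degree-9 polynomial has as many coefficients as there are training points, so it can fit the training noise almost exactly. Its training error is close to zero while its error on the held-out points is not, which is the kind of mismatch the episode is about.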

Tuesday Aug 21, 2018

Episode 44: The predictive power of metadata

In this episode I don't talk about data. Instead, I talk about metadata.
While many machine learning models rely on certain amounts of data, e.g. text, images, audio and video, the signal carried by metadata, that is, all the data that is invisible to the end user, has been shown to be very powerful. Behind a tweet of 140 characters there are more than 140 fields of data that draw a much more detailed profile of the sender and the content she is producing... without ever considering the tweet itself.

References
You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information
https://www.ucl.ac.uk/~ucfamus/papers/icwsm18.pdf

Tuesday Aug 14, 2018

Episode 43: Applied Text Analysis with Python (interview with Rebecca Bilbro)

Today’s episode is about text analysis with Python. Python is the de facto standard in machine learning: it has a large community and a generous choice of libraries, sometimes at the price of lower performance. But overall it is a decent language for typical data science tasks.
I am with Rebecca Bilbro, co-author, with Benjamin Bengfort and Tony Ojeda, of Applied Text Analysis with Python.
We speak about the evolution of applied text analysis, tools and pipelines, and chatbots.

Tuesday Aug 07, 2018

Episode 42: Attacking deep learning models (rebroadcast)

Attacking deep learning models
Compromising AI for fun and profit

Deep learning models have shown very promising results in computer vision and sound recognition. As more and more deep learning based systems are integrated into disparate domains, they will keep affecting people's lives. Autonomous vehicles, medical imaging and banking applications, surveillance cameras and drones, and digital assistants are only a few real applications where deep learning plays a fundamental role. A malfunction in any of these applications will affect the quality of such integrated systems and compromise the security of the individuals who directly or indirectly use them.
In this episode, we explain how machine learning models can be attacked and what we can do to protect intelligent systems from being compromised.
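
To make the idea of an attack concrete, here is a minimal sketch of one well-known technique, the fast gradient sign method (FGSM), applied to a tiny logistic regression classifier rather than a deep network. The episode description does not say which attacks are covered, so the choice of FGSM, the hand-set weights w and b, the input x and the budget eps are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical, hand-set model parameters (not a trained model).
w = np.array([2.0, -3.0, 1.0])
b = 0.0

def predict_proba(x):
    """Probability of class 1 under the logistic model."""
    return float(sigmoid(w @ x + b))

def predict(x):
    """Hard label: 1 if the probability is at least 0.5, else 0."""
    return int(predict_proba(x) >= 0.5)

def fgsm(x, y, eps):
    """Move x by eps in the direction that increases the loss the most.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    grad = (predict_proba(x) - y) * w
    return x + eps * np.sign(grad)

x = np.array([0.5, -0.2, 0.3])   # assumed input with true label 1
y = 1
eps = 0.4                        # assumed per-feature perturbation budget

x_adv = fgsm(x, y, eps)
print("original prediction:   ", predict(x), f"(p = {predict_proba(x):.3f})")
print("adversarial prediction:", predict(x_adv), f"(p = {predict_proba(x_adv):.3f})")
print("largest change to any feature:", float(np.max(np.abs(x_adv - x))))
```

No feature moves by more than eps, yet the predicted label changes. Attacks on deep networks follow the same principle, with the input gradient obtained by backpropagation instead of a closed-form expression.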