Data Science at Home
Episodes

Tuesday Jun 15, 2021
Time to take your data back with Tapmydata (Ep. 156)
In this episode I am with Gilbert Hill, head of strategy at https://tapmydata.com/
We speak about personal data, blockchain, and the ability to control and monetize your data with another simple yet effective app in the ecosystem.
References
https://tapmydata.com/
https://medium.com/@tholder/we-dont-want-your-data-pushing-boundaries-in-data-collection-and-end-to-end-encryption-for-apps-ebd1d5f79df5
Sunday Apr 11, 2021
You are the product [RB] (Ep. 147)
In this episode I am with George Hosu from Cerebralab, and we speak about how dangerous it is not to pay for the services you use and, as a consequence, how dangerous it is to let an algorithm decide what you like or not.
Our Sponsors
This episode is supported by Chapman’s Schmid College of Science and Technology, where master’s and PhD students join in cutting-edge research as they prepare to take the next big leap in their professional journey. To learn more about the innovative tools and collaborative approach that distinguish the Chapman program in Computational and Data Sciences, visit chapman.edu/datascience
If building software is your passion, you’ll love ThoughtWorks Technology Podcast. It’s a podcast for techies by techies. Their team of experienced technologists takes a deep dive into a tech topic that’s piqued their interest; it could be how machine learning is being used in astrophysics or maybe how to succeed at continuous delivery.
Links
https://cerebralab.com
https://www.eugenewei.com/blog/2019/2/19/status-as-a-service

Sunday Feb 07, 2021
What's up with WhatsApp? (Ep. 138)
Have you clicked the button? Accepted the new terms?
It's time we have a talk.

Saturday Dec 19, 2020
What is data ethics? (Ep. 133)
What is data ethics? In this episode I have an interesting chat with Denny Wong from FaqBot and Muna.
Our Sponsor
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.
References
Denny's Twitter profile
The data ethics awareness workshop for AI practitioners

Friday Dec 04, 2020
What happens to data transfer after Schrems II? (Ep. 131)
In this episode Adam Leon Smith, CTO of DragonFly and expert in data regulations, explains some of the consequences of Schrems II for data transfers from the EU to the US.
For very interesting references and a practical example, subscribe to our Newsletter

Friday Sep 04, 2020
Testing in machine learning: checking deeplearning models (Ep. 118)
In this episode I speak with Adam Leon Smith, CTO at DragonFly and expert in testing strategies for software and machine learning. We cover testing with deep learning (neuron coverage, threshold coverage, sign change coverage, layer coverage, etc.), combinatorial testing and their practical aspects.
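To make the idea of neuron coverage a bit more concrete, here is a minimal sketch. It is not code from the episode. It assumes a tiny fully connected ReLU network with hand-picked weights (the `LAYERS` values, the function names and the threshold are all hypothetical) and counts a neuron as covered when its activation exceeds the threshold for at least one test input.

```python
from typing import List, Sequence, Set, Tuple

Matrix = List[List[float]]
Layer = Tuple[Matrix, List[float]]  # (weights, biases); one weight row per neuron

# Hypothetical toy network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
LAYERS: List[Layer] = [
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 1.0, 1.0]], [0.0]),
]


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def layer_forward(weights: Matrix, biases: List[float], inputs: Sequence[float]) -> List[float]:
    """Apply one dense layer followed by ReLU."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def neuron_coverage(layers: List[Layer], test_inputs: List[List[float]], threshold: float = 0.0) -> float:
    """Fraction of neurons whose activation exceeds `threshold` for at least one input."""
    covered: Set[Tuple[int, int]] = set()
    total = sum(len(biases) for _, biases in layers)
    for x in test_inputs:
        activations: Sequence[float] = x
        for layer_idx, (weights, biases) in enumerate(layers):
            activations = layer_forward(weights, biases, activations)
            for neuron_idx, value in enumerate(activations):
                if value > threshold:
                    covered.add((layer_idx, neuron_idx))
    return len(covered) / total


if __name__ == "__main__":
    print(neuron_coverage(LAYERS, [[1.0, 0.0]]))                             # 0.5
    print(neuron_coverage(LAYERS, [[1.0, 0.0], [0.0, 1.0]]))                 # 0.75
    print(neuron_coverage(LAYERS, [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]))   # 1.0
```

Coverage only grows when a new test input activates a neuron that no earlier input reached, which is why such criteria are used to judge how much of a network a test suite exercises.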
On September 15th there will be a live@Manning Rust conference. In one Rust-full day you will attend many talks about what's special about Rust, building high-performance web services or video games, WebAssembly, and much more. If you want to meet the tribe, tune in on September 15th to the live@Manning Rust conference.

Wednesday Aug 12, 2020
Why you care about homomorphic encryption (Ep. 116)
After deep learning, a new entry is about ready to go on stage. The usual journalists are warming up their keyboards for blogs, news feeds, tweets: in one word, hype. This time it's all about privacy and data confidentiality. The new words: homomorphic encryption.
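For a feel of what computing on encrypted data means, here is a minimal sketch. It is not from the episode. It assumes a textbook Paillier cryptosystem, which is only additively homomorphic (not fully homomorphic), with toy primes that are far too small to be secure. All function names are hypothetical, and it needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (insecure sizes, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, message: int) -> int:
    """Encrypt an integer 0 <= message < n with fresh randomness."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n: int, private_key, ciphertext: int) -> int:
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, private_key = keygen(61, 53)
    c1, c2 = encrypt(n, 12), encrypt(n, 30)
    total = add_encrypted(n, c1, c2)  # computed without ever seeing 12 or 30
    print(decrypt(n, private_key, total))  # 42
```

Whoever runs `add_encrypted` never needs the private key, which is the property that makes this family of schemes interesting for confidential data.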
Join and chat with us on the official Discord channel.
Sponsors
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.
References
Towards a Homomorphic Machine Learning Big Data Pipeline for the Financial Services Sector
IBM Fully Homomorphic Encryption Toolkit for Linux

Friday May 08, 2020
Pandemics and the risks of collecting data (Ep. 103)
Covid-19 is an emergency. True. Let's just not set ourselves up for another emergency, this time about privacy violations, when this one is over.
Join our new Slack channel
This episode is supported by Proton. You can check them out at protonmail.com or protonvpn.com

Monday Mar 23, 2020
WARNING!! Neural networks can memorize secrets (Ep. 100)
One of the best features of neural networks and machine learning models is their ability to memorize patterns from training data and apply them to unseen observations. That's where the magic is. However, there are scenarios in which the same machine learning models learn patterns so well that they can disclose some of the data they have been trained on. This phenomenon goes under the name of unintended memorization and it is extremely dangerous.
Think about a language generator that discloses the passwords, the credit card numbers and the social security numbers of the records it has been trained on. Or more generally, think about a synthetic data generator that can disclose the training data it is trying to protect.
In this episode I explain why unintended memorization is a real problem in machine learning. Except for differentially private training, there is no other way to mitigate such a problem in realistic conditions. At Pryml we are very aware of this, which is why we have been developing a synthetic data generation technology that is not affected by such an issue.
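A toy illustration of the leak described above, as a minimal sketch. It is not from the episode or from the referenced paper. Instead of a neural network it assumes a character-level n-gram model, and the corpus, the fake secret and the function names are all hypothetical. The point is only that a model fitted on text containing a rare secret can reproduce that secret when prompted with its prefix.

```python
from collections import Counter, defaultdict
from typing import DefaultDict

ORDER = 8  # number of previous characters used as context (an arbitrary choice)

# Hypothetical training text with one made-up secret inside it.
CORPUS = (
    "the weather is nice today. "
    "the secret code is 7391. "
    "the meeting is at noon."
)


def train(text: str, order: int) -> DefaultDict[str, Counter]:
    """Count which character follows each context of `order` characters."""
    model: DefaultDict[str, Counter] = defaultdict(Counter)
    for i in range(len(text) - order):
        model[text[i:i + order]][text[i + order]] += 1
    return model


def complete(model: DefaultDict[str, Counter], prompt: str, order: int, length: int) -> str:
    """Greedily extend `prompt` with the most frequent next character."""
    out = prompt
    for _ in range(length):
        context = out[-order:]
        if context not in model:
            break  # unseen context: the toy model has nothing to say
        out += model[context].most_common(1)[0][0]
    return out[len(prompt):]


if __name__ == "__main__":
    model = train(CORPUS, ORDER)
    # The model was never asked to store the code, yet the prefix extracts it.
    print(complete(model, "the secret code is ", ORDER, 4))  # 7391
```

Real neural language models are far less transparent than this lookup table, which makes the same kind of leak harder to spot.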
This episode is supported by Harmonizely. Harmonizely lets you build your own unique scheduling page based on your availability so you can start scheduling meetings in just a couple of minutes. Get started by connecting your online calendar and configuring your meeting preferences. Then, start sharing your scheduling page with your invitees!
References
The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks
https://www.usenix.org/conference/usenixsecurity19/presentation/carlini

Sunday Mar 01, 2020
Why sharing real data is dangerous (Ep. 97)
There are very good reasons why a financial institution should never share its data. Actually, it should never even move its data. Ever. In this episode I explain why.