Data Science at Home
Episodes

Sunday Oct 11, 2020
How to test machine learning in production (Ep. 121)
Come join me in our Discord channel to talk about all things data science.
Follow me on Twitch during my live coding sessions, usually in Rust and Python.
This episode is supported by Monday.com.
Monday.com brings teams together so you can plan, manage and track everything your team is working on in one centralized place.
The monday Apps Challenge brings developers from around the world together to compete in building apps that improve the way teams work together on monday.com.

Saturday Sep 26, 2020
Why synthetic data cannot boost machine learning (Ep. 120)
Come join me in our Discord channel to talk about all things data science.
Follow me on Twitch during my live coding sessions, usually in Rust and Python.
This episode is supported by Women in Tech by Manning Conferences.
Wednesday Sep 16, 2020
Machine learning in production: best practices [LIVE from twitch.tv] (Ep. 119)
Hey there! Having the best time of my life ;)
This is the first episode I record while I am live on my new Twitch channel :) So much fun!
Feel free to follow me for the next live stream. You can also see me coding machine learning stuff in Rust :))
Don't forget to jump on the usual Discord and have a chat.
I'll see you there!

Wednesday Aug 12, 2020
Why you care about homomorphic encryption (Ep. 116)
After deep learning, a new entry is about ready to go on stage. The usual journalists are warming up their keyboards for blogs, news feeds, tweets: in one word, hype. This time it's all about privacy and data confidentiality. The new words: homomorphic encryption.
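The notes above do not spell out what "homomorphic" means in practice. As a rough illustration only, the sketch below uses a toy version of the Paillier scheme, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. This is a partially homomorphic scheme, not the fully homomorphic encryption discussed in the references, and the tiny hard-coded primes, the fixed randomness values, and all function names are assumptions made for readability. It is not secure and not taken from the episode.

```python
from math import gcd

# Toy key material: tiny primes chosen only for readability (NOT secure).
P, Q = 293, 433
N = P * Q                                      # public modulus
N_SQ = N * N
G = N + 1                                      # common simple choice of generator
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)   # private: lcm(p-1, q-1)
MU = pow(LAM, -1, N)                           # private: modular inverse (Python 3.8+)


def encrypt(m, r):
    """Encrypt an integer 0 <= m < N with randomness r coprime to N."""
    if not (0 <= m < N) or gcd(r, N) != 1:
        raise ValueError("m must be in [0, N) and r must be coprime to N")
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Recover the plaintext from a ciphertext using the private values."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Combine two ciphertexts so the result decrypts to the sum (mod N)."""
    return (c1 * c2) % N_SQ


# A real implementation would draw a fresh random r for every encryption;
# fixed values are used here only to keep the example reproducible.
c1 = encrypt(1200, r=17)
c2 = encrypt(345, r=23)
total = decrypt(add_encrypted(c1, c2))
assert total == 1200 + 345
```

The party that calls `add_encrypted` never sees 1200 or 345, only the ciphertexts, which is the property that makes this family of techniques interesting for data confidentiality.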
Join and chat with us on the official Discord channel.
Sponsors
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.
References
Towards a Homomorphic Machine Learning Big Data Pipeline for the Financial Services Sector
IBM Fully Homomorphic Encryption Toolkit for Linux

Sunday Jul 26, 2020
GPT-3 cannot code (and never will) (Ep. 114)
The hype around GPT-3 is alarming and gives us an awful picture of people misunderstanding artificial intelligence. In response to some comments that claim GPT-3 will take developers' jobs, in this episode I express some personal opinions about the state of AI in generating source code (and in particular GPT-3).
If you have comments about this episode or just want to chat, come join us on the official Discord channel.
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.

Sunday Jul 19, 2020
In this episode I speak about data transformation frameworks available for the data scientist who writes Python code. The usual suspect is clearly Pandas, as the most widely used library and de facto standard. However, when data volumes increase and distributed algorithms are in place (according to a map-reduce paradigm of computation), Pandas no longer performs as expected. Other frameworks play a role in such contexts.
In this episode I explain the frameworks that are the best equivalent to Pandas in big data contexts.
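As a minimal sketch of the map-reduce paradigm mentioned above, the snippet below computes a grouped mean by processing each partition independently (map) and then merging the partial results (reduce). It uses only the Python standard library and runs in a single process; the function names and the sample rows are hypothetical, and it does not use the API of any of the frameworks discussed in the episode.

```python
from functools import reduce


def partition(rows, n_parts):
    """Split rows into n_parts chunks (stand-ins for data held by different workers)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def map_partition(rows):
    """Map step: compute a partial (sum, count) per key for one partition."""
    partial = {}
    for key, value in rows:
        total, count = partial.get(key, (0.0, 0))
        partial[key] = (total + value, count + 1)
    return partial


def reduce_partials(left, right):
    """Reduce step: merge two partial results key by key."""
    merged = dict(left)
    for key, (total, count) in right.items():
        prev_total, prev_count = merged.get(key, (0.0, 0))
        merged[key] = (prev_total + total, prev_count + count)
    return merged


def grouped_mean(rows, n_parts=3):
    """Grouped mean computed from per-partition partial sums and counts."""
    partials = map(map_partition, partition(rows, n_parts))
    merged = reduce(reduce_partials, partials)
    return {key: total / count for key, (total, count) in merged.items()}


rows = [("a", 1.0), ("b", 4.0), ("a", 3.0), ("b", 6.0), ("c", 5.0)]
means = grouped_mean(rows)
```

The mean is carried as a (sum, count) pair because averages of averages are not correct in general. Only the small partial dictionaries need to travel between workers, which is what lets this style of computation scale past a single machine's memory.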
Don't forget to join our Discord channel and comment on previous episodes or propose new ones.
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. Amethix is a consulting firm focused on data science, machine learning, and artificial intelligence.
References
Pandas - a fast, powerful, flexible and easy to use open source data analysis and manipulation tool - https://pandas.pydata.org/
Modin - Scale your pandas workflows by changing one line of code - https://github.com/modin-project/modin
Dask - advanced parallelism for analytics - https://dask.org/
Ray - a fast and simple framework for building and running distributed applications - https://github.com/ray-project/ray
RAPIDS - GPU data science - https://rapids.ai/

Monday Jun 29, 2020
Rust and machine learning #4: practical tools (Ep. 110)
In this episode I make a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments. I believe that community effort can change this very quickly.
To make a comparison with the Python ecosystem, I will cover frameworks for linear algebra (NumPy), dataframes (pandas), off-the-shelf machine learning (scikit-learn), deep learning (TensorFlow) and reinforcement learning (OpenAI).
Rust is the language of the future. Happy coding!
References
BLAS linear algebra https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
Rust dataframe https://github.com/nevi-me/rust-dataframe
Rustlearn https://github.com/maciejkula/rustlearn
Rusty machine https://github.com/AtheMathmo/rusty-machine
Tensorflow bindings https://lib.rs/crates/tensorflow
Juice (machine learning for hackers) https://lib.rs/crates/juice
Rust reinforcement learning https://lib.rs/crates/rsrl

Monday Jun 22, 2020
Rust and machine learning #3 with Alec Mocatta (Ep. 109)
In the 3rd episode of Rust and machine learning I speak with Alec Mocatta. Alec is a professional programmer with 20+ years of experience who has been spending time at the intersection of distributed systems and data analytics. He's the founder of two startups in the distributed systems space and the author of Amadeus, an open-source framework that encourages you to write clean and reusable code that works, regardless of data scale, locally or distributed across a cluster.
Only for June 24th, LDN *Virtual* Talks June 2020 with Bippit (Alec speaking about Amadeus)

Wednesday Jun 17, 2020
Rust and machine learning #1 (Ep. 107)
This is the first episode of a series about the Rust programming language and the role it can play in the machine learning field.
Rust is one of the most beautiful languages I have ever studied. I personally come from the C programming language, though for professional activities in machine learning I had to switch to the loved and hated Python language.
This episode clearly does not provide an exhaustive list of Rust's benefits or capabilities. For that, you can check the references and start getting familiar with what I think is going to be the language of the next 20 years.
Sponsors
This episode is supported by Pryml Technologies. Pryml offers secure and cost-effective data privacy solutions for your organisation. It generates a synthetic alternative without disclosing your confidential data.
References
The Rust Programming Language
Cookin' with Rust

Friday Feb 07, 2020
Why so much silence? Building a company! That's why :) I am building pryml, a platform that allows data scientists to build their applications on data they cannot get access to. This is the first of a series of episodes in which I will speak about the technology and the challenges we are facing while we build it.
Happy listening and stay tuned!