Data Science at Home
Episodes
Wednesday Sep 16, 2020
Machine learning in production: best practices [LIVE from twitch.tv] (Ep. 119)
Hey there! Having the best time of my life ;)
This is the first episode I record while I am live on my new Twitch channel :) So much fun!
Feel free to follow me for the next live stream. You can also see me coding machine learning stuff in Rust :))
Don't forget to jump on the usual Discord and have a chat
I'll see you there!

Friday Sep 04, 2020
Testing in machine learning: checking deeplearning models (Ep. 118)
In this episode I speak with Adam Leon Smith, CTO at DragonFly and expert in testing strategies for software and machine learning. We cover testing with deep learning (neuron coverage, threshold coverage, sign change coverage, layer coverage, etc.), combinatorial testing, and their practical aspects.
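To make the idea of neuron coverage a bit more concrete, here is a minimal sketch, not the method discussed in the episode. It assumes the common definition: the fraction of neurons whose activation exceeds a threshold for at least one test input. The tiny ReLU network, its weights, the function names and the threshold are all hypothetical and chosen only for illustration.

```python
# Minimal sketch of neuron coverage on a tiny, hand-written ReLU network.
# All weights, inputs and names here are made up for illustration.

def relu(x):
    return x if x > 0.0 else 0.0

def forward(layers, inputs):
    """Return the activations of every layer for one input vector."""
    activations = []
    current = inputs
    for weights, biases in layers:
        current = [
            relu(sum(w * x for w, x in zip(row, current)) + b)
            for row, b in zip(weights, biases)
        ]
        activations.append(current)
    return activations

def neuron_coverage(layers, test_inputs, threshold=0.0):
    """Fraction of neurons activated above `threshold` by at least one input."""
    covered = set()
    total = sum(len(biases) for _, biases in layers)
    for inputs in test_inputs:
        for layer_idx, layer_acts in enumerate(forward(layers, inputs)):
            for neuron_idx, act in enumerate(layer_acts):
                if act > threshold:
                    covered.add((layer_idx, neuron_idx))
    return len(covered) / total

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
# Each layer is a (weights, biases) pair; the output neuron is counted too.
LAYERS = [
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 1.0, 1.0]], [0.0]),
]

if __name__ == "__main__":
    tests = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    for n in range(1, len(tests) + 1):
        print(n, "inputs ->", neuron_coverage(LAYERS, tests[:n]))
```

Each extra test input in this toy example activates a hidden neuron that the previous ones left untouched, so coverage grows as the test set becomes more diverse. Real tools compute the same kind of metric on the activations of a trained model.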
On September 15th there will be a live@Manning Rust conference. In one Rust-full day you can attend many talks about what's special about Rust, building high-performance web services or video games, WebAssembly, and much more. If you want to meet the tribe, tune in on September 15th to the live@Manning Rust conference.

Saturday Aug 29, 2020
Testing in machine learning: generating tests and data (Ep. 117)
In this episode I speak with Adam Leon Smith, CTO at DragonFly and expert in testing strategies for software and machine learning.
On September 15th there will be a live@Manning Rust conference. In one Rust-full day you can attend many talks about what's special about Rust, building high-performance web services or video games, WebAssembly, and much more. If you want to meet the tribe, tune in on September 15th to the live@Manning Rust conference.

Monday Aug 03, 2020
Test-First machine learning (Ep. 115)
In this episode I speak about a testing methodology for machine learning models that are supposed to be integrated in production environments.
Don't forget to come chat with us in our Discord channel
Enjoy the show!
--
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.

Wednesday Jul 22, 2020
Make Stochastic Gradient Descent Fast Again (Ep. 113)
There is definitely room for improvement in the stochastic gradient descent family of algorithms. In this episode I explain a relatively simple method that has been shown to improve on the Adam optimizer. But watch out! This approach does not generalize well.
Join our Discord channel and chat with us.
References
More descent, less gradient
Taylor Series

Sunday Jul 19, 2020
In this episode I speak about data transformation frameworks available to the data scientist who writes Python code. The usual suspect is clearly Pandas, the most widely used library and de facto standard. However, when data volumes increase and distributed algorithms are in place (according to a map-reduce paradigm of computation), Pandas no longer performs as expected. Other frameworks play a role in such contexts.
In this episode I explain the frameworks that are the best equivalents to Pandas in big data contexts.
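As a rough illustration of the map-reduce paradigm mentioned above, the sketch below computes a grouped mean over data split into partitions, using only the Python standard library. It is not how any of the frameworks in this episode is implemented. The function names, the `city`/`temp` columns and the sample rows are hypothetical. On a single machine the result corresponds to what `df.groupby("city")["temp"].mean()` would give in Pandas.

```python
# Minimal map-reduce sketch: a grouped mean over partitioned rows.
# Names and data are made up for illustration only.
from functools import reduce

def map_partition(rows, key, value):
    """Map step: per-key (sum, count), computed on one partition only."""
    partial = {}
    for row in rows:
        s, c = partial.get(row[key], (0.0, 0))
        partial[row[key]] = (s + row[value], c + 1)
    return partial

def merge_partials(left, right):
    """Reduce step: combine two partial results key by key."""
    merged = dict(left)
    for k, (s, c) in right.items():
        ms, mc = merged.get(k, (0.0, 0))
        merged[k] = (ms + s, mc + c)
    return merged

def grouped_mean(partitions, key, value):
    """Run the map step on every partition, then reduce to per-key means."""
    partials = [map_partition(p, key, value) for p in partitions]
    totals = reduce(merge_partials, partials, {})
    return {k: s / c for k, (s, c) in totals.items()}

# Three partitions standing in for chunks of data held on different workers.
PARTITIONS = [
    [{"city": "Rome", "temp": 30.0}, {"city": "Oslo", "temp": 10.0}],
    [{"city": "Rome", "temp": 20.0}],
    [{"city": "Oslo", "temp": 14.0}, {"city": "Oslo", "temp": 12.0}],
]

if __name__ == "__main__":
    print(grouped_mean(PARTITIONS, "city", "temp"))
```

The map step only ever sees one partition, and the reduce step only combines small (sum, count) pairs. That is why this style of computation can spread across many workers, while a single in-memory dataframe cannot.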
Don't forget to join our Discord channel to comment on previous episodes or propose new ones.
This episode is supported by Amethix Technologies
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. Amethix is a consulting firm focused on data science, machine learning, and artificial intelligence.
References
Pandas - a fast, powerful, flexible and easy to use open source data analysis and manipulation tool - https://pandas.pydata.org/
Modin - scale your pandas workflows by changing one line of code - https://github.com/modin-project/modin
Dask - advanced parallelism for analytics - https://dask.org/
Ray - a fast and simple framework for building and running distributed applications - https://github.com/ray-project/ray
RAPIDS - GPU data science - https://rapids.ai/
Friday Jul 03, 2020
[RB] It’s cold outside. Let’s speak about AI winter (Ep. 111)
In this episode I speak with Filip Piekniewski about some of the most noteworthy findings in AI and machine learning in 2019. As a matter of fact, the entire field of AI has been inflated by hype and claims that are hard to believe. Many of the promises made a few years ago have proved quite hard to achieve, if not impossible. Let's stay grounded and realistic about the potential of this amazing field of research, so as not to bring disillusionment in the near future.
Join us on our Discord channel to discuss your favorite episode and propose new ones.
This episode is brought to you by Protonmail
Click on the link in the description or go to protonmail.com/datascience and get 20% off their annual subscription.

Monday Jun 29, 2020
Rust and machine learning #4: practical tools (Ep. 110)
In this episode I make a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments. I believe that community effort can change this very quickly.
To make a comparison with the Python ecosystem I will cover frameworks for linear algebra (NumPy), dataframes (pandas), off-the-shelf machine learning (scikit-learn), deep learning (TensorFlow) and reinforcement learning (OpenAI).
Rust is the language of the future. Happy coding!
References
BLAS linear algebra https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
Rust dataframe https://github.com/nevi-me/rust-dataframe
Rustlearn https://github.com/maciejkula/rustlearn
Rusty machine https://github.com/AtheMathmo/rusty-machine
Tensorflow bindings https://lib.rs/crates/tensorflow
Juice (machine learning for hackers) https://lib.rs/crates/juice
Rust reinforcement learning https://lib.rs/crates/rsrl

Monday Jun 22, 2020
Rust and machine learning #3 with Alec Mocatta (Ep. 109)
In the 3rd episode of Rust and machine learning I speak with Alec Mocatta. Alec is a professional programmer with more than 20 years of experience who has been spending time at the intersection of distributed systems and data analytics. He's the founder of two startups in the distributed systems space and the author of Amadeus, an open-source framework that encourages you to write clean and reusable code that works, regardless of data scale, locally or distributed across a cluster.
Only for June 24th, LDN *Virtual* Talks June 2020 with Bippit (Alec speaking about Amadeus)

Friday Jun 19, 2020
Rust and machine learning #2 with Luca Palmieri (Ep. 108)
In the second episode of Rust and machine learning I speak with Luca Palmieri, who has spent a large part of his career at the intersection of machine learning and data engineering. In addition, Luca has contributed to several projects close to the machine learning community using the Rust programming language. Linfa is an ambitious project that definitely deserves the attention of the data science community (and it's written in Rust, with Python bindings! How cool??!).
References
Series Announcement - Zero to Production in Rust https://www.lpalmieri.com/posts/2020-05-10-announcement-zero-to-production-in-rust/
Zero To Production #0: Foreword https://www.lpalmieri.com/posts/2020-05-24-zero-to-production-0-foreword/
Taking ML to production with Rust: a 25x speedup https://www.lpalmieri.com/posts/2019-12-01-taking-ml-to-production-with-rust-a-25x-speedup/