Page 19 | Data Science at Home

Listen on:

Episodes

Wednesday Aug 12, 2020

Why you care about homomorphic encryption (Ep. 116)

Wednesday Aug 12, 2020

After deep learning, a new entry is about ready to go on stage. The usual journalists are warming up their keyboards for blogs, news feeds, tweets, in one word, hype.This time it's all about privacy and data confidentiality. The new words, homomorphic encryption.

Join and chat with us on the official Discord channel.

Sponsors
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.

References
Towards a Homomorphic Machine Learning Big Data Pipeline for the Financial Services Sector
IBM Fully Homomorphic Encryption Toolkit for Linux

Monday Aug 03, 2020

Test-First machine learning (Ep. 115)

Monday Aug 03, 2020

In this episode I speak about a testing methodology for machine learning models that are supposed to be integrated in production environments.
Don't forget to come chat with us in our Discord channel

Enjoy the show!

--
This episode is supported by Amethix Technologies.

Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.

Sunday Jul 26, 2020

GPT-3 cannot code (and never will) (Ep. 114)

Sunday Jul 26, 2020

The hype around GPT-3 is alarming and gives and provides us with the awful picture of people misunderstanding artificial intelligence. In response to some comments that claim GPT-3 will take developers' jobs, in this episode I express some personal opinions about the state of AI in generating source code (and in particular GPT-3).

If you have comments about this episode or just want to chat, come join us on the official Discord channel.

This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.

Wednesday Jul 22, 2020

Make Stochastic Gradient Descent Fast Again (Ep. 113)

Wednesday Jul 22, 2020

There is definitely room for improvement in the family of algorithms of stochastic gradient descent. In this episode I explain a relatively simple method that has shown to improve on the Adam optimizer. But, watch out! This approach does not generalize well.
Join our Discord channel and chat with us.

References
More descent, less gradient
Taylor Series

Sunday Jul 19, 2020

What data transformation library should I use? Pandas vs Dask vs Ray vs Modin vs Rapids (Ep. 112)

Sunday Jul 19, 2020

In this episode I speak about data transformation frameworks available for the data scientist who writes Python code. The usual suspect is clearly Pandas, as the most widely used library and de-facto standard. However when data volumes increase and distributed algorithms are in place (according to a map-reduce paradigm of computation), Pandas no longer performs as expected. Other frameworks play a role in such context.
In this episode I explain the frameworks that are the best equivalent to Pandas in bigdata contexts.
Don't forget to join our Discord channel and comment previous episodes or propose new ones.

This episode is supported by Amethix Technologies
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. Amethix is a consulting firm focused on data science, machine learning, and artificial intelligence.

References
Pandas a fast, powerful, flexible and easy to use open source data analysis and manipulation tool - https://pandas.pydata.org/
Modin - Scale your pandas workflows by changing one line of code - https://github.com/modin-project/modin
Dask advanced parallelism for analytics https://dask.org/
Ray is a fast and simple framework for building and running distributed applications https://github.com/ray-project/ray
RAPIDS - GPU data science https://rapids.ai/

Friday Jul 03, 2020

[RB] It’s cold outside. Let’s speak about AI winter (Ep. 111)

Friday Jul 03, 2020

In this episode I speak with Filip Piekniewski about some of the most worth noting findings in AI and machine learning in 2019. As a matter of fact, the entire field of AI has been inflated by hype and claims that are hard to believe. A lot of the promises made a few years ago have revealed quite hard to achieve, if not impossible. Let's stay grounded and realistic on the potential of this amazing field of research, not to bring disillusion in the near future.
Join us to our Discord channel to discuss your favorite episode and propose new ones.

This episode is brought to you by Protonmail
Click on the link in the description or go to protonmail.com/datascience and get 20% off their annual subscription.

Monday Jun 29, 2020

Rust and machine learning #4: practical tools (Ep. 110)

Monday Jun 29, 2020

In this episode I make a non exhaustive list of machine learning tools and frameworks, written in Rust. Not all of them are mature enough for production environments. I believe that community effort can change this very quickly.
To make a comparison with the Python ecosystem I will cover frameworks for linear algebra (numpy), dataframes (pandas), off-the-shelf machine learning (scikit-learn), deep learning (tensorflow) and reinforcement learning (openAI).
Rust is the language of the future.Happy coding!
Reference
BLAS linear algebra https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
Rust dataframe https://github.com/nevi-me/rust-dataframe
Rustlearn https://github.com/maciejkula/rustlearn
Rusty machine https://github.com/AtheMathmo/rusty-machine
Tensorflow bindings https://lib.rs/crates/tensorflow
Juice (machine learning for hackers) https://lib.rs/crates/juice
Rust reinforcement learning https://lib.rs/crates/rsrl

Monday Jun 22, 2020

Rust and machine learning #3 with Alec Mocatta (Ep. 109)

Monday Jun 22, 2020

In the 3rd episode of Rust and machine learning I speak with Alec Mocatta. Alec is a +20 year experience professional programmer who has been spending time at the interception of distributed systems and data analytics. He's the founder of two startups in the distributed system space and author of Amadeus, an open-source framework that encourages you to write clean and reusable code that works, regardless of data scale, locally or distributed across a cluster.
Only for June 24th, LDN *Virtual* Talks June 2020 with Bippit (Alec speaking about Amadeus)

Friday Jun 19, 2020

Rust and machine learning #2 with Luca Palmieri (Ep. 108)

Friday Jun 19, 2020

In the second episode of Rust and Machine learning I am speaking with Luca Palmieri, who has been spending a large part of his career at the interception of machine learning and data engineering. In addition, Luca contributed to several projects closer to the machine learning community using the Rust programming language. Linfa is an ambitious project that definitely deserves the attention of the data science community (and it's written in Rust, with Python bindings! How cool??!).

References
Series Announcement - Zero to Production in Rust https://www.lpalmieri.com/posts/2020-05-10-announcement-zero-to-production-in-rust/
Zero To Production #0: Foreword https://www.lpalmieri.com/posts/2020-05-24-zero-to-production-0-foreword/
Taking ML to production with Rust: a 25x speedup https://www.lpalmieri.com/posts/2019-12-01-taking-ml-to-production-with-rust-a-25x-speedup/

Wednesday Jun 17, 2020

Rust and machine learning #1 (Ep. 107)

Wednesday Jun 17, 2020

This is the first episode of a series about the Rust programming language and the role it can play in the machine learning field.
Rust is one of the most beautiful languages I have ever studied so far. I personally come from the C programming language, though for professional activities in machine learning I had to switch to the loved and hated Python language.
This episode is clearly not providing you with an exhaustive list of the benefits of Rust, nor its capabilities. For this you can check the references and start getting familiar with what I think it's going to be the language of the next 20 years.

Sponsored
This episode is supported by Pryml Technologies. Pryml offers secure and cost effective data privacy solutions for your organisation. It generates a synthetic alternative without disclosing you confidential data.

References
The Rust Programming Language
Cookin' with Rust

Become a sponsor

Data Science at Home is the top-10 best data science podcasts on Apple Podcasts, Spotify, Stitcher, Podbean and many more aggregators.

We reach our audience on a weekly basis via 30-minute episodes enriched with blog posts and show notes. Our episodes reach a highly targeted audience across a wide demographics and globally distributed.

Data Science at home currently accepts at most two advertising slots per episode. The scheduled episode for your advertising campaign will be defined by our team, depending on the topic and the current advertising queue.

Our team is available to give you recommendations about your application and to discuss rates. Please send a direct email to media@amethix.com to make first contact. After connecting, we will share the best available date for you to proceed with the onboarding.

We promote services and products related to IT, Internet services, Research, Data Science, Machine learning, Fintech and Banking, Healthcare, Energy, etc. Below are some of the most recent statistics of the show.

About this Show

Data Science at Home is a podcast about machine learning, artificial intelligence and algorithms.

The show is hosted by Dr. Francesco Gadaleta on solo episodes and interviews with some of the most influential figures in the field