Archive for the 'machine learning' Category

Come join me in our Discord channel speaking about all things data science.

Follow me on Twitch during my live coding sessions usually in Rust and Python

Subscribe to the official Newsletter and never miss an episode

Our Sponsors

  • ProtonMail offers a simple and trusted solution to protect your internet connection and access blocked or restricted websites. All of ProtonMail and ProtonVPN's apps are open source and have been inspected by cybersecurity experts, and Proton is based in Switzerland, home to some of the world’s strongest privacy laws.
  • Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

Read Full Post »

References

Dataset distillation (official paper)

GitHub repo

 

Read Full Post »

Our Sponsors

  • physicspodcast.com is not just a physics podcast: interviews with scientists, scholars, and authors, and reflections on the history and future of science and technology, are all in the wheelhouse.

Read Full Post »

Our Sponsors

  • The Monday Apps Challenge is bringing developers around the world together to compete in order to build apps that can improve the way teams work together on monday.com

 

References

A Simple Framework for Contrastive Learning of Visual Representations

 

Read Full Post »

Let's talk about federated learning. Why is it important? Why are large organizations not ready for it yet?

 

This episode is supported by Monday.com

Read Full Post »

This episode is supported by Monday.com

Monday.com brings teams together so you can plan, manage, and track everything your team is working on in one centralized place.

Read Full Post »

This episode is supported by Women in Tech by Manning Conferences

Read Full Post »

Hey there! Having the best time of my life ;)

This is the first episode I have recorded live on my new Twitch channel :) So much fun!

Feel free to follow me for the next live stream. You can also see me coding machine learning stuff in Rust :))

Don't forget to jump on the usual Discord and have a chat.

I'll see you there!

Read Full Post »

In this episode I go through a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments, but I believe that community effort can change this very quickly.

To draw a comparison with the Python ecosystem, I will cover frameworks for linear algebra (NumPy), dataframes (pandas), off-the-shelf machine learning (scikit-learn), deep learning (TensorFlow) and reinforcement learning (OpenAI).

Rust is the language of the future.
Happy coding!
 

References

  1. BLAS linear algebra https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
  2. Rust dataframe https://github.com/nevi-me/rust-dataframe
  3. Rustlearn https://github.com/maciejkula/rustlearn
  4. Rusty machine https://github.com/AtheMathmo/rusty-machine
  5. Tensorflow bindings https://lib.rs/crates/tensorflow
  6. Juice (machine learning for hackers) https://lib.rs/crates/juice
  7. Rust reinforcement learning https://lib.rs/crates/rsrl

Read Full Post »

In the second episode of Rust and Machine Learning I am speaking with Luca Palmieri, who has spent a large part of his career at the intersection of machine learning and data engineering.
In addition, Luca has contributed to several projects close to the machine learning community using the Rust programming language. Linfa is an ambitious project that definitely deserves the attention of the data science community (and it's written in Rust, with Python bindings! How cool is that?!).

 

Read Full Post »

Data science and data engineering are usually two different departments in organisations. Bridging the gap between the two is essential to success. Many times the brilliant applications created by data scientists don't find a match in production, just because they are not production-ready.

In this episode I talk with Daan Gerits, co-founder and CTO at Pryml.io.

 

Read Full Post »

What happens to a neural network trained with random data?

Are massive neural networks just lookup tables or do they truly learn something? 

Today’s episode is about memorisation and generalisation in deep learning, with Stanisław Jastrzębski from New York University.

Stan spent two summers as a visiting student with Prof. Yoshua Bengio and has been working on:

  • Understanding and improving how deep networks generalise
  • Representation Learning
  • Natural Language Processing
  • Computer Aided Drug Design

 

What makes deep learning unique?

I asked him a few questions that I had wanted answered for a long time. For instance, what does deep learning bring to the table that other methods don’t, or are not capable of?
Stan believes that the one thing that makes deep learning special is representation learning. None of the competing methods, be it kernel machines or random forests, have this capability. Moreover, optimisation (SGD) lies at the heart of representation learning, in the sense that it is what allows good representations to be found.

 

What really improves the training quality of a neural network?

We discussed how the accuracy of a neural network depends largely on how good the Stochastic Gradient Descent (SGD) method is at finding minima of the loss function. What influences such minima?
Stan's answer revealed that training set accuracy, or the loss value, is actually not that interesting. It is relatively easy to overfit the data (i.e. achieve the lowest loss possible), given a large enough network and a large enough computational budget. However, the shape of the minima and the performance on validation sets are influenced by optimisation in quite a fascinating way.
Optimisation at the beginning of the trajectory steers it towards minima with certain properties that go much further than just training accuracy.
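
To make the "lookup table" question concrete, here is a minimal sketch in plain Python. It is not a neural network and it is not taken from Stan's papers: it uses a 1-nearest-neighbour model as a stand-in for pure memorisation, and every name in it (`make_dataset`, `predict`, `accuracy`) is made up for this illustration. The labels are random, so there is nothing to learn; the model can still score perfectly on the data it has stored, while held-out accuracy should stay around chance.

```python
import random


def make_dataset(n, dim, rng):
    """Random inputs with binary labels drawn independently of the inputs."""
    xs = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    ys = [rng.randint(0, 1) for _ in range(n)]
    return xs, ys


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(train_xs, train_ys, x):
    """1-nearest-neighbour: return the label of the closest stored example."""
    nearest = min(range(len(train_xs)),
                  key=lambda i: squared_distance(train_xs[i], x))
    return train_ys[nearest]


def accuracy(train_xs, train_ys, xs, ys):
    hits = sum(predict(train_xs, train_ys, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_xs, train_ys = make_dataset(200, 5, rng)
test_xs, test_ys = make_dataset(500, 5, rng)

# Every training point is its own nearest neighbour, so the stored
# labels are reproduced exactly: pure memorisation.
train_acc = accuracy(train_xs, train_ys, train_xs, train_ys)

# Held-out labels are independent of the inputs, so memorised
# examples carry no information about them.
test_acc = accuracy(train_xs, train_ys, test_xs, test_ys)

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
```

The gap between the two numbers is the point: a perfect training score says nothing about generalisation, which is why validation performance and the properties of the minimum are the more interesting things to look at.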

As always we spoke about the future of AI and the role deep learning will play.

I hope you enjoy the show!

Don't forget to join the conversation on our new Discord channel. See you there!

 

References

 

Homepage of Stanisław Jastrzębski https://kudkudak.github.io/

A Closer Look at Memorization in Deep Networks https://arxiv.org/abs/1706.05394

Three Factors Influencing Minima in SGD https://arxiv.org/abs/1711.04623

Don't Decay the Learning Rate, Increase the Batch Size https://arxiv.org/abs/1711.00489

Stiffness: A New Perspective on Generalization in Neural Networks https://arxiv.org/abs/1901.09491

Read Full Post »

In this episode I speak with Jon Krohn, author of Deep Learning Illustrated, a book that makes deep learning easier to grasp.
We also talk about some important guidelines to take into account whenever you implement a deep learning model, how to deal with bias in machine learning used to match jobs to candidates, and the future of AI.

You can purchase the book from informit.com/dsathome with code DSATHOME and get 40% off books/eBooks and 60% off video training.

Read Full Post »

In this episode, I am with Aaron Gokaslan, computer vision researcher and AI Resident at Facebook AI Research. Aaron is the author of OpenGPT-2, an NLP model that parallels the much-discussed version OpenAI decided not to release because it was considered too accurate to be published.

We discuss image-to-image translation, the dangers of the GPT-2 model and the future of AI.
Moreover, Aaron provides some very interesting links and demos that will blow your mind!

Enjoy the show! 

References

Multimodal image to image translation (not all mentioned in the podcast but recommended by Aaron)

Pix2Pix: 
 
CycleGAN:
 

GANimorph

 

Read Full Post »

Reinforcement learning agents have been doing great at playing Atari video games, at Go with AlphaGo, at financial trading, and at language modeling. Let me tell you the real story here.
In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions. RL seems to work so well! What is wrong with it?

 

Are you a listener of the Data Science at Home podcast?
A reader of the Amethix Blog?
Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?
In any case, let’s stay in touch!
https://amethix.com/survey/

 

 

Read Full Post »
