Archive for the 'computer science' Category

In this episode I make a non-exhaustive list of machine learning tools and frameworks written in Rust. Not all of them are mature enough for production environments, but I believe that community effort can change this very quickly.

To draw a comparison with the Python ecosystem, I cover frameworks for linear algebra (numpy), dataframes (pandas), off-the-shelf machine learning (scikit-learn), deep learning (tensorflow) and reinforcement learning (OpenAI Gym).

Rust is the language of the future.
Happy coding!
 

References

  1. BLAS (Basic Linear Algebra Subprograms): https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
  2. rust-dataframe (dataframes): https://github.com/nevi-me/rust-dataframe
  3. rustlearn (off-the-shelf machine learning): https://github.com/maciejkula/rustlearn
  4. rusty-machine (off-the-shelf machine learning): https://github.com/AtheMathmo/rusty-machine
  5. TensorFlow bindings (deep learning): https://lib.rs/crates/tensorflow
  6. Juice, machine learning for hackers (deep learning): https://lib.rs/crates/juice
  7. rsrl (reinforcement learning): https://lib.rs/crates/rsrl

Read Full Post »

In the 3rd episode of Rust and machine learning I speak with Alec Mocatta.
Alec is a professional programmer with over 20 years of experience who has been working at the intersection of distributed systems and data analytics. He's the founder of two startups in the distributed-systems space and the author of Amadeus, an open-source framework that encourages you to write clean and reusable code that works, regardless of data scale, locally or distributed across a cluster.

On June 24th only: LDN *Virtual* Talks June 2020 with Bippit (Alec speaking about Amadeus).

 

Read Full Post »

This is the first episode of a series about the Rust programming language and the role it can play in the machine learning field.

Rust is one of the most beautiful languages I have ever studied. I personally come from the C programming language, though for professional work in machine learning I had to switch to the loved and hated Python.

This episode clearly does not provide an exhaustive list of the benefits of Rust, nor of its capabilities. For that you can check the references and start getting familiar with what I think is going to be the language of the next 20 years.

 

Sponsored

This episode is supported by Pryml Technologies. Pryml offers secure and cost-effective data privacy solutions for your organisation: it generates a synthetic alternative to your data without disclosing your confidential data.

 


Read Full Post »

Why so much silence? Building a company! That's why :)
I am building Pryml, a platform that allows data scientists to build their applications on data they cannot access.
This is the first of a series of episodes in which I will speak about the technology and the challenges we are facing while we build it.

Happy listening and stay tuned!

Read Full Post »

Join the discussion on our Discord server

 

After reinforcement learning agents did so well at playing Atari video games, mastering Go (AlphaGo), trading on financial markets and modeling language, let me tell you the real story here.
In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions. RL seems to work so well! What is wrong with it?
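
To make the discussion concrete, here is a minimal sketch of what RL means in its simplest tabular form: an agent learning, by trial and error, to walk right along a toy corridor. Everything in it (environment, rewards, hyperparameters) is made up for illustration.

    # Minimal tabular Q-learning on a toy corridor: cells 0..4, reward 1
    # for reaching cell 4. Purely illustrative numbers throughout.
    import random

    N_STATES = 5                       # corridor cells 0..4
    ACTIONS = [-1, +1]                 # step left or step right
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount, exploration

    for _ in range(500):               # training episodes
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            s_next = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s_next == N_STATES - 1 else 0.0
            # Q-learning update: nudge q towards reward + discounted best next value
            best_next = max(q[(s_next, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next

    print(max(ACTIONS, key=lambda x: q[(0, x)]))  # learned first move: +1

Trivially easy on a toy problem like this one; the episode is about why the same recipe stops being easy outside of toy problems.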

 

Are you a listener of the Data Science at Home podcast?
A reader of the Amethix Blog?
Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?
In any case, let's stay in touch!
https://amethix.com/survey/

 

 


Read Full Post »

Join the discussion on our Discord server

 

In this episode I have an amazing conversation with Jimmy Soni and Rob Goodman, authors of “A Mind at Play”, a book entirely dedicated to the life and achievements of Claude Shannon. Claude Shannon does not need any introduction, but for those who need a refresher: Shannon is the inventor of the information age.

Have you heard of binary code, entropy in information theory, data compression theory (the stuff behind mp3, mpg, zip, etc.), error-correcting codes (the stuff that makes your RAM work well), n-grams, block ciphers, the beta distribution, or the uncertainty coefficient?

All that stuff was invented by Claude Shannon :)
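
As a tiny taste of Shannon's ideas, here is the entropy formula H = -Σ p·log2(p), the quantity that sets the limit for lossless data compression. The distributions below are made up for illustration.

    # Shannon entropy, in bits, of a discrete probability distribution.
    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))   # 1.0 bit: a fair coin
    print(entropy([0.9, 0.1]))   # ~0.47 bits: a biased coin compresses better
    print(entropy([0.25] * 4))   # 2.0 bits: four equally likely symbols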

 

Read Full Post »

Join the discussion on our Discord server

Scaling technology and scaling business processes are not the same. Since the beginning of enterprise technology, scaling software has been a difficult task to get right inside large organisations. When it comes to artificial intelligence and machine learning, it becomes vastly more complicated.

In this episode I propose a framework - in five pillars - for the business side of artificial intelligence.

 

Read Full Post »

Join the discussion on our Discord server

Training neural networks faster usually involves throwing more powerful GPUs at the problem. In this episode I explain an interesting method from a group of researchers at Google Brain who speed up training a different way: by reusing ("echoing") the outputs of the slow input pipeline, they keep the accelerator busy and make the training pipeline denser.
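
As a rough illustration of the idea (not the authors' code), here is a toy batch-level echo stage: each batch produced by a slow input pipeline is re-emitted echo_factor times, so the accelerator gets more training steps per expensive read/decode/augment pass. The paper also studies example-level echoing and shuffling the echo buffer, which this sketch omits.

    # Toy batch-level data echoing: repeat each upstream batch so a fast
    # training step is not starved by a slow input pipeline. Illustrative only.
    import itertools

    def slow_input_pipeline():
        # Stand-in for an expensive read/decode/augment stage.
        for i in itertools.count():
            yield f"batch-{i}"

    def echo(batches, echo_factor=2):
        # Re-emit each upstream batch echo_factor times.
        for batch in batches:
            for _ in range(echo_factor):
                yield batch

    for batch in itertools.islice(echo(slow_input_pipeline()), 6):
        print(batch)  # batch-0, batch-0, batch-1, batch-1, batch-2, batch-2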

Enjoy the show!

 

References

Faster Neural Network Training with Data Echoing
https://arxiv.org/abs/1907.05550

Read Full Post »

In this episode I am with Jadiel de Armas, senior software engineer at Disney and author of Videoflow, a Python framework that facilitates the quick development of complex video analysis applications and other series-processing-based applications in a multiprocessing environment.

I have inspected the videoflow repo on GitHub and some of the capabilities of this framework, and I must say it's really interesting. Jadiel is going to tell us a lot more than what you can read on GitHub.
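
To give a feel for the kind of pipeline Videoflow manages, here is the generic producer/processor/consumer pattern it builds on, using only the standard library. This is not Videoflow's actual API, just a bare-bones sketch of the pattern.

    # Generic multiprocessing pipeline: producer -> processor -> consumer.
    # Integers stand in for video frames; this is not Videoflow's API.
    from multiprocessing import Process, Queue

    def producer(out_q):
        for frame in range(10):       # a real producer would read a video stream
            out_q.put(frame)
        out_q.put(None)               # sentinel: end of stream

    def processor(in_q, out_q):
        while (frame := in_q.get()) is not None:
            out_q.put(frame * frame)  # a real stage might run a detector here
        out_q.put(None)

    def consumer(in_q):
        while (result := in_q.get()) is not None:
            print("got", result)      # a real stage might write annotations

    if __name__ == "__main__":
        q1, q2 = Queue(), Queue()
        stages = [Process(target=producer, args=(q1,)),
                  Process(target=processor, args=(q1, q2)),
                  Process(target=consumer, args=(q2,))]
        for p in stages:
            p.start()
        for p in stages:
            p.join()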

 

References

Videoflow official GitHub repository:
https://github.com/videoflow/videoflow

 

Read Full Post »

In this episode, I am with Dr. Charles Martin from Calculation Consulting, a machine learning and data science consulting company based in San Francisco. We speak about the nuts and bolts of deep neural networks and some impressive findings about the way they work.

The questions that Charles answers in the show are essentially two:

  1. Why is regularisation in deep learning seemingly quite different from regularisation in other areas of ML? (A minimal sketch of the classical baseline follows this list.)

  2. How can we dominate DNNs in a theoretically principled way?
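
As a reminder of what regularisation in "other areas of ML" looks like, here is the classical closed-form ridge (L2) solution on synthetic data. This is the textbook baseline, not anything specific to Charles' work, and the numbers are illustrative only.

    # Classical L2 regularisation (ridge regression) on synthetic data:
    # w = (X^T X + lam * I)^-1 X^T y
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    w_true = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
    y = X @ w_true + rng.normal(scale=0.1, size=100)

    lam = 1.0  # regularisation strength
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
    print(w_ridge.round(2))  # close to w_true, shrunk slightly towards zero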

 


Read Full Post »

Today I am with David Kopec, author of Classic Computer Science Problems in Python, published by Manning Publications.

His book deepens your knowledge of problem-solving techniques from the realm of computer science by challenging you with interesting and realistic scenarios, exercises and, of course, algorithms.
There are examples of the major topics any data scientist should be familiar with, such as search, clustering, graphs, and much more.
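
As a small taste of the book's flavour (this snippet is mine, not code from the book), here is one of the classic search problems it covers: generic binary search over a sorted sequence.

    # Classic problem: binary search over a sorted sequence.
    from typing import Optional, Sequence

    def binary_search(items: Sequence[int], key: int) -> Optional[int]:
        # Return the index of key in sorted items, or None if absent.
        low, high = 0, len(items) - 1
        while low <= high:
            mid = (low + high) // 2
            if items[mid] < key:
                low = mid + 1
            elif items[mid] > key:
                high = mid - 1
            else:
                return mid
        return None

    print(binary_search([1, 3, 5, 8, 13], 8))  # 3
    print(binary_search([1, 3, 5, 8, 13], 4))  # None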

Get the book from https://www.manning.com/books/classic-computer-science-problems-in-python and use coupon code poddatascienceathome19 to get a 40% discount.

 

References

Twitter: https://twitter.com/davekopec

GitHub: https://github.com/davecom

Website: classicproblems.com

Read Full Post »