Data Science at Home
Episodes
Tuesday Jan 30, 2024
Is SQream the fastest big data platform? (Ep. 250)
Join us in a dynamic conversation with Yori Lavi, Field CTO at SQream, as we unravel the data analytics landscape. From debunking the data lakehouse hype to SQream's GPU-based magic, discover how extreme data challenges are met with agility. Yori shares success stories, insights into SQream's petabyte-scale capabilities, and a roadmap to breaking down organizational bottlenecks in data science. Dive into the future of data analytics with SQream's commitment to innovation, leaving legacy formats behind and leading the charge in large-scale, cost-effective data projects. Tune in for a dose of GPU-powered revolution!
References
SQream - GPU-based Big Data Platform
Patents Assigned to SQREAM TECHNOLOGIES LTD
Monday Jun 13, 2022
Online learning is better than batch, right? Wrong! (Ep. 200)
In this episode I speak about online learning systems and why blindly choosing such a paradigm can lead to very unpredictable and expensive outcomes. Also in this episode, I have to deal with an intruder :)
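To make the contrast concrete, here is a minimal sketch of the two paradigms using scikit-learn's SGDClassifier (my own illustration, not something discussed in the episode): a batch model fitted once on the full dataset versus an online model updated incrementally as mini-batches arrive.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Batch paradigm: the whole dataset is available up front and fitted once.
batch_model = SGDClassifier(random_state=0)
batch_model.fit(X, y)

# Online paradigm: samples arrive in small chunks and the model is updated
# incrementally; any drift or noise in the stream immediately moves the model.
online_model = SGDClassifier(random_state=0)
for i in range(0, len(X), 50):
    online_model.partial_fit(X[i:i + 50], y[i:i + 50], classes=np.array([0, 1]))

print("batch accuracy: ", batch_model.score(X, y))
print("online accuracy:", online_model.score(X, y))
```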
Links
Birman, K.; Joseph, T. (1987). "Exploiting virtual synchrony in distributed systems". Proceedings of the Eleventh ACM Symposium on Operating Systems Principles - SOSP '87. pp. 123–138. doi:10.1145/41457.37515. ISBN 089791242X. S2CID 7739589.
Tuesday Jan 25, 2022
Embedded Machine Learning: Part 4 - Machine Learning Compilers (Ep. 185)
In this episode I speak about machine learning compilers, the most important tools for bridging the gap between high-level frontends, ML backends, and hardware target architectures.
There are several compilers one can choose from. Before picking one, let's get familiar with what a compiler is supposed to do.
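As a rough, hands-on illustration of that frontend-to-target pipeline, here is a sketch using the Apache TVM Relay API linked below; the model file and input shape are hypothetical, and this is not a recipe from the episode.

```python
# Illustrative sketch: compile an ONNX model for a CPU target with Apache TVM.
import onnx
import tvm
from tvm import relay

onnx_model = onnx.load("model.onnx")              # hypothetical model file
shape_dict = {"input": (1, 3, 224, 224)}          # assumed input name and shape

# Frontend: import the high-level graph into TVM's Relay IR.
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Backend: lower and optimize the IR for a specific hardware target.
target = tvm.target.Target("llvm")                # e.g. "cuda" for NVIDIA GPUs
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# The compiled module can then be loaded with TVM's graph executor for inference.
```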
Enjoy the episode!
Chat with me
Join our Discord community chat to discuss the show, suggest new episodes, and chat with other listeners!
Sponsored by Amethix Technologies
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.
Links
Amethix Embedded Machine Learning
https://tvm.apache.org/
https://github.com/pytorch/glow
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html
Saturday Jan 15, 2022
Embedded Machine Learning: Part 2 (Ep. 183)
In Part 2 of Embedded Machine Learning, I speak about one important technique to prune a neural network and perform inference on small devices. This technique helps preserve most of the accuracy with a model that is orders of magnitude smaller.
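As a small, illustrative sketch of magnitude-based pruning (using PyTorch's pruning utilities; the episode and the Lottery Ticket paper below discuss the general idea, not this exact code):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small toy network standing in for the model we want to shrink.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Remove the 90% smallest-magnitude weights in each linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)
        prune.remove(module, "weight")  # make the sparsity permanent

total = sum(p.numel() for p in model.parameters())
nonzero = sum((p != 0).sum().item() for p in model.parameters())
print(f"non-zero parameters: {nonzero}/{total}")
```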
Enjoy the show!
References
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Friday Jun 04, 2021
True Machine Intelligence just like the human brain (Ep. 155)
In this episode I have a really interesting conversation with Karan Grewal, a member of the research staff at Numenta, where he investigates how biological principles of intelligence can be translated into silicon. We speak about the Thousand Brains Theory and why neural networks forget.
References
Main paper on the Thousand Brains Theory: https://www.frontiersin.org/articles/10.3389/fncir.2018.00121/full
Blog post on Thousand Brains Theory: https://numenta.com/blog/2019/01/16/the-thousand-brains-theory-of-intelligence/
GLOM paper by Geoff Hinton: https://arxiv.org/pdf/2102.12627.pdf
Why do neural networks forget? https://numenta.com/blog/2021/02/04/why-neural-networks-forget-and-lessons-from-the-brain
Tuesday Nov 03, 2020
Remove noise from data with deep learning (Ep.125)
Come join me in our Discord channel to talk about all things data science.
Follow me on Twitch during my live coding sessions, usually in Rust and Python.
Our Sponsors
ProtonMail is a secure and private email provider that protects your messages with end-to-end encryption and zero-access encryption so that, besides you, no one can access them.
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.
References
DeepInterpolation
Monday Nov 18, 2019
How to improve the stability of training a GAN (Ep. 88)
Generative Adversarial Networks, or GANs, are very powerful tools to generate data. However, training a GAN is not easy. More specifically, GANs suffer from three major issues: instability of the training procedure, mode collapse, and vanishing gradients.
In this episode I explain not only the most challenging issues one would encounter while designing and training Generative Adversarial Networks, but also some methods and architectures to mitigate them. In addition, I elucidate the three specific strategies that researchers are considering to improve the accuracy and reliability of GANs.
The most tedious issues of GANs
Convergence to equilibrium
A typical GAN is formed by at least two networks: a generator G and a discriminator D. The generator's task is to generate samples from random noise. In turn, the discriminator has to learn to distinguish fake samples from real ones. While it is theoretically possible for the generator and the discriminator to converge to a Nash equilibrium (at which both networks are in their optimal state), reaching such an equilibrium is not easy.
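In code, the two players can be as simple as a pair of small networks trained in alternation. Here is a minimal, purely illustrative PyTorch sketch on one-dimensional toy data (not code from the episode):

```python
import torch
import torch.nn as nn

latent_dim = 16

# Generator: maps random noise to (fake) one-dimensional samples.
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
# Discriminator: outputs the probability that a sample is real.
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0           # toy "real" distribution
    fake = G(torch.randn(64, latent_dim))

    # Discriminator step: push D(real) towards 1 and D(fake) towards 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator step: try to make the (just updated) discriminator say 1 on fakes.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
```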
Vanishing gradients
Moreover, a very accurate discriminator pushes the loss function towards lower and lower values. This, in turn, might cause the gradient to vanish and the entire network to stop learning completely.
Mode collapse
Another phenomenon that is easy to observe when dealing with GANs is mode collapse: the model becomes incapable of generating diverse samples. This, in turn, leads to generated data that are more and more similar to one another. Hence, the entire generated dataset ends up concentrated around a particular statistical value.
The solution
Researchers have considered several approaches to overcome these issues, experimenting with architectural changes, different loss functions, and game theory.
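As an example of the loss-function family of fixes, two widely used tweaks are one-sided label smoothing for the discriminator and the non-saturating loss for the generator. The sketch below is my own illustration of these general techniques, not a list taken from the episode:

```python
import torch
import torch.nn as nn

bce_logits = nn.BCEWithLogitsLoss()

def d_loss_smoothed(d_real_logits, d_fake_logits):
    # One-sided label smoothing: ask the discriminator to output ~0.9 on real
    # samples instead of 1.0, which keeps it from becoming overconfident.
    real_targets = torch.full_like(d_real_logits, 0.9)
    fake_targets = torch.zeros_like(d_fake_logits)
    return bce_logits(d_real_logits, real_targets) + bce_logits(d_fake_logits, fake_targets)

def g_loss_non_saturating(d_fake_logits):
    # Non-saturating generator loss: maximise log D(G(z)) instead of minimising
    # log(1 - D(G(z))), which mitigates vanishing gradients early in training
    # when the discriminator easily rejects the fakes.
    return bce_logits(d_fake_logits, torch.ones_like(d_fake_logits))
```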
Listen to the full episode to know more about the most effective strategies to build GANs that are reliable and robust. Don't forget to join the conversation on our new Discord channel. See you there!
Tuesday Nov 12, 2019
What happens to a neural network trained with random data?
Are massive neural networks just lookup tables or do they truly learn something?
Today’s episode will be about memorisation and generalisation in deep learning, with Stanislaw Jastrzębski from New York University.
Stan spent two summers as a visiting student with Prof. Yoshua Bengio and has been working on:
Understanding and improving how deep networks generalise
Representation Learning
Natural Language Processing
Computer Aided Drug Design
What makes deep learning unique?
I asked him a few questions to which I had been looking for an answer for a long time. For instance, what is deep learning bringing to the table that other methods don't, or are not capable of? Stan believes that the one thing that makes deep learning special is representation learning. All the other competing methods, be they kernel machines or random forests, do not have this capability. Moreover, optimisation (SGD) lies at the heart of representation learning, in the sense that it allows finding good representations.
What really improves the training quality of a neural network?
We discussed how the accuracy of neural networks depends pretty much on how good Stochastic Gradient Descent is at finding minima of the loss function. What would influence such minima? Stan's answer revealed that training-set accuracy, or the loss value, is actually not that interesting. It is relatively easy to overfit the data (i.e. achieve the lowest loss possible), provided a large enough network and a large enough computational budget. However, the shape of the minima and the performance on validation sets are, in a quite fascinating way, influenced by optimisation. Optimisation at the beginning of the trajectory steers the trajectory towards minima with certain properties that go much further than just training accuracy.
As always we spoke about the future of AI and the role deep learning will play.
I hope you enjoy the show!
Don't forget to join the conversation on our new Discord channel. See you there!
References
Homepage of Stanisław Jastrzębski https://kudkudak.github.io/
A Closer Look at Memorization in Deep Networks https://arxiv.org/abs/1706.05394
Three Factors Influencing Minima in SGD https://arxiv.org/abs/1711.04623
Don't Decay the Learning Rate, Increase the Batch Size https://arxiv.org/abs/1711.00489
Stiffness: A New Perspective on Generalization in Neural Networks https://arxiv.org/abs/1901.09491
Thursday Aug 29, 2019
[RB] Complex video analysis made easy with Videoflow (Ep. 75)
In this episode I am with Jadiel de Armas, senior software engineer at Disney and author of Videoflow, a Python framework that facilitates the quick development of complex video analysis applications and other series-processing based applications in a multiprocessing environment.
I have inspected the Videoflow repo on GitHub and some of the capabilities of this framework, and I must say it is really interesting. Jadiel is going to tell us a lot more than what you can read on GitHub.
References
Videoflow official GitHub repository https://github.com/videoflow/videoflow
Tuesday Aug 27, 2019
[RB] Validate neural networks without data with Dr. Charles Martin (Ep. 74)
In this episode, I am with Dr. Charles Martin from Calculation Consulting, a machine learning and data science consulting company based in San Francisco. We speak about the nuts and bolts of deep neural networks and some impressive findings about the way they work.
The questions that Charles answers in the show are essentially two:
Why is regularisation in deep learning seemingly quite different from regularisation in other areas of ML?
How can we dominate DNNs in a theoretically principled way?
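For listeners who want to try the ideas themselves, the WeightWatcher tool referenced below can be run on a trained model without any test data. A minimal sketch, assuming the weightwatcher package from PyPI and a pretrained torchvision model (both my assumptions, not prescriptions from the episode):

```python
import weightwatcher as ww
import torchvision.models as models

# Analyze the weight matrices of a pretrained network without any data:
# WeightWatcher studies the eigenvalue spectra of each layer's weights.
model = models.resnet18(weights="IMAGENET1K_V1")  # older torchvision: pretrained=True
watcher = ww.WeightWatcher(model=model)
details = watcher.analyze()           # per-layer metrics as a pandas DataFrame
summary = watcher.get_summary(details)
print(summary)                        # e.g. average power-law exponent alpha
```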
References
The WeightWatcher tool for predicting the accuracy of Deep Neural Networks https://github.com/CalculatedContent/WeightWatcher
Slack channel https://weightwatcherai.slack.com/
Dr. Charles Martin's blog http://calculatedcontent.com and YouTube channel https://www.youtube.com/c/calculationconsulting
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning - Charles H. Martin, Michael W. Mahoney