optimisation | Data Science at Home

Episodes

Monday Dec 16, 2024

Autonomous Weapons and AI Warfare (Ep. 275)

AI is revolutionizing the military with autonomous drones, surveillance tech, and decision-making systems. But could these innovations spark the next global conflict? In this episode of Data Science at Home, we expose the cutting-edge tech reshaping defense—and the chilling ethical questions that follow. Don’t miss this deep dive into the AI arms race!
🎧 LISTEN / SUBSCRIBE TO THE PODCAST
Apple Podcasts
Podbean Podcasts
Player FM
Chapters
00:00 - Intro
01:54 - Autonomous Vehicles
03:11 - Surveillance And Reconnaissance
04:15 - Predictive Analysis
05:57 - Decision Support System
08:24 - Real World Examples
10:42 - Ethical And Strategic Considerations
12:25 - International Regulation
13:21 - Conclusion
14:50 - Outro
✨ Connect with us!
🎥 YouTube: https://www.youtube.com/@DataScienceatHome
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.
🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: Francesco Gad
📷 Instagram: https://www.instagram.com/datascienceathome/
📘 Facebook: https://www.facebook.com/datascienceAH
💼 LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
💬 Discord Channel: https://discord.gg/4UNKGf3
NEW TO DATA SCIENCE AT HOME?
Welcome! Data Science at Home explores the latest in AI, data science, and machine learning. Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions. Learn more at https://datascienceathome.com.
📫 SEND US MAIL!
We love hearing from you! Send us mail at: hello@datascienceathome.com
Don’t forget to like, subscribe, and hit the 🔔 for updates on the latest in AI and data science!
#DataScienceAtHome #ArtificialIntelligence #AI #MilitaryTechnology #AutonomousDrones #SurveillanceTech #AIArmsRace #DataScience #DefenseInnovation #EthicsInAI #GlobalConflict #PredictiveAnalysis #AIInWarfare #TechnologyAndEthics #AIRevolution #MachineLearning

Wednesday Nov 13, 2024

AI vs. The Planet: The Energy Crisis Behind the Chatbot Boom (Ep. 271)

In this episode of Data Science at Home, we dive into the hidden costs of AI’s rapid growth, specifically its massive energy consumption. With tools like ChatGPT reaching 200 million weekly active users, the environmental impact of AI is becoming impossible to ignore. Every query, every training session, and every breakthrough comes with a price in kilowatt-hours, raising questions about AI’s sustainability.

Join us as we uncover the staggering figures behind AI's energy demands and explore practical solutions for the future. From efficiency-focused algorithms and specialized hardware to decentralized learning, this episode examines how we can balance AI’s advancements with our planet's limits. Discover what steps we can take to harness the power of AI responsibly!

Check our new YouTube channel at https://www.youtube.com/@DataScienceatHome

Chapters
00:00 - Intro
01:25 - Findings on Summary Statistics
05:15 - Energy Required To Query GPT
07:20 - Energy Efficiency In Blockchain
10:41 - Efficiency-Focused Algorithms
14:02 - Hardware Optimization
17:31 - Decentralized Learning
18:38 - Edge Computing with Local Inference
19:46 - Distributed Architectures
21:46 - Outro

#AIandEnergy #AIEnergyConsumption #SustainableAI #AIandEnvironment #DataScience #EfficientAI #DecentralizedLearning #GreenTech #EnergyEfficiency #MachineLearning #FutureOfAI #EcoFriendlyAI #FrancescoFrag #DataScienceAtHome #ResponsibleAI #EnvironmentalImpact

Tuesday Jan 30, 2024

Is SQream the fastest big data platform? (Ep. 250)

Join us in a dynamic conversation with Yori Lavi, Field CTO at SQream, as we unravel the data analytics landscape. From debunking the data lakehouse hype to SQream's GPU-based magic, discover how extreme data challenges are met with agility. Yori shares success stories, insights into SQream's petabyte-scale capabilities, and a roadmap to breaking down organizational bottlenecks in data science. Dive into the future of data analytics with SQream's commitment to innovation, leaving legacy formats behind and leading the charge in large-scale, cost-effective data projects. Tune in for a dose of GPU-powered revolution!

References
SQream - GPU-based Big Data Platform
Patents Assigned to SQREAM TECHNOLOGIES LTD

Monday Jun 13, 2022

Online learning is better than batch, right? Wrong! (Ep. 200)

In this episode I speak about online learning systems and why blindly choosing such a paradigm can lead to very unpredictable and expensive outcomes.
Also in this episode, I have to deal with an intruder :)

Links
Birman, K.; Joseph, T. (1987). "Exploiting virtual synchrony in distributed systems". Proceedings of the Eleventh ACM Symposium on Operating Systems Principles - SOSP '87. pp. 123–138. doi:10.1145/41457.37515. ISBN 089791242X. S2CID 7739589.

Tuesday Jan 25, 2022

Embedded Machine Learning: Part 4 - Machine Learning Compilers (Ep. 185)

In this episode I speak about machine learning compilers, the most important tools to bridge the gap between high-level frontends, ML backends and hardware target architectures.
There are several compilers to choose from. Before picking one, let's get familiar with what a compiler is supposed to do.
Enjoy the episode!

Chat with me
Join us on Discord community chat to discuss the show, suggest new episodes and chat with other listeners!

Sponsored by Amethix Technologies
Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics and energy. Amethix provide solutions to collect and secure data with higher transparency and disintermediation, and build the statistical models that will support your business.

Links
Amethix Embedded Machine Learning
https://tvm.apache.org/
https://github.com/pytorch/glow
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html

Saturday Jan 15, 2022

Embedded Machine Learning: Part 2 (Ep. 183)

In Part 2 of Embedded Machine Learning, I speak about one important technique to prune a neural network and perform inference on small devices. Such a technique helps preserve most of the accuracy with a model that is orders of magnitude smaller.
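As a rough illustration of what pruning means in practice, here is a minimal sketch of magnitude-based pruning, which zeroes out the weights with the smallest absolute values. This is an assumption on my part: the show notes do not say which criterion the episode covers, and the function name `magnitude_prune` and the toy weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper for illustration only: real pruning usually works
    per layer on tensors and is followed by retraining.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    # The mask records which weights survive (1) and which are removed (0).
    mask = [0 if i in to_zero else 1 for i in range(len(weights))]
    pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned, mask


# Toy weights, made up for the example.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned, mask = magnitude_prune(weights, sparsity=0.5)
print(mask)
print(pruned)
```

Keeping the mask around matters because pruned weights are normally held at zero during any further training, so the model stays sparse.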
Enjoy the show!

References
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

Friday Jun 04, 2021

True Machine Intelligence just like the human brain (Ep. 155)

In this episode I have a really interesting conversation with Karan Grewal, member of the research staff at Numenta, where he investigates how biological principles of intelligence can be translated into silicon.
We speak about the thousand brains theory and why neural networks forget.

References
Main paper on the Thousand Brains Theory: https://www.frontiersin.org/articles/10.3389/fncir.2018.00121/full
Blog post on Thousand Brains Theory: https://numenta.com/blog/2019/01/16/the-thousand-brains-theory-of-intelligence/
GLOM paper by Geoff Hinton: https://arxiv.org/pdf/2102.12627.pdf
Why neural networks forget: https://numenta.com/blog/2021/02/04/why-neural-networks-forget-and-lessons-from-the-brain

Tuesday Nov 03, 2020

Remove noise from data with deep learning (Ep. 125)

Come join me in our Discord channel to talk about all things data science.
Follow me on Twitch during my live coding sessions, usually in Rust and Python.

Our Sponsors
ProtonMail is a secure and private email provider that protects your messages with end-to-end encryption and zero-access encryption, so that no one besides you can access them.
Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics and energy. Amethix provide solutions to collect and secure data with higher transparency and disintermediation, and build the statistical models that will support your business.

References
DeepInterpolation

Monday Nov 18, 2019

How to improve the stability of training a GAN (Ep. 88)

Generative Adversarial Networks, or GANs, are very powerful tools to generate data. However, training a GAN is not easy. More specifically, GANs suffer from three major issues: instability of the training procedure, mode collapse and vanishing gradients.

In this episode I explain the most challenging issues one encounters while designing and training Generative Adversarial Networks, along with some methods and architectures to mitigate them. In addition, I elucidate the three specific strategies that researchers are considering to improve the accuracy and the reliability of GANs.

The most tedious issues of GANs

Convergence to equilibrium

A typical GAN is formed by at least two networks: a generator G and a discriminator D. The generator's task is to generate samples from random noise. In turn, the discriminator has to learn to distinguish fake samples from real ones. While it is theoretically possible that generators and discriminators converge to a Nash Equilibrium (at which both networks are in their optimal state), reaching such equilibrium is not easy.
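To see why reaching an equilibrium can be hard, consider a toy two-player game that stands in for the generator and discriminator. This is a sketch under strong assumptions and not a GAN: one player minimises f(x, y) = x * y, the other maximises it, and the only equilibrium is (0, 0). The function name and all parameter values are hypothetical.

```python
import math


def simultaneous_gradient_steps(x, y, lr, steps):
    """Play min_x max_y x*y with simultaneous gradient updates.

    Returns the distance from the equilibrium (0, 0) after every step.
    """
    distances = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x = y  # d(x*y)/dx
        grad_y = x  # d(x*y)/dy
        # x descends on f while y ascends on f, both using the old values.
        x, y = x - lr * grad_x, y + lr * grad_y
        distances.append(math.hypot(x, y))
    return distances


distances = simultaneous_gradient_steps(x=1.0, y=1.0, lr=0.1, steps=100)
print(f"start: {distances[0]:.3f}, end: {distances[-1]:.3f}")
```

In this toy game each simultaneous update multiplies the squared distance from the equilibrium by (1 + lr^2), so the players spiral outwards instead of settling. Real GAN dynamics are far more complex, but the example shows that plain gradient updates do not automatically lead two competing players to an equilibrium.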

Vanishing gradients

Moreover, a very accurate discriminator would push the loss function towards lower and lower values. This, in turn, might cause the gradient to vanish and the entire network to stop learning completely.
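The effect can be checked numerically on a single generated sample. The sketch below assumes the discriminator outputs D = sigmoid(a) for a logit a, and compares the gradient of the original minimax generator loss log(1 - D) with the commonly used non-saturating alternative -log(D). The choice of these two losses is an assumption for illustration; the episode may discuss other formulations.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def saturating_grad(a):
    """Derivative of log(1 - sigmoid(a)) with respect to the logit a."""
    return -sigmoid(a)


def non_saturating_grad(a):
    """Derivative of -log(sigmoid(a)) with respect to the logit a."""
    return -(1.0 - sigmoid(a))


# A very negative logit means the discriminator confidently rejects the fake.
for logit in [0.0, -2.0, -5.0, -10.0]:
    print(
        f"logit={logit:6.1f}  "
        f"saturating={saturating_grad(logit):.6f}  "
        f"non-saturating={non_saturating_grad(logit):.6f}"
    )
```

As the discriminator becomes more confident, the gradient of the first loss shrinks towards zero, so the generator receives almost no signal, while the second loss keeps a gradient close to -1.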

Mode collapse

Another phenomenon that is easy to observe when dealing with GANs is mode collapse, that is, the inability of the model to generate diverse samples. This, in turn, leads to generated data that are more and more similar to the previous ones. Hence, the entire generated dataset ends up concentrated around a particular statistical value.
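One simple way to make this description concrete is to count how many modes of the real data are covered by generated samples. The sketch below is a hypothetical diagnostic on one-dimensional data: the helper `covered_modes`, the mode centres, the radius and both sample lists are made up for illustration.

```python
def covered_modes(samples, mode_centres, radius):
    """Return the indices of the modes that have at least one sample nearby."""
    covered = set()
    for s in samples:
        for i, centre in enumerate(mode_centres):
            if abs(s - centre) <= radius:
                covered.add(i)
    return covered


# Hypothetical real data with three modes.
mode_centres = [-4.0, 0.0, 4.0]

# Made-up generator outputs: one diverse set, one collapsed set.
healthy_samples = [-4.1, -3.9, 0.2, -0.1, 3.8, 4.3]
collapsed_samples = [0.1, -0.05, 0.02, 0.08, -0.1, 0.0]

print(sorted(covered_modes(healthy_samples, mode_centres, radius=0.5)))
print(sorted(covered_modes(collapsed_samples, mode_centres, radius=0.5)))
```

A collapsed generator covers only one of the three modes here, even though each individual sample may look perfectly realistic.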

The solution

Researchers have taken into consideration several approaches to overcome such issues. They have been playing with architectural changes, different loss functions and game theory.

Listen to the full episode to know more about the most effective strategies to build GANs that are reliable and robust. Don't forget to join the conversation on our new Discord channel. See you there!

Tuesday Nov 12, 2019

What if I train a neural network with random data? (with Stanisław Jastrzębski) (Ep. 87)

What happens to a neural network trained with random data?
Are massive neural networks just lookup tables or do they truly learn something?
Today’s episode will be about memorisation and generalisation in deep learning, with Stanisław Jastrzębski from New York University.
Stan spent two summers as a visiting student with Prof. Yoshua Bengio and has been working on:
Understanding and improving how deep networks generalise
Representation Learning
Natural Language Processing
Computer Aided Drug Design

What makes deep learning unique?
I asked him a few questions I had long been looking for answers to. For instance, what does deep learning bring to the table that other methods don’t or cannot? Stan believes that the one thing that makes deep learning special is representation learning. All the other competing methods, be it kernel machines or random forests, do not have this capability. Moreover, optimisation (SGD) lies at the heart of representation learning in the sense that it allows finding good representations.

What really improves the training quality of a neural network?
We discussed how the accuracy of a neural network depends largely on how good the Stochastic Gradient Descent method is at finding minima of the loss function. What would influence such minima? Stan's answer revealed that training set accuracy or loss value is actually not that interesting. It is relatively easy to overfit data (i.e. achieve the lowest loss possible), given a large enough network and a large enough computational budget. However, the shape of the minima and the performance on validation sets are influenced by optimisation in a quite fascinating way. Optimisation at the beginning of the trajectory steers it towards minima with certain properties that go much further than just training accuracy.
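The point that a perfect training score says little on its own can be illustrated with an extreme stand-in for an over-parameterised network: a literal lookup table trained on random labels. This is a sketch under stated assumptions, not an experiment from the episode; the class `LookupTableModel`, the dataset sizes and the seed are all hypothetical.

```python
import random


def make_random_dataset(n, rng):
    """Inputs are random floats; labels are coin flips unrelated to the inputs."""
    return [(rng.random(), rng.randint(0, 1)) for _ in range(n)]


class LookupTableModel:
    """Memorises every training pair and guesses 0 for anything unseen."""

    def fit(self, data):
        self.table = {x: y for x, y in data}
        return self

    def predict(self, x):
        return self.table.get(x, 0)


def accuracy(model, data):
    return sum(model.predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_random_dataset(500, rng)
test = make_random_dataset(500, rng)

model = LookupTableModel().fit(train)
train_acc = accuracy(model, train)
test_acc = accuracy(model, test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The table fits the training set perfectly, yet on fresh random data it can only guess, so its accuracy hovers around chance. Whether a real network behaves like this table or learns something that generalises is exactly the question the conversation addresses.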
As always we spoke about the future of AI and the role deep learning will play.
I hope you enjoy the show!
Don't forget to join the conversation on our new Discord channel. See you there!

References

Homepage of Stanisław Jastrzębski https://kudkudak.github.io/
A Closer Look at Memorization in Deep Networks https://arxiv.org/abs/1706.05394
Three Factors Influencing Minima in SGD https://arxiv.org/abs/1711.04623
Don't Decay the Learning Rate, Increase the Batch Size https://arxiv.org/abs/1711.00489
Stiffness: A New Perspective on Generalization in Neural Networks https://arxiv.org/abs/1901.09491