Archive for October 2019

Some of the most powerful NLP models like BERT and GPT-2 have one thing in common: they all use the transformer architecture.
Such architecture is built on top of another important concept already known to the community: self-attention.
In this episode I explain what these mechanisms are, how they work and why they are so powerful.

Don't forget to subscribe to our Newsletter or join the discussion on our Discord server



Read Full Post »

Join the discussion on our Discord server


In this episode, I am with Aaron Gokaslan, computer vision researcher, AI Resident at Facebook AI Research. Aaron is the author of OpenGPT-2, a parallel NLP model to the most discussed version that OpenAI decided not to release because too accurate to be published.

We discuss about image-to-image translation, the dangers of the GPT-2 model and the future of AI.
Aaron provides some very interesting links and demos that will blow your mind!

Enjoy the show! 


Multimodal image to image translation (not all mentioned in the podcast but recommended by Aaron)




Read Full Post »

Join the discussion on our Discord server


After reinforcement learning agents doing great at playing Atari video games, Alpha Go, doing financial trading, dealing with language modeling, let me tell you the real story here.
In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions. RL seems to work so well! What is wrong with it?


Are you a listener of Data Science at Home podcast?
A reader of the Amethix Blog? 
Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?
In any case let’s stay in touch!




Read Full Post »

Join the discussion on our Discord server


In this episode I have an amazing conversation with Jimmy Soni and Rob Goodman, authors of “A mind at play”, a book entirely dedicated to the life and achievements of Claude Shannon. Claude Shannon does not need any introduction. But for those who need a refresh, Shannon is the inventor of the information age

Have you heard of binary code, entropy in information theory, data compression theory (the stuff behind mp3, mpg, zip, etc.), error correcting codes (the stuff that makes your RAM work well), n-grams, block ciphers, the beta distribution, the uncertainty coefficient?

All that stuff has been invented by Claude Shannon :) 

Claude's papers:
A mind at play (book links): 

Read Full Post »

Join the discussion on our Discord server

As ML plays a more and more relevant role in many domains of everyday life, it’s quite obvious to see more and more attacks to ML systems. In this episode we talk about the most popular attacks against machine learning systems and some mitigations designed by researchers Ambra Demontis and Marco Melis, from the University of Cagliari (Italy). The guests are also the authors of SecML, an open-source Python library for the security evaluation of Machine Learning (ML) algorithms. Both Ambra and Marco are members of research group PRAlab, under the supervision of Prof. Fabio Roli.

SecML Contributors

Marco Melis (Ph.D Student, Project Maintainer,
Ambra Demontis (Postdoc, 
Maura Pintor (Ph.D Student,
Battista Biggio (Assistant Professor,


SecML: an open-source Python library for the security evaluation of Machine Learning (ML) algorithms

Demontis et al., “Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks,” presented at the 28th USENIX Security Symposium (USENIX Security 19), 2019, pp. 321–338.

W. Koh and P. Liang, “Understanding Black-box Predictions via Influence Functions,” in International Conference on Machine Learning (ICML), 2017.

Melis, A. Demontis, B. Biggio, G. Brown, G. Fumera, and F. Roli, “Is Deep Learning Safe for Robot Vision? Adversarial Examples Against the iCub Humanoid,” in 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), 2017, pp. 751–759.

Biggio and F. Roli, “Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning,” Pattern Recognition, vol. 84, pp. 317–331, 2018.

Biggio et al., “Evasion attacks against machine learning at test time,” in Machine Learning and Knowledge Discovery in Databases (ECML PKDD), Part III, 2013, vol. 8190, pp. 387–402.

Biggio, B. Nelson, and P. Laskov, “Poisoning attacks against support vector machines,” in 29th Int’l Conf. on Machine Learning, 2012, pp. 1807–1814.

Dalvi, P. Domingos, Mausam, S. Sanghai, and D. Verma, “Adversarial classification,” in Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Seattle, 2004, pp. 99–108.

Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. "Axiomatic attribution for deep networks." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017. 

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "Model-agnostic interpretability of machine learning." arXiv preprint arXiv:1606.05386 (2016).

Guo, Wenbo, et al. "Lemna: Explaining deep learning based security applications." Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2018.

Bach, Sebastian, et al. "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation." PloS one 10.7 (2015): E0130140. 

Read Full Post »

Play this podcast on Podbean App