Data Science at Home
Episodes
Monday Sep 18, 2023
Attacking LLMs for fun and profit (Ep. 239)
As a continuation of Episode 238, I explain some effective and fun attacks to conduct against LLMs. Such attacks are even more effective against locally served models, which are hardly constrained by human feedback.
Have great fun and learn them responsibly.
References
https://www.jailbreakchat.com/
https://www.reddit.com/r/ChatGPT/comments/10tevu1/new_jailbreak_proudly_unveiling_the_tried_and/
https://arxiv.org/abs/2305.13860
Tuesday Oct 25, 2022
Private machine learning done right (Ep. 207)
There are many solutions to private machine learning. I am pretty confident when I say that the one we discuss in this episode is probably one of the most feasible and reliable.
I am with Daniel Huynh, CEO of Mithril Security and a graduate of Ecole Polytechnique with a specialisation in AI and data science. He worked at Microsoft on Privacy Enhancing Technologies under the office of the CTO of Microsoft France. He has written articles on homomorphic encryption, including the CKKS Explained series (https://blog.openmined.org/ckks-explained-part-1-simple-encoding-and-decoding/). He now focuses on Confidential Computing at Mithril Security and has written extensive articles on the topic: https://blog.mithrilsecurity.io/.
In this show we speak about confidential computing, SGX, and private machine learning.
References
Mithril Security: https://www.mithrilsecurity.io/
BlindAI GitHub: https://github.com/mithril-security/blindai
Use cases for BlindAI:
Deploy Transformers models with confidentiality: https://blog.mithrilsecurity.io/transformers-with-confidentiality/
Confidential medical image analysis with COVID-Net and BlindAI: https://blog.mithrilsecurity.io/confidential-covidnet-with-blindai/
Build a privacy-by-design voice assistant with BlindAI: https://blog.mithrilsecurity.io/privacy-voice-ai-with-blindai/
Confidential Computing Explained: https://blog.mithrilsecurity.io/confidential-computing-explained-part-1-introduction/
Confidential Computing Consortium: https://confidentialcomputing.io/
Confidential Computing White Papers: https://confidentialcomputing.io/white-papers-reports/
List of Intel processors with Intel SGX:
https://www.intel.com/content/www/us/en/support/articles/000028173/processors.html
https://github.com/ayeks/SGX-hardware
Azure Confidential Computing VMs with SGX:
Azure Docs: https://docs.microsoft.com/en-us/azure/confidential-computing/confidential-computing-enclaves
How to deploy BlindAI on Azure: https://docs.mithrilsecurity.io/getting-started/cloud-deployment/azure-dcsv3
Confidential Computing 101: https://www.youtube.com/watch?v=77U12Ss38Zc
Rust: https://www.rust-lang.org/
ONNX: https://github.com/onnx/onnx
Tract, a Rust inference engine for ONNX models: https://github.com/sonos/tract
Tuesday Dec 14, 2021
Capturing Data at the Edge (Ep. 180)
In this episode I speak with Manavalan Krishnan from Tsecond about capturing massive amounts of data at the edge with security and reliability in mind.
This episode is brought to you by Tsecond
The growth of data being created at static and moving edges across industries such as air travel, ocean and space exploration, shipping and freight, oil and gas, media, and more poses numerous challenges in capturing, processing, and analyzing large amounts of data.
and by Amethix Technologies
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.
References
https://tsecond.us/company/manavalan-krishnan/
Tuesday Jun 15, 2021
Time to take your data back with Tapmydata (Ep. 156)
In this episode I am with Gilbert Hill, head of strategy at https://tapmydata.com/
We speak about personal data, blockchain, and the ability to control and monetize your data with another simple yet effective app in the ecosystem.
References
https://tapmydata.com/
https://medium.com/@tholder/we-dont-want-your-data-pushing-boundaries-in-data-collection-and-end-to-end-encryption-for-apps-ebd1d5f79df5
Sunday Apr 11, 2021
You are the product [RB] (Ep. 147)
In this episode I am with George Hosu from Cerebralab, and we speak about how dangerous it is not to pay for the services you use and, as a consequence, how dangerous it is to let an algorithm decide what you like.
Our Sponsors
This episode is supported by Chapman’s Schmid College of Science and Technology, where master’s and PhD students join in cutting-edge research as they prepare to take the next big leap in their professional journey. To learn more about the innovative tools and collaborative approach that distinguish the Chapman program in Computational and Data Sciences, visit chapman.edu/datascience
If building software is your passion, you’ll love ThoughtWorks Technology Podcast. It’s a podcast for techies by techies. Their team of experienced technologists takes a deep dive into a tech topic that’s piqued their interest: it could be how machine learning is being used in astrophysics or maybe how to succeed at continuous delivery.
Links
https://cerebralab.com
https://www.eugenewei.com/blog/2019/2/19/status-as-a-service
Sunday Feb 07, 2021
What's up with WhatsApp? (Ep. 138)
Have you clicked the button? Accepted the new terms?
It's time we have a talk.
Saturday Dec 19, 2020
What is data ethics? (Ep. 133)
What is data ethics? In this episode I have an interesting chat with Denny Wong from FaqBot and Muna.
Our Sponsor
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.
References
Denny's Twitter profile
The data ethics awareness workshop for AI practitioners
Friday Dec 04, 2020
What happens to data transfer after Schrems II? (Ep. 131)
In this episode Adam Leon Smith, CTO of DragonFly and expert in data regulations, explains some of the consequences of Schrems II for data transfers from the EU to the US.
For very interesting references and a practical example, subscribe to our Newsletter
Friday Sep 04, 2020
Testing in machine learning: checking deep learning models (Ep. 118)
In this episode I speak with Adam Leon Smith, CTO at DragonFly and expert in testing strategies for software and machine learning.
We cover testing with deep learning (neuron coverage, threshold coverage, sign change coverage, layer coverage, etc.), combinatorial testing, and their practical aspects.
On September 15th there will be a live@Manning Rust conference. In one Rust-full day you will attend many talks about what's special about Rust, building high-performance web services or video games, WebAssembly, and much more. If you want to meet the tribe, tune in on September 15th to the live@Manning Rust conference.
Wednesday Aug 12, 2020
Why you care about homomorphic encryption (Ep. 116)
After deep learning, a new entry is about ready to go on stage. The usual journalists are warming up their keyboards for blogs, news feeds, tweets: in one word, hype. This time it's all about privacy and data confidentiality. The new words: homomorphic encryption.
Join and chat with us on the official Discord channel.
Sponsors
This episode is supported by Amethix Technologies.
Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.
References
Towards a Homomorphic Machine Learning Big Data Pipeline for the Financial Services Sector
IBM Fully Homomorphic Encryption Toolkit for Linux