Archive for the 'computer science' Category

Our Sponsors

Quantum Metric

Stay off the naughty list this holiday season by reducing customer friction, increasing conversions, and personalizing the shopping experience. Want a sneak peek? Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE. This offer gives you 12-day access to our platform coupled with a bespoke insight report that will help you identify where customers are struggling or engaging in your digital product.

 

Amethix Technologies

Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

 

References

Paper https://deeptir.me/papers/posh-atc20.pdf

Code https://github.com/deeptir18/posh

Read Full Post »

It's time we get serious about replacing the CSV format with something that, guess what, has already been around for a long time.

In this episode I explain the good parts of CSV files and the not-so-good ones. It's time we evolve to something better.
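
To make one of the not-so-good parts concrete, here is a minimal sketch (not taken from the episode) of how a CSV round trip drops type information. It uses only Python's standard `csv` module; the sample records and the `roundtrip` helper are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample records; the column names are illustrative only.
rows = [
    {"id": 1, "price": 9.5, "in_stock": True, "note": "plain"},
    {"id": 2, "price": 12.0, "in_stock": False, "note": "has, comma"},
]


def roundtrip(records):
    """Write records to CSV text and read them back with the stdlib csv module."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    buffer.seek(0)
    return list(csv.DictReader(buffer))


restored = roundtrip(rows)

# Every value comes back as a string: the file carries no schema.
for original, copy in zip(rows, restored):
    for key in original:
        print(key, type(original[key]).__name__, "->", type(copy[key]).__name__)

# A classic pitfall: the string "False" is truthy.
print(bool(restored[1]["in_stock"]))
```

The good part shows up too: the quoted field containing a comma survives the round trip, and the text stays human-readable. The integers, floats, and booleans, however, all come back as strings, so every consumer has to guess or re-declare the schema.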

 


 

Read Full Post »

Do you want to know the latest in big data analytics frameworks? Have you ever heard of Apache Arrow? Rust? Ballista? In this episode I speak with Andy Grove, one of the main authors of Apache Arrow and the Ballista compute engine.
Andy explains some of the challenges he faced while designing the Arrow and Ballista memory models, and he describes some amazing solutions.

 

Our Sponsors

If building software is your passion, you’ll love ThoughtWorks Technology Podcast. It’s a podcast for techies by techies. Their team of experienced technologists take a deep dive into a tech topic that’s piqued their interest — it could be how machine learning is being used in astrophysics or maybe how to succeed at continuous delivery.

 


 

References

 

https://arrow.apache.org/

https://ballistacompute.org/

https://github.com/ballista-compute/ballista

Read Full Post »

In this episode I have a really interesting conversation with Karan Grewal, a member of the research staff at Numenta, where he investigates how biological principles of intelligence can be translated into silicon.
We speak about the Thousand Brains Theory and why neural networks forget.

 

 


 

Read Full Post »

In this episode I speak with Ritchie Vink, the author of Polars, a crate that is the fastest dataframe library at the time of speaking :) If you want to participate in an amazing Rust open source project, this is your chance to contribute to the official repository linked in the references.

 

References

https://github.com/ritchie46/polars

 

Read Full Post »


Pandas vs Rust (Ep. 144)

Pandas is the de facto standard for data loading and manipulation. Python is the de facto programming language for such operations. Rust is the underdog. Or is it?
In this episode I show you why that is no longer the case.

 

Our Sponsors

This episode is supported by Chapman’s Schmid College of Science and Technology, where master’s and PhD students join in cutting-edge research as they prepare to take the next big leap in their professional journey.
To learn more about the innovative tools and collaborative approach that distinguish the Chapman program in Computational and Data Sciences, visit chapman.edu/datascience

 


 

Useful Links

https://github.com/haixuanTao/Data-Manipulation-Rust-Pandas

https://github.com/ritchie46/polars

https://github.com/rust-ndarray/ndarray

 

Read Full Post »

In plain English, concurrent and parallel are synonyms. Not for a CPU. And definitely not for programmers. In this episode I summarize the ways to parallelize on different architectures and operating systems.

Rock-star data scientists must know how concurrency works and when to use it IMHO.
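
As a rough illustration of the distinction (a sketch, not material from the episode), the snippet below runs four simulated I/O-bound tasks first one after another and then concurrently on a thread pool. The `fake_io_task` function and its delay are hypothetical stand-ins for real work such as a network call. In standard CPython builds, threads do not execute Python bytecode in parallel because of the global interpreter lock, so this shows concurrency (overlapping waits) rather than parallelism; CPU-bound work would typically need multiple processes instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.05):
    """Hypothetical stand-in for an I/O-bound call: it mostly waits."""
    time.sleep(delay)  # sleeping releases the GIL, so other threads can run
    return task_id * 2


def run_sequential(task_ids):
    # One task at a time: total time is roughly the sum of all delays.
    return [fake_io_task(t) for t in task_ids]


def run_concurrent(task_ids, workers=4):
    # Tasks overlap while each one waits: total time is roughly one delay.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_io_task, task_ids))


def timed(fn, *args):
    """Return the function result together with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


task_ids = [1, 2, 3, 4]
seq_result, seq_time = timed(run_sequential, task_ids)
con_result, con_time = timed(run_concurrent, task_ids)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both runs return the same results in the same order; only the elapsed time differs, because the waits overlap in the concurrent version.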

 


 

Useful Links

http://web.mit.edu/6.005/www/fa14/classes/17-concurrency/

https://doc.rust-lang.org/book/ch16-00-concurrency.html

https://urban-institute.medium.com/using-multiprocessing-to-make-python-code-faster-23ea5ef996ba

 

Read Full Post »


In this episode I take inspiration from Paul Done's presentation, The Six Principles for Building Robust Yet Flexible Shared Data Applications, and show how powerful a language Rust is while still retaining the flexibility of less strict languages.

 


 

Read Full Post »

In this episode I explain the basics of computer architecture and introduce some features of the Apple M1.

Is it good for Machine Learning tasks?

 


Read Full Post »

In this episode I speak with Daniel McKenna about Rust, machine learning and artificial intelligence.

You can find Daniel from

 

Don't forget to join me in our Discord channel, where we talk about all things data science.

Subscribe to the official Newsletter and never miss an episode

Read Full Post »

Let's finish this year with an amazing episode about scaling ML with clusters and GPUs. As a continuation of sorts of Episode 112, I have a terrific conversation with Aaron Richter from Saturn Cloud about, well, making ML faster and scaling it to massive infrastructure.

Aaron can be reached on his website https://rikturr.com and on Twitter @rikturr

 

Our Sponsor

Saturn Cloud is a data science and machine learning platform for scalable Python analytics. Users can jump into cloud-based Jupyter and Dask to scale Python for big data using the libraries they know and love, while leveraging Docker and Kubernetes so that work is reproducible, shareable, and ready for production.

Try Saturn Cloud for free at https://saturncloud.io 

Twitter: @saturn_cloud

 

 

Read Full Post »

Our Links

Come join me in our Discord channel, where we talk about all things data science.

Subscribe to the official Newsletter and never miss an episode

Follow me on Twitch during my live coding sessions usually in Rust and Python

Our Sponsors

  • ProtonVPN offers a simple and trusted solution to protect your internet connection and access blocked or restricted websites. All of ProtonMail and ProtonVPN’s apps are open source and have been inspected by cybersecurity experts, and Proton is based in Switzerland, home to some of the world’s strongest privacy laws.

References

  1. https://data-apis.org/blog/announcing_the_consortium
  2. https://data-apis.github.io/array-api/latest/
  3. https://github.com/data-apis/python-record-api

Read Full Post »

