Giles Weaver
Data scientist. Domain expertise in maritime shipping (AIS). User of PySpark & Dask for over five years. Formerly a bioinformatician. Available for contract work.
Sessions
06-03
15:45
40min
Pandas 2, Dask or Polars? Quickly tackling larger data on a single machine
Giles Weaver, Ian Ozsvald
Pandas 2 brings new Arrow data types, faster calculations and better scalability. Dask scales Pandas across cores. Polars is a new competitor to Pandas designed around Arrow with native multicore support. Which should you choose for modern research workflows? We'll solve a "just about fits in RAM" data task using all three tools, talking through the pros and cons of each. You'll leave with a clear idea of whether Pandas 2, Dask or Polars is the tool for your team to invest in.
Salisbury