Deep dive into pandas Copy-on-Write mode - part III

Explaining the migration path for Copy-on-Write


The introduction of Copy-on-Write (CoW) is a breaking change that will have some impact on existing pandas code. We will investigate how we can adapt our code to avoid errors once CoW is enabled by default. This is currently planned for the pandas …

What's new in pandas 2.1

The most interesting things about the new release

pandas 2.1 was released on August 30th, 2023. Let’s take a look at what this release introduces and how it will help us improve our pandas workloads. It includes a bunch of improvements and also a set of new …

High Level Query Optimization in Dask


Dask DataFrame doesn't currently optimize your code for you the way Spark or a SQL database would. This means that users waste a lot of computation. Let's look at a common example that looks OK at first glance but is actually pretty inefficient.

import dask.dataframe as dd

df = dd …