duckdb.org

MacBook Neo: Can the Cheapest Mac Handle Big Data?

Big Data on the Cheapest MacBook

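We ran a set of big data workloads on the cheapest MacBook Neo and found that its cold-run performance was striking: it completed all queries in under a minute. Although its performance slipped in hot runs, it still beat mid-sized cloud instances. Even so, the MacBook Neo's disk I/O and memory limits make it unsuitable for day-to-day big data processing. If you only occasionally need to process data locally, DuckDB on the MacBook Neo is still up to the job.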

TL;DR: How does the latest entry-level MacBook perform on database workloads? We benchmarked it using ClickBench and TPC-DS SF300. We found that it could complete both workloads, sometimes with surprisingly good results.

Apple released the MacBook Neo today and there is no shortage of tech reviews explaining whether it's the right device for you if you are a student, a photographer or a writer. What they don't tell you is whether it fits into our Big Data on Your Laptop ethos. We wanted to answer this using a data-driven approach, so we went to the nearest Apple Store, picked one up and took it for a spin.

So what's in the box? Well, not much! If you buy this machine in the EU, there isn't even a charging brick included. All you get is the laptop and a braided USB-C cable. But you likely already have a few USB-C bricks lying around, so let's move on to the laptop itself!

The only part of the hardware specification that you can select is the disk: you can pick either 256 or 512 GB. As our mission is to deal with alleged “Big Data”, we picked the larger option, which brings the price to $700 in the US or €800 in the EU. The amount of memory is fixed at 8 GB. And while there is only a single CPU option, it is quite an interesting one: this laptop is powered by the 6-core Apple A18 Pro, originally built for the iPhone 16 Pro.

It turns out that we have already tested this phone under some unusual circumstances. Back in 2024, with DuckDB v1.2-dev, we found that the iPhone 16 Pro could complete all TPC-H queries at scale factor 100 in about 10 minutes when air-cooled and in less than 8 minutes while lying in a box of dry ice. The MacBook Neo should definitely be able to handle this workload – but maybe it can even handle a bit more. Cue the inevitable benchmarks!

For our first experiment, we used ClickBench, an analytical database benchmark. ClickBench has 43 queries that focus on aggregation and filtering operations. The operations run on a single wide table with 100M rows, which uses about 14 GB when serialized to Parquet and 75 GB when stored in CSV format.
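To put those sizes in perspective, here is a rough back-of-the-envelope calculation using only the figures quoted above (100M rows, about 14 GB as Parquet, about 75 GB as CSV, and the laptop's 8 GB of memory). It is a sketch, not a measurement: it assumes decimal gigabytes, and the variable and function names are ours, chosen for illustration.

```python
# Back-of-the-envelope sizing from the approximate figures quoted in the text.
ROWS = 100_000_000   # rows in the ClickBench table
PARQUET_GB = 14      # approximate size when serialized to Parquet
CSV_GB = 75          # approximate size when stored as CSV
RAM_GB = 8           # fixed memory of the MacBook Neo

GB = 10**9           # assumption: decimal gigabytes


def bytes_per_row(size_gb, rows=ROWS):
    """Average on-disk bytes per row for a dataset of the given size."""
    return size_gb * GB / rows


parquet_bytes_per_row = bytes_per_row(PARQUET_GB)
csv_bytes_per_row = bytes_per_row(CSV_GB)
csv_to_parquet_ratio = CSV_GB / PARQUET_GB    # how much larger the CSV is
parquet_to_ram_ratio = PARQUET_GB / RAM_GB    # dataset size relative to memory

print(f"Parquet: ~{parquet_bytes_per_row:.0f} bytes/row")
print(f"CSV:     ~{csv_bytes_per_row:.0f} bytes/row")
print(f"CSV is ~{csv_to_parquet_ratio:.1f}x larger than Parquet")
print(f"Parquet data is {parquet_to_ram_ratio:.2f}x the laptop's memory")
```

Under these assumptions, even the compressed Parquet file is 1.75 times larger than the machine's memory, so the benchmark cannot simply load the whole table into RAM and query it there.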