About efficiency


Intro

A few weeks back I needed to test working with the “parquet” file type. It turned out that, as much as I tried, I couldn’t get the RStudio in my Docker container to install the “arrow” package (it would always fail, but that’s beside the point for this particular post).

A side note about Parquet files

I hadn’t needed them before, I’ll admit. Working with databases was enough for me. But for EXCHANGING a rather big dataset (say millions of rows), you can forget about Excel (or CSV for that matter!)…

Well, the “.parquet” file type seems to work nicely. If you manage to get an environment with all the requirements (not hard, except maybe for a containerized RStudio Server on Apple’s ARM architecture, apparently…), installing the arrow package and working with parquet files is pretty straightforward. It is indeed FAST as hell for such a large number of entries, and the files stay rather SMALL in spite of said number of entries…

Anyhow, that’s not where I was going today…

Energy efficiency vs. Docker on Mac with M1

So, as I have mentioned before, I WAITED for Docker to be compatible with Apple’s M1 chip/SoC before buying a MacBook Air.

I like the Docker option a lot (it shows in many past entries of this blog…), but there was one thing I never really thought about until a recent finding…

See, if I leave my MacBook Air with M1 running the dockerized RStudio Server, then use a browser to access RStudio and basically work in the container… it turns out the battery gets drained MUCH faster. (I don’t know whether I was influenced by studying the “Green Computing” chapter of the HPC course I was taking right then… Probably :D)

I didn’t do much measuring, I’ll admit, but because of the need to test the arrow package mentioned above, I ended up installing RStudio Desktop DIRECTLY on macOS itself… Meaning I can work in R without Docker (I had my reasons not to before, and I still like the Docker way, I just happened to need an install on the bare machine this one time…).

What happened is, I worked for a few hours with RStudio Desktop, with no Docker running… And WHERE I USED to see Docker (meaning, in my case, RStudio/R scripts…) eat up 30% of the battery in a similar amount of time, RStudio Desktop on the Mac itself turned out to be MUCH EASIER ON THE BATTERY! I’m talking roughly (very roughly) 20%, or 1/5th, of what I was getting used to with the Docker setup, so something like 6% of the battery instead of 30%.
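For anyone wanting actual numbers instead of my gut feeling, here is a minimal sketch in Python of how the battery drain could be logged on macOS. It assumes the `pmset -g batt` command, which prints a line containing the current charge as a percentage. The function names (`parse_battery_percent`, `read_battery_percent`, `drain_per_hour`) are made up for illustration, and this is not something I ran for the comparison above.

```python
import re
import subprocess


def parse_battery_percent(pmset_output: str) -> int:
    """Extract the first 'NN%' value from `pmset -g batt` style output."""
    match = re.search(r"(\d{1,3})%", pmset_output)
    if match is None:
        raise ValueError("no battery percentage found in output")
    return int(match.group(1))


def read_battery_percent() -> int:
    """Ask macOS for the current charge (only works on a Mac)."""
    result = subprocess.run(
        ["pmset", "-g", "batt"], capture_output=True, text=True, check=True
    )
    return parse_battery_percent(result.stdout)


def drain_per_hour(start_percent: float, end_percent: float, hours: float) -> float:
    """Average battery drain in percentage points per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (start_percent - end_percent) / hours
```

The idea would be to call `read_battery_percent()` once before a work session and once after, with and without Docker, and compare the two `drain_per_hour` values for sessions of similar length and workload.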

Then the idea for a new test came up… Would the bare-metal version also be much faster…? I will have to run some tests for that, but that’s not for today, maybe next time 😉
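One way such a speed test could look: time the same R script once on the bare machine and once through the container, several times each, and compare the medians. Below is a minimal sketch in Python, not a finished benchmark. The helper names `time_command` and `summarize` are hypothetical, and so are the container name and script name in the note after the code.

```python
import statistics
import subprocess
import time


def time_command(cmd: list[str], runs: int = 5) -> list[float]:
    """Run `cmd` several times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are not timed
        subprocess.run(cmd, check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: list[float]) -> dict[str, float]:
    """Median and range, since single runs are noisy."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }
```

Assuming a script called bench.R and a container named rstudio (both placeholders), the two sides of the comparison would be something like `time_command(["Rscript", "bench.R"])` versus `time_command(["docker", "exec", "rstudio", "Rscript", "bench.R"])`. The `docker exec` call adds its own startup cost, so the script should run long enough for that not to dominate.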

Conclusions

Docker is great, I can’t say that enough. Sharing setups and compatibility (BUT FOR THE ARM arch :S) is great: it has helped me bring stuff to a small home Linux server and use the SAME setup on a Windows machine (with WSL2), and that was pretty much seamless. And it’s much lighter than a full-fledged VM setup.

But it’s still not perfect: there is an overhead, at least a noticeable one in terms of energy consumption on a MacBook Air M1, with vs. without Docker.

And no, I don’t have numbers, and I’m even comparing apples with oranges (the Docker setup runs RStudio Server, while the other is the RStudio Desktop client installed on the Mac… So there, it’s not even a fair comparison).

The other way to look at it? RStudio Desktop on Mac seems to be nicely optimized in terms of power consumption. I’ll admit: it’s not quite “scientific” of me to make such statements at this point… But at least that’s what it seems like.

Further tests shall come at some point later.