An Epic Journey with the Cloudfryer!
I’ve been a long-time follower of Metin Seylan, a Senior Software Developer known for sharing both entertaining and educational content. Today, I stumbled upon his tweet explaining his homegrown ‘Cloudfryer’ project. I was captivated by the insights he shared and felt compelled to pass them on to the community.
In this article, we will find answers to these questions:
Why did he start this project?
What hardware did he use?
What kind of cloud experience did he create at home?
How much investment did he make in the hardware?
Was all this effort worth it for him to avoid using the cloud?
So, let’s start.
He started learning TensorFlow and realized that his primary need wasn’t a massive GPU but rather an efficient data pipeline.
So, what did he need to do?
According to him:
Continuously scraping data from websites using bots.
Using Kafka Streams to preprocess this data, making it ideal for model training.
The resources this called for:
- Approximately 36 CPU cores
- Around 200GB of RAM
- 10Gbit network bandwidth
At first, he considered using a Raspberry Pi, but he decided against it.
His preference was the Lenovo ThinkCentre Tiny M720q, which he described as a real beast. According to him, it had several appealing features:
All components are upgradeable.
It supports 8th and 9th generation CPUs, including the i9-9900T (8 cores, 16 threads).
It can handle up to 64GB of RAM.
It has a PCI-E slot, which is a significant advantage.
It offers a variety of storage options, including NVMe and SATA slots.
Upgraded the CPU to an i5-9500T, offering 6 cores and 6 threads.
Increased RAM to 64GB with Corsair Vengeance SODIMM.
Opted for a Samsung 980 NVMe SSD for storage.
Enhanced cluster networking with a 10Gbit Intel X540-T2 NIC.
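The article does not say how many machines make up the cluster. As a rough sketch, the stated targets (about 36 cores and 200GB of RAM) can be divided by the per-machine specs after the upgrades above. The function name `nodes_needed` is made up for illustration, and the result is an estimate, not a figure from the thread.

```python
import math

# Targets quoted earlier in the article.
TARGET_CORES = 36
TARGET_RAM_GB = 200

# Per-machine specs after the upgrades listed above.
CORES_PER_NODE = 6     # i5-9500T: 6 cores, 6 threads
RAM_GB_PER_NODE = 64   # 64GB SODIMM


def nodes_needed(target_cores, target_ram_gb, cores_per_node, ram_gb_per_node):
    """Smallest node count that meets both the CPU and the RAM target."""
    by_cpu = math.ceil(target_cores / cores_per_node)
    by_ram = math.ceil(target_ram_gb / ram_gb_per_node)
    return max(by_cpu, by_ram)


if __name__ == "__main__":
    n = nodes_needed(TARGET_CORES, TARGET_RAM_GB, CORES_PER_NODE, RAM_GB_PER_NODE)
    print(f"Estimated machines needed: {n}")
```

With these numbers the CPU target is the binding constraint: RAM alone would need fewer machines than the core count does.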
Installation of all software components is automated with Ansible, so nothing is installed by hand. It’s entirely open source!
You can find the details and configurations on *his GitHub repository*.
The repository explains each choice in more detail. According to him, these are the software components he used and the reasons behind their selection:
- **Kubernetes**: A well-known choice for container orchestration.
- **Cilium**: Selected for Kubernetes networking and bare-metal load balancing support.
- **Helm**: Chosen as the Kubernetes package manager.
- **Prometheus and Grafana**: Utilized for cluster observability.
- **Longhorn**: Referred to as “SSD for K8s,” it allows distributed use of all disks within the cluster and automates backup and snapshot processes.
- **ArgoCD**: For GitOps.
These software choices were made based on their respective functionalities and suitability for the project’s requirements.
He is writing his web scraper bot in Kotlin and outlined his approach:
To build a lightweight and efficient native executable application, he relies on a combination of Spring Boot, Spring Native, and GraalVM.
For storage and pipeline management, he mentioned using:
Kafka for handling messaging.
Kafka Streams for stream processing.
These choices reflect his strategy for building an efficient and resource-friendly web scraper bot and pipeline.
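To make the “scrape, then preprocess for training” idea concrete, here is a minimal sketch of a stateless filter-and-map step. It is a plain-Python stand-in, not Kafka Streams and not his Kotlin code; the record fields (`url`, `html`, `text`) and the function names are hypothetical, and the HTML cleaning is deliberately crude.

```python
import re
from typing import Iterable, Iterator


def strip_html(raw: str) -> str:
    """Drop tags and collapse whitespace (a crude cleaner, for illustration only)."""
    text = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", text).strip()


def preprocess(pages: Iterable[dict]) -> Iterator[dict]:
    """Turn scraped pages into cleaned records, one at a time."""
    for page in pages:
        text = strip_html(page["html"])
        if not text:
            # Filter step: skip pages with no usable text.
            continue
        # Map step: emit a normalized record for the training side.
        yield {"url": page["url"], "text": text.lower()}


if __name__ == "__main__":
    scraped = [
        {"url": "https://example.com/a", "html": "<p>Hello  <b>World</b></p>"},
        {"url": "https://example.com/b", "html": "<br/>"},
    ]
    for record in preprocess(scraped):
        print(record)
```

In a stream processor the same two steps would run continuously on each incoming message rather than over a list in memory.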
Regarding the costs of the hardware components, here’s the breakdown:
- ThinkCentre Tiny: Approximately 65 Euro
- i5-9500T CPU: Ranging from 30 to 65 Euro
- 64GB RAM: Approximately 100 Euro
- X540-T2 NIC: Ranging from 30 to 70 Euro
- NVMe SSD: About 30 Euro
He highlights that it’s a budget-friendly setup, built mainly from second-hand parts. The total comes to roughly 255 to 330 Euro.
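The total can be checked by summing the low and high end of each price in the breakdown. This is only a quick sanity check of the arithmetic; the dictionary layout and the name `total_range` are illustrative.

```python
# (low, high) prices in euros, as quoted in the breakdown above.
PARTS_EUR = {
    "ThinkCentre Tiny": (65, 65),
    "i5-9500T CPU": (30, 65),
    "64GB RAM": (100, 100),
    "X540-T2 NIC": (30, 70),
    "NVMe SSD": (30, 30),
}


def total_range(parts):
    """Return the (lowest, highest) possible total for the listed parts."""
    low = sum(lo for lo, _ in parts.values())
    high = sum(hi for _, hi in parts.values())
    return low, high


if __name__ == "__main__":
    low, high = total_range(PARTS_EUR)
    print(f"Total: {low} to {high} Euro")
```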
When it comes to electricity consumption, he shared:
At idle, it consumes only about 66 watts.
Under heavy usage, it’s around 250 watts.
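To put those wattages in perspective, they can be converted into monthly energy use. The sketch below assumes the machine runs around the clock for a 30-day month. The electricity price is a placeholder chosen for illustration, not a number from the thread, so substitute your own tariff.

```python
HOURS_PER_DAY = 24


def monthly_kwh(watts, days=30):
    """Energy used in a month at a constant power draw, in kWh."""
    return watts * HOURS_PER_DAY * days / 1000


def monthly_cost(watts, price_per_kwh, days=30):
    """Monthly electricity cost at a constant power draw."""
    return monthly_kwh(watts, days) * price_per_kwh


if __name__ == "__main__":
    PRICE_PER_KWH = 0.30  # placeholder tariff, NOT from the article
    for label, watts in (("idle", 66), ("heavy load", 250)):
        kwh = monthly_kwh(watts)
        cost = monthly_cost(watts, PRICE_PER_KWH)
        print(f"{label}: {kwh:.1f} kWh/month, about {cost:.2f} at the assumed tariff")
```

Real usage will sit somewhere between the two figures, depending on how often the scrapers and the pipeline are busy.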
He explained why he opted for an on-premises setup instead of the cloud:
He doesn’t have critical, customer-focused work, so concerns about downtime and security are minimal.
His home has a symmetric internet connection with 1Gbit up/down speeds, which makes it feasible for him to run his own infrastructure effectively.
These factors contributed to his decision to choose an on-premises solution over a cloud-based one.
He shared his cost estimate for Google Cloud Platform (GCP). This is a good moment to mention that he is a Google Cloud Developer Expert. Don’t forget to visit his personal website. And here is the GCP bill:
As an AWS Community Builder, I want to share the cost of a similar infrastructure on AWS (Amazon Web Services). If you go with on-demand machines as Metin did (I assume), the cost is:
This comes to around 2,694 USD, slightly more expensive than GCP. But since the home setup involves an initial investment, a fairer comparison is Reserved Instances with a 3-year commitment. With those, the price goes down to:
That is 2,403 USD per month. With proper adjustments, this price can go down even further. It’s just a matter of tuning the plan to your use cases.
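The difference between the two AWS figures is easy to quantify. The sketch below assumes both numbers are monthly totals for the same configuration (the article states this explicitly only for the reserved price), and the function names are illustrative.

```python
ON_DEMAND_USD = 2694     # on-demand estimate quoted above
RESERVED_3Y_USD = 2403   # 3-year Reserved Instances estimate quoted above


def saving(on_demand, reserved):
    """Return the absolute saving and the saving as a percentage of on-demand."""
    diff = on_demand - reserved
    return diff, diff / on_demand * 100


def total_over_term(monthly, months=36):
    """Total spend over the commitment term at a fixed monthly price."""
    return monthly * months


if __name__ == "__main__":
    diff, pct = saving(ON_DEMAND_USD, RESERVED_3Y_USD)
    print(f"Reserved saves {diff} USD per month ({pct:.1f}%)")
    print(f"Reserved total over 3 years: {total_over_term(RESERVED_3Y_USD)} USD")
```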
By making informed plan choices, cloud bills can be significantly reduced. This underscores the invaluable role of consulting with cloud professionals who can guide you toward optimizing your infrastructure and keeping costs in check.
As you can see, for specific requirements, a meticulously crafted ‘Cloudfryer’ setup at home might just be the perfect recipe. However, the cloud offers scalability and convenience that are hard to match at home, and with the right plan it can be cost-efficient too, making it an enticing choice for many. So, as you embark on your own journey of technological exploration, remember that the answer lies in aligning your choice with your goals, resources, and ambitions. Whichever path you choose, whether it’s the cozy corner of your home or the boundless skies of the cloud, let it serve as the canvas upon which you craft your digital masterpiece.
To see the original Twitter thread (in Turkish), click here.