One of the most under-appreciated parts of software engineering is actually deploying your code. There is a lot of focus on building highly scalable data pipelines, but in the end your code has to be ‘magically’ transferred from a local machine to a deployable piece of pipeline in the cloud.
In a previous article I discussed building data pipelines in Scala & Spark and deploying them on Kubernetes, or at least deploying them to a local minikube setup for testing purposes.
Day 1 of the journey on building my first Scala GraalVM ZIO application
Continued from Day 0
Now that we have a basic project set up and somewhat building, we need to stabilise it a bit more, so we can spend all our future energy on actually writing the code. This means CI/CD, dockerising and versioning, but more importantly getting the binary stable on *nix and Mac environments. This means diving deeper into GraalVM and its available build tools. …
Day 0 of the journey on building my first Scala GraalVM ZIO application
Caveat: Normally I write posts about how I tackled a problem and present the solution on a silver platter. In this series I’d like to take you on a journey of exploration. I have no idea how fast this will progress or where it will end up. I will try to work on this at least one day a week. There are far better tutorials and write-ups for each of these tools (ZIO, Kafka, API, GraalVM), but I just want…
For each challenge there are many technology stacks that can provide the solution. I’m not claiming this approach is the holy grail of data processing; this is more the tale of my quest to combine these widely supported tools in a maintainable fashion.
From the outset I’ve always tried to generate as much configuration as possible, mainly because I’ve experienced how easy it is to drown in a sea of YAML files, conf files and incompatible versions across registries, repositories, CI/CD pipelines and deployments.
What I created was an sbt script that, when triggered, builds a fat JAR, which then gets wrapped in a Dockerfile…
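To make the idea concrete, here is a minimal sketch of such a build definition, assuming the sbt-assembly plugin; the project name, versions and merge rules are illustrative, not the author's actual setup:

```scala
// build.sbt — a sketch, not the actual script from the article.
// Assumes project/plugins.sbt contains something like:
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.5")

ThisBuild / scalaVersion := "2.13.12"
ThisBuild / version      := "0.1.0-SNAPSHOT" // single source of truth for image tags

lazy val root = (project in file("."))
  .settings(
    name := "data-pipeline", // hypothetical project name
    // Fold all dependencies into one fat JAR; discard conflicting META-INF entries.
    assembly / assemblyMergeStrategy := {
      case PathList("META-INF", _*) => MergeStrategy.discard
      case _                        => MergeStrategy.first
    },
    assembly / assemblyJarName := s"${name.value}-${version.value}.jar"
  )
```

Running `sbt assembly` then produces a single JAR under `target/scala-2.13/`, which a generated Dockerfile can `COPY` into a JRE base image, keeping the version string in one place.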
Recently a colleague at Datlinq asked me to help her with a data problem that seemed very straightforward at first glance.
She had purchased a small set of data from the Dutch chamber of commerce (Kamer van Koophandel: KvK) that contained roughly 50k small companies (5–20 FTE), which can be hard to find online.
She noticed that many of those companies share the same address, which makes sense, because a lot of those companies tend to cluster in business complexes.
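Detecting those clusters is essentially a group-by on the address field. A minimal sketch in plain Scala (the case class and sample companies are made up for illustration):

```scala
// Hypothetical record shape; the real KvK data has more fields.
case class Company(name: String, address: String)

object AddressClusters {
  // Group companies by address and keep only addresses shared by more than one.
  def clusters(companies: Seq[Company]): Map[String, Seq[Company]] =
    companies.groupBy(_.address).filter { case (_, cs) => cs.size > 1 }

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      Company("Acme BV",  "Stationsplein 1, Rotterdam"),
      Company("Beta BV",  "Stationsplein 1, Rotterdam"),
      Company("Gamma BV", "Dorpsstraat 5, Utrecht")
    )
    clusters(sample).foreach { case (addr, cs) =>
      println(s"$addr -> ${cs.map(_.name).mkString(", ")}")
    }
  }
}
```

The same `groupBy` shape translates directly to a Spark `groupBy("address")` once the dataset grows beyond what fits comfortably in memory.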
Since my childhood I’ve always been a coder. I got started with some GW-BASIC, but quickly moved to C and C++ during my high school years, though I never really considered it a possible future occupation. It was more of a fun hobby, and since my friends also did it, it didn’t seem that strange or special. Some liked football, some drawing, I liked programming.
In fact, I always thought I’d end up a veterinarian. Programming and hacking seemed more like a hobby to me, also because it came from and ended up in playing games (eg. …
Freelance Data & ML Engineer | husband + father of 2 | #Spark #Scala #BigData #ML #DeepLearning #Airflow #Kubernetes | Shodan Aikido