
My first two years at Sanoma Media Finland

Two years ago, I joined Sanoma Media Finland seeking new challenges in product development and more meaningful work in media. During these two years, I have collaborated with exceptional colleagues, worked on complex technical challenges and learned a lot about efficient teamwork.

Fall 2023: Joining the team

By spring 2023, I had started to feel that the product development philosophy at the company I was working for at the time was too different from mine, so I was open to new opportunities. In May 2023, a headhunter contacted me and told me that Sanoma Media Finland (SMF) was searching for a new lead developer for its news personalization team. I knew that SMF's values aligned with mine: they had, for example, partnered with Helsinki Pride in earlier years. I also felt that working for a media company would be personally more meaningful to me than consulting. So I entered the interview process.

I was interviewed three times; the interviewers included my future manager, the personalization team lead, the lead data scientist, two lead developers and the head of data. I liked everyone I met and learned that SMF had both interesting data & AI challenges and high quality standards for professional software development. After the three interview rounds, I wasn't selected for the lead role, but I was offered a position as a senior developer in the same personalization team. We agreed that I would take on lead-like responsibilities in matters related to data architecture, which suited me well.

I joined the personalization team in August 2023. At that time, the team was responsible both for operating news personalization on our news sites and for developing a new real-time analytics (RTA) dashboard for newsrooms. The RTA project had a strict deadline: the dashboard had to be operational by the end of September 2023, because the contract with the existing analytics provider ended then.

When I arrived in August, the RTA project was not ready for production. The data processing backend, which was using Apache Flink for real-time processing, was crashing almost daily. As the newsrooms relied on the new dashboard more and more in their daily work, something needed to be done.

Our team formed a task force to find out what needed to be done. We discovered that the data processing backend lacked, among other things, deployment pipelines, unit tests, monitoring and alarms. Without deployment pipelines, developers had to build JAR files manually and copy them to S3. Without unit tests, every code change was a Hail Mary: the only way to verify functionality was under live traffic. There was no support for Flink state snapshots, meaning that whenever the computation job was changed, it would start backfilling data from the previous midnight. Because the job had serious performance issues, this backfill could take hours.

We explained the situation to the stakeholders and asked for time to rewrite the data processing backend from scratch. They understood and gave us the time we needed. We created a temporary analytics dashboard for the newsrooms and pulled the broken RTA project back into the development phase.

To fix the issues, we had to embrace the DevOps philosophy: get feedback as early as possible, test as early as possible, and think about security and production environments as early as possible. We rebuilt the data processing backend incrementally, with each small pull request changing one thing. This allowed us to understand the impact of every small change on the system's behaviour.

We migrated every computation from Flink SQL to the DataStream API, because the latter supported fault tolerance via state snapshots and thereby allowed us to update the job in production without triggering any re-computation. We built comprehensive unit tests using MiniClusterWithClientResource, enabling us to catch issues in local development rather than in production. With help from our platform team, we created a CI/CD pipeline in AWS CodePipeline to update the Flink application in production at the push of a button. We created a custom CloudWatch dashboard showing every operational metric we cared about. We created CloudWatch alarms with notifications, enabling us to detect and resolve issues before the newsrooms even noticed them. We fixed performance issues by eliminating data skew with key salting and by optimizing serialization. We had multiple meetings with AWS Flink experts, who helped us validate our assumptions and guided us in our endeavour.
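To make the salting idea concrete, here is a minimal, language-agnostic sketch in Python (the production job uses Flink's DataStream API in Java, and all names below are purely illustrative): events for a hot key are spread over a fixed number of salt buckets, pre-aggregated per bucket, and only then merged back into one count per article.

```python
import random
from collections import Counter, defaultdict

NUM_SALTS = 8  # number of buckets a hot key is spread over

def salted_key(article_id: str) -> tuple[str, int]:
    """Spread events for one article over NUM_SALTS partitions."""
    return (article_id, random.randrange(NUM_SALTS))

def stage_one(events: list[str]) -> Counter:
    """Pre-aggregate per (article_id, salt); the work is spread across partitions."""
    partial = Counter()
    for article_id in events:
        partial[salted_key(article_id)] += 1
    return partial

def stage_two(partial: Counter) -> dict[str, int]:
    """Merge the small per-salt partials back into one count per article."""
    totals: defaultdict[str, int] = defaultdict(int)
    for (article_id, _salt), count in partial.items():
        totals[article_id] += count
    return dict(totals)

events = ["article-1"] * 1_000 + ["article-2"] * 3   # one viral, hot key
print(stage_two(stage_one(events)))                  # {'article-1': 1000, 'article-2': 3}
```

In Flink, the same idea translates to keying the first aggregation by the composite (article, salt) pair and the second aggregation by the article alone, so no single parallel subtask has to process all the events of a viral article.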

By December 2023, we had completely rewritten the data processing backend and it was deemed ready for production. There were still minor hiccups, but the number of production issues was close to zero. I gave a presentation at a Sanoma Developer Community meetup titled "10 Mistakes We Made with Apache Flink", explaining everything we had learned about running a production-grade real-time analytics system on Flink. I also presented the rewritten RTA dashboard to the on-call ring and the developer community.

This rewrite was an excellent opportunity for our team to learn to work together. We meticulously documented every issue in Confluence and created a step-by-step plan to tackle them one by one. We learned how to communicate effectively with one another and how to write small, targeted pull requests that were easy to review. We gained the stakeholders' trust by giving realistic time estimates and by focusing on building a production-grade system from the start.

I believe the original project team had done the best they could with the resources and time they had available. Our organization learned a lot from the incident and has since shifted its development philosophy from a project mindset focused on shorter-term goals to a product mindset focused on longer-term ones.

Spring 2024: New team, new habits

By the end of 2023, it had become clear that developing and maintaining the RTA dashboard was happening at the cost of news personalization. In fall 2023, we had not been able to deliver any new features in news personalization, which had a negative impact on our readers. It was therefore decided that a new analytics team would be formed to take on the analytics responsibilities. The new personalization team, in which I stayed, would focus on news personalization and would be led by the lead data scientist.

Our first major task in the new team was to stabilize and rewrite the "most read" service, which serves a near-real-time list of the most read articles to our news sites. The legacy implementation was built on Kinesis Analytics for SQL, which was deprecated, unstable and suffered production incidents every few weeks.

We started designing the new system by mapping the needs and conducting meticulous trade-off analyses for every decision that mattered. For example, we created architecture decision records for the choice of programming languages, databases and data processing technology.

Once we had a plan for what to build, we split the development into multiple milestones. Each milestone was split into small tasks that were tracked in Jira or Confluence. We had learned from the RTA project that every component should be built together as a team instead of letting one developer build, say, the data processing while another built the API. We had some challenges in splitting the development tasks between developers, mostly because some tasks had intricate dependencies on other tasks and some developers were used to working on "end-to-end" tasks. Still, I think we were quite successful in splitting the work, and every piece of the new system was built by at least two developers.

As we were building a business-critical system, we tried to keep the DevOps philosophy in mind and minimized the risk of future production issues at every step of the development. Every component was carefully unit-tested. We built an integration test that automatically verified that the system as a whole worked as intended. This task was simplified by the fact that our team owned the whole data processing pipeline from data collection to serving the data. We ran the integration test as part of the CI/CD pipeline and blocked deployments that failed the test.
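To illustrate what such a deployment-gating test can look like, here is a hedged Python sketch; the base URL, endpoint paths, event shape and helper names are hypothetical, not the actual service's API. The test feeds a synthetic page-view event into the start of the pipeline and polls the serving API until the article appears, failing the deployment otherwise.

```python
import json
import time
import urllib.request
import uuid

BASE_URL = "https://most-read.example.internal"  # hypothetical test-environment URL

def send_page_view(article_id: str) -> None:
    """Feed one synthetic page-view event into the start of the pipeline."""
    payload = json.dumps({"articleId": article_id, "event": "pageView"}).encode()
    req = urllib.request.Request(f"{BASE_URL}/events", data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

def test_synthetic_event_shows_up_in_most_read():
    article_id = f"it-{uuid.uuid4()}"          # unique id so parallel runs don't collide
    for _ in range(10):
        send_page_view(article_id)

    deadline = time.time() + 120               # the pipeline is near-real-time, not instant
    while time.time() < deadline:
        with urllib.request.urlopen(f"{BASE_URL}/most-read") as resp:
            listed = {item["articleId"] for item in json.load(resp)}
        if article_id in listed:
            return                             # end-to-end flow works
        time.sleep(5)
    raise AssertionError("synthetic article never appeared in the most-read list")
```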

We implemented a custom monitoring dashboard for the system and ensured that all the metrics that mattered were logged. We implemented canary releases for every component, ensuring that faulty deployments would be rolled back automatically. At the code level, we threw exceptions and let the system crash on every unexpected situation, which enabled us to notice implementation errors as soon as they happened and fix them immediately.

Once the system was close to production-ready, we implemented a shadow launch to stress-test it under realistic production load without any risk of affecting our users. After a successful shadow launch, we gradually moved traffic from the old system to the new one, continuously monitoring that everything worked as expected.
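In its simplest form, a shadow launch mirrors each live request to the new system while users are still served by the old one. The Python sketch below only illustrates that idea; the service URLs, names and comparison logic are made up rather than our actual implementation.

```python
import logging
import threading
import urllib.request

OLD_URL = "https://most-read-old.example.internal"   # hypothetical legacy service
NEW_URL = "https://most-read-new.example.internal"   # hypothetical new service
log = logging.getLogger("shadow")

def fetch(base_url: str, path: str) -> bytes:
    with urllib.request.urlopen(f"{base_url}{path}", timeout=2) as resp:
        return resp.read()

def shadow_call(path: str, live_response: bytes) -> None:
    """Replay the request against the new system and log any mismatch."""
    try:
        shadow_response = fetch(NEW_URL, path)
        if shadow_response != live_response:
            log.warning("shadow mismatch for %s", path)
    except Exception:                        # shadow failures must never reach users
        log.exception("shadow request failed for %s", path)

def handle_request(path: str) -> bytes:
    live_response = fetch(OLD_URL, path)     # users are always served by the old system
    threading.Thread(target=shadow_call, args=(path, live_response), daemon=True).start()
    return live_response
```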

The new system has been very robust: as of today, there has not been a single production issue. Later, we removed the manual approval step from the deployment pipeline, letting all changes committed to the main branch move to production automatically (assuming the tests pass, of course).

In addition to revising the above-mentioned service, we spent the spring improving our news personalization in many ways. We improved subscription rates by integrating the personalization service with near-real-time sales data, which let us boost the visibility of articles that generated a lot of subscriptions. We improved our MLOps capabilities by logging all our scoring service calls to S3, which gave us direct visibility into the inputs and outputs of our machine learning components and thereby helped us debug many production issues. We also implemented server-side A/B testing, giving us the flexibility to run controlled A/B tests in the new mobile apps.
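The scoring-call logging boils down to persisting each request/response pair somewhere queryable. Here is a minimal sketch of that idea; the bucket name, key layout and function signature are hypothetical, not our actual setup.

```python
import datetime
import json
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "personalization-scoring-logs"   # hypothetical bucket name

def log_scoring_call(user_id: str, features: dict, scores: dict) -> None:
    """Persist one scoring request/response pair for later debugging and analysis."""
    now = datetime.datetime.now(datetime.timezone.utc)
    key = f"scoring/{now:%Y/%m/%d}/{uuid.uuid4()}.json"   # partitioned by date
    record = {
        "timestamp": now.isoformat(),
        "user_id": user_id,
        "input": features,
        "output": scores,
    }
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(record).encode())
```

In practice one would batch or stream these records (for example via Kinesis Data Firehose) rather than issuing one PUT per call, but the logged payload of inputs and outputs is the essential part.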

We also set up the ways of working for the new personalization team. We established biweekly retros, where we reflected on how we could improve as a team. We created "deployment Mondays": each Monday, a ticket was created automatically for deploying all our services to production, which also covered merging all pending dependency updates and checking our monitoring dashboards for any uncaught errors or warnings. These maintenance tasks were divided among the developers, leaving each developer only a few services to update.

Via deployment Mondays, we also improved other aspects of maintaining our services. Together with the rest of our engineering organization, we started to merge dependency updates automatically. This reduced the developer burden on deployment Mondays, because most dependency updates were already merged. We created integration tests for all our services, helping developers rely more on test automation to ensure that the services were safe to deploy to production.

In this period, I became a big fan of our internal platform team. Their AWS wizardry made it possible for our team to focus on building meaningful services rather than fighting the cloud infrastructure. I am very grateful for the opportunity to learn from them and to see first-hand how a skilled platform team can accelerate software development throughout the company.

I also want to mention that our team's lead developer, Tiina, had a big impact on me. I think she's an amazing role model for every software engineer. She stays calm in every situation and always reminds the team how important it is to maintain a steady pace rather than a fast one. I have learned a lot from her and hope to become more like her in the future.

During spring 2024, I gave two presentations to the developer community in our department. The first presentation showed how we used Datadog to monitor the new most-read service, including custom metrics, monitors, SLOs and dashboards. I also gave a presentation about event-driven architecture, focusing on AWS EventBridge.

In April, I started mentoring a trainee developer as part of the Sanoma Media Trainee program. The trainee period was very fruitful for both of us, and I learned a lot about mentoring a junior developer. The trainee was offered a permanent position as a junior developer in the company at the end of 2024, and they're a valuable part of the team today.

Summer 2024: Learning the new, updating the old

In summer 2024, we focused on improving existing systems. I spent most of the summer updating the "tagging service", which automatically proposes tags for new articles. Like the RTA service, this service too needed a full "DevOps transformation": adding everything considered best practice in modern software development, such as linters, formatting, type-checking, monitoring and proper dependency management.

By adding proper Python packaging, we were able to remove the sys.path.append hacks littered around the codebase. We also improved error handling, migrating away from so-called Pokemon error handling (catching every exception just to keep the process alive) and instead letting the service crash on any unexpected exception. This enabled powerful automated monitoring for the system and canary deployments with automated rollbacks.
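The difference is easiest to see side by side. This before/after is purely illustrative; `score_tags` is a made-up stand-in for the real model call, not the actual tagging code.

```python
def score_tags(article_text: str) -> list[str]:
    """Stand-in for the real tagging model."""
    raise RuntimeError("model backend unavailable")   # simulate an unexpected failure

# Before: "Pokemon" error handling catches them all, so a failing model call
# silently looks like an article with no tag suggestions.
def propose_tags_before(article_text: str) -> list[str]:
    try:
        return score_tags(article_text)
    except Exception:
        return []   # the error disappears; nothing is logged, nothing alerts

# After: only expected situations are handled; anything unexpected propagates,
# fails the request visibly and shows up in monitoring and canary rollbacks.
def propose_tags_after(article_text: str) -> list[str]:
    if not article_text.strip():
        return []                        # expected case, handled explicitly
    return score_tags(article_text)      # unexpected errors are left to propagate

print(propose_tags_before("Some article text"))   # [] - the failure is hidden
print(propose_tags_after("Some article text"))    # raises RuntimeError loudly
```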

There were no tests of any kind in the tagging service; the developer who built the system had even boasted that they did not believe in unit tests. Refactoring the system was therefore risky. We added simple unit tests to cover all the business-critical functionality and ensured that these tests ran as part of both the pull request checks and the deployment pipeline. This dramatically improved our confidence in test automation and allowed us to refactor the code further and further. We also added end-to-end tests to the deployment pipeline to verify that training new models and serving their predictions worked as expected.

Apart from updating the tagging service, I spent the summer studying in depth how our recommendation system works under the hood. I refreshed my memory of Bayesian data analysis (I had taken the BDA course during my doctoral studies), learned about Pyro and NumPyro, and wrote an in-depth mathematical presentation on how users' interest in articles is modelled in our recommendation service. I also studied evaluation metrics for recommendation systems and implemented new metrics based on counterfactual evaluation, with little success. Finally, I compared two inference algorithms: the Markov chain Monte Carlo algorithm used in production and stochastic variational inference. I found that the latter produced equivalent results much faster and with fewer computational resources.
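To give a flavour of that comparison, here is a minimal NumPyro sketch on a toy Beta-Bernoulli model of a reader's interest in a topic. The model and all names are illustrative only; our production interest model is considerably more involved.

```python
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS, SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoNormal

# Toy model: one reader's interest in a topic, observed through clicks on shown articles.
def model(clicks):
    interest = numpyro.sample("interest", dist.Beta(1.0, 1.0))
    with numpyro.plate("impressions", clicks.shape[0]):
        numpyro.sample("click", dist.Bernoulli(interest), obs=clicks)

clicks = jnp.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0], dtype=jnp.float32)

# Markov chain Monte Carlo (NUTS): asymptotically exact posterior samples, but slower.
mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=1000)
mcmc.run(random.PRNGKey(0), clicks)
print("MCMC mean interest:", mcmc.get_samples()["interest"].mean())

# Stochastic variational inference: an approximate posterior that is much faster to fit.
guide = AutoNormal(model)
svi = SVI(model, guide, numpyro.optim.Adam(step_size=0.01), loss=Trace_ELBO())
svi_result = svi.run(random.PRNGKey(1), 2000, clicks)
posterior = guide.sample_posterior(random.PRNGKey(2), svi_result.params, sample_shape=(1000,))
print("SVI mean interest:", posterior["interest"].mean())
```

On a model this small both approaches agree almost exactly; the interesting question in production is whether the variational approximation stays close enough to the MCMC posterior while cutting the compute cost.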

In our tech improvement weekly, I presented an introduction to Bayesian inference based on everything I had learned during the summer. The presentation was fun to make, but it was probably the worst one I have given during my time at Sanoma. There was too much obscure math for a community of software engineers, and my examples failed to convey the big ideas behind Bayesian inference. I had more success with a presentation on software design, based largely on my favorite programming book, The Pragmatic Programmer.

Fall 2024 – Spring 2025: New features

After a summer spent learning and improving the maintainability of existing services, we spent the rest of the year building new features for our core personalization services. We implemented support for collecting audio event data from our websites and apps, allowing us to use this data for personalization and analytics. We built a new experimental machine learning model for predicting users' purchase propensities and tried using the model to personalize the content on our frontpages.

We also implemented support for a more structured personalized frontpage by extending our personalization to articles per topic. We A/B-tested these changes and decided to shelve the project for a while. I gave a tech improvement presentation on how Google measures engineering productivity, and I improved our data export pipelines by refactoring the Glue Spark code, adding unit tests and adding support for audio events.

In fall 2024, our team also started prototyping a new recommendation algorithm based on deep learning. I was amazed by the machine learning chops of our lead data scientist, who single-handedly built a prototype of a complex recommendation system in a few weeks. At the beginning of 2025, we productionized this system and built the infrastructure needed to serve the new recommendations to our users. This involved a lot of machine learning and data engineering work, mostly taking place in Databricks. I contributed mainly by making the services more production-grade: improving testing, deployment pipelines and monitoring.

Summer 2025: Promotion to lead and joining a new team

In spring 2025, I had started to feel that my time in the personalization team was nearing its end. For more than a year, we had improved our ways of working to build a high-performing team with high standards for quality. I enjoyed working in the team and was proud that all team members had so many opportunities to grow both professionally and personally. Building and maintaining production services with the best practices from DevOps and MLOps had become routine for us. I was very grateful for my time in the personalization team, but I felt that I could contribute even more in a less mature team.

In March, it was announced internally that the company was looking to hire a new lead developer, who would be responsible for leading the technical development of a new "Newsroom AI" team. I took the opportunity, chatted with various stakeholders and was selected as the lead developer. To build the team, my colleagues and I interviewed and hired two developers and a pair of developer trainees. I joined the team full-time in May 2025.

Writing this in July 2025, I have been working in the new team for almost three months. It has been a very good experience so far: we have a great team of professional and enthusiastic developers and editorial members. Everything I learned about good teamwork and setting up ways of working in the personalization team has been immensely valuable in the new team as well. We have established joint ways of working, and we're delivering software at a steady and predictable pace. I have a good relationship with our team's product owner, and I believe our skill sets complement each other very well. Our team works in close collaboration with the end users, and I am highly confident that we will build very valuable products for our journalists in the future.

All in all, I have found a position where I can contribute to building lovable products using a methodology that I believe in. This closes the circle that was opened at the beginning of this post.