chapter six

6 Yle – architecture highlights, lessons learned

 

This chapter covers

  • Yle’s big data architecture that processes more than 500 million events per day
  • Lessons learned around scalability and resilience

Yle is Finland’s national broadcaster and operates its own streaming service, Yle Areena, which is the most popular streaming service in Finland and is used by millions of households. For a number of years now, Yle has used serverless technologies at scale in its architecture. It uses a combination of AWS Fargate (https://aws.amazon.com/fargate), Lambda, and Kinesis to process more than 500 million user-interaction events per day. These events feed Yle’s machine learning (ML) algorithm and help Yle provide better content recommendations, image personalization, smart notifications, and more.[1]

6.1 Ingesting events at scale with Fargate

To provide better content recommendations, Yle needs to know which content visitors interact with the most. Yle ingests user-interaction data from streaming services as well as mobile and TV apps via an HTTP API.

The challenge with this API is that traffic can be very spiky, such as during live sporting events. Sometimes these events even overlap, as happened when election results coverage was on at the same time as a hockey game (hockey is the most popular sport in Finland).
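To put the daily volume in perspective, the following minimal sketch converts 500 million events per day into an average per-second rate (roughly 5,800 events per second) and then applies a few peak multipliers. The multipliers and the function names are hypothetical and chosen only for illustration; Yle's actual peak-to-average ratios are not stated here.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: int) -> float:
    """Average ingest rate if traffic were spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


def peak_events_per_second(events_per_day: int, peak_factor: float) -> float:
    """Rate during a spike, given an assumed peak-to-average multiplier."""
    return average_events_per_second(events_per_day) * peak_factor


if __name__ == "__main__":
    daily_events = 500_000_000  # figure quoted in the chapter
    avg = average_events_per_second(daily_events)
    print(f"average: {avg:,.0f} events/s")

    # Hypothetical multipliers, for illustration only.
    for factor in (2, 5, 10):
        peak = peak_events_per_second(daily_events, factor)
        print(f"{factor}x spike: {peak:,.0f} events/s")
```

Because real traffic is spiky rather than evenly spread, an ingestion API generally has to be sized for the spike rate, not the daily average.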

6.1.1 Cost considerations

6.1.2 Performance considerations

6.2 Processing events in real time

6.2.1 Kinesis Data Streams

6.2.2 SQS Dead-letter queue

6.2.3 The Router Lambda function

6.2.4 Kinesis Data Firehose

6.2.5 Kinesis Data Analytics

6.2.6 Putting it all together

6.3 Lessons learned

6.3.1 Know your service limits

6.3.2 Build with failure in mind

6.3.3 Batching is good for cost and efficiency

6.3.4 Cost estimation is tricky…

6.4 Summary