6 Yle: Architecture highlights, lessons learned

 

This chapter covers

  • Yle’s big data architecture
  • Scalability and resilience, lessons learned

Yle is the national broadcaster for Finland and operates its own popular streaming service, Yle Areena, which is used by millions of households. For a number of years now, Yle has used serverless technologies at scale in its architecture. It uses a combination of AWS Fargate (https://aws.amazon.com/fargate), Lambda, and Kinesis to process more than 500 million user-interaction events per day. These events feed Yle’s machine learning (ML) algorithm and help Yle provide better content recommendations, image personalization, smart notifications, and more.1
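To get a feel for that scale, it can help to convert the daily figure into a per-second rate. The following is a back-of-the-envelope sketch only: the 500 million figure comes from the text above (and is a lower bound, since Yle processes "more than" that), while the function names and the `peak_factor` value are hypothetical and chosen purely for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: int) -> float:
    """Average event rate implied by a daily event count."""
    return events_per_day / SECONDS_PER_DAY


def peak_events_per_second(events_per_day: int, peak_factor: float) -> float:
    """Estimated peak rate, assuming peaks are `peak_factor` times the average.

    `peak_factor` is a made-up illustration parameter, not a number from Yle.
    """
    return average_events_per_second(events_per_day) * peak_factor


daily_events = 500_000_000  # lower bound taken from the text
avg = average_events_per_second(daily_events)
print(f"average: {avg:,.0f} events/s")  # prints "average: 5,787 events/s"

# Hypothetical: traffic during a big live event at 3x the daily average.
peak = peak_events_per_second(daily_events, peak_factor=3.0)
print(f"assumed 3x peak: {peak:,.0f} events/s")
```

Even the average works out to several thousand events every second, and an average hides the bursts, so the ingestion layer has to be sized for the peaks rather than for the mean.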

6.1 Ingesting events at scale with Fargate

To provide better content recommendations, Yle needs to know which content visitors interact with the most. Yle ingests user-interaction data from streaming services as well as mobile and TV apps via an HTTP API. The challenge with this API is that the traffic can be spiky, such as during live sporting events. Sometimes such events overlap, for example, when the election results coverage was on at the same time as a hockey game (hockey is the most popular sport in Finland)!

6.1.1 Cost considerations

6.1.2 Performance considerations

6.2 Processing events in real time

6.2.1 Kinesis Data Streams

6.2.2 SQS dead-letter queue (DLQ)

6.2.3 The Router Lambda function

6.2.4 Kinesis Data Firehose

6.2.5 Kinesis Data Analytics

6.2.6 Putting it all together

6.3 Lessons learned

6.3.1 Know your service limits

6.3.2 Build with failure in mind

6.3.3 Batching is good for cost and efficiency