6 Event-Driven Architecture Patterns — Part 1


1. Consume and Project

The MetaSite service handled ~1M RPM of various kinds of requests. To take that load off the service and its DB, the team split reads from writes in three steps:
  • First, they streamed all of the DB’s Site Metadata objects to a Kafka topic, including new site creations and site updates. Consistency between the DB and the topic can be achieved either by performing the DB inserts inside a Kafka consumer, or by using a CDC product like Debezium.
  • Second, they created a “write-only” service (Reverse lookup writer) with its own DB that consumed the Site Metadata objects but took only the Installed Apps Context and wrote it to the DB. In other words, it projected a specific “view” (installed apps) of the site metadata into that DB (a minimal sketch of such a projecting consumer appears after this list).
Consume and Project Installed Apps Context
  • Third, they created a “read-only” service that accepted only requests related to the Installed Apps context, which it could fulfill by querying the DB that stored the projected “Installed Apps” view.
Split Read from Write
  • By streaming the data to Kafka, the MetaSite service became completely decoupled from the consumers of the data, which reduced the load on the service and DB dramatically.
  • By consuming the data from Kafka and creating a “Materialized View” for a specific context, the Reverse lookup writer service was able to create an eventually consistent projection of the data that was highly optimized for the query needs of its client services.
  • Splitting the read service from the write service made it possible to easily scale the number of read-only DB replicas and service instances that handle ever-growing query loads from multiple data centers across the globe.
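
A minimal sketch of such a projecting consumer, written with the plain Java kafka-clients API. The topic name "site-metadata", the JSON payload, and the extractInstalledApps / upsertInstalledAppsView helpers are illustrative assumptions, not the actual Wix implementation:

    // Reverse lookup writer (sketch): consume Site Metadata events and project only
    // the Installed Apps context into its own read-optimized DB.
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class InstalledAppsProjector {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "installed-apps-projector");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("site-metadata")); // assumed topic with full Site Metadata objects
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        String siteId = record.key();
                        // Keep only the Installed Apps part of the full payload (the "projection")
                        String installedApps = extractInstalledApps(record.value());
                        upsertInstalledAppsView(siteId, installedApps); // write to the projection DB
                    }
                    consumer.commitSync(); // commit offsets only after the DB writes succeeded
                }
            }
        }

        static String extractInstalledApps(String siteMetadataJson) {
            return siteMetadataJson; // placeholder: parse the JSON and keep the installed-apps fields
        }

        static void upsertInstalledAppsView(String siteId, String installedApps) {
            // placeholder: INSERT ... ON CONFLICT UPDATE into the read-optimized DB
        }
    }

The read-only service then answers Installed Apps queries straight from that projection DB, never touching the MetaSite service.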

2. Event-driven from end to end

End-to-end event-driven flow using Kafka and WebSockets:
  • The browser opens a WebSocket “channel” for notifications
  • An HTTP import request arrives and an import job message is produced to Kafka
  • The job is consumed and processed, and the completion status is notified over the WebSocket channel
  • Using this design, it becomes trivial to notify the browser at the various stages of the import process, without keeping any state and without any polling.
  • Using Kafka makes the import process more resilient and scalable, as multiple services can process jobs from the same original import HTTP request (a minimal sketch of this job-processing stage appears after this list).
  • Using Kafka replication, it’s easy to run each stage in the most appropriate data center and geographical location. For example, the importer service may need to run in a Google data center for faster importing of Google contacts.
  • The incoming notification requests for the WebSockets can also be produced to Kafka and replicated to the data center where the WebSocket service actually resides.
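
A minimal sketch of the job-processing stage, under assumed conventions: jobs arrive on a "contacts-import-jobs" topic keyed by the browser’s WebSocket channel id, and completion notifications go to an "import-job-status" topic that the WebSocket service consumes. Topic names, message format and the doImport helper are placeholders, not the actual implementation:

    // Import job processor (sketch): consume import jobs, process them, and produce a
    // completion notification that the WebSocket service pushes to the right channel.
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class ContactsImportJobProcessor {
        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "localhost:9092");
            consumerProps.put("group.id", "contacts-importer");
            consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "localhost:9092");
            producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaConsumer<String, String> jobs = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<String, String> notifications = new KafkaProducer<>(producerProps)) {
                jobs.subscribe(List.of("contacts-import-jobs")); // assumed topic fed by the HTTP import endpoint
                while (true) {
                    for (ConsumerRecord<String, String> job : jobs.poll(Duration.ofSeconds(1))) {
                        String channelId = job.key();            // WebSocket channel opened by the browser
                        String status = doImport(job.value());   // placeholder for the actual import work
                        // The WebSocket service consumes this topic and pushes the status to the channel
                        notifications.send(new ProducerRecord<>("import-job-status", channelId, status));
                    }
                }
            }
        }

        static String doImport(String jobPayload) {
            return "{\"stage\":\"completed\"}"; // placeholder status payload
        }
    }

Because both the job and its status live in Kafka topics, intermediate stages are just more messages on the status topic, and each topic can be replicated to whichever data center hosts that stage.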

3. In-memory KV store

Each in-memory KV store and its respective compacted Kafka topic
Bookings consumes updates from the Countries compacted topic
A new time zone for South Sudan is added to the compacted topic
Two In-memory KV Stores consuming from the same compacted topic
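
As the captions above suggest, the pattern keeps configuration-like data (e.g. countries and their time zones) in a compacted Kafka topic, and each service holds its own in-memory KV store that is warmed from that topic on startup and kept fresh as new records arrive, with no extra DB lookup. Below is a minimal sketch under those assumptions; the "countries" topic name and String values are illustrative:

    // In-memory KV store (sketch): read a compacted topic from the beginning into a map,
    // then keep consuming so later updates (and tombstone deletes) are applied automatically.
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.time.Duration;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;

    public class CountriesKvStore {
        private final Map<String, String> countries = new ConcurrentHashMap<>();

        public void run() { // typically started on a background thread at service startup
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "bookings-countries-cache-" + UUID.randomUUID()); // fresh group: read the whole topic
            props.put("auto.offset.reset", "earliest");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("countries")); // assumed compacted topic, keyed by country code
                while (true) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                        if (record.value() == null) {
                            countries.remove(record.key());              // tombstone: the key was deleted
                        } else {
                            countries.put(record.key(), record.value()); // latest value per key wins
                        }
                    }
                }
            }
        }

        public String get(String countryCode) { // e.g. look up the current time zones for "SS" (South Sudan)
            return countries.get(countryCode);
        }
    }

When a new time zone for South Sudan is produced to the compacted topic, every in-memory store consuming that topic (Bookings or any other service built on it) picks up the change automatically.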
