Apache Kafka Certification Training is designed to provide you with the knowledge and skills to become a successful Kafka Big Data Developer. The training covers the fundamental concepts of Kafka (such as the Kafka Cluster and the Kafka APIs) as well as advanced topics (such as Kafka Connect, Kafka Streams, and Kafka integration with Hadoop, Storm, and Spark), enabling you to gain expertise in Apache Kafka.
Fundamental knowledge of Java concepts is mandatory. Edureka provides a complimentary course, "Java Essentials", to all participants who enroll for the Apache Kafka Certification Training.
This course is designed for professionals who want to learn Kafka techniques and apply them to Big Data. It is highly recommended for:
After completing the Real-Time Analytics with Apache Kafka course at Edureka, you should be able to:
Goal: In this module, you will understand where Kafka fits in the Big Data space and learn about the Kafka Architecture. In addition, you will learn about the Kafka Cluster, its components, and how to configure a cluster.
Skills:
Objectives: At the end of this module, you should be able to:
Topics:
Hands-On:
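To make these cluster concepts concrete, here is a minimal sketch in Java (assuming a locally running broker at localhost:9092; the topic name demo-topic is illustrative) that uses Kafka's AdminClient to list the brokers in a cluster and create a topic:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class ClusterTour {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed address of a locally running broker
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // List the brokers that make up the cluster
            admin.describeCluster().nodes().get()
                 .forEach(node -> System.out.println("Broker: " + node));

            // Create a topic with 3 partitions and a replication factor of 1
            NewTopic topic = new NewTopic("demo-topic", 3, (short) 1);
            admin.createTopics(Collections.singleton(topic)).all().get();
            System.out.println("Created topic demo-topic");
        }
    }
}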
Goal: Kafka Producers send records to topics; the records are sometimes referred to as messages. In this module, you will work with different Kafka Producer APIs.
Skills:
Objectives: At the end of this module, you should be able to:
Topics:
Hands-On:
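As a taste of the Producer API, here is a minimal sketch (assuming a local broker at localhost:9092 and an existing topic named demo-topic; both are illustrative, not course-specified values):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                // send() is asynchronous; the callback fires once the broker acknowledges
                producer.send(new ProducerRecord<>("demo-topic", "key-" + i, "message-" + i),
                        (metadata, exception) -> {
                            if (exception != null) {
                                exception.printStackTrace();
                            } else {
                                System.out.printf("partition=%d offset=%d%n",
                                        metadata.partition(), metadata.offset());
                            }
                        });
            }
            producer.flush(); // deliver any buffered records before exiting
        }
    }
}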
Goal: Applications that need to read data from Kafka use a Kafka Consumer to subscribe to Kafka topics and receive messages from these topics. In this module, you will learn to construct a Kafka Consumer, process messages from Kafka with the Consumer, run the Kafka Consumer, and subscribe to topics.
Skills:
Objectives: At the end of this module, you should be able to:
Topics:
Hands-On:
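For orientation, here is a minimal consumer sketch (again assuming a local broker, the topic demo-topic, and an illustrative group id):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // assumed group id
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("demo-topic")); // subscribe to the topic
            while (true) {
                // Poll for new records, waiting up to one second
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}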
Goal: Apache Kafka provides a unified, high-throughput, low-latency platform for handling real-time data feeds. Learn more about tuning Kafka to meet your high-performance needs.
Skills:
Objectives: At the end of this module, you should be able to:
Topics:
Hands-On:
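The right tuning always depends on the workload, but as a sketch, these are some of the producer settings commonly adjusted for throughput (the values shown are illustrative starting points, not course recommendations):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class TunedProducerConfig {
    public static Properties highThroughputProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);   // batch more records per request
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);           // wait up to 10 ms to fill a batch
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // shrink network and disk usage
        props.put(ProducerConfig.ACKS_CONFIG, "all");             // trade some latency for durability
        return props;
    }
}

Larger batches and a small linger raise throughput at the cost of per-record latency; acks=all makes the opposite trade, so the settings are usually balanced against each other.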
Goal: A Kafka cluster typically consists of multiple brokers to maintain load balance, and ZooKeeper is used for managing and coordinating the Kafka brokers. In this module, learn about Kafka Multi-Cluster Architectures, Kafka Brokers, Topics, Partitions, Consumer Groups, Mirroring, and ZooKeeper Coordination.
Skills:
Objectives: At the end of this module, you should be able to:
Topics:
Hands-On:
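To see brokers, partitions, and coordination at work, here is a minimal AdminClient sketch (the broker address and topic name are again assumptions) that prints the leader and replicas of each partition and lists the cluster's consumer groups:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class PartitionInspector {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            // Show which broker leads each partition and where the replicas live
            TopicDescription description = admin.describeTopics(Collections.singleton("demo-topic"))
                    .all().get().get("demo-topic");
            description.partitions().forEach(p ->
                    System.out.printf("partition=%d leader=%s replicas=%s isr=%s%n",
                            p.partition(), p.leader(), p.replicas(), p.isr()));

            // List the consumer groups the cluster is coordinating
            admin.listConsumerGroups().all().get()
                 .forEach(g -> System.out.println("group: " + g.groupId()));
        }
    }
}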
Goal: Learn about the Kafka Connect API and Kafka Monitoring. Kafka Connect is a scalable tool for reliably streaming data between Apache Kafka and other systems.
Skills:
Objectives: At the end of this module, you should be able to:
Hands-On:
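Connectors are registered with a Connect worker through its REST interface. The sketch below posts a FileStreamSource connector configuration to a worker assumed to be running on localhost:8083 (the default REST port); the file path and topic name are illustrative:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterFileSource {
    public static void main(String[] args) throws Exception {
        // Connector config as JSON: tail /tmp/input.txt into the topic connect-demo
        String json = """
                {
                  "name": "file-source-demo",
                  "config": {
                    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
                    "tasks.max": "1",
                    "file": "/tmp/input.txt",
                    "topic": "connect-demo"
                  }
                }
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors")) // assumed worker address
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}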
Goal: Learn about the Kafka Streams API in this module. Kafka Streams is a client library for building mission-critical real-time applications and microservices, where the input and/or output data is stored in Kafka Clusters.
Skills:
Objectives:
Topics:
Hands-On:
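The canonical Streams example is a word count. Here is a minimal sketch, assuming a local broker and illustrative topics named text-input and word-counts:

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class WordCountApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-demo");    // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> lines = builder.stream("text-input"); // assumed input topic

        // Split each line into words, group by word, and keep a running count
        KTable<String, Long> counts = lines
                .flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\W+")))
                .groupBy((key, word) -> word)
                .count();

        counts.toStream().to("word-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}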
Goal: In this module, you will learn about Apache Hadoop, Hadoop Architecture, Apache Storm, Storm Configuration, and the Spark Ecosystem. In addition, you will configure a Spark Cluster and integrate Kafka with Hadoop, Storm, and Spark.
Skills:
Objectives: At the end of this module, you will be able to:
Topics:
Hands-On:
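As one example of such integration, here is a minimal sketch of Spark Structured Streaming reading from Kafka in Java. It assumes a local broker, a topic named demo-topic, and the spark-sql-kafka connector on the classpath; it is an illustration, not the course's reference solution:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaToSpark {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-spark")
                .master("local[*]") // assumed local mode for illustration
                .getOrCreate();

        // Each row of the stream carries the record's key, value, and metadata
        Dataset<Row> stream = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker
                .option("subscribe", "demo-topic")
                .load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

        // Print each micro-batch to the console
        StreamingQuery query = stream.writeStream()
                .format("console")
                .start();
        query.awaitTermination();
    }
}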
Goal: Learn how to integrate Kafka with Flume, Cassandra and Talend.
Skills:
Objectives: At the end of this module, you should be able to:
Topics:
Hands-On:
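To illustrate the Kafka-to-Cassandra path, here is a minimal sketch of a consumer that writes each record into a Cassandra table with the DataStax Java driver; the keyspace demo, the table messages, and the topic demo-topic are assumptions for illustration:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaToCassandra {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "cassandra-writer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // The driver connects to localhost:9042 by default; the keyspace is assumed to exist
        try (CqlSession session = CqlSession.builder().withKeyspace("demo").build();
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("demo-topic"));
            PreparedStatement insert =
                    session.prepare("INSERT INTO messages (id, body) VALUES (?, ?)"); // assumed table
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                records.forEach(r -> session.execute(insert.bind(r.key(), r.value())));
            }
        }
    }
}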
Goal: In this module, you will work on a project that gathers messages from multiple sources.
Scenario: In the e-commerce industry, you must have seen how frequently a catalog changes. The deadliest problem these companies face is: “How do we keep our inventory and price consistent?”
Prices for the same product appear in several places on sites such as Amazon, Flipkart, or Snapdeal: on the search page, on the product description page, and in ads on Facebook or Google. You will often find mismatches in price and availability between these places. From the user's point of view this is very disappointing: he spends extra time finding the right product and, in the end, may not purchase at all simply because of the inconsistency. Here, you have to build a system that is consistent in nature. For example, if you receive product feeds, whether through flat files or through an event stream, you have to make sure you do not lose any events related to a product, especially inventory and price.
Price and availability, in particular, should always be consistent, because the product might already be sold, the seller might not want to sell it anymore, or something else might have changed. Attributes such as name and description, on the other hand, cause much less trouble if they are not updated on time.
Problem Statement
You are given a set of sample products. You have to consume the products and push them to Cassandra/MySQL as soon as they arrive in the consumer (a minimal sketch of such a consumer follows the field lists below). You have to save the below-mentioned fields in Cassandra:
1. PogId
2. Supc
3. Brand
4. Description
5. Size
6. Category
7. Sub Category
8. Country
9. Seller Code
In MySQL, you have to store:
1. PogId
2. Supc
3. Price
4. Quantity
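A minimal sketch of the MySQL side of such a pipeline is shown below. It assumes a comma-separated feed on a topic named product-feed and a products table holding the four fields above; treat it as a starting point, not the expected project solution:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ProductFeedToMySql {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "product-feed-writer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        String url = "jdbc:mysql://localhost:3306/inventory"; // assumed database
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("product-feed")); // assumed topic
            // Upsert so that replayed or updated events keep price and quantity consistent
            PreparedStatement upsert = conn.prepareStatement(
                    "INSERT INTO products (pog_id, supc, price, quantity) VALUES (?, ?, ?, ?) "
                    + "ON DUPLICATE KEY UPDATE price = VALUES(price), quantity = VALUES(quantity)");
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    String[] f = r.value().split(","); // assumed format: pogId,supc,price,quantity
                    upsert.setString(1, f[0]);
                    upsert.setString(2, f[1]);
                    upsert.setBigDecimal(3, new java.math.BigDecimal(f[2]));
                    upsert.setInt(4, Integer.parseInt(f[3]));
                    upsert.executeUpdate();
                }
            }
        }
    }
}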
This project gives you hands-on experience with the concepts you have learned as part of this course.
You can email the solution to our Support team within 2 weeks from the Course Completion Date. Edureka will evaluate the solution and award a Certificate with a Performance-based Grading.
Problem Statement: You are working for a website, techreview.com, that provides reviews of different technologies. The company has decided to include a new feature on the website that will allow users to compare the popularity or trend of multiple technologies based on Twitter feeds, and they want this comparison to happen in real time. So, as a big data developer at the company, you have been tasked with implementing the following (a minimal sketch follows this list):
• Near real-time streaming of the data from Twitter, to display the last minute's count of people tweeting about a particular technology.
• Store the Twitter count data in Cassandra.
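One way to approach the per-minute counts is Kafka Streams' windowed aggregation. The sketch below assumes tweets arrive on a topic named tweets, keyed by technology name, and prints the counts to the console; a full solution would write each windowed count to Cassandra instead:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class TechTrendCounter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tech-trends");       // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        StreamsBuilder builder = new StreamsBuilder();
        // Assume each tweet is keyed by the technology it mentions
        KStream<String, String> tweets =
                builder.stream("tweets", Consumed.with(Serdes.String(), Serdes.String()));

        // Count tweets per technology over tumbling one-minute windows
        tweets.groupByKey()
              .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
              .count()
              .toStream()
              .foreach((windowedTech, count) ->
                      // A full solution would insert these rows into Cassandra
                      System.out.printf("%s @ %s -> %d tweets%n",
                              windowedTech.key(), windowedTech.window().startTime(), count));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}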