Best Practices for Stream Processing with Flink on Zeppelin

Outline:

Overview of Big Data
Flink Learning Framework
Best practices for stream computing on EMR Studio

1. Overview of Big Data

Big data processing: ETL (Data → Data)
Big data analysis: BI (Data → Dashboard)
Machine learning: AI (Data → Model)

2. Flink Learning Framework

Flink Essentials

Stateful
Time
Flink Architecture
Flink API
Flink Configuration
Flink Log

Stateful:

Why

Timeliness of stream computing

Unbounded Stream Computing

When

Window

Join

Pattern

How

State backend
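
To make the "When" and "How" items above concrete, here is a minimal plain-Python sketch of keyed state for a tumbling window count. It does not use the Flink API. The window size, the event tuples, and the names `window_start` and `KeyedWindowCounter` are all hypothetical, and the in-memory dict only stands in for what a state backend would store.

```python
from collections import defaultdict

WINDOW_SIZE_MS = 10_000  # hypothetical 10-second tumbling window


def window_start(timestamp_ms, size_ms=WINDOW_SIZE_MS):
    """Return the start of the tumbling window that contains timestamp_ms."""
    return timestamp_ms - (timestamp_ms % size_ms)


class KeyedWindowCounter:
    """Keeps one running count per (key, window start) pair.

    The dict plays the role of the state backend in this sketch.
    """

    def __init__(self):
        self.state = defaultdict(int)

    def process(self, key, timestamp_ms):
        # The state survives between events, which is what makes the
        # computation stateful: each event only updates one slot.
        slot = (key, window_start(timestamp_ms))
        self.state[slot] += 1
        return self.state[slot]


# Hypothetical (key, event timestamp in ms) pairs.
events = [("user_a", 1_000), ("user_b", 2_500), ("user_a", 9_999), ("user_a", 10_000)]

counter = KeyedWindowCounter()
for key, ts in events:
    counter.process(key, ts)

result = dict(counter.state)
print(result)
```

Because the stream is unbounded, the counts cannot be recomputed from scratch on every event. The operator has to remember partial results, and in Flink that memory is what the state backend manages.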

Time

Event time
Processing time
Watermark
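
The relationship between event time and watermarks can be illustrated with a simplified plain-Python model of a bounded out-of-orderness watermark. This is a sketch under stated assumptions, not Flink's implementation. The 2-second tolerance, the timestamps, and the names `BoundedOutOfOrdernessWatermark` and `classify` are hypothetical, and Flink's own generators differ in detail.

```python
MAX_OUT_OF_ORDERNESS_MS = 2_000  # hypothetical tolerance for out-of-order events


class BoundedOutOfOrdernessWatermark:
    """Tracks the highest event time seen and derives a watermark from it."""

    def __init__(self, max_out_of_orderness_ms):
        self.max_out_of_orderness_ms = max_out_of_orderness_ms
        self.max_event_time_ms = None

    def on_event(self, event_time_ms):
        if self.max_event_time_ms is None or event_time_ms > self.max_event_time_ms:
            self.max_event_time_ms = event_time_ms

    def current_watermark(self):
        # No watermark exists until at least one event has been seen.
        if self.max_event_time_ms is None:
            return None
        return self.max_event_time_ms - self.max_out_of_orderness_ms


def classify(event_times_ms, max_out_of_orderness_ms=MAX_OUT_OF_ORDERNESS_MS):
    """Label each event as late if it falls behind the watermark at arrival."""
    generator = BoundedOutOfOrdernessWatermark(max_out_of_orderness_ms)
    labels = []
    for event_time_ms in event_times_ms:
        watermark = generator.current_watermark()
        is_late = watermark is not None and event_time_ms < watermark
        labels.append((event_time_ms, "late" if is_late else "on time"))
        generator.on_event(event_time_ms)
    return labels, generator.current_watermark()


# Event times in arrival order; arrival order itself models processing time.
labels, final_watermark = classify([1_000, 4_000, 3_000, 1_500, 7_000])
print(labels)
print(final_watermark)
```

In this model, the event stamped 3,000 arrives out of order but is still within the tolerance, while the event stamped 1,500 arrives after the watermark has already passed 2,000 and is treated as late.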

Flink Architecture

Flink API

Flink Configuration

Cluster Configuration
Job Configuration
State backend
Resource Manager
SQL/Python
Reference documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/
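
As a rough illustration of how the first three categories above map to configuration keys, the following Python sketch groups a few keys from the referenced Flink 1.13 configuration page and renders them as `key: value` lines in the style of `flink-conf.yaml`. All values, including the checkpoint path, are illustrative assumptions rather than recommended settings, and `render_flink_conf` is a hypothetical helper.

```python
# Illustrative values only; tune them for the actual cluster and job.
flink_conf = {
    # Cluster configuration
    "jobmanager.memory.process.size": "1600m",
    "taskmanager.memory.process.size": "1728m",
    "taskmanager.numberOfTaskSlots": 2,
    # Job configuration
    "parallelism.default": 2,
    "execution.checkpointing.interval": "60s",
    # State backend
    "state.backend": "rocksdb",
    "state.checkpoints.dir": "hdfs:///flink/checkpoints",  # hypothetical path
}


def render_flink_conf(conf):
    """Render a dict as 'key: value' lines, one per configuration entry."""
    return "\n".join(f"{key}: {value}" for key, value in conf.items())


rendered = render_flink_conf(flink_conf)
print(rendered)
```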

Flink Log

3. Best Practices for Stream Computing on EMR Studio

EMR Studio features:

Compatible with open source components

EMR Studio has been optimized and enhanced based on the open source software Apache Zeppelin, Jupyter Notebook, and Apache Airflow.

Supports connecting multiple clusters
Adapts to multiple computing engines
Interactive development with seamless job scheduling
Applicable to a variety of big data application scenarios
Separation of computing and storage

Flink Clients

Flink on Zeppelin (Phase 1) - Interactive Flink Client

Flink on Zeppelin (Phase 2) - Interactive JobManager

Flink on Zeppelin Main Features

Original link: http://click.aliyun.com/m/1000286010/
