Spark Streaming data cleanup mechanism
(I) DStream and RDD
As we know, Spark Streaming computation is built on Spark Core, and the core abstraction of Spark Core is the RDD, so Spark Streaming is necessarily tied to RDDs as well. However, Spark Streaming does not have users work with RDDs directly. Instead it defines its own abstraction, the DStream. A DStream contains RDDs. You can think of the relationship as the decorator pattern in Java: a DStream is an enhancement of the RDD whose behavior is largely the same as an RDD's.
DStream and RDD have several things in common.
(1) They share similar transformation operations, such as map and reduceByKey. DStream also has some of its own, such as window and mapWithState.
(2) Both have action-style operations, such as count on an RDD and foreachRDD on a DStream.
From the programming-model point of view, the two are consistent.
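The decorator analogy can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: ToyRDD, ToyDStream, and foreach_rdd are hypothetical names invented for illustration. It assumes a stream is simply a mapping from batch time to one batch of records, and shows the stream offering the same map operation as the batch by delegating to it.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Hypothetical stand-in for an RDD: one immutable batch of records."""

    def __init__(self, records: List[int]) -> None:
        self.records = list(records)

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        return ToyRDD([fn(r) for r in self.records])

    def count(self) -> int:
        return len(self.records)


class ToyDStream:
    """Hypothetical stand-in for a DStream: one ToyRDD per batch time."""

    def __init__(self, batches: Dict[int, ToyRDD]) -> None:
        self.batches = dict(batches)

    def map(self, fn: Callable[[int], int]) -> "ToyDStream":
        # Same operation name as on the batch; it delegates to each ToyRDD.
        return ToyDStream({t: rdd.map(fn) for t, rdd in self.batches.items()})

    def foreach_rdd(self, action: Callable[[int, ToyRDD], None]) -> None:
        # Output-style operation: the only place user code sees a ToyRDD.
        for t in sorted(self.batches):
            action(t, self.batches[t])


stream = ToyDStream({0: ToyRDD([1, 2]), 1: ToyRDD([3, 4, 5])})
doubled = stream.map(lambda x: x * 2)

seen = []
doubled.foreach_rdd(lambda t, rdd: seen.append((t, rdd.records, rdd.count())))
print(seen)
```

The user code calls map on the stream exactly as it would on a single batch, which is the sense in which the two programming models line up.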
(II) Introduction to DStream in Spark Streaming
DStream has several kinds of subclasses.
(1) Data source classes, such as InputDStream, with concrete implementations such as DirectKafkaInputDStream.
(2) Transformation classes, typically MappedDStream and ShuffledDStream.
(3) Output classes, typically ForEachDStream.
As the list above shows, data is handled by the DStream classes all the way from the start (input) to the end (output). This means that users normally cannot create or manipulate RDDs directly, which in turn means that DStream has both the opportunity and the obligation to manage the life cycle of the RDDs.
In other words, Spark Streaming performs automatic cleanup.
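Why owning the life cycle makes automatic cleanup possible can be sketched as follows. This is a toy model in plain Python, not Spark's implementation: ToyCleaningDStream, remember_batches, generate, and clear_old are hypothetical names, and the retention rule (drop every batch at or before the current time minus a fixed number of batches) is an assumption chosen for illustration.

```python
from typing import Dict, List


class ToyCleaningDStream:
    """Hypothetical model: registers each batch and forgets old ones."""

    def __init__(self, remember_batches: int) -> None:
        self.remember_batches = remember_batches
        self.generated: Dict[int, List[str]] = {}

    def generate(self, batch_time: int, records: List[str]) -> None:
        # The stream, not the user, creates and registers each batch.
        self.generated[batch_time] = records

    def clear_old(self, current_time: int) -> List[int]:
        # Drop every batch at or before the cutoff and report what was dropped.
        cutoff = current_time - self.remember_batches
        expired = sorted(t for t in self.generated if t <= cutoff)
        for t in expired:
            del self.generated[t]
        return expired


stream = ToyCleaningDStream(remember_batches=2)
dropped_log = []
for t in range(5):
    stream.generate(t, [f"record-{t}"])
    dropped_log.append(stream.clear_old(t))

print(dropped_log)
print(sorted(stream.generated))
```

Because every batch passes through the stream's own registry, the stream always knows which batches exist and can release them without any help from user code.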
(III) How RDDs are generated in Spark Streaming
The life cycle of an RDD in Spark Streaming is roughly as follows.
(1) In an InputDStream, the received data is converted into an RDD. For example, DirectKafkaInputDStream produces a KafkaRDD.
(2) The data is then transformed through MappedDStream and similar classes. At this point the corresponding RDD method, such as map, is called directly to perform the transformation.
(3) Only when an output operation runs is the RDD exposed, so that the user can perform storage, further computation, and other operations on it.
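The three steps above can be sketched as a chain of toy classes. This is an illustration in plain Python under stated assumptions, not Spark's code: ToyRDD, ToyInputStream, ToyMappedStream, ToyForEachStream, and their compute and run methods are hypothetical names, and the input source is assumed to be a simple dictionary from batch time to raw records.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Hypothetical stand-in for an RDD: one batch of records."""

    def __init__(self, records: List[str]) -> None:
        self.records = list(records)

    def map(self, fn: Callable[[str], str]) -> "ToyRDD":
        return ToyRDD([fn(r) for r in self.records])


class ToyInputStream:
    """Step 1: turn the data received for a batch into a ToyRDD."""

    def __init__(self, source: Dict[int, List[str]]) -> None:
        self.source = source

    def compute(self, batch_time: int) -> ToyRDD:
        return ToyRDD(self.source.get(batch_time, []))


class ToyMappedStream:
    """Step 2: ask the parent for its batch, then call map on that batch."""

    def __init__(self, parent, fn: Callable[[str], str]) -> None:
        self.parent = parent
        self.fn = fn

    def compute(self, batch_time: int) -> ToyRDD:
        return self.parent.compute(batch_time).map(self.fn)


class ToyForEachStream:
    """Step 3: the output stage hands the finished batch to user code."""

    def __init__(self, parent, user_fn: Callable[[ToyRDD, int], None]) -> None:
        self.parent = parent
        self.user_fn = user_fn

    def run(self, batch_time: int) -> None:
        self.user_fn(self.parent.compute(batch_time), batch_time)


stored = []
source = {0: ["a", "b"], 1: ["c"]}
pipeline = ToyForEachStream(
    ToyMappedStream(ToyInputStream(source), str.upper),
    lambda rdd, t: stored.append((t, rdd.records)),
)
for t in (0, 1):
    pipeline.run(t)

print(stored)
```

In this sketch the user-supplied function in the last stage is the only code outside the stream classes that ever receives a batch object, mirroring step (3).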