
Apache Spark Components CheatSheet

Troubled by confusing Spark concepts such as executors, nodes, RDDs, and tasks? Invest just 2 minutes of your time to bring some order to this mess!


I'll clean up these Apache Spark concepts for you!

Spark building blocks: executor, tasks, cache, SparkContext, cluster manager


Executor => Multiple Tasks: An executor is a JVM process running on a worker node.  Executors receive serialized tasks (your code, backed by the jars you shipped), deserialize them, and run them.

Executors keep a cache so that tasks reusing the same data can run faster.
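
To make the executor and cache idea concrete, here is a toy model in plain Python. It is not the Spark API: `ToyExecutor`, `run_task` and the `computed` counter are hypothetical names invented for this sketch, and a real executor is a separate JVM process, not a Python object. It only shows that a task over a partition is computed once and served from the executor's cache afterwards.

```python
class ToyExecutor:
    """Hypothetical stand-in for one executor: runs tasks, keeps a local cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}    # partition id -> result computed earlier
        self.computed = 0  # how many times real work was done

    def run_task(self, partition_id, partition, func):
        # Toy simplification: the cache is keyed by partition id only,
        # so it assumes the same func is used for every call.
        if partition_id not in self.cache:
            self.cache[partition_id] = [func(x) for x in partition]
            self.computed += 1
        return self.cache[partition_id]


def square(x):
    return x * x


partitions = {0: [1, 2], 1: [3, 4]}
executor = ToyExecutor("executor-1")

# First pass does the work, second pass is answered from the cache.
first = [executor.run_task(pid, part, square) for pid, part in partitions.items()]
second = [executor.run_task(pid, part, square) for pid, part in partitions.items()]
```

After both passes `executor.computed` is 2, one computation per partition, even though four tasks were run.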

Node => Multiple Executors: Each node can host multiple executors.

RDD => Big Data Structure: Its main strength is that it can represent data too large to store on a single machine, so the data is partitioned and distributed across computers.

Input => RDD: Every RDD is born out of some input, such as a text file, Hadoop files, etc.

Output => RDD: Functions in Spark can produce an RDD as their output.  So it's like one function after another: each receives an input RDD and returns an output RDD. It's functional.

RDD[Type] : RDDs are typed; they hold data of a certain type, for example RDD[String], or RDD[(Key, Value)] for pairs.

RDD => 1,2,3: RDDs are ordered, although operations that shuffle data do not generally preserve that order.

RDD => Zzzz: RDDs are lazily evaluated.  We said functional, didn't we? So you can chain multiple transformations on your data, and only when you hit an action is the actual data computed.
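
Laziness is easier to see in code. The sketch below is a toy in plain Python, not Spark itself: `ToyRDD` is a hypothetical class, and its `map`, `filter` and `collect` methods only borrow the names of the matching Spark operations. Transformations just record a step and return a new `ToyRDD`; the recorded steps run only when the action `collect` is called.

```python
class ToyRDD:
    """Hypothetical lazy dataset: records transformations, runs them on an action."""

    def __init__(self, source, steps=()):
        self._source = source  # the input data (a plain list here)
        self._steps = steps    # recorded transformations, not yet executed

    def map(self, func):
        # Transformation: returns a new ToyRDD, computes nothing.
        return ToyRDD(self._source, self._steps + (("map", func),))

    def filter(self, pred):
        # Transformation: returns a new ToyRDD, computes nothing.
        return ToyRDD(self._source, self._steps + (("filter", pred),))

    def collect(self):
        # Action: only now are the recorded steps applied to the data.
        data = iter(self._source)
        for kind, func in self._steps:
            data = map(func, data) if kind == "map" else filter(func, data)
        return list(data)


calls = []  # records every element that double() actually touches


def double(x):
    calls.append(x)
    return x * 2


numbers = ToyRDD([1, 2, 3, 4])
pipeline = numbers.map(double).filter(lambda x: x > 4)

calls_before_action = len(calls)  # still 0: nothing has run yet
result = pipeline.collect()       # the action triggers the work
```

Here `result` is `[6, 8]`, and `double` is not called at all until `collect` runs.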

RDD => Partitioned: RDDs are partitioned across servers. We said it's big data, so we need to partition it.
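
As a rough illustration of partitioning, the plain-Python sketch below spreads key/value records over a fixed number of partitions by hashing the key. The function name `split_into_partitions` is made up for this example, and the scheme is only similar in spirit to hash partitioning in Spark. Integer keys are used so the result is the same on every run.

```python
def split_into_partitions(records, num_partitions):
    """Assign each (key, value) record to a partition by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        index = hash(key) % num_partitions  # same key always lands in the same partition
        partitions[index].append((key, value))
    return partitions


records = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
partitions = split_into_partitions(records, 2)
```

With two partitions, the even keys end up in `partitions[0]` and the odd keys in `partitions[1]`; in a real cluster each partition could then live on a different server.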

RDD => Array(thing1, thing2, thing3) : You can think of an RDD as a bunch of things.

Guys, if you have any other mess you want me to cheatsheet for you, just comment below. I would also highly appreciate any comments about this post, so please send me feedback!

