Different Types of “Join Strategies” in “Apache Spark”
<h1>What is a “Join Selection Strategy”?</h1>
<ul>
<li>When any type of “<strong>Join</strong>”, such as a “<strong>Left Join</strong>” or an “<strong>Inner Join</strong>”, is performed between two “<strong>DataFrames</strong>”, “<strong>Apache Spark</strong>” internally decides which algorithm will be used to perform the join operation between them.</li>
<li>The internal algorithm that is responsible for planning the join operation between the two DataFrames is called the “<strong>Join Selection Strategy</strong>”.</li>
</ul>
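<p>One way to see which algorithm was picked for a given join is to read the physical plan that <code>DataFrame.explain()</code> prints. The sketch below is a minimal illustration, not part of the original article: the helper names, the sample data, and the column names are all hypothetical. The operator names in <code>JOIN_OPERATORS</code> are the ones Spark uses in its physical plans.</p>

```python
import contextlib
import io

# Physical join operator names as they appear in Spark's explain() output.
JOIN_OPERATORS = (
    "BroadcastHashJoin",
    "ShuffledHashJoin",
    "SortMergeJoin",
    "BroadcastNestedLoopJoin",
    "CartesianProduct",
)


def find_join_operators(plan_text):
    """Return the known join operator names that occur in a plan string.

    The result follows the order of JOIN_OPERATORS, not the order in the plan.
    """
    return [name for name in JOIN_OPERATORS if name in plan_text]


def explain_to_string(df):
    """Capture the text that DataFrame.explain() prints to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.explain()
    return buffer.getvalue()


def demo():
    """Hypothetical demo: join two tiny DataFrames and report the join operator."""
    # Imported here so the helpers above work without a Spark installation.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("join-strategy-demo")
        .getOrCreate()
    )
    # Made-up sample data, for illustration only.
    orders = spark.createDataFrame(
        [(1, "book"), (2, "pen")], ["customer_id", "item"]
    )
    customers = spark.createDataFrame(
        [(1, "Asha"), (2, "Ravi")], ["customer_id", "name"]
    )
    joined = orders.join(customers, on="customer_id", how="inner")
    print(find_join_operators(explain_to_string(joined)))
    spark.stop()
```

<p>Calling <code>demo()</code> on a machine with PySpark installed prints the join operator(s) found in the plan. Which operator appears depends on the Spark version, its configuration, and the size statistics Spark has for each side of the join.</p>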
<h1>Why is Learning About “Join Selection Strategies” Important?</h1>
<ul>
<li>To optimize a “<strong>Spark Job</strong>” that involves a lot of joins, developers need to be aware of the internal algorithm that “<strong>Apache Spark</strong>” will choose to perform each join operation between two DataFrames.</li>
<li>Developers need to know the “<strong>Join Selection Strategies</strong>” so that an unsuitable strategy is not used for a join operation between two DataFrames.</li>
<li>An incorrect “<strong>Join Selection Strategy</strong>” increases the execution time of the join operation and also makes it a heavy operation on the “<strong>Executors</strong>”.</li>
</ul>
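<p>When a developer knows more about the data than Spark's size estimates do, a join hint can steer the choice. The sketch below assumes that the developer knows the byte size of the smaller side. <code>pick_join_hint</code> and <code>join_with_hint</code> are hypothetical helpers written for this illustration. <code>DataFrame.hint()</code> and the 10 MB default of <code>spark.sql.autoBroadcastJoinThreshold</code> are real Spark features.</p>

```python
# Spark's default for spark.sql.autoBroadcastJoinThreshold is 10 MB.
DEFAULT_BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def pick_join_hint(known_small_side_bytes,
                   threshold_bytes=DEFAULT_BROADCAST_THRESHOLD_BYTES):
    """Hypothetical helper: suggest a join hint from a size the developer knows.

    Returns "broadcast" when the smaller side fits under the threshold,
    otherwise None, which leaves the choice to Spark.
    """
    if 0 <= known_small_side_bytes <= threshold_bytes:
        return "broadcast"
    return None


def join_with_hint(large_df, small_df, key, hint_name):
    """Inner-join two DataFrames, hinting the smaller side when a hint is given."""
    if hint_name is not None:
        # DataFrame.hint() accepts names such as "broadcast", "merge",
        # "shuffle_hash" and "shuffle_replicate_nl" (Spark 3.0+).
        small_df = small_df.hint(hint_name)
    return large_df.join(small_df, on=key, how="inner")
```

<p>For example, <code>join_with_hint(orders, customers, "customer_id", pick_join_hint(2 * 1024 * 1024))</code> would ask Spark to broadcast a 2 MB <code>customers</code> table. Here <code>orders</code> and <code>customers</code> are placeholder DataFrames. A hint is only a request, so it is worth checking the physical plan afterwards.</p>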
<p><a href="https://oindrila-chakraborty88.medium.com/different-types-of-join-strategies-in-apache-spark-5c0066999d0d"><strong>Learn More</strong></a></p>