#stratascratch
https://platform.stratascratch.com/coding/9881-make-a-report-showing-the-number-of-survivors-and-non-survivors-by-passenger-class?code_type=6
Problem Statement: -
Make a report showing the number of survivors and non-survivors by passenger class. Classes are categorized based on the pclass value as:
pclass = 1: first_class
pclass = 2: second_class
pclass = 3: third_class
Output the number of survivors and non-survivors by each class.

Dataframe API Solution: -
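from pyspark.sql import functions as F  # explicit alias avoids shadowing Python's built-in sum()
# `titanic` is the input DataFrame provided by the platform; keep only the columns needed.
titanic_rpt_df = titanic.select("passengerid", "survived", "pclass")
# titanic_rpt_df.show()
# Count passengers per (survived, pclass), then replace the numeric class with its label.
pclass_label = (F.when(F.col("pclass") == 1, "first_class")
                .when(F.col("pclass") == 2, "second_class")
                .otherwise("third_class"))
titanic_cnt = (titanic_rpt_df.groupBy("survived", "pclass")
               .agg(F.count("passengerid").alias("count"))
               .withColumn("pclass", pclass_label))
# titanic_cnt.show()
# Pivot the class labels into columns: one row per survived value.
res_df = titanic_cnt.groupBy("survived").pivot("pclass").agg(F.sum("count"))
res_df.toPandas()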
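
Sanity check: -
The group-then-pivot logic of the solution above can be traced by hand on a few rows. The sketch below is plain Python, not PySpark. The sample rows and the names rows, CLASS_LABELS, counts and report are made up for illustration and are not taken from the real titanic data. One difference to keep in mind: this sketch fills a missing (survived, class) combination with 0, whereas Spark's pivot leaves such a cell as null.

```python
from collections import Counter

# Hypothetical sample rows as (survived, pclass); not real dataset values.
rows = [(1, 1), (0, 3), (1, 3), (0, 3), (1, 2), (0, 1)]

# Label mapping taken from the problem statement.
CLASS_LABELS = {1: "first_class", 2: "second_class", 3: "third_class"}

# Step 1: count passengers per (survived, class label), like groupBy + count.
counts = Counter((survived, CLASS_LABELS[pclass]) for survived, pclass in rows)

# Step 2: pivot, so each survived value becomes a row and each class a column.
report = {
    survived: {label: counts.get((survived, label), 0) for label in CLASS_LABELS.values()}
    for survived in sorted({s for s, _ in rows})
}

for survived, by_class in report.items():
    print(survived, by_class)
```

Each key of report corresponds to one row of res_df, and each inner key to one pivoted column.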
