Problem Statement: -
Write a pyspark code perform below function
- Write the query to get the department and department wise total(sum) salary from "EmployeeDetail" table.
- Write the query to get the department and department wise total(sum) salary, display it in ascending order according to salary.
- Write the query to get the department and department wise total(sum) salary, display it in descending order according to salary.
- Write the query to get the department, total no. of departments, total(sum) salary with respect to department from "EmployeeDetail" table.
from pyspark.sql.window import *
df.filter(col("Salary")>600000).show()
df.filter(col("Salary")<600000).show()
df.filter(col("Salary").between(500000,700000)).show()
window_spec = Window.orderBy(desc("Salary"))
df_ranked = df.select("Salary").distinct().withColumn("rnk",rank().over(window_spec).alias("Rnk")).filter(col("rnk") == 2)
Second_Highest_Sal = df_ranked.collect()[0]["Salary"]
print(f"Second Highest Salary = {int(Second_Highest_Sal)}")
+----------+----------+---------+---------+--------------------+----------+------+
|EmployeeID|First_Name|Last_Name| Salary| Joining_Date|Department|Gender|
+----------+----------+---------+---------+--------------------+----------+------+
| 3| Ashish| Kumar|1000000.0|2014-01-09 10:05:...| IT| Male|
+----------+----------+---------+---------+--------------------+----------+------+
+----------+----------+---------+--------+--------------------+----------+------+
|EmployeeID|First_Name|Last_Name| Salary| Joining_Date|Department|Gender|
+----------+----------+---------+--------+--------------------+----------+------+
| 2| nikita| Jain|530000.0|2014-01-09 17:31:...| HR|Female|
| 4| Nikhil| Sharma|480000.0|2014-01-09 09:00:...| HR| Male|
| 5| anish| kadian|500000.0|2014-01-09 09:31:...| Payroll| Male|
+----------+----------+---------+--------+--------------------+----------+------+
+----------+----------+---------+--------+--------------------+----------+------+
|EmployeeID|First_Name|Last_Name| Salary| Joining_Date|Department|Gender|
+----------+----------+---------+--------+--------------------+----------+------+
| 1| Vikas| Ahlawat|600000.0|2013-02-15 11:16:...| IT| Male|
| 2| nikita| Jain|530000.0|2014-01-09 17:31:...| HR|Female|
| 5| anish| kadian|500000.0|2014-01-09 09:31:...| Payroll| Male|
+----------+----------+---------+--------+--------------------+----------+------+
Second Highest Salary = 600000