#DataFrameMethod: exceptAll()

Example: exceptAll() returns the rows of one DataFrame that are not in another, while preserving duplicates.

from pyspark.sql import SparkSession

# Initialize Spark session
spark = SparkSession.builder.appName("exceptAll Example").getOrCreate()

# Create two DataFrames
data1 = [(1, 'Alice'), (2, 'Bob'), (2, 'Bob'), (3, 'Charlie')]
data2 = [(2, 'Bob'), (3, 'Charlie')]

columns = ["id", "name"]

df1 = spark.createDataFrame(data1, columns)
df2 = spark.createDataFrame(data2, columns)

# Use exceptAll() to get rows in df1 that are not in df2
result = df1.exceptAll(df2)
result.show()
+---+-----+
| id| name|
+---+-----+
|  1|Alice|
|  2|  Bob|
+---+-----+
| Feature       | subtract()                                                        | exceptAll()                                   |
|---------------|-------------------------------------------------------------------|-----------------------------------------------|
| Applicable to | RDDs (a DataFrame version also exists, behaving like SQL EXCEPT DISTINCT) | DataFrames                                    |
| Duplicates    | Removes duplicates from the result                                | Preserves duplicates                          |
| Schema        | RDD version has no schema (operates on raw elements)              | Requires both DataFrames to have identical schemas |
| Output type   | RDD                                                               | DataFrame                                     |
| Complexity    | Simpler; works at the element level                               | Column-aware; supports structured data        |