# Function
In PySpark, the date_format() function converts a DateType or TimestampType column into a string in a specified format. It is part of the pyspark.sql.functions module and is useful when you need to display or export date and time data in a more readable or custom format.
Syntax:
from pyspark.sql.functions import date_format
date_format(column, format)
column: The column containing the date or timestamp, given either as a column name or as a Column object.
format: A string representing the desired date/time format.
Common Date/Time Format Patterns:
- yyyy: 4-digit year (e.g., 2024)
- MM: 2-digit month (01-12)
- MMM: 3-letter month abbreviation (e.g., Jan, Feb)
- dd: 2-digit day of the month (01-31)
- HH: 2-digit hour (00-23)
- mm: 2-digit minute (00-59)
- ss: 2-digit second (00-59)
Important Notes:
- The date_format() function returns the result as a string.
- It is often used in combination with other functions like to_date() or to_timestamp(), which convert string columns into date or timestamp types before formatting.