Explanation:
transactionsDf.withColumn("transactionDateFormatted", from_unixtime("transactionDate", format="MM/dd/yyyy"))
Correct. This code block adds a new column named transactionDateFormatted to DataFrame transactionsDf, using Spark's from_unixtime function to transform the values in column
transactionDate into strings, following the MM/dd/yyyy format requested in the question.
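A minimal, self-contained sketch of the correct approach (the one-row DataFrame, the session setup, and the UTC timezone are illustrative assumptions; 1587915332 is the unix timestamp for 2020-04-26 15:35:32 UTC):

from pyspark.sql import SparkSession
from pyspark.sql.functions import from_unixtime

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.session.timeZone", "UTC")  # assumption: UTC, so the output is deterministic

# Hypothetical stand-in for transactionsDf: unix timestamps in seconds
transactionsDf = spark.createDataFrame([(1587915332,)], ["transactionDate"])

transactionsDf.withColumn(
    "transactionDateFormatted",
    from_unixtime("transactionDate", format="MM/dd/yyyy"),
).show()
# +---------------+------------------------+
# |transactionDate|transactionDateFormatted|
# +---------------+------------------------+
# |     1587915332|              04/26/2020|
# +---------------+------------------------+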
transactionsDf.withColumn("transactionDateFormatted", from_unixtime("transactionDate", format="dd/MM/yyyy"))
No. Although almost correct, this uses the wrong format string for the timestamp-to-string conversion: day/month/year (dd/MM/yyyy) instead of the requested month/day/year (MM/dd/yyyy).
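Continuing the sketch above, rendering the same sample row with this answer's format string shows the mix-up:

transactionsDf.withColumn(
    "transactionDateFormatted",
    from_unixtime("transactionDate", format="dd/MM/yyyy"),
).show()
# produces 26/04/2020 instead of the requested 04/26/2020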
transactionsDf.withColumnRenamed("transactionDate", "transactionDateFormatted", from_unixtime("transactionDateFormatted", format="MM/dd/yyyy"))
Incorrect. This answer uses the wrong syntax. DataFrame.withColumnRenamed() is only for renaming an existing column and takes exactly two string parameters, specifying the old and the new name
of the column.
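For reference, a hedged sketch of what withColumnRenamed() actually accepts (reusing the hypothetical transactionsDf from the sketch above):

# Renames an existing column; exactly two string arguments, no column expression
transactionsDf.withColumnRenamed("transactionDate", "transactionDateFormatted")
# withColumnRenamed() cannot compute new values; withColumn() (see above) is the tool for that.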
transactionsDf.apply(from_unixtime(format="MM/dd/yyyy")).asColumn("transactionDateFormatted")
Wrong. Although this answer looks very tempting, it is actually incorrect Spark syntax: there is no DataFrame.apply() method in Spark, and no asColumn() method either. Spark does have an apply() method that can be used on grouped
data, but that is irrelevant for this question, since we are not dealing with grouped data here.
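For contrast, a sketch of where apply-style logic does exist in PySpark: on grouped data, via applyInPandas() (Spark 3.0+; the grouping column and the identity function are illustrative assumptions):

import pandas as pd

# transactionsDf.apply(...)  # would raise AttributeError: DataFrame has no apply()

def identity(pdf: pd.DataFrame) -> pd.DataFrame:
    return pdf  # a per-group pandas transformation would go here

transactionsDf.groupBy("transactionDate").applyInPandas(identity, schema=transactionsDf.schema)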
transactionsDf.withColumn("transactionDateFormatted", from_unixtime("transactionDate"))
No. Although this is valid Spark syntax, the strings in column transactionDateFormatted would follow from_unixtime's default format, yyyy-MM-dd HH:mm:ss (for example, 2020-04-26 15:35:32), which is
not what the question asks for.
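A quick sketch of the default behavior, again on the hypothetical single-row transactionsDf with a UTC session timezone:

transactionsDf.withColumn(
    "transactionDateFormatted",
    from_unixtime("transactionDate"),  # no format argument, so the default yyyy-MM-dd HH:mm:ss applies
).show()
# +---------------+------------------------+
# |transactionDate|transactionDateFormatted|
# +---------------+------------------------+
# |     1587915332|     2020-04-26 15:35:32|
# +---------------+------------------------+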
More info: pyspark.sql.functions.from_unixtime — PySpark 3.1.1 documentation and pyspark.sql.DataFrame.withColumnRenamed — PySpark 3.1.1 documentation
Static notebook | Dynamic notebook: See test 1, QUESTION NO: 39 (Databricks import instructions)