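Problem description
I have a column of strings in a PySpark DataFrame.
I tried three different ways to convert the strings to dates, but all of them return NULL.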
Any ideas?
Solution
In this case, you can use the to_date(), to_timestamp(), or from_unixtime(unix_timestamp()) functions by specifying the format as MM/dd/yy.
Example:
df=spark.createDataFrame([('12/18/20',)],['ExpDate'])
from pyspark.sql.functions import *
df.withColumn("ExpDate1",to_date(col("ExpDate"),"MM/dd/yy")).\
show()
#+--------+----------+
#| ExpDate| ExpDate1|
#+--------+----------+
#|12/18/20|2020-12-18|
#+--------+----------+
#to get timestamp type
df.withColumn("ExpDate1",to_timestamp(col("ExpDate"),"MM/dd/yy")).\
show()
#+--------+-------------------+
#| ExpDate| ExpDate1|
#+--------+-------------------+
#|12/18/20|2020-12-18 00:00:00|
#+--------+-------------------+
#or to get date from timestamp
df.withColumn("ExpDate1",to_timestamp(col("ExpDate"),"MM/dd/yy").cast("date")).\
show()
#+--------+----------+
#| ExpDate| ExpDate1|
#+--------+----------+
#|12/18/20|2020-12-18|
#+--------+----------+
df.select('ExpDate',from_unixtime(unix_timestamp('ExpDate','MM/dd/yy')).cast('date').alias("ExpDate1")).show()
#+--------+----------+
#| ExpDate| ExpDate1|
#+--------+----------+
#|12/18/20|2020-12-18|
#+--------+----------+
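The NULL results in the question are consistent with a pattern that does not match the layout of the string. As a rough analogy only, the plain-Python sketch below shows the same idea with datetime.strptime, where "%m/%d/%y" plays the role of Spark's "MM/dd/yy". The helper parse_or_none is hypothetical and not part of PySpark. It returns None on a mismatch, which loosely mirrors to_date() producing NULL (the behavior reported in the question, assuming ANSI mode is off).

```python
from datetime import date, datetime
from typing import Optional


# Hypothetical helper (not a PySpark API): parse a string with a strptime
# pattern and return None on mismatch instead of raising, loosely mirroring
# to_date() yielding NULL when the pattern does not fit the input.
def parse_or_none(value: str, pattern: str) -> Optional[date]:
    try:
        return datetime.strptime(value, pattern).date()
    except ValueError:
        return None


# "%m/%d/%y" is the strptime counterpart of Spark's "MM/dd/yy".
matching = parse_or_none("12/18/20", "%m/%d/%y")

# "%Y-%m-%d" (comparable to Spark's "yyyy-MM-dd") does not fit "12/18/20".
mismatched = parse_or_none("12/18/20", "%Y-%m-%d")

print(matching)    # 2020-12-18
print(mismatched)  # None
```

If a conversion returns NULL for every row, comparing the pattern against one sample value character by character (separators, two-digit versus four-digit year) is usually the quickest check.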