
pyspark.sql.DataFrameWriter.saveAsTable — PySpark 4.1.0 documentation
>>> _ = spark.sql("DROP TABLE IF EXISTS tblA")
>>> spark.createDataFrame([
...     (100, "Hyukjin Kwon"), (120, "Hyukjin Kwon"), (140, "Haejoon Lee")],
...     schema=["age", "name"]
...
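A runnable sketch of the same pattern, assuming an active SparkSession bound to the name spark and a writable warehouse location; the table name tblA and the sample rows mirror the excerpt above.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("saveAsTable-demo").getOrCreate()
    spark.sql("DROP TABLE IF EXISTS tblA")
    df = spark.createDataFrame(
        [(100, "Hyukjin Kwon"), (120, "Hyukjin Kwon"), (140, "Haejoon Lee")],
        schema=["age", "name"],
    )
    # Persist the DataFrame as a managed table; mode("overwrite") replaces any existing table.
    df.write.mode("overwrite").saveAsTable("tblA")
    spark.table("tblA").show()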
pyspark.sql.DataFrame.select — PySpark 4.1.0 documentation
DataFrame.select(*cols): Projects a set of expressions and returns a new DataFrame. New in version 1.3.0. Changed in version 3.4.0: Supports Spark Connect.
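A minimal sketch of select, assuming the spark session from the earlier example; the DataFrame and column names are illustrative.

    from pyspark.sql import functions as F

    df = spark.createDataFrame([(100, "Hyukjin Kwon")], schema=["age", "name"])
    # Project an existing column plus a computed expression with an alias.
    df.select("name", (F.col("age") + 1).alias("age_plus_one")).show()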
pyspark.sql.Column.cast — PySpark 4.1.0 documentation - Apache Spark
Column.cast(dataType): Casts the column into type dataType. New in version 1.3.0. Changed in version 3.4.0: Supports Spark Connect.
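A small sketch of cast, assuming the spark session above; both a DataType instance and a DDL-style type name are accepted.

    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    df = spark.createDataFrame([(2, "Alice")], schema=["age", "name"])
    df.select(
        F.col("age").cast(StringType()).alias("age_str"),   # cast via a DataType object
        F.col("age").cast("long").alias("age_long"),        # cast via a type name string
    ).show()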
pyspark.sql.DataFrame.withColumnRenamed - Apache Spark
DataFrame.withColumnRenamed(existing, new): Returns a new DataFrame by renaming an existing column. This is a no-op if the schema doesn't contain the given column name.
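A one-line sketch of withColumnRenamed, assuming the spark session above; the new column name years is illustrative.

    df = spark.createDataFrame([(2, "Alice")], schema=["age", "name"])
    # Rename "age" to "years"; renaming a column that is not in the schema is a no-op.
    df.withColumnRenamed("age", "years").printSchema()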
pyspark.sql.DataFrame.unionByName - Apache Spark
Returns a new DataFrame containing the union of rows in this and another DataFrame. This method performs a union operation on both input DataFrames, resolving columns by name (rather than by position).
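A short sketch of unionByName, assuming the spark session above; the two DataFrames deliberately list their columns in different orders.

    df1 = spark.createDataFrame([(1, 2)], schema=["a", "b"])
    df2 = spark.createDataFrame([(3, 4)], schema=["b", "a"])
    # Rows are aligned by column name, not position, so 4 ends up under "a" and 3 under "b".
    df1.unionByName(df2).show()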
pyspark.sql.DataFrame.dropna — PySpark 4.0.1 documentation
DataFrame.dropna(how='any', thresh=None, subset=None): Returns a new DataFrame omitting rows with null or NaN values.
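A short sketch of dropna, assuming the spark session above; the DDL schema string and sample rows are illustrative.

    df = spark.createDataFrame(
        [(10, "Alice"), (None, "Bob"), (None, None)],
        schema="age INT, name STRING",
    )
    # how="any" drops a row if any column is null; subset= restricts the check to given columns.
    df.dropna(how="any").show()
    df.dropna(subset=["name"]).show()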
pyspark.sql.functions.date_format — PySpark 4.1.0 documentation
Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument. A pattern could be for instance dd.MM.yyyy and could return a string like '18.03.1993'.
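A short sketch of date_format, assuming the spark session above; the input date matches the pattern example in the excerpt.

    from pyspark.sql import functions as F

    df = spark.createDataFrame([("1993-03-18",)], schema=["dt"])
    # Render the date with the dd.MM.yyyy pattern, producing "18.03.1993".
    df.select(F.date_format("dt", "dd.MM.yyyy").alias("formatted")).show()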
StructField — PySpark 4.0.1 documentation - Apache Spark
DDL-formatted string representation of types, e.g. pyspark.sql.types.DataType.simpleString, except that top level struct type can omit the struct<> for the compatibility reason with …
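A sketch contrasting an explicit StructType of StructFields with the equivalent DDL-formatted string, assuming the spark session above; both forms produce the same schema.

    from pyspark.sql.types import StructType, StructField, IntegerType, StringType

    schema = StructType([
        StructField("age", IntegerType(), nullable=True),
        StructField("name", StringType(), nullable=True),
    ])
    # The same schema as a DDL-formatted string; the top-level struct<...> wrapper may be omitted.
    ddl = "age INT, name STRING"
    df = spark.createDataFrame([(100, "Hyukjin Kwon")], schema=ddl)
    df.printSchema()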
pyspark.sql.DataFrame.replace — PySpark 4.1.0 documentation
When replacing, the new value will be cast to the type of the existing column. For numeric replacements all values to be replaced should have unique floating point representation. In case of conflicts (for example with {42: -1, 42.0: 1}), an arbitrary replacement will be used.
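A short sketch of replace, assuming the spark session above; the values being swapped are illustrative.

    df = spark.createDataFrame([(10, "Alice"), (20, "Bob")], schema=["age", "name"])
    # Replace a single numeric value; the new value is cast to the column's existing type.
    df.replace(10, 15).show()
    # Dict form maps old values to new ones; subset= limits the replacement to listed columns.
    df.replace({"Alice": "Alicia"}, subset=["name"]).show()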
pyspark.sql.Column.isin — PySpark 4.1.0 documentation - Apache Spark
Parameters:
    cols : Any
        The values to compare with the column values. The result will only be true at a location if any value matches in the Column.
Returns:
    Column
        Column of booleans showing whether each element in the Column is contained in cols.
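A short sketch of isin used as a filter, assuming the spark session above; values can be passed unpacked or as a single list.

    from pyspark.sql import functions as F

    df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], schema=["age", "name"])
    # Keep rows whose name is one of the listed values.
    df.filter(F.col("name").isin("Alice", "Carol")).show()
    # A list works the same way.
    df.filter(F.col("age").isin([2, 3, 4])).show()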