Note: adding an index column to a Spark DataFrame


https://stackoverflow.com/questions/43406887/spark-dataframe-how-to-add-a-index-column-aka-distributed-data-index

 

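import pyspark.sql.functions as fn
from pyspark.sql import SparkSession, Window


# df is an existing DataFrame. monotonically_increasing_id() is increasing but not
# consecutive, so row_number() over that ordering yields 1, 2, 3, ...; subtracting 1 starts the index at 0.
# Note: a window with no partitionBy moves all rows into a single partition.
df = df.withColumn("__no", fn.row_number().over(Window.orderBy(fn.monotonically_increasing_id())) - 1)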
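
Why wrap monotonically_increasing_id() in row_number() at all? A rough sketch in plain Python, with no Spark involved, may help. It assumes the id layout described in the Spark documentation (partition index in the upper bits, row position within the partition in the lower 33 bits). The function names and the sample partitions are made up for illustration and are not part of any Spark API.

```python
# Pure-Python simulation (no Spark needed) of the idea behind the snippet above.
# Assumption: ids follow the layout documented for monotonically_increasing_id():
# partition index in the upper bits, row position in the lower 33 bits.

def simulated_monotonic_ids(partitions):
    """Return (id, row) pairs for a list of partitions (each a list of rows)."""
    pairs = []
    for part_idx, rows in enumerate(partitions):
        for offset, row in enumerate(rows):
            pairs.append(((part_idx << 33) + offset, row))
    return pairs


def add_zero_based_index(partitions):
    """Mimic row_number().over(Window.orderBy(id)) - 1: rank rows by id, starting at 0."""
    ordered = sorted(simulated_monotonic_ids(partitions), key=lambda pair: pair[0])
    return [(row, position) for position, (_id, row) in enumerate(ordered)]


partitions = [["a", "b"], ["c"], ["d", "e"]]  # hypothetical data spread over 3 partitions
raw_ids = [i for i, _ in simulated_monotonic_ids(partitions)]
indexed = add_zero_based_index(partitions)

print(raw_ids)   # increasing, but with large gaps between partitions
print(indexed)   # consecutive 0-based index after ranking
```

The raw ids jump at every partition boundary, so they are not usable as a 0, 1, 2, ... index on their own; ranking the rows by those ids is what makes the index consecutive.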
