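Apache Spark 3.2.0 is the third release of the 3.x line. Thanks to substantial contributions from the open-source community, this release resolves more than 1,700 Jira tickets.
In this release, Spark supports the Pandas API layer on Spark. Pandas users can scale out their applications on Spark with a one-line code change. Other major updates include RocksDB StateStore support, session window support, push-based shuffle support, ANSI SQL INTERVAL types, Adaptive Query Execution (AQE) enabled by default, and ANSI SQL mode reaching GA.
To download Apache Spark 3.2.0, visit the downloads page. You can consult JIRA for the detailed changes. We have curated a list of high-level changes here, grouped by major modules.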
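As a rough illustration of the "one-line code change" mentioned above, the sketch below writes a small aggregation against a pandas-style module passed in as a parameter. The function name `average_price_by_fruit` and the sample data are hypothetical; the only Spark-specific piece is the `pyspark.pandas` import, which assumes a working PySpark 3.2+ installation with pandas and PyArrow available.

```python
def average_price_by_fruit(pd):
    """Compute the mean price per fruit.

    `pd` is any pandas-compatible module: plain `pandas`, or
    `pyspark.pandas` when the work should run on Spark.
    """
    df = pd.DataFrame(
        {
            "fruit": ["apple", "pear", "apple", "pear"],
            "price": [1.0, 2.0, 3.0, 4.0],
        }
    )
    # The same groupby/mean call is available in both APIs.
    means = df.groupby("fruit")["price"].mean().sort_index()
    return means.to_dict()


# Usage with plain pandas:
#     import pandas as pd
#     average_price_by_fruit(pd)
#
# Usage on Spark (the one-line change is the import):
#     import pyspark.pandas as pd
#     average_price_by_fruit(pd)
```

Not every pandas operation is covered by the Pandas API on Spark, and some behave differently on distributed data (row ordering, for example), so existing code may need more than the import swap.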
ANSI SQL Compatibility Enhancements
Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE (SPARK-30789)
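To make the effect of the null-handling option concrete, here is a minimal sketch. The SQL string shows the clause placement for LAST_VALUE; the table `readings` and its columns are hypothetical. The two Python functions are not Spark code: they are a plain reference model of what LAST_VALUE returns over a running frame (unbounded preceding to current row) under each option.

```python
from typing import List, Optional, Sequence

# Clause placement in Spark SQL 3.2+; `readings`, `ts`, and `value` are
# hypothetical names used only for illustration.
QUERY = """
SELECT ts,
       value,
       LAST_VALUE(value) IGNORE NULLS OVER (
           ORDER BY ts
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS last_known
FROM readings
"""


def last_value_ignore_nulls(values: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Model of LAST_VALUE(...) IGNORE NULLS over a running frame.

    Each row gets the most recent non-null value seen so far,
    or None if no non-null value has appeared yet.
    """
    result: List[Optional[int]] = []
    last_seen: Optional[int] = None
    for v in values:
        if v is not None:
            last_seen = v
        result.append(last_seen)
    return result


def last_value_respect_nulls(values: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Model of LAST_VALUE(...) RESPECT NULLS over the same frame.

    The last row of the frame is the current row, so nulls pass through.
    """
    return list(values)
```

With input `[1, None, None, 4, None]` ordered by `ts`, the IGNORE NULLS model carries the last known value forward, while the RESPECT NULLS model returns the input unchanged.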
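Performance
Connector Enhancements
Kubernetes Enhancements
Data Source V2 API
Feature Enhancements
Other Notable Changes
Major Features
Other Notable Changes
Project Zen
Other Notable Changes
Performance Improvements
Model Training Improvements
BLAS Improvements
Other Notable Changes
Programming guide: Machine Learning Library (MLlib) Guide.
Programming guide: SparkR (R on Spark).
Programming guide: GraphX Programming Guide.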
Last but not least, this release would not have been possible without the following contributors: Adam Binford, Ali Afroozeh, Alkis Polyzotis, Allison Wang, Almog Tavor, Amandeep Sharma, Ammar Al-Batool, Andrew Liu, Andy Grove, Ankur Dave, Anton Okolnychyi, Ashray Jain, Attila Zsolt Piros, Ayushi Agarwal, Baohe Zhang, Bo Zhang, Bruce Robbins, Byungsoo Oh, Carlos Peña, Cary Lee, Chandni Singh, Chao Sun, ChaoJun Zhang, Chendi Xue, Cheng Pan, Cheng Su, Chongguang LIU, Chris Thomas, Chris Wu, Daoyuan Wang, David Christle, David Li, David McWhorter, Denis Pyshev, Dereck Li, Dhruv Kumar, Dhruvil Dave, Dingyu Xu, Dominik Gehl, Dongdong Hong, Dongjoon Hyun, Dooyoung Hwang, Duc Hoa Nguyen, Emil Ejbyfeldt, Enzo Bonnal, Erik Krogen, Eugene Koifman, Fabian A.J. Thiele, Fokko Driesprong, Fu Chen, Gabor Somogyi, Gabriele Nizzoli, Gengliang Wang, Gera Shegalov, Gidon Gershinsky, Guangxin Wang, Haejoon Lee, Haiyang Sun, Han, Harsh Panchal, He Qi, Hector Zhang, Holden Karau, Hopefulnick, Huaxin Gao, Hyukjin Kwon, Ionut Boicu, Ismaël Mejía, Ivan Sadikov, Jarek Potiuk, Jason Yarbrough, Jiaan Geng, Jie Hu, Jose Torres, Josh Rosen, Josh Soref, Julien Lafaye, Jungtaek Lim, Kaifei Yi, Kamil Breguła, Karen Feng, Karuppayya Rajendran, Kazuyuki Tanimura, Ke Jia, Keerthan Vasist, Kent Yao, Kevin Pis, Kevin Su, Koert Kuipers, Kousuke Saruta, Kun Wan, Kunlun Huang, Leanken Lin, Lei Peng, Leona Yoda, Li Zhang, Liang-Chi Hsieh, Lidiya Nixon, Linhong Liu, Lipeng Zhu, Luca Canali, Ludovic Henry, Luka Sturtewagen, Lukas Rytz, Luran He, Maciej Szymkiewicz, Marios Meimaris, Maryann Xue, Matthew Powers, Max Gekk, Maya Anderson, Michael Chen, Michael Zhang, Min Shen, Minchu Yang, Mohanad Elsafty, Nicholas Marion, Ohad Raviv, Pablo Langa, Pawel Ptaszynski, Peter Toth, Phillip Henry, Prakhar Jain, Qi Liu, Qi Zhu, Qilong SU, Qingbo Jiao, Quang-Huy Nguyen, Rahul Mahadev, Raza Jafri, Richard Chen, Richard Penney, Rongchuan Jin, Rui Zeng, Ruifeng Zheng, Ryan Blue, Sajith Ariyarathna, Samuel Moseley, Sanket Reddy, Satish Gopalani, Saurabh Chawla, Sean Owen, Serge Rielau, Shahid K I, Shaoyun Chen, Shardul Mahadik, Shiqi Sun, Shixiong Zhu, Steve Loughran, Steven Aerts, Sumeet Gajjar, Swinky Mann, Takeshi Yamamuro, Takuya UESHIN, Tanel Kiis, Tathagata Das, Tengfei Huang, Terry Kim, Tianhan Hu, Tianhua Huang, Tim Armstrong, Tobias Hermann, Tom Van Bussel, Tomas Pereira De Vasconcelos, Twoentartian, Vasily Kolpakov, Venkata Krishnan Sowrirajan, Venkata Sai Akhil Gudesa, Venki Korukanti, Viettel DGD, Vinod KC, Vlad Glinsky, Walid Gara, Wan Kun, Weichen Xu, Wenchen Fan, William Hyun, Xiao Li, Xiduo You, Xingbo Jiang, Xinrong Meng, XiuLi Wei, Xuedong Luan, Yajun Gao, Yang He, Yang Jie, Yazhi Wang, Ye Zhou, Yi Wu, Yi Zhu, Yijia Cui, Yikun Jiang, Yingyi Bu, Yu Zhong, Yuanjian Li, Yuchen Huo, Yuming Wang, Yuto Akutsu, Zebing Lin, Zhang Xingchao, Zhichao Zhang