Apache Spark 2.4.0 是 2.x 系列的第五个版本。此版本添加了 Barrier Execution Mode(屏障执行模式),以更好地与深度学习框架集成;引入了 30 多个内置和高阶函数,以便更轻松地处理复杂数据类型;改进了 K8s 集成,并提供了实验性的 Scala 2.12 支持。其他主要更新包括内置的 Avro 数据源、图像数据源、灵活的流式接收器、消除了传输期间 2GB 块大小限制,以及 Pandas UDF 改进。此外,此版本在解决约 1100 个问题的同时,继续专注于可用性、稳定性和完善。
要下载 Apache Spark 2.4.0,请访问下载页面。您可以查阅 JIRA 以获取详细更改。我们在此处整理了一份高级别更改列表,按主要模块分组。
编程指南:Spark RDD 编程指南和Spark SQL、DataFrames 和 Datasets 指南。
编程指南:结构化流编程指南。
编程指南:机器学习库 (MLlib) 指南。
编程指南:SparkR (R 语言 on Spark)。
编程指南:GraphX 编程指南。
请阅读迁移指南以了解所有行为变更
-i
选项最后但同样重要的是,没有以下贡献者的努力,本次发布是不可能实现的:Achuth17, Adam Bradbury, Adamyuanyuan, Adelbert Chang, Ala Luszczak, Aleksandr Koriagin, Alessandro Bellina, Alessandro Solimando, Andrew Korzhuev, Anton Okolnychyi, Antonio Murgia, Arseniy Tashoyan, Artem Rudoy, Arun Mahadevan, Asher Saban, Bago Amirbekian, Benjamin Peterson, Bo Meng, Bogdan Raducanu, Bounkong Khamphousone, Brandon Krieger, Brian Lindblom, Bruce Robbins, Bryan Cutler, Cheng Lian, Chongguang LIU, Chris Horn, Chris Martin, Cody Koeninger, DB Tsai, Daniel Sakuma, Daniel Van Der Ende, Darcy Shen, David Vogelbacher, Devaraj K, Dhruve Ashar, Dilip Biswal, Dongjoon Hyun, DylanGuedes, Efim Poberezkin, Eric Chang, Eric Liang, Erik Erlandson, Eyal Farago, Fangshi Li, Felix Cheung, Feng Liu, Fernando Pereira, Florent Pepin, Fokko Driesprong, Gabor Somogyi, Gengliang Wang, Ger Van Rossum, Gera Shegalov, Goun Na, Hao Ren, Henry Robinson, Herman Van Hovell, Hieu Huynh, Holden Karau, Huang Tengfei, Huaxin Gao, Hyukjin Kwon, Ilan Filonenko, Imran Rashid, Jacek Laskowski, Jake Charland, James Thompson, James Yu, Jaroslav Chladek, Jeff Zhang, JiahuiJiang, Jim Kleckner, Joey Krabacher, John Zhuge, Jongyoul Lee, Jooseong Kim, Jose Torres, Joseph Bradley, Joseph K. Bradley, Josh Rosen, Julien Cuquemelle, Juliusz Sompolski, Jungtaek Lim, KaiXinXIaoLei, Kallman, Steven, Karthik Palaniappan, Kaya Kupferschmidt, Kazuaki Ishizaki, Kelley Robinson, Kent Yao, Kevin Yu, KevinZwx, Koert Kuipers, Kousuke Saruta, Kris Mok, LantaoJin, Lee Dongjin, Lemonjing, Li Jin, Liang-Chi Hsieh, Lu WANG, LucaCanali, Marcelo Vanzin, Marco Gaido, Marek Novotny, Mario Molina, Mark Petruska, Maryann Xue, Mathieu St-Louis, Matthew Cheah, Matthew Tovbin, Mauro Palsgraaf, Maxim Gekk, Michael (Stu) Stewart, Michael Allman, Michael Chirico, Michael Mior, Michal Switakowski, Mihaly Toth, Miklos C, Miles Yucht, Misha Dmitriev, Mukul Murthy, Mykhailo Shtelma, Neal Song, Ngone51, Nihar Sheth, Nolan Emirot, Norman Maurer, Onur Satici, Patrick McGloin, Patrick Pisciuneri, Paul Mackles, Peter Toth, Prashant Sharma, Rao Fu, Ray Burgemeestre, Rekha Joshi, Reynold Xin, Reza Safi, Ricardo Martinelli De Oliveira, Rob Vesse, Robert Kruszewski, Rong Tang, Ryan Blue, Sahil Takiar, Saisai Shao, Sandeep Singh, Sandor Murakozi, Sanket Chintapalli, Santiago Saavedra, Sean Owen, Sean Suchter, Sebastian Arzt, Shane Knapp, Shixiong Zhu, Soham Aurangabadkar, Stacy Kerkela, Stan Zhai, Stavros Kontopoulos, Steve Loughran, Sunitha Kambhampati, Takeshi Yamamuro, Takuya UESHIN, Tathagata Das, Ted Yu, Teng Peng, Thiruvasakan Paramasivan, Thomas Graves, Tom Saleeba, Vayda, Oleksandr: IT (PRG), Vinod KC, Vladimir Kuriatkov, Wang Gengliang, Weichen Xu, Wenbo Zhao, Wenchen Fan, William Sheu, XD-DENG, Xiangrui Meng, Xianjin YE, Xianyang Liu, Xiao Li, Xiaogang Tu, Xiayun Sun, Xingbo Jiang, Yacine Mazari, Yash Sharma, Ye Zhou, Yinan Li, Yogesh Garg, Yuanbo Liu, Yuanjian Li, Yuchen Huo, Yuexin Zhang, Yuming Wang, Yuval Itzchakov, Zhan Zhang, Zhang Le, Zheng RuiFeng, Zoltan C. Toth