Apache Spark 2.4.0 是 2.x 系列中的第五个版本。此版本添加了屏障执行模式,以便更好地与深度学习框架集成,引入了 30 多个内置和高阶函数,以便更轻松地处理复杂数据类型,改进了 K8s 集成,以及实验性的 Scala 2.12 支持。其他主要更新包括内置的 Avro 数据源、Image 数据源、灵活的流式接收器、消除传输期间 2GB 块大小限制以及 Pandas UDF 改进。此外,此版本继续专注于可用性、稳定性和优化,同时解决了大约 1100 个问题。
要下载 Apache Spark 2.4.0,请访问下载页面。您可以查阅 JIRA 以了解详细更改。我们在此处整理了一份高级更改列表,按主要模块分组。
编程指南:Spark RDD 编程指南 和 Spark SQL、DataFrames 和 Datasets 指南。
编程指南:Structured Streaming 编程指南。
编程指南:机器学习库 (MLlib) 指南。
编程指南:SparkR (R on Spark)。
编程指南:GraphX 编程指南。
请阅读迁移指南,了解所有行为更改
-i
选项最后但同样重要的是,如果没有以下贡献者,这个版本是不可能完成的:Achuth17, Adam Bradbury, Adamyuanyuan, Adelbert Chang, Ala Luszczak, Aleksandr Koriagin, Alessandro Bellina, Alessandro Solimando, Andrew Korzhuev, Anton Okolnychyi, Antonio Murgia, Arseniy Tashoyan, Artem Rudoy, Arun Mahadevan, Asher Saban, Bago Amirbekian, Benjamin Peterson, Bo Meng, Bogdan Raducanu, Bounkong Khamphousone, Brandon Krieger, Brian Lindblom, Bruce Robbins, Bryan Cutler, Cheng Lian, Chongguang LIU, Chris Horn, Chris Martin, Cody Koeninger, DB Tsai, Daniel Sakuma, Daniel Van Der Ende, Darcy Shen, David Vogelbacher, Devaraj K, Dhruve Ashar, Dilip Biswal, Dongjoon Hyun, DylanGuedes, Efim Poberezkin, Eric Chang, Eric Liang, Erik Erlandson, Eyal Farago, Fangshi Li, Felix Cheung, Feng Liu, Fernando Pereira, Florent Pepin, Fokko Driesprong, Gabor Somogyi, Gengliang Wang, Ger Van Rossum, Gera Shegalov, Goun Na, Hao Ren, Henry Robinson, Herman Van Hovell, Hieu Huynh, Holden Karau, Huang Tengfei, Huaxin Gao, Hyukjin Kwon, Ilan Filonenko, Imran Rashid, Jacek Laskowski, Jake Charland, James Thompson, James Yu, Jaroslav Chladek, Jeff Zhang, JiahuiJiang, Jim Kleckner, Joey Krabacher, John Zhuge, Jongyoul Lee, Jooseong Kim, Jose Torres, Joseph Bradley, Joseph K. Bradley, Josh Rosen, Julien Cuquemelle, Juliusz Sompolski, Jungtaek Lim, KaiXinXIaoLei, Kallman, Steven, Karthik Palaniappan, Kaya Kupferschmidt, Kazuaki Ishizaki, Kelley Robinson, Kent Yao, Kevin Yu, KevinZwx, Koert Kuipers, Kousuke Saruta, Kris Mok, LantaoJin, Lee Dongjin, Lemonjing, Li Jin, Liang-Chi Hsieh, Lu WANG, LucaCanali, Marcelo Vanzin, Marco Gaido, Marek Novotny, Mario Molina, Mark Petruska, Maryann Xue, Mathieu St-Louis, Matthew Cheah, Matthew Tovbin, Mauro Palsgraaf, Maxim Gekk, Michael (Stu) Stewart, Michael Allman, Michael Chirico, Michael Mior, Michal Switakowski, Mihaly Toth, Miklos C, Miles Yucht, Misha Dmitriev, Mukul Murthy, Mykhailo Shtelma, Neal Song, Ngone51, Nihar Sheth, Nolan Emirot, Norman Maurer, Onur Satici, Patrick McGloin, Patrick Pisciuneri, Paul Mackles, Peter Toth, Prashant Sharma, Rao Fu, Ray Burgemeestre, Rekha Joshi, Reynold Xin, Reza Safi, Ricardo Martinelli De Oliveira, Rob Vesse, Robert Kruszewski, Rong Tang, Ryan Blue, Sahil Takiar, Saisai Shao, Sandeep Singh, Sandor Murakozi, Sanket Chintapalli, Santiago Saavedra, Sean Owen, Sean Suchter, Sebastian Arzt, Shane Knapp, Shixiong Zhu, Soham Aurangabadkar, Stacy Kerkela, Stan Zhai, Stavros Kontopoulos, Steve Loughran, Sunitha Kambhampati, Takeshi Yamamuro, Takuya UESHIN, Tathagata Das, Ted Yu, Teng Peng, Thiruvasakan Paramasivan, Thomas Graves, Tom Saleeba, Vayda, Oleksandr: IT (PRG), Vinod KC, Vladimir Kuriatkov, Wang Gengliang, Weichen Xu, Wenbo Zhao, Wenchen Fan, William Sheu, XD-DENG, Xiangrui Meng, Xianjin YE, Xianyang Liu, Xiao Li, Xiaogang Tu, Xiayun Sun, Xingbo Jiang, Yacine Mazari, Yash Sharma, Ye Zhou, Yinan Li, Yogesh Garg, Yuanbo Liu, Yuanjian Li, Yuchen Huo, Yuexin Zhang, Yuming Wang, Yuval Itzchakov, Zhan Zhang, Zhang Le, Zheng RuiFeng, Zoltan C. Toth