Re: [presto-tsc] Nominate Arun as Presto Committer


Thanks for all the votes! We have now reached a majority (9 affirmative votes out of 17 committers). Congrats @Arun Thirupathi on becoming a committer!






From: James Petty <petty.jamesm@...>
Date: Tuesday, October 19, 2021 at 3:29 PM
To: James Sun <jamessun@...>
Cc: presto-tsc@... <presto-tsc@...>
Subject: Re: [presto-dev] [presto-tsc] Nominate Arun as Presto Committer



On Tue, Oct 19, 2021 at 12:37 PM jamessun via <> wrote:

Hi Presto committers,



I would like to nominate Arun Thirupathi (GitHub ID: arunthirupathi) as a Presto committer. Arun has contributed a lot to the Presto ORC reader and writer. He is now one of the few experts in the Presto community with a deep grasp of file formats.


In Presto, Arun has worked extensively on ORC support and is an expert in columnar file formats. He has also worked on Presto core and contributed to multiple modules. He improves the code through constant refactoring, adding tests, and improving the Presto documentation. He has reviewed most changes to the columnar file formats, as well as changes from different contributors, and provides quality feedback.


In addition, Arun is not new to open source. Arun was an active member of Voldemort, a distributed key/value store that was once popular.


In detail, Arun has

  • 48 commits
  • 10K lines added and 4K lines removed
  • 30+ PR reviews and 140+ review comments


The major contributions include:


  1. Rewrote the ORC dictionary writer to improve performance by 3X.
  2. Improved the performance of queries that use map functions such as MAP_AGG and ELEMENT_AT by introducing lazy hash tables in Presto.
  3. Improved the IO performance of the Presto ORC reader and writer by introducing new layouts and configurable tail sizes.
  4. Optimized the dictionary writer performance by using chunked memory and optimized data structures.
  5. Fixed multiple bugs in presto-orc, such as support for stripes with 2 billion rows, masked IO errors, and Hive filter pushdown bugs.
  6. Improved columnar statistics memory efficiency.
  7. Simplified the presto-orc code by constant refactoring.
  8. Maintained and upgraded Presto dependencies such as orc-protobuf, fastutil, and hive-apache.



Please reply to this email to vote.





