Publications
Greedy-Advantage-Aware RLHF
Blogpost published to LessWrong
Greedy-Advantage-Aware RLHF addresses the negative side effects from misspecified reward functions problem in language modeling domains. In a simple setting, the algorithm improves on traditional RLHF methods by producing agents that have a reduced tendency to exploit misspecified reward functions. I also detect the presence of sharp parameter topology in reward hacking agents, which suggests future research directions.
Are They What They Claim: A Comprehensive Study of Ordinary Linear Regression Among the Top Machine Learning Libraries in Python
Paper presented at 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
We authored a comprehensive survey of current implementations of the Original Least Squares method in popular Python libraries (TensorFlow, PyTorch, scikit-learn, MXNet) to give users actionable information about state of ML. Within this work, we conducted original experiments to analyze the runtime across platforms, space requirement, performance over big data, and strength of model implementation of these popular libraries.
AReS: An AutoML Regression Service for Data Analytics and Novel Data-centric Visualizations
Paper presented at 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Our service is intended to enable researchers to make use of ML that can augment data analysis and exploration in their respective fields. AReS allows users to upload data and automatically build dozens of different ML models, each with its own strengths. These models are compared to determine which is most effective, and several, novel data-centric visualizations are presented in an informative report. This is intended to help users better understand their data and the effectiveness of the models over their data. AReS can be found here.