Publications
For an up-to-date list, see my Google Scholar profile.
- SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge. Tech Report 2025
- MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs. Empirical Methods in Natural Language Processing (EMNLP) 2025
- Confidence Improves Self-Consistency in LLMs. Association for Computational Linguistics (ACL) 2025 Findings
- Keep Guessing? When Considering Inference Scaling, Mind the Baselines. North American Chapter of the Association for Computational Linguistics (NAACL) 2025 Findings
- Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words? Empirical Methods in Natural Language Processing (EMNLP) 2024
- Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? Empirical Methods in Natural Language Processing (EMNLP) 2024
- Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers. Association for Computational Linguistics (ACL) 2024
- Surfacing Biases in Large Language Models using Contrastive Input Decoding. Preprint
- Malign Overfitting: Interpolation Can Provably Preclude Invariance. International Conference on Learning Representations (ICLR) 2023
- Decision-Making under Miscalibration. Innovations in Theoretical Computer Science (ITCS) 2023
- Useful Confidence Measures: Beyond the Max Score. Workshop on Distribution Shifts at NeurIPS 2022
- Active Learning with Label Comparisons. Uncertainty in Artificial Intelligence (UAI) 2022
- Beyond Bernoulli: Generating Random Outcomes that cannot be Distinguished from Nature. Algorithmic Learning Theory (ALT) 2022
- On Fairness and Stability in Two-Sided Matchings. Innovations in Theoretical Computer Science (ITCS) 2022
- Revisiting Sanity Checks for Saliency Maps. Workshop on eXplainable AI approaches for debugging and diagnosis at NeurIPS 2021
- Consider the Alternatives: Navigating Fairness-Accuracy Tradeoffs via Disqualification. ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO) 2021
- Multi-group Agnostic PAC Learning. International Conference on Machine Learning (ICML) 2021
- Who's responsible? Jointly quantifying the contribution of the learning algorithm and training data. Artificial Intelligence, Ethics and Society (AIES) 2021
- Outcome Indistinguishability. Symposium on Theory of Computing (STOC) 2021
- Addressing bias in prediction models by improving subpopulation calibration. Journal of the American Medical Informatics Association (JAMIA) 2020
- Developing a COVID-19 mortality risk prediction model when individual-level data are not available. Nature Communications 2020
- Preference-Informed Fairness. Innovations in Theoretical Computer Science (ITCS) 2020
- Evidence-Based Rankings. Foundations of Computer Science (FOCS) 2019
- Probably Approximately Metric Fair Learning. International Conference on Machine Learning (ICML) 2018
Blog
Talks
Factuality & trustworthiness: I gave a keynote at the UncertaintyNLP workshop at EMNLP 2025 on trustworthiness beyond factuality, spoke about factuality at Google AI Research Day 2025, and presented on evaluating with multi-granularity answers at TAU AI Day 2024.
Fairness in ML: I gave a keynote titled "How Fair Can We Be?" at WiDS TLV (2020) and was a guest on the Unsupervised podcast discussing fairness.
Meetups: I've spoken at DataTalks TLV about test-set reuse and adversarial learning, and at PyData TLV on visualizing high-dimensional data.