[Paper Reading] PENCIL: Long Thoughts with Short Memory
Main References
- Chenxiao Yang, Nathan Srebro, David McAllester, and Zhiyuan Li. PENCIL: Long Thoughts with Short Memory. arXiv preprint, 2025.
Supplementary References
- Maxwell Nye et al. Show Your Work: Scratchpads for Intermediate Computation with Language Models. arXiv preprint, 2021.
- Jason Wei et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 2022.
- William Merrill and Ashish Sabharwal. The Parallelism Tradeoff: Limitations of Log-Precision Transformers. TACL 2023.
- William Merrill and Ashish Sabharwal. The Expressive Power of Transformers with Chain of Thought. ICLR 2024.