Applications of LLMs
Overview
The purpose of this course is to develop students who can:
- understand Large Language Models (LLMs) sufficiently to be able to use them;
- work productively to develop and implement an LLM-based application; and
- attract users to that application, to the extent that they can conduct some evaluation of it and write a paper about it.
To borrow the Y Combinator motto, this course develops students who can build something (based on LLMs) that people want.
Students are expected to develop:
- an understanding of LLMs, and their evolving place in the world;
- a hacker mindset focused on building something;
- the ability to build an application;
- exceptional written and verbal communication skills; and
- the ability to contribute, in some small way, to our understanding of something related to NLP, ideally LLMs.
Pre-requisites
- Comfortable with Python, GitHub, APIs, and data science fundamentals.
Content
Week 1
- Karpathy, Andrej, 2023, “Intro to Large Language Models”, YouTube, 22 November, https://youtu.be/zjkBMFhNj_g.
- Chollet, François, 2021, Deep Learning with Python, Chapters 1-3.
- Tunstall, von Werra and Wolf, 2022, Natural Language Processing with Transformers, Chapters 1 and 2.
Week 2
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “The spelled-out intro to neural networks and backpropagation: building micrograd”, https://youtu.be/VMj-3S1tku0.
- Chollet, François, 2021, Deep Learning with Python, Chapters 4 and 5.
- Tunstall, von Werra and Wolf, 2022, Natural Language Processing with Transformers, Chapter 3.
- Bolukbasi, Tolga, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama and Adam T. Kalai, 2016, ‘Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings’, Advances in Neural Information Processing Systems, 29 (NIPS 2016), http://papers.nips.cc/paper/6228-man-is-to-computer-programmer-as-woman-is-to-homemaker-d.
Week 3
- Alammar, Jay, 2020, ‘How GPT3 Works - Visualizations and Animations’, 27 July, https://jalammar.github.io/how-gpt3-works-visualizations-animations/
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “The spelled-out intro to language modeling: building makemore”.
- Chollet, François, 2021, Deep Learning with Python, Chapters 6 and 7.
- Tunstall, von Werra and Wolf, 2022, Natural Language Processing with Transformers, Chapters 4 and 5.
- Hutchinson, Ben, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, and Stephen Denuyl, 2020, ‘Social Biases in NLP Models as Barriers for Persons with Disabilities’, arXiv, https://arxiv.org/abs/2005.00813.
Week 4
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “Building makemore Part 2: MLP”.
- Chollet, François, 2021, Deep Learning with Python, Chapter 11.
- Tunstall, von Werra and Wolf, 2022, Natural Language Processing with Transformers, Chapters 6 and 7.
- Solaiman, Irene, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, Gretchen Krueger, Jong Wook Kim, Sarah Kreps, Miles McCain, Alex Newhouse, Jason Blazakis, Kris McGuffie, Jasmine Wang, 2019, ‘Release Strategies and the Social Impacts of Language Models’, arXiv, https://arxiv.org/abs/1908.09203.
Week 5
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “Building makemore Part 3: Activations & Gradients, BatchNorm”.
- Chollet, François, 2021, Deep Learning with Python, Chapters 12 and 13.
- Tunstall, von Werra and Wolf, 2022, Natural Language Processing with Transformers, Chapter 8.
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin, 2017, ‘Attention Is All You Need’, arXiv, http://arxiv.org/abs/1706.03762.
- Tatman, Rachel, 2020, ‘What I Won’t Build’, Widening NLP Workshop 2020, Keynote address, 5 July, https://slideslive.com/38929585/what-i-wont-build and http://www.rctatman.com/talks/what-i-wont-build.
Week 6
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “Building makemore Part 4: Becoming a Backprop Ninja”.
- Boykis, Vicki, 2023, What are embeddings?, https://vickiboykis.com/what_are_embeddings/.
- Rush, Alexander, 2018, ‘The Annotated Transformer’, https://nlp.seas.harvard.edu/2018/04/03/attention.html
- Zhao, Jieyu, Tianlu Wang, Mark Yatskar, Vicente Ordonez and Kai-Wei Chang, 2017, ‘Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints’, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2979–2989, https://aclweb.org/anthology/D17-1323.pdf.
Week 7
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “Building makemore Part 5: Building a WaveNet”.
- Alammar, Jay, 2018, ‘The Illustrated Transformer’, http://jalammar.github.io/illustrated-transformer/.
- Manning, Vaswani and Huang, 2019, ‘Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 14 – Transformers and Self-Attention’, https://www.youtube.com/watch?v=5vcj8kSwBCY&list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z&index=14&ab_channel=stanfordonline.
- Uszkoreit, Jakob, 2017, ‘Transformer: A Novel Neural Network Architecture for Language Understanding’, Google AI Blog, 31 August, https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
Week 8
- Karpathy, Andrej, 2022, Neural Networks: Zero to Hero, “Let’s build GPT: from scratch, in code, spelled out”.
- Bender, Emily M. and Koller, Alexander, 2020, ‘Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data’, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Seibel, Michael, “How to build an MVP”, https://www.youtube.com/watch?v=QRZ_l7cVzzU.
Week 9
- Seibel, Michael, “Building product”, https://www.youtube.com/watch?v=C27RVio2rOs.
- Jacob Devlin and Ming-Wei Chang, 2018, ‘Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing’, 2 November, Google AI Blog, https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, 2018, ‘BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding’, arXiv, https://arxiv.org/abs/1810.04805.
Week 10
- Alströmer, Gustaf, “How to get your first customers”, https://www.youtube.com/watch?v=hyYCn_kAngI
- Tom B. Brown et al., 2020, ‘Language Models are Few-Shot Learners’, arXiv, https://arxiv.org/abs/2005.14165.
Week 11
- None.
Week 12
- None.
Assessment
Notebook
- Due date: Weekly.
- Task: Use Quarto to keep a notebook of what you have done the past week and what you want to do the coming week (a minimal notebook skeleton follows this block).
- Weight: 10 per cent.
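As a minimal sketch of what a weekly entry might look like (the filename, date, and headings here are illustrative, not a required template), a Quarto notebook is just a `.qmd` file with YAML front matter and markdown sections:

```markdown
---
title: "Weekly notebook"
author: "Your name"
date: 2025-01-20
format: html
---

## Week 3

### What I did last week

- Read the Week 3 material and worked through the makemore notebook.
- Sketched a first version of my application's data model.

### What I plan to do next week

- Build a minimal prototype of the core LLM call.
```

Rendering with `quarto render notebook.qmd` (or previewing with `quarto preview`) produces the HTML you submit each week.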
Application
- Due date: Week 10.
- Task: Create a web application based on LLMs that has paying users (an illustrative starting-point sketch follows this block).
- Weight: 60 per cent.
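The choice of stack is yours; as one hedged illustration only (the `openai` client, FastAPI framework, endpoint path, and model name below are assumptions, not course requirements), the core of such an application can be a single endpoint that forwards user input to an LLM API:

```python
# Minimal sketch of an LLM-backed web endpoint.
# Assumes the `fastapi`, `pydantic`, and `openai` packages are installed and
# that OPENAI_API_KEY is set in the environment; model and prompt are placeholders.
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI()  # reads OPENAI_API_KEY from the environment


class Query(BaseModel):
    text: str


@app.post("/ask")
def ask(query: Query) -> dict:
    # One chat-completion call per request; swap in your own system prompt and model.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": "You answer questions for this app's users."},
            {"role": "user", "content": query.text},
        ],
    )
    return {"answer": response.choices[0].message.content}
```

During development an app like this can be run with `uvicorn main:app --reload`; a real submission would add authentication, payment handling, and error handling around this core, along with logging you can later draw on for the final paper's evaluation.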
Final Paper
- Due date: Noon on the Thursday two weeks after the end of Week 12.
- Task: Write a paper that involves evaluating some aspect of your application.
- Weight: 30 per cent.