RLHF WiiGF 2024: RLHF (What Is It Good For?)
Malmö University, Malmö, Sweden, June 11, 2024
Conference website: https://rlhf-huh-wiigf.github.io/
Submission link: https://easychair.org/conferences/?conf=rlhfwgf2024
Reinforcement Learning with Human Feedback (RLHF) is used to fine-tune Large Language Models (LLMs) to 'align to human values and preferences' and to improve the 'harmlessness, helpfulness, and honesty' of such models. While it has led to technical achievements in model performance that deserve acknowledgement, it is still early days for these models and for RLHF; we lack a proper understanding of the importance of feedback more broadly in improving language technology.
The growing body of work criticising RLHF (see e.g. Casper et al., 2023; Hosking et al., 2023; Wei et al., 2023) suggests that many of the non-technical issues, such as harmlessness, cannot be solved with this type of feedback, especially not at a global scale. Widening the scope and inviting more critical perspectives from across several disciplines makes the oversimplification of what the technique actually produces more evident. Studying the method in an interdisciplinary fashion may allow us as researchers to course-correct, and to consider where RL from Feedback (RLF) can be applied such that it becomes truly useful.
The goal of this workshop is to bring together researchers, industry practitioners, and policy makers to examine the ethical, legal, and societal aspects of how RLHF is developed and deployed. We aim to open a broader debate, both critical and more imaginative, on what forms of feedback we need to safeguard the development and use of LLMs, and what open requirements we can expect to see from advances in RLHF.
Among the questions this workshop aims to shed light on:
- What is the scope of RLHF in serving as a control mechanism for reducing or eliminating negative impacts of LLMs?
- What considerations for “data-production dispositif” are acknowledged and accounted for? What can we learn from other examples of crowdsourcing?
- What issues are there with RLHF in practice (e.g. how do instructions to crowdsource workers influence their decisions, what biases arise when crowdsourcing safety criteria)?
- What lessons can we learn from open source to inform the development of techniques to ensure LLM system safety?
- What are the ethical considerations of RLHF, or similar methods, in terms of, e.g., human autonomy, dignity, and control?
- Given the inscrutable size of LLMs, to what extent is RLHF just adding to an already overly complex technology? What are the limits of complexity with regards to safeguarding crucial values in the integration of RLHF+LLM in public contexts?
- What other forms of feedback are currently missing to address relevant value issues?
Submission Guidelines
The impact and significance of LLMs spans a wide range of disciplines, and we are looking for submissions from researchers and practitioners in, among other fields, computing science, cognitive science, philosophy, social science, law and policy, and human-computer interaction. Since the development of these methods is driven in large part by industry, we also want to include the experience of industry experts.
We encourage submissions of two forms:
- extended abstracts (2 pages)
- previously published work
The following is a non-exhaustive list of the contributions we are seeking:
- Positions detailing 'What is it good for?'
- Socio-technical perspective on the harms or benefits of RLHF
- Positions on RLHF from human-computer interaction, philosophy, cognitive science, psychology
- Case studies of applications of RLHF
- Experience working with RLHF in industry
- Visionary approaches to incorporating human feedback in language technology design (emphasising e.g. groundbreaking theories, case studies, or empirical discoveries)
Important Dates
- Submission Deadline: April 17, 2024 (AoE)
- Notification of Acceptance: May 1, 2024 (AoE)
- Workshop date and time: June 11, 2024, 9:00-17:00, at HHAI 2024 (Malmö, Sweden)
Organising Committee
- Adam Dahlgren Lindström
- Lea Krause
- Petter Ericson
- Roel Dobbe
- Dimitri Coelho Mollo
- Íñigo Martínez de Rituerto de Troya
- Leila Methnani
Venue
The workshop will be held in Malmö, Sweden on June 11, 2024, 9:00-17:00, co-located with the third International Conference on Hybrid Human-Artificial Intelligence (HHAI 2024).
Contact
All questions about submissions should be emailed to leila.methnani@umu.se or dali@cs.umu.se.