Mechanistic Exploration of the Architectural Impact of DPO Fine-Tuning on Ethical Alignment in LLMs

Title Mechanistic Exploration of the Architectural Impact of DPO Fine-Tuning on Ethical Alignment in LLMs
Published in In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2025. Lecture Notes in Computer Science, Springer, Cham
Volume 15820
Authors Fabian Maag, Betiel Woldai, Prof. Dr. Sigurd Schacht
Publication date 30 May 2025
Citation Maag, Fabian; Woldai, Betiel; Schacht, Sigurd (2025): Mechanistic Exploration of the Architectural Impact of DPO Fine-Tuning on Ethical Alignment in LLMs. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2025. Lecture Notes in Computer Science, vol. 15820. Springer, Cham. DOI: 10.1007/978-3-031-93415-5_3