BEGIN:VCALENDAR
PRODID:-//AddEvent Inc//AddEvent.com v1.7//EN
VERSION:2.0
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:STANDARD
DTSTART:20261101T010000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20260308T030000
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
DESCRIPTION:About: I am a PhD student at the Center for Data Science at NYU advised by Professor Andrew Gordon Wilson and a Visiting Researcher in the Fundamental AI Research (FAIR) group at Meta AI\, where I work with Brandon Amos. I work on the foundations of deep learning. My goal is to understand and quantify generalization in deep learning\, and use this understanding to build more robust and reliable machine learning models.\n\nSession Description: Modern language models can contain billions of parameters\, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs)\, indicating that language models are capable of discovering regularities that generalize to unseen data. In particular\, we derive a compression bound that is valid for the unbounded log-likelihood loss using prediction smoothing\, and we extend the bound to handle subsampling\, accelerating bound computation on massive datasets. To achieve the extreme level of compression required for non-vacuous generalization bounds\, we devise SubLoRA\, a low-dimensional non-linear parameterization. Using this approach\, we find that larger models have better generalization bounds and are more compressible than smaller models.
X-ALT-DESC;FMTTYPE=text/html:About: I am a PhD student at the Center for Data Science at NYU advised by Professor Andrew Gordon Wilson and a Visiting Researcher in the Fundamental AI Research (FAIR) group at Meta AI, where I work with Brandon Amos. I work on the foundations of deep learning. My goal is to understand and quantify generalization in deep learning, and use this understanding to build more robust and reliable machine learning models.<br><br>Session Description: Modern language models can contain billions of parameters, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs), indicating that language models are capable of discovering regularities that generalize to unseen data. In particular, we derive a compression bound that is valid for the unbounded log-likelihood loss using prediction smoothing, and we extend the bound to handle subsampling, accelerating bound computation on massive datasets. To achieve the extreme level of compression required for non-vacuous generalization bounds, we devise SubLoRA, a low-dimensional non-linear parameterization. Using this approach, we find that larger models have better generalization bounds and are more compressible than smaller models.
UID:f6f2f603e5e7490cb4e14b785d078ea7addeventcom
SUMMARY:[C4AI] Sanae Lotfi - Non-Vacuous Generalization Bounds for Large Language Models (Geo Asia)
DTSTART;TZID=America/Los_Angeles:20240320T090000
DTEND;TZID=America/Los_Angeles:20240320T100000
DTSTAMP:20260620T022002Z
TRANSP:OPAQUE
STATUS:CONFIRMED
SEQUENCE:0
LOCATION:https://meet.google.com/yhv-tiir-ava
X-MICROSOFT-CDO-BUSYSTATUS:BUSY
BEGIN:VALARM
TRIGGER:-PT30M
ACTION:DISPLAY
DESCRIPTION:Reminder
END:VALARM
END:VEVENT
END:VCALENDAR