Notes on ICML conference - emphasis on RL
- Lots of great work on off-policy evaluation and off-policy learning (see, for instance, work
by Hanna et al., Le et al., Fujimoto et al., Gottesman et al., and talks in
Section 6.3). These problem settings are really important, as I (and many others) anticipate
RL applications will come with large amounts of data collected from sub-optimal policies.
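To make the problem setting concrete, here is a minimal sketch of one standard off-policy evaluation estimator, ordinary importance sampling: it re-weights returns observed under a behavior policy to estimate the value of a different target policy. The function name and array shapes are my own illustration, not from any of the papers above.

```python
import numpy as np

def importance_sampling_ope(behavior_probs, target_probs, returns):
    """Ordinary importance-sampling estimate of a target policy's value
    from trajectories collected under a behavior policy.

    behavior_probs, target_probs: arrays of shape (n_trajectories, horizon)
        giving each policy's probability of the action actually taken at
        each step of each trajectory.
    returns: array of shape (n_trajectories,) with each trajectory's return.
    """
    # Per-trajectory importance weight: product of per-step probability ratios.
    weights = np.prod(target_probs / behavior_probs, axis=1)
    # Weighted average of observed returns estimates the target policy's value.
    return float(np.mean(weights * returns))

# Sanity check: when target and behavior policies agree, the estimate
# reduces to the plain mean of the observed returns.
behavior = np.full((4, 3), 0.5)
target = np.full((4, 3), 0.5)
rets = np.array([1.0, 0.0, 1.0, 1.0])
print(importance_sampling_ope(behavior, target, rets))
```

In practice the product of ratios makes this estimator high-variance for long horizons, which is part of why the area is active: much of the work above studies lower-variance or more data-efficient alternatives.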
- Exploration was a hot topic again, and rightfully so (see work by Mavrin et al., Fatemi
et al., Hazan et al., Shani et al.). Along with off-policy evaluation (and a few
others), it's one of the foundational problems in RL where we're in a good position to make
serious progress at the moment.
- Some really nice work continuing to clarify distributional RL (see work by [74, 57, 67]).
- The AI for climate change workshop on Friday was fantastic and extremely well attended
(standing room only for the talks I was there for). I've said this after previous conferences,
but it bears repeating: these are profoundly important problems, and the tools of ML can be
extremely effective on them in their current form.
- I really think we need to standardize evaluation in RL. Not that we need only a single method
for doing so, or a single domain, but at the moment there is far too much variance in how
evaluation is carried out.
- Loved the panel at the RL for real life workshop (see Section 6.2.1).