Machine Learning
arXiv 20
Second Order Optimization Made Practical
R. Anil, V. Gupta, T. Koren, K. Regan and Y. Singer
arXiv Preprint
Optimization in machine learning, both theoretical and applied, is presently
dominated by first-order gradient methods such as stochastic gradient descent.
Second-order optimization methods that involve second-order derivatives and/or
second-order statistics of the data have become far less prevalent despite strong
theoretical properties, due to their prohibitive computation, memory and communication
costs.
In an attempt to bridge this gap between theoretical and practical optimization, we
present a proof-of-concept distributed system implementation of a second-order preconditioned
method (specifically, a variant of full-matrix Adagrad) that, along with a few critical
algorithmic and numerical improvements, provides significant practical gains
in convergence on state-of-the-art deep models and gives rise to actual wall-time
improvements in practice compared to conventional first-order methods. Our design
effectively utilizes the prevalent heterogeneous hardware architecture for training
deep models which consists of a multicore CPU coupled with multiple accelerator units.
We demonstrate superior performance on very large learning problems in machine translation
where our distributed implementation runs considerably faster than existing gradient-based
methods.
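For intuition, the abstract's "second-order statistics of the data" refers to preconditioning gradients with accumulated gradient outer products. A minimal NumPy sketch of a plain full-matrix Adagrad step, not the paper's distributed, Kronecker-factored implementation; the step size, epsilon and function names are placeholders:

import numpy as np

def full_matrix_adagrad_step(w, g, G, lr=0.1, eps=1e-4):
    """One full-matrix Adagrad update: accumulate the gradient outer product
    in G, then precondition the current gradient by G^{-1/2}."""
    G = G + np.outer(g, g)                                 # second-order gradient statistics
    vals, vecs = np.linalg.eigh(G + eps * np.eye(len(g)))  # symmetric eigendecomposition
    inv_sqrt_G = vecs @ np.diag(vals ** -0.5) @ vecs.T     # G^{-1/2}
    return w - lr * inv_sqrt_G @ g, G

# toy usage
d = 5
w, G = np.zeros(d), np.zeros((d, d))
w, G = full_matrix_adagrad_step(w, np.random.randn(d), G)

The eigendecomposition is the expensive step (cubic in the parameter dimension), which is exactly the cost the paper's distributed design amortizes across heterogeneous hardware.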
NeurIPS 18
Large Margin Deep Networks for Classification
H. Mobahi, D. Krishnan, G. Elsayed, K. Regan and S. Bengio
Thirty-second Conference on Neural Information Processing Systems
We present a formulation of deep learning that aims at producing a large margin
classifier. The notion of margin, minimum distance to a decision boundary, has
served as the foundation of several theoretically profound and empirically successful
results for both classification and regression tasks. However, most large
margin algorithms are applicable only to shallow models with a preset feature
representation; and conventional margin methods for neural networks only enforce
margin at the output layer. Such methods are therefore not well suited for deep
networks. In this work, we propose a novel loss function to impose a margin on
any chosen set of layers of a deep network (including input and hidden layers).
Our formulation allows choosing any lp norm (p >= 1) on the metric measuring
the margin. We demonstrate that the decision boundary obtained by our loss has
nice properties compared to standard classification loss functions. Specifically, we
show improved empirical results on the MNIST, CIFAR-10 and ImageNet datasets
on multiple tasks: generalization from small training sets, corrupted labels, and
robustness against adversarial perturbations. The resulting loss is general and
complementary to existing data augmentation (such as random/adversarial input
transform) and regularization techniques (weight decay, dropout, and batch norm).
@inproceedings{elsayed2018large,
title={Large margin deep networks for classification},
author={Elsayed, Gamaleldin and Krishnan, Dilip and Mobahi, Hossein and Regan, Kevin and Bengio, Samy},
booktitle={Advances in neural information processing systems},
pages={842--852},
year={2018}
}
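For intuition, the margin enforced here is the distance from a point to the nearest decision boundary under a chosen lp norm, approximated by the score gap divided by the dual-norm of the gradient difference. A minimal sketch for a linear scorer, where that first-order formula is exact; the function and example values are invented for illustration, and the paper's layer-wise loss and training procedure are not reproduced:

import numpy as np

def input_margin(W, b, x, p=2):
    """Distance from x to the nearest decision boundary of a linear scorer
    f(x) = W x + b, measured in the lp norm: (f_i - f_j) / ||W_i - W_j||_q,
    where i is the predicted class and q is the dual norm of p."""
    scores = W @ x + b
    i = int(np.argmax(scores))
    if p == 1:
        q = np.inf
    elif np.isinf(p):
        q = 1.0
    else:
        q = p / (p - 1.0)
    margins = []
    for j in range(len(scores)):
        if j == i:
            continue
        diff_grad = W[i] - W[j]                       # grad_x f_i - grad_x f_j
        margins.append((scores[i] - scores[j]) / np.linalg.norm(diff_grad, ord=q))
    return min(margins)

# toy usage: 3 classes, 2 input features
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
m = input_margin(W, np.zeros(3), np.array([2.0, 0.5]))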
Robust Decision Making
IJCAI 11
Robust Online Optimization of Reward-uncertain MDPs
Kevin Regan and Craig Boutilier
Twenty-Second International Joint Conference on Artificial Intelligence
Imprecise-reward Markov decision processes (IRMDPs) are MDPs in which the
reward function is only partially specified (e.g., by some elicitation process).
Recent work using minimax regret to solve IRMDPs has shown, despite theoretical
intractability, how the set of policies that are nondominated w.r.t. reward
uncertainty can be exploited to accelerate regret computation. However,
the number of nondominated policies is generally so large as to undermine this
leverage. In this paper, we show how the quality of the approximation can be
improved online by pruning/adding nondominated policies during reward
elicitation, while maintaining computational tractability. Drawing insights
from the POMDP literature, we also develop a new anytime algorithm for
constructing the set of nondominated policies with provable (anytime)
error bounds.
@inproceedings{ijcai2011:rb:a,
author = {Kevin Regan and Craig Boutilier},
title = {Robust Online Optimization of Reward-uncertain MDPs},
booktitle = {Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011)},
year = {2011}
}
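The regret machinery above leans on a basic fact: for a fixed policy, the discounted value is linear in the reward vector, so it can be written as a dot product with visitation frequencies. A small NumPy sketch of that fact for a state-based reward and known initial distribution; all names are illustrative:

import numpy as np

def occupancy(P_pi, mu, gamma=0.95):
    """Discounted state-visitation frequencies for a fixed policy:
    f_pi = (I - gamma * P_pi^T)^{-1} mu, where P_pi[s, s'] is the
    policy-induced transition matrix and mu the initial distribution."""
    n = P_pi.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P_pi.T, mu)

def policy_value(P_pi, mu, r, gamma=0.95):
    """Expected discounted return; linear in r, which is what makes regret
    over an imprecise reward set tractable to reason about policy by policy."""
    return occupancy(P_pi, mu, gamma) @ r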
IJCAI 11
Eliciting Additive Reward Functions for Markov Decision Processes
Kevin Regan and Craig Boutilier
Twenty-Second International Joint Conference on Artificial Intelligence
Specifying the reward function of a Markov decision process (MDP) can be
demanding, requiring human assessment of the precise quality of, and tradeoffs
among, various states and actions. However, reward functions often possess
considerable structure which can be leveraged to streamline their
specification. We develop new, decision-theoretically sound heuristics for
eliciting rewards for factored MDPs whose reward functions exhibit additive
independence. Since we can often find good policies without complete reward
specification, we also develop new (exact and approximate) algorithms for
robust optimization of imprecise-reward MDPs with such additive reward. Our
methods are evaluated in two domains: autonomic computing and assistive
technology.
@inproceedings{ijcai2011:rb:b,
author = {Kevin Regan and Craig Boutilier},
title = {Eliciting Additive Reward Functions for Markov Decision Processes},
booktitle = {Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011)},
year = {2011}
}
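A toy illustration of the additive independence assumption exploited above: the reward of a factored state decomposes into a sum of per-attribute subrewards, so elicitation can proceed attribute by attribute. The attributes and numbers below are invented purely for illustration; the elicitation heuristics themselves are not sketched:

def additive_reward(state, subrewards):
    """state: dict attribute -> value; subrewards: dict attribute -> (value -> float).
    Under additive independence, r(s) = sum over attributes k of r_k(s[k])."""
    return sum(subrewards[attr](val) for attr, val in state.items())

# e.g. an assistive-technology style toy state
subrewards = {"task_done": lambda v: 10.0 if v else 0.0,
              "prompts":   lambda v: -0.5 * v}
r = additive_reward({"task_done": True, "prompts": 3}, subrewards)   # 8.5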
AAAI 10
Robust Policy Computation in Reward-uncertain MDPs using Nondominated Policies
Kevin Regan and Craig Boutilier
Twenty-Fourth AAAI Conference on Artificial Intelligence
The precise specification of reward functions for
Markov decision processes (MDPs)
is often extremely difficult, motivating research into both
reward elicitation and the robust solution of MDPs with imprecisely
specified reward (IRMDPs). We develop new techniques for the
robust optimization of IRMDPs, using the minimax regret
decision criterion, that exploit the set of nondominated policies,
i.e., policies that are optimal for some instantiation of the
imprecise reward function. Drawing parallels to POMDP value functions,
we devise a Witness-style algorithm for identifying
nondominated policies. We also examine several new algorithms for
computing minimax regret using the nondominated set, and assess,
both practically and theoretically, the impact of approximating
this set. Our results suggest that a small subset of the nondominated
set can greatly speed up computation, yet yield very tight approximations
to minimax regret.
@inproceedings{aaai2010rb,
author = {Kevin Regan and Craig Boutilier},
title = {Robust Policy Computation in Reward-uncertain MDPs using Nondominated Policies},
booktitle = {Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2010)},
year = {2010}
}
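A minimal sketch of the minimax-regret criterion when the (approximate) nondominated set is a finite collection of policies and reward uncertainty is represented by a finite sample of reward functions. The paper's algorithms work over the full reward polytope and compute exact regret, so treat this only as an illustration of the decision criterion:

import numpy as np

def minimax_regret(values):
    """values[i, k] = value of nondominated policy i under sampled reward k.
    Regret of policy i under reward k is max_j values[j, k] - values[i, k];
    return the policy whose worst-case regret over the samples is smallest."""
    regret = values.max(axis=0, keepdims=True) - values   # pairwise regret
    worst_case = regret.max(axis=1)                       # max over reward samples
    best = int(np.argmin(worst_case))
    return best, worst_case[best]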
UAI 09
Regret-based Reward Elicitation for Markov Decision Processes
Kevin Regan and Craig Boutilier
The 25th Conference on Uncertainty in Artificial Intelligence
The specification of a Markov decision process (MDP) can
be difficult. Reward function specification is especially
problematic; in practice, it is often cognitively complex and
time-consuming for users to precisely specify rewards. This
work casts the problem of specifying rewards as one of preference
elicitation and aims to minimize the degree of precision with
which a reward function must be specified while still allowing
optimal or near-optimal policies to be produced. We first
discuss how robust policies can be
computed for MDPs given only partial reward information using the
minimax regret criterion. We then demonstrate how regret can be
reduced by efficiently eliciting reward information with
bound queries, using regret reduction as the criterion for choosing suitable
queries. Empirical results demonstrate that regret-based reward
elicitation offers an effective way to produce near-optimal policies
without resorting to the precise specification of the entire reward
function.
@inproceedings{uai09rb,
author = {Kevin Regan and Craig Boutilier},
title = {Regret-based Reward Elicitation for Markov Decision Processes},
booktitle = {The 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009)},
year = {2009}
}
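An illustrative sketch of the bound-query loop: each uncertain reward parameter is held as an interval, the (simulated) user is asked whether the parameter exceeds a midpoint, and the interval is halved accordingly. To keep the sketch self-contained, the widest interval stands in for both the query-selection heuristic and the stopping criterion; the paper instead selects queries and stops based on minimax regret computed against the underlying MDP:

import numpy as np

def elicit_by_bound_queries(true_r, bounds, tol=0.05):
    """bounds: list of [lo, hi] intervals, one per reward parameter.
    Repeatedly asks whether r_k exceeds the midpoint of the widest
    interval and halves that interval based on the answer."""
    bounds = np.array(bounds, dtype=float)
    queries = 0
    while (bounds[:, 1] - bounds[:, 0]).max() > tol:
        k = int(np.argmax(bounds[:, 1] - bounds[:, 0]))
        mid = bounds[k].mean()
        if true_r[k] >= mid:          # user's yes/no answer to the bound query
            bounds[k, 0] = mid
        else:
            bounds[k, 1] = mid
        queries += 1
    return bounds, queries

bounds, n = elicit_by_bound_queries(true_r=[0.2, 0.9], bounds=[[0, 1], [0, 1]])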
NIPS 08
Regret-based Reward Elicitation for Markov Decision Processes
Kevin Regan and Craig Boutilier
Workshop on Model Uncertainty and Risk in Reinforcement Learning
held at the Conference on Neural Information Processing Systems
Traditional methods for finding optimal policies in
stochastic, multi-step decision environments require a
precise model of both the environment dynamics and
the rewards associated with taking actions and the
effects of those actions. While dynamics are often
easily learnable through observation—and in many
domains are stable across different users—reward
functions are more problematic. In practice, it is
often cognitively complex and time-consuming for users to
precisely specify rewards. This work casts the problem of
specifying rewards as one of preference elicitation. We
will first discuss how robust policies can be computed
for Markov Decision Processes given partial reward
information using the minimax regret criterion. We will
then show how regret can be reduced by efficiently
eliciting reward information using bound queries.
Regret-based elicitation of reward offers an efficient way to
produce desirable policies without resorting to the
precise specification of the entire reward function, and as
such, opens up a new avenue for the design of stochastic
controllers.
@inproceedings{nips08workshop,
author = {Kevin Regan and Craig Boutilier},
title = {Regret-based Reward Elicitation for Markov Decision Processes},
booktitle = {NIPS-08 Workshop on Model Uncertainty and Risk in Reinforcement Learning},
year = {2008}
}
ICML 06
An Analytic Solution to Discrete Bayesian Reinforcement Learning
Pascal Poupart and Nikos Vlassis and Jesse Hoey and Kevin Regan
International Conference on Machine Learning
Reinforcement learning (RL) was originally proposed as a framework
to allow agents to learn in an online fashion as they interact with
their environment. Existing RL algorithms come short of achieving this
goal because the amount of exploration required is often too costly
and/or too time consuming for online learning. As a result, RL is mostly
used for offline learning in simulated environments. We propose a new
algorithm, called BEETLE, for effective online learning that is
computationally efficient while minimizing the amount of exploration.
We take a Bayesian model-based approach, framing RL as a partially
observable Markov decision process. Our two main contributions are the
analytical derivation that the optimal value function is the upper envelope
of a set of multivariate polynomials, and an efficient point-based value
iteration algorithm that exploits this simple parameterization.
@inproceedings{icml06,
author = {Pascal Poupart and Nikos Vlassis and Jesse Hoey and Kevin Regan},
title = {An Analytic Solution to Discrete Bayesian Reinforcement Learning},
booktitle = {International Conference on Machine Learning (ICML 2006)},
year = {2006}
}
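A minimal sketch of the Bayesian model-based starting point: unknown discrete transition dynamics with independent Dirichlet priors, updated by counting observed transitions. BEETLE's actual contributions, the polynomial form of the optimal value function and the point-based backups over that representation, are not attempted here; class and method names are illustrative:

import numpy as np

class DirichletDynamics:
    """Independent Dirichlet posterior over P(s' | s, a) for a discrete MDP."""
    def __init__(self, n_states, n_actions, prior=1.0):
        self.counts = np.full((n_states, n_actions, n_states), prior)

    def update(self, s, a, s_next):
        self.counts[s, a, s_next] += 1.0        # conjugate (count) posterior update

    def mean(self, s, a):
        """Posterior-mean transition distribution, usable for
        certainty-equivalent planning or for sampling candidate models."""
        return self.counts[s, a] / self.counts[s, a].sum()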
Learning User Concepts
AAAI 10
Simultaneous Elicitation of Preference Features and Utility
Paolo Viappiani, Craig Boutilier and Kevin Regan
Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2010)
ICML 09
Online Feature Elicitation in Interactive Optimization
Craig Boutilier, Kevin Regan and Paolo Viappiani
International Conference on Machine Learning
Most models of utility elicitation in decision support
and interactive optimization assume a predefined set of
"catalog" features over which user
preferences are expressed. However, users may
differ in the features over which they are most
comfortable expressing their preferences. In this
work we consider the problem of feature elicitation:
a user's utility function is expressed using
features whose definitions (in terms of "catalog"
features) are unknown. We cast this as a
problem of concept learning, but one whose goal is
to identify only enough about the concept to
enable a good decision to be recommended. We
describe computational procedures for identifying
optimal alternatives w.r.t. minimax regret in
the presence of concept uncertainty; and describe
several heuristic query strategies that focus on
reduction of relevant concept uncertainty.
@inproceedings{icml09brv,
author = {Craig Boutilier and Kevin Regan and Paolo Viappiani},
title = {Online Feature Elicitation in Interactive Optimization},
booktitle = {International Conference on Machine Learning (ICML 2009)},
year = {2009}
}
RecSys 09
Preference Elicitation with Subjective Features
Craig Boutilier, Kevin Regan and Paolo Viappiani
The 3rd ACM Conference on Recommender Systems
Utility or preference elicitation is a critical component in many recommender
and decision support systems. However, most frameworks for elicitation assume
a predefined set of features (e.g., as derived from catalog descriptions) over
which user preferences are expressed. Just as user preferences vary considerably,
so too can the features over which they are most comfortable expressing these
preferences. In this work, we consider preference elicitation in the presence
of subjective or user-defined features. We treat the problem of learning a
user’s feature definition as one of concept learning, but one whose goal is to
learn only enough about the concept definition to enable a good decision to be
made. This is complicated by the fact that user preferences are unknown. We
describe computational procedures for identifying optimal alternatives w.r.t.
minimax regret in the presence of both utility and concept uncertainty; and
develop several heuristic query strategies that focus on reduction of relevant
concept and utility uncertainty. Computational experiments verify the efficacy
of these strategies.
@inproceedings{recsys09brv,
author = {Craig Boutilier and Kevin Regan and Paolo Viappiani},
title = {Preference Elicitation with Subjective Features},
booktitle = {The 3rd ACM Conference on Recommender Systems (RecSys 2009)},
year = {2009}
}
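A toy sketch of the concept-learning view: hold a set of candidate definitions for the user's subjective feature, discard those inconsistent with the user's labels, and reason (e.g., via minimax regret) against the survivors. The candidate feature and the apartment example below are invented for illustration; the paper's query strategies and regret computations are not reproduced:

def consistent(candidates, labeled_items):
    """candidates: list of predicates item -> bool (possible feature definitions);
    labeled_items: list of (item, bool) user judgments.
    Returns the candidates that agree with every label."""
    return [c for c in candidates
            if all(c(x) == y for x, y in labeled_items)]

# e.g. is a "family-friendly" apartment one with >= 2 bedrooms, or one with a yard?
candidates = [lambda a: a["bedrooms"] >= 2, lambda a: a["yard"]]
survivors = consistent(candidates, [({"bedrooms": 3, "yard": False}, True)])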
Reputation & Trust
AAAI 06
Bayesian Reputation Modeling in E-Marketplaces Sensitive to Subjectivity, Deception and Change
Kevin Regan and Pascal Poupart and Robin Cohen
The Twenty-First National Conference on Artificial Intelligence
We present a model for buying agents in e-marketplaces to interpret
evaluations of sellers provided by other buying agents, known as
advisors. The interpretation of seller evaluations is complicated by
the inherent subjectivity of each advisor, the possibility that
advisors may deliberately provide misleading evaluations to deceive
competitors and the dynamic nature of seller and advisor behaviors
that may naturally change seller evaluations over time. Using a
Bayesian approach, we demonstrate how to cope with subjectivity,
deception and change in a principled way. More specifically, by
modeling seller properties and advisor evaluation functions as dynamic
random variables, buyers can progressively learn a probabilistic model
that naturally and "correctly" calibrates the interpretation of seller
evaluations without having to resort to heuristics to explicitly
detect and filter/discount unreliable seller evaluations. Our model,
called BLADE, is shown empirically to achieve lower mean error in the
estimation of seller properties when compared to other models for
reasoning about advisor ratings of sellers in electronic marketplaces.
@inproceedings{aaai06,
author = {Kevin Regan and Pascal Poupart and Robin Cohen},
title = {Bayesian Reputation Modeling in E-Marketplaces Sensitive to Subjectivity, Deception and Change},
booktitle = {The Twenty-First National Conference on Artificial Intelligence (AAAI 2006)},
year = {2006}
}
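A stripped-down illustration of the Bayesian ingredient: a Beta posterior over a binary seller property, updated from observed outcomes. BLADE additionally models each advisor's (possibly subjective or deceptive) evaluation function as a dynamic random variable so that reports can be reinterpreted rather than filtered; only the basic conjugate update is sketched here, with names invented for illustration:

class BetaReputation:
    """Beta(alpha, beta) posterior over the probability that a seller delivers."""
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha, self.beta = alpha, beta

    def observe(self, delivered):
        if delivered:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        return self.alpha / (self.alpha + self.beta)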
PST 05
The Advisor-POMDP: A Principled Approach to Trust through Reputation in Electronic Markets
Kevin Regan and Robin Cohen and Pascal Poupart
Conference on Privacy Security & Trust
This paper examines approaches to representing uncertainty in reputation systems for electronic markets
with the aim of constructing a decision theoretic framework for collecting information about selling
agents and making purchase decisions in the context of a social reputation system. A selection of
approaches to representing reputation using Dempster-Shafer theory and Bayesian probability are
surveyed and a model for collecting and using reputation is developed using a Partially Observable
Markov Decision Process.
@inproceedings{pst05,
author = {Kevin Regan and Robin Cohen and Pascal Poupart},
title = {The Advisor-POMDP: A Principled Approach to Trust through Reputation in Electronic Markets},
booktitle = {Conference on Privacy, Security and Trust (PST 2005)},
year = {2005}
}
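For reference, the core operation in any discrete POMDP-based model of this kind is the Bayes belief update after taking an action and receiving an observation (here, e.g., a reputation report). The specific state, observation and reward structure of the Advisor-POMDP is not reproduced; the sketch below is only the generic filter:

import numpy as np

def belief_update(b, a, o, T, Z):
    """b: belief over hidden states; T[a][s, s'] transition probabilities;
    Z[a][s', o] observation probabilities.  Standard Bayes filter:
    b'(s') is proportional to Z[a][s', o] * sum_s T[a][s, s'] * b(s)."""
    b_next = Z[a][:, o] * (b @ T[a])
    return b_next / b_next.sum()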
JBT 05
Designing Adaptive Buying Agents in Electronic Markets
Using Advice from Other Buyers to Model Seller Reputation
Kevin Regan and Robin Cohen
The Journal of Business and Technology
In this paper, we present a model for designing buying agents in electronic marketplaces
that adapt by adjusting decisions about which sellers to select for business, based on
reputation ratings provided by other buying agents in the marketplace (known as indirect
reputation information). The focus of our research is a method for effectively representing
and employing this indirect reputation information. In particular, we address the case of
buying agents providing deceptive information to other buyers, by having each buyer model
not only the reputation of all sellers in the marketplace but also the reputation of each
buyer. We also systematically account for differing standards between buyers, in assessing
the reputation of sellers. Our approach is to avoid disreputable buyers when seeking advice
and then to pass through three phases: adjusting for subjective bias, estimating seller
reputability from reputable buyers only and eliminating any wild predictions. Overall the
model presented here builds on a strong foundation of how best to model seller reputation
but allows for a suitably cautious integration of a social aspect to the reputation
modeling, towards improved purchasing decisions for buyers.
@article{jbt05,
author = {Kevin Regan and Robin Cohen},
title = {Designing Adaptive Buying Agents in Electronic Markets Using Advice from Other Buyers to Model Seller Reputation},
journal = {The Journal of Business and Technology},
volume = {1},
year = {2005}
}
DASUM 05
Sharing models of sellers amongst buying agents in electronic marketplaces
Kevin Regan and Robin Cohen and Thomas Tran
Decentralized Agent Based and Social Approaches to User Modelling workshop
In this paper, we demonstrate the value of decentralized models of sellers in electronic marketplaces, as the basis for
purchasing decisions from buyers. We discuss how buying agents can model the reputation of sellers in order to make
effective purchases and how these agents can also take advantage of reputation ratings provided by other buying agents
in the marketplace, once it is established that each buyer will be independently modelling the sellers. We outline the
methods required to make use of reputation ratings of sellers provided by other buyers, including adjustments for
possibly different subjective scales and for possible deception in reporting the reputation ratings. In particular, we
propose that buying agents model the reputation not only of sellers but of advisors, as part of the overall processing.
We describe as well how buyers can most effectively model the set of possible buying advisors in order to apply
selective processing of the information provided by these agents, in their purchasing decisions.
@inproceedings{dasum05,
author = {Kevin Regan and Robin Cohen and Thomas Tran},
title = {Sharing models of sellers amongst buying agents in electronic marketplaces},
booktitle = {Decentralized Agent Based and Social Approaches to User Modelling Workshop (DASUM 2005)},
year = {2005}
}
BASEWEB 05
Indirect Reputation Assessment for Adaptive Buying Agents in Electronic Markets
Kevin Regan and Robin Cohen
Business Agents and the Semantic Web workshop
In this paper, we present a model for designing buying agents in electronic marketplaces that adapt by adjusting
decisions about which sellers to select for business, based on reputation ratings provided by other buying agents in the
marketplace (known as indirect reputation information). The focus of our research is a method for effectively
representing and employing this indirect reputation information. In particular, we address the case of buying agents
providing deceptive information to other buyers, by having each buyer model not only the reputation of all sellers in
the marketplace but also the reputation of each buyer. We also systematically account for differing standards between
buyers, in assessing the reputation of sellers. Overall, the model presented here builds on a strong foundation of how
best to model seller reputation but allows for a suitably cautious integration of a social aspect to the reputation
modelling, towards improved purchasing decisions for buyers.
@inproceedings{baseweb05,
author = {Kevin Regan and Robin Cohen},
title = {Indirect Reputation Assessment for Adaptive Buying Agents in Electronic Markets},
booktitle = {Business Agents and the Semantic Web Workshop (BASeWEB 2005)},
year = {2005}
}