|
Note that I am
not at Yale any more, but
I am continuing with
similar work.
To disseminate papers, information, etc., this old page will be maintained.
|
Michael Mahoney
Assistant Professor,
Department of Mathematics
Research Affiliate,
Department of Computer Science
Yale University
|
By popular demand
... we are running the
MMDS Workshop on
"Algorithms for Modern Massive Data Sets"
again!
As with the 2006 meeting,
MMDS 2008
will address algorithmic, statistical, and mathematical
challenges in large-scale statistical data analysis.
Click
here
for the 2006 Stanford/Yahoo "Workshop on Algorithms for Modern Massive
Data Sets (MMDS 2006)" that I organized (with
G. Golub,
P. Drineas, and
L-H. Lim)
and that took place on June 21-24, 2006,
or click
here
for an article in SIAM News about the meeting.
Click
here
for a ppt of the SIAM-SDM06 2006 tutorial,
or click
here
for the overheads for several other talks.
Research interests
-
Design and analysis of algorithms for very large matrix, graph, and
regression problems.
-
Applications to the statistical data analysis of massive scientific and
Internet data.
-
Monte Carlo methods in
computer science,
statistical physics,
and scientific computation.
I work on developing and analyzing algorithms for large matrix, graph, and
regression problems, as well as applications of these tools to the statistical
analysis of extremely large scientific and Internet data sets.
Current work includes applying these and related algorithmic techniques to
problems arising in Internet applications, and past work includes the
application of these techniques to DNA single nucleotide polymorphism data.
I have also worked on developing and analyzing Monte Carlo algorithms for
performing useful computations on extremely large
matrices,
e.g., the additive-error and relative-error CUR matrix decompositions.
Past research has also included work in computational statistical mechanics on
the development and analysis of the
TIP5P
model of liquid water.
Current and planned future work includes developing novel algorithmic tools and
applying them to novel statistical data analysis problems.
Publications
-
Unsupervised Feature Selection for Principal Components Analysis
-
C. Boutsidis, M. W. Mahoney, and P. Drineas
-
Proc. 14-th Annual SIGKDD, 61-69 (2008)
-
Statistical properties of community structure in large social and information networks
-
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
-
Proc. 17-th International WWW, 695-704 (2008)
-
Journal version submitted for publication.
-
Learning with Spectral Kernels and Heavy-Tailed Data
-
M. W. Mahoney and H. Narayanan,
-
Submitted for publication.
-
Faster Least Squares Approximation
-
P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarlos,
-
Technical Report, Preprint: arXiv:0710.1435 (2007)
(arXiv),
-
Journal version submitted for publication.
-
Relative-Error CUR Matrix Decompositions
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Technical Report, Preprint: arXiv:0708.3696 (2007)
(arXiv),
-
Sampling Algorithms and Coresets for Lp Regression
-
A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. W. Mahoney,
-
Technical Report, Preprint: arXiv:0707.1714 (2007)
(arXiv),
-
Proc. 19-th Annual SODA, 932-941 (2008)
(ps,
pdf).
-
Journal version submitted for publication.
-
Feature Selection Methods for Text Classification
-
A. Dasgupta, P. Drineas, B. Harb, V. Josifovski, and M. W. Mahoney,
-
Proc. 13-th Annual SIGKDD, 230-239 (2007)
(ps,
pdf).
-
PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations
-
P. Paschou, E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron, M. W. Mahoney, and P. Drineas,
-
PLOS Genetics, 3, 1672-1686 (2007)
(ps,
pdf).
-
Intra- and interpopulation genotype reconstruction from tagging SNPs,
-
P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas,
-
Genome Research, 17(1), 96-107 (2007)
(ps,
pdf).
-
Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications
-
G. H. Golub, M. W. Mahoney, P. Drineas, and L.-H. Lim,
-
SIAM News 39:8 October 2006
(ps,
pdf).
-
Randomized Algorithms for Matrices and Massive Data Sets,
-
P. Drineas and M. W. Mahoney,
-
Proc. 32-nd Annual VLDB, 1269 (2006)
(ps,
pdf).
-
Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Proc. 14-th Annual ESA, 304-314 (2006)
(ps,
pdf).
-
Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Proc. 10-th Annual RANDOM, 316-326 (2006)
(ps,
pdf).
-
Tensor-CUR Decompositions For Tensor-Based Data,
-
M. W. Mahoney, M. Maggioni, and P. Drineas,
-
Proc. 12-th Annual SIGKDD, 327-336 (2006)
(ps,
pdf),
-
Polynomial Time Algorithm for Column-Row-Based Relative-Error Low-Rank Matrix Approximation,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Technical Report, DIMACS TR 2006-04 March 2006
(ps,
pdf).
-
Sampling Algorithms for L2 Regression and Applications,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Proc. 17-th Annual SODA, 1127-1136 (2006)
(ps,
pdf).
-
A Randomized Algorithm for a Tensor-Based Generalization of the Singular Value Decomposition,
-
P. Drineas and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1327, June 2005
(ps,
pdf),
-
Linear Algebra and its Applications, 420, 553-571 (2007)
(ps,
pdf).
-
On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,
-
P. Drineas and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1319, April 2005
(ps,
pdf),
-
Proc. 18-th Annual COLT, 323-337 (2005)
(ps,
pdf),
-
J. Machine Learning Research, 6, 2153-2175 (2005).
(ps,
pdf).
-
Sampling Sub-problems of Heterogeneous Max-Cut Problems and Approximation Algorithms,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1283, April 2004
(ps,
pdf),
-
Proc. 22-nd Annual STACS, 57-68 (2005)
(ps,
pdf),
-
Accepted for publication, Random Structures and Algorithms.
-
Fast Monte Carlo Algorithms for Matrices III: Computing an Efficient Approximate Decomposition of a Matrix,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1271, February 2004
(ps,
pdf),
-
SIAM J. Computing, 36, 184-206 (2006)
(ps,
pdf).
-
Fast Monte Carlo Algorithms for Matrices II: Computing Low-Rank Approximations to a Matrix,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1270, February 2004
(ps,
pdf),
-
SIAM J. Computing, 36, 158-183 (2006)
(ps,
pdf).
-
Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1269, February 2004
(ps,
pdf),
-
SIAM J. Computing, 36, 132-157 (2006)
(ps,
pdf).
-
Rapid Mixing of Several Markov Chains for a Hard-Core Model,
-
R. Kannan, M. W. Mahoney, and R. Montenegro,
-
Proc. 14-th Annual ISAAC, 663-675 (2003)
(ps,
pdf).
-
Quantum, intramolecular flexibility, and polarizability effects on the reproduction of the density anomaly of liquid water by simple potential functions,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 115, 10758-10768 (2001)
(ps,
pdf).
-
Rapid estimation of electronic degrees of freedom in Monte Carlo calculations for polarizable models of liquid water,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 114, 9337-9349 (2001)
(ps,
pdf).
-
Diffusion constant of the TIP5P model of liquid water,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 114, 363-366 (2001)
(ps,
pdf).
-
A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 112, 8910-8922 (2000)
(ps,
pdf).
-
Repression and activation of promoter-bound RNA polymerase activity by Gal repressor,
-
H. E. Choy, R. R. Hanger, T. Aki, M. Mahoney, K. Murakami, A. Ishihama, and S. Adhya,
-
J. Mol. Biol. 272: 293-300, 1997
(ps,
pdf).
-
Discrete representations of the protein C-alpha chain,
-
X. F. de la Cruz, M. W. Mahoney, and B. K. Lee,
-
Fold. & Des. 2: 223-234, 1997
(ps,
pdf).
|