I am a statistician and applied mathematician with broad interests in
statistical methodology and computing in the physical sciences. My primary
interest is modeling complicated dependence structure in real-life
temporal/spatial/spatio-temporal processes, and I'm particularly interested in
theoretical questions that are motivated by computationally scalable methods and
approximations.
I am currently a PhD student in the Department of Statistics at Rutgers
University working with Michael Stein. Since August 2016, I have worked at
Argonne National Laboratory under the supervision of Mihai Anitescu.
Here is a list of some research interests that I have worked on with some level
of specificity and purpose, although not all of those projects have led to
publishable work.
-
Statistical computing and algorithms
-
Spatio-temporal processes, especially spectral domain representations of
multidimensional and nonstationary processes
-
Nonstationarity, nonlinearity, and non-Gaussianity in time-dependent
processes
-
Applications in the physical sciences
-
More abstract stochastic processes, like combinatorial stochastic processes
-
Matrix computations, especially hierarchical matrices and other methods
for scalable dense linear algebra
-
Applied functional and harmonic analysis
-
Irregular/nonuniform Fourier-type methods
-
Inverse problems and data assimilation
Publications:
-
C.J. Geoga, M.L. Stein.
A Scalable Method to Exploit
Screening in Gaussian Process Models with Noise (ArXiv v1). Submitted.
-
P.G. Beckman, C.J. Geoga, M.L. Stein, and M. Anitescu.
Scalable Computations for
Nonstationary Gaussian Processes (ArXiv v1). Submitted.
-
C.J. Geoga, O. Marin, M. Schanen, and M.L. Stein.
Fitting Matern smoothness parameters using automatic differentiation
(ArXiv v3). To appear, Statistics and Computing.
-
C.J. Geoga, M. Anitescu, and M.L. Stein.
Flexible nonstationary spatio-temporal modeling of high-frequency
monitoring data
(ArXiv v1). Environmetrics. DOI:https://doi.org/10.1002/env.2670
-
C.J. Geoga, M. Anitescu, and M.L. Stein.
Scalable Gaussian process computations using hierarchical matrices.
Journal of Computational and Graphical Statistics.
DOI:https://doi.org/10.1080/10618600.2019.1652616
-
C.J. Geoga, C.L. Haley, A. Siegel, and M. Anitescu.
Frequency-wavenumber spectral analysis of spatiotemporal flows.
Journal of Fluid Mechanics.
DOI:https://doi.org/10.1017/jfm.2018.366
Software:
I'd like to briefly point out an issue that I have experienced a few times. I
think it's important for scientific practitioners to clean up and package code
to the point where other people who are interested in the work have a fighting
chance of actually running the code themselves. I do my best to meet that
expectation. On the other hand, I am not a software engineer, and so my
code is not always perfect in terms of steering users away from footguns or
testing all edge cases. So if you try any of this code, which I sincerely hope
you do, and experience something working way differently (often much worse) than
a paper or figure or example file suggests, please reach out to me before
writing off the code or method completely. There is a non-negligible chance that
I just haven't enforced some aspect of proper usage that may be obvious to a
practitioner in that field---such as sensibly enumerating the observations when
assembling a hierarchical matrix approximation---but that may not occur to
somebody who is just trying out the code or who isn't particularly familiar with
the field. While I obviously do my best to try and enforce proper usage through
the code design itself, it is challenging and I make mistakes. So please do
reach out if you experience surprising things when trying any of these software
packages. There might be an easy remedy, and you might save future users similar
frustrations.
As a second small rant, if you use any of these packages in your own research in
a way that substantively helped your project, *please* do cite them
appropriately. Almost all of these packages are companions to a paper; for the
rest, the README of the package will have citation instructions. As an example, please
cite the "Fitting Matern..." paper if you use BesselK.jl to fit a full
three-parameter Matern model. It may seem like sort of a trivial software
dependency, but in this hypothetical you're really using that package to do
something that for decades hasn't really been reliably possible. So please do
acknowledge the work that did make your results possible.
I release all software under the GPLv2 license unless otherwise specified.
-
SMultitaper.jl,
a software package that provides a simple but efficient implementation of
an adaptively weighted multitaper spectral density estimator. There is a
lot of workspace and result caching, so even without careful
pre-allocations repeated calls will be fast. You'll see that this package
doesn't have many commits in the last few years, but that's because it's
done and it has basically no dependencies that need to be bumped. The code
still works well and I do still use it regularly. If you want to compute
hundreds of adaptively weighted multitaper estimates for data of the same
(padded) length, this package is probably a plenty good choice.
-
GPMaxlik.jl,
a simple software package that provides a reasonably tuned function for
evaluating the exact log-likelihood for Gaussian processes and stochastic
estimators for its first and second derivatives. It also now provides
exact methods for the log-likelihood with custom methods for the gradient
and Hessian, so that you can call `ForwardDiff.hessian` on it and it will
actually compute the Hessian using the analytical formula. That way it
uses LAPACK and only operates on arrays of doubles, but it still uses
ForwardDiff for the derivatives of the kernel function.
-
HalfSpectral.jl,
a sort of software companion to "Flexible nonstationary..." above. This
small package provides a reasonably tuned method for conveniently working
with half-spectral covariance functions for regular data completely in the
time domain. The example files demonstrate how straightforward it is to
build quite complicated models with very little code and then use
automatic differentiation for the derivatives. The package was recently
rewritten for complete AD compatibility, and the example files also
demonstrate how to hook it into `Vecchia.jl` efficiently so that you can
fit your fancy half-spectral covariance function scalably.
-
Vecchia.jl,
(MIT license) A very simple implementation of Vecchia's likelihood
approximation applied to mean-zero Gaussian processes. The code is
specifically optimized to have low allocations and perform well with
multiple threads (in particular, by using aggressive pre-allocation to
squeeze out all allocations from the section that spawns tasks to compute
the small likelihood terms). Gradients and true Hessians are available
using automatic differentiation (ForwardDiff and ReverseDiff), and are
very fast. The package also offers the "reverse" Cholesky factor (a la
Katzfuss and Guinness 2021), and also gives a convenient software
implementation of the EM-based method for estimating parameters when you
have measurement noise that was described in "A Scalable Method to Exploit
Screening...". See the README for a few more tricks that it offers.
-
BesselK.jl,
(MIT license) An ostensibly simple package that implements the modified
second-kind Bessel function $\mathcal{K}_\nu(x)$ natively in Julia for the
purpose of using automatic differentiation to compute derivatives with
respect to $\nu$. That sounds very simple, but it really isn't, due to a
variety of subtle but legitimately problematic issues. See the paper
"Fitting Matern..." for details. The real point is to fit three-parameter
Matern covariance functions. The example file gives a simple Matern
covariance function that you can just drop in to your code (and your AD
stack, assuming you use ForwardDiff.jl) and then just forget about. But
please do cite the paper if you do that...
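For readers who have not met them before, the two models behind the last two packages can be written down compactly. This is only a reference sketch: the symbols $\sigma$, $\rho$, $r$, and the conditioning sets $c(j)$ are notation assumed here for illustration, and the papers and READMEs may use a different parameterization. One common form of the three-parameter Matern covariance at distance $r > 0$ is

$$
M(r) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu}\, r}{\rho} \right)^{\nu} \mathcal{K}_\nu\!\left( \frac{\sqrt{2\nu}\, r}{\rho} \right),
$$

with $M(0) = \sigma^2$. Fitting the smoothness $\nu$ with gradient-based optimization therefore requires $\partial \mathcal{K}_\nu(x) / \partial \nu$. Vecchia's approximation replaces the joint log-likelihood of observations $y_1, \dots, y_n$ with a sum of small conditional terms,

$$
\log p(y_1, \dots, y_n) \approx \log p(y_1) + \sum_{j=2}^{n} \log p\left( y_j \mid y_{c(j)} \right), \qquad c(j) \subseteq \{1, \dots, j-1\},
$$

where each $c(j)$ is a small conditioning set. Each term involves only a small covariance matrix, which is why the terms can be computed independently and in parallel.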
Gratitude:
A short and incomplete list.
-
Mihai Anitescu
was the best boss I'll ever have. I took two of his classes as an
undergraduate that I wasn't entirely ready for and he was patient and
encouraging, and he took a huge chance on me when he hired me at ANL. He
has been a huge influence on my academic interests and on how I approach
problems in general. If he hadn't given me such a huge opportunity and
been so supportive throughout, I probably would not have become a
scientist at all. I really can't say enough about what he's done for me
and how grateful I am.
-
Rong Chen
has been incredibly generous with his time and thoughts. I have done a lot
of exploring in the Rutgers PhD program and Professor Chen has
consistently been patient and supportive beyond what anybody could
reasonably expect.
-
Charlotte Haley
taught me basically everything I know about spectral domain
representations of stochastic processes, and was incredibly supportive and
generous with her time when I started at ANL.
-
Wenjia Jing
's introductory real analysis class was my first serious exposure to the
subject, and the two quarters that I had with him really cemented my love
for analysis.
-
Paytsar Muradyan
was _incredibly_ helpful as I tried to work with the Doppler LIDAR data for
the "Flexible nonstationary..." paper. Taking that data and the relevant
physical properties of the process seriously would have been effectively
impossible without her help. It's hard to overstate how helpful and
informative my conversations with her were.
-
Michael Stein
's introductory course (STAT 244) made me change my undergraduate plans
and try to take statistics seriously. Despite me being pretty annoying,
he's also been consistently patient, helpful, and supportive for over five
years now. He has been incredibly influential on my interests and
approach.
-
Amie Wilkinson
taught my favorite course that I took as an undergraduate---complex
analysis---and has been incredibly generous with her time as I've asked
her to write way too many recommendation letters for me.
-
Lydia Zoells
is my partner and my best friend. She painted the portrait displayed above
and copyedits all of my boring papers. I couldn't imagine life without
her.
-
The entire spatial statistics community has been incredibly warm and welcoming.
Contact:
I do not use any social media or internet networking websites. If you see any
account with my name on it, please assume that it is not really me. Please
contact me by email at one of these addresses:
-
cgeoga $at anl $dot gov
-
christopher.geoga $at $dot rutgers $dot edu
I also use XMPP, so if you'd prefer to communicate by instant messaging please
first reach out by email and I will share that contact information with you.