Understanding UMAP

I am a mathematician who works mostly with the tools of category theory and algebraic topology to study geometry, so when I open up a paper and see adjunctions and simplicial sets and geometric realisation and Riemannian metrics, I feel pretty happy. One of the infinitely many things I am not is a data scientist, so when I am asked about dimension-reduction methods, I cannot give a meaningful or confident answer. This blog post arises from an odd middle ground, where I hope I can say something useful by being on this side of the fence.

This is a companion discussion topic for the original entry at https://topos.site/blog/2024-04-05-understanding-umap/

Really great post, Tim! It pays to think slowly and carefully about what every little choice in a procedure says about our implicit assumptions, and what effect that has on the end data.

Math is hard, so people often get lost in the sauce just trying to understand the procedure at all. It’s especially important for us trained mathematicians to take some time to provide guides that demystify our strange formal brews, and signpost the places where assumptions get smuggled in. Thanks for the wonderful post!