Friday, March 18, 2016
TDA - Topological Spaces
Setup
In our last post we saw the following 4 basic steps of using TDA- Start with a Data Set
- Make a topological space out of the data
- Compute the persistent homology of the space
- Investigate the results
Intuition
To get some intuition you can think of a topological space as some sort of shape made out of Playdoh or silly putty. If you push or pull the silly putty around without tearing it then you can consider this the same "shape". We can't tear the Playdoh because we want all of our pushing and pulling to be continuous, which means points that are close stay close when we push and pull. If we tear the Playdoh points that were right next to each other are torn apart which is not continuous. So from a topological point of view a square and a circle are the same since we can just push and pull the square (without ripping it) to form a circle. This intuition, along with the quick examples below, are probably all you really need out of this post but if you're curious read on.
The general definition below is so general that it leads to all sorts of exotic topological spaces, and even with some extra assumptions there are still very many mind bending examples of spaces with interesting properties. In fact there is a whole book Counter Examples in Topology which shows that there are many strange examples of spaces with exotic properties, such as the warsaw circle. We won't really be concerned with such exotic examples and we won't be concerned with the general definition either, but it is included for completeness
Quick Examples
1. Torus and Double Torus
2. Circle
3. Sphere
4. Mobius Band
All images taken from Wikipedia
First Naive Definition
Intuitively a topological space $\mathcal{X}$ is a collection of points with some extra information telling us how "close" points are to each other. We need this extra information because recall we want to consider properties that a preserved under continuous transformations and the idea of continuous requires a notion of closeness. So what we use is the idea of a neighborhood of a point, or an open set, around a point (generally the word neighborhood is reserved for a metric space). If two points are close there should be a small open set that contains both of them.Second Less Naive Definition
We can refine our previous definition to say a topological space $\mathcal{X}$ is a collection of points together with a collection of open sets (recall open sets are like neighborhoods telling us "closeness" information about points). This is essentially the abstract definition except we need to make sure these open sets actually behave the way we want them to. In other words, not any old set can be an open set, we have to define what it means for a set to be open. Recall that open sets are a more general way of declaring a neighborhood of a point or in other words, open sets are a way of defining "closeness" in a topological space. This is a more abstract way of thinking as we don't need a metric to know what "close" means, we only need the open sets. With this in mind mathematicians came to the following definition that abstractly describes the properties such sets must have. This definition won't be important to us but its included for completenessFull definition
A topology, $\mathcal{T}$ on a set $\mathcal{X}$ is a collection of subsets $U\subset \mathcal{X}$ called open sets, such that- $\mathcal{X}\in \mathcal{T}$
- The union of any collection of open sets is open. That is if $U_i\in\mathcal{T}$ then $\bigcup U_i\in \mathcal{T}$
- The intersection of any finite collection of open sets is open. That is if $A,B\in \mathcal{T}$ then $A\cap B\in \mathcal{T}$
Examples
- $\mathbb{R}$ with open sets any open interval $(a,b)$ or union of open intervals $(a,b)\cup(c,d)$ ect.
- $\mathbb{R}^n$ with open sets being the union and finite intersection of any open balls in space
- $\mathbf{S}^1$ the circle. The open sets are just the union of open intervals of the circle
Topological Data Analysis - Start Here
About these posts on TDA
Topological Data Analysis (TDA) is exactly what it sounds like, using tools from topology to study data. I plan on writing a series of posts that takes us from the basics of topology to the current state of affairs in TDA, this is the first in that series.What is Topology
From a million miles away topology is the study of shapes, you may also remember that geometry is the study of shapes. So what is the difference between geometry and topology? Well, geometry cares about every little detail of a shape like, does it have corners, what is the curvature at a point, what are the distances between two points, angles, ect. Topology however only cares about the global properties of a shape, what is the basic shape of the object even if we smush it around a little bit? Below is an example, you can see that geometry cares that a square and a circle are different (one has corners ect) however topology only cares that both basically form a loop.To make this more formal we can say that topology is the study of the properties of shapes that are preserved under continuous deformations. So two shapes are considered the same if we can stretch or bend one to look like the other. It is exactly this flexibility that makes topology such a useful tool.
Motivation for TDA
Topology is a mature mathematical subject with many tools and techniques. The basic idea behind TDA is to use these techniques to learn something about a data set. Data sets may be very high dimensional making them impossible to visualize and hard to find qualitative information about. This is where topology comes in, the dimension of the data is in many ways irrelevant and the tools of topology give new types of "statics" for data sets.Basic outline for use of TDA
- Start with a data set
- Build a topological space (shape) out of the data
- Compute the (persistent) homology of the space (homology is a computable topological invariant of the space, we'll have alot more to say about homology later)
- Use the information from 3 to further investigate your data