Discovery Tool: Downsample

That is how you can use the graphical window. Now I want to show you how you can use the more advanced tools, the dimensionality reduction and the layout, to explore your data a bit more and to characterize your populations. At this point you know how to gate on different populations, and you know how to create your own gene sets and how to combine them into new Boolean gene sets, which is good enough. What we want to do next is work with dimension reduction and try to characterize cells in an unsupervised way.

The discovery tools are here, in the Analyze tab, if you haven't changed anything. So we have dimension reduction: PCA, LDA and t-SNE. They all serve basically the same purpose, and you might have a good reason to use t-SNE over PCA or the opposite, or you can simply try them and see what works best for you. The good thing is that you don't have to program that. You can simply select a few options and start them. The other good thing comes when you have many more cells. That is not my case here with 5,000 cells, but if you have, say, a million cells, you probably want to select one or two clusters and then run a PCA or t-SNE on them, using only a few dimensions of your gene sets to reduce the noise associated with the rest of your data.

The point is, if the data is extremely noisy because you have many different cells, and you want to characterize only the immune cells that are in your tumor, then you should exclude the tumor cells, gate only on your immune cells, and use them to reduce your dimensionality. That way you remove all the noise associated with the other cells, and you can do that over and over to characterize and go deeper into your cells.
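To make that idea concrete, here is a minimal sketch in plain Python, outside of SeqGeq. It assumes a toy expression table with made-up values and uses per-gene variance as a stand-in for what a variance-driven method such as PCA would pick up first. The cell labels, the expression values, and the helper names `gene_variances` and `top_gene` are all illustrative assumptions, not anything SeqGeq exposes.

```python
from statistics import pvariance

# Toy table: (population label, {gene: expression}). All values are made up.
cells = [
    ("tumor",  {"EPCAM": 9.0, "CD3E": 0.0, "CD19": 0.0}),
    ("tumor",  {"EPCAM": 8.0, "CD3E": 0.0, "CD19": 0.0}),
    ("tumor",  {"EPCAM": 9.5, "CD3E": 0.0, "CD19": 0.0}),
    ("immune", {"EPCAM": 0.0, "CD3E": 5.0, "CD19": 0.0}),
    ("immune", {"EPCAM": 0.0, "CD3E": 4.0, "CD19": 0.5}),
    ("immune", {"EPCAM": 0.5, "CD3E": 0.0, "CD19": 6.0}),
    ("immune", {"EPCAM": 0.0, "CD3E": 0.5, "CD19": 5.0}),
]


def gene_variances(rows):
    """Population variance of each gene across the given cells."""
    genes = rows[0][1].keys()
    return {g: pvariance([expr[g] for _, expr in rows]) for g in genes}


def top_gene(rows):
    """Gene with the largest variance, i.e. what dominates the spread."""
    variances = gene_variances(rows)
    return max(variances, key=variances.get)


# "Gate" on the immune cells only, dropping the tumor cells.
immune = [c for c in cells if c[0] == "immune"]

print("Dominant gene, all cells:   ", top_gene(cells))
print("Dominant gene, immune gate: ", top_gene(immune))
```

With all cells included, the spread is dominated by the tumor-versus-immune difference. Once the tumor cells are gated out, the differences within the immune cells become the main signal, which is what you want the dimension reduction to work on.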

We have the downsample tool too, which I won't have to use today. It's only for when you have many more cells and want to reduce that number a bit before you start any dimension reduction. Otherwise it would just take you forever. You can start it overnight, but if you don't have that much time, use the downsample tool and reduce the data to something like 10,000 to 50,000 cells, which can be analyzed within about an hour on a normal computer.

That's really true for t-SNE, and for PCA as well. With not even 5,000 cells I won't have to do it, but if you start it you will have to select whatever gate you want to use: all of your cells, only your B cells, whatever you have created. Then you can choose between deterministic and random, so those are basically the two options. If you choose random, well, it's called random because it is random: cells are selected randomly, so it will give you a new result each time. You can ask SeqGeq to select from across all cells, or to try to pick them in a way that is density dependent. Select a mild, medium, or strong association with density, and that will give you something totally random.
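As a rough illustration of the two random options, here is a minimal sketch in plain Python. It is not SeqGeq's algorithm, which is not described here. It assumes cells are 2D points, estimates density by counting neighbors within a radius, and assumes that density-dependent sampling favors sparse regions so that rare populations survive the downsample. The function names and the `strength` parameter (standing in for mild, medium, strong) are hypothetical.

```python
import random


def uniform_downsample(cells, k, rng):
    """Pick k cells uniformly at random, without replacement."""
    return rng.sample(cells, k)


def local_density(points, radius):
    """Number of points within `radius` of each point (the point itself counts)."""
    r2 = radius * radius
    return [
        sum(1 for q in points if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= r2)
        for p in points
    ]


def density_downsample(points, k, rng, strength=1.0, radius=1.0):
    """Pick k points at random, weighting each by 1 / density**strength.

    Assumption: a higher `strength` pushes the selection harder toward
    sparse regions. Weighted sampling without replacement is done with
    the Efraimidis-Spirakis trick: key = u ** (1 / weight), keep the
    k largest keys.
    """
    densities = local_density(points, radius)
    weights = [1.0 / (d ** strength) for d in densities]
    keyed = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keyed.sort(reverse=True)
    return [points[i] for _, i in keyed[:k]]


# Toy data: one dense cluster of 200 points plus 20 scattered points.
data_rng = random.Random(0)
points = [(data_rng.gauss(0, 0.1), data_rng.gauss(0, 0.1)) for _ in range(200)]
points += [(data_rng.uniform(5, 50), data_rng.uniform(5, 50)) for _ in range(20)]

uniform_pick = uniform_downsample(points, 30, random.Random())
density_pick = density_downsample(points, 30, random.Random(), strength=2.0)
print(len(uniform_pick), len(density_pick))
```

Because the generators are unseeded in the last lines, each run returns a different selection, which matches the behavior described for the random option.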

If you don't want that, you can select deterministic, which is the default option, I guess, when you open downsample. With deterministic you will always get the same result: if you redo the same downsample, you will get exactly the same cells. Again, choose whether you want to select across all cells, only the first cells, the last cells, or the first and last. You can play with the options here, and with the genes here in the advanced options you can select whatever you want to use for that reduction.
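The deterministic choices can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the mode names are hypothetical, and reading "all cells" as an evenly spaced pick through the whole file is my interpretation, not something confirmed for SeqGeq.

```python
def deterministic_downsample(cells, k, mode="even"):
    """Return k cells chosen by position only, so reruns give identical output.

    Modes (hypothetical names):
      "first"          - the first k cells
      "last"           - the last k cells
      "first_and_last" - half from the start, half from the end
      "even"           - evenly spaced through the whole list (assumed
                         meaning of selecting across all cells)
    """
    n = len(cells)
    if k >= n:
        return list(cells)
    if mode == "first":
        return list(cells[:k])
    if mode == "last":
        return list(cells[n - k:])
    if mode == "first_and_last":
        head = (k + 1) // 2      # the extra cell goes to the start when k is odd
        tail = k - head
        return list(cells[:head]) + list(cells[n - tail:])
    if mode == "even":
        return [cells[(i * n) // k] for i in range(k)]
    raise ValueError(f"unknown mode: {mode}")


cells = list(range(10))  # stand-in for 10 cells in file order
for mode in ("first", "last", "first_and_last", "even"):
    print(mode, deterministic_downsample(cells, 4, mode))
```

Nothing here depends on a random generator, so running the same downsample twice on the same input returns the same cells.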

Each time you see something here in gray, that means you cannot use that action. Normally you need to select a sample first, so right now I cannot reduce the dimensionality. You also have a small help menu: if you hover somewhere, a tooltip will display telling you what that button is used for, plus the shortcut, if you want to save some time. If I select a sample, now it's colored, and I can do both downsample and dimension reduction.