How to find marker genes in cell clusters

The thousands of cells in a biological sample are all different and can be analyzed individually, cell by cell. Based on their gene activity, they can be sorted into clusters. But which genes are particularly characteristic of a given cluster, i.e. what are its “marker genes”? A new statistical method called Association Plot facilitates the determination and analysis of these marker genes.

Which genes are specific to a certain cell type, i.e. “mark” its identity? As datasets grow ever larger, answering this question is often challenging. In many cases, marker genes are simply genes that have been found in specific cell populations. However, many more genes could be characteristic of a particular cell type but remain undiscovered.

“Association Plots (APL),” a new statistical method for visualizing gene activity within a cell cluster, makes it easier to find that cluster’s marker genes. The plots compare the activity of genes in a given cluster with their activity in all other clusters of the dataset. Additionally, they make it easy to see which genes are shared with other clusters.
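To make the idea of comparing one cluster against all others concrete, here is a deliberately simplified sketch in Python. It is not the Association Plot computation itself, only a stand-in that ranks genes by how much higher their average activity is inside a chosen cluster than outside it. The gene names, cluster labels ("T" and "B"), counts, and the function `rank_markers` are all hypothetical and exist only for illustration.

```python
from statistics import mean

# Hypothetical toy data: expression counts per gene across six cells.
expression = {
    "GeneA": [9, 8, 10, 1, 0, 2],
    "GeneB": [5, 6, 5, 5, 6, 4],
    "GeneC": [0, 1, 0, 7, 9, 8],
}
# Hypothetical cluster label of each cell, in the same order as the counts.
clusters = ["T", "T", "T", "B", "B", "B"]


def rank_markers(expression, clusters, target):
    """Rank genes by mean expression inside `target` minus mean outside it.

    A simplified illustration only, not the Association Plot method.
    """
    inside = [i for i, c in enumerate(clusters) if c == target]
    outside = [i for i, c in enumerate(clusters) if c != target]
    scores = {}
    for gene, counts in expression.items():
        in_mean = mean(counts[i] for i in inside)
        out_mean = mean(counts[i] for i in outside)
        scores[gene] = in_mean - out_mean
    # Highest score first: the strongest marker candidates for `target`.
    return sorted(scores, key=scores.get, reverse=True)


print(rank_markers(expression, clusters, "T"))
```

In this toy example, GeneA ranks first for cluster "T" and GeneC ranks first for cluster "B", while GeneB, which is about equally active in both, is the kind of gene that would count as shared between clusters.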

“Association Plots not only allow us to identify new marker genes. It also works the other way around — we are able to match clusters of unknown identity in a dataset to cell types, based on a provided list of marker genes,” says Elzbieta Gralinska of the Max Planck Institute for Molecular Genetics in Berlin.

The biotechnologist works in the team of Martin Vingron, which developed the technique, demonstrated its functionality on two publicly available datasets, and published the results. Moreover, APL has been released as a free package for the statistical environment R. The APL package allows researchers to visually inspect their single-cell data and select individual genes with the cursor to examine them in more detail.

Analyzing and grouping single cells

Why is it necessary to identify marker genes in the first place? Modern sequencing technologies are able to decipher individual RNA molecules in individual cells. From a blood sample, for example, each cell can be separated and a sample of the cell’s RNAs can be decoded. These single-cell data represent the active genes that were transcribed into RNA molecules.
