In this example, we will analyze and compare data for Galectin-3, Galectin-3C and Galectin-7 that were collected on the CFG arrays. GLAD makes it easy to differentiate the Galectins and analyze the key differences between them. Galectin-3C is a recombinant truncated form of Galectin-3 that lacks the 107 amino acids at the N-terminus. You can either watch the video or read the steps with screenshots below.
Sample Data Used in This Example in GLAD Format
Note- There may be some differences, such as colors, between the video and the step-by-step walkthrough.
1. Get the Data:
Either obtain the data from the Consortium for Functional Glycomics (CFG) website and use the ExceltoGLAD tool to convert it into a tab-delimited text file formatted for GLAD; or
Use the sample data provided above, which is already in the GLAD format.
2. Open GLAD:
Open a new browser window and open the GLAD tool by going to https://www.glycotoolkit.com/Tools/GLAD/
3. Input Data:
Click on the button to add your data to the selections. You can select multiple files at once.
4. Format Colors:
Check and change the sample colors by clicking on the button.
5. Sort Data:
Sort the data into the order you like: click on the button, input the sort order, and then click on the “Sort Samples” button.
6. Save Selections:
Save the selections by clicking on the and then the button. This allows you to resume from this point.
7. Normalize the Data:
We normalize the data before we begin so that the samples are easier to compare.
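To illustrate why normalization helps, here is a minimal Python sketch. The walkthrough does not say which normalization GLAD applies, so scaling each sample to its own maximum signal is purely an assumption made for this example, and the function name and signal values are made up.

```python
def normalize_to_max(signals):
    """Scale each signal to a percentage of the sample's highest signal.

    Assumption: this is one common normalization choice, not necessarily
    the one GLAD uses.
    """
    peak = max(signals.values())
    if peak <= 0:
        # Nothing bound; avoid dividing by zero
        return {name: 0.0 for name in signals}
    return {name: 100.0 * value / peak for name, value in signals.items()}


# Made-up signal values for illustration only
galectin_3 = {"glycan_A": 20000.0, "glycan_B": 5000.0}
galectin_3c = {"glycan_A": 2000.0, "glycan_B": 500.0}

print(normalize_to_max(galectin_3))
print(normalize_to_max(galectin_3c))
```

In this toy case the two samples differ tenfold in raw signal but have the same normalized profile, which is the kind of comparison normalization is meant to make easier.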
Note- In all the visualizations that follow, you can view the structure of a glycan by opening the Glycan Structure drawer on the right side of the window and placing your mouse pointer over a data point.
Note- You can also save the figure as an SVG file by clicking on the button on the top right of the visualization.
8. Create the Grouped Bar Chart with Filtered Data:
Create the Grouped Bar Chart by clicking on the button. Because we have selected all the data, the chart contains a lot of information.
We can focus on specific glycans by filtering the data. Click on the button, input the query “Galb1-4GlcNAcb1-3,Manb1-4” with the option “And” selected, and click on Filter Glycans by Name. Once the filtering is done, a message reports how many data points were filtered, and an undo icon appears next to the data manipulation so that you can undo the action. To see the error bars, open the settings of the Grouped Bar Chart by clicking on the settings icon and check the option Display Error Bars. Then click on the button to refresh the chart.
To see the structure of these glycans, open the Glycan Structure drawer and place your mouse over the data; the structure will be visible on the side. This shows that all the glycans share the Galb1-4GlcNAcb1-3 motif and the N-glycan core Manb1-4, which matches our filter query.
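The filter used in this step can be sketched in Python as follows. This is only an illustration of a comma-separated query combined with "And": it assumes a case-sensitive substring match on the glycan name, which may differ from what GLAD does internally. The function name and the glycan names are made up for the example.

```python
def filter_glycans(names, query, mode="and"):
    """Keep glycan names that contain the comma-separated query terms.

    mode="and" requires every term to be present; mode="or" requires
    at least one. Matching is a plain substring test (an assumption).
    """
    terms = [term.strip() for term in query.split(",") if term.strip()]
    combine = all if mode == "and" else any
    return [name for name in names if combine(term in name for term in terms)]


# Illustrative glycan names, not taken from the sample data
names = [
    "Galb1-4GlcNAcb1-3Galb1-4GlcNAcb1-2Mana1-3Manb1-4GlcNAcb1-4GlcNAcb-Sp12",
    "Galb1-4GlcNAcb1-3Galb1-4Glcb-Sp0",
    "Mana1-6(Mana1-3)Manb1-4GlcNAcb1-4GlcNAcb-Sp12",
]

kept = filter_glycans(names, "Galb1-4GlcNAcb1-3,Manb1-4", mode="and")
print(len(kept), "of", len(names), "glycans match both terms")
```

With "And", only names containing both Galb1-4GlcNAcb1-3 and Manb1-4 survive. Switching to "or" would keep any name containing either term.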
9. Create the Heatmap:
The next data visualization is the heatmap. Its colors and scales can be adjusted in the settings, which you open by clicking on the settings icon next to the heatmap / calendar heatmap buttons. Once they are adjusted, click on the button to produce the heatmap. The heatmap scales can be adjusted to visualize lower-binding glycans. The heatmap makes patterns of similar binding very easy to observe and compare.
10. Create Calendar Heatmap:
Visualizing all the data for the >600 glycans per array in a heatmap would make the heatmap very long vertically. We therefore created a visualization called the calendar heatmap. First we undo the filter by clicking on the undo icon. We then check the settings of the calendar heatmap. Its color and scale settings are shared with the heatmap (and can be set as described above). Then we click on the button to produce the Calendar Heatmap, and use the scroll wheel on the visualization to zoom out.
GLAD automatically aligns all the glycans by name. If glycans are missing or different glycans are present, they are either left as white spaces or added to the end of the block. By adjusting the scales you can bring up the low binders in the Galectin-3C data, as Galectin-3C showed lower binding overall. Even so, the pattern Galectin-3C produces is more similar to that of Galectin-3 than to that of Galectin-7, which is to be expected.
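The idea of aligning glycans by name can be sketched in Python as below. This is a simplified illustration, not GLAD's implementation: it places every glycan name on a shared axis and marks absent values with `None` (the "white space" case), and it does not reproduce the option of appending unmatched glycans to the end of the block. The function name and values are made up.

```python
def align_by_name(samples):
    """Align several samples on the union of their glycan names.

    samples maps a sample name to a {glycan name: signal} dict.
    Returns {glycan name: [signal per sample]}, with None where a
    sample has no value for that glycan.
    """
    all_names = sorted(set().union(*(signals.keys() for signals in samples.values())))
    return {
        name: [signals.get(name) for signals in samples.values()]
        for name in all_names
    }


# Made-up values for illustration only
samples = {
    "Galectin-3": {"glycan_A": 1.0, "glycan_B": 2.0},
    "Galectin-7": {"glycan_B": 3.0, "glycan_C": 4.0},
}

for name, row in align_by_name(samples).items():
    print(name, row)
```

Each row then corresponds to one glycan across all samples, which is what lets matching tiles be compared directly.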
11. Create a Force Graph:
To create a force graph, simply click on the button. You can tweak the graph by changing its settings, which you open by clicking on the settings icon next to it. The force graph creates a network visualization in which the colored nodes represent the samples and the gray nodes represent individual glycans. A link is formed if the sample binds a glycan above a minimum value. The link distance between the nodes is governed by the strength of binding.
From the force graph it is easy to see how each Galectin shares some binding partners with the others. Double-clicking on a node highlights all the other nodes linked to it. Interestingly, the force graph also clusters the samples: the Galectin-3 samples are separated from the Galectin-3C and Galectin-7 samples, making the differences easy to visualize.
One can also highlight specific glycan features by typing in a query in the Search and Highlight Glycans field and clicking on Go.
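The way the force graph derives its links can be sketched in Python as follows. The thresholding follows the description above; the mapping from binding strength to link distance (stronger binding gives a shorter link, scaled linearly between two made-up pixel distances) is an assumption for illustration, as are the function and parameter names.

```python
def build_links(samples, min_signal, min_distance=30.0, max_distance=200.0):
    """Create sample-to-glycan links for signals above min_signal.

    Assumption: distance shrinks linearly as the signal approaches the
    sample's strongest signal. GLAD's actual mapping is not documented here.
    """
    links = []
    for sample, signals in samples.items():
        peak = max(signals.values())
        if peak <= 0:
            continue  # no binding at all for this sample
        for glycan, value in signals.items():
            if value <= min_signal:
                continue  # below the minimum value: no link
            strength = value / peak
            distance = max_distance - strength * (max_distance - min_distance)
            links.append({"source": sample, "target": glycan, "distance": distance})
    return links


# Made-up values for illustration only
samples = {"Galectin-3": {"glycan_A": 100.0, "glycan_B": 50.0, "glycan_C": 5.0}}

for link in build_links(samples, min_signal=10.0):
    print(link)
```

Glycans linked to more than one sample end up pulled between them, which is why shared binding partners sit between sample nodes in this kind of layout.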
12. Create a Correlation Map:
To create a correlation map, check the settings by clicking on the next to the and then click on the Correlation Map button. It produces a correlation heatmap with a scale from -1 to +1, depending on how strongly the samples correlate. The map is scaled by color, allowing for quick visualization of strongly correlating samples versus weakly correlating samples. Clicking on a tile, which compares a pair of samples within the correlation map, shows the scatter plot of the data for that pair. You can also see the Bland-Altman plot for the pair of samples by scrolling down in the window.
It becomes increasingly evident that Galectin-3C and Galectin-3 are very similar but Galectin-7 is quite distinct. The Bland-Altman plot helps identify further key glycans which show significant differences.
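For readers who want to see what sits behind one tile, here is a minimal Python sketch of a correlation coefficient and the points of a Bland-Altman plot for a pair of samples. The Pearson coefficient is used as an assumption; the walkthrough does not state which correlation measure GLAD computes. The function names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def bland_altman_points(x, y):
    """Return (mean of pair, difference of pair) for each glycan."""
    return [((a + b) / 2, a - b) for a, b in zip(x, y)]


# Made-up signals for the same glycans in two samples
sample_1 = [10.0, 20.0, 30.0, 40.0]
sample_2 = [12.0, 18.0, 33.0, 41.0]

print(round(pearson(sample_1, sample_2), 3))
print(bland_altman_points(sample_1, sample_2))
```

In a Bland-Altman plot, glycans whose difference lies far from zero are the ones that distinguish the two samples, which is how the plot helps pick out key glycans.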
13. Create the Bubble Box Plot:
It is best to create this plot on un-normalized data so that the statistics are not biased by the normalization process. To create the Bubble Box Plot, simply click on the . This produces a scatter plot with the signal on the y-axis and the different samples on the x-axis. The bubbles or spots of the scatter plot are individual glycans. Placing the cursor over a bubble highlights the signal for that glycan in all the samples. If the bubbles are too close to each other, they can be spread apart by using either the Jitter or the Spray button. You can also create a box plot overlaid on the data by clicking on the button, and you can show the mean and standard deviation by clicking on the button.
Overall, this visualization makes it clear how statistically different the samples are, especially Galectin-3C compared to the others.
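The statistics overlaid on the bubbles can be sketched with Python's standard `statistics` module. This is a rough illustration of what a box plot and a mean with standard deviation summarize for one sample; the quartile method GLAD uses is not stated, so the library default is an assumption, and the function name and values are made up.

```python
import statistics


def summarize(signals):
    """Box-plot quartiles plus mean and sample standard deviation."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3
    q1, median, q3 = statistics.quantiles(signals, n=4)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.mean(signals),
        "sd": statistics.stdev(signals),
    }


# Made-up un-normalized signals for one sample
signals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

for key, value in summarize(signals).items():
    print(key, value)
```

Running the same summary on raw signals for each sample gives a numerical counterpart to the visual comparison.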