Line graph

Synonyms:
Line chart

A line graph is commonly used to display change over time as a series of data points connected by straight line segments on two axes.

The line graph therefore helps to determine the relationship between two sets of values, with one data set always being dependent on the other set.

Line graphs are drawn so that the independent data are on the horizontal x-axis (e.g. time) and the dependent data are on the vertical y-axis. Line graphs are used to track changes over short and long periods of time. There is some debate about how frequently data must be measured between time points. Some say the data must be measured nearly continuously for the lines to be accurate representations. Others feel a monthly measurement is sufficient, even though the line implies data at points where no measurement was taken.

Line graphs are useful in that they show data variables and trends very clearly and can help to make predictions about the results of data not yet recorded. They can also be used to display several dependent variables against one independent variable. When comparing data sets, line graphs are only useful if the axes follow the same scales. Some experts recommend no more than 4 lines on a single graph; any more than that and it becomes difficult to interpret.
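
As a minimal sketch of the same-scale point, assuming each data set is a plain list of numbers, the Python helper below computes one vertical range that covers every line, so that all lines can be drawn against the same axis. The function name shared_axis_limits, the padding parameter, and the sample values are hypothetical and for illustration only.

```python
def shared_axis_limits(series_list, padding=0.05):
    """Return one (low, high) y-axis range that covers every series.

    padding is the fraction of the overall range added above and below.
    """
    values = [value for series in series_list for value in series]
    low, high = min(values), max(values)
    margin = (high - low) * padding
    return low - margin, high + margin


# Hypothetical values, for illustration only.
program_a = [10, 12, 15, 14]
program_b = [40, 38, 45, 50]

limits = shared_axis_limits([program_a, program_b], padding=0.0)
print(limits)  # (10.0, 50.0)
```

Plotting both lines against these limits, rather than letting each line pick its own range, keeps the visual comparison honest.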

"Line graphs and scatterplots are similar in that they record individual data values as marks on a graph. The difference between these two formats is how the line is created. In line graphs, the line is created by connecting each individual data point to show local changes; in this way, the local change from point to point can be seen. This is done when it is important to be able to see the local change between pairs of points. An overall trend can still be seen, but this trend is joined by the local trend between individual or small groups of points, whereas the line of the scatterplot does not connect individual points but instead shows the trend followed by the data."

Line graphs that connect only two points in time are called slope graphs. These can be handy when you don’t have data for each time period, but want to compare, for example, the start of the program in 1986 to the current state of the program in 2014. Use these if you don’t have other data to show program outcomes.

Examples

The difference in median household income between California and the rest of the United States

This line chart from the California Budget Project shows the change in median household income in California between 1989 and 2010. Different colours distinguish the line for California from the line for the United States.

Red River discharge rate per month in 1993

Single line on graph showing Red River discharge peaking in April and again in August 1993

Source: Wallace (2005) 

"This example could have also been produced as a bar graph. You would use a line graph when you want to be able to more clearly see the rate of change (slope) between individual data points. If the independent variable was nominal, you would almost certainly use a bar graph instead of a line graph."

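The rate of change that a line graph makes visible is the slope of each segment between neighbouring data points. As a rough sketch, assuming numeric x values (for example month numbers), it can be computed as follows. The function name local_slopes and the sample values are hypothetical and are not the Red River data.

```python
def local_slopes(xs, ys):
    """Return the slope of each line segment between consecutive points."""
    points = list(zip(xs, ys))
    return [
        (y1 - y0) / (x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


# Hypothetical monthly values, for illustration only.
months = [1, 2, 3, 4, 5]
discharge = [100, 150, 400, 900, 600]

slopes = local_slopes(months, discharge)
print(slopes)  # [50.0, 250.0, 500.0, -300.0]
```

A steep positive slope followed by a negative one marks a peak, which is the kind of local change a bar graph shows less directly.
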
"Here, we have taken the same graph seen above and added a second independent variable, year. Both the independent variables, month and year, can be treated as being either ordinal or scalar. This is often the case with larger units of time, such as weeks, months, and years. Since we have a second independent variable, some sort of coding is needed to indicate which level (year) each line is. Though we could label each bar with text indicating the year, it is more efficient to use color and/or a different symbol on the data points. We will need a legend to explain the coding scheme.

Multiple line graphs have space-saving characteristics over a comparable grouped bar graph. Because the data values are marked by small marks (points) and not bars, they do not have to be offset from each other (only when data values are very dense does this become a problem). Another advantage is that the lines can easily be dual coded. Lines can be both color coded (for computer and color print display) and shape coded with symbols (for black & white reproduction). With bars, shape coding cannot be used, and pattern coding has to be substituted. Pattern coding tends to be much more limiting.

Notice that there is a break in the 1996 data line (green/triangle) between August and October. Because the data point for September is missing, the line should not be connected between August and October since this would give an erroneous local slope. This is particularly important if you display the line without symbols at individual data points."
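
One way to avoid connecting across a missing measurement is to split the series into separate runs before drawing, then draw each run as its own line. The sketch below assumes missing values are recorded as None. The function name split_at_gaps and the sample values are hypothetical and are not the 1996 data.

```python
def split_at_gaps(xs, ys):
    """Split a series into runs of consecutive points with no missing values.

    Each run can be drawn as its own line, so no segment spans a gap.
    """
    segments, current = [], []
    for x, y in zip(xs, ys):
        if y is None:
            # A missing value ends the current run, if there is one.
            if current:
                segments.append(current)
            current = []
        else:
            current.append((x, y))
    if current:
        segments.append(current)
    return segments


# Hypothetical values with September missing, for illustration only.
months = ["Jul", "Aug", "Sep", "Oct", "Nov"]
values = [300, 250, None, 200, 180]

segments = split_at_gaps(months, values)
print(segments)
# [[('Jul', 300), ('Aug', 250)], [('Oct', 200), ('Nov', 180)]]
```

Drawing the two runs separately leaves the August to October stretch blank instead of implying a slope that was never measured.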

Multiple lines on a graph comparing Red River discharge over several years; discharge generally peaks in April with the highest being April 1997

Advice for choosing this method

Line graphs are usually used to represent changes over time.

Advice for using this method

Where possible, label each line directly with its name rather than using a legend.

Highlight important points on the line with their exact values, such as the highest and lowest points or those points where actual data collection occurred.
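
As a small sketch of how the points worth labelling might be picked out, assuming the data are held as parallel lists of labels and values, the helper below returns the highest and lowest points. The function name extreme_points and the sample values are hypothetical.

```python
def extreme_points(labels, values):
    """Return the (label, value) pairs for the highest and lowest points."""
    points = list(zip(labels, values))
    highest = max(points, key=lambda point: point[1])
    lowest = min(points, key=lambda point: point[1])
    return highest, lowest


# Hypothetical values, for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May"]
values = [120, 90, 300, 750, 410]

highest, lowest = extreme_points(months, values)
print(f"Label the peak: {highest[0]} = {highest[1]}")  # Label the peak: Apr = 750
print(f"Label the low: {lowest[0]} = {lowest[1]}")     # Label the low: Feb = 90
```

If several points tie for the highest or lowest value, this sketch returns the first one encountered.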

Resources

Wikipedia (2012). Line chart. Retrieved from http://en.wikipedia.org/wiki/Line_chart

Wallace, R. (2005, May 16). Line graphs and scatter plots. Retrieved from https://projects.ncsu.edu/labwrite//res/gh/gh-linegraph.html
