Data Visualisation - Histogram (Frequency distribution)


1 - About

A histogram is a type of graph used to display the distribution of a variable.

Histograms can reveal information about the shape of the distribution that summary statistics do not capture.

A histogram is simply the plot of a frequency distribution.

A histogram plots continuous data grouped into bins. For categorical data, use a bar chart instead.
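As a minimal sketch of what "grouping continuous data into bins" means, the base R snippet below builds a frequency distribution by hand. The values and the bin width are hypothetical and chosen only for illustration.

```r
# Hypothetical continuous measurements (illustrative values only).
values <- c(1.2, 2.5, 2.9, 3.1, 3.8, 4.4, 4.6, 5.0, 7.3, 9.9)

# Bin edges: four bins of width 2.5 covering the range 0 to 10.
bin_edges <- seq(0, 10, by = 2.5)

# cut() assigns each value to a bin; by default the bins are
# right-closed, so 2.5 falls into the first bin and 5.0 into the second.
bins <- cut(values, breaks = bin_edges)

# table() counts the values per bin: this is the frequency distribution.
frequency <- table(bins)
print(frequency)
```

Calling hist(values, breaks = bin_edges) should draw these same counts as bars.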

A histogram is a combination of:


3 - Histogram

3.1 - Ggplot

The code below handles outliers by:

  • manually creating the breaks
  • limiting the Cartesian coordinates (zooming)

library(ggplot2)

## Create bin breaks: 10-second bins up to 120, then one last bin up to the maximum
value_breaks <- c(seq(10, 120, by = 10), max(res_succes$TOTAL_TIME_SEC))
## Labels: the last break is labelled "Max+"
label_breaks <- c(as.character(seq(10, 120, by = 10)), "Max+")
## Refer to the columns by name inside aes() rather than with res_succes$...
ggplot(res_succes, aes(x = TOTAL_TIME_SEC, fill = factor(REPORT_TYPE))) +
  geom_histogram(breaks = value_breaks) +
  labs(x = "Total Time (sec)", fill = "Report Type") +
  ## Zoom on 10-130 without dropping the rows outside that range
  coord_cartesian(xlim = c(10, 130)) +
  scale_x_continuous(breaks = value_breaks, labels = label_breaks)
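
To see how the manual breaks collect the outliers into one last bin, the sketch below reproduces the binning in base R, without ggplot2. The vector total_time_sec is hypothetical data standing in for res_succes$TOTAL_TIME_SEC, and the values 400 and 950 play the role of outliers.

```r
# Hypothetical durations in seconds (not the original data set).
total_time_sec <- c(12, 18, 25, 33, 47, 58, 64, 71, 89, 104, 118, 400, 950)

# Same break construction as above: regular 10-second bins up to 120,
# then a single wide bin that ends at the maximum value.
value_breaks <- c(seq(10, 120, by = 10), max(total_time_sec))

# plot = FALSE returns the bin counts without drawing anything.
h <- hist(total_time_sec, breaks = value_breaks, plot = FALSE)

# The last bin holds every value above 120, however far away it lies.
overflow_count <- h$counts[length(h$counts)]
print(h$counts)
print(overflow_count)
```

This construction assumes the maximum is greater than 120. Otherwise the breaks are no longer strictly increasing and would need to be adjusted.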

4 - Documentation / Reference

data/viz/histogram.txt · Last modified: 2017/11/28 23:29 by gerardnico