Cloud Interdatacenter Network Performance dataset
Dataset posted on 18.07.2019 by Antonio Pescapè, Valerio Persico, Antonio Montieri, Alessio Botta
This dataset contains network performance data for both AWS and Azure inter-datacenter networks, collected between March and November 2015. The collection process required more than 790 hours of synthetic traffic generation.

We considered the 12 combinations of four regions selected for each provider (North Virginia, Ireland, Sao Paulo, and Singapore). Experiments were run between VMs of the same size (M or XL). Repeated 5-minute-long experiments were performed under the same conditions, equally spaced in 24-hour intervals. Because Amazon provides multiple availability zones in each region, we ran around 8.6K and 880 5-minute-long experiments for Amazon and Azure, respectively. Besides performance measures, path information about each scenario was also collected to complement the view of the performance.

The directory tree identifies the specific experiment. In more detail:

provider
└── VMsize and L4Protocol
    └── Region
        └── availability zone (only for AWS)
            └── file.json

Each JSON file contains experiments and reports a number of network performance metrics (throughput, latency, jitter, packet loss), considering both synthetic and detailed results. The JSON files are self-describing.

When referring to our dataset, please cite the following reference:

Valerio Persico, Alessio Botta, Pietro Marchetta, Antonio Montieri, Antonio Pescapè: On the performance of the wide-area networks interconnecting public-cloud datacenters around the globe. Computer Networks 112: 67-83 (2017)
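The directory layout described above can be traversed with a short script. The following is a minimal Python sketch, not part of the dataset itself. It assumes that the dataset has been extracted to a local folder, that the experiment files end in .json, and that the 12 region combinations are ordered (source, destination) pairs of distinct regions. The function names and dictionary keys are hypothetical, chosen only for illustration. No assumption is made about the contents of the JSON files, which are simply loaded as-is.

```python
import json
from itertools import permutations
from pathlib import Path

# Regions listed in the dataset description.
REGIONS = ["North Virginia", "Ireland", "Sao Paulo", "Singapore"]


def region_pairs(regions=REGIONS):
    """Ordered pairs of distinct regions (assumed meaning of '12 combinations')."""
    return list(permutations(regions, 2))


def describe_path(json_path, root):
    """Map a file's position in the tree to the levels named in the description.

    Expected layout (availability zone level is present only for AWS):
    provider / VMsize and L4Protocol / Region / [availability zone] / file.json
    """
    parts = Path(json_path).relative_to(root).parts
    if len(parts) not in (4, 5):
        raise ValueError(f"unexpected directory depth for {json_path}")
    return {
        "provider": parts[0],
        "vm_size_and_protocol": parts[1],
        "region": parts[2],
        # Hypothetical key: None when the zone level is absent.
        "availability_zone": parts[3] if len(parts) == 5 else None,
        "file": parts[-1],
    }


def iter_experiments(root):
    """Yield one record per JSON file: path-derived labels plus the parsed content."""
    root = Path(root)
    for json_path in sorted(root.rglob("*.json")):
        record = describe_path(json_path, root)
        with json_path.open(encoding="utf-8") as fh:
            record["data"] = json.load(fh)
        yield record
```

Calling iter_experiments on the extraction folder and inspecting record["data"] for the first few records is one way to discover the self-described structure of the files before writing any metric-specific analysis.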