Amazon S3 cloud storage service data set
Dataset posted on 18.07.2019 by Antonio Pescapè, Valerio Persico, Antonio Montieri
The dataset contains cloud network performance data for the Amazon S3 storage service, gathered in experimental campaigns conducted in May 2016. It was collected using 77 Bismark vantage points (VPs), instructed as follows. Each VP performed repeated download cycles over 7 days. Each cycle consists of 40 sequential download requests spaced 10 seconds apart, each uniquely identified by a combination of factors: cloud region, file size, and storage class. Downloads within a cycle are randomly scheduled, and cycles are repeated from each VP every 2 hours. After every download, the VP runs TCP-traceroute towards the IP address that served the request, in order to trace the path and estimate the RTT to the S3 cloud datacenter. This information is not always available, owing to the firmware version of the Bismark nodes and the measurement tools available on them.

When referring to our data set, please cite the following reference: Valerio Persico, Antonio Montieri, Antonio Pescapè: On the Network Performance of Amazon S3 Cloud-Storage Service. CloudNet 2016: 113-118
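The measurement schedule described above can be sketched in a few lines of Python to show what one cycle looks like and how many downloads the campaign nominally implies. This is only an illustration, not the authors' measurement code. The factor levels (4 regions, 5 file sizes, 2 storage classes) are hypothetical placeholders chosen so that their combinations total 40; the description does not state the actual levels. The counts are upper bounds that assume every cycle ran to completion on every VP.

```python
import itertools
import random

# Values stated in the dataset description.
NUM_VPS = 77
CAMPAIGN_DAYS = 7
CYCLE_PERIOD_S = 2 * 3600      # a cycle starts every 2 hours
DOWNLOADS_PER_CYCLE = 40
SPACING_S = 10                 # gap between consecutive requests

# ASSUMPTION: placeholder factor levels, not taken from the dataset.
# They are chosen only so that 4 * 5 * 2 = 40 combinations.
REGIONS = ["region-a", "region-b", "region-c", "region-d"]
FILE_SIZES = ["size-1", "size-2", "size-3", "size-4", "size-5"]
STORAGE_CLASSES = ["class-x", "class-y"]


def build_cycle(rng):
    """Return one cycle as (offset_seconds, region, file_size, storage_class).

    Every factor combination appears exactly once, in random order,
    with requests spaced SPACING_S seconds apart.
    """
    combos = list(itertools.product(REGIONS, FILE_SIZES, STORAGE_CLASSES))
    rng.shuffle(combos)
    return [(i * SPACING_S, *combo) for i, combo in enumerate(combos)]


def nominal_counts():
    """Return (cycles per VP, downloads per VP, downloads overall)."""
    cycles_per_vp = CAMPAIGN_DAYS * 24 * 3600 // CYCLE_PERIOD_S
    per_vp = cycles_per_vp * DOWNLOADS_PER_CYCLE
    return cycles_per_vp, per_vp, per_vp * NUM_VPS


cycle = build_cycle(random.Random(0))   # fixed seed for reproducibility
cycles_per_vp, per_vp, total = nominal_counts()
print(len(cycle), cycle[-1][0])          # requests per cycle, last offset
print(cycles_per_vp, per_vp, total)
```

With these parameters a cycle issues its last request 390 seconds after the first, well within the 2-hour period, and the nominal totals are 84 cycles and 3360 downloads per VP. The actual number of records in the dataset may be lower, for example if a VP was offline during part of the campaign.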