
This dataset represents one week of cache requests across two East Coast, two West Coast, two Asia, and two lower-tier metals. Requests are sampled uniformly at a rate of 1/100.
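
Because the sample is uniform, a request count measured on the dataset can be multiplied by 100 to approximate the corresponding count in the full traffic. The following is a minimal sketch of that arithmetic; the names `SAMPLE_EVERY` and `estimate_total` are hypothetical and not part of the dataset tooling.

```python
# One request in every SAMPLE_EVERY is kept, per the 1/100 rate stated above.
# SAMPLE_EVERY and estimate_total are illustrative names, not dataset tooling.
SAMPLE_EVERY = 100


def estimate_total(sampled_count: int) -> int:
    """Scale a request count seen in the sample to an estimate for all traffic."""
    return sampled_count * SAMPLE_EVERY


print(estimate_total(1_250))  # 125000
```

This scaling applies to request counts only. Quantities such as the number of distinct objects do not scale linearly with the sampling rate.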

The dataset contains the fields:

| name | data type | description |
| --- | --- | --- |
| time | uint32 | |
| obj_id | uint64 | hash of URL |
| obj_size | uint32 | |
| ttl | uint32 | |
| age | uint32 | calculated as time - last-modified time |
| extension | string | (enum) jpg, png, ts, html, js... |
| origin_response_chunked | bool | the presence of 'Transfer-Encoding: chunked' in HTTP header |
| hostname | uint64 | |
| request_method | bool | only GET and PURGE requests |
| colo | uint32 | used only in dataset 2 |
| colo_tier | string | (enum) used only in dataset 2 |

Example CSV file line:

1728492382, 38fe47fc1aa3c1973099107d87216057, 12098, 3600, 36, jpg, 1, hippos.in.love, 1, dtw01, 2
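
A line like this can be split into named fields with Python's standard `csv` module. The sketch below assumes the columns appear in the same order as the field list. It converts only `time`, `obj_size`, `ttl`, and `age` to integers and `origin_response_chunked` to a boolean. The remaining fields stay as strings, because the example line shows `obj_id`, `hostname`, and `colo` as a hex digest, a domain name, and an alphanumeric code, and the meaning of the `request_method` value is not stated. `FIELDS` and `parse_line` are hypothetical names.

```python
import csv

# Assumed column order: the same order as the field list in this document.
FIELDS = [
    "time", "obj_id", "obj_size", "ttl", "age", "extension",
    "origin_response_chunked", "hostname", "request_method",
    "colo", "colo_tier",
]

# Fields that are plain decimal integers in the example line.
INT_FIELDS = ("time", "obj_size", "ttl", "age")


def parse_line(line: str) -> dict:
    """Parse one CSV line of the dataset into a dict keyed by field name."""
    # skipinitialspace drops the blank that follows each comma.
    row = next(csv.reader([line], skipinitialspace=True))
    record = dict(zip(FIELDS, row))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    # Assumption: "1" means the chunked header was present.
    record["origin_response_chunked"] = record["origin_response_chunked"] == "1"
    return record


example = (
    "1728492382, 38fe47fc1aa3c1973099107d87216057, 12098, 3600, 36, "
    "jpg, 1, hippos.in.love, 1, dtw01, 2"
)
record = parse_line(example)
print(record["obj_size"], record["extension"], record["colo"])  # 12098 jpg dtw01
```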

And an invocation example of the helper script:


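# Sketch only: so_inclined, process_line(), and be_done() are placeholders
# assumed to be provided by the helper script.
more = True

while more:
    # process_line() is assumed to return False once no lines remain.
    if so_inclined:
        more = process_line()

be_done()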


This dataset was used to generate the graphs for the paper "FIFO queues are all you need for cache eviction", published in the Proceedings of the 29th Symposium on Operating Systems Principles.