EntirelyDigital bills excess network bandwidth (usage over the contracted rate) using a statistical method called "95th percentile". Charging between ISPs is also based on 95th percentile billing.
95th percentile billing is an outgrowth of network monitoring systems that use 95th percentile measurements for capacity planning. A statistical measurement of network usage was ideal for this task. However, there is no standard defined for 95th percentile computations and many ISPs are implementing billing based on this mechanism without an understanding of the errors they introduce. This lack of a consistent, accurately implemented billing mechanism makes it impossible to compare service providers based on price or to estimate future bills based upon projected business.
95th percentile measurement allows us to bill our customers for the maximum bandwidth used during the billing period while forgiving a small amount of bandwidth spiking.
Analogy to Physical Content Delivery
Explaining percentile billing in terms of a product that is normally billed by cumulative usage, like water, can help describe how the mechanism operates.
For example, assume the water company provides 95th percentile billing for water usage on a 2-hour average gallons per minute scale. The water meter no longer keeps track of your total water usage for the month, but instead records the water usage for each 2-hour period during the month.
Twin brothers Jim and Tim live next door to each other in identical houses with identical yards. Both have automatic sprinklers that water the yard at 7 AM for 2 hours each morning. They have rather large yards and this constitutes the bulk of their water usage. However, on weekdays, Jim showers at 7 AM and Tim showers at 7 PM. At the end of the month Jim and Tim have used exactly the same amount of water. Jim's water bill is nevertheless higher, because his billed peak covers the water used each weekday to water the lawn and to shower, while Tim's covers only watering the lawn.
Jim will pay 28% more than Tim because he showers in the morning. Because percentile billing focuses on peak usage, Tim gets the water for his showers for free. Buyers of bandwidth can likewise get free bandwidth when they "shower at night". This is harder for most buyers, because most of their usage comes from an uncontrolled population of Internet users to whom they want to provide a quality response.
As a buyer of IP bandwidth, I'm first concerned with delivering an experience that gets my revenue, then I'm concerned about controlling costs. Since I don't want to force users to wait for me to respond, I have to buy the throughput to satisfy their "busy hour" needs. I can reduce my cost by funneling every other kind of usage (e.g. backups of web content, downloads of access logs) into my less busy times of the day.
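The Jim and Tim comparison can be sketched numerically. The article does not give the underlying gallons, so the figures below are hypothetical: the shower figure was picked only so that the result lines up with the 28% quoted above, all other household usage is ignored, and the 2-hour metering periods are assumed to line up with the 7 AM and 7 PM starts. The helper uses the continuous percentile calculation described later in this article.

```python
import math

def percentile_95(samples):
    """Continuous 95th percentile with linear interpolation between rows."""
    ordered = sorted(samples)
    rn = 1 + (len(ordered) - 1) * 0.95      # 1-based fractional row number
    frn, crn = math.floor(rn), math.ceil(rn)
    low, high = ordered[frn - 1], ordered[crn - 1]
    return low + (rn - frn) * (high - low)

# Hypothetical figures, not taken from the article.
SPRINKLER_GPM = 5.0    # average gallons per minute over the watering period
SHOWER_GPM = 1.4       # extra average gallons per minute in a shower period
DAYS = 30
PERIODS_PER_DAY = 12   # twelve 2-hour metering periods per day
MORNING, EVENING = 3, 9  # assumed indices of the 7-9 AM and 7-9 PM periods

def month_of_samples(shower_period):
    """Build one month of 2-hour averages for a given shower time."""
    samples = []
    for day in range(DAYS):
        periods = [0.0] * PERIODS_PER_DAY
        periods[MORNING] += SPRINKLER_GPM   # sprinklers run every morning
        if day % 7 < 5:                     # treat 5 of every 7 days as weekdays
            periods[shower_period] += SHOWER_GPM
        samples.extend(periods)
    return samples

jim = month_of_samples(MORNING)   # showers while the sprinklers run
tim = month_of_samples(EVENING)   # showers twelve hours later

jim_billed = percentile_95(jim)
tim_billed = percentile_95(tim)

print("same total usage:", math.isclose(sum(jim), sum(tim)))
print("Jim billed at", jim_billed, "gpm, Tim billed at", tim_billed, "gpm")
print("Jim pays", round((jim_billed / tim_billed - 1) * 100), "% more")
```

With these numbers, Tim's shower periods fall below the 95th percentile row, so they never reach his bill, while Jim's showers raise the very periods that set his.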
Percentile
Definition of Percentile
The pth percentile of a set of measurements is the value for which at most p% of the measurements are less than that value and at most (100-p)% of measurements are greater than that value.
Some special cases of percentiles exist. The median is equivalent to the 50th percentile. Quartiles occur at the 25th and 75th percentile. Deciles occur every 10th percentile, thus the ninth decile is the 90th percentile.
Two separate computations of percentile exist in the real world, discrete percentile and continuous percentile. Discrete percentile differs from continuous percentile in that the discrete percentile value must be a member of the data set. Use discrete percentile only in the case of a discrete distribution. It is important to note that the median of a discrete distribution may not be defined; therefore, the 50th discrete percentile may not be the median if you don't have an odd number of measurements.
(Note: you don't want to use discrete percentiles in billing applications.)
Continuous percentile treats the measurements as a statistical population and determines the value that would be the discrete percentile by interpolating a value when it isn't present. For example, if you are computing the 50th percentile and you have an odd number of measurements, the continuous and discrete values are the same. If you have an even number of measurements, you interpolate a value between the actual measurements that are just above and just below the "perfect" center of your measurements.
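As a small illustration of the difference at the 50th percentile, Python's standard `statistics` module offers both behaviors for the median: `median_low` always returns a member of the data set, while `median` interpolates between the two middle values when the count is even. Treating `median_low` as the "discrete" answer is an assumption made for this sketch; a discrete percentile rule could equally pick the upper neighbor.

```python
import statistics

even_count = [1, 3, 7, 21]   # no single middle measurement
odd_count = [1, 3, 7]        # the middle measurement is 3

# Discrete-style result: must be one of the measurements.
discrete_even = statistics.median_low(even_count)
# Continuous-style result: interpolated halfway between 3 and 7.
continuous_even = statistics.median(even_count)

# With an odd count the two approaches agree.
discrete_odd = statistics.median_low(odd_count)
continuous_odd = statistics.median(odd_count)

print(discrete_even, continuous_even)   # 3 5.0
print(discrete_odd, continuous_odd)     # 3 3
```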
Computing Continuous Percentile
Step 1. Sort the measurements
Samples must be ordered. For example, 1, 3, 7, 21, 25, 26 and 72 are my example measurements.
| Row Number | Value |
|------------|-------|
| 1          | 1     |
| 2          | 3     |
| 3          | 7     |
| 4          | 21    |
| 5          | 25    |
| 6          | 26    |
| 7          | 72    |
Step 2. Compute Row Numbers of the percentile value
Using percentile value P and number of rows N
RN = 1+((N-1)*P)
FRN = floor(RN)
CRN = ceiling(RN)
For my example above, let the percentile be the 90th. P must be expressed as a fraction, so use 0.90, not 90. The number of rows N is the number of data samples, which is 7.
The value for RN in our example is 1+((7-1)*0.90). This evaluates to 6.4. The FRN (floor row number) is 6. The CRN (ceiling row number) is 7.
Step 3. Determine the Result
if (CRN = FRN = RN) then
    result = (value of row RN)
else
    result = (value of row FRN) + (RN - FRN) * ((value of row CRN) - (value of row FRN))
Since the row numbers don't all match, we have to interpolate a value between two measurements. For our example, we interpolate between row 6 (value 26) and row 7 (value 72).
The equation is 26 + ((6.4 - 6.0) * (72 - 26)). It evaluates to 44.4. This is your continuous percentile for this sample.
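The three steps above can be sketched as a short Python function. The function name and the argument checks are illustrative choices, not part of the original procedure. The function expects P as a fraction (0.90 rather than 90), as in the worked example.

```python
import math

def continuous_percentile(measurements, p):
    """Continuous percentile of `measurements`, with p given as a fraction."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1, e.g. 0.90")

    # Step 1: sort the measurements.
    ordered = sorted(measurements)
    n = len(ordered)

    # Step 2: compute the (1-based) row numbers.
    rn = 1 + (n - 1) * p
    frn = math.floor(rn)
    crn = math.ceil(rn)

    # Step 3: interpolate between the two rows. When RN is a whole number,
    # FRN and CRN are the same row and the fraction (RN - FRN) is zero,
    # so the row's own value is returned.
    low = ordered[frn - 1]
    high = ordered[crn - 1]
    return low + (rn - frn) * (high - low)

samples = [1, 3, 7, 21, 25, 26, 72]
result = continuous_percentile(samples, 0.90)
print(round(result, 1))   # 44.4
```

Folding both branches of Step 3 into one expression avoids comparing floating-point row numbers for equality.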
Averaging Effects
The sample set is composed of the average bandwidth for each sample period during the billing period. If the sample period is the same as, or greater than, the duration of the traffic spikes, averaging distorts the measurements.
In an idealized case, the customer has high bandwidth utilization for 5 minute periods alternating with 5 minute periods of no bandwidth usage. Their ISP polls their network port using a 5 minute sample period. In the case where the ISP's sample period is synchronous with the customer's high activity period, the ISP collects alternating high and low samples. However, if the ISP's sampling is 150 seconds out of phase with the customer's high activity period then the ISP collects uniform samples with half the bandwidth used during the high activity periods. The 95th percentile value of the synchronous sample is 100% more than the 95th percentile value of the out of phase sample.
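The idealized case can be simulated in a few lines. The peak rate of 100 (in arbitrary units) and the one-hour window of 12 samples are assumptions made for this sketch. The traffic pattern and the 150-second offset follow the description above.

```python
import math

def percentile_95(samples):
    """Continuous 95th percentile with linear interpolation between rows."""
    ordered = sorted(samples)
    rn = 1 + (len(ordered) - 1) * 0.95
    frn, crn = math.floor(rn), math.ceil(rn)
    low, high = ordered[frn - 1], ordered[crn - 1]
    return low + (rn - frn) * (high - low)

PEAK = 100.0   # hypothetical rate during busy periods, arbitrary units
PERIOD = 300   # seconds: both the burst length and the ISP sample period

def rate_at(second):
    """Customer traffic: 5 minutes at PEAK, then 5 minutes idle, repeating."""
    return PEAK if (second // PERIOD) % 2 == 0 else 0.0

def poll(offset, count=12):
    """Average rate over `count` consecutive sample periods starting at `offset`."""
    samples = []
    for k in range(count):
        start = offset + k * PERIOD
        total = sum(rate_at(t) for t in range(start, start + PERIOD))
        samples.append(total / PERIOD)
    return samples

in_phase = poll(0)         # sample periods line up with the bursts
out_of_phase = poll(150)   # sample periods straddle burst boundaries

print(in_phase[:4], percentile_95(in_phase))          # alternating 100 and 0
print(out_of_phase[:4], percentile_95(out_of_phase))  # uniform 50
```

The customer sends identical traffic in both runs. Only the phase of the polling changes, and the billed value doubles.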
Since our final mechanism for determining the customer's billed traffic depends on a single value, any averaging effect that changes that value can change the total billed amount.
Example - Averaging Effects
In our original example, Tim and Jim were watering their yard at 7 AM. After learning Tim was paying less, Jim moved his shower to 7 PM.
Since the water company cannot read everyone's meter simultaneously, they have decided to read Tim's meter on the even hours and Jim's meter on the odd hours.
Tim and Jim use the same amount of water each month, but when the water bills arrive, Tim's billed usage is half of Jim's. This is because sampling divides Tim's peak usage (7 AM to 9 AM) across two sample periods, while Jim's peak usage is contained in a single sample period.
It seems unlikely that this could happen to an Internet user, but consider the following real example. A network-based business that monitors remote devices on a regular basis tells its customers that it will connect to their devices every hour and gather information from them. Its programmers are given the job of gathering information, on an hourly basis, from all of the devices listed in a database. They create a program that reads the database, connects to the devices, and saves the gathered information for additional processing. They schedule the program to run every hour at the beginning of the hour. This means that all the network traffic occurs during the first five minutes of each hour. In this example, they move 200 megabytes in that five-minute period. With 5-minute samples, each hour yields one measurement of 200 megabytes and eleven measurements of 0 megabytes.
Imagine instead that they break the job into quarters and check 1/4 of the devices every 15 minutes. This changes each hour's measurements to 4 measurements of 50 megabytes and 8 measurements of 0 megabytes. If they split the work evenly over the hour, they will get 12 measurements of about 16.7 megabytes.
If they change the programming to spread the work out over the hour, their billed network usage will be 1/12th of what it is when they do the work in one batch at the beginning of the hour. (Please note: if they could squeeze all their usage into less than 5% of all the sample periods, their usage would be eliminated in the calculation of the 95th percentile. Their vendor would probably have a minimum fee to charge them.)
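The scheduling options can be compared with a short sketch. It assumes 5-minute samples over a single day (288 samples) and keeps the article's unit of megabytes per sample. The once-a-day schedule is a hypothetical addition used only to illustrate the note about fitting all usage into fewer than 5% of the sample periods.

```python
import math

def percentile_95(samples):
    """Continuous 95th percentile with linear interpolation between rows."""
    ordered = sorted(samples)
    rn = 1 + (len(ordered) - 1) * 0.95
    frn, crn = math.floor(rn), math.ceil(rn)
    low, high = ordered[frn - 1], ordered[crn - 1]
    return low + (rn - frn) * (high - low)

HOURS = 24
MB_PER_HOUR = 200.0

# Megabytes moved in each 5-minute sample, 12 samples per hour.
one_batch = ([MB_PER_HOUR] + [0.0] * 11) * HOURS          # all work at :00
quarters = ([MB_PER_HOUR / 4, 0.0, 0.0] * 4) * HOURS      # work at :00, :15, :30, :45
spread = [MB_PER_HOUR / 12] * (12 * HOURS)                # work spread evenly

# Hypothetical: one 200 MB burst per day, i.e. 1 of 288 samples (under 5%).
once_a_day = [MB_PER_HOUR] + [0.0] * (12 * HOURS - 1)

for name, samples in [("one batch", one_batch), ("quarters", quarters),
                      ("spread", spread), ("once a day", once_a_day)]:
    print(name, round(percentile_95(samples), 1))
```

The hourly schedules move the same 200 megabytes per hour, yet the 95th percentile falls from 200 to 50 to roughly 16.7 as the work is spread out. The once-a-day burst disappears from the calculation entirely.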