Estimation of Web Proxy Server Cache Size using G/G/1 Queuing Model

Year : 2010, Volume : 1, Issue : 1
First page : ( 46) Last page : ( 58)
Print ISSN : 0975-8070. Online ISSN : 0975-8089.

Srivastava Riktesh^1,*

¹Assistant Professor, Information Systems, Skyline University College, University City of Sharjah, Sharjah, PO 1797, UAE.

* Email ID: rsrivastava@skylineuniversity.com

Abstract

Continuous boost in the number of Internet users has taken an exponential escalation over the years. It is becoming thorny to endow with services to all the Internet users because of infrastructural precincts of WWW. Caching web object at proxy servers has proven to be one of the preeminent alternatives for fast services. Caching web proxy server improves the performance of overall web access. Since the introduction of Web proxy servers, most of the evaluations studies are performed either on Web-replacement algorithms or methodologies of maintaining the data in cache. Only few of studies have been done assuring Web proxy server cache size. This paper describes the methodology of estimating the cache size under high busty traffic situation. An investigational evaluation study applying M/M/1 Queuing Theory methodology is first appraised and then experimented. The study uses a trace- driven simulation framework, real traces containing approximately 10 ¹⁰ number of user's requests per unit time, and then evaluates the Optimal Cache Size (CS _optimaI) using G/G/1 Queue Analysis..

Top

Keywords

Web Proxy Server, M/M/1 Queuing Model, G/G/1 Queuing Model, Optimal Cache Size.

Top

Introduction

By all indications, WWW continues its remarkable and seemingly unregulated growth [1]. There is rapid growth in the number of users and hosts [2], number of web servers, application servers and network traffic [3] thereby resulting in increase in network loads and users’ response time [4]. The performance of Gateway Servers was studied by [5], [6], [7] using comparative analysis of M/M/1 with M/G/1, G/M/1 and G/G/1 queuing theory. Number of research is being performed to improve the rate of data access and decrease the response time. [8], [9], [10], [11] have carried out the study on workloads of network and server performace under such traffic. [12], [14], [15] have used the mathematical analysis for Internet Servers performance analysis. [15], [16] defined the dynamic buffer management techniques by which the buffer space can be reduced for fast transmission of data. The study by [15] has used the case study of Media servers in order to conduct the experiments. In this paper, the author claims that the most prominent solution to problem is “Web caching”. In web caching, the data requested by the users are accessed by web server and stored in the cache memory of web proxy server for further reference. Number of Web replacement algorithms, including FIFO (First In First Out), LRU (Least Recently Used) and MRU (Most Recently Used) were implemented to define the mechanism of data storage, data removal and data retrieval. Surprisingly, not much of the research is done to estimate the Optimal Cache Size (CS_optimal), under heavy traffic. This paper evaluates the CS_optjmal by first evaluating the results for M/M/1 Queue Model and then implements the results for G/G/1 Queue Model.

The paper devices the mathematical approach to resolve the size of cache, where λ, is the arrival rate of users request at Web Proxy Server and μ defines the departure of accesses resources. Based on the values λ,μ, CS_optjmal is calculated such that the rate of data access from cache increases. The paper evaluates the number of 15000 references to cache and later simulates the result for higher rate of arrivals, to the maximum extent of 10¹⁰ number of requests.

The result of the paper divided as follows: Section 2 delineates the Internet architecture with Proxy Server and then defines the mathematical formulation of e and i. Section 3 elaborates the analytical study of M/M/1 Queue model for estimation of cache size mathematically. Section 4 gives the computation results and estimation of Cache size using the regression technique, for higher rate of users’ requests to estimate the CS_optimal for Web Proxy Server by means of G/G/1 Queuing model. Section 5 concludes the paper.

Top

Architecture of Internet with Web Proxy Server

The architecture of Internet with Web Proxy Server is portrayed in Figure 1, given below:

If Figure 1 is further consolidated, it will be condensed to a single request arrival and a single response departure processing system. In this structural design, the resource is first searched in cache memory, if the resource is obtained in Cache, the response is send back to the client, and otherwise, the resource is taken from the application server, stored in cache and response is then send to the requesting client. In this case, when the cache memory is full, web replacement algorithms are applied to place the new resource in cache and obsolete resource is replaced. As Arrival of request and Departure of responses are random in nature, the estimation of memory of cache becomes the issue for Internet Designers. The mission of design should be:

No request from user should be overflowed/freezed out.
Cache Size should not be abnormally large.
There is “n” number of requests from users at a time. The architecture must be such designed that it should provide stable functioning.

These three problems are of prime importance for designing the complete Internet architecture with Web Proxy Server. The Web Proxy Server resembles the working of Queuing theory, where there are number of requests arriving in Web Server and number of resources generated at a single instance of time.

Mathematical Evaluation of the System

Consider that λ, is the summation of users’ request at Web Proxy Server per unit time [5],

where,

req₁, req₂, req₃,…….. req_n are the number of requests arriving at Web Proxy Server, and, T=Total time in which the request has arrived at Web Proxy Server.

Similarly, if the requested data is found in the cache of web proxy server, the resource is given back to the client. The rate of departure of resources from Web Proxy Server is defined as [5]:

where,

res₁, res₂, res₃,…….. res_n are total number of responses from Web Proxy Server, and, T= Total time in which the resources has departed from Web Proxy Server.

Based on equation (1) and (2), there can be 3 possibilities [6],[7]:

a) If rate of arrival of users request is greater than the departure rate, i.e.,

This case is called Transient state. If arrival rate of users’ request is greater than the departure rate of resources, then, there are chances of more cache miss rather than cache hit, which makes the system unstable. Hence, it is concluded that under no circumstances equation (3) should prevail.

b) When arrival rate of users’ request is equal to departure rate of resources from Web Proxy Server, i.e.,

This is a very typical case, often referred to as Null state. Such phenomenon randomly occurs. This case is typical for academic studies only. Practically, this neither occurs nor is desirable.

c) When arrival rate of users’ request is less than departure rate of resources from Web Proxy Server, i.e.,

This is case is called as Ergodic state. If this situation is maintained throughout the system, there were be not system failure and often it results in higher rate of cache hit and less cache miss.

The study is thus made for the estimation of CS_optimal, which provides all the randomly arriving users’ request. In the present study, when a very huge amount of users’ request is arriving at the Web Proxy Server for cache reference, and very huge random departure occurs after referring the cache. The problem becomes very complicated to be solved analytically. This problem gets further complicated when the rate of users arrival is huge in number from multiple sources. Had it been arriving from a single source with similar type of requests, the queuing model would have been proximated to M/M/1, where first M describes the arrival process of users request to be Markovian. Markovian arrival process is nothing but Poisson arrival where inter-arrival distribution is negative exponential. The secondM, describes the departure process with processing unit 1 (one in number). In case of multiple users’ requests arriving at a single instance of time, this assumption does not fit in. Under this condition when arrival of users request or departures of responses, both are considered multi-channel, the appropriate model becomes G/G/1. It is worth to start with M/M/1 queue model have been analytically studied for the estimation of average cache size. However, for multiple requests and responses, the study is carried for G/G/1 model to compute the CS_optimal. The M/M/1 model is studied analytically in the next segment.

Top

Analytical Study of M/M/1 Queue Model for Estimation of Cache Size Mathematically

In this section, study is performed to evaluate the Cache Size mathematically using M/M/1 Queue model. Based on the mathematical formulation, CS_optimal for high busty traffic using G/G/1 model is evaluated in the next section of the paper.

Assumptions for Mathematical Formulation of Cache Size

For estimation of “n” number of requests arriving at the Web Proxy Server, certain assumptions have to be made. This can be given as follows:

Δt is a very small time, in which only one process can occur, i.e., either arrival of users’ requests or departure of resources from Web Proxy Server.
The Ergodic state is maintained throughout the study.
The state of arrival of users’ request is λ and state of departure from Web Proxy Server is μ.

Probability of only one arrival of users’ request at Web Proxy Server = λΔt

Probability of only one departure of resource from Web Proxy Server = μΔt

Then, Probability of no arrival of users’ request from Web Proxy Server = 1- λΔt and,

Probability of no departure of resource from Web Proxy Server = 1 - μΔt

Now, consider that there are “n” number of users’ request at any time “t”. This, will be represented by P (t). If the time is increased from “t” to “t+Δt” and at the end of this time, then, this state can be arrived at from the states as given below:

or,

But,

Thus, the R.H.S. of equation 8 becomes

To solve, equation 9, we need to consider the initial condition, i.e., there is 0 (nil) presence of users’ requests at time (t+Δt). This can be obtained from the states as given under:

Thus, L.H.S. of equation 10 becomes

Hence, equation 11 becomes

From equation 9 and equation 12, following can be easily derived as:

Summation of all the equations in equation 13 is given as under:

Based on limiting condition, when n → ∞, and , L.H.S. becomes 1 and R.H.S. becomes

Thus equation 14 becomes

If equation 14 is substituted in equation 13, then

Hence, the probability for the presence of “n”data can be computed at any time “t”provided rate of users’ request and rate of resources from the Web Proxy Server is known.

Estimation of Cache Size for M/M/1 Queuing Model

In section 3.1, the probability density function for the existence of “n” data has been derived as:

For variable “n” the average value can be written as:

Where CS is Cache Size.

The equation 17 measures the Cache size of Web proxy server mathematically for M/M/1 Queue model.

Top

Computational Results and Estimation of Cache Size Using G/G/1 Queue Model Implementing Regression Technique

The random arrival of users’ requests at Web Proxy Server can be assumed either disciplined distribution or it can be assumed undisciplined arrival or departure. Estimation of CS_optimal becomes a case of G/G/1 Queue model, as there are multiple requests arriving at Web Proxy Server at a time, and multiple resources are transferred at a time. Initially, the experiment was conducted for λ= 5000, 7500, 10000, 12500, 15000 and for the worst case of cache size is μ = λ+1 based becomes μ= 5001, 7501, 10001, 12501, 15001. The average computed Cache Size is given in table 1.

The Cache Size computed is for the lower rates of arrivals. It is not possible to have very high arrival rate for computation of models. It is therefore proposed to carryout study for higher values of arrival rates by employing “curve=fitting techniques”. Curve-fitting technique, also known as “Regression Techniques” thus, gives the extrapolation for the average queue values at high rate of users’ requests.

Extrapolation of Cache Size Length at High rate of users’ request using Regression/Curve Fitting technique

In case of M/M/1 model, the equation is plotted between rate of users’ requests and Cache Size, it is then expressed as:

For the worst case of Cache Size [CS], the equation can be derived using:

The equation 20 is a linear equation with slope of unity. In case of other models, if we observe the plot, we find that G/G/1 models have curves. The curves are smooth and can be assumed to be a second order polynomial, which is close to first order polynomial formed by model M/M/1.

The equation of the plots can be assumed:

This equation can be fitted in all the cases of plots. However, the coefficients in each model will be different. What is required to find out is the values of a₁, a₂ and a₃ for the best fitted curve which has minimal error. For this reason, we have to express error, differentiate it and equate to zero for calculation of coefficients.

Coefficients of G/G/1 model

It is observed from Figure 2, that the curve of G/G/1 represents a polynomial equation for representing plot of Cache Length w.r.t. rate of arrival of users’ requests and Cache Size. To have minimal error in regression, mean square error is made minimal to give good result. Let x. represents the rate of arrivals for the computation and then the value of Cache Size will be represented by y It is represented as given below:

where a₀, a₁ and a₃ are coefficients of polynomial for G/G/1 model.

The equation 22 is valid for larger rates of arrival of users’ requests, which includes

CAUSAL EFFECT. This cannot be employed for lower rates of arrival.

Let “S” represents the error in computation and real values of Cache Size, then, “S”, which is square of derivation, is given as:

If be differentiate S w.r.t a₀, a₁, a₂ and setting each of these coefficients equal to zero, we get

where “n” represents the degree of polynomials as “n” equations are formed for the summation.

These are three linear equations in three unknowns. These are called normal equations for quadratic regression. These may be solved using Gauss Elimination procedure.

Upon calculating, following set of equation were obtained:

By Gaussian elimination and using back substitution, we obtain:

Optimal Cache Size (CS_optimal) using G/G/1 Queuing Model

It is to note that computation of Cache size was for average value. The actual Cache size will deviate from average value at every instance. Sometimes, Cache size is lower than average and at other times it is higher average value. The estimation of cache size should be such that cache hit should be higher that than the average value. Thus, it indicates that positive deviations are to be incorporated in the estimation of cache size. Standard deviation, which is positive square root of variance, is required to be added in the

average value of cache size. Standard deviation in many statistical distributions is equal to average value. Table 2 measures the cache size at Web Proxy Server.

Using equations, the expected Cache size are computed

The optimal value for memory size is given as:

The Standard Deviation is given as:

Hence, the desired cache size to meet the eventualities, the optimal cache size is:

The calculated cache size(s) are depicted in table 3.

Top

Conclusion

The study confirms that the proposed model be such that it should follow the condition λ<μ [ERGODIC CONDITION]. This will guarantee more number of cache hit resulting in stability of the system implementation. The cache size estimated by queuing theory will ensure that there is minimal cache miss by evaluating the optimal size of cache size.

Top

Figures

Figure 1::

Users request and response from Web Proxy Server

Top Back

Figure 2::

Computation result between M/M/1 and G/G/1 Queue model

Top Back

Tables

Table 1::

Cache Size Computations

S. No.	λ	μ	Cache Size for M/M/1	Cache Size for G/G/1
1	5000	5001	5000	5538
2	7500	7501	7500	8400
3	10000	10001	10000	11190
4	12500	12501	12500	14126
5	15000	15001	15000	17695

Top Back

Table 2::

Average Cach Size

Expected users requests	Average Cach Size
Expected users requests	M/M/1	G/G/1
10000000	10000000	1791307386
100000000	100000000	178370212857
1000000000	1000000000	17829416028571
1000000000	1000000000	1782865550285710

Top Back

Table 3::

Optimal Cache Size

Expected users requests	*Optimal Cach Size (CS_optimal)*
Expected users requests	M/M/1	G/G/1
10000000	20000000	3582614771
100000000	200000000	356740425714
1000000000	2000000000	35658832057142
10000000000	20000000000	3565731100571420

Top Back

References
1..	Top Back
2..	Top Back
3..	Top Back
4..	Top Back
5..	Top Back
6..	Top Back
7..	Top Back
8..	Top Back
9..	Top Back
10..	Top Back
11..	Top Back
12..	Top Back
13..	Top Back
14..	Top Back
15..	Top Back
16..	Top Back
17..	Top Back

Agriculture
Applied Science/Technology
Biology
Botany
Business/Economics/Management
Chemistry
Civil Engineering
Commerce/Banking/Finance
Computers/Information Technology
Dental Science
Earthscience
Education
Engineering Mechanics/Materials
Environment
Health Science
Humanities
Library and Information Science
Management
Mathematics/Statistics
Medical Science
Nanotechnology
Nursing
Pharmacy
Physics
Social Science
Veterinary/Animal Sciences