Htaccess considered slow

Are .htaccess configuration files slow? One might guess so, considering that they are checked for changes on every single request. To verify this, I tested three scenarios:

  1. No rewrite rule.
  2. One rewrite rule, matched, rule is in .htaccess file.
  3. One rewrite rule, matched, rule is in Apache configuration file.
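
The exact rule does not matter much here; for scenarios 2 and 3, picture something along these lines (a placeholder rule, any simple pattern would do):

# In .htaccess (scenario 2), the pattern is relative to the directory:
RewriteEngine On
RewriteRule ^old-page$ new-page [L]

# In the main server configuration (scenario 3), the same rule matches
# against the full URL path:
#   RewriteRule ^/old-page$ /new-page [L]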

Here are the results, with a concurrency level of 100 and 100,000 requests:

Scenario   Avg. throughput (#/s)   Avg. response time (ms)
1          5304.41                 18.852
2          4321.07                 23.142
3          4644.17                 21.532

This roughly means that not using a .htaccess file for a simple rewrite rule yields a 7% increase in throughput and a 7% decrease in response time, which is non-negligible and easy to obtain, especially since it does not affect code quality at all.

Even more surprising is the slowness of the rewrite engine itself. Granted, regular expressions are not free, but I would have guessed that Apache caches the matches in a hashmap, where lookups are practically free (amortized O(1)). So how do you explain the 14% difference in throughput between scenarios 1 and 3?


Ridiculously RADiculous

EDIT: A user by the name of gwoo corrected, in the comments section, an incorrect assertion I made:

Your last example shows how an active record can be created and has very little to do with your assertion that it is to “use late static binding inside of closures…” The actual code to use late static binding in an anonymous function is…

$class = __CLASS__;
$lambda = function() use ($class) { return $class::invokeMethod(); }

-----
I have seen my fair share of absolutely atrocious PHP code, having been called way too many times onto projects at a fairly late stage, only to discover I should have asked for a bonus (in the form of a personal masseur or something similar) just for the headaches.

I’m presently doing some research for a series of papers I plan to publish on bad design decisions plaguing PHP code bases. To this end, I started studying the Lithium 3 framework, which deemed it appropriate to crown itself the most RAD framework for PHP 5.3+, hence the name of this blog post. I learned about the existence of this framework while attending a talk Joël Perras and Nate Abele gave at the Tek-X conference last week, a talk which ended up getting a mixed reception because of their “we’re right, you’re wrong” attitude.

Well, by the end of the presentation, I knew something was wrong. I knew I was in for a surprise if I ever popped the hood to see what kind of black magic happens in there. Boy, was I right!

I won’t go into the details of my fundamental disagreements with the design of the framework (for now; this will come later in the form of a detailed analysis), but I simply wish to share what I think is one of the funniest pieces of code publicly available on the interwebs, and one which is distributed quite shamelessly, to say the least.

public static function invokeMethod($method, $params = array()) {
    switch (count($params)) {
        case 0:
            return static::$method();
        case 1:
            return static::$method($params[0]);
        case 2:
            return static::$method($params[0], $params[1]);
        case 3:
            return static::$method($params[0], $params[1], $params[2]);
        case 4:
            return static::$method($params[0], $params[1], $params[2], $params[3]);
        case 5:
            return static::$method($params[0], $params[1], $params[2], $params[3], $params[4]);
        default:
            return call_user_func_array(array(get_called_class(), $method), $params);
    }
}

That code resides in the lithium\core\StaticObject class. You can browse it online here.

It was love at first sight. Oh dear! This is so great. Pure genius. Of course, I’m being a bit cynical here. Interestingly enough, Tek-X’s keynote speech, given by Josh Holmes, was titled The Lost Art of Simplicity (if I may, I want to mention that it was one of the best talks I have attended in a long time). Well, this code is exactly the opposite of simple. Is the invokeMethod() method called so often that it necessitates such a micro-micro-optimization? Probably not. And isn’t it a bit like solving a problem that would never have needed solving had sane solutions been applied in the first place, i.e. not using static methods all the time?

Yes.

From what I could see (and maybe I am wrong), the Li3 guys use this method inside closures to do some black magic.

/* Why is this method named so ambiguously, anyway? */
$self = get_called_class(); // $self holds a class name, as a string
$self::_instance();

$badAssLambdaFilter = function ($whatever) use ($self) {
    return 'OH HAI ' . $self::invokeMethod('getEndOfLine');
};

But wait. Wouldn’t something along the lines of:


class Test {
    public static function aMethod() { echo 'OH HAI !'; }
}

$lambda = function () { Test::aMethod(); };

do the trick? Oh wait, no! Because of Li3’s obsession with anything static (a strange obsession indeed), black magic needs to happen; the static keyword happens not to work inside a closure, hence they had to hack their way around this limitation.
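
To see the limitation concretely, here is a minimal sketch of my own (not Li3 code) of what fails under PHP 5.3:

class Demo {
    public static function whoami() { return get_called_class(); }

    public static function makeLambda() {
        return function () {
            // The closure has no bound class scope in PHP 5.3, so late
            // static binding has nothing to bind to.
            return static::whoami();
        };
    }
}

$lambda = Demo::makeLambda();
$lambda(); // Fatal error under PHP 5.3: static:: has no class scope here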

Here’s yet another piece of code to entertain you; it’ll show you exactly how they do that:

protected static function &_instance() {
    $class = get_called_class();

    if (!isset(static::$_instances[$class])) {
        static::$_instances[$class] = new $class();
    }

    return static::$_instances[$class];
}

So, in order to use late static binding inside closures, there is actually a method that creates “fake” instances, which are then passed to the closure as a parameter. This baffles me. Li3 is full of such “anomalies” and of cool stuff that does nothing more than being cool. This is broken beyond repair. I have often reached the point where my code seemed broken, not because I could pinpoint exactly what was wrong, but because of all the weird hacks I had to use just to make it work; in the end, I always found the missing link. In this case, it seems it never crossed the minds of the Li3 masterminds that something was evidently wrong.

But in the end, I’m still glad I save a few femtoseconds each time I call invokeMethod(). Hurray!


The CES Production Function – Addendum

In the last post, I presented the CES production function and mentioned that its elasticity of substitution (which is constant) is:

\sigma = \dfrac{1}{1 - \rho}

I have already received a question about how to prove this, so I thought it would be a good idea to share my answer with everyone. First of all, for a many-input production function, the elasticity of substitution must be measured between two inputs; as such, it was imprecise of me to say that the elasticity of substitution is constant, and I should rather have said that all elasticities of substitution are constant and equal. It is interesting to note that there are, in fact:

\displaystyle\binom{n}{2} = \dfrac{n!}{2(n - 2)!}

distinct elasticities of substitution (three of them, for instance, for a three-factor function). We denote the elasticity of substitution between the i-th and the j-th production factors:

\sigma_{ij} =\dfrac{\partial \ln (X_{i}/X_{j}) }{ \partial \ln (f_j/f_i)}

where f = f(X_1, X_2, ..., X_n) is the production function, f_i = \dfrac{\partial f}{\partial X_i} and f_j = \dfrac{\partial f}{\partial X_j}. First, we calculate the first-order partial derivatives, for all 1 \leq k \leq n:

\dfrac{\partial f}{\partial X_k} =\gamma \alpha_{k}^{\rho} X_{k}^{\rho - 1} \left(\displaystyle\sum_{i=1}^{n}\alpha_{i}^{\rho}X_{i}^{\rho}\right)^{\frac{\gamma}{\rho} -1}

Hence the ratio of the first-order partial derivatives, for i \neq j and 1 \leq i,j \leq n:

\dfrac{f_j}{f_i} = \dfrac{\alpha_{j}^{\rho} X_{j}^{\rho - 1}}{\alpha_{i}^{\rho} X_{i}^{\rho - 1}} \iff \dfrac{f_j}{f_i} =\left(\dfrac{\alpha_{j}}{\alpha_{i}}\right)^{\rho} \left(\dfrac{X_{i}}{X_{j}}\right)^{1 - \rho} \iff

\ln \dfrac{f_j}{f_i} = \rho\ln\dfrac{\alpha_j}{\alpha_i} + (1 - \rho)\ln \dfrac{X_i}{X_j}

From this, the result follows immediately, since the first term is a constant with respect to X_i/X_j:

\sigma_{ij} =\dfrac{\partial \ln (X_{i}/X_{j}) }{ \partial \left(\rho\ln\dfrac{\alpha_j}{\alpha_i} + (1 - \rho)\ln \dfrac{X_i}{X_j}\right)} = \dfrac{1}{1 - \rho}\dfrac{\partial \ln (X_{i}/X_{j}) }{\partial \ln (X_{i}/X_{j})} \iff

\sigma_{ij} = \dfrac{1}{1 - \rho}

QED.
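
For the skeptics, here is a quick numerical sanity check of the result, in PHP since that is the house language around here. It is only a sketch: two factors, arbitrary parameter values, and mp() is the marginal product f_k derived above.

// Marginal product of factor $k for the two-factor CES function.
function mp($k, array $X, array $a, $rho, $gamma) {
    $s = pow($a[0] * $X[0], $rho) + pow($a[1] * $X[1], $rho);
    return $gamma * pow($a[$k], $rho) * pow($X[$k], $rho - 1) * pow($s, $gamma / $rho - 1);
}

$rho = 0.5; $gamma = 1.3; $a = array(0.3, 0.7);

// Perturb the factor ratio X1/X2 slightly and approximate
// d ln(X1/X2) / d ln(f2/f1) by a finite difference.
$P = array(2.0, 3.0);
$Q = array(2.0 * 1.001, 3.0);

$dr = log($Q[0] / $Q[1]) - log($P[0] / $P[1]);
$dm = log(mp(1, $Q, $a, $rho, $gamma) / mp(0, $Q, $a, $rho, $gamma))
    - log(mp(1, $P, $a, $rho, $gamma) / mp(0, $P, $a, $rho, $gamma));

printf("numerical sigma = %.4f, 1/(1 - rho) = %.4f\n", $dr / $dm, 1 / (1 - $rho));
// Both print 2.0000 for rho = 0.5.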


The CES Production Function

In many fields of economics, a particular class of functions called Constant Elasticity of Substitution (CES) functions is favoured because of its invariant characteristic, namely that the elasticity of substitution between the inputs is constant on their domains (for a definition of elasticity, take a glance at this post).

More production!

A standard production function, linking the production factors to output, is of the form:

Y = f(L, K)

where Y is the total production, L the quantity of human capital used (labour, measured in a unit like man-hours) and K the quantity of physical capital used (measured in a unit like machine-hours). Generally speaking, the production function for n production factors is of the form:

Y=f(X_1,X_2,...,X_n)

where X_i is the quantity of factor i used. The generalized n-factor CES production function (also called the Armington aggregator) is:

Y = f(X_1,X_2,...,X_n) = \left(\displaystyle\sum_{i=1}^{n}\alpha_{i}^{\rho}X_{i}^{\rho}\right)^{\frac{\gamma}{\rho}}

with \rho \leq 1, \rho \neq 0 and \gamma > 0. First and foremost, to study the returns to scale of the function (refer to the link if you are not familiar with a formal definition of returns to scale), we shall study its homogeneity.

\forall \lambda > 0: f(\lambda X_1,\lambda X_2,...,\lambda X_n) =  \left(\displaystyle\sum_{i=1}^{n}\lambda^{\rho}\alpha_{i}^{\rho}X_{i}^{\rho}\right)^{\frac{\gamma}{\rho}} =   \lambda^{\gamma}\left(\displaystyle\sum_{i=1}^{n}\alpha_{i}^{\rho}X_{i}^{\rho}\right)^{\frac{\gamma}{\rho}}

In other words, the CES function is homogeneous of degree \gamma. Hence:

  • If \gamma > 1, then f(\lambda X_1,\lambda X_2,...,\lambda X_n) = \lambda^{\gamma} f(X_1,X_2,..., X_n) > \lambda f(X_1,X_2,..., X_n) for \lambda > 1: the returns to scale are increasing.
  • If \gamma < 1, then f(\lambda X_1,\lambda X_2,...,\lambda X_n) < \lambda f(X_1,X_2,..., X_n) for \lambda > 1: the returns to scale are decreasing.
  • If \gamma = 1, then f(\lambda X_1,\lambda X_2,...,\lambda X_n) = \lambda f(X_1,X_2,..., X_n): the returns to scale are constant.
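
To make the homogeneity property concrete, here is a small numerical check, a sketch only: the ces() helper and all parameter values below are mine, chosen arbitrarily.

// Evaluates the generalized CES production function defined above.
function ces(array $X, array $a, $rho, $gamma) {
    $s = 0.0;
    foreach ($X as $i => $x) {
        $s += pow($a[$i] * $x, $rho); // alpha_i^rho * X_i^rho
    }
    return pow($s, $gamma / $rho);
}

$X = array(2.0, 5.0, 1.5);
$a = array(0.2, 0.5, 0.3);
$rho = 0.5; $gamma = 1.3; $lambda = 2.0;

$scaled = array_map(function ($x) use ($lambda) { return $lambda * $x; }, $X);

printf("f(lambda X)       = %.6f\n", ces($scaled, $a, $rho, $gamma));
printf("lambda^gamma f(X) = %.6f\n", pow($lambda, $gamma) * ces($X, $a, $rho, $gamma));
// Both lines print the same value: f is homogeneous of degree gamma.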

The other parameters are:

  • Relative weight: the \alpha_i parameter associated with each production factor represents its relative distributional weight, i.e. its significance in production.
  • Elasticity of substitution: The elasticity of substitution, as the function indicates, is constant. It is: \sigma = \dfrac{1}{1 - \rho} (see the addendum to this post).

What is so interesting about this function is that one can derive special cases from it (taking \gamma = 1):

  • If \rho \rightarrow 1, we obtain the perfect substitutes production function: Y = \sum_{i=1}^n\alpha_i X_i (see the quick check after this list).
  • If \rho \rightarrow -\infty, we obtain the Leontief production function, also known as the perfect complements production function: Y = \min \left(\dfrac{X_1}{\alpha_1},\dfrac{X_2}{\alpha_2},...,\dfrac{X_n}{\alpha_n}\right).
  • If \rho \rightarrow 0, we obtain the Cobb-Douglas production function, also known as the imperfect complements production function: Y = \prod_{i=1}^n X_i^{\alpha_{i}}.
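
As a quick illustration of the first special case, here is a check reusing the ces() helper from the homogeneity sketch above; at \rho = 1 (and \gamma = 1) the CES form collapses exactly to the linear one:

$X = array(2.0, 5.0, 1.5);
$a = array(0.2, 0.5, 0.3);

// At rho = 1, the sum of (alpha_i * X_i)^rho is just sum(alpha_i * X_i).
$linear = $a[0] * $X[0] + $a[1] * $X[1] + $a[2] * $X[2]; // 3.35
printf("CES(rho = 1): %.4f, perfect substitutes: %.4f\n", ces($X, $a, 1.0, 1.0), $linear);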

Proving such assertions is far from trivial, and my next blog post will be dedicated to deriving the Cobb-Douglas production function from the CES function.


The Elasticity Conjecture

If you know economists, you have probably noticed that when they discuss microeconomics, they almost certainly end up talking about elasticity. In fact, I have summed up this simple observation in my very own conjecture:

As a discussion about microeconomics between two economists grows longer, the probability that elasticity will be mentioned approaches 1 exponentially fast.

Of course, if you are not acquainted with this notion, it may seem far-fetched to talk about the elasticity of a curve, and you might discard a comment about the demand for insulin being perfectly inelastic as yet another surrealistic remark about the strange nature of markets. Well, not really.

But what exactly is elasticity? Let f: \mathbb{R}^n \rightarrow \mathbb{R}. We define an operator e_{x_i}, which we call the f-elasticity of parameter x_i, as:

e_{x_i} f(x_1, ..., x_n) =\left |\dfrac{\partial f}{\partial x_i} \dfrac{x_i}{f} \right |

Simply put, it is the product of the partial derivative with respect to the parameter x_i and the ratio of x_i to f. As you have probably noticed, this means elasticity has no unit, which is exactly why its use is so widespread in economics: elasticity measures the relative effect a change in one variable has on another, regardless of the units employed. The elasticity of a parameter is classified into the following categories:

  • e = 0: perfectly inelastic: a change in the parameter has no effect on the other.
  • 0 < |e| < 1: inelastic: a change in the parameter has a less than proportional effect on the other.
  • |e| = 1: unit elastic: a change in the parameter has a proportional effect on the other.
  • |e| > 1: elastic: a change in the parameter has a more than proportional effect on the other.
  • e = \infty: perfectly elastic: an arbitrarily small change in the parameter has an unbounded effect on the other.

To better illustrate the notion, let us take a very straightforward (thus bogus, but instructive) example. Let us imagine the demand for wheat is described by the following demand curve:

Q = -2P + 15

We can calculate the price elasticity of demand:

e_P Q = \left|\dfrac{-2P}{-2P+15}\right| = \dfrac{2P}{15 - 2P}

Thus, at the point P = 1, the elasticity is:

e_P Q(1) = \dfrac{2}{13} \approx 0.15

This means that the demand is quite inelastic at this point, i.e. a change in price will only have a small effect on the quantity demanded. Insulin is a very good example of an almost perfectly inelastic good: whatever the price, someone with diabetes will pay it to get his dose, as it is of vital importance to him. By contrast, an example of very elastic price demand is the demand for leisure goods, such as DVDs and books.
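
For the programmers in the audience, here is the same computation as a tiny PHP sketch, approximating the derivative with a central finite difference:

// Demand curve from the example: Q = -2P + 15.
function demand($p) { return -2 * $p + 15; }

// Point price elasticity of demand: |dQ/dP * P/Q|.
function priceElasticity($p, $h = 1e-6) {
    $dQdP = (demand($p + $h) - demand($p - $h)) / (2 * $h); // central difference
    return abs($dQdP * $p / demand($p));
}

printf("e at P = 1: %.4f\n", priceElasticity(1.0)); // 0.1538: inelastic
printf("e at P = 5: %.4f\n", priceElasticity(5.0)); // 2.0000: elastic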

Microeconomists study many kinds of demand elasticities (e.g. income elasticity of demand, cross-price elasticity of demand, etc.) and it is of particular interest to observe the empirical measures of such elasticities. In a future blog post, I will introduce you to a selected collection of elasticity data taken from the literature.


The Frivolous Theorem of Arithmetic

Steven Pigeon recently introduced us to the Frivolous Theorem of Arithmetic on his Harder, Better, Faster blog. As a complement, I shall present a rigorous, rather useless but quite educative proof of this theorem, which will serve as a base for further inquiries into the internal structure of the natural numbers.

The theorem reads as follows:

Almost all natural numbers are very, very, very large.

First and foremost, this raises the question of what is considered very, very, very large. In fact, the notion of a number being large or very large is nonexistent in mathematics and irrelevant to the internal structure of numbers. Hence, for the purpose of this proof, we shall choose an überlargeness threshold, a natural number that we shall call M. In fact, you can be an eccentric fellow and say 1 is already large enough for you, and so be it. At the other end of the spectrum, maybe the excruciatingly, humongously large number:

M = 10!!!!!!!!!!!!!!

is a number you envision as only almost big (the Google calculator is willing to compute 10! but dutifully refuses to compute 10!!; how unfortunate!). Regardless, the proof still holds. Let A and B be two sets such that:

A = \{ x \in \mathbb{N} : x \leq M \}

B = \mathbb{N} \setminus A = \{ x \in \mathbb{N} : x > M \}

where A is the set of natural numbers smaller than or equal to the threshold, and B the set of numbers we consider to be very, very, very large. To prove the theorem, we have to compare the cardinalities (the numbers of elements) of both sets. First, we note that the cardinality of A is:

|A| = M

And the cardinality of the set of natural numbers is:

|\mathbb{N}| = \aleph_0

which reads aleph-null, the first transfinite cardinal number, denoting the cardinality of countably infinite sets, i.e. sets whose elements can be counted one by one, even if there is an infinite number of them. There exists a one-to-one correspondence (a bijection) between \mathbb{N} and B, for instance:

F(x) = x + M

Indeed, F(1) = M + 1, F(2) = M + 2, and so on: every element of B is reached exactly once.

Hence, by the properties of cardinalities:

|B| = \aleph_0

This means there is an infinity of numbers larger than M, a number chosen to be überlarge, and we can thus conclude that almost all natural numbers are very, very, very large. The proof leads us to an important insight into the internal structure of the natural numbers: since each natural number has a successor, which also has a successor (1 has successor 2, which has successor 3, ad nauseam), no natural number can be used to express the cardinality of this set. We cannot simply say there is an “infinite” number of natural numbers, as this is too imprecise. In fact, there is an injection from the naturals into the reals, but there can be no bijection between the two sets, meaning that the cardinality of the naturals is “smaller” than the cardinality of the reals. Intuitively, the cardinality of the reals is also infinite, meaning we have two or more types of “infinity”. This will be the subject of a follow-up post.