SitePoint PHP: Visualize Your Code’s Quality with PhpMetrics (22.12.2014, 17:00 UTC)

We’ve been looking into code quality checking tools for a while here on SitePoint, most recently in a series on Jenkins, but none of them can do what a project I’ve only recently found out about can.

PhpMetrics uses D3 and some sophisticated analysis algorithms to scan your application’s code and output intricate reports about it.

Installing and Using PhpMetrics

It’s a bit hard to talk about it without seeing a proper example, so let’s install and run it, then explain every part.

Continue reading: Visualize Your Code’s Quality with PhpMetrics

Nikita Popov: PHP's new hashtable implementation (22.12.2014, 00:00 UTC)

About three years ago I wrote an article analyzing the memory usage of arrays in PHP 5. As part of the work on the upcoming PHP 7, large parts of the Zend Engine have been rewritten with a focus on smaller data structures requiring fewer allocations. In this article I will provide an overview of the new hashtable implementation and show why it is more efficient than the previous implementation.

To measure memory utilization I am using the following script, which tests the creation of an array with 100000 distinct integers:

$startMemory = memory_get_usage();
$array = range(1, 100000);
echo memory_get_usage() - $startMemory, " bytes\n";

The following table shows the results using PHP 5.6 and PHP 7 on 32bit and 64bit systems:

        |   32 bit |    64 bit
------------------------------
PHP 5.6 | 7.37 MiB | 13.97 MiB
------------------------------
PHP 7.0 | 3.00 MiB |  4.00 MiB

In other words, arrays in PHP 7 use about 2.5 times less memory on 32bit and 3.5 on 64bit (LP64), which is quite impressive.

Introduction to hashtables

In essence PHP’s arrays are ordered dictionaries, i.e. they represent an ordered list of key/value pairs, where the key/value mapping is implemented using a hashtable.

A hashtable is a ubiquitous data structure, which essentially solves the problem that computers can only directly represent contiguous integer-indexed arrays, whereas programmers often want to use strings or other complex types as keys.

The concept behind a hashtable is very simple: The string key is run through a hashing function, which returns an integer. This integer is then used as an index into a “normal” array. The problem is that two different strings can result in the same hash, as the number of possible strings is virtually infinite while the hash is limited by the integer size. As such hashtables need to implement some kind of collision resolution mechanism.

There are two primary approaches to collision resolution: Open addressing, where elements will be stored at a different index if a collision occurs, and chaining, where all elements hashing to the same index are stored in a linked list. PHP uses the latter mechanism.
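To make the chaining approach concrete, here is a minimal userland sketch. The class name `ChainedHashTable`, the default of 8 slots and the use of `crc32()` as the hashing function are assumptions chosen for illustration. This is not how the Zend Engine implements it, and the chains here are plain PHP arrays rather than real linked lists.

```php
<?php

// Toy hashtable using separate chaining; for illustration only.
final class ChainedHashTable
{
    private $slots = [];   // slot index => list of [key, value] pairs
    private $size;

    public function __construct(int $size = 8)
    {
        $this->size = $size;
    }

    // Hash the string key to an integer and reduce it to a slot index.
    private function slotFor(string $key): int
    {
        return (crc32($key) & 0x7fffffff) % $this->size;
    }

    public function set(string $key, $value)
    {
        $slot = $this->slotFor($key);
        if (!isset($this->slots[$slot])) {
            $this->slots[$slot] = [];
        }
        // Walk the chain: overwrite if the key is already present.
        foreach ($this->slots[$slot] as $i => $entry) {
            if ($entry[0] === $key) {
                $this->slots[$slot][$i][1] = $value;
                return;
            }
        }
        // Otherwise append to the chain, whether or not the slot is occupied.
        $this->slots[$slot][] = [$key, $value];
    }

    public function get(string $key, $default = null)
    {
        $slot = $this->slotFor($key);
        foreach ($this->slots[$slot] ?? [] as $entry) {
            if ($entry[0] === $key) {
                return $entry[1];
            }
        }
        return $default;
    }
}

// With a single slot every key collides, so the chain does all the work.
$table = new ChainedHashTable(1);
$table->set('a', 1);
$table->set('b', 2);
$value = $table->get('b');
```

Forcing the table down to one slot is a quick way to check that lookups stay correct when every key lands in the same chain; they just get slower as the chain grows.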

Typically hashtables are not explicitly ordered: The order in which elements are stored in the underlying array depends on the hashing function and will be fairly random. But this behavior is not consistent with the semantics of PHP arrays: If you iterate over a PHP array you will get back the elements in the exact order in which they were inserted. This means that PHP’s hashtable implementation has to support an additional mechanism for remembering the order of array elements.
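The ordering guarantee is easy to observe from userland. A short sketch (the keys and variable names are arbitrary):

```php
<?php

// Insert keys in a deliberately unsorted order, mixing strings and integers.
$array = [];
$array['zebra'] = 1;
$array[42]      = 2;
$array['apple'] = 3;

// Iteration follows insertion order, not key order or hash order.
$keys = array_keys($array);        // ['zebra', 42, 'apple']

// Removing a key and adding it again moves it to the end.
unset($array['zebra']);
$array['zebra'] = 4;
$keysAfter = array_keys($array);   // [42, 'apple', 'zebra']
```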

The old hashtable implementation

I’ll only provide a short overview of the old hashtable implementation here; for a more comprehensive explanation please see the hashtable chapter of the PHP Internals Book. The following graphic is a very high-level view of what a PHP 5 hashtable looks like:

The elements in the “collision resolution” chain are referred to as “buckets”. Every bucket is individually allocated. What the image glosses over are the actual values stored in these buckets (only the keys are shown here). Values are stored in separately allocated zval structures, which are 16 bytes (32bit) or 24 bytes (64bit) large.

Another thing the image does not show is that the collision resolution list is actually a doubly linked list (which simplifies deletion of elements). Next to the collision resolution list, there is another doubly linked list storing the order of the array elements. For an array containing the keys "a", "b", "c" in this order, this list could look as follows:

So why was the old hashtable structure so inefficient, both in terms of

Truncated by Planet PHP, read more at the original (another 30003 bytes)

SitePoint PHP: 3 Ways to Implement Embeddable Custom Badges (20.12.2014, 17:00 UTC)

One great way of organically promoting your application is to provide “badges”: snippets of content that people can embed on their own websites.

These can contain up-to-the-minute information from your application about a user, piece of content or another object, dynamically generated and inserted into other websites. This is probably best illustrated with some examples:

Some examples of embedded content

In this article I’m going to take a look at some of the ways you can implement this.

Setting up our Example Application

All the code from this tutorial is available on GitHub. There’s also an online demo.

First, we’ll define our application’s dependencies using Composer:

"silex/silex": "~2.0@dev",
"twig/twig": ">=1.8,<2.0-dev",
"smottt/wideimage": "dev-master"

Continue reading: 3 Ways to Implement Embeddable Custom Badges

Anthony Ferrara: On PHP Version Requirements (19.12.2014, 19:00 UTC)
I learned something rather disturbing yesterday. CodeIgniter 3.0 will support PHP 5.2. To put that in context, there hasn't been a supported or secure version of PHP 5.2 since January, 2011. That's nearly 4 years. To me, that's beyond irresponsible... It's negligent... So I tweeted about it (not mentioning the project to give them the chance to realize what the problem was):

I received a bunch of replies. Many people thought I was talking about WordPress. I wasn't, but the same thing does apply to that project. Most people agreed with me, saying that not targeting 5.4 or higher is bad. But some disagreed. Some disagreed strongly. So, I want to talk about that.
Read more »
SitePoint PHP: No More var_dump – Introducing Symfony VarDumper! (19.12.2014, 17:00 UTC)

Recently, Symfony went from Zend-like bloat and rigidity to extreme decoupling and modularity. With the new Developer Experience initiative, Symfony has done a Laravel-style 180° and dove right into making its components more end-user friendly, its docs more complete, and its AppBundles unbundled, simplifying entry and further development almost exponentially. Considering user friendliness, it’s a long way from “best pals friendly”, but it’s definitely no longer hostile. One thing that contributes to this a lot is their continuous pushing out of new components that are incredibly useful outside of Symfony’s context. One such component is the new VarDumper.

Why?

You’re developing a feature. You either don’t feel like writing tests, or what you’re developing needs some variable testing in the middle of a function - something you can’t quite cover with a test. Inevitably, you resort to something like die(var_dump($var));. Even if you’ve abstracted it into a shorthand method like vddd($var), it’s still clumsy and unreadable, and tends to leave debugging snippets around your code, either as comments or, even worse, as code that can actually be triggered.

There’s little choice in the matter - sometimes we simply need our vddds. And sure, if you’re an Xdebug user, you’re probably used to a slightly better looking output than the raw PHP prints. Still, few good solutions existed that beautified this output for us enough to make it worth installing a dev dependency. Until VarDumper.

Continue reading: No More var_dump – Introducing Symfony VarDumper!

Nomad PHP: March 2015 – US (19.12.2014, 00:01 UTC)

Virtualization for Developers

Presented By
John Coggeshall
March 19, 2015 20:00 CDT

The post March 2015 – US appeared first on Nomad PHP.

Nomad PHP: March 2015 – EU (19.12.2014, 00:01 UTC)

Composer the Right Way

Presented By
Rafael Dohms
March 19, 2015 20:00 CET

The post March 2015 – EU appeared first on Nomad PHP.

Evert Pot: Testing your composer dependencies with prefer-lowest (18.12.2014, 22:54 UTC)

A few days ago, a new feature landed in composer, the --prefer-lowest argument.

Normally, for any dependency, composer will attempt to always install the latest possible version.

For instance, if your composer.json looks like this:

"require" : {
    "vendor/package" : "~1.2.1"
}

Then you are effectively telling composer that it should install any version from 1.2.1 upwards, but below 1.3.

If version 1.2.5 is the latest, composer will always grab that version.

With the new --prefer-lowest setting, you can tell composer to install the oldest possible version of a package that still matches your requirement.

So if we run:

composer update --prefer-lowest

We should get version 1.2.1 of vendor/package, and the oldest matching version of each of its dependencies.
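The difference between the two strategies can be sketched in plain PHP. This is not Composer's resolver: the helper `matchesTilde121()` and the list of available versions are hypothetical, and the `~1.2.1` constraint is hard-coded as `>=1.2.1, <1.3.0`.

```php
<?php

// Hypothetical helper: does $version satisfy "~1.2.1" (>=1.2.1 and <1.3.0)?
function matchesTilde121(string $version): bool
{
    return version_compare($version, '1.2.1', '>=')
        && version_compare($version, '1.3.0', '<');
}

// Hypothetical list of published versions of vendor/package.
$available = ['1.1.9', '1.2.0', '1.2.1', '1.2.3', '1.2.5', '1.3.0'];

// Keep only the versions allowed by the constraint, sorted oldest first.
$candidates = array_values(array_filter($available, 'matchesTilde121'));
usort($candidates, 'version_compare');

$lowest  = $candidates[0];                        // what --prefer-lowest aims for
$highest = $candidates[count($candidates) - 1];   // what a normal update aims for
```

With this made-up version list, `$lowest` is 1.2.1 and `$highest` is 1.2.5, matching the two outcomes described above.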

Why is this useful?

In some projects, there may be packages lying around that are not the latest version. This could be because a newer version introduced a BC break or a bug.

If other packages also use the package that's being held back, they may get an older version as a dependency.

Package maintainers will therefore want to find out whether their package works correctly with the oldest versions they claim to support.

To test this, we can simply run:

composer update --prefer-lowest
./bin/phpunit

If our unit tests break with older dependencies, we know that we either need to increase the oldest supported version, or implement a workaround so it still works.

Automating this with Travis

If you are using Travis for automated testing, you can also tell Travis to test with the oldest possible dependencies.

Travis allows you to add new lines to the build matrix with the env: setting, which makes this super easy to do.

If for instance your .travis.yml looks a bit like this:

language: php

php:
  - 5.4
  - 5.5
  - 5.6
  - hhvm

before_script:
  - composer update --prefer-source

script:
  - ./bin/phpunit

You can modify it to this, to allow Travis to do one build for the latest, and one build for the oldest possible dependencies:

language: php

php:
  - 5.4
  - 5.5
  - 5.6
  - hhvm

env:
  matrix:
    - PREFER_LOWEST="--prefer-lowest"
    - PREFER_LOWEST=""

before_script:
  - composer update --prefer-source $PREFER_LOWEST 

script:
  - ./bin/phpunit

This would result in a total of 8 builds.
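The count comes from multiplying the two lists: 4 PHP runtimes times 2 environment values. Travis expands the matrix itself; the snippet below only reproduces the arithmetic, and its variable names are illustrative.

```php
<?php

$phpVersions = ['5.4', '5.5', '5.6', 'hhvm'];
$flags       = ['--prefer-lowest', ''];

// Every runtime is combined with every value of PREFER_LOWEST.
$builds = [];
foreach ($phpVersions as $php) {
    foreach ($flags as $flag) {
        $builds[] = trim("php $php: composer update --prefer-source $flag");
    }
}

$total = count($builds);   // 4 * 2 = 8
```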

Depend on a wide ra

Truncated by Planet PHP, read more at the original (another 901 bytes)

Anthony Ferrara: Stack Machines: Compilers (18.12.2014, 20:31 UTC)
I have the honor today of writing a guest blog post on Igor Wiedler's Blog about Compilers. If you don't know @igorwhiletrue, he's pretty much the craziest developer that I know. And crazy in that genius sort of way. He's been doing a series of blog posts about Stack Machines and building complex runtimes from simple components. Well, today I authored a guest post on compiling code to run on said runtime. The compiler only took about 100 lines of code!!!

Check it out!

SitePoint PHP: Efficient Chinese Search with Elasticsearch (18.12.2014, 17:00 UTC)

If you have played with Elasticsearch, you already know that analysis and tokenization are the most important steps when indexing content; without them your relevance is going to be bad, your users unhappy and your results poorly sorted.

Even with English content you can lose relevance through bad stemming, miss some documents by not performing proper elision, and so on. It gets worse if you are indexing another language; the default analyzers are not all-purpose.

When dealing with Chinese documents, everything is even more complex, even when considering only Mandarin, which is the official language of China and the most spoken language worldwide. Let’s dig into Chinese content tokenization and look at the best ways of doing it with Elasticsearch.

Chinese characters are logograms: each one represents a word or a morpheme (the smallest meaningful unit of language). Put together, their meaning can change and represent a whole new word. Another difficulty is that there are no spaces between words or sentences, making it very hard for a computer to know where a word starts or ends.

There are tens of thousands of Chinese characters, although in practice written Chinese requires knowing between three and four thousand. Let’s see an example: the word “volcano” (火山) is in fact the combination of:

  • 火: fire
  • 山: mountain

Our tokenizer must be clever enough to avoid separating those two logograms, because the meaning is changed when they are not together.
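To see why a naive tokenizer falls short, compare splitting on every character with a greedy longest-match lookup against a word list. This is a toy sketch in PHP, not how Elasticsearch's analyzers work; the one-entry dictionary and the function names `splitCharacters()` and `segment()` are assumptions for illustration.

```php
<?php

// Split a UTF-8 string into individual characters.
function splitCharacters(string $text): array
{
    return preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Greedy longest-match segmentation against a known word list.
function segment(string $text, array $dictionary, int $maxLength = 4): array
{
    $chars  = splitCharacters($text);
    $count  = count($chars);
    $tokens = [];
    $i = 0;
    while ($i < $count) {
        $matched = $chars[$i];   // fall back to a single character
        $step    = 1;
        // Try the longest candidate first, then shorter ones.
        for ($len = min($maxLength, $count - $i); $len > 1; $len--) {
            $candidate = implode('', array_slice($chars, $i, $len));
            if (in_array($candidate, $dictionary, true)) {
                $matched = $candidate;
                $step    = $len;
                break;
            }
        }
        $tokens[] = $matched;
        $i += $step;
    }
    return $tokens;
}

$naive  = splitCharacters('火山');        // "fire" and "mountain" as separate tokens
$better = segment('火山', ['火山']);      // "volcano" kept as a single token
```

The per-character split would index "fire" and "mountain" separately, while the dictionary lookup keeps "volcano" intact. A real analyzer needs a far larger dictionary or a statistical model to do this reliably.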

Continue reading: Efficient Chinese Search with Elasticsearch
