Sebastian Bergmann: Static Analysis with HipHop for PHP (27.1.2012, 19:00 UTC)

In July 2010 I already blogged about the fact that HipHop for PHP, the source code transformer that turns PHP code into C++ code that can then be compiled with g++, can also be used for static code analysis to find problems in PHP source code.

Today I started to work on a convenience wrapper for HipHop's static analyzer:

➜  ~  hphpa /usr/local/src/code-coverage/PHP
hphpa 1.0.0 by Sebastian Bergmann.

/usr/local/src/code-coverage/PHP/CodeCoverage/Filter.php
  206   TooManyArgument: $this->addFileToWhitelist($file, FALSE)

Of course the tool can also generate an XML logfile in a format that is suitable for continuous integration:

➜  ~  hphpa --checkstyle hphpa.xml --quiet /usr/local/src/code-coverage/PHP
hphpa 1.0.0 by Sebastian Bergmann.

➜  ~  cat hphpa.xml
<checkstyle>
 <file name="/usr/local/src/code-coverage/PHP/CodeCoverage/Filter.php">
  <error line="206"
         message="$this->addFileToWhitelist($file, FALSE)"
         source="TooManyArgument"/>
 </file>
</checkstyle>
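
For readers who want to produce the same kind of logfile from their own tooling, here is a minimal sketch in Python of building a Checkstyle-style document shaped like the output above. It is not how hphpa itself works; the function name `to_checkstyle` and the tuple layout of the findings are assumptions made for illustration. Note that the XML serializer escapes `>` inside attribute values, so the raw text differs slightly from the listing above while parsing to the same values.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def to_checkstyle(findings):
    """Build a Checkstyle-style XML string.

    `findings` is an iterable of (file, line, rule, snippet) tuples.
    This tuple layout is a hypothetical input format, not hphpa's own.
    """
    # Group findings per file, because Checkstyle nests <error> in <file>.
    by_file = defaultdict(list)
    for path, line, rule, snippet in findings:
        by_file[path].append((line, rule, snippet))

    root = ET.Element("checkstyle")
    for path in sorted(by_file):
        file_el = ET.SubElement(root, "file", name=path)
        for line, rule, snippet in sorted(by_file[path]):
            # Mirror the attributes shown above: line, message, source.
            ET.SubElement(file_el, "error", line=str(line),
                          message=snippet, source=rule)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_checkstyle([
        ("/usr/local/src/code-coverage/PHP/CodeCoverage/Filter.php", 206,
         "TooManyArgument", "$this->addFileToWhitelist($file, FALSE)"),
    ]))
```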

PHP Classes: The Debate of Making PHP Faster using a JIT Compiler - Lately in PHP podcast episode 19 (5.1.2012, 09:10 UTC)
By Manuel Lemos
The official PHP implementation is evolving too slowly, while alternative implementations like Phalanger and Facebook HipHop can run PHP faster thanks to the use of JIT compiler engines.

JIT compilation was the main topic of episode 19 of the Lately in PHP podcast, presented by Manuel Lemos and Ernani Joppert. Their guests, Miloslav Beno of the Phalanger team and Nuno Lopes of the PECL LLVM project, joined them to discuss this and other interesting topics of the PHP scene.

They also looked back briefly at what happened in the PHP world in 2011 and discussed what they expect for 2012.

Listen to the podcast or read the transcript here.

PHP Advent: Everyone Loves PHP (10.12.2010, 05:00 UTC)

A year ago today, I started at Facebook. I’ve been using PHP for the last twelve years, and before I got there, I thought I had seen most PHP-related problems. I had deployed software on all manner of systems, but since I’ve worked at Facebook, my appreciation for how important PHP is has changed.

Earlier this year, another implementation of the PHP runtime emerged. This is nothing really new, as over the last few years, there had been several others — Quercus, Project Zero, Roadsend, and phc, to name a few. The main difference between them and HipHop for PHP is in the numbers. Nearly all of Facebook’s production traffic is served with HipHop. This consists of several million lines of code, worked on by several hundred engineers. Some run Apache and PHP; others run the HipHop interpreter.

One of the reasons for writing HipHop was to improve performance. The current Zend Engine is already pretty well optimized, and most changes result in a performance improvement of only a few percent. By building a code transformer, we were initially able to improve performance by 100 to 200 percent, and our current benchmarks show that in 2010, it’s improved an additional 200 percent.

This isn’t an article about HipHop and how awesome it is, though. What I’m trying to convey is our love affair with PHP. Take a look at the top hundred sites by traffic, size, or whatever other metric you fancy, and there is a high chance that they’re using PHP.

One of the main reasons I think PHP has become so popular is the critical mass that it acquired while it was growing up, but what was the original cause of that? It all comes down to its simplicity and the other options that were available at the time. Back then, your host offered Perl, ASP, or PHP. Of those, PHP was the simplest to learn, write, read, and debug. This made the barrier to entry much lower than the others.

It became easy to take a basic HTML web page, change the file extension, and sprinkle in a few lines of code from the documentation to make it dynamic. Now, there is no guarantee that it’s going to be good PHP, but the person who made the change was usually happy with the result.

Another reason PHP has become so popular is the documentation, which is second to none. It’s almost always up-to-date — and not just in English. Its central location, user comments, and easy search, among other things, make it the best resource for learning PHP. No books are required if you already know how to code. Before PHP, my background was Perl, and reading the manual was all I needed to make the switch.

One of the biggest reasons for PHP’s popularity is you, a member of the community. I recently discussed this with a friend, and we agreed that — especially compared to those surrounding other languages — PHP’s community is large, vibrant, and welcoming. There are dozens of user groups all over the world that meet on a monthly basis, along with larger conferences (and unconferences) that run throughout the year, all of which make it easy to be involved. One of my friends’ observations, while attending PHP London, was that someone who had just picked up the language that day was able to talk with a couple of contributors to the language core. Attendees at PHP-related events form a diverse group.

PHP is now in its teenage years, and the enterprise market is finally starting to warm up. Companies that wouldn’t normally have taken a second glance are now writing some large-scale systems with it. There are also a lot of companies built on providing consultation services, support, and training.

PHP’s growing pains are starting to show, though. For example, there isn’t a formal language specification: the current implementation itself is the specification, and it’s grown organically over the years. As things were needed, based on the trends of the Web, features were added. The last major version included closures, namespaces, and some other object-oriented goodies. The next version is going to include traits, scalar type hinting, and some other language changes. Developing modern PHP can be hard, but code written for PHP 4, for the most part, still works.

There is also no single entity or foundation that is responsible for the direction or funding of development. The majority of PHP’s contributors are volunteers, doing so without any direct financial gain. The rest do so on behalf of their employers — Oracle, IBM, Microsoft, Yahoo, and Fac

Truncated by Planet PHP, read more at the original (another 3348 bytes)

PHP Advent: Profiling with XHGui (3.12.2010, 05:00 UTC)

Everyone wants a fast web site. Making one is a bit more difficult. Profiling code during development is easy, thanks to the excellent Xdebug extension (which also provides handy debugging tools), but this still leaves us blind in production, where we care the most. The production environment is also frequently more than a little different from the development machine.

Facebook faced a similar problem in their pre-HipHop days, and developed XHProf, a profiler lightweight enough to run on your production servers. It still adds load to the server when it runs, but the tradeoff is worth it, because you’ll know why your pages are slow. XHProf can record different levels of detail about your app for different levels of performance sacrifice. I generally run it full out, because I prefer the highest level of detail, and I accept the performance hit.

XHProf will need to be compiled and configured on your server. This is currently trivial on most Linux-like servers. A Windows version is in the works (it already works apart from returning zero for the number of CPU ticks). Once the extension is installed and confirmed through phpinfo(), you’re off to the races. It’s also capable of showing you a call graph if you have Graphviz installed. While you’re playing on the server, you may want to look for this.

The included GUI works, but it only provides information on a run-by-run basis, and it uses serialized data on disk for storage. I’d heartily recommend XHGui for your UI needs.

XHGui includes two components: a data recorder (in the /external directory) and a UI for display (/xhprof_lib and /xhprof_html).

The recorder is used to actually profile your page loads. The easiest way to do this is to set PHP to automatically prepend header.php and append footer.php. This can be trivially accomplished with two Apache directives for a given virtual host:

php_value auto_prepend_file "/var/www/xh.example.com/external/header.php"
php_value auto_append_file "/var/www/xh.example.com/external/footer.php"

The UI directories contain a config file (in /xhprof_lib). You’ll need to copy config.sample.php to config.php and make the appropriate changes. There are a bunch of settings to be aware of:

exceptionURLs
These URLs will never have the link to the result page displayed at the bottom, even for administrators. This is useful for URLs that return non-HTML output, such as images, videos, XML, &c.
exceptionPostURLs
The app records GET, POST, and cookie information when profiling a page. This is useful in determining exactly why a given code path was used. For example, it may reveal that a user was logged in, or submitted a form comment. POST data, however, will occasionally include privileged information such as login credentials. The URLs listed here will prevent saving their POST data.
controlIPs
The ability to turn profiling on or off and to display the link to profile results is restricted to users accessing the page from within a control IP. Add IPs here to give yourself control. When you’re in the controlIP list, you’ll be able to turn profiling on for all your requests by appending ?_profile=1 to a URL (you’ll be given a cookie and redirected back to the original page); a link to the results for that run will also be appended to the page.
weight
One of every $weight requests will be profiled.
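
To make the interplay of weight, controlIPs, and exceptionPostURLs concrete, here is a minimal sketch in Python. It is not XHGui's actual code: the function names, the `forced` flag (standing in for the cookie set via ?_profile=1), and exact-match comparison of paths are all assumptions made for illustration.

```python
import random


def should_profile(weight, client_ip, control_ips, forced=False, rng=random):
    """Decide whether the current request gets profiled.

    `forced` is a hypothetical stand-in for the on-demand profiling cookie.
    `weight` must be a positive integer.
    """
    # On-demand profiling is only honoured for requests from a control IP.
    if forced and client_ip in control_ips:
        return True
    # Otherwise sample roughly one request in every `weight` requests.
    return rng.randrange(weight) == 0


def post_data_to_store(path, post, exception_post_urls):
    """Return the POST data to record, dropping it for sensitive URLs.

    Exact path matching is an assumption; the real matching rule may differ.
    """
    if path in exception_post_urls:
        return {}
    return dict(post)


if __name__ == "__main__":
    rng = random.Random(0)
    sampled = sum(should_profile(100, "203.0.113.9", set(), rng=rng)
                  for _ in range(10_000))
    print(f"profiled {sampled} of 10000 requests")
    print(post_data_to_store("/login", {"password": "secret"}, {"/login"}))
```

Passing the random source in as `rng` keeps the sampling decision easy to test with a seeded generator.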

There are two functions within the configuration file that are designed to be edited on each install:

_urlSimilator
This function accepts the URL of a requested resource. You should collapse URLs in such a way that functionally-identical endpoints return the same value. For example, if /blog/?post=25 and /blog/?post=26 are just two blog posts, you can probably assume that these URLs follow an identical path within PHP. The only difference is a parameter passed to the database (or, hopefully, memcached). Supply the _urlSimilator with as much intelligence as you can to help it collapse these URLs.
_aggregateCalls
This function controls the pie chart within a specific run. The goal is to collapse pie slices together to give the developer a better idea of how certain sections of your app are doing. For example, you may choose to co

Truncated by Planet PHP, read more at the original (another 5303 bytes)
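
The URL-collapsing idea described for _urlSimilator above can be sketched as follows. This is an illustration in Python rather than the PHP function that ships in the config file; the name `url_similator`, the `VOLATILE_PARAMS` set, and the `*` placeholder are assumptions, and a real install would tune the list of parameters to its own application.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical list of query parameters that only select a record
# and do not change the code path taken.
VOLATILE_PARAMS = {"post", "story"}


def url_similator(url):
    """Collapse functionally identical URLs to one canonical key."""
    parts = urlsplit(url)
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        # Replace record-selecting values with a placeholder.
        kept.append((key, "*" if key in VOLATILE_PARAMS else value))
    # Sort so that parameter order does not create distinct keys.
    kept.sort()
    query = urlencode(kept, safe="*")
    return parts.path + ("?" + query if query else "")


if __name__ == "__main__":
    for url in ("/blog/?post=25", "/blog/?post=26", "/blog/?page=2&post=25"):
        print(url, "->", url_similator(url))
```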

Philip Olson: PHPVille almost released today (6.9.2010, 20:34 UTC)
Today Facebook released a HipHop l33t shizznizzle of a tool for PHP but sadly it’s not the PHPVille I was hoping for. PHPVille is probably a game that encourages PHP community members to work on the PHP project (over at php.net). Much like FarmVille, users simply cannot stop playing! Possible game features: Each SVN commit [...]

Thomas Koch: Free Software in companies (6.8.2010, 08:46 UTC)
At work we currently have a discussion: I'd like to develop some components as Free Software. My bosses like the idea, but the client goes nuts just thinking about it.
So as part of the discussion I thought I would collect companies that actively advertise their Free Software. I know about the big ones, but it would be even more interesting to get a list of successful small companies that share at least part of their in-house projects. On the other hand, it would not be too interesting to list full Free Software companies like Red Hat.
It would also be fine to share this list in some wiki (FSFE?) when it grows larger. For some companies I also list very popular projects from my point of view.

big companies

  • Adobe
  • Facebook: Cassandra, Thrift, Scribe, HipHop for PHP, ...
  • Yahoo: YUI, Design Pattern Library, YSlow, Hadoop, ZooKeeper
  • Twitter
  • Google: Chromium Browser, Google Summer of Code, Android, Google Web Toolkit (GWT), Protocol Buffers - data interchange format, Java Collections Library, unladen-swallow - faster implementation of Python, ...
  • IBM: Developerworks (Tutorials), Eclipse, International Components for Unicode (ICU)

middle sized companies

small companies


Update: Need to add Danga (Gearman), Liip.ch with Okapi, Flux CMS, Jackalope, Rackspace, Samsung

Sebastian Bergmann: Using HipHop for Static Analysis (27.7.2010, 09:30 UTC)

HipHop for PHP, the source code transformer that turns PHP code into C++ code that can then be compiled with g++, can also be used for static code analysis to find problems in PHP source code.

sb@vmware Money % hphp -t analyze --input-dir .
running hphp...
creating temporary directory /tmp/hphp_Zz7AXg ...
parsing inputs...
parsing inputs took 0'00" (20 ms) wall time
inferring types...
inferring types took 0'00" (10 ms) wall time
saving code errors, dependency graph and stats...
all files saved in /tmp/hphp_Zz7AXg ...
running hphp took 0'00" (208 ms) wall time

The script below takes a CodeErrors.js file (which is generated by hphp and in the example above is saved to /tmp/hphp_Zz7AXg) as its input and prints an XML document in Checkstyle's format (the same XML format that is also used by PHP_CodeSniffer, for instance). This XML logfile can then be used with Hudson, for instance, in a continuous integration context.

[Embedded script: http://gist.github.com/490858.js]

Daniel Krook: Facebook and HipHop at New York PHP (27.4.2010, 23:39 UTC)
Tonight we’re hosting Scott MacVicar from Facebook as he presents their highly optimized version of PHP called HipHop. HipHop translates and compiles regular PHP source into a C++ binary that reduces CPU and memory usage and thus helps Facebook serve twice the content with two-thirds the resources of a stock Apache and PHP [...]

Paul Reinheimer: A GUI for XHProf (26.4.2010, 18:54 UTC)

Facebook was kind enough to open source the XHProf extension last year, but it flew under my radar until I saw a presentation including it earlier this year when they showed off HipHop. XHProf provides profiling information about your application, while being lightweight enough to run on a production server (against a percentage of requests). Once we got it installed, we ran into a few limitations with the existing GUI. With the help of Graham Slater (one of our front-end designers), I hacked on the Facebook code to come up with our own flavor. Let me stress the word “hack.” I was given some really tough deadlines for this project, so functionality was literally hacked into the existing codebase.

Key Features:

  • Nice interface
  • MySQL backend for data storage
  • Stores key values such as Peak Memory Usage, Wall Time, and CPU Ticks in the database directly for easy lookup.
  • Stores GET, POST, and cookie data for easy comparison (so you can determine why a run hit that execution path)
  • Display includes easy lookup for worst runs, most recent, etc.
  • Support for profiling a percentage of requests, or on demand
  • Support for not including a link to profile requests on certain files (like XML documents, images, etc.)
  • Uses Google Data Visualization API to graph runs over time
  • Concept of “Similar” URLs, to handle ?story=23 and ?story=24 having different URLs but similar or identical code execution paths

Screenshots:


Download:

Please download the source code from GitHub

Warnings:

Seriously, we hacked this up! Please let me know if you encounter any bugs; there are probably a bunch.

Talk In Code: Is HipHop really worth it? (9.3.2010, 17:08 UTC)

In case you missed the news, a few months ago Facebook announced on their developer blog that they were releasing a piece of software to the wider community. That piece of software is called ‘HipHop’ and is one of the many technologies Facebook run behind the scenes to keep one of the world’s most popular sites up and running.

But what does it do? In short, HipHop translates PHP code into C++ code, which can then be compiled into native machine code. This, in theory, should remove a bottleneck or two from PHP itself – namely the interpreter – as a good deal of slow, interpreted code can be compiled into uber-fast native code. Well, in theory.

While there is a performance increase (just Google for benchmarks, there are bound to be dozens by now), I think that it simply does not justify the effort for most PHP-driven sites. The added complexity of ensuring all your PHP code can be understood by HipHop would be enough to drive me mad, in particular. Adding to that is the number of issues I have seen people encountering whilst trying to compile HipHop on anything other than Facebook’s own platform of choice… Not pretty.

It isn’t just that though. I believe the only real increase in performance is in processing-limited scenarios, which just aren’t that common in real life. Sure, Facebook need it, but they serve hundreds of millions of pageviews each day. For Joe Bloggs’ e-commerce store, the benefit of slightly faster C++ code would quickly evaporate once every code change had to be made in PHP first, passed through HipHop, compiled, and then debugged a second time as C++ code.

Stability isn’t really an issue (it has apparently been powering Facebook for a while), and HipHop apparently supports ‘most’ PHP code, but I really do not think you should consider it for your next project. Well, unless you’re re-coding MySpace…
