The PHP team has released Alpha 2 of PHP 8.4 – an update of this popular language planned for general availability in November, with many new features including HTML 5 DOM (Document Object Model) parsing, support for large XML documents, and array helper functions.
PHP has a DOM extension which loads and parses HTML documents, but the existing parser – based on a GNOME library called libxml2 – only supports HTML 4.01, which is long out of date. HTML 5 is now the de facto standard, but attempting to use PHP’s loadHTML function with HTML 5 content results in “multiple parsing errors,” the RFC on the subject states. It adds that “not being able to parse HTML 5 properly is one of the major pain points of our DOM extension.”
The new version uses an alternative HTML 5 parser and introduces a new DOM\HTMLDocument class, so that developers using the existing DOMDocument class will see no change in behavior. This is also an opportunity to improve the API. The new parser is based on an open source project called Lexbor.
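A minimal sketch of how the new class might be used, assuming a PHP 8.4 build. The createFromString factory comes from the RFC; the final 8.4 release spells the namespace Dom, and since PHP namespaces are case-insensitive the DOM\HTMLDocument spelling resolves to the same class. The markup and variable names are illustrative only.

```php
<?php
// Requires PHP 8.4 or later: Dom\HTMLDocument does not exist in earlier versions.

// Illustrative HTML 5 markup using semantic elements (main, article) that
// postdate the HTML 4.01 parser behind DOMDocument::loadHTML().
$html = <<<'HTML'
<!DOCTYPE html>
<html>
  <head><title>Release notes</title></head>
  <body>
    <main>
      <article><h1>PHP 8.4</h1><p>Alpha 2 is out.</p></article>
    </main>
  </body>
</html>
HTML;

// Parse with the HTML 5 parser. The document is created by a static
// factory method rather than by calling a load method on an empty object.
$document = Dom\HTMLDocument::createFromString($html);

// Standard DOM traversal works on the result.
$heading = $document->getElementsByTagName('h1')->item(0);
$headingText = $heading->textContent;

echo $headingText, PHP_EOL;
```

Code that keeps using DOMDocument and loadHTML is untouched by this; moving to the new parser is an explicit opt-in per call site.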
Staying on the parsing theme, PHP users have complained that the XML parser fails with out-of-memory errors on large XML chunks, with no way to set the libxml2 option that allows large input data. A new option called XML_OPTION_PARSE_HUGE will fix this. It will be off by default, since there is some benefit in rejecting large input to resist denial of service attacks. The new option is for the event-driven ext/xml parser, as other XML parsing options in PHP already support a LIBXML_PARSEHUGE setting.
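As a rough illustration of opting in, the sketch below sets the new option on an event-driven parser before any data is fed to it. It assumes a PHP 8.4 build; the tiny sample document and the counting handler are stand-ins, not a realistic large-file workload.

```php
<?php
// Requires PHP 8.4 or later for XML_OPTION_PARSE_HUGE.

$parser = xml_parser_create();

// Opt in to large input before parsing starts. Only do this for trusted
// data, because the default limits exist to blunt denial of service attempts.
xml_parser_set_option($parser, XML_OPTION_PARSE_HUGE, true);

// Count elements as they stream past instead of building a tree in memory.
$elementCount = 0;
xml_set_element_handler(
    $parser,
    function ($parser, string $name, array $attributes) use (&$elementCount): void {
        $elementCount++;
    },
    function ($parser, string $name): void {
        // Nothing to do on closing tags in this sketch.
    }
);

// A small stand-in document; a real use case would feed in much larger data.
$xml = '<items><item id="1"/><item id="2"/><item id="3"/></items>';

// The third argument marks this as the final chunk of input.
$ok = xml_parse($parser, $xml, true);

echo $ok === 1
    ? "Parsed $elementCount elements"
    : xml_error_string(xml_get_error_code($parser)), PHP_EOL;
```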
PHP 8.4 also has new array searching functions. The problem here is that PHP’s array processing functions are missing “functions to find a single element matching a condition,” says the RFC. Developers frequently implement their own code for this, which is not difficult, but “these functions are often required, leading to the wheel being reinvented over and over,” the RFC continues. Hence the introduction of four new functions: array_find, array_find_key, array_any and array_all.
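The four functions share the same shape: an array plus a callback that returns true for a match. The sketch below, which assumes a PHP 8.4 build and uses made-up data, shows what each one returns.

```php
<?php
// Requires PHP 8.4 or later for array_find(), array_find_key(),
// array_any() and array_all().

// Illustrative data: package name => download count.
$downloads = ['alpha' => 120, 'beta' => 4500, 'gamma' => 980, 'delta' => 7200];

// The callback is also passed the key as a second argument; it is unused here.
$isPopular = fn (int $count): bool => $count > 1000;

// First value that satisfies the callback, or null if none does.
$firstPopular = array_find($downloads, $isPopular);

// Key of that first matching value, or null if none does.
$firstPopularKey = array_find_key($downloads, $isPopular);

// True if at least one element matches.
$anyPopular = array_any($downloads, $isPopular);

// True only if every element matches.
$allPopular = array_all($downloads, $isPopular);

var_dump($firstPopular, $firstPopularKey, $anyPopular, $allPopular);
```

Because array_find returns null when nothing matches, an array that legitimately contains null values is better searched with array_find_key or array_any.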
Another welcome feature is the ability to access members of a newly instantiated object without wrapping the new expression in additional parentheses. The RFC is titled “new MyClass()->method() without parentheses.” While it may seem a small change, it simplifies the code as well as saving developers a trip to the documentation to discover why something that looks like it should work does not.
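A short before-and-after comparison, assuming a PHP 8.4 build; the Greeter class is hypothetical and exists only to have something to instantiate.

```php
<?php
// Requires PHP 8.4 or later; earlier versions reject the second form
// with a parse error.

// Hypothetical class used only for illustration.
final class Greeter
{
    public function __construct(private string $name)
    {
    }

    public function greet(): string
    {
        return "Hello, {$this->name}!";
    }
}

// Before 8.4: the new expression has to be wrapped in parentheses.
$before = (new Greeter('PHP'))->greet();

// From 8.4: the wrapping parentheses can be dropped. The constructor's
// argument list, even an empty (), is still required for this to parse.
$after = new Greeter('PHP')->greet();

echo $after, PHP_EOL;
```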
There are numerous other features expected in the 8.4 release, including a smarter JIT compiler and one change that may raise deprecation warnings in existing code: the deprecation of implicitly nullable parameter types.
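For the nullable deprecation, the fix in existing code is usually mechanical. The sketch below uses a hypothetical greetUser function to show the deprecated style (in a comment, so the example itself stays warning-free) alongside the explicit form.

```php
<?php
// Hypothetical function used only for illustration.

// Deprecated style as of 8.4: the null default silently makes the
// parameter nullable even though the declared type is just string.
//
//     function greetUser(string $name = null): string { ... }

// Preferred style: state the nullability explicitly. This syntax has
// been available since PHP 7.1, so the change is backward compatible.
function greetUser(?string $name = null): string
{
    return 'Hello, ' . ($name ?? 'world') . '!';
}

echo greetUser(), PHP_EOL;
echo greetUser('PHP'), PHP_EOL;
```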
Since this is an Alpha release, it is still possible that some features will not make the final version. But feature freeze is less than one month away – the timetable for PHP 8.4 provides for a further Alpha release and then feature freeze on August 13. Three Beta releases will follow, then release candidates until general availability on November 21.