November 03, 2024

Igalia and WebKit: status update and plans (2024)

by Mario Sánchez Prada

It’s been more than 2 years since the last time I wrote something here, and in that time a lot of things happened. Among those, one of the main highlights was me moving back to Igalia‘s WebKit team, but this time I moved as part of Igalia’s support infrastructure to help with other types of tasks such as general coordination, team facilitation and project management, among other things.

On top of those things, I’ve been also presenting our work around WebKit in different venues, such as in the Embedded Open Source Summit or in the Embedded Recipes conference, for instance. Of course, that included presenting our work in the WebKit community as part of the WebKit Contributors Meeting, a small and technically focused event that happens every year, normally around the Bay Area (California). That’s often a pretty dense presentation where, over the course of 30-40 minutes, we go through all the main areas that we at Igalia contribute to in WebKit, trying to summarize our main contributions in the previous 12 months. This includes work not just from the WebKit team, but also from other ones such as our Web Platform, Compilers or Multimedia teams.

So far I did that a couple of times only, both last year on October 24rth as well as this year, just a couple of weeks ago in the latest instance of the WebKit Contributors meeting. I believe the session was interesting and informative, but unfortunately it does not get recorded so this time I thought I’d write a blog post to make it more widely accessible to people not attending that event.

This is a long read, so maybe grab a cup of your favorite beverage first…

Igalia and WebKit

So first of all, what is the relationship between Igalia and the WebKit project?

In a nutshell, we are the lead developers and the maintainers of the two Linux-based WebKit ports, known as WebKitGTK and WPE. These ports share a common baseline (e.g. GLib, GStreamer, libsoup) and also some goals (e.g. performance, security), but other than that their purpose is different, with WebKitGTK being aimed at the Linux desktop, while WPE is mainly focused on embedded devices.

This means that, while WebKitGTK is the go-to solution to embed Web content in GTK applications (e.g. GNOME Web/Epiphany, Evolution), and therefore integrates well with that graphical toolkit, WPE does not even provide a graphical toolkit since its main goal is to be able to run well on embedded devices that often don’t even have a lot of memory or processing power, or not even the usual mechanisms for I/O that we are used to in desktop computers. This is why WPE’s architecture is designed with flexibility in mind with a backends-based architecture, why it aims for using as few resources as possible, and why it tries to depend on as few libraries as possible, so you can integrate it virtually in any kind of embedded Linux platform.

Besides that port-specific work, which is what our WebKit and Multimedia teams focus a lot of their effort on, we also contribute at a different level in the port-agnostic parts of WebKit, mostly around the area of Web standards (e.g. contributing to Web specifications and to implement them) and the Javascript engine. This work is carried out by our Web Platform and Compilers team, which tirelessly contribute to the different parts of WebCore and JavaScriptCore that affect not just the WebKitGTK and WPE ports, but also the rest of them to a bigger or smaller degree.

Last but not least, we also devote a considerable amount of our time to other topics such as accessibility, performance, bug fixing, QA... and also to make sure WebKit works well on 32-bit devices, which is an important thing for a lot of WPE users out there.

Who are our users?

At Igalia we distinguish 4 main types of users of the WebKitGTK and WPE ports of WebKit:

Port users: this category would include anyone that writes a product directly against the port’s API, that is, apps such as a desktop Web browser or embedded systems that rely on a fullscreen Web view to render its Web-based content (e.g. digital signage systems).

Platform providers: in this category we would have developers that build frameworks with one of the Linux ports at its core, so that people relying on such frameworks can leverage the power of the Web without having to directly interface with the port’s API. RDK could be a good example of this use case, with WPE at the core of the so-called Thunder plugin (previously known as WPEFramework).

Web developers: of course, Web developers willing to develop and test their applications against our ports need to be considered here too, as they come with a different set of needs that need to be fulfilled, beyond rendering their Web content (e.g. using the Web Inspector).

End users: And finally, the end user is the last piece of the puzzle we need to pay attention to, as that’s what makes all this effort a task worth undertaking, even if most of them most likely don’t need what WebKit is, which is perfectly fine :-)

We like to make this distinction of 4 possible types of users explicit because we think it’s important to understand the complexity of the amount of use cases and the diversity of potential users and customers we need to provide service for, which is behind our decisions and the way we prioritize our work.

Strategic goals

Our main goal is that our product, the WebKit web engine, is useful for more and more people in different situations. Because of this, it is important that the platform is homogeneous and that it can be used reliably with all the engines available nowadays, and this is why compatibility and interoperability is a must, and why we work with the the standards bodies to help with the design and implementation of several Web specifications.

With WPE, it is very important to be able to run the engine in small embedded devices, and that requires good performance and being efficient in multiple hardware architectures, as well as great flexibility for specific hardware, which is why we provided WPE with a backend-based architecture, and reduced dependencies to a minimum.

Then, it is also important that the QA Infrastructure is good enough to keep the releases working and with good quality, which is why I regularly maintain, evolve and keep an eye on the EWS and post-commit bots that keep WebKitGTK and WPE building, running and passing the tens of thousands of tests that we need to check continuously, to ensure we don’t regress (or that we catch issues soon enough, when there’s a problem). Then of course it’s also important to keep doing security releases, making sure that we release stable versions with fixes to the different CVEs reported as soon as possible.

Finally, we also make sure that we keep evolving our tooling as much as possible (see for instance the release of the new SDK earlier this year), as well as improving the documentation for both ports.

Last, all this effort would not be possible if not because we also consider a goal of us to maintain an efficient collaboration with the rest of the WebKit community in different ways, from making sure we re-use and contribute to other ports as much code as possible, to making sure we communicate well in all the forums available (e.g. Slack, mailing list, annual meeting).

Contributions to WebKit in numbers

Well, first of all the usual disclaimer: number of commits is for sure not the best possible metric,  and therefore should be taken with a grain of salt. However, the point here is not to focus too much on the actual numbers but on the more general conclusions that can be extracted from them, and from that point of view I believe it’s interesting to take a look at this data at least once a year.

With that out of the way, it’s interesting to confirm that once again we are still the 2nd biggest contributor to WebKit after Apple, with ~13% of the commits landed in this past 12-month period. More specifically, we landed 2027 patches out of the 15617 ones that took place during the past year, only surpassed by Apple and their 12456 commits. The remaining 1134 patches were landed mostly by Sony, followed by RedHat and several other contributors.

Now, if we remove Apple from the picture, we can observe how this year our contributions represented ~64% of all the non-Apple commits, a figure that grew about ~11% compared to the past year. This confirms once again our commitment to WebKit, a project we started contributing about 14 years ago already, and where we have been systematically being the 2nd top contributor for a while now.

Main areas of work

The 10 main areas we have contributed to in WebKit in the past 12 months are the following ones:

  • Web platform
  • Graphics
  • Multimedia
  • JavaScriptCore
  • New WPE API
  • WebKit on Android
  • Quality assurance
  • Security
  • Tooling
  • Documentation

In the next sections I’ll talk a bit about what we’ve done and what we’re planning to do next for each of them.

Web Platform

content-visibility:auto

This feature allows skipping painting and rendering of off-screen sections, particularly useful to avoid the browser spending time rendering parts in large pages, as content outside of the view doesn’t get rendered until it gets visible.

We completed the implementation and it’s now enabled by default.

Navigation API

This is a new API to manage browser navigation actions and examine history, which we started working on in the past cycle. There’s been a lot of work happening here and, while it’s not finished yet, the current plan is that Apple will continue working on that in the next months.

hasUAVisualTransition

This is an attribute of the NavigateEvent interface, which is meant to be True if the User Agent has performed a visual transition before a navigation event. It was something that we have also finished implementing and is now also enabled by default.

Secure Curves in the Web Cryptography API

In this case, we worked on fixing several Web Interop related issues, as well as on increasing test coverage within the Web Platform Tests (WPT) test suites.

On top of that we also moved the X25519 feature to the “prepare to ship” stage.

Trusted Types

This work is related to reducing DOM-based XSS attacks. Here we finished the implementation and this is now pending to be enabled by default.

MathML

We continued working on the MathML specification by working on the support for padding, border and margin, as well as by increasing the WPT score by ~5%.

The plan for next year is to continue working on core features and improve the interaction with CSS.

Cross-root ARIA

Web components have accessibility-related issues with native Shadow DOM as you cannot reference elements with ARIA attributes across boundaries. We haven’t worked on this in this period, but the plan is to work in the next months on implementing the Reference Target proposal to solve those issues.

Canvas Formatted Text

Canvas has not a solution to add formatted and multi-line text, so we would like to also work on exploring and prototyping the Canvas Place Element proposal in WebKit, which allows better text in canvas and more extended features.

Graphics

Completed migration from Cairo to Skia for the Linux ports

If you have followed the latest developments, you probably already know that the Linux WebKit ports (i.e. WebKitGTK and WPE) have moved from Cairo to Skia for their 2D rendering library, which was a pretty big and important decision taken after a long time trying different approaches and experiments (including developing our own HW-accelerated 2D rendering library!), as well as running several tests and measuring results in different benchmarks.

The results in the end were pretty overwhelming and we decided to give Skia a go, and we are happy to say that, as of today, the migration has been completed: we covered all the use cases in Cairo, achieving feature parity, and we are now working on implementing new features and improvements built on top of Skia (e.g. GPU-based 2D rendering).

On top of that, Skia is now the default backend for WebKitGTK and WPE since 2.46.0, released on September 17th, so if you’re building a recent version of those ports you’ll be already using Skia as their 2D rendering backend. Note that Skia is using its GPU-based backend only on desktop environments, on embedded devices the situation is trickier and for now the default is the CPU-based Skia backend, but we are actively working to narrow the gap and to enable GPU-based rendering also on embedded.

Architecture changes with buffer sharing APIs (DMABuf)

We did a lot of work here, such as a big refactoring of the fencing system to control the access to the buffers, or the continued work towards integrating with Apple’s DisplayLink infrastructure.

On top of that, we also enabled more efficient composition using damaging information, so that we don’t need to pass that much information to the compositor, which would slow the CPU down.

Enablement of the GPUProcess

On this front, we enabled by default the compilation for WebGL rendering using the GPU process, and we are currently working in performance review and enabling it for other types of rendering.

New SVG engine (LBSE: Layer-Based SVG Engine)

If you are not familiar with this, here the idea is to make sure that we reuse the graphics pipeline used for HTML and CSS rendering, and use it also for SVG, instead of having its own pipeline. This means, among other things, that SVG layers will be supported as a 1st-class citizen in the engine, enabling HW-accelerated animations, as well as support for 3D transformations for individual SVG elements.

On this front, on this cycle we added support for the missing features in the LBSE, namely:

  • Implemented support for gradients & patterns (applicable to both fill and stroke)
  • Implemented support for clipping & masking (for all shapes/text)
  • Implemented support for markers
  • Helped review implementation of SVG filters (done by Apple)

Besides all this, we also improved the performance of the new layer-based engine by reducing repaints and re-layouts as much as possible (further optimizations still possible), narrowing the performance gap with the current engine for MotionMark. While we are still not at the same level of performance as the current SVG engine, we are confident that there are several key places where, with the right funding, we should be able to improve the performance to at least match the current engine, and therefore be able to push the new engine through the finish line.

General overhaul of the graphics pipeline, touching different areas (WIP):

On top of everything else commented above, we also worked on a general refactor and simplification of the graphics pipeline. For instance, we have been working on the removal of the Nicosia layer now that we are not planning to have multiple rendering implementations, among other things.

Multimedia

DMABuf-based sink for HW-accelerated video

We merged the DMABuf-based sink for HW-accelerated video in the GL-based GStreamer sink.

WebCodecs backend

We completed the implementation of  audio/video encoding and decoding, and this is now enabled by default in 2.46. As for the next steps, we plan to keep working on the integration of WebCodecs with WebGL and WebAudio.

GStreamer-based WebRTC backends

We continued working on GstWebRTC, bringing it to a point where it can be used in production in some specific use cases, and we will still be working on this in the next months.

Other

Besides the points above, we also added an optional text-to-speech backend based on libspiel to the development branch, and worked on general maintenance around the support for Media Source Extensions (MSE) and Encrypted Media Extensions (EME), which are crucial for the use case of WPE running in set-top-boxes, and is a permanent task we will continue to work on in the next months.

JavaScriptCore

ARMv7/32-bit support:

A lot of work happened around 32-bit support in JavaScriptCore, especially around WebAssembly (WASM): we ported the WASM BBQJIT and ported/enabled concurrent JIT support, and we also completed 80% of the implementation for the OMG optimization level of WASM, which we plan to finish in the next months. If you are unfamiliar with what the OMG and BBQ optimization tiers in WASM are, I’d recommend you to take a look at this article in webkit.org: Assembling WebAssembly.

We also contributed to the JIT-less WASM, which is very useful for embedded systems that can’t support JIT for security or memory related constraints, and also did some work on the In-Place Interpreter (IPInt), which is a new version of the WASM Low-level interpreter (LLInt) that uses less memory and executes WASM bytecode directly without translating it to LLInt bytecode  (and should therefore be faster to execute).

Last, we also contributed most of the implementation for the WASM GC, with the exception of some Kotlin tests.

As for the next few months, we plan to investigate and optimize heap/JIT memory usage in 32-bit, as well as to finish several other improvements on ARMv7 (e.g. IPInt).

New WPE API

The new WPE API is a new API that aims at making it easier to use WPE in embedded devices, by removing the hassle of having to handle several libraries in tandem (i.e. WPEWebKit, libWPE and WPEBackend-FDO, for instance), available from WPE’s releases page, and providing a more modern API in general, better aimed at the most common use cases of WPE.

A lot of effort happened this year along these lines, including the fact that we finally upstreamed and shipped its initial implementation with WPE 2.44, back in the first half of the year. Now, while we recommend users to give it a try and report feedback as much as possible, this new API is still not set in stone, with regular development still ongoing, so if you have the chance to try it out and share your experience, comments are welcome!

Besides shipping its initial implementation, we also added support for external platforms, so that other ones can be loaded beyond the Wayland, DRM and “headless” ones, which are the default platforms already included with WPE itself. This means for instance that a GTK4 platform, or another one for RDK could be easily used with WPE.

Then of course a lot of API additions were included in the new API in the latest months:

  • Screens management APIAPI to handle different screens, ask the display for the list of screens with their device scale factor, refresh rate, geometry…
  • Top level management API: This API allows a greater degree of control, for instance by allowing more than one WebView for the same top level, as well as allowing to retrieve properties such as size, scale or state (i.e. full screen, maximized…).
  • Maximized and minimized windows API: API to maximize/minimize a top level and monitor its state. mainly used by WebDriver.
  • Preferred DMA-BUF formats API: enables asking the platform (compositor or DRM) for the list of preferred formats and their intended use (scanout/rendering).
  • Input methods APIallows platforms to provide an implementation to handle input events (e.g. virtual keyboard, autocompletion, auto correction…).
  • Gestures API: API to handle gestures (e.g. tap, drag).
  • Buffer damaging: WebKit generates information about the areas of the buffer that actually changed and we pass that to DRM or the compositor to optimize painting.
  • Pointer lock API: allows the WebView to lock the pointer so that the movement of the pointing device (e.g. mouse) can be used for a different purpose (e.g. first-person shooters).

Last, we also added support for testing automation, and we can support WebDriver now in the new API.

With all this done so far, the plan now is to complete the new WPE API, with a focus on the Settings API and accessibility support, write API tests and documentation, and then also add an external platform to support GTK4. This is done on a best-effort basis, so there’s no specific release date.

WebKit on Android

This year was also a good year for WebKit on Android, also known as WPE Android, as this is a project that sits on top of WPE and its public API (instead of developing a fully-fledged WebKit port).

In case you’re not familiar with this, the idea here is to provide a WebKit-based alternative to the Chromium-based Web view on Android devices, in a way that leverages HW acceleration when possible and that it integrates natively (and nicely) with the several Android subsystems, and of course with Android’s native mainloop. Note that this is an experimental project for now, so don’t expect production-ready quality quite yet, but hopefully something that can be used to start experimenting with selected use cases.

If you’re adventurous enough, you can already try the APKs yourself from the releases page in GitHub at https://github.com/Igalia/wpe-android/releases.

Anyway, as for the changes that happened in the past 12 months, here is a summary:

  • Updated WPE Android to WPE 2.46 and NDK 27 LTS
  • Added support for WebDriver and included WPT test suites
  • Added support for instrumentation tests, and integrated with the GitHub CI
  • Added support for the remote Web inspector, very useful for debugging
  • Enabled the Skia backend, bringing HW-accelerated 2D rendering to WebKit on Android
  • Implemented prompt delegates, allowing implementing things such as alert dialogs
  • Implemented WPEView client interfaces, allowing responding to things such as HTTP errors
  • Packaged a WPE-based Android WebView in its own library and published in Maven Central. This is a massive improvement as now apps can use WPE Android by simply referencing the library from the gradle files, no need to build everything on their own.
  • Other changes: enabled HTTP/2 support (via the migration to libsoup3), added support for the device scale factor, improved the virtual on-screen keyboard, general bug fixing…

On top of that, we published 3 different blog posts covering different topics, from a general intro to a more deep dive explanation of the internals, and showing some demos. You can check them out in Jani’s blog at https://blogs.igalia.com/jani

As for the future, we’ll focus on stabilization and regular maintenance for now, and then we’d like to work towards achieving production-ready quality for specific cases if possible.

Quality Assurance

On the QA front, we had a busy year but in general we could highlight the following topics.

  • Fixed a lot of API tests failures in the bots that were limiting our test coverage.
  • Fixed lots of assertions-related crashes in the bots, which were slowing down the bots as well as causing other types of issues, such as bots exiting early due too many failures.
  • Enabled assertions in the release bots, which will help prevent crashes in the future, as well as with making our debug bots healthier.
  • Moved all the WebKitGTK and WPE bots to building now with Skia instead of Cairo. This means that all the bots running tests are now using Skia, and there’s only one bot still using Cairo to make sure that the compilation is not broken, but that bot does not run tests.
  • Moved all the WebKitGTK bots to use GTK4 by default. As with the move to Skia, all the WebKit bots running tests now use GTK4 and the only one remaining building with GTK3 does not run tests, it only makes sure we don’t break the GTK3 compilation for now.
  • Working on moving all the bots to use the new SDK. This is still work in progress and will likely be completed during 2025 as it’s needed to implement several changes in the infrastructure that will take some time.
  • General gardening and bot maintenance

In the next months, our main focus would be a revamp of the QA infrastructure to make sure that we can get all the bots (including the debug ones) to a healthier state, finish the migration of all the bots to the new SDK and, ideally, be able to bring back the ready-to-use WPE images that we used to have available in wpewebkit.org.

Security

The current release cadence has been working well, so we continue issuing major releases every 6 months (March, September), and then minor and unstable development releases happening on-demand when needed.

As usual, we kept aligning releases for WebKitGTK and WPE, with both of them happening at the same time (see https://webkitgtk.org/releases and https://wpewebkit.org/release), and then also publishing WebKit Security Advisories (WSA) when necessary, both for WebKitGTK and for WPE.

Last, we also shortened the time before including security fixes in stable releases this year, and we have removed support for libsoup2 from WPE, as that library is no longer maintained.

Tooling & Documentation

On tooling, the main piece of news is that this year we released the initial version of the new SDK,  which is developed on top of OCI-based containers. This new SDK fixes the issues with the current existing approaches based on JHBuild and flatpak, where one of them was great for development but poor for testing and QA, and the other one was great for testing and QA, but not very convenient for development.

This new SDK is regularly maintained and currently runs on Ubuntu 24.04 LTS with GCC 14 & Clang 18. It has been made public on GitHub and announced to the public in May 2024 in Patrick’s blog, and is now the officially recommended way of building WebKitGTK and WPE.

As for documentation, we didn’t do as much as we would have liked here, but we still landed a few contributions in docs.webkit.org, mostly related to WebKitGTK (e.g. Releases and VersioningSecurity UpdatesMultimedia). We plan to do more on this regard in the next months, though, mostly by writing/publishing more documentation and perhaps also some tutorials.

Final thoughts

This has been a fairly long blog post but, as you can see, it’s been quite a year for WebKit here at Igalia, with many exciting changes happening at several fronts, and so there was quite a lot of stuff to comment on here. This said, you can always check the slides of the presentation in the WebKit Contributors Meeting here if you prefer a more concise version of the same content.

In any case, what’s clear it’s that the next months are probably going to be quite interesting as well with all the work that’s already going on in WebKit and its Linux ports, so it’s possible that in 12 months from now I might be writing an equally long essay. We’ll see.

Thanks for reading!

by mario at November 03, 2024 05:20 PM



September 27, 2024

Graphics improvements in WebKitGTK and WPEWebKit 2.46

by Carlos García Campos

WebKitGTK and WPEWebKit recently released a new stable version 2.46. This version includes important changes in the graphics implementation.

Skia

The most important change in 2.46 is the introduction of Skia to replace Cairo as the 2D graphics renderer. Skia supports rendering using the GPU, which is now the default, but we also use it for CPU rendering using the same threaded rendering model we had with Cairo. The architecture hasn’t changed much for GPU rendering: we use the same tiled rendering approach, but buffers for dirty regions are rendered in the main thread as textures. The compositor waits for textures to be ready using fences and copies them directly to the compositor texture. This was the simplest approach that already resulted in much better performance, specially in the desktop with more powerful GPUs. In embedded systems, where GPUs are not so powerful, it’s still better to use the CPU with several rendering threads in most of the cases. It’s still too early to announce anything, but we are already experimenting with different models to improve the performance even more and make a better usage of the GPU in embedded devices.

Skia has received several GCC specific optimizations lately, but it’s always more optimized when built with clang. The optimizations are more noticeable in performance when using the CPU for rendering. For this reason, since version 2.46 we recommend to build WebKit with clang for the best performance. GCC is still supported, of course, and performance when built with GCC is quite good too.

HiDPI

Even though there aren’t specific changes about HiDPI in 2.46, users of high resolution screens using a device scale factor bigger than 1 will notice much better performance thanks to scaling being a lot faster on the GPU.

Accelerated canvas

The 2D canvas can be accelerated independently on whether the CPU or the GPU is used for painting layers. In 2.46 there’s a new setting WebKitSettings:enable-2d-canvas-acceleration to control the 2D canvas acceleration. In some embedded devices the combination of CPU rendering for layer tiles and GPU for the canvas gives the best performance. The 2D canvas is normally rendered into an image buffer that is then painted in the layer as an image. We changed that for the accelerated case, so that the canvas is now rendered into a texture that is copied to a compositor texture to be directly composited instead of painted into the layer as an image. In 2.46 the offscreen canvas is enabled by default.

There are more cases where accelerating the canvas is not desired, for example when the canvas size is not big enough it’s faster to use the GPU. Also when there’s going to be many operations to “download” pixels from GPU. Since this is not always easy to predict, in 2.46 we added support for the willReadFrequently canvas setting, so that when set by the application when creating the canvas it causes the canvas to be always unaccelerated.

Filters

All the CSS filters are now implemented using Skia APIs, and accelerated when possible. The most noticeable change here is that sites using blur filters are no longer slow.

Color spaces

Skia brings native support for color spaces, which allows us to greatly simplify the color space handling code in WebKit. WebKit uses color spaces in many scenarios – but especially in case of SVG and filters. In case of some filters, color spaces are necessary as some operations are simpler to perform in linear sRGB. The good example of that is feDiffuseLighting filter – it yielded wrong visual results for a very long time in case of Cairo-based implementation as Cairo doesn’t have a support for color spaces. At some point, however, Cairo-based WebKit implementation has been fixed by converting pixels to linear in-place before applying the filter and converting pixels in-place back to sRGB afterwards. Such a workarounds are not necessary anymore as with Skia, all the pixel-level operations are handled in a color-space-transparent way as long as proper color space information is provided. This not only impacts the results of some filters that are now correct, but improves performance and opens new possibilities for acceleration.

Font rendering

Font rendering is probably the most noticeable visual change after the Skia switch with mixed feedback. Some people reported that several sites look much better, while others reported problems with kerning in other sites. In other cases it’s not really better or worse, it’s just that we were used to the way fonts were rendered before.

Damage tracking

WebKit already tracks the area of the layers that has changed to paint only the dirty regions. This means that we only repaint the areas that changed but the compositor incorporates them and the whole frame is always composited and passed to the system compositor. In 2.46 there’s experimental code to track the damage regions and pass them to the system compositor in addition to the frame. Since this is experimental it’s disabled by default, but can be enabled with the runtime feature PropagateDamagingInformation. There’s also UnifyDamagedRegions feature that can be used in combination with PropagateDamagingInformation to unify the damage regions into one before passing it to the system compositor. We still need to analyze the impact of damage tracking in performance before enabling it by default. We have also started an experiment to use the damage information in WebKit compositor and avoid compositing the entire frame every time.

GPU info

Working on graphics can be really hard in Linux, there are too many variables that can result in different outputs for different users: the driver version, the kernel version, the system compositor, the EGL extensions available, etc. When something doesn’t work for some people and work for others, it’s key for us to gather as much information as possible about the graphics stack. In 2.46 we have added more useful information to webkit://gpu, like the DMA-BUF buffer format and modifier used (for GTK port and WPE when using the new API). Very often the symptom is the same, nothing is rendered in the web view, even when the causes could be very different. For those cases, it’s even more difficult to gather the info because webkit://gpu doesn’t render anything either. In 2.46 it’s possible to load webkit://gpu/stdout to get the information as a JSON directly in stdout.

Sysprof

Another common symptom for people having problems is that a particular website is slow to render, while for others it works fine. In these cases, in addition to the graphics stack information, we need to figure out where we are slower and why. This is very difficult to fix when you can’t reproduce the problem. We added initial support for profiling in 2.46 using sysprof. The code already has some marks so that when run under sysprof we get useful information about timings of several parts of the graphics pipeline.

Next

This is just the beginning, we are already working on changes that will allow us to make a better use of both the GPU and CPU for the best performance. We have also plans to do other changes in the graphics architecture to improve synchronization, latency and security. Now that we have adopted sysprof for profiling, we are also working on improvements and new tools.

by carlos garcia campos at September 27, 2024 10:30 AM



August 05, 2024

I Entered My GitHub Credentials into a Phishing Website!

by Michael Catanzaro

We all think we’re smart enough to not be tricked by a phishing attempt, right? Unfortunately, I know for certain that I’m not, because I entered my GitHub password into a lookalike phishing website a year or two ago. Oops! Fortunately, I noticed right away, so I simply changed my unique, never-reused password and moved on. But if the attacker were smarter, I might have never noticed. (This particular attack website was relatively unsophisticated and proxied only an unauthenticated view of GitHub, a big clue that something was wrong. Update: I want to be clear that it would have been very easy for the attacker to simply redirect me to the real github.com after stealing my credentials, in which case I would not have noticed the attack and would not have known to change my password.)

You might think multifactor authentication is the best defense against phishing. Nope. Although multifactor authentication is a major security improvement over passwords alone, and the particular attack that tricked me did not attempt to subvert multifactor authentication, it’s actually unfortunately pretty easy for phishers to defeat most multifactor authentication if they wish to do so:

  • Multifactor authentication based on SMS is extremely insecure (because SS7 is insecure)
  • Multifactor authentication based on phone calls is also insecure (because SIM swapping isn’t going away; determined attackers will steal your phone number if it’s an obstacle to them)
  • Multifactor authentication based on authenticator apps (using TOTP or HOTP) is much better in general, but still fails against phishing. When you paste your one-time access code into a phishing website, the phishing website can simply “proxy” the access code you kindly provided to them by submitting it to the real website. This only allows authenticating once, but once is usually enough.

Fortunately, there is a solution: passkeys. Based on FIDO2 and WebAuthn, passkeys resist phishing because the authentication process depends on the domain of the service that you’re actually connecting to. If you think you’re visiting https://example.com, but you’re actually visiting a copycat website with a Cyrillic а instead of Latin a, then no worries: the authentication will fail, and the frustrated attacker will have achieved nothing.

The most popular form of passkey is local biometric authentication running on your phone, but any hardware security key (e.g. YubiKey) is also a good bet.

target.com Is More Secure than Your Bank!

I am not joking when I say that target.com is more secure than your bank (which is probably still relying on SMS or phone calls, and maybe even allows you to authenticate using easily-guessable security questions):

Screenshot of target.com prompting the user to register a passkey
target.com is introducing passkeys!

Good job for supporting passkeys, Target.

It’s probably perfectly fine for Target to support passkeys alongside passwords indefinitely. Higher-security websites that want to resist phishing (e.g. your employer’s SSO service) should consider eventually allowing only passkeys.

No Passkeys in WebKitGTK

Unfortunately for GNOME users, WebKitGTK does not yet support WebAuthn, so passkeys will not work in GNOME Web (Epiphany). That’s my browser of choice, so I’ve never touched a passkey before and don’t actually know how well they work in practice. Maybe do as I say and not as I do? If you require high security, you will unfortunately need to use Firefox or Chrome instead, at least for the time being.

Why Was Michael Visiting a Fake github.com?

The fake github.com appeared higher than the real github.com in the DuckDuckGo search results for whatever I was looking for at the time. :(

by Michael Catanzaro at August 05, 2024 01:00 PM



April 16, 2024

From WebKit/GStreamer to rust-av, a journey on our stack’s layers

by Philippe Normand

In this post I’ll try to document the journey starting from a WebKit issue and ending up improving third-party projects that WebKitGTK and WPEWebKit depend on.

I’ve been working on WebKit’s GStreamer backends for a while. Usually some new feature needed on WebKit side would trigger work …

by Philippe Normand at April 16, 2024 08:15 PM



February 20, 2024

A Clarification About WebKit Switching to Skia

by Carlos García Campos

In the previous post I talked about the plans of the WebKit ports currently using Cairo to switch to Skia for 2D rendering. Apple ports don’t use Cairo, so they won’t be switching to Skia. I understand the post title was confusing, I’m sorry about that. The original post has been updated for clarity.

by carlos garcia campos at February 20, 2024 06:11 PM



February 19, 2024

WebKitGTK and WPEWebKit Switching to Skia for 2D Graphics Rendering

by Carlos García Campos

In recent years we have had an ongoing effort to improve graphics performance of the WebKit GTK and WPE ports. As a result of this we shipped features like threaded rendering, the DMA-BUF renderer, or proper vertical retrace synchronization (VSync). While these improvements have helped keep WebKit competitive, and even perform better than other engines in some scenarios, it has been clear for a while that we were reaching the limits of what can be achieved with a CPU based 2D renderer.

There was an attempt at making Cairo support GPU rendering, which did not work particularly well due to the library being designed around stateful operation based upon the PostScript model—resulting in a convenient and familiar API, great output quality, but hard to retarget and with some particularly slow corner cases. Meanwhile, other web engines have moved more work to the GPU, including 2D rendering, where many operations are considerably faster.

We checked all the available 2D rendering libraries we could find, but none of them met all our requirements, so we decided to try writing our own library. At the beginning it worked really well, with impressive results in performance even compared to other GPU based alternatives. However, it proved challenging to find the right balance between performance and rendering quality, so we decided to try other alternatives before continuing with its development. Our next option had always been Skia. The main reason why we didn’t choose Skia from the beginning was that it didn’t provide a public library with API stability that distros can package and we can use like most of our dependencies. It still wasn’t what we wanted, but now we have more experience in WebKit maintaining third party dependencies inside the source tree like ANGLE and libwebrtc, so it was no longer a blocker either.

In December 2023 we made the decision of giving Skia a try internally and see if it would be worth the effort of maintaining the project as a third party module inside WebKit. In just one month we had implemented enough features to be able to run all MotionMark tests. The results in the desktop were quite impressive, getting double the score of MotionMark global result. We still had to do more tests in embedded devices which are the actual target of WPE, but it was clear that, at least in the desktop, with this very initial implementation that was not even optimized (we kept our current architecture that is optimized for CPU rendering) we got much better results. We decided that Skia was the option, so we continued working on it and doing more tests in embedded devices. In the boards that we tried we also got better results than CPU rendering, but the difference was not so big, which means that with less powerful GPUs and with our current architecture designed for CPU rendering we were not that far from CPU rendering. That’s the reason why we managed to keep WPE competitive in embeeded devices, but Skia will not only bring performance improvements, it will also simplify the code and will allow us to implement new features . So, we had enough data already to make the final decision of going with Skia.

In February 2024 we reached a point in which our Skia internal branch was in an “upstreamable” state, so there was no reason to continue working privately. We met with several teams from Google, Sony, Apple and Red Hat to discuss with them about our intention to switch from Cairo to Skia, upstreaming what we had as soon as possible. We got really positive feedback from all of them, so we sent an email to the WebKit developers mailing list to make it public. And again we only got positive feedback, so we started to prepare the patches to import Skia into WebKit, add the CMake integration and the initial Skia implementation for the WPE port that already landed in main.

We will continue working on the Skia implementation in upstream WebKit, and we also have plans to change our architecture to better support the GPU rendering case in a more efficient way. We don’t have a deadline, it will be ready when we have implemented everything currently supported by Cairo, we don’t plan to switch with regressions. We are focused on the WPE port for now, but at some point we will start working on GTK too and other ports using cairo will eventually start getting Skia support as well.

by carlos garcia campos at February 19, 2024 01:27 PM



July 08, 2023

GNOME Web Canary is back

by Philippe Normand

This is a short PSA post announcing the return of the GNOME Web Canary builds. Read on for the crunchy details.

A couple years ago I was blogging about the GNOME Web Canary flavor. In summary this special build of GNOME Web provides a preview of the upcoming version of …

by Philippe Normand at July 08, 2023 08:00 AM



April 03, 2023

WebKitGTK accelerated compositing rendering

by Carlos García Campos

Initial accelerated compositing support

When accelerated compositing support was added to WebKitGTK, there was only X11. Our first approach was quite simple, we sent the web view widget Xwindow ID to the web process to be used as rendering target using GLX. This was very efficient, but soon we realized it broke the GTK rendering model so it was not possible to use a web view inside a GtkOverlay, for example, to show status messages on top. The solution was to use a redirected Xcomposite window in the web process, and use its ID as the render target using GLX. The pixmap ID of the redirected Xcomposite window was sent to the UI process to be painted in the web view widget using a Cairo Xlib surface. Since the rendering happens in the web process, this approach required to use Xdamage to monitor when the redirected Xcomposite window was updated to schedule a web view redraw.

Wayland support

To support accelerated compositing under Wayland we initially added a nested Wayland compositor running in the UI process. The web process connected to the nested Wayland compositor and created a surface to be used as the rendering target using EGL. The good thing about this approach compared to the X11 one, is that we can create an EGLImage from Wayland buffers and use a GDK GL context to paint the contents in the web view. This is more efficient than X11 because we can use OpenGL both in web and UI processes.
WPE, when using the fdo backend, uses the same approach of running a nested Wayland compositor, but in a more efficient way, using DMABUF instead of Wayland buffers when available. So, we decided to use libwpe in the GTK port only for rendering under Wayland, and eventually remove our Wayland compositor implementation.
Before the removal of the custom Wayland compositor we had all these possible combinations:

  • UI Process
    • X11: Cairo Xlib surface
    • Wayland: EGL
  • Web Process
    • X11: GLX using redirected Xwindow
    • Wayland (nested Wayland compositor): EGL using Wayland surface
    • Wayland (libwpe): EGL using libwpe to get the Wayland surface

To reduce a bit the differences, and to make it easier to support WebGL with ANGLE we decided to change X11 to prefer EGL if possible, falling back to GLX only if EGL failed.

GTK4

GTK4 was released and we added support for it. The fact that GTK4 uses GL by default should make the rendering more efficient in accelerated compositing mode. This is definitely true under Wayland, because we are using a GL context already, so we just keep passing a texture to GTK to paint the contents in the web view. However, in the case of X11 we still have a Cairo Xlib surface that GTK paints into a Cairo image surface to be uploaded to the GPU. With GTK4 now we have two more combinations in the UI process side X11 + GTK3, X11 + GTK4, Wayland + GTK3 and Wayland + GTK4.

Reducing all the combinations to (almost) one: DMABUF

All these combinations to support the different platforms made it quite difficult to maintain, every time we get a bug report about something not working in accelerated compositing mode we have to figure out the combination actually used by the reporter, GTK3 or GTK4? X11 or Wayland? using EGL or GLX? custom Wayland compositor or libwpe? driver? version? etc.

We are already using DMABUF in WebKit for different things like WebGL and media rendering, so we thought that we could also use it for sharing the rendered buffer between the web and UI processes. That would be a more efficient solution but it would also drastically reduce the amount of combinations to maintain. The web process always uses the surfaceless platform, so it doesn’t matter if it’s under Wayland or X11. Then we create a surfaceless context as the render target and use EGL and GBM APIs to export the contents as a DMABUF buffer. The UI process imports the DMABUF buffer using EGL and GBM too, to be passed to GTK as a texture that is painted in the web view.

This theoretically recudes all the previous combinations to just one (note that we removed GLX support entirely, making EGL a requirement for accelerated compositing), but there’s a problem under X11: GTK3 doesn’t support EGL on X11 and GTK4 defaults to EGL but falls back to GLX if it doesn’t find an EGL config that perfectly matches the screen visual. In my system it never finds that EGL config because mesa doesn’t expose any 32 bit depth config. So, in the case of GTK3 we have to manually download the buffer to CPU and paint normally using Cairo, but in the case of GTK4 + GLX, GTK uploads the buffer again to be painted using GLX. I don’t think it’s possible to force GTK to use EGL from the API, but at least you can use GDK_DEBUG=gl-egl.

WebKitGTK 2.41.1

WebKitGTK 2.41.1 is the first unstable release of this cycle and already includes the DMABUF support that is used by default. We encourage everybody to try it out and provide feedback or report any issue. Please, export the contents of webkit://gpu and attach it to the bug report when reporting any problem related to graphics. To check if the issue is a regression of the DMABUF implementation you can use WEBKIT_DISABLE_DMABUF_RENDERER=1 to use the WPE renderer or X11 instead. This environment variable and the WPE render/X11 code will be eventually removed if DMABUF works fine.

WPE

If this approach works fine we plan to use something similar for the WPE port and get rid of the nested Wayland compositor there too.

by carlos garcia campos at April 03, 2023 08:57 AM



March 21, 2023

WebKitGTK API for GTK 4 Is Now Stable

by Michael Catanzaro

With the release of WebKitGTK 2.40.0, WebKitGTK now finally provides a stable API and ABI for GTK 4 applications. The following API versions are provided:

  • webkit2gtk-4.0: this API version uses GTK 3 and libsoup 2. It is obsolete and users should immediately port to webkit2gtk-4.1. To get this with WebKitGTK 2.40, build with -DPORT=GTK -DUSE_SOUP2=ON.
  • webkit2gtk-4.1: this API version uses GTK 3 and libsoup 3. It contains no other changes from webkit2gtk-4.0 besides the libsoup version. With WebKitGTK 2.40, this is the default API version that you get when you build with -DPORT=GTK. (In 2.42, this might require a different flag, e.g. -DUSE_GTK3=ON, which does not exist yet.)
  • webkitgtk-6.0: this API version uses GTK 4 and libsoup 3. To get this with WebKitGTK 2.40, build with -DPORT=GTK -DUSE_GTK4=ON. (In 2.42, this might become the default API version.)

WebKitGTK 2.38 had a different GTK 4 API version, webkit2gtk-5.0. This was an unstable/development API version and it is gone in 2.40, so applications using it will break. Fortunately, that should be very few applications. If your operating system ships GNOME 42, or any older version, or the new GNOME 44, then no applications use webkit2gtk-5.0 and you have no extra work to do. But for operating systems that ship GNOME 43, webkit2gtk-5.0 is used by gnome-builder, gnome-initial-setup, and evolution-data-server:

  • For evolution-data-server 3.46, use this patch which applies on evolution-data-server 3.46.4.
  • For gnome-initial-setup 43, use this patch which applies on gnome-initial-setup 43.2. (Update: for your convenience, this patch will be included in gnome-initial-setup 43.3.)
  • For gnome-builder 43, all required changes are present in version 43.7.

Remember, patching is only needed for GNOME 43. Other versions of GNOME will have no problems with WebKitGTK 2.40.

There is no proper online documentation yet, but in the meantime you can view the markdown source for the migration guide to help you with porting your applications. Although the API is now stable and it is close to feature parity with the GTK 3 version, there are some problems to be aware of:

Big thanks to everyone who helped make this possible.

by Michael Catanzaro at March 21, 2023 07:01 PM



February 16, 2023

WebRTC in WebKitGTK and WPE, status updates, part I

by Philippe Normand

Some time ago we at Igalia embarked on the journey to ship a GStreamer-powered WebRTC backend. This is a long journey, it is not over, but we made some progress …

by Philippe Normand at February 16, 2023 08:30 PM



November 04, 2022

Stop Using QtWebKit

by Michael Catanzaro

Today, WebKit in Linux operating systems is much more secure than it used to be. The problems that I previously discussed in this old, formerly-popular blog post are nowadays a thing of the past. Most major Linux operating systems now update WebKitGTK and WPE WebKit on a regular basis to ensure known vulnerabilities are fixed. (Not all Linux operating systems include WPE WebKit. It’s basically WebKitGTK without the dependency on GTK, and is the best choice if you want to use WebKit on embedded devices.) All major operating systems have removed older, insecure versions of WebKitGTK (“WebKit 1”) that were previously a major security problem for Linux users. And today WebKitGTK and WPE WebKit both provide a webkit_web_context_set_sandbox_enabled() API which, if enabled, employs Linux namespaces to prevent a compromised web content process from accessing your personal data, similar to Flatpak’s sandbox. (If you are a developer and your application does not already enable the sandbox, you should fix that!)

Unfortunately, QtWebKit has not benefited from these improvements. QtWebKit was removed from the upstream WebKit codebase back in 2013. Its current status in Fedora is, unfortunately, representative of other major Linux operating systems. Fedora currently contains two versions of QtWebKit:

  • The qtwebkit package contains upstream QtWebKit 2.3.4 from 2014. I believe this is used by Qt 4 applications. For avoidance of doubt, you should not use applications that depend on a web engine that has not been updated in eight years.
  • The newer qt5-qtwebkit contains Konstantin Tokarev’s fork of QtWebKit, which is de facto the new upstream and without a doubt the best version of QtWebKit available currently. Although it has received occasional updates, most recently 5.212.0-alpha4 from March 2020, it’s still based on WebKitGTK 2.12 from 2016, and the release notes bluntly state that it’s not very safe to use. Looking at WebKitGTK security advisories beginning with WSA-2016-0006, I manually counted 507 CVEs that have been fixed in WebKitGTK 2.14.0 or newer.

These CVEs are mostly (but not exclusively) remote code execution vulnerabilities. Many of those CVEs no doubt correspond to bugs that were introduced more recently than 2.12, but the exact number is not important: what’s important is that it’s a lot, far too many for backporting security fixes to be practical. Since qt5-qtwebkit is two years newer than qtwebkit, the qtwebkit package is no doubt in even worse shape. And because QtWebKit does not have any web process sandbox, any remote code execution is game over: an attacker that exploits QtWebKit gains full access to your user account on your computer, and can steal or destroy all your files, read all your passwords out of your password manager, and do anything else that your user account can do with your computer. In contrast, with WebKitGTK or WPE WebKit’s web process sandbox enabled, attackers only get access to content that’s mounted within the sandbox, which is a much more limited environment without access to your home directory or session bus.

In short, it’s long past time for Linux operating systems to remove QtWebKit and everything that depends on it. Do not feed untrusted data into QtWebKit. Don’t give it any HTML that you didn’t write yourself, and certainly don’t give it anything that contains injected data. Uninstall it and whatever applications depend on it.

Update: I forgot to mention what to do if you are a developer and your application still uses QtWebKit. You should ensure it uses the most recent release of QtWebEngine for Qt 6. Do not use old versions of Qt 6, and do not use QtWebEngine for Qt 5.

by Michael Catanzaro at November 04, 2022 05:20 PM



October 03, 2022

Mon 2022/Oct/03

by Claudio Saavedra

The series on the WPE port by the WebKit team at Igalia grows, with several new articles that go deep into different areas of the engine:

These articles are an interesting read not only if you're working on WebKit, but also if you are curious on how a modern browser engine works and some of the moving parts beneath the surface. So go check them out!

On a related note, the WebKit team is always on the lookout for talent to join us. Experience with WebKit or browsers is not necessarily a must, as we know from experience that anyone with a strong C/C++ background and enough curiosity will be able to ramp up and start contributing soon enough. If these articles spark your curiosity, feel free to reach out to me to find out more or to apply directly!

October 03, 2022 11:28 AM



July 01, 2022

Fri 2022/Jul/01

by Claudio Saavedra

I wrote a technical overview of the WebKit WPE project for the WPE WebKit blog, for those interested in WPE as a potential solution to the problem of browsers in embedded devices.

This article begins a series of technical writeups on the architecture of WPE, and we hope to publish during the rest of the year further articles breaking down different components of WebKit, including graphics and other subsystems, that will surely be of great help for those interested in getting more familiar with WebKit and its internals.

July 01, 2022 10:39 AM



August 02, 2021

Introducing the GNOME Web Canary flavor

by Philippe Normand

Today I am happy to unveil GNOME Web Canary which aims to provide bleeding edge, most likely very unstable builds of Epiphany, depending on daily builds of the WebKitGTK development version. Read on to know more about this.

Until recently the GNOME Web browser was available for end-users in two …

by Philippe Normand at August 02, 2021 12:00 PM



November 29, 2020

Accurate Conclusions from Bogus Data: Methodological Issues in “Collaboration in the open-source arena: The WebKit case”

by Michael Catanzaro

Nearly five years ago, when I was in grad school, I stumbled across the paper Collaboration in the open-source arena: The WebKit case when trying to figure out what I would do for a course project in network theory (i.e. graph theory, not computer networking; I’ll use the words “graph” and “network” interchangeably). The paper evaluates collaboration networks, which are graphs where collaborators are represented by nodes and relationships between collaborators are represented by edges. Our professor had used collaboration networks as examples during lecture, so it seemed at least mildly relevant to our class, and I wound up writing a critique on this paper for the class project. In this paper, the authors construct collaboration networks for WebKit by examining the project’s changelog files to define relationships between developers. They perform “community detection” to visually group developers who work closely together into separate clusters in the graphs. Then, the authors use those graphs to arrive at various conclusions about WebKit (e.g. “[e]ven if Samsung and Apple are involved in expensive patent wars in the courts and stopped collaborating on hardware components, their contributions remained strong and central within the WebKit open source project,” regarding the period from 2008 to 2013).

At the time, I contacted the authors to let them know about some serious problems I found with their work. Then I left the paper sitting in a short-term to-do pile on my desk, where it has been sitting since Obama was president, waiting for me to finally write this blog post. Unfortunately, nearly five years later, the authors’ email addresses no longer work, which is not very surprising after so long — since I’m no longer a student, the email I originally used to contact them doesn’t work anymore either — so I was unable to contact them again to let them know that I was finally going to publish this blog post. Anyway, suffice to say that the conclusions of the paper were all correct; however, the networks used to arrive at those conclusions suffered from three different mistakes, each of which was, on its own, serious enough to invalidate the entire work.

So if the analysis of the networks was bogus, how did the authors arrive at correct conclusions anyway? The answer is confirmation bias. The study was performed by visually looking at networks and then coming to non-rigorous conclusions about the networks, and by researching the WebKit community to learn what is going on with the major companies involved in the project. The authors arrived at correct conclusions because they did a good job at the latter, then saw what they wanted to see in the graphs.

I don’t want to be too harsh on the authors of this paper, though, because they decided to publish their raw data and methodology on the internet. They even published the python scripts they used to convert WebKit changelogs into collaboration graphs. Had they not done so, there is no way I would have noticed the third (and most important) mistake that I’ll discuss below, and I wouldn’t have been able to confirm my suspicions about the second mistake. You would not be reading this right now, and likely nobody would ever have realized the problems with the paper. The authors of most scientific papers are not nearly so transparent: many researchers today consider their source code and raw data to be either proprietary secrets to be guarded, or simply not important enough to merit publication. The authors of this paper deserve to be commended, not penalized, for their openness. Mistakes are normal in research papers, and open data is by far the best way for us to be able to detect mistakes when they happen.

Collaboration network
A collaboration network from the paper. The paper reports that this network represents collaboration between September 2008 (when Google began contributing to WebKit) and February 2011 (the departure of Nokia from the project). Because the authors posted their data online, I noticed that this was a mistake in the paper: the graph actually represents the period between February 2011 and July 2012. The paper’s analysis of this graph is therefore questionable, but note this was only a minor mistake compared to the three major mistakes that impact this network. Note the suspiciously-high number of unaffiliated (“Other”) contributors in a corporate-dominated project.

The rest of this blog post is a simplified version of my original school paper from 2016. I’ve removed maybe half the original content, including some flowery academic language and unnecessary references to class material (e.g. “community detection was performed using fast modularity maximization to generate an alternate visualization of the network,” good for high scores on class papers, not so good for blog posts). But rewriting everything to be informal takes a long time, and I want to finish this today so it’s not still on my desk five more years from now, so the rest of this blog post is still going to be much more formal than normal. Oh well. Tone shift now!

We (“we” means “I”) examine various methodological issues discovered by analyzing the paper. The first section discusses the effects on the collaboration network of choosing a poor definition of collaboration. The second section discusses a major source of error in detecting the company affiliation of many contributors. The third section describes a serious mistake in the data collection process. Each of these issues is quite severe, and any one alone calls into question the validity of the entire study. It must be noted that such issues are not necessarily unique to this paper, and must be kept in mind for all future studies that utilize collaboration networks.

Mistake #1: Poorly-defined Collaboration

The precise definition used to model collaboration has tremendous impact on the usefulness of the resultant collaboration network. Many collaboration networks are built using definitions of collaboration that are self-evidently useful, where there is little doubt that edges in the network represent real-world collaboration. The paper adopts an approach to building collaboration networks where developers are represented by nodes, and an edge exists between two nodes if the corresponding developers modified the same source code file during the time period under consideration for the construction of the network. However, it is not clear that this definition of collaboration is actually useful. Consider that it is a regular occurrence for developers who do not know each other and may have never communicated to modify the same files. Consider also that modifying a common file does not necessarily reflect any shared interest in a particular portion of the software project. For instance, a file might be modified when making an interface change in another file, or when fixing a build error occurring on a particular platform. Such occurrences are, in fact, extremely common in the WebKit project. Additionally, consider that there exist particular source code files that are unusually central to the project, and must be modified more frequently than other files. It is highly likely that almost all developers will at one point or another make some change in such a file, and therefore be connected via a collaboration edge to all other developers who have ever modified that file. (My original critique shows a screenshot of the revision history of WebPageProxy.cpp, to demonstrate that the developers modifying this file were working on unrelated projects.)

It is true, as assumed by the paper, that particular developers work on different portions of the WebKit source code, and collaborate more with particular other developers. For instance, developers who work for the same company typically, though not always, collaborate most with other developers from that same company. However, the paper’s naive definition of collaboration should ensure that most developers will be considered to have collaborated equally with most other developers, regardless of the actual degree of collaboration. For instance, consider developers A and B who regularly collaborate on a particular source file. Now, developer C, who works on a platform that does not use this file and would not ordinarily need to modify it, makes a change to some cross-platform interface in another file that requires updating this file. Developer C is now considered to have collaborated with developers A and B on this file! Clearly, this is not a desirable result, as developers A and B have collaborated far more on the development of the file. Moreover, consider that an edge exists between two developers in the collaboration network if they have ever both modified any file anywhere in WebKit during the time period under review; then we can expect to form a network that is almost complete (a “full” graph where edges exists between most nodes). It is evident that some method of weighting collaboration between different contributors would be desirable, as the unweighted collaboration network does not seem useful.

One might argue that the networks presented in the paper clearly show developers exist in subcommunities on the peripheries of the network, that the network is clearly not complete, and that therefore this definition of collaboration sufficed, at least to some extent. However, this is only due to another methodological error in the study. Mistake #3, discussed later, explains how the study managed to produce collaboration networks with noticeable subcommunities despite these issues.

We note that the authors chose this same definition of collaboration in their more recent work on OpenStack, so there exist multiple studies using this same flawed definition of collaboration. We speculate that this definition of collaboration is unlikely to be more suitable for OpenStack or for other software projects than it is for WebKit. The software engineering research community must explore alternative models of collaboration when undertaking future studies of software development collaboration networks in order to more accurately reflect collaboration.

Mistake #2: Misdetected Contributor Affiliation

One difficulty when building collaboration networks is the need to correctly match each contributor with the correct company affiliation. Although many free software projects are dominated by unaffiliated contributors, others, like WebKit, are primarily developed by paid contributors. Looking at the number of times a particular email domain appears in WebKit changelog entries made during 2015, most contributors commit using corporate emails, but many developers commit to WebKit using personal email accounts, such as Gmail accounts; additionally, many developers use generic webkit.org email aliases, which were previously available to active WebKit contributors. These developers may or may not be affiliated with companies that contribute to the project. Use of personal email addresses is a source of inaccuracy when constructing collaboration networks, as it results in an undercount of corporate contributions.  We can expect this issue has led to serious inaccuracies in the reported collaboration networks.

This substantial source of error is neither mentioned nor accounted for; all contributors using such email accounts were therefore miscategorized as unaffiliated. However, the authors clearly recognized this issue, as it has been accounted for in their more recent work covering OpenStack by cross-referencing email addresses from git revision history with a database containing corporate affiliations maintained by the OpenStack Foundation. Unfortunately, no such effort was made for the WebKit data set.

The WebKit project was previously dominated by contributors with chromium.org email domains. This domain is equivalent to webkit.org in that it can be used by contributors to the Chromium project regardless of corporate affiliation; however, most contributors with Chromium emails are actually Google employees. The high use of Chromium emails by Google employees appears to have led to a dramatic — by roughly an entire order of magnitude — undercount of Google’s contributors to the WebKit project, as only contributors with google.com emails were considered to be Google employees. The vast majority of Google employees used chromium.org emails, and so were counted as unaffiliated developers. This explains the extraordinarily high number of unaffiliated developers in the networks presented by the paper, despite the fact that WebKit development is, in reality, dominated by corporate contributors.

Mistake #3: Missing Most Changelog Data

The paper incorrectly claims to have gathered its data from both WebKit’s Subversion revision history and from its changelog files. We must draw a distinction between changelog entries and Subversion revision history. Changelog entries are inserted into changelog files that are committed into the Subversion repository; they are completely separate from the Subversion history. Each subproject within the WebKit project has its own set of changelog files used to record changes under the corresponding directory.

In fact, the paper processed only the changelog files. This was actually a good choice, as WebKit’s changelog files are much more accurate than the Subversion history, for two reasons. Firstly, it is easy for a contributor to change the email address entered into a changelog file, e.g. after a change in company affiliation. However, it is difficult to change the email address used to commit to Subversion, as this requires requesting a new Subversion account from the Subversion administrator; accordingly, contributors are more likely to use older email addresses, lacking accurate company affiliation, in Subversion revisions than in changelog files.  Secondly, many Subversion revisions are not directly committed by contributors, but rather are actually committed by the commit queue bot, which runs various tests before committing the revision. Subversion revisions are also, more rarely, committed by a completely different contributor than the patch author. In both cases, the proper contributor’s name will appear in only the changelog file, and not the Subversion data. Some developers are dramatically more likely to use the commit queue than others. Various other reviews of WebKit contribution history that examine data from Subversion history rather than from changelog files are flawed for this reason. Fortunately, by relying on changelog files rather than Subversion metadata, the authors avoid this problem.

Unfortunately, a serious error was made in processing the changelog data. WebKit has many different sets of changelog files, stored in various project subdirectories (JavaScriptCore, WebCore, WebKit, etc.), as well as toplevel changelogs stored in the root directory of the project. Regrettably, the authors were unaware of the changelogs in subdirectories, and based their analysis only on the toplevel changelogs, which contain only changes that occurred in subdirectories that lack their own changelog files. In practice, this inadvertently restricted the scope of the analysis to a very small minority of changes, primarily to build system files, manual tests, and the WebKit website. That is, the reported collaboration networks do not reflect collaboration on any actual source code files. All source code files are contained in subdirectories with their own changelog files, and therefore no source code files were actually considered in the analysis of collaboration on source code changes.

We speculate that the analysis’s focus on build system files likely exaggerates the effects of clustering in the network, as different companies used different build systems and thus were less likely to edit the build systems used by other companies, and that an analysis based on the correct data would display less of a clustering effect. Certainly, there would be dramatically more edges in the already-dense networks, because an edge exists between two developers if there exists any one file in WebKit that both developers have modified. Omitting all of the source code files from the analysis therefore dramatically reduces the likelihood of edges existing between nodes in the network.

Conclusion

We found that the original study was impacted by an unsuitable definition of collaboration used to build the collaboration networks, severe errors in counting contributor affiliation (including the classification of most Google employees as unaffiliated developers), and the omission of almost all the required data from the analysis, including all data on modifications to source code files. The authors constructed and studied essentially meaningless networks. Nevertheless, the authors were able to derive many accurate conclusions about the WebKit project from their inaccurate collaboration networks. Such conclusions illustrate the dangers of seeking to find particular meanings or explanations through visual inspection of collaboration networks. Researchers must work forwards from the collaboration networks to arrive at their conclusions, rather than backwards by attempting to match the networks to conclusions gained from prior knowledge.

Original Report

Wow, OK, you actually read this far? Since this blog post criticizes an academic paper, and since this blog post does not include various tables and examples that support my arguments, I’ve attached my original analysis in full. It is a boring, student-quality grad school project written with the objective of scoring the highest-possible grade in a class rather than for clarity, and you probably don’t want to look at it unless you are investigating the paper in detail. (If you download that, note that I no longer work for Igalia, and the paper was not authorized by Igalia either; I used my company email to disclose my affiliation and maybe impress my professor a little.) Phew, now I can finally remove this from my desk!

by Michael Catanzaro at November 29, 2020 11:46 PM



Catching up on WebKit GStreamer WebAudio backends maintenance

by Philippe Normand

Over the past few months the WebKit development team has been working on modernizing support for the WebAudio specification. This post highlights some of the changes that were recently merged, focusing on the GStreamer ports.

My fellow WebKit colleague, Chris Dumez, has been very active lately, updating the WebAudio implementation …

by Philippe Normand at November 29, 2020 12:45 PM



October 29, 2020

Thu 2020/Oct/29

by Claudio Saavedra

In this line of work, we all stumble at least once upon a problem that turns out to be extremely elusive and very tricky to narrow down and solve. If we&aposre lucky, we might have everything at our disposal to diagnose the problem but sometimes that&aposs not the case – and in embedded development it&aposs often not the case. Add to the mix proprietary drivers, lack of debugging symbols, a bug that&aposs very hard to reproduce under a controlled environment, and weeks in partial confinement due to a pandemic and what you have is better described as a very long lucid nightmare. Thankfully, even the worst of nightmares end when morning comes, even if sometimes morning might be several days away. And when the fix to the problem is in an inimaginable place, the story is definitely one worth telling.

The problem

It all started with one of Igalia&aposs customers deploying a WPE WebKit-based browser in their embedded devices. Their CI infrastructure had detected a problem caused when the browser was tasked with creating a new webview (in layman terms, you can imagine that to be the same as opening a new tab in your browser). Occasionally, this view would never load, causing ongoing tests to fail. For some reason, the test failure had a reproducibility of ~75% in the CI environment, but during manual testing it would occur with less than a 1% of probability. For reasons that are beyond the scope of this post, the CI infrastructure was not reachable in a way that would allow to have access to running processes in order to diagnose the problem more easily. So with only logs at hand and less than a 1/100 chances of reproducing the bug myself, I set to debug this problem locally.

Diagnosis

The first that became evident was that, whenever this bug would occur, the WebKit feature known as web extension (an application-specific loadable module that is used to allow the program to have access to the internals of a web page, as well to enable customizable communication with the process where the page contents are loaded – the web process) wouldn&apost work. The browser would be forever waiting that the web extension loads, and since that wouldn&apost happen, the expected page wouldn&apost load. The first place to look into then is the web process and to try to understand what is preventing the web extension from loading. Enter here, our good friend GDB, with less than spectacular results thanks to stripped libraries.

#0  0x7500ab9c in poll () from target:/lib/libc.so.6
#1  0x73c08c0c in ?? () from target:/usr/lib/libEGL.so.1
#2  0x73c08d2c in ?? () from target:/usr/lib/libEGL.so.1
#3  0x73c08e0c in ?? () from target:/usr/lib/libEGL.so.1
#4  0x73bold6a8 in ?? () from target:/usr/lib/libEGL.so.1
#5  0x75f84208 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#6  0x75fa0b7e in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#7  0x7561eda2 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#8  0x755a176a in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#9  0x753cd842 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#10 0x75451660 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#11 0x75452882 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#12 0x75452fa8 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#13 0x76b1de62 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#14 0x76b5a970 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#15 0x74bee44c in g_main_context_dispatch () from target:/usr/lib/libglib-2.0.so.0
#16 0x74bee808 in ?? () from target:/usr/lib/libglib-2.0.so.0
#17 0x74beeba8 in g_main_loop_run () from target:/usr/lib/libglib-2.0.so.0
#18 0x76b5b11c in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#19 0x75622338 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
#20 0x74f59b58 in __libc_start_main () from target:/lib/libc.so.6
#21 0x0045d8d0 in _start ()

From all threads in the web process, after much tinkering around it slowly became clear that one of the places to look into is that poll() call. I will spare you the details related to what other threads were doing, suffice to say that whenever the browser would hit the bug, there was a similar stacktrace in one thread, going through libEGL to a call to poll() on top of the stack, that would never return. Unfortunately, a stripped EGL driver coming from a proprietary graphics vendor was a bit of a showstopper, as it was the inability to have proper debugging symbols running inside the device (did you know that a non-stripped WebKit library binary with debugging symbols can easily get GDB and your device out of memory?). The best one could do to improve that was to use the gcore feature in GDB, and extract a core from the device for post-mortem analysis. But for some reason, such a stacktrace wouldn&apost give anything interesting below the poll() call to understand what&aposs being polled here. Did I say this was tricky?

What polls?

Because WebKit is a multiprocess web engine, having system calls that signal, read, and write in sockets communicating with other processes is an everyday thing. Not knowing what a poll() call is doing and who is it that it&aposs trying to listen to, not very good. Because the call is happening under the EGL library, one can presume that it&aposs graphics related, but there are still different possibilities, so trying to find out what is this polling is a good idea.

A trick I learned while debugging this is that, in absence of debugging symbols that would give a straightforward look into variables and parameters, one can examine the CPU registers and try to figure out from them what the parameters to function calls are. Let&aposs do that with poll(). First, its signature.

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

Now, let's examine the registers.

(gdb) f 0
#0  0x7500ab9c in poll () from target:/lib/libc.so.6
(gdb) info registers
r0             0x7ea55e58	2124766808
r1             0x1	1
r2             0x64	100
r3             0x0	0
r4             0x0	0

Registers r0, r1, and r2 contain poll()&aposs three parameters. Because r1 is 1, we know that there is only one file descriptor being polled. fds is a pointer to an array with one element then. Where is that first element? Well, right there, in the memory pointed to directly by r0. What does struct pollfd look like?

struct pollfd {
  int   fd;         /* file descriptor */
  short events;     /* requested events */
  short revents;    /* returned events */
};

What we are interested in here is the contents of fd, the file descriptor that is being polled. Memory alignment is again in our side, we don&apost need any pointer arithmetic here. We can inspect directly the register r0 and find out what the value of fd is.

(gdb) print *0x7ea55e58
$3 = 8

So we now know that the EGL library is polling the file descriptor with an identifier of 8. But where is this file descriptor coming from? What is on the other end? The /proc file system can be helpful here.

# pidof WPEWebProcess
1944 1196
# ls -lh /proc/1944/fd/8
lrwx------    1 x x      64 Oct 22 13:59 /proc/1944/fd/8 -> socket:[32166]

So we have a socket. What else can we find out about it? Turns out, not much without the unix_diag kernel module, which was not available in our device. But we are slowly getting closer. Time to call another good friend.

Where GDB fails, printf() triumphs

Something I have learned from many years working with a project as large as WebKit, is that debugging symbols can be very difficult to work with. To begin with, it takes ages to build WebKit with them. When cross-compiling, it&aposs even worse. And then, very often the target device doesn&apost even have enough memory to load the symbols when debugging. So they can be pretty useless. It&aposs then when just using fprintf() and logging useful information can simplify things. Since we know that it&aposs at some point during initialization of the web process that we end up stuck, and we also know that we&aposre polling a file descriptor, let&aposs find some early calls in the code of the web process and add some fprintf() calls with a bit of information, specially in those that might have something to do with EGL. What can we find out now?

Oct 19 10:13:27.700335 WPEWebProcess[92]: Starting
Oct 19 10:13:27.720575 WPEWebProcess[92]: Initializing WebProcess platform.
Oct 19 10:13:27.727850 WPEWebProcess[92]: wpe_loader_init() done.
Oct 19 10:13:27.729054 WPEWebProcess[92]: Initializing PlatformDisplayLibWPE (hostFD: 8).
Oct 19 10:13:27.730166 WPEWebProcess[92]: egl backend created.
Oct 19 10:13:27.741556 WPEWebProcess[92]: got native display.
Oct 19 10:13:27.742565 WPEWebProcess[92]: initializeEGLDisplay() starting.

Two interesting findings from the fprintf()-powered logging here: first, it seems that file descriptor 8 is one known to libwpe (the general-purpose library that powers the WPE WebKit port). Second, that the last EGL API call right before the web process hangs on poll() is a call to eglInitialize(). fprintf(), thanks for your service.

Number 8

We now know that the file descriptor 8 is coming from WPE and is not internal to the EGL library. libwpe gets this file descriptor from the UI process, as one of the many creation parameters that are passed via IPC to the nascent process in order to initialize it. Turns out that this file descriptor in particular, the so-called host client file descriptor, is the one that the freedesktop backend of libWPE, from here onwards WPEBackend-fdo, creates when a new client is set to connect to its Wayland display. In a nutshell, in presence of a new client, a Wayland display is supposed to create a pair of connected sockets, create a new client on the Display-side, give it one of the file descriptors, and pass the other one to the client process. Because this will be useful later on, let&aposs see how is that currently implemented in WPEBackend-fdo.

    int pair[2];
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, pair)  0)
        return -1;

    int clientFd = dup(pair[1]);
    close(pair[1]);

    wl_client_create(m_display, pair[0]);

The file descriptor we are tracking down is the client file descriptor, clientFd. So we now know what&aposs going on in this socket: Wayland-specific communication. Let&aposs enable Wayland debugging next, by running all relevant process with WAYLAND_DEBUG=1. We&aposll get back to that code fragment later on.

A Heisenbug is a Heisenbug is a Heisenbug

Turns out that enabling Wayland debugging output for a few processes is enough to alter the state of the system in such a way that the bug does not happen at all when doing manual testing. Thankfully the CI&aposs reproducibility is much higher, so after waiting overnight for the CI to continuously run until it hit the bug, we have logs. What do the logs say?

WPEWebProcess[41]: initializeEGLDisplay() starting.
  -> wl_display@1.get_registry(new id wl_registry@2)
  -> wl_display@1.sync(new id wl_callback@3)

So the EGL library is trying to fetch the Wayland registry and it&aposs doing a wl_display_sync() call afterwards, which will block until the server responds. That&aposs where the blocking poll() call comes from. So, it turns out, the problem is not necessarily on this end of the Wayland socket, but perhaps on the other side, that is, in the so-called UI process (the main browser process). Why is the Wayland display not replying?

The loop

Something that is worth mentioning before we move on is how the WPEBackend-fdo Wayland display integrates with the system. This display is a nested display, with each web view a client, while it is itself a client of the system&aposs Wayland display. This can be a bit confusing if you&aposre not very familiar with how Wayland works, but fortunately there is good documentation about Wayland elsewhere.

The way that the Wayland display in the UI process of a WPEWebKit browser is integrated with the rest of the program, when it uses WPEBackend-fdo, is through the GLib main event loop. Wayland itself has an event loop implementation for servers, but for a GLib-powered application it can be useful to use GLib&aposs and integrate Wayland&aposs event processing with the different stages of the GLib main loop. That is precisely how WPEBackend-fdo is handling its clients&apos events. As discussed earlier, when a new client is created a pair of connected sockets are created and one end is given to Wayland to control communication with the client. GSourceFunc functions are used to integrate Wayland with the application main loop. In these functions, we make sure that whenever there are pending messages to be sent to clients, those are sent, and whenever any of the client sockets has pending data to be read, Wayland reads from them, and to dispatch the events that might be necessary in response to the incoming data. And here is where things start getting really strange, because after doing a bit of fprintf()-powered debugging inside the Wayland-GSourceFuncs functions, it became clear that the Wayland events from the clients were never dispatched, because the dispatch() GSourceFunc was not being called, as if there was nothing coming from any Wayland client. But how is that possible, if we already know that the web process client is actually trying to get the Wayland registry?

To move forward, one needs to understand how the GLib main loop works, in particular, with Unix file descriptor sources. A very brief summary of this is that, during an iteration of the main loop, GLib will poll file descriptors to see if there are any interesting events to be reported back to their respective sources, in which case the sources will decide whether to trigger the dispatch() phase. A simple source might decide in its dispatch() method to directly read or write from/to the file descriptor; a Wayland display source (as in our case), will call wl_event_loop_dispatch() to do this for us. However, if the source doesn&apost find any interesting events, or if the source decides that it doesn&apost want to handle them, the dispatch() invocation will not happen. More on the GLib main event loop in its API documentation.

So it seems that for some reason the dispatch() method is not being called. Does that mean that there are no interesting events to read from? Let&aposs find out.

System call tracing

Here we resort to another helpful tool, strace. With strace we can try to figure out what is happening when the main loop polls file descriptors. The strace output is huge (because it takes easily over a hundred attempts to reproduce this), but we know already some of the calls that involve file descriptors from the code we looked at above, when the client is created. So we can use those calls as a starting point in when searching through the several MBs of logs. Fast-forward to the relevant logs.

socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [128, 130]) = 0
dup(130)               = 131
close(130)             = 0
fcntl64(128, F_DUPFD_CLOEXEC, 0) = 130
epoll_ctl(34, EPOLL_CTL_ADD, 130, {EPOLLIN, {u32=1639599928, u64=1639599928}}) = 0

What we see there is, first, WPEBackend-fdo creating a new socket pair (128, 130) and then, when file descriptor 130 is passed to wl_client_create() to create a new client, Wayland adds that file descriptor to its epoll() instance for monitoring clients, which is referred to by file descriptor 34. This way, whenever there are events in file descriptor 130, we will hear about them in file descriptor 34.

So what we would expect to see next is that, after the web process is spawned, when a Wayland client is created using the passed file descriptor and the EGL driver requests the Wayland registry from the display, there should be a POLLIN event coming in file descriptor 34 and, if the dispatch() call for the source was called, a epoll_wait() call on it, as that is what wl_event_loop_dispatch() would do when called from the source&aposs dispatch() method. But what do we have instead?

poll([{fd=30, events=POLLIN}, {fd=34, events=POLLIN}, {fd=59, events=POLLIN}, {fd=110, events=POLLIN}, {fd=114, events=POLLIN}, {fd=132, events=POLLIN}], 6, 0) = 1 ([{fd=34, revents=POLLIN}])
recvmsg(30, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)

strace can be a bit cryptic, so let&aposs explain those two function calls. The first one is a poll in a series of file descriptors (including 30 and 34) for POLLIN events. The return value of that call tells us that there is a POLLIN event in file descriptor 34 (the Wayland display epoll() instance for clients). But unintuitively, the call right after is trying to read a message from socket 30 instead, which we know doesn&apost have any pending data at the moment, and consequently returns an error value with an errno of EAGAIN (Resource temporarily unavailable).

Why is the GLib main loop triggering a read from 30 instead of 34? And who is 30?

We can answer the latter question first. Breaking on a running UI process instance at the right time shows who is reading from the file descriptor 30:

#1  0x70ae1394 in wl_os_recvmsg_cloexec (sockfd=30, msg=msg@entry=0x700fea54, flags=flags@entry=64)
#2  0x70adf644 in wl_connection_read (connection=0x6f70b7e8)
#3  0x70ade70c in read_events (display=0x6f709c90)
#4  wl_display_read_events (display=0x6f709c90)
#5  0x70277d98 in pwl_source_check (source=0x6f71cb80)
#6  0x743f2140 in g_main_context_check (context=context@entry=0x2111978, max_priority=, fds=fds@entry=0x6165f718, n_fds=n_fds@entry=4)
#7  0x743f277c in g_main_context_iterate (context=0x2111978, block=block@entry=1, dispatch=dispatch@entry=1, self=)
#8  0x743f2ba8 in g_main_loop_run (loop=0x20ece40)
#9  0x00537b38 in ?? ()

So it&aposs also Wayland, but on a different level. This is the Wayland client source (remember that the browser is also a Wayland client?), which is installed by cog (a thin browser layer on top of WPE WebKit that makes writing browsers easier to do) to process, among others, input events coming from the parent Wayland display. Looking at the cog code, we can see that the wl_display_read_events() call happens only if GLib reports that there is a G_IO_IN (POLLIN) event in its file descriptor, but we already know that this is not the case, as per the strace output. So at this point we know that there are two things here that are not right:

  1. A FD source with a G_IO_IN condition is not being dispatched.
  2. A FD source without a G_IO_IN condition is being dispatched.

Someone here is not telling the truth, and as a result the main loop is dispatching the wrong sources.

The loop (part II)

It is at this point that it would be a good idea to look at what exactly the GLib main loop is doing internally in each of its stages and how it tracks the sources and file descriptors that are polled and that need to be processed. Fortunately, debugging symbols for GLib are very small, so debugging this step by step inside the device is rather easy.

Let&aposs look at how the main loop decides which sources to dispatch, since for some reason it&aposs dispatching the wrong ones. Dispatching happens in the g_main_dispatch() method. This method goes over a list of pending source dispatches and after a few checks and setting the stage, the dispatch method for the source gets called. How is a source set as having a pending dispatch? This happens in g_main_context_check(), where the main loop checks the results of the polling done in this iteration and runs the check() method for sources that are not ready yet so that they can decide whether they are ready to be dispatched or not. Breaking into the Wayland display source, I know that the check() method is called. How does this method decide to be dispatched or not?

    [](GSource* base) -> gboolean
    {
        auto& source = *reinterpret_cast(base);
        return !!source.pfd.revents;
    },

In this lambda function we&aposre returning TRUE or FALSE, depending on whether the revents field in the GPollFD structure have been filled during the polling stage of this iteration of the loop. A return value of TRUE indicates the main loop that we want our source to be dispatched. From the strace output, we know that there is a POLLIN (or G_IO_IN) condition, but we also know that the main loop is not dispatching it. So let&aposs look at what&aposs in this GPollFD structure.

For this, let&aposs go back to g_main_context_check() and inspect the array of GPollFD structures that it received when called. What do we find?

(gdb) print *fds
$35 = {fd = 30, events = 1, revents = 0}
(gdb) print *(fds+1)
$36 = {fd = 34, events = 1, revents = 1}

That&aposs the result of the poll() call! So far so good. Now the method is supposed to update the polling records it keeps and it uses when calling each of the sources check() functions. What do these records hold?

(gdb) print *pollrec->fd
$45 = {fd = 19, events = 1, revents = 0}
(gdb) print *(pollrec->next->fd)
$47 = {fd = 30, events = 25, revents = 1}
(gdb) print *(pollrec->next->next->fd)
$49 = {fd = 34, events = 25, revents = 0}

We&aposre not interested in the first record quite yet, but clearly there&aposs something odd here. The polling records are showing a different value in the revent fields for both 30 and 34. Are these records updated correctly? Let&aposs look at the algorithm that is doing this update, because it will be relevant later on.

  pollrec = context->poll_records;
  i = 0;
  while (pollrec && i  n_fds)
    {
      while (pollrec && pollrec->fd->fd == fds[i].fd)
        {
          if (pollrec->priority = max_priority)
            {
              pollrec->fd->revents =
                fds[i].revents & (pollrec->fd->events | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
            }
          pollrec = pollrec->next;
        }

      i++;
    }

In simple words, what this algorithm is doing is to traverse simultaneously the polling records and the GPollFD array, updating the polling records revents with the results of polling. From reading how the pollrec linked list is built internally, it&aposs possible to see that it&aposs purposely sorted by increasing file descriptor identifier value. So the first item in the list will have the record for the lowest file descriptor identifier, and so on. The GPollFD array is also built in this way, allowing for a nice optimization: if more than one polling record – that is, more than one polling source – needs to poll the same file descriptor, this can be done at once. This is why this otherwise O(n^2) nested loop can actually be reduced to linear time.

One thing stands out here though: the linked list is only advanced when we find a match. Does this mean that we always have a match between polling records and the file descriptors that have just been polled? To answer that question we need to check how is the array of GPollFD structures filled. This is done in g_main_context_query(), as we hinted before. I&aposll spare you the details, and just focus on what seems relevant here: when is a poll record not used to fill a GPollFD?

  n_poll = 0;
  lastpollrec = NULL;
  for (pollrec = context->poll_records; pollrec; pollrec = pollrec->next)
    {
      if (pollrec->priority > max_priority)
        continue;
  ...

Interesting! If a polling record belongs to a source whose priority is lower than the maximum priority that the current iteration is going to process, the polling record is skipped. Why is this?

In simple terms, this happens because each iteration of the main loop finds out the highest priority between the sources that are ready in the prepare() stage, before polling, and then only those file descriptor sources with at least such a a priority are polled. The idea behind this is to make sure that high-priority sources are processed first, and that no file descriptor sources with lower priority are polled in vain, as they shouldn&apost be dispatched in the current iteration.

GDB tells me that the maximum priority in this iteration is -60. From an earlier GDB output, we also know that there&aposs a source for a file descriptor 19 with a priority 0.

(gdb) print *pollrec
$44 = {fd = 0x7369c8, prev = 0x0, next = 0x6f701560, priority = 0}
(gdb) print *pollrec->fd
$45 = {fd = 19, events = 1, revents = 0}

Since 19 is lower than 30 and 34, we know that this record is before theirs in the linked list (and so it happens, it&aposs the first one in the list too). But we know that, because its priority is 0, it is too low to be added to the file descriptor array to be polled. Let&aposs look at the loop again.

  pollrec = context->poll_records;
  i = 0;
  while (pollrec && i  n_fds)
    {
      while (pollrec && pollrec->fd->fd == fds[i].fd)
        {
          if (pollrec->priority = max_priority)
            {
              pollrec->fd->revents =
                fds[i].revents & (pollrec->fd->events | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
            }
          pollrec = pollrec->next;
        }

      i++;
    }

The first polling record was skipped during the update of the GPollFD array, so the condition pollrec && pollrec->fd->fd == fds[i].fd is never going to be satisfied, because 19 is not in the array. The innermost while() is not entered, and as such the pollrec list pointer never moves forward to the next record. So no polling record is updated here, even if we have updated revent information from the polling results.

What happens next should be easy to see. The check() method for all polled sources are called with outdated revents. In the case of the source for file descriptor 30, we wrongly tell it there&aposs a G_IO_IN condition, so it asks the main loop to call dispatch it triggering a a wl_connection_read() call in a socket with no incoming data. For the source with file descriptor 34, we tell it that there&aposs no incoming data and its dispatch() method is not invoked, even when on the other side of the socket we have a client waiting for data to come and blocking in the meantime. This explains what we see in the strace output above. If the source with file descriptor 19 continues to be ready and with its priority unchanged, then this situation repeats in every further iteration of the main loop, leading to a hang in the web process that is forever waiting that the UI process reads its socket pipe.

The bug – explained

I have been using GLib for a very long time, and I have only fixed a couple of minor bugs in it over the years. Very few actually, which is why it was very difficult for me to come to accept that I had found a bug in one of the most reliable and complex parts of the library. Impostor syndrome is a thing and it really gets in the way.

But in a nutshell, the bug in the GLib main loop is that the very clever linear update of registers is missing something very important: it should skip to the first polling record matching before attempting to update its revents. Without this, in the presence of a file descriptor source with the lowest file descriptor identifier and also a lower priority than the cutting priority in the current main loop iteration, revents in the polling registers are not updated and therefore the wrong sources can be dispatched. The simplest patch to avoid this, would look as follows.

   i = 0;
   while (pollrec && i  n_fds)
     {
+      while (pollrec && pollrec->fd->fd != fds[i].fd)
+        pollrec = pollrec->next;
+
       while (pollrec && pollrec->fd->fd == fds[i].fd)
         {
           if (pollrec->priority = max_priority)

Once we find the first matching record, let&aposs update all consecutive records that also match and need an update, then let&aposs skip to the next record, rinse and repeat. With this two-line patch, the web process was finally unlocked, the EGL display initialized properly, the web extension and the web page were loaded, CI tests starting passing again, and this exhausted developer could finally put his mind to rest.

A complete patch, including improvements to the code comments around this fascinating part of GLib and also a minimal test case reproducing the bug have already been reviewed by the GLib maintainers and merged to both stable and development branches. I expect that at least some GLib sources will start being called in a different (but correct) order from now on, so keep an eye on your GLib sources. :-)

Standing on the shoulders of giants

At this point I should acknowledge that without the support from my colleagues in the WebKit team in Igalia, getting to the bottom of this problem would have probably been much harder and perhaps my sanity would have been at stake. I want to thank Adrián and &Zcaronan for their input on Wayland, debugging techniques, and for allowing me to bounce back and forth ideas and findings as I went deeper into this rabbit hole, helping me to step out of dead-ends, reminding me to use tools out of my everyday box, and ultimately, to be brave enough to doubt GLib&aposs correctness, something that much more often than not I take for granted.

Thanks also to Philip and Sebastian for their feedback and prompt code review!

October 29, 2020 01:10 PM



September 16, 2020

Epiphany 3.38 and WebKitGTK 2.30

by Michael Catanzaro

It’s that time of year again: a new GNOME release, and with it, a new Epiphany. The pace of Epiphany development has increased significantly over the last few years thanks to an increase in the number of active contributors. Most notably, Jan-Michael Brummer has solved dozens of bugs and landed many new enhancements, Alexander Mikhaylenko has polished numerous rough edges throughout the browser, and Andrei Lisita has landed several significant improvements to various Epiphany dialogs. That doesn’t count the work that Igalia is doing to maintain WebKitGTK, the WPE graphics stack, and libsoup, all of which is essential to delivering quality Epiphany releases, nor the work of the GNOME localization teams to translate it to your native language. Even if Epiphany itself is only the topmost layer of this technology stack, having more developers working on Epiphany itself allows us to deliver increased polish throughout the user interface layer, and I’m pretty happy with the result. Let’s take a look at what’s new.

Intelligent Tracking Prevention

Intelligent Tracking Prevention (ITP) is the headline feature of this release. Safari has had ITP for several years now, so if you’re familiar with how ITP works to prevent cross-site tracking on macOS or iOS, then you already know what to expect here.  If you’re more familiar with Firefox’s Enhanced Tracking Protection, or Chrome’s nothing (crickets: chirp, chirp!), then WebKit’s ITP is a little different from what you’re used to. ITP relies on heuristics that apply the same to all domains, so there are no blocklists of naughty domains that should be targeted for content blocking like you see in Firefox. Instead, a set of innovative restrictions is applied globally to all web content, and a separate set of stricter restrictions is applied to domains classified as “prevalent” based on your browsing history. Domains are classified as prevalent if ITP decides the domain is capable of tracking your browsing across the web, or non-prevalent otherwise. (The public-friendly terminology for this is “Classification as Having Cross-Site Tracking Capabilities,” but that is a mouthful, so I’ll stick with “prevalent.” It makes sense: domains that are common across many websites can track you across many websites, and domains that are not common cannot.)

ITP is enabled by default in Epiphany 3.38, as it has been for several years now in Safari, because otherwise only a small minority of users would turn it on. ITP protections are designed to be effective without breaking too many websites, so it’s fairly safe to enable by default. (You may encounter a few broken websites that have not been updated to use the Storage Access API to store third-party cookies. If so, you can choose to turn off ITP in the preferences dialog.)

For a detailed discussion covering ITP’s tracking mitigations, see Tracking Prevention in WebKit. I’m not an expert myself, but the short version is this: full third-party cookie blocking across all websites (to store a third-party cookie, websites must use the Storage Access API to prompt the user for permission); cookie-blocking latch mode (“once a request is blocked from using cookies, all redirects of that request are also blocked from using cookies”); downgraded third-party referrers (“all third-party referrers are downgraded to their origins by default”) to avoid exposing the path component of the URL in the referrer; blocked third-party HSTS (“HSTS […] can only be set by the first-party website […]”) to stop abuse by tracker scripts; detection of cross-site tracking via link decoration and 24-hour expiration time for all cookies created by JavaScript on the landing page when detected; a 7-day expiration time for all other cookies created by JavaScript (yes, this applies to first-party cookies); and a 7-day extendable lifetime for all other script-writable storage, extended whenever the user interacts with the website (necessary because tracking companies began using first-party scripts to evade the above restrictions). Additionally, for prevalent domains only, domains engaging in bounce tracking may have cookies forced to SameSite=strict, and Verified Partitioned Cache is enabled (cached resources are re-downloaded after seven days and deleted if they fail certain privacy tests). Whew!

WebKit has many additional privacy protections not tied to the ITP setting and therefore not discussed here — did you know that cached resources are partioned based on the first-party domain? — and there’s more that’s not very well documented which I don’t understand and haven’t mentioned (tracker collusion!), but that should give you the general idea of how sophisticated this is relative to, say, Chrome (chirp!). Thanks to John Wilander from Apple for his work developing and maintaining ITP, and to Carlos Garcia for getting it working on Linux. If you’re interested in the full history of how ITP has evolved over the years to respond to the changing threat landscape (e.g. tracking prevention tracking), see John’s WebKit blog posts. You might also be interested in WebKit’s Tracking Prevention Policy, which I believe is the strictest anti-tracking stance of any major web engine. TL;DR: “we treat circumvention of shipping anti-tracking measures with the same seriousness as exploitation of security vulnerabilities. If a party attempts to circumvent our tracking prevention methods, we may add additional restrictions without prior notice.” No exceptions.

Updated Website Data Preferences

As part of the work on ITP, you’ll notice that Epiphany’s cookie storage preferences have changed a bit. Since ITP enforces full third-party cookie blocking, it no longer makes sense to have a separate cookie storage preference for that, so I replaced the old tri-state cookie storage setting (always accept cookies, block third-party cookies, block all cookies) with two switches: one to toggle ITP, and one to toggle all website data storage.

Previously, it was only possible to block cookies, but this new setting will additionally block localStorage and IndexedDB, web features that allow websites to store arbitrary data in your browser, similar to cookies. It doesn’t really make much sense to block cookies but allow other types of data storage, so the new preferences should better enforce the user’s intent behind disabling cookies. (This preference does not yet block media keys, service workers, or legacy offline web application cache, but it probably should.) I don’t really recommend disabling website data storage, since it will cause compatibility issues on many websites, but this option is there for those who want it. Disabling ITP is also not something I want to recommend, but it might be necessary to access certain broken websites that have not yet been updated to use the Storage Access API.

Accordingly, Andrei has removed the old cookies dialog and moved cookie management into the Clear Personal Data dialog, which is a better place because anyone clearing cookies for a particular website is likely to also want to clear other personal data. (If you want to delete a website’s cookies, then you probably don’t want to leave its SQL databases intact, right?) He had to remove the ability to clear data from a particular point in time, because WebKit doesn’t support this operation for cookies, but that function is probably  rarely-used and I think the benefit of the change should outweigh the cost. (We could bring it back in the future if somebody wants to try implementing that feature in WebKit, but I suspect not many users will notice.) Treating cookies as separate and different from other forms of website data storage no longer makes sense in 2020, and it’s good to have finally moved on from that antiquated practice.

New HTML Theme

Carlos Garcia has added a new Adwaita-based HTML theme to WebKitGTK 2.30, and removed support for rendering HTML elements using the GTK theme (except for scrollbars). Trying to use the GTK theme to render web content was fragile and caused many web compatibility problems that nobody ever managed to solve. The GTK developers were never very fond of us doing this in the first place, and the foreign drawing API required to do so has been removed from GTK 4, so this was also good preparation for getting WebKitGTK ready for GTK 4. Carlos’s new theme is similar to Adwaita, but gradients have been toned down or removed in order to give a flatter, neutral look that should blend in nicely with all pages while still feeling modern.

This should be a fairly minor style change for Adwaita users, but a very large change for anyone using custom themes. I don’t expect everyone will be happy, but please trust that this will at least result in better web compatibility and fewer tricky theme-related bug reports.

Screenshot demonstrating new HTML theme vs. GTK theme
Left: Adwaita GTK theme controls rendered by WebKitGTK 2.28. Right: hardcoded Adwaita-based HTML theme with toned down gradients.

Although scrollbars will still use the GTK theme as of WebKitGTK 2.30, that will no longer be possible to do in GTK 4, so themed scrollbars are almost certain to be removed in the future. That will be a noticeable disappointment in every app that uses WebKitGTK, but I don’t see any likely solution to this.

Media Permissions

Jan-Michael added new API in WebKitGTK 2.30 to allow muting individual browser tabs, and hooked it up in Epiphany. This is good when you want to silence just one annoying tab without silencing everything.

Meanwhile, Charlie Turner added WebKitGTK API for managing autoplay policies. Videos with sound are now blocked from autoplaying by default, while videos with no sound are still allowed. Charlie hooked this up to Epiphany’s existing permission manager popover, so you can change the behavior for websites you care about without affecting other websites.

Screenshot displaying new media autoplay permission settings
Configure your preferred media autoplay policy for a website near you today!

Improved Dialogs

In addition to his work on the Clear Data dialog, Andrei has also implemented many improvements and squashed bugs throughout each view of the preferences dialog, the passwords dialog, and the history dialog, and refactored the code to be much more maintainable. Head over to his blog to learn more about his accomplishments. (Thanks to Google for sponsoring Andrei’s work via Google Summer of Code, and to Alexander for help mentoring.)

Additionally, Adrien Plazas has ported the preferences dialog to use HdyPreferencesWindow, bringing a pretty major design change to the view switcher:

Screenshot showing changes to the preferences dialog
Left: Epiphany 3.36 preferences dialog. Right: Epiphany 3.38. Note the download settings are present in the left screenshot but missing from the right screenshot because the right window is using flatpak, and the download settings are unavailable in flatpak.

User Scripts

User scripts (like Greasemonkey) allow you to run custom JavaScript on websites. WebKit has long offered user script functionality alongside user CSS, but previous versions of Epiphany only exposed user CSS. Jan-Michael has added the ability to configure a user script as well. To enable, visit the Appearance tab in the preferences dialog (a somewhat odd place, but it really needs to be located next to user CSS due to the tight relationship there). Besides allowing you to do, well, basically anything, this also significantly enhances the usability of user CSS, since now you can apply certain styles only to particular websites. The UI is a little primitive — your script (like your CSS) has to be one file that will be run on every website, so don’t try to design a complex codebase using your user script — but you can use conditional statements to limit execution to specific websites as you please, so it should work fairly well for anyone who has need of it. I fully expect 99.9% of users will never touch user scripts or user styles, but it’s nice for power users to have these features available if needed.

HTTP Authentication Password Storage

Jan-Michael and Carlos Garcia have worked to ensure HTTP authentication passwords are now stored in Epiphany’s password manager rather than by WebKit, so they can now be viewed and deleted from Epiphany, which required some new WebKitGTK API to do properly. Unfortunately, WebKitGTK saves network passwords using the default network secret schema, meaning its passwords (saved by older versions of Epiphany) are all leaked: we have no way to know which application owns those passwords, so we don’t have any way to know which passwords were stored by WebKit and which can be safely managed by Epiphany going forward. Accordingly, all previously-stored HTTP authentication passwords are no longer accessible; you’ll have to use seahorse to look them up manually if you need to recover them. HTTP authentication is not very commonly-used nowadays except for internal corporate domains, so hopefully this one-time migration snafu will not be a major inconvenience to most users.

New Tab Animation

Jan-Michael has added a new animation when you open a new tab. If the newly-created tab is not visible in the tab bar, then the right arrow will flash to indicate success, letting you know that you actually managed to open the page. Opening tabs out of view happens too often currently, but at least it’s a nice improvement over not knowing whether you actually managed to open the tab or not. This will be improved further next year, because Alexander is working on a completely new tab widget to replace GtkNotebook.

New View Source Theme

Jim Mason changed view source mode to use a highlight.js theme designed to mimic Firefox’s syntax highlighting, and added dark mode support.

Image showing dark mode support in view source mode
Embrace the dark.

And More…

  • WebKitGTK 2.30 now supports video formats in image elements, thanks to Philippe Normand. You’ll notice that short GIF-style videos will now work on several major websites where they previously didn’t.
  • I added a new WebKitGTK 2.30 API to expose the paste as plaintext editor command, which was previously internal but fully-functional. I’ve hooked it up in Epiphany’s context menu as “Paste Text Only.” This is nice when you want to discard markup when pasting into a rich text editor (such as the WordPress editor I’m using to write this post).
  • Jan-Michael has implemented support for reordering pinned tabs. You can now drag to reorder pinned tabs any way you please, subject to the constraint that all pinned tabs stay left of all unpinned tabs.
  • Jan-Michael added a new import/export menu, and the bookmarks import/export features have moved there. He also added a new feature to import passwords from Chrome. Meanwhile, ignapk added support for importing bookmarks from HTML (compatible with Firefox).
  • Jan-Michael added a new preference to web apps to allow running them in the background. When enabled, closing the window will only hide the the window: everything will continue running. This is useful for mail apps, music players, and similar applications.
  • Continuing Jan-Michael’s list of accomplishments, he removed Epiphany’s previous hidden setting to set a mobile user agent header after discovering that it did not work properly, and replaced it by adding support in WebKitGTK 2.30 for automatically setting a mobile user agent header depending on the chassis type detected by logind. This results in a major user experience improvement when using Epiphany as a mobile browser. Beware: this functionality currently does not work in flatpak because it requires the creation of a new desktop portal.
  • Stephan Verbücheln has landed multiple fixes to improve display of favicons on hidpi displays.
  • Zach Harbort fixed a rounding error that caused the zoom level to display oddly when changing zoom levels.
  • Vanadiae landed some improvements to the search engine configuration dialog (with more to come) and helped investigate a crash that occurs when using the “Set as Wallpaper” function under Flatpak. The crash is pretty tricky, so we wound up disabling that function under Flatpak for now. He also updated screenshots throughout the  user help.
  • Sabri Ünal continued his effort to document and standardize keyboard shortcuts throughout GNOME, adding a few missing shortcuts to the keyboard shortcuts dialog.

Epiphany 3.38 will be the final Epiphany 3 release, concluding a decade of releases that start with 3. We will match GNOME in following a new version scheme going forward, dropping the leading 3 and the confusing even/odd versioning. Onward to Epiphany 40!

by Michael Catanzaro at September 16, 2020 05:00 PM



September 02, 2020

Serious Encrypted Media Extensions on GStreamer based WebKit ports

by Xabier Rodríguez Calvar

Encrypted Media Extensions (a.k.a. EME) is the W3C standard for encrypted media in the web. This way, media providers such as Hulu, Netflix, HBO, Disney+, Prime Video, etc. can provide their contents with a reasonable amount of confidence that it will make it very complicated for people to “save” their assets without their permission. Why do I use the word “serious” in the title? In WebKit there is already support for Clear Key, which is the W3C EME reference implementation but EME supports more encryption systems, even privative ones (I have my opinion about this, you can ask me privately). No service provider (that I know) supports Clear Key, they usually rely on Widevine, PlayReady or some other.

Three years ago, my colleague Žan Doberšek finished the implementation of what was going to be the shell of WebKit’s modern EME implementation, following latest W3C proposal. We implemented that downstream (at Web Platform for Embedded) as well using Thunder, which includes as a plugin a fork of what was Open Content Decryption Module (a.k.a. OpenCDM). The OpenCDM API changed quite a lot during this journey. It works well and there are millions of set-top-boxes using it currently.

The delta between downstream and the upstream GStreamer based WebKit ports was quite big, testing was difficult and syncing was not always easy, so we decided reverse the situation.

Our first step was done by my colleague Charlie Turner, who made Clear Key work upstream again while adapted some changes the Apple folks had done meanwhile. It was amazing to see Clear Key tests passing again and his work with the CDMProxy related classes was awesome. After having ClearKey working, I had to adapt them a bit to accomodate Thunder. To explain a bit about the WebKit EME architecture, I must say that there are two layers. The first is the crossplatform one, which implements the W3C API (MediaKeys, MediaKeySession, CDM…). These classes rely on the platform ones (CDMPrivate, CDMInstance, CDMInstanceSession) to handle the platform management, message exchange, etc. which would be the second layer. Apple playback system is fully integrated with their DRM system so they don’t need anything else. We do because we need to integrate our own decryptors to defer to Thunder for decryption so in the GStreamer based ports we also need the CDMProxy related classes, which would be CDMProxy, CDMInstanceProxy, CDMInstanceSessionProxy… The last two extend CDMInstance and CDMInstanceSession respectively to be able to deal with the key management, that is abstracted to the KeyHandle and KeyStore.

Once the abstraction is there (let’s remember that the abstranction works both for Clear Key and Thunder), the Thunder implementation is quite simple, just gluing the CDMProxy, CDMInstanceProxy and CDMInstanceSessionProxy classes to the Thunder system and writing a GStreamer decryptor element for it. I might have made a mistake when selecting the files but considering Thunder classes + the GStreamer common decryptor code, cloc says it is just 1198 lines of platform code. I think it is pretty low for what it does. Apart from that, obviously, there are 5760 lines of crossplatform code.

To build and run all this you need to do several things:

  1. Build the dependencies with WEBKIT_JHBUILD=1 JHBUILD_ENABLE_THUNDER="yes" to enable the old fashioned JHBuild build and force it to build the Thunder dependencies. All dependendies are on JHBuild, even Widevine is referenced but to download it you need the proper credentials as it is closed source.
  2. Pass --thunder when calling build-webkit.sh.
  3. Run MiniBrowser with WEBKIT_GST_EME_RANK_PRIORITY="Thunder" and pass parameters --enable-mediasource=TRUE --enable-encrypted-media=TRUE --autoplay-policy=allow. The autoplay policy is usually optional but in this case it is necessary for the YouTube TV tests. We need to give the Thunder decryptor a higher priority because of WebM, that does not specify a key system and without it the Clear Key one can be selected and fail. MP4 does not create trouble because the protection system is specified and the caps negotiation does its magic.

As you could have guessed if you have a closer look at the GStreamer JHBuild moduleset, you’ll see that only Widevine is supported. To support more, you only have to make them build in the Thunder ecosystem and add them to CDMFactoryThunder::supportedKeySystems.

When I coded this, all YouTube TV tests for Widevine were green in the desktop. At the moment of writing this post they aren’t because of some problem with the Widevine installation that will be sorted quickly, I hope.

by calvaris at September 02, 2020 02:59 PM



August 13, 2020

Improving CSS Custom Properties performance

by Javier Fernández

Chrome 84 reached the stable channel a few weeks ago, and there are already several great posts describing the many important additions, interesting new features, security fixes and improvements in privacy policies (([1], [2], [3], [4]) it contains. However, there is a change that I worked on in this release which might have passed unnoticed by most, but I think is very valuable: A change regarding CSS Custom Properties (variables) performance.

The design of CSS, in general, takes great care in considering how features are designed with respect to making it possible for them to perform well. However, implementations may not perform as well as they could, and it takes a considerable amount of time to understand how authors use the features and which cases are more relevant for them.

CSS Custom Properties are an interesting example to look at here: They are a wonderful feature that provides a lot of advantages for web authors. For a whole lot of cases, all of the implementations of CSS Custom Properties perform well enough that most people won’t notice. However, we at Igalia have been analyzing several use cases and looking at some reports around their performance in different implementations.

Let’s consider a fairly straightforward example in which an author sets a single property in a toggleable class in the body, and then uses that property several times deeper in the tree to change the foreground color of some text.

   .red { --prop: red; }
   .green { --prop: green; }




Only about 20% of those actually use this property, 5 elements deep into the tree, and only to change the foreground color.

To evaluate Chromium’s performance in a case like this we can define a new perf tests, using the perf tools the Chromium project has available for browser engineers. In this case, we want a huge tree so that we can evaluate better the impact of the different optimizations.

    .green { --prop: green; }
    .red { --prop: red; }


    

These are the results obtained runing the test in Chrome 83:

avg median stdev min max
163.74 ms 163.79 ms 3.69 ms 158.59 ms 163.74 ms

I admit that it’s difficult to evaluate the results, especially considering the number of nodes of such a huge DOM tree. Lets compare the results of the same test on Firefox, using different number of nodes.

Nodes 50K 20K 10K 5K 1K 500
Chrome 83 163.74 ms 55.05 ms 25.12 ms 14.18 ms 2.74 ms 1.50 ms
FF 78 28.35 ms 12.05 ms 6.10 ms 3.50 ms 1.15 ms 0.55 ms
1/6 1/5 1/4 1/4 1/2 1/3

As I commented before, the data are more accurate when the DOM tree has a lot of nodes; in any case, the difference is quite clear and shows there is plenty room for improvement. WebKit based browsers have results more similar to Chromium as well.

Performance tests like the one above can be added to browsers for tracking improvements and regressions over time, so we’ve added (r763335) that to Chromium’s tree: We’d like to see it get faster over time, and definitely cannot afford regressions (see Chrome Performance Dashboard and the ChangeStyleCustomPropertyDeclaration test for details) .

So… What can we do?

In Chrome 83 and lower, whenever the custom property declaration changed, the new declaration would be inherited by the whole tree. This inheritance implied executing the whole CSS cascade and recalculating the styles of all the nodes in the entire tree, since with this approach, all nodes may be affected.

Chrome had already implemented an optimization on the CSS cascade implementation for regular CSS properties that don’t depend on any other to resolve their value. These subset of CSS properties are defined as Independent Properties in the Chromium codebase. The optimization mentioned before affects how the inheritance mechanism is implemented for these Independent properties. Whenever one of these properties changes, instead of recalculating the styles of the inherited properties, children can just copy the whole parent’s computed style. Blink’s style engine has a component known as Matched Properties Cache responsible of deciding when is possible to avoid the style resolution of an element and instead, performing an efficient copy of the matched computed style. I’ll get back to this concept in the last part of this post.

In the case of CSS Custom Properties, we could apply a similar approach as a good step. We can consider that the nodes with computed styles that don’t have references to custom properties declarations shouldn’t be affected by the new declaration, and we can implement the inheritance directly by copying the parent’s computed style. The patch with the optimization I’ve implemented in r765278 initially landed in Chrome 84.0.4137.0

Let’s look at the result of this one action in the Chrome Performance Dashboard:

That’s a really good improvement!

However, it’s also just a first step. It’s clear that Chrome still has a wide margin for improvement in this case, as well any WebKit based browser – Firefox is still, impressively, markedly faster as it’s been described in the bug report filed to track this issue. The following table shows the result of the different browsers together; even disabling the muti-thread capabilities of Firefox’s Stylo engine (STYLO_THREAD=1), FF is much faster than Chrome with the optimization applied.

Chrome 83 Chrome 84 FF 78 FF 78 th=1
avg
median
stdev
min
max
163.74 ms
163.79 ms
3.69 ms
158.59 ms
163.74 ms
117.37 ms
117.52 ms
1.98 ms
113.66 ms
120.87 ms
28.35 ms
28.50 ms
0.93 ms
26.00 ms
30.00 ms
38.25 ms
38.50 ms
1.86 ms
35.00 ms
41.00 ms

Before continue, I want get back to the Matched Properties Cache (MPC) concept, since it has an important role on these style optimizations. This cache is not a new concept in the Chrome’s engine; as a matter of fact, it’s also used in WebKit, since it was implemented long ago, before the fork that created the new blink engine. However, Google has been working a lot on this area in the last years and some of the most recent changes in the MPC have had an important impact on style resolution performance. As a result of this work, elements with independent and non-independent properties using CSS Variables might produce cache hits in the MPC. The results of the Performance Dashboard show a considerable improvement in the mentioned ChangeStyleCustomPropertyDeclaration test (avg: 108.06 ms)

Additionally, there are several other cases where the use of CSS Variables has a considerable impact on performance, compared with using regular CSS properties. Obviously, resolving CSS Variables has a cost, so it’s clear that we could apply additional optimizations that reduce the impact of the variable resolution, especially for handling specific style changes that might not affect to a substantial portion of the DOM tree. I’ve been experimenting with the MPC to explore the idea an independent CSS Custom Properties cache; nodes with variables referencing the same custom property will produce cache hits in the MPC, even though other properties don’t match. The preliminary approach I’ve been implementing consists on a new matching function, specific for custom properties, and a mechanism to transfer/copy the property’s data to avoid resolving the variable again, since the property’s declaration hasn’t change. We would need to apply the css cascade again, but at least we could save the cost of the variable resolution.

Of course, at the end of the day, improving performance has costs and challenges – and it’s hard to keep performance even once you get it. Bit if we really want performant CSS Custom Properties, this means that we have to decide to prioritize this work. Currently there is reluctance to explore the concept of a new Custom Properties specific cache – the challenge is big and the risks are not non-existent; cache invalidation can get complicated. But, the point is that we have to understand that we aren’t all going to agree what is important enough to warrant attention, or how much investment, or when. Web authors must convince vendors that these use cases are worth being optimized and that the cost and risks of such a complex challenges should be assumed by them.

This work has been sponsored by Bloomberg, which I consider one of the most important contributors of the Web Platform. After several years, the vision of this company and its responsibility as consumer of the platform has lead to many and important contributions that we all enjoy now. Although CSS Grid Layout might be the most remarkable one, there are may other not that big, like this work on CSS Custom Properties, or several other new features of the CSS Text specification. This is a perfect example of an company that tries to change priorities and adapt the web platform to its needs and the use cases they consider more aligned with their business strategy.

I understand that not every user of the web platform can do this kind of investment. This is why I believe that initiatives like Open Priorization could help to move the web platform in a positive direction. By providing a way for us to move past a lot of these conversation and focus on the needs that some web authors and users of the platform consider more important, or higher priority. Improving performance for CSS Custom Properties isn’t currently one of the projects we’ve listed, but perhaps it would be an interesting one we might try in the future if we are successful with these. If you haven’t already, have a look and see if there is something there that is interesting to you or your company – pledges of any size are good – ten thousand $1 donations are every bit as good as ten $1000 donations. Together, we can make a difference, and we all benefit.

Also, we would love to hear about your ideas. Is improving CSS Custom Properties performance important to you? What else is? Share your comments with us on Twitter, either me (@lajava77) or our developer advocate Brian Kardell (@briankardell), or email me at jfernandez@igalia.com. I’d be glad to answer any question about the Open Priorization experiment.

by jfernandez at August 13, 2020 06:16 PM



July 02, 2020

Web-augmented graphics overlay broadcasting with WPE and GStreamer

by Philippe Normand

Graphics overlays are everywhere nowadays in the live video broadcasting industry. In this post I introduce a new demo relying on GStreamer and WPEWebKit to deliver low-latency web-augmented video broadcasts.

Readers of this blog might remember a few posts about WPEWebKit and a GStreamer element we at Igalia worked on …

by Philippe Normand at July 02, 2020 01:00 PM



June 13, 2020

Web Engines Hackfest 2014

by Philippe Normand

Last week I attended the Web Engines Hackfest. The event was sponsored by Igalia (also hosting the event), Adobe and Collabora.

As usual I spent most of the time working on the WebKitGTK+ GStreamer backend and Sebastian Dröge kindly joined and helped out quite a bit, make sure to read …

by Philippe Normand at June 13, 2020 10:05 AM



June 09, 2020

WebKitGTK and WPE now supporting videos in the img tag

by Philippe Normand

Using videos in the <img> HTML tag can lead to more responsive web-page loads in most cases. Colin Bendell blogged about this topic, make sure to read his post on the cloudinary website. As it turns out, this feature has been supported for more than 2 years in Safari, but …

by Philippe Normand at June 09, 2020 04:00 PM



June 08, 2020

Introducing the WebKit Flatpak SDK

by Philippe Normand

Working on a web-engine often requires a complex build infrastructure. This post documents our transition from JHBuild to Flatpak for the WebKitGTK and WPEWebKit development builds.

For the last 10 years, WebKitGTK has been relying on a custom JHBuild moduleset to handle its dependencies and (try to) ensure a reproducible …

by Philippe Normand at June 08, 2020 04:50 PM



March 31, 2020

Sandboxing WebKitGTK Apps

by Michael Catanzaro

When you connect to a Wi-Fi network, that network might block your access to the wider internet until you’ve signed into the network’s captive portal page. An untrusted network can disrupt your connection at any time by blocking secure requests and replacing the content of insecure requests with its login page. (Of course this can be done on wired networks as well, but in practice it mainly happens on Wi-Fi.) To detect a captive portal, NetworkManager sends a request to a special test address (e.g. http://fedoraproject.org/static/hotspot.txt) and checks to see whether it the content has been replaced. If so, GNOME Shell will open a little WebKitGTK browser window to display http://nmcheck.gnome.org, which, due to the captive portal, will be hijacked by your hotel or airport or whatever to display the portal login page. Rephrased in security lingo: an untrusted network may cause GNOME Shell to load arbitrary web content whenever it wants. If that doesn’t immediately sound dangerous to you, let’s ask me from four years ago why that might be bad:

Web engines are full of security vulnerabilities, like buffer overflows and use-after-frees. The details don’t matter; what’s important is that skilled attackers can turn these vulnerabilities into exploits, using carefully-crafted HTML to gain total control of your user account on your computer (or your phone). They can then install malware, read all the files in your home directory, use your computer in a botnet to attack websites, and do basically whatever they want with it.

If the web engine is sandboxed, then a second type of attack, called a sandbox escape, is needed. This makes it dramatically more difficult to exploit vulnerabilities.

The captive portal helper will pop up and load arbitrary web content without user interaction, so there’s nothing you as a user could possibly do about it. This makes it a tempting target for attackers, so we want to ensure that users are safe in the absence of a sandbox escape. Accordingly, beginning with GNOME 3.36, the captive portal helper is now sandboxed.

How did we do it? With basically one line of code (plus a check to ensure the WebKitGTK version is new enough). To sandbox any WebKitGTK app, just call webkit_web_context_set_sandbox_enabled(). Ta-da, your application is now magically secure!

No, really, that’s all you need to do. So if it’s that simple, why isn’t the sandbox enabled by default? It can break applications that use WebKitWebExtension to run custom code in the sandboxed web process, so you’ll need to test to ensure that your application still works properly after enabling the sandbox. (The WebKitGTK sandbox will become mandatory in the future when porting applications to GTK 4. That’s thinking far ahead, though, because GTK 4 isn’t supported yet at all.) You may need to use webkit_web_context_add_path_to_sandbox() to give your web extension access to directories that would otherwise be blocked by the sandbox.

The sandbox is critically important for web browsers and email clients, which are constantly displaying untrusted web content. But really, every app should enable it. Fix your apps! Then thank Patrick Griffis from Igalia for developing WebKitGTK’s sandbox, and the bubblewrap, Flatpak, and xdg-desktop-portal developers for providing the groundwork that makes it all possible.

by Michael Catanzaro at March 31, 2020 03:56 PM



March 11, 2020

Epiphany 3.36 and WebKitGTK 2.28

by Michael Catanzaro

So, what’s new in Epiphany 3.36?

PDF.js

Once upon a time, beginning with GNOME 3.14, Epiphany had supported displaying PDF documents via the Evince NPAPI browser plugin developed by Carlos Garcia Campos. Unfortunately, because NPAPI plugins have to use X11-specific APIs to draw web content, this didn’t  suffice for very long. When GNOME switched to Wayland by default in GNOME 3.24 (yes, that was three years ago!), this functionality was left behind. Using an NPAPI plugin also meant the code was inherently unsandboxable and tied to a deprecated technology. Epiphany disabled support for NPAPI plugins by default in Epiphany 3.30, hiding the functionality behind a hidden setting, which has now finally been removed for Epiphany 3.36, killing off NPAPI for good.

Jan-Michael Brummer, who comaintains Epiphany with me, tried bringing back PDF support for Epiphany 3.34 using libevince, but eventually we decided to give up on this approach due to difficulty solving some user experience issues. Also, the rendering occurred in the unsandboxed UI process, which was again not good for security.

But PDF support is now back in Epiphany 3.36, and much better than before! Thanks to Jan-Michael, Epiphany now supports displaying PDFs using the amazing PDF.js. We are thankful for Mozilla’s work in developing PDF.js and open sourcing it for us to use. Viewing PDFs in Epiphany using PDF.js is more convenient than downloading them and opening them in Evince, and because the PDF is rendered in the sandboxed web process, using web technologies rather than poppler, it’s also approximately one bazillion times more secure.

Screenshot of Epiphany displaying a PDF document
Look, it’s a PDF!

One limitation of PDF.js is that it does not support forms. If you need to fill out PDF forms, you’ll need to download the PDF and open it in Evince, just as you would if using Firefox.

Dark Mode

Thanks to Carlos Garcia, it should finally be possible to use Epiphany with dark GTK themes. WebKitGTK has historically rendered HTML elements using the GTK theme, which has not been good for users of dark themes, which broke badly on many websites, usually due to dark text being drawn on dark backgrounds or various other problems with unexpected dark widgets. Since WebKitGTK 2.28, WebKit will try to manually change to a light GTK theme when it thinks a dark theme is in use, then use the light theme to render web content. (This work has actually been backported to WebKitGTK 2.26.4, so you don’t need to upgrade to WebKitGTK 2.28 to benefit, but the work landed very recently and we haven’t blogged about it yet.) Thanks to Cassidy James from elementary for providing example pages for testing dark mode behavior.

Screenshot demonstrating broken dark mode support
Broken dark mode support prior to WebKitGTK 2.26.4. Notice that the first two pages use dark color schemes when light color schemes are expected, and the dark blue links are hard to read over the dark gray background. Also notice that the text in the second image is unreadable.

Screenshot demonstrating fixed dark mode support in WebKitGTK 2.26.4
Since WebKitGTK 2.26.4, dark mode works as it does in most other browsers. Websites that don’t support dark mode are light, and websites that do support dark mode are dark. Widgets themed using GTK are always light.

Since Carlos had already added support for the prefers-color-scheme media query last year, this now gets us up to dark mode parity with most browsers, except, notably, Safari. Unlike other browsers, Safari allows websites to opt-in to rendering dark system widgets, like WebKitGTK used to do before these changes. Whether to support this in WebKitGTK remains to-be-determined.

Process Swap on Navigation (PSON)

PSON, which debuted in Safari 13, is a major change in WebKit’s process model. PSON is the first component of site isolation, which Chrome has supported for some time, and which Firefox is currently working towards. If you care about web security, you should care a lot about site isolation, because the web browser community has arrived at a consensus that this is the best way to mitigate speculative execution attacks.

Nowadays, all modern web browsers use separate, sandboxed helper processes to render web content, ensuring that the main user interface process, which is unsandboxed, does not touch untrusted web content. Prior to 3.36, Epiphany already used a separate web process to display each browser tab (except for “related views,” where one tab opens another and gains scripting ability over the opened tab, subject to the Same Origin Policy). But in Epiphany 3.36, we now also have a separate web process per website. Each tab will swap between different web processes when navigating between different websites, to prevent any one web process from loading content from different websites.

To make these process swap navigations fast, a pool of prewarmed processes is used to hide the startup cost of launching a new process by ensuring the new process exists before it’s needed; otherwise, the overhead of launching a new web process to perform the navigation would become noticeable. And suspended processes live on after they’re no longer in use because they may be needed for back/forward navigations, which use WebKit’s page cache when possible. (In the page cache, pages are kept in memory indefinitely, to make back/forward navigations fast.)

Due to internal refactoring, PSON previously necessitated some API breakage in WebKitGTK 2.26 that affected Evolution and Geary: WebKitGTK 2.26 deprecated WebKit’s single web process model and required that all applications use one web process per web view, which Evolution and Geary were not, at the time, prepared to handle. We tried hard to avoid this, because we hate to make behavioral changes that break applications, but in this case we decided it was unavoidable. That was the status quo in 2.26, without PSON, which we disabled just before releasing 2.26 in order to limit application breakage to just Evolution and Geary. Now, in WebKitGTK 2.28, PSON is finally available for applications to use on an opt-in basis. (It will become mandatory in the future, for GTK 4 applications.) Epiphany 3.36 opts in. To make this work, Carlos Garcia designed new WebKitGTK APIs for cross-process communication, and used them to replace the private D-Bus server that Epiphany previously used for this purpose.

WebKit still has a long way to go to fully implement site isolation, but PSON is a major step down that road. Thanks to Brady Eidson and Chris Dumez from Apple for making this work, and to Carlos Garcia for handling most of the breakage (there was a lot). As with any major intrusive change of such magnitude, regressions are inevitable, so don’t hesitate to report issues on WebKit Bugzilla.

highlight.js

Once upon a time, WebKit had its own implementation for viewing page source, but this was removed from WebKit way back in 2014, in WebKitGTK 2.6. Ever since, Epiphany would open your default text editor, usually gedit, to display page source. Suffice to say that this was not a very satisfactory solution.

I finally managed to implement view source mode at the Epiphany level for Epiphany 3.30, but I had trouble making syntax highlighting work. I tried using various open source syntax highlighting libraries, but most are designed to highlight small amounts of code, not large web pages. The libraries I tried were not fast enough, so I gave up on syntax highlighting at the time.

Thanks to Jan-Michael, Epiphany 3.36 supports syntax highlighting using highlight.js, so we finally have view source mode working fully properly once again. It works much better than my failed attempts with different JS libraries. Please thank the highlight.js developers for maintaining this library, and for making it open source.

Screenshot displaying Epiphany's view source mode
Colors!

Service Workers

Service workers are now available in WebKitGTK 2.28. Our friends at Apple had already implemented service worker support a couple years ago for Safari 11, but we were pretty slow in bringing this functionality to Linux. Finally, WebKitGTK should now be up to par with Safari in this regard.

Cookies!

Patrick Griffis has updated libsoup and WebKitGTK to support SameSite cookies. He’s also tightened up our cookie policy by implementing strict secure cookies, which prevents http:// pages from setting secure cookies (as they could overwrite secure cookies set by https:// pages).

Adaptive Design

As usual, there are more adaptive design improvements throughout the browser, to provide a better user experience on the Librem 5. There’s still more work to be done here, but Epiphany continues to provide the best user experience of any Linux browser at small screen sizes. Thanks to Adrien Plazas and Jan-Michael for their continued work on this.

Screenshot showing Epiphany running in mobile mode at small window size.
As before, simply resize your browser window to see Epiphany dynamically transition between desktop mode and mobile mode.

elementary OS

With help from Alexander Mikhaylenko, we’ve also upstreamed many elementary OS design changes, which will be used when running under the Pantheon desktop (and not impact users on other desktops), so that the elementary developers don’t need to maintain their customizations as separate patches anymore. This will eliminate a few elementary-specific bugs, including some keyboard shortcuts that were previously broken only in elementary, and some odd tab bar behavior. Although Epiphany still doesn’t feel quite as native as an app designed just for elementary OS, it’s getting closer.

Epiphany 3.34

I failed to blog about Epiphany 3.34 when I released it last September. Hopefully you have updated to 3.34 already, and are already enjoying the two big features from this release: the new adblocker, and the bubblewrap sandbox.

The new adblocker is based on WebKit Content Blockers, which was developed by Apple several years ago. Adrian Perez developed new WebKitGTK API to expose this functionality, changed Epiphany to use it, and deleted Epiphany’s older resource-hungry adblocker that was originally copied from Midori. Previously, Epiphany kept a large GHashMap of compiled regexes in every web process, consuming a very significant amount of RAM for each process. It also took time to compile these regexes when launching each new web process. Now, the adblock filters are instead compiled into an efficient bytecode format that gets mmapped between all web processes to avoid excessive resource use. The bytecode is interpreted by WebKit itself, rather than by Epiphany’s web process extension (which Epiphany uses to execute custom code in WebKit’s web process), for greatly improved performance.

Lastly, Epiphany 3.34 enabled Patrick’s bubblewrap sandbox, which was added in WebKitGTK 2.26. Bubblewrap is an amazing sandboxing tool, already used effectively by flatpak and rpm-ostree, and I’m very pleased with Patrick’s decision to use it for WebKit as well. Because enabling the sandbox can break applications, it is currently opt-in for GTK 3 apps (but will become mandatory for GTK 4 apps). If your application uses WebKitGTK, you really need to take some time to enable this sandbox using webkit_web_context_set_sandbox_enabled(). The sandbox has introduced a couple regressions that we didn’t notice until too late; notably,  printing no longer works, which, half a year later, we still haven’t managed to fix yet. (I’ll try to get to it soon.)

OK, this concludes your 3.36 and 3.34 updates. Onward to 3.38!

by Michael Catanzaro at March 11, 2020 03:00 PM



March 02, 2020

Flatpak repository for WPE

by Žan Doberšek

To let developers play with the WPE stack, we have set up a Flatpak repository containing all the necessary bits to start working with it. To install applications (like Cog, the very simple WPE launcher), first add the remote repository, and proceed with the following instructions:

$ flatpak --user remote-add wpe-releases --from https://software.igalia.com/flatpak-refs/wpe-releases.flatpakrepo
$ flatpak --user install org.wpe.Cog
$ flatpak run org.wpe.Cog -P fdo <url>

Currently the 2.26 release of the WPE port is used, along with libwpe 1.4.0, WPEBackend-fdo 1.4.0 and Cog 0.4.0. Upgrades to the newer releases (happening in next few weeks) will be done in the next month or two. Builds are provided for x86_64, arm and aarch64 architectures.

If you need ideas or inspiration on how to use WPE, this repository also contains GstWPEBroadcastDemo, an application that showcases both GStreamer and WPE, enabling you to mix live video input with HTML content that can be updated on-the-fly. You can read more about this in the blog post made by Philippe Normand.

The current Cog/WPE stack still imposes the Wayland-only limitation, with Mesa-based graphics stacks most likely to work well. In future release, we plan to add support for new platforms, graphics stacks and methods of integration.

All of this is still in very early stages. If you find an issue with the applications or libraries in the repository, please do not hesitate to report it to our issue tracker. The issues will be rerouted to the trackers of the problematic component if necessary.

by Žan Doberšek at March 02, 2020 09:00 AM



December 08, 2019

HTML overlays with GstWPE, the demo

by Philippe Normand

Once again this year I attended the GStreamer conference and just before that, Embedded Linux conference Europe which took place in Lyon (France). Both events were a good opportunity to demo one of the use-cases I have in mind for GstWPE, HTML overlays!

As we, at Igalia, usually have a …

by Philippe Normand at December 08, 2019 02:00 PM



September 18, 2019

Epiphany Technology Preview Users: Action Required

by Michael Catanzaro

Epiphany Technology Preview has moved from https://sdk.gnome.org to https://nightly.gnome.org. The old Epiphany Technology Preview is now end-of-life. Action is required to update. If you installed Epiphany Technology Preview prior to a couple minutes ago, uninstall it using GNOME Software and then reinstall using this new flatpakref.

Apologies for this disruption.

The main benefit to end users is that you’ll no longer need separate remotes for nightly runtimes and nightly applications, because everything is now hosted in one repo. See Abderrahim’s announcement for full details on why this transition is occurring.

by Michael Catanzaro at September 18, 2019 02:19 PM



September 08, 2019

WebKit Vulnerabilities Facilitate Human Rights Abuses

by Michael Catanzaro

Chinese state actors have recently abused vulnerabilities in the JavaScriptCore component of WebKit to hack the personal computing devices of Uighur Muslims in the Xinjiang region of China. Mass digital surveillance is a key component of China’s ongoing brutal human rights crackdown in the region.

This has resulted in a public relations drama that is largely a distraction to the issue at hand. Whatever big-company PR departments have to say on the matter, I have no doubt that the developers working on WebKit recognize the severity of this incident and are grateful to Project Zero, which reported these vulnerabilities and has previously provided numerous other high-quality private vulnerability reports. (Many other organizations deserve credit for similar reports, especially Trend Micro’s Zero Day Initiative.)

WebKit as a project will need to reassess certain software development practices that may have facilitated the abuse of these vulnerabilities. The practice of committing security fixes to open source long in advance of corresponding Safari releases may need to be reconsidered.

Sadly, Uighurs should assume their personal computing devices have been compromised by state-sponsored attackers, and that their private communications are not private. Even if not compromised in this particular incident, similar successful attacks are overwhelmingly likely in the future.

by Michael Catanzaro at September 08, 2019 05:32 PM



June 28, 2019

On Version Numbers

by Michael Catanzaro

I’m excited to announce that Epiphany Tech Preview has reached version 3.33.3-33, as computed by git describe. That is 33 commits after 3.33.3:

Epiphany about dialog displaying the version number

I’m afraid 3.33.4 will arrive long before we  make it to 3.33.3-333, so this is probably the last cool version number Epiphany will ever have.

I might be guilty of using an empty commit to claim the -33 commit.

I might also apologize for wasting your time with a useless blog post, except this was rather fun. I await the controversy of your choice in the comments.

by Michael Catanzaro at June 28, 2019 04:07 PM



June 10, 2019

A new terminal-style line breaking with CSS Text

by Javier Fernández

The CSS Text 3 specification defines a module for text manipulation and covers, among a few other features, the line breaking behavior of the browser, including white space handling. I’ve been working lately on some new features and bug fixing for this specification and I’d like to introduce in this posts the last one we made available for the Web Platform users. This is yet another contribution that came out the collaboration between Igalia and Bloomberg, which has been held for several years now and has produced many important new features for the Web, like CSS Grid Layout.

The feature

I guess everybody knows the white-space CSS property, which allows web authors to control two main aspects of the rendering of a text line: collapsing and wrapping. A new value break-spaces has been added to the ones available for this property, which allows web authors to emulate a terminal-like line breaking behavior. This new value operates basically like pre-wrap, but with two key differences:

  • any sequence of preserved white space characters takes up space, even at the end of the line.
  • a preserved white space sequence can be wrapped at any character, moving the rest of the sequence, intact, to the line bellow.

What does this new behavior actually mean ? I’ll try to explain it with a few examples. Lets start with a simple but quite illustrative demo which tries to emulate a meteorology monitoring system which shows relevant changes over time, where the gaps between subsequent changes must be preserved:

 #terminal {
  font: 20px/1 monospace;
  width: 340px;
  height: 5ch;
  background: black;
  color: green;
  overflow: hidden;
  white-space: break-spaces;
  word-break: break-all;
 }







Another interesting use case for this feature could be a logging system which should preserve the text formatting of the logged information, considering different window sizes. The following demo tries to describe this such scenario:

body { width: 1300px; }
#logging {
  font: 20px/1 monospace;
  background: black;
  color: green;

  animation: resize 7s infinite alternate;

  white-space: break-spaces;
  word-break: break-all;
}
@keyframes resize {
  0% { width: 25%; }
  100% { width: 100%; }
}






Hash: 5a2a3d23f88174970ed8 Version: webpack 3.12.0 Time: 22209ms Asset Size Chunks Chunk Names pages/widgets/index.51838abe9967a9e0b5ff.js 1.17 kB 10 [emitted] pages/widgets/index img/icomoon.7f1da5a.svg 5.38 kB [emitted] fonts/icomoon.2d429d6.ttf 2.41 kB [emitted] img/fontawesome-webfont.912ec66.svg 444 kB [emitted] [big] fonts/fontawesome-webfont.b06871f.ttf 166 kB [emitted] img/mobile.8891a7c.png 39.6 kB [emitted] img/play_button.6b15900.png 14.8 kB [emitted] img/keyword-back.f95e10a.jpg 43.4 kB [emitted] . . .

Use cases

In the demo shown before there are several cases that I think it’s worth to analyze in detail.

A breaking opportunity exists after any white space character

The main purpose of this feature is to preserve the white space sequences length even when it has to be wrapped into multiple lines. The following example tries to describe this basic use case:

.container {
  font: 20px/1 monospace;
  width: 5ch;
  white-space: break-spaces;
  border: 1px solid;
}






XX XX

The example above shows how the white space sequence with a length of 15 characters is preserved and wrapped along 3 different lines.

Single leading white space

Before the addition of the break-spaces value this scenario was only possible at the beginning of the line. In any other case, the trailing white spaces were either collapsed or hang, hence the next line couldn’t start with a sequence of white spaces. Lets consider the following example:

.container {
  font: 20px/1 monospace;
  width: 3ch;
  white-space: break-spaces;
  border: 1px solid;
}






XX XX

Like when using pre-wrap, the single leading space is preserved. Since break-spaces allows breaking opportunities after any white space character, we break after the first leading white space (” |XX XX”). The second line can be broken after the first preserved white space, creating another leading white space in the next line (” |XX | XX”).

However, lets consider now a case without such first single leading white space.

.container {
  font: 20px/1 monospace;
  width: 3ch;
  white-space: break-spaces;
  border: 1px solid;
}






XXX XX

Again, it s not allowed to break before the first space, but in this case there isn’t any previous breaking opportunity, so the first space after the word XX should overflow (“XXX | XX”); the next white space character will be moved down to the next line as preserved leading space.

Breaking before the first white space

I mentioned before that the spec states clearly that the break-space feature allows breaking opportunities only after white space characters. However, it’d be possible to break the line just before the first white space character after a word if the feature is used in combination with other line breaking CSS properties, like word-break or overflow-wrap (and other properties too).

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  border: 1px solid;
}






XXXX X

The two white spaces between the words are preserved due to the break-spaces feature, but the first space after the XXXX word would overflow. Hence, the overflow-wrap: break-word feature is applied to prevent the line to overflow and introduce an additional breaking opportunity just before the first space after the word. This behavior causes that the trailing spaces are moved down as a leading white space sequence in the next line.

We would get the same rendering if word-break: break-all is used instead overflow-wrap (or even in combination), but this is actualy an incorrect behavior, which has the corresponding bug reports in WebKit (197277) and Blink (952254) according to the discussion in the CSS WG (see issue #3701).

Consider previous breaking opportunities

In the previous example I described a combination of line breaking features that would allow breaking before the first space after a word. However, this should be avoided if there are previous breaking opportunities. The following example is one of the possible scenarios where this may happen:

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  border: 1px solid;
}






XX X X

In this case, we could break after the second word (“XX X| X”), since overflow-wrap: break-word would allow us to do that in order to avoid the line to overflow due to the following white space. However, white-space: break-spaces only allows breaking opportunities after a space character, hence, we shouldn’t break before if there are valid previous opportunities, like in this case in the space after the first word (“XX |X X”).

This preference for previous breaking opportunities before breaking the word, honoring the overflow-wrap property, is also part of the behavior defined for the white-space: pre-wrap feature; although in that case, there is no need to deal with the issue of breaking before the first space after a word since trailing space will just hang. The following example uses just the pre-wrap to show how previous opportunities are selected to avoid overflow or breaking a word (unless explicitly requested by word-break property).

.container {
  font: 20px/1 monospace;
  width: 2ch;
  white-space: pre-wrap;
  border: 1px solid;
}






XX
overflow-wrap:
break-word
word-break:
break-all

In this case, break-all enables breaking opportunities that are not available otherwise (we can break a word at any letter), which can be used to prevent the line to overflow; hence, the overflow-wrap property doesn’t take any effect. The existence of previous opportunities is not considered now, since break-all mandates to produce the longer line as possible.

This new white-space: break-spaces feature implies a different behavior when used in combination with break-all. Even though the preference of previous opportunities should be ignored if we use the word-break: break-all, this may not be the case for the breaking before the first space after a word scenario. Lets consider the same example but using now the word-break: break-all feature:

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  word-break: break-all;
  border: 1px solid;
}






XX X X

The example above shows that using word-break: break-all doesn’t produce any effect. It’s debatable whether the use of break-all should force the selection of the breaking opportunity that produces the longest line, like it happened in the pre-wrap case described before. However, the spec states clearly that break-spaces should only allow breaking opportunities after white space characters. Hence, I considered that breaking before the first space should only happen if there is no other choice.

As a matter of fact, specifying break-all we shouldn’t considering only previous white spaces, to avoid breaking before the first white space after a word; the break-all feature creates additional breaking opportunities, indeed, since it allows to break the word at any character. Since break-all is intended to produce the longest line as possible, this new breaking opportunity should be chosen over any previous white space. See the following test case to get a clearer idea of this scenario:

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  word-break: break-all;
  border: 1px solid;
}






X XX X

Bear in mind that the expected rendering in the above example may not be obtained if your browser’s version is still affected by the bugs 197277(Safari/WebKit) and 952254(Chrome/Blink). In this case, the word is broken despite the opportunity in the previous white space, and also avoiding breaking after the ‘XX’ word, just before the white space.

There is an exception to the rule of avoiding breaking before the first white space after a word if there are previous opportunities, and it’s precisely the behavior the line-break: anywhere feature would provide. As I said, all these assumptions were not, in my opinion, clearly defined in the current spec, so that’s why I filed an issue for the CSS WG so that we can clarify when it’s allowed to break before the first space.

Current status and support

The intent-to-ship request for Chrome has been approved recently, so I’m confident the feature will be enabled by default in Chrome 76. However, it’s possible to try the feature in older versions by enabling the Experimental Web Platform Features flag. More details in the corresponding Chrome Status entry. I want to highlight that I also implemented the feature for LayoutNG, the new layout engine that Chrome will eventually ship; this achievement is very important to ensure the stability of the feature in future versions of Chrome.

In the case of Safari, the patch with the implementation of the feature landed in the WebKit’s trunk in r244036, but since Apple doesn’t announce publicly when a new release of Safari will happen or which features it’ll ship, it’s hard to guess when the break-spaces feature will be available for the web authors using such browser. Meanwhile, It’s possible to try the feature in the Safari Technology Preview 80.

Finally, while I haven’t see any signal of active development in Firefox, some of the Mozilla developers working on this area of the Gecko engine have shown public support for the feature.

The following table summarizes the support of the break-spaces feature in the 3 main browsers:

Chrome Safari Firefox
Experimental M73 STP 80 Public support
Ship M76 Unknown Unknown

Web Platform Tests

At Igalia we believe that the Web Platform Tests project is a key piece to ensure the compatibility and interoperability of any development on the Web Platform. That’s why a substantial part of my work to implement this relatively small feature was the definition of enough tests to cover the new functionality and basic use cases of the feature.

white-space overflow-wrap word-break
pre-wrap-008
pre-wrap-015
pre-wrap-016
break-spaces-003
break-spaces-004
break-spaces-005
break-spaces-006
break-spaces-007
break-spaces-008
break-spaces-009
break-word-004
break-word-005
break-word-006
break-word-007
break-word-008
break-all-010
break-all-011
break-all-012
break-all-013
break-all-014
break-all-015

Implementation in several web engines

During the implementation of a browser feature, even a small one like this, it’s quite usual to find out bugs and interoperability issues. Even though this may slow down the implementation of the feature, it’s also a source of additional Web Platform tests and it may contribute to the robustness of the feature itself and the related CSS properties and values. That’s why I decided to implement the feature in parallel for WebKit (Safari) and Blink (Chrome) engines, which I think it helped to ensure interoperability and code maturity. This approach also helped to get a deeper understanding of the line breaking logic and its design and implementation in different web engines.

I think it’s worth mentioning some of these code architectural differences, to get a better understanding of the work and challenges this feature required until it reached web author’s browser.

Chrome/Blink engine

Lets start with Chrome/Blink, which was especially challenging due to the fact that Blink is implementing a new layout engine (LayoutNG). The implementation for the legacy layout engine was the first step, since it ensures the feature will arrive earlier, even behind an experimental runtime flag.

The legacy layout relies on the BreakingContext class to implement the line breaking logic for the inline layout operations. It has the main characteristic of handling the white space breaking opportunities by its own, instead of using the TextBreakIterator (based on ICU libraries), as it does for determining breaking opportunities between letters and/or symbols. This design implies too much complexity to do even small changes like this, especially because is very sensible in terms of performance impact. In the following diagram I try to show a simplified view of the classes involved and the interactions implemented by this line breaking logic.

The LayoutNG line breaking logic is based on a new concept of fragments, mainly handled by the NGLineBreaker class. This new design simplifies the line breaking logic considerably and it’s highly optimized and adapted to get the most of the TextBreakIterator classes and the ICU features. I tried to show a simplified view of this new design with the following diagram:

In order to describe the work done to implement the feature for this web engine, I’ll list the main bugs and patches landed during this time: CR#956465, CR#952254, CR#944063,CR#900727, CR#767634, CR#922437

Safari/WebKit engine

Although as time passes this is less probable, WebKit and Blink still share some of the layout logic from the ages prior to the fork. Although Blink engineers have applied important changes to the inline layout logic, both code refactoring and optimizations, there are common design patterns that made relatively easy porting to WebKit the patches that implemented the feature for the Blink’s legacy layout. In WebKit, the line breaking logic is also implemented by the BreakingContext class and it has a similar architecture, as it’s described, in a quite simplified way, in the class diagram above (it uses different class names for the render/layout objects, though) .

However, Safari supports for the mac and iOS platforms a different code path for the line breaking logic, implemented in the SimpleLineLayout class. This class provides a different design for the line breaking logic, and, similar to what Blink implements in LayoutNG, is based on a concept of text fragments. It also relies as much as possible into the TextBreakIterator, instead of implementing complex rules to handle white spaces and breaking opportunities. The following diagrams show this alternate design to implement the line breaking process.

This SimpleLineLayout code path in not supported by other WebKit ports (like WebKitGtk+ or WPE) and it’s not available either when using some CSS Text features or specific fonts. There are other limitations to use this SimpleLineLayout codepath, which may lead to render the text using the BreakingContext class.

Again, this is the list of bugs that were solved to implement the feature for the WebKit engine: WK#197277, WK#196169, WK#196353, WK#195361, WK#177327, WK#197278

Conclusion

I hope that at this point these 2 facts are clear now:

  • The white-space: break-spaces feature is a very simple but powerful feature that provides a new line breaking behavior, based on unix-terminal systems.
  • Although it’s a simple feature, on the paper (spec), it implies a considerable amount of work so that it reaches the browser and it’s available for web authors.

In this post I tried to explain in a simple way the main purpose of this new feature and also some interesting corner cases and combinations with other Line Breaking features. The demos I used shown 2 different use cases of this feature, but there are may more. I’m sure the creativity of web authors will push the feature to the limits; by then, I’ll be happy to answer doubts, about the spec or the implementation for the web engines, and of course fix the bugs that may appear once the feature is more used.

Igalia logo
Bloomberg logo

Igalia and Bloomberg working together to build a better web

Finally, I want to thank Bloomberg for supporting the work to implement this feature. It’s another example of how non-browser vendors can influence the Web Platform and contribute with actual features that will be eventually available for web authors. This is the kind of vision that we need if we want to keep a healthy, open and independent Web Platform.

by jfernandez at June 10, 2019 08:11 PM



January 08, 2019

Epiphany automation mode

by Carlos García Campos

Last week I finally found some time to add the automation mode to Epiphany, that allows to run automated tests using WebDriver. It’s important to note that the automation mode is not expected to be used by users or applications to control the browser remotely, but only by WebDriver automated tests. For that reason, the automation mode is incompatible with a primary user profile. There are a few other things affected by the auotmation mode:

  • There’s no persistency. A private profile is created in tmp and only ephemeral web contexts are used.
  • URL entry is not editable, since users are not expected to interact with the browser.
  • An info bar is shown to notify the user that the browser is being controlled by automation.
  • The window decoration is orange to make it even clearer that the browser is running in automation mode.

So, how can I write tests to be run in Epiphany? First, you need to install a recently enough selenium. For now, only the python API is supported. Selenium doesn’t have an Epiphany driver, but the WebKitGTK driver can be used with any WebKitGTK+ based browser, by providing the browser information as part of session capabilities.

from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
options.set_capability('browserName', 'Epiphany')
options.set_capability('version', '3.31.4')

ephy = webdriver.WebKitGTK(options=options, desired_capabilities={})
ephy.get('http://www.webkitgtk.org')
ephy.quit()

This is a very simple example that just opens Epiphany in automation mode, loads http://www.webkitgtk.org and closes Epiphany. A few comments about the example:

  • Version 3.31.4 will be the first one including the automation mode.
  • The parameter desired_capabilities shouldn’t be needed, but there’s a bug in selenium that has been fixed very recently.
  • WebKitGTKOptions.set_capability was added in selenium 3.14, if you have an older version you can use the following snippet instead
from selenium import webdriver

options = webdriver.WebKitGTKOptions()
options.binary_location = 'epiphany'
options.add_argument('--automation-mode')
capabilities = options.to_capabilities()
capabilities['browserName'] = 'Epiphany'
capabilities['version'] = '3.31.4'

ephy = webdriver.WebKitGTK(desired_capabilities=capabilities)
ephy.get('http://www.webkitgtk.org')
ephy.quit()

To simplify the driver instantation you can create your own Epiphany driver derived from the WebKitGTK one:

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        options.set_capability('browserName', 'Epiphany')
        options.set_capability('version', '3.31.4')

        webdriver.WebKitGTK.__init__(self, options=options, desired_capabilities={})

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

The same for selenium < 3.14

from selenium import webdriver

class Epiphany(webdriver.WebKitGTK):
    def __init__(self):
        options = webdriver.WebKitGTKOptions()
        options.binary_location = 'epiphany'
        options.add_argument('--automation-mode')
        capabilities = options.to_capabilities()
        capabilities['browserName'] = 'Epiphany'
        capabilities['version'] = '3.31.4'

        webdriver.WebKitGTK.__init__(self, desired_capabilities=capabilities)

ephy = Epiphany()
ephy.get('http://www.webkitgtk.org')
ephy.quit()

by carlos garcia campos at January 08, 2019 05:22 PM



October 20, 2017

Web Engines Hackfest, 2017 Edition

by Adrián Pérez de Castro

At the beginning of October I had the wonderful chance of attending the Web Engines Hackfest in A Coruña, hosted by Igalia. This year we were over 50 participants, which was great to associate even more faces to IRC nick names, but more importantly allows hackers working at all the levels of the Web stack to share a common space for a few days, making it possible to discuss complex topics and figure out the future of the projects which allow humanity to see pictures of cute kittens — among many other things.

Random kitten image
Mandatory fluff ([CC-BY-NC](https://www.flickr.com/photos/nwater/6781786068)).

During the hackfest I worked mostly on three things:

  • Preparing the code of the WPE WebKit port to start making preview releases.

  • A patch set which adds WPE packages to Buildroot.

  • Enabling support for the CSS generic system font family.

Fun trivia: Most of the WebKit contributors work from the United States, so the week of the Web Engines hackfest is probably the only single moment during the whole year that there is a sizeable peak of activity in European day times.

Watching repository activity during the hackfest.

Towards WPE Releases

At Igalia we are making an important investment in the WPE WebKit port, which is specially targeted towards embedded devices. An important milestone for the project was reached last May when the code was moved to main WebKit repository, and has been receiving the usual stream of improvements and bug fixes. We are now approaching the moment where we feel that is is ready to start making releases, which is another major milestone.

Our plan for the WPE is to synchronize with WebKitGTK+, and produce releases for both in parallel. This is important because both ports share a good amount of their code and base dependencies (GStreamer, GLib, libsoup) and our efforts to stabilize the GTK+ port before each release will benefit the WPE one as well, and vice versa. In the coming weeks we will be publishing the first official tarball starting off the WebKitGTK+ 2.18.x stable branch.

Wild WEBKIT PORT appeared!

Syncing the releases for both ports means that:

  • Both stable and unstable releases are done in sync with the GNOME release schedule. Unstable releases start at version X.Y.1, with Y being an odd number.

  • About one month before the release dates, we create a new release branch and from there on we work on stabilizing the code. At least one testing release with with version X.Y.90 will be made. This is also what GNOME does, and we will mimic this to avoid confusion for downstream packagers.

  • The stable release will have version X.Y+1.0. Further maintenance releases happen from the release branch as needed. At the same time, a new cycle of unstable releases is started based on the code from the tip of the repository.

Believe it or not, preparing a codebase for its first releases involves quite a lot of work, and this is what took most of my coding time during the Web Engines Hackfest and also the following weeks: from small fixes for build failures all the way to making sure that public API headers (only the correct ones!) are installed and usable, that applications can be properly linked, and that release tarballs can actually be created. Exhausting? Well, do not forget that we need to set up a web server to host the tarballs, a small website, and the documentation. The latter has to be generated (there is still pending work in this regard), and the whole process of making a release scripted.

Still with me? Great. Now for a plot twist: we won’t be making proper releases just yet.

APIs, ABIs, and Releases

There is one topic which I did not touch yet: API/ABI stability. Having done a release implies that the public API and ABI which are part of it are stable, and they are not subject to change.

Right after upstreaming WPE we switched over from the cross-port WebKit2 C API and added a new, GLib-based API to WPE. It is remarkably similar (if not the same in many cases) to the API exposed by WebKitGTK+, and this makes us confident that the new API is higher-level, more ergonomic, and better overall. At the same time, we would like third party developers to give it a try (which is easier having releases) while retaining the possibility of getting feedback and improving the WPE GLib API before setting it on stone (which is not possible after a release).

It is for this reason that at least during the first WPE release cycle we will make preview releases, meaning that there might be API and ABI changes from one release to the next. As usual we will not be making breaking changes in between releases of the same stable series, i.e. code written for 2.18.0 will continue to build unchanged with any subsequent 2.18.X release.

At any rate, we do not expect the API to receive big changes because —as explained above— it mimics the one for WebKitGTK+, which has already proven itself both powerful enough for complex applications and convenient to use for the simpler ones. Due to this, I encourage developers to try out WPE as soon as we have the first preview release fresh out of the oven.

Packaging for Buildroot

At Igalia we routinely work with embedded devices, and often we make use of Buildroot for cross-compilation. Having actual releases of WPE will allow us to contribute a set of build definitions for the WPE WebKit port and its dependencies — something that I have already started working on.

Lately I have been taking care of keeping the WebKitGTK+ packaging for Buildroot up-to-date and it has been delightful to work with such a welcoming community. I am looking forward to having WPE supported there, and to keep maintaining the build definitions for both. This will allow making use of WPE with relative ease, while ensuring that Buildroot users will pick our updates promptly.

Generic System Font

Some applications like GNOME Web Epiphany use a WebKitWebView to display widget-like controls which try to follow the design of the rest of the desktop. Unfortunately for GNOME applications this means Cantarell gets hardcoded in the style sheet —it is the default font after all— and this results in mismatched fonts when the user has chosen a different font for the interface (e.g. in Tweaks). You can see this in the following screen capture of Epiphany:

Web using hardcoded Cantarell and (on hover) `-webkit-system-font`.

Here I have configured the beautiful Inter UI font as the default for the desktop user interface. Now, if you roll your mouse over the image, you will see how much better it looks to use a consistent font. This change also affects the list of plugins and applications, error messages, and in general all the about: pages.

If you are running GNOME 3.26, this is already fixed using font: menu (part of the CSS spec since ye olde CSS 2.1) — but we can do better: Safari has had support since 2015, for a generic “system” font family, similar to sans-serif or cursive:

/* Using the new generic font family (nice!). */
body {
    font-family: -webkit-system-font;
}

/* Using CSS 2.1 font shorthands (not so nice). */
body {
    font: menu;       /* Pick ALL font attributes... */
    font-size: 12pt;  /* ...then reset some of them. */
    font-weight: 400;
}

During the hackfest I implemented the needed moving parts in WebKitGTK+ by querying the GtkSettings::gtk-font-name property. This can be used in HTML content shown in Epiphany as part of the UI, and to make the Web Inspector use the system font as well.

Web Inspector using Cantarell, the default GNOME 3 font ([full size](minibrowser-inspector-cantarell.png)).

I am convinced that users do notice and appreciate attention to detail, even if they do unconsciously, and therefore it is worthwhile to work on this kind of improvements. Plus, as a design enthusiast with a slight case of typographic OCD, I cannot stop myself from noticing inconsistent usage of fonts and my mind is now at ease knowing that opening the Web Inspector won’t be such a jarring experience anymore.

Outro

But there’s one more thing: On occasion we developers have to debug situations in which a process is seemingly stuck. One useful technique involves running the offending process under the control of a debugger (or, in an embedded device, under gdbserver and controlled remotely), interrupting its execution at intervals, and printing stack traces to try and figure out what is going on. Unfortunately, in some circumstances running a debugger can be difficult or impractical. Wouldn’t it be grand if it was possible to interrupt the process without needing a debugger and request a stack trace? Enter “Out-Of-Band Stack Traces” (proof of concept):

  1. The process installs a signal handler using sigaction(7), with the SA_SIGINFO flag set.

  2. On reception of the signal, the kernel interrupts the process (even if it’s in an infinite loop), and invokes the signal handler passing an extra pointer to an ucontext_t value, which contains a snapshot of the execution status of the thread which was in the CPU before the signal handler was invoked. This is true for many platform including Linux and most BSDs.

  3. The signal handler code can get obtain the instruction and stack pointers from the ucontext_t value, and walk the stack to produce a stack trace of the code that was being executed. Jackpot! This is of course architecture dependent but not difficult to get right (and well tested) for the most common ones like x86 and ARM.

The nice thing about this approach is that the code that obtains the stack trace is built into the program (no rebuilds needed), and it does not even require to relaunch the process in a debugger — which can be crucial for analyzing situations which are hard to reproduce, or which do not happen when running inside a debugger. I am looking forward to have some time to integrate this properly into WebKitGTK+ and specially WPE, because it will be most useful in embedded devices.

by aperez (adrian@perezdecastro.org) at October 20, 2017 11:30 PM



September 09, 2017

WebDriver support in WebKitGTK+ 2.18

by Carlos García Campos

WebDriver is an automation API to control a web browser. It allows to create automated tests for web applications independently of the browser and platform. WebKitGTK+ 2.18, that will be released next week, includes an initial implementation of the WebDriver specification.

WebDriver in WebKitGTK+

There’s a new process (WebKitWebDriver) that works as the server, processing the clients requests to spawn and control the web browser. The WebKitGTK+ driver is not tied to any specific browser, it can be used with any WebKitGTK+ based browser, but it uses MiniBrowser as the default. The driver uses the same remote controlling protocol used by the remote inspector to communicate and control the web browser instance. The implementation is not complete yet, but it’s enough for what many users need.

The clients

The web application tests are the clients of the WebDriver server. The Selenium project provides APIs for different languages (Java, Python, Ruby, etc.) to write the tests. Python is the only language supported by WebKitGTK+ for now. It’s not yet upstream, but we hope it will be integrated soon. In the meantime you can use our fork in github. Let’s see an example to understand how it works and what we can do.

from selenium import webdriver

# Create a WebKitGTK driver instance. It spawns WebKitWebDriver 
# process automatically that will launch MiniBrowser.
wkgtk = webdriver.WebKitGTK()

# Let's load the WebKitGTK+ website.
wkgtk.get("https://www.webkitgtk.org")

# Find the GNOME link.
gnome = wkgtk.find_element_by_partial_link_text("GNOME")

# Click on the link. 
gnome.click()

# Find the search form. 
search = wkgtk.find_element_by_id("searchform")

# Find the first input element in the search form.
text_field = search.find_element_by_tag_name("input")

# Type epiphany in the search field and submit.
text_field.send_keys("epiphany")
text_field.submit()

# Let's count the links in the contents div to check we got results.
contents = wkgtk.find_element_by_class_name("content")
links = contents.find_elements_by_tag_name("a")
assert len(links) > 0

# Quit the driver. The session is closed so MiniBrowser 
# will be closed and then WebKitWebDriver process finishes.
wkgtk.quit()

Note that this is just an example to show how to write a test and what kind of things you can do, there are better ways to achieve the same results, and it depends on the current source of public websites, so it might not work in the future.

Web browsers / applications

As I said before, WebKitWebDriver process supports any WebKitGTK+ based browser, but that doesn’t mean all browsers can automatically be controlled by automation (that would be scary). WebKitGTK+ 2.18 also provides new API for applications to support automation.

  • First of all the application has to explicitly enable automation using webkit_web_context_set_automation_allowed(). It’s important to know that the WebKitGTK+ API doesn’t allow to enable automation in several WebKitWebContexts at the same time. The driver will spawn the application when a new session is requested, so the application should enable automation at startup. It’s recommended that applications add a new command line option to enable automation, and only enable it when provided.
  • After launching the application the driver will request the browser to create a new automation session. The signal “automation-started” will be emitted in the context to notify the application that a new session has been created. If automation is not allowed in the context, the session won’t be created and the signal won’t be emitted either.
  • A WebKitAutomationSession object is passed as parameter to the “automation-started” signal. This can be used to provide information about the application (name and version) to the driver that will match them with what the client requires accepting or rejecting the session request.
  • The WebKitAutomationSession will emit the signal “create-web-view” every time the driver needs to create a new web view. The application can then create a new window or tab containing the new web view that should be returned by the signal. This signal will always be emitted even if the browser has already an initial web view open, in that case it’s recommened to return the existing empty web view.
  • Web views are also automation aware, similar to ephemeral web views, web views that allow automation should be created with the constructor property “is-controlled-by-automation” enabled.

This is the new API that applications need to implement to support WebDriver, it’s designed to be as safe as possible, but there are many things that can’t be controlled by WebKitGTK+, so we have several recommendations for applications that want to support automation:

  • Add a way to enable automation in your application at startup, like a command line option, that is disabled by default. Never allow automation in a normal application instance.
  • Enabling automation is not the only thing the application should do, so add an automation mode to your application.
  • Add visual feedback when in automation mode, like changing the theme, the window title or whatever that makes clear that a window or instance of the application is controllable by automation.
  • Add a message to explain that the window is being controlled by automation and the user is not expected to use it.
  • Use ephemeral web views in automation mode.
  • Use a temporal user profile in application mode, do not allow automation to change the history, bookmarks, etc. of an existing user.
  • Do not load any homepage in automation mode, just keep an empty web view (about:blank) that can be used when a new web view is requested by automation.

The WebKitGTK client driver

Applications need to implement the new automation API to support WebDriver, but the WebKitWebDriver process doesn’t know how to launch the browsers. That information should be provided by the client using the WebKitGTKOptions object. The driver constructor can receive an instance of a WebKitGTKOptions object, with the browser information and other options. Let’s see how it works with an example to launch epiphany:

from selenium import webdriver
from selenium.webdriver import WebKitGTKOptions

options = WebKitGTKOptions()
options.browser_executable_path = "/usr/bin/epiphany"
options.add_browser_argument("--automation-mode")
epiphany = webdriver.WebKitGTK(browser_options=options)

Again, this is just an example, Epiphany doesn’t even support WebDriver yet. Browsers or applications could create their own drivers on top of the WebKitGTK one to make it more convenient to use.

from selenium import webdriver
epiphany = webdriver.Epiphany()

Plans

During the next release cycle, we plan to do the following tasks:

  • Complete the implementation: add support for all commands in the spec and complete the ones that are partially supported now.
  • Add support for running the WPT WebDriver tests in the WebKit bots.
  • Add a WebKitGTK driver implementation for other languages in Selenium.
  • Add support for automation in Epiphany.
  • Add WebDriver support to WPE/dyz.

by carlos garcia campos at September 09, 2017 05:33 PM



August 31, 2017

Some rough numbers on WebKit code

by Xabier Rodríguez Calvar

My wife asked me for some rough LOC numbers on the WebKit project and I think I could share them with you here as well. They come from r221232. As I’ll take into account some generated code it is relevant to mention that I built WebKitGTK+ with the default CMake options.

First thing I did was running sloccount Source and got the following numbers:

cpp: 2526061 (70.57%)
ansic: 396906 (11.09%)
asm: 207284 (5.79%)
javascript: 175059 (4.89%)
java: 74458 (2.08%)
perl: 73331 (2.05%)
objc: 44422 (1.24%)
python: 38862 (1.09%)
cs: 13011 (0.36%)
ruby: 11605 (0.32%)
xml: 11396 (0.32%)
sh: 3747 (0.10%)
yacc: 2167 (0.06%)
lex: 1007 (0.03%)
lisp: 89 (0.00%)
php: 10 (0.00%)

This number do not include IDL code so I did some grepping to get the number myself that gave me 19632 IDL lines:

$ find Source/ -name ".idl" | xargs cat | grep -ve "^[[:space:]]\/*" -ve "^[[:space:]]*" -ve "^[[:space:]]$" -ve "^[[:space:]][$" -ve "^[[:space:]]};$" | wc -l
19632

The interesting part of the IDL files is that they are used to generate code so those 19632 IDL lines expand to:

ansic: 699140 (65.25%)
cpp: 368720 (34.41%)
python: 1492 (0.14%)
xml: 1040 (0.10%)
javascript: 883 (0.08%)
asm: 169 (0.02%)
perl: 11 (0.00%)

Let’s have a look now at the LayoutTests (they test the functionality of WebCore + the platform). Tests are composed mainly by HTML files so if you run sloccount LayoutTests you get:

javascript: 401159 (76.74%)
python: 87231 (16.69%)
xml: 22978 (4.40%)
php: 4784 (0.92%)
ansic: 3661 (0.70%)
perl: 2726 (0.52%)
sh: 199 (0.04%)

It’s quite interesting to see that sloccount does not consider HTML which is quite relevant when you’re testing a web engine so again, we have to count them manually (thanks to Carlos López who helped me to properly grep here as some binary lines were giving me a headache to get the numbers):

find LayoutTests/ -name ".html" -print0 | xargs -0 cat | strings | grep -Pv "^[[:space:]]$" | wc -l
2205690

You can see 2205690 of “meaningful lines” that combine HTML + other languages that you can see above. I can’t substract here to just get the HTML lines because the number above take into account files with a different extension than HTML, though many of them do include other languages, specially JavaScript.

But the LayoutTests do not include only pure WebKit tests. There are some imported ones so it might be interesting to run the same procedure under LayoutTests/imported to see which ones are imported and not written directly into the WebKit project. I emphasize that because they can be written by WebKit developers in other repositories and actually I can present myself and Youenn Fablet as an example as we wrote tests some tests that were finally moved into the specification and included back later when imported. So again, sloccount LayoutTests/imported:

python: 84803 (59.99%)
javascript: 51794 (36.64%)
ansic: 3661 (2.59%)
php: 575 (0.41%)
xml: 250 (0.18%)
sh: 199 (0.14%)
perl: 86 (0.06%)

The same procedure to count HTML + other stuff lines inside that directory gives a number of 295490:

$ find LayoutTests/imported/ -name ".html" -print0 | xargs -0 cat | strings | grep -Pv "^[[:space:]]$" | wc -l
295490

There are also some other tests that we can talk about, for example the JSTests. I’ll mention already the numbers summed up regarding languages and the manual HTML code (if you made it here, you know the drill already):

javascript: 1713200 (98.64%)
xml: 20665 (1.19%)
perl: 2449 (0.14%)
python: 421 (0.02%)
ruby: 56 (0.00%)
sh: 38 (0.00%)
HTML+stuff: 997

ManualTests:

javascript: 297 (41.02%)
ansic: 187 (25.83%)
java: 118 (16.30%)
xml: 103 (14.23%)
php: 10 (1.38%)
perl: 9 (1.24%)
HTML+stuff: 16026

PerformanceTests:

javascript: 950916 (83.12%)
cpp: 147194 (12.87%)
ansic: 38540 (3.37%)
asm: 5466 (0.48%)
sh: 872 (0.08%)
ruby: 419 (0.04%)
perl: 348 (0.03%)
python: 325 (0.03%)
xml: 5 (0.00%)
HTML+stuff: 238002

TestsWebKitAPI:

cpp: 44753 (99.45%)
ansic: 163 (0.36%)
objc: 76 (0.17%)
xml: 7 (0.02%)
javascript: 1 (0.00%)
HTML+stuff: 3887

And this is all. Remember that these are just some rough statistics, not a “scientific” paper.

Update:

In her expert opinion, in the WebKit project we are devoting around 50% of the total LOC to testing, which makes it a software engineering “textbook” project regarding testing and I think we can be proud of it!

by calvaris at August 31, 2017 09:03 AM



May 03, 2017

Can I use CSS Box Alignment ?

by Javier Fernández

As a member of the Igalia\’s team implementing the CSS Grid Layout feature for Blink and WebKit rendering engines, I\’m very proud of what we\’ve achieved from our collaboration with Bloomberg. I think Grid is a very interesting feature for the Web Platform and we still can\’t see all its potential.

One of my main assignments on this project is to implement the CSS Box Alignment spec for Grid. It\’s obvious that alignment is an important feature for many cases in web development, but I consider it a key for a layout model like the one Grid provides.

We recently announced that the patch implementing the self-baseline alignment landed in Blink. This was the last alignment functionality pending to implement, so now we can consider that the spec is complete for Grid. However, implementing a feature like CSS Box Alignment has an additional complexity in the form of interoperability issues.

Interoperability is always a challenge when implementing any new specification, but I think it\’s specially problematic for a feature like this for several reasons:

  • The feature applies to several layout models.
  • The CSS Flexible Box specification already defined some of the CSS properties and values.
  • Once a new layout model implements the new specification, Flexbox is forced to follow it as well.

I admit that the editors of this new specification document made a huge effort to keep backward compatibility with the Flexbox spec (which caused not so few implementation challenges). However, the current Flexbox implementation of the CSS properties and values that both specs have in common would become a Partial Implementation regarding the new spec.

Recently Florian Rivoal found out that this partial implementation of the CSS Box Alignment feature prevents the use of cascade or @support for providing customized fallbacks for the unimplemented Alignment properties.

What does Partial Implementation actually mean ?

As anybody can imagine, implementing a fancy web feature takes a considerable amount of time. During this period, the feature passes through several phases with different exposure to the end users. It\’s precisely due to the importance of end user\’s feedback that these new web features are shipped under experimental flags. This workflow is specially useful no only for browser devs but for the spec editors as well.

For this reason, the W3C CSS Working Group defines a general policy to manage Partial Implementations, which can be summarized as follows:

So that authors can exploit the forward-compatible parsing rules to assign fallback values, CSS renderers must treat as invalid (and ignore as appropriate) any at-rules, properties, property values, keywords, and other syntactic constructs for which they have no usable level of support. In particular, user agents must not selectively ignore unsupported property values and honor supported values in a single multi-value property declaration: if any value is considered invalid (as unsupported values must be), CSS requires that the entire declaration be ignored.

This policy is added to every spec as part of its Conformance appendix, so it is in the case of the CSS Box Alignment specification document. However, the interpretation of the Partial Implementation policy is far from trivial, specially for a feature like CSS Box Alignment. The most restrictive interpretation would imply the following facts:

  • Any new CSS property of the new spec should be declared invalid until is supported by all the layout models it applies to.
  • Any of the already existent CSS properties with new values defined in the new spec should be declared invalid until all these new values are implemented in all the layout models such property applies to.
  • Browsers shouldn\’t ship (without experimental flags) any CSS property or value until it\’s implemented in all the layout model it applies to.

When we discussed about this at Igalia we applied a less restrictive interpretation, based on the assumption that the spec actually defined several features which could be implemented and shipped independently, obviously avoiding any browsers interoperability issues. As it\’s been always in the nature of the specification, keeping backward compatibility with Flexbox implementations has been a must, since its spec already defines some of the CSS properties now present in the new spec.

The issue filed by Florian was discussed during the Tokyo F2F Apr 19-21 2017 meeting, where it was agreed to add a new section in the CSS Box Alignment spec to clarify how implementors of this feature should manage Partial Implementations:

Since it is expected that support for the features in this module will be deployed in stages corresponding to the various layout models affected, it is hereby clarified that the rules for partial implementations that require treating as invalid any unsupported feature apply to any alignment keyword which is not supported across all layout modules to which it applies for layout models in which the implementation supports the property in general.

The new text added makes the Partial Implementation policy less restrictive and, even it contradicts our interpretation of independent alignment features per layout model, it affects only to models which already implement any of the CSS properties defined in the new spec. In this case, only Flexbox has to be updated to implement the new values defined for its alignment related CSS properties: align-content, justify-content and align-self.

Analysis of the implementation and shipment status

Before thinking on how to address the Partial Implementation issues, I decided to analyze what\’s the status of the CSS Box Alignment feature in the different browsers. If you are interested in the full analysis, it\’s available here. The following table shows the implementation status of the new spec in the Safary, Chrome and Firefox browsers, using a color code like unimplemented, only grid or both (flex and grid):

If you can try out some examples of these Partial Implementation issues, just try flexbox vs grid cases with some of these alignment values: align-items: center, align-self: left; align-content: start or justify-content: end.

The 3 major browsers analyzed have shipped most, if not all, the CSS Box Alignment spec implemented for CSS Grid Layout (since Chrome 57, Safari 10.1, Firefox 52). Firefox is the browser which implemented and shipped a wider support for CSS Flexible Box.

We can extract the following conclusions:

  • The 3 browsers analyzed have shipped Partial Implementations of the CSS Box Alignment specification, although Firefox is almost complete.
  • The 3 browsers have shipped a Grid feature that supports completely the new CSS Box Alignment spec, although Safari still misses the self-baseline values.
  • The 3 implementations of the new CSS Box Alignment specification are backward compatible with the CSS Flexible Box specification, even though it implements for some properties a lower level of the spec (e.g. self-baseline keywords)

Work in progress

Although we are still evaluating the problem together with the Blink and WebKit communities, at Igalia we are already working on improving the situation. We all agree on the damage to the Web Platform that these Partial Implementation issues are causing, as Florian pointed out initially, so that\’s a good starting point. There are bug reports on both WebKit and Blink and we are already providing patches for some of them.

We are still discussing about the best approach, but our bet would be to request an intent-to-implement-and-ship for a CSS Box Alignment (for flexbox layout) feature. This approach fits naturally in our initial plans of implementing several independent features from the alignment specification. It seems that it\’s what Firefox is doing, which already announced the implementation of CSS Box Alignment (for block layout)

Thanks to Bloomberg for sponsoring this work, as part of the efforts that Igalia has been doing all these years pursuing a better and more open web.

by jfernandez at May 03, 2017 08:19 PM



WebKitGTK+ remote debugging in 2.18

by Carlos García Campos

WebKitGTK+ has supported remote debugging for a long time. The current implementation uses WebSockets for the communication between the local browser (the debugger) and the remote browser (the debug target or debuggable). This implementation was very simple and, in theory, you could use any web browser as the debugger because all inspector code was served by the WebSockets. I said in theory because in the practice this was not always so easy, since the inspector code uses newer JavaScript features that are not implemented in other browsers yet. The other major issue of this approach was that the communication between debugger and target was not bi-directional, so the target browser couldn’t notify the debugger about changes (like a new tab open, navigation or that is going to be closed).

Apple abandoned the WebSockets approach a long time ago and implemented its own remote inspector, using XPC for the communication between debugger and target. They also moved the remote inspector handling to JavaScriptCore making it available to debug JavaScript applications without a WebView too. In addition, the remote inspector is also used by Apple to implement WebDriver. We think that this approach has a lot more advantages than disadvantages compared to the WebSockets solution, so we have been working on making it possible to use this new remote inspector in the GTK+ port too. After some refactorings to the code to separate the cross-platform implementation from the Apple one, we could add our implementation on top of that. This implementation is already available in WebKitGTK+ 2.17.1, the first unstable release of this cycle.

From the user point of view there aren’t many differences, with the WebSockets we launched the target browser this way:

$ WEBKIT_INSPECTOR_SERVER=127.0.0.1:1234 browser

This hasn’t changed with the new remote inspector. To start debugging we opened any browser and loaded

http://127.0.0.1:1234

With the new remote inspector we have to use any WebKitGTK+ based browser and load

inspector://127.0.0.1:1234

As you have already noticed, it’s no longer possible to use any web browser, you need to use a recent enough WebKitGTK+ based browser as the debugger. This is because of the way the new remote inspector works. It requires a frontend implementation that knows how to communicate with the targets. In the case of Apple that frontend implementation is Safari itself, which has a menu with the list of remote debuggable targets. In WebKitGTK+ we didn’t want to force using a particular web browser as debugger, so the frontend is implemented as a builtin custom protocol of WebKitGTK+. So, loading inspector:// URLs in any WebKitGTK+ WebView will show the remote inspector page with the list of debuggable targets.

It looks quite similar to what we had, just a list of debuggable targets, but there are a few differences:

  • A new debugger window is opened when inspector button is clicked instead of reusing the same web view. Clicking on inspect again just brings the window to the front.
  • The debugger window loads faster, because the inspector code is not served by HTTP, but locally loaded like the normal local inspector.
  • The target list page is updated automatically, without having to manually reload it when a target is added, removed or modified.
  • The debugger window is automatically closed when the target web view is closed or crashed.

How does the new remote inspector work?

The web browser checks the presence of WEBKIT_INSPECTOR_SERVER environment variable at start up, the same way it was done with the WebSockets. If present, the RemoteInspectorServer is started in the UI process running a DBus service listening in the IP and port provided. The environment variable is propagated to the child web processes, that create a RemoteInspector object and connect to the RemoteInspectorServer. There’s one RemoteInspector per web process, and one debuggable target per WebView. Every RemoteInspector maintains a list of debuggable targets that is sent to the RemoteInspector server when a new target is added, removed or modified, or when explicitly requested by the RemoteInspectorServer.
When the debugger browser loads an inspector:// URL, a RemoteInspectorClient is created. The RemoteInspectorClient connects to the RemoteInspectorServer using the IP and port of the inspector:// URL and asks for the list of targets that is used by the custom protocol handler to create the web page. The RemoteInspectorServer works as a router, forwarding messages between RemoteInspector and RemoteInspectorClient objects.

by carlos garcia campos at May 03, 2017 03:43 PM



March 20, 2017

WebKitGTK+ 2.16

by Carlos García Campos

The Igalia WebKit team is happy to announce WebKitGTK+ 2.16. This new release drastically improves the memory consumption, adds new API as required by applications, includes new debugging tools, and of course fixes a lot of bugs.

Memory consumption

After WebKitGTK+ 2.14 was released, several Epiphany users started to complain about high memory usage of WebKitGTK+ when Epiphany had a lot of tabs open. As we already explained in a previous post, this was because of the switch to the threaded compositor, that made hardware acceleration always enabled. To fix this, we decided to make hardware acceleration optional again, enabled only when websites require it, but still using the threaded compositor. This is by far the major improvement in the memory consumption, but not the only one. Even when in accelerated compositing mode, we managed to reduce the memory required by GL contexts when using GLX, by using OpenGL version 3.2 (core profile) if available. In mesa based drivers that means that software rasterizer fallback is never required, so the context doesn’t need to create the software rasterization part. And finally, an important bug was fixed in the JavaScript garbage collector timers that prevented the garbage collection to happen in some cases.

CSS Grid Layout

Yes, the future here and now available by default in all WebKitGTK+ based browsers and web applications. This is the result of several years of great work by the Igalia web platform team in collaboration with bloomberg. If you are interested, you have all the details in Manuel’s blog.

New API

The WebKitGTK+ API is quite complete now, but there’s always new things required by our users.

Hardware acceleration policy

Hardware acceleration is now enabled on demand again, when a website requires to use accelerated compositing, the hardware acceleration is enabled automatically. WebKitGTK+ has environment variables to change this behavior, WEBKIT_DISABLE_COMPOSITING_MODE to never enable hardware acceleration and WEBKIT_FORCE_COMPOSITING_MODE to always enabled it. However, those variables were never meant to be used by applications, but only for developers to test the different code paths. The main problem of those variables is that they apply to all web views of the application. Not all of the WebKitGTK+ applications are web browsers, so it can happen that an application knows it will never need hardware acceleration for a particular web view, like for example the evolution composer, while other applications, especially in the embedded world, always want hardware acceleration enabled and don’t want to waste time and resources with the switch between modes. For those cases a new WebKitSetting hardware-acceleration-policy has been added. We encourage everybody to use this setting instead of the environment variables when upgrading to WebKitGTk+ 2.16.

Network proxy settings

Since the switch to WebKit2, where the SoupSession is no longer available from the API, it hasn’t been possible to change the network proxy settings from the API. WebKitGTK+ has always used the default proxy resolver when creating the soup context, and that just works for most of our users. But there are some corner cases in which applications that don’t run under a GNOME environment want to provide their own proxy settings instead of using the proxy environment variables. For those cases WebKitGTK+ 2.16 includes a new UI process API to configure all proxy settings available in GProxyResolver API.

Private browsing

WebKitGTK+ has always had a WebKitSetting to enable or disable the private browsing mode, but it has never worked really well. For that reason, applications like Epiphany has always implemented their own private browsing mode just by using a different profile directory in tmp to write all persistent data. This approach has several issues, for example if the UI process crashes, the profile directory is leaked in tmp with all the personal data there. WebKitGTK+ 2.16 adds a new API that allows to create ephemeral web views which never write any persistent data to disk. It’s possible to create ephemeral web views individually, or create ephemeral web contexts where all web views associated to it will be ephemeral automatically.

Website data

WebKitWebsiteDataManager was added in 2.10 to configure the default paths on which website data should be stored for a web context. In WebKitGTK+ 2.16 the API has been expanded to include methods to retrieve and remove the website data stored on the client side. Not only persistent data like HTTP disk cache, cookies or databases, but also non-persistent data like the memory cache and session cookies. This API is already used by Epiphany to implement the new personal data dialog.

Dynamically added forms

Web browsers normally implement the remember passwords functionality by searching in the DOM tree for authentication form fields when the document loaded signal is emitted. However, some websites add the authentication form fields dynamically after the document has been loaded. In those cases web browsers couldn’t find any form fields to autocomplete. In WebKitGTk+ 2.16 the web extensions API includes a new signal to notify when new forms are added to the DOM. Applications can connect to it, instead of document-loaded to start searching for authentication form fields.

Custom print settings

The GTK+ print dialog allows the user to add a new tab embedding a custom widget, so that applications can include their own print settings UI. Evolution used to do this, but the functionality was lost with the switch to WebKit2. In WebKitGTK+ 2.16 a similar API to the GTK+ one has been added to recover that functionality in evolution.

Notification improvements

Applications can now set the initial notification permissions on the web context to avoid having to ask the user everytime. It’s also possible to get the tag identifier of a WebKitNotification.

Debugging tools

Two new debugged tools are now available in WebKitGTk+ 2.16. The memory sampler and the resource usage overlay.

Memory sampler

This tool allows to monitor the memory consumption of the WebKit processes. It can be enabled by defining the environment variable WEBKIT_SMAPLE_MEMORY. When enabled, the UI process and all web process will automatically take samples of memory usage every second. For every sample a detailed report of the memory used by the process is generated and written to a file in the temp directory.

$ WEBKIT_SAMPLE_MEMORY=1 MiniBrowser 
Started memory sampler for process MiniBrowser 32499; Sampler log file stored at: /tmp/MiniBrowser7ff2246e-406e-4798-bc83-6e525987aace
Started memory sampler for process WebKitWebProces 32512; Sampler log file stored at: /tmp/WebKitWebProces93a10a0f-84bb-4e3c-b257-44528eb8f036

The files contain a list of sample reports like this one:

Timestamp                          1490004807
Total Program Bytes                1960214528
Resident Set Bytes                 84127744
Resident Shared Bytes              68661248
Text Bytes                         4096
Library Bytes                      0
Data + Stack Bytes                 87068672
Dirty Bytes                        0
Fast Malloc In Use                 86466560
Fast Malloc Committed Memory       86466560
JavaScript Heap In Use             0
JavaScript Heap Committed Memory   49152
JavaScript Stack Bytes             2472
JavaScript JIT Bytes               8192
Total Memory In Use                86477224
Total Committed Memory             86526376
System Total Bytes                 16729788416
Available Bytes                    5788946432
Shared Bytes                       1037447168
Buffer Bytes                       844214272
Total Swap Bytes                   1996484608
Available Swap Bytes               1991532544

Resource usage overlay

The resource usage overlay is only available in Linux systems when WebKitGTK+ is built with ENABLE_DEVELOPER_MODE. It allows to show an overlay with information about resources currently in use by the web process like CPU usage, total memory consumption, JavaScript memory and JavaScript garbage collector timers information. The overlay can be shown/hidden by pressing CTRL+Shit+G.

We plan to add more information to the overlay in the future like memory cache status.

by carlos garcia campos at March 20, 2017 03:19 PM



February 10, 2017

Accelerated compositing in WebKitGTK+ 2.14.4

by Carlos García Campos

WebKitGTK+ 2.14 release was very exciting for us, it finally introduced the threaded compositor to drastically improve the accelerated compositing performance. However, the threaded compositor imposed the accelerated compositing to be always enabled, even for non-accelerated contents. Unfortunately, this caused different kind of problems to several people, and proved that we are not ready to render everything with OpenGL yet. The most relevant problems reported were:

  • Memory usage increase: OpenGL contexts use a lot of memory, and we have the compositor in the web process, so we have at least one OpenGL context in every web process. The threaded compositor uses the coordinated graphics model, that also requires more memory than the simple mode we previously use. People who use a lot of tabs in epiphany quickly noticed that the amount of memory required was a lot more.
  • Startup and resize slowness: The threaded compositor makes everything smooth and performs quite well, except at startup or when the view is resized. At startup we need to create the OpenGL context, which is also quite slow by itself, but also need to create the compositing thread, so things are expected to be slower. Resizing the viewport is the only threaded compositor task that needs to be done synchronously, to ensure that everything is in sync, the web view in the UI process, the OpenGL viewport and the backing store surface. This means we need to wait until the threaded compositor has updated to the new size.
  • Rendering issues: some people reported rendering artifacts or even nothing rendered at all. In most of the cases they were not issues in WebKit itself, but in the graphic driver or library. It’s quite diffilcult for a general purpose web engine to support and deal with all possible GPUs, drivers and libraries. Chromium has a huge list of hardware exceptions to disable some OpenGL extensions or even hardware acceleration entirely.

Because of these issues people started to use different workarounds. Some people, and even applications like evolution, started to use WEBKIT_DISABLE_COMPOSITING_MODE environment variable, that was never meant for users, but for developers. Other people just started to build their own WebKitGTK+ with the threaded compositor disabled. We didn’t remove the build option because we anticipated some people using old hardware might have problems. However, it’s a code path that is not tested at all and will be removed for sure for 2.18.

All these issues are not really specific to the threaded compositor, but to the fact that it forced the accelerated compositing mode to be always enabled, using OpenGL unconditionally. It looked like a good idea, entering/leaving accelerated compositing mode was a source of bugs in the past, and all other WebKit ports have accelerated compositing mode forced too. Other ports use UI side compositing though, or target a very specific hardware, so the memory problems and the driver issues are not a problem for them. The imposition to force the accelerated compositing mode came from the switch to using coordinated graphics, because as I said other ports using coordinated graphics have accelerated compositing mode always enabled, so they didn’t care about the case of it being disabled.

There are a lot of long-term things we can to to improve all the issues, like moving the compositor to the UI (or a dedicated GPU) process to have a single GL context, implement tab suspension, etc. but we really wanted to fix or at least improve the situation for 2.14 users. Switching back to use accelerated compositing mode on demand is something that we could do in the stable branch and it would improve the things, at least comparable to what we had before 2.14, but with the threaded compositor. Making it happen was a matter of fixing a lot bugs, and the result is this 2.14.4 release. Of course, this will be the default in 2.16 too, where we have also added API to set a hardware acceleration policy.

We recommend all 2.14 users to upgrade to 2.14.4 and stop using the WEBKIT_DISABLE_COMPOSITING_MODE environment variable or building with the threaded compositor disabled. The new API in 2.16 will allow to set a policy for every web view, so if you still need to disable or force hardware acceleration, please use the API instead of WEBKIT_DISABLE_COMPOSITING_MODE and WEBKIT_FORCE_COMPOSITING_MODE.

We really hope this new release and the upcoming 2.16 will work much better for everybody.

by carlos garcia campos at February 10, 2017 05:18 PM



December 15, 2016

Thu 2016/Dec/15

by Claudio Saavedra

Igalia is hiring. We're currently interested in Multimedia and Chromium developers. Check the announcements for details on the positions and our company.

December 15, 2016 05:13 PM



November 10, 2016

Web Engines Hackfest 2016

by Xabier Rodríguez Calvar

From September 26th to 28th we celebrated at the Igalia HQ the 2016 edition of the Web Engines Hackfest. This year we broke all records and got participants from the three main companies behind the three biggest open source web engines, say Mozilla, Google and Apple. Or course, it was not only them, we had some other companies and ourselves. I was active part of the organization and I think we not only did not get any complain but people were comfortable and happy around.

We had several talks (I included the slides and YouTube links):

We had lots and lots of interesting hacking and we also had several breakout sessions:

  • WebKitGTK+ / Epiphany
  • Servo
  • WPE / WebKit for Wayland
  • Layout Models (Grid, Flexbox)
  • WebRTC
  • JavaScript Engines
  • MathML
  • Graphics in WebKit

What I did during the hackfest was working with Enrique and Žan to advance on reviewing our downstream implementation of our GStreamer based of Media Source Extensions (MSE) in order to land it as soon as possible and I can proudly say that we did already (we didn’t finish at the hackfest but managed to do it after it). We broke the bots and pissed off Michael and Carlos but we managed to deactivate it by default and continue working on it upstream.

So summing up, from my point of view and it is not only because I was part of the organization at Igalia, based also in other people’s opinions, I think the hackfest was a success and I think we will continue as we were or maybe growing a bit (no spoilers!).

Finally I would like to thank our gold sponsors Collabora and Igalia and our silver sponsor Mozilla.

by calvaris at November 10, 2016 08:59 AM



October 09, 2016

Web Engines Hackfest 2016

by Javier Fernández

Last week I attended the Web Engines Hackfest 2016, hosted by Igalia at the HQ premises in A Coruña. For those still unaware, it’s a unconference like event focused on pure hacking and technical discussions about the main Web Engines supporting the Web Platform.

30147116385_9e2737ef4d_o

This year there were a very interesting group of hackers, representing most of the main web engines, like Mozilla’s Gecko and Servo, Google’s Blink, Apple’s WebKit and Igalia’s WebKitGTK+.

Hacking log

This year I was totally focused on hacking Blink web engine to solve some of the most complex issues of the CSS Grid Layout feature. I’d like to start giving thanks to Google to join the hackfest sending Christian Biesinger, specially thanks to him because of the long flight to attend. It was a pleasure to work work with him on-site and have the opportunity to discuss these complex issues face to face.

Day 1

During the weeks previous to the hackfest I’ve been working on fixing the bug 628565, reported several months ago but hitting me quite much recently. It’s a very ugly issue, since it shows an unpredictable behavior of Grid layout logic, so I decided that I might fix it once for all. I managed to provide 2 different approaches, which can be analyzed in the following code review issues: Issue 2361373002 and Issue 2333583002. Basically, the root cause of both issues is the same; grid’s container intrinsic size is not computed correctly when there are grid items with orthogonal flows.

There are 2 fundamental concepts that are key for understanding this problem:

  • Intrinsic size computation must be done before layout.
  • Orthogonal flow boxes need to be laid out in order to compute min-content contribution to their container’s intrinsic width.

So, we had as single bug to fix for solving two different problems: unpredictable grid layout logic and incorrect grid’s intrinsic size computation. The first problem is the one reported in bug 628565. I’ll try to describe it here briefly, but further details can be obtained from the bug report. Let’s consider the following code:


   body { overflow: hidden; }

XX X

The following pictures illustrate the issue when loading the code above with Google Chrome 55.0.2859.0 (Official Build) dev (64-bit) Revision 3f63c614e8c4501b1bfa3f608e32a9d12618b0a0-refs/heads/master@{#418117} under Linux operating system:

Chrome BEFORE resizing
chrome-before-resizing

Chrome AFTER resizing
chrome-after-resizing

Christian and I analyzed carefully Blink’s layout code and how it deals with orthogonal boxes. It’s worth mentioning that there are important issues with orthogonal flows in almost every web engine, but Blink has made lately some improvements on this regard. See how a very basic example works in Blink, Gecko and WebKit engines:

XX X

orthogonal-flow-different-engines

Blink has implemented a kind of pre-layout logic which is executed for any orthogonal flow box even before doing the actual layout, when box’s intrinsic size computation takes place. This solution allows using a more precise info about an orthogonal box’s height when computing its container’s intrinsic width; otherwise zero width would be assumed, as it’s the case of WebKit and Gecko engines. However, this approach leads to an incorrect behavior when using Grid Layout as I’ll explain later.

My first approach to solve these two issues was to update grid container’s intrinsic size after its orthogonal children were laid out. Hence, we could use the actual size of the children to compute their container’s intrinsic size. However, this violates the first rule of the two assertions defined before: no layout should be done for intrinsic size computation.

Layout Breakout Session

During the evening we held the Layout Breakout Session, where we had a nice discussion about the future of Layout in the different web engines. Christian Biesinger, one of the members of the Google’s Blink Layout team, talked about the new LayoutNG project; an experiment to implement a new Layout from scratch, cleaning up quite old code paths and solving some problems that were not possible to address with the current legacy code. This new LayoutNG idea is related to the new Layout API and the Houdini project, a new mechanism for defining new layout models without the requirement of a native support inside the browser. We had also some discussions about the current state of the Flexible Box specification in Blink and how WebKit’s implementation is quite abandoned and unmaintained nowadays.

In addition, we discussed about the current state of CSS Grid Layout implementation in the different engines. The implementation is almost complete in most of the main engines. Thanks to the collaboration between Igalia and Bloomberg we can confirm that WebKit and Blink’s implementations are almost completed. We have been evaluating Mozilla’s Gecko implementation of Grid and we verified it’s in a similar status. We talked about the recent news from TPAC, which Manuel Rego attended, about the CSS Grid Layout specification becoming Candidate Recommendation. Because of all these reasons, we have agreed with Christian that it’d be good to send the Blink intent-to-ship request as soon as possible; in case it’s accepted, it could be enabled by default in the next Chrome release.

Day 2

The day started with a meeting with Christian for evaluating the different approaches we implemented so far. We have discarded the ones requiring updating intrinsic size. We also decided to avoid solving the issue during the pre-layout of orthogonal items. This approach would have been the one with less impact on performance for grid layout and it would also solve the incorrect intrinsic size issue, however it would add penalties for cases not using grid at all.

Finally, Christian and I agreed on solving first the unpredictable behavior of grid layout logic. We would skip the issue of incorrect intrinsic size, overall because we think the Grid Layout specification is contradictory on this regard. For what is worth, I’ve created a new issue for the CSSWG in the W3C’s github. Even though we should wait for the issue to get clarified, we have already some possible approaches for getting what seems a more natural result for grid’s intrinsic size. The following example could help to understand the problem:

intrinsic-sizes-with-orthogonal

Both test cases were loaded using the same Chrome version commented before. They clearly show that neither min-content or max-content sizes are applied correctly to the grid container. The reason of this weird behavior is how content-sized tracks are handled by the Grid Tracks sizing algorithm in case of rendering grid items with orthogonal flow. From the last draft specification:

Then, if the min-content contribution of any grid items have changed based on the row sizes calculated in step 2, steps 1 and 2 are repeated with the new min-content contribution and max-content contribution (once only).

This, with the fact that orthogonal boxes pre-layout is performed before the tracks have been defined, causes that grid’s container intrinsic size is computed incorrectly. This problem is explained in detail in the W3C’s github issue mentioned before. So if anybody is interested on additional details or, even better, willing to participate in the ongoing discussion, just follow the link above.

Day 3

In addition to the orthogonal flow issues, Baseline Alignment was the other hot topic of my work during the hackfest. Baseline Alignment is the only feature still missing to complete the implementation of the CSS Box Alignment specification for Grid Layout. There is a preliminary approach in this code review issue, but it’s still not complete. The problem is that as it’s stated in the Alignment spec, Baseline Alignment may affect grid’s container intrinsic size:

When specified for align-self/justify-self, these values trigger baseline self-alignment, shifting the entire box within its container, which may affect the sizing of its container.

This fact implies that I should integrate Baseline offset computation inside the Grid Tracks sizing algorithm. Christian and I have been analyzing the sizing algorithm and we designed a possible approach, quite similar to what Flexbox implements in its layout logic.

Finally, we met again with Christian to discuss about the Minimum Implied size for Grid issues. Manuel had the chance to discuss it with the Grid spec editors at TPAC, but it seems there are still unresolved issues. There are an ongoing discussion at W3C’s github about this problem, you can get the details in issues 283 and 523. Christian suggested that he could add some feedback to the discussion, so we can clarify it as soon as possible. This is an important issue that may affect browser interoperability.

The hackfest

The Web Engines Hackfest is an event to share experiences between hackers of different web engines and brainstorming about the future of the Web Platform. This year there were hackers representing most of the main web engines, including Google’s Blink, Mozilla’s Gecko and Servo, Apple’s WebKit and Igalia’s WebKitGTK+.

webengines-1

There were scheduled talks from hackers of each engine so everybody could get an idea of their current state and future plans. In addition, some breakout sessions were scheduled during the kick-off session driven by my colleague Martin Robinson. We embraced everybody to propose new breakout sessions on the fly, every time an ongoing discussion needed a deeper debate or analysis.

ctrbqf6wcaatsvk-jpglarge

I could attend most of the talks and breakout sessions, so I’ll give now my impressions about them. All the talks were recorded (will be available soon) and slides are available in the wiki, so I recommend to watch them if you haven’t already.

30112189436_db06b8fd4b_o

The first speaker in the schedule was Jack Moffitt (Mozilla) “Servo: Today & Tomorrow”. He gave a great overview of the progress they made on the new Servo rendering engine, emphasizing the multi-thread support and showing some performance metrics for different engine’s components, CSS parsing, scripting, layout, … He remarked the technical advantages of using Rust on the development of this new engine.

30032830492_aa442a25b0_o

During Day 1 I attended two breakout sessions. The first one about the WebKitGTK+ engine was mainly focused on the recently published roadmap and specially about the new Threaded Compositor enabled by default. There was an interesting discussion about graphics support in WebKitGTK+ between the Collabora developers and Andrey Fedorov (Huawei).

29851562800_e6b17e78bd_o

The other breakout session I could attend during Day 1 was the one about the Servo engine. There was a nice discussion between Mozilla developers and Christian Biesinger about how both engines handle rendering in a different way. Servo hackers explained with more detail Servos’s threading model and its implication for layout.

30112183886_b4d3836246_o

Day 2 started with the awesome talk by Behdad Esfahbod (Google) “HarfBuzz, the First Ten Years”. He gave an overview of the evolution of the HarfBuff library along the last years, including the past experiences in the Web Engines Hackfest and former events under previous name WebKitGTK+ Hackfest. Clearly, rendering fonts is a huge technical challenge. It was also awesome to have Behdad here to work closely with my colleague Fred Wang on the fonts support for MathML.

30112185606_76ecfb49fe_o

I attended the MathML breakout session, where the hot topic was Fred’s prototype implemented for the Blink engine. The result of this experiment was really promising, so we started to discuss its chances of getting back in Chrome, even behind an experimental flag. Both Christian and Behdad expressed doubts about that being possible nowadays. They suggested that it may be feasible to implement it through Houdini and the new Blink’s Layout API. We have confirmed that the refactoring we have done for WebKit applies perfectly in Blink, since both share a substantial portion of the layout codebase. In addition, after our work for removing the Flexbox dependency and further code clean up, we can be confident on the MathML independence in the layout logic. After some discussion, we have agreed on continue our experiments in Blink, independently of its chances to be integrated in Blink in the future, since I don’t think the Houdini approach makes sense for MathML, at least now.

Later on Youenn Fabler (Apple) gave a talk about the Fetch API, “Fetching Bytes and Words on the Web”. He described the implementation of this specification in WebKit and its relation with the Streams API. The development is still ongoing and there are still quite much work to do.

30147118585_6d5aa818fb_o

I missed the talks of Day 3, so hopefully I’ll watch them as soon as the videos are available. It’s specially interesting the talk about WPE (aka WebKit For Wayland) , by my colleague Žan Doberšek. I’ve spent the day trying to get the most of having Christian Biesinger at our premises. As I commented before, we discussed about the Grid Layout’s Implied Minimum Size issue, one of the most complex problems we’ll have to solve in the short term.

by jfernandez at October 09, 2016 01:59 PM



October 03, 2016

WebRTC in WebKit/WPE

by Xabier Rodríguez Calvar

For some time I worked at Igalia to enable WebRTC on WebKitForWayland or WPE for the Raspberry Pi 2.

The goal was to have the WebKit WebRTC tests working for a demo. My fellow Igalian Alex was working on the platform itself in WebKit and assisting with some tuning for the Pi on WebKit but the main work needed to be done in OpenWebRTC.

My other fellow Igalian Phil had begun a branch to work on this that was half way with some workarounds. My first task was getting into combat/workaround mode and make OpenWebRTC work with compressed streams from gst-rpicamsrc. OpenWebRTC supported only raw video streams and that Raspberry Pi Cam module GStreamer element provides only H264 encoded ones. I moved some encoders and parsers around, some caps modifications, removed some elements that didn’t work on the Pi and made it work eventually. You can see the result at:

<

video style=”max-width: 100%;” src=”/xrcalvar/files/2016/10/201607-webrtc.mp4″ controls poster=”/xrcalvar/files/2016/10/webrtc-poster.png”>

To make this work by yourselves you needed a custom branch of Buildroot where you could build with the proper plugins enabled also selected the appropriate branches in WPE and OpenWebRTC.

Unfortunately the work was far from being finished so I continued the effort to try to make the arch changes in OpenWebRTC have production quality and that meant do some tasks step by step:

  • Rework the video orientation code: The workaround deactivated it as so far it was being done in GStreamer. In the case of rpicamsrc that can be done by the hardware itself so I cooked a GStreamer interface to enable rotation the same way it was done for the [gl]videoflip elements. The idea would be deprecate the original ones and use the new interface. These landed both in videoflip and glvideoflip. Of course I also implemented it on gst-rpicamsrc here and here and eventually in OpenWebRTC sources.
  • Rework video flip: Once OpenWebRTC sources got orientation support, I could rework the flip both for local and remote feeds.
  • Add gl{down|up}load elements back: There were some issues with the gl elements to upload and download textures, which we had removed. I readded them again.
  • Reworked bins linking: In OpenWebRTC there are some bins that are created to perform some tasks and depending on different circumstances you add or not some elements. I reworked the way those elements are linked so that we don’t have to take into account all the use cases to link them. Now this is easier as the elements are linked as they are the added to the bin.
  • Reworked the renderer_disabled: As in the case for orientation, some elements such as gst-rpicamsrc are able to change color and balance so I added support for that to avoid having that done by GStreamer elements if not necessary. In this case the proper interfaces were already there in GStreamer.
  • Moved the decoding/parsing from the source to the renderer: Before our changes the source was parsing/decoding the remote feeds, local sources were not decoded, just raw was supported. Our workarounds made the local sources decode too but this was not working for all cases. So why decoding at the sources when GStreamer has caps and you can just chain all that to the renderers? So I did eventually, I moved the parsing/decoding to the renderers. This took fixing all the caps negotiation from sources to renderers. Unfortunatelly I think I broke audio on the way, but surely nothing difficult to fix.

This is still a work in progress and now I am changing tasks and handing this over back to my fellow Igalians Phil, who I am sure will do an awesome job together with Alex.

And again, thanks to Igalia for letting me work on this and to Metrological that is sponsoring this work.

by calvaris at October 03, 2016 04:15 PM



September 30, 2016

Cross-compiling WebKit2GTK+ for ARM

by Mario Sánchez Prada

I haven’t blogged in a while -mostly due to lack of time, as usual- but I thought I’d write something today to let the world know about one of the things I’ve worked on a bit during this week, while remotely attending the Web Engines Hackfest from home:

Setting up an environment for cross-compiling WebKit2GTK+ for ARM

I know this is not new, nor ground-breaking news, but the truth is that I could not find any up-to-date documentation on the topic in a any public forum (the only one I found was this pretty old post from the time WebKitGTK+ used autotools), so I thought I would devote some time to it now, so that I could save more in the future.

Of course, I know for a fact that many people use local recipes to cross-compile WebKit2GTK+ for ARM (or simply build in the target machine, which usually takes a looong time), but those are usually ad-hoc things and hard to reproduce environments locally (or at least hard for me) and, even worse, often bound to downstream projects, so I thought it would be nice to try to have something tested with upstream WebKit2GTK+ and publish it on trac.webkit.org,

So I spent some time working on this with the idea of producing some step-by-step instructions including how to create a reproducible environment from scratch and, after some inefficient flirting with a VM-based approach (which turned out to be insanely slow), I finally settled on creating a chroot + provisioning it with a simple bootstrap script + using a simple CMake Toolchain file, and that worked quite well for me.

In my fast desktop machine I can now get a full build of WebKit2GTK+ 2.14 (or trunk) in less than 1 hour, which is pretty much a productivity bump if you compare it to the approximately 18h that takes if I build it natively in the target ARM device I have :-)

Of course, I’ve referenced this documentation in trac.webkit.org, but if you want to skip that and go directly to it, I’m hosting it in a git repository here: github.com/mariospr/webkit2gtk-ARM.

Note that I’m not a CMake expert (nor even close) so the toolchain file is far from perfect, but it definitely does the job with both the 2.12.x and 2.14.x releases as well as with the trunk, so hopefully it will be useful as well for someone else out there.

Last, I want to thanks the organizers of this event for making it possible once again (and congrats to Igalia, which just turned 15 years old!) as well as to my employer for supporting me attending the hackfest, even if I could not make it in person this time.

by mario at September 30, 2016 07:10 PM



March 22, 2016

WebKitGTK+ 2.12

by Carlos García Campos

We did it again, the Igalia WebKit team is pleased to announce a new stable release of WebKitGTK+, with a bunch of bugs fixed, some new API bits and many other improvements. I’m going to talk here about some of the most important changes, but as usual you have more information in the NEWS file.

FTL

FTL JIT is a JavaScriptCore optimizing compiler that was developed using LLVM to do low-level optimizations. It’s been used by the Mac port since 2014 but we hadn’t been able to use it because it required some patches for LLVM to work on x86-64 that were not included in any official LLVM release, and there were also some crashes that only happened in Linux. At the beginning of this release cycle we already had LLVM 3.7 with all the required patches and the crashes had been fixed as well, so we finally enabled FTL for the GTK+ port. But in the middle of the release cycle Apple surprised us announcing that they had the new FTL B3 backend ready. B3 replaces LLVM and it’s entirely developed inside WebKit, so it doesn’t require any external dependency. JavaScriptCore developers quickly managed to make B3 work on Linux based ports and we decided to switch to B3 as soon as possible to avoid making a new release with LLVM to remove it in the next one. I’m not going to enter into the technical details of FTL and B3, because they are very well documented and it’s probably too boring for most of the people, the key point is that it improves the overall JavaScript performance in terms of speed.

Persistent GLib main loop sources

Another performance improvement introduced in WebKitGTK+ 2.12 has to do with main loop sources. WebKitGTK+ makes an extensive use the GLib main loop, it has its own RunLoop abstraction on top of GLib main loop that is used by all secondary processes and most of the secondary threads as well, scheduling main loop sources to send tasks between threads. JavaScript timers, animations, multimedia, the garbage collector, and many other features are based on scheduling main loop sources. In most of the cases we are actually scheduling the same callback all the time, but creating and destroying the GSource each time. We realized that creating and destroying main loop sources caused an overhead with an important impact in the performance. In WebKitGTK+ 2.12 all main loop sources were replaced by persistent sources, which are normal GSources that are never destroyed (unless they are not going to be scheduled anymore). We simply use the GSource ready time to make them active/inactive when we want to schedule/stop them.

Overlay scrollbars

GNOME designers have requested us to implement overlay scrollbars since they were introduced in GTK+, because WebKitGTK+ based applications didn’t look consistent with all other GTK+ applications. Since WebKit2, the web view is no longer a GtkScrollable, but it’s scrollable by itself using native scrollbars appearance or the one defined in the CSS. This means we have our own scrollbars implementation that we try to render as close as possible to the native ones, and that’s why it took us so long to find the time to implement overlay scrollbars. But WebKitGTK+ 2.12 finally implements them and are, of course, enabled by default. There’s no API to disable them, but we honor the GTK_OVERLAY_SCROLLING environment variable, so they can be disabled at runtime.

But the appearance was not the only thing that made our scrollbars inconsistent with the rest of the GTK+ applications, we also had a different behavior regarding the actions performed for mouse buttons, and some other bugs that are all fixed in 2.12.

The NetworkProcess is now mandatory

The network process was introduced in WebKitGTK+ since version 2.4 to be able to use multiple web processes. We had two different paths for loading resources depending on the process model being used. When using the shared secondary process model, resources were loaded by the web process directly, while when using the multiple web process model, the web processes sent the requests to the network process for being loaded. The maintenance of this two different paths was not easy, with some bugs happening only when using one model or the other, and also the network process gained features like the disk cache that were not available in the web process. In WebKitGTK+ 2.12 the non network process path has been removed, and the shared single process model has become the multiple web process model with a limit of 1. In practice it means that a single web process is still used, but the network happens in the network process.

NPAPI plugins in Wayland

I read it in many bug reports and mailing lists that NPAPI plugins will not be supported in wayland, so things like http://extensions.gnome.org will not work. That’s not entirely true. NPAPI plugins can be windowed or windowless. Windowed plugins are those that use their own native window for rendering and handling events, implemented in X11 based systems using XEmbed protocol. Since Wayland doesn’t support XEmbed and doesn’t provide an alternative either, it’s true that windowed plugins will not be supported in Wayland. Windowless plugins don’t require any native window, they use the browser window for rendering and events are handled by the browser as well, using X11 drawable and X events in X11 based systems. So, it’s also true that windowless plugins having a UI will not be supported by Wayland either. However, not all windowless plugins have a UI, and there’s nothing X11 specific in the rest of the NPAPI plugins API, so there’s no reason why those can’t work in Wayland. And that’s exactly the case of http://extensions.gnome.org, for example. In WebKitGTK+ 2.12 the X11 implementation of NPAPI plugins has been factored out, leaving the rest of the API implementation common and available to any window system used. That made it possible to support windowless NPAPI plugins with no UI in Wayland, and any other non X11 system, of course.

New API

And as usual we have completed our API with some new additions:

 

by carlos garcia campos at March 22, 2016 10:36 AM



February 26, 2016

Über latest Media Source Extensions improvements in WebKit with GStreamer

by Xabier Rodríguez Calvar

In this post I am going to talk about the implementation of the Media Source Extensions (known as MSE) in the WebKit ports that use GStreamer. These ports are WebKitGTK+, WebKitEFL and WebKitForWayland, though only the latter has the latest work-in-progress implementation. Of course we hope to upstream WebKitForWayland soon and with it, this backend for MSE and the one for EME.

My colleague Enrique at Igalia wrote a post about this about a week ago. I recommend you read it before continuing with mine to understand the general picture and the some of the issues that I managed to fix on that implementation. Come on, go and read it, I’ll wait.

One of the challenges here is something a bit unnatural in the GStreamer world. We have to process the stream information and then make some metadata available to the JavaScript app before playing instead of just pushing everything to a playing pipeline and being happy. For this we created the AppendPipeline, which processes the data and extracts that information and keeps it under control for the playback later.

The idea of the our AppendPipeline is to put a data stream into it and get it processed at the other side. It has an appsrc, a demuxer (qtdemux currently

) and an appsink to pick up the processed data. Something tricky of the spec is that when you append data into the SourceBuffer, that operation has to block it and prevent with errors any other append operation while the current is ongoing, and when it finishes, signal it. Our main issue with this is that the the appends can contain any amount of data from headers and buffers to only headers or just partial headers. Basically, the information can be partial.

First I’ll present again Enrique’s AppendPipeline internal state diagram:

First let me explain the easiest case, which is headers and buffers being appended. As soon as the process is triggered, we move from Not started to Ongoing, then as the headers are processed we get the pads at the demuxer and begin to receive buffers, which makes us move to Sampling. Then we have to detect that the operation has ended and move to Last sample and then again to Not started. If we have received only headers we will not move to Sampling cause we will not receive any buffers but we still have to detect this situation and be able to move to Data starve and then again to Not started.

Our first approach was using two different timeouts, one to detect that we should move from Ongoing to Data starve if we did not receive any buffer and another to move from Sampling to Last sample if we stopped receiving buffers. This solution worked but it was a bit racy and we tried to find a less error prone solution.

We tried then to use custom downstream events injected from the source and at the moment they were received at the sink we could move from Sampling to Last sample or if only headers were injected, the pads were created and we could move from Ongoing to Data starve. It took some time and several iterations to fine tune this but we managed to solve almost all cases but one, which was receiving only partial headers and no buffers.

If the demuxer received partial headers and no buffers it stalled and we were not receiving any pads or any event at the output so we could not tell when the append operation had ended. Tim-Philipp gave me the idea of using the need-data signal on the source that would be fired when the demuxer ran out of useful data. I realized then that the events were not needed anymore and that we could handle all with that signal.

The need-signal is fired sometimes when the pipeline is linked and also when the the demuxer finishes processing data, regardless the stream contains partial headers, complete headers or headers and buffers. It works perfectly once we are able to disregard that first signal we receive sometimes. To solve that we just ensure that at least one buffer left the appsrc with a pad probe so if we receive the signal before any buffer was detected at the probe, it shall be disregarded to consider that the append has finished. Otherwise, if we have seen already a buffer at the probe we can consider already than any need-data signal means that the processing has ended and we can tell the JavaScript app that the append process has ended.

Both need-data signal and probe information come in GStreamer internal threads so we could use mutexes to overcome any race conditions. We thought though that deferring the operations to the main thread through the pipeline bus was a better idea that would create less issues with race conditions or deadlocks.

To finish I prefer to give some good news about performance. We use mainly the YouTube conformance tests to ensure our implementation works and I can proudly say that these changes reduced the time of execution in half!

That’s all folks!

by calvaris at February 26, 2016 12:30 PM



February 08, 2016

Mon 2016/Feb/08

by Claudio Saavedra

About a year ago, Igalia was approached by the people working on printing-related technologies in HP to see whether we could give them a hand in their ongoing effort to improve the printing experience in the web. They had been working for a while in extensions for popular web browsers that would allow users, for example, to distill a web page from cruft and ads and format its relevant contents in a way that would be pleasant to read in print. While these extensions were working fine, they were interested in exploring the possibility of adding this feature to popular browsers, so that users wouldn't need to be bothered with installing extensions to have an improved printing experience.

That's how Alex, Martin, and me spent a few months exploring the Chromium project and its printing architecture. Soon enough we found out that the Chromium developers had been working already on a feature that would allow pages to be removed from cruft and presented in a sort of reader mode, at least in mobile versions of the browser. This is achieved through a module called dom distiller, which basically has the ability to traverse the DOM tree of a web page and return a clean DOM tree with only the important contents of the page. This module is based on the algorithms and heuristics in a project called boilerpipe with some of it also coming from the now popular Readability. Our goal, then, was to integrate the DOM distiller with the modules in Chromium that take care of generating the document that is then sent to both the print preview and the printing service, as well as making this feature available in the printing UI.

After a couple of months of work and thanks to the kind code reviews of the folks at Google, we got the feature landed in Chromium's repository. For a while, though, it remained hidden behind a runtime flag, as the Chromium team needed to make sure that things would work well enough in all fronts before making it available to all users. Fast-forward to last week, when I found out by chance that the runtime flag has been flipped and the Simplify page printing option has been available in Chromium and Chrome for a while now, and it has even reached the stable releases. The reader mode feature in Chromium seems to remain hidden behind a runtime flag, I think, which is interesting considering that this was the original motivation behind the dom distiller.

As a side note, it is worth mentioning that the collaboration with HP was pretty neat and it's a good example of the ways in which Igalia can help organizations to improve the web experience of users. From the standards that define the web to the browsers that people use in their everyday life, there are plenty of areas in which work needs to be done to make the web a more pleasant place, for web developers and users alike. If your organization relies on the web to reach its users, or to enable them to make use of your technologies, chances are that there are areas in which their experience can be improved and that's one of the things we love doing.

February 08, 2016 09:01 AM



February 04, 2016

Thu 2016/Feb/04

by Claudio Saavedra

We've opened a few positions for developers in the fields of multimedia, networking, and compilers. I could say a lot about why working in Igalia is way different to working on your average tech-company or start-up, but I think the way it's summarized in the announcements is pretty good. Have a look at them if you are curious and don't hesitate to apply!

February 04, 2016 12:53 PM



February 01, 2016

Web Engines Hackfest according to me

by Xabier Rodríguez Calvar

And once again, in December we celebrated the hackfest. This year happened between Dec 7-9 at the Igalia premises and the scope was much broader than WebKitGTK+, that’s why it was renamed as Web Engines Hackfest. We wanted to gather people working on all open source web engines and we succeeded as we had people working on WebKit, Chromium/Blink and Servo.

The edition before this I was working with Youenn Fablet (from Canon) on the Streams API implementation in WebKit and we spent our time on the same thing again. We have to say that things are much more mature now. During the hackfest we spent our time in fixing the JavaScriptCore built-ins inside WebCore and we advanced on the automatic importation of the specification web platform tests, which are based on our prior test implementation. Since now they are managed there, it does not make sense to maintain them inside WebKit too, we just import them. I must say that our implementation is fairly complete since we support the current version of the spec and have almost all tests passing, including ReadableStream, WritableStream and the built-in strategy classes. What is missing now is making Streams work together with other APIs, such as Media Source Extensions, Fetch or XMLHttpRequest.

There were some talks during the hackfest and we did not want to be less, so we had our own about Streams. You can enjoy it here:

You can see all hackfest talks in this YouTube playlist. The ones I liked most were the ones by Michael Catanzaro about HTTP security, which is always interesting given the current clumsy political movements against cryptography and the one by Dominik Röttsches about font rendering. It is really amazing what a browser has to do just to get some letters painted on the screen (and look good).

As usual, the environment was amazing and we had a great time, including the traditional Street Fighter‘s match, where Gustavo found a worthy challenger in Changseok 🙂

Of course, I would like to thank Collabora and Igalia for sponsoring the event!

And by the way, quite shortly after that, I became a WebKit reviewer!

by calvaris at February 01, 2016 11:06 AM



November 26, 2015

Attending the Web Engines Hackfest

by Mario Sánchez Prada

It’s certainly been a while since I attended this event for the last time, 2 years ago, when it was a WebKitGTK+ only oriented hackfest, so I guess it was a matter of time it happened again…

It will be different for me this time, though, as now my main focus won’t be on accessibility (yet I’m happy to help with that, too), but on fixing a few issues related to the WebKit2GTK+ API layer that I found while working on our platform (Endless OS), mostly related to its implementation of accelerated compositing.

Besides that, I’m particularly curious about seeing how the hackfest looks like now that it has broaden its scope to include other web engines, and I’m also quite happy to know that I’ll be visiting my home town and meeting my old colleagues and friends from Igalia for a few days, once again.

Last, I’d like to thank my employer for sponsoring this trip, as well as Igalia for organizing this event, one more time.

See you in Coruña!

by mario at November 26, 2015 11:29 AM



November 07, 2015

Importing include paths in Eclipse

by Mario Sánchez Prada

First of all, let me be clear: no, I’m not trying to leave Emacs again, already got over that stage. Emacs is and will be my main editor for the foreseeable future, as it’s clear to me that there’s no other editor I feel more comfortable with, which is why I spent some time cleaning up my .emacs.d and making it more “manageable”.

But as much as like Emacs as my main “weapon”, I sometimes appreciate the advantages of using a different kind of beast for specific purposes. And, believe me or not, in the past 2 years I learned to love Eclipse/CDT as the best work-mate I know when I need some extra help to get deep inside of the two monster C++ projects that WebKit and Chromium are. And yes, I know Eclipse is resource hungry, slow, bloated… and whatnot; but I’m lucky enough to have fast SSDs and lots of RAM in my laptop & desktop machines, so that’s not really a big concern anymore for me (even though I reckon that indexing chromium in the laptop takes “quite some time”), so let’s move on :-)

However, there’s this one little thing that still bothers quite me a lot of Eclipse: you need to manually setup the include paths for the external dependencies not in a standard location that a C/C++ project uses, so that you can get certain features properly working such as code auto-completion, automatic error-checking features, call hierarchies… and so forth.

And yes, I know there is an Eclipse plugin adding support for pkg-config which should do the job quite well. But for some reason I can’t get it to work with Eclipse Mars, even though others apparently can use it there for some reason (and I remember using it with Eclipse Juno, so it’s definitely not a myth).

Anyway, I did not feel like fighting with that (broken?) plugin, and in the other hand I was actually quite inclined to play a bit with Python so… my quick and dirty solution to get over this problem was to write a small script that takes a list of package names (as you would pass them to pkg-config) and generates the XML content that you can use to import in Eclipse. And surprisingly, that worked quite well for me, so I’m sharing it here in case someone else finds it useful.

Using frogr as an example, I generate the XML file for Eclipse doing this:

  $ pkg-config-to-eclipse glib-2.0 libsoup-2.4 libexif libxml-2.0 \
        json-glib-1.0 gtk+-3.0 gstreamer-1.0 > frogr-eclipse.xml

…and then I simply import frogr-eclipse.xml from the project’s properties, inside the C/C++ General > Paths and Symbols section.

After doing that I get rid of all the brokenness caused by so many missing symbols and header files, I get code auto-completion nicely working back again and all those perks you would expect from this little big IDE. And all that without having to go through the pain of defining all of them one by one from the settings dialog, thank goodness!

Now you can quickly see how it works in the video below:


VIDEO: Setting up a C/C++ project in Eclipse with pkg-config-to-eclipse

This has been very helpful for me, hope it will be helpful to someone else too!

by mario at November 07, 2015 12:35 AM



October 28, 2015

Introducing the meta-webkit Yocto layer.

by Carlos Alberto López Pérez

Lately, in my daily work at Igalia, I have been learning to use Yocto / OpenEmbedded to create custom distributions and products targeting different embedded hardware. One of the goals was to create a Kiosk-like browser that was based on a modern Web engine, light (specially regarding RAM requirements) and fast.

WebKit fits perfectly this requirements, so I started checking the available recipes on the different layers of Yocto to build WebKit, unfortunately the available recipes for WebKitGTK+ available were for ancient versions of the engine (no WebKit2 multi-process model). This has changed recently, as the WebKitGTK+ recipe available in the oe-core layer has been updated to a new version. In any case, we needed this for an older release of Yocto so I ended creating a new layer.

I have added also some recipes for WebKitForWayland and released it as meta-webkit. The idea is to collect here recipes for building browsers and web engines based on WebKit. For the moment it includes recipes for building the WebKitGTK+ and WebKitForWayland ports.

WebKitGTK+ vs WebKitForWayland

First a few words about the main differences between this two different ports of WebKit, so you have a better understanding on where to use one or the other.

Let’s start by defining both engines:

  • WebKit for Wayland port pairs the WebKit engine with the Wayland display protocol, allowing embedders to create simple and performant systems based on Web platform technologies. It is designed with hardware acceleration in mind, relying on EGL, the Wayland EGL platform, and OpenGL ES.
  • WebKitGTK+ is a full-featured port of the WebKit rendering engine, suitable for projects requiring any kind of web integration, from hybrid HTML/CSS applications to full-fledged web browsers. It offers WebKit’s full functionality and is useful in a wide range of systems from desktop computers to embedded systems like phones, tablets, and televisions.

From the definitions you may guess already some differences. WebKitForWayland focus on simple and performant Web-oriented systems, meanwhile WebKitGTK+ focus on any kind of product (complex or simple) that requires full Web integration.

WebKitForWayland is what you need when you want a very fast and light HTML5/js runtime, capable of squeeze the hardware acceleration of your platform (Wayland EGL platform) up to the maximum performance. But is not suitable if you are thinking in building a general purpose browser on top of it. It lacks some features that are not needed for a Web runtime but that are desirable for a more generic browser.

Here you can see a video of WebKitForWayland running on the weston ivi-shell and several instances of WebkitForWayland showing different demos (poster circle and several webgl demos) running on the Intel NUC (Intel Atom E3815 – 1 core). The OS is a custom image built with Yocto 1.8 and this meta-webkit layer.

On the Weston terminal you can appreciate the low resource usage (CPU and RAM) despite having several browser process running the demanding demos at full speed. Link to the video on YouTube here.

In case you want to try it, you might want to build an image for your device with meta-webkit and Yocto (see below), or you can test the image that I built for the demo on the above video. You can flash it to an usb stick memory device as follows, and boot it on any Intel x86_64 machine that has an Intel GPU:

# /dev/sdX is the device of your usb memory stick (for example /dev/sdc)
curl -s http://ftp.neutrino.es/webkitforwayland-yocto-ivi-demo/webkitforwayland-demo-ivi-weston-intel-corei7-64-20151019201929.hddimg.xz | xz -dc | dd bs=4k of=/dev/sdX

On the other hand WebKitGTK+ allows you to build a custom browser on top of it. It has a rich and stable API and many goodies like support for plugins, integrated web inspector, full toolkit support, etc. WebKitGTK+ also performs very good and consumes few resources, it can run both on top of Wayland and X11. But at the moment of writing this, the support for Wayland is still incomplete (we are working on it). So if possible we recommend that you base your product on the X11 backend. You could later migrate to Wayland without much effort once we complete the support for it.

Building a Yocto image with WebKitForWayland

The usual way to create an image with Yocto and WebKitForWayland is:

  • Setup the environment and source oe-init-build-env as usual.
  • Checkout the branch of meta-webkit that matches your Yocto/OE version (for example: fido).
    Note that fido (1.8) is the less recent version supported. If you are using the fido branch you will also need to add the meta-ruby layer that is available on meta-openembedded
  • Add the path to the meta-webkit layer in your conf/bblayers.conf file
  • Append the following lines to the conf/local.conf file
  • DISTRO_FEATURES_append = " opengl wayland"
    IMAGE_INSTALL_append = " webkitforwayland"
    
  • Then build the target image, for example
  • bitbake core-image-weston
    
  • Then boot the image on the target device and run the webkitforwayland engine from a weston terminal
  • WPELauncher http://postercircle.com
    

Building a Yocto image with WebKitGTK+

There are some things to take into account when building WebKitGTK+:

  • The package webkitgtk contains the shared libraries and the webkitgtk runtime.
  • The package webkitgtk-bin contains the MiniBrowser executable. This is very basic browser built on top of the webkitgtk runtime, mainly used for testing purposes.
  • On the recipe there are several packageconfig options that you can tune. For example, for enabling WebGL support you can add the following to your conf/local.conf file:
    PACKAGECONFIG_pn-webkitgtk = "x11 webgl"
    

    Check the recipe source code to see all the available options.

  • The name of the recipe is the same than the one available in oe-core (master), so you should select which version of webkitgtk you want to build. For example, to build 2.10.3 add on conf/local.conf:
    PREFERRED_VERSION_webkitgtk = "2.10.3"
    

So, the usual way to create an image with Yocto and WebKitGTK+ is:

  • Setup the environment and source oe-init-build-env as usual.
  • Checkout the branch of meta-webkit that matches your Yocto/OE version (for example: fido).
    Note that fido (1.8) is the less recent version supported. If you are using the fido branch you will also need to add the meta-ruby layer that is available on meta-openembedded
  • Add the following lines to your conf/local.conf file (for building the X11 backend of WebKitGTK+)
  • DISTRO_FEATURES_append = " opengl x11"
    IMAGE_INSTALL_append = " webkitgtk-bin"
    
  • Then build the X11 image:
  • bitbake core-image-sato
    
  • Then boot the image on the target device and run the included test browser from an X terminal (or any other browser that you have developed on top of the WebKitGTK+ API)
    MiniBrowser http://google.com
    
  • Further info

    If you are new to Yocto / OpenEmbedded, a good starting point is to check out the documentation.

    If you have any issue or doubt with the meta-webkit layer, please let me know about that (you can mail me at clopez@igalia.com or open an issue on github)

    Finally, If you need help for integrating a Web engine on your product, you can also hire us. As maintainers of the GTK+ and WebKit For Wayland ports of WebKit we have considerable experience creating, maintaining and optimizing ports of WebKit.

    by clopez at October 28, 2015 01:23 PM



    September 23, 2015

    WebKit Contributors Meeting 2015 (late, I know)

    by Xabier Rodríguez Calvar

    After writing my last post I realized that I needed to write a bit more about what I had been doing at the WebKit Contributors Meeting.

    First thing to say is that it happened in March at Apple campus in Cupertino and I atteded as part of the Igalia gang.

    My goal when I went there was to discuss with Youenn Fablet about Streams API and we are implementing and see how we could bootstrap the reviews and being to get the code reviewed and landed efficiently. Youenn and I also made a presentation (mainly him) about it. At that moment we got some comments and help from Benjamin Poulain and nowadays we are also working with Darin Adler and Geoffrey Garen so the work is ongoing.

    WebRTC was also a hot topic and we talked a bit about how to deal with the promises as they seem to be involved in the WebRTC standard was well. My Igalian partner Philippe was missed in this regard as he is involved in the development of WebRTC in WebKit, but he unfortunately couldn’t make it because of personal reasons.

    I also had an interesting talk with Jer Noble and Eric Carlson about Media Source and Encrypted Media Extensions. I told them about the several downstream implementations that we are or were working on, specially the GStreamer based implementation of WPE one and that we expect to begin to upstream soon (update: this is done, yay!). They commented that they still have doubts about the abstractions they made for them and of course I promised to get back to them when we begin with the job. Actually I already discussed some issues with Quique, another fellow Igalian.

    Among the other interesting discussions, I found very necessary the migration of Mac port to CMake. Actually, I am experiencing now the painbenefits of using XCode to add files, specially the generated ones to the compilation. I hope that Alex succeeds with the task and soon we have a common build system for all main ports.

    by calvaris at September 23, 2015 04:36 PM



    July 23, 2015

    ReadableStream almost ready

    by Xabier Rodríguez Calvar

    Hello dear readers! Long time no see! You might thing that I have been lazy, and I was in blog posting but I was coding like mad.

    First remarkable thing is that I attended the WebKit Contributors Meeting that happened in March at Apple campus in Cupertino as part of the Igalia gang. There we discussed of course about Streams API, its state and different implementation possibilities. Another very interesting point which would make me very happy would be the movement of Mac to CMake.

    In a previous post I already introduced the concepts of the Streams API and some of its possible use cases so I’ll save you that part now. The news is that ReadableStream has its basic funcionality complete. And what does it mean? It means that you can create a ReadableStream by providing the constructor with the underlying source and the strategy objects and read from it with its reader and all the internal mechanisms of backpresure and so on will work according to the spec. Yay!

    Nevertheless, there’s still quite some work to do to complete the implementation of Streams API, like the implementation of byte streams, writable and transform streams, piping operations and built-in strategies (which is what I am on right now).I don’t know either when Streams API will be activated by default in the next builds of Safari, WebKitGTK+ or WebKit for Wayland, but we’ll make it at some point!

    Code suffered already lots of changes because we were still figuring out which architecture was the best and Youenn did an awesome job in refactoring some things and providing support for promises in the bindings to make the implementation of ReadableStream more straitghforward and less “custom”.

    Implementation could still suffer quite some important changes as, as part of my work implementing the strategies, some reviewers raised their concerns of having Streams API implemented inside WebCore in terms of IDL interfaces. I have already a proof of concept of CountQueuingStrategy and ByteLengthQueuingStrategy implemented inside JavaScriptCore, even a case where we use built-in JavaScript functions, which might help to keep closer to the spec if we can just include JavaScript code directly. We’ll see how we end up!

    Last and not least I would like to thank Igalia for sponsoring me to attend the WebKit Contributors Meeting in Cupertino and also Adenilson for being so nice and taking us to very nice places for dinner and drinks that we wouldn’t be able to find ourselves (I owe you, promise to return the favor at the Web Engines Hackfest). It was also really nice to have the oportunity of quickly visiting New York City for some hours because of the long connection there which usually would be a PITA, but it was very enjoyable this time.

    by calvaris at July 23, 2015 04:17 PM



    July 03, 2015

    On Linux32 chrooted environments

    by Mario Sánchez Prada

    I have a chrooted environment in my 64bit Fedora 22 machine that I use every now and then to work on a debian-like 32bit system where I might want to do all sorts of things, such as building software for the target system or creating debian packages. More specifically, today I was trying to build WebKitGTK+ 2.8.3 in there and something weird was happening:

    The following CMake snippet was not properly recognizing my 32bit chroot:

    string(TOLOWER ${CMAKE_HOST_SYSTEM_PROCESSOR} LOWERCASE_CMAKE_HOST_SYSTEM_PROCESSOR)
    if (CMAKE_COMPILER_IS_GNUCXX AND "${LOWERCASE_CMAKE_HOST_SYSTEM_PROCESSOR}" MATCHES "(i[3-6]86|x86)$")
        ADD_TARGET_PROPERTIES(WebCore COMPILE_FLAGS "-fno-tree-sra")
    endif ()
    

    After some investigation, I found out that CMAKE_HOST_SYSTEM_PROCESSOR relies on the output of uname to determine the type of the CPU, and this what I was getting if I ran it myself:

    (debian32-chroot)mario:~ $ uname -a
    Linux moucho 4.0.6-300.fc22.x86_64 #1 SMP Tue Jun 23 13:58:53 UTC 2015
    x86_64 x86_64 x86_64 GNU/Linux
    

    Let’s avoid nasty comments about the stupid name of my machine (I’m sure everyone else uses clever names instead), and see what was there: x86_64.

    That looked wrong to me, so I googled a bit to see what others did about this and, besides finding all sorts of crazy hacks around, I found that in my case the solution was pretty simple just because I am using schroot, a great tool that makes life easier when working with chrooted environments.

    Because of that, all I would have to do would be to specify personality=linux32 in the configuration file for my chrooted environment and that’s it. Just by doing that and re-entering in the “jail”, the output would be much saner now:

    (debian32-chroot)mario:~ $ uname -a
    Linux moucho 4.0.6-300.fc22.x86_64 #1 SMP Tue Jun 23 13:58:53 UTC 2015
    i686 i686 i686 GNU/Linux
    

    And of course, WebKitGTK+ would now recognize and use the right CPU type in the snippet above and I could “relax” again while seeing WebKit building again.

    Now, for extra reference, this is the content of my schroot configuration file:

    $ cat /etc/schroot/chroot.d/00debian32-chroot
    [debian32-chroot]
    description=Debian-like chroot (32 bit) 
    type=directory
    directory=/schroot/debian32/
    users=mario
    groups=mario
    root-users=mario
    personality=linux32
    

    That is all, hope somebody else will find this useful. It certainly saved my day!

    by mario at July 03, 2015 01:31 PM



    June 24, 2015

    Performance analysis of Grid Layout

    by Javier Fernández

    Now that we have a quite complete implementation of CSS Grid Layout specification it’s time to take care of performance analysis and optimizations. In this essay, which is the first of a series of posts about performance, I’ll first introduce briefly how to use Blink (Chrome) and WebKit (Safari) performance analysis tools, some of the most interesting cases I’ve seen during my work on the implementation of this spec and, finally, a basic case to compare Flexbox and Grid layout models, which I’d like to evolve and analyze further in the coming months.

    Performance analysis tools

    Both WebKit and Blink projects provide several useful and easy to use scrips (python) to run a set of test cases and take different measurements and early analysis. They were written before the fork, that’s why related documentation can be found at WebKit’s track, but both engines still uses them, for the time being.

    Tools/Scripts/run-perf-tests
    Tools/Scripts/webkitpy/performance_tests/

    There are a wide set of performance tests under PerformanceTest folder, at Blink’s/WebKit’s root directory, but even though both engines share a substantial number of tests, there are some differences.

    (blink’s root directory) $ ls PerformanceTests/
    Bindings BlinkGC Canvas CSS DOM Dromaeo Events inspector Layout Mutation OWNERS Parser resources ShadowDOM Skipped SunSpider SVG XMLHttpRequest XSSAuditor

    Chromium project has introduced a new performance tool, called Telemetry, which in addition of running the above mentioned tests, it’s designed to execute more complex cases like running specific PageSets or doing benchmarking to compare results with a preset recording (WebPageRelay). It’s also possible to send patches to performance try bots, directly from gclient or git (depot_tools) command line. There are quite much information available in the following links:

    Regarding profiling tools, it’s possible both in Webkit and Blink to use the –profiler option when running the performance tests so we can collect profiling data. However, while WebKit recommends perf for linux, Google’s Blink engine provides some alternatives.

    CSS Grid Layout performance tests and current status

    While implementing a new browser feature is not easy to measure performance while code evolves so much and quickly and, what it’s worst, be aware of regressions introduced by new logic. When the feature’s syntax changes or there are missing or incomplete functionality, it’s not always possible to establish a well defined baseline for performance. It’s also a though decision to determine which use cases we might care about; obviously the faster the better, but adding performance optimizations usually complicates code, it may affect its robustness and it could lead to unexpected, and even worst, hard to find bugs.

    At the time of this writing, we had 3 basic performance tests:

    Why we have selected those uses cases to measure and keep track of performance regression ? First of all, note that auto-sizing one of the most expensive branches inside the grid track sizing algorithm, so we are really interested on both, improving it and keeping track of regressions on this code path.

    body {
        display: grid;
        grid-template-rows: repeat(100, auto);
        grid-template-columns: repeat(20, auto);
    }
    .gridItem {
        height: 200px;
        width: 200px;
    }
    

    On the other hand, fixed-sized is the easiest/fastest path of the algorithm, so besides the importance of avoiding regressions (when possible), it’s also a good case to compare with auto-sized.

    body {
        display: grid;
        grid-template-rows: repeat(100, 200px);
        grid-template-columns: repeat(20, 200px);
    }
    .gridItem {
        height: 200px;
        width: 200px;
    }
    

    Finally, a stretching use cases was added because it’s the default alignment value for grid items and the two test cases already described use fixed size items, hence no stretch (even though items fill the whole grid cell area). Given that I implemented CSS Box Alignment support for grid I was conscious of how expensive the stretching logic is, so I considered it an important use case to analyze and optimize as much as possible. Actually, I’ve already introduced several optimizations because the early implementation was quite slow, around 40% slower than using any other basic alignment (start, end, center). We will talk more about this later when we analyze a case to compare Flexbox and Grid performance in layout.

    body {
        display: grid;
        grid-template-rows: repeat(100, 200px);
        grid-template-columns: repeat(20, 200px);
    }
    .gridItem {
        height: auto;
        width: auto;
    }
    

    The basic HTML body of these 3 tests is quite simple because we want to analyze performance of very specific parts of the Grid Layout logic, in order to detect regressions in sensible code paths. We’d like to have eventually some real use cases to analyze and create many more performance tests, but chrome performance platform it’s definitively not the place to do so. The following graphs show performance evolution during 2015 for the 3 tests we have defined so far.

    grid-performance-overview

    Note that yellow trace shows data taken from a reference build, so we can discount temporary glitches on the machine running the performance tests of target build, which are shown in the blue trace; this reference trace is also useful to detect invalid regression alerts.

    Why performance is so different for these cases ?

    The 3 tests we have for Grid Layout use runs/second values as a way to measure performance; this is the preferred method for both WebKit and Blink engines because we can detect regressions with relatively small tests. It’s possible, though, to do other kind of measurements. Looking at the graphs above we can extract the following data:

    • auto-sized grid: around 650 runs/sec
    • fixed-sized grid: around 1400 runs/sec
    • fixed-sized stretched grid: around 1250 runs/sec

    Before analyzing possible causes of performance drop for each case, I’ve defined some additional tests to stress even more these 3 cases, so we can realize how grid size affect to the obtained results. I defined 20 tests for these cases, each one with different grid items; from 10×10 up to 200×200 grids. I run those tests in my own laptop, so let’s take the absolute numbers of each case with a grain of salt; although differences between each of these 3 scenarios should be coherent. The table below shows some numeric results of this experiment.

    grid-fixed-VS-auto-VS-stretch

    First of all, recall that these 3 tests produce the same web visualization, consisting of grids with NxN items of 100px each one. The only difference is the grid layout strategy used to produce such result: auto-sizing, fixed-sizing and stretching. So now, focusing on previous table’s data we can evaluate the cost, in terms of layout performance, of using auto-sized tracks for defining the grid (which may be the only solution for certain cases). Performance drop is even growing with the number of grid items, but we can conclude that it’s stabilized around 60%. On the other hand stretching is also slower but, unlike auto-sized, in this case performance drop does not show a high dependency of grid size, more or less constant around 15%.

    grid-performance-graphs-2

    Impact of auto-sized tracks in layout performance

    Basically, the track sizing algorithm can be described in the following 4 steps:

    • 1- Initialize per Grid track variables.
    • 2- Resolve content-based TrackSizingFunctions.
    • 3- Grow all Grid tracks in GridTracks from their baseSize up to their growthLimit value until freeSpace is exhausted.
    • 4- Grow all Grid tracks having a fraction as the MaxTrackSizingFunction.

    These steps will be executed twice, first cycle for determining column tracks’s size and another cycle to set row tracks’s size which it may depend on grid’s width. When using just fixed-sized tracks in the very simple case we are testing, the only computation required to determine grid’s size is completing step 1 and determining free available space based on the specified fixed-size values of each track.

    // 1. Initialize per Grid track variables.
    for (size_t i = 0; i  tracks.size(); ++i) {
        GridTrack& track = tracks[i];
        GridTrackSize trackSize = gridTrackSize(direction, i);
        const GridLength& minTrackBreadth = trackSize.minTrackBreadth();
        const GridLength& maxTrackBreadth = trackSize.maxTrackBreadth();
    
        track.setBaseSize(computeUsedBreadthOfMinLength(direction, minTrackBreadth));
        track.setGrowthLimit(computeUsedBreadthOfMaxLength(direction, maxTrackBreadth, track.baseSize()));
    
        if (trackSize.isContentSized())
            sizingData.contentSizedTracksIndex.append(i);
        if (trackSize.maxTrackBreadth().isFlex())
            flexibleSizedTracksIndex.append(i);
    }
    for (const auto& track: tracks) {
        freeSpace -= track.baseSize();
    } 
    

    Focusing now on the auto-sized scenario, we will have the overhead of resolving content-sized functions for all the grid items.

    // 2. Resolve content-based TrackSizingFunctions.
    if (!sizingData.contentSizedTracksIndex.isEmpty())
        resolveContentBasedTrackSizingFunctions(direction, sizingData);
    

    I didn't add source code of resolveContentBasedTrackSizingFunctions because it's quite complex, but basically it implies a cost proportional to the number of grid tracks (minimum of 2x), in order to determine minContent and maxContent values for each grid item. It might imply additional computation overhead when using spanning items; it would require to sort them based on their spanning value and iterate over them again to resolve their content-sized functions.

    Some issues may be interesting to analyze in the future:

    • How much each content-sized track costs ?
    • What is the impact on performance of using flexible-sized tracks ? Would it be the worst case scenario ? Considering it will require to follow the four steps of track sizing algorithm, it likely will.
    • Which are the performance implications of using spanning items ?

    Why stretching is so performance drain ?

    This is an interesting issue, given that stretch is the default value for both Grid and Flexbox items. Actually, it's the root cause of why Grid beats Flexbox in terms of layout performance for the cases when stretch alignment is used. As I'll explain later, Flexbox doesn't have the optimizations I've implemented for Grid Layout.

    Stretching logic takes place during the grid container layout operations, after all tracks have their size precisely determined and we have properly computed all grid track's positions relatively to the grid container. It happens before the alignment logic is executed because stretching may imply changing some grid item's size, hence they will be marked for layout (if they wasn't already).

    Obviously, stretching only takes place when the corresponding Self Alignment properties (align-self, justify-self) have either auto or stretch as value, but there are other conditions that must be fulfilled to trigger this operation:

    • box’s computed width/height (as appropriate to the axis) is auto.
    • neither of its margins (in the appropriate axis) are auto
    • still respecting the constraints imposed by min-height/min-width/max-height/max-width

    In that scenario, stretching logic implies the following operations:

    LayoutUnit stretchedLogicalHeight = availableAlignmentSpaceForChildBeforeStretching(gridAreaBreadthForChild, child);
    LayoutUnit desiredLogicalHeight = child.constrainLogicalHeightByMinMax(stretchedLogicalHeight, -1);
    
    bool childNeedsRelayout = desiredLogicalHeight != child.logicalHeight();
    if (childNeedsRelayout || !child.hasOverrideLogicalContentHeight())
        child.setOverrideLogicalContentHeight(desiredLogicalHeight - child.borderAndPaddingLogicalHeight());
    if (childNeedsRelayout) {
        child.setLogicalHeight(0);
        child.setNeedsLayout();
    }
    
    LayoutUnit LayoutGrid::availableAlignmentSpaceForChildBeforeStretching(LayoutUnit gridAreaBreadthForChild, const LayoutBox& child) const
    {
        LayoutUnit childMarginLogicalHeight = marginLogicalHeightForChild(child);
    
        // Because we want to avoid multiple layouts, stretching logic might be performed before
        // children are laid out, so we can't use the child cached values. Hence, we need to
        // compute margins in order to determine the available height before stretching.
        if (childMarginLogicalHeight == 0)
            childMarginLogicalHeight = computeMarginLogicalHeightForChild(child);
    
        return gridAreaBreadthForChild - childMarginLogicalHeight;
    }
    

    In addition to the extra layout required for changing grid item's size, computing the available space for stretching adds an additional overhead, overall if we have to compute grid item's margins because some layout operations are still incomplete.

    Given that grid container relies on generic block's layout operations to determine the stretched width, this specific logic is only executed for determining the stretched height. Hence performance drop is alleviated, compared with the auto-sized tracks scenario.

    Grid VS Flexbox layout performance

    One of the main goals of CSS Grid Layout specification is to complement Flexbox layout model for 2 dimensions. It's expectable that creating grid designs with Flexbox will be more inefficient than using a layout model specifically designed for these cases, not only regarding CSS syntax, but also regarding layout performance.

    However, I think it's interesting to measure Grid Layout performance in 1-dimensional cases, usually managed using Flexbox, so we can have comparable scenarios to evaluate both models. In this post I'll start with such cases, using a very simple one in this occasion. I'd like to get more complex examples in future posts, the ones more usual in Flexbox based designs.

    So, let's consider the following simple test case:

    Item 1
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Item 3 longer

    I evaluated the simple HTML example above with both Flexbox and Grid layouts to measure performance. I used a CPU profiler to figure out where the bottlenecks are for each model, trying to explain where differences came from. So, I defined 2 CSS classes for each layout model, as follows:

    .flex {
        background-color: silver;
        display: flex;
        height: 100px;
        align-items: start;
    }
    .grid {
        background-color: silver;
        display: grid;
        grid-template-columns: 100px 1fr auto;
        grid-template-rows: 100px;
        align-items: start;
        justify-items: start;
    }
    .i1 { 
        background-color: cyan;
        flex-basis: 100px; 
    }
    .i2 { 
        background-color: magenta;
        flex: 1; 
    }
    .i3 { 
        background-color: yellow; 
    }
    

    Given that there is not concept of row in Flexbox, I evaluated performance of 100 up to 2000 grid or flex containers, creating 20 tests to be run inside the chrome performance framework, described at the beginning of this post. You can check out resources and a script to generate them at our github examples repo.

    flexVSgrid

    When comparing both layout models targeting layout times, we see clearly that Grid Layout beats Flexbox using the default values for CSS properties controlling layout itself and alignment, which is stretch for these containers. As it was explained before, the stretching logic adds an important computation overhead, which as we can see now in the numeric table above, has more weight for Flexbox than Grid.

    Looking at the plot about differences in layout time, we see that for the default case, Grid performance improvement is stabilized around 7%. However, when we avoid the stretching logic, for instance by using any other alignment value, layout performance it's considerable worse than Flexbox, for this test case, around 15% slower. This is something sensible, as this test case is the idea for Flexbox, while a bit artificial for Grid; using a single Grid with N rows improves performance considerably, getting much better numbers than Flexbox, but we will see these cases in future analysis.

    Grid layout better results for the default case (stretch) are explained because I implemented several optimizations for Grid. Probably Flexbox should do the same, as it's the default value and it could affect many sites using this layout model in their designs.

    Thanks to Bloomberg for sponsoring this work, as part of the efforts that Igalia has been doing all these years pursuing a better and more open web.

    Igalia & Bloomberg logos

    by jfernandez at June 24, 2015 12:03 PM



    June 01, 2015

    Distributing tracks along Grid Layout container

    by Javier Fernández

    In my last post I introduced the concept of Content Distribution alignment and how it affects Grid Layout implementation. At that time, it was possible to use all the <content-position> values to select grid tracks position inside a grid container, moving them across the available space. However, it wasn’t until recently that users can distribute grid tracks along such available space, literally adding gaps in between or even stretching them.

    In this post I’ll describe how each <content-distribution> value affect tracks in a Grid Layout, their position and size, using different grid structures (eg. number of tracks, span).

    Let’s start analyzing the new Content Distribution alignment syntax defined in the CSS Box Alignment specification:

    auto | <baseline-position> | <content-distribution> || [ <overflow-position>? && <content-position> ]

    In case of a <content-distribution> value can’t be applied, its associated fallback <content-distribution> value should be used instead. However, this CSS syntax allow users to specify a preferred fallback value:

    If both a <content-distribution> and <content-position> are given, the <content-position> provides an explicit fallback alignment.

    Before going into each value, I think it’s a good idea to refresh the concepts of alignment container and alignment subject and how they apply in the context of Grid Layout:

    The alignment container is the grid container’s content box. The alignment subjects are the grid tracks.

    The different <content-distribution> values that can be used for align-content and justify-content CSS properties are defined as follows:

    • space-betweenThe alignment subjects are evenly distributed in the alignment container. Default fallback: start.
    • space-aroundThe alignment subjects are evenly distributed in the alignment container, with a half-size space on either end. Default fallback: center.
    • space-evenlyThe alignment subjects are evenly distributed in the alignment container, with a full-size space on either end. Default fallback: center.
    • stretchAny auto-sized alignment subjects have their size increased equally (not proportionally) so that the combined size exactly fills the alignment container. Default fallback: start.

    Picture below describes how these values would behave depending on the number of grid tracks; for simplicity I only use justify-content property, so tracks are distributed along the inline (row) axis. In next examples we will see how both properties work together using more complex grid definitions.

    content-distribution-1
    Effect of different Content Distribution values on Grid Layout. Click on the Image to evaluate the behavior when using different number of tracks.

    Previous examples were defined with grid items filling grid areas of just 1×1 tracks, which makes distribution pretty simple and easier to predict. But thanks to the flexibility of Grid Layout syntax we can define irregular grids, for instance, using the grid-template-areas property like in the next example.

    align-content-and span-4
    Basic example of how to apply the different values and its effect on irregular grid design.

    Since Content Distribution alignment considers grid tracks as the alignment subject, distributing tracks along the available space may have the consequence of modifying the dimensions of grid-areas defined by more than one track. The following picture shows the result of the code above and provides and excellent example of how powerful is the Content Alignment effect on a Grid Layout.

    These use cases can be obtained from Igalia’s Grid Layout examples repository, so anybody can play with different grid designs and alignment values combinations. They are also available at our codepen repository.

    Grid Layout behind the scene

    Now I’d like to explain a bit what I had to implement in the browser’s webcore to get these new features done; just some small pieces of source code, the ones I considered better, to get an idea of what implementing new behavior in browsers implies.

    As you might already know because of my previous posts, CSS Box Alignment specification was born to generalize Flexbox’s alignment behavior so that it can be used for grid and even regular blocks. Several new properties were added, like justify-items and justify-self, and CSS syntax has changed considerably. Specially noteworthy how Content Distribution alignment properties have changed from their initial Flexbox definition. They now support complex values like ‘space-between true’, ‘space-around start’, or even ‘stretch center safe’. This makes possible to express more info than using the previous simple keyword form, although it requires a new CSS parsing logic in Browsers.

    More complex CSS parsing

    Since both align-content and justify-content properties accept multiple optional keywords I needed to re-implement completely their parsing logic. I’m happy to announce that it recently landed WebKit’s trunk too, so now both web engines support the new CSS syntax.

    Due to the complex values defined for theses CSS properties, a new CSSValue derived class was defined to hold all the Content Alignment data, named as CSSContentDistributionValue. This data is then converted to something meaningful for the style logic using the StyleBuilderConverter class. This is the preferred method in both WebKit and Blink engines, which it just needs to be declared in the CSSPropertyNames.in and CSSProperties.in template files, respectively.

    align-content initial=initialContentAlignment, converter=convertContentAlignmentData
    justify-content initial=initialContentAlignment, converter=convertContentAlignmentData

    The StyleBuildConverter logic is pretty simple thanks to these 2 new data structures, as it can be appreciated in the following excerpt of source code:

    StyleContentAlignmentData StyleBuilderConverter::convertContentAlignmentData(StyleResolverState&, CSSValue* value)
    {
        StyleContentAlignmentData alignmentData = ComputedStyle::initialContentAlignment();
        CSSContentDistributionValue* contentValue = toCSSContentDistributionValue(value);
        if (contentValue->distribution()->getValueID() != CSSValueInvalid)
            alignmentData.setDistribution(*contentValue->;distribution());
        if (contentValue->position()->getValueID() != CSSValueInvalid)
            alignmentData.setPosition(*contentValue->;position());
        if (contentValue->overflow()->getValueID() != CSSValueInvalid)
            alignmentData.setOverflow(*contentValue->overflow());
        return alignmentData;
    }
    

    The StyleContentAlignmentData class was defined to simplify how we manage these complex values, so that we can handle properties as they had an atomic value. This approach allows a more efficient and robust way of detecting and managing style changes in these properties.

    New Layout operations

    Once this new CSS syntax is correctly parsed and a LayoutStyle instance generated according to user defined CSS style rules, I needed to modify Flexbox’s layout code for adapting it to the new data structures, ensuring browser backward compatibility and passing all the Layout and Unit tests. I implemented from scratch this logic for Grid Layout so I had the opportunity to introduce several performance optimizations to avoid unnecessary layouts and repaints. This area is pretty interesting and I’ll talk about it soon in a new post.

    One interesting aspect of Content Distribution alignment is that it might take part in the track sizing algorithm. As it was explained in my previous post about Self Alignment, stretch value increases alignment subject’s size to fill its alignment container’s available space. This is also the case of Content Alignment, but considering tracks as alignment subject. However, there is another case not so obvious where <content-distribution> values may influence in track sizing resolution, or perhaps better said, grid area sizing.

    Let’s consider this example of grid where there are certain areas using more than one track:

    grid-template-areas: "a a b"
                         "c d b"
    grid-auto-columns: 20px;
    grid-auto-rows: 40px;
    width: 150px;
    height: 300px;
    

    The example above defines a grid with 3 column tracks of 20px and 2 row tracks of 40px, which would be laid out as it’s shown in the following diagram:

    content-distribution-spans
    Grid Layout with areas filing more than one track. Click on the picture to evaluate the effect of each value on the grid area size.

    This fact has interesting implementation implications due to the fact that in certain cases, in order to determine grid item’s logical height we need its logical width to be resolved first. Track sizing algorithm uses children grid area size to determine grid cell’s logical height, hence given that alignment logic needs track sizes have been already resolved, it may imply a re-layout of the grid items which size could be affected by the used content-distribution value. The following source code shows how I handle this scenario:

    LayoutUnit LayoutGrid::gridAreaBreadthForChild(const LayoutBox& child, GridTrackSizingDirection direction, const Vector& tracks) const
    {
        const GridCoordinate& coordinate = cachedGridCoordinate(child);
        const GridSpan& span = (direction == ForColumns) ? coordinate.columns : coordinate.rows;
        const Vector& trackPositions = (direction == ForColumns) ? m_columnPositions : m_rowPositions;
        if (span.resolvedFinalPosition.toInt()  trackPositions.size()) {
            LayoutUnit startOftrack = trackPositions[span.resolvedInitialPosition.toInt()];
            LayoutUnit endOfTrack = trackPositions[span.resolvedFinalPosition.toInt()];
            return endOfTrack - startOftrack + tracks[span.resolvedFinalPosition.toInt()].baseSize();
        }
        LayoutUnit gridAreaBreadth = 0;
        for (GridSpan::iterator trackPosition = span.begin(); trackPosition != span.end(); ++trackPosition)
            gridAreaBreadth += tracks[trackPosition.toInt()].baseSize();
    
        return gridAreaBreadth;
    

    The code above will return different results, in the cases mentioned before, depending on whether it's run during track sizing alignment or after applying the alignment logic. This will likely make needed a new layout of the whole grid, or at least, the affected grid items, which it likely has a negative impact on performance.

    Current status and next steps

    I'd like to finish this post with a snapshot of current situation and challenges for the next months, as I've been regularly doing in my last posts.

    Unlike last reports, this time I've got good news regarding reduction of implementation gaps between the two web engines we are focusing our efforts on, WebKit and Blink. The following table describes current situation:

    alignment-status

    The table above indicates that several milestones were reached since the last report, although there are still some pending issues:

    • I've completed the implementation in WebKit of the parsing logic for the new Box Alignment properties: align-items and align-self.
    • As a side effect, I've also upgraded the ones already present because of Flexbox to the latest CSS3 Box Alignment specification.
    • WebKit has now full support for Default and Self Alignment fro Grid Layout, including also overflow handling
    • Blink has now full support for Content Distribution alignment, which missed <content-distrbuton> values.
    • WebKit's Grid Layout implementation still misses support for Content Distribution alignment.
    • Baseline Alignment is still missing in both web engines

    In addition to the above mentioned pending issues, our roadmap include the following tasks as part of my todo list for the next months:

    • Even though there s support for different writing-modes and flow directions, there are still some issues with orthogonal flows. I've got already some promising patches but they still have to be reviewed by Blink and WebKit engineers.
    • Optimizations of style and repaint invalidations triggered by changes on the alignment properties. As commented before, this is a very interesting topic involving, which I'll elaborate further in next posts.
    • Performance analysis of relevant Grid Layout use cases, which hopefully will lead to optimizations proposals.

    All this work and many other contributions to Grid Layout for WebKit and Blink web engines are the result of the collaboration between Bloomberg and Igalia to implement this W3C specification.

    Igalia & Bloomberg logos

    by jfernandez at June 01, 2015 07:32 PM



    March 09, 2015

    Content Distribution in CSS Grid Layout

    by Javier Fernández

    It’s been a while since Igalia and Bloomberg started to implement the Box Alignment specification for the CSS Grid Layout model. Some weeks ago we accomplished an important milestone of our roadmap landing in Blink trunk the last patches implementing the Content Distribution properties: align-content and justify-content.

    Quoting the CSS Box Alignment document, the content distribution properties are defined as follows:

    Aligns the contents of the box as a whole along the box’s inline/row/main axis.
    The alignment container is the grid container’s content box. The alignment subjects are the grid tracks.

    The CSS syntax of these recently added properties gives an idea of how powerful and flexible they are for grid layout definitions, allowing every possible alignment values combination:

    auto | <baseline-position> | <content-distribution> || [ <overflow-position>? && <content-position> ]

    It’s worth mentioning that Baseline Alignment is still not implemented, as well as the <content-distribution> values for Distribution Alignment. However, in the latter case I’ve got already a quite promising draft implementation which, eventually, has been very useful to activate a discussion inside the W3C community to allow these alignment values for grid. In previous versions of the specification it was stated that all <content-distribution> values should use their <content-position> fallback values for grid containers. I’m glad that such decision was finally made, because I think that <content-distribution> values are really useful for defining fancier grid layouts. I’ll talk about this soon in a new post, as I consider it deserves a detailed description with the proper examples.

    Last, but not least, as it happens with Self Alignment it allows using overflow keywords to define how we want to handle grid’s content overflow. It works in the same way for Content Distribution as we’ll see later with some examples.

    Aligning the grid

    When there is available space in the grid container block, it’s always useful to have a way to control how we want to use such space and how we want our grid to behave on it. It might happen that container’s size changes (fullscreen mode) or we could have to deal with a content sized grid modifying its content’s size. There are quite many possibilities, so I’ll leave this issue for user/designer’s imagination and I’ll focus on very simple examples to illustrate the concept.

    For now, let’s consider this case to understand what you can do with the different <content-alignment> values in a grid layout.

    .grid {
        grid: 50px 50px / 100px 100px;
        position: relative;
        width: 200px;
        height: 300px;
    }
    .fixedSize {
        width: 20px;
        height: 40px;
    }
    

    We are defining a 2×2 grid with 50×100 pixels cells where we are going to place 4 items, one in each cell. Notice that items at (1,1) and (2,2) have a fixed size of 20×40 pixels, while the other 2 are auto-sized so they will be stretched to fill their corresponding grid cell (if you don’t know why, a reading of previous post might help). Also, bear in mind that both align-content and justify-content properties have start as the initial value for grids.

    ContentAlignment

    Controlling the grid overflow

    When grid content’s size exceeds its container dimensions there is the risk of data loss. Some examples of this scenario are center or end alignment from the viewport’s edges; all the content overflowing the viewport’s area can’t be reached, hence we lose such data. In order to prevent this issue Box Alignment specification defines the safe overflow mode, which basically forces a start alignment value for the property handling the dimension where the overflow is detected.

    Using the same CSS and HTML code in the example above, we can easily define cases where this data loss issue (red colored arrows) is clearly noticeable just modifying the height or width to cause top or left overflow respectively.

    Content-Alignment-Overflow1

    There are other situations where Content Alignment and Overflow interact in a different way, using margins, padding or/and borders and even combining different writing-modes and flow directions. The effect of the alignment values varies considerably depending on those factors but I think you have now a clear idea of how to use these new properties in a grid layout.

    Current status and next steps

    With the grid support for the align-content and justify-content CSS properties in Blink we’ve got most of the Box Alignment specification covered. As it was commented before, just Base Alignment is still pending to be implemented in Chromium browsers. I have to admit that there are also some bugs and wrong behavior using certain CSS combinations, specially regarding orthogonal flows, but we are working on it right now and I hope to integrate the patches soon in trunk.

    For the time being, let’s consider the following table as the current implementation status of the Box Alignment specification for the Grid Layout model in WebKit (Safari/Epiphany) and Blink (Chrome/Chromium/Opera) based browsers:

    align-grid-support-1

    The lack of progress in the implementation of the Box Alignment specification in the WebKit web engine is disappointing. I’ve been stuck for quite a lot of time trying to upgrade the CSS properties to the last version of the spec, mainly due design and performance issues. I’ll discuss with the WebKit hackers the best approach to solve this issue so I can put the Grid Layout implementation at the same level than in Blink web engine.

    Igalia and Bloomberg will continue working on the implementation of the CSS Grid Layout specification and among my short/mid term challenges are completing the Box Alignment support. These goals include the following tasks:

    • Fixing bugs and completing the orthogonal flows support.
    • Implementing the Base Alignment features
    • Completing the Content Distribution Alignment with the <content-distribution> values
    • Implementing the Box Alignment spec in WebKit
    Igalia & Bloomberg logos

    Igalia and Bloomberg working to build a better web platform

    by jfernandez at March 09, 2015 01:58 PM



    January 12, 2015

    Box Alignment and Grid Layout (II)

    by Javier Fernández

    Some time has passed since my first post about the Box Alignment spec implementation status for Blink and WebKit web engines. I’ll do an update in this post and, since the gap between both web engines has grown considerably (I’ll do my best to reduce it as soon as possible), I’ll remark the differences between both engines.

    What’s new ?

    The ‘stretch’ value is now implemented for align-{self, items} and justify-{self, items} CSS properties. This behavior is quite important because it’s the default for these four properties in Flexible Box and Grid layouts. According to the specification, the ‘stretch’ value is defined as follows:

    If the width or height (as appropriate) of the alignment subject is auto, its used value is the length necessary to make the alignment subject’s outer size as close to the size of the alignment container as possible, while still respecting the constraints imposed by min-height/max-width/etc. Otherwise, this is equivalent to start.

    When defining the alignment properties in a grid layout it’s very important to consider how we want to manage item’s overflow. It’s allowed to specify an overflow alignment value for both Content and Item Alignment definition, but so far it’s implemented only for the Item Alignment properties. The Overflow Alignment concept is defined in the specification as follows:

    To help combat undesirable data loss, an overflow alignment mode can be explicitly specified. “True” alignment honors the specified alignment mode in overflow situations, even if it causes data loss, while “safe” alignment changes the alignment mode in overflow situations in an attempt to avoid data loss.

    The ‘stretch’ value in Grid Layout

    This value applies to the Self Alignment properties {align, justify}-self, and obviously their corresponding Default Alignment ones {align, justify}-items. For grids, these properties consider that the alignment container is the grid cell, while the alignment subject is the grid item’s margin box.

    The Box Alignment specification states that Default Alignment ‘auto’ values should be resolved to ‘stretch’ in case of Grid containers; this value will be used as the resolved value for ‘auto’ Self Alignment values.

    So by default, or when explicitly defined as ‘stretch’, the grid item’s margin box will be stretched to fill its grid cell breadth. Let’s see it with a basic example:

    align-stretch
    All the examples available at http://igalia.github.io/css-grid-layout/

    This change affected only the layout codebase of the web engine, since the value was already introduced in the style parsing logic. The Grid Layout rendering logic uses an interesting abstraction to override the grid item’s block container. This abstraction allows us to use Grid Cells as block containers when computing the logical height and width.

    Overflow Alignment in Grid Layout

    The Overflow Alignment value used when defining a grid layout could be particularly useful, specially for fixed sized grids. The potential data lost may happen not only at the left and top box edges, but between adjoining grid cells. Overflow Alignment is defined via the ‘safe’ and ‘true’ keywords. They were already introduced in the Blink core’s style parsing logic as part of the CSS3 upgrade of the alignment properties (justify-self, align-self) used in the FlexBox implementation. The new CSS syntax is described by the following expression:

    auto | stretch | ">" href="http://dev.w3.org/csswg/css-align-3/#typedef-baseline-position" data-link-type="type"><baseline-position> | [ ">" href="http://dev.w3.org/csswg/css-align-3/#typedef-item-position" data-link-type="type"><item-position> && ">" href="http://dev.w3.org/csswg/css-align-3/#typedef-overflow-position" data-link-type="type"><overflow-position>? ]

    According to the current Box Alignment specification draft, the Overflow Alignment keywords have the following meaning:

    • safe: If the size of the alignment subject overflows the alignment container, the alignment subject is instead aligned as if the alignment mode were start.
    • true: Regardless of the relative sizes of the alignment subject and alignment container, the given alignment value is honored.

    I’ll proceed now to show how Overflow Alignment is applied in the Grid Layout specification with an example:

    align-overflow
    All the examples available at http://igalia.github.io/css-grid-layout/

    The new syntax to allow the Overflow Alignment keywords required to modify the style builder and parsing logic, as it was mentioned before. The alignment properties became a CSSValueList instance instead of simple keyword IDs; both Blink and WebKit provides Style Builder code generation directives (CSSPropertyNames.in) for simple properties, but this was not the case for these properties anymore.

    The Style Builder is more complex now because it has to deal with the conditional overflow keyword, which can be specified before or after the <item-position> keyword. Blink provides a function template scheme for groups of properties sharing the same logic, which is the case of most of the CSS Box Alignment properties (align-self, align-items and justify-self; justify-items is slightly different so it needs custom functions). WebKit is currently defining the new style builder and it does not follow this approach yet, but I’d say it will, eventually, since it makes a lot of sense.

    {% macro apply_alignment(property_id, alignment_type) %}
    {% set property = properties[property_id] %}
    {{declare_initial_function(property_id)}}
    {
        state.style()->set{{alignment_type}}(RenderStyle::initial{{alignment_type}}());
        state.style()->set{{alignment_type}}OverflowAlignment(RenderStyle::initial{{alignment_type}}OverflowAlignment());
    }
    
    {{declare_inherit_function(property_id)}}
    {
        state.style()->set{{alignment_type}}(state.parentStyle()->{{property.getter}}());
        state.style()->set{{alignment_type}}OverflowAlignment(state.parentStyle()->{{property.getter}}OverflowAlignment());
    }
    
    {{declare_value_function(property_id)}}
    {
        CSSPrimitiveValue* primitiveValue = toCSSPrimitiveValue(value);
        if (Pair* pairValue = primitiveValue->getPairValue()) {
            state.style()->set{{alignment_type}}(*pairValue->first());
            state.style()->set{{alignment_type}}OverflowAlignment(*pairValue->second());
        } else {
            state.style()->set{{alignment_type}}(*primitiveValue);
        }
    }
    {% endmacro %}
    {{apply_alignment('CSSPropertyJustifySelf', 'JustifySelf')}}
    {{apply_alignment('CSSPropertyAlignItems', 'AlignItems')}}
    {{apply_alignment('CSSPropertyAlignSelf', 'AlignSelf')}}
    

    Even though style building and parsing is shared among all the layout models using the Box Alignment properties, like Flexible Box and Grid so far, Overflow Alignment is only supported so far by the CSS Grid Layout implementation. The Overflow Alignment logic affects how the grid items are positioned during the layout phase.

    The Overflow Alignment keywords are also valid in the Content Distribution syntax, which I’m working on now with quite good progress. The first patches landed already in trunk, providing an implementation of the justify-content property in Grid Layout. I’ll talk about it soon, hopefully, once the implementation is completed and the discussion in the www-style mailing list conclude with some agreement regarding the Distributed Alignment for grids.

    Current implementation status

    The Box Alignment specification is quite complete now in Blink, unfortunately that’s not the case of WebKit. I’ll summarize now the current implementation status in browsers based on each web engine, which are basically Chrome/Chromium vs Safari; I’ll also try to outline the roadmap for the next weeks.

    align-grid-support

    The flex-start, and flex-end values are used only in Flexible Box layouts, so they don’t apply to this analysis of the Grid support of the Box Alignment spec. The Distributed Alignment values apply only to the Content Distribution properties (align-content and justify-content). Finally, ‘stretch‘ is a valid value for both, Positional and Distributed Alignment, so it’s not redundant but a different interpretation of the same value depending on the property it’s applied to.

    Some conclusions we can extract from the table above:

    • Default and Self Alignment support is almost complete in Blink; only Baseline Alignment is pending to be implemented.
    • Content Distribution support for justify-content in Blink. Only <content-position> values are implemented, since current spec draft states that all the <content-distibution> values will fallback in Grid Layout; spec authors are still evaluating changes on this issue, though.
    • WebKit fails to provide CSS3 parsing for all the properties except justify-self, although there are some patches pending of review to improve this situation.
    • There is no Grid support at all in WebKit for any of the Box Alignment properties.
    Igalia & Bloomberg logos

    Igalia and Bloomberg working to build a better web platform

    by jfernandez at January 12, 2015 11:41 AM