Meeting 2014 07 29

Adenilson Cavalcanti edited this page Jul 29, 2014 · 2 revisions

Servo Meeting 2014-07-29

Reminder

Please update your status at: http://benjamin.smedbergs.us/weekly-updates.fcgi/

Agenda

  • Submodules documentation (SimonSapin)
  • Rust upgrade status (jack?)
  • Android builder status (larsberg)
  • 32-bit status (jack?)
  • Running WPT on Travis (jdm)
  • Access control on github (ms2ger)
  • Parallelism in layout, performance, and power on perf-rainbow (laleh)
  • Ready/RenderState visibility in UI (zwarich)
  • HTML parser status (kmc)

Attending

  • Lars
  • Ms2ger
  • Manish
  • jdm
  • Laleh
  • Patrick
  • abinader
  • Jack
  • simonsapin
  • mbrubeck
  • kmc
  • brson
  • azita
  • mrobinson
  • savago

Submodule documentation

  • simonsapin: Recently landed a PR to have travis create and deploy docs. I'd like to do it for all the submodules as well. Some of them also have documentation independently, but since we might not be using the same version, it might be useful to have the documentation of each submodule as used in Servo. I tried to do it in a makefile, but it turns out to be hard to do because I need a list of all the crates created by the submodules and the name of the main .rs / .rc file.
  • jack: We can change the names of them to be something consistent. Right now, there are three patterns b/c they haven't been unified: lib.rs, CRATENAME.rs, or wow_really_old.rc.
  • simonsapin: Another approach is to have each submodule's Makefile have a make doc target, but that would mean going into each one and creating it, which I'm not looking forward to.
  • jack: I can. Question - does Cargo do this for us?
  • simonsapin: No. There's an open issue in Cargo, but it's not done.
  • jack: If you could do the rust-cssparser doc, I can copy it to the other submodules. Or you can just tell me what you want the make rule to look like.
  • simonsapin: I don't know. Should we have it all in the main Servo makefile and normalize the filenames? Or have each submodule's makefile do it?
  • jack: Probably in the submodules' makefiles. Some of the submodules have strange structures, so we should probably just do it in the individual Servo makefiles. I already have one PR for each submodule open anyway, so what's doubling that?
  • simonsapin: I'll try with one submodule and see what happens.
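The crate-root lookup that makes a central doc build awkward can be sketched roughly as below. This is only an illustration, not anything from Servo's build: the function names are hypothetical, the lookup order is an assumption, and the third pattern is assumed to mean CRATENAME.rc.

```rust
use std::path::{Path, PathBuf};

/// Candidate crate-root file names, in the order they are tried.
/// The three patterns come from the discussion above; the order is an assumption,
/// as is reading the old ".rc" pattern as CRATENAME.rc.
pub fn crate_root_candidates(crate_name: &str) -> Vec<String> {
    vec![
        "lib.rs".to_string(),
        format!("{}.rs", crate_name),
        format!("{}.rc", crate_name),
    ]
}

/// Return the first candidate that exists as a regular file in `src_dir`, if any.
/// (Hypothetical helper, written only for this sketch.)
pub fn find_crate_root(src_dir: &Path, crate_name: &str) -> Option<PathBuf> {
    crate_root_candidates(crate_name)
        .into_iter()
        .map(|name| src_dir.join(name))
        .find(|path| path.is_file())
}
```

Calling `find_crate_root(Path::new("src"), "cssparser")` would give the file to hand to a doc tool, or `None` if the submodule follows none of the three patterns. Normalizing every submodule to one file name would make the lookup unnecessary.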

Rust upgrade status

  • jack: The status is: the upgrade is working fine except for a bug in Rust, for which we have an unofficial fix. I'm running through the last few make check issues right now. The thing blocking me right now is [DY]LD_LIBRARY_PATH stuff. The Makefile sets it fine currently, but then you can't run Servo from the commandline. So we'll probably have to turn on "-C rpath" so that Servo works again. One thing I don't know is that we need the Rust compiler's library path but also the Rust compiler library rustlib's path. I don't know why you need both. Annoying. May talk with acrichto.
  • pcwalton: Can we statically link Servo?
  • jack: Not everything. e.g., rust-phf is a plugin, and those have to be dynamically loaded anyway.
  • pcwalton: In theory, Servo doesn't require any dynamic libraries...
  • jack: Could switch, but doesn't work for Android?
  • pcwalton: Should at least statically link into one DLL.
  • mbrubeck: We already use dlopen magic on Android.
  • zwarich: How bad will the link times be if we're statically linking?
  • jack: Not much worse
  • pcwalton: Shouldn't be bad on linux (if you use gold). Startup speed will improve.
  • zwarich: The mac linker is already doing the optimization from gold. I just have flashbacks to long ones on rustcore
  • pcwalton: Servo isn't as big.
  • larsberg: We already statically link on OS X and Linux.
  • zwarich: But not of all our transitive dependencies. Are we talking about linking everything into one binary?
  • larsberg: All of the Rust libs are compiled to .rlibs and statically linked in. We still have external non-Rust dependencies. We could fix those as well.
  • jack: We must have regressed, because at least glfw-rs is broken. Maybe I'll try to fix that first. I have one question for brian: if I do -C rpath, does it include links to rustlib libraries as well? That's the one that causes the most trouble because that path is weird.
  • brson: Yes. -C rpath should be the same as it used to be. If you need some compiler flag help to give you paths out, we can add those to rustc.
  • jack: Yeah. What does Cargo do? Set the environment variables?
  • brson: Yes, for build steps. But not for running.
  • simonsapin: Cargo assumes you have rust "properly" set up.
  • jack: So, if you statically link everything you're fine and just in trouble with dynamic?
  • brson: Most projects will not hit the problems you are hitting.

Android builder status

  • larsberg: not too much to say. i have a PR that has it building android and getting all the deps in place. once we have snapshots i should be able to turn it on and see if it works.
  • jack: So my current rust snapshot is an Android cross-compiler build of rustc.
  • larsberg: Does it include brson's fix that packages the cross-compiler in make dist?
  • jack: not sure. was brson's fix in july 16?
  • larsberg: Don't think so. It should be a small upgrade though, so I can do that.

32-bit status

  • jack: Unchanged. The one error I found was fixed by luqman about three weeks ago. We're almost there. Ms2ger, did you add 32 bit versions of the to_boolean stuff in rust-mozjs you just added?
  • ms2ger/jdm: Yes.
  • jack: For anyone else working on jsval in future, have at least one 32/64 annotation so it will fail to compile for the other arch if it's missing.
  • mbrubeck: Is that a static thing, since we'll have continuous builds once android builder appears?
  • jack: Yes. If you have at least one annotation there, the build will ensure that 32-bit support is there.
  • kmc: Can add a lint for that, probably.
  • mbrubeck: Failing on Travis may be good enough for now.
  • ms2ger (via IRC): Maybe have different fields on 32/64
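A minimal sketch of the annotation idea from this discussion, assuming present-day Rust syntax (`target_pointer_width`; the attribute spelling in the Rust of this meeting's era may differ). The names below are hypothetical and are not the rust-mozjs API. Because each definition is gated on one pointer width, deleting either one makes the build fail on that architecture instead of silently compiling.

```rust
/// Hypothetical constant: bytes in a pointer-sized payload on 64-bit targets.
#[cfg(target_pointer_width = "64")]
pub const PAYLOAD_BYTES: usize = 8;

/// The 32-bit counterpart. If this definition were missing, a 32-bit build
/// would stop with an unresolved-name error at the use site below.
#[cfg(target_pointer_width = "32")]
pub const PAYLOAD_BYTES: usize = 4;

/// Uses the constant so that every build must resolve it for its own target.
pub fn payload_matches_pointer() -> bool {
    PAYLOAD_BYTES == std::mem::size_of::<usize>()
}
```

This only catches a missing variant when something actually builds for the other architecture, which is why the continuous 32-bit builds mentioned above matter.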

WPT on Travis

  • jdm: It exists! We are now running the web platform tests on travis. If you make a change that breaks a test, it will not land. I will talk with ms2ger for creating a workflow that makes it easy to upstream. More next week.
  • jack: Can somebody give a quick rundown of what tests we want run where? Make check/ref/content on all builders? Some subset? WPT?
  • jdm: Good question!
  • zwarich: Not running reftests on Linux yet, are we?
  • larsberg: gw landed headless reftests on Travis/Linux this week. Still CPU-only.
  • manish: Not running wpt on linux, I believe.
  • jdm: We should look into that.
  • jack: For all the combos we're not doing, let's have bugs for those.

Access control to github

  • jack: Do we have any reason to have 3 different teams? No, Servo can just have the one group. Or maybe two if we want admin separate from the rest.
  • jdm: In addition, I believe there's a push to have the number of owners reduced. If ownership isn't essential, then the website people suggest removing owners.
  • jack: OK. I will take that as an item to look into today. I should be able to clean it up.
  • simonsapin: I created a team so that there's a user called rust-doc. I created it so it only has access to the repositories it needs.
  • jack: If we use rustdoc on submodules, it will be pushing to every submodule?
  • simonsapin: I was just going to have all the documentation push to the servo repository.
  • jack: I will leave that the same, then.

Parallelism and layout

  • DOCUMENT: https://dl.dropboxusercontent.com/u/1620890/Opportunity%20of%20Servo%E2%80%99s%20Power%20Saving-V2.pdf
  • laleh: This was all done on a macbook pro, profiling layout in the perf-rainbow benchmark. The first chart shows the difference in time between parallel/serial on low frequency or high frequency processors. The time is unfortunately primarily not layout - layout is under 800ms of that portion, so any work we do on layout does not affect the overall runtime significantly. In the next chart, though, you can see how it affects the change in power usage, which is more significant. The two other line charts show the difference in power and time across the number of parallel threads used for layout. You can see in the final chart how the time for just the layout portion (in milliseconds) is actually altered by the number of threads. My conclusion from this part, for this benchmark, is that using fewer threads saves power while getting the same performance, at least for this kind of page.
  • pcwalton: I'm wondering if this is saying how little layout matters or how bad our parsing and DOM node allocation is. As I recall, when I benchmarked chromium/safari/gecko, the ratio of the allocation of DOM and parsing was not nearly as skewed here. If we had the HTML parser and our DOM node allocation story were better, ultimately getting the oilpan stuff might improve that. I'm not sure that I would draw the conclusion making layout parallel doesn't help this benchmark, but just that our parsing is terrible. It might be interesting to look at gecko and chromium and safari on that type of page to see how much time they spend in parsing and DOM node construction so we can see what the ratio is in a more mature implementation.
  • jack: Another thing we could do is just get rid of the rest of servo from this test. Just benchmark layout?
  • pcwalton: There is one - the last one (layout time).
  • jack: That doesn't show power for just the layout portion. The question is, what does power do there?
  • zwarich: It's not a good idea to measure power of layout in isolation because a lot of things affect dynamic frequency scaling. If you did benchmark that, you would be benchmarking something you will never do. It's interesting from a perf point of view for layout micro-opt, but it's not useful from a power point of view because it's not realistic at all.
  • pcwalton: We know that we have bad DOM node and layout ratios. I'd like to know what the ratio is today. Maybe if you open the page in chromium and open the timeline view, it will show you what it should be for us if we had a more mature implementation.
  • brson: Isn't the only variable that changed the way layout runs, since the only thing using parallel threads is layout?
  • pcwalton: It impacts the workload, though...
  • brson: I don't know about that.
  • pcwalton: The conclusion here was you increase the power a lot, but don't get a lot of perf in the aggregate. My point is that the conclusion is skewed by the fact that parsing is so slow. Maybe if parsing were faster, the parallel layout perf gains would be better.
  • jack: Right now, we're getting 2x the layout perf by adding one thread, but using 15% more power when we do that. Question: is power measured over the whole process?
  • laleh: Yes.
  • jack: So the assumption that getting work done a lot faster uses less power is wrong right now.
  • pcwalton: Also keep in mind that the scheduler for parallel work-stealing may be spinning right now. We could do some very simple things to make that better, like sleep if there's no work done.
  • zwarich: We can still jump to conclusions about our current impl! That's the good thing about doing these measurements. The other thing that this showed me was that the HT threads provided no benefit (5...8). Expected for a CPU-bound benchmark like this one vs. an I/O benchmark where we'd expect it. That said, none of our mobile devices will have HT.
  • jack: The other thing is that the GPU costs us nothing.
  • laleh: The last two charts?
  • jack: Yes, it looks like we could offload more work to the GPU.
  • laleh: That's just for this benchmark. But on some other benchmarks I tried, sometimes GPU is better and sometimes CPU, but depends a lot on the website. With more images, the GPU seems better (faster, lower power), but sometimes the CPU is. Depends totally on the features of the benchmark. For perf-rainbow, it was sort of the same. With GPU, I assume that it's the one we'd like when we can use it because scrolling is much smoother on GPU rendering right now.
  • pcwalton: Yes, there's a lot in our current implementation vs. long-term. Given that our compositor is on the GPU anyway, I suspect we could have a good experience...
  • kmc: I'd expect better CPU scrolling performance because of no contention.
  • zwarich: our scrolling problems are more about terrible SKIA use combined with terrible scheduling. It's harder to hit on desktops, but easy to do on mobile. But on a Haswell desktop, we're really not going to see these problems. Are these energy measurements package energy?
  • laleh: Yes.
  • pcwalton: Another thing to look at would be to tweak the scheduler to sleep more to improve power consumption. I have the feeling you get bottlenecks once we are in single-threaded mode due to sequential portions or just being nearly out of work. We could do much better there. This result is a very good indication that spinning in the work-stealing scheduler is not good for power.
  • jack: I guess the next step for this is the same experiment... is that the plan?
  • jack: Is the next step to run this on Android?
  • laleh: Android next?
  • mbrubeck: We still don't have content on the screen in Android, though we DO have layout code running and could measure the power usage. Can't have great confidence it's correct, but it's there.
  • jack: Cool, this is great to see some power numbers for servo, even if they're not what we were hoping for, we can hopefully improve them!
  • laleh: Maybe we can look more at CSS parsing or other parts.
  • pcwalton: I have great confidence that we can improve these! The first time you measure anything, it will be bad.
  • jack: Is this experiment hard to run? Can we automate this?
  • laleh: It just takes time - it's not hard. I run it a lot of times.
  • jack: OK. Might be something we can look into doing in the future. Maybe get weekly numbers? A machine that does this once a week.
  • laleh: And I'd like to make a power model.
  • simonsapin: Do you need to have special hardware to measure power?
  • laleh: On mobile yes, but not on a laptop/desktop.
  • jack: We should get your code that measures this checked in somewhere so that other people can also play with it.
  • laleh: Mine?
  • jack: Yes, the code you wrote to run this experiment.
  • laleh: OK, sure!
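The "sleep if there's no work" idea raised in this discussion can be sketched as follows. This is a deliberately simplified illustration: it uses one shared queue guarded by a mutex and a condition variable, not a work-stealing deque, and it is not Servo's scheduler. All names are hypothetical. The point it shows is that an idle worker blocks in `wait` rather than spinning on an empty queue.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct Shared {
    // (pending work items, shutdown flag), both protected by one lock.
    state: Mutex<(VecDeque<u64>, bool)>,
    wake: Condvar,
}

/// Sum `items` using `workers` threads that sleep while the queue is empty.
/// Hypothetical sketch; "work" here is just adding a number.
pub fn sum_with_sleeping_workers(items: Vec<u64>, workers: usize) -> u64 {
    assert!(workers > 0, "at least one worker is needed to drain the queue");
    let shared = Arc::new(Shared {
        state: Mutex::new((VecDeque::new(), false)),
        wake: Condvar::new(),
    });

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut local_sum = 0u64;
                loop {
                    let mut guard = shared.state.lock().unwrap();
                    // Block (no busy loop) until there is work or shutdown is requested.
                    while guard.0.is_empty() && !guard.1 {
                        guard = shared.wake.wait(guard).unwrap();
                    }
                    let next = guard.0.pop_front();
                    drop(guard); // release the lock before doing the work
                    match next {
                        Some(item) => local_sum += item,
                        None => return local_sum, // queue drained and shutdown set
                    }
                }
            })
        })
        .collect();

    for item in items {
        let mut guard = shared.state.lock().unwrap();
        guard.0.push_back(item);
        drop(guard);
        shared.wake.notify_one(); // wake one sleeping worker for the new item
    }
    {
        let mut guard = shared.state.lock().unwrap();
        guard.1 = true;
    }
    shared.wake.notify_all(); // let every worker observe shutdown

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Whether blocking like this actually lowers package energy for layout would need the same kind of measurement described above; the sketch only shows the mechanism.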

Ready/RenderState

  • zwarich: Was going through RenderState between render task and compositor. For every tile, it communicates the state changes. On Android, this shows up in the UI (rendering vs. non-rendering). Would anyone care if I kill this? We have something similar for layout or not on all platforms.

[A great and evil warbling sound emerges]

  • zwarich: Does anyone care about this?
  • jack: we care about the page finish loading notification, but I'm not sure we care about other micro-states than that.

[Dubstep in the background...]

  • zwarich: I think it's manish :-)
  • manish: I had to restart - I'll mute again now.
  • jack: Just care about page load notification for finishing ref tests. I don't think we care about the rest.
  • zwarich: Right now, the way we measure a load is done is totally false and broken. Part of what I want to do is make it possible to have layout changes made from multiple script tasks tied to the same page show up at the same time in the compositor. In the case where we have two layout tasks with one script task: do we want the updates synchronized? And, if we have different script tasks tied up to multiple layout tasks, should they be unsync? I believe so in both cases.
  • pcwalton: Yes, I believe so too. The closest thing is Chromium's site isolation work, but I don't know what they've chosen. They are focused on security, though, so I think they may be making conservative answers.
  • jack: DOM properties in a script task that is shown in two layout tasks, it'd be nice not to block one.
  • zwarich: But if you have CSS animations kicked off, you want them synchronized across all pages they appear in. Those will force synchronization.
  • pcwalton: Yes.
  • jack: I'm convinced. So are you pulling out Ready/RenderState stuff?
  • zwarich: Leave ReadyState alone, removing RenderState. The ReadyState is trickier, so I will probably leave it for another day.
  • kmc: This document lifecycle stuff relates to the parser as well, right?
  • zwarich: Yes, because the parser and document lifecycle stuff isn't done, there are other mechanisms trying to approximate them. At the same time, many of them are useless other than for compatibility purposes.

HTML parser update

  • kmc: I have the parser in a state where it correctly parses all of Servo's ref/content tests and misc files in the HTML dir. Doesn't pass 100% on the HTML5 tests because the bizarre algorithms for horribly mis-nested tags aren't done yet. So, I think we can merge the parser pretty soon, before we reach 100% on the corner cases. Just wanted to give an update on that. I think the biggest obstacle to merging is that the Rust compiler broke some of our macro usage, so I need to do some Rust work to fix the macro support I rely on and then upgrade Servo to that version, etc.
  • jack: What does the parser do on random Wikipedia pages?
  • kmc: I can look into it. I have a script that uses the HTML serialization algorithm and the validator.nu parser (which is what's used in Gecko) to do a round-trip parse. I can easily test random web pages and will do more of that. The test directory includes ACID1 & ACID2, etc., though.
  • jack: If you haven't filed a bug for the macro stuff, please do so and add it to the Servo tracking bug for today's meeting. Is this jbclements' changes?
  • kmc: Yes, macros are hygienic w.r.t. self. For the macro-rules, they can take self explicitly. But, it breaks all of my procedural macros that use quasiquote. It's an unloved corner of the macro system, and I'm not sure what the right way to address this is before we can bring it up at the Rust meeting. But hopefully we don't need a perfect design for hygiene and can just add what we need for this. It turns out that self is the only thing I need to capture.
  • pcwalton: I definitely agree with merging the parser soon. The current one is annoying, has terrible dependencies, and crashes occasionally. I'd much rather have a slightly-incorrect parse tree than explosion.
  • kmc: Failure behavior is well defined. Mainly missing MathML, SVG, and <template>. Otherwise, recovery for really weird inputs. No functionality regression, because we don't have support for those things. A little bit of unsafe code wiring it up to the JSManaged DOM, but it should be easy to do that in a way we have confidence in.
  • jack: Agree on merging it ASAP. Might have to have a bug-scrub to fix wikipedia things, but that's fine.
  • pcwalton: Cargo-ify?
  • kmc: Can look into it. Haven't been watching it. Is servo using it?
  • jack: No. We were waiting on cross-compile support, which they added. Also requires crate names, which were added in the rust upgrade. I'll probably try to play with it.
  • pcwalton: We'd like to; the community likes it a lot, and it was quite easy to add to my emulator.
  • kmc: OK, sounds good. I did a lot of work making the HTML parser usable outside of Servo, so that would be useful. At some point, I'd like to do some non-Servo-specific things to it, but those can wait until it's landed in Servo. I'll try to keep the parser building on Rust master, with a branch on Servo's rust with as few changes as we can get away with. Most projects seem to be trying to track Rust master, from what I can see.
  • jack: Yes, we generally upgrade, go back a few revs to ours. For the upstream submodules, we almost always land an official commit and just have a tiny Servo-specific makefile change.