Tim Retout: seL4 clock magic
I have been looking at seL4 some more recently, and had a small patch merged today to remove a legacy Python module from a helper script. (I was trying to run the script on a system without that module installed, and it was almost easier to patch it out.)
However, the more I think about this code and how itβs used, the more it seems wrong on at least five other levels.
The patch itself is quite uninteresting; this script was importing the
past module (part of future?) to use the xrange function.
Python 2 used to have separate xrange and range functions, where
range returned a list in memory while xrange generated an
iterator. Because this seL4 script is iterating over a large range of
values, itβs important the list is not generated in-memory. But
Python 3 removed the xrange function and just has range return an
object, so itβs trivial to avoid the module import.
Having thought carefully some more about the specific line, thereβs
surely an off-by-one error in it - range iterates over 0 to n-1, so
this line shouldnβt be subtracting one if itβs looking to test all
32-bit values:
for i in range(2**32-1):But then again, this is being used for a βsanity checkβ of a magic bit shift algorithm that speeds up division operations to convert CPU ticks to microseconds on 32-bit arm platforms. Surely if the algorithmβs good, it shouldnβt be necessary to validate it exhaustively against every possible 32-bit value?
Also, 32 bits isnβt enough, because this is 64-bit division.
include/api/types.h shows that ticks_t is always a uint64_t, so
if this were a proof by exhaustion it should run to 2**64 (though that
would take infeasibly long).
As discussed in issue #1352, lots of people have been running this code with the wrong divisor anyway. But because the bit shift path is only used on 32-bit platforms, itβs not clear to me that thereβs even any point in specifying CLK_SHIFT/MAGIC on platforms which are 64-bit only (e.g. the tx2 port).
And to follow this rabbit hole to the very end, in comments on PR #1435 and issue #1509 itβs clear that the future of this code is to remove it, as itβs 1. unnecessarily clever (on 64-bit platforms the equivalent code just uses a division, so performance canβt be that important), and 2. the entire concept of converting to microseconds breaks the seL4 principle of not abstracting away details of the hardware.
So this has left me unclear on whether my small patch was a good thing or not, but I certainly learnt something about this corner of seL4 timer handling. And Iβve ordered a copy of βHackerβs Delightβ on the recommendation of a code comment.